The EU AI Act compliance checklist by risk tier is the operating artefact that turns a sprawling regulation into a workable program. Four tiers — unacceptable, high, limited, minimal — set the obligation depth; a separate stack covers general-purpose AI models. Classifying a system to the right tier is the cheapest decision in the entire program and the one teams most consistently get wrong in both directions.
The regulation applies extraterritorially. Providers and deployers located outside the European Union are still in scope when the output of an AI system is used in the EU, when the system is placed on the EU market, or when EU residents are subject to its decisions. That covers a substantial share of US, UK, and APAC-headquartered SaaS, agency, and enterprise platforms. The practical question is not whether you are in scope — for most companies serving any EU customer, you already are — but which tier each system lands in and what the resulting control set looks like.
This guide walks the four-tier classification, the high-risk obligation stack that drives roughly thirty of the fifty checklist items, the lighter-touch limited-risk transparency rules, the separate general-purpose AI obligations that affect foundation-model providers, a decision tree for correct tier assignment, and the consolidated 50-point checklist sorted by tier. The audience is the cross-functional team — legal, security, product, engineering — that has to operate the program rather than the team that has to brief the board.
- 01 · Extraterritorial reach applies to most non-EU vendors. If output is used in the EU, the system is placed on the EU market, or EU residents are subject to its decisions, the Act applies regardless of where the provider is established. For most US-headquartered SaaS and AI platforms with European customers, the question is which tier you fall into, not whether you are in scope.
- 02 · High-risk obligations are extensive — roughly thirty controls. High-risk systems carry the heaviest stack — risk management, data governance, technical documentation, human oversight, accuracy, robustness, cybersecurity, conformity assessment, registration in the EU database, and post-market monitoring. Around thirty of the fifty checklist items live here. Classify correctly before scoping the program.
- 03 · Limited-risk is the most commonly mis-classified tier. Chatbots, emotion-recognition systems, biometric categorisation, and synthetic content typically fall into the limited-risk tier with transparency and labelling duties — not the full high-risk stack. Treating a limited-risk system as high-risk inflates program cost and timeline without reducing real risk.
- 04 · GPAI carries a separate obligation stack from application AI. Providers of general-purpose AI models follow a different track — technical documentation, training-data summaries, copyright policies, and, for systemic-risk models above the compute threshold, model evaluations and incident reporting. GPAI obligations sit alongside application-tier obligations rather than replacing them.
- 05 · Quarterly cadence aligns to enforcement deadlines. The Act phases in obligations on staggered dates through 2026 and 2027. A quarterly compliance cadence — classification refresh, control evidence walk, conformity-assessment gate review, post-market monitoring data check — keeps the program moving with the enforcement timeline rather than catching up to it.
01 — Why It Matters
Extraterritorial reach — if you sell in the EU, you comply.
The EU AI Act is the first comprehensive horizontal regulation of AI in any major jurisdiction. It applies a risk-based framework with tiered obligations, civil penalties that scale to a percentage of worldwide turnover, and an enforcement architecture that includes a new EU AI Office plus member-state market-surveillance authorities. For most teams the first question is whether the regulation applies at all — and the answer is yes far more often than vendors expect.
Three triggers bring a non-EU provider into scope. First, placing an AI system on the EU market — the same surface that triggers CE-marking obligations on physical goods. Second, the output of an AI system used in the EU — which covers cross-border SaaS where the model itself runs outside the EU but its decisions land on an EU resident. Third, AI systems used by EU-established deployers regardless of where the system was built. The combined surface is large enough that most B2B SaaS with any European footprint should assume in-scope and reason from tier.
The penalty structure is what makes classification matter. Unacceptable-tier violations attract the highest fines — up to seven percent of worldwide annual turnover or thirty-five million euros, whichever is higher. High-risk obligations attract up to three percent or fifteen million euros. Misleading information to authorities sits at one and a half percent or seven and a half million. The penalty curve is steep enough that mis-classifying a low-risk system as high-risk produces wasted spend, while mis-classifying a high-risk system as limited produces fine exposure that dwarfs the saved control cost by an order of magnitude.
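To make the asymmetry concrete, here is a minimal sketch of the penalty ceilings as stated above. The turnover figure is hypothetical and the function name is ours; actual fines are set case by case by the enforcing authority, with the stated figures as upper bounds.

```python
# Sketch of the Act's penalty ceilings as summarised above:
# "whichever is higher" of a turnover percentage and a fixed floor.
def max_fine_eur(worldwide_turnover_eur: float, violation: str) -> float:
    ceilings = {
        "prohibited_practice": (0.07, 35_000_000),     # unacceptable-tier violations
        "high_risk_obligation": (0.03, 15_000_000),    # high-risk obligations
        "misleading_information": (0.015, 7_500_000),  # misleading authorities
    }
    pct, floor = ceilings[violation]
    return max(pct * worldwide_turnover_eur, floor)

# Hypothetical vendor with EUR 600M worldwide turnover:
print(max_fine_eur(600e6, "high_risk_obligation"))  # 18000000.0 (EUR 18M)
```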
The enforcement timeline phases in obligations across roughly three years from entry into force. Prohibited-practice provisions are the earliest enforced; high-risk obligations phase in over a longer window to give industry time to build conformity-assessment pipelines; GPAI obligations have their own timeline. The phased approach is helpful for planning but cuts both ways — by the time the latest obligations bind, the earliest have been in force long enough that “we are still preparing” is no longer a defensible posture. A quarterly compliance cadence anchored to the enforcement-date roadmap keeps the program moving with the deadlines rather than against them.
For deeper context on the legislative architecture and US implications specifically, our earlier EU AI Act primer for US businesses and the broader European business compliance guide cover the legal and operational background. This playbook assumes the reader has decided the regulation applies and is scoping the control program.
02 — Classification
Unacceptable, high, limited, minimal.
The classification model is the spine of the Act. Every AI system in scope falls into one of four tiers, and the tier determines whether the system can be placed on the EU market at all, what obligations attach if it can, and what evidence the provider must maintain to demonstrate compliance. The four tiers cover an enormous range of capability — from a banned social-scoring system at one end to a spam filter at the other — and the right classification is rarely obvious for systems that fall in the middle band.
The matrix below summarises the obligation depth at each tier. Read it as a scoping tool: locate the system in the matrix, then expand the controls in the sections that follow.
Unacceptable risk · prohibited
Practices the Act bans outright. Social scoring by public authorities, untargeted scraping of facial images for recognition databases, emotion recognition in workplaces and schools, biometric categorisation by sensitive attributes, exploitation of vulnerabilities to materially distort behaviour, and real-time remote biometric identification in public spaces (narrow exceptions for law enforcement). No conformity pathway — the only response is to not deploy.
Do not place on market

High risk · full stack
Systems listed in Annex III plus AI used as a safety component of products under existing harmonised legislation. Annex III covers biometric identification, critical-infrastructure management, education and vocational training scoring, employment and worker management, access to essential services, law enforcement, migration and border control, and administration of justice. Triggers the full thirty-control stack — risk management, data governance, documentation, human oversight, conformity assessment, registration, post-market monitoring.
Run full compliance program

Limited risk · transparency
Systems that interact with humans (chatbots), emotion-recognition or biometric-categorisation systems outside prohibited contexts, and systems generating synthetic image, audio, video, or text content (deep fakes). Obligation is transparency — inform the user they are interacting with AI, label synthetic content as such, and disclose emotion-recognition or biometric-categorisation. Substantially lighter than high-risk; commonly over-scoped by cautious legal teams.
Disclose and label

Minimal risk · voluntary
Everything else. Spam filters, AI-enabled video games, inventory management AI, marketing personalisation that does not produce consequential decisions. No mandatory obligations under the Act. The European Commission encourages voluntary codes of conduct, and a sensible governance baseline — model documentation, eval cadence, incident logging — remains good practice even where the law does not require it.
Voluntary code of conduct

Two structural points get lost in summaries of the four-tier model. First, the classification is per-system, not per-vendor — a single vendor may operate systems across all four tiers and must run the appropriate program for each. Second, the tier depends on intended purpose plus deployment context, not on the underlying model technology. The same large language model can power a tier-4 spam summariser and a tier-2 employment screening tool; the system, not the model, is classified.
The classification mistake teams most consistently make is collapsing limited-risk into high-risk out of caution. A customer-support chatbot that answers questions about order status is a tier-3 limited-risk system with a transparency obligation — “you are chatting with an AI assistant” — and effectively no further high-risk apparatus. Running a full thirty-control program against it burns budget that should go to systems that genuinely sit in tier 2. The classification tree in Section 06 is designed to keep these from creeping up the tier ladder by default.
"The classification is per-system, not per-vendor. The same model can power a tier-4 spam summariser and a tier-2 employment screening tool — the system, not the model, is classified."— Stage 0 of the EU AI Act compliance playbook
03 — High Risk
Conformity assessment, registration, post-market monitoring.
High-risk obligations are the heaviest stack in the Act and the section where most compliance budget actually lands. Five operational pillars structure the work: risk management, technical documentation and record-keeping, human oversight, accuracy and robustness, and post-market monitoring. Each pillar maps to a cluster of controls; together they make up roughly thirty of the fifty checklist items in Section 07.
The grid below summarises the five pillars and the artefact each one produces. Treat the artefact column as the evidence a market-surveillance authority would request — if you cannot produce it on a week's notice, the pillar is not yet operational.
Risk management system
Iterative · documented · throughout lifecycle
Identify and analyse foreseeable risks to health, safety, and fundamental rights. Estimate and evaluate risks under intended use and reasonably foreseeable misuse. Adopt risk-management measures and test residual risk. The system is iterative — updated as the AI system evolves through training, deployment, and operation.
Article 9

Data governance and quality
Training · validation · testing data sets
Training, validation, and testing data sets meet quality criteria — relevance, representativeness, freedom from errors, completeness. Examine for biases that could affect health, safety, or fundamental rights. Document data provenance, collection processes, and labelling decisions. Sensitive categories are processed only when strictly necessary.
Article 10

Technical documentation and logs
Drawn up before placing on market · kept current
Technical documentation per Annex IV — system description, design specs, training methodology, performance metrics, foreseeable risks, human-oversight design, validation results. Automatic logging of events relevant to risk identification and post-market monitoring; logs retained for a duration appropriate to intended purpose.
Articles 11, 12

Human oversight and transparency
Design-time · operator-friendly · stop functions
Human oversight measures designed and built into the system to prevent or minimise risks. Natural persons assigned to oversight can monitor operation, understand capabilities and limitations, remain aware of automation bias, correctly interpret output, decide not to use the output, and intervene or interrupt. Instructions for use accompany the system.
Articles 13, 14

Post-market monitoring
Systematic · documented · proportionate
Post-market monitoring plan documented and operating. Collect, document, and analyse data on performance throughout the lifecycle. Continuously evaluate compliance with the Act's requirements. Report serious incidents to relevant market-surveillance authority within prescribed windows. Update the risk management system based on monitoring findings.
Articles 17, 61, 73

Two procedural milestones bracket the operational pillars. Before a high-risk system is placed on the market or put into service, the provider must complete a conformity assessment — internal control for most Annex III systems, third-party notified-body assessment for biometric identification systems and certain product-safety cases. The assessment produces an EU declaration of conformity and authorises the CE marking. The system is then registered in the EU database for high-risk AI systems, a publicly searchable register run by the Commission.
After deployment, the obligation surface continues. The provider maintains technical documentation for ten years after the system is placed on the market. The post-market monitoring plan runs for the operational life of the system. Serious incidents and malfunctions that constitute a breach of EU fundamental-rights law obligations must be reported to the relevant authority, with timelines that vary by severity. The deployer, separately, carries duties around fundamental-rights impact assessment for certain public-sector and essential-services use cases.
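On the record-keeping pillar specifically, a hedged sketch of what automatic event logging can look like in practice: append-only JSON lines, one record per risk-relevant event. The schema, helper name, and event types are our assumptions, not a format the Act prescribes; the retention duration is set per system, appropriate to intended purpose.

```python
import json
import time
import uuid

def log_event(logfile: str, system_id: str, event_type: str, detail: dict) -> None:
    """Append one event record to an append-only log (illustrative schema)."""
    record = {
        "event_id": str(uuid.uuid4()),
        "system_id": system_id,
        "timestamp": time.time(),
        "event_type": event_type,   # e.g. prediction, human_override, incident
        "detail": detail,
    }
    with open(logfile, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

# A human-oversight intervention is exactly the kind of event the
# post-market monitoring pillar later collects and analyses:
log_event("cv-screener.log", "cv-screener", "human_override",
          {"decision": "operator rejected model output", "operator": "reviewer-17"})
```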
04 — Limited Risk
Transparency obligations and labelling.
The limited-risk tier covers systems with specific transparency duties but none of the high-risk operational apparatus. Four categories fall here in practice: AI systems intended to interact with natural persons (chatbots, voice assistants), emotion-recognition or biometric-categorisation systems outside prohibited contexts, AI systems generating or manipulating synthetic image, audio, or video content (deep fakes), and AI systems generating or manipulating text published with the purpose of informing the public on matters of public interest.
For each category the obligation is to ensure the affected person knows. A chatbot must disclose that the user is interacting with AI unless the context makes it obvious. Synthetic content must be labelled as artificially generated or manipulated, in a clear and distinguishable manner, with limited exceptions for evidently artistic, satirical, or similar contexts. Emotion-recognition systems must inform the persons exposed to them. The compliance work is substantially lighter than high-risk — but the disclosure has to actually surface in the user experience, not just sit in a terms-of-service page.
Human-AI interaction disclosure
Chatbots, voice assistants, and AI agents that interact with natural persons must disclose that the interlocutor is an AI system, unless the context makes it obvious. The disclosure has to be perceivable in the channel — a banner in the chat UI, a spoken introduction by a voice assistant, an avatar label. Buried in legal small print does not qualify.
Article 50(1)

Synthetic content labelling
AI systems generating or manipulating synthetic image, audio, or video content must mark outputs as artificially generated or manipulated in a machine-readable format and detectable as such. Providers ensure the marking is interoperable, robust, and reliable as far as technically feasible — the C2PA standard and similar provenance schemes are emerging as the implementation baseline.
Article 50(2)

Emotion and biometric disclosure
Emotion-recognition systems and biometric-categorisation systems, outside the prohibited workplace and education contexts, must inform the natural persons exposed to them of their operation. The disclosure precedes or accompanies use; consent under data-protection law remains a separate obligation that often applies in parallel.
Article 50(3)

Deep-fake content disclosure
Where AI generates content constituting a deep fake — image, audio, or video resembling existing persons, objects, places, events — the deployer discloses that the content has been artificially generated or manipulated. Exceptions for evidently artistic, creative, satirical, or fictional works; the disclosure obligation does not eliminate parody or critique.
Article 50(4)

The provider-deployer split is sharper at this tier than at high-risk. The provider builds the labelling capability into the system; the deployer operates it in the surface where the user sees the content. A SaaS that generates AI images is the provider and embeds the C2PA-style provenance marker; the agency that uses the SaaS to produce a campaign is the deployer and surfaces the "AI-generated" label in the campaign asset. Both have obligations; the boundary between them gets tested when content leaves one platform and travels to another that strips the metadata.
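A minimal sketch of in-file, machine-readable marking, writing a PNG text chunk with Pillow. This is not a C2PA implementation (C2PA binds a signed provenance manifest to the content); it only illustrates the Article 50(2) idea that the marker travels inside the asset rather than living solely in the UI. The metadata keys are illustrative, not a standard.

```python
from PIL import Image
from PIL.PngImagePlugin import PngInfo

def save_with_ai_marker(img: Image.Image, path: str, generator: str) -> None:
    """Embed an 'AI-generated' marker in the file itself (illustrative keys)."""
    meta = PngInfo()
    meta.add_text("ai_generated", "true")
    meta.add_text("generator", generator)
    img.save(path, pnginfo=meta)

img = Image.new("RGB", (64, 64))   # stand-in for a generated asset
save_with_ai_marker(img, "campaign-asset.png", "example-image-model")

# Caveat from the paragraph above: a downstream platform that strips
# metadata removes this marker, which is why signed provenance schemes
# such as C2PA are emerging as the robust baseline.
```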
Pragmatic implementation note: design the disclosure into the user experience early. Retrofitting an “AI assistant” label into a mature chatbot UX is harder than shipping it from the first sprint, and the retrofit invariably sparks UX debates that delay launch. Treat the limited-risk disclosure as a default UX pattern, not an afterthought.
05 — GPAI
General-purpose AI separate obligations.
General-purpose AI models — foundation models capable of performing a wide range of distinct tasks regardless of how they are placed on the market — carry their own obligation stack under the Act, distinct from the application-tier classifications above. The stack applies to model providers; downstream deployers that integrate a GPAI model into an application are subject to the application-tier rules for the integrated system in addition to passing through model-level documentation from the provider.
Two GPAI sub-tiers exist. Standard GPAI models — Llama-class and Mistral-class open or closed releases below the systemic-risk compute threshold — carry documentation, transparency, and copyright obligations. Systemic-risk GPAI models, defined today by training compute above 10^25 floating-point operations, carry the standard obligations plus additional model-evaluation, adversarial-testing, incident-reporting, and cybersecurity duties. The Commission can designate further models as systemic-risk based on capability and reach.
GPAI obligation surface · sub-tiers and how they stack with application-tier rules
Source: EU AI Act, GPAI provisions and Annexes (Digital Applied summary)

For downstream deployers integrating a GPAI model, the practical workflow is to request the provider's technical documentation package — typically delivered as a model card plus a technical-documentation annex — and to use that documentation as the upstream input to the integrated system's own conformity assessment if the integrated system is high-risk. The Code of Practice for GPAI providers, currently in development under the EU AI Office, is the interpretive guidance that fills in implementation detail ahead of regulatory clarification.
Two operational notes for deployers. First, do not assume the GPAI model provider has cleared every documentation obligation just because the model is on the EU market — some of the highest-profile providers have published model cards substantially less detailed than what the Act's Annex IX anticipates. The deployer's own compliance position depends on the package received from upstream. Second, the training-content summary obligation produces an artefact that most downstream rights-holder requests will reference; the summary is the document the deployer points to when an EU publisher or creator asserts that copyrighted material was used in training.
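To gauge which sub-tier a model falls into, teams often start from the common rule of thumb that training compute is roughly 6 × parameters × training tokens. A minimal sketch under that assumption; the heuristic is a community estimate, not a measurement method defined by the Act, while the threshold figure is the 10^25 FLOPs named above.

```python
SYSTEMIC_RISK_FLOPS = 1e25   # the Act's current systemic-risk compute threshold

def training_flops_estimate(params: float, tokens: float) -> float:
    """Rule-of-thumb estimate: ~6 FLOPs per parameter per training token."""
    return 6 * params * tokens

# A hypothetical 70B-parameter model trained on 15T tokens:
flops = training_flops_estimate(70e9, 15e12)   # ~6.3e24
print(flops >= SYSTEMIC_RISK_FLOPS)            # False: standard GPAI sub-tier
```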
06 — Classification Tree
Decision tree for correct tier assignment.
The decision tree below is the operational tool teams use to assign tiers when scoping a new system or when re-classifying an existing one. Walk the questions in order — the first “yes” lands the tier. The questions are framed to surface the misclassification patterns we see most often in audits: cautious legal teams pushing limited-risk systems up into high-risk, engineering teams under-classifying systems that touch Annex III use cases, GPAI providers ignoring the separate stack because their downstream deployers are doing the application-tier work.
# EU AI Act classification decision tree
# Walk the questions in order. First "yes" lands the tier.
01 · Is the system a general-purpose AI model
(capable of a wide range of distinct tasks regardless of
placement on the market)?
YES → GPAI track · run Section 05 obligations
· also classify any application built on it (continue
the tree for the application)
NO → continue to 02
02 · Does the system implement a prohibited practice?
Social scoring by public authorities; untargeted scraping for
facial-recognition databases; emotion recognition in workplaces
or schools; biometric categorisation by sensitive attributes;
exploitation of vulnerabilities to materially distort behaviour;
real-time remote biometric identification in public spaces
(narrow law-enforcement exceptions).
YES → TIER 1 · UNACCEPTABLE · do not place on EU market
NO → continue to 03
03 · Is the system a safety component of a product covered by
Annex II harmonised legislation? (machinery, toys, medical
devices, in vitro diagnostics, lifts, radio equipment etc.)
YES → TIER 2 · HIGH RISK · full stack
NO → continue to 04
04 · Is the system listed in Annex III?
· Biometric identification and categorisation of natural
persons (where not prohibited)
· Critical infrastructure management and operation
· Education and vocational training (scoring, admissions)
· Employment, workers management, access to self-employment
(CV screening, performance evaluation)
· Access to and enjoyment of essential private and public
services (credit scoring, public benefits eligibility,
emergency dispatch)
· Law enforcement
· Migration, asylum, border control management
· Administration of justice and democratic processes
YES → TIER 2 · HIGH RISK · full stack
(note Article 6(3) derogation for some narrow tasks
that do not pose significant risk to health, safety,
or fundamental rights)
NO → continue to 05
05 · Does the system interact with natural persons, generate or
manipulate synthetic image / audio / video / text content,
conduct emotion recognition, or perform biometric
categorisation (outside prohibited contexts)?
YES → TIER 3 · LIMITED RISK · transparency and labelling
NO → continue to 06
06 · None of the above triggered.
→ TIER 4 · MINIMAL RISK · no mandatory obligations under
the Act · voluntary code of conduct encouraged
(sensible governance baseline remains good practice)
# After tier assignment:
# · Document the classification reasoning · sign and date
# · Record in the system registry alongside owner and review date
# · Re-classify when intended purpose or deployment context changes
# · Re-classify at quarterly cadence regardless

Two procedural disciplines make the tree work in production. First, the classification reasoning is documented and signed — a one-page memo per system that names the tier, walks the tree, cites the relevant article or annex, and identifies the system owner. The memo is what an auditor reads; the tier label alone is not evidence. Second, classifications are revisited at quarterly cadence and on-trigger. The triggers are change of intended purpose, change of deployment context, change of the underlying model, or any material change to the user surface — any of which can push a system between tiers.
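Once the legal judgments behind each question are made, the tree reduces to a short function. A minimal sketch: the booleans are the outputs of qualified review (code cannot decide whether a use case is Annex III), and the GPAI question runs as a separate track alongside this result rather than instead of it.

```python
def classify_application_tier(prohibited_practice: bool,
                              safety_component: bool,
                              annex_iii_listed: bool,
                              art_6_3_derogation: bool,
                              limited_risk_trigger: bool) -> str:
    """Questions 02-06 of the tree; the first 'yes' lands the tier."""
    if prohibited_practice:
        return "TIER 1 · UNACCEPTABLE · do not place on EU market"
    if safety_component:
        return "TIER 2 · HIGH RISK · full stack"
    if annex_iii_listed and not art_6_3_derogation:
        return "TIER 2 · HIGH RISK · full stack"
    if limited_risk_trigger:
        return "TIER 3 · LIMITED RISK · transparency and labelling"
    return "TIER 4 · MINIMAL RISK · voluntary code of conduct"

# The order-status support chatbot from Section 02: interacts with
# natural persons, touches no Annex III use case, lands in tier 3.
print(classify_application_tier(False, False, False, False, True))
```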
The Article 6(3) derogation deserves a flag. For Annex III systems performing narrow procedural tasks or improving the result of previously completed human activity, a derogation from high-risk classification is available — but only with documented justification and registration. The derogation is narrower than vendors hope and the documentation burden is real; default to in-scope unless a qualified legal review agrees the derogation applies on the specific facts.
07 — 50-Point Checklist
Implementation by tier.
The consolidated 50-point checklist below sorts the implementation surface by tier-weighted priority. The bars show the cumulative obligation weight at each tier — high-risk absorbs roughly thirty controls, GPAI a further ten, limited-risk five, the program-level governance scaffolding five. The bars are a scoping aid, not a literal control count; the point is the shape of the program, not the exact arithmetic.
Read the bars as a budgeting tool. If the system landed in tier 2 in Section 06, expect to scope thirty controls plus program scaffolding. If the system landed in tier 3, scope five controls plus scaffolding. If you are a GPAI provider as well as an application provider, expect to stack the GPAI block on top of the application-tier block. The fastest way to over-spend on EU AI Act compliance is to scope every system as tier-2 by default; the fastest way to under-deliver is to scope a true tier-2 system at tier-3 depth.
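The budgeting logic reduces to simple arithmetic over the cluster weights. A sketch using the approximate counts from the text; these are shapes for scoping, not a literal control inventory.

```python
CONTROLS = {"high_risk": 30, "gpai": 10, "limited_risk": 5}   # per-system clusters
SCAFFOLDING = 5   # program-level governance, built once per portfolio

def scoped_controls(tiers: list[str]) -> int:
    """Sum the per-system clusters, then add the scaffolding once."""
    return sum(CONTROLS[t] for t in tiers) + SCAFFOLDING

# A GPAI provider that also ships one high-risk application and one
# limited-risk chatbot stacks the blocks rather than replacing them:
print(scoped_controls(["gpai", "high_risk", "limited_risk"]))   # 50
```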
The 50-point checklist · obligation surface by control cluster
Source: Digital Applied EU AI Act 50-point checklist · tier-weighted scoping summary

The orange bars mark obligation clusters — the visual aim is to read the shape of the program at a glance and to plan staffing accordingly. The high-risk pillars are where most of the engineering, security, and legal work lands. The program-scaffolding cluster is the meta-governance work that applies whether the portfolio contains one tier-2 system or forty; building this first is what makes the per-system cluster work tractable. Without an AI inventory and a classification memo template, the first thirty-control scoping exercise reinvents that scaffolding for itself and the second redoes the same work for the next system.
For organisations standing up an Act program from cold start, the practical sequence is: inventory → classify → program scaffolding → pillar-by-pillar build for the highest-tier system → roll the pattern to the next system. Trying to lift every system through every pillar in parallel is the anti-pattern; the program drowns in cross-cutting work and never gets the first system audit-ready. The compounding pattern of running a Stage 8 governance loop alongside the Act program — described in our agentic AI governance templates — is the cheapest way to keep the Act program from becoming a parallel-track ceremony detached from operating reality.
A separate dimension that interacts with Act compliance is data residency and processing geography. Several high-risk and GPAI obligations interact with where training data, user data, and model inference physically sit; the architectural choices that make residency tractable also tend to make the Act's documentation easier to maintain. Our AI data residency architecture patterns guide covers the deployment models that pair best with EU AI Act programs.
For organisations that want the program delivered rather than self-built, our AI transformation engagements include EU AI Act program setup — inventory, classification, scaffolding, pillar build, conformity preparation, and handover to the in-house team for steady-state operation.
EU AI Act compliance is a classification problem first — get the tier right and the rest follows.
The shape of an EU AI Act program is set the moment a system is classified. Tier 1 means do not deploy. Tier 2 means a roughly thirty-control program with conformity assessment, registration, and post-market monitoring stretched across the system's lifecycle. Tier 3 means transparency and labelling, surfaced in the user experience. Tier 4 means a sensible governance baseline that is not legally mandated. The same model technology can power systems at every tier; the system, not the model, is what the Act regulates.
The expensive mistakes go in both directions. Mis-classifying a tier-3 chatbot as tier-2 inflates program cost and timeline without reducing real risk. Mis-classifying a tier-2 employment-screening tool as tier-3 produces fine exposure that dwarfs the saved control cost. The signed classification memo, walked through the decision tree and stored in the system register, is the cheapest control in the program and the one that determines whether the rest of the program is correctly scoped.
Practical next step: build the AI inventory and classify every system this quarter. Do not wait for the program scaffolding to be complete; the inventory and the classifications are the inputs the scaffolding needs to be scoped. Most teams discover, on first inventory pass, that they have fewer high-risk systems than they feared and more limited-risk systems than they expected. That discovery shifts the program from an existential lift to a tractable schedule that lands well inside the enforcement window.