GDPR-Compliant AI Workflows: Data Residency, DPAs, And Audit Trails
Most EU teams adding an LLM step to a real workflow hit the same wall. The prototype works, and then someone asks where the personal data goes. If the answer is "into a US API, we think," the project spends the next quarter in legal review.
GDPR does not prohibit LLM steps. It asks a known list of questions, and almost all of them are architecture decisions, not paperwork. Teams that settle those decisions in week one ship AI workflow automation at roughly the same speed as teams that ignore them, and they skip the remediation later.
Settle The Lawful Basis Before The PoC
The first conversation is not with an engineer. Bring a one-page description of the workflow to whoever owns privacy — DPO, counsel, or the founder wearing that hat — and settle five questions:
What is the lawful basis under Article 6 for each category of personal data the workflow touches? In practice it is usually contract, legitimate interests, or consent.
Does any step produce a decision based solely on automated processing that has legal or similarly significant effects on a person (Article 22)?
Who is controller and who is processor at each step, including the model provider?
Does any data leave the EEA, and under which transfer mechanism (an adequacy decision or standard contractual clauses, for example)?
How long are inputs, outputs, and logs retained, and who can delete them?
The Article 22 question matters most because it changes the architecture. If the workflow influences credit, employment, insurance, housing, or anything comparable, it needs a meaningful human decision point — not a rubber stamp at the end. That is a screen you design and budget for, not a sentence in a policy document.
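One way to keep that decision point from degrading into a rubber stamp is to make the pipeline refuse to finalize without a recorded human decision. The Python sketch below is illustrative only: `Recommendation`, `Review`, `ReviewRequired`, and the `significant_effect` flag are hypothetical names, not part of any library, and a non-empty reason field is a proxy for meaningful review, not proof of it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Recommendation:
    subject_ref: str  # internal ID, never a name or email
    suggested: str    # what the model proposes, e.g. "decline"
    rationale: str    # shown to the reviewer alongside the suggestion


@dataclass(frozen=True)
class Review:
    reviewer: str  # who decided
    decision: str  # may differ from the model's suggestion
    reason: str    # the reviewer's own justification


class ReviewRequired(Exception):
    """Raised when a significant decision lacks a usable human review."""


def finalize(rec: Recommendation, review: Optional[Review], significant_effect: bool) -> str:
    # Assumption: someone classified the step as significant or not up front.
    if not significant_effect:
        return rec.suggested
    if review is None:
        raise ReviewRequired(f"{rec.subject_ref}: no human decision recorded")
    if not review.reason.strip():
        raise ReviewRequired(f"{rec.subject_ref}: reviewer gave no reason")
    # The human decision is what leaves the workflow, not the model's suggestion.
    return review.decision
```

In this sketch the model output never becomes the outcome of a significant decision on its own; the reviewer's decision does, and it can disagree with the suggestion.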
Send Fields, Not Records
Data minimization is the cheapest compliance control, and it usually improves model output at the same time. The principle: the model sees only what the step needs.
Send the specific fields a step uses — the complaint text and a list of categories, not the entire CRM record.
Replace names, emails, and account numbers with internal IDs before the model call; re-join them after the response.
Filter retrieval before it reaches the prompt, so RAG chunks are scoped to what the workflow actually needs.
Pseudonymize free-text inputs where feasible; incidental personal data inside a pasted email thread is still personal data.
Log exactly what was sent. Minimization you cannot demonstrate to an auditor does not count.
This is where compliance and engineering pull in the same direction: field-level prompts are easier to test, cheaper to run, and easier to defend.
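The points above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a complete pseudonymization scheme: the record layout, `ALLOWED_FIELDS`, the `Pseudonymizer` class, and `build_payload` are all hypothetical, and the email regex is deliberately simple. Real free text will contain identifiers this does not catch.

```python
import re
from dataclasses import dataclass, field

# Hypothetical allow-list: the only record fields this step may send.
ALLOWED_FIELDS = ("complaint_text", "categories")

# Deliberately simple pattern for emails that appear incidentally in free text.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


@dataclass
class Pseudonymizer:
    """Swaps identifiers for placeholder tokens and restores them afterwards."""
    _forward: dict = field(default_factory=dict)  # real value -> token
    _reverse: dict = field(default_factory=dict)  # token -> real value

    def token_for(self, value: str, kind: str) -> str:
        if value not in self._forward:
            token = f"[{kind}_{len(self._forward) + 1}]"
            self._forward[value] = token
            self._reverse[token] = value
        return self._forward[value]

    def scrub(self, text: str, known_values: dict) -> str:
        # known_values maps a real value to its kind, e.g. {"Anna Nowak": "NAME"}.
        for value, kind in known_values.items():
            text = text.replace(value, self.token_for(value, kind))
        return EMAIL_RE.sub(lambda m: self.token_for(m.group(0), "EMAIL"), text)

    def restore(self, text: str) -> str:
        # Re-join real values after the model response comes back.
        for token, value in self._reverse.items():
            text = text.replace(token, value)
        return text


def build_payload(record: dict, pseudo: Pseudonymizer) -> dict:
    """Select allowed fields, scrub the free text, and attach an internal ID."""
    known = {record["name"]: "NAME", record["account_number"]: "ACCOUNT"}
    payload = {key: record[key] for key in ALLOWED_FIELDS}
    payload["complaint_text"] = pseudo.scrub(payload["complaint_text"], known)
    payload["customer_ref"] = record["internal_id"]
    return payload
```

The mapping between tokens and real values stays inside your boundary; only the scrubbed payload goes to the model, and `restore` runs on the response before it reaches a user.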
Residency, DPAs, And Retention
Where the model runs is a procurement decision with three workable options, in increasing order of control and cost:
EU-region endpoints. The major model providers offer EU-region inference, processing data inside the EU under the provider's standard DPA. The sensible default for most projects.
Your-cloud deployment. Models served inside your own AWS, Azure, or GCP tenant in an EU region, so data stays within cloud agreements you have already vetted.
Self-hosted open-weights models. Maximum control, real operational cost. Choose this because a regulator or a customer contract requires it, not as a reflex.
Whichever option you pick, make the DPA chain explicit. When we build a workflow, we operate as your processor under a DPA, and the model provider appears as a named sub-processor with its own terms. Before the PoC starts, verify two clauses in those terms: that API inputs are not used for model training (standard on business tiers, but get it in writing), and that the retention window for abuse monitoring — typically zero to thirty days — matches what you told your DPO.
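Those checks can also live in the deployment pipeline as a pre-flight test, so a changed endpoint or a renegotiated contract does not slip through unnoticed. The sketch below is an assumption-laden illustration: `SubProcessorTerms` and `check_terms` are hypothetical names, the values are typed in by hand from the signed terms (no provider API is queried), and the region identifiers are examples that depend on your cloud.

```python
from dataclasses import dataclass

# Illustrative region identifiers; the real names depend on your provider or cloud.
EU_REGIONS = frozenset({"eu-central-1", "eu-west-1", "europe-west4"})


@dataclass(frozen=True)
class SubProcessorTerms:
    name: str
    region: str
    trains_on_api_inputs: bool  # as stated in the signed terms
    abuse_retention_days: int   # abuse-monitoring retention in the signed terms


def check_terms(terms: SubProcessorTerms, retention_days_told_to_dpo: int) -> list[str]:
    """Return a list of mismatches; an empty list means the terms match the plan."""
    problems = []
    if terms.region not in EU_REGIONS:
        problems.append(f"{terms.name}: region {terms.region!r} is not an approved EU region")
    if terms.trains_on_api_inputs:
        problems.append(f"{terms.name}: API inputs may be used for model training")
    # Flag retention that is longer than what the DPO was told.
    if terms.abuse_retention_days > retention_days_told_to_dpo:
        problems.append(
            f"{terms.name}: retains inputs {terms.abuse_retention_days} days, "
            f"DPO was told {retention_days_told_to_dpo}"
        )
    return problems
```

Failing the build when the list is non-empty turns the DPA review from a one-time reading into a standing check.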
Audit Logs Are A GDPR Feature
Most teams treat audit logging as engineering overhead. Under GDPR it does double duty. Every model call in the workflow should record:
The minimized input payload that was actually sent
The model, its version, and the prompt version
The output, with a confidence signal where available
The reviewer who approved, corrected, or rejected it
Timestamps and the retention clock that drives deletion
That single table supports your Article 30 records of processing, answers subject access requests about automated processing, and gives you something concrete to show when a person objects to a decision. It is also what makes human review in AI workflows meaningful under Article 22: review without a log is an opinion; review with a log is accountability.
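As a rough illustration of that table, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `model_call_log`, the column names, and the helper functions are assumptions made for this example; a production system would more likely use its existing database and also restrict who can read the stored payloads.

```python
import json
import sqlite3
from datetime import datetime, timedelta, timezone

# Hypothetical schema: one row per model call, review fields filled in later.
SCHEMA = """
CREATE TABLE IF NOT EXISTS model_call_log (
    id             INTEGER PRIMARY KEY AUTOINCREMENT,
    called_at      TEXT NOT NULL,
    delete_after   TEXT NOT NULL,   -- the retention clock
    model          TEXT NOT NULL,
    model_version  TEXT NOT NULL,
    prompt_version TEXT NOT NULL,
    input_payload  TEXT NOT NULL,   -- the minimized payload, as sent
    output         TEXT NOT NULL,
    confidence     REAL,            -- NULL when the model gives no signal
    reviewer       TEXT,
    review_action  TEXT CHECK (review_action IN ('approved', 'corrected', 'rejected')),
    reviewed_at    TEXT
)
"""


def _ts(moment: datetime) -> str:
    # UTC ISO 8601 at second precision, so string order matches time order.
    return moment.astimezone(timezone.utc).isoformat(timespec="seconds")


def log_call(conn, *, payload, model, model_version, prompt_version,
             output, confidence, retention_days, now):
    """Insert one model call and return its row ID."""
    cur = conn.execute(
        "INSERT INTO model_call_log (called_at, delete_after, model, model_version,"
        " prompt_version, input_payload, output, confidence)"
        " VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        (_ts(now), _ts(now + timedelta(days=retention_days)), model, model_version,
         prompt_version, json.dumps(payload, sort_keys=True), output, confidence),
    )
    conn.commit()
    return cur.lastrowid


def record_review(conn, call_id, reviewer, action, now):
    """Attach the human decision to a logged call."""
    conn.execute(
        "UPDATE model_call_log SET reviewer = ?, review_action = ?, reviewed_at = ?"
        " WHERE id = ?",
        (reviewer, action, _ts(now), call_id),
    )
    conn.commit()


def purge_expired(conn, now) -> int:
    """Delete rows whose retention clock has run out; return how many were removed."""
    cur = conn.execute("DELETE FROM model_call_log WHERE delete_after <= ?", (_ts(now),))
    conn.commit()
    return cur.rowcount
```

Storing the deletion date on the row at write time means retention is enforced by a scheduled `purge_expired` call instead of by someone remembering the policy.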
What A Compliance-Ready Two-Week PoC Includes
None of this requires a long program. In a two-week PoC, compliance is part of the build, not a parallel track:
A data-flow diagram naming every system personal data touches, including the model endpoint and its region
A lawful-basis and Article 22 worksheet drafted for your DPO to confirm — input for counsel, not a substitute for it
Pseudonymization at the workflow boundary, with re-identification only after the model call
An EU-region or your-cloud endpoint agreed in week one, not retrofitted in month three
The audit log table live from day one, covering every model call
A human review screen for any decision that approaches Article 22 territory
Handover documentation a DPO can read without an engineer in the room
This fits inside a Quick DX PoC (two weeks, $12,500–$18,000) because these are design decisions, not extra software — see packages for scope. Working with an EU entity (Poland) also simplifies the surrounding logistics: EU invoicing, SEPA, and a processor that is itself subject to GDPR.
One honest caveat to close. We are engineers, not lawyers. This article maps the technical decisions that make a GDPR review fast; the legal judgment on lawful basis and Article 22 belongs to your DPO or counsel. The practical move is simple: put the one-page data flow in front of them in week one, and legal review turns from a blocker into a sign-off.