GDPR & Data Residency Posture
Status: Draft (2026-04-28; partial sweeps 2026-05-21 + 2026-05-22). Sections marked [CLIENT INPUT REQUIRED] still need a decision from the platform owner before this becomes a publishable / partner-facing document. Engineering content (sub-processor list, hosting region, encryption posture, technical access controls) is derived from the live infrastructure and is authoritative as of the date above. The 2026-05-21 sweep folded in side-effect answers from Plan #027 (mail stack) and Riff #28 / Plan #020 work. The 2026-05-22 sweep marks the DSAR endpoints (§6) as implemented (Riff #175). The remaining open items in §10 are narrower than the original draft.
This document describes how Portugal Odyssey Platform handles personal data under the EU General Data Protection Regulation (Regulation 2016/679, "GDPR"). It is intended for:
- Partners, who need to assess the platform as a data processor before signing a Data Processing Agreement (DPA).
- Customers (travelers), who consult it via the public-fo privacy-policy page.
- Auditors and regulators, who may request evidence of GDPR Article 30 records.
For the engineering reference of where data lives and how it's encrypted, see docs/developers/architecture/system-overview.md.
1. Roles under GDPR
| Activity | Portugal Odyssey acts as | Why |
|---|---|---|
| Storing partner business data (Partner profile, services, contracts, documents) | Data Controller | The platform decides why and how partner data is processed (catalog publication, contract lifecycle, AI matching). |
| Storing customer profile + booking history (preferences, contact, past bookings) | Data Controller | The platform decides what to collect and why. |
| Processing customer-provided data on behalf of partners (e.g. forwarding traveler details for a booking confirmation) | Data Processor for that partner | Partner is the controller for the booking record on their side; platform passes through and stores for service delivery. |
| Processing partner-uploaded documents through AI (RAG ingestion, translation, summarisation) | Data Processor for that partner | Partner controls what they upload; platform processes per documented purposes. |
This dual role (controller + processor) is recorded in the partner DPA. [CLIENT INPUT REQUIRED]: confirm the dual-role characterisation matches the legal advice you've received, or amend.
2. Hosting and data residency
| Component | Location | Provider | Notes |
|---|---|---|---|
| Application VPS (31.97.159.7, 185.166.39.210) | Paris, France (EU) | Hostinger International Limited (AS47583) | Per ipinfo.io lookup 2026-04-28; hostname srv884655.hstgr.cloud. EU-located → no Article 44 cross-border concerns for primary storage. |
| DNS | Cloudflare (global anycast; control plane US) | Cloudflare Inc. | Used for DNS-01 ACME challenge + record resolution only. No customer data flows through Cloudflare (no proxy/CDN; records are grey-cloud). |
| TLS certificates | Let's Encrypt (ISRG, US 501(c)(3)) | Internet Security Research Group | Public CA; no customer data. |
| Container registry + CI runner | GitLab.com (US-hosted) + self-hosted runner on VPS (FR) | GitLab Inc. (registry) / self-hosted (runner) | Source code + build artefacts only; no customer/partner data flows through CI. |
| External monitoring | UptimeRobot (US) | UptimeRobot Service Provider Ltd. | HEAD/keyword probes against public health endpoints; no personal data observed. |
Primary data (customer profiles, partner profiles, bookings, documents, AI vectors) lives only on the EU VPS. Backups [CLIENT INPUT REQUIRED]: the current backup destination, retention, and encryption posture are undocumented and need a decision before launch.
3. Sub-processors
Sub-processors process personal data on behalf of the platform under GDPR Article 28. The list below is derived from infrastructure/env-templates/*.template (production reference set, 2026-04-28).
| Sub-processor | Purpose | Personal data exposed | Data location | DPA status |
|---|---|---|---|---|
| Stripe | Payment processing (cards, refunds, customer-id mapping) | Customer name, email, billing address, masked card data, transaction metadata | Global (EU + US per Stripe DPA) | Stripe SCC + DPA on file via Stripe Dashboard |
| OpenAI | LLM inference (translation, summarisation, sanitization, sentiment, moderation, enhancement, embeddings) | Partner document text, partner-provided service descriptions; no customer PII by design | US (per OpenAI API terms) | Standard API terms: API data is not used for model training by default, but zero retention is not automatic and must be arranged with OpenAI; [CLIENT INPUT REQUIRED] confirm the org-level zero-retention setting is enabled on the OpenAI account |
| Anthropic | LLM inference (alternate provider; same task surface as OpenAI) | Same as OpenAI | US | Standard API terms; [CLIENT INPUT REQUIRED] confirm zero-retention if available |
| Google Cloud (Gemini API) | LLM inference (alternate provider) | Same as OpenAI | US/EU per Google's regional routing | Standard API terms; [CLIENT INPUT REQUIRED] confirm region pinning |
| Google Maps | Map tile rendering on /contact and partner-location pages | Customer IP at the time of map load | Global | Standard Google Cloud terms; key is domain-restricted (Apr 2026 rotation) |
| Resend | Outbound transactional email; DKIM/SPF/MX on mail.portugalodyssey.pt (booking confirmations, account verification, contact-form replies, Keycloak realm SMTP, notification-service) | Customer email + booking summary | US (Resend, Inc.) | Resend Standard Terms + DPA (signed via dashboard, resend.com/legal/dpa) |
| Mailcow (self-hosted) | Inbound mail + mailbox storage + webmail + CalDAV/CardDAV (info@, cristina@, jose@, bookings@, partners@, postmaster@, abuse@, dpo@, etc.) | Mailbox contents for @portugalodyssey.pt addresses; SOGo session data | EU (FR); runs in the po-mail compose project on srv884655 | Not a sub-processor: self-hosted on the EU VPS. Listed for completeness because it processes inbound personal data. |
| LangSmith / LangChain (optional) | Observability of LLM calls (traces, prompts, completions) | Same as the LLM provider | US | Disabled by default (env LANGCHAIN_TRACING_V2 = false); enable only if Article 28 DPA signed |
Self-hosted components (Postgres, Redis, RabbitMQ, MinIO, Keycloak, Traefik, ai-service, all microservices) run on the EU VPS and are not sub-processors — the platform operates them directly. (rag-service was folded into ai-service in W2-7, 2026-04-30.)
Action item: publish the sub-processor list at https://portugalodyssey.pt/legal/sub-processors before partners sign DPAs, with a 30-day notification commitment for changes (Article 28(2)).
4. Categories of personal data
Derived from the data model in docs/project-overview.md §3 and the live Postgres schemas under infrastructure/migrations/.
4.1 Partner data (controller)
- Identity: legal name, trading name, tax ID, VAT number, company registration number.
- Contact: contact-person name, contact email, contact phone, fiscal address, business address.
- Operational: uploaded compliance documents (PDFs converted to vectors via Docling + embedded in pgvector), service descriptions, contract attachments.
- Authentication: Keycloak user record (email, password hash, MFA tokens), session cookies.
Lawful basis: Contract (GDPR Art. 6(1)(b)) — necessary for the partner-platform service agreement.
4.2 Customer data (controller)
- Identity: name, email, country of residence.
- Preferences / context: free-text preferences, social context (group composition), environmental/temporal preferences, past bookings (used to feed the AI matching engine).
- Bookings: experience selections, dates, payment intent ID (Stripe), confirmation status.
- Authentication: Keycloak user record; same shape as partner.
Lawful basis: Contract (Art. 6(1)(b)) for booking and service delivery; [CLIENT INPUT REQUIRED] decide whether preference/context data also requires explicit consent (Art. 6(1)(a)) for the AI matching engine, or whether legitimate interest (Art. 6(1)(f)) is the basis with a balancing-test record.
4.3 Aggregated / derived data
- AI embeddings (1536-dim vectors stored in rag_document_vectors), derived from partner-uploaded documents. Treat as personal data when partner documents contain PII (e.g. operator résumés, signatory names on contracts).
- Activity log (RabbitMQ-backed, persisted by analytics-service): records actor + action + entity for auditability; contains user IDs.
- AI inference logs (LangSmith if enabled, ai-service application logs always) — may include the prompt text, which may include partner-provided content.
5. Retention
[CLIENT INPUT REQUIRED] — no retention policy is documented. Provisional defaults below; client must confirm or amend before publication:
| Data category | Default retention | Trigger to delete |
|---|---|---|
| Customer profile + booking history | 5 years after last booking | Customer DSAR (Article 17) or auto-prune cron |
| Cancelled / declined partner applications | 90 days | Auto-prune cron |
| Active partner records | Duration of contract + 7 years (tax law) | Contract termination + statutory window |
| Contract documents (signed PDFs) | 10 years | Statutory tax / commercial law retention |
| AI inference logs (ai-service stdout) | 7 days (Docker log-opts cap = 50 MB ring buffer) | Auto-rotated by Docker |
| Activity log | 1 year | Auto-prune cron |
| Backups | [CLIENT INPUT REQUIRED] | — |
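The auto-prune cron named in the table is not described further in this document. As a minimal sketch of how the provisional windows could be expressed as data, assuming hypothetical category keys and a hypothetical helper `is_due_for_pruning` (neither exists in the codebase as far as this document shows):

```python
from datetime import datetime, timedelta, timezone

# Provisional defaults copied from the retention table; all still need client sign-off.
# The category keys are hypothetical identifiers chosen for this sketch.
RETENTION_DAYS = {
    "declined_partner_application": 90,
    "activity_log": 365,
    "customer_profile": 5 * 365,  # "5 years after last booking", leap days ignored
}


def is_due_for_pruning(category: str, reference: datetime, now: datetime) -> bool:
    """Return True when `reference` (e.g. the last booking date) is older than the window."""
    window = timedelta(days=RETENTION_DAYS[category])
    return now - reference > window


if __name__ == "__main__":
    now = datetime(2026, 6, 1, tzinfo=timezone.utc)
    declined = datetime(2026, 2, 1, tzinfo=timezone.utc)  # 120 days before `now`
    print(is_due_for_pruning("declined_partner_application", declined, now))  # True
    print(is_due_for_pruning("activity_log", declined, now))  # False
```

Keeping the windows in one mapping means a client decision on any row changes a single number rather than a query scattered across services.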
6. Data subject rights (DSAR)
The platform must support the following rights within one month of receiving a verified request (Art. 12(3)); internally this is tracked as a 30-day SLA:
| Right | GDPR Article | Implementation status |
|---|---|---|
| Access (right to a copy) | Art. 15 | Implemented (2026-05-22, Riff #175). GET /api/v1/me/export returns a JSON bundle covering 9 services + Keycloak record. Keycloak JWT-authenticated. |
| Rectification | Art. 16 | Implemented for customer self-service via partner-console / public-fo profile edit. Partner side: contact platform admin. |
| Erasure ("right to be forgotten") | Art. 17 | Implemented (2026-05-22, Riff #175 + #200). DELETE /api/v1/me?confirm=<token> cascades soft-delete across 9 services + disables the Keycloak user + revokes active sessions. Confirmation token (HMAC-SHA256, 5-min TTL) issued via GET /api/v1/me/delete-confirmation to prevent CSRF. Hard-purge cron (Riff #200, 2026-05-23) physically removes soft-deleted rows after the statutory retention window (default 30 days, configurable via DSAR_HARD_PURGE_RETENTION_DAYS) — see infrastructure/scripts/dsar-hard-purge.sh for the orchestrator invocation. |
| Restriction | Art. 18 | Not implemented |
| Portability | Art. 20 | Implemented (2026-05-22, Riff #175). Same endpoint as Art. 15 — output is structured JSON ("commonly used machine-readable format"). |
| Objection | Art. 21 | Manual (email). |
| Automated decision-making opt-out | Art. 22 | AI matching engine is in scope. No explicit opt-out today. |
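The erasure row mentions an HMAC-SHA256 confirmation token with a 5-minute TTL. The actual token format used by auth-service is not documented here; the following is only a sketch of one common construction, with the `<expiry>.<signature>` layout and the function names `issue_token` / `verify_token` being assumptions made for illustration:

```python
import hashlib
import hmac
import time
from typing import Optional

TOKEN_TTL_SECONDS = 5 * 60  # 5-minute TTL, as stated in the table


def _sign(secret: bytes, user_id: str, expires: int) -> str:
    message = f"{user_id}:{expires}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def issue_token(secret: bytes, user_id: str, now: Optional[float] = None) -> str:
    """Return '<expiry>.<hex signature>' bound to a single user (hypothetical format)."""
    current = time.time() if now is None else now
    expires = int(current + TOKEN_TTL_SECONDS)
    return f"{expires}.{_sign(secret, user_id, expires)}"


def verify_token(secret: bytes, user_id: str, token: str, now: Optional[float] = None) -> bool:
    """Reject malformed, expired, or wrongly signed tokens."""
    try:
        expires_str, signature = token.split(".", 1)
        expires = int(expires_str)
    except ValueError:
        return False
    current = time.time() if now is None else now
    if current > expires:
        return False
    expected = _sign(secret, user_id, expires)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(signature.encode(), expected.encode())
```

Binding the signature to the user ID means a token issued for one account cannot confirm deletion of another, and the short expiry limits how long a leaked token stays useful.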
Implementation snapshot (2026-05-22): Riff #175 closed at commit 1eec2b9. Endpoints are live on qual via the api-gateway proxy /api/v1/me/* → auth-service /auth/dsar/*. The aggregator fans out to partner-service, experience-service (which also handles bookings since the Wave 2-1 simplification), contract-service, payment-service, file-service, ai-service (rag_document_vectors), notification-service and review-service, plus the Keycloak admin REST API for the user record and active sessions. Per-service failures surface as { name, error } blocks rather than being dropped silently. The DSAR_INTERNAL_KEY shared secret guards the /internal/me/* callee endpoints; it is provisioned in the dev and qual compose files, and the prod value must be supplied by the operator when the prod stack deploys.
DSAR contact email: dpo@portugalodyssey.pt — live mailbox on the self-hosted Mailcow stack (Plan #027). Mail to this alias routes to Cristina + José for triage. Will be published on the customer-facing /legal/privacy page when that ships.
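For the hard-purge step in the erasure row, the selection rule (soft-deleted rows older than `DSAR_HARD_PURGE_RETENTION_DAYS`, default 30) might look like the sketch below. The row shape with `id` and `deleted_at` keys and the helper names are assumptions; the real services and `dsar-hard-purge.sh` may work differently:

```python
import os
from datetime import datetime, timedelta, timezone
from typing import Any, Iterable, List, Mapping

DEFAULT_RETENTION_DAYS = 30  # documented default


def retention_days(env: Mapping[str, str] = os.environ) -> int:
    """Read DSAR_HARD_PURGE_RETENTION_DAYS, falling back to the documented default."""
    raw = env.get("DSAR_HARD_PURGE_RETENTION_DAYS", "")
    return int(raw) if raw.strip() else DEFAULT_RETENTION_DAYS


def rows_to_purge(rows: Iterable[Mapping[str, Any]], now: datetime, days: int) -> List[Any]:
    """Return ids of rows soft-deleted at or before the cutoff (hypothetical row shape)."""
    cutoff = now - timedelta(days=days)
    return [
        row["id"]
        for row in rows
        if row["deleted_at"] is not None and row["deleted_at"] <= cutoff
    ]


if __name__ == "__main__":
    now = datetime(2026, 7, 1, tzinfo=timezone.utc)
    rows = [
        {"id": 1, "deleted_at": datetime(2026, 5, 20, tzinfo=timezone.utc)},  # 42 days ago
        {"id": 2, "deleted_at": datetime(2026, 6, 20, tzinfo=timezone.utc)},  # 11 days ago
        {"id": 3, "deleted_at": None},  # never soft-deleted
    ]
    print(rows_to_purge(rows, now, retention_days({})))  # [1]
```

Rows that were never soft-deleted are skipped explicitly, so a purge run cannot remove live data even if the retention value is misconfigured to zero.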
7. Security posture (Article 32 — appropriate technical measures)
| Control | Implementation |
|---|---|
| Encryption in transit | All public traffic via Traefik + Let's Encrypt TLS (ECDSA P-256 wildcards via Cloudflare DNS-01); HSTS via default-headers@file middleware. No plaintext HTTP listeners. |
| Encryption at rest | Postgres data is NOT encrypted at rest by default (pgvector/pgvector:pg15 uses standard volumes). MinIO uploads also unencrypted at rest. Action: decide whether VPS-level disk encryption (LUKS) is needed; Hostinger doesn't provide it by default. |
| Secrets management | .env.{qual,prod} files on VPS with file-mode 600; signing certificate password-protected .p12; AI provider keys + JWT secrets isolated per-environment. .mcp.json carries zero secrets (2026-04-28 structural fix; see CLAUDE.md §Session hygiene). |
| Access control | Keycloak realm portugal-odyssey enforces RBAC (admin / partner / customer); MFA available, [CLIENT INPUT REQUIRED] confirm whether MFA is enforced for admin and partner roles. |
| Audit logging | Activity-log RabbitMQ topic captured by analytics-service; Docker stdout captured per-container with log-opts (50 MB ring buffer); GitLab CI deployment history. |
| Vulnerability scanning | gitleaks pre-commit (po-platform; cc-platform); GitLab CI security stage exists but currently advisory. Action: enable dependency_scanning + container_scanning jobs. |
| Backups + DR | [CLIENT INPUT REQUIRED] — undocumented today. Pre-launch task. |
| Incident response | INC-NNN doctrine at ~/.claude/incidents/ (engineering side); customer-facing 72-hour breach notification process [CLIENT INPUT REQUIRED]. |
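The secrets-management row relies on the .env files having file-mode 600. A small check along these lines could be run on the VPS to confirm that no group or world permissions have crept in; `is_owner_only` is a hypothetical helper, and the check is only meaningful on POSIX filesystems:

```python
import os
import stat
import tempfile


def is_owner_only(path: str) -> bool:
    """True when the file grants no permissions to group or others (e.g. mode 600)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as handle:
        os.chmod(handle.name, 0o600)
        print(is_owner_only(handle.name))  # True on POSIX systems
        os.chmod(handle.name, 0o644)
        print(is_owner_only(handle.name))  # False on POSIX systems
```

In practice the function would be pointed at the deployed .env.qual and .env.prod paths, which are not spelled out in this document.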
8. International data transfers
- Primary storage stays in EU (FR). No Article 44 transfer concern for the database itself.
- Stripe + OpenAI + Anthropic + Google are processors that transfer to the US under their respective SCC frameworks; Resend (outbound email) is also US-based per §3. Each has a signed or clickwrap DPA; the partner DPA needs to enumerate them as authorised sub-processors (list in §3).
- No customer data is shared with the AI providers by the current code paths — only partner documents and partner-provided service text. [CLIENT INPUT REQUIRED] confirm this remains the rule (or amend if the AI matching engine grows to embed customer preferences in third-party prompts).
9. Cookies and tracking
Public-fo is a Vite-built SPA on qual.portugalodyssey.pt (will be apex portugalodyssey.pt post-launch). Inventory:
- Strictly necessary: Keycloak session cookies (auth), language preference. Lawful basis: contract / legitimate interest. No consent required.
- Analytics: none currently wired (verified 2026-05-21: no GA4 / Plausible / Matomo tags in frontends/public-fo/). No consent banner required at MVP launch. [CLIENT INPUT REQUIRED] if/when analytics is added later: pick product (Plausible recommended for cookieless), and re-evaluate banner need.
- Advertising / retargeting: none. Per the brand stance (Riff #8 identity proposal: "no sponsored row, no commission-based re-ranking"), retargeting pixels are out of scope.
Action item: add a https://portugalodyssey.pt/legal/cookies page when any non-strictly-necessary cookie is introduced.
10. Open items (consolidated)
These are the remaining [CLIENT INPUT REQUIRED] items. The 2026-05-21 sweep resolved four of the original 12 from side-effect work: DSAR contact email (dpo@portugalodyssey.pt mailbox now live via Plan #027), SMTP provider (Resend outbound + Mailcow self-hosted inbound), analytics product (none wired; no banner needed for MVP), and partial revenue-side data flows (clarified in revenue-model.md as Model 3). Tackle the remaining 8 in this order before partner DPAs are signed or before the public privacy policy is published.
Pre-DPA (block partner onboarding):
1. Confirm dual-role (controller + processor) characterisation in §1 with legal counsel.
2. Confirm OpenAI / Anthropic / Google zero-retention or region-pinning settings (§3).
3. Decide and document the backup destination + retention + encryption posture (§2, §5, §7).
Pre-launch (block public site go-live):
4. Decide MFA enforcement for admin + partner roles (§7). (Keycloak realm has no CONFIGURE_TOTP required action wired today.)
5. ~~Implement DSAR endpoints (Art. 15 / 17 / 20)~~ — DONE 2026-05-22 (Riff #175, commit 1eec2b9) + hard-purge sweeper DONE 2026-05-23 (Riff #200). Endpoints live on qual; 30-day SLA now achievable. Hard-purge: every DSAR-participating service exposes POST /internal/dsar/purge (X-DSAR-Key authenticated); auth-service exposes POST /internal/dsar/purge-all for the fan-out; infrastructure/scripts/dsar-hard-purge.sh is the operator entry point. Remaining sub-item: production.yml env wiring + DSAR_INTERNAL_KEY + DSAR_HARD_PURGE_RETENTION_DAYS GitLab CI variables when prod stack first deploys; schedule the hard-purge script as a GitLab CI scheduled pipeline (or VPS cron) once prod traffic exists.
6. Decide AI-matching lawful basis: consent (Art. 6(1)(a)) vs legitimate interest (Art. 6(1)(f)) (§4.2).
7. Decide retention defaults in §5 (or override).
8. VPS-level disk encryption (LUKS) decision (§7).
Pre-public-privacy-policy:
9. Document customer-facing 72-hour breach notification process (§7).
(Analytics consent UX is no longer in this list — analytics tag is not wired and the banner is not needed at MVP launch. It returns to the list only if/when analytics is added.)
11. References
- GDPR full text: https://eur-lex.europa.eu/eli/reg/2016/679/oj
- EDPB guidelines: https://edpb.europa.eu/our-work-tools/general-guidance_en
- CNPD (Portugal DPA): https://www.cnpd.pt/
- Internal: docs/project-overview.md, docs/developers/architecture/system-overview.md, CLAUDE.md