ADR 009: Self-Hosted Mailcow with Resend Smarthost

Date: 2026-05-12 (Plan #027 Slices A–F shipped) / 2026-05-13 (Phase 4 soak close)

Status: Accepted

Context

The platform's transactional email pipeline (booking confirmations, password resets, notification fan-out) was wired to Resend in B-1 (2026-04-30), with notification-service and Keycloak both speaking SMTP to smtp.resend.com:587. That covered outbound transactional mail only — there were no @portugalodyssey.pt mailboxes a human could log into.

Three operational gaps remained:

  1. Cristina had no work mailbox on her own domain. She was still using a personal Gmail account for partner correspondence and platform admin alerts.
  2. No inbound destination for the apex domain. Mail sent to info@portugalodyssey.pt, support@…, bookings@… bounced because the domain published no MX record. DMARC aggregate reports (a soft requirement once we tighten policy) had no rua= destination because there was no mailbox to receive them.
  3. Customer-facing identity: a tourism marketplace whose landing page says "contact us at info@portugalodyssey.pt" and then bounces that mail has a credibility problem.

Three solutions were considered:

  1. Google Workspace (€6/mo/user × ~7 mailboxes = ~€42/mo, ~€500/yr). Zero ops burden, best deliverability, mature mobile clients. Costs scale linearly with mailbox count. Data residency: US-based (Workspace's PT region is a routing-only edge, not storage). GDPR posture requires a Data Processing Agreement and Standard Contractual Clauses for transfers — workable, paperwork-heavy.
  2. Migadu (€19/yr flat, unlimited mailboxes + aliases). Hosted in Switzerland, which is outside the EU but covered by an EU adequacy decision under GDPR. Reliable for personal/SMB use. Roughly 1/26th the price of Workspace (€19 vs ~€500 per year). Same zero-ops profile. Limits: small ops team, narrower SLA than Workspace.
  3. Self-hosted Mailcow on the existing Hostinger KVM 2 VPS. Open-source dockerised mail stack (Postfix + Dovecot + SOGo webmail + Rspamd + ClamAV + Unbound + 11 other containers, 17 in total). €0 incremental cost. Full data residency control (mail data sits on the VPS we already own). Operational burden: backups, blacklist remediation, version upgrades, deliverability tuning.
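The cost figures in the option list can be checked with a few lines of arithmetic. This is only a sketch using the numbers quoted above; the mailbox count is the approximate figure from option 1, and the constant and function names are illustrative.

```python
# Figures quoted in the option list; the mailbox count is approximate.
MAILBOXES = 7
WORKSPACE_EUR_PER_USER_MONTH = 6
MIGADU_EUR_PER_YEAR = 19


def workspace_yearly(mailboxes: int) -> int:
    """Workspace cost scales linearly with the number of mailboxes."""
    return WORKSPACE_EUR_PER_USER_MONTH * mailboxes * 12


workspace = workspace_yearly(MAILBOXES)
ratio = workspace / MIGADU_EUR_PER_YEAR

print(f"Workspace: EUR {workspace}/yr")
print(f"Migadu:    EUR {MIGADU_EUR_PER_YEAR}/yr (flat)")
print(f"Workspace costs about {ratio:.1f}x Migadu at {MAILBOXES} mailboxes")
```

Because the Workspace price is per user, doubling the mailbox count doubles its cost, while the Migadu and Mailcow figures stay flat.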

Deliverability is the make-or-break risk for option 3: a fresh VPS IP has zero sender reputation, and major receivers (Gmail, Outlook) routinely reject mail from cold IPs or file it as spam. Warming a single IP across the major mailbox providers takes weeks to months of careful sending.

Decision

Hybrid: self-host Mailcow on the qual VPS for mailbox storage + inbound MX, smarthost all outbound through Resend.

  • Inbound: Cloudflare DNS publishes mx.portugalodyssey.pt A → 31.97.159.7; apex MX → 10 mx.portugalodyssey.pt. Receiving servers connect to Mailcow's Postfix on port 25 directly. Mail lands in Dovecot mailboxes on the VPS.
  • Outbound: Mailcow's Postfix is configured with relayhost = [smtp.resend.com]:587 + SASL plain-text auth + STARTTLS. Every outbound message goes Mailcow → Resend → recipient. Resend's IPs are already warm, DKIM-signed (the resend._domainkey.portugalodyssey.pt selector signs both transactional and Mailcow-relayed mail), and SPF-aligned via the send.portugalodyssey.pt envelope-from rewrite.
  • Auth signals (validated 2026-05-12 with mail-tester 10/10 on a cold apex domain):
      • SPF: envelope-from is bounces+xxx@send.portugalodyssey.pt; send. has v=spf1 include:amazonses.com ~all. Resend's bounce subdomain is the SPF identity, not the apex.
      • DKIM: Resend signs with d=portugalodyssey.pt; s=resend. Both transactional (notification-service) and human-sent (Mailcow → Resend) mail carry the same signature, so DMARC alignment is uniform.
      • DMARC: _dmarc.portugalodyssey.pt publishes v=DMARC1; p=none; (monitor mode). Migration plan to p=quarantine at T+30 days (target ~2026-06-13), then p=reject at T+60 days assuming clean aggregate reports.
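The alignment reasoning in the auth-signal bullets can be illustrated with a small check. This is a simplified sketch, not a DMARC implementation: `org_domain` naively takes the last two labels (an assumption that happens to hold for portugalodyssey.pt; a real checker would use the Public Suffix List), and all function names are hypothetical. Relaxed alignment is assumed because the published record sets neither aspf= nor adkim=.

```python
def org_domain(domain: str) -> str:
    """Naive organizational domain: the last two labels.

    Assumption: sufficient for portugalodyssey.pt; real implementations
    consult the Public Suffix List instead.
    """
    return ".".join(domain.lower().rstrip(".").split(".")[-2:])


def aligned(header_from: str, auth_domain: str, strict: bool = False) -> bool:
    """Identifier alignment between the From: domain and an authenticated domain."""
    if strict:
        return header_from.lower() == auth_domain.lower()
    return org_domain(header_from) == org_domain(auth_domain)


def dmarc_pass(header_from: str, spf_domain: str, spf_ok: bool,
               dkim_domain: str, dkim_ok: bool) -> bool:
    """DMARC passes when SPF or DKIM both passes and aligns (relaxed mode)."""
    return (spf_ok and aligned(header_from, spf_domain)) or (
        dkim_ok and aligned(header_from, dkim_domain)
    )


HEADER_FROM = "portugalodyssey.pt"
SPF_DOMAIN = "send.portugalodyssey.pt"  # envelope-from domain
DKIM_DOMAIN = "portugalodyssey.pt"      # DKIM d= tag

print("SPF relaxed alignment:", aligned(HEADER_FROM, SPF_DOMAIN))
print("SPF strict alignment: ", aligned(HEADER_FROM, SPF_DOMAIN, strict=True))
print("DKIM strict alignment:", aligned(HEADER_FROM, DKIM_DOMAIN, strict=True))
print("DMARC pass:           ", dmarc_pass(HEADER_FROM, SPF_DOMAIN, True, DKIM_DOMAIN, True))
```

The point of the sketch is that the send. subdomain aligns for SPF only in relaxed mode, whereas the DKIM signature aligns exactly, so DKIM is the signal that would survive a later move to strict alignment.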

This decision was effectively predetermined by the transactional pipeline (B-1, Resend) — once Resend was the warm-IP egress, smarthosting Mailcow through it was the only credible path to deliverability for a self-hosted mailbox setup. The actual decision was option 3 vs option 1/2; the smarthost mechanic followed.

Resource footprint

Mailcow runs as 17 containers under compose project po-mail, isolated from the existing po-shared / po-qual / po-prod projects. Steady-state memory ~1.7 GB; CPU < 5% on a 2-vCPU/8GB-RAM KVM 2 VPS. Disk: ~3 GB images + per-mailbox storage. Resource limits applied per the B-3 5-tier convention.

Routing through Traefik

Mailcow's nginx (port 8081 inside container, 9080/9443 on host) terminates HTTPS for its admin UI and SOGo webmail. We route the four mail-related web hostnames (mail-admin / webmail / autoconfig / autodiscover.portugalodyssey.pt) through Traefik in po-shared to the Mailcow nginx via port 9443 with insecureSkipVerify=true (Mailcow uses an internal snake-oil cert; Traefik owns the public Let's Encrypt cert).

Raw SMTP/IMAP ports (25/465/587/993/4190) bind directly to the host; Traefik does not proxy these. The host firewall allows these ports from any IPv4 source (0.0.0.0/0).
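A quick way to confirm that the directly bound ports are reachable from outside is a plain TCP connect test. The sketch below uses only the standard library; the port labels are the conventional service names and the helper names are hypothetical. It checks reachability only and says nothing about TLS or authentication.

```python
import socket

# The directly bound mail ports listed above, with their conventional roles.
MAIL_PORTS = {
    25: "SMTP (inbound MX)",
    465: "SMTPS (implicit TLS submission)",
    587: "submission (STARTTLS)",
    993: "IMAPS",
    4190: "ManageSieve",
}


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_mail_ports(host: str) -> dict[int, bool]:
    """Probe every mail port on the given host."""
    return {port: port_open(host, port) for port in MAIL_PORTS}
```

Calling `check_mail_ports("mx.portugalodyssey.pt")` from a machine outside the VPS would return one boolean per port. Many residential ISPs block outbound port 25, so a False for 25 from a home connection does not by itself indicate a firewall problem.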

What this is NOT

  • Not a substitute for the transactional pipeline. notification-service and Keycloak still SMTP-relay directly to Resend on port 587. Mailcow is for human mailboxes only.
  • Not a high-availability mail setup. The qual VPS is single-node; if it goes down, inbound mail queues on the sending servers, which retry for a limited period before bouncing (RFC 5321 recommends a give-up time of at least 4-5 days, though some senders give up after 24-72h). That's an acceptable SLA at pre-launch scale.
  • Not a multi-tenant mail offering. There is exactly one tenant (portugalodyssey.pt); the mail domain is platform-owned, not partner-facing.

Alternatives Considered

Option | Cost (year 1) | Ops burden | Deliverability | Decision
Google Workspace | ~€500 | Zero | Best-in-class warm IPs | Rejected: cost + US data residency adds GDPR paperwork
Migadu (CH-hosted) | €19 | Zero | Good (shared warm pool) | Rejected: storage outside our trust boundary; vendor concentration with Resend already
Mailcow + Resend smarthost | €0 incremental | Real (~2-4h/month) | Excellent via Resend egress | Accepted
Mailcow direct egress (no smarthost) | €0 | Real + weeks of IP warming | Poor initially; needs careful warming | Rejected: launch-blocking timeline risk
Postfix-only self-host | €0 | Higher (no admin UI; manual everything) | Same as Mailcow | Rejected: Mailcow's admin UI + Rspamd/ClamAV bundles are free value

Consequences

Positive

  • €0 marginal cost for as many mailboxes/aliases as we want. Current footprint: 7 mailboxes + 6 aliases (Slice E).
  • Data residency is the qual VPS (Lisbon-routed Hostinger DC). All mail content stays inside the same trust boundary as the rest of platform data.
  • postmaster@portugalodyssey.pt exists, unblocking rua= aggregate-report destination for the planned DMARC p=quarantine migration.
  • Operationally consistent with the rest of the platform: Mailcow runs under Docker Compose, sits behind Traefik for its web surfaces, integrates with the existing TLS infrastructure (Let's Encrypt wildcard via Cloudflare DNS-01).
  • Mail-tester 10/10 on first cold send validates the deliverability path end-to-end.

Negative

  • Real operational burden: Mailcow upgrades happen ~monthly (security patches in the Postfix/Dovecot/Rspamd containers). Upgrades are currently run by hand with make mail-update; there is no auto-update.
  • Backup is a hand-built path (Slice G, deferred at Plan #027 close). Until Slice G ships, a VPS-loss event means inbox content loss (transactional mail survives via Resend's logs).
  • Single point of failure: if the qual VPS goes down, both platform services AND mail go down together. Acceptable at pre-launch; revisit when we have a real customer load.
  • Resend coupling: if Resend goes down or rate-limits us, outbound mail from Mailcow queues. The transactional pipeline shares this risk (B-1 already coupled us to Resend). Mitigation: if Resend becomes a chronic problem, swap the smarthost — Mailcow's relayhost is a one-line config change.
  • Cold-apex risk during initial DMARC tightening: if we move p=none → p=quarantine too aggressively, some legitimate-but-unsigned automated mail nobody remembers configuring could be quarantined. Mitigation: 30 days of rua= aggregate reports BEFORE the policy tightens.
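The DMARC mitigation above depends on actually reading the aggregate reports before tightening policy. The sketch below shows one way to pull the failing sources out of a report. The sample XML is invented for illustration (documentation-range IPs, made-up counts) and follows the RFC 7489 aggregate report layout without an XML namespace, which is an assumption about what reporters send. Real reports usually arrive gzip- or zip-compressed and would need unpacking first.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical report excerpt: one healthy source, one forgotten sender.
SAMPLE_REPORT = """
<feedback>
  <record>
    <row>
      <source_ip>192.0.2.10</source_ip>
      <count>12</count>
      <policy_evaluated>
        <disposition>none</disposition>
        <dkim>pass</dkim>
        <spf>pass</spf>
      </policy_evaluated>
    </row>
  </record>
  <record>
    <row>
      <source_ip>198.51.100.7</source_ip>
      <count>3</count>
      <policy_evaluated>
        <disposition>none</disposition>
        <dkim>fail</dkim>
        <spf>fail</spf>
      </policy_evaluated>
    </row>
  </record>
</feedback>
"""


def failing_sources(report_xml: str) -> Counter:
    """Count messages per source IP that failed both DKIM and SPF evaluation."""
    failures: Counter = Counter()
    root = ET.fromstring(report_xml)
    for row in root.iter("row"):
        evaluated = row.find("policy_evaluated")
        if evaluated is None:
            continue
        dkim = evaluated.findtext("dkim")
        spf = evaluated.findtext("spf")
        # DMARC fails only when neither mechanism produced an aligned pass.
        if dkim != "pass" and spf != "pass":
            failures[row.findtext("source_ip")] += int(row.findtext("count", "0"))
    return failures


print(failing_sources(SAMPLE_REPORT))
```

Any source IP that shows up in this counter during the p=none window is mail that would be quarantined after the policy change, so each one needs to be either fixed or confirmed as illegitimate first.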

Operational follow-ups (Slice G + H deferred at Plan #027 close)

  • Slice G: Mailcow backup → MinIO po-mail-backups (daily 03:00 UTC), retention 7d+4w+6m. Prometheus exporter for Mailcow + Grafana dashboard. Filed as Riff in docs/ai/backlog.md ("Filed 2026-05-13").
  • DMARC tightening: rua= added soon for visibility; p=quarantine at T+30 days (~2026-06-13).
  • IPv6 mail: Not supported today (no AAAA on mx., IPv4-only Resend MX target). Three coordinated changes needed if we ever enable: AAAA on mx., Hostinger IPv6 PTR change, Mailcow inet_protocols = all. No urgency.
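The 7d+4w+6m retention planned for Slice G can be read as a daily/weekly/monthly scheme: keep the newest 7 daily backups, the newest backup in each of the last 4 ISO weeks, and the newest backup in each of the last 6 months. Slice G is not built yet, so the following is only one possible interpretation; the function name and the bucketing rules are assumptions.

```python
from datetime import date, timedelta


def backups_to_keep(backup_days: list[date], daily: int = 7,
                    weekly: int = 4, monthly: int = 6) -> set[date]:
    """Return the backup dates retained under a daily/weekly/monthly scheme."""
    newest_first = sorted(set(backup_days), reverse=True)
    keep = set(newest_first[:daily])

    def newest_per_bucket(bucket_key, limit):
        seen = {}
        for day in newest_first:
            # The first day seen for a bucket is the newest one in it.
            seen.setdefault(bucket_key(day), day)
        # dicts preserve insertion order, so buckets come out newest-first.
        return list(seen.values())[:limit]

    keep.update(newest_per_bucket(lambda d: tuple(d.isocalendar())[:2], weekly))
    keep.update(newest_per_bucket(lambda d: (d.year, d.month), monthly))
    return keep


# One backup per day for a year, ending on the Phase 4 soak-close date.
today = date(2026, 5, 13)
history = [today - timedelta(days=i) for i in range(365)]
kept = backups_to_keep(history)
print(f"{len(kept)} of {len(history)} backups retained")
for day in sorted(kept, reverse=True):
    print(day.isoformat())
```

Everything not in the returned set is a pruning candidate. The tiers overlap (the newest backup counts as daily, weekly, and monthly at once), so the retained count is lower than 7 + 4 + 6.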

References

  • Execution log: docs/implementation-plans/027-mail-server-mailcow/EXECUTION.md
  • Operations runbook: docs/devops/mail/mailcow-operations.md
  • Cristina IMAP setup guide: docs/clients/operations/mail-setup-guide.md
  • Mailcow upstream docs: https://docs.mailcow.email/
  • Resend SMTP smarthost docs: https://resend.com/docs/send-with-smtp
  • Spamhaus public-resolver throttle (127.255.255.254 code): https://www.spamhaus.org/news/article/807/