# Production Deployment Checklist

## Architecture Overview

**Production Domains:**
- Frontend: `https://account.example.com`
- Oathkeeper Proxy: `https://auth.example.com` (port 4455)
- Django API: `https://api.example.com`
- Kratos: Internal only (ports 4433/4434)
- Oathkeeper API: Internal only (port 4456)

**All services run on the same VM**, so internal communication uses localhost or the Docker network.

---

## Pre-Deployment Checklist

### 1. Security Hardening

#### Kratos Secrets

```bash
# Generate new secrets for production
openssl rand -hex 16  # SECRETS_DEFAULT
openssl rand -hex 16  # SECRETS_COOKIE
openssl rand -hex 16  # SECRETS_CIPHER
```

Update in `nexus-5-auth-kratos/.env.production`:
- [ ] `SECRETS_DEFAULT` - New random value
- [ ] `SECRETS_COOKIE` - New random value
- [ ] `SECRETS_CIPHER` - New random value

#### Database Passwords

- [ ] Change `POSTGRES_PASSWORD` in `nexus-5-auth-kratos/.env.production`
- [ ] Update `KRATOS_DSN` with the new URL-encoded password

#### SMTP Configuration

- [ ] Verify SMTP credentials in `nexus-5-auth-kratos/config/kratos.yml` (line 128)
- [ ] Consider using an environment variable instead of a hardcoded value

### 2. SSL/TLS Configuration

#### Oathkeeper (https://auth.example.com)

- [ ] Configure reverse proxy (nginx/caddy) for SSL termination
- [ ] Install SSL certificate for `auth.example.com`
- [ ] Configure proxy to forward to `localhost:4455`

#### Frontend (https://account.example.com)

- [ ] Configure reverse proxy for SSL termination
- [ ] Install SSL certificate for `account.example.com`
- [ ] Configure proxy to forward to SvelteKit server (typically port 3000 or 5173)

#### CORS Configuration

Verify Oathkeeper CORS is configured (`nexus-5-auth-oathkeeper/config/oathkeeper.yml`):
- [x] `https://account.example.com` in allowed_origins
- [x] `https://auth.example.com` in allowed_origins
- [x] `https://api.example.com` in allowed_origins
- [x] `allow_credentials: true`

### 3. Environment Files

#### Replace .env files with production versions:

```bash
# Kratos
cp nexus-5-auth-kratos/.env.production nexus-5-auth-kratos/.env

# Oathkeeper
cp nexus-5-auth-oathkeeper/.env.production nexus-5-auth-oathkeeper/.env

# Frontend
cp nexus-5-auth-frontend/.env.production nexus-5-auth-frontend/.env
```

#### Verify all environment variables:

- [ ] `nexus-5-auth-kratos/.env`
- [ ] `nexus-5-auth-oathkeeper/.env`
- [ ] `nexus-5-auth-frontend/.env`

---

## Deployment Steps

### 1. Database Setup

```bash
cd nexus-5-auth-kratos

# Start PostgreSQL
docker compose up -d postgres

# Wait for PostgreSQL to be ready
docker compose logs -f postgres
# Wait for "database system is ready to accept connections"

# Run Kratos migrations
docker compose run --rm kratos migrate sql -e --yes
```

### 2. Deploy Kratos

```bash
cd nexus-5-auth-kratos

# Build and start Kratos
docker compose up -d kratos

# Verify it's running
docker compose ps
docker compose logs kratos

# Test health endpoint
curl http://localhost:4433/health/ready
```

**Expected response:**
```json
{"status": "ok"}
```

### 3. Deploy Oathkeeper

```bash
cd nexus-5-auth-oathkeeper

# Rebuild with updated config
docker compose build oathkeeper

# Start Oathkeeper
docker compose up -d oathkeeper

# Verify it's running
docker compose ps
docker compose logs oathkeeper

# Test health endpoint
curl http://localhost:4456/health/ready
```

**Expected response:**
```json
{"status": "ok"}
```

### 4. Test Access Rules

```bash
# List all configured rules
curl http://localhost:4456/rules | jq .

# Verify rule count (should be 9 rules)
curl -s http://localhost:4456/rules | jq 'length'
```

**Expected rules:**
1. `kratos:self-service`
2. `kratos:admin:identities`
3. `kratos:admin:recovery`
4. `kratos:admin:courier`
5. `kratos:admin:sessions`
6. `kratos:sessions:api`
7. `django:api:public`
8. `django:api:protected`
9. `django:api:v1`

### 5. Deploy Frontend

#### Option A: Docker Deployment (Recommended)

```bash
cd nexus-5-auth-frontend

# Copy production environment
cp .env.production .env

# Ensure ory-network exists
docker network create ory-network 2>/dev/null || true

# Build and start
docker compose up -d

# Verify it's running
docker compose ps
docker compose logs frontend

# Test that the frontend responds
curl http://localhost:3000/
```

**Expected response:** HTML page content

#### Option B: PM2 Deployment

```bash
cd nexus-5-auth-frontend

# Install dependencies
npm install

# Copy production environment
cp .env.production .env

# Build for production
npm run build

# Deploy with PM2
pm2 start npm --name "nexus-auth-frontend" -- start
pm2 save
```

#### Option C: Direct Node Deployment

```bash
cd nexus-5-auth-frontend

# Install dependencies
npm install

# Copy production environment
cp .env.production .env

# Build for production
npm run build

# Start with node
node build
```

### 6. Configure Reverse Proxy

#### Example Nginx Configuration

**File: `/etc/nginx/sites-available/auth.example.com`**
```nginx
server {
    listen 443 ssl http2;
    server_name auth.example.com;

    ssl_certificate /path/to/ssl/cert.pem;
    ssl_certificate_key /path/to/ssl/key.pem;

    location / {
        proxy_pass http://localhost:4455;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # WebSocket support (if needed)
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```

**File: `/etc/nginx/sites-available/account.example.com`**
```nginx
server {
    listen 443 ssl http2;
    server_name account.example.com;

    ssl_certificate /path/to/ssl/cert.pem;
    ssl_certificate_key /path/to/ssl/key.pem;

    location / {
        proxy_pass http://localhost:3000;  # Adjust to your SvelteKit port
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # WebSocket support for HMR (disable in production)
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```

Enable sites and reload nginx:
```bash
sudo ln -s /etc/nginx/sites-available/auth.example.com /etc/nginx/sites-enabled/
sudo ln -s /etc/nginx/sites-available/account.example.com /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl reload nginx
```

---

## Post-Deployment Testing

### 1. Health Checks

```bash
# Kratos public API
curl https://auth.example.com/health/ready

# Kratos admin API (through Oathkeeper - requires auth)
curl https://auth.example.com/admin/identities

# Oathkeeper API (internal)
curl http://localhost:4456/health/ready
```

### 2. Frontend Testing

Visit `https://account.example.com` and test:
- [ ] Registration flow
- [ ] Login flow
- [ ] Email verification
- [ ] Password recovery
- [ ] Settings page
- [ ] Admin dashboard (identity management)
- [ ] Session management
- [ ] Logout

### 3. WebAuthn Testing

- [ ] Register with passkey/security key
- [ ] Login with passkey/security key
- [ ] TOTP (authenticator app) setup
- [ ] TOTP login

### 4. API Testing

Test Django integration (once you have an authenticated session):

```bash
# Public API (no auth)
curl https://auth.example.com/api/public/

# Protected API (with session cookie)
curl -b cookies.txt https://auth.example.com/api/protected/

# Bearer token API
curl -H "Authorization: Bearer YOUR_TOKEN" https://auth.example.com/api/v1/
```

### 5. Verify Headers Forwarded to Django

Create a test identity and check headers received by Django:

**Expected headers from Oathkeeper:**
```
X-User-ID:
X-User-Email: user@example.com
X-User-First-Name: John
X-User-Last-Name: Doe
X-User-Phone: +1234567890
X-User-Profile-Type: team|customer
X-Django-Profile-ID:
X-Customer-ID:
```

---

## Django Backend Integration

### 1. Update Django Settings

Add trusted headers and CORS configuration:

```python
# settings.py

# Oathkeeper proxy headers
USE_X_FORWARDED_HOST = True
USE_X_FORWARDED_PORT = True
SECURE_PROXY_SSL_HEADER = ('HTTP_X_FORWARDED_PROTO', 'https')

# CORS settings
CORS_ALLOWED_ORIGINS = [
    "https://account.example.com",
    "https://auth.example.com",
]
CORS_ALLOW_CREDENTIALS = True

# Session/Cookie settings
SESSION_COOKIE_DOMAIN = '.example.com'
CSRF_COOKIE_DOMAIN = '.example.com'
SESSION_COOKIE_SECURE = True
CSRF_COOKIE_SECURE = True
SESSION_COOKIE_SAMESITE = 'Lax'
CSRF_COOKIE_SAMESITE = 'Lax'
```

### 2. Create Authentication Middleware

```python
# middleware/kratos_auth.py

class KratosAuthMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        # Extract Kratos identity from headers
        user_id = request.META.get('HTTP_X_USER_ID')
        user_email = request.META.get('HTTP_X_USER_EMAIL')
        first_name = request.META.get('HTTP_X_USER_FIRST_NAME')
        last_name = request.META.get('HTTP_X_USER_LAST_NAME')
        phone = request.META.get('HTTP_X_USER_PHONE')
        profile_type = request.META.get('HTTP_X_USER_PROFILE_TYPE')
        django_profile_id = request.META.get('HTTP_X_DJANGO_PROFILE_ID')
        customer_id = request.META.get('HTTP_X_CUSTOMER_ID')

        if user_id and user_email:
            # Look up or create user based on Kratos identity
            # Attach to request.user or request.kratos_user
            pass

        response = self.get_response(request)
        return response
```

Add to `MIDDLEWARE` in settings.py:
```python
MIDDLEWARE = [
    # ... other middleware
    'your_app.middleware.kratos_auth.KratosAuthMiddleware',
]
```

### 3. Sync Identity on Registration

When a user registers in Kratos, sync to Django:

**Option A: Webhook (recommended)**
- Configure a Kratos webhook to call the Django API on identity creation
- Django creates the corresponding TeamProfile/CustomerProfile
- Django returns `django_profile_id`, to be stored in Kratos `metadata_public`

**Option B: Poll/Manual Sync**
- Periodic task to sync new Kratos identities to Django
- Less real-time but simpler to implement

---

## Monitoring & Logging

### 1. Log Aggregation

Collect logs from all services:

```bash
# Kratos logs
docker compose -f nexus-5-auth-kratos/docker-compose.yml logs -f kratos

# Oathkeeper logs
docker compose -f nexus-5-auth-oathkeeper/docker-compose.yml logs -f oathkeeper

# Frontend logs (if using PM2)
pm2 logs nexus-auth-frontend
```

### 2. Metrics to Monitor

- [ ] Kratos health endpoint: `GET /health/ready`
- [ ] Oathkeeper health endpoint: `GET /health/ready`
- [ ] Database connection pool usage
- [ ] Session count
- [ ] Identity count
- [ ] Failed login attempts
- [ ] Email delivery failures

### 3. Set Log Levels

**Production log levels:**
- Kratos: `LOG_LEVEL=info`
- Oathkeeper: `log.level=info`
- Frontend: Configure in SvelteKit

---

## Backup & Recovery

### 1. Database Backups

```bash
# Backup Kratos database (-T disables the pseudo-TTY so the dump is written cleanly)
docker compose -f nexus-5-auth-kratos/docker-compose.yml exec -T postgres \
  pg_dump -U kratos kratos > kratos-backup-$(date +%Y%m%d).sql

# Restore
docker compose -f nexus-5-auth-kratos/docker-compose.yml exec -T postgres \
  psql -U kratos kratos < kratos-backup-20251014.sql
```

### 2. Configuration Backups

- [ ] Backup `nexus-5-auth-kratos/config/`
- [ ] Backup `nexus-5-auth-oathkeeper/config/`
- [ ] Backup `.env` files (encrypted storage!)
- [ ] Backup JWKS keys: `nexus-5-auth-oathkeeper/config/id_token.jwks.json`

---

## Rollback Plan

If issues occur in production:

### 1. Quick Rollback

```bash
# Stop services
docker compose down

# Revert tracked config to the previous commit
git checkout HEAD~1 nexus-5-auth-*/

# Restart with old config
docker compose up -d
```

### 2. Database Rollback

```bash
# Restore from backup
docker compose exec -T postgres psql -U kratos kratos < kratos-backup-YYYYMMDD.sql
```

---

## Security Checklist

- [ ] All secrets rotated for production
- [ ] SSL certificates installed and valid
- [ ] HTTPS enforced on all domains
- [ ] Database passwords strong and unique
- [ ] SMTP credentials secured
- [ ] Cookie domain set to `.example.com`
- [ ] Session cookies marked as Secure
- [ ] CORS properly configured
- [ ] Admin API requires authentication
- [ ] Rate limiting configured (if needed)
- [ ] Firewall rules: Only 443/80 exposed publicly
- [ ] Internal ports (4433, 4434, 4456, 5432) blocked from external access

---

## Support & Troubleshooting

### Common Issues

**Issue: "Cookie not being set"**
- Check that `session.cookie.domain` in kratos.yml is `example.com`
- Verify HTTPS is working
- Check browser dev tools > Application > Cookies

**Issue: "CORS errors"**
- Verify Oathkeeper CORS config includes all domains
- Check `allow_credentials: true`
- Verify the Origin header matches allowed_origins

**Issue: "Redirect loop"**
- Check `preserve_host` settings in access rules
- Verify Kratos `allowed_return_urls` includes production domains

**Issue: "WebAuthn not working"**
- Verify `webauthn.config.rp.id` is `example.com`
- Check `webauthn.config.rp.origins` includes production URLs
- Ensure HTTPS is working (WebAuthn requires a secure context)

### Debug Commands

```bash
# Check Oathkeeper rules
curl http://localhost:4456/rules | jq .

# Check Kratos sessions
curl -H "Cookie: ory_kratos_session=..." http://localhost:4433/sessions/whoami

# Test Oathkeeper decision API (served on the API port, 4456, not the proxy port)
curl -H "Cookie: ory_kratos_session=..." http://localhost:4456/decisions/admin/identities

# View Kratos configuration
docker compose exec kratos cat /etc/kratos/kratos.yml
```

---

## Production Deployment Complete! 🎉

Once all checklist items are complete, your Nexus 5 Auth system is production-ready with:

- ✅ Ory Kratos for identity management
- ✅ Ory Oathkeeper for authentication & authorization
- ✅ SvelteKit frontend with admin dashboard
- ✅ Full Django integration with custom headers
- ✅ Secure session management across subdomains
- ✅ WebAuthn/TOTP support
- ✅ Email verification & recovery
- ✅ Complete API endpoint coverage

**Next Steps:**
1. Monitor logs for the first 24 hours
2. Test all user flows in production
3. Set up automated backups
4. Configure monitoring/alerting
5. Document any environment-specific configurations
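
As a starting point for the monitoring/alerting step, the `/health/ready` checks used throughout this checklist can be scripted. The following is a minimal sketch, not part of the existing deployment: it assumes it runs on the VM itself (both ports are internal only) and that a ready service answers HTTP 200 with `{"status": "ok"}`, as shown in the deployment steps. The names `fetch`, `is_ready`, `check_all`, and `main` are hypothetical helpers introduced for illustration.

```python
import json
import urllib.error
import urllib.request

# Internal health endpoints taken from the checklist (Kratos public API, Oathkeeper API).
ENDPOINTS = {
    "kratos": "http://localhost:4433/health/ready",
    "oathkeeper": "http://localhost:4456/health/ready",
}


def fetch(url, timeout=5.0):
    """Return (HTTP status, body) for a GET request, or (None, error text) on failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as exc:
        # Non-2xx responses still carry a status code worth reporting.
        return exc.code, exc.read().decode("utf-8", "replace")
    except OSError as exc:
        # Covers connection refused, DNS errors, and timeouts (URLError is an OSError).
        return None, str(exc)


def is_ready(status, body):
    """True only for HTTP 200 with a JSON object whose "status" field is "ok"."""
    if status != 200:
        return False
    try:
        payload = json.loads(body)
    except ValueError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def check_all(endpoints=ENDPOINTS, fetcher=fetch):
    """Map each service name to True (ready) or False (not ready)."""
    return {name: is_ready(*fetcher(url)) for name, url in endpoints.items()}


def main():
    """Print one line per service and return a shell-style exit code."""
    results = check_all()
    for name, ok in results.items():
        print(f"{name}: {'ready' if ok else 'NOT READY'}")
    return 0 if all(results.values()) else 1
```

To run it from cron or a systemd timer, end the script with `raise SystemExit(main())`; the non-zero exit code on failure gives the scheduler something to alert on. The `fetcher` parameter exists so the readiness logic can be exercised without live services.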