# Production Deployment Checklist

## Architecture Overview

**Production Domains:**
- Frontend: `https://account.example.com`
- Oathkeeper Proxy: `https://auth.example.com` (port 4455)
- Django API: `https://api.example.com`
- Kratos: Internal only (ports 4433/4434)
- Oathkeeper API: Internal only (port 4456)

**All services run on the same VM**, so internal communication uses localhost/docker network.

---

## Pre-Deployment Checklist

### 1. Security Hardening

#### Kratos Secrets
```bash
# Generate new secrets for production (each command prints 32 hex characters)
openssl rand -hex 16  # SECRETS_DEFAULT
openssl rand -hex 16  # SECRETS_COOKIE
openssl rand -hex 16  # SECRETS_CIPHER
```

Update in `nexus-5-auth-kratos/.env.production`:
- [ ] `SECRETS_DEFAULT` - New random value
- [ ] `SECRETS_COOKIE` - New random value
- [ ] `SECRETS_CIPHER` - New random value

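If provisioning is scripted, the same values can be produced without `openssl`. The following is a minimal sketch using Python's standard `secrets` module; the function names are illustrative and not part of Kratos. `secrets.token_hex(16)` returns 32 hexadecimal characters, the same length that `openssl rand -hex 16` prints.

```python
import secrets

# Variable names taken from the checklist above.
SECRET_NAMES = ("SECRETS_DEFAULT", "SECRETS_COOKIE", "SECRETS_CIPHER")


def generate_kratos_secrets(num_bytes: int = 16) -> dict[str, str]:
    """Return one independent random hex secret per variable name."""
    return {name: secrets.token_hex(num_bytes) for name in SECRET_NAMES}


def format_env_lines(values: dict[str, str]) -> str:
    """Render KEY=value lines suitable for pasting into an env file."""
    return "\n".join(f"{key}={value}" for key, value in values.items())
```

Calling `print(format_env_lines(generate_kratos_secrets()))` prints three lines that can be pasted into `.env.production`. Avoid writing the output to shell history or logs.
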
#### Database Passwords
- [ ] Change `POSTGRES_PASSWORD` in `nexus-5-auth-kratos/.env.production`
- [ ] Update `KRATOS_DSN` with the new URL-encoded password

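Passwords containing characters such as `@`, `/`, `:` or `#` break the DSN unless they are percent-encoded. The sketch below shows one way to build the value with the standard library. The user name, host, port, database name, and `sslmode` shown in the usage note are assumptions for illustration; use whatever your `.env.production` already defines.

```python
from urllib.parse import quote


def build_kratos_dsn(user: str, password: str, host: str, port: int,
                     database: str, sslmode: str = "disable") -> str:
    """Build a PostgreSQL DSN with the credentials percent-encoded."""
    # safe="" forces encoding of every reserved character, including "/".
    encoded_user = quote(user, safe="")
    encoded_password = quote(password, safe="")
    return (
        f"postgres://{encoded_user}:{encoded_password}"
        f"@{host}:{port}/{database}?sslmode={sslmode}"
    )
```

For example, `build_kratos_dsn("kratos", "p@ss/w:rd#1", "postgres", 5432, "kratos")` encodes the password as `p%40ss%2Fw%3Ard%231` inside the resulting DSN.
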
#### SMTP Configuration
- [ ] Verify SMTP credentials in `nexus-5-auth-kratos/config/kratos.yml` (line 128)
- [ ] Consider supplying the SMTP connection URI through an environment variable (for example `COURIER_SMTP_CONNECTION_URI`) instead of hardcoding it in the config file

### 2. SSL/TLS Configuration

#### Oathkeeper (https://auth.example.com)
- [ ] Configure reverse proxy (nginx/caddy) for SSL termination
- [ ] Install SSL certificate for `auth.example.com`
- [ ] Configure proxy to forward to `localhost:4455`

#### Frontend (https://account.example.com)
- [ ] Configure reverse proxy for SSL termination
- [ ] Install SSL certificate for `account.example.com`
- [ ] Configure proxy to forward to the SvelteKit server (port 3000 by default for a Node build; 5173 is the Vite dev server and should not be used in production)

#### CORS Configuration
Verify Oathkeeper CORS is configured (`nexus-5-auth-oathkeeper/config/oathkeeper.yml`):
- [x] `https://account.example.com` in allowed_origins
- [x] `https://auth.example.com` in allowed_origins
- [x] `https://api.example.com` in allowed_origins
- [x] `allow_credentials: true`

### 3. Environment Files

#### Replace .env files with production versions:
```bash
# Kratos
cp nexus-5-auth-kratos/.env.production nexus-5-auth-kratos/.env

# Oathkeeper
cp nexus-5-auth-oathkeeper/.env.production nexus-5-auth-oathkeeper/.env

# Frontend
cp nexus-5-auth-frontend/.env.production nexus-5-auth-frontend/.env
```

#### Verify all environment variables:
- [ ] `nexus-5-auth-kratos/.env`
- [ ] `nexus-5-auth-oathkeeper/.env`
- [ ] `nexus-5-auth-frontend/.env`

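The verification step can be partly automated. The following sketch parses a simple `KEY=value` env file and reports required keys that are missing or empty. It handles only plain assignments, comments, and optional surrounding quotes, which is an assumption about how these files are written. The required-key list in the usage note reuses names from this checklist; extend it to match your files.

```python
from pathlib import Path


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blanks, comments, and malformed lines."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding double or single quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_missing(values: dict[str, str], required: list[str]) -> list[str]:
    """Return required keys that are absent or set to an empty string."""
    return [key for key in required if not values.get(key)]


def check_env_file(path: str, required: list[str]) -> list[str]:
    """Read an env file from disk and list its missing required keys."""
    return find_missing(parse_env_text(Path(path).read_text()), required)
```

A call such as `check_env_file("nexus-5-auth-kratos/.env", ["SECRETS_DEFAULT", "SECRETS_COOKIE", "SECRETS_CIPHER", "POSTGRES_PASSWORD", "KRATOS_DSN"])` returns an empty list when every key has a value.
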
---

## Deployment Steps

### 1. Database Setup

```bash
cd nexus-5-auth-kratos

# Start PostgreSQL
docker compose up -d postgres

# Wait for PostgreSQL to be ready
docker compose logs -f postgres
# Wait for "database system is ready to accept connections"

# Run Kratos migrations
docker compose run --rm kratos migrate sql -e --yes
```

### 2. Deploy Kratos

```bash
cd nexus-5-auth-kratos

# Start Kratos
docker compose up -d kratos

# Verify it's running
docker compose ps
docker compose logs kratos

# Test health endpoint
curl http://localhost:4433/health/ready
```

**Expected response:**
```json
{"status": "ok"}
```

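A container can take a few seconds to become ready after `docker compose up -d`, so a single `curl` may fail even when the deployment is healthy. The sketch below polls a readiness URL until it reports `{"status": "ok"}`. The retry count and delay are arbitrary assumptions, and the `fetch` and `sleep` parameters exist only so the loop can be exercised without a live service.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_status(url: str, timeout: float = 3.0) -> str:
    """Return the "status" field of the JSON body served at url."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body.get("status", "")


def wait_until_ready(url: str,
                     fetch: Callable[[str], str] = fetch_status,
                     attempts: int = 30,
                     delay: float = 2.0,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll url until it reports status "ok" or the attempts run out."""
    for attempt in range(attempts):
        try:
            if fetch(url) == "ok":
                return True
        except (OSError, ValueError):
            # Connection refused, HTTP error, or a non-JSON body: retry.
            pass
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

For Kratos this would be called as `wait_until_ready("http://localhost:4433/health/ready")`; the same function works for the Oathkeeper API on port 4456.
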
### 3. Deploy Oathkeeper

```bash
cd nexus-5-auth-oathkeeper

# Rebuild with updated config
docker compose build oathkeeper

# Start Oathkeeper
docker compose up -d oathkeeper

# Verify it's running
docker compose ps
docker compose logs oathkeeper

# Test health endpoint
curl http://localhost:4456/health/ready
```

**Expected response:**
```json
{"status": "ok"}
```

### 4. Test Access Rules

```bash
# List all configured rules
curl http://localhost:4456/rules | jq .

# Verify rule count (should be 9 rules)
curl -s http://localhost:4456/rules | jq 'length'
```

**Expected rules:**
1. `kratos:self-service`
2. `kratos:admin:identities`
3. `kratos:admin:recovery`
4. `kratos:admin:courier`
5. `kratos:admin:sessions`
6. `kratos:sessions:api`
7. `django:api:public`
8. `django:api:protected`
9. `django:api:v1`

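Counting rules confirms the total but not which rule is missing. The sketch below compares the rule IDs in the `/rules` response against the expected list above and reports differences in both directions. It assumes the response is a JSON array of objects that each carry an `id` field.

```python
import json

# Rule IDs copied from the "Expected rules" list above.
EXPECTED_RULE_IDS = frozenset({
    "kratos:self-service",
    "kratos:admin:identities",
    "kratos:admin:recovery",
    "kratos:admin:courier",
    "kratos:admin:sessions",
    "kratos:sessions:api",
    "django:api:public",
    "django:api:protected",
    "django:api:v1",
})


def compare_rules(rules_json: str,
                  expected: frozenset = EXPECTED_RULE_IDS) -> tuple[list[str], list[str]]:
    """Return (missing_ids, unexpected_ids) for a /rules JSON response."""
    actual = {rule["id"] for rule in json.loads(rules_json)}
    return sorted(expected - actual), sorted(actual - expected)
```

Feed it the body returned by `curl -s http://localhost:4456/rules`; two empty lists mean the deployed rule set matches the checklist.
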
### 5. Deploy Frontend

#### Option A: Docker Deployment (Recommended)

```bash
cd nexus-5-auth-frontend

# Copy production environment
cp .env.production .env

# Ensure ory-network exists
docker network create ory-network 2>/dev/null || true

# Build and start
docker compose up -d

# Verify it's running
docker compose ps
docker compose logs frontend

# Confirm the frontend responds
curl http://localhost:3000/
```

**Expected response:** HTML page content

#### Option B: PM2 Deployment

```bash
cd nexus-5-auth-frontend

# Install dependencies
npm install

# Copy production environment
cp .env.production .env

# Build for production
npm run build

# Deploy with PM2 (runs the "start" script from package.json)
pm2 start npm --name "nexus-auth-frontend" -- start
pm2 save
```

#### Option C: Direct Node Deployment

```bash
cd nexus-5-auth-frontend

# Install dependencies
npm install

# Copy production environment
cp .env.production .env

# Build for production
npm run build

# Start with node
node build
```

### 6. Configure Reverse Proxy

#### Example Nginx Configuration

**File: `/etc/nginx/sites-available/auth.example.com`**
```nginx
server {
    listen 443 ssl http2;
    server_name auth.example.com;

    ssl_certificate /path/to/ssl/cert.pem;
    ssl_certificate_key /path/to/ssl/key.pem;

    location / {
        proxy_pass http://localhost:4455;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # WebSocket support (if needed)
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```

**File: `/etc/nginx/sites-available/account.example.com`**
```nginx
server {
    listen 443 ssl http2;
    server_name account.example.com;

    ssl_certificate /path/to/ssl/cert.pem;
    ssl_certificate_key /path/to/ssl/key.pem;

    location / {
        proxy_pass http://localhost:3000;  # Adjust to your SvelteKit port
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # WebSocket upgrade headers (only needed if the app uses WebSockets;
        # HMR does not run in a production build)
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```

Enable sites and reload nginx:
```bash
sudo ln -s /etc/nginx/sites-available/auth.example.com /etc/nginx/sites-enabled/
sudo ln -s /etc/nginx/sites-available/account.example.com /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl reload nginx
```

---

## Post-Deployment Testing

### 1. Health Checks

```bash
# Kratos public API (through Oathkeeper)
curl https://auth.example.com/health/ready

# Kratos admin API (through Oathkeeper - requires auth)
curl https://auth.example.com/admin/identities

# Oathkeeper API (internal, run on the VM)
curl http://localhost:4456/health/ready
```

### 2. Frontend Testing

Visit `https://account.example.com` and test:
- [ ] Registration flow
- [ ] Login flow
- [ ] Email verification
- [ ] Password recovery
- [ ] Settings page
- [ ] Admin dashboard (identity management)
- [ ] Session management
- [ ] Logout

### 3. WebAuthn and TOTP Testing

- [ ] Register with passkey/security key
- [ ] Login with passkey/security key
- [ ] TOTP (authenticator app) setup
- [ ] TOTP login

### 4. API Testing

Test Django integration (once you have an authenticated session):

```bash
# Public API (no auth)
curl https://auth.example.com/api/public/

# Protected API (with session cookie)
curl -b cookies.txt https://auth.example.com/api/protected/

# Bearer token API
curl -H "Authorization: Bearer YOUR_TOKEN" https://auth.example.com/api/v1/
```

### 5. Verify Headers Forwarded to Django

Create a test identity and check headers received by Django:

**Expected headers from Oathkeeper:**
```
X-User-ID: <kratos_identity_id>
X-User-Email: user@example.com
X-User-First-Name: John
X-User-Last-Name: Doe
X-User-Phone: +1234567890
X-User-Profile-Type: team|customer
X-Django-Profile-ID: <uuid>
X-Customer-ID: <uuid>
```

---

## Django Backend Integration

### 1. Update Django Settings

Add trusted headers and CORS configuration:

```python
# settings.py

# Oathkeeper proxy headers
USE_X_FORWARDED_HOST = True
USE_X_FORWARDED_PORT = True
SECURE_PROXY_SSL_HEADER = ('HTTP_X_FORWARDED_PROTO', 'https')

# CORS settings (provided by the django-cors-headers package)
CORS_ALLOWED_ORIGINS = [
    "https://account.example.com",
    "https://auth.example.com",
]
CORS_ALLOW_CREDENTIALS = True

# Session/Cookie settings
SESSION_COOKIE_DOMAIN = '.example.com'
CSRF_COOKIE_DOMAIN = '.example.com'
SESSION_COOKIE_SECURE = True
CSRF_COOKIE_SECURE = True
SESSION_COOKIE_SAMESITE = 'Lax'
CSRF_COOKIE_SAMESITE = 'Lax'
```

### 2. Create Authentication Middleware

```python
# middleware/kratos_auth.py

class KratosAuthMiddleware:
    """Read the identity headers that Oathkeeper attaches to each request."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        # Extract Kratos identity from headers
        user_id = request.META.get('HTTP_X_USER_ID')
        user_email = request.META.get('HTTP_X_USER_EMAIL')
        first_name = request.META.get('HTTP_X_USER_FIRST_NAME')
        last_name = request.META.get('HTTP_X_USER_LAST_NAME')
        phone = request.META.get('HTTP_X_USER_PHONE')
        profile_type = request.META.get('HTTP_X_USER_PROFILE_TYPE')
        django_profile_id = request.META.get('HTTP_X_DJANGO_PROFILE_ID')
        customer_id = request.META.get('HTTP_X_CUSTOMER_ID')

        if user_id and user_email:
            # Look up or create user based on Kratos identity
            # Attach to request.user or request.kratos_user
            pass

        response = self.get_response(request)
        return response
```

Add to `MIDDLEWARE` in settings.py:
```python
MIDDLEWARE = [
    # ... other middleware
    'your_app.middleware.kratos_auth.KratosAuthMiddleware',
]
```

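The placeholder branch in the middleware can be filled in many ways. One possible approach, sketched below, is to collect the headers into an immutable value object that the middleware attaches as `request.kratos_user`. The `KratosIdentity` class and `identity_from_meta` function are hypothetical names introduced for illustration; the header keys are the ones listed in this document.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class KratosIdentity:
    """Identity attributes forwarded by Oathkeeper (hypothetical container)."""
    user_id: str
    email: str
    first_name: Optional[str] = None
    last_name: Optional[str] = None
    phone: Optional[str] = None
    profile_type: Optional[str] = None
    django_profile_id: Optional[str] = None
    customer_id: Optional[str] = None


def identity_from_meta(meta: Mapping[str, str]) -> Optional[KratosIdentity]:
    """Build a KratosIdentity from request.META, or None if unauthenticated."""
    user_id = meta.get("HTTP_X_USER_ID")
    email = meta.get("HTTP_X_USER_EMAIL")
    if not user_id or not email:
        return None
    return KratosIdentity(
        user_id=user_id,
        email=email,
        first_name=meta.get("HTTP_X_USER_FIRST_NAME"),
        last_name=meta.get("HTTP_X_USER_LAST_NAME"),
        phone=meta.get("HTTP_X_USER_PHONE"),
        profile_type=meta.get("HTTP_X_USER_PROFILE_TYPE"),
        django_profile_id=meta.get("HTTP_X_DJANGO_PROFILE_ID"),
        customer_id=meta.get("HTTP_X_CUSTOMER_ID"),
    )
```

Inside `__call__`, this would be used as `request.kratos_user = identity_from_meta(request.META)`. These headers are only trustworthy when Django is reachable exclusively through Oathkeeper, because any client that can reach Django directly could set them itself.
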
### 3. Sync Identity on Registration

When a user registers in Kratos, sync to Django:

**Option A: Webhook (recommended)**
- Configure Kratos webhook to call Django API on identity creation
- Django creates corresponding TeamProfile/CustomerProfile
- Returns django_profile_id to be stored in Kratos metadata_public

**Option B: Poll/Manual Sync**
- Periodic task to sync new Kratos identities to Django
- Less real-time but simpler to implement

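The core of Option A can be sketched as a pure function that the Django webhook view would call. The payload fields `identity_id` and `profile_type` are hypothetical: Kratos webhook bodies are shaped by a Jsonnet template that you write, so the real field names depend on your template. The model names come from the list above; persisting the profile is left out.

```python
import uuid
from typing import Callable

# Maps the profile type sent by the webhook to the Django model to create.
PROFILE_MODELS = {"team": "TeamProfile", "customer": "CustomerProfile"}


def handle_identity_created(payload: dict,
                            new_id: Callable[[], uuid.UUID] = uuid.uuid4) -> dict:
    """Validate a webhook payload and describe the profile to create."""
    identity_id = payload.get("identity_id")
    profile_type = payload.get("profile_type")
    if not identity_id:
        raise ValueError("payload is missing identity_id")
    if profile_type not in PROFILE_MODELS:
        raise ValueError(f"unknown profile_type: {profile_type!r}")
    return {
        "model": PROFILE_MODELS[profile_type],
        "kratos_identity_id": identity_id,
        # Returned to the caller so it can be stored in metadata_public.
        "django_profile_id": str(new_id()),
    }
```

Keeping validation separate from the view makes the mapping easy to test without a running Kratos instance.
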
---

## Monitoring & Logging

### 1. Log Aggregation

Collect logs from all services:
```bash
# Kratos logs
docker compose -f nexus-5-auth-kratos/docker-compose.yml logs -f kratos

# Oathkeeper logs
docker compose -f nexus-5-auth-oathkeeper/docker-compose.yml logs -f oathkeeper

# Frontend logs (if using PM2)
pm2 logs nexus-auth-frontend
```

### 2. Metrics to Monitor

- [ ] Kratos health endpoint: `GET /health/ready` (port 4433)
- [ ] Oathkeeper health endpoint: `GET /health/ready` (port 4456)
- [ ] Database connection pool usage
- [ ] Session count
- [ ] Identity count
- [ ] Failed login attempts
- [ ] Email delivery failures

### 3. Set Log Levels

**Production log levels:**
- Kratos: `LOG_LEVEL=info`
- Oathkeeper: `log.level=info`
- Frontend: Configure in SvelteKit

---

## Backup & Recovery

### 1. Database Backups

```bash
# Backup Kratos database (-T disables TTY allocation so the redirected dump stays clean)
docker compose -f nexus-5-auth-kratos/docker-compose.yml exec -T postgres \
  pg_dump -U kratos kratos > kratos-backup-$(date +%Y%m%d).sql

# Restore
docker compose -f nexus-5-auth-kratos/docker-compose.yml exec -T postgres \
  psql -U kratos kratos < kratos-backup-20251014.sql
```

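Daily dumps accumulate quickly, so an automated backup job usually needs a retention rule. The sketch below selects dump files older than a cutoff based on the `kratos-backup-YYYYMMDD.sql` naming used above. The 14-day retention period is an assumption, not a recommendation from this checklist, and the function only lists candidates; deleting them is left to the caller.

```python
import re
from datetime import date, datetime, timedelta
from typing import Iterable, Optional

BACKUP_PATTERN = re.compile(r"^kratos-backup-(\d{8})\.sql$")


def backup_date(filename: str) -> Optional[date]:
    """Extract the date from a backup filename, or None if it does not match."""
    match = BACKUP_PATTERN.match(filename)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%Y%m%d").date()
    except ValueError:
        # Eight digits that do not form a real calendar date.
        return None


def expired_backups(filenames: Iterable[str], today: date,
                    keep_days: int = 14) -> list[str]:
    """Return backup filenames dated earlier than today minus keep_days."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in filenames:
        dated = backup_date(name)
        if dated is not None and dated < cutoff:
            expired.append(name)
    return sorted(expired)
```

Files that do not match the naming pattern are never selected, so unrelated files in the backup directory are left alone.
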
### 2. Configuration Backups

- [ ] Backup `nexus-5-auth-kratos/config/`
- [ ] Backup `nexus-5-auth-oathkeeper/config/`
- [ ] Backup `.env` files (encrypted storage!)
- [ ] Backup JWKS keys: `nexus-5-auth-oathkeeper/config/id_token.jwks.json`

---

## Rollback Plan

If issues occur in production:

### 1. Quick Rollback
```bash
# Stop services (run in each service directory)
docker compose down

# Revert service directories to the previous commit
# (only files tracked in git are reverted; restore untracked .env files from backup)
git checkout HEAD~1 -- nexus-5-auth-*/

# Restart with old config (run in each service directory)
docker compose up -d
```

### 2. Database Rollback
```bash
# Restore from backup
docker compose -f nexus-5-auth-kratos/docker-compose.yml exec -T postgres \
  psql -U kratos kratos < kratos-backup-YYYYMMDD.sql
```

---

## Security Checklist

- [ ] All secrets rotated for production
- [ ] SSL certificates installed and valid
- [ ] HTTPS enforced on all domains
- [ ] Database passwords strong and unique
- [ ] SMTP credentials secured
- [ ] Cookie domain set to `.example.com`
- [ ] Session cookies marked as Secure
- [ ] CORS properly configured
- [ ] Admin API requires authentication
- [ ] Rate limiting configured (if needed)
- [ ] Firewall rules: Only 443/80 exposed publicly
- [ ] Internal ports (4433, 4434, 4456, 5432) blocked from external access

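The last two items can be spot-checked from a machine outside the VM. The sketch below attempts a TCP connection to each internal port and reports any that accept it. The port list comes from the checklist above; the function names and timeout are illustrative. A plain connect test does not replace a review of the actual firewall rules.

```python
import socket
from typing import Iterable

# Internal ports that should not be reachable from the public internet.
INTERNAL_PORTS = (4433, 4434, 4456, 5432)


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, filtered (timed out), or unresolvable host.
        return False


def exposed_ports(host: str, ports: Iterable[int] = INTERNAL_PORTS,
                  timeout: float = 2.0) -> list[int]:
    """Return the subset of ports that accept connections on host."""
    return [port for port in ports if is_port_open(host, port, timeout)]
```

Run `exposed_ports("auth.example.com")` from an external host; an empty list is the expected result when the firewall is configured correctly.
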
---

## Support & Troubleshooting

### Common Issues

**Issue: "Cookie not being set"**
- Check `session.cookie.domain` in kratos.yml is `example.com`
- Verify HTTPS is working
- Check browser dev tools > Application > Cookies

**Issue: "CORS errors"**
- Verify Oathkeeper CORS config includes all domains
- Check `allow_credentials: true`
- Verify Origin header matches allowed_origins

**Issue: "Redirect loop"**
- Check `preserve_host` settings in access rules
- Verify Kratos `allowed_return_urls` includes production domains

**Issue: "WebAuthn not working"**
- Verify `webauthn.config.rp.id` is `example.com`
- Check `webauthn.config.rp.origins` includes production URLs
- Ensure HTTPS is working (WebAuthn requires secure context)

### Debug Commands

```bash
# Check Oathkeeper rules
curl http://localhost:4456/rules | jq .

# Check Kratos sessions
curl -H "Cookie: ory_kratos_session=..." http://localhost:4433/sessions/whoami

# Test Oathkeeper decision API (served on the API port, 4456)
curl -H "Cookie: ory_kratos_session=..." http://localhost:4456/decisions/admin/identities

# View Kratos configuration (run from nexus-5-auth-kratos)
docker compose exec kratos cat /etc/kratos/kratos.yml
```

---

## Production Deployment Complete! 🎉

Once all checklist items are complete, your Nexus 5 Auth system is production-ready with:

✅ Ory Kratos for identity management
✅ Ory Oathkeeper for authentication & authorization
✅ SvelteKit frontend with admin dashboard
✅ Full Django integration with custom headers
✅ Secure session management across subdomains
✅ WebAuthn/TOTP support
✅ Email verification & recovery
✅ Complete API endpoint coverage

**Next Steps:**
1. Monitor logs for the first 24 hours
2. Test all user flows in production
3. Set up automated backups
4. Configure monitoring/alerting
5. Document any environment-specific configurations