# Architecture

This document explains the design decisions behind Arvandor.

## Network Separation

### Why Two Networks?

```
Internet ──► Proxmox Host ──► vmbr1 (192.168.100.0/24)
                  │
                  └──► Nebula (10.10.10.0/24)
```

**Bridge Network (vmbr1)**
- Used only for Terraform provisioning and Ansible access
- Each VM's firewall blocks all bridge traffic except traffic from the Proxmox host
- No inter-VM communication on this network

**Nebula Overlay**
- All application traffic uses encrypted Nebula tunnels
- Group-based firewall rules for segmentation
- Works across any network boundary (cloud, datacenter, home)

### Benefits

1. **Defense in depth** - Compromising the bridge network does not expose services
2. **Migration ready** - VMs can move anywhere; Nebula handles connectivity
3. **Zero-trust** - VMs authenticate via certificates, not network position

## VMID Allocation

VMIDs follow a logical pattern:

| Range | Purpose | Example |
|-------|---------|---------|
| 1000-1999 | Management | DNS, Caddy |
| 2000-2999 | Services | Vault, Gitea |
| 3000-3999 | Data | PostgreSQL, Valkey |
| 4000-4999 | Workloads | Applications |
| 5000-5999 | Monitoring | Prometheus |
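
The range table can be expressed as a small lookup. This is a minimal sketch: the `VMID_RANGES` list and the `vmid_purpose` helper are illustrative names, not part of Arvandor itself.

```python
# Hypothetical lookup table mirroring the VMID ranges listed above.
VMID_RANGES = [
    (1000, 1999, "Management"),
    (2000, 2999, "Services"),
    (3000, 3999, "Data"),
    (4000, 4999, "Workloads"),
    (5000, 5999, "Monitoring"),
]


def vmid_purpose(vmid: int) -> str:
    """Return the purpose of the range that contains `vmid`."""
    for low, high, purpose in VMID_RANGES:
        if low <= vmid <= high:
            return purpose
    raise ValueError(f"VMID {vmid} is outside the allocated ranges")


print(vmid_purpose(1001))  # Management
print(vmid_purpose(3000))  # Data
```

A check like this could run before provisioning to reject a VMID that falls outside every range.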

The VMID's digits determine the last octet of the IP address:
- VMID 1001 → x.x.x.11
- VMID 3000 → x.x.x.30
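
The two examples are consistent with a rule of "range digit followed by the final digit". The sketch below encodes that reading. The rule itself is an inference from the examples, and the helper names, the 0-9 index limit, and the default `10.10.10` prefix are assumptions made for illustration.

```python
def last_octet(vmid: int) -> int:
    """Derive the final IP octet from a VMID.

    Assumed rule (inferred from the two documented examples only):
    thousands digit of the VMID followed by its final digit.
    """
    range_digit, index = divmod(vmid, 1000)
    # Assumption: only indexes 0-9 are used within a range, so octets
    # derived from different ranges can never collide.
    if not 1 <= range_digit <= 5 or index > 9:
        raise ValueError(f"VMID {vmid} does not fit the assumed scheme")
    return range_digit * 10 + index


def vm_ip(vmid: int, prefix: str = "10.10.10") -> str:
    """Build a full address; the default prefix is the Nebula subnet."""
    return f"{prefix}.{last_octet(vmid)}"


print(last_octet(1001))  # 11
print(last_octet(3000))  # 30
print(vm_ip(1001))       # 10.10.10.11
```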

## High Availability

All data services run as 3-node clusters:

### PostgreSQL (Patroni + etcd)

```
┌─────────────┐  ┌─────────────┐  ┌─────────────┐
│ postgres-01 │  │ postgres-02 │  │ postgres-03 │
│   Leader    │◄─│   Replica   │◄─│   Replica   │
│   + etcd    │  │   + etcd    │  │   + etcd    │
└─────────────┘  └─────────────┘  └─────────────┘
```

- Patroni handles leader election
- etcd provides distributed consensus
- Automatic failover on leader failure
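
The choice of three nodes follows from majority quorum: etcd can only make progress while more than half of its members are reachable. The arithmetic below is a small illustration, with function names chosen for this example.

```python
def quorum(nodes: int) -> int:
    """Smallest number of members that forms a majority."""
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many members can fail while a majority remains."""
    return nodes - quorum(nodes)


for n in (1, 2, 3, 5):
    print(f"{n} nodes: quorum={quorum(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

With three nodes the quorum is two, so the cluster survives the loss of one node. A two-node cluster tolerates no failures at all, which is why three is the smallest useful size.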

### Valkey (Sentinel)

```
┌─────────────┐  ┌─────────────┐  ┌─────────────┐
│  valkey-01  │  │  valkey-02  │  │  valkey-03  │
│   Master    │──│   Replica   │  │   Replica   │
│ + Sentinel  │  │ + Sentinel  │  │ + Sentinel  │
└─────────────┘  └─────────────┘  └─────────────┘
```

- Sentinel monitors master health
- Automatic promotion on master failure
- ACL-based per-service key isolation

### Vault (Raft)

```
┌─────────────┐  ┌─────────────┐  ┌─────────────┐
│  vault-01   │  │  vault-02   │  │  vault-03   │
│   Leader    │──│   Standby   │──│   Standby   │
└─────────────┘  └─────────────┘  └─────────────┘
```

- Integrated Raft storage (no external backend)
- Automatic leader election
- Unseal required after restart

## Security Model

### Three-Layer Firewall

```
┌─────────────────────────────────────────────────────────────┐
│  1. Proxmox VM Firewall  →  Egress control                  │
│  2. Nebula Groups        →  East-west segmentation          │
│  3. Guest iptables       →  Defense in depth                │
└─────────────────────────────────────────────────────────────┘
```

### Nebula Groups

| Group | Can Access |
|-------|------------|
| admin | Everything |
| infrastructure | infrastructure |
| projects | infrastructure |
| games | Nothing (isolated) |
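
The access table can be modelled as a simple matrix for reasoning about which group may reach which. This sketch only mirrors the table. It is not how Nebula evaluates its firewall rules, and the `ACCESS` mapping and `can_access` helper are illustrative names.

```python
ALL = "*"  # marker meaning "every group"

# Source group -> set of groups it may reach, copied from the table above.
ACCESS = {
    "admin": {ALL},
    "infrastructure": {"infrastructure"},
    "projects": {"infrastructure"},
    "games": set(),
}


def can_access(source_group: str, target_group: str) -> bool:
    """Return True if hosts in `source_group` may reach `target_group`."""
    allowed = ACCESS.get(source_group, set())  # unknown groups get nothing
    return ALL in allowed or target_group in allowed


print(can_access("projects", "infrastructure"))  # True
print(can_access("games", "infrastructure"))     # False
```

Unknown groups default to no access, which matches the deny-by-default posture the table implies.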

### Vault Integration

Applications use Vault for:
- Dynamic database credentials (short-lived)
- Service secrets (API keys, etc.)
- AppRole authentication
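
An application might combine AppRole login with a dynamic credential request roughly as follows. This is a sketch against Vault's HTTP API under several assumptions: the `vault.nebula:8200` address is hypothetical, the auth method and database secrets engine are assumed to be mounted at their default paths (`approle` and `database`), and the role name `my-app` is a placeholder.

```python
import json
import urllib.request

VAULT_ADDR = "https://vault.nebula:8200"  # hypothetical address


def approle_login_request(role_id: str, secret_id: str, addr: str = VAULT_ADDR):
    """Build the AppRole login request (assumes the default `approle` mount)."""
    body = json.dumps({"role_id": role_id, "secret_id": secret_id}).encode()
    return urllib.request.Request(
        f"{addr}/v1/auth/approle/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def database_creds_request(token: str, role: str, addr: str = VAULT_ADDR):
    """Build a request for short-lived DB credentials (default `database` mount)."""
    return urllib.request.Request(
        f"{addr}/v1/database/creds/{role}",
        headers={"X-Vault-Token": token},
        method="GET",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a request to Vault and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


# Usage against a live Vault (not executed here):
# login = send(approle_login_request(role_id, secret_id))
# token = login["auth"]["client_token"]
# creds = send(database_creds_request(token, "my-app"))["data"]
# creds["username"] and creds["password"] expire with the lease.
```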

## Service Discovery

Internal DNS provides hostname resolution:

```
<hostname>.nebula → Nebula IP
```

VMs query the DNS server at 10.10.10.11 over Nebula. Queries for external names are forwarded to Cloudflare (1.1.1.1).
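
The split between internal and forwarded names can be pictured as a suffix check. The sketch below models the decision only. It does not perform any DNS lookups, and the function name is illustrative.

```python
INTERNAL_SUFFIX = ".nebula"
INTERNAL_DNS = "10.10.10.11"  # DNS server reached over Nebula
UPSTREAM_DNS = "1.1.1.1"      # Cloudflare, used for external names


def answering_server(name: str) -> str:
    """Return the server that ultimately answers a query for `name`."""
    normalized = name.rstrip(".").lower()  # tolerate a trailing dot and mixed case
    if normalized.endswith(INTERNAL_SUFFIX):
        return INTERNAL_DNS
    return UPSTREAM_DNS


print(answering_server("postgres-01.nebula"))  # 10.10.10.11
print(answering_server("example.com"))         # 1.1.1.1
```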

## Provisioning Flow

```
1. terraform apply     → Create VM
2. bootstrap.yml       → Update packages
3. security.yml        → Configure firewall
4. nebula.yml          → Join overlay network
5. <service>.yml       → Deploy service
6. data-service.yml    → Provision credentials
```
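
The flow above can be sketched as an ordered command list. The playbook names come from the steps listed. The use of `--limit`, the omission of inventory and working-directory options, and the `provisioning_commands` helper are assumptions for illustration. The function only builds the commands and does not run them.

```python
import shlex


def provisioning_commands(host: str, service: str) -> list[list[str]]:
    """Return the provisioning steps for one VM, in execution order."""
    playbooks = [
        "bootstrap.yml",     # update packages
        "security.yml",      # configure firewall
        "nebula.yml",        # join overlay network
        f"{service}.yml",    # deploy service
        "data-service.yml",  # provision credentials
    ]
    commands = [["terraform", "apply"]]  # create the VM first
    commands += [["ansible-playbook", playbook, "--limit", host] for playbook in playbooks]
    return commands


# Dry run: print the commands instead of executing them.
for command in provisioning_commands("gitea-01", "gitea"):
    print(shlex.join(command))
```

Each command could be passed to `subprocess.run(command, check=True)` so that a failing step stops the sequence before later playbooks run.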