# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Overview
This is an Ansible-based infrastructure-as-code repository for managing the liz.coffee homelab infrastructure. It orchestrates deployment of services across a Docker Swarm cluster (3 nodes: swarm-one, swarm-two, swarm-three) and an outbound proxy server.
## Architecture
### Infrastructure Layout
- **Swarm Cluster**: 3-node Docker Swarm cluster at 10.128.0.201-203
  - Primary services deployed as Docker Swarm stacks
  - Shared Ceph storage mounted across all nodes
  - Keepalived for high availability
  - Traefik as the ingress controller with automatic TLS via Let's Encrypt
- **Outbound Proxy**: External-facing NGINX reverse proxy (outbound-two.liz.coffee)
  - Routes external traffic to internal services via the swarm load balancer
  - Uses docker-compose instead of swarm stacks
### Service Deployment Patterns
Services fall into two deployment models:
1. **Docker Swarm Services** (most services): Use `tasks/manage-docker-swarm-service.yml`
   - Deployed via `docker stack deploy`
   - Templates rendered from `playbooks/roles/{service}/templates/`
   - Health checks and rolling updates configured in docker-compose.yml
   - Traefik labels for automatic routing and TLS
2. **Docker Compose Services** (nginx_proxy, outbound): Use `tasks/manage-docker-compose-service.yml`
   - Deployed via the systemd service `docker-compose@{service}`
   - Supports zero-downtime deployments via the docker-rollout tool
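A swarm service template typically follows this shape. This is an illustrative sketch only — the service name, variable names, port, and certresolver name below are assumptions, not values taken from this repository:

```yaml
# playbooks/roles/myservice/templates/docker-compose.yml (illustrative sketch)
version: "3.8"

services:
  myservice:
    image: "{{ myservice_image }}"   # hypothetical variable name
    networks:
      - proxy
    environment:
      TZ: "{{ timezone }}"
    deploy:
      update_config:
        order: start-first           # start the new task before stopping the old one
      labels:                        # for swarm mode, Traefik reads labels under deploy.labels
        - "traefik.enable=true"
        - "traefik.http.routers.myservice.rule=Host(`{{ myservice_domain }}`)"
        - "traefik.http.routers.myservice.tls.certresolver=letsencrypt"
        - "traefik.http.services.myservice.loadbalancer.server.port=8080"

networks:
  proxy:
    external: true   # shared network that Traefik is attached to
```

Note that in swarm mode Traefik discovers services through `deploy.labels` (not container-level `labels`), and the `proxy` network must be declared external so the stack joins the pre-existing network rather than creating its own.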
### Common Task Files
- `tasks/manage-docker-swarm-service.yml`: Renders templates and deploys swarm stack
- `tasks/manage-docker-compose-service.yml`: Renders templates, manages systemd service, performs rollouts
- `tasks/copy-rendered-templates-recursive.yml`: Copies Jinja2 templates to destination
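A role usually delegates to one of these shared task files rather than duplicating deployment logic. The exact variables the shared task file expects are not documented here; this sketch only illustrates the include pattern, and `service_name` is a hypothetical variable:

```yaml
# playbooks/roles/myservice/tasks/main.yml (illustrative sketch)
- name: Deploy myservice as a swarm stack
  ansible.builtin.include_tasks: "{{ playbook_dir }}/../tasks/manage-docker-swarm-service.yml"
  vars:
    service_name: myservice   # hypothetical — check the shared task file for its actual inputs
```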
## Key Commands
### Vault Management
Initialize or update vault secrets:
```bash
./ansible-vault-init.sh [secret_name]
```
To avoid password prompts, store the vault password in `secrets.pwd`:
```bash
echo "your_password" > secrets.pwd
```
### Deployment
Full deployment (all services in order):
```bash
ansible-playbook -e @secrets.enc --vault-password-file secrets.pwd deploy.yml
```
Deploy a specific playbook during development:
```bash
ansible-playbook -e @secrets.enc --vault-password-file secrets.pwd playbooks/{service}.yml
```
### Linting
```bash
yamllint --strict .
ansible-lint
```
### Creating New Services
Use the `create.py` script to scaffold a new service:
```bash
./create.py --service-name myservice --container-image myimage:latest --service-port 8080 [--external] [--internal]
```
This generates:
- Ansible role in `playbooks/roles/{service}/`
- Docker compose template with Traefik labels
- Group vars in `group_vars/{service}.yml`
- Inventory entry and playbook hook in `deploy.yml`
- NGINX config (if `--external` specified)
- DNS records (Cloudflare if `--external`, LabDNS if `--internal`)
## File Organization
- `inventory`: Ansible inventory defining host groups and connection details
- `deploy.yml`: Master playbook importing all service playbooks in deployment order
- `playbooks/`: Individual service playbooks
- `playbooks/roles/`: Service-specific roles containing tasks and templates
  - `{service}/tasks/main.yml`: Task entry point
  - `{service}/templates/`: Jinja2 templates (docker-compose.yml, configs, etc.)
- `group_vars/`: Variables per service/host group
- `secrets.enc`: Ansible vault encrypted secrets
- `ansible.cfg`: Ansible configuration (inventory path, SSH settings)
## Variable Conventions
Each service typically defines the following in `group_vars/{service}.yml`:
- `{service}_domain`: FQDN for the service
- `{service}_base`: Base directory path on swarm nodes (usually under `{{ swarm_base }}`)
Common variables available across all playbooks:
- `deployment_time`: Timestamp of deployment (forces container recreation)
- `timezone`: System timezone
- `homelab_build`: Boolean indicating local vs production deployment
- `loadbalancer_ip`: Internal VIP for the swarm cluster
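Putting the conventions together, a per-service vars file looks roughly like this (the service name and values are illustrative assumptions, not from the repository):

```yaml
# group_vars/myservice.yml (illustrative sketch)
myservice_domain: "myservice.liz.coffee"          # FQDN the service is reachable at
myservice_base: "{{ swarm_base }}/myservice"      # data/config directory on the swarm nodes
```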
## Important Notes
- Most services use swarm-one (10.128.0.201) as the deployment target in inventory
- Secrets are referenced as `{{ secret_name }}` from the vault
- All swarm services should connect to the `proxy` external network for Traefik routing
- Use `--resolve-image=always` in stack deploys to ensure latest images are pulled
- The outbound role manages NGINX configs in `playbooks/roles/outbound/templates/proxy/nginx/conf.d/`
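The stack-deploy note above can be sketched as an Ansible task. This is illustrative only — the repository's actual deploy logic lives in `tasks/manage-docker-swarm-service.yml` and may differ, and `myservice_base` is a hypothetical variable:

```yaml
# Illustrative sketch of a stack deploy with image re-resolution
- name: Deploy the stack, re-resolving the image digest on every run
  ansible.builtin.command:
    cmd: docker stack deploy --resolve-image=always -c docker-compose.yml myservice
    chdir: "{{ myservice_base }}"   # hypothetical directory containing the rendered template
```

Without `--resolve-image=always`, swarm may keep running the digest it resolved on a previous deploy even when a mutable tag like `:latest` has moved.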