# K3s Cluster Setup Plan (Enterprise Ready)

## 1. Architecture Overview

We will deploy a High-Availability (HA) K3s cluster consisting of 3 Control Plane nodes with embedded etcd. etcd stays available as long as a majority of its members is reachable, so a three-member cluster tolerates the failure of one node.

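
The single-node tolerance follows from majority quorum arithmetic. A small worked check, as a sketch (the function names `quorum` and `tolerated_failures` are illustrative and not part of K3s or etcd):

```shell
#!/bin/sh
# A majority of members must agree: quorum = floor(n / 2) + 1.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Members that may fail while a majority is still reachable.
tolerated_failures() {
  echo $(( $1 - $(quorum "$1") ))
}

# Print the numbers for a few cluster sizes.
for n in 1 3 4 5; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

For the three nodes planned here this gives a quorum of 2 and one tolerated failure. A fourth node would raise the quorum to 3 without raising the tolerance, which is why odd member counts are the usual choice.
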
* **Topology:** 3 nodes, each running both the server and the agent role.
* **Operating System:** Ubuntu 24.04 (via Terraform/Cloud-Init).
* **Networking:**
    * VLAN 40 (IP Range: `10.100.40.0/24`).
    * **VIP (Virtual IP):** Floating IPs managed by `kube-vip`, one for the API Server and one for the Ingress Controller.
* **Ingress Flow:**
    * `Internet` -> `Traefik in the k3s cluster (VIP 10.100.40.6)` -> `Traefik Ingress (K3s)` -> `Pod`.
* **GitOps:**
    * **Tool:** FluxCD.
    * **Repository Structure:**
        * `stabify-infra` (Current): Bootstraps the nodes, installs K3s, installs the Flux binary.
        * `stabify-gitops` (New): Watched by Flux. Contains system workloads (Cert-Manager, Traefik Internal) and user apps.

## 2. Terraform Changes (`terraform/`)

We will update the existing `locals.tf` to reflect the 3-node HA structure.

* **`terraform/locals.tf`**:
    * Refactor `vms` map:
        * `vm-k3s-master-400` (`10.100.40.10`)
        * `vm-k3s-master-401` (`10.100.40.11`)
        * `vm-k3s-master-402` (`10.100.40.12`)
    * Define VIPs:
        * `k3s-api-vip`: `10.100.40.1` (or `.5`, the value used in Section 4) - Endpoint for kubectl and nodes.
        * `k3s-ingress-vip`: `10.100.40.2` (or `.6`, the value used in Section 4) - Endpoint for Traefik Edge.

* **`terraform/main.tf`**:
    * Add `opnsense_unbound_host_override` resources for the VIPs to ensure internal DNS resolution.

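
Before these addresses go into `locals.tf`, the plan can be sanity-checked: every address should lie in VLAN 40 and no address should be assigned twice. The following is a minimal sketch of such a check. It assumes the `.5`/`.6` VIP variant, and the helper names (`in_vlan40`, `duplicates`, `check_plan`) are hypothetical.

```shell
#!/bin/sh
# Planned addresses from this section. The VIPs use the .5/.6 variant,
# which is an assumption; the plan also mentions .1/.2.
NODE_IPS="10.100.40.10 10.100.40.11 10.100.40.12"
VIPS="10.100.40.5 10.100.40.6"

# Succeeds if the address is a usable host in 10.100.40.0/24 (host part 1-254).
in_vlan40() {
  host=${1#10.100.40.}
  [ "$host" != "$1" ] || return 1            # prefix did not match
  case "$host" in
    ''|*[!0-9]*) return 1 ;;                 # host part is not a plain number
  esac
  [ "$host" -ge 1 ] && [ "$host" -le 254 ]
}

# Print every address that appears more than once in the arguments.
duplicates() {
  printf '%s\n' "$@" | sort | uniq -d
}

# Check range membership first, then collisions between nodes and VIPs.
check_plan() {
  for ip in "$@"; do
    in_vlan40 "$ip" || { echo "outside VLAN 40: $ip"; return 1; }
  done
  dups=$(duplicates "$@")
  [ -z "$dups" ] || { echo "address used twice: $dups"; return 1; }
  echo "address plan OK"
}

# Word splitting of the two lists is intentional here.
check_plan $NODE_IPS $VIPS
```

The same check works for the `.1`/`.2` variant by changing `VIPS`.
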
## 3. Ansible Role Design (`infrastructure/ansible/`)

We will create a new role `k3s` and a corresponding playbook.

* **Inventory (`inventory.ini`)**:
    * Add `[k3s_masters]` group.

* **Role: `k3s`**:
    * **Task: System Prep:** Install `open-iscsi`, `nfs-common`, `curl`. Configure sysctl (bridged traffic).
    * **Task: Install K3s (First Node):**
        * Exec: `curl -sfL https://get.k3s.io | sh -`
        * Args: `--cluster-init --disable traefik --disable servicelb --tls-san k3s-api.stabify.de`
    * **Task: Install K3s (Other Nodes):**
        * Args: `--server https://<First-Node-IP>:6443 --token <Secret>`
    * **Task: Install Kube-VIP:**
        * Deploy Manifest for Control Plane HA (ARP Mode).
        * Deploy Manifest for Service LoadBalancer (ARP Mode).
    * **Task: Bootstrap Flux:**
        * Install Flux CLI.
        * Run `flux bootstrap git ...`.

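
The two K3s install tasks differ only in their arguments. The sketch below builds both argument strings from the flags listed above; the function names are hypothetical, and repeating the `--disable` and `--tls-san` flags on the joining servers is an assumption based on the K3s guidance that server flags should be consistent across all servers. The installer receives its arguments after `sh -s -`. `install_k3s` is defined but deliberately not called, so running the block only prints the planned arguments.

```shell
#!/bin/sh
# Sketch only: builds the argument strings named in the role above.
# TLS_SAN comes from the plan; the function names are hypothetical.
TLS_SAN="k3s-api.stabify.de"

# Flags repeated on every server (assumption: flags must match cluster-wide).
common_args() {
  echo "--disable traefik --disable servicelb --tls-san $TLS_SAN"
}

# First node: initialise the embedded etcd cluster.
first_server_args() {
  echo "server --cluster-init $(common_args)"
}

# Other nodes: join through the first node's API endpoint.
# $1 = first node IP, $2 = cluster token.
join_server_args() {
  echo "server --server https://$1:6443 --token $2 $(common_args)"
}

# Pipes the official install script into sh and forwards the given arguments.
install_k3s() {
  curl -sfL https://get.k3s.io | sh -s - "$@"
}

# Show what would be passed to the installer (placeholder token).
echo "first node:  $(first_server_args)"
echo "other nodes: $(join_server_args 10.100.40.10 '<Secret>')"
```

On the first node the call would be `install_k3s $(first_server_args)`, left unquoted so that the string splits into separate arguments. In the Ansible role the token would come from a vaulted variable rather than a literal.
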
## 4. Network & DNS Strategy

* **DNS Records (OPNsense):**
    * `vm-k3s-master-*.stabify.de` -> Node IPs (Managed by Terraform).
    * `k3s-api.stabify.de` -> `10.100.40.5` (VIP).
    * `*.k3s.stabify.de` -> `10.100.40.6` (Ingress VIP).

* **Traefik Edge Config (in the k3s cluster):**
    * File provider for TLS passthrough to k3s services.
    * ConfigMap: `traefik-edge-dynamic-k3s`
    * Rule: `HostSNIRegexp('^.+\.k3s\.stabify\.de$')`
    * Target: `10.100.40.6:443` (TLS Passthrough).

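
The `HostSNIRegexp` pattern decides which TLS connections are passed through, so it is worth checking against a few server names. The sketch below uses `grep -E` as a stand-in for Traefik's regular expression engine, an approximation that should hold for a pattern this simple. The host names are made-up test inputs.

```shell
#!/bin/sh
# Pattern copied from the HostSNIRegexp rule above.
SNI_PATTERN='^.+\.k3s\.stabify\.de$'

# Succeeds if the given server name matches the pattern.
# grep -E (POSIX ERE) approximates the matching Traefik would perform.
matches_sni() {
  printf '%s\n' "$1" | grep -Eq "$SNI_PATTERN"
}

# Hypothetical server names covering the interesting cases.
for name in app.k3s.stabify.de a.b.k3s.stabify.de k3s.stabify.de app.k3s.stabify.de.example.com; do
  if matches_sni "$name"; then
    echo "match:    $name"
  else
    echo "no match: $name"
  fi
done
```

The first two names match, including the multi-level subdomain. The bare `k3s.stabify.de` does not match because the pattern requires at least one label in front, and the trailing `$` rejects names that merely contain the domain.
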
## 5. Next Steps for Implementation

1. **Refactor Terraform:** Update `locals.tf` to 3 Masters. Apply to create VMs.
2. **DNS Update:** Verify OPNsense records.
3. **Ansible Development:** Create `k3s` role.
4. **Execute Ansible:** Deploy Cluster.
5. **Flux Bootstrap:** Link cluster to GitOps repo.
6. **Traefik Edge:** Configure routing.

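
The DNS verification in step 2 can be scripted. The helper below is a sketch that assumes `dig` is available on the machine running the check; the function names `resolve` and `check_record` are hypothetical. It only defines the functions and performs no lookups until they are called.

```shell
#!/bin/sh
# Resolve a name to the first line of its A-record answer.
# Assumes `dig` is installed; swap in another resolver command if it is not.
resolve() {
  dig +short A "$1" | head -n 1
}

# Compare the resolved address ($1 = name) with the expected one ($2).
check_record() {
  actual=$(resolve "$1")
  if [ "$actual" = "$2" ]; then
    echo "OK   $1 -> $actual"
  else
    echo "FAIL $1 -> ${actual:-no answer} (expected $2)"
    return 1
  fi
}
```

With the records from Section 4, `check_record k3s-api.stabify.de 10.100.40.5` verifies the API VIP, and `check_record test.k3s.stabify.de 10.100.40.6` exercises the wildcard record (the `test` label is an arbitrary example).
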
This plan keeps a clean separation of concerns: Terraform provisions the VMs, Ansible prepares the OS and installs the cluster software, and Flux manages the workloads.