implemented LE and ACME and fixed some bugs
docs/AUTO_RENEWAL_KONZEPT.md
# Automatic Certificate Renewal Concept for Let's Encrypt

## 1. Overview

### 1.1 Goal

Implement an automatic renewal function for Let's Encrypt (LE) certificates that renews each certificate in good time before it expires.

### 1.2 Requirements

- **Proactive renewal**: Certificates are renewed before they expire (e.g. 30 days before expiry)
- **Automatic execution**: Runs in the background without user interaction
- **Error handling**: Robust error handling and a retry mechanism
- **Logging & monitoring**: Comprehensive logging for traceability
- **Configurability**: Renewal thresholds and intervals are configurable
- **Permissions**: Respects the existing permission system
- **DNS validation**: Automatic DNS challenge validation before renewal

---

## 2. Architecture

### 2.1 Component Overview

```
┌─────────────────────────────────────────────────────────────┐
│                     Auto-Renewal System                     │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌──────────────┐    ┌──────────────┐    ┌─────────────┐    │
│  │  Scheduler   │───>│ Certificate  │───>│   Renewal   │    │
│  │   (Cron)     │    │   Scanner    │    │   Worker    │    │
│  └──────────────┘    └──────────────┘    └─────────────┘    │
│         │                   │                   │           │
│         │                   │                   │           │
│         v                   v                   v           │
│  ┌──────────────┐    ┌──────────────┐    ┌─────────────┐    │
│  │   Config     │    │   Database   │    │    ACME     │    │
│  │   Manager    │    │   Queries    │    │   Client    │    │
│  └──────────────┘    └──────────────┘    └─────────────┘    │
│                                                             │
│  ┌──────────────┐    ┌──────────────┐    ┌─────────────┐    │
│  │   Logger     │    │   Notifier   │    │    Retry    │    │
│  │   Service    │    │   Service    │    │   Manager   │    │
│  └──────────────┘    └──────────────┘    └─────────────┘    │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```

### 2.2 Workflow

```
1. Scheduler starts (e.g. daily at 02:00)
   │
   ├─> 2. Scanner identifies expiring certificates
   │      (expires_at <= now + renewal_threshold)
   │
   ├─> 3. For each certificate:
   │      │
   │      ├─> 3.1 Check whether auto-renewal is enabled
   │      │
   │      ├─> 3.2 Check whether a renewal is already running
   │      │
   │      ├─> 3.3 Check permissions (space access)
   │      │
   │      ├─> 3.4 Validate DNS (CNAME check)
   │      │
   │      ├─> 3.5 Create renewal job
   │      │
   │      └─> 3.6 Queue job for the worker
   │
   ├─> 4. Worker processes jobs sequentially
   │      │
   │      ├─> 4.1 Fetch FQDN information
   │      │
   │      ├─> 4.2 Fetch ACME provider configuration
   │      │
   │      ├─> 4.3 Call RequestCertificate()
   │      │
   │      ├─> 4.4 Store the new certificate
   │      │
   │      ├─> 4.5 Mark the old certificate as "replaced"
   │      │
   │      └─> 4.6 Log success/failure
   │
   └─> 5. Cleanup & reporting
```

---

## 3. Database Schema

### 3.1 Extended certificates Table

```sql
-- Migration: extend the certificates table with auto-renewal fields
ALTER TABLE certificates ADD COLUMN auto_renewal_enabled BOOLEAN DEFAULT 1;
ALTER TABLE certificates ADD COLUMN renewal_attempts INTEGER DEFAULT 0;
ALTER TABLE certificates ADD COLUMN last_renewal_attempt DATETIME;
ALTER TABLE certificates ADD COLUMN next_renewal_check DATETIME;
ALTER TABLE certificates ADD COLUMN renewal_status TEXT; -- 'pending', 'in_progress', 'success', 'failed', 'disabled'
ALTER TABLE certificates ADD COLUMN replaced_by_cert_id TEXT; -- ID of the new certificate
ALTER TABLE certificates ADD COLUMN replaces_cert_id TEXT; -- ID of the replaced certificate
```

### 3.2 New Table: certificate_renewal_logs

```sql
CREATE TABLE IF NOT EXISTS certificate_renewal_logs (
    id TEXT PRIMARY KEY,
    certificate_id TEXT NOT NULL,
    fqdn_id TEXT NOT NULL,
    space_id TEXT NOT NULL,
    renewal_status TEXT NOT NULL, -- 'started', 'success', 'failed', 'skipped'
    renewal_reason TEXT, -- 'expiring_soon', 'manual', 'retry'
    error_message TEXT,
    old_expires_at DATETIME,
    new_expires_at DATETIME,
    new_certificate_id TEXT,
    renewal_duration_seconds INTEGER,
    trace_id TEXT,
    created_at DATETIME NOT NULL,
    FOREIGN KEY (certificate_id) REFERENCES certificates(id) ON DELETE CASCADE,
    FOREIGN KEY (fqdn_id) REFERENCES fqdns(id) ON DELETE CASCADE,
    FOREIGN KEY (space_id) REFERENCES spaces(id) ON DELETE CASCADE
);

CREATE INDEX idx_renewal_logs_certificate_id ON certificate_renewal_logs(certificate_id);
CREATE INDEX idx_renewal_logs_created_at ON certificate_renewal_logs(created_at);
CREATE INDEX idx_renewal_logs_status ON certificate_renewal_logs(renewal_status);
```

### 3.3 New Table: renewal_config

```sql
CREATE TABLE IF NOT EXISTS renewal_config (
    id TEXT PRIMARY KEY DEFAULT 'global',
    enabled BOOLEAN DEFAULT 1,
    renewal_threshold_days INTEGER DEFAULT 30, -- Renew X days before expiry
    check_interval_hours INTEGER DEFAULT 24, -- How often to check (in hours)
    max_renewal_attempts INTEGER DEFAULT 3, -- Max. attempts per certificate
    retry_delay_hours INTEGER DEFAULT 24, -- Wait time between retries
    notification_enabled BOOLEAN DEFAULT 0,
    notification_email TEXT,
    created_at DATETIME NOT NULL,
    updated_at DATETIME NOT NULL
);

-- Insert the initial configuration
INSERT INTO renewal_config (id, enabled, renewal_threshold_days, check_interval_hours, max_renewal_attempts, retry_delay_hours, created_at, updated_at)
VALUES ('global', 1, 30, 24, 3, 24, datetime('now'), datetime('now'));
```

### 3.4 FQDN Table Extension

```sql
-- Optional: per-FQDN auto-renewal settings
ALTER TABLE fqdns ADD COLUMN auto_renewal_enabled BOOLEAN DEFAULT 1;
```

---

## 4. Configuration

### 4.1 Global Configuration (Environment Variables)

```bash
# Auto-renewal settings
AUTO_RENEWAL_ENABLED=true
AUTO_RENEWAL_THRESHOLD_DAYS=30
AUTO_RENEWAL_CHECK_INTERVAL_HOURS=24
AUTO_RENEWAL_SCHEDULE="0 2 * * *" # Cron format: daily at 02:00
AUTO_RENEWAL_MAX_ATTEMPTS=3
AUTO_RENEWAL_RETRY_DELAY_HOURS=24

# Notifications
AUTO_RENEWAL_NOTIFICATIONS_ENABLED=false
AUTO_RENEWAL_NOTIFICATION_EMAIL=admin@example.com

# Concurrency
AUTO_RENEWAL_MAX_CONCURRENT=1 # Number of parallel renewals
```
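
A minimal sketch of how the backend might read these variables, assuming unset variables fall back to the defaults listed above. The type `RenewalConfig` and the function `loadRenewalConfig` are hypothetical names, and only a subset of the variables is handled. The lookup function is injected so the loader can be exercised without touching the real environment.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// RenewalConfig mirrors a subset of the AUTO_RENEWAL_* variables (hypothetical type).
type RenewalConfig struct {
	Enabled         bool
	ThresholdDays   int
	MaxAttempts     int
	RetryDelayHours int
	Schedule        string
}

// loadRenewalConfig reads settings through lookup (os.LookupEnv in production)
// and keeps the documented default whenever a variable is unset.
func loadRenewalConfig(lookup func(string) (string, bool)) (RenewalConfig, error) {
	cfg := RenewalConfig{Enabled: true, ThresholdDays: 30, MaxAttempts: 3, RetryDelayHours: 24, Schedule: "0 2 * * *"}
	var err error
	if v, ok := lookup("AUTO_RENEWAL_ENABLED"); ok {
		if cfg.Enabled, err = strconv.ParseBool(v); err != nil {
			return cfg, fmt.Errorf("AUTO_RENEWAL_ENABLED: %w", err)
		}
	}
	ints := map[string]*int{
		"AUTO_RENEWAL_THRESHOLD_DAYS":    &cfg.ThresholdDays,
		"AUTO_RENEWAL_MAX_ATTEMPTS":      &cfg.MaxAttempts,
		"AUTO_RENEWAL_RETRY_DELAY_HOURS": &cfg.RetryDelayHours,
	}
	for name, dst := range ints {
		v, ok := lookup(name)
		if !ok {
			continue
		}
		n, err := strconv.Atoi(v)
		if err != nil || n <= 0 {
			return cfg, fmt.Errorf("%s: expected a positive integer, got %q", name, v)
		}
		*dst = n
	}
	if v, ok := lookup("AUTO_RENEWAL_SCHEDULE"); ok {
		cfg.Schedule = v
	}
	return cfg, nil
}

func main() {
	cfg, err := loadRenewalConfig(os.LookupEnv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "invalid configuration:", err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", cfg)
}
```

Rejecting non-positive integers at startup is a design assumption here; the concept itself does not say how invalid values should be handled.
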

### 4.2 Per-FQDN Configuration

- **Default**: Auto-renewal is enabled for all FQDNs
- **Opt-out**: Can be disabled per FQDN via `fqdns.auto_renewal_enabled`
- **Opt-out**: Can be disabled per certificate via `certificates.auto_renewal_enabled`
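
The three levels (global, FQDN, certificate) can be combined into one decision. The sketch below assumes that a NULL column counts as "enabled", which matches the `IS NULL OR ... = 1` conditions in the scanner query, and that the global switch overrides both flags. The function name `effectiveAutoRenewal` is hypothetical.

```go
package main

import "fmt"

// effectiveAutoRenewal combines the three levels. A nil pointer stands for a
// NULL column, which is treated as "enabled" (assumption, see lead-in).
func effectiveAutoRenewal(globalEnabled bool, fqdnFlag, certFlag *bool) bool {
	enabled := func(flag *bool) bool { return flag == nil || *flag }
	return globalEnabled && enabled(fqdnFlag) && enabled(certFlag)
}

func main() {
	off := false
	fmt.Println(effectiveAutoRenewal(true, nil, nil))  // no opt-out anywhere
	fmt.Println(effectiveAutoRenewal(true, &off, nil)) // FQDN opted out
}
```
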

---

## 5. Scheduler Implementation

### 5.1 Options

#### Option A: Go Cron Library (Recommended)
```go
import (
	"log"

	"github.com/robfig/cron/v3"
)

c := cron.New()
// Run the renewal scan daily at 02:00 (standard 5-field cron syntax).
if _, err := c.AddFunc("0 2 * * *", func() {
	runCertificateRenewalScan()
}); err != nil {
	log.Fatalf("invalid cron schedule: %v", err)
}
c.Start()
```

**Advantages:**
- Easy to implement
- Well tested
- Supports cron syntax

**Disadvantages:**
- Runs only inside the backend process
- The scheduler must be started again whenever the backend restarts

#### Option B: Separate Background Service
```go
// Separate goroutine that runs continuously
go func() {
	ticker := time.NewTicker(24 * time.Hour)
	defer ticker.Stop() // release the ticker when the goroutine exits
	for {
		select {
		case <-ticker.C:
			runCertificateRenewalScan()
		case <-ctx.Done():
			return
		}
	}
}()
```

**Advantages:**
- Easier to debug
- No external dependency

**Disadvantages:**
- Less flexible than cron
- Scheduling logic has to be implemented by hand
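
To illustrate what "implemented by hand" means for Option B: a plain 24-hour ticker fires relative to process start, not at 02:00. A rough sketch of the missing piece is a helper that computes the next wall-clock run time, so the goroutine can sleep until then. The names `nextRun` and `mustParse` are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// nextRun returns the first instant strictly after now at which the wall
// clock in now's location reads hour:minute.
func nextRun(now time.Time, hour, minute int) time.Time {
	next := time.Date(now.Year(), now.Month(), now.Day(), hour, minute, 0, 0, now.Location())
	if !next.After(now) {
		next = next.AddDate(0, 0, 1) // today's slot has passed, use tomorrow
	}
	return next
}

// mustParse is a small helper for the example; it panics on malformed input.
func mustParse(s string) time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return t
}

func main() {
	now := mustParse("2025-01-15T03:30:00Z")
	fmt.Println(nextRun(now, 2, 0).Format(time.RFC3339)) // 2025-01-16T02:00:00Z
}
```

In the goroutine, `time.NewTimer(nextRun(time.Now(), 2, 0).Sub(time.Now()))` could then replace the fixed ticker and be re-armed after every scan.
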

#### Option C: System Cron Job
```bash
# /etc/cron.d/certigo-renewal
# Entries in /etc/cron.d require a user field (here: root) before the command.
0 2 * * * root curl -X POST http://localhost:8080/api/internal/renewal/scan
```

**Advantages:**
- Independent of the backend process
- The schedule survives backend restarts

**Disadvantages:**
- External dependency (curl)
- Harder to debug
- Requires a separate API endpoint

**Recommendation: Option A (Go Cron Library)**

---

## 6. Certificate Scanner

### 6.1 Query for Expiring Certificates

```sql
SELECT
    c.id,
    c.fqdn_id,
    c.space_id,
    c.certificate_id,
    c.provider_id,
    c.expires_at,
    c.auto_renewal_enabled,
    c.renewal_status,
    c.renewal_attempts,
    c.last_renewal_attempt,
    f.fqdn,
    f.acme_email,
    f.acme_key_id,
    f.provider_id as fqdn_provider_id
FROM certificates c
INNER JOIN fqdns f ON c.fqdn_id = f.id
WHERE
    -- Leaf certificates only (no intermediates)
    c.is_intermediate = 0
    -- Let's Encrypt certificates only (via certigo-acmeproxy)
    AND c.provider_id = 'certigo-acmeproxy'
    -- Valid/issued certificates only
    AND c.status = 'issued'
    -- Auto-renewal must be enabled (NULL counts as enabled)
    AND (c.auto_renewal_enabled IS NULL OR c.auto_renewal_enabled = 1)
    AND (f.auto_renewal_enabled IS NULL OR f.auto_renewal_enabled = 1)
    -- Certificate expires soon (parameter 1: threshold in days)
    AND c.expires_at IS NOT NULL
    AND datetime(c.expires_at) <= datetime('now', '+' || ? || ' days')
    -- No renewal currently running
    AND (c.renewal_status IS NULL OR c.renewal_status != 'in_progress')
    -- Not too many attempts (parameter 2: max. attempts)
    AND (c.renewal_attempts IS NULL OR c.renewal_attempts < ?)
    -- Retry delay has elapsed (parameter 3: delay in hours)
    AND (
        c.last_renewal_attempt IS NULL
        OR datetime(c.last_renewal_attempt) <= datetime('now', '-' || ? || ' hours')
    )
ORDER BY c.expires_at ASC;
```

### 6.2 Filter Logic

**Selection criteria (a certificate is picked up only if all of them hold):**
1. ✅ Leaf certificate (intermediate certificates are excluded)
2. ✅ Provider is `certigo-acmeproxy`
3. ✅ Status = 'issued'
4. ✅ Auto-renewal enabled (certificate + FQDN)
5. ✅ `expires_at` within the threshold
6. ✅ No renewal in progress (`renewal_status != 'in_progress'`)
7. ✅ Max. attempts not exceeded
8. ✅ Retry delay has elapsed
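
The eight criteria can also be expressed as an in-memory predicate, which is handy for unit-testing the scanner logic without a database. This is a sketch under assumptions: `Candidate`, `Policy`, `eligible` and `at` are hypothetical names, nil pointers model NULL columns, and `AutoRenewal` is expected to be the already combined certificate and FQDN flag.

```go
package main

import (
	"fmt"
	"time"
)

// Candidate holds the columns the filter needs; nil pointers model NULL.
type Candidate struct {
	IsIntermediate     bool
	ProviderID         string
	Status             string
	AutoRenewal        bool // combined certificate + FQDN flag
	ExpiresAt          *time.Time
	RenewalStatus      string
	RenewalAttempts    int
	LastRenewalAttempt *time.Time
}

// Policy carries the three query parameters.
type Policy struct {
	ThresholdDays   int
	MaxAttempts     int
	RetryDelayHours int
}

// eligible reports whether c satisfies all eight criteria at time now.
func eligible(c Candidate, p Policy, now time.Time) bool {
	if c.IsIntermediate || c.ProviderID != "certigo-acmeproxy" || c.Status != "issued" || !c.AutoRenewal {
		return false
	}
	deadline := now.AddDate(0, 0, p.ThresholdDays)
	if c.ExpiresAt == nil || c.ExpiresAt.After(deadline) {
		return false
	}
	if c.RenewalStatus == "in_progress" || c.RenewalAttempts >= p.MaxAttempts {
		return false
	}
	retryFrom := now.Add(-time.Duration(p.RetryDelayHours) * time.Hour)
	return c.LastRenewalAttempt == nil || !c.LastRenewalAttempt.After(retryFrom)
}

// at parses an RFC 3339 timestamp for the example; it panics on bad input.
func at(s string) *time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return &t
}

func main() {
	now := *at("2025-01-15T02:00:00Z")
	p := Policy{ThresholdDays: 30, MaxAttempts: 3, RetryDelayHours: 24}
	c := Candidate{ProviderID: "certigo-acmeproxy", Status: "issued", AutoRenewal: true, ExpiresAt: at("2025-02-01T00:00:00Z")}
	fmt.Println(eligible(c, p, now)) // true: expires within 30 days
	c.LastRenewalAttempt = at("2025-01-14T20:00:00Z")
	fmt.Println(eligible(c, p, now)) // false: last attempt was only 6 hours ago
}
```
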

---

## 7. Renewal Worker

### 7.1 Renewal Process

```go
// updateTokenFunc, cleanupTokenFunc, statusCallback and retryDelay are
// expected to be defined elsewhere in the backend.
func renewCertificate(certID string, fqdnID string, spaceID string) error {
	traceID := generateTraceID()

	// fail records the failure so the certificate never stays "in_progress".
	fail := func(msg string, err error) error {
		updateRenewalStatus(certID, "failed", traceID)
		logRenewalError(certID, traceID, msg, err)
		return err
	}

	// 1. Mark as "in_progress"
	updateRenewalStatus(certID, "in_progress", traceID)

	// 2. Fetch FQDN information
	fqdn, err := getFQDN(fqdnID)
	if err != nil {
		return fail("FQDN not found", err)
	}

	// 3. Check DNS (CNAME)
	if !validateDNSCNAME(fqdn.FQDN) {
		return fail("DNS CNAME is not valid", fmt.Errorf("DNS validation failed for %s", fqdn.FQDN))
	}

	// 4. Fetch ACME provider configuration (verifies the provider is configured)
	if _, err = getACMEProviderConfig(fqdn.ProviderID); err != nil {
		return fail("provider configuration not found", err)
	}

	// 5. Call RequestCertificate()
	result, err := RequestCertificate(
		fqdn.FQDN,
		fqdn.AcmeEmail,
		fqdnID,
		fqdn.AcmeKeyID,
		traceID,
		updateTokenFunc,
		cleanupTokenFunc,
		statusCallback,
	)
	if err != nil {
		// 6a. Failure: increment attempts, schedule the next retry
		incrementRenewalAttempts(certID)
		setNextRenewalCheck(certID, time.Now().Add(retryDelay))
		return fail("renewal failed", err)
	}

	// 6b. Success: store the new certificate
	newCertID, err := saveNewCertificate(result, fqdnID, spaceID)
	if err != nil {
		return fail("failed to store the new certificate", err)
	}

	// 7. Link the old and the new certificate
	linkCertificates(certID, newCertID)

	// 8. Mark as successful
	updateRenewalStatus(certID, "success", traceID)
	logRenewalSuccess(certID, newCertID, traceID)

	// 9. Optional: send a notification
	sendRenewalNotification(fqdn.FQDN, newCertID, traceID)

	return nil
}
```

### 7.2 Concurrency Control

**Concurrency limits:**
- Only one renewal at a time per FQDN
- At most N parallel renewals per space (configurable)
- At most M parallel renewals globally (configurable)

**Implementation:**
```go
// Semaphore that enforces the global concurrency limit
var renewalSemaphore = make(chan struct{}, maxConcurrentRenewals)

func renewCertificateWithLock(certID string, fqdnID string, spaceID string) error {
	renewalSemaphore <- struct{}{}        // Acquire
	defer func() { <-renewalSemaphore }() // Release

	return renewCertificate(certID, fqdnID, spaceID)
}
```
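
The semaphore above covers only the global limit. For the "one renewal per FQDN" rule, one possible approach is a small in-process registry of FQDN IDs with a renewal in flight. This is a sketch, not the project's implementation: `fqdnLocks` and its methods are hypothetical, and the registry only protects a single backend process. With several instances, the `renewal_status = 'in_progress'` check in the database would still be needed.

```go
package main

import (
	"fmt"
	"sync"
)

// fqdnLocks tracks which FQDN IDs currently have a renewal in flight.
type fqdnLocks struct {
	mu     sync.Mutex
	active map[string]struct{}
}

func newFQDNLocks() *fqdnLocks {
	return &fqdnLocks{active: make(map[string]struct{})}
}

// tryLock returns false if a renewal for fqdnID is already running.
func (l *fqdnLocks) tryLock(fqdnID string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if _, busy := l.active[fqdnID]; busy {
		return false
	}
	l.active[fqdnID] = struct{}{}
	return true
}

// unlock marks the renewal for fqdnID as finished.
func (l *fqdnLocks) unlock(fqdnID string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	delete(l.active, fqdnID)
}

func main() {
	locks := newFQDNLocks()
	fmt.Println(locks.tryLock("fqdn-1")) // true
	fmt.Println(locks.tryLock("fqdn-1")) // false, still held
	locks.unlock("fqdn-1")
	fmt.Println(locks.tryLock("fqdn-1")) // true again
}
```
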

---

## 8. Error Handling & Retry

### 8.1 Error Categories

| Error type | Retry? | Max. attempts | Example |
|-----------|--------|--------------|----------|
| DNS validation failed | ✅ Yes | 3 | CNAME not set |
| ACME provider error | ✅ Yes | 3 | Rate limit reached |
| Network error | ✅ Yes | 5 | Timeout, connection error |
| Configuration error | ❌ No | 0 | Provider not configured |
| Permission error | ❌ No | 0 | No space access |
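
The table translates directly into a lookup. The sketch below assumes short string keys for the error types (`"dns"`, `"acme"`, `"network"`, `"config"`, `"permission"`); these keys and the function `shouldRetry` are illustrative, not names taken from the codebase.

```go
package main

import "fmt"

// maxAttemptsByType encodes the table above; 0 means "never retry".
var maxAttemptsByType = map[string]int{
	"dns":        3,
	"acme":       3,
	"network":    5,
	"config":     0,
	"permission": 0,
}

// shouldRetry reports whether another attempt is allowed after `attempts`
// failed attempts of the given error type. Unknown types are not retried.
func shouldRetry(errorType string, attempts int) bool {
	return attempts < maxAttemptsByType[errorType]
}

func main() {
	fmt.Println(shouldRetry("network", 4)) // true: limit is 5
	fmt.Println(shouldRetry("dns", 3))     // false: limit reached
	fmt.Println(shouldRetry("config", 0))  // false: never retried
}
```

Treating unknown error types as non-retryable is a conservative assumption; the concept does not define a fallback.
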

### 8.2 Retry Strategy

**Incremental backoff (the delay grows by 24 hours per attempt, capped at 72 hours):**
```
Attempt 1: Immediately
Attempt 2: After 24 hours
Attempt 3: After 48 hours
Attempt 4+: After 72 hours
```

**Alternative: fixed delay**
```
All retries: After X hours (configurable, default: 24h)
```
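
A small sketch of the incremental schedule as a function of the 1-based attempt number. The name `backoffForAttempt` is hypothetical; the values come straight from the list above.

```go
package main

import (
	"fmt"
	"time"
)

// backoffForAttempt returns the wait before the given attempt (1-based):
// 0h, 24h, 48h, then 72h for every later attempt.
func backoffForAttempt(attempt int) time.Duration {
	if attempt <= 1 {
		return 0
	}
	hours := (attempt - 1) * 24
	if hours > 72 {
		hours = 72 // cap for attempt 4 and later
	}
	return time.Duration(hours) * time.Hour
}

func main() {
	for attempt := 1; attempt <= 5; attempt++ {
		fmt.Printf("attempt %d: wait %s\n", attempt, backoffForAttempt(attempt))
	}
}
```

The fixed-delay alternative would simply return `time.Duration(retryDelayHours) * time.Hour` for every attempt after the first.
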

### 8.3 Error Logging

```go
type RenewalError struct {
	CertificateID string
	FQDN          string
	ErrorType     string // "dns", "acme", "network", "config"
	ErrorMessage  string
	TraceID       string
	Timestamp     time.Time
	Attempt       int
}
```

---

## 9. Logging & Monitoring

### 9.1 Structured Logging

**Successful renewal:**
```json
{
  "event": "certificate_renewal_success",
  "trace_id": "abc123",
  "certificate_id": "cert-uuid",
  "fqdn": "example.com",
  "old_expires_at": "2025-02-15T10:00:00Z",
  "new_expires_at": "2025-05-15T10:00:00Z",
  "new_certificate_id": "new-cert-uuid",
  "duration_seconds": 45,
  "timestamp": "2025-01-15T02:05:00Z"
}
```

**Failed renewal:**
```json
{
  "event": "certificate_renewal_failed",
  "trace_id": "abc123",
  "certificate_id": "cert-uuid",
  "fqdn": "example.com",
  "error_type": "dns_validation",
  "error_message": "CNAME record not found",
  "attempt": 1,
  "max_attempts": 3,
  "next_retry": "2025-01-16T02:00:00Z",
  "timestamp": "2025-01-15T02:05:00Z"
}
```
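
One way to produce the "failed" event is a tagged struct encoded with `encoding/json`. The sketch below is an assumption about how the existing logger could be fed: `renewalFailedEvent` and `encodeEvent` are hypothetical names, and only the field names are taken from the JSON example above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// renewalFailedEvent mirrors the fields of the failed-renewal log entry.
type renewalFailedEvent struct {
	Event         string    `json:"event"`
	TraceID       string    `json:"trace_id"`
	CertificateID string    `json:"certificate_id"`
	FQDN          string    `json:"fqdn"`
	ErrorType     string    `json:"error_type"`
	ErrorMessage  string    `json:"error_message"`
	Attempt       int       `json:"attempt"`
	MaxAttempts   int       `json:"max_attempts"`
	NextRetry     time.Time `json:"next_retry"`
	Timestamp     time.Time `json:"timestamp"`
}

// encodeEvent sets the event name and renders the entry as one JSON line.
func encodeEvent(ev renewalFailedEvent) (string, error) {
	ev.Event = "certificate_renewal_failed"
	b, err := json.Marshal(ev)
	return string(b), err
}

func main() {
	now := time.Date(2025, 1, 15, 2, 5, 0, 0, time.UTC)
	line, err := encodeEvent(renewalFailedEvent{
		TraceID: "abc123", CertificateID: "cert-uuid", FQDN: "example.com",
		ErrorType: "dns_validation", ErrorMessage: "CNAME record not found",
		Attempt: 1, MaxAttempts: 3, NextRetry: now.Add(24 * time.Hour), Timestamp: now,
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(line)
}
```
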

### 9.2 Audit Logs

**Integration into the existing audit system:**
```go
auditService.Track(ctx, "RENEW", "certificate", certID, "system", "auto-renewal", map[string]interface{}{
	"fqdn":           fqdn,
	"old_expires_at": oldExpiresAt,
	"new_expires_at": newExpiresAt,
	"trace_id":       traceID,
}, ipAddress, userAgent)
```

### 9.3 Metrics

**To be tracked:**
- Number of renewals per day/week/month
- Success rate (successful / total)
- Average renewal duration
- Number of failed renewals
- Number of retries
- Certificates that expire soon (warning)

---

## 10. API Endpoints

### 10.1 Manual Renewal

**POST** `/api/spaces/{spaceId}/fqdns/{fqdnId}/certificates/{certId}/renew`

Triggers a renewal manually.

**Response:**
```json
{
  "success": true,
  "message": "Renewal started",
  "trace_id": "abc123",
  "estimated_completion": "2025-01-15T02:05:00Z"
}
```

### 10.2 Get Renewal Status

**GET** `/api/spaces/{spaceId}/fqdns/{fqdnId}/certificates/{certId}/renewal-status`

**Response:**
```json
{
  "auto_renewal_enabled": true,
  "renewal_status": "success",
  "renewal_attempts": 1,
  "last_renewal_attempt": "2025-01-15T02:00:00Z",
  "next_renewal_check": "2025-01-16T02:00:00Z",
  "replaced_by_cert_id": "new-cert-uuid"
}
```

### 10.3 Get Renewal Logs

**GET** `/api/spaces/{spaceId}/fqdns/{fqdnId}/certificates/{certId}/renewal-logs`

**Query parameters:**
- `limit` (optional): Number of entries (default: 50)
- `offset` (optional): Pagination offset

**Response:**
```json
{
  "logs": [
    {
      "id": "log-uuid",
      "renewal_status": "success",
      "renewal_reason": "expiring_soon",
      "old_expires_at": "2025-02-15T10:00:00Z",
      "new_expires_at": "2025-05-15T10:00:00Z",
      "new_certificate_id": "new-cert-uuid",
      "renewal_duration_seconds": 45,
      "trace_id": "abc123",
      "created_at": "2025-01-15T02:00:00Z"
    }
  ],
  "total": 10,
  "limit": 50,
  "offset": 0
}
```
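
A sketch of how the handler might parse `limit` and `offset` from the raw query string. The default of 50 is taken from the parameter list above; the upper bound `maxLimit` and the function name `parsePagination` are assumptions added for illustration.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

const (
	defaultLimit = 50
	maxLimit     = 200 // assumed upper bound, not specified in the concept
)

// parsePagination extracts limit and offset from a raw query string such as
// "limit=10&offset=20" and applies the default when limit is absent.
func parsePagination(rawQuery string) (limit, offset int, err error) {
	q, err := url.ParseQuery(rawQuery)
	if err != nil {
		return 0, 0, fmt.Errorf("invalid query: %w", err)
	}
	limit = defaultLimit
	if s := q.Get("limit"); s != "" {
		if limit, err = strconv.Atoi(s); err != nil || limit < 1 {
			return 0, 0, fmt.Errorf("invalid limit %q", s)
		}
		if limit > maxLimit {
			limit = maxLimit
		}
	}
	if s := q.Get("offset"); s != "" {
		if offset, err = strconv.Atoi(s); err != nil || offset < 0 {
			return 0, 0, fmt.Errorf("invalid offset %q", s)
		}
	}
	return limit, offset, nil
}

func main() {
	limit, offset, err := parsePagination("limit=10&offset=20")
	fmt.Println(limit, offset, err)
}
```

In an `http.Handler`, the input would be `r.URL.RawQuery`.
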

### 10.4 Configure Auto-Renewal

**PUT** `/api/spaces/{spaceId}/fqdns/{fqdnId}/certificates/{certId}/auto-renewal`

**Body:**
```json
{
  "enabled": true
}
```

### 10.5 Global Configuration

**GET** `/api/internal/renewal/config`

**Response:**
```json
{
  "enabled": true,
  "renewal_threshold_days": 30,
  "check_interval_hours": 24,
  "max_renewal_attempts": 3,
  "retry_delay_hours": 24
}
```

**PUT** `/api/internal/renewal/config`

**Body:**
```json
{
  "enabled": true,
  "renewal_threshold_days": 30,
  "check_interval_hours": 24,
  "max_renewal_attempts": 3,
  "retry_delay_hours": 24
}
```

### 10.6 Manual Scan (for Testing)

**POST** `/api/internal/renewal/scan`

Triggers a manual scan (admins only).

**Response:**
```json
{
  "success": true,
  "certificates_found": 5,
  "certificates_queued": 3,
  "certificates_skipped": 2
}
```

---

## 11. Frontend Integration

### 11.1 UI Components

#### Auto-Renewal Toggle
- **Location**: Certificate detail view
- **Function**: On/off switch for auto-renewal per certificate

#### Renewal Status Badge
- **Location**: Certificate list & detail view
- **Display**:
  - 🟢 "Auto-renewal active" (when enabled)
  - 🟡 "Renewal in progress" (when in_progress)
  - 🔴 "Renewal failed" (when failed)
  - ⚪ "Auto-renewal disabled" (when disabled)

#### Renewal History
- **Location**: Certificate detail view
- **Display**: Table of renewal logs
- **Columns**: Date, status, reason, new expiry time, trace ID

#### Manual Renewal Button
- **Location**: Certificate detail view
- **Function**: "Renew now" button (if auto-renewal is disabled)

#### Upcoming Renewals Dashboard
- **Location**: Dashboard/overview
- **Display**: List of certificates that are due for renewal soon
- **Filter**: By space, FQDN, expiry date

### 11.2 Notifications (Optional)

**Email notifications:**
- Successful renewal
- Failed renewal (after the maximum number of attempts)
- Warning: certificate expires in X days (if the renewal fails)

**In-app notifications:**
- Toast notification on successful renewal
- Alert on failed renewal

---

## 12. Security & Permissions

### 12.1 Permissions

**Run auto-renewal:**
- System user (for automatic renewals)
- Admin user (for manual renewals)
- User with `FULL_ACCESS` on the space (for manual renewals)

**Configure auto-renewal:**
- Admin user
- User with `FULL_ACCESS` on the space

**View renewal logs:**
- All users with space access (READ permission)
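
The rules above can be summarized in one check. This sketch does not reflect the existing permission system's API: `Actor`, `Action` and `allowed` are hypothetical, and the access levels are modelled as plain strings. It also assumes that admins may view logs and that `FULL_ACCESS` implies read access, which the list does not state explicitly.

```go
package main

import "fmt"

// Actor describes who is asking (hypothetical type).
type Actor struct {
	IsSystem    bool
	IsAdmin     bool
	SpaceAccess string // "", "READ" or "FULL_ACCESS" on the certificate's space
}

// Action enumerates the operations covered by the permission rules.
type Action int

const (
	ActionAutoRenew Action = iota // renewal triggered by the scheduler
	ActionManualRenew
	ActionConfigure
	ActionViewLogs
)

// allowed applies the permission rules from this section.
func allowed(a Actor, act Action) bool {
	switch act {
	case ActionAutoRenew:
		return a.IsSystem
	case ActionManualRenew, ActionConfigure:
		return a.IsAdmin || a.SpaceAccess == "FULL_ACCESS"
	case ActionViewLogs:
		// Assumption: admins and FULL_ACCESS users can read as well.
		return a.IsAdmin || a.SpaceAccess == "READ" || a.SpaceAccess == "FULL_ACCESS"
	}
	return false
}

func main() {
	reader := Actor{SpaceAccess: "READ"}
	fmt.Println(allowed(reader, ActionViewLogs))    // true
	fmt.Println(allowed(reader, ActionManualRenew)) // false
}
```
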

### 12.2 Rate Limiting

**Let's Encrypt rate limits:**
- 50 certificates per registered domain per week
- 300 new orders per account per 3 hours

**Protection:**
- Track the number of renewals per FQDN
- Delay the renewal when a rate limit is reached
- Log a warning when a rate limit is hit
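
The tracking step can be sketched as a sliding-window counter per key (for example the registered domain). `windowLimiter`, `newWeeklyLimiter` and `mustParse` are hypothetical names; the sketch keeps its state in memory, is not safe for concurrent use without a mutex, and only approximates the limits Let's Encrypt enforces on its side.

```go
package main

import (
	"fmt"
	"time"
)

// windowLimiter allows at most limit events per key within window.
type windowLimiter struct {
	limit  int
	window time.Duration
	events map[string][]time.Time
}

// newWeeklyLimiter creates a limiter with a 7-day window.
func newWeeklyLimiter(limit int) *windowLimiter {
	return &windowLimiter{limit: limit, window: 7 * 24 * time.Hour, events: make(map[string][]time.Time)}
}

// allow records an event for key at time now and reports whether it fits
// into the window; rejected events are not recorded.
func (w *windowLimiter) allow(key string, now time.Time) bool {
	cutoff := now.Add(-w.window)
	kept := w.events[key][:0]
	for _, t := range w.events[key] {
		if t.After(cutoff) {
			kept = append(kept, t) // still inside the window
		}
	}
	if len(kept) >= w.limit {
		w.events[key] = kept
		return false
	}
	w.events[key] = append(kept, now)
	return true
}

// mustParse is a small helper for the example; it panics on malformed input.
func mustParse(s string) time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return t
}

func main() {
	limiter := newWeeklyLimiter(50)
	now := mustParse("2025-01-15T02:00:00Z")
	fmt.Println(limiter.allow("example.com", now)) // true: first certificate this week
}
```
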

---

## 13. Testing & Rollout

### 13.1 Test Plan

**Phase 1: Unit tests**
- [ ] Certificate scanner query
- [ ] Renewal worker logic
- [ ] Retry mechanism
- [ ] Error handling

**Phase 2: Integration tests**
- [ ] End-to-end renewal (with staging ACME)
- [ ] Failure scenarios (DNS errors, rate limit)
- [ ] Concurrency tests

**Phase 3: Staging tests**
- [ ] Test with real staging certificates
- [ ] Verify monitoring & logging
- [ ] Performance tests

**Phase 4: Production rollout**
- [ ] Enable the feature flag
- [ ] Enable monitoring
- [ ] Gradual activation (individual FQDNs first)

### 13.2 Rollback Plan

**If problems occur:**
1. Disable auto-renewal globally (config)
2. Abort running renewals (reset status)
3. Manual renewal remains possible

---

## 14. Monitoring & Alerting

### 14.1 Health Checks

**Endpoint:** `GET /api/health/renewal`

**Response:**
```json
{
  "status": "healthy",
  "last_scan": "2025-01-15T02:00:00Z",
  "next_scan": "2025-01-16T02:00:00Z",
  "certificates_pending": 2,
  "certificates_in_progress": 1,
  "certificates_failed": 0
}
```

### 14.2 Alerts

**To be monitored:**
- ❌ Auto-renewal service is not running
- ⚠️ Many failed renewals (> 10% in 24h)
- ⚠️ Certificates expire in < 7 days (without renewal)
- ⚠️ Rate limit reached
- ⚠️ Scheduler is not running (last scan > 48h ago)
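
The thresholds above could be evaluated from a few counters, roughly as follows. `ScanStats`, `evaluateAlerts` and the alert identifiers are hypothetical; how the counters are collected is left open. Detecting that the service itself is down has to happen outside the process and is therefore not covered here.

```go
package main

import "fmt"

// ScanStats summarizes the figures needed for alerting (hypothetical type).
type ScanStats struct {
	HoursSinceLastScan float64
	Renewals24h        int  // renewal attempts in the last 24 hours
	Failed24h          int  // failed attempts among them
	ExpiringUnrenewed  int  // certificates expiring in < 7 days without renewal
	RateLimited        bool // a rate limit was hit since the last check
}

// evaluateAlerts returns the identifiers of all alerts that should fire.
func evaluateAlerts(s ScanStats) []string {
	var alerts []string
	if s.HoursSinceLastScan > 48 {
		alerts = append(alerts, "scheduler_stalled")
	}
	if s.Renewals24h > 0 && float64(s.Failed24h)/float64(s.Renewals24h) > 0.10 {
		alerts = append(alerts, "high_failure_rate")
	}
	if s.ExpiringUnrenewed > 0 {
		alerts = append(alerts, "certificates_expiring")
	}
	if s.RateLimited {
		alerts = append(alerts, "rate_limit_reached")
	}
	return alerts
}

func main() {
	stats := ScanStats{HoursSinceLastScan: 50, Renewals24h: 10, Failed24h: 2}
	fmt.Println(evaluateAlerts(stats)) // [scheduler_stalled high_failure_rate]
}
```
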

---

## 15. Future Extensions

### 15.1 Multi-Provider Support
- Renewal for other providers (not only Let's Encrypt)

### 15.2 Smart Scheduling
- Renewal based on traffic patterns
- Renewal outside business hours

### 15.3 Batch Renewals
- Renew several certificates at once (where possible)

### 15.4 Webhook Integration
- Webhooks for successful/failed renewals
- Integration with external monitoring tools

### 15.5 Certificate Rotation
- Automatic rotation of private keys
- Support for key rollover

---

## 16. Dependencies

### 16.1 Backend (Go)

```go
// Cron scheduler
github.com/robfig/cron/v3

// (Already present)
// - ACME client (acme_client.go)
// - Certificate parser (cert_parser.go)
// - Logger (cert_logger.go)
```

### 16.2 Frontend

No additional dependencies required.

---

## 17. Risks & Mitigation

### 17.1 Risks

| Risk | Likelihood | Impact | Mitigation |
|--------|-------------------|--------|------------|
| Rate limit reached | Medium | High | Rate limit tracking, delay |
| DNS validation fails | Medium | High | DNS check before renewal, retry |
| ACME provider downtime | Low | High | Retry mechanism, fallback |
| Duplicate renewal | Low | Medium | Status check, locking |
| Database lock | Low | Medium | Transactions, timeouts |

### 17.2 Best Practices

- ✅ Idempotency: a renewal can be run several times without causing problems
- ✅ Atomic operations: database transactions for consistency
- ✅ Graceful degradation: manual renewal remains possible when errors occur
- ✅ Comprehensive logging: log every step for debugging
- ✅ Rate limit awareness: respect the Let's Encrypt limits

---

## 18. Summary

### 18.1 Benefits

- **Automation**: No manual intervention required
- **Reliability**: Certificates are renewed before they expire
- **Time savings**: Less manual work
- **Security**: Certificates are always valid

### 18.2 Challenges

- **Complexity**: Additional infrastructure and code
- **Error handling**: Robust error handling is required
- **Rate limits**: Let's Encrypt limits must be observed
- **Testing**: Extensive tests are required

### 18.3 Recommended Implementation Order

1. **Phase 1**: Database schema & basic functionality
2. **Phase 2**: Scanner & worker
3. **Phase 3**: Scheduler & automation
4. **Phase 4**: Frontend integration
5. **Phase 5**: Monitoring & alerting
6. **Phase 6**: Notifications (optional)

---

**Created on**: 2025-01-XX
**Version**: 1.0
**Status**: Concept, not yet implemented