This document provides comprehensive guidance for managing SSL certificates in the Obsidian stack using cert-manager and Let's Encrypt.
Related Documentation:
- Main README - Platform overview
- Cert-Manager Module - Certificate automation
- Reverse Proxy Setup - Nginx proxy configuration
The Obsidian stack consists of two main services that require SSL certificates:
- Obsidian Application: Accessible at blackrock.gray-beard.com
- CouchDB Database: Accessible at couchdb.blackrock.gray-beard.com
SSL certificates are automatically managed using cert-manager with Let's Encrypt as the certificate authority. The system supports both staging and production certificate environments.
- cert-manager: Kubernetes controller for automatic certificate provisioning and renewal
- ClusterIssuer: Defines the certificate authority (Let's Encrypt) configuration
- Certificate Resources: Automatically created by cert-manager based on ingress annotations
- TLS Secrets: Kubernetes secrets containing the issued certificates and private keys
- Traefik Ingress Controller: Serves HTTPS traffic using the certificates
- Ingress resources are applied with cert-manager annotations
- cert-manager detects the annotations and creates Certificate resources
- ACME challenge is initiated with Let's Encrypt
- Upon successful validation, certificates are issued and stored as Kubernetes secrets
- Traefik ingress controller uses the certificates to serve HTTPS traffic
- cert-manager automatically renews certificates before expiration
For production certificates, use the following ingress resources:
- obsidian-ingress-tls.yaml - Obsidian application with production certificates
- couchdb-ingress-tls.yaml - CouchDB with production certificates
Key Configuration Elements:
metadata:
  annotations:
    kubernetes.io/ingress.class: "traefik"
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
  tls:
    - hosts:
        - blackrock.gray-beard.com
      secretName: obsidian-tls
For staging/testing certificates, use:
- obsidian-ingress-tls-staging.yaml - Obsidian application with staging certificates
- couchdb-ingress-tls-staging.yaml - CouchDB with staging certificates
Key Configuration Elements:
metadata:
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-staging"
Certificates are stored as Kubernetes TLS secrets:
- obsidian-tls - Contains certificate for blackrock.gray-beard.com
- couchdb-tls - Contains certificate for couchdb.blackrock.gray-beard.com
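Because the production and staging manifests differ mainly in the cluster-issuer annotation, it can help to confirm which issuer a manifest requests before applying it. The following is a minimal sketch, not part of the provided tooling: the function names `manifest_issuer` and `expect_issuer` are hypothetical, and the parsing assumes the annotation sits on a single line as in the snippets above.

```bash
#!/bin/bash
# Print the cluster issuer named in an ingress manifest's annotations.
# Assumption: the annotation is written on one line, optionally quoted.
manifest_issuer() {
  local manifest="$1"
  sed -n 's|^[[:space:]]*cert-manager\.io/cluster-issuer:[[:space:]]*"\{0,1\}\([^"[:space:]]*\)"\{0,1\}[[:space:]]*$|\1|p' "$manifest" | head -n 1
}

# Fail (non-zero exit) unless the manifest requests the expected issuer.
expect_issuer() {
  local manifest="$1" expected="$2" actual
  actual="$(manifest_issuer "$manifest")"
  if [ "$actual" != "$expected" ]; then
    echo "$manifest requests '${actual:-none}', expected '$expected'" >&2
    return 1
  fi
}
```

A possible use is `expect_issuer obsidian-ingress-tls.yaml letsencrypt-prod && kubectl apply -f obsidian-ingress-tls.yaml`, which skips the apply if the file points at a different issuer.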
Before switching certificate environments:
- Verify cert-manager is running:
  kubectl get pods -n cert-manager
- Check ClusterIssuer availability:
  kubectl get clusterissuer
- Ensure DNS records point to your ingress IP:
  nslookup blackrock.gray-beard.com
  nslookup couchdb.blackrock.gray-beard.com
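The DNS prerequisite can also be checked mechanically instead of by reading nslookup output. This is a rough sketch under stated assumptions: `resolve_host` and `dns_points_to` are hypothetical helper names, `getent` is assumed to be available (glibc-based systems), and the expected IP is whatever address your ingress reports.

```bash
#!/bin/bash
# Resolve a host name to its first IPv4 address.
# Assumption: getent is available; swap in dig or nslookup parsing if it is not.
resolve_host() {
  getent ahostsv4 "$1" | awk 'NR == 1 { print $1 }'
}

# Succeed only when the host resolves to the expected ingress IP.
dns_points_to() {
  local host="$1" expected_ip="$2" actual_ip
  actual_ip="$(resolve_host "$host")"
  if [ "$actual_ip" = "$expected_ip" ]; then
    echo "OK: $host -> $actual_ip"
  else
    echo "MISMATCH: $host -> ${actual_ip:-unresolved} (expected $expected_ip)" >&2
    return 1
  fi
}
```

For example, `dns_points_to blackrock.gray-beard.com <ingress-ip>` and the same call for couchdb.blackrock.gray-beard.com, with the ingress IP taken from `kubectl get ingress -n obsidian -o wide`.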
Use Case: Testing certificate configuration, avoiding Let's Encrypt rate limits
- Delete existing certificates and secrets (if switching from production):
  kubectl delete certificate obsidian-tls couchdb-tls -n obsidian
  kubectl delete secret obsidian-tls couchdb-tls -n obsidian
- Apply staging ingress resources:
  kubectl apply -f obsidian-ingress-tls-staging.yaml
  kubectl apply -f couchdb-ingress-tls-staging.yaml
- Verify certificate requests:
  kubectl get certificates -n obsidian
  kubectl describe certificate obsidian-tls -n obsidian
  kubectl describe certificate couchdb-tls -n obsidian
- Monitor certificate issuance:
  kubectl get certificaterequests -n obsidian
  kubectl logs -n cert-manager deployment/cert-manager
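Right after the ingress is applied, the Certificate resource may not exist yet, and `kubectl wait` fails immediately on a missing resource. One way to handle this is to poll for the resource first. The sketch below is illustrative only: `retry` and `wait_for_certificate` are hypothetical names, and the 12 attempts with a 5 second delay are arbitrary choices.

```bash
#!/bin/bash
# Run a command repeatedly until it succeeds or the attempts run out.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Only sleep when another attempt is still to come.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Wait for a Certificate to exist, then for it to report Ready.
wait_for_certificate() {
  local name="$1" namespace="$2"
  retry 12 5 kubectl get certificate "$name" -n "$namespace" >/dev/null 2>&1 || return 1
  kubectl wait --for=condition=Ready "certificate/$name" -n "$namespace" --timeout=300s
}
```

Calling `wait_for_certificate obsidian-tls obsidian` after the apply step returns non-zero if the Certificate never appears or does not become Ready within the timeout.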
Use Case: Deploying to production environment
- Delete staging certificates and secrets:
  kubectl delete certificate obsidian-tls couchdb-tls -n obsidian
  kubectl delete secret obsidian-tls couchdb-tls -n obsidian
- Apply production ingress resources:
  kubectl apply -f obsidian-ingress-tls.yaml
  kubectl apply -f couchdb-ingress-tls.yaml
- Verify certificate requests:
  kubectl get certificates -n obsidian
  kubectl describe certificate obsidian-tls -n obsidian
  kubectl describe certificate couchdb-tls -n obsidian
- Validate production certificates:
  ./validate-ssl-certificates.sh
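After switching to production, a quick way to confirm that the endpoint no longer serves a staging certificate is to inspect the issuer of the served certificate. This is a sketch, not a replacement for the validation script: the helper names are hypothetical, and it assumes that Let's Encrypt staging issuer names carry a "(STAGING)" marker.

```bash
#!/bin/bash
# Print the issuer line of the certificate served for a host on port 443.
served_issuer() {
  local host="$1"
  openssl s_client -connect "$host:443" -servername "$host" </dev/null 2>/dev/null \
    | openssl x509 -noout -issuer
}

# Succeed when an issuer string looks like a Let's Encrypt staging issuer.
# Assumption: staging issuer names contain the literal text "(STAGING)".
is_staging_issuer() {
  case "$1" in
    *"(STAGING)"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For instance, `is_staging_issuer "$(served_issuer blackrock.gray-beard.com)" && echo "still serving a staging certificate"`.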
Create a script to automate the switching process:
#!/bin/bash
# switch-certificates.sh
# Switch the Obsidian and CouchDB ingresses between the staging and production issuers.
# Stop on the first failed command so a failed issuance is not reported as success.
set -euo pipefail
ENVIRONMENT=${1:-staging}
if [ "$ENVIRONMENT" = "production" ]; then
  echo "Switching to production certificates..."
  kubectl delete certificate obsidian-tls couchdb-tls -n obsidian --ignore-not-found
  kubectl delete secret obsidian-tls couchdb-tls -n obsidian --ignore-not-found
  kubectl apply -f obsidian-ingress-tls.yaml
  kubectl apply -f couchdb-ingress-tls.yaml
elif [ "$ENVIRONMENT" = "staging" ]; then
  echo "Switching to staging certificates..."
  kubectl delete certificate obsidian-tls couchdb-tls -n obsidian --ignore-not-found
  kubectl delete secret obsidian-tls couchdb-tls -n obsidian --ignore-not-found
  kubectl apply -f obsidian-ingress-tls-staging.yaml
  kubectl apply -f couchdb-ingress-tls-staging.yaml
else
  echo "Usage: $0 [staging|production]" >&2
  exit 1
fi
echo "Waiting for certificates to be issued..."
# kubectl wait exits non-zero on timeout, which aborts the script before the final message
kubectl wait --for=condition=Ready certificate/obsidian-tls -n obsidian --timeout=300s
kubectl wait --for=condition=Ready certificate/couchdb-tls -n obsidian --timeout=300s
echo "Certificate switch complete!"
Symptoms:
kubectl get certificates -n obsidian
NAME           READY   SECRET         AGE
obsidian-tls   False   obsidian-tls   5m
Diagnosis:
kubectl describe certificate obsidian-tls -n obsidian
kubectl get certificaterequests -n obsidian
kubectl describe certificaterequest <request-name> -n obsidian
Common Causes & Solutions:
- DNS Resolution Issues: Verify domain resolves to ingress IP
  nslookup blackrock.gray-beard.com
  kubectl get ingress -n obsidian -o wide
- ClusterIssuer Not Ready: Check ClusterIssuer status
  kubectl get clusterissuer
  kubectl describe clusterissuer letsencrypt-prod
- ACME Challenge Failure: Check challenge resources
  kubectl get challenges -n obsidian
  kubectl describe challenge <challenge-name> -n obsidian
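When a certificate stays not ready, the reason and message on its Ready condition usually point to which of the causes above applies. The helper below is a small sketch (the name `certificate_ready_detail` is hypothetical) that pulls just those two fields instead of the full describe output.

```bash
#!/bin/bash
# Print "<reason>: <message>" from a Certificate's Ready condition.
certificate_ready_detail() {
  local name="$1" namespace="$2"
  kubectl get certificate "$name" -n "$namespace" \
    -o jsonpath='{range .status.conditions[?(@.type=="Ready")]}{.reason}: {.message}{"\n"}{end}'
}
```

Running `certificate_ready_detail obsidian-tls obsidian` prints one line per Ready condition; empty output suggests cert-manager has not yet set a status on the resource.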
Symptoms:
- Browser shows certificate warnings
- SSL validation script reports expired certificates
Diagnosis:
./validate-ssl-certificates.sh
openssl s_client -connect blackrock.gray-beard.com:443 -servername blackrock.gray-beard.com
Solutions:
- Force Certificate Renewal:
  kubectl delete certificate obsidian-tls -n obsidian
  kubectl delete secret obsidian-tls -n obsidian
  # Reapply ingress to trigger new certificate request
  kubectl apply -f obsidian-ingress-tls.yaml
- Check cert-manager Logs:
  kubectl logs -n cert-manager deployment/cert-manager
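Before forcing a renewal, it may be worth confirming that the stored certificate really is expired or close to it. A sketch using openssl's `-checkend` option follows; the function names are hypothetical. Note that any openssl failure (for example, unreadable input) is also treated as "expiring", which errs on the side of raising attention.

```bash
#!/bin/bash
# Succeed when the PEM certificate on stdin expires within the given number of days.
# openssl x509 -checkend exits non-zero if the certificate will have expired by then.
pem_expires_within() {
  local days="$1"
  ! openssl x509 -noout -checkend "$(( days * 86400 ))" >/dev/null
}

# Apply the same check to the certificate stored in a TLS secret.
secret_expires_within() {
  local secret="$1" namespace="$2" days="$3"
  kubectl get secret "$secret" -n "$namespace" -o jsonpath='{.data.tls\.crt}' \
    | base64 -d | pem_expires_within "$days"
}
```

For example, `secret_expires_within obsidian-tls obsidian 0 && echo "certificate has expired"`, or pass 30 to match the warning threshold used elsewhere in this document.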
Symptoms:
kubectl get secret obsidian-tls -n obsidian
Error from server (NotFound): secrets "obsidian-tls" not found
Solutions:
- Verify Certificate Resource Exists:
  kubectl get certificates -n obsidian
- Check Certificate Status:
  kubectl describe certificate obsidian-tls -n obsidian
- Recreate Certificate:
  kubectl delete certificate obsidian-tls -n obsidian
  kubectl apply -f obsidian-ingress-tls.yaml
Symptoms:
- Certificate requests fail with rate limit errors
- cert-manager logs show "too many certificates already issued"
Solutions:
- Switch to Staging Environment:
  kubectl apply -f obsidian-ingress-tls-staging.yaml
  kubectl apply -f couchdb-ingress-tls-staging.yaml
- Wait for Rate Limit Reset: Let's Encrypt applies its certificate limits over a rolling seven-day window, so capacity returns gradually rather than at a fixed weekly reset
- Use Different Domain: If testing, use a different subdomain
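To tell a rate-limit failure apart from other issuance errors, the cert-manager logs can be scanned for the relevant messages. This is a minimal sketch: `has_rate_limit_error` is a hypothetical helper, and the patterns are an assumption based on the symptom text above and the ACME `rateLimited` error type.

```bash
#!/bin/bash
# Succeed when log text on stdin mentions a Let's Encrypt rate limit.
# Assumption: the log line contains "too many certificates" or a "rate limit" variant.
has_rate_limit_error() {
  grep -qiE 'too many certificates|rate ?limit'
}
```

A possible use is `kubectl logs -n cert-manager deployment/cert-manager --tail=500 | has_rate_limit_error && echo "rate limited: switch to staging"`.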
# Check all certificates
kubectl get certificates -A
# Check specific certificate details
kubectl describe certificate obsidian-tls -n obsidian
# Check certificate events
kubectl get events -n obsidian --field-selector involvedObject.kind=Certificate
# Check cert-manager pods
kubectl get pods -n cert-manager
# Check cert-manager logs
kubectl logs -n cert-manager deployment/cert-manager
# Check webhook connectivity
kubectl get validatingwebhookconfigurations
kubectl get mutatingwebhookconfigurations
# List active challenges
kubectl get challenges -A
# Check challenge details
kubectl describe challenge <challenge-name> -n obsidian
# Check challenge events
kubectl get events -n obsidian --field-selector involvedObject.kind=Challenge
# List TLS secrets
kubectl get secrets -n obsidian --field-selector type=kubernetes.io/tls
# Examine certificate in secret
kubectl get secret obsidian-tls -n obsidian -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -text -noout
Use the provided validation script for regular monitoring:
# Run validation script
./validate-ssl-certificates.sh
# Schedule regular validation (add to crontab)
0 6 * * * /path/to/validate-ssl-certificates.sh
Monitor certificate expiration dates:
# Check certificate expiration
# The column spec is quoted so the shell does not treat [0] as a glob pattern
kubectl get certificates -n obsidian -o 'custom-columns=NAME:.metadata.name,READY:.status.conditions[0].status,EXPIRY:.status.notAfter'
# Get detailed expiration info
kubectl get secret obsidian-tls -n obsidian -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -enddate -noout
Set up expiration alerts:
Create a monitoring script that alerts when certificates expire within 30 days:
#!/bin/bash
# certificate-expiry-check.sh
# Warn when the certificate in a TLS secret expires within WARNING_DAYS.
NAMESPACE="obsidian"
SECRETS=("obsidian-tls" "couchdb-tls")
WARNING_DAYS=30
for secret in "${SECRETS[@]}"; do
  if kubectl get secret "$secret" -n "$NAMESPACE" >/dev/null 2>&1; then
    expiry_date=$(kubectl get secret "$secret" -n "$NAMESPACE" -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -enddate -noout | cut -d= -f2)
    # date -d is GNU-specific; BSD/macOS date needs a different invocation
    expiry_epoch=$(date -d "$expiry_date" +%s)
    current_epoch=$(date +%s)
    days_until_expiry=$(( (expiry_epoch - current_epoch) / 86400 ))
    if [ "$days_until_expiry" -lt "$WARNING_DAYS" ]; then
      echo "WARNING: Certificate $secret expires in $days_until_expiry days"
      # Add alerting mechanism here (email, Slack, etc.)
    fi
  else
    # A missing secret is worth reporting rather than skipping silently
    echo "WARNING: Secret $secret not found in namespace $NAMESPACE" >&2
  fi
done
Monitor cert-manager components:
# Check cert-manager deployment health
kubectl get deployments -n cert-manager
# Monitor cert-manager resource usage
kubectl top pods -n cert-manager
# Check for cert-manager errors
kubectl logs -n cert-manager deployment/cert-manager --tail=100 | grep -i error
Monthly renewal testing in staging:
# Switch to staging
kubectl apply -f obsidian-ingress-tls-staging.yaml
kubectl apply -f couchdb-ingress-tls-staging.yaml
# Force certificate renewal
kubectl delete certificate obsidian-tls couchdb-tls -n obsidian
kubectl delete secret obsidian-tls couchdb-tls -n obsidian
# Wait for renewal
kubectl wait --for=condition=Ready certificate/obsidian-tls -n obsidian --timeout=300s
kubectl wait --for=condition=Ready certificate/couchdb-tls -n obsidian --timeout=300s
# Validate certificates
./validate-ssl-certificates.sh
# Switch back to production if needed
kubectl apply -f obsidian-ingress-tls.yaml
kubectl apply -f couchdb-ingress-tls.yaml
Before updating cert-manager:
- Backup current certificates:
  kubectl get certificates -n obsidian -o yaml > certificates-backup.yaml
  kubectl get secrets -n obsidian --field-selector type=kubernetes.io/tls -o yaml > secrets-backup.yaml
- Test in staging environment first
- Monitor certificate renewal after update
Verify ClusterIssuer configuration:
# Check ClusterIssuer status
kubectl get clusterissuer -o wide
# Verify ACME account registration
kubectl describe clusterissuer letsencrypt-prod
kubectl describe clusterissuer letsencrypt-staging
Update ClusterIssuer email if needed:
# Edit ClusterIssuer
kubectl edit clusterissuer letsencrypt-prod
Set up alerts for certificates expiring within 30 days:
# Example Prometheus alert rule
groups:
  - name: certificate-expiry
    rules:
      - alert: CertificateExpiringSoon
        expr: (cert_manager_certificate_expiration_timestamp_seconds - time()) / 86400 < 30
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "Certificate {{ $labels.name }} expires soon"
          description: "Certificate {{ $labels.name }} in namespace {{ $labels.namespace }} expires in {{ $value }} days"
Monitor for failed certificate requests:
# Example Prometheus alert rule
groups:
  - name: certificate-failures
    rules:
      - alert: CertificateRequestFailed
        expr: cert_manager_certificate_ready_status{condition="False"} == 1
        for: 15m
        labels:
          severity: critical
        annotations:
          summary: "Certificate request failed"
          description: "Certificate {{ $labels.name }} in namespace {{ $labels.namespace }} failed to be issued"
- Always test certificate changes in staging first
- Monitor certificate expiration dates regularly
- Keep cert-manager updated to the latest stable version
- Use separate ClusterIssuers for staging and production
- Implement automated monitoring and alerting
- Document any custom certificate configurations
- Regularly backup certificate configurations
- Test disaster recovery procedures
For detailed information about SSL certificate validation, including the validation script usage and troubleshooting, see README-ssl-validation.md.
The validation script provides:
- HTTPS connectivity testing
- Certificate validation and expiration checking
- Kubernetes secret verification
- cert-manager integration checks
- Detailed error reporting and troubleshooting guidance