fix: registry ingress + woodpecker pulls + registry dns overrides

Ashwin Kumar Sivakumar 2026-04-17 05:25:04 +05:30
parent 39e69a374a
commit 75acea11eb
16 changed files with 194 additions and 16 deletions

@@ -6,4 +6,4 @@ patchesStrategicMerge:
   - replicas-patch.yaml
 images:
   - name: registry.nxtgauge.com/nxtgauge-admin-solid
-    newTag: e044d4c
+    newTag: high-performance-latest

@@ -9,11 +9,11 @@ patches:
       name: nxtgauge-rust-gateway
 images:
   - name: registry.nxtgauge.com/nxtgauge-rust-gateway
-    newTag: d084491
+    newTag: high-performance-latest
   - name: registry.nxtgauge.com/nxtgauge-rust-users
-    newTag: 9444056
+    newTag: high-performance-latest
   - name: registry.nxtgauge.com/nxtgauge-frontend-solid
-    newTag: 152f918
+    newTag: high-performance-latest
   - name: registry.nxtgauge.com/nxtgauge-rust-companies
     newTag: high-performance-latest
   - name: registry.nxtgauge.com/nxtgauge-rust-job-seekers

@@ -6,4 +6,4 @@ patchesStrategicMerge:
   - replicas-patch.yaml
 images:
   - name: registry.nxtgauge.com/nxtgauge-frontend-solid
-    newTag: d26f0bf
+    newTag: high-performance-latest

@@ -0,0 +1,21 @@
+apiVersion: argoproj.io/v1alpha1
+kind: Application
+metadata:
+  name: coredns-nodehosts
+  namespace: argocd
+spec:
+  destination:
+    namespace: kube-system
+    server: https://kubernetes.default.svc
+  project: default
+  source:
+    path: ops/coredns-nodehosts
+    repoURL: https://github.com/Traceworks2023/nxtgauge-gitops.git
+    targetRevision: main
+  syncPolicy:
+    automated:
+      prune: true
+      selfHeal: true
+    syncOptions:
+      - CreateNamespace=true

@@ -0,0 +1,21 @@
+apiVersion: argoproj.io/v1alpha1
+kind: Application
+metadata:
+  name: registry-ingress
+  namespace: argocd
+spec:
+  destination:
+    namespace: registry
+    server: https://kubernetes.default.svc
+  project: default
+  source:
+    path: ops/registry-ingress
+    repoURL: https://github.com/Traceworks2023/nxtgauge-gitops.git
+    targetRevision: main
+  syncPolicy:
+    automated:
+      prune: true
+      selfHeal: true
+    syncOptions:
+      - CreateNamespace=true

@@ -0,0 +1,21 @@
+apiVersion: argoproj.io/v1alpha1
+kind: Application
+metadata:
+  name: woodpecker-registry-pull
+  namespace: argocd
+spec:
+  destination:
+    namespace: woodpecker
+    server: https://kubernetes.default.svc
+  project: default
+  source:
+    path: ops/woodpecker-registry-pull
+    repoURL: https://github.com/Traceworks2023/nxtgauge-gitops.git
+    targetRevision: main
+  syncPolicy:
+    automated:
+      prune: true
+      selfHeal: true
+    syncOptions:
+      - CreateNamespace=true

@@ -0,0 +1,6 @@
+apiVersion: kustomize.config.k8s.io/v1beta1
+kind: Kustomization
+resources:
+  - patch-coredns-nodehosts.yaml

@@ -0,0 +1,12 @@
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: coredns
+  namespace: kube-system
+data:
+  NodeHosts: |
+    10.0.0.2 nxtgauge-1
+    10.0.0.3 nxtgauge-2
+    10.0.0.5 nxtgauge-3
+    10.0.0.2 registry.nxtgauge.com
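
One way to confirm the override landed is to read `NodeHosts` back from the live ConfigMap and look up the registry name. The sketch below assumes the ConfigMap name and key shown in this hunk; `lookup_nodehost` is a hypothetical helper, not something this commit adds. k3s also writes node entries into this key, so it may be worth repeating the check after a node joins or restarts.

```bash
#!/usr/bin/env bash
# lookup_nodehost HOST
# Reads hosts-format text on stdin and prints the first IP mapped to HOST.
# Hypothetical helper for illustration; not part of this repository.
lookup_nodehost() {
  awk -v host="$1" '
    /^[[:space:]]*#/ { next }            # skip comment lines
    {
      for (i = 2; i <= NF; i++)          # fields 2..NF are hostnames
        if ($i == host) { print $1; exit }
    }
  '
}

# Usage against a live cluster (requires kubectl access):
#   kubectl -n kube-system get configmap coredns -o jsonpath='{.data.NodeHosts}' \
#     | lookup_nodehost registry.nxtgauge.com
```

An empty result would suggest the entry is missing or was overwritten.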

@@ -1,17 +1,25 @@
-# k3s Local Registry Node Configuration
+# k3s Registry Node Configuration
-This repo now uses `registry.nxtgauge.internal:5000` for backend images.
+This repo uses `registry.nxtgauge.com` for backend images.
 ## Why
 Image pulls happen on k3s nodes via containerd, not inside cluster DNS context.
 Using `*.svc.cluster.local` for image pulls can fail with DNS lookup errors from node runtime.
 ## Required node config
-Each node must have `/etc/rancher/k3s/registries.yaml` configured to trust and use the registry.
+Each node must have `/etc/rancher/k3s/registries.yaml` configured with auth for the registry.
 Template file:
 - `ops/k3s/registries.yaml`
+## Recommended node DNS/hosts override (prevents Cloudflare/proxy path)
+Even if `registry.nxtgauge.com` is set to "DNS only" in Cloudflare, k3s nodes can still end up resolving to public/IPv6 records depending on upstream DNS/caches.
+For reliable large image pulls/pushes (avoids `413 Payload Too Large` from proxies), point nodes directly at the in-cluster ingress VIP:
+- Traefik VIPs: `10.0.0.2`, `10.0.0.3`, `10.0.0.5`
+- Recommended: pick one stable VIP (example `10.0.0.2`) and map `registry.nxtgauge.com` to it on every node.
 ## Apply to all nodes
 1. Export required env vars:
@@ -20,6 +28,7 @@ Template file:
 export K3S_NODES="node1 node2 node3"
 export REGISTRY_USERNAME="<registry-user>"
 export REGISTRY_PASSWORD="<registry-pass>"
+export REGISTRY_VIP_IP="10.0.0.2" # optional but recommended
 ```
 2. Apply config and restart k3s on each node:
@@ -48,8 +57,8 @@ kubectl -n nxtgauge describe pod <failing-pod>
 ```
 ## Notes
-- Ensure DNS for `registry.nxtgauge.internal` resolves from every k3s node.
-- If DNS is not available, use a stable node-reachable IP:port and update:
+- Ensure DNS for `registry.nxtgauge.com` resolves from every k3s node.
+- If DNS is not available, use a stable node-reachable IP and update:
   - backend GitOps manifests
   - backend Woodpecker registry push target
   - `ops/k3s/registries.yaml`

@@ -5,6 +5,7 @@ set -euo pipefail
 # export K3S_NODES="node1 node2 node3"
 # export REGISTRY_USERNAME="..."
 # export REGISTRY_PASSWORD="..."
+# export REGISTRY_VIP_IP="10.0.0.2" # optional (recommended)
 # ./ops/k3s/apply-registries.sh
 if [[ -z "${K3S_NODES:-}" ]]; then
@@ -27,7 +28,14 @@ sed \
 for node in ${K3S_NODES}; do
   echo "Applying registry config on ${node}"
   scp "$TMP_FILE" "${node}:/tmp/registries.yaml"
-  ssh "$node" "sudo mkdir -p /etc/rancher/k3s && sudo mv /tmp/registries.yaml /etc/rancher/k3s/registries.yaml && sudo systemctl restart k3s || sudo systemctl restart k3s-agent"
+  ssh "$node" "sudo mkdir -p /etc/rancher/k3s && sudo mv /tmp/registries.yaml /etc/rancher/k3s/registries.yaml"
+  if [[ -n "${REGISTRY_VIP_IP:-}" ]]; then
+    echo "Ensuring /etc/hosts contains registry.nxtgauge.com -> ${REGISTRY_VIP_IP} on ${node}"
+    ssh "$node" "sudo sh -lc 'grep -q \"\\sregistry\\.nxtgauge\\.com\\b\" /etc/hosts && sed -i \"s/^.*\\sregistry\\.nxtgauge\\.com\\b.*/${REGISTRY_VIP_IP} registry.nxtgauge.com/\" /etc/hosts || echo \"${REGISTRY_VIP_IP} registry.nxtgauge.com\" >> /etc/hosts'"
+  fi
+  ssh "$node" "sudo systemctl restart k3s || sudo systemctl restart k3s-agent"
   echo "Waiting for ${node} to recover..."
   sleep 8
 done
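
The `/etc/hosts` one-liner in this hunk depends on `\s` and `\b`, which are GNU extensions to `grep` and `sed`. Its `A && B || C` chain also appends a second entry if `grep` matches but `sed` fails. A possible alternative is sketched below: a function that rewrites a hosts file so exactly one line maps the name. `upsert_hosts_entry` is a hypothetical name, and this is not what the commit does. Like the original `sed`, it drops the whole matching line, so any aliases on that line are lost.

```bash
#!/usr/bin/env bash
# upsert_hosts_entry FILE IP HOST
# Rewrites FILE so that exactly one line maps HOST to IP.
# Hypothetical helper for illustration; not part of apply-registries.sh.
upsert_hosts_entry() {
  local file="$1" ip="$2" host="$3" tmp
  tmp="$(mktemp)" || return 1
  # Copy every line except non-comment lines that list HOST as a name.
  awk -v host="$host" '
    /^[[:space:]]*#/ { print; next }
    {
      keep = 1
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^#/) break               # rest of the line is a comment
        if ($i == host) { keep = 0; break }
      }
      if (keep) print
    }
  ' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
  printf '%s %s\n' "$ip" "$host" >> "$tmp"
  # Write through the existing file so its owner and mode are preserved.
  cat "$tmp" > "$file" || { rm -f "$tmp"; return 1; }
  rm -f "$tmp"
}
```

Run as root on a node, `upsert_hosts_entry /etc/hosts "$REGISTRY_VIP_IP" registry.nxtgauge.com` should be idempotent: repeated runs leave a single entry.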

@@ -1,12 +1,10 @@
 mirrors:
-  "registry.nxtgauge.internal:5000":
+  "registry.nxtgauge.com":
     endpoint:
-      - "http://registry.nxtgauge.internal:5000"
+      - "https://registry.nxtgauge.com"
 configs:
-  "registry.nxtgauge.internal:5000":
-    tls:
-      insecure_skip_verify: true
+  "registry.nxtgauge.com":
     auth:
       username: "${REGISTRY_USERNAME}"
       password: "${REGISTRY_PASSWORD}"
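
The `${REGISTRY_USERNAME}` and `${REGISTRY_PASSWORD}` placeholders are filled in before the file is copied to each node. The `sed` invocation that does this is elided from the script hunk, so the following is only a sketch of one way to substitute them in pure bash, where characters such as `&` or `/` in a password need no escaping. `render_registries` is a hypothetical name. It does not YAML-escape the values, so a password containing `"` would still need separate handling.

```bash
#!/usr/bin/env bash
# render_registries
# Reads the registries.yaml template on stdin and writes it to stdout with
# the two placeholders replaced by the values of the shell variables
# REGISTRY_USERNAME and REGISTRY_PASSWORD.
# Hypothetical helper for illustration; not the script's actual method.
render_registries() {
  local user_ph='${REGISTRY_USERNAME}' pass_ph='${REGISTRY_PASSWORD}' line
  while IFS= read -r line || [[ -n "$line" ]]; do
    # Quoting the pattern makes it a literal match; quoting the replacement
    # keeps "&" literal on bash versions that enable patsub_replacement.
    line=${line//"$user_ph"/"${REGISTRY_USERNAME:-}"}
    line=${line//"$pass_ph"/"${REGISTRY_PASSWORD:-}"}
    printf '%s\n' "$line"
  done
}
```

For example, `render_registries < ops/k3s/registries.yaml > "$TMP_FILE"` would produce the file that the loop then copies with `scp`.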

@@ -0,0 +1,6 @@
+apiVersion: kustomize.config.k8s.io/v1beta1
+kind: Kustomization
+resources:
+  - registry-ingress.yaml

@@ -0,0 +1,27 @@
+apiVersion: networking.k8s.io/v1
+kind: Ingress
+metadata:
+  name: docker-registry
+  namespace: registry
+  annotations:
+    cert-manager.io/cluster-issuer: letsencrypt-prod
+    traefik.ingress.kubernetes.io/router.entrypoints: web,websecure
+    traefik.ingress.kubernetes.io/router.priority: "100"
+spec:
+  ingressClassName: traefik
+  tls:
+    - hosts:
+        - registry.nxtgauge.com
+      secretName: registry-tls
+  rules:
+    - host: registry.nxtgauge.com
+      http:
+        paths:
+          - path: /
+            pathType: Prefix
+            backend:
+              service:
+                name: docker-registry
+                port:
+                  number: 5000
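
To check this ingress from a node without relying on DNS, `curl --resolve` can pin `registry.nxtgauge.com` to a VIP and request the registry's `/v2/` endpoint. A registry that requires Basic auth normally answers `401` when no credentials are sent, so `401` is a healthy result here. The sketch below only interprets the status code. `registry_status_hint` is a hypothetical helper, and `10.0.0.2` is the example VIP from the README change in this commit.

```bash
#!/usr/bin/env bash
# registry_status_hint CODE
# Maps the HTTP status returned by GET /v2/ to a short diagnosis.
# Hypothetical helper for illustration; not part of this repository.
registry_status_hint() {
  case "$1" in
    200) echo "ok: registry API reachable" ;;
    401) echo "ok: registry API reachable, credentials required" ;;
    404) echo "check: no matching route or backend" ;;
    000) echo "fail: no HTTP response (DNS, TLS, or connection error)" ;;
    *)   echo "check: unexpected status $1" ;;
  esac
}

# Probe (needs network access to the VIP):
#   code="$(curl -sS -o /dev/null -w '%{http_code}' \
#     --resolve registry.nxtgauge.com:443:10.0.0.2 \
#     https://registry.nxtgauge.com/v2/)"
#   registry_status_hint "$code"
```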

@@ -0,0 +1,35 @@
+# Woodpecker: allow pulling from private registry
+Woodpecker pipelines run as Kubernetes pods in the `woodpecker` namespace. If pipeline step images use `registry.nxtgauge.com/...` (private, Basic auth), kubelet needs an `imagePullSecret`.
+This is required for base images (example `registry.nxtgauge.com/rust:alpine`) and also for any mirrored plugin images (example `registry.nxtgauge.com/kaniko:2.1.1`).
+## Required secret
+Create this once:
+```bash
+kubectl -n woodpecker create secret docker-registry registry-nxtgauge-pull \
+  --docker-server=registry.nxtgauge.com \
+  --docker-username="<REGISTRY_USERNAME>" \
+  --docker-password="<REGISTRY_PASSWORD>" \
+  --docker-email="ci@nxtgauge.com"
+```
+## Mirroring common plugin images (optional)
+If your pipelines reference plugin images from the internal registry (example `registry.nxtgauge.com/kaniko:2.1.1`) make sure those images exist in the registry.
+Example mirror from Docker Hub to internal:
+```bash
+docker pull woodpeckerci/plugin-kaniko:2.1.1
+docker tag woodpeckerci/plugin-kaniko:2.1.1 registry.nxtgauge.com/kaniko:2.1.1
+docker push registry.nxtgauge.com/kaniko:2.1.1
+```
+## What this kustomize applies
+It patches/ensures the `default` ServiceAccount in `woodpecker` includes:
+- `imagePullSecrets: [registry-nxtgauge-pull]`

@@ -0,0 +1,6 @@
+apiVersion: kustomize.config.k8s.io/v1beta1
+kind: Kustomization
+resources:
+  - serviceaccount-default.yaml

@@ -0,0 +1,8 @@
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  name: default
+  namespace: woodpecker
+imagePullSecrets:
+  - name: registry-nxtgauge-pull
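
After Argo CD syncs this manifest, the `default` ServiceAccount should list `registry-nxtgauge-pull`. The Secret itself is created by hand (see the README added in this commit), so the reference can exist while the Secret does not. Below is a small sketch for checking the ServiceAccount side; `has_pull_secret` is a hypothetical helper, not part of this commit.

```bash
#!/usr/bin/env bash
# has_pull_secret NAME
# Reads whitespace-separated secret names on stdin and succeeds (exit 0)
# only if NAME is one of them.
# Hypothetical helper for illustration; not part of this repository.
has_pull_secret() {
  tr -s '[:space:]' '\n' | grep -qxF -- "$1"
}

# Usage against a live cluster (requires kubectl access):
#   kubectl -n woodpecker get serviceaccount default \
#     -o jsonpath='{.imagePullSecrets[*].name}' \
#     | has_pull_secret registry-nxtgauge-pull && echo "pull secret attached"
#   kubectl -n woodpecker get secret registry-nxtgauge-pull
```

The second command is there because pods would still fail to pull if the ServiceAccount names a Secret that was never created.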