Why?

I’ve been using ingress-nginx for years now, and while I haven’t had any complaints, it is soon to be replaced by ingate. I’ll probably give that a try when it’s released, but until then I needed an alternative gateway controller.

Envoy Gateway is in the process of being adopted at my workplace, so I wanted to get familiar with it and have an environment for testing as well.

Better now than later to cut over to the new Kubernetes Gateway API, I suppose.

Creating a gateway

The documentation for installing Envoy Gateway is pretty straightforward and can be found here. You don't need to install the Gateway API CRDs separately; they ship with the manifest in the documentation.
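
For reference, the quickstart installs the controller with its Helm chart, roughly like the following (the version is only an example; check the docs for the current release):

# install Envoy Gateway (version is an example; see the docs for the current release)
helm install eg oci://docker.io/envoyproxy/gateway-helm \
  --version v1.3.0 -n envoy-gateway-system --create-namespace

# wait for the controller to come up and confirm the Gateway API CRDs exist
kubectl wait --timeout=5m -n envoy-gateway-system deployment/envoy-gateway --for=condition=Available
kubectl get crd | grep gateway.networking.k8s.io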

Assuming you have a working gateway controller deployed, we need to create a gateway class and gateway.

This GatewayClass definition needs to target the controller deployed in the previous step. The default controller name is gateway.envoyproxy.io/gatewayclass-controller.

apiVersion: gateway.networking.k8s.io/v1beta1
kind: GatewayClass
metadata:
  name: envoy-gateway-class
spec:
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
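
After applying it, the class should be accepted by the controller:

kubectl get gatewayclass envoy-gateway-class
# the ACCEPTED column should read True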

A simple gateway definition, which will create a deployment and service to act as an ingress gateway, would look like the following:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: homelab
  namespace: envoy-gateway-system
spec:
  gatewayClassName: envoy-gateway-class
  listeners: # at least one listener is required for the Gateway to be accepted
    - name: http
      protocol: HTTP
      port: 80

NOTE: The gatewayClassName field in the spec needs to match the name of the gateway class created above.

For my self-hosted services at home, I use a *.int.kyledev.co wildcard domain that is only exposed to my Tailscale tailnet. This lets me access them while away from my home network without exposing them to the public internet. So, I need to expose the gateway to my tailnet.

With the tailscale operator installed, we can annotate the gateway's Envoy service with:

tailscale.com/expose: "true"
tailscale.com/hostname: "homelab-gateway" # optional - sets the machine name in the tailnet

The service also needs a loadBalancerClass of type tailscale to expose it to the tailnet.

We can create an EnvoyProxy config for this gateway to expose it to the tailnet. This is essentially a template that any future gateway definitions can use.

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: tailscale-proxy
  namespace: envoy-gateway-system # must live in the same namespace as the gateway that references it
spec:
  provider:
    type: "Kubernetes"
    kubernetes:
      envoyService:
        annotations:
          tailscale.com/expose: "true"
          tailscale.com/hostname: "homelab-gateway"
        loadBalancerClass: "tailscale"
        name: homelab-gateway

With the EnvoyProxy config created, we can configure the gateway to point to it:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: homelab
  namespace: envoy-gateway-system
spec:
  gatewayClassName: envoy-gateway-class
  infrastructure:
    parametersRef:
      group: gateway.envoyproxy.io
      kind: EnvoyProxy
      name: tailscale-proxy
  listeners: # minimal listener for now; expanded in the Traffic Configuration section below
    - name: http
      protocol: HTTP
      port: 80

If we apply those gateway resources, we should see something similar to:

$ k get gateway -n envoy-gateway-system
NAME      CLASS                 ADDRESS        PROGRAMMED   AGE
homelab   envoy-gateway-class   100.79.90.18   True         4d
$ k get svc -n envoy-gateway-system 
NAME              TYPE           CLUSTER-IP      EXTERNAL-IP                                     PORT(S)                                   AGE
envoy-gateway     ClusterIP      10.43.243.74    <none>                                          18000/TCP,18001/TCP,18002/TCP,19001/TCP   4d
homelab-gateway   LoadBalancer   10.43.216.210   100.79.90.18,homelab-gateway.tail18ac2.ts.net   80:31685/TCP,443:31965/TCP                3d1h
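
On any device in the tailnet, the gateway should also show up as a machine (the hostname comes from the tailscale.com/hostname annotation):

tailscale status | grep homelab-gateway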

Configuring the gateway

In my cluster, I have the following setup:

  • A *.kyledev.co wildcard that is exposed to the public internet and served by a cloudflared tunnel to my cluster. (I also have a VPS running Plex that is publicly accessible.)
  • As mentioned previously, *.int.kyledev.co, which resolves via Pi-hole running on a machine connected to my tailnet.

On the pihole server, I create a dnsmasq config at /etc/dnsmasq.d/99-tsnet.conf, where 100.79.90.18 is the tailnet IP of my Envoy proxy's Tailscale machine:

address=/int.kyledev.co/100.79.90.18

This will point anything that matches the *.int.kyledev.co wildcard to the tailnet IP of the gateway service exposed by the tailscale operator.
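
After adding the file, reload Pi-hole's resolver so the new dnsmasq config is picked up:

pihole restartdns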

I then override DNS for the machines on the tailnet to point to my pihole server. As a bonus, I also get tailnet-wide benefits of using pihole as a DNS server.
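
A quick check from any machine on the tailnet (once its DNS points at the pihole) confirms the wildcard resolves to the gateway:

dig +short blog.int.kyledev.co
# 100.79.90.18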

Next, I had 2 scenarios I wanted to cover:

1. Automatic certificates for my domains
2. Allow HTTP + HTTPS traffic to *.kyledev.co and *.int.kyledev.co

Automatic certificates

Luckily, cert-manager already supports the Gateway API, and I had it installed for certificate management. I have version 1.17.2 deployed as of this writing.

To enable Gateway API support in cert-manager, we need to add this flag to the container args in the controller's deployment:

--enable-gateway-api
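
If you deploy cert-manager with Helm instead of raw manifests, the same flag can be passed through the chart's extraArgs value; a sketch, assuming the upstream chart's layout:

# values.yaml for the cert-manager Helm chart
extraArgs:
  - --enable-gateway-api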

My full deployment looks like this:

deployment.yaml
# Source: cert-manager/templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cert-manager
  namespace: cert-manager
  labels:
    app: cert-manager
    app.kubernetes.io/name: cert-manager
    app.kubernetes.io/instance: cert-manager
    app.kubernetes.io/component: "controller"
    app.kubernetes.io/version: "v1.17.2"
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: cert-manager
      app.kubernetes.io/instance: cert-manager
      app.kubernetes.io/component: "controller"
  template:
    metadata:
      labels:
        app: cert-manager
        app.kubernetes.io/name: cert-manager
        app.kubernetes.io/instance: cert-manager
        app.kubernetes.io/component: "controller"
        app.kubernetes.io/version: "v1.17.2"
      annotations:
        prometheus.io/path: "/metrics"
        prometheus.io/scrape: 'true'
        prometheus.io/port: '9402'
    spec:
      serviceAccountName: cert-manager
      enableServiceLinks: false
      securityContext:
        runAsNonRoot: true
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: cert-manager-controller
          image: "quay.io/jetstack/cert-manager-controller:v1.17.2"
          imagePullPolicy: IfNotPresent
          args:
          - --v=2
          - --cluster-resource-namespace=$(POD_NAMESPACE)
          - --leader-election-namespace=kube-system
          - --acme-http01-solver-image=quay.io/jetstack/cert-manager-acmesolver:v1.17.2
          - --max-concurrent-challenges=60
          - --enable-gateway-api # new flag
          ports:
          - containerPort: 9402
            name: http-metrics
            protocol: TCP
          - containerPort: 9403
            name: http-healthz
            protocol: TCP
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
              - ALL
            readOnlyRootFilesystem: true
          env:
          - name: POD_NAMESPACE
            valueFrom:
              fieldRef:
                fieldPath: metadata.namespace
          # LivenessProbe settings are based on those used for the Kubernetes
          # controller-manager. See:
          # https://github.com/kubernetes/kubernetes/blob/806b30170c61a38fedd54cc9ede4cd6275a1ad3b/cmd/kubeadm/app/util/staticpod/utils.go#L241-L245
          livenessProbe:
            httpGet:
              port: http-healthz
              path: /livez
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 10
            timeoutSeconds: 15
            successThreshold: 1
            failureThreshold: 8
      nodeSelector:
        kubernetes.io/os: linux

Then you can annotate the gateway with cert-manager.io/cluster-issuer: <your-issuer-name> to automatically provision certificates with the issuer of your choice.
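
For reference, the annotated gateway would look something like this; a sketch reusing the letsencrypt-prod ClusterIssuer referenced below:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: homelab
  namespace: envoy-gateway-system
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  gatewayClassName: envoy-gateway-class
  # listeners omitted; cert-manager creates Certificates for the secrets named in
  # each HTTPS listener's certificateRefs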

In my case, I use multiple DNS names, so I created a Certificate for the wildcard domains manually, as this isn't supported via annotations.

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: kyledev-wildcard-tls
  namespace: envoy-gateway-system
spec:
  secretName: kyledev-tls
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  commonName: "*.kyledev.co"
  dnsNames:
    - "*.kyledev.co"
    - "*.int.kyledev.co"

Traffic Configuration

Each gateway can be configured with an array of listeners, which allows fine-grained control over the traffic that is routed to the gateway.

In my setup, I needed to create listeners for the two scenarios I outlined above:

1. `*.kyledev.co`
2. `*.int.kyledev.co`

In both of these scenarios, I wanted to be able to create HTTPRoutes from any namespace, and let the gateway handle terminating TLS.

If you’re using cert-manager (or any other automation), the certificateRefs should reference the secrets that cert-manager will create. Otherwise, you need to create the secrets manually beforehand.
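
If you do go the manual route, a TLS secret would be created along these lines (the cert and key paths are placeholders):

kubectl create secret tls kyledev-tls \
  --cert=path/to/wildcard.crt \
  --key=path/to/wildcard.key \
  -n envoy-gateway-system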

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: homelab
  namespace: envoy-gateway-system
spec:
  gatewayClassName: envoy-gateway-class
  infrastructure:
    parametersRef:
      group: gateway.envoyproxy.io
      kind: EnvoyProxy
      name: tailscale-proxy
  listeners:
    - name: kyledev-http
      protocol: HTTP
      port: 80
      hostname: "*.kyledev.co"
      allowedRoutes:
        namespaces:
          from: All
    - name: kyledev-https
      protocol: HTTPS
      port: 443
      hostname: "*.kyledev.co"
      allowedRoutes:
        namespaces:
          from: All
      tls:
        mode: Terminate
        certificateRefs:
          - name: kyledev-tls
            kind: Secret
            group: ""

Testing it out

To actually route traffic to a service, we need to create an HTTPRoute resource.

This is pretty straightforward; below is an example HTTPRoute using my newly created gateway.

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: blog-public
  namespace: blog
spec:
  parentRefs:
    - name: homelab # name of the gateway
      namespace: envoy-gateway-system # namespace the gateway lives in
  hostnames:
    - "blog.kyledev.co"
  rules:
    - backendRefs:
        - name: blog
          port: 80

This blog is now exposed from my cluster using the Gateway API. You can see the exact HTTPRoute resources for:

  • dev, used internally in my tailnet for testing purposes, at blog.int.kyledev.co
  • prod, for the “battle-tested” changes, at blog.kyledev.co, which is what you’re reading right now
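
For illustration, the internal (dev) route is just another HTTPRoute attached to the same gateway; a sketch, where blog-internal and blog-dev are hypothetical names for the route and the dev Service:

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: blog-internal # hypothetical name
  namespace: blog
spec:
  parentRefs:
    - name: homelab
      namespace: envoy-gateway-system
  hostnames:
    - "blog.int.kyledev.co"
  rules:
    - backendRefs:
        - name: blog-dev # hypothetical Service name for the dev instance
          port: 80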

For some traffic I expose to the public internet, I use cloudflared to create a tunnel to my cluster. I did a previous post on that setup here. My tunnel configuration for the public instance of my blog is here. The tl;dr is to point the hostname at the Envoy proxy service created for the gateway (homelab-gateway in my setup).
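
For reference, a cloudflared ingress rule pointing at that proxy service might look something like this (a sketch; the tunnel and credentials sections are omitted, and the service DNS name assumes the homelab-gateway Service shown earlier):

# cloudflared config.yaml (sketch)
ingress:
  - hostname: blog.kyledev.co
    service: http://homelab-gateway.envoy-gateway-system.svc.cluster.local:80
  # catch-all rule required by cloudflared
  - service: http_status:404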

Once the HTTPRoute resources were created, I was able to resolve my blog at blog.kyledev.co and blog.int.kyledev.co.
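
A couple of quick checks from a machine on the tailnet confirm both hostnames respond:

curl -I https://blog.int.kyledev.co   # via the tailnet and pihole DNS
curl -I https://blog.kyledev.co       # via the cloudflared tunnel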