TL;DR
Recently I had to look at horizontally scaling a traditional web app on Kubernetes. Here I will explain how I achieved it, what an ingress controller is, and why you would want to use one.
I assume you know what pods are, so I will quickly break down the Service and Ingress resources.
Service
A Service is a logical abstraction that provides a stable communication layer to pods. During normal operations pods get created, destroyed, scaled out, and so on.
A Service makes it easy to always reach the pods behind it, because the Service stays stable throughout the pod life cycle. An important property of a Service is its type, which determines how the Service exposes itself to the cluster or the internet. Some of the service types are:
- ClusterIP: The service is only exposed internally to the cluster on an internal cluster IP. An example would be deploying HashiCorp's Vault and exposing it only inside the cluster.
- NodePort: Exposes the service on each node (on AWS, each EC2 instance) on the specified port. This can be reachable from the internet; of course, that depends on your AWS security group / VPC rules.
- LoadBalancer: Supported on Amazon and Google Cloud, this creates your cloud provider's load balancer. On Amazon it creates an ELB that points to the service on your cluster.
- ExternalName: Creates a CNAME DNS record pointing to an external domain.
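To make this concrete, a minimal NodePort Service might look like the sketch below (the app name and ports are hypothetical):

apiVersion: v1
kind: Service
metadata:
  name: my-app            # hypothetical name
spec:
  type: NodePort          # change to ClusterIP, LoadBalancer, etc. as needed
  selector:
    app: my-app           # matches the labels on the pods
  ports:
  - port: 80              # port the service listens on
    targetPort: 8080      # port the pods listen on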
For more information about Services look at https://kubernetes.io/docs/concepts/services-networking/service/
Ingress
An Ingress is a collection of rules that allow inbound connections to reach the cluster's services.
You define a number of rules describing how traffic should reach a service.
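At its simplest, an ingress rule maps a host and path to a backend service. A minimal sketch with hypothetical names, using the extensions/v1beta1 API that was current at the time:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: my-ingress          # hypothetical name
spec:
  rules:
  - host: app.example.com   # requests for this virtual host...
    http:
      paths:
      - path: /             # ...on this path...
        backend:
          serviceName: my-app   # ...go to this service
          servicePort: 80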
Scenario
Imagine this scenario: you have a cluster running on Amazon with multiple applications deployed to it. Some are JVM microservices (Spring Boot) running inside embedded Tomcat, and to add to the mix, you have a couple of SPAs sitting in an Apache web server that serves static content.
All applications need TLS. Some of the APIs' endpoints have changed, but you still have to serve the old endpoint paths, so you need some sort of path rewrite. How do you expose everything to the internet? The obvious answer is to create a LoadBalancer service for each, but then multiple ELBs will be created, you have to deal with TLS termination at each ELB, you have to CNAME your applications'/APIs' domain names to the right ELBs, and in general you have very little control over the ELBs.
Enter Ingress Controllers. 👍
What is an ingress controller?
An Ingress Controller is a daemon, deployed as a Kubernetes Pod, that watches the apiserver's /ingresses endpoint for updates to the Ingress resource. Its job is to satisfy requests for Ingresses.
You deploy an ingress controller, create a LoadBalancer service for it, and it sits and monitors the Kubernetes API server's /ingresses endpoint, acting as a reverse proxy for the ingress rules it finds there.
You then deploy your application, expose its service as type NodePort, and create ingress rules for it. The ingress controller picks up the newly deployed service and proxies traffic to it from outside.
Following this setup, you end up with only one ELB on Amazon, and a central place (the ingress controller) to manage the traffic coming into your cluster.
Traefik is one implementation you can use as an ingress controller, but I have chosen the NGINX ingress controller instead, as it supports sticky sessions and NGINX is an extremely popular reverse proxy.
So let's get to the interesting part: coding!
Demo
I am going to set up a Kubernetes gossip-based cluster on AWS using kops, then create the NGINX ingress controller and reverse proxy to a sample app called echoheaders.
To setup a k8s cluster on AWS, follow the guide at https://github.com/shavo007/k8s-ingress
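For reference, creating a gossip-based cluster with kops looks roughly like this (the state bucket, zone, and node count below are placeholders; the linked guide has the exact steps):

# a .k8s.local name makes kops use gossip-based discovery (no Route53 needed)
export KOPS_STATE_STORE=s3://my-kops-state-bucket   # placeholder bucket
kops create cluster --name=demo.k8s.local --zones=ap-southeast-2a --node-count=2 --yes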
If you do not want to install kops and the other tools needed, I have built a simple docker image that you can use instead.
https://store.docker.com/community/images/shanelee007/alpine-kops
This includes:
- Kops
- Kubectl
- AWS CLI
- Terraform
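A possible way to run it is to mount your AWS credentials into the container and drop into a shell (the mount path and shell entrypoint are assumptions; check the image's docs):

docker run --rm -it -v ~/.aws:/root/.aws shanelee007/alpine-kops sh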
Once you have the cluster, the next thing to do is set up a default backend service for NGINX.
The default backend is the service NGINX falls back to if it cannot route a request successfully. The default backend needs to satisfy the following two requirements:
- serves a 404 page at /
- serves a 200 on /healthz
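The stock defaultbackend image satisfies these requirements; a sketch of the deployment and service follows (image tag and resource names follow the upstream examples of the time, so verify them against the deploy docs linked below):

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: default-http-backend
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: default-http-backend
    spec:
      containers:
      - name: default-http-backend
        image: gcr.io/google_containers/defaultbackend:1.4  # serves 404 at / and 200 at /healthz
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: default-http-backend
spec:
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: default-http-backend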
See more at https://github.com/kubernetes/ingress-nginx/tree/master/deploy
Run the mandatory commands and install without RBAC roles.
Then install the Layer 7 service on AWS, following https://github.com/kubernetes/ingress-nginx/tree/master/deploy#aws, or install the service defined in my repo:
kubectl apply -f ingress-nginx-svc.yaml
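The service in my repo is essentially a LoadBalancer service in front of the controller pods; roughly the following (the selector and named target ports are assumptions based on the upstream examples):

apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx
  namespace: ingress-nginx
spec:
  type: LoadBalancer      # tells Kubernetes to provision an ELB on AWS
  selector:
    app: ingress-nginx    # matches the nginx-ingress-controller pods
  ports:
  - name: http
    port: 80
    targetPort: http
  - name: https
    port: 443
    targetPort: https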
Running these commands creates a deployment with one replica of nginx-ingress-controller and a LoadBalancer service for it, which in turn creates an ELB for us on AWS. Let's confirm that by getting the service:
kubectl get services -n ingress-nginx -o wide | grep nginx
We can now test the default backend:
ELB=$(kubectl get svc ingress-nginx -n ingress-nginx -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
curl $ELB
You should see the following:
default backend - 404
All good so far.
This means everything is working correctly: the ELB forwarded traffic to our nginx-ingress-controller, and the nginx-ingress-controller passed it along to the default backend service that we deployed.
Deploy our application
Now run
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/images/echoheaders/echo-app.yaml
kubectl apply -f ingress.yaml
This creates a deployment and a service for the echoheaders app, which simply returns information about the HTTP request as its output.
If you look at the ingress resource, you will see annotations defined. For example,
ingress.kubernetes.io/ssl-redirect: "true"
will redirect HTTP to HTTPS.
To view all annotations check out https://github.com/kubernetes/ingress-nginx/blob/master/docs/annotations.md
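Putting those pieces together, the ingress resource looks roughly like this (a sketch reconstructed from the rule described below; the metadata name is an assumption, and the real file is ingress.yaml in the linked repo):

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: echoheaders       # assumed name
  annotations:
    kubernetes.io/ingress.class: "nginx"
    ingress.kubernetes.io/ssl-redirect: "true"   # redirect HTTP to HTTPS
spec:
  rules:
  - host: foo.bar.com
    http:
      paths:
      - path: /backend
        backend:
          serviceName: echoheaders
          servicePort: 80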
One ingress rule routes all requests for the virtual host foo.bar.com on path /backend to the echoheaders service. So let's test it out!
curl $ELB/backend -H 'Host: foo.bar.com'
You should get a 200 response back with the request headers and other info.
Sticky sessions
Now on to one of the main features NGINX provides: the nginx-ingress-controller can handle sticky sessions, because it bypasses the service level and routes traffic directly to the pods. More info can be found here:
https://github.com/kubernetes/ingress-nginx/tree/master/docs/examples/affinity/cookie
Update (17/10/2017): the examples have been removed from the repo! To find out more about the annotations related to sticky sessions, go to https://github.com/kubernetes/ingress-nginx/blob/master/docs/annotations.md#miscellaneous
To test it out, we first need to scale the echoheaders deployment to three pods:
kubectl scale --replicas=3 deployment/echoheaders
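You can verify that three pods are up by listing them with the same label kubetail uses later on:

kubectl get pods -l app=echoheaders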
Now let's create the sticky ingress:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: nginx-test-sticky
  annotations:
    kubernetes.io/ingress.class: "nginx"
    ingress.kubernetes.io/affinity: "cookie"
    ingress.kubernetes.io/session-cookie-name: "route"
    ingress.kubernetes.io/session-cookie-hash: "sha1"
spec:
  rules:
  - host: stickyingress.example.com
    http:
      paths:
      - backend:
          serviceName: echoheaders
          servicePort: 80
        path: /foo
What this does is instruct NGINX to use the nginx-sticky-module-ng module (https://bitbucket.org/nginx-goodies/nginx-sticky-module-ng) that is bundled with the controller to handle sticky sessions for us.
kubectl apply -f sticky-ingress.yaml
There is a very useful tool called kubetail that you can use to tail the logs of a pod and verify the sticky session behaviour. To install kubetail check out https://github.com/johanhaleby/kubetail
Now in one terminal window, we can tail the logs
kubetail -l app=echoheaders
and in another, send multiple requests to the virtual host stickyingress.example.com:
curl -D cookies.txt $ELB/foo -H 'Host: stickyingress.example.com'
while true; do sleep 1;curl -b cookies.txt $ELB/foo -H 'Host: stickyingress.example.com';done
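If you inspect cookies.txt from the first request, you should see a Set-Cookie header added by the sticky module, along the lines of the following (the hash value here is purely illustrative):

Set-Cookie: route=d1b8b90232d9d65e24b0f6f9a1f0c1f3a8e4b7c2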
As you can see, requests are sent to the same pod on every subsequent request. When that backend pod is removed, requests are re-routed to another upstream pod and NGINX creates a new cookie, since the previous hash has become invalid.
Proxy protocol
A lot of the time you need to pass a user's IP address / hostname through to your application; an example would be having the user's hostname in your application logs.
To enable passing this along, add the annotation below to the load balancer service:
service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: '*'
Update (17/10/2017): it looks like this is not needed anymore.
For more see https://github.com/kubernetes/ingress-nginx#source-ip-address
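For completeness, the NGINX side of proxy protocol was historically switched on via the controller's ConfigMap; a sketch (the ConfigMap name and key follow the ingress-nginx docs of the time; see the link above for the current approach):

apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-configuration
  namespace: ingress-nginx
data:
  use-proxy-protocol: "true"   # parse the proxy protocol header the ELB prepends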
To conclude, I have showcased a subset of ingress features above. Others include path rewriting, TLS termination, path-based routing, scaling, RBAC, auth, and Prometheus metrics. For more info, check out the resources below.
Resources
For more information visit:
Github project : https://github.com/shavo007/k8s-ingress
Kubernetes nginx ingress: https://github.com/kubernetes/ingress-nginx
External DNS: https://github.com/kubernetes-incubator/external-dns/blob/master/docs/tutorials/nginx-ingress.md
Kubernetes FAQ: https://github.com/hubt/kubernetes-faq/blob/master/README.md#kubernetes-faq
Alpine-kops: https://store.docker.com/community/images/shanelee007/alpine-kops