CI/CD Federated Graph with AWS

I have been following this tutorial and it is working great locally. GitHub - apollosolutions/federation-subscription-tools: A set of demonstration utilities to facilitate.

I would like to implement a similar Federated Graph into our current AWS architecture using the AWS CDK.

Are there any further instructions or advice on the best approach to deploying a production environment? We would like to start small and then scale over the next year. Each of the subgraphs will exist in its own GitHub repo. Is it best to use Docker containers? Is something like Kubernetes needed now or in the future? Thank you.

This sounds mostly like an infrastructure preference question, as you can accomplish CI/CD of Apollo servers in many different ways.

In the past, my company used AWS ECS for this, with AWS CDK, and AWS CodeBuild & CodePipeline. This was before we transitioned from GitHub to GitLab.

With GitLab, one of the main architectural features it has is the concept of actual deeply nested folders, called “groups”. This prevents us from needing to have a very flat project structure, like in GitHub. That one change kind of trickles down into everything else we do, and allows us very clean and specific targeting of resources. Instead of having a monorepo with a bunch of folders in it, we can just easily have a set of folders in GitLab that allow us to work on each project in relative isolation.

GitLab also has built-in CI/CD via a YAML file, which allows us to do pretty much anything we want. It has recently added some built-in Terraform features, which we don’t really use right now, but I’ll get to why. The Terraform support looks cool, though.

Since every project can have its own CI/CD workflow, and since folders exist in GitLab, you can really easily deploy individual projects, keeping job times down pretty significantly.

We looked at our options and decided to go with Kubernetes. Specifically, we use AWS EKS, which is basically regular Kubernetes but you don’t have to build and manage the underlying infrastructure of the cluster. Rather, you treat your cluster as more of an application that runs all of your other applications.

We don’t use AWS CDK or Terraform for our EKS clusters because EKS doesn’t have a ton of Infrastructure as Code support right now, and managing it by hand is pretty trivial, because almost everything is abstracted away. You can spin up a full production-ready cluster in about an hour if you know what you’re doing.

We deploy GitLab’s CI runners to an EKS cluster to run jobs for us, saving us cost of using the GitLab runner minutes. Using our own runners basically allows us to run as many jobs as we need to without any real issue.

The projects themselves are all backed by Kubernetes, so every project is pretty standardized. We use Helm to template our Kubernetes resources and make release installation/uninstallation/rollback very easy. On top of this, we use helmfile to alter individual Helm releases, so that production/staging/develop/local all have their own easy-to-configure variants.

The combination of helmfile -> helm -> kubernetes basically makes it so that we don’t need infrastructure as code for 80% of our applications.
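To make the layering idea concrete, here is a rough Python sketch of per-environment overrides: a shared set of chart values is deep-merged with a small override for each environment. This only illustrates the concept and is not how helmfile itself is implemented. The value names (`replicaCount`, `image.tag`, `ingress.enabled`) and the image repository are made up for the example.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict where `override` wins, merging nested dicts recursively."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical base values shared by every environment.
BASE_VALUES = {
    "replicaCount": 1,
    "image": {"repository": "example/subgraph", "tag": "latest"},
    "ingress": {"enabled": False},
}

# Hypothetical per-environment overrides; each one lists only what differs.
ENVIRONMENT_OVERRIDES = {
    "local": {},
    "staging": {"image": {"tag": "staging"}},
    "production": {
        "replicaCount": 3,
        "image": {"tag": "1.4.2"},
        "ingress": {"enabled": True},
    },
}


def values_for(environment: str) -> dict:
    """Compute the final values for one environment."""
    return deep_merge(BASE_VALUES, ENVIRONMENT_OVERRIDES[environment])


if __name__ == "__main__":
    for env in ENVIRONMENT_OVERRIDES:
        print(env, values_for(env))
```

The point of keeping overrides small is that local and production stay nearly identical by default, and every difference between them is visible in one place.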

When working locally, we use Tilt, which is basically CI/CD for local development. Tilt -> helmfile -> helm -> docker desktop (w/ kubernetes enabled). Since tilt just uses helmfile, we gain the advantage of making our applications pretty much a 1:1 mirror of what we’d have in prod, with the exception of ingress (port forwarding is fine for us 99% of the time).

Kubernetes takes a bit of getting used to, but tools like k9s and popeye can make it pretty easy to spot-check issues with your app.

We like the results overall quite a lot. The local development experience is much nicer and more in-line with production, and our build time is mostly the same, but since we don’t need to use AWS CDK to handle an ECS deployment, our deployment time has gone from ~10-15m to ~1m. That, and the code is easier to test locally, so overall development speed is way faster than it used to be.

Thank you for sharing. From what you have said, GitLab looks very exciting. I like that you can easily deploy individual projects. Is there any downside in defining infrastructure outside of AWS? Is there an overall view of the infrastructure similar to the CDK? What are the reasons you chose EKS over Fargate? How does Kubernetes work with other AWS services, e.g. EventBridge? Can you run some of the smaller services on Lambda without any issues?

Moving from GitHub to GitLab I would need to present major benefits.

Thank you Kevin.

Is there any downside in defining infrastructure outside of AWS?

Technically, infrastructure as code is always defined outside of AWS unless you’re hosting your repos via AWS’ git offering.

That said, Terraform can provision the same AWS resources that CDK can, and it’s more of an industry standard right now. The main difference is that CDK synthesizes CloudFormation templates, while Terraform calls the AWS APIs directly and tracks its own state. Both approaches will work fine, and both have limitations. For instance, CDK renders a template through multiple stages in its build lifecycle, which makes it very powerful, but when I used it, it was sometimes pretty hard to work with and kind of forced me to dig pretty deep into its inner workings.

I imagine Terraform has similar limitations, but they’re ultimately different ways of accomplishing the same thing.

Both of them create idioms so specific that you could effectively consider each one its own language, so if I were at the planning stage I would step back and evaluate them on their own merits.

Is there an overall view of the infrastructure similar to the CDK?

eksctl is a project that aims to make deploying EKS clusters as easy as kubectl/helm, and it largely creates CloudFormation stacks, but it’s still fairly early right now, and isn’t for declarative use at this time.

You could potentially create an EKS cluster via CDK, which would give you a CloudFormation stack, or via Terraform.

What are the reasons you chose EKS over Fargate?

Fargate is a serverless compute option that EKS supports, rather than an alternative to it. You just specify, in a Fargate profile, the Kubernetes namespaces whose pods you’d like to run on Fargate. Spin-up time is about 2 minutes, which is normal for Fargate but, in my opinion, pretty slow for Kubernetes elasticity. I would probably only use Fargate for batch jobs. Even then, an ARM Spot node is probably cheaper and would take about the same time to spin up, while successive pods on that node would start much faster.
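As a rough illustration of the namespace-based selection: EKS decides which pods go to Fargate through selectors on a Fargate profile, where each selector names a namespace and, optionally, labels the pod must carry. The Python sketch below is a simplified model of that matching (exact matches only), not the actual scheduler logic. The namespaces and labels are hypothetical.

```python
# Hypothetical selectors, in the spirit of an EKS Fargate profile.
SELECTORS = [
    {"namespace": "batch"},
    {"namespace": "default", "labels": {"compute": "fargate"}},
]


def pod_runs_on_fargate(namespace: str, labels: dict, selectors: list) -> bool:
    """True if any selector matches the pod's namespace and all required labels."""
    for selector in selectors:
        if selector["namespace"] != namespace:
            continue
        required = selector.get("labels", {})
        # A selector with no labels matches every pod in its namespace.
        if all(labels.get(key) == value for key, value in required.items()):
            return True
    return False


if __name__ == "__main__":
    print(pod_runs_on_fargate("batch", {"job": "nightly"}, SELECTORS))
    print(pod_runs_on_fargate("default", {"app": "gateway"}, SELECTORS))
```

Pods that match no selector simply land on regular nodes, which is why mixing Fargate for batch work with node groups for everything else is straightforward.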

Using cluster-autoscaler is probably a good idea to at least consider.

How does Kubernetes work with other AWS services, e.g. EventBridge?

You can attach resources via ingress, which in AWS creates AWS load balancers. With a service mesh like Istio, you can direct traffic into Kubernetes however you want, and have as many ingresses as you want. Load balancers are pretty expensive, so service meshes save a lot of cost and security headache here, because you can create “virtual gateways” that act as separate gateways and distinguish traffic generally based on the headers, all on a minimal number of load balancers.

Using cert-manager you can automatically create certs for your virtual gateways using Let’s Encrypt.
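To show what "distinguish based on the headers" means in practice, here is a minimal Python sketch of host-based routing: one entry point receives every request and picks an in-cluster backend from the Host header. This is only a model of the idea, not Istio configuration. The hostnames and service addresses are made up.

```python
from typing import Optional

# Hypothetical mapping of Host header -> in-cluster service address.
VIRTUAL_GATEWAYS = {
    "api.example.com": "gateway.graph.svc.cluster.local:4000",
    "admin.example.com": "admin.tools.svc.cluster.local:8080",
}


def route(headers: dict) -> Optional[str]:
    """Pick a backend from the Host header, ignoring case and any port suffix."""
    normalized = {name.lower(): value for name, value in headers.items()}
    host = normalized.get("host", "")
    hostname = host.split(":", 1)[0].lower()
    # Unknown or missing hosts get no backend (None).
    return VIRTUAL_GATEWAYS.get(hostname)


if __name__ == "__main__":
    print(route({"Host": "api.example.com"}))
    print(route({"Host": "unknown.example.com"}))
```

Because the decision is made after traffic has already passed through a load balancer, adding another hostname is a routing-table change rather than another load balancer.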

You can use an EgressGateway in istio to make all traffic go out through 1 route, but otherwise your pods will route based on which node they’re on, and travel out to the internet via normal VPC routing.

For other AWS offerings, if you give your pod internet access you can use an AWS SDK or REST API. If you need to trigger something in kubernetes based on an HTTP request, you can simply route traffic to your pod(s).
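For EventBridge specifically, a pod could publish events with the AWS SDK. The sketch below uses boto3's `put_events` call on the `events` client and assumes the pod has boto3 installed plus AWS credentials and network access. The source name `orders.subgraph`, the detail type `OrderCreated`, and the payload are hypothetical.

```python
import json


def build_event_entry(source: str, detail_type: str, detail: dict,
                      bus_name: str = "default") -> dict:
    """Build one entry in the shape the EventBridge PutEvents call expects."""
    return {
        "Source": source,
        "DetailType": detail_type,
        "Detail": json.dumps(detail),  # Detail must be a JSON string
        "EventBusName": bus_name,
    }


def publish(entry: dict) -> dict:
    """Send the entry to EventBridge. Needs boto3 and AWS credentials in the pod."""
    import boto3  # imported lazily so build_event_entry works without boto3

    client = boto3.client("events")
    return client.put_events(Entries=[entry])


if __name__ == "__main__":
    # Hypothetical event; printed rather than sent so this runs without AWS access.
    entry = build_event_entry("orders.subgraph", "OrderCreated", {"orderId": "123"})
    print(entry)
```

Going the other way, from EventBridge into the cluster, would follow the HTTP route described above: point the target at an endpoint that routes to your pod(s).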

Can you run some of the smaller services on Lambda without any issues?

Lambda is completely separate from kubernetes. If you wanted to use “lambda in kubernetes”, projects like OpenFaaS exist. In my experience, these projects are already quite good, and coming along nicely but are still pretty new, so if growing pains are something that your company can’t deal with, then I would choose something very mature like Lambda and wait for the Kubernetes FaaS ecosystem to mature.

Moving from GitHub to GitLab I would need to present major benefits

It’s free to use, and if you provide your own CI/CD runners you get a very solid basic package.

The CEO of GitLab has his own blog where he talks about GitLab, and I found his explanation of the pricing model very informative. That site is a bit hard to navigate sometimes, but for the most part there’s a lot of strategy just laid out to the public.

GitLab is actually built using GitLab (with the exception of a few pieces of core infra), which means that new features they want internally just kind of show up for their customers. They also have a public issue board for just about every service they offer, as GitLab is also a very solid project management tool (most of the more expensive tiers are mainly project management features).

They also have tons of docs, in my opinion some of the best in the industry.