We want to leverage ephemeral schema variants and ephemeral router deployments to support a stable E2E testing flow in our staging environment.
Internally, our sister deployment team is building out our ephemeral deployment architecture. PRs get their own isolated staging environment, and requests are routed to it via headers and the service mesh.
We want to hook into this architecture so that subgraph developers have the same experience even when they are making potentially conflicting changes to their schema. We have some questions about the implementation details.
Questions
- How do developers deploy an ephemeral router?
- The lowest-friction approach is deploying the router as a container within the subgraph’s K8s pod. It is unclear how the router would know which schema variant to run.
- Another option could be to have subgraph operators manually spin up a router deployment via CLI and pass in a “variant” flag.
- Any other ideas?
- How does the ephemeral router know which ephemeral schema variant to run at deploy time?
- When and how does the schema variant get created and published to Apollo studio?
- Guessing this happens at CI time via `rover supergraph compose` on the supergraph, specifying an override of the subgraph schema on the current branch?
- Ideally we would only create a variant if the subgraph operator actually intends to deploy the change to staging
- What mechanism do subgraph operators use to route to the ephemeral router and how do they discover it?
- The most straightforward approach, I’m guessing, would be to include the schema variant hash directly as a header, e.g. `X-GraphQL-Testing-Schema: staging-382ab1c`
- How does the routing layer dynamically update to make new ephemeral routers discoverable? Or what information do ephemeral routers expose to make dynamic routing possible?
- Presumably ephemeral routers use the same DNS CNAME
- Request parameters allow some component (LB, nginx, service mesh) to route to the specific router, but how do these components know about new ephemeral routers?
- How do we make the ephemeral schema variants available to FE who also want to test out the ephemeral subgraph change?
- How do ephemeral schema variants get garbage collected?
Our setup
- We have a federated supergraph composed of many subgraphs.
- Our “GraphQL Router” service wraps the Apollo Router binary in a Dockerfile along with some other containers all spun up into one K8s pod
- We are using Apollo Studio and Apollo’s Schema Registry
- We listen internally for subgraph deployment events in a separate service, do some processing on the schema and publish the final result to Apollo
@Serey_Morm did a talk on something similar to this at GraphQL Summit in 2023: https://www.youtube.com/watch?v=qy9r2FTj3yk
There’s a ton of context here, so I’ll try to answer each of these at a high level. I highly suggest watching my talk that Greg shared above, and I’m happy to chat more.
- How do developers deploy an ephemeral router?
We created a workflow for developers to deploy their own dedicated Router for their branch to an ephemeral deployment environment. We use Garden.io, which was an existing offering at our organization.
- How does the ephemeral router know which ephemeral schema variant to run at deploy time?
This is part of the triggering flow for #1: developers provide our platform with the “branch” or schema name they’re working on, and we start the router using the schema file instead of Uplink.
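As a rough illustration of starting a router from a schema file, here is a minimal sketch that builds the launch command for a given branch. The Apollo Router binary accepts `--supergraph` and `--config`; everything else here (the `/schemas` directory, the one-file-per-branch naming, and the `router_command` helper) is a hypothetical layout, not what we actually run.

```python
import shlex
from pathlib import PurePosixPath
from typing import List

# Hypothetical location where composed supergraph files are mounted,
# one file per branch.
SCHEMA_DIR = PurePosixPath("/schemas")


def router_command(branch: str, config_path: str = "router.yaml") -> List[str]:
    """Build the argv for a router that loads a local supergraph file."""
    # Replace slashes so the branch name is usable as a file name.
    safe_name = branch.replace("/", "-")
    schema_path = SCHEMA_DIR / f"{safe_name}.graphql"
    return ["router", "--supergraph", str(schema_path), "--config", config_path]


if __name__ == "__main__":
    print(shlex.join(router_command("feature/new-field")))
```

The deployment workflow would pass the developer-supplied branch name into something like this when it templates the container command.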
- When and how does the schema variant get created and published to Apollo studio?
We have a global integration service/pipeline that every subgraph interacts with. We compose the supergraph with all subgraphs that have a matching branch name, so that all the delta schemas are composed together. This step runs every time because it’s inexpensive. More importantly, we don’t publish these branches to Studio; instead, we store the composed supergraphs in a static file bucket.
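To make the branch-matching idea concrete, here is a small sketch that picks a schema file per subgraph and renders a config in the shape `rover supergraph compose --config` expects. It assumes that subgraphs without a matching branch fall back to a base branch (`main` here); that fallback, the data layout, and the function names are my illustration rather than a description of the real pipeline.

```python
from typing import Dict


def select_subgraphs(
    published: Dict[str, Dict[str, str]], branch: str, base: str = "main"
) -> Dict[str, str]:
    """Pick one schema file per subgraph.

    `published` maps subgraph name -> {branch name -> schema file path}.
    A subgraph that has the requested branch contributes that schema;
    otherwise its base-branch schema is used (assumed fallback).
    """
    selected = {}
    for name, by_branch in published.items():
        if branch in by_branch:
            selected[name] = by_branch[branch]
        elif base in by_branch:
            selected[name] = by_branch[base]
    return selected


def render_supergraph_config(
    selected: Dict[str, str], routing_urls: Dict[str, str]
) -> str:
    """Render a YAML config for `rover supergraph compose --config`."""
    lines = ["subgraphs:"]
    for name in sorted(selected):
        lines.append(f"  {name}:")
        lines.append(f"    routing_url: {routing_urls[name]}")
        lines.append("    schema:")
        lines.append(f"      file: {selected[name]}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    published = {
        "products": {
            "main": "./products/main.graphql",
            "feature-x": "./products/feature-x.graphql",
        },
        "reviews": {"main": "./reviews/main.graphql"},
    }
    urls = {
        "products": "http://products/graphql",
        "reviews": "http://reviews/graphql",
    }
    print(render_supergraph_config(select_subgraphs(published, "feature-x"), urls))
```

The composed output would then be written to the static file bucket under the branch name instead of being published to Studio.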
- What mechanism do subgraph operators use to route to the ephemeral router and how do they discover it?
We have a dedicated service to handle this via a custom header. For client developers, all they need to do is tack on the header to select the variant they want.
- How does the routing layer dynamically update to make new ephemeral routers discoverable? Or what information do ephemeral routers expose to make dynamic routing possible?
We have a dedicated service to handle this. What makes this discoverable is the branch name that I’ve described above.
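A minimal sketch of what the lookup inside such a routing service could look like is below. The header name is borrowed from the example in the original question, and the registry contents, URLs, and `resolve_upstream` function are hypothetical; how the registry gets populated when a new ephemeral router comes up is left out.

```python
from typing import Dict

# Header name taken from the example in the question; any custom header works.
VARIANT_HEADER = "x-graphql-testing-schema"


def resolve_upstream(
    headers: Dict[str, str], registry: Dict[str, str], default_url: str
) -> str:
    """Return the router URL that should serve this request.

    `registry` maps branch/variant name -> ephemeral router URL.
    Requests without the header go to the default (shared) router.
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    variant = lowered.get(VARIANT_HEADER)
    if variant is None:
        return default_url
    if variant not in registry:
        # Failing loudly avoids silently testing against the wrong schema.
        raise LookupError(f"no ephemeral router registered for {variant!r}")
    return registry[variant]


if __name__ == "__main__":
    registry = {"feature-x": "http://router-feature-x.staging.svc:4000"}
    default = "http://router.staging.svc:4000"
    print(resolve_upstream({"X-GraphQL-Testing-Schema": "feature-x"}, registry, default))
    print(resolve_upstream({}, registry, default))
```

Raising on an unknown variant is one design choice; falling back to the default router is the other obvious option, at the cost of hiding typos in the header value.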
- How do we make the ephemeral schema variants available to FE who also want to test out the ephemeral subgraph change?
For us, the subgraph owners share the branch/variant name with the FE team; all they need to do is add the special header.
- How do ephemeral schema variants get garbage collected?
As mentioned, we store the composed schemas in a static file hosting service such as S3.
Hey Serey, thank you for replying. Just have a small follow-up regarding
- How do we make the ephemeral schema variants available to FE who also want to test out the ephemeral subgraph change?
How can the FE engineers pull down the schema if we never push the ephemeral schema to Uplink? Do they retrieve it directly from static storage, or do they pull it by introspecting the ephemeral router?
Naively, we first exposed it through introspection, but we learned that building an API endpoint to retrieve it directly from static file storage cut out a lot of the dependency and latency and improved availability.
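For a sense of how little such an endpoint needs to do, here is a sketch of the lookup behind it, with a local directory standing in for the bucket. The key layout (`supergraphs/<branch>.graphql`) and both function names are assumptions for illustration; a real version would call the storage provider's client instead of reading from disk.

```python
from pathlib import Path
from typing import Optional


def schema_key(branch: str) -> str:
    """Map a branch name to its (hypothetical) object key in storage."""
    return f"supergraphs/{branch.replace('/', '-')}.graphql"


def fetch_schema(storage_root: Path, branch: str) -> Optional[str]:
    """Return the composed supergraph SDL for a branch, or None if absent.

    `storage_root` is a local directory standing in for the bucket.
    """
    path = storage_root / schema_key(branch)
    if not path.is_file():
        return None
    return path.read_text(encoding="utf-8")
```

An HTTP handler wrapping `fetch_schema` would return the SDL on a hit and a 404 on `None`, so FE tooling never depends on the ephemeral router being up.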