We are planning a self-managed production setup with Apollo GraphQL Federation. I see two options:
1. Integrate CI/CD with the Rover CLI to compose the supergraph
2. Use the serviceList option on ApolloGateway, so no Rover is needed
Apollo recommends neither for production. I'd like to understand some of the technical details of why they are unsuitable for a containerized (k8s) production environment. Where can we go wrong? I'd appreciate any light you can shed.
Hello! One of the biggest advantages of a managed federation solution is that it prevents downtime for your federated gateway. Any time you need to update your serviceList to add, remove, or modify a subgraph, you need to restart your gateway to reflect the new configuration. Similarly, if you provide a Rover-composed schema, you need to restart the gateway to pass in the updated schema via the supergraphSdl constructor option.
This is not the case for managed federation, where your gateway regularly polls an external resource (most commonly the Apollo Uplink) for an updated configuration and applies it at runtime. This means that your API stays up even as you modify underlying services.
That's sadly not possible for a lot of us who require all solutions to be on-premise with no traffic leaving our network. I hope that the upcoming Federation rework levels the playing field so on-premise gets equal love (compared to managed Federation).
Hi @StephenBarlow - can you clarify, please? Is it possible for the external resource that the gateway polls to be something managed by our own infrastructure or cloud provider, for example a supergraph that has been composed and put in an S3 bucket? Also, I saw in the docs for the Rover CLI supergraph compose command (Rover supergraph commands - Apollo GraphQL Docs) that what it generates is not yet consumable by Apollo tools. Is there a way, with the Rover CLI or something else, to generate the overall schema/supergraph from the subgraphs so that the gateway will understand it? Thanks
Actually, I'm pretty sure it doesn't interrogate all the services for each inbound query. I believe the primary negatives of using serviceList are:
During Apollo Gateway startup, every service contributing to your federated graph must be up and queryable by the gateway, or the gateway will fail to start.
If you have it set to poll for updates (i.e. via experimental_pollInterval) and any service is not up, the gateway will be unable to compose the supergraph and update until all services are available again.
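To make those two failure modes concrete, here is a toy model in TypeScript. It is not the gateway's code: `SdlSource`, `composeAll`, and `refresh` are hypothetical names, the sources are synchronous for brevity, and "composition" is just string concatenation. It only illustrates the behaviour described above, under those assumptions.

```typescript
// Hypothetical stand-in for "ask a subgraph for its SDL"; throws if the subgraph is down.
type SdlSource = () => string;

// Toy composition: needs every subgraph and fails if any one is unreachable.
function composeAll(sources: Record<string, SdlSource>): string {
  const parts: string[] = [];
  for (const name of Object.keys(sources)) {
    try {
      parts.push(sources[name]());
    } catch (err) {
      throw new Error(`subgraph "${name}" unavailable`);
    }
  }
  return parts.join("\n");
}

// At startup there is no previous schema, so a failure propagates (the gateway cannot start).
// On a later poll, a failure keeps the last good schema, which may by then be stale.
function refresh(sources: Record<string, SdlSource>, lastGood: string | null): string {
  try {
    return composeAll(sources);
  } catch (err) {
    if (lastGood === null) throw err;
    return lastGood;
  }
}
```

Calling `refresh(sources, null)` models startup, and `refresh(sources, previousSchema)` models one poll tick.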
After some further investigation I got the answer to my own question. The Rover CLI supergraph command will generate a file that can be understood by the gateway. Then there are some "experimental_" properties that can be set on the GatewayConfig object. Here is some code:
import { GatewayConfig } from '@apollo/gateway';
import fs from 'fs';
export const gatewayConfig: GatewayConfig = {
  experimental_pollInterval: 10000,
  experimental_updateSupergraphSdl: async (config) => {
    console.log('reading supergraphSdl file');
    // here you could read from say AWS S3 or wherever you want
    const supergraphSdl = fs.readFileSync('prod-schema.graphql', { encoding: 'utf-8' });
    return {
      id: new Date().toISOString(),
      supergraphSdl
    };
  }
};
and then in the gateway startup…
import 'reflect-metadata';
import { ApolloServer } from 'apollo-server';
import { ApolloGateway } from '@apollo/gateway';
import { listen as userListen } from './user-subgraph/index';
import { listen as transactionsListen } from './transactions-subgraph/index';
import { listen as paymentsListen } from './payments-subgraph/index';
import { gatewayConfig } from './gatewayConfig';
async function bootstrap() {
  const gateway = new ApolloGateway(gatewayConfig);
  const server = new ApolloServer({
    gateway,
    tracing: false,
    playground: true,
    subscriptions: false
  });
  // start the subgraphs before the gateway begins listening
  await Promise.all([
    userListen(3001),
    transactionsListen(3002),
    paymentsListen(3003)
  ]);
  server.listen({ port: 3000 }).then(({ url }) => {
    console.log(`Apollo Gateway ready at ${url}`);
  });
}
bootstrap().catch(console.error);
@StephenBarlow what is the plan for the "experimental_pollInterval" and "experimental_updateSupergraphSdl" config options? Will they be kept and made non-experimental? Thanks
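A possible refinement to the callback above: the id there is a fresh timestamp on every poll. If the gateway compares ids to decide whether the supergraph changed (I believe it does, but that is an assumption to verify against the version you run), every poll would look like a new schema. A minimal sketch that derives the id from the file content instead; `toSupergraphUpdate` is a made-up helper name, not part of @apollo/gateway:

```typescript
import { createHash } from "crypto";

// Hypothetical helper: build the update result with an id derived from the SDL itself,
// so identical content always yields the same id.
function toSupergraphUpdate(supergraphSdl: string): { id: string; supergraphSdl: string } {
  const id = createHash("sha256").update(supergraphSdl).digest("hex");
  return { id, supergraphSdl };
}
```

Inside experimental_updateSupergraphSdl, the return statement would then become `return toSupergraphUpdate(supergraphSdl);`.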
There used to be an option in Federation, Gateway.load(), which would tell the gateway to refetch the schema. This option was removed in version 2.23 and we still don't understand why. We are also trying to understand why polling for changes is becoming the norm over pushing/triggering when there is a change. The load() option obviously does nothing for changing the paths or number of subgraphs; a reload is required for that. Using gateway.load() was for the most part a very efficient pattern, though our gateway deploy (CD) is quite fast and we don't experience downtime.
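For what it's worth, the push/trigger pattern can be modelled outside the gateway as well. A minimal sketch, assuming a CI/CD step can emit an event when a new schema is published (here an in-process EventEmitter; `SchemaHolder`, `reloadOn`, and the event name are all hypothetical, and none of this is Apollo API):

```typescript
import { EventEmitter } from "events";

// Hypothetical holder for the supergraph SDL currently being served.
class SchemaHolder {
  private current: string;
  private readonly load: () => string;
  public reloads = 0;

  constructor(load: () => string) {
    this.load = load;
    this.current = load();
  }

  get schema(): string {
    return this.current;
  }

  // Re-read the schema and swap it in only when the content actually changed.
  reload(): boolean {
    const next = this.load();
    if (next === this.current) return false;
    this.current = next;
    this.reloads += 1;
    return true;
  }
}

// Wire a trigger (for example, fired by a webhook handler) to the holder.
function reloadOn(trigger: EventEmitter, event: string, holder: SchemaHolder): void {
  trigger.on(event, () => {
    holder.reload();
  });
}
```

The point of the design is that no work happens between changes: the schema is re-read only when something signals that it may have changed.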
There is a lot to understand and I'm new to this, so can anyone please correct me if I'm wrong?
There was a method, Gateway.load(), which was deprecated after version 2.23, and now the only option for reloading the schema from the subgraphs is Managed Federation. Is that right?
I also feel your pain. This was an issue for us as well.
We first used Hive from The Guild so we could self-host the solution, but we ran into issues with setting it up and with response times on support tickets, and it relied on Apollo's router, so we could not get federated subscriptions.
We've now switched to Cosmo and are using their router, which is independent of Apollo but supports Apollo Federation V1 and V2. Quite happy so far.