Confusion Around Extending Entities in Federation 2

I’m relatively new to federation, and after reading through the docs, looking at examples, and creating a POC, I’m a bit confused about the current state of extending, in one subgraph, a type that is defined in another subgraph. In the docs, I only see extending types referenced in the “removing unnecessary syntax” section of Moving to Apollo Federation 2 (Apollo GraphQL Docs), which covers moving from version 1 to version 2 of federation. The Federation 2 docs on entities state:

Every subgraph that contributes at least one unique field to an entity must define a reference resolver for that entity.

This led me to believe that extending a type in another subgraph was something used in Federation 1 but not necessary in Federation 2. However, the restriction that you have to define a reference resolver for the type now seems odd to me. If each subgraph is an independent service with its own data store, one subgraph can’t load the base entity out of the other service’s data store. However, that service does have related info that makes sense to expose on that type as a field.

As an example, take the Product and Review graph that is used in a lot of federation examples. If you have product and review services/subgraphs with their own data stores, it makes sense for the Product subgraph to define a Product type and for the Review subgraph to define a reviews field on the Product type. But it’s not obvious to me how the reviews subgraph would define a reference resolver for Product, which seems to point towards extending the type rather than re-declaring it and needing to define a reference resolver.

As another note, I’ve been using Pothos as my schema builder. In their federation plugin examples, they mention extending external entities, which ends up using the @extends directive. This is actually how I originally discovered that extending types between subgraphs even worked.

Anyway, I’m basically just trying to determine whether extending entities is still the norm for handling this sort of situation and, if it’s not, how you deal with the reference resolver requirement in a subgraph that doesn’t have access to where the entity’s backing data is stored.

Hello :wave:

Extending entities is a core feature of Federation (both v1 and v2). Fed v2 simplified some of the logic around it so you no longer need to use extend type/@extends syntax to mark your “extended” entities.

Reference resolvers are the entity’s entry point to a given subgraph, i.e. in order to extend a given entity with new functionality, we need to be able to resolve the entity to some object. In the reviews example, the simplest Product entity resolver would just instantiate a Product object with the provided keys (in general, an entity resolver should also verify whether the provided key is a valid one, i.e. can be resolved to an object).
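
As a rough sketch of that “simplest” resolver, here is what the reviews subgraph’s resolver map could look like. It assumes the Apollo Server convention of a `__resolveReference` function on the entity type; the `reviewStore` array and the `ProductRef` type are made up for illustration and stand in for whatever the reviews service really stores.

```typescript
// Hypothetical in-memory stand-in for the reviews service's own data store.
type Review = { id: string; productId: string; text: string };
const reviewStore: Review[] = [
  { id: "r1", productId: "123", text: "Great" },
  { id: "r2", productId: "123", text: "Okay" },
];

// Shape of the representation the router sends: __typename plus the @key fields.
type ProductRef = { __typename: "Product"; id: string };

const resolvers = {
  Product: {
    // No lookup and no call to the products service: the key is simply
    // handed back as a Product object so the field resolver below can run.
    __resolveReference(ref: ProductRef): { id: string } {
      return { id: ref.id };
    },
    // The only thing this subgraph knows about a product: its reviews.
    reviews(product: { id: string }): Review[] {
      return reviewStore.filter((r) => r.productId === product.id);
    },
  },
};
```

Note that this version does no key validation at all; a product ID with no reviews simply resolves to an empty list.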

Take a look at the entities (basics) and entities (advanced) sections of the docs for more details.

Thanks,
Derek


@dkuc how do you validate a key that lives in another service’s store? In this example, in my Reviews subgraph, I want to extend type Product and add a reviews field to it. I don’t own the Product database, though, so the Reviews subgraph can’t and shouldn’t know whether a given Product key is valid. The Federation Router should be the one to check with the Product subgraph (which is the “owner” of the Product entity) to ensure that it’s valid.

Am I misunderstanding something? Otherwise, every subgraph will have to validate every key of every type it wants to extend which adds both a lot of coupling and potentially unnecessary network calls — depending on the call graph, the Product’s key may already have been validated by another request the Router has made.

@clayne11 in general, entities are objects that can be uniquely identified across a number of subgraphs. Each subgraph that defines a given entity needs the logic to resolve its local fields. Usage of those keys is transparent to the end users.

Given a simple use case

# product subgraph
type Product @key(fields: "id") {
  id: ID!
  name: String
}

type Query {
  product(id: ID!): Product
}

# review subgraph
type Product @key(fields: "id") {
  id: ID!
  reviews: [Review!]!
}

type Review {
  id: ID!
  text: String
}

You would generate the following supergraph API schema (this is the schema that your clients would interact with)

type Query {
  product(id: ID!): Product
}

type Product {
  id: ID!
  name: String
  reviews: [Review!]!
}

type Review {
  id: ID!
  text: String
}

Then given a simple query

query {
  product(id: "123") {
    name
    reviews {
      text
    }
  }
}

The router/gateway would generate a query plan that first resolves Product from the products subgraph (through the product query) and then uses that Product ID to resolve the reviews from the reviews subgraph. The products subgraph has to be able to look up Product.name based on the ID, and the reviews subgraph has to be able to look up the reviews for a given product ID.
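
To make the two steps concrete, here is a toy simulation of that plan. Everything in it is an assumption for illustration: `productsSubgraph`, `reviewsSubgraph`, and `executePlan` are hypothetical in-process stand-ins, and `entities` only models the `_entities(representations: ...)` query that the router sends to a subgraph.

```typescript
// Representation passed between subgraphs: __typename plus the @key fields.
type Representation = { __typename: "Product"; id: string };

// Hypothetical stand-in for the products subgraph.
const productNames = new Map<string, string>([["123", "Chair"]]);
const productsSubgraph = {
  // Query.product: the only place that decides whether the product exists.
  product(id: string): { id: string; name: string } | null {
    const name = productNames.get(id);
    return name === undefined ? null : { id, name };
  },
};

// Hypothetical stand-in for the reviews subgraph.
const reviewTexts = new Map<string, string[]>([["123", ["Great", "Okay"]]]);
const reviewsSubgraph = {
  // Models the _entities query: one result per representation, in order.
  entities(reps: Representation[]): { reviews: { text: string }[] }[] {
    return reps.map((rep) => ({
      reviews: (reviewTexts.get(rep.id) ?? []).map((text) => ({ text })),
    }));
  },
};

// Roughly what the plan for the product query above does.
function executePlan(id: string) {
  // Step 1: resolve the product in the products subgraph.
  const product = productsSubgraph.product(id);
  if (product === null) return null; // nothing is ever sent to reviews
  // Step 2: pass the key to the reviews subgraph as a representation.
  const results = reviewsSubgraph.entities([{ __typename: "Product", id: product.id }]);
  return { name: product.name, reviews: results[0]?.reviews ?? [] };
}
```

In this sketch the reviews side is only reached with an ID that the products side has already returned.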

Subgraphs have to be aware of some sort of relationship between entities and types; otherwise, how would they be able to resolve it? You could actually achieve the same supergraph by flipping where you declare the reviews, e.g. you could do something like

# product subgraph
type Product @key(fields: "id") {
  id: ID!
  name: String
  reviews: [Review!]!
}

# this is just a stub so we can reference it
type Review @key(fields: "id", resolvable: false) {
  id: ID!
}

type Query {
  product(id: ID!): Product
}

# review subgraph
type Review @key(fields: "id") {
  id: ID!
  text: String
}

The resulting supergraph API schema is the same, but the generated query plans will differ. In the original version, the reviews subgraph knows how to fetch product reviews given a product ID (in REST it could be represented as something like product/<id>/reviews). In the second version, the products subgraph knows how to get the list of review IDs for a given Product, and you would then resolve those reviews by their IDs in the reviews subgraph.
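
A minimal sketch of the second version, with plain functions standing in for the resolvers. The two maps and the function names are hypothetical; the point is only that the products side returns key-only stubs and the reviews side turns each stub into a full Review.

```typescript
// Hypothetical data: in this variant the products side stores review IDs.
const reviewIdsByProduct = new Map<string, string[]>([["123", ["r1", "r2"]]]);
// Hypothetical data owned by the reviews side.
const reviewTextById = new Map<string, string>([
  ["r1", "Great"],
  ["r2", "Okay"],
]);

type ReviewRef = { __typename: "Review"; id: string };

// Products subgraph: Product.reviews returns stubs carrying only the key,
// which is all the `resolvable: false` Review stub lets it know about.
function productReviews(productId: string): ReviewRef[] {
  const ids = reviewIdsByProduct.get(productId) ?? [];
  return ids.map((id): ReviewRef => ({ __typename: "Review", id }));
}

// Reviews subgraph: reference resolver for Review, the entity it owns.
// Here a key check is cheap because the data is local to this service.
function resolveReviewReference(ref: ReviewRef): { id: string; text: string } | null {
  const text = reviewTextById.get(ref.id);
  return text === undefined ? null : { id: ref.id, text };
}
```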

Yes, I understand how Federation works. This doesn’t address my concern.

Imagine I have product and review subgraphs, each backed by separate microservices with their own DB. The DBs look like this:

# inside products microservice
table_name: Product

Columns:
id: string
upc: string
price: double

# inside reviews microservice
table_name: Review

Columns:
id: string
author_id: string
product_id: string
body: string

How can the review subgraph validate the product key? It would have to call the product microservice to do this.

All the review subgraph / microservice knows is whether there are any reviews for a given product ID. It doesn’t know if the product is valid. The product might be “invalid” for a number of reasons:

  • the product isn’t visible to that user due to permissioning
  • the product was deleted
  • potentially other reasons

It also might be valid, but we don’t have reviews.

The only way to know deterministically whether the product ID is valid is to make a network request to the product service / subgraph that “owns” the product data.

If you make the review subgraph run that query in its implementation, two issues happen:

  1. Coupling between the services. The review service now has to know about the product service.
  2. Unnecessary network calls. The user’s query may have been something like this:
query { 
  products {
    id
    reviews {
      body
    }
  }
}

In this case, the product keys would have already been validated by the call to the products subgraph before the review subgraph was even called.

Why should the reviews subgraph have to revalidate the keys through another network request to the product service?

To me it makes much more sense to have a declarative way for the router to do this validation for me by declaring a certain subgraph as an entity “owner”.

GraphQL is not tied to a database, but a DB can certainly be a source of data. Going with the DB analogy, think about how this would work in the DB world. First you would make a call to the product table to get some data (with some conditions to ensure “valid” data) and then use those product IDs (which would be your foreign keys) to fetch the reviews. Instead of making two separate calls, you would probably want to run a single SQL statement that joins those two tables.

With the GraphQL products query you would do the same validations as above: only return products that are valid and visible to the user. The router would then attempt to fetch reviews based on those valid product IDs (same as reading from a DB table using an FK constraint). Keep in mind that users cannot just ask for reviews (unless you expose that as another query). With the above GraphQL schema they can only access reviews data through the products query, which should be validating your input data.
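
As a small illustration of where that validation would live, here is a hypothetical resolver for the products query. The `deleted` and `visibleTo` columns are invented for the example and are not part of the tables described earlier in the thread.

```typescript
// Hypothetical product rows; `deleted` and `visibleTo` are made-up columns.
type ProductRow = { id: string; deleted: boolean; visibleTo: string[] };
const productRows: ProductRow[] = [
  { id: "1", deleted: false, visibleTo: ["alice"] },
  { id: "2", deleted: true, visibleTo: ["alice"] },
  { id: "3", deleted: false, visibleTo: ["bob"] },
];

// Query.products in the products subgraph: validation happens here, once.
function products(userId: string): { id: string }[] {
  return productRows
    .filter((p) => !p.deleted && p.visibleTo.includes(userId))
    .map((p) => ({ id: p.id }));
}

// Only the IDs returned above would be forwarded to the reviews subgraph.
const forwardedIds = products("alice").map((p) => p.id);
```

Under these assumptions the reviews subgraph never sees product 2 or 3 for this user, so it has nothing to re-validate.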

If you wanted, you could definitely expose a reviews(productId: ID!): [Review!]! query (which could correspond to something like a product/<id>/reviews REST endpoint), but you would be bypassing the whole point of federation.