Reducing repetition in query responses

I’m trying to figure out strategies for reducing duplication / repetition in the network response to our queries when an array of objects each has a field that resolves to the same object.

For example, take the case where we have two types, topics and comments:

type Topic {
  # __typename is implicit in GraphQL and resolves to "Topic"
  id: ID!
  title: String!
}

type Comment {
  # __typename is implicit in GraphQL and resolves to "Comment"
  id: ID!
  message: String!
  topic: Topic!
}

When we’re building the UI, it would be extremely helpful to have access to the topic that a given comment is related to via comment.topic. Naively, we would write our query like this:

query getComments {
  comments {
    __typename
    id
    message
    topic {
      __typename
      id
      title
    }
  }
}

While this works, it results in the topic getting repeated multiple times in the query response (e.g. if you have 100 comments in a single topic, the topic is repeated in the response 100 times, once for each comment).
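To put a rough number on the repetition, here is a minimal sketch that builds the nested response for 100 comments in one topic and counts how much of the serialized payload is topic data. The interfaces and sample values are made up for illustration; they only mirror the types above.

```typescript
// Hypothetical shapes mirroring the Topic and Comment types above.
interface Topic { __typename: "Topic"; id: string; title: string }
interface Comment { __typename: "Comment"; id: string; message: string; topic: Topic }

const topic: Topic = { __typename: "Topic", id: "t1", title: "Reducing repetition" };

// 100 comments that all resolve to the same topic, shaped like the nested query result.
const comments: Comment[] = Array.from({ length: 100 }, (_, i): Comment => ({
  __typename: "Comment",
  id: `c${i}`,
  message: `Comment ${i}`,
  topic,
}));

const json = JSON.stringify({ data: { comments } });

// Count how many times the topic is serialized, and how many characters that costs.
const occurrences = json.split(topic.title).length - 1;
const topicChars = JSON.stringify(topic).length * occurrences;

console.log(`topic serialized ${occurrences} times`);
console.log(`${topicChars} of ${json.length} characters are topic data`);
```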

What I’m looking for is a way to have access to a comment’s topic, without repeating the topic over and over again over the network. Some of the options I’ve looked at are:

  • Only including the topic's __typename and id in the query and fetching topics individually when the UI needs them (with a cache-first fetch policy). This does reduce repetition, but it also requires us to scatter useQuery calls throughout the UI code.
  • Side-loading topics in a separate field in the query and only including the topic's id and __typename with each comment. This does reduce network traffic, but it requires frequent use of lookup tables to get the topic for a given comment.
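For reference, the lookup-table approach from the second option could look roughly like the sketch below. The `SideLoaded` response shape and the `attachTopics` helper are hypothetical names used for illustration; nothing here is part of Apollo Client.

```typescript
interface Topic { __typename: "Topic"; id: string; title: string }

// A comment as returned by the side-loaded query: the topic is only a reference.
interface CommentRef {
  __typename: "Comment";
  id: string;
  message: string;
  topic: Pick<Topic, "__typename" | "id">;
}

interface CommentWithTopic extends Omit<CommentRef, "topic"> { topic: Topic }

// Hypothetical response shape: comments plus a separate list holding each topic once.
interface SideLoaded { comments: CommentRef[]; topics: Topic[] }

// Build the lookup table once, then join each comment to its full topic.
function attachTopics(data: SideLoaded): CommentWithTopic[] {
  const byId = new Map(data.topics.map((t): [string, Topic] => [t.id, t]));
  return data.comments.map((c) => {
    const topic = byId.get(c.topic.id);
    if (!topic) throw new Error(`Topic ${c.topic.id} was not side-loaded`);
    return { ...c, topic };
  });
}

const sideLoaded: SideLoaded = {
  topics: [{ __typename: "Topic", id: "t1", title: "Reducing repetition" }],
  comments: [
    { __typename: "Comment", id: "c1", message: "First", topic: { __typename: "Topic", id: "t1" } },
    { __typename: "Comment", id: "c2", message: "Second", topic: { __typename: "Topic", id: "t1" } },
  ],
};

const joined = attachTopics(sideLoaded);
console.log(joined[1].topic.title);
```

Doing the join once, right where the query result comes back, at least keeps the lookup tables out of the individual components, which can then read comment.topic directly.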

Am I missing another option? Is there any way to include cached fields in the result from useQuery that doesn’t also cause them to be sent over the network?

In an ideal case, we would be able to query for all comments and include the topic ID and typename in the query and then, as long as the topic was previously loaded (which we can guarantee), be able to access the comment’s topic and all of its fields via comment.topic.

It’s entirely possible that I’ve missed some basic functionality, or that there’s a clever schema design strategy that I’m unaware of.

Thank you!

The most common strategy is to use gzip: because GraphQL serializes response fields in the order the query requests them, the repeated topic blocks are identical and compress easily.

It’s not a popular solution, but when you’re using a normalized cache like the one Apollo Client has, you could use gajus/graphql-deduplicator (on GitHub), a GraphQL response deduplicator that removes duplicate entities from the GraphQL response. You have to be careful about including __typename and the id fields though.
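To illustrate the general idea of response deduplication, here is a minimal sketch. It is not graphql-deduplicator's implementation or API: `dedupeEntities` and `restoreEntities` are hypothetical helpers, and the sketch assumes every occurrence of an entity selects the same fields. It also shows why __typename and id matter, since they are the only way to recognize a repeat.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
type JsonObject = { [key: string]: Json };

// An entity is any object carrying both __typename and id.
const isEntity = (v: Json): v is JsonObject =>
  typeof v === "object" && v !== null && !Array.isArray(v) &&
  typeof v.__typename === "string" &&
  (typeof v.id === "string" || typeof v.id === "number");

const keyOf = (e: JsonObject): string => `${e.__typename}:${e.id}`;

// Server side: keep the first full copy of each entity, shrink repeats to a stub.
function dedupeEntities(value: Json, seen = new Set<string>()): Json {
  if (Array.isArray(value)) return value.map((v) => dedupeEntities(v, seen));
  if (typeof value !== "object" || value === null) return value;
  if (isEntity(value)) {
    const key = keyOf(value);
    if (seen.has(key)) return { __typename: value.__typename, id: value.id };
    seen.add(key);
  }
  const out: JsonObject = {};
  for (const [k, v] of Object.entries(value)) out[k] = dedupeEntities(v, seen);
  return out;
}

// Client side: walk in the same order and swap each stub for the first full copy.
function restoreEntities(value: Json, store = new Map<string, Json>()): Json {
  if (Array.isArray(value)) return value.map((v) => restoreEntities(v, store));
  if (typeof value !== "object" || value === null) return value;
  if (isEntity(value)) {
    const full = store.get(keyOf(value));
    if (full !== undefined) return full;
  }
  const out: JsonObject = {};
  if (isEntity(value)) store.set(keyOf(value), out);
  for (const [k, v] of Object.entries(value)) out[k] = restoreEntities(v, store);
  return out;
}

const response: Json = {
  comments: [
    { __typename: "Comment", id: "c1", message: "First", topic: { __typename: "Topic", id: "t1", title: "Reducing repetition" } },
    { __typename: "Comment", id: "c2", message: "Second", topic: { __typename: "Topic", id: "t1", title: "Reducing repetition" } },
  ],
};

const wire = JSON.stringify(dedupeEntities(response)); // what would travel over the network
const restored = restoreEntities(JSON.parse(wire));
console.log(wire.length < JSON.stringify(response).length);
```

The client still has to parse and rebuild the full structure, so this mainly saves bytes on the wire; a normalized cache would store the topic once either way.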

Thank you, @lennyburdette! I’ll take a look at graphql-deduplicator. Gzipping responses is definitely worth implementing, but I’d prefer to solve the root of the problem (and reduce the processing time and memory needed to parse the response).