Reducing repetition in query responses

carsondarling · April 12, 2023, 12:07am

I’m trying to figure out strategies for reducing duplication / repetition in the network response to our queries when an array of objects each has a field that resolves to the same object.

For example, take the case where we have two types, topics and comments:

type Topic {
  # __typename resolves to "Topic" (implicit on every object type)
  id: ID!
  title: String!
}

type Comment {
  # __typename resolves to "Comment"
  id: ID!
  message: String!
  topic: Topic!
}

When we’re building the UI, it would be extremely helpful to have access to the topic that a given comment is related to via comment.topic. Naively, we would write our query like this:

query getComments {
  comments {
    __typename
    id
    message
    topic {
      __typename
      id
      title
    }
  }
}

While this works, it results in the topic getting repeated multiple times in the query response (e.g. if you have 100 comments in a single topic, the topic is repeated in the response 100 times, once for each comment).

What I’m looking for is a way to have access to a comment’s topic, without repeating the topic over and over again over the network. Some of the options I’ve looked at are:

Only including __typename and id in the query and fetching the topics individually when needed by the UI (with a cache-first fetch policy). This does reduce repetition, but also requires us to scatter useQuery calls throughout the UI code.
Side-loading topics in a separate field in the query and only including the topic's id and __typename with comments. This does reduce the network traffic, but requires frequent use of lookup tables to get the topic for a given comment.
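
To make the side-loading option concrete, here is a minimal TypeScript sketch of the lookup-table join it implies. The names CommentStub, CommentWithTopic and attachTopics are hypothetical, and the sketch assumes the query returns a flat list of topics next to the comments; it is not part of Apollo Client.

```typescript
// Hypothetical shapes mirroring the Topic and Comment types above.
interface Topic {
  __typename: "Topic";
  id: string;
  title: string;
}

// What a side-loaded query would return per comment: a topic stub only.
interface CommentStub {
  __typename: "Comment";
  id: string;
  message: string;
  topic: Pick<Topic, "__typename" | "id">;
}

// What the UI wants to consume: comment.topic with every topic field.
interface CommentWithTopic extends Omit<CommentStub, "topic"> {
  topic: Topic;
}

// Build the lookup table once, then attach the full topic to each comment.
function attachTopics(comments: CommentStub[], topics: Topic[]): CommentWithTopic[] {
  const topicsById = new Map<string, Topic>();
  for (const t of topics) topicsById.set(t.id, t);

  return comments.map((comment) => {
    const topic = topicsById.get(comment.topic.id);
    if (!topic) {
      // Assumption: every referenced topic is side-loaded in the same response.
      throw new Error(`Topic ${comment.topic.id} was not side-loaded`);
    }
    return { ...comment, topic };
  });
}
```

Doing the join once, right after the response arrives, would keep the lookup table out of the UI components, at the cost of an extra pass over the comments.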

Am I missing another option? Is there any way to include cached fields in the result from useQuery that doesn’t also cause them to be sent over the network?

In an ideal case, we would be able to query for all comments and include the topic ID and typename in the query and then, as long as the topic was previously loaded (which we can guarantee), be able to access the comment’s topic and all of its fields via comment.topic.

It’s entirely possible that I’ve missed some basic functionality, or that there’s a clever schema design strategy that I’m unaware of.

Thank you!

lennyburdette · April 13, 2023, 8:26pm

The most common strategy is to use gzip: because GraphQL guarantees field order, the repeated topic blocks serialize identically and compress very well.

It’s not a popular solution, but when you’re using a normalized cache like the one Apollo Client has, you could use gajus/graphql-deduplicator (on GitHub), a GraphQL response deduplicator that removes duplicate entities from the response. You have to be careful about including the __typename and id fields, though.
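
For a rough idea of how this kind of deduplication can work, here is a simplified TypeScript sketch. It is not graphql-deduplicator's actual code or API: the deflate and inflate functions below are hypothetical stand-ins written for illustration, and they assume every entity carries a string __typename and a string id. Entities are keyed by their path in the response as well, on the assumption that objects at the same path were selected with the same fields.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
type JsonObject = { [key: string]: Json };

// An entity is any object with a string __typename and a string id.
function entityKey(node: JsonObject, path: string): string | null {
  const { __typename, id } = node;
  if (typeof __typename !== "string" || typeof id !== "string") return null;
  return `${path}|${__typename}:${id}`;
}

// Server side: keep the first copy of each entity, shrink repeats to a stub.
function deflate(node: Json, path = "", seen = new Set<string>()): Json {
  if (Array.isArray(node)) return node.map((item) => deflate(item, path, seen));
  if (node === null || typeof node !== "object") return node;
  const key = entityKey(node, path);
  if (key !== null) {
    if (seen.has(key)) return { __typename: node.__typename, id: node.id };
    seen.add(key);
  }
  const out: JsonObject = {};
  for (const field of Object.keys(node)) {
    out[field] = deflate(node[field], `${path}.${field}`, seen);
  }
  return out;
}

// Client side: replace each stub with the first full copy seen at that path.
function inflate(node: Json, path = "", full = new Map<string, JsonObject>()): Json {
  if (Array.isArray(node)) return node.map((item) => inflate(item, path, full));
  if (node === null || typeof node !== "object") return node;
  const key = entityKey(node, path);
  const known = key !== null ? full.get(key) : undefined;
  if (known !== undefined) return known;
  const out: JsonObject = {};
  for (const field of Object.keys(node)) {
    out[field] = inflate(node[field], `${path}.${field}`, full);
  }
  if (key !== null) full.set(key, out);
  return out;
}
```

In this sketch, an entity that is missing __typename or id is simply never deduplicated, which is one reason those two fields matter. Note also that inflate hands back the same object for every repeat of an entity rather than a fresh copy.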

carsondarling · April 13, 2023, 9:29pm

Thank you, @lennyburdette! I’ll take a look at graphql-deduplicator. Gzipping responses is definitely worth implementing, but I’d prefer to solve the root of the problem (and reduce processing time + memory requirements for parsing the response).
