Input argument validation

How is everyone handling input argument validation? I’m looking at a start/end date validation. I created a DateTimestamp scalar that ensures each value is a valid date, but what is a good approach to ensuring the start date is before the end date (for example)?

I’m thinking something like this…

getThingForId(
  id: ID, 
  dateTimestampRange: {
    start: DateTimestamp, # <-- scalar ensures valid date, but needs to be before endDate argument
    end: DateTimestamp, # <-- scalar ensures valid date, but needs to be after startDate argument
  }
)

I saw this previous post, but it looks like it went stale and got archived: Validating multiple input arguments

I’ve used directives to validate input arguments before, but I haven’t needed to validate one argument against another. In your case, I think I would handle the validation directly in the resolver.

input CreateUserInput {
  name: String! @validate(maxLength: 50)
  email: String! @validate(email: true, maxLength: 50)
  age: Int! @validate(min: 18)
}
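
For reference, the resolver-level check suggested above could look roughly like this. It is only a sketch: the argument shape, the epoch-millisecond representation, and the placeholder return value are assumptions, not part of the original schema.

```typescript
// Hypothetical argument shape for getThingForId. Timestamps are epoch
// milliseconds here, which is an assumption made for this sketch.
interface DateTimestampRange {
  start: number;
  end: number;
}

interface GetThingArgs {
  id: string;
  dateTimestampRange: DateTimestampRange;
}

// Resolver that checks one argument against another before doing any work.
function getThingForId(_parent: unknown, args: GetThingArgs): { id: string } {
  const { start, end } = args.dateTimestampRange;
  if (start >= end) {
    throw new Error(
      "dateTimestampRange.start must be before dateTimestampRange.end",
    );
  }
  // Placeholder for the real data lookup.
  return { id: args.id };
}
```

The drawback is that every resolver taking a date range has to repeat the check.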

I’d like to make the DateTimestampRange re-usable so every resolver that may have a “date range” associated doesn’t have to duplicate validation logic or reach out to some helper for validation but rather build it in (similar to how scalar or directive validation is somewhat “automated”).

My mind mostly goes to something like the @requires directive, but for introducing validation for fields against one another.

If you need to validate the start and end fields in multiple inputs and want to make your schema more expressive, you can create a directive to handle it, like this:

directive @validateDateRange(start: String = "start", end: String = "end") 
  on INPUT_FIELD_DEFINITION

input SubscriptionPeriodInput @validateDateRange(start: "startAt", end: "finishAt") {
  startAt: DateTime
  finishAt: DateTime
}

In this example, the directive validates the startAt and finishAt fields. If another input type uses different field names for a date range, you can simply pass them as arguments to the directive.

Hm - I like that idea. I’ll give it a try. Thanks!

A philosophical question is whether to rely solely on the GraphQL layer to perform the validation or to expect your underlying code to be resilient itself. BTW, that decision also affects how and where you test this.

At least until now, I haven’t relied on GraphQL frameworks themselves to shield the underlying code from anything. I have, instead, relied on GraphQL to help communicate the constraints and allow early validation (that I don’t rely on).

In other words, GraphQL or not/otherwise, the rest of my code needed to be “safe”, whatever that means in a specific context. It is the primary line of defense. I “annotate” GraphQL schema with directives to help present more meaningful errors to clients, when that is permitted. In some/special cases the same would enable clients to discover/introspect the validation and perform it themselves, for an even more user-friendly approach.

I have some more preferences… Here are 1.5 of them:

  1. If individual values are related to each other but not to the rest of stuff, I’d have a dedicated type for that, kind of like @michaeldouglasdev did with the ...PeriodInput (I’d drop the context). That makes it reusable (both code and docs, everywhere) and extensible.
  2. I like to do the same with “simple” scalars, too. For example, no, IPv4 address is not just a string with validation slapped onto it, it should be its own type. In this case it also helps with “round-tripping” of output values back to inputs. Additionally, if anyone is generating code from GraphQL schema and using strongly typed languages, it will ensure that someone’s favourite breakfast isn’t reused as that IPv4 address even if it looks similar.
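
One way to get the "its own type" effect in generated or hand-written TypeScript is a branded type. This is a sketch under assumptions: the names IPv4Address, parseIPv4, and describeHost are made up for illustration, and the parser only accepts plain dotted-decimal notation without leading zeros.

```typescript
// A branded type: structurally a string, but a plain string cannot be
// assigned to it without going through the parser.
type IPv4Address = string & { readonly __brand: "IPv4Address" };

// Parse and validate. Returns null instead of throwing so callers decide
// how to report the problem.
function parseIPv4(input: string): IPv4Address | null {
  const parts = input.split(".");
  if (parts.length !== 4) return null;
  for (const part of parts) {
    // Digits only, no leading zeros (except "0" itself), value 0..255.
    if (!/^(0|[1-9]\d{0,2})$/.test(part)) return null;
    if (Number(part) > 255) return null;
  }
  return input as IPv4Address;
}

// Only accepts values that went through parseIPv4.
function describeHost(address: IPv4Address): string {
  return `host at ${address}`;
}
```

With this in place, passing an arbitrary string to describeHost is a compile-time error, which is the "favourite breakfast" protection described above.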

A philosophical question is whether to rely solely on the GraphQL layer to perform the validation or to expect your underlying code to be resilient itself. BTW, that decision also affects how and where you test this.

I love this question. I wonder if there is a vision from Apollo on this. Is there a world where we could/should consolidate that to just the GraphQL layer in a federated graph, or is the zero-trust policy going to remain the standard (at least in the router/subgraphs)?

A philosophical question is whether to rely solely on the GraphQL layer to perform the validation or to expect your underlying code to be resilient itself.

I like this point. I read it as not “vs” but “and”: the question is not whether validation like this should be in the schema (directive) or the resolver layer, but whether it should be in the schema (directive) and the resolver layer.

I like the portability of the directive but it would also be beneficial to possibly share logic between the directive validation and resolver level validation.
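
A sketch of what sharing that logic might look like: one rule, called from both the business layer and the API layer. The function names and the string-based API arguments are assumptions for illustration.

```typescript
// Single source of truth for the rule. Returns an error message or null.
function dateRangeError(start: Date, end: Date): string | null {
  if (Number.isNaN(start.getTime()) || Number.isNaN(end.getTime())) {
    return "start and end must both be valid dates";
  }
  if (start.getTime() >= end.getTime()) {
    return "start must be before end";
  }
  return null;
}

// Business layer: enforces the rule regardless of which API sits on top.
function listThings(start: Date, end: Date): string[] {
  const error = dateRangeError(start, end);
  if (error) throw new RangeError(error);
  return []; // placeholder for the real query
}

// API layer: early, client-friendly rejection using the same rule.
function apiPrecheck(args: { start: string; end: string }): {
  ok: boolean;
  message?: string;
} {
  const error = dateRangeError(new Date(args.start), new Date(args.end));
  return error ? { ok: false, message: error } : { ok: true };
}
```

A directive implementation or a resolver could call apiPrecheck, while listThings stays safe even if neither does.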

I can’t speak for Apollo’s goals, but I can share my own experience here.

TL;DR: GraphQL is right for a lot of things, and wrong for a lot of things. For me, validation and sanitization in the GraphQL layer is for performance, not for protection.


When I first got into GraphQL back in 2016, this presentation by Dan Schafer, co-creator of GraphQL, was a key pivot point for me.

The key principle in that presentation that applies here is that they deliberately didn’t specify how to do things like authorization, because GraphQL shouldn’t do those things. To quote him: “… it’s one of the reasons why the spec and the reference implementation don’t dictate how we’re going to do this. It’s designed to be a thin API on top of a layer that DOES do it.”

So my take? If tomorrow something better [for my use cases] than GraphQL comes along, I want to be able to switch to that without worrying about critical bugs or security vulnerabilities at my business logic or data layers. The API interface layer should be a thin translation layer only. Any authorizations or validations here are for performance, so that the user doesn’t have to wait on the “extra hop” to my data layer (whatever “hop” means in that context; not always a separate server). If I generate a new REST API for my data, I don’t want to rebuild authorization decisions, so I put those as close to the data as possible. Validation goes in the business layer for the same reason.

That doesn’t mean we shouldn’t add things like validation or authorization decisions in the GraphQL layer. It just means that those things can come later to improve the experience and performance for the API consumer.

I totally agree that having security considerations at the data service and access layers is top priority.

For the Graph, I like Apollo’s perspective on Graph Security, “[Security is] mostly related to denial-of-service (DoS) attacks, and they fall under the categories of API discoverability and malicious operations.”

This thread isn’t proposing to replace or forget about security at the data service layer. I think you’re pretty spot on that it’s more about performance, albeit loosely about validation. The closer to the edge I can do a simple validation like “start date must be before end date, and both must be valid dates”, the quicker I can return a simple validation error to the consumer and spare the underlying services from taking a hit just to run simple validation.

It’s also geared towards protecting the Graph’s transformation logic from trying to transform data that isn’t the shape it expects. In this case, if I have a simple DateTimestamp scalar to normalize date time stamps, I can use that (ts) type in my resolvers and have a higher confidence there won’t be a type discrepancy.
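
As an illustration of that normalization step, here is a sketch of the parsing function such a scalar might use. The thread does not show the real DateTimestamp implementation, so the accepted formats and the use of a plain Date are assumptions. In graphql-js, a function like this could be supplied as the parseValue option of a GraphQLScalarType.

```typescript
// Hypothetical parsing step for a DateTimestamp-like scalar. Accepts epoch
// milliseconds (number or numeric string) or a date string that Date can
// parse, and always returns a Date so resolvers see a single type.
function parseDateTimestamp(value: unknown): Date {
  let date: Date;
  if (typeof value === "number") {
    date = new Date(value);
  } else if (typeof value === "string" && /^\d+$/.test(value)) {
    date = new Date(Number(value));
  } else if (typeof value === "string") {
    date = new Date(value);
  } else {
    throw new TypeError("DateTimestamp must be a number or a string");
  }
  // An unparseable input produces an Invalid Date, whose time is NaN.
  if (Number.isNaN(date.getTime())) {
    throw new TypeError(`DateTimestamp is not a valid date: ${String(value)}`);
  }
  return date;
}
```

Because invalid values are rejected during parsing, resolver code can compare the results without re-checking each one.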

It looks like this won’t work because directives can’t be applied to input or argument definitions.

Applying directive to input definition:

Error starting Apollo Server: [<unnamed>] Directive "@validateDateRange" may not be used on INPUT_FIELD_DEFINITION. {"extensions":{"code":"INVALID_GRAPHQL"},"stack":"GraphQLError: Directive \"@validateDateRange\" may not be used on INPUT_FIELD_DEFINITION.

When directive is applied to operation argument:

Error starting Apollo Server: [<unnamed>] Directive "@validateDateRange" may not be used on ARGUMENT_DEFINITION. {"extensions":{"code":"INVALID_GRAPHQL"},"stack":"GraphQLError: Directive \"@validateDateRange\" may not be used on ARGUMENT_DEFINITION.

Sorry, the correct definition type is INPUT_OBJECT instead of INPUT_FIELD_DEFINITION.

When I proposed the directive for input validation, I wrote that part directly here and didn’t notice the error.

Do you have an example of the directive logic?

I’ve just finished implementing this directive. It still needs some improvements, such as supporting arrays of periods and possibly allowing different kinds of values, like ISO date strings or raw numbers (for example, to define min/max age or min/max price) and other kinds of validations and optimizations.

Maybe this directive could be renamed to @range, where it could accept a value type as an argument, but for now, it’s a good starting version.

You can test:

import {
  GraphQLInputObjectType,
  GraphQLSchema,
  defaultFieldResolver,
  isObjectType,
  isInputObjectType,
  getNamedType,
  GraphQLInputField,
} from "graphql";
import { getDirective, mapSchema, MapperKind } from "@graphql-tools/utils";

const directiveName = "validateDateRange";
type ValidateDateRangeDirectiveType = {
  start: string;
  end: string;
};

export function validateDateRangeDirective() {
  const metadata = new Map<string, ValidateDateRangeDirectiveType>();

  return {
    validateDateRangeDirectiveTransformer: (schema: GraphQLSchema) => {
      mapSchema(schema, {
        [MapperKind.INPUT_OBJECT_TYPE]: (fieldConfig) => {
          const directive = getDirective(
            schema,
            fieldConfig,
            directiveName
          )?.[0];

          if (directive) {
            metadata.set(fieldConfig.name, {
              start: directive.start,
              end: directive.end,
            });
            return fieldConfig;
          }
        },
      });

      return mapSchema(schema, {
        [MapperKind.OBJECT_FIELD]: (fieldConfig) => {
          const resolver = fieldConfig.resolve ?? defaultFieldResolver;

          fieldConfig.resolve = async (source, args, context, info) => {
            const parentType = info.schema.getType(info.parentType.name);

            if (isObjectType(parentType)) {
              const field = parentType.getFields()[info.fieldName];

              const argumentDefs = field.args;

              for (const def of argumentDefs) {
                const argName = def.name;
                const argValue = args[argName];

                const inputType = getNamedType(def.type);

                if (
                  isInputObjectType(inputType) &&
                  argValue &&
                  typeof argValue === "object"
                ) {
                  validateInputObject(argValue, inputType, metadata);
                }
              }
            }

            return resolver(source, args, context, info);
          };

          return fieldConfig;
        },
      });
    },
  };
}

function validateInputObject(
  value: any,
  inputType: GraphQLInputObjectType,
  inputTypeDirectives: Map<string, ValidateDateRangeDirectiveType>
) {
  const directiveArgs = inputTypeDirectives.get(inputType.name);

  if (directiveArgs) {
    const { start, end } = directiveArgs;

    // Values are expected to be epoch-millisecond timestamps (number or numeric string)
    const startValue = new Date(parseInt(value[start], 10));
    const endValue = new Date(parseInt(value[end], 10));

    if (startValue > endValue) {
      throw new Error(
        `Invalid date range in input '${inputType.name}': field '${start}' must be before '${end}'`
      );
    }
  }

  // Recursively check nested input fields
  const fields = inputType.getFields();
  for (const fieldName in fields) {
    const field: GraphQLInputField = fields[fieldName];
    const fieldType = getNamedType(field.type);
    const fieldValue = value[fieldName];

    if (
      isInputObjectType(fieldType) &&
      fieldValue &&
      typeof fieldValue === "object"
    ) {
      validateInputObject(fieldValue, fieldType, inputTypeDirectives);
    }
  }
}

Make sure to apply the schema transformer to activate the directive

import { buildSubgraphSchema } from "@apollo/subgraph";

const { validateDateRangeDirectiveTransformer } = validateDateRangeDirective();

const directiveTransformers = [
  //...othersDirectives,
  validateDateRangeDirectiveTransformer,
];

// Apply each transformer in turn, starting from the subgraph schema
const transformedSchema = directiveTransformers.reduce(
  (schema, transformer) => transformer(schema),
  buildSubgraphSchema({
    typeDefs,
    resolvers,
  })
);

In my tests, I used timestamps to represent the data:

"data": {
    "period": {
      "startAt": "1745857565081",
      "finishAt": "1743857565081"
    }
  }
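
The improvements mentioned above (arrays of periods, ISO date strings, raw numbers) could be sketched as a standalone helper like the one below. The names RangeKind, toComparable, and checkRange are hypothetical, and this is not part of the directive as posted. It also reports unreadable values explicitly, since a comparison between two invalid Date objects is always false and would otherwise pass silently.

```typescript
type RangeKind = "timestamp" | "isoDate" | "number";

// Convert a raw field value to a comparable number, or NaN if unreadable.
function toComparable(raw: unknown, kind: RangeKind): number {
  if (kind === "isoDate") {
    return typeof raw === "string" ? new Date(raw).getTime() : NaN;
  }
  // "timestamp" and "number" both compare numerically.
  if (typeof raw === "number") return raw;
  if (typeof raw === "string" && raw.trim() !== "") return Number(raw);
  return NaN;
}

// Validate one object or an array of objects. Returns error messages.
function checkRange(
  value: Record<string, unknown> | Record<string, unknown>[],
  start: string,
  end: string,
  kind: RangeKind,
): string[] {
  const items: Record<string, unknown>[] = Array.isArray(value) ? value : [value];
  const errors: string[] = [];
  items.forEach((item, index) => {
    const low = toComparable(item[start], kind);
    const high = toComparable(item[end], kind);
    if (Number.isNaN(low) || Number.isNaN(high)) {
      errors.push(`item ${index}: '${start}' and '${end}' must both be readable as ${kind}`);
    } else if (low > high) {
      errors.push(`item ${index}: '${start}' must not be greater than '${end}'`);
    }
  });
  return errors;
}
```

With the test data above (startAt later than finishAt), checkRange(period, "startAt", "finishAt", "timestamp") returns one error.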

Hm that’s pretty clever.. I do like it but I’m still hesitant to embrace it since directives are not allowed on input and argument types/fields.

I feel like this is either a good use case that should be taken into consideration for GraphQL’s stance towards not allowing directives on inputs/arguments or it’s a really good hack that may build into a corner considering it doesn’t align with GraphQL design philosophy.

I’ve seen some other suggestions on implementing middleware or a custom scalar, neither of which I want to do.

Instead I took a pass at implementing a directive at the operation level…

schema.graphql: (FYI, DateTimestamp is just a custom scalar with some date-time validation baked in)

directive @validateDateRange on FIELD_DEFINITION

"""
Input type representing a date range.
A valid date range must have a start date that is before the end date.
"""
input DateRangeInput {
  "Start date of the date range"
  startDate: DateTimestamp!
  "End date of the date range"
  endDate: DateTimestamp!
}

type Query {
  thingByDate(
    input: ThingInput!
    dateRange: DateRangeInput!
  ): ThingByDateResponse! @validateDateRange
}

directive.ts

import { GraphQLSchema, GraphQLError } from 'graphql';
import { getDirective, MapperKind, mapSchema } from '@graphql-tools/utils';

const directiveName = 'validateDateRange';

function validateDateRange(schema: GraphQLSchema): GraphQLSchema {
  return mapSchema(schema, {
    [MapperKind.OBJECT_FIELD]: (fieldConfig) => {
      const validateDateRangeDirective = getDirective(
        schema,
        fieldConfig,
        directiveName,
      )?.[0];

      if (validateDateRangeDirective) {
        const { resolve } = fieldConfig;

        fieldConfig.resolve = async function (source, args, context, info) {
          const { dateRange } = args;

          if (dateRange.startDate > dateRange.endDate) {
            throw new GraphQLError('startDate must be before endDate');
          }

          if (resolve) {
            // Return the original resolver's result, otherwise the field always resolves to undefined
            return resolve(source, args, context, info);
          } else {
            return null;
          }
        };
      }
      return fieldConfig;
    },
  });
}

export default validateDateRange;

This feels “cleaner” and more aligned with GraphQL design philosophy.. but also feels kind of fragile. For example, if “I” “forget” to add the directive at the operation level then the validation does not happen on the dateRange input for the given operation. Maybe that’s a feature, not a bug? Now “I” would have the ability to accept a dateRange without applying validation by omitting the directive on an operation if need be?

The only other option I can think of (again, not wanting to do a scalar so I can keep my schema expressive) is creating a validation utility/helper method and pulling it into every resolver I want to apply validation.. which feels like the operation level directive with more boilerplate and chance of discrepancy.
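
For comparison, the helper approach described above could be sketched as a higher-order function that wraps a resolver. The Resolver type here is a simplified stand-in rather than the graphql-js signature, and plain Date values are assumed instead of the Luxon-backed scalar.

```typescript
// Minimal resolver signature for this sketch (not the graphql-js types).
type Resolver<Args, Result> = (
  parent: unknown,
  args: Args,
  context: unknown,
) => Result;

interface DateRange {
  startDate: Date;
  endDate: Date;
}

// Wrap any resolver whose args carry an optional `dateRange`.
function withDateRangeValidation<
  Args extends { dateRange?: DateRange | null },
  Result,
>(resolver: Resolver<Args, Result>): Resolver<Args, Result> {
  return (parent, args, context) => {
    const range = args.dateRange;
    if (range && range.startDate.getTime() > range.endDate.getTime()) {
      throw new Error("startDate must be before endDate");
    }
    return resolver(parent, args, context);
  };
}

// Usage: validation is opt-in per resolver, like the field-level directive.
const resolvers = {
  Query: {
    thingByDate: withDateRangeValidation(
      (_parent: unknown, args: { dateRange?: DateRange | null }) => ({
        found: Boolean(args.dateRange),
      }),
    ),
  },
};
```

Compared with the directive, the opt-in lives in the resolver map rather than in the schema, so it is invisible to anyone reading the SDL.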

I’d really love to hear from Apollo on this, perhaps the SAs have an opinion? @watson et al?

The “problem” with this new implementation is that the directive will always validate the dateRange field, and like you said, you’d need to apply it in multiple places.
Another issue is that it gets tricky when you have different fields representing periods, the directive would need to know which ones to validate.

I do like it but I’m still hesitant to embrace it since directives are not allowed on input and argument types/fields.

I got a bit confused here, because as far as I know, directives are allowed on input types and arguments.

There are actually two kinds of directives in GraphQL: schema directives (the spec calls them type system directives) and operation directives (executable directives in the spec).

Schema directive locations include:

SCHEMA
SCALAR
OBJECT
FIELD_DEFINITION
ARGUMENT_DEFINITION
INTERFACE
UNION
ENUM
ENUM_VALUE
INPUT_OBJECT
INPUT_FIELD_DEFINITION

And operation directives:

QUERY
MUTATION
SUBSCRIPTION
FIELD
FRAGMENT_DEFINITION
FRAGMENT_SPREAD
INLINE_FRAGMENT
VARIABLE_DEFINITION

Even the GraphQL spec gives an example of a directive used on an argument:

directive @example on FIELD_DEFINITION | ARGUMENT_DEFINITION

type SomeType {
  field(arg: Int @example): String @example
}

About this part:

it’s a really good hack that may build into a corner considering it doesn’t align with GraphQL design philosophy.

I wouldn’t really call it a hack :sweat_smile:, I’m using @graphql-tools, which is actually suggested by Apollo, and everything I’m doing is spec-compliant.

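As far as I know, what is not possible is applying a directive to an argument value inside an operation, like @someDirectiveY here. (A directive on the variable definition itself, like @someDirectiveX, is a separate case: the spec lists VARIABLE_DEFINITION among the executable locations.)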

mutation CreateUser ($data: CreateUserInput! @someDirectiveX) {
  createUser(data: $data @someDirectiveY) 

Honestly, I was hoping I could handle everything with just [MapperKind.INPUT_OBJECT_TYPE], but I ended up also needing [MapperKind.OBJECT_FIELD] to get the validation working. Maybe there’s a cleaner way I haven’t found yet.

I hesitated to use the word “hack” because I didn’t want to offend. I didn’t mean it as a jab by any means or with any disrespect.

I saw the same schema directive locations list, but aside from your snippet I haven’t seen any other examples of directives applying execution logic to inputs or arguments (namely the underlying implementation logic). Yet I’ve seen quite a few threads with people requesting the ability, and commenters saying it isn’t available intentionally because it goes against GraphQL design philosophy. There are also a lot of threads about not even being able to use the built-in @deprecated directive on input type and argument definitions.

I would be interested to see implementation examples for the argument directive in the GraphQL spec. Even more, I’d like to see an example where the directive is applied directly to an input type definition, along with its backing directive implementation.

As for the approach I’m exploring, I agree with you that it isn’t foolproof and is a hack (my words) in its own kind of way. I’m going to continue to explore it, though, and try to make it a bit more robust, like adding a check to ensure the dateRange input exists before applying the validation logic.

I’m also going to continue to try and educate myself better on what seems to be a discrepancy on applying directives directly to input type and argument definitions.

Here is an updated version of what I’m thinking (the schema stays the same, with the directive being applied at the operation level):

directive.ts (this needs to be cleaned up)

import { ApolloServerErrorCode } from '@apollo/server/errors';
import { getDirective, MapperKind, mapSchema } from '@graphql-tools/utils';
import {
  defaultFieldResolver,
  getNamedType,
  GraphQLError,
  GraphQLInputObjectType,
  GraphQLFieldConfig,
  GraphQLSchema,
} from 'graphql';

const argName = 'dateRange';
const argTypeName = 'DateRangeInput';
const directiveName = 'validateDateRange';

function validateArgsList(
  fieldConfig: GraphQLFieldConfig<any, any>,
): void {
  const { args = {} } = fieldConfig;
  const argsContainDateRangeInput = Object.values(args).some((arg) => {
    const name = arg.astNode?.name?.value;
    const namedType = getNamedType(arg.type);
    return (
      name === argName &&
      namedType instanceof GraphQLInputObjectType &&
      namedType.name === argTypeName
    );
  });

  if (!argsContainDateRangeInput) {
    throw new GraphQLError(
      `Argument list must contain "dateRange: DateRangeInput!" when "${directiveName}" directive is used.`,
      {
        extensions: {
          field: fieldConfig.astNode?.name?.value,
          description: fieldConfig.description,
        },
      },
    );
  }
}

function validateDateRange(schema: GraphQLSchema): GraphQLSchema {
  return mapSchema(schema, {
    [MapperKind.OBJECT_FIELD]: (fieldConfig) => {
      const validateDateRangeDirective = getDirective(
        schema,
        fieldConfig,
        directiveName,
      )?.[0];

      if (validateDateRangeDirective) {
        validateArgsList(fieldConfig);

        const { resolve = defaultFieldResolver } = fieldConfig;

        fieldConfig.resolve = async function (source, args, context, info) {
          const dateRange = args[argName];

          if (!dateRange) {
            return resolve(source, args, context, info);
          }

          /**
           * `startDate` and `endDate` are DateTimestamp scalars which are Luxon
           * DateTime objects. "DateTime implements #valueOf to return the epoch
           * timestamp, so you can compare DateTimes with `<`, `>`, `<=`, and `>=`.
           * That lets you find out if one DateTime is after or before another DateTime."
           * ref: https://moment.github.io/luxon/#/math?id=comparing-datetimes
           */
          if (dateRange.startDate > dateRange.endDate) {
            throw new GraphQLError(
              'Invalid date range: "startDate" cannot be after "endDate". Please ensure that the "startDate" is earlier than the "endDate".',
              {
                extensions: {
                  code: ApolloServerErrorCode.BAD_USER_INPUT,
                  field: argName,
                  details: {
                    startDate: dateRange.startDate.toUTC().toISO(),
                    endDate: dateRange.endDate.toUTC().toISO(),
                  },
                },
              },
            );
          }

          return resolve(source, args, context, info);
        };
      }
      return fieldConfig;
    },
  });
}

export default validateDateRange;

This adds build time schema validation and run time validation execution.

  • At build time, it checks that if the @validateDateRange directive is applied to an operation, the arguments list contains a dateRange argument of type DateRangeInput. If not, the build fails.
  • At execution time, it applies the validation to make sure the startDate field is not after the endDate field. If so, it executes the original resolver, and if not, it throws a validation error.

"valid - likeliest use case"
type Query {
  thingByDate(
    dateRange: DateRangeInput!
  ): ThingByDateResponse! @validateDateRange
}

"valid - DateRangeInput not required"
type Query {
  thingByDate(
    dateRange: DateRangeInput
  ): ThingByDateResponse! @validateDateRange
}

"valid - directive not applied"
type Query {
  thingByDate(
    dateRange: DateRangeInput!
  ): ThingByDateResponse!
}

"invalid - DateRangeInput not in arguments list"
type Query {
  thingByDate(
    dateRange: String!
  ): ThingByDateResponse! @validateDateRange
}

"invalid - dateRange field not in arguments list"
type Query {
  thingByDate(
    newDateRange: DateRangeInput!
  ): ThingByDateResponse! @validateDateRange
}

Thoughts?

I hesitated to use the word “hack” because I didn’t want to offend. I didn’t mean it as a jab by any means or with any disrespect.

No worries at all, it’s totally fine, I didn’t take it the wrong way! :blush:

You are right, we have few or no examples in arguments or inputs. I wish to see more of this too.

Your implementation is good, but I’m still unsure about how to handle cases where multiple dateRange arguments arrive in different fields or in nested inputs.
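
One possible direction for the nested case is to walk the resolver args recursively and check every object that carries both range fields. This is a sketch only: the function names, the default field names, and the assumption that values are plain Date objects (rather than Luxon DateTime objects) are all illustrative.

```typescript
interface RangeViolation {
  path: string;
  message: string;
}

// True for plain objects (not arrays, Dates, or null).
function isPlainObject(value: unknown): value is Record<string, unknown> {
  return (
    typeof value === "object" &&
    value !== null &&
    !Array.isArray(value) &&
    !(value instanceof Date)
  );
}

// Walk args recursively and check every object that has both range fields.
function findRangeViolations(
  value: unknown,
  path = "args",
  startKey = "startDate",
  endKey = "endDate",
): RangeViolation[] {
  const violations: RangeViolation[] = [];
  if (Array.isArray(value)) {
    value.forEach((item, i) => {
      violations.push(...findRangeViolations(item, `${path}[${i}]`, startKey, endKey));
    });
    return violations;
  }
  if (!isPlainObject(value)) return violations;

  const start = value[startKey];
  const end = value[endKey];
  if (start instanceof Date && end instanceof Date && start.getTime() > end.getTime()) {
    violations.push({ path, message: `"${startKey}" must be before "${endKey}"` });
  }
  for (const key of Object.keys(value)) {
    violations.push(...findRangeViolations(value[key], `${path}.${key}`, startKey, endKey));
  }
  return violations;
}
```

Returning paths rather than throwing on the first problem lets the caller report every offending range in a single error.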

Got any other input validation ideas? I’d like to try building some

Yea, that’s a good point, I’m gearing my solution towards my budding graph of less than a handful of developers.. I’d imagine it will need to mature and grow as the graph does to be viable in the long run.

For now, we’ve only seen operations with one “date range” so trying to just cover those bases.

I can’t think of any other use cases off the top of my head considering most other ones could probably be handled with validation built into a custom scalar (email, phone, etc.) and aren’t necessarily comparing n+ fields against one another.

I’ll come back with any that pop up though. Please do the same, too. I’m interested in this topic as it seems most things I’ve come across people end up just implementing a middleware or applying in-resolver helpers for validation.
