GraphQL Security Testing Without a Schema

By Alex Leahu
August 19, 2022

One of the main obstacles in a black box GraphQL security review is getting good coverage of the exposed functionality. Anyone who has reviewed a GraphQL API will have seen many requests that look something like this:

This isn't fun to look at, but more importantly, getting the coverage you need this way isn't feasible. You would need to spend a lot of time reviewing each request to determine its queries, arguments, and fields. Manipulating raw HTTP requests is also a poor way to test GraphQL endpoints; tools like GraphiQL are much better suited to the job. You are only stuck with raw requests when introspection is disabled. Otherwise, you could point GraphiQL (or a similar tool) at the GraphQL endpoint and have a fully populated schema to help you construct queries.

I wondered if it would be possible to passively observe traffic and piece together a GraphQL schema based on the queries that went through Burp Suite. If I could do this, it would let anyone interact with a GraphQL API through GraphiQL even without having the schema. Having a GraphQL schema is a considerable improvement over working with raw HTTP requests. After some trial and error, I developed a practical approach to do just that.

GraphQuail

I included this functionality as part of our Burp Suite extension, GraphQuail. This extension observes GraphQL API requests going through Burp and updates an internal GraphQL schema with each new query it sees. The extension also exposes GraphiQL and Voyager, and it returns a fake response built from that internal schema when it receives an introspection query. As a result, GraphQuail shows all the queries, arguments, and fields it has observed so far, ready for use against the API.

The following video demonstrates GraphQuail building a schema for a GraphQL API with introspection disabled. We tested this against a local Go application called Traggo. The application has introspection enabled by default, but for this test we treated it as if it were disabled.

Implementation Details

This section describes our approach to building a GraphQL schema using queries sent through Burp Suite.

Every GraphQL query that goes through Burp Suite gets sent to a query transformer function we built using graphql-java. This function takes an existing schema (basically empty at first) and the GraphQL query. We parse the GraphQL query into an AST and then create a schema based on it. We then merge the newly created schema with the existing schema. The updated schema is stored in memory. This process repeats with every GraphQL query that comes in. As a result, we build a schema piece by piece that we can use in other tools.
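The merge step described above can be sketched in a few lines. This is only an illustration in plain Python, not GraphQuail's code (which is built on graphql-java): the dictionary representation, the `merge_schemas` name, and the rule that a known type wins over `UnknownScalar` are all assumptions made for the example.

```python
from copy import deepcopy

UNKNOWN = "UnknownScalar"


def merge_schemas(existing, observed):
    """Merge two schemas given as {type_name: {field_name: field_type}}.

    Hypothetical representation: the real extension works on
    graphql-java schema objects, not dictionaries.
    """
    merged = deepcopy(existing)
    for type_name, fields in observed.items():
        target = merged.setdefault(type_name, {})
        for field_name, field_type in fields.items():
            # Assumed rule: only fill in a field that is new or still
            # a placeholder, so a known type is never downgraded.
            if target.get(field_name, UNKNOWN) == UNKNOWN:
                target[field_name] = field_type
    return merged


# The schema starts out empty and grows with each observed query.
schema = {}
schema = merge_schemas(schema, {
    "Query": {"company": "companyQueryType"},
    "companyQueryType": {"ceo": UNKNOWN, "name": UNKNOWN},
})
schema = merge_schemas(schema, {
    "companyQueryType": {"summary": UNKNOWN},
})
```

After the second call, `companyQueryType` holds all three fields even though no single query requested them together, which is the piece-by-piece growth the paragraph describes.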

I also want to give a big shout-out to AST explorer, a convenient tool for exploring how GraphQL queries and schemas are parsed.

Now, let's start with a basic query.

Input

{
  company(limit: 10) {
    ceo
    name
    summary
  }
}

Output

type Query {
  company(limit: Int): companyQueryType
}

type companyQueryType {
  ceo: UnknownScalar
  name: UnknownScalar
  summary: UnknownScalar
}

scalar UnknownScalar

Every schema needs a defined Query type. In addition, a default scalar is defined in the schema as UnknownScalar, which is set as the field type whenever we don't know what the type is.

In the example above, we assigned the UnknownScalar type to all three fields: ceo, name, and summary. The company query also needed a return type to hold those fields. We named it using the convention: field name + QueryType.

Here is a slightly more complicated query that uses fragments.

Input

fragment postData on Post {
  id
  title
  text
  author {
    username
    displayName
  }
  ... on Category {
      name
      id
  }
}
query getPost($author: String!) {
  getPosts(author: $author) {
    post {
      ...postData
    }
  }
}

Output

type Query {
  getPosts(author: UnknownScalar): getPostsQueryType
}

type authorQueryType {
  displayName: UnknownScalar
  username: UnknownScalar
}

type getPostsQueryType {
  post: postGetpostsType
}

type postGetpostsType {
  author: UnknownScalar
  id: UnknownScalar
  text: UnknownScalar
  title: UnknownScalar
}

scalar UnknownScalar

You might notice a pattern whereby types are named relative to a field's parent. We discuss some caveats to this approach in the following section.
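To make the naming pattern concrete, here is a rough Python sketch that derives type names from an already-parsed selection tree. It is not the extension's implementation: the nested-dictionary input, the `type_name` and `build_types` helpers, and the exact capitalization rule are assumptions inferred from names such as getPostsQueryType and postGetpostsType in the output above. Fragments and arguments are ignored.

```python
UNKNOWN = "UnknownScalar"


def type_name(field, parent=None):
    # Top-level fields become "<field>QueryType"; nested fields become
    # "<field><Parent>Type". str.capitalize() lowercases the rest of
    # the parent name, which yields "Getposts" from "getPosts".
    if parent is None:
        return f"{field}QueryType"
    return f"{field}{parent.capitalize()}Type"


def build_types(selection, owner="Query", parent=None, types=None):
    """Build {type_name: {field: type}} from a selection tree.

    `selection` maps each field name to a nested dict (an object
    field with sub-selections) or to None (a leaf field).
    """
    types = {} if types is None else types
    fields = types.setdefault(owner, {})
    for field, children in selection.items():
        if children is None:
            # Leaf field: its real type cannot be seen in the query.
            fields.setdefault(field, UNKNOWN)
        else:
            child_type = type_name(field, parent)
            fields[field] = child_type
            build_types(children, child_type, field, types)
    return types


types = build_types(
    {"getPosts": {"post": {"id": None, "title": None, "text": None}}}
)
```

Because a nested type's name depends only on the field and its immediate parent, two unrelated branches that share both names end up writing into the same type.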

Caveats

Doesn't Work with Automatic Persisted Queries

If the endpoint uses Automatic Persisted Queries, proxy detection won't work, because clients normally send a hash of the query in place of the query text. We need non-persisted queries going through Burp Suite to perform any detection.
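As an illustration of why persisted queries defeat passive detection, the sketch below checks whether a request body carries any query text. It assumes a JSON POST body and the `extensions.persistedQuery.sha256Hash` layout used by Apollo's persisted-query protocol; the `has_query_text` helper is hypothetical and is not part of GraphQuail. Batched (array) bodies are not handled.

```python
import json


def has_query_text(body):
    """Return True if a GraphQL POST body contains the query document."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    if not isinstance(payload, dict):
        # Batched requests (JSON arrays) are out of scope for this sketch.
        return False
    query = payload.get("query")
    return isinstance(query, str) and query.strip() != ""


# A regular request: the query text is there to be parsed.
full_request = '{"query": "{ company(limit: 10) { name } }"}'

# A persisted-query request: only a hash identifies the operation.
# "<hash>" is a placeholder, not a real digest.
persisted_request = (
    '{"extensions": {"persistedQuery": '
    '{"version": 1, "sha256Hash": "<hash>"}}}'
)
```

Only the first kind of request gives a passive observer anything to build a schema from.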

Argument Types Not Always Detected

We can detect argument types when the type is included in the query. We don't guess the type based on the variables, although that could be something we implement in the future. When we don't know the type, we just set it as UnknownScalar. As a result, you may get errors in GraphiQL when adding a variable with the correct type, just because we don't know what the real type is. When this occurs, try using an Integer instead of a String or a Float instead of an Integer.
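The behaviour described here can be approximated as follows. In the examples earlier in the post, the literal 10 produced Int and the variable $author produced UnknownScalar; the mappings for the other literal kinds, the `Variable` class, and the `infer_argument_type` name are assumptions added for illustration.

```python
UNKNOWN = "UnknownScalar"


class Variable:
    """Stands in for a variable reference such as $author."""

    def __init__(self, name):
        self.name = name


def infer_argument_type(value):
    """Guess a GraphQL scalar type from an inline argument value."""
    # bool is checked before int because True and False are also
    # instances of int in Python.
    if isinstance(value, bool):
        return "Boolean"
    if isinstance(value, int):
        return "Int"
    if isinstance(value, float):
        return "Float"
    if isinstance(value, str):
        return "String"
    # Variables and anything else carry no usable literal.
    return UNKNOWN


limit_type = infer_argument_type(10)
author_type = infer_argument_type(Variable("author"))
```

With these inputs, `limit_type` is "Int" and `author_type` is "UnknownScalar", matching the two generated schemas shown earlier.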

Generated Schema Is Not Real

The GraphQL schema generated is not a 1-to-1 representation of the actual schema. This is a limitation of building the schema through passive observation of queries because there isn't enough information to know the exact schema. This isn't too much of an issue for testing; we don't necessarily need the real schema, just something that gives the ability to exercise functionality.

Overly Unique Types

GraphQuail creates multiple types for what may be a single type in the real schema. This happens because we don't know if a field within one query has the same type as a field in another query, so we play it safe and treat them as distinct types.

Input

{
  getStudents {
    id
    name
    age
  }
  getTeachers {
    id
    name
    school
  }
}

Output

type Query {
  getStudents: getStudentsQueryType
  getTeachers: getTeachersQueryType
}

type getStudentsQueryType {
  age: UnknownScalar
  id: UnknownScalar
  name: UnknownScalar
}

type getTeachersQueryType {
  id: UnknownScalar
  name: UnknownScalar
  school: UnknownScalar
}

scalar UnknownScalar

In the example above, getStudents and getTeachers both return a Person type in the actual schema, but when we generated our schema, we created two distinct types.

Incorrectly Combined Types

Sometimes a type may have an invalid field because we combined two incompatible types. This happens when two fields share the same name and the same parent field name but have different types in the real schema. Here's an example:

Input

{
  users(limit: 10) {
    edge {
      node {
        name
      }
    }
  }

  buildings(limit: 10) {
    edge {
      node {
        address
      }
    }
  }
}

Output

type Query {
  buildings(limit: Int): buildingsQueryType
  users(limit: Int): usersQueryType
}

type buildingsQueryType {
  edge: edgeBuildingsType
}

type edgeBuildingsType {
  node: nodeEdgeType
}

type edgeUsersType {
  node: nodeEdgeType
}

type nodeEdgeType {
  address: UnknownScalar
  name: UnknownScalar
}

type usersQueryType {
  edge: edgeUsersType
}

scalar UnknownScalar

There is an edge field with a node subfield in the queries above. We requested name in the users query and address in the buildings query. Because types are named based on the field's parent, there ended up being a single type created called nodeEdgeType, as follows:

type nodeEdgeType {
  address: UnknownScalar
  name: UnknownScalar
}

The issue with this is that the address field isn't valid for the users query and the name field isn't valid for the buildings query. We treated the two node fields as the same type, but they aren't. The only way around this is to use __typename fields to detect the real type names. However, at that point, it would no longer be passive discovery (unless the application already adds __typename fields). We may implement this feature in the future.
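For applications that already request __typename, one possible approach is to read the real type names out of the responses. The sketch below is speculative and does not describe an existing GraphQuail feature: the `collect_typenames` helper and the User and Building type names are invented for the example.

```python
def collect_typenames(data, path=(), found=None):
    """Map each field path in a response to the __typename values seen."""
    found = {} if found is None else found
    if isinstance(data, dict):
        name = data.get("__typename")
        if isinstance(name, str):
            found.setdefault(path, set()).add(name)
        for key, value in data.items():
            if key != "__typename":
                collect_typenames(value, path + (key,), found)
    elif isinstance(data, list):
        # List items share the path of the field that returned them.
        for item in data:
            collect_typenames(item, path, found)
    return found


# Hypothetical response "data" for the users/buildings query, as it
# might look if the client had added __typename to each node.
response_data = {
    "users": {"edge": {"node": {"__typename": "User", "name": "a"}}},
    "buildings": {
        "edge": {"node": {"__typename": "Building", "address": "b"}}
    },
}

typenames = collect_typenames(response_data)
```

Here the two node fields resolve to different type names, which would be enough to keep them from being merged into a single nodeEdgeType.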

Future Work

We envision GraphQuail as a GraphQL security testing toolkit for Burp Suite and plan to add new features over time. Check out the repository for future updates and roadmap.