How to Build a GraphQL Client Cache — Part I

9 min read · Jan 27, 2016

I like GraphQL. I might even love it. It replaces your validation, your ORM, your mess of REST endpoints, and it offers you GraphiQL so you can debug queries and train new employees super fast. On top of all that, you can learn it in an afternoon.

Relay is different. It’s huge. It’s tightly coupled to React. It requires a major overhaul on the server and client. It takes a week to learn. I’m not saying it’s bad. It’s super powerful, ultra efficient, and if you have a business need that matches Facebook’s, it’s just for you. But I need something small, flexible, front-end agnostic, and stupid easy (just like me, ayyyooooo).

So, I decided to build my own GraphQL client cache. I’ll call it cashay (Get it? It’s how fancy folks pronounce “cache” AND it rhymes with Relay. I’m hilarious.) I still haven’t started building it. I’ll build and blog at the same time. That way, you can see what’s going on in my head… if you’re into that kinda thing.

Scope the MVP

I never formally studied computer science. I was bit-twiddling before puberty & figured no old guy in a classroom could teach me something I couldn’t learn online faster, so, I opted to become a project manager. I use that PM approach when I build software and, as any PM knows, the first step is to define scope, so let’s do it:

For this MVP I want to only concern myself with queries (mutations and subscriptions will come later).
I don’t want to worry about arguments. I think that’ll be fairly easy to implement later.
I’ll ignore unions and fragments to make it stupid simple.
I’ll rule out pagination; that is a HUGE can of worms. But as a quick note to future me: I like cursor-based pagination (it’s efficient for DBs).
I’ll allow for modeling 1:M (and therefore M:N) relations because if I don’t now, I’ll have to completely redesign the cache later.
I’ll use Redux as the store since it’s a developer’s dream (time travel, easily view what’s inside, and a clean lower-level API for advanced uses).

Write the API

I always start at the desired result and work backwards (if you write tests, you do this, too). That means we start at the external API. Ever look at a package on NPM and think the author just wrote junk & then slapped an export in front of a couple functions?

Ugly UX kills a product, and developers (especially JavaScript developers) are some of the most bratty, judgmental, entitled pieces of crap out there, so if your API sucks, you’ll hear it.

I want my API to be dead simple. Something like cashay.query(queryString, options). Boom. A single method you can plant anywhere that is front-end agnostic. The “query” method name jibes with the GraphQL lexicon. You pass in a GraphQL query string that you already wrote. Finally, there are some options. I’ll probably need a transport since I’ll use this via HTTP and WS, and who knows what else (forceFetch boolean, maybe?). Using an options object allows for adding features without a breaking API change, so who cares, we’re future-proof!

Map the Flow — GET

When the query method is called, I want it to see if the store has what it needs. If it doesn’t, I want it to get what it needs from the server. In pseudocode:
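Here is a minimal sketch of that flow, under some loud assumptions: the store is a plain object keyed by the raw query string (the real cache will be normalized), the transport is synchronous so the example runs as-is, and createCashay is a hypothetical name, not a finished API.

```javascript
// Hypothetical factory: returns an object exposing query(queryString, options).
function createCashay(store = {}) {
  return {
    query(queryString, options = {}) {
      const { transport, forceFetch = false } = options;
      // 1. See if the store already has what the query needs.
      //    Assumption: results are keyed by the raw query string.
      if (!forceFetch && Object.prototype.hasOwnProperty.call(store, queryString)) {
        return store[queryString];
      }
      // 2. Otherwise, get it from the server through the supplied transport.
      if (typeof transport !== 'function') {
        throw new Error('A transport is required when the store cannot answer the query');
      }
      const response = transport(queryString);
      // 3. Stick the result into the store for next time.
      store[queryString] = response.data;
      return store[queryString];
    },
  };
}

// Usage with a fake, synchronous transport that counts server round trips.
let serverCalls = 0;
const fakeTransport = () => {
  serverCalls += 1;
  return { data: { getAllLanes: [{ id: 'lane1' }] } };
};
const cashay = createCashay();
const first = cashay.query('{ getAllLanes { id } }', { transport: fakeTransport });
const second = cashay.query('{ getAllLanes { id } }', { transport: fakeTransport });
```

The second call never touches the fake server, which is the whole point of the cache. A real transport (HTTP or WS) would be asynchronous, so query would likely return a promise or notify subscribers instead.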

I see 3 tricky parts:

Parsing the query string
Finding out what the store is missing
Sticking the GraphQL result into the store

I can’t parse the query string without knowing what I want it to look like. Ergo, I can’t solve #2 either. So, let’s tackle that last problem to figure out the shape of our data.

Map the Flow — SET

Given a (possibly nested) JSON object, I need to normalize it and stick it in a redux store. Normalizing makes it easier for my reducers & ensures a single source of truth. Thankfully, a tool already exists: https://github.com/gaearon/normalizr. Go read the API now. You feed it a JSON response and a schema, and it’ll normalize the response according to that schema. So where do we get the schema? From GraphQL, hopefully.
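To make "normalize" concrete, here is a hand-rolled illustration of what happens to a nested lanes response. This is not normalizr's API, just a sketch of the idea for this one shape; the title field and the sample ids are assumptions made up for the example.

```javascript
// Hand-rolled illustration of normalization; this is NOT normalizr's API.
function normalizeLanes(lanes) {
  const entities = { lanes: {}, notes: {} };
  const result = lanes.map((lane) => {
    const noteIds = (lane.notes || []).map((note) => {
      // Store each note once, pointing back at its lane by id.
      entities.notes[note.id] = { ...note, lane: lane.id };
      return note.id;
    });
    // Store the lane with note ids instead of nested note objects.
    entities.lanes[lane.id] = { ...lane, notes: noteIds };
    return lane.id;
  });
  return { entities, result };
}

const nested = [
  { id: 'lane1', title: 'To do', notes: [{ id: 'note1', title: 'Write part 2' }] },
  { id: 'lane2', title: 'Done', notes: [] },
];
const normalized = normalizeLanes(nested);
```

Every lane and note now lives in exactly one place, keyed by id, and relations are expressed as ids. That is the single source of truth a reducer wants.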

And there we have it! We know what our package will do, what its API looks like, and a basic flow. Now it’s time to dig into the weeds.

Turning a GraphQL Schema into a normalizr schema

Credit where credit is due, the visionary Huey Peterson (http://hueypetersen.com/) already tackled this problem, but I’m going to do it a little differently.

Defining the end result: a normalizrSchema

As before, let’s start at the end and work backwards to get there. As a guinea pig, I’ll use my Meatier repo (https://github.com/mattkrick/meatier), which has a schema for a dead simple Kanban. It has 2 tables: Lanes and Notes. 1 Lane has many Notes. 1 Note has 1 Lane. Since I’m not taking arguments, let’s assume I’ll return all the lanes. So, our schema looks like this (remember GraphQL serves the result inside a data prop):

Now, let’s figure out how to achieve this given a GraphQL schema and query string.

Making a client schema

I need a copy of my GraphQL schema without all the server-specific resolve methods in it. So, I’ll use GraphQL’s baked-in introspection:

Now, every query, mutation, subscription, object, and scalar you ever wrote resides as a Type in this schema (visual below). There’s just one problem… the response will follow the shape of a query string, which can take on any shape imaginable and include circular references, meaning there are an infinite number of possible schemas. Yikes. So, let’s be lazy and only make a normalizr schema for each specific query string.

Parsing the query string

Checking out the GraphQL source code, it looks like the heavy lifting is done for us. GraphQL has a parse function that takes in the query string and spits out what it calls a Document, which is an Abstract Syntax Tree (AST). An AST is what your JavaScript code is turned into before it’s compiled by an engine like V8. It’s also what Babel uses to transpile your code from ES2015+ into ES5. They’re neat. I learned more here: https://github.com/thejameskyle/babel-handbook/blob/master/translations/en/plugin-handbook.md

Let’s begin with a query that covers our entire scope (now you see why a small scope is important!)

You can think of this query as a test (no, you don’t need a test framework to write a test). Here’s what we’re testing for:

Return a multi-part query
Return a query that is an array (getAllLanes)
Return a nested array (notes)
Return a circular reference (lane)
Return new props from a nested object (updatedAt)
Return a query that is an object (getFirstNote)

If we succeed in normalizing this query, we’ll fulfill our scope!
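For reference, a query covering that checklist could look something like the sketch below. It is a reconstruction for illustration only: the names getAllLanes, notes, lane, updatedAt, and getFirstNote come from the list above, while id, title, and the exact placement of updatedAt are assumptions.

```javascript
// Illustrative query string; id, title, and where updatedAt sits are assumptions.
const queryString = `
  query {
    getAllLanes {
      id
      notes {
        id
        lane {
          id
          updatedAt
        }
      }
    }
    getFirstNote {
      id
      title
    }
  }
`;

// Sanity check: every name from the checklist appears in the query.
const checklistNames = ['getAllLanes', 'notes', 'lane', 'updatedAt', 'getFirstNote'];
const missing = checklistNames.filter((name) => !queryString.includes(name));
```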

Matching the Document to the Schema

Now we need to mesh the Document (the parsed query string) to the Schema (the introspected GraphQL Schema). So, I’ll match up each Field to its corresponding Type from the schema.

From digging through the Document that I printed to my console (below), I can learn a little bit about the shape. I’ll give things more descriptive names to keep it clear in my head:

Each Document can have many definitions (query, mutation, subscription).
Each definition can have many selections (AKA queryObjects).
Each queryObject can have many selections (AKA fieldObjects).
Each fieldObject can have many selections (AKA fieldObjects).

The Document. If selections were turtles, it’d be turtles all the way down.

To get to the query I want, it looks like I’ll need queryType from the schema and queryObject.name.value from the Document:
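As a rough sketch of that lookup, using hand-written stand-ins for the introspected schema and a parsed Field (only the parts needed here). The root type name Query and the helper name findQueryField are assumptions for the example.

```javascript
// Hand-written stand-ins shaped like an introspection result and a parsed Field.
const clientSchema = {
  queryType: { name: 'Query' }, // assumption: the root query type is named Query
  types: [
    {
      kind: 'OBJECT',
      name: 'Query',
      fields: [
        {
          name: 'getAllLanes',
          type: { kind: 'LIST', name: null, ofType: { kind: 'OBJECT', name: 'Lane' } },
        },
        { name: 'getFirstNote', type: { kind: 'OBJECT', name: 'Note', ofType: null } },
      ],
    },
  ],
};
const queryObject = { kind: 'Field', name: { kind: 'Name', value: 'getAllLanes' } };

// Hypothetical helper: find the schema field that a queryObject refers to.
function findQueryField(schema, fieldNode) {
  // queryType only holds a name, so look the full type up in the types list.
  const rootType = schema.types.find((type) => type.name === schema.queryType.name);
  if (!rootType) return undefined;
  return rootType.fields.find((field) => field.name === fieldNode.name.value);
}

const matchedField = findQueryField(clientSchema, queryObject);
```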

To get there programmatically, we’re in luck again! GraphQL gives us the visit function, which uses the visitor pattern (just like Babel) to perform a depth-first traversal.
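If the visitor pattern is new to you, this simplified stand-in shows the idea. It is not graphql-js's visit: it only follows selectionSet.selections and only supports enter and leave hooks, and the names visitSelections and makeField are made up for the example.

```javascript
// Simplified stand-in for a visitor-based traversal; NOT graphql-js's visit().
function visitSelections(node, visitor) {
  if (visitor.enter) visitor.enter(node);
  // Scalars have no selectionSet, so they have no children to descend into.
  const selections = node.selectionSet ? node.selectionSet.selections : [];
  selections.forEach((child) => visitSelections(child, visitor));
  if (visitor.leave) visitor.leave(node);
}

// Tiny builder for Field-shaped nodes so the sample tree stays readable.
const makeField = (value, selections) => ({
  kind: 'Field',
  name: { kind: 'Name', value },
  selectionSet: selections ? { kind: 'SelectionSet', selections } : null,
});
const tree = makeField('getAllLanes', [
  makeField('id'),
  makeField('notes', [makeField('id')]),
]);

// Record the order in which nodes are entered and left.
const events = [];
visitSelections(tree, {
  enter: (node) => events.push(`enter ${node.name.value}`),
  leave: (node) => events.push(`leave ${node.name.value}`),
});
```

Each node is entered before its children and left after them, which is what makes a depth-first walk a natural fit for tracking nesting.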

Time to Code: Document + Schema = normalizrSchema

Look how far we’ve come without even writing a single line of (meaningful) code! This isn’t an accident. We’re human. Planning and coding are mutually exclusive. “Oh but I plan better when I code!” Hogwash. That’s the nerd equivalent of, “I drive better when I’m drunk.”

All we need to do is write a visitor to traverse the Document tree. Easy enough… except I have no idea what that means or how to do it. Luckily, GraphQL has a tutorial on how to do it baked into the freaking source code! https://github.com/graphql/graphql-js/blob/master/src/language/visitor.js

Seriously, could they make it any easier?

From my list of bullet points above, I’ll start with the Document. I want to make sure no one is trying to do something fancy like call a query and mutation at the same time:
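A minimal sketch of that guard, written as a standalone function so it runs without a visitor. The function name and error message are hypothetical; the definitions array is the part that comes from the Document.

```javascript
// Hypothetical guard: reject Documents that bundle more than one operation.
function getSingleDefinition(document) {
  const { definitions } = document;
  if (!Array.isArray(definitions) || definitions.length !== 1) {
    throw new Error('Expected exactly 1 definition per query string');
  }
  return definitions[0];
}

// Hand-written Document stand-in with a single query operation.
const okDocument = {
  kind: 'Document',
  definitions: [{ kind: 'OperationDefinition', operation: 'query' }],
};
const definition = getSingleDefinition(okDocument);
```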

Now that we are sure we only have 1 definition, let’s tackle the next bullet point and figure out if our operation is a query, mutation, or subscription:
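One way to sketch this step: the definition's operation is the string query, mutation, or subscription, and the introspected schema has matching queryType, mutationType, and subscriptionType entries. The helper name and the sample type names below are assumptions.

```javascript
// Hypothetical helper: map an operation to its root type in the client schema.
function getOperationSchema(schema, definition) {
  const rootKey = `${definition.operation}Type`; // e.g. 'queryType'
  const root = schema[rootKey];
  if (!root) {
    throw new Error(`Schema has no ${definition.operation} type`);
  }
  // The root entry only holds a name, so find the full type in the types list.
  return schema.types.find((type) => type.name === root.name);
}

// Hand-written stand-in for an introspected schema (type names are assumptions).
const sampleSchema = {
  queryType: { name: 'Query' },
  mutationType: { name: 'Mutation' },
  subscriptionType: null,
  types: [
    { kind: 'OBJECT', name: 'Query', fields: [] },
    { kind: 'OBJECT', name: 'Mutation', fields: [] },
  ],
};
const opSchema = getOperationSchema(sampleSchema, { operation: 'query' });
```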

Now we’re hoein’ where there’s taters! We just figured out we’re handling a query operation and we know where those queries live in the schema. We’ll save that to an opSchema variable outside the visitor scope so our other methods can use it.

Next we have to handle Fields, which, as our bullet points remind us, refer to both queries and types. Since our end result is a populated object, let’s start with an empty one. I’ll give it a single method so it follows the same API as normalizr.normalize.

Since we are dealing with nesting, we’ll need an array we can use as a stack to figure out where we are (root? inside a query? inside a field? what field?). When we enter a field, we can push to the stack. When we leave it, we pop it. I don’t know if this is the best way to do things using the visitor pattern (honestly, I’ve never used it before), but we’ll find out. Finally, we need a helper. Given our normalizrSchema and stack, we want our current position that we can use for our parent:
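A sketch of such a helper, assuming the normalizrSchema is a tree of plain nested objects keyed by field name and the stack is an array of those names. Both the shape and the name getCurrentPosition are assumptions for illustration.

```javascript
// Hypothetical helper: follow the stack of field names down from the root.
function getCurrentPosition(normalizrSchema, stack) {
  return stack.reduce(
    // Stop descending (and return undefined) once a key is missing.
    (position, key) => (position == null ? undefined : position[key]),
    normalizrSchema
  );
}

// Assumed shape: nested plain objects keyed by field name.
const sampleNormalizrSchema = { getAllLanes: { notes: { lane: {} } } };

const atRoot = getCurrentPosition(sampleNormalizrSchema, []);
const insideNotes = getCurrentPosition(sampleNormalizrSchema, ['getAllLanes', 'notes']);
const nowhere = getCurrentPosition(sampleNormalizrSchema, ['getFirstNote', 'lane']);
```

With an empty stack the helper returns the root itself, so the same code path works for queryObjects and nested fieldObjects.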

Alright, we’re ready to prove that we’re not that weirdo who browses programming blogs all day but never writes any code:

Yikes. We’re pushing my personal limit of 30 lines per function! Let’s think this thing out. First, a normalizrSchema doesn’t care about scalars, it cares about objects; so, we can ignore anything that doesn’t have a selectionSet. Next, we need a way to figure out if it’s a queryObject or fieldObject (that list of bullet points above is pretty handy, huh?). If it’s a queryObject, it’ll exist in our opSchema, so we look for it there. Otherwise, we know it’s a fieldObject and we can find it in our parent, which should be the last item in our stack.

Next, if getNormalizrValue (below) returns a result, we append it to our normalizrSchema. Finally, we push & pop, just like we talked about above. Just one little helper to go:

First, we check to see if it might be a normalizr Entity, which means it’s an object with an id. Otherwise, we check to see if it’s an array or union. If so, we grab the name from ofType because we want e.g. Lane instead of getAllLanes.
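The ofType part can be sketched like this. In an introspection result, wrapper types such as LIST have a null name and point at the wrapped type through ofType. This helper is an illustration, not the helper described above: the name unwrapType is made up, and handling NON_NULL wrappers is an extra assumption.

```javascript
// Hypothetical helper: unwrap NON_NULL and LIST wrappers from an introspected
// type reference until a named type is reached.
function unwrapType(typeRef) {
  let current = typeRef;
  let isList = false;
  while (current && (current.kind === 'NON_NULL' || current.kind === 'LIST')) {
    if (current.kind === 'LIST') isList = true;
    current = current.ofType;
  }
  return {
    name: current ? current.name : undefined,
    kind: current ? current.kind : undefined,
    isList,
  };
}

// A non-null list of Lane objects, as introspection would describe it.
const lanesType = {
  kind: 'NON_NULL',
  name: null,
  ofType: { kind: 'LIST', name: null, ofType: { kind: 'OBJECT', name: 'Lane', ofType: null } },
};
const unwrappedLanes = unwrapType(lanesType);
const unwrappedNote = unwrapType({ kind: 'OBJECT', name: 'Note', ofType: null });
```

Knowing both the underlying name and whether a list was involved is enough to decide between a single entity and an array of them.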

Scope creep alert! It’s tempting to ensure unions are working right now, but they’re out of scope. Instead, we’ll write a line or 2 that we can use as a placeholder to test them after the MVP ships.

And there we have it! 4 hours and <80 lines of code later and we’ve made a schema factory that maps our GraphQL response.data to a normalized schema that we can stick in a redux store, so let’s do it!

Building the Redux Reducer

Like what you’re reading? Subscribe to read mo….

Just kidding! But seriously, blogging while coding is slow. I hope this gives you the confidence to approach more challenging coding projects. If this kinda thing is useful, give it a heart. If it hits 100 hearts by March, I’ll write part 2. Otherwise, be sure to check out cashay when it’s all finished. As always, I prefer learning to teaching, so if you know how to do something better, don’t be shy.