Validating POST Data Against JSON Schema


by Simon MacDonald
@macdonst

[Image: complicated forms. Photo by Sincerely Media on Unsplash]

I often create routes that accept POSTed data that will eventually be stored in a database. These routes support both the application/json and application/x-www-form-urlencoded Content-Types. In Architect, a typical POST handler looks like this:

// POST data
import arc from '@architect/functions'
import json from './json.mjs'
import HTML from './html.mjs'

export const handler = arc.http.async(json, HTML)

Instead of duplicating the validation code in the json and HTML handlers, it makes sense to put a validate middleware step before attempting to store the data.

// POST data
import arc from '@architect/functions'
import json from './json.mjs'
import HTML from './html.mjs'
import validate from './validate.mjs'

export const handler = arc.http.async(validate, json, HTML)
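
To make the idea of a middleware step concrete, here is a minimal sketch of how such a chain can behave: each step receives the request, and the first step that returns a response ends the chain. This is a simplified model and not Architect's implementation; `runChain` and the stand-in `validate` and `json` steps are hypothetical names used only for illustration.

```javascript
// Simplified model of a middleware chain; NOT Architect's implementation.
// runChain is a hypothetical helper used only for illustration.
function runChain (...steps) {
  return async function handler (req) {
    for (const step of steps) {
      const result = await step(req)
      // A step that returns something ends the chain with that response.
      if (result !== undefined) return result
    }
    // No step produced a response.
    return { statusCode: 500, json: { error: 'No response returned' } }
  }
}

// Hypothetical stand-in for validate.mjs: rejects a body without a string title.
async function validate (req) {
  if (typeof req.body.title !== 'string') {
    return { statusCode: 400, json: { error: 'title must be a string' } }
  }
  // Returning nothing hands the request to the next step.
}

// Hypothetical stand-in for json.mjs: pretends to store the data.
async function json (req) {
  return { statusCode: 200, json: { saved: req.body } }
}

const handler = runChain(validate, json)
```

With this model, a request whose body fails the check never reaches the step that stores the data, which is the reason for putting validation first.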

The question is, how do we validate our data? This is where JSON Schema comes into play.

JSON Schema

JSON Schema is a vocabulary for making assertions about JSON documents. You can use a JSON Schema document to annotate and validate other JSON documents.

A schema specifies the structure of the incoming JSON document, along with validation rules for which values are acceptable for each property.

Let’s look at a JSON Schema for a Book with three properties: title, author, and publication date.

// book.schema.json
{
  "id":"Book",
  "type":"object",
  "properties": {
    "title": { "type": "string" },
    "author": { "type": "string" },
    "publication_date": { "type": "integer" }
  }
}

The title and author are strings, but the publication date is an integer. This means the following JSON payload will pass validation:

{
  "title":"Modern Software Engineering",
  "author": "Dave Farley",
  "publication_date": 2021
}

The following payload, however, will fail validation:

{
  "title":"Modern Software Engineering",
  "author": "Dave Farley",
  "publication_date": "2021"
}

It fails because the publication date is a string, not an integer. Failing the validation step helps prevent garbage from entering our database.
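
The difference between the two payloads can be sketched in a few lines of JavaScript. This is a hand-rolled check of the `type` keyword only, standing in for a real JSON Schema validator; the `matchesType` and `typeErrors` helpers are hypothetical and cover just the two types used in the Book schema.

```javascript
// Hand-rolled illustration of the "type" keyword only; a real validator
// covers far more of the JSON Schema specification.
const Book = {
  id: 'Book',
  type: 'object',
  properties: {
    title: { type: 'string' },
    author: { type: 'string' },
    publication_date: { type: 'integer' }
  }
}

// Hypothetical helper: true when value satisfies the given schema type.
function matchesType (value, type) {
  if (type === 'integer') return Number.isInteger(value)
  if (type === 'string') return typeof value === 'string'
  return true // other types are out of scope for this sketch
}

// Hypothetical helper: lists the properties whose values have the wrong type.
function typeErrors (payload, schema) {
  return Object.entries(schema.properties)
    .filter(([key, rule]) => key in payload && !matchesType(payload[key], rule.type))
    .map(([key, rule]) => `${key} is not of type ${rule.type}`)
}

const base = { title: 'Modern Software Engineering', author: 'Dave Farley' }
const passing = typeErrors({ ...base, publication_date: 2021 }, Book)
const failing = typeErrors({ ...base, publication_date: '2021' }, Book)
```

Here `passing` is an empty list, while `failing` holds a single message about `publication_date`, because the string `'2021'` is not an integer even though it looks like one.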

Validation Middleware

Now that we have a way of specifying what our data should look like, let’s write some middleware to validate the input.

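// validate.mjs
import { readFileSync } from 'node:fs'
import { Validator } from 'jsonschema'

// Load the schema from book.schema.json (assumed to sit next to this module)
const Book = JSON.parse(readFileSync(new URL('./book.schema.json', import.meta.url), 'utf8'))

export default async function validate(req) {
  const v = new Validator()
  const res = v.validate(req.body, Book)
  if (!res.valid) {
    // Invalid input is a client error, so respond with 400 and end the chain
    return {
      statusCode: 400,
      json: { error: res.errors.map(e => e.stack).join('\n') }
    }
  }
  // Returning nothing passes the request on to the next handler
}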

This code works great as long as the Content-Type is application/json. Unfortunately, if you are responding to a form post and the Content-Type is application/x-www-form-urlencoded, all of the properties in the request body will be strings. So the validation of the publication date will fail.
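
To see the problem concretely, the sketch below parses a form-encoded body with Node's built-in `URLSearchParams` and then converts the values to the types the schema asks for. The `coerce` helper is hypothetical and only illustrates the general idea of schema-driven coercion; it is not taken from any particular library and handles only flat objects with numeric properties.

```javascript
// Sketch of schema-driven coercion for a flat form body; illustrative only.
const Book = {
  id: 'Book',
  type: 'object',
  properties: {
    title: { type: 'string' },
    author: { type: 'string' },
    publication_date: { type: 'integer' }
  }
}

// Parse a raw form-encoded body: every value comes back as a string.
const raw = 'title=Modern+Software+Engineering&author=Dave+Farley&publication_date=2021'
const form = Object.fromEntries(new URLSearchParams(raw))

// Hypothetical helper: convert string values to the types the schema expects.
function coerce (data, schema) {
  const out = {}
  for (const [key, value] of Object.entries(data)) {
    const type = schema.properties[key]?.type
    const num = Number(value)
    const numeric = type === 'integer' || type === 'number'
    if (numeric && value.trim() !== '' && Number.isFinite(num)) {
      out[key] = num
    } else {
      // Leave the value untouched so a validator can report the mismatch.
      out[key] = value
    }
  }
  return out
}

const book = coerce(form, Book)
```

Before coercion, `form.publication_date` is the string `'2021'`; afterwards, `book.publication_date` is the number `2021`, which would satisfy the schema's integer rule.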

This is where @begin/validator comes in.

@begin/validator

The @begin/validator package:

Validates request bodies against a provided JSON Schema. Supported Content-Types include application/json and application/x-www-form-urlencoded. JSON request bodies are validated directly against the schema, while form-encoded bodies are coerced into schema format.

This results in a slight change to our validate middleware but hides the complexity of converting URL form-encoded data to an object that conforms to the structure of the JSON Schema.

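// validate.mjs
import { readFileSync } from 'node:fs'
import validator from '@begin/validator'

// Load the schema from book.schema.json (assumed to sit next to this module)
const Book = JSON.parse(readFileSync(new URL('./book.schema.json', import.meta.url), 'utf8'))

export default async function validate(req) {
  const res = validator(req, Book)
  if (!res.valid) {
    // Invalid input is a client error, so respond with 400 and end the chain
    return {
      statusCode: 400,
      json: { error: res.errors.map(e => e.stack).join('\n') }
    }
  }
}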

@begin/validator is in alpha right now, and pull requests are welcome!

Next Steps

JSON Schema is a valuable tool for validating input, but maybe, just maybe, we could also use it to create input forms automatically 🤔.