Adding Semantic Information to Your OpenAPI Description
Helping machines navigate your API results shouldn't be complicated.
Today, there's nothing more relevant than knowing the meaning of things. That's what adding semantic information does to data. It paints what would otherwise be pale into a world of vividness. To APIs, it makes responses self-explanatory and navigable. Suddenly, API client software can understand what to do with the information it receives. The names of fields become more than that, conveying their properties and relationships to other fields. With semantic information, your API truly becomes machine-readable. So, how do you add it? Stick with me to see how straightforward it can be.
We're all tired of building API integrations by hand. It's one thing to write a piece of software for the very first time. Out of pure excitement, you might find that absorbing. However, after a few times, you feel like you could be doing better things with your time. You then look for ways to automate the process of consuming APIs. While automating the act of connecting to an API is trivial (you just pick a REST SDK, for example), making the client understand the response and what it can do with it is far from simple. Even more complicated is making the same client understand any API response.
Let me give you an example. Suppose there's an API that returns book information. Connecting to it is easy with a ready-to-use SDK or even by writing your own API client. Let's say you make a request to the /books path to retrieve a list of books. Then you program the client to retrieve the name of the author of the first book on the list. So far, so good. Now, suppose you find another API that also has a /books path. While you might be lucky and be able to use the same client to get the book author's name, that is not guaranteed to happen. Now, imagine a different scenario where you want to use the same client to get the author of a song, not a book. It's practically guaranteed that you'd have to change your code.
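To make the problem concrete, here is a minimal Python sketch. Both response shapes are made up for illustration and don't come from any real API; the point is only that a lookup written against one API's field names silently stops working against another, even though both expose a /books path.

```python
from typing import Optional

# Two hypothetical /books responses. Neither shape comes from a real API.
api_a_response = {
    "books": [{"title": "Building an API Product", "author": {"name": "Bruno Pedro"}}]
}
api_b_response = {
    "items": [{"name": "Building an API Product", "written_by": "Bruno Pedro"}]
}


def first_author_name(response: dict) -> str:
    """Return the author of the first book, hard-coding API A's field names."""
    return response["books"][0]["author"]["name"]


def try_first_author_name(response: dict) -> Optional[str]:
    """Same lookup, but return None instead of raising when the shape differs."""
    try:
        return first_author_name(response)
    except (KeyError, IndexError, TypeError):
        return None


print(try_first_author_name(api_a_response))  # works: the shape matches
print(try_first_author_name(api_b_response))  # None: same data, different shape
```

Nothing in the second response tells the client that written_by plays the same role as author.name. That missing piece is what semantic information is meant to supply.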
What's missing in the previous examples is something that helps the client code get the information it needs about the properties of the objects the API returns. With those properties available, the client code can be as generic as you want and still work with different types of objects. The first information the client code needs is related to the resources and collections the API makes available. Then, for each available resource, it needs to know its attributes and its relationship to other resources. With this information, the client can navigate the information the API returns.
The first application of such a client is an "API browser." Imagine being able to connect to any REST API and browse through its available information just like you do with any regular Website. That would be powerful, wouldn't it? Well, GraphQL actually supports the kind of API browsing I'm implying here. Just open any GraphiQL client on a Web browser and you'll be able to connect to any GraphQL API. However, the same thing isn't available with REST APIs, generically speaking, which is my focus. If an API browser sounds exciting, imagining other applications isn't far-fetched. What types of things could become better by having access to semantic information?
Anything with heavy machine participation could benefit from semantic information, including any application where AI needs to interface with external operations. It was precisely this situation that prompted me to learn how to add semantic information to an existing OpenAPI description. I thought that if I could semantically augment an OpenAPI document, then I could feed it to any AI-led software. AI agents would be able to understand the meaning behind the API description and use it autonomously. They would know how to compose the response of an API operation with other existing data. They'd know what data to send as the input of any API operation. They'd know how to combine multiple API operations into a meaningful sequence to orchestrate a workflow. In summary, AI agents would be able to interact with APIs as we humans do now.
So, "How can we add semantic information to an OpenAPI description?" I thought. What I found out is that you can use two layers of semantics (well, at least two; I imagine you can use more) to enrich an OpenAPI document. The first layer adds context and information about data types to OpenAPI data schemas. With the second layer, which "lies on top" of the first one, you add discoverability information. You can add the first layer with the help of JSON-LD, the common data types from schema.org, and two OpenAPI attributes I didn't know about before: x-jsonld-context and x-jsonld-type. The second layer is possible by adding Hydra attributes such as hydra:member and hydra:totalItems, and using specific JSON-LD types such as hydra:Collection. Let's dig deeper into each one of those formats and technologies and see a couple of examples.
The goal of JSON-LD is to transform disorganized data into pieces of linked information. The "LD" stands for, unsurprisingly, "Linked Data." By adding linking information, you can help a machine understand how attributes are related. In other words, having linking information adds context to data. JSON-LD 1.1, the current version, has been a W3C Recommendation since July 2020. There are numerous implementations in different programming languages such as JavaScript, Python, and Go. Here's a trivial example of a JSON-LD document depicting my book "Building an API Product":
{
  "@context": "https://schema.org",
  "@type": "Book",
  "name": "Building an API Product: Design, Implement, Release, and Maintain API Products That Meet User Needs",
  "author": {
    "@type": "Person",
    "name": "Bruno Pedro",
    "sameAs": "https://orcid.org/0009-0006-1048-4848"
  },
  "publisher": {
    "@type": "Organization",
    "name": "Packt Publishing"
  },
  "datePublished": "2024-01-25",
  "isbn": "978-1837630448",
  "numberOfPages": 278,
  "inLanguage": "en",
  "url": "https://www.amazon.com/Building-API-Product-implement-maintain/dp/1837630445"
}
As you can see, some attributes are links to documents that exist elsewhere. That gives more context to the information available in this document. In particular, look at the @context attribute. It establishes that all data types referenced in this document exist within the list of common types available on schema.org. So, for example, the @type attribute inside author references https://schema.org/Person, which is a real common data type (try clicking on that link and you'll see). So, how do we now add this contextual information to an existing OpenAPI document? To do that we'll follow the REST API Linked Data Keywords Internet Draft. It defines two keywords you can use to provide semantic information, x-jsonld-context and x-jsonld-type. These keywords provide, respectively, the same information as the JSON-LD @context and @type attributes you saw in the previous example. The goal now is to enrich an OpenAPI schema with semantic information. Let's look at an example YAML fragment of an OpenAPI schema defining a Book:
components:
  schemas:
    Book:
      type: object
      x-jsonld-context: "https://schema.org"
      x-jsonld-type: "Book"
      required:
        - name
        - author
      properties:
        name:
          type: string
          description: "The title of the book."
          example: "Building an API Product"
        author:
          type: string
          description: "The person who wrote the book."
          example: "Bruno Pedro"
          x-jsonld-type: "Person"
        publisher:
          type: string
          description: "The publisher of the book."
          example: "Packt Publishing"
          x-jsonld-type: "Organization"
        datePublished:
          type: string
          format: date
          description: "The date the book was published."
          example: "2024-01-25"
        isbn:
          type: string
          description: "The International Standard Book Number of the book."
          example: "9781837630448"
        numberOfPages:
          type: integer
          description: "The number of pages in the book."
          example: 278
        inLanguage:
          type: string
          description: "The language the book is written in."
          example: "en"
        url:
          type: string
          format: uri
          description: "A URL to more information about the book."
          example: "https://www.amazon.com/Building-API-Product-implement-maintain/dp/1837630445"
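One way to picture what the two keywords give a consumer is the following Python sketch. The to_jsonld helper is hypothetical and only a simplified reading of the idea: it copies the schema-level x-jsonld-context and x-jsonld-type onto a plain JSON instance as @context and @type. It ignores property-level keywords and is not an implementation of the Internet Draft.

```python
# Trimmed copy of the Book schema as a Python dict (normally parsed from the YAML).
BOOK_SCHEMA = {
    "type": "object",
    "x-jsonld-context": "https://schema.org",
    "x-jsonld-type": "Book",
    "properties": {
        "name": {"type": "string"},
        "author": {"type": "string", "x-jsonld-type": "Person"},
    },
}


def to_jsonld(schema: dict, instance: dict) -> dict:
    """Hypothetical helper: attach the schema's semantic keywords to an instance.

    Only the top-level keywords are handled; nested x-jsonld-type values
    (such as the one on author) are deliberately left alone in this sketch.
    """
    document = {}
    if "x-jsonld-context" in schema:
        document["@context"] = schema["x-jsonld-context"]
    if "x-jsonld-type" in schema:
        document["@type"] = schema["x-jsonld-type"]
    document.update(instance)  # the instance's own fields follow the keywords
    return document


plain_book = {"name": "Building an API Product", "author": "Bruno Pedro"}
semantic_book = to_jsonld(BOOK_SCHEMA, plain_book)
print(semantic_book["@context"], semantic_book["@type"])
```

The API response itself can stay plain JSON; the semantic annotations live in the OpenAPI description and can be applied by whoever reads it.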
With this approach, we now have an OpenAPI definition of a data schema that is easier for a machine to understand. Knowing that the semantic type of the book object definition is https://schema.org/Book removes ambiguity and makes it easier to use generic consumer code. What is still missing is the ability to navigate the data. How can a piece of software know what data is navigable and in what ways? That's where Hydra comes in. Among many other things, it lets you add pagination information to a collection of items. Let's see an example that defines the Books collection, to understand how Hydra works:
components:
  schemas:
    Books:
      type: object
      x-jsonld-context: "https://schema.org"
      x-jsonld-type: "hydra:Collection"
      properties:
        # Keys starting with "@" must be quoted to be valid YAML.
        "@context":
          type: string
          example: "https://schema.org"
          description: "The JSON-LD context."
        "@id":
          type: string
          format: uri
          example: "/books"
          description: "The IRI of the collection."
        "@type":
          type: string
          example: "hydra:Collection"
          description: "Indicates that this is a Hydra Collection."
        totalItems:
          type: integer
          description: "The total number of items in the collection."
          example: 1
        member:
          type: array
          description: "The list of Book objects."
          items:
            $ref: "#/components/schemas/Book"
        view:
          type: object
          description: "Pagination view with next/previous links."
          properties:
            "@id":
              type: string
              format: uri
              example: "/books?page=1"
            "@type":
              type: string
              example: "hydra:PartialCollectionView"
            first:
              type: string
              format: uri
              example: "/books?page=1"
            next:
              type: string
              format: uri
              example: "/books?page=2"
            previous:
              type: string
              format: uri
              example: "/books?page=1"
You now have a Books object that is a hydra:Collection. It has specific Hydra properties such as member, which defines the shape of each collection item, and view, which defines pagination information. A client that understands this vocabulary can consume this response in a generic way, without needing any custom implementation. Let's look at what such a response would look like:
{
  "@context": "https://schema.org",
  "@id": "/books",
  "@type": "hydra:Collection",
  "totalItems": 1,
  "member": [
    {
      "@type": "schema:Book",
      "name": "Building an API Product",
      "author": {
        "@type": "schema:Person",
        "name": "Bruno Pedro"
      },
      "publisher": {
        "@type": "schema:Organization",
        "name": "Packt Publishing"
      },
      "datePublished": "2024-01-25",
      "isbn": "9781837630448",
      "numberOfPages": 278,
      "inLanguage": "en",
      "url": "https://www.amazon.com/Building-API-Product-implement-maintain/dp/1837630445"
    }
  ],
  "view": {
    "@id": "/books?page=1",
    "@type": "hydra:PartialCollectionView",
    "first": "/books?page=1",
    "next": "/books?page=2",
    "previous": null
  }
}
It all makes more sense now, right? Overall, adding these two semantic layers enhances the API and makes it easier to consume by a machine. Not only can the consumer understand what each data type means, it can also know how to navigate the API to obtain other related information.
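As a rough illustration of what "generic" can mean here, the Python sketch below walks a paginated collection using only the member and view.next properties from the example response. Everything else is an assumption made for the sketch: the PAGES dictionary stands in for a real HTTP client, the second page and its book are made up, and iter_members is a hypothetical helper, not part of any Hydra library.

```python
from typing import Callable, Iterator, Optional

# Hypothetical in-memory "API": maps a URL to a response shaped like the example.
PAGES = {
    "/books?page=1": {
        "@type": "hydra:Collection",
        "member": [{"@type": "schema:Book", "name": "Building an API Product"}],
        "view": {"@type": "hydra:PartialCollectionView", "next": "/books?page=2"},
    },
    "/books?page=2": {
        "@type": "hydra:Collection",
        "member": [{"@type": "schema:Book", "name": "A Second, Made-Up Book"}],
        "view": {"@type": "hydra:PartialCollectionView", "next": None},
    },
}


def iter_members(start_url: str, fetch: Callable[[str], dict]) -> Iterator[dict]:
    """Yield every member of a collection, following view.next until it runs out."""
    url: Optional[str] = start_url
    seen = set()  # guards against a server that links pages in a loop
    while url and url not in seen:
        seen.add(url)
        page = fetch(url)
        yield from page.get("member", [])
        url = (page.get("view") or {}).get("next")


# The only book-specific knowledge is the semantic type used in the filter.
names = [
    member["name"]
    for member in iter_members("/books?page=1", PAGES.__getitem__)
    if member.get("@type") == "schema:Book"
]
print(names)
```

The traversal contains no knowledge of books at all. Pointed at a collection of songs that uses the same vocabulary, it would work unchanged, and only the type used in the filter would differ.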