On the Behavior of OpenAPI's oneOf
How exactly does OpenAPI's oneOf work?
You think you don't need certain OpenAPI features until you know what they do. This is exactly how I feel when I learn about things like OpenAPI's oneOf. I recently found myself in a debate about the usage of oneOf and its limitations when generating API server and client code. Before that, I had only a superficial knowledge of how oneOf works and what the alternatives are. Right now, I can tell you that I feel I know more than I really need to. That's why I thought of sharing what I learned along the way with you. So, grab a coffee as we explore OpenAPI's oneOf.
So you thought you could design APIs without knowing anything about data modeling. Yes, for some time, you can manage by exclusively using primitive data types, e.g., strings, integers, and booleans. However, there comes a time when you need a more sophisticated way of defining the data types your API uses. At some point, you'll want to use objects, arrays, and even other ways of composing and selecting data types. That's where oneOf comes in. Its functionality is quite simple, but its results can be powerful.
While many people don't know what oneOf does, there are others who think they know but are mistaken. I'm saying this because it's easy to misunderstand what you can do with oneOf. It happened to me. At first sight, I thought it was a way to say that a piece of data needs to match at least one of the types you list. But it's not. Given a list of schemas, you use oneOf to define that the data has to validate against one, and just one, of them. That means a piece of data can't validate against two (or more) of the listed schemas, and it can't validate against none of them either. It has to be exactly one. Interesting, right? So, why is it useful?
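To make the "one, and just one" rule concrete, here's a minimal sketch. The schema name ShortOrNumericCode and both subschemas are made up for illustration; the comments list how a few sample values would fare.

```yaml
# Hypothetical schema, for illustration only
ShortOrNumericCode:
  oneOf:
    - type: string
      maxLength: 5          # at most five characters
    - type: string
      pattern: '^[0-9]+$'   # digits only
# "abc"     -> valid: matches only the first schema
# "1234567" -> valid: matches only the second schema
# "123"     -> invalid: matches both schemas
# "abcdefg" -> invalid: matches neither schema
```

The third case is the one that surprises people: the value is perfectly fine for each schema on its own, yet oneOf rejects it because it fits more than one.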
One typical scenario where you'd use oneOf is when you want to define data (an object, for instance) that can assume the shape of one of several data types. Imagine you're defining an API for a supermarket. You can represent each product in stock as an object. Products usually have a price, a name, and also properties that are specific to each type of item. You'd use oneOf to define that a product has to validate against just one of the schemas of all available products. In other words, oneOf lets you define polymorphism.
Product:
  type: object
  oneOf:
    - $ref: '#/components/schemas/Fruit'
    - $ref: '#/components/schemas/Beverage'
    - $ref: '#/components/schemas/Fish'
    - $ref: '#/components/schemas/Meat'
The validator tries to match a given object against the schemas on the oneOf list. To be valid, the object needs to validate against one and only one of the schemas. Let's define the Fish and Meat schemas to see how the polymorphism can work. Fish and Meat are both products, which means they share the common price and name properties. Both of them also have a weight property. Let's take advantage of inheritance to define both Fish and Meat on top of a common definition called BaseProduct.
BaseProduct:
  type: object
  properties:
    price:
      type: number
      format: float
      description: Price of the product
    name:
      type: string
      description: Name of the product
Fish:
  type: object
  allOf:
    - $ref: '#/components/schemas/BaseProduct'
    - type: object
      properties:
        weight:
          type: string
          description: Weight of the fish product (e.g., 1kg, 500g)
Meat:
  type: object
  allOf:
    - $ref: '#/components/schemas/BaseProduct'
    - type: object
      properties:
        weight:
          type: string
          description: Weight of the meat product (e.g., 1kg, 500g)
As you can see, Fish and Meat now look exactly the same. A product with a price, a name, and a weight would validate against both the Fish and Meat schemas. I did this on purpose to illustrate a peculiar situation where data can be validated against more than one schema. What happens in a situation like this is that the API becomes unusable for fish and meat products: since both schemas have precisely the same properties, anything that validates against one also validates against the other, and oneOf rejects data that matches more than one schema. So, how do you make this work?
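To see the overlap in practice, consider a payload like the one below. The values are invented for illustration; the point is that nothing in it tells Fish apart from Meat.

```yaml
# Hypothetical product payload, written as YAML for readability
name: Salmon fillet
price: 12.5
weight: 500g
# Validates against Fish  (name, price, and weight all fit)
# Validates against Meat  (same properties, same types)
# Two matches, so the oneOf on Product rejects it
```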
One way to reduce the chance of having data that validates against more than one schema is, as obvious as it sounds, making the schemas different. In this case, we could add a species property to the Fish schema. But that wouldn't be enough, because, by default, you can add any property to an object, and it still validates against a schema that doesn't define that property. A product object with a species property would still validate against both the Fish and Meat schemas. To fix the situation, you'd have to set the additionalProperties attribute to false in all product schemas. Be aware that additionalProperties only looks at the properties declared in the same schema object, so it doesn't see properties pulled in through allOf; on OpenAPI 3.1, unevaluatedProperties: false is the keyword that takes them into account.
Fish:
  type: object
  allOf:
    - $ref: '#/components/schemas/BaseProduct'
    - type: object
      properties:
        weight:
          type: string
          description: Weight of the fish product (e.g., 1kg, 500g)
        species:
          type: string
          description: Species of the fish (e.g., salmon, tuna)
      additionalProperties: false
It all sounds good, but we can still have ambiguous cases. Suppose you try to create a fish product, but you don't set the species property. How does the API know which schema to use? It doesn't, and the object will validate against both Fish and Meat, making the validation fail because we're using oneOf. Fortunately, there's a way to fix this situation: setting required properties. Making the species property required on the Fish schema is the solution. It makes it impossible to create a fish product without specifying its species.
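Here's a rough sketch of what that could look like. I'm assuming the required list sits next to the properties it refers to, inside the inline part of the allOf, and I'm leaving additionalProperties out to keep the focus on the new lines.

```yaml
Fish:
  type: object
  allOf:
    - $ref: '#/components/schemas/BaseProduct'
    - type: object
      properties:
        weight:
          type: string
          description: Weight of the fish product (e.g., 1kg, 500g)
        species:
          type: string
          description: Species of the fish (e.g., salmon, tuna)
      # An object without a species no longer validates as Fish
      required:
        - species
```

With this in place, a product that has only a name, a price, and a weight can no longer match Fish, so it stops colliding with Meat.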
All this sounds a bit convoluted, right? I also think so. Fortunately, there's a feature that removes the ambiguity from the validation process. Instead of letting the validator guess which schema to use, you can tell it explicitly with the discriminator feature. It lets you pick a property of the data whose value identifies which one of the listed schemas to validate against. In its simplest form, the discriminator value is the name of the schema itself. Let's add the category property as a discriminator. Since all products inherit from BaseProduct, we simply have to add it there.
BaseProduct:
  type: object
  properties:
    id:
      type: integer
      description: Unique identifier for the product
    name:
      type: string
      description: Name of the product
    price:
      type: number
      format: float
      description: Price of the product
    category:
      type: string
      description: Category of the product (e.g., Fruit, Beverage, Fish, Meat)
  required:
    - category
Now, whenever you want to define a fish product, you have to set the category property to the value "Fish". You can further enhance the schema definition by making the category property an enum that only allows the existing schema names. Overall, I think that using a discriminator in combination with some of the other techniques I described before is a good approach to designing polymorphism in your API without going crazy. There are more options, I know. But let's leave them for another time.
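For completeness, here's a minimal sketch of how the pieces could fit together. It assumes the implicit mapping, where the value of category equals the schema name, and it assumes the discriminator is declared on Product next to the oneOf list. The other BaseProduct properties are omitted for brevity.

```yaml
Product:
  type: object
  oneOf:
    - $ref: '#/components/schemas/Fruit'
    - $ref: '#/components/schemas/Beverage'
    - $ref: '#/components/schemas/Fish'
    - $ref: '#/components/schemas/Meat'
  discriminator:
    # The payload property whose value names the schema to use
    propertyName: category
BaseProduct:
  type: object
  properties:
    category:
      type: string
      # Only the existing schema names are allowed
      enum:
        - Fruit
        - Beverage
        - Fish
        - Meat
  required:
    - category
```

A payload with category set to Fish then points straight at the Fish schema, and a typo such as "fish" is caught by the enum instead of producing a confusing oneOf error.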

