Making machines understand how your API works is easy. If you use a well-known API documentation standard, document all operations in a clear way, identify each input and output format, and name things in a familiar way, you're in a good position. That is, if all you want to do is document individual API operations. If, on the other hand, you're interested in documenting complete use cases, then things can get more complicated. How can you tell a machine it needs to use two or more operations to fulfill a full use case? How do you explain that those operations need to be executed in a sequence? What if one or more of those operations fail? Stay with me to see what's possible.
This article is brought to you with the help of our supporter, Scalar.
There's a big difference between programming a multi-operation workflow and documenting use cases that can involve multiple operations. In the first situation, you don't leave any room for interpretation. You simply define the steps of the workflow, the input parameters and outputs of each step, and how the steps connect to each other. In the second situation, however, you're not documenting how the workflow should be executed. Instead, you're showing how each step works and how a consumer can combine it with any other step. Consumers choose how to compose the workflow, based on their needs and on what different combinations of operations can provide. The difficulty lies in providing enough information about every operation that it's easy to understand how to combine them into a full workflow.
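To make the first situation concrete, here is a minimal Python sketch of a programmed two-step workflow. The operations `create_article` and `publish_article`, and the in-memory store behind them, are hypothetical stand-ins for real API calls. The point is only that the order of the steps and the wiring between one step's output and the next step's input are fixed in code.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Article:
    article_id: int
    title: str
    published: bool = False


# Hypothetical in-memory store standing in for a real API backend.
_ARTICLES: Dict[int, Article] = {}


def create_article(title: str) -> Article:
    """Step 1: create a draft and return it, including its new ID."""
    article = Article(article_id=len(_ARTICLES) + 1, title=title)
    _ARTICLES[article.article_id] = article
    return article


def publish_article(article_id: int) -> Article:
    """Step 2: publish an existing draft, identified by its ID."""
    article = _ARTICLES[article_id]
    article.published = True
    return article


def create_and_publish(title: str) -> Article:
    """The workflow: step order and output-to-input wiring are hard-coded."""
    draft = create_article(title)
    return publish_article(draft.article_id)
```

Nothing here is left for a consumer to interpret. Documenting the two operations separately, so that a consumer could discover this combination alone, is the harder problem.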
For that second situation, something like Arazzo isn't the best choice. While Arazzo is ideal for documenting full workflows, it's not useful if what you want is to explain how individual operations work and how they can be combined. Let's start by documenting each operation in full detail and then try to give instructions on how to make the operations work together. One way to offer thorough documentation about each operation is to use a machine-readable API definition standard such as OpenAPI. The goal is to write complete explanations for each operation that go beyond what a simple API reference can offer. You should explain what the operation can do, what use cases it can fulfill, what its benefits are, and how it can be used in combination with other operations. This last part is especially important. Showing how consumers can use two or more operations together to form a workflow is the key to informing machines.
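As an illustration, here is a sketch of what such an operation could look like, with the OpenAPI operation written as a Python dictionary. The operation IDs, field names, and wording are all assumptions made up for this example, not a real API. The helper `mentioned_operations` is also hypothetical: it only checks which other operations a description refers to by name.

```python
# A hypothetical OpenAPI operation, written as a Python dict for illustration.
# The operationId values, field names, and wording are assumptions.
create_article_operation = {
    "operationId": "createArticle",
    "summary": "Create a draft article",
    "description": (
        "Creates a new article in draft state and returns its ID. "
        "Use this when a user wants to start writing or import content. "
        "Drafts are not visible to readers. To make one public, pass the "
        "returned `articleId` to the `publishArticle` operation."
    ),
    "requestBody": {
        "required": True,
        "content": {
            "application/json": {
                "schema": {
                    "type": "object",
                    "required": ["articleTitle"],
                    "properties": {
                        "articleTitle": {
                            "type": "string",
                            "description": "Headline shown to readers.",
                        }
                    },
                }
            }
        },
    },
    "responses": {
        "201": {
            "description": "The draft was created.",
            "content": {
                "application/json": {
                    "schema": {
                        "type": "object",
                        "properties": {
                            "articleId": {
                                "type": "integer",
                                "description": "ID to pass to `publishArticle`.",
                            }
                        },
                    }
                }
            },
        }
    },
}


def mentioned_operations(operation: dict, known_ids: set) -> set:
    """Return the other known operationIds that the description names."""
    text = operation.get("description", "")
    own_id = operation["operationId"]
    return {op_id for op_id in known_ids if op_id != own_id and op_id in text}
```

Calling `mentioned_operations(create_article_operation, {"createArticle", "publishArticle", "deleteArticle"})` returns `{"publishArticle"}`. That gives a rough signal that the description tells consumers which operation to use next.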
Putting all this documentation together doesn't sound too difficult. What is difficult, however, is making sure that machines can understand the documentation and use it productively. It's one thing to present a fully defined workflow where each step has been thoughtfully defined and documented. It's another to let machines decide how to put together different workflow steps to reach a desired goal. Naturally, many things can go wrong in that process. To start with, identifying the inputs and outputs of each step and matching them to any given context might not be easy. The names of the input variables play a big role in how correctly a machine identifies what they do. Name an article title input "name," and you dramatically change how a machine interprets it. The same happens if you name a product price "value," a sorting parameter "order," or a paragraph "text." Simply choosing the wrong name for an input or output variable can change how a machine creates a dynamic workflow. Another hurdle is deciding what data types to use for inputs and outputs. Depending on how strict a machine is, it might not be able to match an output of one operation with the input of another. If you pick a "float" data type for a price output and an "integer" for a similar input of another operation, there's no guarantee a machine will connect the two. Using output content types that don't define data types makes this problem even worse, as machines have no way of identifying the type of each variable. Too many things can go wrong, and it's your job to decide how much freedom you give machines when they use your API.
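The naming and typing problems can be sketched in a few lines of Python. The example assumes a deliberately strict consumer that connects an output to an input only when both the name and the declared data type are identical. The field names and the `match_fields` helper are hypothetical, and real machine consumers may be more lenient than this.

```python
from typing import Dict, List, Tuple

# Hypothetical output of one operation and input of another,
# each reduced to a mapping of field name -> declared data type.
get_product_output = {"productId": "integer", "value": "float"}
create_order_input = {"productId": "integer", "price": "integer"}


def match_fields(
    outputs: Dict[str, str], inputs: Dict[str, str]
) -> Tuple[List[str], List[str]]:
    """Strictly pair outputs with inputs: name and type must both be equal.

    Returns (matched input names, unmatched input names).
    """
    matched: List[str] = []
    unmatched: List[str] = []
    for name, data_type in inputs.items():
        if outputs.get(name) == data_type:
            matched.append(name)
        else:
            unmatched.append(name)
    return matched, unmatched
```

With the definitions above, `match_fields(get_product_output, create_order_input)` matches only `productId` and leaves `price` unconnected, because the output calls it "value" and declares a different type. Renaming the output field to `price` and declaring the same data type on both sides would let this strict matcher connect the two operations.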
At one end of the spectrum, machines are fully autonomous in deciding how to combine multiple API operations. As you just saw, though, many things can go wrong, and the dynamic workflows might not achieve the expected goals. At the other end of the spectrum, machines have no decision-making power and, instead, receive ready-to-use workflows. All those workflows make sense and achieve desirable outcomes. The drawback is that you can't predict all possible workflows, so the breadth of possible outcomes depends on what you document. In the end, it's up to you to decide which approach you're most comfortable with. While many businesses opt for a well-defined scenario where all workflows are defined in advance, there are many situations where there's freedom to experiment and let machines figure out what to do.