What is hypermedia, and why use it for AI apps?

What is hypermedia?

Hypermedia is a paradigm for interactive multimedia applications. It specifies that the server should send not just data, but also controls that can operate on that data. The user can then use those controls to navigate and operate on the data. The client doesn't have to know anything other than how to render the data and controls.

Hypermedia is the original vision of the web. Most of the web is built on top of the HyperText Transfer Protocol (HTTP), which is a hypermedia protocol. The original HTML specification was a hypermedia specification. The web was originally designed to be a hypermedia platform.

But the web has changed a lot since then. Frontend frameworks like React, developed for massively multiplayer apps like Facebook, took over the web, and brought all the complexity of those apps with them. Single-page apps, state management, and client-side routing are now the norm.

The idealism of hypermedia was replaced by the pragmatism of the single-page app (SPA). But hypermedia is suddenly relevant again, in large part thanks to one library, HTMX, and its author, Carson Gross. HTMX is a small JavaScript library that brings the power of hypermedia to the modern web.

There's already a hypermedia client on every device: the web browser. With HTMX we'll be able to leverage the browser's built-in capabilities for rendering, caching and navigation, so we don't have to reinvent the wheel. And we'll be able to build apps that are easy to extend and maintain, without having to write a bunch of boilerplate code.

Hypermedia and AI

The hypermedia paradigm is a great fit for AI-powered apps. Generative models are often slow and resource-intensive. You don't want to run them on the client. You want to run them on a GPU instance in the cloud (with Replicate, for example).

If you're waiting many seconds for a model to generate text or images, you don't really care about the much smaller latency introduced by a server round trip. And you don't want to have to manage the state of the app on the client. You want to be able to quickly prototype and iterate on models, prompts, and parameters, without having to worry about re-configuring the frontend.

Hypermedia as the Engine of Application State

Hypermedia as the Engine of Application State (HATEOAS) is a constraint of the REST architectural style as Roy Fielding originally defined it. It requires that a client interact with a network application entirely through hypermedia provided dynamically by the application's servers.

Today, most APIs are data APIs, not true REST APIs. They send JSON representations of data, which frontend code then transforms into a document. This requires the client to know the API of the web app, what data to expect, and how to transform it into a document.

HATEOAS, on the other hand, suggests sending a representation in hypermedia. This representation would include links to other resources, and controls that can operate on those resources. This powerful idea eliminates the need for the client to know anything about the API of your web app. It can discover the API as it goes along.
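To make the contrast concrete, here is a minimal sketch of the same resource served both ways. The "prediction" resource, its fields, and the URLs are all hypothetical and chosen only for illustration.

```javascript
// A hypothetical "prediction" resource, used only for illustration.
const prediction = {
  id: "abc123",
  status: "succeeded",
  output: "A haiku about the web",
};

// Escape text before placing it inside HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Data API style: the client must already know what each field means
// and which URLs exist for viewing or retrying the prediction.
function asJson(p) {
  return JSON.stringify(p);
}

// Hypermedia style: the representation carries its own controls
// (a link and a form), so the client only has to render it.
function asHypermedia(p) {
  return `<article id="prediction-${escapeHtml(p.id)}">
  <p>Status: ${escapeHtml(p.status)}</p>
  <p>${escapeHtml(p.output)}</p>
  <a href="/predictions/${escapeHtml(p.id)}">Permalink</a>
  <form method="post" action="/predictions/${escapeHtml(p.id)}/retry">
    <button type="submit">Try again</button>
  </form>
</article>`;
}
```

With the JSON version, adding a new action means changing both the server and the frontend code. With the hypermedia version, the server can add another link or form to the response and the browser will render it without any client changes.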

Limitations

Note that hypermedia might not be the best choice for applications that require real-time, bidirectional communication like chat apps or games. For these types of applications, technologies like WebSockets are more suitable.

The tools we'll use

The hypermedia paradigm is getting more popular, and there are several tools that make it easier to build hypermedia apps. We'll use a few of them in this guide.

HTMX

We talked about HTMX earlier. It's a small library you include with a script tag in your page's <head>. It lets you add hypermedia attributes to your HTML. Its mission is to fill in the missing capabilities of the web platform, and then get out of the way.

The original hypermedia controls are the anchor (<a>) and form (<form>) tags. They're the only tags that are allowed to navigate and operate on resources. HTMX adds a few more attributes that let you do the same thing with any tag. It also adds a boost of interactivity by making it possible to update the page without a full page reload.
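As a rough illustration, the sketch below renders a button that HTMX turns into a hypermedia control. The attributes hx-post, hx-target, and hx-swap are HTMX attributes; the "/generate" endpoint and the "result" element id are assumptions made up for this example.

```javascript
// Renders markup in which a plain <button> behaves like a hypermedia control.
// Assumptions: "/generate" is a hypothetical endpoint that returns an HTML
// fragment, and "result" is a hypothetical id for the element to update.
function generateButton() {
  return `<button hx-post="/generate" hx-target="#result" hx-swap="innerHTML">
  Generate
</button>
<div id="result"></div>`;
}
```

When the button is clicked, HTMX sends a POST request to /generate and places the HTML fragment from the response inside the #result element, so only that part of the page changes.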

Val Town

Val Town is a social website for writing and running code in the cloud. You can create standalone JavaScript functions, called "vals", to run scripts, schedule actions, and serve HTTP endpoints. That last feature is what we'll use in this guide. We'll create a single stateless function that will serve our app.

We won't need to worry about setting up a backend, or managing a server. We'll just write our code and deploy it on the cloud. You can use the Val Town editor in any browser, even on your phone. You can copy and fork vals and call one val from another. It's a great way to quickly prototype and share code.
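A single stateless function serving an app could be sketched roughly as follows. This assumes the platform calls the function with a standard Fetch API Request and expects a Response back; the exact signature Val Town expects may differ, so check its documentation. The routes, the page markup, and the unpkg script URL are illustrative assumptions.

```javascript
// Escape user input before echoing it back as HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Wrap a string of HTML in a Response with the right content type.
function html(body) {
  return new Response(body, {
    headers: { "content-type": "text/html; charset=utf-8" },
  });
}

// The full page. Assumption: HTMX is loaded from unpkg.
const page = `<!doctype html>
<html>
  <head>
    <title>Hypermedia demo</title>
    <script src="https://unpkg.com/htmx.org"></script>
  </head>
  <body>
    <form hx-post="/generate" hx-target="#result">
      <input name="prompt" placeholder="Describe something">
      <button type="submit">Generate</button>
    </form>
    <div id="result"></div>
  </body>
</html>`;

// One stateless handler for the whole app. Nothing is stored between
// requests; every response is computed from the request alone.
// (On Val Town you would expose this function as the val's HTTP handler.)
async function handler(req) {
  const url = new URL(req.url);

  // POST /generate returns only a fragment for HTMX to swap into #result.
  if (req.method === "POST" && url.pathname === "/generate") {
    // HTMX submits forms as URL-encoded data by default.
    const form = new URLSearchParams(await req.text());
    const prompt = form.get("prompt") ?? "";
    return html(`<p>You asked for: ${escapeHtml(prompt)}</p>`);
  }

  // Any other request gets the full page.
  return html(page);
}
```

The same function serves both the initial page and the fragments that update it, which is what keeps the app to a single endpoint. A real app would call a model inside the POST branch instead of echoing the prompt.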

Replicate

Replicate runs machine learning models in the cloud. We have a library of open-source models that you can run with a few lines of code. If you're building your own machine learning models, Replicate makes it easy to deploy them at scale.

For our example app, we'll chain together calls to a few different models. We can use the Replicate JavaScript client to call the models from our val.
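Chaining models could look something like the sketch below. To keep it self-contained, the function that runs a model is passed in as a parameter. With the Replicate JavaScript client this would typically be a thin wrapper around replicate.run(identifier, { input }). The model identifiers and the function name describeThenIllustrate are placeholders, not real models or part of any library.

```javascript
// Chains two model calls: the text produced by the first model becomes
// the prompt for the second.
//
// `run` is injected so the sketch does not depend on a particular client.
// With the Replicate JavaScript client it might be wired up as:
//   const replicate = new Replicate({ auth: token });
//   const run = (model, options) => replicate.run(model, options);
//
// "owner/text-model" and "owner/image-model" are placeholder identifiers.
async function describeThenIllustrate(run, topic) {
  const story = await run("owner/text-model", {
    input: { prompt: `Write a one-sentence story about ${topic}` },
  });

  // Assumption: text output may arrive as an array of string chunks,
  // so join it into a single string before passing it on.
  const text = Array.isArray(story) ? story.join("") : String(story);

  const image = await run("owner/image-model", {
    input: { prompt: text },
  });

  return { text, image };
}
```

Because the runner is a parameter, the chain can be exercised with a stub during development and switched to real model calls without changing the function itself.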