A Modern Architecture for FP
IO needs to die a horrible death.
As functional programmers, we sometimes delude ourselves into thinking that functional code is always significantly easier to understand, more declarative, and more powerful.
That’s just not true, as the following snippets should demonstrate:
What the heck does that function do? And why exactly should we care that it’s “purely functional”, again?
Yeah, I know, I know: the function returns a value representing an IO effect, which inverts the control: now the caller has explicit control over the effect (in imperative programming, it’s the callee that has control, which makes reasoning harder).
This inversion of control is part of the magic of functional programming.
But in the case of
IO, I can either ignore the value (in which case why does the function exist?), or I can sequence it into a chain of computations (in which case it reduces to ordinary imperative programming).
The only advantage of the latter is that because it’s a value, it can be stuffed into data structures, lazily executed, retried until success, and so on. Which is all well and good, but it’s at best a token benefit over writing the same thing in C.
Come on, this is 2015. We can do better than
The Free Revolution
I first heard of the “Free monad” a few years back, when I was functionally-curious but not really programming with functions. I had a chance to dig in for a workshop I did for LambdaConf 2014, and since then, I’ve used free monads several times, and encouraged adoption of them at my company.
A Free monad is basically just a way to stuff a sequential computation in a data structure, so you can inspect that data structure and “interpret” it later.
It’s called “free” because you get a monad for free for any higher-kinded
* -> * type (in the same way that a list is a free monoid, because it can take any type and give you a free monoid for that type by recording the “appends” in a list, which can be replayed later).
I’ll save the gory details, but I do want to share a neat interpretation of
Free as a description of a program (see my ScalaWorld 2015 presentation for more):
A description of a program that will halt, run forever, or produce a value `a` | / \ / \ / \ Free f a ^ ^ | \ | \ Value produced by program Operational Algebra Scala: Free[F[_], A]
In this interpretation,
Free f a is a description of a program,
f is the set of operations the program can be reduced to (called an algebra in this post) and
a is the value that will be produced by the program unless it halts or runs forever.
Free monads let us model arbitrary programs, not as a sequence of chunks of machine code (the
IO approach), but as a sequence of algebraic operations that describe the semantics of our program.
Such descriptions of programs can be introspected (one step at a time), interpreted, and transformed.
Moreover, not only do algebras compose (that is, if you have algebras
g, you can compose them into a composite algebra
Coproduct f g for some suitable definition of
Coproduct), but interpreters also compose — both horizontally and vertically.
If you can interpret
g separately into some
h, you can interpret the coproduct of their algebras into
h. Further, if you can interpret
h, then you can interpret
h (this involves a higher-order bind, which I find pretty cool!).
Free monads embody the essence of sequential computation, while free applicatives embody the essence of parallel computation.
Using both or some hybridization of them, you can therefore model arbitrary programs with both sequential and parallel computation, all of which reduce to operations having well-defined semantics.
Interpretation of Free Structures
I’ve used the word “interpret” several times. What does that really mean?
Interpretation is a so-called natural transformation between functors:
In the case of a free monad, you can interpret
Free f a to
Free g a with a natural transformation between
g (among other ways). This requires that
g be at least as capable as
IO is infinitely powerful so you can interpret anything to
Note that if
g is itself a
Monad, then you can collapse
Free g a into
Generally, in most examples of free monads, algebras are interpreted directly into effectful monads like
While this approach is technically superior to defining a whole program in
IO (you gain ability to isolate and reason about effects, modularize interpretation, and more), there are far more powerful techniques.
These techniques can forever change the way we write functional programs.
From Coal to Diamond
The following snippet shows what I consider to be idiomatic, FP-as-a-better-C logic for a modern business application:
The function saves a resource to a cloud store, with logging and error handling.
While “purely functional”, this code is really bad (sorry, it just is!):
- It’s hard to understand.
- It’s hard to test.
- It conflates different concerns (logging, error handling, business logic), and untangling them will introduce more complexity.
- It mixes different levels of abstraction (REST API versus business logic).
- It distributes knowledge that should be centralized (such as where and how to interact with the cloud files API).
Let’s solve these problems using the power of free algebras.
Layers of an Onion
Our first step is defining an algebra for the cloud files API.
This API should have well-defined semantics, and be high-level and composable.
That is, it should possess high-level semantics, and should be built from as few orthogonal operations as possible, relying on composition to satisfy more advanced use cases.
In our case, let’s assume we only need two operations: one to store a file, and one to list stored files:
With this algebra, we can define a lightweight Domain-Specific Language (DSL) for interacting with the API:
(This is too specialized, but ignore that for now.)
Note that our DSL defines the semantics of the cloud files API (one can even define laws for these operations), but does not actually describe how to provide the service.
In fact, the cloud files API is a REST API, so we can express the semantics of
CloudFilesF in terms of another DSL: one for REST APIs.
A simple approximation of this algebra might look something like this:
Now we can explicitly define how the semantics of the cloud files API map into the algebra of RESTful APIs by implementing the following function:
This interpreter function takes an operation
CloudFilesF, and interprets it into a “program” of operations in
Our core application can then speak in terms of the high-level, domain-focused algebra
CloudFilesF, while at some point, this will be dynamically interpreted into the low-level, protocol-focused algebra
We’re not done yet. Our original snippet had logging. Rather than tangle this into the domain logic or the protocol logic, we can define a new algebra
LogF to capture logging:
We can then define another interpreter which maps from
Interpreters compose, so we can take the
cloudFilesI interpreter, and compose it with the
logCloudFilesI interpreter, to yield a new interpreter:
where the helper functions
toRight lift a
Free f a or
Free g a into a
Free (Coproduct f g) a, respectively.
Finally, at the end of the world, we are going to have to provide a mapping from our final algebra (in this case,
Coproduct LogF HttpF) into something like
Note some of the benefits that gracefully fall out of this approach:
- Code interacting with the cloud files service does not know or care how it’s implemented. The cloud files layer is focused on our problem domain and speaks at the right level of abstraction.
- The mapping from the semantics of the cloud files service into the semantics of REST APIs is completely centralized and isolated from the rest of the application. We don’t even have to use this interpreter! For testing, for example, we can map the cloud files service into a mock service that speaks the high-level semantics directly.
- The logging code is also centralized and isolated, and completely untangled from business logic and REST APIs (note we could also log at the level of the REST API if we wanted to generalize the logging to all services that “compile” to REST APIs). The logging itself can be structured and uniform rather than random and scattered across the whole program. In addition, the logging need not log to a file, since we can supply different interpreters — some that throw away logging, others that log to a remote API.
- Everything is modular and composable. We can pick how we want to interpret the program, even based on runtime values, by composing interpreters appropriately.
Enjoying the Onion
We’ve gone from a low-level, imperative API, into a declarative description of our problem domain. We are able to describe our problem at a high level, as well as at a low level, and provide a precise means of mapping between different domains and levels of abstraction.
We are further able to completely untangle different aspects of our program, such as the mapping from a high-level domain into a REST API, or the logging of application activity.
Finally, we are able to consolidate knowledge that would otherwise be distributed throughout the program: knowledge of how to translate a cloud files operation into a REST API call; knowledge of how and what to log.
Perfecting the Approach
I think the benefits of this approach are apparent even in this toy example. In real world scenarios, however, the benefits should be even more persuasive.
That said, there are a few things we can tweak to improve it.
1. Orthogonal, Composable Algebras
In many real-world cases, free algebras are not orthogonal. Rather, there are a huge number of overlapping operations.
The reason is one of practicality: sometimes it’s more efficient to provide a large number of overlapping operations, than providing a small number of orthogonal ones.
Let’s take the example of a file system algebra (
FileF). One approach is simply to list all the common operations available in a file system:
But this set of operations is not primitive; that is, some operations can be composed from others. To simplify reasoning, formalize semantics, and reduce total volume of code, we should replace this list with a set of completely orthogonal operations.
In this form, no operation can be expressed in terms of any other. In addition, we can provide composite operations for things like moving:
While ideal from a theoretical perspective, as a practical matter, can you imagine renaming a 10 GB file by first creating a copy of it, and then deleting the old version?!?
That kind of inefficiency is not practical for most real world programs!
To solve this problem, we can write an optimizing interpreter, which is a special kind of interpreter that can detect patterns and substitute them with semantically equivalent but faster alternatives.
Now, there is a limitation here: free monads are too powerful to optimize, because operations depend on runtime values (such is implied by the very signature of bind!).
But free applicatives, however, are constrained enough to allow us to perform this sort of optimization. Or we can extend our free structure with a special kind of atomic sequencing operation that ignores the value of the left-hand side (e.g.
In either case, this lets our interpreter see far enough ahead into the structure of the program to perform semantically-equivalent optimizations (such as replacing a duplicate/delete by an OS-level rename/move).
2. Generalizing Interpreters
The interpreters I wrote for the toy example above are pretty specific: they can only interpret to a concrete target algebra.
By generalizing these interpreters, we can make the code less brittle to changes and maximize the places we can use them.
The general form for an interpreter is as follows:
This translates a single operation
f a into a program of operations in
But instead of interpreting to a concrete
g, we’d like to be able to say that we can interpret to any
g that supports the capabilities we require.
There are lots of ways of doing this (type-level machinery in Haskell, implicits in Scala, type classes), but the most straightforward way is to leverage some lens-machinery: specifically, a
Prism lets us construct and (when possible) deconstruct sum types.
The first step is defining a higher-order prism that can work with functors:
This lets us construct an
f a whenever we have a
We can now generalize the notion of an interpreter:
This says, in words, that so long as you can prove that any
g is at least as powerful as
g' (by supplying a
Inject), then the interpreter (which requires
g') can interpret into
This interpreter is fully polymorphic in its target algebra. While a bit more cumbersome to define, this more general interpreter is more robust to code changes and can be used in more places.
3. Generalizing DSLs
The final obvious improvement we can make is to generalize the DSLs.
I previously introduced the following DSL for the cloud files API:
This DSL requires the target algebra be
CloudFilesF. We can generalize this in the same way we generalized the interpreters: by requiring the target algebra be at least as powerful as
The following generalization takes advantage of PureScript’s first-class records, though there are lots of other ways to avoid boilerplate in other languages:
Now we can use the DSL in any target algebra that includes
Too much “purely functional” code is written by sequentially executing opaque chunks of machine code. That’s the
IO monad in Haskell,
Task in Scala, and
Eff in PureScript.
This leads to all the anti-patterns we see in imperative coding: mixing levels of abstraction, distributing knowledge that should be centralized, tangling concerns, and much more. Problems that have led to the invention of aspect-oriented programming, dependency injection, runtime metaprogramming, and other noble-minded, if ultimately criminal attempts at fixing what’s wrong with programming-in-the-large.
Fortunately, today’s bag of tools for functional programming has a far better solution: describing effects by reducing them to orthogonal, composable operations, and describing computation with these operations using a computational context like
In other words, in modern FP, we shouldn’t write programs — we should write descriptions of programs, which we can then introspect, transform, and interpret at will.
This leads to a style of programming where a large program is deconstructed into layers. These layers represent different levels of abstraction and different concerns.
Ultimately, all these “layers” are compiled into something like
IO, but that’s at the edges of the program. In between, highly-constrained algebras that can be reasoned about and manipulated are expressed into ever broader algebras, which come closer and closer to the “machine code” that is required for execution.
The approach outlined in this post is most certainly not the future of functional programming. There are problems here I haven’t talked about, and most languages don’t make this particular style of programming very easy or performant.
But I believe it’s a hint at what the future looks like. A future in which our programs are more like compilers than lists of machine instructions. A future where integration testing is dead. Where even the largest of programs can be broken down and understood in terms of smaller parts.
And that’s a future I’m very much looking forward to.
How about you?