ZIO 2.0 Released

On August 3rd, 2020, the ZIO community released ZIO 1—itself the product of 3 years of intensive research & development and continuous refinement based on commercial adoption and early production feedback.

The initial release of ZIO exceeded expectations, introducing dozens and dozens of name-brand companies to the power of functional Scala. As ZIO had done since it was called the “Scalaz 8 IO monad”, ZIO 1 pushed forward the state-of-the-art for effect systems, with async stack tracing, fiber dumps, and other novel features pioneered by the ZIO community.

To this day, the 1.0 release of ZIO has held up brilliantly, still offering the most expressive, type-safe, and feature-packed effect system in Scala, boasting features that are years away in other effect systems, including automatic structured concurrency, typed errors, context, schedules, and software transactional memory.

Yet, only dead software doesn’t evolve. For ZIO to remain relevant to the future of functional Scala, ZIO 1 had to innovate. So, for the past 2 years, adopters of ZIO have been sharing feedback. Some positive, some negative, and all of it incredibly useful.

The result of this feedback, combined with keen insight, research and development, and innovation from the core ZIO team (including OSS superstars Adam Fraser, Daniel Vigovszky, and Kit Langton), is the ZIO 2 release.

Officially previewed at ZIO World 2021, and tested in multiple release candidates over many months now, ZIO 2 represents a giant leap forward for functional effect systems in Scala.

Today, after more than a year of eager anticipation from ZIO users worldwide, I am extremely excited to announce the official release of ZIO 2.0!

The Big Picture

Throughout the ZIO 2 release, you will find five major themes organzing all the features and improvements that have been made to the library:

Runtime System. ZIO 2 pushes forward the state-of-the-art with a novel, third-generation runtime system that’s been optimized for Loom, and which, even in its early state, shows promise to be the fastest non-macro-based runtime system.
Developer Experience. ZIO 2 tries to eliminate many of the rough edges in ZIO 1, improve consistency, simplify the number and surface area of data types, and adopt architectural patterns that are more familiar to Java developers.
Monitoring. ZIO 2 improves and adds a whole host of new functionality designed to make it easier to monitor and troubleshoot ZIO 2 applications deployed in production. Much of this functionality will seamlessly benefit other libraries in the ZIO ecosystem.
Streaming. ZIO 2 completely re-imagines the foundations of streaming by unifying streams, sinks, and transformers through channels, providing a single, highly-expressive, principled way of solving data engineering challenges.
Performance. ZIO 2 retains its focus on high-performance commercial functional programming, achieving some significant improvements. Although there is more work to be done, particularly on the new runtime system and streams, ZIO 2 is very fast.

While it’s impossible to cover the scope of all of these improvements in a single post, I’m going to give you a tour of the highlights for each area, beginning with the one that’s sure to be the biggest surprise to ZIO users: the brand-new runtime system!

Runtime System

The runtime system is the engine that powers ZIO, executing ZIO effects in a way that preserves all the guarantees that ZIO provides around efficiency, resource-safety, concurrency, error handling, and asynchronicity.

The runtime system is the single most important component of any functional effect system: it must be bulletproof, performant, and tractable for reasons of maintenance, testing, and auditing.

ZIO World 2021 previewed some improvements to the ZIO runtime system, but these improvements have now been obsoleted by a complete rewrite of the ZIO runtime system.

ZIO 2 now features the world’s first third-generation engine for any Scala effect system, completed, tested, and unveiled just in time for the release of ZIO 2 final.

A Brief History of Engines

The first-generation engines included Scalaz Task and initial versions of Monix. They utilized an advanced technique called trampolining to achieve stack safety and support asynchronous execution without direct runtime support. Despite their power and benefits, they suffered from immature cancellation models, and they had no concept of fibers—they lacked pure forking and pure cancellation, and they did not have a well-developed model for resource safety in the presence of automatic cancellation.

A pre-release version of ZIO unveiled the first second-generation engine, with a clear fiber-based execution model, pure forking and cancellation, and integrated, cancellation-aware resource safety. Cats Effect 1.0 underwent an eleventh hour makeover to support these second-generation features (Cats Effect 3.x replicates more ZIO 1 features but does not change the core design).

For a long while, I did not think there would be a third-generation engine, as the fundamental approach to designing runtimes for effect systems has remained unchanged for years.

In working on ZIO 2, however, I was determined to improve integration performance. We know from experience that Java libraries (and some Scala libraries) frequently need to call into ZIO applications, and execute their effects. For example, a Java library that places an element into a ZIO queue needs to execute the effect returned by Queue.offer.

My interest in accelerating these integration use cases led me to develop a so-called fast-path interpreter, designed specifically to accelerate these (predominantly synchronous) use cases.

Fast-Path

The fast-path interpreter I designed for ZIO 2 eschewed traditional effect system architecture. For one, it did not use a trampoline, which meant it could not handle asynchronous effects.

Executing effects directly on the JVM stack, and only supporting a few of the ZIO operations, the fast-path interpreter tried to make as much progress as possible on its own, before ultimately giving up and delegating to the normal ZIO runtime system.

The fast-path interpreter was able to completely execute integration effects (like adding an element to a queue). This provided a significant performance boost in realistic scenarios, up to 100x faster than Cats Effect 3.x, and several times faster than ZIO 1.x.

I was pleased by these results, and I intended the fast-path interpreter to be a defining feature of ZIO 2, offering best-in-class performance for interop use cases. But I was also disappointed at having to maintain two runtime systems, and deep down, I felt like ZIO’s main runtime system could incorporate some learnings from the fast-path interpreter.

The only problem is, I didn’t know how, exactly.

Asynchronous Hell

The primary reason the fast-path interpreter couldn’t be used as the main runtime system is the existence of asynchronous operations. When a fiber executes an asynchronous operation, it must be suspended, with further instructions stored on the heap in a so-called reified stack (a todo list, of sorts, keeping track of what we are supposed to do after the async operation completes).

By the first quarter of 2022, ZIO 2 was almost complete, having just eliminated ZManaged, in favor of a radically simpler, faster, and more powerful design using scopes.

In theory, we could ship ZIO 2 whenever we wanted, after we polished documentation and interfaces. Yet, I couldn’t resist the urge to experiment with a different kind of engine, based on my experience designing and optimizing the fast-path interpreter.

A Breakthrough

On a long and grueling trans-Atlantic trip in April, I achieved a milestone: I prototyped a hybrid engine that tried to execute effects directly on the JVM stack, but which could also handle async operations by carefully and gradually unwinding the JVM stack, saving instructions for further processing on the heap in a reified stack.

In purely synchronous code, this prototype interpreter demonstrated faster performance than every second-generation engine, including ZIO 1 and Cats Effect 3.x. Unlike the fast-path interpreter, however, the prototype could also handle asynchronous operations. I quickly generalized the prototype to handle infinite tail-recursion using the same technique.

This crude prototype, however promising it might have been, was a world away from being a production-grade ZIO runtime system. Production runtime systems have to handle interruptibility (precise regions in which a fiber may or may not cancel an executing effect), finalization (executing cleanup actions during errors or interruption), concurrency, error reporting, error accumulation, stack trace generation (an obstacle as big as async!), and so much more.

There was no possible way to simply “graft” the prototype into the ZIO runtime system. The design was radically different and totally incompatible. So I had a choice: push the new design as far as possible, to see if it could become more than a toy; or abandon the prototype altogether.

As you can probably tell by the delayed release date for ZIO 2, I chose the former option!

Blank Sheet

Though daunted by the prospect of rewriting the ZIO runtime system (and potentially blocking the release of ZIO 2 by weeks or even months), I was also intrigued by the possibility of rethinking some fundamental design decisions made in the ZIO 1 runtime system.

ZIO 1 used immutable data to represent large parts of the fiber state, which allowed use of atomic references to coordinate changes across disparate parts of the fiber state. Unfortunately, the use of immutable data also led to increased heap churn and GC.

ZIO 1 also touched the heap more often than theoretically necessary in the hot path, accessing no less than two volatile variables for each iteration of its run loop (this grew to three volatile variables in the series/2.x branch, when fiber dumps were turned on by default).

If I was starting from a blank slate, I wanted the new runtime system to be able to use mutable data, and to minimize accessing the heap (especially volatile variables).

The New Design

After an intense series of weeks in which I spent nights and weekends writing and rewriting the new runtime system, and with the help of my colleague Adam Fraser at ZIO Hackathon Scotland, I came up with a design for the ZIO 2 runtime system that achieved all of my goals.

In the coming weeks, I will give a talk about the new design, but for purposes of the ZIO 2 release, it’s enough for me to highlight the achievements of this design, which include:

Microkernel. ZIO 1 achieved a remarkably small kernel by expressing parts of the ZIO execution semantics in ZIO itself. The ZIO 2 runtime system takes this technique and augments it with a hybrid declarative/executable encoding that reduces the size of the microkernel by 30%. The runtime system is leaner and meaner than any effect system in history.
Trampoline Avoidance. The ZIO 2 runtime system attempts to fully execute effects without a trampoline, and will only fallback to a reified stack in the presence of an asynchronous operation, an overly deep call stack, or a fiber that is hogging execution time. In Loom, there is never a need for asynchronous operations, which makes the ZIO 2 runtime system highly optimized for Loom.
Mutable State. The ZIO 2 runtime system utilizes a nano-actor core, which allows it to provide concurrent-safe implementation in the presence of fully mutable state, decreasing allocations and GC pressure. I have been and remain a vocal critic of actors in user-land code, but I like to think that use of actors in ZIO 2 is vindication that the actor model is useful in the design of concurrent and distributed systems (albeit as a private implementation detail).
Local State. The ZIO 2 runtime system stores all runtime flags (including interruptibility), as well as the line of code it is currently executing, on the JVM stack (not the heap), and as a result, only needs to touch the heap to process new messages in its message queue, which are infrequent and arise primarily from fiber-to-fiber interaction.
Cost-Free Tracing. The ZIO 2 runtime system achieves truly cost-free async stack traces for the happy path (when there are no errors). Given the 1.x runtime imposed 2-3x overhead on the happy path, this is a phenomenal achievement that will make a noticeable improvement in the behavior of real world ZIO applications, which have almost universally been deployed with tracing enabled.

The end result of this design is the world’s first third-generation engine for functional effect systems. The design achieves all the benefits of second-generation engines, but with an extreme amount of happy-path specialization that obviates the need for trampolining in many cases, and a super clean and minimal design that provides some assurance of correctness and stability.

The new runtime system has proven so solid, we’ve enabled previously ignored tests and added new test suites to give more rigorous semantics to interruption behavior.

Benefits

As of the date of this writing, we are just weeks away from the moment when the new runtime system first successfully passed the battery of automated tests that helps verify runtime system integrity.

As a result of the compressed time frame and our intense focus on correctness and stability, we are still early in the process of tweaking, tuning, and optimizing the new runtime system.

That said, the benchmarks point to one undeniable result: the ZIO 2 runtime system is very fast at synchronous workloads.

The following benchmark compares ZIO 2 with both ZIO 1 and Cats Effect 3 on a fibonacci example that is frequently used to benchmark functional effect systems:

Benchmark        (depth)   Mode  Cnt     Score     Error  Units
ZIO 1.x               20  thrpt   20   647.268 ±  12.892  ops/s
Cats Effect 3.x       20  thrpt   20   947.935 ± 124.824  ops/s
ZIO 2                 20  thrpt   20  1903.838 ±  21.224  ops/s

In fairness, and in the spirit of full transparency, I will note that ZIO 2 is currently slower than both ZIO 1 and Cats Effect 3 on async heavy workloads (workloads that always require trampolining), which was expected given the costs of lazy trampolining.

However, we are confident enough in our plan for improving async performance and in the importance of optimizing for Loom that we believe the ZIO 2 runtime system sets a new standard for innovation in functional effect systems.

In the coming weeks following the release, we will push forward the performance of the new runtime system, and publish a full analysis of its capabilities, with a special emphasis on what Loom-powered applications can expect to achieve under ZIO 2.

Developer Experience

Since ZIO 1 launched, we’ve seen new contributors like Kit Langton join the project with fresh ideas and a heightened passion for making ZIO a joy to use for beginners and experts alike.

We’ve also had a chance to see how our previous attention to developer experience resulted in significant adoption and happy users, and I think this motivated all contributors (including myself) to obsess over how we can make the experience of using ZIO 2 delightful, unsurprising, and familiar.

In the following sections, I’ll summarize some of the key improvements to developer experience.

Environment

In ZIO 1, the environment often ended up looking like Has[UserRepo] with Has[ConnectionPool] (Has-spaghetti!), and the environment APIs were confusing, because some methods expected you to wrap your services in Has, and others expected you to not wrap your services in Has.

In ZIO 2, the Has data type has been refactored into a more straightforward data type called ZEnvironment, which uses a phantom type parameter to track its contents. ZEnvironment itself has been baked into ZIO, so most users never need to know that it exists.

As a result, ZIO environments now have a super clean appearance and much cleaner API:

val effect : ZIO[UserRepo with ConnectionPool, Throwable, Unit]

Concurrent with this change, ZIO environment now embraces subtyping, which means that if you stick a Dog in the environment, you can now get both a Dog and an Animal out of the environment. Previously, if you stuck a Dog in the environment, you could only get a Dog out of the environment (even when Dog was a subtype of Animal), due to the invariance of Has.

To make this interface sound, ZIO 2 uses metaprogramming to prevent you from retrieving intersections from the environment (you can safely ignore this if you don’t know type theory!).

Layers

In ZIO 2, Kit Langton redesigned layers to tackle one of the biggest pain points of ZIO 1. With the aid of metaprogramming, ZIO 2 wires your layers together automatically.

This type-safe dependency injection is capable of wiring up an entire application dependency graph, parallelizing construction to minimize startup times, and appropriately handling asynchronous initialization, initialization failures, and resource-safety.

If you fail to specify all layers required by an effect, ZIO 2 will give you detailed, rich error messages, which guide you to satisfying all requirements of your effect.

ZIO 2 also deletes almost all ways to create layers, instead favoring very simple and predictable patterns of layer construction that can be used for simple or complex services.

Auto-Blocking

ZIO 1 was the first effect system to introduce the concept of a dedicated blocking pool, and operators to shift effects to run on this pool. While controversial, eventually other Scala effect systems replicated this behavior, and now the feature is standard.

The separate blocking pool allows you to make sure that blocking operations do not accidentally use up the worker threads from ZIO’s main thread pool, allowing you to achieve higher application throughput from the resulting increased efficiency.

While important for creating high-performance applications, the need to manually shift blocking workloads to the blocking thread pool has proven a usability hurdle, because users often are not aware of which Java methods block threads, and which do not. If they make mistakes, then application throughput suffers, without users being aware of what’s going on.

In ZIO 2, the new fiber-aware scheduler designed by Adam Fraser will monitor the execution of fibers and intelligently learn which fibers engage in blocking behavior. The scheduler shifts these blocking fibers automatically to the blocking thread pool. This innovative, smart behavior enables users to achieve great throughput while still focusing on business logic, without having to know exactly which methods may end up executing blocking IO operations.

Consistency

In ZIO 2, we have gone through every data type and every API, trying to make naming and organization as consistent as possible, so that your knowledge of one part of ZIO will transfer to other parts of ZIO. The greater consistency will teach your brain to see patterns between different methods and different data types—patterns that were there before, but which were obfuscated by ad hoc naming differences.

Resources

In ZIO 2, Adam Fraser successfully deleted the ZManaged data type (or, more precisely, moved it out of core and into a legacy package), because we found that ZIO environment was powerful enough to provide the same resource-safety guarantees, but with just ZIO.

Using a new concept called scopes, ZIO effects can now easily indicate in their types whether they leave resources open that need to be closed later, and a few simple operators allow closing scopes to ensure that all open resources are safely released.

In addition to improving performance, this change makes it easier for beginners to start writing resource-safe code, and it also improves correctness because there are no duplicated implementations for concurrent methods, which there were with ZManaged. Finally, the change improves expressiveness, because ZIO has many more methods than ZManaged, and now you can use all of these methods directly with resources.

Architecture

In ZIO 2, we made a conscious decision to embrace object-oriented architecture, where you define components of your application using interfaces, and you implement them with classes, which use their constructors to express their requirements on other interfaces.

This shift toward module-oriented dependency management is familiar to Java programmers, eliminates the need for accessors, and frees up ZIO environment for transient contextual information, or plumbing dependencies through library code.

Monitoring

In ZIO 2, we’ve spent a lot of time thinking about what happens after you deploy your application into production. This thinking has translated into numerous improvements designed to help you (or, in some cases, your ops team) have a great experience, led by Andreas Gies, myself, and others.

Traces

ZIO 1 pioneered asynchronous traces and execution tracing, inspiring Cats Effect to replicate the functionality into their own effect type.

While traces were a game-changer, they had a number of drawbacks, chiefly that they did not look anything like exception traces, and that they imposed a 2x or worse performance overhead on application code.

In ZIO 2, traces have been completely redesigned to emulate the appearance of Java stack traces, and they use a novel implementation in the runtime system that results in no performance overhead in the happy path.

Async traces are always turned on, and cannot be turned off, ensuring that every error which occurs in a ZIO application comes with rich context to help diagnose what went wrong.

Fiber Dumps

ZIO 1 also pioneered fiber dumps, which enable printing out a dump of fibers to see which fibers are running, which are suspended, and which are blocked by other fibers. However, fiber dumps in ZIO 1 were optional and required ZIO ZMX.

In ZIO 2, fiber dumps are now enabled by default, and require no configuration or libraries.

Logging

The ZIO runtime system has a lot of incredibly useful information, including what fiber your code is running on, context associated with the fiber, the line of code every fiber is executing, and much more.

In an effort to make this rich information more available for monitoring, ZIO 2 incorporates a logging front-end, which is expected to become the default logging facade for all ZIO 2 libraries and applications.

In a separate project called ZIO Logging, users can tie ZIO 2’s logging into whatever logging backend they prefer, from Logback to Java logging and everything in between.

Metrics

ZIO 2 features integrated metrics, including the ability to create user-defined metrics out of a variety of metric types (histograms, summaries, frequencies, counters, and gauges).

ZIO 2’s runtime system utilizes these metrics to provide detailed real-time data on running ZIO 2 applications, including fiber life times, fiber exit failures, fiber errors, and fiber counts.

In a separate project called ZIO Metrics Connectors, users can plug ZIO 2’s metric system into whatever monitoring solution they choose, including NewRelic, DataDog, and Prometheus.

Streaming

ZIO 1’s model of streaming was based on three distinct data types: streams (which produce values), transducers (which transform values), and sinks (which consume values).

If you wanted something else, or perhaps a hybrid that combined aspects of these data types, you had to build your own data type. At the same time, it was quite apparent to those of us who worked on the internals of streams that these data types are related to each other.

Thanks to herculean efforts by Daniel Vigovszky, Itamar Ravid, and Adam Fraser, this has all changed in ZIO 2, which features a completely overhauled streaming engine.

We evaluated many potential foundations for streaming, before channels emerged as the clear winner. A channel has two ends (a bit like a real-life pipe), and reads information on one end, and writes it on the other. Technically, channels read both termination and elements from an upstream channel, and write termination and elements to a downstream channel. The fundamental composition operator is connecting two channels together to form a composite channel.

In ZIO 2, streams, pipelines (the successor for transducers), and sinks are all thin wrappers around channels, creating a common unified code base that minimizes duplication. The runtime system for channels can fully optimize whole streaming pipelines, because ultimately, a fully composed pipeline (even one that is constructed from streams, pipelines, and sinks) is still just a channel.

Channels provide a rigorous foundation for ZIO 2 streams. Although our work on streams is far from done (in particular, we will improve performance and compositionality in 2.1), ZIO 2 channels present a giant leap from ZIO 1 streams, and contain the most rigorously complete yet minimal foundation for streaming in the entire Scala ecosystem.

Performance

The new runtime system in ZIO 2 was introduced primarily for performance reasons, and while more work remains to be done here around async workloads, initial results are amazing. ZIO 2 is a perfect fit for a post-Loom world, and its hybrid engine has set a new standard for functional effect systems.

Beyond the new runtime system, ZIO 2 introduces new high-performance data types like Hub and Pool, a new fiber-aware scheduler that is directly inspired by Tokio (an asynchronous runtime for Rust), much faster resource safety, faster async stack tracing, higher throughput due to auto-blocking, faster layer construction, and so much more.

Conclusion

ZIO introduced the Scala world to the power of commercial functional programming, with an innovative feature set that inspired and set the standard for other effect systems.

ZIO 2 builds on this tradition by acknowledging and learning from mistakes, and innovating in numerous areas, with the result that ZIO 2 is tremendously more powerful, far easier to use, and significantly easier to operate in production.

If you have been skeptical of what functional programming can bring to the table for your business, then now is your chance to put functional Scala to the test.

You won’t be disappointed. And if you are disappointed, we’ll happily listen to your feedback for ZIO 3!

John A De Goes