Four years of OCaml in production
Using OCaml in a heterogeneous environment to serve 5 million visitors daily.
For the last four years, issuu has been using OCaml to write production systems ranging from targeted ad serving, through real-time document similarity search in a 250-dimensional metric space, to a system for visitor-behavior analysis of web data. We have approximately 50,000 lines of OCaml code running in production.
This experience report focuses on the observations we made along the way, both positive and negative. Finally, we seek to debunk common myths and misconceptions about the use of OCaml in an industrial setting. We hope this will give both OCaml enthusiasts and skeptics feedback that helps guide adoption decisions and informs compiler and library development.
The architectural setting
To provide better context for our findings, a short introduction to the architecture in which we use OCaml is helpful.
At issuu, we have implemented a microservice architecture in which heterogeneous services communicate through a central message bus (AMQP). HTTP(S) requests are forwarded to the message bus, using the path as the routing key, for the services to consume.
We've created a framework to easily create and deploy microservices written in OCaml. The services are built by a CI server (Jenkins) and continuously deployed using Debian packages. Each process is autonomous and handles one message at a time sequentially. Scaling is done trivially outside OCaml by starting multiple processes. Job scheduling and error scenarios are handled by AMQP.
OCaml was initially introduced for its execution speed and ease of refactoring. At the time, the primary programming languages were Erlang and Python, and issuu needed a language that could handle CPU-bound tasks.
We have been happily building and maintaining OCaml systems for the last four years. In this section, we'll go through some of the reasons for this.
Type system eliminates many classes of errors. Compared with languages like Java, the type system is strong enough to enforce Yaron Minsky's mantra: "Make illegal states unrepresentable." We can rule out null values and use algebraic data types to easily enforce constraints on the shape of our data.
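As a rough illustration of the idea, here is a minimal sketch using a hypothetical ad-campaign type. The type, its constructors and its fields are assumptions made for this example and are not taken from issuu's code. Each state carries only the data that makes sense for it, so a draft campaign with an impression count cannot be constructed, and there is no null to check for.

```ocaml
(* Hypothetical example: each constructor carries only the data that is
   meaningful in that state, so invalid combinations cannot be built. *)
type campaign_state =
  | Draft
  | Scheduled of { starts_at : float }
  | Running of { started_at : float; impressions : int }
  | Finished of { impressions : int }

(* Absence is expressed with [option] instead of a null value, and the
   caller is forced by the type checker to handle the [None] case. *)
let impressions = function
  | Draft | Scheduled _ -> None
  | Running { impressions; _ } | Finished { impressions } -> Some impressions
```

The inline record syntax used here needs a reasonably recent compiler (OCaml 4.03 or later).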
Compiler-error-directed refactoring. Though OCaml lacks automatic refactoring tools, we have gotten used to writing code such that extending it in a natural way will break compilation in exactly the locations where decisions must be made. We do this primarily by avoiding wildcard patterns and treating compiler warnings as errors. It also helps to use named arguments and records, which is made ergonomic by OCaml's syntactic sugar for "punning" of identifiers.
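The style described above might look like the following sketch. All names (encoding, request, make_request and so on) are hypothetical and chosen only for illustration. The match has no wildcard case, so adding a constructor to the variant makes it non-exhaustive, which stops the build when warnings are treated as errors. The labelled arguments and record fields use punning.

```ocaml
(* Hypothetical wire encodings supported by a service. *)
type encoding = Json | Msgpack

(* No wildcard case: adding a constructor to [encoding] triggers the
   non-exhaustive-match warning here, which fails the build when
   warnings are treated as errors. *)
let encoding_name = function
  | Json -> "json"
  | Msgpack -> "msgpack"

type request = { path : string; encoding : encoding }

(* Punning: [{ path; encoding }] is short for
   [{ path = path; encoding = encoding }]. *)
let make_request ~path ~encoding = { path; encoding }

let describe { path; encoding } = path ^ " as " ^ encoding_name encoding

(* [~path] is short for [~path:path] when a variable of that name is in scope. *)
let example =
  let path = "/search" and encoding = Json in
  describe (make_request ~path ~encoding)
```

Adding a field to the request record has a similar effect: every place that constructs the record with a record expression stops compiling until the new field is supplied.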
Easy deployment of native binaries. Compared with interpreted languages like Python and Erlang, deployment of OCaml is pretty simple. Only a single binary needs to be copied to the production system; it does not need a complete OCaml environment, but depends only on a few system libraries. The Debian packaging system automatically detects system-library dependencies.
Predictable performance. OCaml looks like a very high-level language, but the abstractions it provides are surprisingly transparent. A trained eye can glance at OCaml source code and know exactly where memory allocation will occur and how data will be represented in memory. We have not observed the garbage collector causing high variance in response times.
There is always a way out. OCaml has an abundance of language features, often similar and overlapping. This can save us from having to rearchitect the entire program just to add a feature that doesn't fit well in our chosen structure. This way, we can build up technical debt to meet a deadline and pay it off later.
Our favorite examples are mutable record fields, polymorphic variants, first-class modules and when all else fails,
Polymorphic compare and hash. According to their type signature, OCaml's comparison and hash functions can take values of any type as input. However, they will fail at run time if those values contain closures or other forbidden values such as pointers outside the OCaml heap. This behavior is unsafe but immensely practical. It saves us from hundreds of lines of module-system boilerplate, and it has never caused a crash in production code.
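A small sketch of both sides of this trade-off, using a hypothetical doc record: structural comparison and hashing work on plain data without any per-type boilerplate, while comparing two closures raises Invalid_argument at run time. Only the comparison failure is demonstrated here.

```ocaml
(* Hypothetical record type; no comparison or hash function is written for it. *)
type doc = { id : int; tags : string list }

let a = { id = 1; tags = [ "ocaml" ] }
let b = { id = 1; tags = [ "ocaml" ] }

(* Structural comparison and hashing work on any plain data value. *)
let same = compare a b = 0
let same_hash = Hashtbl.hash a = Hashtbl.hash b

(* The type checker accepts comparing functions, but the runtime refuses. *)
let closure_comparison_fails =
  let f = fun x -> x + 1 in
  let g = fun x -> x + 2 in
  try
    ignore (compare f g);
    false
  with Invalid_argument _ -> true
```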
While we are happy with OCaml, there are some areas where improvements would make our day-to-day programming easier.
Lack of parallelism. OCaml does not offer any primitives for parallelism through multithreading. In many cases, this does not affect us because we tend to distribute work between independent processes. But when large in-process caches are needed, it results in high memory pressure. This can only be alleviated by sharing the caches between processes, which is cumbersome and greatly affects code architecture. Having multithreading primitives could let us explore a more diverse range of architectures when developing services.
Non-standard standard library. The standard-library replacements Core and Batteries are great for projects like our in-house code. But almost none of the packages on OPAM use them, supposedly because their authors are afraid of pulling in large dependencies. This means that many libraries will stack-overflow on large inputs because they use the non-tail-recursive List.map in some place. Additionally, general-purpose structures tend to be reinvented in various libraries and are therefore incompatible even though the intention is the same. One such case is the result type that was merged into OCaml in 4.02.2, but others are still missing; iterators spring to mind.
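For illustration, a tail-recursive map can be assembled from two standard-library functions that are themselves tail-recursive, and a shared result type lets independent libraries agree on how to report errors. This is a sketch only: map_tr and parse_port are hypothetical names, and whether List.map itself is tail-recursive depends on the compiler version in use.

```ocaml
(* Tail-recursive map: [List.rev_map] and [List.rev] both run in
   constant stack space, at the cost of one intermediate list. *)
let map_tr f xs = List.rev (List.rev_map f xs)

(* A list large enough to be a problem for a non-tail-recursive map
   under a small stack limit. *)
let big = List.init 1_000_000 (fun i -> i)
let big_length = List.length (map_tr succ big)

(* With a common [result] type, libraries can exchange errors without
   each defining its own incompatible variant. *)
let parse_port s : (int, string) result =
  match int_of_string_opt s with
  | Some p when p > 0 && p < 65536 -> Ok p
  | Some _ -> Error "port out of range"
  | None -> Error "not a number"
```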
Limited stack trace for exceptions. Stack traces provided by native-compiled OCaml programs are rarely very helpful. This makes it hard to guess the location of a bug based on production log files. We would happily accept an order-of-magnitude performance decrease in exception-handling performance if we could have better stack traces instead.
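For context, the sketch below shows one way to capture whatever backtrace the runtime does provide, using the standard Printexc module, so that it can be written to a log. The handler and its names are hypothetical. How much the trace contains depends on compiling with debug information (the -g flag) and on inlining, which is exactly the limitation described above.

```ocaml
(* Ask the runtime to record backtraces; without this call (or
   OCAMLRUNPARAM=b) the backtrace is empty. *)
let () = Printexc.record_backtrace true

(* Hypothetical message handler that rejects empty payloads. *)
let handle_message payload =
  if payload = "" then failwith "empty payload";
  String.length payload

(* Capture the backtrace immediately in the handler, before any other
   code can raise and overwrite it. *)
let safe_handle payload =
  try Ok (handle_message payload)
  with exn ->
    let backtrace = Printexc.get_backtrace () in
    Error (Printexc.to_string exn, backtrace)
```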
Incomplete support for debugging and profiling. Debugging and profiling often require recompilation to bytecode or with a frame-pointer-emitting version of the OCaml compiler. There are at least three official ways to profile OCaml code, but we have not been able to get proper call-tree data from any of the native-code profiling tools.
When we talk to other companies about OCaml, we often hear various misconceptions. We reflect upon a selection of these below.
Hard to find developers. We have experienced quite the opposite. Mentioning OCaml in our job listings has attracted very good candidates, even if they do not know or use OCaml.
No libraries. OCaml does have fewer libraries than mainstream languages like Python, but in our experience this is mostly because they have less overlap in functionality. Most libraries are of high quality and contain few surprises. We use third-party libraries for connecting through AMQP, MySQL, PostgreSQL, Redis, ZeroMQ, HTTP and Amazon Web Services, and for data serialization with JSON, MessagePack, Protocol Buffers and bin_prot.
Lacking developer tools. The excellent Merlin tool provides IDE-like functionality in Emacs and Vim. These two text editors are not as mainstream as something like Eclipse, but it seems from our experience that developers interested in OCaml are already proficient in either Emacs or Vim.