Why Backtests Run Fast or Slow: A Comparison of Zipline, Moonshot, and Lean
Sun Jul 30 2023 by Brian Stanley

Backtest speed can significantly affect research friction: the faster you can form a hypothesis and get an answer from a backtest, the more hypotheses you can investigate. In this article, I explore several factors that affect backtest speed and compare the performance of 3 open-source backtesters.
The backtesters I compare are:
- Moonshot, a vectorized backtester written in Python and used in QuantRocket
- Zipline, an event-driven backtester written in Python, originally developed by Quantopian and used in QuantRocket
- Lean, an event-driven backtester written in C# and used in QuantConnect
The factors I investigate for their impact on speed are:
- Universe size
- Hardware
- Backtester design (event-driven vs vectorized)
- Programming language
Large vs Small Universes
The larger your universe of securities, the more data the backtester must handle, and the slower it will run. To quantify this, I run a 10-year backtest of a fundamental factor strategy that selects high-quality stocks as measured by the Piotroski F-Score, an indicator of a firm's financial health. For one set of tests, I use a small, static universe of 10 securities. For the other set of tests, I use the entire US equities market of approximately 8,000 securities, out of which I select the top 1,000 by market capitalization, then buy all stocks with F-Scores of 7 or higher (a sketch of this selection logic follows).
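To make the large-universe test concrete, here is a hedged sketch of the selection logic in pandas. It is not the actual strategy code from the tests; the function name and the `market_caps` and `f_scores` DataFrames are hypothetical, each assumed to have one row per trading day and one column per security:

```python
import pandas as pd

def select_portfolio(market_caps: pd.DataFrame, f_scores: pd.DataFrame) -> pd.DataFrame:
    # Hypothetical inputs: one row per trading day, one column per security.
    # Keep the 1,000 largest securities by market cap on each day.
    in_top_1000 = market_caps.rank(axis=1, ascending=False) <= 1000
    # Within that universe, hold every stock with an F-Score of 7 or higher.
    signals = in_top_1000 & (f_scores >= 7)
    return signals.astype(int)  # 1 = long, 0 = flat
```

The runtimes are as follows: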
All 3 backtesters process the small universe in under 30 seconds but take considerably longer to process the large universe. Moreover, the larger universe results in a large variation in speed among the different backtesters, with Lean running 12x slower than Zipline and 75x slower than Moonshot. The takeaway is that backtest speed is a negligible factor when your universe is small, but it can become a major factor when your universe is large.
Intel vs Apple Silicon
For all subsequent tests, I run only the large universe strategy.
To demonstrate how hardware can affect backtest speed, I repeat the Zipline and Moonshot backtests on two different Apple computers: an older Mac Pro with an Intel chip and a newer MacBook Pro with an Apple Silicon (M1) chip. The runtimes are as follows:
The Mac with the Apple Silicon chip runs the Zipline backtest three times faster than the Intel Mac, and it runs the Moonshot backtest four times faster. These results align with the generally glowing reviews that Apple Silicon chips have received since they first began appearing on Macs in late 2020. If you're in the market for new hardware, switching to a new Mac with Apple Silicon can provide an easy performance boost across multiple workflows.
Event-Driven vs Vectorized Backtesters
The above graphic shows that Moonshot runs faster than Zipline regardless of hardware. This can be attributed to the fact that Moonshot is a vectorized backtester while Zipline is an event-driven backtester. Event-driven backtesters run in a loop, feeding market data to your algorithm one bar at a time. Vectorized backtesters instead feed the entire data history to your algorithm at once. By avoiding a separate pass through the interpreter for each bar, vectorized backtesters can hand whole-array computations to optimized, compiled libraries and will usually be faster than event-driven backtesters, all else being equal.
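The contrast is easiest to see in code. The sketch below is illustrative only, using neither framework's actual API; it computes a 50-day moving average both ways:

```python
import numpy as np
import pandas as pd

# ~10 years of synthetic daily closes for one security.
prices = pd.Series(np.random.default_rng(0).lognormal(size=2520))

# Event-driven style: the framework calls your handler once per bar,
# so state (here, a rolling window) must be maintained across calls.
window, mavg_event = [], []
for price in prices:  # one pass through the interpreter per bar
    window.append(price)
    if len(window) > 50:
        window.pop(0)
    mavg_event.append(sum(window) / len(window))

# Vectorized style: the entire history arrives at once, and a single
# pandas call computes every bar's moving average in compiled code.
mavg_vectorized = prices.rolling(50, min_periods=1).mean()

# Both approaches produce the same numbers; only the speed differs.
assert np.allclose(mavg_event, mavg_vectorized)
```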
Python vs C#
The programming language in which a backtesting framework is written can also affect backtest speed. To demonstrate this, I compare the two event-driven backtesters: Zipline (written in Python) and Lean (written in C#). To control for any slowness in the Lean backtest attributable to running on QuantConnect's shared cloud hardware, I run the Zipline test on low-end cloud hardware of similar or lesser quality. (For reference, I also include the Zipline performance on Apple Silicon.) The runtimes are as follows:
It might seem counterintuitive that Zipline runs three times faster than Lean on basic cloud hardware, given that C# is a compiled language which in theory should run faster than Python, an interpreted language. The explanation is that underlying libraries make a critical difference. Zipline is built on top of Python's scientific computing libraries, most importantly numpy and pandas. These libraries are highly optimized and are themselves written in compiled languages. Although C# is natively fast, it lacks a comparable ecosystem of scientific packages. In a real-world application, most of the computational work is not done by the application's own source code but by the underlying libraries it relies on; thus the quality of those libraries is paramount.
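A quick micro-benchmark illustrates the point. Both snippets below compute the mean of one million simulated returns; the pure-Python loop runs in the interpreter, while numpy delegates the same reduction to compiled C and is typically one to two orders of magnitude faster:

```python
import timeit
import numpy as np

returns = np.random.default_rng(0).normal(size=1_000_000)

def mean_pure_python(xs):
    # Interpreted loop: every iteration pays Python's per-operation overhead.
    total = 0.0
    for x in xs:
        total += x
    return total / len(xs)

# The same computation, delegated to numpy's compiled implementation.
print("pure Python:", timeit.timeit(lambda: mean_pure_python(returns), number=10))
print("numpy:      ", timeit.timeit(lambda: returns.mean(), number=10))
```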
A compounding challenge for Lean as a C# backtester stems from the fact that most quants prefer to write their trading strategies in Python rather than C#. To support this preference, QuantConnect uses a library called Python.NET to create a Python wrapper around Lean's C# code. This wrapper doesn't give Lean the benefit of Python's scientific computing libraries; instead, it adds a translation step, marshalling data back and forth between the Python and .NET runtimes, that introduces additional latency into Lean backtests.
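To see what that translation step looks like, here is a minimal sketch of the Python.NET bridge itself, assuming the pythonnet package and a .NET runtime are installed (this is not Lean's code). Each call crosses the Python/C# boundary and pays a marshalling cost, and an event-driven backtest crosses that boundary at least once per bar:

```python
import clr  # the Python.NET (pythonnet) bridge module
from System import DateTime, Math  # .NET types exposed to Python

# Each line below crosses the Python/.NET boundary, converting
# arguments and return values between the two runtimes.
now = DateTime.Now     # C# property access, returned as a wrapped object
root = Math.Sqrt(2.0)  # C# method call; the float is converted going in and out
print(now, root)
```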
Conclusion
Speed is not the only, or even the most important, consideration when choosing a backtester. Nevertheless, quants who want to maximize their research productivity should keep the above factors in mind when choosing a backtester or purchasing new hardware.