zipline: Remove Implicit Dependency on Benchmarks and Treasury Returns
Background
Zipline currently requires two special data inputs for simulations: “Benchmark Returns”, which are used to calculate the “Alpha” and “Beta” metrics, among other things, and “Treasury Curves”, which were at one time used as the “Risk Free Rate”, which was part of the Sharpe Ratio calculation.
Since these inputs are required by all simulations, we implicitly fetch them from third-party API sources if they’re not provided by users. We get treasury data from the US Federal Reserve’s API, and we get benchmarks from IEX.
Problems
Implicitly fetching benchmarks and treasuries causes many problems:
- Implicitly fetching means that running simulations requires an internet connection. We try to make this less painful by caching downloaded results and re-using them when possible, but this is only a partial fix, and it means that many users don’t notice the implicit download until it starts causing mysterious problems.
- The APIs we fetch from sometimes fail, leading to confusing behavior for users and spurious bug reports for Zipline maintainers.
- The APIs we fetch from sometimes change in incompatible ways, which breaks older versions of Zipline. This is currently the case for the IEX API we use to fetch benchmarks, resulting in issues like:
- Our default benchmark is US-centric. We default to using SPY as the benchmark, which only makes sense in the US (and even then, only makes sense if you also have historical dividends for SPY, which many users don’t have).
Proposed Solution
I think we should remove these implicit dependencies from Zipline. Treasuries we should just remove, since they’re not actually used anymore. Figuring out what to do with benchmarks is a bit trickier.
Treasuries
Removing treasuries is relatively straightforward because we no longer actually use them. A quick scan of our GitHub issues turns up these issues that should be fixed by the removal:
- https://github.com/quantopian/zipline/issues/324
- https://github.com/quantopian/zipline/issues/119
- https://github.com/quantopian/zipline/issues/144
- https://github.com/quantopian/zipline/issues/2422
I’ve opened a PR at https://github.com/quantopian/zipline/pull/2626 to finally remove all traces of the treasury subsystem.
Benchmarks
Benchmarks are a bit trickier. The benchmark is used in the calculation of the “alpha” and “beta” metrics, and many users are generally interested in comparing the returns of their strategy against a particular benchmark (often an ETF or index of some kind). We also don’t currently have a way to specify a benchmark from the command line, or to define a benchmark asset for a particular bundle.
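To make the dependency concrete, here is a minimal sketch of how “alpha” and “beta” can be estimated from two return series with a simple least-squares fit. This is an illustration only, not Zipline's actual implementation: the function name `alpha_beta`, the zero risk-free rate, and the 252-period annualization are all assumptions made for the example.

```python
import math
from statistics import mean


def alpha_beta(strategy_returns, benchmark_returns, periods_per_year=252):
    """Estimate (annualized alpha, beta) by regressing strategy on benchmark.

    Both inputs are equal-length sequences of simple per-period returns.
    The risk-free rate is assumed to be zero (an assumption of this sketch).
    """
    if len(strategy_returns) != len(benchmark_returns):
        raise ValueError("return series must have the same length")
    if len(strategy_returns) < 2:
        raise ValueError("need at least two observations")

    mean_s = mean(strategy_returns)
    mean_b = mean(benchmark_returns)
    covariance = sum(
        (s - mean_s) * (b - mean_b)
        for s, b in zip(strategy_returns, benchmark_returns)
    )
    variance = sum((b - mean_b) ** 2 for b in benchmark_returns)
    if variance == 0:
        # A constant benchmark has no variance, so the slope is undefined.
        return math.nan, math.nan

    beta = covariance / variance
    # Intercept of the fit, scaled linearly to a yearly figure.
    alpha = (mean_s - beta * mean_b) * periods_per_year
    return alpha, beta


benchmark = [0.01, -0.02, 0.015, 0.0, 0.005]
strategy = [2 * b + 0.001 for b in benchmark]
alpha, beta = alpha_beta(strategy, benchmark)
```

Note that a constant benchmark, including a dummy series of all zeros, has zero variance, so this particular estimator returns NaN for both metrics rather than a meaningful number.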
I think there are a few things we could do to improve the situation here:
- We could add the ability to define a benchmark explicitly when running Zipline via the CLI. We already have the ability to do this internally, but there’s no supported way to control the benchmark via the CLI or via an extension. I think this is necessary pretty much no matter what.
  - Optionally, we could also require the user to tell us what their benchmark is. This would remove the need for implicit fetching of the benchmark. Users who don’t care could pass a dummy benchmark (e.g., of all zero returns).
- Make the benchmark optional. Making the benchmark optional would result in “alpha”, “beta”, and any other benchmark-dependent risk metrics not being populated in Zipline’s output. The tricky thing here is to do this in a way that doesn’t result in performance degradation when running with a benchmark. I think we should either do this or make the benchmark asset required.
- (Short Term) We can fix our IEX API calls for benchmark data to use the updated APIs. This doesn’t fix the systemic maintenance issues associated with the benchmark, but it would at least fix Zipline being straight-up broken for many people, which is its current status. I think the main challenge here is that IEX now requires an API token to work at all, which means we need to provide some mechanism for the user to pass in their API token.
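The difference between the “dummy benchmark” and “optional benchmark” options above can be sketched roughly as follows. The function and key names (`summarize`, `strategy_return`, `benchmark_return`, `excess_return`) are hypothetical and chosen only for illustration; they are not Zipline's output columns.

```python
def cumulative_return(returns):
    """Compound a sequence of simple per-period returns."""
    total = 1.0
    for r in returns:
        total *= 1.0 + r
    return total - 1.0


def summarize(strategy_returns, benchmark_returns=None):
    """Return summary metrics, leaving benchmark-dependent ones unset.

    When no benchmark is supplied, the benchmark-dependent fields stay
    None instead of being computed against an implicitly fetched series.
    """
    metrics = {
        "strategy_return": cumulative_return(strategy_returns),
        "benchmark_return": None,
        "excess_return": None,
    }
    if benchmark_returns is None:
        return metrics

    if len(benchmark_returns) != len(strategy_returns):
        raise ValueError("benchmark must cover the same periods")
    metrics["benchmark_return"] = cumulative_return(benchmark_returns)
    metrics["excess_return"] = (
        metrics["strategy_return"] - metrics["benchmark_return"]
    )
    return metrics


returns = [0.10, -0.05]

# Optional benchmark: benchmark-dependent metrics are simply not populated.
without_benchmark = summarize(returns)

# Required benchmark: a user who doesn't care passes all-zero returns.
with_dummy = summarize(returns, [0.0] * len(returns))
```

In this sketch the presence check happens once per call rather than once per bar, which is one way to keep the with-benchmark path from paying for the optional case.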
About this issue
- State: closed
- Created 4 years ago
- Reactions: 5
- Comments: 26 (19 by maintainers)
Forgive me, @samatix. I forgot I was copying an old patched benchmarks.py file into my docker container. I removed that and it works, now.
I could have sworn I checked out the benchmark branch, @samatix. I’ll double check, but shouldn’t pip install git+https://github.com/samatix/zipline.git@benchmark do the trick?
Hi Scott,
I’ve drafted the proposed solution below based on your suggestions (any remarks you have are more than welcome). I’ll be working on it tomorrow evening and this weekend. My two goals are to post a workaround for the IEX issue quickly and to limit the changes required in zipline as much as possible.
Requirements
Solution
Making the benchmark optional is consistent with the fact that zipline is a specialised backtester (it won’t affect your need at Quantopian to see the performance results in real time). If end users want to analyse their strategy’s performance, they can use Alphalens (to compare the returns against those of a specific benchmark instrument or factor).
I can do the following:
Backtesting with Benchmark Data in CSV/JSON
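As a rough illustration of what user-supplied benchmark data could look like, the sketch below parses per-period returns from CSV using only the standard library. The file layout (a `date` column in ISO format and a `return` column) and the function name `load_benchmark_returns` are assumptions made for this example, not a format defined by zipline.

```python
import csv
import io
from datetime import date


def load_benchmark_returns(csv_file):
    """Read benchmark returns from a file-like object into a date-keyed dict.

    Expects a header row with 'date' (YYYY-MM-DD) and 'return' columns;
    this layout is an assumption of the sketch.
    """
    returns = {}
    for row in csv.DictReader(csv_file):
        day = date.fromisoformat(row["date"])
        if day in returns:
            raise ValueError(f"duplicate benchmark date: {day}")
        returns[day] = float(row["return"])
    # Return the entries in chronological order.
    return dict(sorted(returns.items()))


sample = io.StringIO(
    "date,return\n"
    "2020-01-03,-0.007\n"
    "2020-01-02,0.008\n"
)
benchmark_returns = load_benchmark_returns(sample)
```

Reading from a local file this way needs no network access, which is the main point of supplying the benchmark explicitly.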
Backtesting without Benchmark
Validation Steps