zipline: Getting benchmark data via IEX API does not work anymore

Zipline uses the IEX API to get benchmark data in benchmarks.py:

def get_benchmark_returns(symbol):
    """
    Get a Series of benchmark returns from IEX associated with `symbol`.
    Default is `SPY`.

    Parameters
    ----------
    symbol : str
        Benchmark symbol for which we're getting the returns.

    The data is provided by IEX (https://iextrading.com/), and we can
    get up to 5 years worth of data.
    """
    r = requests.get(
        'https://api.iextrading.com/1.0/stock/{}/chart/5y'.format(symbol)
    )
    data = r.json()

    df = pd.DataFrame(data)

    df.index = pd.DatetimeIndex(df['date'])
    df = df['close']

    return df.sort_index().tz_localize('UTC').pct_change(1).iloc[1:]

However, according to the IEX FAQ page, the chart API was removed on June 15, 2019. Requests to this endpoint for any symbol, such as SPY, now return nothing but an HTTP 403 error. The functionality of the deprecated APIs has moved to the new API, IEX Cloud, which requires a per-user token in every request. Any idea how to fix this issue in the long run?

About this issue

State: closed
Created 5 years ago
Comments: 55 (5 by maintainers)

Commits related to this issue

requirements.txt: fix seaborn version. * fix seaborn to 0.10.1 to resolve conflict * remove obsolete patch: #2480 has been fixed [#2480](https://github.com/quantopian/zipline/issues/2480) — committed to peteut/zipline-env by peteut 3 years ago

Most upvoted comments

Inspired by this comment in #1951, here is a workaround for people who do NOT need benchmarks at all:

By default, zipline downloads benchmark data by making an HTTP request in get_benchmark_returns() in zipline/data/benchmarks.py. It returns a pd.Series, which ensure_benchmark_data() in zipline/data/loader.py saves to a CSV file. So we can create a dummy benchmark file by setting all data entries to zero.

First, replace benchmarks.py with:

import pandas as pd
from trading_calendars import get_calendar

def get_benchmark_returns(symbol, first_date, last_date):
    cal = get_calendar('NYSE')
    
    dates = cal.sessions_in_range(first_date, last_date)

    data = pd.DataFrame(0.0, index=dates, columns=['close'])
    data = data['close']

    return data.sort_index().iloc[1:]

Then in loader.py, replace data = get_benchmark_returns(symbol) with data = get_benchmark_returns(symbol, first_date, last_date).

In this example NYSE is used, but it also works when I use AlwaysOpenCalendar in my backtest, so I did not try changing it to another calendar.

This is only a hack. In the long run I would suggest changing the benchmark download to use a different API, in case you want to use benchmarks in the future.
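
To see what such a dummy benchmark looks like, here is a minimal sketch that builds the same kind of all-zero return series with pandas alone. It assumes pd.bdate_range (plain weekdays, no holiday handling) as a stand-in for the trading calendar, and zero_benchmark_returns is a hypothetical name, not a zipline function.

```python
import pandas as pd


def zero_benchmark_returns(first_date, last_date):
    """Return a Series of 0.0 returns for every weekday in the range.

    ASSUMPTION: pd.bdate_range (Mon-Fri, holidays ignored) stands in for
    the trading calendar sessions used in the workaround above.
    """
    dates = pd.bdate_range(first_date, last_date, tz="UTC")
    return pd.Series(0.0, index=dates, name="close")


returns = zero_benchmark_returns("2019-06-03", "2019-06-14")

# A zero-return benchmark compounds to a flat line at 1.0.
cumulative = (1.0 + returns).cumprod()
print(cumulative.tail(3))
```

Because the benchmark never moves, any benchmark-relative statistic computed against it should not be read as meaningful.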

+36

MarikoKujo on Jun 17, 2019

Another fix is to get a free IEX API key and alter the API request in benchmarks.py:

# IEX_KEY is your own IEX Cloud token, defined elsewhere.
r = requests.get(
        "https://cloud.iexapis.com/stable/stock/{}/chart/5y?chartCloseOnly=True&token={}".format(symbol, IEX_KEY)
    )
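
Rather than hard-coding the key in the library source, one option is to read it from the environment. The sketch below only builds the request URL and makes no network call. The environment variable name IEX_TOKEN and the helper build_iex_chart_url are assumptions for illustration; the endpoint and the chartCloseOnly parameter are taken from the snippet above.

```python
import os
from urllib.parse import quote, urlencode

# Endpoint as used in the comment above.
IEX_CHART_URL = "https://cloud.iexapis.com/stable/stock/{symbol}/chart/5y"


def build_iex_chart_url(symbol, env_var="IEX_TOKEN"):
    """Build the chart URL, reading the token from an environment variable.

    ASSUMPTION: the variable name IEX_TOKEN is hypothetical; use any name.
    """
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(
            "Set the {} environment variable to your IEX Cloud token".format(env_var)
        )
    query = urlencode({"chartCloseOnly": "True", "token": token})
    return IEX_CHART_URL.format(symbol=quote(symbol)) + "?" + query


os.environ["IEX_TOKEN"] = "pk_example"  # placeholder value for the demo only
url = build_iex_chart_url("SPY")
print(url)
```

The resulting string can be passed to requests.get() in place of the formatted URL, which keeps the token out of version control.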

+20

grobbie94 on Jun 27, 2019

My temporary fix (on zipline-live, which is branched off from 1.1.0):

diff --git a/zipline/data/benchmarks.py b/zipline/data/benchmarks.py
index 45137428..72d9c3cc 100644
--- a/zipline/data/benchmarks.py
+++ b/zipline/data/benchmarks.py
@@ -30,8 +30,10 @@ def get_benchmark_returns(symbol):
     The data is provided by IEX (https://iextrading.com/), and we can
     get up to 5 years worth of data.
     """
+    IEX_TOKEN = 'pk_TOKEN_COMES_HERE'  # FIXME: move to param
     r = requests.get(
-        'https://api.iextrading.com/1.0/stock/{}/chart/5y'.format(symbol)
+        'https://cloud.iexapis.com/stable/stock/{}/chart/5y?token={}'.format(
+            symbol, IEX_TOKEN)
     )
     data = json.loads(r.text)

+19

tibkiss on Jun 17, 2019

My take on a solution:

Put proper error handling in place. No vital functionality requires benchmark data, and there’s no reason for a full-on crash if it can’t be reached. Just print a warning message and move on with the backtest.
A hard-coded benchmark from a hard-coded source makes no sense. Using SPY as the benchmark doesn’t make sense either, in particular since this API call doesn’t seem to take dividends into account. If you really need a benchmark, make the symbol and bundle configurable.

People who are serious enough about backtesting to bother with setting up a local Zipline are not very likely to rely on Yahoo, Quandl, Google or other free sources, and they are very likely to use proper benchmarks instead of the price series of an ETF.
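
The first suggestion could be sketched roughly as follows. This is not zipline code: benchmark_returns_or_zeros, its fetch argument (any zero-argument callable that returns a Series of returns), and the simulated failure are all hypothetical names chosen for illustration.

```python
import warnings

import pandas as pd


def benchmark_returns_or_zeros(fetch, sessions):
    """Try to fetch benchmark returns; on failure, warn and use zeros.

    fetch    -- hypothetical zero-argument callable returning a pd.Series
    sessions -- DatetimeIndex to use for the fallback series
    """
    try:
        return fetch()
    except Exception as exc:
        warnings.warn(
            "Benchmark data unavailable ({}); using zero returns.".format(exc)
        )
        return pd.Series(0.0, index=sessions)


def failing_fetch():
    # Simulates the failed download without touching the network.
    raise IOError("HTTP 403")


sessions = pd.date_range("2019-06-17", periods=3, freq="D", tz="UTC")
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    fallback = benchmark_returns_or_zeros(failing_fetch, sessions)
```

Passing the fetcher in as a callable also leaves the data source configurable, in the spirit of the second suggestion.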

+14

AndreasClenow on Sep 17, 2019

Here’s a fix that goes back to Yahoo as a benchmark source. Replace this method in benchmarks.py and don’t forget to change the call to it in loader.py:

import numpy as np
import pandas as pd
import pandas_datareader.data as pd_reader

def get_benchmark_returns(symbol, first_date, last_date):
    """
    Get a Series of benchmark returns from Yahoo associated with `symbol`.
    Default is `SPY`.

    Parameters
    ----------
    symbol : str
        Benchmark symbol for which we're getting the returns.

    The data is provided by Yahoo Finance
    """
    data = pd_reader.DataReader(
        symbol,
        'yahoo',
        first_date,
        last_date
    )

    data = data['Close']

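    # Blank out (or add) these three sessions so they get forward-filled.
    data[pd.Timestamp('2008-12-15')] = np.nan
    data[pd.Timestamp('2009-08-11')] = np.nan
    data[pd.Timestamp('2012-02-02')] = np.nan

    # Sort first: dates not already in the index are appended at the end,
    # and the fill must take the previous session's close.
    data = data.sort_index().ffill()

    return data.tz_localize('UTC').pct_change(1).iloc[1:]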
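
One detail worth checking in code of this shape: a date assigned to a Series that is not already in its index is appended at the end, so forward-filling is safer after sorting. A minimal sketch with made-up prices (closes_to_returns is a hypothetical helper, not part of zipline):

```python
import numpy as np
import pandas as pd


def closes_to_returns(closes):
    """Sort by date, forward-fill gaps, then compute daily returns."""
    closes = closes.sort_index().ffill()
    return closes.pct_change(1).iloc[1:]


# Made-up closing prices with 2019-06-05 missing.
closes = pd.Series(
    [100.0, 110.0, 121.0],
    index=pd.to_datetime(["2019-06-03", "2019-06-04", "2019-06-06"]),
)

# Setting a new label enlarges the Series; the row lands at the END.
closes.loc[pd.Timestamp("2019-06-05")] = np.nan

returns = closes_to_returns(closes)
print(returns)
```

After sorting, the placeholder for 2019-06-05 inherits the 2019-06-04 close, so its return is zero and the next day's return is measured against the right price.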

shlomiku on Jun 18, 2019

@zipper-123 Perhaps you have a faulty cached file. Remember that ensure_benchmark_data in loader.py first attempts to read from disk. That happened to me while I was tinkering with a solution.

I ended up implementing a minor variation of the solution suggested by @marketneutral to sidestep the issue until it’s properly fixed. I kept the signature of get_benchmark_returns, and just used a wider date range than I’ll ever need.

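In benchmarks.py:

from datetime import datetime

import pandas as pd
from trading_calendars import get_calendar

def get_benchmark_returns(symbol):
    cal = get_calendar('NYSE')
    # Use a date range wider than any backtest will need.
    first_date = datetime(1930, 1, 1)
    last_date = datetime(2030, 1, 1)
    dates = cal.sessions_in_range(first_date, last_date)
    data = pd.DataFrame(0.0, index=dates, columns=['close'])
    data = data['close']
    return data.sort_index().iloc[1:]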

And bypassing the cache in loader.py by wrapping the early return in a string literal so it never runs:

    """
    if data is not None:
        return data
    """

AndreasClenow on Jun 21, 2019

Zipline will have to move their data source for SPY data somewhere other than IEX, or change the setup to require an API key. Another way to fix it is:

Sign up here: https://iexcloud.io/cloud-login#/register
Find where your benchmarks.py file is: in ipython, run import zipline, then zipline.__file__
Open the zipline lib in an IDE like Atom
Add your token from the IEX dashboard (instead of pk_numbers... below), and change the requests line to:

    token = 'pk_numbersnumbersnumbers'
    r = requests.get(
        'https://cloud.iexapis.com/stable/stock/{}/chart/5y?token={}'.format(symbol, token)
    )
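
For the step of finding the file, a small sketch using the standard library's importlib.util.find_spec may help. The helper name module_file is hypothetical, and the demo uses a standard-library module so that it runs without zipline installed; with zipline installed, the dotted name would presumably be zipline.data.benchmarks, matching the path shown in the diff earlier in this thread.

```python
import importlib.util


def module_file(module_name):
    """Return the path of the source file that backs module_name."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or spec.origin is None:
        raise ImportError("Cannot locate module {!r}".format(module_name))
    return spec.origin


# Demo with a standard-library module; swap in "zipline.data.benchmarks"
# on a machine where zipline is installed.
path = module_file("json.decoder")
print(path)
```

Note that find_spec imports the parent packages of a dotted name, so the lookup fails with an import error if the package itself is not installed.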

I guess it’s the nature of the beast with financial data that it’s hard to find for free…but annoying.

nateGeorge on Sep 15, 2019

Yes… mine is not a temporary fix. It’s a solution. I’ve been working with it for more than a month.

shlomiku on Jul 18, 2019

The temporary fix posted by tibkiss above (adding an IEX Cloud token to the request in benchmarks.py) resolved it for me (had to add import json though).

Any chance this will be in the official release?

bernino on Dec 21, 2019