Historical data is returning bad data (not consistent)

kkarthik100
When fetching data via the Python Kite Connect historical API, I noticed that some candle values differ between successive fetches made a few seconds apart. The candles that vary are different each time. Sometimes you may not see any difference, but if you repeat the fetch enough times, you can spot the issue. For example, here is code that fetches 5 minute Bank Nifty candles. I have seen this happen for stocks as well and for all timeframes. However, it is easier to reproduce on smaller timeframes, like 5 or 15 minutes.

from kiteconnect import KiteConnect

# Set API_KEY and ACCESS_TOKEN to your credentials.
# 260105 is the symbol token for Bank Nifty.
def Test():
    API_KEY = ''
    ACCESS_TOKEN = ''
    kc = KiteConnect(API_KEY, ACCESS_TOKEN)
    for c in kc.historical_data(260105, '2021-03-30', '2021-07-07', '5minute'):
        candle = [
            c['date'].strftime('%Y-%m-%dT%H:%M:%S+0530'),
            c['open'],
            c['high'],
            c['low'],
            c['close'],
            c['volume']
        ]
        print(candle)

Test()
I saved the code above in a file named failure.py and ran it twice from the (Linux) command line, a few seconds apart:
python3 failure.py > log_1
sleep 5
python3 failure.py > log_2
diff log_1 log_2

And this is the output of the diff command above. Notice carefully the candles that differ.

2706,2707c2706,2707
< ['2021-05-25T09:40:00+0530', 34888.1, 34888.1, 34798.45, 34815.4, 0]
< ['2021-05-25T09:45:00+0530', 34822.4, 34822.4, 34738.15, 34741.9, 0]
---
> ['2021-05-25T09:40:00+0530', 34888.1, 34888.1, 34798.45, 34824.95, 0]
> ['2021-05-25T09:45:00+0530', 34824.4, 34824.4, 34738.15, 34741.9, 0]
2779,2781c2779,2781
< ['2021-05-26T09:30:00+0530', 34559.8, 34577.25, 34518.65, 34528.45, 0]
< ['2021-05-26T09:35:00+0530', 34508.3, 34549.8, 34499.1, 34511.75, 0]
< ['2021-05-26T09:40:00+0530', 34507.55, 34515.8, 34463.25, 34508.1, 0]
---
> ['2021-05-26T09:30:00+0530', 34559.8, 34577.35, 34510.35, 34525.25, 0]
> ['2021-05-26T09:35:00+0530', 34523.9, 34554.9, 34498.35, 34511.45, 0]
> ['2021-05-26T09:40:00+0530', 34505.85, 34515.8, 34463.25, 34508.1, 0]
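
The same comparison can be done inside Python instead of diffing log files. The following is a minimal sketch, not part of the original test: `diff_candles` is a hypothetical helper that assumes each candle is a dict with `date`, `open`, `high`, `low`, `close` and `volume` keys, as used in the script above. The sample rows are the two 2021-05-25 candles from the diff output.

```python
from datetime import datetime

FIELDS = ("open", "high", "low", "close", "volume")

def diff_candles(first, second):
    """Compare two fetches of the same range, matching candles by timestamp.

    Returns a dict mapping each mismatching timestamp to the changed fields
    as (first_value, second_value) pairs. A timestamp present in `first`
    but missing from `second` maps to None. Candles that appear only in
    `second` are not reported by this sketch.
    """
    second_by_date = {c["date"]: c for c in second}
    report = {}
    for candle in first:
        other = second_by_date.get(candle["date"])
        if other is None:
            report[candle["date"]] = None
            continue
        changed = {f: (candle[f], other[f]) for f in FIELDS if candle[f] != other[f]}
        if changed:
            report[candle["date"]] = changed
    return report

# Two candles as they appeared in log_1 and log_2 above.
run_1 = [
    {"date": datetime(2021, 5, 25, 9, 40), "open": 34888.1, "high": 34888.1,
     "low": 34798.45, "close": 34815.4, "volume": 0},
    {"date": datetime(2021, 5, 25, 9, 45), "open": 34822.4, "high": 34822.4,
     "low": 34738.15, "close": 34741.9, "volume": 0},
]
run_2 = [
    {"date": datetime(2021, 5, 25, 9, 40), "open": 34888.1, "high": 34888.1,
     "low": 34798.45, "close": 34824.95, "volume": 0},
    {"date": datetime(2021, 5, 25, 9, 45), "open": 34824.4, "high": 34824.4,
     "low": 34738.15, "close": 34741.9, "volume": 0},
]

report = diff_candles(run_1, run_2)
for date, changed in report.items():
    print(date.isoformat(), changed)
```

With real data, `first` and `second` would be the lists returned by two `kc.historical_data(...)` calls with identical arguments.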
  • kkarthik100
    Is anyone from Zerodha looking into this? I was able to reproduce the issue 5 minutes ago. I am having second thoughts about using Zerodha's historical API for anything serious. I should probably look into other data providers.

    If all of this is "working as intended", an explanation as to why that is so would be welcome.
  • sujith
    We have informed the team to take a look at this.
  • kkarthik100
    Thanks for looking into this.

    I understand that the Kite Connect API / Historical APIs are "no guarantees", "use at your own risk" types of products. But those disclaimers typically concern availability (occasional server errors and such). You have to guarantee something. Sending bad / inconsistent data like what is happening here is definitely a terrible thing, because this is a silent and dangerous error. I happened to notice it in my logs by sheer accident. Your other clients apparently have not noticed this. And how can they? We make the implicit (and valid) assumption that the data the broker gives us is correct. The error above probably also manifests in Kite Web, but then, which discretionary trader is going back a few months on a 5 minute chart, and how would they even know if some candles are bad?

    These have serious ramifications for algorithmic trading, where data is the starting point of everything - backtesting, statistical analysis, etc, etc. If I cannot trust the data, my backtest could be totally bogus, generating false entries and exits.

    Furthermore, the reason I posted this under "General" is that I suspect that this is a backend issue at your end and not tied to any client API in any language.
  • sushantkumar
    Hi team! Please let me know if this warrants a separate post, but we are seeing inconsistencies in our data too. Every day at 15:25, the 5 minute candle we fetch from the historical API and the data we see on the UI do not match. We have noticed this on a couple of days, 7/9/2023 and 8/9/2023. Can this be looked into?
  • rakeshr
    Every day at 15:25, the 5 minute candle we fetch from the historical API and the data we see on the UI do not match.
    Are you polling this request in real time?
    You can create candles at your end from the websocket. Go through this thread.
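
Building candles from ticks, as suggested above, could look roughly like the sketch below. It is an illustration under assumptions, not Zerodha's method: `bucket_start` and `build_candles` are hypothetical helpers, ticks are assumed to arrive in chronological order as `(timestamp, last_price)` pairs that you extract from the websocket payload yourself, and the sample prices are made up.

```python
from datetime import datetime

def bucket_start(ts, minutes=5):
    """Floor a timestamp to the start of its candle interval.

    Assumes `minutes` divides 60 evenly (1, 5, 15, 30, ...).
    """
    return ts.replace(minute=ts.minute - ts.minute % minutes, second=0, microsecond=0)

def build_candles(ticks, minutes=5):
    """Aggregate chronologically ordered (timestamp, price) ticks into OHLC candles."""
    candles = {}
    for ts, price in ticks:
        start = bucket_start(ts, minutes)
        candle = candles.get(start)
        if candle is None:
            # First tick of the interval sets all four prices.
            candles[start] = {"date": start, "open": price, "high": price,
                              "low": price, "close": price}
        else:
            candle["high"] = max(candle["high"], price)
            candle["low"] = min(candle["low"], price)
            candle["close"] = price  # last tick seen so far in this interval
    return [candles[start] for start in sorted(candles)]

# Made-up ticks around the 15:25 candle, for illustration only.
ticks = [
    (datetime(2023, 9, 8, 15, 25, 1), 100.0),
    (datetime(2023, 9, 8, 15, 27, 30), 102.5),
    (datetime(2023, 9, 8, 15, 29, 59), 101.0),
    (datetime(2023, 9, 8, 15, 30, 0), 101.5),
]
candles = build_candles(ticks)
for candle in candles:
    print(candle)
```

Candles built this way reflect only the ticks your connection actually received, so they may still differ from the exchange's final candle if ticks are missed.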
  • mr_easy
    Did you get any resolution for this? It's 2023 and I see that the API is still returning inconsistent data values. The inconsistency is not just a few points either; it is large enough to matter for algos.
  • rakeshr
    I see that the API is still returning inconsistent data values. The inconsistency is not just a few points either
    Can you paste the request and response details so we can debug?