Sanity of Historical Data

ashwinjain1
Hello,

I would like to know whether the daily historical data provided by Zerodha is guaranteed to be accurate. I have just subscribed to the historical data service and downloaded daily data from 2001 to 2006. After working with it for a while, I found that quite a few daily candles are missing for multiple scrips. I'm not sure whether the problem is my usage of the API or the data itself.

For example, data between 01/01/2003 and 24/01/2003 is missing for many scrips.
Also, some days have duplicated data for the same scrip.

Please help.
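
For anyone who wants to reproduce this kind of check, here is a minimal sketch of how missing and duplicated daily candles could be listed. The function name `audit_daily_candles` and the sample dates are made up for illustration, and the script only knows about weekends, not exchange holidays, so every "missing" date still has to be checked against the NSE holiday list by hand.

```python
from collections import Counter
from datetime import date, timedelta

def audit_daily_candles(candle_dates, start, end):
    """Return (missing_weekdays, duplicated_dates) for the range [start, end]."""
    counts = Counter(candle_dates)
    # A date that occurs more than once has duplicated candles.
    duplicated = sorted(d for d, n in counts.items() if n > 1)
    missing = []
    day = start
    while day <= end:
        # weekday(): Monday=0 ... Friday=4. Weekends are skipped.
        # Exchange holidays are NOT known here and will be reported as missing.
        if day.weekday() < 5 and day not in counts:
            missing.append(day)
        day += timedelta(days=1)
    return missing, duplicated

# Made-up candle dates, for illustration only
dates = [date(2003, 1, 1), date(2003, 1, 2), date(2003, 1, 2), date(2003, 1, 6)]
missing, duplicated = audit_daily_candles(dates, date(2003, 1, 1), date(2003, 1, 7))
print("missing:", missing)
print("duplicated:", duplicated)
```

Running this per scrip over the downloaded range gives a list of dates to compare against another source.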
  • Shaha
    If it's just the daily data that bothers you, cross-check it against the NSE bhavcopy archive or Quandl data.
  • ashwinjain1
    Dear Shaha,

    There seem to be some glaring differences between the data provided by NSE (through BhavCopy) and the data retrieved through the Zerodha Historical API.

    For example, below is the data for RELIANCE on 21st May 2004:

        close  date                      high   low    open   volume
     0  82.08  2004-05-21T09:15:00+0530  82.36  80.01  80.09  24606588
     1  82.34  2004-05-21T09:15:00+0530  82.81  80.37  81.22  29001648
     2  81.38  2004-05-21T09:15:00+0530  83.10  78.10  81.70  38611600
     3  81.74  2004-05-21T09:15:00+0530  83.40  81.41  81.94  21571560
     4  83.21  2004-05-21T09:15:00+0530  83.69  81.42  83.67  26980928
     5  82.21  2004-05-21T09:15:00+0530  83.87  81.82  82.36  24861068
     6  83.20  2004-05-21T09:15:00+0530  83.87  82.28  82.53  17186688
     7  81.74  2004-05-21T09:15:00+0530  84.63  81.41  84.63  18890984
     8  82.27  2004-05-21T09:15:00+0530  84.80  81.99  84.48  23363088
     9  84.22  2004-05-21T09:15:00+0530  85.12  82.36  82.36  26446700
    10  81.92  2004-05-21T09:15:00+0530  85.37  80.69  84.64  35619668
    11  84.76  2004-05-21T09:15:00+0530  85.58  84.25  84.99  14067340
    12  81.10  2004-05-21T09:15:00+0530  85.62  80.11  84.44  28834780
    13  85.31  2004-05-21T09:15:00+0530  85.67  82.17  82.38  27143296
    14  84.21  2004-05-21T09:15:00+0530  85.96  83.52  85.96  19679360
    15  84.47  2004-05-21T09:15:00+0530  86.30  83.80  85.77  18215632
    16  86.19  2004-05-21T09:15:00+0530  86.44  83.59  84.25  29062976
    17  86.01  2004-05-21T09:15:00+0530  86.52  83.14  85.09  23610880
    18  85.02  2004-05-21T09:15:00+0530  86.63  84.56  85.58  22831880
    19  85.93  2004-05-21T09:15:00+0530  86.71  84.26  85.20  27027636
    20  85.35  2004-05-21T09:15:00+0530  86.76  85.07  86.53  20544960
    21  86.28  2004-05-21T09:15:00+0530  87.07  82.67  85.77  32026488
    22  86.96  2004-05-21T09:15:00+0530  88.61  86.22  87.09  27379240


    Notice how 23 candles are timestamped identically while the data is completely different.
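
A check for this kind of conflict can be sketched in a few lines. This is only an illustration: it assumes each candle is a dict keyed by the column names shown in the output above, and `conflicting_timestamps` is a made-up helper, not part of any Zerodha library. Only the first three quoted rows are used to keep it short.

```python
from collections import defaultdict

def conflicting_timestamps(candles):
    """Return timestamps that carry more than one distinct candle."""
    by_ts = defaultdict(set)
    for c in candles:
        # Identical repeats collapse in the set; only differing rows add up.
        by_ts[c["date"]].add((c["open"], c["high"], c["low"], c["close"], c["volume"]))
    return {ts: sorted(rows) for ts, rows in by_ts.items() if len(rows) > 1}

# First three rows of the RELIANCE output quoted above
candles = [
    {"date": "2004-05-21T09:15:00+0530", "open": 80.09, "high": 82.36,
     "low": 80.01, "close": 82.08, "volume": 24606588},
    {"date": "2004-05-21T09:15:00+0530", "open": 81.22, "high": 82.81,
     "low": 80.37, "close": 82.34, "volume": 29001648},
    {"date": "2004-05-21T09:15:00+0530", "open": 81.70, "high": 83.10,
     "low": 78.10, "close": 81.38, "volume": 38611600},
]
conflicts = conflicting_timestamps(candles)
for ts, rows in conflicts.items():
    print(ts, "has", len(rows), "different candles")
```

Exact repeats of the same candle are not reported, so anything this prints is a genuine conflict rather than a harmless duplicate.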

    Also, there is an issue with the 'SERIES' column provided by NSE. Since this column is dropped in the Zerodha-provided data, it is not possible to tell apart candles for the same scrip under different series.

    Example: Below is the NSE-provided data for RELIANCE on 5th Jan 2001

    SYMBOL    SERIES  OPEN   HIGH   LOW    CLOSE  LAST   PREVCLOSE  TOTTRDQTY  TOTTRDVAL   TIMESTAMP
    RELIANCE  BE      359    361.5  359    361.5  361.5  338.2      40000      14385000    05-Jan-01
    RELIANCE  EQ      356.5  366.8  356.5  364.3  363.4  357.8      7460148    2701229190  05-Jan-01

    and below is the data for the same date as provided by Zerodha:

        close   date                      high    low     open    volume
    0   180.75  2001-01-05T09:15:00+0530  180.75  179.50  179.50  80000
    1   182.15  2001-01-05T09:15:00+0530  183.40  178.25  178.25  14920296

    Notice how the data provided by Zerodha shows up as duplicate candles for the same date, since the 'SERIES' column is missing.
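
One possible workaround is to recover the series by matching each Zerodha candle against the bhavcopy rows. In the two tables above, every Zerodha price is exactly half the NSE price and every Zerodha volume exactly double, which would be consistent with some price adjustment being applied (that reading is a guess, not something confirmed here). The sketch below assumes such a single scale factor per candle; `match_series`, the dict layout, and the tolerance are made up for illustration.

```python
def match_series(zerodha_rows, nse_rows, tol=1e-6):
    """Pair each Zerodha candle with the NSE series row that fits one scale factor."""
    matches = []
    for z in zerodha_rows:
        for n in nse_rows:
            # Assumption: NSE price = Zerodha price * factor,
            # and Zerodha volume = NSE volume * factor.
            factor = n["close"] / z["close"]
            prices_fit = all(
                abs(n[k] - z[k] * factor) <= tol * n[k] for k in ("open", "high", "low")
            )
            volume_fits = abs(n["volume"] * factor - z["volume"]) <= tol * z["volume"]
            if prices_fit and volume_fits:
                matches.append((n["series"], factor))
                break
        else:
            matches.append((None, None))  # no NSE row fits this candle
    return matches

# Rows for RELIANCE on 5th Jan 2001, taken from the two tables above
nse_rows = [
    {"series": "BE", "open": 359.0, "high": 361.5, "low": 359.0, "close": 361.5, "volume": 40000},
    {"series": "EQ", "open": 356.5, "high": 366.8, "low": 356.5, "close": 364.3, "volume": 7460148},
]
zerodha_rows = [
    {"open": 179.50, "high": 180.75, "low": 179.50, "close": 180.75, "volume": 80000},
    {"open": 178.25, "high": 183.40, "low": 178.25, "close": 182.15, "volume": 14920296},
]
result = match_series(zerodha_rows, nse_rows)
print(result)
```

Once each candle is labelled, keeping only the EQ rows would leave one candle per day, provided the bhavcopy is available for every affected date.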

    I would like to know if there is a way to work around such issues with the Zerodha-provided data. For example, is there a particular date after which the data provided by Zerodha is guaranteed to be accurate?

    Regards,
    Ashwin

  • ashwinjain1
    Can someone from Zerodha please comment on the question above?