List of data errors in 5 minute historical data

rvsw
In the five-minute data, there are many errors across different stocks where the low and open values are missing. I'm providing a partial list of the errors. There are 37 stocks in the futures and options segment that have these errors. I can upload a full report if you need one and can provide a place to upload it. Can you please advise whether these can be fixed? If not, is there a suggested way for us to fix them ourselves?

Bajaj Finance
ROW=['2015-05-15 09:15:00' 442.0 442.0 0.0 0.0 500]
Shree Cement
ROW=['2015-06-02 09:15:00' 11469.0 11470.0 0.0 0.0 194]
CHOLAFIN
ROW=['2015-07-23 09:15:01' 695.15 695.15 0.0 0.0 69]
NIITTECH
ROW=['2015-07-03 09:15:01' 391.95 391.95 0.0 0.0 93]
  • rakeshr
    @rvsw
    Looking into it. Do you see any pattern in these random stocks and random dates in the 09:15 data?
  • rakeshr
    rakeshr edited October 2018
    @rvsw
    It can also happen that the required scrip did not trade for the first few minutes.
  • rvsw
    @rakeshr thank you for the prompt response. I looked at the 1-minute data that I had downloaded, and it does show that trades have happened. Also, if I add up the volume for the 1-minute data, it does not match the volume in the new five-minute data. However, that is a separate issue. Right now, you at least have to figure out why there are zeros in the data at 9:15 AM.

    Bajaj finance
    15-05-2015 09:15 0 0 0 0 0
    15-05-2015 09:16 439.46 439.46 439.46 439.46 10
    15-05-2015 09:17 439.5 439.5 439.21 439.46 110
    15-05-2015 09:18 441 441 440 440 70
    15-05-2015 09:19 442 442 441 441 310

    CholaFin
    23-07-2015 09:15 0 0 0 0 0
    23-07-2015 09:16 0 0 0 0 0
    23-07-2015 09:17 0 0 0 0 0
    23-07-2015 09:18 0 0 0 0 0
    23-07-2015 09:19 692.85 693.55 692.85 692.85 57

    NIIT Tech
    03-07-2015 09:15 0 0 0 0 0
    03-07-2015 09:16 0 0 0 0 0
    03-07-2015 09:17 0 0 0 0 0
    03-07-2015 09:18 0 0 0 0 0
    03-07-2015 09:19 390.15 390.15 389.95 389.95 21

    Shree Cem
    02-06-2015 09:15 0 0 0 0 0
    02-06-2015 09:16 11425.15 11458.55 11425.1 11458.55 52
    02-06-2015 09:17 11435 11435 11425.15 11425.15 28
    02-06-2015 09:18 11459.95 11460 11438 11440 59
    02-06-2015 09:19 11469 11470 11453.05 11459.95 55
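A minimal sketch of how a five-minute candle could be rebuilt from the 1-minute rows above while skipping the all-zero placeholder rows. It assumes the columns are open, high, low, close, volume (the thread does not state the order), and the function names `is_placeholder` and `aggregate` are hypothetical. This is not how Zerodha builds its candles, only one way to cross-check them.

```python
# Assumed column order for the 1-minute rows: open, high, low, close, volume.
# The thread does not state the order; it is inferred here for illustration.
BAJAJ_FINANCE_1MIN = [
    ("15-05-2015 09:15", 0, 0, 0, 0, 0),
    ("15-05-2015 09:16", 439.46, 439.46, 439.46, 439.46, 10),
    ("15-05-2015 09:17", 439.5, 439.5, 439.21, 439.46, 110),
    ("15-05-2015 09:18", 441, 441, 440, 440, 70),
    ("15-05-2015 09:19", 442, 442, 441, 441, 310),
]


def is_placeholder(row):
    """True when every price field and the volume are zero (no trade recorded)."""
    return all(value == 0 for value in row[1:])


def aggregate(rows):
    """Combine 1-minute rows into one candle, skipping all-zero placeholders.

    Returns (open, high, low, close, volume), or None if nothing traded.
    """
    traded = [r for r in rows if not is_placeholder(r)]
    if not traded:
        return None
    return (
        traded[0][1],               # open of the first traded minute
        max(r[2] for r in traded),  # highest high
        min(r[3] for r in traded),  # lowest low
        traded[-1][4],              # close of the last traded minute
        sum(r[5] for r in traded),  # total volume
    )


candle = aggregate(BAJAJ_FINANCE_1MIN)
print(candle)
```

Comparing the rebuilt candle field by field with the downloaded five-minute row shows which values differ, which may help narrow down where the zeros come from.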
  • rakeshr
    @rvsw
    We have informed the data team; they are looking into it.
  • rvsw
    Hello again
    I'm also seeing gaps in the 5-minute data for the Nifty 50. Since it is five-minute data, there should be 75 entries for every day (with the exception of special days like Diwali). From what I have observed, if there is no trade, Zerodha fills in that entry with 0 volume, but it always provides a separate entry regardless of whether a trade has happened or not.

    With this in mind, there are several days (especially in 2015) where there are fewer than 75 entries. I am appending the list. There is no way for me to upload a file; otherwise, I could have provided this data to you in Microsoft Excel in a better-formatted way. I hope your data team can help out. Thank you again for your help. Please let me know if you need any assistance from me in troubleshooting this.

    2015-07-09 has only 64/75 data for the day.
    2015-07-03 has only 63/75 data for the day.
    2015-11-06 has only 65/75 data for the day.
    2015-07-29 has only 62/75 data for the day.
    2015-07-23 has only 64/75 data for the day.
    2015-07-14 has only 66/75 data for the day.
    2015-09-18 has only 62/75 data for the day.
    2015-10-08 has only 64/75 data for the day.
    2015-10-29 has only 65/75 data for the day.
    2015-09-10 has only 66/75 data for the day.
    2015-06-30 has only 65/75 data for the day.
    2015-11-04 has only 62/75 data for the day.
    2015-08-20 has only 63/75 data for the day.
    2015-08-25 has only 65/75 data for the day.
    2015-08-07 has only 62/75 data for the day.
    2015-07-07 has only 65/75 data for the day.
    2015-09-28 has only 64/75 data for the day.
    2015-06-26 has only 62/75 data for the day.
    2015-09-29 has only 65/75 data for the day.
    2015-09-21 has only 64/75 data for the day.
    2015-07-02 has only 64/75 data for the day.
    2015-09-09 has only 64/75 data for the day.
    2016-10-30 has only 12/75 data for the day.
    2015-06-29 has only 69/75 data for the day.
    2015-08-03 has only 63/75 data for the day.
    ....85 more lines
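A list like the one above can be produced by counting candles per calendar day. The following is a rough sketch, assuming the timestamps have already been parsed into `datetime` objects. The helper `session` and the sample data are synthetic and for illustration only, and special sessions (such as Diwali) would still need to be excluded by hand.

```python
from collections import Counter
from datetime import datetime, timedelta

EXPECTED_PER_DAY = 75  # expected number of five-minute entries, as stated in the post


def short_days(timestamps, expected=EXPECTED_PER_DAY):
    """Return {date: count} for every date with fewer than `expected` candles."""
    counts = Counter(ts.date() for ts in timestamps)
    return {day: n for day, n in sorted(counts.items()) if n < expected}


def session(day, n):
    """Hypothetical helper: n five-minute timestamps starting at 09:15."""
    start = datetime.strptime(day + " 09:15", "%Y-%m-%d %H:%M")
    return [start + timedelta(minutes=5 * i) for i in range(n)]


# Synthetic timestamps for illustration only, not real market data.
sample = session("2015-07-08", 75) + session("2015-07-09", 64)

for day, n in short_days(sample).items():
    print(f"{day} has only {n}/{EXPECTED_PER_DAY} entries")
```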
  • rvsw
    @rakeshr has there been any progress on this? I hope you understand that if the data integrity is in doubt, then all the strategies and anything else we do based on the data are not useful. In a bid to reconstruct the longer-duration data from the shorter-duration data, I found problems in the shorter-duration data as well. For example, Reliance has a discontinuity on December 22, 2015.

    Even if there is no trade, Zerodha as a rule provides all the timestamps with the volume as 0. I have been trying for almost a month to get data of reasonable integrity. Providing the historical data service is a good attempt, but the data really needs to meet at least some standard to be usable. If you have any workaround, please let me know. All I am looking for is reliable 5-, 30- and 60-minute data for F&O stocks and major indexes, ideally with errors that are predictable, so that they can either be fixed or at least discounted when reading the strategy results. We can also converse offline if you wish.

    I had contacted Zerodha customer service about this earlier, but they completely washed their hands of it, saying that I need to get in touch online.
    Thank you for any help.


    2015-12-22 10:21:00+05:30,498.0,498.13,497.88,498.03,6502
    2015-12-22 10:22:00+05:30,497.95,498.0,497.95,498.0,5970
    2015-12-22 10:47:00+05:30,497.5,497.93,497.4,497.5,1162860
    2015-12-22 10:48:00+05:30,497.9,497.95,497.48,497.5,3294
    2015-12-22 10:49:00+05:30,497.75,497.95,497.53,497.9,3642
    2015-12-22 10:50:00+05:30,497.9,497.95,497.65,497.65,4644
    2015-12-22 10:51:00+05:30,497.98,497.98,497.88,497.9,4150
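The discontinuity in the rows above can be located programmatically by comparing consecutive timestamps. This is a minimal sketch using only the quoted Reliance rows. The function name `find_gaps` is hypothetical, and the one-minute step assumes the series is meant to have an entry for every minute, as described earlier in the thread.

```python
from datetime import datetime, timedelta

# 1-minute rows quoted in the thread; only the leading timestamp is used here.
RELIANCE_ROWS = """\
2015-12-22 10:21:00+05:30,498.0,498.13,497.88,498.03,6502
2015-12-22 10:22:00+05:30,497.95,498.0,497.95,498.0,5970
2015-12-22 10:47:00+05:30,497.5,497.93,497.4,497.5,1162860
2015-12-22 10:48:00+05:30,497.9,497.95,497.48,497.5,3294
2015-12-22 10:49:00+05:30,497.75,497.95,497.53,497.9,3642
2015-12-22 10:50:00+05:30,497.9,497.95,497.65,497.65,4644
2015-12-22 10:51:00+05:30,497.98,497.98,497.88,497.9,4150
""".splitlines()


def find_gaps(lines, step=timedelta(minutes=1)):
    """Return (previous, next, missing_count) for each break in the series."""
    stamps = [datetime.fromisoformat(line.split(",")[0]) for line in lines]
    gaps = []
    for prev, cur in zip(stamps, stamps[1:]):
        if cur - prev > step:
            # Number of expected timestamps that are absent between prev and cur.
            missing = (cur - prev) // step - 1
            gaps.append((prev, cur, missing))
    return gaps


for prev, cur, missing in find_gaps(RELIANCE_ROWS):
    print(f"{prev:%H:%M} -> {cur:%H:%M}: {missing} candles missing")
```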
  • sujith
    Hi @rvsw,
    There may be inconsistencies in the minute-level data from before mid-2016. This was before our system for constructing minute candles was fully stable. As such, you may see inconsistencies in the older entries. You'll need to approach a data vendor for data this old.
  • rvsw
    @rakeshr thank you for your prompt response and for the clarification. So, from what I understand, we should have valid data after 2016 then. That is good. I only need minute data for reconstructing the longer-duration data. If the longer-duration data is reliable, then I don't need 1-minute data.

    So the remaining point is when we can expect a response on the 5-minute data inconsistencies. If you need any assistance from me, please let me know. I am hoping to work with you on making sure that this works rather than complaining :-). As long as the data is good and the strategy results are trustworthy, it is good for me.
  • sujith
    The 5-minute data is generated from the minute data stored in the database. You can try using data from after mid-2016.