Historical data corrupted with duplicate entries

keshav
I downloaded data for all F&O stocks, but half of the files are corrupted with duplicate data. Here is an example:


==> ORIENTBANK <==
Time, Open, High, Low, Close, Vol
2015-01-01T09:15:00+0530, 339.4, 339.5, 337.85, 337.85, 2982
2015-01-01T09:16:00+0530, 338.1, 338.7, 337.85, 338.35, 6796
2015-01-01T09:16:00+0530, 338, 338.7, 337.9, 338.35, 6796
2015-01-01T09:17:00+0530, 338.35, 338.9, 338.2, 338.5, 7096
2015-01-01T09:17:00+0530, 338.4, 338.9, 338.2, 338.9, 6396
2015-01-01T09:18:00+0530, 338.5, 339.3, 338.5, 339.1, 7672
2015-01-01T09:18:00+0530, 338.9, 339.3, 338.5, 339.1, 6972
2015-01-01T09:19:00+0530, 339.3, 339.35, 338, 338, 5495
2015-01-01T09:20:00+0530, 338, 338.6, 338, 338.2, 3065

Do you see how there are multiple entries for 09:16, 09:17 and 09:18?
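
Duplicate rows like the ones above can be found mechanically. Below is a minimal sketch, assuming each downloaded file is a plain CSV with the header shown (`duplicate_timestamps` is a hypothetical helper, not part of any API):

```python
import csv
from collections import Counter

def duplicate_timestamps(lines):
    """Return {timestamp: count} for every timestamp that appears
    more than once in an OHLCV CSV (header: Time, Open, High, Low, Close, Vol)."""
    # skipinitialspace strips the space after each comma in the sample format
    reader = csv.DictReader(lines, skipinitialspace=True)
    counts = Counter(row["Time"] for row in reader)
    return {ts: n for ts, n in counts.items() if n > 1}

# Rows taken from the ORIENTBANK sample above.
sample = """Time, Open, High, Low, Close, Vol
2015-01-01T09:16:00+0530, 338.1, 338.7, 337.85, 338.35, 6796
2015-01-01T09:16:00+0530, 338, 338.7, 337.9, 338.35, 6796
2015-01-01T09:17:00+0530, 338.35, 338.9, 338.2, 338.5, 7096
""".splitlines()

print(duplicate_timestamps(sample))
```

Running it over each file gives a quick per-file count of affected minutes.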

  • sujith
    Hi @keshav,
    Are these duplicate entries only for JAN 2015, or for other dates as well?
    If possible, can you let us know the most recent date on which this occurs? We will ask the data team to look into it.
  • keshav
    No, they are not just in JAN 2015; it's everywhere, and sometimes the duplicates have different values.
  • sujith
    Hi @keshav,
    Do you have instances of this happening after DEC 2015 also?
  • keshav
    I think so. I had to write a custom script to eliminate all duplicate entries and create a single entry with the average of both.
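
A dedup-and-average pass like the one described could look roughly like this; a minimal sketch, assuming the CSV rows have already been parsed into dicts (`dedupe_candles` is a hypothetical helper, and averaging every field, including High/Low/Vol, follows the approach stated above rather than any official fix):

```python
from collections import OrderedDict

def dedupe_candles(rows):
    """Merge rows sharing a timestamp into one row per timestamp,
    averaging each numeric field. Preserves first-seen order."""
    groups = OrderedDict()
    for row in rows:
        groups.setdefault(row["Time"], []).append(row)

    merged = []
    for ts, dupes in groups.items():
        out = {"Time": ts}
        for field in ("Open", "High", "Low", "Close", "Vol"):
            vals = [float(d[field]) for d in dupes]
            out[field] = round(sum(vals) / len(vals), 2)
        merged.append(out)
    return merged

# The two duplicate 09:16 candles from the ORIENTBANK sample above.
sample = [
    {"Time": "2015-01-01T09:16:00+0530", "Open": "338.1", "High": "338.7",
     "Low": "337.85", "Close": "338.35", "Vol": "6796"},
    {"Time": "2015-01-01T09:16:00+0530", "Open": "338", "High": "338.7",
     "Low": "337.9", "Close": "338.35", "Vol": "6796"},
]
print(dedupe_candles(sample))
```

Note that averaging High and Low is a simplification; taking max(High) and min(Low) across the duplicates would arguably be truer to what the candle represents.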