The historical API is returning multiple candles for the same date and time. Usually, the two candles differ slightly in one of the OHLCV values.
For example, these are what I found in the 5-minute candles:
Interestingly, the API doesn't return the incorrect rows when I query the specific tokens and dates individually. But when you query data over a date range, some of the dates can contain these duplicate entries.
From my end: I just dumped whatever data I received from the API into a database, and it was littered with these ghost duplicate/incorrect candles.
A simple test is to check the number of candles received for each token in a day; if it's more than required (75 candles per day for the 5-minute interval), then one of the candles is a wrong entry.
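The check above can be sketched as a small Python function. It assumes the candles arrive as a list of dicts with a datetime under the "date" key (a hypothetical shape, roughly what the historical API's client libraries return); adjust the key to match your actual payload.

```python
from collections import Counter

# A full NSE trading day (09:15-15:30) is 375 minutes, i.e. 75 five-minute candles.
EXPECTED_5MIN_CANDLES = 75

def find_suspect_days(candles):
    """Return {day: count} for days whose candle count exceeds 75.

    `candles` is assumed to be a list of dicts, each with a datetime
    under the "date" key; any day with extra candles contains at least
    one duplicate/wrong entry.
    """
    counts = Counter(c["date"].date() for c in candles)
    return {day: n for day, n in counts.items() if n > EXPECTED_5MIN_CANDLES}
```

Running this per token over the queried date range flags exactly the days that need a second look.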
Hey Sujith, thanks for replying. This doesn't seem to happen on any specific days or for any specific token, meaning I couldn't reproduce the error when I re-queried the exact dates and tokens from the candles/historic API.
To answer your question: for my database, I've been querying the NEXT 50 tokens with 5-minute candles going back a year. My database was intermittently filled with these duplicate rows throughout. I just save whatever I get from the API (with no processing), and I still ended up with these duplicates.
Although I'm confident this is an issue on the API side, one other thing I've noticed is that the API seems to correct itself on the second try; if I encounter an error, it tends to disappear on retry. For example, when I query historical data more than a year old, sometimes the connection times out, but if I retry the same query, it returns the correct results. This makes me wonder if the duplicate-rows bug is also "fixed" on the second try for everyone else.
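For the timeout part, a simple retry wrapper is enough. This is just an illustrative sketch: the exception handling, retry count, and backoff are my own assumptions, and `fetch` stands for any zero-argument callable wrapping your actual historical-data request.

```python
import time

def fetch_with_retry(fetch, retries=3, backoff=2.0):
    """Call `fetch()` and retry on failure with a linear backoff.

    `fetch` is any zero-argument callable performing the request
    (e.g. a lambda around your historical-data client call).
    Re-raises the last exception if all attempts fail.
    """
    for attempt in range(retries):
        try:
            return fetch()
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts, surface the error
            time.sleep(backoff * (attempt + 1))  # wait a bit longer each retry
```

In my case a single retry was usually enough for the timeouts to clear.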
As a temporary solution, I've created a unique index on the date and time, and I ignore any further rows from the API, since the difference between the duplicates is marginal.
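A minimal sketch of that workaround using SQLite: a UNIQUE constraint on (token, timestamp) plus INSERT OR IGNORE makes the database silently drop the second copy of any duplicate candle. The table and column names are illustrative, not from my actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory DB for the sketch
conn.execute("""
    CREATE TABLE candles (
        instrument_token INTEGER,
        ts TEXT,
        open REAL, high REAL, low REAL, close REAL, volume INTEGER,
        UNIQUE (instrument_token, ts)  -- one candle per token per timestamp
    )
""")

rows = [
    (256265, "2023-01-02 09:15:00", 100.0, 101.0, 99.5, 100.5, 1200),
    # Same timestamp with a marginally different close: silently ignored.
    (256265, "2023-01-02 09:15:00", 100.0, 101.0, 99.5, 100.6, 1200),
]
conn.executemany(
    "INSERT OR IGNORE INTO candles VALUES (?, ?, ?, ?, ?, ?, ?)", rows
)
conn.commit()
```

Whichever duplicate arrives first wins; since the differences are marginal, that was acceptable for my use case.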
This is just to bring the issue to your notice. Thanks a lot for being such a proactive broker and community. I'm happy with my current solution in place, and I believe you should take this issue up with Rainmatter.
Can you let us know the steps to reproduce this and the params you sent?