How to store web streaming data in pandas dataframe

CuriousTrader · November 2016

Hi,

I am new to Python and trying to understand how I can store the data received in the on_tick function in a pandas DataFrame. Can I pass a DataFrame as an argument to the on_tick function?

Or is there any other way to store the data so that I can calculate technical indicators like Bollinger Bands, MACD, and RSI?

Vivek · November 2016

@CuriousTrader You can't pass extra arguments to the on_tick function, but you can call your own functions from inside on_tick. Similarly, you can put logic in on_tick to write the data to a file or database.

sabyasm · November 2016

Add the line below to the GitHub example:

df = pd.DataFrame(tick)

CuriousTrader · November 2016

@sabyasm - Will it append the tick data to the dataframe?

AnkitDoshi · November 2016

First you will have to convert the tick data to a pandas DataFrame.
You can do something like this:
import pandas as pd
Tick = pd.DataFrame(tick)

You can also export this data to Excel:
Tick.to_excel('Tick.xlsx', sheet_name='Tick', index=False)
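
On the earlier question of whether `pd.DataFrame(tick)` appends: it does not. Each call builds a fresh frame from that one batch of ticks. A minimal sketch of one way to accumulate them, assuming each callback delivers a list of dicts (the sample ticks and the `all_ticks` name are made up for illustration):

```python
import pandas as pd

all_ticks = []  # hypothetical module-level buffer that outlives each callback


def on_tick(tick, ws):
    # Assumption: `tick` is a list of dicts, one dict per instrument.
    all_ticks.extend(tick)


# Simulated callbacks; real ticks would arrive from the websocket.
on_tick([{"instrument_token": 408065, "last_price": 970.5}], None)
on_tick([{"instrument_token": 408065, "last_price": 971.0}], None)

# Build the DataFrame when it is needed, rather than on every tick.
df = pd.DataFrame(all_ticks)
print(df)
```

Appending to a plain list and converting once is usually cheaper than growing a DataFrame row by row.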

nithishkailas · July 2017

from kiteconnect import WebSocket as wb
kws = wb(api_key, public_token, user_id)
import pandas
def on_tick(tick, ws) : print(tick,"\n")
def on_tick(tick,ws) : new = pandas.DataFrame(tick,ws)
def on_connect(ws): ws.subscribe[408065]
def on_connect(ws):ws.set_mode(ws.MODE_LTP,[408065])
kws.on_tick=on_tick
kws.on_connect=on_connect
kws.connect()

This code is showing me this error:

ERROR:websocket:error from callback >: 'WebSocket' object is not iterable

ERROR:websocket:error from callback >: 'WebSocket' object is not iterable

ERROR:websocket:error from callback >: 'WebSocket' object is not iterable
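
A likely cause, going by the code above: `pandas.DataFrame(tick, ws)` passes the websocket object as the second positional argument, which pandas treats as the index and tries to iterate over. Two other things stand out: the second `def on_tick` and `def on_connect` each replace the first definition, and `ws.subscribe[408065]` indexes the method instead of calling it. Below is a sketch of corrected callbacks. `StubSocket` and `frames` are hypothetical names, used only so the example runs without a live connection.

```python
import pandas as pd

frames = []  # hypothetical store: one DataFrame per callback


def on_tick(tick, ws):
    # Pass only the tick data; `ws` is the connection, not data or an index.
    frames.append(pd.DataFrame(tick))


def on_connect(ws):
    # A single function doing both steps; note the call parentheses.
    ws.subscribe([408065])
    ws.set_mode(ws.MODE_LTP, [408065])


class StubSocket:
    """Hypothetical stand-in for the real websocket, for offline testing."""

    MODE_LTP = "ltp"

    def __init__(self):
        self.subscribed = []
        self.modes = {}

    def subscribe(self, tokens):
        self.subscribed.extend(tokens)

    def set_mode(self, mode, tokens):
        for token in tokens:
            self.modes[token] = mode


ws = StubSocket()
on_connect(ws)
on_tick([{"instrument_token": 408065, "last_price": 970.5}], ws)
print(frames[0])
```

With the real client, the same two functions would be assigned to `kws.on_tick` and `kws.on_connect` as in the original post.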

RP3436 · November 2017

Hi
I am new to Python and API applications, but technical analysis is my domain. I am looking for help and guidance on receiving and managing data. I have read most of the discussions in the Websocket and Python client categories and have also gained some understanding of NumPy, pandas, and sqlite3. I have a few basic questions, if someone can take the trouble to answer them:
1. I understand that multithreading should be used so that the main thread remains unblocked for receiving the continuous data feed. At which point in the main websocket code should the new thread be started? Should assigning to a pandas DataFrame, and writing to a database before or after that, be done in the new thread? A snippet would help.
2. If I need to focus on only 5-6 instruments, will an SQLite DB pose any challenge, or will it be OK? I feel comfortable with SQLite.
3. Are pandas DataFrames and SQLite (or any other DB) both required, or can either one accomplish the job?
4. Which module and which function are suggested for forming candles of different time frames from tick data? I hope there are some built-in features for doing this efficiently.
5. When using historical data, should there be two databases, one for historical and one for live streamed data? A strategy may require historical as well as live data. Is appending live data to the historical database a solution, or is there another process for managing data from these two sources?

I will be grateful for any hints or references.
Pinaki Paul
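
On question 4, one commonly used option is pandas' own `resample(...).ohlc()`, which groups a time-indexed price series into fixed intervals. A minimal sketch, assuming each tick has been stamped with a receive time (the `timestamp` and `last_price` column names and the sample values are illustrative only):

```python
import pandas as pd

# Illustrative ticks; in practice these would be collected from on_tick.
ticks = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2017-11-01 09:15:01",
        "2017-11-01 09:15:30",
        "2017-11-01 09:15:59",
        "2017-11-01 09:16:10",
        "2017-11-01 09:16:45",
    ]),
    "last_price": [100.0, 102.0, 99.0, 101.0, 103.0],
})

# Index by time, then aggregate each one-minute bucket into a candle.
prices = ticks.set_index("timestamp")["last_price"]
candles = prices.resample("1min").ohlc()
print(candles)
```

Changing `"1min"` to `"5min"` or `"15min"` gives candles for other time frames from the same ticks.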

ramatius · November 2017

You should make two processes, (1) to stream data into your system and (2) the algo that consumes the data.

For (1), you should be using an RDBMS such as MySQL. Use in-memory tables if you need it to be fast. This way, you can have multiple algos running in parallel, consuming the same real-time data. AFAIK, this is a robust way of handling trade data.

RP3436 · November 2017

@Vivek
Could you kindly attempt a reply to my questions? My problem areas are creating a new thread and knowing which function to call in it: on_tick or on_connect? Or on_tick in t1 and on_connect in t2? Can you give a direct code example using threading.Thread where one thread receives ticks and the other creates DataFrames or a database and does the analysis? Just a few lines or words as a hint. It will be useful for hundreds of new entrants who are struggling to make use of the APIs.

Second, I can process the data captured in pandas DataFrames as well as in databases like MySQL. For my kind of requirement, where I focus on only 3 to 4 instruments, which approach is better? Do pandas DataFrames count as in-memory data tables?
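
One common pattern for the threading question, sketched here with only the standard library: on_tick runs in the websocket's thread and does nothing except hand the ticks to a `queue.Queue`, while a separate worker thread takes them off the queue and does the slower work. This is a generic producer/consumer sketch, not an official Kite Connect recipe; `tick_queue`, `processed`, and `worker` are hypothetical names, and the ticks are simulated.

```python
import queue
import threading

tick_queue = queue.Queue()  # thread-safe hand-off between the two threads
processed = []              # stands in for a DataFrame or database write


def on_tick(ticks, ws):
    # Keep the callback fast: enqueue and return immediately.
    tick_queue.put(ticks)


def worker():
    # Runs in its own thread; blocks until ticks are available.
    while True:
        ticks = tick_queue.get()
        if ticks is None:  # sentinel value used here to stop the worker
            break
        processed.extend(ticks)  # slow analysis or storage would go here


consumer = threading.Thread(target=worker, daemon=True)
consumer.start()

# Simulated feed; with the real client these calls come from the websocket.
on_tick([{"instrument_token": 408065, "last_price": 970.5}], None)
on_tick([{"instrument_token": 408065, "last_price": 971.0}], None)

tick_queue.put(None)  # ask the worker to finish
consumer.join()
print(len(processed))
```

In this arrangement on_connect stays in the websocket thread as well; only the processing moves to the second thread.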

Vivek · November 2017

@RP3436 It's a bad idea to use SQLite since it doesn't support concurrency well. You can read more about it here - https://stackoverflow.com/a/26864360/973508

Here is a sample project which you can clone and use - https://github.com/vividvilla/kite-connect-python-example

This is a simple example which uses the Python Kite Connect client to receive ticks and save them to a PostgreSQL database. Celery is used as a task queue manager to insert into the database without blocking the main Kite Connect WebSocket thread.

Kite ticker subscribes to the tokens specified in stream.py with a 5 second delay. Ticks received are sent to the Celery task queue, where they are inserted into the db.
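
The database step on its own can be sketched roughly as below. This is not the code from the linked repository: it uses Python's built-in `sqlite3` with an in-memory database purely so that the example runs without a server, whereas the sample project uses PostgreSQL, and the table, column, and function names are invented for illustration.

```python
import sqlite3


def insert_ticks(conn, ticks):
    # Convert the list of tick dicts into rows and insert them in one batch.
    rows = [(t["instrument_token"], t["last_price"]) for t in ticks]
    with conn:  # commits on success, rolls back if an error is raised
        conn.executemany(
            "INSERT INTO ticks (instrument_token, last_price) VALUES (?, ?)",
            rows,
        )


# In-memory stand-in database with a hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ticks (instrument_token INTEGER, last_price REAL)")

insert_ticks(conn, [
    {"instrument_token": 408065, "last_price": 970.5},
    {"instrument_token": 408065, "last_price": 971.0},
])

count = conn.execute("SELECT COUNT(*) FROM ticks").fetchone()[0]
print(count)
```

Per the description above, the equivalent insert in the sample project runs inside a Celery task rather than in the websocket callback, so a slow write never holds up the tick stream.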

RP3436 · November 2017

@Vivek Thanks a lot. This is really helpful, unlike the previous reply I got. If the initiative is to gain popularity, the issues faced by newbies (in programming) need to be addressed, both through the forum and through user-friendly documentation. Thanks again.
