Resource Utilization of Python Client (9000 MODE_FULL)

vmx
Hello, wanted to know if this is what typical websocket usage looks like?
Setup
  • t3.medium with a gp3 - 3000 IOPS
  • Postgres, backed by timescaledb-tune (all yes)
  • 9000 Instruments MODE_FULL
  • This instance is dedicated to streaming, so nothing apart from the Python client[1] + database exists on it. At this point no reads have started on the database either, so it's purely ingesting data
Usage
  • 07:21:52 up 2:03, 3 users, load average: 11.93, 11.89, 11.12
  • Mem: 3836 1802 139 1004 3153 2034
Questions
  • Should I consider changing the instance type? I have a feeling this gets billed a lot (as I'm using CPU cycles outside of my allocation). What's the best option, considering this appears to be a CPU-intensive task and the instance is a bit low on memory?
  • Should I consider tweaking the IOPS on the storage, so that if that's the bottleneck it alleviates things a bit?
  • Should I push Postgres a bit more? As I mentioned earlier, this is the timescaledb-tune auto configuration
  • Good places to learn about profiling applications, databases, and designs, so I can make better choices
  • What's your cloud bill looking like now?
I've shared the pointers I thought could be informative to get this discussion going; let me know if you need to look at anything else. Even generic pointers are welcome!

[1] - 3 partitions processing 3000 instruments each, running in a multiprocessing setup. It's just Python + psycopg.
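
For reference, here's roughly how the client is laid out. This is a simplified sketch, not the actual code; the partition ranges, `stream_partition`, and the insert details are illustrative only.

```python
import multiprocessing as mp

# Illustrative layout: 3 worker processes, each streaming ~3000 instruments
# over the websocket and inserting ticks into Postgres via psycopg.
PARTITIONS = [range(0, 3000), range(3000, 6000), range(6000, 9000)]

def stream_partition(instruments: range) -> None:
    # Placeholder: subscribe to this slice of instruments on the websocket
    # and write each incoming tick to Postgres (details omitted here).
    ...

if __name__ == "__main__":
    workers = [mp.Process(target=stream_partition, args=(p,)) for p in PARTITIONS]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
```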
  • mrkaran
    Hi

    > Should I consider changing the instance type

    Yeah, t3.medium is probably not ideal for what you're doing. It's a burstable instance, so once those CPU credits run out (which they will, fast, with that load), performance tanks. That could explain the high load averages you're seeing.

    You’ll probably get more consistent performance by switching to a compute-optimized instance like c6i.large or c6i.xlarge.


    > Should I consider tweaking the IOPS on storage

    3k IOPS should be sufficient IMHO, but you can check `iostat` and see if there are any bottlenecks.

    > Should I push postgres a bit more

    You can check https://pgtune.leopard.in.ua/ and match the recommended configs there.

    > Good places to learn about profiling applications, databases, designs so I could make better choices.

    You can also look at https://github.com/benfred/py-spy to profile your Python program, and see if you can play around with the numbers: batch-insert the instruments, or spawn more workers in the multiprocessing setup, etc. TBH, Go would be better suited for such a task, as it can utilise the underlying CPU cores much better, gives better control over concurrency, and is quite a lot more resource-efficient than Python.
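
    For the batching part, roughly what I mean is below. It's only a sketch assuming psycopg 3; the `ticks` table and its columns are placeholders, not your schema.

```python
import psycopg

BATCH_SIZE = 500  # flush to Postgres once this many ticks have accumulated

def flush(conn: psycopg.Connection, rows: list) -> None:
    # One round-trip for the whole batch instead of one INSERT per tick.
    with conn.cursor() as cur:
        cur.executemany(
            "INSERT INTO ticks (instrument, ts, payload) VALUES (%s, %s, %s)",
            rows,
        )
    conn.commit()
    rows.clear()

def on_ticks(conn: psycopg.Connection, buffer: list, ticks: list) -> None:
    # Called from the websocket callback: buffer rows, flush when full.
    buffer.extend((t["instrument"], t["ts"], t["payload"]) for t in ticks)
    if len(buffer) >= BATCH_SIZE:
        flush(conn, buffer)
```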
  • vmx
    Hi @mrkaran, thank you.
    • I resized it to c6i.large. The current uptime is 04:43:04 up 1:32, 2 users, load average: 9.43, 9.40, 9.38. I'll read up a bit more on load averages (as I've been doing diligently for the last couple of years :wink:) to see if this time I get it right. BTW, does it look okay?
    • The storage fills up fast too: 9000 instruments for one hour came out to ~4 GB. Its IOPS/throughput do not seem to be causing any issues that I can see, so I increased the storage as well.
    • Will compare both timescaledb-tune and pgtune to see which makes Postgres work better.
    • I have considered Golang and have started learning it; it'll take some time to translate this into Go.
    • I am running the profiler right now. While I'm still new to flame graphs, what I found interesting in the trace is that a new DB connection is created each time data is inserted. I'm not sure how much performance gain it would give, but I'll try moving that out / caching the connection. Currently it does an `async with await pg.AsyncConnection.connect`, which allows connecting asynchronously and also releasing resources (especially the connection) on commit. Maybe better implementations exist for this; a rough sketch of what I mean is below.
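
    Roughly what I'm thinking of trying for connection reuse. This is only a sketch assuming psycopg_pool is available; the DSN, the `ticks` table, and the values are all placeholders.

```python
import asyncio
from datetime import datetime, timezone

from psycopg_pool import AsyncConnectionPool

DSN = "postgresql://user:password@localhost:5432/marketdata"  # placeholder

async def main() -> None:
    # Open the pool once at startup instead of connecting on every insert.
    pool = AsyncConnectionPool(DSN, min_size=1, max_size=4, open=False)
    await pool.open()
    try:
        # Borrow a connection; it is returned to the pool (and the
        # transaction committed) when the block exits cleanly.
        async with pool.connection() as conn:
            await conn.execute(
                "INSERT INTO ticks (instrument, ts, payload) VALUES (%s, %s, %s)",
                ("INSTRUMENT_X", datetime.now(timezone.utc), "{}"),
            )
    finally:
        await pool.close()

asyncio.run(main())
```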

    Thank you again for all your pointers
  • vmx
    Some updates from today, after a few modifications and removing some bloated validation/logic.
    • load average: 5.51, 6.01, 5.65
    • Mem: 3816 1655 112 936 3231 2160
    • c6i.large
    It's doing much better than before, although the reads are very slow.

    To-do:
    • I haven't been able to figure out caching the async pg connection (it gave a large backtrace about all the things I'm not supposed to do, since the same thread shares transactions etc., and the error logs fill up fast; maybe each thread should create/own its own connection), so for now it is unchanged. I see that psycopg prepares statements and offers some additional functionality that would make things lightweight, so that would definitely help if we get it right.
    • There are a few areas where I use external libraries like pydantic to ensure typing (I wish Python had structs), which now seem to be absolute overkill. I'll try getting rid of them and replacing them with something native; a rough sketch of what I mean is below.
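
    For the pydantic replacement, something along these lines is what I have in mind. It's only a sketch: the field names are made up, and `slots=True` assumes Python 3.10+.

```python
from dataclasses import dataclass

# A plain stdlib dataclass instead of a pydantic model: no runtime validation
# machinery, just a lightweight typed record. Field names are placeholders.
@dataclass(slots=True)
class Tick:
    instrument: str
    last_price: float
    volume: int

def parse_tick(raw: dict) -> Tick:
    # Convert only the fields we actually store; a bad payload fails fast
    # with KeyError/ValueError instead of going through a validation layer.
    return Tick(
        instrument=str(raw["instrument"]),
        last_price=float(raw["last_price"]),
        volume=int(raw["volume"]),
    )
```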