Hexbin & scatter plot performance #2
Description
When do scatter and hexbin plots start performing poorly?
Timing plt.hexbin on random numbers showed that for fewer than 10^6 points the time stayed roughly constant at about 200 ms. At 10^7 points the time rose sharply to 1.6 s, and 10^8 points took 12 s. The number of hexbins was the same in all tests, so the difference in timing stems from binning the points.
One year of 1-minute measurements is approximately half a million points (60 × 24 × 365 = 525,600).
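A minimal sketch of how such a timing could be reproduced. The function names `time_hexbin` and `benchmark_points`, the normally distributed test data, and the decision to time only the `hexbin` call (not rendering) are assumptions; the original measurement setup is not given in this issue.

```python
import time

import matplotlib
matplotlib.use("Agg")  # headless backend, so no window is required
import matplotlib.pyplot as plt
import numpy as np


def time_hexbin(n_points, gridsize=100, seed=0):
    """Return the seconds taken by one hexbin call on random data.

    Only the hexbin call is timed; drawing the figure is excluded
    (an assumption, the original timing may have included it).
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_points)
    y = rng.standard_normal(n_points)

    fig, ax = plt.subplots()
    start = time.perf_counter()
    ax.hexbin(x, y, gridsize=gridsize)
    elapsed = time.perf_counter() - start
    plt.close(fig)  # free the figure so repeated runs do not pile up
    return elapsed


def benchmark_points(sizes=(10**4, 10**5, 10**6)):
    """Time hexbin for several point counts at a fixed gridsize."""
    return {n: time_hexbin(n) for n in sizes}
```

Calling `benchmark_points((10**5, 10**6, 10**7))` would cover the range where the slowdown was observed; 10^8 points needs roughly 1.6 GB for the two float64 arrays alone, so it is left out of the default sizes.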
ChatGPT showed similar results.
Now, let's consider the base case of 1 million points. How many bins are feasible?
100x100 bins is fast: 400 ms. 200x200 bins is OK: 1 s. 1000x1000 bins is slow: 15 s. Note that going from 100x100 to 1000x1000 multiplies the number of bins by 100, not 10, because the bin counts along the two axes are multiplied.
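The bin-count comparison could be repeated with a sketch like the following. The names `nominal_bins` and `time_gridsize` are hypothetical, and two choices are assumptions: "NxN bins" is passed as `gridsize=(N, N)`, and the timing includes drawing the figure, since a large number of hexagons also has to be rendered.

```python
import time

import matplotlib
matplotlib.use("Agg")  # headless backend, so no window is required
import matplotlib.pyplot as plt
import numpy as np


def nominal_bins(bins_per_axis):
    """Nominal bin count for an N x N grid (the hexagon count differs slightly)."""
    return bins_per_axis * bins_per_axis


def time_gridsize(bins_per_axis, n_points=10**6, seed=0):
    """Return the seconds taken to bin and draw n_points on an N x N hexbin grid."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_points)
    y = rng.standard_normal(n_points)

    fig, ax = plt.subplots()
    start = time.perf_counter()
    ax.hexbin(x, y, gridsize=(bins_per_axis, bins_per_axis))
    fig.canvas.draw()  # include rendering cost (assumption, see lead-in)
    elapsed = time.perf_counter() - start
    plt.close(fig)
    return elapsed


# Going from 100 to 1000 bins per axis is a factor of 100 in total bins.
ratio = nominal_bins(1000) // nominal_bins(100)
```

Looping `time_gridsize` over 100, 200, and 1000 with the default of one million points corresponds to the three cases listed above.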
Conclusion
Hexbin is definitely suitable for 1-minute or 1-second data points. The default of 100x100 hexbins is a good option.