Hexbin & scatter plot performance #2

@AdamRJensen

When do scatter and hexbin plots start performing poorly?

Timing plt.hexbin on random numbers showed that for fewer than 10^6 points the time stayed roughly constant at about 200 ms. At 10^7 points the time rose sharply to 1.6 s, and 10^8 points took 12 s. The number of hexbins was the same in all tests, so the difference in timing stems from the binning.
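A minimal sketch of how such a timing could be reproduced. The original timing script is not shown, so the helper name `time_hexbin`, the normally distributed test data, the off-screen `Agg` backend, and the choice to include rendering in the measured time are all assumptions:

```python
import time

import matplotlib
matplotlib.use("Agg")  # assumption: off-screen backend, so no GUI is involved
import matplotlib.pyplot as plt
import numpy as np


def time_hexbin(n_points, gridsize=100, seed=0):
    """Return the seconds spent binning and drawing n_points random points.

    Hypothetical helper, not taken from the original test.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_points)
    y = rng.standard_normal(n_points)

    fig, ax = plt.subplots()
    start = time.perf_counter()
    ax.hexbin(x, y, gridsize=gridsize)  # binning happens in this call
    fig.canvas.draw()                   # force rendering so it is timed too
    elapsed = time.perf_counter() - start
    plt.close(fig)                      # free the figure between runs
    return elapsed


if __name__ == "__main__":
    # Extend the exponents to 7 or 8 if memory allows; two float64 arrays
    # of 10^8 values already take about 1.6 GB.
    for exponent in (4, 5, 6):
        n = 10 ** exponent
        print(f"10^{exponent} points: {time_hexbin(n):.2f} s")
```

Absolute numbers will depend on the machine and the matplotlib version, so only the trend across point counts is comparable with the figures above.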

One year of 1-minute measurements is approximately half a million points.
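The point count can be checked with a few lines of arithmetic. This sketch assumes a 365-day year with no gaps in the data; the constant names are only illustrative:

```python
# Number of samples in one non-leap year at two sampling rates.
MINUTES_PER_YEAR = 365 * 24 * 60
SECONDS_PER_YEAR = MINUTES_PER_YEAR * 60

print(f"1-minute data: {MINUTES_PER_YEAR:,} points per year")
print(f"1-second data: {SECONDS_PER_YEAR:,} points per year")
```

One year of 1-minute data (525,600 points) sits below the 10^6 threshold, while one year of 1-second data (31,536,000 points) falls between the 10^7 and 10^8 cases timed above.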

ChatGPT showed similar results:

[Image: ChatGPT timing results]

Now, let's consider the base case of 1 million points. How many bins are feasible?
100x100 bins is fast: 400 ms. 200x200 bins is ok: 1 s. 1000x1000 bins is slow: 15 s. Note that going from 100x100 to 1000x1000 increases the number of bins by a factor of 100, not 10, because the per-axis counts are multiplied.
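The scaling can be made explicit with a short calculation on the timings reported above. This sketch assumes that "NxN" means N bins along each axis and treats the total as N squared; the exact number of hexagons matplotlib draws for a given grid may differ somewhat, and the names used here are illustrative only:

```python
# Timings reported above for 1 million points, keyed by bins per axis.
reported_seconds = {100: 0.4, 200: 1.0, 1000: 15.0}


def total_bins(per_axis):
    """Nominal bin count for a square grid with per_axis bins on each axis."""
    return per_axis ** 2


base = 100  # compare everything against the 100x100 case
for per_axis, seconds in reported_seconds.items():
    bin_ratio = total_bins(per_axis) // total_bins(base)
    time_ratio = seconds / reported_seconds[base]
    print(
        f"{per_axis}x{per_axis}: {total_bins(per_axis):,} bins, "
        f"{bin_ratio}x the bins, {time_ratio:.1f}x the time"
    )
```

On these three measurements the run time grows more slowly than the bin count: 100 times as many bins cost roughly 37 times as much time.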

Conclusion

Hexbin is definitely suitable for 1-minute or 1-second data. The default of 100x100 hexbins is a good option.
