The Problem: I’ve been diving into the simulation datasets, and while the data is great, the manual indexing needed to get at the metrics is a headache. Having to calculate offsets like line[nn3 + (src_node*n+dst_node)*7] every time is tedious and, honestly, a magnet for off-by-one errors.
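To make the indexing concrete, here is a rough sketch of that offset expression wrapped in a helper. The names pair_offset, pair_metrics and base_offset are made up for illustration. I'm treating nn3 as an opaque base offset passed in by the caller, and assuming from the "* 7" that each (src, dst) pair has 7 consecutive values.

```python
# Hypothetical helper names; only the offset formula comes from the post above.
METRICS_PER_PAIR = 7  # assumption: read off the "* 7" in the offset expression


def pair_offset(src_node, dst_node, n, base_offset):
    """Return the index of the first value for the (src_node, dst_node) pair.

    base_offset plays the role of nn3 in the original expression.
    """
    if not (0 <= src_node < n and 0 <= dst_node < n):
        raise IndexError(f"node pair ({src_node}, {dst_node}) out of range for n={n}")
    return base_offset + (src_node * n + dst_node) * METRICS_PER_PAIR


def pair_metrics(line, src_node, dst_node, n, base_offset):
    """Slice out the block of values for one pair from an already-split line."""
    start = pair_offset(src_node, dst_node, n, base_offset)
    return line[start:start + METRICS_PER_PAIR]
```

Even something this small moves the arithmetic into one place and turns an out-of-range node id into an explicit error instead of a silently wrong value.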
Right now, anyone who wants to start training a model in PyTorch or TensorFlow has to spend a couple of hours writing a parser before they can even look at the data. That’s a real barrier for people who just want to experiment with RouteNet.
The Fix: I think we should add a simple Python utility (maybe a data_loader.py) that hides all this index math. Ideally, a user could just point it at a directory and get back a clean NumPy array or a pandas DataFrame.
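As a starting point for discussion, a minimal sketch of what such a data_loader.py could look like is below. Several things here are my assumptions and not confirmed details of the dataset: the function names, one sample per line, comma-separated values, a "*.txt" file pattern, and that the caller supplies n and the base offset (the nn3 from the expression above).

```python
# Sketch only: file layout, separator, and file pattern are assumptions.
from pathlib import Path

import numpy as np

METRICS_PER_PAIR = 7  # assumption: read off the "* 7" in the offset expression


def parse_line(line, n, base_offset, sep=","):
    """Turn one text line into an array of shape (n, n, METRICS_PER_PAIR)."""
    values = np.array([float(v) for v in line.strip().split(sep)])
    end = base_offset + n * n * METRICS_PER_PAIR
    if values.size < end:
        raise ValueError(f"expected at least {end} values, got {values.size}")
    # Row-major reshape: element [src, dst, k] is the value at
    # base_offset + (src * n + dst) * METRICS_PER_PAIR + k in the flat line.
    return values[base_offset:end].reshape(n, n, METRICS_PER_PAIR)


def load_directory(directory, n, base_offset, pattern="*.txt", sep=","):
    """Load every matching file and stack all samples into one array.

    Returns an array of shape (num_samples, n, n, METRICS_PER_PAIR).
    """
    samples = []
    for path in sorted(Path(directory).glob(pattern)):
        with path.open() as fh:
            for line in fh:
                if line.strip():  # skip blank lines
                    samples.append(parse_line(line, n, base_offset, sep))
    if not samples:
        return np.empty((0, n, n, METRICS_PER_PAIR))
    return np.stack(samples)
```

With this, data[i, src_node, dst_node] would give the 7 values for a pair in sample i, with no offset arithmetic in user code. A DataFrame view could be layered on top once the meaning of each of the 7 columns is documented.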