Replace per-point nearest extraction with vectorized axis spatial method#143

Draft
Copilot wants to merge 5 commits into main from copilot/add-vectorized-nearest-method

Conversation

Contributor

Copilot AI commented Apr 2, 2026

  • Fix _extract_axis_batch to include time_dim as a vectorized indexer per variable before .load(), avoiding loading all time steps when only one is needed per point
  • Add test: test_time_dimension_excluded_from_loaded_array — 8-timestep dataset, 3 points each requesting a different time step, verifies correct per-point values
  • Add test: test_time_dimension_with_extra_axis — vectorized time + extra-axis (lev) selection together (the MERRA-2 coord_spec scenario)
  • Verified all pre-existing non-netCDF4 tests still pass; the 6 failing tests were already failing before this change due to missing netCDF4 package in the sandbox
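The time-indexer behavior described in the first bullet can be sketched with a small synthetic dataset. This is a minimal standalone illustration, not the actual _extract_axis_batch code; the variable names, grid shape, and values are all invented for the example:

```python
import numpy as np
import xarray as xr

# Synthetic dataset: 8 time steps on a 5x5 lat/lon grid.
# Value encodes its own indices: time*100 + lat_idx*10 + lon_idx.
times = np.arange(8)
lats = np.linspace(-2.0, 2.0, 5)
lons = np.linspace(-2.0, 2.0, 5)
data = (
    times[:, None, None] * 100.0
    + np.arange(5)[None, :, None] * 10.0
    + np.arange(5)[None, None, :]
)
ds = xr.Dataset(
    {"var": (("time", "lat", "lon"), data)},
    coords={"time": times, "lat": lats, "lon": lons},
)

# Three points, each requesting a different time step.
pt_times = xr.DataArray([1, 4, 7], dims="points")
pt_lats = xr.DataArray([-2.0, 0.0, 2.0], dims="points")
pt_lons = xr.DataArray([-2.0, 0.0, 2.0], dims="points")

# Vectorized nearest selection: because time is included as a
# per-point indexer, the result has shape (points,) and .load()
# materializes one time step per point, not all 8.
selected = ds["var"].sel(
    time=pt_times, lat=pt_lats, lon=pt_lons, method="nearest"
).load()

print(selected.values)  # → [100. 422. 744.]
```

Omitting `time=pt_times` from the `sel()` call would instead yield a `(time, points)` result, and `.load()` would pull every time step for every point — the bug the first bullet fixes.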

Copilot AI linked an issue Apr 2, 2026 that may be closed by this pull request
Copilot AI changed the title [WIP] Add vectorized nearest method for spatial selections Replace per-point nearest extraction with vectorized axis spatial method Apr 2, 2026
Copilot AI requested a review from eeholmes April 2, 2026 18:19
@eeholmes
Member

eeholmes commented Apr 2, 2026

@copilot The time went from 18 seconds for this to 65 seconds. The vectorized change should not have added time. The plan didn't change.

import pandas as pd

# Load the example points from the repo's fixtures.
url = (
    "https://raw.githubusercontent.com/"
    "fish-pace/point-collocation/main/"
    "examples/fixtures/points.csv"
)
df_points = pd.read_csv(url)

import point_collocation as pc

# Build the collocation plan against cloud-hosted PACE monthly granules.
plan = pc.plan(
    df_points,
    data_source="earthaccess",
    source_kwargs={
        "short_name": "PACE_OCI_L3M_RRS",
        "granule_name": "*.MO.*.4km.*",
    },
)

# Extract Rrs for the first 100 planned points.
res = pc.matchup(plan[0:100], variables=["Rrs"])

plan

Plan: 595 points → 4 unique granule(s)
Points with 0 matches : 0
Points with >1 matches: 0
Time buffer: 0 days 00:00:00

Contributor Author

Copilot AI commented Apr 2, 2026

@copilot The time went from 18 seconds for this to 65 seconds. The vectorized change should not have added time. The plan didn't change.

...

The slowdown was a dask recomputation bug (commit f9c0ea7). _extract_axis_batch did selected = ds[var].sel(..., method='nearest') which produces a lazy dask-backed DataArray, then the inner per-point loop called float(selected.isel({pts_dim: i})) 100 times — each call recomputed the full dask graph for all 100 points to extract a single value, giving O(N²) I/O.

The fix adds .load() immediately after the sel() call to materialize the dask graph exactly once. The per-point loop then operates on in-memory NumPy slices. A synthetic benchmark shows a 22× speedup on this pattern.
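The O(N²) recomputation pattern and the effect of the single `.load()` can be demonstrated with a small dask-backed array. This is a standalone sketch, assuming dask is installed; `make_block` and the call counter are illustrative stand-ins for the real I/O:

```python
import dask
import dask.array as da
import numpy as np

calls = {"n": 0}

def make_block():
    # Stand-in for the expensive I/O: count each materialization.
    calls["n"] += 1
    return np.arange(100, dtype=float)

# A lazy 100-element array backed by a single delayed block.
lazy = da.from_delayed(dask.delayed(make_block)(), shape=(100,), dtype=float)

# Slow pattern (the bug): extracting one scalar at a time runs the
# whole graph on every call, so the block is built 100 times.
calls["n"] = 0
slow = [float(lazy[i]) for i in range(100)]
n_slow = calls["n"]

# Fixed pattern: materialize once, then slice in-memory NumPy.
calls["n"] = 0
loaded = lazy.compute()
fast = [float(loaded[i]) for i in range(100)]
n_fast = calls["n"]

print(n_slow, n_fast)  # → 100 1
```

Each `float(lazy[i])` triggers an independent `compute()` with no caching between calls, which is exactly the per-point `float(selected.isel(...))` pattern described above; the `.compute()` (or `.load()` on a DataArray) collapses that to a single graph execution.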

…d() to avoid loading all timesteps

Agent-Logs-Url: https://github.com/fish-pace/point-collocation/sessions/d26c8381-f68d-40d5-acf9-869899e769c2

Co-authored-by: eeholmes <2545978+eeholmes@users.noreply.github.com>
@eeholmes
Member

eeholmes commented Apr 2, 2026

@copilot Evaluate the profile for one of the matchup calls. It still feels slow, but maybe there is no way to avoid it.

Plan: 100 points → 18 unique granule(s)
Points with 0 matches : 0
Points with >1 matches: 0
Time buffer: 0 days 00:00:00

import cProfile
import pstats
import io

pr = cProfile.Profile()
pr.enable()

res = pc.matchup(plan, variables=["avw"])

pr.disable()

s = io.StringIO()
ps = pstats.Stats(pr, stream=s).sort_stats("cumtime")  # or "tottime"
ps.print_stats(40)   # top 40 entries
print(s.getvalue())

output

     5164968 function calls (5084085 primitive calls) in 21.025 seconds

Ordered by: cumulative time
List reduced from 9651 to 40 due to restriction <40>

ncalls tottime percall cumtime percall filename:lineno(function)
140/3 0.001 0.000 16.299 5.433 /srv/conda/envs/notebook/lib/python3.12/threading.py:637(wait)
3/2 0.000 0.000 12.631 6.315 /srv/conda/envs/notebook/lib/python3.12/site-packages/IPython/core/interactiveshell.py:3665(run_code)
2228/2 0.110 0.000 12.622 6.311 {built-in method builtins.exec}
2/1 0.000 0.000 12.622 12.622 /tmp/ipykernel_8493/1906214859.py:1(<module>)
2/1 0.000 0.000 12.622 12.622 /home/jovyan/point-collocation/src/point_collocation/core/engine.py:70(matchup)
2/1 0.001 0.000 12.622 12.622 /home/jovyan/point-collocation/src/point_collocation/core/engine.py:630(_execute_plan)
20/19 0.001 0.000 10.010 0.527 /srv/conda/envs/notebook/lib/python3.12/site-packages/xarray/backends/api.py:388(open_dataset)
2620/1963 0.002 0.000 9.524 0.005 {built-in method builtins.next}
36 0.000 0.000 9.505 0.264 /home/jovyan/point-collocation/src/point_collocation/core/_open_method.py:1145(_open_as_flat_dataset)
361/126 0.000 0.000 9.503 0.075 /srv/conda/envs/notebook/lib/python3.12/contextlib.py:132(__enter__)
1906 0.003 0.000 8.898 0.005 /srv/conda/envs/notebook/lib/python3.12/site-packages/fsspec/caching.py:730(call)
72/54 0.000 0.000 8.832 0.164 /srv/conda/envs/notebook/lib/python3.12/site-packages/fsspec/caching.py:914(_fetch_block)
72/54 0.000 0.000 8.816 0.163 /srv/conda/envs/notebook/lib/python3.12/site-packages/fsspec/caching.py:62(_fetch)
54/36 0.000 0.000 8.795 0.244 /srv/conda/envs/notebook/lib/python3.12/site-packages/s3fs/core.py:2414(fetch_range)
54/36 0.001 0.000 8.793 0.244 /srv/conda/envs/notebook/lib/python3.12/site-packages/s3fs/core.py:2572(fetch_range)
90/37 0.004 0.000 8.752 0.237 /srv/conda/envs/notebook/lib/python3.12/site-packages/fsspec/asyn.py:63(sync)
20/19 0.000 0.000 6.043 0.318 /srv/conda/envs/notebook/lib/python3.12/site-packages/xarray/backends/h5netcdf.py:514(open_dataset)
19 0.001 0.000 6.024 0.317 /srv/conda/envs/notebook/lib/python3.12/site-packages/xarray/backends/store.py:27(open_dataset)
9446/9153 0.013 0.000 6.019 0.001 :892(__iter__)
95 0.004 0.000 6.012 0.063 /srv/conda/envs/notebook/lib/python3.12/site-packages/xarray/backends/h5netcdf.py:77(read_attributes)
19 0.000 0.000 5.980 0.315 /srv/conda/envs/notebook/lib/python3.12/site-packages/xarray/backends/common.py:347(load)
19 0.000 0.000 5.943 0.313 /srv/conda/envs/notebook/lib/python3.12/site-packages/xarray/backends/h5netcdf.py:321(get_attrs)
932 0.004 0.000 5.783 0.006 /srv/conda/envs/notebook/lib/python3.12/site-packages/fsspec/spec.py:2134(readinto)
1672 0.002 0.000 5.774 0.003 /srv/conda/envs/notebook/lib/python3.12/site-packages/h5netcdf/attrs.py:113(__iter__)
1976 0.009 0.000 5.772 0.003 /srv/conda/envs/notebook/lib/python3.12/site-packages/h5py/_hl/attrs.py:247(__iter__)
951 0.004 0.000 5.619 0.006 /srv/conda/envs/notebook/lib/python3.12/site-packages/fsspec/spec.py:2103(read)
951 0.005 0.000 5.612 0.006 /srv/conda/envs/notebook/lib/python3.12/site-packages/fsspec/caching.py:838(_fetch)
2/1 0.000 0.000 4.734 4.734 /srv/conda/envs/notebook/lib/python3.12/site-packages/earthaccess/auth.py:165(get_s3_credentials)
2 0.000 0.000 4.734 2.367 /srv/conda/envs/notebook/lib/python3.12/site-packages/requests/sessions.py:593(get)
2 0.000 0.000 4.733 2.367 /srv/conda/envs/notebook/lib/python3.12/site-packages/requests/sessions.py:500(request)
5/2 0.000 0.000 4.732 2.366 /srv/conda/envs/notebook/lib/python3.12/site-packages/requests/sessions.py:673(send)
5 0.000 0.000 4.723 0.945 /srv/conda/envs/notebook/lib/python3.12/site-packages/requests/adapters.py:590(send)
5 0.000 0.000 4.720 0.944 /srv/conda/envs/notebook/lib/python3.12/site-packages/urllib3/connectionpool.py:592(urlopen)
5 0.000 0.000 4.718 0.944 /srv/conda/envs/notebook/lib/python3.12/site-packages/urllib3/connectionpool.py:377(_make_request)
9/5 0.000 0.000 4.440 0.888 /srv/conda/envs/notebook/lib/python3.12/site-packages/requests/sessions.py:159(resolve_redirects)
5 0.000 0.000 4.422 0.884 /srv/conda/envs/notebook/lib/python3.12/site-packages/urllib3/connection.py:534(getresponse)
5 0.000 0.000 4.421 0.884 /srv/conda/envs/notebook/lib/python3.12/http/client.py:1386(getresponse)
5 0.000 0.000 4.420 0.884 /srv/conda/envs/notebook/lib/python3.12/http/client.py:324(begin)
154 0.000 0.000 4.418 0.029 {method 'readline' of '_io.BufferedReader' objects}
10 0.000 0.000 4.418 0.442 /srv/conda/envs/notebook/lib/python3.12/socket.py:706(readinto)

Contributor Author

Copilot AI commented Apr 2, 2026

@copilot Evaluate the profile for one of the matchup calls. It still feels slow but maybe there is not way to avoid.

Plan: 100 points → 18 unique granule(s)
...

The profile shows the bottleneck is cloud I/O, not extraction logic:

  • threading.wait accounts for 16.3 of 21 seconds (~78%). This is the time spent waiting for network I/O — earthaccess opening 18 cloud-hosted HDF5 files, plus dask reading the actual data chunks from S3/HTTPS. At ~1.2 s/granule that's typical latency for streaming access to remote HDF5 files.
  • The remaining ~5 seconds is Python processing overhead (5.1M function calls) covering file-open metadata parsing, xarray/dask graph construction, the vectorized .sel() + .load(), and one gc.collect() per granule to release HDF5 reference cycles.

The extraction code itself is not the bottleneck here. There's no algorithmic fix for serial cloud I/O latency short of opening granules in parallel (overlapping network waits), which would be a large architectural change.

So yes — for 18 cloud granules, ~21 seconds is roughly what to expect without parallelizing the granule-open loop.
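For reference, overlapping the network waits would look roughly like the sketch below: a thread pool over a stand-in open_and_extract function, where time.sleep models the per-granule I/O latency. Nothing here is the actual engine code, and real remote-HDF5 opens would need thread-safety care (HDF5/h5py locking, per-thread file handles), which is part of why this is a larger architectural change:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def open_and_extract(granule):
    # Stand-in for opening one remote granule and running the
    # vectorized .sel().load(); the sleep models network latency,
    # which dominates the profile above.
    time.sleep(0.1)
    return f"values:{granule}"

granules = [f"granule_{i}" for i in range(8)]

# Serial loop (current behavior): wall time ~ 8 * 0.1 s.
t0 = time.perf_counter()
serial = [open_and_extract(g) for g in granules]
t_serial = time.perf_counter() - t0

# Thread pool: the waits overlap, so wall time ~ 0.1 s.
t0 = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as pool:
    parallel = list(pool.map(open_and_extract, granules))
t_parallel = time.perf_counter() - t0

print(round(t_serial, 1), round(t_parallel, 1))
```

Threads suffice here because the bottleneck is network wait, not CPU; the GIL is released during I/O.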


Development

Successfully merging this pull request may close these issues.

add vectorized nearest method

2 participants