diff --git a/.claude/sweep-security-state.csv b/.claude/sweep-security-state.csv index 4407f972..86cb5766 100644 --- a/.claude/sweep-security-state.csv +++ b/.claude/sweep-security-state.csv @@ -15,10 +15,10 @@ edge_detection,2026-04-25,1271,MEDIUM,6,,"MEDIUM (fixed #1271): the five public emerging_hotspots,2026-04-25,1274,HIGH,1,,"HIGH (fixed #1274): emerging_hotspots() public API only validated ndim and shape[0] >= 2. The numpy and cupy backends each materialised three full (T, H, W) cubes (a float32 input copy, gi_zscore float32, gi_bin int8) plus H*W temporaries with no memory check; a (100, 20000, 20000) input projected to ~480 GB. Fixed by adding _available_memory_bytes()/_check_memory(n_times, ny, nx) (12 bytes per cube cell budget) and calling it from the public API for non-dask inputs. Dask paths skip the guard because their map_blocks/map_overlap chunk functions do not materialise the full cube. MEDIUM (unfixed, Cat 6): public API does not call _validate_raster() so non-numeric dtypes fail later with a confusing error rather than a clean TypeError. No GPU kernels in this module (uses convolve_2d). No file I/O. Cat 3 statistical paths are robust: _mann_kendall_statistic_numpy guards var_s <= 0 before sqrt, both numpy and cupy backends raise ZeroDivisionError on global_std == 0, and _mk_pvalue handles z==0 explicitly." erosion,2026-04-25,1275,HIGH,1;3;6,,"HIGH (fixed #1275): erode() accepted three user-controlled parameters with no upper bound. (1) iterations sized rng.random((iterations, 2)) on the host (16 B/particle) and was copied to the GPU via cupy.asarray, so iterations=10**12 attempted ~16 TB on each side. (2) params['radius'] drove _build_brush which iterates (2r+1)**2 cells and stores three arrays of the same length, so radius=10**6 allocated ~12 TB of brush data. (3) params['max_lifetime'] is the inner per-particle JIT loop in both _erode_cpu and _erode_gpu_kernel, so max_lifetime=10**12 with the default iterations=50000 ran 5e16 step iterations. 
The existing _check_erosion_memory helper only fired on dask paths and ignored the random_pos and brush working sets. Fixed by capping all three parameters at the public erode() entry via _validate_scalar(max_val=...) (_MAX_ITERATIONS=1e8, _MAX_RADIUS=1024, _MAX_LIFETIME=1e5), rewriting _check_erosion_memory to include the random_pos buffer and brush bytes in its budget, and wiring the guard into _erode_numpy and _erode_cupy so every backend benefits (the dask paths inherit it via their _erode_numpy/_erode_cupy calls). Mirrors diffuse #1268 pattern. Deferred follow-ups (separate PRs): Cat 3 HIGH NaN input is not guarded in _erode_cpu / _erode_gpu_kernel -- a NaN cell propagates through bilinear interpolation into dir_x/dir_y, NaN bounds checks fall through, and particles can deposit NaN into arbitrary cells via cuda.atomic.add. Cat 6 MEDIUM erode() does not call _validate_raster() on agg -- non-numeric or wrong-ndim input fails inside numba/cupy with a confusing error. No Cat 2 (no int32 flat-index math), no Cat 4 (GPU kernel has bounds guard at line 184 plus per-step bounds checks before every read/write, brush writes are explicitly bounds-checked, no shared memory), no Cat 5 (no file I/O)." fire,2026-04-25,,,,,"Clean. Despite the module's size hint, fire.py is purely per-cell raster ops -- not cellular-automaton or front-tracking. Seven public APIs: dnbr, rdnbr, burn_severity_class, fireline_intensity, flame_length, rate_of_spread, kbdi. No iteration, no queues, no multi-channel state, no random numbers, no file paths. Cat 1: every output allocation matches input shape (single buffer, bounded by caller). Anderson-13 fuel table is a fixed 13x8 constant. _rothermel_fuel_constants returns 12 scalars before dispatch (no per-pixel state). Cat 2: no flat-index math, all indexing is 2-D (y, x); no height*width multiplication. 
Cat 3: rdnbr guards denom < 1e-10; burn_severity_class is threshold-only; flame_length guards v <= 0.0 before fractional power; rate_of_spread guards M_x>0/beta>0/denom>0 and clamps eta_M, U_mmin, R; kbdi clamps Q to [0, 800] and net_P to >= 0. Adversarial wind=inf or T=inf would push exp/power to inf in rate_of_spread/kbdi but inputs are user-controlled rasters, fire model is research-quality (LOW only). Cat 4: all 7 CUDA kernels (_dnbr_gpu L157, _rdnbr_gpu L246, _bsc_gpu L362, _fli_gpu L455, _fl_gpu L552, _ros_gpu L681, _kbdi_gpu L870) have 'y < out.shape[0] and x < out.shape[1]' bounds guard; every kernel is point-wise (no neighbour stencil) so the simple guard is sufficient; no shared memory, no syncthreads needed. Cat 5: no file I/O. Cat 6: every public function calls _validate_raster on each input raster (dnbr/rdnbr/fireline_intensity/rate_of_spread/kbdi pass 2-3 rasters each, all validated), validate_arrays enforces equal shape, _validate_scalar gates heat_content/fuel_model (1-13)/annual_precip, and every input is .astype('f4') before reaching any kernel so dtype is normalized." -flood,2026-05-03,1437,MEDIUM,3,,"Re-audit 2026-05-03. MEDIUM Cat 3 fixed in PR #1438 (travel_time and flood_depth_vegetation now validate mannings_n DataArray values are finite and strictly positive via _validate_mannings_n_dataarray helper). No remaining unfixed findings. Other categories clean: every allocation is same-shape as input; no flat index math; NaN propagation explicit in every backend; tan_slope clamped by _TAN_MIN; no CUDA kernels; no file I/O; every public API calls _validate_raster on DataArray inputs." +flood,2026-05-03,1437,MEDIUM,3,,Re-audit 2026-05-03. MEDIUM Cat 3 fixed in PR #1438 (travel_time and flood_depth_vegetation now validate mannings_n DataArray values are finite and strictly positive via _validate_mannings_n_dataarray helper). No remaining unfixed findings. 
Other categories clean: every allocation is same-shape as input; no flat index math; NaN propagation explicit in every backend; tan_slope clamped by _TAN_MIN; no CUDA kernels; no file I/O; every public API calls _validate_raster on DataArray inputs. focal,2026-04-27,1284,HIGH,1,,"HIGH (fixed PR #1286): apply(), focal_stats(), and hotspots() accepted unbounded user-supplied kernels via custom_kernel(), which only checks shape parity. The kernel-size guard from #1241 (_check_kernel_memory) only ran inside circle_kernel/annulus_kernel, so a (50001, 50001) custom kernel on a 10x10 raster allocated ~10 GB on the kernel itself plus a much larger padded raster before any work -- same shape as the bilateral DoS in #1236. Fixed by adding _check_kernel_vs_raster_memory in focal.py and wiring it into apply(), focal_stats(), and hotspots() after custom_kernel() validation. All 134 focal tests + 19 bilateral tests pass. No other findings: 10 CUDA kernels all have proper bounds + stencil guards; _validate_raster called on every public entry point; hotspots already raises ZeroDivisionError on constant-value rasters; _focal_variety_cuda uses a fixed-size local buffer (silent truncation but bounded); _focal_std_cuda/_focal_var_cuda clamp the catastrophic-cancellation case via if var < 0.0: var = 0.0; no file I/O." geodesic,2026-04-27,1283,HIGH,1,,"HIGH (fixed PR #1285): slope(method='geodesic') and aspect(method='geodesic') stack a (3, H, W) float64 array (data, lat, lon) before dispatch with no memory check. A large lat/lon-tagged raster passed to either function would OOM. Fixed by adding _check_geodesic_memory(rows, cols) in xrspatial/geodesic.py (mirrors morphology._check_kernel_memory): budgets 56 bytes/cell (24 stacked float64 + 4 float32 output + 24 padded copy + slack) and raises MemoryError when > 50% of available RAM; called from slope.py and aspect.py inside the geodesic branch before dispatch. No other findings: 6 CUDA kernels all have bounds guards (e.g. 
_run_gpu_geodesic_aspect at geodesic.py:395), custom 16x16 thread blocks avoid register spill, no shared memory, _validate_raster runs upstream in slope/aspect, all backends cast to float32, slope_mag < 1e-7 flat threshold prevents arctan2 NaN propagation, curvature correction uses hardcoded WGS84 R." -geotiff,2026-04-28,1215,HIGH,1;4,1219;1220,"Re-audit 2026-04-28: only #7984f7a since prior pass (predictor=3 multi-band CPU fix, LERC dedup, VRT nodata dtype, AREA_OR_POINT, BigTIFF LONG8 offsets). LZW/inflate GPU kernels gate every write on out_pos single 100 MB Range GET. Fixed via per-tile cap (MAX_TILE_BYTES_DEFAULT=256 MiB, XRSPATIAL_COG_MAX_TILE_BYTES env override). Other recent PRs (#1530 IFD chain cap, #1527 IFD count, #1535 LERC mask, #1532 PlanarConfig=2, #1534 HTTP coalesce, #1531 parallel write, #1528 nvCOMP batch) audited clean. | Prior: Re-audit 2026-04-28: only #7984f7a since prior pass (predictor=3 multi-band CPU fix, LERC dedup, VRT nodata dtype, AREA_OR_POINT, BigTIFF LONG8 offsets). LZW/inflate GPU kernels gate every write on out_pos= 3 and distance only as >= 1, with no upper bound on either. _glcm_numba_kernel iterates range(r-half, r+half+1) for every pixel, so window_size=1_000_001 on a 10x10 raster ran ~10^14 loop iterations with all neighbors failing the interior bounds check (CPU DoS). On the dask backends depth = window_size // 2 + distance drove map_overlap padding, so a huge window also caused oversize per-chunk allocations (memory DoS). Fixed by adding max_val caps in the public entrypoint: window_size <= max(3, min(rows, cols)) and distance <= max(1, window_size // 2). One cap covers every backend because cupy and dask+cupy call through to the CPU kernel after cupy.asnumpy. No other HIGH findings: levels is already capped at 256 so the per-pixel np.zeros((levels, levels)) matrix in the kernel is bounded to 512 KB. No CUDA kernels. No file I/O. 
Quantization clips to [0, levels-1] before the kernel and NaN maps to -1 which the kernel filters with i_val >= 0. Entropy log(p) and correlation p / (std_i * std_j) are both guarded. All four backends use _validate_raster and cast to float64 before quantizing. MEDIUM (unfixed, Cat 1): the per-pixel np.zeros((levels, levels)) allocation inside the hot loop is a perf issue (levels=256 -> 512 KB alloc+free per pixel) but not a security issue because levels is bounded. Could be hoisted out of the loop or replaced with an in-place clear, but that is an efficiency concern, not security." gpu_rtx,2026-04-29,1308,HIGH,1,,"HIGH (fixed #1308 / PR #1310): hillshade_rtx (gpu_rtx/hillshade.py:184) and viewshed_gpu (gpu_rtx/viewshed.py:269) allocated cupy device buffers sized by raster shape with no memory check. create_triangulation (mesh_utils.py:23-24) adds verts (12 B/px) + triangles (24 B/px) = 36 B/px; hillshade_rtx adds d_rays(32) + d_hits(16) + d_aux(12) + d_output(4) = 64 B/px (100 B/px total); viewshed_gpu adds d_rays(32) + d_hits(16) + d_visgrid(4) + d_vsrays(32) = 84 B/px (120 B/px total). A 30000x30000 raster asked for 90-108 GB of VRAM before cupy surfaced an opaque allocator error. Fixed by adding gpu_rtx/_memory.py with _available_gpu_memory_bytes() and _check_gpu_memory(func_name, h, w) helpers (cost_distance #1262 / sky_view_factor #1299 pattern, 120 B/px budget covers worst case, raises MemoryError when required > 50% of free VRAM, skips silently when memGetInfo() unavailable). Wired into both entry points after the cupy.ndarray type check and before create_triangulation. 9 new tests in test_gpu_rtx_memory.py (5 helper-unit + 4 end-to-end gated on has_rtx). All 81 existing hillshade/viewshed tests still pass. Cat 4 clean: all CUDA kernels (hillshade.py:25/62/106, viewshed.py:32/74/116, mesh_utils.py:50) have bounds guards; no shared memory, no syncthreads needed. 
MEDIUM not fixed (Cat 6): hillshade_rtx and viewshed_gpu do not call _validate_raster directly but parent hillshade() (hillshade.py:252) and viewshed() (viewshed.py:1707) already validate, so input validation runs before the gpu_rtx entry point - defense-in-depth, not exploitable. MEDIUM not fixed (Cat 2): mesh_utils.py:64-68 cast mesh_map_index to int32 in the triangle index buffer; overflows at H*W > 2.1B vertices (~46341x46341+) but the new memory guard rejects rasters that large first - documentation/clarity item rather than exploitable. MEDIUM not fixed (Cat 3): mesh_utils.py:19 scale = maxDim / maxH divides by zero on an all-zero raster, propagating inf/NaN into mesh vertex z-coords; separate follow-up. LOW not fixed (Cat 5): mesh_utils.write() opens user-supplied path without canonicalization but its only call site (mesh_utils.py:38-39) sits behind if False: in create_triangulation, not reachable in production." hillshade,2026-04-27,,,,,"Clean. Cat 1: only allocation is the output np.empty(data.shape) at line 32 (cupy at line 165) and a _pad_array with hardcoded depth=1 (line 62) -- bounded by caller, no user-controlled amplifier. Azimuth/altitude are scalars and don't drive size. Cat 2: numba kernel uses range(1, rows-1) with simple (y, x) indexing; numba range loops promote to int64. Cat 3: math.sqrt(1.0 + xx_plus_yy) is always >= 1.0 (no neg sqrt, no div-by-zero); NaN elevation propagates correctly through dz_dx/dz_dy -> shaded -> output (the shaded < 0.0 / shaded > 1.0 clamps don't fire on NaN). Azimuth validated to [0, 360], altitude to [0, 90]. Cat 4: _gpu_calc_numba (line 107) guards both grid bounds and 3x3 stencil reads via i > 0 and i < shape[0]-1 and j > 0 and j < shape[1]-1; no shared memory. Cat 5: no file I/O. Cat 6: hillshade() calls _validate_raster (line 252) and _validate_scalar for both azimuth (253) and angle_altitude (254); all four backend paths cast to float32; tests parametrize int32/int64/float32/float64." 
@@ -38,13 +38,11 @@ rasterize,2026-04-21,1223,HIGH,1;2,,HIGH: unbounded out/written allocation in _r reproject,2026-05-03,1431;1433;1435,MEDIUM,3;6,,"Re-audit 2026-05-03. ALL MEDIUM findings fixed across 3 PRs. Cat 6 fixed in PR #1432: reproject(), merge() (per-element), geoid_height_raster() now call _validate_raster (ndim=(2,3)). Cat 3 grid params fixed in PR #1434: _validate_grid_params() helper rejects resolution/width/height <=0 / non-finite, bounds with right<=left or top<=bottom, transform_precision <0; wired into reproject() and merge(). Cat 3 NaN/Inf rejection fixed in PR #1436: itrf_transform requires finite epoch and non-empty src/tgt, geoid_height enforces finite lon/lat in [-90,90], _detect_nodata rejects +/-Inf. No HIGH findings: _compute_output_grid enforces _MAX_OUTPUT_PIXELS=1e9; chunk workers cap source-window reads at 64 Mpix; _find_file resolves only from hardcoded _GEOID_MODELS dict." resample,2026-04-28,1295,HIGH,1,,"HIGH (fixed #1295): resample() did not bound output dimensions derived from user-supplied scale_factor / target_resolution. _output_shape returns max(1, round(in_h * scale_y)), max(1, round(in_w * scale_x)) and was passed straight through to the eager numpy / cupy backends, where _run_numpy and _run_cupy / the _AGG_FUNCS numba kernels and _nan_aware_interp_np allocated np.empty / cupy.empty / map_coordinates buffers of that size with no memory check. scale_factor=1e9 on a 4x4 raster requested ~190 EB; target_resolution=1e-9 on a meter-scale raster did the same. Fixed by adding _available_memory_bytes() / _available_gpu_memory_bytes() helpers and _check_resample_memory(out_h, out_w) / _check_resample_gpu_memory(out_h, out_w) guards (12 B/cell budget covering float64 working buffer + float32 output + map_coordinates temporary), wired into resample() before backend dispatch. Eager numpy and cupy paths run the guard; dask paths skip it because per-chunk allocations are bounded by chunk size. 
Mirrors the kde / line_density (#1287), focal (#1284), geodesic (#1283), cost_distance (#1262), and diffuse (#1267) patterns. No other findings: _validate_raster called at line 698, scale_y > 0 / scale_x > 0 enforced, AGGREGATE_METHODS rejects scale > 1.0, identity fast path bypasses dispatch entirely, all numba kernels guard count > 0 before division, no CUDA kernels (cupy paths use cupy ufuncs + cupyx.scipy.ndimage), no file I/O, all backends cast to float64 before computation and float32 on output." sieve,2026-04-28,1296,HIGH,1,,"HIGH (fixed #1296): sieve() on numpy and cupy backends had no memory guard. _label_connected allocates parent (int32, 4B/px), rank (int32, 4B/px, reused as root_to_id), region_map_flat (int32, 4B/px), plus a float64 result copy (8B/px) ~ 20 B/pixel of working memory before any check. The dask paths (_sieve_dask line 343 and _sieve_dask_cupy line 366) already raised MemoryError via _available_memory_bytes() at 28 B/pixel budget, but the public sieve() API at line 489 dispatched np.ndarray inputs straight into _sieve_numpy with no guard, and _sieve_cupy at line 308 transferred to host via data.get() then called _sieve_numpy, inheriting the gap. A 50000x50000 numpy raster requested ~50 GB silently. Fixed by extracting _check_memory(rows, cols) and _check_gpu_memory(rows, cols) helpers (mirrors cost_distance #1262 / mahalanobis #1288 / multispectral #1291 / kde #1287 pattern) at 28 B/pixel host budget plus 16 B/pixel GPU round-trip budget at 50% of available memory threshold. _check_memory wired into _sieve_numpy at the top before the float64 copy. _check_gpu_memory wired into _sieve_cupy before data.get(); it also calls _check_memory so the host budget still applies. Consolidated _available_memory_bytes definition (was duplicated). All 47 tests pass including 2 new memory-guard tests for the numpy backend (_sieve_numpy direct call + public sieve() API). 
No other findings: Cat 2 int32 indexing in _label_connected docstring acknowledges <2.1B pixel limit; the new memory guard rejects rasters that large before the int32 issue can trigger so this is a documentation/clarity follow-up rather than an exploitable bug. Cat 3 NaN handled via valid mask; Cat 4 no CUDA kernels; Cat 5 only /proc/meminfo read; Cat 6 _validate_raster called at line 478." -slope,2026-04-28,,,,,"Clean. slope() validates input via _validate_raster (line 383) and _validate_boundary (line 389). Cat 1: planar _cpu/_run_cupy allocate output matching input shape; geodesic paths build (3,H,W) float64 stacked array but are gated by _check_geodesic_memory(rows, cols) at line 410 (already fixed under geodesic audit, PR #1285). Cat 2: no int32 flat-index math; all loops 2D with range(). Cat 3: NaN propagates through arctan in planar kernels; geodesic delegates to _local_frame_project_and_fit which has explicit NaN guards and degenerate det check. Cat 4: _run_gpu (line 146) uses combined bounds+stencil guard 'i-di>=0 and i+di=0 and j+dj=0 and i+di=0 and j+dj0 check line 123); NaN elevation at observer cell would taint los_height but is a correctness not DoS concern. Cat 4: no CUDA kernels. Cat 5: no file I/O. Cat 6: elevations cast to float64 in _extract_transect line 79." worley,2026-04-28,,,,,"Clean. worley() calls _validate_raster at line 234 (Cat 6 OK). Cat 1: output allocation matches input agg.shape (np.empty_like at line 80, cupy.empty at line 174); not a width/height generator like bump, so unbounded alloc N/A. Cat 2: cell_x/cell_y use & 255 mask before perm-table indexing, no overflow risk; tid/block_size math bounded by hardware limits. Cat 3: no division by data-derived values; out.shape guards prevent zero-div in coordinate computation; no NaN read from input (pure noise generator). Cat 4 (PRIMARY): both @cuda.jit kernels (_worley_gpu line 99, _worley_gpu_xy line 135) have correct bounds guard 'if i < out.shape[0] and j < out.shape[1]'. 
cuda.shared.array(512, nb.int32) uses HARDCODED constant 512 (matches 256*2 perm table size), NOT derived from caller input — safe. cuda.syncthreads() called at line 110/147 between strided shared-mem write and reads. Each thread writes distinct sp[k] indices via 'range(tid, 512, block_size)', no race. All threads (incl. out-of-bounds) participate in the load loop before the bounds check, so syncthreads divergence is avoided. Cat 5: no file I/O. Minor: freq/seed not range-validated, _worley_numpy uses np.empty_like(data) which preserves int dtype if input is int (truncation). Functional, not security." zonal,2026-04-22,1227,HIGH,1;2;6,,"HIGH (fixed #1227): _stats_cupy used `if nodata_values:` (truthy) so nodata_values=0 silently skipped the filter on the cupy backend, producing wrong stats vs every other backend. MEDIUM (unfixed): _strides uses np.int32 for stride indices — can wrap for arrays > ~2B elements in the numpy path. MEDIUM (unfixed): hypsometric_integral() skips _validate_raster on zones/values; _regions_numpy has no memory guard (numpy-only path, bounded by caller-allocated input). MEDIUM (unfixed): _stats_numpy return_type='xarray.DataArray' allocates np.full((n_stats, values.size)) with no guard." -surface_distance,2026-04-28,1303,HIGH,1,,Fixed in PR #1305: added _check_memory and _check_gpu_memory guards to _surface_distance_numpy (line ~233) and _surface_distance_cupy (line ~448) before O(H*W) heap+output allocations. Dask paths inherit via per-chunk numpy call. Other categories clean. diff --git a/xrspatial/geotiff/_reader.py b/xrspatial/geotiff/_reader.py index 6188a879..ba91c8c3 100644 --- a/xrspatial/geotiff/_reader.py +++ b/xrspatial/geotiff/_reader.py @@ -53,6 +53,28 @@ def _check_dimensions(width, height, samples, max_pixels): ) +#: Default per-tile compressed-byte cap for HTTP COG reads. 
A crafted
+#: ``TileByteCounts`` entry can declare arbitrarily many bytes, and the
+#: HTTP path then tries to fetch and buffer that many bytes from the
+#: server before it ever decompresses. 256 MiB tolerates legitimate
+#: large tiles (RGB JPEG2000 at very high resolution can land in the
+#: tens of MB) while keeping the fetch bounded. Override via the
+#: ``XRSPATIAL_COG_MAX_TILE_BYTES`` environment variable.
+MAX_TILE_BYTES_DEFAULT = 256 << 20  # 256 MiB
+
+
+def _max_tile_bytes_from_env() -> int:
+    """Read the per-tile byte cap from the environment, or fall back to the default."""
+    raw = _os_module.environ.get('XRSPATIAL_COG_MAX_TILE_BYTES')
+    if raw is None:
+        return MAX_TILE_BYTES_DEFAULT
+    try:
+        val = int(raw)
+    except (TypeError, ValueError):
+        return MAX_TILE_BYTES_DEFAULT
+    return max(1, val)
+
+
 # ---------------------------------------------------------------------------
 # Data source abstraction
 # ---------------------------------------------------------------------------
@@ -1317,6 +1339,14 @@ def _fetch_decode_cog_http_tiles(
     # Empty tiles (byte_count == 0) and any tile_idx beyond the offsets
     # array are skipped here so the fetch list stays exactly aligned with
     # the placements list.
+    #
+    # Each tile's compressed size is bounded against MAX_TILE_BYTES BEFORE
+    # the fetch list is built. A crafted COG can claim arbitrarily large
+    # TileByteCounts; without this guard the HTTP layer would issue a
+    # Range request sized by the attacker's value (issue #1536). The cap
+    # is overridable via XRSPATIAL_COG_MAX_TILE_BYTES; the local-mmap
+    # path is naturally bounded by file size and does not need this check.
+    max_tile_bytes = _max_tile_bytes_from_env()
     fetch_ranges: list[tuple[int, int]] = []
     placements: list[tuple[int, int]] = []  # (tr, tc) per fetched tile
     for tr in range(tile_row_start, tile_row_end):
@@ -1328,6 +1358,15 @@
             bc = byte_counts[tile_idx]
             if bc == 0:
                 continue
+            if bc > max_tile_bytes:
+                raise ValueError(
+                    f"TIFF tile {tile_idx} declares "
+                    f"TileByteCount={bc:,} bytes, which exceeds the HTTP "
+                    f"COG safety cap of {max_tile_bytes:,} bytes. The "
+                    f"file is malformed or attempting denial-of-service. "
+                    f"Override via XRSPATIAL_COG_MAX_TILE_BYTES if this "
+                    f"file is legitimate."
+                )
             fetch_ranges.append((off, bc))
             placements.append((tr, tc))
diff --git a/xrspatial/geotiff/tests/test_security.py b/xrspatial/geotiff/tests/test_security.py
index ec020391..da1e7722 100644
--- a/xrspatial/geotiff/tests/test_security.py
+++ b/xrspatial/geotiff/tests/test_security.py
@@ -517,3 +517,213 @@ def test_boundary_exact_count_ok(self, tmp_path):
         arr, _ = read_to_array(path)
         np.testing.assert_array_equal(arr, expected)
+
+
+# ---------------------------------------------------------------------------
+# HTTP COG: per-tile compressed-byte cap (issue #1536)
+#
+# A crafted TIFF served over HTTP can declare arbitrarily large
+# TileByteCounts. Without the cap added in #1536, _fetch_decode_cog_http_tiles
+# passes those values straight into Range GETs sized by the attacker.
+# The local-mmap path is naturally bounded by file size, so these tests
+# only exercise the HTTP path through a mock _HTTPSource.
+# ---------------------------------------------------------------------------
+
+import threading
+
+
+def _patch_tile_byte_counts(data: bytearray, value: int) -> None:
+    """Rewrite every TileByteCounts entry in *data* (in place) to *value*.
+
+    Walks the first IFD looking for tag 325. Handles the LONG (4 byte) and
+    SHORT (2 byte) encodings, both inline and via overflow pointer. Used by
+    the tests below to forge a COG with attacker-controlled byte counts.
+    """
+    from xrspatial.geotiff._header import parse_header
+    header = parse_header(bytes(data))
+    bo = header.byte_order
+    ifd_offset = header.first_ifd_offset
+    num_entries = struct.unpack_from(f'{bo}H', data, ifd_offset)[0]
+    entry_offset = ifd_offset + 2
+
+    for i in range(num_entries):
+        eo = entry_offset + i * 12
+        tag = struct.unpack_from(f'{bo}H', data, eo)[0]
+        if tag != 325:  # TileByteCounts
+            continue
+        type_id = struct.unpack_from(f'{bo}H', data, eo + 2)[0]
+        count = struct.unpack_from(f'{bo}I', data, eo + 4)[0]
+        if type_id == 4:  # LONG
+            total = count * 4
+            if total <= 4:
+                for k in range(count):
+                    struct.pack_into(f'{bo}I', data, eo + 8 + k * 4, value)
+            else:
+                ptr = struct.unpack_from(f'{bo}I', data, eo + 8)[0]
+                for k in range(count):
+                    struct.pack_into(
+                        f'{bo}I', data, ptr + k * 4, value)
+        elif type_id == 3:  # SHORT
+            clipped = min(value, 0xFFFF)
+            total = count * 2
+            if total <= 4:
+                for k in range(count):
+                    struct.pack_into(
+                        f'{bo}H', data, eo + 8 + k * 2, clipped)
+            else:
+                ptr = struct.unpack_from(f'{bo}I', data, eo + 8)[0]
+                for k in range(count):
+                    struct.pack_into(
+                        f'{bo}H', data, ptr + k * 2, clipped)
+        return
+    raise AssertionError("TileByteCounts (tag 325) not found in IFD")
+
+
+class _MockHTTPSource:
+    """Minimal _HTTPSource stand-in that serves bytes from an in-memory buf.
+
+    Tests use this to drive _read_cog_http through the HTTP code path without
+    spinning up a real server. read_range / read_ranges_coalesced match the
+    real source's signatures so the reader cannot tell the difference.
+    """
+
+    def __init__(self, buf: bytes):
+        self._url = 'mock://'
+        self._size = len(buf)
+        self._pool = None
+        self._buf = buf
+        self._lock = threading.Lock()
+        self.calls: list[tuple[int, int]] = []
+
+    def read_range(self, start: int, length: int) -> bytes:
+        with self._lock:
+            self.calls.append((start, length))
+        return self._buf[start:start + length]
+
+    def read_all(self) -> bytes:
+        with self._lock:
+            self.calls.append((0, len(self._buf)))
+        return self._buf
+
+    def read_ranges(self, ranges, max_workers=8):
+        return [self.read_range(s, le) for s, le in ranges]
+
+    def read_ranges_coalesced(self, ranges, max_workers=8, gap_threshold=None):
+        from xrspatial.geotiff._reader import (
+            coalesce_ranges, split_coalesced_bytes,
+            COALESCE_GAP_THRESHOLD_DEFAULT,
+        )
+        if gap_threshold is None:
+            gap_threshold = COALESCE_GAP_THRESHOLD_DEFAULT
+        merged, mapping = coalesce_ranges(ranges, gap_threshold=gap_threshold)
+        merged_bytes = self.read_ranges(merged, max_workers=max_workers)
+        return split_coalesced_bytes(merged_bytes, mapping)
+
+    def close(self):
+        pass
+
+
+class TestHTTPTileByteCountCap:
+    """Regression tests for the HTTP COG byte_count cap (#1536)."""
+
+    def _build_forged_cog(self, tmp_path, byte_count_value: int) -> bytes:
+        """Build a real tiled COG, then patch every TileByteCounts entry."""
+        from xrspatial.geotiff import to_geotiff
+        import xarray as xr
+        arr = np.arange(64 * 64, dtype=np.float32).reshape(64, 64)
+        da = xr.DataArray(arr, dims=['y', 'x'])
+        path = str(tmp_path / "forged_1536.tif")
+        to_geotiff(da, path, tile_size=32, compression='deflate')
+        with open(path, 'rb') as f:
+            data = bytearray(f.read())
+        _patch_tile_byte_counts(data, byte_count_value)
+        return bytes(data)
+
+    def test_huge_byte_count_rejected(self, tmp_path, monkeypatch):
+        """A tile claiming more bytes than the cap raises ValueError."""
+        from xrspatial.geotiff import _reader as _reader_mod
+
+        # 100 MB > the 1 MB cap we set for this test
+        forged = self._build_forged_cog(tmp_path, 100 * 1024 * 1024)
+        src = _MockHTTPSource(forged)
+        monkeypatch.setattr(_reader_mod, '_HTTPSource', lambda url: src)
+        monkeypatch.setenv('XRSPATIAL_COG_MAX_TILE_BYTES', str(1024 * 1024))
+
+        with pytest.raises(ValueError, match="TileByteCount"):
+            _reader_mod._read_cog_http('http://mock/forged.tif')
+
+    def test_error_message_names_offending_value(self, tmp_path, monkeypatch):
+        """The error mentions the byte count and the cap so operators can
+        diagnose without re-reading the source."""
+        from xrspatial.geotiff import _reader as _reader_mod
+
+        forged = self._build_forged_cog(tmp_path, 50 * 1024 * 1024)
+        src = _MockHTTPSource(forged)
+        monkeypatch.setattr(_reader_mod, '_HTTPSource', lambda url: src)
+        monkeypatch.setenv('XRSPATIAL_COG_MAX_TILE_BYTES', str(1024))
+
+        with pytest.raises(ValueError) as excinfo:
+            _reader_mod._read_cog_http('http://mock/forged.tif')
+        msg = str(excinfo.value)
+        assert "52,428,800" in msg or "52428800" in msg  # the byte count
+        assert "1,024" in msg or "1024" in msg  # the cap
+        assert "denial-of-service" in msg.lower() or "malformed" in msg
+
+    def test_normal_cog_still_reads(self, tmp_path, monkeypatch):
+        """Realistic per-tile byte counts pass under the default cap."""
+        from xrspatial.geotiff import to_geotiff, _reader as _reader_mod
+        import xarray as xr
+
+        arr = np.arange(64 * 64, dtype=np.float32).reshape(64, 64)
+        da = xr.DataArray(arr, dims=['y', 'x'])
+        path = str(tmp_path / "normal_1536.tif")
+        to_geotiff(da, path, tile_size=32, compression='deflate')
+        with open(path, 'rb') as f:
+            buf = f.read()
+
+        src = _MockHTTPSource(buf)
+        monkeypatch.setattr(_reader_mod, '_HTTPSource', lambda url: src)
+
+        result, _ = _reader_mod._read_cog_http('http://mock/normal.tif')
+        np.testing.assert_array_equal(result, arr)
+
+    def test_env_override_lifts_cap(self, tmp_path, monkeypatch):
+        """A user with legitimate large tiles can raise the cap via env."""
+        from xrspatial.geotiff import to_geotiff, _reader as _reader_mod
+        import xarray as xr
+
+        arr = np.arange(64 * 64, dtype=np.float32).reshape(64, 64)
+        da = xr.DataArray(arr, dims=['y', 'x'])
+        path = str(tmp_path / "normal_env_1536.tif")
+        to_geotiff(da, path, tile_size=32, compression='deflate')
+        with open(path, 'rb') as f:
+            buf = f.read()
+
+        src = _MockHTTPSource(buf)
+        monkeypatch.setattr(_reader_mod, '_HTTPSource', lambda url: src)
+        # A tiny cap WITHOUT the env override would fire; lift it past
+        # the actual byte counts and the read should succeed.
+        monkeypatch.setenv('XRSPATIAL_COG_MAX_TILE_BYTES', str(64 * 1024 * 1024))
+
+        result, _ = _reader_mod._read_cog_http('http://mock/normal_env.tif')
+        np.testing.assert_array_equal(result, arr)
+
+    def test_local_path_unaffected_by_cap(self, tmp_path, monkeypatch):
+        """The local-mmap reader bypasses the HTTP cap.
+
+        A patched on-disk file with huge byte_counts decodes via mmap
+        slicing, which silently truncates at EOF. The cap is HTTP-only,
+        so a tight env cap should not break legitimate local reads.
+        """
+        from xrspatial.geotiff import open_geotiff, to_geotiff
+        import xarray as xr
+
+        arr = np.arange(64 * 64, dtype=np.float32).reshape(64, 64)
+        da = xr.DataArray(arr, dims=['y', 'x'])
+        path = str(tmp_path / "local_normal_1536.tif")
+        to_geotiff(da, path, tile_size=32, compression='deflate')
+
+        # Set the HTTP cap to 1 byte; local reads must still succeed
+        monkeypatch.setenv('XRSPATIAL_COG_MAX_TILE_BYTES', '1')
+        result = open_geotiff(path)
+        np.testing.assert_array_equal(result.values, arr)
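Reviewer note: the cap-resolution helper in the `_reader.py` hunk is self-contained enough to sanity-check outside the package. Below is a standalone copy of the same logic for illustration; it uses `os` directly where the real helper reads the module's `_os_module` alias, and the function name drops the leading underscore to make clear it is a local copy, not the shipped API.

```python
import os

# Standalone copy of the cap-resolution logic from the diff above. The real
# helper is _max_tile_bytes_from_env() in xrspatial/geotiff/_reader.py.
MAX_TILE_BYTES_DEFAULT = 256 << 20  # 256 MiB


def max_tile_bytes_from_env() -> int:
    """Resolve the per-tile byte cap: env override if valid, else default."""
    raw = os.environ.get('XRSPATIAL_COG_MAX_TILE_BYTES')
    if raw is None:
        return MAX_TILE_BYTES_DEFAULT
    try:
        val = int(raw)
    except (TypeError, ValueError):
        # Garbage in the env var falls back to the default rather than
        # silently disabling the guard.
        return MAX_TILE_BYTES_DEFAULT
    # Clamp to >= 1 so the cap can never be zero or negative.
    return max(1, val)
```

The `max(1, val)` clamp is worth noting in review: setting the variable to `0` or a negative number yields a 1-byte cap, which rejects every non-empty tile rather than disabling the check.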
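For reviewers unfamiliar with the IFD walk in `_patch_tile_byte_counts`: a classic TIFF IFD is a 2-byte entry count followed by 12-byte entries laid out as tag (u16), type (u16), count (u32), then an inline value or overflow offset (u32); TileByteCounts is tag 325 and type 4 is LONG. The hypothetical snippet below (not part of the diff) forges a one-entry IFD with an inline LONG TileByteCounts and reads it back, mirroring the layout the test helper patches.

```python
import struct

bo = '<'  # little-endian, i.e. an 'II' TIFF

def build_ifd(byte_count: int) -> bytearray:
    """Forge a minimal one-entry IFD holding an inline LONG TileByteCounts."""
    buf = bytearray(2 + 12 + 4)                      # count + entry + next-IFD ptr
    struct.pack_into(f'{bo}H', buf, 0, 1)            # one directory entry
    struct.pack_into(f'{bo}H', buf, 2, 325)          # tag 325 = TileByteCounts
    struct.pack_into(f'{bo}H', buf, 4, 4)            # type 4 = LONG
    struct.pack_into(f'{bo}I', buf, 6, 1)            # count 1 -> value fits inline
    struct.pack_into(f'{bo}I', buf, 10, byte_count)  # inline value field
    return buf

def read_tile_byte_count(buf) -> int:
    """Walk the entries exactly as _patch_tile_byte_counts does."""
    num = struct.unpack_from(f'{bo}H', buf, 0)[0]
    for i in range(num):
        eo = 2 + i * 12
        if struct.unpack_from(f'{bo}H', buf, eo)[0] == 325:
            return struct.unpack_from(f'{bo}I', buf, eo + 8)[0]
    raise AssertionError('tag 325 not found')
```

This also shows why the test helper needs the `total <= 4` branch: a single LONG occupies exactly the 4-byte value field, so `count == 1` stays inline while larger counts move behind an overflow pointer.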
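The mock's `read_ranges_coalesced` delegates to `coalesce_ranges` / `split_coalesced_bytes` from `_reader.py`, whose bodies do not appear in this diff. The sketch below is an illustrative reimplementation of the contract inferred from the call site only — nearby `(start, length)` ranges merge into fewer fetches, and a mapping slices each original range back out of the merged payloads. The `mapping` structure here (`(merged_idx, offset, length)` triples) is an assumption, not the library's actual representation.

```python
def coalesce_ranges(ranges, gap_threshold=0):
    """Merge (start, length) ranges whose inter-range gap is <= gap_threshold.

    Returns (merged, mapping): merged is the coalesced (start, length) list;
    mapping[i] = (merged_idx, offset, length) locates original range i
    inside its merged block. Structure assumed for illustration.
    """
    indexed = sorted(enumerate(ranges), key=lambda p: p[1][0])
    merged = []                       # coalesced (start, length) pairs
    mapping = [None] * len(ranges)
    for orig_idx, (start, length) in indexed:
        if merged and start - (merged[-1][0] + merged[-1][1]) <= gap_threshold:
            m_start, m_len = merged[-1]
            # Extend the current block; max() handles fully-contained ranges.
            merged[-1] = (m_start, max(m_len, start + length - m_start))
        else:
            merged.append((start, length))
        m_idx = len(merged) - 1
        mapping[orig_idx] = (m_idx, start - merged[m_idx][0], length)
    return merged, mapping


def split_coalesced_bytes(merged_bytes, mapping):
    """Slice each original range back out of its merged payload."""
    return [merged_bytes[m_idx][off:off + length]
            for m_idx, off, length in mapping]
```

Under this contract the mock stays byte-identical to uncoalesced reads, which is exactly the property the reader tests depend on.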