This issue serves as the central place for discussing and tracking the implementation of the pygmt.select function in PyGMT. The issue will be closed when the initial implementation is complete. Progress is tracked at PyGMT: Wrapping GMT modules.
Documentation
GMT Option Flags and Modifiers
☑️: Implemented; ⬜: To be implemented/discussed; Strikethrough: Won't implement.
☑️ -A (area_thresh): Threshold for excluding small features based on area; skip polygons or coastline features smaller than this threshold.
☑️ -C (dist2pt): pointfile|lon/lat**+d**dist. Pass all records within dist of any point in pointfile (or of a single lon/lat point).
☑️ -D (resolution): "full", "high", "intermediate", "low", "crude", or "auto". Coastline resolution used with mask_values.
☑️ -F (polygon): Pass all records whose locations are inside one of the closed polygons in polygonfile.
☑️ -G (mask_grid): Pass all records that fall inside valid (non-NaN, non-zero) nodes of a grid mask.
☑️ -I (reverse): [cfglrsz]. Reverse the sense of the test for one or more of the spatial criteria.
☑️ -J (projection): Map projection used when computing Cartesian distances from geographic coordinates.
☑️ -L (dist2line): linefile**+d**dist[+p]. Pass all records within dist of any line segment in linefile.
☑️ -N (mask_values): wet/dry or ocean/land/lake/island/pond. Pass records based on whether they fall on land, ocean, or other geographic features.
☑️ -R (region): Rectangular region filter; pass only records inside the specified bounding box.
☑️ -V (verbose): Verbosity level.
~~-X/-Y~~: Use Figure.shift_origin instead.
☑️ -Z (z_subregion): min[/max][+a][+ccol][+i]. Pass records whose z (or other column) value lies within the given range.
☑️ -b (binary): Binary input/output.
☑️ -d (nodata): Replace input values equal to nodata with NaN, and do the reverse on output.
☑️ -e (find): Pattern matching to select input rows.
☑️ -f (coltypes): Column data types.
☑️ -g (gap): Gap detection.
☑️ -h (header): Read/write header records.
☑️ -i (incols): Select input columns.
☑️ -o (outcols): Select output columns.
⬜ -q: Select rows by row number or range.
☑️ -s (skiprows): Skip rows containing NaN values.
☑️ -w (wrap): Wrap repeated cycles.
~~--PAR=value~~: Use pygmt.config instead.
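The checklist above pairs each GMT flag with a long-form PyGMT parameter name. As a rough illustration of that correspondence, the sketch below turns keyword arguments into GMT-style flag strings. The `ALIASES` table is a subset copied from the checklist; the `to_gmt_flags` helper is hypothetical and is not PyGMT's actual alias machinery.

```python
# Subset of the flag/parameter pairs listed in the checklist above.
ALIASES = {
    "area_thresh": "A",
    "dist2pt": "C",
    "resolution": "D",
    "polygon": "F",
    "mask_grid": "G",
    "reverse": "I",
    "projection": "J",
    "dist2line": "L",
    "mask_values": "N",
    "region": "R",
    "z_subregion": "Z",
}


def to_gmt_flags(**params):
    """Translate long-form parameters into GMT-style flags (hypothetical helper)."""
    flags = []
    for name, value in params.items():
        if name not in ALIASES:
            raise ValueError(f"Unknown parameter: {name!r}")
        flag = ALIASES[name]
        # Simplification: a list or tuple becomes one flag per element
        # (e.g. a repeated -Z); every other value is used as given.
        values = value if isinstance(value, (list, tuple)) else [value]
        flags.extend(f"-{flag}{v}" for v in values)
    return flags


print(to_gmt_flags(region="0/10/0/10", reverse="r", z_subregion=["0/5", "1/2+c3"]))
# ['-R0/10/0/10', '-Ir', '-Z0/5', '-Z1/2+c3']
```

The region is passed as a pre-joined string here only to keep the helper short.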
Notes on Input Formats
data: Accepts a file path, 2-D numpy.ndarray, or pandas.DataFrame with (x, y) in the first two columns.
output_type: "pandas" (default), "numpy", or "file". Use "file" together with outfile.
Up to 7 spatial criteria can be combined simultaneously; all criteria must pass by default (logical AND). Use reverse to invert individual tests.
mask_values and resolution are only meaningful when testing against coastline features (criterion 5).
The deprecated parameters mask and gridmask (replaced by mask_values and mask_grid in v0.18.0) will be removed in v0.20.0.
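The notes above say that all active criteria must pass (logical AND) and that reverse inverts individual tests. The pure-Python sketch below models only that logic and does not call GMT. The `passes` function, the predicate table, and the sample records are all hypothetical, and the one-letter codes simply mirror the letters accepted by reverse.

```python
def passes(record, tests, reverse=""):
    """Return True if `record` passes every test, honouring per-test reversal.

    `tests` maps a one-letter criterion code to a predicate. Codes that
    appear in `reverse` have their outcome inverted before the AND.
    """
    for code, predicate in tests.items():
        outcome = predicate(record)
        if code in reverse:
            outcome = not outcome
        if not outcome:
            return False  # logical AND: a single failure rejects the record
    return True


# Records are (x, y, z) tuples.
records = [(1.0, 1.0, 5.0), (1.0, 1.0, 50.0), (20.0, 1.0, 5.0)]
tests = {
    "r": lambda rec: 0 <= rec[0] <= 10 and 0 <= rec[1] <= 10,  # region test
    "z": lambda rec: 0 <= rec[2] <= 10,  # z-range test
}

kept = [rec for rec in records if passes(rec, tests)]
kept_reversed_z = [rec for rec in records if passes(rec, tests, reverse="z")]

print(kept)             # [(1.0, 1.0, 5.0)]
print(kept_reversed_z)  # [(1.0, 1.0, 50.0)]
```

Reversing only the z test keeps the record that lies inside the region but outside the z range; the record outside the region is rejected in both cases.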
Linked Pull Requests
mask_grid (-G) parameter – Wrap gmtselect #1429
z_subregion (-Z) parameter – Update documentation of select.py #2123
mask → mask_values and gridmask → mask_grid (deprecation) – New Alias System: Add the private _to_string function #3986
-q (row-number selection) option
mask and gridmask parameters in v0.20.0
Related Issues and Discussions
pygmt.select provides spatial subsetting at the data-table level; for subsetting a grid, use pygmt.grdcut instead.
Combining polygon and reverse="f" efficiently excludes points that fall inside a known contaminated region (e.g., land stations in a marine dataset).
The z_subregion (-Z) parameter supports multiple column tests when passed as a list, enabling multi-column range filtering in a single call.
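As a hedged sketch of how the last two notes might translate into a call, the example below assembles keyword arguments for pygmt.select and wraps the call in a small function. The helper names, the file names (`land_polygons.txt`, `marine_stations.txt`), the value ranges, and the column layout are all assumptions made for illustration; the `+c3` modifier assumes GMT's zero-based column numbering.

```python
def marine_select_kwargs(land_polygons, depth_range, quality_range):
    """Build keyword arguments for pygmt.select (hypothetical helper)."""
    return {
        "polygon": land_polygons,  # -F: file of closed polygons
        "reverse": "f",  # -If: keep records NOT inside the polygons
        # Two -Z tests: the first applies to the third column by default,
        # the second names its column explicitly with +c.
        "z_subregion": [depth_range, f"{quality_range}+c3"],
        "output_type": "pandas",
    }


def select_marine(data, **kwargs):
    """Run the selection; requires PyGMT and GMT to be installed."""
    import pygmt  # imported here so the helper above works without GMT

    return pygmt.select(data=data, **kwargs)


kwargs = marine_select_kwargs("land_polygons.txt", "-6000/0", "0/1")
print(kwargs["z_subregion"])  # ['-6000/0', '0/1+c3']

# With the hypothetical input files in place, the call would be:
# result = select_marine("marine_stations.txt", **kwargs)
```

Keeping the argument assembly separate from the call makes the filter settings easy to inspect or log before any data is read.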