forked from DOI-USGS/dataretrieval-python
"""Functions for downloading data from the Water Data APIs, including the USGS
Aquarius Samples database.
See https://api.waterdata.usgs.gov/ for API reference.
"""
import json
import logging
from io import StringIO
from typing import List, Optional, Tuple, Union, get_args

import pandas as pd
import requests
from requests.models import PreparedRequest

from dataretrieval.utils import BaseMetadata, to_str
from dataretrieval.waterdata.types import (
    CODE_SERVICES,
    METADATA_COLLECTIONS,
    PROFILES,
    SERVICES,
)
from dataretrieval.waterdata.utils import (
    SAMPLES_URL,
    get_ogc_data,
    _construct_api_requests,
    _walk_pages,
    _check_profiles,
)

# Set up logger for this module
logger = logging.getLogger(__name__)
def get_daily(
    monitoring_location_id: Optional[Union[str, List[str]]] = None,
    parameter_code: Optional[Union[str, List[str]]] = None,
    statistic_id: Optional[Union[str, List[str]]] = None,
    properties: Optional[List[str]] = None,
    time_series_id: Optional[Union[str, List[str]]] = None,
    daily_id: Optional[Union[str, List[str]]] = None,
    approval_status: Optional[Union[str, List[str]]] = None,
    unit_of_measure: Optional[Union[str, List[str]]] = None,
    qualifier: Optional[Union[str, List[str]]] = None,
    value: Optional[Union[str, List[str]]] = None,
    last_modified: Optional[str] = None,
    skip_geometry: Optional[bool] = None,
    time: Optional[Union[str, List[str]]] = None,
    bbox: Optional[List[float]] = None,
    limit: Optional[int] = None,
    convert_type: bool = True,
) -> Tuple[pd.DataFrame, BaseMetadata]:
    """Daily data provide one data value to represent water conditions for the
day.
Throughout much of the history of the USGS, the primary water data available
was daily data collected manually at the monitoring location once each day.
With improved availability of computer storage and automated transmission of
data, the daily data published today are generally a statistical summary or
metric of the continuous data collected each day, such as the daily mean,
minimum, or maximum value. Daily data are automatically calculated from the
continuous data of the same parameter code and are described by parameter
code and a statistic code. These data have also been referred to as “daily
values” or “DV”.
Parameters
----------
monitoring_location_id : string or list of strings, optional
A unique identifier representing a single monitoring location. This
corresponds to the id field in the monitoring-locations endpoint.
Monitoring location IDs are created by combining the agency code of
the agency responsible for the monitoring location (e.g. USGS) with
the ID number of the monitoring location (e.g. 02238500), separated
by a hyphen (e.g. USGS-02238500).
parameter_code : string or list of strings, optional
Parameter codes are 5-digit codes used to identify the constituent
measured and the units of measure. A complete list of parameter
codes and associated groupings can be found at
https://help.waterdata.usgs.gov/codes-and-parameters/parameters.
statistic_id : string or list of strings, optional
A code corresponding to the statistic an observation represents.
Example codes include 00001 (max), 00002 (min), and 00003 (mean).
A complete list of codes and their descriptions can be found at
https://help.waterdata.usgs.gov/code/stat_cd_nm_query?stat_nm_cd=%25&fmt=html.
properties : string or list of strings, optional
A list of requested columns to be returned from the query.
Available options are: geometry, id, time_series_id,
monitoring_location_id, parameter_code, statistic_id, time, value,
unit_of_measure, approval_status, qualifier, last_modified
time_series_id : string or list of strings, optional
A unique identifier representing a single time series. This
corresponds to the id field in the time-series-metadata endpoint.
daily_id : string or list of strings, optional
A universally unique identifier (UUID) representing a single version of
a record. It is not stable over time. Every time the record is refreshed
in our database (which may happen as part of normal operations and does
not imply any change to the data itself) a new ID will be generated. To
uniquely identify a single observation over time, compare the time and
time_series_id fields; each time series will only have a single
observation at a given time.
approval_status : string or list of strings, optional
Some of the data that you have obtained from this U.S. Geological Survey
database may not have received Director's approval. Any such data values
are qualified as provisional and are subject to revision. Provisional
data are released on the condition that neither the USGS nor the United
States Government may be held liable for any damages resulting from its
use. This field reflects the approval status of each record, and is either
"Approved", meaning processing review has been completed and the data are
approved for publication, or "Provisional" and subject to revision. For
more information about provisional data, go to:
https://waterdata.usgs.gov/provisional-data-statement/.
unit_of_measure : string or list of strings, optional
A human-readable description of the units of measurement associated
with an observation.
qualifier : string or list of strings, optional
This field indicates any qualifiers associated with an observation, for
instance if a sensor may have been impacted by ice or if values were
estimated.
value : string or list of strings, optional
The value of the observation. Values are transmitted as strings in
the JSON response format in order to preserve precision.
last_modified : string, optional
The last time a record was refreshed in our database. This may happen
due to regular operational processes and does not necessarily indicate
that anything about the measurement has changed. You can query this field
using date-times or intervals, adhering to RFC 3339, or using ISO 8601
duration objects. Intervals may be bounded or half-bounded (double-dots
at start or end).
Examples:
* A date-time: "2018-02-12T23:20:50Z"
* A bounded interval: "2018-02-12T00:00:00Z/2018-03-18T12:31:12Z"
* Half-bounded intervals: "2018-02-12T00:00:00Z/.." or "../2018-03-18T12:31:12Z"
* Duration objects: "P1M" for data from the past month or "PT36H" for the last 36 hours
Only features that have a last_modified that intersects the value of
datetime are selected.
skip_geometry : boolean, optional
This option can be used to skip response geometries for each feature.
The returning object will be a data frame with no spatial information.
Note that the USGS Water Data APIs use camelCase "skipGeometry" in
CQL2 queries.
time : string, optional
The date an observation represents. You can query this field using
date-times or intervals, adhering to RFC 3339, or using ISO 8601
duration objects. Intervals may be bounded or half-bounded (double-dots
at start or end). Only features that have a time that intersects the
value of datetime are selected. If a feature has multiple temporal
properties, it is the decision of the server whether only a single
temporal property is used to determine the extent or all relevant
temporal properties.
Examples:
* A date-time: "2018-02-12T23:20:50Z"
* A bounded interval: "2018-02-12T00:00:00Z/2018-03-18T12:31:12Z"
* Half-bounded intervals: "2018-02-12T00:00:00Z/.." or "../2018-03-18T12:31:12Z"
* Duration objects: "P1M" for data from the past month or "PT36H" for the last 36 hours
bbox : list of numbers, optional
Only features that have a geometry that intersects the bounding box are
selected. The bounding box is provided as four or six numbers,
depending on whether the coordinate reference system includes a vertical
axis (height or depth). Coordinates are assumed to be in CRS EPSG:4326. The
expected format is a list of numbers structured as [xmin, ymin, xmax, ymax].
Another way to think of it is [western-most longitude, southern-most
latitude, eastern-most longitude, northern-most latitude].
limit : int, optional
The optional limit parameter is used to control the subset of the
selected features that should be returned in each page. The maximum
allowable limit is 50000. It may be beneficial to set this number lower
if your internet connection is spotty. The default (None) will set the
limit to the maximum allowable limit for the service.
convert_type : boolean, optional
If True, converts columns to appropriate types.
Returns
-------
df : ``pandas.DataFrame`` or ``geopandas.GeoDataFrame``
Formatted data returned from the API query.
md : :obj:`dataretrieval.utils.BaseMetadata`
A custom metadata object.
Examples
--------
.. code::
>>> # Get daily flow data from a single site
>>> # over a yearlong period
>>> df, md = dataretrieval.waterdata.get_daily(
... monitoring_location_id="USGS-02238500",
... parameter_code="00060",
... time="2021-01-01T00:00:00Z/2022-01-01T00:00:00Z",
... )
>>> # Get approved daily flow data from multiple sites
>>> df, md = dataretrieval.waterdata.get_daily(
... monitoring_location_id=["USGS-05114000", "USGS-09423350"],
... approval_status="Approved",
... time="2024-01-01/..",
... )
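The ``time`` and ``last_modified`` arguments take RFC 3339 interval strings,
and ``bbox`` expects west, south, east, north order. The block below is a
minimal sketch of how such arguments might be assembled before calling this
function. It is not part of this module: ``rfc3339_interval`` and
``make_bbox`` are hypothetical helper names, and the sketch assumes
timezone-aware datetimes.

```python
from datetime import datetime, timezone
from typing import List, Optional


def rfc3339_interval(start: Optional[datetime] = None,
                     end: Optional[datetime] = None) -> str:
    # Hypothetical helper: format a bounded or half-bounded interval,
    # using ".." for an open end. Datetimes must be timezone-aware.
    if start is None and end is None:
        raise ValueError("at least one end of the interval is required")

    def fmt(moment: Optional[datetime]) -> str:
        if moment is None:
            return ".."
        # Normalize to UTC and emit the trailing "Z" form used in the docs.
        return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

    return f"{fmt(start)}/{fmt(end)}"


def make_bbox(west: float, south: float, east: float, north: float) -> List[float]:
    # Hypothetical helper: return [xmin, ymin, xmax, ymax] after basic checks.
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitudes must be within [-180, 180]")
    if not (-90 <= south <= north <= 90):
        raise ValueError("latitudes must satisfy -90 <= south <= north <= 90")
    return [west, south, east, north]


start = datetime(2018, 2, 12, tzinfo=timezone.utc)
end = datetime(2018, 3, 18, 12, 31, 12, tzinfo=timezone.utc)
print(rfc3339_interval(start, end))  # 2018-02-12T00:00:00Z/2018-03-18T12:31:12Z
print(rfc3339_interval(start))       # 2018-02-12T00:00:00Z/..
print(make_bbox(-81.7, 28.5, -81.0, 29.3))
```

The resulting string and list could then be passed as ``time`` and ``bbox``.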
"""
    service = "daily"
    output_id = "daily_id"

    # Build argument dictionary, omitting None values
    args = {
        k: v
        for k, v in locals().items()
        if k not in {"service", "output_id"} and v is not None
    }

    return get_ogc_data(args, output_id, service)
def get_continuous(
    monitoring_location_id: Optional[Union[str, List[str]]] = None,
    parameter_code: Optional[Union[str, List[str]]] = None,
    statistic_id: Optional[Union[str, List[str]]] = None,
    properties: Optional[List[str]] = None,
    time_series_id: Optional[Union[str, List[str]]] = None,
    continuous_id: Optional[Union[str, List[str]]] = None,
    approval_status: Optional[Union[str, List[str]]] = None,
    unit_of_measure: Optional[Union[str, List[str]]] = None,
    qualifier: Optional[Union[str, List[str]]] = None,
    value: Optional[Union[str, List[str]]] = None,
    last_modified: Optional[str] = None,
    time: Optional[Union[str, List[str]]] = None,
    limit: Optional[int] = None,
    convert_type: bool = True,
) -> Tuple[pd.DataFrame, BaseMetadata]:
    """
Continuous data provide instantaneous water conditions.
This is an early version of the continuous endpoint that is feature-complete
and is being made available for limited use. Geometries are not included
with the continuous endpoint. If the "time" input is left blank, the service
will return the most recent year of measurements. Users may request no more
than three years of data with each function call.
Continuous data are collected at a high frequency, typically 15-minute
intervals. Depending on the specific monitoring location, the data may be
transmitted automatically via telemetry and be available on WDFN within
minutes of collection, while other times the delivery of data may be delayed
if the monitoring location does not have the capacity to automatically
transmit data. Continuous data are described by parameter name and
parameter code (pcode). These data might also be referred to as
"instantaneous values" or "IV".
Parameters
----------
monitoring_location_id : string or list of strings, optional
A unique identifier representing a single monitoring location. This
corresponds to the id field in the monitoring-locations endpoint.
Monitoring location IDs are created by combining the agency code of
the agency responsible for the monitoring location (e.g. USGS) with
the ID number of the monitoring location (e.g. 02238500), separated
by a hyphen (e.g. USGS-02238500).
parameter_code : string or list of strings, optional
Parameter codes are 5-digit codes used to identify the constituent
measured and the units of measure. A complete list of parameter
codes and associated groupings can be found at
https://help.waterdata.usgs.gov/codes-and-parameters/parameters.
statistic_id : string or list of strings, optional
A code corresponding to the statistic an observation represents.
Continuous data are nearly always associated with statistic id
00011. Using a different code (such as 00003 for mean) will
typically return no results. A complete list of codes and their
descriptions can be found at
https://help.waterdata.usgs.gov/code/stat_cd_nm_query?stat_nm_cd=%25&fmt=html.
properties : string or list of strings, optional
A list of requested columns to be returned from the query.
Available options are: geometry, id, time_series_id,
monitoring_location_id, parameter_code, statistic_id, time, value,
unit_of_measure, approval_status, qualifier, last_modified
time_series_id : string or list of strings, optional
A unique identifier representing a single time series. This
corresponds to the id field in the time-series-metadata endpoint.
continuous_id : string or list of strings, optional
A universally unique identifier (UUID) representing a single version of
a record. It is not stable over time. Every time the record is refreshed
in our database (which may happen as part of normal operations and does
not imply any change to the data itself) a new ID will be generated. To
uniquely identify a single observation over time, compare the time and
time_series_id fields; each time series will only have a single
observation at a given time.
approval_status : string or list of strings, optional
Some of the data that you have obtained from this U.S. Geological Survey
database may not have received Director's approval. Any such data values
are qualified as provisional and are subject to revision. Provisional
data are released on the condition that neither the USGS nor the United
States Government may be held liable for any damages resulting from its
use. This field reflects the approval status of each record, and is either
"Approved", meaning processing review has been completed and the data are
approved for publication, or "Provisional" and subject to revision. For
more information about provisional data, go to:
https://waterdata.usgs.gov/provisional-data-statement/.
unit_of_measure : string or list of strings, optional
A human-readable description of the units of measurement associated
with an observation.
qualifier : string or list of strings, optional
This field indicates any qualifiers associated with an observation, for
instance if a sensor may have been impacted by ice or if values were
estimated.
value : string or list of strings, optional
The value of the observation. Values are transmitted as strings in
the JSON response format in order to preserve precision.
last_modified : string, optional
The last time a record was refreshed in our database. This may happen
due to regular operational processes and does not necessarily indicate
that anything about the measurement has changed. You can query this field
using date-times or intervals, adhering to RFC 3339, or using ISO 8601
duration objects. Intervals may be bounded or half-bounded (double-dots
at start or end).
Examples:
* A date-time: "2018-02-12T23:20:50Z"
* A bounded interval: "2018-02-12T00:00:00Z/2018-03-18T12:31:12Z"
* Half-bounded intervals: "2018-02-12T00:00:00Z/.." or "../2018-03-18T12:31:12Z"
* Duration objects: "P1M" for data from the past month or "PT36H" for the last 36 hours
Only features that have a last_modified that intersects the value of
datetime are selected.
time : string, optional
The date an observation represents. You can query this field using
date-times or intervals, adhering to RFC 3339, or using ISO 8601
duration objects. Intervals may be bounded or half-bounded (double-dots
at start or end). Only features that have a time that intersects the
value of datetime are selected. If a feature has multiple temporal
properties, it is the decision of the server whether only a single
temporal property is used to determine the extent or all relevant
temporal properties.
Examples:
* A date-time: "2018-02-12T23:20:50Z"
* A bounded interval: "2018-02-12T00:00:00Z/2018-03-18T12:31:12Z"
* Half-bounded intervals: "2018-02-12T00:00:00Z/.." or "../2018-03-18T12:31:12Z"
* Duration objects: "P1M" for data from the past month or "PT36H" for the last 36 hours
limit : int, optional
The optional limit parameter is used to control the subset of the
selected features that should be returned in each page. The maximum
allowable limit is 10000. It may be beneficial to set this number lower
if your internet connection is spotty. The default (None) will set the
limit to the maximum allowable limit for the service.
convert_type : boolean, optional
If True, the function will convert date fields to dates and the
qualifier field to strings.
Returns
-------
df : ``pandas.DataFrame`` or ``geopandas.GeoDataFrame``
Formatted data returned from the API query.
md : :obj:`dataretrieval.utils.BaseMetadata`
A custom metadata object.
Examples
--------
.. code::
>>> # Get instantaneous gage height data from a
>>> # single site from a single year
>>> df, md = dataretrieval.waterdata.get_continuous(
... monitoring_location_id="USGS-02238500",
... parameter_code="00065",
... time="2021-01-01T00:00:00Z/2022-01-01T00:00:00Z",
... )
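Because each call may request no more than three years of data, a longer
period has to be split across several calls. The block below is a rough
sketch of one way to generate the ``time`` intervals. It is not part of this
module: ``three_year_windows`` is a hypothetical helper, the datetimes are
assumed to be in UTC, and treating three calendar years as the limit is an
assumption about how the service measures it. Adjacent windows share an
endpoint, so duplicated rows may need to be dropped by the caller.

```python
from datetime import datetime, timezone
from typing import Iterator


def three_year_windows(start: datetime, end: datetime) -> Iterator[str]:
    # Hypothetical helper: yield "start/end" interval strings, each spanning
    # at most three calendar years, that together cover [start, end].
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    cursor = start
    while cursor < end:
        try:
            stop = cursor.replace(year=cursor.year + 3)
        except ValueError:
            # Feb 29 has no counterpart three years later; fall back to Feb 28.
            stop = cursor.replace(year=cursor.year + 3, day=28)
        stop = min(stop, end)
        yield f"{cursor.strftime(fmt)}/{stop.strftime(fmt)}"
        cursor = stop


windows = list(three_year_windows(
    datetime(2015, 1, 1, tzinfo=timezone.utc),
    datetime(2022, 1, 1, tzinfo=timezone.utc),
))
for window in windows:
    print(window)
```

Each string could then be passed as ``time`` in its own call, with the
resulting data frames concatenated afterwards.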
"""
    service = "continuous"
    output_id = "continuous_id"

    # Build argument dictionary, omitting None values
    args = {
        k: v
        for k, v in locals().items()
        if k not in {"service", "output_id"} and v is not None
    }

    return get_ogc_data(args, output_id, service)
def get_monitoring_locations(
    monitoring_location_id: Optional[List[str]] = None,
    agency_code: Optional[List[str]] = None,
    agency_name: Optional[List[str]] = None,
    monitoring_location_number: Optional[List[str]] = None,
    monitoring_location_name: Optional[List[str]] = None,
    district_code: Optional[List[str]] = None,
    country_code: Optional[List[str]] = None,
    country_name: Optional[List[str]] = None,
    state_code: Optional[List[str]] = None,
    state_name: Optional[List[str]] = None,
    county_code: Optional[List[str]] = None,
    county_name: Optional[List[str]] = None,
    minor_civil_division_code: Optional[List[str]] = None,
    site_type_code: Optional[List[str]] = None,
    site_type: Optional[List[str]] = None,
    hydrologic_unit_code: Optional[List[str]] = None,
    basin_code: Optional[List[str]] = None,
    altitude: Optional[List[str]] = None,
    altitude_accuracy: Optional[List[str]] = None,
    altitude_method_code: Optional[List[str]] = None,
    altitude_method_name: Optional[List[str]] = None,
    vertical_datum: Optional[List[str]] = None,
    vertical_datum_name: Optional[List[str]] = None,
    horizontal_positional_accuracy_code: Optional[List[str]] = None,
    horizontal_positional_accuracy: Optional[List[str]] = None,
    horizontal_position_method_code: Optional[List[str]] = None,
    horizontal_position_method_name: Optional[List[str]] = None,
    original_horizontal_datum: Optional[List[str]] = None,
    original_horizontal_datum_name: Optional[List[str]] = None,
    drainage_area: Optional[List[str]] = None,
    contributing_drainage_area: Optional[List[str]] = None,
    time_zone_abbreviation: Optional[List[str]] = None,
    uses_daylight_savings: Optional[List[str]] = None,
    construction_date: Optional[List[str]] = None,
    aquifer_code: Optional[List[str]] = None,
    national_aquifer_code: Optional[List[str]] = None,
    aquifer_type_code: Optional[List[str]] = None,
    well_constructed_depth: Optional[List[str]] = None,
    hole_constructed_depth: Optional[List[str]] = None,
    depth_source_code: Optional[List[str]] = None,
    properties: Optional[List[str]] = None,
    skip_geometry: Optional[bool] = None,
    time: Optional[Union[str, List[str]]] = None,
    bbox: Optional[List[float]] = None,
    limit: Optional[int] = None,
    convert_type: bool = True,
) -> Tuple[pd.DataFrame, BaseMetadata]:
    """Location information is basic information about the monitoring location
including the name, identifier, agency responsible for data collection, and
the date the location was established. It also includes information about
the type of location, such as stream, lake, or groundwater, and geographic
information about the location, such as state, county, latitude and
longitude, and hydrologic unit code (HUC).
Parameters
----------
monitoring_location_id : string or list of strings, optional
A unique identifier representing a single monitoring location. This
corresponds to the id field in the monitoring-locations endpoint.
Monitoring location IDs are created by combining the agency code of
the agency responsible for the monitoring location (e.g. USGS) with
the ID number of the monitoring location (e.g. 02238500), separated
by a hyphen (e.g. USGS-02238500).
agency_code : string or list of strings, optional
The agency that is reporting the data. Agency codes are fixed values
assigned by the National Water Information System (NWIS). A list of
agency codes is available at:
https://help.waterdata.usgs.gov/code/agency_cd_query?fmt=html.
agency_name : string or list of strings, optional
The name of the agency that is reporting the data.
monitoring_location_number : string or list of strings, optional
Each monitoring location in the USGS database has a unique 8- to
15-digit identification number. Monitoring location numbers are
assigned based on this logic:
https://help.waterdata.usgs.gov/faq/sites/do-station-numbers-have-any-particular-meaning.
monitoring_location_name : string or list of strings, optional
This is the official name of the monitoring location in the database.
For well information this can be a district-assigned local number.
district_code : string or list of strings, optional
The Water Science Centers (WSCs) across the United States use the FIPS
state code as the district code. In some cases, monitoring locations and
samples may be managed by a water science center that is adjacent to the
state in which the monitoring location actually resides. For example, a
monitoring location may have a district code of 30 which translates to
Montana, but the state code could be 56 for Wyoming because that is where
the monitoring location actually is located.
country_code : string or list of strings, optional
The code for the country in which the monitoring location is located.
country_name : string or list of strings, optional
The name of the country in which the monitoring location is located.
state_code : string or list of strings, optional
State code. A two-digit ANSI code (formerly FIPS code) as defined by
the American National Standards Institute, to define States and
equivalents. A three-digit ANSI code is used to define counties and
county equivalents. A `lookup table
<https://www.census.gov/library/reference/code-lists/ansi.html#states>`_
is available. The only countries with
political subdivisions other than the US are Mexico and Canada. The Mexican
states have US state codes ranging from 81-86 and Canadian provinces have
state codes ranging from 90-98.
state_name : string or list of strings, optional
The name of the state or state equivalent in which the monitoring location
is located.
county_code : string or list of strings, optional
The code for the county or county equivalent (parish, borough, etc.) in which
the monitoring location is located. A `list of codes
<https://help.waterdata.usgs.gov/code/county_query?fmt=html>`_ is available.
county_name : string or list of strings, optional
The name of the county or county equivalent (parish, borough, etc.) in which
the monitoring location is located. A `list of codes
<https://help.waterdata.usgs.gov/code/county_query?fmt=html>`_ is available.
minor_civil_division_code : string or list of strings, optional
Codes for primary governmental or administrative divisions of the county or
county equivalent in which the monitoring location is located.
site_type_code : string or list of strings, optional
A code describing the hydrologic setting of the monitoring location. A `list of
codes <https://help.waterdata.usgs.gov/code/site_tp_query?fmt=html>`_ is available.
Example: "US:15:001" (United States: Hawaii, Hawaii County)
site_type : string or list of strings, optional
A description of the hydrologic setting of the monitoring location. A `list of
codes <https://help.waterdata.usgs.gov/code/site_tp_query?fmt=html>`_ is available.
hydrologic_unit_code : string or list of strings, optional
The United States is divided and sub-divided into successively smaller
hydrologic units which are classified into four levels: regions,
sub-regions, accounting units, and cataloging units. The hydrologic
units are arranged within each other, from the smallest (cataloging
units) to the largest (regions). Each hydrologic unit is identified by a
unique hydrologic unit code (HUC) consisting of two to eight digits
based on the four levels of classification in the hydrologic unit
system.
basin_code : string or list of strings, optional
The Basin Code or "drainage basin code" is a two-digit code that further
subdivides the 8-digit hydrologic-unit code. The drainage basin code is
defined by the USGS State Office where the monitoring location is
located.
altitude : string or list of strings, optional
Altitude of the monitoring location referenced to the specified Vertical
Datum.
altitude_accuracy : string or list of strings, optional
Accuracy of the altitude, in feet. An accuracy of +/- 0.1 foot would be
entered as “.1”. Many altitudes are interpolated from the contours on
topographic maps; accuracies determined in this way are generally
entered as one-half of the contour interval.
altitude_method_code : string or list of strings, optional
Codes representing the method used to measure altitude. A `list of
codes <https://help.waterdata.usgs.gov/code/alt_meth_cd_query?fmt=html>`_ is available.
altitude_method_name : string or list of strings, optional
The name of the method used to measure altitude. A `list of
codes <https://help.waterdata.usgs.gov/code/alt_meth_cd_query?fmt=html>`_ is available.
vertical_datum : string or list of strings, optional
The datum used to determine altitude and vertical position at the
monitoring location. A `list of
codes <https://help.waterdata.usgs.gov/code/alt_datum_cd_query?fmt=html>`_ is available.
vertical_datum_name : string or list of strings, optional
The name of the datum used to determine altitude and vertical position at the
monitoring location. A `list of
codes <https://help.waterdata.usgs.gov/code/alt_datum_cd_query?fmt=html>`_ is available.
horizontal_positional_accuracy_code : string or list of strings, optional
Indicates the accuracy of the latitude and longitude values. A `list of
codes <https://help.waterdata.usgs.gov/code/coord_acy_cd_query?fmt=html>`_ is available.
horizontal_positional_accuracy : string or list of strings, optional
Indicates the accuracy of the latitude and longitude values. A `list of
codes <https://help.waterdata.usgs.gov/code/coord_acy_cd_query?fmt=html>`_ is available.
horizontal_position_method_code : string or list of strings, optional
Indicates the method used to determine latitude and longitude values. A `list of
codes <https://help.waterdata.usgs.gov/code/coord_meth_cd_query?fmt=html>`_ is available.
horizontal_position_method_name : string or list of strings, optional
Indicates the method used to determine latitude and longitude values. A `list of
codes <https://help.waterdata.usgs.gov/code/coord_meth_cd_query?fmt=html>`_ is available.
original_horizontal_datum : string or list of strings, optional
Coordinates are published in EPSG:4326 / WGS84 / World Geodetic System
1984. This field indicates the original datum used to determine
coordinates before they were converted. A `list of
codes <https://help.waterdata.usgs.gov/code/coord_datum_cd_query?fmt=html>`_ is available.
original_horizontal_datum_name : string or list of strings, optional
Coordinates are published in EPSG:4326 / WGS84 / World Geodetic System
1984. This field indicates the original datum used to determine coordinates
before they were converted. A `list of
codes <https://help.waterdata.usgs.gov/code/coord_datum_cd_query?fmt=html>`_ is available.
drainage_area : string or list of strings, optional
The area enclosed by a topographic divide from which direct surface runoff
from precipitation normally drains by gravity into the stream above that
point.
contributing_drainage_area : string or list of strings, optional
The contributing drainage area of a lake, stream, wetland, or estuary
monitoring location, in square miles. This item should be present only
if the contributing area is different from the total drainage area. This
situation can occur when part of the drainage area consists of very
porous soil or depressions that either allow all runoff to enter the
groundwater or trap the water in ponds so that rainfall does not
contribute to runoff. A transbasin diversion can also affect the total
drainage area.
time_zone_abbreviation : string or list of strings, optional
A short code describing the time zone used by a monitoring location.
uses_daylight_savings : string or list of strings, optional
A flag indicating whether a monitoring location observes daylight saving time.
construction_date : string or list of strings, optional
Date the well was completed.
aquifer_code : string or list of strings, optional
Local aquifers in the USGS water resources database are identified by a
geohydrologic unit code (a three-digit number related to the age of the
formation, followed by a 4 or 5 character abbreviation for the geologic
unit or aquifer name). Additional information is available
`at this link <https://help.waterdata.usgs.gov/faq/groundwater/local-aquifer-description>`_.
national_aquifer_code : string or list of strings, optional
National aquifers are the principal aquifers or aquifer systems in the United
States, defined as regionally extensive aquifers or aquifer systems that have
the potential to be used as a source of potable water. Not all groundwater
monitoring locations can be associated with a National Aquifer. Such
monitoring locations will not be retrieved using this search criterion. A `list
of National aquifer codes and names <https://help.waterdata.usgs.gov/code/nat_aqfr_query?fmt=html>`_
is available.
aquifer_type_code : string or list of strings, optional
Groundwater occurs in aquifers under two different conditions. Where water
only partly fills an aquifer, the upper surface is free to rise and decline.
These aquifers are referred to as unconfined (or water-table) aquifers. Where
water completely fills an aquifer that is overlain by a confining bed, the
aquifer is referred to as a confined (or artesian) aquifer. When a confined
aquifer is penetrated by a well, the water level in the well will rise above
the top of the aquifer (but not necessarily above land surface). Additional
information is available `at this link <https://help.waterdata.usgs.gov/faq/groundwater/local-aquifer-description>`_.
well_constructed_depth : string or list of strings, optional
The depth of the finished well, in feet below land surface datum. Note: Not
all groundwater monitoring locations have information on Well Depth. Such
monitoring locations will not be retrieved using this search criterion.
hole_constructed_depth : string or list of strings, optional
The total depth to which the hole is drilled, in feet below land surface datum.
Note: Not all groundwater monitoring locations have information on Hole Depth.
Such monitoring locations will not be retrieved using this search criterion.
depth_source_code : string or list of strings, optional
A code indicating the source of water-level data. A `list of
codes <https://help.waterdata.usgs.gov/code/water_level_src_cd_query?fmt=html>`_
is available.
properties : string or list of strings, optional
A list of requested columns to be returned from the query. Available
options are: geometry, id, agency_code, agency_name,
monitoring_location_number, monitoring_location_name, district_code,
country_code, country_name, state_code, state_name, county_code,
county_name, minor_civil_division_code, site_type_code, site_type,
hydrologic_unit_code, basin_code, altitude, altitude_accuracy,
altitude_method_code, altitude_method_name, vertical_datum,
vertical_datum_name, horizontal_positional_accuracy_code,
horizontal_positional_accuracy, horizontal_position_method_code,
horizontal_position_method_name, original_horizontal_datum,
original_horizontal_datum_name, drainage_area,
contributing_drainage_area, time_zone_abbreviation,
uses_daylight_savings, construction_date, aquifer_code,
national_aquifer_code, aquifer_type_code, well_constructed_depth,
hole_constructed_depth, depth_source_code.
bbox : list of numbers, optional
Only features that have a geometry that intersects the bounding box are
selected. The bounding box is provided as four or six numbers,
depending on whether the coordinate reference system includes a vertical
axis (height or depth). Coordinates are assumed to be in EPSG:4326. The
expected format is a list of numbers structured as [xmin, ymin, xmax, ymax],
that is, [western-most longitude, southern-most latitude, eastern-most
longitude, northern-most latitude].
limit : numeric, optional
The optional limit parameter is used to control the subset of the
selected features that should be returned in each page. The maximum
allowable limit is 50000. It may be beneficial to set this number lower
if your internet connection is spotty. The default (None) will set the
limit to the maximum allowable limit for the service.
skip_geometry : boolean, optional
This option can be used to skip response geometries for each feature.
The returning object will be a data frame with no spatial information.
Note that the USGS Water Data APIs use camelCase "skipGeometry" in
CQL2 queries.
convert_type : boolean, optional
If True, converts columns to appropriate types.
Returns
-------
df : ``pandas.DataFrame`` or ``geopandas.GeoDataFrame``
Formatted data returned from the API query.
md: :obj:`dataretrieval.utils.Metadata`
A custom metadata object
Examples
--------
.. code::
>>> # Get monitoring locations within a bounding box
>>> # and leave out geometry
>>> df, md = dataretrieval.waterdata.get_monitoring_locations(
... bbox=[-90.2, 42.6, -88.7, 43.2], skip_geometry=True
... )
>>> # Get monitoring location info for specific sites
>>> # and only specific properties
>>> df, md = dataretrieval.waterdata.get_monitoring_locations(
... monitoring_location_id=["USGS-05114000", "USGS-09423350"],
... properties=["monitoring_location_id", "state_name", "country_name"],
... )
"""
service = "monitoring-locations"
output_id = "monitoring_location_id"
# Build argument dictionary, omitting None values
args = {
k: v
for k, v in locals().items()
if k not in {"service", "output_id"} and v is not None
}
return get_ogc_data(args, output_id, service)
def get_time_series_metadata(
monitoring_location_id: Optional[Union[str, List[str]]] = None,
parameter_code: Optional[Union[str, List[str]]] = None,
parameter_name: Optional[Union[str, List[str]]] = None,
properties: Optional[Union[str, List[str]]] = None,
statistic_id: Optional[Union[str, List[str]]] = None,
hydrologic_unit_code: Optional[Union[str, List[str]]] = None,
state_name: Optional[Union[str, List[str]]] = None,
last_modified: Optional[Union[str, List[str]]] = None,
begin: Optional[Union[str, List[str]]] = None,
end: Optional[Union[str, List[str]]] = None,
begin_utc: Optional[Union[str, List[str]]] = None,
end_utc: Optional[Union[str, List[str]]] = None,
unit_of_measure: Optional[Union[str, List[str]]] = None,
computation_period_identifier: Optional[Union[str, List[str]]] = None,
computation_identifier: Optional[Union[str, List[str]]] = None,
thresholds: Optional[int] = None,
sublocation_identifier: Optional[Union[str, List[str]]] = None,
primary: Optional[Union[str, List[str]]] = None,
parent_time_series_id: Optional[Union[str, List[str]]] = None,
time_series_id: Optional[Union[str, List[str]]] = None,
web_description: Optional[Union[str, List[str]]] = None,
skip_geometry: Optional[bool] = None,
time: Optional[Union[str, List[str]]] = None,
bbox: Optional[List[float]] = None,
limit: Optional[int] = None,
convert_type: bool = True,
) -> Tuple[pd.DataFrame, BaseMetadata]:
"""Daily data and continuous measurements are grouped into time series,
which represent a collection of observations of a single parameter,
potentially aggregated using a standard statistic, at a single monitoring
location. This endpoint provides metadata about those time series,
including their operational thresholds, units of measurement, and when
the earliest and most recent observations in a time series occurred.
Parameters
----------
monitoring_location_id : string or list of strings, optional
A unique identifier representing a single monitoring location. This
corresponds to the id field in the monitoring-locations endpoint.
Monitoring location IDs are created by combining the agency code of
the agency responsible for the monitoring location (e.g. USGS) with
the ID number of the monitoring location (e.g. 02238500), separated
by a hyphen (e.g. USGS-02238500).
parameter_code : string or list of strings, optional
Parameter codes are 5-digit codes used to identify the constituent
measured and the units of measure. A complete list of parameter
codes and associated groupings can be found at
https://help.waterdata.usgs.gov/codes-and-parameters/parameters.
parameter_name : string or list of strings, optional
A human-understandable name corresponding to parameter_code.
properties : string or list of strings, optional
A list of requested columns to be returned from the query.
Available options are: geometry, id, time_series_id,
monitoring_location_id, parameter_code, statistic_id, time, value,
unit_of_measure, approval_status, qualifier, last_modified.
statistic_id : string or list of strings, optional
A code corresponding to the statistic an observation represents.
Example codes include 00001 (max), 00002 (min), and 00003 (mean).
A complete list of codes and their descriptions can be found at
https://help.waterdata.usgs.gov/code/stat_cd_nm_query?stat_nm_cd=%25&fmt=html.
hydrologic_unit_code : string or list of strings, optional
The United States is divided and sub-divided into successively smaller
hydrologic units which are classified into four levels: regions,
sub-regions, accounting units, and cataloging units. The hydrologic
units are arranged within each other, from the smallest (cataloging units)
to the largest (regions). Each hydrologic unit is identified by a unique
hydrologic unit code (HUC) consisting of two to eight digits based on the
four levels of classification in the hydrologic unit system.
state_name : string or list of strings, optional
The name of the state or state equivalent in which the monitoring location
is located.
last_modified : string, optional
The last time a record was refreshed in our database. This may happen
due to regular operational processes and does not necessarily indicate that
anything about the measurement has changed. You can query this field
using date-times or intervals, adhering to RFC 3339, or using ISO 8601
duration objects. Intervals may be bounded or half-bounded (double-dots
at start or end). Only features that have a last_modified that
intersects the value of datetime are selected.
Examples:
* A date-time: "2018-02-12T23:20:50Z"
* A bounded interval: "2018-02-12T00:00:00Z/2018-03-18T12:31:12Z"
* Half-bounded intervals: "2018-02-12T00:00:00Z/.." or
"../2018-03-18T12:31:12Z"
* Duration objects: "P1M" for data from the past month or "PT36H"
for the last 36 hours
begin : string or list of strings, optional
This field contains the same information as "begin_utc", but in the
local time of the monitoring location. It is retained for backwards
compatibility, but will be removed in V1 of these APIs.
end : string or list of strings, optional
This field contains the same information as "end_utc", but in the
local time of the monitoring location. It is retained for backwards
compatibility, but will be removed in V1 of these APIs.
begin_utc : string or list of strings, optional
The datetime of the earliest observation in the time series. Together
with end, this field represents the period of record of a time series.
Note that some time series may have large gaps in their collection
record. This field is currently in the local time of the monitoring
location. We intend to update this in version v0 to use UTC with a time
zone. You can query this field using date-times or intervals, adhering
to RFC 3339, or using ISO 8601 duration objects. Intervals may be
bounded or half-bounded (double-dots at start or end). Only features
that have a begin that intersects the value of datetime are selected.
Examples:
* A date-time: "2018-02-12T23:20:50Z"
* A bounded interval: "2018-02-12T00:00:00Z/2018-03-18T12:31:12Z"
* Half-bounded intervals: "2018-02-12T00:00:00Z/.." or "../2018-03-18T12:31:12Z"
* Duration objects: "P1M" for data from the past month or "PT36H" for the last 36 hours
end_utc : string or list of strings, optional
The datetime of the most recent observation in the time series. Data returned by
this endpoint updates at most once per day, and potentially less frequently than
that, and as such there may be more recent observations within a time series
than the time series end value reflects. Together with begin, this field
represents the period of record of a time series. It is additionally used to
determine whether a time series is "active". We intend to update this in
version v0 to use UTC with a time zone. You can query this field using date-times
or intervals, adhering to RFC 3339, or using ISO 8601 duration objects. Intervals
may be bounded or half-bounded (double-dots at start or end). Only
features that have an end that intersects the value of datetime are
selected.
Examples:
* A date-time: "2018-02-12T23:20:50Z"
* A bounded interval: "2018-02-12T00:00:00Z/2018-03-18T12:31:12Z"
* Half-bounded intervals: "2018-02-12T00:00:00Z/.." or "../2018-03-18T12:31:12Z"
* Duration objects: "P1M" for data from the past month or "PT36H" for the last 36 hours
unit_of_measure : string or list of strings, optional
A human-readable description of the units of measurement associated
with an observation.
computation_period_identifier : string or list of strings, optional
Indicates the period of data used for any statistical computations.
computation_identifier : string or list of strings, optional
Indicates whether the data from this time series represent a specific
statistical computation.
thresholds : numeric or list of numbers, optional
Thresholds represent known numeric limits for a time series, for example
the historic maximum value for a parameter or a level below which a
sensor is non-operative. These thresholds are sometimes used to
automatically determine if an observation is erroneous due to sensor
error, and therefore shouldn't be included in the time series.
sublocation_identifier : string or list of strings, optional
primary : string or list of strings, optional
parent_time_series_id : string or list of strings, optional
time_series_id : string or list of strings, optional
A unique identifier representing a single time series. This
corresponds to the id field in the time-series-metadata endpoint.
web_description : string or list of strings, optional
A description of what this time series represents, as used by WDFN and
other USGS data dissemination products.
skip_geometry : boolean, optional
This option can be used to skip response geometries for each feature.
The returning object will be a data frame with no spatial information.
Note that the USGS Water Data APIs use camelCase "skipGeometry" in
CQL2 queries.
bbox : list of numbers, optional
Only features that have a geometry that intersects the bounding box are
selected. The bounding box is provided as four or six numbers,
depending on whether the coordinate reference system includes a vertical
axis (height or depth). Coordinates are assumed to be in EPSG:4326. The
expected format is a list of numbers structured as [xmin, ymin, xmax, ymax],
that is, [western-most longitude, southern-most latitude, eastern-most
longitude, northern-most latitude].
limit : numeric, optional
The optional limit parameter is used to control the subset of the
selected features that should be returned in each page. The maximum
allowable limit is 50000. It may be beneficial to set this number lower
if your internet connection is spotty. The default (None) will set the
limit to the maximum allowable limit for the service.
convert_type : boolean, optional
If True, converts columns to appropriate types.
Returns
-------
df : ``pandas.DataFrame`` or ``geopandas.GeoDataFrame``
Formatted data returned from the API query.
md: :obj:`dataretrieval.utils.Metadata`
A custom metadata object
Examples
--------
.. code::
>>> # Get time series metadata for a single site
>>> df, md = dataretrieval.waterdata.get_time_series_metadata(
... monitoring_location_id="USGS-02238500"
... )
>>> # Get timeseries metadata information from multiple sites
>>> # that begin after January 1, 1990.
>>> df, md = dataretrieval.waterdata.get_time_series_metadata(
... monitoring_location_id=["USGS-05114000", "USGS-09423350"],
... begin="1990-01-01/.."
... )
"""
service = "time-series-metadata"
output_id = "time_series_id"
# Build argument dictionary, omitting None values
args = {
k: v
for k, v in locals().items()
if k not in {"service", "output_id"} and v is not None
}
return get_ogc_data(args, output_id, service)
def get_latest_continuous(
monitoring_location_id: Optional[Union[str, List[str]]] = None,
parameter_code: Optional[Union[str, List[str]]] = None,
statistic_id: Optional[Union[str, List[str]]] = None,
properties: Optional[Union[str, List[str]]] = None,
time_series_id: Optional[Union[str, List[str]]] = None,
latest_continuous_id: Optional[Union[str, List[str]]] = None,
approval_status: Optional[Union[str, List[str]]] = None,
unit_of_measure: Optional[Union[str, List[str]]] = None,
qualifier: Optional[Union[str, List[str]]] = None,
value: Optional[int] = None,
last_modified: Optional[Union[str, List[str]]] = None,
skip_geometry: Optional[bool] = None,
time: Optional[Union[str, List[str]]] = None,
bbox: Optional[List[float]] = None,
limit: Optional[int] = None,
convert_type: bool = True,
) -> Tuple[pd.DataFrame, BaseMetadata]:
"""This endpoint provides the most recent observation for each time series
of continuous data. Continuous data are collected via automated sensors
installed at a monitoring location. They are collected at a high frequency
and often at a fixed 15-minute interval. Depending on the specific monitoring
location, the data may be transmitted automatically via telemetry and be
available on WDFN within minutes of collection, while other times the delivery
of data may be delayed if the monitoring location does not have the capacity to
automatically transmit data. Continuous data are described by parameter name
and parameter code. These data might also be referred to as "instantaneous
values" or "IV".
Parameters
----------
monitoring_location_id : string or list of strings, optional
A unique identifier representing a single monitoring location. This
corresponds to the id field in the monitoring-locations endpoint.
Monitoring location IDs are created by combining the agency code of the
agency responsible for the monitoring location (e.g. USGS) with the ID
number of the monitoring location (e.g. 02238500), separated by a hyphen
(e.g. USGS-02238500).
parameter_code : string or list of strings, optional
Parameter codes are 5-digit codes used to identify the constituent
measured and the units of measure. A complete list of parameter codes
and associated groupings can be found at
https://help.waterdata.usgs.gov/codes-and-parameters/parameters.
statistic_id : string or list of strings, optional
A code corresponding to the statistic an observation represents.
Example codes include 00001 (max), 00002 (min), and 00003 (mean).
A complete list of codes and their descriptions can be found at
https://help.waterdata.usgs.gov/code/stat_cd_nm_query?stat_nm_cd=%25&fmt=html.
properties : string or list of strings, optional
A list of requested columns to be returned from the query. Available
options are: geometry, id, time_series_id, monitoring_location_id,
parameter_code, statistic_id, time, value, unit_of_measure,
approval_status, qualifier, last_modified
time_series_id : string or list of strings, optional
A unique identifier representing a single time series. This
corresponds to the id field in the time-series-metadata endpoint.
latest_continuous_id : string or list of strings, optional
A universally unique identifier (UUID) representing a single version of
a record. It is not stable over time. Every time the record is refreshed
in our database (which may happen as part of normal operations and does
not imply any change to the data itself) a new ID will be generated. To
uniquely identify a single observation over time, compare the time and
time_series_id fields; each time series will only have a single
observation at a given time.
approval_status : string or list of strings, optional
Some of the data that you have obtained from this U.S. Geological Survey
database may not have received Director's approval. Any such data values
are qualified as provisional and are subject to revision. Provisional
data are released on the condition that neither the USGS nor the United
States Government may be held liable for any damages resulting from its
use. This field reflects the approval status of each record, and is either
"Approved", meaning processing review has been completed and the data is
approved for publication, or "Provisional" and subject to revision. For
more information about provisional data, go to:
https://waterdata.usgs.gov/provisional-data-statement/.
unit_of_measure : string or list of strings, optional
A human-readable description of the units of measurement associated
with an observation.
qualifier : string or list of strings, optional
This field indicates any qualifiers associated with an observation, for
instance if a sensor may have been impacted by ice or if values were
estimated.
value : string or list of strings, optional
The value of the observation. Values are transmitted as strings in
the JSON response format in order to preserve precision.
last_modified : string, optional
The last time a record was refreshed in our database. This may happen
due to regular operational processes and does not necessarily indicate that
anything about the measurement has changed. You can query this field
using date-times or intervals, adhering to RFC 3339, or using ISO 8601
duration objects. Intervals may be bounded or half-bounded (double-dots
at start or end). Only features that have a last_modified that
intersects the value of datetime are selected.
Examples:
* A date-time: "2018-02-12T23:20:50Z"
* A bounded interval: "2018-02-12T00:00:00Z/2018-03-18T12:31:12Z"
* Half-bounded intervals: "2018-02-12T00:00:00Z/.." or "../2018-03-18T12:31:12Z"