Terrestrial microbiome biological/environmental metadata

  Minimal biological/environmental metadata for terrestrial microbiome

Category MIXS ID Metadata field Brief description Definition Example of Annotation Source
Sample metadata [“ENA Marine Microalgae Checklist; Checklist: ERC000043”] collected_by Who collected the sample Name of person or institute that collected the sample TU Braunschweig “ENA Marine Microalgae Checklist; Checklist: ERC000043”
[MIXS:0000001] samp_size Amount or size of the collected sample The total amount or size (volume (ml), mass (g) or area (m²)) of sample collected 500 g “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)”
[MIXS:0000011] collection_date Date at which the sample was collected The time of sampling, either as an instance (single point in time) or interval. In case no exact time is available, the date/time can be right truncated. [ISO8601] compliant 2013-03-25 “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)”, “GSC MIxS Water; ENA Checklist: ERC000024”, “ENA Marine Microalgae Checklist; Checklist: ERC000043”, “ENA Tara Oceans; Checklist: ERC000030”, “ENA Micro B3; Checklist: ERC000027”
[MIXS:0000016] samp_mat_process Processing prior to analysis A brief description of any processing applied to the sample during or after retrieving the sample from environment, or a link to the relevant protocol(s) performed sieved < 2 mm, homogenized
[MIXS:0000026] source_mat_id identifier(s) of source material A unique identifier assigned to a [material sample] used for extracting nucleic acids, and subsequent sequencing. The identifier can refer either to the original material collected or to any derived sub-samples. The INSDC qualifiers /specimen_voucher, /bio_material, or /culture_collection may or may not share the same value as the source_mat_id field. For instance, the /specimen_voucher qualifier and source_mat_id may both contain 'UAM:Herps:14', referring to both the specimen voucher and sampled tissue with the same identifier. However, the /culture_collection qualifier may refer to a value from an initial culture (e.g. ATCC:11775) while source_mat_id would refer to an identifier from some derived culture from which the nucleic acids were extracted (e.g. xatc123 or ark:/2154/R2) A2_OILSPILL_0001 “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)”
[MIXS:0000092] project_name project name under which sampling and sequencing was done Name of the project within which the sequencing was organized Roadside Oil Spill Soil Microbiome Study
[MIXS:0000103] organism_count Organism count per unit Total cell count of any organism (or group of organisms) per gram, volume or area of sample; include organism name, count and enumeration method (e.g. qPCR) total prokaryotes; 4.2e8 cells g⁻¹; Real Time PCR [NCIT:C51962]
[MIXS:0000110] samp_store_temp Sample storage temperature Temperature at which sample was stored, e.g. -80 °C -20 °C
[MIXS:0000113] temp Temperature of sample Temperature of the sample at the time of sampling 15 °C “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)”
[MIXS:0000116] samp_store_dur sample storage duration Duration for which the sample was stored, written in ISO 8601 format PT48H
[MIXS:0000183] salinity Salinity in the sample The total concentration of all dissolved salts in a liquid or solid sample. While salinity can be measured by a complete chemical analysis, this method is difficult and time consuming. More often, it is instead derived from the conductivity measurement. This is known as practical salinity. These derivations compare the specific conductance of the sample to a salinity standard such as seawater 0.3 practical salinity unit (PSU) “GSC MIxS: Host-associatedMIMS”
[MIXS:0000185] water_content water content Water content measurement 15 % (w/w)
[MIXS:0000204] org_matter organic matter Concentration of organic matter 4.00%
[MIXS:0000205] org_nitro organic nitrogen Concentration of organic nitrogen 0.20%
[MIXS:0000322] sieving Sieve size / design Collection design of pooled samples and/or sieve size and amount of sample sieved 2 mm mesh
[MIXS:0000325] pool_dna_extracts Were DNA extracts pooled? Indicate whether multiple DNA extractions were mixed. If the answer is yes, the number of extracts that were pooled should be given no
[MIXS:0000327] store_cond Pre-extraction storage Explain how and for how long the soil sample was stored before DNA extraction (fresh/frozen/other) on ice during transport
[MIXS:0000525] tot_carb Total carbon Total carbon content 2.00%
[MIXS:0000530] tot_nitro_content Total nitrogen Total nitrogen content of the sample 0.20%
[MIXS:0000533] tot_org_carb Total organic carbon Total organic carbon content 2.50%
[MIXS:0000555] chem_mutagen Mutagen treatment details Treatment involving use of mutagens; should include the name of mutagen, amount administered, treatment regimen including how many times the treatment was repeated, how long each treatment lasted, and the start and end time of the entire treatment; can include multiple mutagen regimens none
[MIXS:0000556] fertilizer_regm Fertilizer treatment details Information about treatment involving the use of fertilizers; should include the name of fertilizer, amount administered, treatment regimen including how many times the treatment was repeated, how long each treatment lasted, and the start and end time of the entire treatment; can include multiple fertilizer regimens none
[MIXS:0000557] fungicide_regm Fungicide treatment details Information about treatment involving use of fungicides; should include the name of fungicide, amount administered, treatment regimen including how many times the treatment was repeated, how long each treatment lasted, and the start and end time of the entire treatment; can include multiple fungicide regimens none
[MIXS:0000561] herbicide_regm Herbicide treatment details Information about treatment involving use of herbicides; should include the name of herbicide, amount administered, treatment regimen including how many times the treatment was repeated, how long each treatment lasted, and the start and end time of the entire treatment; can include multiple regimens none
[MIXS:0000568] humidity_regm Humidity treatment details Information about treatment involving an exposure to varying degree of humidity; should include amount of humidity administered, treatment regimen including how many times the treatment was repeated, how long each treatment lasted, and the start and end time of the entire treatment; can include multiple regimens none
[MIXS:0000569] light_regm Light exposure details Information about treatment(s) involving exposure to light, including both light intensity and quality natural daylight
[MIXS:0000570] mineral_nutr_regm Mineral nutrient treatments Information about treatment involving the use of mineral supplements; should include the name of mineral nutrient, amount administered, treatment regimen including how many times the treatment was repeated, how long each treatment lasted, and the start and end time of the entire treatment; can include multiple mineral nutrient regimens none
[MIXS:0000571] non_min_nutr_regm Non-mineral nutrient treatments Information about treatment involving exposure of plant to non-mineral nutrients such as oxygen, hydrogen or carbon; should include the name of non-mineral nutrient, amount administered, treatment regimen including how many times the treatment was repeated, how long each treatment lasted, and the start and end time of the entire treatment; can include multiple regimens none
[MIXS:0000573] pesticide_regm Pesticide treatment details Information about treatment involving use of insecticides; should include the name of pesticide, amount administered, treatment regimen including how many times the treatment was repeated, how long each treatment lasted, and the start and end time of the entire treatment; can include multiple pesticide regimens none
[MIXS:0000575] radiation_regm Radiation exposure details Information about treatment involving exposure of plant or a plant part to a particular radiation regimen; should include the radiation type, amount or intensity administered, treatment regimen including how many times the treatment was repeated, how long each treatment lasted, and the start and end time of the entire treatment; can include multiple radiation regimens ambient
[MIXS:0000576] rainfall_regm Rainfall treatment details Information about treatment involving an exposure to a given amount of rainfall; should include treatment regimen details such as how many times the treatment was repeated, how long each treatment lasted, and the start and end time of the entire treatment; can include multiple regimens natural rainfall
[MIXS:0000582] salt_regm Salt treatment details Information about treatment involving use of salts as supplement to liquid and soil growth media; should include the name of salt, amount administered, treatment regimen including how many times the treatment was repeated, how long each treatment lasted, and the start and end time of the entire treatment; can include multiple salt regimens none
[MIXS:0000639] agrochem_addition Agrochemical input history Addition of fertilizers, pesticides, etc. – amount and time of applications none
[MIXS:0000650] microbial_biomass Living microbial biomass The part of the organic matter in the soil that constitutes living microorganisms smaller than 5–10 µm. If you keep this, you would need to have correction factors used for conversion to the final units 0.05 mg kg⁻¹
[MIXS:0000652] heavy_metals Heavy metals present Heavy metals present in the sequenced sample and their concentrations. For multiple heavy metals and concentrations, add multiple copies of this field Pb [CHEBI:27889] 50 mg kg⁻¹| Zn [CHEBI:36560] 150 mg kg⁻¹
[MIXS:0000689] tot_phosphate Total phosphate Total amount or concentration of phosphate 25 mg kg⁻¹
[MIXS:0000752] misc_param Other measurements Any other measurement performed or parameter collected that is not listed here traffic density ~ 5 000 vehicles day⁻¹
[MIXS:0000753] oxy_stat_samp Oxygenation status Oxygenation status of sample aerobic
[MIXS:0000754] perturbation Applied perturbations Type of perturbation (e.g. chemical administration, physical disturbance) together with perturbation regimen; can include multiple perturbation types oil spill [ENVO:00002061]
[MIXS:0001001] ph Sample pH pH measurement of the sample, or liquid portion of sample, or aqueous phase of the fluid 6.8 “GSC MIxS: WaterMIMS”
[MIXS:0001038] biotic_regm Biotic factor treatments Information about treatment(s) involving use of biotic factors, such as bacteria, viruses or fungi none
[MIXS:0001107] samp_name Unique identifier of sample A local identifier or name for the material sample used for extracting nucleic acids, and subsequent sequencing. It can refer either to the original material collected or to any derived sub-samples. It can have any format, but we suggest that you make it concise, unique and consistent within your lab, and as informative as possible. INSDC requires every sample name from a single Submitter to be unique. Use of a globally unique identifier for the field source_mat_id is recommended in addition to samp_name OilSpillSoil_A2_1 “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)”
[MIXS:0001216] microb_cult_med microbiological_culture_medium_(applicable_only_if_microorganism_can_be_cultivated) Composition of processed material providing the needed nourishment for microorganisms or cells to grow in vitro. This field accepts terms listed under culture medium [OBI:0000079]. If the proper descriptor is not listed please use text to describe the culture medium minimal defined medium [MCO:0000881]|diesel-enriched Bushnell–Haas agar “MIMS: Metagenome/Environmental, Human-Associated; Version 6.0 Package”, MSI-ECWSG (Morrison et al. (2007))
[MIXS:0001320] samp_taxon_id taxonomy_identifier_of_dna_sample NCBI taxon id of the sample. May be a single taxon or mixed taxa sample. Use ‘synthetic metagenome’ for mock community/positive controls, or ‘blank sample’ for negative controls. Expected_value: [NCBI taxonomy ID] soil metagenome [NCBI:txid410658] “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)”
[] microbial_isolate Microbial isolate cultured?: Y/N Y
Site metadata [MIXS:0000009] lat_lon geographic location (latitude and longitude) The geographical origin of the sample as defined by latitude and longitude. The values should be reported in decimal degrees, limited to 8 decimal places, and in the WGS84 system 52.2929 10.5416
[MIXS:0000010] geo_loc_name geographic location (country and/or sea,region) The geographical origin of the sample as defined by the country or sea name followed by specific region name. Country or sea names should be chosen from the [INSDC country list], or the [GAZ ontology]. Terms permissible are also the geographic origin of the sample as defined by the marine region name chosen from the [Marine Regions vocabulary] Germany: Lower Saxony, Braunschweig district [GAZ:00008259] “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)”
[MIXS:0000012] env_broad_scale Broad-scale environmental context Report the major environmental system the sample or specimen came from. System(s) identifiers should provide a coarse, general environmental context of where the sampling was done. Recommended use of EnvO’s biome class: [ENVO:00000428]. If more than one term applies to the field, | should be used to separate them. temperate grassland biome [ENVO:01000193]|road [] “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)”
[MIXS:0000013] env_local_scale Local-scale environmental context Report the entity or entities which are in the sample or specimen’s local vicinity and which you believe have significant causal influences on your sample or specimen. Entry should be of a smaller environmental context than env_broad_scale. Terms, such as anatomical sites, from other [OBO Library] ontologies which interoperate with EnvO (e.g. [UBERON]) are accepted in this field. If more than one term applies to the field, | should be used to separate them. Roadside [ENVO:01000447] “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)”
[MIXS:0000014] env_medium Environmental medium Report the environmental material(s) immediately surrounding the sample or specimen at the time of sampling. Recommended use of EnvO’s subclasses of environmental material [ENVO:00010483]. Terms from other [OBO ontologies] are permissible as long as they reference mass/volume nouns (e.g. air, water, blood) and not discrete, countable entities (e.g. a tree, a leaf, a table top). If more than one term applies to the field, | should be used to separate them. contaminated soil [ENVO:00002116]|roadside soil [ENVO:00005743] “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)”, “GSC MIxS Human Associated; ENA Checklist: ERC000014”
[MIXS:0000018] depth Depth below surface The vertical distance below local surface. For sediment or soil samples depth is measured from sediment or soil surface, respectively. Depth can be reported as an interval for subsurface samples 0-10 cm “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIxS: Host-associatedMIMS”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)”
[MIXS:0000093] elev Site elevation (m) Elevation of the sampling site is its height above a fixed reference point, most commonly the mean sea level. Elevation is mainly used when referring to points on the earth's surface, while altitude is used for points above the surface, such as an aircraft in flight or a spacecraft in orbit. Origin elevation in m 78 m “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIxS: Host-associatedMIMS”, “ENA Soil; Checklist: ERC000022”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)”
[MIXS:0000100] humidity Air humidity at sampling Amount of water vapour in the air, at the time of sampling 55% RH
[MIXS:0000112] solar_irradiance Incoming solar irradiance The amount of solar energy that arrives at a specific area of a surface during a specific time interval 450 W m⁻²
[MIXS:0000312] cur_vegetation Current vegetation type Vegetation classification from one or more standard classification systems, or agricultural crop mixed grassland [ENVO:01000448]
[MIXS:0000315] previous_land_use Previous land use Previous land use and dates mixed grassland [ENVO:01000448]
[MIXS:0000319] flooding Evidence of flooding Historical and/or physical evidence of flooding none
[MIXS:0000320] extreme_event Extreme physical events Unusual physical events that may have affected microbial populations oil spill [ENVO:00002061]
[MIXS:0000332] soil_type Soil type classification Description of the soil type or classification. This field accepts terms under soil [ENVO:00001998] (http://purl.obolibrary.org/obo/ENVO_00001998). Multiple terms can be separated by pipes grassland soil [ENVO:00005750]
[MIXS:0000335] soil_texture % sand / silt / clay The relative proportion of different grain sizes of mineral particles in a soil, as described using a standard system; express as % sand (50 µm–2 mm), silt (2 µm–50 µm), and clay (< 2 µm) with textural name optional Silt [ENVO:01000016]
[MIXS:0000751] chem_administration Chemicals applied onsite List of chemical compounds administered to host or on site where sampling occurred. Can include multiple compounds separated by |. For compounds consult [chemical entities of biological interest ontology (chebi) (v 163)] e.g. fertilizer [CHEBI:33287] “GSC MIxS: Host-associatedMIMS”, “GSC MIxS: Human-associatedMIMS”
[MIXS:0000819] pres_animal_insect Animals / insects present The type and number of animals or insects present in the sampling space none
[MIXS:0000829] season Sampling season The season when sampling occurred; terms listed under ‘season’ ontology Spring [NCIT:C94731]
[MIXS:0001080] cur_land_use Current land use Present state of land use at the sample site public road [ENVO:01000780]
[MIXS:0001082] soil_horizon Soil horizon layer Specific layer in the land area which measures parallel to the soil surface and possesses physical characteristics which differ from the layers above and beneath surface soil [ENVO:02000059]
[MIXS:0001083] fao_class FAO soil classification Soil classification from the FAO World soil distribution from International Soil Reference and Information Centre (ISRIC). The list of available soil classifications can be found at https://www.isric.org/explore/world-soil-distribution Arenosol [ENVO:00002229]
[MIXS:0001086] fire Evidence / history of fire Historical and/or physical evidence of fire none
[MIXS:0001121] adjacent_environment Adjacent environmental features Description of environmental systems or features adjacent to the sampling site agricultural field [ENVO:00000114]
[MIXS:0001159] soil_cover Material covering soil Material covering the sampled soil. This field accepts terms under environmental material [ENVO:00010483] Arenosol [ENVO:00002229]
[MIXS:0001160] soil_pH Soil pH measurement pH measurement of the soil 6.8
[MIXS:0001161] rel_location Relative sampling position Location of sampled soil relative to other parts of the farm e.g. under crop plant, near irrigation ditch, from the dirt road 1 m from asphalt edge, 20 m from agricultural field
[MIXS:0001163] soil_temp Soil temperature Temperature of soil at the time of sampling 15 °C
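
Several definitions in the table state format constraints that can be checked mechanically before submission: collection_date must be ISO 8601 compliant and may be right-truncated, lat_lon is reported in decimal degrees with at most 8 decimal places, and samp_store_dur is an ISO 8601 duration. The Python below is a minimal sketch of such checks under stated assumptions. The function names are hypothetical and not part of any MIxS or ENA tooling, and the patterns cover only common forms (single dates or date-times, no intervals, no week-based durations).

```python
import re

# Hypothetical helpers for illustration; not an official MIxS/ENA validator.

# ISO 8601 date, optionally right-truncated (YYYY, YYYY-MM, YYYY-MM-DD),
# optionally followed by a time (hh:mm or hh:mm:ss) and a UTC offset.
_DATE_RE = re.compile(
    r"^\d{4}(-(0[1-9]|1[0-2])(-(0[1-9]|[12]\d|3[01])"
    r"(T([01]\d|2[0-3]):[0-5]\d(:[0-5]\d)?(Z|[+-]\d{2}:\d{2})?)?)?)?$"
)

# Two decimal-degree numbers separated by whitespace, at most 8 decimals each.
_LAT_LON_RE = re.compile(
    r"^(-?\d{1,2}(?:\.\d{1,8})?)\s+(-?\d{1,3}(?:\.\d{1,8})?)$"
)

# ISO 8601 duration such as P2D, PT48H or P1DT12H (weeks are not handled).
_DURATION_RE = re.compile(
    r"^P(?=.)(\d+Y)?(\d+M)?(\d+D)?(T(?=\d)(\d+H)?(\d+M)?(\d+S)?)?$"
)


def is_valid_collection_date(value: str) -> bool:
    """Return True if value looks like a (possibly truncated) ISO 8601 date."""
    return _DATE_RE.match(value) is not None


def is_valid_lat_lon(value: str) -> bool:
    """Return True for 'lat lon' in decimal degrees within valid ranges."""
    m = _LAT_LON_RE.match(value.strip())
    if m is None:
        return False
    lat, lon = float(m.group(1)), float(m.group(2))
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


def is_valid_duration(value: str) -> bool:
    """Return True if value looks like an ISO 8601 duration."""
    return _DURATION_RE.match(value) is not None
```

With these helpers, `is_valid_collection_date("2013-03-25")` and `is_valid_lat_lon("52.2929 10.5416")` accept the example values from the table, while a free-text duration such as `"48 h"` is rejected and `"PT48H"` is accepted.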

Terrestrial and Terrestrial-Constructed - Ontology recommendations

NCBI organismal classification - NCBITAXON - An ontology representation of the NCBI organismal taxonomy.

The Environment Ontology - ENVO - ENVO is an ontology which represents knowledge about environments, environmental processes, ecosystems, habitats, and related entities.

An ontology of core ecological entities - ECOCORE - An ontology of core ecological entities.

Chemical Entities of Biological Interest - ChEBI - An open-access database and ontology of chemical entities. The chemical entities in ChEBI are either naturally occurring molecules or synthetic compounds used to intervene in the processes of living organisms. ChEBI uses the nomenclature, symbolism and terminology endorsed by the International Union of Pure and Applied Chemistry (IUPAC) and the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (NC-IUBMB). ChEBI also incorporates an ontological classification, whereby the relationships between chemical entities or classes of entities and their parents and/or children are defined; this enables queries based for example on chemical class and role.
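
Annotation values in the table pair a label with an ontology identifier in square brackets and separate multiple terms with `|`, for example `temperate grassland biome [ENVO:01000193]|road []`. The following Python is a minimal sketch of how such a value could be split into terms and how an identifier could be turned into an OBO-style PURL of the form shown for soil (http://purl.obolibrary.org/obo/ENVO_00001998). The names `Term`, `parse_terms` and `obo_purl` are hypothetical, and the PURL pattern is assumed to apply only to OBO-hosted ontologies such as ENVO or ChEBI, not to identifiers written like `NCBI:txid410658`.

```python
import re
from typing import List, NamedTuple, Optional


class Term(NamedTuple):
    label: str
    curie: Optional[str]  # e.g. "ENVO:01000193"; None if brackets are empty or absent


# A label followed by an optional trailing "[PREFIX:ID]" block.
_TERM_RE = re.compile(r"^(?P<label>.*?)\s*(?:\[(?P<curie>[^\]]*)\])?\s*$")


def parse_terms(value: str) -> List[Term]:
    """Split a pipe-separated annotation value into (label, identifier) terms."""
    terms = []
    for part in value.split("|"):
        part = part.strip()
        if not part:
            continue
        m = _TERM_RE.match(part)
        label = m.group("label").strip()
        # Treat empty brackets ("road []") the same as a missing identifier.
        curie = (m.group("curie") or "").strip() or None
        terms.append(Term(label, curie))
    return terms


def obo_purl(curie: str) -> str:
    """Build an OBO-style PURL from an identifier such as 'ENVO:00001998'."""
    prefix, local_id = curie.split(":", 1)
    return f"http://purl.obolibrary.org/obo/{prefix}_{local_id}"
```

Terms whose identifier comes back as `None` are the ones still missing an ontology reference. The sketch only recognizes an identifier at the end of a term, so values that append a measurement after the bracket (as in heavy_metals) would need extra handling.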

References

“ENA Host Associated Checklist; Checklist: ERC000013.” https://www.ebi.ac.uk/ena/browser/view/ERC000013.

“ENA Marine Microalgae Checklist; Checklist: ERC000043.” https://www.ebi.ac.uk/ena/browser/view/ERC000043.

“ENA Soil; Checklist: ERC000022.” https://www.ebi.ac.uk/ena/browser/view/ERC000022.

“GSC MIMS: Metagenome or Environmental.” https://genomicsstandardsconsortium.github.io/mixs/0010007/.

“GSC MIxS: Host-associatedMIMS.” https://genomicsstandardsconsortium.github.io/mixs/0016002/.

“GSC MIXS: MIGSBacteria.” https://genomicsstandardsconsortium.github.io/mixs/0010003/.

“GSC MIXS: MIMAG.” https://genomicsstandardsconsortium.github.io/mixs/0010011/.

“GSC MIxS: SoilMIMS.” https://genomicsstandardsconsortium.github.io/mixs/0016012/.

“GSC MIxS: WaterMIMS.” https://genomicsstandardsconsortium.github.io/mixs/0016014/.

“MIMS: Metagenome/Environmental, Human-Associated; Version 6.0 Package.” https://www.ncbi.nlm.nih.gov/biosample/docs/packages/MIMS.me.human-associated.5.0/.

“Minimum Information about a Single Amplified Genome (MiSAG).” https://genomicsstandardsconsortium.github.io/mixs/0010010/.

“Minimum Information about an Uncultivated Virus Genome (Miuvig).” https://genomicsstandardsconsortium.github.io/mixs/0010012/.

“Minimum Information about Viral Genome Sequence (MigsVi).” https://genomicsstandardsconsortium.github.io/mixs/0010005/.

Morrison, Norman, Daniel Bearden, Jacob G. Bundy, Timothy Collette, Fraser Currie, Matthew Davey, Migdalia Dominguez, et al. 2007. “Standard Reporting Requirements for Biological Samples in Metabolomics Experiments: Environmental Context.” Metabolomics 3 (2): 203–10. https://doi.org/10.1007/s11306-007-0067-1.