| Category | MIXS ID | Metadata field | Brief description | Definition | Example of Annotation | Source |
|---|---|---|---|---|---|---|
| Sample metadata | [“ENA Marine Microalgae Checklist; Checklist: ERC000043”] | collected_by | Who collected the sample | Name of person or institute that collected the sample | Freie Universität Berlin | “ENA Marine Microalgae Checklist; Checklist: ERC000043” |
| | [MIXS:0000001] | samp_size | Amount or size of the collected sample | The total amount or size (volume (ml), mass (g) or area (m2)) of sample collected | 3 g feces | “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)” |
| | [MIXS:0000011] | collection_date | Date at which the sample was collected | The time of sampling, either as an instance (single point in time) or interval. In case no exact time is available, the date/time can be right truncated. The value should be [ISO8601] compliant | 2008-01-23T19:23:10+00:00; 2008-01-23T19:23:10; 2008-01-23; 2008-01; 2008. Except for 2008-01 and 2008, all are [ISO8601] compliant | “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)” |
| | [MIXS:0000026] | source_mat_id | identifier(s) of source material | A unique identifier assigned to a [material sample] used for extracting nucleic acids, and subsequent sequencing. The identifier can refer either to the original material collected or to any derived sub-samples. The INSDC qualifiers /specimen_voucher, /bio_material, or /culture_collection may or may not share the same value as the source_mat_id field. For instance, the /specimen_voucher qualifier and source_mat_id may both contain 'UAM:Herps:14', referring to both the specimen voucher and sampled tissue with the same identifier. However, the /culture_collection qualifier may refer to a value from an initial culture (e.g. ATCC:11775) while source_mat_id would refer to an identifier from some derived culture from which the nucleic acids were extracted (e.g. xatc123 or ark:/2154/R2) | DOG_FECAL_0001 | “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)” |
| | [MIXS:0000092] | project_name | project name under which sampling and sequencing was done | Name of the project within which the sequencing was organized | Canine Gut Microbiome Sequencing Project | |
| | [MIXS:0000110] | samp_store_temp | Temperature at which the sample was stored | Temperature at which sample was stored, e.g. -80 °C | -80 °C | |
| | [MIXS:0000113] | temp | temperature | Temperature of the sample at the time of sampling | 37 °C | “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)” |
| | [MIXS:0000183] | salinity | salinity | The total concentration of all dissolved salts in a liquid or solid sample. While salinity can be measured by a complete chemical analysis, this method is difficult and time-consuming. More often, it is instead derived from the conductivity measurement. This is known as practical salinity. These derivations compare the specific conductance of the sample to a salinity standard such as seawater | 0 practical salinity unit (PSU) | “GSC MIxS: Host-associatedMIMS”, “GSC MIxS: Human-associatedMIMS”, “GSC MIxS Human Associated; ENA Checklist: ERC000014” |
| | [MIXS:0000249] | samp_dis_stage | Disease stage of sampled host | Stage of the disease at the time of sample collection, e.g. inoculation, penetration, infection, growth and reproduction, dissemination of pathogen | infection | |
| | [MIXS:0000752] | misc_param | miscellaneous parameter | Any other measurement performed or parameter collected, that is not listed here | household pet; kibble diet | |
| | [MIXS:0000753] | oxy_stat_samp | oxygenation status of sample | Oxygenation status of sample | anaerobic | |
| | [MIXS:0000754] | perturbation | perturbation | Type of perturbation, e.g. chemical administration, physical disturbance, etc., coupled with perturbation regimen including how many times the perturbation was repeated, how long each perturbation lasted, and the start and end time of the entire perturbation period; can include multiple perturbation types | praziquantel 5 mg kg⁻¹ PO, 24 h pre-sampling | |
| | [MIXS:0001107] | samp_name | sample name | A local identifier or name for the material sample used for extracting nucleic acids, and subsequent sequencing. It can refer either to the original material collected or to any derived sub-samples. It can have any format, but we suggest that you make it concise, unique and consistent within your lab, and as informative as possible. INSDC requires every sample name from a single Submitter to be unique. Use of a globally unique identifier for the field source_mat_id is recommended in addition to samp_name | Dog_001_Feces_2008-01-23 | “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)” |
| | [MIXS:0001216] | microb_cult_med | Microbiological culture medium (applicable only if microorganism can be cultivated) | Composition of processed material providing the needed nourishment for microorganisms or cells to grow in vitro. This field accepts terms listed under culture medium [OBI:0000079]. If the proper descriptor is not listed please use text to describe the culture medium | minimal defined medium [MCO:0000881] | “MIMS: Metagenome/Environmental, Human-Associated; Version 6.0 Package”, MSI-ECWSG (Morrison et al. (2007)) |
| | [MIXS:0001317] | samp_store_sol | Solution in which the sample was stored | Solution within which sample was stored, if any | RNALater [NCIT:C63348] | |
| | [MIXS:0001320] | samp_taxon_id | Taxonomical identifier of sample | NCBI taxon id of the sample. May be a single taxon or mixed taxa sample. Use ‘synthetic metagenome’ for mock community/positive controls, or ‘blank sample’ for negative controls. Expected_value: [NCBI taxonomy ID] | Gut Metagenome [NCBITaxon:749906] | “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)” |
| | [] | microbial_isolate | Can the microbial isolate be cultured in vitro? | Y/N | N | |
| Site metadata | [MIXS:0000009] | lat_lon | geographic location (latitude and longitude) | The geographical origin of the sample as defined by latitude and longitude. The values should be reported in decimal degrees, limited to 8 decimal points, and in WGS84 system | 52.454456 13.293950 | |
| | [MIXS:0000010] | geo_loc_name | geographic location (country and/or sea, region) | Geographic location (country and/or sea, region). The geographical origin of the sample as defined by the country or sea name followed by specific region name. Country or sea names should be chosen from the [INSDC country list], or the [GAZ ontology] | Germany: Berlin | “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)” |
| | [MIXS:0000012] | env_broad_scale | Broad-scale environmental context | Report the major environmental system the sample or specimen came from. System identifier(s) should provide a coarse, general environmental context of where the sampling was done. Recommended use of EnvO's biome class: [ENVO_00000428]. If more than one term applies to the field, \| should be used to separate them. | urban biome [ENVO:01000249] | “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)” |
| | [MIXS:0000013] | env_local_scale | Local-scale environmental context | Report the entity or entities which are in the sample or specimen’s local vicinity and which you believe have significant causal influences on your sample or specimen. Entry should be of a smaller environmental context than env_broad_scale. Terms, such as anatomical sites, from other [OBO Library] ontologies which interoperate with EnvO (e.g. [UBERON]) are accepted in this field. If more than one term applies to the field, \| should be used to separate them. | household environment [ENVO:03501339] | “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)” |
| | [MIXS:0000014] | env_medium | Environmental medium | Report the environmental material(s) immediately surrounding the sample or specimen at the time of sampling. Recommended use of EnvO’s subclasses of environmental material [ENVO:00010483]. Terms from other [OBO ontologies] are permissible as long as they reference mass/volume nouns (e.g. air, water, blood) and not discrete, countable entities (e.g. a tree, a leaf, a table top). If more than one term applies to the field, \| should be used to separate them. | fecal material [ENVO:00002003] | “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)”, “GSC MIxS Human Associated; ENA Checklist: ERC000014” |
| | [MIXS:0000018] | depth | depth | The vertical distance below local surface. For sediment or soil samples depth is measured from sediment or soil surface, respectively. Depth can be reported as an interval for subsurface samples | 0 m | “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIxS: Host-associatedMIMS”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)” |
| | [MIXS:0000093] | elev | elevation | Elevation of the sampling site is its height above a fixed reference point, most commonly the mean sea level. Elevation is mainly used when referring to points on the earth's surface, while altitude is used for points above the surface, such as an aircraft in flight or a spacecraft in orbit. Origin elevation in m | 34 m | “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIxS: Host-associatedMIMS”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)” |
| | [MIXS:0000094] | alt | altitude | Heights of objects such as airplanes, space shuttles, rockets, atmospheric balloons and heights of places such as atmospheric layers and clouds. It is used to measure the height of an object which is above the earth's surface. In this context, the altitude measurement is the vertical distance between the earth's surface above sea level and the sampled position in the air | not applicable | “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIxS: Host-associatedMIMS”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)” |
| Site/host metadata | [MIXS:0000751] | chem_administration | List of chemicals administered to sampled host or site | List of chemical compounds administered to host or on site where sampling occurred. Can include multiple compounds separated by \|. For compounds consult [chemical entities of biological interest ontology (chebi) (v 163)] | praziquantel [CHEBI:45267] | “GSC MIxS: Host-associatedMIMS”, “GSC MIxS: Human-associatedMIMS” |
| Host metadata | [MIXS:0000031] | host_disease_stat | Disease status of the sampled host | List of diseases with which the host has been diagnosed; can include multiple diagnoses. The value of the field depends on host, non-human host diseases are free text. | cestode infection (Dipylidium caninum) OR healthy | “GSC MIxS: Host-associatedMIMS”, “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIxS Human Associated; ENA Checklist: ERC000014”, “GSC MIxS: Human-associatedMIMS”, “GSC MIXS: MIGSBacteria”, “Minimum Information about Viral Genome Sequence (MigsVi)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)” |
| | [MIXS:0000248] | host_common_name | Common name of the sampled host | Common name of the host | dog | “GSC MIxS: Host-associatedMIMS”, “ENA Host Associated Checklist; Checklist: ERC000013” |
| | [MIXS:0000250] | host_taxid | Taxonomy identifier of sampled host | NCBI taxon id of the host [NCBI taxonomy ID] | Canis lupus familiaris [NCBI:txid9615] | “GSC MIxS: Host-associatedMIMS”, “ENA Host Associated Checklist; Checklist: ERC000013”, MSI-ECWSG (Morrison et al. (2007)) |
| | [MIXS:0000251] | host_life_stage | Life stage of the sampled host | Description of life stage of host | adult | |
| | [MIXS:0000255] | host_age | Age of sampled host | Age of host at the time of sampling; relevant scale depends on species and study, e.g. could be seconds for amoebae or centuries for trees | 5 y | “GSC MIxS: Host-associatedMIMS”, “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIxS Human Associated; ENA Checklist: ERC000014”, “GSC MIxS: Human-associatedMIMS” |
| | [MIXS:0000256] | host_length | Length of sampled host | The length of host | 45 cm | “GSC MIxS: Host-associatedMIMS”, “ENA Host Associated Checklist; Checklist: ERC000013” |
| | [MIXS:0000260] | host_color | Color of sampled host | The color of host | brown | |
| | [MIXS:0000261] | host_shape | Morphological shape of sampled host | Morphological shape of host | Slender [PATO:0002212] | |
| | [MIXS:0000263] | host_tot_mass | Total mass of the sampled host | Total mass of the host at collection, the unit depends on host | 22 kg | “GSC MIxS: Host-associatedMIMS”, “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIxS Human Associated; ENA Checklist: ERC000014”, “GSC MIxS: Human-associatedMIMS” |
| | [MIXS:0000264] | host_height | Height of sampled host | The height of subject | 55 cm | “GSC MIxS: Host-associatedMIMS”, “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIxS Human Associated; ENA Checklist: ERC000014”, “GSC MIxS: Human-associatedMIMS” |
| | [MIXS:0000274] | host_body_temp | Body temperature of sampled host | Core body temperature of the host when sample was collected | 38.5 °C | |
| | [MIXS:0000365] | host_genotype | Observed genotype of sampled host | Observed genotype | domestic dog reference genome CanFam3.1 | |
| | [MIXS:0000859] | genetic_mod | genetic modification | Genetic modifications of the genome of an organism, which may occur naturally by spontaneous mutation, or be introduced by some experimental means, e.g. specification of a transgene or the gene knocked-out or details of transient transfection | none | |
| | [MIXS:0000861] | host_subject_id | Identifier assigned to sampled host | A unique identifier by which each subject can be referred to, de-identified | Dog_001 | |
| | [MIXS:0000862] | urobiom_sex | Physical sex of the sampled host | Physical sex of the host | Male [PATO:0000384] | “GSC MIxS: Host-associatedMIMS”, “ENA Host Associated Checklist; Checklist: ERC000013” |
| | [MIXS:0000866] | host_body_habitat | host body habitat | Original body habitat where the sample was obtained from | digestive tract [UBERON:0001555] | |
| | [MIXS:0000867] | host_body_site | Sampled body site of the host | Name of body site where the sample was obtained from, such as a specific organ or tissue (tongue, lung etc.). Recommended use of [FMA] or [UBERON] ontologies | colon [UBERON:0001155] | “GSC MIxS: Host-associatedMIMS”, “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIxS Human Associated; ENA Checklist: ERC000014”, “GSC MIxS: Human-associatedMIMS” |
| | [MIXS:0000869] | host_diet | Diet of sampled host | Type of diet depending on the host, for animals omnivore, herbivore etc., for humans high-fat, Mediterranean etc.; can include multiple diet types | omnivore [ecocore:00000082] | “GSC MIxS: Host-associatedMIMS”, “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIxS Human Associated; ENA Checklist: ERC000014”, “GSC MIxS: Human-associatedMIMS” |
| | [MIXS:0000871] | host_growth_cond | Growth conditions of sampled host | Literature reference giving growth conditions of the host | individual housing [XCO:0000034] | |
| | [MIXS:0000874] | host_phenotype | Identified phenotype of sampled host | Phenotype of human or other host. Use terms from the phenotypic quality ontology (PATO) or the Human Phenotype Ontology (HP) | Body condition score 5/9 | |
| | [MIXS:0000875] | gravidity | Gravidity of sampled host | Whether or not subject is gravid, and if yes date due or date post-conception, specifying which is used | Non-gravid | |
| | [MIXS:0000888] | host_body_product | Body product that was examined and sampled from host | Substance produced by the body, e.g. stool, mucus, where the sample was obtained from. Use terms from the foundational model of anatomy ontology [FMA] or Uber-anatomy ontology [UBERON]. | feces [UBERON:0001988] | “GSC MIxS: Host-associatedMIMS”, “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIxS Human Associated; ENA Checklist: ERC000014”, “GSC MIxS: Human-associatedMIMS” |
| | [MIXS:0001298] | host_symbiont | observed symbionts of sampled host | The taxonomic name of the organism(s) found living in mutualistic, commensalistic, or parasitic symbiosis with the specific host. The sampled symbiont can have its own symbionts. For example, parasites may have hyperparasites (=parasites of the parasite) | Bacteroides vulgatus [NCBI:txid821] | |
| | [MIXS:0001307] | type_of_symbiosis | type of symbiosis with sampled host | Type of biological interaction established between the symbiotic host organism being sampled and its respective host | Commensalism [ECOCORE:00000025] | |
| | [MIXS:0001308] | host_specificity | Specificity of symbiont of the sampled host | Level of specificity of symbiont-host interaction: e.g. generalist (symbiont able to establish associations with distantly related hosts) or species-specific | generalist | |
| | [MIXS:0001313] | host_cellular_loc | Cellular location of symbiont within the sampled host | The localization of the symbiotic host organism within the host from which it was sampled: e.g. intracellular if the symbiotic host organism is localized within the cells or extracellular if the symbiotic host organism is localized outside of cells | lumen of intestine [UBERON:0018543] | |
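
Several definitions in the table above describe value formats that can be checked mechanically: `collection_date` (full or right-truncated date/time), `lat_lon` (decimal degrees, at most 8 decimal places) and the `|` separator for multi-valued fields. The following is a minimal Python sketch of such checks, not an official MIxS validator. The function names and regular expressions are assumptions made for illustration, and the date pattern only checks the shape of the value, not whether the month or day is in range.

```python
import re

# Shape of a full or right-truncated date/time, as in the collection_date examples.
DATE_RE = re.compile(
    r"\d{4}(-\d{2}(-\d{2}(T\d{2}:\d{2}(:\d{2})?(Z|[+-]\d{2}:\d{2})?)?)?)?"
)
# Two space-separated decimal numbers with at most 8 decimal places each.
LAT_LON_RE = re.compile(r"(-?\d{1,2}(?:\.\d{1,8})?) (-?\d{1,3}(?:\.\d{1,8})?)")


def is_valid_collection_date(value: str) -> bool:
    """Return True for values shaped like 2008-01-23T19:23:10+00:00 down to 2008."""
    return DATE_RE.fullmatch(value) is not None


def parse_lat_lon(value: str) -> tuple:
    """Parse 'lat lon' in decimal degrees and check the numeric ranges."""
    match = LAT_LON_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"not a 'lat lon' pair in decimal degrees: {value!r}")
    lat, lon = float(match.group(1)), float(match.group(2))
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError(f"latitude or longitude out of range: {value!r}")
    return lat, lon


def split_multi_value(value: str) -> list:
    """Split a multi-valued field on '|' and drop surrounding whitespace."""
    return [part.strip() for part in value.split("|") if part.strip()]


if __name__ == "__main__":
    print(is_valid_collection_date("2008-01-23T19:23:10+00:00"))
    print(parse_lat_lon("52.454456 13.293950"))
    print(split_multi_value("urban biome [ENVO:01000249] | household environment [ENVO:03501339]"))
```

A real submission pipeline would likely rely on the validators provided by the target archive; the sketch is only meant to make the format rules concrete.
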
NCBI organismal classification - NCBITAXON - An ontology representation of the NCBI organismal taxonomy.
Biological Spatial Ontology - BSPO - An ontology for representing spatial concepts, anatomical axes, gradients, regions, planes, sides and surfaces. These concepts can be used at multiple biological scales and in a diversity of taxa, including plants, animals and fungi. The BSPO is used to provide a source of anatomical location descriptors for logically defining anatomical entity classes in anatomy ontologies.
Uber-anatomy ontology - UBERON - Uberon is an integrated cross-species anatomy ontology representing a variety of entities classified according to traditional anatomical criteria such as structure, function and developmental lineage. The ontology includes comprehensive relationships to taxon-specific anatomical ontologies, allowing integration of functional, phenotype and expression data.
Cell Ontology - CL - The Cell Ontology is a structured controlled vocabulary for cell types in animals.
Neuro Behavior Ontology - NBO - An ontology of human and animal behaviours and behavioural phenotypes.
The BRENDA Tissue Ontology - BTO - A structured controlled vocabulary for the source of an enzyme comprising tissues, cell lines, cell types and cell cultures.
Gene Ontology - GO - The Gene Ontology (GO) provides a framework and set of concepts for describing the functions of gene products from all organisms.
Chemical Entities of Biological Interest - ChEBI - An open-access database and ontology of chemical entities. The chemical entities in ChEBI are either naturally occurring molecules or synthetic compounds used to intervene in the processes of living organisms. ChEBI uses the nomenclature, symbolism and terminology endorsed by the International Union of Pure and Applied Chemistry (IUPAC) and the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (NC-IUBMB). ChEBI also incorporates an ontological classification, whereby the relationships between chemical entities or classes of entities and their parents and/or children are defined; this enables queries based for example on chemical class and role.
The Environment Ontology - ENVO - ENVO is an ontology which represents knowledge about environments, environmental processes, ecosystems, habitats, and related entities.
An ontology of core ecological entities - ECOCORE - An ontology of core ecological entities.
Foundational Model of Anatomy Ontology - FMA - The FMA is a domain ontology that represents a coherent body of explicit declarative knowledge about human anatomy. Its ontological framework can be applied and extended to all other species. The Foundational Model of Anatomy (FMA) ontology is one of the information resources integrated in the distributed framework of the Anatomy Information System developed and maintained by the Structural Informatics Group at the University of Washington.
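
Many example annotations in the table pair a free-text label with an ontology identifier in square brackets, e.g. `feces [UBERON:0001988]`. As a rough illustration, the Python sketch below splits such an annotation into label, prefix and local identifier, and reports whether the prefix is one of the ontologies listed above. The `parse_term` helper, the `OntologyTerm` type and the `LISTED_PREFIXES` set are hypothetical names introduced here; the pattern is inferred from the examples in this document, not taken from a specification.

```python
import re
from typing import NamedTuple, Optional

# Prefixes of the ontologies listed above (upper-cased for comparison).
LISTED_PREFIXES = {
    "NCBITAXON", "BSPO", "UBERON", "CL", "NBO", "BTO",
    "GO", "CHEBI", "ENVO", "ECOCORE", "FMA",
}

# "label [PREFIX:local_id]"; the space before the bracket is optional.
TERM_RE = re.compile(
    r"(?P<label>.*?)\s*\[(?P<prefix>[A-Za-z]+):(?P<local_id>[A-Za-z0-9]+)\]"
)


class OntologyTerm(NamedTuple):
    label: str
    prefix: str
    local_id: str
    listed: bool  # True if the prefix belongs to one of the ontologies listed above


def parse_term(annotation: str) -> Optional[OntologyTerm]:
    """Return the parsed term, or None for free text without an identifier."""
    match = TERM_RE.fullmatch(annotation.strip())
    if match is None:
        return None
    prefix = match.group("prefix").upper()
    return OntologyTerm(
        label=match.group("label"),
        prefix=prefix,
        local_id=match.group("local_id"),
        listed=prefix in LISTED_PREFIXES,
    )


if __name__ == "__main__":
    for text in ["feces [UBERON:0001988]", "Slender [PATO:0002212]", "brown"]:
        print(text, "->", parse_term(text))
```

Free-text values such as `brown` return `None`, which keeps fields that accept either plain text or an ontology term easy to handle.
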
Readers of this repository who are unsure how EnvO terms are used with MIxS can consult the EnvO usage documentation: https://github.com/EnvironmentOntology/envo/wiki/Using-ENVO-with-MIxS.
“ENA Host Associated Checklist; Checklist: ERC000013.” https://www.ebi.ac.uk/ena/browser/view/ERC000013.
“ENA Marine Microalgae Checklist; Checklist: ERC000043.” https://www.ebi.ac.uk/ena/browser/view/ERC000043.
“GSC MIMS: Metagenome or Environmental.” https://genomicsstandardsconsortium.github.io/mixs/0010007/.
“GSC MIxS Human Associated; ENA Checklist: ERC000014.” https://www.ebi.ac.uk/ena/browser/view/ERC000014.
“GSC MIxS: Host-associatedMIMS.” https://genomicsstandardsconsortium.github.io/mixs/0016002/.
“GSC MIxS: Human-associatedMIMS.” https://genomicsstandardsconsortium.github.io/mixs/0016003/.
“GSC MIXS: MIGSBacteria.” https://genomicsstandardsconsortium.github.io/mixs/0010003/.
“GSC MIXS: MIMAG.” https://genomicsstandardsconsortium.github.io/mixs/0010011/.
“MIMS: Metagenome/Environmental, Human-Associated; Version 6.0 Package.” https://www.ncbi.nlm.nih.gov/biosample/docs/packages/MIMS.me.human-associated.5.0/.
“Minimum Information about a Single Amplified Genome (MiSAG).” https://genomicsstandardsconsortium.github.io/mixs/0010010/.
“Minimum Information about an Uncultivated Virus Genome (Miuvig).” https://genomicsstandardsconsortium.github.io/mixs/0010012/.
“Minimum Information about Viral Genome Sequence (MigsVi).” https://genomicsstandardsconsortium.github.io/mixs/0010005/.
Morrison, Norman, Daniel Bearden, Jacob G. Bundy, Timothy Collette, Fraser Currie, Matthew Davey, Migdalia Dominguez, et al. 2007. “Standard Reporting Requirements for Biological Samples in Metabolomics Experiments: Environmental Context.” Metabolomics 3 (2): 203–10. https://doi.org/10.1007/s11306-007-0067-1.