digital-science/ds-external-dataset-transformations

External Dataset Transformations

Transforms external public dataset archives into Avro for ingestion into BigQuery.

Datasets

| Directory | Description |
| --- | --- |
| datacite/ | Parses archived DataCite DOI metadata records into Avro. |
| orcid/ | Parses ORCID snapshot XML summaries into Avro. |

See each subdirectory's README.md for detailed usage and schema information.
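
As a rough illustration of the XML-to-record step described for `orcid/`, the sketch below parses a small hand-written ORCID-style summary into a flat dictionary shaped for an Avro record. It is not the repository's implementation: the `OrcidSummary` schema, the `parse_summary` function, the chosen fields, and the sample XML are all assumptions made for this example, and the namespace URIs should be checked against the actual snapshot files.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as they commonly appear in ORCID record XML.
# Assumption: verify these against the snapshot being processed.
NS = {
    "record": "http://www.orcid.org/ns/record",
    "common": "http://www.orcid.org/ns/common",
    "person": "http://www.orcid.org/ns/person",
    "pd": "http://www.orcid.org/ns/personal-details",
}

# Hypothetical Avro schema for this example only; the real schemas are
# documented in each subdirectory's README.md.
SUMMARY_SCHEMA = {
    "type": "record",
    "name": "OrcidSummary",
    "fields": [
        {"name": "orcid", "type": "string"},
        {"name": "given_names", "type": ["null", "string"], "default": None},
        {"name": "family_name", "type": ["null", "string"], "default": None},
    ],
}

# Simplified, hand-written fragment (not taken from a real snapshot).
SAMPLE_XML = """
<record:record
    xmlns:record="http://www.orcid.org/ns/record"
    xmlns:common="http://www.orcid.org/ns/common"
    xmlns:person="http://www.orcid.org/ns/person"
    xmlns:personal-details="http://www.orcid.org/ns/personal-details">
  <common:orcid-identifier>
    <common:path>0000-0002-1825-0097</common:path>
  </common:orcid-identifier>
  <person:person>
    <person:name>
      <personal-details:given-names>Josiah</personal-details:given-names>
      <personal-details:family-name>Carberry</personal-details:family-name>
    </person:name>
  </person:person>
</record:record>
"""


def parse_summary(xml_text: str) -> dict:
    """Extract a flat record matching SUMMARY_SCHEMA from one summary document."""
    root = ET.fromstring(xml_text)
    orcid = root.findtext("common:orcid-identifier/common:path", namespaces=NS)
    if orcid is None:
        # The schema declares "orcid" as a required string.
        raise ValueError("summary has no ORCID identifier")
    return {
        "orcid": orcid,
        # Optional fields stay None when the element is absent.
        "given_names": root.findtext("person:person/person:name/pd:given-names", namespaces=NS),
        "family_name": root.findtext("person:person/person:name/pd:family-name", namespaces=NS),
    }


if __name__ == "__main__":
    print(parse_summary(SAMPLE_XML))
```

Records of this shape could then be serialized with an Avro library. For instance, the third-party `fastavro` package provides `fastavro.writer(fo, schema, records)`. Whether this repository uses that library is not stated here.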

About

External dataset transformations for ORCID and DataCite ingestion into Google BigQuery. Supports parsing ORCID snapshot XML summaries into structured records and writing to Avro files for ingestion into BigQuery.
