feat: ✨ add ability to download data dict from REDCap #33

Open
martonvago wants to merge 9 commits into main from feat/redcap-data-dict
@martonvago
Collaborator

Description

This PR adds the ability to download the data dict from REDCap.

Closes #24

This PR needs an in-depth review.

Checklist

  • Formatted Markdown
  • Ran just run-all


def get_data_dict_from_redcap() -> dict[str, str]:
    """Gets the data dictionary from REDCap."""
    token = os.environ.get("REDCAP_TOKEN")
Collaborator Author

Currently, the token has CPH_API_KEY in the name on GenomeDK. Should we keep the reference to CPH? I went with a more general name here.

Contributor

I think the generic code should use the more general name. For the implementation of this code on ON LiMiT Feasibility, we'll need to know that this is the code for the REDCap instance in Copenhagen. Later there will be tokens for Copenhagen, Aarhus, and Odense.
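
One way to keep the generic code general while still telling the instances apart could look like the sketch below. It is only an illustration: the environment variable names (`REDCAP_TOKEN_CPH` and so on) and the `get_redcap_token` helper are hypothetical, not names used in this PR or on GenomeDK.

```python
import os

# Hypothetical mapping from site to environment variable name.
# These names are assumptions made for illustration only.
TOKEN_ENV_VARS = {
    "copenhagen": "REDCAP_TOKEN_CPH",
    "aarhus": "REDCAP_TOKEN_AARHUS",
    "odense": "REDCAP_TOKEN_ODENSE",
}


def get_redcap_token(site: str) -> str:
    """Return the API token for a site, failing early if it is missing."""
    try:
        env_var = TOKEN_ENV_VARS[site]
    except KeyError:
        raise ValueError(f"Unknown REDCap site: {site!r}") from None
    # os.environ.get returns None when the variable is unset, so check
    # explicitly instead of passing None on to the API call.
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"Environment variable {env_var} is not set.")
    return token
```

With this shape, the download function could take the site as an argument, and a missing token would surface as a clear error rather than as a failed request.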

Collaborator Author

Will all the locations be in the same repo or different repos?

Member

All the REDCap locations will pull into this one Data Package repo, as far as I know.

Collaborator Author

Ah okay! The metadata will be exactly the same, right (it's enough to get it from one location)? Then I will change the token name back to the CPH-specific one.

Contributor

For feasibility, the only instance that we will be pulling 'real' data from is the Copenhagen one, but for testing we should be able to use the Aarhus version. I'm not sure we should be testing the data transfer with real data to start with. It would have been easier if we had had some time to run tests on the system in Cph before we went into production. Hopefully we'll have that for the main study.

The main study is still being designed, but it does look like we'll run several REDCap instances on the Aarhus server, each with slightly different data dictionaries...

@martonvago martonvago moved this from Todo to In Review in Data development Mar 19, 2026
@martonvago martonvago marked this pull request as ready for review March 19, 2026 10:17
Member

@lwjohnst86 lwjohnst86 left a comment

Nice, just some minor comments



@github-project-automation github-project-automation bot moved this from In Review to In Progress in Data development Mar 23, 2026
@lwjohnst86
Member

@martonvago you can also include the downloaded data_dictionary.json file here, so that we can do work on it outside of GenomeDK (it's only metadata).
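
A minimal sketch of how the downloaded metadata could be written to a file for committing, assuming it is already in memory as a JSON-serialisable object. The `save_data_dict` helper is hypothetical and not part of this PR; only the `data_dictionary.json` file name comes from this thread.

```python
import json
from pathlib import Path
from typing import Any


def save_data_dict(data_dict: Any, path: Path) -> None:
    """Write the data dictionary to disk as pretty-printed JSON."""
    # indent=2 keeps diffs readable in the repo; ensure_ascii=False keeps
    # non-ASCII characters (for example Danish letters) as they are.
    text = json.dumps(data_dict, indent=2, ensure_ascii=False)
    path.write_text(text + "\n", encoding="utf-8")
```

For example, `save_data_dict(data_dict, Path("data_dictionary.json"))` would write the file to the current directory.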

@martonvago
Collaborator Author

martonvago commented Mar 23, 2026

> @martonvago you can also include the downloaded data_dictionary.json file here, so that we can do work on it outside of GenomeDK (it's only metadata).

@lwjohnst86 as in include it in the repo? Oh, I'm stupid: I was wondering why I would include the test metadata, but you meant the real metadata 🤦

@martonvago martonvago requested a review from lwjohnst86 March 23, 2026 15:22
@martonvago martonvago moved this from In Progress to In Review in Data development Mar 23, 2026
@martonvago
Collaborator Author

@lwjohnst86 Done!

Labels

None yet

Projects

Status: In Review

Development

Successfully merging this pull request may close these issues.

Extract data dictionary flow

3 participants