Update processing.json fields #5

Open
dougollerenshaw wants to merge 3 commits into master from refactor/processing-metadata-fields

dougollerenshaw (Collaborator) commented May 20, 2026

This PR uses the CodeOcean REST API to fetch the capsule slug (the number in the capsule URL). With the slug it can construct the capsule URL and, when the asset was generated by a release capsule, record the capsule's current release version. If the asset was generated by a non-release capsule, the version will say "from non-release editable capsule".
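For reviewers, the URL and version logic described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function names are hypothetical, the URL pattern is copied from the editable-capsule example in this description, and the URL for a release capsule may differ.

```python
from typing import Optional

# Domain and URL pattern taken from the example processing.json in this PR.
CODEOCEAN_DOMAIN = "https://codeocean.allenneuraldynamics.org"

# Placeholder string this PR writes when the capsule is not a release.
NON_RELEASE_VERSION = "from non-release editable capsule"


def capsule_url(slug: str) -> str:
    """Build the capsule URL from its slug (hypothetical helper)."""
    return f"{CODEOCEAN_DOMAIN}/capsule/{slug}/tree"


def capsule_version(release_version: Optional[str]) -> str:
    """Return the release version, or the placeholder if there is none."""
    if release_version is None:
        return NON_RELEASE_VERSION
    return str(release_version)
```

With the slug from the example, `capsule_url("8349219")` gives the same URL that appears in the `code.url` field of the sample output.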

This PR also adds the secrets file, which ensures that the necessary API secret is attached when the release capsule is run by an Allen Institute scientist. The secret is required to get the asset info from the CodeOcean API. Note that the keys themselves are not committed and are only visible to internal users.
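A rough sketch of how the attached secret might be picked up at run time, assuming it is exposed to the capsule as an environment variable. The variable name `CODEOCEAN_API_TOKEN` and the helper below are hypothetical and are not taken from this PR.

```python
import os
from typing import Mapping, Optional

# Hypothetical variable name; the real name is defined by the secrets file.
API_TOKEN_ENV_VAR = "CODEOCEAN_API_TOKEN"


def read_api_token(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the API token if it is set and non-empty, otherwise None."""
    token = env.get(API_TOKEN_ENV_VAR, "").strip()
    return token or None
```

Returning `None` instead of raising lets a run without the secret skip the API lookup rather than fail outright, which is one possible design choice here.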

Closes #4

After a run in an editable capsule, we get a processing.json that looks like this:

{
   "object_type": "Processing",
   "describedBy": "https://raw.githubusercontent.com/AllenNeuralDynamics/aind-data-schema/main/src/aind_data_schema/core/processing.py",
   "schema_version": "2.2.8",
   "data_processes": [
      {
         "object_type": "Data process",
         "process_type": "File format conversion",
         "name": "MAT to RDS conversion",
         "stage": "Processing",
         "code": {
            "object_type": "Code",
            "url": "https://codeocean.allenneuraldynamics.org/capsule/8349219/tree",
            "name": "LC-NE_BARseq_MAT-RDS_conversion",
            "version": "from non-release editable capsule",
            "commit_hash": null,
            "container": null,
            "run_script": "code/run",
            "language": "R",
            "language_version": null,
            "input_data": [
               {
                  "object_type": "Data asset",
                  "url": "s3://aind-open-data/barseq_780346_2025-06-13_12-00-00"
               }
            ],
            "parameters": null,
            "core_dependency": null
         },
         "experimenters": [
            "Polina Kosillo"
         ],
         "pipeline_name": null,
         "start_date_time": "2026-05-20T22:21:24.031594Z",
         "end_date_time": null,
         "output_path": null,
         "output_parameters": null,
         "notes": "Converts BARseq .mat (HDF5 v7.3) to SingleCellExperiment .rds files.",
         "resources": null
      }
   ],
   "pipelines": null,
   "notes": null,
   "dependency_graph": null
}

We can't see what the output of a release capsule looks like until we merge this in and release a capsule.
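Once a release capsule has been run, something like the following could be used to check which processes in a processing.json came from a release. This is only a sketch: the function name is hypothetical, and it relies solely on the `data_processes[].code.name` and `data_processes[].code.version` fields shown in the example above.

```python
import json
from typing import List, Optional, Tuple

NON_RELEASE_VERSION = "from non-release editable capsule"


def summarize_code_versions(
    processing_json: str,
) -> List[Tuple[Optional[str], Optional[str], bool]]:
    """Return (code name, version, is_release) for each data process."""
    doc = json.loads(processing_json)
    summary = []
    # "data_processes" may be missing or null, so fall back to an empty list.
    for process in doc.get("data_processes") or []:
        code = process.get("code") or {}
        version = code.get("version")
        # Treat a missing version or the placeholder string as non-release.
        is_release = version is not None and version != NON_RELEASE_VERSION
        summary.append((code.get("name"), version, is_release))
    return summary
```

For the example file above, this would report one process, `LC-NE_BARseq_MAT-RDS_conversion`, flagged as non-release.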

Linked issue (may be closed when this pull request is merged): Fix processing.json files in derived data