|
| 1 | +# Data Directory |
| 2 | + |
| 3 | +This directory contains neural and behavioral data for the ClickDV project. The data is organized by animal subject and recording session. |
| 4 | + |
| 5 | +## Directory Structure |
| 6 | + |
| 7 | +``` |
| 8 | +data/ |
| 9 | +├── README.md # This file |
| 10 | +├── raw/ # Raw neural data files (.mat format) |
| 11 | +│ ├── A324/ # Subject A324 data |
| 12 | +│ │ ├── 2023-07-27/ |
| 13 | +│ │ │ └── A324_pycells_20230727.mat |
| 14 | +│ │ └── 2023-07-28/ |
| 15 | +│ │ └── A324_pycells_20230728.mat |
| 16 | +│ ├── A327/ # Subject A327 data |
| 17 | +│ │ ├── 2023-09-09/ |
| 18 | +│ │ │ └── A327_pycells_20230909.mat |
| 19 | +│ │ ├── 2023-09-10/ |
| 20 | +│ │ │ └── A327_pycells_20230910.mat |
| 21 | +│ │ ├── 2023-09-11/ |
| 22 | +│ │ │ └── A327_pycells_20230911.mat |
| 23 | +│ │ └── 2023-09-12/ |
| 24 | +│ │ └── A327_pycells_20230912.mat |
| 25 | +│ ├── C211/ # Subject C211 data |
| 26 | +│ │ ├── 2024-01-03/ |
| 27 | +│ │ │ └── C211_pycells_20240103.mat |
| 28 | +│ │ ├── 2024-01-04/ |
| 29 | +│ │ │ └── C211_pycells_20240104.mat |
| 30 | +│ │ ├── 2024-01-05/ |
| 31 | +│ │ │ └── C211_pycells_20240105.mat |
| 32 | +│ │ ├── 2024-01-06/ |
| 33 | +│ │ │ └── C211_pycells_20240106.mat |
| 34 | +│ │ ├── 2024-01-07/ |
| 35 | +│ │ │ └── C211_pycells_20240107.mat |
| 36 | +│ │ ├── 2024-01-08/ |
| 37 | +│ │ │ └── C211_pycells_20240108.mat |
| 38 | +│ │ └── 2024-01-10/ |
| 39 | +│ │ └── C211_pycells_20240110.mat |
| 40 | +│ └── Copy of twoarmedbandit_trainingrecordings.csv |
| 41 | +└── processed/ # Processed data outputs |
| 42 | + ├── aligned_sessions/ # Time-aligned neural activity |
| 43 | + ├── click_times/ # Extracted click timing data |
| 44 | + └── decision_variables/ # Computed decision variables |
| 45 | +``` |
| 46 | + |
| 47 | +## Data Format |
| 48 | + |
| 49 | +### Raw Data Files |
| 50 | +- **Format**: MATLAB `.mat` files |
| 51 | +- **Naming Convention**: `{ANIMAL_ID}_pycells_{YYYYMMDD}.mat` |
| 52 | +- **Content**: Neural spike times, behavioral timestamps, trial information |
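
The naming convention above can be checked mechanically. The following is a minimal sketch, not part of the project's code: the function and class names are hypothetical, and the animal-ID pattern (one uppercase letter followed by digits, as in `A324`) is an assumption based on the IDs listed in this README.

```python
import re
from datetime import date, datetime
from typing import NamedTuple, Optional

# Pattern for {ANIMAL_ID}_pycells_{YYYYMMDD}.mat.
# Assumption: animal IDs look like one uppercase letter plus digits (A324, C211).
SESSION_FILE_RE = re.compile(r"^(?P<animal>[A-Z]\d+)_pycells_(?P<ymd>\d{8})\.mat$")


class SessionFile(NamedTuple):
    animal_id: str
    session_date: date


def parse_session_filename(name: str) -> Optional[SessionFile]:
    """Return the animal ID and session date, or None if the name does not fit."""
    match = SESSION_FILE_RE.match(name)
    if match is None:
        return None
    try:
        # Reject strings that have eight digits but are not real calendar dates.
        session_date = datetime.strptime(match.group("ymd"), "%Y%m%d").date()
    except ValueError:
        return None
    return SessionFile(match.group("animal"), session_date)
```

Calling `parse_session_filename("A324_pycells_20230727.mat")` yields the animal ID `A324` and the date 2023-07-27, while a name such as `Copy of twoarmedbandit_trainingrecordings.csv` yields `None`.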
| 53 | + |
| 54 | +### Key Data Fields |
| 55 | +Each `.mat` file contains: |
| 56 | +- `raw_spike_time_s`: Raw neural spike times in seconds |
| 57 | +- `filt_spike_time`: Filtered spike times (quality-approved units) |
| 58 | +- `clicks_on`: Click event timestamps |
| 59 | +- `cpoke_in`/`cpoke_out`: Center poke entry/exit times |
| 60 | +- `spoke`: Side poke timestamps (choice indicators) |
| 61 | +- `feedback`: Trial feedback timestamps |
| 62 | +- `region`: Brain region labels (e.g., 'ADS', 'NAc', 'MGB') |
| 63 | +- `hemisphere`: Recording hemisphere ('left'/'right') |
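
After loading a session file, it can help to confirm that the fields listed above are present. This is a hedged sketch: `EXPECTED_FIELDS` and `missing_fields` are hypothetical names, and it assumes the fields are stored as top-level variables in the `.mat` file. If they are nested inside a struct, the names to check would differ.

```python
from typing import Iterable, List

# Field names taken from the "Key Data Fields" list in this README.
EXPECTED_FIELDS = (
    "raw_spike_time_s",
    "filt_spike_time",
    "clicks_on",
    "cpoke_in",
    "cpoke_out",
    "spoke",
    "feedback",
    "region",
    "hemisphere",
)


def missing_fields(available: Iterable[str]) -> List[str]:
    """Return the expected field names absent from `available`, in README order."""
    present = set(available)
    return [name for name in EXPECTED_FIELDS if name not in present]
```

The `available` argument could be, for example, the keys of the dict returned by `scipy.io.loadmat`. That choice of reader is an assumption; files saved in MATLAB v7.3 format are HDF5-based and need an HDF5 reader instead.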
| 64 | + |
| 65 | +## Data Acquisition |
| 66 | + |
| 67 | +### For New Users |
| 68 | +1. **Contact**: Request data access from the Brody-Daw lab. The copy used here was provided by Julie Charlton. |
| 69 | +2. **Download**: Once access is granted, download the session files from the lab's data repository |
| 70 | +3. **Placement**: Put each file in its `data/raw/ANIMAL_ID/DATE/` directory, with `DATE` written as `YYYY-MM-DD` as in the tree above |
| 71 | +4. **Verification**: Ensure file naming follows the convention above |
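
The placement and verification steps above can be sketched as a small check. This is an illustration under stated assumptions, not an existing project script: `expected_location` and `misplaced_files` are hypothetical helpers, the animal-ID pattern is inferred from the IDs in this README, and the date folder is taken to be `YYYY-MM-DD` as shown in the directory tree.

```python
import re
from pathlib import Path
from typing import List, Optional

# Assumption: animal IDs are one uppercase letter plus digits (A324, C211).
_NAME_RE = re.compile(
    r"^(?P<animal>[A-Z]\d+)_pycells_(?P<y>\d{4})(?P<m>\d{2})(?P<d>\d{2})\.mat$"
)


def expected_location(filename: str, raw_root: Path = Path("data/raw")) -> Optional[Path]:
    """Map a session filename to raw_root/ANIMAL_ID/YYYY-MM-DD/filename."""
    match = _NAME_RE.match(filename)
    if match is None:
        return None  # name does not follow the convention
    date_dir = f"{match['y']}-{match['m']}-{match['d']}"
    return raw_root / match["animal"] / date_dir / filename


def misplaced_files(raw_root: Path = Path("data/raw")) -> List[Path]:
    """List .mat files under raw_root that are misnamed or in the wrong folder."""
    problems = []
    for path in sorted(raw_root.rglob("*.mat")):
        if expected_location(path.name, raw_root) != path:
            problems.append(path)
    return problems
```

Running `misplaced_files()` from the repository root returns an empty list when every `.mat` file sits where its name says it should. The check only compares digits in the name with the folder name; it does not confirm that the date is a valid calendar date.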
| 72 | + |
| 73 | +### Data Sources |
| 74 | +- **Origin**: Brody-Daw lab, Princeton University |
| 75 | +- **Recording Type**: Multi-unit neural recordings during a two-armed bandit task |
| 76 | +- **Species**: Rat |
| 77 | +- **Recording Regions**: Anterior Dorsal Striatum (ADS), Nucleus Accumbens (NAc), others |
| 78 | + |
| 79 | +### File Sizes |
| 80 | +- Individual session files: ~10-50 MB each |
| 81 | +- Total raw data: ~500 MB |
| 82 | +- Processed outputs: Variable, typically <100 MB |
| 83 | + |
| 84 | +## Setup Instructions |
| 85 | + |
| 86 | +1. **Create directory structure**: |
| 87 | + ```bash |
| 88 | + mkdir -p data/raw data/processed/aligned_sessions data/processed/click_times data/processed/decision_variables |
| 89 | + ``` |
| 90 | + |
| 91 | +2. **Obtain raw data**: |
| 92 | + - Contact lab for data access credentials |
| 93 | + - Download session files to appropriate `data/raw/ANIMAL_ID/DATE/` folders |
| 94 | + - Verify file integrity and naming conventions |
| 95 | + |
| 96 | +3. **Processed data**: |
| 97 | + - Generated by running the analysis scripts |
| 98 | + - Intermediate outputs are saved in the `data/processed/` subdirectories |
| 99 | + - Can be regenerated from raw data as needed |
| 100 | + |
| 101 | +## Notes |
| 102 | + |
| 103 | +- Raw data files are ignored by git (see `.gitignore`) |
| 104 | +- Only commit processed outputs that are small and essential |
| 105 | +- For reproducibility, document any data preprocessing steps |
| 106 | +- Consider a data versioning tool such as DVC for larger datasets |