Zenodo analysis: material category section empty due to PQG schema change #57

@rdhyee

Problem

The Material Category Analysis section on tutorials/zenodo_isamples_analysis.qmd shows empty results after the column alias fix in PR #56.

Root cause

The Jan 2026 wide parquet (isamples_202601_wide_h3.parquet) stores material categories as p__has_material_category BIGINT[] — an array of row IDs (foreign keys) pointing to IdentifiedConcept nodes in the narrow format. The old export format had has_material_category as a plain string ("rock", "sediment", etc.).

The fix in PR #56 maps has_material_category to NULL so the page loads without errors, but as a result the material breakdown charts are empty.

Fix options

  1. Pre-compute a lookup table in a small parquet file mapping row IDs → concept labels (similar to facet_summaries.parquet), and join at query time
  2. Add denormalized string columns to a future wide parquet build (e.g., has_material_category_label VARCHAR)
  3. Rewrite queries to join wide + narrow at runtime (expensive for browser-based DuckDB-WASM)

Option 1 is probably the best balance of effort vs. result.
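
A minimal sketch of what Option 1 amounts to, written in plain Python rather than DuckDB so it runs anywhere. It resolves each row ID in p__has_material_category through a lookup table and counts samples per label. The row IDs, labels, sample names, and the function name material_counts are all hypothetical, made up for illustration; the real lookup would come from the pre-computed parquet file.

```python
from collections import Counter

# Hypothetical lookup table: IdentifiedConcept row ID -> concept label.
# In Option 1 this would be loaded from the small pre-computed parquet file.
concept_labels = {101: "rock", 102: "sediment", 103: "soil"}

# Hypothetical wide-format rows. p__has_material_category is an array of
# row IDs (or missing), not a string as in the old export format.
wide_rows = [
    {"sample": "s1", "p__has_material_category": [101]},
    {"sample": "s2", "p__has_material_category": [102, 103]},
    {"sample": "s3", "p__has_material_category": [101]},
    {"sample": "s4", "p__has_material_category": None},
    {"sample": "s5", "p__has_material_category": [999]},  # ID absent from lookup
]


def material_counts(rows, labels):
    """Count samples per material label by resolving row IDs via the lookup."""
    counts = Counter()
    for row in rows:
        # Treat a missing array as empty, then expand it one ID at a time.
        for concept_id in row["p__has_material_category"] or []:
            label = labels.get(concept_id)
            if label is not None:  # inner-join semantics: drop unknown IDs
                counts[label] += 1
    return counts


counts = material_counts(wide_rows, concept_labels)
for label, n in counts.most_common():
    print(label, n)
```

In DuckDB the same shape would presumably be an unnest of the array column followed by a join against the lookup table and a GROUP BY on the label. A sample with several categories is counted once per category here, which may or may not match what the old string-based charts did.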

Affected sections

  • Section 9: Material Category Analysis (bar chart, drill-down by source)
  • Any query referencing has_material_category as a string

Context

Labels: bug
