@MaxGhenis
Contributor

Summary

Fixes #427

In pandas 3.0, string columns use StringDtype by default. When a pandas Series has a non-integer index (e.g., a string index), array[0] performs a label-based lookup (looking for the label 0) instead of positional access, raising KeyError: 0.

This broke policyengine-us when pandas 3.0 support was enabled. Specifically, the county variable tests failed with:

policyengine_core/enums/enum.py:69: in encode
    if len(array) > 0 and isinstance(array[0], Enum):
                                      ^^^^^^^^
E   KeyError: 0

Changes

  • Modified Enum.encode() to use .iloc[0] for pandas Series to ensure positional access
  • Added two tests for encoding pandas Series containing Enum items:
    • test_enum_encode_pandas_series_with_enum_items - basic Series with default index
    • test_enum_encode_pandas_series_with_string_index - Series with string index (the failing case)
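
For reviewers who want to see the idea outside the codebase, here is a minimal sketch of the positional-access check described above. It is not the actual Enum.encode() code: County, first_item and contains_enum_items are hypothetical names used only for illustration, and the Series contents are made up.

```python
from enum import Enum

import numpy as np
import pandas as pd


class County(Enum):
    # Hypothetical stand-in for an Enum variable's possible values.
    ALAMEDA = "Alameda"
    KINGS = "Kings"


def first_item(array):
    """Return the first element by position, regardless of index labels."""
    if isinstance(array, pd.Series):
        # .iloc is always positional, so a string index cannot turn
        # the 0 into a label lookup.
        return array.iloc[0]
    # NumPy arrays and lists are already positional.
    return array[0]


def contains_enum_items(array):
    """Mirror the shape of the failing check, using positional access."""
    return len(array) > 0 and isinstance(first_item(array), Enum)


# The failing case: a Series of Enum items with a string index.
string_indexed = pd.Series([County.KINGS, County.ALAMEDA], index=["a", "b"])
# The basic case: a Series with the default integer index.
default_indexed = pd.Series([County.KINGS, County.ALAMEDA])
# Non-Series input takes the plain indexing path.
plain_array = np.array([County.KINGS, County.ALAMEDA], dtype=object)
```

The isinstance branch keeps plain indexing for NumPy arrays and lists, which have no .iloc accessor.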

Test plan

  • New tests pass locally
  • All existing enum tests pass
  • CI passes

🤖 Generated with Claude Code

…index

In pandas 3.0, string columns use StringDtype by default. When a Series
has a string index, array[0] does a label-based lookup (looking for the
label 0) instead of positional access, causing KeyError.

The fix uses .iloc[0] for pandas Series to ensure positional access.

Fixes #427

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@MaxGhenis force-pushed the fix-pandas3-enum-encoding branch from f4d2fe3 to 9ad42b6 on January 25, 2026 04:07
@MaxGhenis merged commit 13bf823 into master on Jan 25, 2026
14 checks passed
@MaxGhenis deleted the fix-pandas3-enum-encoding branch on January 25, 2026 04:19
Development

Successfully merging this pull request may close these issues.

Pandas 3.0: KeyError when encoding Enum variables with StringDtype index
