@MaxGhenis
Contributor

Summary

  • Fixes partnership_se_income showing $0 in PUF-based datasets
  • The raw IRS PUF doesn't contain k1bx14p/k1bx14s columns directly - these need to be derived from source columns
  • Implements the taxdata derivation formula: partnership_se = (E30400 + E30500) - E00900 - E02100 (a later commit in this PR divides E30400 + E30500 by 0.9235 before subtracting)

Background

The previous code looked for k1bx14p and k1bx14s columns in the PUF and defaulted to 0 when they were not found. Because these columns never exist in the raw IRS PUF (PSLmodels/taxdata derives them in finalprep.py), partnership_se_income was always $0.

The derivation logic is:

  • E30400 = taxpayer's total SE taxable income (includes Sch C + Sch F + K-1 box 14)
  • E30500 = spouse's total SE taxable income
  • E00900 = Schedule C net profit/loss
  • E02100 = Schedule F farm income
  • K-1 Box 14 = Total SE - (Schedule C + Schedule F)

This ensures the SE tax calculation in PolicyEngine correctly includes partnership income for general partners.
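
The derivation above can be sketched in pandas as follows. This is a minimal illustration rather than the code in this PR: the function name `derive_partnership_se` and the example rows are hypothetical, and only the column names (E30400, E30500, E00900, E02100) come from the description.

```python
import pandas as pd


def derive_partnership_se(puf: pd.DataFrame) -> pd.Series:
    """K-1 box 14 SE income: total SE income minus Schedule C and Schedule F.

    Hypothetical helper; assumes all four columns are present and numeric.
    """
    total_se = puf["E30400"] + puf["E30500"]  # taxpayer + spouse
    return total_se - puf["E00900"] - puf["E02100"]


# Made-up records, for illustration only.
example = pd.DataFrame(
    {
        "E30400": [50_000.0, 20_000.0],
        "E30500": [0.0, 5_000.0],
        "E00900": [30_000.0, 25_000.0],
        "E02100": [0.0, 0.0],
    }
)
partnership_se = derive_partnership_se(example)
```

In the first row, $50,000 of total SE income less $30,000 of Schedule C profit leaves $20,000 attributed to K-1 box 14; in the second row Schedule C accounts for all SE income, so the residual is $0.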

Test plan

  • Verified derivation produces non-zero values: ~19,000 records with positive partnership_se_income
  • Verified weighted aggregate: $12.7B (reasonable given total SE income of ~$400B)
  • CI builds PUF-based datasets successfully
  • partnership_se_income appears with non-zero values in Enhanced CPS
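
The first two checks could be reproduced with something like the sketch below. It assumes the derived values and the record weights are available as aligned pandas Series; the function name `summarize_partnership_se` and the sample numbers are hypothetical and are not taken from the PUF.

```python
import pandas as pd


def summarize_partnership_se(values: pd.Series, weights: pd.Series) -> tuple[int, float]:
    """Return (number of records with positive values, weighted sum)."""
    positive_count = int((values > 0).sum())
    weighted_total = float((values * weights).sum())
    return positive_count, weighted_total


# Made-up values and weights, for illustration only.
values = pd.Series([20_000.0, 0.0, 5_000.0])
weights = pd.Series([100.0, 250.0, 40.0])
n_positive, weighted_total = summarize_partnership_se(values, weights)
```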

🤖 Generated with Claude Code

MaxGhenis and others added 3 commits January 25, 2026 20:18
…columns

The raw IRS PUF doesn't contain k1bx14p/k1bx14s columns - these are derived
by PSLmodels/taxdata from the total SE income (E30400/E30500) minus Schedule C
(E00900) and Schedule F (E02100) income.

This fix implements the same derivation logic from taxdata's finalprep.py
split_earnings_variables function. The formula is:

  partnership_se = (E30400 + E30500) - E00900 - E02100

This ensures partnership_se_income has non-zero values in the PUF-based
datasets, enabling accurate SE tax calculations for general partners.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The E30400/E30500 PUF columns are already TAXABLE SE income (post-0.9235
deduction factor). Since PolicyEngine applies the 0.9235 factor itself in
taxable_self_employment_income, we need to provide GROSS partnership SE income.

Changes:
- Gross up E30400+E30500 by dividing by 0.9235 before subtracting Sch C/F
- Only compute when partnership activity exists (E25940+E25980-E25920-E25960 != 0)

This aligns with Yale Budget Lab's Tax-Data approach in process_puf.R:
  part_se = if_else(E25940 + E25980 - E25920 - E25960 != 0,
                    (E30400 + E30500) / 0.9235 - E00900 - E02100, 0)

Weighted sum increases from $12.7B to $55.7B, which is more realistic given
total SE income of ~$400B.
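
A rough pandas/NumPy equivalent of the R expression above is sketched here. The function name `derive_gross_partnership_se`, the constant name, and the example rows are hypothetical; the column names and the 0.9235 factor come from this commit message.

```python
import numpy as np
import pandas as pd

SE_TAXABLE_FACTOR = 0.9235  # factor already applied to E30400/E30500 in the PUF


def derive_gross_partnership_se(puf: pd.DataFrame) -> pd.Series:
    """Gross partnership SE income, zero where there is no partnership activity.

    Hypothetical helper; assumes all referenced columns are present and numeric.
    """
    has_partnership = (
        puf["E25940"] + puf["E25980"] - puf["E25920"] - puf["E25960"]
    ) != 0
    # Undo the 0.9235 factor so the downstream model can apply it itself.
    gross_total_se = (puf["E30400"] + puf["E30500"]) / SE_TAXABLE_FACTOR
    residual = gross_total_se - puf["E00900"] - puf["E02100"]
    return pd.Series(np.where(has_partnership, residual, 0.0), index=puf.index)


# Made-up records: the first has partnership activity, the second does not.
example = pd.DataFrame(
    {
        "E30400": [92_350.0, 46_175.0],
        "E30500": [0.0, 0.0],
        "E00900": [40_000.0, 50_000.0],
        "E02100": [0.0, 0.0],
        "E25940": [10_000.0, 0.0],
        "E25980": [0.0, 0.0],
        "E25920": [0.0, 0.0],
        "E25960": [0.0, 0.0],
    }
)
gross_partnership_se = derive_gross_partnership_se(example)
```

For the first record, $92,350 of taxable SE income grosses up to about $100,000, and subtracting $40,000 of Schedule C profit leaves about $60,000. The second record has no partnership activity, so the result is 0.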

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@MaxGhenis MaxGhenis merged commit 674f566 into main Jan 26, 2026
7 checks passed
@MaxGhenis MaxGhenis deleted the derive-partnership-se-income branch January 26, 2026 02:48