Skip to content

feat(MachineLearning/PACLearning): VC dimension#563

Open
SamuelSchlesinger wants to merge 1 commit into
leanprover:mainfrom
SamuelSchlesinger:feat/pac-learning-vc-dim
Open

feat(MachineLearning/PACLearning): VC dimension#563
SamuelSchlesinger wants to merge 1 commit into
leanprover:mainfrom
SamuelSchlesinger:feat/pac-learning-vc-dim

Conversation

@SamuelSchlesinger
Copy link
Copy Markdown
Contributor

Adds the Vapnik-Chervonenkis dimension for binary concept classes C : ConceptClass α Bool, with:

  • SetShatters C W: C shatters a set W if every W' ⊆ W arises as the intersection of some c ⁻¹' {true} with W for c ∈ C.
  • SetShatters.subset / SetShatters.superset: anti/monotonicity in the shattered set and the concept class.
  • Finset.Shatters.toSetShatters: bridge from Mathlib's Finset.Shatters to SetShatters via characteristic functions.
  • vcDim C: VC dimension as sSup of shattered finite-set cardinalities.
  • HasFiniteVCDim C and hasFiniteVCDim_iff: the predicate making vcDim mathematically meaningful (uniform bound on shattered cardinalities), required as a hypothesis by downstream vcDim-stated lower bounds.

First layer of a stacked PR sequence formalizing the EHKV lower bound on PAC sample complexity.

Adds the Vapnik-Chervonenkis dimension for binary concept classes
`C : ConceptClass α Bool`, with:

- `SetShatters C W`: `C` shatters a set `W` if every `W' ⊆ W` arises as the
  intersection of some `c ⁻¹' {true}` with `W` for `c ∈ C`.
- `SetShatters.subset` / `SetShatters.superset`: anti/monotonicity in the
  shattered set and the concept class.
- `Finset.Shatters.toSetShatters`: bridge from Mathlib's `Finset.Shatters`
  to `SetShatters` via characteristic functions.
- `vcDim C`: VC dimension as `sSup` of shattered finite-set cardinalities.
- `HasFiniteVCDim C` and `hasFiniteVCDim_iff`: the predicate making `vcDim`
  mathematically meaningful (uniform bound on shattered cardinalities),
  required as a hypothesis by downstream `vcDim`-stated lower bounds.

First layer of a stacked PR sequence formalizing the EHKV lower bound
on PAC sample complexity.
Copy link
Copy Markdown
Collaborator

@fmontesi fmontesi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good enough to me for merging, but would be nice to get an extra confirmation from somebody more familiar with the domain.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants