
reject unknown fields #139

Merged
binaryDiv merged 8 commits into main from prevent-additional-properties on Apr 13, 2026

Conversation

@the-infinity
Contributor

For strict validation (e.g. validation services), it makes sense to be able to prevent additional properties. This MR provides that feature.

Open question: it might also make sense to enable "prevent additional attributes" via an environment variable or something similar, so that tests can run with strict validation and surface inconsistencies. One could just take os.env for that. Do you think this is a good idea? And if so, what should the env var be called? Just VALIDATACLASS_PREVENT_ADDITIONAL_ATTRIBUTES?

@binaryDiv
Contributor

Thanks for the PR!

First of all, terminology:

  1. In dataclasses (including validataclasses), we're usually speaking of "fields" rather than attributes or properties. Generally speaking, an attribute is an attribute in the literal sense of Python, a member variable of a class or object. A field is an attribute with a type annotation in a dataclass, for which an argument in __init__ is generated etc. In a validataclass, a field is an attribute that has a type annotation and a validator. More importantly, a field is what gets validated by the DataclassValidator. (And a property is a method that's decorated with @property.)
  2. By default, keys in an input dictionary are simply ignored if they don't have a corresponding field in the dataclass. We never get "additional attributes/fields" (this would imply they're added to and stored in the validated object and can be accessed).
  3. "Prevent" also doesn't imply an error, so "prevent additional fields" really just describes the default behaviour. We want to reject fields in the input data that don't exist in the dataclass, i.e. unknown or non-existent fields. That would also be more in line with the RejectValidator. In a way, your new option is kind of like setting a RejectValidator as the fallback validator for unknown fields.
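
The difference between ignoring and rejecting unknown keys can be sketched without the library. This is a toy model built on the standard library's dataclasses only; `load` and `UnknownFieldsError` are hypothetical names for illustration and are not part of validataclass, and no per-field validation is performed.

```python
from dataclasses import dataclass, fields


class UnknownFieldsError(ValueError):
    """Hypothetical error for input keys that have no matching field."""


@dataclass
class User:
    name: str
    age: int


def load(cls, data, reject_unknown_fields=False):
    # Toy stand-in for a dataclass validator (assumption, not library code).
    known = {f.name for f in fields(cls)}
    unknown = sorted(set(data) - known)
    if unknown and reject_unknown_fields:
        raise UnknownFieldsError(f"unknown fields: {unknown}")
    # Default behaviour: keys without a corresponding field are dropped,
    # so they never end up on the resulting object.
    return cls(**{key: value for key, value in data.items() if key in known})


payload = {"name": "Ada", "age": 36, "nickname": "al"}
user = load(User, payload)  # "nickname" is silently ignored
```

Calling `load(User, payload, reject_unknown_fields=True)` with the same payload raises `UnknownFieldsError` instead of dropping the key.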

I would strongly suggest using the wording that I already proposed in our call last week: "reject unknown fields".

Secondly, about the implementation approach: Currently it's a setting on a validataclass that gets read by the DataclassValidator. That means a validataclass will always either allow or reject unknown fields. Instead, I would suggest to leave the validataclasses untouched and add the setting to the DataclassValidator directly. This allows more flexibility, because you can use the same validataclass with a different validator to either allow or reject unknown fields. Which also makes it easier to allow/reject based on the context or environment, because you don't need to change the validataclass for that, just the validator.

Which brings me to the third point: "One could just take os.env for that." - I'm against this for multiple reasons. First, it adds a kind of complexity and dependency on the user application (and even the OS) that I don't think belongs in this library. The library is designed to be simple and relatively agnostic of how it's used. Second, we already have a system for context-sensitive validation (context arguments). If anything, we should use that. But I also think we shouldn't build something highly application-dependent into the library, but rather design it in a way that enables library users to implement it themselves in whatever way best fits their use case. And I think by moving the setting from the validataclass decorator to the DataclassValidator, we already give the user full flexibility in how to use this new feature - for example by subclassing and extending the DataclassValidator.

For example, if you have a project where unknown fields should always (or in most cases) be rejected, you can subclass the DataclassValidator and just set reject_unknown_fields=True as the default. If you want to set this depending on the context (e.g. for debugging purposes based on the app config or even for single API requests if a ?debug=true URL parameter is set or something like that), you can subclass the DataclassValidator and set the field depending on the context in whatever way works best for your application (current_app.config, os.env, or pass it as a context argument).
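
A rough sketch of the two subclassing patterns described above. The `DataclassValidator` base class here is a minimal stand-in so the example runs on its own; the real constructor signature may differ. `StrictDataclassValidator`, `EnvAwareDataclassValidator` and the variable `MYAPP_REJECT_UNKNOWN_FIELDS` are hypothetical names.

```python
import os


class DataclassValidator:
    """Minimal stand-in for the library class (assumption, not the real API)."""

    def __init__(self, dataclass_cls=None, *, reject_unknown_fields=False):
        self.dataclass_cls = dataclass_cls
        self.reject_unknown_fields = reject_unknown_fields


class StrictDataclassValidator(DataclassValidator):
    """Project-wide variant: rejects unknown fields unless told otherwise."""

    def __init__(self, dataclass_cls=None, *, reject_unknown_fields=True):
        super().__init__(dataclass_cls, reject_unknown_fields=reject_unknown_fields)


class EnvAwareDataclassValidator(DataclassValidator):
    """Application-specific variant: the application, not the library, reads the env."""

    def __init__(self, dataclass_cls=None):
        # Hypothetical variable name; the value is read when the validator is created.
        strict = os.environ.get("MYAPP_REJECT_UNKNOWN_FIELDS") == "1"
        super().__init__(dataclass_cls, reject_unknown_fields=strict)
```

The same pattern would work with an app config object or a context argument in place of the environment lookup.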

Would you agree with all this or do you see problems / have better ideas?

@the-infinity
Contributor Author

Will rename the field accordingly. I just came from the JSON Schema world, where this makes sense.

About the DataclassValidator: it makes sense to have them both. The decorator approach makes the most sense for JSON Schema translations, where additionalProperties is a property of the object definition. If you auto-translate JSON Schema into validataclasses and thereby create a library of validated objects, you don't know where it's used, so having the setting on the dataclass directly is important there. It can of course be overridden (or set for the first time) at the DataclassValidator. So why not fulfil both needs? Will add it to the MR.
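
To illustrate the JSON Schema translation case, here is a small sketch of how a generator might derive the per-dataclass default from an object schema. The function name is hypothetical; the only fact relied on is that JSON Schema allows additional properties unless `additionalProperties` is set to `false`.

```python
def reject_unknown_fields_from_schema(object_schema: dict) -> bool:
    # Only an explicit `false` forbids additional properties. A missing keyword
    # or a sub-schema is treated as "do not reject" here, which is a
    # simplification: a sub-schema actually constrains the extra values.
    return object_schema.get("additionalProperties", True) is False


closed_schema = {"type": "object", "additionalProperties": False}
open_schema = {"type": "object"}
typed_schema = {"type": "object", "additionalProperties": {"type": "string"}}
```

A generator could pass the result as the default for the generated dataclass, leaving individual validators free to override it.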

About os.env: I like the idea of running CI tests in strict mode while normal operation stays lenient. With CI in strict mode, you can catch fields in test data that were accidentally left unhandled. The general reason: I prefer systems that check my code over manual work. It also makes a lot of sense for DATEX2 validation: DATEX2 consists of several hundred validataclasses. You don't want to subclass all of them and rebuild the whole validataclass tree if you want a not-so-strict normal input validation but also a strict web validator that actually checks whether your DATEX2 is valid. This applies to any other complex data model, too: subclassing a specific validataclass means you have to rebuild the whole tree of validators. Subclassing to add a debug attribute means replacing all usages in a project with the subclassed ExtendedDataclassValidator, which is also quite a lot of change for a simple switch. I am open to other solutions here; they just should not end up causing too much work on the usage side. This is not important for the first step, but I would like to get to it somehow.

@binaryDiv
Contributor

> About the DataclassValidator: it makes sense to have them both.

So the validataclass can define the default behaviour and the DataclassValidator can override that. Yes, I think that makes sense. I thought about that too but thought it's a bit redundant, and my assumption was that in your project you would probably need that setting for every validataclass and not just for a specific subset of them. But if you have a good use case where it makes sense to have that as an inherent property of a validataclass, that makes sense.

> You don't want to subclass all of them and rebuild the whole validataclass tree

No, I think you misunderstood me. I wasn't speaking of subclassing the validataclass to add a debug flag, that really doesn't make sense. But it does make sense to define a custom DataclassValidator for your specific application that you can adjust to your needs, rather than enforcing a very specific way to enable a feature via env variable. That's also one of the main design goals of this library: Keep things simple but extendable. It's not a workaround to subclass the DataclassValidator, that's intended usage. You can also keep the name "DataclassValidator", then you don't need to change every usage but only the import lines.

What complicates things is also that you might not actually want "debug mode" to affect every DataclassValidator. Unknown fields are not always an error case; sometimes you explicitly want to validate only a subset of fields (think of validating responses to outgoing API requests: you may be interested in only 2 fields of the entire response, so why validate the entire response?). This use case is easily solved with the subclassing approach: just use the regular DataclassValidator for those objects. If it's a built-in feature of the DataclassValidator that reads os.env or something like that, you also need to build in a way to "force ignore" unknown fields, so now you have a lot of possible combinations of "flag in validataclass", "flag in DataclassValidator that overrides flag in validataclass", "flag in env that overrides other flags", "flag that overrides the env flag" etc...

There may be ways to provide a simple and usage-agnostic way to do this, like some sort of global context/configuration of library behavior, but I think it's out of scope for this PR because there's a lot of things to consider to do that right. For now, please just leave it at 1. an option in the validataclass decorator to set the default behaviour, 2. an option in the DataclassValidator to override the default behaviour.
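
The agreed two-level precedence can be sketched as a small resolution function. This assumes `None` stands for "not set" at either level; whether the merged implementation represents it that way is an assumption, and the function name is hypothetical.

```python
from typing import Optional


def resolve_reject_unknown_fields(
    validator_option: Optional[bool],
    dataclass_default: Optional[bool],
) -> bool:
    # 2. An explicit option on the validator overrides everything else.
    if validator_option is not None:
        return validator_option
    # 1. Otherwise the default set via the dataclass decorator applies.
    if dataclass_default is not None:
        return dataclass_default
    # With neither set, unknown fields are ignored, as before this PR.
    return False
```

Keeping it to two levels avoids the combinatorial set of overriding flags described above.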

@the-infinity force-pushed the prevent-additional-properties branch from 14ca883 to 81e5b42 on February 18, 2026 20:02
@the-infinity changed the title from "prevent additional properties" to "reject unknown fields" on Feb 18, 2026
@the-infinity
Contributor Author

Changed the naming and added the DataclassValidator option. I postponed the env thing, as I want to play around with it a bit before jumping to conclusions.

Comment thread src/validataclass/exceptions/dataclass_exceptions.py Outdated
Comment thread src/validataclass/dataclasses/validataclass.py Outdated
Comment thread docs/05-dataclasses.md Outdated
Comment thread src/validataclass/dataclasses/validataclass.py
@binaryDiv added the "new feature" label (New feature that is not (only) a validator class) on Mar 16, 2026
Comment thread src/validataclass/dataclasses/validataclass.py Outdated
@the-infinity force-pushed the prevent-additional-properties branch from 7a4019c to 6ea7ff3 on March 27, 2026 08:16
@binaryDiv changed the base branch from main to dev-mypy on April 8, 2026 14:37
Comment thread src/validataclass/dataclasses/validataclass.py
Base automatically changed from dev-mypy to main on April 13, 2026 11:33
@binaryDiv force-pushed the prevent-additional-properties branch from 1c6b1fe to 330bb38 on April 13, 2026 11:36
@binaryDiv self-requested a review on April 13, 2026 11:41
@binaryDiv merged commit 6c58904 into main on Apr 13, 2026
6 checks passed
@binaryDiv deleted the prevent-additional-properties branch on April 13, 2026 11:42