Skip to content

Make IdTag typesafe#6574

Merged
danakj merged 9 commits intocarbon-language:trunkfrom
danakj:id-tag-typesafe
Jan 13, 2026
Merged

Make IdTag typesafe#6574
danakj merged 9 commits intocarbon-language:trunkfrom
danakj:id-tag-typesafe

Conversation

@danakj
Copy link
Copy Markdown
Contributor

@danakj danakj commented Jan 9, 2026

The IdTag knows the type of the Id its tagging and the type of the Id being used as the tag. This prevents mixing up tagged and untagged ids, and avoids having to work with untyped integers.

Adds an Untagged marker struct that's used as the tag type in IdTag when no tag is desired.

The complexity of ConstantIds and TypeIds became a bit visible: TypeIds are concrete ConstantIds. And ConstantIds have two different tagging schemes, one for concrete and one for symbolic ids. And ConstantIds are actually re-cast InstIds with the same index. The LoweredTypeStore needs to work with tagged TypeIds, but the tags actually come from an InstId store in ConstantValueStore. Now this is expressed in the type system by getting the tags for TypeIds from the ConstantValueStore.

ValueStores without an TagId type parameter are now visibly untagged.

IdTag is now only default constructible when it does not have a tag, which means ValueStore is only default constructible when the TagId is untagged. This forces tagged value stores to be constructed correctly with a tag at compile time, and untagged ones to be constructed without.

FixedSizeValueStore has overloads for dealing with tagged and untagged Ids, since it can't default-construct ValueStore for tagged ids, and no longer requires passing in default-constructed tags when there is no tag in the ids.

@danakj danakj requested a review from a team as a code owner January 9, 2026 16:14
@danakj danakj requested review from chandlerc and removed request for a team January 9, 2026 16:14
Copy link
Copy Markdown
Contributor

@chandlerc chandlerc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the direction here, but currently it feels unfortunate how much duplication there is. Specifically:

  • Duplicating the id type: FooStore<InstId, ..., IdTag<InstId, ...>>
  • Writing CheckIRId everywhere...

I wonder if we could do something a bit more terse... My current idea is:

  • Use template aliases for Untagged<IdT> and TagId<IdT> to select between Unchecked vs. CheckIRId for the second template parameter.
    • If we do have other checked tags, create relevant aliases for them such as IRTagId, SymbolicTagId, ...
  • Use a template-template argument to the stores so we can just pass Untagged or TagId, and have it apply the IdT to it.

But not sure if that will work -- template template arguments are ... fussy. Let me know if not and if you don't see other easy ways to achieve this.

Comment thread toolchain/base/id_tag.h Outdated
CARBON_DCHECK(index >= 0, "{0}", index);
if (index < initial_reserved_ids_) {
return index;
return IdT{index};
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We generally use ()s to construct objects unless that doesn't compile or we're building some kind of container/tuple/struct of elements? This doesn't really feel like a container to me...

Maybe the Abseil tip is a useful heuristic here? https://abseil.io/tips/88

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Or let me know if I'm wrong and this actually is a case that doesn't compile w/o {}s...)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This choice here (and for TagIdT) was around the fact the type could be an integer and we like to use {} for those cuz we get compile-time errors for narrowing. But at this point I think there are no integers, so probably not gaining anything. The initialization store inside a template is a little unfortunate (or in general but yeah).

Comment thread toolchain/base/id_tag.h Outdated
Comment thread toolchain/check/context.h
Comment thread toolchain/sem_ir/ids.h Outdated
// IdTags for ConstantIds are slightly complex, and you need to know if the
// constant is concrete or symbolic to know its tag:
// - Concrete ConstantIds use the tag of the store of InstIds.
// - Symbolic ConstantIds are tagged by the internal SymbolicId.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are they tagged by the SymbolicId, or by the CheckIRId associated with the SymbolicId?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, fixed, it's the tag from the SymbolicId store

Comment thread toolchain/base/id_tag.h Outdated
@danakj
Copy link
Copy Markdown
Contributor Author

danakj commented Jan 12, 2026

I like the direction here, but currently it feels unfortunate how much duplication there is. Specifically:

  • Duplicating the id type: FooStore<InstId, ..., IdTag<InstId, ...>>
  • Writing CheckIRId everywhere...

I wonder if we could do something a bit more terse... My current idea is:

  • Use template aliases for Untagged<IdT> and TagId<IdT> to select between Unchecked vs. CheckIRId for the second template parameter.

    • If we do have other checked tags, create relevant aliases for them such as IRTagId, SymbolicTagId, ...
  • Use a template-template argument to the stores so we can just pass Untagged or TagId, and have it apply the IdT to it.

But not sure if that will work -- template template arguments are ... fussy. Let me know if not and if you don't see other easy ways to achieve this.

I tried this out and it got complex, but then realized we don't need anything fancy template-wise. The use of IdTag as the store parameter was based on the idea that you might need ValueStore<T, ..., IdTag<U, ...>> but that Does Not Work, which is what led to the IdTag type-conversion methods in ConstantValueStore. So we can just use the TagId as the template parameter to ValueStore and similar, and construct the (somewhat redundant) IdTag type internally, and we get the same type safety without syntactic redundancy.

Copy link
Copy Markdown
Contributor Author

@danakj danakj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PTAL

Comment thread toolchain/base/id_tag.h Outdated
CARBON_DCHECK(index >= 0, "{0}", index);
if (index < initial_reserved_ids_) {
return index;
return IdT{index};
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This choice here (and for TagIdT) was around the fact the type could be an integer and we like to use {} for those cuz we get compile-time errors for narrowing. But at this point I think there are no integers, so probably not gaining anything. The initialization store inside a template is a little unfortunate (or in general but yeah).

Comment thread toolchain/base/id_tag.h Outdated
Comment thread toolchain/check/context.h
Comment thread toolchain/sem_ir/ids.h Outdated
// IdTags for ConstantIds are slightly complex, and you need to know if the
// constant is concrete or symbolic to know its tag:
// - Concrete ConstantIds use the tag of the store of InstIds.
// - Symbolic ConstantIds are tagged by the internal SymbolicId.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, fixed, it's the tag from the SymbolicId store

Comment thread toolchain/base/id_tag.h Outdated
@danakj danakj requested a review from chandlerc January 12, 2026 18:51
Copy link
Copy Markdown
Contributor

@chandlerc chandlerc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG

There are a few things that might be worth thinking about for follow-ups, but I'm not sure how to do them or if they're worth it, and they definitely don't seem like blockers...

One is it's a little hard on the reader to see ValueStore<FooId, ..., BarId> and remember that this is a store of FooIds, but tagged by BarIds. I almost want it to be CheckIRId::AsTag, and when constructing check_ir_id_.as_tag(). But not sure that would work, and it adds complexity that I'm not sure it pays for...

The other thing is that the Untagged default seems unfortunate for ValueStore... I think we want most of them to be tagged. But as you pointed out, we can't default to SemIR::CheckIRId in //toolchain/base. This makes me wonder whether we should have a ValueStore wrapper in SemIR that provides the default, and maybe not have a default in the //toolchain/base one? But very uncertain about this, just figured I'd mention it in case it sparks a good idea.

Comment thread toolchain/base/block_value_store.h
Comment thread toolchain/base/id_tag.h Outdated
@danakj
Copy link
Copy Markdown
Contributor Author

danakj commented Jan 13, 2026

LG

There are a few things that might be worth thinking about for follow-ups, but I'm not sure how to do them or if they're worth it, and they definitely don't seem like blockers...

One is it's a little hard on the reader to see ValueStore<FooId, ..., BarId> and remember that this is a store of FooIds, but tagged by BarIds. I almost want it to be CheckIRId::AsTag, and when constructing check_ir_id_.as_tag(). But not sure that would work, and it adds complexity that I'm not sure it pays for...

We could make the tag parameter either Untagged or Tag<TagIdT>, so then you have to write Tag<CheckIRId> instead. I have done that here now, PTAL and see what you think?

The other thing is that the Untagged default seems unfortunate for ValueStore... I think we want most of them to be tagged. But as you pointed out, we can't default to SemIR::CheckIRId in //toolchain/base. This makes me wonder whether we should have a ValueStore wrapper in SemIR that provides the default, and maybe not have a default in the //toolchain/base one? But very uncertain about this, just figured I'd mention it in case it sparks a good idea.

On top of the base issue, we want most of the ValueStores in semir and check to be tagged, But there are also value stores all over parse which don't use tagging. I am not super keen on a wrapper/alias in semir, as I feel like it would be obfuscating the types more than helping.

@danakj danakj requested a review from chandlerc January 13, 2026 17:16
@danakj danakj enabled auto-merge January 13, 2026 21:57
Copy link
Copy Markdown
Contributor

@chandlerc chandlerc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG!

@danakj danakj added this pull request to the merge queue Jan 13, 2026
Merged via the queue into carbon-language:trunk with commit c64117d Jan 13, 2026
8 checks passed
@danakj danakj deleted the id-tag-typesafe branch January 13, 2026 23:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants