rustdoc-json: switch from HashMap to FxHashMap to fix non-determinism#108626
rustdoc-json: switch from HashMap to FxHashMap to fix non-determinism#108626bors merged 1 commit intorust-lang:masterfrom
Conversation
|
r? @jsha (rustbot has picked a reviewer for you, use r? to override) |
|
rustdoc-json-types is a public (although nightly-only) API. If possible, consider changing |
e921fe9 to
02b32e6
Compare
3f26808 to
098cf6d
Compare
|
I'm a bit nervous about the performance impact of this, as it changes the core data structure of rustdoc-json. Unfortunately we don't have any automated way to see if this actually matters (cc rust-lang/rustc-perf#1512) Could you use a |
I don't think this is true? FxHashMap is just changing the default hasher for performance, it doesn't affect the order. I vaguely recall there's a StableHasher type of something but it's not the same as the FxHasher. |
The relevant difference is that two FxHashMaps with the same order of inserts will always iterate in the same order, as they don't randomize the hash seed (unlike |
As part of my performance work on I found that using This doesn't say that we shouldn't replace If anyone would like to repeat my experiments, I'd be happy to help you get set up. And if |
I didn't want to involve For the performance concerns, here is a basic demonstration from my desktop machine. using `BTreeMap`using `FxHashMap`
using `HashMap`Those runs were on the precompiled bootstrap/compiler/library. All went directly to generating the json files. Machine specs Regarding to the results, |
It's fine, as this is only ment for use in the rust-lang/rust repo. The reexport I maintain for crates.io already handles this. While it's important that this library can be used on crates.io and stable, adding |
I will push it using |
|
These commits modify the If this was intentional then you can ignore this comment. |
278535c to
df061c8
Compare
I'm not sure what the policy on this is. I've asked on zulip |
|
Turns out it's fine. Thanks for fixing this. @bors r+ |
Signed-off-by: ozkanonur <work@onurozkan.dev>
3f8d3ab to
52c71e6
Compare
|
r=me when CI's green |
|
Could not assign reviewer from: |
|
@bors r+ |
…iaskrgr Rollup of 7 pull requests Successful merges: - rust-lang#106440 (Ignore files in .gitignore in tidy) - rust-lang#108613 (Remove `llvm.skip-rebuild` option) - rust-lang#108616 (Sync codegen defaults with compiler defaults and add a ping message so they stay in sync) - rust-lang#108618 (Rename `src/etc/vscode_settings.json` to `rust_analyzer_settings.json`) - rust-lang#108626 (rustdoc-json: switch from HashMap to FxHashMap to fix non-determinism) - rust-lang#108744 (Don't ICE when encountering bound var in builtin copy/clone bounds) - rust-lang#108749 (Clean up rustdoc-js tester.js file) Failed merges: r? `@ghost` `@rustbot` modify labels: rollup
`rustc_data_structures` rust-lang/rust#108626
The original motivation was me trying to remove the `#![allow(rustc::default_hash_types)]`, as after rust-lang#108626, we should be using `FxHashMap` here. I then realized I should also be able to remove the `.into_iter().collect()`, as we no longer need to convert from `FxHashMap` to `std::HashMap`. However, this didn't work, and I got the following error ``` error[E0308]: mismatched types --> src/librustdoc/json/mod.rs:235:13 | 235 | index, | ^^^^^ expected `rustc_hash::FxHasher`, found `FxHasher` | = note: `FxHasher` and `rustc_hash::FxHasher` have similar names, but are actually distinct types note: `FxHasher` is defined in crate `rustc_hash` --> /cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 note: `rustc_hash::FxHasher` is defined in crate `rustc_hash` --> /home/alona/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 | 60 | pub struct FxHasher { | ^^^^^^^^^^^^^^^^^^^ = note: perhaps two different versions of crate `rustc_hash` are being used? ``` This is because `librustdoc` got it's `FxHashMap` via the sysroot `rustc-data-strucures`, whereas `rustdoc-json-types` got it via `rustc-hash` on crates.io. To avoid this, `rustdoc-json-types` now uses `#![feature(rustc_private)]` to load the same version as `librustdoc`. However, this needs to be placed behind a feature, as `rustdoc-json-types` is also dependency of `src/tools/jsondocck`, which means need needs to be buildable without nightly features.
Use same `FxHashMap` in `rustdoc-json-types` and `librustdoc`. The original motivation was me trying to remove the `#![allow(rustc::default_hash_types)]`, as after rust-lang#108626, we should be using `FxHashMap` here. I then realized I should also be able to remove the `.into_iter().collect()`, as we no longer need to convert from `FxHashMap` to `std::HashMap`. However, this didn't work, and I got the following error ``` error[E0308]: mismatched types --> src/librustdoc/json/mod.rs:235:13 | 235 | index, | ^^^^^ expected `rustc_hash::FxHasher`, found `FxHasher` | = note: `FxHasher` and `rustc_hash::FxHasher` have similar names, but are actually distinct types note: `FxHasher` is defined in crate `rustc_hash` --> /cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 note: `rustc_hash::FxHasher` is defined in crate `rustc_hash` --> /home/alona/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 | 60 | pub struct FxHasher { | ^^^^^^^^^^^^^^^^^^^ = note: perhaps two different versions of crate `rustc_hash` are being used? ``` This is because `librustdoc` got it's `FxHashMap` via the sysroot `rustc-data-strucures`, whereas `rustdoc-json-types` got it via `rustc-hash` on crates.io. To avoid this, `rustdoc-json-types` now uses `#![feature(rustc_private)]` to load the same version as `librustdoc`. However, this needs to be placed behind a feature, as `rustdoc-json-types` is also dependency of `src/tools/jsondocck`, which means need needs to be buildable without nightly features. r? `@jyn514`
The original motivation was me trying to remove the `#![allow(rustc::default_hash_types)]`, as after rust-lang#108626, we should be using `FxHashMap` here. I then realized I should also be able to remove the `.into_iter().collect()`, as we no longer need to convert from `FxHashMap` to `std::HashMap`. However, this didn't work, and I got the following error ``` error[E0308]: mismatched types --> src/librustdoc/json/mod.rs:235:13 | 235 | index, | ^^^^^ expected `rustc_hash::FxHasher`, found `FxHasher` | = note: `FxHasher` and `rustc_hash::FxHasher` have similar names, but are actually distinct types note: `FxHasher` is defined in crate `rustc_hash` --> /cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 note: `rustc_hash::FxHasher` is defined in crate `rustc_hash` --> /home/alona/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 | 60 | pub struct FxHasher { | ^^^^^^^^^^^^^^^^^^^ = note: perhaps two different versions of crate `rustc_hash` are being used? ``` This is because `librustdoc` got it's `FxHashMap` via the sysroot `rustc-data-strucures`, whereas `rustdoc-json-types` got it via `rustc-hash` on crates.io. To avoid this, `rustdoc-json-types` now uses `#![feature(rustc_private)]` to load the same version as `librustdoc`. However, this needs to be placed behind a feature, as `rustdoc-json-types` is also dependency of `src/tools/jsondocck`, which means need needs to be buildable without nightly features.
Use same `FxHashMap` in `rustdoc-json-types` and `librustdoc`. The original motivation was me trying to remove the `#![allow(rustc::default_hash_types)]`, as after rust-lang#108626, we should be using `FxHashMap` here. I then realized I should also be able to remove the `.into_iter().collect()`, as we no longer need to convert from `FxHashMap` to `std::HashMap`. However, this didn't work, and I got the following error ``` error[E0308]: mismatched types --> src/librustdoc/json/mod.rs:235:13 | 235 | index, | ^^^^^ expected `rustc_hash::FxHasher`, found `FxHasher` | = note: `FxHasher` and `rustc_hash::FxHasher` have similar names, but are actually distinct types note: `FxHasher` is defined in crate `rustc_hash` --> /cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 note: `rustc_hash::FxHasher` is defined in crate `rustc_hash` --> /home/alona/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 | 60 | pub struct FxHasher { | ^^^^^^^^^^^^^^^^^^^ = note: perhaps two different versions of crate `rustc_hash` are being used? ``` This is because `librustdoc` got it's `FxHashMap` via the sysroot `rustc-data-strucures`, whereas `rustdoc-json-types` got it via `rustc-hash` on crates.io. To avoid this, `rustdoc-json-types` now uses `#![feature(rustc_private)]` to load the same version as `librustdoc`. However, this needs to be placed behind a feature, as `rustdoc-json-types` is also dependency of `src/tools/jsondocck`, which means need needs to be buildable without nightly features. r? `@jyn514`
Use same `FxHashMap` in `rustdoc-json-types` and `librustdoc`. The original motivation was me trying to remove the `#![allow(rustc::default_hash_types)]`, as after rust-lang#108626, we should be using `FxHashMap` here. I then realized I should also be able to remove the `.into_iter().collect()`, as we no longer need to convert from `FxHashMap` to `std::HashMap`. However, this didn't work, and I got the following error ``` error[E0308]: mismatched types --> src/librustdoc/json/mod.rs:235:13 | 235 | index, | ^^^^^ expected `rustc_hash::FxHasher`, found `FxHasher` | = note: `FxHasher` and `rustc_hash::FxHasher` have similar names, but are actually distinct types note: `FxHasher` is defined in crate `rustc_hash` --> /cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 note: `rustc_hash::FxHasher` is defined in crate `rustc_hash` --> /home/alona/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 | 60 | pub struct FxHasher { | ^^^^^^^^^^^^^^^^^^^ = note: perhaps two different versions of crate `rustc_hash` are being used? ``` This is because `librustdoc` got it's `FxHashMap` via the sysroot `rustc-data-strucures`, whereas `rustdoc-json-types` got it via `rustc-hash` on crates.io. To avoid this, `rustdoc-json-types` now uses `#![feature(rustc_private)]` to load the same version as `librustdoc`. However, this needs to be placed behind a feature, as `rustdoc-json-types` is also dependency of `src/tools/jsondocck`, which means need needs to be buildable without nightly features. r? ``@jyn514``
The original motivation was me trying to remove the `#![allow(rustc::default_hash_types)]`, as after rust-lang#108626, we should be using `FxHashMap` here. I then realized I should also be able to remove the `.into_iter().collect()`, as we no longer need to convert from `FxHashMap` to `std::HashMap`. However, this didn't work, and I got the following error ``` error[E0308]: mismatched types --> src/librustdoc/json/mod.rs:235:13 | 235 | index, | ^^^^^ expected `rustc_hash::FxHasher`, found `FxHasher` | = note: `FxHasher` and `rustc_hash::FxHasher` have similar names, but are actually distinct types note: `FxHasher` is defined in crate `rustc_hash` --> /cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 note: `rustc_hash::FxHasher` is defined in crate `rustc_hash` --> /home/alona/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 | 60 | pub struct FxHasher { | ^^^^^^^^^^^^^^^^^^^ = note: perhaps two different versions of crate `rustc_hash` are being used? ``` This is because `librustdoc` got it's `FxHashMap` via the sysroot `rustc-data-strucures`, whereas `rustdoc-json-types` got it via `rustc-hash` on crates.io. To avoid this, `rustdoc-json-types` now uses `#![feature(rustc_private)]` to load the same version as `librustdoc`. However, this needs to be placed behind a feature, as `rustdoc-json-types` is also dependency of `src/tools/jsondocck`, which means need needs to be buildable without nightly features.
The original motivation was me trying to remove the `#![allow(rustc::default_hash_types)]`, as after rust-lang#108626, we should be using `FxHashMap` here. I then realized I should also be able to remove the `.into_iter().collect()`, as we no longer need to convert from `FxHashMap` to `std::HashMap`. However, this didn't work, and I got the following error ``` error[E0308]: mismatched types --> src/librustdoc/json/mod.rs:235:13 | 235 | index, | ^^^^^ expected `rustc_hash::FxHasher`, found `FxHasher` | = note: `FxHasher` and `rustc_hash::FxHasher` have similar names, but are actually distinct types note: `FxHasher` is defined in crate `rustc_hash` --> /cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 note: `rustc_hash::FxHasher` is defined in crate `rustc_hash` --> /home/alona/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 | 60 | pub struct FxHasher { | ^^^^^^^^^^^^^^^^^^^ = note: perhaps two different versions of crate `rustc_hash` are being used? ``` This is because `librustdoc` got it's `FxHashMap` via the sysroot `rustc-data-strucures`, whereas `rustdoc-json-types` got it via `rustc-hash` on crates.io. To avoid this, `rustdoc-json-types` now uses `#![feature(rustc_private)]` to load the same version as `librustdoc`. However, this needs to be placed behind a feature, as `rustdoc-json-types` is also dependency of `src/tools/jsondocck`, which means need needs to be buildable without nightly features.
The original motivation was me trying to remove the `#![allow(rustc::default_hash_types)]`, as after rust-lang#108626, we should be using `FxHashMap` here. I then realized I should also be able to remove the `.into_iter().collect()`, as we no longer need to convert from `FxHashMap` to `std::HashMap`. However, this didn't work, and I got the following error ``` error[E0308]: mismatched types --> src/librustdoc/json/mod.rs:235:13 | 235 | index, | ^^^^^ expected `rustc_hash::FxHasher`, found `FxHasher` | = note: `FxHasher` and `rustc_hash::FxHasher` have similar names, but are actually distinct types note: `FxHasher` is defined in crate `rustc_hash` --> /cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 note: `rustc_hash::FxHasher` is defined in crate `rustc_hash` --> /home/alona/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rustc-hash-1.1.0/src/lib.rs:60:1 | 60 | pub struct FxHasher { | ^^^^^^^^^^^^^^^^^^^ = note: perhaps two different versions of crate `rustc_hash` are being used? ``` This is because `librustdoc` got it's `FxHashMap` via the sysroot `rustc-data-strucures`, whereas `rustdoc-json-types` got it via `rustc-hash` on crates.io. To avoid this, `rustdoc-json-types` now uses `#![feature(rustc_private)]` to load the same version as `librustdoc`. However, this needs to be placed behind a feature, as `rustdoc-json-types` is also dependency of `src/tools/jsondocck`, which means need needs to be buildable without nightly features.
Using
HashMapinrustdoc_json_types::Cratewere causing creating randomly ordered objects in the json doc files. Which might cause problems to people who are doing comparison on those files specially in CI pipelines. See #103785 (comment)This PR fixes that issue and extends the coverage of
tests/run-make/rustdoc-verify-output-filestesting ability.