[PERF CHECK ONLY] Add `inline` to trivial cross-crate accessors by traviscross · Pull Request #156343 · rust-lang/rust

traviscross · 2026-05-08T20:21:14Z

Note

This is a draft for running perf.

Small accessor functions aren't always inlined across crate boundaries; whether they are depends on the build configuration. Even with thin LTO, PGO, etc., the backend may be limited in what it's able and willing to inline without these annotations.

In detailed profiling, the `inline` attributes being added in this PR seemed to make a difference. Let's add them.

Notably, some of these fall within the expansion of the `newtype_index!` macro and will apply to the items it defines.
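As a rough illustration of the kind of change described here (not the actual diff), the following is a minimal sketch. It assumes a hypothetical `simple_index!` macro that stands in for `newtype_index!`; the real macro is more involved, and every name below is made up for illustration. The point is that an `#[inline]` written inside the macro body ends up on the accessors of every type the macro defines.

```rust
// Hypothetical stand-in for `newtype_index!`; not the compiler's real macro.
macro_rules! simple_index {
    ($name:ident) => {
        #[derive(Clone, Copy, Debug, PartialEq, Eq)]
        pub struct $name(u32);

        impl $name {
            /// Builds an index from a raw `u32`.
            #[inline]
            pub const fn from_u32(value: u32) -> Self {
                Self(value)
            }

            /// Returns the raw `u32`; a pure field access.
            #[inline]
            pub const fn as_u32(self) -> u32 {
                self.0
            }

            /// Returns the index as a `usize`, e.g. for slice indexing.
            #[inline]
            pub const fn index(self) -> usize {
                self.0 as usize
            }
        }
    };
}

// Each invocation defines a new type whose accessors all carry `#[inline]`.
simple_index!(LocalIdx);
simple_index!(BlockIdx);

fn main() {
    let local = LocalIdx::from_u32(7);
    let block = BlockIdx::from_u32(3);
    println!("{} {}", local.as_u32(), block.index());
}
```

Because the attribute lives in one place, a single edit to the macro affects every index type it generates, which is why such accessors are easy to overlook when auditing individual functions.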

r? @traviscross

traviscross · 2026-05-08T20:21:39Z

@bors try @rust-timer queue

…s, r=<try> [PERF CHECK ONLY] Add `inline` to trivial cross-crate accessors

rust-bors · 2026-05-08T22:30:34Z

☀️ Try build successful (CI)
Build commit: 6000502 (6000502f00011b1e462abca902ad51df79f2d675, parent: 8068e2fc9afa8c888336b12db01987be768785f9)

rust-timer · 2026-05-08T23:12:03Z

Finished benchmarking commit (6000502): comparison URL.

Overall result: ❌✅ regressions and improvements - please read:

Benchmarking this PR means it may be perf-sensitive, so it is automatically marked as not fit for rolling up. Overriding this is possible but strongly discouraged, because it risks changing compiler perf.

Next steps: If you can justify the regressions found in this try perf run, do so in writing along with @rustbot label: +perf-regression-triaged. If not, fix the regressions and do another perf run. Neutral or positive results will clear the label automatically.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	3.4%	[3.4%, 3.4%]	1
Regressions ❌ (secondary)	1.2%	[0.2%, 2.3%]	2
Improvements ✅ (primary)	-0.4%	[-0.6%, -0.1%]	11
Improvements ✅ (secondary)	-0.9%	[-1.0%, -0.8%]	6
All ❌✅ (primary)	-0.1%	[-0.6%, 3.4%]	12
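The "All (primary)" row can be roughly cross-checked against the two primary rows above it. This is a minimal sketch, assuming that row is the count-weighted mean of the primary regressions and improvements; that is an assumption about how the summary is computed, and because the inputs are the already-rounded values from the table, the result is only approximate. The function name is made up for illustration.

```rust
/// Count-weighted mean of several `(mean, count)` groups.
fn combined_mean(groups: &[(f64, u32)]) -> f64 {
    let total: u32 = groups.iter().map(|&(_, n)| n).sum();
    let weighted: f64 = groups
        .iter()
        .map(|&(mean, n)| mean * f64::from(n))
        .sum();
    weighted / f64::from(total)
}

fn main() {
    // Primary rows from the table above: one regression at 3.4%,
    // eleven improvements averaging -0.4%.
    let all = combined_mean(&[(3.4, 1), (-0.4, 11)]);
    println!("All (primary) mean is roughly {all:.2}%");
}
```

With these inputs the combined mean is about -0.08%, which is consistent with the -0.1% reported once rounded to one decimal place.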

Max RSS (memory usage)

Results (primary 1.4%, secondary 0.5%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	1.4%	[1.4%, 1.4%]	1
Regressions ❌ (secondary)	3.2%	[3.2%, 3.2%]	1
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-2.2%	[-2.2%, -2.2%]	1
All ❌✅ (primary)	1.4%	[1.4%, 1.4%]	1

Cycles

Results (secondary 2.8%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.8%	[2.8%, 2.8%]	1
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-	-	0

Binary size

This perf run didn't have relevant results for this metric.

Bootstrap: 498.048s -> 501.772s (0.75%)
Artifact size: 397.10 MiB -> 397.12 MiB (0.00%)
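The bootstrap percentage is a plain relative change between the two timings. A small sketch of that arithmetic follows; the helper name is chosen here for illustration. The artifact size is left out because the MiB values shown are themselves rounded, so recomputing from them would not necessarily reproduce the reported figure.

```rust
/// Relative change from `before` to `after`, in percent.
fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

fn main() {
    // Bootstrap time in seconds, as reported above.
    let change = percent_change(498.048, 501.772);
    println!("bootstrap: {change:.2}%");
}
```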

traviscross · 2026-05-08T23:17:27Z

@bors try @rust-timer queue

Small accessor functions aren't always inlined across crate boundaries; whether they are depends on the build configuration. Even with thin LTO, PGO, etc., the backend may be limited in what it's able and willing to inline without these annotations. In detailed profiling, the `inline` attributes being added in this PR seemed to make a difference. Let's add them. Notably, some of these fall within the expansion of the `newtype_index!` macro and will apply to the items it defines. Being in the macro might explain (if later profiling confirms this win) why these weren't noticed earlier. This commit covers the cases where the auto-inline heuristic from Rust PR 116505 is unlikely to work.

Some trivial accessors have a single call, like `fn def_id(self) -> DefId { self.did() }`. After MIR inlining the inner call, the auto-inline heuristic from Rust PR 116505 might already cover these, but the heuristic is conservative. Let's annotate explicitly and profile.
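To make the shape of these accessors concrete, here is a minimal sketch with stand-in types. `DefId` and `AdtDef` below are simplified for illustration and are not the compiler's real definitions. The outer accessor does nothing but forward to another accessor, and both carry an explicit `#[inline]` hint.

```rust
// Simplified stand-ins, not the real rustc types.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct DefId {
    pub krate: u32,
    pub index: u32,
}

#[derive(Clone, Copy, Debug)]
pub struct AdtDef {
    did: DefId,
}

impl AdtDef {
    pub fn new(did: DefId) -> Self {
        Self { did }
    }

    /// Pure field access.
    #[inline]
    pub fn did(self) -> DefId {
        self.did
    }

    /// Single-call accessor: the body is just a call to `did`.
    #[inline]
    pub fn def_id(self) -> DefId {
        self.did()
    }
}

fn main() {
    let adt = AdtDef::new(DefId { krate: 0, index: 42 });
    println!("{:?}", adt.def_id());
}
```

The explicit attribute means the outcome does not depend on whether the inner call happens to be inlined first and the remaining body then judged small enough by a heuristic.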

These accessors contain only field or index accesses, with no calls. The auto-inline heuristic from Rust PR 116505 should already cover these. Let's annotate to confirm that in profiling. We may or may not want to add these anyway, as documentation of intent and as a safety net against later changes that would cause the heuristic to fail.
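For contrast with the single-call case, a sketch of accessors whose bodies are only a field read or an array index might look like this. The `Slots` type is hypothetical and exists only for illustration.

```rust
/// Hypothetical container used only for illustration.
pub struct Slots {
    values: [u32; 4],
    count: u32,
}

impl Slots {
    pub fn new(values: [u32; 4], count: u32) -> Self {
        Self { values, count }
    }

    /// Field read; the body contains no calls.
    #[inline]
    pub fn count(&self) -> u32 {
        self.count
    }

    /// Array index; panics if `i` is out of bounds.
    #[inline]
    pub fn value(&self, i: usize) -> u32 {
        self.values[i]
    }
}

fn main() {
    let slots = Slots::new([10, 20, 30, 40], 3);
    println!("{} {}", slots.count(), slots.value(1));
}
```

Bodies this small are the ones a size-based heuristic would be expected to pick up on its own, so the attribute here mostly records intent.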

traviscross · 2026-05-08T23:20:48Z

@bors try @rust-timer queue

…s, r=<try> [PERF CHECK ONLY] Add `inline` to trivial cross-crate accessors

rust-bors · 2026-05-09T02:20:45Z

💔 Test for fa3d3ef failed: CI

traviscross · 2026-05-09T02:21:15Z

@bors try @rust-timer queue

…s, r=<try> [PERF CHECK ONLY] Add `inline` to trivial cross-crate accessors

rust-bors · 2026-05-09T04:30:24Z

☀️ Try build successful (CI)
Build commit: c51601b (c51601be0e9654351a230f76a0910211ce8897a9, parent: fb0a5a5a9c892b351f34263d6d84da9dde72871a)

rust-timer · 2026-05-09T05:47:31Z

Finished benchmarking commit (c51601b): comparison URL.

Overall result: ❌✅ regressions and improvements - please read:

Benchmarking this PR means it may be perf-sensitive, so it is automatically marked as not fit for rolling up. Overriding this is possible but strongly discouraged, because it risks changing compiler perf.

Next steps: If you can justify the regressions found in this try perf run, do so in writing along with @rustbot label: +perf-regression-triaged. If not, fix the regressions and do another perf run. Neutral or positive results will clear the label automatically.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	3.3%	[3.3%, 3.3%]	1
Regressions ❌ (secondary)	0.8%	[0.2%, 2.4%]	4
Improvements ✅ (primary)	-0.4%	[-0.6%, -0.2%]	14
Improvements ✅ (secondary)	-0.4%	[-1.1%, -0.1%]	20
All ❌✅ (primary)	-0.1%	[-0.6%, 3.3%]	15

Max RSS (memory usage)

Results (secondary 1.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.6%	[2.3%, 3.0%]	3
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-3.2%	[-3.2%, -3.2%]	1
All ❌✅ (primary)	-	-	0

Cycles

Results (primary -2.2%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-2.2%	[-2.2%, -2.2%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-2.2%	[-2.2%, -2.2%]	1

Binary size

Results (secondary -0.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.1%	[-0.1%, -0.1%]	1
All ❌✅ (primary)	-	-	0

Bootstrap: 497.93s -> 502.373s (0.89%)
Artifact size: 397.14 MiB -> 396.95 MiB (-0.05%)

traviscross · 2026-05-09T10:09:04Z

@bors try @rust-timer queue

…s, r=<try> [PERF CHECK ONLY] Add `inline` to trivial cross-crate accessors

rust-bors · 2026-05-09T12:16:52Z

☀️ Try build successful (CI)
Build commit: 49c68cb (49c68cbd71efe60226a619616e7014c736312888, parent: 0490dd938541ad996c5ad1ec6e274012afe3e1d4)

rust-timer · 2026-05-09T13:15:30Z

Finished benchmarking commit (49c68cb): comparison URL.

Overall result: ❌✅ regressions and improvements - please read:

Benchmarking this PR means it may be perf-sensitive, so it is automatically marked as not fit for rolling up. Overriding this is possible but strongly discouraged, because it risks changing compiler perf.

Next steps: If you can justify the regressions found in this try perf run, do so in writing along with @rustbot label: +perf-regression-triaged. If not, fix the regressions and do another perf run. Neutral or positive results will clear the label automatically.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	3.2%	[3.2%, 3.2%]	1
Regressions ❌ (secondary)	2.3%	[2.3%, 2.3%]	1
Improvements ✅ (primary)	-0.4%	[-0.7%, -0.2%]	16
Improvements ✅ (secondary)	-0.4%	[-1.2%, -0.1%]	34
All ❌✅ (primary)	-0.2%	[-0.7%, 3.2%]	17

Max RSS (memory usage)

Results (primary -1.9%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-1.9%	[-1.9%, -1.9%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-1.9%	[-1.9%, -1.9%]	1

Cycles

Results (secondary -2.7%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-2.7%	[-3.1%, -2.3%]	3
All ❌✅ (primary)	-	-	0

Binary size

This perf run didn't have relevant results for this metric.

Bootstrap: 497.416s -> 498.988s (0.32%)
Artifact size: 397.18 MiB -> 397.06 MiB (-0.03%)

traviscross · 2026-05-09T21:43:41Z

@bors try parent=fb0a5a5a9c892b351f34263d6d84da9dde72871a @rust-timer queue

…s, r=<try> [PERF CHECK ONLY] Add `inline` to trivial cross-crate accessors

rust-bors · 2026-05-09T23:55:51Z

☀️ Try build successful (CI)
Build commit: 7534ee6 (7534ee67bcc11bcb5b42b95aada2b553a64690db, parent: fb0a5a5a9c892b351f34263d6d84da9dde72871a)

rust-timer · 2026-05-10T00:40:30Z

Finished benchmarking commit (7534ee6): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this PR means it may be perf-sensitive, so it is automatically marked as not fit for rolling up. Overriding this is possible but strongly discouraged, because it risks changing compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	0.4%	[0.3%, 0.7%]	11
Improvements ✅ (primary)	-0.4%	[-0.8%, -0.2%]	81
Improvements ✅ (secondary)	-0.3%	[-0.6%, -0.2%]	75
All ❌✅ (primary)	-0.4%	[-0.8%, -0.2%]	81

Max RSS (memory usage)

Results (primary 1.7%, secondary 2.5%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	4.6%	[2.8%, 6.7%]	4
Regressions ❌ (secondary)	2.5%	[2.2%, 3.1%]	3
Improvements ✅ (primary)	-1.2%	[-1.4%, -1.0%]	4
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	1.7%	[-1.4%, 6.7%]	8

Cycles

Results (secondary 10.3%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	10.3%	[2.2%, 18.3%]	3
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-	-	0

Binary size

Results (primary -0.0%, secondary -0.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.0%	[-0.0%, -0.0%]	3
Improvements ✅ (secondary)	-0.1%	[-0.1%, -0.1%]	3
All ❌✅ (primary)	-0.0%	[-0.0%, -0.0%]	3

Bootstrap: 497.93s -> 498.532s (0.12%)
Artifact size: 397.14 MiB -> 397.36 MiB (0.05%)

rustbot assigned traviscross May 8, 2026

rustbot added the S-waiting-on-author and T-compiler labels May 8, 2026