
Path object creation and operations dominate CPU time on large resource trees #2568

@laeubi

During analysis of two heap dumps (shared here), the following hotspot was identified as a candidate for optimization:

Performance Data

| Metric | WITH transitive | WITHOUT transitive | Ratio |
| --- | ---: | ---: | ---: |
| `Path.<init>()` (µs) | 24,933,686 | 12,598,757 | 2.0× |
| `Path.computeSegmentCount()` (µs) | 12,651,521 | 3,382,879 | 3.7× |
| `Path.append()` (µs) | 14,264,596 | 5,157,314 | 2.8× |
| `Path.equals()` (µs) | 6,327,370 | 1,207,964 | 5.2× |
| `StringLatin1.replace()` (µs) | 32,550,291 | 9,422,718 | 3.5× |

Description

Path objects are created extensively throughout the JDT and Platform. The constructor calls backslashToForward() (which calls String.replace('\\', '/') — accounting for the StringLatin1.replace() overhead) and computeSegmentCount() (which scans the entire path string). The equals() method compares segments array-by-array.
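
To make the per-construction cost concrete, here is a minimal sketch of a constructor that does the two passes described above. It is not the Eclipse Path source: the class name EagerPathSketch and the helper countSegments are hypothetical, and device and UNC handling are left out. It only illustrates that each new object pays for two full scans of the input string.

```java
// Illustrative only: a simplified stand-in, NOT the Eclipse Path implementation.
final class EagerPathSketch {
    final String normalized;
    final int segmentCount;

    EagerPathSketch(String raw) {
        // Pass 1: separator normalization. The string is scanned even when
        // it contains no backslash at all.
        this.normalized = raw.replace('\\', '/');
        // Pass 2: segment counting scans the (normalized) string again.
        this.segmentCount = countSegments(normalized);
    }

    // Counts maximal runs of non-separator characters.
    static int countSegments(String path) {
        int count = 0;
        boolean inSegment = false;
        for (int i = 0; i < path.length(); i++) {
            if (path.charAt(i) == '/') {
                inSegment = false;
            } else if (!inSegment) {
                inSegment = true;
                count++;
            }
        }
        return count;
    }
}
```

With millions of constructions, these two linear scans add up even though each individual call is cheap.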

With transitive dependencies, the number of classpath entries, package fragment roots, and resource lookups increases substantially, causing proportionally more Path creation and comparison. The 3.5× increase in StringLatin1.replace() (32.5 seconds total!) is particularly notable — this is pure overhead from the backslashToForward() conversion that happens in every Path constructor, even on Linux where no backslashes exist.

Suggested Fix

  1. Skip backslash conversion on non-Windows platforms: Path(String) calls backslashToForward(), which does path.replace('\\', '/'). On a platform where File.separatorChar == '/', this conversion could be skipped. (Note: the source already has a forWindows parameter in the internal constructor, and the public Path(String) uses Constants.RUNNING_ON_WINDOWS. Verify that this flag is set correctly.)
  2. Intern or cache frequently used paths: Classpath entry paths, project paths, and source folder paths are created repeatedly. An interning mechanism or flyweight pattern would reduce object creation.
  3. Lazy segment computation: computeSegments() and computeSegmentCount() are called in the constructor. If many Path objects are created only for equality checks or toString(), lazy computation would save work.
  4. Use IPath.of() factory method: If a canonical IPath.of() exists (or should be added), it could return cached instances for common paths.
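
The first three suggestions could be combined roughly as in the sketch below. Everything in it is an assumption made for illustration: LazyPathSketch, its of() and create() factories, and the ON_WINDOWS flag (a stand-in for something like Constants.RUNNING_ON_WINDOWS) are hypothetical names, not existing Eclipse API, and the cache is deliberately naive.

```java
import java.io.File;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustrative only: hypothetical class, not part of Eclipse.
final class LazyPathSketch {
    // Assumption: stand-in for a platform flag such as Constants.RUNNING_ON_WINDOWS.
    private static final boolean ON_WINDOWS = File.separatorChar == '\\';

    // Naive unbounded cache; a real one would need eviction or weak references.
    private static final ConcurrentMap<String, LazyPathSketch> CACHE = new ConcurrentHashMap<>();

    private final String text;
    private int segmentCount = -1; // -1 means "not computed yet"

    private LazyPathSketch(String raw, boolean forWindows) {
        // Suggestion 1: only pay for the replace scan when backslashes are separators.
        this.text = forWindows ? raw.replace('\\', '/') : raw;
    }

    // Suggestion 2: return a shared instance for repeated inputs.
    static LazyPathSketch of(String raw) {
        return CACHE.computeIfAbsent(raw, r -> new LazyPathSketch(r, ON_WINDOWS));
    }

    // Uncached factory with an explicit platform flag (useful for tests).
    static LazyPathSketch create(String raw, boolean forWindows) {
        return new LazyPathSketch(raw, forWindows);
    }

    // Suggestion 3: count segments on first use. The race is benign because
    // every thread computes the same value and int writes are atomic.
    int segmentCount() {
        int c = segmentCount;
        if (c < 0) {
            c = 0;
            boolean inSegment = false;
            for (int i = 0; i < text.length(); i++) {
                if (text.charAt(i) == '/') {
                    inSegment = false;
                } else if (!inSegment) {
                    inSegment = true;
                    c++;
                }
            }
            segmentCount = c;
        }
        return c;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true; // cached instances hit this fast path
        return o instanceof LazyPathSketch && text.equals(((LazyPathSketch) o).text);
    }

    @Override
    public int hashCode() {
        return text.hashCode();
    }

    @Override
    public String toString() {
        return text;
    }
}
```

Whether interning pays off depends on how often the same path strings recur. The cache lookup itself hashes the string, so it would need to be measured against the cost it replaces.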
