Skip to content

compiler: Faster compilation time#2949

Open
FabioLuporini wants to merge 37 commits into
mainfrom
faster-python-rebased-final
Open

compiler: Faster compilation time#2949
FabioLuporini wants to merge 37 commits into
mainfrom
faster-python-rebased-final

Conversation

@FabioLuporini

Copy link
Copy Markdown
Contributor

the key ingredients are:

  • A lot of "avoided, useless reconstructions" -- across all layers (equations, Clusters, IET)
  • More caching (which needed the above), again at all levels
  • More memoization
  • New loop fusion heuristics to avoid useless and expensive analysis

Bonus: a lot of, IMHO, good refactoring. And several new tests.

Compilation time improvements vary between 1.2x and 5x depending on the complexity of the Operator

@FabioLuporini FabioLuporini requested a review from mloubout June 16, 2026 10:37
@codecov

codecov Bot commented Jun 16, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 93.50181% with 72 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.05%. Comparing base (58df797) to head (9499f27).

Files with missing lines Patch % Lines
devito/passes/clusters/fusion.py 88.82% 16 Missing and 4 partials ⚠️
devito/ir/support/guards.py 27.27% 8 Missing ⚠️
devito/symbolics/manipulation.py 80.95% 4 Missing and 4 partials ⚠️
devito/ir/clusters/cluster.py 94.31% 5 Missing ⚠️
devito/passes/iet/engine.py 87.80% 3 Missing and 2 partials ⚠️
devito/tools/memoization.py 93.58% 2 Missing and 3 partials ⚠️
devito/ir/iet/visitors.py 92.98% 4 Missing ⚠️
devito/tools/utils.py 71.42% 3 Missing and 1 partial ⚠️
devito/ir/support/basic.py 96.80% 1 Missing and 2 partials ⚠️
devito/ir/iet/nodes.py 94.44% 1 Missing and 1 partial ⚠️
... and 5 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2949      +/-   ##
==========================================
- Coverage   83.47%   83.05%   -0.43%     
==========================================
  Files         249      250       +1     
  Lines       52276    52949     +673     
  Branches     4503     4575      +72     
==========================================
+ Hits        43638    43976     +338     
- Misses       7880     8182     +302     
- Partials      758      791      +33     
Flag Coverage Δ
pytest-gpu-aomp-amdgpuX 68.89% <87.95%> (+0.29%) ⬆️
pytest-gpu-gcc- 78.42% <90.88%> (+0.20%) ⬆️
pytest-gpu-icx- 78.36% <90.88%> (+0.19%) ⬆️
pytest-gpu-nvc-nvidiaX ?

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@mloubout mloubout left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I like some of those intricate caching methods that could use better implementation rather than a new caching method every place it needs something a bit different.

Comment thread devito/ir/cgen/printer.py


def ccode(expr, printer=None, **settings):
def get_printer(printer, dtype=None):

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to do a lot more than it needs including creating printers that will never be used. Why not just

@memoized_func
def get_printer(printer, dtype=None):
    return  printer(settings={'dtype': dtype})

def __repr__(self):
return "Cluster([{}])".format(('\n' + ' '*9).join(f'{i}' for i in self.exprs))

def __getattr__(self, name):

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is a cleaner way than either inheriting from EqBlock or not having EqBlock

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, it must be a wrapper, it must not inherit. That's the whole point of unlocking caching

"""

@classmethod
def _preprocess_args(cls, exprs, ispace=null_ispace, guards=None,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is this used?


return Cluster(exprs, ispace, guards, properties, syncs, halo_scheme)

def rebuild(self, *args, **kwargs):

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seem to defeat the purpose not to rebuild just from _block.

Comment thread devito/ir/iet/nodes.py
handle.update(updates)
return type(self)(**handle)

def _same_arg(self, key, value):

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks like an odd bypass of memoized as well

Comment thread devito/ir/iet/nodes.py
updates = OrderedDict([(k, v) for k, v in zip(argnames, args, strict=False)])
updates.update(kwargs)

if updates and all(self._same_arg(k, v) for k, v in updates.items()):

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how that same_arg is cheaper than just building the new object.

Comment thread devito/ir/iet/visitors.py
if not same_children:
return o._rebuild(*children, **kwargs)

if kwargs and not all(o._same_arg(k, v) for k, v in kwargs.items()):

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

those _same_arg and same_as_before are still not super cheap i'm not sure why this would be better


def pow_to_mul(expr):
if q_leaf(expr) or isinstance(expr, Basic):
if type(expr) in (tuple, list):

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should check if it's Iterable to handle generators and such

if all(a is b for a, b in zip(expr.args, args, strict=False)):
args = tuple(args)

if type(expr) is tuple:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again check Iterable and return type(expr)(args)

Comment thread tests/test_iet.py
body = iet.body._rebuild(body=iet.body.body + (DummyExpr(x, x),))
return iet._rebuild(body=body), {}

monkeypatch.setattr(iet_engine, '_update_args',

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude is driving me crazy with these monkeypatch 😅 it's always what it tries first for no reason. Like it had 80 of those or something in horizon

@FabioLuporini FabioLuporini force-pushed the faster-python-rebased-final branch from c38c64a to 21f8660 Compare June 16, 2026 14:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants