Multi-cgra mapper integration #279

Draft
guosran wants to merge 14 commits into main from
feature/multi-cgra-mapper-integration

Collaborator

@guosran guosran commented Mar 6, 2026

Summary

This PR addresses issue #275. It adds multi-CGRA task placement support to MapTaskOnCgraPass and integrates it into ResourceAwareTaskOptimizationPass.

Changes

MapTaskOnCgraPass

  • Multi-CGRA placement in findBestPlacement: Tasks with cgra_count > 1 are now placed using a cascading shape search:

    1. Try all rectangular shapes of the requested cgra_count (e.g., 1×3, 3×1 for cgra_count=3).
    2. If no rectangular shape fits, try all connected non-rectangular shapes (L-shapes, T-shapes, etc.) via DFS enumeration.
    3. If nothing fits, fall back to cgra_count - 1 (reject the extra CGRA, keep the previous allocation).
  • parseTileShapeOffsets: Parses the tile_shape attribute string (e.g., "2x2", "2x2[(0,0)(1,0)(0,1)]") into physical CGRA offset coordinates. Added assertion for empty offsets.

  • computeScore: Updated to handle multi-CGRA placements by computing the minimum Manhattan distance from any CGRA in the placement to the target, reflecting fast bypass paths between adjacent CGRAs.

  • Helper functions: Extracted tryPlaceShape, getRectShapes, and tryNonRectShapes for clarity and reuse.

  • Public API: Exposed runMapTaskOnCgra(FuncOp, int, int) in TaskflowPasses.h so other passes can invoke placement directly.
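Step 1 of the cascading search can be illustrated with a small sketch. The function name `rectShapesFor` and the `(rows, cols)` pair representation are hypothetical and not taken from the PR; the actual `getRectShapes` may order or represent shapes differently.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper: returns every (rows, cols) pair whose area is exactly
// cgra_count, in increasing order of rows.
std::vector<std::pair<int, int>> rectShapesFor(int cgra_count) {
  std::vector<std::pair<int, int>> shapes;
  for (int rows = 1; rows <= cgra_count; ++rows) {
    // rows must divide cgra_count for a full rectangle to exist.
    if (cgra_count % rows == 0) {
      shapes.emplace_back(rows, cgra_count / rows);
    }
  }
  return shapes;
}
```

For `cgra_count = 3` this yields 1×3 and 3×1, matching the example in the list above. A prime count only ever produces the two strip shapes, which is one reason the non-rectangular step can matter on a small grid.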

ResourceAwareTaskOptimizationPass

  • Integration: After the balance/fusion convergence loop, calls runMapTaskOnCgra() to produce task_mapping_info attributes with global grid placement that respects multi-CGRA shapes.

guosran added 4 commits March 5, 2026 12:51
…ment

- findBestPlacement now tries rectangular shapes first, then non-rectangular
  connected shapes, then falls back to k-1 CGRAs (down to 1).
- Removed outdated TODO comment about MapTaskOnCgraPass not supporting
  multi-CGRA placement.
- Added assert for empty tile_shape offsets.
- Cleaned up USER COMMENT annotations.
…p comments

- findBestPlacement tries rect then non-rect shapes for requested cgra_count.
- If placement fails, caller falls back to cgra_count-1 (reject extra CGRA).
- Normalize /// to // comment style throughout MapTaskOnCgraPass.
- Remove outdated TODO comments.
@guosran guosran requested review from ShangkunLi and Copilot and removed request for Copilot March 6, 2026 01:12
- SRAM centroid now includes ALL CGRA positions of multi-CGRA tasks,
  not just placement[0].
- SSA proximity scoring uses min distance between two multi-CGRA
  placements (minDistToPlacement) instead of only comparing to
  the other task's primary position.
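The two scoring changes in this commit can be sketched in plain C++. The `CgraPos` struct and both function names below are hypothetical stand-ins, not the pass's real types; the sketch only shows the arithmetic of "minimum distance between two placements" and "centroid over all CGRA positions".

```cpp
#include <algorithm>
#include <cstdlib>
#include <limits>
#include <vector>

struct CgraPos { int row; int col; };  // Hypothetical stand-in for the pass's type.

// Smallest Manhattan distance between any CGRA in `a` and any CGRA in `b`.
// Returns INT_MAX when either placement is empty.
int minDistBetween(const std::vector<CgraPos> &a, const std::vector<CgraPos> &b) {
  int best = std::numeric_limits<int>::max();
  for (const CgraPos &p : a)
    for (const CgraPos &q : b)
      best = std::min(best, std::abs(p.row - q.row) + std::abs(p.col - q.col));
  return best;
}

struct Centroid { double row; double col; };

// Averages every CGRA position of every task, not just each task's first one.
// Returns {0, 0} when there are no positions at all.
Centroid centroidOfAll(const std::vector<std::vector<CgraPos>> &placements) {
  double row_sum = 0, col_sum = 0;
  int count = 0;
  for (const auto &placement : placements) {
    for (const CgraPos &p : placement) {
      row_sum += p.row;
      col_sum += p.col;
      ++count;
    }
  }
  if (count == 0) return {0.0, 0.0};
  return {row_sum / count, col_sum / count};
}
```

With placements {(0,0), (0,1)} and {(0,3), (2,1)}, comparing only the first position of each gives a distance of 3, while the minimum over all pairs is 2.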
Copilot AI review requested due to automatic review settings March 6, 2026 01:30
Contributor

Copilot AI left a comment

Pull request overview

This PR adds multi-CGRA task placement support to MapTaskOnCgraPass and integrates global placement generation (task_mapping_info) into ResourceAwareTaskOptimizationPass after the balance/fusion loop converges (issue #275).

Changes:

  • Extend CGRA placement to support tasks spanning multiple CGRAs (rectangular and connected non-rectangular shapes), and update scoring to account for multi-tile proximity.
  • Expose runMapTaskOnCgra(func, rows, cols) as a callable helper and invoke it from ResourceAwareTaskOptimizationPass post-convergence.
  • Update multi-CGRA MLIR tests to reflect new RESOPT output formatting and/or placement results.
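To make "connected non-rectangular shapes" concrete, the sketch below enumerates every edge-connected shape of `n` cells, distinct up to translation. It grows shapes one cell at a time rather than using the DFS enumeration the PR describes, and the names `Cell`, `Shape`, `connectedShapes`, and `isRectangle` are hypothetical.

```cpp
#include <algorithm>
#include <set>
#include <utility>
#include <vector>

using Cell = std::pair<int, int>;  // (row, col) offset
using Shape = std::vector<Cell>;   // Sorted, shifted so min row and min col are 0.

// Shifts a non-empty shape to the origin and sorts its cells.
static Shape normalize(Shape s) {
  int min_r = s[0].first, min_c = s[0].second;
  for (const Cell &c : s) {
    min_r = std::min(min_r, c.first);
    min_c = std::min(min_c, c.second);
  }
  for (Cell &c : s) {
    c.first -= min_r;
    c.second -= min_c;
  }
  std::sort(s.begin(), s.end());
  return s;
}

// All edge-connected shapes with n cells (n >= 1), unique up to translation.
std::set<Shape> connectedShapes(int n) {
  std::set<Shape> current;
  current.insert(Shape{Cell{0, 0}});
  for (int size = 1; size < n; ++size) {
    std::set<Shape> next;
    for (const Shape &s : current) {
      for (const Cell &c : s) {
        const Cell neighbors[4] = {{c.first + 1, c.second}, {c.first - 1, c.second},
                                   {c.first, c.second + 1}, {c.first, c.second - 1}};
        for (const Cell &nb : neighbors) {
          if (std::binary_search(s.begin(), s.end(), nb)) continue;  // Already in shape.
          Shape grown = s;
          grown.push_back(nb);
          next.insert(normalize(std::move(grown)));  // The set removes duplicates.
        }
      }
    }
    current = std::move(next);
  }
  return current;
}

// True when a normalized shape fills its bounding box completely.
bool isRectangle(const Shape &s) {
  int max_r = 0, max_c = 0;
  for (const Cell &c : s) {
    max_r = std::max(max_r, c.first);
    max_c = std::max(max_c, c.second);
  }
  return static_cast<int>(s.size()) == (max_r + 1) * (max_c + 1);
}
```

For four cells this produces 19 shapes, of which 3 are rectangles (1×4, 4×1, 2×2) and 16 are L, T, S, and similar shapes in their distinct orientations. The count grows quickly with `n`, so a real placer would likely cap the size or stop at the first shape that fits.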

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| lib/TaskflowDialect/Transforms/MapTaskOnCgraPass.cpp | Implements multi-CGRA placement search + scoring changes; adds runMapTaskOnCgra entrypoint. |
| lib/TaskflowDialect/Transforms/Optimizations/ResourceAwareTaskOptimizationPass.cpp | Calls runMapTaskOnCgra after convergence to materialize task_mapping_info. |
| include/TaskflowDialect/TaskflowPasses.h | Exposes runMapTaskOnCgra in the public header. |
| test/multi-cgra/taskflow/resource-heavy/resource-heavy.mlir | Adjusts RESOPT FileCheck expectations (line-wrapped attrs). |
| test/multi-cgra/taskflow/resnet/simple_resnet_tosa.mlir | Adjusts RESOPT FileCheck expectations (line-wrapped attrs). |
| test/multi-cgra/taskflow/parallel-nested/parallel-nested.mlir | Adjusts RESOPT FileCheck expectations (line-wrapped attrs). |
| test/multi-cgra/taskflow/multi-nested/multi-nested.mlir | Updates placement expectations and RESOPT FileCheck formatting. |
| test/multi-cgra/taskflow/irregular-loop/irregular-loop.mlir | Adjusts RESOPT FileCheck expectations (line-wrapped attrs). |
Comments suppressed due to low confidence (1)

lib/TaskflowDialect/Transforms/MapTaskOnCgraPass.cpp:294

  • If findBestPlacement() returns an empty placement (including after the fallback), the code still commits placement.primary() which becomes (-1,-1). That invalid coordinate then propagates into task_mapping_info and into SRAM centroid computation (assignAllSRAMs sums all positions), producing incorrect placements instead of failing loudly. Handle the "no placement found" case explicitly (e.g., emit an MLIR diagnostic and signalPassFailure / return failure), and avoid pushing a sentinel position into task_node->placement.
          // Commits Placement.
          task_node->placement.push_back(placement.primary());
          for (size_t i = 1; i < placement.cgra_positions.size(); ++i) {
            task_node->placement.push_back(placement.cgra_positions[i]);
          }

          // Marks occupied.
          for (const auto &pos : placement.cgra_positions) {
            if (pos.row >= 0 && pos.row < grid_rows_ && pos.col >= 0 && pos.col < grid_cols_) {
              occupied_[pos.row][pos.col] = true;
            }
          }
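One way to handle the "no placement found" case explicitly is to validate before committing, so that nothing is recorded unless the whole placement is usable. The sketch below uses plain standard-library types instead of MLIR; `CgraPos` and `commitPlacement` are hypothetical, and in the real pass the `false` branch would presumably emit a diagnostic and fail the pass, as the comment suggests.

```cpp
#include <vector>

struct CgraPos { int row; int col; };  // Hypothetical stand-in for the pass's type.

// Returns false and leaves both outputs untouched when the placement is empty,
// out of bounds, or overlaps an occupied cell. Otherwise marks the cells and
// records the positions.
bool commitPlacement(const std::vector<CgraPos> &positions,
                     std::vector<std::vector<bool>> &occupied,
                     std::vector<CgraPos> &task_placement) {
  if (positions.empty()) return false;  // Nothing was found: do not commit a sentinel.
  const int rows = static_cast<int>(occupied.size());
  for (const CgraPos &pos : positions) {
    if (pos.row < 0 || pos.row >= rows) return false;
    if (pos.col < 0 || pos.col >= static_cast<int>(occupied[pos.row].size())) return false;
    if (occupied[pos.row][pos.col]) return false;
  }
  // Every position is valid, so it is now safe to mutate state.
  for (const CgraPos &pos : positions) {
    occupied[pos.row][pos.col] = true;
    task_placement.push_back(pos);
  }
  return true;
}
```

Splitting validation from mutation means a failed placement cannot leave a (-1,-1) entry behind for later centroid or distance computations to pick up.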

@guosran guosran force-pushed the feature/multi-cgra-mapper-integration branch from 07a38b2 to 2d00d5a Compare March 7, 2026 04:25
@tancheng tancheng requested a review from HobbitQia March 7, 2026 06:31

// Runs the CGRA task placement logic directly on a function.
// grid_rows/grid_cols default to 4x4 (kCgraGridRows/kCgraGridCols).
void runAllocateCgraToTask(mlir::func::FuncOp func,
Collaborator

I think we should create two utility folders, include/TaskflowDialect/Util and lib/TaskflowDialect/Util, and put these utility functions there. You can refer to the folder structure in NeuraDialect.

@@ -70,7 +70,7 @@ def MapTaskOnCgra : Pass<"map-task-on-cgra", "func::FuncOp"> {

Uses a default 3x3 CGRA grid.
Collaborator

The default configuration in runAllocateCgraToTask is 4x4, but this description says 3x3. Maybe we can keep them consistent: either 3x3 or 4x4.


- /// Checks if any CGRA in this task is adjacent to any in other task.
+ // Checks if any CGRA in this task is adjacent to any in other task.
bool hasAdjacentCGRA(const TaskPlacement &other) const {
Collaborator

Could we rename it to hasTaskAdjacentCgra?

Comment on lines +272 to +273
// If the requested cgra_count doesn't fit, fall back to cgra_count-1
// (i.e. reject the extra CGRA and keep previous allocation).
Collaborator

If the cgra_count is calculated in resourceAwareTaskOptimization pass, we should always be able to find the best placement in the findBestPlacement function, right?

Comment on lines 284 to 287
task_node->placement.push_back(placement.primary());
// Handles mapping one task on multi-CGRAs.
// TODO: Introduce explicit multi-CGRA binding logic.
for (size_t i = 1; i < placement.cgra_positions.size(); ++i) {
  task_node->placement.push_back(placement.cgra_positions[i]);
}
Collaborator

Why not use a single for loop to commit all cgra_positions?

Comment on lines +582 to +585
for (auto &shape : getRectShapes(cgra_count)) {
  TaskPlacement p = tryPlaceShape(task_node, shape, graph);
  if (!p.cgra_positions.empty()) return p;
}
Collaborator

Why not combine getRectShapes and tryPlaceShape as a single function?

// Simple hash of sorted positions.
int64_t hash = 0;
for (auto &[col_off, row_off] : sorted_positions)
  hash = hash * 131 + col_off * 17 + row_off;
Collaborator

How do you choose these parameters? Maybe add a comment explaining how these parameters affect the sorting.
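To reason about the constants, it may help to wrap the quoted arithmetic in a standalone function and try a few inputs. `shapeHash` is a hypothetical wrapper (C++17, because of the structured binding); the multipliers 131 and 17 are copied from the snippet, and everything else is an assumption.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Same arithmetic as the quoted snippet. Positions are (col_off, row_off)
// pairs and are sorted first so that the hash does not depend on input order.
int64_t shapeHash(std::vector<std::pair<int, int>> positions) {
  std::sort(positions.begin(), positions.end());
  int64_t hash = 0;
  for (const auto &[col_off, row_off] : positions)
    hash = hash * 131 + col_off * 17 + row_off;
  return hash;
}
```

Each position contributes `col_off * 17 + row_off`, which is unique per position as long as offsets are non-negative and `row_off < 17`. If that value also stays below 131, the hash acts like a base-131 number, so two position lists of the same length hash differently, barring overflow. Lists of different lengths can still collide: {(0,0), (0,1)} and {(0,1)} both hash to 1. Whether that matters depends on whether shapes of different sizes are ever compared through this hash, which the snippet alone does not show.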

// 1. Rectangular shapes, sorted by squareness (e.g. 2×2 before 1×4),
// with smaller bounding-box area as tiebreaker.
// 2. Non-rectangular shapes (L, T, etc.) in all unique rotations.
static SmallVector<CgraShape> getAllPlacementShapes(int cgra_count) {
Collaborator

Can we put this function into the util folder?
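The ordering described in the quoted comment (squareness first, smaller bounding-box area as tiebreaker) could look roughly like the sketch below. `RectShape` and `sortBySquareness` are hypothetical, and measuring squareness as `|rows - cols|` is an assumption; the PR may define it differently.

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

struct RectShape { int rows; int cols; };  // Hypothetical stand-in for CgraShape.

// Orders shapes so that the most square ones come first. Ties are broken by
// smaller bounding-box area; remaining ties keep their input order.
void sortBySquareness(std::vector<RectShape> &shapes) {
  std::stable_sort(shapes.begin(), shapes.end(),
                   [](const RectShape &a, const RectShape &b) {
                     int da = std::abs(a.rows - a.cols);
                     int db = std::abs(b.rows - b.cols);
                     if (da != db) return da < db;
                     return a.rows * a.cols < b.rows * b.cols;
                   });
}
```

Given 1×4, 4×1, and 2×2, this puts 2×2 first, consistent with the "2×2 before 1×4" example in the comment.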

// (including the speculatively modified one).
//
// Returns true if all tasks can be placed without overlap.
static bool canAllTasksFitOnGrid(ArrayRef<int> task_cgra_counts) {
Collaborator

Can we put this function into the util folder? So that we can further extend the mapping to support manually assigned cgra_counts for each task.
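For manually assigned cgra_counts, a cheap first check is total capacity. The sketch below is not the PR's `canAllTasksFitOnGrid`, which per its comment checks placement without overlap; `fitsByCapacity` is a hypothetical pre-filter that takes the grid size as explicit parameters.

```cpp
#include <numeric>
#include <vector>

// Necessary condition only: the requested CGRAs must not exceed the grid size.
// It says nothing about whether the shapes can actually be laid out.
bool fitsByCapacity(const std::vector<int> &task_cgra_counts,
                    int grid_rows, int grid_cols) {
  long total = std::accumulate(task_cgra_counts.begin(), task_cgra_counts.end(), 0L);
  return total <= static_cast<long>(grid_rows) * grid_cols;
}
```

Passing this check is required for any valid placement. Whether it is also enough depends on which shapes the placer may use and in what order tasks are placed, so a full placement attempt is still needed afterwards.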

return std::make_unique<AllocateCgraToTaskPass>();
}

void runAllocateCgraToTask(func::FuncOp func, int grid_rows, int grid_cols) {
Collaborator

Put this function into the util folder.

@guosran guosran marked this pull request as draft March 20, 2026 07:12
4 participants