cuD-PDLP #1391
Open: Bubullzz wants to merge 130 commits into NVIDIA:main from Bubullzz:cuD-PDLP
Commits
1e0bd53 (Bubullzz) first commit !! added multi_gpu_partition file to solver settings
978d17b (Bubullzz) slowly skeletonning
dd0c0ef (Bubullzz) better shard.cuh
2037eca (Bubullzz) wip
0f62eff (Bubullzz) added a bit of skeleton. Forward declared pdlp_solver in shard.hpp, t…
d89c85a (Bubullzz) still wip but going well
5534ff0 (Bubullzz) cursor broke everything grrr
dd935c5 (Bubullzz) partition loader now partition loads
09eb20b (Bubullzz) big advancements ayo ! We can soon start working on imlementing the s…
b5ebfd2 (Bubullzz) added pre loop setup need to manage boxing
0965a60 (Bubullzz) added distributed transform
d4d1cab (Bubullzz) added semicolon and existing runtime error enum
6659dd9 (Bubullzz) added } and fixed cuot_expects in partition loader
b2ed271 (Bubullzz) small bug fixes
50d16ce (Bubullzz) a version that compiles #heheha 😎😎😎😎
359d9f4 (Bubullzz) removed use of engine:transaform
910a49a (Bubullzz) added multi-gpu SpMV #heheha
76c0b3f (Bubullzz) transformed a transform. it compiles hehe
5ec7138 (Bubullzz) updated take step for distributed. compiles but doesnt run. will chec…
1f02afd (Bubullzz) Merge branch 'main' into cuD-PDLP
de19f38 (Bubullzz) support spmvop on multi-gpu
0030a6c (Bubullzz) compile ready
172ebc2 (Bubullzz) can run now
23d0798 (Bubullzz) passing all tests, good merge
30881ce (Bubullzz) fixed the errors hihi, finished distributed part for compte_fixed_error
c33faf2 (Bubullzz) style
98e0ce6 (Bubullzz) now manage halpern update in multi-gpu pdlp
84128bf (Bubullzz) small fix to calls of multi_gpu_engine_ and scale/unscale solutions.
abe4dd2 (Bubullzz) comments
5c41497 (Bubullzz) added is multi gpu to pdhg
37b1fda (Bubullzz) added pdhg get mgpu engine
57c7061 (Bubullzz) added non const convergence information getter
9f78d05 (Bubullzz) compute_convergence_information is now on multi-gpu
c484485 (Bubullzz) fill_return_problem_solutionis now ready !!
fc46080 (Bubullzz) added reduced cost in gathering of solution, builds and runs
6538382 (Bubullzz) updated mgpu scale/unscale logic
a88285a (Bubullzz) wired mgpu restart
b34c5f6 (Bubullzz) dummy version locally seems to work ?????
b784a44 (Bubullzz) added dummy partitionner
ca7d7a9 (Bubullzz) added stream forking for cuda graph
0310d50 (Bubullzz) updated convergence information to use potential_next rather than cu…
f811bc8 (Bubullzz) disabled graph, can sole afiro hehe
4d7e2fc (Bubullzz) added join_from_shards in convergence_info, now afiro is erfect 510 b…
7ad4606 (Bubullzz) use spmvop in mgpu and fixed small bug of increment_iteration_since_l…
03d1259 (Bubullzz) re-enabled graph. not working
cdc912b (Bubullzz) Cleaner sync semantics, ez ez ez, single mGPU gives exact same result…
04d22cf (Bubullzz) pad local matrices for easier integration and allow mismatch of nnz b…
b41df45 (Bubullzz) copy scalars to host rather than direct d2d. better
a1ffe1d (Bubullzz) force re-inject offset and variables to undo the sort, cheap and ugly…
c9394d9 (Bubullzz) few style changes, better args and prints
4faa7df (Bubullzz) added disable_graph flag, afiro gets solved on non-graph just as if i…
61acddb (Bubullzz) makes reductions in compute interraction adn movement use owned_size …
b8b59bf (Bubullzz) added emtis partitionner, still need it in the env. it is FAST. but w…
7d74e74 (Bubullzz) forgot to push a file, maybe doesnt compile lol
859a299 (Bubullzz) fixed dummy partitionner on single gpu
7daa740 (Bubullzz) added some plumbing, will not load full problem on gpu
8a39e8c (Bubullzz) added guard to ensure presolver is not supported in mGPU
5a3b9ce (Bubullzz) plumbed pdlp_distributed_solver with mps_data_model and now data doe…
e4739b5 (Bubullzz) removed usage of problem_t for distributed PDLP
1903f4b (Bubullzz) added a cuopt assert for solve_lp in mgpu mode
0aacb4f (Bubullzz) style
6df8145 (Bubullzz) fixed bound/objective rescaling, now afiro on 8 shards work but hangs…
df9f793 (Bubullzz) actually disable the graph ^^ (kms)
4c8bcd1 (Bubullzz) added option to export parts file
a8a8054 (Bubullzz) addded test for import export parts file
5abcd2e (Bubullzz) added full solve tests
0b0ce2c (Bubullzz) added kaminpar partitionner and possibility to chose the partitionner
91b1ae5 (Bubullzz) style
caea509 (Bubullzz) Merge branch 'main' into cuD-PDLP
c6c5940 (Bubullzz) moved an expect for edge case from code rabbit
3488874 (Bubullzz) updated test
bc1f87e (Bubullzz) update comment for code rabbit
b28f07d (Bubullzz) added check to ensure distributed pdlp is only activated with method_…
9ae23a0 (Bubullzz) reverted back solve_lp from mps for better handling
818ffcd (Bubullzz) small clearer comment for pdlp_disable_graph
d3dad66 (Bubullzz) updated gather of final solution positionning in the code
74c2d8f (Bubullzz) added include for compile
cfdacd4 (Bubullzz) kaminpar compile
0de9609 (Bubullzz) expect no initial or warm start
1d43e8a (Bubullzz) clean error if handle is null
f3b6343 (Bubullzz) better exept for never_restart_to_average
c658769 (Bubullzz) style
38fffaf (Bubullzz) removed scaled problem before shard building, and optimized code to g…
eb08f11 (Bubullzz) added prints to now rank data timings
0816778 (Bubullzz) also print shard time
9aca029 (Bubullzz) kaminpar quiet
873d167 (Bubullzz) read mps for the compile !!
5c74a4d (Bubullzz) Merge branch 'main' into cuD-PDLP
e9cad7a (Bubullzz) removed any reference to metis partitionner
12edf2d (Bubullzz) Change the way kaminpar is pulled
2a35703 (Bubullzz) style
f5ac616 (Bubullzz) Better KaminPAr pulling
40f7cf5 (Bubullzz) better kaminpar pulling again
c4565d1 (Bubullzz) added KAMINPAR_64BIT_EDGE_IDS
3acd4d3 (Bubullzz) review ready kaminpar partitionner
cfbb27a (Bubullzz) wheel measured ~727 MiB so bump up max allowed size
5855aa8 (Bubullzz) bump wheel size
7917f66 (Bubullzz) replaced 19 args function with a mps_data_model_t
a286381 (Bubullzz) cleaned multi_gpu engine. now taking mps_data_model as input.
fb3692d (Bubullzz) new scaling
6ebdcb9 (Bubullzz) disable graph in distributed tests
d65fca5 (Bubullzz) moved function into distributed_algorithms.hpp
197df24 (Bubullzz) use RAII communicators
ed4f9e1 (Bubullzz) cleaned distributed_algorithm.cu
c193656 (Bubullzz) replaced raw cudamemcopy with raft::copy
c42f770 (Bubullzz) updated multi_gpu_engine for clarity
c551415 (Bubullzz) instantiate as float to allow compilation
83668c0 (Bubullzz) Merge branch 'main' into cuD-PDLP
558b90d (Bubullzz) updated comments and cleaned partition_loader
1223b4a (Bubullzz) replaced hardcoded nccl types with compile time found types
57c860a (Bubullzz) cleaned partition.cpp/hpp and moved kaminpar partitionner into it
349a127 (Bubullzz) review ready pdlp/distributed_pdlp
f40bcfe (Bubullzz) updated solver settings comments
7103655 (Bubullzz) removed mgpu_trace
80fb188 (Bubullzz) remove comment in saddle_point.cu
e680422 (Bubullzz) cleaned pdhg.hpp and removed is_multi_gpu flag
6000b75 (Bubullzz) cleaner distributed handling
8c67f90 (Bubullzz) removed unused disrtibuted spmv in multi_gpu_engine
1f42904 (Bubullzz) cleaned pdhg a bit more
30319a0 (Bubullzz) pdhg review ready
2a14a4d (Bubullzz) finished cuopt_cli
9052e31 (Bubullzz) cleaned initial_scaling
26d7f9e (Bubullzz) pdlp and solve.cu
733334e (Bubullzz) cnvergence info v2
56d5580 (Bubullzz) FINISHED
e2a36ab (Bubullzz) style
45556da (Bubullzz) moved mgpu engine up so graph gets destroyed before the mgpuengine co…
d5f1ce0 (Bubullzz) removed a useless inclyde and moved an assert up for coderabbit
9c1e345 (Bubullzz) added nccl_try everywhere
34196b1 (Bubullzz) style

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,78 @@ | ||
| # cmake-format: off | ||
| # SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
| # cmake-format: on | ||
| | ||
| # Multi-threaded graph partitioner for distributed PDLP. | ||
| # Uses rapids_cpm_find so a system / conda / .deb install of KaMinPar (which ships a | ||
| # CMake config package exporting KaMinPar::KaMinPar) is used when available, and | ||
| # otherwise the pinned source is cloned and built via CPM. KaMinPar depends on TBB, | ||
| # which cuOpt already requires (see find_package(TBB) for papilo). | ||
| function(find_and_configure_kaminpar) | ||
| set(oneValueArgs VERSION PINNED_TAG) | ||
| cmake_parse_arguments(PKG "" "${oneValueArgs}" "" ${ARGN}) | ||
| | ||
| # NOTE: KaMinPar is intentionally NOT added to cuopt's BUILD/INSTALL export sets. | ||
| # It is a from-source static dependency that is fully embedded into libcuopt.so and | ||
| # never installed (INSTALL_KAMINPAR OFF below). Registering it in cuopt-exports would | ||
| # both break export generation ("target KaMinPar is not in any export set") and emit a | ||
| # bogus find_dependency(KaMinPar) into the installed cuopt config. It is linked by file | ||
| # in cpp/CMakeLists.txt (mirroring PSLP) so it stays out of cuopt's export interface. | ||
| rapids_cpm_find(KaMinPar ${PKG_VERSION} | ||
| GLOBAL_TARGETS KaMinPar::KaMinPar | ||
| CPM_ARGS | ||
| GIT_REPOSITORY https://github.com/KaHIP/KaMinPar.git | ||
| GIT_TAG ${PKG_PINNED_TAG} | ||
| EXCLUDE_FROM_ALL | ||
| OPTIONS | ||
| "KAMINPAR_BUILD_APPS OFF" | ||
| "KAMINPAR_BUILD_TOOLS OFF" | ||
| "KAMINPAR_BUILD_TESTS OFF" | ||
| "KAMINPAR_BUILD_BENCHMARKS OFF" | ||
| "KAMINPAR_BUILD_EXAMPLES OFF" | ||
| "KAMINPAR_BUILD_DISTRIBUTED OFF" | ||
| # Timers use global state and force single-threaded use of the library | ||
| # interface; disable so cuOpt can call the partitioner freely. | ||
| "KAMINPAR_ENABLE_TIMERS OFF" | ||
| # Avoid an extra hard dependency on Google Sparsehash. | ||
| "KAMINPAR_BUILD_WITH_SPARSEHASH OFF" | ||
| # cuOpt's TBB is discovered via a legacy find that only exposes TBB::tbb | ||
| # (no TBB::tbbmalloc target); disable KaMinPar's optional tbbmalloc use. | ||
| "KAMINPAR_ENABLE_TBB_MALLOC OFF" | ||
| # Large LP constraint graphs can exceed 2^31 directed edges. | ||
| "KAMINPAR_64BIT_EDGE_IDS ON" | ||
| "INSTALL_KAMINPAR OFF" | ||
| # Build KaMinPar as a STATIC library that is embedded into libcuopt.so (linked | ||
| # by file in cpp/CMakeLists.txt). The wheel build configures with | ||
| # BUILD_SHARED_LIBS=ON; without this override KaMinPar would build a separate | ||
| # libKaMinPar.so that is neither embedded nor shipped in the wheel. Forcing PIC | ||
| # is required so the static objects can be linked into the shared libcuopt.so | ||
| # (KaMinPar's KaMinParCommon OBJECT lib otherwise lacks -fPIC). | ||
| "BUILD_SHARED_LIBS OFF" | ||
| "CMAKE_POSITION_INDEPENDENT_CODE ON" | ||
| ) | ||
| | ||
| if(KaMinPar_ADDED) | ||
| message(VERBOSE "CUOPT: Using KaMinPar located in ${KaMinPar_SOURCE_DIR}") | ||
| # KaMinPar's public header pulls in <tbb/global_control.h>. On older TBB releases | ||
| # that header is gated behind TBB_PREVIEW_GLOBAL_CONTROL (KaMinPar upstream assumes a | ||
| # newer oneTBB and never defines it). Define it on KaMinParCommon PUBLIC so it | ||
| # propagates to all KaMinPar translation units (KaMinPar links KaMinParCommon PUBLIC). | ||
| # Harmless on newer oneTBB where global_control is no longer a preview feature. | ||
| # Also force PIC on every KaMinPar target (the KaMinParCommon OBJECT library does not | ||
| # reliably inherit CMAKE_POSITION_INDEPENDENT_CODE) so the static archive can be | ||
| # embedded into the shared libcuopt.so. | ||
| foreach(_kaminpar_tgt KaMinParCommon KaMinPar KaMinParIO) | ||
| if(TARGET ${_kaminpar_tgt}) | ||
| set_target_properties(${_kaminpar_tgt} PROPERTIES POSITION_INDEPENDENT_CODE ON) | ||
| endif() | ||
| endforeach() | ||
| if(TARGET KaMinParCommon) | ||
| target_compile_definitions(KaMinParCommon PUBLIC TBB_PREVIEW_GLOBAL_CONTROL) | ||
| endif() | ||
| else() | ||
| message(VERBOSE "CUOPT: Using KaMinPar located in ${KaMinPar_DIR}") | ||
| endif() | ||
| endfunction() | ||
| | ||
| find_and_configure_kaminpar(VERSION 3.7.3 PINNED_TAG v3.7.3) |
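
The `KAMINPAR_64BIT_EDGE_IDS ON` option in the file above is justified by constraint graphs that can exceed 2^31 directed edges. As a rough illustration only, the sketch below assumes (this is not stated in the PR) that each nonzero of the constraint matrix becomes one undirected edge stored as two directed edges. `directed_edge_count`, `k_edge_limit`, and `needs_64bit_edge_ids` are hypothetical names, not cuOpt or KaMinPar APIs.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration; not part of cuOpt or KaMinPar.

// Assumption: each nonzero of the constraint matrix contributes one undirected
// edge, which an adjacency-array graph stores as two directed edges.
constexpr std::uint64_t directed_edge_count(std::uint64_t nnz) { return 2 * nnz; }

// The 2^31 limit is the figure quoted in the CMake comment.
constexpr std::uint64_t k_edge_limit = std::uint64_t{1} << 31;

// True when the directed edge count exceeds the quoted limit.
constexpr bool needs_64bit_edge_ids(std::uint64_t nnz)
{
  return directed_edge_count(nnz) > k_edge_limit;
}

// 1.0e9 nonzeros -> 2.0e9 directed edges, still below 2^31 = 2,147,483,648.
static_assert(!needs_64bit_edge_ids(1'000'000'000ULL), "fits under the limit");
// 1.1e9 nonzeros -> 2.2e9 directed edges, above 2^31.
static_assert(needs_64bit_edge_ids(1'100'000'000ULL), "exceeds the limit");
```

Under this assumption, a matrix with a little over a billion nonzeros already crosses the limit, which is consistent with enabling 64-bit edge IDs unconditionally rather than per problem.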