Recipe Modularization, DAG Integration, and BigQuery Integration #2715
DannyLiCom wants to merge 4 commits into main
Conversation
Force-pushed from a961644 to f2a46a6
Force-pushed from f2a46a6 to 3840670
Force-pushed from 8adde2d to b08ea95
def handle_cmd_args(cluster_config: XpkClusterConfig, *actions: str, **kwargs) -> bool:
def handle_cmd_args(cluster_config: XpkClusterConfig, is_delete: bool, user: str, **kwargs) -> bool:
Since the original recipe could only accept the --delete flag, this section had to be modified to allow it to accept multiple flags.
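A minimal sketch of what the updated helper might look like, assuming it simply branches on the explicit flags (the body and the call below are illustrative; only the signature shape comes from the diff):

```python
def handle_cmd_args(cluster_config, is_delete: bool, user: str, **kwargs) -> bool:
  """Illustrative only: return False when the recipe should stop after the action."""
  if is_delete:
    # The real helper would tear down the XPK cluster/workloads here.
    print(f"Deleting resources for cluster: {cluster_config}")
    return False
  if user:
    print(f"Running recipe for user: {user}")
  return True


# Hypothetical usage mirroring the new call site in this PR:
should_continue = handle_cmd_args("my-cluster", is_delete=False, user="danny")
print(should_continue)
```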
parser_utils.add_arguments(parser)
args = parser.parse_args()

if len(sys.argv) > 2:
This logic determines whether multiple command-line flags are being passed.
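A rough sketch of how this check could select between the two code paths, assuming the recipe falls back to its built-in defaults when no extra flags are supplied (flag names and messages are placeholders):

```python
import argparse
import sys


def main() -> None:
  parser = argparse.ArgumentParser()
  parser.add_argument("--delete", action="store_true")
  parser.add_argument("--user", type=str, default="")
  args = parser.parse_args()

  # sys.argv[0] is the script name, so more than two entries means at least
  # two extra tokens were passed (e.g. "recipe.py --delete --user=alice").
  if len(sys.argv) > 2:
    print("Multiple command-line flags detected:", args)
  else:
    print("Running with the recipe's default configuration.")


if __name__ == "__main__":
  main()
```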
Force-pushed from 0303bcf to 78b8073
Force-pushed from ce0fc18 to c37edc0
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Force-pushed from c37edc0 to 57a9040
This PR has been automatically marked as stale because it has not had recent activity. It will be closed soon if no further activity occurs. Thank you for your contributions.

This PR was closed because it has been inactive for a while. Please reopen it if you are still working on it.
Force-pushed from 57a9040 to 991aaf6
from MaxText.layers import nnx_wrappers
from MaxText.layers.initializers import variable_to_logically_partitioned
from MaxText.layers.quantizations import AqtQuantization as Quant
from MaxText.sharding import logical_to_mesh_axes, maybe_shard_with_name
I moved it because Pylint failed.
from MaxText.sharding import logical_to_mesh_axes
from MaxText.layers import attentions, linears, nnx_wrappers, quantizations
from MaxText.layers.initializers import NdInitializer, default_bias_init, nd_dense_init, variable_to_logically_partitioned
from maxtext.kernels import megablox as mblx
I moved it because Pylint failed.
done

echo "Successfully clean up all codes."
echo "Successfully clean up all codes." (no newline at end of file)
I’m not sure why this was flagged as a change; it looks exactly the same to me.
You can remove the unintended change from the commit with `git restore <path/to/file>`.
LGTM, but I would verify that the imports at the top of the files still work properly. A lot of these changes predate the MaxText modularization refactor, so they may need to be rechecked.
aireenmei left a comment
Thanks for the work! I ran the PR through AI and got these comments: https://paste.googleplex.com/6085130735190016. I looked at some of the points and they make sense to me, so I left inline comments. There are more points, though; could you take a look?
| "--selected_model_framework", | ||
| type=parse_str_list, | ||
| default=["pathways"], | ||
| help="List of model frameworks (e.g., pathways, mcjax", |
Need a closing ")" for help message.
| "--selected_model_names", | ||
| type=parse_str_list, | ||
| default=["llama3_1_8b_8192_v5e_256"], | ||
| help="List of model names (e.g., llama3_1_8b_8192_v5e_256, llama2-7b-v5e-256", |
Need a closing ")" for help message.
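For reference, a minimal sketch of what a comma-separated list parser like `parse_str_list` could look like, together with the help strings including the missing closing parentheses. The parser body is an assumption; only the flag names and defaults come from the diff.

```python
import argparse


def parse_str_list(value: str) -> list[str]:
  """Hypothetical helper: split a comma-separated CLI value into a list of strings."""
  return [item.strip() for item in value.split(",") if item.strip()]


parser = argparse.ArgumentParser()
parser.add_argument(
    "--selected_model_framework",
    type=parse_str_list,
    default=["pathways"],
    help="List of model frameworks (e.g., pathways, mcjax)",
)
parser.add_argument(
    "--selected_model_names",
    type=parse_str_list,
    default=["llama3_1_8b_8192_v5e_256"],
    help="List of model names (e.g., llama3_1_8b_8192_v5e_256, llama2-7b-v5e-256)",
)
args = parser.parse_args(["--selected_model_framework=pathways,mcjax"])
print(args.selected_model_framework)  # ['pathways', 'mcjax']
```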
user_configs.USER_CONFIG.headless = False
should_continue = helper.handle_cmd_args(
    user_configs.USER_CONFIG.cluster_config, helper.DELETE, xpk_path=user_configs.USER_CONFIG.xpk_path
    user_configs.USER_CONFIG.cluster_config, user_configs.USER_CONFIG.DELETE, user_configs.USER_CONFIG.user
Should this be user_configs.USER_CONFIG.delete (lowercase)?
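If `UserConfig` exposes the flag as a lowercase attribute (an assumption based on the question above), the corrected call site would look roughly like this. The dataclass fields and values are illustrative stand-ins, not the actual module.

```python
from dataclasses import dataclass


@dataclass
class UserConfig:
  """Illustrative stand-in for user_configs.USER_CONFIG."""
  cluster_config: str = "my-cluster"
  delete: bool = False  # lowercase attribute, as suggested in the review
  user: str = "danny"
  headless: bool = True


USER_CONFIG = UserConfig()
USER_CONFIG.headless = False

# Corrected call, assuming handle_cmd_args(cluster_config, is_delete, user):
# should_continue = helper.handle_cmd_args(
#     USER_CONFIG.cluster_config, USER_CONFIG.delete, USER_CONFIG.user
# )
```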
parser.add_argument(
    "--server_image",
    type=str,
    default="us-docker.pkg.dev/cloud-tpu-v2-images/pathways/proxy_server",
The default values of server_image and proxy_image look swapped.
🤖 Hi @aireenmei, I've received your request, and I'm working on it now! You can track my progress in the logs for more details.
This pull request introduces significant refactoring to the benchmarking recipes, modularizing argument parsing and improving configuration management. The changes make the recipes more flexible and easier to use from the command line, especially for DAG integration.
🔍 General Feedback
- The introduction of `parser_utils.py` is a good step towards centralizing command-line argument handling (see the sketch below).
- The use of a `UserConfig` class helps in managing configurations cleanly.
- The addition of `check_and_create_bucket` is a useful utility, though its error handling could be improved in the calling code.
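A rough sketch of the centralization idea behind `parser_utils.py`, assuming it exposes an `add_arguments` function that each recipe calls on its own parser. The specific flags shown are drawn from elsewhere in this PR; everything else is illustrative, not the actual module.

```python
# parser_utils.py -- illustrative sketch, not the module added in this PR
import argparse


def add_arguments(parser: argparse.ArgumentParser) -> None:
  """Attach the shared recipe flags to any recipe's own parser."""
  parser.add_argument("--delete", action="store_true", help="Tear down resources instead of running.")
  parser.add_argument("--user", type=str, default="", help="User running the recipe.")
  parser.add_argument("--benchmark_steps", type=int, default=20, help="Number of benchmark steps.")


# Each recipe then only needs:
parser = argparse.ArgumentParser()
add_arguments(parser)
args = parser.parse_args([])
print(args)
```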
parser.add_argument(
    "--colocated_python_image",
    type=str,
    default=None,
    help="Colocated Python image.",
)
parser.add_argument("--worker_flags", type=str, default="", help="Worker flags.")
parser.add_argument("--proxy_flags", type=str, default="", help="Proxy flags.")
parser.add_argument("--server_flags", type=str, default="", help="Server flags.")

# Model Configuration
parser.add_argument("--benchmark_steps", type=int, default=20, help="Number of benchmark steps.")
Suggested change:
parser.add_argument(
    "--colocated_python_image",
    type=str,
    default=None,
    help="Colocated Python image.",
)
parser.add_argument("--worker_flags", type=str, default="", help="Worker flags.")
parser.add_argument("--proxy_flags", type=str, default="", help="Proxy flags.")
parser.add_argument("--server_flags", type=str, default="", help="Server flags.")
# Model Configuration
parser.add_argument("--benchmark_steps", type=int, default=20, help="Number of benchmark steps.")
parser.add_argument(
    "--server_image",
    type=str,
    default="us-docker.pkg.dev/cloud-tpu-v2-images/pathways/server",
    help="Docker image for the server.",
)
parser.add_argument(
    "--proxy_image",
    type=str,
    default="us-docker.pkg.dev/cloud-tpu-v2-images/pathways/proxy_server",
    help="Docker image for the proxy server.",
)
Returns:
    An integer exit code from the workload execution (0 for success, non-zero for failure).
"""
storage_client = storage.Client(project=user_config.project)
check_and_create_bucket(
Suggested change:
Returns:
    An integer exit code from the workload execution (0 for success, non-zero for failure).
"""
storage_client = storage.Client(project=user_config.project)
bucket = check_and_create_bucket(
    storage_client,
    user_config.base_output_directory[5:].split("/")[0],
    user_config.region,
)
if not bucket:
    print("Exiting due to failure in GCS bucket creation or access.")
    return 1
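Since `check_and_create_bucket` is new in this PR and its return value drives the error handling suggested above (the caller strips the `gs://` prefix to get the bucket name), here is a rough sketch of what such a helper could look like with the google-cloud-storage client. The exact signature and behavior in the PR may differ.

```python
from google.api_core import exceptions as gcs_exceptions
from google.cloud import storage


def check_and_create_bucket(client: storage.Client, bucket_name: str, region: str):
  """Sketch: return the bucket if it exists or can be created, otherwise None."""
  bucket = client.lookup_bucket(bucket_name)  # None when the bucket does not exist
  if bucket is not None:
    return bucket
  try:
    return client.create_bucket(bucket_name, location=region)
  except gcs_exceptions.GoogleAPICallError as err:
    print(f"Failed to create bucket {bucket_name}: {err}")
    return None
```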
| """ | ||
| This module provides utility functions for Pathways-related benchmark recipes. | ||
|
|
||
| It includes helpers for building lists of model configurations based on user | ||
| selections and for generating `XpkClusterConfig` and `PathwaysConfig` objects. | ||
| """ | ||
|
|
There was a problem hiding this comment.
| """ | |
| This module provides utility functions for Pathways-related benchmark recipes. | |
| It includes helpers for building lists of model configurations based on user | |
| selections and for generating `XpkClusterConfig` and `PathwaysConfig` objects. | |
| """ | |
| """ | |
| This module provides utility functions for Pathways-related benchmark recipes. | |
| It includes helpers for building lists of model configurations based on user | |
| selections and for generating `XpkClusterConfig` and `PathwaysConfig` objects. | |
| """ |
🤖 I'm sorry @aireenmei, but I was unable to process your request. Please see the logs for more details.
Description
When integrating Recipe Modularization, DAG Integration, and BigQuery, the DAG will pass in a command that includes multiple command-line flags.
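For illustration only (the recipe path and flag values below are placeholders, not the command from the Tests section), a DAG task might assemble a single command string carrying several of the flags added in this PR:

```python
# Hypothetical example of a DAG composing one command with multiple flags.
recipe_flags = {
    "--selected_model_framework": "pathways",
    "--selected_model_names": "llama3_1_8b_8192_v5e_256",
    "--user": "danny",
}
command = "python3 recipe.py " + " ".join(f"{k}={v}" for k, v in recipe_flags.items())
print(command)
```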
Tests
Users can run tests with commands similar to the following:
Checklist
Before submitting this PR, please make sure (put X in square brackets):
- [ ] `gemini-review` label.