SageMaker AI models and MLflow for agent evaluation with Strands Agents SDK by dhegde-aws · Pull Request #4871 · aws/amazon-sagemaker-examples

dhegde-aws · 2026-01-28T04:02:01Z

This PR adds a new notebook demonstrating how to use SageMaker AI endpoints and MLflow
with the Strands Agents SDK for building observable, production-ready AI agents.

What's included

Deploy foundation models from SageMaker JumpStart as inference endpoints
Configure SageMaker AI endpoints with Strands Agents SDK using SageMakerAIModel
Set up SageMaker Managed MLflow for automatic agent tracing and observability
Implement A/B testing using SageMaker production variants (Qwen3-4B vs Qwen3-8B)
Evaluate agent performance using MLflow GenAI scorers (custom + built-in)
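
The A/B test listed above relies on SageMaker production variants. As a minimal sketch (not the notebook's actual code), the snippet below shows how two variants might be declared and how their weights translate into traffic shares. The model names, variant names, and instance type are hypothetical. The dict shape follows the ProductionVariants parameter of boto3's create_endpoint_config, and TargetVariant is the boto3 invoke_endpoint parameter for pinning a request to one variant. The boto3 calls sit inside functions with lazy imports, so nothing touches AWS unless you call them.

```python
from typing import Dict, List


def build_production_variants(
    model_weights: Dict[str, float],
    instance_type: str = "ml.g5.2xlarge",  # assumed instance type, not from the PR
) -> List[dict]:
    """Build a ProductionVariants list, one variant per (hypothetical) model name."""
    return [
        {
            "VariantName": f"{model_name}-variant",
            "ModelName": model_name,
            "InstanceType": instance_type,
            "InitialInstanceCount": 1,
            "InitialVariantWeight": weight,
        }
        for model_name, weight in model_weights.items()
    ]


def traffic_share(variants: List[dict]) -> Dict[str, float]:
    """Fraction of untargeted traffic each variant receives (weight / sum of weights)."""
    total = sum(v["InitialVariantWeight"] for v in variants)
    return {v["VariantName"]: v["InitialVariantWeight"] / total for v in variants}


def create_ab_endpoint_config(config_name: str, variants: List[dict]) -> dict:
    """Register the variants with SageMaker. Requires AWS credentials and existing models."""
    import boto3  # imported lazily so the helpers above run without boto3

    client = boto3.client("sagemaker")
    return client.create_endpoint_config(
        EndpointConfigName=config_name, ProductionVariants=variants
    )


def invoke_variant(endpoint_name: str, variant_name: str, payload: bytes) -> bytes:
    """Send one request to a specific variant, bypassing the weighted split."""
    import boto3

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=payload,
        TargetVariant=variant_name,
    )
    return response["Body"].read()


# Equal weights give an even split between the two hypothetical models.
variants = build_production_variants({"qwen3-4b": 1.0, "qwen3-8b": 1.0})
print(traffic_share(variants))
```

Weights are relative rather than percentages, so 1.0 and 1.0 give the same split as 50 and 50.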

Why SageMaker AI endpoints

Full infrastructure control over compute, networking, and scaling
Deploy custom/fine-tuned models or open-source alternatives
Cost predictability with reserved instances
Native MLflow integration for enterprise MLOps

Key SageMaker + MLflow features demonstrated

JumpStartModel for quick model deployment
Production variants for traffic splitting
target_variant parameter for controlled experiments
mlflow.strands.autolog() for automatic trace capture
mlflow.genai.evaluate() with Correctness and custom scorers
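
A rough sketch of how the tracing and evaluation items above could fit together. mlflow.strands.autolog() and mlflow.genai.evaluate() with Correctness are named in this PR; the dataset row, the mentions_expected_response scorer, and the predict_fn argument are hypothetical and are not taken from the notebook. MLflow is imported lazily, so the scoring logic itself runs without it. Running run_evaluation assumes a recent MLflow with GenAI support, a configured tracking server, and access to a judge model for Correctness.

```python
from typing import Any, Callable, Dict, List

# Hypothetical evaluation rows: "inputs" are passed to predict_fn as keyword
# arguments, and "expectations" are made available to the scorers.
EVAL_DATA: List[Dict[str, Any]] = [
    {
        "inputs": {"question": "What is the capital of France?"},
        "expectations": {"expected_response": "Paris"},
    },
]


def mentions_expected_response(outputs: Any, expectations: Dict[str, Any]) -> bool:
    """Custom scorer logic: pass if the expected text appears in the agent output."""
    expected = str(expectations["expected_response"]).lower()
    return expected in str(outputs).lower()


def run_evaluation(predict_fn: Callable[..., Any]):
    """Evaluate predict_fn with one built-in and one custom scorer.

    predict_fn is assumed to accept question=... and return the agent's answer.
    """
    import mlflow  # lazy import: only needed when the evaluation is actually run
    from mlflow.genai.scorers import Correctness, scorer

    # Capture Strands agent traces automatically, as described in the PR.
    mlflow.strands.autolog()

    # Wrap the plain function so MLflow treats it as a scorer.
    custom_scorer = scorer(mentions_expected_response)

    return mlflow.genai.evaluate(
        data=EVAL_DATA,
        predict_fn=predict_fn,
        scorers=[Correctness(), custom_scorer],
    )


# The custom scoring logic can be checked on its own, without MLflow or an endpoint.
print(mentions_expected_response("The capital is Paris.", EVAL_DATA[0]["expectations"]))
```

Keeping the custom check as a plain function makes it easy to unit test before it is registered as a scorer.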

Testing done

Tested the whole notebook in SageMaker AI Studio JupyterLab.

Merge Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.

[x] I have verified that my PR does not contain any new notebook/s which demonstrate a SageMaker functionality already showcased by another existing notebook in the repository
[x] I have read the CONTRIBUTING doc and adhered to the guidelines regarding folder placement, notebook naming convention and example notebook best practices
[ ] I have updated the necessary documentation, including the README of the appropriate folder as well as the index.rst file
[x] I have tested my notebook(s) and ensured it runs end-to-end
[ ] I have linted my notebook(s) and code using python3 -m black -l 100 {path}/{notebook-name}.ipynb

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

…s SDK with models deployed on SageMaker AI endpoints and MLflow observability. Covers SageMaker JumpStart model deployment, agent tracing with MLflow, A/B testing with production variants, and evaluation using MLflow GenAI scorers.

review-notebook-app · 2026-01-28T04:02:07Z

Check out this pull request on ReviewNB.

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

dhegde-aws · 2026-01-31T19:15:50Z

@aviruthen @monamo19 - Could you please review and merge this PR? I created this sample in support of a blog post I am writing, and it has been approved in tech review.

Updated ml_ops/README.md to refer to the notebook in sm-mlflow_eval

c44d578

added new trace screenshot and updated deletion code in notebook

0e9a4f9

mollyheamazon approved these changes Feb 2, 2026

mollyheamazon merged commit f9712cd into aws:default Feb 2, 2026
1 check passed

mollyheamazon merged 3 commits into aws:default from dhegde-aws:strands-mlflow-sagemaker-models
