Skip to content

Improve sysid robustness, prompted by #3284#3287

Open
kevinzakka wants to merge 6 commits into
google-deepmind:mainfrom
kevinzakka:sysid-issue-3284-fixes
Open

Improve sysid robustness, prompted by #3284#3287
kevinzakka wants to merge 6 commits into
google-deepmind:mainfrom
kevinzakka:sysid-issue-3284-fixes

Conversation

@kevinzakka
Copy link
Copy Markdown
Collaborator

@kevinzakka kevinzakka commented May 26, 2026

This PR depends on #3294. It adds 3 robustness aids to mujoco.sysid to hopefully alleviate issues like #3284. Specifically, it adds:

  • ParameterDict.move_off_bounds(): this function nudges any parameter starting at a box bound into the interior. Additionally, optimize will now warn you when it detects that a parameter is at a bound.
  • x_scale is now accepted as a kwarg in the mujoco solver. It rescales parameters with widely different magnitudes, a common source of ill-conditioning in sysid. The scipy backend already supported this.
  • check_conditinong=True is now an arg in optimize and it will warn you when the condition number of the Hessian exceeds 1e12, a heuristic we chose well below the float64 conditioning limit.

@yuvaltassa
Copy link
Copy Markdown
Collaborator

I agree with @aftersomemath , the residual scaling is definitely a mujoco.minimize feature, and a good one too.
Do you mind splitting this out into a pre-PR?

Per-parameter scaling via change of variables z = x / D. Supports
'jac' (adaptive D_i = 1/||J(:,i)|| per iteration, matches scipy's TRF),
explicit array, or a positive scalar. Default 1.0 is a no-op.
@kevinzakka kevinzakka force-pushed the sysid-issue-3284-fixes branch from 4878167 to 1327654 Compare May 27, 2026 00:25
Copy link
Copy Markdown
Contributor

@aftersomemath aftersomemath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments.

Comment thread python/mujoco/sysid/_src/parameter.py Outdated
rng = hi - lo
safe_rng = np.where(rng > 0, rng, 1.0)
v = self.as_vector().copy()
at_lo = (v - lo) <= 1e-3 * safe_rng
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1e-3 should probably be a parameter as well.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

assert (result_dir / "arm.xml").exists()


def _rank_1_residual_fn(x, p):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, this residual function has a rank 1 Jacobian, but only because its output is scalar. A better test would be a function with 2 outputs, that depends on only one variable. Then the jacobian will be 2x2 and rank 1.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment thread python/mujoco/sysid/_src/optimize.py Outdated
x0: np.ndarray,
residual_fn: Callable[..., Any],
bounds: tuple[np.ndarray, np.ndarray],
x_scale: XScale = 1.0,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed if we are relying on **kwargs as the doc string suggests.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment thread python/mujoco/sysid/_src/optimize.py Outdated
verbose: If True, log parameter comparison table after optimization.
**optimizer_kwargs: Forwarded to the backend (e.g. ``max_iters``,
``verbose``, ``loss``).
check_conditioning: If True, estimate ``cond(JᵀJ)`` at the starting
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it ok to use unicode like supscript T?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment thread python/mujoco/sysid/_src/optimize.py Outdated
lo, hi = bounds
rng = hi - lo
safe_rng = np.where(rng > 0, rng, 1.0)
at_bound = ((x0 - lo) <= 1e-3 * safe_rng) | ((hi - x0) <= 1e-3 * safe_rng)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably check that x0 is within the bounds before executing this check

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment thread python/mujoco/sysid/_src/optimize.py Outdated
* ``loss``: scipy loss function name (scipy backends only).
* ``x_scale``: per-parameter scaling. ``"jac"`` is adaptive
``D_i = 1/||J(:,i)||`` per iteration; an explicit array or
positive scalar is used as ``D`` directly. Defaults: ``"jac"``
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

optimizer="scipy", verbose=False, check_conditioning=True,
max_iters=1,
)
fired = any("cond(JᵀJ)" in r.message for r in caplog.records)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unicode?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment thread python/mujoco/sysid/_src/optimize.py Outdated
residual_fn: Callable[..., Any],
threshold: float = 1e12,
) -> None:
"""Warn if cond(JᵀJ) at the starting point exceeds ``threshold``.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unicode?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment thread python/mujoco/sysid/_src/optimize.py Outdated
residuals, _, _ = residual_fn(x, initial_params)
return np.concatenate(residuals)

eps = np.finfo(np.float64).eps ** 0.5
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eps should be the same as what will be passed to mujoco/scipy. This is their default, but the user could override.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment thread python/mujoco/sysid/_src/optimize.py Outdated
)
# eigvalsh on the gram matrix so rank-deficient directions are visible
# when n_params > n_residual components (SVD would drop to min(m, n)).
ev = np.maximum(np.linalg.eigvalsh(jac.T @ jac), 0.0)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this better than using Numpy's condition number function? https://numpy.org/doc/stable/reference/generated/numpy.linalg.cond.html

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

ParameterDict.move_off_bounds() shifts at-bound values into the interior;
optimize() warns when any non-frozen component starts at a bound.
mujoco.minimize.least_squares now supports x_scale natively, so the
sysid wrapper becomes a thin pass-through. 'jac' is adaptive per
iteration (was static at x0).
Pass check_conditioning=True to FD the Jacobian at the starting point
and warn if cond(J^T J) suggests numerical ill-conditioning.
@kevinzakka kevinzakka force-pushed the sysid-issue-3284-fixes branch from 1327654 to 857b55f Compare May 27, 2026 20:08
@aftersomemath
Copy link
Copy Markdown
Contributor

LGTM. Just need to answer the outstanding question about mu adaptation in #3294.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants