@ankitlade12
Contributor

  • Add ArcSinhTransformer class with loc and scale parameters
  • Support for positive and negative values (unlike LogTransformer)
  • Includes inverse_transform method
  • Add comprehensive tests with pytest parametrize
  • Add user guide documentation
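
For readers of this thread, a minimal sketch of the transformation being proposed. It assumes the common parametrisation arcsinh((x - loc) / scale); the PR description does not spell out the formula, and the function names below are hypothetical, not the transformer's API.

```python
import math


def arcsinh_forward(x, loc=0.0, scale=1.0):
    # Assumed form: shift by loc, divide by scale, then apply arcsinh.
    return math.asinh((x - loc) / scale)


def arcsinh_inverse(y, loc=0.0, scale=1.0):
    # Undo the steps in reverse order: sinh, multiply by scale, add loc.
    return math.sinh(y) * scale + loc


# Negative values, zero, and positive values are all valid inputs.
values = [-250.0, -1.0, 0.0, 1.0, 250.0]
transformed = [arcsinh_forward(v, loc=0.0, scale=10.0) for v in values]
recovered = [arcsinh_inverse(t, loc=0.0, scale=10.0) for t in transformed]

for v, t, r in zip(values, transformed, recovered):
    print(f"x={v:8.1f}  arcsinh={t:8.4f}  inverse={r:8.1f}")
```

Because sinh is the exact inverse of arcsinh, the round trip recovers the inputs up to floating-point error.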

Collaborator

@solegalli left a comment

Hi @ankitlade12

This transformer is ready to go (almost ;) )

We need a few additions:

In the user guide, could we say a bit more about how this transformer compares to the log and arcsin transformers, so that users understand when to use this one and when to use the others?

It might help to have a demo comparing the 3 transformers, highlighting when this one works and the others don't, and vice versa.
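
A rough sketch of what such a comparison could check, using plain Python rather than the transformers themselves. It assumes the arcsin variant is arcsin(sqrt(x)), the form usually applied to proportions; the sample values and helper name are made up for illustration.

```python
import math


def try_transform(func, x):
    # Return the transformed value, or None when x is outside the domain.
    try:
        return func(x)
    except ValueError:
        return None


candidates = {
    "log": math.log,                                   # needs x > 0
    "arcsin_sqrt": lambda x: math.asin(math.sqrt(x)),  # needs 0 <= x <= 1
    "arcsinh": math.asinh,                             # any real x
}

samples = [-3.0, 0.0, 0.25, 50.0]
results = {
    name: [try_transform(func, x) for x in samples]
    for name, func in candidates.items()
}

for name, outputs in results.items():
    shown = ["undefined" if o is None else f"{o:.4f}" for o in outputs]
    print(f"{name:12s}", shown)
```

Only arcsinh returns a value for every sample; the log rejects zero and negatives, and arcsin of the square root rejects anything outside [0, 1].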

It would also be helpful to explain why we should shift and scale the data with loc and scale: not just the effect on the values, but an intuitive explanation for someone who is less familiar with transformations and wants to decide whether to apply this to their data.
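
One possible intuition, sketched under the assumption that the transformer computes arcsinh((x - loc) / scale): arcsinh is close to the identity for inputs much smaller than 1 in magnitude and close to a signed log for inputs much larger than 1, so scale would set the size of x at which the behaviour switches from roughly linear to roughly logarithmic, and loc would set the centre of that linear region. The numbers below are illustrative only.

```python
import math


def arcsinh_scaled(x, loc=0.0, scale=1.0):
    # Assumed parametrisation; not confirmed by the PR text.
    return math.asinh((x - loc) / scale)


scale = 100.0
small_x = 1.0        # much smaller than scale: near-linear regime
large_x = 10_000.0   # much larger than scale: near-logarithmic regime

small_exact = arcsinh_scaled(small_x, scale=scale)
small_linear = small_x / scale                 # arcsinh(z) ~ z for small |z|

large_exact = arcsinh_scaled(large_x, scale=scale)
large_log = math.log(2 * large_x / scale)      # arcsinh(z) ~ log(2z) for large z

print(f"x={small_x}: arcsinh={small_exact:.6f}, linear approx={small_linear:.6f}")
print(f"x={large_x}: arcsinh={large_exact:.6f}, log approx={large_log:.6f}")
```

Under this reading, a scale close to the typical magnitude of the variable leaves the bulk of the data almost untouched and compresses only the tails.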

Could we add at the end of the user api file the links with more info suggested in the issue:
https://stats.stackexchange.com/questions/1444/how-should-i-transform-non-negative-data-including-zeros

https://blogs.worldbank.org/en/impactevaluations/interpreting-treatment-effects-inverse-hyperbolic-sine-outcome-variable-and

And maybe also a link to the article:
https://www.jstor.org/stable/2288929

@ankitlade12 force-pushed the add-arcsinh-transformer branch from 4037b58 to 201b032 on January 12, 2026 11:40
@ankitlade12 requested a review from solegalli on January 23, 2026 16:40
Collaborator

@solegalli left a comment

This transformer is technically good to go. However, the user guide needs a bit of work.

I went over the suggested articles, and my understanding is that this transformation is a good alternative when we have a variable whose distribution would benefit from a log transform but which also contains a lot of 0s, which the log can't handle. We need to discuss that in much more detail.
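
To make that use case concrete, here is a small sketch on synthetic data (not one of the imblearn datasets): a right-skewed variable where roughly 40% of the values are exactly 0. The mixture, the seed, and the `skewness` helper are assumptions chosen for illustration.

```python
import math
import random


def skewness(values):
    # Population skewness: third central moment over variance ** 1.5.
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


rng = random.Random(0)

# Hypothetical zero-inflated variable: 40% zeros, the rest lognormal.
data = [
    0.0 if rng.random() < 0.4 else rng.lognormvariate(3.0, 1.0)
    for _ in range(5000)
]

n_zeros = sum(1 for v in data if v == 0.0)   # math.log would fail on these
transformed = [math.asinh(v) for v in data]  # arcsinh(0) is simply 0

print(f"zeros in data: {n_zeros}")
print(f"skewness before: {skewness(data):.2f}")
print(f"skewness after:  {skewness(transformed):.2f}")
```

The zeros stay at 0 after the transformation, so they still form their own spike; the long right tail is what gets compressed.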

We need a suitable example (I think we can find one among the datasets from imblearn), where we transform the variable and show the resulting distribution.

In the example shown in this user guide, we achieve the opposite of what we want from a variance stabilizing transformation: we had a more or less normal distribution, and after the transformation, it looks like there are 2 distributions. I don't think that is what we want. That result, in fact, is probably one of the limitations of this transformation and a reason why it should not be applied blindly.
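
This effect can be reproduced on synthetic data, which may help when documenting the limitation. The sketch below is not the user guide's example; it assumes a normal variable centred on 0 with a standard deviation of 100 and no rescaling before arcsinh.

```python
import math
import random

rng = random.Random(0)

# Hypothetical, roughly normal variable centred on 0 with a wide spread.
raw = [rng.gauss(0.0, 100.0) for _ in range(10_000)]
transformed = [math.asinh(v) for v in raw]


def count_in(values, low, high):
    # Number of values falling in the half-open interval [low, high).
    return sum(1 for v in values if low <= v < high)


# Compare a bin around 0 with bins around arcsinh(+/-100), roughly +/-5.3.
near_zero = count_in(transformed, -0.5, 0.5)
left_mode = count_in(transformed, -5.5, -4.5)
right_mode = count_in(transformed, 4.5, 5.5)

print(f"values near 0:     {near_zero}")
print(f"values near -5:    {left_mode}")
print(f"values near +5:    {right_mode}")
```

Values far from 0 on either side are compressed logarithmically into two clusters, while the narrow band around 0 stays sparse, which reads as two separate distributions in a histogram. Dividing by a scale comparable to the spread before applying arcsinh keeps most values in the near-linear regime.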

I'll try to have a go at updating the user docs tomorrow. If you want to go first, by all means do, and I'll pick up from where you left off :)

Development

Successfully merging this pull request may close these issues.

pseudolog transformer

2 participants