Environment Details
- Copulas version: 0.12.2
- Python version: 3.11
- Operating System: Linux
Error Description
As first described in #469, it seems that whenever Copulas is asked to print parameters for a fitted GaussianKDE distribution, it just prints out a copy of the data that was fitted.
In the code below, the final column (column z) is fitted to a GaussianKDE distribution.
from copulas.datasets import sample_trivariate_xyz
from copulas.multivariate import GaussianMultivariate
data = sample_trivariate_xyz()
dist = GaussianMultivariate()
dist.fit(data)
parameters = dist.to_dict()
univariates = parameters['univariates']
print(univariates[2])
{'dataset': [0.638689008563623, 1.058121237066397, 0.3725063445214631, 0.687369594994837, -0.8810681732344304, -0.7121672205062004, 5.050261904362624, ...
'type': 'copulas.univariate.gaussian_kde.GaussianKDE'}
The data seems to be just the exact values in column z.
Expected Behavior
It's unexpected that the entire column's data would be reported at this step.
I would expect that when printing out the distribution, it would only show the 'type' of distribution and nothing else.
{ 'type': 'copulas.univariate.gaussian_kde.GaussianKDE' }
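In the meantime, a possible workaround is to drop the bulky key before printing. This is only a sketch: it assumes the dictionary layout shown in the output above (a 'dataset' list plus a 'type' string), and the helper name summarize_univariate is hypothetical, not part of Copulas.

```python
def summarize_univariate(univariate_dict, hidden_keys=('dataset',)):
    """Return a copy of a univariate dict without the keys listed in hidden_keys.

    Hypothetical helper: 'dataset' is the key observed in the output above.
    """
    return {
        key: value
        for key, value in univariate_dict.items()
        if key not in hidden_keys
    }


# Shortened stand-in for univariates[2]; the values are the first two from the output above.
example = {
    'dataset': [0.638689008563623, 1.058121237066397],
    'type': 'copulas.univariate.gaussian_kde.GaussianKDE',
}

print(summarize_univariate(example))
# prints {'type': 'copulas.univariate.gaussian_kde.GaussianKDE'}
```

The original dictionary is left untouched, so it can still be passed back to Copulas if the full state is needed.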
It seems like the "parameters" are set to the data in the fit portion:
def _fit(self, X):
    if self._sample_size:
        X = gaussian_kde(X, bw_method=self.bw_method, weights=self.weights).resample(
            self._sample_size
        )
    self._params = {'dataset': X.tolist()}
    self._model = self._get_model()
Ideally, the _params assigned to the GaussianKDE should be None, since GaussianKDE is a non-parametric distribution. Whatever info we need to save the state of the GaussianKDE should be saved under a different name and not exposed as parameters.
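To make the suggestion concrete, here is a rough sketch of that separation. It uses a hypothetical stand-in class, not the real GaussianKDE, and the names _dataset and include_state are assumptions chosen for illustration only.

```python
class KDELike:
    """Hypothetical stand-in for a non-parametric univariate (not the Copulas class)."""

    def __init__(self):
        self._params = None    # non-parametric: no parameters to report
        self._dataset = None   # assumed private attribute holding the fitted state

    def fit(self, X):
        # Keep the fitted values as internal state instead of as "parameters".
        self._dataset = [float(x) for x in X]

    def to_dict(self, include_state=False):
        # By default only the type is reported.
        out = {'type': type(self).__name__}
        if include_state:
            # The state is only exposed on request, e.g. for saving and reloading.
            out['state'] = {'dataset': list(self._dataset)}
        return out


kde = KDELike()
kde.fit([0.5, 1.5, 2.5])
print(kde.to_dict())
# prints {'type': 'KDELike'}
```

With this kind of split, printing the distribution stays short, while serialization code could still ask for the full state explicitly.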