Doc #19

3 changes: 2 additions & 1 deletion TODO
Original file line number Diff line number Diff line change
Expand Up @@ -3,5 +3,6 @@
- [ ] Add docstrings to constants. Simply add "///" instead of "//" ? Find how to transfer comments from doxygen xml to breathe
- [x] Welcome page: better integrate code lines (`pip install...`). Also move section "Note for developers" into a more appropriate section ?
- [ ] Move objects description from "tutorials.rst" into "Usage" section
- [ ] Ensure that warnings are raise is non-existing keyword arguments are used in functions (e.g. Nbiteration in place of NbIteration)
- [ ] Ensure that warnings are raised if non-existing keyword arguments are used in functions (e.g. Nbiteration in place of NbIteration)
- [ ] Have wrapper documentation be transferred to python module.
- [ ] Replace hard-coded relative paths by Path().parent etc.
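One possible way to address the keyword-argument item in this TODO is a small decorator that compares the keywords passed in against the wrapped function's signature. This is only a sketch under stated assumptions: `warn_on_unknown_kwargs` and `estimate` are hypothetical names that do not exist in stat_tool, and the real wrappers may collect their options differently.

```python
import functools
import inspect
import warnings


def warn_on_unknown_kwargs(func):
    """Warn when a keyword argument is not declared by the wrapped function."""
    accepted = set(inspect.signature(func).parameters)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        known = {}
        for name, value in kwargs.items():
            if name in accepted:
                known[name] = value
            else:
                # Unknown keywords are dropped, but the caller is told about it
                warnings.warn(
                    f"unknown keyword argument ignored: {name!r}",
                    UserWarning,
                    stacklevel=2,
                )
        return func(*args, **known)

    return wrapper


@warn_on_unknown_kwargs
def estimate(data, NbIteration=100):
    # Hypothetical function, used only to illustrate the decorator
    return NbIteration
```

With this in place, calling `estimate(data, Nbiteration=5)` would emit a `UserWarning` and fall back to the default `NbIteration` instead of silently ignoring the misspelled keyword.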
1 change: 1 addition & 0 deletions conda/environment.yml
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@ dependencies:
- setuptools_scm
- cmake
- boost
- doxygen
- matplotlib-base
- pip
- pip:
Expand Down
1 change: 1 addition & 0 deletions doc/api/index.rst
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
API Reference
=============

.. toctree::
:maxdepth: 2
:caption: AML API:
Expand Down
55 changes: 53 additions & 2 deletions doc/examples/clustering.ipynb

Large diffs are not rendered by default.

10 changes: 10 additions & 0 deletions doc/user/autosum.rst
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,16 @@ Reference guide
.. contents::


C++ binding guide
=================
.. automodule:: openalea.stat_tool._stat_tool
:members:
:undoc-members:
:inherited-members:
:show-inheritance:
:private-members:



Data structures
===================
Expand Down
2 changes: 1 addition & 1 deletion pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -89,7 +89,7 @@ doc = [
# section specific to conda-only distributed package (not used by pip yet)
[tool.conda.environment]
channels = ["openalea3", "conda-forge"]
dependencies = ["boost", "matplotlib-base", "doxygen"]
dependencies = ["boost", "matplotlib-base"]

[project.urls]
Repository = "https://github.com/openalea/stat_tool"
Expand Down
27 changes: 17 additions & 10 deletions src/cpp/stat_tool/compound.h
Original file line number Diff line number Diff line change
Expand Up @@ -53,16 +53,23 @@ namespace stat_tool {
* Constants
*/


const double COMPOUND_THRESHOLD = 0.99999; // threshold on the cumulative distribution function
// for determining the upper bound of the support

const double COMPOUND_INIT_PROBABILITY = 0.001; // threshold for probability initialization
const double COMPOUND_LIKELIHOOD_DIFF = 1.e-5; // threshold for stopping EM iterations
const int COMPOUND_NB_ITER = 10000; // maximum number of EM iterations
const double COMPOUND_DIFFERENCE_WEIGHT = 0.5; // default penalty weight (1st- or 2nd-order difference cases)
const double COMPOUND_ENTROPY_WEIGHT = 0.1; // default penalty weight (entropy case)
const int COMPOUND_COEFF = 10; // rounding coefficient for the estimator
/// threshold on the cumulative distribution function
/// for determining the upper bound of the support
/// in compound distributions
const double COMPOUND_THRESHOLD = 0.99999;

/// threshold for probability initialization in compound distributions
const double COMPOUND_INIT_PROBABILITY = 0.001;
/// threshold for stopping EM iterations in compound distributions
const double COMPOUND_LIKELIHOOD_DIFF = 1.e-5;
/// maximum number of EM iterations in compound distributions
const int COMPOUND_NB_ITER = 10000;
/// default penalty weight (1st- or 2nd-order difference cases) in compound distributions
const double COMPOUND_DIFFERENCE_WEIGHT = 0.5;
/// default penalty weight (entropy case) in compound distributions
const double COMPOUND_ENTROPY_WEIGHT = 0.1;
/// rounding coefficient for the estimator in compound distributions
const int COMPOUND_COEFF = 10;
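To illustrate how a likelihood-difference threshold and an iteration cap such as `COMPOUND_LIKELIHOOD_DIFF` and `COMPOUND_NB_ITER` typically bound an EM loop, here is a minimal Python sketch. It is not the stat_tool estimator: the model (estimating the mixing weight of two known distributions), the function name `em_mixing_weight`, and the relative form of the stopping test are all assumptions made for illustration.

```python
import math

LIKELIHOOD_DIFF = 1.e-5  # same value as COMPOUND_LIKELIHOOD_DIFF
NB_ITER = 10000          # same value as COMPOUND_NB_ITER


def em_mixing_weight(data, pmf_a, pmf_b, weight=0.5):
    """Estimate the weight of pmf_a in a two-component mixture by EM."""
    previous = None
    for iteration in range(1, NB_ITER + 1):
        # E-step: mixture mass of each observation and posterior of component a
        mix = [weight * pmf_a[x] + (1.0 - weight) * pmf_b[x] for x in data]
        log_likelihood = sum(math.log(m) for m in mix)
        posterior = [weight * pmf_a[x] / m for x, m in zip(data, mix)]

        # M-step: the new weight is the mean posterior probability
        weight = sum(posterior) / len(data)

        # Stop when the relative likelihood gain falls below the threshold
        # (assumed form of the test; the cap NB_ITER bounds the loop otherwise)
        if previous is not None and (
            log_likelihood - previous <= LIKELIHOOD_DIFF * abs(log_likelihood)
        ):
            break
        previous = log_likelihood
    return weight, log_likelihood, iteration


pmf_a = {0: 0.8, 1: 0.2}
pmf_b = {0: 0.3, 1: 0.7}
data = [0] * 60 + [1] * 40
weight, log_likelihood, iteration = em_mixing_weight(data, pmf_a, pmf_b)
```

The threshold controls precision and the cap guarantees termination; on this toy data the loop stops long before the cap is reached.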



Expand Down
3 changes: 2 additions & 1 deletion src/cpp/stat_tool/continuous_parametric_process.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -60,8 +60,9 @@ using namespace boost::math;

namespace stat_tool {


/// default quantiles in continuous parametric processes
const static double bilateral_tail[7] = {0.05, 0.025, 0.01, 0.005, 0.0025, 0.001, 0.0005};
/// default probability thresholds in continuous parametric processes
const static double posterior_threshold[7] = {0.25, 0.1, 0.05, 0.025, 0.01, 0.005, 0.0025};


Expand Down
28 changes: 18 additions & 10 deletions src/cpp/stat_tool/convolution.h
Original file line number Diff line number Diff line change
Expand Up @@ -54,16 +54,24 @@ namespace stat_tool {
*/


const int CONVOLUTION_NB_DISTRIBUTION = 10; // maximum number of elementary distributions
const double CONVOLUTION_THRESHOLD = 0.9999; // threshold on the cumulative distribution function
// for determining the upper bound of the support

const double CONVOLUTION_INIT_PROBABILITY = 0.001; // threshold for probability initialization
const double CONVOLUTION_LIKELIHOOD_DIFF = 1.e-5; // threshold for stopping EM iterations
const int CONVOLUTION_NB_ITER = 10000; // maximum number of EM iterations
const double CONVOLUTION_DIFFERENCE_WEIGHT = 0.5; // default penalty weight (1st- or 2nd-order difference cases)
const double CONVOLUTION_ENTROPY_WEIGHT = 0.1; // default penalty weight (entropy case)
const int CONVOLUTION_COEFF = 10; // rounding coefficient for the estimator
/// maximum number of elementary distributions in convolutions
const int CONVOLUTION_NB_DISTRIBUTION = 10;
/// threshold on the cumulative distribution function
/// for determining the upper bound of the support
/// in convolutions
const double CONVOLUTION_THRESHOLD = 0.9999;
/// threshold for probability initialization in convolutions
const double CONVOLUTION_INIT_PROBABILITY = 0.001;
/// threshold for stopping EM iterations in convolutions
const double CONVOLUTION_LIKELIHOOD_DIFF = 1.e-5;
/// maximum number of EM iterations in convolutions
const int CONVOLUTION_NB_ITER = 10000;
/// default penalty weight (1st- or 2nd-order difference cases) in convolutions
const double CONVOLUTION_DIFFERENCE_WEIGHT = 0.5;
/// default penalty weight (entropy case) in convolutions
const double CONVOLUTION_ENTROPY_WEIGHT = 0.1;
/// rounding coefficient for the estimator in convolutions
const int CONVOLUTION_COEFF = 10;
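The "threshold on the cumulative distribution function for determining the upper bound of the support" can be pictured with a distribution whose support is unbounded. The sketch below uses a Poisson distribution and a hypothetical helper `poisson_upper_bound`; it shows the general idea only and is not taken from the stat_tool sources.

```python
import math

CONVOLUTION_THRESHOLD = 0.9999  # same value as the constant above


def poisson_upper_bound(mean, threshold=CONVOLUTION_THRESHOLD):
    """Smallest value whose cumulative probability reaches the threshold."""
    value = 0
    mass = math.exp(-mean)  # P(X = 0)
    cumul = mass
    while cumul < threshold:
        value += 1
        mass *= mean / value  # recurrence P(X = k) = P(X = k - 1) * mean / k
        cumul += mass
    return value


bound = poisson_upper_bound(2.0)
```

A threshold closer to 1 (such as the 0.99999 used for compound distributions) keeps more of the tail and therefore yields a bound that is at least as large.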



Expand Down
14 changes: 9 additions & 5 deletions src/cpp/stat_tool/curves.h
Original file line number Diff line number Diff line change
Expand Up @@ -52,11 +52,15 @@ namespace stat_tool {
*/


const int MAX_FREQUENCY = 50; // maximum frequency for smoothing the curves
const int MAX_RANGE = 2; // maximum half-width of the smoothing window

const int PLOT_NB_CURVE = 12; // maximum number of curves (Gnuplot output)
const int PLOT_MIN_FREQUENCY = 10; // minimum frequency for plotting curve points (Gnuplot output)
/// maximum frequency for smoothing curves
const int MAX_FREQUENCY = 50;
/// maximum half-width of the smoothing window
const int MAX_RANGE = 2;

/// maximum number of curves (Gnuplot output)
const int PLOT_NB_CURVE = 12;
/// minimum frequency for plotting curve points (Gnuplot output)
const int PLOT_MIN_FREQUENCY = 10;

enum curve_transformation {
CURVE_COPY ,
Expand Down
23 changes: 15 additions & 8 deletions src/cpp/stat_tool/discrete_mixture.h
Original file line number Diff line number Diff line change
Expand Up @@ -54,14 +54,21 @@ namespace stat_tool {
*/


const int DISCRETE_MIXTURE_NB_COMPONENT = 100; // maximum number of components

const double NEGATIVE_BINOMIAL_PARAMETER = 20.; // initial parameter for a negative binomial distribution
const double MIN_WEIGHT_STEP = 0.1; // minimum step for weight initialization
const double MAX_WEIGHT_STEP = 0.5; // maximum step for weight initialization
const int DISCRETE_MIXTURE_COEFF = 2; // rounding coefficient for the estimator
const double DISCRETE_MIXTURE_LIKELIHOOD_DIFF = 1.e-5; // threshold for stopping the EM iterations
const int DISCRETE_MIXTURE_NB_ITER = 500; // maximum number of EM iterations
/// maximum number of components for discrete mixtures
const int DISCRETE_MIXTURE_NB_COMPONENT = 100;

/// initial parameter for a negative binomial distribution
const double NEGATIVE_BINOMIAL_PARAMETER = 20.;
/// minimum step for weight initialization in discrete mixtures
const double MIN_WEIGHT_STEP = 0.1;
/// maximum step for weight initialization in discrete mixtures
const double MAX_WEIGHT_STEP = 0.5;
/// rounding coefficient for the estimator in discrete mixtures
const int DISCRETE_MIXTURE_COEFF = 2;
/// threshold for stopping the EM iterations in discrete mixtures
const double DISCRETE_MIXTURE_LIKELIHOOD_DIFF = 1.e-5;
/// maximum number of EM iterations in discrete mixtures
const int DISCRETE_MIXTURE_NB_ITER = 500;



Expand Down
1 change: 1 addition & 0 deletions src/cpp/stat_tool/discrete_parametric_process.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -138,6 +138,7 @@ DiscreteParametricProcess::DiscreteParametricProcess(int inb_state , DiscretePar
* \brief Copy of a DiscreteParametricProcess object.
*
* \param[in] process reference on a DiscreteParametricProcess object.
* \param[in] mass_copy flag indicating whether the probability masses are copied.
*/
/*--------------------------------------------------------------*/

Expand Down
27 changes: 17 additions & 10 deletions src/cpp/stat_tool/distance_matrix.h
Original file line number Diff line number Diff line change
Expand Up @@ -50,16 +50,23 @@ namespace stat_tool {
*
* Constants
*/

const int ASCII_NB_INDIVIDUAL = 10; // maximum number of individuals for displaying the results of
// individual alignments
const double PLOT_YMARGIN = 0.1; // y axis margin for the plotting of distances

const double DISTANCE_ROUNDNESS = 1.e-12; // distance rounding

const int GLOBAL_NB_ITER = 20; // number of iterations when the clusters are globally computed
const int PARTITIONING_NB_ITER_1 = 50; // maximum number of iterations
const int PARTITIONING_NB_ITER_2 = 20; // maximum number of iterations
/// maximum number of individuals for displaying the results of
/// individual alignments
const int ASCII_NB_INDIVIDUAL = 10;

/// y axis margin for the plotting of distances
const double PLOT_YMARGIN = 0.1;

/// distance rounding value
const double DISTANCE_ROUNDNESS = 1.e-12;

/// number of iterations when the clusters are globally computed
/// in hierarchical clustering
const int GLOBAL_NB_ITER = 20;
/// maximum number of iterations in clustering: partitioning variant 1
const int PARTITIONING_NB_ITER_1 = 50;
/// maximum number of iterations in clustering: partitioning variant 2
const int PARTITIONING_NB_ITER_2 = 20;

enum matrix_transform {
COPY ,
Expand Down
4 changes: 0 additions & 4 deletions src/cpp/stat_tool/distribution.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -583,10 +583,6 @@ ostream& Distribution::ascii_characteristic_print(ostream &os , bool shape , boo
/*--------------------------------------------------------------*/
/**
* \brief Set seed of random generator
*
* \param[in] seed
*
* \
*/
/*--------------------------------------------------------------*/

Expand Down
77 changes: 48 additions & 29 deletions src/cpp/stat_tool/markovian.h
Original file line number Diff line number Diff line change
Expand Up @@ -54,35 +54,54 @@ namespace stat_tool {
*/


const int NB_STATE = 100; // maximum number of states of a Markov chain
const int ORDER = 8; // maximum order of a Markov chain
const double MIN_PROBABILITY = 1.e-5; // minimum initial/transition/categorical observation probability
const double THRESHOLDING_FACTOR = 0.8; // factor for the thresholding of probabilities
const int NB_PARAMETER = 100000; // maximum number of parameters of a Markov chain
const int NB_OUTPUT_PROCESS = 15; // maximum number of observation processes
const int NB_OUTPUT = 25; // maximum number of observed categories per state (categorical case)
const double OBSERVATION_THRESHOLD = 0.999; // threshold on the cumulative distribution function for bounding
// a discrete parametric observation distribution

const double ACCESSIBILITY_THRESHOLD = 1.e-6; // threshold for stopping the probabilistic algorithm
// for computing state accessibility
const int ACCESSIBILITY_LENGTH = 100; // maximum sequence length for the probabilistic algorithm
// for computing state accessibility

const double NOISE_PROBABILITY = 0.05; // perturbation of observation probabilities
const double MEAN_SHIFT_COEFF = 0.1; // coefficient for shifting continuous observation distributions

const int MIN_NB_ELEMENT = 10; // minimum size of the sample built by rounding
const int OBSERVATION_COEFF = 10; // rounding coefficient for the parametric observation distribution estimator

const int GAMMA_MAX_NB_DECIMAL = 6; // maximum number of decimals for the simulation of a gamma distribution
const int INVERSE_GAUSSIAN_MAX_NB_DECIMAL = 6; // maximum number of decimals for the simulation
// of an inverse Gaussian distribution
const int GAUSSIAN_MAX_NB_DECIMAL = 6; // maximum number of decimals for the simulation of a Gaussian distribution
const int DEGREE_DECIMAL_SCALE = 10; // factor for determining the number of decimals
// for the simulation of a von Mises distribution in degrees
const int RADIAN_DECIMAL_SCALE = 1000; // factor for determining the number of decimals
// for the simulation of a von Mises distribution in radians
/// maximum number of states of a Markov chain
const int NB_STATE = 100;
/// maximum order of a Markov chain
const int ORDER = 8;
/// minimum initial/transition/categorical observation probability
const double MIN_PROBABILITY = 1.e-5;
/// factor for the thresholding of probabilities
const double THRESHOLDING_FACTOR = 0.8;
/// maximum number of parameters of a Markov chain
const int NB_PARAMETER = 100000;
/// maximum number of observation processes
const int NB_OUTPUT_PROCESS = 15;
/// maximum number of observed categories per state (categorical case)
const int NB_OUTPUT = 25;
/// threshold on the cumulative distribution function for bounding
/// a discrete parametric observation distribution
const double OBSERVATION_THRESHOLD = 0.999;

/// threshold for stopping the probabilistic algorithm
/// for computing state accessibility
const double ACCESSIBILITY_THRESHOLD = 1.e-6;
/// maximum sequence length for the probabilistic algorithm
/// for computing state accessibility
const int ACCESSIBILITY_LENGTH = 100;
/// perturbation of observation probabilities
const double NOISE_PROBABILITY = 0.05;
/// coefficient for shifting continuous observation distributions
const double MEAN_SHIFT_COEFF = 0.1;
/// minimum size of the sample built by rounding
const int MIN_NB_ELEMENT = 10;
/// rounding coefficient for the parametric observation distribution estimator
const int OBSERVATION_COEFF = 10;

/// maximum number of decimals for the simulation of a gamma distribution
const int GAMMA_MAX_NB_DECIMAL = 6;
/// maximum number of decimals for the simulation
/// of an inverse Gaussian distribution
const int INVERSE_GAUSSIAN_MAX_NB_DECIMAL = 6;

/// maximum number of decimals for the simulation of a Gaussian distribution
const int GAUSSIAN_MAX_NB_DECIMAL = 6;
/// factor for determining the number of decimals
/// for the simulation of a von Mises distribution in degrees
const int DEGREE_DECIMAL_SCALE = 10;
/// factor for determining the number of decimals
/// for the simulation of a von Mises distribution in radians
const int RADIAN_DECIMAL_SCALE = 1000;


// const double SELF_TRANSITION = 0.9; initial self-transition

Expand Down
20 changes: 14 additions & 6 deletions src/cpp/stat_tool/stat_tools.h
Original file line number Diff line number Diff line change
Expand Up @@ -81,10 +81,14 @@ namespace stat_tool {
PLOT
};

const int I_DEFAULT = -1; // default int
const double D_DEFAULT = -1.; // default double
const double D_INF = -1.e37; // smallest real number
const double DOUBLE_ERROR = 1.e-6; // error on a sum of doubles
/// default value for int
const int I_DEFAULT = -1;
/// smallest real number
const double D_INF = -1.e37;
/// error on a sum of doubles
const double DOUBLE_ERROR = 1.e-6;
/// default value for double
const double D_DEFAULT = -1.;
// const double DOUBLE_ERROR = 5.e-6; error on a sum of doubles
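A tolerance such as `DOUBLE_ERROR` is commonly used to check that floating-point probabilities sum to 1 despite rounding. The following Python sketch shows that kind of check; `is_normalized` is a hypothetical helper, and whether stat_tool applies the tolerance in exactly this way is an assumption.

```python
DOUBLE_ERROR = 1.e-6  # same value as the C++ constant


def is_normalized(masses, tolerance=DOUBLE_ERROR):
    """Return True if the masses sum to 1 within the given tolerance."""
    return abs(sum(masses) - 1.0) <= tolerance


uniform = [0.1] * 10        # the floating-point sum may differ slightly from 1.0
incomplete = [0.5, 0.4]     # genuinely sums to 0.9
```

An exact comparison with `== 1.0` could reject `uniform` because of accumulated rounding, whereas the tolerance accepts it and still rejects `incomplete`.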

enum test_distribution {
Expand All @@ -94,11 +98,15 @@ namespace stat_tool {
STUDENT
};

/// default bound on a number of multiple tests
const int NB_CRITICAL_PROBABILITY = 2;
/// default levels of tests
const double ref_critical_probability[NB_CRITICAL_PROBABILITY] = {0.05 , 0.01};

const int NB_VALUE = 1000; // number of values of a discrete variable
const int SAMPLE_NB_VALUE = NB_VALUE; // number of values of a discrete sample
/// number of values of a discrete variable
const int NB_VALUE = 1000;
/// number of values of a discrete sample
const int SAMPLE_NB_VALUE = NB_VALUE;

enum frequency_distribution_transformation {
FREQUENCY_DISTRIBUTION_COPY ,
Expand Down
2 changes: 1 addition & 1 deletion src/cpp/stat_tool/vectors.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3506,7 +3506,7 @@ Vectors* Vectors::remove_variable_1() const
*
* \param[in] error reference on a StatError object,
* \param[in] nb_summed_variable number of variables to be summed,
* \param[in] variable variable indices.
* \param[in] ivariable variable indices.
*
* \return Vectors object.
*/
Expand Down
4 changes: 2 additions & 2 deletions src/openalea/stat_tool/compound.py
Original file line number Diff line number Diff line change
Expand Up @@ -36,8 +36,8 @@ def Compound(*args, **kargs):
elementary distribution or from an ASCII file.

A compound (or stopped-sum) distribution is defined as the distribution
of the sum of n independent and identically distributed random variables :math:`X_i`
where `n` is the value taken by the random variable `N`. The distribution of N is referred
of the sum of `n` independent and identically distributed random variables :math:`X_i`
where `n` is the value taken by the random variable `N`. The distribution of :math:`N` is referred
to as the sum distribution while the distribution of the :math:`X_i` is referred to as
the elementary distribution.
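The definition above translates directly into a formula: P(S = s) is the sum over n of P(N = n) times the n-fold convolution of the elementary distribution evaluated at s. The pure-Python sketch below computes it for finite distributions given as lists of probabilities; `convolve` and `compound_pmf` are illustrative names, not part of the openalea.stat_tool API.

```python
def convolve(p, q):
    """Probability mass function of the sum of two independent variables."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def compound_pmf(sum_pmf, elementary_pmf):
    """Distribution of X_1 + ... + X_N, with N ~ sum_pmf and X_i ~ elementary_pmf."""
    n_max = len(sum_pmf) - 1
    result = [0.0] * (n_max * (len(elementary_pmf) - 1) + 1)
    power = [1.0]  # 0-fold convolution: point mass at 0
    for weight in sum_pmf:
        # Add P(N = n) times the n-fold convolution of the elementary distribution
        for s, mass in enumerate(power):
            result[s] += weight * mass
        power = convolve(power, elementary_pmf)
    return result


# N is 0, 1 or 2 with equal probability; each X_i is 0 or 1 with probability 0.5
pmf = compound_pmf([1 / 3, 1 / 3, 1 / 3], [0.5, 0.5])
```

The mean of the result equals E[N] times E[X_i], which is a convenient sanity check for any implementation.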

Expand Down
10 changes: 5 additions & 5 deletions src/openalea/stat_tool/convolution.py
Original file line number Diff line number Diff line change
Expand Up @@ -39,8 +39,8 @@ def Convolution(*args):
"""Construction of an object of type convolution from elementary
distributions or from an ASCII file.

The distribution of the sum of independent random variables is the
convolution of the distributions of these elementary random variables.
The convolution of the distributions of independent random variables
is the distribution of their sum.

:Parameters:
* dist1, dist2, ...(distribution, mixture, convolution, compound) -
Expand All @@ -64,9 +64,9 @@ def Convolution(*args):
:include-source:

from openalea.stat_tool import *
sum_dist = Binomial(0,10,0.5)
dist = Binomial(0,15,0.2)
c = Convolution(sum_dist, dist)
dist1 = Binomial(0,10,0.5)
dist2 = Binomial(0,15,0.2)
c = Convolution(dist1, dist2)
c.plot()
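For readers without the compiled library, the same convolution can be reproduced in plain Python. This sketch assumes that `Binomial(0,10,0.5)` means a binomial distribution with lower bound 0, upper bound 10 and probability 0.5; `binomial_pmf` and `convolve` are illustrative helpers, not stat_tool functions.

```python
from math import comb


def binomial_pmf(n, p):
    """Probabilities of 0..n successes in n trials with success probability p."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def convolve(p, q):
    """Probability mass function of the sum of two independent variables."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


dist1 = binomial_pmf(10, 0.5)
dist2 = binomial_pmf(15, 0.2)
pmf = convolve(dist1, dist2)
mean = sum(k * m for k, m in enumerate(pmf))
```

The support of the result runs from 0 to 25, and its mean is the sum of the two binomial means.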


Expand Down