Descriptive statistics

statinf.stats.descriptive.cov(x, y)

Compute the covariance between two variables.

Parameters
  • x (numpy.array) – Input variable. Format can be numpy.array, list or pandas.Series.

  • y (numpy.array) – Input variable. Format can be numpy.array, list or pandas.Series.

Formula
\[Cov(\mathbf{X}, \mathbf{Y}) = \dfrac{ \sum_{i = 1}^{n} (X_{i} - \bar{X}) (Y_{i} - \bar{Y}) }{n - 1}\]
Example

>>> from statinf import stats
>>> x = [0.023699, 0.021436, 0.0200109, 0.0202762, 0.0165271, 0.01027]
>>> y = [9.4228, 9.27951, 9.167963, 9.68820, 9.56490, 7.543]
>>> stats.cov(x, y)
0.003047229298620001
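The formula above can be checked by hand in a few lines of plain Python. This is a minimal sketch rather than statinf's own implementation; the helper name `sample_cov` is hypothetical and only the standard library is used.

```python
from math import fsum

def sample_cov(x, y):
    """Sample covariance with an n - 1 denominator (hypothetical helper)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 2:
        raise ValueError("at least two observations are required")
    x_bar = fsum(x) / n
    y_bar = fsum(y) / n
    # Sum of the products of deviations from each mean, divided by n - 1
    return fsum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

x = [0.023699, 0.021436, 0.0200109, 0.0202762, 0.0165271, 0.01027]
y = [9.4228, 9.27951, 9.167963, 9.68820, 9.56490, 7.543]
result = sample_cov(x, y)
print(result)
```

On the example data this agrees with the value shown above to floating-point precision.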
Reference
  • DeGroot, M. H., & Schervish, M. J. (2012). Probability and statistics. Pearson Education.

Returns

Covariance value.

Return type

float

statinf.stats.descriptive.pearson(x, y)

Compute Pearson’s correlation coefficient.

Parameters
  • x (numpy.array) – Input variable. Format can be numpy.array, list or pandas.Series.

  • y (numpy.array) – Input variable. Format can be numpy.array, list or pandas.Series.

Formula
\[\rho = \dfrac{ Cov(X, Y) }{\sigma_{X} \sigma_{Y}}\]

where \(\sigma_{Z} = \sqrt{\mathbb{V}(Z)}\)

Example

>>> from statinf import stats
>>> x = [0.023699, 0.021436, 0.0200109, 0.0202762, 0.0165271, 0.01027]
>>> y = [9.4228, 9.27951, 9.167963, 9.68820, 9.56490, 7.543]
>>> stats.pearson(x, y)
0.9750052703452801
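The ratio in the formula can be sketched in plain Python as follows. This is an illustration under stated assumptions, not statinf's implementation: the helper name `pearson_r` is hypothetical, and the covariance and both standard deviations are assumed to share the same normalisation, so the common factor cancels and only sums of deviations are needed.

```python
from math import fsum, sqrt

def pearson_r(x, y):
    """Pearson correlation from sums of deviations (hypothetical helper)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    x_bar = fsum(x) / n
    y_bar = fsum(y) / n
    # Cross-deviation and squared-deviation sums; the 1 / (n - 1) factors
    # of Cov(X, Y), V(X) and V(Y) cancel in the ratio.
    s_xy = fsum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = fsum((xi - x_bar) ** 2 for xi in x)
    s_yy = fsum((yi - y_bar) ** 2 for yi in y)
    return s_xy / sqrt(s_xx * s_yy)

x = [0.023699, 0.021436, 0.0200109, 0.0202762, 0.0165271, 0.01027]
y = [9.4228, 9.27951, 9.167963, 9.68820, 9.56490, 7.543]
r = pearson_r(x, y)
print(r)
```

The result depends on the normalisation convention: pairing a covariance divided by n - 1 with standard deviations divided by n multiplies the coefficient by n / (n - 1), so values computed under different conventions will not agree.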
Reference
  • DeGroot, M. H., & Schervish, M. J. (2012). Probability and statistics. Pearson Education.

Returns

Pearson’s correlation coefficient.

Return type

float

statinf.stats.descriptive.spearman(x, y)

Compute Spearman’s rank correlation coefficient.

Parameters
  • x (numpy.array) – Input variable. Format can be numpy.array, list or pandas.Series.

  • y (numpy.array) – Input variable. Format can be numpy.array, list or pandas.Series.

Formula
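\[\rho = 1 - \dfrac{ 6 \sum_{i=1}^{n} d_{i}^{2} }{ n (n^{2} - 1)}\]

where \(d_{i}\) is the difference between the ranks of \(X_{i}\) and \(Y_{i}\), and \(n\) is the number of observations. The formula assumes no tied ranks.
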
Example

>>> from statinf import stats
>>> x = [0.023699, 0.021436, 0.0200109, 0.0202762, 0.0165271, 0.01027]
>>> y = [9.4228, 9.27951, 9.167963, 9.68820, 9.56490, 7.543]
>>> stats.spearman(x, y)
0.37142857142857144
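To make the rank differences in the formula concrete, the computation can be sketched in plain Python. This is a minimal illustration, not statinf's implementation: the helpers `ranks` and `spearman_rho` are hypothetical, and the data are assumed to contain no ties.

```python
def ranks(values):
    """Rank each value from 1 (smallest) to n; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(x, y):
    """Spearman's rho from squared rank differences (hypothetical helper)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    # Sum of squared differences between paired ranks
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

x = [0.023699, 0.021436, 0.0200109, 0.0202762, 0.0165271, 0.01027]
y = [9.4228, 9.27951, 9.167963, 9.68820, 9.56490, 7.543]
rho = spearman_rho(x, y)
print(rho)
```

For the example data the ranks are 6, 5, 3, 4, 2, 1 for x and 4, 3, 2, 6, 5, 1 for y, so the squared differences sum to 22 and the coefficient is 1 - 132/210, in line with the value shown above.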
Returns

Spearman’s rank correlation coefficient.

Return type

float

statinf.stats.descriptive.var(x, std=False, df=1)

Compute the variance of a variable.

Parameters
  • x (numpy.array) – Input variable. Format can be numpy.array, list or pandas.Series.

  • std (bool, optional) – If True, return the standard deviation, i.e. \(\sqrt{\mathbb{V}(\mathbf{X})}\), instead of the variance; defaults to False.

  • df (int, optional) – Degrees of freedom, defaults to 1.

Formula
\[\mathbb{V}(\mathbf{X}) = \dfrac{1}{n - 1} \sum_{i = 1}^{n} (X_{i} - \bar{X})^{2}\]
Example

>>> from statinf import stats
>>> x = [0.023699, 0.021436, 0.0200109, 0.0202762, 0.0165271, 0.01027]
>>> stats.var(x)
2.2492979044000003e-05
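The formula and the two optional parameters can be illustrated with a short plain-Python sketch. It is not statinf's implementation: the helper name `sample_var` is hypothetical, and it assumes that `df` is the quantity subtracted from n in the denominator, so that the default `df=1` reproduces the formula above.

```python
from math import fsum, sqrt

def sample_var(x, std=False, df=1):
    """Variance with an n - df denominator (hypothetical helper).

    Assumption: df is subtracted from n, so df=1 gives the formula above.
    """
    n = len(x)
    if n <= df:
        raise ValueError("need more observations than degrees of freedom")
    x_bar = fsum(x) / n
    variance = fsum((xi - x_bar) ** 2 for xi in x) / (n - df)
    # Return the standard deviation instead when std is True
    return sqrt(variance) if std else variance

x = [0.023699, 0.021436, 0.0200109, 0.0202762, 0.0165271, 0.01027]
variance = sample_var(x)
std_dev = sample_var(x, std=True)
print(variance, std_dev)
```

Under that assumption, `sample_var(x, df=0)` would give the population variance, which is smaller by a factor of (n - 1) / n.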
Returns

Variance.

Return type

float