Neural Network

class statinf.ml.neuralnetwork.Layer(n_in, n_out, W=None, b=None, activation=None, seed=None, init_weights='xavier', init_bias='zeros')[source]

Bases: object

Fully connected Neural Network layer

Parameters
  • n_in (int) – Dimension of the input layer (number of rows for the weights matrix).

  • n_out (int) – Dimension of the output layer (number of columns for the weights matrix).

  • W (numpy.array, optional) – Weights matrix, defaults to None (generated by init_params in initializations).

  • b (numpy.array, optional) – Bias vector, defaults to None (generated by init_params in initializations).

  • activation (str, optional) – Activation function to be applied, defaults to None (i.e. linear; see activations).

  • seed (int, optional) – Random seed, defaults to None.

  • init_weights (str, optional) – Distribution to be used when initializing the weights matrix, defaults to ‘xavier’ (see initializations).

  • init_bias (str, optional) – Method used to initialize the bias vector, defaults to ‘zeros’ (see initializations).

Returns

Layer to be stacked onto the Neural Network.

Return type

Layer
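To illustrate what such a layer involves, here is a minimal stand-alone sketch of Xavier weight initialization, zero bias initialization, and a forward pass (illustrative code with hypothetical names, not the statinf implementation):

```python
import numpy as np

def xavier_init(n_in, n_out, seed=None):
    # Xavier/Glorot uniform initialization: bound scaled by fan-in and fan-out
    rng = np.random.default_rng(seed)
    limit = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-limit, limit, size=(n_in, n_out))

class DenseLayer:
    def __init__(self, n_in, n_out, activation=None, seed=None):
        self.W = xavier_init(n_in, n_out, seed)  # weights: n_in rows, n_out columns
        self.b = np.zeros(n_out)                 # 'zeros' bias initialization
        self.activation = activation

    def forward(self, x):
        z = x @ self.W + self.b
        if self.activation == 'sigmoid':
            return 1.0 / (1.0 + np.exp(-z))
        return z  # activation=None means a linear output

layer = DenseLayer(n_in=4, n_out=2, seed=42)
out = layer.forward(np.ones(4))  # vector of length n_out
```

Note how n_in and n_out fix the shape of the weights matrix, matching the parameter descriptions above.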

class statinf.ml.neuralnetwork.MLP(loss='MSE', random=None)[source]

Bases: object

Multi-Layer Perceptron

References

Rosenblatt, F. (1958). The perceptron: a probabilistic model for information storage and organization in the brain. Psychological review, 65(6), 386.

add(layer)[source]

Stack a layer onto the Neural Network.

Parameters

layer (Layer()) – Layer to be stacked
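Conceptually, stacking layers means each layer's output dimension must match the next layer's input dimension, and a forward pass chains the layers in the order they were added. A minimal sketch of that mechanism (illustrative names, not the statinf internals):

```python
import numpy as np

class TinyMLP:
    def __init__(self):
        self.layers = []  # layers are stored in the order they are added

    def add(self, layer):
        # Each added layer consumes the previous layer's output
        self.layers.append(layer)

    def forward(self, x):
        for W, b in self.layers:
            x = x @ W + b
        return x

net = TinyMLP()
net.add((np.ones((3, 5)), np.zeros(5)))  # 3 inputs -> 5 hidden units
net.add((np.ones((5, 1)), np.zeros(1)))  # 5 hidden units -> 1 output
y = net.forward(np.ones(3))
```

Here the first layer's n_out (5) must equal the second layer's n_in, otherwise the matrix product in the forward pass fails.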

get_weights(layer='all', param='all')[source]

Fetches the parameters from the network.

Parameters
  • layer (int, optional) – Layer id from which to fetch the parameters, defaults to ‘all’

  • param (str, optional) – What parameter we need to fetch (can either be ‘weights’, ‘bias’ or ‘all’), defaults to ‘all’

Returns

Weights and Bias used in the Neural Network.

Return type

dict

predict(new_data, binary=False, threshold=0.5)[source]

Generates output prediction after feedforward pass.

Parameters
  • new_data (pandas.DataFrame) – Input data.

  • binary (bool, optional) – Boolean for returning a binary output (not probability), defaults to False.

  • threshold (float, optional) – Probability threshold for binary response, defaults to 0.5.

Returns

Predicted values.

Return type

list
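The binary and threshold options amount to mapping predicted probabilities to a 0/1 response. A sketch of that post-processing step (assumed behavior for illustration, not the library code; whether the cut-off is inclusive may differ):

```python
def to_binary(probs, threshold=0.5):
    # Map predicted probabilities to a 0/1 response: 1 when p >= threshold
    return [int(p >= threshold) for p in probs]

to_binary([0.1, 0.5, 0.9])        # default 0.5 threshold
to_binary([0.1, 0.5, 0.9], 0.75)  # a stricter cut-off
```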

summary()[source]

Generates a summary of the architecture of the network. Gives details on layers, activations and number of parameters to be estimated.

Returns

Graph summary of the network.

Return type

str
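The parameter counts reported by such a summary follow directly from the layer shapes: a fully connected layer has n_in × n_out weights plus n_out biases. A small illustrative helper (hypothetical, not the statinf implementation):

```python
def count_params(layer_dims):
    """Total trainable parameters for a stack of fully connected layers.

    layer_dims: list of (n_in, n_out) pairs, one per layer.
    """
    return sum(n_in * n_out + n_out for n_in, n_out in layer_dims)

# A 4 -> 8 -> 1 network: (4*8 + 8) + (8*1 + 1) = 49 parameters
count_params([(4, 8), (8, 1)])
```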

train(data, X, Y='Y', epochs=100, optimizer='SGD', batch_size=1, training_size=0.8, test_set=None, learning_rate=0.01, L1_reg=0.0, L2_reg=0.0, early_stop=True, patience=10, improvement_threshold=0.995, restore_weights=True, verbose=True, verbose_all=False, plot=False)[source]

Train the Neural Network.

Parameters
  • data (pandas.DataFrame) – Input data on which to train the network.

  • X (list) – List of X input variables.

  • Y (str, optional) – Variable to predict, defaults to ‘Y’.

  • epochs (int, optional) – Number of epochs to train the network, defaults to 100.

  • optimizer (str, optional) – Algorithm used to minimize the loss function, defaults to ‘SGD’ (see optimizers).

  • batch_size (int, optional) – Size of each training batch (only online learning, i.e. batch_size=1, is currently available), defaults to 1.

  • training_size (float, optional) – Ratio of the data to be used for the training set (\(\in (0, 1)\)); the remainder is used for the test set, defaults to 0.8.

  • test_set (pandas.DataFrame, optional) – Data frame to use as test set (overrides training_size if provided), defaults to None.

  • learning_rate (float, optional) – Learning rate (step size) for gradient descent, defaults to 0.01.

  • L1_reg (float, optional) – Coefficient \(\lambda_{1} \in (0,1)\) for the L1 penalty, defaults to 0.

  • L2_reg (float, optional) – Coefficient \(\lambda_{2} \in (0,1)\) for the L2 penalty, defaults to 0.

  • early_stop (bool, optional) – Apply early stopping to avoid over-fitting: the training sequence stops after patience (see below) iterations with no improvement, defaults to True.

  • patience (int, optional) – Number of iterations to wait after a minimum has been found. If no improvement, training sequence is stopped, defaults to 10.

  • improvement_threshold (float, optional) – Relative change in the loss below which a new value is considered an improvement, defaults to 0.995.

  • restore_weights (bool, optional) – If training is early-stopped, restore the weights of the best iteration, defaults to True.

  • verbose (bool or int, optional) – Verbosity level (True or 1 for minimal display, 2 for details), defaults to True.

  • plot (bool, optional) – Plot the evolution of the training and test loss for each epoch, defaults to False.
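The early-stopping parameters interact as follows: training tracks the best test loss seen so far, counts epochs without sufficient improvement, and stops once that count reaches patience; with restore_weights, the weights of the best epoch are reinstated. A sketch of that logic (illustrative, assuming a multiplicative improvement_threshold as the 0.995 default suggests; not the statinf implementation):

```python
def early_stopping(losses, patience=10, improvement_threshold=0.995):
    """Return (stop_epoch, best_epoch) for a sequence of per-epoch losses.

    A new loss counts as an improvement only if it falls below
    best_loss * improvement_threshold (at least a 0.5% drop by default).
    """
    best_loss, best_epoch, waited = float('inf'), 0, 0
    for epoch, loss in enumerate(losses):
        if loss < best_loss * improvement_threshold:
            best_loss, best_epoch, waited = loss, epoch, 0  # genuine improvement
        else:
            waited += 1
            if waited >= patience:
                # Stop here; restore_weights would reload the best_epoch weights
                return epoch, best_epoch
    return len(losses) - 1, best_epoch

# The loss plateaus after epoch 2, so training stops once patience runs out
early_stopping([1.0, 0.5, 0.2, 0.21, 0.2, 0.2, 0.2], patience=3)
```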

References
  • Bergstra, J., Bastien, F., Breuleux, O., Lamblin, P., Pascanu, R., Delalleau, O., … & Bengio, Y. (2011). Theano: Deep learning on gpus with python. In NIPS 2011, BigLearning Workshop, Granada, Spain (Vol. 3, pp. 1-48). Microtome Publishing.

  • Friedman, J., Hastie, T., & Tibshirani, R. (2001). The elements of statistical learning (Vol. 1, No. 10). New York: Springer series in statistics.

  • Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT press.