Neural Network

class statinf.ml.neuralnetwork.Layer(n_in, n_out, activation='linear', W=None, b=None, init_weights='xavier', init_bias='zeros', seed=DeviceArray([0, 0], dtype=uint32))[source]

Bases: object

class statinf.ml.neuralnetwork.MLP(loss='MSE', random=None)[source]

Bases: object

Multi-Layer Perceptron

References

Rosenblatt, F. (1958). The perceptron: a probabilistic model for information storage and organization in the brain. Psychological review, 65(6), 386.

add(layer)[source]

Stacks a layer onto the neural network.

Parameters

layer (Layer()) – Layer to be stacked
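
A minimal sketch of stacking layers, assuming a ‘relu’ activation is available (only the ‘linear’ default appears in the Layer signature above); the synthetic data and layer sizes are purely illustrative:

    import numpy as np
    import pandas as pd

    from statinf.ml.neuralnetwork import MLP, Layer

    # Illustrative synthetic data: two inputs, one noisy linear target
    rng = np.random.default_rng(42)
    df = pd.DataFrame({'X1': rng.normal(size=500), 'X2': rng.normal(size=500)})
    df['Y'] = 0.3 * df['X1'] - 1.2 * df['X2'] + rng.normal(scale=0.1, size=500)

    # Stack a hidden layer and a linear output layer onto the network
    nn = MLP(loss='MSE')
    nn.add(Layer(n_in=2, n_out=4, activation='relu'))  # 'relu' assumed supported
    nn.add(Layer(n_in=4, n_out=1, activation='linear'))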

fit(data, X, Y='Y', epochs=100, optimizer='SGD', batch_size=1, training_size=0.8, test_set=None, learning_rate=0.05, L1_reg=0.0, L2_reg=0.0, early_stop=True, patience=10, improvement_threshold=0.995, restore_weights=True, verbose=True, verbose_all=False, plot=False, *args)[source]

Train the Neural Network.

Parameters
  • data (pandas.DataFrame) – Data set containing the input variables and the target used to train the network.

  • X (list) – List of X input variables.

  • Y (str, optional) – Variable to predict, defaults to ‘Y’.

  • epochs (int, optional) – Number of epochs to train the network, defaults to 100.

  • optimizer (str, optional) – Algorithm used to minimize the loss function, defaults to ‘SGD’ (see optimizers).

  • batch_size (int, optional) – Size of each batch to be trained, defaults to 1.

  • training_size (float, optional) – Share of the data used for the training set (\(\in (0, 1)\)); the remainder is used as the test set, defaults to 0.8.

  • test_set (pandas.DataFrame, optional) – Data frame to use as test set (overrides training_size if provided), defaults to None.

  • learning_rate (float, optional) – Learning rate (step size) for gradient descent, defaults to 0.05.

  • L1_reg (float, optional) – Coefficient \(\lambda_{1} \in (0,1)\) for the L1 penalty, defaults to 0.

  • L2_reg (float, optional) – Coefficient \(\lambda_{2} \in (0,1)\) for the L2 penalty, defaults to 0.

  • early_stop (bool, optional) – Apply early stopping to avoid over-fitting: training stops once patience (see below) iterations pass with no improvement, defaults to True.

  • patience (int, optional) – Number of iterations to wait after a minimum has been found; if no improvement occurs, training is stopped, defaults to 10.

  • improvement_threshold (float, optional) – Factor applied to the best loss so far to decide whether a new loss counts as an improvement, defaults to 0.995.

  • restore_weights (bool, optional) – If training is early-stopped, restore the weights of the best iteration, defaults to True.

  • verbose (bool or int, optional) – Verbosity level (True or 1 for minimal display, 2 for details), defaults to True.

  • plot (bool, optional) – Plot the evolution of the training and test loss for each epoch, defaults to False.

  • *args (str) – Arguments to be passed to the optimizer.
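
A hedged training sketch based on the signature above, continuing the nn network and df data frame built in the add() example; the hyper-parameter values are illustrative, not recommendations:

    # Train on 80% of df with SGD and early stopping; the remaining 20% is the test set
    nn.fit(data=df, X=['X1', 'X2'], Y='Y',
           epochs=200, optimizer='SGD', batch_size=32,
           training_size=0.8, learning_rate=0.01,
           early_stop=True, patience=10, verbose=True)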

get_weights(layer='all', param='all')[source]

Fetches the parameters from the network.

Parameters
  • layer (int or str, optional) – Layer id from which to fetch the parameters, defaults to ‘all’ (all layers)

  • param (str, optional) – Which parameter to fetch (either ‘weight’, ‘bias’ or ‘all’), defaults to ‘all’

Returns

Weights and biases used in the neural network.

Return type

dict
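
A short sketch of fetching parameters from the trained network; whether layer ids are 0- or 1-based is not stated above, so the indexed call is an assumption:

    # All weight matrices and bias vectors in the network
    params = nn.get_weights(layer='all', param='all')

    # Only the weights of one layer (0-based id assumed)
    w0 = nn.get_weights(layer=0, param='weight')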

predict(new_data, binary=False, threshold=0.5)[source]

Generates output predictions from a feed-forward pass.

Parameters
  • new_data (pandas.DataFrame) – Input data.

  • binary (bool, optional) – Whether to return a binary output instead of a probability, defaults to False.

  • threshold (float, optional) – Probability threshold for binary response, defaults to 0.5.

Returns

Predicted values.

Return type

list
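
A prediction sketch reusing the X1 and X2 columns and the rng generator from the earlier examples; the new observations are illustrative:

    # New observations with the same input columns used for training
    new_df = pd.DataFrame({'X1': rng.normal(size=5), 'X2': rng.normal(size=5)})

    # Continuous predictions from the feed-forward pass
    preds = nn.predict(new_data=new_df)

    # Binary predictions thresholded at 0.5 (meaningful when the output is a probability)
    labels = nn.predict(new_data=new_df, binary=True, threshold=0.5)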