Neural Network

class statinf.ml.neuralnetwork.Layer(n_in, n_out, W=None, b=None, activation=None, seed=None, init_weights='xavier', init_bias='zeros')[source]

Bases: object

Fully connected Neural Network layer

Parameters
  • n_in (int) – Dimension of the input layer (number of rows for the weights matrix).

  • n_out (int) – Dimension of the output layer (number of columns for the weights matrix).

  • W (numpy.array, optional) – Weights matrix, defaults to None (generated by init_params in initializations).

  • b (numpy.array, optional) – Bias vector, defaults to None (generated by init_params in initializations).

  • activation (str, optional) – Activation function to be applied, defaults to None (i.e. linear; see activations).

  • seed (int, optional) – Random seed, defaults to None.

  • init_weights (str, optional) – Distribution to be used when initializing the weights matrix, defaults to ‘xavier’ (see initializations).

  • init_bias (str, optional) – Method used to initialize the bias vector, defaults to ‘zeros’ (see initializations).

Returns

Layer to be stacked onto the Neural Network.

Return type

Layer
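To illustrate what such a layer involves, here is a minimal stand-alone sketch of Xavier weight initialization, zero bias initialization, and a forward pass (illustrative code with hypothetical names, not the statinf implementation):

```python
import numpy as np

def xavier_init(n_in, n_out, seed=None):
    # Xavier/Glorot uniform initialization: bound scaled by fan-in and fan-out
    rng = np.random.default_rng(seed)
    limit = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-limit, limit, size=(n_in, n_out))

class DenseLayer:
    def __init__(self, n_in, n_out, activation=None, seed=None):
        self.W = xavier_init(n_in, n_out, seed)  # weights: n_in rows, n_out columns
        self.b = np.zeros(n_out)                 # 'zeros' bias initialization
        self.activation = activation

    def forward(self, x):
        z = x @ self.W + self.b
        if self.activation == 'sigmoid':
            return 1.0 / (1.0 + np.exp(-z))
        return z  # activation=None means a linear output

layer = DenseLayer(n_in=4, n_out=2, seed=42)
out = layer.forward(np.ones(4))  # vector of length n_out
```

Note how n_in and n_out fix the shape of the weights matrix, matching the parameter descriptions above.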

class statinf.ml.neuralnetwork.MLP(loss='MSE', random=None)[source]

Bases: object

Multi-Layer Perceptron

References

Rosenblatt, F. (1958). The perceptron: a probabilistic model for information storage and organization in the brain. Psychological review, 65(6), 386.

add(layer)[source]

Stack a layer onto the Neural Network.

Parameters

layer (Layer()) – Layer to be stacked
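Conceptually, stacking layers means each layer's output dimension must match the next layer's input dimension, and a forward pass chains the layers in the order they were added. A minimal sketch of that mechanism (illustrative names, not the statinf internals):

```python
import numpy as np

class TinyMLP:
    def __init__(self):
        self.layers = []  # layers are stored in the order they are added

    def add(self, layer):
        # Each added layer consumes the previous layer's output
        self.layers.append(layer)

    def forward(self, x):
        for W, b in self.layers:
            x = x @ W + b
        return x

net = TinyMLP()
net.add((np.ones((3, 5)), np.zeros(5)))  # 3 inputs -> 5 hidden units
net.add((np.ones((5, 1)), np.zeros(1)))  # 5 hidden units -> 1 output
y = net.forward(np.ones(3))
```

Here the first layer's n_out (5) must equal the second layer's n_in, otherwise the matrix product in the forward pass fails.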

get_weights(layer='all', param='all')[source]

Fetches the parameters from the network.

Parameters
  • layer (int, optional) – Layer id from which to fetch the parameters, defaults to ‘all’

  • param (str, optional) – What parameter we need to fetch (can either be ‘weights’, ‘bias’ or ‘all’), defaults to ‘all’

Returns

Weights and Bias used in the Neural Network.

Return type

dict

predict(new_data, binary=False, threshold=0.5)[source]

Generates output prediction after feedforward pass.

Parameters
  • new_data (pandas.DataFrame) – Input data.

  • binary (bool, optional) – Boolean for returning a binary output (not probability), defaults to False.

  • threshold (float, optional) – Probability threshold for binary response, defaults to 0.5.

Returns

Predicted values.

Return type

list
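The binary and threshold options amount to mapping predicted probabilities to a 0/1 response. A sketch of that post-processing step (assumed behavior for illustration, not the library code; whether the cut-off is inclusive may differ):

```python
def to_binary(probs, threshold=0.5):
    # Map predicted probabilities to a 0/1 response: 1 when p >= threshold
    return [int(p >= threshold) for p in probs]

to_binary([0.1, 0.5, 0.9])        # default 0.5 threshold
to_binary([0.1, 0.5, 0.9], 0.75)  # a stricter cut-off
```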

summary()[source]

Generates a summary of the architecture of the network. Gives details on layers, activations and number of parameters to be estimated.

Returns

Graph summary of the network.

Return type

str
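The parameter counts reported by such a summary follow directly from the layer shapes: a fully connected layer has n_in × n_out weights plus n_out biases. A small illustrative helper (hypothetical, not the statinf implementation):

```python
def count_params(layer_dims):
    """Total trainable parameters for a stack of fully connected layers.

    layer_dims: list of (n_in, n_out) pairs, one per layer.
    """
    return sum(n_in * n_out + n_out for n_in, n_out in layer_dims)

# A 4 -> 8 -> 1 network: (4*8 + 8) + (8*1 + 1) = 49 parameters
count_params([(4, 8), (8, 1)])
```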

train(data, X, Y='Y', epochs=100, optimizer='SGD', batch_size=1, training_size=0.8, test_set=None, learning_rate=0.01, L1_reg=0.0, L2_reg=0.0, early_stop=True, patience=10, improvement_threshold=0.995, restore_weights=True, verbose=True, verbose_all=False, plot=False)[source]

Train the Neural Network.

Parameters
  • data (pandas.DataFrame) – Input data on which to train the network.

  • X (list) – List of X input variables.

  • Y (str, optional) – Variable to predict, defaults to ‘Y’.

  • epochs (int, optional) – Number of epochs to train the network, defaults to 100.

  • optimizer (str, optional) – Algorithm used to minimize the loss function, defaults to ‘SGD’ (see optimizers).

  • batch_size (int, optional) – Size of each training batch (only online learning, i.e. batch_size=1, is currently available), defaults to 1.

  • training_size (float, optional) – Ratio of the data to be used for the training set (\(\in (0, 1)\)); the remainder is used for the test set, defaults to 0.8.

  • test_set (pandas.DataFrame, optional) – Data frame to use as test set (overrides training_size if provided), defaults to None.

  • learning_rate (float, optional) – Learning rate (step size) for gradient descent, defaults to 0.01.

  • L1_reg (float, optional) – Coefficient \(\lambda_{1} \in (0,1)\) for the L1 penalty, defaults to 0.

  • L2_reg (float, optional) – Coefficient \(\lambda_{2} \in (0,1)\) for the L2 penalty, defaults to 0.

  • early_stop (bool, optional) – Apply early stopping to avoid over-fitting: the training sequence stops after patience (see below) iterations with no improvement, defaults to True.

  • patience (int, optional) – Number of iterations to wait after a minimum has been found. If no improvement, training sequence is stopped, defaults to 10.

  • improvement_threshold (float, optional) – Relative change in the loss below which a new value is considered an improvement, defaults to 0.995.

  • restore_weights (bool, optional) – If training is early-stopped, restore the weights of the best iteration, defaults to True.

  • verbose (bool or int, optional) – Verbosity level (True or 1 for minimal display, 2 for details), defaults to True.

  • plot (bool, optional) – Plot the evolution of the training and test loss for each epoch, defaults to False.
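The early-stopping parameters interact as follows: training tracks the best test loss seen so far, counts epochs without sufficient improvement, and stops once that count reaches patience; with restore_weights, the weights of the best epoch are reinstated. A sketch of that logic (illustrative, assuming a multiplicative improvement_threshold as the 0.995 default suggests; not the statinf implementation):

```python
def early_stopping(losses, patience=10, improvement_threshold=0.995):
    """Return (stop_epoch, best_epoch) for a sequence of per-epoch losses.

    A new loss counts as an improvement only if it falls below
    best_loss * improvement_threshold (at least a 0.5% drop by default).
    """
    best_loss, best_epoch, waited = float('inf'), 0, 0
    for epoch, loss in enumerate(losses):
        if loss < best_loss * improvement_threshold:
            best_loss, best_epoch, waited = loss, epoch, 0  # genuine improvement
        else:
            waited += 1
            if waited >= patience:
                # Stop here; restore_weights would reload the best_epoch weights
                return epoch, best_epoch
    return len(losses) - 1, best_epoch

# The loss plateaus after epoch 2, so training stops once patience runs out
early_stopping([1.0, 0.5, 0.2, 0.21, 0.2, 0.2, 0.2], patience=3)
```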

References
  • Bergstra, J., Bastien, F., Breuleux, O., Lamblin, P., Pascanu, R., Delalleau, O., … & Bengio, Y. (2011). Theano: Deep learning on gpus with python. In NIPS 2011, BigLearning Workshop, Granada, Spain (Vol. 3, pp. 1-48). Microtome Publishing.

  • Friedman, J., Hastie, T., & Tibshirani, R. (2001). The elements of statistical learning (Vol. 1, No. 10). New York: Springer series in statistics.

  • Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT press.