Learning

Modules for altering the learning process


CyclicLearningRateCallback

Apply a cyclic learning rate. Supports the following scale schemes:

  • triangular - Triangular cycle

  • triangular2 - Triangular cycle that shrinks amplitude by half each cycle

  • exp_range - Triangular cycle that shrinks amplitude by gamma ** <cycle iterations> each cycle
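
For intuition, the sketch below shows how such a schedule is commonly computed. It follows the standard CLR formulation; the function and variable names are illustrative, not tavolo internals:

import numpy as np

def cyclic_lr(iteration, base_lr, max_lr, step_size, scale_fn, scale_mode='cycle'):
    # Current cycle (1-based) and relative position within it
    cycle = np.floor(1 + iteration / (2 * step_size))
    x = np.abs(iteration / step_size - 2 * cycle + 1)  # goes 1 -> 0 -> 1 over each cycle
    scale = scale_fn(cycle) if scale_mode == 'cycle' else scale_fn(iteration)
    return base_lr + (max_lr - base_lr) * np.maximum(0, 1 - x) * scale

# Under this formulation, the built-in schemes correspond to:
#   triangular  -> scale_fn = lambda cycle: 1.0
#   triangular2 -> scale_fn = lambda cycle: 1 / (2 ** (cycle - 1))
#   exp_range   -> scale_fn = lambda iterations: gamma ** iterations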

Arguments

  • base_lr (float): Lower boundary of each cycle

  • max_lr (float): Upper boundary of each cycle, may not be reached depending on the scaling function

  • step_size (int): Number of batches per half-cycle (step)

  • scale_scheme (str): One of {'triangular', 'triangular2', 'exp_range'}. If scale_fn is passed, this argument is ignored

  • gamma (float): Constant used in the exp_range scaling function, applied as (gamma ** <cycle iterations>)

  • scale_fn (callable): Custom scaling policy. Accepts the cycle index or the iteration count (depending on scale_mode) and must return a value in the range [0, 1]. If passed, scale_scheme is ignored

  • scale_mode (str): Defines whether scale_fn is evaluated on the cycle index or on the cycle iterations

Examples

Apply a triangular cyclic learning rate (default), with a step size of 2000 batches

import tensorflow as tf
import tavolo as tvl

model = ...    # A compiled tf.keras.Model
X_train = ...  # Training inputs
Y_train = ...  # Training targets

clr = tvl.learning.CyclicLearningRateCallback(base_lr=0.001, max_lr=0.006, step_size=2000)

model.fit(X_train, Y_train, callbacks=[clr])

Apply a cyclic learning rate that shrinks amplitude by half each cycle

import tensorflow as tf
import tavolo as tvl

clr = tvl.learning.CyclicLearningRateCallback(base_lr=0.001, max_lr=0.006, step_size=2000, scale_scheme='triangular2')

model.fit(X_train, Y_train, callbacks=[clr])

Apply a cyclic learning rate with a custom scaling function

import numpy as np
import tensorflow as tf
import tavolo as tvl

scale_fn = lambda x: 0.5 * (1 + np.sin(x * np.pi / 2))
clr = tvl.learning.CyclicLearningRateCallback(base_lr=0.001, max_lr=0.006, step_size=2000, scale_fn=scale_fn)

model.fit(X_train, Y_train, callbacks=[clr])
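
Apply a custom scaling policy evaluated on the iteration count rather than the cycle index (a sketch: the 'iterations' mode value follows common CLR implementations, and the decay constant here is illustrative)

import tensorflow as tf
import tavolo as tvl

# Amplitude decays smoothly with the total number of training iterations
scale_fn = lambda iterations: 0.9999 ** iterations
clr = tvl.learning.CyclicLearningRateCallback(base_lr=0.001, max_lr=0.006, step_size=2000,
                                              scale_fn=scale_fn, scale_mode='iterations')

model.fit(X_train, Y_train, callbacks=[clr])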

LearningRateFinder

Learning rate finding utility for conducting the “LR range test”; see the article reference for more information

Use the scan method to find the loss value for each learning rate in the given range

Arguments

  • model (tf.keras.Model): Model to conduct the test for. model.compile must be called before using this utility

Examples

Run a learning rate range test in the domain [0.0001, 1.0]

import tensorflow as tf
import tavolo as tvl

train_data = ...
train_labels = ...

# Build model
model = tf.keras.Sequential([tf.keras.layers.Input(shape=(784,)),
                             tf.keras.layers.Dense(128, activation=tf.nn.relu),
                             tf.keras.layers.Dense(10, activation=tf.nn.softmax)])

# Must call compile with optimizer before test
model.compile(optimizer=tf.keras.optimizers.SGD(), loss=tf.keras.losses.CategoricalCrossentropy())

# Run learning rate range test
lr_finder = tvl.learning.LearningRateFinder(model=model)

learning_rates, losses = lr_finder.scan(train_data, train_labels, min_lr=0.0001, max_lr=1.0, batch_size=128)

# Plot the results to choose your learning rate
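
For example, with matplotlib (a minimal sketch; a log-scaled x-axis usually makes the elbow of the loss curve easier to spot):

import matplotlib.pyplot as plt

plt.plot(learning_rates, losses)
plt.xscale('log')  # Learning rates span several orders of magnitude
plt.xlabel('Learning rate')
plt.ylabel('Loss')
plt.show()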

References

learning.LearningRateFinder.scan(x, y, min_lr: float = 0.0001, max_lr: float = 1.0, batch_size: Optional[int] = None, steps: int = 100) → Tuple[List[float], List[float]]

Scans the learning rate range [min_lr, max_lr] for loss values

Parameters
  • x – Input data. It could be:
      - A Numpy array (or array-like), or a list of arrays (in case the model has multiple inputs)
      - A TensorFlow tensor, or a list of tensors (in case the model has multiple inputs)
      - A dict mapping input names to the corresponding arrays/tensors, if the model has named inputs
      - A tf.data dataset or a dataset iterator. Should return a tuple of either (inputs, targets) or (inputs, targets, sample_weights)
      - A generator or keras.utils.Sequence returning (inputs, targets) or (inputs, targets, sample_weights)

  • y – Target data. Like the input data x, it could be either Numpy array(s) or TensorFlow tensor(s). It should be consistent with x (you cannot have Numpy inputs and tensor targets, or inversely). If x is a dataset, dataset iterator, generator, or tf.keras.utils.Sequence instance, y should not be specified (since targets will be obtained from x).

  • min_lr – Minimum learning rate

  • max_lr – Maximum learning rate

  • batch_size – Number of samples per gradient update. Do not specify batch_size if your data is in the form of symbolic tensors, datasets, dataset iterators, generators, or tf.keras.utils.Sequence instances (since they generate batches)

  • steps – Number of steps to scan between min_lr and max_lr

Returns

Learning rates and their corresponding loss values
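
Since x may also be a tf.data dataset, the scan can run without y and batch_size, as the parameter notes above describe. A minimal sketch, reusing model, train_data and train_labels from the example above:

import tensorflow as tf
import tavolo as tvl

# A dataset that already yields (inputs, targets) batches
train_dataset = tf.data.Dataset.from_tensor_slices((train_data, train_labels)).batch(128)

lr_finder = tvl.learning.LearningRateFinder(model=model)

# y and batch_size are omitted - the batched dataset generates the batches itself
learning_rates, losses = lr_finder.scan(train_dataset, min_lr=0.0001, max_lr=1.0)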