Learning
Modules for altering the learning process
CyclicLearningRateCallback
Apply cyclic learning rate. Supports the following scale schemes:
triangular - Triangular cycle
triangular2 - Triangular cycle that shrinks amplitude by half each cycle
exp_range - Triangular cycle that shrinks amplitude by gamma ** <cycle iterations> each cycle
Arguments
base_lr (float): Lower boundary of each cycle
max_lr (float): Upper boundary of each cycle, may not be reached depending on the scaling function
step_size (int): Number of batches per half-cycle (step)
scale_scheme (str): One of {'triangular', 'triangular2', 'exp_range'}. If scale_fn is passed, this argument is ignored
gamma (float): Constant used for the exp_range's scale_fn, used as (gamma ** <cycle iterations>)
scale_fn (callable): Custom scaling policy, accepts cycle index / iterations depending on the scale_mode and must return a value in the range [0, 1]. If passed, ignores scale_scheme
scale_mode (str): Define whether scale_fn is evaluated on cycle index or cycle iterations
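To make the interplay of base_lr, max_lr, step_size and the scale schemes concrete, here is a minimal sketch of the commonly used cyclic learning rate formula. It is an assumption that the callback follows this exact formulation; the function name cyclic_lr and the 0-based iteration counter are illustrative and not part of the library.

```python
import math


def cyclic_lr(iteration, base_lr, max_lr, step_size,
              scale_scheme="triangular", gamma=1.0):
    """Learning rate at a given batch iteration (0-based).

    Illustrative only: follows the usual cyclic learning rate formula,
    which may differ in detail from what the callback does internally.
    """
    # 1-based index of the current cycle; one cycle spans 2 * step_size batches
    cycle = math.floor(1 + iteration / (2 * step_size))
    # Distance from the peak of the current cycle, between 0 (peak) and 1 (boundary)
    x = abs(iteration / step_size - 2 * cycle + 1)

    if scale_scheme == "triangular":
        scale = 1.0                          # constant amplitude
    elif scale_scheme == "triangular2":
        scale = 1.0 / (2.0 ** (cycle - 1))   # amplitude halves each cycle
    elif scale_scheme == "exp_range":
        scale = gamma ** iteration           # amplitude decays with iterations
    else:
        raise ValueError(f"unknown scale_scheme: {scale_scheme!r}")

    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x) * scale
```

With step_size=2000, this sketch starts at base_lr, peaks after 2000 batches and returns to base_lr after 4000. Under triangular2 the second peak (batch 6000) reaches only halfway between base_lr and max_lr, which is why max_lr "may not be reached depending on the scaling function".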
Examples
Apply a triangular cyclic learning rate (default), with a step size of 2000 batches
import tensorflow as tf
import tavolo as tvl
clr = tvl.learning.CyclicLearningRateCallback(base_lr=0.001, max_lr=0.006, step_size=2000)
model.fit(X_train, Y_train, callbacks=[clr])
Apply a cyclic learning rate that shrinks amplitude by half each cycle
import tensorflow as tf
import tavolo as tvl
clr = tvl.learning.CyclicLearningRateCallback(base_lr=0.001, max_lr=0.006, step_size=2000, scale_scheme='triangular2')
model.fit(X_train, Y_train, callbacks=[clr])
Apply a cyclic learning rate with a custom scaling function
import numpy as np
import tensorflow as tf
import tavolo as tvl
# Custom scaling policy; its output stays within [0, 1]
scale_fn = lambda x: 0.5 * (1 + np.sin(x * np.pi / 2))
clr = tvl.learning.CyclicLearningRateCallback(base_lr=0.001, max_lr=0.006, step_size=2000, scale_fn=scale_fn)
model.fit(X_train, Y_train, callbacks=[clr])
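Since a custom scale_fn must return values in the range [0, 1], it can be worth checking a candidate function before training. The following is a small sketch of such a check; the helper stays_in_unit_range is hypothetical (not part of tavolo), and it assumes the function is evaluated on integer cycle indices starting at 1.

```python
import math


def scale_fn(x):
    # Same policy as in the example above, written with the standard library
    return 0.5 * (1 + math.sin(x * math.pi / 2))


def stays_in_unit_range(fn, inputs, tol=1e-12):
    """Return True if fn(x) lies in [0, 1] (within tol) for every x in inputs."""
    return all(-tol <= fn(x) <= 1 + tol for x in inputs)


# Check the first 100 cycle indices (assumed to start at 1)
ok = stays_in_unit_range(scale_fn, range(1, 101))

# A policy such as 2 / x exceeds 1 on the first cycle and fails the check
bad = stays_in_unit_range(lambda x: 2.0 / x, range(1, 101))
```

If scale_fn is instead evaluated on cycle iterations, the same check can be run over the expected range of batch counts.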
LearningRateFinder
Utility for conducting the “LR range test” (see the article reference for more information).
Use the scan method to find the loss values for learning rates in the given range.
Arguments
model (tf.keras.Model): Model to run the test on. model.compile must be called before using this utility
Examples
Run a learning rate range test in the domain [0.0001, 1.0]
import tensorflow as tf
import tavolo as tvl
train_data = ...
train_labels = ...
# Build model
model = tf.keras.Sequential([tf.keras.layers.Input(shape=(784,)),
                             tf.keras.layers.Dense(128, activation=tf.nn.relu),
                             tf.keras.layers.Dense(10, activation=tf.nn.softmax)])
# Must call compile with optimizer before test
model.compile(optimizer=tf.keras.optimizers.SGD(), loss=tf.keras.losses.CategoricalCrossentropy())
# Run learning rate range test
lr_finder = tvl.learning.LearningRateFinder(model=model)
learning_rates, losses = lr_finder.scan(train_data, train_labels, min_lr=0.0001, max_lr=1.0, batch_size=128)
# Plot the results to choose your learning rate
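Besides plotting, a learning rate can be picked from the scan output programmatically. The sketch below applies a common heuristic: choose the learning rate at which the loss falls fastest with respect to log10 of the learning rate. The helper suggest_lr and its smoothing parameter are hypothetical and not part of tavolo; the sketch assumes the learning rates are positive and strictly increasing.

```python
import math


def suggest_lr(learning_rates, losses, smoothing=0.0):
    """Return the learning rate at the end of the steepest loss drop.

    Heuristic sketch, not a tavolo function. `smoothing` in [0, 1) applies
    an exponential moving average to damp batch-to-batch noise.
    """
    if len(learning_rates) != len(losses) or len(losses) < 2:
        raise ValueError("need two or more (learning rate, loss) pairs")

    # Smooth the losses (smoothing=0.0 leaves them unchanged)
    smoothed, avg = [], losses[0]
    for loss in losses:
        avg = smoothing * avg + (1 - smoothing) * loss
        smoothed.append(avg)

    # Find the most negative slope of loss against log10(learning rate)
    best_i, best_slope = 1, float("inf")
    for i in range(1, len(smoothed)):
        d_log_lr = math.log10(learning_rates[i]) - math.log10(learning_rates[i - 1])
        slope = (smoothed[i] - smoothed[i - 1]) / d_log_lr
        if slope < best_slope:
            best_i, best_slope = i, slope
    return learning_rates[best_i]
```

It would be called as suggest_lr(learning_rates, losses) on the two lists returned by scan. A plot remains useful as a sanity check, since the heuristic can be misled by noisy losses.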
References
learning.LearningRateFinder.scan(x, y, min_lr: float = 0.0001, max_lr: float = 1.0, batch_size: Optional[int] = None, steps: int = 100) → Tuple[List[float], List[float]]
Scans the learning rate range [min_lr, max_lr] for loss values
Parameters
x – Input data. It could be:
- A Numpy array (or array-like), or a list of arrays (in case the model has multiple inputs)
- A TensorFlow tensor, or a list of tensors (in case the model has multiple inputs)
- A dict mapping input names to the corresponding array/tensors, if the model has named inputs
- A tf.data dataset or a dataset iterator. Should return a tuple of either (inputs, targets) or (inputs, targets, sample_weights)
- A generator or keras.utils.Sequence returning (inputs, targets) or (inputs, targets, sample_weights)
y – Target data. Like the input data x, it could be either Numpy array(s) or TensorFlow tensor(s). It should be consistent with x (you cannot have Numpy inputs and tensor targets, or inversely). If x is a dataset, dataset iterator, generator, or tf.keras.utils.Sequence instance, y should not be specified (since targets will be obtained from x).
min_lr – Minimum learning rate
max_lr – Maximum learning rate
batch_size – Number of samples per gradient update. Do not specify batch_size if your data is in the form of symbolic tensors, datasets, dataset iterators, generators, or tf.keras.utils.Sequence instances (since they generate batches)
steps – Number of steps to scan between min_lr and max_lr
Returns
Learning rates and the losses recorded for them
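The reference above does not state how the scanned learning rates are spaced between min_lr and max_lr. Range tests often use a geometric progression, so the sketch below shows what such a grid looks like for a given steps value. Both the geometric spacing and the helper name geometric_lr_grid are assumptions made for illustration, not confirmed behaviour of scan.

```python
def geometric_lr_grid(min_lr=0.0001, max_lr=1.0, steps=100):
    """Return `steps` learning rates evenly spaced on a log scale.

    Assumption for illustration: the actual spacing used by scan
    is not documented above and may be different (e.g. linear).
    """
    if steps < 2:
        raise ValueError("steps must be 2 or more")
    if not 0 < min_lr < max_lr:
        raise ValueError("expected 0 < min_lr < max_lr")
    # Constant multiplicative factor between consecutive learning rates
    ratio = (max_lr / min_lr) ** (1.0 / (steps - 1))
    return [min_lr * ratio ** i for i in range(steps)]
```

Under this assumption, the defaults (min_lr=0.0001, max_lr=1.0, steps=100) cover four orders of magnitude with roughly 25 learning rates per decade.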