To implement those techniques, we should define the regularization loss and attach it to every target layer. At each training iteration, these additional losses should be computed over the layers' parameters and summed with the main task-specific loss (for instance, the cross-entropy over the network's predictions), so that the optimizer can backpropagate them all together. Thankfully, TensorFlow 2 provides several tools to simplify this process.
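As a minimal sketch of what this looks like in a custom training step, assuming a `model` whose layers already carry additional losses (attached as shown in the next code example) and a task-specific `loss_fn` such as `tf.keras.losses.SparseCategoricalCrossentropy()`:

```python
import tensorflow as tf

def train_step(model, optimizer, loss_fn, images, labels):
    with tf.GradientTape() as tape:
        predictions = model(images, training=True)
        task_loss = loss_fn(labels, predictions)  # main task-specific loss
        # Losses attached to the model/layers (assumed non-empty here):
        reg_losses = model.losses
        # Sum everything so a single backward pass covers all the terms:
        total_loss = task_loss + tf.math.add_n(reg_losses)
    gradients = tape.gradient(total_loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    return total_loss
```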
Additional losses can be attached to tf.keras.layers.Layer and tf.keras.Model instances through their .add_loss(losses, ...) method, where losses are either loss tensors or zero-argument callables returning the loss values. Once properly added to a layer (see the following code), these losses will be computed every time the layer/model is called. All the losses attached to a Layer or Model instance, as well as the losses attached to its sublayers, will be computed, and the list of resulting values can be retrieved through the instance's .losses property.
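The following hypothetical example illustrates this mechanism by manually attaching an L2 penalty to a Dense layer's kernel; the layer dimensions and the coefficient value are arbitrary choices for illustration:

```python
import tensorflow as tf

l2_coef = 1e-4  # arbitrary regularization coefficient
dense = tf.keras.layers.Dense(units=64, activation='relu')
dense.build(input_shape=(None, 32))  # create the kernel/bias variables

# A zero-argument callable returning the loss value, re-evaluated
# every time the layer's losses are collected:
dense.add_loss(lambda: l2_coef * tf.reduce_sum(tf.square(dense.kernel)))

# The attached losses (including any sublayers') are listed in `.losses`:
print(dense.losses)  # [<tf.Tensor: ... the L2 penalty value>]
```

For standard penalties like this one, note that built-in layers also accept kernel_regularizer/bias_regularizer arguments, which attach such losses automatically.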