Jan 8, 2021
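Simply explained, with a good, humorous analogy. It follows from this logic that the output of a ResNet block (with the skip connection indicated by the green bypass) can be at least equal to the input to that block, even if the weights and biases decay to zero, as long as we are using ReLU activation units. With zero weights the residual branch contributes nothing, so the block outputs ReLU(x), which equals x whenever x is non-negative (for example, when x is itself the output of an earlier ReLU).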
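To make that concrete, here is a minimal NumPy sketch. It assumes a simple fully connected two-layer residual branch rather than the convolutional layers a real ResNet uses, and the names `relu` and `residual_block` are just illustrative. It only checks the zero-weight case described above.

```python
import numpy as np

def relu(z):
    # Element-wise ReLU: negative values become 0.
    return np.maximum(z, 0.0)

def residual_block(x, W1, b1, W2, b2):
    # Residual branch F(x): two dense layers with a ReLU in between
    # (an assumed, simplified stand-in for the conv layers in a ResNet).
    h = relu(x @ W1 + b1)
    f = h @ W2 + b2
    # Skip connection adds the input back, followed by the final ReLU.
    return relu(f + x)

d = 4
rng = np.random.default_rng(0)

# Non-negative input, as if it came out of an earlier ReLU.
x = relu(rng.normal(size=(3, d)))

# Weights and biases fully decayed to zero.
W1, b1 = np.zeros((d, d)), np.zeros(d)
W2, b2 = np.zeros((d, d)), np.zeros(d)

out = residual_block(x, W1, b1, W2, b2)
print(np.array_equal(out, x))
```

If the input could contain negative values, the block would return ReLU(x) rather than x, which is why the ReLU condition matters here.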