Like Inception, ResNet has undergone several iterative improvements to its architecture, such as the addition of bottleneck convolutions or the use of smaller kernels. Like VGG, ResNet also has several pseudo-standardized versions characterized by their depth: ResNet-18, ResNet-50, ResNet-101, ResNet-152, and others. Indeed, the winning ResNet network for ILSVRC 2015 vertically stacked 152 trainable layers (with a total of 60 million parameters), which was an impressive feat at that time:
In the preceding diagram, all the convolutional and max-pooling layers use SAME padding and a stride of s = 1 unless specified otherwise. Batch normalization is applied after each 3 × 3 convolution (on the residual path, in gray), and the 1 × 1 convolutions (on the mapping path, in black) have no activation function (they are identity mappings).
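To make this block structure concrete, here is a minimal sketch of such a residual block using TensorFlow 2's Keras API. Note that `residual_block` is a hypothetical helper name, and the layer ordering follows the description above (batch normalization only after the 3 × 3 convolutions, and a linear 1 × 1 convolution on the mapping path) rather than any particular reference implementation:

```python
import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, filters, strides=1):
    """Residual block as described above (hypothetical helper)."""
    # Residual path: two 3x3 convolutions with SAME padding,
    # each followed by batch normalization.
    residual = layers.Conv2D(filters, 3, strides=strides,
                             padding="same", use_bias=False)(x)
    residual = layers.BatchNormalization()(residual)
    residual = layers.Activation("relu")(residual)
    residual = layers.Conv2D(filters, 3, padding="same",
                             use_bias=False)(residual)
    residual = layers.BatchNormalization()(residual)

    # Mapping (shortcut) path: the identity, or a 1x1 convolution
    # with no activation when the spatial size or channel count changes.
    if strides != 1 or x.shape[-1] != filters:
        shortcut = layers.Conv2D(filters, 1, strides=strides,
                                 padding="same")(x)
    else:
        shortcut = x

    # Element-wise addition of the two paths, then the final ReLU.
    return layers.Activation("relu")(layers.add([residual, shortcut]))

# Example usage: a block that doubles the channels and halves the
# spatial dimensions, as happens between ResNet stages.
inputs = tf.keras.Input(shape=(56, 56, 64))
outputs = residual_block(inputs, filters=128, strides=2)
model = tf.keras.Model(inputs, outputs)
```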
As we can see in Figure 4.5, the ResNet architecture is slimmer than the Inception architecture, though it is...
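The pseudo-standardized depths mentioned earlier are also available as pretrained models. As a minimal sketch, `tf.keras.applications` exposes several of them (ResNet-50, ResNet-101, and ResNet-152, though not ResNet-18):

```python
import tensorflow as tf

# Instantiate a ResNet-50 pretrained on ImageNet; setting
# include_top=False instead would drop the final classification
# layers for use as a feature extractor.
resnet50 = tf.keras.applications.ResNet50(weights="imagenet",
                                          include_top=True)
resnet50.summary()
```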