Title:
Neural Networks: Initializations and Global Minima
Abstract:
Initializing the weights and the biases is a key
part of the training process of a neural network.
Unlike the subsequent optimization phase, however,
the initialization phase has received only limited
attention in the literature. In the first part of the
talk I will discuss some consequences of commonly
used initialization strategies for vanilla DNNs with
ReLU activations. Based on these insights, I will
then introduce an alternative initialization strategy
and present some corresponding large-scale
experiments assessing its quality.
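
For reference, one commonly used strategy for ReLU networks is
He (Kaiming) initialization; the abstract does not name the
specific strategies discussed, so the following is only a minimal
NumPy sketch, with layer widths chosen purely for illustration:

    import numpy as np

    rng = np.random.default_rng(0)

    def he_init(fan_in, fan_out):
        # He initialization for a ReLU layer: weights drawn from
        # N(0, 2/fan_in) keep the variance of pre-activations roughly
        # constant across layers; biases typically start at zero.
        W = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_out, fan_in))
        b = np.zeros(fan_out)
        return W, b

    # Weights and biases for a vanilla DNN with (illustrative) layer
    # widths 784 -> 256 -> 256 -> 10.
    widths = [784, 256, 256, 10]
    params = [he_init(m, n) for m, n in zip(widths[:-1], widths[1:])]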
In the second part of the talk, I will discuss the
statistical properties of different global minima in
over-parametrized DNNs.
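
The talk does not specify the models, but as a toy stand-in, even an
over-parametrized linear model already has infinitely many global
minima of the squared loss, and these can differ markedly off the
training set (a minimal NumPy sketch, all numbers illustrative):

    import numpy as np

    rng = np.random.default_rng(1)

    # 5 samples, 20 parameters: the squared loss has infinitely many
    # global minima, each achieving zero training error.
    n, d = 5, 20
    X, y = rng.normal(size=(n, d)), rng.normal(size=n)

    w_min = np.linalg.pinv(X) @ y              # minimum-norm interpolator
    P_null = np.eye(d) - np.linalg.pinv(X) @ X  # projector onto null(X)
    w_alt = w_min + P_null @ rng.normal(size=d)  # another global minimum

    x_test = rng.normal(size=d)
    print(np.allclose(X @ w_min, y), np.allclose(X @ w_alt, y))  # True True
    print(x_test @ w_min, x_test @ w_alt)  # different test predictions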