Chapter 5: Why are deep neural networks hard to train?