Notes

Memory

Honestly, the diagrams that I wish to reproduce already exist here. This page is currently under construction and probably will be until I finish my doctorate.

“Memory is the mother of all wisdom.” — Aeschylus

Babbage's Big Brain

Memory as a Hierarchy — Not a Monolith

Hierarchy exists for two intertwined reasons:

  1. Physics – Smaller structures are faster and nearer to ALUs but hold less data; larger structures store more but are farther away and thus slower.
  2. Economics – Fast memory costs disproportionately more per byte.

An efficient system arranges multiple layers so that the majority of accesses hit the small, fast part, while the bulk of bytes reside in the large, cheap part.
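To put rough numbers on that idea, here is a minimal sketch of the average memory access time (AMAT) calculation for a layered hierarchy. The function name `amat` and all latencies and miss rates below are illustrative assumptions, not measurements from any real machine.

```python
def amat(levels, memory_latency):
    """Average memory access time for a cache hierarchy.

    levels: list of (hit_time, miss_rate) pairs, fastest level first.
    memory_latency: cost of going all the way to main memory.
    All values share one unit (here: cycles).
    """
    # Work backwards from main memory: the miss penalty of each level
    # is the average access time of everything below it.
    penalty = memory_latency
    for hit_time, miss_rate in reversed(levels):
        penalty = hit_time + miss_rate * penalty
    return penalty


# Hypothetical two-level cache in front of a 100-cycle main memory.
hierarchy = [
    (1, 0.05),   # L1: 1-cycle hit, 5% of accesses miss
    (10, 0.20),  # L2: 10-cycle hit, 20% of L1 misses also miss here
]
average = amat(hierarchy, memory_latency=100)
print(f"average access time: {average:.2f} cycles")
```

With these made-up figures the average access costs about 2.5 cycles even though main memory costs 100, because most accesses are served by the small, fast layer.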


Optimiser Paradigms in Machine Learning

deep learning pipeline

Recall that training a neural network follows these steps:

  1. Pass data (forward) through model to get predicted values
  2. Calculate loss with predicted values against labels
  3. Perform backpropagation to get the gradient of the loss w.r.t. each weight / bias, i.e. the direction in which to move that parameter so the loss decreases (ideally toward the global minimum)
  4. Update parameters with gradients using an optimiser.
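The four steps above can be sketched in plain Python. This is a toy example under loud assumptions: the "network" is a single linear unit `w * x + b`, the loss is mean squared error, the gradients are written out by hand instead of using an autodiff library, and the optimiser is vanilla gradient descent. The data and the `train` helper are hypothetical.

```python
# Toy data generated from y = 2x + 1 (hypothetical example data).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]


def train(xs, ys, lr=0.05, epochs=2000):
    w, b = 0.0, 0.0
    n = len(xs)
    losses = []
    for _ in range(epochs):
        # 1. Forward pass: predicted values from the current parameters.
        preds = [w * x + b for x in xs]
        # 2. Loss: mean squared error between predictions and labels.
        loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / n
        losses.append(loss)
        # 3. Backward pass: gradient of the loss w.r.t. each parameter.
        grad_w = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / n
        grad_b = sum(2 * (p - y) for p, y in zip(preds, ys)) / n
        # 4. Optimiser step: move each parameter against its gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, losses


w, b, losses = train(xs, ys)
print(f"w={w:.3f}, b={b:.3f}, final loss={losses[-1]:.6f}")
```

In a real framework, step 3 is handled by automatic differentiation and step 4 by an optimiser object, but the shape of the loop stays the same.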

momentum

Think of a ball rolling down the loss surface: when the slope flips direction, the ball's pace slows down. This makes total sense! If successive gradients have the same sign, your confidence in that direction increases and you move further. You want to take fewer steps overall.
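A minimal sketch of that intuition, assuming the classical momentum update `v = beta * v - lr * grad` (one common formulation; other variants exist). The helper `momentum_steps` and the gradient sequences are hypothetical, chosen only to contrast agreeing and disagreeing gradient signs.

```python
def momentum_steps(grads, lr=0.1, beta=0.9):
    """Return the parameter step taken at each iteration."""
    v = 0.0  # velocity: a decaying sum of past gradient steps
    steps = []
    for g in grads:
        v = beta * v - lr * g
        steps.append(v)
    return steps


# Gradients that keep the same sign: the velocity builds up.
same_sign = momentum_steps([1.0, 1.0, 1.0, 1.0, 1.0])

# Gradients that flip sign every step: the velocity mostly cancels.
flipping = momentum_steps([1.0, -1.0, 1.0, -1.0, 1.0])

print("same sign:", [round(s, 4) for s in same_sign])
print("flipping: ", [round(s, 4) for s in flipping])
```

When the signs agree, each step is larger in magnitude than the previous one, so fewer steps are needed to cover the same distance. When they flip, no step exceeds the plain `lr * grad` size, which damps the oscillation.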
