* Context
This repo is my 10,000 Hours of Machine Learning.

*IMPORTANT: this repository eventually grew in scope and thus [[https://abaj.ai][abaj.ai]] was born. All of the files in this repository are a submodule of that repo, and all of the code is tangled with org files in the site.*

#+BEGIN_QUOTE
"Machine Learning is just lego for adults" - Dr. Kieran Samuel Owens
#+END_QUOTE

#+BEGIN_QUOTE
"S/he who has a why can bear almost any how." - Friedrich Nietzsche
#+END_QUOTE

/Why do anything else, when the thing that you could do, would do, everything else?/

-----

To become an expert at anything, there is a common denominator:

#+BEGIN_CENTER
10,000 hours of *deliberate practice* on the subject.
#+END_CENTER

* Structure of the Repository
There are three main sections:
1. PORTFOLIO - Solutions to *classical* problems: MNIST, Boston Housing, XOR, etc.
2. EDUCATION - Coursework from my universities and MOOCs (that which I am allowed to share), along with my textbook solutions.
3. PROFICIENCY - My more complex and non-trivial projects. They are more fun, but also more novel and thus less deterministic: a Kanye West chatbot, a Peg Solitaire reinforcement learner, Ultimate Frisbee computer vision, etc.

* PORTFOLIO
In no particular order, here is a list of the methods you will find in the notebooks. The emphasis is on understanding their limitations, benefits and constructions.
- Least Squares Regression
- Random Forests
- Boosting, Bagging
- Ensemble Methods
- Multilayer Perceptrons
- Naive Bayes
- K-means Clustering
- K-nearest Neighbours
- Logistic Regression
- Decision Trees
- SVMs
- Kernel Methods
- GANs
- Stable Diffusion
- Recurrent Neural Networks
- Convolutional Neural Networks
- Transformers
- word2vec, GloVe and NLP
- LLMs

** Projects
To gain proficiency in all of the above methods, I have solved classical problems that lend themselves well to each particular method:

|-----------------------+----------+------------------------|
| Dataset               | Accuracy | Model                  |
|-----------------------+----------+------------------------|
| MNIST                 | 92%      | Logistic Regression    |
| FMNIST                | B%       | Random Forest          |
| KMNIST                | C%       | 2-layer CNN            |
| CIFAR                 | D%       | CNN                    |
| IRIS                  | E%       | SVM                    |
| ImageNet              | F%       | ResNet50               |
| Sentiment140          | G%       | LSTM                   |
| Boston Housing        | H%       | Linear Regression      |
| Wine Quality          | I%       | Gradient Boosting      |
| Pima Indians Diabetes | J%       | Decision Tree          |
| IMDB Reviews          | K%       | BERT                   |
| KDD Cup 1999          | L%       | K-Means Clustering     |
| Digits                | M%       | Gaussian Mixture Model |
| CartPole              | N%       | Deep Q-Network         |
|-----------------------+----------+------------------------|

* EDUCATION
For mastery, a formal education is also required, either by way of open courseware or by paying an institution. I have done both, and benefited as a result.

- [X] UNSW AI
- [X] UNSW Machine Learning and Data Mining
- [X] UNSW Deep Learning and Neural Networks
- [ ] UNSW Computer Vision
- [ ] Stanford CS229 (Machine Learning)
- [ ] Stanford CS230 (Deep Learning)
- [ ] Mathematics for Machine Learning, Ong et al.
- [ ] HOML (Hands-On Machine Learning)
- [ ] All of Statistics, Larry Wasserman
- [X] Coursera Machine Learning Specialisation
- [X] Coursera Deep Learning Specialisation

* PROFICIENCY
To become proficient, I have applied my ML skills to problems of personal and social interest.
- [ ] Kanye West Producer
- [ ] KiTS19 Grand Challenge: Kidney and Kidney Tumour Segmentation
- [ ] Non-descriptive Ultimate Frisbee Statistics
- [ ] OCR
- [ ] Peg Solitaire RL

#+BEGIN_QUOTE
"Read 2 papers a week" - Andrew Ng
#+END_QUOTE
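In the spirit of understanding constructions rather than just calling libraries, the simplest portfolio method (logistic regression, as used for MNIST above) can be built from scratch in a few lines of NumPy. This is only an illustrative sketch on a synthetic two-blob dataset; the blob centres, learning rate and step count are arbitrary choices for the example, not values from any notebook in this repo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic binary classification problem: two Gaussian blobs in 2D.
n = 100
X0 = rng.normal(loc=[-2.0, -2.0], scale=1.0, size=(n, 2))
X1 = rng.normal(loc=[2.0, 2.0], scale=1.0, size=(n, 2))
X = np.vstack([X0, X1])
y = np.concatenate([np.zeros(n), np.ones(n)])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Full-batch gradient descent on the negative log-likelihood.
w = np.zeros(2)
b = 0.0
lr = 0.1
for _ in range(500):
    p = sigmoid(X @ w + b)          # predicted P(y=1 | x)
    grad_w = X.T @ (p - y) / len(y) # gradient w.r.t. weights
    grad_b = np.mean(p - y)         # gradient w.r.t. bias
    w -= lr * grad_w
    b -= lr * grad_b

preds = (sigmoid(X @ w + b) >= 0.5).astype(float)
accuracy = np.mean(preds == y)
print(f"training accuracy: {accuracy:.2f}")
```

Because the blobs are well separated, the learned decision boundary classifies nearly all points correctly; swapping in a real dataset loader (e.g. MNIST flattened to 784-dimensional vectors) changes only the data-loading lines.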