Supervised Learning

Decision Trees

Entropy and Information Gain

Definition (Entropy)

The entropy of a dataset \(S\) with classes \(C\) is:

\[H(S) = -\sum_{c \in C} p_c \log_2(p_c)\]

where \(p_c\) is the proportion of examples belonging to class \(c\). Entropy is maximised when classes are equally distributed and zero when all examples belong to a single class.
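As a quick sketch of the definition above, entropy can be computed directly from class counts; the function name and the toy label lists are illustrative, not part of the original notes:

```python
from collections import Counter
import math

def entropy(labels):
    """Shannon entropy (base 2) of a collection of class labels."""
    n = len(labels)
    # p_c = count_c / n for each class c; sum of -p_c * log2(p_c)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

print(entropy(["a", "a", "b", "b"]))  # equally distributed classes -> 1.0
print(entropy(["a", "a", "a"]))       # single class -> 0
```

Note that classes absent from the data contribute nothing, consistent with the convention \(0 \log_2 0 = 0\).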

Definition (Information Gain)

The information gain of splitting dataset \(S\) on attribute \(A\) is:

\[\text{IG}(S, A) = H(S) - \sum_{v \in \text{Values}(A)} \frac{|S_v|}{|S|} H(S_v)\]
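A minimal sketch of this formula, assuming examples are represented as dictionaries mapping attribute names to values (the dataset and attribute name below are invented for illustration):

```python
from collections import Counter, defaultdict
import math

def entropy(labels):
    """Shannon entropy (base 2) of a collection of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(examples, labels, attr):
    """IG(S, A): reduction in entropy from splitting S on attribute A."""
    # Partition the labels into subsets S_v, one per value v of the attribute.
    subsets = defaultdict(list)
    for x, y in zip(examples, labels):
        subsets[x[attr]].append(y)
    n = len(labels)
    # Weighted average entropy of the subsets: sum_v |S_v|/|S| * H(S_v)
    remainder = sum(len(sv) / n * entropy(sv) for sv in subsets.values())
    return entropy(labels) - remainder

# Toy example: "wind" perfectly predicts the label, so IG equals H(S) = 1.0.
examples = [{"wind": "weak"}, {"wind": "strong"}, {"wind": "weak"}, {"wind": "strong"}]
labels = ["yes", "no", "yes", "no"]
print(information_gain(examples, labels, "wind"))  # 1.0
```

An attribute that splits the data into pure subsets achieves the maximum gain \(H(S)\); an attribute whose subsets mirror the overall class proportions gains nothing.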


Perceptron

Origins

The perceptron learning algorithm is one of the simplest algorithms for binary classification.

It was introduced by Frank Rosenblatt in his seminal 1958 paper, “The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain”. The history, however, dates back further to the theoretical foundations laid by Warren McCulloch and Walter Pitts in their 1943 paper, “A Logical Calculus of the Ideas Immanent in Nervous Activity”. The interested reader may visit these links for annotations and the original PDFs.
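The learning rule commonly taught under this name can be sketched as follows; this is a plain mistake-driven version with labels in \(\{-1, +1\}\), and the function names and toy data are illustrative rather than taken from Rosenblatt's original (probabilistic) formulation:

```python
def perceptron_train(X, y, epochs=10):
    """Train a perceptron: update weights only on misclassified examples.

    X: list of feature vectors; y: labels in {-1, +1}.
    Returns the learned weight vector and bias.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score <= 0:  # mistake: move the boundary toward xi
                w = [wj + yi * xj for wj, xj in zip(w, xi)]
                b += yi
    return w, b

# Toy linearly separable data (logical AND): only [1, 1] is positive.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [-1, -1, -1, 1]
w, b = perceptron_train(X, y)
```

If the data are linearly separable, this rule is guaranteed to converge after finitely many updates (the perceptron convergence theorem); on non-separable data it cycles forever, which is why a fixed epoch budget is used here.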
