Examples

2026-04-29

Decision Trees

Entropy and Information Gain

Definition (Entropy)

The entropy of a dataset \(S\) with classes \(C\) is:

\[H(S) = -\sum_{c \in C} p_c \log_2(p_c)\]

where \(p_c\) is the proportion of examples belonging to class \(c\). Entropy is maximised when classes are equally distributed and zero when all examples belong to a single class.
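
A minimal Python sketch of the formula (the helper name and the toy label lists are illustrative, not from these notes):

```python
import math
from collections import Counter

def entropy(labels):
    """H(S) = -sum over classes c of p_c * log2(p_c)."""
    n = len(labels)
    return -sum((k / n) * math.log2(k / n) for k in Counter(labels).values())

# Equal class proportions maximise entropy (1 bit for two balanced classes) ...
print(entropy(["yes", "yes", "no", "no"]))  # 1.0
# ... and a pure set has zero entropy.
print(entropy(["no", "no", "no"]))          # -0.0, i.e. zero
```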

Definition (Information Gain)

The information gain of splitting dataset \(S\) on attribute \(A\) is:

\[\text{IG}(S, A) = H(S) - \sum_{v \in \text{Values}(A)} \frac{|S_v|}{|S|} H(S_v)\]
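
Building on the same helper, a sketch of the gain computation for examples stored as dicts (the data layout, attribute names, and toy values are illustrative assumptions, not from these notes):

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """H(S), as in the previous sketch."""
    n = len(labels)
    return -sum((k / n) * math.log2(k / n) for k in Counter(labels).values())

def information_gain(examples, attribute, label="label"):
    """IG(S, A) = H(S) - sum_v |S_v|/|S| * H(S_v)."""
    # Partition S into the subsets S_v, one per value v taken by attribute A.
    subsets = defaultdict(list)
    for ex in examples:
        subsets[ex[attribute]].append(ex[label])
    remainder = sum(len(s) / len(examples) * entropy(s) for s in subsets.values())
    return entropy([ex[label] for ex in examples]) - remainder

# Toy split: "wind" separates the labels fairly well, so the gain is positive.
S = [
    {"wind": "weak",   "label": "yes"},
    {"wind": "weak",   "label": "yes"},
    {"wind": "strong", "label": "no"},
    {"wind": "strong", "label": "yes"},
]
print(information_gain(S, "wind"))  # ~0.311
```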

Read more >

Master Theorem

Divide-and-Conquer Recurrences

Many divide-and-conquer algorithms follow the same pattern: split the input into smaller pieces, solve each piece recursively, and combine the results. This shared structure means their running times satisfy recurrences of the same shape.

Definition (Divide-and-Conquer Recurrence)

A divide-and-conquer recurrence has the form:

\[T(n) = aT\!\left(\left\lceil n/b \right\rceil\right) + \Theta(n^d)\]

where:

  • \(a \geq 1\) is the number of subproblems (the branching factor),
  • \(b > 1\) is the factor by which the input shrinks at each level,
  • \(\Theta(n^d)\) is the cost of the divide and combine work at each call (a standard instance follows this list).
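
Merge sort fits this shape: it splits its input into \(a = 2\) halves of size about \(n/2\) (so \(b = 2\)) and merges them in linear time (\(d = 1\)), giving

\[T(n) = 2T\!\left(\left\lceil n/2 \right\rceil\right) + \Theta(n)\]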

Three forces compete: the branching factor \(a\) creates more work at each level, the shrinkage factor \(b\) makes each subproblem cheaper, and the non-recursive cost \(n^d\) sets the price of splitting and merging. The Master Theorem tells us which force wins.
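
As a minimal sketch of how the standard three-case statement resolves that competition, by comparing \(d\) with the critical exponent \(\log_b a\) (the helper below and its interface are illustrative, not from these notes):

```python
import math

def master_case(a, b, d):
    """Classify T(n) = a*T(n/b) + Theta(n^d) by the standard three cases
    (assumes a >= 1, b > 1, d >= 0)."""
    crit = math.log(a, b)                 # critical exponent log_b(a)
    if math.isclose(d, crit):
        return f"Theta(n^{d} * log n)"    # the two costs balance at every level
    if d > crit:
        return f"Theta(n^{d})"            # divide-and-combine cost dominates
    return f"Theta(n^{crit:g})"           # recursive (leaf-level) work dominates

# Merge sort: a = 2, b = 2, d = 1 is the balanced case, Theta(n log n).
print(master_case(2, 2, 1))  # Theta(n^1 * log n)
```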

Read more >