Supervised
{{< embed-notebook "/code/10khrs-ai-ml-dl/problems/1-supervised-learning/classification/binary/spam-nb.html" >}}
Embedded Notebook
History
Abstract
Fashion-MNIST is a drop-in replacement for MNIST. Released by Zalando Research in 2017, it packs 70 000 grayscale images of apparel (sneakers, shirts, coats and more), each 28 × 28 pixels, into a lightweight benchmark split into 60 000 training and 10 000 test examples. It keeps MNIST's file format and image size, so setup stays trivial, while the more varied images make classification harder.
Origins
The images come from Zalando’s product catalog, where each item is professionally photographed. Han Xiao et al. took front-view thumbnails of 70 000 products, converted them to 28 × 28 grayscale, grouped them into ten balanced classes, and open-sourced the result. The idea: raise the difficulty of MNIST without touching loaders or evaluation scripts.
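The drop-in claim rests on the file format: Fashion-MNIST ships in the same IDX binary layout as MNIST, so a parser written for one reads the other. The sketch below is a minimal IDX reader, assuming uncompressed unsigned-byte data (the distributed files are gzip-compressed and would need decompressing first). `parse_idx` is an illustrative helper, not part of any library, and the two buffers are synthetic stand-ins rather than real dataset files.

```python
import struct

import numpy as np


def parse_idx(buf: bytes) -> np.ndarray:
    """Parse an uncompressed IDX buffer of unsigned bytes into an array.

    Hypothetical helper for illustration; not taken from the original post.
    """
    # Header: two zero bytes, one type code (0x08 = uint8), one dimension count.
    zero, type_code, ndim = struct.unpack_from(">HBB", buf, 0)
    if zero != 0 or type_code != 0x08:
        raise ValueError("not an unsigned-byte IDX buffer")
    # Each dimension size is a 4-byte big-endian unsigned integer.
    shape = struct.unpack_from(f">{ndim}I", buf, 4)
    count = int(np.prod(shape))
    data = np.frombuffer(buf, dtype=np.uint8, count=count, offset=4 + 4 * ndim)
    return data.reshape(shape)


# Synthetic stand-ins: two blank 28 x 28 images and their two labels.
image_buf = struct.pack(">HBB3I", 0, 0x08, 3, 2, 28, 28) + bytes(2 * 28 * 28)
label_buf = struct.pack(">HBBI", 0, 0x08, 1, 2) + bytes([9, 0])

images = parse_idx(image_buf)
labels = parse_idx(label_buf)
print(images.shape, labels.tolist())  # (2, 28, 28) [9, 0]
```

Because the header carries the dimension count and sizes, the same function handles both the image files (three dimensions) and the label files (one dimension).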
import numpy as np
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
iris = datasets.load_iris()
#X = iris.data[:, :2]
"""results:
SVC with linear kernel Accuracy: 0.80
LinearSVC (linear kernel) Accuracy: 0.78
SVC with RBF kernel Accuracy: 0.80
SVC with polynomial (degree 3) Accuracy: 0.78
SVC with Monster kernel Accuracy: 0.82
"""
X = iris.data[:, :3]
"""results:
SVC with linear kernel Accuracy: 1.00
LinearSVC (linear kernel) Accuracy: 0.98
SVC with RBF kernel Accuracy: 1.00
SVC with polynomial (degree 3) Accuracy: 0.96
SVC with Monster kernel Accuracy: 0.91
"""
#X = iris.data
#1.00 accuracy on all methods
y = iris.target
# train / test split.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Random number generator with a fixed seed for reproducibility
rng = np.random.RandomState(42)
D = 196883
W = rng.randn(X.shape[1], D)  # random Gaussian projection matrix of shape (n_features, D)
def monster_kernel(X1, X2):  # pairwise inner products of the projected feature vectors
    X1_proj = np.dot(X1, W)  # projects the 2, 3 or 4 features into 196,883 dimensions
    X2_proj = np.dot(X2, W)  # same projection for the second set of samples
    return np.dot(X1_proj, X2_proj.T)  # returns the Gram matrix
# Regularization parameter
C = 1.0
# Define models
models = [
# one vs. one classifier, with dual problem formulation. slower
("SVC with linear kernel", svm.SVC(kernel="linear", C=C)),
# one vs. rest. primal, faster.
("LinearSVC (linear kernel)", svm.LinearSVC(C=C, max_iter=10000)),
("SVC with RBF kernel", svm.SVC(kernel="rbf", gamma=0.7, C=C)),
("SVC with polynomial (degree 3)", svm.SVC(kernel="poly", degree=3, gamma="auto", C=C)),
("SVC with Monster kernel", svm.SVC(kernel=monster_kernel, C=C))
]
# Train, predict, and print accuracy
print("Classification Accuracy:\n")
for name, clf in models:
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    acc = accuracy_score(y_test, y_pred)
    print(f"{name:<40} Accuracy: {acc:.2f}")
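The custom kernel in the script above can be read as a linear kernel in disguise: `np.dot(X1 @ W, (X2 @ W).T)` equals `X1 @ (W @ W.T) @ X2.T`, and for a Gaussian `W` with a large `D`, `W @ W.T / D` is close to the identity. The sketch below checks both statements numerically. The inputs `A` and `B` and the tolerance `0.02` are illustrative choices, not from the original. If this reading holds, the Monster kernel behaves roughly like the linear kernel multiplied by `D`, which for an SVM amounts to multiplying `C` by `D`; that may be one reason its accuracy differs from the plain linear kernel.

```python
import numpy as np

rng = np.random.RandomState(42)
n_features, D = 3, 196883
W = rng.randn(n_features, D)  # same construction as in the script above


def monster_kernel(X1, X2):
    # Gram matrix of inner products after the random projection.
    return np.dot(np.dot(X1, W), np.dot(X2, W).T)


# Illustrative inputs (assumed, not the Iris data): a few random feature vectors.
A = rng.rand(5, n_features)
B = rng.rand(4, n_features)

M = W @ W.T  # small (n_features x n_features) matrix

# 1) The kernel equals a bilinear form with M, i.e. a reweighted linear kernel.
same_as_bilinear = np.allclose(monster_kernel(A, B), A @ M @ B.T)

# 2) M / D is close to the identity, so the kernel is roughly D * (A @ B.T).
max_dev = np.abs(M / D - np.eye(n_features)).max()

print(same_as_bilinear, max_dev < 0.02)
```

A quick way to probe this further would be to rerun the plain linear `SVC` with a much larger `C` and compare its predictions with those of the Monster kernel.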
About
This document contains the code to create an RNN chatbot that emulates Kanye West’s speech style.