{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Practice Lab: Neural Networks for Handwritten Digit Recognition, Multiclass \n",
"\n",
"In this exercise, you will use a neural network to recognize the handwritten digits 0-9.\n",
"\n",
"\n",
"# Outline\n",
"- [ 1 - Packages ](#1)\n",
"- [ 2 - ReLU Activation](#2)\n",
"- [ 3 - Softmax Function](#3)\n",
" - [ Exercise 1](#ex01)\n",
"- [ 4 - Neural Networks](#4)\n",
" - [ 4.1 Problem Statement](#4.1)\n",
" - [ 4.2 Dataset](#4.2)\n",
" - [ 4.3 Model representation](#4.3)\n",
" - [ 4.4 Tensorflow Model Implementation](#4.4)\n",
" - [ 4.5 Softmax placement](#4.5)\n",
" - [ Exercise 2](#ex02)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"_**NOTE:** To prevent errors from the autograder, you are not allowed to edit or delete non-graded cells in this notebook. Please also refrain from adding any new cells. \n",
"**Once you have passed this assignment** and want to experiment with any of the non-graded code, you may follow the instructions at the bottom of this notebook._"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": []
},
"source": [
"\n",
"## 1 - Packages \n",
"\n",
"First, let's run the cell below to import all the packages that you will need during this assignment.\n",
"- [numpy](https://numpy.org/) is the fundamental package for scientific computing with Python.\n",
"- [matplotlib](http://matplotlib.org) is a popular library to plot graphs in Python.\n",
"- [tensorflow](https://www.tensorflow.org/) is a popular platform for machine learning."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"deletable": false,
"editable": false
},
"outputs": [],
"source": [
"import numpy as np\n",
"import tensorflow as tf\n",
"from tensorflow.keras.models import Sequential\n",
"from tensorflow.keras.layers import Dense\n",
"from tensorflow.keras.activations import linear, relu, sigmoid\n",
"%matplotlib widget\n",
"import matplotlib.pyplot as plt\n",
"plt.style.use('./deeplearning.mplstyle')\n",
"\n",
"import logging\n",
"logging.getLogger(\"tensorflow\").setLevel(logging.ERROR)\n",
"tf.autograph.set_verbosity(0)\n",
"\n",
"from public_tests import * \n",
"\n",
"from autils import *\n",
"from lab_utils_softmax import plt_softmax\n",
"np.set_printoptions(precision=2)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"## 2 - ReLU Activation\n",
"This week, a new activation was introduced, the Rectified Linear Unit (ReLU). \n",
"$$ a = \\max(0,z) \\quad\\quad\\text{# ReLU function} $$"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"deletable": false,
"editable": false
},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "04c714ce9c1a44f8b19dd7429c7eb4ab",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Canvas(toolbar=Toolbar(toolitems=[('Home', 'Reset original view', 'home', 'home'), ('Back', 'Back to previous …"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"plt_act_trio()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"The example from the lecture on the right shows an application of the ReLU. In this example, the derived \"awareness\" feature is not binary but has a continuous range of values. The sigmoid is best for on/off or binary situations. The ReLU provides a continuous linear relationship. Additionally, it has an \"off\" range where the output is zero. \n",
"The \"off\" feature makes the ReLU a non-linear activation. Why is this needed? It enables multiple units to contribute to the resulting function without interfering with one another. This is examined further in the supporting optional lab. "
]
},
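{
"cell_type": "markdown",
"metadata": {},
"source": [
"One way to see the non-interference point is a small worked sketch. The two units and their offsets here are hypothetical, chosen only for illustration and not taken from the lab. Summing two ReLU units, $f(x) = \\max(0, x) + \\max(0, x - 1)$, gives a piecewise-linear function:\n",
"\n",
"$$ f(x) = \\begin{cases} 0 & x \\le 0 \\\\ x & 0 \\lt x \\le 1 \\\\ 2x - 1 & x \\gt 1 \\end{cases} $$\n",
"\n",
"The second unit is \"off\" (outputs zero) for $x \\le 1$, so it leaves the first segment untouched and only adds slope beyond $x = 1$."
]
},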
{
"cell_type": "markdown",
"metadata": {
"tags": []
},
"source": [
"\n",
"## 3 - Softmax Function\n",
"A multiclass neural network generates N outputs. One output is selected as the predicted answer. In the output layer, a vector $\\mathbf{z}$ is generated by a linear function and then fed into a softmax function. The softmax function converts $\\mathbf{z}$ into a probability distribution as described below. After applying softmax, each output will be between 0 and 1 and the outputs will sum to 1. They can be interpreted as probabilities. Larger inputs to the softmax correspond to larger output probabilities.\n",
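"\n",
"A small worked example may make these properties concrete. It assumes the standard softmax definition $a_j = e^{z_j} / \\sum_{k} e^{z_k}$ and an illustrative input $\\mathbf{z} = (1, 2, 3)$ that is not taken from the lab:\n",
"\n",
"$$ e^{1} \\approx 2.718, \\quad e^{2} \\approx 7.389, \\quad e^{3} \\approx 20.086, \\quad \\sum_{k} e^{z_k} \\approx 30.193 $$\n",
"$$ \\mathbf{a} \\approx (0.090,\\ 0.245,\\ 0.665) $$\n",
"\n",
"Each output lies between 0 and 1, the outputs sum to 1, and the largest input receives the largest probability.\n",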
"\n",
"Important Note: Please only do this when you've already passed the assignment to avoid problems with the autograder.\n",
"\n",
"Here's a short demo of how to do the steps above: \n"