spiral_main.py loads the training data from spirals.csv, applies the specified model and produces a graph of the resulting function, along with the data. For this task there is no test set as such, but we instead judge the generalization by plotting the function computed by the network and making a visual assessment.
PolarNet which operates as follows: First, the input (x,y) is converted to polar coordinates (r,a) with r=sqrt(x*x + y*y), a=atan2(y,x). Next, (r,a) is fed into a fully connected neural network with one hidden layer using tanh activation, followed by a single output using sigmoid activation. The conversion to polar coordinates should be included in your forward() method, so that the Module performs the entire task of conversion followed by network layers.
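The description above can be sketched as a small PyTorch module. This is only a minimal illustration, assuming the input arrives as a tensor of shape (batch, 2) and that the constructor takes the hidden layer size as num_hid; the layer names in_to_hid and hid_to_out are placeholders, not names required by the assignment.

```python
import torch
import torch.nn as nn


class PolarNet(nn.Module):
    """Sketch: polar conversion, one tanh hidden layer, one sigmoid output."""

    def __init__(self, num_hid):
        super().__init__()
        # Layer names are illustrative assumptions.
        self.in_to_hid = nn.Linear(2, num_hid)
        self.hid_to_out = nn.Linear(num_hid, 1)

    def forward(self, xy):
        # Assumes xy has shape (batch, 2) holding the raw (x, y) pairs.
        x = xy[:, 0]
        y = xy[:, 1]
        r = torch.sqrt(x * x + y * y)   # r = sqrt(x*x + y*y)
        a = torch.atan2(y, x)           # a = atan2(y, x)
        polar = torch.stack((r, a), dim=1)
        hid1 = torch.tanh(self.in_to_hid(polar))
        return torch.sigmoid(self.hid_to_out(hid1))
```

Because the conversion happens inside forward(), callers pass raw (x,y) values and never see the polar representation.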
python3 spiral_main.py --net polar --hid 10

Try to find the minimum number of hidden nodes required so that this PolarNet learns to correctly classify all of the training data within 20000 epochs, on almost all runs. The graph_output() method will generate a picture of the function computed by your PolarNet called polar_out.png, which you should include in your report.
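The success criterion (all training data classified correctly within 20000 epochs) can be made concrete with a short sketch. The helper names accuracy and train_until_perfect are hypothetical, and the choices of binary cross-entropy loss, the Adam optimizer, a learning rate of 0.01 and a 0.5 decision threshold are assumptions; spiral_main.py may do this differently.

```python
import torch
import torch.nn.functional as F


def accuracy(output, target):
    """Percentage of items whose thresholded output matches the target."""
    pred = (output >= 0.5).float()   # assumed decision threshold
    return 100.0 * (pred == target).float().mean().item()


def train_until_perfect(net, data, target, max_epochs=20000, lr=0.01):
    """Train until every item is classified correctly.

    Returns the epoch at which 100% accuracy was first reached,
    or None if that never happened within max_epochs.
    """
    optimizer = torch.optim.Adam(net.parameters(), lr=lr)
    for epoch in range(1, max_epochs + 1):
        optimizer.zero_grad()
        output = net(data)
        loss = F.binary_cross_entropy(output, target)
        loss.backward()
        optimizer.step()
        if accuracy(output, target) == 100.0:
            return epoch
    return None
```

Repeating such a run several times with different random seeds gives a rough sense of whether a given number of hidden nodes succeeds "on almost all runs".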
RawNet which operates on the raw input (x,y) without converting to polar coordinates. Your network should consist of two fully connected hidden layers with tanh activation, plus the output layer, with sigmoid activation. You should not use Sequential but should instead build the network from individual components as shown in the program xor.py from Exercises 5 (repeated in slide 4 of lecture slides 3b on PyTorch). The number of neurons in both hidden layers should be determined by the parameter num_hid.
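A network of this shape, built from individual nn.Linear components rather than Sequential, might look like the sketch below. It assumes the input is a tensor of shape (batch, 2); the attribute names are illustrative, and xor.py may organise its components differently.

```python
import torch
import torch.nn as nn


class RawNet(nn.Module):
    """Sketch: two tanh hidden layers of num_hid units, one sigmoid output."""

    def __init__(self, num_hid):
        super().__init__()
        # Each layer is a separate component; no nn.Sequential is used.
        self.in_to_hid1 = nn.Linear(2, num_hid)
        self.hid1_to_hid2 = nn.Linear(num_hid, num_hid)
        self.hid2_to_out = nn.Linear(num_hid, 1)

    def forward(self, xy):
        # xy holds the raw (x, y) pairs; no polar conversion here.
        hid1 = torch.tanh(self.in_to_hid1(xy))
        hid2 = torch.tanh(self.hid1_to_hid2(hid1))
        return torch.sigmoid(self.hid2_to_out(hid2))
```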
python3 spiral_main.py --net raw

Keeping the number of hidden nodes in each layer fixed at 10, try to find a value for the size of the initial weights (--init) such that this RawNet learns to correctly classify all of the training data within 20000 epochs, on almost all runs. Include in your report the number of hidden nodes, and the values of any other metaparameters. The graph_output() method will generate a picture of the function computed by your RawNet called raw_out.png, which you should include in your report.
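To illustrate what "size of the initial weights" could mean in practice, the sketch below redraws every parameter of a network from a normal distribution whose standard deviation is the chosen size. The helper init_weights is hypothetical, and treating the --init value as a standard deviation is an assumption; check spiral_main.py for how the flag is actually applied.

```python
import torch
import torch.nn as nn


def init_weights(net, init):
    """Redraw all weights and biases from N(0, init^2).

    Assumption: 'init' plays the role of the --init value and is used
    as the standard deviation of the initial parameters.
    """
    with torch.no_grad():
        for param in net.parameters():
            param.normal_(0.0, init)
```

Under this reading, larger values push the tanh units towards saturation at the start of training, while very small values leave all hidden units nearly identical, which is why the setting can matter for how reliably the network learns.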
ShortNet which again operates on the raw input (x,y) without converting to polar coordinates. This network should again consist of two hidden layers (with tanh activation) plus the output layer (with sigmoid activation), but this time should include short-cut connections between every pair of layers (input, hid1, hid2 and output) as depicted on slide 10 of lecture slides 3a on Hidden Unit Dynamics. Note, however, that this diagram shows only two hidden nodes in each layer, which is not enough to learn the task; in your code the number of neurons in both hidden layers should be determined by the parameter num_hid.
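One way to realise short-cut connections between every pair of layers is to concatenate all earlier layers before each linear map, which is equivalent to summing separate linear maps from each earlier layer. The sketch below takes that approach; it assumes input of shape (batch, 2), and the attribute names are illustrative rather than prescribed.

```python
import torch
import torch.nn as nn


class ShortNet(nn.Module):
    """Sketch: two tanh hidden layers with short-cuts from all earlier layers."""

    def __init__(self, num_hid):
        super().__init__()
        self.to_hid1 = nn.Linear(2, num_hid)                # input -> hid1
        self.to_hid2 = nn.Linear(2 + num_hid, num_hid)      # input, hid1 -> hid2
        self.to_out = nn.Linear(2 + 2 * num_hid, 1)         # input, hid1, hid2 -> output

    def forward(self, xy):
        hid1 = torch.tanh(self.to_hid1(xy))
        hid2 = torch.tanh(self.to_hid2(torch.cat((xy, hid1), dim=1)))
        out_in = torch.cat((xy, hid1, hid2), dim=1)
        return torch.sigmoid(self.to_out(out_in))
```

Using one nn.Linear per pair of layers and adding their outputs before the activation would give the same function; concatenation simply keeps the number of components small.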
python3 spiral_main.py --net short

You should experiment to find a good value for the initial weight size, and try to find the minimum number of hidden nodes per layer so that this ShortNet learns to correctly classify all of the training data within 20000 epochs, on almost all runs. Include in your report the number of hidden nodes per layer, as well as the initial weight size and any other metaparameters. The graph_output() method will generate a picture of the function computed by your ShortNet called short_out.png, which you should include in your report.
Using graph_output() as a guide, write a method called graph_hidden(net, layer, node) which plots the activation (after applying the tanh function) of the hidden node with the specified number (node) in the specified layer (1 or 2). (Note: if net is of type PolarNet, graph_hidden() only needs to behave correctly when layer is 1.)
Hint: you might need to modify forward() so that the hidden unit activations are retained, i.e. replace hid1 = torch.tanh(...) with self.hid1 = torch.tanh(...)
Use this code to generate plots of all the hidden nodes in PolarNet, and all the hidden nodes in both layers of RawNet and ShortNet, and include them in your report.
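A rough sketch of such a method is given below. It is not derived from the actual graph_output() code: the grid bounds and step, the colour map, the helper hidden_grid and the stand-in TinyNet are all assumptions made for illustration. It relies on forward() storing its activations as self.hid1 and self.hid2, as the hint suggests.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in network that retains its hidden activations."""

    def __init__(self, num_hid):
        super().__init__()
        self.lin1 = nn.Linear(2, num_hid)
        self.lin2 = nn.Linear(num_hid, num_hid)
        self.lin3 = nn.Linear(num_hid, 1)

    def forward(self, xy):
        self.hid1 = torch.tanh(self.lin1(xy))          # retained for plotting
        self.hid2 = torch.tanh(self.lin2(self.hid1))   # retained for plotting
        return torch.sigmoid(self.lin3(self.hid2))


def hidden_grid(net, layer, node, step=0.1):
    """Evaluate one hidden node on a grid of (x, y) points.

    The grid bounds below are assumed, not taken from graph_output().
    Returns (xs, ys, act) with act of shape (len(ys), len(xs)).
    """
    xs = torch.arange(-7.0, 7.1, step)
    ys = torch.arange(-6.6, 6.7, step)
    xcoord = xs.repeat(ys.size(0))                      # x varies fastest
    ycoord = torch.repeat_interleave(ys, xs.size(0))    # y constant per row
    grid = torch.stack((xcoord, ycoord), dim=1)
    with torch.no_grad():
        net(grid)                                       # fills net.hid1 / net.hid2
        hid = net.hid1 if layer == 1 else net.hid2
        act = hid[:, node].reshape(ys.size(0), xs.size(0))
    return xs, ys, act


def graph_hidden(net, layer, node):
    """Draw the tanh activation of the chosen hidden node as a colour map."""
    import matplotlib.pyplot as plt   # imported here so hidden_grid works without it
    xs, ys, act = hidden_grid(net, layer, node)
    plt.clf()
    plt.pcolormesh(xs.numpy(), ys.numpy(), act.numpy(),
                   cmap='Wistia', shading='auto', vmin=-1.0, vmax=1.0)
```

Looping over range(num_hid) for each layer and saving the figure after every call (for example with plt.savefig and a file name of your choosing) would produce one image per hidden node.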