COMP9444 Neural Networks and Deep Learning

Twin Spirals Task

For Part 2 you will be training on the famous Two Spirals Problem (Lang and Witbrock, 1988). The supplied code spiral_main.py loads the training data from spirals.csv, applies the specified model and produces a graph of the resulting function, along with the data. For this task there is no test set as such, but we instead judge the generalization by plotting the function computed by the network and making a visual assessment.

Provide code for a PyTorch Module called PolarNet which operates as follows: First, the input (x,y) is converted to polar coordinates (r,a) with r=sqrt(x*x + y*y), a=atan2(y,x). Next, (r,a) is fed into a fully connected neural network with one hidden layer using tanh activation, followed by a single output using sigmoid activation. The conversion to polar coordinates should be included in your forward() method, so that the Module performs the entire task of conversion followed by network layers.
[1 mark] Run the code by typing
```
python3 spiral_main.py --net polar --hid 10
```
Try to find the minimum number of hidden nodes required so that this PolarNet learns to correctly classify all of the training data within 20000 epochs, on almost all runs. The graph_output() method will generate a picture of the function computed by your PolarNet called polar_out.png, which you should include in your report.
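The PolarNet description above could be sketched roughly as follows. This is a minimal sketch rather than a reference solution: the constructor argument num_hid and the layer names in_to_hid and hid_to_out are assumptions, and the input is assumed to be a batch of shape (N, 2).

```python
import torch
import torch.nn as nn


class PolarNet(nn.Module):
    """One tanh hidden layer on polar coordinates, sigmoid output (sketch)."""

    def __init__(self, num_hid):
        super().__init__()
        # Layer names are illustrative assumptions, not taken from the spec.
        self.in_to_hid = nn.Linear(2, num_hid)
        self.hid_to_out = nn.Linear(num_hid, 1)

    def forward(self, input):
        # Assumes input has shape (N, 2) with columns x and y.
        x = input[:, 0]
        y = input[:, 1]
        r = torch.sqrt(x * x + y * y)   # radius
        a = torch.atan2(y, x)           # angle in (-pi, pi]
        polar = torch.stack((r, a), dim=1)
        # Keep the hidden activations so they can be plotted later.
        self.hid1 = torch.tanh(self.in_to_hid(polar))
        return torch.sigmoid(self.hid_to_out(self.hid1))
```

Doing the conversion inside forward() means the same (x,y) grid used for plotting can be passed to the network unchanged.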
Provide code for a PyTorch Module called RawNet which operates on the raw input (x,y) without converting to polar coordinates. Your network should consist of two fully connected hidden layers with tanh activation, plus the output layer, with sigmoid activation. You should not use Sequential but should instead build the network from individual components as shown in the program xor.py from Exercises 5 (repeated in slide 4 of lecture slides 3b on PyTorch). The number of neurons in both hidden layers should be determined by the parameter num_hid.
Run the code by typing
```
python3 spiral_main.py --net raw
```
Keeping the number of hidden nodes in each layer fixed at 10, try to find a value for the size of the initial weights (--init) such that this RawNet learns to correctly classify all of the training data within 20000 epochs, on almost all runs. Include in your report the number of hidden nodes, and the values of any other metaparameters. The graph_output() method will generate a picture of the function computed by your RawNet called raw_out.png, which you should include in your report.
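One possible shape for RawNet, built from individual nn.Linear components rather than Sequential, is sketched below. The layer names are assumptions. The helper scale_initial_weights is hypothetical: it only illustrates what "size of the initial weights" might mean (here, the standard deviation of a zero-mean normal distribution) and may differ from how spiral_main.py actually applies --init.

```python
import torch
import torch.nn as nn


class RawNet(nn.Module):
    """Two tanh hidden layers on raw (x, y), sigmoid output (sketch)."""

    def __init__(self, num_hid):
        super().__init__()
        # Both hidden layers have num_hid units, as the task requires.
        self.in_to_hid1 = nn.Linear(2, num_hid)
        self.hid1_to_hid2 = nn.Linear(num_hid, num_hid)
        self.hid2_to_out = nn.Linear(num_hid, 1)

    def forward(self, input):
        # Activations are stored on self so they can be plotted later.
        self.hid1 = torch.tanh(self.in_to_hid1(input))
        self.hid2 = torch.tanh(self.hid1_to_hid2(self.hid1))
        return torch.sigmoid(self.hid2_to_out(self.hid2))


def scale_initial_weights(net, init):
    """Hypothetical helper: redraw every Linear weight from N(0, init^2)."""
    for module in net.modules():
        if isinstance(module, nn.Linear):
            nn.init.normal_(module.weight, mean=0.0, std=init)
```

With tanh units, very small initial weights tend to give a nearly linear network at the start of training, while very large ones saturate the units, which is why the initial weight size is worth tuning here.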
Provide code for a PyTorch Module called ShortNet which again operates on the raw input (x,y) without converting to polar coordinates. This network should again consist of two hidden layers (with tanh activation) plus the output layer (with sigmoid activation), but this time should include short-cut connections between every pair of layers (input, hid1, hid2 and output) as depicted on slide 10 of lecture slides 3a on Hidden Unit Dynamics. Note, however, that this diagram shows only two hidden nodes in each layer, which is not enough to learn the task; in your code the number of neurons in both hidden layers should be determined by the parameter num_hid.
Run the code by typing
```
python3 spiral_main.py --net short
```
You should experiment to find a good value for the initial weight size, and try to find the minimum number of hidden nodes per layer so that this ShortNet learns to correctly classify all of the training data within 20000 epochs, on almost all runs. Include in your report the number of hidden nodes per layer, as well as the initial weight size and any other metaparameters. The graph_output() method will generate a picture of the function computed by your ShortNet called short_out.png, which you should include in your report.
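The short-cut connections described for ShortNet can be realised in more than one way. The sketch below is one option, under the assumption that concatenating all earlier layers before each nn.Linear is acceptable; the layer names are illustrative. Concatenation is equivalent to summing separate linear maps from each earlier layer, with their biases merged into one.

```python
import torch
import torch.nn as nn


class ShortNet(nn.Module):
    """Two tanh hidden layers with short-cuts between every pair of layers (sketch)."""

    def __init__(self, num_hid):
        super().__init__()
        # hid1 sees the input; hid2 sees input + hid1; output sees all three.
        self.to_hid1 = nn.Linear(2, num_hid)
        self.to_hid2 = nn.Linear(2 + num_hid, num_hid)
        self.to_out = nn.Linear(2 + 2 * num_hid, 1)

    def forward(self, input):
        self.hid1 = torch.tanh(self.to_hid1(input))
        hid2_in = torch.cat((input, self.hid1), dim=1)
        self.hid2 = torch.tanh(self.to_hid2(hid2_in))
        out_in = torch.cat((input, self.hid1, self.hid2), dim=1)
        return torch.sigmoid(self.to_out(out_in))
```

An alternative with the same connectivity is to declare six separate nn.Linear layers (input to hid1, input to hid2, hid1 to hid2, and one from each of the three earlier layers to the output) and add their results before each activation.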
Using graph_output() as a guide, write a method called graph_hidden(net, layer, node) which plots the activation (after applying the tanh function) of the hidden node with the specified number (node) in the specified layer (1 or 2). (Note: if net is of type PolarNet, graph_hidden() only needs to behave correctly when layer is 1).
Hint: you might need to modify forward() so that the hidden unit activations are retained, i.e. replace hid1 = torch.tanh(...) with self.hid1 = torch.tanh(...)
Use this code to generate plots of all the hidden nodes in PolarNet, and all the hidden nodes in both layers of RawNet and ShortNet, and include them in your report.
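A rough sketch of graph_hidden() is given below. It does not reproduce the supplied graph_output(), whose internals are not shown here. The grid range of -7 to 7, the step count, the helper hidden_activation_grid and the use of pcolormesh are all assumptions, and the network is assumed to store its activations in self.hid1 and self.hid2 as suggested in the hint. Saving the figure is left to the caller.

```python
import torch


def hidden_activation_grid(net, layer, node, lo=-7.0, hi=7.0, steps=141):
    """Hypothetical helper: evaluate one hidden node over a square (x, y) grid.

    Returns (xs, ys, act), where act[j, i] is the activation at (xs[i], ys[j]).
    """
    xs = torch.linspace(lo, hi, steps)
    ys = torch.linspace(lo, hi, steps)
    # Build every (x, y) pair, with x varying fastest.
    xcoord = xs.repeat(ys.size(0))
    ycoord = ys.repeat_interleave(xs.size(0))
    grid = torch.stack((xcoord, ycoord), dim=1)
    with torch.no_grad():
        net.eval()
        net(grid)  # forward pass fills net.hid1 (and net.hid2 if present)
        hid = net.hid1 if layer == 1 else net.hid2
        act = hid[:, node].reshape(ys.size(0), xs.size(0))
    return xs, ys, act


def graph_hidden(net, layer, node):
    """Draw the tanh activation of one hidden node on the current figure."""
    # Imported here so the grid helper can be used without matplotlib.
    import matplotlib.pyplot as plt

    xs, ys, act = hidden_activation_grid(net, layer, node)
    plt.clf()
    # tanh outputs lie in [-1, 1], so fix the colour scale to that range.
    plt.pcolormesh(xs.numpy(), ys.numpy(), act.numpy(),
                   shading="auto", vmin=-1.0, vmax=1.0)
```

Looping over layer and node and saving each figure under a distinct file name would then produce the full set of hidden unit plots.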