Deep Learning Tutorials
- 1. Get started with DeepSense
- 2. Tensorflow Preparation
- 3. Tensorflow Example Notebooks
- 4. Pytorch
- 5. Tensor Processing Units
1. Get started with DeepSense
Follow all the steps from Getting started and Getting started with Deep Learning. This tutorial assumes you can log on to the DeepSense compute platform and have a version of Anaconda Python on your path, with Tensorflow and Pytorch installed in an Anaconda environment.
2. Tensorflow Preparation
Download the example notebooks
Request a GPU session
bsub -Is -q gpu bash
Activate your Anaconda environment
conda activate tensorflow
Note: this assumes you've followed the getting started instructions and have created a Python environment called tensorflow with the tensorflow-gpu package installed from the IBM-AI repository. If not, please follow those instructions first.
Start a Jupyter notebook
jupyter notebook --no-browser --ip=0.0.0.0
Open an SSH tunnel to access the notebook
Just as we did in the Getting started with Deep Learning tutorial, open an SSH tunnel in another window:
ssh -l <user> login1.deepsense.ca -L <port>:ds-cmgpu-<num>:<port>
Open the notebook in your browser
In a web browser navigate to the page listed in the jupyter notebook output. Remember to replace the node name with
3. Tensorflow Example Notebooks
This is an introduction to Python Jupyter notebooks.
Select a box with code. Press the shift and enter keys together to run the code in that box. You'll notice a star appear beside running code and a number in brackets appear beside finished code to indicate the order in which code boxes were run.
Often in an example notebook you will see code that already has output cached. You still need to run all previous code boxes and may want to use the menu to clear all output. If there is an error then you can modify the code or fix the error (e.g. download a dependency) and try again.
If you are missing a required dependency in later notebooks then you can install that package into your anaconda environment in a terminal window and it will be immediately accessible from the notebook. You do not need to close and restart the notebook or SSH tunnel.
You can also run the entire notebook using the menu.
When you are finished with a notebook you should use the menu to halt the kernel before closing the notebook. This clears resources such as GPU memory.
This is an introduction to a basic machine learning model, kmeans. In this tutorial the kmeans algorithm is used to classify handwritten digits.
The kmeans algorithm works by grouping the training examples into clusters and then comparing each new input to the mean of each cluster; the input is assigned to the cluster whose mean is nearest.
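As a rough illustration of the clustering idea, the following pure-Python sketch runs kmeans on a handful of 2D points rather than on digit images. It is not the notebook's Tensorflow code: the function names, the toy data and the seed are assumptions made for this example, and `num_steps` only mirrors the name of the notebook variable.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centers):
    """Index of the cluster center closest to the given point."""
    return min(range(len(centers)),
               key=lambda i: squared_distance(point, centers[i]))


def kmeans(points, k, num_steps=50, seed=0):
    """Alternate between assigning points to clusters and updating the means."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k of the training points
    for _ in range(num_steps):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centers)].append(p)
        new_centers = []
        for i, members in enumerate(clusters):
            if members:
                # The new center is the coordinate-wise mean of the cluster.
                new_centers.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[i])  # keep an empty cluster's center
        if new_centers == centers:
            break  # assignments no longer change
        centers = new_centers
    return centers


# Hypothetical toy data: two well-separated groups of points.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centers = kmeans(points, k=2)

# A new input is compared to each cluster mean and assigned to the nearest.
print(nearest((0.1, 0.1), centers), nearest((5.0, 5.1), centers))
```

To turn clusters into a classifier, each cluster is typically given the most common label among its training examples, so that the nearest cluster determines the predicted class.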
Run the notebook and learn more about kmeans.
Observe that machine learning methods often have parameters that must be chosen. You can use default parameters but optimizing these parameters can greatly improve the accuracy of a model.
For a simple example, try increasing the length of training by changing the num_steps variable from 50 to 100 and running the notebook again.
What happens if you modify other parameters and run the notebook again? How do these parameters change the training time and accuracy of the model?
Note: depending on your version of tensorflow you may need to modify the notebook. In some versions of tensorflow the kmeans.training_graph() function returns a different number of variables than the notebook expects (such as the cluster_centers_vars variable which may need to be removed from the code).
This notebook uses a different model, random forests, to classify the handwritten digits. A random forest is a set of decision trees, each of which is trained to learn part of the problem.
Random forests are one of the most commonly used machine learning models because they are quick to train and give good performance on many tasks.
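The idea of combining many small trees can be sketched in a few lines of plain Python. This is a deliberately simplified stand-in, not the notebook's implementation: each "tree" is a single-split decision stump trained on a bootstrap sample and a randomly chosen feature, and the forest predicts by majority vote. All names and the toy data are assumptions made for this example.

```python
import random
from collections import Counter


def train_stump(samples, rng):
    """Fit a one-split tree on a randomly chosen feature."""
    feature = rng.randrange(len(samples[0][0]))
    best = None
    for threshold in sorted({x[feature] for x, _ in samples}):
        left = [y for x, y in samples if x[feature] <= threshold]
        right = [y for x, y in samples if x[feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = (sum(y != left_label for y in left)
                  + sum(y != right_label for y in right))
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return feature, threshold, left_label, right_label


def predict_stump(stump, x):
    feature, threshold, left_label, right_label = stump
    return left_label if x[feature] <= threshold else right_label


def train_forest(samples, num_trees=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(num_trees):
        bootstrap = [rng.choice(samples) for _ in samples]
        forest.append(train_stump(bootstrap, rng))
    return forest


def predict_forest(forest, x):
    """Majority vote over the predictions of all trees."""
    votes = Counter(predict_stump(stump, x) for stump in forest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (features, label) pairs for two classes.
samples = [((1.0, 1.0), 0), ((1.5, 2.0), 0), ((2.0, 1.5), 0),
           ((6.0, 6.5), 1), ((7.0, 6.0), 1), ((6.5, 7.0), 1)]
forest = train_forest(samples)
print(predict_forest(forest, (0.5, 0.5)), predict_forest(forest, (8.0, 8.0)))
```

Each stump sees only part of the data and only one feature, so no single tree is reliable on its own; the vote is what makes the ensemble robust. Real random forests use deeper trees and choose a random subset of features at every split.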
Run the notebook and observe that the random forest provides better accuracy than the kmeans model you trained previously.
Try changing parameters such as increasing the number of decision trees or the training time.
If you increase the number of trees or the training time too much, you will see worse performance on the test set than on the training set. This is called overfitting: the model learns specific details of the training set that do not generalize to the test set. It is a common problem in machine learning and needs to be considered whenever you train a model. To address it, you may need to simplify your model, give it more training examples by collecting more data, or use data augmentation.
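The gap between training and test accuracy can be reproduced with a model that simply memorizes its training data. The sketch below is an assumption-laden toy, unrelated to the notebook's code: it uses a 1-nearest-neighbour classifier on synthetic one-dimensional data with 20% of the labels flipped at random, and every name and constant is hypothetical.

```python
import random


def make_data(n, rng, noise=0.2):
    """Points in [0, 1) labelled 1 when x > 0.5, with some labels flipped."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < noise:
            label = 1 - label  # label noise that the model should not learn
        data.append((x, label))
    return data


def predict_nearest(train, x):
    """1-nearest-neighbour: copy the label of the closest training point."""
    return min(train, key=lambda item: abs(item[0] - x))[1]


def accuracy(train, data):
    """Fraction of examples in data that are predicted correctly."""
    return sum(predict_nearest(train, x) == y for x, y in data) / len(data)


rng = random.Random(0)
train_set = make_data(100, rng)
test_set = make_data(200, rng)

train_accuracy = accuracy(train_set, train_set)
test_accuracy = accuracy(train_set, test_set)
print(f"training accuracy: {train_accuracy:.2f}")
print(f"test accuracy:     {test_accuracy:.2f}")
```

The model scores perfectly on the training set because every training point is its own nearest neighbour, noisy labels included. On unseen data those memorized noisy labels become wrong predictions, which is the overfitting pattern to look for when you compare training and test accuracy.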
This is our first example of a commonly applied kind of neural network called a convolutional neural network (CNN), also applied to handwriting classification.
The notebook provides a good description of this kind of network which gradually reduces the size of the input using convolutional layers. This forces the network to learn information about multiple input variables and provides good accuracy.
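To make the size reduction concrete, here is a minimal pure-Python sketch of one convolution followed by max pooling. It is not taken from the notebook: the function names, the 6x6 toy image and the edge-detecting kernel are assumptions chosen for illustration, and the "convolution" is the unflipped sliding-window product that deep learning frameworks commonly use.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding ("valid" mode)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size block."""
    return [[max(feature_map[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in range(0, len(feature_map[0]) - size + 1, size)]
            for r in range(0, len(feature_map) - size + 1, size)]


# Hypothetical 6x6 "image": dark on the left, bright on the right.
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
# A kernel that responds to a dark-to-bright change from left to right.
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

features = conv2d_valid(image, kernel)  # 6x6 input -> 4x4 feature map
pooled = max_pool(features)             # 4x4 feature map -> 2x2

print(len(features), len(features[0]))
print(len(pooled), len(pooled[0]))
```

Each output value summarizes a 3x3 patch of the input, so the network combines information from several neighbouring pixels while the representation shrinks from 6x6 to 4x4 and then to 2x2.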
Run the notebook and observe that this simple CNN greatly outperforms kmeans and random forests for handwriting recognition.
This notebook is an example of a different problem, called regression. Regression attempts to predict a value, unlike classification which determines which class an input example belongs to. This is a simple example that attempts to fit a line to best match a set of data points with x and y values.
Note: you may need to add brackets to the print statements, depending on your version of Python and the version of this example notebook.
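Fitting a line to data points can also be done directly with the closed-form least-squares formulas. The sketch below uses that approach as a stand-in; the notebook may fit the line differently (for example by iterative training). The function name and the data points are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying exactly on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)

# Regression predicts a value for a new input rather than a class.
prediction = slope * 6.0 + intercept
print(prediction)
```

With noisy data the points no longer lie exactly on a line, and the same formulas return the line that minimizes the sum of squared vertical distances to the points.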
An autoencoder trains two separate networks, an encoder and a decoder. The intermediate "latent representation" that passes between them has many applications. These include compression (by storing just the latent representation), translation (by training an encoder on multiple languages and then training a different decoder for each language), and modifying the style of the input (by similarly combining different encoders or decoders).
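The role of the latent representation can be shown without any training at all. In the sketch below the encoder and decoder weights are picked by hand for data that lies on the line y = 2x, which is an assumption made purely for illustration; a real autoencoder learns both networks from data, and none of these names come from the notebook.

```python
def encode(point):
    """Compress a 2D point to a single latent number."""
    x, y = point
    return (x + 2.0 * y) / 5.0


def decode(latent):
    """Rebuild a 2D point from the latent number."""
    return (latent, 2.0 * latent)


# A point on the line y = 2x is stored as one number and rebuilt exactly.
on_line = (3.0, 6.0)
latent = encode(on_line)
print(latent, decode(latent))

# A point off that line cannot be rebuilt exactly: the compression is lossy.
off_line = (1.0, 0.0)
print(decode(encode(off_line)))
```

Storing only the latent value halves the storage for this toy data, which is the compression use case; swapping in a different decoder for the same latent value is the idea behind the translation and style examples.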
A generative adversarial network (GAN) can be used to generate new data that looks similar to the training data. This kind of network is how so-called "deepfakes" are made that alter the style of images or video.
The basic idea is to train two opposing networks. The generator tries to generate examples and the discriminator tries to determine whether examples are real or fake. Both networks learn and improve together, which greatly improves performance over using a single network for either purpose.
A recurrent neural network (RNN) is often applied to text processing or other problems that involve a sequence of events or letters. This type of network retains information at each step and thus has a kind of memory: it learns both from the training examples and from the parts of a specific input that it has already processed.
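The memory described above comes from feeding the previous hidden state back into each step. The following sketch uses a single hidden unit with fixed, untrained weights; the weight values and function names are assumptions chosen for illustration and do not come from the notebook.

```python
import math


def rnn_step(x, h, w_x=0.5, w_h=0.8, b=0.0):
    """One recurrent step: mix the current input with the previous state."""
    return math.tanh(w_x * x + w_h * h + b)


def run_rnn(sequence, h0=0.0):
    """Process a sequence one element at a time, carrying the state along."""
    h = h0
    states = []
    for x in sequence:
        h = rnn_step(x, h)
        states.append(h)
    return states


# Two sequences that end with the same inputs but start differently.
states_a = run_rnn([1.0, 0.0, 0.0])
states_b = run_rnn([0.0, 0.0, 0.0])

print(states_a[-1])
print(states_b[-1])
```

Both sequences end with the same input, yet the final states differ because the first sequence still carries a trace of its earlier input. In a trained network the weights are learned so that this carried-over state holds whatever is useful for the task.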
4. Pytorch
There are a variety of example notebooks for the Pytorch framework at https://pytorch.org/tutorials/ . For example, there is a series of three notebooks on text processing and translation using recurrent neural networks:
1. classifying names by country of origin
2. generating names similar to those from those countries of origin
3. text translation using seq2seq latent encoding of text with different decoders
5. Tensor Processing Units
Google has developed dedicated deep learning processors called tensor processing units (TPUs). The competition website Kaggle has an online notebook where you can try out these TPUs to classify images of flowers by species.
Try out the notebook and compare the performance of CPU, GPU, and TPU computing. You will need to modify the parameters of the notebook heavily to see any results at all with CPUs, while you will see an obvious speed improvement from using the TPUs over the GPUs.