<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en-CA">
	<id>https://docs.deepsense.ca/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Cwhidden</id>
	<title>DeepSense Docs - User contributions [en-ca]</title>
	<link rel="self" type="application/atom+xml" href="https://docs.deepsense.ca/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Cwhidden"/>
	<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Special:Contributions/Cwhidden"/>
	<updated>2026-06-06T21:08:22Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.31.1</generator>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Deep_Learning_Tutorials&amp;diff=170</id>
		<title>Deep Learning Tutorials</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Deep_Learning_Tutorials&amp;diff=170"/>
		<updated>2020-07-06T13:21:28Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;div class=&amp;quot;noautonum&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This is a collection of deep learning tutorials that were used as part of the training for the 2019 DeepSense fellowships. They were covered over multiple sessions, with 1-3 notebooks per 1-1.5 hour session. Both TensorFlow and PyTorch are covered.&lt;br /&gt;
&lt;br /&gt;
== 1. Get started with DeepSense ==&lt;br /&gt;
&lt;br /&gt;
Follow all the steps from [[Getting started]] and [[Getting started with Deep Learning]]. This tutorial assumes you can log on to the DeepSense compute platform and have a version of Anaconda python on your path with Tensorflow and Pytorch installed in an anaconda environment.&lt;br /&gt;
&lt;br /&gt;
== 2. Tensorflow Preparation ==&lt;br /&gt;
&lt;br /&gt;
=== Download the example notebooks ===&lt;br /&gt;
&lt;br /&gt;
 git clone https://github.com/aymericdamien/TensorFlow-Examples.git&lt;br /&gt;
&lt;br /&gt;
=== request a gpu session ===&lt;br /&gt;
&lt;br /&gt;
 bsub -Is -q gpu bash&lt;br /&gt;
&lt;br /&gt;
=== activate your anaconda environment ===&lt;br /&gt;
&lt;br /&gt;
 conda activate tensorflow&lt;br /&gt;
&lt;br /&gt;
Note: this assumes you&amp;#039;ve followed the getting started instructions and have created a python environment called &amp;lt;code&amp;gt;tensorflow&amp;lt;/code&amp;gt; with the &amp;lt;code&amp;gt;tensorflow-gpu&amp;lt;/code&amp;gt; package installed from the IBM-AI repository. If not then please follow those instructions.&lt;br /&gt;
&lt;br /&gt;
=== start a jupyter notebook ===&lt;br /&gt;
 jupyter notebook --no-browser --ip=0.0.0.0&lt;br /&gt;
&lt;br /&gt;
=== open an SSH tunnel to access the notebook ===&lt;br /&gt;
Just as we did in the [[Getting started with Deep Learning]] tutorial, open an SSH tunnel in another window:&lt;br /&gt;
 ssh -l &amp;lt;user&amp;gt; login1.deepsense.ca -L &amp;lt;port&amp;gt;:ds-cmgpu-&amp;lt;num&amp;gt;:&amp;lt;port&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== open the notebook in your browser ===&lt;br /&gt;
In a web browser, navigate to the page listed in the jupyter notebook output. Remember to replace the node name with &amp;lt;code&amp;gt;localhost&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== 3. Tensorflow Example Notebooks ==&lt;br /&gt;
&lt;br /&gt;
=== 1. helloworld.ipynb ===&lt;br /&gt;
&lt;br /&gt;
 TensorFlow-Examples/notebooks/1_Introduction/helloworld.ipynb&lt;br /&gt;
&lt;br /&gt;
This is an introduction to a python jupyter notebook.&lt;br /&gt;
&lt;br /&gt;
Select a box with code. Press the shift and enter keys together to run the code in that box. You&amp;#039;ll notice a star appear beside running code and a number in brackets appear beside finished code to indicate the order in which code boxes were run.&lt;br /&gt;
&lt;br /&gt;
Often in an example notebook you will see code that already has output cached. You still need to run all previous code boxes and may want to use the menu to clear all output. If there is an error then you can modify the code or fix the error (e.g. download a dependency) and try again.&lt;br /&gt;
&lt;br /&gt;
If you are missing a required dependency in later notebooks then you can install that package into your anaconda environment in a terminal window and it will be immediately accessible from the notebook. You do not need to close and restart the notebook or SSH tunnel.&lt;br /&gt;
&lt;br /&gt;
You can also run the entire notebook using the menu.&lt;br /&gt;
&lt;br /&gt;
When you are finished with a notebook you should use the menu to halt the kernel before closing the notebook. This clears resources such as GPU memory.&lt;br /&gt;
&lt;br /&gt;
=== 2. kmeans.ipynb ===&lt;br /&gt;
&lt;br /&gt;
 TensorFlow-Examples/notebooks/2_BasicModels/kmeans.ipynb&lt;br /&gt;
&lt;br /&gt;
This is an introduction to a basic machine learning model, kmeans. In this tutorial the kmeans algorithm is used to classify handwritten digits.&lt;br /&gt;
&lt;br /&gt;
kmeans works by clustering different training examples and comparing each new input to the mean of each cluster.&lt;br /&gt;
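&lt;br /&gt;
As a rough illustration of that idea, here is a minimal pure-python sketch (not the TensorFlow code used in the notebook). It clusters a handful of made-up 2D points and then assigns a new input to the cluster with the nearest mean. The function names, the toy data and the fixed seed are all assumptions made for this example.&lt;br /&gt;
&lt;br /&gt;
```python
import random


def squared_distance(a, b):
    # Squared Euclidean distance between two points of equal dimension.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    # Index of the centroid (cluster mean) closest to the given point.
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def mean(points):
    # Coordinate-wise mean of a non-empty list of points.
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def kmeans(points, k, num_steps=10, seed=0):
    # Start from k randomly chosen points, then alternate between
    # assigning points to their nearest mean and recomputing the means.
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(num_steps):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids


# Hypothetical toy data: two well separated groups of points.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
          (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centroids = kmeans(points, k=2)

# A new input is compared to the mean of each cluster.
new_cluster = nearest((4.8, 5.2), centroids)
print(centroids, new_cluster)
```
&lt;br /&gt;
The notebook goes one step further and gives each cluster a digit label based on the training examples assigned to it, so the nearest cluster also determines the predicted class.&lt;br /&gt;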
&lt;br /&gt;
Run the notebook and learn more about kmeans.&lt;br /&gt;
&lt;br /&gt;
Observe that machine learning methods often have parameters that must be chosen. You can use default parameters but optimizing these parameters can greatly improve the accuracy of a model.&lt;br /&gt;
&lt;br /&gt;
For a simple example, try increasing the length of training by changing the &amp;lt;code&amp;gt;num_steps&amp;lt;/code&amp;gt; variable from 50 to 100 and running it again.&lt;br /&gt;
&lt;br /&gt;
What happens if you modify other parameters and run the notebook again? How do these parameters change the training time and accuracy of the model?&lt;br /&gt;
&lt;br /&gt;
Note: depending on your version of tensorflow you may need to modify the notebook. In some versions of tensorflow the kmeans.training_graph() function returns a different number of variables than the notebook expects (such as the cluster_centers_vars variable which may need to be removed from the code).&lt;br /&gt;
&lt;br /&gt;
=== 3. random_forest.ipynb ===&lt;br /&gt;
&lt;br /&gt;
 TensorFlow-Examples/notebooks/2_BasicModels/random_forest.ipynb&lt;br /&gt;
&lt;br /&gt;
This notebook uses a different model, random forests, to classify the handwritten digits. A random forest is a set of decision trees, each of which is trained to learn part of the problem.&lt;br /&gt;
&lt;br /&gt;
Random forests are one of the most commonly used machine learning models because they are quick to train and give good performance on many tasks.&lt;br /&gt;
&lt;br /&gt;
Run the notebook and observe that the random forest provides better accuracy than the kmeans model you trained previously.&lt;br /&gt;
&lt;br /&gt;
Try changing parameters such as increasing the number of decision trees or the training time.&lt;br /&gt;
&lt;br /&gt;
If you increase the number of trees or training time by too much then you will see that you achieve worse performance on the test set than on the training set. This is called overfitting and means that your model is learning specific details from the training set that do not generalize to the test set. This is a common problem in machine learning and needs to be considered whenever you train a model. You may need to simplify your model, give it more training examples by collecting more data or use data augmentation.&lt;br /&gt;
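&lt;br /&gt;
The gap between training and test accuracy can be made concrete with a toy example. This is a minimal pure-python sketch under stated assumptions, not the random forest from the notebook: the data, the two flipped labels and both model functions are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
def true_label(x):
    # Hypothetical ground truth: class 1 for the upper half of the range 0..10.
    return int(x // 5)


# Training set with two deliberately flipped (noisy) labels at x = 2 and x = 7.
train = [(float(x), true_label(x)) for x in range(10)]
train[2] = (2.0, 1)
train[7] = (7.0, 0)

# Held-out test set that follows the same rule, without label noise.
test = [(x + 0.4, true_label(x + 0.4)) for x in range(10)]


def memorizing_model(x):
    # 1-nearest-neighbour: copies the label of the closest training point,
    # including the noisy ones, so it fits the training set perfectly.
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def simple_model(x):
    # A much simpler rule that ignores the individual training points.
    return int(x // 5)


def accuracy(model, data):
    # Fraction of examples for which the model predicts the stored label.
    return sum(model(x) == y for x, y in data) / len(data)


print("memorizing:", accuracy(memorizing_model, train), accuracy(memorizing_model, test))
print("simple:    ", accuracy(simple_model, train), accuracy(simple_model, test))
```
&lt;br /&gt;
Here the memorizing model scores 1.0 on the training set but only 0.8 on the test set, while the simpler rule scores 0.8 on the noisy training set and 1.0 on the test set.&lt;br /&gt;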
&lt;br /&gt;
=== 4. convolutional_network.ipynb ===&lt;br /&gt;
&lt;br /&gt;
 TensorFlow-Examples/notebooks/3_NeuralNetworks/convolutional_network.ipynb&lt;br /&gt;
&lt;br /&gt;
This is our first example of a commonly applied kind of neural network called a convolutional neural network (CNN), also applied to handwriting classification.&lt;br /&gt;
&lt;br /&gt;
The notebook provides a good description of this kind of network which gradually reduces the size of the input using convolutional layers. This forces the network to learn information about multiple input variables and provides good accuracy.&lt;br /&gt;
&lt;br /&gt;
Run the notebook and observe that this simple CNN greatly outperforms kmeans and random forests for handwriting recognition.&lt;br /&gt;
&lt;br /&gt;
=== 5. linear_regression.ipynb ===&lt;br /&gt;
&lt;br /&gt;
 notebooks/TensorFlow-Examples/notebooks/2_BasicModels/linear_regression.ipynb&lt;br /&gt;
&lt;br /&gt;
This notebook is an example of a different problem, called regression. Regression attempts to predict a value, unlike classification which determines which class an input example belongs to. This is a simple example that attempts to fit a line to best match a set of data points with x and y values.&lt;br /&gt;
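&lt;br /&gt;
To show what fitting a line means in practice, here is a minimal pure-python sketch that uses gradient descent on the mean squared error. It is not the TensorFlow code from the notebook; the function name, the learning rate, the number of steps and the toy data are assumptions chosen for this example.&lt;br /&gt;
&lt;br /&gt;
```python
def fit_line(xs, ys, learning_rate=0.01, num_steps=5000):
    # Fit y = w * x + b by gradient descent on the mean squared error.
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(num_steps):
        # Prediction error for every data point with the current line.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical data points that lie exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_line(xs, ys)
print(w, b)
```
&lt;br /&gt;
With this noise-free data the fitted slope and intercept approach 2 and 1. With real, noisy data the line instead settles where the average squared error is smallest.&lt;br /&gt;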
&lt;br /&gt;
Note: you may need to add brackets to the print statements, depending on your version of python and the version of this example notebook.&lt;br /&gt;
&lt;br /&gt;
=== 6. autoencoder.ipynb ===&lt;br /&gt;
&lt;br /&gt;
 notebooks/TensorFlow-Examples/notebooks/3_NeuralNetworks/autoencoder.ipynb&lt;br /&gt;
&lt;br /&gt;
An autoencoder trains two separate networks, an encoder and a decoder. The intermediate &amp;quot;latent representation&amp;quot; between them has many applications. These include compression (by storing just the latent representation), translation (by training an encoder with multiple different languages and then training different decoders for each language), and modifying the style of the input (by similarly leveraging different encoder or decoder frameworks).&lt;br /&gt;
&lt;br /&gt;
=== 7. gan.ipynb ===&lt;br /&gt;
&lt;br /&gt;
 TensorFlow-Examples/notebooks/3_NeuralNetworks/gan.ipynb&lt;br /&gt;
&lt;br /&gt;
A generative adversarial network (GAN) can be used to generate new data that looks similar to training data. This kind of network is how so-called &amp;quot;deepfakes&amp;quot;, which alter the style of images or video, are made.&lt;br /&gt;
&lt;br /&gt;
The basic idea is to train two opposing networks. The generator tries to generate examples and the discriminator tries to determine if examples are real or fake. Both networks learn and improve together, which greatly improves performance over using a single network for either purpose.&lt;br /&gt;
&lt;br /&gt;
=== 8. recurrent_network.ipynb ===&lt;br /&gt;
		&lt;br /&gt;
 TensorFlow-Examples/notebooks/3_NeuralNetworks/recurrent_network.ipynb&lt;br /&gt;
&lt;br /&gt;
A recurrent neural network is often applied to text processing or other problems that consider a sequence of events or letters. This type of network retains information at each step and thus has a type of memory that learns both from training examples as well as from parts of a specific input instance that have already been processed.&lt;br /&gt;
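&lt;br /&gt;
The memory described above comes from feeding the previous hidden state back into the network at every step. The following is a minimal pure-python sketch of that recurrence with a single hidden unit. It is not the network from the notebook, and the weights and input sequences are hypothetical values chosen for illustration; in a real network the weights are learned from training examples.&lt;br /&gt;
&lt;br /&gt;
```python
import math


def rnn_step(hidden, x, w_hidden, w_input):
    # The new hidden state depends on the current input and the previous state.
    return math.tanh(w_hidden * hidden + w_input * x)


def run_rnn(sequence, w_hidden=0.5, w_input=1.0):
    # Process a sequence one element at a time, carrying the state forward.
    hidden = 0.0
    states = []
    for x in sequence:
        hidden = rnn_step(hidden, x, w_hidden, w_input)
        states.append(hidden)
    return states


# Two sequences that differ only in their first element.
states_a = run_rnn([1.0, 0.0, 0.0])
states_b = run_rnn([0.0, 0.0, 0.0])
print(states_a)
print(states_b)
```
&lt;br /&gt;
Although the last two inputs are identical in both sequences, the final hidden states differ, because the first sequence still carries a fading trace of its first input.&lt;br /&gt;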
&lt;br /&gt;
== 4. Pytorch ==&lt;br /&gt;
&lt;br /&gt;
There are a variety of example notebooks for the pytorch framework at https://pytorch.org/tutorials/ . For example, there is a series of three notebooks on text processing and translation using recurrent neural networks:&lt;br /&gt;
&lt;br /&gt;
=== 1. classifying names by country of origin ===&lt;br /&gt;
 &lt;br /&gt;
 https://pytorch.org/tutorials/intermediate/char_rnn_classification_tutorial.html&lt;br /&gt;
&lt;br /&gt;
=== 2. generating names similar to those from those countries of origin ===&lt;br /&gt;
 &lt;br /&gt;
 https://pytorch.org/tutorials/intermediate/char_rnn_generation_tutorial.html&lt;br /&gt;
&lt;br /&gt;
=== 3. text translation using seq2seq latent encoding of text with different decoders ===&lt;br /&gt;
&lt;br /&gt;
 https://pytorch.org/tutorials/intermediate/seq2seq_translation_tutorial.html&lt;br /&gt;
&lt;br /&gt;
== 5. Tensor Processing Units ==&lt;br /&gt;
&lt;br /&gt;
Google has developed dedicated deep learning processors called tensor processing units (TPUs). The competition website Kaggle has an online notebook where you can try out these TPUs to classify images of flowers by species.&lt;br /&gt;
&lt;br /&gt;
 https://www.kaggle.com/c/flower-classification-with-tpus&lt;br /&gt;
&lt;br /&gt;
Try out the notebook and compare the performance of CPU, GPU, and TPU computing. You will need to modify the parameters of the notebook heavily to see any results at all with CPUs, while you will see an obvious speed improvement from using the TPUs over the GPUs.&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Deep_Learning_Tutorials&amp;diff=169</id>
		<title>Deep Learning Tutorials</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Deep_Learning_Tutorials&amp;diff=169"/>
		<updated>2020-07-02T22:34:12Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;div class=&amp;quot;noautonum&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 1. Get started with DeepSense ==&lt;br /&gt;
&lt;br /&gt;
Follow all the steps from [[Getting started]] and [[Getting started with Deep Learning]]. This tutorial assumes you can log on to the DeepSense compute platform and have a version of Anaconda python on your path with Tensorflow and Pytorch installed in an anaconda environment.&lt;br /&gt;
&lt;br /&gt;
== 2. Tensorflow Preparation ==&lt;br /&gt;
&lt;br /&gt;
=== Download the example notebooks ===&lt;br /&gt;
&lt;br /&gt;
 git clone https://github.com/aymericdamien/TensorFlow-Examples.git&lt;br /&gt;
&lt;br /&gt;
=== request a gpu session ===&lt;br /&gt;
&lt;br /&gt;
 bsub -Is -q gpu bash&lt;br /&gt;
&lt;br /&gt;
=== activate your anaconda environment ===&lt;br /&gt;
&lt;br /&gt;
 conda activate tensorflow&lt;br /&gt;
&lt;br /&gt;
Note: this assumes you&amp;#039;ve followed the getting started instructions and have created a python environment called &amp;lt;code&amp;gt;tensorflow&amp;lt;/code&amp;gt; with the &amp;lt;code&amp;gt;tensorflow-gpu&amp;lt;/code&amp;gt; package installed from the IBM-AI repository. If not then please follow those instructions.&lt;br /&gt;
&lt;br /&gt;
=== start a jupyter notebook ===&lt;br /&gt;
 jupyter notebook --no-browser --ip=0.0.0.0&lt;br /&gt;
&lt;br /&gt;
=== open an SSH tunnel to access the notebook ===&lt;br /&gt;
Just as we did in the [[Getting started with Deep Learning]] tutorial, open an SSH tunnel in another window:&lt;br /&gt;
 ssh -l &amp;lt;user&amp;gt; login1.deepsense.ca -L &amp;lt;port&amp;gt;:ds-cmgpu-&amp;lt;num&amp;gt;:&amp;lt;port&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== open the notebook in your browser ===&lt;br /&gt;
In a web browser, navigate to the page listed in the jupyter notebook output. Remember to replace the node name with &amp;lt;code&amp;gt;localhost&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== 3. Tensorflow Example Notebooks ==&lt;br /&gt;
&lt;br /&gt;
=== 1. helloworld.ipynb ===&lt;br /&gt;
&lt;br /&gt;
 TensorFlow-Examples/notebooks/1_Introduction/helloworld.ipynb&lt;br /&gt;
&lt;br /&gt;
This is an introduction to a python jupyter notebook.&lt;br /&gt;
&lt;br /&gt;
Select a box with code. Press the shift and enter keys together to run the code in that box. You&amp;#039;ll notice a star appear beside running code and a number in brackets appear beside finished code to indicate the order in which code boxes were run.&lt;br /&gt;
&lt;br /&gt;
Often in an example notebook you will see code that already has output cached. You still need to run all previous code boxes and may want to use the menu to clear all output. If there is an error then you can modify the code or fix the error (e.g. download a dependency) and try again.&lt;br /&gt;
&lt;br /&gt;
If you are missing a required dependency in later notebooks then you can install that package into your anaconda environment in a terminal window and it will be immediately accessible from the notebook. You do not need to close and restart the notebook or SSH tunnel.&lt;br /&gt;
&lt;br /&gt;
You can also run the entire notebook using the menu.&lt;br /&gt;
&lt;br /&gt;
When you are finished with a notebook you should use the menu to halt the kernel before closing the notebook. This clears resources such as GPU memory.&lt;br /&gt;
&lt;br /&gt;
=== 2. kmeans.ipynb ===&lt;br /&gt;
&lt;br /&gt;
 TensorFlow-Examples/notebooks/2_BasicModels/kmeans.ipynb&lt;br /&gt;
&lt;br /&gt;
This is an introduction to a basic machine learning model, kmeans. In this tutorial the kmeans algorithm is used to classify handwritten digits.&lt;br /&gt;
&lt;br /&gt;
kmeans works by clustering different training examples and comparing each new input to the mean of each cluster.&lt;br /&gt;
&lt;br /&gt;
Run the notebook and learn more about kmeans.&lt;br /&gt;
&lt;br /&gt;
Observe that machine learning methods often have parameters that must be chosen. You can use default parameters but optimizing these parameters can greatly improve the accuracy of a model.&lt;br /&gt;
&lt;br /&gt;
For a simple example, try increasing the length of training by changing the &amp;lt;code&amp;gt;num_steps&amp;lt;/code&amp;gt; variable from 50 to 100 and running it again.&lt;br /&gt;
&lt;br /&gt;
What happens if you modify other parameters and run the notebook again? How do these parameters change the training time and accuracy of the model?&lt;br /&gt;
&lt;br /&gt;
Note: depending on your version of tensorflow you may need to modify the notebook. In some versions of tensorflow the kmeans.training_graph() function returns a different number of variables than the notebook expects (such as the cluster_centers_vars variable which may need to be removed from the code).&lt;br /&gt;
&lt;br /&gt;
=== 3. random_forest.ipynb ===&lt;br /&gt;
&lt;br /&gt;
 TensorFlow-Examples/notebooks/2_BasicModels/random_forest.ipynb&lt;br /&gt;
&lt;br /&gt;
This notebook uses a different model, random forests, to classify the handwritten digits. A random forest is a set of decision trees, each of which is trained to learn part of the problem.&lt;br /&gt;
&lt;br /&gt;
Random forests are one of the most commonly used machine learning models because they are quick to train and give good performance on many tasks.&lt;br /&gt;
&lt;br /&gt;
Run the notebook and observe that the random forest provides better accuracy than the kmeans model you trained previously.&lt;br /&gt;
&lt;br /&gt;
Try changing parameters such as increasing the number of decision trees or the training time.&lt;br /&gt;
&lt;br /&gt;
If you increase the number of trees or training time by too much then you will see that you achieve worse performance on the test set than on the training set. This is called overfitting and means that your model is learning specific details from the training set that do not generalize to the test set. This is a common problem in machine learning and needs to be considered whenever you train a model. You may need to simplify your model, give it more training examples by collecting more data or use data augmentation.&lt;br /&gt;
&lt;br /&gt;
=== 4. convolutional_network.ipynb ===&lt;br /&gt;
&lt;br /&gt;
 TensorFlow-Examples/notebooks/3_NeuralNetworks/convolutional_network.ipynb&lt;br /&gt;
&lt;br /&gt;
This is our first example of a commonly applied kind of neural network called a convolutional neural network (CNN), also applied to handwriting classification.&lt;br /&gt;
&lt;br /&gt;
The notebook provides a good description of this kind of network which gradually reduces the size of the input using convolutional layers. This forces the network to learn information about multiple input variables and provides good accuracy.&lt;br /&gt;
&lt;br /&gt;
Run the notebook and observe that this simple CNN greatly outperforms kmeans and random forests for handwriting recognition.&lt;br /&gt;
&lt;br /&gt;
=== 5. linear_regression.ipynb ===&lt;br /&gt;
&lt;br /&gt;
 notebooks/TensorFlow-Examples/notebooks/2_BasicModels/linear_regression.ipynb&lt;br /&gt;
&lt;br /&gt;
This notebook is an example of a different problem, called regression. Regression attempts to predict a value, unlike classification which determines which class an input example belongs to. This is a simple example that attempts to fit a line to best match a set of data points with x and y values.&lt;br /&gt;
&lt;br /&gt;
Note: you may need to add brackets to the print statements, depending on your version of python and the version of this example notebook.&lt;br /&gt;
&lt;br /&gt;
=== 6. autoencoder.ipynb ===&lt;br /&gt;
&lt;br /&gt;
 notebooks/TensorFlow-Examples/notebooks/3_NeuralNetworks/autoencoder.ipynb&lt;br /&gt;
&lt;br /&gt;
An autoencoder trains two separate networks, an encoder and a decoder. The intermediate &amp;quot;latent representation&amp;quot; between them has many applications. These include compression (by storing just the latent representation), translation (by training an encoder with multiple different languages and then training different decoders for each language), and modifying the style of the input (by similarly leveraging different encoder or decoder frameworks).&lt;br /&gt;
&lt;br /&gt;
=== 7. gan.ipynb ===&lt;br /&gt;
&lt;br /&gt;
 TensorFlow-Examples/notebooks/3_NeuralNetworks/gan.ipynb&lt;br /&gt;
&lt;br /&gt;
A generative adversarial network (GAN) can be used to generate new data that looks similar to training data. This kind of network is how so-called &amp;quot;deepfakes&amp;quot;, which alter the style of images or video, are made.&lt;br /&gt;
&lt;br /&gt;
The basic idea is to train two opposing networks. The generator tries to generate examples and the discriminator tries to determine if examples are real or fake. Both networks learn and improve together, which greatly improves performance over using a single network for either purpose.&lt;br /&gt;
&lt;br /&gt;
=== 8. recurrent_network.ipynb ===&lt;br /&gt;
		&lt;br /&gt;
 TensorFlow-Examples/notebooks/3_NeuralNetworks/recurrent_network.ipynb&lt;br /&gt;
&lt;br /&gt;
A recurrent neural network is often applied to text processing or other problems that consider a sequence of events or letters. This type of network retains information at each step and thus has a type of memory that learns both from training examples as well as from parts of a specific input instance that have already been processed.&lt;br /&gt;
&lt;br /&gt;
== 4. Pytorch ==&lt;br /&gt;
&lt;br /&gt;
There are a variety of example notebooks for the pytorch framework at https://pytorch.org/tutorials/ . For example, there is a series of three notebooks on text processing and translation using recurrent neural networks:&lt;br /&gt;
&lt;br /&gt;
=== 1. classifying names by country of origin ===&lt;br /&gt;
 &lt;br /&gt;
 https://pytorch.org/tutorials/intermediate/char_rnn_classification_tutorial.html&lt;br /&gt;
&lt;br /&gt;
=== 2. generating names similar to those from those countries of origin ===&lt;br /&gt;
 &lt;br /&gt;
 https://pytorch.org/tutorials/intermediate/char_rnn_generation_tutorial.html&lt;br /&gt;
&lt;br /&gt;
=== 3. text translation using seq2seq latent encoding of text with different decoders ===&lt;br /&gt;
&lt;br /&gt;
 https://pytorch.org/tutorials/intermediate/seq2seq_translation_tutorial.html&lt;br /&gt;
&lt;br /&gt;
== 5. Tensor Processing Units ==&lt;br /&gt;
&lt;br /&gt;
Google has developed dedicated deep learning processors called tensor processing units (TPUs). The competition website Kaggle has an online notebook where you can try out these TPUs to classify images of flowers by species.&lt;br /&gt;
&lt;br /&gt;
 https://www.kaggle.com/c/flower-classification-with-tpus&lt;br /&gt;
&lt;br /&gt;
Try out the notebook and compare the performance of CPU, GPU, and TPU computing. You will need to modify the parameters of the notebook heavily to see any results at all with CPUs, while you will see an obvious speed improvement from using the TPUs over the GPUs.&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Deep_Learning_Tutorials&amp;diff=168</id>
		<title>Deep Learning Tutorials</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Deep_Learning_Tutorials&amp;diff=168"/>
		<updated>2020-07-02T22:26:03Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;div class=&amp;quot;noautonum&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 1. Get started with DeepSense ==&lt;br /&gt;
&lt;br /&gt;
Follow all the steps from [[Getting started]] and [[Getting started with Deep Learning]]. This tutorial assumes you can log on to the DeepSense compute platform and have a version of Anaconda python on your path with Tensorflow and Pytorch installed in an anaconda environment.&lt;br /&gt;
&lt;br /&gt;
== 2. Tensorflow Preparation ==&lt;br /&gt;
&lt;br /&gt;
=== Download the example notebooks ===&lt;br /&gt;
&lt;br /&gt;
 git clone https://github.com/aymericdamien/TensorFlow-Examples.git&lt;br /&gt;
&lt;br /&gt;
=== request a gpu session ===&lt;br /&gt;
&lt;br /&gt;
 bsub -Is -q gpu bash&lt;br /&gt;
&lt;br /&gt;
=== activate your anaconda environment ===&lt;br /&gt;
&lt;br /&gt;
 conda activate tensorflow&lt;br /&gt;
&lt;br /&gt;
Note: this assumes you&amp;#039;ve followed the getting started instructions and have created a python environment called &amp;lt;code&amp;gt;tensorflow&amp;lt;/code&amp;gt; with the &amp;lt;code&amp;gt;tensorflow-gpu&amp;lt;/code&amp;gt; package installed from the IBM-AI repository. If not then please follow those instructions.&lt;br /&gt;
&lt;br /&gt;
=== start a jupyter notebook ===&lt;br /&gt;
 jupyter notebook --no-browser --ip=0.0.0.0&lt;br /&gt;
&lt;br /&gt;
=== open an SSH tunnel to access the notebook ===&lt;br /&gt;
Just as we did in the [[Getting started with Deep Learning]] tutorial, open an SSH tunnel in another window:&lt;br /&gt;
 ssh -l &amp;lt;user&amp;gt; login1.deepsense.ca -L &amp;lt;port&amp;gt;:ds-cmgpu-&amp;lt;num&amp;gt;:&amp;lt;port&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== open the notebook in your browser ===&lt;br /&gt;
In a web browser, navigate to the page listed in the jupyter notebook output. Remember to replace the node name with &amp;lt;code&amp;gt;localhost&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== 3. Tensorflow Example Notebooks ==&lt;br /&gt;
&lt;br /&gt;
=== 1. helloworld.ipynb ===&lt;br /&gt;
&lt;br /&gt;
 TensorFlow-Examples/notebooks/1_Introduction/helloworld.ipynb&lt;br /&gt;
&lt;br /&gt;
This is an introduction to a python jupyter notebook.&lt;br /&gt;
&lt;br /&gt;
Select a box with code. Press the shift and enter keys together to run the code in that box. You&amp;#039;ll notice a star appear beside running code and a number in brackets appear beside finished code to indicate the order in which code boxes were run.&lt;br /&gt;
&lt;br /&gt;
Often in an example notebook you will see code that already has output cached. You still need to run all previous code boxes and may want to use the menu to clear all output. If there is an error then you can modify the code or fix the error (e.g. download a dependency) and try again.&lt;br /&gt;
&lt;br /&gt;
If you are missing a required dependency in later notebooks then you can install that package into your anaconda environment in a terminal window and it will be immediately accessible from the notebook. You do not need to close and restart the notebook or SSH tunnel.&lt;br /&gt;
&lt;br /&gt;
You can also run the entire notebook using the menu.&lt;br /&gt;
&lt;br /&gt;
When you are finished with a notebook you should use the menu to halt the kernel before closing the notebook. This clears resources such as GPU memory.&lt;br /&gt;
&lt;br /&gt;
=== 2. kmeans.ipynb ===&lt;br /&gt;
&lt;br /&gt;
 TensorFlow-Examples/notebooks/2_BasicModels/kmeans.ipynb&lt;br /&gt;
&lt;br /&gt;
This is an introduction to a basic machine learning model, kmeans. In this tutorial the kmeans algorithm is used to classify handwritten digits.&lt;br /&gt;
&lt;br /&gt;
kmeans works by clustering different training examples and comparing each new input to the mean of each cluster.&lt;br /&gt;
&lt;br /&gt;
Run the notebook and learn more about kmeans.&lt;br /&gt;
&lt;br /&gt;
Observe that machine learning methods often have parameters that must be chosen. You can use default parameters but optimizing these parameters can greatly improve the accuracy of a model.&lt;br /&gt;
&lt;br /&gt;
For a simple example, try increasing the length of training by changing the &amp;lt;code&amp;gt;num_steps&amp;lt;/code&amp;gt; variable from 50 to 100 and running it again.&lt;br /&gt;
&lt;br /&gt;
What happens if you modify other parameters and run the notebook again? How do these parameters change the training time and accuracy of the model?&lt;br /&gt;
&lt;br /&gt;
Note: depending on your version of tensorflow you may need to modify the notebook. In some versions of tensorflow the kmeans.training_graph() function returns a different number of variables than the notebook expects (such as the cluster_centers_vars variable which may need to be removed from the code).&lt;br /&gt;
&lt;br /&gt;
=== 3. random_forest.ipynb ===&lt;br /&gt;
&lt;br /&gt;
 TensorFlow-Examples/notebooks/2_BasicModels/random_forest.ipynb&lt;br /&gt;
&lt;br /&gt;
This notebook uses a different model, random forests, to classify the handwritten digits. A random forest is a set of decision trees, each of which is trained to learn part of the problem.&lt;br /&gt;
&lt;br /&gt;
Random forests are one of the most commonly used machine learning models because they are quick to train and give good performance on many tasks.&lt;br /&gt;
&lt;br /&gt;
Run the notebook and observe that the random forest provides better accuracy than the kmeans model you trained previously.&lt;br /&gt;
&lt;br /&gt;
Try changing parameters such as increasing the number of decision trees or the training time.&lt;br /&gt;
&lt;br /&gt;
If you increase the number of trees or training time by too much then you will see that you achieve worse performance on the test set than on the training set. This is called overfitting and means that your model is learning specific details from the training set that do not generalize to the test set. This is a common problem in machine learning and needs to be considered whenever you train a model. You may need to simplify your model, give it more training examples by collecting more data or use data augmentation.&lt;br /&gt;
&lt;br /&gt;
=== 4. convolutional_network.ipynb ===&lt;br /&gt;
&lt;br /&gt;
 TensorFlow-Examples/notebooks/3_NeuralNetworks/convolutional_network.ipynb&lt;br /&gt;
&lt;br /&gt;
This is our first example of a commonly applied kind of neural network called a convolutional neural network (CNN), also applied to handwriting classification.&lt;br /&gt;
&lt;br /&gt;
The notebook provides a good description of this kind of network which gradually reduces the size of the input using convolutional layers. This forces the network to learn information about multiple input variables and provides good accuracy.&lt;br /&gt;
&lt;br /&gt;
Run the notebook and observe that this simple CNN greatly outperforms kmeans and random forests for handwriting recognition.&lt;br /&gt;
&lt;br /&gt;
=== 5. linear_regression.ipynb ===&lt;br /&gt;
&lt;br /&gt;
 notebooks/TensorFlow-Examples/notebooks/2_BasicModels/linear_regression.ipynb&lt;br /&gt;
&lt;br /&gt;
This notebook is an example of a different problem, called regression. Regression attempts to predict a value, unlike classification which determines which class an input example belongs to. This is a simple example that attempts to fit a line to best match a set of data points with x and y values.&lt;br /&gt;
&lt;br /&gt;
Note: you may need to add brackets to the print statements, depending on your version of python and the version of this example notebook.&lt;br /&gt;
&lt;br /&gt;
=== 6. autoencoder.ipynb ===&lt;br /&gt;
&lt;br /&gt;
 notebooks/TensorFlow-Examples/notebooks/3_NeuralNetworks/autoencoder.ipynb&lt;br /&gt;
&lt;br /&gt;
An autoencoder trains two separate networks, an encoder and a decoder. The intermediate &amp;quot;latent representation&amp;quot; between them has many applications. These include compression (by storing just the latent representation), translation (by training an encoder with multiple different languages and then training different decoders for each language), and modifying the style of the input (by similarly leveraging different encoder or decoder frameworks).&lt;br /&gt;
&lt;br /&gt;
=== 7. gan.ipynb ===&lt;br /&gt;
&lt;br /&gt;
 TensorFlow-Examples/notebooks/3_NeuralNetworks/gan.ipynb&lt;br /&gt;
&lt;br /&gt;
A generative adversarial network (GAN) can be used to generate new data that looks similar to training data. This kind of network is how so-called &amp;quot;deepfakes&amp;quot;, which alter the style of images or video, are made.&lt;br /&gt;
&lt;br /&gt;
The basic idea is to train two opposing networks. The generator tries to generate examples and the discriminator tries to determine if examples are real or fake. Both networks learn and improve together, which greatly improves performance over using a single network for either purpose.&lt;br /&gt;
&lt;br /&gt;
=== 8. recurrent_network.ipynb ===&lt;br /&gt;
		&lt;br /&gt;
 TensorFlow-Examples/notebooks/3_NeuralNetworks/recurrent_network.ipynb&lt;br /&gt;
&lt;br /&gt;
A recurrent neural network is often applied to text processing or other problems that consider a sequence of events or letters. This type of network retains information at each step and thus has a type of memory: it learns both from training examples and from the parts of a specific input instance that have already been processed.&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Deep_Learning_Tutorials&amp;diff=167</id>
		<title>Deep Learning Tutorials</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Deep_Learning_Tutorials&amp;diff=167"/>
		<updated>2020-07-02T18:04:42Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: Created page with &amp;quot;&amp;lt;div class=&amp;quot;noautonum&amp;quot;&amp;gt;  == 1. Get started with DeepSense ==  Follow all the steps from Getting started and Getting started with Deep Learning. This tutorial assumes y...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;div class=&amp;quot;noautonum&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 1. Get started with DeepSense ==&lt;br /&gt;
&lt;br /&gt;
Follow all the steps from [[Getting started]] and [[Getting started with Deep Learning]]. This tutorial assumes you can log on to the DeepSense compute platform and have a version of Anaconda python on your path with Tensorflow and Pytorch installed in an anaconda environment.&lt;br /&gt;
&lt;br /&gt;
== 2. Tensorflow Preparation ==&lt;br /&gt;
&lt;br /&gt;
=== Download the example notebooks ===&lt;br /&gt;
&lt;br /&gt;
 git clone https://github.com/aymericdamien/TensorFlow-Examples.git&lt;br /&gt;
&lt;br /&gt;
=== request a gpu session ===&lt;br /&gt;
&lt;br /&gt;
 bsub -Is -q gpu bash&lt;br /&gt;
&lt;br /&gt;
=== activate your anaconda environment ===&lt;br /&gt;
&lt;br /&gt;
 conda activate tensorflow&lt;br /&gt;
&lt;br /&gt;
Note: this assumes you&amp;#039;ve followed the getting started instructions and have created a python environment called &amp;lt;code&amp;gt;tensorflow&amp;lt;/code&amp;gt; with the &amp;lt;code&amp;gt;tensorflow-gpu&amp;lt;/code&amp;gt; package installed from the IBM-AI repository. If not then please follow those instructions.&lt;br /&gt;
&lt;br /&gt;
=== start a jupyter notebook ===&lt;br /&gt;
 jupyter notebook --no-browser --ip=0.0.0.0&lt;br /&gt;
&lt;br /&gt;
=== open an SSH tunnel to access the notebook ===&lt;br /&gt;
Just as we did in the [[Getting started with Deep Learning]] tutorial, open an SSH tunnel in another window:&lt;br /&gt;
 ssh -l &amp;lt;user&amp;gt; login1.deepsense.ca -L &amp;lt;port&amp;gt;:ds-cmgpu-&amp;lt;num&amp;gt;:&amp;lt;port&amp;gt;&lt;br /&gt;
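&lt;br /&gt;
If you open tunnels often, a small helper can fill in the placeholders for you. The sketch below only builds the command shown above. The function name build_tunnel_command, the user name alice, the node name ds-cmgpu-01, and port 8888 are hypothetical, so substitute the node and port reported by your own jupyter session.&lt;br /&gt;
&lt;br /&gt;
```python
def build_tunnel_command(user, gpu_node, port, login_host="login1.deepsense.ca"):
    """Return the ssh command, as a list of arguments, that forwards a local port.

    The same port number is used locally and on the GPU node,
    matching the command in this tutorial.
    """
    forward = "{0}:{1}:{0}".format(port, gpu_node)
    return ["ssh", "-l", user, login_host, "-L", forward]


# Hypothetical values; replace them with your own user, node, and port.
command = build_tunnel_command("alice", "ds-cmgpu-01", 8888)
print(" ".join(command))
# prints: ssh -l alice login1.deepsense.ca -L 8888:ds-cmgpu-01:8888
```
&lt;br /&gt;
The list form can be passed to subprocess.run from the Python standard library if you prefer to launch the tunnel from a script.&lt;br /&gt;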
&lt;br /&gt;
=== open the notebook in your browser ===&lt;br /&gt;
In a web browser, navigate to the page listed in the jupyter notebook output. Remember to replace the node name with &amp;lt;code&amp;gt;localhost&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== 3. Tensorflow Example Notebooks ==&lt;br /&gt;
&lt;br /&gt;
=== 1. helloworld.ipynb ===&lt;br /&gt;
&lt;br /&gt;
 TensorFlow-Examples/notebooks/1_Introduction/helloworld.ipynb&lt;br /&gt;
&lt;br /&gt;
This is an introduction to a python jupyter notebook.&lt;br /&gt;
&lt;br /&gt;
Select a box with code. Press the shift and enter keys together to run the code in that box. You&amp;#039;ll notice a star appear beside running code and a number in brackets appear beside finished code to indicate the order in which code boxes were run.&lt;br /&gt;
&lt;br /&gt;
Often in an example notebook you will see code that already has output cached. You still need to run all previous code boxes and may want to use the menu to clear all output. If there is an error then you can modify the code or fix the error (e.g. download a dependency) and try again.&lt;br /&gt;
&lt;br /&gt;
If you are missing a required dependency in later notebooks then you can install that package into your anaconda environment in a terminal window and it will be immediately accessible from the notebook. You do not need to close and restart the notebook or SSH tunnel.&lt;br /&gt;
&lt;br /&gt;
You can also run the entire notebook using the menu.&lt;br /&gt;
&lt;br /&gt;
When you are finished with a notebook you should use the menu to halt the kernel before closing the notebook. This clears resources such as GPU memory.&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=DeepSense_Documentation&amp;diff=166</id>
		<title>DeepSense Documentation</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=DeepSense_Documentation&amp;diff=166"/>
		<updated>2020-07-02T17:43:44Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: /* Guides */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Note ==&lt;br /&gt;
On June 26 we will update the GPU compute nodes to a new version of IBM Watson Machine Learning Accelerator. This will change the way you access deep learning packages like Tensorflow and Pytorch. Instead of &amp;quot;activating&amp;quot; these packages, you will be able to install new versions directly in your anaconda environment.&lt;br /&gt;
&lt;br /&gt;
We are actively updating the wiki documentation to explain the new method of accessing deep learning packages. Please bear with us during these updates as some documentation may still refer to the old method of &amp;quot;activating&amp;quot; deep learning packages.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;&amp;lt;span style=&amp;quot;font-size:120%&amp;quot;&amp;gt;Cluster status&amp;lt;/span&amp;gt;&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
{|class=&amp;quot;wikitable&amp;quot; style=&amp;quot;text-align: center; color: black; font-weight:bold&amp;quot;&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;Status&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|style=&amp;quot;width:20%&amp;quot; | &amp;#039;&amp;#039;&amp;#039;Planned Outage&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|style=&amp;quot;width:70%&amp;quot; | &amp;#039;&amp;#039;&amp;#039;Notes&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
|style=&amp;quot;Color:green&amp;quot; | Online&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|}&lt;br /&gt;
Legend:&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;span style=&amp;quot;color:green&amp;quot;&amp;gt;Online&amp;lt;/span&amp;gt;: cluster is running normally&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;span style=&amp;quot;color:orange&amp;quot;&amp;gt;Online&amp;lt;/span&amp;gt;: cluster has some problems and is partially available&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;span style=&amp;quot;color:red&amp;quot;&amp;gt;Offline&amp;lt;/span&amp;gt;: cluster is offline and users are not able to log in&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== System Information ==&lt;br /&gt;
* [[Resources]]&lt;br /&gt;
* [[Available software]]&lt;br /&gt;
&lt;br /&gt;
== Guides ==&lt;br /&gt;
* [[ Requesting access]]&lt;br /&gt;
* [[Getting started]]&lt;br /&gt;
* [[Introduction to Linux]]&lt;br /&gt;
* [[Getting started with Deep Learning]]&lt;br /&gt;
* [[Deep Learning Tutorials]]&lt;br /&gt;
* [[Storage policies]]&lt;br /&gt;
* [[Transferring Data]]&lt;br /&gt;
* Running jobs&lt;br /&gt;
** [[LSF|LSF batch jobs]]&lt;br /&gt;
** [[CWS|CWS web interface]]&lt;br /&gt;
* [[Installing local software]]&lt;br /&gt;
* Writing Tips&lt;br /&gt;
** [[Mitacs Accelerate Proposals]]&lt;br /&gt;
* [[Known problems]]&lt;br /&gt;
* [[Contact information|Contacting DeepSense]]&lt;br /&gt;
&lt;br /&gt;
== Documentation ==&lt;br /&gt;
* [[Media:DeepSense_Computing_Platform.pdf|DeepSense Computing Platform]]&lt;br /&gt;
&lt;br /&gt;
== Links ==&lt;br /&gt;
* [https://deepsense.ca DeepSense home page]&lt;br /&gt;
* [https://dal.ca Dalhousie University]&lt;br /&gt;
* [https://www.dal.ca/faculty/computerscience.html Faculty of Computer Science]&lt;br /&gt;
* [https://oceanfrontierinstitute.com/ Ocean Frontier Institute]&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Mitacs_Accelerate_Proposals&amp;diff=165</id>
		<title>Mitacs Accelerate Proposals</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Mitacs_Accelerate_Proposals&amp;diff=165"/>
		<updated>2020-06-26T19:56:23Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: Created page with &amp;quot;The Mitacs Accelerate program is a common funding method for DeepSense projects. This guide will help you through the process of writing a funding proposal for submission to M...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The Mitacs Accelerate program is a common funding method for DeepSense projects. This guide will help you through the process of writing a funding proposal for submission to Mitacs.&lt;br /&gt;
&lt;br /&gt;
Mitacs and a company each pay for half of the project. Mitacs funding is allocated in &amp;#039;&amp;#039;units&amp;#039;&amp;#039; of 4-6 months with each unit paying $15,000 to the student.&lt;br /&gt;
&lt;br /&gt;
See [https://www.mitacs.ca/en/programs/accelerate/proposal the Mitacs web site] for the proposal template and guide.&lt;br /&gt;
&lt;br /&gt;
== Planning Meeting ==&lt;br /&gt;
As with any project, the first step is to have a planning meeting with the student, company, supervisor, and possibly with a DeepSense staff member. If you are reading this guide you probably already had at least one meeting with the company but there are a few items you need to be clear on. If necessary, have another meeting or discuss these items by email or phone with your supervisor, DeepSense staff, and/or the company:&lt;br /&gt;
&lt;br /&gt;
# Number of students and duration of the project&lt;br /&gt;
# Overall goal of the project&lt;br /&gt;
# Deliverables&lt;br /&gt;
# A general idea of how to approach the project and belief it is feasible&lt;br /&gt;
&lt;br /&gt;
== General advice ==&lt;br /&gt;
&lt;br /&gt;
* There are a lot of sections in the proposal so write the proposal in stages instead of trying to do it all at once or in order.&lt;br /&gt;
&lt;br /&gt;
* Be iterative: write an outline of what you intend to do, expand the outline to sentences and paragraphs, and then edit for clarity and content.&lt;br /&gt;
&lt;br /&gt;
* Get feedback often. Involve your supervisor, other students on the project or related projects, and DeepSense staff. It will take several rounds of feedback and editing to write a strong proposal.&lt;br /&gt;
&lt;br /&gt;
* Be mindful of both the science and the benefit to the company. A DeepSense project is a mix of both. There need to be specified deliverables for the company and an explanation of how those deliverables solve a problem for the company. There also needs to be a research question to solve or a novel advance that doesn&amp;#039;t currently exist. This can range from applying a known state-of-the-art technique to a novel dataset or situation all the way to developing a brand new solution.&lt;br /&gt;
&lt;br /&gt;
* Use topic sentences and strong closing sentences. A topic sentence is a sentence that explains the purpose of a paragraph and is typically at the beginning of the paragraph. In most cases you should be able to read just the first sentence of each paragraph and still understand the main ideas. A strong closing sentence, especially at the end of each section, will encourage the reader&lt;br /&gt;
&lt;br /&gt;
* Ask your supervisor or DeepSense staff if you can see an example of a successfully funded Mitacs proposal. This will help you understand the scope and detail required in each section.&lt;br /&gt;
&lt;br /&gt;
== First Steps ==&lt;br /&gt;
&lt;br /&gt;
The first steps are to do some research and write down the general idea of the project.&lt;br /&gt;
&lt;br /&gt;
=== 2.1 and 7.1: Title ===&lt;br /&gt;
Write down a draft title. You may want to change this later on so don&amp;#039;t worry too much about writing a catchy title yet.&lt;br /&gt;
&lt;br /&gt;
=== 2.2 Abstract ===&lt;br /&gt;
Approximately 200 words. The abstract summarizes the entire project. Write out the problem to be solved, why it is important, the current state of the art and why that is insufficient, your proposed solution, and finally the steps and deliverables of the project.&lt;br /&gt;
&lt;br /&gt;
The abstract is important but does not need to be polished at this stage. Just write out each of these facets and if you can&amp;#039;t then you need to do some research or discuss the project further with your supervisor, the company, and/or DeepSense staff.&lt;br /&gt;
&lt;br /&gt;
=== 2.3 Background ===&lt;br /&gt;
Minimum 500 words. This is where you explain the current state of the problem in detail and cite relevant research.&lt;br /&gt;
&lt;br /&gt;
First give a general overview of the problem and the broad research area. Cite some relevant survey papers or important guiding works.&lt;br /&gt;
&lt;br /&gt;
Then summarize some of the more specific topics and research papers. Typically you will find papers that partially but do not completely solve your problem. Summarize their results as well as their pros and cons. It should be clear why you have selected these specific results and how this knowledge will contribute to completing the project.&lt;br /&gt;
&lt;br /&gt;
Finally, explain how some of these results together can be used to solve your specific problem. For example, you may be using a deep learning model from one paper, adding a technique from a second paper, and then using a type of analysis from a third paper. Moreover, highlight any gaps that the current literature can&amp;#039;t solve for your problem. It is important to explain how these methods fit together and show that your project is novel research.&lt;br /&gt;
&lt;br /&gt;
=== 2.9 References ===&lt;br /&gt;
&lt;br /&gt;
Put your references in this section. Mitacs does not specify a reference format so use any consistent format. We&amp;#039;ve had good success with the APA format.&lt;br /&gt;
&lt;br /&gt;
This section is just as important as it is in a scientific paper so be sure to check your reference details and order them consistently.&lt;br /&gt;
&lt;br /&gt;
== Project Detail ==&lt;br /&gt;
&lt;br /&gt;
Now that you have a general idea of the project and have done some research it&amp;#039;s time to get more specific with the deliverables and methods. Consider getting feedback on the sections you have already completed while you start on the next sections.&lt;br /&gt;
&lt;br /&gt;
=== 2.4 General Objective ===&lt;br /&gt;
&lt;br /&gt;
Write out the main objective of the project. Split this up into sub-objectives. Very briefly summarize any information needed to explain a sub-objective such as how it will be tested or what data will be used.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== 2.5 Details of internships or subprojects ===&lt;br /&gt;
&lt;br /&gt;
Although numbered like a regular section, each subsection of Details requires a significant explanation and should be treated as a full section in its own right. We&amp;#039;ll complete some of these now but go back and do the others later.&lt;br /&gt;
&lt;br /&gt;
For a very large project funded by a single Mitacs submission you may need to break the project up into different subprojects. If so, the student(s) working on each subproject should fill out their subproject&amp;#039;s information.&lt;br /&gt;
&lt;br /&gt;
=== 2.5.a Name of Intern ===&lt;br /&gt;
&lt;br /&gt;
Write out the name of the student or students working on a shared goal.&lt;br /&gt;
&lt;br /&gt;
=== 2.5.b Specific Objectives ===&lt;br /&gt;
&lt;br /&gt;
Write out the specific objectives that each student listed will work on. These should be matched with the general objective and sub-objectives as well as explaining what each individual student will do.&lt;br /&gt;
&lt;br /&gt;
=== 2.5.c Methodologies ===&lt;br /&gt;
&lt;br /&gt;
In this section you write out in detail how you will accomplish this project.&lt;br /&gt;
&lt;br /&gt;
This is the meat of the proposal and must include the datasets, tools, and methods you will use in enough detail for an expert external reviewer to determine if the project is feasible and will accomplish your objectives. Consider this like the methods section of a scientific paper.&lt;br /&gt;
&lt;br /&gt;
Explain the analysis you will complete and how you will show that you have met each sub-objective.&lt;br /&gt;
&lt;br /&gt;
=== 2.5.e Deliverables ===&lt;br /&gt;
&lt;br /&gt;
Explain the deliverables of the project such as software, reports, publications, etc. What will the project create and how will it meet the sub-objectives?&lt;br /&gt;
&lt;br /&gt;
=== 2.5.f Benefit to the intern ===&lt;br /&gt;
&lt;br /&gt;
Explain what you will gain from this project. Examples include:&lt;br /&gt;
* exposure to an ocean company and/or novel data&lt;br /&gt;
* learning a new analysis method&lt;br /&gt;
* opportunity to contribute to an industry problem or a solution that will be used by the partner to do X&lt;br /&gt;
* opportunity to publish in peer-reviewed venues and/or write a thesis&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Timeline ==&lt;br /&gt;
&lt;br /&gt;
Now that you have a good idea of the specifics of the project it&amp;#039;s time to plan. Revise and edit the previous sections and seek feedback.&lt;br /&gt;
&lt;br /&gt;
Then, take the steps you propose in your methodology and create a &amp;#039;&amp;#039;Gantt chart&amp;#039;&amp;#039; showing how long each of those steps will take:&lt;br /&gt;
&lt;br /&gt;
=== 2.5.d Timeline === &lt;br /&gt;
&lt;br /&gt;
First, list out the major steps of the project, goals, and deliverables such as:&lt;br /&gt;
* obtain data&lt;br /&gt;
* preprocess the data&lt;br /&gt;
* design the machine learning model&lt;br /&gt;
* train the model&lt;br /&gt;
* test the model&lt;br /&gt;
* improve the model&lt;br /&gt;
* train and test again&lt;br /&gt;
* analysis&lt;br /&gt;
* write a report&lt;br /&gt;
&lt;br /&gt;
Then think about how long each step of the project will take in weeks, half months, or (for a long project) months. Break the project up into 4-month sections and create a Gantt chart for each. You will probably want to do this in a spreadsheet program such as Excel for easy editing and then copy the chart into the Mitacs proposal later.&lt;br /&gt;
&lt;br /&gt;
== Basic Information ==&lt;br /&gt;
&lt;br /&gt;
There are some informational sections you need to complete. You will need to obtain some information from your supervisor:&lt;br /&gt;
&lt;br /&gt;
* 2.8 Relationship (if any) to past/other Mitacs projects&lt;br /&gt;
&lt;br /&gt;
* 3 Declarations&lt;br /&gt;
&lt;br /&gt;
* 4.1 Lead academic supervisor in Canada&lt;br /&gt;
&lt;br /&gt;
* 4.3 Interns&lt;br /&gt;
&lt;br /&gt;
* 4.4 Interns to be determined (TBD)&lt;br /&gt;
&lt;br /&gt;
== Company information ==&lt;br /&gt;
&lt;br /&gt;
When your proposal is nearly ready you will need to send it to your partner company for their feedback. Do this in conjunction with your supervisor and/or DeepSense staff.&lt;br /&gt;
&lt;br /&gt;
Be sure you make it clear to the company that &amp;#039;&amp;#039;Section 7.2 Public project overview&amp;#039;&amp;#039; will be publicly available on the Mitacs web site. They must review this section carefully.&lt;br /&gt;
&lt;br /&gt;
Sections of the proposal that should be completed by or with the help of the partner organization are:&lt;br /&gt;
&lt;br /&gt;
* 2.5.g Partner interaction&lt;br /&gt;
&lt;br /&gt;
* 2.6 Relevance to the partner organization and to Canada&lt;br /&gt;
&lt;br /&gt;
* 4.2 Partner organization in Canada&lt;br /&gt;
&lt;br /&gt;
* 7.2 Public project overview&lt;br /&gt;
&lt;br /&gt;
== Remaining steps ==&lt;br /&gt;
&lt;br /&gt;
There are a few other sections that must be completed.&lt;br /&gt;
&lt;br /&gt;
=== 5 Budget and invoicing ===&lt;br /&gt;
&lt;br /&gt;
Your supervisor or DeepSense staff can help with the budget.&lt;br /&gt;
&lt;br /&gt;
=== 6 Suggested reviewers ===&lt;br /&gt;
&lt;br /&gt;
You need to suggest 6 reviewers. They cannot be from your university and you cannot have published with them or have plans to collaborate with them in the near future.&lt;br /&gt;
&lt;br /&gt;
Moreover, each reviewer must be from a different university or organization than the others.&lt;br /&gt;
&lt;br /&gt;
We recommend selecting subject experts who are knowledgeable about the research area but also likely to be generally favourable toward the research and to provide constructive criticism and suggestions. Contact your supervisor and/or DeepSense staff if you need help selecting reviewers.&lt;br /&gt;
&lt;br /&gt;
== Scientific Committee Review ==&lt;br /&gt;
&lt;br /&gt;
The proposal must be reviewed by the DeepSense scientific committee.&lt;br /&gt;
&lt;br /&gt;
== Signatures ==&lt;br /&gt;
&lt;br /&gt;
After the scientific committee has approved the project and the company has agreed and signed the necessary sections, you and your supervisor will need to sign:&lt;br /&gt;
&lt;br /&gt;
* 7.3 Signatures&lt;br /&gt;
* Appendix A - Intern consent form&lt;br /&gt;
* The separate Mitacs IP agreement&lt;br /&gt;
&lt;br /&gt;
== Review Process ==&lt;br /&gt;
&lt;br /&gt;
Proposal review will take approximately 4-8 weeks so it is important to submit the proposal early before the start of a project. Most well-written proposals are funded but you may need to provide extra information to address reviewer comments.&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=DeepSense_Documentation&amp;diff=164</id>
		<title>DeepSense Documentation</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=DeepSense_Documentation&amp;diff=164"/>
		<updated>2020-06-26T18:12:37Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: /* Guides */  Added page for writing a mitacs&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Note ==&lt;br /&gt;
On June 26 we will update the GPU compute nodes to a new version of IBM Watson Machine Learning Accelerator. This will change the way you access deep learning packages like Tensorflow and Pytorch. Instead of &amp;quot;activating&amp;quot; these packages, you will be able to install new versions directly in your anaconda environment.&lt;br /&gt;
&lt;br /&gt;
We are actively updating the wiki documentation to explain the new method of accessing deep learning packages. Please bear with us during these updates as some documentation may still refer to the old method of &amp;quot;activating&amp;quot; deep learning packages.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;&amp;lt;span style=&amp;quot;font-size:120%&amp;quot;&amp;gt;Cluster status&amp;lt;/span&amp;gt;&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
{|class=&amp;quot;wikitable&amp;quot; style=&amp;quot;text-align: center; color: black; font-weight:bold&amp;quot;&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;Status&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|style=&amp;quot;width:20%&amp;quot; | &amp;#039;&amp;#039;&amp;#039;Planned Outage&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|style=&amp;quot;width:70%&amp;quot; | &amp;#039;&amp;#039;&amp;#039;Notes&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
|style=&amp;quot;Color:green&amp;quot; | Online&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|}&lt;br /&gt;
Legend:&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;span style=&amp;quot;color:green&amp;quot;&amp;gt;Online&amp;lt;/span&amp;gt;: cluster is running normally&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;span style=&amp;quot;color:orange&amp;quot;&amp;gt;Online&amp;lt;/span&amp;gt;: cluster has some problems and is partially available&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;span style=&amp;quot;color:red&amp;quot;&amp;gt;Offline&amp;lt;/span&amp;gt;: cluster is offline and users are not able to log in&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== System Information ==&lt;br /&gt;
* [[Resources]]&lt;br /&gt;
* [[Available software]]&lt;br /&gt;
&lt;br /&gt;
== Guides ==&lt;br /&gt;
* [[ Requesting access]]&lt;br /&gt;
* [[Getting started]]&lt;br /&gt;
* [[Introduction to Linux]]&lt;br /&gt;
* [[Getting started with Deep Learning]]&lt;br /&gt;
* [[Storage policies]]&lt;br /&gt;
* [[Transferring Data]]&lt;br /&gt;
* Running jobs&lt;br /&gt;
** [[LSF|LSF batch jobs]]&lt;br /&gt;
** [[CWS|CWS web interface]]&lt;br /&gt;
* [[Installing local software]]&lt;br /&gt;
* Writing Tips&lt;br /&gt;
** [[Mitacs Accelerate Proposals]]&lt;br /&gt;
* [[Known problems]]&lt;br /&gt;
* [[Contact information|Contacting DeepSense]]&lt;br /&gt;
&lt;br /&gt;
== Documentation ==&lt;br /&gt;
* [[Media:DeepSense_Computing_Platform.pdf|DeepSense Computing Platform]]&lt;br /&gt;
&lt;br /&gt;
== Links ==&lt;br /&gt;
* [https://deepsense.ca DeepSense home page]&lt;br /&gt;
* [https://dal.ca Dalhousie University]&lt;br /&gt;
* [https://www.dal.ca/faculty/computerscience.html Faculty of Computer Science]&lt;br /&gt;
* [https://oceanfrontierinstitute.com/ Ocean Frontier Institute]&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Known_problems&amp;diff=163</id>
		<title>Known problems</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Known_problems&amp;diff=163"/>
		<updated>2020-06-17T19:28:07Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: Update with information about IBM WMLA conda channel&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Where did Tensorflow go? ==&lt;br /&gt;
&lt;br /&gt;
On June 26 we will update the GPU compute nodes to a new version of IBM Watson Machine Learning Accelerator. This will change the way you access deep learning packages like Tensorflow and Pytorch. Instead of &amp;quot;activating&amp;quot; these packages, you will be able to install new versions directly in your anaconda environment.&lt;br /&gt;
&lt;br /&gt;
See [[Installing local software]] and [[Getting started with Deep Learning]] for more information.&lt;br /&gt;
&lt;br /&gt;
We are actively updating the wiki documentation to explain the new method of accessing deep learning packages. Please bear with us during these updates as some documentation may still refer to the old method of &amp;quot;activating&amp;quot; deep learning packages.&lt;br /&gt;
&lt;br /&gt;
== The bhist command is not working ==&lt;br /&gt;
&lt;br /&gt;
We are aware of a problem causing the bhist command to not find the job log file necessary to print information about completed jobs. Instead, the output is always &amp;quot;no matching job found&amp;quot; even if the user specifies the &amp;quot;-a&amp;quot; option to print information about all running, completed, and failed jobs. &lt;br /&gt;
&lt;br /&gt;
While we work on solving this issue you can manually specify the location of the log file using the -f option. For example, the following command will print all running, completed, and failed jobs for the current user:&lt;br /&gt;
 &lt;br /&gt;
 bhist -f /lsfshare/lsfswg/lsf/work/DeepSenseLSFCluster/logdir/lsb.events -a&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Where is my job output file? ==&lt;br /&gt;
Output may not be written to the specified file immediately when using the &amp;lt;code&amp;gt;-o &amp;lt;filename&amp;gt;&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;-oo &amp;lt;filename&amp;gt;&amp;lt;/code&amp;gt; options. There are two workarounds for this problem:&lt;br /&gt;
&lt;br /&gt;
# You can use the bpeek &amp;lt;jobid&amp;gt; command to view the output of a currently running job.&lt;br /&gt;
# You can send your output to a file with the typical unix output specifications such as &amp;lt;code&amp;gt;&amp;gt; &amp;lt;filename&amp;gt;&amp;lt;/code&amp;gt; with your executed programs or by specifying output files in programs that support such options.&lt;br /&gt;
&lt;br /&gt;
== Jupyter notebooks or other programs fail trying to access a /run directory ==&lt;br /&gt;
&lt;br /&gt;
The default login shell is BASH.  Make sure the following parameter is in your .bashrc file in your home directory, as it prevents a problem where some types of jobs fail when run through the LSF queue. This should be done automatically the first time you log onto DeepSense. &lt;br /&gt;
 &lt;br /&gt;
&amp;lt;code&amp;gt;echo &amp;#039;unset XDG_RUNTIME_DIR&amp;#039; &amp;gt;&amp;gt; ~/.bashrc&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This line has been added to the default .bashrc file for new users but older user accounts may need this step to be done manually. &lt;br /&gt;
&lt;br /&gt;
== Browser fails to connect to Jupyter Notebooks ==&lt;br /&gt;
&lt;br /&gt;
On our MacBook Pros, Jupyter notebooks work in Chrome but don&amp;#039;t work in Safari. Unfortunately, no error is given; Safari just fails to connect. Please let us know if you have issues with any other browsers, and we can add that info here.&lt;br /&gt;
&lt;br /&gt;
== Cannot Install PyTorch dependencies ==&lt;br /&gt;
&lt;br /&gt;
 UnsatisfiableError: The following specifications were found to be in conflict:&lt;br /&gt;
   - powerai-pytorch-prereqs=0.4.1_12295.5cb3523&lt;br /&gt;
&lt;br /&gt;
You may see this error when attempting to install the pytorch dependencies in a local anaconda environment. This error indicates that some of your installed python packages are not compatible with the pytorch prerequisites. In particular, we see this error when conda has been updated to version 4.6 (which may sometimes happen when installing the tensorflow dependencies first).&lt;br /&gt;
&lt;br /&gt;
To resolve this problem, create a new environment with a 4.5.x conda version and then install the pytorch dependencies in that environment.&lt;br /&gt;
&lt;br /&gt;
== Cannot use Caffe on login node or compute nodes without GPUs ==&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;Cuda number of devices: -579579216&lt;br /&gt;
Current device id: -579579216&lt;br /&gt;
Current device name: &lt;br /&gt;
[==========] Running 2207 tests from 293 test cases.&lt;br /&gt;
[----------] Global test environment set-up.&lt;br /&gt;
[----------] 9 tests from AccuracyLayerTest/0, where TypeParam = caffe::CPUDevice&amp;lt;float&amp;gt;&lt;br /&gt;
[ RUN      ] AccuracyLayerTest/0.TestSetup&lt;br /&gt;
E0206 15:59:26.604874  7990 common.cpp:121] Cannot create Cublas handle. Cublas won&amp;#039;t be available.&lt;br /&gt;
E0206 15:59:26.611477  7990 common.cpp:128] Cannot create Curand generator. Curand won&amp;#039;t be available.&lt;br /&gt;
F0206 15:59:26.611616  7990 syncedmem.cpp:500] Check failed: error == cudaSuccess (30 vs. 0)  unknown error&lt;br /&gt;
*** Check failure stack trace: ***&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You may see this error when attempting to use Caffe on a node without GPUs or a GPU node without specifically requesting a GPU.&lt;br /&gt;
&lt;br /&gt;
To resolve this problem, use a GPU node and request a GPU. Caffe cannot run without an available GPU.&lt;br /&gt;
&lt;br /&gt;
== Cannot see GPUs in an LSF job ==&lt;br /&gt;
&lt;br /&gt;
 $ nvidia-smi &lt;br /&gt;
 No devices were found&lt;br /&gt;
&lt;br /&gt;
GPUs must be requested with the &amp;lt;code&amp;gt;-gpu -&amp;lt;/code&amp;gt; option to bsub. See [[LSF#GPU_Computation]] for more information.&lt;br /&gt;
&lt;br /&gt;
== Nested anaconda environments may cause strange behaviour ==&lt;br /&gt;
&lt;br /&gt;
Some users have experienced strange behaviour when activating an anaconda environment within another environment. This may include permission errors, loading incorrect versions of software, or strange conflicts when attempting to install packages. If you encounter problems with a nested anaconda environment then first try deactivating all anaconda environments and activating just the desired environment.&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Getting_started_with_Deep_Learning&amp;diff=162</id>
		<title>Getting started with Deep Learning</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Getting_started_with_Deep_Learning&amp;diff=162"/>
		<updated>2020-06-17T19:25:35Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: Update with information about IBM WMLA conda channel&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;div class=&amp;quot;noautonum&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Note ==&lt;br /&gt;
On June 26 we will update the GPU compute nodes to a new version of IBM Watson Machine Learning Accelerator. This will change the way you access deep learning packages like Tensorflow and Pytorch. Instead of &amp;quot;activating&amp;quot; these packages, you will be able to install new versions directly in your anaconda environment.&lt;br /&gt;
&lt;br /&gt;
We are actively updating the wiki documentation to explain the new method of accessing deep learning packages. Please bear with us during these updates as some documentation may still refer to the old method of &amp;quot;activating&amp;quot; deep learning packages.&lt;br /&gt;
&lt;br /&gt;
== 1. Get started with DeepSense ==&lt;br /&gt;
&lt;br /&gt;
Follow all the steps from [[Getting started]]. This tutorial assumes you can log on to the DeepSense compute platform and have a version of Anaconda python on your path. We recommend installing Anaconda in your home directory before starting this tutorial (See [[Installing local software]]).&lt;br /&gt;
&lt;br /&gt;
== 2. Prepare Caffe and download Caffe samples to your home directory ==&lt;br /&gt;
&lt;br /&gt;
(New method)&lt;br /&gt;
&lt;br /&gt;
Activate your anaconda environment. See [[Installing local software]] for how to create an environment. We will assume you have created one called &amp;quot;caffe&amp;quot;.&lt;br /&gt;
 conda activate caffe&lt;br /&gt;
Add the IBM-AI anaconda channel if you have not done so already&lt;br /&gt;
 conda config --prepend channels https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/&lt;br /&gt;
Install caffe if you have not done so already&lt;br /&gt;
 conda install caffe&lt;br /&gt;
Install the caffe samples&lt;br /&gt;
 caffe-install-samples&lt;br /&gt;
&lt;br /&gt;
(Old method)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;/opt/DL/caffe/bin/caffe-install-samples&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 3. Request an interactive session on a GPU compute node ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- TODO: still need to set up queues to fairly share GPUs --&amp;gt;&lt;br /&gt;
&amp;lt;!-- TODO: write instructions doing this with regular LSF without an interactive session. We don&amp;#039;t want to encourage everyone to use interactive sessions --&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -Is -gpu - bash&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 4. Start a python2 Jupyter notebook ==&lt;br /&gt;
&lt;br /&gt;
=== (Old method only) Source the Caffe deep learning toolkit ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;source /opt/DL/caffe/bin/caffe-activate&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Start the notebook ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;jupyter notebook --no-browser --ip=0.0.0.0&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Sample output ===&lt;br /&gt;
&amp;lt;pre&amp;gt;[I 13:32:23.937 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).&lt;br /&gt;
[C 13:32:23.937 NotebookApp] &lt;br /&gt;
    &lt;br /&gt;
    Copy/paste this URL into your browser when you connect for the first time,&lt;br /&gt;
    to login with a token:&lt;br /&gt;
        http://ds-cmgpu-04:8888/?token=68042f40a10b500f3747ae0a232ee209fa4bf1aa384d29ba&amp;amp;token=68042f40a10b500f3747ae0a232ee209fa4bf1aa384d29ba&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Copy the URL, host, and port ===&lt;br /&gt;
&lt;br /&gt;
Copy the URL but don’t paste it in your browser yet.&lt;br /&gt;
&lt;br /&gt;
Make a note of which compute host and port the notebook is running on (e.g. host ds-cmgpu-04 and port 8888 in this case)&lt;br /&gt;
&lt;br /&gt;
== 5. Port Forwarding ==&lt;br /&gt;
&lt;br /&gt;
In a separate terminal window from your local computer, forward your local port to the remote host.&lt;br /&gt;
&lt;br /&gt;
=== ssh command port forwarding ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt; ssh -l &amp;lt;username&amp;gt; login1.deepsense.ca -L &amp;lt;local_port&amp;gt;:&amp;lt;remote_host&amp;gt;:&amp;lt;remote_port&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
for example, &amp;lt;code&amp;gt;ssh -l user1 login1.deepsense.ca -L 8888:ds-cmgpu-04:8888&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that you may need to use a different &amp;lt;local_port&amp;gt; than 8888 if you have other web services running on your local computer. In particular, if you run a jupyter notebook locally then it will use port 8888 and you will try to connect to the local jupyter notebook instead of the cluster notebook. In this case close your port forwarding and try again with 8889 or another unused port.&lt;br /&gt;
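&lt;br /&gt;
To find an unused local port before opening the tunnel, a small Python sketch like the one below can be run on your local computer. It is only an illustration: the helper names (&amp;lt;code&amp;gt;port_is_free&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;first_free_port&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;forward_command&amp;lt;/code&amp;gt;) are hypothetical, and it assumes a port counts as busy when something on 127.0.0.1 accepts a connection on it.&lt;br /&gt;
&lt;br /&gt;
```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing accepts connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # connect_ex returns 0 when a listener accepts the connection.
        return sock.connect_ex((host, port)) != 0


def first_free_port(start=8888, attempts=20):
    """Return the first free port in start..start+attempts-1, or None."""
    for port in range(start, start + attempts):
        if port_is_free(port):
            return port
    return None


def forward_command(username, local_port, remote_host, remote_port):
    """Build the ssh port forwarding command described in this section."""
    return "ssh -l {} login1.deepsense.ca -L {}:{}:{}".format(
        username, local_port, remote_host, remote_port)


if __name__ == "__main__":
    local_port = first_free_port(8888)
    if local_port is None:
        print("no free port found, try a different range")
    else:
        # user1 and ds-cmgpu-04 are the example values used on this page.
        print(forward_command("user1", local_port, "ds-cmgpu-04", 8888))
```
&lt;br /&gt;
If a local jupyter notebook already occupies port 8888, the printed command would use the next free port instead.&lt;br /&gt;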
&lt;br /&gt;
=== PuTTY port forwarding on Windows ===&lt;br /&gt;
&lt;br /&gt;
If you are using a PuTTY terminal from a Windows computer to access DeepSense then you can still forward ports.&lt;br /&gt;
&lt;br /&gt;
Before starting your session, scroll down to the option &amp;lt;code&amp;gt;Connection-&amp;gt;SSH-&amp;gt;Tunnels&amp;lt;/code&amp;gt; in the Category pane.&lt;br /&gt;
&lt;br /&gt;
Enter the &amp;lt;code&amp;gt;local_port&amp;lt;/code&amp;gt; in the &amp;lt;code&amp;gt;Source port&amp;lt;/code&amp;gt; field. For example, &amp;lt;code&amp;gt;8888&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Enter &amp;lt;code&amp;gt;&amp;lt;remote_host&amp;gt;:&amp;lt;remote_port&amp;gt;&amp;lt;/code&amp;gt; in the Destination field. For example, &amp;lt;code&amp;gt;ds-cmgpu-04:8888&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Press the &amp;lt;code&amp;gt;Add&amp;lt;/code&amp;gt; button to add the port forwarding rule to your PuTTY session.&lt;br /&gt;
&lt;br /&gt;
Finally, open the session as usual.&lt;br /&gt;
&lt;br /&gt;
== 6. Open the desired sample notebook ==&lt;br /&gt;
&lt;br /&gt;
Enter the copied URL in your web browser but change the remote host name to “localhost” before pressing enter.&lt;br /&gt;
&lt;br /&gt;
e.g. &amp;lt;code&amp;gt;http://localhost:8888/?token=68042f40a10b500f3747ae0a232ee209fa4bf1aa384d29ba&amp;amp;token=68042f40a10b500f3747ae0a232ee209fa4bf1aa384d29ba&amp;lt;/code&amp;gt;&lt;br /&gt;
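&lt;br /&gt;
The host name can also be rewritten programmatically. The Python sketch below is one possible way to do it with the standard library; the function name &amp;lt;code&amp;gt;to_local_url&amp;lt;/code&amp;gt; and the placeholder token are assumptions made for this example.&lt;br /&gt;
&lt;br /&gt;
```python
from urllib.parse import urlsplit, urlunsplit


def to_local_url(notebook_url, local_port=8888):
    """Replace the compute host and port in a Jupyter URL with localhost:local_port."""
    parts = urlsplit(notebook_url)
    # Keep the scheme, path and query (the token); swap only the host and port.
    return urlunsplit(parts._replace(netloc="localhost:{}".format(local_port)))


if __name__ == "__main__":
    # PASTE_TOKEN_HERE stands in for the token printed by your own notebook.
    print(to_local_url("http://ds-cmgpu-04:8888/?token=PASTE_TOKEN_HERE", 8889))
```
&lt;br /&gt;
Pass the local port you chose for port forwarding if it differs from 8888.&lt;br /&gt;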
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Note&amp;#039;&amp;#039;&amp;#039;: On our Macs, this worked in Chrome but not in Safari. Unfortunately, Safari reported no error; it simply could not connect.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Be sure to enter the location of the “caffe-samples” directory in your home directory as your caffe-root in the Caffe example notebooks.&lt;br /&gt;
&lt;br /&gt;
== 7. Enjoy Deep Learning on DeepSense! ==&lt;br /&gt;
&lt;br /&gt;
== 8. More information ==&lt;br /&gt;
&lt;br /&gt;
Go to Caffe&amp;#039;s [http://caffe.berkeleyvision.org/ website] for tutorials and example programs that you can run to get started.&lt;br /&gt;
See the following links to a couple of the example programs:&lt;br /&gt;
&lt;br /&gt;
[http://caffe.berkeleyvision.org/gathered/examples/mnist.html LeNet MNIST Tutorial] - Train a neural network to understand handwritten digits.&lt;br /&gt;
&lt;br /&gt;
[http://caffe.berkeleyvision.org/gathered/examples/cifar10.html CIFAR-10 tutorial] - Train a convolutional neural network to classify small images.&lt;br /&gt;
&lt;br /&gt;
== 9. Using another deep learning toolkit such as Tensorflow ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
(New Method)&lt;br /&gt;
* Ensure any Anaconda dependencies are installed&lt;br /&gt;
** for tensorflow, create a new environment and &amp;lt;code&amp;gt;conda install tensorflow&amp;lt;/code&amp;gt;&lt;br /&gt;
* Download example notebooks for the deep learning toolkit to your home directory,&lt;br /&gt;
** e.g. &amp;lt;code&amp;gt; git clone https://github.com/aymericdamien/TensorFlow-Examples.git&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(Old Method)&lt;br /&gt;
* Ensure any Anaconda dependencies are installed&lt;br /&gt;
** for tensorflow, run &amp;lt;code&amp;gt;/opt/DL/tensorflow/bin/install_dependencies&amp;lt;/code&amp;gt;&lt;br /&gt;
* Source the appropriate toolkit instead of caffe-activate&lt;br /&gt;
** e.g. &amp;lt;code&amp;gt;source /opt/DL/tensorflow/bin/tensorflow-activate&amp;lt;/code&amp;gt;&lt;br /&gt;
* Download example notebooks for the deep learning toolkit to your home directory,&lt;br /&gt;
** e.g. &amp;lt;code&amp;gt; git clone https://github.com/aymericdamien/TensorFlow-Examples.git&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The TensorFlow [https://www.tensorflow.org/ home page] has various information, including Tutorials, How-To documents, and a Getting Started guide.&lt;br /&gt;
&lt;br /&gt;
Additional tutorials and examples are available from the community, for example:&lt;br /&gt;
&lt;br /&gt;
  https://github.com/nlintz/TensorFlow-Tutorials&lt;br /&gt;
&lt;br /&gt;
  https://github.com/aymericdamien/TensorFlow-Examples&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/div&amp;gt; &amp;lt;!-- autonum --&amp;gt;&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Getting_started&amp;diff=161</id>
		<title>Getting started</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Getting_started&amp;diff=161"/>
		<updated>2020-06-17T19:01:32Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: Update with information about IBM WMLA conda channel&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt; Getting Started with DeepSense &lt;br /&gt;
&lt;br /&gt;
&amp;lt;div class=&amp;quot;noautonum&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Note ==&lt;br /&gt;
&lt;br /&gt;
On June 26 we will update the GPU compute nodes to a new version of IBM Watson Machine Learning Accelerator. This will change the way you access deep learning packages like Tensorflow and Pytorch. Instead of &amp;quot;activating&amp;quot; these packages, you will be able to install new versions directly in your anaconda environment.&lt;br /&gt;
&lt;br /&gt;
We are actively updating the wiki documentation to explain the new method of accessing deep learning packages. Please bear with us during these updates as some documentation may still refer to the old method of &amp;quot;activating&amp;quot; deep learning packages.&lt;br /&gt;
&lt;br /&gt;
== 1. Logging on ==&lt;br /&gt;
&lt;br /&gt;
DeepSense has two login nodes, login1.deepsense.ca and login2.deepsense.ca . You can access these through SSH with your username and password from any computer on campus.&lt;br /&gt;
&lt;br /&gt;
For example, if your userid is &amp;lt;code&amp;gt;user1&amp;lt;/code&amp;gt;, you can connect to DeepSense by typing &amp;lt;code&amp;gt;ssh user1@login1.deepsense.ca&amp;lt;/code&amp;gt;, just like logging on to any other network computer.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Note&amp;#039;&amp;#039;&amp;#039;: The login nodes are intended for testing and compiling code. Please don’t run long or intensive computation on these nodes. Keep reading for instructions on how to submit compute jobs to dedicated compute nodes.&lt;br /&gt;
&lt;br /&gt;
== 1.1 VPN ==&lt;br /&gt;
&lt;br /&gt;
To connect to the DeepSense platform from outside of the Dalhousie Campus, you&amp;#039;ll need to use a VPN.&lt;br /&gt;
If you are a Dalhousie student, staff member, or faculty member, you can use the Dalhousie VPN (https://wireless.dal.ca/vpnsoftware.php).&lt;br /&gt;
&lt;br /&gt;
If you are not a Dalhousie student, staff member, or faculty member but require offsite access and cannot use the Dalhousie VPN, then contact your project leader or support@deepsense.ca to make different arrangements.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==  2. Transfer data ==&lt;br /&gt;
&lt;br /&gt;
For more information, see [[Transferring Data]].&lt;br /&gt;
&lt;br /&gt;
DeepSense has two protocol nodes, protocol1.deepsense.ca and protocol2.deepsense.ca. You can connect to these using the SAMBA transfer protocol, e.g. smb://protocol1.deepsense.ca, with your username and password. Please contact your project leader or support@deepsense.ca if you need help transferring large amounts of data.&lt;br /&gt;
&lt;br /&gt;
Data transferred through the protocol nodes will be located in the shared /data directory .&lt;br /&gt;
&lt;br /&gt;
See [[Storage policies]] for more information about the available shared file systems, storage policies, and backup policies.&lt;br /&gt;
&lt;br /&gt;
== 3. Configure your environment ==&lt;br /&gt;
&lt;br /&gt;
DeepSense compute and management nodes are IBM Power8 computers (ppc64le) running Redhat Enterprise Linux. See [[Resources]] for more details on the available nodes.&lt;br /&gt;
&lt;br /&gt;
=== 3.1 Loading a python environment ===&lt;br /&gt;
&lt;br /&gt;
You have two options for using python on DeepSense. You can use the systemwide python install, managed by DeepSense administrators. This is recommended for users new to Linux. You will need to contact DeepSense support to have additional software packages installed in the systemwide python.&lt;br /&gt;
&lt;br /&gt;
Alternatively, you can install an Anaconda python environment or other software in your home directory. This allows you to install or update packages or software without requesting and waiting for DeepSense staff. &lt;br /&gt;
&lt;br /&gt;
==== Systemwide python (managed by DeepSense) ====&lt;br /&gt;
&lt;br /&gt;
DeepSense nodes have anaconda2 python installed in /opt/anaconda2. To use this systemwide python, add the following line to the .bashrc file in your home directory:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;echo &amp;quot;. /opt/anaconda2/etc/profile.d/conda.sh&amp;quot; &amp;gt;&amp;gt; ~/.bashrc&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Then source your .bashrc file:&lt;br /&gt;
&amp;lt;code&amp;gt;source ~/.bashrc&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To load the python2 environment run &amp;lt;code&amp;gt;conda activate&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To use python3 you can activate the py36 environment:&lt;br /&gt;
&amp;lt;code&amp;gt;conda activate py36&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You can add either line to your .bashrc file to automatically load the desired environment when you log in.&lt;br /&gt;
&lt;br /&gt;
==== Local python install (managed by individual user) ====&lt;br /&gt;
&lt;br /&gt;
See [[Installing local software]] for more information.&lt;br /&gt;
&lt;br /&gt;
== 4. Running compute jobs ==&lt;br /&gt;
&lt;br /&gt;
DeepSense has two different methods of submitting compute jobs.&lt;br /&gt;
&lt;br /&gt;
=== 4.1 Load Sharing Facility (LSF) ===&lt;br /&gt;
&lt;br /&gt;
LSF is a set of command line tools for submitting compute jobs. You may be familiar with other similar software such as Sun Grid Engine or SLURM.&lt;br /&gt;
&lt;br /&gt;
LSF jobs are submitted using the &amp;lt;code&amp;gt;bsub&amp;lt;/code&amp;gt; command.&lt;br /&gt;
&lt;br /&gt;
You can examine the progress of your currently running jobs with the &amp;lt;code&amp;gt;bjobs&amp;lt;/code&amp;gt; command.&lt;br /&gt;
&lt;br /&gt;
You can examine the available compute nodes and their available resources with the &amp;lt;code&amp;gt;bhosts&amp;lt;/code&amp;gt; command.&lt;br /&gt;
&lt;br /&gt;
For more information about using LSF see [[LSF]].&lt;br /&gt;
&lt;br /&gt;
=== 4.2 Conductor with Spark (CWS) ===&lt;br /&gt;
&lt;br /&gt;
CWS is an IBM web-based graphical interface for creating and running Apache Spark compute jobs.&lt;br /&gt;
&lt;br /&gt;
To use CWS, connect to the IBM Spectrum Computing Cluster Management Console at https://ds-mgm-02.deepsense.cs.dal.ca:8443. Log in with your username and password.&lt;br /&gt;
&lt;br /&gt;
Note that currently you need to accept a self-signed web certificate. In the future this will be fixed.&lt;br /&gt;
&lt;br /&gt;
For more information about using CWS see [[CWS]].&lt;br /&gt;
&lt;br /&gt;
== 5. Deep Learning packages and other available software ==&lt;br /&gt;
&lt;br /&gt;
DeepSense has a variety of Deep Learning packages available as part of IBM Watson Machine Learning Accelerator including Tensorflow, Caffe, and PyTorch. These packages can be installed from the anaconda repository https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/&lt;br /&gt;
&lt;br /&gt;
These packages were formerly installed in /opt/DL/ on each compute node and had to be activated before use, e.g. &amp;lt;code&amp;gt;source /opt/DL/tensorflow/bin/tensorflow-activate&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Deep Learning packages are typically used on the GPU nodes but some deep learning packages can also be used on the login nodes and CPU-only nodes. This can be useful for testing your code or running CPU-bound workloads. Note that some deep learning packages may fail if run without a GPU, e.g. Caffe currently requires a GPU.&lt;br /&gt;
&lt;br /&gt;
For a brief tutorial including running Caffe and Tensorflow in a Jupyter notebook see [[Getting started with Deep Learning]].&lt;br /&gt;
&lt;br /&gt;
See [[Available software]] for the current list of installed software. If you require additional software you are welcome to install it locally in your home directory or contact DeepSense support.&lt;br /&gt;
&lt;br /&gt;
== 6. Technical and research support == &lt;br /&gt;
&lt;br /&gt;
DeepSense has a dedicated support team of research scientists ready to help you with technical questions, installing software, or even research questions.&lt;br /&gt;
&lt;br /&gt;
If you can&amp;#039;t find the answer to your question on this wiki or need more extensive help then send an email to support@deepsense.ca .&lt;br /&gt;
&lt;br /&gt;
See [[Technical support]] for more information about the support available.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/div&amp;gt; &amp;lt;!-- autonum --&amp;gt;&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=DeepSense_Documentation&amp;diff=160</id>
		<title>DeepSense Documentation</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=DeepSense_Documentation&amp;diff=160"/>
		<updated>2020-06-17T18:57:52Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: Update with information about IBM WMLA conda channel&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Note ==&lt;br /&gt;
On June 26 we will update the GPU compute nodes to a new version of IBM Watson Machine Learning Accelerator. This will change the way you access deep learning packages like Tensorflow and Pytorch. Instead of &amp;quot;activating&amp;quot; these packages, you will be able to install new versions directly in your anaconda environment.&lt;br /&gt;
&lt;br /&gt;
We are actively updating the wiki documentation to explain the new method of accessing deep learning packages. Please bear with us during these updates as some documentation may still refer to the old method of &amp;quot;activating&amp;quot; deep learning packages.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;&amp;lt;span style=&amp;quot;font-size:120%&amp;quot;&amp;gt;Cluster status&amp;lt;/span&amp;gt;&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
{|class=&amp;quot;wikitable&amp;quot; style=&amp;quot;text-align: center; color: black; font-style:bold&amp;quot;&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;Status&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|style=&amp;quot;width:20%&amp;quot; | &amp;#039;&amp;#039;&amp;#039;Planned Outage&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|style=&amp;quot;width:70%&amp;quot; | &amp;#039;&amp;#039;&amp;#039;Notes&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
|style=&amp;quot;Color:green&amp;quot; | Online&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|}&lt;br /&gt;
Legend:&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;span style=&amp;quot;color:green&amp;quot;&amp;gt;Online&amp;lt;/span&amp;gt;: cluster is running normally&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;span style=&amp;quot;color:orange&amp;quot;&amp;gt;Online&amp;lt;/span&amp;gt;: cluster has some problems and is partially available&amp;lt;br/&amp;gt;&lt;br /&gt;
&amp;lt;span style=&amp;quot;color:red&amp;quot;&amp;gt;Offline&amp;lt;/span&amp;gt;: cluster is offline and users are not able to log in&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== System Information ==&lt;br /&gt;
* [[Resources]]&lt;br /&gt;
* [[Available software]]&lt;br /&gt;
&lt;br /&gt;
== Guides ==&lt;br /&gt;
* [[ Requesting access]]&lt;br /&gt;
* [[Getting started]]&lt;br /&gt;
* [[Introduction to Linux]]&lt;br /&gt;
* [[Getting started with Deep Learning]]&lt;br /&gt;
* [[Storage policies]]&lt;br /&gt;
* [[Transferring Data]]&lt;br /&gt;
* Running jobs&lt;br /&gt;
** [[LSF|LSF batch jobs]]&lt;br /&gt;
** [[CWS|CWS web interface]]&lt;br /&gt;
* [[Installing local software]]&lt;br /&gt;
* [[Known problems]]&lt;br /&gt;
* [[Contact information|Contacting DeepSense]]&lt;br /&gt;
&lt;br /&gt;
== Documentation ==&lt;br /&gt;
* [[Media:DeepSense_Computing_Platform.pdf|DeepSense Computing Platform]]&lt;br /&gt;
&lt;br /&gt;
== Links ==&lt;br /&gt;
* [https://deepsense.ca DeepSense home page]&lt;br /&gt;
* [https://dal.ca Dalhousie University]&lt;br /&gt;
* [https://www.dal.ca/faculty/computerscience.html Faculty of Computer Science]&lt;br /&gt;
* [https://oceanfrontierinstitute.com/ Ocean Frontier Institute]&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Installing_local_software&amp;diff=159</id>
		<title>Installing local software</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Installing_local_software&amp;diff=159"/>
		<updated>2020-06-17T17:59:18Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: Update with information about IBM WMLA conda channel&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Note ==&lt;br /&gt;
&lt;br /&gt;
On June 26 we will update the GPU compute nodes to a new version of IBM Watson Machine Learning Accelerator. This will change the way you access deep learning packages like Tensorflow and Pytorch. Instead of &amp;quot;activating&amp;quot; these packages, you will be able to install new versions directly in your anaconda environment. See below for more information.&lt;br /&gt;
&lt;br /&gt;
We are actively updating the wiki documentation to explain the new method of accessing deep learning packages. Please bear with us during these updates as some documentation may still refer to the old method of &amp;quot;activating&amp;quot; deep learning packages. &lt;br /&gt;
&lt;br /&gt;
== Introduction ==&lt;br /&gt;
&lt;br /&gt;
You are welcome to install software locally in your home directory. This allows you to use specific versions of software instead of the cluster wide versions. For example you may need an older version of a specific package or a newly released version that isn&amp;#039;t yet installed on DeepSense.&lt;br /&gt;
&lt;br /&gt;
For assistance installing or compiling software contact [[Contact_Information|Technical Support]]. We will support locally installed software to the best of our ability, although we can not guarantee that all software will run on the DeepSense platform. In the event that desired software will not run, we can help you determine alternatives such as different software or using a different system for some of your computation.&lt;br /&gt;
&lt;br /&gt;
If you attempt to install compiled software (e.g. an anaconda package) but the package cannot be found then also contact [[Contact_Information|Technical Support]]. The package may not have been compiled for the DeepSense hardware architecture (ppc64le).&lt;br /&gt;
&lt;br /&gt;
If your project has specific software you want to share between members then we can create a shared directory for your group in /software/&amp;lt;project&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you have locally compiled software that you think may be useful for other DeepSense users then let us know at [[Contact_Information|Technical Support]]. We may install and support it systemwide if there is sufficient interest.&lt;br /&gt;
&lt;br /&gt;
== Installing Anaconda Python in your home directory ==&lt;br /&gt;
&lt;br /&gt;
=== Stop using systemwide anaconda ===&lt;br /&gt;
&lt;br /&gt;
If you added the system anaconda environment to your &amp;lt;code&amp;gt;.bashrc&amp;lt;/code&amp;gt; file then remove the line:&lt;br /&gt;
 . /opt/anaconda2/etc/profile.d/conda.sh&lt;br /&gt;
&lt;br /&gt;
=== Installing Anaconda with a python3 base ===&lt;br /&gt;
&lt;br /&gt;
From your home directory run:&lt;br /&gt;
 wget https://repo.continuum.io/archive/Anaconda3-5.2.0-Linux-ppc64le.sh&lt;br /&gt;
 bash Anaconda3-5.2.0-Linux-ppc64le.sh&lt;br /&gt;
&lt;br /&gt;
Note: please enter &amp;quot;yes&amp;quot; when asked if you want to add anaconda to your .bashrc file. If you do not then you will need to add the following command to your .bashrc file or run it each time before using anaconda:&lt;br /&gt;
 . ~/anaconda3/etc/profile.d/conda.sh&lt;br /&gt;
&lt;br /&gt;
After the installer ends you need to either close and restart your terminal or run:&lt;br /&gt;
 source ~/.bashrc&lt;br /&gt;
&lt;br /&gt;
=== Adding a python2 environment ===&lt;br /&gt;
The previous instruction creates a python3 base environment. To add a python2 environment:&lt;br /&gt;
 conda create -n py27 python=2.7&lt;br /&gt;
&lt;br /&gt;
Activate this environment to use python2:&lt;br /&gt;
 conda activate py27&lt;br /&gt;
&lt;br /&gt;
note: if you receive an error message then you may need to deactivate the base conda environment first:&lt;br /&gt;
 conda deactivate&lt;br /&gt;
 conda activate py27&lt;br /&gt;
&lt;br /&gt;
=== Adding a python3 environment ===&lt;br /&gt;
We recommend creating a separate python3 environment from the base environment. This makes it easier to install the specific packages required for IBM PowerAI.&lt;br /&gt;
 conda create -n py36 python=3.6&lt;br /&gt;
&lt;br /&gt;
Activate this environment to use python3:&lt;br /&gt;
 conda activate py36&lt;br /&gt;
&lt;br /&gt;
=== (New Method) IBM-AI Deep Learning Anaconda Channel ===&lt;br /&gt;
&lt;br /&gt;
To use deep learning packages like Tensorflow on DeepSense you need to add the IBM-AI anaconda channel to your list of available software channels.&lt;br /&gt;
 conda config --prepend channels https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/&lt;br /&gt;
&lt;br /&gt;
We suggest creating a new environment for each deep learning package you want to use. For example for Tensorflow:&lt;br /&gt;
 conda create -n py36_tensorflow python=3.6&lt;br /&gt;
 conda activate py36_tensorflow&lt;br /&gt;
&lt;br /&gt;
Then install the anaconda package for the software you need. Again, with Tensorflow as an example:&lt;br /&gt;
 conda install tensorflow&lt;br /&gt;
&lt;br /&gt;
You can then use tensorflow or other deep learning packages as needed by simply activating that anaconda environment. Unlike the old method, you do not need to separately activate tensorflow or other deep learning packages.&lt;br /&gt;
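&lt;br /&gt;
To confirm that a package is visible from the environment you activated, a short Python check such as the following sketch may help. It uses only the standard library; the function name &amp;lt;code&amp;gt;is_installed&amp;lt;/code&amp;gt; is hypothetical, and the module names listed are the usual import names for Tensorflow, PyTorch, and Caffe.&lt;br /&gt;
&lt;br /&gt;
```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if the top-level module can be found by this interpreter."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # sys.prefix is the root of the environment this interpreter belongs to.
    print("environment:", sys.prefix)
    for name in ("tensorflow", "torch", "caffe"):
        print(name, "installed" if is_installed(name) else "missing")
```
&lt;br /&gt;
Run it with the python from the activated environment; a package reported as missing has likely been installed in a different environment.&lt;br /&gt;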
&lt;br /&gt;
You can directly visit the IBM-AI anaconda channel URL to see a list of available software (https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/) &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== (Old Method) Install PowerAI dependencies ===&lt;br /&gt;
&lt;br /&gt;
Warning: these scripts will install, update, and downgrade some packages to the recommended packages for the current version of PowerAI. You may want to create a separate python environment to use different versions of those packages with other software.&lt;br /&gt;
&lt;br /&gt;
To use Tensorflow first install the Tensorflow dependencies:&lt;br /&gt;
 /opt/DL/tensorflow/bin/install_dependencies&lt;br /&gt;
&lt;br /&gt;
To use PyTorch first install the PyTorch dependencies:&lt;br /&gt;
 /opt/DL/pytorch/bin/install_dependencies&lt;br /&gt;
&lt;br /&gt;
The dependencies must be installed in whichever python environment you intend to use. We&amp;#039;ve encountered some problems installing the PyTorch dependencies directly in the base environment if the base conda environment has been updated to conda version 4.6.2. If you want to use PyTorch, be sure to use a conda environment with a lower version of conda.&lt;br /&gt;
&lt;br /&gt;
=== Install other dependencies ===&lt;br /&gt;
&lt;br /&gt;
If you need additional python libraries then you can install them in your python environment.&lt;br /&gt;
&lt;br /&gt;
The base package comes with several python libraries but you may want a newer version or additional libraries. Also, when you create a new environment it does not automatically get all of the same libraries as the base environment.&lt;br /&gt;
&lt;br /&gt;
For example, suppose you want to install the &amp;lt;code&amp;gt;scikit-learn&amp;lt;/code&amp;gt; package in your python3 environment.&lt;br /&gt;
&lt;br /&gt;
First you need to activate the environment:&lt;br /&gt;
 conda activate py36&lt;br /&gt;
&lt;br /&gt;
Then you install the package&lt;br /&gt;
 conda install scikit-learn&lt;br /&gt;
&lt;br /&gt;
A list of recommended packages follows in the next section.&lt;br /&gt;
&lt;br /&gt;
=== Recommended packages ===&lt;br /&gt;
&lt;br /&gt;
==== Jupyter Notebooks for deep learning ====&lt;br /&gt;
 conda install jupyter&lt;br /&gt;
&lt;br /&gt;
=== (Old Method) Testing Deep Learning packages on the login nodes or non-GPU nodes ===&lt;br /&gt;
&lt;br /&gt;
You may wish to run PowerAI software on the login nodes for testing, or on the CPU-only nodes for some workflows.&lt;br /&gt;
&lt;br /&gt;
Only the GPU nodes have graphics cards and graphics drivers installed. If you attempt to run the deep learning software like Tensorflow on the login nodes or CPU-only nodes then you will see errors like the following:&lt;br /&gt;
 ImportError: libcublas.so.9.2: cannot open shared object file: No such file or directory&lt;br /&gt;
&lt;br /&gt;
You need to load the GPU drivers with the following command:&lt;br /&gt;
 source /opt/DL/cudnn/bin/cudnn-activate&lt;br /&gt;
&lt;br /&gt;
Then you can activate the deep learning package, e.g. for Tensorflow:&lt;br /&gt;
 source /opt/DL/tensorflow/bin/tensorflow-activate&lt;br /&gt;
&lt;br /&gt;
Note that some deep learning software may be much slower or refuse to run without GPU access. Tensorflow works but Caffe does not.&lt;br /&gt;
&lt;br /&gt;
Keep in mind that you need to activate the GPU drivers and the deep learning package in each new shell before you can use the package in your code or LSF jobs.&lt;br /&gt;
&lt;br /&gt;
== Compiling Software for DeepSense ==&lt;br /&gt;
&lt;br /&gt;
DeepSense uses IBM Power8 systems running RedHat Enterprise Linux. Code must be compiled for &amp;lt;code&amp;gt;ppc64le&amp;lt;/code&amp;gt; which is PowerPC 64 bit Little Endian.&lt;br /&gt;
&lt;br /&gt;
Some software may not have binaries available for &amp;lt;code&amp;gt;ppc64le&amp;lt;/code&amp;gt; even if it does for other systems. If this happens then you (or [[Contact_Information|DeepSense support]]) will need to compile the software to run on DeepSense. Visit the web page for the software and see if the source code is available (e.g. through github). If so then follow the compilation instructions to run the software.&lt;br /&gt;
&lt;br /&gt;
You may encounter errors when attempting to compile software for &amp;lt;code&amp;gt;ppc64le&amp;lt;/code&amp;gt;. Often this occurs because of differences between &amp;lt;code&amp;gt;ppc64le&amp;lt;/code&amp;gt; and other common architectures such as x86 and x86_64. &lt;br /&gt;
&lt;br /&gt;
For example, one DeepSense user attempted to compile the rdkit software package from https://www.rdkit.org/ . This compilation failed when it attempted to use the gcc x86 optimization &amp;lt;code&amp;gt;-mpopcnt&amp;lt;/code&amp;gt;. After replacing the optimization with the &amp;lt;code&amp;gt;ppc64le&amp;lt;/code&amp;gt; equivalent &amp;lt;code&amp;gt;-mpopcntb&amp;lt;/code&amp;gt; the software compiled successfully.&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Getting_started_with_Deep_Learning&amp;diff=158</id>
		<title>Getting started with Deep Learning</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Getting_started_with_Deep_Learning&amp;diff=158"/>
		<updated>2020-04-28T18:29:08Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: /* ssh command port forwarding */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;div class=&amp;quot;noautonum&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 1. Get started with DeepSense ==&lt;br /&gt;
&lt;br /&gt;
Follow all the steps from [[Getting started]]. This tutorial assumes you can log on to the DeepSense compute platform and have a version of Anaconda python on your path. We recommend installing Anaconda in your home directory before starting this tutorial (See [[Installing local software]]).&lt;br /&gt;
&lt;br /&gt;
== 2. Download Caffe samples to your home directory ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;/opt/DL/caffe/bin/caffe-install-samples&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 3. Request an interactive session on a GPU compute node ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- TODO: still need to set up queues to fairly share GPUs --&amp;gt;&lt;br /&gt;
&amp;lt;!-- TODO: write instructions doing this with regular LSF without an interactive session. We don&amp;#039;t want to encourage everyone to use interactive sessions --&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -Is -gpu - bash&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 4. Start a python2 Jupyter notebook ==&lt;br /&gt;
&lt;br /&gt;
=== Source the Caffe deep learning toolkit ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;source /opt/DL/caffe/bin/caffe-activate&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Start the notebook ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;jupyter notebook --no-browser --ip=0.0.0.0&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Sample output ===&lt;br /&gt;
&amp;lt;pre&amp;gt;[I 13:32:23.937 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).&lt;br /&gt;
[C 13:32:23.937 NotebookApp] &lt;br /&gt;
    &lt;br /&gt;
    Copy/paste this URL into your browser when you connect for the first time,&lt;br /&gt;
    to login with a token:&lt;br /&gt;
        http://ds-cmgpu-04:8888/?token=68042f40a10b500f3747ae0a232ee209fa4bf1aa384d29ba&amp;amp;token=68042f40a10b500f3747ae0a232ee209fa4bf1aa384d29ba&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Copy the URL, host, and port ===&lt;br /&gt;
&lt;br /&gt;
Copy the URL but don’t paste it in your browser yet.&lt;br /&gt;
&lt;br /&gt;
Make a note of the compute host and port the notebook is running on (e.g. host ds-cmgpu-04 and port 8888 in this case).&lt;br /&gt;
&lt;br /&gt;
== 5. Port Forwarding ==&lt;br /&gt;
&lt;br /&gt;
In a separate terminal window on your local computer, forward a local port to the remote host.&lt;br /&gt;
&lt;br /&gt;
=== ssh command port forwarding ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt; ssh -l &amp;lt;username&amp;gt; login1.deepsense.ca -L &amp;lt;local_port&amp;gt;:&amp;lt;remote_host&amp;gt;:&amp;lt;remote_port&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
for example, &amp;lt;code&amp;gt;ssh -l user1 login1.deepsense.ca -L 8888:ds-cmgpu-04:8888&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that you may need to use a different &amp;lt;local_port&amp;gt; than 8888 if you have other web services running on your local computer. In particular, a Jupyter notebook running locally uses port 8888 by default, so your browser would connect to the local notebook instead of the cluster notebook. In this case, close your port forwarding and try again with 8889 or another unused port.&lt;br /&gt;
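&lt;br /&gt;
A minimal sketch of assembling the forwarding command from its parts, so that switching to another local port is a one-line change. The values are the illustrative ones from the example above (user1, ds-cmgpu-04, remote port 8888), and local port 8889 is assumed to be unused on your computer. The script only prints the command; it does not open a connection.&lt;br /&gt;
&lt;br /&gt;
```shell
# Illustrative values only; substitute your own username, host, and ports.
user="user1"
remote_host="ds-cmgpu-04"   # compute host printed by the notebook server
remote_port=8888            # port printed by the notebook server
local_port=8889             # assumed to be an unused port on your computer

# Assemble the command and print it so it can be checked before running it.
forward_cmd="ssh -l ${user} login1.deepsense.ca -L ${local_port}:${remote_host}:${remote_port}"
echo "${forward_cmd}"
```
&lt;br /&gt;
With these values the notebook would be reached at localhost port 8889 rather than 8888.&lt;br /&gt;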
&lt;br /&gt;
=== PuTTY port forwarding on Windows ===&lt;br /&gt;
&lt;br /&gt;
If you are using a PuTTY terminal from a Windows computer to access DeepSense then you can still forward ports.&lt;br /&gt;
&lt;br /&gt;
Before starting your session, scroll down to the option &amp;lt;code&amp;gt;Connection-&amp;gt;SSH-&amp;gt;Tunnels&amp;lt;/code&amp;gt; in the Category pane.&lt;br /&gt;
&lt;br /&gt;
Enter the &amp;lt;code&amp;gt;local_port&amp;lt;/code&amp;gt; in the &amp;lt;code&amp;gt;Source port&amp;lt;/code&amp;gt; field. For example, &amp;lt;code&amp;gt;8888&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Enter &amp;lt;code&amp;gt;&amp;lt;remote_host&amp;gt;:&amp;lt;remote_port&amp;gt;&amp;lt;/code&amp;gt; in the Destination field. For example, &amp;lt;code&amp;gt;ds-cmgpu-04:8888&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Press the &amp;lt;code&amp;gt;Add&amp;lt;/code&amp;gt; button to add the port forwarding rule to your PuTTY session.&lt;br /&gt;
&lt;br /&gt;
Finally, open the session as usual.&lt;br /&gt;
&lt;br /&gt;
== 6. Open the desired sample notebook ==&lt;br /&gt;
&lt;br /&gt;
Enter the copied URL in your web browser but change the remote host name to “localhost” before pressing enter.&lt;br /&gt;
&lt;br /&gt;
e.g. &amp;lt;code&amp;gt;http://localhost:8888/?token=68042f40a10b500f3747ae0a232ee209fa4bf1aa384d29ba&amp;amp;token=68042f40a10b500f3747ae0a232ee209fa4bf1aa384d29ba&amp;lt;/code&amp;gt;&lt;br /&gt;
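&lt;br /&gt;
If you prefer not to edit the URL by hand, the host substitution can be sketched with shell parameter expansion. The URL below uses the placeholder token TOKEN rather than a real one, and the local port is assumed to be the one chosen during port forwarding.&lt;br /&gt;
&lt;br /&gt;
```shell
# URL as printed by the notebook server (TOKEN is a placeholder, not a real token).
url="http://ds-cmgpu-04:8888/?token=TOKEN"
# Local port chosen when setting up port forwarding (assumed value).
local_port=8888

# Drop the scheme, host, and port: everything up to the first "/" after "http://".
rest="${url#http://*/}"

# Rebuild the URL so that it points at the forwarded local port.
local_url="http://localhost:${local_port}/${rest}"
echo "${local_url}"
```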
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Note&amp;#039;&amp;#039;&amp;#039;: On our Macs, this worked in Chrome but not in Safari. Unfortunately, no error was reported; Safari simply could not connect.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Be sure to enter the location of the “caffe-samples” directory in your home directory as your caffe-root in the Caffe example notebooks.&lt;br /&gt;
&lt;br /&gt;
== 7. Enjoy Deep Learning on DeepSense! ==&lt;br /&gt;
&lt;br /&gt;
== 8. More information ==&lt;br /&gt;
&lt;br /&gt;
Go to Caffe&amp;#039;s [http://caffe.berkeleyvision.org/ website] for tutorials and example programs that you can run to get started.&lt;br /&gt;
See the following links to a couple of the example programs:&lt;br /&gt;
&lt;br /&gt;
[http://caffe.berkeleyvision.org/gathered/examples/mnist.html LeNet MNIST Tutorial] - Train a neural network to recognize handwritten digits.&lt;br /&gt;
&lt;br /&gt;
[http://caffe.berkeleyvision.org/gathered/examples/cifar10.html CIFAR-10 tutorial] - Train a convolutional neural network to classify small images.&lt;br /&gt;
&lt;br /&gt;
== 9. Using another deep learning toolkit such as Tensorflow ==&lt;br /&gt;
&lt;br /&gt;
* Ensure any Anaconda dependencies are installed&lt;br /&gt;
** for tensorflow, run &amp;lt;code&amp;gt;/opt/DL/tensorflow/bin/install_dependencies&amp;lt;/code&amp;gt;&lt;br /&gt;
* Source the appropriate toolkit instead of caffe-activate&lt;br /&gt;
** e.g. &amp;lt;code&amp;gt;source /opt/DL/tensorflow/bin/tensorflow-activate&amp;lt;/code&amp;gt;&lt;br /&gt;
* Download example notebooks for the deep learning toolkit to your home directory,&lt;br /&gt;
** e.g. &amp;lt;code&amp;gt; git clone https://github.com/aymericdamien/TensorFlow-Examples.git&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The TensorFlow [https://www.tensorflow.org/ home page] has various information, including Tutorials, How-To documents, and a Getting Started guide.&lt;br /&gt;
&lt;br /&gt;
Additional tutorials and examples are available from the community, for example:&lt;br /&gt;
&lt;br /&gt;
  https://github.com/nlintz/TensorFlow-Tutorials&lt;br /&gt;
&lt;br /&gt;
  https://github.com/aymericdamien/TensorFlow-Examples&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/div&amp;gt; &amp;lt;!-- autonum --&amp;gt;&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Known_problems&amp;diff=151</id>
		<title>Known problems</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Known_problems&amp;diff=151"/>
		<updated>2020-02-20T19:48:32Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== The bhist command is not working ==&lt;br /&gt;
&lt;br /&gt;
We are aware of a problem causing the bhist command to not find the job log file necessary to print information about completed jobs. Instead, the output is always &amp;quot;no matching job found&amp;quot; even if the user specifies the &amp;quot;-a&amp;quot; option to print information about all running, completed, and failed jobs. &lt;br /&gt;
&lt;br /&gt;
While we work on solving this issue you can manually specify the location of the log file using the -f option. For example, the following command will print all running, completed, and failed jobs for the current user:&lt;br /&gt;
&lt;br /&gt;
 bhist -f /lsfshare/lsfswg/lsf/work/DeepSenseLSFCluster/logdir/lsb.events -a&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Where is my job output file? ==&lt;br /&gt;
Output may not be written to the specified file immediately when using the &amp;lt;code&amp;gt;-o &amp;lt;filename&amp;gt;&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;-oo &amp;lt;filename&amp;gt;&amp;lt;/code&amp;gt; options. There are two workarounds for this problem:&lt;br /&gt;
&lt;br /&gt;
# You can use the &amp;lt;code&amp;gt;bpeek &amp;lt;jobid&amp;gt;&amp;lt;/code&amp;gt; command to view the output of a currently running job.&lt;br /&gt;
# You can send your output to a file with the typical unix output specifications such as &amp;lt;code&amp;gt;&amp;gt; &amp;lt;filename&amp;gt;&amp;lt;/code&amp;gt; with your executed programs or by specifying output files in programs that support such options.&lt;br /&gt;
&lt;br /&gt;
== Jupyter notebooks or other programs fail trying to access a /run directory ==&lt;br /&gt;
&lt;br /&gt;
The default login shell is Bash. Make sure the line &amp;lt;code&amp;gt;unset XDG_RUNTIME_DIR&amp;lt;/code&amp;gt; is in the .bashrc file in your home directory, as it prevents a problem where some types of jobs fail when run through the LSF queue. This should be done automatically the first time you log on to DeepSense. If the line is missing, the following command appends it:&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;code&amp;gt;echo &amp;#039;unset XDG_RUNTIME_DIR&amp;#039; &amp;gt;&amp;gt; ~/.bashrc&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This line has been added to the default .bashrc file for new users but older user accounts may need this step to be done manually. &lt;br /&gt;
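&lt;br /&gt;
Running the command above more than once adds the line more than once. A minimal sketch of a repeatable variant follows; the function name &amp;lt;code&amp;gt;ensure_unset_xdg&amp;lt;/code&amp;gt; is hypothetical, and the sketch assumes the file already ends with a newline.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: append the line to the given file only if it is not already there.
ensure_unset_xdg() {
    rc_file="$1"
    line='unset XDG_RUNTIME_DIR'
    # Create the file if it does not exist yet.
    touch "$rc_file"
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string.
    if ! grep -qxF "$line" "$rc_file"; then
        # tee -a appends to the file and echoes the added line to the terminal.
        printf '%s\n' "$line" | tee -a "$rc_file"
    fi
}

# Usage (left commented out so that pasting this block changes nothing):
# ensure_unset_xdg "$HOME/.bashrc"
```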
&lt;br /&gt;
== Browser fails to connect to Jupyter Notebooks ==&lt;br /&gt;
&lt;br /&gt;
On our MacBook Pros, Jupyter notebooks work in Chrome but not in Safari. Unfortunately, no error is given; Safari just fails to connect. Please let us know if you have issues with any other browsers, and we can add that information here.&lt;br /&gt;
&lt;br /&gt;
== Cannot Install PyTorch dependencies ==&lt;br /&gt;
&lt;br /&gt;
 UnsatisfiableError: The following specifications were found to be in conflict:&lt;br /&gt;
   - powerai-pytorch-prereqs=0.4.1_12295.5cb3523&lt;br /&gt;
&lt;br /&gt;
You may see this error when attempting to install the pytorch dependencies in a local anaconda environment. This error indicates that some of your installed python packages are not compatible with the pytorch prerequisites. In particular, we see this error when conda has been updated to version 4.6 (which may sometimes happen when installing the tensorflow dependencies first).&lt;br /&gt;
&lt;br /&gt;
To resolve this problem, create a new environment with a 4.5.x conda version and then install the pytorch dependencies in that environment.&lt;br /&gt;
&lt;br /&gt;
== Cannot use Caffe on login node or compute nodes without GPUs ==&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;Cuda number of devices: -579579216&lt;br /&gt;
Current device id: -579579216&lt;br /&gt;
Current device name: &lt;br /&gt;
[==========] Running 2207 tests from 293 test cases.&lt;br /&gt;
[----------] Global test environment set-up.&lt;br /&gt;
[----------] 9 tests from AccuracyLayerTest/0, where TypeParam = caffe::CPUDevice&amp;lt;float&amp;gt;&lt;br /&gt;
[ RUN      ] AccuracyLayerTest/0.TestSetup&lt;br /&gt;
E0206 15:59:26.604874  7990 common.cpp:121] Cannot create Cublas handle. Cublas won&amp;#039;t be available.&lt;br /&gt;
E0206 15:59:26.611477  7990 common.cpp:128] Cannot create Curand generator. Curand won&amp;#039;t be available.&lt;br /&gt;
F0206 15:59:26.611616  7990 syncedmem.cpp:500] Check failed: error == cudaSuccess (30 vs. 0)  unknown error&lt;br /&gt;
*** Check failure stack trace: ***&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You may see this error when attempting to use Caffe on a node without GPUs or a GPU node without specifically requesting a GPU.&lt;br /&gt;
&lt;br /&gt;
To resolve this problem, use a GPU node and request a GPU. Caffe cannot run without an available GPU.&lt;br /&gt;
&lt;br /&gt;
== Cannot see GPUs in an LSF job ==&lt;br /&gt;
&lt;br /&gt;
 $ nvidia-smi &lt;br /&gt;
 No devices were found&lt;br /&gt;
&lt;br /&gt;
GPUs must be requested with the &amp;lt;code&amp;gt;-gpu -&amp;lt;/code&amp;gt; option to bsub. See [[LSF#GPU_Computation]] for more information.&lt;br /&gt;
&lt;br /&gt;
== Nested anaconda environments may cause strange behaviour ==&lt;br /&gt;
&lt;br /&gt;
Some users have experienced strange behaviour when activating an anaconda environment within another environment. This may include permission errors, loading incorrect versions of software, or strange conflicts when attempting to install packages. If you encounter problems with a nested anaconda environment then first try deactivating all anaconda environments and activating just the desired environment.&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=LSF&amp;diff=143</id>
		<title>LSF</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=LSF&amp;diff=143"/>
		<updated>2019-12-05T20:13:05Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[https://www.ibm.com/support/knowledgecenter/SSWRJV_10.1.0/ IBM Spectrum LSF] is the command line job submission system for submitting batch and interactive jobs on DeepSense computing hardware.&lt;br /&gt;
&lt;br /&gt;
== Test code and short computation ==&lt;br /&gt;
DeepSense has two login nodes, login1.deepsense.ca and login2.deepsense.ca . You can access these through SSH with your username and password from any computer on campus. From off campus you’ll need to use the [https://wireless.dal.ca/vpnsoftware.php Dalhousie VPN].&lt;br /&gt;
&lt;br /&gt;
The login nodes are intended for testing and compiling code. Please don’t run long or intensive computation on these nodes.&lt;br /&gt;
&lt;br /&gt;
== Job Submission ==&lt;br /&gt;
When you have a small example working with your code and are ready to run a real workload, use the LSF queue to submit your jobs to the cluster (https://www.ibm.com/support/knowledgecenter/SSWRJV_10.1.0/lsf_users_guide/batch_jobs_about.html). If you’ve used other queuing systems like slurm or Sun Grid Engine before then LSF will seem very familiar.&lt;br /&gt;
 &lt;br /&gt;
To submit a job you use the &amp;lt;code&amp;gt;bsub&amp;lt;/code&amp;gt; command (https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bsub.man_top.1.html).&lt;br /&gt;
 &lt;br /&gt;
For example, to submit a shared memory job using 20 processors and 256GB of memory for at most 24 hours you would run:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -oo &amp;lt;output_file&amp;gt; -n 20 -M 256000 -W 24:0 -R &amp;quot;span[hosts=1] rusage[mem=256000]&amp;quot; &amp;lt;executable&amp;gt; [options]&amp;lt;/code&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
For OpenMP jobs, please make sure that you use &amp;lt;code&amp;gt;OMP_NUM_THREADS&amp;lt;/code&amp;gt; to limit the number of threads your program uses, and that you set this variable in the code that will run on the server. LSF sets a variable &amp;lt;code&amp;gt;$LSB_DJOB_NUMPROC&amp;lt;/code&amp;gt; that you can use if you don’t want to hardcode &amp;lt;code&amp;gt;OMP_NUM_THREADS&amp;lt;/code&amp;gt; or set it with your own variable.&lt;br /&gt;
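&lt;br /&gt;
A minimal sketch of the start of a job script that takes the thread count from LSF. The fallback to a single thread when the script runs outside LSF is an assumption of this sketch, not a DeepSense policy.&lt;br /&gt;
&lt;br /&gt;
```shell
# Use the processor count that LSF allocated to the job.
# Outside of LSF the variable is unset, so fall back to one thread (an assumption).
export OMP_NUM_THREADS="${LSB_DJOB_NUMPROC:-1}"
echo "Running with ${OMP_NUM_THREADS} OpenMP threads"

# The OpenMP program would be started after this point.
```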
&lt;br /&gt;
=== CPU Limit ===&lt;br /&gt;
The number of requested processors is specified with the option &amp;lt;code&amp;gt;-n&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The resource request &amp;lt;code&amp;gt;-R &amp;quot;span[hosts=1]&amp;quot;&amp;lt;/code&amp;gt; requires that all processors are on the same compute host, i.e. a shared memory job.&lt;br /&gt;
&lt;br /&gt;
LSF can also be used to run compute jobs across multiple hosts such as MPI jobs. Examples will be included here at a later date.&lt;br /&gt;
&lt;br /&gt;
=== Memory Limit === &lt;br /&gt;
LSF has two different types of memory limits.&lt;br /&gt;
The scheduler memory limit &amp;lt;code&amp;gt;-R &amp;quot;rusage[mem=&amp;lt;memlimit&amp;gt;]&amp;quot;&amp;lt;/code&amp;gt; requests &amp;lt;code&amp;gt;&amp;lt;memlimit&amp;gt;&amp;lt;/code&amp;gt; amount of memory. Your job will not start until a compute node is available with that amount of memory. You are guaranteed to have this amount of memory available. If you exceed the requested amount then your job may be killed but it will only be killed if other jobs need that memory. &lt;br /&gt;
&lt;br /&gt;
The job memory limit &amp;lt;code&amp;gt;-M &amp;lt;memlimit&amp;gt;&amp;lt;/code&amp;gt; will kill your job if it exceeds the given memory limit. Note that this option does not guarantee that you will have that amount of memory available.&lt;br /&gt;
&lt;br /&gt;
The memory limits are specified in MB by default. You can also specify units, e.g. &amp;lt;code&amp;gt;-M 256GB&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;-R &amp;quot;rusage[mem=256GB]&amp;quot;&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
If you are using more than a few GB of memory then you must specify the &amp;lt;code&amp;gt;-R &amp;quot;rusage[mem=&amp;lt;memlimit&amp;gt;]&amp;quot;&amp;lt;/code&amp;gt; option or your job may be terminated. You may additionally want to use the &amp;lt;code&amp;gt;-M &amp;lt;memlimit&amp;gt;&amp;lt;/code&amp;gt; option to be sure you aren&amp;#039;t using more memory than intended.&lt;br /&gt;
&lt;br /&gt;
=== Time Limit ===&lt;br /&gt;
The runtime limit &amp;lt;code&amp;gt;-W hours:minutes&amp;lt;/code&amp;gt; specifies the maximum length of time your job is allowed to run.&lt;br /&gt;
For example &amp;lt;code&amp;gt;-W 24:0&amp;lt;/code&amp;gt; requests 24 hours of running time.&lt;br /&gt;
Your job will be terminated when the runtime limit is exceeded.&lt;br /&gt;
&lt;br /&gt;
If you do not specify a runtime limit then the default runtime limit of 168 hours (7 days) will be used.&lt;br /&gt;
The maximum possible runtime limit is currently 30 days and may vary by queue in the future.&lt;br /&gt;
&lt;br /&gt;
If there is a scheduled maintenance window announced then any job with a run time limit that could extend into the maintenance period will be listed as pending and will not run until the maintenance has concluded. Use a shorter run time limit that ends before the maintenance period to avoid this.&lt;br /&gt;
&lt;br /&gt;
=== GPU Computation ===&lt;br /&gt;
&lt;br /&gt;
To request access to a GPU use the &amp;lt;code&amp;gt;-gpu -&amp;lt;/code&amp;gt; option.&lt;br /&gt;
&lt;br /&gt;
Note the trailing dash, which specifies the default GPU arguments. The following options can be used in place of that dash.&lt;br /&gt;
&lt;br /&gt;
The default GPU arguments are &amp;lt;code&amp;gt;&amp;quot;num=1:mode=shared:mps=no:j_exclusive=no&amp;quot;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;num=num_gpus&amp;lt;/code&amp;gt; is the number of requested GPUs on each host.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;mode=shared | exclusive_process&amp;lt;/code&amp;gt; specifies the GPU mode.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;mps=yes | no&amp;lt;/code&amp;gt; specifies whether to use the Nvidia Multi-Process Service (MPS). MPS enables better sharing of GPU resources. If &amp;lt;code&amp;gt;mode=exclusive_process&amp;lt;/code&amp;gt; then mps should be set to yes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;j_exclusive=yes | no&amp;lt;/code&amp;gt; specifies whether the GPU is exclusive to this job and cannot be used by other jobs.&lt;br /&gt;
&lt;br /&gt;
By default the &amp;lt;code&amp;gt;-gpu -&amp;lt;/code&amp;gt; option will request one nonexclusive GPU. Please limit your usage of GPU resources to a reasonable number of concurrently used GPUs and use shared GPUs when possible. We may enact limits on GPU use in the future if necessary.&lt;br /&gt;
&lt;br /&gt;
See the [https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bsub.gpu.1.html bsub.gpu] documentation for more information on submitting GPU jobs.&lt;br /&gt;
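&lt;br /&gt;
As a sketch of replacing the dash with explicit arguments, the following assembles a request for two shared GPUs from the keys described above. The program name &amp;lt;code&amp;gt;my_gpu_program&amp;lt;/code&amp;gt; and the log file name are hypothetical, and the script only prints the command instead of submitting it.&lt;br /&gt;
&lt;br /&gt;
```shell
# Two shared GPUs on the host; the other keys keep their default values.
gpu_opts="num=2:mode=shared:mps=no:j_exclusive=no"

# my_gpu_program and gpu_job.log are hypothetical names used for illustration.
gpu_cmd="bsub -oo gpu_job.log -gpu \"${gpu_opts}\" my_gpu_program"

# Print the command for review; run it yourself once it looks right.
echo "${gpu_cmd}"
```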
&lt;br /&gt;
=== Input and Output files ===&lt;br /&gt;
If you do not specify an output file with &amp;lt;code&amp;gt;-o&amp;lt;/code&amp;gt; (append) or &amp;lt;code&amp;gt;-oo&amp;lt;/code&amp;gt; (overwrite) then the output will be lost. Note that LSF will prepend submission information to this file. You can use typical linux options like &amp;lt;code&amp;gt;&amp;gt; output_file2&amp;lt;/code&amp;gt; in which case the file specified with &amp;lt;code&amp;gt;-oo&amp;lt;/code&amp;gt; will just contain any errors and submission information.&lt;br /&gt;
&lt;br /&gt;
You can specify an input file with the &amp;lt;code&amp;gt;-i&amp;lt;/code&amp;gt; option or the typical linux option &amp;lt;code&amp;gt;&amp;lt; &amp;lt;input_file&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that output may not be written to the specified file immediately. You can use the &amp;lt;code&amp;gt;bpeek &amp;lt;jobid&amp;gt;&amp;lt;/code&amp;gt; command to view the output of a currently running job.&lt;br /&gt;
&lt;br /&gt;
== Advanced Job Submission ==&lt;br /&gt;
&lt;br /&gt;
=== Array Jobs ===&lt;br /&gt;
To run the same program multiple times with different input and output files you can use [https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_admin/job_arrays_lsf.html LSF Array Jobs].&lt;br /&gt;
&lt;br /&gt;
An example command in the LSF documentation is given as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt; bsub -J &amp;quot;myArray[1-1000]&amp;quot; -i &amp;quot;input.%I&amp;quot; -o &amp;quot;output.%I&amp;quot; myJob&amp;lt;/code&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
This command uses only one line to submit 1000 jobs running the script myJob with the input files &amp;lt;code&amp;gt;input.1, input.2, ... input.1000&amp;lt;/code&amp;gt; and with the output of each job placed in the files &amp;lt;code&amp;gt;output.1, output.2, ... output.1000&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Complicated Jobs ===&lt;br /&gt;
To run the same program with multiple files, possibly with different options, you can create a job submission script that iterates over the files and submits the jobs.&lt;br /&gt;
 &lt;br /&gt;
For example, suppose you have &amp;lt;code&amp;gt;programA&amp;lt;/code&amp;gt; and want to process &amp;lt;code&amp;gt;input.1, input.2, ... input.N&amp;lt;/code&amp;gt; with output in &amp;lt;code&amp;gt;output.1, output.2, ... output.N&amp;lt;/code&amp;gt;, as in the array example.&lt;br /&gt;
&lt;br /&gt;
Create a bash script &amp;lt;code&amp;gt;do_submit_programA.bash&amp;lt;/code&amp;gt; that looks something like:&lt;br /&gt;
&lt;br /&gt;
 n=&amp;lt;N&amp;gt;&lt;br /&gt;
 arguments=&amp;lt;nodes, memory, time constraints, etc&amp;gt; &lt;br /&gt;
 for ((i=1; i&amp;lt;=$n; i++)); do&lt;br /&gt;
    bsub -oo log.$i $arguments &quot;programA &amp;lt; input.$i &amp;gt; output.$i&quot;&lt;br /&gt;
 done&lt;br /&gt;
 &lt;br /&gt;
Note that everything in angle brackets here is a placeholder, not real code. For example, &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; might be read from a command line argument or hardcoded as, say, 10. The arguments will be something like &amp;lt;code&amp;gt;-n 1 -M 100MB -R &amp;quot;rusage[mem=100MB]&amp;quot;&amp;lt;/code&amp;gt; and any other desired options. You can run multiple types of jobs with complex arguments.&lt;br /&gt;
&lt;br /&gt;
You may wish to create separate directories for the log files, input files, and output files if there are more than a handful of jobs.&lt;br /&gt;
 &lt;br /&gt;
If each job requires nontrivial processing (e.g. changing into different directories for each job) then you may want to create a second script that generates the jobfiles and then use a similar kind of submit script.&lt;br /&gt;
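&lt;br /&gt;
A minimal sketch of a submit script with a dry-run mode, so that the generated commands can be inspected before anything reaches the queue. The function names &amp;lt;code&amp;gt;submit_all&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;dry_run&amp;lt;/code&amp;gt; are hypothetical, the resource options are the illustrative ones mentioned above, and this variant passes the input file with &amp;lt;code&amp;gt;-i&amp;lt;/code&amp;gt; and writes the output with &amp;lt;code&amp;gt;-oo&amp;lt;/code&amp;gt;, which means LSF adds its submission information to each output file.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: call the runner once per input file, numbered 1..n.
submit_all() {
    runner="$1"   # command that receives the bsub arguments
    n="$2"        # number of input files
    i=1
    while [ "$i" -le "$n" ]; do
        "$runner" -n 1 -M 100MB -R "rusage[mem=100MB]" \
            -i "input.$i" -oo "output.$i" programA
        i=$((i + 1))
    done
}

# Dry-run runner: print the command that would be submitted.
dry_run() { echo "bsub $*"; }

# Show the commands for three inputs without submitting anything.
submit_all dry_run 3
```
&lt;br /&gt;
Once the printed commands look right, calling &amp;lt;code&amp;gt;submit_all bsub 3&amp;lt;/code&amp;gt; would submit the jobs.&lt;br /&gt;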
&lt;br /&gt;
=== Interactive Jobs ===&lt;br /&gt;
&lt;br /&gt;
Some jobs may require user input such as testing code on a gpu system or an interactive analytics program.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -I&amp;lt;/code&amp;gt; requests an interactive job that will print its output to your terminal.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -Ip&amp;lt;/code&amp;gt; requests an interactive job with a pseudo terminal. For example, this can be used to schedule a console program that takes user input and prints output.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -Is&amp;lt;/code&amp;gt; requests an interactive job with a shell. This can be used to test code on one of the gpu nodes or for more resource intensive development than is allowed on the login nodes.&lt;br /&gt;
&lt;br /&gt;
Note that interactive jobs are subject to the same time and memory constraints as typical batch jobs. Please take care that your interactive job does not interfere with other jobs running on a node and does not attempt to use more resources than you have requested. Please do not leave interactive jobs running for long periods and do not leave interactive jobs idle when you are not using them.&lt;br /&gt;
&lt;br /&gt;
We do not currently treat interactive jobs differently from any other jobs. As DeepSense becomes more heavily utilized we may need to limit the number of interactive jobs run by a user, project, or on a given node. We may also need to limit the time or other resources used by interactive jobs.&lt;br /&gt;
&lt;br /&gt;
== Job Information ==&lt;br /&gt;
&lt;br /&gt;
=== Running Jobs ===&lt;br /&gt;
 &lt;br /&gt;
To examine currently running jobs you use the &amp;lt;code&amp;gt;bjobs&amp;lt;/code&amp;gt; command (https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bjobs.man_top.1.html)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bjobs -l&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;bjobs -l &amp;lt;jobid&amp;gt;&amp;lt;/code&amp;gt; shows additional job information including job status and resource usage.&lt;br /&gt;
&lt;br /&gt;
=== Past Jobs ===&lt;br /&gt;
&lt;br /&gt;
To examine current and past jobs use the &amp;lt;code&amp;gt;bhist&amp;lt;/code&amp;gt; command (https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bhist.1.html).&lt;br /&gt;
&lt;br /&gt;
The following options will show jobs with the specified status:&lt;br /&gt;
 -a all&lt;br /&gt;
 -d finished&lt;br /&gt;
 -e exited&lt;br /&gt;
 -p pending&lt;br /&gt;
 -r running&lt;br /&gt;
 -s suspended&lt;br /&gt;
&lt;br /&gt;
You can use options like &amp;lt;code&amp;gt;-S start_time,end_time&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;-C start_time,end_time&amp;lt;/code&amp;gt; to find jobs that were submitted or completed between the specified time intervals. These options require using the &amp;lt;code&amp;gt;-a&amp;lt;/code&amp;gt; option.&lt;br /&gt;
&lt;br /&gt;
As with bjobs, you can use the &amp;lt;code&amp;gt;-l&amp;lt;/code&amp;gt; option for additional information and can also specify a specific known jobid as the last command argument.&lt;br /&gt;
&lt;br /&gt;
=== Available Hosts ===&lt;br /&gt;
 &lt;br /&gt;
To see the available hosts and how busy they are you use the &amp;lt;code&amp;gt;bhosts&amp;lt;/code&amp;gt; command (https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bhosts.1.html)&lt;br /&gt;
&lt;br /&gt;
== LSF Command Reference == &lt;br /&gt;
&lt;br /&gt;
The complete list of LSF commands with description is available [https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_kc_cmd_ref.html here].&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Known_problems&amp;diff=142</id>
		<title>Known problems</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Known_problems&amp;diff=142"/>
		<updated>2019-12-05T20:07:47Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Where is my job output file? ==&lt;br /&gt;
Output may not be written to the specified file immediately when using the &amp;lt;code&amp;gt;-o &amp;lt;filename&amp;gt;&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;-oo &amp;lt;filename&amp;gt;&amp;lt;/code&amp;gt; options. There are two workarounds for this problem:&lt;br /&gt;
&lt;br /&gt;
# You can use the &amp;lt;code&amp;gt;bpeek &amp;lt;jobid&amp;gt;&amp;lt;/code&amp;gt; command to view the output of a currently running job.&lt;br /&gt;
# You can send your output to a file with the typical unix output specifications such as &amp;lt;code&amp;gt;&amp;gt; &amp;lt;filename&amp;gt;&amp;lt;/code&amp;gt; with your executed programs or by specifying output files in programs that support such options.&lt;br /&gt;
&lt;br /&gt;
== Jupyter notebooks or other programs fail trying to access a /run directory ==&lt;br /&gt;
&lt;br /&gt;
The default login shell is Bash. Make sure the line &amp;lt;code&amp;gt;unset XDG_RUNTIME_DIR&amp;lt;/code&amp;gt; is in the .bashrc file in your home directory, as it prevents a problem where some types of jobs fail when run through the LSF queue. This should be done automatically the first time you log on to DeepSense. If the line is missing, the following command appends it:&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;code&amp;gt;echo &amp;#039;unset XDG_RUNTIME_DIR&amp;#039; &amp;gt;&amp;gt; ~/.bashrc&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This line has been added to the default .bashrc file for new users but older user accounts may need this step to be done manually. &lt;br /&gt;
&lt;br /&gt;
== Browser fails to connect to Jupyter Notebooks ==&lt;br /&gt;
&lt;br /&gt;
On our MacBook Pros, Jupyter notebooks work in Chrome but not in Safari. Unfortunately, no error is given; Safari just fails to connect. Please let us know if you have issues with any other browsers, and we can add that information here.&lt;br /&gt;
&lt;br /&gt;
== Cannot Install PyTorch dependencies ==&lt;br /&gt;
&lt;br /&gt;
 UnsatisfiableError: The following specifications were found to be in conflict:&lt;br /&gt;
   - powerai-pytorch-prereqs=0.4.1_12295.5cb3523&lt;br /&gt;
&lt;br /&gt;
You may see this error when attempting to install the pytorch dependencies in a local anaconda environment. This error indicates that some of your installed python packages are not compatible with the pytorch prerequisites. In particular, we see this error when conda has been updated to version 4.6 (which may sometimes happen when installing the tensorflow dependencies first).&lt;br /&gt;
&lt;br /&gt;
To resolve this problem, create a new environment with a 4.5.x conda version and then install the pytorch dependencies in that environment.&lt;br /&gt;
&lt;br /&gt;
== Cannot use Caffe on login node or compute nodes without GPUs ==&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;Cuda number of devices: -579579216&lt;br /&gt;
Current device id: -579579216&lt;br /&gt;
Current device name: &lt;br /&gt;
[==========] Running 2207 tests from 293 test cases.&lt;br /&gt;
[----------] Global test environment set-up.&lt;br /&gt;
[----------] 9 tests from AccuracyLayerTest/0, where TypeParam = caffe::CPUDevice&amp;lt;float&amp;gt;&lt;br /&gt;
[ RUN      ] AccuracyLayerTest/0.TestSetup&lt;br /&gt;
E0206 15:59:26.604874  7990 common.cpp:121] Cannot create Cublas handle. Cublas won&amp;#039;t be available.&lt;br /&gt;
E0206 15:59:26.611477  7990 common.cpp:128] Cannot create Curand generator. Curand won&amp;#039;t be available.&lt;br /&gt;
F0206 15:59:26.611616  7990 syncedmem.cpp:500] Check failed: error == cudaSuccess (30 vs. 0)  unknown error&lt;br /&gt;
*** Check failure stack trace: ***&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You may see this error when attempting to use Caffe on a node without GPUs or a GPU node without specifically requesting a GPU.&lt;br /&gt;
&lt;br /&gt;
To resolve this problem, use a GPU node and request a GPU. Caffe cannot run without an available GPU.&lt;br /&gt;
&lt;br /&gt;
== Cannot see GPUs in an LSF job ==&lt;br /&gt;
&lt;br /&gt;
 $ nvidia-smi &lt;br /&gt;
 No devices were found&lt;br /&gt;
&lt;br /&gt;
GPUs must be requested with the &amp;lt;code&amp;gt;-gpu -&amp;lt;/code&amp;gt; option to bsub. See [[LSF#GPU_Computation]] for more information.&lt;br /&gt;
&lt;br /&gt;
== Nested anaconda environments may cause strange behaviour ==&lt;br /&gt;
&lt;br /&gt;
Some users have experienced strange behaviour when activating an anaconda environment within another environment. This may include permission errors, loading incorrect versions of software, or strange conflicts when attempting to install packages. If you encounter problems with a nested anaconda environment then first try deactivating all anaconda environments and activating just the desired environment.&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=LSF&amp;diff=141</id>
		<title>LSF</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=LSF&amp;diff=141"/>
		<updated>2019-12-05T20:03:02Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: /* Input and Output files */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[https://www.ibm.com/support/knowledgecenter/SSWRJV_10.1.0/ IBM Spectrum LSF] is the command line job submission system for submitting batch and interactive jobs on DeepSense computing hardware.&lt;br /&gt;
&lt;br /&gt;
== Test code and short computation ==&lt;br /&gt;
DeepSense has two login nodes, login1.deepsense.ca and login2.deepsense.ca . You can access these through SSH with your username and password from any computer on campus. From off campus you’ll need to use the [https://wireless.dal.ca/vpnsoftware.php Dalhousie VPN].&lt;br /&gt;
&lt;br /&gt;
The login nodes are intended for testing and compiling code. Please don’t run long or intensive computation on these nodes.&lt;br /&gt;
&lt;br /&gt;
== Job Submission ==&lt;br /&gt;
When you have a small example working with your code and are ready to run a real workload, use the LSF queue to submit your jobs to the cluster (https://www.ibm.com/support/knowledgecenter/SSWRJV_10.1.0/lsf_users_guide/batch_jobs_about.html). If you’ve used other queuing systems like slurm or Sun Grid Engine before then LSF will seem very familiar.&lt;br /&gt;
 &lt;br /&gt;
To submit a job you use the &amp;lt;code&amp;gt;bsub&amp;lt;/code&amp;gt; command (https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bsub.man_top.1.html).&lt;br /&gt;
 &lt;br /&gt;
For example, to submit a shared memory job using 20 processors and 256GB of memory for at most 24 hours you would run:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -oo &amp;lt;output_file&amp;gt; -n 20 -M 256000 -W 24:0 -R &amp;quot;span[hosts=1] rusage[mem=256000]&amp;quot; &amp;lt;executable&amp;gt; [options]&amp;lt;/code&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
For OpenMP jobs, please make sure that you use &amp;lt;code&amp;gt;OMP_NUM_THREADS&amp;lt;/code&amp;gt; to limit the number of threads your program uses, and that you set this variable in the script or code that runs on the server. LSF sets a variable &amp;lt;code&amp;gt;$LSB_DJOB_NUMPROC&amp;lt;/code&amp;gt; that you can use if you don’t want to hardcode &amp;lt;code&amp;gt;OMP_NUM_THREADS&amp;lt;/code&amp;gt; or set it with your own variable.&lt;br /&gt;
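&lt;br /&gt;
As a sketch of how this might look at the top of a job script, the lines below copy the LSF value into &amp;lt;code&amp;gt;OMP_NUM_THREADS&amp;lt;/code&amp;gt;. The fallback of 1 thread and the program name &amp;lt;code&amp;gt;./my_openmp_program&amp;lt;/code&amp;gt; are our assumptions for illustration.&lt;br /&gt;
&lt;br /&gt;
```shell
# LSF sets LSB_DJOB_NUMPROC inside a job; fall back to 1 thread when it is unset,
# for example when testing on a login node (the fallback value is an assumption).
OMP_NUM_THREADS="${LSB_DJOB_NUMPROC:-1}"
export OMP_NUM_THREADS
echo "OpenMP threads: $OMP_NUM_THREADS"
# Launch your program after this point; ./my_openmp_program is a placeholder name.
# ./my_openmp_program
```
&lt;br /&gt;
Setting the variable in the job script rather than in your .bashrc file keeps the thread count tied to the number of processors requested with &amp;lt;code&amp;gt;-n&amp;lt;/code&amp;gt; for that job.&lt;br /&gt;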
&lt;br /&gt;
=== CPU Limit ===&lt;br /&gt;
The number of requested processors is specified with the option &amp;lt;code&amp;gt;-n&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The resource request &amp;lt;code&amp;gt;-R &amp;quot;span[hosts=1]&amp;quot;&amp;lt;/code&amp;gt; requires that all processors are on the same compute host, i.e. a shared memory job.&lt;br /&gt;
&lt;br /&gt;
LSF can also be used to run compute jobs across multiple hosts such as MPI jobs. Examples will be included here at a later date.&lt;br /&gt;
&lt;br /&gt;
=== Memory Limit === &lt;br /&gt;
LSF has two different types of memory limits.&lt;br /&gt;
The scheduler memory limit &amp;lt;code&amp;gt;-R &amp;quot;rusage[mem=&amp;lt;memlimit&amp;gt;]&amp;quot;&amp;lt;/code&amp;gt; requests &amp;lt;code&amp;gt;&amp;lt;memlimit&amp;gt;&amp;lt;/code&amp;gt; amount of memory. Your job will not start until a compute node is available with that amount of memory. You are guaranteed to have this amount of memory available. If you exceed the requested amount then your job may be killed but it will only be killed if other jobs need that memory. &lt;br /&gt;
&lt;br /&gt;
The job memory limit &amp;lt;code&amp;gt;-M &amp;lt;memlimit&amp;gt;&amp;lt;/code&amp;gt; will kill your job if it exceeds the given memory limit. Note that this option does not guarantee that you will have that amount of memory available.&lt;br /&gt;
&lt;br /&gt;
The memory limits are specified in MB by default. You can also specify units, e.g. &amp;lt;code&amp;gt;-M 256GB&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;-R &amp;quot;rusage[mem=256GB]&amp;quot;&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
If you are using more than a few GB of memory then you must specify the &amp;lt;code&amp;gt;-R &amp;quot;rusage[mem=&amp;lt;memlimit&amp;gt;]&amp;quot;&amp;lt;/code&amp;gt; option or your job may be terminated. You may additionally want to use the &amp;lt;code&amp;gt;-M &amp;lt;memlimit&amp;gt;&amp;lt;/code&amp;gt; option to be sure you aren&amp;#039;t using more memory than intended.&lt;br /&gt;
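&lt;br /&gt;
Because the two limits are given separately, it is easy to change one and forget the other. The sketch below prints (it does not submit) a bsub command that uses the same value for both; &amp;lt;code&amp;gt;bsub_mem_cmd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;./my_program&amp;lt;/code&amp;gt; are hypothetical names.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: print, but do not run, a bsub command whose -M kill limit
# matches the rusage[mem=...] reservation. The first argument is the memory in MB;
# everything after it is passed through unchanged.
bsub_mem_cmd() {
    mem_mb="$1"
    shift
    printf 'bsub -M %s -R "rusage[mem=%s]" %s\n' "$mem_mb" "$mem_mb" "$*"
}

# Prints: bsub -M 256000 -R "rusage[mem=256000]" -n 20 -W 24:0 ./my_program
bsub_mem_cmd 256000 -n 20 -W 24:0 ./my_program
```
&lt;br /&gt;
Review the printed command and then paste it into your terminal to submit the job.&lt;br /&gt;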
&lt;br /&gt;
=== Time Limit ===&lt;br /&gt;
The runtime limit &amp;lt;code&amp;gt;-W hours:minutes&amp;lt;/code&amp;gt; specifies the maximum length of time your job is allowed to run.&lt;br /&gt;
For example &amp;lt;code&amp;gt;-W 24:0&amp;lt;/code&amp;gt; requests 24 hours of running time.&lt;br /&gt;
Your job will be terminated when the runtime limit is exceeded.&lt;br /&gt;
&lt;br /&gt;
If you do not specify a runtime limit then the default runtime limit of 168 hours (7 days) will be used.&lt;br /&gt;
The maximum possible runtime limit is currently 30 days and may vary by queue in the future.&lt;br /&gt;
&lt;br /&gt;
If there is a scheduled maintenance window announced then any job with a run time limit that could extend into the maintenance period will be listed as pending and will not run until the maintenance has concluded. Use a shorter run time limit that ends before the maintenance period to avoid this.&lt;br /&gt;
&lt;br /&gt;
=== GPU Computation ===&lt;br /&gt;
&lt;br /&gt;
To request access to a GPU use the &amp;lt;code&amp;gt;-gpu -&amp;lt;/code&amp;gt; options.&lt;br /&gt;
&lt;br /&gt;
Note the trailing dash, which specifies the default GPU arguments. The following options can be used in place of that dash.&lt;br /&gt;
&lt;br /&gt;
The default GPU arguments are &amp;lt;code&amp;gt;&amp;quot;num=1:mode=shared:mps=no:j_exclusive=no&amp;quot;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;num=num_gpus&amp;lt;/code&amp;gt; is the number of requested GPUs on each host.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;mode=shared | exclusive_process&amp;lt;/code&amp;gt; specifies the GPU mode.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;mps=yes | no&amp;lt;/code&amp;gt; specifies whether to use the NVIDIA Multi-Process Service (MPS). MPS enables better sharing of GPU resources. If &amp;lt;code&amp;gt;mode=exclusive_process&amp;lt;/code&amp;gt; then mps should be set to yes. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;j_exclusive=yes | no&amp;lt;/code&amp;gt; specifies whether the GPU is exclusive to this job and cannot be used by other jobs.&lt;br /&gt;
&lt;br /&gt;
By default the &amp;lt;code&amp;gt;-gpu -&amp;lt;/code&amp;gt; option will request one nonexclusive GPU. Please limit your usage of GPU resources to a reasonable number of concurrently used GPUs and use shared GPUs when possible. We may enact limits on GPU use in the future if necessary.&lt;br /&gt;
&lt;br /&gt;
See the [https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bsub.gpu.1.html bsub.gpu] documentation for more information on submitting GPU jobs.&lt;br /&gt;
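&lt;br /&gt;
The options above are joined with colons into a single string. As an illustration only, the sketch below assembles such a string and prints a bsub command without submitting it; &amp;lt;code&amp;gt;gpu_opts&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;./train.sh&amp;lt;/code&amp;gt; are hypothetical names, and the helper always leaves &amp;lt;code&amp;gt;j_exclusive=no&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: build the argument string for bsub -gpu.
# Usage: gpu_opts [num_gpus] [mode]; the defaults match the defaults listed above.
gpu_opts() {
    num="${1:-1}"
    mode="${2:-shared}"
    # As noted above, exclusive_process mode should be paired with mps=yes.
    if [ "$mode" = "exclusive_process" ]; then
        mps=yes
    else
        mps=no
    fi
    printf 'num=%s:mode=%s:mps=%s:j_exclusive=no\n' "$num" "$mode" "$mps"
}

# Print the command for two GPUs in exclusive_process mode (nothing is submitted).
echo "bsub -gpu \"$(gpu_opts 2 exclusive_process)\" ./train.sh"
```
&lt;br /&gt;
Called with no arguments, &amp;lt;code&amp;gt;gpu_opts&amp;lt;/code&amp;gt; prints the same string as the defaults, which is what the trailing dash in &amp;lt;code&amp;gt;-gpu -&amp;lt;/code&amp;gt; stands for.&lt;br /&gt;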
&lt;br /&gt;
=== Input and Output files ===&lt;br /&gt;
If you do not specify an output file with &amp;lt;code&amp;gt;-o&amp;lt;/code&amp;gt; (append) or &amp;lt;code&amp;gt;-oo&amp;lt;/code&amp;gt; (overwrite) then the output will be lost. Note that LSF will prepend submission information to this file. You can use typical linux options like &amp;lt;code&amp;gt;&amp;gt; output_file2&amp;lt;/code&amp;gt; in which case the file specified with &amp;lt;code&amp;gt;-oo&amp;lt;/code&amp;gt; will just contain any errors and submission information.&lt;br /&gt;
&lt;br /&gt;
You can specify an input file with the &amp;lt;code&amp;gt;-i&amp;lt;/code&amp;gt; option or the typical linux option &amp;lt;code&amp;gt;&amp;lt; &amp;lt;input_file&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that output may not be written to the specified file immediately. You can use the &amp;lt;code&amp;gt;bpeek &amp;lt;jobid&amp;gt;&amp;lt;/code&amp;gt; command to view the output of a currently running job.&lt;br /&gt;
&lt;br /&gt;
== Advanced Job Submission ==&lt;br /&gt;
&lt;br /&gt;
=== Array Jobs ===&lt;br /&gt;
To run the same program multiple times with different input and output files you can use [https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_admin/job_arrays_lsf.html LSF Array Jobs].&lt;br /&gt;
&lt;br /&gt;
An example command in the LSF documentation is given as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt; bsub -J &amp;quot;myArray[1-1000]&amp;quot; -i &amp;quot;input.%I&amp;quot; -o &amp;quot;output.%I&amp;quot; myJob&amp;lt;/code&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
This command uses only one line to submit 1000 jobs running the script myJob with the input files &amp;lt;code&amp;gt;input.1, input.2, ... input.1000&amp;lt;/code&amp;gt;, with the output of each job placed in the files &amp;lt;code&amp;gt;output.1, output.2, ... output.1000&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Complicated Jobs ===&lt;br /&gt;
To run the same program with multiple files, possibly with different options, you can create a job submission script that iterates over the files and submits the jobs.&lt;br /&gt;
 &lt;br /&gt;
For example, suppose you have &amp;lt;code&amp;gt;programA&amp;lt;/code&amp;gt; and want to process &amp;lt;code&amp;gt;input.1, input.2, ... input.N&amp;lt;/code&amp;gt; with output in &amp;lt;code&amp;gt;output.1, output.2, ... output.N&amp;lt;/code&amp;gt;, as in the array example.&lt;br /&gt;
&lt;br /&gt;
Create a bash script &amp;lt;code&amp;gt;do_submit_programA.bash&amp;lt;/code&amp;gt; that looks something like:&lt;br /&gt;
&lt;br /&gt;
 n=&amp;lt;N&amp;gt;&lt;br /&gt;
 arguments=&amp;lt;nodes, memory, time constraints, etc&amp;gt; &lt;br /&gt;
 for ((i=1; i&amp;lt;=$n; i++)); do&lt;br /&gt;
    bsub -oo log.$i $arguments &amp;quot;programA &amp;lt; input.$i &amp;gt; output.$i&amp;quot;&lt;br /&gt;
 done&lt;br /&gt;
 &lt;br /&gt;
Note that everything in angle brackets here is a placeholder, not real code. For example &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; might be read from a command line argument or hardcoded as, say, 10. The arguments will be something like &amp;lt;code&amp;gt;-n 1 -M 100MB -R &amp;quot;rusage[mem=100MB]&amp;quot;&amp;lt;/code&amp;gt; and any other desired options. You can run multiple types of jobs with complex arguments.&lt;br /&gt;
&lt;br /&gt;
You may wish to create separate directories for the log files, input files, and output files if there are more than a handful of jobs.&lt;br /&gt;
 &lt;br /&gt;
If each job requires nontrivial processing (e.g. changing into different directories for each job) then you may want to create a second script that generates the jobfiles and then use a similar kind of submit script.&lt;br /&gt;
&lt;br /&gt;
=== Interactive Jobs ===&lt;br /&gt;
&lt;br /&gt;
Some jobs may require user input such as testing code on a gpu system or an interactive analytics program.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -I&amp;lt;/code&amp;gt; requests an interactive job that will print its output to your terminal.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -Ip&amp;lt;/code&amp;gt; requests an interactive job with a pseudo terminal. For example, this can be used to schedule a console program that takes user input and output.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -Is&amp;lt;/code&amp;gt; requests an interactive job with a shell. This can be used to test code on one of the gpu nodes or for more resource intensive development than is allowed on the login nodes.&lt;br /&gt;
&lt;br /&gt;
Note that interactive jobs are subject to the same time and memory constraints as typical batch jobs. Please take care that your interactive job does not interfere with other jobs running on a node and does not attempt to use more resources than you have requested. Please do not leave interactive jobs running for long periods or idle when you are not using them.&lt;br /&gt;
&lt;br /&gt;
We do not currently treat interactive jobs differently from any other jobs. As DeepSense becomes more heavily utilized we may need to limit the number of interactive jobs run by a user, by a project, or on a given node. We may also need to limit the time or other resources used by interactive jobs.&lt;br /&gt;
&lt;br /&gt;
== Job Information ==&lt;br /&gt;
&lt;br /&gt;
=== Running Jobs ===&lt;br /&gt;
 &lt;br /&gt;
To examine currently running jobs you use the &amp;lt;code&amp;gt;bjobs&amp;lt;/code&amp;gt; command (https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bjobs.man_top.1.html)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bjobs -l&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;bjobs -l &amp;lt;jobid&amp;gt;&amp;lt;/code&amp;gt; shows additional job information including job status and resource usage.&lt;br /&gt;
&lt;br /&gt;
=== Past Jobs ===&lt;br /&gt;
&lt;br /&gt;
To examine current and past jobs use the &amp;lt;code&amp;gt;bhist&amp;lt;/code&amp;gt; command (https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bhist.1.html).&lt;br /&gt;
&lt;br /&gt;
The following options will show jobs with the specified status:&lt;br /&gt;
 -a all&lt;br /&gt;
 -d finished&lt;br /&gt;
 -e exited&lt;br /&gt;
 -p pending&lt;br /&gt;
 -r running&lt;br /&gt;
 -s suspended&lt;br /&gt;
&lt;br /&gt;
You can use options like &amp;lt;code&amp;gt;-S start_time,end_time&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;-C start_time,end_time&amp;lt;/code&amp;gt; to find jobs that were submitted or completed between the specified time intervals. These options require using the &amp;lt;code&amp;gt;-a&amp;lt;/code&amp;gt; option.&lt;br /&gt;
&lt;br /&gt;
As with bjobs, you can use the &amp;lt;code&amp;gt;-l&amp;lt;/code&amp;gt; option for additional information and can also specify a specific known jobid as the last command argument.&lt;br /&gt;
&lt;br /&gt;
=== Available Hosts ===&lt;br /&gt;
 &lt;br /&gt;
To see the available hosts and how busy they are you use the &amp;lt;code&amp;gt;bhosts&amp;lt;/code&amp;gt; command (https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bhosts.1.html)&lt;br /&gt;
&lt;br /&gt;
== LSF Command Reference == &lt;br /&gt;
&lt;br /&gt;
The complete list of LSF commands with description is available [https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_kc_cmd_ref.html here].&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Installing_local_software&amp;diff=140</id>
		<title>Installing local software</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Installing_local_software&amp;diff=140"/>
		<updated>2019-12-05T20:01:25Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
You are welcome to install software locally in your home directory. This allows you to use specific versions of software instead of the cluster wide versions. For example you may need an older version of a specific package or a newly released version that isn&amp;#039;t yet installed on DeepSense.&lt;br /&gt;
&lt;br /&gt;
For assistance installing or compiling software contact [[Contact_Information|Technical Support]]. We will support locally installed software to the best of our ability, although we can not guarantee that all software will run on the DeepSense platform. In the event that desired software will not run, we can help you determine alternatives such as different software or using a different system for some of your computation.&lt;br /&gt;
&lt;br /&gt;
If you attempt to install compiled software (e.g. an anaconda package) but the package cannot be found then also contact [[Contact_Information|Technical Support]]. The package may not have been compiled for the DeepSense hardware architecture (ppc64le).&lt;br /&gt;
&lt;br /&gt;
If your project has specific software you want to share between members then we can create a shared directory for your group in /software/&amp;lt;project&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you have locally compiled software that you think may be useful for other DeepSense users then let us know at [[Contact_Information|Technical Support]]. We may install and support it systemwide if there is sufficient interest.&lt;br /&gt;
&lt;br /&gt;
== Installing Anaconda Python in your home directory ==&lt;br /&gt;
&lt;br /&gt;
=== Stop using systemwide anaconda ===&lt;br /&gt;
&lt;br /&gt;
If you added the system anaconda environment to your &amp;lt;code&amp;gt;.bashrc&amp;lt;/code&amp;gt; file then remove the line:&lt;br /&gt;
 . /opt/anaconda2/etc/profile.d/conda.sh&lt;br /&gt;
&lt;br /&gt;
=== Installing Anaconda with a python3 base ===&lt;br /&gt;
&lt;br /&gt;
From your home directory run:&lt;br /&gt;
 wget https://repo.continuum.io/archive/Anaconda3-5.2.0-Linux-ppc64le.sh&lt;br /&gt;
 bash Anaconda3-5.2.0-Linux-ppc64le.sh&lt;br /&gt;
&lt;br /&gt;
Note: please enter &amp;quot;yes&amp;quot; when asked if you want to add anaconda to your .bashrc file. If you do not then you will need to add the following command to your .bashrc file or run it each time before using anaconda:&lt;br /&gt;
 . ~/anaconda3/etc/profile.d/conda.sh&lt;br /&gt;
&lt;br /&gt;
After the installer ends you need to either close and restart your terminal or run:&lt;br /&gt;
 source ~/.bashrc&lt;br /&gt;
&lt;br /&gt;
=== Adding a python2 environment ===&lt;br /&gt;
The previous instruction creates a python3 base environment. To add a python2 environment:&lt;br /&gt;
 conda create -n py27 python=2.7&lt;br /&gt;
&lt;br /&gt;
Activate this environment to use python2:&lt;br /&gt;
 conda activate py27&lt;br /&gt;
&lt;br /&gt;
Note: if you receive an error message then you may need to deactivate the base conda environment first:&lt;br /&gt;
 conda deactivate&lt;br /&gt;
 conda activate py27&lt;br /&gt;
&lt;br /&gt;
=== Adding a python3 environment ===&lt;br /&gt;
We recommend creating a separate python3 environment from the base environment. This makes it easier to install the specific packages required for IBM PowerAI.&lt;br /&gt;
 conda create -n py36 python=3.6&lt;br /&gt;
&lt;br /&gt;
Activate this environment to use python3:&lt;br /&gt;
 conda activate py36&lt;br /&gt;
&lt;br /&gt;
=== Install PowerAI dependencies ===&lt;br /&gt;
&lt;br /&gt;
Warning: these scripts will install, update, and downgrade some packages to the recommended packages for the current version of PowerAI. You may want to create a separate python environment to use different versions of those packages with other software.&lt;br /&gt;
&lt;br /&gt;
To use Tensorflow first install the Tensorflow dependencies:&lt;br /&gt;
 /opt/DL/tensorflow/bin/install_dependencies&lt;br /&gt;
&lt;br /&gt;
To use PyTorch first install the PyTorch dependencies:&lt;br /&gt;
 /opt/DL/pytorch/bin/install_dependencies&lt;br /&gt;
&lt;br /&gt;
The dependencies must be installed in whichever python environment you intend to use. We&amp;#039;ve encountered some problems installing the PyTorch dependencies directly in the base environment if the base conda environment has been updated to conda version 4.6.2. If you want to use PyTorch, be sure to use a conda environment with a lower version of conda.&lt;br /&gt;
&lt;br /&gt;
=== Install other dependencies ===&lt;br /&gt;
&lt;br /&gt;
If you need additional python libraries then you can install them in your python environment.&lt;br /&gt;
&lt;br /&gt;
The base package comes with several python libraries but you may want a newer version or additional libraries. Also, when you create a new environment it does not automatically get all of the same libraries as the base environment.&lt;br /&gt;
&lt;br /&gt;
For example, suppose you want to install the &amp;lt;code&amp;gt;scikit-learn&amp;lt;/code&amp;gt; package in your python3 environment.&lt;br /&gt;
&lt;br /&gt;
First you need to activate the environment:&lt;br /&gt;
 conda activate py36&lt;br /&gt;
&lt;br /&gt;
Then you install the package&lt;br /&gt;
 conda install scikit-learn&lt;br /&gt;
&lt;br /&gt;
A list of recommended packages follows in the next section.&lt;br /&gt;
&lt;br /&gt;
=== Recommended packages ===&lt;br /&gt;
&lt;br /&gt;
==== Jupyter Notebooks for deep learning ====&lt;br /&gt;
 conda install jupyter&lt;br /&gt;
&lt;br /&gt;
=== Testing Deep Learning packages on the login nodes or non-GPU nodes ===&lt;br /&gt;
&lt;br /&gt;
You may wish to run PowerAI software on the login nodes for testing, or on the CPU-only nodes for some workflows.&lt;br /&gt;
&lt;br /&gt;
Only the GPU nodes have graphics cards and graphics drivers installed. If you attempt to run the deep learning software like Tensorflow on the login nodes or CPU-only nodes then you will see errors like the following:&lt;br /&gt;
 ImportError: libcublas.so.9.2: cannot open shared object file: No such file or directory&lt;br /&gt;
&lt;br /&gt;
You need to load the GPU drivers with the following command:&lt;br /&gt;
 source /opt/DL/cudnn/bin/cudnn-activate&lt;br /&gt;
&lt;br /&gt;
Then you can activate the deep learning package, e.g. for Tensorflow:&lt;br /&gt;
 source /opt/DL/tensorflow/bin/tensorflow-activate&lt;br /&gt;
&lt;br /&gt;
Note that some deep learning software may be much slower or refuse to run without GPU access. Tensorflow works but Caffe does not.&lt;br /&gt;
&lt;br /&gt;
Keep in mind that you need to activate the GPU drivers and deep learning package in each new shell before you are able to use the package in your code or LSF jobs.&lt;br /&gt;
&lt;br /&gt;
== Compiling Software for DeepSense ==&lt;br /&gt;
&lt;br /&gt;
DeepSense uses IBM Power8 systems running RedHat Enterprise Linux. Code must be compiled for &amp;lt;code&amp;gt;ppc64le&amp;lt;/code&amp;gt; which is PowerPC 64 bit Little Endian.&lt;br /&gt;
&lt;br /&gt;
Some software may not have binaries available for &amp;lt;code&amp;gt;ppc64le&amp;lt;/code&amp;gt; even if it does for other systems. If this happens then you (or [[Contact_Information|DeepSense support]]) will need to compile the software to run on DeepSense. Visit the web page for the software and see if the source code is available (e.g. through github). If so then follow the compilation instructions to run the software.&lt;br /&gt;
&lt;br /&gt;
You may encounter errors when attempting to compile software for &amp;lt;code&amp;gt;ppc64le&amp;lt;/code&amp;gt;. Often this occurs because of differences between &amp;lt;code&amp;gt;ppc64le&amp;lt;/code&amp;gt; and other common architectures such as x86 and x86_64. &lt;br /&gt;
&lt;br /&gt;
For example, one DeepSense user attempted to compile the rdkit software package from https://www.rdkit.org/ . This compilation failed when it attempted to use the gcc x86 optimization &amp;lt;code&amp;gt;-mpopcnt&amp;lt;/code&amp;gt;. After replacing the optimization with the &amp;lt;code&amp;gt;ppc64le&amp;lt;/code&amp;gt; equivalent &amp;lt;code&amp;gt;-mpopcntb&amp;lt;/code&amp;gt; the software compiled successfully.&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Getting_started_with_Deep_Learning&amp;diff=139</id>
		<title>Getting started with Deep Learning</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Getting_started_with_Deep_Learning&amp;diff=139"/>
		<updated>2019-12-05T19:57:43Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: /* 1. Get started with DeepSense */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;div class=&amp;quot;noautonum&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 1. Get started with DeepSense ==&lt;br /&gt;
&lt;br /&gt;
Follow all the steps from [[Getting started]]. This tutorial assumes you can log on to the DeepSense compute platform and have a version of Anaconda python on your path. We recommend installing Anaconda in your home directory before starting this tutorial (See [[Installing local software]]).&lt;br /&gt;
&lt;br /&gt;
== 2. Download Caffe samples to your home directory ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;/opt/DL/caffe/bin/caffe-install-samples&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 3. Request an interactive session on a GPU compute node ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- TODO: still need to set up queues to fairly share GPUs --&amp;gt;&lt;br /&gt;
&amp;lt;!-- TODO: write instructions doing this with regular LSF without an interactive session. We don&amp;#039;t want to encourage everyone to use interactive sessions --&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -Is -gpu - bash&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 4. Start a python2 Jupyter notebook ==&lt;br /&gt;
&lt;br /&gt;
=== Source the Caffe deep learning toolkit ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;source /opt/DL/caffe/bin/caffe-activate&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Start the notebook ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;jupyter notebook --no-browser --ip=0.0.0.0&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Sample output ===&lt;br /&gt;
&amp;lt;pre&amp;gt;[I 13:32:23.937 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).&lt;br /&gt;
[C 13:32:23.937 NotebookApp] &lt;br /&gt;
    &lt;br /&gt;
    Copy/paste this URL into your browser when you connect for the first time,&lt;br /&gt;
    to login with a token:&lt;br /&gt;
        http://ds-cmgpu-04:8888/?token=68042f40a10b500f3747ae0a232ee209fa4bf1aa384d29ba&amp;amp;token=68042f40a10b500f3747ae0a232ee209fa4bf1aa384d29ba&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Copy the URL, host, and port ===&lt;br /&gt;
&lt;br /&gt;
Copy the URL but don’t paste it in your browser yet.&lt;br /&gt;
&lt;br /&gt;
Make a note of which compute host and port the notebook is running on (e.g. host ds-cmgpu-04 and port 8888 in this case)&lt;br /&gt;
&lt;br /&gt;
== 5. Port Forwarding ==&lt;br /&gt;
&lt;br /&gt;
In a separate terminal window from your local computer, forward your local port to the remote host.&lt;br /&gt;
&lt;br /&gt;
=== ssh command port forwarding ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt; ssh -l &amp;lt;username&amp;gt; login1.deepsense.ca -L &amp;lt;local_port&amp;gt;:&amp;lt;remote_host&amp;gt;:&amp;lt;remote_port&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For example, &amp;lt;code&amp;gt;ssh -l user1 login1.deepsense.ca -L 8888:ds-cmgpu-04:8888&amp;lt;/code&amp;gt;&lt;br /&gt;
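&lt;br /&gt;
If you prefer not to pick the host and port out of the URL by hand, the following sketch derives the forwarding command from the URL that Jupyter printed. The helper name &amp;lt;code&amp;gt;forward_cmd&amp;lt;/code&amp;gt; and the shortened token are made up for illustration, and the sketch assumes you want the same port number locally and remotely.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: print the ssh port-forwarding command for a Jupyter URL.
# Usage: forward_cmd USERNAME URL
forward_cmd() {
    user="$1"
    url="$2"
    # Strip the scheme, then everything from the first slash, leaving host:port.
    hostport=$(printf '%s\n' "$url" | sed -e 's#^http://##' -e 's#/.*$##')
    host=${hostport%%:*}
    port=${hostport##*:}
    printf 'ssh -l %s login1.deepsense.ca -L %s:%s:%s\n' "$user" "$port" "$host" "$port"
}

# Prints: ssh -l user1 login1.deepsense.ca -L 8888:ds-cmgpu-04:8888
forward_cmd user1 'http://ds-cmgpu-04:8888/?token=abc123'
```
&lt;br /&gt;
Run this on your local computer, not on DeepSense, and keep the URL in single quotes so that your shell does not interpret any special characters in it.&lt;br /&gt;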
&lt;br /&gt;
=== PuTTY port forwarding on Windows ===&lt;br /&gt;
&lt;br /&gt;
If you are using a PuTTY terminal from a Windows computer to access DeepSense then you can still forward ports.&lt;br /&gt;
&lt;br /&gt;
Before starting your session, scroll down to the option &amp;lt;code&amp;gt;Connection-&amp;gt;SSH-&amp;gt;Tunnels&amp;lt;/code&amp;gt; in the Category pane.&lt;br /&gt;
&lt;br /&gt;
Enter the &amp;lt;code&amp;gt;local_port&amp;lt;/code&amp;gt; in the &amp;lt;code&amp;gt;Source port&amp;lt;/code&amp;gt; field. For example, &amp;lt;code&amp;gt;8888&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Enter &amp;lt;code&amp;gt;&amp;lt;remote_host&amp;gt;:&amp;lt;remote_port&amp;gt;&amp;lt;/code&amp;gt; in the Destination field. For example, &amp;lt;code&amp;gt;ds-cmgpu-04:8888&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Press the &amp;lt;code&amp;gt;Add&amp;lt;/code&amp;gt; button to add the port forwarding rule to your PuTTY session.&lt;br /&gt;
&lt;br /&gt;
Finally, open the session as usual.&lt;br /&gt;
&lt;br /&gt;
== 6. Open the desired sample notebook ==&lt;br /&gt;
&lt;br /&gt;
Enter the copied URL in your web browser but change the remote host name to “localhost” before pressing enter.&lt;br /&gt;
&lt;br /&gt;
e.g. &amp;lt;code&amp;gt;http://localhost:8888/?token=68042f40a10b500f3747ae0a232ee209fa4bf1aa384d29ba&amp;amp;token=68042f40a10b500f3747ae0a232ee209fa4bf1aa384d29ba&amp;lt;/code&amp;gt;&lt;br /&gt;
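&lt;br /&gt;
The host name substitution can also be done on the command line of your local computer. This is a small sketch with a hypothetical helper name, &amp;lt;code&amp;gt;to_local_url&amp;lt;/code&amp;gt;, and a shortened made-up token; it assumes the local port is the same as the remote port.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: replace the compute host in a Jupyter URL with localhost.
# The port and everything after it are left unchanged.
to_local_url() {
    printf '%s\n' "$1" | sed -e 's#^http://[^:/]*#http://localhost#'
}

# Prints: http://localhost:8888/?token=abc123
to_local_url 'http://ds-cmgpu-04:8888/?token=abc123'
```
&lt;br /&gt;
If you forwarded a different local port, change the port number in the resulting URL to match it.&lt;br /&gt;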
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Note&amp;#039;&amp;#039;&amp;#039;: On our macs, this worked in Chrome, but not in Safari.  Unfortunately, there was no error reported, it simply could not connect.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Be sure to enter the location of the “caffe-samples” directory in your home directory as your caffe-root in the Caffe example notebooks.&lt;br /&gt;
&lt;br /&gt;
== 7. Enjoy Deep Learning on DeepSense! ==&lt;br /&gt;
&lt;br /&gt;
== 8. More information ==&lt;br /&gt;
&lt;br /&gt;
Go to Caffe&amp;#039;s [http://caffe.berkeleyvision.org/ website] for tutorials and example programs that you can run to get started.&lt;br /&gt;
See the following links to a couple of the example programs:&lt;br /&gt;
&lt;br /&gt;
[http://caffe.berkeleyvision.org/gathered/examples/mnist.html LeNet MNIST Tutorial] - Train a neural network to understand handwritten digits.&lt;br /&gt;
&lt;br /&gt;
[http://caffe.berkeleyvision.org/gathered/examples/cifar10.html CIFAR-10 tutorial] - Train a convolutional neural network to classify small images.&lt;br /&gt;
&lt;br /&gt;
== 9. Using another deep learning toolkit such as Tensorflow ==&lt;br /&gt;
&lt;br /&gt;
* Ensure any Anaconda dependencies are installed&lt;br /&gt;
** for tensorflow, run &amp;lt;code&amp;gt;/opt/DL/tensorflow/bin/install_dependencies&amp;lt;/code&amp;gt;&lt;br /&gt;
* Source the appropriate toolkit instead of caffe-activate&lt;br /&gt;
** e.g. &amp;lt;code&amp;gt;source /opt/DL/tensorflow/bin/tensorflow-activate&amp;lt;/code&amp;gt;&lt;br /&gt;
* Download example notebooks for the deep learning toolkit to your home directory,&lt;br /&gt;
** e.g. &amp;lt;code&amp;gt; git clone https://github.com/aymericdamien/TensorFlow-Examples.git&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The TensorFlow [https://www.tensorflow.org/ home page] has various information, including Tutorials, How-To documents, and a Getting Started guide.&lt;br /&gt;
&lt;br /&gt;
Additional tutorials and examples are available from the community, for example:&lt;br /&gt;
&lt;br /&gt;
  https://github.com/nlintz/TensorFlow-Tutorials&lt;br /&gt;
&lt;br /&gt;
  https://github.com/aymericdamien/TensorFlow-Examples&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/div&amp;gt; &amp;lt;!-- autonum --&amp;gt;&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Getting_started_with_Deep_Learning&amp;diff=137</id>
		<title>Getting started with Deep Learning</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Getting_started_with_Deep_Learning&amp;diff=137"/>
		<updated>2019-12-03T18:08:24Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: /* PuTTY port forwarding on Windows */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;div class=&amp;quot;noautonum&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 1. Get started with DeepSense ==&lt;br /&gt;
&lt;br /&gt;
Follow all the steps from [[Getting started]]. This tutorial assumes you can log on to the DeepSense compute platform and have a version of Anaconda python on your path.&lt;br /&gt;
&lt;br /&gt;
== 2. Download Caffe samples to your home directory ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;/opt/DL/caffe/bin/caffe-install-samples&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 3. Request an interactive session on a GPU compute node ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- TODO: still need to set up queues to fairly share GPUs --&amp;gt;&lt;br /&gt;
&amp;lt;!-- TODO: write instructions doing this with regular LSF without an interactive session. We don&amp;#039;t want to encourage everyone to use interactive sessions --&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -Is -gpu - bash&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 4. Start a python2 Jupyter notebook ==&lt;br /&gt;
&lt;br /&gt;
=== Source the Caffe deep learning toolkit ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;source /opt/DL/caffe/bin/caffe-activate&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Start the notebook ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;jupyter notebook --no-browser --ip=0.0.0.0&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Sample output ===&lt;br /&gt;
&amp;lt;pre&amp;gt;[I 13:32:23.937 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).&lt;br /&gt;
[C 13:32:23.937 NotebookApp] &lt;br /&gt;
    &lt;br /&gt;
    Copy/paste this URL into your browser when you connect for the first time,&lt;br /&gt;
    to login with a token:&lt;br /&gt;
        http://ds-cmgpu-04:8888/?token=68042f40a10b500f3747ae0a232ee209fa4bf1aa384d29ba&amp;amp;token=68042f40a10b500f3747ae0a232ee209fa4bf1aa384d29ba&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Copy the URL, host, and port ===&lt;br /&gt;
&lt;br /&gt;
Copy the URL but don’t paste it in your browser yet.&lt;br /&gt;
&lt;br /&gt;
Make a note of which compute host and port the notebook is running on (e.g. host ds-cmgpu-04 and port 8888 in this case)&lt;br /&gt;
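&lt;br /&gt;
As a rough sketch, the host and port can also be pulled out of the printed URL with standard shell tools. The helper name &amp;lt;code&amp;gt;url_host_port&amp;lt;/code&amp;gt; and the shortened token are assumptions made for illustration; they are not part of Jupyter or DeepSense.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Hypothetical helper: print "host port" for a notebook URL.
url_host_port() {
    printf '%s\n' "$1" | sed -E 's#^https?://([^:/]+):([0-9]+).*#\1 \2#'
}

# The token below is a shortened placeholder, not a real one.
url_host_port 'http://ds-cmgpu-04:8888/?token=abc123'
```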
&lt;br /&gt;
== 5. Port Forwarding ==&lt;br /&gt;
&lt;br /&gt;
In a separate terminal window on your local computer, forward a local port to the notebook port on the compute host.&lt;br /&gt;
&lt;br /&gt;
=== ssh command port forwarding ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt; ssh -l &amp;lt;username&amp;gt; login1.deepsense.ca -L &amp;lt;local_port&amp;gt;:&amp;lt;remote_host&amp;gt;:&amp;lt;remote_port&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
for example, &amp;lt;code&amp;gt;ssh -l user1 login1.deepsense.ca -L 8888:ds-cmgpu-04:8888&amp;lt;/code&amp;gt;&lt;br /&gt;
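&lt;br /&gt;
If the same forwarding is needed often, the command can be assembled from its parts. This is a minimal sketch: the helper name &amp;lt;code&amp;gt;forward_cmd&amp;lt;/code&amp;gt; is an assumption, and it only prints the command rather than running it.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Hypothetical helper: print the ssh port-forwarding command for a notebook.
# Arguments: username, compute host, remote port, optional local port.
forward_cmd() {
    user=$1
    host=$2
    remote_port=$3
    local_port=${4:-$remote_port}   # default: reuse the remote port number
    printf 'ssh -l %s login1.deepsense.ca -L %s:%s:%s\n' \
        "$user" "$local_port" "$host" "$remote_port"
}

forward_cmd user1 ds-cmgpu-04 8888
```
&lt;br /&gt;
Passing a fourth argument chooses a different local port, which helps when 8888 is already in use on your own computer; the browser URL must then use that local port.&lt;br /&gt;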
&lt;br /&gt;
=== PuTTY port forwarding on Windows ===&lt;br /&gt;
&lt;br /&gt;
If you are using a PuTTY terminal from a Windows computer to access DeepSense then you can still forward ports.&lt;br /&gt;
&lt;br /&gt;
Before starting your session, scroll down to the option &amp;lt;code&amp;gt;Connection-&amp;gt;SSH-&amp;gt;Tunnels&amp;lt;/code&amp;gt; in the Category pane.&lt;br /&gt;
&lt;br /&gt;
Enter the &amp;lt;code&amp;gt;local_port&amp;lt;/code&amp;gt; in the &amp;lt;code&amp;gt;Source port&amp;lt;/code&amp;gt; field. For example, &amp;lt;code&amp;gt;8888&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Enter &amp;lt;code&amp;gt;&amp;lt;remote_host&amp;gt;:&amp;lt;remote_port&amp;gt;&amp;lt;/code&amp;gt; in the Destination field. For example, &amp;lt;code&amp;gt;ds-cmgpu-04:8888&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Press the &amp;lt;code&amp;gt;Add&amp;lt;/code&amp;gt; button to add the port forwarding rule to your PuTTY session.&lt;br /&gt;
&lt;br /&gt;
Finally, open the session as usual.&lt;br /&gt;
&lt;br /&gt;
== 6. Open the desired sample notebook ==&lt;br /&gt;
&lt;br /&gt;
Enter the copied URL in your web browser but change the remote host name to “localhost” before pressing enter.&lt;br /&gt;
&lt;br /&gt;
e.g. &amp;lt;code&amp;gt;http://localhost:8888/?token=68042f40a10b500f3747ae0a232ee209fa4bf1aa384d29ba&amp;amp;token=68042f40a10b500f3747ae0a232ee209fa4bf1aa384d29ba&amp;lt;/code&amp;gt;&lt;br /&gt;
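&lt;br /&gt;
The host swap can also be scripted. This is a minimal sketch: the helper name &amp;lt;code&amp;gt;to_localhost&amp;lt;/code&amp;gt; and the shortened token are assumptions made for illustration. It leaves the port untouched, so it only fits the case where the local and remote ports are the same.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Hypothetical helper: swap the host in a notebook URL for localhost,
# keeping the scheme, port, path, and token unchanged.
to_localhost() {
    printf '%s\n' "$1" | sed -E 's#^(https?://)[^:/]+#\1localhost#'
}

# The token below is a shortened placeholder, not a real one.
to_localhost 'http://ds-cmgpu-04:8888/?token=abc123'
```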
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Note&amp;#039;&amp;#039;&amp;#039;: On our Macs, this worked in Chrome but not in Safari. Safari reported no error; it simply could not connect.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Be sure to enter the location of the “caffe-samples” directory in your home directory as your caffe-root in the Caffe example notebooks.&lt;br /&gt;
&lt;br /&gt;
== 7. Enjoy Deep Learning on DeepSense! ==&lt;br /&gt;
&lt;br /&gt;
== 8. More information ==&lt;br /&gt;
&lt;br /&gt;
Go to Caffe&amp;#039;s [http://caffe.berkeleyvision.org/ website] for tutorials and example programs that you can run to get started.&lt;br /&gt;
See the following links to a couple of the example programs:&lt;br /&gt;
&lt;br /&gt;
[http://caffe.berkeleyvision.org/gathered/examples/mnist.html LeNet MNIST Tutorial] - Train a neural network to understand handwritten digits.&lt;br /&gt;
&lt;br /&gt;
[http://caffe.berkeleyvision.org/gathered/examples/cifar10.html CIFAR-10 tutorial] - Train a convolutional neural network to classify small images.&lt;br /&gt;
&lt;br /&gt;
== 9. Using another deep learning toolkit such as Tensorflow ==&lt;br /&gt;
&lt;br /&gt;
* Ensure any Anaconda dependencies are installed&lt;br /&gt;
** for tensorflow, run &amp;lt;code&amp;gt;/opt/DL/tensorflow/bin/install_dependencies&amp;lt;/code&amp;gt;&lt;br /&gt;
* Source the appropriate toolkit instead of caffe-activate&lt;br /&gt;
** e.g. &amp;lt;code&amp;gt;source /opt/DL/tensorflow/bin/tensorflow-activate&amp;lt;/code&amp;gt;&lt;br /&gt;
* Download example notebooks for the deep learning toolkit to your home directory,&lt;br /&gt;
** e.g. &amp;lt;code&amp;gt; git clone https://github.com/aymericdamien/TensorFlow-Examples.git&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The TensorFlow [https://www.tensorflow.org/ home page] has various information, including Tutorials, How-To documents, and a Getting Started guide.&lt;br /&gt;
&lt;br /&gt;
Additional tutorials and examples are available from the community, for example:&lt;br /&gt;
&lt;br /&gt;
  https://github.com/nlintz/TensorFlow-Tutorials&lt;br /&gt;
&lt;br /&gt;
  https://github.com/aymericdamien/TensorFlow-Examples&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/div&amp;gt; &amp;lt;!-- autonum --&amp;gt;&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Getting_started_with_Deep_Learning&amp;diff=136</id>
		<title>Getting started with Deep Learning</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Getting_started_with_Deep_Learning&amp;diff=136"/>
		<updated>2019-12-03T16:46:20Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;div class=&amp;quot;noautonum&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 1. Get started with DeepSense ==&lt;br /&gt;
&lt;br /&gt;
Follow all the steps from [[Getting started]]. This tutorial assumes you can log on to the DeepSense compute platform and have a version of Anaconda python on your path.&lt;br /&gt;
&lt;br /&gt;
== 2. Download Caffe samples to your home directory ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;/opt/DL/caffe/bin/caffe-install-samples&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 3. Request an interactive session on a GPU compute node ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- TODO: still need to set up queues to fairly share GPUs --&amp;gt;&lt;br /&gt;
&amp;lt;!-- TODO: write instructions doing this with regular LSF without an interactive session. We don&amp;#039;t want to encourage everyone to use interactive sessions --&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -Is -gpu - bash&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 4. Start a python2 Jupyter notebook ==&lt;br /&gt;
&lt;br /&gt;
=== Source the Caffe deep learning toolkit ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;source /opt/DL/caffe/bin/caffe-activate&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Start the notebook ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;jupyter notebook --no-browser --ip=0.0.0.0&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Sample output ===&lt;br /&gt;
&amp;lt;pre&amp;gt;[I 13:32:23.937 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).&lt;br /&gt;
[C 13:32:23.937 NotebookApp] &lt;br /&gt;
    &lt;br /&gt;
    Copy/paste this URL into your browser when you connect for the first time,&lt;br /&gt;
    to login with a token:&lt;br /&gt;
        http://ds-cmgpu-04:8888/?token=68042f40a10b500f3747ae0a232ee209fa4bf1aa384d29ba&amp;amp;token=68042f40a10b500f3747ae0a232ee209fa4bf1aa384d29ba&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Copy the URL, host, and port ===&lt;br /&gt;
&lt;br /&gt;
Copy the URL but don’t paste it in your browser yet.&lt;br /&gt;
&lt;br /&gt;
Make a note of which compute host and port the notebook is running on (e.g. host ds-cmgpu-04 and port 8888 in this case)&lt;br /&gt;
&lt;br /&gt;
== 5. Port Forwarding ==&lt;br /&gt;
&lt;br /&gt;
In a separate terminal window on your local computer, forward a local port to the notebook port on the compute host.&lt;br /&gt;
&lt;br /&gt;
=== ssh command port forwarding ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt; ssh -l &amp;lt;username&amp;gt; login1.deepsense.ca -L &amp;lt;local_port&amp;gt;:&amp;lt;remote_host&amp;gt;:&amp;lt;remote_port&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
for example, &amp;lt;code&amp;gt;ssh -l user1 login1.deepsense.ca -L 8888:ds-cmgpu-04:8888&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== PuTTY port forwarding on Windows ===&lt;br /&gt;
&lt;br /&gt;
If you are using a PuTTY terminal from a Windows computer to access DeepSense then you can still forward ports.&lt;br /&gt;
&lt;br /&gt;
Before starting your session, scroll down to the option &amp;lt;code&amp;gt;Connection-&amp;gt;SSH-&amp;gt;Tunnels&amp;lt;/code&amp;gt; in the Category pane.&lt;br /&gt;
&lt;br /&gt;
Enter the &amp;lt;code&amp;gt;local_port&amp;lt;/code&amp;gt; in the &amp;lt;code&amp;gt;Source port&amp;lt;/code&amp;gt; field. For example, &amp;lt;code&amp;gt;8888&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Enter &amp;lt;code&amp;gt;&amp;lt;remote_host&amp;gt;:&amp;lt;remote_port&amp;gt;&amp;lt;/code&amp;gt; in the Destination field. For example, &amp;lt;code&amp;gt;ds-cmgpu-04:8888&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Finally, open the session as usual.&lt;br /&gt;
&lt;br /&gt;
== 6. Open the desired sample notebook ==&lt;br /&gt;
&lt;br /&gt;
Enter the copied URL in your web browser but change the remote host name to “localhost” before pressing enter.&lt;br /&gt;
&lt;br /&gt;
e.g. &amp;lt;code&amp;gt;http://localhost:8888/?token=68042f40a10b500f3747ae0a232ee209fa4bf1aa384d29ba&amp;amp;token=68042f40a10b500f3747ae0a232ee209fa4bf1aa384d29ba&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Note&amp;#039;&amp;#039;&amp;#039;: On our Macs, this worked in Chrome but not in Safari. Safari reported no error; it simply could not connect.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Be sure to enter the location of the “caffe-samples” directory in your home directory as your caffe-root in the Caffe example notebooks.&lt;br /&gt;
&lt;br /&gt;
== 7. Enjoy Deep Learning on DeepSense! ==&lt;br /&gt;
&lt;br /&gt;
== 8. More information ==&lt;br /&gt;
&lt;br /&gt;
Go to Caffe&amp;#039;s [http://caffe.berkeleyvision.org/ website] for tutorials and example programs that you can run to get started.&lt;br /&gt;
See the following links to a couple of the example programs:&lt;br /&gt;
&lt;br /&gt;
[http://caffe.berkeleyvision.org/gathered/examples/mnist.html LeNet MNIST Tutorial] - Train a neural network to understand handwritten digits.&lt;br /&gt;
&lt;br /&gt;
[http://caffe.berkeleyvision.org/gathered/examples/cifar10.html CIFAR-10 tutorial] - Train a convolutional neural network to classify small images.&lt;br /&gt;
&lt;br /&gt;
== 9. Using another deep learning toolkit such as Tensorflow ==&lt;br /&gt;
&lt;br /&gt;
* Ensure any Anaconda dependencies are installed&lt;br /&gt;
** for tensorflow, run &amp;lt;code&amp;gt;/opt/DL/tensorflow/bin/install_dependencies&amp;lt;/code&amp;gt;&lt;br /&gt;
* Source the appropriate toolkit instead of caffe-activate&lt;br /&gt;
** e.g. &amp;lt;code&amp;gt;source /opt/DL/tensorflow/bin/tensorflow-activate&amp;lt;/code&amp;gt;&lt;br /&gt;
* Download example notebooks for the deep learning toolkit to your home directory,&lt;br /&gt;
** e.g. &amp;lt;code&amp;gt; git clone https://github.com/aymericdamien/TensorFlow-Examples.git&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The TensorFlow [https://www.tensorflow.org/ home page] has various information, including Tutorials, How-To documents, and a Getting Started guide.&lt;br /&gt;
&lt;br /&gt;
Additional tutorials and examples are available from the community, for example:&lt;br /&gt;
&lt;br /&gt;
  https://github.com/nlintz/TensorFlow-Tutorials&lt;br /&gt;
&lt;br /&gt;
  https://github.com/aymericdamien/TensorFlow-Examples&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/div&amp;gt; &amp;lt;!-- autonum --&amp;gt;&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Installing_local_software&amp;diff=135</id>
		<title>Installing local software</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Installing_local_software&amp;diff=135"/>
		<updated>2019-11-07T17:42:56Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: change instructions to use anaconda3 as the base environment&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
You are welcome to install software locally in your home directory. This allows you to use specific versions of software instead of the cluster wide versions. For example you may need an older version of a specific package or a newly released version that isn&amp;#039;t yet installed on DeepSense.&lt;br /&gt;
&lt;br /&gt;
For assistance installing or compiling software, contact [[Contact_Information|Technical Support]]. We will support locally installed software to the best of our ability, although we cannot guarantee that all software will run on the DeepSense platform. If the software you want will not run, we can help you find alternatives, such as different software or a different system for some of your computation.&lt;br /&gt;
&lt;br /&gt;
If you attempt to install compiled software (e.g. an anaconda package) but the package cannot be found then also contact [[Contact_Information|Technical Support]]. The package may not have been compiled for the DeepSense hardware architecture (ppc64le).&lt;br /&gt;
&lt;br /&gt;
If your project has software that you want to share among its members, we can create a shared directory for your group in /software/&amp;lt;project&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
If you have locally compiled software that you think may be useful for other DeepSense users then let us know at [[Contact_Information|Technical Support]]. We may install and support it systemwide if there is sufficient interest.&lt;br /&gt;
&lt;br /&gt;
== Installing Anaconda Python in your home directory ==&lt;br /&gt;
&lt;br /&gt;
=== Stop using systemwide anaconda ===&lt;br /&gt;
&lt;br /&gt;
If you added the system anaconda environment to your &amp;lt;code&amp;gt;.bashrc&amp;lt;/code&amp;gt; file then remove the line:&lt;br /&gt;
 . /opt/anaconda2/etc/profile.d/conda.sh&lt;br /&gt;
&lt;br /&gt;
=== Installing Anaconda with a python3 base ===&lt;br /&gt;
&lt;br /&gt;
From your home directory run:&lt;br /&gt;
 wget https://repo.continuum.io/archive/Anaconda3-5.2.0-Linux-ppc64le.sh&lt;br /&gt;
 bash Anaconda3-5.2.0-Linux-ppc64le.sh&lt;br /&gt;
&lt;br /&gt;
Note: please enter &amp;quot;yes&amp;quot; when asked if you want to add anaconda to your .bashrc file. If you do not then you will need to add the following command to your .bashrc file or run it each time before using anaconda:&lt;br /&gt;
 . ~/anaconda3/etc/profile.d/conda.sh&lt;br /&gt;
&lt;br /&gt;
After the installer ends you need to either close and restart your terminal or run:&lt;br /&gt;
 source ~/.bashrc&lt;br /&gt;
&lt;br /&gt;
=== Adding a python2 environment ===&lt;br /&gt;
The previous instruction creates a python3 base environment. To add a python2 environment:&lt;br /&gt;
 conda create -n py27 python=2.7&lt;br /&gt;
&lt;br /&gt;
Activate this environment to use python2:&lt;br /&gt;
 conda activate py27&lt;br /&gt;
&lt;br /&gt;
Note: if you receive an error message, you may need to deactivate the base conda environment first:&lt;br /&gt;
 conda deactivate&lt;br /&gt;
 conda activate py27&lt;br /&gt;
&lt;br /&gt;
=== Adding a python3 environment ===&lt;br /&gt;
We recommend creating a python3 environment that is separate from the base environment. This makes it easier to install the specific packages required for IBM PowerAI.&lt;br /&gt;
 conda create -n py36 python=3.6&lt;br /&gt;
&lt;br /&gt;
Activate this environment to use python3:&lt;br /&gt;
 conda activate py36&lt;br /&gt;
&lt;br /&gt;
=== Install PowerAI dependencies ===&lt;br /&gt;
&lt;br /&gt;
Warning: these scripts will install, update, and downgrade some packages to the recommended packages for the current version of PowerAI. You may want to create a separate python environment to use different versions of those packages with other software.&lt;br /&gt;
&lt;br /&gt;
To use Tensorflow first install the Tensorflow dependencies:&lt;br /&gt;
 /opt/DL/tensorflow/bin/install_dependencies&lt;br /&gt;
&lt;br /&gt;
To use PyTorch first install the PyTorch dependencies:&lt;br /&gt;
 /opt/DL/pytorch/bin/install_dependencies&lt;br /&gt;
&lt;br /&gt;
The dependencies must be installed in whichever python environment you intend to use. We&amp;#039;ve encountered some problems installing the PyTorch dependencies directly in the base environment if the base conda environment has been updated to conda version 4.6.2. If you want to use PyTorch, be sure to use a conda environment with a lower version of conda.&lt;br /&gt;
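&lt;br /&gt;
Because the installers act on whichever environment is active, it can help to check the active environment first. The following is only a sketch: &amp;lt;code&amp;gt;require_env&amp;lt;/code&amp;gt; is a hypothetical helper, and it relies on conda setting the &amp;lt;code&amp;gt;CONDA_DEFAULT_ENV&amp;lt;/code&amp;gt; variable when an environment is activated.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Hypothetical guard: succeed only if the named conda environment is active.
# conda sets CONDA_DEFAULT_ENV to the active environment's name.
require_env() {
    if [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]; then
        return 0
    fi
    echo "expected conda environment '$1', found '${CONDA_DEFAULT_ENV:-none}'"
    return 1
}

# Example: only suggest running the installer when py36 is active.
if require_env py36; then
    echo "py36 is active; safe to run /opt/DL/tensorflow/bin/install_dependencies"
fi
```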
&lt;br /&gt;
=== Install other dependencies ===&lt;br /&gt;
&lt;br /&gt;
If you need additional python libraries then you can install them in your python environment.&lt;br /&gt;
&lt;br /&gt;
The base package comes with several python libraries but you may want a newer version or additional libraries. Also, when you create a new environment it does not automatically get all of the same libraries as the base environment.&lt;br /&gt;
&lt;br /&gt;
For example, suppose you want to install the &amp;lt;code&amp;gt;scikit-learn&amp;lt;/code&amp;gt; package in your python3 environment.&lt;br /&gt;
&lt;br /&gt;
First you need to activate the environment:&lt;br /&gt;
 conda activate py36&lt;br /&gt;
&lt;br /&gt;
Then install the package:&lt;br /&gt;
 conda install scikit-learn&lt;br /&gt;
&lt;br /&gt;
=== Testing Deep Learning packages on the login nodes or non-GPU nodes ===&lt;br /&gt;
&lt;br /&gt;
You may wish to run PowerAI software on the login nodes for testing, or on the CPU-only nodes for some workflows.&lt;br /&gt;
&lt;br /&gt;
Only the GPU nodes have graphics cards and graphics drivers installed. If you attempt to run deep learning software such as Tensorflow on the login nodes or CPU-only nodes, you will see errors like the following:&lt;br /&gt;
 ImportError: libcublas.so.9.2: cannot open shared object file: No such file or directory&lt;br /&gt;
&lt;br /&gt;
You need to load the GPU drivers with the following command:&lt;br /&gt;
 source /opt/DL/cudnn/bin/cudnn-activate&lt;br /&gt;
&lt;br /&gt;
Then you can activate the deep learning package, e.g. for Tensorflow:&lt;br /&gt;
 source /opt/DL/tensorflow/bin/tensorflow-activate&lt;br /&gt;
&lt;br /&gt;
Note that some deep learning software may be much slower or refuse to run without GPU access. Tensorflow works but Caffe does not.&lt;br /&gt;
&lt;br /&gt;
Keep in mind that you need to activate the GPU drivers and the deep learning package in each new shell before you can use the package in your code or LSF jobs.&lt;br /&gt;
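&lt;br /&gt;
One way to make the activation repeatable is to put it at the top of a job script. This is a sketch under assumptions: &amp;lt;code&amp;gt;activate_if_present&amp;lt;/code&amp;gt; is a hypothetical helper, and the two paths are the ones given in this section.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Hypothetical helper: source an activation script only if it is readable,
# and report a clear message otherwise.
activate_if_present() {
    if [ -r "$1" ]; then
        . "$1"
    else
        echo "activation script not found: $1"
        return 1
    fi
}

# Drivers first, then the deep learning package, as described above.
for script in /opt/DL/cudnn/bin/cudnn-activate \
              /opt/DL/tensorflow/bin/tensorflow-activate; do
    activate_if_present "$script" || break
done
```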
&lt;br /&gt;
== Compiling Software for DeepSense ==&lt;br /&gt;
&lt;br /&gt;
DeepSense uses IBM Power8 systems running RedHat Enterprise Linux. Code must be compiled for &amp;lt;code&amp;gt;ppc64le&amp;lt;/code&amp;gt; (64-bit little-endian PowerPC).&lt;br /&gt;
&lt;br /&gt;
Some software may not have binaries available for &amp;lt;code&amp;gt;ppc64le&amp;lt;/code&amp;gt; even if it does for other systems. If this happens then you (or [[Contact_Information|DeepSense support]]) will need to compile the software to run on DeepSense. Visit the web page for the software and see if the source code is available (e.g. through GitHub). If so, follow its compilation instructions to build the software.&lt;br /&gt;
&lt;br /&gt;
You may encounter errors when attempting to compile software for &amp;lt;code&amp;gt;ppc64le&amp;lt;/code&amp;gt;. Often this occurs because of differences between &amp;lt;code&amp;gt;ppc64le&amp;lt;/code&amp;gt; and other common architectures such as x86 and x86_64. &lt;br /&gt;
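&lt;br /&gt;
One way to handle such differences in a build script is to choose compiler flags by architecture. The sketch below is illustrative only: &amp;lt;code&amp;gt;popcnt_flag&amp;lt;/code&amp;gt; is a hypothetical helper, and the two flags are the ones named in the rdkit example in this section.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Hypothetical helper: pick a gcc population-count flag by architecture.
popcnt_flag() {
    case "$1" in
        x86_64|i?86) printf '%s\n' "-mpopcnt" ;;
        ppc64le)     printf '%s\n' "-mpopcntb" ;;
        *)           printf '\n' ;;   # unknown architecture: pass no flag
    esac
}

# uname -m reports the machine architecture of the current node.
arch=$(uname -m)
echo "architecture: $arch, popcount flag: $(popcnt_flag "$arch")"
```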
&lt;br /&gt;
For example, one DeepSense user attempted to compile the rdkit software package from https://www.rdkit.org/ . This compilation failed when it attempted to use the gcc x86 optimization &amp;lt;code&amp;gt;-mpopcnt&amp;lt;/code&amp;gt;. After replacing the optimization with the &amp;lt;code&amp;gt;ppc64le&amp;lt;/code&amp;gt; equivalent &amp;lt;code&amp;gt;-mpopcntb&amp;lt;/code&amp;gt; the software compiled successfully.&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Installing_local_software&amp;diff=126</id>
		<title>Installing local software</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Installing_local_software&amp;diff=126"/>
		<updated>2019-11-06T15:30:08Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: /* Installing Anaconda with a python2 base */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
You are welcome to install software locally in your home directory. This allows you to use specific versions of software instead of the cluster wide versions. For example you may need an older version of a specific package or a newly released version that isn&amp;#039;t yet installed on DeepSense.&lt;br /&gt;
&lt;br /&gt;
For assistance installing or compiling software, contact [[Contact_Information|Technical Support]]. We will support locally installed software to the best of our ability, although we cannot guarantee that all software will run on the DeepSense platform. If the software you want will not run, we can help you find alternatives, such as different software or a different system for some of your computation.&lt;br /&gt;
&lt;br /&gt;
If you attempt to install compiled software (e.g. an anaconda package) but the package cannot be found then also contact [[Contact_Information|Technical Support]]. The package may not have been compiled for the DeepSense hardware architecture (ppc64le).&lt;br /&gt;
&lt;br /&gt;
If your project has software that you want to share among its members, we can create a shared directory for your group in /software/&amp;lt;project&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
If you have locally compiled software that you think may be useful for other DeepSense users then let us know at [[Contact_Information|Technical Support]]. We may install and support it systemwide if there is sufficient interest.&lt;br /&gt;
&lt;br /&gt;
== Installing Anaconda Python in your home directory ==&lt;br /&gt;
&lt;br /&gt;
=== Stop using systemwide anaconda ===&lt;br /&gt;
&lt;br /&gt;
If you added the system anaconda environment to your &amp;lt;code&amp;gt;.bashrc&amp;lt;/code&amp;gt; file then remove the line:&lt;br /&gt;
 . /opt/anaconda2/etc/profile.d/conda.sh&lt;br /&gt;
&lt;br /&gt;
=== Installing Anaconda with a python2 base ===&lt;br /&gt;
&lt;br /&gt;
From your home directory run:&lt;br /&gt;
 wget https://repo.continuum.io/archive/Anaconda2-5.2.0-Linux-ppc64le.sh&lt;br /&gt;
 bash Anaconda2-5.2.0-Linux-ppc64le.sh&lt;br /&gt;
&lt;br /&gt;
Note: please enter &amp;quot;yes&amp;quot; when asked if you want to add anaconda to your .bashrc file. If you do not then you will need to add the following command to your .bashrc file or run it each time before using anaconda:&lt;br /&gt;
 . ~/anaconda2/etc/profile.d/conda.sh&lt;br /&gt;
&lt;br /&gt;
After the installer ends you need to either close and restart your terminal or run:&lt;br /&gt;
 source ~/.bashrc&lt;br /&gt;
&lt;br /&gt;
=== Adding a python3 environment ===&lt;br /&gt;
The previous instruction creates a python2 base environment. To add a python3 environment:&lt;br /&gt;
 conda create -n py36 python=3.6&lt;br /&gt;
&lt;br /&gt;
Activate this environment to use python3:&lt;br /&gt;
 conda activate py36&lt;br /&gt;
&lt;br /&gt;
Note: if you receive an error message, you may need to deactivate the base conda environment first:&lt;br /&gt;
 conda deactivate&lt;br /&gt;
 conda activate py36&lt;br /&gt;
&lt;br /&gt;
=== Adding a python2 environment ===&lt;br /&gt;
We recommend creating a python2 environment that is separate from the base environment. This makes it easier to install the specific packages required for IBM PowerAI.&lt;br /&gt;
 conda create -n py27 python=2.7&lt;br /&gt;
&lt;br /&gt;
Activate this environment to use python2:&lt;br /&gt;
 conda activate py27&lt;br /&gt;
&lt;br /&gt;
=== Install PowerAI dependencies ===&lt;br /&gt;
&lt;br /&gt;
Warning: these scripts will install, update, and downgrade some packages to the recommended packages for the current version of PowerAI. You may want to create a separate python environment to use different versions of those packages with other software.&lt;br /&gt;
&lt;br /&gt;
To use Tensorflow first install the Tensorflow dependencies:&lt;br /&gt;
 /opt/DL/tensorflow/bin/install_dependencies&lt;br /&gt;
&lt;br /&gt;
To use PyTorch first install the PyTorch dependencies:&lt;br /&gt;
 /opt/DL/pytorch/bin/install_dependencies&lt;br /&gt;
&lt;br /&gt;
The dependencies must be installed in whichever python environment you intend to use. We&amp;#039;ve encountered some problems installing the PyTorch dependencies directly in the base environment if the base conda environment has been updated to conda version 4.6.2. If you want to use PyTorch, be sure to use a conda environment with a lower version of conda.&lt;br /&gt;
&lt;br /&gt;
=== Install other dependencies ===&lt;br /&gt;
&lt;br /&gt;
If you need additional python libraries then you can install them in your python environment.&lt;br /&gt;
&lt;br /&gt;
The base package comes with several python libraries but you may want a newer version or additional libraries. Also, when you create a new environment it does not automatically get all of the same libraries as the base environment.&lt;br /&gt;
&lt;br /&gt;
For example, suppose you want to install the &amp;lt;code&amp;gt;scikit-learn&amp;lt;/code&amp;gt; package in your python3 environment.&lt;br /&gt;
&lt;br /&gt;
First you need to activate the environment:&lt;br /&gt;
 conda activate py36&lt;br /&gt;
&lt;br /&gt;
Then install the package:&lt;br /&gt;
 conda install scikit-learn&lt;br /&gt;
&lt;br /&gt;
=== Testing Deep Learning packages on the login nodes or non-GPU nodes ===&lt;br /&gt;
&lt;br /&gt;
You may wish to run PowerAI software on the login nodes for testing, or on the CPU-only nodes for some workflows.&lt;br /&gt;
&lt;br /&gt;
Only the GPU nodes have graphics cards and graphics drivers installed. If you attempt to run deep learning software such as Tensorflow on the login nodes or CPU-only nodes, you will see errors like the following:&lt;br /&gt;
 ImportError: libcublas.so.9.2: cannot open shared object file: No such file or directory&lt;br /&gt;
&lt;br /&gt;
You need to load the GPU drivers with the following command:&lt;br /&gt;
 source /opt/DL/cudnn/bin/cudnn-activate&lt;br /&gt;
&lt;br /&gt;
Then you can activate the deep learning package, e.g. for Tensorflow:&lt;br /&gt;
 source /opt/DL/tensorflow/bin/tensorflow-activate&lt;br /&gt;
&lt;br /&gt;
Note that some deep learning software may be much slower or refuse to run without GPU access. Tensorflow works but Caffe does not.&lt;br /&gt;
&lt;br /&gt;
Keep in mind that you need to activate the GPU drivers and the deep learning package in each new shell before you can use the package in your code or LSF jobs.&lt;br /&gt;
&lt;br /&gt;
== Compiling Software for DeepSense ==&lt;br /&gt;
&lt;br /&gt;
DeepSense uses IBM Power8 systems running RedHat Enterprise Linux. Code must be compiled for &amp;lt;code&amp;gt;ppc64le&amp;lt;/code&amp;gt; (64-bit little-endian PowerPC).&lt;br /&gt;
&lt;br /&gt;
Some software may not have binaries available for &amp;lt;code&amp;gt;ppc64le&amp;lt;/code&amp;gt; even if it does for other systems. If this happens then you (or [[Contact_Information|DeepSense support]]) will need to compile the software to run on DeepSense. Visit the web page for the software and see if the source code is available (e.g. through GitHub). If so, follow its compilation instructions to build the software.&lt;br /&gt;
&lt;br /&gt;
You may encounter errors when attempting to compile software for &amp;lt;code&amp;gt;ppc64le&amp;lt;/code&amp;gt;. Often this occurs because of differences between &amp;lt;code&amp;gt;ppc64le&amp;lt;/code&amp;gt; and other common architectures such as x86 and x86_64. &lt;br /&gt;
&lt;br /&gt;
For example, one DeepSense user attempted to compile the rdkit software package from https://www.rdkit.org/ . This compilation failed when it attempted to use the gcc x86 optimization &amp;lt;code&amp;gt;-mpopcnt&amp;lt;/code&amp;gt;. After replacing the optimization with the &amp;lt;code&amp;gt;ppc64le&amp;lt;/code&amp;gt; equivalent &amp;lt;code&amp;gt;-mpopcntb&amp;lt;/code&amp;gt; the software compiled successfully.&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Available_software&amp;diff=97</id>
		<title>Available software</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Available_software&amp;diff=97"/>
		<updated>2019-10-15T14:26:15Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: /* Bioinformatics Software */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Basic Software ==&lt;br /&gt;
&lt;br /&gt;
* RedHat Enterprise Linux Server release 7.5 (RHEL)&lt;br /&gt;
* gcc 4.8.5&lt;br /&gt;
* glibc 2.17&lt;br /&gt;
* R 3.5.1&lt;br /&gt;
&lt;br /&gt;
== Anaconda Python ==&lt;br /&gt;
&lt;br /&gt;
Two Anaconda python environments are installed locally on each DeepSense compute node:&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! Version&lt;br /&gt;
! Environment location&lt;br /&gt;
|-&lt;br /&gt;
|python 2.7.15&lt;br /&gt;
|/opt/anaconda2&lt;br /&gt;
|-&lt;br /&gt;
|python 3.6.8&lt;br /&gt;
|/opt/anaconda2/envs/py36&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
These python environments have many packages installed, including prerequisite libraries for running the IBM PowerAI deep learning frameworks.&lt;br /&gt;
&lt;br /&gt;
See [[Getting_started]] for instructions on using the shared anaconda python environments.&lt;br /&gt;
&lt;br /&gt;
See [[Installing local software]] for instructions on installing and managing your own python environments in your home directory.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== IBM PowerAI Deep Learning Packages ==&lt;br /&gt;
&lt;br /&gt;
[https://developer.ibm.com/linuxonpower/deep-learning-powerai/ IBM PowerAI] includes multiple open source deep learning frameworks compiled for IBM Power8 systems.&lt;br /&gt;
&lt;br /&gt;
IBM PowerAI Enterprise includes:&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Framework&lt;br /&gt;
!colspan=&amp;quot;2&amp;quot;|Location&lt;br /&gt;
|-&lt;br /&gt;
|Caffe&lt;br /&gt;
|/opt/DL/caffe&lt;br /&gt;
|-&lt;br /&gt;
|cuDNN&lt;br /&gt;
|/opt/DL/cudnn&lt;br /&gt;
|-&lt;br /&gt;
|IBM Distributed Deep Learning (DDL)&lt;br /&gt;
|/opt/DL/ddl&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
|/opt/DL/hdf5&lt;br /&gt;
|-&lt;br /&gt;
|NCCL&lt;br /&gt;
|/opt/DL/nccl&lt;br /&gt;
|/opt/DL/nccl2&lt;br /&gt;
|-&lt;br /&gt;
|openblas&lt;br /&gt;
|/opt/DL/openblas&lt;br /&gt;
|-&lt;br /&gt;
|protobuf&lt;br /&gt;
|/opt/DL/protobuf&lt;br /&gt;
|-&lt;br /&gt;
|pytorch&lt;br /&gt;
|/opt/DL/pytorch&lt;br /&gt;
|-&lt;br /&gt;
|snap-ml&lt;br /&gt;
|/opt/DL/snap-ml-local&lt;br /&gt;
|/opt/DL/snap-ml-mpi&lt;br /&gt;
|-&lt;br /&gt;
|Tensorflow 1.11 (including keras)&lt;br /&gt;
|/opt/DL/tensorflow&lt;br /&gt;
|/opt/DL/ddl-tensorflow&lt;br /&gt;
|-&lt;br /&gt;
|Tensorboard&lt;br /&gt;
|/opt/DL/tensorboard&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
To use most of these frameworks you need to activate a python2 or python3 environment and then activate the relevant framework.&lt;br /&gt;
&lt;br /&gt;
For example, to use tensorflow you can activate a python2 environment:&lt;br /&gt;
 . /opt/anaconda2/etc/profile.d/conda.sh&lt;br /&gt;
 conda activate&lt;br /&gt;
&lt;br /&gt;
and then activate tensorflow:&lt;br /&gt;
 source /opt/DL/tensorflow/bin/tensorflow-activate&lt;br /&gt;
&lt;br /&gt;
You can then &amp;lt;code&amp;gt;import tensorflow as tf&amp;lt;/code&amp;gt; in your python code.&lt;br /&gt;
&lt;br /&gt;
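The activation steps above can be combined into one script that also confirms the import works. This is a sketch that assumes the paths listed on this page exist on the node. The variable names and the fallback to a &amp;quot;skipped&amp;quot; status when the paths are missing are illustrative additions.&lt;br /&gt;
&lt;br /&gt;
```shell
# Paths taken from this page; they exist only on DeepSense compute nodes.
CONDA_SH=/opt/anaconda2/etc/profile.d/conda.sh
TF_ACTIVATE=/opt/DL/tensorflow/bin/tensorflow-activate

tf_status="skipped"
if [ -f "$CONDA_SH" ]; then
  if [ -f "$TF_ACTIVATE" ]; then
    . "$CONDA_SH"        # make the conda command available in this shell
    conda activate       # activate the shared python2 environment
    . "$TF_ACTIVATE"     # activate the PowerAI TensorFlow framework
    # Import TensorFlow and print its version to confirm the setup.
    if python -c 'import tensorflow as tf; print(tf.__version__)'; then
      tf_status="ok"
    else
      tf_status="failed"
    fi
  fi
fi
echo "TensorFlow check: $tf_status"
```
&lt;br /&gt;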
See [[Getting started with Deep Learning]] for a tutorial on using Caffe and Tensorflow on DeepSense.&lt;br /&gt;
&lt;br /&gt;
== IBM Advance Toolchain ==&lt;br /&gt;
&lt;br /&gt;
You may require newer versions of compilers such as GCC than are provided with RHEL.&lt;br /&gt;
&lt;br /&gt;
The [https://developer.ibm.com/linuxonpower/advance-toolchain IBM Advance Toolchain for Linux on Power] is a set of open source compilers, runtime libraries, and development tools.&lt;br /&gt;
&lt;br /&gt;
The IBM Advance Toolchain includes recent versions of:&lt;br /&gt;
* GNU Compiler Collection (gcc, g++ and gfortran)&lt;br /&gt;
* GNU C library (glibc)&lt;br /&gt;
* GNU Binary Utilities (binutils)&lt;br /&gt;
* Decimal Floating Point Library (libdfp)&lt;br /&gt;
* IBM Power Architecture Facilities Library (PAFLib)&lt;br /&gt;
* GNU Debugger (gdb)&lt;br /&gt;
* Python&lt;br /&gt;
* Golang&lt;br /&gt;
* Performance analysis tools (oprofile, valgrind, itrace)&lt;br /&gt;
* Multi-core exploitation libraries (TBB, Userspace RCU, SPHDE)&lt;br /&gt;
* support libraries (libhugetlbfs, Boost, zlib, etc)&lt;br /&gt;
&lt;br /&gt;
To use the Advance Toolchain, first activate environment modules:&lt;br /&gt;
 source /usr/local/Modules/init/bash&lt;br /&gt;
&lt;br /&gt;
Then load the advance toolchain:&lt;br /&gt;
 module load at12.0&lt;br /&gt;
&lt;br /&gt;
To stop using the advance toolchain, unload the environment module:&lt;br /&gt;
 module unload at12.0&lt;br /&gt;
&lt;br /&gt;
Note that software dynamically compiled with the advance toolchain will only run with the advance toolchain loaded.&lt;br /&gt;
&lt;br /&gt;
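To see which compiler is active before and after loading the toolchain, the commands above can be wrapped in a short check. This is a sketch that assumes the module name &amp;lt;code&amp;gt;at12.0&amp;lt;/code&amp;gt; and the modules init path given on this page, and that loading the module puts the toolchain's gcc first on the path. The variable names are illustrative.&lt;br /&gt;
&lt;br /&gt;
```shell
# Modules init script from this page; present only on DeepSense nodes.
MODULES_INIT=/usr/local/Modules/init/bash

at_status="skipped"
if [ -f "$MODULES_INIT" ]; then
  . "$MODULES_INIT"               # define the module command
  echo "System compiler:"
  gcc --version | head -n 1
  module load at12.0              # switch to the Advance Toolchain
  echo "Compiler with at12.0 loaded:"
  gcc --version | head -n 1
  module unload at12.0            # return to the system compiler
  at_status="checked"
fi
echo "Advance Toolchain check: $at_status"
```
&lt;br /&gt;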
== Requested Software ==&lt;br /&gt;
&lt;br /&gt;
Software packages that are requested for use by DeepSense projects will be available in several locations. Our preference is to use conda packages when available. &lt;br /&gt;
&lt;br /&gt;
=== External conda channels ===&lt;br /&gt;
If a requested software package is available for ppc64le systems from an externally maintained anaconda channel then we will simply list the channel. You can install such software into a local anaconda environment using:&lt;br /&gt;
 conda install -c &amp;lt;channel&amp;gt; &amp;lt;software&amp;gt;&lt;br /&gt;
&lt;br /&gt;
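As a concrete instance of the command template, the sketch below fills in the biobuilds channel and the bowtie2 package, both of which appear in the Bioinformatics Software table on this page. It only prints the command. The variable names are illustrative.&lt;br /&gt;
&lt;br /&gt;
```shell
# Channel and package names taken from the Bioinformatics Software table.
channel="biobuilds"
package="bowtie2"

# Build the install command from the template on this page.
install_cmd="conda install -c $channel $package"
echo "$install_cmd"

# To run it for real, activate your local anaconda environment first,
# then execute the printed command.
```
&lt;br /&gt;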
=== Internal conda packages ===&lt;br /&gt;
When possible, software compiled by DeepSense staff will be built using conda build and placed in a subdirectory of &amp;lt;code&amp;gt;/software/conda-bld/&amp;lt;/code&amp;gt;. You can install such software into a local anaconda environment using:&lt;br /&gt;
 conda install -c file:///software/conda-bld/ &amp;lt;software&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Shared software ===&lt;br /&gt;
Some software will simply be installed in its own subdirectory of &amp;lt;code&amp;gt;/software&amp;lt;/code&amp;gt;. You can run this software directly from its subdirectory.&lt;br /&gt;
&lt;br /&gt;
=== Bioinformatics Software ===&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Software&lt;br /&gt;
!Version&lt;br /&gt;
!Location&lt;br /&gt;
|-&lt;br /&gt;
|trimmomatic&lt;br /&gt;
|0.39&lt;br /&gt;
|/software/trimmomattic-0.39&lt;br /&gt;
|-&lt;br /&gt;
|cutadapt&lt;br /&gt;
|2.3&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|bowtie2&lt;br /&gt;
|&lt;br /&gt;
|biobuilds channel&lt;br /&gt;
|-&lt;br /&gt;
|LAST&lt;br /&gt;
|980&lt;br /&gt;
|/software/last-980&lt;br /&gt;
|-&lt;br /&gt;
|Burrows-Wheeler Aligner (BWA)&lt;br /&gt;
|0.7.15&lt;br /&gt;
|/software/bwa&lt;br /&gt;
|-&lt;br /&gt;
|pb-falcon&lt;br /&gt;
|2.2.0&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|MASURCA&lt;br /&gt;
|3.3.4&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|Samtools&lt;br /&gt;
|1.9&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|htslib&lt;br /&gt;
|1.9&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|bcftools&lt;br /&gt;
|1.9&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|gatk&lt;br /&gt;
|4.1.2.0&lt;br /&gt;
|/software/conda-bld/noarch/&lt;br /&gt;
|-&lt;br /&gt;
|stacks&lt;br /&gt;
|2.4&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|angsd&lt;br /&gt;
|0.923&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|vcftools&lt;br /&gt;
|&lt;br /&gt;
|biobuilds channel&lt;br /&gt;
|-&lt;br /&gt;
|plink&lt;br /&gt;
|&lt;br /&gt;
|biobuilds channel&lt;br /&gt;
|-&lt;br /&gt;
|msprime&lt;br /&gt;
|0.7.0&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|slim&lt;br /&gt;
|3.3&lt;br /&gt;
|/software/slim-3.3&lt;br /&gt;
|-&lt;br /&gt;
|DeepGSR&lt;br /&gt;
|&lt;br /&gt;
|/software/DeepGSR&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Requesting Additional Software ==&lt;br /&gt;
&lt;br /&gt;
Contact DeepSense [[contact information|support]] to have additional software installed or for help installing or compiling software locally in your home directory.&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Available_software&amp;diff=96</id>
		<title>Available software</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Available_software&amp;diff=96"/>
		<updated>2019-10-15T14:25:24Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: /* Shared software */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Basic Software ==&lt;br /&gt;
&lt;br /&gt;
* RedHat Enterprise Linux Server release 7.5 (RHEL)&lt;br /&gt;
* gcc 4.8.5&lt;br /&gt;
* glibc 2.17&lt;br /&gt;
* R 3.5.1&lt;br /&gt;
&lt;br /&gt;
== Anaconda Python ==&lt;br /&gt;
&lt;br /&gt;
Two Anaconda python environments are installed locally on each DeepSense compute node:&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! Version&lt;br /&gt;
! Environment location&lt;br /&gt;
|-&lt;br /&gt;
|python 2.7.15&lt;br /&gt;
|/opt/anaconda2&lt;br /&gt;
|-&lt;br /&gt;
|python 3.6.8&lt;br /&gt;
|/opt/anaconda2/envs/py36&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
These python environments have many packages installed, including prerequisite libraries for running the IBM PowerAI deep learning frameworks.&lt;br /&gt;
&lt;br /&gt;
See [[Getting_started]] for instructions on using the shared anaconda python environments.&lt;br /&gt;
&lt;br /&gt;
See [[Installing local software]] for instructions on installing and managing your own python environments in your home directory.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== IBM PowerAI Deep Learning Packages ==&lt;br /&gt;
&lt;br /&gt;
[https://developer.ibm.com/linuxonpower/deep-learning-powerai/ IBM PowerAI] includes multiple open source deep learning frameworks compiled for IBM Power8 systems.&lt;br /&gt;
&lt;br /&gt;
IBM PowerAI Enterprise includes:&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Framework&lt;br /&gt;
!colspan=&amp;quot;2&amp;quot;|Location&lt;br /&gt;
|-&lt;br /&gt;
|Caffe&lt;br /&gt;
|/opt/DL/caffe&lt;br /&gt;
|-&lt;br /&gt;
|cuDNN&lt;br /&gt;
|/opt/DL/cudnn&lt;br /&gt;
|-&lt;br /&gt;
|IBM Distributed Deep Learning (DDL)&lt;br /&gt;
|/opt/DL/ddl&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
|/opt/DL/hdf5&lt;br /&gt;
|-&lt;br /&gt;
|NCCL&lt;br /&gt;
|/opt/DL/nccl&lt;br /&gt;
|/opt/DL/nccl2&lt;br /&gt;
|-&lt;br /&gt;
|openblas&lt;br /&gt;
|/opt/DL/openblas&lt;br /&gt;
|-&lt;br /&gt;
|protobuf&lt;br /&gt;
|/opt/DL/protobuf&lt;br /&gt;
|-&lt;br /&gt;
|pytorch&lt;br /&gt;
|/opt/DL/pytorch&lt;br /&gt;
|-&lt;br /&gt;
|snap-ml&lt;br /&gt;
|/opt/DL/snap-ml-local&lt;br /&gt;
|/opt/DL/snap-ml-mpi&lt;br /&gt;
|-&lt;br /&gt;
|Tensorflow 1.11 (including keras)&lt;br /&gt;
|/opt/DL/tensorflow&lt;br /&gt;
|/opt/DL/ddl-tensorflow&lt;br /&gt;
|-&lt;br /&gt;
|Tensorboard&lt;br /&gt;
|/opt/DL/tensorboard&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
To use most of these frameworks you need to activate a python2 or python3 environment and then activate the relevant framework.&lt;br /&gt;
&lt;br /&gt;
For example, to use tensorflow you can activate a python2 environment:&lt;br /&gt;
 . /opt/anaconda2/etc/profile.d/conda.sh&lt;br /&gt;
 conda activate&lt;br /&gt;
&lt;br /&gt;
and then activate tensorflow:&lt;br /&gt;
 source /opt/DL/tensorflow/bin/tensorflow-activate&lt;br /&gt;
&lt;br /&gt;
You can then &amp;lt;code&amp;gt;import tensorflow as tf&amp;lt;/code&amp;gt; in your python code.&lt;br /&gt;
&lt;br /&gt;
See [[Getting started with Deep Learning]] for a tutorial on using Caffe and Tensorflow on DeepSense.&lt;br /&gt;
&lt;br /&gt;
== IBM Advance Toolchain ==&lt;br /&gt;
&lt;br /&gt;
You may require newer versions of compilers such as GCC than are provided with RHEL.&lt;br /&gt;
&lt;br /&gt;
The [https://developer.ibm.com/linuxonpower/advance-toolchain IBM Advance Toolchain for Linux on Power] is a set of open source compilers, runtime libraries, and development tools.&lt;br /&gt;
&lt;br /&gt;
The IBM Advance Toolchain includes recent versions of:&lt;br /&gt;
* GNU Compiler Collection (gcc, g++ and gfortran)&lt;br /&gt;
* GNU C library (glibc)&lt;br /&gt;
* GNU Binary Utilities (binutils)&lt;br /&gt;
* Decimal Floating Point Library (libdfp)&lt;br /&gt;
* IBM Power Architecture Facilities Library (PAFLib)&lt;br /&gt;
* GNU Debugger (gdb)&lt;br /&gt;
* Python&lt;br /&gt;
* Golang&lt;br /&gt;
* Performance analysis tools (oprofile, valgrind, itrace)&lt;br /&gt;
* Multi-core exploitation libraries (TBB, Userspace RCU, SPHDE)&lt;br /&gt;
* support libraries (libhugetlbfs, Boost, zlib, etc)&lt;br /&gt;
&lt;br /&gt;
To use the Advance Toolchain, first activate environment modules:&lt;br /&gt;
 source /usr/local/Modules/init/bash&lt;br /&gt;
&lt;br /&gt;
Then load the advance toolchain:&lt;br /&gt;
 module load at12.0&lt;br /&gt;
&lt;br /&gt;
To stop using the advance toolchain, unload the environment module:&lt;br /&gt;
 module unload at12.0&lt;br /&gt;
&lt;br /&gt;
Note that software dynamically compiled with the advance toolchain will only run with the advance toolchain loaded.&lt;br /&gt;
&lt;br /&gt;
== Bioinformatics Software ==&lt;br /&gt;
&lt;br /&gt;
Software packages that are requested for use by DeepSense projects will be available in several locations. Our preference is to use conda packages when available. &lt;br /&gt;
&lt;br /&gt;
=== External conda channels ===&lt;br /&gt;
If a requested software package is available for ppc64le systems from an externally maintained anaconda channel then we will simply list the channel. You can install such software into a local anaconda environment using:&lt;br /&gt;
 conda install -c &amp;lt;channel&amp;gt; &amp;lt;software&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Internal conda packages ===&lt;br /&gt;
When possible, software compiled by DeepSense staff will be built using conda build and placed in a subdirectory of &amp;lt;code&amp;gt;/software/conda-bld/&amp;lt;/code&amp;gt;. You can install such software into a local anaconda environment using:&lt;br /&gt;
 conda install -c file:///software/conda-bld/ &amp;lt;software&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Shared software ===&lt;br /&gt;
Some software will simply be installed in its own subdirectory of &amp;lt;code&amp;gt;/software&amp;lt;/code&amp;gt;. You can run this software directly from its subdirectory.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Software&lt;br /&gt;
!Version&lt;br /&gt;
!Location&lt;br /&gt;
|-&lt;br /&gt;
|trimmomatic&lt;br /&gt;
|0.39&lt;br /&gt;
|/software/trimmomattic-0.39&lt;br /&gt;
|-&lt;br /&gt;
|cutadapt&lt;br /&gt;
|2.3&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|bowtie2&lt;br /&gt;
|&lt;br /&gt;
|biobuilds channel&lt;br /&gt;
|-&lt;br /&gt;
|LAST&lt;br /&gt;
|980&lt;br /&gt;
|/software/last-980&lt;br /&gt;
|-&lt;br /&gt;
|Burrows-Wheeler Aligner (BWA)&lt;br /&gt;
|0.7.15&lt;br /&gt;
|/software/bwa&lt;br /&gt;
|-&lt;br /&gt;
|pb-falcon&lt;br /&gt;
|2.2.0&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|MASURCA&lt;br /&gt;
|3.3.4&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|Samtools&lt;br /&gt;
|1.9&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|htslib&lt;br /&gt;
|1.9&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|bcftools&lt;br /&gt;
|1.9&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|gatk&lt;br /&gt;
|4.1.2.0&lt;br /&gt;
|/software/conda-bld/noarch/&lt;br /&gt;
|-&lt;br /&gt;
|stacks&lt;br /&gt;
|2.4&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|angsd&lt;br /&gt;
|0.923&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|vcftools&lt;br /&gt;
|&lt;br /&gt;
|biobuilds channel&lt;br /&gt;
|-&lt;br /&gt;
|plink&lt;br /&gt;
|&lt;br /&gt;
|biobuilds channel&lt;br /&gt;
|-&lt;br /&gt;
|msprime&lt;br /&gt;
|0.7.0&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|slim&lt;br /&gt;
|3.3&lt;br /&gt;
|/software/slim-3.3&lt;br /&gt;
|-&lt;br /&gt;
|DeepGSR&lt;br /&gt;
|&lt;br /&gt;
|/software/DeepGSR&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Requesting Additional Software ==&lt;br /&gt;
&lt;br /&gt;
Contact DeepSense [[contact information|support]] to have additional software installed or for help installing or compiling software locally in your home directory.&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Available_software&amp;diff=95</id>
		<title>Available software</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Available_software&amp;diff=95"/>
		<updated>2019-10-15T14:22:11Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: /* Shared software */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Basic Software ==&lt;br /&gt;
&lt;br /&gt;
* RedHat Enterprise Linux Server release 7.5 (RHEL)&lt;br /&gt;
* gcc 4.8.5&lt;br /&gt;
* glibc 2.17&lt;br /&gt;
* R 3.5.1&lt;br /&gt;
&lt;br /&gt;
== Anaconda Python ==&lt;br /&gt;
&lt;br /&gt;
Two Anaconda python environments are installed locally on each DeepSense compute node:&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! Version&lt;br /&gt;
! Environment location&lt;br /&gt;
|-&lt;br /&gt;
|python 2.7.15&lt;br /&gt;
|/opt/anaconda2&lt;br /&gt;
|-&lt;br /&gt;
|python 3.6.8&lt;br /&gt;
|/opt/anaconda2/envs/py36&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
These python environments have many packages installed, including prerequisite libraries for running the IBM PowerAI deep learning frameworks.&lt;br /&gt;
&lt;br /&gt;
See [[Getting_started]] for instructions on using the shared anaconda python environments.&lt;br /&gt;
&lt;br /&gt;
See [[Installing local software]] for instructions on installing and managing your own python environments in your home directory.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== IBM PowerAI Deep Learning Packages ==&lt;br /&gt;
&lt;br /&gt;
[https://developer.ibm.com/linuxonpower/deep-learning-powerai/ IBM PowerAI] includes multiple open source deep learning frameworks compiled for IBM Power8 systems.&lt;br /&gt;
&lt;br /&gt;
IBM PowerAI Enterprise includes:&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Framework&lt;br /&gt;
!colspan=&amp;quot;2&amp;quot;|Location&lt;br /&gt;
|-&lt;br /&gt;
|Caffe&lt;br /&gt;
|/opt/DL/caffe&lt;br /&gt;
|-&lt;br /&gt;
|cuDNN&lt;br /&gt;
|/opt/DL/cudnn&lt;br /&gt;
|-&lt;br /&gt;
|IBM Distributed Deep Learning (DDL)&lt;br /&gt;
|/opt/DL/ddl&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
|/opt/DL/hdf5&lt;br /&gt;
|-&lt;br /&gt;
|NCCL&lt;br /&gt;
|/opt/DL/nccl&lt;br /&gt;
|/opt/DL/nccl2&lt;br /&gt;
|-&lt;br /&gt;
|openblas&lt;br /&gt;
|/opt/DL/openblas&lt;br /&gt;
|-&lt;br /&gt;
|protobuf&lt;br /&gt;
|/opt/DL/protobuf&lt;br /&gt;
|-&lt;br /&gt;
|pytorch&lt;br /&gt;
|/opt/DL/pytorch&lt;br /&gt;
|-&lt;br /&gt;
|snap-ml&lt;br /&gt;
|/opt/DL/snap-ml-local&lt;br /&gt;
|/opt/DL/snap-ml-mpi&lt;br /&gt;
|-&lt;br /&gt;
|Tensorflow 1.11 (including keras)&lt;br /&gt;
|/opt/DL/tensorflow&lt;br /&gt;
|/opt/DL/ddl-tensorflow&lt;br /&gt;
|-&lt;br /&gt;
|Tensorboard&lt;br /&gt;
|/opt/DL/tensorboard&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
To use most of these frameworks you need to activate a python2 or python3 environment and then activate the relevant framework.&lt;br /&gt;
&lt;br /&gt;
For example, to use tensorflow you can activate a python2 environment:&lt;br /&gt;
 . /opt/anaconda2/etc/profile.d/conda.sh&lt;br /&gt;
 conda activate&lt;br /&gt;
&lt;br /&gt;
and then activate tensorflow:&lt;br /&gt;
 source /opt/DL/tensorflow/bin/tensorflow-activate&lt;br /&gt;
&lt;br /&gt;
You can then &amp;lt;code&amp;gt;import tensorflow as tf&amp;lt;/code&amp;gt; in your python code.&lt;br /&gt;
&lt;br /&gt;
See [[Getting started with Deep Learning]] for a tutorial on using Caffe and Tensorflow on DeepSense.&lt;br /&gt;
&lt;br /&gt;
== IBM Advance Toolchain ==&lt;br /&gt;
&lt;br /&gt;
You may require newer versions of compilers such as GCC than are provided with RHEL.&lt;br /&gt;
&lt;br /&gt;
The [https://developer.ibm.com/linuxonpower/advance-toolchain IBM Advance Toolchain for Linux on Power] is a set of open source compilers, runtime libraries, and development tools.&lt;br /&gt;
&lt;br /&gt;
The IBM Advance Toolchain includes recent versions of:&lt;br /&gt;
* GNU Compiler Collection (gcc, g++ and gfortran)&lt;br /&gt;
* GNU C library (glibc)&lt;br /&gt;
* GNU Binary Utilities (binutils)&lt;br /&gt;
* Decimal Floating Point Library (libdfp)&lt;br /&gt;
* IBM Power Architecture Facilities Library (PAFLib)&lt;br /&gt;
* GNU Debugger (gdb)&lt;br /&gt;
* Python&lt;br /&gt;
* Golang&lt;br /&gt;
* Performance analysis tools (oprofile, valgrind, itrace)&lt;br /&gt;
* Multi-core exploitation libraries (TBB, Userspace RCU, SPHDE)&lt;br /&gt;
* support libraries (libhugetlbfs, Boost, zlib, etc)&lt;br /&gt;
&lt;br /&gt;
To use the Advance Toolchain, first activate environment modules:&lt;br /&gt;
 source /usr/local/Modules/init/bash&lt;br /&gt;
&lt;br /&gt;
Then load the advance toolchain:&lt;br /&gt;
 module load at12.0&lt;br /&gt;
&lt;br /&gt;
To stop using the advance toolchain, unload the environment module:&lt;br /&gt;
 module unload at12.0&lt;br /&gt;
&lt;br /&gt;
Note that software dynamically compiled with the advance toolchain will only run with the advance toolchain loaded.&lt;br /&gt;
&lt;br /&gt;
== Bioinformatics Software ==&lt;br /&gt;
&lt;br /&gt;
Software packages that are requested for use by DeepSense projects will be available in several locations. Our preference is to use conda packages when available. &lt;br /&gt;
&lt;br /&gt;
=== External conda channels ===&lt;br /&gt;
If a requested software package is available for ppc64le systems from an externally maintained anaconda channel then we will simply list the channel. You can install such software into a local anaconda environment using:&lt;br /&gt;
 conda install -c &amp;lt;channel&amp;gt; &amp;lt;software&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Internal conda packages ===&lt;br /&gt;
When possible, software compiled by DeepSense staff will be built using conda build and placed in a subdirectory of &amp;lt;code&amp;gt;/software/conda-bld/&amp;lt;/code&amp;gt;. You can install such software into a local anaconda environment using:&lt;br /&gt;
 conda install -c file:///software/conda-bld/ &amp;lt;software&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Shared software ===&lt;br /&gt;
Some software will simply be installed in its own subdirectory of &amp;lt;code&amp;gt;/software&amp;lt;/code&amp;gt;. You can run this software directly from its subdirectory.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Software&lt;br /&gt;
!Version&lt;br /&gt;
!Location&lt;br /&gt;
|-&lt;br /&gt;
|trimmomatic&lt;br /&gt;
|0.39&lt;br /&gt;
|/software/trimmomattic-0.39&lt;br /&gt;
|-&lt;br /&gt;
|cutadapt&lt;br /&gt;
|2.3&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|bowtie2&lt;br /&gt;
|&lt;br /&gt;
|biobuilds channel&lt;br /&gt;
|-&lt;br /&gt;
|LAST&lt;br /&gt;
|980&lt;br /&gt;
|/software/last-980&lt;br /&gt;
|-&lt;br /&gt;
|Burrows-Wheeler Aligner (BWA)&lt;br /&gt;
|0.7.15&lt;br /&gt;
|/software/bwa&lt;br /&gt;
|-&lt;br /&gt;
|pb-falcon&lt;br /&gt;
|2.2.0&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|MASURCA&lt;br /&gt;
|3.3.4&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|Samtools&lt;br /&gt;
|1.9&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|htslib&lt;br /&gt;
|1.9&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|bcftools&lt;br /&gt;
|1.9&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|gatk&lt;br /&gt;
|4.1.2.0&lt;br /&gt;
|/software/conda-bld/noarch/&lt;br /&gt;
|-&lt;br /&gt;
|stacks&lt;br /&gt;
|2.4&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|angsd&lt;br /&gt;
|0.923&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|vcftools&lt;br /&gt;
|&lt;br /&gt;
|biobuilds channel&lt;br /&gt;
|-&lt;br /&gt;
|plink&lt;br /&gt;
|&lt;br /&gt;
|biobuilds channel&lt;br /&gt;
|-&lt;br /&gt;
|msprime&lt;br /&gt;
|0.7.0&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|slim&lt;br /&gt;
|3.3&lt;br /&gt;
|?&lt;br /&gt;
|-&lt;br /&gt;
|DeepGSR&lt;br /&gt;
|&lt;br /&gt;
|/software/DeepGSR&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Requesting Additional Software ==&lt;br /&gt;
&lt;br /&gt;
Contact DeepSense [[contact information|support]] to have additional software installed or for help installing or compiling software locally in your home directory.&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Available_software&amp;diff=94</id>
		<title>Available software</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Available_software&amp;diff=94"/>
		<updated>2019-10-15T14:15:14Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: Bioinformatics Software&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Basic Software ==&lt;br /&gt;
&lt;br /&gt;
* RedHat Enterprise Linux Server release 7.5 (RHEL)&lt;br /&gt;
* gcc 4.8.5&lt;br /&gt;
* glibc 2.17&lt;br /&gt;
* R 3.5.1&lt;br /&gt;
&lt;br /&gt;
== Anaconda Python ==&lt;br /&gt;
&lt;br /&gt;
Two Anaconda python environments are installed locally on each DeepSense compute node:&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! Version&lt;br /&gt;
! Environment location&lt;br /&gt;
|-&lt;br /&gt;
|python 2.7.15&lt;br /&gt;
|/opt/anaconda2&lt;br /&gt;
|-&lt;br /&gt;
|python 3.6.8&lt;br /&gt;
|/opt/anaconda2/envs/py36&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
These python environments have many packages installed, including prerequisite libraries for running the IBM PowerAI deep learning frameworks.&lt;br /&gt;
&lt;br /&gt;
See [[Getting_started]] for instructions on using the shared anaconda python environments.&lt;br /&gt;
&lt;br /&gt;
See [[Installing local software]] for instructions on installing and managing your own python environments in your home directory.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== IBM PowerAI Deep Learning Packages ==&lt;br /&gt;
&lt;br /&gt;
[https://developer.ibm.com/linuxonpower/deep-learning-powerai/ IBM PowerAI] includes multiple open source deep learning frameworks compiled for IBM Power8 systems.&lt;br /&gt;
&lt;br /&gt;
IBM PowerAI Enterprise includes:&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Framework&lt;br /&gt;
!colspan=&amp;quot;2&amp;quot;|Location&lt;br /&gt;
|-&lt;br /&gt;
|Caffe&lt;br /&gt;
|/opt/DL/caffe&lt;br /&gt;
|-&lt;br /&gt;
|cuDNN&lt;br /&gt;
|/opt/DL/cudnn&lt;br /&gt;
|-&lt;br /&gt;
|IBM Distributed Deep Learning (DDL)&lt;br /&gt;
|/opt/DL/ddl&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
|/opt/DL/hdf5&lt;br /&gt;
|-&lt;br /&gt;
|NCCL&lt;br /&gt;
|/opt/DL/nccl&lt;br /&gt;
|/opt/DL/nccl2&lt;br /&gt;
|-&lt;br /&gt;
|openblas&lt;br /&gt;
|/opt/DL/openblas&lt;br /&gt;
|-&lt;br /&gt;
|protobuf&lt;br /&gt;
|/opt/DL/protobuf&lt;br /&gt;
|-&lt;br /&gt;
|pytorch&lt;br /&gt;
|/opt/DL/pytorch&lt;br /&gt;
|-&lt;br /&gt;
|snap-ml&lt;br /&gt;
|/opt/DL/snap-ml-local&lt;br /&gt;
|/opt/DL/snap-ml-mpi&lt;br /&gt;
|-&lt;br /&gt;
|Tensorflow 1.11 (including keras)&lt;br /&gt;
|/opt/DL/tensorflow&lt;br /&gt;
|/opt/DL/ddl-tensorflow&lt;br /&gt;
|-&lt;br /&gt;
|Tensorboard&lt;br /&gt;
|/opt/DL/tensorboard&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
To use most of these frameworks you need to activate a python2 or python3 environment and then activate the relevant framework.&lt;br /&gt;
&lt;br /&gt;
For example, to use tensorflow you can activate a python2 environment:&lt;br /&gt;
 . /opt/anaconda2/etc/profile.d/conda.sh&lt;br /&gt;
 conda activate&lt;br /&gt;
&lt;br /&gt;
and then activate tensorflow:&lt;br /&gt;
 source /opt/DL/tensorflow/bin/tensorflow-activate&lt;br /&gt;
&lt;br /&gt;
You can then &amp;lt;code&amp;gt;import tensorflow as tf&amp;lt;/code&amp;gt; in your python code.&lt;br /&gt;
&lt;br /&gt;
See [[Getting started with Deep Learning]] for a tutorial on using Caffe and Tensorflow on DeepSense.&lt;br /&gt;
&lt;br /&gt;
== IBM Advance Toolchain ==&lt;br /&gt;
&lt;br /&gt;
You may require newer versions of compilers such as GCC than are provided with RHEL.&lt;br /&gt;
&lt;br /&gt;
The [https://developer.ibm.com/linuxonpower/advance-toolchain IBM Advance Toolchain for Linux on Power] is a set of open source compilers, runtime libraries, and development tools.&lt;br /&gt;
&lt;br /&gt;
The IBM Advance Toolchain includes recent versions of:&lt;br /&gt;
* GNU Compiler Collection (gcc, g++ and gfortran)&lt;br /&gt;
* GNU C library (glibc)&lt;br /&gt;
* GNU Binary Utilities (binutils)&lt;br /&gt;
* Decimal Floating Point Library (libdfp)&lt;br /&gt;
* IBM Power Architecture Facilities Library (PAFLib)&lt;br /&gt;
* GNU Debugger (gdb)&lt;br /&gt;
* Python&lt;br /&gt;
* Golang&lt;br /&gt;
* Performance analysis tools (oprofile, valgrind, itrace)&lt;br /&gt;
* Multi-core exploitation libraries (TBB, Userspace RCU, SPHDE)&lt;br /&gt;
* support libraries (libhugetlbfs, Boost, zlib, etc)&lt;br /&gt;
&lt;br /&gt;
To use the Advance Toolchain, first activate environment modules:&lt;br /&gt;
 source /usr/local/Modules/init/bash&lt;br /&gt;
&lt;br /&gt;
Then load the advance toolchain:&lt;br /&gt;
 module load at12.0&lt;br /&gt;
&lt;br /&gt;
To stop using the advance toolchain, unload the environment module:&lt;br /&gt;
 module unload at12.0&lt;br /&gt;
&lt;br /&gt;
Note that software dynamically compiled with the advance toolchain will only run with the advance toolchain loaded.&lt;br /&gt;
&lt;br /&gt;
== Bioinformatics Software ==&lt;br /&gt;
&lt;br /&gt;
Software packages that are requested for use by DeepSense projects will be available in several locations. Our preference is to use conda packages when available. &lt;br /&gt;
&lt;br /&gt;
=== External conda channels ===&lt;br /&gt;
If a requested software package is available for ppc64le systems from an externally maintained anaconda channel then we will simply list the channel. You can install such software into a local anaconda environment using:&lt;br /&gt;
 conda install -c &amp;lt;channel&amp;gt; &amp;lt;software&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Internal conda packages ===&lt;br /&gt;
When possible, software compiled by DeepSense staff will be built using conda build and placed in a subdirectory of &amp;lt;code&amp;gt;/software/conda-bld/&amp;lt;/code&amp;gt;. You can install such software into a local anaconda environment using:&lt;br /&gt;
 conda install -c file:///software/conda-bld/ &amp;lt;software&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Shared software ===&lt;br /&gt;
Some software will simply be installed in its own subdirectory of &amp;lt;code&amp;gt;/software&amp;lt;/code&amp;gt;. You can run this software directly from its subdirectory.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Software&lt;br /&gt;
!Version&lt;br /&gt;
!Location&lt;br /&gt;
|-&lt;br /&gt;
|trimmomatic&lt;br /&gt;
|0.39&lt;br /&gt;
|/software/trimmomattic-0.39&lt;br /&gt;
|-&lt;br /&gt;
|cutadapt&lt;br /&gt;
|2.3&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|bowtie2&lt;br /&gt;
|&lt;br /&gt;
|biobuilds channel&lt;br /&gt;
|-&lt;br /&gt;
|LAST&lt;br /&gt;
|980&lt;br /&gt;
|/software/last-980&lt;br /&gt;
|-&lt;br /&gt;
|Burrows-Wheeler Aligner (BWA)&lt;br /&gt;
|0.7.15&lt;br /&gt;
|/software/bwa&lt;br /&gt;
|-&lt;br /&gt;
|pb-falcon&lt;br /&gt;
|2.2.0&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|MASURCA&lt;br /&gt;
|3.3.4&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|Samtools&lt;br /&gt;
|1.9&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|htslib&lt;br /&gt;
|1.9&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|bcftools&lt;br /&gt;
|1.9&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|gatk&lt;br /&gt;
|4.1.2.0&lt;br /&gt;
|/software/conda-bld/noarch/&lt;br /&gt;
|-&lt;br /&gt;
|stacks&lt;br /&gt;
|2.4&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|angsd&lt;br /&gt;
|0.923&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|vcftools&lt;br /&gt;
|&lt;br /&gt;
|biobuilds channel&lt;br /&gt;
|-&lt;br /&gt;
|plink&lt;br /&gt;
|&lt;br /&gt;
|biobuilds channel&lt;br /&gt;
|-&lt;br /&gt;
|msprime&lt;br /&gt;
|0.7.0&lt;br /&gt;
|/software/conda-bld/linux-ppc64le/&lt;br /&gt;
|-&lt;br /&gt;
|slim&lt;br /&gt;
|&lt;br /&gt;
|?&lt;br /&gt;
|-&lt;br /&gt;
|DeepGSR&lt;br /&gt;
|&lt;br /&gt;
|?&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Requesting Additional Software ==&lt;br /&gt;
&lt;br /&gt;
Contact DeepSense [[contact information|support]] to have additional software installed or for help installing or compiling software locally in your home directory.&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Getting_started&amp;diff=92</id>
		<title>Getting started</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Getting_started&amp;diff=92"/>
		<updated>2019-09-23T17:59:38Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: /* 3. Logging on */  clarify how to log on with ssh&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt; Getting Started with DeepSense &lt;br /&gt;
&lt;br /&gt;
&amp;lt;div class=&amp;quot;noautonum&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 1. Request access to DeepSense ==&lt;br /&gt;
&lt;br /&gt;
If you belong to an approved DeepSense project but do not yet have an account then send an email to support@deepsense.ca with the subject &amp;quot;DeepSense Account Request&amp;quot; and provide your:&lt;br /&gt;
  a) First and last name&lt;br /&gt;
  b) Faculty of Computer Science username or requested FCS username&lt;br /&gt;
  c) Dalhousie BannerID&lt;br /&gt;
  d) Project ID&lt;br /&gt;
  e) Project leader&lt;br /&gt;
  f) Reason for requesting the account.&lt;br /&gt;
&lt;br /&gt;
== 2. Change your password ==&lt;br /&gt;
&lt;br /&gt;
If you require a new FCS username then your initial password is your BannerID. Please change it immediately upon receiving access to DeepSense.&lt;br /&gt;
&lt;br /&gt;
You can change your password at https://www.cs.dal.ca/csid&lt;br /&gt;
&lt;br /&gt;
Alternatively, you can contact cshelp@cs.dal.ca to reset your password.&lt;br /&gt;
&lt;br /&gt;
== 3. Logging on ==&lt;br /&gt;
&lt;br /&gt;
DeepSense has two login nodes, login1.deepsense.ca and login2.deepsense.ca. You can access these through SSH with your username and password from any computer on campus.&lt;br /&gt;
&lt;br /&gt;
For example, if your userid is &amp;lt;code&amp;gt;user1&amp;lt;/code&amp;gt;, you can connect to DeepSense by typing &amp;lt;code&amp;gt;ssh user1@login1.deepsense.ca&amp;lt;/code&amp;gt;, just like logging on to any other networked computer.&lt;br /&gt;
&lt;br /&gt;
From off campus you’ll need to use the Dalhousie VPN (https://wireless.dal.ca/vpnsoftware.php). If you are not a Dalhousie staff member, student, or faculty member but require offsite access and cannot use the Dalhousie VPN then contact your project leader or support@deepsense.ca to make different arrangements.&lt;br /&gt;
&lt;br /&gt;
The login nodes are intended for testing and compiling code. Please don’t run long or intensive computation on these nodes. Keep reading for instructions on how to submit compute jobs to dedicated compute nodes.&lt;br /&gt;
&lt;br /&gt;
==  4. Transfer data ==&lt;br /&gt;
&lt;br /&gt;
For more information, see [[Transferring Data]].&lt;br /&gt;
&lt;br /&gt;
DeepSense has two protocol nodes, protocol1.deepsense.ca and protocol2.deepsense.ca. You can connect to these using the SMB (Samba) protocol, e.g. smb://protocol1.deepsense.ca, with your username and password. Please contact your project leader or support@deepsense.ca if you need help transferring large amounts of data.&lt;br /&gt;
&lt;br /&gt;
Data transferred through the protocol nodes will be located in the shared /data directory.&lt;br /&gt;
&lt;br /&gt;
See [[Storage policies]] for more information about the available shared file systems, storage policies, and backup policies.&lt;br /&gt;
&lt;br /&gt;
== 5. Configure your environment ==&lt;br /&gt;
&lt;br /&gt;
DeepSense compute and management nodes are IBM Power8 computers (ppc64le) running Red Hat Enterprise Linux. See [[Resources]] for more details on the available nodes.&lt;br /&gt;
&lt;br /&gt;
=== 5.1 Loading a python environment ===&lt;br /&gt;
&lt;br /&gt;
You have two options for using python on DeepSense. You can use the systemwide python install, managed by DeepSense administrators. This is recommended for users new to Linux. You will need to contact DeepSense support to have additional software packages installed in the systemwide python.&lt;br /&gt;
&lt;br /&gt;
Alternatively, you can install an Anaconda python environment or other software in your home directory. This allows you to install or update packages or software without requesting and waiting for DeepSense staff. &lt;br /&gt;
&lt;br /&gt;
==== Systemwide python (managed by DeepSense) ====&lt;br /&gt;
&lt;br /&gt;
DeepSense nodes have anaconda2 python installed in /opt/anaconda2. To use this systemwide python, append the conda setup line to the .bashrc file in your home directory by running:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;echo &amp;quot;. /opt/anaconda2/etc/profile.d/conda.sh&amp;quot; &amp;gt;&amp;gt; ~/.bashrc&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Then source your .bashrc file:&lt;br /&gt;
&amp;lt;code&amp;gt;source ~/.bashrc&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To load the python2 environment run &amp;lt;code&amp;gt;conda activate&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To use python3 you can activate the py36 environment:&lt;br /&gt;
&amp;lt;code&amp;gt;conda activate py36&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You can add either line to your .bashrc file to automatically load the desired environment when you log in.&lt;br /&gt;
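&lt;br /&gt;
Running the &amp;lt;code&amp;gt;echo&amp;lt;/code&amp;gt; command above more than once adds the same line to .bashrc each time. As a minimal sketch, a small helper can append a line only when it is not already present. The function name &amp;lt;code&amp;gt;add_line_once&amp;lt;/code&amp;gt; is hypothetical and not part of any DeepSense tooling:&lt;br /&gt;
&lt;br /&gt;
```shell
# Append a line to a file only if that exact line is not already present.
# add_line_once is an illustrative helper, not a DeepSense-provided command.
add_line_once() {
  file="$1"
  line="$2"
  touch "$file"
  # -x matches whole lines, -F treats the line as a fixed string
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example: load conda at login and activate the py36 environment.
# add_line_once "$HOME/.bashrc" ". /opt/anaconda2/etc/profile.d/conda.sh"
# add_line_once "$HOME/.bashrc" "conda activate py36"
```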
&lt;br /&gt;
==== Local python install (managed by individual user) ====&lt;br /&gt;
&lt;br /&gt;
See [[Installing local software]] for more information.&lt;br /&gt;
&lt;br /&gt;
== 6. Running compute jobs ==&lt;br /&gt;
&lt;br /&gt;
DeepSense has two different methods of submitting compute jobs.&lt;br /&gt;
&lt;br /&gt;
=== 6.1 Load Sharing Facility (LSF) ===&lt;br /&gt;
&lt;br /&gt;
LSF is a set of command line tools for submitting compute jobs. You may be familiar with other similar software such as Sun Grid Engine or SLURM.&lt;br /&gt;
&lt;br /&gt;
LSF jobs are submitted using the &amp;lt;code&amp;gt;bsub&amp;lt;/code&amp;gt; command.&lt;br /&gt;
&lt;br /&gt;
You can examine the progress of your currently running jobs with the &amp;lt;code&amp;gt;bjobs&amp;lt;/code&amp;gt; command.&lt;br /&gt;
&lt;br /&gt;
You can examine the available compute nodes and their available resources with the &amp;lt;code&amp;gt;bhosts&amp;lt;/code&amp;gt; command.&lt;br /&gt;
&lt;br /&gt;
For more information about using LSF see [[LSF]].&lt;br /&gt;
&lt;br /&gt;
=== 6.2 Conductor with Spark (CWS) ===&lt;br /&gt;
&lt;br /&gt;
CWS is an IBM web-based graphical interface for creating and running Apache Spark compute jobs.&lt;br /&gt;
&lt;br /&gt;
To use CWS, connect to the IBM Spectrum Computing Cluster Management Console at https://ds-mgm-02.deepsense.cs.dal.ca:8443. Log in with your username and password.&lt;br /&gt;
&lt;br /&gt;
Note that currently you need to accept a self-signed web certificate. In the future this will be fixed.&lt;br /&gt;
&lt;br /&gt;
For more information about using CWS see [[CWS]].&lt;br /&gt;
&lt;br /&gt;
== 7. Deep Learning packages and other available software ==&lt;br /&gt;
&lt;br /&gt;
DeepSense has a variety of Deep Learning packages installed as part of IBM PowerAI including Tensorflow, Caffe, and PyTorch. These packages are installed in /opt/DL/ on each compute node and typically need to be activated before using them, e.g. &amp;lt;code&amp;gt;source /opt/DL/tensorflow/bin/tensorflow-activate&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Deep Learning packages are typically used on the GPU nodes but some deep learning packages can also be used on the login nodes and CPU-only nodes. This can be useful for testing your code or running CPU-bound workloads. To use the deep learning packages on the login or compute nodes you will also need to load the GPU libraries with &amp;lt;code&amp;gt;source /opt/DL/cudnn/bin/cudnn-activate&amp;lt;/code&amp;gt;. Note that some deep learning packages may fail if run without a GPU, e.g. Caffe currently requires a GPU.&lt;br /&gt;
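&lt;br /&gt;
Because the activation scripts live under /opt/DL/ on the nodes, a script that may run in several places can check for them before sourcing. The following is only a sketch; the helper name &amp;lt;code&amp;gt;activate_if_present&amp;lt;/code&amp;gt; is an assumption made for illustration:&lt;br /&gt;
&lt;br /&gt;
```shell
# Source an activation script only if it is readable; otherwise report and fail.
# activate_if_present is a hypothetical helper name.
activate_if_present() {
  script="$1"
  if [ -r "$script" ]; then
    . "$script"
  else
    echo "not found: $script"
    return 1
  fi
}

# Example use on a node where the PowerAI packages are installed:
# activate_if_present /opt/DL/cudnn/bin/cudnn-activate
# activate_if_present /opt/DL/tensorflow/bin/tensorflow-activate
```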
&lt;br /&gt;
For a brief tutorial including running Caffe and Tensorflow in a Jupyter notebook see [[Getting started with Deep Learning]].&lt;br /&gt;
&lt;br /&gt;
See [[Available software]] for the current list of installed software. If you require additional software you are welcome to install it locally in your home directory or contact DeepSense support.&lt;br /&gt;
&lt;br /&gt;
== 8. Technical and research support == &lt;br /&gt;
&lt;br /&gt;
DeepSense has a dedicated support team of research scientists ready to help you with technical questions, installing software, or even research questions.&lt;br /&gt;
&lt;br /&gt;
If you can&amp;#039;t find the answer to your question on this wiki or need more extensive help then send an email to support@deepsense.ca .&lt;br /&gt;
&lt;br /&gt;
See [[Technical support]] for more information about the support available.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/div&amp;gt; &amp;lt;!-- autonum --&amp;gt;&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Getting_started_with_Deep_Learning&amp;diff=90</id>
		<title>Getting started with Deep Learning</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Getting_started_with_Deep_Learning&amp;diff=90"/>
		<updated>2019-09-13T19:56:46Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: /* 3. Request an interactive session on a GPU compute node */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;div class=&amp;quot;noautonum&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 1. Get started with DeepSense ==&lt;br /&gt;
&lt;br /&gt;
Follow all the steps from [[Getting started]]. This tutorial assumes you can log on to the DeepSense compute platform and have a version of Anaconda python on your path.&lt;br /&gt;
&lt;br /&gt;
== 2. Download Caffe samples to your home directory ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;/opt/DL/caffe/bin/caffe-install-samples&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 3. Request an interactive session on a GPU compute node ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- TODO: still need to set up queues to fairly share GPUs --&amp;gt;&lt;br /&gt;
&amp;lt;!-- TODO: write instructions doing this with regular LSF without an interactive session. We don&amp;#039;t want to encourage everyone to use interactive sessions --&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -Is -gpu - bash&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 4. Start a python2 Jupyter notebook ==&lt;br /&gt;
&lt;br /&gt;
=== Source the Caffe deep learning toolkit ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;source /opt/DL/caffe/bin/caffe-activate&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Start the notebook ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;jupyter notebook --no-browser --ip=0.0.0.0&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Sample output ===&lt;br /&gt;
&amp;lt;pre&amp;gt;[I 13:32:23.937 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).&lt;br /&gt;
[C 13:32:23.937 NotebookApp] &lt;br /&gt;
    &lt;br /&gt;
    Copy/paste this URL into your browser when you connect for the first time,&lt;br /&gt;
    to login with a token:&lt;br /&gt;
        http://ds-cmgpu-04:8888/?token=68042f40a10b500f3747ae0a232ee209fa4bf1aa384d29ba&amp;amp;token=68042f40a10b500f3747ae0a232ee209fa4bf1aa384d29ba&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Copy the URL, host, and port ===&lt;br /&gt;
&lt;br /&gt;
Copy the URL but don’t paste it in your browser yet.&lt;br /&gt;
&lt;br /&gt;
Make a note of which compute host and port the notebook is running on (e.g. host ds-cmgpu-04 and port 8888 in this case)&lt;br /&gt;
&lt;br /&gt;
== 5. Port Forwarding ==&lt;br /&gt;
&lt;br /&gt;
In a separate terminal window from your local computer, forward your local port to the remote host:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt; ssh -l &amp;lt;username&amp;gt; login1.deepsense.ca -L &amp;lt;port&amp;gt;:&amp;lt;remote_host&amp;gt;:&amp;lt;port&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For example, &amp;lt;code&amp;gt;ssh -l user1 login1.deepsense.ca -L 8888:ds-cmgpu-04:8888&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Enter the copied URL in your web browser but change the remote host name to “localhost” before pressing enter.&lt;br /&gt;
&lt;br /&gt;
e.g. &amp;lt;code&amp;gt;http://localhost:8888/?token=68042f40a10b500f3747ae0a232ee209fa4bf1aa384d29ba&amp;amp;token=68042f40a10b500f3747ae0a232ee209fa4bf1aa384d29ba&amp;lt;/code&amp;gt;&lt;br /&gt;
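&lt;br /&gt;
The host, port, and localhost URL can also be derived from the notebook URL mechanically. The sketch below only prints the ssh command and the rewritten URL; it does not open a connection. The function name &amp;lt;code&amp;gt;jupyter_tunnel&amp;lt;/code&amp;gt; and the default login node argument are assumptions made for illustration:&lt;br /&gt;
&lt;br /&gt;
```shell
# Derive the ssh port-forwarding command and the local URL from a Jupyter URL.
# jupyter_tunnel is a hypothetical helper; it prints text and runs nothing.
jupyter_tunnel() {
  user="$1"
  url="$2"
  login="${3:-login1.deepsense.ca}"
  rest="${url#*://}"        # drop the scheme, e.g. "http://"
  hostport="${rest%%/*}"    # e.g. "ds-cmgpu-04:8888"
  host="${hostport%%:*}"    # e.g. "ds-cmgpu-04"
  port="${hostport##*:}"    # e.g. "8888"
  path="${rest#*/}"         # everything after the first "/", e.g. "?token=..."
  echo "ssh -l $user $login -L $port:$host:$port"
  echo "http://localhost:$port/$path"
}

# Example (token shortened for readability):
# jupyter_tunnel user1 "http://ds-cmgpu-04:8888/?token=abc"
```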
&lt;br /&gt;
== 6. Open the desired sample notebook ==&lt;br /&gt;
&lt;br /&gt;
Be sure to enter the location of the “caffe-samples” directory in your home directory as your caffe-root in the Caffe example notebooks.&lt;br /&gt;
&lt;br /&gt;
== 7. Enjoy Deep Learning on DeepSense! ==&lt;br /&gt;
&lt;br /&gt;
== 8. More information ==&lt;br /&gt;
&lt;br /&gt;
Go to Caffe&amp;#039;s [http://caffe.berkeleyvision.org/ website] for tutorials and example programs that you can run to get started.&lt;br /&gt;
See the following links to a couple of the example programs:&lt;br /&gt;
&lt;br /&gt;
[http://caffe.berkeleyvision.org/gathered/examples/mnist.html LeNet MNIST Tutorial] - Train a neural network to understand handwritten digits.&lt;br /&gt;
&lt;br /&gt;
[http://caffe.berkeleyvision.org/gathered/examples/cifar10.html CIFAR-10 tutorial] - Train a convolutional neural network to classify small images.&lt;br /&gt;
&lt;br /&gt;
== 9. Using another deep learning toolkit such as Tensorflow ==&lt;br /&gt;
&lt;br /&gt;
* Ensure any Anaconda dependencies are installed&lt;br /&gt;
** for tensorflow, run &amp;lt;code&amp;gt;/opt/DL/tensorflow/bin/install_dependencies&amp;lt;/code&amp;gt;&lt;br /&gt;
* Source the appropriate toolkit instead of caffe-activate&lt;br /&gt;
** e.g. &amp;lt;code&amp;gt;source /opt/DL/tensorflow/bin/tensorflow-activate&amp;lt;/code&amp;gt;&lt;br /&gt;
* Download example notebooks for the deep learning toolkit to your home directory,&lt;br /&gt;
** e.g. &amp;lt;code&amp;gt; git clone https://github.com/aymericdamien/TensorFlow-Examples.git&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The TensorFlow [https://www.tensorflow.org/ home page] has various information, including Tutorials, How-To documents, and a Getting Started guide.&lt;br /&gt;
&lt;br /&gt;
Additional tutorials and examples are available from the community, for example:&lt;br /&gt;
&lt;br /&gt;
  https://github.com/nlintz/TensorFlow-Tutorials&lt;br /&gt;
&lt;br /&gt;
  https://github.com/aymericdamien/TensorFlow-Examples&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/div&amp;gt; &amp;lt;!-- autonum --&amp;gt;&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Getting_started&amp;diff=82</id>
		<title>Getting started</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Getting_started&amp;diff=82"/>
		<updated>2019-08-27T17:04:19Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: /* 6.2 Conductor with Spark (CWS) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt; Getting Started with DeepSense &lt;br /&gt;
&lt;br /&gt;
&amp;lt;div class=&amp;quot;noautonum&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 1. Request access to DeepSense ==&lt;br /&gt;
&lt;br /&gt;
If you belong to an approved DeepSense project but do not yet have an account then send an email to support@deepsense.ca with the subject &amp;quot;DeepSense Account Request&amp;quot; and provide your:&lt;br /&gt;
  a) First and last name&lt;br /&gt;
  b) Faculty of Computer Science username or requested FCS username&lt;br /&gt;
  c) Dalhousie BannerID&lt;br /&gt;
  d) Project ID&lt;br /&gt;
  e) Project leader&lt;br /&gt;
  f) Reason for requesting the account.&lt;br /&gt;
&lt;br /&gt;
== 2. Change your password ==&lt;br /&gt;
&lt;br /&gt;
If you require a new FCS username then your initial password is your BannerID. Please change it immediately upon receiving access to DeepSense.&lt;br /&gt;
&lt;br /&gt;
You can change your password at https://www.cs.dal.ca/csid&lt;br /&gt;
&lt;br /&gt;
Alternatively, you can contact cshelp@cs.dal.ca to reset your password.&lt;br /&gt;
&lt;br /&gt;
== 3. Logging on ==&lt;br /&gt;
&lt;br /&gt;
DeepSense has two login nodes, login1.deepsense.ca and login2.deepsense.ca. You can access these through SSH with your username and password from any computer on campus.&lt;br /&gt;
&lt;br /&gt;
From off campus you’ll need to use the Dalhousie VPN (https://wireless.dal.ca/vpnsoftware.php). If you are not a Dalhousie staff member, student, or faculty member but require offsite access and cannot use the Dalhousie VPN then contact your project leader or support@deepsense.ca to make different arrangements.&lt;br /&gt;
&lt;br /&gt;
The login nodes are intended for testing and compiling code. Please don’t run long or intensive computation on these nodes.&lt;br /&gt;
&lt;br /&gt;
==  4. Transfer data ==&lt;br /&gt;
&lt;br /&gt;
For more information, see [[Transferring Data]].&lt;br /&gt;
&lt;br /&gt;
DeepSense has two protocol nodes, protocol1.deepsense.ca and protocol2.deepsense.ca. You can connect to these using the SMB (Samba) protocol, e.g. smb://protocol1.deepsense.ca, with your username and password. Please contact your project leader or support@deepsense.ca if you need help transferring large amounts of data.&lt;br /&gt;
&lt;br /&gt;
Data transferred through the protocol nodes will be located in the shared /data directory.&lt;br /&gt;
&lt;br /&gt;
See [[Storage policies]] for more information about the available shared file systems, storage policies, and backup policies.&lt;br /&gt;
&lt;br /&gt;
== 5. Configure your environment ==&lt;br /&gt;
&lt;br /&gt;
DeepSense compute and management nodes are IBM Power8 computers (ppc64le) running Red Hat Enterprise Linux. See [[Resources]] for more details on the available nodes.&lt;br /&gt;
&lt;br /&gt;
=== 5.1 Loading a python environment ===&lt;br /&gt;
&lt;br /&gt;
You have two options for using python on DeepSense. You can use the systemwide python install, managed by DeepSense administrators. This is recommended for users new to Linux. You will need to contact DeepSense support to have additional software packages installed in the systemwide python.&lt;br /&gt;
&lt;br /&gt;
Alternatively, you can install an Anaconda python environment or other software in your home directory. This allows you to install or update packages or software without requesting and waiting for DeepSense staff. &lt;br /&gt;
&lt;br /&gt;
==== Systemwide python (managed by DeepSense) ====&lt;br /&gt;
&lt;br /&gt;
DeepSense nodes have anaconda2 python installed in /opt/anaconda2. To use this systemwide python, append the conda setup line to the .bashrc file in your home directory by running:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;echo &amp;quot;. /opt/anaconda2/etc/profile.d/conda.sh&amp;quot; &amp;gt;&amp;gt; ~/.bashrc&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Then source your .bashrc file:&lt;br /&gt;
&amp;lt;code&amp;gt;source ~/.bashrc&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To load the python2 environment run &amp;lt;code&amp;gt;conda activate&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To use python3 you can activate the py36 environment:&lt;br /&gt;
&amp;lt;code&amp;gt;conda activate py36&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You can add either line to your .bashrc file to automatically load the desired environment when you log in.&lt;br /&gt;
&lt;br /&gt;
==== Local python install (managed by individual user) ====&lt;br /&gt;
&lt;br /&gt;
See [[Installing local software]] for more information.&lt;br /&gt;
&lt;br /&gt;
== 6. Running compute jobs ==&lt;br /&gt;
&lt;br /&gt;
DeepSense has two different methods of submitting compute jobs.&lt;br /&gt;
&lt;br /&gt;
=== 6.1 Load Sharing Facility (LSF) ===&lt;br /&gt;
&lt;br /&gt;
LSF is a set of command line tools for submitting compute jobs. You may be familiar with other similar software such as Sun Grid Engine or SLURM.&lt;br /&gt;
&lt;br /&gt;
LSF jobs are submitted using the &amp;lt;code&amp;gt;bsub&amp;lt;/code&amp;gt; command.&lt;br /&gt;
&lt;br /&gt;
You can examine the progress of your currently running jobs with the &amp;lt;code&amp;gt;bjobs&amp;lt;/code&amp;gt; command.&lt;br /&gt;
&lt;br /&gt;
You can examine the available compute nodes and their available resources with the &amp;lt;code&amp;gt;bhosts&amp;lt;/code&amp;gt; command.&lt;br /&gt;
&lt;br /&gt;
For more information about using LSF see [[LSF]].&lt;br /&gt;
&lt;br /&gt;
=== 6.2 Conductor with Spark (CWS) ===&lt;br /&gt;
&lt;br /&gt;
CWS is an IBM web-based graphical interface for creating and running Apache Spark compute jobs.&lt;br /&gt;
&lt;br /&gt;
To use CWS, connect to the IBM Spectrum Computing Cluster Management Console at https://ds-mgm-02.deepsense.cs.dal.ca:8443. Log in with your username and password.&lt;br /&gt;
&lt;br /&gt;
Note that currently you need to accept a self-signed web certificate. In the future this will be fixed.&lt;br /&gt;
&lt;br /&gt;
For more information about using CWS see [[CWS]].&lt;br /&gt;
&lt;br /&gt;
== 7. Deep Learning packages and other available software ==&lt;br /&gt;
&lt;br /&gt;
DeepSense has a variety of Deep Learning packages installed as part of IBM PowerAI including Tensorflow, Caffe, and PyTorch. These packages are installed in /opt/DL/ on each compute node and typically need to be activated before using them, e.g. &amp;lt;code&amp;gt;source /opt/DL/tensorflow/bin/tensorflow-activate&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Deep Learning packages are typically used on the GPU nodes but some deep learning packages can also be used on the login nodes and CPU-only nodes. This can be useful for testing your code or running CPU-bound workloads. To use the deep learning packages on the login or compute nodes you will also need to load the GPU libraries with &amp;lt;code&amp;gt;source /opt/DL/cudnn/bin/cudnn-activate&amp;lt;/code&amp;gt;. Note that some deep learning packages may fail if run without a GPU, e.g. Caffe currently requires a GPU.&lt;br /&gt;
&lt;br /&gt;
For a brief tutorial including running Caffe and Tensorflow in a Jupyter notebook see [[Getting started with Deep Learning]].&lt;br /&gt;
&lt;br /&gt;
See [[Available software]] for the current list of installed software. If you require additional software you are welcome to install it locally in your home directory or contact DeepSense support.&lt;br /&gt;
&lt;br /&gt;
== 8. Technical and research support == &lt;br /&gt;
&lt;br /&gt;
DeepSense has a dedicated support team of research scientists ready to help you with technical questions, installing software, or even research questions.&lt;br /&gt;
&lt;br /&gt;
If you can&amp;#039;t find the answer to your question on this wiki or need more extensive help then send an email to support@deepsense.ca .&lt;br /&gt;
&lt;br /&gt;
See [[Technical support]] for more information about the support available.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/div&amp;gt; &amp;lt;!-- autonum --&amp;gt;&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=LSF&amp;diff=81</id>
		<title>LSF</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=LSF&amp;diff=81"/>
		<updated>2019-08-26T15:00:50Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: /* Complicated Jobs */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[https://www.ibm.com/support/knowledgecenter/SSWRJV_10.1.0/ IBM Spectrum LSF] is the command line job submission system for submitting batch and interactive jobs on DeepSense computing hardware.&lt;br /&gt;
&lt;br /&gt;
== Test code and short computation ==&lt;br /&gt;
DeepSense has two login nodes, login1.deepsense.ca and login2.deepsense.ca. You can access these through SSH with your username and password from any computer on campus. From off campus you’ll need to use the [https://wireless.dal.ca/vpnsoftware.php Dalhousie VPN].&lt;br /&gt;
&lt;br /&gt;
The login nodes are intended for testing and compiling code. Please don’t run long or intensive computation on these nodes.&lt;br /&gt;
&lt;br /&gt;
== Job Submission ==&lt;br /&gt;
When you have a small example working with your code and are ready to run a real workload, use the LSF queue to submit your jobs to the cluster (https://www.ibm.com/support/knowledgecenter/SSWRJV_10.1.0/lsf_users_guide/batch_jobs_about.html). If you’ve used other queuing systems like SLURM or Sun Grid Engine before then LSF will seem very familiar.&lt;br /&gt;
 &lt;br /&gt;
To submit a job you use the &amp;lt;code&amp;gt;bsub&amp;lt;/code&amp;gt; command (https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bsub.man_top.1.html).&lt;br /&gt;
 &lt;br /&gt;
For example, to submit a shared memory job using 20 processors and 256GB of memory for at most 24 hours you would run:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -oo &amp;lt;output_file&amp;gt; -n 20 -M 256000 -W 24:0 -R &amp;quot;span[hosts=1] rusage[mem=256000]&amp;quot; &amp;lt;executable&amp;gt; [options]&amp;lt;/code&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
For OpenMP jobs, please use &amp;lt;code&amp;gt;OMP_NUM_THREADS&amp;lt;/code&amp;gt; to limit the number of threads your program uses, and make sure the variable is set in the job that runs on the compute node rather than only in your login shell. LSF sets the variable &amp;lt;code&amp;gt;$LSB_DJOB_NUMPROC&amp;lt;/code&amp;gt; to the number of processors allocated to the job, which you can use if you don’t want to hardcode &amp;lt;code&amp;gt;OMP_NUM_THREADS&amp;lt;/code&amp;gt; or set it with your own variable.&lt;br /&gt;
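&lt;br /&gt;
As a minimal sketch, a job script could set the thread count from the LSF variable before starting the program. The function name &amp;lt;code&amp;gt;set_omp_threads&amp;lt;/code&amp;gt;, the fallback of one thread outside LSF, and the executable name are assumptions made for illustration:&lt;br /&gt;
&lt;br /&gt;
```shell
# Use the processor count LSF assigned to the job; fall back to 1 outside LSF.
# set_omp_threads is a hypothetical helper name.
set_omp_threads() {
  OMP_NUM_THREADS="${LSB_DJOB_NUMPROC:-1}"
  export OMP_NUM_THREADS
}

set_omp_threads
echo "running with $OMP_NUM_THREADS OpenMP threads"
# ./my_openmp_program   (hypothetical executable; replace with your own)
```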
&lt;br /&gt;
=== CPU Limit ===&lt;br /&gt;
The number of requested processors is specified with the option &amp;lt;code&amp;gt;-n&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The resource request &amp;lt;code&amp;gt;-R &amp;quot;span[hosts=1]&amp;quot;&amp;lt;/code&amp;gt; requires that all processors are on the same compute host, i.e. a shared memory job.&lt;br /&gt;
&lt;br /&gt;
LSF can also be used to run compute jobs across multiple hosts such as MPI jobs. Examples will be included here at a later date.&lt;br /&gt;
&lt;br /&gt;
=== Memory Limit === &lt;br /&gt;
LSF has two different types of memory limits.&lt;br /&gt;
The scheduler memory limit &amp;lt;code&amp;gt;-R &amp;quot;rusage[mem=&amp;lt;memlimit&amp;gt;]&amp;quot;&amp;lt;/code&amp;gt; requests &amp;lt;code&amp;gt;&amp;lt;memlimit&amp;gt;&amp;lt;/code&amp;gt; amount of memory. Your job will not start until a compute node is available with that amount of memory. You are guaranteed to have this amount of memory available. If you exceed the requested amount then your job may be killed but it will only be killed if other jobs need that memory. &lt;br /&gt;
&lt;br /&gt;
The job memory limit &amp;lt;code&amp;gt;-M &amp;lt;memlimit&amp;gt;&amp;lt;/code&amp;gt; will kill your job if it exceeds the given memory limit. Note that this option does not guarantee that you will have that amount of memory available.&lt;br /&gt;
&lt;br /&gt;
The memory limits are specified in MB by default. You can also specify units, e.g. &amp;lt;code&amp;gt;-M 256GB&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;-R &amp;quot;rusage[mem=256GB]&amp;quot;&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
If you are using more than a few GB of memory then you must specify the &amp;lt;code&amp;gt;-R &amp;quot;rusage[mem=&amp;lt;memlimit&amp;gt;]&amp;quot;&amp;lt;/code&amp;gt; option or your job may be terminated. You may additionally want to use the &amp;lt;code&amp;gt;-M &amp;lt;memlimit&amp;gt;&amp;lt;/code&amp;gt; option to be sure you aren&amp;#039;t using more memory than intended.&lt;br /&gt;
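&lt;br /&gt;
One way to keep the two limits in step is to generate both options from a single value. The sketch below only prints the resulting &amp;lt;code&amp;gt;bsub&amp;lt;/code&amp;gt; command for review and does not submit anything. The function name &amp;lt;code&amp;gt;bsub_mem_cmd&amp;lt;/code&amp;gt; and the example program are hypothetical:&lt;br /&gt;
&lt;br /&gt;
```shell
# Print a bsub command that uses the same value (in MB, the default unit)
# for the job limit (-M) and the scheduler request (-R rusage[mem=...]).
# bsub_mem_cmd is a hypothetical helper; it prints the command only.
bsub_mem_cmd() {
  mem_mb="$1"
  shift
  printf 'bsub -M %s -R "rusage[mem=%s]" %s\n' "$mem_mb" "$mem_mb" "$*"
}

# Example:
# bsub_mem_cmd 4000 -oo log.txt ./programA
```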
&lt;br /&gt;
=== Time Limit ===&lt;br /&gt;
The runtime limit &amp;lt;code&amp;gt;-W hours:minutes&amp;lt;/code&amp;gt; specifies the maximum length of time your job is allowed to run.&lt;br /&gt;
For example &amp;lt;code&amp;gt;-W 24:0&amp;lt;/code&amp;gt; requests 24 hours of running time.&lt;br /&gt;
Your job will be terminated when the runtime limit is exceeded.&lt;br /&gt;
&lt;br /&gt;
If you do not specify a runtime limit then the default runtime limit of 168 hours (7 days) will be used.&lt;br /&gt;
The maximum possible runtime limit is currently 30 days and may vary by queue in the future.&lt;br /&gt;
&lt;br /&gt;
If there is a scheduled maintenance window announced then any job with a run time limit that could extend into the maintenance period will be listed as pending and will not run until the maintenance has concluded. Use a shorter run time limit that ends before the maintenance period to avoid this.&lt;br /&gt;
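&lt;br /&gt;
If you estimate run times in minutes, a small conversion gives the hours:minutes form used with &amp;lt;code&amp;gt;-W&amp;lt;/code&amp;gt;. This is a sketch only; the function name &amp;lt;code&amp;gt;to_walltime&amp;lt;/code&amp;gt; is an assumption made for illustration:&lt;br /&gt;
&lt;br /&gt;
```shell
# Convert a number of minutes into the hours:minutes form used by -W.
# to_walltime is a hypothetical helper name.
to_walltime() {
  minutes="$1"
  printf '%d:%02d\n' "$((minutes / 60))" "$((minutes % 60))"
}

# Example: request 90 minutes of run time.
# bsub -W "$(to_walltime 90)" ...
```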
&lt;br /&gt;
=== GPU Computation ===&lt;br /&gt;
&lt;br /&gt;
To request access to a GPU use the &amp;lt;code&amp;gt;-gpu -&amp;lt;/code&amp;gt; option.&lt;br /&gt;
&lt;br /&gt;
Note the trailing dash, which specifies the default GPU arguments. The following options can be used in place of that dash.&lt;br /&gt;
&lt;br /&gt;
The default GPU arguments are &amp;lt;code&amp;gt;&amp;quot;num=1:mode=shared:mps=no:j_exclusive=no&amp;quot;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;num=num_gpus&amp;lt;/code&amp;gt; is the number of requested GPUs on each host.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;mode=shared | exclusive_process&amp;lt;/code&amp;gt; specifies the GPU mode.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;mps=yes | no&amp;lt;/code&amp;gt; enables or disables the Nvidia Multi-Process Service (MPS). MPS enables better sharing of GPU resources. If &amp;lt;code&amp;gt;mode=exclusive_process&amp;lt;/code&amp;gt; then mps should be set to yes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;j_exclusive=yes | no&amp;lt;/code&amp;gt; specifies whether the GPU is exclusive to this job and cannot be used by other jobs.&lt;br /&gt;
&lt;br /&gt;
By default the &amp;lt;code&amp;gt;-gpu -&amp;lt;/code&amp;gt; option will request one nonexclusive GPU. Please limit your usage of GPU resources to a reasonable number of concurrently used GPUs and use shared GPUs when possible. We may enact limits on GPU use in the future if necessary.&lt;br /&gt;
&lt;br /&gt;
See the [https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bsub.gpu.1.html bsub.gpu] documentation for more information on submitting GPU jobs.&lt;br /&gt;
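&lt;br /&gt;
The GPU options above combine into a single colon-separated string. As a sketch, the helper below builds that string from a GPU count and a mode, setting mps to yes for exclusive_process as recommended above and leaving j_exclusive at its default. The function name &amp;lt;code&amp;gt;gpu_spec&amp;lt;/code&amp;gt; is hypothetical:&lt;br /&gt;
&lt;br /&gt;
```shell
# Build the argument for bsub -gpu from a GPU count and a mode.
# gpu_spec is a hypothetical helper; mps follows the mode, j_exclusive stays "no".
gpu_spec() {
  num="$1"
  mode="${2:-shared}"
  if [ "$mode" = "exclusive_process" ]; then
    mps="yes"
  else
    mps="no"
  fi
  echo "num=$num:mode=$mode:mps=$mps:j_exclusive=no"
}

# Example: request two shared GPUs for a job.
# bsub -gpu "$(gpu_spec 2)" ...
```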
&lt;br /&gt;
=== Input and Output files ===&lt;br /&gt;
If you do not specify an output file with &amp;lt;code&amp;gt;-o&amp;lt;/code&amp;gt; (append) or &amp;lt;code&amp;gt;-oo&amp;lt;/code&amp;gt; (overwrite) then the output will be lost. Note that LSF will prepend submission information to this file. You can also use standard shell redirection like &amp;lt;code&amp;gt;&amp;gt; output_file2&amp;lt;/code&amp;gt; in the job command, in which case the file specified with &amp;lt;code&amp;gt;-oo&amp;lt;/code&amp;gt; will just contain any errors and submission information.&lt;br /&gt;
&lt;br /&gt;
You can specify an input file with the &amp;lt;code&amp;gt;-i&amp;lt;/code&amp;gt; option or with standard shell redirection, &amp;lt;code&amp;gt;&amp;lt; &amp;lt;input_file&amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Advanced Job Submission ==&lt;br /&gt;
&lt;br /&gt;
=== Array Jobs ===&lt;br /&gt;
To run the same program multiple times with different input and output files you can use [https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_admin/job_arrays_lsf.html LSF Array Jobs].&lt;br /&gt;
&lt;br /&gt;
An example command in the LSF documentation is given as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt; bsub -J &amp;quot;myArray[1-1000]&amp;quot; -i &amp;quot;input.%I&amp;quot; -o &amp;quot;output.%I&amp;quot; myJob&amp;lt;/code&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
This single command submits 1000 jobs that each run the script &amp;lt;code&amp;gt;myJob&amp;lt;/code&amp;gt;, reading from the input files &amp;lt;code&amp;gt;input.1, input.2, ... input.1000&amp;lt;/code&amp;gt; and writing the output of each job to the files &amp;lt;code&amp;gt;output.1, output.2, ... output.1000&amp;lt;/code&amp;gt;.&lt;br /&gt;
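&lt;br /&gt;
Inside each array element LSF sets the environment variable &amp;lt;code&amp;gt;LSB_JOBINDEX&amp;lt;/code&amp;gt; to that element's index, which a wrapper script can use to pick its own files. The sketch below is a hypothetical wrapper, not part of the LSF example; it falls back to index 1 so that it can be tried outside LSF. The last line prints a variant of the command above with a slot limit of 10 concurrently running elements.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/bash
# Hypothetical wrapper that one array element could run.
# LSF sets LSB_JOBINDEX to the element's index; default to 1 when testing outside LSF.
index="${LSB_JOBINDEX:-1}"
input_file="input.${index}"
output_file="output.${index}"
echo "element ${index} reads ${input_file} and writes ${output_file}"

# To cap how many elements run at once, a slot limit can follow the index range:
echo 'bsub -J "myArray[1-1000]%10" -i "input.%I" -o "output.%I" myJob'
```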
&lt;br /&gt;
=== Complicated Jobs ===&lt;br /&gt;
To run the same program with multiple files, possibly with different options, you can create a job submission script that iterates over the files and submits the jobs.&lt;br /&gt;
 &lt;br /&gt;
For example, suppose you have &amp;lt;code&amp;gt;programA&amp;lt;/code&amp;gt; and want to process &amp;lt;code&amp;gt;input.1, input.2, ... input.N&amp;lt;/code&amp;gt; with output in &amp;lt;code&amp;gt;output.1, output.2, ... output.N&amp;lt;/code&amp;gt;, as in the array example.&lt;br /&gt;
&lt;br /&gt;
Create a bash script &amp;lt;code&amp;gt;do_submit_programA.bash&amp;lt;/code&amp;gt; that looks something like:&lt;br /&gt;
&lt;br /&gt;
 n=&amp;lt;N&amp;gt;&lt;br /&gt;
 arguments=&amp;lt;nodes, memory, time constraints, etc&amp;gt; &lt;br /&gt;
 for ((i=1; i&amp;lt;=$n; i++)); do&lt;br /&gt;
    bsub -oo log.$i $arguments &amp;quot;programA &amp;lt; input.$i &amp;gt; output.$i&amp;quot;&lt;br /&gt;
 done&lt;br /&gt;
 &lt;br /&gt;
Note that everything in angle brackets here is a placeholder, not real code. For example &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; might be read from a command line argument or hardcoded as, say, 10. The arguments will be something like &amp;lt;code&amp;gt;-n 1 -M 100MB -R &amp;quot;rusage[mem=100MB]&amp;quot;&amp;lt;/code&amp;gt; plus any other desired options. You can run multiple types of jobs with complex arguments.&lt;br /&gt;
&lt;br /&gt;
You may wish to create separate directories for the log files, input files, and output files if there are more than a handful of jobs.&lt;br /&gt;
 &lt;br /&gt;
If each job requires nontrivial processing (e.g. changing into different directories for each job) then you may want to create a second script that generates the jobfiles and then use a similar kind of submit script.&lt;br /&gt;
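&lt;br /&gt;
One way to test such a submission script safely is a dry-run mode that prints each command before anything is submitted. The sketch below is an illustration only: the &amp;lt;code&amp;gt;DRY_RUN&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;N_JOBS&amp;lt;/code&amp;gt; variables, the &amp;lt;code&amp;gt;run&amp;lt;/code&amp;gt; helper, and the &amp;lt;code&amp;gt;logs&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;inputs&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;outputs&amp;lt;/code&amp;gt; directories are assumptions, and the directories are expected to exist already.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/bash
# Dry-run sketch of a submit script: prints one bsub command per input file.
# Set DRY_RUN=0 on the cluster to submit for real.
n="${N_JOBS:-3}"                                     # number of inputs, default 3
args=(-n 1 -M 100MB -R "rusage[mem=100MB]" -W 1:0)   # example resource options

# Print the command in shell-quoted form, or execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%q ' "$@"
        echo
    else
        "$@"
    fi
}

submit_all() {
    local i
    for i in $(seq 1 "$n"); do
        # The quoted command string runs on the compute node, so its redirect
        # applies to programA rather than to bsub itself.
        run bsub -oo "logs/log.$i" "${args[@]}" -i "inputs/input.$i" \
            "programA > outputs/output.$i"
    done
}

submit_all
```
&lt;br /&gt;
Keeping the options in a bash array preserves the quoting of arguments such as the &amp;lt;code&amp;gt;rusage&amp;lt;/code&amp;gt; string when they are expanded.&lt;br /&gt;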
&lt;br /&gt;
=== Interactive Jobs ===&lt;br /&gt;
&lt;br /&gt;
Some jobs require user interaction, such as testing code on a GPU node or running an interactive analytics program.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -I&amp;lt;/code&amp;gt; requests an interactive job that will print its output to your terminal.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -Ip&amp;lt;/code&amp;gt; requests an interactive job with a pseudo terminal. For example, this can be used to schedule a console program that reads user input and writes its output to the terminal.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -Is&amp;lt;/code&amp;gt; requests an interactive job with a shell. This can be used to test code on one of the gpu nodes or for more resource intensive development than is allowed on the login nodes.&lt;br /&gt;
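&lt;br /&gt;
As an illustration, a small helper function can bundle the options for a short interactive GPU session. The function name &amp;lt;code&amp;gt;gpu_shell&amp;lt;/code&amp;gt; and its one-hour default are assumptions; defining the function submits nothing until it is called on a login node.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/bash
# Hypothetical helper: open an interactive bash shell on a GPU node.
# Usage: gpu_shell [HOURS:MINUTES]
gpu_shell() {
    local walltime="$1"
    # Default to a one hour limit so that a forgotten session ends on its own.
    if [ -z "$walltime" ]; then
        walltime="1:0"
    fi
    bsub -Is -gpu - -W "$walltime" -n 1 bash
}
```
&lt;br /&gt;
A short runtime limit is a simple safeguard against leaving an idle interactive job on a node.&lt;br /&gt;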
&lt;br /&gt;
Note that interactive jobs are subject to the same time and memory constraints as typical batch jobs. Please make sure that your interactive job does not interfere with other jobs running on a node and does not attempt to use more resources than you have requested. Please do not leave interactive jobs running for long periods or idle when you are not using them.&lt;br /&gt;
&lt;br /&gt;
We do not currently treat interactive jobs differently from any other jobs. As DeepSense becomes more heavily utilized we may need to limit the number of interactive jobs run by a user, project, or on a given node, as well as the time or other resources used by interactive jobs.&lt;br /&gt;
&lt;br /&gt;
== Job Information ==&lt;br /&gt;
&lt;br /&gt;
=== Running Jobs ===&lt;br /&gt;
 &lt;br /&gt;
To examine currently running jobs you use the &amp;lt;code&amp;gt;bjobs&amp;lt;/code&amp;gt; command (https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bjobs.man_top.1.html)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bjobs -l&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;bjobs -l &amp;lt;jobid&amp;gt;&amp;lt;/code&amp;gt; shows additional job information including job status and resource usage.&lt;br /&gt;
&lt;br /&gt;
=== Past Jobs ===&lt;br /&gt;
&lt;br /&gt;
To examine current and past jobs use the &amp;lt;code&amp;gt;bhist&amp;lt;/code&amp;gt; command (https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bhist.1.html).&lt;br /&gt;
&lt;br /&gt;
The following options will show jobs with the specified status:&lt;br /&gt;
 -a all&lt;br /&gt;
 -d finished&lt;br /&gt;
 -e exited&lt;br /&gt;
 -p pending&lt;br /&gt;
 -r running&lt;br /&gt;
 -s suspended&lt;br /&gt;
&lt;br /&gt;
You can use options like &amp;lt;code&amp;gt;-S start_time,end_time&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;-C start_time,end_time&amp;lt;/code&amp;gt; to find jobs that were submitted or completed, respectively, within the specified time interval. These options require using the &amp;lt;code&amp;gt;-a&amp;lt;/code&amp;gt; option.&lt;br /&gt;
&lt;br /&gt;
As with bjobs, you can use the &amp;lt;code&amp;gt;-l&amp;lt;/code&amp;gt; option for additional information and can also specify a specific known jobid as the last command argument.&lt;br /&gt;
&lt;br /&gt;
=== Available Hosts ===&lt;br /&gt;
 &lt;br /&gt;
To see the available hosts and how busy they are you use the &amp;lt;code&amp;gt;bhosts&amp;lt;/code&amp;gt; command (https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bhosts.1.html)&lt;br /&gt;
&lt;br /&gt;
== LSF Command Reference == &lt;br /&gt;
&lt;br /&gt;
The complete list of LSF commands with description is available [https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_kc_cmd_ref.html here].&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=LSF&amp;diff=80</id>
		<title>LSF</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=LSF&amp;diff=80"/>
		<updated>2019-08-26T15:00:05Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: /* Job Submission */  rusage memory limit&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[https://www.ibm.com/support/knowledgecenter/SSWRJV_10.1.0/ IBM Spectrum LSF] is the command line job submission system for submitting batch and interactive jobs on DeepSense computing hardware.&lt;br /&gt;
&lt;br /&gt;
== Test code and short computation ==&lt;br /&gt;
DeepSense has two login nodes, login1.deepsense.ca and login2.deepsense.ca. You can access these through SSH with your username and password from any computer on campus. From off campus you’ll need to use the [https://wireless.dal.ca/vpnsoftware.php Dalhousie VPN].&lt;br /&gt;
&lt;br /&gt;
The login nodes are intended for testing and compiling code. Please don’t run long or intensive computation on these nodes.&lt;br /&gt;
&lt;br /&gt;
== Job Submission ==&lt;br /&gt;
When you have a small example working with your code and are ready to run a real workload, use the LSF queue to submit your jobs to the cluster (https://www.ibm.com/support/knowledgecenter/SSWRJV_10.1.0/lsf_users_guide/batch_jobs_about.html). If you’ve used other queuing systems like slurm or Sun Grid Engine before then LSF will seem very familiar.&lt;br /&gt;
 &lt;br /&gt;
To submit a job you use the &amp;lt;code&amp;gt;bsub&amp;lt;/code&amp;gt; command (https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bsub.man_top.1.html).&lt;br /&gt;
 &lt;br /&gt;
For example, to submit a shared memory job using 20 processors and 256GB of memory for at most 24 hours you would run:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -oo &amp;lt;output_file&amp;gt; -n 20 -M 256000 -W 24:0 -R &amp;quot;span[hosts=1] rusage[mem=256000]&amp;quot; &amp;lt;executable&amp;gt; [options]&amp;lt;/code&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
For OpenMP jobs, please make sure that you use &amp;lt;code&amp;gt;OMP_NUM_THREADS&amp;lt;/code&amp;gt; to limit the number of threads your program uses, and that you set this variable in the code that will run on the server. LSF sets a variable &amp;lt;code&amp;gt;$LSB_DJOB_NUMPROC&amp;lt;/code&amp;gt; that you can use if you don’t want to hardcode &amp;lt;code&amp;gt;OMP_NUM_THREADS&amp;lt;/code&amp;gt; or set it with your own variable.&lt;br /&gt;
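&lt;br /&gt;
A minimal sketch of a job wrapper that does this is shown below. The executable name &amp;lt;code&amp;gt;./my_openmp_program&amp;lt;/code&amp;gt; is hypothetical, and the fallback to one thread outside LSF is an assumption made so the script can be tried on a login node.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/bash
# Hypothetical wrapper script for an OpenMP program submitted through bsub.
# LSB_DJOB_NUMPROC is set by LSF inside a job; fall back to 1 thread elsewhere.
export OMP_NUM_THREADS="${LSB_DJOB_NUMPROC:-1}"
echo "Running with OMP_NUM_THREADS=${OMP_NUM_THREADS}"

# On the cluster, start the program here (hypothetical executable):
# ./my_openmp_program "$@"
```
&lt;br /&gt;
Submitting the wrapper with &amp;lt;code&amp;gt;-n 20&amp;lt;/code&amp;gt; would then give the program 20 threads without hardcoding the number in two places.&lt;br /&gt;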
&lt;br /&gt;
=== CPU Limit ===&lt;br /&gt;
The number of requested processors is specified with the option &amp;lt;code&amp;gt;-n&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The resource request &amp;lt;code&amp;gt;-R &amp;quot;span[hosts=1]&amp;quot;&amp;lt;/code&amp;gt; requires that all processors are on the same compute host, i.e. a shared memory job.&lt;br /&gt;
&lt;br /&gt;
LSF can also be used to run compute jobs across multiple hosts such as MPI jobs. Examples will be included here at a later date.&lt;br /&gt;
&lt;br /&gt;
=== Memory Limit === &lt;br /&gt;
LSF has two different types of memory limits.&lt;br /&gt;
The scheduler memory limit &amp;lt;code&amp;gt;-R &amp;quot;rusage[mem=&amp;lt;memlimit&amp;gt;]&amp;quot;&amp;lt;/code&amp;gt; requests &amp;lt;code&amp;gt;&amp;lt;memlimit&amp;gt;&amp;lt;/code&amp;gt; of memory. Your job will not start until a compute node with that amount of memory is available, and you are guaranteed to have this amount of memory available. If you exceed the requested amount then your job may be killed, but only if other jobs need that memory.&lt;br /&gt;
&lt;br /&gt;
The job memory limit &amp;lt;code&amp;gt;-M &amp;lt;memlimit&amp;gt;&amp;lt;/code&amp;gt; will kill your job if it exceeds the given memory limit. Note that this option does not guarantee that you will have that amount of memory available.&lt;br /&gt;
&lt;br /&gt;
The memory limits are specified in MB by default. You can also specify units, e.g. &amp;lt;code&amp;gt;-M 256GB&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;-R &amp;quot;rusage[mem=256GB]&amp;quot;&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
If you are using more than a few GB of memory then you must specify the &amp;lt;code&amp;gt;-R &amp;quot;rusage[mem=&amp;lt;memlimit&amp;gt;]&amp;quot;&amp;lt;/code&amp;gt; option or your job may be terminated. You may additionally want to use the &amp;lt;code&amp;gt;-M &amp;lt;memlimit&amp;gt;&amp;lt;/code&amp;gt; option to be sure you aren&amp;#039;t using more memory than intended.&lt;br /&gt;
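&lt;br /&gt;
Because the two limits are usually set to the same value, a small helper can keep them in sync. The sketch below only prints the resulting command; the function name &amp;lt;code&amp;gt;mem_request&amp;lt;/code&amp;gt; and the example program are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/bash
# Print (do not run) a bsub command that sets both memory limits to one value.
# Usage: mem_request MEMLIMIT COMMAND [ARGS...]
mem_request() {
    local mem="$1"
    shift
    # -R rusage[mem=...] reserves the memory with the scheduler;
    # -M kills the job if it grows past the same amount.
    echo "bsub -M ${mem} -R \"rusage[mem=${mem}]\" $*"
}

mem_request 16GB ./programA input.1
```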
&lt;br /&gt;
=== Time Limit ===&lt;br /&gt;
The runtime limit &amp;lt;code&amp;gt;-W hours:minutes&amp;lt;/code&amp;gt; specifies the maximum length of time your job is allowed to run.&lt;br /&gt;
For example &amp;lt;code&amp;gt;-W 24:0&amp;lt;/code&amp;gt; requests 24 hours of running time.&lt;br /&gt;
Your job will be terminated when the runtime limit is exceeded.&lt;br /&gt;
&lt;br /&gt;
If you do not specify a runtime limit then the default runtime limit of 168 hours (7 days) will be used.&lt;br /&gt;
The maximum possible runtime limit is currently 30 days and may vary by queue in the future.&lt;br /&gt;
&lt;br /&gt;
If there is a scheduled maintenance window announced then any job with a run time limit that could extend into the maintenance period will be listed as pending and will not run until the maintenance has concluded. Use a shorter run time limit that ends before the maintenance period to avoid this.&lt;br /&gt;
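&lt;br /&gt;
To pick a runtime limit that fits before a maintenance window, the remaining time can be converted into the &amp;lt;code&amp;gt;hours:minutes&amp;lt;/code&amp;gt; form that &amp;lt;code&amp;gt;-W&amp;lt;/code&amp;gt; expects. The sketch below is an illustration only: the function name &amp;lt;code&amp;gt;walltime_before&amp;lt;/code&amp;gt; is hypothetical, both times are assumed to be given in seconds since the epoch, and no safety margin is subtracted.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/bash
# Print the largest -W value (hours:minutes) that does not run past a maintenance window.
# Usage: walltime_before NOW_EPOCH MAINTENANCE_EPOCH   (seconds since the epoch)
walltime_before() {
    local now="$1"
    local maint="$2"
    local mins=$(( (maint - now) / 60 ))
    # Nothing fits if the window has already started.
    if [ "$mins" -le 0 ]; then
        return 1
    fi
    echo "$(( mins / 60 )):$(( mins % 60 ))"
}

# Example: maintenance begins 9000 seconds (2.5 hours) after a reference time of 0.
echo "bsub -W $(walltime_before 0 9000) ./programA"
```
&lt;br /&gt;
On the cluster the first argument would typically come from &amp;lt;code&amp;gt;date +%s&amp;lt;/code&amp;gt;, with the second computed from the announced start of the maintenance.&lt;br /&gt;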
&lt;br /&gt;
=== GPU Computation ===&lt;br /&gt;
&lt;br /&gt;
To request access to a GPU use the &amp;lt;code&amp;gt;-gpu -&amp;lt;/code&amp;gt; option.&lt;br /&gt;
&lt;br /&gt;
Note the trailing dash, which specifies the default GPU arguments. The following options can be used in place of that dash.&lt;br /&gt;
&lt;br /&gt;
The default GPU arguments are &amp;lt;code&amp;gt;&amp;quot;num=1:mode=shared:mps=no:j_exclusive=no&amp;quot;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;num=num_gpus&amp;lt;/code&amp;gt; is the number of requested GPUs on each host.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;mode=shared | exclusive_process&amp;lt;/code&amp;gt; specifies the GPU mode.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;mps=yes | no&amp;lt;/code&amp;gt; turns the NVIDIA Multi-Process Service (MPS) on or off. MPS allows better sharing of GPU resources. If &amp;lt;code&amp;gt;mode=exclusive_process&amp;lt;/code&amp;gt; then &amp;lt;code&amp;gt;mps&amp;lt;/code&amp;gt; should be set to &amp;lt;code&amp;gt;yes&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;j_exclusive=yes | no&amp;lt;/code&amp;gt; specifies whether the GPU is exclusive to this job and therefore unavailable to other jobs.&lt;br /&gt;
&lt;br /&gt;
By default the &amp;lt;code&amp;gt;-gpu -&amp;lt;/code&amp;gt; option will request one nonexclusive GPU. Please limit your usage of GPU resources to a reasonable number of concurrently used GPUs and use shared GPUs when possible. We may enact limits on GPU use in the future if necessary.&lt;br /&gt;
&lt;br /&gt;
See the [https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bsub.gpu.1.html bsub.gpu] documentation for more information on submitting GPU jobs.&lt;br /&gt;
&lt;br /&gt;
=== Input and Output files ===&lt;br /&gt;
If you do not specify an output file with &amp;lt;code&amp;gt;-o&amp;lt;/code&amp;gt; (append) or &amp;lt;code&amp;gt;-oo&amp;lt;/code&amp;gt; (overwrite) then the output will be lost. Note that LSF will prepend submission information to this file. You can use typical linux options like &amp;lt;code&amp;gt;&amp;gt; output_file2&amp;lt;/code&amp;gt; in which case the file specified with &amp;lt;code&amp;gt;-oo&amp;lt;/code&amp;gt; will just contain any errors and submission information.&lt;br /&gt;
&lt;br /&gt;
You can specify an input file with the &amp;lt;code&amp;gt;-i&amp;lt;/code&amp;gt; option or the typical linux option &amp;lt;code&amp;gt;&amp;lt; &amp;lt;input_file&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Advanced Job Submission ==&lt;br /&gt;
&lt;br /&gt;
=== Array Jobs ===&lt;br /&gt;
To run the same program multiple times with different input and output files you can use [https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_admin/job_arrays_lsf.html LSF Array Jobs].&lt;br /&gt;
&lt;br /&gt;
An example command in the LSF documentation is given as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt; bsub -J &amp;quot;myArray[1-1000]&amp;quot; -i &amp;quot;input.%I&amp;quot; -o &amp;quot;output.%I&amp;quot; myJob&amp;lt;/code&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
This single command submits 1000 jobs that each run the script &amp;lt;code&amp;gt;myJob&amp;lt;/code&amp;gt;, reading from the input files &amp;lt;code&amp;gt;input.1, input.2, ... input.1000&amp;lt;/code&amp;gt; and writing the output of each job to the files &amp;lt;code&amp;gt;output.1, output.2, ... output.1000&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Complicated Jobs ===&lt;br /&gt;
To run the same program with multiple files, possibly with different options, you can create a job submission script that iterates over the files and submits the jobs.&lt;br /&gt;
 &lt;br /&gt;
For example, suppose you have &amp;lt;code&amp;gt;programA&amp;lt;/code&amp;gt; and want to process &amp;lt;code&amp;gt;input.1, input.2, ... input.N&amp;lt;/code&amp;gt; with output in &amp;lt;code&amp;gt;output.1, output.2, ... output.N&amp;lt;/code&amp;gt;, as in the array example.&lt;br /&gt;
&lt;br /&gt;
Create a bash script &amp;lt;code&amp;gt;do_submit_programA.bash&amp;lt;/code&amp;gt; that looks something like:&lt;br /&gt;
&lt;br /&gt;
 n=&amp;lt;N&amp;gt;&lt;br /&gt;
 arguments=&amp;lt;nodes, memory, time constraints, etc&amp;gt; &lt;br /&gt;
 for ((i=1; i&amp;lt;=$n; i++)); do&lt;br /&gt;
    bsub -oo log.$i $arguments &amp;quot;programA &amp;lt; input.$i &amp;gt; output.$i&amp;quot;&lt;br /&gt;
 done&lt;br /&gt;
 &lt;br /&gt;
Note that everything in angle brackets here is a placeholder, not real code. For example &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; might be read from a command line argument or hardcoded as, say, 10. The arguments will be something like &amp;lt;code&amp;gt;-n 1 -M 100MB&amp;lt;/code&amp;gt; plus any other desired options. You can run multiple types of jobs with complex arguments.&lt;br /&gt;
&lt;br /&gt;
You may wish to create separate directories for the log files, input files, and output files if there are more than a handful of jobs.&lt;br /&gt;
 &lt;br /&gt;
If each job requires nontrivial processing (e.g. changing into different directories for each job) then you may want to create a second script that generates the jobfiles and then use a similar kind of submit script.&lt;br /&gt;
&lt;br /&gt;
=== Interactive Jobs ===&lt;br /&gt;
&lt;br /&gt;
Some jobs require user interaction, such as testing code on a GPU node or running an interactive analytics program.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -I&amp;lt;/code&amp;gt; requests an interactive job that will print its output to your terminal.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -Ip&amp;lt;/code&amp;gt; requests an interactive job with a pseudo terminal. For example, this can be used to schedule a console program that reads user input and writes its output to the terminal.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -Is&amp;lt;/code&amp;gt; requests an interactive job with a shell. This can be used to test code on one of the gpu nodes or for more resource intensive development than is allowed on the login nodes.&lt;br /&gt;
&lt;br /&gt;
Note that interactive jobs are subject to the same time and memory constraints as typical batch jobs. Please make sure that your interactive job does not interfere with other jobs running on a node and does not attempt to use more resources than you have requested. Please do not leave interactive jobs running for long periods or idle when you are not using them.&lt;br /&gt;
&lt;br /&gt;
We do not currently treat interactive jobs differently from any other jobs. As DeepSense becomes more heavily utilized we may need to limit the number of interactive jobs run by a user, project, or on a given node, as well as the time or other resources used by interactive jobs.&lt;br /&gt;
&lt;br /&gt;
== Job Information ==&lt;br /&gt;
&lt;br /&gt;
=== Running Jobs ===&lt;br /&gt;
 &lt;br /&gt;
To examine currently running jobs you use the &amp;lt;code&amp;gt;bjobs&amp;lt;/code&amp;gt; command (https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bjobs.man_top.1.html)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bjobs -l&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;bjobs -l &amp;lt;jobid&amp;gt;&amp;lt;/code&amp;gt; shows additional job information including job status and resource usage.&lt;br /&gt;
&lt;br /&gt;
=== Past Jobs ===&lt;br /&gt;
&lt;br /&gt;
To examine current and past jobs use the &amp;lt;code&amp;gt;bhist&amp;lt;/code&amp;gt; command (https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bhist.1.html).&lt;br /&gt;
&lt;br /&gt;
The following options will show jobs with the specified status:&lt;br /&gt;
 -a all&lt;br /&gt;
 -d finished&lt;br /&gt;
 -e exited&lt;br /&gt;
 -p pending&lt;br /&gt;
 -r running&lt;br /&gt;
 -s suspended&lt;br /&gt;
&lt;br /&gt;
You can use options like &amp;lt;code&amp;gt;-S start_time,end_time&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;-C start_time,end_time&amp;lt;/code&amp;gt; to find jobs that were submitted or completed, respectively, within the specified time interval. These options require using the &amp;lt;code&amp;gt;-a&amp;lt;/code&amp;gt; option.&lt;br /&gt;
&lt;br /&gt;
As with bjobs, you can use the &amp;lt;code&amp;gt;-l&amp;lt;/code&amp;gt; option for additional information and can also specify a specific known jobid as the last command argument.&lt;br /&gt;
&lt;br /&gt;
=== Available Hosts ===&lt;br /&gt;
 &lt;br /&gt;
To see the available hosts and how busy they are you use the &amp;lt;code&amp;gt;bhosts&amp;lt;/code&amp;gt; command (https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bhosts.1.html)&lt;br /&gt;
&lt;br /&gt;
== LSF Command Reference == &lt;br /&gt;
&lt;br /&gt;
The complete list of LSF commands with description is available [https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_kc_cmd_ref.html here].&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=LSF&amp;diff=70</id>
		<title>LSF</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=LSF&amp;diff=70"/>
		<updated>2019-07-03T19:23:09Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: /* Time Limit */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[https://www.ibm.com/support/knowledgecenter/SSWRJV_10.1.0/ IBM Spectrum LSF] is the command line job submission system for submitting batch and interactive jobs on DeepSense computing hardware.&lt;br /&gt;
&lt;br /&gt;
== Test code and short computation ==&lt;br /&gt;
DeepSense has two login nodes, login1.deepsense.ca and login2.deepsense.ca. You can access these through SSH with your username and password from any computer on campus. From off campus you’ll need to use the [https://wireless.dal.ca/vpnsoftware.php Dalhousie VPN].&lt;br /&gt;
&lt;br /&gt;
The login nodes are intended for testing and compiling code. Please don’t run long or intensive computation on these nodes.&lt;br /&gt;
&lt;br /&gt;
== Job Submission ==&lt;br /&gt;
When you have a small example working with your code and are ready to run a real workload, use the LSF queue to submit your jobs to the cluster (https://www.ibm.com/support/knowledgecenter/SSWRJV_10.1.0/lsf_users_guide/batch_jobs_about.html). If you’ve used other queuing systems like slurm or Sun Grid Engine before then LSF will seem very familiar.&lt;br /&gt;
 &lt;br /&gt;
To submit a job you use the &amp;lt;code&amp;gt;bsub&amp;lt;/code&amp;gt; command (https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bsub.man_top.1.html).&lt;br /&gt;
 &lt;br /&gt;
For example, to submit a shared memory job using 20 processors and 256GB of memory for at most 24 hours you would run:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -oo &amp;lt;output_file&amp;gt; -n 20 -M 256000 -W 24:0 -R &amp;quot;span[hosts=1]&amp;quot; &amp;lt;executable&amp;gt; [options]&amp;lt;/code&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
For OpenMP jobs, please make sure that you use &amp;lt;code&amp;gt;OMP_NUM_THREADS&amp;lt;/code&amp;gt; to limit the number of threads your program uses, and that you set this variable in the code that will run on the server. LSF sets a variable &amp;lt;code&amp;gt;$LSB_DJOB_NUMPROC&amp;lt;/code&amp;gt; that you can use if you don’t want to hardcode &amp;lt;code&amp;gt;OMP_NUM_THREADS&amp;lt;/code&amp;gt; or set it with your own variable.&lt;br /&gt;
&lt;br /&gt;
=== CPU Limit ===&lt;br /&gt;
The number of requested processors is specified with the option &amp;lt;code&amp;gt;-n&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The resource request &amp;lt;code&amp;gt;-R &amp;quot;span[hosts=1]&amp;quot;&amp;lt;/code&amp;gt; requires that all processors are on the same compute host, i.e. a shared memory job.&lt;br /&gt;
&lt;br /&gt;
LSF can also be used to run compute jobs across multiple hosts such as MPI jobs. Examples will be included here at a later date.&lt;br /&gt;
&lt;br /&gt;
=== Memory Limit === &lt;br /&gt;
The memory limit &amp;lt;code&amp;gt;-M&amp;lt;/code&amp;gt; is specified in MB by default. You can also specify units, e.g. &amp;lt;code&amp;gt;-M 256GB&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Time Limit ===&lt;br /&gt;
The runtime limit &amp;lt;code&amp;gt;-W hours:minutes&amp;lt;/code&amp;gt; specifies the maximum length of time your job is allowed to run.&lt;br /&gt;
For example &amp;lt;code&amp;gt;-W 24:0&amp;lt;/code&amp;gt; requests 24 hours of running time.&lt;br /&gt;
Your job will be terminated when the runtime limit is exceeded.&lt;br /&gt;
&lt;br /&gt;
If you do not specify a runtime limit then the default runtime limit of 168 hours (7 days) will be used.&lt;br /&gt;
The maximum possible runtime limit is currently 30 days and may vary by queue in the future.&lt;br /&gt;
&lt;br /&gt;
If there is a scheduled maintenance window announced then any job with a run time limit that could extend into the maintenance period will be listed as pending and will not run until the maintenance has concluded. Use a shorter run time limit that ends before the maintenance period to avoid this.&lt;br /&gt;
&lt;br /&gt;
=== GPU Computation ===&lt;br /&gt;
&lt;br /&gt;
To request access to a GPU use the &amp;lt;code&amp;gt;-gpu -&amp;lt;/code&amp;gt; option.&lt;br /&gt;
&lt;br /&gt;
Note the trailing dash, which specifies the default GPU arguments. The following options can be used in place of that dash.&lt;br /&gt;
&lt;br /&gt;
The default GPU arguments are &amp;lt;code&amp;gt;&amp;quot;num=1:mode=shared:mps=no:j_exclusive=no&amp;quot;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;num=num_gpus&amp;lt;/code&amp;gt; is the number of requested GPUs on each host.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;mode=shared | exclusive_process&amp;lt;/code&amp;gt; specifies the GPU mode.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;mps=yes | no&amp;lt;/code&amp;gt; turns the NVIDIA Multi-Process Service (MPS) on or off. MPS allows better sharing of GPU resources. If &amp;lt;code&amp;gt;mode=exclusive_process&amp;lt;/code&amp;gt; then &amp;lt;code&amp;gt;mps&amp;lt;/code&amp;gt; should be set to &amp;lt;code&amp;gt;yes&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;j_exclusive=yes | no&amp;lt;/code&amp;gt; specifies whether the GPU is exclusive to this job and therefore unavailable to other jobs.&lt;br /&gt;
&lt;br /&gt;
By default the &amp;lt;code&amp;gt;-gpu -&amp;lt;/code&amp;gt; option will request one nonexclusive GPU. Please limit your usage of GPU resources to a reasonable number of concurrently used GPUs and use shared GPUs when possible. We may enact limits on GPU use in the future if necessary.&lt;br /&gt;
&lt;br /&gt;
See the [https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bsub.gpu.1.html bsub.gpu] documentation for more information on submitting GPU jobs.&lt;br /&gt;
&lt;br /&gt;
=== Input and Output files ===&lt;br /&gt;
If you do not specify an output file with &amp;lt;code&amp;gt;-o&amp;lt;/code&amp;gt; (append) or &amp;lt;code&amp;gt;-oo&amp;lt;/code&amp;gt; (overwrite) then the output will be lost. Note that LSF will prepend submission information to this file. You can use typical linux options like &amp;lt;code&amp;gt;&amp;gt; output_file2&amp;lt;/code&amp;gt; in which case the file specified with &amp;lt;code&amp;gt;-oo&amp;lt;/code&amp;gt; will just contain any errors and submission information.&lt;br /&gt;
&lt;br /&gt;
You can specify an input file with the &amp;lt;code&amp;gt;-i&amp;lt;/code&amp;gt; option or the typical linux option &amp;lt;code&amp;gt;&amp;lt; &amp;lt;input_file&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Advanced Job Submission ==&lt;br /&gt;
&lt;br /&gt;
=== Array Jobs ===&lt;br /&gt;
To run the same program multiple times with different input and output files you can use [https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_admin/job_arrays_lsf.html LSF Array Jobs].&lt;br /&gt;
&lt;br /&gt;
An example command in the LSF documentation is given as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt; bsub -J &amp;quot;myArray[1-1000]&amp;quot; -i &amp;quot;input.%I&amp;quot; -o &amp;quot;output.%I&amp;quot; myJob&amp;lt;/code&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
This single command submits 1000 jobs that each run the script &amp;lt;code&amp;gt;myJob&amp;lt;/code&amp;gt;, reading from the input files &amp;lt;code&amp;gt;input.1, input.2, ... input.1000&amp;lt;/code&amp;gt; and writing the output of each job to the files &amp;lt;code&amp;gt;output.1, output.2, ... output.1000&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Complicated Jobs ===&lt;br /&gt;
To run the same program with multiple files, possibly with different options, you can create a job submission script that iterates over the files and submits the jobs.&lt;br /&gt;
 &lt;br /&gt;
For example, suppose you have &amp;lt;code&amp;gt;programA&amp;lt;/code&amp;gt; and want to process &amp;lt;code&amp;gt;input.1, input.2, ... input.N&amp;lt;/code&amp;gt; with output in &amp;lt;code&amp;gt;output.1, output.2, ... output.N&amp;lt;/code&amp;gt;, as in the array example.&lt;br /&gt;
&lt;br /&gt;
Create a bash script &amp;lt;code&amp;gt;do_submit_programA.bash&amp;lt;/code&amp;gt; that looks something like:&lt;br /&gt;
&lt;br /&gt;
 n=&amp;lt;N&amp;gt;&lt;br /&gt;
 arguments=&amp;lt;nodes, memory, time constraints, etc&amp;gt; &lt;br /&gt;
 for ((i=1; i&amp;lt;=$n; i++)); do&lt;br /&gt;
    bsub -oo log.$i $arguments &amp;quot;programA &amp;lt; input.$i &amp;gt; output.$i&amp;quot;&lt;br /&gt;
 done&lt;br /&gt;
 &lt;br /&gt;
Note that everything in angle brackets here is a placeholder, not real code. For example &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; might be read from a command line argument or hardcoded as, say, 10. The arguments will be something like &amp;lt;code&amp;gt;-n 1 -M 100MB&amp;lt;/code&amp;gt; plus any other desired options. You can run multiple types of jobs with complex arguments.&lt;br /&gt;
&lt;br /&gt;
You may wish to create separate directories for the log files, input files, and output files if there are more than a handful of jobs.&lt;br /&gt;
 &lt;br /&gt;
If each job requires nontrivial processing (e.g. changing into different directories for each job) then you may want to create a second script that generates the jobfiles and then use a similar kind of submit script.&lt;br /&gt;
&lt;br /&gt;
=== Interactive Jobs ===&lt;br /&gt;
&lt;br /&gt;
Some jobs require user interaction, such as testing code on a GPU node or running an interactive analytics program.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -I&amp;lt;/code&amp;gt; requests an interactive job that will print its output to your terminal.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -Ip&amp;lt;/code&amp;gt; requests an interactive job with a pseudo terminal. For example, this can be used to schedule a console program that reads user input and writes its output to the terminal.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -Is&amp;lt;/code&amp;gt; requests an interactive job with a shell. This can be used to test code on one of the gpu nodes or for more resource intensive development than is allowed on the login nodes.&lt;br /&gt;
&lt;br /&gt;
Note that interactive jobs are subject to the same time and memory constraints as typical batch jobs. Please make sure that your interactive job does not interfere with other jobs running on a node and does not attempt to use more resources than you have requested. Please do not leave interactive jobs running for long periods or idle when you are not using them.&lt;br /&gt;
&lt;br /&gt;
We do not currently treat interactive jobs differently from any other jobs. As DeepSense becomes more heavily utilized we may need to limit the number of interactive jobs run by a user, project, or on a given node, as well as the time or other resources used by interactive jobs.&lt;br /&gt;
&lt;br /&gt;
== Job Information ==&lt;br /&gt;
&lt;br /&gt;
=== Running Jobs ===&lt;br /&gt;
 &lt;br /&gt;
To examine currently running jobs you use the &amp;lt;code&amp;gt;bjobs&amp;lt;/code&amp;gt; command (https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bjobs.man_top.1.html)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bjobs -l&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;bjobs -l &amp;lt;jobid&amp;gt;&amp;lt;/code&amp;gt; shows additional job information including job status and resource usage.&lt;br /&gt;
&lt;br /&gt;
=== Past Jobs ===&lt;br /&gt;
&lt;br /&gt;
To examine current and past jobs use the &amp;lt;code&amp;gt;bhist&amp;lt;/code&amp;gt; command (https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bhist.1.html).&lt;br /&gt;
&lt;br /&gt;
The following options will show jobs with the specified status:&lt;br /&gt;
 -a all&lt;br /&gt;
 -d finished&lt;br /&gt;
 -e exited&lt;br /&gt;
 -p pending&lt;br /&gt;
 -r running&lt;br /&gt;
 -s suspended&lt;br /&gt;
&lt;br /&gt;
You can use options like &amp;lt;code&amp;gt;-S start_time,end_time&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;-C start_time,end_time&amp;lt;/code&amp;gt; to find jobs that were submitted or completed, respectively, within the specified time interval. These options require the &amp;lt;code&amp;gt;-a&amp;lt;/code&amp;gt; option.&lt;br /&gt;
&lt;br /&gt;
As with &amp;lt;code&amp;gt;bjobs&amp;lt;/code&amp;gt;, you can use the &amp;lt;code&amp;gt;-l&amp;lt;/code&amp;gt; option for additional information, and you can give a known jobid as the last command argument.&lt;br /&gt;
&lt;br /&gt;
=== Available Hosts ===&lt;br /&gt;
 &lt;br /&gt;
To see the available hosts and how busy they are you use the &amp;lt;code&amp;gt;bhosts&amp;lt;/code&amp;gt; command (https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bhosts.1.html)&lt;br /&gt;
&lt;br /&gt;
== LSF Command Reference == &lt;br /&gt;
&lt;br /&gt;
The complete list of LSF commands with description is available [https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_kc_cmd_ref.html here].&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=LSF&amp;diff=69</id>
		<title>LSF</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=LSF&amp;diff=69"/>
		<updated>2019-07-03T19:08:38Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[https://www.ibm.com/support/knowledgecenter/SSWRJV_10.1.0/ IBM Spectrum LSF] is the command line job submission system for submitting batch and interactive jobs on DeepSense computing hardware.&lt;br /&gt;
&lt;br /&gt;
== Test code and short computation ==&lt;br /&gt;
DeepSense has two login nodes, login1.deepsense.ca and login2.deepsense.ca. You can access these through SSH with your username and password from any computer on campus. From off campus you’ll need to use the [https://wireless.dal.ca/vpnsoftware.php Dalhousie VPN].&lt;br /&gt;
&lt;br /&gt;
The login nodes are intended for testing and compiling code. Please don’t run long or intensive computation on these nodes.&lt;br /&gt;
&lt;br /&gt;
== Job Submission ==&lt;br /&gt;
When you have a small example working with your code and are ready to run a real workload, use the LSF queue to submit your jobs to the cluster (https://www.ibm.com/support/knowledgecenter/SSWRJV_10.1.0/lsf_users_guide/batch_jobs_about.html). If you’ve used other queuing systems like Slurm or Sun Grid Engine before, LSF will seem very familiar.&lt;br /&gt;
 &lt;br /&gt;
To submit a job you use the &amp;lt;code&amp;gt;bsub&amp;lt;/code&amp;gt; command (https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bsub.man_top.1.html).&lt;br /&gt;
 &lt;br /&gt;
For example, to submit a shared memory job using 20 processors and 256GB of memory for at most 24 hours you would run:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -oo &amp;lt;output_file&amp;gt; -n 20 -M 256000 -W 24:0 -R &amp;quot;span[hosts=1]&amp;quot; &amp;lt;executable&amp;gt; [options]&amp;lt;/code&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
For OpenMP jobs, please use &amp;lt;code&amp;gt;OMP_NUM_THREADS&amp;lt;/code&amp;gt; to limit the number of threads your program uses, and set this variable in the job itself so that it takes effect on the compute node. LSF sets the variable &amp;lt;code&amp;gt;$LSB_DJOB_NUMPROC&amp;lt;/code&amp;gt;, which you can use if you don’t want to hardcode &amp;lt;code&amp;gt;OMP_NUM_THREADS&amp;lt;/code&amp;gt; or set it with your own variable.&lt;br /&gt;
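&lt;br /&gt;
As a minimal sketch of that idea, the following shell fragment could sit at the top of a job script. The function name &amp;lt;code&amp;gt;set_omp_threads&amp;lt;/code&amp;gt; and the fallback of one thread are assumptions made for illustration; only &amp;lt;code&amp;gt;LSB_DJOB_NUMPROC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;OMP_NUM_THREADS&amp;lt;/code&amp;gt; come from the text above.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: copy the processor count that LSF grants the job
# into OMP_NUM_THREADS. Falls back to 1 when LSB_DJOB_NUMPROC is unset,
# for example when the script is tried on a login node outside LSF.
set_omp_threads() {
    OMP_NUM_THREADS="${LSB_DJOB_NUMPROC:-1}"
    export OMP_NUM_THREADS
}

set_omp_threads
echo "Using $OMP_NUM_THREADS OpenMP thread(s)"
```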
&lt;br /&gt;
=== CPU Limit ===&lt;br /&gt;
The number of requested processors is specified with the option &amp;lt;code&amp;gt;-n&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The resource request &amp;lt;code&amp;gt;-R &amp;quot;span[hosts=1]&amp;quot;&amp;lt;/code&amp;gt; requires that all processors are on the same compute host, i.e. a shared memory job.&lt;br /&gt;
&lt;br /&gt;
LSF can also be used to run compute jobs across multiple hosts such as MPI jobs. Examples will be included here at a later date.&lt;br /&gt;
&lt;br /&gt;
=== Memory Limit === &lt;br /&gt;
The memory limit &amp;lt;code&amp;gt;-M&amp;lt;/code&amp;gt; is specified in MB by default. You can also specify units, e.g. &amp;lt;code&amp;gt;-M 256GB&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Time Limit ===&lt;br /&gt;
The runtime limit &amp;lt;code&amp;gt;-W hours:minutes&amp;lt;/code&amp;gt; specifies the maximum length of time your job is allowed to run.&lt;br /&gt;
For example &amp;lt;code&amp;gt;-W 24:0&amp;lt;/code&amp;gt; requests 24 hours of running time.&lt;br /&gt;
Your job will be terminated when the runtime limit is exceeded.&lt;br /&gt;
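&lt;br /&gt;
If you track run times in minutes, a small conversion keeps the &amp;lt;code&amp;gt;hours:minutes&amp;lt;/code&amp;gt; form consistent. This is only a sketch; &amp;lt;code&amp;gt;minutes_to_w&amp;lt;/code&amp;gt; is a hypothetical helper name and not an LSF command.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: convert a whole number of minutes into the
# hours:minutes form used with bsub -W.
minutes_to_w() {
    printf '%d:%02d\n' "$(( $1 / 60 ))" "$(( $1 % 60 ))"
}

minutes_to_w 1440   # prints 24:00
minutes_to_w 90     # prints 1:30
```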
&lt;br /&gt;
If you do not specify a runtime limit then the default runtime limit of 168 hours (7 days) will be used.&lt;br /&gt;
The maximum possible runtime limit is currently 30 days and may vary by queue in the future.&lt;br /&gt;
&lt;br /&gt;
If there is a scheduled maintenance window announced then any job with a run time limit that could extend into the maintenance period will be listed as pending and will not run until the maintenance has concluded. Use a shorter run time limit that ends before the maintenance period to avoid this.&lt;br /&gt;
&lt;br /&gt;
=== GPU Computation ===&lt;br /&gt;
&lt;br /&gt;
To request access to a GPU use the &amp;lt;code&amp;gt;-gpu -&amp;lt;/code&amp;gt; option.&lt;br /&gt;
&lt;br /&gt;
Note the trailing dash, which specifies the default GPU arguments. The following options can be used in place of that dash.&lt;br /&gt;
&lt;br /&gt;
The default GPU arguments are &amp;lt;code&amp;gt;&amp;quot;num=1:mode=shared:mps=no:j_exclusive=no&amp;quot;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;num=num_gpus&amp;lt;/code&amp;gt; is the number of requested GPUs on each host.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;mode=shared | exclusive_process&amp;lt;/code&amp;gt; specifies the GPU mode.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;mps=yes | no&amp;lt;/code&amp;gt; specifies whether to use the NVIDIA Multi-Process Service (MPS). MPS enables better sharing of GPU resources. If &amp;lt;code&amp;gt;mode=exclusive_process&amp;lt;/code&amp;gt; then &amp;lt;code&amp;gt;mps&amp;lt;/code&amp;gt; should be set to &amp;lt;code&amp;gt;yes&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;j_exclusive=yes | no&amp;lt;/code&amp;gt; specifies whether the GPU is exclusive to this job and cannot be used by other jobs.&lt;br /&gt;
&lt;br /&gt;
By default the &amp;lt;code&amp;gt;-gpu -&amp;lt;/code&amp;gt; option will request one nonexclusive GPU. Please limit your usage of GPU resources to a reasonable number of concurrently used GPUs and use shared GPUs when possible. We may enact limits on GPU use in the future if necessary.&lt;br /&gt;
&lt;br /&gt;
See the [https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bsub.gpu.1.html bsub.gpu] documentation for more information on submitting GPU jobs.&lt;br /&gt;
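&lt;br /&gt;
The four settings described above are joined with colons into a single argument string. The sketch below assembles that string and prints the resulting command as a dry run instead of submitting it; &amp;lt;code&amp;gt;gpu_args&amp;lt;/code&amp;gt; is a hypothetical helper and &amp;lt;code&amp;gt;my_training_script&amp;lt;/code&amp;gt; is a placeholder executable name.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: build the argument string for bsub -gpu.
# Arguments are num, mode, mps, j_exclusive; omitted arguments fall
# back to the default values listed in this section.
gpu_args() {
    printf 'num=%s:mode=%s:mps=%s:j_exclusive=%s\n' \
        "${1:-1}" "${2:-shared}" "${3:-no}" "${4:-no}"
}

# Dry run: print the command rather than submitting a job.
echo bsub -gpu "\"$(gpu_args 2 exclusive_process yes yes)\"" my_training_script
```
&lt;br /&gt;
Calling &amp;lt;code&amp;gt;gpu_args&amp;lt;/code&amp;gt; with no arguments reproduces the default string, which is what the trailing dash requests.&lt;br /&gt;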
&lt;br /&gt;
=== Input and Output files ===&lt;br /&gt;
If you do not specify an output file with &amp;lt;code&amp;gt;-o&amp;lt;/code&amp;gt; (append) or &amp;lt;code&amp;gt;-oo&amp;lt;/code&amp;gt; (overwrite) then the output will be lost. Note that LSF will prepend submission information to this file. You can also use typical Linux redirection like &amp;lt;code&amp;gt;&amp;gt; output_file2&amp;lt;/code&amp;gt;, in which case the file specified with &amp;lt;code&amp;gt;-oo&amp;lt;/code&amp;gt; will contain only errors and submission information.&lt;br /&gt;
&lt;br /&gt;
You can specify an input file with the &amp;lt;code&amp;gt;-i&amp;lt;/code&amp;gt; option or the typical Linux redirection &amp;lt;code&amp;gt;&amp;lt; &amp;lt;input_file&amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Advanced Job Submission ==&lt;br /&gt;
&lt;br /&gt;
=== Array Jobs ===&lt;br /&gt;
To run the same program multiple times with different input and output files you can use [https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_admin/job_arrays_lsf.html LSF Array Jobs].&lt;br /&gt;
&lt;br /&gt;
An example command in the LSF documentation is given as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt; bsub -J &amp;quot;myArray[1-1000]&amp;quot; -i &amp;quot;input.%I&amp;quot; -o &amp;quot;output.%I&amp;quot; myJob&amp;lt;/code&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
This command uses only one line to submit 1000 jobs running the script &amp;lt;code&amp;gt;myJob&amp;lt;/code&amp;gt; with the input files &amp;lt;code&amp;gt;input.1, input.2, ... input.1000&amp;lt;/code&amp;gt; and with the output of each job placed in the files &amp;lt;code&amp;gt;output.1, output.2, ... output.1000&amp;lt;/code&amp;gt;.&lt;br /&gt;
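&lt;br /&gt;
Inside each array element, LSF also exposes the element index in the environment variable &amp;lt;code&amp;gt;LSB_JOBINDEX&amp;lt;/code&amp;gt;, which matches the value substituted for &amp;lt;code&amp;gt;%I&amp;lt;/code&amp;gt;. The sketch below shows how a job script might derive its own file names from it; the function name &amp;lt;code&amp;gt;array_files&amp;lt;/code&amp;gt; and the fallback index of 1 are assumptions so the fragment can be tried outside LSF.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: print the input and output file names that belong
# to the current array element. LSB_JOBINDEX is set by LSF for array jobs;
# the default of 1 only applies when the variable is unset.
array_files() {
    idx="${LSB_JOBINDEX:-1}"
    echo "input.$idx output.$idx"
}

array_files
```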
&lt;br /&gt;
=== Complicated Jobs ===&lt;br /&gt;
To run the same program with multiple files, possibly with different options, you can create a job submission script that iterates over the files and submits the jobs.&lt;br /&gt;
 &lt;br /&gt;
For example, suppose you have &amp;lt;code&amp;gt;programA&amp;lt;/code&amp;gt; and want to process &amp;lt;code&amp;gt;input.1, input.2, ... input.N&amp;lt;/code&amp;gt; with output in &amp;lt;code&amp;gt;output.1, output.2, ... output.N&amp;lt;/code&amp;gt;, as in the array example.&lt;br /&gt;
&lt;br /&gt;
Create a bash script &amp;lt;code&amp;gt;do_submit_programA.bash&amp;lt;/code&amp;gt; that looks something like:&lt;br /&gt;
&lt;br /&gt;
 n=&amp;lt;N&amp;gt;&lt;br /&gt;
 arguments=&amp;lt;nodes, memory, time constraints, etc&amp;gt; &lt;br /&gt;
 for ((i=1; i&amp;lt;=n; i++)); do&lt;br /&gt;
    # quote the job command so the redirections apply to programA, not to bsub&lt;br /&gt;
    bsub -oo log.$i $arguments &amp;quot;programA &amp;lt; input.$i &amp;gt; output.$i&amp;quot;&lt;br /&gt;
 done&lt;br /&gt;
 &lt;br /&gt;
Note that everything in angle brackets here is a placeholder, not real code. For example, &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; might be read from a command line argument or hardcoded as, say, 10. The arguments will be something like &amp;lt;code&amp;gt;-n 1 -M 100MB&amp;lt;/code&amp;gt; and any other desired options. You can run multiple types of jobs with complex arguments.&lt;br /&gt;
&lt;br /&gt;
You may wish to create separate directories for the log files, input files, and output files if there are more than a handful of jobs.&lt;br /&gt;
 &lt;br /&gt;
If each job requires nontrivial processing (e.g. changing into different directories for each job) then you may want to create a second script that generates the jobfiles and then use a similar kind of submit script.&lt;br /&gt;
&lt;br /&gt;
=== Interactive Jobs ===&lt;br /&gt;
&lt;br /&gt;
Some jobs require user input, such as testing code on a GPU system or running an interactive analytics program.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -I&amp;lt;/code&amp;gt; requests an interactive job that will print its output to your terminal.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -Ip&amp;lt;/code&amp;gt; requests an interactive job with a pseudo terminal. For example, this can be used to schedule a console program that takes user input and output.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -Is&amp;lt;/code&amp;gt; requests an interactive job with a shell. This can be used to test code on one of the gpu nodes or for more resource intensive development than is allowed on the login nodes.&lt;br /&gt;
&lt;br /&gt;
Note that interactive jobs are subject to the same time and memory constraints as typical batch jobs. Please take care not to interfere with other jobs running on a node, and make sure your interactive job does not attempt to use more resources than you have requested. Please do not leave interactive jobs running for long periods or idle when you are not using them.&lt;br /&gt;
&lt;br /&gt;
We do not currently treat interactive jobs differently from other jobs. As DeepSense becomes more heavily utilized we may need to limit the number of interactive jobs run by a user, by a project, or on a given node, and we may need to limit the time or other resources used by interactive jobs.&lt;br /&gt;
&lt;br /&gt;
== Job Information ==&lt;br /&gt;
&lt;br /&gt;
=== Running Jobs ===&lt;br /&gt;
 &lt;br /&gt;
To examine currently running jobs you use the &amp;lt;code&amp;gt;bjobs&amp;lt;/code&amp;gt; command (https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bjobs.man_top.1.html)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bjobs -l&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;bjobs -l &amp;lt;jobid&amp;gt;&amp;lt;/code&amp;gt; shows additional job information including job status and resource usage.&lt;br /&gt;
&lt;br /&gt;
=== Past Jobs ===&lt;br /&gt;
&lt;br /&gt;
To examine current and past jobs use the &amp;lt;code&amp;gt;bhist&amp;lt;/code&amp;gt; command (https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bhist.1.html).&lt;br /&gt;
&lt;br /&gt;
The following options will show jobs with the specified status:&lt;br /&gt;
 -a all&lt;br /&gt;
 -d finished&lt;br /&gt;
 -e exited&lt;br /&gt;
 -p pending&lt;br /&gt;
 -r running&lt;br /&gt;
 -s suspended&lt;br /&gt;
&lt;br /&gt;
You can use options like &amp;lt;code&amp;gt;-S start_time,end_time&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;-C start_time,end_time&amp;lt;/code&amp;gt; to find jobs that were submitted or completed, respectively, within the specified time interval. These options require the &amp;lt;code&amp;gt;-a&amp;lt;/code&amp;gt; option.&lt;br /&gt;
&lt;br /&gt;
As with &amp;lt;code&amp;gt;bjobs&amp;lt;/code&amp;gt;, you can use the &amp;lt;code&amp;gt;-l&amp;lt;/code&amp;gt; option for additional information, and you can give a known jobid as the last command argument.&lt;br /&gt;
&lt;br /&gt;
=== Available Hosts ===&lt;br /&gt;
 &lt;br /&gt;
To see the available hosts and how busy they are you use the &amp;lt;code&amp;gt;bhosts&amp;lt;/code&amp;gt; command (https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bhosts.1.html)&lt;br /&gt;
&lt;br /&gt;
== LSF Command Reference == &lt;br /&gt;
&lt;br /&gt;
The complete list of LSF commands with description is available [https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_kc_cmd_ref.html here].&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Terms_of_Use&amp;diff=65</id>
		<title>Terms of Use</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Terms_of_Use&amp;diff=65"/>
		<updated>2019-05-30T15:48:21Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: hide TOC&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
&lt;br /&gt;
== General Principles ==&lt;br /&gt;
DeepSense resources are provided to enhance the ocean analytics economy in Nova Scotia.&lt;br /&gt;
Users must conform to ethical standards, follow all applicable laws and regulations, and follow all applicable Acceptable Use Policies.&lt;br /&gt;
&lt;br /&gt;
== University Acceptable Use Policies ==&lt;br /&gt;
The use of DeepSense resources is subject to the Dalhousie University and Faculty of Computer Science Acceptable Use Policies:&lt;br /&gt;
&lt;br /&gt;
* [https://www.dal.ca/dept/university_secretariat/policies/information-management-and-technology/acceptable-use-policy-.html Dalhousie Acceptable Use Policy]&lt;br /&gt;
* [https://cdn.dal.ca/content/dam/dalhousie/pdf/faculty/computerscience/policies/fcs_policy_local.pdf Policy on Local Computer Resources]&lt;br /&gt;
&lt;br /&gt;
== Resource Priority ==&lt;br /&gt;
&lt;br /&gt;
DeepSense is intended for academic and commercial partnerships in ocean analytics, and priority will be given to these projects. Other uses may be approved for training and entrepreneurial projects, but these are subject to available resources, and it may be necessary at times to reduce the resources allocated to them.&lt;br /&gt;
&lt;br /&gt;
== Protection of Data ==&lt;br /&gt;
&lt;br /&gt;
All your data and backups are stored in Canada.&lt;br /&gt;
&lt;br /&gt;
Aspects of data management will follow industry best practices (as determined by DeepSense staff) unless specific arrangements are agreed for your project.&lt;br /&gt;
&lt;br /&gt;
Your data will be stored and protected for the duration of your project and a short grace period thereafter. It is your responsibility to safeguard and delete your data after your project is terminated. Alternative arrangements can be made to accommodate longer storage or sharing data between projects.&lt;br /&gt;
&lt;br /&gt;
=== Data Access ===&lt;br /&gt;
Members of your project team with DeepSense accounts can access your team&amp;#039;s data.&lt;br /&gt;
&lt;br /&gt;
=== Data Storage ===&lt;br /&gt;
Data is stored for the duration of your project unless you request deletion sooner. Deletion requests will be confirmed with the project lead.&lt;br /&gt;
&lt;br /&gt;
=== Confidentiality ===&lt;br /&gt;
All project data is confidential to your project team.&lt;br /&gt;
&lt;br /&gt;
== Acknowledging DeepSense ==&lt;br /&gt;
&lt;br /&gt;
Please [[Acknowledging_DeepSense|acknowledge DeepSense]] when publishing results that used DeepSense resources, including computing hardware, computing software, and staff expertise.&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=MediaWiki:Sidebar&amp;diff=64</id>
		<title>MediaWiki:Sidebar</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=MediaWiki:Sidebar&amp;diff=64"/>
		<updated>2019-05-30T15:47:45Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
* navigation&lt;br /&gt;
** mainpage|mainpage-description&lt;br /&gt;
* Support&lt;br /&gt;
** Contact information | Getting help&lt;br /&gt;
** Getting started | Getting started&lt;br /&gt;
** Transferring Data | Transferring Data&lt;br /&gt;
** LSF | Running jobs&lt;br /&gt;
** Known problems | Known issues&lt;br /&gt;
** mainpage | System status&lt;br /&gt;
* DeepSense&lt;br /&gt;
** https://deepsense.ca | DeepSense home page&lt;br /&gt;
** Acknowledging DeepSense | Acknowledging DeepSense&lt;br /&gt;
** Terms of Use | Terms of use&lt;br /&gt;
* Authoring&lt;br /&gt;
&amp;lt;!-- ** Guidelines | Guidelines --&amp;gt;&lt;br /&gt;
** helppage|MediaWiki help&lt;br /&gt;
** recentchanges-url|recentchanges&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Terms_of_Use&amp;diff=62</id>
		<title>Terms of Use</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Terms_of_Use&amp;diff=62"/>
		<updated>2019-05-30T15:43:00Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: Cwhidden moved page Acceptable Use Policy to Terms of Use&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== General Principles ==&lt;br /&gt;
DeepSense resources are provided to enhance the ocean analytics economy in Nova Scotia.&lt;br /&gt;
Users must conform to ethical standards, follow all applicable laws and regulations, and follow all applicable Acceptable Use Policies.&lt;br /&gt;
&lt;br /&gt;
== University Acceptable Use Policies ==&lt;br /&gt;
The use of DeepSense resources is subject to the Dalhousie University and Faculty of Computer Science Acceptable Use Policies:&lt;br /&gt;
&lt;br /&gt;
* [https://www.dal.ca/dept/university_secretariat/policies/information-management-and-technology/acceptable-use-policy-.html Dalhousie Acceptable Use Policy]&lt;br /&gt;
* [https://cdn.dal.ca/content/dam/dalhousie/pdf/faculty/computerscience/policies/fcs_policy_local.pdf Policy on Local Computer Resources]&lt;br /&gt;
&lt;br /&gt;
== Resource Priority ==&lt;br /&gt;
&lt;br /&gt;
DeepSense is intended for academic and commercial partnerships in ocean analytics, and priority will be given to these projects. Other uses may be approved for training and entrepreneurial projects, but these are subject to available resources, and it may be necessary at times to reduce the resources allocated to them.&lt;br /&gt;
&lt;br /&gt;
== Protection of Data ==&lt;br /&gt;
&lt;br /&gt;
All your data and backups are stored in Canada.&lt;br /&gt;
&lt;br /&gt;
Aspects of data management will follow industry best practices (as determined by DeepSense staff) unless specific arrangements are agreed for your project.&lt;br /&gt;
&lt;br /&gt;
Your data will be stored and protected for the duration of your project and a short grace period thereafter. It is your responsibility to safeguard and delete your data after your project is terminated. Alternative arrangements can be made to accommodate longer storage or sharing data between projects.&lt;br /&gt;
&lt;br /&gt;
=== Data Access ===&lt;br /&gt;
Members of your project team with DeepSense accounts can access your team&amp;#039;s data.&lt;br /&gt;
&lt;br /&gt;
=== Data Storage ===&lt;br /&gt;
Data is stored for the duration of your project unless you request deletion sooner. Deletion requests will be confirmed with the project lead.&lt;br /&gt;
&lt;br /&gt;
=== Confidentiality ===&lt;br /&gt;
All project data is confidential to your project team.&lt;br /&gt;
&lt;br /&gt;
== Acknowledging DeepSense ==&lt;br /&gt;
&lt;br /&gt;
Please [[Acknowledging_DeepSense|acknowledge DeepSense]] when publishing results that used DeepSense resources, including computing hardware, computing software, and staff expertise.&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Acceptable_Use_Policy&amp;diff=63</id>
		<title>Acceptable Use Policy</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Acceptable_Use_Policy&amp;diff=63"/>
		<updated>2019-05-30T15:43:00Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: Cwhidden moved page Acceptable Use Policy to Terms of Use&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;#REDIRECT [[Terms of Use]]&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Terms_of_Use&amp;diff=61</id>
		<title>Terms of Use</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Terms_of_Use&amp;diff=61"/>
		<updated>2019-05-30T15:42:22Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== General Principles ==&lt;br /&gt;
DeepSense resources are provided to enhance the ocean analytics economy in Nova Scotia.&lt;br /&gt;
Users must conform to ethical standards, follow all applicable laws and regulations, and follow all applicable Acceptable Use Policies.&lt;br /&gt;
&lt;br /&gt;
== University Acceptable Use Policies ==&lt;br /&gt;
The use of DeepSense resources is subject to the Dalhousie University and Faculty of Computer Science Acceptable Use Policies:&lt;br /&gt;
&lt;br /&gt;
* [https://www.dal.ca/dept/university_secretariat/policies/information-management-and-technology/acceptable-use-policy-.html Dalhousie Acceptable Use Policy]&lt;br /&gt;
* [https://cdn.dal.ca/content/dam/dalhousie/pdf/faculty/computerscience/policies/fcs_policy_local.pdf Policy on Local Computer Resources]&lt;br /&gt;
&lt;br /&gt;
== Resource Priority ==&lt;br /&gt;
&lt;br /&gt;
DeepSense is intended for academic and commercial partnerships in ocean analytics, and priority will be given to these projects. Other uses may be approved for training and entrepreneurial projects, but these are subject to available resources, and it may be necessary at times to reduce the resources allocated to them.&lt;br /&gt;
&lt;br /&gt;
== Protection of Data ==&lt;br /&gt;
&lt;br /&gt;
All your data and backups are stored in Canada.&lt;br /&gt;
&lt;br /&gt;
Aspects of data management will follow industry best practices (as determined by DeepSense staff) unless specific arrangements are agreed for your project.&lt;br /&gt;
&lt;br /&gt;
Your data will be stored and protected for the duration of your project and a short grace period thereafter. It is your responsibility to safeguard and delete your data after your project is terminated. Alternative arrangements can be made to accommodate longer storage or sharing data between projects.&lt;br /&gt;
&lt;br /&gt;
=== Data Access ===&lt;br /&gt;
Members of your project team with DeepSense accounts can access your team&amp;#039;s data.&lt;br /&gt;
&lt;br /&gt;
=== Data Storage ===&lt;br /&gt;
Data is stored for the duration of your project unless you request deletion sooner. Deletion requests will be confirmed with the project lead.&lt;br /&gt;
&lt;br /&gt;
=== Confidentiality ===&lt;br /&gt;
All project data is confidential to your project team.&lt;br /&gt;
&lt;br /&gt;
== Acknowledging DeepSense ==&lt;br /&gt;
&lt;br /&gt;
Please [[Acknowledging_DeepSense|acknowledge DeepSense]] when publishing results that used DeepSense resources, including computing hardware, computing software, and staff expertise.&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Terms_of_Use&amp;diff=60</id>
		<title>Terms of Use</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Terms_of_Use&amp;diff=60"/>
		<updated>2019-05-30T15:40:16Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== General Principles ==&lt;br /&gt;
DeepSense resources are provided to enhance the ocean analytics economy in Nova Scotia.&lt;br /&gt;
Users must conform to ethical standards, follow all applicable laws and regulations, and follow all applicable Acceptable Use Policies.&lt;br /&gt;
&lt;br /&gt;
== Acceptable Use Policies ==&lt;br /&gt;
The use of DeepSense resources is subject to the Dalhousie University and Faculty of Computer Science Acceptable Use Policies:&lt;br /&gt;
&lt;br /&gt;
* [https://www.dal.ca/dept/university_secretariat/policies/information-management-and-technology/acceptable-use-policy-.html Dalhousie Acceptable Use Policy]&lt;br /&gt;
* [https://cdn.dal.ca/content/dam/dalhousie/pdf/faculty/computerscience/policies/fcs_policy_local.pdf Policy on Local Computer Resources]&lt;br /&gt;
&lt;br /&gt;
== Resource Priority ==&lt;br /&gt;
&lt;br /&gt;
DeepSense is intended for academic and commercial partnerships in ocean analytics, and priority will be given to these projects. Other uses may be approved for training and entrepreneurial projects, but these are subject to available resources, and it may be necessary at times to reduce the resources allocated to them.&lt;br /&gt;
&lt;br /&gt;
== Protection of Data ==&lt;br /&gt;
&lt;br /&gt;
All your data and backups are stored in Canada.&lt;br /&gt;
&lt;br /&gt;
Aspects of data management will follow industry best practices (as determined by DeepSense staff) unless specific arrangements are agreed for your project.&lt;br /&gt;
&lt;br /&gt;
Your data will be stored and protected for the duration of your project and a short grace period thereafter. It is your responsibility to safeguard and delete your data after your project is terminated. Alternative arrangements can be made to accommodate longer storage or sharing data between projects.&lt;br /&gt;
&lt;br /&gt;
=== Data Access ===&lt;br /&gt;
Members of your project team with DeepSense accounts can access your team&amp;#039;s data.&lt;br /&gt;
&lt;br /&gt;
=== Data Storage ===&lt;br /&gt;
Data is stored for the duration of your project unless you request deletion sooner. Deletion requests will be confirmed with the project lead.&lt;br /&gt;
&lt;br /&gt;
=== Confidentiality ===&lt;br /&gt;
All project data is confidential to your project team.&lt;br /&gt;
&lt;br /&gt;
== Acknowledging DeepSense ==&lt;br /&gt;
&lt;br /&gt;
Please [[Acknowledging_DeepSense|acknowledge DeepSense]] when publishing results that used DeepSense resources, including computing hardware, computing software, and staff expertise.&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Available_software&amp;diff=59</id>
		<title>Available software</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Available_software&amp;diff=59"/>
		<updated>2019-05-15T14:43:39Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Basic Software ==&lt;br /&gt;
&lt;br /&gt;
* RedHat Enterprise Linux Server release 7.5 (RHEL)&lt;br /&gt;
* gcc 4.8.5&lt;br /&gt;
* glibc 2.17&lt;br /&gt;
* R 3.5.1&lt;br /&gt;
&lt;br /&gt;
== Anaconda Python ==&lt;br /&gt;
&lt;br /&gt;
Two Anaconda python environments are installed locally on each DeepSense compute node:&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! Version&lt;br /&gt;
! Environment location&lt;br /&gt;
|-&lt;br /&gt;
|python 2.7.15&lt;br /&gt;
|/opt/anaconda2&lt;br /&gt;
|-&lt;br /&gt;
|python 3.6.8&lt;br /&gt;
|/opt/anaconda2/envs/py36&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
These python environments have many packages installed, including prerequisite libraries for running the IBM PowerAI deep learning frameworks.&lt;br /&gt;
&lt;br /&gt;
See [[Getting_started]] for instructions on using the shared anaconda python environments.&lt;br /&gt;
&lt;br /&gt;
See [[Installing local software]] for instructions on installing and managing your own python environments in your home directory.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== IBM PowerAI Deep Learning Packages ==&lt;br /&gt;
&lt;br /&gt;
[https://developer.ibm.com/linuxonpower/deep-learning-powerai/ IBM PowerAI] includes multiple open source deep learning frameworks compiled for IBM Power8 systems.&lt;br /&gt;
&lt;br /&gt;
IBM PowerAI Enterprise includes:&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Framework&lt;br /&gt;
!colspan=&amp;quot;2&amp;quot;|Location&lt;br /&gt;
|-&lt;br /&gt;
|Caffe&lt;br /&gt;
|/opt/DL/caffe&lt;br /&gt;
|-&lt;br /&gt;
|cuDNN&lt;br /&gt;
|/opt/DL/cudnn&lt;br /&gt;
|-&lt;br /&gt;
|IBM Distributed Deep Learning (DDL)&lt;br /&gt;
|/opt/DL/ddl&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
|/opt/DL/hdf5&lt;br /&gt;
|-&lt;br /&gt;
|NCCL&lt;br /&gt;
|/opt/DL/nccl&lt;br /&gt;
|/opt/DL/nccl2&lt;br /&gt;
|-&lt;br /&gt;
|openblas&lt;br /&gt;
|/opt/DL/openblas&lt;br /&gt;
|-&lt;br /&gt;
|protobuf&lt;br /&gt;
|/opt/DL/protobuf&lt;br /&gt;
|-&lt;br /&gt;
|pytorch&lt;br /&gt;
|/opt/DL/pytorch&lt;br /&gt;
|-&lt;br /&gt;
|snap-ml&lt;br /&gt;
|/opt/DL/snap-ml-local&lt;br /&gt;
|/opt/DL/snap-ml-mpi&lt;br /&gt;
|-&lt;br /&gt;
|Tensorflow 1.11 (including keras)&lt;br /&gt;
|/opt/DL/tensorflow&lt;br /&gt;
|/opt/DL/ddl-tensorflow&lt;br /&gt;
|-&lt;br /&gt;
|Tensorboard&lt;br /&gt;
|/opt/DL/tensorboard&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
To use most of these frameworks you need to activate a python2 or python3 environment and then activate the relevant framework.&lt;br /&gt;
&lt;br /&gt;
For example, to use tensorflow you can activate a python2 environment:&lt;br /&gt;
 . /opt/anaconda2/etc/profile.d/conda.sh&lt;br /&gt;
 conda activate&lt;br /&gt;
&lt;br /&gt;
and then activate tensorflow:&lt;br /&gt;
 source /opt/DL/tensorflow/bin/tensorflow-activate&lt;br /&gt;
&lt;br /&gt;
You can then &amp;lt;code&amp;gt;import tensorflow as tf&amp;lt;/code&amp;gt; in your python code.&lt;br /&gt;
&lt;br /&gt;
See [[Getting started with Deep Learning]] for a tutorial on using Caffe and Tensorflow on DeepSense.&lt;br /&gt;
&lt;br /&gt;
== IBM Advance Toolchain ==&lt;br /&gt;
&lt;br /&gt;
You may require newer versions of compilers such as GCC than are provided with RHEL.&lt;br /&gt;
&lt;br /&gt;
The [https://developer.ibm.com/linuxonpower/advance-toolchain IBM Advance Toolchain for Linux on Power] is a set of open source compilers, runtime libraries, and development tools.&lt;br /&gt;
&lt;br /&gt;
The IBM Advance Toolchain includes recent versions of:&lt;br /&gt;
* GNU Compiler Collection (gcc, g++ and gfortran)&lt;br /&gt;
* GNU C library (glibc)&lt;br /&gt;
* GNU Binary Utilities (binutils)&lt;br /&gt;
* Decimal Floating Point Library (libdfp)&lt;br /&gt;
* IBM Power Architecture Facilities Library (PAFLib)&lt;br /&gt;
* GNU Debugger (gdb)&lt;br /&gt;
* Python&lt;br /&gt;
* Golang&lt;br /&gt;
* Performance analysis tools (oprofile, valgrind, itrace)&lt;br /&gt;
* Multi-core exploitation libraries (TBB, Userspace RCU, SPHDE)&lt;br /&gt;
* Support libraries (libhugetlbfs, Boost, zlib, etc.)&lt;br /&gt;
&lt;br /&gt;
To use the Advance Toolchain, first activate environment modules:&lt;br /&gt;
 source /usr/local/Modules/init/bash&lt;br /&gt;
&lt;br /&gt;
Then load the advance toolchain:&lt;br /&gt;
 module load at12.0&lt;br /&gt;
&lt;br /&gt;
To stop using the advance toolchain, unload the environment module:&lt;br /&gt;
 module unload at12.0&lt;br /&gt;
&lt;br /&gt;
Note that software dynamically compiled with the advance toolchain will only run with the advance toolchain loaded.&lt;br /&gt;
&lt;br /&gt;
== Requesting Additional Software ==&lt;br /&gt;
&lt;br /&gt;
Contact DeepSense [[contact information|support]] to have additional software installed or for help installing or compiling software locally in your home directory.&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Installing_local_software&amp;diff=50</id>
		<title>Installing local software</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Installing_local_software&amp;diff=50"/>
		<updated>2019-04-15T18:08:22Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: Compiling Software section&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
You are welcome to install software locally in your home directory. This allows you to use specific versions of software instead of the cluster wide versions. For example you may need an older version of a specific package or a newly released version that isn&amp;#039;t yet installed on DeepSense.&lt;br /&gt;
&lt;br /&gt;
For assistance installing or compiling software contact [[Contact_Information|Technical Support]]. We will support locally installed software to the best of our ability, although we can not guarantee that all software will run on the DeepSense platform. In the event that desired software will not run, we can help you determine alternatives such as different software or using a different system for some of your computation.&lt;br /&gt;
&lt;br /&gt;
If you attempt to install compiled software (e.g. an anaconda package) but the package cannot be found then also contact [[Contact_Information|Technical Support]]. The package may not have been compiled for the DeepSense hardware architecture (ppc64le).&lt;br /&gt;
&lt;br /&gt;
If your project has specific software you want to share between members then we can create a shared directory for your group in /software/&amp;lt;project&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you have locally compiled software that you think may be useful for other DeepSense users then let us know at [[Contact_Information|Technical Support]]. We may install and support it systemwide if there is sufficient interest.&lt;br /&gt;
&lt;br /&gt;
== Installing Anaconda Python in your home directory ==&lt;br /&gt;
&lt;br /&gt;
=== Stop using systemwide anaconda ===&lt;br /&gt;
&lt;br /&gt;
If you added the system anaconda environment to your &amp;lt;code&amp;gt;.bashrc&amp;lt;/code&amp;gt; file then remove the line:&lt;br /&gt;
 . /opt/anaconda2/etc/profile.d/conda.sh&lt;br /&gt;
&lt;br /&gt;
=== Installing Anaconda with a python2 base ===&lt;br /&gt;
&lt;br /&gt;
From your home directory run:&lt;br /&gt;
 wget https://repo.continuum.io/archive/Anaconda2-5.2.0-Linux-ppc64le.sh&lt;br /&gt;
 bash Anaconda2-5.2.0-Linux-ppc64le.sh&lt;br /&gt;
&lt;br /&gt;
Note: please enter &amp;quot;yes&amp;quot; when asked if you want to add anaconda to your .bashrc file. If you do not then you will need to add the following command to your .bashrc file or run it each time before using anaconda:&lt;br /&gt;
 . ~/anaconda2/etc/profile.d/conda.sh&lt;br /&gt;
&lt;br /&gt;
After the installer ends you need to either close and restart your terminal or run:&lt;br /&gt;
 source ~/.bashrc&lt;br /&gt;
&lt;br /&gt;
=== Adding a python3 environment ===&lt;br /&gt;
The previous instruction creates a python2 base environment. To add a python3 environment:&lt;br /&gt;
 conda create -n py36 python=3.6&lt;br /&gt;
&lt;br /&gt;
Activate this environment to use python3:&lt;br /&gt;
 conda activate py36&lt;br /&gt;
&lt;br /&gt;
Note: if you receive an error message then you may need to deactivate the base conda environment first:&lt;br /&gt;
 conda deactivate&lt;br /&gt;
 conda activate py36&lt;br /&gt;
&lt;br /&gt;
=== Adding a python2 environment ===&lt;br /&gt;
We recommend creating a separate python2 environment from the base environment. This makes it easier to install the specific packages required for IBM PowerAI.&lt;br /&gt;
 conda create -n py27 python=2.7&lt;br /&gt;
&lt;br /&gt;
Activate this environment to use python2:&lt;br /&gt;
 conda activate py27&lt;br /&gt;
&lt;br /&gt;
=== Install PowerAI dependencies ===&lt;br /&gt;
&lt;br /&gt;
Warning: these scripts will install, update, and downgrade some packages to the recommended packages for the current version of PowerAI. You may want to create a separate python environment to use different versions of those packages with other software.&lt;br /&gt;
&lt;br /&gt;
To use Tensorflow first install the Tensorflow dependencies:&lt;br /&gt;
 /opt/DL/tensorflow/bin/install_dependencies&lt;br /&gt;
&lt;br /&gt;
To use PyTorch first install the PyTorch dependencies:&lt;br /&gt;
 /opt/DL/pytorch/bin/install_dependencies&lt;br /&gt;
&lt;br /&gt;
The dependencies must be installed in whichever python environment you intend to use. We&amp;#039;ve encountered some problems installing the PyTorch dependencies directly in the base environment if the base conda environment has been updated to conda version 4.6.2. If you want to use PyTorch, be sure to use a conda environment with a lower version of conda.&lt;br /&gt;
&lt;br /&gt;
=== Install other dependencies ===&lt;br /&gt;
&lt;br /&gt;
If you need additional python libraries then you can install them in your python environment.&lt;br /&gt;
&lt;br /&gt;
The base package comes with several python libraries but you may want a newer version or additional libraries. Also, when you create a new environment it does not automatically get all of the same libraries as the base environment.&lt;br /&gt;
&lt;br /&gt;
For example, suppose you want to install the &amp;lt;code&amp;gt;scikit-learn&amp;lt;/code&amp;gt; package in your python3 environment.&lt;br /&gt;
&lt;br /&gt;
First you need to activate the environment:&lt;br /&gt;
 conda activate py36&lt;br /&gt;
&lt;br /&gt;
Then install the package:&lt;br /&gt;
 conda install scikit-learn&lt;br /&gt;
&lt;br /&gt;
=== Testing Deep Learning packages on the login nodes or non-GPU nodes ===&lt;br /&gt;
&lt;br /&gt;
You may wish to run PowerAI software on the login nodes for testing, or on the CPU-only nodes for some workflows.&lt;br /&gt;
&lt;br /&gt;
Only the GPU nodes have graphics cards and graphics drivers installed. If you attempt to run the deep learning software like Tensorflow on the login nodes or CPU-only nodes then you will see errors like the following:&lt;br /&gt;
 ImportError: libcublas.so.9.2: cannot open shared object file: No such file or directory&lt;br /&gt;
&lt;br /&gt;
You need to load the GPU drivers with the following command:&lt;br /&gt;
 source /opt/DL/cudnn/bin/cudnn-activate&lt;br /&gt;
&lt;br /&gt;
Then you can activate the deep learning package, e.g. for Tensorflow:&lt;br /&gt;
 source /opt/DL/tensorflow/bin/tensorflow-activate&lt;br /&gt;
&lt;br /&gt;
Note that some deep learning software may be much slower or refuse to run without GPU access. Tensorflow works but Caffe does not.&lt;br /&gt;
&lt;br /&gt;
Keep in mind you need to activate the GPU drivers and deep learning package in each browser shell before you are able to use the package in your code or LSF jobs.&lt;br /&gt;
&lt;br /&gt;
== Compiling Software for DeepSense ==&lt;br /&gt;
&lt;br /&gt;
DeepSense uses IBM Power8 systems running RedHat Enterprise Linux. Code must be compiled for &amp;lt;code&amp;gt;ppc64le&amp;lt;/code&amp;gt; which is PowerPC 64 bit Little Endian.&lt;br /&gt;
&lt;br /&gt;
Some software may not have binaries available for &amp;lt;code&amp;gt;ppc64le&amp;lt;/code&amp;gt; even if it does for other systems. If this happens then you (or [[Contact_Information|DeepSense support]]) will need to compile the software to run on DeepSense. Visit the web page for the software and see if the source code is available (e.g. through github). If so then follow the compilation instructions to run the software.&lt;br /&gt;
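&lt;br /&gt;
Before downloading a prebuilt binary, it can help to confirm which architecture the current node reports. The following is a minimal sketch rather than an official DeepSense tool; the function name &amp;lt;code&amp;gt;describe_arch&amp;lt;/code&amp;gt; is hypothetical and the messages are only illustrative.&lt;br /&gt;

```shell
# Hypothetical helper: say whether an architecture string (as printed by
# uname -m) matches the ppc64le architecture used by DeepSense.
describe_arch() {
    case "$1" in
        ppc64le) echo "ppc64le: look for ppc64le binaries or build from source" ;;
        *)       echo "$1: binaries built here will not run on ppc64le nodes" ;;
    esac
}

# Report on the machine this script is running on.
describe_arch "$(uname -m)"
```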
&lt;br /&gt;
You may encounter errors when attempting to compile software for &amp;lt;code&amp;gt;ppc64le&amp;lt;/code&amp;gt;. Often this occurs because of differences between &amp;lt;code&amp;gt;ppc64le&amp;lt;/code&amp;gt; and other common architectures such as x86 and x86_64. &lt;br /&gt;
&lt;br /&gt;
For example, one DeepSense user attempted to compile the rdkit software package from https://www.rdkit.org/ . This compilation failed when it attempted to use the gcc x86 optimization &amp;lt;code&amp;gt;-mpopcnt&amp;lt;/code&amp;gt;. After replacing the optimization with the &amp;lt;code&amp;gt;ppc64le&amp;lt;/code&amp;gt; equivalent &amp;lt;code&amp;gt;-mpopcntb&amp;lt;/code&amp;gt; the software compiled successfully.&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=LSF&amp;diff=49</id>
		<title>LSF</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=LSF&amp;diff=49"/>
		<updated>2019-04-15T17:47:25Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: /* Advanced Job Submission */  Added Interactive Jobs section&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[https://www.ibm.com/support/knowledgecenter/SSWRJV_10.1.0/ IBM Spectrum LSF] is the command line job submission system for submitting batch and interactive jobs on DeepSense computing hardware.&lt;br /&gt;
&lt;br /&gt;
== Test code and short computation ==&lt;br /&gt;
DeepSense has two login nodes, login1.deepsense.ca and login2.deepsense.ca . You can access these through SSH with your username and password from any computer on campus. From off campus you’ll need to use the [https://wireless.dal.ca/vpnsoftware.php Dalhousie VPN].&lt;br /&gt;
&lt;br /&gt;
The login nodes are intended for testing and compiling code. Please don’t run long or intensive computation on these nodes.&lt;br /&gt;
&lt;br /&gt;
== Job Submission ==&lt;br /&gt;
When you have a small example working with your code and are ready to run a real workload, use the LSF queue to submit your jobs to the cluster (https://www.ibm.com/support/knowledgecenter/SSWRJV_10.1.0/lsf_users_guide/batch_jobs_about.html). If you’ve used other queuing systems like slurm or Sun Grid Engine before then LSF will seem very familiar.&lt;br /&gt;
 &lt;br /&gt;
To submit a job you use the &amp;lt;code&amp;gt;bsub&amp;lt;/code&amp;gt; command (https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bsub.man_top.1.html).&lt;br /&gt;
 &lt;br /&gt;
For example, to submit a shared memory job using 20 processors and 256GB of memory you would run:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -oo &amp;lt;output_file&amp;gt; -n 20 -M 256000 -R &amp;quot;span[hosts=1]&amp;quot; &amp;lt;executable&amp;gt; [options]&amp;lt;/code&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
For OpenMP jobs, please make sure that you use &amp;lt;code&amp;gt;OMP_NUM_THREADS&amp;lt;/code&amp;gt; to limit the number of threads your program uses, and that this variable is set in the code or script that runs on the server. LSF sets a variable &amp;lt;code&amp;gt;$LSB_DJOB_NUMPROC&amp;lt;/code&amp;gt; that you can use if you don’t want to hardcode &amp;lt;code&amp;gt;OMP_NUM_THREADS&amp;lt;/code&amp;gt; or set it with your own variable.&lt;br /&gt;
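&lt;br /&gt;
A job wrapper script can apply this in a few lines. This is a minimal sketch: the function name &amp;lt;code&amp;gt;set_omp_threads&amp;lt;/code&amp;gt; and the fallback of one thread outside LSF are assumptions, while &amp;lt;code&amp;gt;LSB_DJOB_NUMPROC&amp;lt;/code&amp;gt; is the variable described above.&lt;br /&gt;

```shell
# Sketch of a job wrapper: cap OpenMP threads at the slot count LSF granted.
# LSB_DJOB_NUMPROC is set by LSF inside a job; the fallback of 1 is an
# assumption for runs outside LSF (for example, on a login node).
set_omp_threads() {
    OMP_NUM_THREADS="${LSB_DJOB_NUMPROC:-1}"
    export OMP_NUM_THREADS
}

set_omp_threads
echo "Using $OMP_NUM_THREADS OpenMP thread(s)"
```

Calling the function at the top of a script submitted with &amp;lt;code&amp;gt;bsub -n 20&amp;lt;/code&amp;gt; should then limit the program to 20 threads, assuming LSF has set &amp;lt;code&amp;gt;LSB_DJOB_NUMPROC&amp;lt;/code&amp;gt; to the number of granted slots.&lt;br /&gt;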
&lt;br /&gt;
=== CPU Limit ===&lt;br /&gt;
The number of requested processors is specified with the option &amp;lt;code&amp;gt;-n&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The resource request &amp;lt;code&amp;gt;-R &amp;quot;span[hosts=1]&amp;quot;&amp;lt;/code&amp;gt; requires that all processors are on the same compute host, i.e. a shared memory job.&lt;br /&gt;
&lt;br /&gt;
LSF can also be used to run compute jobs across multiple hosts such as MPI jobs. Examples will be included here at a later date.&lt;br /&gt;
&lt;br /&gt;
=== Memory Limit === &lt;br /&gt;
The memory limit &amp;lt;code&amp;gt;-M&amp;lt;/code&amp;gt; is specified in MB by default. You can also specify units, e.g. &amp;lt;code&amp;gt;-M 256GB&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== GPU Computation ===&lt;br /&gt;
&lt;br /&gt;
To request access to a GPU use the &amp;lt;code&amp;gt;-gpu -&amp;lt;/code&amp;gt; option.&lt;br /&gt;
&lt;br /&gt;
Note the trailing dash, which specifies the default GPU arguments. The following options can be used in place of that dash.&lt;br /&gt;
&lt;br /&gt;
The default GPU arguments are &amp;lt;code&amp;gt;&amp;quot;num=1:mode=shared:mps=no:j_exclusive=no&amp;quot;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;num=num_gpus&amp;lt;/code&amp;gt; is the number of requested GPUs on each host.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;mode=shared | exclusive_process&amp;lt;/code&amp;gt; specifies the GPU mode.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;mps=yes | no&amp;lt;/code&amp;gt; specifies whether to use the NVIDIA Multi-Process Service (MPS). MPS enables better sharing of GPU resources. If &amp;lt;code&amp;gt;mode=exclusive_process&amp;lt;/code&amp;gt; then &amp;lt;code&amp;gt;mps&amp;lt;/code&amp;gt; should be set to &amp;lt;code&amp;gt;yes&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;j_exclusive=yes | no&amp;lt;/code&amp;gt; specifies whether the GPU is exclusive to this job, i.e. whether other jobs are prevented from using it.&lt;br /&gt;
&lt;br /&gt;
By default the &amp;lt;code&amp;gt;-gpu -&amp;lt;/code&amp;gt; option will request one nonexclusive GPU. Please limit your usage of GPU resources to a reasonable number of concurrently used GPUs and use shared GPUs when possible. We may enact limits on GPU use in the future if necessary.&lt;br /&gt;
&lt;br /&gt;
See the [https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bsub.gpu.1.html bsub.gpu] documentation for more information on submitting GPU jobs.&lt;br /&gt;
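&lt;br /&gt;
The GPU arguments above can be assembled by a small helper so that &amp;lt;code&amp;gt;mps&amp;lt;/code&amp;gt; always follows the chosen mode. This is a sketch under stated assumptions: the function name &amp;lt;code&amp;gt;gpu_request&amp;lt;/code&amp;gt; and the job script &amp;lt;code&amp;gt;./train.sh&amp;lt;/code&amp;gt; are hypothetical, and the command is only printed, not submitted.&lt;br /&gt;

```shell
# Hypothetical helper: build the argument string for bsub -gpu.
# Usage: gpu_request [num_gpus] [mode]
# mps is switched to yes when mode is exclusive_process, as advised above.
gpu_request() {
    num="${1:-1}"
    mode="${2:-shared}"
    mps=no
    if [ "$mode" = "exclusive_process" ]; then
        mps=yes
    fi
    printf 'num=%s:mode=%s:mps=%s:j_exclusive=no\n' "$num" "$mode" "$mps"
}

# Dry run: print the submission command instead of running it.
echo "bsub -gpu \"$(gpu_request 2 exclusive_process)\" ./train.sh"
```

With no arguments the helper reproduces the default string listed above, so &amp;lt;code&amp;gt;gpu_request&amp;lt;/code&amp;gt; alone is equivalent to passing the trailing dash.&lt;br /&gt;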
&lt;br /&gt;
=== Input and Output files ===&lt;br /&gt;
If you do not specify an output file with &amp;lt;code&amp;gt;-o&amp;lt;/code&amp;gt; (append) or &amp;lt;code&amp;gt;-oo&amp;lt;/code&amp;gt; (overwrite) then the output will be lost. Note that LSF will prepend submission information to this file. You can use typical Linux redirection like &amp;lt;code&amp;gt;&amp;gt; output_file2&amp;lt;/code&amp;gt;, in which case the file specified with &amp;lt;code&amp;gt;-oo&amp;lt;/code&amp;gt; will just contain any errors and submission information. Quote the whole job command when you do this, so that the shell applies the redirection to the job rather than to &amp;lt;code&amp;gt;bsub&amp;lt;/code&amp;gt; itself.&lt;br /&gt;
&lt;br /&gt;
You can specify an input file with the &amp;lt;code&amp;gt;-i&amp;lt;/code&amp;gt; option or the typical linux option &amp;lt;code&amp;gt;&amp;lt; &amp;lt;input_file&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Advanced Job Submission ==&lt;br /&gt;
&lt;br /&gt;
=== Array Jobs ===&lt;br /&gt;
To run the same program multiple times with different input and output files you can use [https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_admin/job_arrays_lsf.html LSF Array Jobs].&lt;br /&gt;
&lt;br /&gt;
An example command in the LSF documentation is given as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt; bsub -J &amp;quot;myArray[1-1000]&amp;quot; -i &amp;quot;input.%I&amp;quot; -o &amp;quot;output.%I&amp;quot; myJob&amp;lt;/code&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
This command uses only one line to submit 1000 jobs running the script myJob with the input files &amp;lt;code&amp;gt;input.1, input.2, ... input.1000&amp;lt;/code&amp;gt;, with the output of each job placed in the files &amp;lt;code&amp;gt;output.1, output.2, ... output.1000&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Complicated Jobs ===&lt;br /&gt;
To run the same program with multiple files, possibly with different options, you can create a job submission script that iterates over the files and submits the jobs.&lt;br /&gt;
 &lt;br /&gt;
For example, suppose you have &amp;lt;code&amp;gt;programA&amp;lt;/code&amp;gt; and want to process &amp;lt;code&amp;gt;input.1, input.2, ... input.N&amp;lt;/code&amp;gt; with output in &amp;lt;code&amp;gt;output.1, output.2, ... output.N&amp;lt;/code&amp;gt;, as in the array example.&lt;br /&gt;
&lt;br /&gt;
Create a bash script &amp;lt;code&amp;gt;do_submit_programA.bash&amp;lt;/code&amp;gt; that looks something like:&lt;br /&gt;
&lt;br /&gt;
 n=&amp;lt;N&amp;gt;&lt;br /&gt;
 arguments=&amp;lt;nodes, memory, time constraints, etc&amp;gt; &lt;br /&gt;
 for ((i=1; i&amp;lt;=$n; i++)); do&lt;br /&gt;
    bsub -oo log.$i $arguments &amp;quot;programA &amp;lt; input.$i &amp;gt; output.$i&amp;quot;&lt;br /&gt;
 done&lt;br /&gt;
 &lt;br /&gt;
Note that everything in angle brackets here is a placeholder, not real code. For example &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; might be read from a command line argument or hardcoded as, say, 10. The arguments will be something like &amp;lt;code&amp;gt;-n 1 -M 100MB&amp;lt;/code&amp;gt; and any other desired options. You can run multiple types of jobs with complex arguments.&lt;br /&gt;
&lt;br /&gt;
You may wish to create separate directories for the log files, input files, and output files if there are more than a handful of jobs.&lt;br /&gt;
 &lt;br /&gt;
If each job requires nontrivial processing (e.g. changing into different directories for each job) then you may want to create a second script that generates the jobfiles and then use a similar kind of submit script.&lt;br /&gt;
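&lt;br /&gt;
Before submitting a large batch it can be useful to print the commands and review them. The following dry-run sketch does that; the function name &amp;lt;code&amp;gt;print_submit_commands&amp;lt;/code&amp;gt;, the directory names &amp;lt;code&amp;gt;logs/&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;inputs/&amp;lt;/code&amp;gt;, and the resource options are assumptions for illustration.&lt;br /&gt;

```shell
# Dry-run sketch: print one bsub command per input file instead of submitting.
# The directory names logs/ and inputs/ and the resource options are
# assumptions for illustration; programA is the placeholder from the text.
print_submit_commands() {
    n="$1"
    i=1
    while [ "$i" -le "$n" ]; do
        echo "bsub -oo logs/log.$i -n 1 -M 100MB -i inputs/input.$i programA"
        i=$((i + 1))
    done
}

# Review the generated commands for three inputs.
print_submit_commands 3
```

In this variant each job reads its input through &amp;lt;code&amp;gt;-i&amp;lt;/code&amp;gt; and writes its output, together with the LSF report, to its log file. Once the printed commands look right, they can be piped to a shell to submit them.&lt;br /&gt;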
&lt;br /&gt;
=== Interactive Jobs ===&lt;br /&gt;
&lt;br /&gt;
Some jobs may require user input such as testing code on a gpu system or an interactive analytics program.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -I&amp;lt;/code&amp;gt; requests an interactive job that will print its output to your terminal.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -Ip&amp;lt;/code&amp;gt; requests an interactive job with a pseudo terminal. For example, this can be used to schedule a console program that takes user input and output.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bsub -Is&amp;lt;/code&amp;gt; requests an interactive job with a shell. This can be used to test code on one of the gpu nodes or for more resource intensive development than is allowed on the login nodes.&lt;br /&gt;
&lt;br /&gt;
Note that interactive jobs are subject to the same time and memory constraints as typical batch jobs. Please take care not to interfere with other jobs running on a node, and make sure that your interactive job does not attempt to use more resources than you have requested. Please do not leave interactive jobs running for long periods, and do not leave them idle when you are not using them.&lt;br /&gt;
&lt;br /&gt;
We do not currently treat interactive jobs differently from any other jobs. As DeepSense becomes more heavily utilized we may need to limit the number of interactive jobs run by a user, by a project, or on a given node, or to limit the time or other resources used by interactive jobs.&lt;br /&gt;
&lt;br /&gt;
== Job Information ==&lt;br /&gt;
&lt;br /&gt;
=== Running Jobs ===&lt;br /&gt;
 &lt;br /&gt;
To examine currently running jobs you use the &amp;lt;code&amp;gt;bjobs&amp;lt;/code&amp;gt; command (https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bjobs.man_top.1.html)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;bjobs -l&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;bjobs -l &amp;lt;jobid&amp;gt;&amp;lt;/code&amp;gt; shows additional job information including job status and resource usage.&lt;br /&gt;
&lt;br /&gt;
=== Past Jobs ===&lt;br /&gt;
&lt;br /&gt;
To examine current and past jobs use the &amp;lt;code&amp;gt;bhist&amp;lt;/code&amp;gt; command (https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bhist.1.html).&lt;br /&gt;
&lt;br /&gt;
The following options will show jobs with the specified status:&lt;br /&gt;
 -a all&lt;br /&gt;
 -d finished&lt;br /&gt;
 -e exited&lt;br /&gt;
 -p pending&lt;br /&gt;
 -r running&lt;br /&gt;
 -s suspended&lt;br /&gt;
&lt;br /&gt;
You can use options like &amp;lt;code&amp;gt;-S start_time,end_time&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;-C start_time,end_time&amp;lt;/code&amp;gt; to find jobs that were submitted or completed within the specified time interval. These options require using the &amp;lt;code&amp;gt;-a&amp;lt;/code&amp;gt; option.&lt;br /&gt;
&lt;br /&gt;
As with bjobs, you can use the &amp;lt;code&amp;gt;-l&amp;lt;/code&amp;gt; option for additional information and can also specify a specific known jobid as the last command argument.&lt;br /&gt;
&lt;br /&gt;
=== Available Hosts ===&lt;br /&gt;
 &lt;br /&gt;
To see the available hosts and how busy they are you use the &amp;lt;code&amp;gt;bhosts&amp;lt;/code&amp;gt; command (https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_command_ref/bhosts.1.html)&lt;br /&gt;
&lt;br /&gt;
== LSF Command Reference == &lt;br /&gt;
&lt;br /&gt;
The complete list of LSF commands with description is available [https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_kc_cmd_ref.html here].&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Known_problems&amp;diff=48</id>
		<title>Known problems</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Known_problems&amp;diff=48"/>
		<updated>2019-04-15T17:31:00Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Jupyter notebooks or other programs fail trying to access a /run directory ==&lt;br /&gt;
&lt;br /&gt;
The default login shell is Bash. Make sure that the line &amp;lt;code&amp;gt;unset XDG_RUNTIME_DIR&amp;lt;/code&amp;gt; is in the .bashrc file in your home directory, as it prevents a problem where some types of jobs fail when run through the LSF queue. This should be done automatically the first time you log on to DeepSense. If the line is missing, the following command appends it:&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;code&amp;gt;echo &amp;#039;unset XDG_RUNTIME_DIR&amp;#039; &amp;gt;&amp;gt; ~/.bashrc&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This line has been added to the default .bashrc file for new users but older user accounts may need this step to be done manually. &lt;br /&gt;
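&lt;br /&gt;
Running the command above more than once adds the line more than once. A guarded variant avoids that; this is a minimal sketch, and the function name &amp;lt;code&amp;gt;ensure_line&amp;lt;/code&amp;gt; is hypothetical. It is demonstrated on a scratch file rather than on a real .bashrc.&lt;br /&gt;

```shell
# Hypothetical helper: append a line to a file only if it is not already there.
# grep -q (quiet), -s (no error if the file is missing), -x (whole line),
# -F (fixed string); tee -a appends and echoes the added line as confirmation.
ensure_line() {
    if ! grep -qsxF "$1" "$2"; then
        printf '%s\n' "$1" | tee -a "$2"
    fi
}

# Demonstration on a scratch file, so nothing in your home directory changes.
scratch="$(mktemp)"
ensure_line 'unset XDG_RUNTIME_DIR' "$scratch"   # appended
ensure_line 'unset XDG_RUNTIME_DIR' "$scratch"   # already present, no change
rm -f "$scratch"
```

To apply it to your own account, pass &amp;lt;code&amp;gt;~/.bashrc&amp;lt;/code&amp;gt; as the second argument instead of the scratch file.&lt;br /&gt;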
&lt;br /&gt;
== Cannot Install PyTorch dependencies ==&lt;br /&gt;
&lt;br /&gt;
 UnsatisfiableError: The following specifications were found to be in conflict:&lt;br /&gt;
   - powerai-pytorch-prereqs=0.4.1_12295.5cb3523&lt;br /&gt;
&lt;br /&gt;
You may see this error when attempting to install the pytorch dependencies in a local anaconda environment. This error indicates that some of your installed python packages are not compatible with the pytorch prerequisites. In particular, we see this error when conda has been updated to version 4.6 (which may sometimes happen when installing the tensorflow dependencies first).&lt;br /&gt;
&lt;br /&gt;
To resolve this problem, create a new environment with a 4.5.x conda version and then install the pytorch dependencies in that environment.&lt;br /&gt;
&lt;br /&gt;
== Cannot use Caffe on login node or compute nodes without GPUs ==&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;Cuda number of devices: -579579216&lt;br /&gt;
Current device id: -579579216&lt;br /&gt;
Current device name: &lt;br /&gt;
[==========] Running 2207 tests from 293 test cases.&lt;br /&gt;
[----------] Global test environment set-up.&lt;br /&gt;
[----------] 9 tests from AccuracyLayerTest/0, where TypeParam = caffe::CPUDevice&amp;lt;float&amp;gt;&lt;br /&gt;
[ RUN      ] AccuracyLayerTest/0.TestSetup&lt;br /&gt;
E0206 15:59:26.604874  7990 common.cpp:121] Cannot create Cublas handle. Cublas won&amp;#039;t be available.&lt;br /&gt;
E0206 15:59:26.611477  7990 common.cpp:128] Cannot create Curand generator. Curand won&amp;#039;t be available.&lt;br /&gt;
F0206 15:59:26.611616  7990 syncedmem.cpp:500] Check failed: error == cudaSuccess (30 vs. 0)  unknown error&lt;br /&gt;
*** Check failure stack trace: ***&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You may see this error when attempting to use Caffe on a node without GPUs or a GPU node without specifically requesting a GPU.&lt;br /&gt;
&lt;br /&gt;
To resolve this problem, use a GPU node and request a GPU. Caffe cannot run without an available GPU.&lt;br /&gt;
&lt;br /&gt;
== Cannot see GPUs in an LSF job ==&lt;br /&gt;
&lt;br /&gt;
 $ nvidia-smi &lt;br /&gt;
 No devices were found&lt;br /&gt;
&lt;br /&gt;
GPUs must be requested with the &amp;lt;code&amp;gt;-gpu -&amp;lt;/code&amp;gt; option to bsub. See [[LSF#GPU_Computation]] for more information.&lt;br /&gt;
&lt;br /&gt;
== Nested anaconda environments may cause strange behaviour ==&lt;br /&gt;
&lt;br /&gt;
Some users have experienced strange behaviour when activating an anaconda environment within another environment. This may include permission errors, loading incorrect versions of software, or strange conflicts when attempting to install packages. If you encounter problems with a nested anaconda environment then first try deactivating all anaconda environments and activating just the desired environment.&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Installing_local_software&amp;diff=46</id>
		<title>Installing local software</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Installing_local_software&amp;diff=46"/>
		<updated>2019-04-10T16:04:51Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: /* Introduction */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
&lt;br /&gt;
You are welcome to install software locally in your home directory. This allows you to use specific versions of software instead of the cluster wide versions. For example you may need an older version of a specific package or a newly released version that isn&amp;#039;t yet installed on DeepSense.&lt;br /&gt;
&lt;br /&gt;
For assistance installing or compiling software contact [[Technical Support]]. We will support locally installed software to the best of our ability, although we can not guarantee that all software will run on the DeepSense platform. In the event that desired software will not run, we can help you determine alternatives such as different software or using a different system for some of your computation.&lt;br /&gt;
&lt;br /&gt;
If you attempt to install compiled software (e.g. an anaconda package) but the package cannot be found then also contact [[Technical Support]]. The package may not have been compiled for the DeepSense hardware architecture (ppc64le).&lt;br /&gt;
&lt;br /&gt;
If your project has specific software you want to share between members then we can create a shared directory for your group in /software/&amp;lt;project&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you have locally compiled software that you think may be useful for other DeepSense users then let us know at [[Technical Support]]. We may install and support it systemwide if there is sufficient interest.&lt;br /&gt;
&lt;br /&gt;
== Installing Anaconda Python in your home directory ==&lt;br /&gt;
&lt;br /&gt;
=== Stop using systemwide anaconda ===&lt;br /&gt;
&lt;br /&gt;
If you added the system anaconda environment to your &amp;lt;code&amp;gt;.bashrc&amp;lt;/code&amp;gt; file then remove the line:&lt;br /&gt;
 . /opt/anaconda2/etc/profile.d/conda.sh&lt;br /&gt;
&lt;br /&gt;
=== Installing Anaconda with a python2 base ===&lt;br /&gt;
&lt;br /&gt;
From your home directory run:&lt;br /&gt;
 wget https://repo.continuum.io/archive/Anaconda2-5.2.0-Linux-ppc64le.sh&lt;br /&gt;
 bash Anaconda2-5.2.0-Linux-ppc64le.sh&lt;br /&gt;
&lt;br /&gt;
Note: please enter &amp;quot;yes&amp;quot; when asked if you want to add anaconda to your .bashrc file. If you do not then you will need to add the following command to your .bashrc file or run it each time before using anaconda:&lt;br /&gt;
 . ~/anaconda2/etc/profile.d/conda.sh&lt;br /&gt;
&lt;br /&gt;
After the installer ends you need to either close and restart your terminal or run:&lt;br /&gt;
 source ~/.bashrc&lt;br /&gt;
&lt;br /&gt;
=== Adding a python3 environment ===&lt;br /&gt;
The previous instruction creates a python2 base environment. To add a python3 environment:&lt;br /&gt;
 conda create -n py36 python=3.6&lt;br /&gt;
&lt;br /&gt;
Activate this environment to use python3:&lt;br /&gt;
 conda activate py36&lt;br /&gt;
&lt;br /&gt;
Note: if you receive an error message then you may need to deactivate the base conda environment first:&lt;br /&gt;
 conda deactivate&lt;br /&gt;
 conda activate py36&lt;br /&gt;
&lt;br /&gt;
=== Adding a python2 environment ===&lt;br /&gt;
We recommend creating a separate python2 environment from the base environment. This makes it easier to install the specific packages required for IBM PowerAI.&lt;br /&gt;
 conda create -n py27 python=2.7&lt;br /&gt;
&lt;br /&gt;
Activate this environment to use python2:&lt;br /&gt;
 conda activate py27&lt;br /&gt;
&lt;br /&gt;
=== Install PowerAI dependencies ===&lt;br /&gt;
&lt;br /&gt;
Warning: these scripts will install, update, and downgrade some packages to the recommended packages for the current version of PowerAI. You may want to create a separate python environment to use different versions of those packages with other software.&lt;br /&gt;
&lt;br /&gt;
To use Tensorflow first install the Tensorflow dependencies:&lt;br /&gt;
 /opt/DL/tensorflow/bin/install_dependencies&lt;br /&gt;
&lt;br /&gt;
To use PyTorch first install the PyTorch dependencies:&lt;br /&gt;
 /opt/DL/pytorch/bin/install_dependencies&lt;br /&gt;
&lt;br /&gt;
The dependencies must be installed in whichever python environment you intend to use. We&amp;#039;ve encountered some problems installing the PyTorch dependencies directly in the base environment if the base conda environment has been updated to conda version 4.6.2. If you want to use PyTorch, be sure to use a conda environment with a lower version of conda.&lt;br /&gt;
&lt;br /&gt;
=== Install other dependencies ===&lt;br /&gt;
&lt;br /&gt;
If you need additional python libraries then you can install them in your python environment.&lt;br /&gt;
&lt;br /&gt;
The base package comes with several python libraries but you may want a newer version or additional libraries. Also, when you create a new environment it does not automatically get all of the same libraries as the base environment.&lt;br /&gt;
&lt;br /&gt;
For example, suppose you want to install the &amp;lt;code&amp;gt;scikit-learn&amp;lt;/code&amp;gt; package in your python3 environment.&lt;br /&gt;
&lt;br /&gt;
First you need to activate the environment:&lt;br /&gt;
 conda activate py36&lt;br /&gt;
&lt;br /&gt;
Then install the package:&lt;br /&gt;
 conda install scikit-learn&lt;br /&gt;
&lt;br /&gt;
=== Testing Deep Learning packages on the login nodes or non-GPU nodes ===&lt;br /&gt;
&lt;br /&gt;
You may wish to run PowerAI software on the login nodes for testing, or on the CPU-only nodes for some workflows.&lt;br /&gt;
&lt;br /&gt;
Only the GPU nodes have graphics cards and graphics drivers installed. If you attempt to run the deep learning software like Tensorflow on the login nodes or CPU-only nodes then you will see errors like the following:&lt;br /&gt;
 ImportError: libcublas.so.9.2: cannot open shared object file: No such file or directory&lt;br /&gt;
&lt;br /&gt;
To avoid this error, load the GPU libraries with the following command:&lt;br /&gt;
 source /opt/DL/cudnn/bin/cudnn-activate&lt;br /&gt;
&lt;br /&gt;
Then you can activate the deep learning package, e.g. for Tensorflow:&lt;br /&gt;
 source /opt/DL/tensorflow/bin/tensorflow-activate&lt;br /&gt;
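&lt;br /&gt;
The two activation steps above can be wrapped in a small guard so that a missing script produces a clear message rather than a later import error. This is a minimal sketch, not part of PowerAI: the function name &amp;lt;code&amp;gt;activate_if_present&amp;lt;/code&amp;gt; is hypothetical, and the paths are the ones given above.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper (not part of PowerAI): source an activation script
# only if it exists and is readable, otherwise report which one is missing.
activate_if_present() {
    if [ -r "$1" ]; then
        . "$1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Load the GPU libraries first, then the framework (paths from the text above).
# "|| true" keeps the shell going when run on a machine without PowerAI.
activate_if_present /opt/DL/cudnn/bin/cudnn-activate || true
activate_if_present /opt/DL/tensorflow/bin/tensorflow-activate || true
```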
&lt;br /&gt;
Note that some deep learning software may be much slower or refuse to run without GPU access. Tensorflow works but Caffe does not.&lt;br /&gt;
&lt;br /&gt;
Keep in mind you need to activate the GPU drivers and deep learning package in each browser shell before you are able to use the package in your code or LSF jobs.&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Getting_started&amp;diff=45</id>
		<title>Getting started</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Getting_started&amp;diff=45"/>
		<updated>2019-04-10T15:57:03Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: /* 5. Configure your environment */  Separated systemwide and local python options so users don&amp;#039;t follow the instructions to set up the systemwide python before learning that they can install it locally.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt; Getting Started with DeepSense &lt;br /&gt;
&lt;br /&gt;
&amp;lt;div class=&amp;quot;noautonum&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 1. Request access to DeepSense ==&lt;br /&gt;
&lt;br /&gt;
If you belong to an approved DeepSense project but do not yet have an account then send an email to support@deepsense.ca with the subject &amp;quot;DeepSense Account Request&amp;quot; and provide your:&lt;br /&gt;
  a) First and last name&lt;br /&gt;
  b) Faculty of Computer Science username or requested FCS username&lt;br /&gt;
  c) Dalhousie BannerID&lt;br /&gt;
  d) Project ID&lt;br /&gt;
  e) Project leader&lt;br /&gt;
  f) Reason for requesting the account.&lt;br /&gt;
&lt;br /&gt;
== 2. Change your password ==&lt;br /&gt;
&lt;br /&gt;
If you require a new FCS username then your initial password is your BannerID. Please change it immediately upon receiving access to DeepSense.&lt;br /&gt;
&lt;br /&gt;
You can change your password at https://www.cs.dal.ca/csid&lt;br /&gt;
&lt;br /&gt;
Alternatively, you can contact cshelp@cs.dal.ca to reset your password.&lt;br /&gt;
&lt;br /&gt;
== 3. Logging on ==&lt;br /&gt;
&lt;br /&gt;
DeepSense has two login nodes, login1.deepsense.ca and login2.deepsense.ca. You can access these through SSH with your username and password from any computer on campus.&lt;br /&gt;
&lt;br /&gt;
From off campus you’ll need to use the Dalhousie VPN (https://wireless.dal.ca/vpnsoftware.php). If you are not a Dalhousie staff member, student, or faculty member but require off-campus access and cannot use the Dalhousie VPN, contact your project leader or support@deepsense.ca to make other arrangements.&lt;br /&gt;
&lt;br /&gt;
The login nodes are intended for testing and compiling code. Please don’t run long or intensive computation on these nodes.&lt;br /&gt;
&lt;br /&gt;
== 4. Transfer data ==&lt;br /&gt;
&lt;br /&gt;
DeepSense has two protocol nodes, protocol1.deepsense.ca and protocol2.deepsense.ca. You can connect to these using the SMB (Samba) file-sharing protocol, e.g. smb://protocol1.deepsense.ca with your username and password. Please contact your project leader or support@deepsense.ca if you need help transferring large amounts of data.&lt;br /&gt;
&lt;br /&gt;
Data transferred through the protocol nodes will be located in the shared /data directory.&lt;br /&gt;
&lt;br /&gt;
See [[Storage policies]] for more information about the available shared file systems, storage policies, and backup policies.&lt;br /&gt;
&lt;br /&gt;
== 5. Configure your environment ==&lt;br /&gt;
&lt;br /&gt;
DeepSense compute and management nodes are IBM Power8 computers (ppc64le) running Red Hat Enterprise Linux. See [[Resources]] for more details on the available nodes.&lt;br /&gt;
&lt;br /&gt;
=== 5.1 Loading a python environment ===&lt;br /&gt;
&lt;br /&gt;
You have two options for using python on DeepSense. You can use the systemwide python install, managed by DeepSense administrators. This is recommended for users new to Linux. You will need to contact DeepSense support to have additional software packages installed in the systemwide python.&lt;br /&gt;
&lt;br /&gt;
Alternatively, you can install an Anaconda python environment or other software in your home directory. This allows you to install or update packages or software without requesting and waiting for DeepSense staff. &lt;br /&gt;
&lt;br /&gt;
==== Systemwide python (managed by DeepSense) ====&lt;br /&gt;
&lt;br /&gt;
DeepSense nodes have anaconda2 python installed in /opt/anaconda2. To use this systemwide python, add a line to the .bashrc file in your home directory:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;echo &amp;quot;. /opt/anaconda2/etc/profile.d/conda.sh&amp;quot; &amp;gt;&amp;gt; ~/.bashrc&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Then source your .bashrc file:&lt;br /&gt;
&amp;lt;code&amp;gt;source ~/.bashrc&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To load the python2 environment run &amp;lt;code&amp;gt;conda activate&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To use python3 you can activate the py36 environment:&lt;br /&gt;
&amp;lt;code&amp;gt;conda activate py36&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You can add either line to your .bashrc file to automatically load the desired environment when you log in.&lt;br /&gt;
&lt;br /&gt;
==== Local python install (managed by individual user) ====&lt;br /&gt;
&lt;br /&gt;
See [[Installing local software]] for more information.&lt;br /&gt;
&lt;br /&gt;
== 6. Running compute jobs ==&lt;br /&gt;
&lt;br /&gt;
DeepSense has two different methods of submitting compute jobs.&lt;br /&gt;
&lt;br /&gt;
=== 6.1 Load Sharing Facility (LSF) ===&lt;br /&gt;
&lt;br /&gt;
LSF is a set of command line tools for submitting compute jobs. You may be familiar with other similar software such as Sun Grid Engine or SLURM.&lt;br /&gt;
&lt;br /&gt;
LSF jobs are submitted using the &amp;lt;code&amp;gt;bsub&amp;lt;/code&amp;gt; command.&lt;br /&gt;
&lt;br /&gt;
You can examine the progress of your currently running jobs with the &amp;lt;code&amp;gt;bjobs&amp;lt;/code&amp;gt; command.&lt;br /&gt;
&lt;br /&gt;
You can examine the available compute nodes and their available resources with the &amp;lt;code&amp;gt;bhosts&amp;lt;/code&amp;gt; command.&lt;br /&gt;
&lt;br /&gt;
For more information about using LSF see [[LSF]].&lt;br /&gt;
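&lt;br /&gt;
As a rough illustration of how the three commands above fit together, the sketch below submits a trivial job and then lists jobs and hosts. The wrapper name &amp;lt;code&amp;gt;submit_demo&amp;lt;/code&amp;gt;, the job name and the output file name are assumptions made for this example; &amp;lt;code&amp;gt;-J&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;-o&amp;lt;/code&amp;gt; are standard bsub options. The check for bsub only makes the sketch safe to paste on a machine without LSF.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical wrapper: submit a trivial job if LSF is available on this machine.
submit_demo() {
    if [ -n "$(command -v bsub)" ]; then
        # -J names the job; -o writes its output to a file (%J expands to the job ID).
        bsub -J demo -o demo.%J.out hostname
        bjobs     # list your pending and running jobs
        bhosts    # list compute nodes and their job slots
    else
        echo "bsub not found: run this on a DeepSense login node"
    fi
}

submit_demo
```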
&lt;br /&gt;
=== 6.2 Conductor with Spark (CWS) ===&lt;br /&gt;
&lt;br /&gt;
CWS is an IBM web-based graphical interface for creating and running Apache Spark compute jobs.&lt;br /&gt;
&lt;br /&gt;
To use CWS, connect to the IBM Spectrum Computing Cluster Management Console at https://ds-mgm-02.deepsense.cs.dal.ca:8443. Log in with your username and password.&lt;br /&gt;
&lt;br /&gt;
Note that currently you need to accept a self-signed web certificate. In the future this will be fixed.&lt;br /&gt;
&lt;br /&gt;
For more information about using CWS see [[Conductor with Spark]].&lt;br /&gt;
&lt;br /&gt;
== 7. Deep Learning packages and other available software ==&lt;br /&gt;
&lt;br /&gt;
DeepSense has a variety of Deep Learning packages installed as part of IBM PowerAI including Tensorflow, Caffe, and PyTorch. These packages are installed in /opt/DL/ on each compute node and typically need to be activated before using them, e.g. &amp;lt;code&amp;gt;source /opt/DL/tensorflow/bin/tensorflow-activate&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Deep Learning packages are typically used on the GPU nodes but some deep learning packages can also be used on the login nodes and CPU-only nodes. This can be useful for testing your code or running CPU-bound workloads. To use the deep learning packages on the login or CPU-only nodes you will also need to load the GPU libraries with &amp;lt;code&amp;gt;source /opt/DL/cudnn/bin/cudnn-activate&amp;lt;/code&amp;gt;. Note that some deep learning packages may fail if run without a GPU, e.g. Caffe currently requires a GPU.&lt;br /&gt;
&lt;br /&gt;
For a brief tutorial including running Caffe and Tensorflow in a Jupyter notebook see [[Getting started with Deep Learning]].&lt;br /&gt;
&lt;br /&gt;
See [[Available software]] for the current list of installed software. If you require additional software you are welcome to install it locally in your home directory or contact DeepSense support.&lt;br /&gt;
&lt;br /&gt;
== 8. Technical and research support == &lt;br /&gt;
&lt;br /&gt;
DeepSense has a dedicated support team of research scientists ready to help you with technical questions, installing software, or even research questions.&lt;br /&gt;
&lt;br /&gt;
If you can&amp;#039;t find the answer to your question on this wiki or need more extensive help then send an email to support@deepsense.ca.&lt;br /&gt;
&lt;br /&gt;
See [[Technical support]] for more information about the support available.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/div&amp;gt; &amp;lt;!-- autonum --&amp;gt;&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Contact_information&amp;diff=40</id>
		<title>Contact information</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Contact_information&amp;diff=40"/>
		<updated>2019-03-15T18:54:33Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== For potential new industry or academic partners ==&lt;br /&gt;
If you are interested in the DeepSense ocean innovation platform then please visit https://deepsense.ca and contact info@deepsense.ca for more information on starting a DeepSense project.&lt;br /&gt;
&lt;br /&gt;
== For students interested in working on current DeepSense projects ==&lt;br /&gt;
Please contact the scientific director of DeepSense, Evangelos Milios ([mailto:eem@cs.dal.ca eem@cs.dal.ca]).&lt;br /&gt;
&lt;br /&gt;
== For technical support with the DeepSense platform ==&lt;br /&gt;
&lt;br /&gt;
Contact [mailto:support@deepsense.ca support@deepsense.ca]. Your issue will be submitted to our ticket tracking queue and responded to promptly.&lt;br /&gt;
&lt;br /&gt;
== DEEPSENSE-USERS Mailing List ==&lt;br /&gt;
&lt;br /&gt;
We maintain a mailing list for DeepSense users:&lt;br /&gt;
 https://listserv.dal.ca/index.cgi?A0=DEEPSENSE-USERS&lt;br /&gt;
&lt;br /&gt;
The mailing list is primarily intended to provide information such as service interruptions or important updates.&lt;br /&gt;
&lt;br /&gt;
During the onboarding process you should be added to the mailing list. If you are not on the mailing list and wish to be added you can visit the above link or send us an email using one of the above methods.&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Available_software&amp;diff=39</id>
		<title>Available software</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Available_software&amp;diff=39"/>
		<updated>2019-03-15T18:49:02Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: /* IBM PowerAI Deep Learning Packages */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Basic Software ==&lt;br /&gt;
&lt;br /&gt;
* Red Hat Enterprise Linux Server release 7.5 (RHEL)&lt;br /&gt;
* gcc 4.8.5&lt;br /&gt;
* glibc 2.17&lt;br /&gt;
&lt;br /&gt;
== Anaconda Python ==&lt;br /&gt;
&lt;br /&gt;
Two Anaconda python environments are installed locally on each DeepSense compute node:&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! Version&lt;br /&gt;
! Environment location&lt;br /&gt;
|-&lt;br /&gt;
|python 2.7.15&lt;br /&gt;
|/opt/anaconda2&lt;br /&gt;
|-&lt;br /&gt;
|python 3.6.8&lt;br /&gt;
|/opt/anaconda2/envs/py36&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
These python environments have many packages installed, including prerequisite libraries for running the IBM PowerAI deep learning frameworks.&lt;br /&gt;
&lt;br /&gt;
See [[Getting_started]] for instructions on using the shared anaconda python environments.&lt;br /&gt;
&lt;br /&gt;
See [[Installing local software]] for instructions on installing and managing your own python environments in your home directory.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== IBM PowerAI Deep Learning Packages ==&lt;br /&gt;
&lt;br /&gt;
[https://developer.ibm.com/linuxonpower/deep-learning-powerai/ IBM PowerAI] includes multiple open source deep learning frameworks compiled for IBM Power8 systems.&lt;br /&gt;
&lt;br /&gt;
IBM PowerAI Enterprise includes:&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Framework&lt;br /&gt;
!colspan=&amp;quot;2&amp;quot;|Location&lt;br /&gt;
|-&lt;br /&gt;
|Caffe&lt;br /&gt;
|/opt/DL/caffe&lt;br /&gt;
|-&lt;br /&gt;
|cuDNN&lt;br /&gt;
|/opt/DL/cudnn&lt;br /&gt;
|-&lt;br /&gt;
|IBM Distributed Deep Learning (DDL)&lt;br /&gt;
|/opt/DL/ddl&lt;br /&gt;
|-&lt;br /&gt;
| HDF5&lt;br /&gt;
|/opt/DL/hdf5&lt;br /&gt;
|-&lt;br /&gt;
|NCCL&lt;br /&gt;
|/opt/DL/nccl&lt;br /&gt;
|/opt/DL/nccl2&lt;br /&gt;
|-&lt;br /&gt;
|openblas&lt;br /&gt;
|/opt/DL/openblas&lt;br /&gt;
|-&lt;br /&gt;
|protobuf&lt;br /&gt;
|/opt/DL/protobuf&lt;br /&gt;
|-&lt;br /&gt;
|pytorch&lt;br /&gt;
|/opt/DL/pytorch&lt;br /&gt;
|-&lt;br /&gt;
|snap-ml&lt;br /&gt;
|/opt/DL/snap-ml-local&lt;br /&gt;
|/opt/DL/snap-ml-mpi&lt;br /&gt;
|-&lt;br /&gt;
|Tensorflow 1.11 (including keras)&lt;br /&gt;
|/opt/DL/tensorflow&lt;br /&gt;
|/opt/DL/ddl-tensorflow&lt;br /&gt;
|-&lt;br /&gt;
|Tensorboard&lt;br /&gt;
|/opt/DL/tensorboard&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
To use most of these frameworks you need to activate a python2 or python3 environment and then activate the relevant framework.&lt;br /&gt;
&lt;br /&gt;
For example, to use tensorflow you can activate a python2 environment:&lt;br /&gt;
 . /opt/anaconda2/etc/profile.d/conda.sh&lt;br /&gt;
 conda activate&lt;br /&gt;
&lt;br /&gt;
and then activate tensorflow:&lt;br /&gt;
 source /opt/DL/tensorflow/bin/tensorflow-activate&lt;br /&gt;
&lt;br /&gt;
You can then &amp;lt;code&amp;gt;import tensorflow as tf&amp;lt;/code&amp;gt; in your python code.&lt;br /&gt;
&lt;br /&gt;
See [[Getting started with Deep Learning]] for a tutorial on using Caffe and Tensorflow on DeepSense.&lt;br /&gt;
&lt;br /&gt;
== IBM Advance Toolchain ==&lt;br /&gt;
&lt;br /&gt;
You may require newer versions of compilers such as GCC than are provided with RHEL.&lt;br /&gt;
&lt;br /&gt;
The [https://developer.ibm.com/linuxonpower/advance-toolchain IBM Advance Toolchain for Linux on Power] is a set of open source compilers, runtime libraries, and development tools.&lt;br /&gt;
&lt;br /&gt;
The IBM Advance Toolchain includes recent versions of:&lt;br /&gt;
* GNU Compiler Collection (gcc, g++ and gfortran)&lt;br /&gt;
* GNU C library (glibc)&lt;br /&gt;
* GNU Binary Utilities (binutils)&lt;br /&gt;
* Decimal Floating Point Library (libdfp)&lt;br /&gt;
* IBM Power Architecture Facilities Library (PAFLib)&lt;br /&gt;
* GNU Debugger (gdb)&lt;br /&gt;
* Python&lt;br /&gt;
* Golang&lt;br /&gt;
* Performance analysis tools (oprofile, valgrind, itrace)&lt;br /&gt;
* Multi-core exploitation libraries (TBB, Userspace RCU, SPHDE)&lt;br /&gt;
* Support libraries (libhugetlbfs, Boost, zlib, etc.)&lt;br /&gt;
&lt;br /&gt;
To use the Advance Toolchain, first activate environment modules:&lt;br /&gt;
 source /usr/local/Modules/init/bash&lt;br /&gt;
&lt;br /&gt;
Then load the advance toolchain:&lt;br /&gt;
 module load at12.0&lt;br /&gt;
&lt;br /&gt;
To stop using the advance toolchain, unload the environment module:&lt;br /&gt;
 module unload at12.0&lt;br /&gt;
&lt;br /&gt;
Note that software compiled with the Advance Toolchain and dynamically linked against its libraries will only run with the Advance Toolchain loaded.&lt;br /&gt;
&lt;br /&gt;
== Requesting Additional Software ==&lt;br /&gt;
&lt;br /&gt;
Contact DeepSense [[contact information|support]] to have additional software installed or for help installing or compiling software locally in your home directory.&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Known_problems&amp;diff=34</id>
		<title>Known problems</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Known_problems&amp;diff=34"/>
		<updated>2019-03-12T18:57:12Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Jupyter notebooks or other programs fail trying to access a /run directory ==&lt;br /&gt;
&lt;br /&gt;
The default login shell is Bash. Make sure the line &amp;lt;code&amp;gt;unset XDG_RUNTIME_DIR&amp;lt;/code&amp;gt; is in the .bashrc file in your home directory, as it prevents a problem where some types of jobs fail when run through the LSF queue. This should be done automatically the first time you log on to DeepSense; if the line is missing, add it with the following command:&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;code&amp;gt;echo &amp;#039;unset XDG_RUNTIME_DIR&amp;#039; &amp;gt;&amp;gt; ~/.bashrc&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This line has been added to the default .bashrc file for new users but older user accounts may need this step to be done manually. &lt;br /&gt;
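&lt;br /&gt;
Running the command above more than once appends the same line repeatedly. A minimal sketch of a check-then-append helper follows; the function name &amp;lt;code&amp;gt;ensure_line&amp;lt;/code&amp;gt; is hypothetical and not something DeepSense provides.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: append a line to a file only if that exact line
# is not already present.
# Usage: ensure_line 'unset XDG_RUNTIME_DIR' "$HOME/.bashrc"
ensure_line() {
    touch "$2"
    # -x matches whole lines, -F treats the text literally, -q prints nothing.
    if grep -qxF -e "$1" "$2"; then
        echo "already present"
    else
        printf '%s\n' "$1" >> "$2"
        echo "added"
    fi
}
```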
&lt;br /&gt;
== Cannot Install PyTorch dependencies ==&lt;br /&gt;
&lt;br /&gt;
 UnsatisfiableError: The following specifications were found to be in conflict:&lt;br /&gt;
   - powerai-pytorch-prereqs=0.4.1_12295.5cb3523&lt;br /&gt;
&lt;br /&gt;
You may see this error when attempting to install the pytorch dependencies in a local anaconda environment. This error indicates that some of your installed python packages are not compatible with the pytorch prerequisites. In particular, we see this error when conda has been updated to version 4.6 (which may sometimes happen when installing the tensorflow dependencies first).&lt;br /&gt;
&lt;br /&gt;
To resolve this problem, create a new environment with a 4.5.x conda version and then install the pytorch dependencies in that environment.&lt;br /&gt;
&lt;br /&gt;
== Cannot use Caffe on login node or compute nodes without GPUs ==&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;Cuda number of devices: -579579216&lt;br /&gt;
Current device id: -579579216&lt;br /&gt;
Current device name: &lt;br /&gt;
[==========] Running 2207 tests from 293 test cases.&lt;br /&gt;
[----------] Global test environment set-up.&lt;br /&gt;
[----------] 9 tests from AccuracyLayerTest/0, where TypeParam = caffe::CPUDevice&amp;lt;float&amp;gt;&lt;br /&gt;
[ RUN      ] AccuracyLayerTest/0.TestSetup&lt;br /&gt;
E0206 15:59:26.604874  7990 common.cpp:121] Cannot create Cublas handle. Cublas won&amp;#039;t be available.&lt;br /&gt;
E0206 15:59:26.611477  7990 common.cpp:128] Cannot create Curand generator. Curand won&amp;#039;t be available.&lt;br /&gt;
F0206 15:59:26.611616  7990 syncedmem.cpp:500] Check failed: error == cudaSuccess (30 vs. 0)  unknown error&lt;br /&gt;
*** Check failure stack trace: ***&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You may see this error when attempting to use Caffe on a node without GPUs, or on a GPU node without specifically requesting a GPU.&lt;br /&gt;
&lt;br /&gt;
To resolve this problem, use a GPU node and request a GPU. Caffe cannot run without an available GPU.&lt;br /&gt;
&lt;br /&gt;
== Cannot see GPUs in an LSF job ==&lt;br /&gt;
&lt;br /&gt;
 $ nvidia-smi &lt;br /&gt;
 No devices were found&lt;br /&gt;
&lt;br /&gt;
GPUs must be requested with the &amp;lt;code&amp;gt;-gpu -&amp;lt;/code&amp;gt; option to bsub. See [[LSF#GPU_Computation]] for more information.&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
	<entry>
		<id>https://docs.deepsense.ca/index.php?title=Getting_started&amp;diff=33</id>
		<title>Getting started</title>
		<link rel="alternate" type="text/html" href="https://docs.deepsense.ca/index.php?title=Getting_started&amp;diff=33"/>
		<updated>2019-03-12T18:53:39Z</updated>

		<summary type="html">&lt;p&gt;Cwhidden: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt; Getting Started with DeepSense &lt;br /&gt;
&lt;br /&gt;
&amp;lt;div class=&amp;quot;noautonum&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== 1. Request access to DeepSense ==&lt;br /&gt;
&lt;br /&gt;
If you belong to an approved DeepSense project but do not yet have an account then send an email to support@deepsense.ca with the subject &amp;quot;DeepSense Account Request&amp;quot; and provide your:&lt;br /&gt;
  a) First and last name&lt;br /&gt;
  b) Faculty of Computer Science username or requested FCS username&lt;br /&gt;
  c) Dalhousie BannerID&lt;br /&gt;
  d) Project ID&lt;br /&gt;
  e) Project leader&lt;br /&gt;
  f) Reason for requesting the account.&lt;br /&gt;
&lt;br /&gt;
== 2. Change your password ==&lt;br /&gt;
&lt;br /&gt;
If you require a new FCS username then your initial password is your BannerID. Please change it immediately upon receiving access to DeepSense.&lt;br /&gt;
&lt;br /&gt;
You can change your password at https://www.cs.dal.ca/csid&lt;br /&gt;
&lt;br /&gt;
Alternatively, you can contact cshelp@cs.dal.ca to reset your password.&lt;br /&gt;
&lt;br /&gt;
== 3. Logging on ==&lt;br /&gt;
&lt;br /&gt;
DeepSense has two login nodes, login1.deepsense.ca and login2.deepsense.ca. You can access these through SSH with your username and password from any computer on campus.&lt;br /&gt;
&lt;br /&gt;
From off campus you’ll need to use the Dalhousie VPN (https://wireless.dal.ca/vpnsoftware.php). If you are not a Dalhousie staff member, student, or faculty member but require off-campus access and cannot use the Dalhousie VPN, contact your project leader or support@deepsense.ca to make other arrangements.&lt;br /&gt;
&lt;br /&gt;
The login nodes are intended for testing and compiling code. Please don’t run long or intensive computation on these nodes.&lt;br /&gt;
&lt;br /&gt;
== 4. Transfer data ==&lt;br /&gt;
&lt;br /&gt;
DeepSense has two protocol nodes, protocol1.deepsense.ca and protocol2.deepsense.ca. You can connect to these using the SMB (Samba) file-sharing protocol, e.g. smb://protocol1.deepsense.ca with your username and password. Please contact your project leader or support@deepsense.ca if you need help transferring large amounts of data.&lt;br /&gt;
&lt;br /&gt;
Data transferred through the protocol nodes will be located in the shared /data directory.&lt;br /&gt;
&lt;br /&gt;
See [[Storage policies]] for more information about the available shared file systems, storage policies, and backup policies.&lt;br /&gt;
&lt;br /&gt;
== 5. Configure your environment ==&lt;br /&gt;
&lt;br /&gt;
DeepSense compute and management nodes are IBM Power8 computers (ppc64le) running Red Hat Enterprise Linux. See [[Resources]] for more details on the available nodes.&lt;br /&gt;
&lt;br /&gt;
=== 5.1 Loading a python environment ===&lt;br /&gt;
&lt;br /&gt;
DeepSense nodes have anaconda2 python installed in /opt/anaconda2. To use this systemwide python, add a line to the .bashrc file in your home directory:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;echo &amp;quot;. /opt/anaconda2/etc/profile.d/conda.sh&amp;quot; &amp;gt;&amp;gt; ~/.bashrc&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Then source your .bashrc file:&lt;br /&gt;
&amp;lt;code&amp;gt;source ~/.bashrc&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To load the python2 environment run &amp;lt;code&amp;gt;conda activate&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To use python3 you can activate the py36 environment:&lt;br /&gt;
&amp;lt;code&amp;gt;conda activate py36&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You can add either line to your .bashrc file to automatically load the desired environment when you log in.&lt;br /&gt;
&lt;br /&gt;
Alternatively, you may wish to install Anaconda or other software locally in your home directory. This allows you to install or update packages or software without requesting and waiting for DeepSense staff. See [[Installing local software]] for more information.&lt;br /&gt;
&lt;br /&gt;
== 6. Running compute jobs ==&lt;br /&gt;
&lt;br /&gt;
DeepSense has two different methods of submitting compute jobs.&lt;br /&gt;
&lt;br /&gt;
=== 6.1 Load Sharing Facility (LSF) ===&lt;br /&gt;
&lt;br /&gt;
LSF is a set of command line tools for submitting compute jobs. You may be familiar with other similar software such as Sun Grid Engine or SLURM.&lt;br /&gt;
&lt;br /&gt;
LSF jobs are submitted using the &amp;lt;code&amp;gt;bsub&amp;lt;/code&amp;gt; command.&lt;br /&gt;
&lt;br /&gt;
You can examine the progress of your currently running jobs with the &amp;lt;code&amp;gt;bjobs&amp;lt;/code&amp;gt; command.&lt;br /&gt;
&lt;br /&gt;
You can examine the available compute nodes and their available resources with the &amp;lt;code&amp;gt;bhosts&amp;lt;/code&amp;gt; command.&lt;br /&gt;
&lt;br /&gt;
For more information about using LSF see [[LSF]].&lt;br /&gt;
&lt;br /&gt;
=== 6.2 Conductor with Spark (CWS) ===&lt;br /&gt;
&lt;br /&gt;
CWS is an IBM web-based graphical interface for creating and running Apache Spark compute jobs.&lt;br /&gt;
&lt;br /&gt;
To use CWS, connect to the IBM Spectrum Computing Cluster Management Console at https://ds-mgm-02.deepsense.cs.dal.ca:8443. Log in with your username and password.&lt;br /&gt;
&lt;br /&gt;
Note that currently you need to accept a self-signed web certificate. In the future this will be fixed.&lt;br /&gt;
&lt;br /&gt;
For more information about using CWS see [[Conductor with Spark]].&lt;br /&gt;
&lt;br /&gt;
== 7. Deep Learning packages and other available software ==&lt;br /&gt;
&lt;br /&gt;
DeepSense has a variety of Deep Learning packages installed as part of IBM PowerAI including Tensorflow, Caffe, and PyTorch. These packages are installed in /opt/DL/ on each compute node and typically need to be activated before using them, e.g. &amp;lt;code&amp;gt;source /opt/DL/tensorflow/bin/tensorflow-activate&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Deep Learning packages are typically used on the GPU nodes but some deep learning packages can also be used on the login nodes and CPU-only nodes. This can be useful for testing your code or running CPU-bound workloads. To use the deep learning packages on the login or CPU-only nodes you will also need to load the GPU libraries with &amp;lt;code&amp;gt;source /opt/DL/cudnn/bin/cudnn-activate&amp;lt;/code&amp;gt;. Note that some deep learning packages may fail if run without a GPU, e.g. Caffe currently requires a GPU.&lt;br /&gt;
&lt;br /&gt;
For a brief tutorial including running Caffe and Tensorflow in a Jupyter notebook see [[Getting started with Deep Learning]].&lt;br /&gt;
&lt;br /&gt;
See [[Available software]] for the current list of installed software. If you require additional software you are welcome to install it locally in your home directory or contact DeepSense support.&lt;br /&gt;
&lt;br /&gt;
== 8. Technical and research support == &lt;br /&gt;
&lt;br /&gt;
DeepSense has a dedicated support team of research scientists ready to help you with technical questions, installing software, or even research questions.&lt;br /&gt;
&lt;br /&gt;
If you can&amp;#039;t find the answer to your question on this wiki or need more extensive help then send an email to support@deepsense.ca.&lt;br /&gt;
&lt;br /&gt;
See [[Technical support]] for more information about the support available.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/div&amp;gt; &amp;lt;!-- autonum --&amp;gt;&lt;/div&gt;</summary>
		<author><name>Cwhidden</name></author>
		
	</entry>
</feed>