Installing local software

From DeepSense Docs
Revision as of 12:24, 30 October 2020 by Bgeetika

Note

On June 26 we will update the GPU compute nodes to a new version of IBM Watson Machine Learning Accelerator. This will change the way you access deep learning packages like Tensorflow and Pytorch. Instead of "activating" these packages, you will be able to install new versions directly in your anaconda environment. See below for more information.

We are actively updating the wiki documentation to explain the new method of accessing deep learning packages. Please bear with us during these updates as some documentation may still refer to the old method of "activating" deep learning packages.

Introduction

You are welcome to install software locally in your home directory. This allows you to use specific versions of software instead of the cluster-wide versions. For example, you may need an older version of a specific package or a newly released version that isn't yet installed on DeepSense.

For assistance installing or compiling software, contact Technical Support. We will support locally installed software to the best of our ability, although we cannot guarantee that all software will run on the DeepSense platform. If the software you want will not run, we can help you find alternatives, such as different software or a different system for some of your computation.

If you attempt to install compiled software (e.g. an anaconda package) and the package cannot be found, contact Technical Support as well. The package may not have been compiled for the DeepSense hardware architecture (ppc64le).
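Before reporting a missing package, it can help to confirm which architecture your session is running on. The following is a minimal Python sketch, not an official DeepSense tool. The function name is_ppc64le is made up for illustration; it relies only on platform.machine() from the standard library.

```python
import platform


def is_ppc64le(machine=None):
    """Return True if the given (or current) machine string is ppc64le."""
    # platform.machine() reports the hardware name, e.g. "ppc64le" or "x86_64".
    if machine is None:
        machine = platform.machine()
    return machine.strip().lower() == "ppc64le"


if __name__ == "__main__":
    print("Architecture:", platform.machine())
    if is_ppc64le():
        print("Compiled packages must be built for ppc64le to install here.")
    else:
        print("This machine is not ppc64le; package availability may differ.")
```

If this prints ppc64le and a package still cannot be found, the package has probably not been built for this architecture.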

If your project has specific software you want to share between members, we can create a shared directory for your group in /software/<project>.

If you have locally compiled software that you think may be useful for other DeepSense users, let us know through Technical Support. We may install and support it systemwide if there is sufficient interest.

Installing Anaconda Python in your home directory

Stop using systemwide anaconda

If you added the system anaconda environment to your .bashrc file, remove the line:

. /opt/anaconda2/etc/profile.d/conda.sh

Installing Anaconda with a python3 base

From your home directory run:

wget https://repo.continuum.io/archive/Anaconda3-5.2.0-Linux-ppc64le.sh
bash Anaconda3-5.2.0-Linux-ppc64le.sh

Note: please enter "yes" when asked if you want to add anaconda to your .bashrc file. If you do not, you will need to add the following command to your .bashrc file or run it each time before using anaconda:

. ~/anaconda3/etc/profile.d/conda.sh

After the installer finishes, either close and restart your terminal or run:

source ~/.bashrc

Adding a python2 environment

The previous instruction creates a python3 base environment. To add a python2 environment:

conda create -n py27 python=2.7

Activate this environment to use python2:

conda activate py27

Note: if you receive an error message, you may need to deactivate the base conda environment first:

conda deactivate
conda activate py27

Adding a python3 environment

We recommend creating a separate python3 environment from the base environment. This makes it easier to install the specific packages required for IBM PowerAI.

conda create -n py36 python=3.6

Activate this environment to use python3:

conda activate py36

(New Method) IBM-AI Deep Learning Anaconda Channel

To use deep learning packages like Tensorflow on DeepSense you need to add the IBM-AI anaconda channel to your list of available software channels.

conda config --prepend channels https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/

We suggest creating a new environment for each deep learning package you want to use. For example for Tensorflow:

conda create -n py36_tensorflow python=3.6
conda activate py36_tensorflow

Then install the anaconda package for the software you need. Again, with Tensorflow as an example:

conda install tensorflow

You can then use Tensorflow or other deep learning packages as needed by simply activating that anaconda environment. Unlike the old method, you do not need to separately activate Tensorflow or other deep learning packages.

To see a list of available software, visit the IBM-AI anaconda channel URL directly: https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/


(Old Method) Install PowerAI dependencies

Warning: these scripts will install, update, and downgrade some packages to the recommended packages for the current version of PowerAI. You may want to create a separate python environment to use different versions of those packages with other software.

To use Tensorflow first install the Tensorflow dependencies:

/opt/DL/tensorflow/bin/install_dependencies

To use PyTorch first install the PyTorch dependencies:

/opt/DL/pytorch/bin/install_dependencies

The dependencies must be installed in whichever python environment you intend to use. We've encountered some problems installing the PyTorch dependencies directly in the base environment if the base conda environment has been updated to conda version 4.6.2. If you want to use PyTorch, be sure to use a conda environment with a lower version of conda.


Install PyTorch 1.6.0 in a user's home directory on DeepSense

You can install PyTorch yourself in your home directory using the prebuilt packages in /software/PyTorch-1.6.0-Build. The current build only works with Python 3.6, so you need to create a conda environment with Python 3.6. If you would like to use a higher version of Python, ask the DeepSense team to build PyTorch for that version.

Here are the steps to install PyTorch 1.6.0 in your home directory.

1. Source the conda environment you would like to use. For example:

source ~/anaconda3/etc/profile.d/conda.sh

2. Choose the environment in which to install PyTorch. If the environment has not been created yet, you can create it and install PyTorch in a single command. For example, to create an environment named "my-environment" (this is just an example; please choose a meaningful name) and install PyTorch, run:

conda create -y -n my-environment python=3.6 pytorch -c file:////software/PyTorch-1.6.0-Build/condabuild

3. If the environment already exists (say it is named "my-environment"), activate it first and then install PyTorch. For example:

conda activate my-environment
conda install pytorch -c file:////software/PyTorch-1.6.0-Build/condabuild

This should take about 2 minutes to install.

4. To test whether the install was successful, start python from the environment where PyTorch is installed. Then run "import torch" and check that no errors are reported. For example:

[luy@ds-lg-01 ~]$ conda activate my-environment
(my-environment) [luy@ds-lg-01 ~]$ python
Python 3.6.12 |Anaconda, Inc.| (default, Sep  9 2020, 00:40:10) 
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> 
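Steps 2 and 3 differ only in whether the environment already exists. As a rough sketch, that choice can be expressed in Python. The helper name pytorch_install_command is hypothetical, and the list of existing environment paths is assumed to come from `conda env list --json`, which prints a JSON object with an "envs" key. Unlike step 3, the sketch targets the environment with `-n` instead of activating it first. The function only builds the command; it does not run it.

```python
import os

# Local channel path taken from the commands above.
PYTORCH_CHANNEL = "file:////software/PyTorch-1.6.0-Build/condabuild"


def pytorch_install_command(env_name, existing_env_paths):
    """Build the conda command for step 2 (create) or step 3 (install).

    existing_env_paths is a list of environment directories, for example
    the "envs" list printed by `conda env list --json`.
    """
    existing_names = {os.path.basename(p.rstrip("/")) for p in existing_env_paths}
    if env_name in existing_names:
        # Step 3: the environment exists, so install into it by name.
        return ["conda", "install", "-y", "-n", env_name,
                "pytorch", "-c", PYTORCH_CHANNEL]
    # Step 2: create the environment and install PyTorch in one command.
    return ["conda", "create", "-y", "-n", env_name,
            "python=3.6", "pytorch", "-c", PYTORCH_CHANNEL]


if __name__ == "__main__":
    # Example paths only; substitute the output of `conda env list --json`.
    envs = ["/home/user/anaconda3", "/home/user/anaconda3/envs/py36"]
    print(" ".join(pytorch_install_command("my-environment", envs)))
```

Printing the command before running it makes it easy to check the environment name and channel path first.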

Install OpenCV 3.4.10 in a user's home directory on DeepSense

You can install OpenCV yourself in your home directory using the prebuilt packages in /software/PyTorch-1.6.0-Build/opencv-feedstock. The current build only works with Python 3.6, so you need to create a conda environment with Python 3.6. If you would like to use a higher version of Python, ask the DeepSense team to build OpenCV for that version.

Here are the steps to install OpenCV 3.4.10 in your home directory.

1. Source the conda environment you would like to use. For example:

source ~/anaconda3/etc/profile.d/conda.sh

2. Activate the environment in which you want to install OpenCV. If the environment has not been created yet, create one first. Assuming you have created an environment named "my-environment", activate it and install OpenCV with the following commands:

conda activate my-environment
conda install opencv -c file:////software/PyTorch-1.6.0-Build/opencv-feedstock/condabuild

This should take about 2 minutes to install.

3. To test whether the install was successful, start python from the environment where OpenCV is installed. Then run "import cv2" and check that no errors are reported. For example:

[luy@ds-lg-01 ~]$ conda activate my-environment
(my-environment) [luy@ds-lg-01 ~]$ python
Python 3.6.12 |Anaconda, Inc.| (default, Sep  9 2020, 00:40:10) 
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> 
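The interactive import checks for PyTorch and OpenCV can also be scripted. The following is a minimal sketch, assuming it is run with the python of the environment you want to test. The function name check_import is made up for illustration, and reading `__version__` is a common convention that not every package follows.

```python
import importlib


def check_import(module_name):
    """Try to import a module; return (ok, detail)."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        # Covers a missing package as well as a missing shared library.
        return False, str(exc)
    # Many packages expose __version__, but not all do.
    return True, getattr(module, "__version__", "unknown version")


if __name__ == "__main__":
    for name in ("torch", "cv2"):
        ok, detail = check_import(name)
        print(f"{name}: {'OK' if ok else 'FAILED'} ({detail})")
```

A FAILED line includes the original error message, which is useful to include when contacting Technical Support.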

Install other dependencies

If you need additional python libraries then you can install them in your python environment.

The base package comes with several python libraries but you may want a newer version or additional libraries. Also, when you create a new environment it does not automatically get all of the same libraries as the base environment.

For example, suppose you want to install the scikit-learn package in your python3 environment.

First you need to activate the environment:

conda activate py36

Then install the package:

conda install scikit-learn

A list of recommended packages follows in the next section.

Recommended packages

Jupyter Notebooks for deep learning

conda install jupyter

(Old Method) Testing Deep Learning packages on the login nodes or non-GPU nodes

You may wish to run PowerAI software on the login nodes for testing, or on the CPU-only nodes for some workflows.

Only the GPU nodes have graphics cards and graphics drivers installed. If you attempt to run deep learning software such as Tensorflow on the login nodes or CPU-only nodes, you will see errors like the following:

ImportError: libcublas.so.9.2: cannot open shared object file: No such file or directory

You need to load the GPU drivers with the following command:

source /opt/DL/cudnn/bin/cudnn-activate

Then you can activate the deep learning package, e.g. for Tensorflow:

source /opt/DL/tensorflow/bin/tensorflow-activate

Note that some deep learning software may be much slower or refuse to run without GPU access. Tensorflow works but Caffe does not.

Keep in mind you need to activate the GPU drivers and deep learning package in each browser shell before you are able to use the package in your code or LSF jobs.
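Under the old method, the libcublas error shown above can be detected before starting a longer job. The following is a minimal sketch using ctypes from the standard library. The function name can_load_library is hypothetical, the library name is copied from the error message above, and it is an assumption that sourcing the activation scripts is what makes the library visible to the dynamic loader.

```python
import ctypes


def can_load_library(name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(name)
    except OSError:
        # Raised when the library (or one of its dependencies) is not found.
        return False
    return True


if __name__ == "__main__":
    # Library name taken from the ImportError message above.
    lib = "libcublas.so.9.2"
    if can_load_library(lib):
        print(lib, "is available in this shell")
    else:
        print(lib, "is not available; the activation scripts may not have been sourced")
```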

Compiling Software for DeepSense

DeepSense uses IBM POWER8 systems running Red Hat Enterprise Linux. Code must be compiled for ppc64le (64-bit little-endian PowerPC).

Some software may not have binaries available for ppc64le even if it does for other systems. If this happens, you (or DeepSense support) will need to compile the software to run on DeepSense. Visit the web page for the software and see if the source code is available (e.g. through GitHub). If so, follow the compilation instructions to build the software.

You may encounter errors when attempting to compile software for ppc64le. Often this occurs because of differences between ppc64le and other common architectures such as x86 and x86_64.

For example, one DeepSense user attempted to compile the rdkit software package from https://www.rdkit.org/. The compilation failed when it attempted to use the gcc x86 option -mpopcnt. After replacing that option with the ppc64le equivalent, -mpopcntb, the software compiled successfully.
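A build script can select such a flag by architecture instead of hard-coding the x86 one. The following is a small sketch of that idea, not part of rdkit's build system. The names POPCOUNT_FLAGS and popcount_flag are hypothetical, and the table contains only the two flags from the rdkit example.

```python
import platform

# Flags taken from the rdkit example above; other architectures are not covered.
POPCOUNT_FLAGS = {
    "x86_64": "-mpopcnt",
    "ppc64le": "-mpopcntb",
}


def popcount_flag(machine=None):
    """Return the gcc popcount flag for the architecture, or None if unknown."""
    if machine is None:
        machine = platform.machine()
    return POPCOUNT_FLAGS.get(machine)


if __name__ == "__main__":
    flag = popcount_flag()
    if flag is None:
        print("No popcount flag known for", platform.machine())
    else:
        print("Use", flag, "on", platform.machine())
```

Returning None for an unknown architecture lets the caller omit the flag instead of passing one the compiler would reject.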