CWS
IBM Spectrum Conductor with Spark (CWS) lets you efficiently deploy and manage multiple Spark deployments on DeepSense computing hardware. CWS supports multiple versions and instances of Spark, provides multitenancy through Spark instance groups, and maximizes usage of resources with increased performance and scale.
Accessing CWS
To access Spectrum Conductor and get started with Spark applications, you can use either the web-based management console or the command-line interface.
Management Console
The management console, which is the web interface to IBM Spectrum Conductor with Spark, provides a single point of access to key system components for cluster monitoring and control, configuration, and troubleshooting. The DeepSense IBM CWS Management Console is available at https://ds-mgm-02.deepsense.cs.dal.ca:8443. Go to the URL and log in using your DeepSense account information.
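Before opening a browser, it can be handy to confirm that the console is reachable from your machine. The following is a minimal sketch, assuming `curl` is installed; the function name `cws_console_status` and the variable `CWS_CONSOLE_URL` are hypothetical names introduced here for illustration and are not part of CWS.

```shell
#!/usr/bin/env bash
# Sketch only: report the HTTP status code returned by the management console.
# cws_console_status and CWS_CONSOLE_URL are illustrative names, not CWS tools.

CWS_CONSOLE_URL="${CWS_CONSOLE_URL:-https://ds-mgm-02.deepsense.cs.dal.ca:8443}"

cws_console_status() {
    local url="${1:-$CWS_CONSOLE_URL}" code

    # -s silences progress, -S still shows errors, -o discards the body,
    # -w prints only the status code, --max-time bounds the wait.
    if ! code="$(curl -sS -o /dev/null -w '%{http_code}' --max-time 10 "$url")"; then
        echo "Could not reach $url" >&2
        return 1
    fi
    echo "$url responded with HTTP $code"
}
```

Calling `cws_console_status` with no arguments checks the URL given above. Any HTTP response suggests the server is reachable, while a connection or certificate error more likely points to a network or trust issue than to a problem with your account.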
Command Line Option
Spectrum Conductor with Spark also includes a command-line interface (CLI) for administration. You can launch the CLI by starting a command console and sourcing the environment for your shell.
Steps to launch the command console:
- From the login node, ssh to ‘ds-cmhm-02.deepsense.cs.dal.ca’
ssh ds-cmhm-02.deepsense.cs.dal.ca
- Source the environment for your shell:
source /software/WMLA/wmla/profile.platform
- Log in using your account, replacing user_name and password with your own credentials:
egosh user logon -u 'user_name' -x 'password'
- List the available resources:
egosh resource list
- View current activity:
egosh activity view
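The steps above, once you are on ds-cmhm-02, can be sketched as a single shell function. This is a minimal sketch rather than an official tool: `cws_session` and `CWS_PROFILE` are hypothetical names introduced here, and only the `egosh` subcommands shown in the steps are used. The password is read from a prompt so that it does not end up in your shell history, although it is still passed to `egosh` as an argument.

```shell
#!/usr/bin/env bash
# Sketch only: wraps the logon and status commands from the steps above.
# cws_session and CWS_PROFILE are illustrative names, not part of CWS.

CWS_PROFILE="${CWS_PROFILE:-/software/WMLA/wmla/profile.platform}"

cws_session() {
    local user="${1:-}" password

    if [ -z "$user" ]; then
        echo "usage: cws_session USER_NAME" >&2
        return 2
    fi

    # Load the egosh environment if the profile exists on this host.
    if [ -f "$CWS_PROFILE" ]; then
        # shellcheck disable=SC1090
        . "$CWS_PROFILE"
    fi

    if ! command -v egosh >/dev/null 2>&1; then
        echo "egosh not found; source $CWS_PROFILE first" >&2
        return 1
    fi

    # Prompt without echoing so the password stays out of shell history.
    read -r -s -p "Password for $user: " password
    echo

    egosh user logon -u "$user" -x "$password" || return 1
    egosh resource list
    egosh activity view
}
```

After sourcing a file containing this function, `cws_session your_user_name` logs you in and then prints the resource list and current activity.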
The complete list of the CLI commands with details is available at the IBM Knowledge Center.
Spark Workload
To create a Spark Instance Group (SIG), go to the management console and click Workload -> Spark -> Spark Instance Group. For the basic configuration, you need to specify the name, the deployment directory, and the execution user. For details on creating and managing a SIG, please refer to the following instructions.
Spark Versions
After specifying the basic configuration, you can choose one of the Spark versions to deploy. Currently, the following Spark versions are available:
- Spark 2.4.3
- Spark 2.3.3
Notebooks
Notebooks provide an interactive environment for data analysis with visualization from a web browser. Jupyter notebook version 5.4.0 is available. The instructions below describe how to set up and use a notebook on Conductor.
When you create your SIG, in the "Enable notebooks" section, check the "Jupyter 5.4.0" box and select your "Anaconda distribution instance" and "Conda environment". The following screenshot is an example:
Finish creating the SIG, then deploy and start it before you use the notebook. The example SIG here is named "Lu-Jun-Test". Click the SIG name to open it. The Notebooks tab is shown in the following screenshot:
Click the "Notebooks" tab, then click the green "Create Notebooks for Users" button, select yourself as the user, and click the "Create" button:
After you create the notebooks, click the "My Notebooks" button and select your notebook: