Checking Job Status

From DeepSense Docs
Latest revision as of 15:24, 4 December 2020

Job status can be checked with the bjobs command. For more details on bjobs, see the bjobs documentation.

Some examples of bjobs usage:

1. bjobs

With no arguments, bjobs displays your LSF jobs that are pending, running, or suspended.

For example:

[bhatiag@ds-cmgpu-04 ~]$ bjobs
JOBID   USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
5970    bhatiag RUN   gpu        ds-lg-01    ds-cmgpu-03 bash       Dec  4 10:34
5971    bhatiag RUN   gpu        ds-cmgpu-03 ds-cmgpu-04 bash       Dec  4 10:55
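
When many jobs are listed, a per-status summary can be easier to read than the raw table. The following is a minimal sketch, assuming the default bjobs column layout shown above (JOBID in column 1, STAT in column 3). The helper name count_by_stat is hypothetical and not part of LSF.

```shell
# count_by_stat: tally jobs per STAT value from default `bjobs` output on stdin.
# Hypothetical helper name; assumes JOBID is column 1 and STAT is column 3.
count_by_stat() {
  # Only lines whose first field is a numeric job ID are counted, which
  # skips the header row and any continuation lines.
  awk '$1 ~ /^[0-9]+$/ && NF >= 3 { n[$3]++ }
       END { for (s in n) print s, n[s] }' | sort
}

# Usage on the cluster (assumes bjobs is on PATH):
#   bjobs | count_by_stat
```

For the two jobs listed above, this would print a single line, RUN 2.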

2. bjobs jobid

Passing one or more job IDs to bjobs displays the status of only those jobs.

For example:

[bhatiag@ds-cmgpu-04 ~]$ bjobs 5970
JOBID   USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
5970    bhatiag RUN   gpu        ds-lg-01    ds-cmgpu-03 bash       Dec  4 10:34
[bhatiag@ds-cmgpu-04 ~]$ bjobs 5970 5971
JOBID   USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
5970    bhatiag RUN   gpu        ds-lg-01    ds-cmgpu-03 bash       Dec  4 10:34
5971    bhatiag RUN   gpu        ds-cmgpu-03 ds-cmgpu-04 bash       Dec  4 10:55
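
Querying by job ID is also a convenient way to wait for a job to finish. The sketch below is one possible approach, not an official LSF tool. The helper names job_stat and wait_for_job are hypothetical, the default 30-second poll interval is an arbitrary choice, and the code assumes the default column layout shown above. It stops polling when the status becomes DONE or EXIT, or when bjobs no longer reports the job.

```shell
# job_stat JOBID: print the STAT column for JOBID from `bjobs` output on stdin.
# Hypothetical helper; assumes JOBID is column 1 and STAT is column 3.
job_stat() {
  awk -v id="$1" '$1 == id { print $3; exit }'
}

# wait_for_job JOBID [INTERVAL]: poll bjobs until the job is finished.
# Hypothetical helper; INTERVAL is in seconds and defaults to 30 (an assumption).
wait_for_job() {
  while :; do
    stat=$(bjobs "$1" 2>/dev/null | job_stat "$1")
    case "$stat" in
      DONE|EXIT|"") break ;;   # finished, failed, or no longer reported
    esac
    sleep "${2:-30}"
  done
  # Print the last status seen, or UNKNOWN if bjobs reported nothing.
  printf '%s\n' "${stat:-UNKNOWN}"
}

# Usage on the cluster:
#   bjobs 5970 5971 | job_stat 5971
#   wait_for_job 5970 60
```

Keep the poll interval generous so the scheduler is not queried more often than needed.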

3. bjobs -l -gpu jobid

Using -gpu together with -l displays the job in long format, including its GPU requirements and GPU allocation.

For example:

[bhatiag@ds-cmgpu-04 ~]$ bjobs -l -gpu 5970
Job <5970>, User <bhatiag>, Project <default>, Status <RUN>, Queue <gpu>, Inter
                    active pseudo-terminal shell mode, Command <bash>, Share g
                    roup charged </bhatiag>
Fri Dec  4 10:34:08: Submitted from host <ds-lg-01>, CWD <$HOME>, Requested GPU
                    ;
Fri Dec  4 10:34:08: Started 1 Task(s) on Host(s) <ds-cmgpu-03>, Allocated 1 Sl
                    ot(s) on Host(s) <ds-cmgpu-03>;
Fri Dec  4 11:01:33: Resource usage collected.
                    The CPU time used is 1 seconds.
                    MEM: 14 Mbytes;  SWAP: 0 Mbytes;  NTHREAD: 6
                    PGID: 49226;  PIDs: 49226
                    PGID: 49240;  PIDs: 49240
                    PGID: 49242;  PIDs: 49242
                    PGID: 51520;  PIDs: 51520
RUNLIMIT
10080.0 min
MEMORY USAGE:
MAX MEM: 14 Mbytes;  AVG MEM: 8 Mbytes
SCHEDULING PARAMETERS:
          r15s   r1m  r15m   ut      pg    io   ls    it    tmp    swp    mem
loadSched   -     -     -     -       -     -    -     -     -      -      -
loadStop    -     -     -     -       -     -    -     -     -      -      -
EXTERNAL MESSAGES:
MSG_ID FROM       POST_TIME      MESSAGE                             ATTACHMENT
0      bhatiag    Dec  4 10:34   ds-cmgpu-03:gpus=1;                     N
RESOURCE REQUIREMENT DETAILS:
Combined: select[(type == any ) && (ngpus>0)] order[r15s:pg] rusage[ngpus_phys
                    ical=1.00]
Effective: select[(type == any ) && (ngpus>0)] order[r15s:pg] rusage[ngpus_phy
                    sical=1.00]
GPU REQUIREMENT DETAILS:
Combined: num=1:mode=exclusive_process:mps=yes:j_exclusive=yes
Effective: num=1:mode=exclusive_process:mps=yes:j_exclusive=yes
GPU_ALLOCATION:
HOST             TASK ID  MODEL        MTOTAL  FACTOR MRSV    SOCKET NVLINK
ds-cmgpu-03      0    1   TeslaP100_SX 15.8G   6.0    0       1      -
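
The long output is verbose, so it can help to pull out just the GPU allocation. The following is a minimal sketch, assuming the GPU_ALLOCATION table is laid out as in the example above (HOST in column 1, MODEL in column 4) and ends at a blank line or at the end of the output. The helper name gpu_alloc is hypothetical and not part of LSF.

```shell
# gpu_alloc: print "HOST MODEL" for each row of the GPU_ALLOCATION table
# found in `bjobs -l -gpu` output on stdin.
# Hypothetical helper; assumes the column layout shown in the example above.
gpu_alloc() {
  awk '
    /^GPU_ALLOCATION:/     { in_tab = 1; next }  # start of the table section
    in_tab && $1 == "HOST" { next }              # skip the column header row
    in_tab && NF == 0      { in_tab = 0 }        # a blank line ends the table
    in_tab                 { print $1, $4 }      # HOST and MODEL columns
  '
}

# Usage on the cluster:
#   bjobs -l -gpu 5970 | gpu_alloc
```

For job 5970 above, this would print ds-cmgpu-03 TeslaP100_SX.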

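Note: bkill (terminate a job), bstop (suspend a job), and bresume (resume a suspended job) are other useful commands. Users should read their manual pages in the LSF command reference: https://www.ibm.com/support/knowledgecenter/SSWRJV_10.1.0/lsf_welcome/lsf_kc_cmd_ref.html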