Condor Roll: Users Guide

Chapter 3. Using the Condor Roll

First, make sure the Condor daemons are running by executing:

    # ps -ef | grep condor

On the frontend, the output should be similar to the following:

    condor    2623     1  0 Apr19 ?        00:04:26 /opt/condor/sbin/condor_master
    condor    2646  2623  0 Apr19 ?        00:20:25 condor_collector -f
    condor    2647  2623  0 Apr19 ?        00:04:56 condor_negotiator -f
    condor    2649  2623  0 Apr19 ?        00:00:02 condor_schedd -f

On the compute nodes, the output should be similar to the following:

    condor   17007     1  0 Apr19 ?        00:01:09 /opt/condor/sbin/condor_master
    condor   17009 17007  0 Apr19 ?        00:00:02 condor_schedd -f
    condor   17010 17007  0 Apr19 ?        00:09:09 condor_startd -f

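The daemon check above can also be scripted. The following is a minimal sketch, not part of the roll: `check_condor_daemons` is a hypothetical helper that reads `ps -ef`-style text on standard input and reports whether each daemon name passed as an argument appears in it. The daemon lists in the usage comments are taken from the sample output above.

```shell
# Hypothetical helper (not shipped with the Condor Roll): report which of the
# daemon names given as arguments appear in `ps -ef`-style text on stdin.
# Returns 0 if all were found, 1 if at least one is missing.
check_condor_daemons() {
    ps_output=$(cat)
    missing=0
    for daemon in "$@"; do
        if printf '%s\n' "$ps_output" | grep -q -- "$daemon"; then
            echo "$daemon: running"
        else
            echo "$daemon: NOT FOUND"
            missing=1
        fi
    done
    return "$missing"
}

# Assumed usage on the frontend:
#   ps -ef | check_condor_daemons condor_master condor_collector condor_negotiator condor_schedd
# Assumed usage on a compute node:
#   ps -ef | check_condor_daemons condor_master condor_schedd condor_startd
```

The function only inspects the text it is given, so it can be tried on saved `ps` output before being run on a live node.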
Next, try a test job submission:

    # su - condor
    $ cd ~condor/tests
    $ condor_submit subs/hmmpfam3

Check that the jobs were submitted by executing:

    $ condor_q

The output should be similar to:

    -- Submitter: rocks-155.sdsc.edu : <198.202.88.155:47289> : rocks-155.sdsc.edu
     ID      OWNER     SUBMITTED     RUN_TIME ST PRI SIZE CMD
       1.0   condor   4/27 23:02   0+00:00:48 R  0   0.2  hmmpfam data/db100
       1.1   condor   4/27 23:02   0+00:00:46 R  0   0.2  hmmpfam data/db100
       1.2   condor   4/27 23:02   0+00:00:44 R  0   0.2  hmmpfam data/db100
       1.3   condor   4/27 23:02   0+00:00:42 R  0   0.2  hmmpfam data/db100
       1.4   condor   4/27 23:02   0+00:00:38 R  0   0.2  hmmpfam data/db100
       1.5   condor   4/27 23:02   0+00:00:36 R  0   0.2  hmmpfam data/db100
       1.6   condor   4/27 23:02   0+00:00:34 R  0   0.2  hmmpfam data/db100
       1.7   condor   4/27 23:02   0+00:00:40 R  0   0.2  hmmpfam data/db100
       1.8   condor   4/27 23:02   0+00:00:32 I  0   0.2  hmmpfam data/db100
       1.9   condor   4/27 23:02   0+00:00:30 I  0   0.2  hmmpfam data/db100

In the status column (ST), R means the job is running and I means it is idle. The output from the jobs is written to results/.
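To get a quick tally of running and idle jobs, the status column can be counted with awk. This is a sketch under stated assumptions: `count_job_states` is a hypothetical helper, job lines are assumed to start with a cluster.process ID such as 1.0, and the state is assumed to be the sixth whitespace-separated field, as in the sample listing above.

```shell
# Hypothetical helper: count running (R) and idle (I) jobs in condor_q-style
# text read from stdin.
# Assumptions: job lines begin with an ID of the form <cluster>.<process>,
# and the ST value is the sixth whitespace-separated field.
count_job_states() {
    awk '$1 ~ /^[0-9]+\.[0-9]+$/ { n[$6]++ }
         END { printf "running=%d idle=%d\n", n["R"], n["I"] }'
}

# Assumed usage:
#   condor_q | count_job_states
```

Header and submitter lines are skipped because their first field is not a job ID. Other states that may appear in the ST column are counted internally but not printed by this sketch.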
Once the queue is empty (the above command shows no jobs), you can see the history of job execution with:

    $ condor_history

To see all the nodes in the Condor pool, run:

    $ condor_status

The output should be similar to:

    Name          OpSys  Arch   State      Activity  LoadAv  Mem  ActvtyTime
    vm1@compute-0 LINUX  INTEL  Unclaimed  Idle      0.000   506  0+00:40:04
    vm2@compute-0 LINUX  INTEL  Unclaimed  Idle      0.000   506  0+00:45:05
    vm3@compute-0 LINUX  INTEL  Unclaimed  Idle      0.000   506  0+00:45:06
    vm4@compute-0 LINUX  INTEL  Unclaimed  Idle      0.000   506  0+00:45:07
    vm1@compute-0 LINUX  INTEL  Unclaimed  Idle      0.000   506  0+00:35:04
    vm2@compute-0 LINUX  INTEL  Unclaimed  Idle      0.000   506  0+00:40:05
    vm3@compute-0 LINUX  INTEL  Unclaimed  Idle      0.000   506  0+00:40:06
    vm4@compute-0 LINUX  INTEL  Unclaimed  Idle      0.000   506  0+00:40:07
    vm1@compute-0 LINUX  INTEL  Unclaimed  Idle      0.000   506  0+00:25:04
    vm2@compute-0 LINUX  INTEL  Unclaimed  Idle      0.000   506  0+00:30:05
    vm3@compute-0 LINUX  INTEL  Unclaimed  Idle      0.000   506  0+00:30:06
    vm4@compute-0 LINUX  INTEL  Unclaimed  Idle      0.000   506  0+00:30:07
    vm1@compute-0 LINUX  INTEL  Unclaimed  Idle      0.000   506  0+00:15:05
    vm2@compute-0 LINUX  INTEL  Unclaimed  Idle      0.000   506  0+00:20:06
    vm3@compute-0 LINUX  INTEL  Unclaimed  Idle      0.000   506  0+00:20:07
    vm4@compute-0 LINUX  INTEL  Unclaimed  Idle      0.000   506  0+00:20:08
    vm1@compute-0 LINUX  INTEL  Unclaimed  Idle      0.000   506  0+00:10:04
    vm2@compute-0 LINUX  INTEL  Unclaimed  Idle      0.000   506  0+00:15:05
    vm3@compute-0 LINUX  INTEL  Unclaimed  Idle      0.000   506  0+00:15:06
    vm4@compute-0 LINUX  INTEL  Unclaimed  Idle      0.000   506  0+00:15:07
    vm1@compute-0 LINUX  INTEL  Unclaimed  Idle      0.000   506  0+00:00:04
    vm2@compute-0 LINUX  INTEL  Unclaimed  Idle      0.000   506  0+00:05:05
    vm3@compute-0 LINUX  INTEL  Unclaimed  Idle      0.000   506  0+00:05:06
    vm4@compute-0 LINUX  INTEL  Unclaimed  Idle      0.000   506  0+00:05:07
    vm1@compute-0 LINUX  INTEL  Owner      Idle      0.860   506  0+00:00:09
    vm2@compute-0 LINUX  INTEL  Unclaimed  Idle      0.000   506  0+00:00:05
    vm3@compute-0 LINUX  INTEL  Unclaimed  Idle      0.000   506  0+00:00:06
    vm4@compute-0 LINUX  INTEL  Unclaimed  Idle      0.000   506  0+00:00:07

                 Machines  Owner  Claimed  Unclaimed  Matched  Preempting
    INTEL/LINUX        28      1        0         27        0           0
          Total        28      1        0         27        0           0

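The per-state totals at the bottom of the listing can be cross-checked by tallying the State column directly. The sketch below assumes that every machine line has a name containing `@` (as the vmN@host names above do) and that State is the fourth whitespace-separated field; `count_slot_states` is a hypothetical name, not a Condor tool.

```shell
# Hypothetical helper: tally the State column of condor_status-style text
# read from stdin, one "State count" pair per line, sorted by state name.
# Assumptions: machine lines have an "@" in the first field, and State is
# the fourth whitespace-separated field.
count_slot_states() {
    awk '$1 ~ /@/ { n[$4]++ }
         END { for (s in n) print s, n[s] }' | sort
}

# Assumed usage:
#   condor_status | count_slot_states
```

Machines whose names carry no `@` prefix would be missed by this filter, so the match on the first field may need adjusting for other pool layouts.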
The directory ~condor/tests contains a few test programs with corresponding job submit files for running test jobs in different Condor universes. The test programs are in bin/, and the submit files are in subs/. The output of the jobs, if any, goes to results/. To run these tests as user condor, execute the condor_submit command followed by the desired submit file name from subs/. For example:

    $ cd ~/tests
    $ condor_submit subs/hmmpfam3

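For readers who have not seen a submit file before, the sketch below writes a small one. It is illustrative only and does not reproduce the contents of subs/hmmpfam3: the helper name `write_example_submit`, the program bin/example_prog, its argument, and the file names under results/ are all hypothetical. The submit commands themselves (`universe`, `executable`, `arguments`, `output`, `error`, `log`, `queue`) and the `$(Process)` macro are standard Condor submit-file syntax.

```shell
# Hypothetical helper: write an example submit description file to the path
# given as the first argument. All paths and values in the file are
# illustrative assumptions, not the contents of subs/hmmpfam3.
write_example_submit() {
    cat > "$1" <<'EOF'
universe   = vanilla
executable = bin/example_prog
arguments  = input.dat
output     = results/example.$(Process).out
error      = results/example.$(Process).err
log        = results/example.log
queue 10
EOF
}

# Assumed usage, from ~condor/tests:
#   write_example_submit subs/example
#   condor_submit subs/example
```

Here `queue 10` asks for ten jobs in one cluster, and `$(Process)` expands to each job's process number (0 through 9), so every job writes to its own output and error file.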
Note: The test program tests/bin/simple.mpi and its submit file tests/subs/submit_mpi are provided only as a reference. Current Condor binaries do not work with MPI programs compiled with an mpicc version higher than 1.2.4. If you wish to run jobs in the MPI universe, compile your programs with MPI versions 1.2.2 through 1.2.4.