Execution of Analysis jobs on ANALY_MWT2
Introduction
Jobs were submitted with Pathena (from uct3-edge5).
Results were checked in the Panda DB by typing queries with the mysql client; jobsArchived4 is the table containing the analysis jobs.
File results were checked using the DQ2 end-user clients.
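As a rough illustration, the status check could be done with a query like the one sketched below. The table name jobsArchived4 comes from the text; the column names jobStatus and computingSite are assumptions about the Panda DB schema, and the query is only assembled here, not run against a database.

```python
# Sketch of the kind of query typed into the mysql client.
# jobsArchived4 is from the text; jobStatus and computingSite are
# assumed column names in the Panda DB schema.
def status_count_query(site: str) -> str:
    return (
        "SELECT jobStatus, COUNT(*) "
        "FROM jobsArchived4 "
        f"WHERE computingSite = '{site}' "
        "GROUP BY jobStatus"
    )

print(status_count_query("ANALY_MWT2"))
```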
Job description
The job executed is an example available in DPD production:
- Each Pathena job has 1 build job.
- Each job is split into 1 job per input file, which makes 21 jobs (the input dataset has 21 files).
- It copies the input AOD with dccp to the run directory (in /scratch).
- It runs AnalyzeJpsiphi.py.
- At the end it copies the output files back to USERDISK (using lcg-cp).
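The per-job file movement described above can be sketched as command lines. All paths, URLs, and file names below are hypothetical placeholders; the commands are only assembled for illustration, not executed.

```python
# Assemble the stage-in / run / stage-out commands described above.
# All paths and storage URLs are hypothetical placeholders.
def stage_in(aod_file: str, rundir: str = "/scratch/run") -> list[str]:
    # Copy the input AOD from dCache to the local run directory.
    return ["dccp", aod_file, rundir]

def run_job(jobo: str = "AnalyzeJpsiphi.py") -> list[str]:
    # Run the analysis job options with athena.
    return ["athena", jobo]

def stage_out(local_file: str, se_url: str) -> list[str]:
    # Copy the output back to USERDISK with lcg-cp.
    return ["lcg-cp", f"file://{local_file}", se_url]

cmds = [
    stage_in("dcap://example.org/pnfs/aod/AOD.pool.root"),
    run_job(),
    stage_out("/scratch/run/out.root", "srm://example.org/userdisk/out.root"),
]
for c in cmds:
    print(" ".join(c))
```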
Results
Generic statistics:
- number of submissions: 250 Pathena jobs (repetitions of the same job)
- resulting Panda jobs: 5250
- 250 build jobs
- 5000 run jobs
- finished: 5138
- failed: 112 (63 of which never started due to build job failures)
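A quick consistency check on the counts above (all numbers copied from this section; the build-failure count is implied by dividing the never-started run jobs by the 21 run jobs per submission described earlier):

```python
# Sanity-check the job counts reported above.
build_jobs, run_jobs = 250, 5000
panda_jobs = build_jobs + run_jobs
finished, failed = 5138, 112

assert panda_jobs == 5250               # resulting Panda jobs
assert finished + failed == panda_jobs  # every job either finished or failed

# 63 run jobs never started because their build job failed; with the
# 21 run jobs per submission described above, that is 3 failed builds.
never_started = 63
print(never_started // 21)
```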
CPU Use (kSI2k-seconds)

| Job | AVG | min | Max | Total |
| finished BUILD Jobs | 453.34 | 388 | 933 | 111974 |
| failed BUILD Jobs | 271.00 | 0 | 416 | 816 |
| finished RUN Jobs | 1299.36 | 481 | 3803 | 6676092 |
| failed RUN Jobs | 322.15 | 0 | 1989 | 36081 |
| Total | 1240.90 | 0 | 3803 | 6824960 |
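The rows above are internally consistent: dividing Total by AVG recovers the number of jobs in each row. A minimal sketch of how such a table row is computed from per-job CPU values (the sample values are invented for illustration):

```python
# Compute AVG/min/Max/Total for a list of per-job CPU times, as in the
# table above; the sample values are made up for illustration.
def row_stats(values):
    total = sum(values)
    return {
        "AVG": round(total / len(values), 2),
        "min": min(values),
        "Max": max(values),
        "Total": total,
    }

sample = [388, 453, 933, 500]   # hypothetical kSI2k-seconds
print(row_stats(sample))

# Cross-check from the table: Total / AVG gives the row's job count.
print(round(111974 / 453.34))   # finished BUILD jobs -> 247
```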
CPU types are:
Quad-Core AMD Opteron(tm) Processor 2350 (512 KB cache) and
Dual Core AMD Opteron(tm) Processor 285 (1024 KB cache)
Wall Time Use (seconds)

| Job | AVG | min | Max | Total |
| finished BUILD Jobs | 4045.72 | 2226 | 6673 | 999293 |
| failed BUILD Jobs | 4289.33 | 1 | 6533 | 12868 |
| finished RUN Jobs | 10818.82 | 873 | 767584 | 55576278 |
| failed RUN Jobs | 16042.62 | 1 | 50218 | 770046 |
| Total | 10867.19 | 1 | 767584 | 56346324 |
The 63 failed run jobs that never started are excluded from the wall-time count because of wrong entries in the DB.
Each Pathena job completing successfully reads one dataset with 21 files and produces 3 datasets, one output directory, 21 root files and 21 log files:
- One dataset is used for the input files (DSname_shadow) and has no replicas at the end of the job.
- The other 2 datasets contain the same 42 files (21 root files and 21 log files): DSname and DSname_subXXX.
- Most of the root files are around 63 MB (except the last one of the job).
- Log file sizes vary (and are generally smaller).
- Below are statistics both for 1 successfully completed job (1J) and for the whole sample.
- File sizes are always measured in MB (10^6 bytes) unless otherwise specified.
- Estimated total events written: 66210470 (one per event read, excluding failures).
| File type | AVG | min | Max | Total |
| Root files 1J | 63.0 | 38.4 | 68.8 | 1323.1 |
| LOG files 1J | 0.21 | 0.16 | 0.22 | 4.4 |
| Total 1J | 31.6 | 0.16 | 68.8 | 1327.5 |
| Root files | 63.0 | 0 | 68.9 | 322771.3 |
| LOG files | 0.21 | 0 | 0.57 | 1080.5 |
| Total | 30.7 | 0 | 68.9 | 326171.5 |
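The single-job (1J) rows above are internally consistent: 21 root files plus 21 log files give 42 output files, and the totals add up. A quick check with the numbers copied from the table:

```python
# Check internal consistency of the 1J rows in the table above.
root_total, log_total = 1323.1, 4.4      # MB, from the table
grand_total = root_total + log_total
n_files = 21 + 21                        # 21 root + 21 log files per job

assert abs(grand_total - 1327.5) < 1e-9  # matches "Total 1J"
print(round(grand_total / n_files, 1))   # 31.6 MB average, as in the table
```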
The input dataset is fdr08_run2.0052283.physics_Jet.merge.AOD.o3_f47_m26:
- 21 files
- 37.5 GB
- 270615 events
- it has been read by each job that passed the build phase
- total events read: 66841905
The job is not really a skim: the skim ratio is 100% (all events are written to the output).
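The event numbers above are consistent with the job counts: each of the 247 submissions whose build job succeeded read the full 270615-event dataset, and since the skim ratio is 100% the written events fall short of the read events only by the failed run jobs. A quick arithmetic check (the 247 successful builds are implied by dividing the total events read by the dataset size):

```python
# Event bookkeeping from the numbers above.
events_per_dataset = 270615
successful_builds = 247            # implied: 66841905 / 270615
events_read = successful_builds * events_per_dataset
print(events_read)                 # 66841905, as reported

events_written = 66210470
print(events_read - events_written)  # events lost to failed run jobs
```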
Plot from Charles
A nice plot (from Charles) showing the 5000 run jobs completing:
Conclusion
The jobs caused some trouble in the cluster, especially for the gatekeeper and the NFS server hosting the home directories.
However, it is not possible to check now whether pathena is abusing the GASS cache, since there is no record of the data flow; that has to be checked while the job is running.
These analysis jobs are nothing special compared to others:
- pathena stages the pilot and its auxiliary files using the Globus GASS cache
- the jobs use the movers to copy input and output files
--
MarcoMambelli - 26 Nov 2008