Using Pathena at the MWT2
The following description of pathena is taken from the DAonPanda page:
pathena is a glue script to submit user-defined jobs to distributed analysis systems. It provides a consistent user-interface to Athena users. It does the following tasks:
- archives the user's work directory
- sends the archive to Panda
- extracts job configuration from jobOs
- defines jobs automatically
- submits jobs
It is meant to make the transition from using athena on a local node to using athena on Panda as transparent as possible.
The pathena script is located in PhysicsAnalysis/AnalysisCommon/UserAnalysis/share.
It uses the runAthena script in the same directory to run the jobs on the Panda nodes. It accepts the following command-line parameters:
- inDS : input dataset
- outDS : output dataset
- split : number of subjobs to use (default=??)
- nfiles : number of files to process (default=all)
- site : send jobs to a particular site
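As an illustration of how these parameters combine, the sketch below assembles a pathena command line and prints it without running anything. It assumes that every parameter is passed as `--name value`, the way `--outDS` is used in the BNL procedure on this page. The spellings `--inDS`, `--split`, `--nfiles` and `--site` are copied from the list above and have not been checked against the script itself, and `pathena_cmd` is a hypothetical helper, not part of pathena.

```shell
#!/bin/sh
# Hypothetical helper: print (never execute) a pathena command line.
# Assumption: each parameter listed above is given as "--name value".
# Usage: pathena_cmd JOBOPTIONS OUTDS [INDS] [SPLIT] [NFILES] [SITE]
# Pass "" for an optional argument you want to skip.
pathena_cmd() {
    cmd="pathena $1 --outDS $2"
    if [ -n "${3:-}" ]; then cmd="$cmd --inDS $3"; fi
    if [ -n "${4:-}" ]; then cmd="$cmd --split $4"; fi
    if [ -n "${5:-}" ]; then cmd="$cmd --nfiles $5"; fi
    if [ -n "${6:-}" ]; then cmd="$cmd --site $6"; fi
    printf '%s\n' "$cmd"
}

# Example call: the dataset names and the site name are placeholders.
pathena_cmd jobOptions.py user.YourName.1.test.v1 some.input.dataset 5 "" SOME_SITE
```

Printing the command first makes it easy to review the options before pasting the line into a shell where the pathena environment has been set up.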
For the Data Skimming Service, which is based on a TAG selection, pathena is a candidate engine for
processing the results of the selection.
Tests
BNL
- Task 1: Try running Pathena in an environment where it is reported to work.
- Status: (Jack) I set up the suggested environment on a BNL acas machine. Generation and simulation examples on the DAonPanda page were run. The generation job worked ok. The simulation job failed on a heartbeat.
- Procedure
1. set up the atlas software for release 12.0.4 at BNL and go to your workarea.
- source /afs/usatlas.bnl.gov/scripts/setup_cmt_usatlas.csh 12.0.4 --opt -p AtlasOffline
2. clean up your InstallArea since any packages which show up in your workarea and InstallArea
will be shipped with the job and built on the worker nodes.
3. check out the UserAnalysis package and go to the run directory
4. get the example job options from the web service with wget
5. set up the grid environment (/afs/usatlas.bnl.gov/lcg/lcg-2.7.0/etc/profile.d/grid_env.csh) and
create a grid proxy (grid-proxy-init).
6. source cmt/setup.csh. *Note* that you must do this after creating the grid proxy, because pathena
and the grid tools use different Python installations.
7. try _athena jobOptions.pythia.py_ to check that it works
8. now do _pathena -v jobOptions.pythia.py --outDS user.YourName.1.test.evgen.pool.v1_
9. the output to the screen will contain two PandaIDs: one for the build job and one for the run job.
10. use the [[http://gridui02.usatlas.bnl.gov:25880/server/pandamon/query][Panda page]] to monitor
the job by putting the PandaID in the Job field.
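Both output dataset names used on this page have the form `user.<name>.<rest>`. Assuming that this is a required convention (the page does not say so explicitly), a small check before submitting could look like the sketch below. `check_outds` is a hypothetical helper, not part of pathena.

```shell
#!/bin/sh
# Hypothetical pre-submission check, not part of pathena.
# Assumption: an output dataset name must look like user.<name>.<rest>,
# as both examples on this page do.
check_outds() {
    case "$1" in
        *" "*)       return 1 ;;  # reject names containing spaces
        *..*)        return 1 ;;  # reject empty fields between dots
        user.?*.?*)  return 0 ;;  # user.<name>.<rest>
        *)           return 1 ;;
    esac
}

if check_outds "user.YourName.1.test.evgen.pool.v1"; then
    echo "outDS name looks fine"
else
    echo "outDS name should have the form user.<name>.<rest>" >&2
fi
```

The check only looks at the shape of the name; whether the dataset already exists can only be answered by the submission itself.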
- Task 2: Try to run a pathena job on an existing AOD dataset at BNL.
MWT2
- Issue 1: What is involved in doing this on the MWT2 system?
- Status: Here's how to run DAonPanda Example 1.
0. source sourceme.12.sh
1. cd workarea
2. export CMTPATH=`pwd`:${CMTPATH}
3. export PATHENA_GRID_SETUP_SH=/local/inst/pjs/setup.sh
(4. export CVSROOT=:ext:atlas-sw.cern.ch:/atlascvs)
(5. export CVS_RSH=ssh)
(6. cmt co PhysicsAnalysis/DistributedAnalysis/PandaTools)
7. cd PhysicsAnalysis/DistributedAnalysis/PandaTools/cmt
(8. gmake)
9. source setup.sh
10. cd ../../../../run
11. wget http://cern.ch/tmaeno/jobOptions.pythia.py
12. pathena jobOptions.pythia.py --outDS user.JackCranshaw.123456.aho.evgen.pool.v1
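The steps above can be collected into a small dry-run helper that prints the commands instead of executing them. This is a sketch under one assumption that the page does not state: the steps in parentheses (4, 5, 6 and 8) are taken to be needed only the first time, before PandaTools has been checked out and built. `mwt2_steps` is a hypothetical name.

```shell
#!/bin/sh
# Sketch: print the MWT2 steps listed above as shell commands.
# Nothing is executed; redirect the output to a file to review it first.
# Assumption: the steps in parentheses (4, 5, 6, 8) are only needed the
# first time, when PandaTools has not been checked out and built yet.
# Usage: mwt2_steps FIRST_TIME(yes|no) OUTDS
mwt2_steps() {
    first_time=$1
    out_ds=$2
    pkg=PhysicsAnalysis/DistributedAnalysis/PandaTools
    echo 'source sourceme.12.sh'
    echo 'cd workarea'
    echo 'export CMTPATH=`pwd`:${CMTPATH}'
    echo 'export PATHENA_GRID_SETUP_SH=/local/inst/pjs/setup.sh'
    if [ "$first_time" = yes ]; then
        # One-time checkout of the PandaTools package from CVS.
        echo 'export CVSROOT=:ext:atlas-sw.cern.ch:/atlascvs'
        echo 'export CVS_RSH=ssh'
        echo "cmt co $pkg"
    fi
    echo "cd $pkg/cmt"
    if [ "$first_time" = yes ]; then
        # One-time build of the package.
        echo 'gmake'
    fi
    echo 'source setup.sh'
    echo 'cd ../../../../run'
    echo 'wget http://cern.ch/tmaeno/jobOptions.pythia.py'
    echo "pathena jobOptions.pythia.py --outDS $out_ds"
}

# Example: a repeat run, with a placeholder output dataset name.
mwt2_steps no user.YourName.123456.aho.evgen.pool.v1
```

Because the commands change directory and modify the environment, the printed list is meant to be sourced or pasted into an interactive shell rather than run as a separate script.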
- Task 3
- Task 3.1: Repeat Task 2, but at the MWT2.
- Task 3.2: Repeat 3.1 and have it output some fraction of the input.
- Issue 2: Can we get the Production team to provide us with TAG files that we can use for our test?
- Status: The merge transform had been neither debugged nor run regularly. Various problems with converters and flags have been fixed for 12.0.4. The first TAGs likely to show up will be from the StreamingTest2006.
- Task 4
- Task 4.1: Find or build a TAG file dataset at the MWT2 and repeat Task 3 using this as input.
- Task 4.2: Find or build a TAG file dataset at the MWT2 and repeat Task 3 using this as input with some selection in the job options.
--
JackCranshaw - 14 Aug 2006
--
JerryGieraltowski - 16 Jan 2007