DssPrototype

Developers

Jerry, Jack, Marco

Description

A simple end-to-end skimming service prototype.
  • Create a TAG database, given an AOD.
  • Process to create SkimSelector output.
  • Simple Athena job and wrapper as SkimExtractor prototype.
  • Publish results into LRC.

Dependencies

  • AOD sample
  • TAG schema
  • MySQL database
  • Local DQ2 client tools
  • Athena releases
  • Data services prototype machine

Indentation indicates a dependency.

Initial Testing of Skim Functionality and Pool File Catalog Utilities

On November 27, 2006 Jack, Marco, and Jerry got together to walk through a set of exercises intended to demonstrate that a skim of the TAG database on tier2-05.uchicago.edu could be performed manually and that the resultant output could then be processed correctly by athena. These exercises were also intended to demonstrate and prove the functionality of several PoolFileCatalog utilities.

Initialization

The following steps were executed:
1. Set up the d-cache and atlas environments. The atlas environment chosen was for Release 12.0.3.
source /local/workdir/d-cache.setup.sh
source /share/app/atlas_app/atlas_rel/12.0.3/cmtsite/setup.sh -tag=AtlasOffline,12.0.3
source /share/app/atlas_app/atlas_rel/12.0.3/AtlasOffline/12.0.3/AtlasOfflineRunTime/cmt/setup.sh
Note that the ordering of the source scripts is important. D-cache must be sourced before the atlas setup scripts.

Basic component tests

2. We then ran a connection test on the tier2-05 LRC (UC_VOB) using one of the POOL utilities that comes with the Atlas release.

Execute FClistPFN to list the files referenced by the local replica catalog (lrc) on tier2-05.uchicago.edu
 FClistPFN -u mysqlcatalog_mysql://dq2user:dqpwd@tier2-05.uchicago.edu/localreplicas

The output should be similar to the following:
gsiftp://tier2-d1.uchicago.edu:2811/pnfs/uchicago.edu/data/usatlas/testIdeal_07/testIdeal_07.005711.PythiaB_bbmu6mu4X.recon.AOD.v12000201_tid002968/testIdeal_07.005711.PythiaB_bbmu6mu4X.recon.AOD.v12000201_tid002968._00034.pool.root.1
gsiftp://tier2-d1.uchicago.edu:2811/pnfs/uchicago.edu/data/usatlas/testIdeal_07/testIdeal_07.005711.PythiaB_bbmu6mu4X.recon.AOD.v12000201_tid002968/testIdeal_07.005711.PythiaB_bbmu6mu4X.recon.AOD.v12000201_tid002968._00032.pool.root.2
.
.
gsiftp://tier2-d1.uchicago.edu:2811/pnfs/uchicago.edu/data/usatlas/testIdeal_07/testIdeal_07.005711.PythiaB_bbmu6mu4X.recon.ESD.v12000201_tid002968/testIdeal_07.005711.PythiaB_bbmu6mu4X.recon.ESD.v12000201_tid002968._00041.pool.root.1
gsiftp://tier2-d1.uchicago.edu:2811/pnfs/uchicago.edu/data/usatlas/testIdeal_07/testIdeal_07.005711.PythiaB_bbmu6mu4X.recon.ESD.v12000201_tid002968/testIdeal_07.005711.PythiaB_bbmu6mu4X.recon.ESD.v12000201_tid002968._00022.pool.root.2
.
.
There should be 40 AOD files and 40 ESD files.
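
A quick way to check the 40/40 split is to count the listed PFNs by data type. The sketch below is a hypothetical helper, not part of the POOL utilities. It assumes the FClistPFN output has been captured as text with one PFN per line, and that the data type follows `recon` in the file name, as in the sample lines above.

```python
from collections import Counter

def count_by_type(listing):
    """Count PFNs per data type (e.g. AOD, ESD) in captured FClistPFN output."""
    counts = Counter()
    for line in listing.splitlines():
        pfn = line.strip()
        if not pfn.startswith("gsiftp://"):
            continue  # skip blank lines and anything that is not a PFN
        # Look only at the file name, so the dataset directory is not counted too.
        name = pfn.rsplit("/", 1)[-1]
        parts = name.split(".")
        if "recon" in parts:
            idx = parts.index("recon")
            if idx + 1 < len(parts):
                counts[parts[idx + 1]] += 1  # the field after "recon" is the type
    return counts
```

Applied to the full listing, `count_by_type(text)` should report 40 for both `AOD` and `ESD` if the catalog is complete.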

3. For a variety of reasons we don't want to use the LRC directly for running athena jobs.
  1. Avoid interference with DDM activities.
  2. The PFN's used by the LRC are not intelligible to athena.
  3. The LRC does not set the file type.

So the next step was to create a copy of the LRC that we could edit. The easiest thing was to export it to an xml file using the POOL utilities. We used FCpublish with a query limiting it to files containing the extension root.
FCpublish -d file:/tmp/myPoolFileCatalog.xml -u mysqlcatalog_mysql://dq2user:dqpwd@tier2-05.uchicago.edu/localreplicas -q "pfname like '%.root.%'"
The output file /tmp/myPoolFileCatalog.xml should have a size of 60046 bytes and contain 1053 lines.

4. For our test we can do simple hacks to get things going. All of the files are in dcache, so we just replace the gsiftp protocols with dcache protocols. In the real system, one should be able to look up pfn's and either add or replace protocols based on client capabilities. Also all of the files are POOL files, so we can just do a global replace of the file type. In the future there may be different file types, so this needs a long term solution not used here.

Create a working directory (e.g. dss_prototype_tests) and execute:
cd dss_prototype_tests
mv /tmp/myPoolFileCatalog.xml PoolFileCatalog.xml
The file PoolFileCatalog.xml must be edited so that the protocol of each pfn is listed as "dcache:" instead of "gsiftp:", and so that the "pfn filetype" is set to "ROOT_All" instead of "NULL". You can do this by executing the following vi commands:
vi PoolFileCatalog.xml
:1,$s/gsiftp:\/\/tier2-d1.uchicago.edu:2811/dcache:/g
:1,$s/filetype="NULL"/filetype="ROOT_All"/g
:wq

Steps 3 and 4 can be executed using the script makePoolFileCatalog.
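
The two substitutions of step 4 can also be done non-interactively. The following is a minimal sketch, not the makePoolFileCatalog script itself (its contents are not reproduced here). The function name is hypothetical, and it assumes the catalog fits in memory and that every PFN uses the gsiftp prefix shown above.

```python
# Prefix used by the LRC for every PFN in this test (taken from step 2).
GSIFTP_PREFIX = "gsiftp://tier2-d1.uchicago.edu:2811"

def rewrite_catalog(xml_text):
    """Apply the same two global substitutions as the vi commands in step 4."""
    # Replace the gsiftp door with the dcache: protocol, keeping the /pnfs path.
    xml_text = xml_text.replace(GSIFTP_PREFIX, "dcache:")
    # Declare every file as a ROOT file instead of leaving the type unset.
    xml_text = xml_text.replace('filetype="NULL"', 'filetype="ROOT_All"')
    return xml_text
```

To use it, read PoolFileCatalog.xml into a string, pass it through `rewrite_catalog`, and write the result back. Plain string replacement is enough here because both patterns are fixed text. A long-term solution would look up protocols and file types per file, as noted in step 4.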

5. Before we move on to the TAG's let's see if athena can actually access one of the pfns listed in the generated PoolFileCatalog.xml with the dcache: protocol. We'll run an athena job using this catalog and accessing one of the files by LFN and have it execute AthenaPoolUtilities/EventCount.py
athena -c "In=['LFN:testIdeal_07.005711.PythiaB_bbmu6mu4X.recon.AOD.v12000201_tid002968._00034.pool.root.1']" AthenaPoolUtilities/EventCount.py

The following lines in the output file show that this input pfn has 1000 events:
EventCount           INFO ---------- INPUT FILE SUMMARY ----------
EventCount           INFO Input contained: 1000 events
EventCount           INFO  -- Event Range ( 50700 .. 33199 )

This shows that we have a catalog which allows us to read files resident in dcache. The next step is to look up the file using the references in a TAG file rather than an LFN in the catalog.

6. A set of TAG root files is stored at /local/workdir/testIdeal_07.005711.PythiaB_bbmu6mu4X.recon.TAG.v12000201_tid002968. Each root file contains the collected TAG data from four different AOD root files for a total of 4000 events per TAG root file. Each of these TAG root files can be considered a collection. Now we'll try to have athena read one of these collections using the previously created PoolFileCatalog.xml and count the number of events in the input.
athena -c "In=['/local/workdir/testIdeal_07.005711.PythiaB_bbmu6mu4X.recon.TAG.v12000201_tid002968/testIdeal_07.005711.PythiaB_bbmu6mu4X.recon.TAG.v12000201_tid002968._01-04'];CollType='ExplicitROOT'" AthenaPoolUtilities/EventCount.py

The output should show the following:
HistorySvc           INFO Service finalised successfully
EventCount           INFO ---------- INPUT FILE SUMMARY ----------
EventCount           INFO Input contained: 4000 events
EventCount           INFO  -- Event Range ( 4800 .. 22199 )

7. DEFERRED (skip to step 8). Create a new mysql database on tier2-06.uchicago.edu to hold the PoolFileCatalog.xml data. We would then have a mysql database to use instead of a static PoolFileCatalog.xml file.
mysql -h tier2-06.uchicago.edu -u root -p tagdbadmin1

NOTE: This failed with "access denied". The problem probably stems from the fact that we used the wrong mysql client: the one we used was /usr/bin/mysql, and it probably should have been /opt/mysql-standard-4.1.20-pc-linux-gnu-i686/bin/mysql.
We decided at this point NOT to create the mysql database and to continue using the static PoolFileCatalog.xml file.

Simulate SkimComposer

8. Now that we have checked that we can take a TAG file and access data, let's start at the beginning of our Overview diagram and move through a simple Composer->Selector->Extractor. At this point we don't have much of a SkimComposer, so one can
  1. look directly at the 10 TAG root files in
/local/workdir/testIdeal_07.005711.PythiaB_bbmu6mu4X.recon.TAG.v12000201_tid002968

Execute:
root  /local/workdir/testIdeal_07.005711.PythiaB_bbmu6mu4X.recon.TAG.v12000201_tid002968/testIdeal_07.005711.PythiaB_bbmu6mu4X.recon.TAG.v12000201_tid002968._01-04.root
     root> CollectionTree.Draw("name_of_token_to_display");
Here is the resultant plot executing CollectionTree.Draw("NLooseElectron");

CollectionTree-sample-plot.jpg

  2. look at the mysql database directly.
Execute:
mysql -h tier2-06.uchicago.edu -u tagreader tier2tagdb
      mysql> select count(*) from testIdeal07_005711_TAG_v12000201 where ...

For example, one could open one of the root files and make plots using methods described on TagDBView. This gives the visual analysis needed to decide what sort of cuts make sense. One can then look at the full sample in the mysql database to find the total. We chose a simple cut requiring at least one jet (NJet>0) and at least one electron (NLooseElectron>0). This reduced the event sample from roughly 40k events to 4014 events, i.e. a relatively loose 10% selection.

Note that at these small sizes one could TChain all 10 of the TTree's in the root files and look at the full sample, but that's not a scalable solution.
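
The selection fraction quoted in this step can be cross-checked with a little arithmetic. This sketch assumes exactly 1000 events in each of the 40 AOD files (only one file was actually counted, in step 5), which is where the "roughly 40k" comes from.

```python
AOD_FILES = 40          # AOD files listed in the LRC (step 2)
EVENTS_PER_AOD = 1000   # assumed equal for every file; measured for one file in step 5
SELECTED = 4014         # events passing NJet>0 and NLooseElectron>0

total = AOD_FILES * EVENTS_PER_AOD
fraction = SELECTED / total
print(f"{SELECTED} of {total} events selected ({fraction:.1%})")
```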

Simulate SkimSelector

9. The selection criteria form the pseudo output from the SkimComposer. Now we want to simulate the SkimSelector by extracting a simplified TAG file which can be used as input to athena.

Run the following command for the collection testIdeal07_005711_TAG_v12000201 from the TAG database (tagreader@tier2-06.uchicago.edu). The output destination is the root collection "test.coll", which is written to the file test.coll.root.
CollCopy.exe -src testIdeal07_005711_TAG_v12000201 MySQLltCollection -srcconnect mysql://tagreader@tier2-06.uchicago.edu/tier2tagdb -dst test.coll RootCollection -query "NJet>0&&NLooseElectron>0" -queryopt "SELECT EventNumber,RunNumber"
The query option implements the selection from above. The queryopt option tells it to strip off all of the metadata other than EventNumber and RunNumber. Note that EventNumber and RunNumber are purely for debugging. They are not used by athena in the following steps.

The following output should be seen:
CollCopy: Finished copying input collection(s) `testIdeal07_005711_TAG_v12000201:MySQLltCollection' to output collection(s) `test.coll:RootCollection'

The file test.coll.root should have been created with a size of 48132 bytes.
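
When the extraction of step 9 is repeated with different cuts, it can help to assemble the CollCopy.exe command line programmatically. The sketch below only builds the argument list from the options used in this step. The function name and its parameters are illustrative, and no CollCopy.exe option beyond those shown above is assumed.

```python
import shlex

def collcopy_command(src, connect, dst, query, columns):
    """Build the CollCopy.exe argument list used in step 9."""
    return [
        "CollCopy.exe",
        "-src", src, "MySQLltCollection",
        "-srcconnect", connect,
        "-dst", dst, "RootCollection",
        "-query", query,
        "-queryopt", "SELECT " + ",".join(columns),
    ]

cmd = collcopy_command(
    src="testIdeal07_005711_TAG_v12000201",
    connect="mysql://tagreader@tier2-06.uchicago.edu/tier2tagdb",
    dst="test.coll",
    query="NJet>0&&NLooseElectron>0",
    columns=["EventNumber", "RunNumber"],
)
# shlex.join quotes the query so that a shell does not interpret '>' and '&&'.
print(shlex.join(cmd))
```

The list form can be handed to `subprocess.run(cmd, check=True)` from a shell in which the Athena release has been set up. Passing a list avoids shell quoting problems with the query string.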

Simulate SkimExtractor

10. Copy the local file PoolFileCatalog.xml to the directory /share/data/t2data so that it is visible to all worker nodes. Then use the file "test.coll.root" as input to athena, together with the copy of PoolFileCatalog.xml stored in /share/data/t2data. First we will need to get a local copy of the script EventCount.py and edit it so that the PoolSvc knows to read the file PoolFileCatalog.xml located at /share/data/t2data.
cp PoolFileCatalog.xml /share/data/t2data
get_files AthenaPoolUtilities/EventCount.py
Edit the file EventCount.py and add the following three lines after line 23:
theApp.EvtMax = 20   # Only analyze 20 events in this example
PoolSvc = Service("PoolSvc")
PoolSvc.ReadCatalog = ["file:/share/data/t2data/PoolFileCatalog.xml"]

Jack put a local copy of the file testExtract.py into the directory /share/data/t2data/. We copied the file testExtract.py from /share/data/t2data into our local working directory and executed:
athena -c "In=['test.coll']; CollType='ExplicitROOT'; OutputAODFile='testExtractor.AOD.root'" testExtract.py

The output file testExtractor.AOD.root should have been created with a size of roughly 2283743 bytes.

The command above took 37 sec on tier2-06, so roughly 2 sec/ev.

Validation

11. The simplest way to get plots of the variables that we used to cut on is to recreate a TAG file from the skim file testExtractor.AOD.root:
wget http://twiki.mwt2.org/pub/DataServices/DataServicesMachine/MergeAODwithTags_v12.py.txt

mv MergeAODwithTags_v12.py.txt MergeAODwithTags_v12.py

athena -c "PoolAODInput=['testExtractor.AOD.root']; PoolAODOutput='mytest'; doWriteAOD=False" MergeAODwithTags_v12.py

The command above took 30 sec on tier2-06, so roughly 1.5 sec/ev. For more information on using these job options look at Building TAGs.

The file mytest.TAG.root contains all the tags and can be inspected by making some plots with root. If you haven't already done so, execute the source commands from Step 1.
root mytest.TAG.root

.ls
CollectionTree.Print()
CollectionTree.Draw("NLooseElectron")
CollectionTree.Draw("NJet")
.q
As visible from the plots below, every event has at least one jet and at least one electron (as designed and expected).

12. To validate the process, the same cuts were applied to the source data, and the data were plotted before and after the cuts. Keep in mind that this example processed only 20 valid events (the first 250 events total). The plots show that the distribution is the same. They also show that the important cut (for this sample at least) is on NLooseElectron: if NLooseElectron>0 then NJet>0.

root /local/workdir/testIdeal_07.005711.PythiaB_bbmu6mu4X.recon.TAG.v12000201_tid002968/testIdeal_07.005711.PythiaB_bbmu6mu4X.recon.TAG.v12000201_tid002968._01-04.root

CollectionTree.Draw("NJet", "NJet>0&&NLooseElectron>0", "", 250, 0)
CollectionTree.Draw("NJet", "NLooseElectron>0", "", 250, 0)
CollectionTree.Draw("NJet", "", "", 250, 0)
CollectionTree.Draw("NLooseElectron", "NJet>0&&NLooseElectron>0", "", 250, 0)
CollectionTree.Draw("NLooseElectron", "NJet>0", "", 250, 0)
CollectionTree.Draw("NLooseElectron", "", "", 250, 0)

Next steps

Comments from Jack on November 28, 2006: Once we've converged on the recipe from our notes, I would like to have Ed try to follow it later this week. I think we should also start timing everything we do for initial performance evaluation.

Other milestones we can then move on to.
  • run the single job extraction on a worker node rather than tier2-06.
  • look at the TNT scripts and see what is reasonable to borrow.
  • put the TAG files that Jerry built into a DDM dataset.
  • Try to subscribe to the TAG dataset at BNL.

Running on the Grid

Now we would like to be able to run the same athena job as previously, but on the grid site UC_ATLAS_MWT2. We will also change the testExtract.py file so that the maximum number of events to process is 2000 instead of 20 as previously. Since the 20 event run took about 37 seconds to complete, a run of 2000 events should scale to roughly 60 to 65 minutes.
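
The estimate is a straight linear extrapolation from the short test run, which can be reproduced as follows. The helper name is hypothetical. Note that the 37 seconds include job startup, so a linear scaling may overestimate the per-event cost.

```python
def scale_runtime(measured_seconds, measured_events, target_events):
    """Linearly extrapolate wall-clock time from a short test run."""
    per_event = measured_seconds / measured_events
    return per_event * target_events

# 20 events took about 37 s on tier2-06; extrapolate to 2000 events.
estimate = scale_runtime(37.0, 20, 2000)
print(f"{estimate:.0f} s, about {estimate / 60:.0f} min")
```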

We downloaded the attachment run-on-grid.tar.gz and executed the command tar -xzvf run-on-grid.tar.gz. This created the directory runongrid. In the directory runongrid the file gridjob.sub is the submit file for the Condor jobmanager. We first attempted to run this on the OSG server, tier2-osg.uchicago.edu, but were unsuccessful in getting the job to start running even after 24 hours because of a huge backlog of Panda jobs on this server. Instead, we decided to access the condor scheduler directly on the UC_ATLAS_MWT2 site by simply re-logging in to tier2-06.uchicago.edu and executing:
export PATH=/opt/condor/bin:/opt/condor/sbin:$PATH
This allowed us to submit directly to the UC_ATLAS_MWT2 condor scheduler, putting us at a higher priority than the ATLAS jobs submitted from remote sites.

We then changed to the working directory runongrid and submitted the job directly to the condor scheduler by executing condor_submit gridjob.sub. The progress of the job was monitored by periodically executing condor_q. The job took approximately 1 hour to run on a worker node at the UC_ATLAS_MWT2 site.

The standard output from the job was in the file gridjob.runongrid.out. The standard error from the job was in the file gridjob.runongrid.err. Examining the contents of the file gridjob.runongrid.err showed the following timing results:
Time(s) for job on worker node:
real    60m32.698s
user    21m2.660s
sys     0m23.070s
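
To compare these timings with the earlier per-event figures, the minutes-and-seconds fields can be converted to seconds. This is a small sketch with a hypothetical helper. It assumes the fields have the `NmS.SSSs` form shown above and that the job processed 2000 events.

```python
import re

def parse_time(value):
    """Convert a time field such as '60m32.698s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value)
    if match is None:
        raise ValueError(f"unrecognised time field: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)

real = parse_time("60m32.698s")
user = parse_time("21m2.660s")
EVENTS = 2000  # maximum number of events assumed for the grid run

print(f"real: {real / EVENTS:.2f} s/event")
print(f"user CPU as a fraction of wall time: {user / real:.0%}")
```

This gives about 1.8 seconds of wall-clock time per event, in line with the 20 event test on tier2-06, with user CPU time accounting for roughly a third of the wall-clock time.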

The size of the output file testExtractor.AOD.root was about 240 MB:
-rw-r--r--    1 gfg      gfg      239477995 Dec  6 12:06 testExtractor.AOD.root

Here is the resultant plot executing CollectionTree.Draw("NLooseElectron");

NLooseElectron-2000evts.jpg

Here is the resultant plot executing CollectionTree.Draw("NJet");

NJet-2000evts.jpg

Installation and Use of the TagNavigatorTool for Osg

The ATLAS TagNavigatorTool (TNT) is a utility which is designed to allow ATLAS physicists to use the Tag database for analysis, using the Distributed Data Management (DDM) system in an integrated way. It consists of a number of python scripts which interact with the Tag database, the grid, and DQ2 (the ATLAS DDM implementation). The twiki page TagNavigatorToolOsg describes some preliminary investigations into the use of this tool in an OSG environment. Attempts to duplicate the exercises described previously in this twiki using TNT in the OSG environment on tier2-06.uchicago.edu are described in the TagNavigatorToolOsg twiki.


Major Updates:
-- JackCranshaw - 29 Nov 2006 - updated with comments on section purposes and direction
-- JerryGieraltowski - 28 Nov 2006 - update with exercises done on November 27, 2006
-- RobGardner - 06 Nov 2006
  • plot1: N Electron:
    test0plot1.png

  • plot2: N Jets:
    test0plot2.png

  • N Electron in source data:
    test0plot2s.png

  • N Electron in source data, cut:
    test0plot2sc.png

  • N Electron in source data, double cut:
    test0plot2scc.png

  • N Jets in source data:
    test0plot1s.png

  • N Jets in source data, cut:
    test0plot1sc.png

  • N Jets in source data, double cut:
    test0plot1scc.png
Attachment                      Size   Date                  Who                Comment
CollectionTree-sample-plot      24 K   29 Nov 2006 - 16:24   JerryGieraltowski  sample plot using root
CollectionTree-sample-plot.jpg  24 K   29 Nov 2006 - 16:33   JerryGieraltowski  Sample plot of NLooseElectron from root
NJet-2000evts.jpg               25 K   07 Dec 2006 - 15:52   JerryGieraltowski  NJet-2000evts.jpg
NLooseElectron-2000evts.jpg     26 K   07 Dec 2006 - 15:52   JerryGieraltowski  NLooseElectron-2000evts.jpg
makePoolFileCatalog.txt         1 K    06 Mar 2007 - 16:04   JerryGieraltowski  Make a PoolFileCatalog.xml file
run-on-grid.tar.gz              2 K    07 Dec 2006 - 16:27   JerryGieraltowski  Run-on-grid.tar.gz
test0plot1.png                  7 K    28 Nov 2006 - 20:42   MarcoMambelli      plot1: N Electron
test0plot1s.png                 8 K    05 Dec 2006 - 17:59   MarcoMambelli      N Jets in source data
test0plot1sc.png                8 K    05 Dec 2006 - 18:01   MarcoMambelli      N Jets in source data, cut
test0plot1scc.png               8 K    05 Dec 2006 - 17:58   MarcoMambelli      N Jets in source data, double cut
test0plot2.png                  8 K    28 Nov 2006 - 20:43   MarcoMambelli      plot2: N Jets
test0plot2s.png                 10 K   05 Dec 2006 - 18:01   MarcoMambelli      N Electron in source data
test0plot2sc.png                10 K   05 Dec 2006 - 18:07   MarcoMambelli      N Electron in source data, cut
test0plot2scc.png               10 K   05 Dec 2006 - 18:00   MarcoMambelli      N Electron in source data, double cut
Topic revision: r16 - 06 Mar 2007, JerryGieraltowski