FY08-Q3 Effort Report of Marco Mambelli
This effort report covers the activity of Marco Mambelli from April to June 2008.
Distributed Data Services and DSS
Work continues in the Tier2 Data Services activity; see http://twiki.mwt2.org/bin/view/DataServices/WebHome for details. This activity involves a joint group from the Midwest Tier2 Center and Argonne National Lab (Jack Cranshaw, David Malon, Rob Gardner).
The goal of the activity is to design, prototype and package services that:
- Host or provide access to ATLAS specific database services, such as TAG and possibly conditions (IOV and calibration) databases.
- Provide a dataset skimming service (DSS) for Tier2-resident datasets through either command line or web interfaces.
- Provide users the ability to access the Tier2's DQ2 server.
Much of my effort on this topic has been as part of the Event TAGs project, coordinated by David Malon and including people from ANL, Glasgow and CERN (Jack Cranshaw, Qizhi Zhang, Mike Kenyon, Elizabeth Gallas, Florbela Tique Aires Viegas).
DSS has been used successfully by friendly users to perform skims of FDR data using tags from the TAG DB.
The lack of a working AODtoAOD transformation in ATLAS release 14 is complicating the project.
- The prototype of the Web interface is available at: http://tier2-06.uchicago.edu:8800/dss/
- Further development and integration of skim extraction jobs
- Further development and support of the Data Movement Utilities component (DMU) of DSS to move data reliably between local storage elements and worker nodes, including file registration in DQ2 site services. The DMU component is also in use by Panda.
- Support FDR-1R analysis using Pathena and TAG from the TAG DB during ANL ASC Analysis Workshop (14-16 May 08, 14-16 May 08, completed)
- Demo and examples on FDR-2 analysis using Pathena and TAG from the TAG DB as part of the TAG demo during the ATLAS offline software tutorial at CERN (18 Jul 08, 16-18 Jul 08, completed)
- Planned Functional Delivery of DSS 1.0 (30 Sep 07, 15 Sep 08, delayed/expanded)
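The reliable data movement that the DMU item above refers to can be illustrated with a minimal sketch. This is not the DMU code. It assumes plain local file copies, a SHA-256 checksum for verification, and a fixed retry count; the function names `file_checksum` and `reliable_copy` are hypothetical, and registration in DQ2 site services is not shown.

```python
import hashlib
import shutil
import time
from pathlib import Path


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def reliable_copy(source, destination, max_attempts=3, wait_seconds=1.0):
    """Copy source to destination, verify the checksum, retry on failure.

    Hypothetical helper: returns the verified checksum, or raises
    RuntimeError once max_attempts copies have failed.
    """
    source = Path(source)
    destination = Path(destination)
    expected = file_checksum(source)
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            destination.parent.mkdir(parents=True, exist_ok=True)
            shutil.copyfile(source, destination)
            if file_checksum(destination) == expected:
                return expected
            last_error = "checksum mismatch"
        except OSError as error:
            last_error = str(error)
        if attempt < max_attempts:
            time.sleep(wait_seconds)  # pause before the next attempt
    raise RuntimeError(f"copy failed after {max_attempts} attempts: {last_error}")
```

A call such as `reliable_copy("input.dat", "/scratch/job/input.dat")` (illustrative paths) returns the checksum only after the destination has been verified, so a caller could register the file in a catalog only on success.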
User and Tier3 support
Prepare and document an exemplar installation at MWT2 of the software used by end users to transfer datasets.
Support the existing installation, support sysadmins (UIUC Tier3), and educate/support analysis users in moving files and registering datasets.
This effort showed the need to provide ATLAS scientists with a suite of client programs that lets them access ATLAS data anywhere in the world and download it wherever they need it, including a simple laptop. This has been an integration effort: the software exists (DQ2 and other scripts), but most of the time it works only at the big computing centers (CERN, BNL), or can access only some data (depending on the Grid and on how the data is stored), or requires software that is difficult to install and sometimes differs depending on the data.
The goal is to provide a single, easy-to-install package that gives access to all ATLAS data.
In collaboration with VDT, the integrator of Pacman packages with major Grid software, and with the group developing DQ2 Client, mainly Mario Lassnig, I was able to provide a package with the tools to access the major grids used in ATLAS (LCG, OSG and NorduGrid) and to obtain a DQ2 Client package that depends only on it. For further documentation on WLCG-Client see: http://www.usatlas.bnl.gov/twiki/bin/view/Admins/WlcgClient
The process took additional iterations because the scope grew: wlcg-client, initially designed only to support data movement commands, later became a generic Grid client used for other purposes as well, and the switch of the storage elements to SRMv2 required some changes to compensate for the rigidity of some client tools.
With the support of local system administrators I also provided an exemplar installation at the University of Chicago Tier2 and Tier3, documenting the installation process and how to use the software (see http://twiki.mwt2.org/bin/view/DataServices/InstallingDQ2AtUC).
Data movement Milestones:
- collaboration with VDT to test and release LCG-Util for data transfer interoperability with LCG (Apr 08, , completed)
- development of LFC-min, a package for file catalog interoperability, and release of wlcg-client v0.9 to allow US-ATLAS users to access ATLAS data worldwide (23 May 08, , completed)
- collaboration with Mario Lassnig to produce a DQ2Client release compatible with OSG and wlcg-client, resulting in DQ2Client 0.1.17 (27 Jun 08, , completed in July)
- new version of wlcg-client, v0.10, extended to be a generic user client (23 Jun 08, , completed)
Continued activity on PilotChecker (project started in FY07Q3); see http://twiki.mwt2.org/bin/view/DataServices/PandaSubmitHost for details. This activity emerged in support of my troubleshooting work for Panda production. Troubleshooting needs and site administrators' requests prompted improved versions of the tool. The tool has also been used for official site certification (http://www.usatlas.bnl.gov/twiki/bin/view/Admins/SiteCertificationP1), following the two-step procedure defined in http://www.usatlas.bnl.gov/twiki/bin/view/Admins/PilotCheckerP1.html. This activity includes maintenance of the server and development.
- Support and small UI improvements of PilotChecker v 0.3
Continued activity in support of Panda production, especially pilot submission and pilot troubleshooting on USATLAS sites, plus help supporting ATLAS production at MWT2 and the integration of new Tier3 sites.
Member of the US-ATLAS production shift team (with Yuri Smirnov, Mark Sosebee, Wensheng Deng, Barry Spurlock), part of the world-wide ATLAS production team coordinated by Xavier Espinal and Kaushik De: https://twiki.cern.ch/twiki/bin/view/Atlas/ADCoS
Shifters perform surveillance and troubleshooting of ATLAS production Mon-Sat, alternating with EU and Asian shifters in order to cover a 24h cycle (Asian shifts are not well established yet).
Each shifter covers 6 days per month (three 2-day shifts each month).
Shift duties include:
- monitor production jobs (Panda, ARDA, ...)
- monitor data transfers (FTS, Panda, ARDA, ...)
- monitor ATLAS tasks
- submit tickets to RT, GGUS and Savannah
- report anomalies on the eLog (http://atlas003.uta.edu:8080/ADCoS/)
- troubleshoot problems
- update other shifters about open problems and investigations
- contact experts or site administrators
- report weekly (ADC meeting and Prodsys meeting)
- steer production (activating/deactivating sites depending on problems and planned downtimes)
Some accomplishments for Panda production:
- Final update and phase out of the pilot2 submit host at UTA (May 08, , completed)
- Migration to autopilot of the last production sites that were using the UTA submit host (May-June 08, , completed)
- Suggested an IM solution to ease communication among shifters and set up the atlasshift chat room (for Jabber and Google Talk) (June 08, , completed)
- 25 Jul 2008