FY09-Q1 Effort Report of Marco Mambelli
This effort report covers the activity of Marco Mambelli from October to December 2008.
Distributed Data Services and DSS
Work continues in the Tier2 Data Services activity. This activity involves a joint group from the Midwest Tier2 Center and Argonne National Lab (Jack Cranshaw, David Malon, Rob Gardner).
The goal of the activity is to design, prototype and package services that:
- Host or provide access to ATLAS-specific database services, such as TAG and possibly conditions (IOV and calibration) databases.
- Provide a dataset skimming service (DSS) for Tier2-resident datasets through either command line or web interfaces.
- Give users the ability to access files that are stored on the Tier2 SE and registered in the file catalog.
Skim production on the Grid and TAG-based selection for analysis have both been used successfully by friendly users working on FDR data, especially during analysis Jamborees. This involved browsing the TAG DB using ELSSI, selecting the desired events using constraints on streams, run numbers, luminosity, triggers and physics attributes, extracting a ROOT collection, and following procedures to perform the desired operation on the Grid using Pathena.
Package software to access data in US-ATLAS (WLCG-Client) and provide and update user documentation like:
The planned functional delivery of DSS includes the installation of a server that allows users to browse TAG information (event-level metadata), perform selections on the different attributes in the TAGs, and use the selected events for skims or distributed analysis on the Grid, all integrated on the server. It further includes packaging and documenting the installation and setup so that the server can be replicated easily.
DSS Milestones:
- Further development and integration of skim extraction jobs
- Demo and examples of FDR-2c analysis using Pathena and TAGs from the TAG DB, as part of the TAG demo during the ATLAS offline software tutorial (Dec 08, Dec 08, completed)
- Planned Functional Delivery of DSS 1.0 (30 Nov 07, 28 Feb 09, delayed/expanded)
Related to data movement are the following roles:
Data movement Milestones:
- Facility-wide client throughput performance report: Sep 30
- Comprehensive client (WLCG-Client+custom Python) for 64bit OSes: Oct 15
- Further development and support of the Data Movement Utilities component (DMU) of DSS to move data reliably between local storage elements and worker nodes, including file registration in DQ2 site services. The DMU component is also in use by Panda.
- Coding of the pilot part of the Local Site Mover (3 Dec 08, 3 Dec 08, completed)
- Support for the development and testing of the Local Site Mover scripts at NET2 (15 Dec 08, 15 Dec 08, completed)
User and Tier3 support
Prepare and document an exemplar installation at MWT2 of the software used by end users to transfer datasets (with the support of Charles Waldman, sysadmin, http://twiki.mwt2.org/bin/view/DataServices/InstallingDQ2AtUC).
Support the existing installation, support sysadmins, collaborate with the Tier3 task force (Chip Brock), and educate and support analysis users in moving files and registering datasets. Support the integration of new Tier3s.
Data movement Milestones:
- Presentation and user support at the ANL Analysis Jamboree (9-12 Dec 08, 9-12 Dec 08, completed)
Panda Troubleshooting
Continued activity in support of Panda production, especially pilot submission and pilot troubleshooting on US-ATLAS sites.
Member of the US-ATLAS production shift team (with Yuri Smirnov, Mark Sosebee, Wensheng Deng, Barry Spurlock), part of the world-wide ATLAS production team coordinated by Xavier Espinal and Kaushik De:
https://twiki.cern.ch/twiki/bin/view/Atlas/ADCoS
Shifters perform surveillance and troubleshooting of ATLAS production Mon-Sat, alternating with EU and Asian shifters in order to cover a 24h cycle.
Each shifter covers 6 days per month (three 2-day shifts).
Shift duties include:
- monitor production jobs (Panda, ARDA, ...)
- monitor data transfers (FTS, Panda, ARDA, ...)
- monitor ATLAS tasks
- submit tickets to RT, GGUS and Savannah
- report anomalies on the eLog (http://atlas003.uta.edu:8080/ADCoS/)
- troubleshoot problems
- update other shifters about open problems and investigations
- contact experts or site administrators
- report weekly (ADC meeting and Prodsys meeting)
- steer production (activating/testing/deactivating sites depending on problems and planned downtimes)
Shifts are recorded in the US Calendar (http://grid.uchicago.edu/cgi/plans.cgi?cal_id=1) and in ATLAS OTP:
https://pptevm.cern.ch/mao/client/cern.ppt.mao.app.gwt.MaoClient/MaoClient.html#Ma0_Task_panel(N88)
Some accomplishments for Panda production:
- Support for autopilot: helped troubleshoot pilot and submit-host problems
- Contributed to developing and documenting the procedure to test and reactivate sites after failures, including a modification of the test jobs to speed up the testing procedure.
- Helped find and set the correct site parameters to allow jobs to use event TAGs or back navigation
--
MarcoMambelli - 07 Jan 2009