Attending
- Marco, Jack, Rob, Tom, David, Paul
- Next meeting - invite Dan.
Pathena
- Jack ran pathena at Brookhaven; general checkout.
- Pathena is a simple script in the user-analysis package that sends out the job to Panda.
- Instructions are given for CERN and BNL. AFS dependence.
- Jack will keep exploring pathena:
- check if Pathena can be run at the Tier2 without making use of BNL (directly or AFS)
- work on running Pathena on a full dataset (instead of the samples used in the example)
- TODO
- run on an AOD not on a sample (at BNL)
- can we make pathena available at the Tier2, or is there something special at BNL and CERN?
TAG datasets
- The listed csc11 dataset (the only merged dataset with TAGs known to be correct) is in NorduGrid and DQ2 is not able to get it
- Jack called Ian to submit new merge jobs (they produce three datasets: a TAG dataset, logs, and a merged AOD dataset) - we should subscribe to both TAG and AOD.
- The production of the merged files failed (the results contained 10000 events instead of the 50000 expected). The transformation has to be fixed. Ian is on vacation this week, so no progress is expected from there.
- In the Panda DB there are 45 tasks producing datasets with TAGs, mostly empty right now.
- Should we subscribe to these today? Getting some samples could help in debugging.
- subscriptions to csc11 did not work because the datasets are in NorduGrid
- Marco failed to transfer them manually for the same reason. He is troubleshooting the problem with Oxana in order to transfer files from NG.
Tier2-06 - Data service prototype
- SRM client (v1 and v2) has been installed by Marco on tier2-06 - the data service prototype machine.
- Marco started a twiki page for DS prototype machine (being modified by Robert)
- Possible databases that are part of the Tier2 Data Service:
- condition, intervals of validity, calibration (calibration is part of condition and AOD is part of condition)
- These DBs have to be estimated (size, performance requirements) and tested
- Sasha at CERN next week will try to get a working set to populate a condition DB
- Both OSG and NorduGrid have big Tier2 centers; LCG has a different model:
- these Tier2s are good candidates to have MySQL replicas of the Oracle POOL Condition DB
- we should compare plans and experiences with NG
- The High Level Trigger group from SLAC is developing DB-proxy: a technology to have in-memory replicas of the Oracle DB throughout the HLT farm for rapid startup. The technology works only for MySQL. Currently the Oracle DB is replicated by hand (from the Oracle source) and the MySQL replica is replicated further.
- Richard Hawking should have some test scripts that are redoing the COOL DB
- Robert asked: what exactly is the DS machine providing? The twiki page DataServicesMachine should document that (besides documenting the specific installation on Tier2-06)
- a machine hosting the DBs
- a machine hosting the skimming service (allowing to select specific events and providing the selection)
- may host a dataset cache
- will use local CE and SE to perform skimming and host results
- provide failover mechanisms: if the Tier2 DB is down it will provide access to the one at Tier1/0
- Make sure the POOL utilities are built and in place to connect to the database, and that the database is set up properly. Will document in the twiki - Jack
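A minimal sketch of the failover idea listed above (fall back to the Tier1/0 database when the Tier2 one is down). Everything here is an assumption made for illustration: the endpoint names, the `connect_with_failover` helper and the `fake_connect` stand-in are hypothetical and are not part of POOL, COOL or any existing Data Service code.

```python
from typing import Callable, Sequence, Tuple, TypeVar

T = TypeVar("T")


def connect_with_failover(
    endpoints: Sequence[str],
    connect: Callable[[str], T],
) -> Tuple[str, T]:
    """Try each endpoint in order and return the first one that connects."""
    errors = []
    for endpoint in endpoints:
        try:
            return endpoint, connect(endpoint)
        except Exception as exc:  # any failure moves on to the next replica
            errors.append(f"{endpoint}: {exc}")
    raise ConnectionError("all endpoints failed: " + "; ".join(errors))


def fake_connect(endpoint: str) -> str:
    # Stand-in for a real DB driver call; pretends the Tier2 replica is down.
    if endpoint.startswith("tier2"):
        raise OSError("replica unreachable")
    return f"connection to {endpoint}"


if __name__ == "__main__":
    # Hypothetical endpoint names, ordered by preference (local replica first).
    used, conn = connect_with_failover(
        ["tier2-db", "tier1-db", "tier0-db"], fake_connect
    )
    print(used, "->", conn)
```

In a real setup `connect` would wrap whatever client the DS machine ends up using; the ordering of the list is the only policy encoded here.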
Other
Sasha: How will user software be distributed? This is a problem to consider, and the Data Service prototype may help with it too.
Next meeting probably 8/22 at 3pm Chicago (1pm Seattle)
Discuss dCache and local access to the files with Dan: dccp?
Datasets
--
MarcoMambelli - 15 Aug 2006