Luc provided information on the drivers (cx_Oracle and the Oracle library) and an example of the Tier0 API
Gancho/Florbela created account: atlas_t0_w1/XXX
Jack provided list of useful Datasets
Meeting Friday 10/12 to clarify doubts/questions
Notes from a manual test follow
Dataset subscription
Subscription of one of the datasets recommended by Jack
A machine with AFS access is required because this step interacts with an LCG site
[marco@tier2-u2 marco]$ source /share/app/dq2/
LCG_LOCATION not defined. Copies to/from LCG sites may not work.
You may consider setting up the LCG environment by sourcing one of the 2 following lines:
source /afs/
source /afs/
NB You need AFS access (e.g. from tier2-u2)
[marco@tier2-u2 marco]$ source /afs/
[marco@tier2-u2 marco]$ dq2-list-subscription trig1_misal1_mc12.017901.PythiaB_Bd_KstarMuMu_Signal_F.merge.TAG.v12000605
[marco@tier2-u2 marco]$ grid-proxy-init
Your identity: /DC=org/DC=doegrids/OU=People/CN=Marco Mambelli 325802
Enter GRID pass phrase for this identity:
Creating proxy ................................................................. Done
Your proxy is valid until: Fri Oct 12 21:05:59 2007
[marco@tier2-u2 marco]$ dq2-register-subscription trig1_misal1_mc12.017901.PythiaB_Bd_KstarMuMu_Signal_F.merge.TAG.v12000605 CERNCAF
Dataset trig1_misal1_mc12.017901.PythiaB_Bd_KstarMuMu_Signal_F.merge.TAG.v12000605 subscribed (archived: 0) to CERNCAF.
[marco@tier2-u2 marco]$ dq2-list-subscription trig1_misal1_mc12.017901.PythiaB_Bd_KstarMuMu_Signal_F.merge.TAG.v12000605
Finding files in a SE
$ dq2_ls -vfp -s CERNCAF trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605
get file info from
trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605 Total: 2 - Local: 2
Some notes about
- TTL of the proxy has to be > 3 hours
- specify the SE if you want the file list, else
will consider only the local SE (tell you if there are files in the local SE)
For the registration you need information that should be in the file catalog. I don't know of a way to query the LFC (with an LRC a direct query to the DB would do, but there is no public API there either)
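The TTL requirement can be checked before calling the DQ2 tools. This is a minimal sketch, assuming the `timeleft : H:MM:SS` line format that grid-proxy-info prints in the transcripts on this page. The function names are hypothetical, and the code targets a current Python 3 rather than the Python 2.2 used in the interpreter sessions.

```python
import re

# Threshold taken from the note above: the proxy TTL has to be > 3 hours.
MIN_TTL_SECONDS = 3 * 3600


def parse_timeleft(proxy_info_text):
    """Return the remaining proxy lifetime in seconds, or None if absent."""
    match = re.search(r"^\s*timeleft\s*:\s*(\d+):(\d{1,2}):(\d{1,2})\s*$",
                      proxy_info_text, re.MULTILINE)
    if match is None:
        return None
    hours, minutes, seconds = (int(group) for group in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def proxy_long_enough(proxy_info_text, minimum=MIN_TTL_SECONDS):
    """True only if a timeleft line exists and exceeds the minimum."""
    left = parse_timeleft(proxy_info_text)
    return left is not None and left > minimum
```

The text passed in would be the captured output of grid-proxy-info. A proxy with 2:30:45 left fails this check, one with 11:40:02 left passes it.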
To copy the files and get file information:
Some notes about
Full details
[marco@tier2-u2 marco]$ dq2_ls -vfp trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605
trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605 Total: 2 - Local: 0
[marco@tier2-u2 marco]$ dq2_ls -vfp -s CERNCAF trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605
get file info from
..send2nsd: NS002 - send error : No valid credential found
trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605 Total: 2 - Local: 0
[marco@tier2-u2 marco]$ grid-proxy-info
subject : /DC=org/DC=doegrids/OU=People/CN=Marco Mambelli 325802/CN=proxy
issuer : /DC=org/DC=doegrids/OU=People/CN=Marco Mambelli 325802
identity : /DC=org/DC=doegrids/OU=People/CN=Marco Mambelli 325802
type : full legacy globus proxy
strength : 512 bits
path : /tmp/x509up_u1195
timeleft : 2:30:45
[marco@tier2-u2 marco]$ grid-proxy-init
Your identity: /DC=org/DC=doegrids/OU=People/CN=Marco Mambelli 325802
Enter GRID pass phrase for this identity:
Creating proxy ........................................... Done
Your proxy is valid until: Sat Oct 13 06:35:23 2007
[marco@tier2-u2 marco]$ dq2_ls -vfp -s CERNCAF trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605
get file info from
trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605 Total: 2 - Local: 2
[marco@tier2-u2 marco]$ dq2_ls -vf -s CERNCAF trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605
get file info from
trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605 Total: 2 - Local: 2
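For scripting, the summary line printed by dq2_ls can be parsed. This is a sketch only, assuming the `<dataset> Total: N - Local: M` layout seen in the output above, and assuming that "Local" counts the files found at the SE being queried (my reading of the transcript, not confirmed). The helper names are hypothetical.

```python
import re

# Matches lines such as "<dataset> Total: 2 - Local: 2".
SUMMARY_RE = re.compile(
    r"^(?P<dataset>\S+)\s+Total:\s*(?P<total>\d+)\s*-\s*Local:\s*(?P<local>\d+)\s*$"
)


def parse_dq2_ls_summary(line):
    """Return (dataset, total, local) or None if the line is not a summary."""
    match = SUMMARY_RE.match(line.strip())
    if match is None:
        return None
    return match.group("dataset"), int(match.group("total")), int(match.group("local"))


def all_files_found(line):
    """True if the summary reports at least one file and none missing."""
    parsed = parse_dq2_ls_summary(line)
    if parsed is None:
        return False
    _, total, local = parsed
    return total > 0 and total == local
```

With the lines above, the "Total: 2 - Local: 2" case counts as found and the "Total: 2 - Local: 0" case (expired proxy, or no -s option) does not.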
Getting the file characteristics is more difficult.
Getting the files failed:
[marco@tier2-u2 dataset]$ dq2_get -vr -d /home/marco/dataset trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605
2 files are missing in the local SE
Some files are missing in the local storage
They are in the following sites;
0 : scan all sites
1 : CERN - Incomplete Replica
2 : SARA - Incomplete Replica
Which site to retrieve them from ? [0-2] : 1
get SURLs from
srmcp -retry_num=1 -streams_num=10 -x509_user_trusted_certificates=/afs/ srm:// file:////home/marco/dataset/trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605_tid010206._00002.pool.root.1
srmcp -retry_num=1 -streams_num=10 -x509_user_trusted_certificates=/afs/ srm:// file:////home/marco/dataset/trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605_tid010206._00001.pool.root.1
Total:2 - Failed:0
[marco@tier2-u2 dataset]$ ls -al
total 8
drwxrwxr-x 2 marco marco 6 Oct 12 18:46 .
drwxr-xr-x 36 marco marco 4096 Oct 12 18:46 ..
[marco@tier2-u2 dataset]$ . /share/app/wn_client/
[marco@tier2-u2 dataset]$ which srmcp
[marco@tier2-u2 dataset]$ which java
[marco@tier2-u2 dataset]$ dq2_get -vr -d /home/marco/dataset trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605
2 files are missing in the local SE
Some files are missing in the local storage
They are in the following sites;
0 : scan all sites
1 : CERN - Incomplete Replica
2 : SARA - Incomplete Replica
Which site to retrieve them from ? [0-2] : 1
get SURLs from
..send2nsd: NS002 - send error : No valid credential found
WARNING : Replica is incomplete
trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605_tid010206._00001.pool.root.1 is not found
trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605_tid010206._00002.pool.root.1 is not found
Total:0 - Failed:0
[marco@tier2-u2 dataset]$ dq2_ls trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605
[marco@tier2-u2 dataset]$ dq2_get -vr -d /home/marco/dataset -s CERNCAF trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605
2 files are missing in the local SE
get SURLs from
..send2nsd: NS002 - send error : No valid credential found
WARNING : Replica is incomplete
trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605_tid010206._00001.pool.root.1 is not found
trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605_tid010206._00002.pool.root.1 is not found
Total:0 - Failed:0
[marco@tier2-u2 dataset]$ dq2_get -vr -g -d /home/marco/dataset -s CERNCAF trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605
2 files are missing in the local SE
get SURLs from
..send2nsd: NS002 - send error : No valid credential found
WARNING : Replica is incomplete
trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605_tid010206._00001.pool.root.1 is not found
trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605_tid010206._00002.pool.root.1 is not found
Total:0 - Failed:0
[marco@tier2-u2 dataset]$ grid-proxy-info
subject : /DC=org/DC=doegrids/OU=People/CN=Marco Mambelli 325802/CN=proxy
issuer : /DC=org/DC=doegrids/OU=People/CN=Marco Mambelli 325802
identity : /DC=org/DC=doegrids/OU=People/CN=Marco Mambelli 325802
type : full legacy globus proxy
strength : 512 bits
path : /tmp/x509up_u1195
timeleft : 11:40:02
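In the first dq2_get attempt above the tool reported "Total:2 - Failed:0" while the destination directory stayed empty, so the summary alone is not a reliable success signal. A small sketch of an independent check on the destination directory; the function name is hypothetical and treating empty files as missing is my assumption.

```python
import os


def missing_after_copy(dest_dir, expected_names):
    """Return the expected file names that are absent or empty in dest_dir."""
    missing = []
    for name in expected_names:
        path = os.path.join(dest_dir, name)
        # A zero-length file is treated as a failed copy (assumption).
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(name)
    return missing
```

Called with /home/marco/dataset and the two expected file names, it would return both names for the empty directory shown by `ls -al` above.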
From Panda monitoring you get only the GUID:
Dataset loading on Tier0
Python Oracle driver installed on /local/inst/oracle:
pacman -get $PROD:cx_Oracle-4.3.1
Version number from python -c "import cx_Oracle;print cx_Oracle.version": 4.3.1
Versions may be different, check in the directory
Workspace /local/work/tier0test
API documentation (from Eowyn source)
The meaning of the parameters is not always clear
def __init__(self, _logmgr=None, _dbaccount=None, _dbpw=None, _dbname=None, _dbhost=None) :
def insertDataset(self,datasetname,type,writer1,reader1, partnr, ddm='NONE',t1='NONE',sfo='UNKNOWN',maxsfo=1, atomic=True) :
# returns ecode
def updateDataset(self,dsname,partnr=None,writer1=None,reader1=None, ddm=None, t1=None, ami=None, atomic=True) :
# returns ecode
def insertFiles(self,files) :
# returns ecode
# files = [(dsname,lfn,size,guid,md5,state,pfn,nrevents)]
def setFilesState(self,files,newstate) :
# returns ecode
# files = [(dsname,lfn)]
def getFiles(self,dataset=None) :
# returns ecode, [(filename,filesize,guid,md5,pfn)]
def getDatasets(self,writer1=None,reader1=None,type=None,ddm=None) :
# returns ecode, [(datasetname,type,writer1,reader1,ddm,t1)]
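The file tuples for insertFiles are positional, so it is easy to swap fields. A sketch of a helper that names the fields in the order given in the comment above, `(dsname,lfn,size,guid,md5,state,pfn,nrevents)`. `FileRecord` and `make_file_record` are hypothetical names, not part of Eowyn, and the code is Python 3.

```python
from collections import namedtuple

# Field order copied from the insertFiles comment in the API listing.
FileRecord = namedtuple(
    "FileRecord", "dsname lfn size guid md5 state pfn nrevents"
)


def make_file_record(**fields):
    """Build a plain tuple in insertFiles order from keyword arguments.

    Raises TypeError if a field is missing or misspelled.
    """
    return tuple(FileRecord(**fields))
```

The resulting tuples would be collected in a list and passed to insertFiles; that call itself is not exercised here.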
Test 1: Manual loading
Try to load one of the datasets that are complete according to Jack:
[marco@tier2-06 tier0prod]$ python
Python 2.2.3 (#1, Jun 14 2007, 21:13:53)
[GCC 3.2.3 20030502 (Red Hat Linux 3.2.3-58)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from Eowyn import ProdDBProxyOracle
Protocol version OK
>>> pdb=ProdDBProxyOracle.ProdDBProxyOracle(None,'atlas_t0','XXX', 'atlas_t0_w1')
>>> pdb.connect()
>>>> PDBORA:connect:0 trying to connect to prodDB (pdb v1.128 sup v? pro v1.34)
>>>> PDBORA:connect:0 ERROR: connection to prodDB failed cx_Oracle.DatabaseError ORA-12154: TNS:could not resolve the connect identifier specified
>>> pdb=ProdDBProxyOracle.ProdDBProxyOracle(None,'atlas_t0_w1','XXX', 'atlas_t0')
>>> pdb.connect()
>>>> PDBORA:connect:0 trying to connect to prodDB (pdb v1.128 sup v? pro v1.34)
>>>> PDBORA:connect:0 connected to prodDB
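The first connection attempt failed because account and DB name were swapped in the positional arguments. A sketch of a way to avoid that, assuming the constructor accepts the keyword names from the `__init__` signature listed in the API documentation (`_logmgr`, `_dbaccount`, `_dbpw`, `_dbname`); I have not tested the keyword form against Eowyn. `proddb_kwargs` is a hypothetical helper.

```python
def proddb_kwargs(account, password, dbname):
    """Map explicit names onto the constructor keywords from the API listing."""
    return {
        "_logmgr": None,
        "_dbaccount": account,   # e.g. 'atlas_t0_w1'
        "_dbpw": password,
        "_dbname": dbname,       # e.g. 'atlas_t0'
    }


# Intended use (requires Eowyn, not run here):
#   kwargs = proddb_kwargs('atlas_t0_w1', 'XXX', 'atlas_t0')
#   pdb = ProdDBProxyOracle.ProdDBProxyOracle(**kwargs)
```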
Inserting the dataset following Luc's example. I don't know the meaning of many of the parameters.
- Are reader and writer statuses (FILLING, NONE)?
- Is TAGMC a correct type? Is there a list of valid types?
- Should I specify the ddm since I know that the files are at CERNCAF?
>>> pdb.insertDataset('trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605', 'TAGMC','FILLING','NONE',0)
>>>> PDBORA:insertDataset:9 CALLED with ('trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605', 'TAGMC', 'FILLING', 'NONE', 0, 'NONE', 'NONE')
>>>> PDBORA:updateSQL:10 --> executing INSERT INTO DATASET (datasetname,type,writer1,reader1,partnr,ddm,t1,sfo,maxsfo) VALUES ('trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605','TAGMC','FILLING','NONE',0,'NONE','NONE','UNKNOWN',1)
>>>> PDBORA:updateSQL:10 --> varmap {}
>>>> PDBORA:updateSQL:10 SQL update took : 226 ms
>>>> PDBORA:insertDataset:5 INSERT dataset took 226ms
>>>> PDBORA:commit:10 committing prodDB ...
>>>> PDBORA:commit:10 committing prodDB done
>>>> PDBORA:insertDataset:9 RETURNS ok
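To see which parameter each positional value lands on, the call can be bound against a stand-in that only copies the insertDataset signature from the API documentation. This is a sketch: the stand-in does nothing and is not the Eowyn method, and `explain_call` is a hypothetical helper (Python 3).

```python
import inspect


def insertDataset(self, datasetname, type, writer1, reader1, partnr,
                  ddm='NONE', t1='NONE', sfo='UNKNOWN', maxsfo=1, atomic=True):
    """Stand-in with the signature from the API listing; it does nothing."""


def explain_call(*args, **kwargs):
    """Return a dict mapping parameter names to the values a call would use."""
    bound = inspect.signature(insertDataset).bind(None, *args, **kwargs)
    bound.apply_defaults()          # fill in ddm, t1, sfo, maxsfo, atomic
    named = dict(bound.arguments)
    del named["self"]
    return named
```

For the call above this gives writer1='FILLING', reader1='NONE', partnr=0 and the defaults ddm='NONE', t1='NONE', sfo='UNKNOWN', maxsfo=1, which agrees with the VALUES list in the logged INSERT statement.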
File insertion was not executed because I was unable to retrieve the file size and MD5 sum.
- Why are file size and md5 sum necessary? Shouldn't the catalog have them?
- What is the last parameter?
- Why status NONE?
>>> pdb.insertFiles([('trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605', 'trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605_tid010206._00001.pool.root.1',size,'33c8a02f-fd2d-4738-bd8a-9cc64e7e67c9',md5,'NONE', '/castor/', 100)])
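If a local copy of a file were available (the dq2_get attempts above did not produce one), size and MD5 could be computed directly. A minimal sketch in Python 3; `size_and_md5` is a hypothetical name, and whether Tier0 expects the hex digest form is an assumption I have not verified.

```python
import hashlib
import os


def size_and_md5(path, chunk_size=1024 * 1024):
    """Return (size_in_bytes, md5_hex_digest) for a local file."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large ROOT files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return os.path.getsize(path), digest.hexdigest()
```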
Luc suggested adding the files anyway, even without MD5 and file size, since in Tier0 these are put in the DB but not really used so far. (I had stopped because in my Panda experience they are essential.) I went ahead, added 2 files and closed the dataset in Tier0.
- Had to reopen DB connection
- These values are incorrect (but necessary to get us going for now): size=0 and md5=''
- Keep in mind that the dataset is open (files may be added in the future)
>>> pdb.connect()
>>>> PDBORA:connect:0 trying to connect to prodDB (pdb v1.128 sup v? pro v1.34)
>>>> PDBORA:connect:0 connected to prodDB
>>> size=0
>>> md5=''
>>> pdb.insertFiles([('trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605', 'trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605_tid010206._00001.pool.root.1',size,'33c8a02f-fd2d-4738-bd8a-9cc64e7e67c9',md5,'NONE', '/castor/', 100)])
>>>> PDBORA:insertFiles:9 CALLED with [('trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605', 'trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605_tid010206._00001.pool.root.1', 0, '33c8a02f-fd2d-4738-bd8a-9cc64e7e67c9', '', 'NONE', '/castor/', 100)]
>>>> PDBORA:updateSQL:10 --> executing INSERT INTO tomfile (datasetname,filename,filesize,guid,md5,state,pfn,nrevents) VALUES ('trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605','trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605_tid010206._00001.pool.root.1',0,'33c8a02f-fd2d-4738-bd8a-9cc64e7e67c9','','NONE','/castor/',100)
>>>> PDBORA:updateSQL:10 --> varmap {}
>>>> PDBORA:updateSQL:10 SQL update took : 347 ms
>>>> PDBORA:insertFiles:5 INSERT one file took 347ms
>>>> PDBORA:commit:10 committing prodDB ...
>>>> PDBORA:commit:10 committing prodDB done
>>>> PDBORA:insertFiles:9 RETURNS ok
>>> pdb.insertFiles([('trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605', 'trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605_tid010206._00002.pool.root.1', size,'b43354c2-3d4f-43ec-ad53-7755dba8e3b4', '', 'NONE', '/castor/', 100)])
>>>> PDBORA:insertFiles:9 CALLED with [('trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605', 'trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605_tid010206._00002.pool.root.1', 0, 'b43354c2-3d4f-43ec-ad53-7755dba8e3b4', '', 'NONE', '/castor/', 100)]
>>>> PDBORA:updateSQL:10 --> executing INSERT INTO tomfile (datasetname,filename,filesize,guid,md5,state,pfn,nrevents) VALUES ('trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605','trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605_tid010206._00002.pool.root.1',0,'b43354c2-3d4f-43ec-ad53-7755dba8e3b4','','NONE','/castor/',100)
>>>> PDBORA:updateSQL:10 --> varmap {}
>>>> PDBORA:updateSQL:10 SQL update took : 196 ms
>>>> PDBORA:insertFiles:5 INSERT one file took 196ms
>>>> PDBORA:commit:10 committing prodDB ...
>>>> PDBORA:commit:10 committing prodDB done
>>>> PDBORA:insertFiles:9 RETURNS ok
>>> pdb.updateDataset('foobar.TAG',writer1='FILLED')
>>>> PDBORA:updateDataset:9 CALLED with ('foobar.TAG', None, 'FILLED', None)
>>>> PDBORA:updateSQL:10 --> executing UPDATE DATASET SET writer1 = :w1 WHERE datasetname = :ds
>>>> PDBORA:updateSQL:10 --> varmap {'ds': 'foobar.TAG', 'w1': 'FILLED'}
>>>> PDBORA:updateSQL:10 SQL update took : 4411 ms
>>>> PDBORA:updateDataset:5 UPDATE dataset took 4411ms
>>>> PDBORA:commit:10 committing prodDB ...
>>>> PDBORA:commit:10 committing prodDB done
>>>> PDBORA:updateDataset:9 RETURNS ok
>>> pdb.close()
>>>> PDBORA:close:0 Trying to close connection to prodDB
>>>> PDBORA:close:0 Connection to prodDB closed
End result of Test 1: Manual loading
Dataset trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605 with its current 2 files is loaded in the
Tier0DB and set to FILLED (ready for action)
- Dataset trig1_misal1_mc12.006352.AcerMC_Zbb_2e.merge.TAG.v12000605 is declared open in DQ2: files may be added to it in the future
- [file] size and md5 contain bogus values (size=0 and md5='')
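Since the placeholder size and md5 will need correcting later, it may help to be able to list the affected files. A sketch, assuming records in the insertFiles tuple order `(dsname,lfn,size,guid,md5,state,pfn,nrevents)` and assuming that size 0 or an empty md5 always means "placeholder". The function names are hypothetical.

```python
def has_placeholder_values(record):
    """True if the record carries the stand-in size or md5 used in this test."""
    size, md5 = record[2], record[4]
    return size == 0 or md5 == ""


def records_to_fix(records):
    """Return the LFNs of records whose size or md5 still needs real values."""
    return [record[1] for record in records if has_placeholder_values(record)]
```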
MarcoMambelli - 12 Oct 2007