Transfer speed test

Introduction

This is a ballpark test of the transfer speed, obtained by manually timing (with time) how long dq2-get takes to copy the files.

What I'm using:
  • du (du -sb) to evaluate the size of the transferred files (during the test I actually used du -sh)
  • Units: MB = 10^6 bytes, MiB = 1024^2 bytes, Mib = 1024^2 bits; rates below are written MB/s, Mbs (10^6 bits/s) and Mibs (1024^2 bits/s). Sometimes MiB/s is abbreviated M/s, but that is ambiguous.
  • time around the command to measure the time
  • a simple script to grab status snapshots:
#!/bin/sh
# Grab a status snapshot of a running transfer.
# Arguments: $1 = user running the transfer, $2 = destination directory.
uptime
finger
ps -flu "$1" --forest --cols=500
ls -al "$2"
echo "Current size:"
du -sh "$2"

Test 1: failed test

time dq2-get -L UCT3 -s MWT2_UC fdr08_run2.0052290.physics_Express.daq.RAW.o2 >& ../0052290_Transfer080805.log &

This test took 23 min but copied no files:

In this test dq2-get chose lcg-cp as the copy command, which failed because the URLs are not complete. Here are the alternative invocations, all failing:
[uct3-edge5] /ecache/marco/test_dq2-get/timing >  lcg-cp -v --vo atlas srm://uct2-dc1.uchicago.edu:8443/pnfs/uchicago.edu/data/ddm1/fdr08_run2/RAW/fdr08_run2.0052290.physics_Express.daq.RAW.o2/daq.fdr08_run2.0052290.physics.Express.LB0014._sfo02._0001.data file:///ecache/marco/test_dq2-get/timing/fdr08_run2.0052290.physics_Express.daq.RAW.o2/daq.fdr08_run2.0052290.physics.Express.LB0014._sfo02._0001.data
Command:  lcg-cp -v --vo atlas -b -T srmv1  srm://uct2-dc1.uchicago.edu:8443/pnfs/uchicago.edu/data/ddm1/fdr08_run2/RAW/fdr08_run2.0052290.physics_Express.daq.RAW.o2/daq.fdr08_run2.0052290.physics.Express.LB0014._sfo02._0001.data file:///ecache/marco/test_dq2-get/timing/fdr08_run2.0052290.physics_Express.daq.RAW.o2/daq.fdr08_run2.0052290.physics.Express.LB0014._sfo02._0001.data
httpg://uct2-dc1.uchicago.edu:8443: Unknown error
lcg_cp: Communication error on send
Source SE type: SRMv1
[uct3-edge5] /ecache/marco/test_dq2-get/timing >  /share/wlcg-client/lcg/bin/lcg-cp -v --vo atlas srm://uct2-dc1.uchicago.edu:8443/pnfs/uchicago.edu/data/ddm1/fdr08_run2/RAW/fdr08_run2.0052290.physics_Express.daq.RAW.o2/daq.fdr08_run2.0052290.physics.Express.LB0014._sfo02._0001.data file:///ecache/marco/test_dq2-get/timing/fdr08_run2.0052290.physics_Express.daq.RAW.o2/daq.fdr08_run2.0052290.physics.Express.LB0014._sfo02._0001.data
LCG_GFAL_INFOSYS not set
lcg_cp: Invalid argument
[uct3-edge5] /ecache/marco/test_dq2-get/timing >  /share/wlcg-client/lcg/bin/lcg-cp -v -b -T srmv2 --vo atlas srm://uct2-dc1.uchicago.edu:8443/pnfs/uchicago.edu/data/ddm1/fdr08_run2/RAW/fdr08_run2.0052290.physics_Express.daq.RAW.o2/daq.fdr08_run2.0052290.physics.Express.LB0014._sfo02._0001.data file:///ecache/marco/test_dq2-get/timing/fdr08_run2.0052290.physics_Express.daq.RAW.o2/daq.fdr08_run2.0052290.physics.Express.LB0014._sfo02._0001.data
Invalid request: When BDII checks are disabled, you must provide full endpoint
lcg_cp: Invalid argument
Setting LCG_GFAL_INFOSYS for the second attempt just leads to a parsing error.
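The third error message points at the likely fix: with -b (BDII checks disabled) lcg-cp wants the SURL to carry the full service endpoint. For a dCache SRM v2.2 server this would presumably look like the following (an untested sketch; the /srm/managerv2 endpoint path is an assumption):
/share/wlcg-client/lcg/bin/lcg-cp -v -b -T srmv2 --vo atlas "srm://uct2-dc1.uchicago.edu:8443/srm/managerv2?SFN=/pnfs/uchicago.edu/data/ddm1/fdr08_run2/RAW/fdr08_run2.0052290.physics_Express.daq.RAW.o2/daq.fdr08_run2.0052290.physics.Express.LB0014._sfo02._0001.data" file:///ecache/marco/test_dq2-get/timing/fdr08_run2.0052290.physics_Express.daq.RAW.o2/daq.fdr08_run2.0052290.physics.Express.LB0014._sfo02._0001.data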

Test 2: dq2-get (srmcp)

time dq2-get -L UCT3 -s MWT2_UC -p srm fdr08_run2.0052290.physics_Express.daq.RAW.o2 >& ../0052290_Transfer080805b.log &

This test took almost 125 min and copied all the files (300 files, ~137 GB):

It took exactly:
real	124m58.523s
user	58m0.271s
sys	19m33.982s
And the total size was:
> du -b fdr08_run2.0052290.physics_Express.daq.RAW.o2/
136621451208	fdr08_run2.0052290.physics_Express.daq.RAW.o2/

Results:
  • from UCT2 to uct3-edge5 local /ecache
  • 137 GB, 300 files
  • 7498 seconds
  • 18.2 MB/s (145.7 Mbs); see the rate computation sketched below
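The rates on this page follow from the byte count (du -sb) and the elapsed wall-clock time; a minimal helper script to reproduce them (the name rate.sh is hypothetical):
#!/bin/sh
# Usage: ./rate.sh <bytes> <seconds>, e.g. ./rate.sh 136621451208 7498
awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f MB/s (%.1f Mbs)\n", b/s/1e6, b*8/s/1e6 }'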

Test 3: sequential srmcp

time ./copytest2.sh >& copytest2.log 

A text file (bash script) contains the exact sequence of srmcp commands executed by Test 2. This time there is no dq2-get overhead and the commands are executed sequentially. The commands look like:
srmcp srm://uct2-dc1.uchicago.edu:8443/pnfs/uchicago.edu/data/ddm1/fdr08_run2/RAW/fdr08_run2.0052290.physics_Express.daq.RAW.o2/daq.fdr08_run2.0052290.physics.Express.LB0029._sfo04._0001.data file:////ecache/marco/test_dq2-get/timing/fdr08_run2.0052290.physics_Express.daq.RAW.o2/daq.fdr08_run2.0052290.physics.Express.LB0029._sfo04._0001.data
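The script itself is 300 such lines, one per file; an equivalent condensed sketch of its structure:
#!/bin/sh
# Same source/destination prefixes as the command above; srmcp runs
# strictly one file at a time.
SRC=srm://uct2-dc1.uchicago.edu:8443/pnfs/uchicago.edu/data/ddm1/fdr08_run2/RAW/fdr08_run2.0052290.physics_Express.daq.RAW.o2
DST=file:////ecache/marco/test_dq2-get/timing/fdr08_run2.0052290.physics_Express.daq.RAW.o2
for f in \
    daq.fdr08_run2.0052290.physics.Express.LB0029._sfo04._0001.data
    # ... the other 299 file names ...
do
    srmcp "$SRC/$f" "$DST/$f"
done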

Exactly the same files were copied. It took exactly:
real	173m16.120s
user	34m37.884s
sys	16m0.831s

Results:
  • from UCT2 to uct3-edge5 local /ecache
  • 137 GB, 300 files
  • 10396 seconds
  • 13.1 MB/s (105.1 Mbs)

Test 4: srmcp with input file

This test is similar to the previous one, except that all the source/destination file pairs are listed in a single file and there is a single srmcp invocation.
time /share/wlcg-client/srm-client-fermi/bin/srmcp -copyjobfile=copyjobf.txt -report=copytest3.report >&  copytest3.log
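Each line of copyjobf.txt is assumed to pair a source URL with a destination URL, separated by whitespace (the copyjobfile format of the FNAL srm client), e.g.:
srm://uct2-dc1.uchicago.edu:8443/pnfs/uchicago.edu/data/ddm1/fdr08_run2/RAW/fdr08_run2.0052290.physics_Express.daq.RAW.o2/daq.fdr08_run2.0052290.physics.Express.LB0029._sfo04._0001.data file:////ecache/marco/test_dq2-get/timing/fdr08_run2.0052290.physics_Express.daq.RAW.o2/daq.fdr08_run2.0052290.physics.Express.LB0029._sfo04._0001.data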
The single invocation completed with:
real	153m20.639s
user	6m50.329s
sys	14m25.589s

This test also has a second part where the same copy is repeated several times, changing the number of streams used by srmcp. This checks whether the current limit is due to the number of streams.
 time /share/wlcg-client/srm-client-fermi/bin/srmcp -streams_num=3 -copyjobfile=copyjobf.txt -report=copytest4.report >&  copytest4.log
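The repeated runs can be scripted as a simple loop (a sketch; the report and log file names are hypothetical):
#!/bin/bash
# Repeat the same copyjob while varying the number of srmcp streams.
SRMCP=/share/wlcg-client/srm-client-fermi/bin/srmcp
for n in 3 5 7; do
    time $SRMCP -streams_num=$n -copyjobfile=copyjobf.txt -report=copytest_${n}streams.report >& copytest_${n}streams.log
done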

4: 3 streams
real	126m26.115s
user	6m10.713s
sys	8m52.897s

5: 5 streams
real	127m10.319s
user	6m7.940s
sys	9m19.152s

7: 7 streams
real	124m50.975s
user	6m37.168s
sys	9m34.491s

Results:
  • from UCT2 to uct3-edge5 local /ecache
  • 137 GB, 300 files
  • 1 stream: 9200 seconds, 14.9 MB/s (118.8 Mbs)
  • 3 streams: 7586 seconds, 18.0 MB/s (144.1 Mbs)
  • 5 streams: 7630 seconds, 17.9 MB/s (143.2 Mbs)
  • 7 streams: 7491 seconds, 18.2 MB/s (145.9 Mbs)

Notes:
  • uct3-edge5 crashed during the first execution of the 7 stream test. Results are from the second execution.

Test 5: 2 (or more) copies at the same time

In this test 2 or more copies of the same dataset are performed at the same time from the same client host. This checks whether there is a limit per transaction (or process), or whether the limits are in the client or in the server.

To shorten the test and use less disk space, each of these transfers involves only 100 files (45.4 GB, 45436718584 bytes). This should still be big enough not to be affected by variations in load.

2 instances on the same client host (SC SS = same client, same server)
time /share/wlcg-client/srm-client-fermi/bin/srmcp -streams_num=3 -copyjobfile=copyjobf100b.txt -report=copytest8b.report >&  copytest8b.log
real	79m29.572s
user	2m4.430s
sys	3m37.087s

time /share/wlcg-client/srm-client-fermi/bin/srmcp -streams_num=3 -copyjobfile=copyjobf100a.txt -report=copytest8a.report >&  copytest8a.log
real	77m11.139s
user	2m1.247s
sys	3m37.087s
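For the record, the two instances can be launched together like this (a sketch):
#!/bin/bash
# Start both srmcp instances concurrently and wait for both to finish.
SRMCP=/share/wlcg-client/srm-client-fermi/bin/srmcp
$SRMCP -streams_num=3 -copyjobfile=copyjobf100a.txt -report=copytest8a.report >& copytest8a.log &
$SRMCP -streams_num=3 -copyjobfile=copyjobf100b.txt -report=copytest8b.report >& copytest8b.log &
time wait    # wall-clock time of the slower of the two transfers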

Results:
  • 45.4GB, 100 files
  • same client host, same server: 9.5 MB/s (76.2 Mbs), 9.8 MB/s (78.5 Mbs), sum: 19.3 MB/s (154.7 Mbs)
  • same client host, different server:
  • different client host, same server:

Performance

Some performance discussion. Measured speed:
  • 18221052.4 byte/sec
  • 18.2 MB/s (145.7 Mbs, 139.0 Mibs)
  • Max theoretical speed
    • Ethernet: 1 Gbs (about 7 times the measured rate)
    • disk: 1.5 Gbs (max SATA)
    • read test:
    • dCache read from 1 pool (3 if files are on different pools):

Further tests:
  • srmcp tests (loop, streams, multiple): in progress
  • loop with different commands (lcg-cp, ngcp)

Other data:
  • 160 Mbps from UC -> da.physics.indiana.edu (Tom and ); probably an iperf (memory-to-memory) transfer rate
Fred's rates:
  • 52280: 113 GB in 12240 s = 74 Mbps (Friday afternoon)
  • 52290: 133 GB in 8400 s = 126 Mbps (Monday morning)

Some thoughts and comparisons:
  • local performance is not high
  • anyway, about the same performance is achieved with remote transfers
  • remote performance (UC-IU) hits the dq2-get limit and the network limit at about the same value (the iperf test and the local dq2-get execution give about the same number); both would have to be improved to get better performance
  • there is no verification that the copy is correct (no checksum evaluation; that would slow things down further)
  • transfer rates on 8/6/08 in production FTS (UC-BNL, the highest of the day) are even lower (1 Mbs); probably there are problems today
  • transfer rates on the evening of 8/6/08 in production FTS (UC-BNL, the highest of the day) are even lower (2 Mbs); this is considered a good rate
  • transfer test IU-UC on 8/6/08 by Sarah: sustained 500 Mbs in ganglia; this involves 34 machines, each one starting a 3rd-party SRM copy (multiple SRM servers and gridftp doors are involved as well)
    • 800 Mbs according to Cacti and other measurements (Charles' ifrate.sh script)
    • the test ended after ~6 hours with iut2-dc1 crashing (apparently TCP memory allocation problems)
  • RAID card test at MWT2_UC by Charles (MWT2 meeting 8/12/08), max throughput with dd (see the sketch after this list):
    • LSI cards max 20 MB/s
    • 3Ware cards max 34 MB/s
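A dd throughput test of that kind might look like this (a sketch; path and size are hypothetical):
#!/bin/bash
# Raw write test; oflag=direct bypasses the page cache so the rate dd
# reports reflects the RAID card rather than memory.
dd if=/dev/zero of=/ecache/ddtest.tmp bs=1M count=4096 oflag=direct
rm -f /ecache/ddtest.tmp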

-- MarcoMambelli - 05 Aug 2008
