Rucio VM Administrator Guide

This guide shows how to construct the Rucio layer on OpenStack.

Hypervisor Settings

Create seven VM instances

Log in at https://openstack.mwt2.org. Click the `Instances` tab on the left-hand side of the screen:

[Screenshot: Screen Shot 2019-04-11 at 10.32.33 AM.png]

Click 'Launch Instance'. The following instructions will be the same for each instance except for the security groups & names.

Name each instance with a Fully Qualified Domain Name that indicates the Rucio service it runs, ending in ".grid.uchicago.edu":
  • rucio-server.grid.uchicago.edu
  • rucio-client.grid.uchicago.edu
  • rucio-gridftp1.grid.uchicago.edu
  • rucio-gridftp2.grid.uchicago.edu
  • rucio-grafana.grid.uchicago.edu
  • rucio-prometheus.grid.uchicago.edu
  • rucio-fts3.grid.uchicago.edu
[Screenshot: Screen Shot 2019-04-11 at 10.50.01 AM.png]

Click the 'Next >' button. On the Source screen, set 'Select Boot Source', 'Create New Volume', 'Volume Size (GB)', and 'Delete Volume on Instance Delete' as in the screencap:

[Screenshot: Screen Shot 2019-04-11 at 11.42.00 AM.png]

then scroll through the available images to the one named 'CentOS7' and click the 'up arrow button'; then click the 'Next >' button:

[Screenshot: Screen Shot 2019-04-11 at 11.47.46 AM.png]

Select the "m1.small" flavor. Click the 'Next >' button:

[Screenshot: Screen Shot 2019-04-11 at 12.04.57 PM.png]

Select the "UChicago Campus Network" button. Click the 'Next >' button:

[Screenshot: Screen Shot 2019-04-11 at 12.08.02 PM.png]

Click the 'Next >' button to skip the "Network Ports" tab entirely. At the Security Groups tab:
  1. For the instance named "rucio-server", select: webserver, gridftp, fts, activemq, and memcached
  2. For the instances named "rucio-gridftp1" & "rucio-gridftp2", select: gridftp
  3. For the instance named "rucio-fts3", select: fts3, gridftp
Then click the 'Next >' button:

[Screenshot: Screen Shot 2019-04-11 at 12.28.37 PM.png]

Select the keypair you will use for `ssh`. Then click 'Next >':
[Screenshot: Screen Shot 2019-04-11 at 12.34.00 PM.png]

Then click 'Next >' at each subsequent screen. Repeat the steps above from the top, naming each instance for the system it supports, until all seven instances are created.

Ask your DNS administrator to add an A Record for each pair of FQDN & IP address.

Create two volumes

Create the volumes. Go to https://openstack.mwt2.org/dashboard/project/volumes/ and click 'Create Volume':

[Screenshot: Screen Shot 2019-04-11 at 1.30.50 PM.png]

When the popup appears, arrange its left column as in the screencap and select a unique name and description:
[Screenshot: Screen Shot 2019-04-11 at 1.38.20 PM.png]

Then click "Create Volume". Repeat the above to create a second volume. Attach one to each GridFTP instance by selecting 'Manage Attachments' in the dropdown menu:
[Screenshot: Screen Shot 2019-04-17 at 1.17.50 PM.png]

and selecting the name of a GridFTP instance. Repeat the process to attach the other volume to the other GridFTP instance.

Certification

IGTF Certification

`ssh` to each VM you created in the previous step. Each instance's hostname should be its FQDN, e.g. {Instance Name}.grid.uchicago.edu.

Set the hostname for each to the desired FQDN hostname:

hostnamectl set-hostname {FQDN}

Ask your DNS administrator to add the A Record with the FQDN and the IP address.

Verify that it is propagated:

ping {FQDN}

Once the record resolves, obtain hostname IGTF certificates to enable SSL transactions from each machine in the Rucio layer. Generate a key and CSR on each host:

openssl req -new -newkey rsa:4096 -nodes -keyout {FQDN}.key -out {FQDN}.csr

Log in to https://cert-manager.com/customer/InCommon/ssl?action=enroll with the credentials available from your site coordinators. Set the dropdown menus as in the following screencap:

[Screenshot: Screen Shot 2019-04-11 at 2.13.54 PM.png]

Copy the .csr file's contents and paste them into the box labeled CSR. Click 'GET CN FROM CSR', which should populate the box 'Common Name' with the FQDN.

Select the Auto renew checkbox. Select a pass-phrase & re-type the pass-phrase. Type "gardnergroup@lists.uchicago.edu, {your email address}" into External Requester.

Click 'Enroll' at the bottom of the screen.
The certification process can take anywhere from one hour to four days.

When you get the 'ENROLLMENT SUCCESSFUL' email, right-click the link to 'X509 Certificate, base64 encoded', and copy it:

[Screenshot: Screen Shot 2019-04-11 at 2.20.24 PM.png]

`ssh` to the FQDN, escalate to root, and retrieve the certificates to /root/:

ssh -i /path/to/key centos@{FQDN}
sudo su -
curl -o {FQDN}.cert -L "{copied URI of certificate}"
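
The Apache and FTS3 configurations later in this guide expect the host credentials under /etc/grid-security/. A minimal sketch, assuming the key generated alongside the CSR above was saved as {FQDN}.key:

# Install the host certificate and key where httpd and FTS3 expect them
cp {FQDN}.cert /etc/grid-security/hostcert.pem
cp {FQDN}.key /etc/grid-security/hostkey.pem
chmod 644 /etc/grid-security/hostcert.pem
chmod 400 /etc/grid-security/hostkey.pem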

Repeat this section for each of the seven hostnames, for a total of seven certificates.

X509 User Certification

The GridFTP Servers, the Rucio Server, and the Rucio Client will need a user X509 certificate and key. Follow the guide at http://twiki.mwt2.org/bin/view/Main/RucioNewUserStartGuide. Copy the link to the .P12 file.

ssh -i /path/to/key centos@{instance address or FQDN}
curl -o {a filename}.p12 -L "{p12 link}"
# Extract the key and certificate from the PKCS12 bundle
openssl pkcs12 -in {a filename}.p12 -nocerts -out userkey.pem && openssl pkcs12 -in {a filename}.p12 -clcerts -nokeys -out usercert.pem
# Strip the bag attributes from the certificate and the pass-phrase from the key
openssl x509 -in usercert.pem -out usercert.pem
openssl rsa -in userkey.pem -out userkey.pem
useradd {new user}
mkdir -p /home/{new user}/.globus/
mv usercert.pem userkey.pem /home/{new user}/.globus/
chmod 644 /home/{new user}/.globus/usercert.pem
chmod 400 /home/{new user}/.globus/userkey.pem
chown -R {new user}:{new user} /home/{new user}/.globus

Do not delete the .p12 file. Repeat this for each instance that needs the user certificate.
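
Before moving on, verify that the extracted certificate and key form a matching pair; the two digests printed below must be identical:

# The modulus of the certificate and of the private key must match
openssl x509 -noout -modulus -in /home/{new user}/.globus/usercert.pem | openssl md5
openssl rsa -noout -modulus -in /home/{new user}/.globus/userkey.pem | openssl md5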

Configure Each Instance

Knowing each instance's purpose, `ssh` to it; follow the relevant instructions to configure it.

Rucio Server

Install the Python Dependencies

Log in to the Rucio server instance via SSH. Start by installing these dependencies:
yum -y install python-devel m2crypto pycrypto MySQL-python
pip2.7 install --upgrade setuptools
pip2.7 install rucio

Install MariaDB & Start It

curl -sS https://downloads.mariadb.com/MariaDB/mariadb_repo_setup | sudo bash && sudo yum install MariaDB-server MariaDB-client && chkconfig mariadb on && service mariadb restart
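
The Rucio configuration created later in this section points at `mysql://rucio:rucio@localhost/rucio`, but MariaDB does not create that database or user automatically. A minimal sketch (the rucio/rucio credentials simply mirror that connection string):

mysql -u root
CREATE DATABASE rucio;
GRANT ALL PRIVILEGES ON rucio.* TO 'rucio'@'localhost' IDENTIFIED BY 'rucio';
FLUSH PRIVILEGES;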

Install the httpd Dependencies

rpm -Uvh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
yum -y install httpd mod_ssl mod_wsgi gridsite
systemctl enable httpd
rm -f /etc/httpd/conf.d/*

Install, configure, and run ActiveMQ

ActiveMQ support is unstable; however, it is not mission-critical. Install the OpenJDK, add it to your path, and download and verify ActiveMQ:

yum install -y java-1.8.0-openjdk
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/
export PATH=$PATH:$JAVA_HOME/bin
curl -o apache-activemq-5.15.9.tar.gz -L "http://www.apache.org/dyn/closer.cgi?filename=/activemq/5.15.9/apache-activemq-5.15.9-bin.tar.gz&action=download"
curl -o apache-activemq-5.15.9-bin.tar.gz.asc -L "https://www.apache.org/dist/activemq/5.15.9/apache-activemq-5.15.9-bin.tar.gz.asc"
curl -o KEYS -L "https://www.apache.org/dist/activemq/KEYS"
gpg --import KEYS
gpg --verify apache-activemq-5.15.9-bin.tar.gz.asc apache-activemq-5.15.9.tar.gz

Extract the archive (the systemd unit below assumes it lands in /root) and make the ActiveMQ binary executable:

tar xzf apache-activemq-5.15.9.tar.gz
chmod 755 apache-activemq-5.15.9/bin/activemq

Add these two lines to "apache-activemq-5.15.9/conf/jetty-realm.properties" (they define the rucio and fts3activemq users referenced by Rucio and FTS3):

rucio: rucio, admin
fts3activemq: fts3activemq, admin

and create a systemd unit for it (Caution: the service is unstable and can unexpectedly OOM or self-destruct):

cat <<'EOF' > /etc/systemd/system/activemq.service
[Unit]
Description=Apache ActiveMQ
After=network-online.target

[Service]
Type=forking
WorkingDirectory=/root/apache-activemq-5.15.9/bin/
ExecStart=/root/apache-activemq-5.15.9/bin/activemq start
ExecStop=/root/apache-activemq-5.15.9/bin/activemq stop
Restart=on-abort

[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl start activemq
systemctl enable activemq

Visit the ActiveMQ web console (by default, at http://{hostname}:8161/admin). Add a queue:
Consumer.kronos.rucio.tracer

Add topics:
transfer.fts_monitoring_queue_state, rucio.tracer, rucio.fax
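
Before wiring Rucio and FTS3 to the broker, you can confirm it is listening on the expected ports (61616 for OpenWire as configured below, 61613 for STOMP, 8161 for the web console):

ss -tlnp | grep -E '61616|61613|8161'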

Install memcached and secure it

yum install -y memcached

# Allow memcached (TCP and UDP port 11211) only from trusted networks and localhost
iptables -A INPUT -p tcp -s 192.170.226.0/23 --dport 11211 -j ACCEPT
iptables -A INPUT -p tcp -s 127.0.0.0/8 --dport 11211 -j ACCEPT
iptables -A INPUT -p tcp -s 192.170.231.0/26 --dport 11211 -j ACCEPT
iptables -A INPUT -p tcp -s 128.135.158.128/25 --dport 11211 -j ACCEPT
iptables -A INPUT -p udp -s 192.170.226.0/23 --dport 11211 -j ACCEPT
iptables -A INPUT -p udp -s 127.0.0.0/8 --dport 11211 -j ACCEPT
iptables -A INPUT -p udp -s 192.170.231.0/26 --dport 11211 -j ACCEPT
iptables -A INPUT -p udp -s 128.135.158.128/25 --dport 11211 -j ACCEPT
iptables -A INPUT -p tcp --dport 11211 -j DROP
iptables -A INPUT -p udp --dport 11211 -j DROP
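
The rules above are not persistent across reboots, and memcached itself still needs to be started. A minimal sketch, assuming the stock iptables-services package:

yum install -y iptables-services
service iptables save
systemctl enable memcached
systemctl start memcached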

Create the Rucio configuration file

cd /opt && mkdir -p rucio/etc && cd rucio && cat <<'EOF' > /opt/rucio/etc/rucio.cfg
[common]
logdir = /var/log/rucio
loglevel = DEBUG
mailtemplatedir=/opt/rucio/etc/mail_templates
[client]
rucio_host = https://{Rucio server hostname}:443
auth_host = https://{Rucio server hostname}:443
auth_type = x509
account = %(RUCIO_ACCOUNT)s
ca_cert = /opt/rucio/etc/web/ca.crt
client_cert = ~/.globus/usercert.pem
client_key = ~/.globus/userkey.pem
client_x509_proxy = $X509_USER_PROXY
request_retries = 3
[bootstrap]
x509_identity = {Admin's X509 ID}
[database]
default = mysql://rucio:rucio@localhost/rucio
pool_recycle=3600
echo=0
pool_reset_on_return=rollback
#[bootstrap]
# Hardcoded salt = 0, String = secret, Python: hashlib.sha256("0secret").hexdigest()
#userpass_identity = rucio
#userpass_pwd = 1c78bb21ad7f941d5c76890be51d414409cc1f059907631c0142a5d61bddc468
#userpass_email = jlstephen@uchicago.edu
# Default DDMLAB client certificate from /opt/rucio/etc/web/hostcert.pem
# x509_identity = /DC=org/DC=opensciencegrid/O=Open Science Grid/OU=Services/CN=rucio.mwt2.org
# x509_email = jlstephen@uchicago.edu

#[monitor]
#carbon_server = rucio-xenon-dev.grid.uchicago.edu
#carbon_port = 8125
#user_scope = rucio

[conveyor]
scheme = gsiftp
transfertool = fts3
ftshosts = https://{fts3 hostname}:8446  
cacert = /etc/grid-security/certificates/InCommon-IGTF-Server-CA.pem
usercert = /tmp/x509up_u1001 

[messaging-fts3]
brokers = localhost 
port = 61616
ssl_key_file = /etc/grid-security/hostkey.pem
ssl_cert_file = /etc/grid-security/hostcert.pem
destination = /topic/transfer.fts_monitoring_queue_state

#[messaging-hermes]
#brokers = rucio-xenon-dev.grid.uchicago.edu
#port = 61613
#nonssl_port = 61613
#use_ssl = True
#voname = xenon.biggrid.nl
#ssl_cert_file = /opt/rucio/etc/web/hostcert.pem
#ssl_key_file = /opt/rucio/etc/web/hostkey.pem
#destination = /topic/rucio.events
#email_from = Rucio <jlstephen@uchicago.edu>
#email = jlstephen@uchicago.edu

#[transmogrifier]
#maxdids = 100000

#[accounts]
# These are accounts that can write into scopes owned by another account
#special_accounts = panda, tier0
[trace]
brokers = rucio-server-test.grid.uchicago.edu
port = 61616
username = rucio
password = 88c9580ac2cf378b0c7b1b2ad0d69beba55fcbb9
tracedir = /var/log/rucio/trace
topic = /topic/rucio.tracer
#[tracer-kronos]
#brokers = rucio-xenon-dev.grid.uchicago.edu
#port = 61613
#ssl_cert_file = /opt/rucio/etc/web/hostcert.pem
#ssl_key_file = /opt/rucio/etc/web/hostkey.pem
#username = rucio
#password = 88c9580ac2cf378b0c7b1b2ad0d69beba55fcbb9
#queue = /queue/Consumer.kronos.rucio.tracer
#prefetch_size = 10
#chunksize = 10
#subscription_id = rucio-tracer-listener
#use_ssl = False
#reconnect_attempts = 100
#excluded_usrdns = /DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=gangarbt/CN=722147/CN=Robot: Ganga Robot/CN=proxy
#dataset_wait = 60
#[injector]
#file = /opt/rucio/tools/test.file.1000
#bytes = 1000
#md5 = fd21ce524a9e45060fd3f62c4ef6a386
#adler32 = 52590737
[alembic]
cfg = /opt/rucio/etc/alembic.ini

#[messaging-cache]
#brokers = rucio-xenon-dev.grid.uchicago.edu
#port = 61613
#voname = xenon.biggrid.nl
#ssl_cert_file = /opt/rucio/etc/web/hostcert.pem
#ssl_key_file = /opt/rucio/etc/web/hostkey.pem
#destination = /topic/rucio.fax
#account = cache_mb

#[hermes]
#email_from = Rucio <jlstephen@uchicago.edu>
#email_test = jlstephen@uchicago.edu

#[permission]
#policy = xenon.biggrid.nl
#schema = xenon.biggrid.nl

[policy]
schema = xenon.biggrid.nl
permission = xenon.biggrid.nl
EOF
rpm -Uvh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
rpm -Uvh https://repo.opensciencegrid.org/osg/3.4/osg-3.4-el7-release-latest.rpm
yum -y install httpd mod_ssl mod_wsgi gridsite mod_auth_kerb yum-plugin-priorities

Get the Client CA cert

Visit https://support.comodo.com/index.php?/comodo/Knowledgebase/Article/View/991/0/incommonssl-sha-2, and download `incommonrsaserverca-bundle.crt` from the attachments at the bottom of the page. `cat` the file's contents and paste them into /opt/rucio/etc/web/ca.crt (create the /opt/rucio/etc/web directory first if it does not exist).

Configure the database

curl -o /opt/rucio/etc/alembic.ini -L "https://raw.githubusercontent.com/rucio/rucio/master/etc/docker/demo/alembic.ini"
curl -o setup_data.py -L "https://raw.githubusercontent.com/rucio/rucio/master/etc/docker/demo/setup_data.py"

Open setup_data.py in `vi` and replace the Docker Client DN with your X509 DN, so that the "jdoe" account (you can change that name too, but doing so will invalidate certain New User Guide steps) is identified with your X509 DN. Then run:

python setup_data.py

Start the webserver

In /etc/httpd/conf.d, create a file called `rucio.conf`, and define it as follows:

# Copyright European Organization for Nuclear Research (CERN)
#
# Licensed under the Apache License, Version 2.0 (the "License");
# You may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
#
# Authors:
# - Mario Lassnig, <mario.lassnig@cern.ch>, 2012
# - Vincent Garonne, <vincent.garonne@cern.ch>, 2012
# - Ralph Vigne, <ralph.vigne@cern.ch>, 2013
#
# Notabene:
#
# This file configures a local SSLised Apache 2.2 for testing.
#
# Usage:
#   sudo apachectl restart
#   tail -f /var/log/apache2/*_log /var/log/rucio/httpd_*
#
# We are relying on some features of mod_ssl of the Apache project,
# therefore we cannot use another backend server for now.
#
# This configuration is for Ubuntu, with mod_wsgi
# installed from ubuntu repositories
#
# If Apache throws a "Cannot follow symlink" error, check the
# permissions of /opt/rucio; the apache user must be able to
# access it.
# {your name}

#Built-in modules if by yum
LoadModule ssl_module /usr/lib64/httpd/modules/mod_ssl.so
LoadModule auth_kerb_module /usr/lib64/httpd/modules/mod_auth_kerb.so
LoadModule wsgi_module /usr/lib64/httpd/modules/mod_wsgi.so
LoadModule gridsite_module /usr/lib64/httpd/modules/mod_gridsite.so

#WSGIPythonHome /opt/rucio/.venv/
#WSGIPythonPath /opt/rucio/.venv/lib/python2.7/site-packages

<Directory /usr/lib/python2.7>
    AllowOverride none
    Require all granted
</Directory>

Listen 443
#NameVirtualHost *:443
<VirtualHost {your hostname}:443>

 ServerName {your hostname}
 ServerAdmin {your email}

 SSLEngine on
 SSLCertificateFile /etc/grid-security/hostcert.pem
 SSLCertificateKeyFile /etc/grid-security/hostkey.pem
 SSLCACertificatePath /etc/grid-security/certificates
 SSLCARevocationPath /etc/grid-security/certificates
 SSLVerifyClient optional 
 SSLVerifyDepth 10
 SSLOptions +StdEnvVars

 LogLevel warn

 Include /opt/rucio/etc/web/aliases-py27.conf
 
 ErrorLog /var/log/rucio/httpd_error_log
 TransferLog /var/log/rucio/httpd_access_log

 # Valid client certificates required only for x509 authentication
 <LocationMatch /identity/.+/x509>
  SSLVerifyClient require
  SSLVerifyDepth 10
  SSLOptions +StdEnvVars
 </LocationMatch>

 <LocationMatch auth/x509>
  SSLVerifyClient optional 
  SSLVerifyDepth 10
  SSLOptions +StdEnvVars
 </LocationMatch>

   #Proxy authentication via mod_gridsite
   #<LocationMatch /auth/x509_proxy>
   #GridSiteIndexes on
   #GridSiteAuth on
   #GridSiteDNlists /etc/grid-security/dn-lists/
   #GridSiteGSIProxyLimit 16
   #GridSiteEnvs on
   #GridSiteACLPath /opt/rucio/etc/gacl
   #</LocationMatch>

</VirtualHost>

Then `touch` /opt/rucio/etc/web/aliases-py27.conf and set it like so:

# Rucio REST
WSGIScriptAlias /accounts                /usr/lib/python2.7/site-packages/rucio/web/rest/account.py
WSGIScriptAlias /accountlimits           /usr/lib/python2.7/site-packages/rucio/web/rest/account_limit.py
WSGIScriptAlias /auth                    /usr/lib/python2.7/site-packages/rucio/web/rest/authentication.py
WSGIScriptAlias /credentials             /usr/lib/python2.7/site-packages/rucio/web/rest/credentials.py
WSGIScriptAlias /archives                /usr/lib/python2.7/site-packages/rucio/web/rest/archive.py
WSGIScriptAlias /config                  /usr/lib/python2.7/site-packages/rucio/web/rest/config.py
WSGIScriptAlias /tmp_dids                /usr/lib/python2.7/site-packages/rucio/web/rest/temporary_did.py
WSGIScriptAlias /dids                    /usr/lib/python2.7/site-packages/rucio/web/rest/did.py
WSGIScriptAlias /identities              /usr/lib/python2.7/site-packages/rucio/web/rest/identity.py
WSGIScriptAlias /heartbeats              /usr/lib/python2.7/site-packages/rucio/web/rest/heartbeat.py
WSGIScriptAlias /locks                   /usr/lib/python2.7/site-packages/rucio/web/rest/lock.py
WSGIScriptAlias /meta                    /usr/lib/python2.7/site-packages/rucio/web/rest/meta.py
WSGIScriptAlias /nongrid_traces          /usr/lib/python2.7/site-packages/rucio/web/rest/nongrid_trace.py
WSGIScriptAlias /ping                    /usr/lib/python2.7/site-packages/rucio/web/rest/ping.py
WSGIScriptAlias /redirect                /usr/lib/python2.7/site-packages/rucio/web/rest/redirect.py
WSGIScriptAlias /replicas                /usr/lib/python2.7/site-packages/rucio/web/rest/replica.py
WSGIScriptAlias /requests                /usr/lib/python2.7/site-packages/rucio/web/rest/request.py
WSGIScriptAlias /rses                    /usr/lib/python2.7/site-packages/rucio/web/rest/rse.py
WSGIScriptAlias /rules                   /usr/lib/python2.7/site-packages/rucio/web/rest/rule.py
WSGIScriptAlias /scopes                  /usr/lib/python2.7/site-packages/rucio/web/rest/scope.py
WSGIScriptAlias /subscriptions           /usr/lib/python2.7/site-packages/rucio/web/rest/subscription.py
WSGIScriptAlias /traces                  /usr/lib/python2.7/site-packages/rucio/web/rest/trace.py
WSGIScriptAlias /objectstores            /usr/lib/python2.7/site-packages/rucio/web/rest/objectstore.py
WSGIScriptAlias /lifetime_exceptions     /usr/lib/python2.7/site-packages/rucio/web/rest/lifetime_exception.py
WSGIScriptAlias /import                  /usr/lib/python2.7/site-packages/rucio/web/rest/importer.py
WSGIScriptAlias /export                  /usr/lib/python2.7/site-packages/rucio/web/rest/exporter.py

Then start the webserver:

systemctl enable httpd
systemctl start httpd
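
Once httpd is running, the REST API can be smoke-tested against the /ping endpoint defined in the aliases file above; it should return the Rucio server version as JSON:

curl --cacert /opt/rucio/etc/web/ca.crt https://{your hostname}/ping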

Configure the daemons

Install supervisord if it isn't already installed:
yum -y install supervisor

Create the file '/etc/supervisord.d/rucio.ini':

[program:rucio-conveyor-transfer-submitter]
command=/bin/rucio-conveyor-transfer-submitter
stdout_logfile=/var/log/rucio/conveyor-transfer-submitter.log

[program:rucio-conveyor-poller]
command=/bin/rucio-conveyor-poller
stdout_logfile=/var/log/rucio/conveyor-poller.log

[program:rucio-conveyor-finisher]
command=/bin/rucio-conveyor-finisher
stdout_logfile=/var/log/rucio/conveyor-finisher.log

[program:rucio-undertaker]
command=/bin/rucio-undertaker
stdout_logfile=/var/log/rucio/undertaker.log

[program:rucio-reaper]
command=/bin/rucio-reaper
stdout_logfile=/var/log/rucio/reaper.log

[program:rucio-necromancer]
command=/bin/rucio-necromancer
stdout_logfile=/var/log/rucio/necromancer.log

[program:rucio-abacus-account]
command=/bin/rucio-abacus-account
stdout_logfile=/var/log/rucio/abacus-account.log

[program:rucio-abacus-rse]
command=/bin/rucio-abacus-rse
stdout_logfile=/var/log/rucio/abacus-rse.log

[program:rucio-transmogrifier]
command=/bin/rucio-transmogrifier
stdout_logfile=/var/log/rucio/transmogrifier.log

[program:rucio-judge-evaluator]
command=/bin/rucio-judge-evaluator
stdout_logfile=/var/log/rucio/judge-evaluator.log

[program:rucio-judge-repairer]
command=/bin/rucio-judge-repairer
stdout_logfile=/var/log/rucio/judge-repairer.log

[program:rucio-conveyor-stager]
command=/bin/rucio-conveyor-stager
stdout_logfile=/var/log/rucio/conveyor-stager.log

Start supervisord, enable it at boot, and check that each of the above programs is running:

systemctl enable supervisord
systemctl start supervisord
supervisorctl
supervisor> status
rucio-abacus-account             RUNNING   pid 715, uptime 4 days, 22:25:39
rucio-abacus-rse                 RUNNING   pid 725, uptime 4 days, 22:25:39
rucio-conveyor-finisher          RUNNING   pid 723, uptime 4 days, 22:25:39
rucio-conveyor-poller            RUNNING   pid 724, uptime 4 days, 22:25:39
rucio-conveyor-stager            RUNNING   pid 720, uptime 4 days, 22:25:39
rucio-conveyor-submitter         RUNNING   pid 727, uptime 4 days, 22:25:39
rucio-judge-cleaner              RUNNING   pid 718, uptime 4 days, 22:25:39
rucio-judge-evaluator            RUNNING   pid 716, uptime 4 days, 22:25:39
rucio-judge-repairer             RUNNING   pid 719, uptime 4 days, 22:25:39
rucio-necromancer                RUNNING   pid 726, uptime 4 days, 22:25:39
rucio-reaper                     RUNNING   pid 721, uptime 4 days, 22:25:39
rucio-transmogrifier             RUNNING   pid 722, uptime 4 days, 22:25:39
rucio-undertaker                 RUNNING   pid 717, uptime 4 days, 22:25:39
supervisor> quit

Set crontab to renew credentials

vi /etc/crontab

and set like so:

SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
# For details see man 4 crontabs
# Example of job definition:
# .---------------- minute (0 - 59)
# |  .------------- hour (0 - 23)
# |  |  .---------- day of month (1 - 31)
# |  |  |  .------- month (1 - 12) OR jan,feb,mar,apr ...
# |  |  |  |  .---- day of week (0 - 6) (Sunday=0 or 7) OR sun,mon,tue,wed,thu,fri,sat
# |  |  |  |  |
# *  *  *  *  * user-name  command to be executed
0 * * * * {your user} grid-proxy-init
0 */3 * * * root fetch-crl
15 */3 * * * root /bin/fts-delegation-init -s https://{FTS3 hostname}:{FTS3 port} --proxy {grid-proxy-init proxy cert path}

Rucio Client

Install client and dependencies

`ssh` to the Rucio client instance. Run:

curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
python get-pip.py
pip2.7 install --upgrade setuptools
pip2.7 install rucio-clients
rpm -Uvh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
yum install -y  gfal2-all gfal2-util osg-ca-certs yum-plugin-priorities osg-gridftp globus-proxy-utils-7.1-1.el7.x86_64

Set the configuration

mkdir -p /opt/rucio/etc && vi /opt/rucio/etc/rucio.cfg

Paste this in, substituting where noted:

[common]
logdir = /var/log/rucio
loglevel = DEBUG
mailtemplatedir=/opt/rucio/etc/mail_templates
[client]
rucio_host = https://{server FQDN}:443
auth_host = https://{server FQDN}:443
auth_type = x509
account = %(RUCIO_ACCOUNT)s
ca_cert = /opt/rucio/etc/web/ca.crt
client_cert = ~/.globus/usercert.pem
client_key = ~/.globus/userkey.pem
client_x509_proxy = $X509_USER_PROXY
request_retries = 3
[policy]
schema = xenon.biggrid.nl
permission = xenon.biggrid.nl
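
To confirm the client can reach and authenticate against the server (this assumes the `jdoe` account bootstrapped via setup_data.py and the X509 identity configured earlier):

export RUCIO_ACCOUNT=jdoe
rucio whoami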

Set crontab to renew credentials

vi /etc/crontab

and set like so:

SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
# For details see man 4 crontabs
# Example of job definition:
# .---------------- minute (0 - 59)
# |  .------------- hour (0 - 23)
# |  |  .---------- day of month (1 - 31)
# |  |  |  .------- month (1 - 12) OR jan,feb,mar,apr ...
# |  |  |  |  .---- day of week (0 - 6) (Sunday=0 or 7) OR sun,mon,tue,wed,thu,fri,sat
# |  |  |  |  |
# *  *  *  *  * user-name  command to be executed
0 * * * * {your user} grid-proxy-init
0 */3 * * * root fetch-crl
15 */3 * * * root /bin/fts-delegation-init -s https://{FTS3 hostname}:{FTS3 port} --proxy {grid-proxy-init proxy cert path}

Get the Client CA Cert

Visit https://support.comodo.com/index.php?/comodo/Knowledgebase/Article/View/991/0/incommonssl-sha-2, and download `incommonrsaserverca-bundle.crt` from the attachments at the bottom of the page. `cat` the file's contents and paste them into /opt/rucio/etc/web/ca.crt (create the /opt/rucio/etc/web directory first if it does not exist).

GridFTP Servers

Install dependencies

Become root, then install the dependencies:

sudo su -
rpm -Uvh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
rpm -Uvh https://repo.opensciencegrid.org/osg/3.4/osg-3.4-el7-release-latest.rpm
yum install -y osg-ca-certs yum-plugin-priorities gfal2-all gfal2-util globus-proxy-utils-7.1-1.el7.x86_64 osg-gridftp

Configure X509 Credentials

useradd {your user}
mkdir /home/{your user}/.globus
cp ~/usercert.pem /home/{your user}/.globus/usercert.pem
cp ~/userkey.pem /home/{your user}/.globus/userkey.pem
chmod 644 /home/{your user}/.globus/usercert.pem
chmod 400 /home/{your user}/.globus/userkey.pem
chown -R {your user}:{your user} /home/{your user}/.globus

This configures the X509 certificate and key. As {your user}, get the X509 DN for the user:

openssl x509 -in ~/.globus/usercert.pem -noout -subject

Run 'vi /etc/grid-security/grid-mapfile' and set that file like so:

"{DN}" {your user}

Remove the pass-phrase from the user key so that credentials can be renewed automatically by crontab:

openssl rsa -in ~/.globus/userkey.pem -out ~/.globus/userkey.pem
chmod 400 ~/.globus/userkey.pem

Configure the server

Run 'vi /etc/sysconfig/globus-gridftp-server' and set it like so:

# Modify these for your firewall
export GLOBUS_TCP_PORT_RANGE=50000,51000
#export GLOBUS_TCP_SOURCE_RANGE=min,max
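
The OpenStack 'gridftp' security group selected earlier should open these ports at the cloud layer; if the instance also runs a local iptables firewall, open the control port and the data-channel range there too, e.g.:

iptables -A INPUT -p tcp --dport 2811 -j ACCEPT
iptables -A INPUT -p tcp --dport 50000:51000 -j ACCEPT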

Set crontab to renew credentials

vi /etc/crontab

and set like so:

SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
# For details see man 4 crontabs
# Example of job definition:
# .---------------- minute (0 - 59)
# |  .------------- hour (0 - 23)
# |  |  .---------- day of month (1 - 31)
# |  |  |  .------- month (1 - 12) OR jan,feb,mar,apr ...
# |  |  |  |  .---- day of week (0 - 6) (Sunday=0 or 7) OR sun,mon,tue,wed,thu,fri,sat
# |  |  |  |  |
# *  *  *  *  * user-name  command to be executed
0 * * * * {your user} grid-proxy-init
0 */3 * * * root fetch-crl
15 */3 * * * root /bin/fts-delegation-init -s https://{FTS3 hostname}:{FTS3 port} --proxy {grid-proxy-init proxy cert path}

Add the attached OpenStack volume

The OpenStack web interface dictates the logical block name; in the previous screenshot, it is /dev/vdc on rucio-gridftp1 after being attached to that instance through the 'Manage Attachments' menu. In the instance's shell, make an XFS filesystem on the logical block:

mkfs.xfs {logical block name}

Get the XFS filesystem UUID and create a `/scratch` mount point:

xfs_admin -u {logical block name}
mkdir /scratch

Set `/etc/fstab` with the UUID and /scratch directory:

#
# /etc/fstab
# Created by anaconda on Mon Jul  3 14:42:53 2017
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
UUID=32860528-52a7-4814-897e-d56563da7040 /                       xfs     defaults        0 0
UUID={UUID from previous step} /scratch  xfs     defaults        0 0
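
With the entry in place, mount the volume and confirm the filesystem is visible before setting ownership:

mount -a
df -h /scratch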

This will map that directory, and all subdirectories and files, to the filesystem you made. Then allow the user from the grid-mapfile to manage that directory and filesystem:

chmod 755 /scratch
chown {your user}:{your user} /scratch

Start the server

globus-gridftp-server -S
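
The -S flag detaches the server into the background. To confirm it is listening on the default control port (2811):

ss -tlnp | grep 2811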

FTS3

`ssh` to the instance configured earlier.

Install Dependencies

Become root, then install the dependencies:

sudo su -
rpm -Uvh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
rpm -Uvh https://repo.opensciencegrid.org/osg/3.4/osg-3.4-el7-release-latest.rpm
yum install -y osg-ca-certs yum-plugin-priorities gfal2-all osg-gridftp gfal2-util
yum install -y fts-server fts-client fts-rest fts-monitoring fts-mysql fts-server-selinux fts-rest-selinux fts-monitoring-selinux fts-msg

Install MariaDB & start it

curl -sS https://downloads.mariadb.com/MariaDB/mariadb_repo_setup | sudo bash && sudo yum install MariaDB-server MariaDB-client && chkconfig mariadb on && service mariadb restart

Set up the database

Run `mysql -u root` (no password is required initially) to log in. Create a database called fts3 with the statement `create database fts3;`. Create a user called `fts3` with `create user fts3;`. Then grant privileges on the fts3 database exclusively for localhost, with the password `fts3pass`:

GRANT ALL ON fts3.* TO 'fts3'@'localhost' IDENTIFIED BY 'fts3pass';
FLUSH PRIVILEGES;
GRANT SUPER ON *.* TO 'fts3'@'localhost' IDENTIFIED BY 'fts3pass';
FLUSH PRIVILEGES;
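
Verify the grants took effect by connecting as the fts3 user:

mysql -u fts3 -pfts3pass fts3 -e "SELECT 1;"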

Set up configuration files

Open /etc/fts3/fts3config and /etc/fts3/fts-msg-monitoring.conf and set them as follows, filling in the site-specific values.

/etc/fts3/fts3config:

# Running user and group
User=fts3
Group=fts3

# mysql only
DbType=mysql

#db username
DbUserName=fts3

#db password
DbPassword=fts3pass

#For MySQL, it has to follow the format 'host/db' (i.e. "mysql-server.example.com/fts3db")
DbConnectString=localhost/fts3

#Number of db connections in the pool (use even number, e.g. 2,4,6,8,etc OR 1 for a single connection)
DbThreadsNum=25

#The alias used for the FTS endpoint, will be published as such in the dashboard transfers UI http://dashb-wlcg-transfers.cern.ch/ui/
Alias=replacethis

#Infosys, either the fqdn:port of a BDII instance or false to disable BDII access
Infosys=false

#Query the info systems specified in the order given, e.g. glue1;glue2
InfoProviders=glue1

#List of authorized VOs, separated by ;
#Leave * to authorize any VO
AuthorizedVO=*

# site name
SiteName=replacethis

#Enable/Disable monitoring using messaging monitoring (disabled=false / enabled=true)
MonitoringMessaging=true

# Profiling interval in seconds. If set to 0, it will be disabled
Profiling=0

# Log directories
TransferLogDirectory=/var/log/fts3/transfers
ServerLogDirectory=/var/log/fts3

# Log level. Enables logging for messages of level >= than configured
# Possible values are
#   TRACE (every detail), DEBUG (internal behaviour), INFO (normal behaviour),
#   NOTICE (final states), WARNING (things worth checking), ERR (internal FTS3 errors, as database connectivity),
#   CRIT (fatal errors, as segmentation fault)
# It is recommended to use DEBUG or INFO
LogLevel=DEBUG

# Check for fts_url_copy processes that do not give their progress back
# CheckStalledTransfers = true
# Stalled timeout, in seconds
# CheckStalledTimeout = 900

# Minimum required free RAM (in MB) for FTS3 to work normally
# If the amount of free RAM goes below the limit, FTS3 will enter auto-drain mode
# This is intended to protect against system resource exhaustion
# MinRequiredFreeRAM = 50

# Maximum number of url copy processes that the node can run
# The RAM limitation may not take into account other node limitations (i.e. IO)
# or, depending on the swapping policy, may not even prevent overloads if the kernel
# starts swapping before the free RAM decreases until it reaches the value of MinRequiredFreeRAM
# 0 disables the check.
# The default is 400.
# MaxUrlCopyProcesses = 400

# Parameters for Bring Online
# Maximum bulk size.
# If the size is too large, it will take more resources (memory and CPU) to generate the requests and
# parse the responses. Some servers may reject the requests if they are too big.
# If it is too small, performance will be reduced.
# Keep it to a sensible size (between 100 and 1k)
# StagingBulkSize=400
# Maximum number of concurrent requests. This gives a maximum of files sent to the storage system
# (StagingBulkSize*StagingConcurrentRequests). The larger the number, the more requests will FTS need to keep track of.
# StagingConcurrentRequests=500
# Seconds to wait before submitting a bulk request, so FTS can accumulate more files per bulk.
# Note that the resolution is 60 seconds.
# StagingWaitingFactor=300
# Retry this number of times if a staging poll fails with ECOMM
# StagingPollRetries=3

# In seconds, interval between heartbeats
# HeartBeatInterval=60
# In seconds, after this interval a host is considered down
# HeartBeatGraceInterval=120

# Seconds between optimizer runs
# OptimizerInterval = 60
# After this time without optimizer updates, force a run
# OptimizerSteadyInterval = 300
# Maximum number of streams per file
# OptimizerMaxStreams = 16

# EMA Alpha factor to reduce the influence of fluctuations
# OptimizerEMAAlpha = 0.1
# Increase step size when the optimizer considers the performance is good
# OptimizerIncreaseStep = 1
# Increase step size when the optimizer considers the performance is good, and set to aggressive or normal
# OptimizerAggressiveIncreaseStep = 2
# Decrease step size when the optimizer considers the performance is bad
# OptimizerDecreaseStep = 1

# Set the bulk size, in number of jobs, used for cleaning the old records
#CleanBulkSize=5000
# In days. Entries older than this will be purged.
#CleanInterval=7

## The higher the values for the following parameters,
## the higher the latency for some operations (as cancelations),
## but can also reduce the system and/or database load

# In seconds, how often to purge the messaging directory
#PurgeMessagingDirectoryInterval = 600
# In seconds, how often to run sanity checks
#CheckSanityStateInterval = 3600
# In seconds, how often to check for canceled transfers
#CancelCheckInterval = 10
# In seconds, how often to check for expired queued transfers
#QueueTimeoutCheckInterval = 300
# In seconds, how often to check for stalled transfers
#ActiveTimeoutCheckInterval = 300
# In seconds, how often to schedule new transfers
#SchedulingInterval = 2
# In seconds, how often to check for messages. Should be less than CheckStalledTimeout/2
#MessagingConsumeInterval = 1

[roles]
Public = transfer
lcgadmin = vo:transfer
production = all:config

/etc/fts3/fts-msg-monitoring.conf:

# Configuration file for the FTS3 monitoring system using messaging
# Fill in: USERNAME, PASSWORD,FQDN

ACTIVE=true
BROKER=:61616
COMPLETE=transfer.fts_monitoring_complete
STATE=transfer.fts_monitoring_state
OPTIMIZER=transfer.fts_monitoring_queue_state
LOGFILEDIR=/var/log/fts3/
LOGFILENAME=msg.log
START=transfer.fts_monitoring_start
TOPIC=true
TTL=24
USE_BROKER_CREDENTIALS=true
PASSWORD=fts3activemq
USERNAME=fts3activemq
FQDN= {rucio server hostname where you previously configured ActiveMQ}

### SSL settings
## Set to true to enable SSL
SSL=false
## Set to false if you don't want to verify the peer certificate
SSL_VERIFY=true
## Set to a .pem file containing the root CA
SSL_ROOT_CA=/etc/grid-security/certificates/CERN-GridCA.pem
## Set to a .pem file containing both the client certificate and private key
SSL_CLIENT_KEYSTORE=
## If the private key is password-protected, enter it here
## If you set this, make sure that only the user that runs fts3 is allowed to read this file!
SSL_CLIENT_KEYSTORE_PASSWORD=

Edit /usr/share/fts-mysql/fts-diff-4.0.1.sql and set the statement to:

ALTER TABLE t_credential CHANGE COLUMN `termination_time` `termination_time` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP;

before upgrading the database. Then run:

python /usr/share/fts/fts-database-upgrade.py

Start the server

# Core FTS3
service fts-server start
service fts-bringonline start
# REST API and Web Monitoring
service httpd restart
# Messaging (optional)
service fts-msg-bulk start
# BDII (optional)
service bdii start
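
To check the REST endpoint end to end, query its /whoami resource with a delegated proxy (the proxy path below matches the one used in the Rucio configuration above, and is an assumption about your local uid):

curl --capath /etc/grid-security/certificates --cert /tmp/x509up_u1001 --key /tmp/x509up_u1001 "https://{FTS3 hostname}:8446/whoami"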

Verify Components & Function

GridFTP

Try the demo transfer (create /scratch/{your username} on each GridFTP server first, or pass gfal-copy the -p flag to create missing destination directories):

grid-proxy-init
cd ~
echo "This is a file for transfer" > uploadable.txt
gfal-copy file:///home/{your username}/uploadable.txt gsiftp://rucio-gridftp1.grid.uchicago.edu/scratch/{your username}/uploadable1.txt
gfal-copy file:///home/{your username}/uploadable.txt gsiftp://rucio-gridftp2.grid.uchicago.edu/scratch/{your username}/uploadable2.txt
gfal-copy gsiftp://rucio-gridftp1.grid.uchicago.edu/scratch/{your username}/uploadable1.txt file:///home/{your username}/uploadable1.txt
gfal-copy gsiftp://rucio-gridftp2.grid.uchicago.edu/scratch/{your username}/uploadable2.txt file:///home/{your username}/uploadable2.txt
cmp --print-bytes uploadable1.txt uploadable.txt
cmp --print-bytes uploadable2.txt uploadable.txt

FTS3

Check the web interfaces

Add your InCommon user cert to your browser. Navigate to https://{hostname}:8449/fts3/ftsmon/#/ and https://{hostname}:8446/fts3/ftsmon/#/; you should see a web interface and some user-specific information. If you followed the crontab setup above, the proxy should already be delegated.

Replicate an RSE with FTS3

Working through the New User Start Guide (below) will exercise FTS3 by replicating data between the RSEs.

Rucio

Add two GridFTP RSEs

rucio-admin rse add {RSE1 name}
rucio-admin rse add-protocol --hostname {hostname} --scheme gsiftp --prefix {directory for GridFTP1 writes} --port {port number} --impl rucio.rse.protocols.gfal.Default --domain-json '{"wan": {"read": 1, "write": 1, "delete": 1, "third_party_copy": 1}, "lan": {"read": 1, "write": 1, "delete": 1}}' {RSE1 name}
rucio-admin rse add {RSE2 name}
rucio-admin rse add-protocol --hostname {hostname} --scheme gsiftp --prefix {directory for GridFTP2 writes} --port {port number} --impl rucio.rse.protocols.gfal.Default --domain-json '{"wan": {"read": 1, "write": 1, "delete": 1, "third_party_copy": 1}, "lan": {"read": 1, "write": 1, "delete": 1}}' {RSE2 name}

Connect the RSE to FTS3

The FTS3 REST endpoint (by default, on port 8446) should be set as the 'fts' attribute of any RSE which will replicate through FTS3:
rucio-admin rse set-attribute --rse {RSE1 name} --key fts --value https://{FTS3 hostname}:{REST port}
rucio-admin rse set-attribute --rse {RSE2 name} --key fts --value https://{FTS3 hostname}:{REST port}

Create the RSE Graph

Establish a graph relation between two RSEs with the following command:
rucio-admin rse add-distance --ranking {integer} --distance {integer} {RSE 1} {RSE 2}

Any FTS3 transfer needs to have a path on the graph; so, if RSE 2 should replicate to RSE 1, and RSE 1 should replicate to RSE 2, you will need to add that path as well:
rucio-admin rse add-distance --ranking {integer} --distance {integer} {RSE 2} {RSE 1}

Ranking and distance can be tuned to improve transfer routing and network performance.
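
With the RSEs registered, pointed at FTS3, and linked on the graph, a minimal end-to-end sketch (this assumes the `jdoe` account and a `user.jdoe` scope, which the New User Start Guide covers in detail):

rucio upload --rse {RSE1 name} --scope user.jdoe uploadable.txt
rucio add-rule user.jdoe:uploadable.txt 1 {RSE2 name}
rucio list-rules user.jdoe:uploadable.txt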

Work through the New User Start Guide

Follow the instructions at http://twiki.mwt2.org/bin/view/Main/RucioNewUserStartGuide to verify your infrastructure.

Adding a new user to the system

On the client instance:

sudo su -
useradd {new username}
mkdir -p /home/{new username}/.ssh /home/{new username}/.globus
touch /home/{new username}/.ssh/authorized_keys
chmod 700 /home/{new username}/.ssh
chmod 600 /home/{new username}/.ssh/authorized_keys
chown -R {new username}:{new username} /home/{new username}/.ssh /home/{new username}/.globus

Paste the new user's RSA public key into /home/{new username}/.ssh/authorized_keys.

As the Rucio root account, create a new account for the username. Attach an X509 identity to that account. Add a permission attribute (Note: here this is 'admin', which is considered dangerous). Remove the quota for the user's RSEs:

rucio-admin account add {new username}
rucio-admin identity add --account {new username} --type X509 --id {new user's X509 DN, in quotes, i.e. "/DC=org/DC=cilogon/C=US/O=University of Chicago/CN=Foo Barbaz A1337"} --email {user's email address}
rucio-admin account add-attribute {new username} --key admin --value ''
rucio-admin account set-limits {new username} {RSE} -1

Then log in to each GridFTP instance and add the new user's X509 DN to the grid-mapfile, mapping it to the controlling identity.

-- DavidManglano - 09 Apr 2019
