Panda Pilot Submitter
Overview
This documentation describes a web-based Panda pilot submission interface. This web frontend provides access to a number of Panda job schedulers and CEs (compute elements) on OSG. It is built with Django (a high-level, Python-based web framework) and gives access to the lower-level grid submission tools (Condor-G, etc.) and to the Panda pilot code and configuration files. More information about the Panda system can be found in its twiki. Information about the Panda Job Scheduler component is particularly relevant.
The Panda Pilot Submitter provides:
- A user interface to submit new pilots and to view past submissions, the status of submit hosts, and information about compute elements.
- An administrative interface to add and configure new submit hosts and compute elements.
Additionally, a user with administrator privileges can add one or more backends (grid job submission service hosts) from the administrative interface. An admin user can also update information about available CE resources by uploading the files sitenfo.py (containing CE information) and storageaccess_info.py (containing additional CE attributes required for successful pilot execution).
Note: The Panda Pilot Submitter service (alt link if the previous is not working) is under continuous development, so there may be interruptions in the service. Please contact Marco Mambelli if the server is down.
Access
To use the system, a username and password are required. To request an account, send email to Marco.
Job submission
To submit Panda pilots or simple test jobs, go to the submit page (alt link if the previous is not working), select a submit host and one or more CEs, and submit the job. The following options are available:
CE selection options
- Send to a single CE: check the box "Check to select a single CE" and choose the specific CE from the pull-down menu.
- Send pilots to any available CE: leave the "Check to select a single CE" box unchecked.
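The two selection modes can be pictured with a short sketch. This is only an illustration of the behavior described above, not the submitter's actual code; the function name `select_target_ces` and its arguments are hypothetical.

```python
def select_target_ces(available_ces, single_ce=None):
    """Return the list of CEs a submission would target.

    available_ces: names of the CEs known to the chosen submit host.
    single_ce: the CE picked from the pull-down menu, or None when the
               "Check to select a single CE" box is left unchecked.
    (Both names are hypothetical, chosen for this illustration.)
    """
    if single_ce is None:
        # Box unchecked: pilots may go to any available CE.
        return list(available_ces)
    if single_ce not in available_ces:
        raise ValueError("unknown CE: %s" % single_ce)
    # Box checked: pilots go only to the selected CE.
    return [single_ce]
```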
Available pilot types
- default - same as pilot2
- pilot2 - the production release of the Panda pilot (runs ATLAS jobs). A description of the Panda pilot is available on its CERN twiki page.
- sleep job - sleeps for a random amount of time between 1 and 1800 seconds (useful to test a queue)
- test - a simple job that checks the CE execution environment. It is a simple Python executable; the code is available here in CVS.
- old Panda pilot - a previous version of the Panda pilot code
- The other available pilot types are further tests and Kit Validation executions.
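To give a feel for what the sleep and test job types do, here is a minimal sketch. It is not the code kept in CVS; the function names, the `--sleep` flag, and the set of environment values reported are assumptions made for illustration. Only the 1 to 1800 second range comes from the description above.

```python
import os
import platform
import random
import sys
import time

# Range taken from the "sleep job" description above.
MIN_SLEEP_SEC = 1
MAX_SLEEP_SEC = 1800


def pick_sleep_seconds(rng=random):
    """Pick a random sleep duration; randint includes both endpoints."""
    return rng.randint(MIN_SLEEP_SEC, MAX_SLEEP_SEC)


def describe_environment():
    """Collect a few facts about the node the job landed on.

    The selection of facts is an assumption; the real test job may
    check different things.
    """
    return {
        "hostname": platform.node(),
        "python_version": platform.python_version(),
        "working_dir": os.getcwd(),
        "path": os.environ.get("PATH", ""),
    }


if __name__ == "__main__":
    # Print the environment report so it ends up in the job's stdout.
    for key, value in sorted(describe_environment().items()):
        print("%s: %s" % (key, value))
    # Hypothetical flag: behave like the sleep job when requested.
    if "--sleep" in sys.argv:
        time.sleep(pick_sleep_seconds())
```

A job like this is useful for queue testing because it exercises submission, scheduling, and output retrieval without depending on any Panda software being installed on the CE.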
Monitoring
The interface additionally provides:
- Access to Panda configuration files currently in use which describe static parameters of CEs.
- Links to job submission results including information for both active and completed jobs. For each job you can see:
- the results of the Condor-G submission, including the current status of the pilot (unsubmitted, in queue, running, completed, failed)
- information about job execution coming from Panda monitoring (Panda monitor documentation)
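The pilot states listed above could be derived from Condor's numeric JobStatus attribute roughly as follows. The Condor codes (1 Idle, 2 Running, 3 Removed, 4 Completed, 5 Held) are standard, but the mapping onto the interface's labels is an assumption, in particular treating removed and held jobs as failed; the `"unknown"` fallback is also not one of the interface's states.

```python
# Assumed mapping from Condor JobStatus codes to the labels shown in
# the interface. The real service may classify jobs differently.
CONDOR_TO_PILOT_STATUS = {
    1: "in queue",   # Idle
    2: "running",    # Running
    3: "failed",     # Removed (assumed to count as failed)
    4: "completed",  # Completed
    5: "failed",     # Held (assumed to count as failed)
}


def pilot_status(condor_job_status):
    """Translate a Condor JobStatus code into a display label.

    None stands for a pilot that has not been handed to Condor-G yet.
    """
    if condor_job_status is None:
        return "unsubmitted"
    # Fallback label for codes outside the mapping (hypothetical).
    return CONDOR_TO_PILOT_STATUS.get(condor_job_status, "unknown")
```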
CE testing for site administrators
An important use of the Panda Pilot Submitter is to troubleshoot CEs for Panda job execution. Most of the time the Condor-G submission of the pilots (from the submit host behind the service to the remote CE) completes successfully, but check the monitoring links provided in the interface to be sure. Then, to confirm that the CE can successfully execute Panda pilots, select the default pilot type and verify that the jobs appear correctly in the Panda dashboard.
CE testing in summary:
- Check that the information about the CE is correct: from the CE list, click on the name of your CE and compare the information displayed with what you know about your CE (and what you used for the OSG configuration).
- If you have a new CE, send the information about it to Marco.
- Go to the submit page and send a few pilots.
- Check the submit host status (using the view of the submit host used for job submission) to verify that the pilots were sent.
- Check the job scheduler locally (your PBS, LSF, or Condor job manager) to see that the pilots appear in the queue and run.
- If Panda pilots are sent, check the Panda production dashboard to verify that the central server is aware of the pilots (this will not work right away for new sites since CE information has to be updated at the central servers at BNL).
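For the local scheduler check in the list above, a site administrator might wrap the usual queue-listing commands in a small helper. The sketch below uses the standard `qstat` (PBS), `bjobs` (LSF), and `condor_q` (Condor) commands without options; the helper names are hypothetical, and any options needed to see jobs of the account the pilots run under are site specific.

```python
import subprocess

# Standard queue-listing commands, run without options.
QUEUE_COMMANDS = {
    "pbs": ["qstat"],
    "lsf": ["bjobs"],
    "condor": ["condor_q"],
}


def queue_command(scheduler):
    """Return the command that lists the local queue for a scheduler."""
    try:
        return list(QUEUE_COMMANDS[scheduler.lower()])
    except KeyError:
        raise ValueError("unsupported scheduler: %s" % scheduler)


def list_local_queue(scheduler):
    """Run the queue command and return its standard output as text.

    Must be run on a host where the scheduler's client tools are
    installed; requires Python 3.7+ for capture_output.
    """
    result = subprocess.run(
        queue_command(scheduler), capture_output=True, text=True, check=False
    )
    return result.stdout
```

Running `list_local_queue("condor")` on the CE head node shortly after submitting should show the pilots queued or running.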
System installation
A Panda Pilot Submitter involves the following components:
- a working Panda Job Submitter
- the Django python web framework
- the Panda Pilot Submitter web tools
To install a PandaJobScheduler, use:
pacman -get GCL:PandaJS
The installation takes around 20 minutes (network dependent) and uses more than 600MB of disk space. A more detailed description of the installation and configuration of a Panda Job Submitter is available here. Once you have installed a submit host, you can use the pusher utility to send Panda pilots. The command-line interface is more complex than the web interface but also more flexible and powerful.
To install the Django framework, follow the instructions provided at the Django website: download Django 0.96 and follow the instructions to set it up.
Django 0.96 or newer is required. The Panda Pilot Submitter has been developed with 0.96 but there is a commitment to keep it working with future Django releases.
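A quick way to confirm that an installation meets this requirement is to compare Django's `VERSION` tuple against the minimum. The sketch below is an illustration, not part of the Panda Pilot Submitter; the helper names are hypothetical, while `django.VERSION` is Django's own version tuple.

```python
# Minimum release stated above.
MINIMUM_DJANGO = (0, 96)


def meets_minimum(version, minimum=MINIMUM_DJANGO):
    """Compare the (major, minor) part of a version tuple to the minimum."""
    return tuple(version[:2]) >= minimum


def installed_django_version():
    """Return django.VERSION, or None if Django cannot be imported."""
    try:
        import django
    except ImportError:
        return None
    return django.VERSION


if __name__ == "__main__":
    version = installed_django_version()
    if version is None:
        print("Django is not installed")
    elif meets_minimum(version):
        print("Django version is recent enough")
    else:
        print("Django is older than 0.96")
```

Only the first two components are compared because the remaining entries of `django.VERSION` differ in type between releases.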
To install the Panda Pilot Submitter web tools, send an email to Marco. You will receive a tarfile with the application to expand into your Django installation directory.
(more detail needed here - can you put this into CVS, at a minimum repo.mwt2.org)
Once everything is installed you can use the administrative interface to add a new submit host and upload the information about CEs.
FAQ
Can I install a Panda Pilot Submission server?
Yes. Note that while the web frontend was designed to give site administrators an easy-to-use tool that hides the complexity of the PandaJobScheduler, this system is relatively new and should be considered a prototype. We suggest trying out the existing server and providing Marco with feedback.
How do I test a new CE?
If the CE is not in the list of those supported by a current submit host, you can either install your own submit host or (recommended) send the server administrator the necessary information about the CE so that it can be added. To see what information is needed, check what is displayed for the existing CEs.
--
MarcoMambelli - 18 Jun 2007