r4 - 01 Aug 2007 - 10:34:14 - RobertBall



The goal of this task is to configure sites to schedule analysis pilots with priority.

Panda can submit two types of pilots to a local batch system: the usual production pilot and a user job pilot, typically for an analysis task. Both types use the same pilot.py code, but an analysis pilot is given an additional command line argument (-u) that identifies it as an analysis pilot. Panda's submission mechanism controls the rate at which each type of pilot is submitted to a computing site.

Panda's submission mechanism, based on the siteinfo.py file, maintains separate sites for production and analysis. Analysis site names begin with "ANALY_", and an analysis site can specify different features, if needed, from those of the related production site. The first step in supporting analysis is to define an appropriate entry in siteinfo.py and ask that the site be enabled.

The second step is to prioritize the received analysis jobs within the local batch system so that they are executed before production jobs. The exact method for doing this will vary by batch system and scheduling practices, but the common problem is allowing the batch system to identify incoming pilots as analysis jobs or production jobs. Both types of jobs will arrive through a Globus jobmanager interface that does not know the difference between analysis and production jobs. The definition of an analysis site in siteinfo.py offers a method to distinguish an analysis job.

The eighth parameter of a site definition in siteinfo.py allows GRAM RSL parameters to be included when the pilot is submitted through Globus. Consider a computing site that uses PBS, where jobs are executed within queues and two execution queues are defined: default_q and analy_q. Further, suppose that scheduling is strictly by queue, so that any job in analy_q is executed before any job in default_q. The analysis site definition in siteinfo.py can then use the eighth parameter to specify that analysis pilots are submitted to analy_q. An alternative approach is to use the GRAM parameter to request different walltimes for the two job types and let the batch system prioritize jobs shortest job first.
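
As a rough illustration of the two approaches above, the sketch below builds the kind of GRAM RSL fragment that might be placed in that eighth field. The queue name analy_q comes from the example above. The helper name rsl_fragment, the walltime values, and the assumption that the field takes a plain RSL string are all hypothetical and not taken from siteinfo.py, so check the real file before copying anything.

```python
def rsl_fragment(**attributes):
    """Render keyword arguments as concatenated GRAM RSL relations.

    Hypothetical helper for illustration; it is not part of Panda.
    """
    return "".join(f"({name}={value})" for name, value in attributes.items())


# Approach 1: route analysis pilots to the dedicated PBS queue.
analysis_by_queue = rsl_fragment(queue="analy_q")

# Approach 2: request a shorter walltime for analysis pilots so that a
# shortest-job-first policy favors them. GRAM's maxWallTime is given in
# minutes; the values here are made up for the example.
analysis_by_walltime = rsl_fragment(maxWallTime=120)
production_by_walltime = rsl_fragment(maxWallTime=2880)

print(analysis_by_queue)
print(analysis_by_walltime)
print(production_by_walltime)
```

Which approach fits depends on whether the local scheduler orders jobs by queue or by requested walltime.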

The third step is to control the number of jobs executing within each job class. Ideally, some fraction of the CPUs should always be available for immediate analysis work. This contributes to the success of Panda-based analysis by reducing the wait time experienced by users. It implies that the number of running production jobs must be capped below the number of processors. How to do this will vary by batch system and the scheduling policies in force.
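
The arithmetic behind such a cap is simple, and the sketch below works it through. The function name and the example numbers are hypothetical, and how the resulting limit is enforced depends entirely on the local batch system.

```python
def production_job_cap(total_cpus, analysis_percent):
    """Return the largest number of running production jobs that still
    leaves analysis_percent of the CPUs free for analysis pilots.

    The reserved share is rounded up so that analysis never gets less
    than the requested percentage.
    """
    if total_cpus < 0:
        raise ValueError("total_cpus must be non-negative")
    if not 0 <= analysis_percent <= 100:
        raise ValueError("analysis_percent must be between 0 and 100")
    # Integer ceiling division avoids floating-point rounding surprises.
    reserved = (total_cpus * analysis_percent + 99) // 100
    return total_cpus - reserved


# Example: a 200-CPU site that keeps 10% of its CPUs free for analysis.
print(production_job_cap(200, 10))
```

With these example numbers the site would cap running production jobs at 180, leaving 20 CPUs for analysis pilots.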

Requirements for Panda integration

Information needed from sites for Panda integration.

Tips, experiences from sites with PBS job schedulers

Tips, experiences from sites with Condor job schedulers

Local tests at AGLT2 show that adding a priority argument to the Condor submission, as shown below, causes the higher-priority submission to begin running sooner than submissions with no such argument, even if it was submitted later. However, because of the light job load (until recently) and urgent work by both Mark Sosebee and me, we have not yet performed this test with a Panda submission.
condor_submit  <usual stuff> -append "priority = 5"
The default priority is 0; the allowed range is -20 to +20.
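
For sites that script their submissions, the command above could be assembled along the following lines. This is a minimal sketch: the function name and the submit file name pilot.sub are hypothetical stand-ins for the "usual stuff", and the -append option is placed before the submit file here, following the usual options-first usage of condor_submit.

```python
import shlex


def condor_submit_command(submit_file, priority=0):
    """Build a condor_submit argument list that appends a job priority.

    The -20..+20 range check mirrors the range quoted on this page.
    """
    if not -20 <= priority <= 20:
        raise ValueError("priority must be between -20 and +20")
    return ["condor_submit", "-append", f"priority = {priority}", submit_file]


# "pilot.sub" is a placeholder submit description file.
cmd = condor_submit_command("pilot.sub", priority=5)

# Print a shell-quoted form for inspection; pass cmd to subprocess.run
# to actually submit.
print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell quoting problems with the spaces inside "priority = 5".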

Bob Ball - 1 August, 2007

-- RobertGardner - 22 Jun 2007
