r22 - 08 Feb 2007 - 17:16:34 - TorreWenaus

Panda project plans

High level objectives

Panda tasks, priorities, assignments

Task items below appear as (you can cut/paste this as a template):

Short description (may be wiki link to a page about the item) Priority, status Assigned to
Complete description. Be verbose, include links to relevant email and info, etc. If the task warrants a wiki page of its own, make one and link it here

Hot list: top priority tasks

Panda server (task buffer, brokerage, dispatcher, data service)

Analysis support (pathena, data access for analysis)

User datasets holding files processed so far MEDIUM Tadashi
Users typically won't succeed in processing a complete dataset on the first (or second) attempt, and anyway the dataset may not be complete when first processed. Need a user-level specification of the subset of files that have been processed so far. Support via a user dataset containing files processed so far. Use this dataset to implement 'process any newly available files' functionality in pathena.
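The 'process any newly available files' step then reduces to a set difference between the input dataset's current file list and the user's processed-so-far dataset. A minimal sketch (the function and argument names are illustrative, not pathena's actual interface):

```python
def newly_available(dataset_files, processed_files):
    """Files in the input dataset that are not yet in the user's
    'processed so far' dataset (compared by LFN)."""
    return sorted(set(dataset_files) - set(processed_files))
```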

pathena sub-file job partitioning MEDIUM Sergey
Support partitioning of pathena jobs with a granularity smaller than the file level. Once AOD merging is in production the AOD files will be larger than the appropriate granularity for individual jobs. Need to partition jobs within files into event sets processed by different jobs.
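The within-file partitioning amounts to slicing a file's event count into contiguous ranges, one per job. A sketch, assuming jobs receive inclusive, 1-indexed (first, last) event ranges (how the range is passed to Athena job options is not shown):

```python
def partition_events(n_events, events_per_job):
    """Split n_events into contiguous, inclusive (first, last) ranges
    of at most events_per_job events each, one range per job."""
    ranges = []
    first = 1
    while first <= n_events:
        last = min(first + events_per_job - 1, n_events)
        ranges.append((first, last))
        first = last + 1
    return ranges
```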

Posix PFN extraction and cataloging for DB-based tag processing LOW Glasgow?
pathena supports file-based tag processing via a ROOT macro that scans the files for POOL refs so they can be resolved via the LRC, translated to posix-appropriate PFNs, and recorded in an XML catalog for Athena use. The same must be done for DB-based tags, presumably for the tag DB as a whole, populating a large posix-access catalog (probably not an XML file) for Athena use.

Data file marshalling for selections LOW
A collection or a selection out of a collection can define (through the refs therein) a file set of interest for marshalling and replication to an external site for processing. Need a tool to do this marshalling: build the list of files from the refs, create a new dataset for the needed files, and register its incomplete locations based on locality of constituent files (not necessarily trivial) such that it can be replicated by subscription.

Production pilot (analysis and managed production)

Pilot recovery and cleanup system HIGH Paul
Subsequent pilots arriving on a WN where a previous job failed recover the output and clean up after the failed job.

Production job scheduler

TestPilot pilot

Switch to Tadashi's webdav area for staging scripts on web HIGH Torre
If you upload a file to the dev server, the trf will fetch it with wget https://gridui02.usatlas.bnl.gov:26443/cache/***

For the production server: wget https://gridui01.usatlas.bnl.gov:25443/cache/***

DDM plugins HIGH Torre
Add generic DDM plugins, with an OSG production instance based on DQ2ProdClient2 and an LCG instance based on dq2_*

TestPilot job scheduler

DDM (pilot data handling, site data handling, data transport, data access, file-level validation)

Timeout on dCache in pilot URGENT Paul
dccp hangs regularly. Right now the pilot hangs with it, ultimately dying with lost heartbeat. It shouldn't. Protect file movement in the pilot with timeouts on a separate thread doing the transfer.
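One way to do this, sketched below: run the transfer command in a child process and arm a watchdog timer on a separate thread that kills it on expiry, so the pilot's main flow (and heartbeat) never blocks on a hung dccp. Command and timeout values are illustrative.

```python
import subprocess
import threading

def copy_with_timeout(cmd, timeout):
    """Run a file-transfer command (e.g. ['dccp', src, dst]) in a
    child process; a watchdog thread kills it after `timeout` seconds
    so a hung transfer cannot take the pilot down with it."""
    proc = subprocess.Popen(cmd)
    timer = threading.Timer(timeout, proc.kill)  # watchdog thread
    timer.start()
    try:
        rc = proc.wait()
    finally:
        timer.cancel()
    if rc != 0:
        raise RuntimeError("transfer failed or timed out (rc=%d)" % rc)
    return rc
```

On timeout the child is killed and `wait()` returns a nonzero code, which the pilot can then report as a transfer error instead of dying with a lost heartbeat.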

OSG installation instructions for DDM URGENT Wensheng
We need OSG-specific installation instructions, and corresponding installation scripts, for DQ2 on OSG and for OSG-specific components such as the site http service, MySQL, etc.

All DDM scripts in CVS URGENT Wensheng
All scripts used by Panda DDM, sites, etc. have to be maintained in CVS. Packaging procedures like pacman cache building should then pick up the scripts from CVS, e.g. in the scripts/ area under Production/panda in ATLAS CVS.

xrootd evaluation MEDIUM
Implement xrootd as supported remote file access mechanism and evaluate usage with pathena

SRM/xrootd interface MEDIUM SLAC

xrootd support for file stage-out to WN disk MEDIUM SLAC

dcap vs. dccp to WN MEDIUM
Statements at FNAL dCache workshop: "dcap is a simple thin posix interface. Much more efficient than file transfers. Very small overhead." Is posix I/O via dcap an improvement on dccp to WN for fully-processed files? Evaluate.
And another statement: "srmcp cannot do more than 3-5 files at once or XML gets too big and the protocol (SOAP) too slow."

Databases (database servers, services, schema)

Validate and deploy grid authentication to MySQL URGENT Wensheng, Yuri

Second tier archival tables HIGH Yuri, Tadashi
After a few months, do a second migration out of the archive table into dedicated long-term archival tables, each spanning a month (or N months) and named for the month. This should scale; any scripts producing history plots and such will know where to find the data, and working with a smaller archive table will help the monitoring. Do the same for the job table.
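A sketch of the month-based naming and migration; the base table name, the modificationTime date column, and the MySQL syntax are assumptions about the actual schema:

```python
def archival_table(base, year, month):
    """Name of the dedicated long-term archival table for one month,
    e.g. archival_table('jobsArchived', 2006, 11) -> 'jobsArchived_200611'
    (naming convention is illustrative)."""
    return "%s_%04d%02d" % (base, year, month)

def migration_sql(base, year, month):
    """SQL statements moving one month's rows out of the rolling
    archive table into its monthly table (MySQL syntax assumed)."""
    dest = archival_table(base, year, month)
    where = ("YEAR(modificationTime)=%d AND MONTH(modificationTime)=%d"
             % (year, month))
    return [
        "CREATE TABLE IF NOT EXISTS %s LIKE %s" % (dest, base),
        "INSERT INTO %s SELECT * FROM %s WHERE %s" % (dest, base, where),
        "DELETE FROM %s WHERE %s" % (base, where),
    ]
```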

Database schema revision 5 HIGH Yuri, Sudhamsh at CERN
Schema changes for revision 5:

Job table:

  • Add taskID mediumint(9) after jobDefinitionID, for convenience
  • Add encrypt varchar(250) after transformation, to store an encryption of the transformation (and other?) fields for use in RSA key pair authentication of job record content (250 suggested because transformation is 250. Should they both be shortened?)
  • Add sourceGroup varchar(32) after prodSourceLabel to record group (eg physics working group) for jobs that are run under the auspices of a group (and are accounted accordingly)
  • Add VO varchar(32) after sourceGroup to record virtual organization of the job, such that non-ATLAS usage can be supported and accounted.
  • Add app varchar(64) after homepackage to designate the application being run by the job, to allow brokerage across applications in a VO
  • Add pilotVersion varchar(32) after pilotID to designate pilot version/type, to allow matchmaking of jobs requiring a particular pilot type/version to compatible pilots. If pilot reports a pilotversion to the dispatcher when requesting a job, the dispatcher should only allocate it a job with a matching pilotversion value.
  • Change prodSourceLabel to varchar(30) rather than enum
  • Remove transExitCode (and do not add transExitDiag)
  • Remove xxErrorCode, xxErrorDiag, xx = sup, brokerage, jobDispatcher, taskBuffer
  • Add serverErrorCode, serverErrorDiag, replacing the brokerage etc. ones
  • New proposal: Change Diag error descriptions to varchar(128): long enough for useful description, and saves space
  • New proposal: Change prodUserID to varchar(128): long enough, saves space

Feb 2007 addition:

  • Add a GridCert column for the DN of the grid certificate of the job submitter. Needed because we now allow the submitter to specify the DN used in prodUserID for accounting purposes.
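The changes above map onto ALTER TABLE statements roughly as follows. This is a sketch: the MySQL syntax, the types chosen for serverErrorCode and GridCert, and applying one statement list per job table are assumptions.

```python
# One ALTER statement per revision 5 change; %(t)s is the job table name.
SCHEMA_R5 = [
    "ALTER TABLE %(t)s ADD COLUMN taskID MEDIUMINT(9) AFTER jobDefinitionID",
    "ALTER TABLE %(t)s ADD COLUMN encrypt VARCHAR(250) AFTER transformation",
    "ALTER TABLE %(t)s ADD COLUMN sourceGroup VARCHAR(32) AFTER prodSourceLabel",
    "ALTER TABLE %(t)s ADD COLUMN VO VARCHAR(32) AFTER sourceGroup",
    "ALTER TABLE %(t)s ADD COLUMN app VARCHAR(64) AFTER homepackage",
    "ALTER TABLE %(t)s ADD COLUMN pilotVersion VARCHAR(32) AFTER pilotID",
    "ALTER TABLE %(t)s MODIFY prodSourceLabel VARCHAR(30)",
    "ALTER TABLE %(t)s DROP COLUMN transExitCode",
]
# Drop the per-component error columns, replaced by server-level ones.
for xx in ('sup', 'brokerage', 'jobDispatcher', 'taskBuffer'):
    SCHEMA_R5.append("ALTER TABLE %%(t)s DROP COLUMN %sErrorCode" % xx)
    SCHEMA_R5.append("ALTER TABLE %%(t)s DROP COLUMN %sErrorDiag" % xx)
SCHEMA_R5 += [
    "ALTER TABLE %(t)s ADD COLUMN serverErrorCode INT",
    "ALTER TABLE %(t)s ADD COLUMN serverErrorDiag VARCHAR(128)",
    "ALTER TABLE %(t)s MODIFY prodUserID VARCHAR(128)",
    # Feb 2007 addition: DN of the submitter's grid certificate
    "ALTER TABLE %(t)s ADD COLUMN GridCert VARCHAR(250)",
]

def schema_r5_sql(table):
    """Instantiate the revision 5 statements for one job table."""
    return [s % {'t': table} for s in SCHEMA_R5]
```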

Facility services (servers, data storage and other facility issues at Tier 1 and Tier 2s)

Migrate BNL LRC to new dedicated server URGENT Yuri, Wensheng

Data flow monitoring between sites, throughput plots HIGH Tier1

xrootd testbed including SRM/xrootd to run xrootd tests MEDIUM Tier1? SLAC?

dCache deployed in production at all US Tier 2s MEDIUM Tier2s
Agreed at the FNAL dCache workshop to do this by ~end 2006

Data validation

Prodsys interface and integration

Monitor - operations, performance metrics, site support

Expand service health check to include LRCs HIGH

dCache space reporting via http service HIGH Torre
From Dan: The two sites so configured are IU_BC/IU_BANDICOOT and UC_VOB:

curl http://bandicoot.uits.indiana.edu:8000/dq2/space/default

curl http://tier2-05.uchicago.edu:8000/dq2/space/default

Operations history HIGH Prem
Operations statistics currently are only kept in a current snapshot. Implement a history table to which snapshots are migrated to record an operations history, and implement plotting tools to display the history.

Database plotting system HIGH Sudhamsh
Many Panda DBs record info we would like to have in plot form; job tables are just one example. Provide a generic plotting system which, given a DB, SQL describing the values to be plotted, and other plot parameters (user-controllable via URL API or form), produces the plot.
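A sketch of the URL-API side, decoding a plot request into a spec the plotting backend acts on. The parameter names and defaults are hypothetical, not an existing monitor API:

```python
from urllib.parse import parse_qs

def plot_request(query_string):
    """Decode a (hypothetical) request such as
    ?db=pandamon&sql=SELECT+...&type=bar&title=Jobs+per+day
    into a plot specification dict for the plotting backend."""
    params = parse_qs(query_string)
    return {
        'db': params.get('db', ['pandamon'])[0],
        'sql': params['sql'][0],  # SQL selecting the values to plot
        'type': params.get('type', ['bar'])[0],
        'title': params.get('title', [''])[0],
    }
```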

Scheduler status page HIGH
List and status of (non) operating schedulers

Job duration monitoring HIGH
  • Plot distributions for job duration, latencies for activated and running
    • Separately for production and analysis
  • Alarms for anomalies
    • too-long jobs
    • too-long waits in assigned, waiting for activation

Job status analyzer HIGH
For assigned/waiting jobs, analyze why they are waiting; put the info on job pages and send alarms to the alarm system.

Job pages - subscription info HIGH
Links to associated dispatch, destination subscriptions

WN info from pilots MEDIUM
Memory usage, CPU power, load from heartbeats

Efficiency, specint resources, throughput MEDIUM
Use CPU power formula for efficiency, resource, throughput metrics

Monitor - user analysis, accounting

Provide global jobID to pathena HIGH Torre
pathena's user job IDs are currently recorded in a local disk file. As a result, pathena runs in different environments (BNL vs CERN) can result in duplicated jobIDs. Use the monitor's user DB to provide http API to retrieve latest jobID -- and other info -- for pathena and other client use.
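A client-side sketch of such an http API. The query parameter and the plain-text 'jobid=NNN' reply format are assumptions about what the monitor could return, not its actual interface; the base URL is the monitor instance cited elsewhere on this page.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# base URL of the monitor's query interface (example instance)
MON_URL = "http://gridui02.usatlas.bnl.gov:25880/server/pandamon/query"

def parse_jobid(reply):
    """Parse a hypothetical 'jobid=NNN' plain-text reply;
    returns 0 when the user has no recorded jobs yet."""
    for line in reply.splitlines():
        if line.startswith('jobid='):
            return int(line.split('=', 1)[1])
    return 0

def latest_jobid(user, base_url=MON_URL):
    """Fetch the user's latest global jobID from the monitor's user DB,
    so pathena can allocate the next ID consistently across sites."""
    url = "%s?%s" % (base_url, urlencode({'maxjobid': user}))
    return parse_jobid(urlopen(url).read().decode())
```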

User stats table, job history MEDIUM
Jobs run that day, success rate, where ran, datasets used,...

User level datasets library MEDIUM

Monitor info, functions via the command line MEDIUM

Extend quota system to groups (eg PWGs) LOW Sudhamsh
Once group level usage of Panda is in use, populate the group-level quota information and extend the API and monitor to support group quotas.

Email notification MEDIUM
Optional email notification of job start, finish, error; selective user subscription info

Monitor - dataset browser and DDM

Access to recent file lists for SEs HIGH
eg. http://gridui02.usatlas.bnl.gov:25880/server/pandamon/query?overview=recentfiles&site=AGLT2

Exclude empty replicas HIGH
On dataset pages, do not show as replicas (or flag with a warning) sites that hold few or none (under 25%) of the files.
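The filter itself is simple; a sketch assuming the monitor has per-site file counts for the dataset (obtaining those counts is the harder part, covered by the completeness-extraction tasks below):

```python
def replica_sites(site_file_counts, total_files, threshold=0.25):
    """Sites worth showing as replicas: those holding at least
    `threshold` (default 25%) of the dataset's files.
    site_file_counts: {site: number of the dataset's files at the site}."""
    if not total_files:
        return []
    return sorted(site for site, n in site_file_counts.items()
                  if float(n) / total_files >= threshold)
```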

replica catalog extractions HIGH Torre, Alexei
Replica catalog scanners in cron on Tier 1s to extract info in usable form for fast comprehensive replica info in monitor (dataset completeness per site)

'View originating dataset for this file' link in file view HIGH Torre

Extract complete metadata to the pickle DB for dataset browser HIGH Torre
Dataset metadata accessible efficiently to the browser is woefully incomplete and very expensive to access dynamically. Need to extract more to the pickle DB. Problematic because DQ2 metadata retrieval is so slow that it is hard to pull it out even with a long-running cron job.

Extract dataset completeness at sites to the pickle DB for dataset browser HIGH Torre
Monitor needs to provide info on which files of a dataset are actually at which site, fast and dynamically via cached info. Percent complete and so on. Fast and trivial for MySQL LRCs. We'll see for LFC LRCs.

Move dataset listings to web cache MEDIUM

Monitor - MonaLisa and Dashboard integration, migration

Monitor - Infrastructure

Alarm system HIGH
Alarm panel in monitor, alarm status on every page, API to register alarm, email mechanism

Generic table spec class LOW
Generalize *Spec class to accept connection specs and DB name, and dynamically build list of attributes. Accept dictionary (tolerant against inconsistencies) mapping fields to longer descriptions.
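A sketch of the generalized class. The constructor arguments are illustrative; in practice the column list would be built dynamically from the DB (e.g. via a DESCRIBE query on the named table) rather than passed in:

```python
class TableSpec(object):
    """Generic *Spec: one attribute per table column, built dynamically,
    plus an optional column -> description map that is tolerated even
    when inconsistent with the actual columns."""

    def __init__(self, columns, values=None, descriptions=None):
        self._attributes = list(columns)
        self._descriptions = descriptions or {}
        values = values or {}
        for col in self._attributes:
            # unknown columns simply default to None
            setattr(self, col, values.get(col))

    def describe(self, col):
        # fall back to the bare column name if no description is mapped
        return self._descriptions.get(col, col)
```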

Monitor - WN interaction

Pilot listener, interface for 'real time' interaction MEDIUM
Mechanism (high-rate polling as supported for the multi-tasking pilot? jabber?) for pseudo-real-time pilot interaction, e.g. a simple 'command line' (tail filename, ps, ls, debugging)


Performance and scalability (measurement, optimization, design changes)

Site information system

Pick up the pickled scheduler config and site OK list from the logger and record them to the site info DB MEDIUM Torre

Include LCG sites, at least those with Panda capability HIGH


RSA key pair authentication scheme for validating job record content based on encrypted transformation field LATER

Tools for operations and shift support

Automated scanning, identification and management of duplicate bugs by signature HIGH

Site support, packaging, deployment, installation, maintenance

LCG deployment and support

Organization and communication (email lists, code repository, ...)


Web pages to update URGENT

Needed documentation URGENT

Completed tasks

Direct posix access to remote files in pathena, supporting refs in collections and back navigation DONE 200608 Tadashi

User quota system DONE 200609 Sudhamsh
Extract user-level usage information from job tables and load to the user table. Provide API reporting user usage relative to quota for use in submit authorization once quotas are activated. Provide monitoring interface to usage/quotas.

Open issues and things to investigate

  • Use of subversion, at least for components requiring broad deployment but central management (pilot, scheduler, end user tools)
  • Tool/technology laundry list: xrootd? PROOF?
  • Evolution of monitoring - mix of in-house vs MonaLisa/Dashboard
  • Zipping files within datasets to aggregate to larger files (supported by POOL/ROOT; files can be accessed within the zipfile container by their original guid)

Major updates:
-- TorreWenaus - 20 Sep 2006
