r2 - 14 Apr 2009 - 12:49:22 - TorreWenaus

Panda development activities and plans 20090414

  • complete the Oracle migration
    • no technical obstacle to migrating the rest of the clouds
    • current MySQL/Oracle split is a major pain for users, shifters, and developers, so the sooner the better
    • add compact DN to job tables, use in place of prodUserID for analysis jobs
  • will allow monitor development to refocus on actual development

  • adapt AKTR+monitor to same Oracle interface code as server/bamboo
  • use bind variables in monitor
    • Alexei: Mikhail is in discussion with Gancho as to how/where it is useful to use bind variables. Await outcome.
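
For reference while the bind-variable discussion is pending, here is a minimal sketch of what a bound query looks like. It uses Python's built-in sqlite3 module as a stand-in for the Oracle backend; the table, columns, and the jobs_for helper are hypothetical and are not taken from the monitor code. Oracle drivers accept the same :name placeholder style, but the driver calls themselves would differ.

```python
import sqlite3

# Stand-in for the monitor's job table; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (job_id INTEGER, owner TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO jobs VALUES (:job_id, :owner, :status)",
    [
        {"job_id": 1, "owner": "alice", "status": "running"},
        {"job_id": 2, "owner": "bob", "status": "failed"},
        {"job_id": 3, "owner": "alice", "status": "failed"},
    ],
)


def jobs_for(conn, owner, status):
    """Return job ids for one owner and status using named bind variables."""
    # The SQL text is constant; only the bound values change between calls,
    # so the values are never interpolated into the statement string.
    sql = (
        "SELECT job_id FROM jobs "
        "WHERE owner = :owner AND status = :status ORDER BY job_id"
    )
    rows = conn.execute(sql, {"owner": owner, "status": status}).fetchall()
    return [row[0] for row in rows]


failed_alice = jobs_for(conn, "alice", "failed")
```

Because the statement text stays identical across calls, the database can reuse the parsed statement, and hostile input in a bound value is treated as data rather than SQL.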

  • checksum in PandaMover?
    • Kaushik: on a remote copy we shouldn't burden the destination with a remote checksum, too easy to knock over SRM and such services that way. Source checking is fine. dCache provides adler32.
    • Alexei: checksum very good idea, particularly if PandaMover usage is expanded, e.g. to evgen distribution. Validate the source with checksum.
    • Discussion needed on PandaMover use cases and how checksum should be used. Actually remove corrupted files? Too aggressive. Use an error code/emails? ACTION to discuss.
  • enable token-based security mechanism between scheduler - Panda server - pilot
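
As an illustration of the source-side adler32 check raised in the checksum discussion above, here is a minimal sketch using Python's standard zlib module. The function names and the 8-digit lowercase hex format are assumptions made for the example; this is not PandaMover code, and how a mismatch should be handled (error code, email) remains the open ACTION item.

```python
import os
import tempfile
import zlib


def adler32_of_file(path, chunk_size=1024 * 1024):
    """Compute the adler32 checksum of a file, reading it in chunks."""
    value = 1  # adler32 is defined to start from 1
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            # Feed the running value back in so large files never sit in memory.
            value = zlib.adler32(chunk, value)
    # Mask to an unsigned 32-bit value and render as 8 hex digits (assumed format).
    return format(value & 0xFFFFFFFF, "08x")


def source_is_valid(path, expected_hex):
    """Compare the source file's checksum with the catalogued value."""
    return adler32_of_file(path) == expected_hex.lower().zfill(8)


# Small self-check on a temporary file, read with a deliberately tiny chunk size.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"Wikipedia")
    sample_path = tmp.name
sample_checksum = adler32_of_file(sample_path, chunk_size=4)
sample_ok = source_is_valid(sample_path, sample_checksum)
os.remove(sample_path)
```

The check runs only against the source copy, in line with Kaushik's point that the destination should not be burdened with a remote checksum.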

  • pilot release-candidate testing framework
  • pilot code refactoring to introduce glexec with file management either at prod-proxy or user-proxy level
    • John: Keep in mind that where a user is allowed to make use of a privileged (for storage) proxy, data is unsafe
    • Torre: True. We want to provide the capability to avoid this (either via glexec or by downloading the user's proxy so they can use it for data storage), and leave what is actually done to policy decisions.
  • job recovery code rewrite
  • finish testing of dq2-get/put site mover, and migrate to it

  • glexec integration
    • glexec testbed participation
    • production pilot integration/refactoring
    • myproxy service dev/test/debug

  • pilot scheduler consolidation
    • pyfactory integrated with autopilot as basis for pilot submission
    • autopilot's pilotScheduler used solely for monitoring
    • http based interaction with monitoring DB

  • analysis expansion to all clouds
    • silence from DE (Rod), UK (Graeme) on running generic analysis pilots, "allowed" according to Massimo/Dario/Kors. ES?
      • ACTION: to deploy to Dutch, UK clouds. Alexei will take up with Graeme.
    • go ahead with centralized user-owned pilot submission using myproxy-registered proxies? Less desirable/efficient than generic
      • Not for the moment.

  • better user analysis views in monitor
    • overall queue activity, where a user's jobs sit in the queue and why
      • where do a user's jobs stand relative to others in priority, position in queues
      • what jobs from what users with what priority are on what queues
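
One way the per-user queue view could be computed is sketched below. The job records, queue names, and the rule that a higher priority value runs first are all assumptions made for illustration; the real monitor would read this from the job tables.

```python
from collections import defaultdict

# Hypothetical job records: (job_id, user, queue, priority).
jobs = [
    (101, "alice", "ANALY_A", 1000),
    (102, "bob", "ANALY_A", 1500),
    (103, "carol", "ANALY_A", 900),
    (104, "alice", "ANALY_B", 800),
    (105, "bob", "ANALY_B", 700),
]


def queue_positions(jobs):
    """Map job_id -> (queue, 1-based position, queue length)."""
    by_queue = defaultdict(list)
    for job_id, _user, queue, priority in jobs:
        by_queue[queue].append((job_id, priority))
    positions = {}
    for queue, entries in by_queue.items():
        # Assumed ordering: descending priority, ties broken by lower job_id.
        ordered = sorted(entries, key=lambda entry: (-entry[1], entry[0]))
        for index, (job_id, _priority) in enumerate(ordered, start=1):
            positions[job_id] = (queue, index, len(ordered))
    return positions


def user_view(jobs, user):
    """List where each of a user's jobs stands within its queue."""
    positions = queue_positions(jobs)
    return [
        (job_id,) + positions[job_id]
        for job_id, owner, _queue, _priority in jobs
        if owner == user
    ]


alice_view = user_view(jobs, "alice")
```

Each tuple in the result reads as (job id, queue, position, queue length), which covers the "where do my jobs stand relative to others" question; the reason a job sits where it does would need additional brokerage information.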

  • analysis statistics gathering/presentation, performance analysis

  • system for establishing and maintaining site/cloud membership with associated rights
    • self-subscription system from pathena environment: users register with sites/clouds they belong to (reviewed by responsible)
    • use this registry as basis for member-specific rights: controlled queue access, priority bump

  • dataset browser overhaul
    • do we still need it? Nothing better has come along in three years?
    • give up on memory-based dataset catalog; too slow even with memcached (at least as implemented)
    • move catalog to DB
      • maintain via incremental updates, not full rebuild
      • one of many independent skims/recataloging of DQ2 data. Such is life
    • Discussion: Yes we still need it. Need to bring all people interested together. Possibly late May. Start with phone call beginning of May. Alexei has a number of very good new guys interested to work for an extended period on this. Consider it as a new project.

  • generic Panda data movement
    • OSG-directed tools to offer data management & automated dataflow as part of Panda workflow offered to OSG VOs
    • LFC catalog for file cataloging
    • Panda dataset catalog/file catalog for dataset definition
    • same PandaMover-based dataflow used by ATLAS, with a VO's "home SE" replacing the Tier 1 role

  • WN level storage management, brokerage
    • WN-local storage (e.g. pcache)
    • pilots on xrootd nodes, a la PROOF
    • file-based tag analysis

  • experimenting with the cloud
    • cloud as dev platform in progress
    • follow with experimenting with cloud as CE/SE? Manpower permitting
