r4 - 13 Aug 2011 - 02:14:59 - WeiYang

MinutesFedXrootdAug11

Coordinates

  • Thursday, August 11, 4 pm - 5 pm Eastern, looked like the best time.
  • Phone: (570) 310 0130, Access code: 735188

Attending

  • Shawn, Rob, Tom, Dave, Patrick, Saul, Ofer, Andy, Wei, Hiro, Sarah, Tomasz
  • Apologies: Doug

Workshop Planning

  • Agenda: https://indico.cern.ch/conferenceDisplay.py?confId=149453
  • Monday morning we will have presentations from Brian Bockelman (US CMS federated xrootd), focusing on experience with direct-access federation, and from OSG on xrootd packaging: their roadmap (which may include not packaging things already packaged "upstream" in the xrootd.org repo), validation, and support. We will discuss briefly the options for where to get software and under what conditions. We might request that OSG maintain a repo of ATLAS-requested add-on RPMs, for example, or of other modules we might have in collaboration with US CMS in the OSG context.
  • In the US ATLAS Facility and in OSG Integration Testbed deployments, we have often used a site validation table to organize progress. This is a temporary bookkeeping device. Below is an example site validation matrix (with columns indicating specific tasks, TBD, to be carried out by site admins):
    [Image: site-val-example2.jpg]
  • Site mini-blessing scripts/toolkit (local)
  • Site mini-blessing scripts/toolkit (remote)
  • Global federation testing
  • Analysis benchmarks
  • Note: HC stress tests reach a few steps beyond the scope of this project, as additional infrastructure is needed. Deferred.
  • We need volunteers to help with defining each of the tests, providing scripts and how-to's, in order to get a standard validation across sites.
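
As an illustration of the bookkeeping involved, the validation table could be kept as a small data structure. This is only a sketch: the task names are taken from the list above, while the site names, the status strings, and the function names are hypothetical placeholders.

```python
# Task columns taken from the validation items listed in these minutes.
TASKS = [
    "local mini-blessing",
    "remote mini-blessing",
    "global federation test",
    "analysis benchmark",
]


def new_matrix(sites):
    """Create a matrix with every task marked 'TBD' for every site."""
    return {site: {task: "TBD" for task in TASKS} for site in sites}


def mark(matrix, site, task, status):
    """Record the status of one task at one site."""
    if task not in TASKS:
        raise KeyError(f"unknown task: {task}")
    matrix[site][task] = status


def render(matrix):
    """Return the matrix as a plain-text table, one row per site."""
    header = ["site"] + TASKS
    rows = [[site] + [matrix[site][task] for task in TASKS]
            for site in sorted(matrix)]
    table = [header] + rows
    widths = [max(len(row[i]) for row in table) for i in range(len(header))]
    lines = []
    for row in table:
        cells = [cell.ljust(width) for cell, width in zip(row, widths)]
        lines.append("  ".join(cells).rstrip())
    return "\n".join(lines)


if __name__ == "__main__":
    matrix = new_matrix(["SITE_A", "SITE_B"])  # hypothetical site names
    mark(matrix, "SITE_A", "local mini-blessing", "done")
    print(render(matrix))
```

A structure like this is easy to paste into a wiki page and to update from the mini-blessing scripts once those exist.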

dCache federating issues

  • Discussed in particular the situation for dCache sites using the dCache xrootd doors available in 1.9.12 (the new golden release). The question is whether a proxy service is needed to provide the translation and access to the dCache xrootd door, and whether traffic would then flow through that server (which is to be avoided); or, alternatively, whether a server needs to be installed on each pool. Since the CMS folks have a similar issue, Wei suggested consulting Brian.
  • Discussed that a subset of folks may be able to work on this at the workshop.

Checksum support

  • Ofer described an implementation worked out with Hiro, already in production at the BNL Tier3, that uses checksums obtained from dq2-client.
  • The client passes the checksum information along with the filename to the FRM module, which uses it to validate the xrdcp copy.
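
The validation step can be sketched as follows. This is not the BNL implementation; it assumes Adler-32 as the checksum type (the minutes do not state which algorithm is used), and the function names are hypothetical.

```python
import zlib


def adler32_hex(path, chunk_size=1024 * 1024):
    """Return the Adler-32 checksum of a file as 8 lowercase hex digits."""
    value = 1  # Adler-32 starting value
    with open(path, "rb") as handle:
        # Read in chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            value = zlib.adler32(chunk, value)
    return format(value & 0xFFFFFFFF, "08x")


def copy_is_valid(path, expected_hex):
    """Compare the checksum of the copied file with the client-supplied one."""
    return adler32_hex(path) == expected_hex.strip().lower().zfill(8)
```

A transfer script would call `copy_is_valid(local_path, checksum_from_client)` after the copy completes and discard the file on a mismatch.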

GUID as opaque information

  • Hiro brought up his request to support passing GUID information along as opaque data.
  • The idea is to use this for direct lookups of files in the LFC (rather than using the current filter, which does not have a 100% hit rate).
  • Implementing this natively in xrootd will require development to pass opaque data to the N2N module, plus an N2N module that uses the GUID.
  • Andy said he would think about this and come back with a recommendation, hopefully in advance of the workshop.
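
To make the proposal concrete, the sketch below shows how a GUID could ride along as opaque data after the `?` in an xrootd URL, and how a name-translation step might prefer it over the path. The key name `guid`, the host name, and the function names are assumptions for illustration only; the actual N2N interface is not modeled here.

```python
from urllib.parse import parse_qs, urlencode


def add_opaque(url, **params):
    """Append key=value pairs to a URL as opaque information."""
    separator = "&" if "?" in url else "?"
    return url + separator + urlencode(params)


def split_opaque(url):
    """Split a URL into its base part and a dict of opaque parameters."""
    base, _, query = url.partition("?")
    params = {key: values[-1] for key, values in parse_qs(query).items()}
    return base, params


def lookup_key(url):
    """Pick the catalog lookup key: the GUID if supplied, else the path."""
    base, opaque = split_opaque(url)
    guid = opaque.get("guid")  # hypothetical key name
    if guid:
        return ("guid", guid)
    # root://host//path splits into scheme, host, and path without its slash.
    parts = base.split("//", 2)
    path = "/" + parts[2] if len(parts) == 3 else base
    return ("path", path)


if __name__ == "__main__":
    url = add_opaque(
        "root://redirector.example.org//atlas/data/file.root",
        guid="01234567-89AB-CDEF-0123-456789ABCDEF",
    )
    print(url)
    print(lookup_key(url))
```

Falling back to the path when no GUID is present keeps existing clients working unchanged.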

Federation monitoring

  • See slides circulated from Rob today.
  • The idea is a simple "Nagios-like" heartbeat check of the infrastructure that serves both as a presentation layer and as a source of additional status information.
  • At its simplest, provide a periodic heartbeat monitor of:
    • Download of file unique to a site via the site's local redirector
    • Download of file unique to a site via the global redirector
  • Discussed the pros and cons of this approach, in particular the need to avoid burdening Tier 3 sites.
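
A minimal sketch of the two heartbeat probes follows, assuming `xrdcp` is installed on the monitoring host. The redirector host names, the test file path, the timeout, and the function names are all hypothetical. The command runner is passed in as a parameter so the logic can be exercised without a live federation.

```python
import subprocess
import time


def xrdcp_command(redirector, path):
    """Build an xrdcp command that fetches a file and discards the data."""
    return ["xrdcp", "-f", f"root://{redirector}/{path}", "/dev/null"]


def probe(redirector, path, timeout=60, run=subprocess.run):
    """Try one download; report success and elapsed wall-clock time."""
    start = time.monotonic()
    try:
        result = run(xrdcp_command(redirector, path),
                     capture_output=True, timeout=timeout)
        ok = result.returncode == 0
    except (subprocess.TimeoutExpired, OSError):
        # A timeout or a missing xrdcp binary counts as a failed probe.
        ok = False
    return {"redirector": redirector, "ok": ok,
            "seconds": time.monotonic() - start}


def heartbeat(site, global_redirector, run=subprocess.run):
    """Fetch the site's unique test file via local and global redirectors."""
    return {
        "local": probe(site["local_redirector"], site["test_file"], run=run),
        "global": probe(global_redirector, site["test_file"], run=run),
    }


if __name__ == "__main__":
    site = {  # hypothetical site description
        "local_redirector": "xrd.site-a.example.org",
        "test_file": "/atlas/heartbeat/site-a.dat",
    }
    print(heartbeat(site, "global-redirector.example.org"))
```

Keeping the test file small and the probe interval long would limit the load on Tier 3 sites, which was the main concern raised.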

Configuration Repo

  • A place for stashing site configurations, to be set up on a TWiki page. No strong objections; it just needs to be set up.

Other issues

  • Andy reports that work on X509 support is continuing, but it is likely not to be ready in time for the workshop.
  • Perhaps one more meeting in advance of the workshop.
  • Asking for release 3.0.5, which will address the FRM bug and the memory leak in 3.0.4 (the issue with the sss module on dual NICs (public and private) will not be addressed until 3.1.0).

-- RobertGardner - 09 Aug 2011

Attachments


jpg site-val-example2.jpg (183.7K) | RobertGardner, 11 Aug 2011 - 18:50 |
pdf fax-nagios.key.pdf (200.8K) | RobertGardner, 11 Aug 2011 - 18:53 |
 