


  • Bi-weekly US ATLAS Federated Xrootd meeting
  • Wednesday, 2-3pm Eastern
  • USA Toll-Free: (866)-740-1260
  • ACCESS CODE: 9263338

  • Attending: Sarah, Patrick, Rob, David, Horst, Shawn, Andy, Wei, Doug, Hiro, Ofer
  • Apologies:

FAX Status Dashboard


Meeting business

  • Twiki documentation locations
    • Some have difficulty accessing certain CERN twiki pages; the reason is unknown. Suggest putting the documentation on the BNL twiki, with a link from the CERN twiki to the BNL one (http, not https).
    • not done yet
    • RG will follow up.

Xrootd release 3.1.0 deployment

Summary of previous meetings:
  • Xrootd releases come out with some functional validation by stakeholders and large sites, but there is no formal release validation process.
  • 3.1.0 is the first "mature" release, so sites are advised to deploy it for the proxy function. WT2 has run its Solaris storage (regular xrootd) on 3.1.0 for a month and will migrate Linux storage to 3.1.0 soon. WT2 has also run a 3.1.0 single proxy for a month. N2N works under 3.1.0.
  • CMS abandoned the dcap plug-in for the Xrootd OFS. They use the dCache xrootd door directly, or Xrootd overlaid on dCache.
  • Known issues:
    • RPM updates overwrite /etc/init.d/{xrootd,cmsd}, which carry the LFC environment setup. That setup should instead go into /etc/sysconfig/xrootd, which survives RPM updates (a hedged example follows this list). Patrick will test it.
    • A bug in 3.1.0 prevents setting up a proxy cluster. Fixed in git head.
    • Continue xrdcp debugging between Doug and Andy.
    • A bug in the "sss" module causes permission-denied errors, probably when replacing an existing identity mapping. The workaround (in xrootdfs) is to not update an existing identity.
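    • A hedged example of such a setup; the file name follows the stock xrootd RPM and the values are placeholders, not a verified site configuration. Whether "export" is needed depends on how the init script launches the daemons:
# /etc/sysconfig/xrootd -- sourced by the packaged init scripts and
# preserved across RPM updates, unlike edits to /etc/init.d/{xrootd,cmsd}
export LFC_HOST=your.lfc.host                    # usual LFC client variable
export LFC_CONRETRY=0                            # usual LFC client variable
export X509_USER_PROXY=/path/to/your/proxy.pem   # if the N2N/LFC lookup needs grid credentials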
last meeting:
  • Who has deployed?
  • Illinois - deployed over a dCache server, and is reporting.
  • OU on their Tier 3 (the problematic one). OU is joining as a data server with a POSIX backend. The new configuration turns off asynchronous I/O.
  • UTA - crashing under certain conditions (trying to generate a core file). Patrick will send configuration info to Andy. Seems to be related to a specific mode of operation in xrootd.

  • Plans:
    • UCTier3 - by next meeting
    • BNL - wait-n-see
  • Wei does have a work-around for proxy cluster
  • UCTier3 - have not been able to do this due to user constraints at UC (end of year deadlines).
this meeting:
  • BNL wait-n-see for proxy cluster: Andy: a patch release of 3.1.0 is coming, which should fix the proxy cluster problem
  • UCTier 3: adding more storage, will move to 3.1.0 after that
  • SWT2: the N2N crashing issue is understood (a conflict in signal usage between regular xrootd and Globus). The solution is either to use a proxy, or to run regular xrootd with "async off" (a sample config fragment follows).
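  • A hedged sketch of the relevant data-server configuration fragment; the host name, exported path and everything else are placeholders for a site's real config, and only the xrd.async line is the point here:
# minimal data-server fragment (placeholders, not a verified site config)
all.role server
all.manager your.regional.redirector:1213
all.export /atlas
xrd.async off          # the workaround discussed above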

dq2-ls-global and dq2-list-files-global

last meeting:
  • Want dq2 client tools that can list the files of a dataset by GFN (or via the local redirector), and check their existence in FAX or at the local site.
  • Hiro's poor man's version can be found at http://www.usatlas.bnl.gov/~hiroito/xrootd/dq2/; it works with containers.
  • Doug: dq2-ls-global waits a very long time when there are missing files (incomplete datasets); not acceptable in real use. Hiro/Wei: multi-thread the dq2-l*-global tools, or use xprep before checking existence (a rough parallel-check sketch appears at the end of this list).

  • Hiro: will build xprep into dq2-ls-global.
  • Doug: what about the dq2-get/xprep requests still in the queue? They shouldn't be marked as non-existent. Hiro: looking for a way to consolidate the sites' xprep queue info for dq2-ls-global to check. Will discuss details over e-mail.
  • Is there a plan to check Hiro's code (dq2-*-global) into dq2-client?
  • Hiro will discuss with Fernando.
  • RWG - I am using this for expanding tests across datasets - works great.
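  • A hedged sketch of parallelizing the per-file existence check against the global redirector; the redirector name, the dq2-list-files-global output format and the use of the legacy xrd client's existfile subcommand are all assumptions -- substitute whatever per-file check your site actually uses:
REDIRECTOR=your.global.redirector:1094
dq2-list-files-global your.dataset.name > gfn.list    # assumed: one GFN per line
# run up to 10 checks in parallel instead of one file at a time
cat gfn.list | xargs -n 1 -P 10 -I {} \
    sh -c 'echo "== {}"; xrd '"$REDIRECTOR"' existfile "{}"'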
this meeting:
  • Hiro will find out who is in charge of the dq2-client

ANALY queue

last meeting:
  • Rob ran interactive test jobs against glrd (the global redirector), MWT2, Illinois, AGLT2 and BNL. The first try against glrd was slow (probably redirected to BNL); not surprisingly, BNL is slow. A subsequent test against glrd was faster (probably redirected to other sites).
  • To run in a Panda queue, Dan van der Ster suggested using prun with --pfnList to supply a list of files for the jobs (the list coming from dq2-list-files-global), but this may still depend on the site having the datasets, even though reading points to glrd. Doug: that may not be the case. Rob will try (an example prun invocation follows this list).
  • Noticed that Rob's tests have a very high success rate, with an obvious long-distance effect. The input data sample is small; a test with a large data sample may be useful to reveal problems.
  • WT2 will deploy more proxy nodes to further reduce bottleneck at proxy. This may help isolate latency related issues.
  • (place holder) Where to write output? A small write space in the federation, or other solutions. Doug is looking at the possibility of having a small xrootd space at BNL for job output.
  • A federated space for writing is out of the scope of this working group. Should be discussed at the facility meeting.
  • Have not made much progress extending this work since the Lyon presentation (here).
  • Have tried TTreeCache tuning - will need guidance since first attempt made things worse.
  • Need to subscribe datasets to sites:
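  • A hedged example of the prun + --pfnList approach suggested above; the dataset, analysis macro and output dataset names are placeholders, and option spellings and the %IN substitution should be checked against prun --help and Hiro's dq2-*-global tools:
dq2-list-files-global your.input.dataset > pfns.txt    # file list via the global redirector
prun --exec 'root -l -b -q "myAnalysis.C(\"%IN\")"' \
     --pfnList pfns.txt \
     --outDS user.yourname.fax_pfnlist_test \
     --nFilesPerJob 5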
this meeting:
  • Rob: will subscribe the above datasets to selected sites (a hedged dq2 subscription example follows).
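  • A hedged example of such a subscription, assuming the standard dq2 end-user tools are installed; the target DDM site name is a placeholder:
# subscribe one of the example datasets (listed in the D3PD section below) to a site
dq2-register-subscription user.bdouglas.physics_Egamma.SMWZd3pdExample.NTUP_SMWZ.f406_m991_p716 YOURSITE_LOCALGROUPDISK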

Performance studies

this meeting:
  • Should send a request to the ROOT I/O group asking for a self-contained example to test on FAX; should also find out what metrics the FAX group wants to see from the ROOT I/O group.


  • Andy: code is ready in 3.1.0. Wei (and Doug?) will test it?

D3PD example

last meeting:
  • Get Shuwei's top D3PD example into HC (Doug?)
  • Doug will follow up in two weeks to see about getting this into HC, and the workbook updated. Need to drive this with real examples, with updated D3PDs. So the examples need to be updated for Rel 17.
  • Doug: the goal is to get this into an HC test, with sites being able to replace input datasets. It will be used by sites to compare the performance of reading from local and remote storage. Will follow up.
this meeting: The data sets are:

[dbenjamin@atlas28 ~]$ dq2-ls -r user.bdouglas.physics_Egamma.SMWZd3pdExample.NTUP_SMWZ.f406_m991_p716


[dbenjamin@atlas28 ~]$ dq2-ls -r user.bdouglas.physics_Muons.SMWZd3pdExample.NTUP_SMWZ.f406_m991_p716


  • Questions for next meeting: How will a site request to run this type of HC test? How can a site change the inputs? How to obtain performance metrics such as total time, etc.?


N2N

Summary of last week(s):
  • See further https://twiki.cern.ch/twiki/bin/viewauth/Atlas/AtlasXrootdSystems
  • Decided to continue improving the current N2N and leave GUID-based lookup as a future option. Chicago can keep the source of N2N in CVS for now; send updates to Rob. Wei can compile it.
  • Doug's use case: looking up files that exist at BNL but that N2N can't find. Hiro: need to change the code slightly; will do. Probably only happens at BNL; it has to do with the way Panda writes output to BNL.
  • Complaints about a possible memory leak in N2N. Provided Andy a standalone package for debugging.

  • Hiro will look into the dCache plugin for native xrootd-dCache
  • The proxy memory footprint grows to several gigabytes; not sure where this is occurring.
  • Andy will look into this with a stand-alone N2N; will provide a wrapper for it.
  • Will use a memory debugging tool (e.g. valgrind; a sketch follows this list).
  • Hiro: update N2N for the BNL special cases. Doug will test whether this can improve the hit rate to near 100%.
  • OU Tier 3 is OK with xrd.async off. Will try OU Tier 2 (3.0.5) and UTA (currently xrootd on top of xrootdfs).
  • Gave Andy a standalone package to debug the possible memory issue in N2N. Wei: will also collect core dumps of the proxy both when N2N has a small memory footprint and when it has a relatively large (i.e. growing) memory footprint.
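  • A hedged sketch of one way to do the memory debugging; the paths, config file and instance name are placeholders for the site's actual proxy setup:
# run the proxy under valgrind's massif heap profiler for a test period
valgrind --tool=massif --massif-out-file=/tmp/xrootd-massif.%p \
    /usr/bin/xrootd -c /etc/xrootd/xrootd-proxy.cfg -n proxy -l /var/log/xrootd/proxy-valgrind.log
# after generating some load, inspect where the heap growth comes from
ms_print /tmp/xrootd-massif.<pid>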
this meeting:


Checksumming

last meeting:
  • Wei: with 3.1, checksumming works for the Xrootd proxy even when N2N is in use. Tested at SLAC at both the T2 and the T3. Should be straightforward for POSIX sites.
  • Not sure about dCache sites. Probably need a plugin for dCache, i.e. a callout to obtain the checksum from the dCache system. Andy and Hiro will go through this at CERN.
  • Wei: direct reading and dq2-get (with whatever options) don't need checksums from remote sites.
  • On-hold
  • rename this item to discuss general issues with checksumming instead of integrated checksumming.
  • Checksumming for native xrootd is basically solved
  • For posix - can adapt
  • For dCache - is there a plugin for checksums? It's there; need to grab it.
  • Querying the remote site for checksumming
  • A wrapper script is needed (a hedged sketch follows this list).
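  • A hedged sketch of comparing the adler32 of a locally staged copy against the value computed by reading the source through the federation with the stock xrdadler32 tool; the URL and paths are placeholders:
LOCAL=/local/storage/atlas/some/file.root
REMOTE=root://your.remote.site:1094//atlas/some/file.root
local_sum=$(xrdadler32 "$LOCAL"  | awk '{print $1}')
remote_sum=$(xrdadler32 "$REMOTE" | awk '{print $1}')
[ "$local_sum" = "$remote_sum" ] || { echo "checksum mismatch for $LOCAL" >&2; exit 1; }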
this meeting:
  • MWT2 and AGLT2 will evaluate dCache's Xrootd door. Will look at a checksum solution once the door is found useful.

xprep warnings, status checking

Summary of previous meetings:
  • xprep (and dq2-get -xprep) doesn't give a warning if site's xrootd cluster is not configured for xprep. At least we need to give sites enough warning so that they don't miss this issue during configuration.
  • Note that one could modify the stage-in script to add retries and easily achieve a 100% success rate.
  • Want to be able to do a dq2-ls and get the namespace back, but that's not possible now without triggering other actions (downloading, consistency checking).
  • Can dq2-ls be used against a local storage system to check for the existence of files without consistency checking? Maybe dq2-ls-global with MYXROOTD is a solution?
  • Doug: noticed imbalanced FRM stage-in request queue length.
this meeting:

FRM script standardization

last meetings:
  • Standardize the FRM scripts, including authorization, GUID passing, checksum validation and retries (a skeleton stage-in script follows the queue-monitoring prototype below).
  • A few flavors possible.
  • Setup a twiki page just for this.

  • This brings up again the question of checking the completion of xprep commands. Failures do leave a .failed file. Are there tools to check the FRM queues? Can we provide a tool for this?
  • Andy: suggests setting up a webpage to monitor the FRM queues, based on the frm_admin command. Hiro will be looking into this.
  • A prototype for doing this:
# list the FRM stage-in queue (LFN and queue wait time) on every data server,
# merged and sorted by wait time; the placeholders must be filled in per site
for host in your_dataserver_1 your_dataserver_2; do
    ssh "$host" '
        export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:your_xrootd_lib_path
        export PATH=$PATH:your_xrootd_bin_path
        frm_admin -c your_xrootd_config_file -n your_xrootd_instance_name query xfrq stage lfn qwt
    '
done | sort -k2 -n -r
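
  • A hedged skeleton of a standardized stage-in script with retries; it assumes the FRM copy command is configured to pass the global name as $1 and the local destination path as $2, and the redirector name is a placeholder. Checksum validation could be added as in the checksumming section above:
#!/bin/sh
LFN=$1                                   # global file name passed by frm
PFN=$2                                   # local destination path passed by frm
REDIRECTOR=your.global.redirector:1094
for attempt in 1 2 3; do
    if xrdcp -np "root://$REDIRECTOR/$LFN" "$PFN"; then
        exit 0                           # success
    fi
    rm -f "$PFN"                         # clean up partial copy before retrying
    sleep 30
done
echo "stage-in of $LFN failed after 3 attempts" >&2
exit 1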

this meeting:

cmsd + dCache/xrootd door

last meeting:
  • An updated CMSD that will work with the native dCache/xrootd door (Andy?)
  • A caching mechanism to allow the lookup done by the cmsd N2N plugin to be usable by the xrootd door (either the dCache or the Xrootd version) (Andy/Hiro/Wei/?)
  • Redirect to the xrootd-dCache door; the cmsd will do the lookup and store the result in memcached. The cmsd will need the N2N plugin, and N2N must write to something the dCache sites can read.
  • Hiro will look into this; not critical path.
  • On-hold.

this meeting:

Authorization plugin (Hiro)

last meeting:

  • An "authorization" plugin for the dCache/xrootd door which uses the cached GFN->LFN information to correctly respond to GFN requests (Hiro/Shawn/?)
  • On-hold.
this meeting:

Sharing Configurations

last meeting:

this meeting:



Ganglia monitoring information

last meeting:

  • Note from Artem: Hello Robert, we've made some progress since our previous talk. We have built RPMs; the repository is at http://t3mon-build.cern.ch/t3mon/ and contains rebuilt versions of ganglia and gweb. The Ganglia team has released ganglia 3.2 and the new ganglia web (gweb); all our tools have been rechecked and work with this new software. It is better to install ganglia from our repository; instructions are at https://svnweb.cern.ch/trac/t3mon/wiki/T3MONHome. About xrootd: we have created a daemonized version of the xrootd-summary-to-ganglia script. It is available at https://svnweb.cern.ch/trac/t3mon/wiki/xRootdAndGanglia and sends the xrootd summary metrics (http://xrootd.slac.stanford.edu/doc/prod/xrd_monitoring.htm#_Toc235610398) to the ganglia web interface. We also have an application that works with the xrootd summary stream, but we are not yet sure how best to present the fetched data; it collects user activity and accessed files, all within the site. Last week we installed one more xrootd development cluster, and we are going to test whether it is possible to get, and then split, information about file transfers between sites and within one site. WBR Artem
  • Deployed at BNL, works.
  • Has anyone tried this out in the past week? It would be good to try it before software week to provide feedback (a hedged xrootd-side config fragment follows this list).
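  • A hedged example of pointing the xrootd summary-monitoring stream at such a collector; the host, port and reporting interval are placeholders, and the t3mon instructions at the URLs above are authoritative:
# in the xrootd configuration file: send summary statistics every 60 s
xrd.report your.ganglia.collector.host:9931 every 60s all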

this meeting:

Monalisa monitoring

  • Discussions with Matevz Tadel (USCMS, UCSD) at Lyon
  • Considering deploying an instance at UC; if so would ask sites to publish information to it.
this meeting:
  • Matevz will visit SLAC on Feb 1-2; will ask for a demo. Interested in the detailed monitoring stuff.


-- WeiYang - 10 Jan 2012
