r27 - 10 May 2007 - 12:42:50 - RobertGardner

The following is a list of issues for discussions at the Tier2 site visit at U Chicago, May 9-10, 2007.

Data management and movement

  • At MWT2 we operate 5 DQ2 site services instances:
    • MWT2_UC
    • MWT2_IU
    • UC Teraport
  • Would like to reduce this number!
  • 5 separate storage areas to look after; 5 LRCs; 5 FTS channels; 5 dq2 services
  • Very hard to track new releases, schedules for dq2 software and services.
  • System is opaque - one cannot easily track the progress or state of subscriptions.
  • When is a dataset complete, and where? The central catalogs cannot be asked for the filenames, so we must go by what BNL has registered.
  • Would be nice to have an FTS monitoring layer such as: http://fts001.nsc.liu.se/, and drill down for errors, http://fts001.nsc.liu.se/failures/index-86400.html
  • Globally a collection like http://wiki.ndgf.org/index.php/Operation:Monitoring, and a site services console as well (eg. Nagios based)
  • DDM troubleshooting - of some help, http://ddm-log.uchicago.edu:8000/login (work in progress)
  • Backend dataset management - production mixed with AOD mixed with user-requested.
  • Management of incoming queues.
  • Can we simplify the installations? E.g., RPMs.
  • Consider DDM T0-T1, followed by a data accelerator within the US network.
  • How to provide better feedback to developers, ddm operations, site admins.
  • Could we run site-services from the Tier1? Concerns over latencies - perhaps use bulk operations.

Storage and file systems

  • 3 deployments of dCache (4, counting UC Tier3).
  • Neutral with regard to xrootd - should we be participating in an R&D testbed?
  • IU has HPSS available (promised as a contribution by the University) - does this raise any issues?
    • Question is - what is the advantage over using BNL's resources?
    • HPSS has undergone a major upgrade, making the interfaces easier - filesystems, gridftp.
    • What is the cost of providing the additional service? At BNL, it's zero. BNL can set aside a storage class for this type of storage.
  • What is our baseline? Does it make sense to use dCache on worker nodes w/o tape?
  • Question of resiliency. A single drive should not cause a catastrophic failure, or loss of node.
  • In dcache you can associate tags for pathnames, so that files end up in a particular backend. Eg. - analysis datasets would be precious.
  • SRM can define storage classes; is the concept present in dq2?
  • Note that there is only one LRC for BNL, which uses different types of backend storage - eg., BNLTAPE.
  • Direct attached storage of servers (w/o backend network).
  • Creation of a unified filespace is an issue - dcache or xroot.
  • Question about dcache operational load. xrootd is intriguing - Patrick will do some experimenting.
  • Perhaps a read-only storage area managed by xrootd. Analysis oriented. More efficient in accessing data - I/O operations costly, xrootd may do read-ahead better.
  • The ability to move data between sites - not as well developed in xrootd at the moment, eg., no SRM door like dcache.
  • SRM should be required for Tier2 as well - as a managed access point to the site. There is an FTS layer in between which mitigates this to some extent.
  • Distributed dcache between sites. This has been demonstrated in Nordugrid.
  • Interest in dCache at UTA for the new cluster - once electricity problems are resolved. Timeframe: June/July. Perhaps also at OU.
  • What do we promise for skim output storage and analysis output storage?
  • Agree that we need to consider BNL as a backup service. This does introduce metadata issues.
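The directory-tag mechanism mentioned above (dCache/PNFS tags steering files into a particular backend storage class) can be sketched roughly as follows. The directory name and tag values are illustrative, and on a real system these commands would run inside the PNFS-mounted namespace:

```shell
# Hedged sketch: set dCache/PNFS directory tags so that files written under
# this tree land in a chosen storage class. Path and values are illustrative.
mkdir -p analysis
cd analysis
# OSMTemplate carries the store name; sGroup carries the storage group.
echo "StoreName analysis" > '.(tag)(OSMTemplate)'
echo "aod" > '.(tag)(sGroup)'
# On PNFS these "magic" filenames read back the tags for the directory:
cat '.(tag)(OSMTemplate)'
```

New subdirectories inherit the parent's tags, so marking one tree as precious (e.g., analysis datasets) is a one-time operation at the top of the tree.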

Operating systems

  • Which releases? Kernel revisions and patches - on-going judgements. Should these be published, advised, reviewed within the integration project? Basic concern areas: security and filesystem (ext3, xfs, ...) --- and NFS performance.
  • Athena software is tested and validated on a particular version of SLC4 installed at CERN. What are the risks of using some other version of the Linux kernel?
  • 32 bits vs 64 bits - probably won't have a 64 bit clean version of Athena before the end of the year.
  • Can we find some commonality? There are validation questions at the OS level. Isn't this done within a production campaign?
  • Can we settle on a standard environment specification? Share it through documentation and the Wednesday meetings.

Network optimization

  • We're pretty certain we have not optimized our network configuration in all layers of the network (host-level parameters such as TCP window sizes; parameters in the Cisco switch, e.g., MTU for jumbo frames).
  • At present we have no end-to-end monitoring tools deployed and in use.
  • Come up with a set of metrics that everyone can agree on.
  • Need the right dedicated exercises in place, with the right tools. Also need persistent "load tests" that run all the time - low level, but always going. The monitoring keeps you informed, so you can discover the cause of a higher-level throughput problem.
  • Need to understand better what the analysis model is - posix over LAN to SE, vs copy to the worker node.
  • We are discussing T3 access to T2 data. Talking about co-located T3.
  • Can we have a complete AOD at every site - the concern is the access pattern needs to be understood. Worried about serving the local community's needs. Worried about the distribution of jobs to locales with specific datasets.
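As a concrete starting point for the host-level tuning question above, a candidate /etc/sysctl.conf fragment might look like the following. These values are illustrative defaults for high-bandwidth wide-area paths, not site-tested recommendations, and the Cisco switch ports would need matching configuration before raising the host MTU for jumbo frames:

```
# Illustrative /etc/sysctl.conf fragment for wide-area TCP tuning.
# Values are starting points only, not validated for any MWT2 host.
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
```

Applied with sysctl -p; the jumbo-frame side would additionally need the host interface MTU raised (e.g., ifconfig eth0 mtu 9000) and the same MTU on the switch ports.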

OSG gatekeeper and storage services

  • Deploy in sync, or on a staged schedule, within the integration project.
  • Participation - provisioning of resources for the OSG ITB.
  • Pre-OSG release validation of ATLAS software and services.
  • Want these resources to look like our production environment.
  • Need to manage our own requirements in OSG.
  • Three usatlas sites should be sufficient to participate in the ITB.
  • Need to standardize GUMS and privileged roles.
  • Standard policy description for all of US ATLAS, OSG, and the RAC.
  • OSG maintenance bulletin board.


  • Visibility of all flavor of ATLAS jobs (production, individual users, local users)
  • Can Gratia provide this? Probably. Well supported.
  • Are we getting the right metrics collected: walltime + CPU model vs. SpecInt?
  • Dedicated probes - do they include CPU time?
  • Write down the requirements here.
  • Reinforces need to get OSG 0.6 deployed.

Tier2/3 issues

  • Accessing Tier2-resident datasets via full copy
  • Accessing Tier2-resident datasets via skim utilities (eg. http://twiki.mwt2.org/bin/view/DataServices/WebHome)
  • How does Tier3 enter into the Integration program? What do Tier3s look like? Creation of a Tier3 "How-To".

Operational issues

  • Individual users are requesting datasets. The current procedure is very complicated, as the latest exercise from Erik B at UC has shown (see thread on ddm-l).
  • Large numbers of jobs fail because of transformation errors (inexperienced users). Would like to accept only jobs with known working transformations, to avoid having to troubleshoot these.
  • Brings up questions - how can you get people to run validation steps? Fast track queue? Test area.


  • We spend a large portion of our time troubleshooting problems. Clearly some fraction of these problems fall in the gray area between being clearly the responsibility of the MWT2 and clearly the responsibility of the production team and/or BNL - how can such problems be debugged efficiently?
  • How to spread the effort of supporting dCache, DQ2, etc. over enough people to make the system less fragile (it seems very fragile now)?
  • Central notification service - e.g., are central services down? Which FTS channels are deactivated?
  • Persistent chat room.

Future Procurements

  • Should the future acquisition of computing equipment be coordinated among all sites in the US ATLAS Facility (Tier 1 and 2)?
  • How can we get a reasonable division between compute and storage at each site, so that overall US ATLAS has both sufficient CPU power and sufficient storage? Getting the right balance is critical.
  • Need to share schedules across sites. Can we align things. Need to exchange information, at least.
  • Standardization of recommendations, requirements. What needs to be investigated/evaluated - the farms group at BNL would be available for evaluation of hardware choices.
  • Create a boxed set of HEP-specific applications to be used as a standardized ATLAS benchmark, to be used for capacity planning.
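A "boxed" benchmark could be as simple as a wrapper script that times a fixed workload on each candidate machine and reports the elapsed time. This is only a sketch: the workload below is a stand-in (compressing generated data), where a real kit would run representative ATLAS transformations instead:

```shell
# Sketch of a boxed-benchmark wrapper. The gzip workload is a placeholder;
# a real ATLAS benchmark kit would substitute representative jobs here.
start=$(date +%s)
dd if=/dev/zero bs=1M count=64 2>/dev/null | gzip -c > /dev/null
end=$(date +%s)
elapsed=$((end - start))
echo "workload=gzip-64MB host=$(hostname) elapsed_seconds=$elapsed"
```

Run during procurement evaluation, this gives a like-for-like comparison across vendor hardware, rather than relying on published SpecInt figures alone.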

Planning for T2/T3 meeting at Indiana

  • Meeting URL: http://www.mwt2.org/t2conf/.
  • Friday reserved for a tutorial on production monitoring - we need to make sure that this is well organized and useful to the participants.
  • It would be good to provide some basic guidance on what hardware and software a Tier3 should provide. Should there be a session on this (a getting-started session)? It would also be good to apprise inexperienced users of the possibilities for obtaining support.
  • Tier3: Getting started session. Installing ATLAS software. OSG client software.

Project planning


  • Site infrastructure tests - site availability monitoring. Site level. Nagios-type system.
  • Load tests. gsiftp. move up to dq2.
  • With kitval job, add simple benchmarks.
  • Information service - critical set.
  • FTS viewing, result of glite commands. Tracking transfers of individual files - looking at srm queue, eg.
  • Star configuration of transfers - BNL person to work on visualization.
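The Nagios-type site availability monitoring mentioned above could start from something as simple as a TCP probe of each gatekeeper's GridFTP control port. A hedged sketch follows; the hostname is hypothetical, and a real probe would be packaged as a Nagios plugin and also exercise authentication:

```shell
# Minimal Nagios-style probe: OK if the GridFTP control port accepts a TCP
# connection within 5 seconds, CRITICAL otherwise. Hostname is hypothetical.
host=tier2-gk.example.org   # substitute a real gatekeeper
port=2811                   # standard GridFTP control port
if timeout 5 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
    status=OK
else
    status=CRITICAL
fi
echo "GRIDFTP $host:$port $status"
```

The same pattern extends to SRM doors and the DQ2 site-services hosts, giving the site-level availability view a Nagios-type system would aggregate.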

-- RobertGardner - 08 May 2007
