r5 - 28 Nov 2007 - 19:59:40 - RobertGardner



Minutes of the Tier2/Tier3 workshop at SLAC, November 28-30, 2007

LHC and ATLAS status (Jim Shank)

  • See slides - overview of LHC machine and ATLAS status
  • Beam by July 2008, but a large number of sector tests (cryo interconnects, leaks, power-up, ...) run right up to the turn-on date. No room for slippage.
  • Leaks and shorts are the big problem - warm up/cool-down takes 2 months to recover
  • Problems resolved:
    • Triplet problem - welding and cold masses
    • Problems in the plug-in modules (in the bellows that interconnect the magnets), where the fingers get mangled.
  • 4 sectors will be cooled by end of year - so on schedule for July
  • ATLAS - schedule is also very tight
  • Small wheel, other services on the critical path
  • Endcap toroid problem. During tests last Saturday it slid during power-up and hit parts of the LAr calibration system. Assessing damage now. If minimal, this can be dealt with w/o warming up the LAr.
  • ATLAS computing
  • SRM testing v2.2, throughput testing
  • FDR pre in December (bytestream production - output to SFO at CERN); FDR-2 will be Jan-Mar 2008, a big simulation production.
  • CCRC - combined computing readiness challenge - a WLCG exercise.
  • M6 coming up in April. There may be an M7, depending on schedule.
  • Combine FDR and CCRC? Discussions on-going.
  • See FDR definition - Dave Charlton.
  • Putting data at the SFO's - emulating real collection of data - raw data for physics streams, calibration streams, expressline
  • Possible there will be multiple reconstruction runs on the express streams.
  • Want to get ES data to BNL/Tier1
  • Data processing - as function of Tiers
  • AOD copied to Tier1s and Tier2s immediately, and DPD as requested.
  • Exercise the concept of a DPD train - a small group of physics-group coordinators place algorithms on the train, which passes over the datasets.
  • There are questions about where the DPD production chain runs - we may run it throughout the Tiers.
  • Discussion of the new computing organization within ATLAS, ADC (ATLAS Distributed Computing). Two areas - development and operations. First meeting this morning. Will begin pushing this very hard in January.
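
The DPD train idea noted above (several physics-group algorithms riding over a dataset in one pass) can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the event dictionaries, the `run_train` function and the two selections are stand-ins for illustration and are not ATLAS software.

```python
from typing import Callable, Dict, Iterable, List

Event = Dict[str, float]            # hypothetical stand-in for an AOD event
Selector = Callable[[Event], bool]  # one "carriage": a physics-group selection


def run_train(events: Iterable[Event],
              carriages: Dict[str, Selector]) -> Dict[str, List[Event]]:
    """Read each event once and offer it to every registered selection."""
    outputs: Dict[str, List[Event]] = {name: [] for name in carriages}
    for event in events:                      # single pass over the input
        for name, accept in carriages.items():
            if accept(event):
                outputs[name].append(event)   # one output (DPD) per group
    return outputs


# Hypothetical selections from two physics groups.
carriages = {
    "high_pt_muon": lambda ev: ev["muon_pt"] > 20.0,
    "large_met": lambda ev: ev["met"] > 50.0,
}

events = [
    {"muon_pt": 35.0, "met": 10.0},
    {"muon_pt": 5.0, "met": 80.0},
    {"muon_pt": 25.0, "met": 60.0},
]

dpds = run_train(events, carriages)
print({name: len(selected) for name, selected in dpds.items()})
# prints {'high_pt_muon': 2, 'large_met': 2}
```

The point of the design is that the input is read once no matter how many selections ride along, which matters when the input sits on slow or contended storage.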

Facilities Status and Plans (Michael Ernst)

  • See slides giving overview of the computing organization in US ATLAS
  • Service model - need a U.S. ATLAS operations coordinator, working from Brookhaven
  • US contribution to worldwide production is substantial - level of 33% overall
  • Data location among tiers - a task group is working on the model for Tier3's, work underway
  • Resource allocation committee - chaired by Jim - user community can submit requests
  • Tier1 resources - additional funds coming from MR for LAN backbone upgrades and high performance disk, etc. Other resources for operations and integration
  • Data storage is a main challenge - upgrade to dCache 1.8, which provides SRM 2.2 (allows pinning), and evaluation of Chimera to replace the vulnerable pnfs.
  • See list of critical issues - too many files of US data on tape, long latencies for pathena users, need for more disk
  • Why tapes don't work well - esp for analysis
  • Transitioning to operations - stability is the most important thing; manpower intensive.
  • New operational model (service based, not systems based) for the RACF. Implementing new SLAs, service coordinators oversee response and resolution of error conditions. Work in progress, see dependency matrix.
  • Challenges ahead: infrastructure to be built; more efficient use of the facility resources; integration of operations of the Tier1 with the whole facility.
  • Securing the facilities readiness - getting guidance from ATLAS milestones.
  • Questions
    • Jim: dCache - any hands-on experience with SRM 2.2 and Chimera? Answer: Chimera is clearly expected to be a major improvement over pnfs, which uses file locking. SRM v2 adds pinning, which is completely missing in the current version; it has to be integrated at the DDM level to track status, since the current system pushes data out w/o control. Another problem had to do with performance degradation under concurrent requests. These problems are claimed to be resolved, and the system should handle thousands of requests from DDM. Open question - what would be a viable alternative at the level of a Tier1?
    • Kaushik: Need layering of software and services, ATLAS-specific, on top of dcache.

Facilities Integration Program (Rob Gardner)

  • See slides
  • Question about Tier3 center support and definition. There is a need to define this better.

Production and Analysis (Kaushik De)

  • Process and functional requirements at US ATLAS Tier2 and Tier3 centers including Panda Mover, Pathena, Autopilot, and DQ2
  • Going over high level issues:
    • MC production and processing simultaneously
    • How much to scale up?
    • How to integrate Tier3?
  • Major challenge will be the work associated with extending Panda to other clouds
  • Another challenge is providing pinning functionality. srm v2.2 will be rolled out in mid-December at BNL. It also has to be incorporated into DQ2.
  • DQ2 critical issues
    • hierarchical datasets, lost file flag, tape handling
  • LFC - testing and evaluating for performance
  • Distributed analysis (DA) usage is rising rapidly but is limited by data availability; note there are 30K user jobs waiting to run.
  • Action item: replicate AODs to Tier2s. How many files/TB? Need a point-person for this in the US Cloud.
  • Questions
    • Patrick: What about user datasets? At the moment they are left where they are produced. How long should they be kept precious at a Tier2? Expiration and quota system? This is a pressing issue.

Shift operations, troubleshooting, discussion (Mark Sosebee)

  • A discussion moderated by Mark Sosebee, including an introduction to the ELOG shift logging system
  • See eLog - http://atlas003.uta.edu:8080/
  • Goal will be to catalog shift info from last few years.
  • Can we post system alerts for central services that are down? Yes.

Fabric Services: Storage management - Xrootd (Ofer Rind)

  • See slides
  • Very appropriate for smaller sites like Tier3s. Integration w/ PROOF most interesting. An Apache/Tomcat server can be used for remote PROOF sessions.
  • Operation and management: two people work on the test installation at BNL, set up to separate classic sysadmin tasks (Ofer) from file system tasks (Sergey). The file system tasks run non-privileged, with sudo privileges for some operations, using the xrd admin account and the Tentacle cluster management tools.
  • Lots of monitoring, both native and Ganglia/Nagios integrations
  • XrdMon installation. Would like to see mature product available for this. Need a contribution!
  • Managing utilization, workloads. Eg. analysis trains. Need tools for data management tasks, all within the ATLAS framework.
  • Questions
    • Ratio of PROOF cores to xrootd server nodes: PROOF tries to use data from local sources. A big area of investigation.
    • What about data import, export issues and xrootd? - See Andy's talk below, regarding work on srm-xrootd.
    • Interest in FUSE - file system visibility

Proof and Xrootd for Tier3 centers (Bruce Mellado)

  • Packetizer - basically a job scheduler that defines how jobs are run on nodes. This is where work is to be done, to optimize for Tier3-size facilities.
  • xrootd-proof tests at GLOW-ATLAS
  • Look at response of PROOF to varying file sizes and formats.
  • Result: I/O is limited to 50 MB/s by the RAID5 cards. Considering a move to 16-core systems.
  • Performance tests - learned that one shouldn't start more than 2 workers on a node. The importance of pre-load is still to be understood.
  • Issues at GLOW - avoid network traffic.
  • Looking at Condor job scheduling over a Tier3 site, integrating PROOF and Condor COD. Issue is supporting multiple users.
  • I/O queue tests - using ESD files, comparing locally resident data with data from other nodes served by xrdcp. The big difference comes from attempting NFS.
  • Looking at xrootd file distribution, and reduction in elapsed time.
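
The packetizer role described above (deciding which worker processes which piece of data, while avoiding network traffic) can be sketched in Python. This is a simplified illustration under stated assumptions, not PROOF's actual packetizer: the class name `LocalityPacketizer`, the `(file, first entry, entry count)` packet layout and the fallback rule are all hypothetical.

```python
from collections import deque
from typing import Deque, Dict, Optional, Tuple

Packet = Tuple[str, int, int]  # (file name, first entry, number of entries)


class LocalityPacketizer:
    """Hands out packets on request, preferring files stored on the worker's node."""

    def __init__(self, files: Dict[str, Tuple[str, int]], packet_size: int) -> None:
        # files maps file name -> (node holding the file, number of entries)
        self.queues: Dict[str, Deque[Packet]] = {}
        for name, (node, entries) in files.items():
            queue = self.queues.setdefault(node, deque())
            for first in range(0, entries, packet_size):
                queue.append((name, first, min(packet_size, entries - first)))

    def next_packet(self, worker_node: str) -> Optional[Packet]:
        local = self.queues.get(worker_node)
        if local:                       # local data first: no network traffic
            return local.popleft()
        # Otherwise take work from the node with the largest backlog.
        busiest = max(self.queues.values(), key=len, default=None)
        if busiest:
            return busiest.popleft()
        return None                     # all packets have been handed out


files = {"a.root": ("node1", 250), "b.root": ("node2", 100)}
packetizer = LocalityPacketizer(files, packet_size=100)
first = packetizer.next_packet("node2")   # served from node2's own disk
second = packetizer.next_packet("node2")  # node2 is drained, so taken from node1
print(first, second)
```

Workers pull packets rather than receiving a fixed share up front, so a slow node simply asks less often. The fallback to the largest backlog is one possible tuning point for a Tier3-size cluster.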

Next generation of xrootd (Andrew Hanushevsky)

  • Next generation xrootd/olbd and its performance, Posix filesystem interface for xrootd using FUSE
  • Scalla implements a distributed namespace.
  • cnsd - composite namespace server daemon. Client sees a composite namespace for each server, hosted by a common xrootd. Namespace is replicated in the filesystem. No external database needed, small footprint.
  • FUSE - implement a file system in a user space program.
  • Application client has Posix access via kernel to namespace server, etc.
  • Globalization: redirectors can simply affiliate with a meta manager
  • ALICE has found very good performance for WAN access using hints which effectively pre-read root-tree data. Comparable to LAN performance.
  • Questions:
    • Namespace management in the meta-cluster configuration. Would have to set up a meta-manager.
    • Issue about the client mounting the namespace and how data flows between client and server - how does srm-like load-balancing work?
    • FUSE - no quotas possible. Authentication is done Unix-like (e.g. as in NFS).
    • More questions and good discussion followed.
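
The composite namespace idea above (one merged view of files held by many data servers, with no external database) can be illustrated with a small Python sketch. It is not the cnsd implementation: the class `CompositeNamespace` and its methods are hypothetical, and it keeps the view in an in-memory dictionary, whereas the talk describes the namespace as replicated in the filesystem.

```python
from typing import Dict, List, Set


class CompositeNamespace:
    """Merges per-server file listings into one view (hypothetical sketch)."""

    def __init__(self) -> None:
        self._holders: Dict[str, Set[str]] = {}   # path -> servers holding it

    def record_create(self, server: str, path: str) -> None:
        self._holders.setdefault(path, set()).add(server)

    def record_remove(self, server: str, path: str) -> None:
        holders = self._holders.get(path)
        if holders is None:
            return
        holders.discard(server)
        if not holders:                 # last copy gone: drop the entry
            del self._holders[path]

    def listdir(self, directory: str) -> List[str]:
        prefix = directory.rstrip("/") + "/"
        names = {p[len(prefix):].split("/", 1)[0]
                 for p in self._holders if p.startswith(prefix)}
        return sorted(names)

    def locate(self, path: str) -> List[str]:
        return sorted(self._holders.get(path, ()))


ns = CompositeNamespace()
ns.record_create("srv1", "/atlas/aod/f1.root")
ns.record_create("srv2", "/atlas/aod/f2.root")
ns.record_create("srv2", "/atlas/aod/f1.root")
print(ns.listdir("/atlas/aod"))          # prints ['f1.root', 'f2.root']
print(ns.locate("/atlas/aod/f1.root"))   # prints ['srv1', 'srv2']
```

A client listing a directory sees a single entry per file regardless of how many servers hold a copy, which is the behaviour a FUSE mount would expose to POSIX applications.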

Fabric Services: Analysis Queues (Mark Sosebee, Bob Ball)

  • (Condor, PBS) functionality and configuration examples for setting up analysis queues
  • Questions / issues to consider. Dedicated versus opportunistic CPUs. Long versus short queues.
  • So far, pathena jobs have run only at BNL - primarily due to location of AODs
  • Note new mode: pathena --site xxx.
  • AGLT2 and OU sites are set up (Condor sites)
  • On-going testing at UTA using PBS scheduler
  • Need to determine schedule
  • Easy to set up for Condor sites - Horst did this in an hour.
  • An issue is setting up analysis queues for Tier3 - can they fetch AODs or ESDs already at an affiliated Tier2?
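
The long versus short queue question above can be made concrete with a minimal Python sketch of a routing rule. The queue names, the 60-minute limit and the `AnalysisJob` fields are assumptions chosen for illustration; they do not describe the Condor or PBS configuration at any of the sites mentioned.

```python
from dataclasses import dataclass


@dataclass
class AnalysisJob:
    user: str
    requested_minutes: int   # walltime estimate supplied with the job


# Hypothetical site policy: names and limits are illustrative only.
SHORT_QUEUE_LIMIT_MINUTES = 60


def choose_queue(job: AnalysisJob) -> str:
    """Route quick-turnaround jobs to the short queue, the rest to the long one."""
    if job.requested_minutes <= SHORT_QUEUE_LIMIT_MINUTES:
        return "analysis_short"
    return "analysis_long"


print(choose_queue(AnalysisJob(user="alice", requested_minutes=20)))
print(choose_queue(AnalysisJob(user="bob", requested_minutes=480)))
```

In a real batch system the same split would be expressed in the scheduler's own configuration; the sketch only shows the decision a site has to make when defining the two queues.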

-- RobertGardner - 28 Nov 2007
