r2 - 09 Oct 2006 - 15:51:25 - TorreWenaus

OSG extensions activity in storage area

Initial priorities

  • dCache deployment to 15 sites (ATLAS and CMS Tier 2s)
    • site functional tests as prerequisite
    • based on SRM 2.2
    • sites advertised via official OSG catalog
    • installations documented on site twiki pages, with configuration exposed to support people in a standard way
    • OSG-wide metrics monitored and posted: average and aggregate transfers per day, total space available/used, space available/used to the general OSG community.
      • based on billing DB deployed to all sites, and an OSG-standard data gather/present
  • agree on an OSG support model: ticketing system, deployment and operations support.
  • 'tax' of 5% of site storage capacity to be used by OSG community; payment for OSG's support services
    • hard partitioning, documented and available. Manage it how?
  • operations tools
    • file existence and integrity checking (e.g. pnfs checker)
    • Alarm sensors for dCache. Only the sensors, built so that they plug into NGOP-, Nagios-, or MonALISA-based alarm systems.
      • Are the sensors alone enough? Also require that they feed an OSG-standard system, possibly in addition to feeding an in-house favorite monitor
        • Lesson of Panda is that more and deeper info (as long as it is correct info) pays off for fast, effective diagnostics and for automation
  • Validation, deployment of new Chimera namespace catalog when it is available?
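
The metrics listed above (average and aggregate transfers per day, and the 5% share of site capacity reserved for the OSG community) can be sketched as follows. This is a minimal illustration only: the `Transfer` record and its fields are a hypothetical stand-in for whatever the billing DB actually exposes, and the function names are invented for this example.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass
class Transfer:
    """One completed transfer (hypothetical schema, not the real billing DB layout)."""
    day: date
    bytes_moved: int


def daily_transfer_metrics(transfers):
    """Return {day: (count, aggregate_bytes, average_bytes)} for each day with transfers."""
    totals = defaultdict(lambda: [0, 0])  # day -> [transfer count, total bytes]
    for t in transfers:
        totals[t.day][0] += 1
        totals[t.day][1] += t.bytes_moved
    return {
        day: (count, total, total / count)
        for day, (count, total) in sorted(totals.items())
    }


def osg_share_bytes(total_capacity_bytes, share_percent=5):
    """Capacity set aside for the general OSG community, using integer arithmetic."""
    return total_capacity_bytes * share_percent // 100
```

A site would feed `daily_transfer_metrics` from its own billing records and publish the result through whatever OSG-standard gather/present mechanism is agreed on; the sketch says nothing about how the partition itself is enforced.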

Text from Fermilab (Eileen) for effort there:

dCache is a distributed disk-based storage system that began as a rate-adapting front-end cache to tape-based mass storage systems, supporting POSIX I/O and FTP-oriented file access. dCache has evolved into a full-featured storage system capable of very high data delivery rates, optional internal data replication for increased robustness in disk-only systems, configurable policies for automated management of internal data flows, and a standard Grid interface using the SRM API.

Storage Resource Managers (SRMs) are middleware components that manage shared storage resources on the Grid through common application interfaces. SRMs provide protocol negotiation, dynamic transfer URL allocation, advance space and file reservation, and reliable replication mechanisms. Use of the SRM interface by OSG Storage Elements will simplify the task of building an OSG Data Grid by facilitating reservation and sharing of storage resources on the Grid.
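
As a rough illustration of protocol negotiation and dynamic transfer URL allocation, the sketch below picks the first protocol in the client's preference list that the storage element supports and builds a transfer URL from it. It is not the SRM API: the function names, the host, and the path are assumptions made up for this example.

```python
def negotiate_protocol(client_preferences, server_supported):
    """Return the first protocol in the client's ordered list that the server supports.

    Returns None when there is no protocol in common.
    """
    supported = set(server_supported)
    for protocol in client_preferences:
        if protocol in supported:
            return protocol
    return None


def make_transfer_url(protocol, host, path):
    """Build an illustrative transfer URL; host and path are hypothetical values."""
    return f"{protocol}://{host}{path}"


# Example: the client prefers gsiftp, but this storage element only offers dcap and http.
chosen = negotiate_protocol(["gsiftp", "dcap"], ["dcap", "http"])
if chosen is not None:
    turl = make_transfer_url(chosen, "pool01.example.org", "/data/file1")
```

The point of the design is that the client never hard-codes where or how a file is served; it states what it can speak and lets the storage side decide.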

To succeed with their physics programs, the LHC experiments ATLAS and CMS expect to accumulate 10 PB of data each in 2008, and to serve it via ~30 PB of disk space to ~100 MSpecInt2000 of CPU power across ~100 computing centers worldwide. A crucial challenge for the LHC physics program is to provision a cost-effective, high-performance, feature-rich storage element that is sufficiently easy to operate and support at the scale of these ~100 computing centers. ATLAS and CMS identified dCache as the storage technology to satisfy this need.

SRM-enabled storage systems such as dCache can be difficult to install, configure and support because of their distributed architecture and large number of configuration and administration options. The OSG-specific deployment framework and integration into existing OSG monitoring and accounting mechanisms will greatly reduce the administration and support overhead of each deployment. Integration of dCache/SRM authorization with OSG Virtual Organization based authorization services will allow grid-wide control of the storage resources and will simplify user administration for site storage administrators.

Because site network configurations and limitations (such as firewalls) vary, configurations must be tailored to the specific needs of each site. Integration of functional and end-to-end testing and troubleshooting tools will allow early detection and fast resolution of problems, improving the quality of the storage service at each installation while maintaining a reasonable support load.


Year 1

  • Deploy, integrate, and provision SRM/dCache for general use on OSG, including storage authorization (authz) and SRM v2.2
  • Integrate a suite of operations tools for all sites
  • Integrate a first version of an end-to-end troubleshooting tool for SRM/dCache transfers.
  • Support two alternate deployment methods: VDT & ROCKS.
  • Integrate a meaningful set of site functional tests for SRM/dCache
  • Establish a support model
  • Establish an education model
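
One of the operations tools named under the initial priorities is an alarm sensor for dCache that can plug into NGOP, Nagios, or MonALISA. A minimal sketch of such a sensor, kept separate from any particular alarm system, might look like the following. The status values follow the Nagios plugin exit-code convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN); the free-space check, its thresholds, and the service name are hypothetical.

```python
from enum import IntEnum


class Status(IntEnum):
    """Status levels numbered as in the Nagios plugin exit-code convention."""
    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


def check_free_space(free_bytes, total_bytes, warn_fraction=0.10, crit_fraction=0.05):
    """Classify free space on a storage pool; the thresholds are illustrative assumptions."""
    if total_bytes <= 0 or free_bytes < 0:
        return Status.UNKNOWN, "invalid space figures"
    fraction = free_bytes / total_bytes
    message = f"{fraction:.1%} free"
    if fraction < crit_fraction:
        return Status.CRITICAL, message
    if fraction < warn_fraction:
        return Status.WARNING, message
    return Status.OK, message


def format_plugin_line(service, status, message):
    """Render one status line; an adapter for each alarm system would consume this."""
    return f"{service} {status.name} - {message}"
```

Because the sensor only returns a status and a message, the same check could feed an in-house monitor and an OSG-standard system through separate thin adapters.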

Year 2

  • Enhance the end-to-end SRM/dCache troubleshooting tool.
  • Establish SRM/dCache site test procedures for OSG 1.0
  • Reintegration, deployment, upgrade support for established tools.
  • Integrate new operations tools
  • Integrate new site functional tests
  • Integrate into SRM/dCache a site configuration backup tool
  • Integrate SRM/dCache OSG-specific monitoring values into an OSG-supported tool
  • Provide a first-version integration of operational tools, site tests, and monitoring into a cohesive web-based display.

Major updates:
-- TorreWenaus - 25 Sep 2006


