r9 - 25 Feb 2009 - 14:14:19 - CharlesWaldman

MinutesDataManageFeb24

Introduction

Minutes of the US ATLAS Data Management meeting, Feb 24, 2009
  • Previous meetings and background : IntegrationProgram
  • Coordinates: Tuesdays, 3:00pm Central
    • (309) 946-5300, Access code: 735188; Dial *6 to mute/un-mute.

Attending

  • Meeting attendees: Pedro, Armen, Hiro, Rob, Charles, Saul, John, Bob, Patrick, Wei, Wensheng, Shawn
  • Apologies: Alexei
  • Guests: None

Background

A regular meeting to discuss site-level data management issues in depth, for US cloud, to include topics such as:
  • Storage validation (filesystem, LFC, DQ2 catalog)
  • Data placement policies
  • Data deletion policies, plans
  • Data transfer problems
  • Datasets required at sites for analysis, datasets to be deleted, etc.
  • User & group dataset policies
  • Storage capacities required in space tokens
  • Storage capacities reporting

Topics for this week

  • Introduction - Kaushik
  • Storage reporting status - Armen
  • PRODDISK cleanup - Charles
    • proddisk-cleanse.py http://repo.mwt2.org/viewvc/admin-scripts/lfc
      • probably too general if cleaning PRODDISK only
        • Starts from the LFC (for pandamover data) and optionally from DQ2 listings (if the -dq2 flag is specified). Does not traverse the filesystem, so it will not find "orphan" files on disk (see next item)
        • Evolved from cleanse.py (pre-space-token). Would be good to have a spec for what this program should be doing (overlapping datasets, etc).
      • cleaning pandamover data (_dis datasets) - works well, selecting by date range; a storage threshold option would also be nice to have
      • can also clean DQ2 data (file/LFC deletion only, for _sub datasets or mistakes); an option to delete dataset info from the central catalog would be nice to have.
    • Site usage:
      • MWT2 - cleans both types of data on demand (when the space fills up), using the default cutoff of 2 weeks (deletes files older than that)
      • AGLT2, NET2 - same as MWT2
      • SWT2 - so far only used to clean pandamover data once
      • WT2 - only tried once, older version
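
The age-based selection described above, together with the requested storage threshold, could be sketched as follows. This is only an illustration and not the actual proddisk-cleanse.py logic: the function name, its parameters, and the example paths are hypothetical, and the threshold option is the wished-for feature, not an existing one. The 14-day default mirrors the 2-week default mentioned for MWT2.

```python
from datetime import datetime, timedelta


def select_for_deletion(entries, now, max_age_days=14,
                        used_fraction=None, threshold=None):
    """Return paths of catalog entries older than max_age_days.

    entries: list of (path, creation_time) tuples (hypothetical layout).
    If threshold is given, nothing is selected unless the space token's
    used_fraction has reached that threshold.
    """
    if threshold is not None:
        if used_fraction is None or used_fraction < threshold:
            return []  # space is not full enough to trigger a cleanup
    cutoff = now - timedelta(days=max_age_days)
    return [path for path, ctime in entries if ctime < cutoff]


# Made-up example entries; real input would come from an LFC listing.
now = datetime(2009, 2, 24)
entries = [
    ("/example/proddisk/old_dis_file", datetime(2009, 2, 1)),
    ("/example/proddisk/new_dis_file", datetime(2009, 2, 20)),
]
by_age = select_for_deletion(entries, now)
below_threshold = select_for_deletion(entries, now,
                                      used_fraction=0.5, threshold=0.9)
above_threshold = select_for_deletion(entries, now,
                                      used_fraction=0.95, threshold=0.9)
```

Keeping the threshold check separate from the age cutoff means a site could run the script periodically and have it do nothing until the space token is nearly full.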

  • Stray file checker ccc.py - Charles
    • CCC = Complete Consistency Checker - http://repo.mwt2.org/viewvc/admin-scripts/lfc/
    • Looks for ghosts (present in catalog but not on disk) and orphans (vice-versa)
    • Somewhat dCache-specific, but easy to modify for other filesystems
    • Discuss in future meeting
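
The ghost/orphan distinction amounts to two set differences between a catalog listing and a disk listing. A minimal sketch, assuming both listings are already available as lists of paths; this is not the ccc.py implementation, and the function name and example paths are hypothetical:

```python
def classify(catalog_paths, disk_paths):
    """Return (ghosts, orphans) as sorted lists of paths."""
    catalog = set(catalog_paths)
    disk = set(disk_paths)
    ghosts = sorted(catalog - disk)   # in the catalog but missing on disk
    orphans = sorted(disk - catalog)  # on disk but unknown to the catalog
    return ghosts, orphans


# Made-up listings; real input would be an LFC dump and a storage dump.
catalog_dump = ["/example/a.root", "/example/b.root"]
disk_dump = ["/example/b.root", "/example/c.root"]
ghosts, orphans = classify(catalog_dump, disk_dump)
print("ghosts:", ghosts)
print("orphans:", orphans)
```

The storage-specific part of such a checker is producing the disk listing; the comparison itself does not depend on the filesystem.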

  • Current deletion policies - Hiro, Armen
    • Tier 2s are not cleaning any space token other than PRODDISK
    • some data may have gotten deleted in other space tokens - we will pick this topic up later
    • For obsolete data, we would like to clean the whole US cloud centrally from BNL, after notifying the users
    • Technical problem - these datasets are no longer in the central catalog
      • Discussing with DDM the possibility to get the information from central catalog (Pedro has filed Savannah ticket)
      • Backup plan, get a dump from DQ2 catalog
    • Hiro is developing deletion tool (keeps list of all datasets to be deleted in the US cloud)
      • Daemon does cleanup automatically
      • Tool is space token specific
      • Let's try this centrally from BNL (unless load becomes too high - then we have to run at each site)
    • USERDISK cleanup at BNL - users have been notified, but the space has not been cleaned yet
      • Waiting for Hiro's tool to be ready
      • The tool can also be used for cleaning Tier 2s
      • Always notify user (more details from Armen here)
  • Hot issues
    • What to do if space fills up - discuss at next meeting.
    • List of datasets that must be present (possibly deleted from DATADISK)
  • AOB



