Minutes of the Facilities Integration Program meeting, Sep 30, 2009
  • Previous meetings and background : IntegrationProgram
  • Coordinates: Wednesdays, 1:00pm Eastern
    • (309) 946-5300, Access code: 735188; Dial *6 to mute/un-mute.


  • Meeting attendees: Michael, Rob, Sarah, Rich, Shawn, Fred, Jason, Horst, Karthik, Charles, Armen, Kaushik, Mark, Torre, Doug, Aaron, Wei, Hiro, Xin, Jim C, John B, Saul, ... (missed some late arrivals)
  • Apologies:

Integration program update (Rob, Michael)

  • Introducing: SiteCertificationP10 - FY09Q04
  • Special meetings
    • Tuesday (9am CDT): Frontier/Squid
    • Tuesday (9:30am CDT): Facility working group on analysis queue performance: FacilityWGAP suspended for now
    • Tuesday (12 noon CDT) : Data management
    • Tuesday (2pm CDT): Throughput meetings
  • Upcoming related meetings:
  • US ATLAS persistent chat room http://integrationcloud.campfirenow.com/ (requires account, email Rob), guest (open): http://integrationcloud.campfirenow.com/1391f
  • Program notes:
    • last week(s)
      • CapacitySummary
      • WLCG pledge
      • ATLAS has refined its resource requirements relative to what was requested earlier. Reducing the number of replicas of complete AOD datasets has lowered the requirements, in particular at the T1 (9.5 PB down to 5 PB), but not at the T2's. The old 2010 requirement of 25 kHS06 stays at the same level in terms of storage and CPU. The 2009 pledge is 2.5 PB of disk across all T2's (almost what we have); the target is 3.1 PB by end of year. 2010 requires a steep ramp-up to satisfy ATLAS plus the US reserve: disk goes from 3 to 7 PB in 2010. In terms of CPU we're already doing very well: currently at 55 kHS06 against a target of 61 kHS06, and 76-80 kHS06 by end of 2010 shouldn't be an issue. We really have to get CPU and storage balanced, so we need to more than double the storage space deployed by end of 2009 (see the capacity arithmetic sketch at the end of this section). A table was sent out summarizing currently installed capacity - some sites need to ramp significantly. The goal is to get all our T2's to the same level from the production perspective; it's harder to balance large and small sites, and it may take more than one year to homogenize the level of resources across sites. Regarding technology: wherever possible use the newest disk technology, e.g. 2 TB drives, and avoid old technology because of the cost of space and power. Target dates will depend on the performance of the machines. The overriding issue is supporting analysis.
      • Kaushik: we need to plan for the allocation of the storage among the space tokens. As we evolve towards 1 PB, we need specific information about procurements, step by step. Stefan's proposal from SW week.
      • Shawn: We need a schedule for required storage.
      • Armen is watching critical storage areas at T1/T2's.
    • this week:
      • Quarterly reports due!
      • Storage requirements: SpaceManagement
      • FabricUpgradeP10 - procurement discussion
      • End user analysis test (UAT), Date: 21-23 Oct. (28-30 Oct. as a backup date), UserAnalysisTest
        • 5 large containers - 100M events - spread over Tier 2s, mostly US sites
        • Expert users running jobs making ntuples first few days (20 people); in Europe, 4-5 people
        • Larger group would copy ntuple datasets
        • Smaller datasets will run over raw and ESDs
        • Metrics - is information going into panda db correct?
        • 400M events high pt, 100M low pt. Merge jobs going on right now. Earlier 300M sample already done. 520M events total. 6 containers.
        • How much space? 80 TB for all 6 containers.
        • 420M events will go to 5 containers - small and large.
        • Plan is that some jobs will not be US specific.
        • Pre-testing should show what the output sizes are.
        • USERDISK - capacity could be as much as 8 TB.
        • At the end, could be many users attempting fetches of output ntuples via dq2-get.
        • Lots of questions of details on how this will work.
        • Will be discussed in daily operations meeting; need to follow-up here next week
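      • To make the storage and event-count figures above concrete, here is a minimal back-of-the-envelope sketch using only numbers quoted in these notes; the derived per-event size and ramp factors are illustrative, not official planning parameters:
        # Back-of-the-envelope arithmetic on the capacity and UAT numbers above.
        # All inputs are taken from the meeting notes; outputs are illustrative.

        def ramp_factor(current, target):
            """Multiplicative increase needed to reach the target capacity."""
            return target / current

        # T2 disk ramp from the pledge discussion: ~3 PB now -> 7 PB in 2010.
        print(f"T2 disk ramp 2009->2010: x{ramp_factor(3.0, 7.0):.1f}")

        # CPU: 55 kHS06 installed versus a 61 kHS06 target.
        print(f"CPU shortfall: {61 - 55} kHS06 ({(61 - 55) / 61:.0%} of target)")

        # UAT containers: ~80 TB for 6 containers holding ~520M events implies
        # an average AOD event size of roughly:
        events, total_tb = 520e6, 80.0
        print(f"Average event size: ~{total_tb * 1e9 / events:.0f} kB/event")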

Operations overview: Production (Kaushik)

  • Reference:
  • last meeting(s):
    • Very successful reprocessing validation at BNL - went quickly, 20K jobs/day. Caveats: some jobs are crashing; a Panda tweak now orders jobs by success probability (a toy illustration of this ordering appears at the end of this section). About 2% job failures in the reprocessing task.
    • Today: James Catmore was given the green light for full reprocessing. It will be fully Panda-brokered.
    • The US cloud may get the majority of the jobs, and they may finish quickly.
    • Getting lots of mc09 validation tasks. There are some problems still.
    • There is cosmic data being distributed - keep an eye on storage
    • GB/s going to BNL overnight from CERN, to tape
    • Volume of cosmic data for T2s? There seems to be derived data being generated; AOD datasets are defined and may be distributed to T2s as well, probably via subscriptions.
  • this week:
    • https://twiki.cern.ch/twiki/bin/view/Atlas/ADCReproFromESDAug2009
    • Reprocessing completed, all went well, US did 60%.
    • 2M job task ran into counter limits; Kaushik cleaned it up, but there will be some job failures.
    • Back to usual mode - US monte carlo queue fillers
    • Validation tasks will be coming in.
    • Real mc09 production about a week away.
    • Duplicate events in a large number of tasks. 100's of tasks to be regenerated.
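    • The "orders jobs by success probability" tweak above can be illustrated with a toy example. This is a hypothetical sketch, not Panda's actual brokerage code; the task names and rates are made up:
      # Hypothetical illustration of ordering queued tasks by recent success rate.
      from dataclasses import dataclass

      @dataclass
      class QueuedTask:
          name: str
          attempts: int
          successes: int

          @property
          def success_probability(self):
              # Treat brand-new tasks neutrally to avoid division by zero.
              return self.successes / self.attempts if self.attempts else 0.5

      tasks = [
          QueuedTask("repro-000123", attempts=200, successes=196),
          QueuedTask("mc09-valid-42", attempts=50, successes=35),
          QueuedTask("new-task-7", attempts=0, successes=0),
      ]

      # Dispatch the tasks most likely to succeed first.
      for t in sorted(tasks, key=lambda t: t.success_probability, reverse=True):
          print(f"{t.name}: p(success) ~ {t.success_probability:.2f}")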

Shifters report (Mark)

  • Reference
  • last meeting:
    Yuri's summary from the weekly ADCoS meeting:
    [ ESD reprocessing -- restarted mid-week once the new s/w cache became available.  Most (all?) jobs are running with the "--ignoreunknown" flag accepted,
    which means errors like "Unknown Transform error" can be ignored.  The primary error seen so far is "Athena ran out of memory." ]
    See: https://twiki.cern.ch/twiki/bin/view/Atlas/ADCReproFromESDAug2009
    [Other production generally running very smoothly this past week -- most tasks ("queue fillers") have low error rates. ]
    1)  9/17: upgrade of dCache pool nodes at MWT2_UC to SL5.3.
    2)  9/17: From Xin, s/w patch for SLC4 ==> 5 migration:
    The patch fixes problems encountered by analysis jobs, which run on SL5 platform and involve compilation in the job.
    Other production jobs and SL4 platform sites are fine without it, while having it is harmless as well.
    3)  9/20: Test jobs completed successfully at UCITB_EDGE7.
    4)  9/21: Intermittent transfer errors at MWT2 sites likely due to ongoing testing -- from Charles:
    We've been running some throughput/load tests from UC to IU, which are almost certainly the cause of these transfer failures.
    I'll terminate the test now and the errors ought to clear up.  https://gus.fzk.de/ws/ticket_info.php?ticket=51697
    5)  9/23: UTD-HEP completed hardware maintenance (new RAID controller on fileserver) -- test jobs finished successfully, site set back to 'online'.
    6)  9/23: All jobs were failing at AGLT2 with "Put error: lfc-mkdir failed."  Hiro was able to fix a problem with an ACL -- site set back to 'online'.
    Follow-ups from earlier reports:
    (i)  7/23-7/24 -- Ongoing work by Paul and Xin to debug issues with s/w installation jobs at OU_OSCER_ATLAS.  Significant progress, 
    but still a few remaining issues.
    (ii)  SLC5 upgrades are ongoing at the sites during the month of September.
  • this meeting:
    Yuri's summary from the weekly ADCoS meeting:
    [ ESD reprocessing is essentially done.]
    See: https://twiki.cern.ch/twiki/bin/view/Atlas/ADCReproFromESDAug2009
    1)  9/24: Some files were lost in the MWT2 storage due to a dCache misconfiguration / cleanup operation.  Not a major issue -- jobs should simply fail and get rerun. eLog 5731.
    2)  9/25 ==> Large number of failed jobs in the US cloud from task 78741 -- error was "could not add files to dataset."  Remaining jobs were aborted.  Issue discussed in Savannah 56127, RT 14134.
    3)  9/25: Jobs failed at AGLT2 with the error "Put error: Error in copying the file from job workdir to localSE."  Issue was expired host certs on several machines -- resolved.
    4)  9/26: NET2 - problematic WN atlas-c01.bu.edu taken offline -- all pilots were failing on the machine with the error "Did not find a valid proxy, will now abort:"
    5)  9/29 p.m. - 9/30 a.m.: NET2 sites offline due to a problem with the gatekeeper.  Issue resolved, test jobs finished successfully, sites set back to 'online'. eLog 5831.
    6)  9/30: Power outage at SLAC today -- from Wei:
    SLAC will take a power outage at 9/30 to work on urgently needed maintenance of two transformers that supply power to machine rooms. We will start
    setting things offline from 6pm 9/29 and eventually will shutdown all ATLAS services. The outage is scheduled to complete at 6pm of 9/30.
    7)  9/30: New pilot s/w from Paul, v39c:
    A problem with job recovery was discovered, due to the use of a wrong error code related to LFC registration. When lfc-mkdir encountered an error,
    the wrong error code was set, which led the pilot to believe that the job could be recovered on sites that support job recovery.
    The current job recovery version cannot handle these cases.
    8)  Grid certificate for special user 'sm' updated at BNL & UTA (thanks Nurcan).
    9)  Heads-up: ATLAS User Analysis Test (UAT) scheduled for the second half of October.
    Follow-ups from earlier reports:
    (i)  7/23-7/24 -- Ongoing work by Paul and Xin to debug issues with s/w installation jobs at OU_OSCER_ATLAS.  Significant progress, but still a few remaining issues.
    (ii)  SLC5 upgrades are ongoing at the sites during the month of September.
    • Saul: jobs were seen where Athena ran out of memory, causing nodes to lock up. Kaushik: these were reprocessing jobs - we expected ~2% job failures from memory errors. Frustrating at the site level since host recovery is labor intensive. We could kill jobs with resource limits (a hedged sketch follows below), but that sometimes results in unnecessary job failures. Question: how much swap was available?
    • Kaushik notes that these jobs failed everywhere.
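    • On "killing jobs with limits": a generic sketch of what a batch wrapper could do is below - cap the job's address space so a runaway Athena process fails its allocations instead of driving the node into swap. This is an illustration, not any site's actual mechanism; the 3 GB cap is an arbitrary example value.
      # Illustrative only: run a job with an address-space cap so a runaway
      # process gets a memory error rather than locking up the worker node.
      import resource
      import subprocess

      MEM_LIMIT_BYTES = 3 * 1024**3  # 3 GB cap - arbitrary example value

      def limit_memory():
          # Applied in the child process just before it execs the payload.
          resource.setrlimit(resource.RLIMIT_AS, (MEM_LIMIT_BYTES, MEM_LIMIT_BYTES))

      def run_job(cmd):
          """Run a job command under the memory limit; return its exit code."""
          return subprocess.call(cmd, preexec_fn=limit_memory)

      if __name__ == "__main__":
          # Hypothetical payload; a real wrapper would launch the pilot/job here.
          print("exit code:", run_job(["python", "-c", "print('payload ran')"]))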

Analysis queues (Nurcan)

  • Reference:
  • last meeting:
    • CosmicsAnalysis job using DB access has been successfully tested at FZK using Frontier and at DESY-HH using Squid/Frontier (by Johannes). The job has been put into HammerCloud and is now being tested at DE_PANDA; no submission to US sites yet.
    • TAG selection job has been put into HammerCloud and is now being tested (in DE cloud).
    • We now have 3 new analysis shifters confirmed, still waiting to hear from one person. I'm planning training for them in October.
    • Jim C. contacted us about the status of the large containers for the stress test. Kaushik reported that we have ~500M events produced in total. Only the first bunch has been replicated to Tier-2's, as I had validated them (step09.00000011.jetStream_medcut.recon.AOD.a84/ with 97.69M events and step09.00000011.jetStream_lowcut.recon.AOD.a84/ with 27.49M events). The others are at BNL, waiting to be merged and put into new containers. Depending on the time scale of the stress test, this can be done in a few days, as Kaushik reported.
  • this meeting:

DDM Operations (Hiro)

  • Reference
  • last meeting(s):
    • Site services: make sure the Tier 0 shares are dropped and the blacklist is removed - those functions have been moved to CERN's DQ2.
    • BNL viewer: http://ddmv01.usatlas.bnl.gov:20050/dq2log/
    • Migration of the site services (SS) to BNL will happen only if and after FTS checksum validation works well. Testing now with CERN's FTS 2.2. dCache sites are okay; BestMan sites are under discussion with Wei, Hiro and Simone and require some BestMan development: BestMan does not store the checksum, so it must be calculated every time (~2 seconds for a 500 MB file; see the checksum sketch at the end of this section). Hiro is finding ~33 seconds for FTS to get the data. LBL has been asked to provide a hook for an external package to compute the checksum; a new version is expected within a week. Saul: it might work since the machine is powerful. The client controls the checksum (the lcg-cp client requests it).
    • DQ2 upgrade expected next week.
    • T3 LFCs to move to BNL; will make a plan and contact each site.
  • this meeting:
    • New DQ2 (now available) testing this week at BNL. Do not update SS at the Tier 2's.
    • As soon as new bestman installed SLAC - this week; complete FTS 2.2 afterwards.
    • BNL DQ2 SS host has some problems - investigating. Host rebooting automatically. It is delaying some transfers in the US cloud.
    • Changing way LFC connects to the network via F5 switch to avoid firewall problems. Unscheduled.
    • Did manual cleanup of jobs which failed LFC registrations
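    • Since BestMan does not store checksums, they must be computed on demand; the "hook for an external package" mentioned above would do something like the streaming Adler32 below (this also relates to the Adler32/xrootd item in the VDT Bestman section later). A minimal sketch assuming Python and zlib - not the actual BestMan or xrootd plugin:
      # Minimal sketch: streaming Adler32 checksum of a file, with rough timing.
      import sys, time, zlib

      def adler32_file(path, chunk_size=8 * 1024 * 1024):
          """Return the Adler32 checksum of a file as an 8-digit hex string."""
          checksum = 1  # zlib's Adler32 starting value
          with open(path, "rb") as f:
              for chunk in iter(lambda: f.read(chunk_size), b""):
                  checksum = zlib.adler32(chunk, checksum)
          return f"{checksum & 0xffffffff:08x}"

      if __name__ == "__main__":
          start = time.time()
          print(adler32_file(sys.argv[1]))
          # The notes above quote ~2 seconds for a 500 MB file.
          print(f"took {time.time() - start:.1f} s", file=sys.stderr)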

Conditions data access from Tier 2, Tier 3 (Fred, John DeStefano)

  • last week
    • https://twiki.cern.ch/twiki/bin/view/Atlas/RemoteConditionsDataAccess
    • Fred managed to run jobs using just environment variables to control access to conditions data.
    • Getting ready for full scale test at MWT2.
    • Wei: is there a way to set up two squid servers at a site? Documentation needs to be updated.
    • Upgraded the Frontier servers at BNL to take care of the cache consistency mechanism provided by ATLAS. Any T2 that uses squid needs to update.
    • All sites need to upgrade - John will send an email.
    • Fred will follow-up on placement of files in these areas
  • this week
    • Fred - continuing to work on this topic; not sure it's happening fast enough. Jack Cranshaw is working with Alessandro and Xin to get POOL file catalogs generated and installed correctly.
    • Ongoing discussion about how to set the environment variables to effectively use local conditions files in HOTDISK (see the hedged example at the end of this section).
    • New version of squid - updates needed at all sites. It has an improved mechanism for cache consistency and is more efficient.
    • Instructions have been updated.
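    • As a hedged example of the environment-variable approach being discussed: a job wrapper could point the Frontier client at a local squid and the POOL conditions payload at the HOTDISK copy before launching Athena. The variable names (FRONTIER_SERVER, ATLAS_POOLCOND_PATH), URLs and paths below are placeholders for illustration; follow the updated instructions above for the real setup.
      # Illustrative sketch only: variable names, URLs and paths are placeholders.
      import os
      import subprocess

      env = dict(os.environ)
      # Frontier server plus a local squid proxy (example values).
      env["FRONTIER_SERVER"] = (
          "(serverurl=http://frontier.example.org:8000/frontierATLAS)"
          "(proxyurl=http://squid.example.org:3128)"
      )
      # Locally staged POOL conditions files in HOTDISK (example path).
      env["ATLAS_POOLCOND_PATH"] = "/hotdisk/atlas/conditions/poolcond"

      # Launch the payload with the modified environment (hypothetical command).
      subprocess.call(["echo", "athena would run here"], env=env)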

Data Management & Storage Validation (Kaushik)

Throughput Initiative (Shawn)

  • NetworkMonitoring
  • last week(s):
    • perfSONAR due to be released this Friday (Sep 25, 2009).
      • Goal is to have all USATLAS Tier-2 sites updated by October 1, 2009
      • By October 7 all sites should have configured full mesh-testing for BWCTL and OWAMP testing for all Tier-2 sites and BNL
    • Need to create a mapping for all Tier-3s to provide an associated Tier-2 for testing purposes
    • Hiro will update the automated file transfer testing to allow Tier-2 to Tier-3 transfers (SRM-to-SRM) using the existing load testing framework
  • this week:
    • Meeting Notes USATLAS Throughput Call – September 29, 2009
      Attending:   Sarah,  Joe,  Shawn,  Hiro,  Jason,  Dave,  Horst,  Karthik, Doug
      Excused:  Saul, Neng
      (We had some issues with noisy lines.  ESnet is aware of the issue.  There is a problematic line they are trying to track down.  Only solution for now seems to be redialing till you get a good line.)
      1)      perfSONAR status.   Release v3.1 is out.  Already installed  at OU and AGLT2.    Possible issue with iptables: Shawn will send screen capture to Jason.  MWT2, NET2 and Wisconsin have all confirmed they should be updated this week.  Need to hear from WT2 and SWT2-UTA.
       2)      Update on Tier-3 testing.  The info for the KOI perfSONAR box (total cost is for 2 of them; note there is an additional charge for rails, ~$30):
               Qty  Description                                      Unit Cost   Total
               2    1U Intel Pentium Dual-Core E2200 2.2GHz System   $598.00     $1,196.00
               Breakdown per system:
               1 x ASUS RS100-X5/P12 1U chassis with 180W single power supply, Intel 945GC/ICH7 chipset main board, onboard 2 x Marvell 8056 GbE LAN controller, Intel Graphics Media Accelerator 950, 2 x SATA ports
               1 x Intel BX80557E2200 Pentium DC E2200 2.2GHz 1MB 800MHz processor
               2 x Kingston KVR667D2N5/1G 1GB DDR2-5300 667MHz non-ECC unbuffered
               1 x Seagate ST3160815AS 160GB SATA 16MB 7200RPM hard drive
               1 x ASUS slim DVD-ROM drive
               1 x Labor/shipping
               1 x Three-year parts repair/replacement warranty
               TOTAL: $1,196.00
      3)      No updates.   Still working on “3rd party” transfer capability for use in Tier-2 to Tier-3 testing.  Will need to prestage long-term source files at Tier-2s for this.   Tier-2s will need to set aside ~30GB of space for testing files. 
      4)      Site reports
      a.       BNL – Nothing to report
      b.      AGLT2 – Still low throughput to debug.  Issues with SRM hanging.
      c.       MWT2 – SL5 upgrade underway to fix TCP/Network issues.  
      d.      NET2 – Working on perfSONAR updates.
      e.      SWT2 – perfSONAR installed and running. 
      f.        WT2 – No report
      g.       Wisconsin -  perfSONAR boxes should be upgraded this week.
      5)      AOB  - Some review of perfSONAR milestones.  October 7th to have all USATLAS Tier-2/Tier-1 sites config’ed for mesh-testing.
      a.       Manual load-test to AGLT2 on Wednesday 9:30 AM Eastern
      b.      MWT2 will schedule a manual load-test sometime after their SL5 update
      c.       Analysis stress test coming up.   May have implications for our preparations…
      We plan to meet again next week at the usual time (3 PM Eastern on Tuesday).   Send along any corrections or additions to these notes via email.
    • All sites should update their perfSONAR installations by the end of the week.
    • Next step: ensure all sites are configured for automated mesh tests - Oct 7 deadline (a sketch of the full-mesh pairing follows at the end of this section).
    • ADC operations is planning a big throughput test starting Oct 5, lasting 5 days. Not sure yet whether Tier 2's will be involved.
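    • For reference, "full mesh-testing" means every site pair is measured in both directions. A tiny sketch enumerating the pairs for the sites named in these notes (scheduling details omitted; purely illustrative):
      # Illustrative only: enumerate the BWCTL/OWAMP pairs implied by a full mesh.
      from itertools import permutations

      sites = ["BNL", "AGLT2", "MWT2", "NET2", "SWT2", "WT2"]  # from these notes

      pairs = list(permutations(sites, 2))  # each ordered pair is tested separately
      print(f"{len(pairs)} measurement pairs, e.g.:")
      for src, dst in pairs[:5]:
          print(f"  {src} -> {dst}")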

OSG 1.2 deployment (Rob, Xin)

  • last week:
    • Validation of upgrade to lcg-utils in wn-client, as well as curl. OSG 1.2.3 has this update (relevant for wn-client, client)
    • Tested with UCITB_EDGE7 site. Validation complete DONE
  • this week:

WLCG Capacity Reporting (Karthik)

  • last discussion(s):
    • Note - if you have more than one CE, the availability will take the "OR".
    • Make sure installed capacity is no greater than the pledge.
    • Storage capacity is given to the GIP by one of two information providers (one for dCache, one for Posix-like filesystems) - requires OSG 1.0.4 or later. Note: this is not important for WLCG since it's not passed on. Karthik notes we have two ATLAS sites that are reporting zero. This is a bit tricky.
    • Have not yet seen a draft report.
    • Double check that the accounting name doesn't get erased. There was a bug in OIM - it should be fixed, but check.
  • Reporting comes from two sources: OIM and the GIP at the sites
  • Here is a snapshot of the most recent report for ATLAS sites:
    This is a report of Installed computing and storage capacity at sites.
    For more details about installed capacity and its calculation refer to the installed capacity document at
    * Report date: Tue Sep 29 14:40:07
    * ICC: Calculated installed computing capacity in KSI2K
    * OSC: Calculated online storage capacity in GB
    * UL: Upper Limit; LL: Lower Limit. Note: These values are authoritative and are derived from OIMv2 through MyOSG. That does not
    necessarily mean they are correct values. The T2 co-ordinators are responsible for updating those values in OIM and ensuring they
    are correct.
    * %Diff: % Difference between the calculated values and the UL/LL
           -ve %Diff value: Calculated value < Lower limit
           +ve %Diff value: Calculated value > Upper limit
    ~ Indicates possible issues with numbers for a particular site
    #  | SITE                 | ICC        | LL          | UL          | %Diff      | OSC         | LL      | UL      | %Diff   |
                                                          ATLAS sites
    1  | AGLT2                |      5,150 |       4,677 |       4,677 |          9 |    645,022 | 542,000 | 542,000 |      15 |
    2  | ~ AGLT2_CE_2         |        165 |         136 |         136 |         17 |     10,999 |       0 |       0 |     100 |
    3  | ~ BNL_ATLAS_1        |      6,926 |           0 |           0 |        100 |  4,771,823 |       0 |       0 |     100 |
    4  | ~ BNL_ATLAS_2        |      6,926 |           0 |         500 |         92 |  4,771,823 |       0 |       0 |     100 |
    5  | ~ BU_ATLAS_Tier2     |      1,615 |       1,910 |       1,910 |        -18 |        511 | 400,000 | 400,000 | -78,177 |
    6  | ~ MWT2_IU            |        928 |       3,276 |       3,276 |       -252 |          0 | 179,000 | 179,000 |    -100 |
    7  | ~ MWT2_UC            |          0 |       3,276 |       3,276 |       -100 |          0 | 179,000 | 179,000 |    -100 |
    8  | ~ OU_OCHEP_SWT2      |        611 |         464 |         464 |         24 |     11,128 |  16,000 | 120,000 |     -43 |
    9  | ~ SWT2_CPB           |      1,389 |       1,383 |       1,383 |          0 |      5,953 | 235,000 | 235,000 |  -3,847 |
    10 | ~ UTA_SWT2           |        493 |         493 |         493 |          0 |     13,752 |  15,000 |  15,000 |      -9 |
    11 | ~ WT2                |      1,377 |         820 |       1,202 |         12 |          0 |       0 |       0 |       0 |
  • Karthik will clarify some issues with Brian
  • Will work site-by-site to get the numbers reporting correctly (the %Diff sketch at the end of this section illustrates the comparison being made)
  • What about the storage information in the config.ini file?
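  • For clarity on the %Diff column: it compares the calculated capacity against whichever OIM limit it violates, relative to the calculated value. The sketch below is one plausible reconstruction that reproduces most of the rows above; it is not Karthik's actual report code.
    # Plausible reconstruction of the %Diff column (illustrative, not the
    # actual report generator): percent difference between the calculated
    # capacity and the violated OIM limit, relative to the calculated value.
    import math

    def pct_diff(calculated, lower, upper):
        if calculated == 0:
            return -100 if lower > 0 else 0          # e.g. the MWT2_UC row
        if calculated < lower:
            return math.trunc(100.0 * (calculated - lower) / calculated)  # negative
        if calculated > upper:
            return math.trunc(100.0 * (calculated - upper) / calculated)  # positive
        return 0

    # Rows from the table above: (ICC, LL, UL) -> expected %Diff
    print(pct_diff(5150, 4677, 4677))   # AGLT2: 9
    print(pct_diff(1615, 1910, 1910))   # BU_ATLAS_Tier2: -18
    print(pct_diff(611, 464, 464))      # OU_OCHEP_SWT2: 24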

Site news and issues (all sites)

  • T1:
    • last week(s): delivery of the 120 worker nodes we are waiting for will be delayed by 2 weeks. Pedro has completed the pcache installation, now evaluating. HOTDISK: have distributed the area over 30 Thors (small amounts). Large number of reprocessing and validation jobs. Have observed efficiency issues getting jobs into the running state even though everything was ready - investigating the origin of the latency; increased nqueue. Pilot rate issue. The possibility of letting multiple jobs run from the same pilot will be looked at, but it's not a quick fix. Torre notes it's an opportune time to look at this.
    • this week: all is well, not much change from last week. Completing electrical work in new data center. 3 tape libraries have arrived - being installed - adding more than 24K cartridge slots to existing robot.

  • AGLT2:
    • last week: no update.
    • this week: Trying to get purchase orders out - Dell still not providing quotes for 2 TB drives; OSG security challenge readiness. Getting ready for SL5 conversion. One compute node ran 8 jobs to completion.

  • NET2:
    • last week(s): deploying the HOTDISK space token; proceeding with procurement
    • this week: SL5 migration - have one node deployed and testing; will proceed. Gatekeeper problem last night - not sure what happened. Meeting w/ Dell. Added 130 TB of storage (2 partitions).

  • MWT2:
    • last week(s): SL5.3 upgrade in progress. dCache load testing between IU and UC - some stability issues have gone away. 1 GB/s UC to BNL.
    • this week: Two phases - downtime next week for security patches on the head nodes; will then update compute nodes to SL5.3. LFC site problem fixed. Both sites have perfSONAR updated. Using Puppet as the new configuration management tool. OSG security drill tomorrow.

  • SWT2 (UTA):
    • last week: deployed HOTDISK last night - working fine. focusing on procurement at UTA and CPB sites.
    • this week: UTA cluster will be updated to SL5; CPB will follow. OSG security drill tomorrow. Space getting tight. Had a few nodes go down during reprocessing.

  • SWT2 (OU):
    • last week: waiting on storage purchase - Joel investigating. Upgrade after storage. all is well.
    • this week: OSG security drill today. 100 TB useable storage held up by Langston University's purchasing.

  • WT2:
    • last week(s): Site services were messed up by a yum update; a side effect was that the update removed the FTS checksum code, so a number of transfers missed checksum validation. ITB site testing for the new version of xrootd - do we need a Panda site? The SE uses a lot of functions. R410s arrived: 40-41 servers (195 total, other experiments included). The xrootd client developer will provide a newer version; will do a test. Power outage at the end of the month to address safety concerns.
    • this week: SL5 migration - has queue of 6 machines running already. Will be able to migrate some machines before mid-October. Planning two steps (100-200 systems first; hope to completely switch before UAT); ~350 nodes. Will migrate OSG 1.2 in the next two weeks. New bestman testing Friday or Monday. New testbed for xrootd - finding a number of small bugs.

Tier 3 program report (Doug)

  • last week:
    • still working on interviews
    • Doug feels we'll need t2-t3 'affinities'
    • T3 usability should be a focus in the next phase of integration program
  • this week:
    • Wants to know when next integration phase starts
    • Interviews w/ sites nearly completed.
    • Some sites will need a site mover.
    • How are dataflows monitored in T2's and T3's - are Gratia probes needed?

Carryover issues (any updates?)

OIM issue (Xin)

  • last week:
    • Registration information change for bm-xroot in OIM - Wei will follow-up
    • SRM V2 tag - Brian says nothing to do but watch for the change at the end of the month.
  • this week:

Tier 3 data transfers

  • last week
    • no change
  • this week

Release installation, validation (Xin)

The issue of validating presence, completeness of releases on sites.
  • last meeting
  • this meeting:

HTTP interface to LFC (Charles)

VDT Bestman, Bestman-Xrootd

  • See BestMan page for more instructions & references
  • last week
    • Have discussed adding Adler32 checksum to xrootd. Alex developing something to calculate this on the fly. Expects to release this very soon. Want to supply this to the gridftp server.
    • Need to communicate w/ CERN regarding how this will work with FTS.
  • this week

Local Site Mover

Gratia transfer probes @ Tier 2 sites

Hot topic: SL5 migration

  • last weeks:
    • ADC ops action items, http://indico.cern.ch/getFile.py/access?resId=0&materialId=2&confId=66075
    • Kaushik: we have the green light to do this from ATLAS; however there are some validation jobs still going on and there are some problems to solve. If anyone wants to migrate, go ahead, but not pushing right now. Want to have plenty of time before data comes (means next month or two at the latest). Wait until reprocessing is done - anywhere between 2-7 weeks from now, for both SL5 and OSG 1.2.
    • Consensus: start mid-September for both SL5 and OSG 1.2
    • Shawn: considering rolling part of the AGLT2 infrastructure to SL5 - should they not do this? Probably okay - Michael. Would get some good information. Sites: use this time to sort out migration issues.
    • Milestone: by mid-October all sites should be migrated.
    • What to do about validation? Xin notes that compat libs are needed
    • Consult UpgradeSL5
  • this week


AOB

  • last week
  • this week
    • None

-- RobertGardner - 29 Sep 2009
