r6 - 23 Feb 2011 - 18:23:03 - AlexUndrus

NightliesUploadCVMFS - Upload of ATLAS nightly releases to CVMFS


PROJECT GOAL

Arrange automatic upload of major nightly branches to CernVM-FS (CVMFS) for users at Tier-1 (and possibly other) centers and for site validation tests. A two-way connection between NICOS (ATLAS Nightly Control System) and the CVMFS server machine will ensure that CVMFS deployments and tests are posted on the NICOS web pages.

MILESTONES (under construction)

Predrag (18.02): As for the schedule on our side, we plan to continue testing and deploying replicas at BNL, RAL and CERN/IT hoping that this will converge until end of February or first week of March... As of now, there is no definitive schedule but perhaps I'll be able to update you next week on this... What I can suggest is to wait until we are ready with setting up the replicas and producing related documentation and then schedule similar (iterative) process for documenting server installation - we can give you the initial instructions, you try to follow and provide us the feedback on basis of which we update the documentation and do everything again until everybody involved is happy...
  • deployment of releases for 1 cache branch on CVMFS prototype: ??-??-2011
  • migration of prototype to CERN-IT: ??-??-2011
  • deployment of 1 - 2 major (full) branches: ??-??-2011
  • adjustment of AtlasSetup procedures for nightly releases on CVMFS: ??-??-2011
  • make nightlies available at Tier 1 BNL: ??-??-2011
  • implementation of validation tests at CERN site: ??-??-2011
  • run validation tests at other sites (e.g. BNL): ??-??-2011

ATLAS NIGHTLY SYSTEM FEATURES:

  • Numerous branches of nightly releases are produced daily on the local disks of the ATLAS Nightly farm, at any time of day
  • The contents of the Release and its external dependencies are fully defined in the Tag Collector DB
  • The completion time for most branches is between 03:00 and 12:00
  • Releases are available for distribution as RPMs, pacman kits, and pacballs, all of which reside on AFS
  • Pacman kits can be downloaded via http (pacman also supports gsiftp and ssh)
  • Releases are placed on AFS in two ways:
    • as a simple copy of the release as it appears on a local disk (no externals), and
    • as an installation downloaded to AFS from the pacman kit (with externals)
  • Synchronization is via time stamp files on AFS
  • Nightly System makes records in Release DB of ATLAS Installation System (managed by Alessandro) and submits releases of selected branches for KV validation
  • Nightly System can make records in ATLAS AMI DB (tested, but currently not used)

Alex (14.02): a nightly job sets about ten stamps on AFS when certain processes are finished, e.g. build, tests, kit build, kit download. The stamps are designed for internal use and located all in one area, so they have rather weird names with nightly name and platform encoded
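
The stamp-based synchronization described above could be polled from the CVMFS server side roughly as follows. This is only a minimal sketch: the stamp file naming (`<nightly>_<platform>_<stage>.stamp`), the stage names and the platform string are hypothetical placeholders, since the real stamp names are not given on this page.

```python
import os
import tempfile


def stamp_path(stamp_dir, nightly, platform, stage):
    """Build a stamp path; this naming scheme is a hypothetical stand-in."""
    return os.path.join(stamp_dir, f"{nightly}_{platform}_{stage}.stamp")


def stage_finished_since(stamp_dir, nightly, platform, stage, since_epoch):
    """Return True if the stage's stamp exists and is newer than since_epoch."""
    try:
        mtime = os.path.getmtime(stamp_path(stamp_dir, nightly, platform, stage))
    except OSError:
        # Missing stamp (or unreachable area): treat the stage as not finished.
        return False
    return mtime > since_epoch


# Demo in a scratch directory standing in for the AFS stamp area.
with tempfile.TemporaryDirectory() as area:
    stamp = stamp_path(area, "16.6.X.Y", "example-platform", "kit_build")
    open(stamp, "w").close()
    os.utime(stamp, (1000.0, 1000.0))  # pretend the kit build ended at t=1000
    print(stage_finished_since(area, "16.6.X.Y", "example-platform", "kit_build", 500.0))
    print(stage_finished_since(area, "16.6.X.Y", "example-platform", "kit_download", 500.0))
```

Comparing the stamp's modification time against the time of the last upload, rather than only testing for existence, matters because the same stamp is set again when the release slot is reused a week later.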

NIGHTLY RELEASES TYPES and PROPERTIES:

  • Full releases, leading to 3-digit numbered releases (e.g. 16.3.0)
    • Include all ATLAS software (10 projects)
    • Kit download size is ~ 9 GB (about 4 GB of ATLAS sw and 5 GB of externals such as LCG sw)
    • Kit download is practically self-sufficient (may include gcc compiler)
    • There are seven releases in the loop named rel_0 ... rel_6 (number corresponds to day of week)
    • Two or three branches are needed on CVMFS for Tier-1 (BNL) users and nightlies site validation tests

  • Cache releases, leading to 4-digit numbered releases (e.g. 16.3.0.1)
    • Include one ATLAS software project (usually AtlasProduction) with patches to a full release
    • Kit download size is ~ 1 GB (highly variable as the number of patches varies)
    • Depend on a stable full (3-digit) release
    • There are seven releases in the loop named rel_0 ... rel_6
    • Nightly Cache kits download the candidates of numbered releases, such as 16.3.0.1. The candidates become stable numbered releases when the Release Coordinator decides and Shifters make the necessary copies
    • Three or four branches are needed on CVMFS for Tier-1 (BNL) users and nightlies site validation tests

  • Analysis releases, leading to 5-digit numbered releases (e.g. 16.3.0.1.1)
    • Include one ATLAS Analysis project (any kind of name) with code for physics analysis
    • Kit download size is ~ 0.1 GB
    • Depend on a stable full (3-digit) release AND a stable AtlasProduction cache
    • A variable number of releases in the loop (usually 2)
    • Nightly Analysis kits download the candidates of numbered releases, such as 16.3.0.1.1. The candidates become stable numbered releases when the Release Coordinator decides and Shifters make the necessary copies
    • The possibility that Nightly Analysis releases will be needed on CVMFS is small
    • The Nightly System creates fully functional candidates for stable Analysis releases. The option for the nightly job to put stable Analysis releases on CVMFS should therefore be foreseen
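
The numbering convention above (3 digits for full, 4 for cache, 5 for analysis releases) and the dependency chain it implies can be captured in a few lines. This is an illustrative sketch; the function names are made up for the example and are not part of any ATLAS tool.

```python
RELEASE_TYPES = {3: "full", 4: "cache", 5: "analysis"}


def classify_release(number):
    """Map a dotted release number to the release type it denotes."""
    parts = number.split(".")
    if not all(part.isdigit() for part in parts):
        raise ValueError(f"not a numbered release: {number!r}")
    if len(parts) not in RELEASE_TYPES:
        raise ValueError(f"expected 3 to 5 digits, got {len(parts)}: {number!r}")
    return RELEASE_TYPES[len(parts)]


def base_releases(number):
    """Return the stable releases a release depends on, full release first."""
    classify_release(number)  # validates the format
    parts = number.split(".")
    return [".".join(parts[:n]) for n in range(3, len(parts))]


print(classify_release("16.3.0.1"))
print(base_releases("16.3.0.1.1"))
```

For an analysis release such as 16.3.0.1.1, `base_releases` lists both the full release and the AtlasProduction cache that must already be present.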

DIRECTORY STRUCTURE: full and cache releases differ:

  • Full releases are self-sufficient and can be installed in separate directories
nightlies/
          <nightly branch>/
                           rel_0/
                                 AtlasCore/
                                 AtlasProduction/
                                 external/ 
                                 .........                 
                           rel_1/
                           ......        

  • Cache releases in standard ATLAS installations are installed inside full 3-digit releases that they depend on. Nightly cache releases should be installed standalone (at least initially) to avoid any interference with stable releases.

nightlies/          
      <nightly branch>/
                     rel_0/  
                         <3 digit release, e.g. 15.6.14>/
                                                      AtlasProduction/
                                                                     15.6.14.1/
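
The two layouts sketched above can be expressed as small path helpers. This is a sketch under assumptions: the root directory is passed in by the caller because the final location was still under discussion, and the branch name used in the demo is only illustrative.

```python
import posixpath


def full_nightly_dir(root, branch, rel):
    """Directory of a full nightly: <root>/nightlies/<branch>/rel_N."""
    if rel not in range(7):
        raise ValueError(f"rel must be 0..6, got {rel!r}")
    return posixpath.join(root, "nightlies", branch, f"rel_{rel}")


def cache_nightly_dir(root, branch, rel, cache_release, project="AtlasProduction"):
    """Directory of a standalone cache nightly, nested under its 3-digit base."""
    digits = cache_release.split(".")
    if len(digits) != 4:
        raise ValueError(f"expected a 4-digit cache release: {cache_release!r}")
    base = ".".join(digits[:3])
    return posixpath.join(
        full_nightly_dir(root, branch, rel), base, project, cache_release
    )


# Illustrative values only: the root and the branch name are assumptions.
print(cache_nightly_dir("/cvmfs/atlas.cern.ch", "15.6.X.Y", 0, "15.6.14.1"))
```
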
Emil (14.02): assumes that the patch releases will go into the same location as the base release (like on AFS). This way it is clear - if a release is obsolete - all caches in that release are obsolete as well and easy to remove
Predrag (15.02): One thing to remember is that CernVM-FS namespace is going to change. In order to ensure a unique namespace for CernVM-FS we will have to use mount point /cvmfs/atlas.cern.ch and you should really consider installing your software so that it is ready to run in that directory. This is going to require re-installation of old ATLAS releases as they cannot be easily relocated. While in CernVM we can still create a symlink /opt/atlas -> /cvmfs/atlas.cern.ch so that applications do not break, for installation on Grid/T1 sites creating the symlinks in /opt is not acceptable and this is why we have to consider new namespace and migration of repositories
Alessandro (17.02): prefers to have a nightly base release separated from the production one, plus the nightly caches, replaced every day. Also prefers keeping the nightly name (16.6.X), since it gives additional isolation between different branches
Alessandro (17.02): all the nightlies should be in one area: VO_ATLAS_SW_DIR/nightlies
Doug (17.02): created a detailed description of the directory structure (and hence the nested catalogs) based on all proposals
VO_ATLAS_SW_DIR=/cvmfs/atlas.cern.ch/repo/sw # at CERN-IT
$VO_ATLAS_SW_DIR/software/<version> # for stable ATLAS releases, where version is the release name e.g. 16.0.0
$VO_ATLAS_SW_DIR/database/<DB version> # for DB Releases (DB Releases will be also in ATLAS releases installations)
$VO_ATLAS_SW_DIR/nightlies # for the Nightly releases
Predrag (17.02): I think that you should start right now with /cvmfs/atlas.cern.ch/ name space - everything is ready on our side to support that. To start with, IT server will be only a replica of the primary server (the same server we are using right now) and there will be 2 more replicas available (RAL and BNL). Once we gain more experience with replica setup at CERN, we will gradually move other services to CERN/IT infrastructure
Asoka (17.02): sw-mgr will install in $VO_ATLAS_SW_DIR/database/DBRelease/; current installations on the grid use $VO_ATLAS_SW_DIR/software; for this reason, my argument is to go with the path of least resistance unless we also want the grid backend developers (pilot, Ganga developers) to make changes
Alex (17.02): prefers "releases" over "software" because it resembles what we have now on AFS. Suggests that there should be separate areas for base nightlies and cache nightlies:
cache_nightlies/
    <branch>/
        rel_[0-6]/
            <3 digit nightly name>/
                AtlasProduction/  #### cleaned every Sunday (Monday...) ####
                    <3 digit nightly name>.[012345...]

        <3 digit nightly name>
            AtlasProduction/
                <3 digit nightly name>.[012345...]
                ### AtlasProduction cache created by the latest nightly replaces existing cache on CVMFS ###


The full nightlies should be downloaded separately from cache nightlies. They can reside in nightlies/<nightly branch name> under rel_0, rel_1, ..., rel_6. So we have something like

nightlies/
      16.X.0/
              rel_0,1,2,3,4,5,6  

NIGHTLY RELEASES LIFETIME ON CVMFS:

With 7 releases in the loop, a nightly release is rewritten once a week. For example, the rel_0 release (of all branches) needs to be rewritten or updated on Sundays
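
The weekly rotation can be illustrated with a small helper that maps a calendar date to its release slot. It is a sketch that assumes the slot is chosen from the date on which the nightly is started, with Sunday as rel_0 as in the example above; the function name is made up for the illustration.

```python
import datetime


def rel_for_date(day):
    """Return the release slot for a date: Sunday -> rel_0 ... Saturday -> rel_6."""
    # date.weekday() counts Monday as 0, so shift by one and wrap around.
    return f"rel_{(day.weekday() + 1) % 7}"


def next_rewrite(day):
    """Date on which the slot used on `day` is overwritten again."""
    return day + datetime.timedelta(days=7)


sunday = datetime.date(2011, 2, 20)
print(rel_for_date(sunday), next_rewrite(sunday))
```
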

INTERACTION WITH CVMFS:

  • Ideally the Nightly Administrator should be able to control which nightlies are uploaded to CVMFS
  • Commands (or signals) to start, restart, and stop uploading are needed
  • The Nightly System needs to know when an upload is completed
  • The Nightly System should have access to stable and nightly releases on CVMFS for post-install checks or site validation tests
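
One possible shape for these requirements is a small per-branch state machine. The sketch below is purely illustrative: the state names, the command names and the `UploadControl` class are assumptions made for the example and do not describe an existing NICOS or CVMFS interface.

```python
# Allowed (state, command) -> new state transitions; all names are hypothetical.
TRANSITIONS = {
    ("idle", "start"): "running",
    ("completed", "start"): "running",
    ("running", "stop"): "stopped",
    ("running", "restart"): "running",
    ("stopped", "restart"): "running",
    ("running", "complete"): "completed",
}


class UploadControl:
    """Per-branch upload state, driven by administrator commands."""

    def __init__(self, enabled_branches):
        # Only branches enabled by the Nightly Administrator can be uploaded.
        self._state = {branch: "idle" for branch in enabled_branches}

    def send(self, branch, command):
        """Apply a command to a branch and return the resulting state."""
        if branch not in self._state:
            raise KeyError(f"branch not enabled for upload: {branch}")
        key = (self._state[branch], command)
        if key not in TRANSITIONS:
            raise ValueError(f"cannot {command} while {self._state[branch]}")
        self._state[branch] = TRANSITIONS[key]
        return self._state[branch]

    def is_completed(self, branch):
        """Let the Nightly System ask whether the upload has finished."""
        return self._state[branch] == "completed"


control = UploadControl(["16.6.X.Y"])
control.send("16.6.X.Y", "start")
control.send("16.6.X.Y", "complete")
print(control.is_completed("16.6.X.Y"))
```

Keeping the allowed transitions in one table makes it easy to see which commands are valid in which state, and to reject, for example, a stop request for an upload that never started.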

ISSUES FOR DISCUSSION:

  • Should the Nightly System upload to a separate CVMFS manager/repository?

Doug (13.02): ATLAS should fight the temptation to have many different CVMFS server machines. If we can have one machine that handles the entire load then perfect. I suspect that we will likely have to have a couple of machines due to the time it takes to create the cvmfs repositories
Predrag (15.02): In case of nightly builds, we have to see if they will be served by separate server (and be given different namespace like /cvmfs/atlas-nightlies.cern.ch) or we will serve them as the sub-catalogs attached to /cvmfs/atlas.cern.ch (the choice depends on who is going to run and be responsible for nightly build servers)
Asoka (17.02): current cvmfs model: one nested repository for the base release, with production caches as well as the 5-digit analysis caches installed in that base release dir. With the 64-bit kits coming along, the plan is to also install these platforms in the same base release dir. It works OK. One thing to note is that even if an installation goes badly and corrupts the base release, it is not yet published and so the world will not see it while we fix it, either by a complete reinstallation (~ 4 h) or from a backup

  • Is it rational to operate the CVMFS server on one of our VOATLAS machines?

Doug (13.02): Ultimately, the entire system will be moved into CERN-IT and they will be responsible for it

  • Is it possible to allow Nightly Administrators to log in to the server machine?

Doug (13.02): Yes. Will have to modify the VO card and add the information in Savannah
Alex (14.02): At the prototype phase Nightly Administrators may need to check the status of jobs on the server, and perhaps kill and restart them if needed and possible. Particularly if there is a direct download from the pacman kit, the Nightly Administrator can analyze problems. Once everything is settled, Nightly Administrators may need to log in only if the scripts they provide malfunction

  • Are nightly releases supposed to be mirrored at CVMFS mirrors?

Doug (13.02): All items in the cernvmfs system will be served to the mirrors. The mirrors are copies of the entire system including all experiments and VOs

  • In the long run nightly job should be able to upload stable releases created out of nightlies (an attractive option for Analysis caches)

Alex (14.02): Particularly in the case of cache nightlies, the nightly system produces a candidate for a stable release every day. E.g. the 16.6.0.Y branch produces 16.6.0.1 AtlasProduction releases every day. But they are stored in a temporary area, obviously not among official ATLAS releases. Then suppose the ATLAS release coordinator decides that 16.6.0.1 is good today. The Shifter then manually switches the Nightly System to use 16.6.0.2 in the next nightlies and copies the 16.6.0.1 candidate into the official areas. This can be done completely automatically, that is, opening a new release in the System can trigger all necessary copies and downloads (but there are some issues). With respect to cvmfs, this automation means that the approved candidate release is automatically uploaded to the stable release tree (meaning that the nightly job modifies not only the nightly releases area, but also the stable releases area). The cache releases are added to the existing full release: suppose we have 16.6.0/AtlasProduction/16.6.0.1. The next cache 16.6.0.2 will be added alongside the existing 16.6.0.1. This is, however, a long-term plan. It is not supposed to be tested in the CVMFS prototype

  • Tool for uploading at the server machine. A nightly job creates a pacman kit (essentially a collection of tar files). This is the "official" representation of an ATLAS release. All ATLAS releases are supposed to be downloaded from kits and, if they are used for something serious, validated. An ATLAS release can instead be downloaded elsewhere and then fetched by rsync, but in this case the release must be modified (as there are hardcoded full path references inside).

Grigory (15.02): the easiest way to install the distribution kit is from the SAME location as it will be seen on CVMFS from outside - currently below /opt/atlas, otherwise it will need to be relocated to such path. So it may be best done from the CVMFS server itself
Asoka (16.02): Alessandro's script sw-mgr can also be used on a non-grid environment (desktop or the cvmfs server). The Athena kits are pulled in from pacman mirrors for installation - and it can be specified on the command line which mirrors to use. Possibly has plugin for Firefox?
Asoka (16.02): The sw-mgr will be responsible for $VO_ATLAS_SW_DIR, and anything which T1/T2 (or T3 grid) sites "see" will be that mount point. So having something outside, like /cvmfs/atlas.cern.ch/nightly could mean pilot developers needing to define / use another env variable that points to nightlies that live outside VO_ATLAS_SW_DIR. I am not sure at what stage pilots now work with $VO_ATLAS_SW_DIR/nightlies
Asoka (16.02): If we have nightlies that can be installed with sw-mgr , then the possibilities exist for it to be also very easily installed on grid/non-grid sites which do not want to use cvmfs
Asoka (16.02): current cvmfs uses manageTier3SW which is run on the cvmfs server and uses pacman to pull in the Athena kits from a list of mirrors. sw-mgr will do essentially the same thing on the new cvmfs server
Alessandro (17.02): suggested that we need to put in place a deployment mechanism, similar to what I have for KV in Rome, in the CVMFS server. However, this should only be a cron, no big deal. I already have the scripts to do that, including the way to get the trigger from the timestamps and from the ReleaseDB too. Agrees to provide support for AUTOMATIC nightlies installation [with sw-mgr]. Examples of sw-mgr use provided for Alex's tests

TASK FOR PROTOTYPE:

  • Upload one cache nightly daily, ~1 GB each (for the branch selected by David the actual size is 0.43 GB)
  • Keep 7 cache nightly releases, total size < 7 GB
  • Run ATN (nightly test framework) tests for downloaded cache
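
The storage estimate for the prototype follows directly from the loop length. The short calculation below is a sketch that assumes the space used on CVMFS equals the kit size, which this page does not state.

```python
RELEASES_IN_LOOP = 7  # rel_0 ... rel_6


def loop_footprint_gb(kit_size_gb, releases=RELEASES_IN_LOOP):
    """Total size of one branch when every slot in the loop is filled."""
    return kit_size_gb * releases


# Nominal ~1 GB per cache nightly versus the 0.43 GB of the selected branch.
print(f"nominal: {loop_footprint_gb(1.0):.2f} GB")
print(f"selected branch: {loop_footprint_gb(0.43):.2f} GB")
```

With the measured 0.43 GB per nightly, seven slots come to about 3 GB, comfortably inside the 7 GB budget.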

David Rousseau (11.02): start with the 16.6.X.Y(VAL) AtlasProduction cache nightlies, then add the dev and devval full release nightlies


Major updates:
-- TWikiAdminGroup - 16 Oct 2018


