
Panda Data Movement


Introduction

At the start of this project (Fall 2008), Panda already had fully automated and configurable data movement capabilities based on the ATLAS suite of software (DQ2). However, it is impractical to ask other Virtual Organizations in OSG to adopt these project-specific and complex tools for their own needs. It follows that Panda needs to be augmented with a lightweight and generic means of automating data movement, based on the job definition created by the user.

Panda Approach to data movement

General information on DQ2 can be found here.

The Panda pilot puts and registers output files to an ATLAS-specific area in the local SE/LFC, e.g.,

srm://uct2-dc1.uchicago.edu/pnfs/uchicago.edu/atlasproddisk/... /grid/atlas/dq2/...

Proposed integration of generic data movement into Panda

Requirements and constraints

  • we provide VOs with the option of using Panda for data management and data-driven workflow, following closely how it is done for ATLAS production
  • if the VO chooses this option, they must buy into organizing their data/jobs into datasets/jobsets
  • we do not use DQ2; we use Panda's internal dataset/file system. This means we need some tools that do not yet exist, and we absolutely need to decouple from ATLAS Panda (to protect/isolate the ATLAS dataset/file tables).

    Q: what tools do you mean? Is this extra code for the Panda server? (or just the dq2-esque tool)
    A: Just the dq2-esque tools, pdm-* or whatever

    Q: decoupling -- The Atlas Panda instance support is being handed over to CERN, as you indicated. Who will run the OSG Panda? If it is I who will have to care for the new setup, it's fine and we'll have to account for that (as is the case otherwise).
    A: ACF will provide servers+DB for OSG Panda as they do now for ATLAS+OSG. Michael is aware of this and is OK with it. Keeping the Panda server/monitor for OSG running will be for the BNL OSGers; it doesn't take much. This we will have to review once we have migrated ATLAS to CERN/Oracle.
    We'll see then how the terrain looks, and whether we deploy this via CERN or ask BNL to continue to support it (with an Oracle back end; we don't want to maintain both). But for now, dev/test at BNL with a MySQL backend.

    Q: should we prepare to host this new instance? Is RACF involved in this at all?
    A: For dev/test of OSG Panda data management you could either use the test setup you already have for scaling studies (easiest, therefore best option?) or create a new one if necessary

    Q: more on decoupling -- do we still stick with a single code base for the server?
    A: yes

    Q: we should protect the ATLAS tables in the database. Does that mean we need new tables for OSG to register datasets and files?
    A: a distinct Panda instance (DBs, server, monitor). This already exists in the test setup, which is enough for now, until this needs to go into production.

    Q: who is responsible for writing the code to register entries in the Panda DB? My guess is that such code already exists (given that the tables exist and have many entries), and we only need to decouple it from the ATLAS/DQ2 software; is that correct?

Suggested schema

  • it is a user/VO responsibility to store input data in SEs (the 'home SE' below)
  • it is a user/VO responsibility to organize their data in terms of datasets: job sets with corresponding datasets for input/output that will be the basis for Panda managing the jobs/files in blocks

  • we provide a file/dataset registration tool (analogous to dq2_put); a rough sketch follows after this list
    • registers dataset in Panda dataset catalog
    • registers files associated with dataset. File specified by lfn, pfn
      • generates guid
      • registers file in BNL OSG LFC
      • registers file in Panda file table (defines dataset association)

  • user/VO has a declared 'home SE' that plays the T1 role in dispatch/destination blocks
  • dispatch and destination blocks work the same way as for ATLAS production jobs, except
    • destination subscriptions also use Panda Mover
    • 'T1' is the VO's home SE
  • job submission uses standard client. Only difference from CHARMM is that input, output datasets and files are specified.
  • the pilot works in an ATLAS-like way. Inputs have been prestaged to the local SE; the pilot gets the input file list as part of the job spec and stages in from the local SE as in ATLAS, using the data mover appropriate to the site (from schedconfig). The pilot registers outputs in the associated SE (either local or remote, with the appropriate data mover) and in the LRC, and the Panda server handles the subscription to the T1
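
As a very rough illustration of the registration flow sketched in the list above, the tool could look something like the following. Every name here (pdm_register and the helper stubs) is a hypothetical placeholder; the real tool would call the LFC client (or the HTTP LFC interface discussed below) and the Panda server, not these stand-in stubs.

# Hypothetical sketch of the registration flow; all names are placeholders.
# The real tool would talk to the BNL OSG LFC and to the Panda dataset/file
# tables (via the Panda server), not to these stubs.
import uuid

def register_dataset_in_panda(dataset, vo):
    print("would register dataset %s (VO %s) in the Panda dataset catalog" % (dataset, vo))

def register_in_lfc(guid, lfn, pfn, vo):
    print("would register %s (guid %s, pfn %s) in the BNL OSG LFC" % (lfn, guid, pfn))

def register_file_in_panda(dataset, lfn, guid):
    print("would add %s (guid %s) to the Panda file table, associated with %s" % (lfn, guid, dataset))

def pdm_register(dataset, files, vo='osg'):
    """files: list of (lfn, pfn) pairs supplied by the user/VO."""
    register_dataset_in_panda(dataset, vo)
    for lfn, pfn in files:
        guid = str(uuid.uuid4())            # generate a GUID for the file
        register_in_lfc(guid, lfn, pfn, vo)
        register_file_in_panda(dataset, lfn, guid)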

Q: is LFC part of the OSG Client installation? Can we expect OSG members will have the LFC tools installed?
A: yes. Also, US ATLAS is working on an HTTP LFC client, a lighter dependency.

Implications

  • must operate from a distinct Panda server/DB instance. Dataset/file management in Panda is heavy enough. Cannot burden ATLAS Panda with OSG dataset/file management.

Implementation

  • file/dataset registration tool: (eg. pdm_register, pdm = Panda data management) and ancillaries (pdm_ls, pdm_get). [ Maxim/Jose with Tadashi/Torre guidance on Panda dataset, file tables. Hiro can help with LFC. ]

    Q: Do we anticipate more than one way of data management in Panda, going forward?
    A: OSG Panda data management will always be different from ATLAS; ATLAS will always use DQ2 (or some successor) and I expect we'll never want to impose ATLAS DDM on OSG

    Q: Are ALL datasets listed as 'dataset' in filesTable4 included in the Dataset table?
    A: Panda-internal datasets (_sub and _dis) are all included. Concerning top-level datasets, they are included if they were used as output datasets. DBRelease are always used as input datasets, and thus are not included.

  • dispatch/destination subscriptions:
    • attribute in VO info table indicating if Panda data management is requested by the VO [ Maxim ]
    • Panda server adaptations: 'home SE' (another VO info table attribute), _sub based on Panda Mover, proddblock and destinationdblock datasets are Panda datasets and not DQ2 [ Tadashi ]
  • pilot adaptation: pilot3's data movers without its ATLAS athena job specificity [ Jose ]

Other comments

Panda datasets are different from ATLAS datasets, or at least they can be. The Panda dataset table that you're now familiar with records both datasets known to the ATLAS DDM system (DQ2) and datasets that Panda creates for itself (for the DataMover). So by using Panda datasets (only) we can support datasets for OSG without bringing in the ATLAS DDM system. There is a file table too; you'll see it in the same DB as the datasets table (filesTable4 is the version in use). There is info on it on the PandaDB page of the Panda wiki: https://twiki.cern.ch/twiki/bin/view/Atlas/PandaDB. The files table records the file content of the dataset. So again, by using (only) our own file table to record the content of our own Panda datasets, we avoid ATLAS DDM.

We need the functionality of pilot3's data movers (yes, these are *SiteMover.py) -- as well as other nice functionality like heartbeat reporting and interrupt handling -- but minus the ATLAS specificity that is built into pilot3. We need a 'thin' pilot3 that strips ATLAS specifics with minimal duplication of code. The data movers you need should largely already be there, now that Paul has implemented LFC-based movers for the OSG environment. E.g., a principal one for OSG outside ATLAS is going to be the dCache one; all US CMS T1/T2s use dCache.

In all Panda usages now, when input datasets are declared to Panda as part of a job def, Panda assumes they are DQ2 datasets: datasets registered in DQ2, file content of the datasets recorded in DQ2, files in LFC according to DQ2 dataset locations. We need to break this assumption for OSG. Datasets will be Panda datasets (only), their file content is as recorded in the Panda file table, and the LFC file registrations are there because we (not DQ2) put them there. So we need to provide OSG users a way to set this up:

  • user tool(s) to define their datasets (to the Panda dataset table),
  • define the file content of datasets (to the Panda file table),
  • define file locations in SEs (to LFC).
I defer to Tadashi as to how this is best implemented, except to say that it is better not to have direct dependencies on the back end in the user tool, but to do it the 'Panda way' via a web service intermediary, as the Panda server does within the Panda workflow. Having the client tool send these requests to the Panda server to be serviced would be a natural way to do it.
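
For illustration, the 'web service intermediary' pattern could be as simple as the client tool POSTing its registration request to the Panda server over HTTPS using the user's grid proxy. The host, port, request name and parameters below are assumptions for the sketch, not an existing Panda server interface.

# Illustrative only: a pdm_* client handing a registration request to the
# Panda server instead of writing to the DB directly. The request name
# 'addDataset' and its parameters are assumptions, not existing server methods.
import os, httplib, urllib

def call_panda_server(host, request, params):
    proxy = os.environ.get('X509_USER_PROXY', '/tmp/x509up_u%d' % os.getuid())
    conn = httplib.HTTPSConnection(host, 25443, key_file=proxy, cert_file=proxy)
    conn.request('POST', '/server/panda/%s' % request, urllib.urlencode(params),
                 {'Content-Type': 'application/x-www-form-urlencoded'})
    resp = conn.getresponse()
    return resp.status, resp.read()

# e.g. (hypothetical request and parameters):
# call_panda_server('pandaserver.example.org', 'addDataset',
#                   {'dataset': 'osg.charmm.test01', 'vo': 'osg'})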

Integration with Panda

Comments from Tadashi as answers to my questions

To couple the scripts with Panda:

Panda can run arbitrary scripts. e.g., http://atlas-sw.cern.ch/cgi-bin/viewcvs-atlas.cgi/offline/Production/panda/test/testScript.py?revision=1.1&view=markup

This job gets test.sh from https://tmaeno.web.cern.ch/tmaeno/test.sh and runs

$ python test.sh 

on WN, i.e.,

$ wget job.transformation
$ python basename(job.transformation) job.jobParameters

Another example is http://atlas-sw.cern.ch/cgi-bin/viewcvs-atlas.cgi/offline/Production/panda/test/installSW.py?revision=1.7&view=markup

which runs http://www.usatlas.bnl.gov/svn/panda/apps/sw/installAtlasSW

to install Atlas SW.

For PDM jobs, you may need to set

job.transformation    = ???
job.prodSourceLabel   = 'ddm'
job.computingSite     = ???
job.sourceSite        = ???
job.destinationSite   = ???
job.transferType      = 'dis'
job.jobParameters     = ???

where computingSite is the siteid where the movers are running, and sourceSite and destinationSite are the source and destination siteids, respectively. For example, for US ATLAS, if a mover transfers files from CERN to AGLT2:

computingSite   = 'BNL_ATLAS_DDM'
sourceSite      = 'CERN_MCDISK'
destinationSite = 'AGLT2_PRODDISK'

Only BNL_ATLAS_DDM needs to be defined in schedconfig.
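
Putting these pieces together, a PDM mover job could be defined and submitted roughly as below. This is only a sketch modeled on the testScript.py example linked above; the module names follow that example, the '???' values are still to be decided, and the job name is a made-up placeholder.

# Sketch modeled on the testScript.py example above; the '???' fields
# are still to be decided and the job name is a made-up placeholder.
import time
import userinterface.Client as Client
from taskbuffer.JobSpec import JobSpec

job = JobSpec()
job.jobDefinitionID = int(time.time()) % 10000
job.jobName         = 'pdm.mover.%s' % job.jobDefinitionID
job.transformation  = '???'   # URL of the mover trf (see 'Transformation script' below)
job.prodSourceLabel = 'ddm'
job.computingSite   = '???'   # siteid where the movers run (cf. BNL_ATLAS_DDM)
job.sourceSite      = '???'
job.destinationSite = '???'
job.transferType    = 'dis'
job.jobParameters   = '???'

status, output = Client.submitJobs([job])   # submit to the (OSG) Panda server
print("submitJobs returned: %s %s" % (status, output))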

About stage-out:

Basically the pilot is designed to access local SE only. AFAIK, we don't plan to extend the pilot for 3rd party transfer.

The pilot puts and registers output files to an ATLAS-specific area in the local SE/LFC, e.g.,

srm://uct2-dc1.uchicago.edu/pnfs/uchicago.edu/atlasproddisk/... /grid/atlas/dq2/...

which are reserved for ATLAS. If you run OSG jobs, the pilot needs to put/register files to another area, like /grid/osg/charmm/... I think this is what Torre was talking about.

Panda has a dataset table but PandaMover itself doesn't use it. It works with a list of GUID/LFN instead of datasets.

What the Movers do is access the local SE (using lcgcp, xcp, ...) and LFC. I don't know which protocol you use for OSG jobs (gsiftp?). You may need an OSGSiteMover or something if some special treatment is required for OSG jobs.
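
If special treatment for OSG does turn out to be needed, the natural place is a new mover alongside the existing *SiteMover.py classes. The skeleton below is purely hypothetical (the actual pilot3 mover interface and method signatures should be checked first); it only illustrates where OSG-specific stage-in/stage-out commands would live.

# Purely hypothetical skeleton; check the real pilot3 *SiteMover.py interface
# before writing anything like this. It only shows where an OSG-specific
# copy command (srmcp, globus-url-copy, dccp, cp, ...) would be plugged in.
import commands

class OSGSiteMover:
    copyCommand = 'srmcp'   # per-site choice, to come from schedconfig

    def get_data(self, pfn, destination):
        # stage an input file in from the local SE to the worker node
        return commands.getstatusoutput('%s %s file://%s' % (self.copyCommand, pfn, destination))

    def put_data(self, source, pfn):
        # stage an output file out from the worker node to the local SE
        return commands.getstatusoutput('%s file://%s %s' % (self.copyCommand, source, pfn))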

Underlying utilities

LFC

Useful links:

To install OSG client with the LFC client:

$ pacman -get OSG:client 
$ pacman -get VDT:LFC-Client 

Charles Waldman is developing an HTTP interface to LFC, which will eliminate the need for the LFC client. Comments from him:

There is permission from ATLAS to continue developing this. The demo currently only allows LFC lookups, either by LFC pathname (as in the interactive demo) or by GUID.
Before this can be used for real we need to add methods for registering files in LFC, which will require a valid grid proxy.
One concern raised when this was presented at the ADC meeting is that the LFC is fairly complex (ACLs, etc.).

Q: is the intention to duplicate the full functionality of the LFC?
A: no, only to implement the minimum subset that is needed for ATLAS/Panda production and analysis jobs. However, some help may be needed in identifying this subset. The old LRC code can be examined as a starting guess, but it would be nice to have a little bit of guidance here.

Q: what is the minimum functionality needed to be implemented here before this would be useful to USATLAS?

Q: Given that the LFC client is already in the OSG worker node client, along with Python bindings, what extra purpose is served? Is it just to have a REST-based web service rather than use the library?
A: That's partly it. The LFC clients and Python bindings are pretty non-portable -- part of the bindings is a SWIG-generated ".so" library, and this implementation is brittle. Performance is not good, and programs I've written that use the LFC bindings frequently result in core dumps (which is not easy to do in Python). In addition, these bindings are pretty awkward to use: e.g., instead of returning a list, they return something like an _lfc_listptr_p object, which you have to use special methods to iterate through, then free the memory when you are done. It's a bit of a pain. Things would be simpler if we just had to make http(s) requests; then we could use standard components (e.g. curl, httplib, etc.).

The other part of the motivation is that we'd like to EXTEND the LFC by adding extra database columns. e.g. storing PNFSIDs in the LFC, or storing pcache data, so jobs can be sent to nodes where files have been pre-staged.

Q: Will this be a browser-based file registration web application?
A: No, that's not the idea at all. Clients would use curl to register files programmatically, using the same proxy as the job. This would not be done via a web browser.
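
As a purely illustrative example of such a programmatic registration call (the endpoint and parameter names below are invented, since the registration methods do not exist yet), a client could do something like:

$ curl --cert $X509_USER_PROXY --key $X509_USER_PROXY --capath /etc/grid-security/certificates \
    "https://lfc-http.example.org/lfc/register?lfn=/grid/osg/charmm/test01.file1.dat&guid=some-guid&pfn=srm://se.example.org/osg/charmm/test01.file1.dat"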

Code can be found here. A working instance is here.

The implementation

Work in progress: Prototype code in SVN

Register file

Script pdm_register_file.py, based on dq2_cr. Source code can be found here: http://www.usatlas.bnl.gov/svn/panda/osgddm/pdm_register_file.py

This script registers a file in LFC. The directory corresponding to the dataset is created if it does not already exist. The script also checks whether the file already exists in the catalog.

Options are:

        -h | --help             Print this message

        -V | --verbose          Verbosity level

        -p | --pfn              Physical filename

        -l | --lfn              Logical filename

        -g | --guid             Grid Unique ID

        -f | --filename         The name of the file

        -d | --dataset          Dataset name

        -s | --size             File size

        -t | --type_chksum      Type of checksum:
                                      MD for md5
                                      AD for adler32

        -c | --checksum         Checksum value

        -n | --ntrial           Number of trials for the register operations
                                (default=1)

        -y | --delay            Number of seconds between trials

        -o | --timeout          Number of seconds before timeout

        -F | --force            Force the creation of a directory in LFC catalog

        -v | --vo               VO name (default=osg)
To register several files at a time, a comma-separated list of values has to be provided for certain options: PFNs, LFNs, GUIDs, filenames, sizes, and checksum types and values. The remaining options must have a single value. Registering files belonging to different datasets is not allowed, nor is registering files belonging to different VOs.

Registering checksum and file size can be done only when the LFC python API is installed on the machine and properly imported. An attempt to register them when the API cannot be imported will raise an exception.
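
A hypothetical invocation (all values below are made up for illustration) might look like:

$ python pdm_register_file.py --vo osg --dataset osg.charmm.test01 \
      --lfn test01.file1.dat --pfn srm://se.example.org/osg/charmm/test01.file1.dat \
      --size 1048576 --type_chksum AD --checksum 12345678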

Work in progress

Delete file

Script pdm_delete_file.py

This script removes entries from LFC. It also checks whether the file exists in the catalog before removing it.

Options are:

        -h | --help             Print this message

        -V | --verbose          Verbosity level

        -p | --pfn              Physical filename

        -l | --lfn              Logical filename

        -g | --guid             Grid Unique ID

        -f | --filename         The name of the file

        -d | --dataset          Dataset name

        -n | --ntrial           Number of trials for the register operations
                                (default=1)

        -y | --delay            Number of seconds between trials

        -o | --timeout          Number of seconds before timeout

        -v | --vo               VO name (default=osg)
To remove several entries at a time, a comma-separated list of values has to be provided for certain options: PFNs, LFNs, GUIDs, and filenames. The remaining options must have a single value. Removing files belonging to different datasets is not allowed, nor is removing files belonging to different VOs.
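
Again, a hypothetical invocation (values made up for illustration):

$ python pdm_delete_file.py --vo osg --dataset osg.charmm.test01 --lfn test01.file1.dat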

Work in progress !!!

Error codes

A list of error codes has been defined for the various failure modes. In case of failure, an exception is raised, carrying the corresponding error code.

The source code can be found here: http://www.usatlas.bnl.gov/svn/panda/osgddm/pdmUtils.py

Errors are listed in the following table:

Error Type    Code   Message
Parameters    100    invalid option
              101    the verbose level is not a number
              102    verbose level not valid
              103    the number of trials is not a number
              104    number of trials not valid
              105    the number of seconds of delay is not a number
              106    number of seconds of delay not valid
              107    the number of seconds before timeout is not a number
              108    number of seconds before timeout not valid
              109    the size is not a number
              110    the size value is not correct
              111    value of type_chksum not valid
              112    the checksum value is not a number
              113    no VO specified
              114    not all mandatory parameters have been specified
              115    the number of LFNs is different from the number of PFNs
              116    the number of filenames is different from the number of PFNs
              117    neither LFNs nor filenames have been specified
              118    only one dataset is allowed
              119    the number of GUIDs is different from the number of PFNs
              120    the number of sizes is different from the number of PFNs
              121    the number of type_checksums is more than one but different from the number of PFNs
              122    the number of checksum values is more than one but different from the number of PFNs
              123    only one of type_checksums and checksum values has been specified
Environment   200    no valid grid proxy; do grid-proxy-init
              201    no LFC python API
LFC catalog   300    LFC directory does not exist
              301    file not registered successfully
              302    size and/or checksum not registered successfully
              303    could not create LFC directory: %s
              304    file already exists in the catalog
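
The exception-with-error-code pattern might look roughly like the sketch below; the actual class and attribute names in pdmUtils.py may well differ.

# Sketch only; the real implementation lives in pdmUtils.py and may differ.
class PdmException(Exception):
    def __init__(self, code, message):
        Exception.__init__(self, '%d: %s' % (code, message))
        self.code = code
        self.msg  = message

# example use in the registration script:
#   raise PdmException(300, 'LFC directory does not exist')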

Work in progress

Transformation script

Something like http://www.usatlas.bnl.gov/svn/panda/mover/trf/run_dq2_cr

Q: is the idea of the design to submit a second job (after the analysis one), invoking such a transformation script, to perform the data movement? If not, why do we need a transformation script?
A: the trf script is the driver for the distinct Panda job that does data movement (PandaMover). PandaMover jobs are triggered by the Panda server for data movement.

Q: I see this means the data are first stored in the local SE and then moved to the final destination. Is that right? I think pilot3 has several *SiteMover.py scripts to do this local stage-out, if I understood Tadashi's explanation correctly. Should we use them, or a DQ2/ATLAS-free version, for such local staging?
A: There is a 'home SE'. For US ATLAS, this is BNL. Data is staged to sites from the home SE by Panda using PandaMover. Once on a site SE, jobs using that data at the site are released. These jobs pull the data from the local SE to the WN (and move outputs to the local SE, or to the home SE if that is the output target). The SiteMovers in pilot3 handle the local movement from/to the local SE. We should use them for local staging. Which one to use depends on the site and is part of the schedconfig site config.

Q: In that transformation script there is no call to these SiteMover scripts, only dq2- command invocations. But we don't want DQ2 for OSG. Was the example just to tell me that we need some kind of transformation script, or that we need a trf doing the same operations this one does?
A: PandaMover jobs only run at BNL. They are part of the 'Panda infrastructure', not something 'deployed to OSG'.

Q: If we are going to use an OSG version of the SiteMover scripts, do we need all of them, or can I assume OSG only makes use of SRM?
A: We don't need an 'OSG version'; we only need some version that works at a site, based on its storage and config. We probably have all we need right now: srm, gsiftp, cp, dcache, ...

Q: Should we register the local copy in LFC and in the Panda DB? Probably yes, to allow the movement job to find the files to move...
A: yes

I/F in the Panda server to register datasets

Perhaps mover jobs will be instantiated outside Panda first, and then submitted to the Panda server. For ATLAS, the Panda server itself instantiates movers.

Mover job submitter

They are not a big deal.

