DIAL release 1.30: File management system

DIAL provides a command-line interface for file management, fms, and a C++ class, FileManagementSystem. Both allow users to put files into a storage element and later access them or their replicas. This page describes the command-line interface and the environment variables that determine its behavior.


URLs

All file references are URLs of the form protocol://site/path or protocol:identifier. Supported protocols are listed in the following table.
protocol  meaning            example
guid      GUID               guid:3238b240-61f1-11da-8cd6-0800200c9a66
lfn       logical file name  lfn:dc2.003007.digit.A1_z_ee._00001.aod-1000.pool.root
srm       SRM                srm://dcsrm.usatlas.bnl.gov/pnfs/usatlas.bnl.gov/data/prod/rome/datafiles/rome/recov10/rome.003041.recov10.J8_Pt_2240/rome.003041.recov10.J8_Pt_2240._00053.AOD.pool.root.1
gsiftp    globus copy        gsiftp://aftpexp02.bnl.gov/usatlas/magdacache001/common/Rome-AOD/datafiles/rome/recov10/rome.003041.recov10.J8_Pt_2240/rome.003041.recov10.J8_Pt_2240._00040.AOD.pool.root
dcap      dCache             dcap:/pnfs/usatlas.bnl.gov/data/prod/rome/datafiles/rome/recov10/rome.003041.recov10.J8_Pt_2240/rome.003041.recov10.J8_Pt_2240._00053.AOD.pool.root.1
rfio      Castor
file      ordinary file      file:/usatlas/magdacache001/common/Rome-AOD/datafiles/rome/recov10/rome.003041.recov10.J8_Pt_2240/rome.003041.recov10.J8_Pt_2240._00040.AOD.pool.root
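The URL forms above can be told apart by the text before the first colon. The following is a minimal Python sketch of that split; the helper names split_url and is_logical are hypothetical and are not part of DIAL. It treats guid and lfn as the logical protocols, as the text does.

```python
# Hypothetical helpers for illustration only; they are not part of DIAL.
LOGICAL_PROTOCOLS = {"guid", "lfn"}
KNOWN_PROTOCOLS = LOGICAL_PROTOCOLS | {"srm", "gsiftp", "dcap", "rfio", "file"}


def split_url(url):
    """Split 'protocol:rest' at the first colon and return (protocol, rest)."""
    protocol, sep, rest = url.partition(":")
    if not sep or protocol not in KNOWN_PROTOCOLS:
        raise ValueError(f"unsupported or malformed URL: {url!r}")
    return protocol, rest


def is_logical(url):
    """Return True for logical names (guid or lfn), False for other URLs."""
    return split_url(url)[0] in LOGICAL_PROTOCOLS
```

For example, split_url("srm://host/path") yields the protocol "srm" and the remainder "//host/path".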


Operations

FMS provides three operations:

get
Get "resolves" a URL, i.e. transforms it into another URL corresponding to a replica of the same file. This may require consulting a replica catalog, staging the file or copying it to an accessible location. The usual application is to use a logical name (GUID or LFN) to find a file which can be read by an application. The algorithm for resolution is described below.

The command line syntax is

  >fms get URL
where URL can be any of the above. On success, the command writes one line containing the resolved URL to stdout. Otherwise it writes nothing to stdout.
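A script can rely on the stdout convention just described to detect success. The following Python sketch wraps the command under that assumption; the function name fms_get and its fms_cmd parameter are hypothetical, and the sketch does not inspect the exit status because this page does not document it.

```python
import subprocess


def fms_get(url, fms_cmd=("fms",)):
    """Run 'fms get URL' and return the resolved URL, or None if stdout is empty.

    fms_cmd is a hypothetical hook that lets a caller substitute another
    executable (for example a stub when fms is not installed).
    """
    result = subprocess.run(
        [*fms_cmd, "get", url],
        stdout=subprocess.PIPE,  # capture stdout only; stderr passes through
        text=True,
    )
    lines = result.stdout.splitlines()
    first = lines[0].strip() if lines else ""
    return first or None
```

Leaving stderr attached to the terminal keeps any debugging messages visible while the resolved URL is captured.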

The ownership of the file associated with the output URL will sometimes but not always lie with the user. FMS should be modified to clarify this and perhaps provide means to release or delete the file where appropriate.

copy
The copy command also transforms one URL into another, but the new URL is guaranteed to refer to a copy of the original file that is owned by the caller.

The syntax is

  >fms copy URL file:/somepath
and the output is the same as for get.

put
Put copies the file referenced by the input URL to a storage element and may register it in a replica catalog. It returns a URL carrying the SE or replica catalog name for the file. The syntax is

  >fms put URL


Environment

The behavior of FMS is controlled by the following environment variables. Their meanings are further clarified in the following section on algorithms.

DIAL_FMS_PROTOCOLS
This is a comma-separated list of acceptable protocols and is typically dictated by the application that will consume the file. E.g. for root or athena applications it might be "file,dcap,rfio".

DIAL_FMS_PROTOCOL_ORDER
This is a comma-separated list of extended protocols in order of preference. Extended protocols may add a directory path or can be the special value "copy", meaning to make a copy of the file. The value for this variable depends on the site configuration. For example, the value

  file:/atlascache001,dcap:/thissite.org:8443,copy:/tmp/usercache
instructs FMS to first look for an ordinary file in the ATLAS cache directory, then try the indicated dCache server and finally to copy the file to a tmp directory.
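To make the structure of such a value concrete, the following Python sketch splits it into ordered (protocol, path) pairs. The function parse_protocol_order is a hypothetical illustration, not DIAL's own parser, and it assumes that the path is everything after the first colon of each entry.

```python
def parse_protocol_order(value):
    """Split a DIAL_FMS_PROTOCOL_ORDER-style string into (protocol, path) pairs.

    Assumption: each comma-separated entry is 'protocol' or 'protocol:path',
    and the path is whatever follows the first colon.
    """
    entries = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty entries such as a trailing comma
        protocol, _, path = item.partition(":")
        entries.append((protocol, path))
    return entries
```

In a script the input would typically come from os.environ.get("DIAL_FMS_PROTOCOL_ORDER", ""). An entry without a path, such as a bare "file", yields an empty path string.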

DIAL_FMS_CATALOG
This is a comma-separated list of file replica catalog locations used to resolve URLs. These are prefixed by the DIAL class used to interpret the location. A suitable value for BNL might be

  PoolFileCatalog:mysqlcatalog_mysql://dq2user:wnw@db1.usatlas.bnl.gov:3306/localreplicas,MagdaFileCatalog

DIAL_FMS_SE
This is an extended protocol specifying the storage element. E.g. gsiftp://mygsiserver/mypath or srm://srmserver/mypath.

DIAL_FMS_SE_CATALOG
This is a prefixed replica catalog location indicating the catalog that should be used to register files put into the storage element.

DIAL_FMS_VERBOSE
FMS will write debugging messages to stderr if this variable is set to any nonblank value.


Algorithms

get
The algorithm for file name resolution (get) is as follows. If the input URL is logical (guid or lfn), then replicas are extracted from each of the resolving file catalogs. The original URL and each of these replicas are then tried against each of the extended protocols, in the indicated order, until a match is found. Values not consistent with the list of acceptable protocols are ignored. FMS may do some translation of names, e.g. use SRM to obtain a dcap, gsiftp or file URL.
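The resolution order can be sketched in Python as below. This is an illustration under stated assumptions, not the FMS implementation: the function resolve and its return tuples are hypothetical, a "match" is assumed to mean that a candidate URL starts with the extended protocol and its path, a "copy" entry is assumed to produce an ordinary file, and catalog lookup, staging and SRM name translation are left out.

```python
def resolve(url, replicas, acceptable, order):
    """Pick a candidate URL by walking the extended protocols in order.

    url        -- the input URL
    replicas   -- replica URLs already looked up in the catalogs (may be empty)
    acceptable -- set of protocol names, as in DIAL_FMS_PROTOCOLS
    order      -- list of (protocol, path) pairs, as in DIAL_FMS_PROTOCOL_ORDER
    Returns ("use", url, None), ("copy", source_url, dest_dir), or None.
    """
    candidates = [url, *replicas]
    for protocol, path in order:
        if protocol == "copy":
            # Assumption: a copy yields an ordinary file, so it is only
            # useful when "file" is acceptable and a physical source exists.
            source = next(
                (c for c in candidates if not c.startswith(("guid:", "lfn:"))),
                None,
            )
            if source is not None and "file" in acceptable:
                return ("copy", source, path)
            continue
        if protocol not in acceptable:
            continue  # entries outside the acceptable list are ignored
        prefix = f"{protocol}:{path}"
        for candidate in candidates:
            if candidate.startswith(prefix):
                return ("use", candidate, None)
    return None
```

Iterating over the extended protocols in the outer loop means that site preference, rather than catalog order, decides which replica is chosen.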

copy
The algorithm for copy is the same except the list of extended protocols is replaced with the single destination indicated on the command line.

put
For put, the file is copied to the SE (the user retains control of the original). If no SE file catalog is defined, then the SE URL for the file is returned. If there is an SE file catalog, the SE URL is registered there and the natural logical URL is returned, e.g. a guid URL for the pool file catalog.


DIAL release 1.30: File management system, updated 30nov05