r95 - 02 May 2011 - 22:44:41 - HongMa

Using the BNL Xrootd/PROOF Farm


Introduction

What is PROOF?

The Parallel ROOT Facility, PROOF, is an extension of ROOT allowing transparent analysis of large sets of ROOT files in parallel on clusters of computers or multi-core machines. There is an official PROOF website and a PROOF forum in Root Talk, as well as an ATLAS hypernews forum.

What can I use PROOF for?

PROOF can be used to analyze any ROOT ntuple, including tertiary DPDs and AODs (using AthenaROOTAccess). It's fast, allowing you to analyze the events in a TTree in parallel over many CPUs.

What is the BNL Xrootd/PROOF Farm?

A new Xrootd/PROOF farm was set up at BNL in summer 2010. It is part of the ATLAS Computing Facility (ACF) at BNL and is located inside the ACF T1 firewall perimeter, so it is directly accessible only from ACF's interactive and batch nodes, although an ssh tunnel through the ACF gateway should also work. At present it can be accessed by any user with ATLAS credentials at ACF; we may restrict access to the farm in the future, if necessary. The farm redirector node is xrd.usatlas.bnl.gov (aka xrd). Currently the farm has 10 data server nodes and 3 "data vault" nodes. All these nodes have 8 cores (Xeon 5560 CPUs running at 2.8 GHz), 24 GB of RAM and a 1 Gb NIC. Data servers have ~2 TB of local disk space each (4 SATA disks of 500 GB each in a RAID0 configuration). Data vaults have ~12 TB each (6 SATA disks of 2 TB each in a RAID0 configuration).

The PROOF farm master node is acas1010.usatlas.bnl.gov (aka acas1010). The PROOF farm consists of 12 slave nodes (acas1001-1009, acas0784, acas0786, acas0787), with a total of 96 CPU cores.
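
As a quick consistency check on the figures quoted above, the core count and the nominal disk capacity can be worked out directly. This is only a sketch in plain C++: the constant and function names are made up for illustration, and the disk total is a nominal sum of the quoted disk sizes, not a measured usable capacity.

```cpp
#include <cassert>

// Figures quoted in the farm description (names are illustrative only).
const int kCoresPerNode      = 8;     // cores per node
const int kProofSlaveNodes   = 12;    // acas1001-1009, acas0784, acas0786, acas0787
const int kDataServers       = 10;
const int kDisksPerServer    = 4;     // RAID0
const int kServerDiskGB      = 500;
const int kDataVaults        = 3;
const int kDisksPerVault     = 6;     // RAID0
const int kVaultDiskGB       = 2000;

// Total number of cores available to PROOF.
int proofCores() { return kProofSlaveNodes * kCoresPerNode; }

// Nominal disk space of one data server and one data vault, in GB.
int dataServerGB() { return kDisksPerServer * kServerDiskGB; }
int dataVaultGB()  { return kDisksPerVault * kVaultDiskGB; }

// Nominal disk space of the whole Xrootd farm, in GB.
int farmDiskGB()
{
    return kDataServers * dataServerGB() + kDataVaults * dataVaultGB();
}

// Check the derived numbers against the figures stated in the text.
void checkFarmFigures()
{
    assert(proofCores() == 96);       // "a total of 96"
    assert(dataServerGB() == 2000);   // "~2 TB" per data server
    assert(dataVaultGB() == 12000);   // "~12 TB" per data vault
}
```

Summing the quoted disk sizes this way gives a nominal 56 TB for the 10 data servers and 3 data vaults together.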

For OS-level monitoring we use the ACF-wide setup based on Ganglia. The Xrootd/PROOF farm has Ganglia pages for the first server and the second server.

Getting Started with PROOF at BNL

To use the PROOF farm, you need to have access to BNL's interactive nodes. If you do not have an account on acas, you can apply here.

If you would like to run PROOF locally instead of at BNL, follow the xproofd setup instructions on the ROOT website (NOTE: in order to run xrootd, you first need to set up ROOT).

Setting up ROOT at BNL

Shell scripts are provided for setting up ROOT at BNL:

  • bash/zsh: source /afs/usatlas/scripts/root_set-slc5.sh 5.27.04
  • csh: source /afs/usatlas/scripts/root_set-slc5.csh 5.27.04

You can switch to another ROOT version using the same scripts. They take care of removing the old version from the environment variables and adding the new one.

Getting started packages

Two examples are provided at BNL: one in C++, the other in Python. Each fills one histogram and writes out a new TTree containing the events that pass the cuts.

* Example-1 in C++: under the directory /direct/usatlas+workarea/yesw2000/root/Proof/NtupleAna/Selector/yesw-Example2, there are 3 relevant source files and one Readme file:

00Readme.txt
mySelector2.C
mySelector2.h
run_chain-mySelector2.C

* Example-1 in Python: under the directory /direct/usatlas+workarea/yesw2000/root/Proof/NtupleAna/PySelector/yesw-Example2, there are 2 relevant source files and one Readme file:

00Readme.txt
mySelector2.py
run-mySelector2.py

Just copy each example to your own directory and follow the instructions in 00Readme.txt to try it. It should work in all three cases: PROOF, PROOF-Lite, or plain (non-PROOF) ROOT. Try all of them and compare the performance.
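
One simple way to put a number on the performance difference is to compare wall-clock times of the same job, for example the real time reported by TStopwatch::RealTime(). The helpers below are a minimal sketch in plain C++; the function names are hypothetical and not part of ROOT or of the example packages.

```cpp
#include <cassert>

// Speedup of a parallel run relative to a serial run of the same job.
// Both arguments are wall-clock times in seconds.
double speedup(double serialSeconds, double parallelSeconds)
{
    assert(serialSeconds > 0.0 && parallelSeconds > 0.0);
    return serialSeconds / parallelSeconds;
}

// Fraction of the ideal (linear) speedup reached with nWorkers workers.
double parallelEfficiency(double serialSeconds, double parallelSeconds,
                          int nWorkers)
{
    assert(nWorkers > 0);
    return speedup(serialSeconds, parallelSeconds) / nWorkers;
}
```

For instance, a job that takes 120 s in plain ROOT and 15 s with 16 workers has a speedup of 8 and an efficiency of 0.5.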

Adding clist to a TChain or TDSet

For each dataset in xrootd, a clist file is created under the directory ~xrdadmin/xrd_copied_dataset. You can use the clist to add the dataset's files to your TChain or TDSet, for example:

TFileCollection* fileColl = new TFileCollection("fileColl");
fileColl->AddFromFile("/usatlas/u/xrdadmin/xrd_copied_dataset/data10_7TeV.00162882.physics_MinBias.merge.NTUP_JETMET.f287_p209_tid162924_00.clist");
TChain* chain = new TChain("qcd");
chain->AddFileInfoList(fileColl->GetList());

// or for TDSet
TDSet* dset = new TDSet("dset","qcd");
dset->Add(fileColl->GetList());

Run a Very Simple Example

In this example, you will open a PROOF session, add a file to a TChain, and Draw() a simple quantity.

TProof *p = TProof::Open("acas1010");

TChain *ch = new TChain("FullRec0");

ch->Add("/usatlas/workarea/yesw2000/root/Data/HPTV/user.TARRADEFabien.trig1_misal1_csc11.005145.PythiaZmumu.Athena_12.0.6.GroupArea_12.0.6.6.Jamboree_II-HightPtView-00-00-30.AAN.AANT3._000*.root");

ch->SetProof();

TStopwatch t;
t.Start();
ch->Draw("Jet_C4_p_T","Jet_C4_N>0");
t.Stop();
t.Print();

p->GetOutputList()->Print();
TH1F *htemp = (TH1F*) p->GetOutputList()->At(3);  // At() returns a TObject*, so cast it explicitly
htemp->Draw();

Due to a bug in the current ROOT version (5.27.04), the canvas disappears once the merging is done. A fix is already in the ROOT head version. In the meantime, you can always retrieve the output histogram from the TProof output list, as in the last lines above.

The output should look something like this:

[Figure: PROOF Simple Example Session (ExampleSession.png, listed under Attachments)]

Reset Your PROOF Session

Sometimes your PROOF session will get corrupted and TProof::Open() will hang. Reset your session to restore functionality:

 TProof::Reset("acas1010");
or for a hard reset, 
 TProof::Reset("acas1010", true);

Finding Available Data

The namespace convention for the Xrootd servers is "root://xrd.usatlas.bnl.gov//data/datasetname/filename". The short form "root://xrd//data/datasetname/filename" should also be acceptable.
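
The naming convention can be captured in a small helper. The following is a minimal sketch in standard C++; the function name xrdUrl is hypothetical (it is not part of ROOT or Xrootd), and only the URL layout is taken from the convention stated above.

```cpp
#include <cassert>
#include <string>

// Build a fully qualified file URL for the BNL Xrootd farm, following the
// layout root://<redirector>//data/<datasetname>/<filename>.
// The redirector defaults to the full host name; pass "xrd" for the short form.
std::string xrdUrl(const std::string& dataset,
                   const std::string& filename,
                   const std::string& redirector = "xrd.usatlas.bnl.gov")
{
    // An empty dataset or file name cannot give a usable URL.
    assert(!dataset.empty() && !filename.empty());
    return "root://" + redirector + "//data/" + dataset + "/" + filename;
}
```

With placeholder names, xrdUrl("datasetname", "filename") gives "root://xrd.usatlas.bnl.gov//data/datasetname/filename", and the resulting string can be passed to TChain::Add().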

At copy time we generate, for every copied dataset, a list of its files with fully qualified names for the "xrd.usatlas.bnl.gov" farm. These file lists are kept in the ~xrdadmin/xrd_copied_dataset directory, which is accessible from the ACF interactive nodes. Each list is named "datasetname.clist" and corresponds to one dataset; it contains the fully qualified names of the files belonging to that dataset and can be used as input for your analysis. We have also tested, and will be using, the so-called PQ2 tools for data registration and data discovery on the farm (more information about the PQ2 tools can be found on the main PROOF page at CERN). Full availability of the PQ2 tools will be announced when appropriate.
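
If you prefer to inspect a clist yourself before handing it to ROOT, it can be read with a few lines of standard C++. This is only a sketch: it assumes one fully qualified file name per line (the exact clist layout is an assumption), and readClist is a hypothetical helper name.

```cpp
#include <cassert>
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>
#include <vector>

// Read file URLs from a clist stream, assuming one URL per line.
// Blank lines are skipped. Lines that do not start with `prefix` are not
// returned but appended to `rejected`, so they can be reported to the user.
std::vector<std::string> readClist(
    std::istream& in,
    std::vector<std::string>& rejected,
    const std::string& prefix = "root://xrd.usatlas.bnl.gov//data/")
{
    assert(!prefix.empty());
    std::vector<std::string> urls;
    std::string line;
    while (std::getline(in, line)) {
        // Strip trailing blanks, tabs and carriage returns.
        const std::string::size_type last = line.find_last_not_of(" \t\r");
        if (last == std::string::npos) continue;   // empty or blank line
        line.erase(last + 1);
        if (line.compare(0, prefix.size(), prefix) == 0)
            urls.push_back(line);
        else
            rejected.push_back(line);
    }
    return urls;
}
```

Opened on a .clist file with std::ifstream, the returned names can be added to a TChain one at a time with TChain::Add(); the TFileCollection approach shown earlier on this page does the same job inside ROOT.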

Currently we are replicating the WWD3PD from the SM EW subgroup, the QCD D3PD from the Jet/EtMiss group (as well as some slimmed versions), and some D3PDs from the tau performance group. If you need to copy your data to Xrootd, contact Hong Ma (hma@bnl.gov) or Sergey Panitkin (panitkin@bnl.gov).

Additional Useful Links

  • From the BNL Jamboree, June 2008: Sergey's talk, Shuwei's talk
  • Introduction to the PROOF farm at BNL and analysis tutorial
  • PROOF in ATLAS talk at the CHEP09 conference; the corresponding paper can be found below in the attachments
  • Talk on the use of SSDs and I/O-related issues in PROOF; the corresponding paper can be found below in the attachments
  • Datasets on the xrootd server, DatasetsOnXRD


Major updates:
-- TWikiAdminGroup - 30 Mar 2017


Attachments


pdf panitkin_ssd_chep09_v1.pdf (106.6K) | SergeyPanitkin, 16 Feb 2010 - 14:37 | CHEP09 paper on use of SSDs in PROOF environment and related I/O issues
pdf panitkin_chep09_v3_revised.pdf (299.0K) | SergeyPanitkin, 16 Feb 2010 - 14:34 | CHEP09 talk about Distributed analysis with PROOF in Atlas
png ExampleSession.png (100.1K) | StephanieMajewski, 10 Jul 2008 - 16:01 | PROOF Simple Example Session Screenshot
pdf Xrootd_farm_plans.pdf (114.3K) | KyleCranmer, 13 Jul 2007 - 16:57 |
png Picture_9.png (274.7K) | KyleCranmer, 13 Jul 2007 - 14:50 | ExampleSession1
c BasicProofCommands.C (1074.1K) | KyleCranmer, 18 Jul 2007 - 11:10 | Basic commands to setup proof, add many files to a dataset, and make a simple plot, run a TSelector
txt ListII.txt (16.8K) | KyleCranmer, 13 Jul 2007 - 17:32 | List of HPTV datasets available via PROOF
c ProofTest.C (3.6K) | KyleCranmer, 16 Jul 2007 - 16:19 |
pdf PROOF_Analysis_Aug1.pdf (1280.0K) | KyleCranmer, 01 Aug 2007 - 16:33 | KyleProofAnalysis_Aug1
txt AllHPTVFiles.txt (683.6K) | KyleCranmer, 13 Jul 2007 - 17:43 |
h ProofTest.h (125.7K) | KyleCranmer, 16 Jul 2007 - 16:19 | An Example TSelector
pdf xrdmon_presentation.pdf (389.8K) | Main.serp, 01 Aug 2007 - 18:44 | Xrootd monitoring talk from Edgar and Ofer
png AllHPTV_Ntuples.png (37.8K) | KyleCranmer, 16 Jul 2007 - 14:46 |
pdf Xrootd_farm_status_Aug_2007.pdf (120.4K) | Main.serp, 01 Aug 2007 - 18:42 |
png Picture_11.png (100.8K) | KyleCranmer, 13 Jul 2007 - 14:55 | EventProcessingRate
 