Shears Production

LPC Work Area

Master instructions for creating a CMSSW_80X area for shears.

$ cmsrel CMSSW_8_0_X (For the L1ECAL fixes we need X>=32 so use 32 for now)
$ cd CMSSW_8_0_X/src
$ cmsenv
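
Since the L1ECAL fixes need X>=32, it can be worth checking the release before going further. The following is a minimal sketch, not part of shears: the function name cmssw_version_ok is hypothetical, and it assumes the release string looks like CMSSW_8_0_X (optionally followed by a suffix such as _patch1).

```python
import re


def cmssw_version_ok(version, minimum_patch=32):
    """Return True if `version` is a CMSSW_8_0_X release with X >= minimum_patch.

    Hypothetical helper for illustration; it only inspects the string.
    """
    match = re.match(r"^CMSSW_8_0_(\d+)", version)
    if match is None:
        # Not an 8_0_X release name at all.
        return False
    return int(match.group(1)) >= minimum_patch
```

After cmsenv the release name is available in the CMSSW_VERSION environment variable, so a check could look like cmssw_version_ok(os.environ.get("CMSSW_VERSION", "")).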

EGAMMA Corrections:

$ git cms-init
$ git cms-merge-topic cms-egamma:EGM_gain_v1
$ cd EgammaAnalysis/ElectronTools/data
$ git clone -b Moriond17_gainSwitch_unc https://github.com/ECALELFS/ScalesSmearings.git
$ cd $CMSSW_BASE/src
$ scram b

L1ECAL:

$ git cms-merge-topic lathomas:L1Prefiring_8_0_32
$ scram b

Shears:

If you want to clone with a method other than ssh (KRB, etc.), see https://gitlab.cern.ch/darcaro/shears
$ git clone ssh://git@gitlab.cern.ch:7999/darcaro/shears.git -b Run2_2016_CMSSW_8_0_26
$ scram b

Shears makes ntuples in two steps: Baobab and Bonzai. We use the Bonzai files for analysis in the shears/DYJets directory.

Baobab

$ cd shears/ntuple_production
Modify the setup.sh file so that all paths are correct for your setup (sorry, it is not fancy enough to do it automatically).
$ source setup.sh
There should be multiple datasets_##### files in shears/datasets. Each file lists the datasets to be used and the output directory for the Baobab files. All the datasets are taken from DAS. Dataset lines with a # will not be run. Change the output path to where you want. Copy the ones you want to use to shears/ntuple_production for convenience.
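
To see which datasets in a file would actually run, the # rule can be sketched as below. This is only an illustration and not the parser used by shears: it assumes the # sits at the start of a disabled line, and the function name active_dataset_lines is hypothetical.

```python
def active_dataset_lines(text):
    """Return the lines that are neither blank nor commented out with '#'.

    Assumption for illustration: a disabled dataset line starts with '#'.
    """
    active = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            # Blank lines and commented-out datasets are skipped.
            continue
        active.append(line)
    return active
```

For example, active_dataset_lines(open("datasets_2016_INC.txt").read()) would list only the entries that are not commented out.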

To submit the jobs we use simple_grow_baobabs. It is a Python script that creates the files that are submitted with the crab jobs. It can also call crab commands such as submit. Once the crab files are created, it is up to you whether to run crab manually or keep using this tool (but you probably don't want to write the crab files on your own). There are also some paths in this script that need to be changed.

More paths need to be checked in grow_baobabs_cfg.py. This file contains the settings for the cmssw job. The actual cmssw code is in shears/Baobab/src. It does not do much except grab the objects and create std::vectors from the info.

Ready to make a crab file

$ simple_grow_baobabs datasets_2016_INC.txt --no-submit
A file called crab_DYJetsToLL_M-50_TuneCUETP8M1_13TeV-amcatnloFXFX-pythia8_0000.py will be created that looks like this:

from WMCore.Configuration import Configuration
config = Configuration()
config.section_('General')
config.General.transferOutputs = True
config.General.transferLogs = True
config.General.requestName = 'DYJetsToLL_M-50_TuneCUETP8M1_13TeV-amcatnloFXFX-pythia8_0000'
config.section_('JobType')
config.JobType.psetName = 'grow_baobabs_cfg.py'
config.JobType.inputFiles = ['L1PrefiringMaps_new.root']
config.section_('Data')
config.Data.inputDataset = '/DYJetsToLL_M-50_TuneCUETP8M1_13TeV-amcatnloFXFX-pythia8/RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v6_ext2-v1/MINIAODSIM'
config.Data.unitsPerJob = 100000
config.Data.publication = False
config.Data.splitting = 'EventAwareLumiBased'
config.Data.outLFNDirBase = '/store/user/darcaro/Baobab/MC/v10/Ntuples'
config.section_('User')
config.section_('Site')
config.Site.blacklist = ['T3_TW_NTU_HEP', 'T3_GR_IASA', 'T2_GR_Ioannina', 'T3_MX_Cinvestav', 'T2_DE_RWTH', 'T2_UK_SGrid_RALPP', 'T3_RU_FIAN', 'T2_FI_HIP', 'T2_BR_SPRACE', 'T2_ES_CIEMAT', 'T2_EE_Estonia']
config.Site.storageSite = 'T3_US_FNALLPC'

Make sure that outLFNDirBase and the input dataset are correct. Read about crab python files if you want to know what all the other settings do.
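
A quick way to double-check the output path before submitting is to read it back from the generated file. The sketch below is an illustration only: it parses the file as text instead of importing it, the function names are hypothetical, and it assumes the output should live under /store/user/<username>/ as in the example above.

```python
import re


def out_lfn_dir_base(config_text):
    """Return the value assigned to config.Data.outLFNDirBase, or None if absent."""
    match = re.search(
        r"config\.Data\.outLFNDirBase\s*=\s*['\"]([^'\"]+)['\"]", config_text
    )
    return match.group(1) if match else None


def belongs_to_user(lfn_dir, username):
    """Return True if lfn_dir is inside /store/user/<username>/ (assumed layout)."""
    return lfn_dir is not None and lfn_dir.startswith("/store/user/%s/" % username)
```

With the example file above, out_lfn_dir_base returns /store/user/darcaro/Baobab/MC/v10/Ntuples, so belongs_to_user is true only for the user darcaro; anyone else should change the path first.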

The file can then be submitted to crab:

$ crab submit -c crab_DYJetsToLL_M-50_TuneCUETP8M1_13TeV-amcatnloFXFX-pythia8_0000.py

Calling simple_grow_baobabs without the --no-submit option will automatically call the crab submit line for you. Only start doing this after you are sure everything will be correct.
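
If several crab_*.py files were created with --no-submit, the manual submit step can be scripted. The following is a minimal sketch under that assumption, not something shipped with shears: the function names are hypothetical, and it defaults to a dry run that only prints the crab submit -c lines.

```python
import glob
import os
import subprocess


def crab_submit_commands(directory="."):
    """Build one 'crab submit -c <file>' command per crab_*.py file in directory."""
    pattern = os.path.join(directory, "crab_*.py")
    return [["crab", "submit", "-c", path] for path in sorted(glob.glob(pattern))]


def submit_all(directory=".", dry_run=True):
    """Print the commands (dry run) or execute them one by one."""
    commands = crab_submit_commands(directory)
    for command in commands:
        if dry_run:
            # Only show what would be run.
            print(" ".join(command))
        else:
            # Stops at the first failing submission.
            subprocess.check_call(command)
    return commands
```

Calling submit_all() first and reading the printed lines is a cheap check before running submit_all(dry_run=False).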

A bunch of text will come up and, if all goes well, it will say somewhere that the task was submitted. Make note of the URL as well, since it is useful when monitoring the jobs: http://dashb-cms-job.cern.ch/dashboard/templates/task-analysis/#user=default&refresh=0&table=Mains&p=1&records=25&activemenu=2&pattern=&task=&from=&till=&timerange=lastWeek

Check other crab commands (crab -h) and also the notes.txt file (lots of examples of what I have done in the past with details).

Once the files are done we need to make a catalog file (just a file to keep track of where all the files are located).

$ simple_grow_baobabs datasets_2016_#####.txt --make-catalogs

The catalogs are created in the parent directory of Ntuples.
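
To illustrate the idea of a catalog as a list of file locations, here is a rough sketch. It does not reproduce the format written by --make-catalogs: the function names and the catalog_sketch.txt file name are hypothetical, and it assumes the Ntuples directory is readable as a local path.

```python
import os


def list_root_files(ntuple_dir):
    """Return the sorted paths of every .root file below ntuple_dir."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(ntuple_dir):
        for name in filenames:
            if name.endswith(".root"):
                paths.append(os.path.join(dirpath, name))
    return sorted(paths)


def write_file_list(ntuple_dir, catalog_name="catalog_sketch.txt"):
    """Write one path per line into the parent directory of ntuple_dir."""
    parent = os.path.dirname(os.path.abspath(ntuple_dir))
    catalog_path = os.path.join(parent, catalog_name)
    with open(catalog_path, "w") as handle:
        for path in list_root_files(ntuple_dir):
            handle.write(path + "\n")
    return catalog_path
```

The output lands next to the Ntuples directory, mirroring where the real catalogs are placed.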

Bonzai

$ cd shears/bonzai_prod 

This is similar to Baobab: there is a main Python script, grow_bonzais, and dataset files. The main difference is the skim setting, which depends on the analysis you want to do. It basically chooses the lepton flavor and some other cuts. All of this is defined in shears/Bonzai/Pruner/VJetPruner.cc. We will mostly use the Unf variant with double or single muon or electron (DMuUnf, DEUnf, SEUnf, SMuUnf).

For the other parts of the dataset file, just make sure the paths are correct for the input catalogs that were produced in the Baobab section and for the output directory. Again, lines commented out with # will not be submitted.

Modify the crab_setup.sh script to your directories and source it.

$ source crab_setup.sh

Create the crab file

$ grow_bonzais --task-list grow_bonzai_task_list_####.txt --no-submit

And submit it if everything looks good.

$ crab submit -c crab_#####.py

Again, leaving out --no-submit does the submit automatically.

Once all the jobs are done you can make the catalogs like in Baobab:

$ grow_bonzais --make-catalogs grow_bonzai_task_list_#####.txt

-- Daniel Arcaro - 17 Jan 2019

Topic revision: r1 - 17 Jan 2019 - DanielArcaro
 