Shears Production
LPC Work Area
Master instructions for creating a CMSSW_80X area for shears.
$ cmsrel CMSSW_8_0_32 (for the L1ECAL fixes we need a release with X >= 32, so use 32 for now)
$ cd CMSSW_8_0_32/src
$ cmsenv
EGAMMA Corrections:
$ git cms-init
$ git cms-merge-topic cms-egamma:EGM_gain_v1
$ cd EgammaAnalysis/ElectronTools/data
$ git clone -b Moriond17_gainSwitch_unc https://github.com/ECALELFS/ScalesSmearings.git
$ cd $CMSSW_BASE/src
$ scram b
L1ECAL:
$ git cms-merge-topic lathomas:L1Prefiring_8_0_32
$ scram b
Shears
If you want to clone with a method other than SSH (Kerberos, HTTPS, etc.), get the appropriate URL from https://gitlab.cern.ch/darcaro/shears
$ git clone ssh://git@gitlab.cern.ch:7999/darcaro/shears.git -b Run2_2016_CMSSW_8_0_26
$ scram b
Shears makes ntuples in two steps: Baobab and Bonzai. We use the Bonzai files for analysis in the
shears/DYJets directory.
Baobab
$ cd shears/ntuple_production
Modify the setup.sh file so that all paths are correct for your setup (sorry, it is not fancy enough to do
this automatically).
$ source setup.sh
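Whatever your setup.sh ends up doing, submitting crab jobs from an LPC area also requires the CMSSW and CRAB3 environments plus a valid grid proxy, so make sure something along these lines has been run in your session (standard commands, not part of shears itself):
$ cmsenv
$ source /cvmfs/cms.cern.ch/crab3/crab.sh
$ voms-proxy-init --voms cms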
There should be multiple datasets_##### files in shears/datasets. These files contain the datasets to
be used and the output directory for the Baobab files. All the datasets are taken from DAS. Dataset
lines starting with a # will not be run. Change the output path to wherever you want the output. Copy the
files you want to use to shears/ntuple_production for convenience.
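Purely as a hypothetical illustration (the real layout is whatever simple_grow_baobabs parses, so check an existing datasets file), the idea is that a DAS dataset name is paired with an output location, and commented lines are skipped:
# commented-out datasets are skipped
/DYJetsToLL_M-50_TuneCUETP8M1_13TeV-amcatnloFXFX-pythia8/RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v6_ext2-v1/MINIAODSIM  /store/user/<username>/Baobab/MC/v10/Ntuples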
To submit the jobs we use simple_grow_baobabs. It is a python script that creates the configuration files
that are submitted as crab jobs. It can also call crab commands such as submit. It is up to you whether
you want to do things manually with crab or use this tool after the crab files are created (probably don't
write the files on your own). There are also some paths here that need to be changed.
More paths need to be checked in grow_baobabs_cfg.py. This file contains the settings for the CMSSW job. The
actual CMSSW code is in shears/Baobab/src. It does not do much except grab the objects from the event and
fill std::vectors with the information.
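For orientation, the flattening pattern looks roughly like the following. This is a simplified, standalone C++ sketch with made-up names, not the actual Baobab code (which reads the pat objects through the CMSSW event and writes the vectors to the output tree):
#include <vector>

// Hypothetical, simplified stand-in for a reconstructed object in the event.
struct Lepton { float pt, eta, phi; };

// Baobab-style flattening: one std::vector per quantity, filled in parallel
// for every object in the event and then stored as tree branches.
void fillLeptonBranches(const std::vector<Lepton>& leptons,
                        std::vector<float>& LepPt,
                        std::vector<float>& LepEta,
                        std::vector<float>& LepPhi)
{
  LepPt.clear();
  LepEta.clear();
  LepPhi.clear();
  for (const auto& lep : leptons) {
    LepPt.push_back(lep.pt);
    LepEta.push_back(lep.eta);
    LepPhi.push_back(lep.phi);
  }
}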
You are now ready to make a crab file:
$ simple_grow_baobabs datasets_2016_INC.txt --no-submit
A file called crab_DYJetsToLL_M-50_TuneCUETP8M1_13TeV-amcatnloFXFX-pythia8_0000.py will be created
that looks like this:
from WMCore.Configuration import Configuration
config = Configuration()
config.section_('General')
config.General.transferOutputs = True
config.General.transferLogs = True
config.General.requestName = 'DYJetsToLL_M-50_TuneCUETP8M1_13TeV-amcatnloFXFX-pythia8_0000'
config.section_('JobType')
config.JobType.psetName = 'grow_baobabs_cfg.py'
config.JobType.inputFiles = ['L1PrefiringMaps_new.root']
config.section_('Data')
config.Data.inputDataset = '/DYJetsToLL_M-50_TuneCUETP8M1_13TeV-amcatnloFXFX-pythia8/RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v6_ext2-v1/MINIAODSIM'
config.Data.unitsPerJob = 100000
config.Data.publication = False
config.Data.splitting = 'EventAwareLumiBased'
config.Data.outLFNDirBase = '/store/user/darcaro/Baobab/MC/v10/Ntuples'
config.section_('User')
config.section_('Site')
config.Site.blacklist = ['T3_TW_NTU_HEP', 'T3_GR_IASA', 'T2_GR_Ioannina', 'T3_MX_Cinvestav', 'T2_DE_RWTH', 'T2_UK_SGrid_RALPP', 'T3_RU_FIAN', 'T2_FI_HIP', 'T2_BR_SPRACE', 'T2_ES_CIEMAT', 'T2_EE_Estonia']
config.Site.storageSite = 'T3_US_FNALLPC'
Make sure the outLFNDirBase and the input dataset are correct. Read about crab python configuration files
if you want to know about all the other settings.
The file can then be submitted to crab:
$ crab submit -c crab_DYJetsToLL_M-50_TuneCUETP8M1_13TeV-amcatnloFXFX-pythia8_0000.py
Calling simple_grow_baobabs without the --no-submit option will automatically run the crab submit command
for you. Only start doing this after you are sure everything is correct.
A bunch of text will come up, and if everything goes alright it will say the task was submitted. Make note
of the URL as well, since it is useful when monitoring the jobs:
http://dashb-cms-job.cern.ch/dashboard/templates/task-analysis/#user=default&refresh=0&table=Mains&p=1&records=25&activemenu=2&pattern=&task=&from=&till=&timerange=lastWeek
Check the other crab commands (crab -h) and also the notes.txt file (lots of detailed examples of what I
have done in the past).
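Besides the dashboard, the crab commands you will use most while babysitting the tasks are status and resubmit, pointed at the crab project directory that crab submit creates (its name is printed at submission time; the placeholder below is generic):
$ crab status -d <crab project directory>
$ crab resubmit -d <crab project directory>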
Once the files are done we need to make a catalog file (just a file that keeps track of where all the
output files are located).
$ simple_grow_baobabs datasets_2016_#####.txt --make-catalogs
The catalogs are created in the same parent directory as Ntuples.
Bonzai
$ cd shears/bonzai_prod
This is similar to Baobab: there is a main python script, grow_bonzais, and task list files. The
main difference is that there is a skim setting here, which is important for which analysis you want.
It basically chooses the lepton flavor and applies some other cuts. All of this is defined in
shears/Bonzai/Pruner/VJetPruner.cc. We will mostly use the Unf variants with double or single muons or
electrons (DMuUnf, DEUnf, SEUnf, SMuUnf).
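The authoritative selection is whatever VJetPruner.cc implements; purely as an illustration of what a flavor/multiplicity skim decision means (hypothetical function, names, and thresholds, not the actual code):
#include <string>
#include <vector>

// Hypothetical illustration of a flavor/multiplicity skim decision;
// the real cuts (pt, ID, triggers, ...) live in VJetPruner.cc.
bool passSkim(const std::string& skim,
              const std::vector<float>& muonPt,
              const std::vector<float>& electronPt)
{
  if (skim == "DMuUnf") return muonPt.size() >= 2;     // double muon
  if (skim == "SMuUnf") return muonPt.size() >= 1;     // single muon
  if (skim == "DEUnf")  return electronPt.size() >= 2; // double electron
  if (skim == "SEUnf")  return electronPt.size() >= 1; // single electron
  return false;                                        // unknown skim: reject
}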
For the other parts of the task list file, just make sure the paths are correct for the input catalogs
that were produced in the Baobab section and for the output directory. Again, lines commented out with #
will not be submitted.
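As a hypothetical sketch only (check an existing grow_bonzai_task_list file for the real column layout), each task line ties a Baobab catalog to a skim and an output area, along the lines of:
<path to Baobab catalog>   DMuUnf   <Bonzai output directory>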
Modify the crab_setup.sh script to use your directories, then source it:
$ source crab_setup.sh
Create the crab file
$ grow_bonzais --task-list grow_bonzai_task_list_####.txt --no-submit
And submit it if everything looks good.
$ crab submit -c crab_#####.py
Again, leaving out --no-submit does the submission automatically.
Once all the jobs are done you can make the catalogs like in Baobab:
$ grow_bonzais --make-catalogs grow_bonzai_task_list_#####.txt
-- Daniel Arcaro - 17 Jan 2019