Data Analysis with CMS

Finding Data

The easiest place to start is the CMS WorkBook page on finding data.

The second option is brute force: looking through the directories on eos (disk storage).

  • Entering eos in a terminal on an lxplus machine starts an interactive session. Commands such as ls and cd can be used to navigate directories. A directory called data holds most of the relevant global runs. Enter quit to leave the session.
  • Another way is to call eos directly from the shell:
      $ eos ls eos/cms/store/data

File locations such as eos/cms/store/data/Run2015C/Cosmics/RAW/v1/000/253/982/00000/AAF77D5F-FF3F-E511-83B4-02163E0143BA.root are then entered into the Python files that CMSSW reads. More on this below.
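As a minimal sketch of preparing such a location for a Python file, the helper below cuts an eos path down to the part starting at /store/. It assumes the file name is wanted in the usual CMS logical-file-name form (beginning with /store/). The function name eos_path_to_lfn is made up for this illustration, and the snippet does not check that the file is actually reachable.

```python
def eos_path_to_lfn(eos_path: str) -> str:
    """Return the part of an eos path that starts at /store/.

    Hypothetical helper for illustration; it is not part of CMSSW.
    """
    marker = "/store/"
    # Normalise to a single leading slash so paths written without one also work.
    normalized = "/" + eos_path.strip().lstrip("/")
    index = normalized.find(marker)
    if index < 0:
        raise ValueError(f"no '/store/' component in {eos_path!r}")
    return normalized[index:]


if __name__ == "__main__":
    example = (
        "eos/cms/store/data/Run2015C/Cosmics/RAW/v1/000/253/982/00000/"
        "AAF77D5F-FF3F-E511-83B4-02163E0143BA.root"
    )
    print(eos_path_to_lfn(example))
```

The returned string is what would typically be pasted into the list of input file names in the Python file.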

Raw Data Analysis / Data Quality Monitoring





CMSSW Unpacker

CMSSW Tricks / Notes

All of this is for lxplus machines, except for a few things in the ROOT section. To install a CMSSW version, use $ cmsrel CMSSW_#_#_#. Change to the CMSSW_#_#_#/src directory and enter cmsenv to set up the environment. To select a specific version number, read the CMSSW Version section. Since some of the tools below require the environment to be set already, just use CMSSW_7_4_2 as the first version and get others when needed.


The best place to start with an analyzer is the SWGuide. The section on mkedanlzr shows how to create a skeleton analyzer to then edit. Once it is created, there are three main files to edit. The first is the source file that holds all the analysis code. The next is the BuildFile, which links any outside code to the analyzer. The last is the Python script that CMSSW reads to run your analyzer.

The CMSSW Unpacker listed above is a good place to start looking at source code examples. Another good source of examples is the built-in CMSSW analyzers. The amount of code there is quite hard to sort through, so start with a fairly simple one under the data formats directory.


Dependencies / Buildfiles

An example is the best thing to look at here. The first six use name statements are built-in CMSSW libraries. The last library is a user-created one that is placed in the src directory of the CMSSW_#_#_# directory. The code for this library must be formatted, put into the correct directories, and have a BuildFile of its own. Again, the best approach is to look at an example and edit it for your purpose. There are source files, header files, and a BuildFile to actually create the library. All of it is built in the end with scram b. When making these libraries or putting an #include statement in your analyzer, make sure that they point to the correct place. In the CMSSW Unpacker (same as above), the main analysis code has an include:

#include <cmsswtools/unpacker/interface/FedEvent.hh>

The path is written as if CMSSW_#_#_#/src were the home directory (or ~/). The same should be done with the source code for the user-created libraries, as can be seen in the previous example.
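To check that an include points to the correct place, a small sketch like the following maps the include path onto the src directory of the release area. It assumes cmsenv has set the CMSSW_BASE environment variable. The helper name resolve_include and the fallback directory are placeholders chosen for illustration.

```python
import os


def resolve_include(include_path: str, cmssw_base: str) -> str:
    """Map an #include path to the file it refers to under <cmssw_base>/src.

    Hypothetical helper for illustration; it is not part of CMSSW.
    """
    return os.path.join(cmssw_base, "src", *include_path.split("/"))


if __name__ == "__main__":
    # CMSSW_BASE is set by cmsenv; the fallback is only a placeholder.
    base = os.environ.get("CMSSW_BASE", "CMSSW_7_4_2")
    header = resolve_include("cmsswtools/unpacker/interface/FedEvent.hh", base)
    status = "found" if os.path.isfile(header) else "not found"
    print(f"{header}: {status}")
```

If the header is reported as not found, either the library sits in the wrong directory under src or the include path is misspelled.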

Another place to look is the source code of the built-in libraries, or even other people's libraries and analyzers. I have gathered a lot of information from a site with a huge list of built-in analyzer source code as well as user-created libraries and analyzers. The format used there has worked for current CMSSW versions such as 7_4_2.

CMSSW Version

This section is for determining which CMSSW version to choose for a specific data file and for seeing the contents of the file. The data files are all ROOT files and can be found using the procedures above. The main tools are the Event Data Model (edm) tools. Entering edm on lxplus and then pressing tab twice gives a list of all the different edm tools. Two are used most. The first is edmProvDump, which reports the CMSSW version used to create the file. This should be the version you use for any of the analysis. The other is edmDumpEventContent, which lists the types of data contained in the file. This is explained in more detail in the analyzer section.
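Picking the version out of a long edmProvDump printout can be sketched as below. This assumes only that release names appear in the output in the usual CMSSW_#_#_# form (optionally with a suffix such as _patch1). The exact layout of the edmProvDump output is not assumed, and the sample text is made up for illustration.

```python
import re

# Release names such as CMSSW_7_4_2 or CMSSW_7_4_6_patch1.
VERSION_PATTERN = re.compile(r"CMSSW_\d+_\d+_\d+(?:_[A-Za-z0-9]+)*")


def find_cmssw_versions(text: str) -> list:
    """Return the distinct release names in text, in order of first appearance.

    Hypothetical helper for illustration; it is not part of CMSSW.
    """
    versions = []
    for name in VERSION_PATTERN.findall(text):
        if name not in versions:
            versions.append(name)
    return versions


if __name__ == "__main__":
    # Made-up sample text; real edmProvDump output looks different.
    sample = "step A: CMSSW_7_4_2\nstep B: CMSSW_7_4_6_patch1\nstep C: CMSSW_7_4_2\n"
    print(find_cmssw_versions(sample))
```

In practice the text would come from a saved printout, for example one produced with edmProvDump file.root > prov.txt and then read in Python.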

Another option is to use a built-in CMSSW analyzer. An easy implementation of one is in this Python script. There are two options for analyzers: EventContentAnalyzer and DumpFEDRawDataProduct (source code)**. The content analyzer behaves similarly to edmDumpEventContent, and the raw data product analyzer just prints the raw data in hex.

The edm tools should always work for global ROOT files but may not work for local ones. For recent local runs, edmProvDump does not work and so cannot give the exact version; any CMSSW_7_#_# release should work fine.

**The source code for DumpFEDRawDataProduct is a simple one-page analyzer, because it only prints some data to the screen, so it is a very good reference. The site also provides the source code for pretty much all the analyzers. It may be quite complicated, but it is very useful.


Drell-Yan Process





Setting up an LPC area for Shears production

-- DanielArcaro - 28 Oct 2015

Topic revision: r4 - 17 Jan 2019 - DanielArcaro