Analysis Open Space


This is a space for discussing questions related to the analyses developed in the SPRACE group. The goals and general guidelines are:

  • Bring to the group problems of a technical nature (CMSSW, ROOT, condorg, crab, etc.) or of a physics nature (theoretical questions)
  • Everyone is invited to answer or give an opinion on the problems raised
  • Questions and answers should be kept on the page for future reference
  • Put your name in parentheses when asking a question or giving an answer
Create a new question with ---+++ inside the corresponding section (CMSSW, ROOT, grid, ...) so that it appears in the table of contents at the top of the page (as in Example 1 below).

Example 1: How do I use LaTeX in ROOT?

(Fulano) I need to put a LaTeX symbol in the title of a plot. How do I do it?

(Cicrano) Just use # before the command (instead of \)

(Beutrano) For example ->SetTitle("#theta (degrees)")

(Fulano) OK, solved


How to skip bad files in a CMSSW job

(Thiago) Sometimes a CMSSW job runs over hundreds of input files and one of them is bad, corrupted, or cannot be opened. To keep the job from crashing, you can add the following to your configuration:

process.source.skipBadFiles = cms.untracked.bool( True )

Remember to check which files are bad and take action afterwards!

How to set up debugging symbols in CMSSW

(Thiago) The best thing you can do when you encounter segmentation faults is to add debugging symbols to your compiled code. To do this, add the following line to the BuildFile of the package that is crashing:

<flags CXXFLAGS="-O0 -g3 -fno-inline"/>

Then, recompile your package and re-run. This time, the stack trace should tell you a line number in one of your files where the error is occurring.

How to set up random seeds in the Python config file

(Thiago) Add the following lines to your file:

import uuid
import random
# Seed Python's generator from a fresh UUID so that every job draws different seeds
x = uuid.uuid4()
random.seed(x.int)

and then, after you load the RandomNumberGeneratorService (usually with something like process.load('Configuration.StandardSequences.Services_cff') ), add:

randService = process.RandomNumberGeneratorService
# Give every PSet of the service its own random initial seed
for param in [name for name in randService.parameterNames_()
              if isinstance(getattr(randService, name), cms.PSet)]:
    getattr(randService, param).initialSeed = random.randint(1, 100000)
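
The recipe above draws one random initialSeed per PSet of the service. As a minimal standalone sketch of the same idea (no CMSSW needed), the helper below seeds a private generator from a UUID and hands out one seed per name. The function make_seeds and the module names in the example call are illustrative assumptions, not part of CMSSW.

```python
import random
import uuid


def make_seeds(module_names, low=1, high=100000, job_uuid=None):
    """Return {module_name: seed}, drawn from a generator seeded by a UUID."""
    if job_uuid is None:
        job_uuid = uuid.uuid4()  # a fresh UUID per job gives different seeds per job
    rng = random.Random(job_uuid.int)  # private generator, global state untouched
    return {name: rng.randint(low, high) for name in module_names}


# Example names only; use whatever PSets your RandomNumberGeneratorService holds
seeds = make_seeds(["generator", "g4SimHits", "VtxSmeared"])
for name, seed in seeds.items():
    print(name, seed)
```

Passing an explicit job_uuid makes the seeds reproducible, which can help when a single job has to be rerun for debugging.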

How to find the corresponding Monte Carlo of a dataset

(Caio) I need to find the MC corresponding to a specific dataset. This MC must simulate the detector conditions during data taking, as well as the physics of the collision. How do I find it? In this specific case the dataset is:

/MinimumBias/Run2010A-Apr21ReReco-v1/RECO with CMSSW_4_2_1_patch1 and global tag FT_R_42_V10A.

This link has information about the 2010 data and this other one about the MC samples, but how do I relate the data to the MC? Which MC corresponds to which data? Does it have to do with the season of the year?

Finding datasets

(Caio) In CMS, datasets are looked up on the DAS website, which is a better version of DBS. Examples of DAS queries for dataset searches:

  • dataset = /Min*Bias*TuneZ2*7TeV*pythia6*Summer11*AODSIM (the star is the wildcard character, e.g. all of these match "heavy_ion": "heavy_ion", "h*y_ion", "heav*y_ion", "heavy_ion*")
  • dataset = /MinimumBias/Run2010A*AOD
  • block = /MinimumBias/Run2010A-Apr21ReReco-v1/AOD#f9fe2e80-703d-11e0-9135-003048f1c5d0
  • release dataset = *7TeV*pythia*GEN-SIM-RECO (shows all CMSSW releases that have this dataset)
  • dataset release = CMSSW_4_2_1_patch1 (shows all the datasets that have this CMSSW release)
  • dataset site = T2_BR_SPRACE (shows all datasets that are available at SPRACE)
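
To get a feeling for how the star wildcard selects names, here is a small sketch that uses Python's fnmatch module as a stand-in. This is an assumption for illustration: the matching done by DAS itself may differ in details. The toy names are made up, and the dataset names are the ones quoted on this page.

```python
from fnmatch import fnmatchcase


def select(names, pattern):
    """Return the names that match a shell-style wildcard pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]


# Toy names, only to illustrate how the star behaves
toy_names = ["heavy_ion", "heavy_ion_v2", "light_ion"]
for pattern in ["heavy_ion", "h*y_ion", "heav*y_ion", "heavy_ion*"]:
    print(pattern, "->", select(toy_names, pattern))

# Dataset names quoted on this page
datasets = [
    "/MinimumBias/Run2010A-Apr21ReReco-v1/AOD",
    "/MinimumBias/Run2010A-Apr21ReReco-v1/RECO",
]
print(select(datasets, "/MinimumBias/Run2010A*AOD"))
```

Note that the star also matches an empty string, which is why "heav*y_ion" still selects "heavy_ion".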

How to create my own track collection?

(Caio) I have a RAW-RECO data file and I would like to create a track collection like "generalTracks", but something along the lines of a "myTracks" collection that contains only the tracks passing a certain set of requirements. I would like to save that "myTracks" collection in another data file to be run over with cmsRun. How can I do that?

(Thiago) Use the following EDFilter, with the parameters you want:

process.myTracks = cms.EDFilter("RecoTrackSelector",
    src = cms.InputTag("generalTracks"),
    maxChi2 = cms.double(10000.0),
    tip = cms.double(120.0),
    minRapidity = cms.double(-5.0),
    lip = cms.double(300.0),
    ptMin = cms.double(0.1),
    maxRapidity = cms.double(5.0),
    quality = cms.vstring('loose'),
    algorithm = cms.vstring(),
    minHit = cms.int32(3),
    min3DHit = cms.int32(0),
    beamSpot = cms.InputTag("offlineBeamSpot")
)

and add an OutputModule to save your new tracks into the output file.

process.out = cms.OutputModule("PoolOutputModule",
    fileName = cms.untracked.string('patTuple.root'),
    outputCommands = cms.untracked.vstring(
        'keep *_myTracks_*_*',
    )
)

How to generate events from MG5 gridpacks

This should be possible to perform anywhere that has cvmfs: lxplus, access, etc.

(Breno) The first step is to create the gridpack file. An MG5 gridpack is a compressed file that contains all the information about the process of interest that is needed to run the simulation. It is built from a set of cards that summarize the parameters of the particles in the model (customization card), the parameters of the run (run card), the parameters of the process (process card), and others. Some example cards (using processes relevant to the disappearing tracks analysis) are available in the attached files AMSB_chargino_M700GeV_ctau100cm_customizecards.dat, AMSB_chargino_M700GeV_ctau100cm_proc_card.dat and AMSB_chargino_M700GeV_ctau100cm_run_card.dat. The card naming should follow the pattern PROCESS_NAME_customizecards.dat, PROCESS_NAME_proc_card.dat, PROCESS_NAME_run_card.dat, etc.
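
The naming pattern can be checked before launching anything. The sketch below builds the expected card names for a process and reports which ones are missing from a list of files. The helpers card_names and missing_cards are hypothetical, written only for this illustration, and they cover just the three card kinds named above.

```python
CARD_KINDS = ("customizecards", "proc_card", "run_card")


def card_names(process_name, kinds=CARD_KINDS):
    """Expected card file names for a process, following PROCESS_NAME_<kind>.dat."""
    return [f"{process_name}_{kind}.dat" for kind in kinds]


def missing_cards(process_name, available_files, kinds=CARD_KINDS):
    """Expected cards that are not present in available_files."""
    available = set(available_files)
    return [name for name in card_names(process_name, kinds) if name not in available]


process = "AMSB_chargino_M700GeV_ctau100cm"
print(card_names(process))
# Pretend only the run card has been written so far
print(missing_cards(process, [process + "_run_card.dat"]))
```

In practice available_files could come from os.listdir on the cards folder.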

The easiest way I found to create the cards is to download an MG5 version to your machine and generate the processes there. MG5 needs a run card, but it provides a template. It also provides a template for the param card (for the customizecards, just write a script that extracts the information from the param card and rewrites it in the customize card syntax). The process card simply holds the model and the commands used to generate the processes (generate pp > Z, add process pp > Z, etc.).

To create the compressed gridpack file, the genproductions repository is needed. To get it, run the following outside of a cmsenv (outside of any CMSSW_X_Y_Z folder!), then go into the MadGraph5_aMCatNLO folder, which is where we will be working:

git clone
cd genproduction/bin/MadGraph/MadGraph5_aMCatNLO

Create a folder inside the cards folder to store the gridpack cards:

mkdir cards/exampleFolder
cp PROCESS_NAME_*.dat cards/exampleFolder

And produce the gridpack file with the following line:

./ PROCESS_NAME cards/exampleFolder

The script above executes MG5 and produces the gridpack file. The first argument, PROCESS_NAME, is the name of the gridpack file and of the folder containing some information about the MG5 process (only needed for debugging, but it is interesting to take a look). The second argument is the folder that contains the gridpack cards to be used. It might take a while for the .tar.xz file to be generated.

With the gridpack file in hand, it is possible to run a CMSSW hadronizer file that takes the gridpack as input through the ExternalLHEProducer module. This should all be executed inside a CMSSW_X_Y_Z/src folder after running the cmsenv command. It would be something like:

externalLHEProducer = cms.EDProducer("ExternalLHEProducer",
    args = cms.vstring('PROCESS_NAME_slc7_amd64_gcc900_CMSSW_12_0_2_tarball.tar.xz'), # path to gridpack file -> if it is in the same folder as the hadronizer this line is ok
    nEvents = cms.untracked.uint32(10), # number of events to be generated -> should never be smaller than number of events in config file that will run the GEN step in CMSSW!
    numberOfParameters = cms.uint32(1), # number of arguments (don't know why more than one should be used)
    outputFile = cms.string('cmsgrid_final.lhe'), # name of output .lhe file for the module itself
    scriptName = cms.FileInPath('GeneratorInterface/LHEInterface/data/') # script inside CMSSW that will run the events generation
)

For a gridpack file that is saved in EOS use the following (highly recommended, since each compressed file can be on the order of 100 MB and the lxplus area does not have much space available; the example uses the disappearing tracks gridpack):

externalLHEProducer = cms.EDProducer("ExternalLHEProducer",
    args = cms.vstring('root://'), # path to gridpack file in EOS
    nEvents = cms.untracked.uint32(10), # number of events to be generated -> should never be smaller than number of events in config file that will run the GEN step in CMSSW!
    numberOfParameters = cms.uint32(1), # number of arguments (don't know why more than one should be used)
    outputFile = cms.string('cmsgrid_final.lhe'), # name of output .lhe file for the module itself
    scriptName = cms.FileInPath('GeneratorInterface/LHEInterface/data/') # script inside CMSSW that will run the events generation
)

An example of a complete hadronizer (again using the disappearing tracks events) is at

To create the config file that will be executed with cmsRun, the following is an example:

 \
--fileout file:AMSB_chargino700GeV_ctau100cm_step1.root \
--mc --eventcontent RAWSIM \
--customise Configuration/DataProcessing/Utils.addMonitoring,SimG4Core/CustomPhysics/Exotica_HSCP_SIM_cfi \
--datatier GEN-SIM --conditions 124X_mcRun3_2022_realistic_v12 \
--beamspot Realistic25ns13p6TeVEarly2022Collision --step LHE,GEN,SIM --geometry DB:Extended \
--era Run3 \
--python_filename \
--no_exec -n 10

where, line by line, it does the following:

  • Executes the script, taking the hadronizer as input;
  • Sets the name of the output ROOT file;
  • Activates the mc flag and sets the event content of the ROOT file to RAWSIM;
  • Customizes the config file (needed for the disappearing tracks simulation, which also needs the file AMSB_chargino_700GeV_ctau100cm.slha to work properly; most probably not needed for the majority of other processes);
  • Sets the data tier (essentially the steps being executed) and the conditions (the global tag, which defines the conditions of the detector and the setup in a given run period);
  • Sets the beamspot, the simulation steps (LHE creates the .lhe file, which is not written out in this case; GEN and SIM are performed in the same config file) and the detector geometry;
  • Sets the era;
  • Sets the name of the output config file;
  • Tells the script not to execute the config file, and sets the number of events to produce.

Finally, the created config file can be executed with cmsRun.


Changing the individual colors of entries in a TLegend

(Thiago) If you want to change the individual colors of entries in a TLegend (for matching the colors of the objects they're related to), use the following recipe:

// Say that you have a valid pointer to the TLegend
// (the address below is only a placeholder, use your own pointer)
TLegend* leg = (TLegend*)0x0000000106bcfe20;
// Get the list of entries
TList* list = leg->GetListOfPrimitives();
// Get each entry individually
TLegendEntry* l1 = (TLegendEntry*)list->At(0);
TLegendEntry* l2 = (TLegendEntry*)list->At(1);
// Now you can set text attributes, e.g. the text color of each entry
l1->SetTextColor(kRed);
l2->SetTextColor(kBlue);

N-dimensional histograms with ROOT

(Caio) I know it is possible to make histograms with up to 10 dimensions in ROOT. How do I declare one and manipulate its content?

(Caio) N-dimensional histograms are declared using THnSparse as follows:

    //                 {qT, qL, kT,  Nch}
    Int_t bins[4] =    {40, 40, 100, 150};
    Double_t xmin[4] = {0., 0., 0.,  0.};
    Double_t xmax[4] = {2., 2., 5.,  150.};
    THnSparse *FourDHistogram = new THnSparseF("4d-hist", "4d-hist", 4, bins, xmin, xmax);
    FourDHistogram->Sumw2();

The call to Sumw2() right after the histogram is created makes it store the statistical errors. Although an N-dimensional histogram is useful, it uses a lot of memory. If its size goes beyond 4 GB (that happened to me with a 5D histogram), your program will crash. The best way to manipulate n-tuples of data is with a TTree.
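
To see how quickly the memory grows with the number of dimensions, the sketch below counts the cells of the 4D binning above as if every bin were filled. The assumptions are loud ones: one underflow and one overflow bin per axis, 4 bytes per cell (a float), and no bookkeeping overhead. THnSparse only allocates the bins that are actually filled, so this is an upper estimate rather than what ROOT will really use.

```python
from math import prod


def dense_bins(bins):
    """Number of cells with one underflow and one overflow bin on every axis."""
    return prod(n + 2 for n in bins)


def dense_size_mb(bins, bytes_per_bin=4):
    """Size in MB if every cell were allocated (assumes 4-byte floats)."""
    return dense_bins(bins) * bytes_per_bin / 1e6


bins = [40, 40, 100, 150]  # {qT, qL, kT, Nch} from the example above
print(dense_bins(bins), "cells")
print(round(dense_size_mb(bins), 1), "MB")

# A hypothetical fifth axis with 100 bins multiplies the count by 102
print(round(dense_size_mb(bins + [100]) / 1e3, 1), "GB")
```

Each extra axis multiplies the cell count by its number of bins plus two, which is how a 5D histogram can end up in the GB range.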

Two pads in a canvas without the white space

(Angelo) I have seen cases in ROOT where two pads (one above the other) appear in a single canvas, but without the white line that appears between two pads when one uses, for example:


For example: the upper pad would show two histograms (data/MC), while the lower one would show something like "(Data - MC)/sigma". I believe this is not a matter of using canvas->Divide(1,2). However, there must be some way to say where each pad begins and ends. Any ideas?

(Caio) I was running some tests and found this option:

// make your canvas
TCanvas *canvas = new TCanvas();
// then create two pads
TPad *pad1 = new TPad();
TPad *pad2 = new TPad();
// and draw your pads inside the canvas
// the pads will cover the whole canvas, so you need to resize them with the mouse
// note that pad2 will end up on top of pad1
// then draw your main plot in pad1
// and the residuals plot in pad2
// then adjust with the mouse so that pad2 sits on top of the x axis of pad1

This plot is an example of how things can turn out, and this is the corresponding ROOT file. It is a rather crude solution, but it works. Any better ideas?

(Angelo) After testing your method, I found a new way that does everything automatically, without needing the mouse. Just give the corresponding pad dimensions using TPad(), together with functions such as SetTopMargin() and SetBottomMargin():

TCanvas *canvas = new TCanvas();
// Here you declare the dimensions of the upper pad, for example.
// The correct order is TPad("", "", xMin, yMin, xMax, yMax)
TPad *pad1 = new TPad("pad1","",0.,0.3,1.,1.);
// Remove the white space at the bottom of the upper pad.
pad1->SetBottomMargin(0.);
// If you prefer, remove the label and the title of the "x" axis of the upper plot.
// Here the lower pad is declared.
// Note that the vertical ("y") extent of pad2 ends where pad1 begins.
TPad *pad2 = new TPad("pad2","",0.,0.,1.,0.3);
// Remove the white space at the top of the lower pad.
pad2->SetTopMargin(0.);
// Tell ROOT where the plot in the lower pad starts.
// This is important so that the "x" axis label shows up and is not cut off.
pad2->SetBottomMargin(0.3);  // example value only, tune it for your plot
// You may not be able to see the label of the plot in pad2, because its size
// may automatically be "0".
// If that happens, set the label and title sizes explicitly on the x axis of the lower plot.

How to build histograms with non-uniform bins?

(Angelo) This is an example of how to plot histograms with variable bins (Franciole's example):

#include <iostream>
#include "TH1F.h"

void variable_bin(){

  //histo with variable size bins

  //# of bins = 3
  //(const, so that the array of edges below has a compile-time size)
  const int nbins = 3;

  //# of edges. Includes the lowest and highest
  const int nedges = nbins+1;

  //Defines edges
  float xbins[nedges] = {0.0,1.0,3.0,6.0};

  //creates histo
  TH1F *hvar = new TH1F("hvar","hvar title",nbins,xbins);

  //Writes width of all bins
  //Note bin zero is the underflow bin (see ROOT documentation)
  //the width of the first bin will be equal to 1
  std::cout << "first bin size:  " << hvar->GetBinWidth(1) << std::endl;
  std::cout << "second bin size: " << hvar->GetBinWidth(2) << std::endl;
  std::cout << "third bin size:  " << hvar->GetBinWidth(3) << std::endl;
}
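
The widths printed by the macro follow directly from the list of edges. As a quick cross-check without ROOT, the sketch below computes them in plain Python. The helper bin_widths is a hypothetical name, and the check that the edges are increasing is a sanity test added for this illustration.

```python
def bin_widths(edges):
    """Widths of the bins defined by a list of increasing edges."""
    pairs = list(zip(edges, edges[1:]))
    if any(high <= low for low, high in pairs):
        raise ValueError("bin edges must be strictly increasing")
    return [high - low for low, high in pairs]


edges = [0.0, 1.0, 3.0, 6.0]  # same edges as xbins in the macro
widths = bin_widths(edges)
print(len(widths), "bins with widths", widths)
```

N edges always define N-1 bins, which is why the macro needs nedges = nbins+1.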


First login to access machine

To log on, use ssh

 ssh <username> 

Set up the CMS VO certificate according to instructions here. Verify it all works by executing

voms-proxy-init --voms cms

To navigate through the T2_BR_SPRACE storage, use the srmls command

srmls srm://<username>

More useful srm commands here.

Using condorg

(Franciole) We have been using the cluster with condorg for part of our analyses. The jobs range from ntuple production and small MC productions to private ROOT code and so on. However, each of us has a private solution to our own problem. Could we share these solutions? Would that help in the near future?

(Caio) I create a TTree with cmsRun and then run ROOT over it using condor. This is how to do it:

Make a compressed tarball of the area where your ROOT macros are. The TTree may be in this directory or in the storage element.

tar -czvf CMSSW_3_6_2.tgz CMSSW_3_6_2/

then get a proxy

grid-proxy-init -debug -verify

voms-proxy-init -voms cms

and submit the job with

condor_submit condor_run

The script I pass to condor to run ROOT is this (see this link for useful information about bash programming). This is the condor configuration file (it might be obsolete, not sure).

(Cesar) I use a script similar to Caio's to run cmsRun with condor_g (condor_test). I tested it a couple of days ago and it worked with CMSSW_4_2_5. In this case the configuration file reads a .root file from dCache. I could not get it to work when the .root file is inside the tarball, that is, in one of the CMSSW subdirectories. (Does anyone have any idea? I have already tried $WORKING_DIR/CMSSW_4_2_5/src/flatTuple/patTuple_PATandPF2PAT.root and $IWD/CMSSW_4_2_5/src/flatTuple/patTuple_PATandPF2PAT.root, given that the path on access, for example, would be /home/bernardes/CMSSW_4_2_5/src/flatTuple/patTuple_PATandPF2PAT.root.) The solution for now was to copy the file to my directory in dCache, using:

srmcp -2 file:///CMSSW_4_2_5/src/flatTuple/patTuple_PATandPF2PAT.root srm:// st/patTuple_PATandPF2PAT.root

Opening files in dcap

(Caio) It is possible to open a file in dcap by passing it to ROOT as a parameter. The syntax (thanks to Marco) is given in the following example, where the file 7TeV_Jul16th_Tree_All+-Tracks_2Nch150_noSplitTracks-beamSpot_Rxy.root is opened:

root dcap://
root [0] 
Attaching file dcap:// as _file0...
root [2] _file0->ls()
TDCacheFile**           dcap://
 TDCacheFile*           dcap://
  KEY: TTree    track_tree;2    track_tree
  KEY: TTree    track_tree;1    track_tree
  KEY: TH1F     Rxy;1   Ray
  KEY: TTree    ev_tree;1       ev_tree

To hadd several ROOT files located in the storage, make the list root_files_to_merge.txt and run:

source root_files_to_merge.txt

Useful srmcp commands

To request a proxy valid for seven days, execute:

voms-proxy-init --voms cms -valid 168:00
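
The value given to -valid is written as hours:minutes, so seven days becomes 168:00. A small sketch of that arithmetic, in case you want a different lifetime (the helper proxy_validity is a hypothetical name, not part of any VOMS tool):

```python
def proxy_validity(days=0, hours=0, minutes=0):
    """Format a lifetime as the hours:minutes string used with -valid."""
    total_minutes = (days * 24 + hours) * 60 + minutes
    if total_minutes <= 0:
        raise ValueError("the lifetime must be positive")
    return f"{total_minutes // 60}:{total_minutes % 60:02d}"


print(proxy_validity(days=7))               # seven days, as in the command above
print(proxy_validity(hours=12, minutes=30))
```

The server side may cap the lifetime it actually grants, so it is worth checking the result with voms-proxy-info.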

To get information about the proxy, execute:

voms-proxy-info --all

These are some useful srm commands:

  • Listing files at T2_SPRACE
 srmls srm://

  • Delete files at T2_SPRACE
 srmrm srm://

To delete several files at once, type the names in a list_to_remove.txt and use xargs command

xargs srmrm < list_to_remove.txt

  • Making a new directory at T2_SPRACE
srmmkdir srm://

  • Copying from T2_SPRACE to directory at access
srmcp srm:// file:////home/lagana/2439BAA1-FDDF-DF11-ACC0-001D096760DE.root

You can also use this: copy_from_SPRACE.txt

  • Copying from directory at access to T2_SPRACE
srmcp -2 file:///rootlogon.C srm://

  • Copying from castor to SPRACE:
srmcp --debug=true  -srm_protocol_version=2 srm:// file:////home/lagana/edmfile_9900.root

  • Listing files at T2_MIT
srmls -2 srm://

  • Copying 1 file from T2_MIT to T2_SPRACE
srmcp -2 srm:// srm://

  • Copying many files from T2_MIT to T2_SPRACE
for i in $(srmls -2 srm:// | cut -d " " -f 8 | cut -c115-); do echo "srmcp -2 srm://$i srm://$i";done

  • Saving srmls output in a format ready to be read by a cfg file
for i in $(srmls srm:// | cut -d " " -f 8 | cut -c29-); do echo "'$i',";done > to_cfg_file.out
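
The loop above wraps each path in single quotes and adds a trailing comma, so that the lines can be pasted into a list of file names. A minimal Python sketch of the same formatting step, assuming you already have the paths as a list; the helper to_cfg_lines and the paths in the example are made up for illustration.

```python
def to_cfg_lines(paths, strip_prefix=""):
    """Quote each path and add a trailing comma, optionally dropping a prefix."""
    lines = []
    for path in paths:
        if strip_prefix and path.startswith(strip_prefix):
            path = path[len(strip_prefix):]
        lines.append(f"'{path}',")
    return lines


# Hypothetical paths, standing in for the 8th field of the srmls output
paths = [
    "/pnfs/example/store/user/someone/file_1.root",
    "/pnfs/example/store/user/someone/file_2.root",
]
for line in to_cfg_lines(paths, strip_prefix="/pnfs/example"):
    print(line)
```

Stripping a named prefix is a little more robust than cutting a fixed number of characters, because it leaves paths with a different prefix untouched.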

Using CRAB to publish FullSim dataset in 2012

(César) We have some instructions to do FullSim using CRAB in FullSim2012

Analysis in 2012

(Thiago) I am going to prepare a Twiki about the analyses we will be setting up in 2012. It will reside in AnalysisSprace

Analysis in 2013

(Cesar) Detailed recipe for DoubleMuon HLT triggers in EXOTICA here .

(Jose) Pixel Detector Simulation. Instructions here.

(Angelo) AngeloLogBook: this is Angelo's log book of work.

Analysis in 2014

(Jose) Efficiency and fake rate of the cut-based electron ID for Run2. Link.


To receive an email notification every time someone modifies the Analysis Open Space, add your name to this area of WebNotify.

-- CaioLagana - 18 Jul 2011 -- ThiagoTomei - 09 May 2012

Topic attachments
Type | Attachment | History | Size | Date | Who | Comment
slha | AMSB_chargino_700GeV_ctau100cm.slha | r1 | 0.9 K | 2023-02-12 - 14:43 | UnknownUser |
txt | | r1 | 2.9 K | 2023-02-12 - 14:43 | UnknownUser |
txt | | r1 | 11.9 K | 2023-02-12 - 14:43 | UnknownUser |
dat | AMSB_chargino_M700GeV_ctau100cm_customizecards.dat | r1 | 7.1 K | 2023-02-12 - 14:43 | UnknownUser |
dat | AMSB_chargino_M700GeV_ctau100cm_proc_card.dat | r1 | 0.2 K | 2023-02-12 - 14:43 | UnknownUser |
dat | AMSB_chargino_M700GeV_ctau100cm_run_card.dat | r1 | 8.8 K | 2023-02-12 - 14:43 | UnknownUser |
txt | | r1 | 4.0 K | 2012-09-03 - 19:14 | CesarBernardes |
txt | | r1 | 3.5 K | 2012-09-03 - 19:54 | CesarBernardes |
EXT | condor_ROOT | r2 r1 | 0.6 K | 2011-10-21 - 18:38 | CaioLagana | script to run ROOT via condor
EXT | condor_run | r1 | 0.5 K | 2011-09-20 - 02:50 | CaioLagana | condor default script
EXT | condor_run_root | r1 | 0.5 K | 2011-10-21 - 18:39 | CaioLagana | script to run ROOT via condor
EXT | condor_test | r1 | 0.9 K | 2011-09-20 - 10:01 | CesarBernardes | condor_test
txt | copy_from_SPRACE.txt | r2 r1 | 0.7 K | 2016-04-09 - 21:55 | UnknownUser |
cfg | crab02.cfg | r1 | 1.1 K | 2012-09-03 - 18:47 | CesarBernardes |
cfg | crab03.cfg | r1 | 1.5 K | 2012-09-03 - 20:11 | CesarBernardes |
png | dois-pads-num-canvas.png | r1 | 10.0 K | 2011-07-23 - 17:26 | CaioLagana | Test for drawing two plots with no space between them
root | dois-pads-num-canvas.root | r1 | 19.0 K | 2011-07-23 - 19:18 | CaioLagana |
txt | | r1 | 0.4 K | 2015-02-17 - 20:31 | UnknownUser | Script to merge root files in the storage
Topic revision: r57 - 2023-02-12 - brenoorzari
