This is Angelo's log book of work on the Exotica analysis "Search for Heavy Resonances in the H-tagged Dijet Mass Spectrum in pp Collisions at 8 TeV". The log was started on Oct 28th, 2013. It is meant to be a summary of what has been written in CMS Analysis Notes 13-152 and 213/347, as well as a report of Angelo's related activities.
This is a search for massive resonances decaying into a pair of Higgs bosons, each reconstructed in a hadronic final state. The search is optimized for large resonance masses, for which the decay products of each Higgs boson merge into one massive jet. The QCD background is suppressed using jet substructure techniques. The data sample corresponds to an integrated luminosity of 19.6/fb of proton-proton collisions collected by the CMS experiment at the LHC in 2012 at a center-of-mass energy of 8 TeV.
This analysis searches for new particles predicted by physics scenarios Beyond the Standard Model. These particles could be either a spin-0 Radion or an excited state of the Graviton (spin 2). The X-particle (Radion or Graviton) decays to two Higgs bosons, each of which decays to a b-quark and a b-antiquark: the X → HH → 4b channel. For these heavy X-particles (decaying nearly at rest in the lab frame), the Higgs bosons will appear back-to-back and strongly boosted because of their very high momentum. The final state will have only two merged, fat jets (dijet state) instead of 4 separate jets, because the b-bbar pair from each Higgs boson will be merged into a single jet.
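To get a feeling for why the b-bbar pair merges, the sketch below uses the common rule of thumb ΔR ≈ 2m/pT for the angular separation of the decay products of a boosted particle. It assumes a Higgs mass of 125 GeV, an X-particle exactly at rest, and treats the full Higgs momentum as transverse; the function names are illustrative and are not taken from the analysis code.

```python
import math

HIGGS_MASS_GEV = 125.0  # assumed Higgs mass, inside the 110-135 GeV jet mass window


def higgs_momentum(resonance_mass_gev, higgs_mass_gev=HIGGS_MASS_GEV):
    """Momentum of each Higgs boson in X -> HH with X at rest (two-body decay)."""
    half = resonance_mass_gev / 2.0
    return math.sqrt(half**2 - higgs_mass_gev**2)


def bb_separation(resonance_mass_gev, higgs_mass_gev=HIGGS_MASS_GEV):
    """Rule-of-thumb separation of the two b quarks: Delta R ~ 2 m_H / pT.

    Assumption: the Higgs momentum is taken to be fully transverse.
    """
    return 2.0 * higgs_mass_gev / higgs_momentum(resonance_mass_gev, higgs_mass_gev)


for mass in (1000.0, 2000.0, 3000.0):
    print(f"M_X = {mass:.0f} GeV: p(H) ~ {higgs_momentum(mass):.0f} GeV, "
          f"Delta R(b, bbar) ~ {bb_separation(mass):.2f}")
```

The separation shrinks as the resonance mass grows, which is the regime where the two b-jets can no longer be resolved individually.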
The branching fraction of X → HH will be around 25%. H → b-bbar is the preferred decay since b is the most massive quark below one half of the Higgs mass. A collision happens only between the quarks or gluons inside the protons, which carry only a fraction of the total energy of the proton. Since this analysis uses data at sqrt(s) = 8 TeV, a reasonable effective energy is 3 TeV, so the energy spectrum of this analysis ranges up to 3 TeV.
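The statement about the effective energy can be made concrete with the standard relation between the partonic and the proton-proton centre-of-mass energies, sqrt(ŝ) = sqrt(x1 · x2 · s), where x1 and x2 are the momentum fractions carried by the two colliding partons. The short sketch below only illustrates this arithmetic; the function names are hypothetical.

```python
import math

SQRT_S_TEV = 8.0  # proton-proton centre-of-mass energy quoted in the text


def partonic_energy(x1, x2, sqrt_s=SQRT_S_TEV):
    """Centre-of-mass energy of the parton-parton system, sqrt(x1 * x2 * s)."""
    return math.sqrt(x1 * x2) * sqrt_s


def required_x_product(target_energy, sqrt_s=SQRT_S_TEV):
    """Product x1 * x2 needed to reach a given partonic energy."""
    return (target_energy / sqrt_s) ** 2


print(f"x1 * x2 needed for 3 TeV: {required_x_product(3.0):.3f}")
print(f"sqrt(s_hat) for x1 = x2 = 0.375: {partonic_energy(0.375, 0.375):.2f} TeV")
```

Reaching 3 TeV therefore requires both partons to carry a large fraction of the proton momentum (about 0.375 each in the symmetric case), which becomes increasingly rare at higher energies.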
First studies have been performed using only QCD background, with events generated by MadGraph5 interfaced with Pythia6 for showering and hadronization. These events include only QCD interactions, without electroweak bosons or top quarks. Since these events do not lead to a very precise background estimation, the background is estimated with a data-driven technique.
The event selections are enumerated as follows.
A possible discriminator between background and signal events is the N-subjettiness ratio τ21 = τ2/τ1: the smaller τ21 is, the closer the jet is to a dipole (rather than monopole) structure.
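For reference, the usual definition is τN = (1/d0) Σk pT,k · min(ΔR1,k, ..., ΔRN,k) with d0 = Σk pT,k · R0, where the sum runs over the jet constituents and the ΔR are distances to N candidate subjet axes. The sketch below evaluates this for axes that are given by hand; in a real analysis the axes come from a dedicated axis-finding step, which is not reproduced here. The toy constituents, the axes and the jet radius R0 = 0.8 are assumptions chosen purely for illustration.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in the (eta, phi) plane, with delta phi wrapped to [-pi, pi]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)


def tau_n(constituents, axes, r0):
    """N-subjettiness for the given axes.

    constituents: list of (pt, eta, phi); axes: list of (eta, phi); r0: jet radius.
    """
    d0 = sum(pt for pt, _, _ in constituents) * r0
    weighted = sum(
        pt * min(delta_r(eta, phi, a_eta, a_phi) for a_eta, a_phi in axes)
        for pt, eta, phi in constituents
    )
    return weighted / d0


# Toy two-prong jet (illustrative numbers only)
toy_jet = [(90.0, 0.20, 0.0), (10.0, 0.25, 0.0), (90.0, -0.20, 0.0), (10.0, -0.25, 0.0)]
R0 = 0.8  # assumed jet radius

tau1 = tau_n(toy_jet, [(0.0, 0.0)], R0)
tau2 = tau_n(toy_jet, [(0.20, 0.0), (-0.20, 0.0)], R0)
print(f"tau1 = {tau1:.4f}, tau2 = {tau2:.4f}, tau21 = {tau2 / tau1:.3f}")
```

A jet with two well-separated hard prongs is described much better by two axes than by one, so τ2 is much smaller than τ1 and τ21 is close to zero.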
B-tagging is a method used to identify jets originating from b-(anti)quarks, and is based on the lifetime of the decay products of the b-quark. Hadrons containing b-quarks frequently have a lifetime long enough to travel a measurable distance in the detector, producing a secondary vertex of charged tracks within the jet. This vertex is reconstructed using the adaptive vertex fitter in a cone of ΔR = 0.3 around the primary vertex. The secondary vertex is rejected if it is either too similar to the primary vertex or too far from it. The Combined Secondary Vertex (CSV) algorithm combines secondary vertices and lifetime information to construct a probability discriminator (between 0 and 1) that distinguishes b-quark jets from other jets, resulting in two "working points":
Two baseline b-tagging approaches have been investigated:
The background is estimated using a data-driven technique called the ABCD method. A sideband is defined with a different jet mass window cut on the second jet (the first jet remains in a 110 - 135 GeV window). Then, the background for n b-tags is estimated using the spectrum of n - 1 b-tags. With the mass and b-tag windows defined as in the table below, the background in the signal region D can be estimated as D = (C/A).B, that is, the ratio B/A measured in the mass sideband is applied to C.
Mass window | cut _n - 1_ | cut _n_ |
---|---|---|
70 - 110 GeV | A | B |
110 - 135 GeV | C | D |
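A minimal sketch of the ABCD arithmetic for the table above: the ratio of yields with n and n - 1 b-tags measured in the 70 - 110 GeV sideband (B/A) is applied to the n - 1 b-tag yield in the 110 - 135 GeV window (C). The function name and the simple Poisson error propagation are assumptions made for illustration, and the example yields are invented.

```python
import math


def abcd_estimate(n_a, n_b, n_c):
    """Background estimate in region D and its statistical uncertainty.

    n_a, n_b: sideband yields with n-1 and n b-tags;
    n_c: signal-window yield with n-1 b-tags.
    """
    if min(n_a, n_b, n_c) <= 0:
        raise ValueError("all three control regions need at least one event")
    estimate = n_c * n_b / n_a
    # Independent Poisson counts: relative uncertainties add in quadrature
    rel_err = math.sqrt(1.0 / n_a + 1.0 / n_b + 1.0 / n_c)
    return estimate, estimate * rel_err


# Invented yields, for illustration only
estimate, error = abcd_estimate(n_a=400, n_b=100, n_c=200)
print(f"Estimated background in D: {estimate:.1f} +/- {error:.1f}")
```

The estimate is only meaningful if the jet mass and the b-tag requirement are uncorrelated for the background, so that B/A is the same in both mass windows.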
Two assumptions were considered to use this method:
Double subjet b-tagging with CSV shows better discrimination than fat-jet b-tagging. Only at very high pT are fat-jet b-tagging and subjet b-tagging equally good. Therefore, different b-tagging cuts are chosen to be implemented on the four subjets, rather than on fat jets:
Categories |
---|
≥ 1 loose |
≥ 2 loose |
≥ 3 loose |
exactly 3 loose |
4 loose |
≥ 1 medium |
≥ 2 medium |
≥ 3 medium |
exactly 3 medium |
4 medium |
Studies with the Punzi significance showed that b-tagging strongly reduces the background, while keeping most of the signal events, when going from 1 to 4 b-tags. For loose b-tags, the significance increases the more b-tags are applied. For medium b-tags, requiring 4 tags reduces the signal so much that the selection becomes less efficient.
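For orientation, the Punzi figure of merit in its simplest form is ε_S / (a/2 + sqrt(B)), where ε_S is the signal efficiency, B the expected background yield and a the number of standard deviations corresponding to the desired significance. The sketch below compares two selections; the value a = 2 and all the efficiencies and yields are invented for illustration and are not results of this analysis.

```python
import math


def punzi_significance(signal_efficiency, n_background, n_sigma=2.0):
    """Punzi figure of merit: eff_S / (a/2 + sqrt(B))."""
    if n_background < 0:
        raise ValueError("background yield must be non-negative")
    return signal_efficiency / (n_sigma / 2.0 + math.sqrt(n_background))


# Invented numbers: a loose selection and a tighter one
loose = punzi_significance(signal_efficiency=0.5, n_background=100.0)
tight = punzi_significance(signal_efficiency=0.3, n_background=4.0)
print(f"loose: {loose:.4f}, tight: {tight:.4f}")
```

Because the denominator does not vanish when B goes to zero, a cut that removes the last few background events at a large cost in signal efficiency lowers the figure of merit, which matches the behaviour described for 4 medium b-tags.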
To test the effectiveness of N-subjettiness, a cut τ21 < 0.5 on both jets is applied to all the different b-tagging categories. Independently of the b-tagging cut, the N-subjettiness cut reduces the signal by a factor of 1.3 and QCD by roughly a factor of 3, so the two variables are considered uncorrelated.
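As a rough cross-check of what these two factors imply, the sketch below computes the change in the simple S/sqrt(B) measure when the signal is divided by 1.3 and the background by 3. S/sqrt(B) is used here only as an assumed large-background approximation; it is not the figure of merit used in the analysis and it does not apply where very few background events are left.

```python
import math

SIGNAL_REDUCTION = 1.3      # signal yield is divided by this factor (from the text)
BACKGROUND_REDUCTION = 3.0  # QCD yield is divided by this factor (from the text)


def s_over_sqrt_b_gain(signal_reduction, background_reduction):
    """Multiplicative change of S/sqrt(B) when S and B are divided by the given factors."""
    return math.sqrt(background_reduction) / signal_reduction


gain = s_over_sqrt_b_gain(SIGNAL_REDUCTION, BACKGROUND_REDUCTION)
print(f"S/sqrt(B) changes by a factor of {gain:.2f}")
```

In this approximation the cut improves S/sqrt(B) by about a third, which only helps where the background is still large.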
The categories used for limit setting (via CLs techniques) are:
In order to put exact limits on the production cross section of a heavy resonance, several feasibility studies have been done to get more accurate background estimations.
Using the fact that N-subjettiness and b-tagging are uncorrelated, a sideband is constructed from those variables. The signal region and sidebands are then defined as:
Region | Signal | B-tag sideband |
---|---|---|
Signal | 2 subjet b-tags, τ21 < 0.5 | 0 subjet b-tags, τ21 < 0.5 |
τ21 sideband 1 | 0.5 < τ21 < 0.75 | 0.5 < τ21 < 0.75 |
τ21 sideband 2 | τ21 > 0.75 | τ21 > 0.75 |
page 13 (3.4.1 N-Subjetiness and b-Tagging as Sideband): What is the meaning of "Jet 1 randomly chosen"?
At least 3 background events need to remain in the sideband for the ABCD method to work. Since too much signal is lost (< 20%) and since the sensitivity at higher resonance masses is better without N-subjettiness, this is not considered a useful method.
One way to improve the initial background estimation (with the ABCD method) is to vary the parameters on different jets, rather than using only one jet in the sideband. This estimation will work well enough if the correlation between those parameters is sufficiently small.
Accordingly, the signal region and sidebands are defined as:
Region | b-tags | Closure | Signal |
---|---|---|---|
Signal | 110 < massjet1 < 135 GeV, 0 subjet b-tags on jet 2 | 110 < massjet1 < 135 GeV, 1 subjet b-tag on jet 2 | 110 < massjet1 < 135 GeV, 2 subjet b-tags on jet 2 |
Low mass sideband | 70 < massjet1 < 110 GeV, 0 subjet b-tags on jet 2 | 70 < massjet1 < 110 GeV, 1 subjet b-tag on jet 2 | 70 < massjet1 < 110 GeV, 2 subjet b-tags on jet 2 |
High mass sideband | 135 < massjet1 < 150 GeV, 0 subjet b-tags on jet 2 | 135 < massjet1 < 150 GeV, 1 subjet b-tag on jet 2 | 135 < massjet1 < 150 GeV, 2 subjet b-tags on jet 2 |
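The region definitions in the table above can be written as a small lookup that assigns an event to a row (jet 1 mass window) and a column (number of subjet b-tags on jet 2). This is only a sketch: the function names are hypothetical, and the strict inequalities are copied from the table, so an event sitting exactly on a boundary (110 or 135 GeV) is assigned to no region here.

```python
def mass_region(jet1_mass_gev):
    """Map the jet 1 mass onto a row of the table, or None if outside all windows."""
    if 110.0 < jet1_mass_gev < 135.0:
        return "signal"
    if 70.0 < jet1_mass_gev < 110.0:
        return "low mass sideband"
    if 135.0 < jet1_mass_gev < 150.0:
        return "high mass sideband"
    return None


# Column of the table as a function of the number of subjet b-tags on jet 2
TAG_COLUMNS = {0: "b-tags", 1: "closure", 2: "signal"}


def classify(jet1_mass_gev, jet2_subjet_btags):
    """Return (row, column) for one event, or None if it falls outside the table."""
    row = mass_region(jet1_mass_gev)
    column = TAG_COLUMNS.get(jet2_subjet_btags)
    if row is None or column is None:
        return None
    return row, column


print(classify(125.0, 2))  # signal mass window, 2 subjet b-tags on jet 2
print(classify(90.0, 0))   # low mass sideband, 0 subjet b-tags on jet 2
```

Counting events per (row, column) cell then gives the yields needed to extrapolate from the sidebands and from the lower b-tag columns into the signal cell.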
First checks were done using 0+1 subjet b-tags (0 subjet b-tags on the second jet, 1 subjet b-tag on the first jet) to estimate 1+1 subjet b-tags. The estimates for 1+1 subjet b-tags agree reasonably well with the actual values. The estimate for the 2+1 subjet b-tags high mass sideband is more difficult to compare because of the lack of statistics.
It has been found that subjet b-tagging, in a boosted dijet topology, works very effectively as a background discriminator. Almost all of the QCD background is removed, while most of the signal is left intact. The categories with the highest efficiencies are 3 or 4 loose b-tags, and 3 or 4 medium b-tags. N-subjettiness applied to both jets (τ21 < 0.5) reduces signal events by a factor of 1.3 and QCD background events by a factor of 3, independently of the applied b-tagging cuts.
The first background estimation, in which an ABCD method was attempted using a jet mass sideband and a b-tagging sideband with the primary jet still "signal-tagged", failed because of the correlation between those variables. The second background estimation, using the uncorrelated N-subjettiness and b-tagging, failed due to the lack of statistics in the sideband region. A third method, in which jet mass and b-tagging are varied on different jets, seems much more successful so far.
This analysis has not yet been able to quantify this in actual numbers and uncertainties, but this will be done later, and the same method will be used in further analysis and in limit setting on the production cross section.
The Ntuples (from Tijs) are located in /store/cmst3/user/mgouzevi/HH4B/TIJS_TREES. They are:
/store/cmst3/user/mgouzevi/HH4B/TIJS_TREES/dijetWtag_Moriond_Data.root
/store/cmst3/user/mgouzevi/HH4B/TIJS_TREES/dijetWtag_Moriond_QCD500.root
/store/cmst3/user/mgouzevi/HH4B/TIJS_TREES/dijetWtag_Moriond_QCD1000.root
/store/cmst3/user/mgouzevi/HH4B/TIJS_TREES/dijetWtag_Moriond_HHPy61000.root
/store/cmst3/user/mgouzevi/HH4B/TIJS_TREES/dijetWtag_Moriond_HHPy61500.root
/store/cmst3/user/mgouzevi/HH4B/TIJS_TREES/dijetWtag_Moriond_HHPy62000.root
/store/cmst3/user/mgouzevi/HH4B/TIJS_TREES/dijetWtag_Moriond_HHPy62500.root
/store/cmst3/user/mgouzevi/HH4B/TIJS_TREES/dijetWtag_Moriond_HHPy63000.root
/store/cmst3/user/mgouzevi/HH4B/TIJS_TREES/dijetWtag_Moriond_Allsignal.root
The files can be copied locally with:
srmcp -2 srm://srm-eoscms.cern.ch:8443/srm/v2/server?SFN=/eos/cms/store/cmst3/user/mgouzevi/HH4B/TIJS_TREES/<root_file_name> file:///<root_file_name>
However, this is not recommended, since some of the files are larger than 1 GB. Instead, open the file remotely using its Physical File Name (PFN), like this:
TFile *file = TFile::Open("root://eoscms//eos/cms/store/cmst3/user/mgouzevi/HH4B/TIJS_TREES/dijetWtag_Moriond_Data.root");
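When several of the listed files have to be opened this way, the PFN can be built from the /store path by prepending the same prefix as in the line above. The helper below is a plain-Python sketch of that string manipulation only; the function name is hypothetical, and the prefix is simply copied from the TFile::Open example rather than taken from any site configuration.

```python
EOS_REDIRECTOR = "root://eoscms/"  # redirector prefix, copied from the example above
EOS_MOUNT = "/eos/cms"             # mount point that precedes the /store path


def store_path_to_pfn(store_path):
    """Turn a /store/... path into the PFN string passed to TFile::Open."""
    if not store_path.startswith("/store/"):
        raise ValueError(f"not a /store path: {store_path}")
    return f"{EOS_REDIRECTOR}{EOS_MOUNT}{store_path}"


NTUPLE_DIR = "/store/cmst3/user/mgouzevi/HH4B/TIJS_TREES"
for name in ("dijetWtag_Moriond_Data.root", "dijetWtag_Moriond_QCD500.root"):
    print(store_path_to_pfn(f"{NTUPLE_DIR}/{name}"))
```

The resulting strings can be passed to TFile::Open exactly as in the example above.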
-- Main.assantos - 2013-10-28