
Private data schema

Introduction

Depending on where you're running the analysis, you can find the ntuples in one of the following locations:

on LXPLUS:

/eos/atlas/atlascerngroupdisk/phys-exotics/jdm/ANA-EXOT-2024-01_dibjetISR/ntuples

or on AF UChicago:

/data/jiling/TLA/eos_ntuples
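
If the same analysis script is meant to run on both sites, one option is to probe the two locations and use whichever exists. The snippet below is a minimal sketch using only Julia's standard library; `find_ntuple_dir` and `NTUPLE_DIRS` are hypothetical names introduced here, not part of any analysis package.

```julia
# Candidate base directories quoted on this page.
NTUPLE_DIRS = [
    "/eos/atlas/atlascerngroupdisk/phys-exotics/jdm/ANA-EXOT-2024-01_dibjetISR/ntuples",  # LXPLUS
    "/data/jiling/TLA/eos_ntuples",                                                        # AF UChicago
]

# Return the first candidate directory that exists on this machine, or `nothing`.
# Hypothetical helper, for illustration only.
function find_ntuple_dir(candidates = NTUPLE_DIRS)
    idx = findfirst(isdir, candidates)
    return idx === nothing ? nothing : candidates[idx]
end
```

Calling `find_ntuple_dir()` on a machine with neither location mounted returns `nothing`, so the caller can fail with a clear message instead of a missing-file error later on.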

Directory structure

bash
$ pwd
/eos/atlas/atlascerngroupdisk/phys-exotics/jdm/ANA-EXOT-2024-01_dibjetISR/ntuples

$ tree -L 3
.
├── data22
│   ├── nv0.1
│   │   └── user.aikoulou.00440447.physics_TLA.r14488._24-05-08_tree.root
│   ├── nv1
│   │   └── user.sfranche.00440447.physics_TLA.r14488._nv1_tree.root
│   ├── nv2
│   │   └── user.sfranche.00440447.physics_TLA.r14488.nv2_tree.root
│   └── nv4
│       └── user.agekow.data22_13p6TeV.00440447.physics_TLA.recon.DAOD_TLA.r14488.nv4_tree.root
├── data23
│   ├── nv0.1
│   │   └── user.aikoulou.00452202.physics_TLA.r15316._24-05-08_tree.root
│   ├── nv3
│   │   ├── user.sfranche.00451866.physics_TLA.r15316.nv3_tree.root
│   │   └── user.sfranche.00452202.physics_TLA.r15316.nv3_tree.root
│   ├── nv4
│   │   ├── user.sfranche.00451866.physics_TLA.r15316.nv4_tree.root
│   │   └── user.sfranche.00452202.physics_TLA.r15316.nv4_tree.root
│   └── nv5
│       ├── user.lbozianu.00451866.physics_TLA.r15316.24-08-06_tree.root
│       └── user.lbozianu.00452202.physics_TLA.r15316.24-08-06_tree.root
├── mc23a
│   ├── backgrounds
│   │   ├── nv0.1
│   │   ├── nv1
│   │   ├── nv2
│   │   └── nv4
│   ├── Zprime_bb
│   │   ├── nv0.1
│   │   ├── nv1
│   │   └── nv2
│   └── Zprime_qq
│       ├── nv0.1
│       ├── nv1
│       ├── nv2
│       └── nv4
└── mc23d
    ├── backgrounds
    │   ├── nv3
    │   ├── nv4
    │   └── nv5
    ├── Zprime_bb
    │   ├── nv3
    │   └── nv4
    └── Zprime_qq
        ├── nv3
        └── nv4

Versions of private production

The production is also versioned. For example, a particular file under consideration might be:

bash
/eos/atlas/atlascerngroupdisk/phys-exotics/jdm/ANA-EXOT-2024-01_dibjetISR/ntuples/mc23a/Zprime_bb/nv2/user.sfranche.510392.MGPy8EG_S1_qqa_Ph25_mRp125_gASp1_qContentUDSC.e8514_s4162_r15315.nv2_tree.root/merged.root

This is a signal MC sample (Z → bb̄ with m_Z = 125 GeV) from production version nv2.
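
Since the campaign, sample type and production version are all encoded in the directory names, they can be recovered from a path programmatically. The following is a rough sketch based only on the directory layout shown above; `parse_ntuple_path` is a hypothetical helper, and the regular expressions are assumptions about the naming scheme (`data22`, `mc23a`, `nv2`, `nv0.1`, ...).

```julia
# Hypothetical helper: extract campaign, sample and production version from an ntuple path.
function parse_ntuple_path(path::AbstractString)
    parts = splitpath(path)
    ci = findfirst(p -> occursin(r"^(data|mc)\d+[a-z]?$", p), parts)  # e.g. data22, mc23a
    vi = findfirst(p -> occursin(r"^nv\d+(\.\d+)?$", p), parts)       # e.g. nv2, nv0.1
    (ci === nothing || vi === nothing) && return nothing
    # MC paths have a sample directory between campaign and version; data paths do not.
    sample = vi - ci == 2 ? parts[ci + 1] : nothing
    return (campaign = parts[ci], sample = sample, version = parts[vi])
end
```

For the file quoted above this yields `campaign = "mc23a"`, `sample = "Zprime_bb"` and `version = "nv2"`.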

We can inspect what's in this file (focusing on most relevant content):

julia
julia> ROOTFile("merged.root")
ROOTFile with 16 entries and 29 streamers.
merged.root
├─ duplicates (TTree)
│  ├─ "runNumber"
│  └─ "eventNumber"
├─ TreeAlgo_noCalib (TDirectory)
│  └─ nominal (TTree)
│     ├─ "runNumber"
│     ├─ "eventNumber"
│     ├─ "lumiBlock"
│     ├─ "⋮"
│     ├─ "ph_SumPtChargedPFOPt500"
│     ├─ "ph_SumPtTrkPt500"
│     └─ "ph_TrackWidthPt1000"
├─ MetaData_EventCount (TH1D)
├─ MetaData_SumW (TH1D)
...
  • TreeAlgo_noCalib/nominal is the output event data from xAODAnaHelpers (xAH). The xAH config files used for nv2 can be found at this gitlab commit

  • The MetaData_SumW histogram is used for tracking sum of weights information needed for correct scaling of MC events.
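
To illustrate how the sum of weights typically enters the MC scaling, here is a minimal sketch of a per-event normalisation weight. It assumes the usual convention of cross-section times luminosity divided by the sum of generator weights; the function name and the `xsec_pb` and `lumi_ipb` inputs are assumptions introduced here, and this page does not state which bin of MetaData_SumW holds the total.

```julia
# Per-event normalisation weight for MC (sketch, not the analysis implementation).
# mc_event_weight : generator weight of the event (the mcEventWeight branch)
# sumw            : total sum of generator weights of the sample (from MetaData_SumW)
# xsec_pb         : sample cross-section in pb (assumed input; filter efficiency
#                   and k-factor, if any, must be folded in by the caller)
# lumi_ipb        : integrated luminosity in pb^-1 (assumed input)
mc_norm_weight(mc_event_weight, sumw; xsec_pb, lumi_ipb) =
    mc_event_weight * xsec_pb * lumi_ipb / sumw

# Toy check: the scaled weights of a full sample sum to xsec * lumi.
weights = [1.0, -1.0, 2.0, 2.0]
total = sum(mc_norm_weight.(weights, sum(weights); xsec_pb = 10.0, lumi_ipb = 3.0))
```

In a real workflow other per-event factors such as weight_pileup would multiply this weight as well.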

Event pre-selection (nv2)

The following common selections are built into the private production:

  • HLT trigger (HLT_g35_tight_3j25_pf_ftf_PhysicsTLA_L1EM22VHI for 2022, HLT_g35_tight_3j25_pf_ftf_PhysicsTLA_L1eEM26M for 2023)

  • pt and eta selection on photons

  • Tight ID criteria for photons, and isolation requirement in 2023 data (MC23d).

  • pt and eta selection on jets

  • overlap removal of jets within dR<0.4 of "good" photons (ID + Iso if relevant)
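
The overlap removal in the last item can be sketched as follows. This is an illustration of the dR < 0.4 criterion only, not the code used in the production; it assumes dR is computed from pseudorapidity and azimuthal angle, and all function names are hypothetical.

```julia
# Wrap an azimuthal difference into [-π, π].
delta_phi(phi1, phi2) = rem2pi(phi1 - phi2, RoundNearest)

# Angular distance in the (eta, phi) plane.
delta_r(eta1, phi1, eta2, phi2) = hypot(eta1 - eta2, delta_phi(phi1, phi2))

# Indices of jets that are at least `dr_min` away from every "good" photon.
function jets_after_overlap_removal(jet_eta, jet_phi, ph_eta, ph_phi; dr_min = 0.4)
    keep = Int[]
    for i in eachindex(jet_eta, jet_phi)
        far_from_all = all(delta_r(jet_eta[i], jet_phi[i], e, p) >= dr_min
                           for (e, p) in zip(ph_eta, ph_phi))
        far_from_all && push!(keep, i)
    end
    return keep
end
```

For example, with jets at (eta, phi) = (0.0, 0.0) and (1.0, 3.0) and one photon at (0.1, 0.1), only the second jet survives. Note that from nv8 onwards the overlap removal is no longer applied in the ntuple production (see the change log), so a step like this would have to be done at analysis level.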

Event Content

We can load the data from the nominal tree:

julia
julia> LazyTree("merged.root", "TreeAlgo_noCalib/nominal")
 Row │ jet_rapidity     ph_e277          ph_EnergyPerSam  coreFlags  jet_Had 
     │ SubArray{Float3  SubArray{Float3  SubArray{Vector  UInt32     SubArra 
─────┼────────────────────────────────────────────────────────────────────────
 1   │ [0.957, 1        [73600.0]        Vector{Fl        0          [5, 5]  
 2   │ [0.33, 2.        [35100.0]        Vector{Fl        0          [5, 0,  
 3   │ [-0.588,         [32000.0]        Vector{Fl        0          [5, 5,  
 4   │ [-0.634,         [27400.0]        Vector{Fl        0          [5, 0,  
 5   │ [-1.17, 0        [30900.0]        Vector{Fl        0          [4, 5,  
 6   │ [-0.716,         [31100.0]        Vector{Fl        0          [5, 5,  
 7   │ [-0.857,         [45900.0]        Vector{Fl        0          [5, 5,  
 8   │ [1.49, -0        [33200.0]        Vector{Fl        0          [5, 0,  
 9   │ [0.805, 1        [51300.0]        Vector{Fl        0          [5, 0,  
 10  │ [-1.35, -        [37200.0]        Vector{Fl        0          [5, 5,  
 11  │ [0.121, -        [34500.0]        Vector{Fl        0          [5, 5,  

                                             67 columns and 17232 rows omitted

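Overall, there are 107 columns in the nv8 MC schema (102 shared with data plus 5 MC-only); the nv8 data schema has 122 (102 shared plus 20 data-only):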

102 column names in nv8 shared by both MC and data:
julia
NPV
actualInteractionsPerCrossing
averageInteractionsPerCrossing
bcid
coreFlags
correctedActualMu
correctedAndScaledActualMu
correctedAndScaledAverageMu
correctedAverageMu
eventNumber
jet_CentroidR
jet_E
jet_EMFrac
jet_FracSamplingMax
jet_FracSamplingMaxIndex
jet_GN1
jet_GN1_pb
jet_GN1_pc
jet_GN1_pu
jet_GhostMuonSegmentCount
jet_HECFrac
jet_HadronConeExclTruthLabelID
jet_JVF
jet_Jvt
jet_LowEtConstituentsFrac
jet_NumTrkPt1000
jet_NumTrkPt500
jet_SumPtTrkPt1000
jet_SumPtTrkPt500
jet_TrackWidthPt1000
jet_TrackWidthPt500
jet_Width
jet_constScaleM
jet_constScalePt
jet_emScaleM
jet_emScalePt
jet_eta
jet_etaJESScaleEta
jet_etaJESScaleM
jet_etaJESScalePhi
jet_etaJESScalePt
jet_fastDIPS
jet_fastDIPS_pb
jet_fastDIPS_pc
jet_fastDIPS_pu
jet_gscScaleEta
jet_gscScaleM
jet_gscScalePhi
jet_gscScalePt
jet_insituScaleEta
jet_insituScaleM
jet_insituScalePhi
jet_insituScalePt
jet_jmsScaleM
jet_jmsScalePt
jet_onoffScaleEta
jet_onoffScaleM
jet_onoffScalePhi
jet_onoffScalePt
jet_originConstitScaleM
jet_originConstitScalePt
jet_phi
jet_pileupScaleEta
jet_pileupScaleM
jet_pileupScalePhi
jet_pileupScalePt
jet_pt
jet_rapidity
lumiBlock
njet
nph
ph_ActiveArea
ph_EMFrac
ph_EnergyPerSampling
ph_FracSamplingMax
ph_Jet_eta
ph_Jet_m
ph_Jet_phi
ph_Jet_pt
ph_Jvt
ph_N90Constituents
ph_NumTrkPt1000
ph_SumPtChargedPFOPt500
ph_SumPtTrkPt500
ph_TrackWidthPt1000
ph_deltae
ph_e277
ph_eratio
ph_eta
ph_f1
ph_m
ph_phi
ph_pt
ph_radhad
ph_radhad1
ph_reta
ph_rphi
ph_topoetcone40
ph_weta2
ph_wtot
runNumber
weight_pileup

In addition, there are 5 columns unique to MC:

julia
mcChannelNumber
mcEventNumber
mcEventWeight
rand_lumiblock_nr
rand_run_nr

and 20 columns unique to data:

julia
jet_AverageLArQF
jet_BchCorrCell
jet_ChargedFraction
jet_HECQuality
jet_LArBadHVEnergyFrac
jet_LArBadHVNCell
jet_LArQuality
jet_LeadingClusterCenterLambda
jet_LeadingClusterPt
jet_LeadingClusterSecondLambda
jet_LeadingClusterSecondR
jet_N90Constituents
jet_NegativeE
jet_OotFracClusters10
jet_OotFracClusters5
jet_Timing
jet_clean_passLooseBad
jet_clean_passLooseBadUgly
jet_clean_passTightBad
jet_clean_passTightBadUgly

All 76 column names in nv3 - nv6:
julia
julia> println.(sort(names(tree)))
NPV
actualInteractionsPerCrossing
averageInteractionsPerCrossing
bcid
coreFlags
correctedActualMu
correctedAndScaledActualMu
correctedAndScaledAverageMu
correctedAverageMu
eventNumber
jet_CentroidR
jet_E
jet_EMFrac
jet_FracSamplingMax
jet_FracSamplingMaxIndex
jet_GhostMuonSegmentCount
jet_HECFrac
jet_HadronConeExclExtendedTruthLabelID
jet_HadronConeExclTruthLabelID
jet_JVF
jet_Jvt
jet_LowEtConstituentsFrac
jet_NumTrkPt1000
jet_NumTrkPt500
jet_SumPtTrkPt1000
jet_SumPtTrkPt500
jet_TrackWidthPt1000
jet_TrackWidthPt500
jet_Width
jet_eta
jet_fastDIPS
jet_fastDIPS_pb
jet_fastDIPS_pc
jet_fastDIPS_pu
jet_phi
jet_pt
jet_rapidity
lumiBlock
mcChannelNumber
mcEventNumber
mcEventWeight
njet
nph
ph_ActiveArea
ph_EMFrac
ph_EnergyPerSampling
ph_FracSamplingMax
ph_Jet_eta
ph_Jet_m
ph_Jet_phi
ph_Jet_pt
ph_Jvt
ph_N90Constituents
ph_NumTrkPt1000
ph_SumPtChargedPFOPt500
ph_SumPtTrkPt500
ph_TrackWidthPt1000
ph_deltae
ph_e277
ph_eratio
ph_eta
ph_f1
ph_m
ph_phi
ph_pt
ph_radhad
ph_radhad1
ph_reta
ph_rphi
ph_topoetcone40
ph_weta2
ph_wtot
rand_lumiblock_nr
rand_run_nr
runNumber
weight_pileup

All 71 column names in nv2:
julia
julia> println.(sort(names(mytree)));
NPV
actualInteractionsPerCrossing
averageInteractionsPerCrossing
bcid
coreFlags
correctedActualMu
correctedAndScaledActualMu
correctedAndScaledAverageMu
correctedAverageMu
eventNumber
jet_CentroidR
jet_E
jet_EMFrac
jet_FracSamplingMax
jet_FracSamplingMaxIndex
jet_GhostMuonSegmentCount
jet_HECFrac
jet_HadronConeExclExtendedTruthLabelID
jet_HadronConeExclTruthLabelID
jet_JVF
jet_Jvt
jet_LowEtConstituentsFrac
jet_NumTrkPt1000
jet_NumTrkPt500
jet_SumPtTrkPt1000
jet_SumPtTrkPt500
jet_TrackWidthPt1000
jet_TrackWidthPt500
jet_Width
jet_eta
jet_fastDIPS
jet_fastDIPS_pb
jet_fastDIPS_pc
jet_fastDIPS_pu
jet_phi
jet_pt
jet_rapidity
lumiBlock
mcChannelNumber
mcEventNumber
mcEventWeight
njet
nph
ph_ActiveArea
ph_EMFrac
ph_EnergyPerSampling
ph_FracSamplingMax
ph_Jvt
ph_N90Constituents
ph_NumTrkPt1000
ph_SumPtChargedPFOPt500
ph_SumPtTrkPt500
ph_TrackWidthPt1000
ph_deltae
ph_e277
ph_eratio
ph_eta
ph_f1
ph_m
ph_phi
ph_pt
ph_radhad
ph_radhad1
ph_reta
ph_rphi
ph_weta2
ph_wtot
rand_lumiblock_nr
rand_run_nr
runNumber
weight_pileup

All 60 column names in nv1:
julia
julia> println.(sort(names(mytree)));
NPV
actualInteractionsPerCrossing
averageInteractionsPerCrossing
bcid
coreFlags
correctedActualMu
correctedAndScaledActualMu
correctedAndScaledAverageMu
correctedAverageMu
eventNumber
jet_CentroidR
jet_E
jet_EMFrac
jet_FracSamplingMax
jet_FracSamplingMaxIndex
jet_GhostMuonSegmentCount
jet_HECFrac
jet_HadronConeExclExtendedTruthLabelID
jet_HadronConeExclTruthLabelID
jet_JVF
jet_LowEtConstituentsFrac
jet_NumTrkPt1000
jet_NumTrkPt500
jet_SumPtTrkPt1000
jet_SumPtTrkPt500
jet_TrackWidthPt1000
jet_TrackWidthPt500
jet_Width
jet_eta
jet_fastDIPS
jet_fastDIPS_pb
jet_fastDIPS_pc
jet_fastDIPS_pu
jet_phi
jet_pt
jet_rapidity
lumiBlock
mcChannelNumber
mcEventNumber
mcEventWeight
njet
nph
ph_deltae
ph_e277
ph_eratio
ph_eta
ph_f1
ph_m
ph_phi
ph_pt
ph_radhad
ph_radhad1
ph_reta
ph_rphi
ph_weta2
ph_wtot
rand_lumiblock_nr
rand_run_nr
runNumber
weight_pileup
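
To see what changed between two production versions without reading the listings side by side, the column names can be compared as sets. The sketch below uses only Base Julia; `schema_diff` is a hypothetical helper, and the two vectors are short excerpts of the nv1 and nv2 listings above rather than the full schemas.

```julia
# Branches present in `new` but not in `old`, and vice versa.
schema_diff(old, new) = (added = sort(setdiff(new, old)), removed = sort(setdiff(old, new)))

# Short excerpts of the nv1 and nv2 listings, for illustration only.
nv1_excerpt = ["jet_JVF", "jet_pt", "ph_pt", "ph_wtot"]
nv2_excerpt = ["jet_JVF", "jet_Jvt", "jet_pt", "ph_EMFrac", "ph_pt", "ph_wtot"]

d = schema_diff(nv1_excerpt, nv2_excerpt)
# d.added contains "jet_Jvt" and "ph_EMFrac"; d.removed is empty
```

With real files, the inputs would be the column names of the two trees, for instance the output of `names(tree)` as used in the listings above, converted to strings if needed.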

Highlighted explanation:

  • ph_* are all the properties related to HLT photons

    • from HLT_egamma_Photons_TLA for data

    • from HLT_egamma_Iso_Photons for MC

  • jet_* are all the properties related to HLT jets

    • from HLT_AntiKt4EMPFlowJets_subresjesgscIS_ftf_TLA for data

    • from HLT_AntiKt4EMPFlowJets_subresjesgscIS_ftf_TLA for MC

    • jet_HadronConeExclTruthLabelID takes the value of 5 for b and anti-b

    • jet_fastDIPS is pre-calculated from jet_fastDIPS_p{u,b,c} and can be re-calculated using calc_db.
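
As an illustration of the last two items, the sketch below recomputes a b-tagging discriminant from the three probabilities and selects truth-labelled b-jets. The exact formula used by calc_db is not shown on this page; the log-likelihood-ratio form here is the one commonly used for ATLAS flavour taggers and is an assumption, as are the function name `db_from_probs` and the value of the charm fraction `fc`.

```julia
# Hypothetical re-implementation sketch; the real `calc_db` may differ.
# pb, pc, pu : per-jet b / c / light probabilities (jet_fastDIPS_pb, _pc, _pu)
# fc         : charm fraction in the background hypothesis (analysis choice,
#              deliberately left without a default)
db_from_probs(pb, pc, pu; fc) = log(pb / (fc * pc + (1 - fc) * pu))

# Broadcast over the jets of one event (made-up numbers).
pb = [0.80, 0.05]
pc = [0.10, 0.15]
pu = [0.10, 0.80]
scores = db_from_probs.(pb, pc, pu; fc = 0.2)   # fc = 0.2 is illustrative only

# In MC, jets with jet_HadronConeExclTruthLabelID == 5 are b or anti-b.
labels = [5, 0]
b_scores = scores[labels .== 5]
```

Comparing such a recomputed score against the stored jet_fastDIPS branch is a quick way to check which `fc` value and formula were actually used.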

Change Log

nv9

  • Added MC Online/Offline correction

  • Updated GRLs and pile-up re-weighting files

nv8

  • Updated workflow: dropped overlap removal in the ntuple production, included HLT photon calibration, and relaxed kinematic selections on jets.

  • Added new variables, the most relevant ones are: fastGN1 scores and decoration for running jet cleaning in data.

  • Extended signal grid with new mass points and larger stats (ATLMCPROD-11403)

nv6.1

  • Fixed MC cleaning for dijets; it should now be applied correctly.

nv6

  • Add the jet calibration for 2023 data and mc23d samples

  • Configurations were slightly updated

  • MC cleaning for dijet samples implemented but not applied.

nv5

  • Fixed the configuration files for the Pile-up Reweighting tool !32 (affecting 2023 and mc23d only).

  • Included a separate tree TreeAlgo_Offline/nominal for dumping offline photons (MC only at the moment).

Note

Only 2023 data and a few MC23d background samples have been produced.

nv4.1

  • Updated signal samples from the latest reprocessing request ATLMCPROD-11256. The correct trigger should now be available.

  • The m(Z->bb)=125 GeV signal point was renamed !3114.

Warning

We requested to reprocess r15542 but r15540 was used... We don't have full TrigEDM saved (more important for calib) and the LAr bug fix is missing (less relevant though).

nv4

  • Dropped the cut on photon isolation.

  • Updated the photon ID selection to match offline prescription (commit)

nv3

  • Added configurations for dumping data 2023 and MC23d ntuples.

  • Included cut on photon isolation variable (topoetcone40).

  • Added a few branches:

    • ph_topoetcone40

    • ph_Jet_pt, ph_Jet_eta, ph_Jet_phi, ph_Jet_m (4-momentum of jets OR by photons)

  • Included a couple of bug fixes (!33 in TLAAlgosRun3):

    • Missing max dR to copy over jets features to matched photons

    • Photon shower shapes fudge factors applied twice

Note

MC23d and 2023 data samples were produced.

nv2.1

  • Added jet 4-momenta to matched photons, needed to clean pile-up from the low JZ slices in dijet MC.

Note

Only dijets samples were produced with this tag, and are stored in nv2.

nv2

  • Fixed a bug in MC background samples, where (anti-)matching with truth photons was required by mistake.

  • Added these branches:

    • ph_EnergyPerSampling

    • ph_SumPtTrkPt500

    • ph_Jvt

    • ph_SumPtChargedPFOPt500

    • ph_ActiveArea

    • ph_FracSamplingMax

    • ph_EMFrac

    • ph_N90Constituents

    • ph_NumTrkPt1000

    • ph_TrackWidthPt1000

    • jet_Jvt

  • Attributes such as ph_EMFrac are created by identifying the jet that's overlapping with the photon and decorating the photon with jet attributes – normally, photons do not have EMFrac.

Warning

MC23a signal events are still selected with the 2023 trigger; new samples have been requested (see ATLMCPROD-11256).

nv1

First implementation of a common configuration file, !MR21. It includes a preselection on jets and photons. Analysis photons are required to have pT>20 GeV and pass a tight ID cut. Jets overlapping analysis photons are removed (dR<0.4), and only jets with pT above 20 GeV are saved in the ntuple.

Warning

MC23a signal events are still selected with the 2023 trigger; new samples have been requested (see ATLMCPROD-11256).

nv0.1

  • initial release