reboost package¶
Subpackages¶
- reboost.daq package
- reboost.hpge package
- Submodules
- reboost.hpge.psd module
_current_pulse_model(), _drift_time_heuristic_impl(), _estimate_current_impl(), _get_waveform_maximum_impl(), _get_waveform_value(), _get_waveform_value_surface(), _interpolate_pulse_model(), _njit_erf(), convolve_surface_response(), drift_time(), drift_time_heuristic(), get_current_template(), get_current_waveform(), make_convolved_surface_library(), maximum_current(), r90()
- reboost.hpge.surface module
- reboost.hpge.utils module
- reboost.math package
- reboost.optmap package
- Submodules
- reboost.optmap.cli module
- reboost.optmap.convolve module
- reboost.optmap.create module
- reboost.optmap.evt module
- reboost.optmap.mapview module
- reboost.optmap.numba_pdg module
- reboost.optmap.optmap module
OpticalMap, OpticalMap._divide_hist(), OpticalMap._edges_eq(), OpticalMap._fill_histogram(), OpticalMap._fill_histogram_buf(), OpticalMap._lock_nda(), OpticalMap._mp_preinit(), OpticalMap._nda(), OpticalMap._prepare_hist(), OpticalMap.check_histograms(), OpticalMap.create_empty(), OpticalMap.create_probability(), OpticalMap.fill_hits(), OpticalMap.fill_hits_flush(), OpticalMap.fill_vertex(), OpticalMap.get_settings(), OpticalMap.load_from_file(), OpticalMap.write_lh5()
- reboost.shape package
- reboost.spms package
Submodules¶
reboost.build_evt module¶
- reboost.build_evt.build_evt(tcm, hitfile, outfile, channel_groups, pars, run_part)¶
Build events out of a TCM.
- Parameters:
tcm (VectorOfVectors) – the time coincidence map.
hitfile (str) – file with the hits.
outfile (str | None) – the path to the output file; if None, the events are returned in memory.
channel_groups (AttrsDict) –
a dictionary of groups of channels. For example:
{"det1": "on", "det2": "off", "det3": "ac"}
pars (AttrsDict) –
A dictionary of parameters. The first key should be the run ID, followed by different sets of parameters arranged in groups. Run numbers should be given in the format “p00-r001”, etc.
For example:
{"p03-r000": {"reso": {"det1": [1, 2], "det2": [0, 1]}}}
run_part (AttrsDict) –
The run partitioning file giving the number of events for each run. This should be organized as a dictionary with the following format:
{"p03-r000": 1000, "p03-r001": 2000}
- Returns:
the event file in memory as a table if no output file is specified.
- Return type:
Table | None
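The three dictionary arguments can be written out as plain Python data. The snippet below is only an illustration: it uses plain dicts where the function expects AttrsDict objects, and check_run_keys is a hypothetical helper (not part of reboost) that assumes every run in run_part should follow the “p00-r001” pattern and have an entry in pars.

```python
import re

# Example inputs in the formats described above (values are illustrative).
channel_groups = {"det1": "on", "det2": "off", "det3": "ac"}
pars = {"p03-r000": {"reso": {"det1": [1, 2], "det2": [0, 1]}}}
run_part = {"p03-r000": 1000, "p03-r001": 2000}

# Run IDs such as "p03-r000": "p" + two digits, "-r" + three digits.
RUN_ID = re.compile(r"^p\d{2}-r\d{3}$")


def check_run_keys(pars, run_part):
    """Hypothetical check: list malformed run IDs and runs without parameters."""
    malformed = [run for run in run_part if not RUN_ID.match(run)]
    missing_pars = [run for run in run_part if run not in pars]
    return malformed, missing_pars


malformed, missing_pars = check_run_keys(pars, run_part)
```

With the example values above, the second run has no parameter set, so a check of this kind would flag it before the events are built.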
reboost.build_glm module¶
- reboost.build_glm.build_glm(stp_files, glm_files, lh5_groups=None, *, out_table_name='glm', id_name='g4_evtid', evtid_buffer=10000000, stp_buffer=10000000)¶
Builds a g4_evtid look up (glm) from the stp data.
This object is used by reboost to efficiently iterate through the data. It consists of a lgdo.VectorOfVectors for each lh5_table in the input files. The rows of this lgdo.VectorOfVectors correspond to the id_name, while the data are the stp indices for this event.
- Parameters:
glm_files (str | list[str] | None) – path to the glm data, can also be None in which case an ak.Array is returned in memory.
out_table_name (str) – name for the output table.
id_name (str) – name of the evtid file, default g4_evtid.
stp_buffer (int) – the number of rows of the step file to read at a time
evtid_buffer (int) – the number of evtids to read at a time
lh5_groups (list | None)
- Returns:
either None or an ak.Array
- Return type:
Array | None
- reboost.build_glm.get_glm_rows(stp_evtids, vert, *, start_row=0)¶
Get the rows of the Geant4 event lookup map (glm).
- Parameters:
stp_evtids (_Buffer | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]) – Array of evtids for the steps
vert (_Buffer | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]) – Array of simulated evtid for the vertices.
start_row (int) – The index of the first element of stp_evtids.
- Returns:
an awkward array of the glm.
- Return type:
Array
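To make the structure of the lookup map concrete, the sketch below groups step rows by evtid in plain Python. It is not reboost's implementation: glm_rows_sketch is a hypothetical name, and it returns nested lists rather than an awkward array.

```python
def glm_rows_sketch(stp_evtids, vert, start_row=0):
    """For each vertex evtid, list the rows of the step table with that evtid."""
    rows = {evtid: [] for evtid in vert}
    for offset, evtid in enumerate(stp_evtids):
        if evtid in rows:
            # start_row shifts the local index to a row of the full step table
            rows[evtid].append(start_row + offset)
    return [rows[evtid] for evtid in vert]


# evtid 1 was simulated but left no steps, so its entry is empty
glm = glm_rows_sketch([0, 0, 2, 2, 2, 5], vert=[0, 1, 2, 5], start_row=10)
```

Keeping an empty entry for vertices without steps means the result stays aligned with the list of simulated evtids.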
- reboost.build_glm.get_stp_evtids(lh5_table, stp_file, id_name, start_row, last_vertex_evtid, stp_buffer)¶
Extracts the rows of a stp file corresponding to a particular range of evtid.
The reading starts at start_row to allow for iterating through the file. The iteration stops when the evtid being read is larger than last_vertex_evtid.
- Parameters:
- Returns:
a tuple of the updated start_row, the first row for the chunk and an awkward Array of the steps.
- Return type:
reboost.build_hit module¶
Routines to build the hit tier from the stp tier.
build_hit() parses a configuration file of the following form:
# dictionary of objects useful for later computation. they are constructed with
# auxiliary data (e.g. metadata). They can be accessed later as OBJECTS (all caps)
objects:
  lmeta: legendmeta.LegendMetadata(ARGS.legendmetadata)
  geometry: pyg4ometry.load(ARGS.gdml)
  user_pars: dbetto.TextDB(ARGS.par)
  dataprod_pars: dbetto.TextDB(ARGS.dataprod_cycle)
  _spms: OBJECTS.lmeta.channelmap(on=ARGS.timestamp)
    .group("system").spms
    .map("name")
  spms: "{name: spm.daq.rawid for name, spm in OBJECTS._spms.items()}"
# processing chain is defined to act on a group of detectors
processing_groups:
  # start with HPGe stuff, give it an optional name
  - name: geds
    # this is a list of included detectors (part of the processing group)
    detector_mapping:
      - output: OBJECTS.lmeta.channelmap(on=ARGS.timestamp)
          .group('system').geds
          .group('analysis.status').on
          .map('name').keys()
    # which columns we actually want to see in the output table
    outputs:
      - t0
      - evtid
      - energy
      - r90
      - drift_time
    # in this section we define objects that will be instantiated at each
    # iteration of the for loop over input tables (i.e. detectors)
    detector_objects:
      # The following assumes that the detector metadata is stored in the GDML file
      pyobj: legendhpges.make_hpge(pygeomtools.get_sensvol_metadata(OBJECTS.geometry, DETECTOR))
      phyvol: OBJECTS.geometry.physical_volume_dict[DETECTOR]
      drift_time_map: lgdo.lh5.read(DETECTOR, ARGS.dtmap_file)
    # finally, the processing chain
    operations:
      t0: ak.fill_none(ak.firsts(HITS.time, axis=-1), np.nan)
      evtid: ak.fill_none(ak.firsts(HITS.__evtid, axis=-1), np.nan)
      # distance to the nplus surface in mm
      distance_to_nplus_surface_mm: reboost.hpge.distance_to_surface(
        HITS.__xloc, HITS.__yloc, HITS.__zloc,
        DETECTOR_OBJECTS.pyobj,
        DETECTOR_OBJECTS.phyvol.position.eval(),
        surface_type='nplus')
      # activeness based on FCCD (no TL)
      activeness: ak.where(
        HITS.distance_to_nplus_surface_mm <
          OBJECTS.lmeta.hardware.detectors.germanium.diodes[DETECTOR].characterization.combined_0vbb_fccd_in_mm.value,
        0,
        1
        )
      activeness2: reboost.math.piecewise_linear(
        HITS.distance_to_nplus_surface_mm,
        PARS.tlayer[DETECTOR].start_in_mm,
        PARS.fccd_in_mm,
        )
      # summed energy of the hit accounting for activeness
      energy_raw: ak.sum(HITS.__edep * HITS.activeness, axis=-1)
      # energy with smearing
      energy: reboost.math.sample_convolve(
        scipy.stats.norm, # resolution distribution
        loc=HITS.energy_raw, # parameters of the distribution (observable to convolve)
        scale=np.sqrt(PARS.a + PARS.b * HITS.energy_raw) # another parameter
        )
      # this is going to return "run lengths" (awkward jargon)
      clusters_lengths: reboost.shape.cluster.naive(
        HITS, # can also pass the exact fields (x, y, z)
        size=1,
        units="mm"
        )
      # example of low level reduction on clusters
      energy_clustered: ak.sum(ak.unflatten(HITS.__edep, HITS.clusters_lengths), axis=-1)
      # example of using a reboost helper
      steps_clustered: reboost.shape.reduction.energy_weighted_average(HITS, HITS.clusters_lengths)
      r90: reboost.hpge.psd.r90(HITS.steps_clustered)
      drift_time: reboost.hpge.psd.drift_time(
        HITS.steps_clustered,
        DETECTOR_OBJECTS.drift_time_map
        )
  # example basic processing of steps in scintillators
  - name: lar
    detector_mapping:
      - output: scintillators
    outputs:
      - evtid
      - tot_edep_wlsr
      - num_scint_ph_lar
    operations:
      tot_edep_wlsr: ak.sum(HITS.edep[np.abs(HITS.zloc) < 3000], axis=-1)
  - name: spms
    # by default, reboost looks in the steps input table for a table with the
    # same name as the current detector. This can be overridden for special processors
    detector_mapping:
      - output: OBJECTS.spms.keys()
        input: lar
    hit_table_layout: reboost.shape.group_by_time(STEPS, window=10)
    pre_operations:
      num_scint_ph_lar: reboost.spms.emitted_scintillation_photons(HITS.edep, HITS.particle, "lar")
      # num_scint_ph_pen: ...
    outputs:
      - t0
      - evtid
      - pe_times
    detector_objects:
      meta: pygeomtools.get_sensvol_metadata(OBJECTS.geometry, DETECTOR)
      spm_uid: OBJECTS.spms[DETECTOR]
      optmap_lar: reboost.spms.load_optmap(ARGS.optmap_path_lar, DETECTOR_OBJECTS.spm_uid)
      optmap_pen: reboost.spms.load_optmap(ARGS.optmap_path_pen, DETECTOR_OBJECTS.spm_uid)
    operations:
      pe_times_lar: reboost.spms.detected_photoelectrons(
        HITS.num_scint_ph_lar, HITS.particle, HITS.time, HITS.xloc, HITS.yloc, HITS.zloc,
        DETECTOR_OBJECTS.optmap_lar,
        "lar",
        DETECTOR_OBJECTS.spm_uid
        )
      pe_times_pen: reboost.spms.detected_photoelectrons(
        HITS.num_scint_ph_pen, HITS.particle, HITS.time, HITS.xloc, HITS.yloc, HITS.zloc,
        DETECTOR_OBJECTS.optmap_pen,
        "pen",
        DETECTOR_OBJECTS.spm_uid
        )
      pe_times: ak.concatenate([HITS.pe_times_lar, HITS.pe_times_pen], axis=-1)
# can list here some lh5 objects that should just be forwarded to the
# output file, without any processing
forward:
  - /vtx
  - /some/dataset
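In the configuration above, later entries of an operations block refer to columns produced by earlier ones (for example, energy uses HITS.energy_raw), which suggests the expressions are evaluated in order. The sketch below illustrates that idea with the standard library only. It is not how reboost evaluates columns: run_operations_sketch, the plain-list fields and PARS.gain are all hypothetical.

```python
from types import SimpleNamespace


def run_operations_sketch(hits, operations, pars):
    """Evaluate expressions in order, storing each result as a new HITS field."""
    for name, expression in operations.items():
        # HITS and PARS are the only names visible to the expression
        value = eval(expression, {"HITS": hits, "PARS": pars})
        setattr(hits, name, value)
    return hits


# one hit made of two steps; the second step is in an inactive region
hits = SimpleNamespace(edep=[100.0, 50.0], activeness=[1, 0])
pars = SimpleNamespace(gain=2.0)  # hypothetical parameter
operations = {
    "energy_raw": "sum(e * a for e, a in zip(HITS.edep, HITS.activeness))",
    "energy": "HITS.energy_raw * PARS.gain",  # depends on the previous column
}
run_operations_sketch(hits, operations, pars)
```

Because each result is attached to the hit table before the next expression runs, the order of the keys determines which columns an expression may use.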
- reboost.build_hit._evaluate_operation(hit_table, field, info, local_dict, time_dict)¶
- Parameters:
field (str)
local_dict (dict)
time_dict (ProfileDict)
- Return type:
None
- reboost.build_hit.build_hit(config, args, stp_files, glm_files, hit_files, *, start_evtid=0, n_evtid=None, out_field='hit', buffer=5000000, overwrite=False, allow_missing_inputs=True)¶
Build the hit tier from the remage step files.
- Parameters:
config (Mapping | str) – dictionary or path to YAML file containing the processing chain.
args (Mapping | AttrsDict) – dictionary or dbetto.AttrsDict of the global arguments.
stp_files (str | list[str]) – list of strings or string of the stp file path.
glm_files (str | list[str] | None) – list of strings or string of the glm file path; if None, the glm will be built in memory.
hit_files (str | list[str] | None) – list of strings or string of the hit file path. The hit file can also be None in which case the hits are returned as an ak.Array in memory.
start_evtid (int) – first evtid to read.
n_evtid (int | None) – number of evtid to read, if None read all.
out_field (str) – name of the output field
buffer (int) – buffer size for use in the LH5Iterator.
overwrite (bool) – flag to overwrite the existing output.
allow_missing_inputs – Flag to allow an input table to be missing, generally when there were no events.
- Return type:
None | Array
reboost.cli module¶
- reboost.cli.cli(args=None)¶
- Return type:
None
reboost.core module¶
- reboost.core._remove_col(field, tab)¶
Remove column accounting for nesting.
- reboost.core.add_field_with_nesting(tab, col, field)¶
Add a field handling the nesting.
- reboost.core.evaluate_hit_table_layout(steps, expression, time_dict=None)¶
Evaluate the hit_table_layout expression, producing the hit table.
This expression should be a function call which performs a restructuring of the steps, i.e. it sets the number of rows. The steps array should be referred to by “STEPS” in the expression.
- reboost.core.evaluate_object(expression, local_dict)¶
Evaluate an expression returning any object.
The expression should be a function call. It can depend on any objects contained in the local dict. In addition, the expression can use packages which are then imported.
- reboost.core.evaluate_output_column(hit_table, expression, local_dict, *, table_name='HITS', time_dict=None, name=' ')¶
Evaluate an expression returning an LGDO.
Uses lgdo.Table.eval() to compute a new column for the hit table. The expression can depend on any field in the Table (prefixed with table_name.) or objects contained in the local dict. In addition, the expression can use packages which are then imported.
- Parameters:
hit_table (Table) – the table containing the hit fields.
expression (str) – the expression to evaluate.
local_dict (dict) – local dictionary to pass to lgdo.Table.eval().
table_name (str) – keyword used to refer to the fields in the table.
time_dict (ProfileDict | None) – time profiling data structure.
name (str) – name to use in time_dict.
- Returns:
an LGDO with the new field.
- Return type:
- reboost.core.get_detector_mapping(detector_mapping, global_objects, args)¶
Get all the detector mappings using get_one_detector_mapping().
- reboost.core.get_detector_objects(output_detectors, expressions, args, global_objects, time_dict=None)¶
Get the detector objects for each detector.
This computes a set of objects per output detector. These should be the expressions (defined in the expressions input). They can depend on the keywords:
ARGS : in which case values from the args parameter AttrsDict can be referenced,
DETECTOR: referring to the detector name (key of the detector mapping)
OBJECTS : The global objects.
For example expressions like:
compute_object(arg=ARGS.first_arg, detector=DETECTOR, obj=OBJECTS.meta)
are supported.
- Parameters:
output_detectors (list) – list of output detectors,
expressions (dict) – dictionary of expressions to evaluate.
args (AttrsDict) – any arguments the expression can depend on, is passed as locals to eval().
global_objects (AttrsDict) – a dictionary of objects the expression can depend on.
time_dict (ProfileDict | None) – time profiling data structure.
- Returns:
An AttrsDict of the objects for each detector.
- Return type:
- reboost.core.get_global_objects(expressions, *, local_dict, time_dict=None)¶
Extract global objects used in the processing.
- Parameters:
- Returns:
dictionary of objects with the same keys as the expressions.
- Return type:
- reboost.core.get_one_detector_mapping(output_detector_expression, objects=None, input_detector_name=None, args=None)¶
Extract the output detectors and the mapping of inputs to outputs by parsing the expressions.
The output_detector_expression can be a name or a string evaluating to a list of names. This expression can depend on any objects in the objects dictionary, referred to by the keyword “OBJECTS”.
The function produces a dictionary mapping input detectors to output detectors with the following format:
{ "input1": ["output1", "output2"], "input2": ["output3", ...], }
If only output_detector_expression is supplied the mapping is one-to-one (i.e. every input detector maps to the same output detector). If instead a name for the input_detector_name is also supplied this will be the only key with all output detectors being mapped to this.
- Parameters:
output_detector_expression (str | list) – An output detector name or a string evaluating to a list of output tables.
objects (AttrsDict | None) – dictionary of objects that can be referenced in the expression.
input_detector_name (str | None) – Optional input detector name for all the outputs.
args (AttrsDict | None) – any arguments the expression can depend on, is passed as locals to eval().
- Returns:
a dictionary with the input detectors as key and a list of output detectors for each.
- Return type:
Examples
For a direct one-to-one mapping:
>>> get_one_detector_mapping("[str(i) for i in range(2)]")
{'0': ['0'], '1': ['1']}
With an input detector name:
>>> get_one_detector_mapping("[str(i) for i in range(2)]", input_detector_name="dets")
{'dets': ['0', '1']}
With objects:
>>> objs = AttrsDict({"format": "ch"})
>>> get_one_detector_mapping("[f'{OBJECTS.format}{i}' for i in range(2)]", input_detector_name="dets", objects=objs)
{'dets': ['ch0', 'ch1']}
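The mapping rules described above can be sketched with the standard library alone. This is an illustration, not the library code: one_detector_mapping_sketch is a hypothetical name, a SimpleNamespace stands in for the AttrsDict of objects, and treating an expression that fails to evaluate as a plain detector name is an assumption.

```python
from types import SimpleNamespace


def one_detector_mapping_sketch(expression, objects=None, input_detector_name=None):
    """Build {input detector: [output detectors]} from a name or an expression."""
    if isinstance(expression, str):
        try:
            outputs = eval(expression, {"OBJECTS": objects})
        except (NameError, SyntaxError):
            outputs = expression  # assumption: not an expression, so a plain name
    else:
        outputs = expression
    outputs = [outputs] if isinstance(outputs, str) else list(outputs)
    if input_detector_name is None:
        # one-to-one: each detector is read from the table with its own name
        return {name: [name] for name in outputs}
    # a single input table feeds all the output detectors
    return {input_detector_name: outputs}


one_to_one = one_detector_mapping_sketch("[str(i) for i in range(2)]")
shared_input = one_detector_mapping_sketch(
    "[f'{OBJECTS.format}{i}' for i in range(2)]",
    objects=SimpleNamespace(format="ch"),
    input_detector_name="dets",
)
single_name = one_detector_mapping_sketch("scintillators")
```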
- reboost.core.merge(hit_table, output_table)¶
Merge the table with the array.
- Parameters:
hit_table (Table)
output_table (Array | None)
- reboost.core.read_data_at_channel_as_ak(channels, rows, file, field, group, tab_map)¶
Read the data from a particular field to an awkward array. This replaces the TCM like object defined by the channels and rows with the corresponding data field.
- Parameters:
channels (Array) – Array of the channel indices (uids).
rows (Array) – Array of the rows in the files to gather data from.
file (str) – File to read the data from.
field (str) – the field to read.
group (str) – the group to read data from (e.g. hit or stp).
tab_map – mapping between indices and table names, of the form:
{NAME: UID}
For example:
{"det001": 1, "det002": 2}
- Returns:
an array with the data, of the same shape as the channels and rows.
- Return type:
Array
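The gather operation can be pictured with nested Python lists. The sketch below is a simplified stand-in, assuming the field of each table has already been read into memory: gather_by_channel_sketch and the tables dictionary are hypothetical, and no file access or awkward arrays are involved.

```python
def gather_by_channel_sketch(channels, rows, tables, tab_map):
    """Replace each (channel, row) pair with the value stored at that row."""
    uid_to_name = {uid: name for name, uid in tab_map.items()}
    return [
        [tables[uid_to_name[ch]][row] for ch, row in zip(evt_channels, evt_rows)]
        for evt_channels, evt_rows in zip(channels, rows)
    ]


tab_map = {"det001": 1, "det002": 2}
# values of one field (e.g. energy) for every row of each table
tables = {"det001": [10.0, 11.0], "det002": [20.0, 21.0, 22.0]}
channels = [[1, 2], [2], [1, 2]]  # TCM-like: channel uids per event
rows = [[0, 0], [1], [1, 2]]      # TCM-like: row in the channel's table
energies = gather_by_channel_sketch(channels, rows, tables, tab_map)
```

The result has one entry per (channel, row) pair, so it keeps the nesting of the inputs.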
reboost.iterator module¶
- class reboost.iterator.GLMIterator(glm_file, stp_file, lh5_group, start_row, n_rows, *, stp_field='stp', buffer=10000, time_dict=None, reshaped_files=False)¶
Bases: object
A class to iterate over the rows of an event lookup map.
Constructor for the GLMIterator.
The GLM iterator provides a way to iterate over the simulated Geant4 evtids, extracting the number of hits or steps for each range of evtids. This ensures a single simulated event is not split between two iterations and allows a start and an end evtid to be specified.
If the data is already reshaped and a specific range of evtids does not need to be read, this iterator just loops over the input stp field. Otherwise, if the GLM file is not provided, the GLM is created in memory.
- Parameters:
glm_file (str | None) – the file containing the event lookup map, if None the glm will be created in memory if needed.
stp_file (str) – the file containing the steps to read.
lh5_group (str) – the name of the lh5 group to read.
start_row (int) – the first row to read.
n_rows (int | None) – the number of rows to read, if None read them all.
stp_field (str) – name of the group.
buffer (int) – the number of rows to read at once.
time_dict (dict | None) – time profiling data structure.
reshaped_files (bool) – flag for whether the files are reshaped.
- get_n_rows()¶
Get the number of rows to read.
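The guarantee that an event is never split between two iterations can be illustrated with a small generator. This is a sketch under assumptions, not the class's logic: iterate_whole_events_sketch is a hypothetical name, the number of step rows per evtid is taken as already known, and buffer is taken to count evtids.

```python
def iterate_whole_events_sketch(n_rows_per_evtid, buffer):
    """Yield (start_row, n_rows) chunks of the step table, buffer evtids at a time."""
    start_row = 0
    for first in range(0, len(n_rows_per_evtid), buffer):
        # all rows of the evtids in this chunk are read together
        n_rows = sum(n_rows_per_evtid[first:first + buffer])
        yield start_row, n_rows
        start_row += n_rows


# five simulated events with 2, 0, 3, 1 and 4 steps each
chunks = list(iterate_whole_events_sketch([2, 0, 3, 1, 4], buffer=2))
```

Chunk boundaries fall only between evtids, so the row counts differ from chunk to chunk while every event stays whole.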
reboost.log_utils module¶
reboost.profile module¶
- class reboost.profile.ProfileDict(value=None)¶
Bases: AttrsDict
A class to store the results of time profiling.
Construct an AttrsDict object.
Note
The input dictionary is copied.
- _format(data, indent=1)¶
Recursively format the dictionary.
- Parameters:
data (ProfileDict) – The dictionary to format.
indent (int) – The current indentation level.
- Returns:
the formatted print out.
- Return type:
reboost.units module¶
- reboost.units.pg4_to_pint(obj)¶
Convert pyg4ometry object to pint Quantity.
- Parameters:
obj (Quantity | VectorBase)
- Return type:
- reboost.units.unit_to_lh5_attr(unit)¶
Convert Pint unit to a string that can be used as attrs[“units”] in an LGDO.
- reboost.units.units_conv_ak(data, target_units)¶
Calculate numeric conversion factor to reach target_units, and apply to data converted to ak.
- reboost.units.units_convfact(data, target_units)¶
Calculate numeric conversion factor to reach target_units.
- reboost.units.unwrap_lgdo(data, library='ak')¶
Return a view of the data held by the LGDO and its physical units.
- reboost.units.ureg = <pint.registry.ApplicationRegistry object>¶
The physical units registry.
reboost.utils module¶
- reboost.utils._check_input_file(parser, file, descr='input')¶
- reboost.utils._check_output_file(parser, file, optional=False)¶
- reboost.utils._search_string(string)¶
Capture the characters matching the pattern for a function call.
- Parameters:
string (str)
- reboost.utils.assign_units(tab, units)¶
Copy the attributes from the map of attributes to the table.
- reboost.utils.copy_units(tab)¶
Extract a dictionary of attributes (i.e. units).
- reboost.utils.filter_logging(level)¶
- reboost.utils.get_channels_from_groups(names, groupings=None)¶
Get a list of channels from a list of groups.
- reboost.utils.get_file_dict(stp_files, glm_files, hit_files=None)¶
Get the file info as an AttrsDict.
Creates a dbetto.AttrsDict with keys stp_files, glm_files and hit_files. Each key contains a list of file-paths (or None).
- Parameters:
stp_files (list[str] | str) – string or list of strings of the stp files.
glm_files (list[str] | str | None) – string or list of strings of the glm files, or None in which case the glm will be created in memory.
hit_files (list[str] | str | None) – string or list of strings of the hit files, if None the output files will be created in memory.
- Return type:
- reboost.utils.get_file_list(path, threads=None)¶
Get a list of files accounting for the multithread index.
- reboost.utils.get_function_string(expr, aliases=None)¶
Get a function call to evaluate.
Search the expression for anything matching the pattern of a function call. Package aliases are also detected, by default just numpy as np and awkward as ak. In this case, the full name is replaced with the alias in the expression and also in the output globals dictionary.
It is possible to chain functions together, e.g.:
ak.num(np.array([1, 2]))
and all packages will be imported.
- Parameters:
- Returns:
a tuple of call string and dictionary of the imported global packages.
- Return type:
- reboost.utils.get_remage_detector_uids(h5file)¶
Get mapping of detector names to UIDs from a remage output file.
The remage LH5 output files contain a link structure that lets the user access detector tables by UID. For example:
├── stp · struct{det1,det2,optdet1,optdet2,scint1,scint2}
└── __by_uid__ · struct{det001,det002,det011,det012,det101,det102}
    ├── det001 -> /stp/scint1
    ├── det002 -> /stp/scint2
    ├── det011 -> /stp/det1
    ├── det012 -> /stp/det2
    ├── det101 -> /stp/optdet1
    └── det102 -> /stp/optdet2
This function analyzes this structure and returns:
{1: 'scint1', 2: 'scint2', 11: 'det1', 12: 'det2', 101: 'optdet1', 102: 'optdet2'}
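The translation from link names to UIDs can be sketched without opening a file. The snippet assumes the links have already been read into a dictionary of link name to target path; uids_from_links_sketch is a hypothetical helper, and the real function works on the LH5 file itself.

```python
def uids_from_links_sketch(links):
    """Map integer UIDs to detector names from {link name: target path} pairs."""
    uids = {}
    for link_name, target in links.items():
        uid = int(link_name[len("det"):])      # "det011" -> 11
        uids[uid] = target.rsplit("/", 1)[-1]  # "/stp/det1" -> "det1"
    return uids


links = {
    "det001": "/stp/scint1",
    "det011": "/stp/det1",
    "det101": "/stp/optdet1",
}
uids = uids_from_links_sketch(links)
```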
- reboost.utils.get_table_names(tcm)¶
Extract table names from tcm.attrs[‘tables’] and return them as a dictionary.
- Parameters:
tcm (VectorOfVectors)
- Return type:
- reboost.utils.get_wo_mode(group, out_det, in_det, chunk, new_hit_file, overwrite=False)¶
Get the mode for lh5 file writing.
If all indices are 0 and we are writing a new output file, then the mode “overwrite_file” is used if the overwrite flag is set, otherwise the mode “write_safe” is used.
Otherwise the code chooses between “append_column”, if this is the first time a group is being written to the file, and “append”.
- Parameters:
group (int) – the index of the processing group
out_det (int) – the index of the output detector
in_det (int) – the index of the input detector
chunk (int) – the chunk index
new_hit_file (bool) – a flag of whether we are writing a new hit file. This does not indicate whether the file already exists on disk, but whether the file name is different from the last written chunk for this detector.
overwrite (bool) – a flag of whether to overwrite the old file.
- Returns:
the mode for IO
- Return type:
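The decision described above can be written as a short function. This is a sketch and not the library code: wo_mode_sketch is a hypothetical name, and treating chunk index 0 as “the first time a group is being written” is an assumption made for the illustration.

```python
def wo_mode_sketch(group, out_det, in_det, chunk, new_hit_file, overwrite=False):
    """Pick a write mode following the rules described above."""
    very_first_write = all(idx == 0 for idx in (group, out_det, in_det, chunk))
    if very_first_write and new_hit_file:
        return "overwrite_file" if overwrite else "write_safe"
    # Assumption: chunk 0 marks the first write of a group to the file.
    return "append_column" if chunk == 0 else "append"


modes = [
    wo_mode_sketch(0, 0, 0, 0, new_hit_file=True, overwrite=True),
    wo_mode_sketch(0, 0, 0, 0, new_hit_file=True),
    wo_mode_sketch(1, 0, 0, 0, new_hit_file=False),
    wo_mode_sketch(1, 0, 0, 3, new_hit_file=False),
]
```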
- reboost.utils.get_wo_mode_forwarded(written_tables, new_hit_file, overwrite=False)¶
Get the mode for lh5 file writing for forwarded tables that will be copied without chunking.
If we are writing a new output file and no other tables had been written yet, then the mode “overwrite_file” is used if the overwrite flag is set, otherwise the mode “write_safe” is used.
Otherwise “append” is used.
- Parameters:
written_tables (set[str]) – a set of already written table names, also including other table names of non-forwarded (i.e. processed) tables.
new_hit_file (bool) – a flag of whether we are writing a new hit file. This does not indicate whether the file already exists on disk, but whether the file name is different from the last written chunk for this forwarded table.
overwrite (bool) – a flag of whether to overwrite the old file.
- Returns:
the mode for IO
- Return type:
- reboost.utils.is_new_hit_file(files, file_idx)¶
Return whether the hit file with the given index is a “new” hit file.
A file is new if it is the first file written, or if the file at the previous index has a different name.
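A minimal sketch of this rule, assuming the hit files are given as a plain list of paths (the structure of the real files argument is not shown here, and is_new_hit_file_sketch is a hypothetical name):

```python
def is_new_hit_file_sketch(hit_files, file_idx):
    """True for the first file, or when the previous entry has a different name."""
    return file_idx == 0 or hit_files[file_idx] != hit_files[file_idx - 1]


# two input files written to the same output, then a different output
hit_files = ["out_a.lh5", "out_a.lh5", "out_b.lh5"]
flags = [is_new_hit_file_sketch(hit_files, idx) for idx in range(len(hit_files))]
```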
- reboost.utils.merge_dicts(dict_list)¶
Merge a list of dictionaries, concatenating the items where they exist.
- Parameters:
dict_list (list) – list of dictionaries to merge
- Returns:
a new dictionary after merging.
- Return type:
Examples
>>> merge_dicts([{"a":[1,2,3],"b":[2]},{"a":[4,5,6],"c":[2]}])
{"a":[1,2,3,4,5,6],"b":[2],"c":[2]}
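One way to obtain this behaviour is sketched below. It is an illustrative version rather than the library code: merge_dicts_sketch is a hypothetical name, and the values are assumed to be lists.

```python
def merge_dicts_sketch(dict_list):
    """Merge dictionaries of lists, concatenating lists that share a key."""
    merged = {}
    for one_dict in dict_list:
        for key, values in one_dict.items():
            # a fresh list is created on first use, so the inputs are not modified
            merged.setdefault(key, []).extend(values)
    return merged


merged = merge_dicts_sketch([{"a": [1, 2, 3], "b": [2]}, {"a": [4, 5, 6], "c": [2]}])
```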
- reboost.utils.write_lh5(hit_table, file, time_dict, out_field, out_detector, wo_mode)¶
Write the lh5 file. This function handles writing first the data as a struct and then appending to this.