Analysis and configuration guide
Run the analysis
After the installation, the CLI can be used with the command:
To run the analysis, the requested arguments are the .yaml configuration file path and the data files or folders paths. The .yaml configuration file path is always the first argument.
To analyze one or more source files, the command needs a sequence of source file paths and a calibration with pulsed data as the last argument. This is an example of the analysis of two source files using the same calibration file:
To analyze one or more folders, the order of the arguments is the same as the previous case, with the exception that no calibration file has to be specified, as it is supposed that each folder contains its own calibration file:
The default directory to search for data files or folders is the data directory in the package root. If a file or a folder is inside this directory, there is no need to specify all the path of the folder, but the relative path from the data directory is enough to locate it.
It is also possible to save the analysis results (e.g. plots and numerical values) inside the results directory, which is automatically created in the user's home. This option can be enabled using the optional argument -s (or --save) in the CLI command. It is also possible to choose the format to use to save the plots (pdf or png). An example of command to save the results is:
File naming
To correctly analyze the data, the files have to be named in a specific way, to correctly identify file type, the voltages and also the chronological order.
Calibration files
For calibration files, the name has to be of the form (case-insensitive):
where voltages is a sequence of integers separated by - or _. An example of correct file name is data_ci1-2-3-4mV.mca.
Source files
For source files, the name must contain the information about the back B[back_voltage] and the drift D[drift_voltage] voltages. The order in which they appear and where they are written are not important, but in this case it is case-sensitive, so B and D must be capital letters. An example of correct name is data_D1000_B300.mca
Time trend files
When analyzing data and the chronological order is important, the naming convention is similar to that of source files, with the exceptions that the order of the voltages matters. In particular, the back voltage must be the last to be specified, and after that an index that indicates the order has to be written. The name should be of the following form:
An example of correct file name is data_D1000B380_40.mca
Write the configuration file
To run the analysis, a configuration file must always be specified. This section explains how to write a correct configuration file.
The configuration file is a .yaml file which contains all the information about the acquisition (e.g. the date, the detector, the source, etc...) and the analysis pipeline. This file allows to write an analysis pipeline in which each task can be finely configured, contrarly to a simple CLI.
Warning: when a key is optional and you want to set it to the default value, all the line must be deleted, not only the value, otherwise a type error can be raised or a None value can be assigned to the key.
Acquisition
The acquisition dictionary stores information about the acquisition, such as the date of the acquisition, the name and type of the chip and other detector and source properties. These properties are stored as keys of the dictionary, and a value is assigned to each of them.
The acquisition dictionary is optional and none of its field is mandatory
Note: currently these info are not being used for any purpose, but in the future they can be useful for some tasks.
Pipeline
The pipeline is the core of the analysis, where everything that has to be calculated is defined. Each step of the pipeline is a task that contains all the properties that are needed to configure and execute it. The pipeline is structured as a sequence of tasks. Each task is a dictionary, with its own key-value pairs. A task is defined by its name, and the other keys define its behaviour.
The tasks order in the configuration file does not affect execution, as priority is automatically assigned to the calibration task (which is mandatory for the pipeline) followed by the fit_spec task (which is optional, e.g. if you only want to plot the data without doing any analysis).
Apart from the calibration and fit_spec tasks, which are executed once at the beginning of the analysis, all the other tasks can be written multiple times, specifying different target or configuration parameters. As an example, if you have multiple emission lines in a spectrum and you want to estimate the resolution from each of them independently, it is possible to write a resolution task for each of them by specifying the target declared during the fit_spec subtasks. See the next sections for more details.
For a complete description of the possible keys of each task, please see Tasks.
Calibration
This task performs the calibration between the ADC counts of the multi-channel analyzer and the charge (or voltage). It is performed using the calibration pulse file specified in the command or located into the folder. This file contains data at fixed known voltages (which must be written in the name), and after fitting each pulse with a Gaussian curve, a linear fit of the peak positions is performed to find the conversion parameters.
pipeline:
- task: calibration
charge_conversion: true # Optional, whether to convert to voltage or charge
show: false # Optional, plot the calibration results
Spectral fitting
This task performs the spectral fitting on one or multiple emission lines at the same time. The task is divided into subtasks, each of them defined by the target (a name given to the subtask), and the model to use for the emission line fit (which must be Gaussian or Fe55Forest). These keys are mandatory for all the subtasks.
A subtask also has another optional key, which is fit_pars, that allows to specify some properties of the fitting procedure. If the fit_pars key is written in the configuration file, at least one property must be specified, but none of them is mandatory.
- task: fit_spec
subtasks:
- target: main_peak
model: Fe55Forest
- target: escape_peak
model: Gaussian
fit_pars: # Optional
xmin: 2.0 # Optional, left range limit for the fit
xmax: 4. # Optional, right range limit for the fit
num_sigma_left: 1. # Optional, number of sigma to the left of the line center for the fit
num_sigma_right: 1. # Optional, number of sigma to the right of the line center for the fit
absoulute_sigma: true # Optional, set absolute sigma
p0: [1., 1., 1.] # Optional, init parameters for the line fit
Gain estimate
This task performs the estimate of the gain using the spectral fitting results of a given target, previously specified during the fit_spec task.
If the analysis is performed on a single file, there are no other keys to specify. If the analysis is performed on multiple files or on one or multiple folders, the fit and plot can be specified.
Note: no error is raised if these optional keys are specified during the analysis of a single file, but no fit will be performed and no plot will be shown.
- task: gain
target: main_peak
fit: true # Optional, fit the data with an exponential model
show: true # Optional, plot gain vs back voltage
Gain trend with time
This task performs the study of the gain as a function of time. The gain is estimated in the same way as the gain task. For each source file, the time and the length of the acquisition are extracted from the header. The time info are combined together to get the variation of the gain with time.
The gain trend can also be analyzed using fitting subtasks, which allow to fit the data with a model or a composition of models from aptapy.models. These subtasks share the same syntax of the spectral fitting subtasks.
If more than one trend is visible in the same data, multiple targets can be specified, and the fit range can be adjusted to each trend.
- task: gain_trend
target: main_peak
subtasks: # Optional, fitting subtasks
target: trend
model: Exponential + Constant # Composition of models
Gain compare between folders
This task allows to compare the gain results of two or more different folders on the same plot. To execute this task, it is required that at least a gain task has been completed on a given target emission line.
It is also possible to combine the gain estimates from a list of folders specified in the combine key, and also perform a single exponential fit on the data from these folders.
If there are multiple data taken at a given voltage, the mean of the gain estimated from these files is calculated.
- task: compare_gain
target: main_peak
combine: # Optional, combine the data of all the folders
- folder0
- folder1
Drift varying
Resolution estimate
This task performs the estimate of the resolution of the detector using the spectral fitting results of a given target, previously specified during the fit_spec task. This estimate is made using only the fitted FWHM of the line and the calibrated charge of the line center:
$$
\frac{\Delta E}{E} = \frac{\text{FWHM}}{E}.
$$
The result is reported as a percentage.
As for the gain task, the show key works only if the analysis is performed on multiple files or folders
Resolution estimate with escape peak
This task performs the estimate of the resolution of the detector using the spectral fitting results of a main line emission and the correspondent escape peak. The two emission lines are specified by the target_main and target_escape keys. The resolution is estimated as:
$$
\frac{\Delta E}{E} = \frac{\text{FWHM}}{E_{main}} \frac{E_{main} - E_{peak}}{x_{main} - x_{peak}},
$$
where \(E_{main}\) and \(E_{peak}\) are the theoretical emission energies of the main and the escape peaks, and \(x_{main}\) and \(x_{peak}\) are the fitted line center positions. This estimate and the previous lead to the same result if the charge calibration has been correctly performed.
The show and fit keys can be specified if working on multiple files.
Resolution compare between folders
This task allows to compare the energy resolution between different folders. As for compare_gain task, you need first to perform a resolution task on a target to compare the results.
You can combine the results of different folders by specifying a list with the names of folders to combine in the combine key.
- task: compare_resolution
target: main_peak
combine: # Optional, combine the results from the specified folders
- folder0
- folder1
Plot the spectrum
This task is used to plot the spectrum of all the analyzed files. It is possible to plot the spectrum without performing the spectral fitting before writing only the name of the task.
In the case of spectral fitting and physical quantities estimate, it is possible to specify the target to plot, the general label of the plot, if you want to show the quantity estimates in the legend using task_labels and where to show the legend with loc. It is also possible to set a specific x-axis range for the plot using xrange or modify the automatic one by a factor on the left and/or on the right with xmin_factor and xmax_factor. If the xrange has been specified, it has the priority over the other two keys. You can decide whether to show the plots with the show key.
- task: plot
targets: # Optional, target fit to show
- main_peak
- escape_peak
title: Example title # Optional, the title of the plot
label: Example label # Optional, legend label of the plot
task_labels: # Optional, estimated quantities to show in the legend
- gain
loc: best # Optional, position of the legend in the plot
xrange: [0., 3.] # Optional, the x-axis range to show on the plot
show: true # Optional, whether to show the plot
xmin_fac: 0.5 # Optional, scale the automatic xrange left limit by a factor
xmax_fac: 1.5 # Optional, scale the automatic xrange right limit by a factor
Style
The style dictionary allows to set the style, labels and titles of the plot.