analysis

Functions used to read, analyse, and format data.

read

Functions used to read in the raw Balder data and split it into groups. Only read_all will work with the compact Balder grid.

breidablik.analysis.read.get_wavelengths(data_path=None)

Returns the wavelengths for the flux data.

Parameters:

data_path (str, optional) – The folder that the data is stored in. By default, this path points to Balder in breidablik.

Returns:

wl – The wavelengths for the flux data in nm.

Return type:

1darray

breidablik.analysis.read.kfolds(data, k=10, seed=None)

Split the data up into k-folds.

Parameters:

data (dict) – The data that was read in. The outermost keys are the stellar parameters for the models. The next keys are the lithium abundances. The innermost keys are ‘flux’, which retrieves the NLTE flux, or ‘fluxl’, which retrieves the LTE flux.
k (int, optional) – Specifies the number of folds generated. If the data doesn’t divide fully along the folds, then a best split is done.
seed (int, optional) – Seed used to generate random numbers. Set for reproducibility of results.

Returns:

folds – The k-fold data. Outermost keys are the number of the sets (from 0 up to but not including k), the inner keys remain the same as the input data.

Return type:

dict
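
A minimal sketch of the idea, not the breidablik implementation: the model keys are shuffled with a seeded generator and dealt into k groups. The helper name kfolds_sketch, the tuple keys of the toy data, and the reading of "best split" as fold sizes that differ by at most one are assumptions made for illustration.

```python
import random


def kfolds_sketch(data, k=10, seed=None):
    """Illustrative k-fold split over the outermost keys of a nested dict."""
    keys = list(data)
    rng = random.Random(seed)  # seeded generator for reproducible shuffling
    rng.shuffle(keys)
    # Deal the keys round-robin so that fold sizes differ by at most one
    # (an assumed interpretation of "best split").
    folds = {i: {} for i in range(k)}
    for position, key in enumerate(keys):
        folds[position % k][key] = data[key]
    return folds


# Toy input: hypothetical (eff_t, surf_g, met) keys -> abundance -> flux type.
toy = {
    (5000 + 100 * i, 4.5, 0.0): {2.0: {"flux": [1.0], "fluxl": [1.0]}}
    for i in range(23)
}
folds = kfolds_sketch(toy, k=10, seed=0)
sizes = sorted(len(fold) for fold in folds.values())
```

With 23 toy models and k=10, three folds hold three models and seven hold two; passing the same seed again reproduces the same folds.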

breidablik.analysis.read.read(eff_t, surf_g, met, abund, D='3D', a=1.5, v=1, data_path=None)

Reads in the flux data for the specified dimension and stellar parameters. This only works if the loose grid has been downloaded; with the compact grid, use read_all instead.

Parameters:

eff_t (Real) – The temperature closest to the real temperature of the model. This input should be less than 250 K away from the real temperature of the model. If the input temperature is more than 250 K from the real temperature, a warning will be raised; consult the warnings module to see how to change these warnings.
surf_g (Real) – The surface gravity of the stellar model.
met (Real) – The metallicity of the stellar model.
abund (Real) – The lithium abundance of the stellar model.
D (str, optional) – The dimension of the model. Accepted values are either ‘1D’ or ‘3D’.
a (Real or str, optional) – The mixing length parameter. Accepted values are 1, 1.5, or 2. The input can be expressed in any data type that can be converted into a floating point number.
v (Real or str, optional) – The microturbulence parameter. Accepted values are 0, 1, or 2. The input can be expressed in any data type that can be converted into a floating point number.
data_path (str, optional) – The folder that the data is stored in. By default, this path points to Balder in breidablik.

Returns:

flux_data – The NLTE and LTE flux of the specified dimension and stellar model. ‘flux’ contains the NLTE flux, and ‘fluxl’ contains the LTE flux.

Return type:

dict
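
The eff_t description refers to the standard warnings module. The sketch below shows how such a warning can be promoted to an exception or silenced. The function closest_temperature_sketch, the grid values, and the use of UserWarning are hypothetical stand-ins; the warning category that breidablik actually emits is not stated here.

```python
import warnings


def closest_temperature_sketch(eff_t, model_temperatures, tolerance=250):
    """Pick the model temperature nearest to eff_t, warning if it is far away."""
    nearest = min(model_temperatures, key=lambda t: abs(t - eff_t))
    if abs(nearest - eff_t) > tolerance:
        # warnings.warn defaults to the UserWarning category.
        warnings.warn(
            f"Requested {eff_t} K is more than {tolerance} K "
            f"from the nearest model ({nearest} K)."
        )
    return nearest


model_temperatures = [4500, 5000, 5500, 6000]  # made-up grid values

# Promote the warning to an exception for strict pipelines ...
with warnings.catch_warnings():
    warnings.simplefilter("error")
    try:
        closest_temperature_sketch(6400, model_temperatures)
        raised = False
    except UserWarning:
        raised = True

# ... or silence it when the mismatch is expected.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    nearest = closest_temperature_sketch(6400, model_temperatures)
```

Using warnings.catch_warnings() keeps the filter change local to the with block, so the rest of the program still sees the default behaviour.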

breidablik.analysis.read.read_all(D='3D', a=1.5, v=1, data_path=None)

Read in all the data for some dimension.

Parameters:

D (str, optional) – The dimension of the model. Accepted values are either ‘1D’ or ‘3D’.
a (Real or str, optional) – The mixing length parameter. Accepted values are 1, 1.5, or 2. The input can be expressed in any data type that can be converted into a floating point number.
v (Real or str, optional) – The microturbulence parameter. Accepted values are 0, 1, or 2. The input can be expressed in any data type that can be converted into a floating point number.
data_path (str, optional) – The folder that the data is stored in. By default, this path points to Balder in breidablik.

Returns:

data – All the data stored in the specified dimension. The outermost keys are the stellar parameters for the models. The next keys are the lithium abundances. The innermost keys are ‘flux’, which retrieves the NLTE flux, or ‘fluxl’, which retrieves the LTE flux. If split is used then the outermost keys are the split sets (either ‘train’ or ‘test’).

Return type:

dict
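
The three-level nesting described above can be walked as in the following sketch. The dict is a toy stand-in for the returned data: the (eff_t, surf_g, met) tuple keys and all flux values are invented for illustration, and the real key format may differ.

```python
# Toy stand-in for the nested dict; keys and values are invented.
data = {
    (5777, 4.44, 0.0): {
        1.0: {"flux": [1.0, 0.9, 1.0], "fluxl": [1.0, 0.85, 1.0]},
        2.0: {"flux": [1.0, 0.7, 1.0], "fluxl": [1.0, 0.65, 1.0]},
    },
    (6000, 4.0, -1.0): {
        1.0: {"flux": [1.0, 0.95, 1.0], "fluxl": [1.0, 0.92, 1.0]},
    },
}

# Walk stellar parameters -> lithium abundance -> flux type.
rows = []
for stellar_params, abundances in data.items():
    for abund, fluxes in abundances.items():
        nlte_min = min(fluxes["flux"])   # deepest point of the NLTE profile
        lte_min = min(fluxes["fluxl"])   # deepest point of the LTE profile
        rows.append((stellar_params, abund, nlte_min, lte_min))
```

Each entry of rows pairs one model and abundance with the minimum of its NLTE and LTE profiles, which is a quick way to check that both flux types were read.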

breidablik.analysis.read.read_all_abund(eff_t, surf_g, met, D='3D', a=1.5, v=1, data_path=None)

Reads in the fluxes for all lithium abundances for a dimension and stellar model. This only works if the loose grid has been downloaded; with the compact grid, use read_all instead.

Parameters:

eff_t (Real) – The temperature closest to the real temperature of the model. This input should be less than 250 K away from the real temperature of the model. If the input temperature is more than 250 K from the real temperature, a ValueError will be raised; consult the warnings module to see how to change these warnings.
surf_g (Real) – The surface gravity of the stellar model.
met (Real) – The metallicity of the stellar model.
D (str, optional) – The dimension of the model. Accepted values are either ‘1D’ or ‘3D’.
a (Real or str, optional) – The mixing length parameter. Accepted values are 1, 1.5, or 2. The input can be expressed in any data type that can be converted into a floating point number.
v (Real or str, optional) – The microturbulence parameter. Accepted values are 0, 1, or 2. The input can be expressed in any data type that can be converted into a floating point number.
data_path (str, optional) – The folder that the data is stored in. By default, this path points to Balder in breidablik.

Returns:

data – The NLTE and LTE fluxes for all lithium abundances of a model. The outermost keys are the lithium abundances. The inner keys are ‘flux’ (retrieves the NLTE flux) and ‘fluxl’ (retrieves the LTE flux).

Return type:

dict of dict

breidablik.analysis.read.split(data, split, seed=None)

Split up the data into a test and training set.

Parameters:

data (dict) – The data that was read in. The outermost keys are the stellar parameters for the models. The next keys are the lithium abundances. The innermost keys are ‘flux’, which retrieves the NLTE flux, or ‘fluxl’, which retrieves the LTE flux.
split (float, optional) – Specifies the ratio of test data to all data. No split is done if None.
seed (int, optional) – Seed used to generate random numbers. Set for reproducibility of results.

Returns:

split_sets – The split data. Outermost keys are the split sets (either ‘train’ or ‘test’), the inner keys remain the same as the input data.

Return type:

dict
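
A minimal sketch of a seeded test/train split over the model keys, assuming that the test-set size is the split ratio times the number of models, rounded to the nearest integer. The name split_sketch, the rounding rule, and the toy keys are illustrative assumptions rather than the library's behaviour.

```python
import random


def split_sketch(data, split=0.2, seed=None):
    """Illustrative split of the outermost keys into 'train' and 'test'."""
    keys = list(data)
    random.Random(seed).shuffle(keys)   # seeded shuffle for reproducibility
    n_test = round(len(keys) * split)   # split = ratio of test data to all data
    test_keys, train_keys = keys[:n_test], keys[n_test:]
    return {
        "train": {key: data[key] for key in train_keys},
        "test": {key: data[key] for key in test_keys},
    }


# Toy input with hypothetical (eff_t, surf_g, met) keys.
toy = {
    (5000 + 100 * i, 4.5, 0.0): {2.0: {"flux": [1.0], "fluxl": [1.0]}}
    for i in range(10)
}
sets = split_sketch(toy, split=0.2, seed=1)
```

The inner dicts are passed through unchanged, so sets['train'] and sets['test'] keep the same structure as the input data.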

tools

Quality of life functions used to manipulate spectra.

breidablik.analysis.tools.cut(wavelength, line_profile, errors=None, center=670.9659, upper=10, lower=10)

Cuts the wavelength and line profile and returns the values between center - lower and center + upper.

Parameters:

wavelength (List[Real] or 1darray) – Input wavelengths. Needs to be monotonically increasing.
line_profile (List[Real] or 1darray) – Input line profile.
errors (List[Real] or 1darray, optional) – Errors associated with the input line profile. Synthetic spectra have no errors, so leave this as the default value when using them.
center (Real, optional) – The center of the wavelengths where the cut should be taken, in the same units as the wavelength. The 3 lithium lines are centered at 610.5298, 670.9659, and 812.8606 nm in the Balder results.
upper (Positive Real, optional) – The amount to go above the center when taking the cut, in the same units as the wavelength.
lower (Positive Real, optional) – The amount to go below the center when taking the cut, in the same units as the wavelength.

Returns:

cut_data – Cut wavelengths and line profiles, errors if provided.

Return type:

ndarray
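
The windowing can be sketched with the standard bisect module, which also shows why the wavelengths must be monotonically increasing. This is an illustration on plain lists, not the library code; cut_sketch is a hypothetical name, and treating both window edges as inclusive is an assumption.

```python
from bisect import bisect_left, bisect_right


def cut_sketch(wavelength, line_profile, center=670.9659, upper=10, lower=10):
    """Keep the points with center - lower <= wavelength <= center + upper."""
    # bisect requires sorted input, hence the monotonic-wavelength requirement.
    start = bisect_left(wavelength, center - lower)
    stop = bisect_right(wavelength, center + upper)
    return wavelength[start:stop], line_profile[start:stop]


# Toy spectrum in nm around the 670.9659 nm lithium line.
wavelength = [650.0, 660.0, 665.0, 670.0, 675.0, 680.0, 690.0]
line_profile = [1.0, 1.0, 0.98, 0.6, 0.97, 1.0, 1.0]
wl_cut, profile_cut = cut_sketch(wavelength, line_profile, upper=5, lower=5)
```

With upper=5 and lower=5 the window spans 665.9659 to 675.9659 nm, so only the points at 670.0 and 675.0 nm survive.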

breidablik.analysis.tools.cut_wavelength(wavelength, center=670.9659, upper=10, lower=10)

Cuts the wavelength and returns the values between center - lower and center + upper. Mostly useful for plotting, because many functions return a cut line profile but not the cut wavelength.

Parameters:

wavelength (List[Real] or 1darray) – Input wavelengths. Needs to be monotonically increasing.
center (Real, optional) – The center of the wavelengths where the cut should be taken, in the same units as the wavelength. The 3 lithium lines are centered at 610.5298, 670.9659, and 812.8606 nm in the Balder results.
upper (Positive Real, optional) – The amount to go above the center when taking the cut, in the same units as the wavelength.
lower (Positive Real, optional) – The amount to go below the center when taking the cut, in the same units as the wavelength.

Returns:

wl_cut – Cut wavelengths.

Return type:

2darray

breidablik.analysis.tools.rew(wavelength, line_profile, center=670.9659, upper=10, lower=10, num=10000)

Calculates the reduced equivalent width (REW) of the line profile between center - lower and center + upper.

Parameters:

wavelength (List[Real] or 1darray) – Input wavelengths. Needs to be monotonically increasing.
line_profile (List[Real] or 1darray) – Input line profile.
center (Real, optional) – The center of the wavelengths where the REW should be calculated from, in the same units as the wavelength. The 3 lithium lines are centered at 610.5298, 670.9659, and 812.8606 nm in the Balder results.
upper (Positive Real, optional) – The amount to go above the center when calculating the REW, in the same units as the wavelength.
lower (Positive Real, optional) – The amount to go below the center when calculating the REW, in the same units as the wavelength.
num (Int, optional) – The number of points in the interpolation. Before calculating the REW, the line profile is interpolated to finer wavelength points.

Returns:

rew – The REW.

Return type:

float
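
A sketch of the calculation under stated assumptions: the flux is continuum-normalised, the REW follows the usual definition log10(EW / center), the interpolation is linear, and the integral uses the trapezoidal rule. None of these choices is confirmed by the description above, and rew_sketch is a hypothetical name.

```python
import math


def rew_sketch(wavelength, line_profile, center=670.9659, upper=10, lower=10, num=10000):
    """Illustrative REW: log10 of (equivalent width / central wavelength)."""
    lo, hi = center - lower, center + upper
    step = (hi - lo) / (num - 1)

    # Linearly interpolate the profile onto num evenly spaced points.
    depths = []
    j = 0
    for i in range(num):
        x = lo + i * step
        # Advance to the input segment [wavelength[j], wavelength[j + 1]] holding x.
        while j < len(wavelength) - 2 and wavelength[j + 1] < x:
            j += 1
        x0, x1 = wavelength[j], wavelength[j + 1]
        y0, y1 = line_profile[j], line_profile[j + 1]
        flux = y0 + (y1 - y0) * (x - x0) / (x1 - x0)
        depths.append(1.0 - flux)  # line depth below an assumed continuum of 1

    # Trapezoidal integration of the depth gives the equivalent width.
    ew = sum((depths[i] + depths[i + 1]) / 2 * step for i in range(num - 1))
    return math.log10(ew / center)


# Toy triangular line: 1 nm wide at the base, 0.5 deep, so EW = 0.25 nm.
center = 670.9659
wavelength = [center - 10, center - 0.5, center, center + 0.5, center + 10]
line_profile = [1.0, 1.0, 0.5, 1.0, 1.0]
value = rew_sketch(wavelength, line_profile)
```

For this toy triangle the result should be close to log10(0.25 / 670.9659), which gives a simple sanity check on the integration.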

format_read

Functions used to transform data into a format that can be easily used in a machine learning model.

breidablik.analysis.format_read.pixel_format(data, wavelength, center=670.9659, lower=0.4, upper=0.4, ftype='flux')

Changes the data from read into a machine learning format. This function is for machine learning over pixels.

Parameters:

data (dict) – Flux data from the read functions. The outermost keys are the stellar parameters for the models. The next keys are the lithium abundances. The innermost keys are ‘flux’, which retrieves the NLTE flux, or ‘fluxl’, which retrieves the LTE flux. All data must be located at the same wavelength points.
wavelength (List[Real] or 1darray) – The wavelengths that correspond to the data. From read.get_wavelengths().
center (Real, optional) – The center of the wavelengths where the cut should be taken, in the same units as the wavelength. The 3 lithium lines are centered at 610.5298, 670.9659, and 812.8606 nm in the Balder results.
upper (Real, optional) – The amount to go above the center when taking the cut, in the same units as the wavelength.
lower (Real, optional) – The amount to go below the center when taking the cut, in the same units as the wavelength.
ftype (str, optional) – Which type of flux to convert from the data. Accepted options are: ‘flux’ for NLTE or ‘fluxl’ for LTE.

Returns:

Xy – The X and y data sets in the form (X, y). X contains [num of objects x num of parameters], and y contains [num of objects x num of pixels].

Return type:

tuple of 2darrays
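
The (X, y) layout can be illustrated with the sketch below, which flattens a toy nested dict into one row per model and abundance. It assumes that the X columns are the stellar parameters followed by the lithium abundance and that the keys are (eff_t, surf_g, met) tuples; both are guesses made for illustration, and pixel_format_sketch is a hypothetical name.

```python
def pixel_format_sketch(data, wavelength, center=670.9659, lower=0.4, upper=0.4, ftype="flux"):
    """Illustrative flattening of nested flux data into (X, y) row lists."""
    # Indices of the pixels inside the window around the line center.
    keep = [i for i, wl in enumerate(wavelength)
            if center - lower <= wl <= center + upper]
    X, y = [], []
    for stellar_params, abundances in data.items():
        for abund, fluxes in abundances.items():
            X.append([*stellar_params, abund])           # one row of parameters
            y.append([fluxes[ftype][i] for i in keep])   # one row of pixels
    return X, y


wavelength = [670.0, 670.6, 670.9, 671.2, 672.0]  # toy grid in nm
data = {
    (5777, 4.44, 0.0): {
        1.0: {"flux": [1.0, 0.9, 0.6, 0.9, 1.0], "fluxl": [1.0, 0.9, 0.5, 0.9, 1.0]},
        2.0: {"flux": [1.0, 0.8, 0.4, 0.8, 1.0], "fluxl": [1.0, 0.8, 0.3, 0.8, 1.0]},
    },
    (6000, 4.0, -1.0): {
        1.0: {"flux": [1.0, 0.95, 0.7, 0.95, 1.0], "fluxl": [1.0, 0.95, 0.65, 0.95, 1.0]},
    },
}
X, y = pixel_format_sketch(data, wavelength)
```

Here X has three rows of four parameters and y has three rows of three pixels, matching the [num of objects x num of parameters] and [num of objects x num of pixels] shapes described above.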

breidablik.analysis.format_read.rew_format(data, wavelength, predict='rew', center=670.9659, upper=10, lower=10, ftype='flux', num=10000)

Changes the data from read into a machine learning format. This function is for machine learning over REWs.

Parameters:

data (dict) – Flux data from the read functions. The outermost keys are the stellar parameters for the models. The next keys are the lithium abundances. The innermost keys are ‘flux’, which retrieves the NLTE flux, or ‘fluxl’, which retrieves the LTE flux. All data must be located at the same wavelength points.
wavelength (List[Real] or 1darray) – The wavelengths that correspond to the data. From read.get_wavelengths().
predict (str, optional) – Determines what value is placed in the y data. Accepted options are ‘rew’ and ‘li’.
center (Real, optional) – The center of the wavelengths where the cut should be taken, in the same units as the wavelength. The 3 lithium lines are centered at 610.5298, 670.9659, and 812.8606 nm in the Balder results.
upper (Real, optional) – The amount to go above the center when taking the cut, in the same units as the wavelength.
lower (Real, optional) – The amount to go below the center when taking the cut, in the same units as the wavelength.
ftype (str, optional) – Which type of flux to convert from the data. Possible options are: ‘flux’ for NLTE or ‘fluxl’ for LTE.
num (Int, optional) – The number of points in the interpolation. Before calculating the REW, the line profile is interpolated to finer wavelength points.

Returns:

Xy – The X and y data sets in the form (X, y). X contains [num of objects x num of parameters], and y contains [num of objects].

Return type:

tuple of 2darrays