pyird.spec package

Submodules

pyird.spec.continuum module

class pyird.spec.continuum.ContinuumFit(continuum_nonzero_ind=[10, 60], base_order_fit=23, max_order_fit=50, nsigma_continuumfit=[2, 3], maxiter_continuumfit=3)

Bases: object

Continuum Fitting Class

Algorithm overview:

This class fits a smooth continuum per spectral order using iterative sigma-clipped polynomial regression.

For each order:

1. Mask NaNs, zero-flux pixels, and edge regions defined by continuum_nonzero_ind.

2. Fit either a centered Vandermonde polynomial or a Legendre polynomial of degree order to the current inliers.

3. Compute residuals r = y - model and the robust scatter sigma = sqrt(median(r**2)).

4. Keep samples within [-low*sigma, +high*sigma] and iterate until the inlier count stops changing or maxiter_continuumfit is reached.

Across the full detector:

  • Begin with base_order_fit.

  • If any non-ignored order exceeds a residual RMS threshold (1.5%), increment the polynomial degree and refit from the start (up to max_order_fit).

  • Interpolate the final continuum back to all pixels.

This yields a robust, smooth continuum while suppressing strong absorption features and noise-driven outliers.
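The per-order procedure can be pictured with the following minimal sketch, assuming NumPy arrays of pixel positions and fluxes; the function name, defaults, and Legendre-only fit are illustrative assumptions, not the actual pyird implementation.

    import numpy as np
    from numpy.polynomial import legendre

    def sigma_clip_continuum(x, y, degree=23, nsigma=(2, 3), maxiter=3):
        """Illustrative sigma-clipped Legendre continuum fit (not the pyird implementation)."""
        mask = np.isfinite(y) & (y != 0)               # drop NaNs and zero-flux pixels
        for _ in range(maxiter):
            coef = legendre.legfit(x[mask], y[mask], degree)
            model = legendre.legval(x, coef)
            resid = y - model
            sigma = np.sqrt(np.median(resid[mask] ** 2))   # robust scatter
            new_mask = mask & (resid > -nsigma[0] * sigma) & (resid < nsigma[1] * sigma)
            if new_mask.sum() == mask.sum():               # inlier count converged
                break
            mask = new_mask
        return legendre.legval(x, coef)                    # continuum evaluated on all pixels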

continuum_nonzero_ind

tuple of int Pixel indices to mask at both ends of each order.

base_order_fit

int Starting polynomial order for fitting.

max_order_fit

int Maximum allowed polynomial order to avoid overfitting.

nsigma_continuumfit

tuple of float (low, high) sigma limits for clipping.

maxiter_continuumfit

int Maximum number of sigma-clip iterations applied.

continuum_oneord(wdata, flat, order)

fit continuum for a single echelle order.

Parameters:
  • wdata – the wavelength calibrated target spectrum (columns: "wav"/"order"/"flux")

  • flat – the wavelength calibrated FLAT (columns: "wav"/"order"/"flux")

  • order – order number to fit

Returns:

spectrum and continuum of the order

continuum_rsd(rsd, npix=2048, ignore_orders=None)

fit continuum for a raw spectrum detector (RSD) matrix.

Parameters:
  • rsd – raw spectrum detector matrix

  • npix – number of pixels

  • ignore_orders – list of orders excluded when evaluating the goodness of the fit

Returns:

pandas DataFrame of continuum

fit_continuum(x, y, order=6, fitfunc='legendre')

Fit the continuum with iterative sigma clipping

Parameters:
  • x – the wavelengths

  • y – the log-fluxes

  • order – the polynomial order to use

  • fitfunc – fitting function (default is legendre)

Returns:

The value of the continuum at the wavelengths in x
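A hedged usage sketch of fit_continuum with synthetic data; the pixel grid and log-flux values below are placeholders, and the keyword values follow the signature shown above.

    import numpy as np
    from pyird.spec.continuum import ContinuumFit

    cf = ContinuumFit()                      # defaults: base_order_fit=23, nsigma_continuumfit=[2, 3]
    x = np.arange(2048, dtype=float)         # pixel (or wavelength) grid
    y = np.log(1000.0 + 50.0 * np.sin(x / 300.0))   # synthetic log-flux with a smooth trend
    continuum = cf.fit_continuum(x, y, order=6, fitfunc='legendre')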

pyird.spec.normalize module

class pyird.spec.normalize.SpectrumNormalizer(combfile=None, interp_nonzero_ind=[2, 9], nsigma_sigmaclip=[1, 2], maxiter_sigmaclip=3)

Bases: ContinuumFit, FluxUncertainty

Normalize 1D spectra and propagate uncertainties.

This class (1) derives a blaze/continuum from FLAT or a provided blaze file, (2) normalizes target spectra order-by-order and combines them across overlaps, and (3) computes per-pixel S/N and normalized uncertainties.

interp_nonzero_ind

tuple[int, int] Number of pixels to trim at the start and end of each order where the flux is zero.

nsigma_sigmaclip

tuple[float, float] Sigma-clipping thresholds (lower, upper).

maxiter_sigmaclip

int Maximum number of iterations for sigma clipping.

(Inherited from FluxUncertainty)

See that class for attributes related to gain selection and readout-noise estimation.

blaze_to_df(wdata, blaze)

divide target flux by the blaze function

Parameters:
  • wdata – the wavelength calibrated 1D target spectrum ("wav", "order", "flux").

  • blaze – blaze function ("wav", "order", "flux").

Returns:

Per-order dataframe including columns "wav", "order", "flux", "continuum", "nflux", "sn_ratio", "tmp_uncertainty".

combine_normalize(wfile, flatfile, blaze=True)

read .dat file and make 1D normalized spectrum

Parameters:
  • wfile – path to the wavelength calibrated 1D target spectrum

  • flatfile – path to the wavelength calibrated 1D FLAT or blaze file

  • blaze – True when self.apext_flatfield() was used; if False, the blaze function will be created from the 1D FLAT

Returns:

two pandas.DataFrames of the 1D normalized spectrum:

  • df_continuum: per-order dataframe containing columns such as "wav", "order", "flux", "continuum", "nflux", "sn_ratio", "tmp_uncertainty".

  • df_interp: final 1D combined dataframe (columns: "wav", "flux", "continuum", "sn_ratio", "tmp_uncertainty", later augmented by divide_by_continuum() with "nflux" and "uncertainty").
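A hedged end-to-end usage sketch, assuming wavelength-calibrated .dat files produced by earlier pyird reduction steps; all file paths here are placeholders.

    from pyird.spec.normalize import SpectrumNormalizer

    sn = SpectrumNormalizer(combfile="path/to/comb_mmf1.dat")   # combfile is optional
    df_continuum, df_interp = sn.combine_normalize(
        "path/to/target_wavcal.dat",   # wavelength calibrated 1D target spectrum
        "path/to/flat_wavcal.dat",     # wavelength calibrated 1D FLAT (or blaze) file
        blaze=True,                    # True if the blaze was made with apext_flatfield()
    )
    print(df_interp.head())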

concat_nonoverlap_region(df_former, df_latter, df_interp, order, min_order)

concatenate the data of the nonoverlap region of df_former and df_latter

Parameters:
  • df_former – pandas.DataFrame of the former order

  • df_latter – pandas.DataFrame of the latter order

  • df_interp – pandas.DataFrame of the interpolated data

  • order – number of the order currently being processed

  • min_order – number of the initial order

Returns:

concatenated DataFrame

concat_nonoverlap_region_last_order(df_former, df_interp)

concatenate the data of the nonoverlap region of df_former and df_interp for the last order

Parameters:
  • df_former – pandas.DataFrame of the former order

  • df_interp – pandas.DataFrame of the interpolated data

Returns:

concatenated DataFrame

concat_overlap_region(df_former, df_latter, df_interp)

concatenate the data of the overlapping region of df_former and df_latter

Parameters:
  • df_former – pandas.DataFrame of the former order

  • df_latter – pandas.DataFrame of the latter order

  • df_interp – pandas.DataFrame of the interpolated data

Returns:

concatenated DataFrame

define_former_and_latter(df_continuum, order, max_order)

define data of the former order (order N) and the latter order (order N+1)

Parameters:
  • df_continuum – pandas.DataFrame that contains the blaze function

  • order – order number (N)

  • max_order – the maximum order number

Returns:

pandas.DataFrames of the former order and the latter order

determine_scale_continuum(wdata, flat, standard_order)

determine the scaling factor of the blaze function

Parameters:
  • wdata – the wavelength calibrated target spectrum (columns: "wav"/"order"/"flux").

  • flat – the wavelength calibrated FLAT with the same column schema as wdata.

  • standard_order – if an order number is given, the blaze functions are standardized based on that order

Returns:

scaling factor of the blaze function

divide_by_continuum(df_interp)

Finalize normalization by dividing through the continuum.

Adds normalized flux and uncertainty columns, skipping zero-continuum samples to avoid division by zero.

Parameters:

df_interp – pandas.DataFrame that must contain "flux", "continuum", and "tmp_uncertainty".

Returns:

  • "nflux": normalized flux,

  • "uncertainty": normalized uncertainty (temporary uncertainty / continuum).

Return type:

The input dataframe with added columns
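Conceptually, the division step behaves like the following sketch, assuming a pandas DataFrame with the required columns; this is an illustration, not the exact implementation.

    import numpy as np
    import pandas as pd

    def divide_by_continuum_sketch(df: pd.DataFrame) -> pd.DataFrame:
        """Illustrative normalization step: divide flux and uncertainty by the continuum."""
        ok = df["continuum"] != 0                        # skip zero-continuum samples
        df.loc[ok, "nflux"] = df.loc[ok, "flux"] / df.loc[ok, "continuum"]
        df.loc[ok, "uncertainty"] = df.loc[ok, "tmp_uncertainty"] / df.loc[ok, "continuum"]
        df.loc[~ok, ["nflux", "uncertainty"]] = np.nan   # leave masked samples undefined
        return df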

make_blaze(wdata, flat, standard_order=None)

extract blaze function for target based on FLAT

Parameters:
  • wdata – the wavelength calibrated target spectrum

  • flat – the wavelength calibrated FLAT

  • standard_order – if an order number is given, the blaze functions are standardized based on that order

Returns:

the blaze function created by scaling the FLAT by a constant. Columns include "wav", "order", "flux", "flat", "continuum", "nflux", and uncertainty-related fields.

normalize(df_continuum)

normalize flux after combining all orders

Trims zero-flux edges, concatenates non-overlap and overlap regions in order, computes combined S/N, then divides by continuum.

Parameters:

df_continuum – pandas.DataFrame that contains the blaze function

Returns:

pandas.DataFrame of 1D normalized spectrum with columns "wav", "flux", "continuum", "sn_ratio", "tmp_uncertainty", augmented later by divide_by_continuum().

objective_function(scale, flux, continuum, sigma_lower, sigma_upper, maxiter_sigmaclip)

objective function to optimize the scaling factor of the blaze function

Parameters:
  • scale – scaling factor of continuum

  • flux – flux

  • continuum – continuum

  • sigma_lower – sigma clipping threshold (lower)

  • sigma_upper – sigma clipping threshold (upper)

  • maxiter_sigmaclip – the maximum number of iterations for sigma clipping

Returns:

absolute sum of the residuals after sigma clipping
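One way to picture this objective and its use is the sketch below, assuming astropy-style sigma clipping and a bounded scalar optimizer; both choices and the synthetic data are illustrative assumptions.

    import numpy as np
    from astropy.stats import sigma_clip
    from scipy.optimize import minimize_scalar

    def objective_sketch(scale, flux, continuum, sigma_lower=1.0, sigma_upper=2.0, maxiter=3):
        """Illustrative objective: absolute sum of residuals after sigma clipping."""
        resid = flux - scale * continuum
        clipped = sigma_clip(resid, sigma_lower=sigma_lower, sigma_upper=sigma_upper,
                             maxiters=maxiter)
        return np.abs(np.sum(clipped.compressed()))

    # Hypothetical usage with synthetic data:
    flux = np.random.default_rng(0).normal(1.3, 0.05, 500)   # target flux (placeholder)
    continuum = np.ones(500)                                  # blaze/continuum (placeholder)
    res = minimize_scalar(objective_sketch, bounds=(0.1, 10.0), method="bounded",
                          args=(flux, continuum))
    print(res.x)   # optimized scaling factor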

trim_nonzero_flux(df)

cut pixels in the zero flux region at both ends of the order

Parameters:

df – pandas.DataFrame that contains ‘flux’ column

Returns:

pandas.DataFrame which is cut by +/- interp_nonzero_ind pixels

pyird.spec.rsdmat module

Raw Spectral Detector matrix (RSD matrix)

pyird.spec.rsdmat.multiorder_to_rsd(rawspec, pixcoord, npix=2048, fill_value=nan)

convert multiorder rawspec + pixcoord to an RSD matrix.

Parameters:
  • rawspec – multiorder rawspec

  • pixcoord – multiorder pixel coordinate

  • npix – number of detector pixels in y direction

  • fill_value – filled value in empty elements

Returns:

RSD matrix (npix x norder)
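A minimal sketch of the conversion, assuming rawspec and pixcoord are per-order sequences of counts and integer pixel coordinates; it mirrors the documented behavior rather than the exact implementation.

    import numpy as np

    def multiorder_to_rsd_sketch(rawspec, pixcoord, npix=2048, fill_value=np.nan):
        """Place each order's counts at its pixel coordinates in an (npix x norder) matrix."""
        norder = len(rawspec)
        rsd = np.full((npix, norder), fill_value, dtype=float)
        for i, (spec, pix) in enumerate(zip(rawspec, pixcoord)):
            rsd[np.asarray(pix, dtype=int), i] = spec   # unfilled elements keep fill_value
        return rsd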

pyird.spec.rsdmat.rsd_order_medfilt(rsd, kernel_size=9)

median filtering for spectrum

Parameters:
  • rsd – RSD matrix (npix x norder)

  • kernel_size – kernel size for median filter

Returns:

median filtered RSD matrix

pyird.spec.uncertainty module

class pyird.spec.uncertainty.FluxUncertainty(wav_boundary_yj_and_h=1420, gain_y=2.99, gain_h=2.78, combfile=None, comb_readout_noise_wav=[950, 1000], calc_method_readout_noise='mad')

Bases: object

Compute flux uncertainties and S/N for IRD spectra.

Algorithm overview:

Computes per-pixel S/N and flux uncertainties using a Poisson + read-noise model, with automatic band-dependent gain selection and optional read-noise estimation from a comb (LFC) file.

Noise model (units):

  • Input flux is in ADU; gains are in e-/ADU; readout noise is in e-.

  • Variance (in ADU^2) is modeled as:

    var_total = flux / gain + readout_noise**2

  • Temporary (pre-normalization) uncertainty is:

    tmp_uncertainty = sqrt(var_total)

  • Signal-to-noise ratio (dimensionless) is:

    sn_ratio = (gain * flux) / sqrt(gain * flux + (gain * readout_noise)**2)

Gain selection:

  • If any wavelength in a chunk exceeds wav_boundary_yj_and_h, use gain_h; otherwise use gain_y.

Readout-noise estimation:

  • If a comb file is provided, restrict to comb_readout_noise_wav = [low, high] and estimate the noise from the selected flux samples using either the MAD (scaled by 1.4826) or the standard deviation; otherwise fall back to default_readout_noise.
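The model can be evaluated directly as in the sketch below, which simply transcribes the formulas quoted above (flux in ADU, gain in e-/ADU, readout noise in e-); the function name is illustrative.

    import numpy as np

    def snr_and_tmp_uncertainty(flux_adu, gain, readout_noise_e):
        """Evaluate the documented Poisson + read-noise model (direct transcription)."""
        var_total = flux_adu / gain + readout_noise_e ** 2   # variance in ADU^2, as documented
        tmp_uncertainty = np.sqrt(var_total)                 # pre-normalization uncertainty
        sn_ratio = (gain * flux_adu) / np.sqrt(gain * flux_adu + (gain * readout_noise_e) ** 2)
        return sn_ratio, tmp_uncertainty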

wav_boundary_yj_and_h

int Boundary wavelength between the Y/J and H bands (in the same units as input data).

gain_y

float Gain of the IRD Y/J-band detector (e-/ADU).

gain_h

float Gain of the IRD H-band detector (e-/ADU).

combfile

str | None Path to the LFC (mmf1) file used to estimate the readout noise, if provided.

comb_readout_noise_wav

tuple[int, int] Wavelength window [low, high] used to compute the readout noise from the comb.

method

{“mad”, “std”} Method to estimate the readout noise from the comb data: Median Absolute Deviation (scaled) or standard deviation.

default_readout_noise

float Default readout noise in electrons (IRD detectors: ~12 e- for a 10 min exposure).

scale_normalized_mad

float Scaling factor (1.4826) to convert MAD to an estimate of the standard deviation.

readout_noise

float Effective readout noise used in calculations. Determined during initialization.

gain

float The detector gain actually used for calculations (either gain_y or gain_h). Note: This attribute is set lazily by determine_gain() (called inside calc_uncertainty()), so it may not exist until those methods are invoked.

calc_uncertainty(df_continuum)

calculation of uncertainty

Parameters:
  • df_continuum – DataFrame that contains at least the columns "wav", "flux", and "continuum". If "sn_ratio" or "tmp_uncertainty" already exist, they will be overwritten.

Returns:

The temporary (pre-normalization) uncertainty is stored in the column "tmp_uncertainty" as an intermediate product.

calc_uncertainty_overlap_region(df_head, df_tail)

calculate the signal-to-noise ratio and temporary uncertainty of each data point.

For wavelengths in the head order (df_head), returns S/N and temporary uncertainty by referencing the tail order (df_tail) in the overlapping range. If an exact wavelength match is found in df_tail, reuse its sn_ratio; otherwise, interpolate between the nearest two tail samples and propagate uncertainty using linear weights.

Notes

the wavelength range of df_tail should contain that of df_head, i.e., max(df_tail[‘wav’]) > max(df_head[‘wav’]) and min(df_tail[‘wav’]) < min(df_head[‘wav’]). See the master’s thesis by Ziying Gu for details.

Parameters:
  • df_head – the spectrum of the latter order in the overlap region. Must contain "wav" and "flux"; if available, "sn_ratio" is used.

  • df_tail – the spectrum of the former order in the overlap region. Must contain "wav", "flux", and "sn_ratio".
Returns:

  • sn_ratio – signal-to-noise ratio of each data point

  • tmp_uncertainty – the uncertainty of the flux before normalization (an intermediate product for calculating the final uncertainty)

calculate_tmp_uncertainty(left_flux, right_flux, left_scale, right_scale)

calculate the temporary uncertainty for interpolated points

Parameters:
  • left_flux – flux at a point to the left (shorter wavelength) of the point after interpolation

  • right_flux – flux at a point to the right (longer wavelength) of the point after interpolation

  • left_scale – weight for left uncertainty

  • right_scale – weight for right uncertainty

Returns:

temporary uncertainty for the interpolated points

determine_gain(df_continuum)

determine the gain to use based on wavelength

Parameters:

df_continuum – DataFrame that contains wavelength data in ‘wav’ column

Returns:

gain of the corresponding band

determine_readout_noise()

determine the readout noise

Returns:

the readout noise; if combfile is defined, a value estimated from the comb data, otherwise the default value.

split_tail_by_wavhead(wav_tail, wav_head_i, flux_tail)

split wavelengths and fluxes in the tail by the wavelength in the head

Finds two adjacent tail wavelengths bracketing a head wavelength and returns their fluxes along with linear interpolation weights:

  • left_scale = (right_wav - wav_head) / (right_wav - left_wav)

  • right_scale = 1 - left_scale

Parameters:
  • wav_tail – wavelengths in tail (overlapping region of the former order)

  • wav_head_i – wavelength in head (overlapping region of the latter order)

  • flux_tail – flux in tail

Returns:

flux and contributions of the adjacent data points for interpolating (left and right)
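A sketch of the bracketing-and-weighting step, assuming wav_tail is sorted and encloses wav_head_i; the names and the use of numpy.searchsorted are illustrative assumptions.

    import numpy as np

    def split_tail_sketch(wav_tail, wav_head_i, flux_tail):
        """Find the two tail samples bracketing wav_head_i and their linear weights."""
        wav_tail = np.asarray(wav_tail)
        flux_tail = np.asarray(flux_tail)
        right = np.searchsorted(wav_tail, wav_head_i)   # first tail wavelength >= head wavelength
        left = right - 1
        left_scale = (wav_tail[right] - wav_head_i) / (wav_tail[right] - wav_tail[left])
        right_scale = 1.0 - left_scale
        return flux_tail[left], flux_tail[right], left_scale, right_scale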

pyird.spec.wavcal module

pyird.spec.wavcal.calculate_residuals(data, wavlength_solution)

calculate residuals for non-zero values

Parameters:
  • data – fitted data

  • wavelength_solution – best-fit model

Returns:

residuals for non-zero values

pyird.spec.wavcal.check_channelfile(channelfile_path, ign_ord)

check the channel file format and set orders

Parameters:
  • channelfile_path – path to the channel file (user-defined)

  • ign_ord – orders to be ignored

pyird.spec.wavcal.errfunc(p, XY, data, W, Ni, Nx)

calculate error function.

Parameters:
  • p – fitting coefficients

  • XY – meshgrid of (pixels, orders)

  • data – fitted data

  • W – matrix of weights

  • Ni – order of the fitting function along each echelle order

  • Nx – order of the fitting function with respect to the aperture number

Returns:

residuals of data and fitting model

pyird.spec.wavcal.first_identification(dat, channelfile, order_ref, pixel_search_area=5, kernel_size=3)

map pixels to wavelengths using the data previously identified with ecidentify (IRAF)

Parameters:
  • dat – ThAr spectrum (norder x npix matrix)

  • channelfile – reference channel-wavelength map

  • order_ref – conventional orders

  • pixel_search_area – pixel area to search peaks around a ThAr emission in a reference spectrum

  • kernel_size – kernel size for median filter

Returns:

channel(pixel)-wavelength map

pyird.spec.wavcal.fit_polynomial(XY, Ni, Nx, params, poly='chebyshev')

calculate 2d polynomial series.

Parameters:
  • XY – meshgrid of (pixels, orders)

  • Ni – order of the fitting function along each echelle order

  • Nx – order of the fitting function with respect to the aperture number

  • params – fitting coefficients

  • poly – ‘chebyshev’ or ‘legendre’ for fitting polynomial series

Returns:

wavelength of each pixel (flattened from the npix x norder matrix)
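For reference, a 2D Chebyshev or Legendre series on a (pixel, order) meshgrid can be evaluated with NumPy as in the sketch below; the Ni x Nx coefficient shape is an assumption, and this is not the exact pyird call.

    import numpy as np
    from numpy.polynomial import chebyshev, legendre

    def fit_polynomial_sketch(XY, Ni, Nx, params, poly="chebyshev"):
        """Evaluate a 2D polynomial series on a (pixel, order) meshgrid (illustrative)."""
        X, Y = XY
        coef = np.asarray(params).reshape(Ni, Nx)        # coefficient matrix (shape assumed Ni x Nx)
        val2d = chebyshev.chebval2d if poly == "chebyshev" else legendre.legval2d
        return val2d(X, Y, coef).ravel()                 # flattened npix x norder wavelengths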

pyird.spec.wavcal.fit_wav_solution(XY, data, W, Ni, Nx)

optimize the fit using the least-squares method.

Parameters:
  • XY – meshgrid of (pixels, orders)

  • data – fitted data

  • W – matrix of weights

  • Ni – order of the fitting function along each echelle order

  • Nx – order of the fitting function with respect to the aperture number

Returns:

best fit parameters (coefficients of 2d legendre series)
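A hedged sketch of the weighted least-squares optimization, assuming a 2D Legendre model and scipy.optimize.least_squares; the residual definition and coefficient packing are illustrative, not the pyird internals.

    import numpy as np
    from numpy.polynomial import legendre
    from scipy.optimize import least_squares

    def fit_wav_solution_sketch(XY, data, W, Ni, Nx):
        """Weighted least-squares fit of an Ni x Nx 2D Legendre coefficient matrix (illustrative)."""
        X, Y = XY

        def residuals(p):
            model = legendre.legval2d(X, Y, p.reshape(Ni, Nx))   # candidate wavelength solution
            return (W * (data - model)).ravel()                  # weighted residuals, flattened

        result = least_squares(residuals, np.zeros(Ni * Nx))
        return result.x.reshape(Ni, Nx)                          # best-fit coefficients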

pyird.spec.wavcal.identify_channel_mode(norder, channelfile_path=None, ign_ord=[])

identify the channel mode based on data shape

Parameters:
  • norder – number of echelle orders in the ThAr spectrum (the first dimension of the norder x npix matrix)

  • channelfile_path – path to the channel file

  • ign_ord – orders to be ignored

Returns:

diffraction orders, the reference channel-wavelength map file, and conventional orders

pyird.spec.wavcal.iterate_fitting(X, Y, df_pixwavmap, W, Ni, Nx, maxiter, std_threshold, npix, norder, order_ref, Nsigma=1.5)

iterate the fitting until the std of the residuals becomes lower than std_threshold or the number of iterations reaches maxiter

Parameters:
  • X – grid of pixels

  • Y – grid of orders

  • df_pixwavmap – channel-wavelength data

  • W – matrix of weights

  • Ni – order of the fitting function along each echelle order

  • Nx – order of the fitting function with respect to the aperture number

  • maxiter – maximum number of iterations

  • std_threshold – When the std of fitting residuals reaches this value, the iteration is terminated.

  • npix – number of pixels

  • norder – number of orders

  • order_ref – conventional orders

  • Nsigma – the number of stds to use for both the lower and upper clipping limit

Returns:

wavelength solution and data of ThAr signals used for fitting

pyird.spec.wavcal.make_weight()

generate the weight matrix for the wavelength-solution fit

Note

REVIEW: there may be other appropriate weights…

pyird.spec.wavcal.pixel_df_to_wav_mat(df_pixwavmap, order_ref, npix=2048)

convert channel-wavelength data to a wavelength matrix.

Parameters:
  • df_pixwavmap – channel-wavelength data

  • order_ref – conventional orders

Returns:

channel-wavelength matrix (npix x norder)

pyird.spec.wavcal.second_identification(dat, wavlength_solution_matrix, residuals, npix, norder, order_ref, pixel_search_area=None, kernel_size=3, detect_level=80)

detect additional ThAr lines in the observed data by referencing the line list

Parameters:
  • dat – ThAr spectrum (norder x npix matrix)

  • wavelength_solution_matrix – best-fit model

  • residuals – residuals between data and wavelength solution

  • npix – number of pixels

  • norder – number of orders

  • order_ref – conventional orders

  • pixel_search_area – pixel area to search peaks around a ThAr emission in a line list

  • kernel_size – kernel size for median filter

  • detect_level – percentile used as the lower limit for peak detection; only data above this percentile are detected as peaks

Returns:

channel(pixel)-wavelength map

pyird.spec.wavcal.sigmaclip(data, wavlength_solution, N=3)

clipping outliers.

Parameters:
  • data – the reference ThAr data

  • wavlength_solution – best-fit model

  • N – the number of stds to use for both the lower and upper clipping limit

Returns:

residuals, drop_ind

pyird.spec.wavcal.wavcal_thar(dat, W, Ni=5, Nx=4, maxiter=10, std_threshold=0.005, channelfile_path=None, ign_ord=[])

wavelength calibration for the ThAr spectrum.

Parameters:
  • dat – ThAr spectrum (norder x npix matrix)

  • W – matrix of weights

  • Ni – order of the fitting function along each echelle order

  • Nx – order of the fitting function with respect to the aperture number

  • maxiter – maximum number of iterations

  • std_threshold – When the std of fitting residuals reaches this value, the iteration is terminated.

  • channelfile_path – path to the channel file

  • ign_ord – orders to be ignored

Returns:

final wavelength solution and the data of ThAr signals used for fitting

Examples

>>> wavlength_solution, data = wavcal_thar(thar)

Module contents