pyird.spec package
Submodules
pyird.spec.continuum module
- class pyird.spec.continuum.ContinuumFit(continuum_nonzero_ind=[10, 60], base_order_fit=23, max_order_fit=50, nsigma_continuumfit=[2, 3], maxiter_continuumfit=3)
Bases: object
Continuum Fitting Class
- Algorithm overview:
This class fits a smooth continuum per spectral order using iterative sigma-clipped polynomial regression.
For each order:
1. Mask NaNs, zero-flux pixels, and edge regions defined by continuum_nonzero_ind.
2. Fit either a centered Vandermonde polynomial or a Legendre polynomial of degree order to the current inliers.
3. Compute residuals r = y - model and the robust scatter sigma = sqrt(median(r**2)).
4. Keep samples within [-low*sigma, +high*sigma] and iterate until the inlier count stops changing or maxiter_continuumfit is reached.
Across the full detector:
- Begin with base_order_fit.
- If any non-ignored order exceeds a residual RMS threshold (1.5%), increment the polynomial degree and refit from the start (up to max_order_fit).
- Interpolate the final continuum back to all pixels.
This yields a robust, smooth continuum while suppressing strong absorption features and noise-driven outliers.
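The per-order loop can be illustrated with a minimal sketch (not the pyird implementation itself); only the Legendre branch is shown, and nsigma and maxiter stand in for nsigma_continuumfit and maxiter_continuumfit.

import numpy as np
from numpy.polynomial import legendre

def fit_one_order(x, y, degree=23, nsigma=(2, 3), maxiter=3):
    # Illustrative sketch of iterative sigma-clipped Legendre fitting.
    mask = np.isfinite(y) & (y != 0)              # drop NaNs and zero-flux pixels
    for _ in range(maxiter):
        coef = legendre.legfit(x[mask], y[mask], degree)
        model = legendre.legval(x, coef)
        r = y - model
        sigma = np.sqrt(np.median(r[mask] ** 2))  # robust scatter
        new_mask = mask & (r > -nsigma[0] * sigma) & (r < nsigma[1] * sigma)
        if new_mask.sum() == mask.sum():          # inlier set converged
            break
        mask = new_mask
    return legendre.legval(x, coef)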
- continuum_nonzero_ind
tuple of int Pixel indices to mask at both ends of each order.
- base_order_fit
int Starting polynomial order for fitting.
- max_order_fit
int Maximum allowed polynomial order to avoid overfitting.
- nsigma_continuumfit
tuple of float (low, high) sigma limits for clipping.
- maxiter_continuumfit
int Maximum number of sigma-clip iterations applied.
- continuum_oneord(wdata, flat, order)
fit continuum for a single echelle order.
- Parameters:
wdata – the wavelength calibrated target spectrum (columns: "wav"/"order"/"flux")
flat – the wavelength calibrated FLAT (columns: "wav"/"order"/"flux")
order – order number to fit
- Returns:
spectrum and continuum of the order
- continuum_rsd(rsd, npix=2048, ignore_orders=None)
fit continuum for a raw spectrum detector (RSD) matrix.
- Parameters:
rsd – raw spectrum detector matrix
npix – number of pixels
ignore_orders – list of orders excluded from the goodness-of-fit evaluation
- Returns:
pandas DataFrame of continuum
- fit_continuum(x, y, order=6, fitfunc='legendre')
Fit the continuum with iterative sigma clipping
- Parameters:
x – the wavelengths
y – the log-fluxes
order – the polynomial order to use
fitfunc – fitting function (default is legendre)
- Returns:
The value of the continuum at the wavelengths in x
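A hypothetical call, with illustrative synthetic arrays standing in for the wavelengths and log-fluxes described above:

>>> import numpy as np
>>> from pyird.spec.continuum import ContinuumFit
>>> cf = ContinuumFit()
>>> x = np.linspace(900, 1700, 2048)                       # wavelengths (illustrative)
>>> y = np.log(10.0 + np.random.uniform(-0.5, 0.5, 2048))  # log-fluxes (illustrative)
>>> continuum = cf.fit_continuum(x, y, order=6, fitfunc='legendre')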
pyird.spec.normalize module
- class pyird.spec.normalize.SpectrumNormalizer(combfile=None, interp_nonzero_ind=[2, 9], nsigma_sigmaclip=[1, 2], maxiter_sigmaclip=3)
Bases: ContinuumFit, FluxUncertainty
Normalize 1D spectra and propagate uncertainties.
This class (1) derives a blaze/continuum from FLAT or a provided blaze file, (2) normalizes target spectra order-by-order and combines them across overlaps, and (3) computes per-pixel S/N and normalized uncertainties.
- interp_nonzero_ind
tuple[int, int] Number of pixels to trim at the start and end of each order where the flux is zero.
- nsigma_sigmaclip
tuple[float, float] Sigma-clipping thresholds (lower, upper).
- maxiter_sigmaclip
int Maximum number of iterations for sigma clipping.
- Inherited from FluxUncertainty
See that class for attributes related to gain selection and readout-noise estimation.
- blaze_to_df(wdata, blaze)
divide target flux by the blaze function
- Parameters:
wdata – the wavelength calibrated 1D target spectrum ("wav", "order", "flux")
blaze – blaze function ("wav", "order", "flux")
- Returns:
Per-order dataframe including columns "wav", "order", "flux", "continuum", "nflux", "sn_ratio", "tmp_uncertainty".
- combine_normalize(wfile, flatfile, blaze=True)
read .dat file and make 1D normalized spectrum
- Parameters:
wfile – path to the wavelength calibrated 1D target spectrum
flatfile – path to the wavelength calibrated 1D FLAT or blaze file
blaze – set to True when self.apext_flatfield() was used; if False, the blaze function is created from the 1D FLAT
- Returns:
two pandas.DataFrames of the 1D normalized spectrum:
df_continuum: per-order dataframe containing columns such as "wav", "order", "flux", "continuum", "nflux", "sn_ratio", "tmp_uncertainty".
df_interp: final 1D combined dataframe (columns: "wav", "flux", "continuum", "sn_ratio", "tmp_uncertainty", later augmented by divide_by_continuum() with "nflux" and "uncertainty").
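A hypothetical end-to-end call; the .dat paths are placeholders, and blaze=True assumes the flat file was produced by apext_flatfield() as noted above:

>>> from pyird.spec.normalize import SpectrumNormalizer
>>> sn = SpectrumNormalizer(combfile="comb_mmf1.dat")  # comb file path is illustrative
>>> df_continuum, df_interp = sn.combine_normalize("target_wavcal.dat", "flat_blaze.dat", blaze=True)
>>> df_interp[["wav", "nflux", "uncertainty"]].head()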
- concat_nonoverlap_region(df_former, df_latter, df_interp, order, min_order)
concatenate the data of the nonoverlap region of df_former and df_latter
- Parameters:
df_former – pandas.DataFrame of the former order
df_latter – pandas.DataFrame of the latter order
df_interp – pandas.DataFrame of the interpolated data
order – order number currently being processed
min_order – the first (minimum) order number
- Returns:
concatenated DataFrame
- concat_nonoverlap_region_last_order(df_former, df_interp)
concatenate the data of the nonoverlap region of df_former and df_interp for the last order
- Parameters:
df_former – pandas.DataFrame of the former order
df_interp – pandas.DataFrame of the interpolated data
- Returns:
concatenated DataFrame
- concat_overlap_region(df_former, df_latter, df_interp)
concatenate the data of the overlapping region of df_former and df_latter
- Parameters:
df_former – pandas.DataFrame of the former order
df_latter – pandas.DataFrame of the latter order
df_interp – pandas.DataFrame of the interpolated data
- Returns:
concatenated DataFrame
- define_former_and_latter(df_continuum, order, max_order)
define data of the former order (order N) and the latter order (order N+1)
- Parameters:
df_continuum – pandas.DataFrame that contains the blaze function
order – order number (N)
max_order – the maximum number of the orders
- Returns:
pandas.DataFrames of the former order and the latter order
- determine_scale_continuum(wdata, flat, standard_order)
determine the scaling factor of the blaze function
- Parameters:
wdata – the wavelength calibrated target spectrum (columns: "wav"/"order"/"flux")
flat – the wavelength calibrated FLAT with the same column schema as wdata
standard_order – once an order number is set, the blaze functions are standardized based on that order
- Returns:
scaling factor of the blaze function
- divide_by_continuum(df_interp)
Finalize normalization by dividing through the continuum.
Adds normalized flux and uncertainty columns, skipping zero-continuum samples to avoid division by zero.
- Parameters:
df_interp – pandas.DataFrame that must contain "flux", "continuum", and "tmp_uncertainty"
- Returns:
The input dataframe with added columns: "nflux" (normalized flux) and "uncertainty" (normalized uncertainty, i.e. temporary uncertainty / continuum).
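A minimal sketch of this step (assumed behavior per the description above: zero-continuum samples are skipped to avoid division by zero):

import pandas as pd

def divide_by_continuum_sketch(df_interp: pd.DataFrame) -> pd.DataFrame:
    # Skip samples where the continuum is zero.
    ok = df_interp["continuum"] != 0
    df_interp.loc[ok, "nflux"] = df_interp.loc[ok, "flux"] / df_interp.loc[ok, "continuum"]
    df_interp.loc[ok, "uncertainty"] = df_interp.loc[ok, "tmp_uncertainty"] / df_interp.loc[ok, "continuum"]
    return df_interp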
- make_blaze(wdata, flat, standard_order=None)
extract blaze function for target based on FLAT
- Parameters:
wdata – the wavelength calibrated target spectrum
flat – the wavelength calibrated FLAT
standard_order – Once an order number is set, the blaze functions are standardized based on that order
- Returns:
the blaze function created by scaling the FLAT by a constant. Columns include "wav", "order", "flux", "flat", "continuum", "nflux", and uncertainty-related fields.
- normalize(df_continuum)
normalize flux after combining all orders
Trims zero-flux edges, concatenates non-overlap and overlap regions in order, computes combined S/N, then divides by continuum.
- Parameters:
df_continuum – pandas.DataFrame that contains the blaze function
- Returns:
pandas.DataFrame of the 1D normalized spectrum with columns "wav", "flux", "continuum", "sn_ratio", "tmp_uncertainty", augmented later by divide_by_continuum().
- objective_function(scale, flux, continuum, sigma_lower, sigma_upper, maxiter_sigmaclip)
objective function to optimize the scaling factor of the blaze function
- Parameters:
scale – scaling factor of continuum
flux – flux
continuum – continuum
sigma_lower – sigma clipping threshold (lower)
sigma_upper – sigma clipping threshold (upper)
maxiter_sigmaclip – the maximum number of iterations for sigma clipping
- Returns:
absolute sum of the residuals after sigma clipping
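A minimal sketch of this objective, assuming astropy's sigma_clip for the clipping step (the exact implementation in pyird may differ):

import numpy as np
from astropy.stats import sigma_clip

def objective_sketch(scale, flux, continuum, sigma_lower=1, sigma_upper=2, maxiter_sigmaclip=3):
    # Residuals between the flux and the scaled continuum, sigma-clipped,
    # then reduced to the absolute value of the summed residuals (as documented above).
    residuals = flux - scale * continuum
    clipped = sigma_clip(residuals, sigma_lower=sigma_lower,
                         sigma_upper=sigma_upper, maxiters=maxiter_sigmaclip)
    return np.abs(np.sum(clipped.compressed()))

Minimizing this over scale (for example with a scalar minimizer such as scipy.optimize.minimize_scalar) would give the blaze scaling factor described for determine_scale_continuum().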
- trim_nonzero_flux(df)
cut pixels in the zero flux region at both ends of the order
- Parameters:
df – pandas.DataFrame that contains ‘flux’ column
- Returns:
pandas.DataFrame with the interp_nonzero_ind pixels cut from both ends of the order
pyird.spec.rsdmat module
Raw Spectral Detector matrix (RSD matrix)
- pyird.spec.rsdmat.multiorder_to_rsd(rawspec, pixcoord, npix=2048, fill_value=nan)
convert multi-order rawspec + pixcoord to an RSD matrix.
- Parameters:
rawspec – multiorder rawspec
pixcoord – multiorder pixel coordinate
npix – number of detector pixels in y direction
fill_value – value used to fill empty elements
- Returns:
RSD matrix (npix x norder)
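A minimal sketch of the packing step, assuming rawspec and pixcoord are sequences of per-order 1D arrays (an illustration, not the pyird implementation):

import numpy as np

def multiorder_to_rsd_sketch(rawspec, pixcoord, npix=2048, fill_value=np.nan):
    # Place each order's extracted flux at its pixel coordinates in a
    # (npix x norder) matrix; untouched elements keep fill_value.
    norder = len(rawspec)
    rsd = np.full((npix, norder), fill_value)
    for j, (spec, pix) in enumerate(zip(rawspec, pixcoord)):
        rsd[np.asarray(pix, dtype=int), j] = spec
    return rsd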
- pyird.spec.rsdmat.rsd_order_medfilt(rsd, kernel_size=9)
median filtering for spectrum
- Parameters:
rsd – RSD matrix (npix x norder)
kernel_size – kernel size for median filter
- Returns:
median filtered RSD matrix
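A minimal sketch of per-order median filtering, assuming scipy's 1D median filter applied column by column:

import numpy as np
from scipy.signal import medfilt

def rsd_order_medfilt_sketch(rsd, kernel_size=9):
    # Apply a 1D median filter along the pixel direction of each order.
    return np.column_stack([medfilt(rsd[:, j], kernel_size=kernel_size)
                            for j in range(rsd.shape[1])])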
pyird.spec.uncertainty module
- class pyird.spec.uncertainty.FluxUncertainty(wav_boundary_yj_and_h=1420, gain_y=2.99, gain_h=2.78, combfile=None, comb_readout_noise_wav=[950, 1000], calc_method_readout_noise='mad')
Bases: object
Compute flux uncertainties and S/N for IRD spectra.
- Algorithm overview:
Computes per-pixel S/N and flux uncertainties using a Poisson + read-noise model, with automatic band-dependent gain selection and optional read-noise estimation from a comb (LFC) file.
Noise model (units):
- Input flux is in ADU; gains are in e-/ADU; readout noise is in e-.
- Variance (in ADU^2) is modeled as:
var_total = flux / gain + readout_noise**2
Temporary (pre-normalization) uncertainty is:
tmp_uncertainty = sqrt(var_total)
Signal-to-noise ratio (dimensionless) is:
sn_ratio = (gain * flux) / sqrt(gain * flux + (gain * readout_noise)**2)
Gain selection:
- If any wavelength in a chunk exceeds wav_boundary_yj_and_h, use gain_h; otherwise use gain_y.
Readout-noise estimation:
- If a comb file is provided, restrict to comb_readout_noise_wav = [low, high] and estimate the noise from the selected flux samples using either MAD (scaled by 1.4826) or the standard deviation; otherwise fall back to default_readout_noise.
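A minimal sketch of the documented noise model and band-dependent gain selection (the numbers shown are the class defaults; an illustration, not the pyird implementation):

import numpy as np

def uncertainty_sketch(wav, flux, gain_y=2.99, gain_h=2.78,
                       wav_boundary_yj_and_h=1420, readout_noise=12.0):
    # Pick the H-band gain if any wavelength lies beyond the Y/J-H boundary.
    gain = gain_h if np.any(np.asarray(wav) > wav_boundary_yj_and_h) else gain_y
    var_total = flux / gain + readout_noise**2            # ADU^2, as documented
    tmp_uncertainty = np.sqrt(var_total)
    sn_ratio = (gain * flux) / np.sqrt(gain * flux + (gain * readout_noise)**2)
    return sn_ratio, tmp_uncertainty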
- wav_boundary_yj_and_h
int Boundary wavelength between the Y/J and H bands (in the same units as input data).
- gain_y
float Gain of the IRD Y/J-band detector (e-/ADU).
- gain_h
float Gain of the IRD H-band detector (e-/ADU).
- combfile
str | None Path to the LFC (mmf1) file used to estimate the readout noise, if provided.
- comb_readout_noise_wav
tuple[int, int] Wavelength window
[low, high]used to compute the readout noise from the comb.
- method
{“mad”, “std”} Method to estimate the readout noise from the comb data: Median Absolute Deviation (scaled) or standard deviation.
- default_readout_noise
float Default readout noise in electrons (IRD detectors: ~12 e- for a 10 min exposure).
- scale_normalized_mad
float Scaling factor (1.4826) to convert MAD to an estimate of the standard deviation.
- readout_noise
float Effective readout noise used in calculations. Determined during initialization.
- gain
float The detector gain actually used for calculations (either gain_y or gain_h). Note: this attribute is set lazily by determine_gain() (called inside calc_uncertainty()), so it may not exist until those methods are invoked.
- calc_uncertainty(df_continuum)
calculation of uncertainty
- Parameters:
df_continuum – Dataframe that contains at least the columns "wav", "flux", and "continuum". If "sn_ratio" or "tmp_uncertainty" exist, they will be overwritten.
- Returns:
The temporary (pre-normalization) uncertainty is stored in the column "tmp_uncertainty" as an intermediate product.
- calc_uncertainty_overlap_region(df_head, df_tail)
calculate the signal-to-noise ratio and temporary uncertainty of each data point.
For wavelengths in the head order (df_head), returns the S/N and temporary uncertainty by referencing the tail order (df_tail) in the overlapping range. If an exact wavelength match is found in df_tail, its sn_ratio is reused; otherwise, the value is interpolated between the nearest two tail samples and the uncertainty is propagated using linear weights.
Notes
The wavelength range of df_tail should contain that of df_head, i.e., max(df_tail['wav']) > max(df_head['wav']) and min(df_tail['wav']) < min(df_head['wav']). See the master's thesis by Ziying Gu for details.
- Parameters:
df_head – the spectrum of the latter order in the overlap region. Must contain "wav" and "flux"; if available, "sn_ratio" is used.
df_tail – the spectrum of the former order in the overlap region. Must contain "wav", "flux", and "sn_ratio".
- Returns:
sn_ratio: signal-to-noise ratio of each data point
tmp_uncertainty: the uncertainty of the flux before normalization (an intermediate product for calculating the final uncertainty)
- calculate_tmp_uncertainty(left_flux, right_flux, left_scale, right_scale)
calculate the temporary uncertainty for interpolated points
- Parameters:
left_flux – flux at a point to the left (shorter wavelength) of the point after interpolation
right_flux – flux at a point to the right (longer wavelength) of the point after interpolation
left_scale – weight for left uncertainty
right_scale – weight for right uncertainty
- Returns:
temporary error for interpolated points
- determine_gain(df_continuum)
determine the gain to use based on wavelength
- Parameters:
df_continuum – DataFrame that contains wavelength data in ‘wav’ column
- Returns:
gain of the corresponding band
- determine_readout_noise()
determine the readout noise
- Returns:
the readout noise estimated from the comb file if combfile is defined; otherwise the default value
- split_tail_by_wavhead(wav_tail, wav_head_i, flux_tail)
split wavelength and flux in tail by wavelength in head
Finds two adjacent tail wavelengths bracketing a head wavelength and returns their fluxes along with linear interpolation weights:
left_scale = (right_wav - wav_head) / (right_wav - left_wav)
right_scale = 1 - left_scale
- Parameters:
wav_tail – wavelengths in tail (overlapping region of the former order)
wav_head_i – wavelength in head (overlapping region of the latter order)
flux_tail – flux in tail
- Returns:
flux and contributions of the adjacent data points for interpolating (left and right)
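A minimal sketch of the bracketing-and-weights step described above, assuming wav_tail is sorted in increasing order (the helper name and return layout are illustrative, not the pyird API):

import numpy as np

def split_tail_sketch(wav_tail, wav_head_i, flux_tail):
    wav_tail = np.asarray(wav_tail)
    right = np.searchsorted(wav_tail, wav_head_i)   # first tail sample >= the head wavelength
    left = right - 1
    left_scale = (wav_tail[right] - wav_head_i) / (wav_tail[right] - wav_tail[left])
    right_scale = 1.0 - left_scale
    return flux_tail[left], flux_tail[right], left_scale, right_scale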
pyird.spec.wavcal module
- pyird.spec.wavcal.calculate_residuals(data, wavlength_solution)
calculate residuals for non-zero values
- Parameters:
data – fitted data
wavlength_solution – best-fit model
- Returns:
residuals for non-zero values
- pyird.spec.wavcal.check_channelfile(channelfile_path, ign_ord)
check the channel file format and set orders
- Parameters:
channelfile_path – path to the channel file (user-defined)
ign_ord – orders to be ignored
- pyird.spec.wavcal.errfunc(p, XY, data, W, Ni, Nx)
calculate error function.
- Parameters:
p – fitting coefficients
XY – meshgrid of (pixels, orders)
data – fitted data
W – matrix of weights
Ni – order of the fitting function along each echelle order
Nx – order of the fitting function with respect to the aperture number
- Returns:
residuals of data and fitting model
- pyird.spec.wavcal.first_identification(dat, channelfile, order_ref, pixel_search_area=5, kernel_size=3)
map pixels to wavelengths using data previously identified with ecidentify (IRAF)
- Parameters:
dat – ThAr spectrum (norder x npix matrix)
channelfile – reference channel-wavelength map
order_ref – conventional orders
pixel_search_area – pixel window used to search for peaks around a ThAr emission line in the reference spectrum
kernel_size – kernel size for median filter
- Returns:
channel(pixel)-wavelength map
- pyird.spec.wavcal.fit_polynomial(XY, Ni, Nx, params, poly='chebyshev')
calculate 2d polynomial series.
- Parameters:
XY – meshgrid of (pixels, orders)
Ni – order of the fitting function along each echelle order
Nx – order of the fitting function with respect to the aperture number
params – fitting coefficients
poly – ‘chebyshev’ or ‘legendre’ for fitting polynomial series
- Returns:
wavelength of each pixel (flattened from the npix x norder matrix)
- pyird.spec.wavcal.fit_wav_solution(XY, data, W, Ni, Nx)
optimize the fit using the least-squares method.
- Parameters:
XY – meshgrid of (pixels, orders)
data – fitted data
W – matrix of weights
Ni – order of the fitting function along each echelle order
Nx – order of the fitting function with respect to the aperture number
- Returns:
best-fit parameters (coefficients of the 2D Legendre series)
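A minimal sketch of the idea behind the 2D fit, assuming Ni and Nx count coefficients per axis (so degrees Ni-1 and Nx-1) and ignoring the weight matrix W; pyird's own weighted optimization via errfunc may differ:

import numpy as np
from numpy.polynomial import legendre

def fit_wav_solution_sketch(X, Y, data, Ni=5, Nx=4):
    # Build the 2D Legendre design matrix and solve the (unweighted)
    # least-squares problem using only the identified (non-zero) lines.
    mask = data.ravel() != 0
    V = legendre.legvander2d(X.ravel(), Y.ravel(), [Ni - 1, Nx - 1])
    coeffs, *_ = np.linalg.lstsq(V[mask], data.ravel()[mask], rcond=None)
    return coeffs.reshape(Ni, Nx)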
- pyird.spec.wavcal.identify_channel_mode(norder, channelfile_path=None, ign_ord=[])
identify the channel mode based on data shape
- Parameters:
norder – number of diffraction orders of the ThAr spectrum (the data shape)
channelfile_path – path to the channel file
ign_ord – orders to be ignored
- Returns:
diffraction orders, the reference channel-wavelength map file, and conventional orders
- pyird.spec.wavcal.iterate_fitting(X, Y, df_pixwavmap, W, Ni, Nx, maxiter, std_threshold, npix, norder, order_ref, Nsigma=1.5)
iterate the fitting until the standard deviation of the residuals becomes lower than std_threshold or the number of iterations reaches maxiter
- Parameters:
X – grid of pixels
Y – grid of orders
df_pixwavmap – channel-wavelength data
W – matrix of weights
Ni – order of the fitting function along each echelle order
Nx – order of the fitting function with respect to the aperture number
maxiter – maximum number of iterations
std_threshold – When the std of fitting residuals reaches this value, the iteration is terminated.
npix – number of pixels
norder – number of orders
order_ref – conventional orders
Nsigma – the number of stds to use for both the lower and upper clipping limit
- Returns:
wavelength solution and data of ThAr signals used for fitting
- pyird.spec.wavcal.make_weight()
weight matrix
Note
REVIEW: there may be other appropriate weights…
- pyird.spec.wavcal.pixel_df_to_wav_mat(df_pixwavmap, order_ref, npix=2048)
convert channel-wavelength data to a wavelength matrix.
- Parameters:
df_pixwavmap – channel-wavelength data
order_ref – conventional orders
- Returns:
channel-wavelength matrix (npix x norder)
- pyird.spec.wavcal.second_identification(dat, wavlength_solution_matrix, residuals, npix, norder, order_ref, pixel_search_area=None, kernel_size=3, detect_level=80)
detect additional ThAr lines in the observed data by referencing the line list
- Parameters:
dat – ThAr spectrum (norder x npix matrix)
wavlength_solution_matrix – best-fit model
residuals – residuals between data and wavelength solution
npix – number of pixels
norder – number of orders
order_ref – conventional orders
pixel_search_area – pixel area to search peaks around a ThAr emission in a line list
kernel_size – kernel size for median filter
detect_level – percentile used as the lower threshold for peak detection (only data above this percentile are detected as peaks)
- Returns:
channel(pixel)-wavelength map
- pyird.spec.wavcal.sigmaclip(data, wavlength_solution, N=3)
clipping outliers.
- Parameters:
data – the reference ThAr data
wavlength_solution – best-fit model
N – the number of stds to use for both the lower and upper clipping limit
- Returns:
residuals, drop_ind
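A minimal sketch of the N-sigma clipping step, operating on precomputed residuals such as those returned by calculate_residuals() (illustrative only; the index bookkeeping in pyird may differ):

import numpy as np

def sigmaclip_sketch(residuals, N=3):
    # Flag points whose residual deviates from the mean by more than N standard deviations.
    residuals = np.asarray(residuals)
    std = np.std(residuals)
    drop_ind = np.where(np.abs(residuals - np.mean(residuals)) > N * std)[0]
    return residuals, drop_ind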
- pyird.spec.wavcal.wavcal_thar(dat, W, Ni=5, Nx=4, maxiter=10, std_threshold=0.005, channelfile_path=None, ign_ord=[])
wavelength calibration for ThAr spectrum.
- Parameters:
dat – ThAr spectrum (norder x npix matrix)
W – matrix of weights
Ni – order of the fitting function along each echelle order
Nx – order of the fitting function with respect to the aperture number
maxiter – maximum number of iterations
std_threshold – When the std of fitting residuals reaches this value, the iteration is terminated.
channelfile_path – path to the channel file
ign_ord – orders to be ignored
- Returns:
final results of the wavelength solution and the data of ThAr signals used for fitting
Examples
>>> wavlength_solution, data = wavcal_thar(dat, W)