Pipeline to Get 1D Spectra from Raw Data (IRD Stream)
This tutorial demonstrates how to reduce raw data to
wavelength-calibrated 1D spectra. By using the Stream2D framework,
you can apply functions to multiple FITS files efficiently.
The same pipeline is available in examples/python/IRD_stream.py. For REACH data, refer the script available in examples/python/REACH_stream.py.
Step 0: Settings
Directory Structure
First, specify the path to the raw data directory (datadir) and the
output directory (anadir)
This tutorial assumes the following directory structure:
.
└── pyird/
└── data/
└── 20210317/
├── flat
├── thar
├── target
├── dark
└── reduc
In this structure, the flat, thar, target, and dark directories are part
of the datadir, each containing raw data for ‘Flat’, ‘ThAr’,
‘Target’, and optionally ‘Dark’ frames (see also Input Data).
The reduc directory is used as
anadir for storing processed data.
import pathlib
basedir = pathlib.Path('~/pyird/data/20210317/').expanduser()
datadir_flat = basedir/'flat/'
datadir_dark = basedir/'dark/'
datadir_thar = basedir/'thar'
datadir_target = basedir/'target/'
anadir = basedir/'reduc/'
Specify the Data to be Analyzed
Please change the following variables based on the data you want to analyze. The sample data can be downloaded from the Zenodo repository
band = 'h' #'h' or 'y'
mmf = 'mmf2' #'mmf1' (comb fiber) or 'mmf2' (star fiber)
readout_noise_mode = 'default'
Note
Ensure that the readout_noise_mode is set to either ‘real’ or
‘default’.
readout_noise_mode = 'real': Need to reduce the dataset withband = 'y'andmmf = 'mmf1'at first.With this setting, uncertainties and signal-to-noise ratio at each wavelength will be included in the output files (nw…_m?.dat and ncw…_m?.dat).
Those values are based on the readout noise (RN) calculated using the comb spectrum (in mmf1) of the Y/J band.
readout_noise_mode = 'default': Uses a default readout noise (RN) value (RN=12 \(e^{-}\)).
Please also specify the fitsid corresponding to the file you wish to
analyze. For files starts with ‘IRDA000’, set even-numbered fitsid
values.
By default, the band parameter determines which band’s data to
process. However, the pipeline is designed to safely handle cases where
you also change the FITS IDs accordingly (to an odd number) when
reducing the H-band data with band='h'. See also
Naming Conventions for Observed Data with IRD/REACH.
# last 5 digits of FITS file numbers: [start, end file number]
flat_comb_id = [41704, 41803] # flat image for comb
flat_star_id = [41804, 41903] # flat image for star/target
dark_id = [41504, 41505] # dark image
thar_id = [14632, 14731] # ThAr image
target_id = [41510, 41511] # target image
Step 1: Preprocessing the Calibration Dataset
Step 1-1: Identifying Apertures (Spectral Orders)
The
aptracefunction is used to identify apertures.Number of apertures (
num_aperture): 42 for H band, 102 for Y/J band.search_start_rowsets the initial row number on the detector image for searching apertures.These apertures are identified in the FLAT_COMB data.
If your observation was performed with a single fiber, set
num_apertureto half the default value.
from pyird.utils import irdstream
## FLAT_COMB
# settings
flat_comb=irdstream.Stream2D("flat_comb",
datadir_flat,
anadir,
fitsid=list(range(flat_comb_id[0], flat_comb_id[-1], 2)),
band=band)
# aperture extraction
if band=='h':
trace_mmf=flat_comb.aptrace(search_start_row=1300, num_aperture=42)
elif band=='y':
trace_mmf=flat_comb.aptrace(search_start_row=1000,num_aperture=102)
fitsid is incremented.
Processing h band
Processing fitsid: [41705, 41707, 41709, 41711, 41713, 41715, 41717, 41719, 41721, 41723, 41725, 41727, 41729, 41731, 41733, 41735, 41737, 41739, 41741, 41743, 41745, 41747, 41749, 41751, 41753, 41755, 41757, 41759, 41761, 41763, 41765, 41767, 41769, 41771, 41773, 41775, 41777, 41779, 41781, 41783, 41785, 41787, 41789, 41791, 41793, 41795, 41797, 41799, 41801, 41803]
[STEP] Performing aptrace...
Performing median combine for raw images.
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [00:00<00:00, 845.81it/s]
Searching for apertures using the default num_aperture value.
Successfully detected the required number of apertures on detector row 1301.
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 42/42 [00:06<00:00, 7.00it/s]
Define ‘trace_mask’ to mask light from both fibers.
Aperture width is 6 pixels (from -2 to +4) for IRD data and 5 pixels (from -2 to 3) for REACH data by default. You can change it
.widthinstance of trace_mmf.
trace_mask = trace_mmf.mask()
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 42/42 [00:00<00:00, 169.15it/s]
Reduce apertures in the mask to extract the spectrum from the desired fiber
trace_mmf.choose_aperture(fiber=mmf)
Processing m2 fiber.
Step 1-2: Removing hotpixels
In this tutorial, we demonstrate how to identify hotpixels using dark images.
If you prefer to load an existing hotpixel mask without using dark images, please refer to the pyird.io.read_hotpix module.
If you are interested in comparing different hotpixel mask identification methods, see issue #121.
from pyird.image.bias import bias_subtract_image
from pyird.image.hotpix import identify_hotpix_sigclip
## HOTPIXEL MASK:
## DARK
dark = irdstream.Stream2D('dark',
datadir_dark,
anadir,
fitsid=list(range(dark_id[0], dark_id[-1], 2)),
band=band) # Multiple file is ok
median_image = dark.immedian()
im_subbias = bias_subtract_image(median_image)
hotpix_mask = identify_hotpix_sigclip(im_subbias)
fitsid is incremented.
Processing h band
Processing fitsid: [41505]
Performing median combine for raw images.
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 155.45it/s]
0.58% of pixels flagged in hot-pixel mask.
Step 1-3: Wavelength Calibration
Wavelength calibration is performed by using reference frames (Thrium-Argon).
You do not need to manually identify emission lines;
calibrate_wavelengthautomatically references the line list!
## THAR (ThAr-ThAr)
# Settings
if band=='h':
rawtag='IRDAD000'
elif band=='y':
rawtag='IRDBD000'
thar=irdstream.Stream2D("thar",
datadir_thar,
anadir,
rawtag=rawtag,
fitsid=list(range(thar_id[0], thar_id[-1]+1)),
band=band)
thar.trace = trace_mmf
# removing noise pattern
thar.clean_pattern(trace_mask=trace_mask,extin='', extout='_cp', hotpix_mask=hotpix_mask)
# wavelength calibration
thar.calibrate_wavelength()
Processing h band
Processing fitsid: [14632, 14633, 14634, 14635, 14636, 14637, 14638, 14639, 14640, 14641, 14642, 14643, 14644, 14645, 14646, 14647, 14648, 14649, 14650, 14651, 14652, 14653, 14654, 14655, 14656, 14657, 14658, 14659, 14660, 14661, 14662, 14663, 14664, 14665, 14666, 14667, 14668, 14669, 14670, 14671, 14672, 14673, 14674, 14675, 14676, 14677, 14678, 14679, 14680, 14681, 14682, 14683, 14684, 14685, 14686, 14687, 14688, 14689, 14690, 14691, 14692, 14693, 14694, 14695, 14696, 14697, 14698, 14699, 14700, 14701, 14702, 14703, 14704, 14705, 14706, 14707, 14708, 14709, 14710, 14711, 14712, 14713, 14714, 14715, 14716, 14717, 14718, 14719, 14720, 14721, 14722, 14723, 14724, 14725, 14726, 14727, 14728, 14729, 14730, 14731]
[STEP] Performing clean_pattern: output extension=_cp
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:59<00:00, 1.69it/s]
[STEP] Performing calibrate_wavelength...
Performing median combine for raw images.
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:00<00:00, 2156.61it/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 21/21 [00:00<00:00, 87.16it/s]
Channel mode identified: H band (21 orders).
Std. of wavelength-data residuals after initial identification: 0.00903
Starting ThAr fitting iterations:
Target: std <= 0.00100 or max iteration = 30
# 1 std = 0.01343
# 2 std = 0.00791
# 3 std = 0.00424
# 4 std = 0.00267
# 5 std = 0.00196
# 6 std = 0.00144
# 7 std = 0.00114
# 8 std = 0.00090
Created a new file of the ThAr spectrum with a wavelength solution: /Users/yuikasagi/pyird/data/20210317/reduc/thar_h_m2.fits
Step 1-4: Creating a Normalized Flat
This process similar to hdsis_ecf for HDS/Subaru data to reduce the fringe appearing in a spectrum.
In the preparation of this process, we create the normalized flat by using
apnormalize.After applying
flatten, ‘{stream_id}_{band}_{mmf}.fits’ (e.g., flat_star_h_m2.fits) is created in anadir, containing the extracted spectrum of flat data.
## FLAT
if mmf=='mmf2': # Star fiber -> FLAT_STAR
# Settings
flat_star=irdstream.Stream2D("flat_star",
datadir_flat,
anadir,
fitsid=list(range(flat_star_id[0], flat_star_id[-1], 2)),
band=band)
flat_star.trace = trace_mmf
# Removing noise pattern
flat_star.clean_pattern(trace_mask=trace_mask, extin='', extout='_cp', hotpix_mask=hotpix_mask)
flat_star.imcomb = True # median combine
# Extract 1D spectrum
flat_star.flatten(hotpix_mask=hotpix_mask)
# Flat spectrum normalized in each pixel within an aperture
df_flatn = flat_star.apnormalize()
elif mmf=='mmf1': # Comb fiber -> FLAT_COMB
flat_comb.trace = trace_mmf
# Removing noise pattern
flat_comb.clean_pattern(trace_mask=trace_mask, extin='', extout='_cp', hotpix_mask=hotpix_mask)
flat_comb.imcomb = True # median combine
# Extract 1D spectrum
flat_comb.flatten(hotpix_mask=hotpix_mask)
# Flat spectrum normalized in each pixel within an aperture
df_flatn = flat_comb.apnormalize()
fitsid is incremented.
Processing h band
Processing fitsid: [41805, 41807, 41809, 41811, 41813, 41815, 41817, 41819, 41821, 41823, 41825, 41827, 41829, 41831, 41833, 41835, 41837, 41839, 41841, 41843, 41845, 41847, 41849, 41851, 41853, 41855, 41857, 41859, 41861, 41863, 41865, 41867, 41869, 41871, 41873, 41875, 41877, 41879, 41881, 41883, 41885, 41887, 41889, 41891, 41893, 41895, 41897, 41899, 41901, 41903]
[STEP] Performing clean_pattern: output extension=_cp
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [00:26<00:00, 1.86it/s]
[STEP] Performing flatten...
0it [00:00, ?it/s]
Performing median combine for raw images.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [00:00<00:00, 1323.07it/s]
0%| | 0/21 [00:00<?, ?it/s][A
48%|██████████████████████████████████████████████████████████████████████ | 10/21 [00:00<00:00, 90.86it/s][A
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 21/21 [00:00<00:00, 87.23it/s]
0%| | 0/21 [00:00<?, ?it/s][A
48%|██████████████████████████████████████████████████████████████████████ | 10/21 [00:00<00:00, 91.26it/s][A
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 21/21 [00:00<00:00, 86.77it/s]
1it [00:08, 8.57s/it]
Created /Users/yuikasagi/pyird/data/20210317/reduc/flat_star_h_m2.fits
[STEP] Performing apnormalize...
Continuum successfully fitted with polynomial order = 23.
Performing median combine for images with extension '_cp'.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [00:00<00:00, 1622.53it/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 21/21 [00:03<00:00, 5.62it/s]
Step 2: Extracting the Target 1D Spectrum
From here, we will extract target spectrum.
#--------FOR TARGET--------#
# Settings
target = irdstream.Stream2D('targets',
datadir_target,
anadir,
fitsid=list(range(target_id[0], target_id[-1], 2)),
band=band)
target.info = True # show detailed info
target.trace = trace_mmf
fitsid is incremented.
Processing h band
Processing fitsid: [41511]
Step 2-1: Removing Noise Pattern on the Detector
target.clean_pattern(trace_mask=trace_mask, extin='', extout='_cp', hotpix_mask=hotpix_mask)
[STEP] Performing clean_pattern: output extension=_cp
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 1.95it/s]
Step 2-2: Aperture Extraction & Flat Fielding
The
apext_flatfieldfunction extracts each order while applying flat fielding.This process requires the flat spectrum normalized in each pixel within an aperture (i.e., df_flatn).
After this process, ’IRDA000…_flnhp.fits’ (when
hotpix_maskis set) or ’IRDA000…_fln.fits’ (whenhotpix_mask = None) is created.
target.apext_flatfield(df_flatn, hotpix_mask=hotpix_mask)
[STEP] Performing apext_flatfield...
0%| | 0/1 [00:00<?, ?it/s]
0%| | 0/21 [00:00<?, ?it/s][A
10%|██████████████ | 2/21 [00:00<00:02, 7.81it/s][A
14%|█████████████████████▝ | 3/21 [00:00<00:02, 6.56it/s][A
19%|████████████████████████████▝ | 4/21 [00:00<00:02, 6.04it/s][A
24%|███████████████████████████████████▝ | 5/21 [00:00<00:02, 5.79it/s][A
29%|██████████████████████████████████████████▎ | 6/21 [00:01<00:02, 5.62it/s][A
33%|█████████████████████████████████████████████████▎ | 7/21 [00:01<00:02, 5.52it/s][A
38%|████████████████████████████████████████████████████████▝ | 8/21 [00:01<00:02, 5.46it/s][A
43%|███████████████████████████████████████████████████████████████▝ | 9/21 [00:01<00:02, 5.43it/s][A
48%|██████████████████████████████████████████████████████████████████████ | 10/21 [00:01<00:02, 5.42it/s][A
52%|█████████████████████████████████████████████████████████████████████████████ | 11/21 [00:01<00:01, 5.38it/s][A
57%|████████████████████████████████████████████████████████████████████████████████████ | 12/21 [00:02<00:01, 5.37it/s][A
62%|███████████████████████████████████████████████████████████████████████████████████████████ | 13/21 [00:02<00:01, 5.34it/s][A
67%|██████████████████████████████████████████████████████████████████████████████████████████████████ | 14/21 [00:02<00:01, 5.33it/s][A
71%|█████████████████████████████████████████████████████████████████████████████████████████████████████████ | 15/21 [00:02<00:01, 5.33it/s][A
76%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████ | 16/21 [00:02<00:00, 5.36it/s][A
81%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████ | 17/21 [00:03<00:00, 5.36it/s][A
86%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████ | 18/21 [00:03<00:00, 5.34it/s][A
90%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████ | 19/21 [00:03<00:00, 5.34it/s][A
95%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████ | 20/21 [00:03<00:00, 5.37it/s][A
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 21/21 [00:03<00:00, 5.51it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:03<00:00, 3.86s/it]
Pixel offset from aperture center = -2 | Mean counts along aperture = 0.84721
Pixel offset from aperture center = -1 | Mean counts along aperture = 1.35783
Pixel offset from aperture center = 0 | Mean counts along aperture = 1.46715
Pixel offset from aperture center = 1 | Mean counts along aperture = 1.34046
Pixel offset from aperture center = 2 | Mean counts along aperture = 0.77173
Pixel offset from aperture center = 3 | Mean counts along aperture = 0.17888
Step 2-3: Assigning Wavelength to the Extracted Spectrum
The
dispcorfunction assigns wavelength solution to the extracted spectrum.Please change the
extinoption toextin='_flnhp'orextin='_fln'depending on the previous process.After this process, ’w…_m?.dat’ is created, with data format:
$1: Wavelength [nm],$2: Order,$3: Counts.
target.dispcor(master_path=thar.anadir, extin='_flnhp')
[STEP] Performing dispcor...
Allocate wavelengths based on the ThAr file: /Users/yuikasagi/pyird/data/20210317/reduc/thar_h_m2.fits
Created 1D spectrum: w41511_m2.dat
Step 2-4: Creating the Blaze Function
The blaze function is created from FLAT spectrum to ‘’normalize’’ the spectra.
After this process, ’wblaze_{band}_{mmf}.dat’ is created.
# blaze function
if mmf=='mmf2':
flat_star.apext_flatfield(df_flatn, hotpix_mask=hotpix_mask)
flat_star.dispcor(master_path=thar.anadir)
elif mmf=='mmf1':
flat_comb.apext_flatfield(df_flatn, hotpix_mask=hotpix_mask)
flat_comb.dispcor(master_path=thar.anadir)
[STEP] Performing apext_flatfield...
0%| | 0/1 [00:00<?, ?it/s]
Performing median combine for images with extension '_cp'.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [00:00<00:00, 1312.57it/s]
0%| | 0/21 [00:00<?, ?it/s][A
10%|██████████████ | 2/21 [00:00<00:02, 7.92it/s][A
14%|█████████████████████▝ | 3/21 [00:00<00:02, 6.63it/s][A
19%|████████████████████████████▝ | 4/21 [00:00<00:02, 6.12it/s][A
24%|███████████████████████████████████▝ | 5/21 [00:00<00:02, 5.86it/s][A
29%|██████████████████████████████████████████▎ | 6/21 [00:00<00:02, 5.70it/s][A
33%|█████████████████████████████████████████████████▎ | 7/21 [00:01<00:02, 5.60it/s][A
38%|████████████████████████████████████████████████████████▝ | 8/21 [00:01<00:02, 5.50it/s][A
43%|███████████████████████████████████████████████████████████████▝ | 9/21 [00:01<00:02, 5.46it/s][A
48%|██████████████████████████████████████████████████████████████████████ | 10/21 [00:01<00:02, 5.43it/s][A
52%|█████████████████████████████████████████████████████████████████████████████ | 11/21 [00:01<00:01, 5.42it/s][A
57%|████████████████████████████████████████████████████████████████████████████████████ | 12/21 [00:02<00:01, 5.38it/s][A
62%|███████████████████████████████████████████████████████████████████████████████████████████ | 13/21 [00:02<00:01, 5.36it/s][A
67%|██████████████████████████████████████████████████████████████████████████████████████████████████ | 14/21 [00:02<00:01, 5.37it/s][A
71%|█████████████████████████████████████████████████████████████████████████████████████████████████████████ | 15/21 [00:02<00:01, 5.37it/s][A
76%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████ | 16/21 [00:02<00:00, 5.35it/s][A
81%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████ | 17/21 [00:03<00:00, 5.36it/s][A
86%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████ | 18/21 [00:03<00:00, 5.39it/s][A
90%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████ | 19/21 [00:03<00:00, 5.39it/s][A
95%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████ | 20/21 [00:03<00:00, 5.39it/s][A
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 21/21 [00:03<00:00, 5.55it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:11<00:00, 11.84s/it]
Pixel offset from aperture center = -2 | Mean counts along aperture = 0.84721
Pixel offset from aperture center = -1 | Mean counts along aperture = 1.35783
Pixel offset from aperture center = 0 | Mean counts along aperture = 1.46715
Pixel offset from aperture center = 1 | Mean counts along aperture = 1.34046
Pixel offset from aperture center = 2 | Mean counts along aperture = 0.77173
Pixel offset from aperture center = 3 | Mean counts along aperture = 0.17888
[STEP] Performing dispcor...
Allocate wavelengths based on the ThAr file: /Users/yuikasagi/pyird/data/20210317/reduc/thar_h_m2.fits
Created 1D spectrum: wblaze_h_m2.dat
Step 2-5: Normalizing the Spectra
Normalize the target spectrum by dividing it by the blaze function.
After
normalize1D, the normalized spectrum (nw…_m?.dat) and the order-combined spectrum (ncw…_m?.dat) are created.Data formats are:
Normalized (nw):
$1: Wavelength [nm],$2: Order,$3: Counts,$4: S/N,$5: UncertaintiesOrder-combined (ncw):
$1: Wavelength [nm],$2: Counts,$3: S/N,$4: Uncertainties
For the order-combined spectra: There are overlapping wavelengths at the edges of orders, so we “normalize” by summing up the flux in these regions to improve the signal-to-noise ratio.
# combine & normalize
if mmf=='mmf2':
target.normalize1D(master_path=flat_star.anadir, readout_noise_mode=readout_noise_mode)
elif mmf=='mmf1':
target.normalize1D(master_path=flat_comb.anadir, readout_noise_mode=readout_noise_mode)
[STEP] Performing normalize1D...
Using default readout Noise : 12
readout noise of IRD detectors: ~12e- (10min exposure)
Created normalized 1D spectrum: nw41511_m2.dat
Created normalized & order-combined 1D spectrum: ncw41511_m2.dat
This concludes the data reduction process!
See Output Spectra and Overview of Output Files for detailed explanations of output spectra.