Fitting the telluric line model

September 25th 2024, Hajime Kawahara

Let’s try modeling the telluric lines, which is one of the nuisances in high-resolution spectral observations by ground-based telescopes, using ExoJAX in this notebook. We use the external packages, specutils and JoviSpec in this notebook.

To install specutils and JoviSpec, run the following commands

pip install specutils
git clone https://github.com/HajimeKawahara/jovispec.git
cd jovispec
python setup.py install
from jax import config

config.update("jax_enable_x64", True)

First, as our data, we will use the spectrum of an A-type star called WASP-33, which was captured with Subaru/HDS. In fact, on the same night, we also obtained the spectrum of Jupiter, and the repository primarily for publishing Jupiter’s reflected light data is JoviSpec.

Let me share a little memory about this data. This data was my first attempt at molecular detection in exoplanets using the Subaru Telescope. In 2012, a paper by M. Brogi et al. on CO detection using VLT/CRIRES, Brogi et al. 2012, Nature 486, 7404, was published. VLT/CRIRES, being a high-resolution spectrometer in the near-infrared, was ideal for detecting the thermal radiation from hot Jupiters. In Japan, at that time, the only high-resolution spectrometer (with R close to 100,000) attached to the Subaru Telescope was HDS (of course, now there is IRD!), so I thought of observing the super-hot Jupiter WASP33b, whose blackbody radiation extends to the visible region, and detecting TiO. But this was my first proposal to Subaru Telescope, and it was rejected the first time. It passed on the second try, and my first trip to the summit of Mauna Kea was for this observation.

The observation in the fall of 2014 was my first experience with the ‘Subaru Telescope’. Due to bad weather until the day before, the humidity was high, and we struggled to open the telescope. Finally, when the humidity dropped, we were able to capture the data of WASP33. However, morning came quickly. So, we persisted until the last moment to capture Jupiter, and that data is the high-resolution Jupiter reflected light data provided by JoviSpec.

By the way, in the following year, despite my recollection of a hurricane being present, we were fortunate with the weather during the observation nights and were able to collect data with good signal-to-noise ratio (S/N). With the data from 2015, we were able to report on TiO and the temperature inversion layer in Nugroho et al. (2017) AJ 154, 221.

Sorry for the long story, but this star, WASP33, being an A-type star with a large vsini, means that most of the high-frequency components in its spectrum are actually telluric lines. Therefore, we will use the spectrum of this WASP33 as an example for a simple model of telluric lines.

from jovispec import abcio
import pkg_resources

jupiter_data = pkg_resources.resource_filename("jovispec", "jupiter_data")

# blue
rlamb2, rspec2, rhead = abcio.read_qfits("06002", jupiter_data, ext="q")  # WASP33b
rlamb3, rspec3, rhead = abcio.read_qfits("06004", jupiter_data, ext="q")  # WASP33b
rlamb4, rspec4, rhead = abcio.read_qfits("06006", jupiter_data, ext="q")  # WASP33b
rlamb5, rspec5, rhead = abcio.read_qfits("06008", jupiter_data, ext="q")  # WASP33b
rlamb = rlamb2
rspec = rspec2 + rspec3 + rspec4 + rspec5

wavelength_start = 7100.0  # AA
wavelength_end = 7450.0  # AA

import matplotlib.pyplot as plt

fig = plt.figure(figsize=(20, 5))
plt.plot(rlamb, rspec)
plt.ylim(0.0, 1.0)
plt.xlim(wavelength_start, wavelength_end)
plt.title("Raw spectrum")
plt.show()
WARNING: VerifyWarning: Invalid 'BLANK' keyword in header.  The 'BLANK' keyword is only applicable to integer data, and will be ignored in this HDU. [astropy.io.fits.hdu.image]
../_images/Fitting_Telluric_Lines_4_1.png

First, we’ll start with some basic cleaning of the data. As those familiar with ground-based high-resolution spectra know, high-resolution spectra often have outliers near the edges of orders and in other places. We’ll mask these outliers. For the analysis, let’s limit ourselves to the wavelength region where water’s telluric lines are visible. Interestingly, Jupiter’s strong methane feature is in the same region, so in the notebook for reflected light analysis, we’ll analyze the same region using Jupiter’s spectrum this time.

Next, we’ll convert the wavelengths from air values to vacuum values. We should also define the wavenumber grid.

In ExoJAX, don’t forget to set the wavenumber grid in ascending order. As a result, make sure that when viewed in the wavelength grid, it appears in descending order.

import numpy as np

# mask some bad regions... as usual in astronomy
mask_wav = [
[7114.0,7114.2],
[7136.8, 7137.0],
[7199.0,7200.0],
[7205.8,7206.0],
[7206.25,7206.75],
[7208.2,7208.4],
[7222.8,7224.0],
[7311.0,7313.],
[7388.3,7388.5],
[7396.4,7396.6],
[7401.0,7405.0]
]
rlamb = np.array([float(d) for d in rlamb])
mask_index=np.digitize(mask_wav,rlamb)
for ind in mask_index:
    rspec[ind[0]:ind[1]+1] = None


# None for outside wvelength start - end region
start_index=np.digitize(wavelength_start,rlamb)
end_index=np.digitize(wavelength_end,rlamb)
rspec[:start_index] = None
rspec[end_index:] = None

#Air-Vaccum correction
from specutils.utils.wcs_utils import refraction_index
import astropy.units as u
mask = rspec == rspec
rlamb = rlamb[mask]
rspec = rspec[mask]
nair = refraction_index(rlamb*u.AA,method="Ciddor1996")
rlamb = rlamb*nair

# ascending wavenumber form
from exojax.utils.grids import wav2nu
rlamb = rlamb[::-1]
nus_obs = wav2nu(rlamb, unit="AA")
rspec = rspec[::-1]

It’s common for spectra to have a trend, and while it would be proper to model it seriously using methods like Gaussian processes, let’s take a shortcut here and see how well we can do with a median filter.

#median filter
from scipy.signal import medfilt
mspec = medfilt(rspec, kernel_size=1001)
fig = plt.figure(figsize=(20,5))
plt.plot(rlamb,rspec, label="masked spectrum")
plt.plot(rlamb,mspec, label="median filter")
plt.legend()
plt.title("masked spectrum")
plt.show()
../_images/Fitting_Telluric_Lines_9_0.png

Since the results are reasonably good, let’s go ahead and divide the spectrum by the median filter. But remember, this is a lazy approach!

#median subtracted
rspec = rspec/mspec*np.mean(rspec)
fig = plt.figure(figsize=(20,5))
plt.plot(rlamb,rspec, label="median subtracted masked spectrum")
plt.legend()
plt.show()
../_images/Fitting_Telluric_Lines_12_0.png

Now we get to the main part. Since it’s a low-temperature environment, HITRAN is sufficient as the molecular database (originally, HITRAN was developed for Earth’s atmosphere). For the opacity calculator, although Direct could be used, here we’ll use PreMODIT.

wavelength_start = 7100.0  # AA
wavelength_end = 7450.0  # AA
from exojax.database.api  import MdbHitran
from exojax.opacity import OpaPremodit
from exojax.utils.grids import wavenumber_grid
from exojax.utils.grids import wav2nu

N = 2**15

margin = 10  # cm-1
nus_start = wav2nu(wavelength_end, unit="AA") - margin
nus_end = wav2nu(wavelength_start, unit="AA") + margin
#nus_start = 1.e8/wavelength_end - margin
#nus_end = 1.e8/wavelength_start + margin

mdb_water = MdbHitran("H2O", nurange=[nus_start, nus_end], isotope=1)
nus, wav, res = wavenumber_grid(nus_start, nus_end, N, xsmode="lpf", unit="cm-1")
# opa = OpaDirect(mdb_water, nu_grid=nus)
snap = mdb_water.to_snapshot()
opa = OpaPremodit.from_snapshot(snap, nu_grid=nus, allow_32bit=True, auto_trange=[150.0, 300.0])
radis engine =  vaex
xsmode =  lpf
xsmode assumes ESLOG in wavenumber space: xsmode=lpf
Your wavelength grid is in *  descending  * order
The wavenumber grid is in ascending order by definition.
Please be careful when you use the wavelength grid.
default elower grid trange (degt) file version: 2
Robust range: 149.42336577900824 - 361.77172380843604 K
/home/kawahara/exojax/src/exojax/utils/grids.py:249: UserWarning: Resolution may be too small. R=660967.8229034714
  warnings.warn("Resolution may be too small. R=" + str(resolution), UserWarning)
/home/kawahara/exojax/src/exojax/opacity/_common/set_ditgrid.py:62: UserWarning: There exists negative or zero value.
  warnings.warn("There exists negative or zero value.")
OpaPremodit: gamma_air and n_air are used. gamma_ref = gamma_air/Patm
max value of  ngamma_ref_grid : 7.591520406260801
min value of  ngamma_ref_grid : 0.6153039614527465
ngamma_ref_grid grid : [0.61530387 0.71331317 0.82693386 0.95865272 1.11135253 1.28837525
 1.49359519 1.73150376 2.0073078  2.32704351 2.6977086  3.12741539
 3.62556839 4.2030701  4.87255967 5.64868945 6.54844571 7.59152079]
max value of  n_Texp_grid : 0.88
min value of  n_Texp_grid : -0.08
n_Texp_grid grid : [-0.08000001  0.24        0.56        0.88000005]
uniqidx: 100%|██████████| 29/29 [00:00<00:00, 3021.01it/s]
Premodit: Twt= 294.0349975006009 K Tref= 168.89218925939926 K
Making LSD:|--------------------| 0%
Making LSD:|####################| 100%

The most straightforward telluric line model involves an absorber with a single temperature, pressure, and column density. It’s worth noting that @YuiKasagi was the first to try this model in ExoJAX.

It seems to fit the data reasonably well (as initial values) with appropriate values.

import jax.numpy as jnp

T = 200.0
P = 0.5
xsv = opa.xsvector(T, P)
nl = 2.0e22
a = 0.52

fig = plt.figure(figsize=(20, 5))
ax = fig.add_subplot(211)
plt.plot(nus_obs, rspec)
plt.ylim(0.0, 0.7)
plt.plot(nus, a * jnp.exp(-nl * xsv), alpha=0.5)
# plt.xlim(1.e8/wavelength_start, 1.e8/wavelength_end)

ax = fig.add_subplot(212)
plt.plot(nus_obs, rspec, ".")
plt.plot(nus, 0.52 * jnp.exp(-nl * xsv), alpha=0.5)
plt.ylim(0.0, 0.7)

# plt.xlim(1.e8/7200,1.e8/7250)

plt.title("masked spectrum")
plt.show()
../_images/Fitting_Telluric_Lines_16_0.png

Now, let’s try fitting it using ADAM with these initial values. Ah, let’s remember that the instrumental profile is Gaussian, and we should also consider its width as a fitting parameter.

from exojax.utils.instfunc import resolution_to_gaussian_std

T_b = 200.0
P_b = 0.5
nl_b = 2.0e22
a_b = 0.52
Rinst = 100000.0
beta_inst_b = resolution_to_gaussian_std(Rinst)

initial_guess = np.array([T_b, P_b, nl_b, a_b, beta_inst_b])
initpar = np.ones_like(initial_guess)
# instrumental setting
from exojax.postproc.specop import SopInstProfile

sop_inst = SopInstProfile(nus, 10.0)


def model(params):
    T, P, nl, a, beta = params * initial_guess
    xsv = opa.xsvector(T, P)
    trans = a * jnp.exp(-nl * xsv)
    Frot_inst = sop_inst.ipgauss(trans, beta)
    mu = sop_inst.sampling(Frot_inst, 0.0, nus_obs)
    return mu
/home/kawahara/exojax/src/exojax/utils/grids.py:249: UserWarning: Resolution may be too small. R=660967.8229034714
  warnings.warn("Resolution may be too small. R=" + str(resolution), UserWarning)

Let’s input the initial values into the defined model and, just to be sure, plot it for a visual check. Looks good.

fig = plt.figure(figsize=(20,5))
ax = fig.add_subplot(211)
plt.title("Initial Guess")
plt.plot(nus_obs,rspec)
plt.ylim(0.0,0.7)
plt.plot(nus_obs,model(np.ones(5)),alpha=0.5)
#plt.xlim(1.e8/wavelength_start, 1.e8/wavelength_end)

ax = fig.add_subplot(212)
plt.plot(nus_obs,rspec,".")
plt.plot(nus_obs,model(np.ones(5)),"x",alpha=0.5)
plt.ylim(0.0,0.7)

#plt.xlim(1.e8/7200,1.e8/7250)

plt.show()
../_images/Fitting_Telluric_Lines_21_0.png

Defines the objective function…

def objective(params):
    f=rspec-model(params)
    return jnp.dot(f,f)*1.e-6

Let’s do ADAM!

import optax
import jax
import tqdm
params = jnp.asarray(initpar)
optimizer = optax.adam(1e-3)
opt_state = optimizer.init(params)

value_and_grad = jax.value_and_grad(objective)

@jax.jit
def step(params, opt_state):
    loss, grads = value_and_grad(params)
    updates, opt_state = optimizer.update(grads, opt_state, params)
    params = optax.apply_updates(params, updates)
    return params, opt_state, loss

Nit = 300
params_history = []
for _ in tqdm.tqdm(range(Nit)):
    params, opt_state, loss = step(params, opt_state)
    params_history.append(params)

final_params = np.array(params)
print(final_params * initial_guess)
100%|██████████| 300/300 [00:14<00:00, 20.59it/s]
[2.40405656e+02 4.17231419e-01 2.21894580e+22 5.07214966e-01
 1.28709026e+00]

The results seem good!

fig = plt.figure(figsize=(20,5))
ax = fig.add_subplot(211)
plt.title("Telluric Line Fitting")
plt.plot(nus_obs,rspec)
plt.plot(nus_obs,model(np.ones(len(params))),alpha=0.5, label="initial guess")
plt.plot(nus_obs,model(params),alpha=0.5, label="best")
plt.legend()
ax = fig.add_subplot(212)
plt.plot(nus_obs,rspec,".", label="data")
plt.plot(nus_obs,model(np.ones(len(params))),"x",alpha=0.5, label="initial guess")
plt.plot(nus_obs,model(params),alpha=0.5, label="best")
plt.legend()

#plt.xlim(1.e8/7250,1.e8/7200)
plt.xlim(13865,13900)
#plt.xlim(13865,13870)

plt.show()
../_images/Fitting_Telluric_Lines_27_0.png
fig = plt.figure(figsize=(20,5))
ax = fig.add_subplot(211)
plt.plot(nus_obs,rspec- model(np.ones(len(params))),alpha=0.5,color="C1", label="initial guess")
plt.plot(nus_obs,rspec - model(params),alpha=0.5,color="C2", label="best fit")
plt.title("Residual")
plt.legend()
ax = fig.add_subplot(212)
plt.plot(nus_obs,rspec,".",alpha=0.5,color="C0", label="data")
plt.plot(nus_obs,model(params),alpha=0.5,color="C2", label="best fit")
plt.legend()
<matplotlib.legend.Legend at 0x75e6c45aa1d0>
../_images/Fitting_Telluric_Lines_28_1.png
xs=13685
xe=13695
fig = plt.figure(figsize=(20,5))
ax = fig.add_subplot(211)
plt.title("Residual")
plt.plot(nus_obs,rspec- model(np.ones(len(params))),alpha=0.5,color="C1", label="initial guess")
plt.plot(nus_obs,rspec - model(params),alpha=0.5,color="C2", label="best fit")
plt.xlim(xs,xe)
plt.legend()
ax = fig.add_subplot(212)
plt.plot(nus_obs,rspec,".",alpha=0.5,color="C0", label="data")
plt.plot(nus_obs,model(params),alpha=0.5,color="C2", label="best fit")
plt.xlim(xs,xe)
plt.legend()
<matplotlib.legend.Legend at 0x7f077fae1640>
../_images/Fitting_Telluric_Lines_29_1.png

So, it looks like even such a simple model can fit the data reasonably well.