2. Fitting SEDs with a Custom Model#

This tutorial will walk you through customizing the B5 model and fitting the same AT2022cmc data as the first tutorial.

[1]:

# imports

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import syncfit

2.1. Reading in the Data#

This is the same as the first tutorial, it is just here as a refresher.

[2]:

# read in the data
cmc = pd.read_csv('AT2022cmc_Andreoni2022.txt')
cmc

[2]:

	facility	date	dt	nu	F_nu	F_err	upperlimit
0	NOEMA	2022-03-24 22:14	41.48	86.0	3914	30	True
1	NOEMA	2022-03-24 22:14	41.48	102.0	3609	34	False
2	NOEMA	2022-03-25 00:45	41.58	136.0	3045	41	False
3	NOEMA	2022-03-25 00:45	41.58	152.0	2750	51	False
4	VLA	2022-03-31 04:08	47.73	31.4	2130	30	False
5	VLA	2022-03-31 04:08	47.73	33.5	2260	30	False
6	VLA	2022-03-31 04:08	47.73	35.6	2350	40	False
7	VLA	2022-03-31 04:08	47.73	37.5	2360	40	True
8	VLA	2022-03-31 04:13	47.73	8.5	270	12	False
9	VLA	2022-03-31 04:13	47.73	9.5	336	11	False
10	VLA	2022-03-31 04:13	47.73	10.5	385	12	False
11	VLA	2022-03-31 04:13	47.73	11.5	438	14	False
12	VLA	2022-03-31 04:23	47.74	12.8	583	12	False
13	VLA	2022-03-31 04:23	47.74	14.3	724	12	False
14	VLA	2022-03-31 04:23	47.74	15.9	801	14	True
15	VLA	2022-03-31 04:23	47.74	17.4	935	15	False

[3]:

# now plot the SED
fig, ax = plt.subplots(1)

ax.errorbar(cmc.nu*1e9, cmc.F_nu, yerr=cmc.F_err, fmt='o')
ax.set_xscale('log')
ax.set_yscale('log')

ax.set_xlabel('Frequency [GHz]')
ax.set_ylabel('Flux [uJy]')

[3]:

Text(0, 0.5, 'Flux [uJy]')

../_images/tutorials_2_fit_custom_model_4_1.png

2.2. Fitting the Data#

Now that we have the AT2022cmc data read in, we can fit it using syncfit. Let’s first just use the default B5 model like we did in the first tutorial.

[4]:

# the initial guesses
theta_init = [
    3, # p-value guess
    1, # initial log_F_nu guess
    10, # initial nu_a guess, this is all we need for B5!
]

# the number of walkers
# 32 is usually fine, more makes it slower!
nwalkers = 32

# now define the number of iterations
# something small for now to make it fast,
# usually ~2000 gives a converging chaing
niter = 500

# now we can fit the data
sampler = syncfit.do_emcee(
    theta_init = theta_init,
    nu = cmc.nu,
    F_muJy = cmc.F_nu,
    F_error = cmc.F_err,
    model = syncfit.models.B5, # get the model from syncfit
    niter = niter,
    nwalkers = nwalkers
)

# Then let's just check how the fit looks
syncfit.analysis.plot_best_fit(
    model = syncfit.models.B5,
    sampler = sampler,

    # be careful with units for the data
    # the frequency must be in GHz space
    # and the flux densities must be in mJy space
    nu = cmc.nu*1e9,
    F = cmc.F_nu*1e-3
)

# print out the constraints we get
labels = syncfit.models.B5.get_labels()
constraints = syncfit.analysis.get_bounds(sampler, labels, toprint=True)

100%|███████████████████████████████████████████████████████████████████████████████| 500/500 [00:00<00:00, 2602.04it/s]

../_images/tutorials_2_fit_custom_model_6_1.png

\mathrm{p} = 2.01e+00_{-0.005}^{0.487}
\mathrm{log F_v} = 7.87e-01_{-0.119}^{0.053}
\mathrm{log nu_a} = 1.05e+01_{-1.051}^{0.007}

Which has a pretty good looking fit! But, let’s see if we can make it better by customizing the B5 model a little more! For this tutorial, we are going to modify the prior function, called lnprior, but some other methods that can be customized are SED (the model itself) and loglik (the likelihood function which, by default, is just gaussian).

To do this, we use the decorator @syncfit.models.B5.override. All of the models, including the generic SyncfitModel model, have this override decorator which allows you to customize that specific method of the model class. If you want to use the generic SyncfitModel instead of one of the other builtin models please note that you will need to define lnprior, SED, and get_labels using this decorator.

[5]:

@syncfit.models.B5.override
def lnprior(theta, p=None, **kwargs):
    '''
    Our customized lnprior function. This needs the arguments
    that it has and no more or less! Otherwise it won't work
    when we actually try to run the MCMC.
    '''

    # All of this is copied from the source code and is fine to leave
    # since it is just unpacking the theta value
    if p is None:
        p, log_F_nu, log_nu_a= theta
    else:
        log_F_nu, log_nu_a = theta

    # However, here we want to change things to narrow the
    # priors on all three parameters
    if 2.9 < p < 3.1 and -1 < log_F_nu < 1 and 9 < log_nu_a < 11:
        return 0.0

    else:
        return -np.inf

In this case, since we’ve just overwritten a method of the B5 model class directly, we can just pass in the B5 model again.

[6]:

# now we can fit the data
sampler = syncfit.do_emcee(
    theta_init = theta_init,
    nu = cmc.nu,
    F_muJy = cmc.F_nu,
    F_error = cmc.F_err,
    model = syncfit.models.B5, # use the same model here!
    niter = niter,
    nwalkers = nwalkers
)

# Then let's just check how the fit looks
syncfit.analysis.plot_best_fit(
    model = syncfit.models.B5,
    sampler = sampler,

    # be careful with units for the data
    # the frequency must be in GHz space
    # and the flux densities must be in mJy space
    nu = cmc.nu*1e9,
    F = cmc.F_nu*1e-3
)

# print out the constraints we get
labels = syncfit.models.B5B3.get_labels()
constraints = syncfit.analysis.get_bounds(sampler, labels, toprint=True)

  0%|                                                                                           | 0/500 [00:00<?, ?it/s]/home/nfranz/.local/lib/anaconda3/lib/python3.11/site-packages/emcee/moves/red_blue.py:99: RuntimeWarning: invalid value encountered in scalar subtract
  lnpdiff = f + nlp - state.log_prob[j]
100%|███████████████████████████████████████████████████████████████████████████████| 500/500 [00:00<00:00, 2566.89it/s]

../_images/tutorials_2_fit_custom_model_10_1.png

\mathrm{p} = 2.91e+00_{-0.007}^{0.063}
\mathrm{log F_v} = 9.58e-01_{-0.014}^{0.004}
\mathrm{log nu_a} = 1.06e+01_{-0.110}^{0.003}

You can also do this customization by subclassing. I know subclassing and learning what that means can be somewhat of a steep learning curve. But, it is the best way to reduce code redundancy for this type of software! If you need to refresh your memory on subclassing in python, I recommend this tutorial for a quick overview: https://www.geeksforgeeks.org/create-a-python-subclass/. The basic idea though, is that when you subclass you get all of the methods from the superclass that you are using.

So, now let’s actually do this customization! In this case, we use the superclass syncfit.models.B5 which itsels subclasses the syncfit.models.SyncfitModel superclass.

[7]:

class CustomB5(syncfit.models.B5):

    def lnprior(theta, p=None, **kwargs):
        '''
        Our customized lnprior function. This needs the arguments
        that it has and no more or less! Otherwise it won't work
        when we actually try to run the MCMC.
        '''

        # All of this is copied from the source code and is fine to leave
        # since it is just unpacking the theta value
        if p is None:
            p, log_F_nu, log_nu_a= theta
        else:
            log_F_nu, log_nu_a = theta

        # However, here we want to change things to narrow the
        # priors on all three parameters
        if 2.9 < p < 3.1 and -1 < log_F_nu < 1 and 9 < log_nu_a < 11:
            return 0.0

        else:
            return -np.inf

Now that we’ve customized the B5 model with some more stringent priors, let’s try fitting our data again!

[8]:

# now we can fit the data
sampler = syncfit.do_emcee(
    theta_init = theta_init,
    nu = cmc.nu,
    F_muJy = cmc.F_nu,
    F_error = cmc.F_err,
    model = CustomB5, # use our custom model here!
    niter = niter,
    nwalkers = nwalkers
)

# Then let's just check how the fit looks
syncfit.analysis.plot_best_fit(
    model = syncfit.models.B5,
    sampler = sampler,

    # be careful with units for the data
    # the frequency must be in GHz space
    # and the flux densities must be in mJy space
    nu = cmc.nu*1e9,
    F = cmc.F_nu*1e-3
)

# print out the constraints we get
labels = syncfit.models.B5B3.get_labels()
constraints = syncfit.analysis.get_bounds(sampler, labels, toprint=True)

100%|███████████████████████████████████████████████████████████████████████████████| 500/500 [00:00<00:00, 2642.76it/s]

../_images/tutorials_2_fit_custom_model_14_1.png

\mathrm{p} = 2.91e+00_{-0.009}^{0.065}
\mathrm{log F_v} = 9.58e-01_{-0.318}^{0.004}
\mathrm{log nu_a} = 1.06e+01_{-1.299}^{0.003}

At first glance, the fit looks very similar to the first fit we had. But, we know that we were finding somewhat unphysical values (\(p=2.01\) doesn’t really make sense) before changing the priors. Although, this also serves as a warning that choosing your priors is very important for the constraints you get! It is usually a good idea to try different sets of priors based on your own intuition.

Fitting SEDs with a Custom Model

Contents

2. Fitting SEDs with a Custom Model#

2.1. Reading in the Data#

2.2. Fitting the Data#