Likelihood Fitting of Tracer Luminosity Function

In this tutorial, we show how you can perform simple likelihood fitting to the tracer luminosity function (LF) using the lumfunc_likelihood module.

Load luminosity function measurements

To get hold of the demo data (the LF measurement and uncertainty files), we use the get_test_data_loc function from the utils module.

[1]:
from horizonground.utils import get_test_data_loc

demo_measurements_file = get_test_data_loc("eBOSS_QSO_LF_measurements.txt")
demo_uncertainties_file = get_test_data_loc("eBOSS_QSO_LF_uncertainties.txt")

Let’s have a preview of the files’ first few rows. Note that the table headers format the column names as lg_Phi_z_<zbin_min>_<zbin_max>, where the 'lg_' prefix is optional and indicates that the values are base-10 logarithms.
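As an aside, this naming convention can be parsed mechanically. Below is a minimal sketch with a hypothetical helper (not part of the package) that extracts the logarithm flag and the redshift bin edges from a column name:

```python
import re

# Matches an optional 'lg_' prefix followed by 'Phi_z_<zbin_min>_<zbin_max>'.
COLUMN_PATTERN = re.compile(r"^(lg_)?Phi_z_(\d+\.\d+)_(\d+\.\d+)$")

def parse_column_name(name):
    """Return (is_log, zbin_min, zbin_max) parsed from a table column name."""
    match = COLUMN_PATTERN.match(name)
    if match is None:
        raise ValueError(f"Unrecognised column name: {name}")
    is_log = match.group(1) is not None
    return is_log, float(match.group(2)), float(match.group(3))

print(parse_column_name("lg_Phi_z_0.68_1.06"))  # (True, 0.68, 1.06)
```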

[2]:
import pandas as pd
from IPython.display import display

display(pd.read_csv(demo_measurements_file, sep=r'\s+', index_col=0, nrows=4, escapechar='#'))
display(pd.read_csv(demo_uncertainties_file, sep=r'\s+', index_col=0, nrows=4, escapechar='#'))
           lg_Phi_z_0.68_1.06  lg_Phi_z_1.06_1.44  lg_Phi_z_1.44_1.82  lg_Phi_z_1.82_2.20  lg_Phi_z_2.20_2.60  lg_Phi_z_2.60_3.00  lg_Phi_z_3.00_3.50  lg_Phi_z_3.50_4.00
magnitude
-28.8                     NaN                 NaN                 NaN                 NaN                 NaN                 NaN                 NaN               -8.34
-28.4                     NaN                 NaN                 NaN                 NaN               -7.98               -7.42               -7.89               -8.35
-28.0                     NaN                 NaN                 NaN               -7.63               -7.49               -7.10               -7.33               -8.36
-27.6                     NaN                 NaN               -7.32               -6.91               -6.88               -6.99               -7.33                 NaN
           d_lg_Phi_z_0.68_1.06  d_lg_Phi_z_1.06_1.44  d_lg_Phi_z_1.44_1.82  d_lg_Phi_z_1.82_2.20  d_lg_Phi_z_2.20_2.60  d_lg_Phi_z_2.60_3.00  d_lg_Phi_z_3.00_3.50  d_lg_Phi_z_3.50_4.00
magnitude
-28.8                       NaN                   NaN                   NaN                   NaN                   NaN                   NaN                   NaN                  0.44
-28.4                       NaN                   NaN                   NaN                   NaN                  0.31                  0.17                  0.26                  0.44
-28.0                       NaN                   NaN                   NaN                  0.22                  0.18                  0.12                  0.14                  0.44
-27.6                       NaN                   NaN                  0.16                  0.11                  0.10                  0.11                  0.14                   NaN

The LumFuncMeasurements class organises the data tables into luminosity and redshift bins. If the boolean keyword parameter base10_log is True, the LF values are converted to base-10 logarithms, unless 'lg_' is detected in the table column names (i.e. the raw data are already logarithmic). Please note that naming your table columns as shown above helps LumFuncMeasurements process the data labels and values.
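The conversion rule just described can be sketched in a few lines; maybe_log10 below is a hypothetical helper for illustration, not the library's implementation:

```python
import numpy as np

def maybe_log10(column_name, values):
    """Convert raw LF values to base-10 logarithms, unless the column
    name carries the 'lg_' prefix marking the data as already logarithmic.
    """
    if column_name.startswith('lg_'):
        return np.asarray(values)
    return np.log10(np.asarray(values))

print(maybe_log10('Phi_z_0.68_1.06', [1.e-7]))   # [-7.]
print(maybe_log10('lg_Phi_z_0.68_1.06', [-7.]))  # [-7.]
```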

[3]:
from pprint import pprint
from collections import OrderedDict

from horizonground.lumfunc_likelihood import LumFuncMeasurements

LF_data = LumFuncMeasurements(demo_measurements_file, demo_uncertainties_file, base10_log=True)

pprint(dict(zip(
    ["Luminosity bins", "Redshift bin centres", "Redshift bin labels"],
    [LF_data.luminosity_bins, LF_data.redshift_bins, LF_data.redshift_labels]
)))
{'Luminosity bins': array([-28.8, -28.4, -28. , -27.6, -27.2, -26.8, -26.4, -26. , -25.6,
       -25.2, -24.8, -24.4, -24. , -23.6, -23.2, -22.8, -22.4, -22. ,
       -21.6, -21.2, -20.8]),
 'Redshift bin centres': array([0.87, 1.25, 1.63, 2.01, 2.4 , 2.8 , 3.25, 3.75]),
 'Redshift bin labels': ['$0.68<z<1.06$',
                         '$1.06<z<1.44$',
                         '$1.44<z<1.82$',
                         '$1.82<z<2.20$',
                         '$2.20<z<2.60$',
                         '$2.60<z<3.00$',
                         '$3.00<z<3.50$',
                         '$3.50<z<4.00$']}

You can use either dictionary-key or (zero-based) integer-index syntax to access measurements and uncertainties in a particular redshift bin, e.g.:

[4]:
measurements_in_bin, uncertainties_in_bin = LF_data['z=0.87']

print("Redshift bin z=0.87 measurements: ", measurements_in_bin)
print("Redshift bin z=0.87 uncertainties: ", uncertainties_in_bin)

measurements_in_bin, uncertainties_in_bin = LF_data[2]

print("Third redshift bin measurements: ", measurements_in_bin)
print("Third redshift bin uncertainties: ", uncertainties_in_bin)
Redshift bin z=0.87 measurements:  [  nan   nan   nan   nan   nan   nan -8.13 -6.94 -6.53 -6.32 -6.09 -5.81
 -5.76 -5.51 -5.45 -5.39 -5.45 -5.46   nan   nan   nan]
Redshift bin z=0.87 uncertainties:  [ nan  nan  nan  nan  nan  nan 0.44 0.12 0.09 0.08 0.07 0.06 0.06 0.05
 0.05 0.05 0.06 0.06  nan  nan  nan]
Third redshift bin measurements:  [  nan   nan   nan -7.32 -6.76 -6.51 -6.25 -6.1  -5.92 -5.68 -5.59 -5.52
 -5.42 -5.36 -5.3    nan   nan   nan   nan   nan   nan]
Third redshift bin uncertainties:  [ nan  nan  nan 0.16 0.09 0.08 0.07 0.06 0.06 0.05 0.05 0.05 0.05 0.05
 0.05  nan  nan  nan  nan  nan  nan]

Finally, for likelihood fitting it is more useful to obtain the LF data vectors of measurements and uncertainties. The elements are ordered by redshift and then by luminosity, with invalid values omitted. Note that the values are given as logarithms here, since the raw data have symmetric uncertainties in \(\lg\varPhi\).
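Schematically, the assembly of these vectors works as follows; this is a NumPy sketch of the ordering convention on toy data, not the library's implementation:

```python
import numpy as np

# Toy measurements: rows = redshift bins, columns = luminosity bins.
lg_Phi = np.array([
    [np.nan, -7.0, -6.5],   # first redshift bin
    [-7.3, -6.8, np.nan],   # second redshift bin
])
d_lg_Phi = np.array([
    [np.nan, 0.12, 0.09],
    [0.16, 0.08, np.nan],
])

# Flatten by redshift then luminosity, dropping invalid (NaN) entries.
valid = ~np.isnan(lg_Phi)
data_vector = lg_Phi[valid]
variance_vector = d_lg_Phi[valid] ** 2  # symmetric uncertainties -> variances

print(data_vector)      # [-7.  -6.5 -7.3 -6.8]
print(variance_vector)  # [0.0144 0.0081 0.0256 0.0064]
```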

[5]:
data_vector, variance_vector = LF_data.get_statistics()

print("Data vector: ", data_vector)
print("Variance vector: ", variance_vector)
Data vector:  [-8.13 -6.94 -6.53 -6.32 -6.09 -5.81 -5.76 -5.51 -5.45 -5.39 -5.45 -5.46
 -8.2  -7.15 -6.59 -6.39 -6.15 -5.97 -5.78 -5.59 -5.5  -5.42 -5.34 -5.35
 -7.32 -6.76 -6.51 -6.25 -6.1  -5.92 -5.68 -5.59 -5.52 -5.42 -5.36 -5.3
 -7.63 -6.91 -6.63 -6.33 -6.17 -5.99 -5.77 -5.73 -5.6  -5.5  -5.41 -5.49
 -7.98 -7.49 -6.88 -6.67 -6.39 -6.18 -6.15 -5.96 -5.79 -5.66 -5.67 -5.62
 -7.42 -7.1  -6.99 -6.8  -6.57 -6.32 -6.16 -6.05 -5.93 -5.88 -5.81 -7.89
 -7.33 -7.33 -7.26 -6.75 -6.55 -6.37 -6.19 -6.23 -6.1  -8.34 -8.35 -8.36
 -7.49 -7.17 -6.71]
Variance vector:  [0.1936 0.0144 0.0081 0.0064 0.0049 0.0036 0.0036 0.0025 0.0025 0.0025
 0.0036 0.0036 0.1936 0.0196 0.0064 0.0049 0.0036 0.0036 0.0036 0.0025
 0.0025 0.0025 0.0025 0.0025 0.0256 0.0081 0.0064 0.0049 0.0036 0.0036
 0.0025 0.0025 0.0025 0.0025 0.0025 0.0025 0.0484 0.0121 0.0064 0.0049
 0.0036 0.0036 0.0025 0.0025 0.0025 0.0025 0.0025 0.0036 0.0961 0.0324
 0.01   0.0064 0.0049 0.0036 0.0036 0.0036 0.0036 0.0025 0.0025 0.0036
 0.0289 0.0144 0.0121 0.0081 0.0064 0.0049 0.0036 0.0036 0.0036 0.0036
 0.0036 0.0676 0.0196 0.0196 0.0169 0.0064 0.0049 0.0049 0.0036 0.0049
 0.0064 0.1936 0.1936 0.1936 0.0289 0.0169 0.01  ]

Evaluate luminosity function likelihood

We will consider the LF likelihood for the quasar PLE luminosity function and the following uniform prior parameter ranges.

[6]:
demo_prior_file = get_test_data_loc("QSO_LF_PLE_model_prior.txt")

with open(demo_prior_file, 'r') as prior_file:
    print(prior_file.read())
# m_\ast(z_\mathrm{p}), \lg\Phi_\ast, \alpha_\mathrm{l}, \alpha_\mathrm{h}, \beta_\mathrm{l}, \beta_\mathrm{h}, k_{1\mathrm{l}}, k_{1\mathrm{h}}, k_{2\mathrm{l}}, k_{2\mathrm{h}}
-29.  -8.  -10.  -5.  -3.  -3.  -1.5  -1.5  -1.5  -1.5
-22.  -5.   -0.  -0.  -0.  -0.   1.5   1.5   1.5   1.5

The LumFuncLikelihood class provides likelihood evaluation tools. It derives from LumFuncMeasurements and so automatically handles the data vector as above; it also processes the prior file above into a dictionary of parameter names and prior ranges. For the quasar PLE model, which is a double power law, there is an exchange symmetry between the power-law indices \(\alpha, \beta\). To avoid a redundantly multi-modal posterior, we impose the constraint \(\alpha < \beta\) by passing the function quasar_PLE_model_constraint to the keyword parameter model_constraint. The log-likelihood is essentially a Gaussian quadratic form in a variable \(\tilde{\varPhi} = f(\varPhi)\) that is related to the luminosity function \(\varPhi\); here we choose the Poisson prescription in Pozzetti et al. (2016) [arXiv: 1603.01453]. For other prescriptions, please refer to Pozzetti et al. (2016) and the API reference.
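A model constraint is just a predicate on the parameter point. The sketch below illustrates the idea of breaking the \(\alpha \leftrightarrow \beta\) exchange symmetry; the function and the parameter keys are hypothetical, assumed to match the prior-file headers, and not the package's actual implementation:

```python
def PLE_model_constraint_sketch(params):
    r"""Return True if the parameter point respects \alpha < \beta in
    both the low- ('l') and high-redshift ('h') branches, removing the
    exchange symmetry of the double power law."""
    return (
        params[r'\alpha_\mathrm{l}'] < params[r'\beta_\mathrm{l}']
        and params[r'\alpha_\mathrm{h}'] < params[r'\beta_\mathrm{h}']
    )

# A point with alpha < beta in both branches passes the constraint.
point = {
    r'\alpha_\mathrm{l}': -4.31, r'\beta_\mathrm{l}': -1.54,
    r'\alpha_\mathrm{h}': -3.04, r'\beta_\mathrm{h}': -1.38,
}
print(PLE_model_constraint_sketch(point))  # True
```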

[7]:
from horizonground.lumfunc_modeller import quasar_PLE_lumfunc, quasar_PLE_model_constraint
from horizonground.lumfunc_likelihood import LumFuncLikelihood

likelihood = LumFuncLikelihood(
    quasar_PLE_lumfunc,
    demo_measurements_file,
    demo_prior_file,
    uncertainty_file=demo_uncertainties_file,
    fixed_file=None,
    prescription='poisson',
    model_constraint=quasar_PLE_model_constraint,
    model_options={'redshift_pivot': 2.22}
)

It is also worth noting that parameters which are not meant to be sampled but rather kept fixed can be passed using fixed_file (see below for an example). The model_options keyword parameter is reserved for quantities that are not part of the parametric LF model per se but merely parameters of the Python implementation, e.g. the pivot redshift or the conversion of the LF to logarithmic values.

[8]:
demo_fixed_file = get_test_data_loc("QSO_LF_PLE_model_fixed.txt")

with open(demo_fixed_file, 'r') as fixed_file:
    print(fixed_file.read())
# m_\ast(z_\mathrm{p}), \lg\Phi_\ast, \alpha_\mathrm{l}, \alpha_\mathrm{h}, \beta_\mathrm{l}, \beta_\mathrm{h}, k_{1\mathrm{l}}, k_{1\mathrm{h}}, k_{2\mathrm{l}}, k_{2\mathrm{h}}
-26.71, -6.01, -4.31, -3.04, -1.54, -1.38, -0.08, -0.25, -0.40, -0.05

After constructing the likelihood class as above, it can be called just like a function, e.g. evaluated at the fixed parameter values above. We use the load_parameter_set function from the utils module to convert the parameter file above into a dictionary of parameter names and values at which the log-likelihood (essentially \(-\chi^2\big/2\)) is evaluated.
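For a diagonal covariance, the "Gaussian quadratic form" mentioned above reduces to \(-\chi^2/2\) over the transformed data and model vectors. A self-contained sketch on toy numbers (not the package's implementation, which also applies the \(\tilde{\varPhi} = f(\varPhi)\) transformation and the model constraint):

```python
import numpy as np

def gaussian_log_likelihood(model_vector, data_vector, variance_vector):
    """Return -chi^2/2 for independent Gaussian errors (diagonal covariance)."""
    residual = np.asarray(model_vector) - np.asarray(data_vector)
    return -0.5 * np.sum(residual ** 2 / np.asarray(variance_vector))

# Toy example with three data points.
print(gaussian_log_likelihood(
    [-7.0, -6.5, -6.0], [-7.1, -6.5, -6.2], [0.01, 0.01, 0.04]
))  # approximately -1.0
```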

[9]:
from horizonground.utils import load_parameter_set

parameter_point = load_parameter_set(demo_fixed_file)

print(likelihood(parameter_point.values()))
-78.13664554573882