Introduction: Estimating Proabilties with Bayesian Inference¶

Adapted from WIll Koershen¶

Learning Purposes Only¶

Bayesian Approach¶

Problem:

Today's Bayesian problem of the week: Suppose we visit a wild animal preserve where we know that the only animals are lions and tigers and bears, but we don't know how many of each there are. During the tour, we see 3 lions, 2 tigers, and 1 bear. Assuming that every animal had an equal chance to appear in our sample, estimate the prevalence of each species. What is the probability that the next animal we see is a bear?

In a frequentist view, we would just the the observed animals to esimate the prevalance and let the data completetly speak for itself. In, contract, a Bayesian framework we need to incorpate priors. In the limit as n observations tends to infinity, the effect of the priors disappear, we and only use the data. We can alger the weigght of the priors depending on out confidence in them

Model¶

The overall sysm is as follows:

The underlying model is multinomal with parmaters $p_{k}$
The $prior$ distribution of $p_{i}$ is a Dirichlet distribution
THe $\alpha$ is a parameter of the prior Dirichelt distribution,

THe model can be expressed as:

$$ \begin{align} \alpha = (\alpha_{1}...,\alpha_{K}) && = concentration hyperparameter \\ p | \alpha = (\alpha_{1}...,\alpha_{K})&& = (K,\alpha) \\ X | p = (x_{1}...,x_{K}) && = (K,p) \end{align} $$

Our goal is to fine $p_{lions},p_{tigers},p_{bears}$ given for the observation vector $c = [c_{lions},c_{tigers},c_{bears}]$

Multinomial distribution¶

The problem is a classic example of hte multi. dist, which desribes a situation where we have n independent trials, each with k possible outcomes.

Dirichelt Distribution¶

The prior for a multinomial distribution in Bayesian statistics is a Dirichlet distribution. (Together, a multinomial distribution with a dirichlet prior is called, not surprisingly, a Dirichlet-Multinomial Distribution. The Dirichlet distribution is characterized by $\alpha$, the concentration hyperparameter vector.

Hyperapameters¶

The $\alpha$ vector is a hyperparameter, a parameter of a prior distribution. This vector in turn could have its own prior distribution which is called a hyperprior. We won't use a hyperprior, but instead will only specify the hyperparameters.

The hyperparameter vector can be thought of as pseudo-counts, which we use to show our prior belief for the prevalence of each species. If want a uniform hyperparameter reflecting that we believe the chance of observing any species is the same we set each element of alpha equal such as $\alpha = [1, 1, 1]$. We can increase our decrease the effect of the priors by increasing or decreasing the values. This can be useful when we have more or less confidence in our prior beliefs. We'll see the effects of adjusting the hyperparameters in the notebook, but will mainly work with $\alpha = [1, 1, 1]$.

import pandas as pd
import numpy as np

# Visualizations
import matplotlib.pyplot as plt
import seaborn as sns
plt.style.use('fivethirtyeight')
plt.rcParams['font.size'] = 22
%matplotlib inline

from matplotlib import MatplotlibDeprecationWarning

import warnings
warnings.filterwarnings('ignore', category=FutureWarning)
warnings.filterwarnings('ignore', category=MatplotlibDeprecationWarning)

import pymc3 as pm
import arviz

#helper functions
'''Functions for drawing contours of Dirichlet distributions.'''

# Author: Thomas Boggs

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.tri as tri
from functools import reduce

_corners = np.array([[0, 0], [1, 0], [0.5, 0.75**0.5]])
_triangle = tri.Triangulation(_corners[:, 0], _corners[:, 1])
_midpoints = [(_corners[(i + 1) % 3] + _corners[(i + 2) % 3]) / 2.0 \
              for i in range(3)]

def xy2bc(xy, tol=1.e-3):
    '''Converts 2D Cartesian coordinates to barycentric.
    Arguments:
        `xy`: A length-2 sequence containing the x and y value.
    '''
    s = [(_corners[i] - _midpoints[i]).dot(xy - _midpoints[i]) / 0.75 \
         for i in range(3)]
    return np.clip(s, tol, 1.0 - tol)

class Dirichlet(object):
    def __init__(self, alpha):
        '''Creates Dirichlet distribution with parameter `alpha`.'''
        from math import gamma
        from operator import mul
        self._alpha = np.array(alpha)
        self._coef = gamma(np.sum(self._alpha)) / \
                     reduce(mul, [gamma(a) for a in self._alpha])
    def pdf(self, x):
        '''Returns pdf value for `x`.'''
        from operator import mul
        return self._coef * reduce(mul, [xx ** (aa - 1)
                                         for (xx, aa)in zip(x, self._alpha)])
    def sample(self, N):
        '''Generates a random sample of size `N`.'''
        return np.random.dirichlet(self._alpha, N)

def draw_pdf_contours(dist, border=False, nlevels=200, subdiv=8, **kwargs):
    '''Draws pdf contours over an equilateral triangle (2-simplex).
    Arguments:
        `dist`: A distribution instance with a `pdf` method.
        `border` (bool): If True, the simplex border is drawn.
        `nlevels` (int): Number of contours to draw.
        `subdiv` (int): Number of recursive mesh subdivisions to create.
        kwargs: Keyword args passed on to `plt.triplot`.
    '''
    from matplotlib import ticker, cm
    import math

    refiner = tri.UniformTriRefiner(_triangle)
    trimesh = refiner.refine_triangulation(subdiv=subdiv)
    pvals = [dist.pdf(xy2bc(xy)) for xy in zip(trimesh.x, trimesh.y)]

    plt.tricontourf(trimesh, pvals, nlevels, **kwargs)
    plt.axis('equal')
    plt.xlim(0, 1)
    plt.ylim(0, 0.75**0.5)
    plt.axis('off')
    if border is True:
        plt.hold(1)
        plt.triplot(_triangle, linewidth=1)

def plot_points(X, barycentric=True, border=True, **kwargs):
    '''Plots a set of points in the simplex.
    Arguments:
        `X` (ndarray): A 2xN array (if in Cartesian coords) or 3xN array
                       (if in barycentric coords) of points to plot.
        `barycentric` (bool): Indicates if `X` is in barycentric coords.
        `border` (bool): If True, the simplex border is drawn.
        kwargs: Keyword args passed on to `plt.plot`.
    '''
    if barycentric is True:
        X = X.dot(_corners)
    plt.plot(X[:, 0], X[:, 1], 'k.', ms=1, **kwargs)
    plt.axis('equal')
    plt.xlim(0, 1)
    plt.ylim(0, 0.75**0.5)
    plt.axis('off')
    if border is True:
        plt.hold(1)
        plt.triplot(_triangle, linewidth=1)
    
def annotate_plot():
    cs = ['#008fd5', '#fc4f30', '#e5ae38']
    animals = ['lions', 'tigers', 'bears']
    for i, (xy, s) in enumerate(zip([(-0.15, 0.075), (0.38, 0.868), (0.98, 0.075)],
                         animals)):
        plt.annotate(s, xy, color = cs[i], size = 20)

def add_legend(ax):
    """Function to add legend to plots"""
    animals = ['lions', 'tigers', 'bears']
    for l, a in zip(ax.get_lines()[:3], animals):
        ax.plot(0, 0, label = a, c = l.get_color())
    ax.legend()
   
def display_probs(d):
    for key, value in d.items():
        print(f'Species: {key:8} Prevalence: {100*value:.2f}%.')
# if __name__ == '__main__':
#     f = plt.figure(figsize=(8, 6))
#     alphas = [[0.999] * 3,
#               [5] * 3,
#               [2, 5, 15]]
#     for (i, alpha) in enumerate(alphas):
#         plt.subplot(2, len(alphas), i + 1)
#         dist = Dirichlet(alpha)
#         draw_pdf_contours(dist)
#         title = r'$\alpha$ = (%.3f, %.3f, %.3f)' % tuple(alpha)
#         plt.title(title, fontdict={'fontsize': 8})
#         plt.subplot(2, len(alphas), i + 1 + len(alphas))
#         plot_points(dist.sample(5000))
#     plt.savefig('dirichlet_plots.png')
#     print 'Wrote plots to "dirichlet_plots.png".'

Specifics¶

Mainly use one verionsof the hyperparmeters where they are all 1

#observations
animals = ['lions','tigers','bears']
c = np.array([3,2,1])
#hyperparams
alphas = np.array([1,1,1])
alpha_list = [np.array([0.1, 0.1, 0.1]), np.array([1, 1, 1]),
                    np.array([5, 5, 5]), np.array([15, 15, 15])]

Expected Value¶

A way to get a sinlge point esimate of the prevalance is to use the expected value of the posterior for $p_{k}$. The expected value of the Dirchilet mult distr is:

$ \begin{align} [p_{i} | X,\alpha] = \frac{c_{i} + \alpha_{i}}{N + \sum_{k} \alpha_{k}} \end{align} $

Using $\alpha = [1, 1, 1]$, and the observation vector $c = [3, 2, 1]$ we get the expected prevalances:

$$p_{lions} = \frac{4}{9} = 44.4\%$$$$p_{tigers} = \frac{3}{9} = 33.3\%$$$$p_{bears} = \frac{2}{9} = 22.2\%$$

Does not extend to limited amount of data. For that we turn to Bayesian modeling, and sample fromt the posterior MCMC. The expected value represent our best guess for what a single trial. We can also oalter the expected value by adjusting the weight of the hyperparm vectro

display_probs(dict(zip(animals, (alphas + c) / (c.sum() + alphas.sum()))))

Species: lions    Prevalence: 44.44%.
Species: tigers   Prevalence: 33.33%.
Species: bears    Prevalence: 22.22%.

display_probs(dict(zip(animals, (4/9, 3/9, 2/9))))

Species: lions    Prevalence: 44.44%.
Species: tigers   Prevalence: 33.33%.
Species: bears    Prevalence: 22.22%.

Let's try a difference hyperparams value and see how expected value changes

values = []
for alpha_new in alpha_list:
    values.append((alpha_new + c) / (c.sum() + alpha_new.sum()))

value_df = pd.DataFrame(values, columns = animals)
value_df['alphas'] = [str(x) for x in alpha_list]
value_df

melted = pd.melt(value_df, id_vars = 'alphas', value_name='prevalence',
        var_name = 'species')

plt.figure(figsize = (8, 6))
sns.barplot(x = 'alphas', y = 'prevalence', hue = 'species', data = melted,
            edgecolor = 'k', linewidth = 1.5);
plt.xticks(size = 14); plt.yticks(size = 14)
plt.title('Expected Value');

Note, that as the weight of the hyperparams (these are the alpha values used from the posterior dirchilet dist) increases, the prevalcne conveges to around 1/3. Intutitvely, as we increase the number of pseudocounts, we express more strognly our inital belief of the equal prevalence scenario. With heavier priors, the data matters less.

On the hand, when we decrease the weight of the hyperparams, the data matters more and the expected value convered on the counts. The ultimate choise of the hyperparms depends on our confidence in out prior

Maximum Aposeteriori Estimation (MAP)¶

MAP is another point estimation that is equal to the mode of the posterior distribution of the posterior distribution: $${\displaystyle \operatorname {arg\,max} \limits _{\mathbf {p} }p(\mathbf {p} \mid \mathbb {X} )={\frac {\alpha _{i}+c_{i}-1}{\sum _{i}(\alpha _{i}+c_{i}-1)}},\qquad \forall i\;\alpha _{i}+c_{i}>1}$$ Note the lapacian smoothing:

In the case of $\alpha = [1, 1, 1]$ , this turns into the frequency of observed in the data

display_probs(dict(zip(animals, (alphas + c - 1) / sum(alphas + c - 1))))

Species: lions    Prevalence: 50.00%.
Species: tigers   Prevalence: 33.33%.
Species: bears    Prevalence: 16.67%.

display_probs(dict(zip(animals, c / c.sum())))

Species: lions    Prevalence: 50.00%.
Species: tigers   Prevalence: 33.33%.
Species: bears    Prevalence: 16.67%.

values = []
for alpha_new in alpha_list:
    values.append((alpha_new + c - 1) / sum(alpha_new + c - 1))

value_df = pd.DataFrame(values, columns = animals)
value_df['alphas'] = [str(x) for x in alpha_list]
value_df

melted = pd.melt(value_df, id_vars = 'alphas', value_name='prevalence',
        var_name = 'species')

plt.figure(figsize = (8, 6))
sns.barplot(x = 'alphas', y = 'prevalence', hue = 'species', data = melted,
            edgecolor = 'k', linewidth = 1.5);
plt.xticks(size = 14); plt.yticks(size = 14)
plt.title('Maximum A Posterior Value');

Same effect occurs! Both expected value and MAP only provide a point estimate with no uncertainty. Given the relatively few observations, there should only be a large amount of uncertaintiy...

Bayesian Model¶

The objective is to find the parameters of the multinomial dist, $p_{k}$ which are the probs of each species given the evidence. Recally, we are using a multinomial as our model, a Dirichilet distributino as the prior, and the specified hyperparam vector: $ \begin{align} (p|X,\alpha) \end{align} $

PyMC3 and MCMC¶

Build model using PyMC3, and then use a variant of Markov Chain Monte Carlo (the No-Uturn Sample specifically) to draw sample from the posterior. With enough samples, the estimate will convered on the true posterior. ALong with single point estimates (such as the mean of the sampled values), MCMC also give us the built in uncertaintiy because we get thousands of possible values from the posterior

with pm.Model() as model:
    # Parameters of the Multinomial are from a Dirichlet
    parameters = pm.Dirichlet('parameters', a=alphas, shape=3)
    # Observed data is from a Multinomial distribution
    observed_data = pm.Multinomial(
        'observed_data', n=6, p=parameters, shape=3, observed=c)

print(parameters)
print(alphas)
print(c)

parameters
[1 1 1]
[3 2 1]

model

Sampling the model¶

Draw 1000 from the posterior in 2 chains. WE use 500 smaples for tuning which are discarded. This meand that for each random variable in the model - the random variables are the parameters - we will have 2000 values drawn from the posterior distribution

with model:
    # Sample from the posterior
    trace = pm.sample(draws=1000, chains=2, tune=500, 
                      discard_tuned_samples=True)

Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 4 jobs)
NUTS: [parameters]
Sampling 2 chains: 100%|██████████| 3000/3000 [00:00<00:00, 3120.03draws/s]

summary = pm.summary(trace)
summary.index = animals
summary

We can see that the mean of the samples is very close to the expected value. However, instead of just getting one number, we get a rang of uncertaintity

# Samples
trace_df = pd.DataFrame(trace['parameters'], columns = animals)
trace_df.head()

trace_df.shape

(2000, 3)

For a single point estimate, we can use the mean of the samples

# For probabilities use samples after burn in
pvals = trace_df.iloc[:, :3].mean(axis = 0)
display_probs(dict(zip(animals, pvals)))

Species: lions    Prevalence: 44.81%.
Species: tigers   Prevalence: 33.17%.
Species: bears    Prevalence: 22.02%.

summary.iloc[:, 3:5]

The means after sample from the posterior appraoch the expected value, but sampling gives us a range of uncertaintity. In the limit, this uncertainity should become more narrow

Diagnostic plots¶

Posterior plot¶

ax = pm.plot_posterior(trace, varnames = ['parameters'], 
                       figsize = (20, 10),markeredgecolor = 'k');

plt.rcParams['font.size'] = 22
for i, a in enumerate(animals):
    ax[i].set_title(a);

/anaconda3/lib/python3.7/site-packages/pymc3/plots/__init__.py:40: UserWarning: Keyword argument `varnames` renamed to `var_names`, and will be removed in pymc3 3.8
  warnings.warn('Keyword argument `{old}` renamed to `{new}`, and will be removed in pymc3 3.8'.format(old=old, new=new))

Traceplot¶

The traceplot shows a kernel density estimate (a smoothed histogram) on the elt and all the samples that were drawn on the right. We collapse, the chains on the plots (combined = True) but in realtiy we drew 2 independent chains

prop_cycle = plt.rcParams['axes.prop_cycle']
cs = [x['color'] for x in list(prop_cycle)]

ax = pm.traceplot(trace, varnames = ['parameters'], figsize = (20, 12), combined = True);
ax[0][0].set_title('Posterior Probability Distribution'); ax[0][1].set_title('Trace Samples');
ax[0][0].set_xlabel('Probability'); ax[0][0].set_ylabel('Density');
ax[0][1].set_xlabel('Sample number');
add_legend(ax[0][0])
add_legend(ax[0][1])

/anaconda3/lib/python3.7/site-packages/pymc3/plots/__init__.py:40: UserWarning: Keyword argument `varnames` renamed to `var_names`, and will be removed in pymc3 3.8
  warnings.warn('Keyword argument `{old}` renamed to `{new}`, and will be removed in pymc3 3.8'.format(old=old, new=new))

MAPP results with PyMC3,¶

PyMC3 has a method for directly finidng the MAP

with model:
    #Find the MAP estimate
    map_ = pm.find_MAP()
    
display_probs(dict(zip(animals,map_['parameters'])))

/anaconda3/lib/python3.7/site-packages/pymc3/tuning/starting.py:61: UserWarning: find_MAP should not be used to initialize the NUTS sampler, simply call pymc3.sample() and it will automatically initialize NUTS in a better way.
  warnings.warn('find_MAP should not be used to initialize the NUTS sampler, simply call pymc3.sample() and it will automatically initialize NUTS in a better way.')
logp = -1.8042, ||grad|| = 1.118: 100%|██████████| 7/7 [00:00<00:00, 1338.51it/s]

Species: lions    Prevalence: 50.00%.
Species: tigers   Prevalence: 33.33%.
Species: bears    Prevalence: 16.67%.

Sample from Posterior¶

WE can now use the posterior to draw samples from the data.

with model:
    samples = pm.sample_ppc(trace, samples = 1000)
    
dict(zip(animals, samples['observed_data'].mean(axis = 0)))

/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:2: DeprecationWarning: sample_ppc() is deprecated.  Please use sample_posterior_predictive()
  
100%|██████████| 1000/1000 [00:00<00:00, 1124.03it/s]

{'lions': 2.682, 'tigers': 1.938, 'bears': 1.38}

Show sampling distributions

sample_df = pd.DataFrame(samples['observed_data'],columns=animals)
plt.figure(figsize = (22, 8))
for i, animal in enumerate(sample_df):
    plt.subplot(1, 3, i+1)
    sample_df[animal].value_counts().sort_index().plot.bar(color = 'm');
    plt.xticks(range(7), range(7), rotation = 0);
    plt.xlabel('Number of Times Seen'); plt.ylabel('Occurences');
    plt.title(f'1000 Samples for {animal}');

Dirichlet Distribution¶

draw_pdf_contours(Dirichlet(6 * alphas))
annotate_plot()

draw_pdf_contours(Dirichlet(6 * pvals))
annotate_plot();

draw_pdf_contours(Dirichlet([0.1, 0.1, 0.1]))
annotate_plot();

Next Observation¶

In order to find what we can expect from the next observation, we draw a single sample of 10000 times form a multinomial distribution. The probaility of seeing each species is proportional to that obtained from the sampling.

#draw from the multinomial posterior
next_obs = np.random.multinomial(n=1,pvals=pvals,size=10000)
# Data manipulation
next_obs = pd.melt(pd.DataFrame(next_obs, columns = ['Lions', 'Tigers', 'Bears'])).\
            groupby('variable')['value'].\
            value_counts(normalize=True).to_frame().\
             rename(columns = {'value': 'total'}).reset_index()
next_obs = next_obs.loc[next_obs['value'] == 1]

# Bar plot
next_obs.set_index('variable')['total'].plot.bar(figsize = (8, 6));
plt.title('Next Observation Likelihood');
plt.ylabel('Likelihood'); plt.xlabel('');

Looks like a lion is next!

next_obs.iloc[:, [0, 2]]

More observations¶

If we have additional observations, more trips to the preserve, we can easily do the sampling taking into accound this data.

c = np.array([[3,2,1],
            [2,3,1],
            [3,2,1],
            [2,3,1]])
c

array([[3, 2, 1],
       [2, 3, 1],
       [3, 2, 1],
       [2, 3, 1]])

c.shape

(4, 3)

with pm.Model() as model:
    # Parameters are a dirichlet distribution
    parameters = pm.Dirichlet('parameters', a=alphas, shape=3)
    # Observed data is a multinomial distribution
    observed_data = pm.Multinomial(
        'observed_data', n=6, p=parameters, shape=3, observed=c)    
    
    trace = pm.sample(draws=1000, chains=2, tune=500, discard_tuned_samples=True)

Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 4 jobs)
NUTS: [parameters]
Sampling 2 chains: 100%|██████████| 3000/3000 [00:01<00:00, 2794.31draws/s]

summary = pm.summary(trace)
summary.index = animals
summary

The uncertaintity of the prevalence of bears has decreased and the prevalence of the lions and tigers is nearly identical as expected from the data. As we gather more data, we can incorporate it into the model to get more accurate estimates. A larger amount of data will mean less uncertainty in the sampling as well.

ax = pm.plot_posterior(trace, varnames = ['parameters'], 
                       figsize = (20, 10), markeredgecolor= 'k')

plt.rcParams['font.size'] = 22
for i, a in enumerate(animals):
    ax[i].set_title(a)
    
plt.suptitle('Posterior with More Observations', y = 1.05)

/anaconda3/lib/python3.7/site-packages/pymc3/plots/__init__.py:40: UserWarning: Keyword argument `varnames` renamed to `var_names`, and will be removed in pymc3 3.8
  warnings.warn('Keyword argument `{old}` renamed to `{new}`, and will be removed in pymc3 3.8'.format(old=old, new=new))

Text(0.5, 1.05, 'Posterior with More Observations')

Increasing/Decreasing Confidence in Hyperparmeters¶

WE can also increase or decrease our confidence in out inital belief (our priors) by changing the weighting of the hyperparameter vecotor. Initially we use all 1's but we can reduce or increase this!

# observations
animals = ['lions', 'tigers', 'bears']
c = np.array([3, 2, 1])

def sample_with_priors(alphas):
    """Sample with specified hyperparameters"""
    with pm.Model() as model:
        # Probabilities for each species
        parameters = pm.Dirichlet('parameters', a=alphas, shape=3)
        # Observed data is a multinomial distribution with 6 trials
        observed_data = pm.Multinomial(
            'observed_data', n=6, p=parameters, shape=3, observed=c)    

        trace = pm.sample(draws=1000, chains=2, tune=500, discard_tuned_samples=True)
    return trace

trace_dict = {}
for alpha_array in [np.array([0.1, 0.1, 0.1]), np.array([1, 1, 1]),
                    np.array([5, 5, 5]), np.array([15, 15, 15])]:
    trace_dict[str(alpha_array[0])] = sample_with_priors(alpha_array)

Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 4 jobs)
NUTS: [parameters]
Sampling 2 chains: 100%|██████████| 3000/3000 [00:01<00:00, 2821.89draws/s]
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 4 jobs)
NUTS: [parameters]
Sampling 2 chains: 100%|██████████| 3000/3000 [00:00<00:00, 3054.93draws/s]
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 4 jobs)
NUTS: [parameters]
Sampling 2 chains: 100%|██████████| 3000/3000 [00:01<00:00, 2732.81draws/s]
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 4 jobs)
NUTS: [parameters]
Sampling 2 chains: 100%|██████████| 3000/3000 [00:01<00:00, 2705.66draws/s]

plt.figure(figsize = (20, 24))

for ii, (alpha, trace) in enumerate(trace_dict.items()):
    plt.subplot(4, 1, ii + 1)
    array = trace['parameters']
    for jj, animal in enumerate(animals):
        sns.kdeplot(array[:, jj], label = f'{animal}')
    plt.legend();
    plt.xlabel('Probability'); plt.ylabel('Density')
    plt.title(f'Alpha = {alpha}');
    plt.xlim((0, 1));
    
plt.tight_layout()
plt.show()

As the weight increases, the estimated prevalence tends to collapse to 1/3 for each species. These results are inuitive; We have a greater belief that the prevalences are equivalent, then the limited data we collected should not change our beliefs greatly

Comparison of Beliefs¶¶

Next we compare the hyperparameters $\alpha = [0.1, 0.1, 0.1]$ with $\alpha = [15, 15, 15]$

prior = '0.1'
trace = trace_dict[prior]

ax = pm.plot_posterior(trace, varnames = ['parameters'], 
                       figsize = (20, 10), markeredgecolor = 'k')

plt.rcParams['font.size'] = 22
for i, a in enumerate(animals):
    ax[i].set_title(a)
    
plt.suptitle(f'{prior} Prior', y = 1.05)

/anaconda3/lib/python3.7/site-packages/pymc3/plots/__init__.py:40: UserWarning: Keyword argument `varnames` renamed to `var_names`, and will be removed in pymc3 3.8
  warnings.warn('Keyword argument `{old}` renamed to `{new}`, and will be removed in pymc3 3.8'.format(old=old, new=new))

Text(0.5, 1.05, '0.1 Prior')

summary = pm.summary(trace)
summary.index = animals
summary

prior = '15'
trace = trace_dict[prior]

ax = pm.plot_posterior(trace, varnames = ['parameters'], 
                       figsize = (20, 10), markeredgecolor = 'k')

plt.rcParams['font.size'] = 22
for i, a in enumerate(animals):
    ax[i].set_title(a)
    
plt.suptitle(f'{prior} Prior', y = 1.05)

/anaconda3/lib/python3.7/site-packages/pymc3/plots/__init__.py:40: UserWarning: Keyword argument `varnames` renamed to `var_names`, and will be removed in pymc3 3.8
  warnings.warn('Keyword argument `{old}` renamed to `{new}`, and will be removed in pymc3 3.8'.format(old=old, new=new))

Text(0.5, 1.05, '15 Prior')

summary = pm.summary(trace)
summary.index = animals
summary

Overall, the choise of the hyperpriors depences on our confidence in out initial belief. If we have good reason to believe all the animals are represneted at the same frequency, then we should increase the weight of the priors!

Conclusions¶

prior = '1'
trace = trace_dict[prior]
summary = pm.summary(trace)
summary.index = animals
summary

ax = pm.plot_posterior(trace, varnames = ['parameters'], 
                       figsize = (20, 10), markeredgecolor = 'k');

plt.rcParams['font.size'] = 22
for i, a in enumerate(animals):
    ax[i].set_title(a)
    
plt.suptitle('Posterior Plot', y = 1.05)

/anaconda3/lib/python3.7/site-packages/pymc3/plots/__init__.py:40: UserWarning: Keyword argument `varnames` renamed to `var_names`, and will be removed in pymc3 3.8
  warnings.warn('Keyword argument `{old}` renamed to `{new}`, and will be removed in pymc3 3.8'.format(old=old, new=new))

Text(0.5, 1.05, 'Posterior Plot')

Estimated Prealences¶

Lions: 44.5% (16.9% - 75.8%)
Tigers: 32.7% (6.7% - 60.5%)
Bears: 22.7% (1.7% - 50.0%)

Proba next obs is bear is:¶

22.9 bases on the sampling

The best parts about Bayesian Inference are the incorporation of priors and the uncertainty inherent in the methods. With the scant evidence, we can provide estimates, but only with a large amount of uncertainty. More visits to the wildlife preserve would certainly help to clear up the matter!

In this notebook, we were able to explore the basics of Bayesian modeling and probabilistic programming. Although this was only a simple problem, we can use the concepts here to build more complex models.

	lions	tigers	bears	alphas
0	0.492063	0.333333	0.174603	[0.1 0.1 0.1]
1	0.444444	0.333333	0.222222	[1 1 1]
2	0.380952	0.333333	0.285714	[5 5 5]
3	0.352941	0.333333	0.313725	[15 15 15]

	lions	tigers	bears	alphas
0	0.636364	0.333333	0.030303	[0.1 0.1 0.1]
1	0.500000	0.333333	0.166667	[1 1 1]
2	0.388889	0.333333	0.277778	[5 5 5]
3	0.354167	0.333333	0.312500	[15 15 15]

	mean	sd	mc_error	hpd_2.5	hpd_97.5	n_eff	Rhat
lions	0.448075	0.162908	0.003424	0.137824	0.749909	1847.877113	0.999899
tigers	0.331740	0.152746	0.003644	0.054202	0.626152	1889.830372	0.999558
bears	0.220185	0.132979	0.003011	0.018361	0.488707	1825.819442	0.999748

	lions	tigers	bears
0	0.212186	0.384596	0.403217
1	0.255443	0.289665	0.454892
2	0.316250	0.400231	0.283519
3	0.341046	0.421098	0.237856
4	0.512615	0.112162	0.375223

	hpd_2.5	hpd_97.5
lions	0.137824	0.749909
tigers	0.054202	0.626152
bears	0.018361	0.488707

	mean	sd	mc_error	hpd_2.5	hpd_97.5	n_eff	Rhat
lions	0.406921	0.091226	0.002188	0.226976	0.581177	1809.102595	0.999936
tigers	0.408330	0.092009	0.002176	0.232542	0.588023	2021.664384	0.999609
bears	0.184749	0.070314	0.001662	0.064785	0.321719	1611.667404	0.999680

	mean	sd	mc_error	hpd_2.5	hpd_97.5	n_eff	Rhat
lions	0.484894	0.176648	0.004198	0.169706	0.818812	1802.463685	0.999508
tigers	0.339925	0.174586	0.004291	0.028798	0.655781	1796.585232	1.000536
bears	0.175181	0.145737	0.003541	0.000211	0.472574	1519.119630	1.000730

	mean	sd	mc_error	hpd_2.5	hpd_97.5	n_eff	Rhat
lions	0.352274	0.065472	0.001486	0.223575	0.477980	2065.347872	1.000749
tigers	0.334161	0.065777	0.001478	0.204646	0.459013	1860.216769	0.999806
bears	0.313564	0.062348	0.001444	0.194092	0.434725	2020.395211	0.999847

	mean	sd	mc_error	hpd_2.5	hpd_97.5	n_eff	Rhat
lions	0.442426	0.155742	0.003330	0.155825	0.746230	2173.840246	1.000089
tigers	0.339080	0.151136	0.003457	0.070684	0.634002	1995.054732	0.999538
bears	0.218494	0.131369	0.003193	0.012217	0.473441	1714.450285	1.000788