Bernoulli Distribution
The Bernoulli distribution models the probability of a random variable with a binary sample space. That is, it models a single trial with exactly two possible outcomes, conventionally labeled success (1) and failure (0). It is a special case of the binomial distribution: it is the binomial with the number of trials n = 1.
Mathematically:
$$ X \sim \text{Bernoulli}(p) $$
where:
$X$ is a random variable
$\text{Bernoulli}(p)$: Bernoulli distributed with probability of success $p$
Bernoulli Random Variables
We can generate Bernoulli random variables using the rvs method of the bernoulli class, passing the probability of success p. The generated values will contain 1s in a proportion roughly equal to p.
from scipy.stats import bernoulli
x_var = bernoulli.rvs(p=.3, size=100)
x_var
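As a quick check, we can count the generated outcomes and confirm that the share of 1s lands near p. This is a sketch; the larger sample size and the random_state value below are arbitrary choices for reproducibility:

```python
import numpy as np
from scipy.stats import bernoulli

# Draw 10,000 Bernoulli samples with p = .3 (seed fixed for reproducibility)
samples = bernoulli.rvs(p=.3, size=10_000, random_state=42)

# Count how often each outcome appears
values, counts = np.unique(samples, return_counts=True)
proportion_of_ones = counts[values == 1][0] / samples.size

# The sample proportion of successes should be close to p = .3
print(proportion_of_ones)
```

With 10,000 trials the sampling error of the proportion is around $\sqrt{p(1-p)/n} \approx 0.005$, so the printed value sits very close to 0.3.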
Visualize the distribution plot
We can visualize the random variable generated above with matplotlib's histogram plot, which returns the frequency counts. Notice that the proportion of successes (1s) is roughly equal to the probability p.
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(8,5))
_ = plt.hist(x_var, bins=2)  # two bins: one per outcome (0 and 1)
plt.xticks([0,1])
plt.xlabel('Bernoulli Events')
plt.ylabel('Frequency')
plt.title('Bernoulli Plot')

Probability Mass Function
Like any distribution, the Bernoulli has a probability mass function that defines the probability of every event
in the sample space.
$$pmf = p^x(1-p)^{(1-x)}$$
where:
$x: 0, 1$
$p:$ probability of success.
More formally, the pmf of a Bernoulli random variable, say a coin flip, is defined as: $$ p(x) = \begin{cases} p^x(1-p)^{(1-x)} & \text{if } x = 0,1 \\ 0 & \text{otherwise}. \end{cases} $$
Probability Mass Function in Python
The probability mass function returns the probability of each individual outcome. In this case, we know that we have only two possible outcomes in the sample space, so the pmf at $0$ is $1-p$ and the pmf at $1$ is $p$.
Notice that the arguments used are $0$ and $1$, the two points of the Bernoulli support.
bernoulli_dist = bernoulli(p=.3)
p_success, p_failure = bernoulli_dist.pmf(1), bernoulli_dist.pmf(0)
p_success, p_failure
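As a sanity check, we can evaluate the pmf formula $p^x(1-p)^{(1-x)}$ directly and compare it with scipy's result:

```python
from scipy.stats import bernoulli

p = .3
dist = bernoulli(p=p)

# Evaluate the closed-form pmf p^x (1-p)^(1-x) at x = 0 and x = 1
manual_pmf = {x: p**x * (1 - p)**(1 - x) for x in (0, 1)}

# scipy agrees with the formula: pmf(1) ≈ 0.3, pmf(0) ≈ 0.7
print(manual_pmf[1], dist.pmf(1))
print(manual_pmf[0], dist.pmf(0))
```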
Bernoulli Cumulative Distribution Function
The cumulative distribution function returns the probability that the random variable is less than or equal to a given value. In this case, we know that we have only two possible outcomes in the sample space, therefore the CDF at $0$ is $P(X \le 0) = 1-p$, while the CDF at $1$ is $P(X \le 1) = (1-p) + p = 1$.
Notice that the arguments used are $0$ and $1$ since it is a Bernoulli distribution.
bernoulli_dist.cdf(0), bernoulli_dist.cdf(1)
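We can verify these cumulative values against the formulas above:

```python
from scipy.stats import bernoulli

p = .3
dist = bernoulli(p=p)

# F(0) = P(X <= 0) = 1 - p, and F(1) = P(X <= 1) = (1 - p) + p = 1
cdf_at_0 = float(dist.cdf(0))  # ≈ 0.7
cdf_at_1 = float(dist.cdf(1))  # 1.0
print(cdf_at_0, cdf_at_1)
```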
Expected Value
The expected value of the Bernoulli distribution is the probability of success parameter $p$. Mathematically: $$ \mathbb{E}(X) = p $$ As we have seen before, we can compute the expected value with the mean method.
bernoulli_dist.mean()
We defined a Bernoulli random variable above, x_var, with probability p = .3. The sample mean of that random variable should be close to the parameter p. As we see below, it is very close.
x_var.mean()
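The expected value can also be computed directly from its definition, as the sum of each outcome weighted by its probability:

```python
from scipy.stats import bernoulli

p = .3
dist = bernoulli(p=p)

# E[X] = sum over the support of x * P(X = x) = 0*(1-p) + 1*p = p
expected = sum(x * dist.pmf(x) for x in (0, 1))
print(expected, dist.mean())  # both ≈ 0.3
```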
Variance
The variance of the bernoulli distribution is mathematically expressed as:
$$ Var(X) = p(1-p)$$
Given the probability of success, we can compute the variance of the distribution. Let's suppose $X \sim \text{Bernoulli}(p)$ where $p = .1$. Then the variance can be computed analytically and with the bernoulli class as follows:
var =.1*(1 - .1) # Computing analytically
var, bernoulli.var(p=.1) # Using the bernoulli class
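The same variance also falls out of the identity $Var(X) = \mathbb{E}(X^2) - \mathbb{E}(X)^2$, since $X^2 = X$ for a 0/1 variable. A quick sketch:

```python
from scipy.stats import bernoulli

p = .1
dist = bernoulli(p=p)

# For a 0/1 variable X^2 = X, so E[X^2] = p and Var(X) = p - p^2 = p(1-p)
e_x = sum(x * dist.pmf(x) for x in (0, 1))
e_x2 = sum(x**2 * dist.pmf(x) for x in (0, 1))
print(e_x2 - e_x**2, dist.var())  # both ≈ 0.09
```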
As seen above, the analytical value and the value returned by scipy are equal.
Standard Deviation
The standard deviation is simply the square root of the variance. For the Bernoulli distribution, it is computed as: $$ \sigma = \sqrt{p(1-p)} $$ Computing this in Python assuming that $p = .4$:
import numpy as np
bernoulli_std = np.sqrt(.4*(1 - .4))
bernoulli.std(.4), bernoulli_std