
Continuous Distributions

In the above examples, we discussed discrete distributions. We now move on to continuous distributions and some of the most common ones. Let's begin by understanding what a continuous distribution is.

What is a Continuous Distribution?

A continuous distribution describes the probability of a random variable that can take any value in an interval, so it has uncountably many possible values. Examples of such variables are height and weight measurements. In this notebook we cover the following distributions:

Uniform Distribution
Gaussian Distribution
Exponential Distribution
Beta Distribution - a family of distributions

1. Uniform Distribution

The uniform distribution assigns equal probability density to every point in an interval of the sample space. For example, picking a real number at random between 10 and 20, with every number equally likely, can be modeled using a uniform distribution.

Mathematically:

$$ X \sim \text{Uniform}(a, b) $$

where:
$a$: lower bound for the interval space
$b$: upper bound for the interval space

Probability Density Function

We can compute the probability density of a uniform distribution using the functional form below:

$$ p(x) = \begin{cases} 0 & \text{ for } x < a \\ \frac {1}{b-a} & \text{for } a \leq x \leq b \\ 0 & \text{ for } x> b \end{cases} $$

Below is an example of how to compute the probability density at $x = 15$ with lower and upper bounds at $10$ and $20$. We compute the density both analytically and with SciPy's built-in method. SciPy parameterizes the interval as loc $= a$ and scale $= b - a$.

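from scipy.stats import uniform

# loc is the lower bound a; scale is the width b - a, so the support is [10, 20]
uniform_dist = uniform(loc=10, scale=10)
# density from SciPy and from the formula 1 / (b - a)
uniform_dist.pdf(x=15), 1 / (20 - 10)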

(0.1, 0.1)
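
Under a continuous distribution a single point has zero probability, so probabilities are obtained for intervals by integrating the density, which is what the cumulative distribution function (CDF) does. The following is a minimal sketch that reuses the $[10, 20]$ bounds from above; the sub-interval $[12, 15]$ is an arbitrary choice for illustration.

```python
from scipy.stats import uniform

# Uniform on [10, 20]: SciPy uses loc = a and scale = b - a.
uniform_dist = uniform(loc=10, scale=10)

# P(12 <= X <= 15) = CDF(15) - CDF(12); for a uniform distribution this
# should match the ratio of interval widths, (15 - 12) / (20 - 10).
interval_prob = uniform_dist.cdf(15) - uniform_dist.cdf(12)
analytical_prob = (15 - 12) / (20 - 10)
print(interval_prob, analytical_prob)
```

Both values should agree at 0.3 up to floating-point error.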

2. Gaussian Distribution

The Gaussian distribution is perhaps the most well-known and widely used distribution. The Gaussian, or normal, distribution has some nice properties that allow us to model many of the observations we encounter in naturally occurring data.

The normal distribution is mathematically represented as:

$$ X \sim \mathcal{N}(\mu, \sigma^2) $$

where:
$\mu$: is the mean of the distribution
$\sigma^2 $: is the variance of the distribution

The normal distribution is widely used because it has some very useful characteristics. The following are especially useful:

Normal distributions are symmetric around their mean
The mean, median, and mode of a normal distribution are equal
Approximately 68% of the area of a normal distribution is within one standard deviation of the mean
Approximately 95% of the area of a normal distribution is within two standard deviations of the mean
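
The last two properties can be checked numerically. The sketch below assumes a standard normal distribution (mean 0, standard deviation 1); the fractions are the same for any other mean and standard deviation, because they depend only on the distance from the mean measured in standard deviations.

```python
from scipy.stats import norm

# Area within k standard deviations of the mean is CDF(k) - CDF(-k)
# for the standard normal distribution.
within_one_sd = norm.cdf(1) - norm.cdf(-1)
within_two_sd = norm.cdf(2) - norm.cdf(-2)
print(within_one_sd, within_two_sd)
```

The two areas come out at roughly 0.6827 and 0.9545.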

Probability Density Function

The probability density function is given by the formula below:

$$ f(x) = P(x\ |\ \mu, \sigma^2) = \frac {1}{\sigma \sqrt{2\pi}} e^{\frac {-(x-\mu)^2}{2\sigma^2}} $$

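The probability density function gives the relative likelihood of values under the normal distribution defined by $\mu$ and $\sigma^2$. Because the distribution is continuous, the probability of observing a value in a given range is the area under the density curve over that range, not the density at a single point.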

Example:
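Given a normal distribution centered at 5 with a standard deviation of 1, what is the probability that a random draw is less than 3? That probability is the area under the density curve to the left of 3, so it is given by the CDF rather than by the density at any single point.

The density itself is evaluated pointwise. Below is an example of evaluating the density at the value 2 for the distribution: $$X \sim \mathcal{N}(\mu = 5, \sigma^2 = 1)$$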

from scipy.stats import norm
norm.pdf(x=2, loc=5, scale=1)

0.0044318484119380075
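
The question of drawing a number less than 3 concerns a probability rather than a density, so it calls for the CDF. The following is a minimal sketch using the same parameters, $\mu = 5$ and $\sigma = 1$; since 3 lies two standard deviations below the mean, the result should be small.

```python
from scipy.stats import norm

# P(X < 3) for X ~ N(mu=5, sigma=1): area under the density left of 3.
prob_less_than_3 = norm.cdf(x=3, loc=5, scale=1)
print(prob_less_than_3)
```

The result is approximately 0.0228, which is consistent with about 95% of the area lying within two standard deviations of the mean and the remaining 5% being split between the two tails.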

3. Exponential Distribution

The exponential distribution is closely related to the Poisson distribution from the discrete case. Like the Poisson distribution, it takes one parameter $\lambda$, the rate of events. Whereas the Poisson distribution counts events in a fixed interval, the exponential distribution models the time until the next event happens.

The exponential distribution is commonly used in survival analysis, where the time until an event such as a component failure is modelled. We represent the exponential distribution as:

$$ X \sim Exp(\lambda)$$

where:
$\lambda > 0$
$\lambda$: is the rate of the distribution

The exponential distribution, like the geometric distribution, is memoryless: $P(X > s + t \mid X > s) = P(X > t)$ for all $s, t \geq 0$.
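
Memorylessness means that having already waited $s$ units of time does not change the distribution of the remaining wait. The sketch below checks this numerically using the survival function $P(X > t) = e^{-\lambda t}$; the helper name and the values of the rate, $s$, and $t$ are illustrative assumptions, not part of the original notebook.

```python
import math

def survival(t, rate):
    """P(X > t) for an exponential distribution with the given rate."""
    return math.exp(-rate * t)

rate, s, t = 0.25, 3.0, 2.0  # illustrative values only

# P(X > s + t | X > s) = P(X > s + t) / P(X > s)
conditional = survival(s + t, rate) / survival(s, rate)
# Memorylessness says this equals the unconditional P(X > t).
unconditional = survival(t, rate)
print(conditional, unconditional)
```

The two printed values should agree up to floating-point error.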

Probability Density Function

The pdf of the exponential distribution is given by the formula:

$$ p(x) = \begin{cases} \lambda e^{-\lambda x} & \text{ for } x > 0 \\ 0 & \text{ for } x \leq 0 \end{cases} $$

The example below motivates the probability density function.

Example:
Suppose that $X$ is exponentially distributed and models the amount of time a cashier spends with a customer, and that the average time spent is 4 minutes. What is the density at 2 minutes? A mean of 4 minutes corresponds to a rate of $\lambda = 1/4$ per minute. SciPy's scale parameter is $1/\lambda$, the mean, so scale is 4.

from scipy.stats import expon
# scale = 1 / lambda = mean time = 4 minutes
round(expon.pdf(x=2, scale=4), 4)

0.1516
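
Since the density at a point is not itself a probability, a related question with a probabilistic answer is how likely the cashier is to spend at most 2 minutes with a customer. The following is a minimal sketch, assuming the same mean service time of 4 minutes, so that the rate is $\lambda = 1/4$ and SciPy's scale (which is $1/\lambda$) is 4.

```python
from scipy.stats import expon

# scale = 1 / lambda = mean time in minutes (assumed to be 4, as in the example).
mean_minutes = 4

# P(X <= 2) = 1 - exp(-lambda * 2) = 1 - exp(-0.5)
prob_at_most_2 = expon.cdf(x=2, scale=mean_minutes)
print(prob_at_most_2)
```

The result is approximately 0.3935.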

4. Beta Distribution

The beta distribution is another commonly used probability distribution, particularly for modelling uncertainty about the probability of success of an event. It is the conjugate prior of the binomial distribution and takes two parameters, $\alpha$ and $\beta$, which are positive real numbers.

An example of the use of the beta distribution is the multi-armed bandit problem, where we estimate each arm's probability of a win by updating a beta distribution with the success and failure counts observed over the trials.
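
The update described above can be sketched for a single arm. Because the beta distribution is conjugate to the binomial, observing successes and failures simply adds them to $\alpha$ and $\beta$. The uniform $Beta(1, 1)$ prior and the counts below are made-up values for illustration only.

```python
from scipy.stats import beta

# Hypothetical setup for one arm: a uniform Beta(1, 1) prior and made-up counts.
prior_alpha, prior_beta = 1, 1
successes, failures = 7, 3

# Conjugate update: add successes to alpha and failures to beta.
post_alpha = prior_alpha + successes
post_beta = prior_beta + failures
posterior = beta(post_alpha, post_beta)

# The posterior mean alpha / (alpha + beta) estimates the win probability.
posterior_mean = posterior.mean()
print(post_alpha, post_beta, posterior_mean)
```

With these counts the posterior is $Beta(8, 4)$, whose mean is $8/12 \approx 0.667$.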

The mathematical representation of the beta distribution is given by:

$$ X \sim Beta( \alpha, \beta ) $$

Probability Density Function

The probability density function is given in the following form:

$$ p(x) = \begin{cases} \frac { x^{\alpha - 1}(1 - x)^{\beta - 1}} {B(\alpha, \beta)} & \text{ for } 0 < x < 1 \\ 0 & \text{ otherwise} \end{cases} $$

$$ B(\alpha, \beta) = \frac { \Gamma (\alpha) \Gamma (\beta) } {\Gamma (\alpha + \beta)} = \frac { (\alpha-1)! (\beta-1)! }{ (\alpha+\beta -1)! } \quad \text{(last equality for positive integer } \alpha, \beta \text{)} $$

where:
$\alpha:$ positive real number parameter
$\beta:$ positive real number parameter
$B(\alpha, \beta)$: is the beta function with parameters $\alpha$ and $\beta$
$\Gamma$: is the gamma function, which satisfies $\Gamma(n) = (n-1)!$ for positive integers $n$
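
As with the other distributions, the density can be computed both from the formula and with SciPy. The sketch below uses the illustrative values $\alpha = 2$, $\beta = 3$, and $x = 0.5$, which are assumptions chosen so the arithmetic is easy to follow: $B(2, 3) = \frac{1! \, 2!}{4!} = \frac{1}{12}$, so the density is $0.5 \cdot 0.5^2 \cdot 12 = 1.5$.

```python
from math import gamma
from scipy.stats import beta

a, b, x = 2, 3, 0.5  # illustrative parameter values

# Beta function from its gamma-function definition.
beta_fn = gamma(a) * gamma(b) / gamma(a + b)

# Density from the formula and from SciPy's built-in method.
analytical_pdf = x ** (a - 1) * (1 - x) ** (b - 1) / beta_fn
scipy_pdf = beta.pdf(x, a, b)
print(scipy_pdf, analytical_pdf)
```

A density value above 1 is not an error here: densities are not probabilities and only need to integrate to 1 over the interval $(0, 1)$.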