Chapter 3 Distributions for the response variable \(y\)

3.1 Exponential family of distributions

Definition 3.1 (Probability distributions) The probability density function (p.d.f.) \(p_y\) of the random variable \(y\) has the following properties:

  • \(\forall y\in \mathbb{Y}\), \(p_{y}(y) \geq 0\) (the function \(p_y\) is non-negative for all possible outcomes)

  • the function \(p_{y}\) sums or integrates to 1 over the space of all possible outcomes:

    • when \(y\) is a continuous random variable: \[\int_{\mathbb{Y}} p_{y}(y) \ dy=1\]

    • when \(y\) is a discrete random variable: \[\sum_{y\in \mathbb{Y}} p_{y}(y) =1\]

where \(\mathbb{Y}\) is the space of all possible outcomes.
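
These two properties can be checked numerically. The sketch below is a minimal illustration in Python, assuming an arbitrary density \(p_y(y)=2\exp(-2y)\) on \(\mathbb{Y}=[0,+\infty)\); any valid p.d.f. could be used instead.

```python
import numpy as np
from scipy.integrate import quad

# Illustrative density p(y) = 2 * exp(-2 * y) on Y = [0, +infinity);
# this choice is an assumption made purely for the sketch.
def p(y):
    return 2.0 * np.exp(-2.0 * y)

# Property 1: p(y) >= 0 for every y in Y (spot-checked on a grid).
grid = np.linspace(0.0, 50.0, 10001)
assert np.all(p(grid) >= 0.0)

# Property 2: p integrates to 1 over Y.
total, _ = quad(p, 0.0, np.inf)
print(total)  # approximately 1.0
```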

Only distributions from the exponential family will be considered for the response variable \(y\).

Definition 3.2 (Exponential family of distributions) A distribution belongs to the exponential family if its p.d.f. can be written as: \[p_{y|\theta}(y|\theta)=\exp \left\lbrack a(y) b(\theta)+c(\theta)+d(y)\right\rbrack\] where \(a,b,c,d\) are known functions. If \(a(y)=y\), then the distribution is said to be in canonical form.
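
As a quick illustration (anticipating the exponential distribution of Definition 3.9), a density of the form \(\theta\exp(-\theta y)\) can be rearranged into the required form: \[p_{y|\theta}(y|\theta)=\theta\ \exp(-\theta\ y)=\exp \left\lbrack -\theta\ y+\log \theta \right\rbrack\] so that \(a(y)=y\), \(b(\theta)=-\theta\), \(c(\theta)=\log \theta\) and \(d(y)=0\): this distribution is a member of the exponential family, in canonical form.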

Definition 3.3 (Expectation) Consider a random variable \(y\) with p.d.f. \(p_y\) on the space of outcomes \(\mathbb{Y}\). The expectation of \(y\) is computed by:

  • when \(y\) is a continuous random variable: \[\mathbb{E}[y]=\int_{\mathbb{Y}} y\ p_{y}(y) \ dy\]

  • when \(y\) is a discrete random variable: \[\mathbb{E}[y]= \sum_{y\in \mathbb{Y}} y\ p_{y}(y)\]
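
A minimal numerical sketch of these two formulas is given below, assuming an arbitrary continuous density \(2\exp(-2y)\) on \([0,+\infty)\) and an arbitrary discrete distribution on \(\lbrace 0,1\rbrace\); the values are illustrative only.

```python
import numpy as np
from scipy.integrate import quad

# Continuous case: E[y] = integral over Y of y * p(y) dy,
# using the illustrative density p(y) = 2 * exp(-2 * y) on [0, +infinity).
p_cont = lambda y: 2.0 * np.exp(-2.0 * y)
e_cont, _ = quad(lambda y: y * p_cont(y), 0.0, np.inf)
print(e_cont)  # approximately 0.5

# Discrete case: E[y] = sum over Y of y * p(y),
# using an illustrative distribution on {0, 1} with p(1) = 0.3.
p_disc = {0: 0.7, 1: 0.3}
e_disc = sum(y * prob for y, prob in p_disc.items())
print(e_disc)  # 0.3
```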

3.2 Some members of the exponential family

Several probability density functions (p.d.f.) are introduced in this section.

Definition 3.4 (Gaussian or Normal distribution) The normal distribution associated with a random variable \(y\in \mathbb{R}\) is defined as: \[p_{y|\theta,\sigma}(y|\theta,\sigma)=\frac{1}{\sqrt{2\pi}\sigma} \exp\left(\frac{-(y-\theta)^2}{2\sigma^2}\right)\] \(\theta \in \mathbb{R}\) and \(\sigma \in \mathbb{R}^{+*}\) are respectively the mean and the standard deviation. The expectation is \(\mathbb{E}[y]=\theta\).
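
As a sketch of the link with Definition 3.2, and treating \(\sigma\) as a known constant, the normal density can be rearranged as \[p_{y|\theta,\sigma}(y|\theta,\sigma)=\exp \left\lbrack y\ \frac{\theta}{\sigma^{2}}-\frac{\theta^{2}}{2\sigma^{2}}-\frac{y^{2}}{2\sigma^{2}}-\log\left(\sqrt{2\pi}\ \sigma\right)\right\rbrack\] so that \(a(y)=y\), \(b(\theta)=\theta/\sigma^{2}\), \(c(\theta)=-\theta^{2}/(2\sigma^{2})-\log(\sqrt{2\pi}\ \sigma)\) and \(d(y)=-y^{2}/(2\sigma^{2})\): the normal distribution with known \(\sigma\) is a member of the exponential family, in canonical form.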

Definition 3.5 (Poisson distribution) The Poisson distribution expresses the probability of a given number of events \(y\) occurring in a fixed interval of time and/or space (and/or for a fixed total population size): \[p_{y|\theta}(y|\theta)=\frac{\theta^y\exp(-\theta)}{y!} , \quad y\in \mathbb{N},\ \theta\in \mathbb{R}^{+*}\] The expectation \(\mathbb{E}[y]=\theta\) represents the average number of events.

As an example, a response \(y\) following the Poisson distribution could model the number of people in line in front of you at the grocery store.
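
The Poisson properties stated above can also be checked numerically. The snippet below is a small sketch using scipy.stats.poisson, assuming the illustrative value \(\theta=3.5\).

```python
from scipy.stats import poisson

theta = 3.5          # illustrative value of the parameter
dist = poisson(theta)

# The probabilities sum to (approximately) 1 over the outcomes.
print(sum(dist.pmf(y) for y in range(200)))      # approximately 1.0

# The expectation equals theta.
print(dist.mean())                               # 3.5
print(sum(y * dist.pmf(y) for y in range(200)))  # approximately 3.5
```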

Definition 3.6 (Binary variable) A binary variable \(y\) has only two possible outcomes (\(\mathbb{Y}=\lbrace 0,1\rbrace\)): \[y= \left\lbrace \begin{array}{ll} 1 & \text{if the outcome is a } success\\ 0& \text{if the outcome is a }failure\\ \end{array}\right.\] The notions of success and failure are user defined.

Definition 3.7 (Bernoulli distribution) Let \(y\) be a binary variable and set \(p_{y|\theta}(y=1|\theta)=\theta\); then \(p_{y|\theta}(y=0|\theta)=1-\theta\) and, more generally, \[p_{y|\theta}(y|\theta)=\theta^{y} \ (1-\theta)^{1-y}, \quad y\in \lbrace0,1\rbrace,\ \theta\in [0;1]\] is the Bernoulli distribution \(\mathrm{Bernoulli}(\theta)\), with \(\mathbb{E}[y]=\theta\).
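
The expectation stated above follows directly from the discrete case of Definition 3.3: \[\mathbb{E}[y]=\sum_{y\in \lbrace 0,1\rbrace} y\ p_{y|\theta}(y|\theta)=0\times (1-\theta)+1\times \theta=\theta\]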

A standard example of the Bernoulli distribution is when the response \(y\) is the outcome of a single toss of a coin, where heads is encoded as \(y=1\) and tails as \(y=0\).

Definition 3.8 (Binomial distribution) Consider a response \(y\) that is the number of successes in \(n\) trials (so the space of outcomes is \(\mathbb{Y}=\lbrace 0,1,\cdots,n\rbrace\)), and a proportion \(\theta\) that is a real number between 0 and 1. The Binomial distribution is defined as: \[p_{y|\theta}(y|\theta)=\frac{n!}{(n-y)! y!}\ \theta^{y} (1-\theta)^{n-y}\] The expectation is \(\mathbb{E}[y]=n\theta\). The Bernoulli distribution corresponds to the Binomial distribution with the number of trials \(n\) equal to 1.

The Binomial distribution can be used to model the number \(y\) of heads obtained when tossing a coin \(n\) times. Another example is when \(y\) is the number of students passing an exam amongst \(n\) students taking that exam.
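
As a quick numerical sketch of Definition 3.8, the snippet below uses scipy.stats.binom with the illustrative values \(n=10\) and \(\theta=0.4\); it also checks that \(n=1\) recovers the Bernoulli distribution.

```python
from scipy.stats import binom

n, theta = 10, 0.4   # illustrative values
dist = binom(n, theta)

# The probabilities sum to 1 over Y = {0, 1, ..., n}.
print(sum(dist.pmf(y) for y in range(n + 1)))  # 1.0 (up to rounding)

# The expectation is n * theta.
print(dist.mean(), n * theta)                  # 4.0 4.0

# With n = 1, the Binomial distribution reduces to the Bernoulli distribution.
print(binom(1, theta).pmf(1))                  # 0.4, i.e. theta
```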

Definition 3.9 (Exponential distribution) The exponential distribution is defined as: \[p_{y|\theta}(y|\theta) =\theta\ \exp\left(-\theta\ y \right) , \quad \text{with}\ y\in\mathbb{R}^+, \ \theta\in\mathbb{R}^{+*}\] The expectation is \(\mathbb{E}[y]=\frac{1}{\theta}\).

Definition 3.10 (Weibull distribution) The Weibull distribution is defined as \[p_{y|\lambda,\theta}(y|\lambda,\theta) = \lambda\ \theta \ y^{\lambda-1} \exp\left \lbrack -\theta\ y^{\lambda} \right \rbrack ,\quad \text{with}\ y\in\mathbb{R}^+, \ \theta\in\mathbb{R}^{+*} , \ \lambda\in\mathbb{R}^{+*}\] The expectation is \(\mathbb{E}[y]=\left(\frac{1}{\theta}\right)^{1/\lambda}\Gamma\left( 1+\frac{1}{\lambda} \right)\) where \(\Gamma(u)=\int_0^{+\infty} s^{u-1}\ \exp(-s) \ ds\). Note that the exponential distribution is a special case of the Weibull distribution with \(\lambda=1\).

The Weibull distribution (or the exponential distribution) is often used to model a time to failure \(y\), for instance the time until a new hard drive eventually fails.
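
The Weibull expectation formula can be checked numerically against the density of Definition 3.10. The sketch below assumes the illustrative values \(\lambda=2\) and \(\theta=0.5\) and compares numerical integration with the closed-form expression.

```python
import math
import numpy as np
from scipy.integrate import quad

lam, theta = 2.0, 0.5   # illustrative values of lambda and theta

# Weibull density as written in Definition 3.10.
def p(y):
    return lam * theta * y ** (lam - 1.0) * np.exp(-theta * y ** lam)

# E[y] obtained by numerical integration of y * p(y) over [0, +infinity).
e_numeric, _ = quad(lambda y: y * p(y), 0.0, np.inf)

# Closed-form expectation (1/theta)^(1/lambda) * Gamma(1 + 1/lambda).
e_formula = (1.0 / theta) ** (1.0 / lam) * math.gamma(1.0 + 1.0 / lam)

print(e_numeric, e_formula)  # both approximately 1.2533
# Setting lam = 1.0 recovers the exponential distribution, with E[y] = 1/theta.
```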

Exercises. You should be able to show that these p.d.f. are non-negative functions that integrate (or sum) to 1 on the space of outcomes, to compute their expectations, and to identify the functions \(a,b,c,d\) showing that they are members of the exponential family of distributions.