Chapter 3 Distributions for response variable \(y\)
3.1 Exponential family of distributions
Definition 3.1 (Probability distributions) The probability density function (p.d.f.) \(p_y\) of the random variable \(y\) has the following properties:
\(\forall y\in \mathbb{Y}\), \(p_{y}(y) \geq 0\) (the function \(p_y\) is non-negative for all possible outcomes; \(\mathbb{Y}\) denotes the space of all possible outcomes)
the function \(p_{y}\) integrates (or sums) to 1 over the space of all possible outcomes:
when \(y\) is a continuous random variable: \[\int_{\mathbb{Y}} p_{y}(y) \ dy=1\]
when \(y\) is a discrete random variable: \[\sum_{y\in \mathbb{Y}} p_{y}(y) =1\]
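The two properties above can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names `is_valid_pmf` and `integrate_midpoint`, the fair-die example, and the density \(2y\) on \([0,1]\) are illustrative assumptions, not part of the definition.

```python
import math

def is_valid_pmf(probabilities, tol=1e-12):
    """Check non-negativity and unit total mass for a discrete distribution."""
    non_negative = all(p >= 0 for p in probabilities)
    sums_to_one = math.isclose(sum(probabilities), 1.0, abs_tol=tol)
    return non_negative and sums_to_one

def integrate_midpoint(f, lower, upper, steps=100_000):
    """Approximate the integral of f over [lower, upper] with the midpoint rule."""
    width = (upper - lower) / steps
    return sum(f(lower + (i + 0.5) * width) for i in range(steps)) * width

# Discrete case (assumed example): a fair six-sided die.
fair_die = [1 / 6] * 6
die_ok = is_valid_pmf(fair_die)

# Continuous case (assumed example): the density p(y) = 2y on [0, 1].
def triangular_pdf(y):
    return 2.0 * y

triangular_area = integrate_midpoint(triangular_pdf, 0.0, 1.0)

print(die_ok)           # True when both properties hold
print(triangular_area)  # should be very close to 1
```

A numerical check of this kind does not replace the analytical proof, but it is a quick way to catch a mistake in a formula.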
Only distributions from the exponential family of distributions will be considered for the response variable \(y\).
Definition 3.3 (Expectation) Consider a random variable \(y\) with p.d.f. \(p_y\) on the space of outcomes \(\mathbb{Y}\). The expectation of \(y\) is computed as follows:
when \(y\) is a continuous random variable: \[\mathbb{E}[y]=\int_{\mathbb{Y}} y\ p_{y}(y) \ dy\]
when \(y\) is a discrete random variable: \[\mathbb{E}[y]= \sum_{y\in \mathbb{Y}} y\ p_{y}(y)\]
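As an illustration of Definition 3.3, the sketch below evaluates both formulas numerically. The function names, the fair-die example, and the density \(2y\) on \([0,1]\) are assumptions chosen for illustration; the continuous integral is approximated with a simple midpoint rule.

```python
def expectation_discrete(outcomes, probabilities):
    """Sum of y * p(y) over all outcomes."""
    return sum(y * p for y, p in zip(outcomes, probabilities))

def expectation_continuous(pdf, lower, upper, steps=100_000):
    """Midpoint-rule approximation of the integral of y * p(y) over [lower, upper]."""
    width = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        y = lower + (i + 0.5) * width
        total += y * pdf(y)
    return total * width

# Discrete case (assumed example): a fair six-sided die, exact expectation 3.5.
die_mean = expectation_discrete(range(1, 7), [1 / 6] * 6)

# Continuous case (assumed example): density 2y on [0, 1], exact expectation 2/3.
triangular_mean = expectation_continuous(lambda y: 2.0 * y, 0.0, 1.0)

print(die_mean)
print(triangular_mean)
```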
3.2 Some members of the exponential family
Several probability density functions (p.d.f.) are introduced in this section.
Definition 3.5 (Poisson distribution)
The Poisson distribution expresses the probability of a given number of events \(y\) occurring in a fixed interval of time and/or space (and/or in a population of fixed total size): \[p_{y|\theta}(y|\theta)=\frac{\theta^y\exp(-\theta)}{y!} , \quad y\in \mathbb{N},\ \theta\in \mathbb{R}^{+*} \label{eq:poisson:dist}\] The expectation of \(y\) is \(\mathbb{E}[y]=\theta\), which represents the average number of events. As an example, a Poisson-distributed response \(y\) can model the number of people in line in front of you at the grocery store.
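The Poisson formula can be checked numerically for a particular parameter value. The sketch below assumes \(\theta=3\) and truncates the infinite sum over \(\mathbb{N}\) at \(y=99\), where the remaining probability mass is negligible for this \(\theta\); both choices are illustrative assumptions.

```python
import math

def poisson_pmf(y, theta):
    """Poisson probability of observing y events when the mean is theta."""
    return theta**y * math.exp(-theta) / math.factorial(y)

theta = 3.0            # assumed value, for illustration only
support = range(100)   # truncation of the natural numbers (assumption)

total_probability = sum(poisson_pmf(y, theta) for y in support)
mean = sum(y * poisson_pmf(y, theta) for y in support)

print(total_probability)  # should be very close to 1
print(mean)               # should be very close to theta
```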
A standard example for using the Bernoulli distribution is when the response \(y\) is the outcome of a single coin toss, where heads is encoded as \(y=1\) and tails as \(y=0\).
The Binomial distribution is used when considering the number \(y\) of heads when tossing a coin \(n\) times. Another example is when \(y\) is the number of students passing an exam amongst \(n\) students taking that exam.
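The coin-tossing examples can be explored with a short numerical sketch. It uses the standard Binomial probability \(\binom{n}{y}p^y(1-p)^{n-y}\), where \(p\) denotes the probability of heads on a single toss; the symbol \(p\) and the values \(n=10\), \(p=0.3\) are assumptions introduced here for illustration. Setting \(n=1\) recovers the Bernoulli case.

```python
import math

def binomial_pmf(y, n, p):
    """Probability of y successes in n independent trials with success probability p."""
    return math.comb(n, y) * p**y * (1 - p) ** (n - y)

n, p = 10, 0.3  # assumed values, for illustration only

pmf = [binomial_pmf(y, n, p) for y in range(n + 1)]
total_probability = sum(pmf)
mean = sum(y * q for y, q in enumerate(pmf))  # the Binomial mean is n * p

# Bernoulli distribution as the special case n = 1: probabilities of y = 0 and y = 1.
bernoulli = [binomial_pmf(y, 1, p) for y in (0, 1)]

print(total_probability)
print(mean)
print(bernoulli)
```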
The Weibull distribution (or the exponential distribution, which is a special case of the Weibull) is often used to model a time to failure \(y\), for instance the time taken by a new hard drive to eventually fail.
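A numerical sketch of the time-to-failure model follows. It assumes the common scale/shape parametrisation of the Weibull density, \(\frac{k}{\lambda}\left(\frac{y}{\lambda}\right)^{k-1}\exp\left(-(y/\lambda)^k\right)\) for \(y>0\), which may differ from the one used elsewhere in these notes. With shape \(k=1\) this density reduces to the exponential distribution with mean \(\lambda\). The parameter values and the truncation of the integral at \(y=60\) are illustrative assumptions.

```python
import math

def weibull_pdf(y, scale, shape):
    """Weibull density in the (assumed) scale/shape parametrisation, for y > 0."""
    return (shape / scale) * (y / scale) ** (shape - 1) * math.exp(-((y / scale) ** shape))

def integrate_midpoint(f, lower, upper, steps=100_000):
    """Approximate the integral of f over [lower, upper] with the midpoint rule."""
    width = (upper - lower) / steps
    return sum(f(lower + (i + 0.5) * width) for i in range(steps)) * width

scale = 2.0   # assumed value (e.g. years), for illustration only
upper = 60.0  # truncation of [0, infinity); the tail beyond it is negligible here

# Shape 1: exponential distribution, whose mean equals the scale parameter.
exp_area = integrate_midpoint(lambda y: weibull_pdf(y, scale, 1.0), 0.0, upper)
exp_mean = integrate_midpoint(lambda y: y * weibull_pdf(y, scale, 1.0), 0.0, upper)

# Shape 1.5: the Weibull mean is scale * Gamma(1 + 1/shape).
shape = 1.5
weibull_mean = integrate_midpoint(lambda y: y * weibull_pdf(y, scale, shape), 0.0, upper)
expected_mean = scale * math.gamma(1.0 + 1.0 / shape)

print(exp_area, exp_mean)
print(weibull_mean, expected_mean)
```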
Exercises. You should be able to show that these p.d.f.s are non-negative functions that integrate (or sum) to 1 on the space of outcomes, compute their expectations, and identify the functions \(a,b,c,d\) that show they are members of the exponential family of distributions.