Chapter 8 Explanatory variables in GLMs

So far, we have seen that different GLMs are obtained by choosing a distribution for the response and a link function. In this chapter, we focus on the design of the linear predictor \(\beta^{T}x\).

8.1 Nature of variables

Responses and explanatory variables can be:

nominal, e.g. (red, green, blue), (dead, alive), (male, female)
ordinal, in which there is a natural order or ranking, e.g. age categories
continuous, e.g. time, weight, temperature

Nominal and ordinal variables are together called categorical variables. When the linear form involves a continuous variable \(x\), additional polynomial explanatory variables can be included in the model, with or without an intercept \(\beta_0\), e.g.:

using \(x\) as explanatory variable: \(\beta_0+\beta_1 \ x\) (\(\dim(\beta)=2\)), \(\beta_1 \ x\) (\(\dim(\beta)=1\))
using \(x^2\) as explanatory variable: \(\beta_0+\beta_2 \ x^2\) (\(\dim(\beta)=2\)), \(\beta_2 \ x^2\) (\(\dim(\beta)=1\))
using \(x\) and \(x^2\) as explanatory variables: \(\beta_0+\beta_1 x+\beta_2 \ x^2\) (\(\dim(\beta)=3\)), \(\beta_1 x+\beta_2 \ x^2\) (\(\dim(\beta)=2\))
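
The cases above can be checked by building the corresponding design matrices and counting their columns, since \(\dim(\beta)\) equals the number of columns. The following is a minimal sketch using NumPy; the values of `x` and the helper `design` are hypothetical and only serve as an illustration.

```python
import numpy as np

# Hypothetical values of a continuous explanatory variable x.
x = np.array([0.5, 1.0, 1.5, 2.0, 2.5])

def design(x, powers, intercept=True):
    """Illustrative helper: optional column of ones, then x**p for each p in powers."""
    cols = [np.ones_like(x)] if intercept else []
    cols += [x ** p for p in powers]
    return np.column_stack(cols)

designs = {
    "b0 + b1 x":          design(x, [1]),
    "b1 x":               design(x, [1], intercept=False),
    "b0 + b2 x^2":        design(x, [2]),
    "b2 x^2":             design(x, [2], intercept=False),
    "b0 + b1 x + b2 x^2": design(x, [1, 2]),
    "b1 x + b2 x^2":      design(x, [1, 2], intercept=False),
}

# dim(beta) is the number of columns of each design matrix.
dims = {name: X.shape[1] for name, X in designs.items()}
for name, d in dims.items():
    print(f"{name}: dim(beta) = {d}")
```

Each row of a design matrix is the vector \(x\) entering \(\beta^{T}x\) for one observation, so dropping the intercept simply removes the column of ones.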

8.2 Generalized Linear Mixed Models

Definition 8.1 (Mixed models) Consider \(m\) clusters of data in which the response is modeled as a function of a single regressor \(x\). The generalized linear mixed model is: \[g(\mathbb{E}[y])=\beta_0+\beta_1\ x+ \sum_{j=1}^m \delta_j \ ( \alpha_j + \gamma_j \ x )\] with

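\(x\) a continuous explanatory variable,
\(\delta_j\) the indicator variable for the \(j^{th}\) cluster (equal to 1 if the observation belongs to cluster \(j\), and 0 otherwise),
\(\alpha_j\) the intercept term and \(\gamma_j\) the slope term for the \(j^{th}\) cluster, so that an observation in cluster \(j\) has intercept \(\beta_0+\alpha_j\) and slope \(\beta_1+\gamma_j\).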
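
To make the role of the indicators \(\delta_j\) concrete, the sketch below evaluates the right-hand side of Definition 8.1 for a small made-up data set and checks that it reduces to a cluster-specific line. All numerical values (`x`, `cluster`, `beta0`, `beta1`, `alpha`, `gamma`) are assumptions chosen for illustration, and the code only computes the linear predictor; it does not fit the model.

```python
import numpy as np

# Hypothetical data: six observations split into m = 2 clusters.
x = np.array([0.0, 1.0, 2.0, 0.0, 1.0, 2.0])
cluster = np.array([0, 0, 0, 1, 1, 1])  # cluster label of each observation
m = 2

# delta[i, j] = 1 if observation i belongs to cluster j, else 0.
delta = np.eye(m)[cluster]

# Hypothetical parameter values.
beta0, beta1 = 1.0, 0.5
alpha = np.array([-0.2, 0.3])   # cluster intercept terms alpha_j
gamma = np.array([0.1, -0.1])   # cluster slope terms gamma_j

# Linear predictor: beta0 + beta1 x + sum_j delta_j (alpha_j + gamma_j x).
eta = beta0 + beta1 * x + (delta * (alpha + np.outer(x, gamma))).sum(axis=1)

# Equivalent form: each cluster has its own intercept and slope.
eta_direct = (beta0 + alpha[cluster]) + (beta1 + gamma[cluster]) * x

print(np.allclose(eta, eta_direct))
```

Because exactly one indicator is nonzero for each observation, the sum over \(j\) selects a single pair \((\alpha_j, \gamma_j)\), which is why the two expressions agree.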