Linear regression vs. Generalized linear models (GLM): What’s the difference?

Linear Regression Definition
Linear regression is a modelling approach that assumes a linear relationship between an output (a.k.a. the “dependent variable”) and one or more inputs (a.k.a. “independent variables”).
Here are a few everyday examples of linear regression models:
- Weight (as Y) as a function of a person’s Height (as X)
- Reported Happiness (as Y) as a function of Income (as X)
- Sales Revenue (as Y) as a function of Marketing Budget (as X)
Assumptions for Linear Regression
- Linear relationship between input(s) and output
- Output variable is continuous and unbounded
- Residuals are normally distributed (or follow a Student’s t distribution, if you want to allow for heavier tails). Residuals are also called “errors”: they are the gaps between the observed values and the regression line, so they measure how well the line fits the data.
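To make the definition and assumptions concrete, here is a minimal sketch of a one-input linear regression fitted by ordinary least squares, using only the Python standard library. The height and weight numbers are invented for illustration, and `fit_line` is a hypothetical helper name, not part of any library.

```python
# Hypothetical data, invented for illustration only.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
weights_kg = [52.0, 57.0, 67.0, 73.0, 83.0]


def fit_line(xs, ys):
    """Ordinary least squares for one input: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = (sum of cross-deviations) / (sum of squared x-deviations)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(heights_cm, weights_kg)

# Residual = observed value minus fitted value.
residuals = [y - (intercept + slope * x) for x, y in zip(heights_cm, weights_kg)]

print(f"weight = {intercept:.2f} + {slope:.2f} * height")
print("residuals:", [round(r, 2) for r in residuals])
```

Nothing in this fit keeps predictions inside a valid range: with data like these the fitted intercept is negative, so a small enough height yields a negative predicted weight. That is harmless when the output really is continuous and unbounded, and it becomes a problem when it is not.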

Generalized Linear Model (GLM) Definition
As the name indicates, GLM is a generalized form of linear regression. It is more flexible than linear regression because:
- GLM works when the output variable is not continuous or is bounded (for example, a count)
- GLM allows changes in unconstrained inputs to affect the output variable on an appropriately constrained scale (this is the role of the link function)
Here’s an example:
Imagine that you want to model the number of Covid cases based on the population of an area. This output variable is constrained, as the number of Covid cases must be a non-negative integer. This makes it hard to use linear regression to model the data, because the assumption that the output variable is continuous and unbounded is violated. However, you can use GLM in this case and test the effect…
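
As a rough illustration of the count example, the sketch below fits a Poisson regression with a log link, which is one common GLM choice for count data (an assumption here, since the text above does not name a specific family or link). It uses Newton's method and only the Python standard library. The population and case numbers are invented, and `fit_poisson` is a hypothetical helper name.

```python
import math

# Hypothetical data, invented for illustration only.
population_10k = [1.0, 2.0, 3.0, 4.0, 5.0]  # population, in units of 10,000
cases = [2, 3, 6, 7, 12]                    # observed case counts


def fit_poisson(xs, ys, max_iter=50, tol=1e-10):
    """Fit log(mean) = b0 + b1 * x by Newton's method; returns (b0, b1)."""
    # Start from the intercept-only model: overall mean count, zero slope.
    b0, b1 = math.log(sum(ys) / len(ys)), 0.0
    for _ in range(max_iter):
        mus = [math.exp(b0 + b1 * x) for x in xs]
        # Gradient of the Poisson log-likelihood with respect to (b0, b1).
        g0 = sum(y - m for y, m in zip(ys, mus))
        g1 = sum(x * (y - m) for x, y, m in zip(xs, ys, mus))
        # Negative Hessian, a 2x2 matrix that is inverted by hand below.
        h00 = sum(mus)
        h01 = sum(x * m for x, m in zip(xs, mus))
        h11 = sum(x * x * m for x, m in zip(xs, mus))
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + step0, b1 + step1
        if max(abs(step0), abs(step1)) < tol:
            break
    return b0, b1


b0, b1 = fit_poisson(population_10k, cases)

# Predictions are exp(linear predictor), so they can never be negative.
for x in [0.5, 3.0, 6.0]:
    expected = math.exp(b0 + b1 * x)
    print(f"population {x * 10_000:.0f}: expected cases {expected:.2f}")
```

The inputs enter through an ordinary linear expression, `b0 + b1 * x`, which can take any real value. The exponential maps that value onto the positive scale where an expected count has to live, so the model cannot predict a negative number of cases.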