Linear regression vs. Generalized linear models (GLM): What’s the difference?

Anyi Guo
5 min read · Mar 18, 2022

This post explains the difference between 1) linear regression and 2) generalized linear models (GLMs).

Linear Regression Definition

Linear regression is a modelling approach that assumes a linear relationship between an output (a.k.a. the “dependent variable”) and one or more inputs (a.k.a. “independent variables”).

Example of a simple linear regression with only 1 input variable X1

Here are a few real-life examples of linear regression models:

  1. Weight (as Y) as a function of a person’s Height (as X)
  2. Reported Happiness (as Y) as a function of Income (as X)
  3. Sales Revenue (as Y) as a function of Marketing Budget (as X)
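
To make the first example concrete, here is a minimal sketch that fits a straight line to weight as a function of height. The data are synthetic and made up for illustration (the "true" slope of 1.0 kg per cm and the noise level are assumptions, not figures from any real study), and ordinary least squares via NumPy is just one common way to fit such a line.

```python
import numpy as np

# Synthetic, made-up data (illustrative only): heights in cm, weights in kg.
rng = np.random.default_rng(0)
height = rng.uniform(150, 200, size=200)
# Assumed "true" relationship: weight = -100 + 1.0 * height + normal noise.
weight = -100.0 + 1.0 * height + rng.normal(0, 5, size=200)

# Design matrix with a column of ones for the intercept.
X = np.column_stack([np.ones_like(height), height])

# Ordinary least squares: find beta minimising ||X @ beta - weight||^2.
beta, *_ = np.linalg.lstsq(X, weight, rcond=None)
intercept, slope = beta

print(f"intercept = {intercept:.2f}, slope = {slope:.3f}")
```

The fitted slope should land close to the assumed value of 1.0, since the data were generated from a genuinely linear relationship with normally distributed noise.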

Assumptions for Linear Regression

  • Linear relationship between input(s) and output
  • Output variable is continuous and unbounded
  • Residuals are normally distributed (or follow a Student’s t-distribution, if you want to allow for heavier tails). Residuals are also loosely called “errors”, as they measure how far each observed value lies from the regression line.

A residual (in red) is the vertical distance between a data point (in black) and the regression line (in grey)
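
As a small sketch of what a residual is numerically, the snippet below fits a line to five made-up points and computes each residual as the observed value minus the fitted value. The data and variable names are hypothetical and chosen only for illustration.

```python
import numpy as np

# Five made-up data points (illustrative only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# np.polyfit returns coefficients from highest degree down: [slope, intercept].
slope, intercept = np.polyfit(x, y, deg=1)

fitted = intercept + slope * x   # points on the regression line
residuals = y - fitted           # vertical distances from the line

print("residuals:", np.round(residuals, 3))
print("sum of residuals:", round(float(residuals.sum()), 10))
```

For a least-squares line with an intercept, the residuals sum to (numerically) zero. Checking whether they look roughly normal, for instance with a histogram or a Q-Q plot, is one way to probe the third assumption.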

Generalized Linear Model (GLM) Definition

As the name indicates, a GLM is a generalized form of linear regression. It is more flexible than linear regression because:

  1. GLM works when the output variable is not continuous or not unbounded (for example, counts or binary outcomes)
  2. GLM allows changes in unconstrained inputs to affect the output variable on an appropriately constrained scale
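
One common way to write the two points above down, for a single input X1, is sketched below. The notation (β0, β1, the link function g) is standard textbook notation and is an addition for illustration, not taken from this post.

```latex
% Linear regression: the mean of Y is the linear predictor itself.
Y = \beta_0 + \beta_1 X_1 + \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, \sigma^2)

% GLM: the same linear predictor, connected to the mean through a link function g.
g\big(\mathbb{E}[Y \mid X_1]\big) = \beta_0 + \beta_1 X_1
\quad\Longleftrightarrow\quad
\mathbb{E}[Y \mid X_1] = g^{-1}(\beta_0 + \beta_1 X_1)
```

With the identity link g(μ) = μ and a normal distribution for Y, this reduces to ordinary linear regression. With a log link, g(μ) = log μ, the linear predictor can take any real value while the predicted mean exp(β0 + β1 X1) always stays positive, which is what "an appropriately constrained scale" means in practice.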

Here’s an example:

Imagine that you want to model the number of Covid cases based on the population of an area. This output variable is constrained, as the number of Covid cases must be a non-negative integer. This makes it hard to use linear regression to model the data, because the assumption that the output variable is continuous and unbounded is violated. However, you can use GLM in this case and test the effect…
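
As a rough illustration of the Covid example, here is a sketch of one possible GLM for count data: a Poisson regression with a log link, fitted with iteratively reweighted least squares (IRLS) in plain NumPy. Everything specific here is an assumption made for the sketch: the synthetic data, the coefficients 0.5 and 0.05, the choice of a Poisson model, and the helper name `fit_poisson_glm`. In practice you would more likely reach for a statistics library than hand-roll the fit.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic, made-up data: population (in thousands) of 300 hypothetical areas.
population = rng.uniform(1, 50, size=300)
X = np.column_stack([np.ones_like(population), population])

# Assumed "true" model: expected cases = exp(0.5 + 0.05 * population).
true_beta = np.array([0.5, 0.05])
cases = rng.poisson(np.exp(X @ true_beta))  # non-negative integer counts


def fit_poisson_glm(X, y, n_iter=25):
    """Hypothetical helper: Poisson regression with a log link, fitted by IRLS."""
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())  # start from an intercept-only model
    for _ in range(n_iter):
        eta = X @ beta            # linear predictor (unconstrained scale)
        mu = np.exp(eta)          # predicted mean (always positive)
        z = eta + (y - mu) / mu   # working response for the log link
        WX = X * mu[:, None]      # Poisson/log-link working weights are mu
        beta = np.linalg.solve(X.T @ WX, WX.T @ z)
    return beta


beta_hat = fit_poisson_glm(X, cases)
predicted = np.exp(X @ beta_hat)

print("estimated coefficients:", np.round(beta_hat, 3))
print("smallest predicted mean:", round(float(predicted.min()), 3))
```

The estimated coefficients should come out close to the assumed values, and because predictions are formed as the exponential of the linear predictor, they can never be negative, unlike the predictions of a straight line.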

Written by Anyi Guo

Head of Data Science @ UW. This is my notepad for thoughts on data science, machine learning & AI.
