Linear regression

In this section, we will review regression from a statistical perspective. For a machine learning perspective, see the machine learning section.

The main goal is to find the line that best fits the data. For instance, consider the following figures, constructed using World Bank data.

pic1 pic2

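The problem in each dataset consists of finding a linear approximation \(\hat{y}_{i} = \beta_{0} + \beta_{1} x_{i}\) of the true value \(y_{i}\), where the squared error of each observation is \(u_{i}^{2} = (y_{i} - \hat{y}_{i})^{2}\). The coefficients \(\beta_{0}\) and \(\beta_{1}\) are chosen to minimize the sum of these squared errors: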

\begin{equation} \min_{\beta_{0}, \beta_{1}} \sum u_{i}^{2} \end{equation}
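The minimization above can be sketched in a few lines of Python. This is a minimal illustration, not the course's own implementation: it assumes the linear approximation has the form \(\hat{y}_{i} = \beta_{0} + \beta_{1} x_{i}\) with a single regressor \(x_{i}\), uses the standard closed-form solution for that case, and runs on synthetic data rather than the World Bank data. The function names `fit_simple_ols` and `sum_squared_errors` are hypothetical.

```python
import numpy as np


def fit_simple_ols(x, y):
    """Return (beta0, beta1) minimizing sum((y - beta0 - beta1 * x) ** 2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: co-variation of x and y divided by the variation of x.
    beta1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: the fitted line passes through the point of means.
    beta0 = y_mean - beta1 * x_mean
    return beta0, beta1


def sum_squared_errors(x, y, beta0, beta1):
    """Evaluate the objective sum(u_i ** 2) for given coefficients."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    u = y - (beta0 + beta1 * x)
    return float(np.sum(u ** 2))


# Synthetic data (an assumption for illustration, not World Bank data).
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=x.size)

beta0_hat, beta1_hat = fit_simple_ols(x, y)
print("beta0:", beta0_hat, "beta1:", beta1_hat)
print("sum of squared errors:", sum_squared_errors(x, y, beta0_hat, beta1_hat))
```

Moving either coefficient away from the fitted values should only increase the sum of squared errors, which is a quick way to check the result by hand.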

pic3 pic4

The values \(\beta_{0}\) and \(\beta_{1}\) that solve the problem can be obtained using two methods: Maximum Likelihood Estimation and the Least Squares Method. When the errors are assumed to be normally distributed, both methods yield the same coefficient estimates.
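One way to see the connection between the two methods is to write out the likelihood. The sketch below assumes that the errors \(u_{i}\) are independent and normally distributed with mean zero and constant variance \(\sigma^{2}\), and that there are \(n\) observations; the symbols \(\sigma^{2}\), \(n\) and \(\ell\) are introduced here for illustration only. Under those assumptions the log-likelihood is

\begin{equation} \ell(\beta_{0}, \beta_{1}, \sigma^{2}) = -\frac{n}{2} \log(2 \pi \sigma^{2}) - \frac{1}{2 \sigma^{2}} \sum u_{i}^{2} \end{equation}

For any fixed \(\sigma^{2}\), the coefficients enter only through the last term, so maximizing \(\ell\) over \(\beta_{0}\) and \(\beta_{1}\) is equivalent to minimizing \(\sum u_{i}^{2}\), which is the least squares problem.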

Slides

Notebooks

Model

Material

Introduction to linear regression

Slides

Linear regression notebook

Notebook lr

Exercise linear regression

Notebook exercise lr

Introduction to dummy variables

Dummy variables slides

Multicollinearity

Notebook

Assumptions

Slides assumptions

Assumptions

Notebook assumptions

Specification

Notebook specification

Videos

Laboratories

Impact