This tutorial shows how to calculate the R-squared in Python, with and without the sklearn package.

As an example, we first define two arrays: y_hat holds the y values predicted by a linear regression, and y_true holds the true y values.

```
import numpy as np
y_hat = np.array([2,3,5,7,2,3,8,5,3,1])
y_true = np.array([5,4,2,7,4,2,1,6,5,3])
```

Now we calculate the R-squared from those two variables.

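The formula for calculating the R-squared is:

$$R^2 = 1 - \frac{SSE}{SST}$$

where SST is:

$$SST = \sum_{i=1}^{n} (y_i - \bar{y})^2$$

and SSE is:

$$SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

Here y_i are the true y values, \bar{y} is their mean, and \hat{y}_i are the predicted y values.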

To understand SST and SSE, consider the following image, found on Wikipedia and created by Orzetto (please see the credits and license below the image):

On the left-hand side, you see the SST, the total sum of squares, which is just the sum of the squared differences between the actual y values and the mean of y.

On the right-hand side, you see the SSE, the residual sum of squares, which is just the sum of the squared differences between the actual y values and the y values predicted by the regression line (m*x+b).
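Without sklearn, the calculation can be done directly with NumPy. The following is a minimal sketch, assuming SST is the summed squared deviation of the true values from their mean, SSE is the summed squared difference between the true and the predicted values, and R-squared is 1 - SSE/SST. The names sst, sse and r_squared are chosen here for illustration only.

```python
import numpy as np

# Example arrays from the beginning of the tutorial
y_hat = np.array([2, 3, 5, 7, 2, 3, 8, 5, 3, 1])   # predicted y values
y_true = np.array([5, 4, 2, 7, 4, 2, 1, 6, 5, 3])  # true y values

# SST: squared differences between the true values and their mean, summed
sst = np.sum((y_true - y_true.mean()) ** 2)

# SSE: squared differences between the true and predicted values, summed
sse = np.sum((y_true - y_hat) ** 2)

# R-squared as 1 - SSE/SST
r_squared = 1 - sse / sst
print(r_squared)  # approximately -1.4924
```

For these example arrays the result is negative, which means the predictions fit the true values worse than simply predicting the mean of y_true for every observation.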

You can also just use the sklearn package to calculate the R-squared.

```
from sklearn.metrics import r2_score
# r2_score expects the true values first, then the predicted values
print(r2_score(y_true, y_hat))
```
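The order of the arguments matters: r2_score takes the true values first and the predictions second, and the R-squared is not symmetric in its two inputs because SST is computed from the true values only. The sketch below illustrates this with plain NumPy. The helper r_squared_manual is a hypothetical name used only for this illustration (it is not part of sklearn) and assumes the definition 1 - SSE/SST.

```python
import numpy as np

def r_squared_manual(y_true, y_pred):
    """Hypothetical helper: R-squared computed as 1 - SSE/SST."""
    sst = np.sum((y_true - np.mean(y_true)) ** 2)  # total sum of squares
    sse = np.sum((y_true - y_pred) ** 2)           # residual sum of squares
    return 1 - sse / sst

y_hat = np.array([2, 3, 5, 7, 2, 3, 8, 5, 3, 1])   # predicted y values
y_true = np.array([5, 4, 2, 7, 4, 2, 1, 6, 5, 3])  # true y values

correct_order = r_squared_manual(y_true, y_hat)  # true values first
swapped_order = r_squared_manual(y_hat, y_true)  # arguments swapped by mistake
print(correct_order, swapped_order)
```

The two printed numbers differ, so passing the arrays in the wrong order silently gives a different score.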

For an application of the R-squared to real data, you are kindly invited to check out the video on my channel.