{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "@author: Octavio Gutiérrez of Código Máquina\n", "\n", "Channel URL: https://www.youtube.com/CodigoMaquina\n", "\n", "Video URL: https://youtu.be/9IZ6OPQWtpw\n", "\n", "

Regression Metrics

\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "



















\n", "

Maximum absolute error (M)

\n", "\n", "$y$ : true value \n", "\n", "$\\hat{y}$ : predicted value \n", "\n", "$$\\text{M}(y, \\hat{y}) = \\max_{i} \\left| y_i - \\hat{y}_i \\right|$$\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "10" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from sklearn.metrics import max_error\n", "\n", "y_verdadero = [1, 2, 3, 4, 5]\n", "y_predicho = [1, 2, 3, 4, -5]\n", "max_error(y_verdadero, y_predicho)" ] }, { "cell_type": "markdown", "metadata": { "scrolled": false }, "source": [ "
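The value returned by `max_error` can be checked by hand. The following is a minimal plain-Python sketch of the formula for M; the helper name `max_abs_error` is hypothetical and is not part of scikit-learn.

```python
def max_abs_error(y_true, y_pred):
    # Largest absolute residual |y_i - y_hat_i| over all samples
    return max(abs(t - p) for t, p in zip(y_true, y_pred))

y_verdadero = [1, 2, 3, 4, 5]
y_predicho = [1, 2, 3, 4, -5]
m = max_abs_error(y_verdadero, y_predicho)
print(m)  # 10, produced entirely by the last sample: |5 - (-5)|
```

Because M looks only at the single worst prediction, the four exact predictions have no effect on the result.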



















\n", "

Mean absolute error (MAE)

\n", "\n", "$y$ : true value \n", "\n", "$\\hat{y}$ : predicted value \n", "\n", "$n$ : sample size \n", "\n", "$$\\text{MAE}(y, \\hat{y}) = \\frac{1}{n} \\sum_{i=1}^{n} \\left| y_i - \\hat{y}_i \\right|$$\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2.0" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from sklearn.metrics import mean_absolute_error\n", "\n", "y_verdadero = [1, 2, 3, 4, 5]\n", "y_predicho = [1, 2, 3, 4, -5]\n", "mean_absolute_error(y_verdadero, y_predicho)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
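As a cross-check on `mean_absolute_error`, the MAE formula can be written out directly. This is an illustrative sketch in plain Python; `mean_abs_error` is a hypothetical helper name, not a library function.

```python
def mean_abs_error(y_true, y_pred):
    # Average of the absolute residuals |y_i - y_hat_i|
    residuals = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(residuals) / len(residuals)

y_verdadero = [1, 2, 3, 4, 5]
y_predicho = [1, 2, 3, 4, -5]
mae = mean_abs_error(y_verdadero, y_predicho)
print(mae)  # 2.0: residuals are 0, 0, 0, 0, 10, so 10 / 5
```

The single error of 10 is spread across all five samples, which is why the MAE is much smaller than the maximum absolute error on the same data.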



















\n", "

Mean squared error (MSE)

\n", "$y$ : true value \n", "\n", "$\\hat{y}$ : predicted value \n", "\n", "$n$ : sample size \n", "\n", "$$\\text{MSE}(y, \\hat{y}) = \\frac{1}{n} \\sum_{i=1}^{n} (y_i - \\hat{y}_i)^2$$\n" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "20.0" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from sklearn.metrics import mean_squared_error\n", "\n", "y_verdadero = [1, 2, 3, 4, 5]\n", "y_predicho = [1, 2, 3, 4, -5]\n", "mean_squared_error(y_verdadero, y_predicho)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
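The MSE formula can also be evaluated by hand to see where the value 20.0 comes from. The sketch below uses plain Python; `mean_sq_error` is a hypothetical helper name chosen for illustration.

```python
def mean_sq_error(y_true, y_pred):
    # Average of the squared residuals (y_i - y_hat_i)^2
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared) / len(squared)

y_verdadero = [1, 2, 3, 4, 5]
y_predicho = [1, 2, 3, 4, -5]
mse = mean_sq_error(y_verdadero, y_predicho)
print(mse)  # 20.0: the one residual of 10 contributes 10^2 = 100, and 100 / 5 = 20
```

Squaring makes large residuals dominate: on this data the MSE is 20.0 while the MAE is 2.0, even though both are computed from the same single error.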



















\n", "

Residual sum of squares (RSS)

\n", "\n", "$y$ : true value \n", "\n", "$\\hat{y}$ : predicted value \n", "\n", "$n$ : sample size \n", "\n", "$$\\text{RSS}(y, \\hat{y}) = \\sum_{i=1}^{n} (y_i - \\hat{y}_i)^2$$" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "100.0" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from sklearn.metrics import mean_squared_error\n", "\n", "y_verdadero = [1, 2, 3, 4, 5]\n", "y_predicho = [1, 2, 3, 4, -5]\n", "mean_squared_error(y_verdadero, y_predicho)*len(y_predicho)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
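The notebook obtains the RSS by multiplying the MSE by the sample size, since RSS = n * MSE. A direct sum gives the same number; the sketch below assumes plain Python lists, and `residual_sum_of_squares` is a hypothetical helper name.

```python
def residual_sum_of_squares(y_true, y_pred):
    # Sum of squared residuals, with no division by the sample size
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred))

y_verdadero = [1, 2, 3, 4, 5]
y_predicho = [1, 2, 3, 4, -5]
rss = residual_sum_of_squares(y_verdadero, y_predicho)
n = len(y_verdadero)
print(rss)      # 100
print(rss / n)  # 20.0, which is the MSE for the same data
```

Unlike the MSE, the RSS grows with the number of samples, so it is only comparable between models evaluated on the same data set.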



















\n", "

Root mean squared error (RMSE)

\n", "\n", "$y$ : true value \n", "\n", "$\\hat{y}$ : predicted value \n", "\n", "$n$ : sample size \n", "\n", "$$\\text{RMSE}(y, \\hat{y}) = \\sqrt{\\frac{1}{n} \\sum_{i=1}^{n} (y_i - \\hat{y}_i)^2}$$" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "4.47213595499958" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from sklearn.metrics import mean_squared_error\n", "\n", "y_verdadero = [1, 2, 3, 4, 5]\n", "y_predicho = [1, 2, 3, 4, -5]\n", "# RMSE as the square root of the MSE; avoids the squared=False argument, which newer scikit-learn releases deprecate\n", "mean_squared_error(y_verdadero, y_predicho) ** 0.5" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
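The RMSE is the square root of the MSE, so it can be reproduced with the standard library alone. This is a minimal sketch; `root_mean_sq_error` is a hypothetical helper name used only for illustration.

```python
import math

def root_mean_sq_error(y_true, y_pred):
    # Square root of the mean of the squared residuals
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

y_verdadero = [1, 2, 3, 4, 5]
y_predicho = [1, 2, 3, 4, -5]
rmse = root_mean_sq_error(y_verdadero, y_predicho)
print(rmse)  # approximately 4.472, the square root of the MSE of 20.0
```

Taking the root brings the metric back to the units of the target variable, which makes it easier to compare with the MAE than the MSE is.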



















\n", "

R^2 (coefficient of determination)

\n", "\n", "$y$ : true value \n", "\n", "$\\hat{y}$ : predicted value \n", "\n", "$\\bar{y}$ : mean of the true values\n", "\n", "$n$ : sample size \n", "\n", "$$R^2(y, \\hat{y}) = 1 - \\frac{\\sum_{i=1}^{n} (y_i - \\hat{y}_i)^2}{\\sum_{i=1}^{n} (y_i - \\bar{y})^2}$$" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.09999999999999998" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from sklearn.metrics import r2_score\n", "\n", "y_verdadero = [1, 2, 3, 4, 5]\n", "y_predicho = [1, 2, 3, 4, 2]\n", "r2_score(y_verdadero, y_predicho)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
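To see how the two sums in the R^2 formula interact, the score can be computed by hand for a few sets of predictions. The sketch below is plain Python under the definitions above; the helper name `r2` and the extra prediction lists are assumptions added for illustration.

```python
def r2(y_true, y_pred):
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

y_verdadero = [1, 2, 3, 4, 5]
r2_close = r2(y_verdadero, [1, 2, 3, 4, 2])   # SS_res = 9, SS_tot = 10
r2_mean = r2(y_verdadero, [3, 3, 3, 3, 3])    # always predicting the mean
r2_far = r2(y_verdadero, [1, 2, 3, 4, -5])    # SS_res = 100, SS_tot = 10

print(r2_close)  # approximately 0.1
print(r2_mean)   # 0.0
print(r2_far)    # -9.0
```

A score of 0 means the model does no better than predicting the mean of the true values, and the score becomes negative when the model does worse than that baseline.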











\n", "Spiess, Andrej-Nikolai, and Natalie Neumeyer. \"An evaluation of R2 as an inadequate measure for nonlinear models in pharmacological and biochemical research: a Monte Carlo approach.\" BMC pharmacology 10.1 (2010): 1-11." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "























" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 4 }