{ "cells": [ { "cell_type": "markdown", "id": "b90d6dbc-9281-4c8d-a7f4-d921139239ed", "metadata": {}, "source": [ "## Pandas Profiling: Ejemplo Pokemon!" ] }, { "cell_type": "markdown", "id": "97389f40-4d51-4682-bdb2-fe941e36e7c5", "metadata": {}, "source": [ "* Fuente original de la fuente de datos: [https://raw.githubusercontent.com/bryanpaget/html/main/pokemon.csv](https://raw.githubusercontent.com/bryanpaget/html/main/pokemon.csv\")" ] }, { "cell_type": "markdown", "id": "bb533d01-24eb-42ed-a4b9-8d806a27bddc", "metadata": {}, "source": [ "## Importar librerias" ] }, { "cell_type": "code", "execution_count": 1, "id": "b46d1cde-a575-4efb-824f-a9099d01e31f", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Channels:\n", " - defaults\n", " - conda-forge\n", "Platform: linux-64\n", "Collecting package metadata (repodata.json): done\n", "Solving environment: failed\n", "\n", "PackagesNotFoundError: The following packages are not available from current channels:\n", "\n", " - ydata_profiling\n", "\n", "Current channels:\n", "\n", " - https://repo.anaconda.com/pkgs/main\n", " - https://repo.anaconda.com/pkgs/r\n", " - https://conda.anaconda.org/conda-forge\n", "\n", "To search for alternate channels that may provide the conda package you're\n", "looking for, navigate to\n", "\n", " https://anaconda.org\n", "\n", "and use the search bar at the top of the page.\n", "\n", "\n" ] } ], "source": [ "!conda install -y numpy pandas ydata_profiling" ] }, { "cell_type": "code", "execution_count": 2, "id": "426d9021-e616-4c2d-b765-7b1d6199922f", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "\n", "from ydata_profiling import ProfileReport\n", "from ydata_profiling.utils.cache import cache_file" ] }, { "cell_type": "markdown", "id": "0e99d6fc-bdfa-40ea-b31e-8e5227e2dc26", "metadata": {}, "source": [ "## Cargar fuente de datos" ] }, { "cell_type": "code", "execution_count": 3, "id": "3d05bcd9-6882-4cf0-8a06-1f0d5fcd7b80", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "PosixPath('/home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/data/pokemon.csv')" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "file_name = cache_file(\n", " \"pokemon.csv\",\n", " \"https://raw.githubusercontent.com/bryanpaget/html/main/pokemon.csv\"\n", ")\n", "\n", "file_name" ] }, { "cell_type": "code", "execution_count": 4, "id": "5a6e2d0b-1e9b-4f2f-9fe0-bf8059359794", "metadata": {}, "outputs": [], "source": [ "pokemon_df = pd.read_csv(file_name)" ] }, { "cell_type": "markdown", "id": "ce13e44f-639d-4842-a6fd-76f65511b168", "metadata": {}, "source": [ "Crea un perfil de datos para el dataframe pokemon_df.\n", "\n", "* `sort=None`: Indica que no se debe ordenar los resultados del informe (por defecto, el perfil ordena las variables de acuerdo con su tipo de datos).\n", "* `html={\"style\": {\"full_width\": True}}`: Configura el informe HTML para que ocupe todo el ancho disponible, mejorando la presentación.\n", "* `progress_bar=False`: Desactiva la barra de progreso durante la creación del informe.\n", "* `correlations`: Aquí se configuran las correlaciones que se calcularán en el perfil. En este caso:\n", " * `auto`: Calcula automáticamente las correlaciones entre las variables numéricas.\n", " * `pearson`, `spearman`, `kendall`: Estos métodos de correlación se desactivan en este informe (False).\n", " * `phi_k` y `cramers`: Se calculan estos tipos de correlación (útiles para variables categóricas).\n", "* `explorative=True`: Habilita una configuración más detallada y exploratoria para el análisis, lo que generalmente proporciona más estadísticas descriptivas y análisis visual.\n", "* `title=\"Profiling Report\"`: Define el título del informe generado." ] }, { "cell_type": "code", "execution_count": 5, "id": "9e8e817d-9a9e-4cf3-a776-f5d379dfb747", "metadata": {}, "outputs": [], "source": [ "profile_report = ProfileReport(\n", " pokemon_df,\n", " sort=None,\n", " html={\n", " \"style\": {\"full_width\": True}\n", " }, \n", " progress_bar=False,\n", " correlations={\n", " \"auto\": {\"calculate\": True},\n", " \"pearson\": {\"calculate\": False},\n", " \"spearman\": {\"calculate\": False},\n", " \"kendall\": {\"calculate\": False},\n", " \"phi_k\": {\"calculate\": True},\n", " \"cramers\": {\"calculate\": True},\n", " },\n", " explorative=True,\n", " title=\"Profiling Report\"\n", ")\n", "\n", "profile_report.to_file(\"pokemon.html\")\n", "# Imprime el reporte dentro del notebook\n", "#profile_report" ] }, { "cell_type": "markdown", "id": "46b114a4-ed66-4f6d-b34f-e63a2dd5fb4d", "metadata": {}, "source": [ "## Comparar datasets" ] }, { "cell_type": "markdown", "id": "08bc4025-95ff-4fe2-84c8-fcc89c58d807", "metadata": {}, "source": [ "También podemos generar informes comparando dos conjuntos de datos. El siguiente ejemplo compara conjuntos de datos de Pokémon de entrenamiento y de prueba. [train_test_split](https://scikit-learn.org/dev/modules/generated/sklearn.model_selection.train_test_split.html#train-test-split) de scikit-learn se utiliza para crear los conjuntos de datos de entrenamiento y prueba." ] }, { "cell_type": "code", "execution_count": 6, "id": "9126a20b-9d67-48ec-b125-2c4bdfe016f0", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Found existing installation: numpy 2.0.0\n", "Uninstalling numpy-2.0.0:\n", " Successfully uninstalled numpy-2.0.0\n", "Found existing installation: scikit-learn 1.5.2\n", "Uninstalling scikit-learn-1.5.2:\n", " Successfully uninstalled scikit-learn-1.5.2\n", "Found existing installation: pandas 2.2.3\n", "Uninstalling pandas-2.2.3:\n", " Successfully uninstalled pandas-2.2.3\n", "Found existing installation: ydata-profiling 4.12.0\n", "Uninstalling ydata-profiling-4.12.0:\n", " Successfully uninstalled ydata-profiling-4.12.0\n", "Collecting numpy==2.0\n", " Using cached numpy-2.0.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (60 kB)\n", "Collecting scikit-learn\n", " Using cached scikit_learn-1.5.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (13 kB)\n", "Collecting ydata_profiling\n", " Using cached ydata_profiling-4.12.0-py2.py3-none-any.whl.metadata (20 kB)\n", "Requirement already satisfied: scipy>=1.6.0 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from scikit-learn) (1.13.1)\n", "Requirement already satisfied: joblib>=1.2.0 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from scikit-learn) (1.4.2)\n", "Requirement already satisfied: threadpoolctl>=3.1.0 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from scikit-learn) (3.5.0)\n", "Collecting pandas!=1.4.0,<3,>1.1 (from ydata_profiling)\n", " Using cached pandas-2.2.3-cp312-cp312-linux_x86_64.whl\n", "Requirement already satisfied: matplotlib<3.10,>=3.5 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from ydata_profiling) (3.9.2)\n", "Requirement already satisfied: pydantic>=2 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from ydata_profiling) (2.10.1)\n", "Requirement already satisfied: PyYAML<6.1,>=5.0.0 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from ydata_profiling) (6.0.2)\n", "Requirement already satisfied: jinja2<3.2,>=2.11.1 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from ydata_profiling) (3.1.4)\n", "Requirement already satisfied: visions<0.7.7,>=0.7.5 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from visions[type_image_path]<0.7.7,>=0.7.5->ydata_profiling) (0.7.6)\n", "Requirement already satisfied: htmlmin==0.1.12 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from ydata_profiling) (0.1.12)\n", "Requirement already satisfied: phik<0.13,>=0.11.1 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from ydata_profiling) (0.12.4)\n", "Requirement already satisfied: requests<3,>=2.24.0 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from ydata_profiling) (2.32.3)\n", "Requirement already satisfied: tqdm<5,>=4.48.2 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from ydata_profiling) (4.67.1)\n", "Requirement already satisfied: seaborn<0.14,>=0.10.1 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from ydata_profiling) (0.13.2)\n", "Requirement already satisfied: multimethod<2,>=1.4 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from ydata_profiling) (1.12)\n", "Requirement already satisfied: statsmodels<1,>=0.13.2 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from ydata_profiling) (0.14.4)\n", "Requirement already satisfied: typeguard<5,>=3 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from ydata_profiling) (4.4.1)\n", "Requirement already satisfied: imagehash==4.3.1 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from ydata_profiling) (4.3.1)\n", "Requirement already satisfied: wordcloud>=1.9.3 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from ydata_profiling) (1.9.4)\n", "Requirement already satisfied: dacite>=1.8 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from ydata_profiling) (1.8.1)\n", "Requirement already satisfied: numba<1,>=0.56.0 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from ydata_profiling) (0.60.0)\n", "Requirement already satisfied: PyWavelets in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from imagehash==4.3.1->ydata_profiling) (1.7.0)\n", "Requirement already satisfied: pillow in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from imagehash==4.3.1->ydata_profiling) (11.0.0)\n", "Requirement already satisfied: MarkupSafe>=2.0 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from jinja2<3.2,>=2.11.1->ydata_profiling) (3.0.2)\n", "Requirement already satisfied: contourpy>=1.0.1 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from matplotlib<3.10,>=3.5->ydata_profiling) (1.3.1)\n", "Requirement already satisfied: cycler>=0.10 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from matplotlib<3.10,>=3.5->ydata_profiling) (0.12.1)\n", "Requirement already satisfied: fonttools>=4.22.0 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from matplotlib<3.10,>=3.5->ydata_profiling) (4.55.0)\n", "Requirement already satisfied: kiwisolver>=1.3.1 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from matplotlib<3.10,>=3.5->ydata_profiling) (1.4.7)\n", "Requirement already satisfied: packaging>=20.0 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from matplotlib<3.10,>=3.5->ydata_profiling) (24.2)\n", "Requirement already satisfied: pyparsing>=2.3.1 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from matplotlib<3.10,>=3.5->ydata_profiling) (3.2.0)\n", "Requirement already satisfied: python-dateutil>=2.7 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from matplotlib<3.10,>=3.5->ydata_profiling) (2.9.0.post0)\n", "Requirement already satisfied: llvmlite<0.44,>=0.43.0dev0 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from numba<1,>=0.56.0->ydata_profiling) (0.43.0)\n", "Requirement already satisfied: pytz>=2020.1 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from pandas!=1.4.0,<3,>1.1->ydata_profiling) (2024.1)\n", "Requirement already satisfied: tzdata>=2022.7 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from pandas!=1.4.0,<3,>1.1->ydata_profiling) (2024.2)\n", "Requirement already satisfied: annotated-types>=0.6.0 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from pydantic>=2->ydata_profiling) (0.7.0)\n", "Requirement already satisfied: pydantic-core==2.27.1 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from pydantic>=2->ydata_profiling) (2.27.1)\n", "Requirement already satisfied: typing-extensions>=4.12.2 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from pydantic>=2->ydata_profiling) (4.12.2)\n", "Requirement already satisfied: charset-normalizer<4,>=2 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from requests<3,>=2.24.0->ydata_profiling) (3.4.0)\n", "Requirement already satisfied: idna<4,>=2.5 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from requests<3,>=2.24.0->ydata_profiling) (3.10)\n", "Requirement already satisfied: urllib3<3,>=1.21.1 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from requests<3,>=2.24.0->ydata_profiling) (2.2.3)\n", "Requirement already satisfied: certifi>=2017.4.17 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from requests<3,>=2.24.0->ydata_profiling) (2024.8.30)\n", "Requirement already satisfied: patsy>=0.5.6 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from statsmodels<1,>=0.13.2->ydata_profiling) (1.0.1)\n", "Requirement already satisfied: attrs>=19.3.0 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from visions<0.7.7,>=0.7.5->visions[type_image_path]<0.7.7,>=0.7.5->ydata_profiling) (24.2.0)\n", "Requirement already satisfied: networkx>=2.4 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from visions<0.7.7,>=0.7.5->visions[type_image_path]<0.7.7,>=0.7.5->ydata_profiling) (3.4.2)\n", "Requirement already satisfied: six>=1.5 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from python-dateutil>=2.7->matplotlib<3.10,>=3.5->ydata_profiling) (1.16.0)\n", "Using cached numpy-2.0.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (19.0 MB)\n", "Using cached scikit_learn-1.5.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.9 MB)\n", "Using cached ydata_profiling-4.12.0-py2.py3-none-any.whl (390 kB)\n", "Installing collected packages: numpy, pandas, scikit-learn, ydata_profiling\n", "Successfully installed numpy-2.0.0 pandas-2.2.3 scikit-learn-1.5.2 ydata_profiling-4.12.0\n" ] } ], "source": [ "#!conda update numpy\n", "#!conda uninstall -y numpy scikit-learn\n", "#!conda install numpy==1.14.5 --yes\n", "#!conda install -y numpy\n", "#!conda update numpy\n", "#!conda install -y numpy==1.26.4\n", "#!pip install numpy==1.26.4 --no-binary :all:\n", "#!pip install numpy\n", "!pip uninstall -y numpy scikit-learn pandas ydata_profiling\n", "#!pip install numpy==1.21.4 pandas ydata_profiling\n", "!pip install --no-binary pandas numpy==2.0 scikit-learn ydata_profiling" ] }, { "cell_type": "code", "execution_count": 7, "id": "fb9bfe02-64b3-47e6-94f0-988cc3d18fd0", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "# packages in environment at /home/ir_inf/anaconda3/envs/pandas_profiling:\n", "#\n", "# Name Version Build Channel\n", "numpy 2.0.0 pypi_0 pypi\n", "# packages in environment at /home/ir_inf/anaconda3/envs/pandas_profiling:\n", "#\n", "# Name Version Build Channel\n", "pandas 2.2.3 pypi_0 pypi\n", "# packages in environment at /home/ir_inf/anaconda3/envs/pandas_profiling:\n", "#\n", "# Name Version Build Channel\n", "scikit-learn 1.5.2 pypi_0 pypi\n", "# packages in environment at /home/ir_inf/anaconda3/envs/pandas_profiling:\n", "#\n", "# Name Version Build Channel\n" ] } ], "source": [ "!conda list numpy\n", "!conda list pandas\n", "!conda list scikit-learn\n", "!conda list ydata_profiling" ] }, { "cell_type": "code", "execution_count": 8, "id": "5e8199f7-2919-49a3-9bd0-a2bfb3e98c87", "metadata": {}, "outputs": [], "source": [ "from sklearn.model_selection import train_test_split" ] }, { "cell_type": "code", "execution_count": 9, "id": "0a85a566-0821-4a5b-90b7-d0c473d8f98f", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: numba in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (0.60.0)\n", "Requirement already satisfied: numpy in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (2.0.0)\n", "Requirement already satisfied: llvmlite<0.44,>=0.43.0dev0 in /home/ir_inf/anaconda3/envs/pandas_profiling/lib/python3.12/site-packages (from numba) (0.43.0)\n", "Note: you may need to restart the kernel to use updated packages.\n" ] } ], "source": [ "pip install --dry-run numba numpy" ] }, { "cell_type": "code", "execution_count": 18, "id": "1df20cae-78dc-415b-b729-c911f4e1d745", "metadata": {}, "outputs": [], "source": [ "from ydata_profiling.utils.cache import cache_file" ] }, { "cell_type": "code", "execution_count": 19, "id": "ded35658-8e41-46be-b7fa-6bfed3623b40", "metadata": {}, "outputs": [], "source": [ "file_name = cache_file(\n", " \"pokemon.csv\",\n", " \"https://raw.githubusercontent.com/bryanpaget/html/main/pokemon.csv\"\n", ")" ] }, { "cell_type": "code", "execution_count": 20, "id": "82e8f89c-b76f-4385-b04c-92cffd776d4f", "metadata": {}, "outputs": [], "source": [ "pokemon_df = pd.read_csv(file_name)" ] }, { "cell_type": "code", "execution_count": 13, "id": "6c9f38e3-63c4-4811-a967-8e447352cb62", "metadata": {}, "outputs": [], "source": [ "X = pokemon_df[['Total', 'HP', 'Attack', 'Defense', 'Sp. Atk', 'Sp. Def', 'Speed']]\n", "y = pokemon_df[['Type 1', 'Type 2']]" ] }, { "cell_type": "code", "execution_count": 14, "id": "42f629b9-6428-41b1-87de-f287cde1f454", "metadata": {}, "outputs": [], "source": [ "X_train, X_test, y_train, y_test = train_test_split(\n", " X, y, test_size=0.33, random_state=42)" ] }, { "cell_type": "code", "execution_count": 15, "id": "7b7d0e61-f416-4a82-9ea5-b6dd4d9c0f81", "metadata": {}, "outputs": [], "source": [ "train_df = X_train\n", "train_report = ProfileReport(train_df, title=\"Train\")" ] }, { "cell_type": "code", "execution_count": 16, "id": "24b3041b-8197-441a-afc8-0e445af09408", "metadata": {}, "outputs": [], "source": [ "test_df = X_test\n", "test_report = ProfileReport(test_df, title=\"Test\")" ] }, { "cell_type": "code", "execution_count": 17, "id": "850958cc-dace-4895-b989-cd10f53dc57f", "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "e262b96acab247abb835eb9f5b13e999", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Summarize dataset: 0%| | 0/5 [00:00" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "file_name = cache_file(\n", " \"msft.csv\",\n", " \"https://raw.githubusercontent.com/bryanpaget/html/main/msft.csv\"\n", ")\n", "\n", "msft_df = pd.read_csv(file_name)\n", "msft_df[\"Date\"] = pd.to_datetime(msft_df[\"Date\"])\n", "\n", "# Enable tsmode to True to automatically identify time-series variables\n", "# Provide the column name that provides the chronological order of your time-series\n", "profile = ProfileReport(msft_df, tsmode=True, sortby=\"Date\", title=\"Time-Series EDA\")\n", "\n", "profile.to_file(\"msft-report-timeseries.html\")\n", "#profile" ] }, { "cell_type": "markdown", "id": "d2b77bab-b0f9-4ca5-b770-ef3cad7c4ebc", "metadata": {}, "source": [ "## Referencias" ] }, { "cell_type": "markdown", "id": "22daaf24-f466-4c5f-9ce3-369f5ef5eef3", "metadata": {}, "source": [ "* [Examples](https://docs.profiling.ydata.ai/latest/getting-started/examples/): Ejemplos enlazados en la Web oficial. [Titanic ](https://colab.research.google.com/github/ydataai/ydata-profiling/blob/master/examples/titanic/titanic.ipynb) Colab.\n", "* Medium [YData Profiling: Streamlining Data Analysis](https://bryanpaget.medium.com/ydata-profiling-71b23ef5ff07)\n", "* DZone [Pandas One Line Magical Code for EDA: Pandas Profile Report](https://dzone.com/articles/pandas-one-line-magical-code-for-eda-pandas-profil) -- [Notebook](https://github.com/skappal7/Data-Science-Projects-Prototypes/blob/master/NewData%20(EDA%20-%20Pandas%20Profile%20Report).ipynb)\n", "* Kaggle: [Notebooks using ydata-profiling (previously cally pandas-profiling)](https://www.kaggle.com/search?q=ydata-profiling) (100+ notebooks)\n", " * [https://www.kaggle.com/code/waalbannyantudre/ydata-profiling-tutorial-quick-efficient-eda](https://www.kaggle.com/code/waalbannyantudre/ydata-profiling-tutorial-quick-efficient-eda)\n", " * [https://www.kaggle.com/code/anshtanwar/ydata-profiling-startups-india](https://www.kaggle.com/code/anshtanwar/ydata-profiling-startups-india)\n", " * [https://www.kaggle.com/code/anshtanwar/food-price-inflation-import-and-eda](https://www.kaggle.com/code/anshtanwar/food-price-inflation-import-and-eda)" ] }, { "cell_type": "code", "execution_count": null, "id": "12023ab3-1dea-456b-9253-d8bfa192b1d9", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.7" } }, "nbformat": 4, "nbformat_minor": 5 }