{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "k36y7v9ndkzu" }, "source": [ "# Ejercicio \n", "\n", "El fichero [titanic.csv](https://raw.githubusercontent.com/asalber/asalber.github.io/master/python/ejercicios/soluciones/pandas/titanic.csv) contiene información sobre los pasajeros del Titanic. Escribir un programa con los siguientes requisitos:\n", "\n", "1. Generar un DataFrame con los datos del fichero.\n", "2. Mostrar por pantalla las dimensiones del DataFrame, el número de datos que contiene, los nombres de sus columnas y filas, los tipos de datos de las columnas, las 10 primeras filas y las 10 últimas filas.\n", "3. Mostrar por pantalla los datos del pasajero con identificador 148.\n", "3. Mostrar por pantalla las filas pares del DataFrame.\n", "4. Mostrar por pantalla los nombres de las personas que iban en primera clase ordenadas alfabéticamente.\n", "5. Mostrar por pantalla el porcentaje de personas que sobrevivieron y murieron.\n", "5. Mostrar por pantalla el porcentaje de personas que sobrevivieron en cada clase.\n", "6. Eliminar del DataFrame los pasajeros con edad desconocida.\n", "7. Mostrar por pantalla la edad media de las mujeres que viajaban en cada clase.\n", "8. Añadir una nueva columna booleana para ver si el pasajero era menor de edad o no.\n", "9. Mostrar por pantalla el porcentaje de menores y mayores de edad que sobrevivieron en cada clase.\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "4bqJ3Wm0dkzz" }, "source": [ "# Solución" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "4XL3aRZ_dkz1", "outputId": "b16d3afe-5973-4b0c-dd5d-2c432ae593e1" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": "Survived Pclass \\\nPassengerId \n1 0 3 \n2 1 1 \n3 1 3 \n4 1 1 \n5 0 3 \n... ... ... \n887 0 2 \n888 1 1 \n889 0 3 \n890 1 1 \n891 0 3 \n\n Name Sex Age \\\nPassengerId \n1 Braund, Mr. Owen Harris male 22.0 \n2 Cumings, Mrs. John Bradley (Florence Briggs Th... female 38.0 \n3 Heikkinen, Miss. Laina female 26.0 \n4 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 \n5 Allen, Mr. William Henry male 35.0 \n... ... ... ... \n887 Montvila, Rev. Juozas male 27.0 \n888 Graham, Miss. Margaret Edith female 19.0 \n889 Johnston, Miss. Catherine Helen \"Carrie\" female NaN \n890 Behr, Mr. Karl Howell male 26.0 \n891 Dooley, Mr. Patrick male 32.0 \n\n SibSp Parch Ticket Fare Cabin Embarked \nPassengerId \n1 1 0 A/5 21171 7.2500 NaN S \n2 1 0 PC 17599 71.2833 C85 C \n3 0 0 STON/O2. 3101282 7.9250 NaN S \n4 1 0 113803 53.1000 C123 S \n5 0 0 373450 8.0500 NaN S \n... ... ... ... ... ... ... \n887 0 0 211536 13.0000 NaN S \n888 0 0 112053 30.0000 B42 S \n889 1 2 W./C. 6607 23.4500 NaN S \n890 0 0 111369 30.0000 C148 C \n891 0 0 370376 7.7500 NaN Q \n\n[891 rows x 11 columns]\n" } ], "source": [ "import pandas as pd\n", "\n", "# Generar un DataFrame con los datos del fichero.\n", "titanic = pd.read_csv('https://raw.githubusercontent.com/asalber/asalber.github.io/master/python/ejercicios/soluciones/pandas/titanic.csv', index_col=0)\n", "\n", "print(titanic)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "MOwQNn_Edkz4", "outputId": "177d5f0d-2182-49bb-913b-a2890e6d36c9" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": "Dimensiones: (891, 11)\nNúmero de elemntos: 9801\nNombres de columnas: Index(['Survived', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp', 'Parch', 'Ticket',\n 'Fare', 'Cabin', 'Embarked'],\n dtype='object')\nNombres de filas: Int64Index([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,\n ...\n 882, 883, 884, 885, 886, 887, 888, 889, 890, 891],\n dtype='int64', name='PassengerId', length=891)\nTipos de datos:\n Survived int64\nPclass int64\nName object\nSex object\nAge float64\nSibSp int64\nParch int64\nTicket object\nFare float64\nCabin object\nEmbarked object\ndtype: object\nPrimeras 10 filas:\n Survived Pclass \\\nPassengerId \n1 0 3 \n2 1 1 \n3 1 3 \n4 1 1 \n5 0 3 \n6 0 3 \n7 0 1 \n8 0 3 \n9 1 3 \n10 1 2 \n\n Name Sex Age \\\nPassengerId \n1 Braund, Mr. Owen Harris male 22.0 \n2 Cumings, Mrs. John Bradley (Florence Briggs Th... female 38.0 \n3 Heikkinen, Miss. Laina female 26.0 \n4 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 \n5 Allen, Mr. William Henry male 35.0 \n6 Moran, Mr. James male NaN \n7 McCarthy, Mr. Timothy J male 54.0 \n8 Palsson, Master. Gosta Leonard male 2.0 \n9 Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg) female 27.0 \n10 Nasser, Mrs. Nicholas (Adele Achem) female 14.0 \n\n SibSp Parch Ticket Fare Cabin Embarked \nPassengerId \n1 1 0 A/5 21171 7.2500 NaN S \n2 1 0 PC 17599 71.2833 C85 C \n3 0 0 STON/O2. 3101282 7.9250 NaN S \n4 1 0 113803 53.1000 C123 S \n5 0 0 373450 8.0500 NaN S \n6 0 0 330877 8.4583 NaN Q \n7 0 0 17463 51.8625 E46 S \n8 3 1 349909 21.0750 NaN S \n9 0 2 347742 11.1333 NaN S \n10 1 0 237736 30.0708 NaN C \nÚltimas 10 filas:\n Survived Pclass Name \\\nPassengerId \n882 0 3 Markun, Mr. Johann \n883 0 3 Dahlberg, Miss. Gerda Ulrika \n884 0 2 Banfield, Mr. Frederick James \n885 0 3 Sutehall, Mr. Henry Jr \n886 0 3 Rice, Mrs. William (Margaret Norton) \n887 0 2 Montvila, Rev. Juozas \n888 1 1 Graham, Miss. Margaret Edith \n889 0 3 Johnston, Miss. Catherine Helen \"Carrie\" \n890 1 1 Behr, Mr. Karl Howell \n891 0 3 Dooley, Mr. Patrick \n\n Sex Age SibSp Parch Ticket Fare Cabin \\\nPassengerId \n882 male 33.0 0 0 349257 7.8958 NaN \n883 female 22.0 0 0 7552 10.5167 NaN \n884 male 28.0 0 0 C.A./SOTON 34068 10.5000 NaN \n885 male 25.0 0 0 SOTON/OQ 392076 7.0500 NaN \n886 female 39.0 0 5 382652 29.1250 NaN \n887 male 27.0 0 0 211536 13.0000 NaN \n888 female 19.0 0 0 112053 30.0000 B42 \n889 female NaN 1 2 W./C. 6607 23.4500 NaN \n890 male 26.0 0 0 111369 30.0000 C148 \n891 male 32.0 0 0 370376 7.7500 NaN \n\n Embarked \nPassengerId \n882 S \n883 S \n884 S \n885 S \n886 Q \n887 S \n888 S \n889 S \n890 C \n891 Q \n" } ], "source": [ "# Mostrar por pantalla las dimensiones del DataFrame, el número de datos que contiene, los nombres de sus columnas y filas, los tipos de datos de las columnas, las 10 primeras filas y las 10 últimas filas.\n", "print('Dimensiones:', titanic.shape)\n", "print('Número de elemntos:', titanic.size)\n", "print('Nombres de columnas:', titanic.columns)\n", "print('Nombres de filas:', titanic.index)\n", "print('Tipos de datos:\\n', titanic.dtypes)\n", "print('Primeras 10 filas:\\n', titanic.head(10))\n", "print('Últimas 10 filas:\\n', titanic.tail(10))\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "Z4d1rGIUdkz6", "outputId": "19f8d317-0878-42ed-8fa5-b877456e3cfa" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": "Survived 0\nPclass 3\nName Ford, Miss. Robina Maggie \"Ruby\"\nSex female\nAge 9\nSibSp 2\nParch 2\nTicket W./C. 6608\nFare 34.375\nCabin NaN\nEmbarked S\nName: 148, dtype: object\n" } ], "source": [ "# Mostrar por pantalla los datos del pasajero con identificador 148\n", "print(titanic.loc[148])" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "UR1ARuC9dkz8", "outputId": "e980bf0e-a660-4149-e6dc-64b46be09775" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": "Survived Pclass \\\nPassengerId \n1 0 3 \n3 1 3 \n5 0 3 \n7 0 1 \n9 1 3 \n... ... ... \n883 0 3 \n885 0 3 \n887 0 2 \n889 0 3 \n891 0 3 \n\n Name Sex Age \\\nPassengerId \n1 Braund, Mr. Owen Harris male 22.0 \n3 Heikkinen, Miss. Laina female 26.0 \n5 Allen, Mr. William Henry male 35.0 \n7 McCarthy, Mr. Timothy J male 54.0 \n9 Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg) female 27.0 \n... ... ... ... \n883 Dahlberg, Miss. Gerda Ulrika female 22.0 \n885 Sutehall, Mr. Henry Jr male 25.0 \n887 Montvila, Rev. Juozas male 27.0 \n889 Johnston, Miss. Catherine Helen \"Carrie\" female NaN \n891 Dooley, Mr. Patrick male 32.0 \n\n SibSp Parch Ticket Fare Cabin Embarked \nPassengerId \n1 1 0 A/5 21171 7.2500 NaN S \n3 0 0 STON/O2. 3101282 7.9250 NaN S \n5 0 0 373450 8.0500 NaN S \n7 0 0 17463 51.8625 E46 S \n9 0 2 347742 11.1333 NaN S \n... ... ... ... ... ... ... \n883 0 0 7552 10.5167 NaN S \n885 0 0 SOTON/OQ 392076 7.0500 NaN S \n887 0 0 211536 13.0000 NaN S \n889 1 2 W./C. 6607 23.4500 NaN S \n891 0 0 370376 7.7500 NaN Q \n\n[446 rows x 11 columns]\n" } ], "source": [ "# Mostrar por pantalla las filas pares del DataFrame.\n", "print(titanic.iloc[range(0,titanic.shape[0],2)])" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "YLei6aqqdkz9", "outputId": "5f6727ff-5ac1-4b7d-8d6a-7f442c7707b5" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": "PassengerId\n731 Allen, Miss. Elisabeth Walton\n306 Allison, Master. Hudson Trevor\n298 Allison, Miss. Helen Loraine\n499 Allison, Mrs. Hudson J C (Bessie Waldo Daniels)\n461 Anderson, Mr. Harry\n ... \n156 Williams, Mr. Charles Duane\n352 Williams-Lambert, Mr. Fletcher Fellows\n56 Woolner, Mr. Hugh\n556 Wright, Mr. George\n326 Young, Miss. Marie Grice\nName: Name, Length: 216, dtype: object\n" } ], "source": [ "# Mostrar los nombres de las personas que iban en primera clase ordenadas alfabéticamente.\n", "print(titanic[titanic[\"Pclass\"]==1]['Name'].sort_values())" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "Lzi-dCEOdkz_", "outputId": "839a9505-fb89-41e7-cc60-f07b4dd78e93" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": "0 61.616162\n1 38.383838\nName: Survived, dtype: float64\n0 61.616162\n1 38.383838\nName: Survived, dtype: float64\n" } ], "source": [ "# Mostrar por pantalla el porcentaje de personas que sobrevivieron y murieron\n", "print(titanic['Survived'].value_counts()/titanic['Survived'].count() * 100)\n", "\n", "# Alternativa\n", "print(titanic['Survived'].value_counts(normalize=True) * 100)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "Mo_VS4oMdk0B", "outputId": "5610bc72-7c72-4248-83bc-c2a52704a0f7" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": "Pclass Survived\n1 1 0.629630\n 0 0.370370\n2 0 0.527174\n 1 0.472826\n3 0 0.757637\n 1 0.242363\nName: Survived, dtype: float64\n" } ], "source": [ "#Mostrar por pantalla el porcentaje de personas que sobrevivieron en cada clase\n", "print(titanic.groupby('Pclass')['Survived'].value_counts(normalize=True))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "xdDkclURdk0D", "outputId": "010abf29-16bd-4218-f74b-9fab50b4764f" }, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": " Survived Pclass \\\nPassengerId \n1 0 3 \n2 1 1 \n3 1 3 \n4 1 1 \n5 0 3 \n... ... ... \n886 0 3 \n887 0 2 \n888 1 1 \n890 1 1 \n891 0 3 \n\n Name Sex Age \\\nPassengerId \n1 Braund, Mr. Owen Harris male 22.0 \n2 Cumings, Mrs. John Bradley (Florence Briggs Th... female 38.0 \n3 Heikkinen, Miss. Laina female 26.0 \n4 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 \n5 Allen, Mr. William Henry male 35.0 \n... ... ... ... \n886 Rice, Mrs. William (Margaret Norton) female 39.0 \n887 Montvila, Rev. Juozas male 27.0 \n888 Graham, Miss. Margaret Edith female 19.0 \n890 Behr, Mr. Karl Howell male 26.0 \n891 Dooley, Mr. Patrick male 32.0 \n\n SibSp Parch Ticket Fare Cabin Embarked \nPassengerId \n1 1 0 A/5 21171 7.2500 NaN S \n2 1 0 PC 17599 71.2833 C85 C \n3 0 0 STON/O2. 3101282 7.9250 NaN S \n4 1 0 113803 53.1000 C123 S \n5 0 0 373450 8.0500 NaN S \n... ... ... ... ... ... ... \n886 0 5 382652 29.1250 NaN Q \n887 0 0 211536 13.0000 NaN S \n888 0 0 112053 30.0000 B42 S \n890 0 0 111369 30.0000 C148 C \n891 0 0 370376 7.7500 NaN Q \n\n[714 rows x 11 columns]", "text/html": "
\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
SurvivedPclassNameSexAgeSibSpParchTicketFareCabinEmbarked
PassengerId
103Braund, Mr. Owen Harrismale22.010A/5 211717.2500NaNS
211Cumings, Mrs. John Bradley (Florence Briggs Th...female38.010PC 1759971.2833C85C
313Heikkinen, Miss. Lainafemale26.000STON/O2. 31012827.9250NaNS
411Futrelle, Mrs. Jacques Heath (Lily May Peel)female35.01011380353.1000C123S
503Allen, Mr. William Henrymale35.0003734508.0500NaNS
....................................
88603Rice, Mrs. William (Margaret Norton)female39.00538265229.1250NaNQ
88702Montvila, Rev. Juozasmale27.00021153613.0000NaNS
88811Graham, Miss. Margaret Edithfemale19.00011205330.0000B42S
89011Behr, Mr. Karl Howellmale26.00011136930.0000C148C
89103Dooley, Mr. Patrickmale32.0003703767.7500NaNQ
\n

714 rows × 11 columns

\n
" }, "metadata": {}, "execution_count": 60 } ], "source": [ "# Eliminar del DataFrame los pasajeros con edad desconocida.\n", "titanic.dropna(subset=['Age'])\n", "\n", "# Alternativa \n", "# titanic = titanic[titanic['Age'].notna()]\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "CCcVo2pZdk0E", "outputId": "8f2f7ca1-1182-4ed4-bfb3-61887913889f" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": "Pclass\n1 34.611765\n2 28.722973\n3 21.750000\nName: female, dtype: float64\n" } ], "source": [ "# Mostrar la edad media de las mujeres que viajaban en cada clase.\n", "print(titanic.groupby(['Pclass','Sex'])['Age'].mean().unstack()['female'])" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "jZsSE1fzdk0G" }, "outputs": [], "source": [ "# Añadir una nueva columna booleana para ver si el pasajero era menor de edad o no.\n", "titanic['Young'] = titanic['Age'] < 18" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "su6PcRNHdk0H", "outputId": "787f7976-e597-459e-bf70-a0a854148ee8" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": "Pclass Young Survived\n1 False 1 61.274510\n 0 38.725490\n True 1 91.666667\n 0 8.333333\n2 False 0 59.006211\n 1 40.993789\n True 1 91.304348\n 0 8.695652\n3 False 0 78.208232\n 1 21.791768\n True 0 62.820513\n 1 37.179487\nName: Survived, dtype: float64\n" } ], "source": [ "# Mostrar el porcentaje de menores y mayores de edad que sobrevivieron en cada clase.\n", "print(titanic.groupby(['Pclass', 'Young'])['Survived'].value_counts(normalize = True) * 100)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "LxtusvKLdk0I" }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.6-final" }, "colab": { "provenance": [] } }, "nbformat": 4, "nbformat_minor": 0 }