From b182c598ae42bb75cec32e5dd5b93d1341488286 Mon Sep 17 00:00:00 2001 From: Sarah Nabelsi Date: Mon, 23 Jan 2023 10:05:57 -0800 Subject: [PATCH 1/4] adding 03_02 files --- 03_02/03_02 Read from CSV.ipynb | 434 ++++++++++++++++++++++++++++++++ 1 file changed, 434 insertions(+) create mode 100644 03_02/03_02 Read from CSV.ipynb diff --git a/03_02/03_02 Read from CSV.ipynb b/03_02/03_02 Read from CSV.ipynb new file mode 100644 index 0000000..704c6e6 --- /dev/null +++ b/03_02/03_02 Read from CSV.ipynb @@ -0,0 +1,434 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Credit Card Retention Analysis" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Dataset Description" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "- A manager at the bank is disturbed by more and more customers leaving their credit card services. They would really like to understand which characteristics indicate that a customer is going to churn, so they can proactively reach out to those customers with better services and turn their decisions in the opposite direction.\n", + "\n", + "- This dataset consists of 10,000 customers, including their age, salary, marital status, credit card limit, credit card category, etc. There are nearly 18 features.\n", + "\n", + "- 16.07% of customers have churned.\n", + "\n", + "- [Dataset link](https://www.kaggle.com/datasets/whenamancodes/credit-card-customers-prediction)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "***" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In order to read in a csv into python, we will be leveraging the Pandas library. Any package we want to use in Python will need an import statement. In addition to pandas which we will import using `import pandas as pd`, we will also import matplotlib and seaborn (libraries used for visualization) and numpy (a library for array manipulation)."
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Imports" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "import pandas as pd\n", + "import matplotlib.pyplot as plt\n", + "import seaborn as sns\n", + "import numpy as np\n", + "import plotly.graph_objs as go\n", + "from plotly.offline import iplot\n", + "sns.set()\n", + "pd.options.display.max_columns = 999" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Reading in Dataset" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The method we will be using is `pd.read_csv`, which implies we will be reading a comma-separated value file. You can see this in the defaults for this method: type `help(pd.read_csv)` and you will see that the separator is set to `,`, along with other helpful defaults like `header='infer'`. You can read through the rest to get familiar with parameters you may need to set differently from the defaults. \n", + "\n", + "If you type `pd.read` and then press `tab`, you will see the other methods available to you out of the box to read in files. Examples: `pd.read_excel`, `pd.read_pickle`, `pd.read_json`" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "help(pd.read_csv)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# pd.read" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The next step is to know where the csv file is located relative to your Python script. In this case, I've created a folder called `data/` that I will use to store any input data files. To read in the file, I will just pass the file path into the parentheses and take a look at the output."
+ ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [], + "source": [ + "data = pd.read_csv('../data/BankChurners_v2.csv')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The next steps I always take after reading in my file are:\n", + "\n", + "1) `data.shape` to see the size of the dataset. The size will help me decide on how to manage working with the dataset if it happens to be large. Here we see this dataset has **10K+** rows of customer data and **23** columns describing the behavior of those customers.\n", + "\n", + "2) `data.head()` to see the top of the dataset and make any changes like renaming column names. The default will show the top 5 rows, but you can pass through any number you like (10, 25, etc.)\n", + "\n", + "3) `data.columns` to see all of the column names\n", + "\n", + "Let's do that here." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "(10127, 23)" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "data.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
CLIENTNUMAttrition_FlagCustomer_AgeGenderDependent_countEducation_LevelMarital_StatusIncome_CategoryCard_CategoryMonths_on_bookTotal_Relationship_CountMonths_Inactive_12_monContacts_Count_12_monCredit_LimitTotal_Revolving_BalAvg_Open_To_BuyTotal_Amt_Chng_Q4_Q1Total_Trans_AmtTotal_Trans_CtTotal_Ct_Chng_Q4_Q1Avg_Utilization_RatioNaive_Bayes_Classifier_Attrition_Flag_Card_Category_Contacts_Count_12_mon_Dependent_count_Education_Level_Months_Inactive_12_mon_1Naive_Bayes_Classifier_Attrition_Flag_Card_Category_Contacts_Count_12_mon_Dependent_count_Education_Level_Months_Inactive_12_mon_2
090032Existing Customer45M3High SchoolMarried$60K - $80KBlue3951312691.077711914.01.3351144421.6250.0610.0000930.99991
190033Existing Customer49F5GraduateSingleLess than $40KBlue446128256.08647392.01.5411291333.7140.1050.0000570.99994
290034Existing Customer51M3GraduateMarried$80K - $120KBlue364103418.003418.02.5941887202.3330.0000.0000210.99998
390035Existing Customer40F4High SchoolNaNLess than $40KBlue343413313.02517796.01.4051171202.3330.7600.0001340.99987
490036Existing Customer40M3UneducatedMarried$60K - $80KBlue215104716.004716.02.175816282.5000.0000.0000220.99998
\n", + "
" + ], + "text/plain": [ + " CLIENTNUM Attrition_Flag Customer_Age Gender Dependent_count \\\n", + "0 90032 Existing Customer 45 M 3 \n", + "1 90033 Existing Customer 49 F 5 \n", + "2 90034 Existing Customer 51 M 3 \n", + "3 90035 Existing Customer 40 F 4 \n", + "4 90036 Existing Customer 40 M 3 \n", + "\n", + " Education_Level Marital_Status Income_Category Card_Category \\\n", + "0 High School Married $60K - $80K Blue \n", + "1 Graduate Single Less than $40K Blue \n", + "2 Graduate Married $80K - $120K Blue \n", + "3 High School NaN Less than $40K Blue \n", + "4 Uneducated Married $60K - $80K Blue \n", + "\n", + " Months_on_book Total_Relationship_Count Months_Inactive_12_mon \\\n", + "0 39 5 1 \n", + "1 44 6 1 \n", + "2 36 4 1 \n", + "3 34 3 4 \n", + "4 21 5 1 \n", + "\n", + " Contacts_Count_12_mon Credit_Limit Total_Revolving_Bal Avg_Open_To_Buy \\\n", + "0 3 12691.0 777 11914.0 \n", + "1 2 8256.0 864 7392.0 \n", + "2 0 3418.0 0 3418.0 \n", + "3 1 3313.0 2517 796.0 \n", + "4 0 4716.0 0 4716.0 \n", + "\n", + " Total_Amt_Chng_Q4_Q1 Total_Trans_Amt Total_Trans_Ct Total_Ct_Chng_Q4_Q1 \\\n", + "0 1.335 1144 42 1.625 \n", + "1 1.541 1291 33 3.714 \n", + "2 2.594 1887 20 2.333 \n", + "3 1.405 1171 20 2.333 \n", + "4 2.175 816 28 2.500 \n", + "\n", + " Avg_Utilization_Ratio \\\n", + "0 0.061 \n", + "1 0.105 \n", + "2 0.000 \n", + "3 0.760 \n", + "4 0.000 \n", + "\n", + " Naive_Bayes_Classifier_Attrition_Flag_Card_Category_Contacts_Count_12_mon_Dependent_count_Education_Level_Months_Inactive_12_mon_1 \\\n", + "0 0.000093 \n", + "1 0.000057 \n", + "2 0.000021 \n", + "3 0.000134 \n", + "4 0.000022 \n", + "\n", + " Naive_Bayes_Classifier_Attrition_Flag_Card_Category_Contacts_Count_12_mon_Dependent_count_Education_Level_Months_Inactive_12_mon_2 \n", + "0 0.99991 \n", + "1 0.99994 \n", + "2 0.99998 \n", + "3 0.99987 \n", + "4 0.99998 " + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "data.head()" + ] + }, + { + 
"cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.5" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} From 5e41a933c0678188519d4ebdd16fd76675735a7b Mon Sep 17 00:00:00 2001 From: Sarah Nabelsi Date: Mon, 23 Jan 2023 10:49:08 -0800 Subject: [PATCH 2/4] adding beginning file' --- 03_02/03_02 Read from CSV [Begin].ipynb | 162 ++++++++++++++++++++++++ 03_02/03_02 Read from CSV.ipynb | 7 - 2 files changed, 162 insertions(+), 7 deletions(-) create mode 100644 03_02/03_02 Read from CSV [Begin].ipynb diff --git a/03_02/03_02 Read from CSV [Begin].ipynb b/03_02/03_02 Read from CSV [Begin].ipynb new file mode 100644 index 0000000..24312ad --- /dev/null +++ b/03_02/03_02 Read from CSV [Begin].ipynb @@ -0,0 +1,162 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Credit Card Retention Analysis" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Dataset Description" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "- A manager at the bank is disturbed with more and more customers leaving their credit card services. They would really like to understand what characteristics lend themselves to someone who is going to churn so they can proactively go to the customer to provide them better services and turn customers' decisions in the opposite direction.\n", + "\n", + "- This dataset consists of 10,000 customers mentioning their age, salary, marital_status, credit card limit, credit card category, etc. 
There are nearly 18 features.\n", + "\n", + "- 16.07% of customers have churned.\n", + "\n", + "- [Dataset link](https://www.kaggle.com/datasets/whenamancodes/credit-card-customers-prediction)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "***" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In order to read in a csv into python, we will be leveraging the Pandas library. Any package we want to use in Python will need an import statement. In addition to pandas which we will import using `import pandas as pd`, we will also import matplotlib and seaborn (libraries used for visualization) and numpy (a library for array manipulation)." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Imports" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "import pandas as pd\n", + "import matplotlib.pyplot as plt\n", + "import seaborn as sns\n", + "import numpy as np\n", + "import plotly.graph_objs as go\n", + "from plotly.offline import iplot\n", + "sns.set()\n", + "pd.options.display.max_columns = 999" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Reading in Dataset" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The method we will be using is `pd.read_csv`, which implies we will be reading a comma-separated value file. You can see this in the defaults for this method: type `help(pd.read_csv)` and you will see that the separator is set to `,`, along with other helpful defaults like `header='infer'`. You can read through the rest to get familiar with parameters you may need to set differently from the defaults. \n", + "\n", + "If you type `pd.read` and then press `tab`, you will see the other methods available to you out of the box to read in files. 
Examples: `pd.read_excel`, `pd.read_pickle`, `pd.read_json`" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The next step is to know where the csv file is located relative to your Python script. In this case, I've created a folder called `data/` that I will use to store any input data files. To read in the file, I will just pass the file path into the parentheses and take a look at the output." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The next steps I always take after reading in my file are:\n", + "\n", + "1) `data.shape` to see the size of the dataset. The size will help me decide on how to manage working with the dataset if it happens to be large. Here we see this dataset has **10K+** rows of customer data and **23** columns describing the behavior of those customers.\n", + "\n", + "2) `data.head()` to see the top of the dataset and make any changes like renaming column names. The default will show the top 5 rows, but you can pass through any number you like (10, 25, etc.)\n", + "\n", + "3) `data.columns` to see all of the column names\n", + "\n", + "Let's do that here." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.5" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/03_02/03_02 Read from CSV.ipynb b/03_02/03_02 Read from CSV.ipynb index 704c6e6..cdf3712 100644 --- a/03_02/03_02 Read from CSV.ipynb +++ b/03_02/03_02 Read from CSV.ipynb @@ -401,13 +401,6 @@ "source": [ "data.head()" ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] } ], "metadata": { From a2de0404cf2abd8c39d26a3320747780e973eb35 Mon Sep 17 00:00:00 2001 From: Sarah Nabelsi Date: Tue, 24 Jan 2023 05:39:34 -0800 Subject: [PATCH 3/4] modify final --- 03_02/03_02 Read from CSV [Begin].ipynb | 33 +++---------------------- 1 file changed, 3 insertions(+), 30 deletions(-) diff --git a/03_02/03_02 Read from CSV [Begin].ipynb b/03_02/03_02 Read from CSV [Begin].ipynb index 24312ad..f036f17 100644 --- a/03_02/03_02 Read from CSV [Begin].ipynb +++ b/03_02/03_02 Read from CSV [Begin].ipynb @@ -34,13 +34,6 @@ "***" ] }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "In order to read in a csv into python, we will be leveraging the Pandas library. Any package we want to use in Python will need an import statement. In addition to pandas which we will import using `import pandas as pd`, we will also import matplotlib and seaborn (libraries used for visualization) and numpy (a library for array manipulation)." 
- ] - }, { "cell_type": "markdown", "metadata": {}, @@ -71,15 +64,6 @@ "## Reading in Dataset" ] }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The method we will be using is `pd.read_csv`, which implies we will be reading a comma-separated value file. You can see this in the defaults for this method: type `help(pd.read_csv)` and you will see that the separator is set to `,`, along with other helpful defaults like `header='infer'`. You can read through the rest to get familiar with parameters you may need to set differently from the defaults. \n", - "\n", - "If you type `pd.read` and then press `tab`, you will see the other methods available to you out of the box to read in files. Examples: `pd.read_excel`, `pd.read_pickle`, `pd.read_json`" - ] - }, { "cell_type": "code", "execution_count": null, @@ -94,13 +78,6 @@ "outputs": [], "source": [] }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The next step is to know where the csv file is located relative to your Python script. In this case, I've created a folder called `data/` that I will use to store any input data files. To read in the file, I will just pass the file path into the parentheses and take a look at the output." - ] - }, { "cell_type": "code", "execution_count": null, @@ -112,15 +89,11 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The next steps I always take after reading in my file are:\n", - "\n", - "1) `data.shape` to see the size of the dataset. The size will help me decide on how to manage working with the dataset if it happens to be large. Here we see this dataset has **10K+** rows of customer data and **23** columns describing the behavior of those customers.\n", - "\n", - "2) `data.head()` to see the top of the dataset and make any changes like renaming column names. 
The default will show the top 5 rows, but you can pass through any number you like (10, 25, etc.)\n", + "1) `data.shape` \n", "\n", - "3) `data.columns` to see all of the column names\n", + "2) `data.head()` \n", "\n", - "Let's do that here." + "3) `data.columns` " ] }, { From 3cf3dca3a9cc477c4698e2957ce45ef0bc3f59c4 Mon Sep 17 00:00:00 2001 From: MAhsan89 <152721796+MAhsan89@users.noreply.github.com> Date: Fri, 30 Aug 2024 02:54:03 +0500 Subject: [PATCH 4/4] Update 03_02 Read from CSV.ipynb
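Taken together, the notebooks in this patch series walk through a single workflow: `pd.read_csv` to load the file, then `data.shape`, `data.head()`, and `data.columns` for a first look. A minimal, self-contained sketch of that workflow is below. The inline CSV is a hypothetical stand-in for `../data/BankChurners_v2.csv`, so its columns and row count are illustrative only, not the real 10,127 x 23 dataset:

```python
import io

import pandas as pd

# Inline stand-in for '../data/BankChurners_v2.csv' (illustrative
# columns and values only, not the real Kaggle data).
csv_text = """CLIENTNUM,Attrition_Flag,Customer_Age,Gender
90032,Existing Customer,45,M
90033,Existing Customer,49,F
90034,Attrited Customer,51,M
"""

# pd.read_csv accepts a file path or any file-like object; its defaults
# (sep=',', header='infer') match an ordinary comma-separated file.
data = pd.read_csv(io.StringIO(csv_text))

# The three first-look steps from the notebooks:
print(data.shape)          # (rows, columns)
print(data.head(2))        # top rows; head() defaults to 5
print(list(data.columns))  # all column names
```

Reading from `io.StringIO` goes through the same parsing path as reading from a path on disk, so swapping in the real `'../data/BankChurners_v2.csv'` changes nothing else in the snippet.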