From cd23d6f66f4dacafc6db4743da9db8a6c5c85dee Mon Sep 17 00:00:00 2001
From: vly12
Date: Fri, 15 Mar 2024 10:24:05 +0000
Subject: [PATCH 01/15] Had to input eval = FALSE into the second setup
codechunk to bypass it.
Pandas Module was not found.
---
intro.Rmd | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
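
Note (not part of the diff): the commit message says the chunk was disabled because the pandas module was not found. A small check along the following lines could confirm whether pandas is importable in the Python environment used for knitting. This is only a sketch; the helper name `module_available` is hypothetical and does not appear in the repository.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the top-level module `name` can be found for import."""
    return importlib.util.find_spec(name) is not None


# Hypothetical usage: report on pandas before relying on the setup chunk.
if module_available("pandas"):
    print("pandas is available")
else:
    print("pandas is missing; install it in the environment used for knitting")
```

If the check reports that pandas is missing, installing it in that environment may allow `eval = FALSE` to be dropped again.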
diff --git a/intro.Rmd b/intro.Rmd
index 753f7c0..5b9de4a 100644
--- a/intro.Rmd
+++ b/intro.Rmd
@@ -27,7 +27,7 @@ tutorial_options(
)
```
-```{python pandas-setup, include = FALSE}
+```{python pandas-setup, include = FALSE, eval = FALSE}
# Import pandas library
import pandas as pd
From 1ef6844ce193f18ef86cb416089417bfe7c40bbd Mon Sep 17 00:00:00 2001
From: vly12
Date: Fri, 15 Mar 2024 10:27:34 +0000
Subject: [PATCH 02/15] Changed some of the .png to .PNG in order to correctly
map to the image files.
---
intro.Rmd | 14 ++++++-------
intro.html | 58 +++++++++++++++++++++++++++---------------------------
2 files changed, 36 insertions(+), 36 deletions(-)
diff --git a/intro.Rmd b/intro.Rmd
index 5b9de4a..4abb271 100644
--- a/intro.Rmd
+++ b/intro.Rmd
@@ -87,7 +87,7 @@ You will see the interface looks like this. There are three main tabs in Jupyter
- **Clusters** For using IPython in parallel with your cluster (beyond the scope of this training guidance)
```{r startsession, fig.align='center', out.width="100%"}
-knitr::include_graphics("images/python-pwb-startsession.png")
+knitr::include_graphics("images/python-pwb-startsession.PNG")
```
@@ -99,7 +99,7 @@ knitr::include_graphics("images/python-pwb-newnb.png")
```
```{r blanknb, fig.align='center', out.width="100%"}
-knitr::include_graphics("images/python-pwb-blanknb.png")
+knitr::include_graphics("images/python-pwb-blanknb.PNG")
```
@@ -117,7 +117,7 @@ Jupyter notebook is a modal editor which means that the keyboard does different
- **Edit mode** - it is indicated by a green cell border and left sidebar, and a prompt showing in the editor area:
```{r editmode, fig.align='center', out.width="100%"}
-knitr::include_graphics("images/edit-mode.png")
+knitr::include_graphics("images/edit-mode.PNG")
```
When a cell is in edit mode, you can type things such as Python codes into the cell, like a normal text editor. Enter edit mode by pressing Enter or using the mouse to click on a cell’s editor area.
@@ -125,7 +125,7 @@ When a cell is in edit mode, you can type things such as Python codes into the c
- **Command mode** - Once you click somewhere else outside the cell or press “esc” on keyboard, the cell turns into Command mode. Command mode is indicated by a grey cell border and a blue sidebar:
```{r commandmode, fig.align='center', out.width="100%"}
-knitr::include_graphics("images/command-mode.png")
+knitr::include_graphics("images/command-mode.PNG")
```
When you are in command mode, you are able to edit the notebook as a whole, but not type into individual cells. Most importantly, in command mode, the keyboard is mapped to a set of shortcuts that let you perform notebook and cell actions efficiently. For example, if you are in command mode and you press C and V, you will copy and paste the current cell.
@@ -137,7 +137,7 @@ A full list of useful shortcuts is available by going to "Help > Keyboard Shortc
Markdown text cells support plain text, Markdown and HTML. It will be useful to create headings, text instructions etc using markdown to organise the notebook like a written document. A cell can be changed from code mode to markdown mode by going to the top menu bar "Cell > Cell Type > Markdown". Or select “Markdown” from the dropdown list:
```{r markdown, fig.align='center', out.width="100%"}
-knitr::include_graphics("images/markdown-menu.png")
+knitr::include_graphics("images/markdown-menu.PNG")
```
Or press M while in Command Mode and highlighting the cell.
@@ -145,13 +145,13 @@ Or press M while in Command Mode and highlighting the cell.
Here is an example of typing some text in a markdown cell. You can use hash key “#” to indicate the size of heading, followed by a space and the text.
```{r markdown-example1, fig.align='center', out.width="100%"}
-knitr::include_graphics("images/markdown-example1.png")
+knitr::include_graphics("images/markdown-example1.PNG")
```
Then press *Shift + Enter* to finish.
```{r markdown-example2, fig.align='center', out.width="100%"}
-knitr::include_graphics("images/markdown-example2.png")
+knitr::include_graphics("images/markdown-example2.PNG")
```
#### Python Library
diff --git a/intro.html b/intro.html
index 0d571bd..5482ea7 100644
--- a/intro.html
+++ b/intro.html
@@ -12,7 +12,7 @@
-
+
Intro to Python
@@ -182,14 +182,14 @@ Open a New Jupyter Notebook
Clusters For using IPython in parallel with your
cluster (beyond the scope of this training guidance)
-
+
To open a new Jupyter Notebook, click the “New” drop down menu on the
Files tab and select “Python 3.10.2” under the notebooks heading. This
will open a blank Notebook with an IPython console running underneath
it.

-
+
The IPython console is used to input and execute Python code
interactively. Outputs, errors and warning messages are directly shown
@@ -214,7 +214,7 @@
Command and Edit Modes
Edit mode - it is indicated by a green cell border
and left sidebar, and a prompt showing in the editor area:
-
+
When a cell is in edit mode, you can type things such as Python codes
into the cell, like a normal text editor. Enter edit mode by pressing
Enter or using the mouse to click on a cell’s editor area.
@@ -224,7 +224,7 @@ Command and Edit Modes
mode. Command mode is indicated by a grey cell border and a blue
sidebar:
-
+
When you are in command mode, you are able to edit the notebook as a
whole, but not type into individual cells. Most importantly, in command
mode, the keyboard is mapped to a set of shortcuts that let you perform
@@ -243,14 +243,14 @@
Markdown Text Cells
from code mode to markdown mode by going to the top menu bar “Cell >
Cell Type > Markdown”. Or select “Markdown” from the dropdown
list:
-
+
Or press M while in Command Mode and highlighting the cell.
Here is an example of typing some text in a markdown cell. You can
use hash key “#” to indicate the size of heading, followed by a space
and the text.
-
+
Then press Shift + Enter to finish.
-
+
Python Library
@@ -847,11 +847,11 @@ Feedback
")"), chunk_opts = list(label = "setup", include = FALSE)), setup = "# Import pandas library\nimport pandas as pd \n\nborders2 = pd.read_csv(\"data/borders_inc_age.csv\", usecols = ['URI', 'HospitalCode', 'Specialty']) ",
chunks = list(list(label = "pandas-setup", code = "# Import pandas library\nimport pandas as pd \n\nborders2 = pd.read_csv(\"data/borders_inc_age.csv\", usecols = ['URI', 'HospitalCode', 'Specialty']) ",
opts = list(label = "\"pandas-setup\"", include = "FALSE",
- engine = "\"python\""), engine = "python"), list(
- label = "csv-read-columns1", code = "# Read in the dataset with specific columns\nborders2 = pd.read_csv(\"data/borders_inc_age.csv\", usecols = ['URI', 'HospitalCode', 'Specialty']) \n\n# Check the first few rows of the dataset. Default is 5 rows. \nborders2.head() ",
- opts = list(label = "\"csv-read-columns1\"", exercise = "TRUE",
- exercise.setup = "\"pandas-setup\"", engine = "\"python\""),
- engine = "python")), code_check = NULL, error_check = NULL,
+ eval = "FALSE", engine = "\"python\""), engine = "python"),
+ list(label = "csv-read-columns1", code = "# Read in the dataset with specific columns\nborders2 = pd.read_csv(\"data/borders_inc_age.csv\", usecols = ['URI', 'HospitalCode', 'Specialty']) \n\n# Check the first few rows of the dataset. Default is 5 rows. \nborders2.head() ",
+ opts = list(label = "\"csv-read-columns1\"", exercise = "TRUE",
+ exercise.setup = "\"pandas-setup\"", engine = "\"python\""),
+ engine = "python")), code_check = NULL, error_check = NULL,
check = NULL, solution = NULL, tests = NULL, options = list(
eval = FALSE, echo = TRUE, results = "markup", tidy = FALSE,
tidy.opts = NULL, collapse = FALSE, prompt = FALSE, comment = NA,
@@ -900,11 +900,11 @@ Feedback
")"), chunk_opts = list(label = "setup", include = FALSE)), setup = "# Import pandas library\nimport pandas as pd \n\nborders2 = pd.read_csv(\"data/borders_inc_age.csv\", usecols = ['URI', 'HospitalCode', 'Specialty']) ",
chunks = list(list(label = "pandas-setup", code = "# Import pandas library\nimport pandas as pd \n\nborders2 = pd.read_csv(\"data/borders_inc_age.csv\", usecols = ['URI', 'HospitalCode', 'Specialty']) ",
opts = list(label = "\"pandas-setup\"", include = "FALSE",
- engine = "\"python\""), engine = "python"), list(
- label = "csv-read-columns2", code = "# Rearrange the columns\nborders3 = borders2[['URI', 'Specialty', 'HospitalCode']]\n\n# Check the first few rows of the dataset. Default is 5 rows. \nborders3.head()",
- opts = list(label = "\"csv-read-columns2\"", exercise = "TRUE",
- exercise.setup = "\"pandas-setup\"", engine = "\"python\""),
- engine = "python")), code_check = NULL, error_check = NULL,
+ eval = "FALSE", engine = "\"python\""), engine = "python"),
+ list(label = "csv-read-columns2", code = "# Rearrange the columns\nborders3 = borders2[['URI', 'Specialty', 'HospitalCode']]\n\n# Check the first few rows of the dataset. Default is 5 rows. \nborders3.head()",
+ opts = list(label = "\"csv-read-columns2\"", exercise = "TRUE",
+ exercise.setup = "\"pandas-setup\"", engine = "\"python\""),
+ engine = "python")), code_check = NULL, error_check = NULL,
check = NULL, solution = NULL, tests = NULL, options = list(
eval = FALSE, echo = TRUE, results = "markup", tidy = FALSE,
tidy.opts = NULL, collapse = FALSE, prompt = FALSE, comment = NA,
@@ -952,11 +952,11 @@ Feedback
")"), chunk_opts = list(label = "setup", include = FALSE)), setup = "# Import pandas library\nimport pandas as pd \n\nborders2 = pd.read_csv(\"data/borders_inc_age.csv\", usecols = ['URI', 'HospitalCode', 'Specialty']) ",
chunks = list(list(label = "pandas-setup", code = "# Import pandas library\nimport pandas as pd \n\nborders2 = pd.read_csv(\"data/borders_inc_age.csv\", usecols = ['URI', 'HospitalCode', 'Specialty']) ",
opts = list(label = "\"pandas-setup\"", include = "FALSE",
- engine = "\"python\""), engine = "python"), list(
- label = "delete-column", code = "# Delete URI column\ndel borders2[\"URI\"]\n\n# Check the first few rows of the dataset. Default is 5 rows. \nborders2.head()",
- opts = list(label = "\"delete-column\"", exercise = "TRUE",
- exercise.setup = "\"pandas-setup\"", engine = "\"python\""),
- engine = "python")), code_check = NULL, error_check = NULL,
+ eval = "FALSE", engine = "\"python\""), engine = "python"),
+ list(label = "delete-column", code = "# Delete URI column\ndel borders2[\"URI\"]\n\n# Check the first few rows of the dataset. Default is 5 rows. \nborders2.head()",
+ opts = list(label = "\"delete-column\"", exercise = "TRUE",
+ exercise.setup = "\"pandas-setup\"", engine = "\"python\""),
+ engine = "python")), code_check = NULL, error_check = NULL,
check = NULL, solution = NULL, tests = NULL, options = list(
eval = FALSE, echo = TRUE, results = "markup", tidy = FALSE,
tidy.opts = NULL, collapse = FALSE, prompt = FALSE, comment = NA,
@@ -987,16 +987,16 @@ Feedback
@@ -1060,12 +1060,12 @@ Feedback
From fdfd9d8c4dae81b6e1b040fdf658e89714d0b715 Mon Sep 17 00:00:00 2001
From: vly12
Date: Fri, 15 Mar 2024 17:29:32 +0000
Subject: [PATCH 03/15] adding code for mean, median, description (summary),
and frequency/crosstab/margin
---
intro.Rmd | 108 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 108 insertions(+)
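
Note (not part of the diff): the calls introduced in the new "Explore" section (`mean()`, `median()`, `describe()`, `value_counts()` and `pd.crosstab(..., margins = True)`) can be tried without the CSV file. The sketch below assumes a small made-up dataframe; only the column names `HospitalCode`, `Sex` and `LengthOfStay` and the code `B120H` come from the tutorial, and every other value is invented for illustration.

```python
import pandas as pd

# Made-up stand-in for data/borders_inc_age.csv (values are illustrative only).
borders_data = pd.DataFrame(
    {
        "HospitalCode": ["B120H", "B120H", "S116B", "S116B", "B120H"],
        "Sex": ["Male", "Female", "Female", "Male", "Female"],
        "LengthOfStay": [1, 3, 2, 10, 4],
    }
)

# Mean, median and the full set of summary statistics for one column.
mean_value = borders_data["LengthOfStay"].mean()
median_value = borders_data["LengthOfStay"].median()
summary = borders_data["LengthOfStay"].describe()

# Frequency table for one column.
frequency = borders_data["HospitalCode"].value_counts()

# Crosstab of two columns; margins=True adds the "All" row and column totals.
crosstab_hospitalcode_sex = pd.crosstab(
    borders_data["HospitalCode"], borders_data["Sex"], margins=True
)

print(mean_value, median_value)
print(summary)
print(frequency)
print(crosstab_hospitalcode_sex)
```

The long stay of 10 days pulls the mean above the median, which is one reason the section covers both.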
diff --git a/intro.Rmd b/intro.Rmd
index 4abb271..57bfc30 100644
--- a/intro.Rmd
+++ b/intro.Rmd
@@ -347,6 +347,114 @@ hello_world <- "Hello World"
print(hello_world)
```
+## Explore
+
+### Mean/Median & Summary
+
+* `mean()` and `median()` are passed arrays of values (usually from a data frame) to return the mean and median value.
+
+* `describe()` returns all summary statistics based on a given array.
+
+In the exercise below, you have the borders data-set loaded as `borders_data`. See if you can get the mean value for `LengthOfStay`, store it in a variable, and print that variable. Use the hint button if you need some help.
+
+```{python mean, exercise=TRUE, exercise.eval=TRUE}
+import pandas as pd
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+borders_data
+```
+
+```{python mean-hint-1}
+import pandas as pd
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+borders_data["LengthOfStay"]
+```
+
+
+```{python mean-hint-2}
+import pandas as pd
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+borders_data["LengthOfStay"].mean()
+```
+
+
+```{python mean-hint-3}
+import pandas as pd
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+mean_value = borders_data["LengthOfStay"].mean()
+
+print(mean_value)
+```
+
+```{python mean-solution}
+import pandas as pd
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+mean_value = borders_data["LengthOfStay"].mean()
+
+print(mean_value)
+```
+
+```{python mean-check}
+grade_code()
+```
+
+
+### Frequencies & Crosstabs
+
+* Frequency: `["").value_counts()`
+* Crosstab: `pd.crosstab([""], [""])`
+* Crosstab & Add Col/Row Totals: `pd.crosstab([""], [""], margins = True)`
+tab
+Create a crosstab for `HospitalCode` and `Sex`, add column and row totals, store the table in a variable, and print it out. Use the hint button if you need some help.
+
+```{python freq, exercise=TRUE}
+import pandas as pd
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+
+```
+
+```{python freq-hint-1}
+import pandas as pd
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+...pd.crosstab(...)
+```
+
+```{python freq-hint-2}
+import pandas as pd
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+crosstab_hospitalcode_sex = pd.crosstab(
+ borders_data["HospitalCode"], borders_data["Sex"]...)
+```
+
+```{python freq-hint-3}
+import pandas as pd
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+crosstab_hospitalcode_sex = pd.crosstab(
+ borders_data["HospitalCode"], borders_data["Sex"], margins = True)
+print(crosstab_hospitalcode_sex)
+```
+
+```{python freq-solution}
+import pandas as pd
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+crosstab_hospitalcode_sex = pd.crosstab(
+ borders_data["HospitalCode"], borders_data["Sex"], margins = True)
+
+print(crosstab_hospitalcode_sex)
+```
+
+```{python freq-check}
+grade_code()
+```
## Help & Feedback
From f1abbd4489ad4cb539126e215062d3602a1db2ba Mon Sep 17 00:00:00 2001
From: vly12
Date: Wed, 20 Mar 2024 17:53:32 +0000
Subject: [PATCH 04/15] Removed import pandas text. Started Location Section
---
intro.Rmd | 264 +++++++++++++++++++++++++++++++++++++++---
python-training.Rproj | 0
2 files changed, 251 insertions(+), 13 deletions(-)
mode change 100644 => 100755 intro.Rmd
mode change 100644 => 100755 python-training.Rproj
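
Note (not part of the diff): the "Location" section added here describes `loc` as label-based and `iloc` as position-based. The sketch below illustrates the difference on a made-up dataframe; the column names and the values `B120H` and `A1` come from the tutorial, and the remaining values are invented.

```python
import pandas as pd

# Made-up rows for illustration only.
borders_data = pd.DataFrame(
    {
        "HospitalCode": ["B120H", "S116B", "B120H", "S116B"],
        "Specialty": ["A1", "C8", "A1", "E12"],
    }
)

# Label-based: every row of a single column.
all_specialty = borders_data.loc[:, "Specialty"]

# Label-based slice: the end label is included, so 0:2 returns three rows.
first_three_codes = borders_data.loc[0:2, "HospitalCode"]

# Position-based slice: the end position is excluded, so 0:2 returns two rows.
first_two_rows = borders_data.iloc[0:2]

# Position-based single row, returned as a Series.
first_row = borders_data.iloc[0]

# With a repeated index label, loc returns every matching row.
by_hospital = borders_data.set_index("HospitalCode")
b120h_rows = by_hospital.loc["B120H"]

print(first_three_codes)
print(first_two_rows)
print(b120h_rows)
```

The differing end-point behaviour of `loc` and `iloc` slices is a common source of off-by-one mistakes, so it may be worth stating in the tutorial text.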
diff --git a/intro.Rmd b/intro.Rmd
old mode 100644
new mode 100755
index 57bfc30..95319d9
--- a/intro.Rmd
+++ b/intro.Rmd
@@ -8,6 +8,7 @@ output:
runtime: shiny_prerendered
---
+
```{r setup, include=FALSE}
# Author: Tina Fu
# Original Date: February 2024
@@ -357,15 +358,15 @@ print(hello_world)
In the exercise below, you have the borders data-set loaded as `borders_data`. See if you can get the mean value for `LengthOfStay`, store it in a variable, and print that variable. Use the hint button if you need some help.
-```{python mean, exercise=TRUE, exercise.eval=TRUE}
-import pandas as pd
+```{python mean, exercise = TRUE, exercise.setup = "pandas-setup"}
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
borders_data
```
```{python mean-hint-1}
-import pandas as pd
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
borders_data["LengthOfStay"]
@@ -373,7 +374,7 @@ borders_data["LengthOfStay"]
```{python mean-hint-2}
-import pandas as pd
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
borders_data["LengthOfStay"].mean()
@@ -381,7 +382,7 @@ borders_data["LengthOfStay"].mean()
```{python mean-hint-3}
-import pandas as pd
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
mean_value = borders_data["LengthOfStay"].mean()
@@ -390,7 +391,7 @@ print(mean_value)
```
```{python mean-solution}
-import pandas as pd
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
mean_value = borders_data["LengthOfStay"].mean()
@@ -405,28 +406,28 @@ grade_code()
### Frequencies & Crosstabs
-* Frequency: `["").value_counts()`
+* Frequency: `["").value_counts()]`
* Crosstab: `pd.crosstab([""], [""])`
* Crosstab & Add Col/Row Totals: `pd.crosstab([""], [""], margins = True)`
tab
Create a crosstab for `HospitalCode` and `Sex`, add column and row totals, store the table in a variable, and print it out. Use the hint button if you need some help.
-```{python freq, exercise=TRUE}
-import pandas as pd
+```{python freq, exercise = TRUE, exercise.setup = "pandas-setup"}
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
```
```{python freq-hint-1}
-import pandas as pd
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
...pd.crosstab(...)
```
```{python freq-hint-2}
-import pandas as pd
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
crosstab_hospitalcode_sex = pd.crosstab(
@@ -434,7 +435,7 @@ crosstab_hospitalcode_sex = pd.crosstab(
```
```{python freq-hint-3}
-import pandas as pd
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
crosstab_hospitalcode_sex = pd.crosstab(
@@ -443,7 +444,187 @@ print(crosstab_hospitalcode_sex)
```
```{python freq-solution}
-import pandas as pd
+
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+crosstab_hospitalcode_sex = pd.crosstab(
+ borders_data["HospitalCode"], borders_data["Sex"], margins = True)
+
+print(crosstab_hospitalcode_sex)
+```
+
+```{python freq-check}
+grade_code()
+```
+
+## Wrangle – Part 1
+
+For the next sections, we will focus on using the pandas module to manipulate data and data frames.
+
+
+### Location
+
+There are multiple ways to wrangling data in Python - some are more straightforward than others.
+
+To begin efficiently locating and manipulating data, it would be beneficial to become familiar with the less straightforward concept of locating the data.
+
+The `loc` and `iloc` attributes are used indexing, which allows for the locating of data from a dataframe based on row and column labels.
+
+`loc` - Label-based indexing (such as Hospital Codes, Names, etc.).
+`iloc` - Index-based indexing (remember Python indexes count from 0 onwards).
+
+By default, `loc` and `iloc` locates row indexes, instead of column.
+
+
+`.loc['']` will select all the rows with the row_label.
+`.iloc['']` will select the row at index == row_index_label.
+
+However, you can use loc and iloc to locate rows and columns alike:
+
+`.loc['']` will select all the rows with the row_label.
+`.iloc['']` will select the row at index == row_index_label.
+
+### Select / Slice Out Columns
+
+By far, the most straightfoward way to select a column is by the following method:
+
+* Selecting/Slicing Out a column: `[""]`
+
+However, to be able to start retrieve and manipulate data well, it would be useful to become familiar with loc and iloc.
+
+* Selecting/Slicing Out a column: `.loc[""]`
+
+Using the data stored in the borders_inc_age csv file, read it in, select/slice out the `HospitalCode` column, and print out the selected column.
+
+```{python freq, exercise = TRUE, exercise.setup = "pandas-setup"}
+
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+column_hospital_code =
+
+```
+
+```{python freq-hint-1}
+
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+column_hospital_code = borders_data
+```
+
+```{python freq-hint-2}
+
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+column_hospital_code = borders_data[""]
+```
+
+```{python freq-hint-3}
+
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+column_hospital_code = borders_data["HospitalCode"]
+
+print(column_hospital_code)
+```
+
+```{python freq-solution}
+
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+column_hospital_code = borders_data["HospitalCode"]
+
+print(column_hospital_code)
+```
+
+```{python freq-check}
+grade_code()
+```
+
+
+### Select Rows
+
+
+```{python freq, exercise = TRUE, exercise.setup = "pandas-setup"}
+
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+
+```
+
+```{python freq-hint-1}
+
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+...pd.crosstab(...)
+```
+
+```{python freq-hint-2}
+
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+crosstab_hospitalcode_sex = pd.crosstab(
+ borders_data["HospitalCode"], borders_data["Sex"]...)
+```
+
+```{python freq-hint-3}
+
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+crosstab_hospitalcode_sex = pd.crosstab(
+ borders_data["HospitalCode"], borders_data["Sex"], margins = True)
+print(crosstab_hospitalcode_sex)
+```
+
+```{python freq-solution}
+
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+crosstab_hospitalcode_sex = pd.crosstab(
+ borders_data["HospitalCode"], borders_data["Sex"], margins = True)
+
+print(crosstab_hospitalcode_sex)
+```
+
+```{python freq-check}
+grade_code()
+```
+
+### Add a New Row/Column
+
+
+```{python freq, exercise = TRUE, exercise.setup = "pandas-setup"}
+
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+
+```
+
+```{python freq-hint-1}
+
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+...pd.crosstab(...)
+```
+
+```{python freq-hint-2}
+
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+crosstab_hospitalcode_sex = pd.crosstab(
+ borders_data["HospitalCode"], borders_data["Sex"]...)
+```
+
+```{python freq-hint-3}
+
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+crosstab_hospitalcode_sex = pd.crosstab(
+ borders_data["HospitalCode"], borders_data["Sex"], margins = True)
+print(crosstab_hospitalcode_sex)
+```
+
+```{python freq-solution}
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
crosstab_hospitalcode_sex = pd.crosstab(
@@ -456,6 +637,63 @@ print(crosstab_hospitalcode_sex)
grade_code()
```
+### Delete a Row/Columns
+
+
+```{python freq, exercise = TRUE, exercise.setup = "pandas-setup"}
+
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+
+```
+
+```{python freq-hint-1}
+
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+...pd.crosstab(...)
+```
+
+```{python freq-hint-2}
+
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+crosstab_hospitalcode_sex = pd.crosstab(
+ borders_data["HospitalCode"], borders_data["Sex"]...)
+```
+
+```{python freq-hint-3}
+
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+crosstab_hospitalcode_sex = pd.crosstab(
+ borders_data["HospitalCode"], borders_data["Sex"], margins = True)
+print(crosstab_hospitalcode_sex)
+```
+
+```{python freq-solution}
+
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+crosstab_hospitalcode_sex = pd.crosstab(
+ borders_data["HospitalCode"], borders_data["Sex"], margins = True)
+
+print(crosstab_hospitalcode_sex)
+```
+
+```{python freq-check}
+grade_code()
+```
+
+
+## Wrangle – Part 2
+
+· Manipulate Strings
+
+· Aggregate Dataframe
+
+· Merge Dataframe
+
## Help & Feedback
#### Feedback
diff --git a/python-training.Rproj b/python-training.Rproj
old mode 100644
new mode 100755
From 5bda74bd758e1f1751769e274cfbff92f39d7f23 Mon Sep 17 00:00:00 2001
From: vly12
Date: Thu, 21 Mar 2024 15:48:23 +0000
Subject: [PATCH 05/15] Wrangle #1
---
intro.Rmd | 197 +++++++++-----
intro.html | 752 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
2 files changed, 872 insertions(+), 77 deletions(-)
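
Note (not part of the diff): the sections added here cover selecting rows by value, adding a column, adding a row and deleting rows or columns. The sketch below walks through the same steps on a made-up dataframe. It uses `pd.concat()` to add the row, because `DataFrame.append()` was removed in pandas 2.0; that substitution is a suggestion, not what the patch itself uses. All data values other than `B120H`, `A1` and `Male` are invented.

```python
import pandas as pd

# Made-up rows for illustration only.
borders_data = pd.DataFrame(
    {
        "HospitalCode": ["B120H", "S116B", "B120H"],
        "Specialty": ["A1", "C8", "A1"],
        "Sex": ["Male", "Female", "Female"],
    }
)

# Select rows by value with a boolean mask.
rows_of_a1 = borders_data[borders_data["Specialty"] == "A1"]

# Add a column: the single value is repeated for every row.
borders_data["number_of_eyebrows"] = 2

# Add a row: wrap the dictionary in a one-row dataframe and concatenate.
new_row_data = {
    "HospitalCode": "S116B",
    "Specialty": "C8",
    "Sex": "Male",
    "number_of_eyebrows": 2,
}
with_new_row = pd.concat(
    [borders_data, pd.DataFrame([new_row_data])], ignore_index=True
)

# Delete rows by keeping only those that do not match a value.
not_male = with_new_row[with_new_row["Sex"] != "Male"]

# Delete the second row by looking up its label from its position.
without_second = with_new_row.drop(with_new_row.index[1])

# Delete a column by label.
without_sex = with_new_row.drop("Sex", axis=1)

print(rows_of_a1)
print(not_male)
print(without_second)
print(without_sex)
```

None of these operations modify `with_new_row` in place; each returns a new dataframe, which is why the tutorial assigns the result back to a variable.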
diff --git a/intro.Rmd b/intro.Rmd
index 95319d9..216183f 100755
--- a/intro.Rmd
+++ b/intro.Rmd
@@ -464,39 +464,47 @@ For the next sections, we will focus on using the pandas module manipulate data
### Location
-There are multiple ways to wrangling data in Python - some are more straightforward than others.
+There are multiple ways to wrangle data in Python - some are more straightforward than others.
-To begin efficiently locating and manipulating data, it would be beneficial to become familiar with the less straightforward concept of locating the data.
+To begin efficiently locating and manipulating data, it would be beneficial to become familiar with the concept of locating data in a dataframe.
-The `loc` and `iloc` attributes are used indexing, which allows for the locating of data from a dataframe based on row and column labels.
+The `loc` and `iloc` attributes are used for indexing, which allows for the locating of data from a dataframe based on row and column labels/indexes.
`loc` - Label-based indexing (such as Hospital Codes, Names, etc.).
-`iloc` - Index-based indexing (remember Python indexes count from 0 onwards).
+`iloc` - Integer-based indexing (remember Python indexes count from 0).
-By default, `loc` and `iloc` locates row indexes, instead of column.
+By default, `loc` and `iloc` locate row indexes rather than column indexes.
-
-`.loc['']` will select all the rows with the row_label.
+`.loc['']` will select the row (or every row, if the label is repeated) with row_label as the label index.
`.iloc['']` will select the row at index == row_index_label.
-However, you can use loc and iloc to locate rows and columns alike:
-`.loc['']` will select all the rows with the row_label.
-`.iloc['']` will select the row at index == row_index_label.
+This does not mean the location attributes are limited to locating rows - in fact, the real power of the location attributes is in their ability to freely locate relevant rows, columns, values, etc.
+
+In the next few sections, we will see how useful the location attributes can be.
+
### Select / Slice Out Columns
-By far, the most straightfoward way to select a column is by the following method:
+By far, the most straightforward way to select a column is by the following method:
* Selecting/Slicing Out a column: `[""]`
-However, to be able to start retrieve and manipulate data well, it would be useful to become familiar with loc and iloc.
+However, by using `loc` and `iloc`, we can start to specify further:
-* Selecting/Slicing Out a column: `.loc[""]`
+* Selecting/Slicing Out a column: `.loc["",""]`
-Using the data stored in the borders_inc_age csv file, read it in, select/slice out the `HospitalCode` column, and print out the selected column.
+* Example: `borders_data.loc[:,"Specialty"]` will obtain all the rows from the `Specialty` column.
-```{python freq, exercise = TRUE, exercise.setup = "pandas-setup"}
+You may be wondering why using the `loc` attribute is helpful here when the first method is much simpler.
+
+Using `loc`, you have the ability to select specific rows within that column:
+
+* Example: `borders_data.loc[0:2,"HospitalCode"]` will obtain the first 3 rows from the `HospitalCode` column.
+
+Read in the data stored in the borders_inc_age csv file, select/slice out the entire `HospitalCode` column using pandas, and print out the selected column.
+
+```{python select, exercise = TRUE, exercise.setup = "pandas-setup"}
borders_data = pd.read_csv("data/borders_inc_age.csv")
@@ -504,184 +512,229 @@ column_hospital_code =
```
-```{python freq-hint-1}
+```{python select-hint-1}
borders_data = pd.read_csv("data/borders_inc_age.csv")
-column_hospital_code = borders_data
+column_hospital_code = borders_data.loc
```
-```{python freq-hint-2}
+```{python select-hint-2}
borders_data = pd.read_csv("data/borders_inc_age.csv")
-column_hospital_code = borders_data[""]
+column_hospital_code = borders_data.loc[:,]
```
-```{python freq-hint-3}
+```{python select-hint-3}
borders_data = pd.read_csv("data/borders_inc_age.csv")
-column_hospital_code = borders_data["HospitalCode"]
+column_hospital_code = borders_data.loc[:,'HospitalCode']
print(column_hospital_code)
```
-```{python freq-solution}
+```{python select-solution}
borders_data = pd.read_csv("data/borders_inc_age.csv")
-column_hospital_code = borders_data["HospitalCode"]
+column_hospital_code = borders_data.loc[:,'HospitalCode']
print(column_hospital_code)
```
-```{python freq-check}
+```{python select-check}
grade_code()
```
### Select Rows
+The pandas library has multiple ways to select rows, depending on your criteria.
-```{python freq, exercise = TRUE, exercise.setup = "pandas-setup"}
+* Select Rows by index: `.iloc[]`
+* Example: `borders_data.iloc[0]` *This will show the first row of the dataframe*
+
+* Select Rows which have a specific value in a specific column: `[[''] == '']`
+* Example: `borders_data[borders_data['HospitalCode'] == 'B120H']` *This will show all the rows which have `B120H` in the `HospitalCode` column*
+
+* Select Rows by label: `.loc['']`
+* Example: `borders_data.loc['row_label']` *This will locate every row with row_label as the label index - please note, this csv file does not have row labels*
+
+Read in the borders_inc_age csv file, and select all the rows with `Specialty == A1`.
+
+```{python select-row, exercise = TRUE, exercise.setup = "pandas-setup"}
borders_data = pd.read_csv("data/borders_inc_age.csv")
```
-```{python freq-hint-1}
+```{python select-row-hint-1}
borders_data = pd.read_csv("data/borders_inc_age.csv")
-...pd.crosstab(...)
+rows_of_a1 = borders_data[...]
```
-```{python freq-hint-2}
+```{python select-row-hint-2}
borders_data = pd.read_csv("data/borders_inc_age.csv")
-crosstab_hospitalcode_sex = pd.crosstab(
- borders_data["HospitalCode"], borders_data["Sex"]...)
+rows_of_a1 = borders_data[borders_data['...']...]
```
-```{python freq-hint-3}
+```{python select-row-hint-3}
borders_data = pd.read_csv("data/borders_inc_age.csv")
-crosstab_hospitalcode_sex = pd.crosstab(
- borders_data["HospitalCode"], borders_data["Sex"], margins = True)
-print(crosstab_hospitalcode_sex)
+rows_of_a1 = borders_data[borders_data['Specialty']=='A1']
```
-```{python freq-solution}
+```{python select-row-solution}
borders_data = pd.read_csv("data/borders_inc_age.csv")
-crosstab_hospitalcode_sex = pd.crosstab(
- borders_data["HospitalCode"], borders_data["Sex"], margins = True)
-
-print(crosstab_hospitalcode_sex)
+rows_of_a1 = borders_data[borders_data['Specialty']=='A1']
```
-```{python freq-check}
+```{python select-row-check}
grade_code()
```
### Add a New Row/Column
+Adding a new row/column is straightforward.
-```{python freq, exercise = TRUE, exercise.setup = "pandas-setup"}
+When adding a new row, you can first create a dictionary, where `keys` (column names) and `values` (corresponding values in the new row) are mapped together.
+
+* Create dictionary of mapped keys (columns) and values (row values):
+`new_row_data = {'':'','':'','':''}`
+
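+Then, you can append the new row to the dataframe. Note that `append()` was removed from dataframes in pandas 2.0, so on newer versions use `pd.concat()` instead.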
+
+` = .append(new_row_data)`
+
+If you want to leave a value blank, you can use `None` as the row_value.
+
+
+* Add new column: `[''] = `
+
+This will create a new column with all the rows populated by the row_value.
+
+
+Read in the borders_inc_age file. Create a new column called `number_of_eyebrows` with the row_value as `2` for all the rows.
+
+```{python new-row, exercise = TRUE, exercise.setup = "pandas-setup"}
borders_data = pd.read_csv("data/borders_inc_age.csv")
+
```
-```{python freq-hint-1}
+```{python new-row-hint-1}
borders_data = pd.read_csv("data/borders_inc_age.csv")
-...pd.crosstab(...)
+borders_data[...] ...
```
-```{python freq-hint-2}
+```{python new-row-hint-2}
borders_data = pd.read_csv("data/borders_inc_age.csv")
-crosstab_hospitalcode_sex = pd.crosstab(
- borders_data["HospitalCode"], borders_data["Sex"]...)
+borders_data['number_of_eyebrows'] = ....
```
-```{python freq-hint-3}
+```{python new-row-hint-3}
borders_data = pd.read_csv("data/borders_inc_age.csv")
-crosstab_hospitalcode_sex = pd.crosstab(
- borders_data["HospitalCode"], borders_data["Sex"], margins = True)
-print(crosstab_hospitalcode_sex)
+borders_data['number_of_eyebrows'] = 2
```
-```{python freq-solution}
+```{python new-row-solution}
borders_data = pd.read_csv("data/borders_inc_age.csv")
-crosstab_hospitalcode_sex = pd.crosstab(
- borders_data["HospitalCode"], borders_data["Sex"], margins = True)
-
-print(crosstab_hospitalcode_sex)
+borders_data['number_of_eyebrows'] = 2
```
-```{python freq-check}
+```{python new-row-check}
grade_code()
```
### Delete a Row/Columns
+There are multiple ways to delete rows and columns.
-```{python freq, exercise = TRUE, exercise.setup = "pandas-setup"}
+When deleting rows, you can create a dataframe with the rows containing a certain row_value filtered out:
+
+* Deleting rows: `<df_name> = <df_name>[<df_name>['<column_name>'] != '<row_value>']`
+
+You can also delete rows based on their index position by slicing with `iloc`.
+
+* Removing the first row: `<df_name> = <df_name>.iloc[1:]`
+
+To remove only the second row using the same method, you need to locate the rows you want to keep and concatenate them.
+
+* Removing the second row: `<df_name> = pd.concat([<df_name>.iloc[:1], <df_name>.iloc[2:]])`
+
+Alternatively, you could use the `drop()` method.
+
+* Removing the second row: `<df_name> = <df_name>.drop(<df_name>.index[1])`
+
+* Removing a row by its row_label: `<df_name> = <df_name>.drop('<row_label>', axis=0)` **`axis=0` specifies you are removing a row.**
+
+* Removing a row by its index label: `<df_name> = <df_name>.drop(index_position, axis=0)` (`drop()` matches index labels; with the default integer index, the label is the same as the position).
+
+The `drop()` method can also be used to drop columns.
+
+* Removing a column by label: `<df_name> = <df_name>.drop('<column_name>', axis=1)` **`axis=1` specifies you are removing a column.**
+
+* Removing a column by index position: `<df_name> = <df_name>.drop(<df_name>.columns[index_position], axis=1)`
+
+
+Read in the borders_inc_age csv. Remove all the rows which have `Male` as the value in the `Sex` column.
+
+```{python delete-row, exercise = TRUE, exercise.setup = "pandas-setup"}
borders_data = pd.read_csv("data/borders_inc_age.csv")
```
-```{python freq-hint-1}
+```{python delete-row-hint-1}
borders_data = pd.read_csv("data/borders_inc_age.csv")
-...pd.crosstab(...)
+borders_data = ...
```
-```{python freq-hint-2}
+```{python delete-row-hint-2}
borders_data = pd.read_csv("data/borders_inc_age.csv")
-crosstab_hospitalcode_sex = pd.crosstab(
- borders_data["HospitalCode"], borders_data["Sex"]...)
+borders_data = borders_data[...]
```
-```{python freq-hint-3}
+```{python delete-row-hint-3}
borders_data = pd.read_csv("data/borders_inc_age.csv")
-crosstab_hospitalcode_sex = pd.crosstab(
- borders_data["HospitalCode"], borders_data["Sex"], margins = True)
-print(crosstab_hospitalcode_sex)
+borders_data = borders_data[borders_data[...] != ...]
```
-```{python freq-solution}
+```{python delete-row-solution}
borders_data = pd.read_csv("data/borders_inc_age.csv")
-crosstab_hospitalcode_sex = pd.crosstab(
- borders_data["HospitalCode"], borders_data["Sex"], margins = True)
-
-print(crosstab_hospitalcode_sex)
+borders_data = borders_data[borders_data['Sex'] != 'Male']
```
-```{python freq-check}
+```{python delete-row-check}
grade_code()
```
diff --git a/intro.html b/intro.html
index 5482ea7..db3af2c 100644
--- a/intro.html
+++ b/intro.html
@@ -497,6 +497,419 @@ Code Exercise
+
+
Explore
+
+
+
Frequencies & Crosstabs
+
+- Frequency:
+
<df_name>["<col_name>"].value_counts()
+- Crosstab:
+
pd.crosstab(<df_name>["<col_1_name>"], <df_name>["<col_2_name>"])
+- Crosstab & Add Col/Row Totals:
+
pd.crosstab(<df_name>["<col_1_name>"], <df_name>["<col_2_name>"], margins = True)
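The three patterns above can be tried on a small in-memory dataframe. This is only a sketch: the `toy` dataframe and its values are made up for illustration and are not taken from `borders_inc_age.csv`.

```python
import pandas as pd

# Made-up toy data (an assumption, not the real borders file)
toy = pd.DataFrame({
    "HospitalCode": ["B120H", "B120H", "S116B", "S116B", "B120H"],
    "Sex": ["Male", "Female", "Female", "Female", "Male"],
})

# Frequency of each value in one column
freq = toy["HospitalCode"].value_counts()

# Crosstab of two columns, with row and column totals labelled "All"
tab = pd.crosstab(toy["HospitalCode"], toy["Sex"], margins=True)

print(freq)
print(tab)
```

With `margins=True`, pandas adds an extra row and column named `All` that hold the totals.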
+Create a crosstab for HospitalCode and
+Sex, add column and row totals, store the table in a
+variable, and print it out. Use the hint button if you need some
+help.
+
+
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+
+
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+...pd.crosstab(...)
+
+
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+crosstab_hospitalcode_sex = pd.crosstab(
+ borders_data["HospitalCode"], borders_data["Sex"]...)
+
+
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+crosstab_hospitalcode_sex = pd.crosstab(
+ borders_data["HospitalCode"], borders_data["Sex"], margins = True)
+print(crosstab_hospitalcode_sex)
+
+
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+crosstab_hospitalcode_sex = pd.crosstab(
+ borders_data["HospitalCode"], borders_data["Sex"], margins = True)
+
+print(crosstab_hospitalcode_sex)
+
+
+
+
+
Wrangle – Part 1
+
For the next sections, we will focus on using the pandas module to
+manipulate data and data frames.
+
+
Location
+
There are multiple ways to wrangle data in Python - some are more
+straightforward than others.
+
To begin efficiently locating and manipulating data, it would be
+beneficial to become familiar with the concept of locating data in a
+dataframe.
+
The loc and iloc attributes are used for
+indexing, which allows for the locating of data from a dataframe based
+on row and column labels/indexes.
+
loc - Label-based indexing (such as Hospital Codes,
+Names, etc.). iloc - Integer-based indexing (remember
+Python indexes count from 0).
+
By default, loc and iloc locate rows
+rather than columns.
+
<df_name>.loc['<row_label>'] will select the
+row (or rows) whose index label is row_label.
+<df_name>.iloc[<row_index>] will select
+the row at integer position row_index.
+
This does not mean the location attributes are isolated to only
+locating rows - in fact, the real power of the location attributes is in
+their ability to freely locate relevant rows, columns, values, etc.
+
In the next few sections, we will see how useful the location
+attributes can be.
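A minimal sketch of the difference between label-based and position-based lookup. The dataframe, its values and the row labels (`row_a`, `row_b`, `row_c`) are hypothetical and exist only to make the two attributes visibly different.

```python
import pandas as pd

# Made-up toy data with explicit (hypothetical) row labels
toy = pd.DataFrame(
    {"Specialty": ["A1", "C8", "E12"], "LengthOfStay": [3, 1, 7]},
    index=["row_a", "row_b", "row_c"],
)

# loc looks rows up by label
by_label = toy.loc["row_b"]

# iloc looks rows up by integer position (counting from 0)
by_position = toy.iloc[1]

# loc can also take a row label and a column label together
cell = toy.loc["row_c", "LengthOfStay"]

print(by_label)
print(cell)
```

Here `by_label` and `by_position` refer to the same row, reached in two different ways.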
+
+
+
Select / Slice Out Columns
+
By far, the most straightforward way to select a column is by the
+following method:
+
+- Selecting/Slicing Out a column:
+
<df_name>["<col_name>"]
+
+
However, by using loc and iloc, we can
+start to specify further:
+
+Selecting/Slicing Out a column:
+<df_name>.loc["<row_index>","<column_index>"]
+Example: borders_data.loc[:,"Specialty"] will obtain
+all the rows from the Specialty column.
+
+
You may be wondering why using the loc attribute is
+helpful here when the first method is much simpler.
+
Using loc, you have the ability to select specific rows
+within that column:
+
+- Example:
borders_data.loc[0:2,"HospitalCode"] will
+obtain the first 3 rows from the HospitalCode column.
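One detail worth seeing in code is that `loc` slices include the end label, while `iloc` slices stop before the end position. The sketch below uses a made-up dataframe with the default integer index; the values are assumptions for illustration.

```python
import pandas as pd

# Made-up toy data with the default 0..n-1 index
toy = pd.DataFrame({
    "HospitalCode": ["B120H", "S116B", "B120H", "S116B"],
    "LengthOfStay": [3, 1, 7, 2],
})

# Label slice: 0:2 includes the row labelled 2, so three rows come back
first_three_loc = toy.loc[0:2, "HospitalCode"]

# Position slice: 0:3 stops before position 3, so also three rows
first_three_iloc = toy.iloc[0:3, 0]

# Position slice 0:2 gives only two rows
first_two_iloc = toy.iloc[0:2, 0]

print(first_three_loc)
```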
+
+
Read in the data stored in the borders_inc_age csv file, select/slice
+out the entire HospitalCode column using pandas, and print
+out the selected column.
+
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+column_hospital_code =
+
+
+
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+column_hospital_code = borders_data.loc
+
+
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+column_hospital_code = borders_data.loc[:,]
+
+
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+column_hospital_code = borders_data.loc[:,'HospitalCode']
+
+print(column_hospital_code)
+
+
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+column_hospital_code = borders_data.loc[:,'HospitalCode']
+
+print(column_hospital_code)
+
+
+
+
Select Rows
+
The pandas library has multiple ways to select rows, depending on your
+criteria.
+
+Select rows by index position:
+<df_name>.iloc[<index_number>]
+Example: borders_data.iloc[0] This will show the first
+row of the dataframe
+Select rows which have a specific value in a specific column:
+<df_name>[<df_name>['<column_name>'] == '<row_value>']
+Example:
+borders_data[borders_data['HospitalCode'] == 'B120H']
+This will show all the rows which have B120H in the
+HospitalCode column
+Select rows by index label:
+<df_name>.loc['<row_label>']
+Example: borders_data.loc['row_label'] This
+will select the row (or rows) with row_label as the index label -
+please note, this csv file does not have row labels
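The value-based selection works by building a column of True/False values and using it as a filter. A short sketch on made-up data (the `toy` dataframe is an assumption, not the real file) shows this, plus how two conditions might be combined with `&`:

```python
import pandas as pd

# Made-up toy data
toy = pd.DataFrame({
    "Specialty": ["A1", "C8", "A1", "E12"],
    "Sex": ["Male", "Female", "Female", "Male"],
})

# The comparison produces a boolean Series, one value per row
mask = toy["Specialty"] == "A1"

# Indexing with the mask keeps only the rows where it is True
rows_of_a1 = toy[mask]

# Two conditions: wrap each in parentheses and join with &
a1_female = toy[(toy["Specialty"] == "A1") & (toy["Sex"] == "Female")]

print(rows_of_a1)
print(a1_female)
```

The parentheses around each condition are needed because `&` binds more tightly than `==` in Python.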
+
+
Read in the borders_inc_age csv file, and select all the rows with
+Specialty == A1.
+
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+
+
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+rows_of_a1 = borders_data[...]
+
+
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+rows_of_a1 = borders_data[borders_data['...']...]
+
+
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+rows_of_a1 = borders_data[borders_data['Specialty']=='A1']
+
+
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+rows_of_a1 = borders_data[borders_data['Specialty']=='A1']
+
+
+
+
Add a New Row/Column
+
Adding a new row/column is straightforward.
+
When adding a new row, you can first create a dictionary, where
+keys (column names) and values (corresponding
+values in the new row) are mapped together.
+
+- Create dictionary of mapped keys (columns) and values (row values):
+
new_row_data = {'<column_name>':'<row_value>','<column_name>':'<row_value>','<column_name>':'<row_value>'}
+
+
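Then, you can append the new row to the dataframe. DataFrame.append()
+was removed in pandas 2.0, so wrap the dictionary in a one-row
+dataframe and use pd.concat():
+
<df_name> = pd.concat([<df_name>, pd.DataFrame([new_row_data])], ignore_index=True)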
+
If you want to leave the value blank, you can input None
+into the row_value.
+
+- Add new column:
+
<df_name>['<new_column_name>'] = <row_value>
+
+
This will create a new column with all the rows populated by the
+row_value.
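A minimal sketch of adding a row and a column, on made-up toy data. It assumes pandas 2.x, where `DataFrame.append()` is no longer available, and so uses `pd.concat()` with a one-row dataframe built from the dictionary.

```python
import pandas as pd

# Made-up toy data
toy = pd.DataFrame({
    "HospitalCode": ["B120H", "S116B", "B120H"],
    "Sex": ["Male", "Female", "Female"],
})

# Dictionary mapping column names to the values of the new row
new_row_data = {"HospitalCode": "S116B", "Sex": "Male"}

# Wrap the dictionary in a one-row dataframe and concatenate;
# ignore_index=True renumbers the rows 0..n-1
toy = pd.concat([toy, pd.DataFrame([new_row_data])], ignore_index=True)

# Assigning a single value fills every row of the new column
toy["number_of_eyebrows"] = 2

print(toy)
```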
+
Read in the borders_inc_age file. Create a new column called
+number_of_eyebrows with the row_value as 2 for
+all the rows.
+
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+
+
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+borders_data[...] ...
+
+
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+borders_data['number_of_eyebrows'] = ....
+
+
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+borders_data['number_of_eyebrows'] = 2
+
+
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+borders_data['number_of_eyebrows'] = 2
+
+
+
+
Delete a Row/Columns
+
There are multiple ways to delete rows and columns.
+
When deleting a row, you could create a dataframe with the rows
+containing certain row_values filtered out:
+
+- Deleting a row:
+
<df_name> = <df_name>[<df_name>['<column_name>'] != '<row_value>']
+
+
You can also delete rows based on their index position by
+slicing with iloc.
+
+- Removing the first row:
+
<df_name> = <df_name>.iloc[1:]
+
+
To remove only the second row using the same method, you
+need to locate the rows you want to keep and
+concatenate them.
+
+- Removing the second row:
+
<df_name> = pd.concat([<df_name>.iloc[:1], <df_name>.iloc[2:]])
+
+
Alternatively, you could use the drop() method.
+
+Removing the second row:
+<df_name> = <df_name>.drop(<df_name>.index[1])
+Removing a row by its row_label:
+<df_name> = <df_name>.drop('<row_label>', axis=0)
+axis=0 specifies you are removing a
+row.
+Removing a row by its index label:
+<df_name> = <df_name>.drop(index_position, axis=0)
+drop() matches index labels; with the default integer index, the
+label is the same as the position.
+
+
The drop() method can also be used to drop columns.
+
+Removing a column by label:
+<df_name> = <df_name>.drop('<column_name>', axis=1)
+axis=1 specifies you are removing a
+column.
+Removing a column by index position:
+<df_name> = <df_name>.drop(<df_name>.columns[index_position], axis=1)
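The deletion patterns above can be compared side by side on a small made-up dataframe (the data is an assumption for illustration). Each call returns a new dataframe and leaves `toy` itself unchanged.

```python
import pandas as pd

# Made-up toy data
toy = pd.DataFrame({
    "HospitalCode": ["B120H", "S116B", "B120H", "S116B"],
    "Sex": ["Male", "Female", "Female", "Male"],
    "LengthOfStay": [3, 1, 7, 2],
})

# Filter out every row where Sex is 'Male'
no_males = toy[toy["Sex"] != "Male"]

# Drop the second row by looking up its label from its position
without_second = toy.drop(toy.index[1])

# Drop a column by label
without_sex = toy.drop("Sex", axis=1)

# Drop a column by position, via the columns attribute
without_first_col = toy.drop(toy.columns[0], axis=1)

print(no_males)
print(without_second)
```

Note that the remaining rows keep their original index labels; `reset_index(drop=True)` can renumber them if needed.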
+
+
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+
+
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+...pd.crosstab(...)
+
+
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+crosstab_hospitalcode_sex = pd.crosstab(
+ borders_data["HospitalCode"], borders_data["Sex"]...)
+
+
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+crosstab_hospitalcode_sex = pd.crosstab(
+ borders_data["HospitalCode"], borders_data["Sex"], margins = True)
+print(crosstab_hospitalcode_sex)
+
+
+
borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+crosstab_hospitalcode_sex = pd.crosstab(
+ borders_data["HospitalCode"], borders_data["Sex"], margins = True)
+
+print(crosstab_hospitalcode_sex)
+
+
##Wrangle – Part 2
+
· Manipulate Strings
+
· Aggregate Dataframe
+
· Merge Dataframe
+
+
Help & Feedback
@@ -987,16 +1400,16 @@
Feedback
@@ -1057,10 +1470,339 @@
Feedback
fig.num = 0, exercise.df_print = "paged"), engine = "r",
version = "4"), class = c("r", "tutorial_exercise")))
From 495d3542c46674a22a5cd3dcdec331af2c4c3d41 Mon Sep 17 00:00:00 2001
From: vly12
Date: Thu, 21 Mar 2024 18:14:42 +0000
Subject: [PATCH 06/15] Wrangle #2 beginning
---
intro.Rmd | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 61 insertions(+), 3 deletions(-)
diff --git a/intro.Rmd b/intro.Rmd
index 216183f..60a88f8 100755
--- a/intro.Rmd
+++ b/intro.Rmd
@@ -741,11 +741,69 @@ grade_code()
##Wrangle – Part 2
-· Manipulate Strings
+### Manipulate Strings
-· Aggregate Dataframe
+There are a number of ways to manipulate strings in Python. Here are a few methods to achieve various outputs:
-· Merge Dataframe
+* Split values at a delimiter: `<df_name>['<column_name>'] = <df_name>['<column_name>'].str.split('<delimiter>')`
+
+* Example: `borders_data['ManagementofPatient'] = borders_data['ManagementofPatient'].str.split('a')`
+
+After the above line of code, if you `print(borders_data)`, you will see that every value in the column `ManagementofPatient` has been split into a list of pieces at each occurrence of 'a'.
+
+* Strip values of leading and trailing white space: `<df_name>['<column_name>'] = <df_name>['<column_name>'].str.strip()`
+* Convert values to upper case: `<df_name>['<column_name>'] = <df_name>['<column_name>'].str.upper()`
+* Convert values to lower case: `<df_name>['<column_name>'] = <df_name>['<column_name>'].str.lower()`
+* Replace values: `<df_name>['<column_name>'] = <df_name>['<column_name>'].str.replace('<old_value>','<new_value>')`
+
+It is important to note that this is not an exhaustive list of all the string manipulation methods - there are many more to find out and use, depending on your requirements.
+
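The string methods listed above can be chained, since each one returns a new Series. The sketch below runs them on a made-up Series of codes (the values are assumptions, not taken from the borders file); `regex=False` is passed so the search text is treated literally.

```python
import pandas as pd

# Made-up codes with inconsistent case and stray white space
codes = pd.Series(["  b120h ", "S116B", "B120H"])

# Strip white space, then convert to upper case
cleaned = codes.str.strip().str.upper()

# Replace one literal value with another
replaced = cleaned.str.replace("B120H", "BRILLIANT", regex=False)

# Split each value at the character '1'; each result is a list
parts = cleaned.str.split("1")

print(cleaned)
print(replaced)
print(parts)
```

Splitting produces a list per row, so a value with two delimiters next to each other, such as `S116B`, yields an empty string between them.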
+In the exercise below, read the borders_inc_age csv in, then replace all instances of `B120H` in the `HospitalCode` column with the phrase `BRILLIANT`.
+
+```{python manipulate-string, exercise = TRUE, exercise.setup = "pandas-setup"}
+
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+
+```
+
+```{python manipulate-string-hint-1}
+
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+borders_data['HospitalCode'] = ...
+```
+
+```{python manipulate-string-hint-2}
+
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+
+borders_data['HospitalCode'] = borders_data['HospitalCode']...
+```
+
+```{python manipulate-string-hint-3}
+
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+
+borders_data['HospitalCode'] = borders_data['HospitalCode'].str.replace('B120H','BRILLIANT')
+```
+
+```{python manipulate-string-solution}
+
+borders_data = pd.read_csv("data/borders_inc_age.csv")
+
+borders_data['HospitalCode'] = borders_data['HospitalCode'].str.replace('B120H','BRILLIANT')
+```
+
+```{python manipulate-string-check}
+grade_code()
+```
+
+### Aggregate Dataframe
+
+### Merge Dataframe
## Help & Feedback
From dcb6ab88ae486acdeaf52ef9d8f805f34120f6b3 Mon Sep 17 00:00:00 2001
From: vly12
Date: Thu, 28 Mar 2024 19:49:53 +0000
Subject: [PATCH 07/15] Added new dataframes in order to illustrate
merging/joining. Added joining image. Completed sections on groupby and
aggregate.
---
data/Baby5.csv | 9 +
data/Baby6.csv | 5 +
images/join_python.png | Bin 0 -> 73465 bytes
intro.Rmd | 124 ++++++++++++-
intro.html | 406 +++++++++++++++++++++++++++++++++++++----
5 files changed, 507 insertions(+), 37 deletions(-)
create mode 100644 data/Baby5.csv
create mode 100644 data/Baby6.csv
create mode 100644 images/join_python.png
diff --git a/data/Baby5.csv b/data/Baby5.csv
new file mode 100644
index 0000000..fea1a5c
--- /dev/null
+++ b/data/Baby5.csv
@@ -0,0 +1,9 @@
+FAMILYID,BABYNAME,DOB
+1,Julie,1/15/2017
+2,David,1/1/2017
+4,Caroline,1/6/2017
+3,Lucy,1/29/2017
+1,Kevin,1/15/2017
+1,Stephen,1/15/2017
+2,Declan,1/1/2017
+4,Sara,1/6/2017
diff --git a/data/Baby6.csv b/data/Baby6.csv
new file mode 100644
index 0000000..483aef9
--- /dev/null
+++ b/data/Baby6.csv
@@ -0,0 +1,5 @@
+FAMILYID,DOB,SURNAME
+2,1/1/2017,Smith
+3,1/29/2017,Jones
+1,1/15/2017,McDaid
+4,1/6/2017,MacArthur
diff --git a/images/join_python.png b/images/join_python.png
new file mode 100644
index 0000000000000000000000000000000000000000..2e3bb3f00e6647ac63b48274078ed7d9d90aa97c
GIT binary patch
literal 73465
[base85-encoded binary data for images/join_python.png omitted]
zsRVW%o9eljEj-J%#J*+025y|*F&Ru>DNks7m_9%CUVEGq1tx