A Practical Primer
10/30/23
Download and install Python 3 on your computer
Be able to take input data and generate output results using Python
Be able to import modules and libraries and load data
Know what an array and dataframe are
Understand what a function is and how common functions work
Learn data wrangling basics
Download Python at python.org/downloads
Download Visual Studio Code at code.visualstudio.com/download
Visual Studio Code (VS Code) is an Integrated Development Environment (IDE), which provides a visual and interactive interface to make coding in Python easier
Python is the programming language that we can use to explore datasets
Much of Python's power comes from being open source: anyone can write and share packages that extend it
pip install package_name installs a package for the first time (you only need to do this once)
For this session, open the terminal and run pip install numpy and pip install pandas
import numpy as np and import pandas as pd load these packages into your .py program
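As a quick check that the installation worked, a short script along these lines can be run. It is only a sketch and assumes both packages were installed with pip as described above:

```python
import numpy as np
import pandas as pd

# Printing the version is a simple way to confirm each import succeeded.
print("numpy version:", np.__version__)
print("pandas version:", pd.__version__)
```

If either import fails with ModuleNotFoundError, the package is most likely not installed for the Python interpreter you are running.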
Assignment: assign a value to a variable name (e.g. x = 10)
Variable names are usually written in snake_case or camelCase
Functions: your “action verbs”, which take in input arguments and return output (e.g. np.mean())
Comments: statements that help you and others better understand your code, but are not executed (e.g. # This is a comment!)
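The three ideas above (assignment, functions, comments) can be seen together in a small sketch. The variable names here are made up for illustration:

```python
import numpy as np

# This is a comment! Python skips it when running the program.

# Assignment: store the value 10 under the name x.
x = 10

# Two common naming styles for multi-word variable names.
sample_size = 20   # snake_case
sampleSize = 20    # camelCase

# Function: np.mean() takes input arguments and returns an output.
average = np.mean([x, sample_size, 30])
print(average)  # prints 20.0
```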
This is how to write your own function:
syntax: def function_name(arguments):
arguments: what you pass in as inputs
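A minimal sketch of this syntax, using a hypothetical function named add_numbers:

```python
def add_numbers(a, b):
    """Return the sum of the two arguments a and b."""
    total = a + b
    return total


# Call the function by passing in values for its arguments.
result = add_numbers(2, 3)
print(result)  # prints 5
```

The indented lines form the body of the function, and return sends the output back to whoever called it.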
Basic data structure in Python
The function np.array() converts a list into an array
Indexing with [] retrieves elements of an array by position, counting from 0
len(): returns the number of elements in the array
np.mean(): mean of the elements in the array
Be careful with NaN values: np.mean() returns NaN if any element is NaN
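The array operations above might look like this in practice. The numbers are made up for illustration, and np.nanmean() is shown as one possible way to handle NaN values, not the only one:

```python
import numpy as np

# Convert a plain Python list into a NumPy array.
values = np.array([2.0, 4.0, 6.0, 8.0])

# Indexing starts at 0; negative positions count from the end.
first = values[0]
last = values[-1]
print(first, last)  # prints 2.0 8.0

# Number of elements and their mean.
n = len(values)
avg = np.mean(values)
print(n, avg)  # prints 4 5.0

# A single NaN makes np.mean() return NaN.
with_missing = np.array([2.0, np.nan, 6.0])
mean_with_nan = np.mean(with_missing)

# np.nanmean() ignores NaN entries when averaging.
mean_ignoring_nan = np.nanmean(with_missing)
print(mean_with_nan, mean_ignoring_nan)  # prints nan 4.0
```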
DataFrames
Tidy data principles
Each row is an observation
Each column is a variable
Each cell contains one value
How do data frames relate to vectors?
pd.DataFrame() creates a data frame
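One way to picture the relationship: each column of a data frame behaves like an array of equal length. The sketch below builds a small data frame from two arrays; the column names and values are hypothetical:

```python
import numpy as np
import pandas as pd

# Two arrays of equal length, one per variable (made-up data).
names = np.array(["Ana", "Ben", "Cy"])
ages = np.array([31, 25, 40])

# Each dictionary key becomes a column name; each array becomes a column.
my_data = pd.DataFrame({"name": names, "age": ages})
print(my_data)

# Pulling a column back out gives a pandas Series, which wraps an array.
age_column = my_data["age"]
print(age_column.to_numpy())  # prints [31 25 40]
```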
DataFrames
Use print(my_data.columns) to print out the features in your dataset.
Use print(my_data) to view some example rows (for large datasets the printout also reports the size); my_data.shape gives the number of rows and columns.
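A short sketch of these inspection steps, using a small made-up data frame named my_data (the column names are hypothetical):

```python
import pandas as pd

# Hypothetical dataset with two features.
my_data = pd.DataFrame({
    "height_cm": [150, 160, 170],
    "weight_kg": [50, 60, 70],
})

# The features (column names) in the dataset.
print(my_data.columns)

# The rows themselves.
print(my_data)

# The size as (number of rows, number of columns).
print(my_data.shape)  # prints (3, 2)

# Only the first two rows, handy for large datasets.
print(my_data.head(2))
```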
Use the matplotlib library to plot data of interest.
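As an illustration, a basic scatter plot could be sketched as follows. This assumes matplotlib has been installed (for example with pip install matplotlib), and the data are made up:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data to plot.
my_data = pd.DataFrame({"x": [1, 2, 3, 4], "y": [2, 4, 6, 8]})

# Create a figure with one set of axes and draw a scatter plot on it.
fig, ax = plt.subplots()
ax.scatter(my_data["x"], my_data["y"])

# Label the axes and add a title so the plot is readable on its own.
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("y versus x")
```

Calling plt.show() afterwards opens the figure in a window, and fig.savefig() with a file name of your choice writes it to disk.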
Use the [] operator to select specific columns
   Sepal.Length  Sepal.Width  Petal.Length  Petal.Width  Factor
1           5.1          3.5           1.4          0.2
2           4.9          3.0           1.4          0.7
3           4.7          3.2           1.3          1.3
4           4.6          3.1           1.5          1.0
5           5.0          3.6           1.4          0.9
6           5.4          3.9           1.7          0.4
Selected column (Petal.Width): 0.2 0.7 1.3 1.0 0.9 0.4
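The selection shown above can be reproduced with a sketch like the following. The data frame name iris_sample is hypothetical, the values are copied from the table, and the Factor column is left out because its values are not shown:

```python
import pandas as pd

# Rebuild the six example rows, with row labels 1 to 6 as in the table.
iris_sample = pd.DataFrame(
    {
        "Sepal.Length": [5.1, 4.9, 4.7, 4.6, 5.0, 5.4],
        "Sepal.Width": [3.5, 3.0, 3.2, 3.1, 3.6, 3.9],
        "Petal.Length": [1.4, 1.4, 1.3, 1.5, 1.4, 1.7],
        "Petal.Width": [0.2, 0.7, 1.3, 1.0, 0.9, 0.4],
    },
    index=range(1, 7),
)

# A single column name inside [] returns that one column.
petal_width = iris_sample["Petal.Width"]
print(petal_width.tolist())  # prints [0.2, 0.7, 1.3, 1.0, 0.9, 0.4]

# A list of names inside [] returns a smaller data frame.
sepals = iris_sample[["Sepal.Length", "Sepal.Width"]]
print(sepals.shape)  # prints (6, 2)
```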
Google and StackOverflow are your best friends
When asking a question online, make sure your code is as simple as possible. Only include the minimum required lines
ChatGPT may be helpful
Stop by Office Hours and ask for help!
Many thanks to Lathan Liou for providing the template for this presentation!