Master Data Wrangling and Visualization Techniques 🔧
Welcome to Week 3! This week, we explore data manipulation using dplyr and the tidyverse ecosystem - essential tools for organizing, cleaning, and visualizing agricultural data. Learn to filter, select, arrange, and transform data efficiently!
Click the "Launch Week 3" button above to start your R environment. This will take 2-5 minutes to load with all necessary packages for data manipulation.
Once Binder loads, you'll see the Jupyter Notebook interface. In the left panel, you'll see:
assignment/ - Assignment 3 on data visualization and analysis
class_activity/ - Week 3 lab tutorial
Click on the class_activity folder to access this week's content.
Inside the class_activity folder, double-click on Week3_Data_Manipulation.ipynb to open the interactive lab notebook.
This week we'll use multiple datasets including the iris dataset and real-world data! The notebook will guide you through:
Use these interactive tools to understand data manipulation concepts before working with R code:
💡 Tip: Use these tools to visualize data manipulation concepts before applying them in your R notebook!
filter(data, condition) # Subset rows
select(data, columns) # Choose columns
slice(data, rows) # Select by position
arrange(data, variable) # Sort data
mutate(data, new_var = ...) # Create variables
group_by(data, variable) # Group for analysis
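As a rough illustration of how these verbs chain together, here is a minimal sketch on the built-in iris dataset. The column choices and the `sepal_ratio` variable are illustrative assumptions, not taken from the lab notebook, and `summarise()` is added here only to show what `group_by()` is typically paired with.

```r
library(dplyr)

# iris ships with R: 150 flowers, 50 per species
setosa_wide <- iris %>%
  filter(Species == "setosa", Sepal.Length > 5) %>%     # subset rows
  select(Species, Sepal.Length, Sepal.Width) %>%        # choose columns
  mutate(sepal_ratio = Sepal.Length / Sepal.Width) %>%  # create a variable
  arrange(desc(sepal_ratio))                            # sort, largest ratio first

head(setosa_wide, 3)

# slice() picks rows by position rather than by condition
first_rows <- slice(iris, 1:5)

# group_by() on its own only marks the groups; summarise() computes per group
species_means <- iris %>%
  group_by(Species) %>%
  summarise(mean_petal_length = mean(Petal.Length), .groups = "drop")

species_means
```

Each step takes a data frame and returns a data frame, which is why the pipe (`%>%`) reads top to bottom like a recipe.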
starts_with("Sepal") # Columns starting with "Sepal"
ends_with("Length") # Columns ending with "Length"
contains("Petal") # Columns containing "Petal"
matches(".*Width") # Regular expression matching
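These helpers are used inside `select()`. The sketch below applies each one to iris, assuming its standard column names (Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, Species); the result names such as `sepal_cols` are made up for illustration.

```r
library(dplyr)

sepal_cols  <- select(iris, starts_with("Sepal"))  # Sepal.Length, Sepal.Width
length_cols <- select(iris, ends_with("Length"))   # Sepal.Length, Petal.Length
petal_cols  <- select(iris, contains("Petal"))     # Petal.Length, Petal.Width
width_cols  <- select(iris, matches(".*Width"))    # Sepal.Width, Petal.Width

names(sepal_cols)
names(width_cols)
```

Selected columns keep the order they have in the original data frame.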
str_replace_all(text, pattern, replacement) # Clean text
na.omit(data) # Remove missing values
as.integer(vector) # Convert data types
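One way these three functions might fit together when cleaning a messy column is sketched below. The `raw` data frame, its column names, and its values are entirely hypothetical and stand in for whatever real-world file you import.

```r
library(stringr)

# Hypothetical messy data: counts stored as text, with stray words and gaps
raw <- data.frame(
  plot_id     = c("A1", "A2", "A3", "A4"),
  plant_count = c("12 plants", "15", NA, "n/a"),
  stringsAsFactors = FALSE
)

# Clean text: drop every character that is not a digit
digits <- str_replace_all(raw$plant_count, "[^0-9]", "")

# Convert data types: strings with no digits left become NA
raw$plant_count <- as.integer(digits)

# Remove missing values: keep only rows with no NA in any column
clean <- na.omit(raw)
clean
```

Note that `na.omit()` drops a whole row if any column is missing, so it is worth checking how many rows remain before continuing.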
From the main directory, click on the assignment folder to access Assignment 3.
Double-click on Assignment3.ipynb to open your assignment on data visualization and analysis.
Filter data by gender and create comparative boxplots
Import, clean, and subset real-world data
Create stem-and-leaf plots and analyze patterns
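The assignment datasets are not listed here, so the following sketch uses iris and treats Species as a stand-in for a grouping column such as gender. It shows the general pattern only (filter to the groups of interest, draw a comparative boxplot, print a stem-and-leaf plot) using base R's `boxplot()` and `stem()`; the assignment itself may ask for a different approach.

```r
library(dplyr)

# Keep two groups only, then drop the unused factor level
two_species <- iris %>%
  filter(Species %in% c("setosa", "virginica")) %>%
  droplevels()

# Comparative boxplot: one box per group on a shared axis
boxplot(Sepal.Length ~ Species, data = two_species,
        main = "Sepal length by species",
        ylab = "Sepal length (cm)")

# Stem-and-leaf plot, printed to the console as text
stem(two_species$Sepal.Length)
```

Without `droplevels()`, the filtered-out species would still appear as an empty category on the boxplot axis.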
The assignment uses multiple datasets:
Learn to handle messy real-world data with non-numeric values and missing information!
⚠️ Important: Binder environments are temporary! Always save your work locally.
When you're done working, save your progress:
To resume your work:
.ipynb file
For Assignment 3, submit TWO files to UC Davis Canvas:
Your completed assignment with all outputs and analysis
Your notebook code as backup
Due Date: Check Canvas for assignment deadline
By the end of this week, you will be able to:
Mohammadreza Narimani
📧 mnarimani@ucdavis.edu
🏫 Department of Biological and Agricultural Engineering, UC Davis
?function_name for help
Click the Binder badge below to launch Week 3!
Happy data wrangling! 🔧🌾