The iris dataset for complex operations on grouped data.
The mtcars dataset for reshaping between long and wide formats.
This will allow us to compare different data manipulation tasks using base R and the tidyverse.
Task 1: Complex Operations on Grouped Data
We will calculate the mean and standard deviation for each measurement (Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) by species.
Base R Solution
# Load the iris dataset
data(iris)

# Base R approach using aggregate() with a formula
mean_sd_base <- aggregate(
  cbind(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) ~ Species,
  data = iris,
  FUN = function(x) c(mean = mean(x), sd = sd(x))
)

# Flatten the matrix columns returned by aggregate() into ordinary columns
mean_sd_base <- do.call(data.frame, mean_sd_base)

# Display the result
mean_sd_base
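For comparison, the same grouped summary can be sketched with the tidyverse. This is a minimal illustration rather than a definitive implementation: it assumes the dplyr package (version 1.0 or later, for across()) is installed, and the object name mean_sd_tidy and the column-name pattern are choices made here for illustration.

```r
library(dplyr)

# Group by species, then compute the mean and standard deviation
# of each of the four measurement columns in a single summarise() call.
mean_sd_tidy <- iris %>%
  group_by(Species) %>%
  summarise(
    across(
      c(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width),
      list(mean = mean, sd = sd),
      .names = "{.col}.{.fn}"  # e.g. Sepal.Length.mean, Sepal.Length.sd
    ),
    .groups = "drop"
  )

# Display the result: one row per species, two summary columns per measurement
mean_sd_tidy
```

The .names pattern is chosen so the column names resemble those produced by the flattened aggregate() result, which makes the two outputs easier to compare side by side.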
Task 2: Reshaping Between Long and Wide Formats
We will reshape the mtcars dataset into a long format, where each measurement is recorded in its own row for each car model, and then convert it back into a wide format.
Base R Solution
# Load the mtcars dataset
data(mtcars)

# Add car names as a column instead of row names
mtcars$car <- rownames(mtcars)

# Base R approach to long format
mtcars_long_base <- reshape(
  mtcars,
  idvar = "car",
  varying = names(mtcars)[1:11],
  v.names = "value",
  timevar = "variable",
  times = names(mtcars)[1:11],
  direction = "long"
)

# Back to wide format
mtcars_wide_base <- reshape(
  mtcars_long_base,
  idvar = "car",
  timevar = "variable",
  direction = "wide"
)

# Display results
head(mtcars_long_base)
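A tidyverse counterpart to the reshape() calls above might look like the following sketch. It assumes the tidyr and tibble packages are installed. The object names mtcars_tbl, mtcars_long_tidy, and mtcars_wide_tidy are illustrative, and the code starts from a fresh copy of datasets::mtcars so it does not depend on the car column added earlier.

```r
library(tidyr)
library(tibble)

# Start from the original dataset and move row names into a "car" column
mtcars_tbl <- rownames_to_column(datasets::mtcars, var = "car")

# Long format: one row per car and measured variable
mtcars_long_tidy <- pivot_longer(
  mtcars_tbl,
  cols = -car,
  names_to = "variable",
  values_to = "value"
)

# Back to wide format: one row per car, one column per variable
mtcars_wide_tidy <- pivot_wider(
  mtcars_long_tidy,
  names_from = variable,
  values_from = value
)

# Display results
head(mtcars_long_tidy)
```

Unlike reshape(), pivot_wider() restores the original column names (mpg, cyl, and so on) rather than prefixing them with "value.".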
Learning curve: Users new to functional programming might need time to adapt.
May not cover every niche use case: Highly specific transformations might need workarounds.
Summary
We compared base R and tidyverse methods for complex grouped operations and reshaping data between long and wide formats. The tidyverse offers a more readable and efficient approach, particularly for grouped data and reshaping tasks. Base R remains useful for cases where fine control over transformations is needed, but it can be more verbose and complex for users unfamiliar with its syntax.