# Premise

During my time as a teaching assistant for 36-202, the course moved away from conducting analysis in Minitab toward rudimentary analysis and visualization in R. 36-202 is an introductory statistics course, typically completed in a student’s second semester, that builds upon the EDA and simple linear regression techniques explored in 36-200.

# Process

Given the intended users, second-semester freshmen with little to no programming experience, I prioritized usability and simplicity. At the time I wrote cmu202, introductory Carnegie Mellon computer science courses were typically taught in Python. I therefore assumed that students who entered 36-202 with programming knowledge would most likely be familiar with Pythonic syntax, a stark contrast to the more rigid nature of R.

In consultation with Gordon Weinberg, Kayla Frisoli and Rebecca Nugent, I examined the existing R functionality, identifying what we felt introductory students could understand and documenting what would be needlessly difficult. The functionality implemented was as follows:

**Data processing and loading:** Loading data into R requires a rudimentary knowledge of file storage formats, relative and absolute paths, and `read.XYZ` functions. All needed datasets can be loaded with a call of `data("DATASET_NAME")`. As the course progressed, students gradually moved to `read.csv` and similar functions. The presence of simple data-loading procedures reduces the cognitive load of data processing and allows students to focus on other skills within R.

**Outlier Calculation:** A secondary goal of the course is to familiarize students with basic R functionality. This function allows for easy $k \times \text{IQR}(\text{array})$ rule calculations. It provides a quick way to calculate a commonly used metric and can be combined with basic data manipulation techniques to enhance EDA displays.

**Cross Means:** This functionality implements a simple way to calculate a $2 \times 2$ table of means for 2-way ANOVA. Students struggled with `aov` functionality and attempted to minimize its use.
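
The outlier rule and the table of cell means described above can both be sketched in a few lines of base R. This is only an illustration of the underlying calculations, not the cmu202 implementation: the helper name `iqr_outliers`, its arguments, and the `study` data frame are made up for this example.

```r
# Hypothetical helper (name and arguments are illustrative, not from cmu202):
# flag values lying more than k * IQR below Q1 or above Q3.
iqr_outliers <- function(x, k = 1.5) {
  x <- x[!is.na(x)]                                   # ignore missing values
  q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)
  fence <- k * (q[2] - q[1])                          # k times the IQR
  lower <- q[1] - fence
  upper <- q[2] + fence
  list(lower = lower, upper = upper,
       outliers = x[x < lower | x > upper])
}

values <- c(2, 4, 4, 5, 6, 7, 8, 30)
result <- iqr_outliers(values)
result$outliers

# Made-up data for a 2 x 2 design: two factors with two levels each.
study <- data.frame(
  fertilizer = rep(c("A", "B"), each = 4),
  water      = rep(c("low", "high"), times = 4),
  growth     = c(10, 14, 12, 16, 20, 22, 18, 24)
)

# Table of cell means: rows are fertilizer levels, columns are water levels.
cell_means <- with(study, tapply(growth, list(fertilizer, water), mean))
cell_means
```

With these values, the only point outside the fences is 30, and `cell_means` is a matrix indexed by the two factor levels, for example `cell_means["A", "low"]`. Keeping the calculation in `tapply` means students can read the group means directly, without first interpreting `aov` output.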

# Skills Used

- R Package Development
- Namespace Management
- Function Documentation
- Repository Management
- Remote Installation: I installed updated versions of the package on a custom CMU Statistics R server for students to use
- Pedagogy
- Elementary statistical functionality