1-on-1 R Lessons For Data Science/Analytics

Overview

Because every lesson is taught 1-on-1, only a limited number of spots are available.

This course consists of 12 one-hour lessons designed to teach you everything you need to get started as a Data Scientist or Data Analyst using R. You’ll learn the essential foundations, the key packages you’ll use daily on the job, and a selection of more advanced topics to set you apart.

My goal is to prepare you more thoroughly than a typical university course, coding bootcamp, or online certification. Every lesson will be delivered in a 1-on-1 format, giving you my full attention. You’ll write real R code, work through hands-on exercises, and have the opportunity to ask questions at any time.

While I cannot guarantee job offers upon completing the course, I can guarantee that you will gain the practical skills, experience, and portfolio projects necessary to strengthen your resume and confidently take the next steps toward a career in data science and analytics.

One often-overlooked strength of R is its ability to easily generate professional reports. In fact, this entire document was created using simple R code, and by the end of the course you’ll be able to create polished reports like this yourself.

Lessons are taught in English, the only language I speak, so you must be able to speak it. You also need your own laptop or desktop computer and must be able to share your screen. A second monitor is strongly recommended so you can watch my shared screen and write your own code at the same time.

Pricing

This course costs $1,200 USD ($100 per lesson).

Lesson Plan

(The following lesson plan and basic code examples do not cover everything you will learn. Lessons will be adjusted to match your skill level and goals.)

Lesson 1

Tour of RStudio
Variables and data types
Vectors and basic operations
Creating and manipulating objects

## Creating simple vectors
body_weight = c(90, 100, 60, 89, 80)
height = c(60, 70, 66, 72, 68)

## Finding the mean body weight
m_bw = mean(body_weight)

print(m_bw)
## [1] 83.8

Lesson 2

Data frame structure and indexing
Accessing data and creating new columns
Filtering using base R
Operations on data frames
Basic visualizations
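
As a rough illustration of the topics above, here is a minimal sketch using a small made-up data frame. The names and values are hypothetical and chosen only for the example; the actual lesson material may differ.

```r
## Build a small data frame (made-up values, for illustration only)
people = data.frame(
  name = c("Ana", "Ben", "Cara", "Dev", "Eli"),
  body_weight = c(90, 100, 60, 89, 80),
  height = c(60, 70, 66, 72, 68)
)

## Inspect the structure: 5 rows, 3 columns
str(people)

## Index by position (row 2, column 1) and by column name
people[2, 1]
people$height

## Create a new column from existing ones
people$weight_per_height = people$body_weight / people$height

## Filter rows with base R: keep people whose height is above 66
tall = people[people$height > 66, ]
print(tall)

## Operate on a column, then draw a basic scatter plot
mean(people$height)
plot(people$height, people$body_weight,
     xlab = "Height", ylab = "Body weight")
```

The comma in `people[people$height > 66, ]` matters: the condition before it selects rows, and leaving the part after it empty keeps every column.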

Lesson 3

Importing data
Exporting data
Import/export all kinds of formats

## Read in our .csv file of titanic survival data
df = read.csv('data/Titanic-Dataset.csv')

## Filter for only survivors
df_surv = df[df$Survived == 1,]
table(df_surv$Sex)
## 
## female   male 
##    233    109

## Write our data to a .csv
write.csv(df_surv, "output/survivors.csv", row.names = FALSE)

Lesson 4

dplyr (one of the most widely used R packages)
filter(), select(), mutate(), arrange()
Pipe operator %>%
Grouped summaries

library(dplyr)

## Filter our data for survivors only
## Select the variables of interest
## Group our data by Sex
## Create a summary data.frame of mean age
surv_age = df %>%
  filter(Survived == 1) %>%
  select(Pclass, Sex, Age, Survived) %>%
  group_by(Sex) %>%
  summarise(mean_age = mean(Age, na.rm = TRUE))

## Check results
surv_age
## # A tibble: 2 × 2
##   Sex    mean_age
##   <chr>     <dbl>
## 1 female     28.8
## 2 male       27.3

Lesson 5

Data cleaning
Handling missing values
Renaming columns
Converting data types
case_when()

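## Count the missing values in each column
colSums(is.na(df))
## PassengerId    Survived      Pclass        Name         Sex         Age 
##           0           0           0           0           0         177 
##       SibSp       Parch      Ticket        Fare       Cabin    Embarked 
##           0           0           0           0           0           0

## Replace missing ages with 0 as a simple placeholder
df$Age[is.na(df$Age)] = 0

## Confirm that no missing values remain
colSums(is.na(df))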
## PassengerId    Survived      Pclass        Name         Sex         Age 
##           0           0           0           0           0           0 
##       SibSp       Parch      Ticket        Fare       Cabin    Embarked 
##           0           0           0           0           0           0
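
The renaming, type conversion, and case_when() topics from this lesson could look something like the sketch below. It uses a tiny made-up data frame rather than the Titanic file, and the column names, values, and age cut-offs are assumptions chosen only for illustration.

```r
library(dplyr)

## Hypothetical passenger records (made-up values, not the Titanic file)
passengers = data.frame(
  Pclass = c(1, 3, 2, 3),
  Age = c(38, NA, 8, 71),
  Fare = c("71.28", "7.25", "13.00", "8.05")
)

cleaned = passengers %>%
  ## Rename a column using new_name = old_name
  rename(ticket_class = Pclass) %>%
  mutate(
    ## Convert types: character to numeric, numeric to factor
    Fare = as.numeric(Fare),
    ticket_class = as.factor(ticket_class),
    ## Label age groups; conditions are checked in order, first match wins
    age_group = case_when(
      is.na(Age) ~ "unknown",
      Age < 18 ~ "child",
      Age < 65 ~ "adult",
      TRUE ~ "senior"
    )
  )

cleaned
```

Putting the `is.na(Age)` condition first gives missing ages an explicit label instead of letting them fall through to the final catch-all branch.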

Lesson 6

Reshaping data
pivot_longer() and pivot_wider()
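
To give a feel for reshaping, here is a minimal sketch with the tidyr package. The data frame and its column names are hypothetical, made up purely for this example.

```r
library(tidyr)

## Hypothetical wide data: one row per student, one column per test
scores_wide = data.frame(
  student = c("Ana", "Ben"),
  test_1 = c(80, 65),
  test_2 = c(90, 70)
)

## Wide to long: one row per student/test combination
scores_long = pivot_longer(
  scores_wide,
  cols = c(test_1, test_2),
  names_to = "test",
  values_to = "score"
)
scores_long

## Long back to wide: one column per test again
scores_back = pivot_wider(
  scores_long,
  names_from = test,
  values_from = score
)
scores_back
```

Long format is usually what plotting and grouped summaries expect, while wide format tends to be easier to read in a report table.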

Lesson 7

Visuals with ggplot2
Learning the syntax
Different kinds of plots
Aesthetics
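
A small sketch of the ggplot2 syntax, assuming the built-in mtcars dataset that ships with R (not the course data). The title and styling choices are only examples.

```r
library(ggplot2)

## mtcars ships with R: wt is weight (1000 lbs), mpg is miles per gallon
p = ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point(size = 3) +
  labs(
    title = "Heavier cars tend to get fewer miles per gallon",
    x = "Weight (1000 lbs)",
    y = "Miles per gallon",
    colour = "Cylinders"
  ) +
  theme_minimal()

## Printing the plot object draws it
print(p)
```

Each `+` adds a layer: aes() maps columns to visual properties, geom_point() chooses the kind of plot, and labs() and theme_minimal() control the look.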

Lesson 8

Exploratory data analysis (EDA)
Visual checks
Correlation and distribution insights
Finding outliers and patterns

##  [1] "PassengerId" "Survived"    "Pclass"      "Name"        "Sex"        
##  [6] "Age"         "SibSp"       "Parch"       "Ticket"      "Fare"       
## [11] "Cabin"       "Embarked"

## 'data.frame':    891 obs. of  12 variables:
##  $ PassengerId: int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Survived   : int  0 1 1 1 0 0 0 0 1 1 ...
##  $ Pclass     : int  3 1 3 1 3 3 1 3 3 2 ...
##  $ Name       : chr  "Braund, Mr. Owen Harris" "Cumings, Mrs. John Bradley (Florence Briggs Thayer)" "Heikkinen, Miss. Laina" "Futrelle, Mrs. Jacques Heath (Lily May Peel)" ...
##  $ Sex        : chr  "male" "female" "female" "female" ...
##  $ Age        : num  22 38 26 35 35 0 54 2 27 14 ...
##  $ SibSp      : int  1 1 0 1 0 0 0 3 0 1 ...
##  $ Parch      : int  0 0 0 0 0 0 0 1 2 0 ...
##  $ Ticket     : chr  "A/5 21171" "PC 17599" "STON/O2. 3101282" "113803" ...
##  $ Fare       : num  7.25 71.28 7.92 53.1 8.05 ...
##  $ Cabin      : chr  "" "C85" "" "C123" ...
##  $ Embarked   : chr  "S" "C" "S" "S" ...

## Survived
##   0   1 
## 549 342

## Sex
## female   male 
##    314    577

##         Survived
## Sex        0   1
##   female  81 233
##   male   468 109
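
The correlation, distribution, and outlier checks listed for this lesson could be sketched as follows. This example assumes the built-in mtcars dataset rather than the Titanic data, and the 1.5 * IQR rule is just one common convention for flagging outliers.

```r
## Distribution summary of miles per gallon
summary(mtcars$mpg)

## Correlation between weight and fuel economy (strongly negative)
cor(mtcars$wt, mtcars$mpg)

## Visual checks: a histogram and a box plot
hist(mtcars$mpg, main = "Distribution of mpg", xlab = "Miles per gallon")
boxplot(mtcars$hp, main = "Horsepower")

## Flag unusually high horsepower with the 1.5 * IQR rule
upper_fence = quantile(mtcars$hp, 0.75) + 1.5 * IQR(mtcars$hp)
outliers = mtcars[mtcars$hp > upper_fence, ]
rownames(outliers)
```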

Lesson 9

Basic modeling
Predictive models (machine learning)
Train/test data
Assessing predictions

# Modify some variables
df$Sex = as.factor(df$Sex)
df$Pclass = as.factor(df$Pclass)

# Remove all rows that contain an NA
df_no_na = na.omit(df)

# Build basic logistic regression model
# (family = binomial is what makes glm() fit a logistic regression)
model = glm(Survived ~ Pclass + Sex + Age, data = df_no_na, family = binomial)

# Check summary
summary(model)

# Make predictions
predicted_probs = predict(model, type = 'response')
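
The train/test and prediction-assessment topics could be sketched like this. To keep the example self-contained it uses simulated data, so the variable names, the 80/20 split, and the 0.5 cut-off are all assumptions for illustration rather than part of the Titanic example.

```r
set.seed(42)

## Simulated data (hypothetical): the outcome y depends on one predictor x
n = 200
x = rnorm(n)
y = rbinom(n, size = 1, prob = plogis(1.5 * x))
sim = data.frame(x = x, y = y)

## Hold out 20% of the rows as a test set
test_rows = sample(n, size = 0.2 * n)
train_df = sim[-test_rows, ]
test_df = sim[test_rows, ]

## Fit logistic regression on the training data only
fit = glm(y ~ x, data = train_df, family = binomial)

## Predict probabilities for unseen rows, then convert to 0/1 classes
test_probs = predict(fit, newdata = test_df, type = "response")
test_pred = as.integer(test_probs > 0.5)

## Assess predictions: confusion matrix and accuracy
table(predicted = test_pred, actual = test_df$y)
accuracy = mean(test_pred == test_df$y)
accuracy
```

Scoring the model on rows it never saw during fitting gives a more honest estimate of how it would perform on new data than scoring it on the training rows.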

Lesson 10

Creating reports
Display your results and tell the story of the data
RShiny introduction

Lesson 11/12

Either building RShiny web apps or learning xgboost machine learning