Since this is a 1-on-1 format, very limited spots are available.
This course consists of 12 one-hour lessons designed to teach you everything you need to get started as a Data Scientist or Data Analyst using R. You’ll learn the essential foundations, the key packages you’ll use daily on the job, and a selection of more advanced topics to set you apart.
My goal is to prepare you more thoroughly than a typical university course, coding bootcamp, or online certification. Every lesson will be delivered in a 1-on-1 format, giving you my full attention. You’ll write real R code, work through hands-on exercises, and have the opportunity to ask questions at any time.
While I cannot guarantee job offers upon completing the course, I can guarantee that you will gain the practical skills, experience, and portfolio projects necessary to strengthen your resume and confidently take the next steps toward a career in data science and analytics.
One often-overlooked strength of R is its ability to easily generate professional reports. In fact, this entire document was created using simple R code, and by the end of the course you'll be able to create polished reports like this yourself.
You must be able to speak English, as it is the only language I speak. You also need your own laptop or computer and the ability to share your screen. A second monitor is strongly recommended so you can watch my shared screen and write your own code at the same time.
This course costs $1,200 USD ($100 per lesson).
(The following lesson plan and basic code examples do not cover everything you will learn. Lessons will be adjusted to match your skill level and goals.)
## Creating simple vectors
body_weight = c(90, 100, 60, 89, 80)
height = c(60, 70, 66, 72, 68)
## Finding the mean body weight
m_bw = mean(body_weight)
print(m_bw)
## [1] 83.8
## Read in our .csv file of titanic survival data
df = read.csv('data/Titanic-Dataset.csv')
## Filter for only survivors
df_surv = df[df$Survived == 1,]
table(df_surv$Sex)
##
## female male
## 233 109
## Write our data to a .csv
write.csv(df_surv, "output/survivors.csv", row.names = FALSE)
filter(), select(), mutate(), arrange()
%>%
library(dplyr)
## Filter our data for survivors only
## Select the variables of interest
## Group our data by Sex
## Create a summary data.frame of mean age
surv_age = df %>%
filter(Survived == 1) %>%
select(Pclass, Sex, Age, Survived) %>%
group_by(Sex) %>%
summarise(mean_age = mean(Age, na.rm = TRUE))
## Check results
surv_age
## # A tibble: 2 × 2
## Sex mean_age
## <chr> <dbl>
## 1 female 28.8
## 2 male 27.3
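The pipeline above uses filter() and select(), but not mutate() or arrange(). Here is a minimal sketch of those two verbs. The `passengers` data frame and the `family_size` column are made up for illustration (they only mimic the SibSp and Parch columns of the Titanic data), so treat this as one possible example rather than the exact lesson material.

```r
library(dplyr)

## Hypothetical toy data, not the real Titanic file
passengers = data.frame(
  Name  = c("A", "B", "C"),
  SibSp = c(1, 0, 3),
  Parch = c(0, 2, 1)
)

## mutate() adds a new column computed from existing ones
## arrange() sorts the rows; desc() puts the largest values first
result = passengers %>%
  mutate(family_size = SibSp + Parch + 1) %>%
  arrange(desc(family_size))

result
```

Each verb takes a data frame and returns a data frame, which is why they chain cleanly with %>%.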
case_when()
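As a rough idea of what case_when() does, the sketch below sorts ages into groups. The data, the cut-offs, and the `age_group` labels are assumptions chosen for illustration only.

```r
library(dplyr)

## Hypothetical toy data standing in for the Age column
passengers = data.frame(
  Name = c("A", "B", "C", "D"),
  Age  = c(4, 15, 34, NA)
)

## case_when() checks each condition from top to bottom
## and uses the first one that is TRUE for each row
passengers = passengers %>%
  mutate(age_group = case_when(
    is.na(Age) ~ "Unknown",
    Age < 13   ~ "Child",
    Age < 18   ~ "Teen",
    TRUE       ~ "Adult"
  ))

passengers
```

Putting the is.na() check first keeps missing ages from falling through to the catch-all "Adult" label.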
colSums(is.na(df))
## PassengerId Survived Pclass Name Sex Age
## 0 0 0 0 0 177
## SibSp Parch Ticket Fare Cabin Embarked
## 0 0 0 0 0 0
df$Age[is.na(df$Age)] = 0
colSums(is.na(df))
## PassengerId Survived Pclass Name Sex Age
## 0 0 0 0 0 0
## SibSp Parch Ticket Fare Cabin Embarked
## 0 0 0 0 0 0
pivot_longer() and pivot_wider()
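To give a feel for reshaping, here is a small sketch using the tidyr package. The `wide` table and its counts are invented for illustration and are not taken from the Titanic data.

```r
library(tidyr)

## Hypothetical wide table: one row per class, one column per sex
wide = data.frame(
  Pclass = c(1, 2, 3),
  female = c(10, 20, 30),
  male   = c(5, 15, 25)
)

## pivot_longer() stacks the female/male columns into Sex/count pairs
long = pivot_longer(wide, cols = c(female, male),
                    names_to = "Sex", values_to = "count")

## pivot_wider() reverses the operation
wide_again = pivot_wider(long, names_from = Sex, values_from = count)

long
wide_again
```

The long layout is usually the one that plotting and grouping functions expect, while the wide layout is often easier to read in a report.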
ggplot2
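A minimal ggplot2 sketch, using the survivor counts from the table(df_surv$Sex) output shown earlier. The `counts` data frame and the chart labels are my own choices for illustration.

```r
library(ggplot2)

## Survivor counts copied from the earlier table() output
counts = data.frame(
  Sex       = c("female", "male"),
  survivors = c(233, 109)
)

## Build a bar chart layer by layer
p = ggplot(counts, aes(x = Sex, y = survivors)) +
  geom_col() +
  labs(title = "Survivors by sex", x = "Sex", y = "Number of survivors")
```

Running `print(p)` (or just typing `p` in the console) draws the chart.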
## [1] "PassengerId" "Survived" "Pclass" "Name" "Sex"
## [6] "Age" "SibSp" "Parch" "Ticket" "Fare"
## [11] "Cabin" "Embarked"
## 'data.frame': 891 obs. of 12 variables:
## $ PassengerId: int 1 2 3 4 5 6 7 8 9 10 ...
## $ Survived : int 0 1 1 1 0 0 0 0 1 1 ...
## $ Pclass : int 3 1 3 1 3 3 1 3 3 2 ...
## $ Name : chr "Braund, Mr. Owen Harris" "Cumings, Mrs. John Bradley (Florence Briggs Thayer)" "Heikkinen, Miss. Laina" "Futrelle, Mrs. Jacques Heath (Lily May Peel)" ...
## $ Sex : chr "male" "female" "female" "female" ...
## $ Age : num 22 38 26 35 35 0 54 2 27 14 ...
## $ SibSp : int 1 1 0 1 0 0 0 3 0 1 ...
## $ Parch : int 0 0 0 0 0 0 0 1 2 0 ...
## $ Ticket : chr "A/5 21171" "PC 17599" "STON/O2. 3101282" "113803" ...
## $ Fare : num 7.25 71.28 7.92 53.1 8.05 ...
## $ Cabin : chr "" "C85" "" "C123" ...
## $ Embarked : chr "S" "C" "S" "S" ...
## Survived
## 0 1
## 549 342
## Sex
## female male
## 314 577
## Survived
## Sex 0 1
## female 81 233
## male 468 109
# Modify some variables
df$Sex = as.factor(df$Sex)
df$Pclass = as.factor(df$Pclass)
# Remove all rows that contain an NA
df_no_na = na.omit(df)
# Build basic logistic regression model
model = glm(Survived ~ Pclass + Sex + Age, data = df_no_na, family = binomial)
# Check summary
summary(model)
# Make predictions
predicted_probs = predict(model, type = 'response')
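Predicted probabilities are usually turned into yes/no predictions and compared with the real outcomes. The sketch below shows one common way to do that. The probabilities, the outcomes, and the 0.5 cut-off are all made up for illustration and do not come from the model above.

```r
## Hypothetical predicted probabilities and true outcomes
predicted_probs = c(0.91, 0.12, 0.65, 0.40, 0.08)
actual          = c(1, 0, 1, 1, 0)

## Classify as survived (1) when the probability is above 0.5
predicted_class = ifelse(predicted_probs > 0.5, 1, 0)

## Confusion matrix: predictions in rows, real outcomes in columns
confusion = table(Predicted = predicted_class, Actual = actual)

## Accuracy: share of predictions that match the real outcome
accuracy = mean(predicted_class == actual)

confusion
accuracy
```

The 0.5 cut-off is only a default starting point; a different threshold may suit a problem where one kind of mistake is more costly than the other.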