Rows: 891 Columns: 12
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (5): Name, Sex, Ticket, Cabin, Embarked
dbl (7): PassengerId, Survived, Pclass, Age, SibSp, Parch, Fare
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(df_titanic)
# A tibble: 6 × 12
  PassengerId Survived Pclass Name    Sex     Age SibSp Parch Ticket  Fare Cabin
        <dbl>    <dbl>  <dbl> <chr>   <chr> <dbl> <dbl> <dbl> <chr>  <dbl> <chr>
1           1        0      3 Braund… male     22     1     0 A/5 2…  7.25 <NA>
2           2        1      1 Cuming… fema…    38     1     0 PC 17… 71.3  C85
3           3        1      3 Heikki… fema…    26     0     0 STON/…  7.92 <NA>
4           4        1      1 Futrel… fema…    35     1     0 113803 53.1  C123
5           5        0      3 Allen,… male     35     0     0 373450  8.05 <NA>
6           6        0      3 Moran,… male     NA     0     0 330877  8.46 <NA>
# ℹ 1 more variable: Embarked <chr>
Exercise 1: Does Age predict Survived?
Build a logistic regression model predicting Survived from Age.
Is the coefficient for Age significant? If so, what does that mean?
Interpret the intercept and slope coefficient. What do they correspond to?
According to this model, what is the probability that a 20-year-old person would survive? What about a 30-year-old person?
What is the AIC of this model?
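The steps in this exercise follow a common pattern in R. The sketch below walks through that pattern on a small simulated data frame rather than on df_titanic, so it does not give the answers away. The names toy, x, y, and fit are hypothetical, and the simulated coefficients are arbitrary.

```r
# Simulated data (hypothetical): a binary outcome y and a numeric predictor x
set.seed(1)
toy <- data.frame(x = runif(200, min = 0, max = 80))
toy$y <- rbinom(200, size = 1, prob = plogis(0.5 - 0.02 * toy$x))

# Logistic regression: family = binomial uses the logit link by default
fit <- glm(y ~ x, data = toy, family = binomial)

# Coefficient table: estimates, standard errors, z values, and p-values
# (estimates are on the log-odds scale)
summary(fit)$coefficients

# Predicted probability at x = 20, computed by hand from the coefficients ...
b <- coef(fit)
p_manual <- plogis(b[["(Intercept)"]] + b[["x"]] * 20)

# ... and with predict(); type = "response" returns probabilities
p_predict <- predict(fit, newdata = data.frame(x = c(20, 30)), type = "response")

# AIC of the fitted model
AIC(fit)
```

The by-hand calculation and predict() should agree. plogis() is the inverse of the logit, so it converts a value on the log-odds scale into a probability.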
Exercise 2: Does Pclass predict Survived?
Build a logistic regression model predicting Survived from Pclass (passenger class).
Is the coefficient for Pclass significant? If so, what does that mean?
Interpret the intercept and slope coefficient. What do they correspond to?
According to this model, what is the probability that a person in first class would survive? What about third class?
What is the AIC of this model?
Exercise 3: Does Sex predict Survived?
Build a logistic regression model predicting Survived from Sex.
Is the coefficient for Sex significant? If so, what does that mean?
Interpret the intercept and slope coefficient. What do they correspond to?
According to this model, what is the probability that a person listed as female would survive? What about male?
What is the AIC of this model?
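When the predictor is a character or factor column, R dummy-codes it: the alphabetically first level becomes the reference level, the intercept is the log-odds for that level, and the remaining coefficient is the difference in log-odds for the other level. The sketch below illustrates this on simulated data; toy, group, and the levels "a" and "b" are hypothetical stand-ins, not columns of df_titanic.

```r
# Simulated data (hypothetical): binary outcome y, two-level character predictor
set.seed(2)
toy <- data.frame(group = rep(c("a", "b"), each = 100))
toy$y <- rbinom(200, size = 1, prob = ifelse(toy$group == "a", 0.7, 0.3))

fit <- glm(y ~ group, data = toy, family = binomial)

# "a" is the reference level, so the coefficients are
# named "(Intercept)" and "groupb"
b <- coef(fit)

# Intercept alone -> probability for the reference level
p_a <- plogis(b[["(Intercept)"]])

# Intercept + coefficient -> probability for the other level
p_b <- plogis(b[["(Intercept)"]] + b[["groupb"]])

# With a single categorical predictor, these match the observed proportions
obs <- tapply(toy$y, toy$group, mean)
obs
```

Comparing p_a and p_b with the observed proportions is a quick check that the coefficients have been read correctly.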
Exercise 4: Compare the single-variable models.
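In Exercises 1-3, you calculated the AIC of each of these models. Now let’s compare those AIC values. Which is lowest? What does that tell us about the ability of each predictor to account for whether someone Survived? Keep in mind that Age contains missing values (passenger 6 in the preview above, for example) and glm() drops incomplete rows by default, so the Age model is fit to fewer passengers; AIC values are only directly comparable between models fit to the same rows.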
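One way to set up such a comparison is sketched below on simulated data. It restricts every model to the same complete rows first, since glm() otherwise drops rows with missing values model by model. The names toy, x1, x2, and toy_complete are hypothetical.

```r
# Simulated data (hypothetical) with some missing values in x1
set.seed(3)
toy <- data.frame(x1 = rnorm(300), x2 = rnorm(300))
toy$y <- rbinom(300, size = 1, prob = plogis(1.5 * toy$x1 + 0.2 * toy$x2))
toy$x1[sample(300, 40)] <- NA

# Keep only rows that are complete for every variable being compared,
# so each model is fit to the same observations
toy_complete <- toy[complete.cases(toy), ]

fit_x1 <- glm(y ~ x1, data = toy_complete, family = binomial)
fit_x2 <- glm(y ~ x2, data = toy_complete, family = binomial)

# AIC() accepts several models and returns a data frame
# with one row per model and the columns df and AIC
aic_table <- AIC(fit_x1, fit_x2)

# Sort so the model with the lowest AIC comes first
aic_table[order(aic_table$AIC), ]
```

Lower AIC indicates a better trade-off between fit and number of parameters, but only relative to the other models in the same table.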
Exercise 5: Build a multivariable model.
Now let’s combine those three variables in a single model predicting Survived.
Do any of the coefficients change? How so and how much?
Use the broom package (and ggplot2) to visualize the coefficients with their standard errors.
What is the AIC of this new model?
Write out the linear equation corresponding to this model to help think through what it means.
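A minimal sketch of the coefficient plot, assuming the broom and ggplot2 packages are installed. It uses simulated data with hypothetical names (toy, x1, x2, group), so the plot it draws says nothing about the Titanic data. The error bars span one standard error on either side of each estimate.

```r
library(broom)
library(ggplot2)

# Simulated data (hypothetical): two numeric predictors and one two-level group
set.seed(4)
toy <- data.frame(
  x1 = rnorm(300),
  x2 = rnorm(300),
  group = sample(c("a", "b"), 300, replace = TRUE)
)
toy$y <- rbinom(
  300, size = 1,
  prob = plogis(0.3 + 0.8 * toy$x1 - 0.5 * toy$x2 + 1.0 * (toy$group == "b"))
)

fit <- glm(y ~ x1 + x2 + group, data = toy, family = binomial)

# tidy() returns one row per term with the columns
# term, estimate, std.error, statistic, and p.value
coefs <- tidy(fit)

# Point estimates with +/- 1 standard error bars (log-odds scale)
p <- ggplot(coefs, aes(x = term, y = estimate)) +
  geom_point() +
  geom_errorbar(
    aes(ymin = estimate - std.error, ymax = estimate + std.error),
    width = 0.2
  ) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  coord_flip() +
  labs(x = "Term", y = "Estimate (log-odds)")

p
```

For this toy model the linear equation has the form log(p / (1 - p)) = b0 + b1 * x1 + b2 * x2 + b3 * groupb, where groupb is 1 for rows in group "b" and 0 otherwise. The equation for the Titanic model follows the same shape, with one term per row of tidy() output.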