I saw a post recently about the likelihood of a baseball team winning based on runs, hits, and other baseball statistics. I liked the idea and thought of applying it to college football. In particular, I’m interested in whether scoring more points or having a stout defense improves the likelihood of becoming bowl eligible. Using data scraped from the cfbDatawarehouse, I estimate how likely a team is to be bowl eligible based on the number of points it scores.
I often see graphs that are poorly implemented in that they do not achieve their goal. One type I see often is the dodged bar chart. Here is an example of a dodged bar chart summarizing the number of All-Star players by team (focusing specifically on the AL Central division) and year, using the Lahman R package:

    library(Lahman)
    library(dplyr)
    library(ggplot2)
    library(RColorBrewer)

    # Flag each All-Star selection so selections can be summed per team and year
    AllstarFull$selected <- 1

    # Count AL Central All-Stars by team and year, from 2007 onward
    numAS <- AllstarFull %>%
      filter(yearID > 2006, lgID == 'AL',
             teamID %in% c('MIN', 'CLE', 'DET', 'CHA', 'KCA')) %>%
      group_by(teamID, yearID) %>%
      summarise(number = sum(selected))

    # Dodged bar chart: one bar per year within each team
    b <- ggplot(numAS, aes(x = teamID, y = number, fill = factor(yearID))) +
      theme_bw()
    b + geom_bar(stat = "identity", position = "dodge") +
      scale_fill_brewer("Year", palette = "Dark2")

Note: If the team IDs in the graph above look like typos, they are the IDs used in the Lahman data: CHA is the Chicago White Sox (more commonly abbreviated CHW) and KCA is the Kansas City Royals (KCR).