Week 9: Visualizations

Hi everyone!

This week we learned about three visualization systems in R: base graphics, lattice, and ggplot2. Base graphics includes functions such as plot() for scatter and line plots, barplot() for bar plots, and hist() for histograms, and lets us customize symbols, line types, and colors. The lattice package uses xyplot(y ~ x) for scatter plots and bwplot() for box plots. Its main advantage is multi-conditioning: it can split a plot into many panels in one output, which makes it easier to convey additional information. The ggplot2 package builds a plot in layers and is structured as ggplot(dataframe, aes(x, y)) + geom_*().
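
To make the three call styles concrete, here is a minimal sketch using the built-in mtcars data rather than my backpack data. The variable choices (wt, mpg, cyl) and the object names lat and gg are only for illustration, and it assumes the lattice and ggplot2 packages are installed.

```r
library(lattice)
library(ggplot2)

# Base graphics: each call draws straight onto the active device.
plot(mtcars$wt, mtcars$mpg, pch = 17, col = "steelblue",
     xlab = "Weight", ylab = "Miles per gallon")
hist(mtcars$mpg, main = "Histogram of mpg", xlab = "Miles per gallon")

# lattice: formula interface; "| factor(cyl)" conditions on cylinder count,
# giving one panel per group.
lat <- xyplot(mpg ~ wt | factor(cyl), data = mtcars)
print(lat)

# ggplot2: data and aesthetics first, then a geom layer added with "+".
gg <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()
print(gg)
```

Note that lattice and ggplot2 return plot objects that are drawn when printed, while the base functions draw as soon as they are called.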

Choose any data set for your visualization from Vincent Arel-Bundock's dataset list: https://vincentarelbundock.github.io/Rdatasets/datasets.html
Using this data, generate three types of visualization on the data set you have chosen. In your blog, present and discuss the three visualizations you create and express your opinion on these three different types of graphics output.

I used the Weights of College Student Backpacks dataset. The two variables that interested me most for explaining differences in backpack weight were gender and year of college. Since both of these variables are categorical, I decided to visualize the distribution of undergraduate backpack weights as box plots grouped by these categories. With higher-level visualization tools such as lattice and ggplot2, I was also able to compare genders within each student year and see the nuances within each grade and gender.

Basic boxplots
In this first graph, we can see the median backpack weight (the line inside each box) is highest in sophomore year.

In this second visualization, we see that males carry slightly heavier backpacks by median.
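
As a rough sketch of how two base box plots like these can be produced, the code below uses simulated stand-in data, not the real backpack dataset. The data frame name backpack and the column names BackpackWeight, Year, and Sex are assumptions for illustration and may not match the actual file.

```r
# Simulated stand-in data, NOT the real dataset; column names are assumed.
set.seed(1)
backpack <- data.frame(
  BackpackWeight = round(runif(80, min = 2, max = 30)),
  Year = factor(rep(paste("Year", 1:4), each = 20)),
  Sex  = factor(rep(c("Female", "Male"), times = 40))
)

# One box per year of college; the formula reads "weight grouped by year".
by_year <- boxplot(BackpackWeight ~ Year, data = backpack,
                   main = "Backpack weight by year",
                   xlab = "Year of college", ylab = "Backpack weight",
                   col = "lightblue")

# One box per gender.
by_sex <- boxplot(BackpackWeight ~ Sex, data = backpack,
                  main = "Backpack weight by gender",
                  xlab = "Gender", ylab = "Backpack weight",
                  col = "lightgreen")

# boxplot() invisibly returns the statistics it drew; row 3 of $stats
# holds the median of each group (the line inside each box).
by_year$stats[3, ]
```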

From the basic boxplots, I can see how gender and year each relate to backpack weight. To understand the relationship among all three variables at once, I turned to higher-level visualization techniques.

Lattice boxplots

In this lattice boxplot, we can clearly see the median female and male backpack weights for each year of college. The plot shows that males do not always have heavier backpacks: their weights were higher in the first two years of college, while the female backpack weight was higher in the fourth year.
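
A conditioned lattice box plot of this kind might look like the sketch below. It again uses simulated stand-in data with assumed column names (BackpackWeight, Year, Sex), and the one-panel-per-year layout is my guess at a reasonable arrangement rather than the exact plot shown here.

```r
library(lattice)

# Simulated stand-in data, NOT the real dataset; column names are assumed.
set.seed(1)
backpack <- data.frame(
  BackpackWeight = round(runif(80, min = 2, max = 30)),
  Year = factor(rep(paste("Year", 1:4), each = 20)),
  Sex  = factor(rep(c("Female", "Male"), times = 40))
)

# "| Year" conditions on year, so each year gets its own panel
# with a female box and a male box side by side.
p <- bwplot(BackpackWeight ~ Sex | Year, data = backpack,
            layout = c(4, 1),
            xlab = "Gender", ylab = "Backpack weight")
print(p)
```

By default bwplot() marks each median with a dot instead of a line, which is part of why it stands out.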

ggplot2 boxplots

The ggplot2 box plot shows similar information to the lattice boxplot in a different layout. Without extra code, it displays a legend and a single, clearly labeled weight scale on the left, neither of which appeared in the lattice boxplot. However, the median, drawn as a black line, is harder to see, especially for males in year 4.
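
For comparison, a grouped ggplot2 box plot can be sketched as follows. The data are simulated stand-ins and the column names (BackpackWeight, Year, Sex) are assumptions, so treat this as an outline of the approach, not the exact code behind my figure.

```r
library(ggplot2)

# Simulated stand-in data, NOT the real dataset; column names are assumed.
set.seed(1)
backpack <- data.frame(
  BackpackWeight = round(runif(80, min = 2, max = 30)),
  Year = factor(rep(paste("Year", 1:4), each = 20)),
  Sex  = factor(rep(c("Female", "Male"), times = 40))
)

# Mapping Sex to fill puts the two genders side by side within each year
# and adds the legend automatically.
p <- ggplot(backpack, aes(x = Year, y = BackpackWeight, fill = Sex)) +
  geom_boxplot() +
  labs(x = "Year of college", y = "Backpack weight", fill = "Gender")
print(p)
```

Because all boxes share one panel, there is a single y-axis, which matches the one labeled weight scale mentioned above.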

From these three forms of visualization, I find ggplot2 to be the most understandable and efficient way to visualize the relationship between backpack weight, gender, and school year in this data set. The basic plots are clear for simple data, while lattice is good for showing the median clearly. In my opinion, ggplot2 combines all of these elements nicely and portrays the most information.

Check out the code on GitHub!

-Ramya's POV

