Posts

Week 12: R Markdown

Hi everyone! This week we learned about R Markdown files, which combine formatted text and R code in a single document to present to viewers. In my R Markdown file, I started developing the code for my final project R package: hireLetter. Check this out on GitHub! Reflection: I found developing an R Markdown file very user-friendly and simple. I enjoyed writing headings and descriptive text outside of the code instead of relying entirely on comments as we did in R scripts. After starting to write the code for my final project, I ran into many challenges. I used my knowledge of Shiny apps from a previous class to begin my code because I realized user input is very important for my package, and a Shiny app provides a great format for this. I hope to further develop my Shiny app function after learning more in the next module of this course. I have a few issues and questions I would appreciate help resolving. How can I reference the pass and fail output tab...

Week 11: Debugging and defensive programming

Hi everyone! This week we learned about debugging in R. Debugging means finding and fixing issues as they arise in the code you write. In RStudio, the Debug menu has a built-in option to toggle breakpoints. To take a more code-based approach, we can use traceback(), debug(), or trace() to track down where the error lies. The code below contains a 'deliberate' bug!

    tukey_multiple <- function(x) {
      outliers <- array(TRUE, dim = dim(x))
      for (j in 1:ncol(x)) {
        outliers[, j] <- outliers[, j] && tukey.outlier(x[, j])
      }
      outlier.vec <- vector(length = nrow(x))
      for (i in 1:nrow(x)) {
        outlier.vec[i] <- all(outliers[i, ])
      }
      return(outlier.vec)
    }

Find the bug and fix it! The first step I took to understand where the bug might be was simply to run the code in R and read the error message, which is shown below. This message says there is an...
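One possible fix, as a rough sketch: this assumes the deliberate bug is the scalar `&&` in the first loop, which is meant for single TRUE/FALSE values, where the element-wise `&` is needed to compare whole columns. The assignment's tukey.outlier() is not shown in this post, so the version below is a hypothetical stand-in based on the usual 1.5 * IQR rule, and the test matrix is made up.

```r
# Hypothetical stand-in for tukey.outlier(), which is not defined in this post:
# flags values more than 1.5 * IQR below Q1 or above Q3.
tukey.outlier <- function(v) {
  q <- unname(quantile(v, c(0.25, 0.75), na.rm = TRUE))
  fence <- 1.5 * (q[2] - q[1])
  v < q[1] - fence | v > q[2] + fence
}

tukey_multiple <- function(x) {
  outliers <- array(TRUE, dim = dim(x))
  for (j in seq_len(ncol(x))) {
    # Element-wise & keeps one TRUE/FALSE per row; && expects single values
    outliers[, j] <- outliers[, j] & tukey.outlier(x[, j])
  }
  outlier.vec <- vector(length = nrow(x))
  for (i in seq_len(nrow(x))) {
    # A row is flagged only if it is an outlier in every column
    outlier.vec[i] <- all(outliers[i, ])
  }
  outlier.vec
}

# Made-up test matrix: the last row is extreme in both columns
m <- cbind(c(1, 2, 3, 4, 5, 100), c(2, 3, 4, 5, 6, 200))
result <- tukey_multiple(m)
print(result)
```

With this change, only the last row of the made-up matrix is flagged, since it is the only row that is an outlier in both columns.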

Week 10: Building your own R Package

Hi everyone! This week we learned about R packages and how to create our own! The first step is loading the devtools and roxygen2 packages, then setting up the DESCRIPTION file and naming the package. I was interested in the hiring letter function I started developing in the S3 and S4 module, so I decided to turn it into an actual template for employers to use in an easy-to-access R package. Check out GitHub to see my DESCRIPTION page for the hireLetter package I am creating! Final Project Proposal: My final project proposal is the hireLetter package. This package will include two functions, pass() and fail(), which take a data frame of employee information and qualifications as the parameter and output a list of individuals who either pass or fail the employer's expectations. The other add-on functions are acceptedL() and rejectedL(), which customize an acceptance or rejection letter for the person whose name is passed in. I may add more funct...
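To give a feel for what a roxygen2-documented pair of functions like pass() and fail() could look like, here is a minimal sketch. It is an illustration only: the column names (name, years_experience), the min_years argument, and the sample applicants are all assumptions made up for this example, not the actual hireLetter interface.

```r
#' Applicants who meet a minimum experience requirement
#'
#' Illustration only: the column names and min_years are assumptions,
#' not the actual hireLetter interface.
#'
#' @param applicants Data frame with columns name and years_experience.
#' @param min_years Minimum years of experience required.
#' @return Character vector of the names that pass.
#' @export
pass <- function(applicants, min_years = 2) {
  stopifnot(is.data.frame(applicants))
  applicants$name[applicants$years_experience >= min_years]
}

#' Applicants who do not meet the minimum experience requirement
#'
#' @inheritParams pass
#' @return Character vector of the names that fail.
#' @export
fail <- function(applicants, min_years = 2) {
  stopifnot(is.data.frame(applicants))
  applicants$name[applicants$years_experience < min_years]
}

# Made-up applicants for demonstration
applicants <- data.frame(
  name = c("Avery", "Blake", "Casey"),
  years_experience = c(5, 1, 3),
  stringsAsFactors = FALSE
)
passed <- pass(applicants)
failed <- fail(applicants)
print(passed)
print(failed)
```

Inside a package, running devtools::document() would turn the `#'` comments into help pages and NAMESPACE entries.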

Week 9: Visualizations

Hi everyone! This week we learned about three approaches to visualization: basic graphics, lattice, and ggplot2. Basic graphics includes functions such as plot() for scatter and line plots, barplot() for bar plots, and hist() for histograms. We can customize symbols, line types, and colors in these visualizations. The lattice package uses the function xyplot(y~x) for scatter plots and bwplot() for box plots. The advantage of this package is multi-conditioning: many panels in one output, which makes it easier to convey additional information. The ggplot2 package is great for visualizations and is structured as ggplot(dataframe, aes(x,y)) + geom_[]. Choose any data set for your visualization from the Vincent Arel Bundock dataset list: https://vincentarelbundock.github.io/Rdatasets/datasets.html Using this data, generate three types of visualization of the data set you have chosen. In your blog, discuss and present the three visualizations you create and express your opi...
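As a quick side-by-side of the three systems, here is a minimal sketch. It uses the built-in mtcars data purely as a convenient stand-in (not necessarily the data set chosen for the assignment) and assumes the lattice and ggplot2 packages are installed.

```r
library(lattice)
library(ggplot2)

data("mtcars")

# Base graphics: histogram of miles per gallon
hist(mtcars$mpg, main = "Miles per gallon", xlab = "mpg", col = "lightblue")

# lattice: weight vs. mpg, one panel per cylinder count (multi-conditioning)
p_lattice <- xyplot(mpg ~ wt | factor(cyl), data = mtcars,
                    xlab = "Weight (1000 lbs)", ylab = "mpg")
print(p_lattice)

# ggplot2: the same relationship, built as ggplot(dataframe, aes(x, y)) + geom_*
p_gg <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "mpg")
print(p_gg)
```

Note that lattice and ggplot2 return plot objects that are drawn when printed, while base graphics functions draw directly to the active device.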

Week 8: Inputs and Outputs

Hi everyone! This week we learned about inputs and outputs, string manipulation, and the plyr package in R. Among the functions we learned, read.table() and ddply() are particularly interesting. The function read.table() reads a text file into R as a data frame. The ddply() function splits a data frame by a category such as gender, applies a function to each group, and combines the results; with transform, you can also calculate a new column for the resulting data frame. Please follow steps 1-3. Step # 1: Import the assignment 6 data set into R. Then, run the command "mean" using Sex as the category (use the plyr package for this operation). Last command in this step: write the resulting output to a file. In data frame y, you can see the students are separated into females first and males second, with the new grade average column added. Step # 2: Convert the data set to a data frame of students whose name contains the letter i, then crea...
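Step 1 can be sketched roughly like this. The assignment file is not included in this post, so the students data frame, its column names (Name, Sex, Grade), and the temporary output file are all made-up assumptions; the sketch also assumes the plyr package is installed.

```r
library(plyr)

# Made-up stand-in for the assignment 6 data set
students <- data.frame(
  Name = c("Raul", "Booker", "Lauri", "Leonie"),
  Sex = c("Male", "Male", "Female", "Female"),
  Grade = c(80, 100, 70, 90),
  stringsAsFactors = FALSE
)

# Split by Sex, then add each group's mean grade as a new column
y <- ddply(students, "Sex", transform, Grade.Average = mean(Grade))
print(y)

# Write the result to a file (a temporary file here), then read it back
out_file <- tempfile(fileext = ".txt")
write.table(y, out_file, sep = ",", row.names = FALSE)
y_back <- read.table(out_file, header = TRUE, sep = ",")
```

Because ddply() returns the groups in sorted order, the Female rows come before the Male rows in y.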

Week 7: S3 vs S4 objects

Hi everyone! This week we learned about S3 and S4 objects. Objects in R store data elements and belong to classes, and different methods or functions can operate on them depending on the class. An object can be a vector, variable, list, matrix, etc. Two of R's object systems are S3, the original and more informal system, and S4, a more formal system of classes and methods developed later. Download any type of data (from the web or use the datasets package) or create your own set. Then, in the second step, determine whether a generic function, as discussed in this module, can be assigned to your data set, and if not, why? (For example, here is a data set in R:) data("mtcars") head(mtcars, 6) list(mtcars, 6) In the third and last step, explore whether S3 and S4 can be assigned to your data set. I created a dataset of two potential employees with their name, age, years of experience, and the role they are applying to. The generic function print works, but functions such as mean do not because ...
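To make the S3 vs. S4 difference concrete, here is a minimal sketch for a single applicant. The class names, field names, and values are made up for illustration and are not the ones from my actual data set.

```r
library(methods)

# S3: an ordinary list with a class attribute; nothing checks the fields
applicant_s3 <- list(name = "Avery", age = 30, experience = 5, role = "Analyst")
class(applicant_s3) <- "applicant"

# An S3 method is just a function named generic.class
print.applicant <- function(x, ...) {
  cat(x$name, "is applying for the", x$role, "role\n")
  invisible(x)
}
print(applicant_s3)

# S4: the class is declared up front and slot types are enforced by new()
setClass("Applicant",
         representation(name = "character", age = "numeric",
                        experience = "numeric", role = "character"))

applicant_s4 <- new("Applicant", name = "Avery", age = 30,
                    experience = 5, role = "Analyst")

# S4 methods are registered with setMethod(); show() is S4's print
setMethod("show", "Applicant", function(object) {
  cat(object@name, "is applying for the", object@role, "role\n")
})
applicant_s4

isS4(applicant_s3)  # FALSE
isS4(applicant_s4)  # TRUE
```

S3 fields are accessed with `$`, while S4 slots are accessed with `@`.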

Week 6: Doing Math P2

Hi everyone! This week we learned more mathematical operations to use on matrices. A new function I learned is diag(), which builds a matrix with the values you choose along the main diagonal (top left to bottom right). Similar to a data frame, you can select cells in a matrix using square brackets []. I used this to set values at specific positions in the matrix.

1. Consider A=matrix(c(2,0,1,3), ncol=2) and B=matrix(c(5,2,4,-1), ncol=2). a) Find A + B b) Find A - B
2. Use the diag() function to build a matrix of size 4 with the following values in the diagonal: 4,1,2,3.
3. Generate the following matrix:

    ##      [,1] [,2] [,3] [,4] [,5]
    ## [1,]    3    1    1    1    1
    ## [2,]    2    3    0    0    0
    ## [3,]    2    0    3    0    0
    ## [4,]    2    0    0    3    0
    ## [5,]    2    0    0    0    3

Hint: You can use the command diag() to build it. Check this out on GitHub! - Ramya's POV
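One possible way to work through the three exercises, as a short sketch. The variable names (sum_AB, diff_AB, D, M) are just labels chosen for this example.

```r
# 1. Matrix addition and subtraction (matrix() fills column by column)
A <- matrix(c(2, 0, 1, 3), ncol = 2)
B <- matrix(c(5, 2, 4, -1), ncol = 2)
sum_AB <- A + B
diff_AB <- A - B

# 2. A 4 x 4 matrix with 4, 1, 2, 3 on the diagonal
D <- diag(c(4, 1, 2, 3))

# 3. Start with 3s on the diagonal of a 5 x 5 matrix, then use square
#    brackets to fill the rest of the first row and first column
M <- diag(3, 5)
M[1, 2:5] <- 1
M[2:5, 1] <- 2

print(sum_AB)
print(diff_AB)
print(D)
print(M)
```

Given a vector, diag() places its values on the diagonal; given a single number and a size, it repeats that number along the diagonal.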