Week 4: Programming structures

Hi everyone!

This week we learned about the apply family of functions, including lapply(), and about visualization techniques such as box plots and histograms. The lapply() function in R takes a list, vector, or data frame as input and returns a list. Its structure is lapply(X, FUN), where FUN is the function to apply to each element of X. Box plots visualize the spread of data; in R we can draw them with boxplot(variable1 ~ variable2), where variable1 on the y axis is grouped by variable2 on the x axis. Histograms represent the distribution of numerical data. In R, we can create a basic histogram with hist(variable1), where variable1 is a numeric vector.
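To make these three calls concrete, here is a minimal sketch on toy values. The names scores and toy and all of the numbers are made up for illustration; they are not part of the assignment data.

```r
# Toy values invented for illustration only
scores <- list(a = c(1, 2, 3), b = c(10, 20, 30, 40))

# lapply() applies mean() to each element and always returns a list
means <- lapply(scores, mean)
str(means)

# A small toy data frame: one numeric column and one grouping column
toy <- data.frame(value = c(4, 6, 5, 9, 11, 10),
                  group = c("x", "x", "x", "y", "y", "y"))

# One box per group: value on the y axis, group on the x axis
boxplot(value ~ group, data = toy)

# Distribution of the numeric column
hist(toy$value)
```

Because lapply() always returns a list, the individual results are pulled out with $ or [[ ]], for example means$a.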

The following data was collected by the local hospital. The data set contains 5 variables based on the observation of 8 patients. In addition to the measurements of the patients checking in to the hospital that night, the data provides the patients' histories regarding the frequency of their visits to the hospital in the last 12 months.
The data shows the blood pressure measurement, the first assessment by the general doctor (bad = 1, good = 0), titled "first," the second assessment by the external doctor (called "second"), and, in the last row, the head of the emergency unit's decision regarding immediate care for the patient (low = 0, high = 1).

Your first assignment: Create a side-by-side boxplot (boxplot(x, ...)) and a histogram (hist(x, ...)).
Discuss the outcome of your results regarding the patients' BPs and the MDs' ratings.

The first step was to create a data frame with the blood pressure, both doctors' assessments, the head doctor's decision, and the frequency of each patient's visits to the hospital. Next, I created side-by-side box plots of the patients' blood pressure against each doctor's rating of low or high need of immediate care. I made three box plots, one per doctor, to see how blood pressure was distributed among the patients each doctor placed in low or high care and how that compared with the other doctors' opinions.
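These two steps can be sketched roughly as follows. The values below are placeholders I made up so the code runs on its own; they are not the hospital's measurements. The data frame name obs and the column names freq and final are also assumptions (only "first" and "second" are named in the assignment).

```r
# Placeholder values for 8 patients: NOT the real hospital data
obs <- data.frame(
  freq   = c(0.5, 0.2, 0.7, 0.1, 0.3, 0.8, 0.4, 0.6),  # visit frequency (assumed name)
  bp     = c(95, 120, 60, 150, 80, 170, 110, 135),     # blood pressure
  first  = c(1, 1, 1, 1, 0, 0, 0, 0),                  # general doctor's rating
  second = c(0, 1, 0, 1, 0, 1, 0, 1),                  # external doctor's rating
  final  = c(0, 1, 1, 1, 0, 0, 0, 1)                   # head doctor's decision (assumed name)
)

# Put three plots in one row, remembering the old layout
op <- par(mfrow = c(1, 3))

# Blood pressure split by each doctor's 0/1 rating
boxplot(bp ~ first,  data = obs, main = "First doctor",
        xlab = "Rating", ylab = "Blood pressure")
boxplot(bp ~ second, data = obs, main = "Second doctor",
        xlab = "Rating", ylab = "Blood pressure")
boxplot(bp ~ final,  data = obs, main = "Head doctor",
        xlab = "Rating", ylab = "Blood pressure")

# Restore the previous plotting layout
par(op)
```

Using par(mfrow = c(1, 3)) is one way to get the plots side by side; saving and restoring the old settings keeps later plots from inheriting the three-panel layout.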

The first doctor rated more of the patients with low blood pressure as needing high immediate care, and the median blood pressures of the two groups were relatively similar.

The second doctor's high-need group showed more variability in blood pressure, with a slightly higher median blood pressure than the low-need group.
The head doctor's final decisions revealed a starker difference in median blood pressure between patients with a high need of immediate care and those with a low need. From this data, I can see the similarity in judgement between the second doctor and the head doctor, who both put patients with higher blood pressures in high immediate care.

Based on these results, I wanted to compare the overall MD ratings of all three doctors to see whether their similar judgements average out the same across all patients. I used lapply() for this calculation and found that the mean of the second doctor's ratings is the same as the mean of the head doctor's final decisions. This suggests that the second doctor's approach, in which higher blood pressure indicates a higher need of immediate care, is the judgement the hospital prefers.
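A minimal sketch of that lapply() step, again on made-up placeholder ratings rather than the real data (the names ratings and final are assumptions):

```r
# Placeholder 0/1 ratings for 8 patients: NOT the real hospital data
ratings <- data.frame(
  first  = c(1, 1, 1, 0, 0, 0, 1, 1),
  second = c(0, 1, 0, 1, 0, 1, 0, 1),
  final  = c(0, 1, 1, 1, 0, 0, 0, 1)
)

# A data frame is a list of columns, so lapply() gives one mean per doctor
rating_means <- lapply(ratings, mean)
str(rating_means)

# Share of patients on which the second doctor and the head doctor agree
agreement <- mean(ratings$second == ratings$final)
agreement
```

With these placeholder values, the second and final columns have the same mean (0.5) but agree on only 6 of the 8 patients, so equal means alone do not show that two doctors rated each patient the same way. That is why it is worth also looking at the patients they rated identically.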

To understand the distribution of blood pressure for the patients rated the same by both the second doctor and the head doctor, I created a histogram of only these patients' blood pressures, stored in obs2$bp. From this we can see that most of the identically rated patients have blood pressures between 100 and 150, and they are evenly distributed within that range.
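One way to build such a subset and plot it is sketched below. Only the name obs2 and its bp column come from my own code; the data frame obs, the column name final, and every value here are placeholders for illustration.

```r
# Placeholder values for 8 patients: NOT the real hospital data
obs <- data.frame(
  bp     = c(95, 120, 60, 150, 80, 170, 110, 135),
  second = c(0, 1, 0, 1, 0, 1, 0, 1),
  final  = c(0, 1, 1, 1, 0, 0, 0, 1)
)

# Keep only the patients the second doctor and the head doctor rated the same
obs2 <- obs[obs$second == obs$final, ]

# Histogram of blood pressure for those matching patients
hist(obs2$bp,
     main = "BP where second and head doctor agree",
     xlab = "Blood pressure")
```

Swapping obs$second for the first doctor's column in the subsetting line gives the matching comparison between the first doctor and the head doctor.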

However, the patients rated the same by the first doctor and the head doctor are more variable, with more matches at low blood pressures. The stability and wide spread of the matches between the second doctor and the head doctor help explain why the second doctor's opinion matches the hospital's categorization for immediate care the best.
Check this out in GitHub!

-Ramya's POV

