So I have this great little custom function I’ve used when looking at survey data in R. I call this function pull(). The goal of pull() is to quickly produce frequency tables with n sizes from individual-level survey data.
To use pull(), I create a big table that includes information about the survey questions I want to pull. The data are structured like this:
- quest represents the question coding in the raw survey data.
- survey is the name of the survey (in my case, the elementary school students, middle school students, high school students, parents, teachers, or administrators).
- year is the year that the survey data are collected.
- break is the ID I want to aggregate on like schoolcode or districtcode.
The key is that paste(survey, year, sep='') produces the name of the data.frame where I store the relevant survey data. The values of quest and break are both column names in that survey data.frame. Storing this information in a data.frame lets me apply across its rows and produce the tables for all the relevant questions at once.
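As a rough illustration of that lookup, here is a minimal sketch. The question codes, the year, and the toy parents2012 data are made up for the example, and the names questions, q_row, and survey_df are hypothetical. Because break is a reserved word in R, the sketch builds the column with check.names = FALSE and reads it with [["break"]].

```r
# Hypothetical lookup table; question codes and years are invented.
questions <- data.frame(
  quest   = c("q1", "q2", "q1"),
  survey  = c("parents", "parents", "teachers"),
  year    = c(2012, 2012, 2012),
  `break` = c("schoolcode", "schoolcode", "districtcode"),
  check.names = FALSE,        # keep the reserved word `break` as a column name
  stringsAsFactors = FALSE
)

# Toy survey data stored under the name that paste() will produce.
parents2012 <- data.frame(
  schoolcode = c("A", "A", "B"),
  q1 = c("Yes", "No", "Yes"),
  q2 = c("No", "No", "Yes"),
  stringsAsFactors = FALSE
)

q_row <- questions[1, ]

# Build the data.frame name from survey and year, then fetch it by name.
df_name   <- paste(q_row$survey, q_row$year, sep = "")
survey_df <- get(df_name)

# quest and break hold column names in the survey data.frame.
responses <- survey_df[[q_row[["quest"]]]]
groups    <- survey_df[[q_row[["break"]]]]
```

Here df_name is "parents2012", so get() returns the toy survey data and the two column lookups return the q1 answers and the school codes.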
pull() does the work of taking one row of this data.frame and producing the output that I’m looking for. I also use pull() one row at a time to save a data.frame that contains these data and do other things (like the visualizations in this post).
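The two ways of calling a row-wise function might look like the sketch below. describe_row is a hypothetical stand-in that only reports what it would look up, and the lookup table values are invented. The sketch uses lapply() over row indices rather than apply(), since apply() converts the data.frame to a matrix before handing over each row.

```r
# Hypothetical lookup table with the four columns described above.
questions <- data.frame(
  quest   = c("q1", "q2"),
  survey  = c("parents", "teachers"),
  year    = c(2012, 2013),
  `break` = c("schoolcode", "districtcode"),
  check.names = FALSE,        # `break` is a reserved word in R
  stringsAsFactors = FALSE
)

# Stand-in for the real function: takes a one-row data.frame and
# describes the lookup it implies.
describe_row <- function(q_row) {
  paste0(q_row[["quest"]], " by ", q_row[["break"]], " from ",
         q_row[["survey"]], q_row[["year"]])
}

# One row at a time.
one <- describe_row(questions[1, ])

# Every row at once, keeping each row as a one-row data.frame.
all_rows <- lapply(seq_len(nrow(questions)),
                   function(i) describe_row(questions[i, ]))
```

The result all_rows is a list with one element per question, which can be named or combined afterwards as needed.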
In some sense, pull() is really just a fancy version of prop.table that takes in passed parameters, adds an “n” to each row, and adds a “total” row. I feel as though there must be an implementation of an equivalent function in a popular package (or maybe even base) that I should be using rather than this technique. It would probably be more maintainable and easier for collaborators to work with this more common implementation, but I have no idea where to find it. So, please feel free to use the code below, but I’m actually hoping that someone will chime in and tell me I’ve wasted my time and I should just be using some function foo::bar.
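To make the "prop.table plus n plus total" idea concrete, here is a minimal sketch of that kind of table. It is an illustration only, not the pull() function itself: the name freq_with_n, its arguments, and the toy data are all assumptions.

```r
# Sketch only: row-wise proportions per group, with a "total" row
# and an "n" column holding the number of responses behind each row.
freq_with_n <- function(responses, groups) {
  counts     <- unclass(table(groups, responses))   # plain count matrix
  counts_all <- rbind(counts, total = colSums(counts))
  props      <- prop.table(counts_all, margin = 1)  # each row sums to 1
  cbind(props, n = rowSums(counts_all))
}

# Toy data: three respondents in group A, one in group B.
groups    <- c("A", "A", "A", "B")
responses <- c("Yes", "Yes", "No", "Yes")

result <- freq_with_n(responses, groups)
```

In result, each row is a group (plus the overall total), each response gets a column of proportions, and the final n column gives the count those proportions are based on.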
P.S. This post is a great example of why I really need to change this blog to Markdown/R-flavored Markdown. All those inline references to functions, variables, or code should really be formatted in-line, which the syntax highlighter plug-in used on this blog does not support. I’m nervous that using the WP-Markdown plugin will botch formatting on older posts, so I may just need to set up a workflow where I pump out HTML from the Markdown and upload the posts from there. If anyone has experience with Markdown + WordPress, advice is appreciated.