EDA: Open & Closed Data

Introduction

Each summer, nearly 8,000 incoming students attend New Student Orientation (NSO) at Penn State’s University Park campus. At the conclusion of NSO, each student is sent a survey to gather their perspectives on various aspects of their experience at their respective session. Questions range from their experiences at check-in to their understanding of student services and various initiatives, such as Penn State’s commitment to diversity and inclusion. These data provide an opportunity to assess which aspects of NSO warrant further exploration.

Cleaning & Inspecting the Data

Survey results were provided by the office of New Student Orientation for the sessions occurring in the summers of 2017, 2018, & 2019. Each of these databases was inspected to find which variables were consistent across all three spreadsheets. Variables that were not consistent across each survey were discarded, and the remaining data were combined into one master database, coded by their respective years. The variables that were consistent across all three years were:

  • Leader Connection – The extent to which a meaningful connection was made with their Orientation Leader during NSO.
  • Substances – The extent to which their understanding changed related to the consequences of alcohol and drug use and abuse during NSO.
  • Assault Resources – The extent to which their understanding changed as a result of attending NSO related to reporting and support services Penn State provides for victims of sexual harassment and sexual assault.
  • Bystander Prevention – The extent to which their understanding changed as a result of attending NSO related to how to handle dangerous situations.
  • Health Resources – The extent to which their understanding changed as a result of attending NSO related to support services Penn State provides for mental health, physical health, and personal well-being.
  • Safety Resources – The extent to which their understanding changed as a result of attending NSO related to support services Penn State provides to help keep them safe.
  • Diversity / Inclusion – The extent to which their understanding changed related to the importance of diversity and inclusion on our campus.
  • Definition of Consent – An open-ended survey question asking participants to define the term “consent.”

The final step in cleaning the data was to remove any personally identifiable information and missing values. Basic demographic information, such as race, gender, sexual orientation, resident status, and matriculation date, was maintained but not utilized for this analysis. Any observations from respondents who broke off from the survey before answering the 8 variables of interest were discarded as well.
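The cleaning steps described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the workflow actually used for this analysis: the function name `combine_surveys`, the column names, and the sample rows are all hypothetical.

```python
def combine_surveys(surveys, pii_columns=()):
    """Merge yearly survey tables (year -> list of row dicts) into one table.

    Keeps only the columns present in every year, drops the (hypothetical)
    PII columns, tags each row with its year, and discards rows with
    missing answers.
    """
    # Find the columns shared by every year's survey
    shared = None
    for rows in surveys.values():
        cols = set().union(*(row.keys() for row in rows))
        shared = cols if shared is None else shared & cols
    keep = sorted((shared or set()) - set(pii_columns))

    combined = []
    for year, rows in sorted(surveys.items()):
        for row in rows:
            values = {col: row.get(col) for col in keep}
            # Skip respondents who left any retained question blank
            if any(v is None or v == "" for v in values.values()):
                continue
            combined.append({"year": year, **values})
    return combined


# Made-up rows, for illustration only
surveys = {
    2017: [{"email": "a@example.edu", "leader_connection": 3, "substances": 4, "parking": 2}],
    2018: [{"email": "b@example.edu", "leader_connection": 4, "substances": ""}],
    2019: [{"email": "c@example.edu", "leader_connection": 5, "substances": 5}],
}
master = combine_surveys(surveys, pii_columns=["email"])
```

Here the `parking` column is dropped because it appears only in 2017, `email` is dropped as identifying information, and the 2018 respondent is discarded for leaving a question blank.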

Exploratory Data Analysis

Once the data were gathered and cleaned, an exploratory data analysis was conducted to examine patterns in the data. Survey questions that used a Likert scale were compared against one another, revealing similar, right-skewed distributions on each of the factors, with the exception of leader connection, which was more widely distributed across the Likert responses (figure 3.1). When examined in a bar chart grouped by year (figure 3.2), the leader connection variable showed the widest variety of distributions in comparison to the remaining variables visualized in the same way (figure 3.3). Since Likert scale data are ordinal in nature, a Kruskal-Wallis test was conducted on each variable to examine differences by year, followed by a post-hoc Dunn test with a Bonferroni correction to reveal where those differences occur. Each variable showed an upward trend in Likert responses over time, with statistically significant differences (p < .05) between years in every variable except the one measuring the importance of diversity and inclusion at Penn State.
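For readers who want to see how the test described above works, the following is a rough sketch of a Kruskal-Wallis H test followed by Dunn pairwise comparisons with a Bonferroni adjustment, written with the Python standard library only. It is not the code used for this analysis. The function names and sample scores are made up, and the sketch assumes exactly three groups (one per survey year) so that the chi-square p-value with 2 degrees of freedom reduces to exp(-H / 2).

```python
import math
from collections import Counter
from itertools import combinations


def average_ranks(values):
    """Return 1-based ranks for values, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def kruskal_dunn(groups):
    """Kruskal-Wallis H test plus Dunn pairwise tests (Bonferroni-adjusted).

    groups maps a label (e.g. a survey year) to a list of Likert scores.
    Assumes exactly three groups and at least two distinct score values.
    """
    labels = list(groups)
    if len(labels) != 3:
        raise ValueError("this sketch handles exactly three groups")
    pooled = [score for label in labels for score in groups[label]]
    n_total = len(pooled)
    ranks = average_ranks(pooled)

    # Split the pooled ranks back into their groups
    mean_rank, size, pos = {}, {}, 0
    rank_term = 0.0
    for label in labels:
        n = len(groups[label])
        group_ranks = ranks[pos:pos + n]
        pos += n
        size[label] = n
        mean_rank[label] = sum(group_ranks) / n
        rank_term += sum(group_ranks) ** 2 / n

    # Tie adjustment: sum of (t^3 - t) over each set of tied scores
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    h = 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)
    h /= 1 - ties / (n_total ** 3 - n_total)
    p_overall = math.exp(-h / 2)  # chi-square survival function, df = 2

    # Dunn's z statistic for each pair, with Bonferroni-adjusted p-values
    variance = n_total * (n_total + 1) / 12 - ties / (12 * (n_total - 1))
    pairs = list(combinations(labels, 2))
    adjusted = {}
    for a, b in pairs:
        se = math.sqrt(variance * (1 / size[a] + 1 / size[b]))
        z = (mean_rank[a] - mean_rank[b]) / se
        p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
        adjusted[(a, b)] = min(1.0, p * len(pairs))
    return h, p_overall, adjusted


# Made-up Likert scores, for illustration only
scores_by_year = {2017: [1, 2, 2], 2018: [3, 3, 4], 2019: [4, 5, 5]}
h, p_overall, adjusted = kruskal_dunn(scores_by_year)
```

With real data, a library routine such as `scipy.stats.kruskal` would normally be preferable to a hand-rolled version; the point of the sketch is only to show that the test works on ranks rather than raw scores, which is why it suits ordinal Likert responses.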

The open-ended survey question asking for a definition of consent revealed interesting results across the three surveys. In 2017 (figures 3.4 and 3.5) and 2018 (figures 3.6 and 3.7), the top two words used to define consent were “consent” and “yes,” respectively. However, in 2019 (figures 3.8 and 3.9), the top two words were “given” and “freely.” The 2019 version of the Results Will Vary interactive play, which all NSO students see, featured a scene related to consent. In this scene, the acronym F.R.I.E.S is used to convey that consent must be freely given, reversible, informed, enthusiastic, and specific. Each of these words, along with the acronym itself, occurs in the top 10 words for the 2019 survey, while none of them was found in the top 10 for the previous two NSO years.
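A word-frequency comparison like the one described above could be produced along these lines. This is an illustrative sketch, not the tool used for the figures: the stop-word list, the `top_words` helper, and the sample responses are all assumptions made for the example.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption, not the author's list)
STOP_WORDS = {"a", "an", "and", "be", "is", "it", "must", "of", "or", "the", "to"}


def top_words(responses, n=10):
    """Return the n most frequent non-stop-words across free-text responses."""
    counts = Counter()
    for text in responses:
        # Lowercase and split into runs of letters and apostrophes
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)


# Made-up responses, for illustration only
responses_by_year = {
    2018: ["Saying yes", "Consent is a clear yes", "Giving consent"],
    2019: ["Freely given and reversible", "It must be freely given", "Freely given, informed"],
}
for year, responses in responses_by_year.items():
    print(year, top_words(responses, n=2))
```

Running the same function on each year's responses and comparing the resulting lists is enough to surface a shift in vocabulary such as the one seen in 2019.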

Likert Data Comparison (FIGURE 3.1)

Leader Connection (FIGURE 3.2)

Protocols, Services, & Resources (FIGURE 3.3)

2017 Open Ended Data (FIGURE 3.4)

Frequency chart of top 20 Words (FIGURE 3.5)

2018 Open Ended Data (FIGURE 3.6)

Frequency chart of top 20 Words (FIGURE 3.7)

2019 Open Ended Data (FIGURE 3.8)

Frequency chart of top 20 Words (FIGURE 3.9)

Conclusion / Suggestions for the Future

While it is difficult to draw inferences from observational data, we can see some trends toward greater understanding among students who attend NSO at Penn State’s University Park campus. These increases in understanding could be due to a variety of factors, including changes in the population of interest (i.e., incoming students) or changes in various areas of the NSO experience. The open-ended survey data defining consent showed the clearest picture of the differences between years, with the 2019 data pointing clearly toward connections made with a scene dedicated to consent in the Results Will Vary interactive play.

A variety of opportunities exist to gain a greater understanding of students’ perceptions of New Student Orientation. First, a clear definition of the kinds of insights you would like to gain from students regarding NSO should inform question formation. For example, an argument could be made that leader connection is a multi-dimensional construct that cannot be measured accurately with one question. Notably, leader connection was the variable with both the lowest score and the widest distribution of responses among participants; further investigation is warranted. Finally, the use of open-ended survey responses could provide a wealth of feedback on specific initiatives, particularly if the questions are formed in conjunction with specific experiences during NSO. For example, the F.R.I.E.S scene from the Results Will Vary interactive play showed clear connections to changes in students’ understanding of consent. Future new student orientations could benefit from exploring these connections in other topics covered within Results Will Vary to measure both the effectiveness of the play and the perceptions of students.

Author: Scott Atchison

I am a PhD student in Music Education at Penn State with a cognate in Instructional Technology. My research interests within my cognate focus on Open Educational Resources (OER), online learning, and instructional design.
