Event Details:
Data reuse, or secondary data analysis, is the analysis of existing data collected by other individuals or institutions for a new research purpose (National Library of Medicine). Data reuse is a resource-effective way to conduct novel research, discover new collaborators, and contribute to your academic field. In this workshop, we will learn about the advantages of reusable data and how to evaluate data for reuse. Then, we will apply these principles to Stanford Libraries’ growing data collections, which are sourced from vendors such as Gallup, The Washington Post and CoreLogic. Finally, we will practice using basic SQL commands to query reusable data on the Redivis platform.
Prerequisites
None. Please bring your own computer. If you don’t already have a free Redivis account, it would be helpful to create one before the class.
Audience
Advanced undergraduate students, graduate students and postdocs.
Learning Objectives
By the end of the workshop, attendees will be able to:
- Apply data quality criteria when considering a dataset for reuse
- Define “relational database” and explain its applications in research
- Recognize basic SQL keywords, such as SELECT, FROM, WHERE and JOIN
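For a preview of how the four keywords fit together, here is a minimal sketch that runs one query against a small in-memory SQLite database using Python's standard library. The tables (respondents, responses), their columns, and the sample rows are hypothetical and invented for illustration; they do not come from any Stanford Libraries collection. The workshop itself uses the Redivis platform, whose SQL dialect may differ in details from SQLite.

```python
import sqlite3


def run_demo():
    """Build two tiny related tables and query them with SELECT, FROM, JOIN and WHERE."""
    conn = sqlite3.connect(":memory:")  # temporary database, nothing is written to disk
    cur = conn.cursor()

    # Hypothetical tables: one row per survey respondent, one row per answer.
    cur.execute("CREATE TABLE respondents (id INTEGER PRIMARY KEY, state TEXT, age INTEGER)")
    cur.execute("CREATE TABLE responses (respondent_id INTEGER, question TEXT, answer TEXT)")
    cur.executemany(
        "INSERT INTO respondents VALUES (?, ?, ?)",
        [(1, "CA", 34), (2, "NY", 51), (3, "CA", 27)],
    )
    cur.executemany(
        "INSERT INTO responses VALUES (?, ?, ?)",
        [(1, "q1", "yes"), (2, "q1", "no"), (3, "q1", "yes")],
    )

    cur.execute(
        """
        SELECT r.id, r.age, a.answer          -- which columns to return
        FROM respondents AS r                 -- the starting table
        JOIN responses AS a                   -- bring in the related table...
          ON a.respondent_id = r.id           -- ...matching rows on the shared key
        WHERE r.state = 'CA'                  -- keep only rows meeting the condition
        ORDER BY r.id
        """
    )
    rows = cur.fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in run_demo():
        print(row)
```

The JOIN is what makes the database "relational": respondent details and answers live in separate tables and are linked through a shared key (id and respondent_id in this sketch).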