BIOL 1230A
Data Science Across Disciplines
In this course, we will gain exposure to the entire data science pipeline—obtaining and cleaning large and messy data sets, exploring these data and creating engaging visualizations, and communicating insights from the data in a meaningful manner. During morning sessions, we will learn the tools and techniques required to explore new and exciting data sets. During afternoon sessions, students will work in small groups with one of several faculty members on domain-specific research projects in Anthropology, Biology, Classics, Political Science, and Statistics. This course will use the R programming language. No prior experience with programming is necessary.
ANTH/LNGT: In this section we will explore indigenous political voice in unexpected places. The data we will analyze will consist of bilingual Apache-English and Maidu-English stories, songs, and speeches originally recorded by anthropological linguists in the early twentieth century as examples of traditional culture. Students will use R to search this corpus for how indigenous contributors were also making claims on the future in their address to the researcher and to wider anticipated audiences.
BIOL/ECSC 1230: In this section we will work with data collected by elephant seals equipped with oceanographic instruments in the Southern Ocean. Depending on your interests, you can approach the project from different angles: students focusing on biology will explore where the seals travel and what drives their movements, while those interested in earth science will investigate the temperature and salinity profiles gathered during their dives. Working in teams, you’ll combine these perspectives to build a fuller picture of both seal ecology and the oceanographic processes that shape their environment. Along the way, you’ll practice manipulating and visualizing different types of data including maps of seal tracks, temperature and salinity profiles, and cross-sections of ocean properties. We will also bring in satellite and autonomous float data to place seal activity and the data they collect in a broader context. By the end, you’ll have a sense of how these different data sources fit together and what unique insights we gain from using seals as oceanographers.
CLAS 1230: In this section students will gain hands-on experience with a variety of natural language processing and text mining techniques by exploring the writings of the ancient historian Plutarch, who lived during the first and second centuries AD. We will focus on the biographies of "great men" in Plutarch’s Lives, which chronicles the history, morals, and virtues of major figures who played parallel roles in ancient Greek and Roman society. The public domain English translation from Project Gutenberg will serve as our main corpus; however, students with a background or interest in Ancient Greek can work with the text in the original language. Using the R programming language, students will transform unstructured text into quantitative data for statistical analysis and morphosyntactic parsing. We will also apply machine learning models to our data to reveal underlying patterns in large amounts of text. No prior experience with R is necessary.
PSCI 1230: Who votes in elections? Who attends protests? Why? In this section we will use the tools of data science to explore these and other questions about political participation in the Americas. We will examine engagement in different forms of participation and the demographic, economic, social, and other factors that shape participation. The class will introduce students to the basics of survey research and the study of political participation. Students will complete a final group project showcasing the concepts and tools learned in class.
STAT 1230: In this section students will dive into the world of data science by focusing on invasive species monitoring data. Early detection is crucial to controlling many invasive species; however, there is a knowledge gap regarding the sampling effort needed to detect the invader early. In this course, we will work with decades of invasive species monitoring data collected across the United States to better understand how environmental variables play a role in the sampling effort required to detect invasive species. Students will gain experience in the entire data science pipeline, but the primary focus will be on data scraping, data visualization, and communication of data-based results to scientists and policymakers.
- Schedule
  - 10:00am-11:30am on Monday, Tuesday, Wednesday, Thursday at MBH 104 (Jan 5, 2026 to Jan 30, 2026)
  - 1:00pm-2:15pm on Monday, Tuesday, Wednesday, Thursday at MBH 311 (Jan 5, 2026 to Jan 30, 2026)
- Location
  - McCardell Bicentennial Hall 104
- Instructors
  - Lyford, Alex (alyford@middlebury.edu)
  - Schine, Casey (cschine@middlebury.edu)