Exploring the Potential of the Profile of Big-Data-Enabled Specialists

Ruth Krumhansl is an expert in curriculum development, education research, science teaching, and applied science with a focus on using Internet-based tools to bring authentic scientific data into the K–16 classroom. She is the author of the 2014 high school curriculum EDC Earth Science and the Director of EDC’s Oceans of Data Institute. In her leadership of the Institute, Ruth is working to transform K–16 science education to ensure students’ data literacy and prepare them for college and career success—as well as lives as informed citizens—in a world of Big Data. In December 2014, the Institute published the first ever profile of the skills and knowledge needed to be successful as a big data specialist in a wide variety of fields. In this post, Ruth discusses the profile, reflects on the importance of data literacy, and shares some of the Institute’s next steps.

The Oceans of Data Institute’s Profile of Big-Data-Enabled Specialists has sparked a good deal of interest. Developed by big data experts, the profile describes the skills needed to use data to solve diverse challenges on the job every day. Over 150 big data professionals from 15 industry sectors have endorsed the profile. For these sectors, the profile can help inform training and workforce development.

Our Institute is using the profile to enhance K–16 learning and teaching. We have substantial experience designing curriculum and supporting teachers, as well as conducting research. So, our team is using the profile to map out the specific skills, knowledge, and behaviors that K–16 students need to acquire, and how they can build these abilities over the course of their schooling to achieve data literacy by the time they graduate. From this work, we will come up with strategies to enhance curriculum, instruction, and teacher professional development. 

Right now, very few students graduate from school ready to use large, complex data sets and analytical tools effectively. Many students work with data they collect themselves in science class, which gives them important foundational skills. But in a big-data-driven world, it’s critical to understand how to work with data that was collected by others, and data that was collected using unfamiliar methods and tools. You need to be able to ask questions about what the data actually means: what is being measured, and what the limitations of the data collection methodologies are. Whether students pursue careers in health, business, science, technology, engineering, criminal justice, or dozens of other fields, they will need to be able to do this. As citizens, they will also need to make meaning of big data in the headlines and use data to make informed decisions about health, housing, and much more.

Last week, I was in my kitchen when I got a big “real life” reminder of the importance of thinking critically about data and the limitations of data collection tools. I was tinkering with a new gadget I bought and wear on my wrist to help me stay fit. It measures steps, counts calories burned, generates graphs and charts, and shows your progress in meeting your goals. It’s a great tool, but the data it produces presents an incomplete picture. The gadget counts steps, but it does not measure your heart rate and it does not track all of your activity. So, if you cross-country ski or snowshoe, it simply counts your steps and “sees” no difference between these activities and walking. If you dance energetically to music, it isn’t likely to track that. But if you’re standing in one place swinging your arms, it counts that as steps. The calories you burn from activity are automatically calculated from the number of steps, so the graphs and charts the gadget produces may not be accurate. You might be taking a brisk walk and exercising hard, or you might just be standing in place flapping your arms. Long before students enter the workforce, they can and should be thinking about things like this. They should know how data tools work, understand what they measure, and factor that into deciding whether they are appropriate to answer a particular question.
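
For readers who like to see the idea in code, here is a minimal sketch of a step-only calorie estimate. It is not how any particular fitness gadget works. The constant `CALORIES_PER_STEP`, the function `estimate_calories`, and the step counts are all hypothetical, chosen only to show that a tool that sees nothing but steps gives very different activities the same estimate.

```python
# Hypothetical constant: calories credited per counted step. This is an
# illustrative assumption, not a value taken from any real device.
CALORIES_PER_STEP = 0.04


def estimate_calories(step_count: int) -> float:
    """Estimate calories from the step count alone, ignoring heart rate and effort."""
    return step_count * CALORIES_PER_STEP


# Each activity is recorded only as the number of "steps" the wrist sensor counted.
# The counts are made up for illustration.
activities = {
    "brisk walk": 3000,
    "cross-country skiing": 3000,
    "standing and swinging arms": 3000,
}

estimates = {name: estimate_calories(steps) for name, steps in activities.items()}

for name, calories in estimates.items():
    print(f"{name}: about {calories:.0f} calories estimated")
```

All three activities receive the same estimate because the step count is the only input. That is exactly the kind of limitation students should learn to ask about before trusting a chart.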

Learning how to question data and data tools does not happen overnight. Yet we can design curriculum and activities that help students learn to think about data in an analytic way from a very young age. We can also prepare teachers to promote the emerging data literacy of students of all ages. How teachers foster students’ data literacy will look very different by grade and by student. Over time, from grade to grade, data literacy skills and knowledge will build. Students can gradually master many of the skills in the profile: defining problems, selecting appropriate data, designing experiments, conducting exploratory analyses, evaluating results, and writing reports that convey the “story” the data tells (and its limitations). And for students of all ages, there are appropriate strategies to cultivate critical thinking and problem-solving skills, which are key aspects of a big-data-enabled specialist’s work.

As students grow older, we can expose them to big data sets, tools, and techniques they will use in a wide range of careers. We use the acronym CLIP to describe what sets these data sets apart from others:

  • Complex (including a variety of data types, collected using varied methodologies and instruments)
  • Large (including more data than are appropriate to answer any particular question)
  • Interactively accessed (offering choices about what data to examine and how to visualize and analyze that data)
  • Professionally collected (going beyond what students are able to collect themselves)

Early on, our Institute recognized that in the last decade, students’ online access to a broad variety of data sets has been exploding. These data sets offer rich opportunities for students to develop the skills that are unique to working with CLIP data. These include the ability to select appropriate data to investigate a question, create a variety of customized data visualizations appropriate to answer a question, relate multiple data parameters to each other, and use multiple lines of evidence to support a claim. In our Ocean Tracks Phase 1 (high school) and Ocean Tracks: College Edition projects, our Institute, Stanford University, and the Scripps Institution of Oceanography developed and are testing a Web interface and data analysis tools that simplify accessing and analyzing data. The tools and interface make it possible for students, with the support of carefully designed and tested curriculum and teacher facilitation, to practice some of these sophisticated skills. We’re also working to identify and tackle obstacles students encounter as they try to work with large data sets.

As we continue to “mine” the Big-Data-Enabled Specialist profile and identify recommendations for K–16 curriculum and instruction, we are dedicated to preparing students to engage in tomorrow’s big-data workforce. But we are also focused on preparing all citizens to live wisely and productively in a data-rich world, whether they are trying to make choices about what house to buy, how to educate their children, who to vote for, what news to believe, or how to use a fitness gadget to improve their health.