The Big Data Driving Google Maps

Patrick McDeed, an intern with EDC’s Oceans of Data Institute (ODI), is contributing to several ODI curriculum R&D initiatives including Ocean Tracks: College Edition, an interactive Web-based learning resource that helps undergraduate students develop valuable skills in analyzing and learning from large, authentic scientific datasets. To this work, he brings a deep interest in mathematics and secondary education, as well as experience manipulating and reporting on “Big Data” as a consumer online behavior analyst—in which capacity he drew upon some of the skills and knowledge that ODI documented in its “Occupational Skills Profile for the Big-Data-Enabled-Specialist.” In this post, originally published on ODI’s blog, Patrick explores the important role that “Big Data”—in the form of Google Maps—plays in helping us navigate our daily lives as we commute to work, carpool our kids to school, and take road trips.

For many, the term “Big Data” remains very much a black box. How it is collected, managed, and then analyzed for practical use is still largely unknown. To help unpack the mysteries of this buzzword, it is useful to explore the ways big data affects our day-to-day lives, from how we engage with content online to how we get through our daily commute.

According to a recent report from the Texas Transportation Institute, the typical American loses an average of 42 hours a year sitting in traffic. What if, instead of costly infrastructure improvements like Boston’s Big Dig, the solution to our traffic problems was big data? Real-time data collected from smartphone GPS, traffic cameras, stoplights, weather reports, and even social media text give big data analysts a diverse dataset with which to better predict traffic flow and how drivers will behave on any given day.

Fortunately, avoiding that rush hour traffic isn’t reserved only for the data-savvy and technical elite. In fact, many of us have the capabilities right in our pockets: GPS-enabled smartphones loaded with mapping applications like Google Maps.

Normally, we roll out of bed every morning, grab our cup of coffee, and prepare ourselves for that inevitable backup on the highway. We know this should be our fastest route to work: the highway is more direct and the speed limits are higher. It doesn’t take advanced analytics for us to reason this out. Unfortunately, our fellow commuters often throw a wrench into our well-formulated morning routine. To combat this, we occasionally turn on Google Maps before heading out to work to see which route it suggests. Lo and behold, it directs us to the back roads. While we certainly don’t complain about avoiding that two-mile backup on the highway, we’re left wondering: How did Google see this coming and know the back roads would be the fastest route? The short answer, as we’ve alluded to, is data: lots and lots of data.
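To make the highway-versus-back-roads comparison concrete, here is a minimal Python sketch. It assumes each route can be described as a list of segments with a length and a current speed; the function names, distances, and speeds are all made up for illustration and are not how Google Maps actually represents or scores routes.

```python
def travel_time_minutes(distance_miles, speed_mph):
    """Minutes needed to cover a distance at a steady speed."""
    if speed_mph <= 0:
        raise ValueError("speed must be positive")
    return distance_miles / speed_mph * 60


def fastest_route(routes):
    """Return the name of the route with the smallest total travel time.

    `routes` maps a route name to a list of
    (distance_miles, current_speed_mph) segments.
    """
    totals = {
        name: sum(travel_time_minutes(d, s) for d, s in segments)
        for name, segments in routes.items()
    }
    return min(totals, key=totals.get)


# Hypothetical morning: the highway is shorter, but its last
# two miles are crawling along at 5 MPH.
routes = {
    "highway": [(8.0, 60.0), (2.0, 5.0)],
    "back roads": [(12.0, 30.0)],
}

print(fastest_route(routes))  # prints "back roads"
```

With these invented numbers the highway takes 32 minutes (8 moving freely plus 24 in the backup) and the back roads take 24, so the longer route wins once current speeds are known.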

The long answer is a bit more complex. It’s not just “data” that powers the amazing predictive powers of Google Maps; it’s rich and reliable data, from many sources, coupled with powerful machine-learning algorithms. Google’s main source of data is our smartphones. Phone-based GPS tells the Google Maps system about drivers’ location, relative speed, and itinerary. This allows Google to calculate the density of traffic on a particular stretch of road or see, for example, that cars are suddenly moving 12 MPH slower than they should be.
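The idea of turning individual GPS readings into a picture of a road can be sketched in a few lines of Python. This is only an illustration under simple assumptions: each reading is a hypothetical (device, road segment, speed) record, and each segment has a known expected speed. The record layout, segment names, and numbers are invented here, and Google's real pipeline is certainly far more sophisticated.

```python
from collections import defaultdict
from statistics import mean


def segment_summary(pings, expected_mph):
    """Summarize GPS pings per road segment.

    `pings` is a list of (device_id, segment, speed_mph) tuples.
    `expected_mph` maps each segment to its normal speed.
    Returns, per segment, the number of distinct devices seen
    (a rough stand-in for traffic density), the average speed,
    and how far that average falls below the expected speed.
    """
    speeds = defaultdict(list)
    devices = defaultdict(set)
    for device_id, segment, speed_mph in pings:
        speeds[segment].append(speed_mph)
        devices[segment].add(device_id)

    summary = {}
    for segment, values in speeds.items():
        average = mean(values)
        summary[segment] = {
            "vehicles": len(devices[segment]),
            "avg_mph": average,
            "slowdown_mph": expected_mph[segment] - average,
        }
    return summary


# Hypothetical readings from four phones.
pings = [
    ("phone-a", "highway", 43.0),
    ("phone-b", "highway", 41.0),
    ("phone-c", "highway", 45.0),
    ("phone-d", "back_road", 30.0),
]
expected_mph = {"highway": 55.0, "back_road": 30.0}

summary = segment_summary(pings, expected_mph)
print(summary["highway"]["slowdown_mph"])  # prints 12.0
```

In this toy example three phones on the highway average 43 MPH against an expected 55 MPH, which is the kind of 12 MPH slowdown described above.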

This real-time GPS data is only a small piece of a much larger puzzle. Real-time predictions are improved further by historical data. What did the flow of traffic look like yesterday? Last week? Two years ago? Google takes into account daily calendar events and adjusts for whether it’s a normal workday, a holiday, or even if there is a road race taking place that will cause street closures.
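One simple way to picture how live and historical data might be combined is a weighted average, sketched below in Python. This is an assumption made purely for illustration: the weighting scheme, the `predict_speed` function, and the historical figures are hypothetical, and the post does not say which methods Google actually uses.

```python
def predict_speed(live_mph, historical_mph, live_weight=0.5):
    """Blend a live speed reading with the historical norm.

    `live_weight` is the share of the prediction taken from the
    live reading (0.0 to 1.0). If no live reading is available,
    fall back to the historical value alone.
    """
    if not 0.0 <= live_weight <= 1.0:
        raise ValueError("live_weight must be between 0 and 1")
    if live_mph is None:
        return historical_mph
    return live_weight * live_mph + (1.0 - live_weight) * historical_mph


# Hypothetical historical averages for one stretch of highway,
# keyed by (type of day, hour of day).
historical = {
    ("workday", 8): 35.0,
    ("holiday", 8): 58.0,
}

# Same live reading, different calendar context.
workday_guess = predict_speed(43.0, historical[("workday", 8)])
holiday_guess = predict_speed(43.0, historical[("holiday", 8)])

print(workday_guess)  # prints 39.0
print(holiday_guess)  # prints 50.5
```

The same 43 MPH reading leads to different expectations depending on whether history says 8 a.m. on this road is usually slow (a workday) or usually clear (a holiday).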

To improve its ability to help drivers even further, Google purchased the navigation start-up Waze in 2013 to diversify its data sources. Waze relies on its users to report accidents, bottlenecks, and traffic as they drive (or sit stopped in traffic). Waze also partners with local city authorities to obtain accident data, construction updates, road closures, etc. This data allows Waze, and now Google, to take into account the roads and intersections that are prone to accidents or to adjust in real time for, say, a water main break.

The data sources useful for improving traffic flow will only continue to grow as cities update their public transportation systems with more sensors or install “smart” traffic signals. To keep their competitive advantage, companies like Google will need workers who can make sense of new and growing volumes of data as they become available, and who can determine how best to leverage these data in real time to give us average drivers the fastest route to work every morning.

We live in a data-driven world. Big data affects our day-to-day lives in increasingly diverse ways, including, as we’ve just seen, even that pesky commute. This is one of the many reasons why ODI is working toward building data skills into the learning experiences of all K-16 students. (Sign ODI’s Call to Action to Promote Data Literacy.)