Harnessing big data to see the big picture in ecology

MSU will use a $4.2 million National Science Foundation grant to help shape the future of the changing field of ecology.

“Ecologists are often asked how lakes and rivers, forests and other natural systems of a country or region will change as climate changes,” said Patricia Soranno, MSU professor of fisheries and wildlife and macrosystems ecology pioneer. “But, ecologists still struggle with the basic idea of extrapolation – how to apply knowledge gained from individual ecosystem studies that ecologists typically study to all ecosystems within regions, across regions, and ultimately, across continents.”

An interdisciplinary team of ecologists, statisticians, computer scientists and data scientists will work to scale up traditional ecology to regional and continental scales. The evolving field is called “macrosystems ecology,” and it can play a critical role in helping solve many challenges prompted by a changing climate.

In the U.S. alone, for example, ecologists have many decades of data on lakes and rivers, collected from across the country, but nobody has put them all together to understand the nation's freshwater systems as a whole. The sources include many small, individual projects by university researchers; decades of natural resource monitoring by government agencies; terabytes of data from new or existing field sensors and observation networks; and millions of high-definition satellite images, to name a few.

Paired with this near-endless data deluge is easy access to supercomputers. Analyses that once took months or years to complete can now be conducted in hours or days. There are also many new computer science and statistical modeling tools available to analyze such large datasets.

The challenge, though, is that most ecologists do not know how to put all these data together or how to use these new tools.

Therefore, Soranno and her MSU co-leader and fellow ecologist, Kendra Spence Cheruvelil, assembled an interdisciplinary team that includes MSU computer scientists Pang-Ning Tan and Jiayu Zhou. The team will use big data and the latest computer tools to blend knowledge from individual studies and scale it up to show what is happening, today and into the future, across the entire U.S. Other members of the team are from the University of Wisconsin, the University of Missouri and Penn State University.

For example, many lakes in the United States experience nutrient or algal blooms that can affect water supplies, close beaches and cause illnesses in people, pets and wildlife. The team hopes to predict where these blooms are more likely to happen across all lakes in the continental U.S., about 150,000 in all.

Zhou and Tan joined the project to help these and other ecologists overcome the many significant data challenges that exist when studying so many ecosystems together. The challenges include missing values and “noisy” measurements, unknown relationships among hundreds of variables, and computers that might still be too slow when trying to model all lakes in an entire country, said Zhou.

“Our goal is to build a suite of novel computer science methods to explore the data to help us learn about all of the freshwater systems in the U.S., even from data that are messy and incomplete, and to share those methods with other scientists conducting such work,” he said.
