UC Berkeley to co-lead regional data science ‘brain trust’

November 2, 2015
By: Sarah Yang

UC Berkeley is teaming up with UC San Diego and the University of Washington to lead one of four regional “brain trusts” in data science established by the National Science Foundation.

In today’s announcement of the Big Data Regional Innovation Hubs, or BD Hubs, the NSF cited the need to facilitate large, multi-sector collaborations to accelerate advances in data science. The program is meant to increase opportunities for sharing ideas, resources, and best practices; reduce coordination costs; and tap into the pool of top experts in the field to address regional issues.

The NSF will award $5 million over three years to four regional hubs in the Northeast, South, Midwest and West. The four hubs cover all 50 states and involve collaborations with 281 partners from academia, industry, government and non-profit organizations. The West Hub spans 13 states and includes a third of the partners.

UC Berkeley is co-leading the West Big Data Innovation Hub with UC San Diego and the University of Washington, and will be the home base for executive director Meredith Lee.

“We’re excited to build and strengthen partnerships in the region and across the nation as we launch this effort,” said Lee, a former science and technology policy fellow with the U.S. Department of Homeland Security’s Big Data Analytics team. “With each collaborator bringing their unique skills and perspectives to the table, we have an opportunity to advance the data science ecosystem in new and creative ways. The West region in particular can leverage a strong track record of entrepreneurialism and hands-on experimentation.”

The principal investigators of the West Hub are Michael Franklin, Thomas M. Siebel Professor of Computer Science and chair of UC Berkeley’s Computer Science Division; Michael Norman, professor of physics at UC San Diego and director of the San Diego Supercomputer Center; and Ed Lazowska, Bill and Melinda Gates Chair in Computer Science and Engineering at the University of Washington and director of the UW eScience Institute.

“The NSF Big Data Innovation Hub program will bring together a wide range of participants to accelerate data-driven solutions to many of the most pressing problems facing our region and beyond,” said Franklin, Berkeley’s West Hub principal investigator. “Berkeley has a leadership position in Big Data research through high-impact projects such as the AMPLab and the Berkeley Institute for Data Science, and we look forward to lending our expertise to this exciting new initiative.”

The West Hub will focus on five thematic areas:

  • Big Data technology: The Western region leads the nation in data management innovation through its unique blend of leading universities, national laboratories, start-ups and established companies that are at the center of big data research and analysis.
  • Managing natural resources and hazards: Big data can be used to record and manage environmental challenges related to earthquakes, wildfires, floods, air and water quality, and more.
  • Precision medicine: Advances in genomics and the personalized wellness industry have transformed modern healthcare. Big data is ushering in an era where medical care can be tailored to individual patients.
  • Metro data science: Increasing urbanization has created new challenges in maintaining efficient, sustainable and livable urban areas. Big data enables informed decisions in the smart management of the increasingly complex interconnections among residents, transportation, housing and public services.
  • Data-enabled scientific discovery and learning: Digitally generated data is flooding in from a wide range of sources such as global climate models, earthquake scenarios, environmental monitors, satellites and telescopes, laboratory instruments and social science data. The massive amounts of information provided by these sources could lead to meaningful insights and discoveries.

The BD Hubs program builds upon the National Big Data Research and Development Initiative, which was announced in 2012. Other hubs will be coordinated by Columbia University (Northeast Hub), the University of North Carolina and Georgia Tech (South Hub) and the University of Illinois at Urbana-Champaign (Midwest Hub).

“The BD Hubs program represents a unique approach to improving the impact of data science by establishing partnerships among like-minded stakeholders,” said Jim Kurose, NSF’s head of Computer and Information Science and Engineering. “In doing so, it enables teams of data science researchers to come together with domain experts, with cities and municipalities, and with anchor institutions to establish and grow collaborations that will accelerate progress in a wide range of science and education domains with the potential for great societal benefit.”

The BD Hub grant is the latest award to UC Berkeley in the field of data science. Three and a half years ago, the NSF awarded $10 million to UC Berkeley to help establish the AMPLab, the birthplace of the widely used Berkeley Data Analytics Stack (BDAS), a set of open-source software programs, including Apache Spark, designed to quickly process and make sense of Big Data.

The BD Hubs program is expected to lead to a larger NSF initiative called Big Data Spokes (BD Spokes), which aims to help initiate research in specific priority areas identified by the BD Hubs. More information about the BD Spokes program is online.