Guest Column | January 13, 2021

Statistical Modeling In Support Of Lead Service Line Identification, Inventory, And Replacement

By Jake Abernethy and Eric Schwartz


One of the most high-profile, pressing, and complicated infrastructure problems gets chopped down to size with the help of data-driven analysis.

Replacing lead service lines (LSLs) is a public health and infrastructure priority that can be a costly and time-consuming endeavor for municipalities. Statistical modeling allows for faster, more accurate identification of LSLs, equipping decision-makers with the information they need to plan and prioritize replacement work.

The EPA estimates that there are between 6 million and 10 million LSLs in the U.S. The Robert Wood Johnson Foundation and The Pew Charitable Trusts report that removing LSLs from the homes of children born in 2018 alone would protect more than 350,000 children and yield more than $2.7 billion in future benefits.

Compared to a conventional approach, the statistical model allowed the replacement crews to better target homes for replacement, empowering decision-makers with data to save tens of millions of dollars and accelerate the removal of lead pipes.

The American Water Works Association (AWWA) estimates that replacement costs could be as high as $30 billion. Uncertainty around the number and location of LSLs makes it difficult to execute efficient replacement programs. Statistical modeling has been shown to improve the identification, inventory, and replacement of LSLs, reducing both the time and the cost of replacement.

Conventional Methods vs. Statistical Modeling

The processes that water systems use to budget and prioritize their replacement projects carry significant public health and utility costs, yet these decisions often rest on incomplete, inaccurate, and unreliable historical records. Statistical modeling focuses on reducing exactly these uncertainties. Data-driven analyses, powered by fundamental statistical methods and machine learning, allow communities to inventory LSLs accurately and accelerate their removal in the most efficient and cost-effective way.

BlueConduit, a water data analytics company that helps cities reduce uncertainty around LSL inventory and location, has seen this firsthand. One city the team worked with had initially estimated that 10 to 20 percent of its pipes contained lead. However, a statistical analysis that included a representative set of inspections estimated that roughly 37 percent of active water accounts contained lead. After 25,000 inspections, the true proportion of lead in the community was 38 percent. This accuracy allowed the city to request the appropriate funding to remediate the problem and target the homes most likely to have lead lines. The same data challenges due to misleading or outdated records are shared by many municipalities and water utilities.
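To illustrate how a representative set of inspections can support a citywide estimate, here is a minimal sketch that computes the share of lead lines in a random sample along with a 95 percent Wilson score interval. The sample counts (1,000 inspections, 370 lead) are hypothetical and chosen only to mirror the roughly 37 percent figure above. The function name is likewise an assumption, and this is not a description of BlueConduit's actual method.

```python
import math

def wilson_interval(lead_found, inspected, z=1.96):
    """Return (estimate, low, high) for the share of inspected lines that are lead.

    Uses the Wilson score interval; z=1.96 corresponds to roughly 95% confidence.
    """
    if inspected <= 0:
        raise ValueError("inspected must be positive")
    p_hat = lead_found / inspected
    denom = 1 + z**2 / inspected
    center = (p_hat + z**2 / (2 * inspected)) / denom
    half_width = (z / denom) * math.sqrt(
        p_hat * (1 - p_hat) / inspected + z**2 / (4 * inspected**2)
    )
    return p_hat, center - half_width, center + half_width

# Hypothetical sample: 1,000 randomly selected inspections, 370 of them lead.
estimate, low, high = wilson_interval(lead_found=370, inspected=1000)
print(f"estimated lead share: {estimate:.1%} (95% interval {low:.1%} to {high:.1%})")
```

The interval is only meaningful if the inspected homes are chosen at random, or are otherwise representative. A sample drawn only from neighborhoods already suspected of having lead would skew the estimate.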

While statistical methods can help all types of communities, they provide the greatest benefits in cities with greater uncertainty.


How It Works

Statistical modeling uses information that is known (e.g., location, year built, water-main size and material, and construction records) to make an initial prediction about something that is not known with certainty: in this case, service line material. The utility gathers data on service line material at a representative set of homes. Combining that data with the previously known information, the model assigns a material likelihood (that is, a probability of lead between 0 and 100 percent) to parcels with “Unknown” SL materials. As those unknown materials become verified, the statistical model incorporates this new information and updates the likelihoods. These parcel-level likelihoods help municipalities “dig where the lead is,” saving time and money while reducing negative environmental health impacts.
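The predict-verify-update loop described above can be sketched with a deliberately simple stand-in for a real model. Parcels are grouped by a single known attribute, and each unknown parcel receives the smoothed share of verified lead lines in its group (a Beta-Binomial estimate with a uniform prior). The parcel IDs, the "group" field, and the data are all hypothetical. A production model would draw on many more features and on machine learning, and nothing here should be read as BlueConduit's actual implementation.

```python
from collections import defaultdict

def lead_likelihoods(parcels, prior_lead=1.0, prior_not_lead=1.0):
    """Estimate P(lead) for every parcel whose material is still "unknown".

    Each parcel is a dict with keys "id", "group" (a known attribute such as
    construction era) and "material" ("lead", "non-lead" or "unknown").
    The estimate is the prior-smoothed share of verified lead lines in the
    same group.
    """
    lead = defaultdict(int)
    verified = defaultdict(int)
    for parcel in parcels:
        if parcel["material"] != "unknown":
            verified[parcel["group"]] += 1
            lead[parcel["group"]] += int(parcel["material"] == "lead")

    likelihoods = {}
    for parcel in parcels:
        if parcel["material"] == "unknown":
            g = parcel["group"]
            likelihoods[parcel["id"]] = (lead[g] + prior_lead) / (
                verified[g] + prior_lead + prior_not_lead
            )
    return likelihoods

# Hypothetical inventory: a few verified parcels plus some unknowns.
parcels = [
    {"id": "A1", "group": "pre-1950", "material": "lead"},
    {"id": "A2", "group": "pre-1950", "material": "lead"},
    {"id": "A3", "group": "pre-1950", "material": "non-lead"},
    {"id": "A4", "group": "pre-1950", "material": "unknown"},
    {"id": "A5", "group": "pre-1950", "material": "unknown"},
    {"id": "B1", "group": "post-1950", "material": "non-lead"},
    {"id": "B2", "group": "post-1950", "material": "unknown"},
]

before = lead_likelihoods(parcels)

# A crew excavates parcel A4 and verifies lead; the likelihoods are refreshed.
parcels[3]["material"] = "lead"
after = lead_likelihoods(parcels)

print("before verification:", before)
print("after verification: ", after)
```

Once A4 is verified as lead, it drops out of the unknown set and the likelihood for A5, its neighbor in the same group, rises. This is the updating behavior the paragraph describes, in miniature.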

In a recent project, the accuracy of BlueConduit’s model exceeded that of the city’s previous partner by more than 25 percentage points. Over time, the model’s hit rate consistently remained above 75 percent.


Key Benefits Of Statistical Modeling

The accuracy of this model drives three primary benefits. First, a municipality saves time by avoiding unneeded excavations. For example, for a municipality targeting 500 service line replacements in a given time frame, the statistical model would suggest excavating 725 sites, while the baseline model would suggest excavating 1,000. The more accurate inventory that statistical modeling provides helps a city better plan and budget for the pipes that may need to be replaced.

Second, when addressing the “Unknowns,” assuming a conservative 70 percent accuracy rate for the statistical model, compared with 50 percent for the baseline, the statistical model could generate more than 25 percent in savings. In the example above, the model avoids 275 excavations; at an estimated cost of $3,000 per excavation, that saves $825,000.
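The arithmetic behind these figures can be reproduced in a few lines. The sketch below takes the numbers from the example (500 target replacements, 725 versus 1,000 excavations, $3,000 per excavation). It assumes that "accuracy" means the share of excavations that actually find a lead line, and the helper `excavations_needed` is a hypothetical name used only for illustration.

```python
import math

def excavations_needed(target_replacements, hit_rate):
    """Excavations required to find the target number of lead lines,
    assuming each excavation finds lead with probability hit_rate."""
    if not 0 < hit_rate <= 1:
        raise ValueError("hit_rate must be in (0, 1]")
    return math.ceil(target_replacements / hit_rate)

# Figures taken from the example in the text.
target = 500
model_excavations = 725
baseline_excavations = 1000
cost_per_excavation = 3000  # dollars

# Implied hit rates: about 69% for the model, 50% for the baseline.
model_hit_rate = target / model_excavations
baseline_hit_rate = target / baseline_excavations

avoided = baseline_excavations - model_excavations
share_saved = avoided / baseline_excavations
dollars_saved = avoided * cost_per_excavation

print(f"model hit rate: {model_hit_rate:.0%}, baseline hit rate: {baseline_hit_rate:.0%}")
print(f"excavations avoided: {avoided} ({share_saved:.1%} of baseline)")
print(f"estimated savings: ${dollars_saved:,}")
```

The 725 excavations in the example imply a hit rate just under 70 percent. Calling `excavations_needed(500, 0.70)` with exactly 70 percent would give a slightly lower count, so the example's figure sits on the conservative side.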

Finally, the quick identification of those parcels most likely to have an LSL enables municipalities to proactively engage in community outreach and adopt a program (e.g., water filter distribution) to reduce potential lead exposure before any excavation begins.

About The Authors

Jake Abernethy and Eric Schwartz are the co-founders of BlueConduit. Additionally, Abernethy is an associate professor in computer science at the Georgia Institute of Technology, and Schwartz is an associate professor of marketing at the University of Michigan. They have pioneered the use of machine learning to help municipalities and utilities identify and inventory lead service lines, helping municipalities save millions of dollars and accelerating the remediation of this critical health issue. Initially working with Flint, MI, BlueConduit now works with municipalities across the United States and Canada. Recognized as a leader in its field, BlueConduit has provided legislative policy development support and has had its work recognized by several media outlets (https://www.blueconduit.com/blog).