Landslide Susceptibility in Santa Barbara County

An Interactive Map Displaying Calculated Landslide Risk In Santa Barbara
Python
Landslide
Logistic Regression
Santa Barbara
Author

Ryan Green

Published

April 27, 2026

Introduction

On January 9, 2018, a debris flow killed 23 people and destroyed more than 100 homes in Montecito, California [1]. It was triggered by a rainstorm that dropped roughly half an inch of rain in five minutes [2] onto slopes that the Thomas Fire had burned just weeks before. That fire burned close to 282,000 acres across Ventura and Santa Barbara counties in late 2017 [3], leaving behind bare soil and a hydrophobic surface that promoted runoff. The result was one of the most destructive slope failures in the county’s recent history.

This project builds a landslide susceptibility model for Santa Barbara County, identifying the landscape characteristics associated with slope failure and validating the results against recorded landslide locations.

Open Interactive Map

Approach: Logistic Regression

This analysis uses the logistic regression implementation in the Python package scikit-learn (sklearn.linear_model.LogisticRegression). The model estimates the probability that a location is a landslide ‘presence’.
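As a rough illustration of what fitting and querying such a model looks like, here is a minimal sketch on synthetic data. The two features, the sample sizes, and the random seed are assumptions made for the example and are not the project's training data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-in for a training table: two normalized factors per point
# (think slope and precipitation). All values are illustrative only.
n = 200
presence = rng.normal(loc=0.7, scale=0.15, size=(n, 2))  # label 1
absence = rng.normal(loc=0.3, scale=0.15, size=(n, 2))   # label 0

X = np.vstack([presence, absence])
y = np.concatenate([np.ones(n, dtype=int), np.zeros(n, dtype=int)])

model = LogisticRegression()
model.fit(X, y)

# predict_proba returns one column per class; column 1 is P(presence).
steep_wet = np.array([[0.9, 0.9]])
flat_dry = np.array([[0.1, 0.1]])
p_high = model.predict_proba(steep_wet)[0, 1]
p_low = model.predict_proba(flat_dry)[0, 1]
print(f"P(presence) steep/wet: {p_high:.2f}, flat/dry: {p_low:.2f}")
```

The probability for a location that looks like the presence group should come out well above the probability for one that looks like the absence group.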

The model was trained on 926 confirmed landslide sites drawn from the USGS National Landslide Inventory (used as “presences”), as well as ~4,600 comparison points placed in areas at least 200 meters away from any known landslide (used as “pseudo-absences”). By comparing the landscape characteristics at these two groups of locations, the model learns which combinations of terrain, climate, and land conditions are associated with higher landslide likelihood.
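The 200-meter exclusion rule can be sketched with simple rejection sampling. This is one possible way to do it, not necessarily how the project did: the function name, the bounding box, and the synthetic presence points are hypothetical, and coordinates are assumed to be in a projected system measured in meters.

```python
import numpy as np

def sample_pseudo_absences(presences, bounds, n_points, min_dist=200.0, seed=0):
    """Draw random points inside `bounds` that lie at least `min_dist`
    meters from every presence point (simple rejection sampling)."""
    rng = np.random.default_rng(seed)
    xmin, ymin, xmax, ymax = bounds
    kept = []
    total = 0
    while total < n_points:
        # Propose a batch of uniformly distributed candidates.
        cand = rng.uniform([xmin, ymin], [xmax, ymax], size=(n_points, 2))
        # Distance from every candidate to every presence point.
        diff = cand[:, None, :] - presences[None, :, :]
        nearest = np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)
        ok = cand[nearest >= min_dist]
        kept.append(ok)
        total += len(ok)
    return np.vstack(kept)[:n_points]

# Synthetic example: 50 presences in a 10 km box, about five absences each.
rng = np.random.default_rng(1)
presences = rng.uniform(0, 10_000, size=(50, 2))
absences = sample_pseudo_absences(presences, (0, 0, 10_000, 10_000), n_points=250)
print(absences.shape)
```

The all-pairs distance calculation is fine at this size. For a full county, a spatial index or a buffered polygon mask would scale better.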

The output is a probability score for every 10-meter pixel across the county, indicating how closely each location resembles the conditions where landslides have been recorded. These scores are grouped into five classes (Very Low through Very High) based on where they fall in the county-wide distribution of all land pixels.
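A percentile-based grouping like this might look as follows. The break points (20th, 40th, 60th, 80th percentiles) and the middle class names are assumptions for illustration, since the exact breaks are not stated here.

```python
import numpy as np

# "Very Low", "High", and "Very High" appear in the text; the rest are assumed.
CLASS_NAMES = ["Very Low", "Low", "Moderate", "High", "Very High"]

def classify_by_percentile(prob, breaks=(20, 40, 60, 80)):
    """Assign each pixel a class index 0-4 from county-wide percentile breaks.
    NaN pixels (for example water or no-data) are given -1."""
    valid = ~np.isnan(prob)
    edges = np.percentile(prob[valid], breaks)
    classes = np.full(prob.shape, -1, dtype=int)
    classes[valid] = np.digitize(prob[valid], edges)
    return classes, edges

# Tiny synthetic probability grid with one no-data pixel.
prob = np.array([[0.05, 0.20, np.nan],
                 [0.45, 0.70, 0.95]])
classes, edges = classify_by_percentile(prob)
print(classes)
```

Because the breaks come from the distribution of scores rather than fixed probability thresholds, the classes are relative: "Very High" means high compared with the rest of the county.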

The Eight Factors

Each factor captures a different physical condition that contributes to slope instability. The influence percentage shown for each factor is how much that factor contributed to the final model, based on its statistical coefficient. All data sources are listed in the Footnotes section.
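One common way to turn coefficients into influence percentages is to take each coefficient's share of the total absolute magnitude. The sketch below assumes that approach and assumes the inputs were scaled to a common range first; the coefficient values are hypothetical, and the exact formula used for the percentages in this post may differ.

```python
def influence_percentages(coefficients):
    """Share of total absolute coefficient magnitude, as a percentage."""
    total = sum(abs(c) for c in coefficients.values())
    return {name: 100.0 * abs(c) / total for name, c in coefficients.items()}

# Hypothetical coefficients for three factors, for illustration only.
coefs = {"slope": 2.0, "precipitation": 1.5, "ndvi": -0.5}
shares = influence_percentages(coefs)
print(shares)
```

Taking absolute values means a factor with a strong negative relationship still counts as influential.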

Slope (27.5%) Steep terrain is the most important predictor of landslide susceptibility in this analysis. Slope is derived from a USGS 3DEP 10-meter digital elevation model [4].
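For readers curious how slope comes out of an elevation grid, here is a minimal sketch using finite differences. GIS tools often use a 3x3 neighborhood method instead, so this is an approximation of the idea rather than the exact routine used in this project; the function name and synthetic surface are made up for the example.

```python
import numpy as np

def slope_degrees(dem, cell_size=10.0):
    """Slope angle in degrees from a DEM with square cells of `cell_size` meters."""
    # np.gradient returns derivatives along rows (y) first, then columns (x).
    dz_dy, dz_dx = np.gradient(dem, cell_size)
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))

# A synthetic plane rising 10 m per 10 m cell along x: a 45 degree slope.
dem = np.tile(np.arange(5) * 10.0, (5, 1))
slope = slope_degrees(dem)
print(slope[0, 0])
```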

Precipitation (22.4%) Areas that receive more intense rainfall during extreme storm events are more likely to experience slope failure. This factor uses the 100-year, 24-hour storm depth from NOAA Atlas 14 [5], which is the amount of rain expected at each location in a single day during a very severe storm.

Land Cover (17.1%) Vegetation helps stabilize slopes by anchoring soil with roots and promoting water infiltration. This factor uses an ecosystem classification [6] to score each land cover type by its associated risk level.

Terrain Curvature (9.3%) Curvature describes the shape of the slope. Concave slopes tend to funnel runoff and debris, and are often associated with landslide scour zones.

Burn Severity (9.1%) Areas that have recently burned carry elevated landslide risk, because fire removes vegetation and disrupts soil structure. This factor is calculated from CAL FIRE perimeter data [7] and gives more weight to more recent fires, using a three-year exponential decay, to approximate vegetation recovery after a fire.
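The decay weighting can be sketched as below. Two things here are assumptions: "three-year" is read as the time constant of the exponential (it could equally be a half-life), and overlapping fires are combined by keeping the largest weight. The function names are hypothetical.

```python
import math

def burn_weight(years_since_fire, decay_years=3.0):
    """Weight in (0, 1] that shrinks exponentially as a fire ages."""
    if years_since_fire < 0:
        raise ValueError("fire date is in the future")
    return math.exp(-years_since_fire / decay_years)

def burn_score(fire_years, reference_year):
    """Score for one pixel from the years of all fires that burned it."""
    weights = [burn_weight(reference_year - y) for y in fire_years]
    # Assumed rule: the most recent fire dominates; unburned pixels score 0.
    return max(weights, default=0.0)

print(round(burn_weight(0), 3), round(burn_weight(3), 3), round(burn_weight(9), 3))
```

Under these assumptions a fire from this year has full weight, a three-year-old fire has a weight of about 0.37, and a nine-year-old fire has a weight of about 0.05.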

Lithology (7.6%) The underlying rock type can affect slope failure. Sedimentary deposits tend to absorb water and lose strength, while harder crystalline rocks are more stable. Rock types are drawn from the USGS State Geologic Map Compilation [8].

Soil Erodibility (6.7%) Some soil types move more easily with runoff than others. This factor uses the K-factor from the USDA gSSURGO soil database [9], measuring how easily a soil erodes. Higher soil erodibility is associated with higher landslide likelihood.

Vegetation Density – NDVI (<1%) The Normalized Difference Vegetation Index (NDVI) measures the ‘greenness’ of vegetation, as seen from satellite imagery [10]. Low NDVI values indicate sparse or dead vegetation. In this model, NDVI contributed very little additional information beyond what the land cover factor already captured, so its influence on the final model is negligible.
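NDVI itself is a simple band ratio: (NIR - Red) / (NIR + Red). For Sentinel-2, the near-infrared band is B08 and the red band is B04. A minimal sketch, using made-up reflectance values and a hypothetical function name:

```python
import numpy as np

def ndvi(nir, red):
    """Normalized Difference Vegetation Index; NaN where NIR + Red is zero."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    denom = nir + red
    out = np.full(denom.shape, np.nan)
    # Only divide where the denominator is non-zero; other cells stay NaN.
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out

# Dense vegetation, bare ground, and a no-data pixel (illustrative values).
values = ndvi([0.5, 0.2, 0.0], [0.1, 0.2, 0.0])
print(values)
```

Values range from -1 to 1, with healthy vegetation toward the high end.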

How the Map Is Made

The eight factors are assembled into a grid of values covering the entire county, at 10-meter resolution. These are normalized to a common scale (from 0 to 1) and given to the logistic regression model, which combines them into a single probability score for each grid square, or pixel. That score is then classified into the five susceptibility classes, based on the percentile distribution across all land pixels in the county. The interactive map shows which parts of Santa Barbara County have landscape conditions that are most similar to where landslides have been recorded.
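The normalize-then-combine step can be sketched as follows. Min-max scaling is assumed as the way to reach the 0 to 1 range, and the coefficients, intercept, and tiny grids are hypothetical values chosen for illustration. The combination itself is the standard logistic function applied to a weighted sum of the factors.

```python
import numpy as np

def min_max(layer):
    """Rescale a raster to the 0-1 range, ignoring NaN cells."""
    lo, hi = np.nanmin(layer), np.nanmax(layer)
    if hi == lo:
        return np.zeros_like(layer, dtype=float)
    return (layer - lo) / (hi - lo)

def probability_grid(layers, coefs, intercept):
    """Combine factor rasters into one probability per pixel."""
    stack = np.stack([min_max(layer) for layer in layers])  # (factors, rows, cols)
    z = intercept + np.tensordot(coefs, stack, axes=1)       # weighted sum per pixel
    return 1.0 / (1.0 + np.exp(-z))                          # logistic function

# Two hypothetical 2x2 factor grids: slope in degrees and storm depth in mm.
slope = np.array([[0.0, 15.0], [30.0, 45.0]])
rain = np.array([[100.0, 150.0], [200.0, 250.0]])
prob = probability_grid([slope, rain], coefs=np.array([2.0, 1.0]), intercept=-1.5)
print(prob.round(3))
```

In this toy grid both factors increase toward the bottom-right pixel, so the probability does too.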

How Well It Works

The model was evaluated three ways:

Cross-validation: The county was divided into seven geographic blocks. Each block was left out in turn while the model was trained on the other six, then tested on the withheld block to check whether the model generalizes to areas it was not trained on. The average AUC across all seven blocks was 0.719 (where 1.0 is a perfect model, and 0.5 is random guessing). The variation across blocks (±0.256) was high, reflecting the fact that most training landslides are clustered in the Santa Ynez Mountains. Therefore, the model performs better in some parts of the county than others.

Hold-out test: A random 20% of the 926 training landslide points were set aside before training. Of these 185 withheld points, 49.7% fell in the High or Very High susceptibility class.

Independent validation: 8,323 landslide points mapped after the January 9, 2023 atmospheric river storm in the Santa Ynez Mountains were compared against the model output. This data was not used in model training. 92.8% of those points fell in the High or Very High class. This is the strongest indicator of model performance, because the validation data comes from a real storm event.
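The block cross-validation and the "share of points in High or Very High" checks described above can be sketched with scikit-learn. Everything here is synthetic: the block labels, features, and class indices are assumptions, and the way the county was actually split into blocks is not shown.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import LeaveOneGroupOut

def block_cv_auc(X, y, blocks):
    """AUC on each withheld block, training on all remaining blocks."""
    scores = []
    for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=blocks):
        model = LogisticRegression().fit(X[train_idx], y[train_idx])
        prob = model.predict_proba(X[test_idx])[:, 1]
        scores.append(roc_auc_score(y[test_idx], prob))
    return np.array(scores)

def hit_rate(classes_at_points, high_classes=(3, 4)):
    """Fraction of landslide points whose class index is in `high_classes`."""
    return float(np.isin(classes_at_points, high_classes).mean())

# Synthetic data: 7 blocks of 60 points, half presences in every block.
rng = np.random.default_rng(0)
n_blocks, per_block = 7, 60
blocks = np.repeat(np.arange(n_blocks), per_block)
y = np.tile([0, 1], n_blocks * per_block // 2)
X = rng.normal(size=(len(y), 2)) + y[:, None] * 1.0

aucs = block_cv_auc(X, y, blocks)
print(f"mean AUC {aucs.mean():.3f}, spread {aucs.std():.3f}")

# Class indices 0-4 sampled at five hypothetical validation points.
print(hit_rate(np.array([4, 3, 1, 0, 4])))
```

Grouping the folds by location rather than shuffling points at random matters because nearby points tend to look alike, which would otherwise make the score look better than it is. Note that AUC is undefined for a block that contains only presences or only absences, so real blocks need both.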

Interactive Map

This project outputs an interactive map of Santa Barbara County, displaying the susceptibility data along with several additional data layers: historical landslide deposit polygons from the USGS National Landslide Inventory (1,043 confirmed events) [11], the 926 training landslide points used to build the model, Quaternary fault lines with slip-rate and age attributes [12], CAL FIRE fire perimeters since 2016 [13], and geology [14]. An address search tool allows users to look up the risk score and the leading contributing factors for any address in the county.

Limitations

  1. Training data is unevenly distributed across the county. The 926 training landslide locations are heavily concentrated in the Santa Ynez Mountains, because that is where most historical landslide mapping has been done. In other parts of the county, the lack of recorded landslides reflects gaps in data rather than a lower rate of slope failure. The model may underestimate risk in areas with fewer recorded landslides.

  2. All input layers represent conditions at a fixed point in time. Changes to data or environmental factors will need to be added manually to update the model.

  3. The model identifies susceptibility, not failure probability. It shows where landscape conditions resemble those of past landslide locations, but it cannot predict when or whether a specific slope will fail.

  4. No triggering events are included. Landslides require both a susceptible landscape and something to set them off, such as intense rainfall or an earthquake. This model captures only the inherent terrain conditions. Two areas with the same susceptibility score in this model may have very different actual failure histories, depending on the trigger events they have experienced.

  5. Historical landslide polygons have variable location accuracy. The USGS NLI polygons displayed in the map range from “Possible” to “High confidence” in location accuracy. They should not be treated as precise failure boundaries.

  6. Small-scale terrain features may not be captured. At 10-meter pixel resolution, susceptibility in localized steep terrain may be underestimated relative to the surrounding area. Higher-resolution input data could help capture this.


Footnotes

  1. Kean, J.W., et al. (2019). “Inundation, flow dynamics, and damage in the 9 January 2018 Montecito debris-flow event, California, USA.” Geosphere, 15(4): 1140-1163. https://doi.org/10.1130/GES02040.1

  2. Oakley, N.S., et al. (2018). “Brief communication: Meteorological and climatological conditions associated with the 9 January 2018 post-fire debris flows in Montecito and Carpinteria, California, USA.” Natural Hazards and Earth System Sciences, 18: 3037-3043. https://nhess.copernicus.org/articles/18/3037/2018/

  3. Oakley et al. (2018). Same as [2]. The Thomas Fire burned 281,893 acres across Ventura and Santa Barbara Counties, December 4, 2017 through January 12, 2018.

  4. U.S. Geological Survey, 3D Elevation Program (3DEP). 1/3 arc-second National Elevation Dataset – digital elevation model. https://www.usgs.gov/3d-elevation-program

  5. NOAA Hydrometeorological Design Studies Center. NOAA Atlas 14 Precipitation Frequency Data Server – 100-year/24-hour storm depth. https://hdsc.nws.noaa.gov/pfds/

  6. U.S. Geological Survey. GAP/LANDFIRE National Terrestrial Ecosystems 2011 – land cover classification. https://www.usgs.gov/data/gaplandfire-national-terrestrial-ecosystems-2011

  7. CAL FIRE Fire and Resource Assessment Program. California Fire Perimeters (all) – burn severity and fire history display layer. https://data.ca.gov/dataset/california-fire-perimeters-all

  8. Horton, J.D., San Juan, C.A., and Stoeser, D.B. (2017). USGS State Geologic Map Compilation (SGMC) – California geology layer. U.S. Geological Survey Data Series 1052. https://www.usgs.gov/data/state-geologic-map-compilation-sgmc-geodatabase-conterminous-united-states

  9. USDA Natural Resources Conservation Service. Gridded Soil Survey Geographic (gSSURGO) Database – soil erodibility (K-factor). https://www.nrcs.usda.gov/resources/data-and-reports/gridded-soil-survey-geographic-gssurgo-database

  10. European Space Agency. Sentinel-2 satellite imagery – used to derive NDVI via Microsoft Planetary Computer. https://www.esa.int/Applications/Observing_the_Earth/Copernicus/Sentinel-2

  11. U.S. Geological Survey. Landslide Inventories across the United States, ver. 3.0, February 2025 – training and display landslide polygons. https://www.usgs.gov/data/landslide-inventories-across-united-states-ver-30-february-2025

  12. U.S. Geological Survey. Quaternary Fault and Fold Database of the United States – fault lines display layer. https://www.usgs.gov/programs/earthquake-hazards/faults

  13. CAL FIRE Fire and Resource Assessment Program. California Fire Perimeters (all) – burn severity and fire history display layer. https://data.ca.gov/dataset/california-fire-perimeters-all

  14. Macrostrat. Geologic map API – supplemental geology attributes. https://macrostrat.org/api/