Read in 2023 FEMA National Risk Estimates

Mapping
Supporting Activism
Read in the FEMA National Risk Estimate table and split out by disaster
Author

Alan Jackson

Published

April 28, 2025

Read in the National Risk Estimates

Data downloaded from https://hazards.fema.gov/nri/data-resources#csvDownload

This is more painful than it sounds. The csv file has 467 columns, most of which are double precision, but some are character. And many are blank near the top of the file, so automatic schemes fail. Also, the number of columns for each disaster type varies, so simple schemes for determining how to read the data in will fail.

After (finally) getting it read in, I’ll look at it and then save files by disaster.

Metadata

The disaster codes are pretty easy to guess, but here are the definitions:

Hazard Prefix
Avalanche AVLN
Coastal Flooding CFLD
Cold Wave CWAV
Drought DRGT
Earthquake ERQK
Hail HAIL
Heat Wave HWAV
Hurricane HRCN
Ice Storm ISTM
Landslide LNDS
Lightning LTNG
Riverine Flooding RFLD
Strong Wind SWND
Tornado TRND
Tsunami TSUN
Volcanic Activity VLCN
Wildfire WFIR
Winter Weather WNTW
Suffix Meaning
EVNTS Number of Events
AFREQ Annualized Frequency
EXPB Exposure - Building Value
EXPP Exposure - Population
EXPPE Exposure - Population Equivalence
EXPA Exposure - Agriculture Value
EXPT Exposure - Total
EXP_AREA Exposure - Impacted Area (sq mi)
HLRB Historic Loss Ratio - Buildings
HLRP Historic Loss Ratio - Population
HLRA Historic Loss Ratio - Agriculture
HLRR Historic Loss Ratio - Total Rating
EALB Expected Annual Loss - Building Value
EALP Expected Annual Loss - Population
EALPE Expected Annual Loss - Population Equivalence
EALA Expected Annual Loss - Agriculture Value
EALT Expected Annual Loss - Total
EALS Expected Annual Loss Score
EALR Expected Annual Loss Rating
ALRB Expected Annual Loss Rate - Building
ALRP Expected Annual Loss Rate - Population
ALRA Expected Annual Loss Rate - Agriculture
ALR_NPCTL Expected Annual Loss Rate - National Percentile
RISKV Hazard Type Risk Index Value
RISKS Hazard Type Risk Index Score
RISKR Hazard Type Risk Index Rating

“Population Equivalence” is the monetary value of the exposed population, using a “value of statistical life (VSL)” approach.

Each fatality or ten injuries is treated as $11.6 million of economic loss.

Code
library(tidyverse)
library(sf)
library(leaflet)
library(gt)

googlecrs <- "EPSG:4326"

path <- "/home/ajackson/Dropbox/Rprojects/ERD/Data/FEMA_National_Risk_Index_Data/"
Datapath <- "/home/ajackson/Dropbox/Rprojects/Curated_Data_Files/"

df <- read_csv(paste0(path, "NRI_Table_CensusTracts/NRI_Table_CensusTracts.csv"))

foo <- names(df)[41:466] #    only look at disaster categories

# are all disasters the same? No.
str_sub(foo, 1, 4) %>% as_tibble() %>% count(value) %>% 
  gt() %>% 
  tab_header(
    title=md("**Count of Fields for each Disaster**")
  ) %>% 
  fmt_number(
    columns=n,
    sep_mark=',',
    decimals=0
  ) %>% 
  cols_label(
    value = "Disaster Code",
    n = "Number of Fields"
  )  
Count of Fields for each Disaster
Disaster Code Number of Fields
AVLN 22
CFLD 22
CWAV 26
DRGT 16
ERQK 22
HAIL 26
HRCN 26
HWAV 26
ISTM 22
LNDS 22
LTNG 22
RFLD 26
SWND 26
TRND 26
TSUN 22
VLCN 22
WFIR 26
WNTW 26
Code
#   Build a table to decide how to read column

Cspec <- str_extract(foo, "_.*") %>% 
  as_tibble() %>% 
  count(value)

foo2 <- foo %>% as_tibble() %>% 
  mutate(Spec=if_else(str_detect(value, "R$"), "c", "d"))

Spec <- paste0(foo2$Spec, collapse="")

Col_spec=paste0(str_dup("c",11), "ddddddcddcddddddddddddcddcddd",
                Spec, "c")

#   Try again


df <- read_csv(paste0(path, "NRI_Table_CensusTracts/NRI_Table_CensusTracts.csv"),
               col_types = Col_spec)

Let’s check for understanding and consistency

Hmmm… there are tracts where the number of people expected to be affected by the disaster is larger than the population of the tract.

Code
foo <- df %>% 
  select(TRACTFIPS, POPULATION, CFLD_EXPP) %>% 
  mutate(Pop_diff=(POPULATION - CFLD_EXPP)/POPULATION) %>% 
  filter(Pop_diff>0) %>% 
  filter(Pop_diff<1) %>% 
  mutate(Pct_effected=CFLD_EXPP*100/POPULATION)
  ggstatsplot::gghistostats(data=foo,
                            x=Pct_effected,
                            xlab  = "Percent of Population affected",
                            title = "Percent of the Population in a Tract Affected by Flooding",
                            subtitle = "After elimination of unrealistic values")

Code
df %>% 
  select(TRACTFIPS, POPULATION, CFLD_EXPP) %>% 
  mutate(Pop_diff=POPULATION - CFLD_EXPP) %>% 
  filter(Pop_diff<0) %>% 
  ggplot(aes(x=Pop_diff)) +
  geom_histogram() +
  labs(title="Population Minus Affected Population, Tracts that Flood",
       subtitle="For values less than zero only (bad data values?)",
       x="Population - Affected Population")

Attach the polygons and make a map

To highlight the more interesting areas, I will map only the tracts denoted by a risk rating of “very high”.

I’ll also restrict the data to tracts with a population of greater than 100, and with a percent of the population at flood risk greater than 25%.

Not surprisingly, most of the flood risk falls near the coast.

Code
df$RISK_RATNG %>% as_tibble() %>% count(value) %>% 
  gt() %>% 
  tab_header(
    title=md("**Count of Tracts for each Risk Rating**")
  ) %>% 
  fmt_number(
    columns=n,
    sep_mark=',',
    decimals=0
  ) %>% 
  cols_label(
    value = "Risk Rating",
    n = "Number of Tracts"
  )
Count of Tracts for each Risk Rating
Risk Rating Number of Tracts
Insufficient Data 1,062
No Rating 19
Relatively High 10,006
Relatively Low 29,197
Relatively Moderate 19,559
Very High 2,396
Very Low 22,915
Code
df_sub <- df %>% 
  filter(RISK_RATNG=="Very High") %>% 
  select(STATEABBRV, TRACTFIPS, POPULATION, RISK_VALUE, ends_with("AFREQ"), 
         ends_with("EXPP"))

#   Attach tract polygons

Census <- readRDS(paste0(Datapath, "Census_Tracts_2023/All_ACS_2023.rds"))

df_sub <- df_sub %>% 
  filter(POPULATION>100) %>% 
  mutate(GEOID=TRACTFIPS) %>% 
  mutate(Pop_Exposed=as.integer(pmin(100*pmax(
    CFLD_EXPP, RFLD_EXPP, na.rm=TRUE)/POPULATION,100))) %>% 
  filter(Pop_Exposed>25) %>% 
  mutate(CFLD_AFREQ=signif(CFLD_AFREQ, 3),
         CFLD_EXPP=as.integer(CFLD_EXPP),
         RFLD_AFREQ=signif(RFLD_AFREQ, 3),
         RFLD_EXPP=as.integer(RFLD_EXPP)) %>% 
  inner_join(., Census, by="GEOID") %>% 
  sf::st_as_sf()

#   make map

pal <- colorNumeric("YlOrBr",
                        c(min(df_sub$Pop_Exposed, na.rm=TRUE),
                          max(df_sub$Pop_Exposed, na.rm=TRUE)),
                        na.color = "transparent")
 
leaflet(data=df_sub) %>% addTiles() %>% 
  addPolygons(data=df_sub,
              weight=2, 
              color="black",
              fillOpacity = 0.4,
              fillColor = pal(df_sub$Pop_Exposed),
        popup = paste(
        "<div class='leaflet-popup-scrolled' style='max-width:150px;max-height:200px'>",
        "Coast Fld Ann Freq:", df_sub$CFLD_AFREQ, "<br>",
        "Coast Fld Exp Pop:", df_sub$CFLD_EXPP, "<br>",
        "River Fld Ann Freq:", df_sub$RFLD_AFREQ, "<br>",
        "River Fld Exp Pop:", df_sub$RFLD_EXPP, "<br>",
        "Total Pop:", df_sub$POPULATION, "<br>",
        "Max % Pop Exp:", df_sub$Pop_Exposed, 
        "</div>"
      )) %>% 
    addLegend("bottomright", pal = pal, values = ~df_sub$Pop_Exposed,
    title = "Pct Pop Exposed",
    opacity = 1
  )

Look at Flooding

We’ll look at Coastal Flooding and Riverine Flooding

Many more tracts subject to river flooding, but coastal flooding has more potential to affect more people.

With river flooding, as seen in the last pair of plots, there are a lot of cases where either a small number of people were affected, or they were affected infrequently. It does appear to be the case that when a coastal flood occurs, it is likely to affect a lot more people than a river flooding. Given that there are a lot fewer coastal floods events than river flood events, it is interesting that the product of people affected times number of annual floods is quite similar between the two flood types.

Code
facet_labels <- c("Coastal Flooding", "Riverine Flooding")
names(facet_labels) <- c("CFLD_AFREQ", "RFLD_AFREQ")


df %>% 
  select(TRACTFIPS, ends_with("AFREQ")) %>%  
         # ends_with("EXPP")) %>% 
  pivot_longer(!TRACTFIPS, names_to="Variable", values_to="Values") %>% 
  filter(str_detect(Variable, "FLD")) %>% 
  filter(!is.na(Values)) %>% 
  filter(Values>0) %>% 
  mutate(Variable=as.factor(Variable)) %>% 
  ggplot(aes(x=Values)) +
  geom_histogram(binwidth=1) +
  facet_wrap(~Variable, labeller=labeller(Variable = facet_labels)) +
  labs(title="Annual Frequency of Flood Events By Tract",
       x="Number of Floods per Year",
       y="Number of Tracts")

Code
#   Coastal Flooding vs # People affected
df %>% 
  filter(!is.na(CFLD_AFREQ)) %>% 
  filter(CFLD_AFREQ>0) %>% 
  ggplot(aes(x=CFLD_AFREQ, y=CFLD_EXPP)) +
  geom_point() +
  labs(title="Coastal Flooding Annual Frequency vs. Number of People Affected",
    y="Expected Number of People Affected",
    x="Annual Flood Frequency"
  )

Code
#   Riverine Flooding vs # People affected
df %>% 
 ggplot(aes(x=RFLD_AFREQ, y=RFLD_EXPP)) +
  geom_point() +
  labs(title="Riverine Flooding Annual Frequency vs. Number of People Affected",
    y="Expected Number of People Affected",
    x="Annual Flood Frequency"
  )

Code
#   Expected number of people affected annually (# people times # floods)

facet_labels <- c("Coastal Flooding", "Riverine Flooding")
names(facet_labels) <- c("CFLD", "RFLD")

df %>% 
  select(TRACTFIPS, CFLD_AFREQ, CFLD_EXPP, RFLD_AFREQ, RFLD_EXPP) %>% 
  filter(CFLD_AFREQ+RFLD_AFREQ>0.5) %>% 
  pivot_longer(!TRACTFIPS, names_to=c("Flood", ".value"), 
                           names_pattern="(.*FLD).*(AFREQ|EXPP)",
                           values_drop_na = TRUE,
                           values_to="Values") %>% 
  mutate(Annual_effect=AFREQ*EXPP) %>% 
  filter(Annual_effect>100) %>% 
  ggplot(aes(x=Annual_effect)) +
  geom_histogram(binwidth = 2000) +
  scale_y_log10() +
  # facet_wrap(~Flood) +
  facet_wrap(~Flood, labeller=labeller(Flood = facet_labels)) +
  labs(title="People times number of Floods per Year by Tract",
       x="People times Floods per Year",
       y="Number of Tracts")

Summary

This dataset has 85,154 records, representing census tracts. For each tract there are 467 variables covering 18 different disasters, with mostly economic loss numbers, but also estimates of the annual frequency and the number of people expected to be affected by that disaster.

The numbers are (I think) based on the historical records, with statistical predictions. For the Population Exposure, the numbers contained within the software package Hazus 6.0 were used. A small number of these estimates are larger than the population within the tract. I have queried the FEMA support desk, and after a week have received no reply. I assume that the people doing support have probably been DOGEd.