The data sets developed to extend the applicability of the model to PR and the USVI are grouped into eight categories: (1) building inventory, (2) wind-related building characteristics, (3) wind-related building characteristics mapping schemes, (4) surface roughness, (5) tree inventory data, (6) deterministic wind field footprints, (7) probabilistic hazard event footprints, and (8) building fragility and vulnerability functions. The development of the seventh and eighth data sets is described in a companion paper by Vickery et al. (2023). The remainder of this paper focuses on the development of the first six data sets and the implementation, validation, and calibration of Hazus.
Building Inventory
In Hazus, buildings in the general building stock (GBS) are classified using one of seven general occupancies: residential, commercial, industrial, agriculture, religious, government, and education. These general occupancies are further refined into 33 specific occupancies (FEMA 2021a). Residential structures, for example, have six specific occupancy types, ranging from SF homes to nursing homes. The GBS is also classified by type of construction. The five general building types (GBTs) used in each of the Hazus hazard models are wood, masonry, concrete, steel, and manufactured housing.
Building counts, floor areas, and replacement values were aggregated for each census block and tract in PR and the USVI using building footprints derived from lidar data for PR (FEMA 2019, 2021b) and parcel data for the USVI (USVI GIS Division Office 2018). For this project, the PR footprint data were reprocessed to remove overlapping polygons and slivers as documented in FEMA (2021b), and the building counts, areas, and values were updated using the existing methods documented in FEMA (2019).
Wind Building Characteristics Mapping Schemes
Mapping schemes allow Hazus users to efficiently model regional construction practices, building code histories, and local mitigation efforts without having to gather detailed information on each individual building in their study region. In the hurricane model, the building inventory is characterized using a three-tier hierarchy of mapping schemes: (1) GBT, (2) specific building type (SBT), and (3) WBC. GBT and SBT distributions are defined for each specific occupancy, whereas WBC distributions are defined separately for each SBT. We refer to each permitted combination of SBT and WBCs as a wind building type (WBT).
The five GBTs (wood, masonry, concrete, steel, and manufactured housing) are common to each of the four Hazus hazard models. For hurricane wind loss estimation, there are 39 SBTs and 5,996 WBTs. The 39 SBTs are unchanged from previous versions of the hurricane wind model (FEMA 2018a), but the number of WBTs represents an increase of 1,040 (21%) over previous versions of the model. The new WBTs are associated with the four SF SBTs: wood-framed single story (WSF1), wood-framed multistory (WSF2), masonry single story (MSF1), and masonry multistory (MSF2). For each WBT, the model requires a set of nine fragility and vulnerability functions for each of the five reference terrains. These damage and loss functions are functions of peak gust wind speed in open terrain. The development of the fragility and vulnerability functions for the 1,040 new WBTs is described in a companion paper by Vickery et al. (2023).
For each SBT, a set of valid WBCs was defined. For example, hip roof shape is included as a WBC for the SF SBTs, but it is not included for SBTs that typically have flat roofs, such as masonry strip malls or concrete high-rise buildings. For each valid SBT-WBC pairing, the percentage of the SBT population having that WBC must be specified. The resulting percentages for all valid SBT-WBC pairings comprise a single WBC mapping scheme.
A partial example of a WBC mapping scheme is provided in Table 1. In this partial example, there are 32 possible WBTs (2 roof shapes × 2 SWR possibilities × 4 roof deck attachments × 2 roof–wall connections), each with its own set of fragility and vulnerability functions and a weighting determined by the appropriate WBC percentages. For simplicity, the relative frequency of each WBC is assumed to be independent of the other WBCs.
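The weighting under the independence assumption can be illustrated with a minimal Python sketch. The characteristic names, option labels, and percentages below are hypothetical placeholders chosen only to reproduce the 2 × 2 × 4 × 2 structure of the partial example; they are not the values in Table 1.

```python
from itertools import product

# Hypothetical WBC percentages for one SBT (illustrative values only).
wbc_scheme = {
    "roof_shape": {"hip": 30.0, "gable": 70.0},
    "swr": {"yes": 10.0, "no": 90.0},
    "roof_deck_attachment": {"A": 25.0, "B": 25.0, "C": 25.0, "D": 25.0},
    "roof_wall_connection": {"toe_nail": 60.0, "strap": 40.0},
}


def wbt_weights(scheme):
    """Return {WBT label tuple: weight}, treating the WBCs as independent."""
    # Each characteristic's percentages must cover the whole SBT population.
    for name, options in scheme.items():
        total = sum(options.values())
        if abs(total - 100.0) > 1e-9:
            raise ValueError(f"{name} percentages sum to {total}, not 100")
    weights = {}
    # Enumerate every combination of one option per characteristic.
    for combo in product(*(options.items() for options in scheme.values())):
        labels = tuple(label for label, _ in combo)
        weight = 1.0
        for _, pct in combo:
            weight *= pct / 100.0  # independence: multiply the marginals
        weights[labels] = weight
    return weights


weights = wbt_weights(wbc_scheme)
print(len(weights))           # number of WBTs in the partial example
print(sum(weights.values()))  # weights should sum to approximately 1
```

Because each weight is the product of the marginal frequencies, any correlation between characteristics (for example, between roof shape and roof deck attachment) is ignored by construction.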
The existing Hazus southeast coastal SBT and WBC mapping schemes were selected as the starting points for new PR and USVI mapping schemes. However, as noted previously, several new WBCs were added to the SF SBTs for the island territories. In addition, several of the existing southeast coastal SBT and WBC percentages were modified. The methods used to estimate the relative frequencies of each WBC are summarized in Table 2 and discussed subsequently. The data sources and methods used include (1) tax assessor data, (2) visual inspection (either manual or machine learning), (3) local building code history and construction practices, and (4) engineering judgment. The estimates for PR and the USVI were developed by separate teams using separate data sources and, in some cases, different assumptions, as discussed in the following.
The project team utilized machine learning to predict metal versus concrete roof cover classifications, wood-frame versus masonry building classifications, and single- versus multistory building classifications. The methodology involved training boosted regression tree models (BRTMs) on field-verified data using aerial imagery of individual rooftops and the geometries and lidar-derived heights of individual building footprints. The training data were derived from FEMA substantial damage estimation (SDE) inspections following Hurricane Maria, which involved 8,794 residential structures with information on roof cover type, construction type, and number of stories. These structures served as ground-truth values for the training.
The model inputs fell into three groups: (1) image statistics, namely the mean, standard deviation, and deciles of the pixel values in each rooftop image, computed for each of the red, green, blue, and infrared channels and, after color-space conversion, for each of the hue, saturation, and value channels; (2) height statistics, namely the deciles, 5th percentile, and 95th percentile of the lidar-derived heights sampled across each building footprint on a 0.5-m grid; and (3) geometric features of the building footprints, such as the area, perimeter, and length of the minimum bounding rectangle. Separate models were developed for roof cover type, construction type, and number of stories. For the construction type and number of stories models, random undersampling of the majority class was used to balance the class distributions. Approximately 20% of the remaining sample data was held out of training for use in testing. The machine learning methodology and results are described in more detail in FEMA (2021b).
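The data preparation steps described above (per-channel summary statistics, random undersampling of the majority class, and a 20% holdout) can be sketched with the Python standard library. This is an illustrative outline under stated assumptions, not the project team's pipeline: the function names are hypothetical, the data are synthetic, the population standard deviation is an arbitrary choice, and the boosted regression tree fitting itself is omitted.

```python
import random
import statistics


def channel_features(values):
    """Mean, standard deviation, and nine decile cut points for one channel."""
    deciles = statistics.quantiles(values, n=10)  # 9 cut points
    return [statistics.fmean(values), statistics.pstdev(values), *deciles]


def undersample(rows, rng):
    """Randomly thin every class down to the size of the smallest class."""
    by_label = {}
    for features, label in rows:
        by_label.setdefault(label, []).append((features, label))
    n_min = min(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, n_min))
    return balanced


def holdout_split(rows, test_fraction, rng):
    """Shuffle a copy of rows and reserve test_fraction of them for testing."""
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


rng = random.Random(0)


def fake_record(label):
    """Synthetic stand-in for one rooftop: 50 random pixel values, one channel."""
    return (channel_features([rng.random() for _ in range(50)]), label)


# Imbalanced synthetic sample: 90 single-story and 30 multistory records.
rows = [fake_record("single") for _ in range(90)]
rows += [fake_record("multi") for _ in range(30)]

balanced = undersample(rows, rng)
train, held_out = holdout_split(balanced, 0.2, rng)
print(len(balanced), len(train), len(held_out))
```

Undersampling is applied before the split here, matching the statement that the holdout was drawn from the remaining sample data.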
On the test data, the best-performing roof cover model achieved approximately 86% accuracy and 91% mean average precision; the best-performing superstructure model achieved 82% accuracy and 89% mean average precision; and the best-performing number of stories model achieved 86% accuracy and 87% mean average precision. Every building footprint on the island was then run through the final models to estimate the relative frequencies of roof cover type, construction type, and number of stories.
The total numbers of buildings in PR and the USVI were estimated to be 1,406,245 and 45,154, respectively. Table 3 lists the estimated numbers of buildings to which each of the sampled WBCs is applicable. For example, roof shape is applicable to 10 of the 39 SBTs, which comprise 1,215,941 of the structures in PR and 34,319 of the structures in the USVI. For each of the sampled WBCs, random samples of buildings were drawn to estimate their relative frequencies. The target sample sizes provided in Table 3 are the sample sizes required to achieve 95% confidence that the sampled frequencies will be within a specified margin of the actual frequencies. In most cases, we were able to meet or exceed the target sample sizes; however, some of the actual sample sizes for PR were less than the target sample sizes due to data availability constraints and resource limitations.
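One common way to obtain such a target sample size is Cochran's formula for a proportion with a finite population correction. The sketch below assumes that formula, a worst-case proportion of 0.5, and a placeholder margin of 0.05. These are illustrative assumptions and are not confirmed as the method or the margin behind Table 3.

```python
import math
from statistics import NormalDist


def target_sample_size(population, margin, confidence=0.95, p=0.5):
    """Sample size for estimating a proportion to within +/- margin.

    Uses Cochran's formula with a finite population correction; p = 0.5
    is the most conservative (largest sample) choice.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # two-sided critical value
    n0 = z * z * p * (1.0 - p) / (margin * margin)    # infinite-population size
    n = n0 / (1.0 + (n0 - 1.0) / population)          # finite population correction
    return math.ceil(n)


# Building counts to which roof shape applies; the 0.05 margin is a placeholder.
n_pr = target_sample_size(1_215_941, margin=0.05)
n_usvi = target_sample_size(34_319, margin=0.05)
print(n_pr, n_usvi)
```

For populations this large the finite population correction changes the result only slightly, so the target sample size is governed almost entirely by the confidence level and the margin.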
Surface Roughness
Aerodynamic surface roughness, created by vegetation, buildings, and other obstructions, reduces mean wind speeds and increases wind turbulence near the surface of the earth. Buildings located in rougher terrain experience different wind pressures and windborne debris environments than buildings located in smoother, more open terrain. To account for these differences, the fragility and vulnerability functions used in the model are a function of both the peak gust wind speed in open terrain and the local surface roughness (FEMA 2021a).
In Hazus, each census tract and census block is assigned a characteristic roughness length, denoted by $z_0$. The roughness lengths developed for PR and the USVI were modeled following the same approach used for the continental United States (CONUS) (FEMA 2021a), but with two improvements: the first concerning the use of tree canopy percentage and the second concerning the averaging of roughness lengths for individual census blocks or tracts.
Tree canopy percentage is incorporated into the computation of surface roughness for two land use/land cover (LULC) categories: developed, open space and developed, low intensity (MRLC 2003, 2019). For CONUS, the average tree canopy percentage in each of these two LULC categories in each county was used to adjust their roughness length estimates. However, for PR and the USVI, the methodology has been refined to apply the tree canopy adjustments for these two LULCs on a pixel-by-pixel basis.
For census blocks larger than a threshold area, the average $z_0$ values for PR and the USVI were computed using all of the pixels within the block plus an additional buffer of 500 m beyond the block boundary. In blocks smaller than the threshold area, the average $z_0$ values were computed based on a circular area with a radius of 500 m centered on the census block centroid to ensure a minimum fetch of approximately 500 m. The same rules were used for computing census tract averages.
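The two averaging rules can be sketched as follows. For brevity the sketch treats each block as an axis-aligned rectangle, represents pixels as (x, y, roughness) tuples in metres and millimetres, takes an arithmetic mean, and leaves the area threshold as a caller-supplied parameter; all of these are simplifying assumptions rather than details taken from the Hazus implementation.

```python
import math

BUFFER_M = 500.0  # buffer distance and circle radius stated in the text


def block_mean_roughness(pixels, block, area_threshold_m2):
    """Average pixel roughness for one block.

    pixels: iterable of (x, y, roughness) with coordinates in metres.
    block: (xmin, ymin, xmax, ymax) rectangle, a stand-in for a real polygon.
    area_threshold_m2: hypothetical parameter separating large and small blocks.
    """
    xmin, ymin, xmax, ymax = block
    area = (xmax - xmin) * (ymax - ymin)
    if area > area_threshold_m2:
        # Large block: keep pixels within 500 m of the block outline or inside it.
        def inside(x, y):
            dx = max(xmin - x, 0.0, x - xmax)
            dy = max(ymin - y, 0.0, y - ymax)
            return math.hypot(dx, dy) <= BUFFER_M
    else:
        # Small block: keep pixels within 500 m of the block centroid.
        cx, cy = (xmin + xmax) / 2.0, (ymin + ymax) / 2.0

        def inside(x, y):
            return math.hypot(x - cx, y - cy) <= BUFFER_M

    values = [z for x, y, z in pixels if inside(x, y)]
    if not values:
        raise ValueError("no pixels fall inside the averaging region")
    return sum(values) / len(values)


pixels = [(0.0, 0.0, 100.0), (400.0, 0.0, 300.0), (600.0, 0.0, 900.0)]
small = block_mean_roughness(pixels, (-10.0, -10.0, 10.0, 10.0), 1.0e6)
large = block_mean_roughness(pixels, (-1000.0, -1000.0, 1000.0, 1000.0), 1.0e6)
print(small, large)
```

In the small-block case the pixel 600 m from the centroid is excluded, which is how the circular region enforces the minimum fetch of approximately 500 m.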
For PR, the most recently available LULC (MRLC 2003) and tree canopy percentage (MRLC 2019) raster layers from the National Land Cover Database (NLCD) were used to determine the surface roughness ($z_0$). The PR LULC classes are listed in Table 4. In areas where tree canopy coverage could not be determined due to obstructions in satellite images, such as shadows or clouds, the municipio (county-equivalent) average tree canopy value for the appropriate LULC was used.
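The fallback for obscured pixels amounts to a grouped mean imputation, sketched below. The tuple layout, municipio names, and LULC labels are hypothetical and serve only to illustrate the rule.

```python
from collections import defaultdict


def fill_missing_canopy(pixels):
    """Replace missing canopy values with the municipio average for that LULC.

    pixels: list of (municipio, lulc, canopy_percent or None) tuples.
    Pixels whose group has no valid values at all are left as None.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    # First pass: accumulate valid canopy values per (municipio, LULC) group.
    for municipio, lulc, canopy in pixels:
        if canopy is not None:
            totals[(municipio, lulc)] += canopy
            counts[(municipio, lulc)] += 1
    # Second pass: substitute the group mean where the value is missing.
    filled = []
    for municipio, lulc, canopy in pixels:
        key = (municipio, lulc)
        if canopy is None and counts[key] > 0:
            canopy = totals[key] / counts[key]
        filled.append((municipio, lulc, canopy))
    return filled


sample = [
    ("municipio_a", "developed_open_space", 20.0),
    ("municipio_a", "developed_open_space", 40.0),
    ("municipio_a", "developed_open_space", None),  # obscured by cloud or shadow
    ("municipio_a", "developed_low_intensity", 10.0),
]
filled = fill_missing_canopy(sample)
print(filled[2])
```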
The NLCD does not include a LULC layer for the USVI; therefore, a locally developed layer with a different classification scheme was used (G. Guannel, unpublished data, 2020). The mapping of the USVI LULC categories to those of PR and CONUS is summarized in Table 4.
The NLCD 2016 tree canopy cover data layer for PR (MRLC 2019) also includes the USVI and was used to adjust the roughness estimates for USVI developed, open spaces and developed, low-intensity spaces (FEMA 2021b). Both the LULC and tree canopy cover data are at 30-m resolution, but the two layers are not perfectly aligned, so the nearest tree canopy pixel was used to compute the roughness adjustments for the developed, open space and developed, low-intensity pixels.
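A nearest-pixel lookup between two misaligned 30-m grids can be sketched as below. The raster is represented as a nested list, the origin is taken to be the centre of the first canopy cell, and rows are assumed to increase with y; these conventions are assumptions made for illustration and do not describe the actual NLCD file layout.

```python
def nearest_canopy_value(x, y, canopy, origin_x, origin_y, cell=30.0):
    """Return the canopy value of the cell whose centre is nearest to (x, y).

    canopy[row][col] holds canopy percentages; (origin_x, origin_y) is the
    centre of canopy[0][0]. Points outside the grid snap to the nearest edge.
    """
    col = round((x - origin_x) / cell)
    row = round((y - origin_y) / cell)
    # Clamp to the grid so slightly out-of-range LULC pixels still get a value.
    row = min(max(row, 0), len(canopy) - 1)
    col = min(max(col, 0), len(canopy[0]) - 1)
    return canopy[row][col]


canopy = [
    [10.0, 20.0],
    [30.0, 40.0],
]
# LULC pixel centres offset from the canopy grid by a fraction of a cell.
print(nearest_canopy_value(12.0, 3.0, canopy, 0.0, 0.0))
print(nearest_canopy_value(17.0, 28.0, canopy, 0.0, 0.0))
```

Because both layers share the same cell size, the offset between the grids is at most half a cell in each direction, so the nearest canopy pixel always overlaps the LULC pixel it is matched to.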
For the USVI, a uniform grid of blocks is used by the Hazus Tsunami Model in place of actual census blocks. For consistency across the hazard models, the same grid of blocks is used by the hurricane model. Unfortunately, LULC data were not available for 32 blocks that intersect several of the smaller islands, as denoted by the filled circles in Fig. 2. Based on overhead imagery, it was observed that these islands were all heavily treed. Therefore, the block surface roughness values ($z_0$) were computed as a weighted average over the original, unbuffered block area and the buffered block area, with the unbuffered block area assigned a $z_0$ of 900 mm and the buffered area assigned a $z_0$ of 3 mm. There was also one entire census tract on St. Thomas (78030082000) for which no LULC data were available. For this tract, the average of the visually derived block values within the tract was used.
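The weighted average for the island blocks can be sketched as follows, assuming an area-weighted arithmetic mean in which the buffered area is the ring outside the original block. The text does not state the weighting explicitly, so both the weighting and the example areas are assumptions.

```python
def island_block_roughness(block_area, buffer_area,
                           block_roughness_mm=900.0, buffer_roughness_mm=3.0):
    """Area-weighted mean roughness length in millimetres.

    block_area: area of the original, unbuffered block (heavily treed).
    buffer_area: area of the surrounding buffer ring, excluding the block.
    Both areas must use the same units; the 900 mm and 3 mm defaults are the
    values given in the text.
    """
    total = block_area + buffer_area
    if total <= 0:
        raise ValueError("total area must be positive")
    return (block_area * block_roughness_mm
            + buffer_area * buffer_roughness_mm) / total


# Hypothetical example: a block and a buffer ring of equal area.
print(island_block_roughness(1.0, 1.0))
```

Under this assumption the block value approaches 3 mm as the buffer ring grows relative to the block, and approaches 900 mm as the ring shrinks.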