Feature: ensure full geographic coverage of input aggregated exposure models
This feature addresses the problem that there can be geographical areas within an exposure entity for which the exposure is not defined in the input aggregated exposure model. When this occurs and an OBM building is located within the uncovered geographical area, no building classes can be assigned to it because it does not belong to any data unit. This is particularly relevant for the cases in which the industrial exposure is defined in 30-arcsec cells that do not cover the whole exposure entity (as in most countries in ESRM20).
This feature compares the geographical extent of exposure entities against all the data units for which the input aggregated exposure model is defined (for each occupancy case) and decides whether a new data unit is needed to "fill in" the gaps. A relevant challenge lies in the fact that the input boundaries may not be perfect and consecutive data units may have unintended gaps between them, or the boundaries of the exposure entity may have a different resolution from those of the data units. These situations were analysed in detail to define the algorithm for this feature.
Taking these and other considerations into account, a data unit is created if:
- the number of data units for which the aggregated exposure model is defined differs from the number of data units in the geodata file that contains the corresponding boundaries, OR
- the difference in surface area between the whole exposure entity and the sum of the areas of all data units for which the aggregated exposure model is defined is larger than the threshold set in the configuration file (`data_units_surface_threshold`, default: 1.0%), OR
- `force_creation_data_units` in the configuration file is True (i.e. the user tells the program that neither the number of data units nor the difference in surface area matters, and that it should create a data unit to fill in the gaps irrespective of those checks).
If the number of data units for which the aggregated exposure model is defined is the same as the number of data units in the geodata file that contains the corresponding boundaries, and the difference in surface area lies between `-data_units_surface_threshold` and `data_units_surface_threshold` (e.g. between -1% and 1% with the default values), the program does nothing. However, if the two numbers of data units are the same and the difference in surface area is smaller than `-data_units_surface_threshold` (e.g. smaller than -1% with the default values), a warning is logged, as this means that the sum of the surface areas of the defined data units is larger than the surface area of the exposure entity. This can happen, for example, for countries with overseas territories, because whether these territories are included in the geodata file and/or in the model depends on several factors.
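The rules above can be summarised in a short Python sketch. It is only an illustration of the decision logic as described here, not the code of the feature itself: the names `Action`, `relative_area_difference` and `decide_action` are hypothetical, and expressing the difference as a percentage of the surface area of the exposure entity is an assumption. Only `data_units_surface_threshold` and `force_creation_data_units` come from the configuration file described above.

```python
from enum import Enum


class Action(Enum):
    """Possible outcomes of the coverage check (hypothetical names)."""
    CREATE_DATA_UNIT = "create"
    LOG_WARNING = "warn"
    DO_NOTHING = "nothing"


def relative_area_difference(entity_area, data_unit_areas):
    """Area of the exposure entity minus the summed data-unit areas, in percent.

    Assumption: the difference is expressed relative to the entity area.
    A negative value means the data units add up to more than the entity.
    """
    return 100.0 * (entity_area - sum(data_unit_areas)) / entity_area


def decide_action(
    n_units_in_model,
    n_units_in_geodata,
    area_difference,
    data_units_surface_threshold=1.0,
    force_creation_data_units=False,
):
    """Apply the three creation conditions, then the warning condition."""
    if (
        force_creation_data_units
        or n_units_in_model != n_units_in_geodata
        or area_difference > data_units_surface_threshold
    ):
        return Action.CREATE_DATA_UNIT
    # From here on the counts match and the difference is not above the threshold.
    if area_difference < -data_units_surface_threshold:
        return Action.LOG_WARNING
    return Action.DO_NOTHING


# Example: data units cover 95% of the entity, so a filling data unit is needed.
difference = relative_area_difference(100.0, [40.0, 55.0])
action = decide_action(2, 2, difference)
```

With the default threshold, a difference of 0.5% would lead to no action, and a difference of -3.0% with matching numbers of data units would only trigger the warning.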
If a data unit is created, the building classes allocated to it (and their proportions) correspond to those of the exposure entity as a whole. The values of `total_cost_per_building` are calculated as the averages of all values in the exposure entity, weighted by the number of buildings associated with each value. The number of buildings, dwellings, people and costs assigned to this new data unit is zero. The need to calculate `total_cost_per_building` as weighted averages led to a refactoring of the method