In the SERA exposure model, the `taxonomy` field contains the string that defines the building class as per the GEM Taxonomy. The (already outdated) preliminary version of the SERA exposure model on which this code was developed required a series of parameters (apart from `taxonomy`) to define a building class unequivocally, so that each distinct class had only one value of the parameters dwellings/building, area/dwelling, people/dwelling and cost/area. This led to the concept of `taxonomy*`, i.e., an extended value of `taxonomy`, including other fields. In the present code, `taxonomy*` is defined in the following way for each occupancy case:
In the SERA exposure model, the `taxonomy` field contains the string that defines the building class as per the GEM Taxonomy. However, the same string can be associated with different values of dwellings/building, area/dwelling, people/dwelling and cost/area even within the same administrative unit, because these values depend on some additional parameters. This led to the concept of `taxonomy*`, i.e., an extended value of `taxonomy`, including other fields. In the present code, `taxonomy*` is defined in the following way for all three occupancy cases (residential, commercial and industrial):
The triple slash makes it possible to recover `taxonomy` from `taxonomy*` as the first element of a split on `///`, i.e., `taxonomy*`.split('///')\[0\]. The HDF5 files generated from the process of distributing the SERA model to the 10-arcsec grid (with `SERA_distributing_exposure_to_cells.py`) store `taxonomy*`.
`taxonomy///settlement_type/occupancy_type`
This definition of `taxonomy*` will most likely change in the future.
The triple slash makes it possible to recover `taxonomy` from `taxonomy*` as the first element of a split on `///`, i.e., `taxonomy*`.split('///')\[0\]. The HDF5 files generated from the process of distributing the SERA model to the quadtiles (with `SERA_distributing_exposure_to_cells.py`) store `taxonomy*`.
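The composition and splitting of `taxonomy*` can be illustrated with a minimal sketch. The function names and the example values (`"CR/LFINF+CDN/H:2"`, `"Urban"`, `"Res"`) are hypothetical and chosen only for illustration; they are not taken from the code or from the SERA files. The sketch also assumes that `settlement_type` never contains a slash.

```python
def build_taxonomy_star(taxonomy, settlement_type, occupancy_type):
    """Join the three fields as taxonomy///settlement_type/occupancy_type."""
    return f"{taxonomy}///{settlement_type}/{occupancy_type}"


def split_taxonomy_star(taxonomy_star):
    """Recover (taxonomy, settlement_type, occupancy_type) from taxonomy*.

    Assumes settlement_type contains no slash (illustrative assumption).
    """
    # Everything before the first triple slash is the plain taxonomy string.
    taxonomy, _, rest = taxonomy_star.partition("///")
    # The remainder is settlement_type/occupancy_type.
    settlement_type, _, occupancy_type = rest.partition("/")
    return taxonomy, settlement_type, occupancy_type


# Hypothetical example values, for illustration only.
example = build_taxonomy_star("CR/LFINF+CDN/H:2", "Urban", "Res")
plain_taxonomy = example.split("///")[0]
```

Because the GEM Taxonomy string itself uses single slashes as separators, a single slash could not serve as the delimiter between `taxonomy` and the extra fields; the triple slash keeps the split unambiguous in this sketch.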
Not every country and occupancy case contains the same columns in the SERA full CSV files. The code adds missing columns so as to be able to treat all countries and cases in a homogeneous way. For example, if the `settlement_type` column does not exist, it is added with empty strings. The `taxonomy*` in this case will be something like `taxonomy////occupancy_type…` (note that four slashes are present: the three that go after taxonomy and the one that goes after the empty settlement type). For commercial and industrial exposure, the `dwellings` column does not exist, but it can be inferred from the total costs and the intermediate dwelling-dependent variables that 1 building = 1 dwelling in these occupancy cases. Therefore, the `dwellings` column is added with values equal to the `buildings` column.
This definition of `taxonomy*` may not work for Italy and Portugal. The code will be revised for these two countries in the future.
Not every country and occupancy case contains the same columns in the SERA full CSV files. The code adds missing columns so as to be able to treat all countries and cases in a homogeneous way. For example, if the `settlement_type` column does not exist, it is added with empty strings. The `taxonomy*` in this case will be something like `taxonomy////occupancy_type` (note that four slashes are present, the three that go after taxonomy and the one that goes after the empty settlement type). For commercial and industrial exposure, the `dwellings` column does not exist, but it is added with values equal to the `buildings` column.
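The column homogenisation described above could look roughly as follows. This is only a sketch under stated assumptions, not the actual implementation: rows are represented as plain dictionaries (as `csv.DictReader` would yield them), and the function name `homogenise_rows`, the occupancy case labels `"commercial"` and `"industrial"`, the output key `taxonomy_star` and the example values are all hypothetical.

```python
def homogenise_rows(rows, occupancy_case):
    """Add missing columns and build taxonomy* for each row (illustrative sketch)."""
    homogenised = []
    for row in rows:
        row = dict(row)  # work on a copy so the input rows are left untouched
        # A missing settlement_type column is filled with empty strings.
        row.setdefault("settlement_type", "")
        # Commercial and industrial files have no dwellings column:
        # it is set equal to the buildings column.
        if occupancy_case in ("commercial", "industrial"):
            row.setdefault("dwellings", row["buildings"])
        row["taxonomy_star"] = (
            f'{row["taxonomy"]}///{row["settlement_type"]}/{row["occupancy_type"]}'
        )
        homogenised.append(row)
    return homogenised


# Hypothetical commercial row lacking settlement_type and dwellings.
sample_rows = [{"taxonomy": "CR/LFINF+CDN/H:2", "occupancy_type": "Com", "buildings": 12.5}]
result = homogenise_rows(sample_rows, "commercial")
```

In this example the resulting `taxonomy_star` contains four consecutive slashes, because the empty settlement type sits between the triple slash and the single slash.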
In the current version of the SERA exposure model (v0.9, retrieved on 22 November 2021 from [here](https://gitlab.seismo.ethz.ch/efehr/esrm20_exposure)), the `settlement_type` and `occupancy_type` columns contain the following possible values: