*(Unfortunately we are not allowed to redistribute the data and you'll need a password to access the sources. [Read here](https://git.gfz-potsdam.de/dynamicexposure/datasources/-/tree/master/ESRM20_boundaries) for more information about the data being used.)*
3. Place the downloaded data into paths and directories of your preference.
4. If you wish to run `gde-importer` for the industrial exposure of a country for which the geographical units used in ESRM20 are 30-arcsec cells, please read the [special preliminary steps for 30-arcsec industrial cells](#special-preliminary-steps-for-30-arcsec-industrial-cells) below.
### Configuration
#### Quickstart:
Copy the file `config_example.yml` to your working directory as `config.yml` and provide the necessary parameters:
...
...
The following configuration options are available in the `config.yml`:
- `data_units_surface_threshold`: Percentage difference (float between 0.0 and 100.0) of geographic areas to define the need to create data units to fill an exposure entity, if the data units defined in an aggregated exposure model do not fully cover the geographic extents of the exposure entity. Default: 1.0%.
- `force_creation_data_units`: Create data units to fill an exposure entity irrespective of other conditions (e.g. irrespective of `data_units_surface_threshold`). Default: False.
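The interplay between these two options can be pictured with a minimal Python sketch. It is not taken from the `gde-importer` source: the function name, its arguments, the formula for the percentage difference and the use of a strict comparison are all assumptions made for illustration.

```python
def needs_filling_data_units(entity_area, covered_area, threshold=1.0, force=False):
    """Decide whether data units should be created to fill an exposure entity.

    Hypothetical helper: `threshold` plays the role of
    `data_units_surface_threshold` (percent, 0.0 to 100.0) and `force` the role
    of `force_creation_data_units`.
    """
    if force:
        # Forcing the creation overrides any other condition.
        return True
    if entity_area <= 0:
        raise ValueError("entity_area must be positive")
    # Assumed definition: area not covered by the existing data units,
    # relative to the area of the whole exposure entity, in percent.
    difference_percent = 100.0 * abs(entity_area - covered_area) / entity_area
    return difference_percent > threshold
```

Under these assumptions, an exposure entity of area 100 whose data units cover 99.5 would not be filled with the default threshold of 1.0, while one whose data units cover only 95 would.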
### Special preliminary steps for 30-arcsec industrial cells
The following countries have their industrial exposure models defined in terms of 30-arcsec cells in the ESRM20 model (names and underscores as per naming in ESRM20): Albania, Austria, Belgium, Bosnia_and_Herzegovina, Bulgaria, Croatia, Cyprus, Czechia, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Italy, Latvia, Lithuania, Luxembourg, Malta, Moldova, Montenegro, Netherlands, North_Macedonia, Norway, Poland, Portugal, Romania, Serbia, Slovakia, Slovenia, Spain, Sweden, Switzerland, United_Kingdom. This list might change. This information can be retrieved from the [ESRM20 exposure model repository](https://gitlab.seismo.ethz.ch/efehr/esrm20_exposure), under the path `esrm20_exposure/sources/European_Exposure_Model_Data_Inputs_Sources.xlsx`.
The ESRM20 model only provides the centroids of the 30-arcsec cells, with a certain decimal precision. Some pre-processing of the input files is thus needed to create the geometry of the cells around these points and export these cells to a geodata file, adjusting the geometries so as to avoid overlaps and gaps between neighbouring cells. It is planned that, in the future, `gde-importer` will be able to handle this peculiarity by itself (without the need for pre-processing). However, the current version of `gde-importer` requires that the `SERA_creating_industrial_cells.py` script of the [GDE prototype code](https://git.gfz-potsdam.de/dynamicexposure/legacy/gde_calculations_prototype) be run first. The steps to follow (after steps 1 through 3 in [Obtain data for the European Seismic Risk Model 2020 (ESRM20)](#obtain-data-for-the-european-seismic-risk-model-2020-esrm20)) are:
1. Create a configuration file following [these instructions](https://git.gfz-potsdam.de/dynamicexposure/legacy/gde_calculations_prototype/-/blob/master/docs/03_Config_File.md) and [this template](https://git.gfz-potsdam.de/dynamicexposure/legacy/gde_calculations_prototype/-/blob/master/GDE_config_file_TEMPLATE.ini). You can name it as you wish, e.g. `GDE_config_industrial_preprocessing.py`. Only two sections are required:
- "File Paths": indicate the location of the ESRM20 files (`sera_models_path`), the path (`sera_boundaries_path`) and type (`boundaries_type`) of the boundaries, and the output path (`out_path`).
- "SERA_creating_industrial_cells": keep all parameters as in the [GDE_config_file_TEMPLATE.ini](https://git.gfz-potsdam.de/dynamicexposure/legacy/gde_calculations_prototype/-/blob/master/GDE_config_file_TEMPLATE.ini) file, except for `countries`, which can be all of the ones listed above or a reduced list of your choice.
2. Make sure the output path indicated in the configuration file (`out_path`) exists, and that it contains a subfolder called `Ind_Cells`. The output of the code will be written to this path.
3. Open a terminal in an environment in which `python3` is available.
4. Type `python3 SERA_creating_industrial_cells.py GDE_config_industrial_preprocessing.py` to start running the script.
5. Once the script has run, the output can be found in `out_path/Ind_Cells` and consists of:
- Geodata files with names `Adm99_Country.shp`, which contain the geometries of the created cells.
- CSV files with names of the kind `Exposure_Model_Country_Ind.csv`. The contents of these CSV files are the same as those in the original ESRM20 CSV files for these countries, plus two columns named `ID_99` and `NAME_99`, which contain the IDs and names of the created cells' geometries.
- `log_consistency_checks.csv`, which indicates if any problem was found. If all cells are False, then no problem was encountered.
- `log_processing_times.csv`, which indicates the time it took to process each country, in seconds.
6. Replace the original ESRM20 `Exposure_Model_Country_Ind.csv` files with the ones in `out_path/Ind_Cells` (in the folder of your choice defined as per the instructions above, where all other ESRM20 CSV files are located).
7. Place the geodata files with names of the kind `Adm99_Country.shp` in the same directory where the rest of the ESRM20-compatible boundaries are located.
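The file handling in steps 2 and 5 to 7 above can be scripted. The following is a minimal sketch using only the Python standard library. The function names and the destination arguments (`esrm20_csv_dir`, `boundaries_dir`) are hypothetical. It also assumes that `log_consistency_checks.csv` stores its flags as the text `True`/`False`, which has not been verified against the script's actual output.

```python
import csv
import shutil
from pathlib import Path


def prepare_output_dir(out_path):
    """Create `out_path/Ind_Cells` if it does not exist yet (step 2)."""
    ind_cells = Path(out_path) / "Ind_Cells"
    ind_cells.mkdir(parents=True, exist_ok=True)
    return ind_cells


def flagged_cells(log_file):
    """Return the (row, column) positions of all cells equal to True (step 5).

    Assumption: flags are written as the text True/False. An empty result
    means that no problem was reported.
    """
    flagged = []
    with open(log_file, newline="") as f:
        for i, row in enumerate(csv.reader(f)):
            for j, cell in enumerate(row):
                if cell.strip().lower() == "true":
                    flagged.append((i, j))
    return flagged


def copy_outputs(ind_cells, esrm20_csv_dir, boundaries_dir):
    """Copy the generated files to their final locations (steps 6 and 7)."""
    # Overwrites the original ESRM20 industrial CSV files of the same name.
    for csv_file in Path(ind_cells).glob("Exposure_Model_*_Ind.csv"):
        shutil.copy2(csv_file, Path(esrm20_csv_dir) / csv_file.name)
    # A shapefile is made up of several files (.shp, .shx, .dbf, ...),
    # so every file sharing the Adm99_ prefix is copied.
    for geo_file in Path(ind_cells).glob("Adm99_*.*"):
        shutil.copy2(geo_file, Path(boundaries_dir) / geo_file.name)
```

A possible sequence is to call `prepare_output_dir(out_path)` before running the script and, once it has finished, to call `copy_outputs(...)` only if `flagged_cells(...)` returns an empty list for `log_consistency_checks.csv`.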
Details on the algorithms of `SERA_creating_industrial_cells.py` can be found [here](https://git.gfz-potsdam.de/dynamicexposure/legacy/gde_calculations_prototype/-/blob/master/docs/08_Industrial_Cells.md).
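To give an intuition of why the pre-processing is needed, the sketch below shows one way of deriving a cell geometry from a centroid of limited decimal precision. It is not the algorithm implemented in `SERA_creating_industrial_cells.py` (refer to the documentation linked above for that). It assumes that the cells are aligned to a regular global grid with origin at longitude 0, latitude 0, and the function name is hypothetical.

```python
import math

CELLS_PER_DEGREE = 120  # 30 arcsec = 1/120 of a degree


def cell_bounds(lon, lat):
    """Return (west, south, east, north) of the 30-arcsec cell containing a point.

    Assumption: cells are aligned to a global grid with origin at (0, 0).
    """
    # Integer grid indices of the cell. A centroid lies half a cell away from
    # the cell edges, so small rounding errors in its coordinates do not
    # change the result of floor().
    col = math.floor(lon * CELLS_PER_DEGREE)
    row = math.floor(lat * CELLS_PER_DEGREE)
    # Edges are computed from the integer indices, so neighbouring cells get
    # exactly the same coordinate for a shared edge (no gaps, no overlaps).
    west = col / CELLS_PER_DEGREE
    south = row / CELLS_PER_DEGREE
    east = (col + 1) / CELLS_PER_DEGREE
    north = (row + 1) / CELLS_PER_DEGREE
    return west, south, east, north
```

Computing the edges directly as centroid plus or minus half a cell width would instead propagate the rounding of each centroid into its own edges, which is what can produce slivers and overlaps between neighbouring cells.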