# Configuration

User-configurable parameters need to be provided in a file named `config.yml`, located in the working directory. The file [config_example.yml](../config_example.yml) in this repository can be used as a starting point.

## General parameters

- `model_name` (optional): Name of the input aggregated exposure model (only relevant for the user). E.g. "ESRM20".
- `exposure_format` (optional): Format of the input aggregated exposure model. Currently supported values: esrm20.
- `data_pathname` (required): Path to directory that contains the model data. For example, if the input aggregated exposure model is ESRM20, and the [ESRM20 repository](https://gitlab.seismo.ethz.ch/efehr/esrm20_exposure) has been cloned to the path `/home/username/`, then `data_pathname=/home/username/esrm20_exposure`.
- `boundaries_pathname` (required): Path to directory that contains the boundary geodata files.
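
As an illustration, the general parameters might look as follows in `config.yml`. This is only a sketch: the `data_pathname` value is taken from the example above, while the `boundaries_pathname` value is a hypothetical placeholder.

```yaml
model_name: ESRM20
exposure_format: esrm20
data_pathname: /home/username/esrm20_exposure
boundaries_pathname: /home/username/boundaries  # hypothetical path, adjust to your system
```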

## Parameters that control what cases are run

An input aggregated exposure model may cover different [exposure entities](02_Organisation_Geographic_Space.md#exposure-entity) and different [occupancy cases](02_Organisation_Geographic_Space.md#occupancy-cases). The following parameters control which of them are run when calling `gde-importer`:

- `occupancies_to_run` (required): List of occupancies for which the code will be run, separated by ", " (comma and space). They need to exist for the indicated `exposure_format`. Currently supported values: residential, commercial, industrial.
- `exposure_entities_to_run` (required): List of names of exposure entities for which the code will be run. Currently supported options:
  - "all": The list of names will be retrieved from the metadata of the input aggregated exposure model.
  - A comma-space-separated list of entity names: This list of names will be used.
  - A full path to a .txt or .csv file: The list of names will be retrieved from the indicated .txt/.csv file.
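
A possible way of writing these two parameters is sketched below. The entity names and the file path in the commented-out alternatives are placeholders chosen for illustration, not values required by `gde-importer`.

```yaml
# Run two of the supported occupancies (comma-space-separated)
occupancies_to_run: residential, commercial

# Option 1: retrieve the entity names from the metadata of the input model
exposure_entities_to_run: all

# Option 2 (placeholder names): explicit comma-space-separated list
# exposure_entities_to_run: Greece, Italy

# Option 3 (hypothetical path): read the names from a .txt or .csv file
# exposure_entities_to_run: /home/username/entities_to_run.txt
```

Only one of the three options for `exposure_entities_to_run` should be active at a time, since a YAML key should not be repeated within the same mapping.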

## Parameters needed to access databases

- `database_built_up` (required): Credentials for the [database](https://git.gfz-potsdam.de/dynamicexposure/openbuildingmap/database-obmtiles#obm_built_area_assessments-completeness-assessments-information) where the built-up areas per quadtile are stored. The `sourceid` of the built-up areas needs to be indicated as a nested parameter. The `gde-importer` assumes that this database contains a table named `obm_built_area_assessments`. In order to connect to the database, some or all of the following parameters may be required:
  - host: name of the host
  - dbname: name of the database
  - port: port number
  - username: user name
  - password: password associated with `username`
- `database_gde_tiles` (required): Credentials for the [database](https://git.gfz-potsdam.de/dynamicexposure/globaldynamicexposure/database-gdetiles) where information on the GDE tiles is stored. The `gde-importer` assumes that this database contains the tables indicated in the link. In order to connect to the database, some or all of the following parameters may be required:
  - host: name of the host
  - dbname: name of the database
  - port: port number
  - username: user name
  - password: password associated with `username`
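
The two database blocks might be written as in the sketch below. It assumes that the credentials are nested under each top-level key, in the same way as the `exposure_entities_code` example in this document, and that `sourceid` sits alongside the credentials of `database_built_up`. All values, including host names, database names, port numbers and the `sourceid`, are placeholders.

```yaml
database_built_up:
  host: localhost          # placeholder
  dbname: obm_tiles        # placeholder database name
  port: 5432               # placeholder port number
  username: some_user      # placeholder
  password: some_password  # placeholder
  sourceid: 1              # placeholder ID of the built-up areas source
database_gde_tiles:
  host: localhost          # placeholder
  dbname: gde_tiles        # placeholder database name
  port: 5432               # placeholder port number
  username: some_user      # placeholder
  password: some_password  # placeholder
```

Because `config.yml` holds passwords in plain text under this layout, it is advisable to keep it out of version control.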
  
## Parameters associated with ensuring full geographic coverage

Input aggregated exposure models may not have exposure defined for the complete territory of an exposure entity. As explained [here](06_Ensuring_Full_Geographic_Coverage.md), the `gde-importer` contains a special routine to ensure that the whole territory is covered by a potential distribution of building classes. The following parameters control this routine:

- `data_units_surface_threshold` (required): Threshold on the percentage difference (float between 0.0 and 100.0) in geographic area that determines whether data units need to be created to fill an exposure entity when the data units defined in the aggregated exposure model do not fully cover its geographic extent. Default (a reasonable value based on observation of existing models): 1.0%.
- `force_creation_data_units` (optional): True or False. If True, create data units to fill an exposure entity irrespective of other conditions that are automatically verified by the code (such as `data_units_surface_threshold`, for example). If this parameter is not provided, the `gde-importer` takes it as False.
- `data_units_min_admisible_area` (required): Minimum surface area (in m2) of data units created to fill an exposure entity. It needs to be smaller than `data_units_max_admisible_area`. Its purpose is to avoid the creation of data units that are only artefacts of the resolution of the input boundaries and the accuracy of the geometric calculations carried out. Suggested value: 0.1 (m2).
- `data_units_max_admisible_area` (required): Maximum surface area (in m2) of data units created to fill an exposure entity. If the area to cover is larger than `data_units_max_admisible_area`, it is successively subdivided until every part complies with this requirement. It needs to be larger than `data_units_min_admisible_area`. Suggested value: 3e9 (m2).
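
Using the default and suggested values listed above, this group of parameters could be written as in the following sketch. The maximum area is written out in full instead of as `3e9`, on the assumption that not every YAML parser reads exponent notation without a decimal point as a number.

```yaml
data_units_surface_threshold: 1.0             # percentage, between 0.0 and 100.0
force_creation_data_units: False              # optional, taken as False if omitted
data_units_min_admisible_area: 0.1            # m2
data_units_max_admisible_area: 3000000000.0   # m2, i.e. 3e9
```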

## Other parameters

- `domain_boundary_filepath` (optional): Path to the geodata file (including the name and extension of the file itself) that contains the boundaries within which the input aggregated model is defined. If provided, `gde-importer` verifies that the boundaries of the exposure entities lie inside it and cuts out any areas that fall outside. This is relevant for cases in which the geodata files associated with an exposure entity include overseas territories that are located, e.g., in other continents and are thus not covered by the input aggregated exposure model.
- `number_cores` (required): Number of cores (integer) used for parallelising the creation and storage of data-unit tiles. If larger than 1, individual data units are sent to different cores to be processed in parallel.
- `exposure_entities_code` (required): This parameter controls the creation of the 3-character code that the `gde-importer` uses to identify [exposure entities](02_Organisation_Geographic_Space.md#exposure-entity). The 3-character code is also prepended to the IDs of data units (e.g. a data unit with ID "38271" in Greece is stored as "GRC_38271"). If the exposure entities of the input aggregated exposure model are countries, it is recommended to set this parameter to "ISO3", in which case the `gde-importer` will retrieve the corresponding alpha-3 [ISO 3166 country code](https://www.iso.org/iso-3166-country-codes.html), using the [iso3166 library](https://github.com/deactivated/python-iso3166). Alternatively, a nested structure with exposure entity names and 3-character codes can be provided. For example:

```
exposure_entities_code:
  Europe: EUE
  North_America: NNN
  South_America: SRR
  ...
```
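
For a model whose exposure entities are countries, the parameters of this section might instead be combined as in the sketch below. The file path, its extension and the number of cores are hypothetical values chosen for illustration.

```yaml
domain_boundary_filepath: /home/username/boundaries/domain.gpkg  # hypothetical file
number_cores: 4                # placeholder, values above 1 enable parallel processing
exposure_entities_code: ISO3   # use alpha-3 ISO 3166 country codes
```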