Commit e6d42abb authored by Cecilia Nievas

Added documentation on tools for creating industrial 30-arcsec cells

parent 8fa07c3b
@@ -14,17 +14,18 @@ The scripts are run from the command line as:
The order in which the scripts in the present repository need to be run to produce the GDE model for a region of interest is:
1. If the country/ies of interest have their industrial exposure defined on a 30-arcsec grid, run `SERA_creating_industrial_cells.py`
2. Run `OBM_assign_cell_ids_and_adm_ids_to_footprints.py`
3. Run `SERA_create_HDF5_metadata.py`
4. Run `SERA_mapping_admin_units_to_cells.py`
5. Run `SERA_mapping_admin_units_to_cells_add_GHS.py` (if GHS criterion desired)
6. Run `SERA_mapping_admin_units_to_cells_add_GPW.py` (if GPW criterion desired)
7. Run `SERA_mapping_admin_units_to_cells_add_Sat.py` (if Sat or Sat_mod criterion desired)
8. Run `SERA_distributing_exposure_to_cells.py` with the desired distribution method.
9. If the OpenQuake input files for the SERA model distributed onto a grid are desired (i.e. not GDE, just SERA), run `SERA_create_OQ_input_files.py` with the desired distribution method.
10. If a CSV summarising the number of buildings, dwellings, people and costs by cell according to the SERA model is desired (i.e. not GDE, just SERA), run `SERA_create_visual_output_of_grid_model_full_files.py` with the desired distribution method.
11. Run `OBM_buildings_per_cell.py` with the desired distribution method.
12. Run `GDE_gather_SERA_and_OBM.py` with the desired distribution method. The output is:
    - a series of CSV files that serve as input for damage/risk calculations to be run in OpenQuake (https://github.com/gem/oq-engine);
    - a CSV file that summarises results per cell and contains the geometry of the cells so that it can all be visualised with a GIS;
    - a CSV file that summarises results per administrative unit and contains the geometry of the administrative boundaries so that it can all be visualised with a GIS;
@@ -32,19 +33,19 @@ The order in which the scripts in the present repository need to be run to produ
## Testing Scripts
- The scripts `SERA_testing_rebuilding_exposure_from_cells_alternative_01.py`, `SERA_testing_rebuilding_exposure_from_cells_alternative_02.py` and `SERA_testing_rebuilding_exposure_from_cells_alternative_03.py` can be run after step 8 above. They compare the SERA-on-a-grid model against the original files of the SERA model.
- The script `SERA_testing_compare_visual_output_vs_OQ_input_files.py` can be run after step 10 above to compare the number of buildings, people and cost per cell reported in the OpenQuake input file (generated from the grid) and the visual output CSV.
- The script `SERA_create_outputs_QGIS_for_checking.py` can be run after step 7 above to create a summary of the parameters mapped (GHS, GPW, Sat, etc) in CSV format to be read with QGIS, enabling a visual check of the results.
- The script `SERA_testing_mapping_admin_units_to_cells_qualitycontrol.py` can be run after step 4 above to check the areas of the cells mapped for the administrative units for which step 4 was run.
- The script `GDE_check_consistency.py` can be run after step 12 above. It carries out different consistency checks on the resulting GDE model (see detailed description of this script).
- The script `GDE_check_OQ_input_files.py` can be run after step 12 above. It prints to screen some summary values of the files and checks that the asset ID values are all unique.
- The script `GDE_check_tiles_vs_visual_CSVs.py` can be run after step 12 above. It reads the visual CSV output by cell and the corresponding GDE tiles HDF5 files and compares the number of buildings, cost and number of people in each cell according to each of the two. An output CSV file collects the discrepancies found, if any.
## Other Scripts
......
@@ -3,6 +3,30 @@
For each core script, the enumerated configurable parameters are those that are specific to that script, i.e. defined in the configuration file under a subtitle that matches the name of the file. General parameters are not explained herein but in `03_Config_File.md` and `GDE_config_file_TEMPLATE.ini`.
# SERA_creating_industrial_cells.py
## Configurable parameters:
The parameters that need to be specified under the `SERA_creating_industrial_cells` section of the configuration file are:
- countries = Countries to process. If more than one, separate with comma and space.
- col_lon, col_lat = Names of the columns in the SERA model that contain longitudes and latitudes.
- width_EW, width_NS = Widths (arcseconds) of the cells in which the industrial exposure is defined, in the east-west and north-south directions, respectively.
- id_str = First part of the string used to generate IDs of the individual points (e.g. "IND"). Do not include the country's ISO2 code (it gets added by the script automatically).
- precision_points = Number of decimal places to be used to determine unique points present in the input aggregated exposure model.
- consistency_checks = True or False (run consistency checks or not).
- autoadjust = After a first adjustment of the cells' geometries to fix any overlaps and/or gaps, a check is carried out if consistency_checks is True to determine if there are any potential leftover overlaps and/or gaps. If autoadjust is True, the script will adjust the geometry again until no further overlaps/gaps are found. If False, the script will not carry out this further adjustment.
- verbose = If True, a series of print statements reporting progress are executed while running.
- in_crs = Coordinate reference system of the input and output.
- consistency_tol_dist = Tolerance to assess how large the maximum distance between the original points and the centroids of the generated cells is with respect to the width of the cells. Only needed if consistency_checks is True (e.g. 0.05 means a tolerance of 5% of the width).
- consistency_tol_area = Tolerance to assess how large the variability of the area of the generated cells is. Only needed if consistency_checks is True.
- export_type = File type to export the created cells to ("shp"=Shapefile, "gpkg"=Geopackage).
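
As an illustration of how such a section could look and be read, the following is a minimal sketch using Python's standard `configparser`. Only the section and parameter names come from the list above; all the values (country names, column names, CRS string, tolerances) and the helper `read_industrial_cells_config()` are hypothetical, and the actual script may parse the file differently.

```python
import configparser

# Hypothetical example values; only the parameter names come from the documentation.
EXAMPLE_CONFIG = """
[SERA_creating_industrial_cells]
countries = Greece, Italy
col_lon = LONGITUDE
col_lat = LATITUDE
width_EW = 30
width_NS = 30
id_str = IND
precision_points = 4
consistency_checks = True
autoadjust = True
verbose = False
in_crs = EPSG:4326
consistency_tol_dist = 0.05
consistency_tol_area = 0.05
export_type = shp
"""


def read_industrial_cells_config(text):
    """Parse the example section into a dictionary of typed values (illustrative helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["SERA_creating_industrial_cells"]
    return {
        # "If more than one, separate with comma and space."
        "countries": section["countries"].split(", "),
        "col_lon": section["col_lon"],
        "col_lat": section["col_lat"],
        "width_EW": section.getfloat("width_EW"),
        "width_NS": section.getfloat("width_NS"),
        "id_str": section["id_str"],
        "precision_points": section.getint("precision_points"),
        "consistency_checks": section.getboolean("consistency_checks"),
        "autoadjust": section.getboolean("autoadjust"),
        "verbose": section.getboolean("verbose"),
        "in_crs": section["in_crs"],
        "consistency_tol_dist": section.getfloat("consistency_tol_dist"),
        "consistency_tol_area": section.getfloat("consistency_tol_area"),
        "export_type": section["export_type"],
    }


params = read_industrial_cells_config(EXAMPLE_CONFIG)
```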
## What the code does:
This code is used to handle the fact that industrial exposure may be defined in 30-arcsec cells instead of administrative units in the SERA exposure model. The SERA model only provides the centroids of those cells with a certain decimal precision. This code makes use of the tools defined in `GDE_TOOLS_create_industrial_cells.py`, which create cells around the points given as input and export these cells to a geodata file, adjusting the geometries so as to avoid overlaps and gaps between neighbouring cells. The country/ies to process is/are defined in the configuration file. Details on the algorithms used can be found in `08_Industrial_Cells.md.md`.
# OBM_assign_cell_ids_and_adm_ids_to_footprints.py
## Configurable parameters:
@@ -123,7 +147,7 @@ The parameters that need to be specified under the `SERA_distributing_exposure_t
- sera_disaggregation_to_consider = area, gpw_2015_pop, ghs, sat_27f or sat_27f_model. Select the parameter to use to distribute the SERA model to the grid.
- ignore_occupancy_cases = Res, Com, Ind or leave empty. The code reads the occupancy cases from the SERA models and only ignores them if specified here.
- countries = Countries to process. If more than one, separate with comma and space.
- admin_ids_to_ignore = Within those countries, do not process admin units specified under this parameter. This is useful for running parts of countries only; it can be empty or ignored too.
- columns_to_distribute = buildings, dwell_per_bdg, area_per_dwelling_sqm, cost_per_area_usd, ppl_per_dwell
- write_hdf5_bdg_classes_param = if True, write the HDF5 file of building classes parameters; if False, do not (`write_hdf5_bdg_classes parameter` for `gdet_sera.distribute_SERA_to_cells()` function). The building classes parameters depend only on the SERA model, not on the distribution method, so they do not need to be re-written each time. Setting this parameter to False saves a lot of processing time.
@@ -170,7 +194,7 @@ The parameters that need to be specified under the `SERA_create_OQ_input_files`
- sera_disaggregation_to_consider = area, gpw_2015_pop, ghs, sat_27f or sat_27f_model. Select the parameter to use to distribute the SERA model to the grid.
- occupancy_cases = Res, Com, Ind. Occupancy cases to process.
- countries = Countries to process. If more than one, separate with comma and space.
- admin_ids_to_ignore = Within those countries, do not process admin units specified under this parameter. This is useful for running parts of countries only; it can be empty or ignored too.
## What the code does:
......
# SERA Industrial Exposure Defined in 30-arcsec Cells: The Tools in GDE_TOOLS_create_industrial_cells.py
In broad terms, `GDE_TOOLS_create_industrial_cells.py` contains tools to:
- read the points in which the SERA industrial exposure model is defined
<img src="Images/industrial_30arcsec_approach_01.png" width="600">
- generate 30-arcsec cell geometries around them
<img src="Images/industrial_30arcsec_approach_02.png" width="600">
- carry out a first adjustment of the coordinates of the cells based on grouping together all longitudes and then all latitudes that "should be" the same, as per a precision criterion
<img src="Images/industrial_30arcsec_approach_03.png" width="600">
<img src="Images/industrial_30arcsec_approach_04.png" width="600">
<img src="Images/industrial_30arcsec_approach_05.png" width="600">
- check that the resulting adjusted geometries are satisfactory, that is, that no potential overlaps/gaps between cells remain
<img src="Images/industrial_30arcsec_approach_06.png" width="600">
- carry out more computationally-demanding operations to adjust the geometries when overlaps/gaps are found
- intersect the adjusted cell geometries with the country boundaries
- update the SERA industrial exposure input files to indicate the corresponding IDs of the generated cells
- export the geometries of the generated cells
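
The second item in the list above (generating cell geometries around the input points) can be sketched as follows. This is a minimal illustration, assuming that the input points are cell centroids in decimal degrees and representing each cell as a plain `(west, south, east, north)` tuple; the function name `cell_bounds_from_centroid()` is hypothetical and is not taken from `GDE_TOOLS_create_industrial_cells.py`.

```python
def cell_bounds_from_centroid(lon, lat, width_ew_arcsec=30.0, width_ns_arcsec=30.0):
    """Return (west, south, east, north) of a cell centred on (lon, lat).

    Widths are given in arcseconds and converted to decimal degrees
    (1 degree = 3600 arcseconds).
    """
    half_ew = width_ew_arcsec / 3600.0 / 2.0
    half_ns = width_ns_arcsec / 3600.0 / 2.0
    return (lon - half_ew, lat - half_ns, lon + half_ew, lat + half_ns)


# Hypothetical centroid, for illustration only.
west, south, east, north = cell_bounds_from_centroid(23.5, 38.0)
```

Because the centroids are only given with limited decimal precision, cells built this way around neighbouring points do not necessarily share exactly the same edges, which is what the subsequent adjustment steps address.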
The core function of the tools, which is called by `SERA_creating_industrial_cells.py`, is `generate_country_industrial_cells()`. The following figures illustrate the functions that it calls and the process as a whole.
After the input points are retrieved from the SERA input files, unique points are identified according to a certain input precision. The coordinates (longitude, latitude) that "should be" the same are identified by rounding the coordinates according to a certain number of decimal places and identifying unique values at that precision level. This is done using strings and dictionaries that store the original and new coordinates. The cells dataframe is finally updated to reflect the new adjusted coordinates and geometries.
<img src="Images/industrial_30arcsec_approach_07.png" width="600">
<img src="Images/industrial_30arcsec_approach_08.png" width="600">
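
The idea of grouping coordinates that "should be" the same by rounding can be sketched as below. This is only an illustration under assumptions: the function name `snap_coordinates()` is hypothetical, and the choice of the rounded value as the new coordinate is an assumption (the actual tools may select the new coordinate differently). As in the description above, strings are used as keys and a dictionary stores the original and new coordinates.

```python
def snap_coordinates(values, precision):
    """Map each original coordinate to a common value shared by all coordinates
    that are identical once rounded to `precision` decimal places."""
    mapping = {}
    for value in values:
        # String key at the requested precision, e.g. "23.4958"
        key = f"{value:.{precision}f}"
        # Assumption: the rounded value itself becomes the new coordinate
        mapping[value] = float(key)
    return mapping


# Hypothetical western edges of three cells; the first two "should be" the same.
west_edges = [23.49583, 23.495834, 23.50417]
new_west = snap_coordinates(west_edges, precision=4)
```

Applying the same mapping first to all longitudes and then to all latitudes of the cell edges yields neighbouring cells that share identical edge coordinates wherever the rounding groups them together.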
Steps indicated in purple and magenta in the figures that follow are only executed if the input parameters `consistency_checks` and `autoadjust_overlap_gap` are True, respectively. These steps assess whether any overlaps/gaps are left between neighbouring cells and adjust the geometries when this is the case. In the case of gaps, geometries are not adjusted if the cells are diagonal to one another and are also involved in other cases of intersection. Because only one pair of intersecting cells is adjusted at a time, adjusting such cells can lead to contradictory adjustments of the geometries from one step to the next.
Cells are only trimmed as per country boundaries after the consistency checks. Otherwise, the checks on the distance between the resulting cell centroids and the original points, and on the final areas of the cells, would be meaningless, because the trimmed geometry cannot guarantee such consistency with the original input points.
<img src="Images/industrial_30arcsec_approach_09.png" width="600">
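
The two consistency checks mentioned above (distance between original points and cell centroids, and variability of the cell areas) could look roughly like the sketch below. The exact criteria used by the tools are not described here, so both measures are assumptions: the offset is expressed as a fraction of the cell width, and the area variability as (max - min) / mean. Function names and sample data are hypothetical; cells are `(west, south, east, north)` tuples in degrees.

```python
def max_relative_offset(points, cells, width_ew_deg, width_ns_deg):
    """Largest centroid-to-point offset, as a fraction of the cell width."""
    worst = 0.0
    for (lon, lat), (west, south, east, north) in zip(points, cells):
        dx = abs((west + east) / 2.0 - lon) / width_ew_deg
        dy = abs((south + north) / 2.0 - lat) / width_ns_deg
        worst = max(worst, dx, dy)
    return worst


def relative_area_spread(cells):
    """Spread of cell areas (in squared degrees) relative to their mean (assumed measure)."""
    areas = [(east - west) * (north - south) for west, south, east, north in cells]
    mean_area = sum(areas) / len(areas)
    return (max(areas) - min(areas)) / mean_area


width = 30.0 / 3600.0  # 30 arcseconds in degrees
points = [(23.5, 38.0), (23.5 + width, 38.0)]
cells = [
    (23.5 - width / 2, 38.0 - width / 2, 23.5 + width / 2, 38.0 + width / 2),
    # Second cell: eastern edge pushed out by 2% of the width
    (23.5 + width / 2, 38.0 - width / 2, 23.5 + 1.52 * width, 38.0 + width / 2),
]

offset_ok = max_relative_offset(points, cells, width, width) <= 0.05  # consistency_tol_dist
area_ok = relative_area_spread(cells) <= 0.05  # consistency_tol_area
```

Both measures only make sense for untrimmed cells, which is consistent with running the checks before intersecting the cells with the country boundaries.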
Gaps are identified by generating an enlarged version of the cells (i.e. increasing their dimensions), searching for the resulting intersections (so as to know which cells are neighbours of which other cells), and then subtracting the original cell geometries from these intersections. The resulting geometry is analysed to decide whether a gap exists between the cells or not.
<img src="Images/industrial_30arcsec_approach_10.png" width="600">
<img src="Images/industrial_30arcsec_approach_11.png" width="600">
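
The enlarge-intersect-subtract idea can be illustrated with axis-aligned boxes in pure Python. This is a simplified sketch, not the implementation in the tools: cells are `(west, south, east, north)` tuples, all function names are hypothetical, and the leftover area is clipped to the joint bounding box of the two cells, which is a simplification of my own that only makes sense for side-by-side neighbours.

```python
def enlarge(box, margin):
    """Grow a (west, south, east, north) box by `margin` on every side."""
    west, south, east, north = box
    return (west - margin, south - margin, east + margin, north + margin)


def intersection(a, b):
    """Intersection of two boxes, or None if they do not overlap."""
    if a is None or b is None:
        return None
    west, south = max(a[0], b[0]), max(a[1], b[1])
    east, north = min(a[2], b[2]), min(a[3], b[3])
    if west >= east or south >= north:
        return None
    return (west, south, east, north)


def area(box):
    return 0.0 if box is None else (box[2] - box[0]) * (box[3] - box[1])


def leftover_area(a, b, margin):
    """Area of the enlarged-cell intersection not covered by either original cell.

    Returns None if the enlarged cells do not intersect (cells are not neighbours).
    """
    shared = intersection(enlarge(a, margin), enlarge(b, margin))
    if shared is None:
        return None
    # Simplification: ignore the parts of `shared` outside the joint bounding box
    hull = (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))
    shared = intersection(shared, hull)
    in_a = intersection(shared, a)
    in_b = intersection(shared, b)
    in_both = intersection(in_a, in_b)
    return area(shared) - area(in_a) - area(in_b) + area(in_both)


cell = (0.0, 0.0, 1.0, 1.0)
touching = (1.0, 0.0, 2.0, 1.0)   # shares its western edge with `cell`
with_gap = (1.01, 0.0, 2.0, 1.0)  # leaves a strip of width 0.01
far_away = (5.0, 0.0, 6.0, 1.0)

no_gap_area = leftover_area(cell, touching, margin=0.1)
gap_area = leftover_area(cell, with_gap, margin=0.1)
not_neighbours = leftover_area(cell, far_away, margin=0.1)
```

A leftover area of (approximately) zero indicates that the two cells meet exactly, while a positive value indicates a gap. For diagonal neighbours the leftover is non-zero even without a gap, which illustrates why the resulting geometry needs further analysis.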
Which coordinates need to be adjusted is determined by first identifying the relative position of one cell with respect to the other.
<img src="Images/industrial_30arcsec_approach_12.png" width="600">
<img src="Images/industrial_30arcsec_approach_13.png" width="600">
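
Identifying the relative position of one cell with respect to another can be sketched as follows. This is an illustrative assumption, not the logic of the tools: positions are derived from the cell centroids, a displacement counts only if it exceeds half the cell dimension, and the function name and return codes are hypothetical. Cells are `(west, south, east, north)` tuples.

```python
def relative_position(cell, other):
    """Position of `other` relative to `cell`: 'N', 'S', 'E', 'W', a diagonal
    such as 'NE', or 'same' if the centroids (almost) coincide."""
    cell_x, cell_y = (cell[0] + cell[2]) / 2.0, (cell[1] + cell[3]) / 2.0
    other_x, other_y = (other[0] + other[2]) / 2.0, (other[1] + other[3]) / 2.0
    # Thresholds of half the cell size, so small misalignments are ignored
    half_width = (cell[2] - cell[0]) / 2.0
    half_height = (cell[3] - cell[1]) / 2.0

    north_south = ""
    if other_y - cell_y > half_height:
        north_south = "N"
    elif cell_y - other_y > half_height:
        north_south = "S"

    east_west = ""
    if other_x - cell_x > half_width:
        east_west = "E"
    elif cell_x - other_x > half_width:
        east_west = "W"

    return (north_south + east_west) or "same"


cell = (0.0, 0.0, 1.0, 1.0)
east_of_cell = relative_position(cell, (1.0, 0.0, 2.0, 1.0))
diagonal = relative_position(cell, (1.0, 1.0, 2.0, 2.0))
south_of_cell = relative_position(cell, (0.001, -1.0, 1.001, 0.0))
```

With the relative position known, the coordinates to adjust follow naturally: for a neighbour to the east, for instance, the eastern edge of the first cell and the western edge of the second are the candidates.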