Optimize processing time of country calibration
The calibration of population and structural values per country is computationally expensive. Currently, the function AbstractExposure.get_country_sum_asset_values is called first. It queries the sum of the population and the sum of the structural value of all assets of all entities in tiles of a given ISO code. This query is very expensive because of its two INNER JOINs and may fail for very large countries due to memory problems. The resulting sums are used to compute the calibration factors. Then, the function AbstractExposure.get_country_entities is called to query all entity IDs in tiles of a given ISO code. These entities are then passed to multiprocessing workers that calibrate the assets.
I suggest rearranging the process as follows:
- Sum the population and structural value during the regular tile processing
  - After the buildings and the residual of a tile have been processed, a function can retrieve the sum of both values for all entities of the processed tile
  - Sum these over all tiles
- Calibrate all assets
  - Retrieve all Quadkeys/tiles
  - Start a multiprocessing run to calibrate all assets of all entities of a tile (each worker processes one tile)
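The proposed flow could look roughly like the following sketch. It is not the existing implementation: the in-memory TILES dictionary stands in for the database, all function names (process_tile, sum_over_tiles, compute_factors, calibrate_tile, calibrate_country) are hypothetical, and the calibration factor is assumed to be the ratio of a country-level target to the summed value, which the description above does not specify.

```python
from dataclasses import dataclass
from multiprocessing import Pool

# Hypothetical stand-in for the database: Quadkey -> assets of all entities in the tile.
TILES = {
    "120210": [
        {"population": 10.0, "structural_value": 100.0},
        {"population": 30.0, "structural_value": 300.0},
    ],
    "120211": [
        {"population": 60.0, "structural_value": 600.0},
    ],
}


@dataclass
class TileSums:
    population: float = 0.0
    structural_value: float = 0.0


def process_tile(quadkey):
    """Stand-in for the regular tile processing; returns the sums of one tile."""
    assets = TILES[quadkey]
    return TileSums(
        population=sum(a["population"] for a in assets),
        structural_value=sum(a["structural_value"] for a in assets),
    )


def sum_over_tiles(quadkeys):
    """Step 1: accumulate the per-tile sums instead of running one country-wide query."""
    totals = TileSums()
    for quadkey in quadkeys:
        tile = process_tile(quadkey)
        totals.population += tile.population
        totals.structural_value += tile.structural_value
    return totals


def compute_factors(totals, target_population, target_value):
    """Assumed definition: factor = country-level target / summed value."""
    return (
        target_population / totals.population,
        target_value / totals.structural_value,
    )


def calibrate_tile(job):
    """Step 2 worker: scale all assets of one tile with the country factors."""
    quadkey, population_factor, value_factor = job
    calibrated = [
        {
            "population": a["population"] * population_factor,
            "structural_value": a["structural_value"] * value_factor,
        }
        for a in TILES[quadkey]
    ]
    return quadkey, calibrated


def calibrate_country(quadkeys, target_population, target_value, processes=2):
    """Run both steps; each worker processes one tile at a time."""
    totals = sum_over_tiles(quadkeys)
    factors = compute_factors(totals, target_population, target_value)
    jobs = [(quadkey, *factors) for quadkey in quadkeys]
    with Pool(processes) as pool:
        return dict(pool.map(calibrate_tile, jobs))


if __name__ == "__main__":
    result = calibrate_country(list(TILES), target_population=200.0, target_value=500.0)
    for quadkey, assets in result.items():
        print(quadkey, assets)
```

In this arrangement the only country-wide state is two running totals, so no single query has to load or join all assets of a country, and the work unit handed to each worker is a tile rather than a list of entity IDs.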
Edited by Laurens Oostwegel