Commit a6aef4c8 authored by Cecilia Nievas

Reduced number of parameters used to define building classes

parent cb1bcdc9
@@ -397,20 +397,20 @@ def determine_unique_combinations_of_taxonomies_and_values_per_building(in_case,
     For example, for residential exposure a "unique" class will be defined by:
     taxonomy///settlement_type/occupancy_type/dwell_per_bdg/area_per_dwelling_sqm
-    This combination of parameters is referred to in brief as taxonomy*. The columns used for taxonomy* are not the same
+    This combination of parameters is referred to in brief as taxonomy*. The columns used for taxonomy* are not necessarily the same
     for Res, Com and Ind (see function get_list_for_grouping).
     This function will then define if there are repeated instances of taxonomy* within with_duplicates_df.
     For example, if with_duplicates_df is:
         taxonomy                     structural  night
-        RC/H:2///RURAL//2.00/125.00  2000.5      3.51
-        RC/H:2///RURAL//2.00/125.00  2000.5      3.51
-        RC/H:2///RURAL//1.00/150.00  1050.2      3.51
+        RC/H:2///RURAL//             2000.5      3.51
+        RC/H:2///RURAL//             2000.5      3.51
+        RC/H:2///RURAL//             1050.2      3.51
     The result will be:
         there_are_duplicates= True
-        uniq_vals= array['RC/H:2///RURAL//2.00/125.00', 'RC/H:2///RURAL//1.00/150.00']
+        uniq_vals= array['RC/H:2///RURAL//', 'RC/H:2///RURAL//']
         uniq_inverse= array[0,0,1] (i.e. the positions of with_duplicates_df that correspond to each unique element in uniq_vals)
         groupby_crit= ['taxonomy', 'settlement_type', 'occupancy_type', 'dwell_per_bdg', 'area_per_dwelling_sqm']
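The duplicate-detection behaviour this docstring describes can be sketched with `numpy.unique(return_inverse=True)`. This is an illustration, not the repository's code: the key-building step and the reduction of the grouping columns to a single `taxonomy` column are assumptions made for brevity.

```python
import numpy as np
import pandas as pd

# Illustrative sketch (not the repository's code) of the duplicate detection
# described above: build a taxonomy* key per row from the grouping columns,
# then map each row to its unique key with numpy.unique.
with_duplicates_df = pd.DataFrame(
    {
        "taxonomy": ["RC/H:2///RURAL//"] * 3,
        "structural": [2000.5, 2000.5, 1050.2],
        "night": [3.51, 3.51, 3.51],
    }
)
# Post-commit grouping columns would be e.g. ['taxonomy', 'settlement_type',
# 'occupancy_type']; a single column is used here to keep the sketch short.
groupby_crit = ["taxonomy"]

keys = with_duplicates_df[groupby_crit].astype(str).agg("/".join, axis=1).to_numpy()
uniq_vals, uniq_inverse = np.unique(keys, return_inverse=True)
there_are_duplicates = uniq_vals.shape[0] < keys.shape[0]
```

With the reduced parameter set all three example rows share one taxonomy* key, so `uniq_vals` has a single element and every row maps to it.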
@@ -456,17 +456,17 @@ def get_list_for_grouping(occupancy_case):
     """
     The columns enumerated in the out_list are those that will be used to identify the building class.
     The decision regarding which columns to include here was based on the analysis carried out
-    using SERA_which_countries_have_duplicate_classes_include_settlement.py (March 2020).
+    in March 2020, May/June 2020 and November 2021.
     """
     if occupancy_case=='Res':
-        out_list= ['taxonomy', 'settlement_type', 'occupancy_type', 'dwell_per_bdg', 'area_per_dwelling_sqm']
-        round_by= ['', '', '', get_rounding_type('dwell_per_bdg'), get_rounding_type('area_per_dwelling_sqm')]  # method for rounding floats
+        out_list= ['taxonomy', 'settlement_type', 'occupancy_type']
+        round_by= ['', '', '']  # method for rounding floats
     elif occupancy_case=='Com':
-        out_list= ['taxonomy', 'settlement_type', 'occupancy_type', 'area_per_dwelling_sqm']
-        round_by= ['', '', '', get_rounding_type('area_per_dwelling_sqm')]  # method for rounding floats
+        out_list= ['taxonomy', 'settlement_type', 'occupancy_type']
+        round_by= ['', '', '']  # method for rounding floats
     elif occupancy_case=='Ind':
-        out_list= ['taxonomy', 'settlement_type', 'occupancy_type', 'cost_per_area_usd']
-        round_by= ['', '', '', get_rounding_type('cost_per_area_usd')]  # method for rounding floats
+        out_list= ['taxonomy', 'settlement_type', 'occupancy_type']
+        round_by= ['', '', '']  # method for rounding floats
     else:
         out_list= []
         round_by= []
@@ -620,7 +620,7 @@ def group_same_taxonomies(in_df, in_array_unique_combis, in_position_of_unique,
     in_array_unique_combis= numpy array of unique combinations of taxonomy* (i.e. taxonomy considering all parameters enumerated in get_list_for_grouping)
     in_position_of_unique= its length is the same as the number of rows in in_df; for each row, it indicates the corresponding element of in_array_unique_combis
     in_country_adm_ids= an array of strings with length equal to the number of rows in in_df. It indicates the country_admin_ID each row in in_df is coming from.
     out_df= Pandas DataFrame in which a certain combination of taxonomy* only exists once (the different rows of in_df for the same combination of taxonomy*
             have been grouped together). Note that the number of buildings (i.e. "number" or "buildings" column), number of dwellings and number of people
             ("occupants_per_asset") are the only values that are added; all other values (e.g. "dwell_per_bdg") are values per building or dwelling and are
......
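The grouping that this docstring describes can be sketched with a pandas `groupby`. The column names are taken from the docstring; the aggregation mapping (sum the additive quantities, keep per-building values as-is) and the example figures are assumptions, not the repository's code.

```python
import pandas as pd

# Hypothetical sketch of the grouping described above (not the repository's
# code): additive quantities ("number", "occupants_per_asset") are summed per
# unique taxonomy*, while per-building values ("dwell_per_bdg") are kept as-is.
in_df = pd.DataFrame(
    {
        "taxonomy": ["RC/H:2///RURAL//", "RC/H:2///RURAL//", "MUR///URBAN//"],
        "number": [10.0, 5.0, 2.0],
        "occupants_per_asset": [35.0, 17.5, 4.0],
        "dwell_per_bdg": [2.0, 2.0, 1.0],
    }
)
out_df = in_df.groupby("taxonomy", as_index=False).agg(
    {"number": "sum", "occupants_per_asset": "sum", "dwell_per_bdg": "first"}
)
```

Each taxonomy* combination then appears exactly once in `out_df`, with building and occupant counts accumulated across the grouped rows.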