perf: cache_array_subset not caching the complete band
Pull request here: https://github.com/GFZ/arosics/pull/40
I am seeing some odd caching issues: cache_array_subset does not seem to work as intended.
These are equivalent in terms of speed (they're fast, all array loads are cached):
logger.info("Loading into RAM")
local_reference = GeoArray(local_reference)
_ = local_reference[:, :, r_b4match]
local_target = GeoArray(local_target)
_ = local_target[:, :, s_b4match]
COREG_LOCAL(im_ref=local_reference, im_tgt=local_target, ...)
logger.info("Loading into RAM")
local_reference = GeoArray(local_reference)
local_reference.to_mem()
local_target = GeoArray(local_target)
local_target.to_mem()
COREG_LOCAL(im_ref=local_reference, im_tgt=local_target, ...)
But this is slow:
logger.info("Loading into RAM")
local_reference = GeoArray(local_reference)
local_reference.cache_array_subset([r_b4match])
local_target = GeoArray(local_target)
local_target.cache_array_subset([s_b4match])
COREG_LOCAL(im_ref=local_reference, im_tgt=local_target, ...)
And so is this, without any pre-caching:
COREG_LOCAL(im_ref=local_reference, im_tgt=local_target, ...)
Looking at the source code, it seems that self.shift.cache_array_subset([self.COREG_obj.shift.band4match]), as used here, would only cache the nth element, not the nth band.
I'm seeing a big speedup when replacing those two lines with these, which should also populate self._arr_cache:
_ = self.ref[:, :, [self.COREG_obj.ref.band4match]]
_ = self.shift[:, :, [self.COREG_obj.shift.band4match]]
Curiously enough, this does not work:
_ = self.ref[:, :, self.COREG_obj.ref.band4match]
_ = self.shift[:, :, self.COREG_obj.shift.band4match]
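One possible factor in the difference between the two indexing forms above is how the index type changes the shape of the result. The sketch below uses a plain NumPy array, not a GeoArray, so it only shows the general NumPy behaviour; whether GeoArray's cache handling actually depends on this is an assumption I have not verified.

```python
import numpy as np

# Stand-in for a (rows, cols, bands) image; the sizes are arbitrary.
arr = np.zeros((4, 5, 3))
band = 1  # hypothetical band index, playing the role of band4match

# An integer index drops the band axis and yields a 2D array.
as_int = arr[:, :, band]

# A list index keeps the band axis and yields a 3D array with one band.
as_list = arr[:, :, [band]]

print(as_int.shape)
print(as_list.shape)
```

If the cache only serves later reads when the cached subset is still 3D, that would be consistent with the list form helping and the integer form not helping.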
I don't know how to properly pass [:, :, self.COREG_obj.ref.band4match] to cache_array_subset.
I'm not sure about the memory impact if suddenly far more data is cached (and shared among the threads).
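For a rough sense of scale, the uncompressed size of a cached band can be estimated from its dimensions and dtype. This is only a back-of-the-envelope sketch: estimate_bytes is a hypothetical helper, the image size and dtype are illustrative values rather than measurements from my data, and it ignores any overhead on the arosics or GeoArray side.

```python
def estimate_bytes(rows, cols, bands, itemsize):
    """Uncompressed in-memory size of a rows x cols x bands array, in bytes."""
    return rows * cols * bands * itemsize

# Illustrative values only: a 10980 x 10980 image stored as 16-bit integers.
rows, cols, itemsize = 10980, 10980, 2

one_band = estimate_bytes(rows, cols, 1, itemsize)
four_bands = estimate_bytes(rows, cols, 4, itemsize)

print(f"one band:   {one_band / 2**20:.1f} MiB")
print(f"four bands: {four_bands / 2**20:.1f} MiB")
```

Caching a single full band for each of the reference and target images would cost twice the single-band figure, while to_mem() on a multi-band image scales with the number of bands.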