speedy_subset
- speedy_subset(sids_left, sids_right, values_left=None)
Fast subsetting of data.
We make use of the multi-level nature of STARE with the following steps:
1. Clamp sids_left to the lower and upper bounds of sids_right.
2. Determine the intersection level as the lower of the two maximum levels, i.e. the smaller of the highest level found in sids_left and the highest level found in sids_right.
3. Coerce the resolution of the left sids to the intersection level.
4. Get the unique sids of the coerced left sids.
5. Perform STARE-based intersects of the unique values and the right sids.
6. Map the intersects back to the original array indices.
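The steps above can be sketched with plain NumPy on a toy key scheme. This is an illustration of the "coarsen, deduplicate, intersect, map back" pattern only, not the library's implementation: the keys are ordinary non-negative integers rather than real STARE sids, the hypothetical `shift` parameter stands in for the intersection level (dropping `shift` low bits plays the role of coercing the resolution), and `numpy.isin` stands in for the STARE-based intersects. The name `subset_sketch` is likewise made up for this example.

```python
import numpy as np


def subset_sketch(keys_left, keys_right, shift, values_left=None):
    """Toy analogue of the six steps; NOT the real STARE encoding."""
    keys_left = np.asarray(keys_left)
    keys_right = np.asarray(keys_right)

    # Step 1: clamp the left keys to the range covered by the right keys.
    # In this toy scheme a coarse cell covers 2**shift consecutive keys.
    lo = (keys_right.min() >> shift) << shift
    hi = ((keys_right.max() >> shift) << shift) | ((1 << shift) - 1)
    candidates = np.flatnonzero((keys_left >= lo) & (keys_left <= hi))

    # Steps 2-3: coerce both sides to the common (coarser) level.
    # Assumption: `shift` is given; the real function derives the level itself.
    coarse_left = keys_left[candidates] >> shift
    coarse_right = np.unique(keys_right >> shift)

    # Step 4: deduplicate, remembering where each unique key came from.
    uniq, inverse = np.unique(coarse_left, return_inverse=True)

    # Step 5: run the (potentially expensive) intersect on unique keys only.
    hit = np.isin(uniq, coarse_right)

    # Step 6: map the per-unique result back to the original left indices.
    idx = candidates[hit[inverse]]
    if values_left is None:
        return idx
    # Select along the last (fastest changing) axis of values_left.
    return np.asarray(values_left)[..., idx]


left = np.array([16, 17, 35, 48, 49, 70])
right = np.array([16, 48])
print(subset_sketch(left, right, shift=4))  # indices into `left`: [0 1 3 4]
```

The benefit of deduplicating before intersecting is that the intersect cost scales with the number of distinct coarse cells rather than with the length of the left array; the `inverse` array from `numpy.unique` makes the mapping back a single indexing operation.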
- Parameters:
- sids_left: 1D numpy.array
The sids of the left dataset, i.e. the one being subset.
- sids_right: 1D numpy.array
The sids used to subset sids_left.
- values_left: ndarray
Optional. If set, the subsetted values are returned rather than the indices into the left array. values_left must have the same length as sids_left, i.e. its fastest-changing index must be of the same length as sids_left.
Examples
>>> import numpy
>>> values_left = numpy.array([1, 2, 3, 4, 5, 6])
>>> sids_left = numpy.array([3330891586388099091, 3330891586390196243,
...                          3330891586392293395, 3330891586394390547,
...                          3330891586396487699, 3330891586398584851])
>>> sids_right = numpy.array([3330891586396487699, 3330891586398584851])
>>> res = speedy_subset(sids_left=sids_left, sids_right=sids_right, values_left=values_left)
>>> res
array([5, 6])