Create an intersection of two or more 2D numpy arrays based on a common value in one column
I have 3 numpy recarrays with the following structure. The first column is a position (integer) and the second column is a score (float). Input: a = [[1, 5.41], [2, 5.42], [3, 12.32], dtype=[('position', '
Here is one approach that I believe should be reasonably fast. The first thing to do is count the number of occurrences of each position. This function handles that:
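    import numpy as np

    def count_positions(positions):
        """Return the unique positions and how many times each one occurs."""
        positions = np.sort(positions)
        # diff[i] is True where positions[i] is the last element of its run
        diff = np.ones(len(positions), 'bool')
        diff[:-1] = positions[1:] != positions[:-1]
        # nonzero() returns a tuple; take the index array for the only axis
        count = diff.nonzero()[0]
        # differences between consecutive run ends give the run lengths
        count[1:] = count[1:] - count[:-1]
        # the first run starts at index 0, so its length is its end index + 1
        count[0] += 1
        uniqPositions = positions[diff]
        return uniqPositions, count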
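As a cross-check on the counting step, the same result can be had from `np.unique` with `return_counts=True`. This is a swapped-in alternative, not the function above, and the `positions` values here are made up for illustration:

```python
import numpy as np

# Made-up values standing in for the concatenated 'position' columns.
positions = np.array([3, 1, 2, 1, 3, 1, 2])

# np.unique sorts the input; with return_counts=True it also reports
# how many times each distinct value occurs.
uniq_positions, counts = np.unique(positions, return_counts=True)

print(uniq_positions)  # [1 2 3]
print(counts)          # [3 2 2]
```

Either version returns the unique positions in sorted order together with a matching array of counts.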
Now, using the function from above, take only the positions that occur 3 times, i.e. once in each array (this assumes a position appears at most once within any single array):
    positions = np.concatenate((a['position'], b['position'], c['position']))
    uinqPos, count = count_positions(positions)
    uinqPos = uinqPos[count == 3]
We will be using `searchsorted`, which requires sorted input, so we sort a, b, and c by position:
    a.sort(order='position')
    b.sort(order='position')
    c.sort(order='position')
Now we can use `searchsorted` to find where each value of `uinqPos` sits in each array, and pull out the matching scores:
    new_array = np.empty((len(uinqPos), 4))
    new_array[:, 0] = uinqPos
    index = a['position'].searchsorted(uinqPos)
    new_array[:, 1] = a['score'][index]
    index = b['position'].searchsorted(uinqPos)
    new_array[:, 2] = b['score'][index]
    index = c['position'].searchsorted(uinqPos)
    new_array[:, 3] = c['score'][index]
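To see the steps working together, here is a minimal end-to-end sketch. The rows of `a` follow the question, but `b`, `c`, and the `int`/`float` field types are invented for illustration, and `np.unique` stands in for the counting function so the snippet runs on its own:

```python
import numpy as np

# Field types are an assumption; the dtype in the question is cut off.
dtype = [('position', int), ('score', float)]
a = np.array([(1, 5.41), (2, 5.42), (3, 12.32)], dtype=dtype)
b = np.array([(3, 8.41), (1, 7.42), (6, 1.50)], dtype=dtype)  # made-up data
c = np.array([(9, 2.10), (3, 4.00), (1, 6.25)], dtype=dtype)  # made-up data

# Positions present in all three arrays
# (assumes no position repeats within a single array).
positions = np.concatenate((a['position'], b['position'], c['position']))
uniq, counts = np.unique(positions, return_counts=True)
common = uniq[counts == 3]

result = np.empty((len(common), 4))
result[:, 0] = common
for col, arr in enumerate((a, b, c), start=1):
    arr = np.sort(arr, order='position')        # sorted copy; inputs untouched
    idx = arr['position'].searchsorted(common)  # row of each common position
    result[:, col] = arr['score'][idx]

print(result)
```

With this data only positions 1 and 3 occur in all three arrays, so `result` has two rows, each holding the position followed by its score from a, b, and c.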
There might be a more elegant solution using dictionaries, but I thought of this one first so I'll leave that to someone else.
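For what it's worth, the dictionary idea could look something like the sketch below. It is only one possible take: `b`, `c`, and the field types are again made up, and for large arrays it will likely be slower than the `searchsorted` version because the work happens in Python rather than in numpy.

```python
import numpy as np

# Field types are an assumption; the dtype in the question is cut off.
dtype = [('position', int), ('score', float)]
a = np.array([(1, 5.41), (2, 5.42), (3, 12.32)], dtype=dtype)
b = np.array([(3, 8.41), (1, 7.42), (6, 1.50)], dtype=dtype)  # made-up data
c = np.array([(9, 2.10), (3, 4.00), (1, 6.25)], dtype=dtype)  # made-up data

# One position -> score lookup per array.
lookups = [
    dict(zip(arr['position'].tolist(), arr['score'].tolist()))
    for arr in (a, b, c)
]

# Positions that are keys in every lookup, in ascending order.
common = sorted(set(lookups[0]).intersection(*lookups[1:]))

# One row per common position: [position, score_a, score_b, score_c].
rows = [[pos] + [d[pos] for d in lookups] for pos in common]
result = np.array(rows, dtype=float).reshape(-1, 4)  # stays (0, 4) if empty

print(result)
```

Unlike the sorting approach, this does not need the inputs sorted. If a position repeats within one array, the last score seen for it wins.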