Last night I found myself in need of a simple weighted sort function in Python.
I had a list of integers which represented object IDs in my project. Some of
the objects needed to be processed before the others while iterating over the
list of integers, and I already knew which object IDs those were. The order
the rest of the object IDs were processed didn't matter at all. I just wanted
the special object IDs to arrive at the beginning of the list, and the
remaining object IDs could be in any order.
I was surprised at how simple it was to produce such a weighted sort. Here's
an example of what I did:
import random
object_ids = [random.randint(0, 100) for i in range(20)]
special_ids = [random.choice(object_ids) for i in range(5)]
print 'Object IDs:', object_ids
print 'Special IDs:', special_ids
object_ids.sort(key=special_ids.__contains__, reverse=True)
print 'Object IDs:', object_ids
And some sample output:
Object IDs: [13, 97, 67, 5, 77, 58, 24, 99, 29, 20, 29, 75, 100, 31, 79, 5, 27, 11, 6, 1]
Special IDs: [13, 1, 27, 6, 67]
Object IDs: [13, 67, 27, 6, 1, 97, 5, 77, 58, 24, 99, 29, 20, 29, 75, 100, 31, 79, 5, 11]
Notice that each of the "special" IDs have shifted from their original position
in the object_ids list to be at the beginning of the list after the sort.
The Python documentation for sort says that the key argument "specifies
a function of one argument that is used to extract a comparison key from each
list element." I'm using it to check to see if a given element in the list is
in my special_ids list. If the element is present in the special_ids
list, it will be shifted to the left because of the way the
special_ids.__contains__ works.
In sorting, a value of 1 (or other positive integer) out of a comparison
function generally means "this belongs to the right of the other element." A
value of -1 (or other negative integer) means "this belongs to the left of
the other element." A value of 0 means "these two elements are equal" (for
the purposes of sorting). I'm assuming it works similarly with the key
argument. Please correct me if I'm wrong!
As lqc states in the comments below, the key argument works differently.
It creates a new sequence of values which is then sorted. Before lqc jumped
in, I was using key=int(i in special_ids) * -2 + 1 to do the sorting, which
is pretty dumb. Using key=special_ids.__contains__ is much more
appropriate. Thanks lqc!!
This sort of weighted sort might not be just right for your needs, but
hopefully it will give you a place to start to build your customized weighted
sort!