This blog is mainly a fun read. If you are seeking for an ideal random permutation generation algorithm for serious, read method 3 directly.
Also note that the algorithm discussed here is NOT new.
What's random permutation?
A random permutation is a set of objects in random order.
Given a set of objects, 1, 2, 3 ... n, a random permutation may look like,
p1, p2, p3 ... pn
In which px is a random value from the original set.
Random permutation is useful to shuffle poker cards, randomize a puzzle game, generate random sequence, or generate a random sub collection of a give set (get random k objects out of n objects).
The random permutation generation algorithms, from naive to mature, in my past real experience
To explain the algorithms, I would use an auxiliary function to generate random number.
int random(int min, int max);
The result is a random number which is greater or eqaul with min and less than max.
That's to say, the result is in left-close-right-open range of [min, max).
Method 1, naive way.
Swap two items at random positions. Repeat for enough times.
Pseudo code:
array data(1..n);
for(enough iterations) {
swap(data[random(0, n)], data[random(0, n)]);
}
This method is quite intuitive, and quite simple, and it really works if there are enough iterations, such like iterates 100 times for 10 objects, etc. Yes, it works, I used it a lot before.
The big problem is that the iteration count has to be much bigger than the object count (n) because the chance to choose the same two items are quite high. The probability is 1 / (n * n) to choose same items, which is quite high.
So with this method, we get either bad performance (very high iteration) or less random result (low iteration).
Method 2, take from a basket.
Assume all objects are balls. We put all the balls to a single basket, then take one random ball from the basket, repeat until the basket is empty.
Pseudo code:
array data(1..n);
basket = new array;
for(i = 0 to N - 1) {
basket.push(data[i]);
}
for(i = 0 to N - 1) {
int index = random(0, basket.length);
data[i] = basket[index];
basket.remove(index);
}
This method is also quite intuitive, because in the reality, the lottery is using that method with real balls and real baskets.
This method is quite high performance with O(N) time complexity.
And theoretically, the result is guaranteed random enough because all balls are choosen from the baskets randomly.
Method 3, evolve -- keep balls in basket
The second method is good for reality, quite easy to operate. But in computer, it has a disadvantage: it requires an extra temporary buffer to be the basket.
Not bad in most cases, but can we do better?
Of course yes! We can do in-place basket method.
Real C++ code:
int random(int minValue, int maxValue)
{
assert(minValue <= maxValue);
if(minValue != maxValue) {
return rand() % (maxValue - minValue) + minValue;
}
else {
return minValue;
}
}
template <typename T>
void randomPermutation(T & data, int count)
{
using std::swap;
for(int i = 0; i < count; ++i) {
swap(data[i], data[random(i, count)]);
}
}
C version non-template randomPermutation (replace "int" with whatever the item type you use, and implement swap by yourself)
void randomPermutation(int * data, int count)
{
for(int i = 0; i < count; ++i) {
swap(&data[i], &data[random(i, count)]);
}
}
Above code is exactly an implementation of the basket method, but in a obscure way.
Understand the magic
Let's suppose the basket is a long basket with N slots in a line. So the basket is linear.
Then the initial basket will look like,
1, 2, 3, 4, 5, 6, ..., N
Now assume we randomly choose 5, then the basket will look like,
1, 2, 3, 4, E, 6, ..., N
E means empty slot.
But instead of just remove E, we shift the slots before 5 just one step backward, and put 5 at the first slot,
5, 1, 2, 3, 4, 6, ..., N
Next time if we choose 3, we just shift all slots before 3 but after 5, then put 3 there,
5, 3, 1, 2, 4, 6, ..., N
Repeat for N times
Better, yes? We don't need an extra buffer. But we have to shift a lot of times, not fun.
How about for C-th choosen, we just swap the candidate with the C-th item instead of a shift?
Then above iterations will look like,
1, 2, 3, 4, 5, 6, ..., N // initial
5, 2, 3, 4, 1, 6, ..., N // choose 5, swap with 1
5, 3, 2, 4, 1, 6, ..., N // choose 3, swap with 2
That's how the code really do.