## Single-pass random optimal selection

tsh73, May 2013

## (How come?)

I encountered this problem in one of my programs.
I thought the solution was kind of cool, in a nerdy kind of way; but I doubted whether it was too obvious to warrant an article of its own. So I checked with my colleague, a teacher, and she said “I think it’s interesting, go ahead and post it”. So here I am.

## The problem

Imagine you have a choosing program, likely involving a big (lengthy) search. You are interested in picking just one of the “best” solutions; being “best” amounts to having the maximum of some criteria – a single number.
However, beforehand you have no idea what that number will be; moreover, you have no idea whether there will be a single solution with the maximum criteria or hundreds of them.
But in case there are several of them, it would be nice to pick one of them at random – I mean, with equal probability. (If you program some game, say Tic-tac-toe, it would make the computer opponent more interesting to play against – instead of picking the same response along each path, it would show some variation.) (Note: you can easily get the first of them or the last of them, depending on whether you put > or >= in your maximum-search code; but with equal probability?)
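To see the note about > versus >= concretely, here is a tiny sketch (in Python rather than BASIC, purely for illustration; the list `values` is made up):

```python
values = [3, 7, 1, 7, 7, 2]   # the maximum 7 occurs three times

# Using '>' keeps the FIRST maximum seen...
best = float("-inf")
first_i = -1
for i, v in enumerate(values):
    if v > best:
        best, first_i = v, i

# ...while '>=' keeps the LAST one.
best = float("-inf")
last_i = -1
for i, v in enumerate(values):
    if v >= best:
        best, last_i = v, i

print(first_i, last_i)   # prints: 1 4
```

Either way the choice among ties is deterministic – which is exactly the problem.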

## Easy (read no-thinking) way

So. The easy (read: no-thinking) way of doing this requires several passes of the search loop.

1. On the first pass, you determine the maximum criteria value (“best”).

2. On the second pass, you count the number of occurrences (Nbest) of that “best” value. Then you can dimension an array for that count (dim A(Nbest)).

3. On the third pass, you fill that array with the solutions yielding the “best” criteria.

And at last, we can finally generate a random number R between 1 and Nbest and return A(R).

```basic
'problem:
'You have a choosing program, likely involving big (lengthy) search.
'You are interested in picking just one of "best" solutions;
'being "best" amounts to having maximum of some criteria - a single number.
'v.1. Easy (read no-thinking) way
'for now, i 1..N would be search space, b(i) is criteria
'i, so that b(i)= max, is solution (one of).
N=100
dim b(N)
NN=int(rnd(0)*20)+20
for i = 1 to N
    b(i)=int(rnd(0)*NN)
next
print "We have ";N;" integer random numbers."

'On the first pass, you determine maximum criteria value ("best").
best = -1e40    'so that it would be guaranteed less than maximum
for i = 1 to N
    if b(i)>best then best=b(i)
next
print "Maximum of them: "; best

'On the second pass, you count number of occurrences (Nbest) of that "best" value.
Nbest=0
print "solution","value"
for i = 1 to N
    if b(i)=best then
        Nbest=Nbest+1
        print i, b(i)
    end if
next
print "----------------------"
print "Num of solutions (maximums) ";Nbest

'Then you can dimension array for that count
dim a(Nbest)

'Third pass: fill that array with solutions yielding "best" criteria.
j=0
print "number","solution","value"
for i = 1 to N
    if b(i)=best then
        j=j+1
        a(j)=i
        print j, a(j), b(a(j))
    end if
next
print "----------------------"

'And at last, we can finally generate random number R between 1 and Nbest
R=int(rnd(0)*Nbest)+1
'and return A(R).
print "We choose solution ";a(R);" with value ";b(a(R))
```
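For comparison, the same three-pass scheme can be sketched outside BASIC as well; this is an illustrative Python version (names like `b`, `nbest`, and `a` are just stand-ins for the listing's variables):

```python
import random

b = [random.randrange(30) for _ in range(100)]   # criteria over the search space

best = max(b)                                    # pass 1: maximum criteria value
nbest = sum(1 for v in b if v == best)           # pass 2: count its occurrences
a = [i for i, v in enumerate(b) if v == best]    # pass 3: collect tied solutions

r = random.randrange(nbest)                      # pick one of them uniformly
print("We choose solution", a[r], "with value", b[a[r]])
```

Because every tied index lands in `a` and `randrange` is uniform, each tied solution is picked with equal probability – at the cost of three sweeps.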

This approach has one definite virtue – clarity. Remember? Make it work – make it work right – make it work fast, and only in that order!
But suppose you are already past the “work right” phase; some speeding up might as well be welcome. We started with a “big (lengthy) search”, remember?

## Cutting it 2x

Currently, our algorithm passes over that lengthy search thrice.
We can rather easily cut it 2x.
First, we can combine the first two phases into one, in a single pass:

```basic
'On the first pass, you determine maximum criteria value ("best").
'On the second pass, you count number of occurrences (Nbest) of that "best" value.
'Combine first two phases in one, in a single pass.
best = -1e40    'so that it would be guaranteed less than maximum
Nbest=0
for i = 1 to N
    if b(i)=best then
        Nbest=Nbest+1
    end if
    if b(i)>best then   'order is important: it's too late to check b(i)=best
        best=b(i)       'after assignment on this line!
        Nbest=1
    end if
next
print "Maximum of them: "; best
print "Num of solutions (maximums) ";Nbest
```

That leaves us with two passes (out of three).
Second, we can skip that array business altogether:
we just generate R, then run the last pass, stopping after hitting the R-th “best” value.

```basic
'We just generate R then run last pass, stopping after hitting R-th best value.
R=int(rnd(0)*Nbest)+1
print "We choose ";R;"-th solution"
j=0
print "number","solution","value"
for i = 1 to N
    if b(i)=best then
        j=j+1
        print j, i, b(i)
        if j=R then
            exit for    'stopping after hitting R-th best value
        end if
    end if
next
print "----------------------"
print "We choose solution ";i;" with value ";b(i)
```

That will cost us anywhere from almost nothing (best case, R=1) to about a full search pass (worst case, R=Nbest) – on average, half a pass. So we end up with 1½ full search passes. As I promised, a 2x cut from three passes :).
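Putting the two combined passes together, here is a sketch of the whole two-pass variant (Python again, illustrative only; variable names are made up to mirror the listings):

```python
import random

b = [random.randrange(30) for _ in range(100)]

# Pass 1: maximum and tie count in one sweep.
best = float("-inf")
nbest = 0
for v in b:
    if v == best:
        nbest += 1
    if v > best:          # order matters: test equality before updating best
        best = v
        nbest = 1

# Pass 2 (partial): stop as soon as the R-th tied value is reached.
r = random.randrange(1, nbest + 1)
seen = 0
chosen = -1
for i, v in enumerate(b):
    if v == best:
        seen += 1
        if seen == r:
            chosen = i
            break
print("We choose solution", chosen, "with value", b[chosen])
```

No array of ties is ever built; the early `break` is what makes the second pass partial.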

## And now, in a single pass

Could it be made better?
Well, yes.
We will do all that in a single pass.
Here’s how.

When we hit the best value, we can keep a counter – how many times have we already seen that best value?

If the answer is 0 (this is the best value and it appeared just now), then we just keep it, no questions asked.

If the answer is 1, we have N=2 equal values. So we take the new value with probability 1/N (ha! that just means ½ in our case!), that is, on the condition

If rnd(0)<1/N then

If the condition fails, we keep the old value.

Likewise, if the answer is 2, we have N=3 equal values. We take the new value with probability 1/N (1/3 now), else we keep the old value (which happened to be the 1st or 2nd best value, with equal probability).

You get the picture. ;) (In general, by induction: after N equal values, each of them has been kept with probability 1/N; the (N+1)-th value replaces the kept one with probability 1/(N+1), so each earlier value survives with probability (1/N)·(N/(N+1)) = 1/(N+1) – equal again.)
Now, the code.

```basic
'problem:
'You have a choosing program, likely involving big (lengthy) search.
'You are interested in picking just one of "best" solutions;
'being "best" amounts to having maximum of some criteria - a single number.
'v.3. Single pass solution.
'for now, i 1..N would be search space, b(i) is criteria
'i, so that b(i)= max, is solution (one of).
N=100
dim b(N)
NN=int(rnd(0)*20)+20
for i = 1 to N
    b(i)=int(rnd(0)*NN)
next
print "We have ";N;" integer random numbers."

'Single pass:
'gets maximum,
'counts number of maximums
'and randomly keeps one of them.
best = -1e40    'so that it would be guaranteed less than maximum
Nbest=0
solution=0
for i = 1 to N
    if b(i)=best then
        Nbest=Nbest+1   'now we have Nbest equal values
        'So we take new value with probability 1/Nbest
        if rnd(1)<1/Nbest then
            solution=i
        end if
        'else we keep old value (do nothing)
    end if
    if b(i)>best then   'this is best value and it happened just now
        best=b(i)       'we just keep it, no question asked
        solution=i
        Nbest=1         'start counting anew
    end if
next
print "Maximum of them: "; best
print "Num of solutions (maximums) ";Nbest
print "We choose solution ";solution;" with value ";b(solution)
```
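The same single-pass trick translates directly; the sketch below (Python, illustrative) also runs a quick empirical check that the three tied maximums of a made-up list really are chosen roughly equally often:

```python
import random

def pick_best(b):
    """One pass: track the maximum, count ties, and keep each tied index
    with probability 1/Nbest the moment it is seen."""
    best = float("-inf")
    nbest = 0
    solution = -1
    for i, v in enumerate(b):
        if v == best:
            nbest += 1
            if random.random() < 1 / nbest:   # take new value with prob 1/Nbest
                solution = i
        if v > best:                          # new maximum: keep it, restart count
            best = v
            nbest = 1
            solution = i
    return solution

b = [1, 5, 2, 5, 5, 0]            # maximum 5 occurs at indices 1, 3, 4
counts = {1: 0, 3: 0, 4: 0}
trials = 30000
for _ in range(trials):
    counts[pick_best(b)] += 1
print(counts)                     # each count should be near trials/3
```

Over 30000 trials, each of the three tied indices should come out within a couple of percent of ⅓.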

As I promised, it all happens in a single pass of the algorithm!
