Pseudo-random Sampling and .NET

One of the requirements I received for my current application was to select five percent of entities generated by another process for further review by an actual person. The requirement wasn’t quite a request for a simple random sample (since the process generates entities one at a time instead of in batches), so the code I had to write needed to give each entity generated a five percent chance of being selected for further review.  In .NET, anything involving percentage chances means using the Random class in some way.  Because the class doesn’t generate truly random numbers (it generates pseudo-random numbers), additional work is needed to make the outcomes more random.

The first part of my approach to making the outcomes more random was to simplify the five percent aspect of the requirement to a yes or no decision, where “yes” meant treat the entity normally and “no” meant select the entity for further review.  I modeled this as a collection of 100 boolean values with 95 true and five false.  I ended up using a for-loop to populate the boolean list with 95 true values.  Another option I considered was using Enumerable.Repeat (described in great detail in this post), but apparently that operation is quite a bit slower.  I could have used Enumerable.Range instead, and may investigate the possibility later to see what advantages or disadvantages there are in performance and code clarity.

Having created the list of decisions, I needed to randomize their order.  To accomplish this, I used LINQ to sort the list by the value of newly-generated GUIDs:

decisions.OrderBy(d => Guid.NewGuid()) //decisions is a list of bool

With a randomly-ordered list of decisions, the final step was to select a decision from a random location in the list.  For that, I turned to a Jon Skeet post that provided a provided a helper class (see the end of that post) for retrieving a thread-safe instance of Random to use for generating a pseudo-random value within the range of possible decisions.  The resulting code is as follows:

return decisions.OrderBy(d => Guid.NewGuid()).ToArray()[RandomProvider.GetThreadRandom().Next(100)]; //decisions is a list of bool

I used LINQPad to test my code and over multiple executions, I got between 3 and 6 “no” results.