Sample with replacement

random.sample() samples without replacement. I found this piece of code by Sean Ross, which samples with replacement.

# credit author(s) of random.py
import random
import itertools

def sample_wr(population, k):
    "Chooses k random elements (with replacement) from a population"
    n = len(population)
    _random, _int = random.random, int  # speed hack 
    return [population[_int(_random() * n)] for i in itertools.repeat(None, k)]
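
As a quick check of how this behaves, here is a minimal usage sketch written for Python 3. The population list, the seed, and the variable names are made up for illustration. It also calls random.choices, which the standard library has offered since Python 3.6 and which samples with replacement as well.

```python
import random

def sample_wr(population, k):
    "Chooses k random elements (with replacement) from a population"
    n = len(population)
    return [population[int(random.random() * n)] for _ in range(k)]

random.seed(0)  # seeded only so that repeated runs give the same picks
population = ["a", "b", "c"]  # hypothetical example data

# k may be larger than the population, since elements can repeat
picks = sample_wr(population, 10)

# Standard-library alternative (Python 3.6+), also with replacement
builtin_picks = random.choices(population, k=10)

print(picks)
print(builtin_picks)
```

Unlike random.sample(), neither call raises an error when k is larger than the population.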

Parallelize big for loops in Python

Suppose I have a big for loop to run (big in the sense of many iterations):

for i in range(10000):
    for j in range(10000):
        f((i, j))

After hours of searching, I arrived at a solution using the multiprocessing module:

from multiprocessing import Pool

pool = Pool()  # one worker process per CPU by default
x = pool.imap(f, ((i, j) for i in xrange(10000) for j in xrange(10000)))

Remark: pool.map converts its arguments into a list first and only then feeds that list to the function. With a big for loop, a lot of time is therefore spent building the argument list on a single CPU before any work is parallelized. In contrast, imap consumes the arguments on the fly, so the workers can start right away and the loop is parallelized as I wished. Note that imap returns an iterator, so x has to be iterated over to collect the results.
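
To make the remark concrete, here is a small self-contained sketch for Python 3, where range is already lazy and replaces xrange. The function f is a hypothetical stand-in for the real work, and run_grid, the grid size, the number of processes, and the chunksize value are all assumptions chosen for illustration, not tuned settings.

```python
from multiprocessing import Pool

def f(pair):
    """Hypothetical stand-in for the real work; takes one (i, j) tuple."""
    i, j = pair
    return i * j

def run_grid(n, processes=2, chunksize=50):
    """Apply f to every (i, j) of an n x n grid using a process pool."""
    # A generator expression: the pairs are produced lazily, not as a list
    pairs = ((i, j) for i in range(n) for j in range(n))
    with Pool(processes) as pool:
        # imap returns an iterator that yields results in input order.
        # Consume it inside the with-block, because leaving the block
        # terminates the pool.
        return list(pool.imap(f, pairs, chunksize=chunksize))

if __name__ == "__main__":
    results = run_grid(100)
    print(len(results), results[:5])
```

imap uses a chunksize of 1 by default, which can mean a lot of inter-process overhead when f is cheap. Passing a larger chunksize sends the arguments to the workers in batches, and it may be worth experimenting with for loops like the one above.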