When you don't need the return values:
from joblib import Parallel, delayed
import multiprocessing
# Number of cores available to use
num_cores = multiprocessing.cpu_count()
# If your function takes only one argument
def yourFunction(x):
    # anything in your loop
    return XXX
# `inputs` is any iterable of arguments (avoid naming it `list`, which shadows the built-in)
Parallel(n_jobs=num_cores)(delayed(yourFunction)(x) for x in inputs)
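# If your function takes more than one argument
def yourFunction(input1, input2):
    # anything in your loop
    return XXX
# The nested loops call the function on every (input1, input2) combination;
# use `for input1, input2 in zip(list1, list2)` to pair elements one-to-one instead
Parallel(n_jobs=num_cores)(delayed(yourFunction)(input1, input2) for input1 in list1 for input2 in list2)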
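To make the multi-argument case concrete, here is a minimal sketch comparing nested loops with element-wise pairing. The function `add`, the lists `xs` and `ys`, and `n_jobs=2` are hypothetical placeholders chosen for illustration, not part of the original snippet.

```python
from joblib import Parallel, delayed

def add(a, b):
    # Hypothetical stand-in for the body of your loop
    return a + b

xs = [1, 2]
ys = [10, 20]

# Nested loops: one call per (a, b) combination, len(xs) * len(ys) calls in total
all_pairs = Parallel(n_jobs=2)(delayed(add)(a, b) for a in xs for b in ys)

# zip: one call per aligned pair, len(xs) calls in total
aligned = Parallel(n_jobs=2)(delayed(add)(a, b) for a, b in zip(xs, ys))

print(all_pairs)  # [11, 21, 12, 22]
print(aligned)    # [11, 22]
```

Which form you want depends on whether the two lists describe a grid of parameters (nested loops) or matched pairs (zip).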
When you need the return values, assign the call to a variable; the results come back as a list, in the same order as the inputs:
results = Parallel(n_jobs=num_cores)(delayed(yourFunction)(x) for x in inputs)
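If the function returns more than one value, each element of the list is a tuple. A small sketch of collecting and unpacking such results follows; `square_and_cube` and `n_jobs=2` are hypothetical choices for illustration.

```python
from joblib import Parallel, delayed

def square_and_cube(x):
    # Hypothetical worker returning two values per input
    return x ** 2, x ** 3

# One tuple per input, in input order
results = Parallel(n_jobs=2)(delayed(square_and_cube)(x) for x in range(4))

# Transpose the list of tuples into one sequence per returned value
squares, cubes = zip(*results)

print(results)  # [(0, 0), (1, 1), (4, 8), (9, 27)]
print(squares)  # (0, 1, 4, 9)
print(cubes)    # (0, 1, 8, 27)
```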
When each call returns a pandas DataFrame and you want to concatenate them afterwards, use mp.Pool:
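import multiprocessing as mp
import pandas as pd
# Leave one core free, but never ask for fewer than one worker
with mp.Pool(processes=max(1, num_cores - 1)) as pool:
    # yourFunction must return a DataFrame; argvList holds one argument per call
    resultList = pool.map(yourFunction, argvList)
results_df = pd.concat(resultList)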
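A self-contained sketch of the same pattern, assuming a hypothetical worker `make_frame` that builds a small DataFrame per input. The helper `run_pool`, the column names, and the choice of two processes are illustrative assumptions only.

```python
import multiprocessing as mp
import pandas as pd

def make_frame(n):
    # Hypothetical worker: returns a two-row DataFrame for one input
    return pd.DataFrame({"n": [n, n], "value": [n * 10, n * 10 + 1]})

def run_pool(args, processes=2):
    # pool.map preserves input order, so the frames line up with `args`
    with mp.Pool(processes=processes) as pool:
        frames = pool.map(make_frame, args)
    # ignore_index=True renumbers the rows 0..N-1 in the combined frame
    return pd.concat(frames, ignore_index=True)

if __name__ == "__main__":
    # The guard stops worker processes from re-running this code when the
    # module is re-imported (for example under the "spawn" start method)
    print(run_pool([1, 2, 3]))
```

The worker has to be defined at module level so that it can be pickled and sent to the worker processes.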
Sources:
https://stackoverflow.com/questions/9786102/how-do-i-parallelize-a-simple-python-loop
https://blog.dominodatalab.com/simple-parallelization/
https://stackoverflow.com/questions/36794433/python-using-multiprocessing-on-a-pandas-dataframe