OpenMP and Python

Because of the GIL, there is no point in using threads for CPU-intensive tasks in CPython. You need either multiprocessing or C extensions that release the GIL during computation, as some NumPy functions do. You can also write C extensions that use multiple threads in Cython.
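As a minimal sketch of the "C code that releases the GIL" case, the snippet below hashes several large buffers in a thread pool. It uses the standard-library hashlib as a stand-in for NumPy (an assumption made only to keep the example dependency-free); hashlib releases the GIL while hashing inputs larger than a couple of kilobytes. The names digest and digest_all are hypothetical, and any speedup depends on how many cores are available.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def digest(block: bytes) -> str:
    # hashlib is implemented in C and releases the GIL while hashing
    # large inputs, so several of these calls can overlap in threads.
    return hashlib.sha256(block).hexdigest()


def digest_all(blocks, workers=4):
    # pool.map preserves input order, so results line up with blocks.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(digest, blocks))


if __name__ == "__main__":
    # Eight 8 MiB buffers: large enough for the hashing to dominate.
    blocks = [bytes([i]) * (8 * 1024 * 1024) for i in range(8)]
    for hexdigest in digest_all(blocks):
        print(hexdigest[:16])
```

If digest were a pure-Python loop instead, the same thread pool would run the calls one at a time under the GIL, and a process pool would be the better fit.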

Running programs in parallel using xargs

From the xargs man page: This manual page documents the GNU version of xargs. xargs reads items from the standard input, delimited by blanks (which can be protected with double or single quotes or a backslash) or newlines, and executes the command (default is /bin/echo) one or more times with any initial-arguments followed by items read …

Python multiprocessing.Pool: AttributeError

Error 1: AttributeError: Can't pickle local object 'SomeClass.some_method.<locals>.single'. You solved this error yourself by moving the nested target function single() out to the top level. Background: Pool needs to pickle (serialize) everything it sends to its worker processes (IPC). Pickling only saves the name of a function, and unpickling requires re-importing the function by name. For that …
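The pickling behaviour described above can be reproduced without a Pool at all. The following is a small sketch, with hypothetical names (top_level, make_nested, can_pickle), that tries to pickle a module-level function and a nested one. It catches both AttributeError and pickle.PicklingError, since which of the two is raised can depend on the Python version and pickler implementation.

```python
import pickle


def top_level(x):
    # Importable by name from its module, so pickle can reference it.
    return x * 2


def make_nested():
    def single(x):
        # Qualified name is make_nested.<locals>.single, which cannot
        # be looked up again by name when unpickling.
        return x * 2

    return single


def can_pickle(obj) -> bool:
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError):
        return False


if __name__ == "__main__":
    print("top-level function picklable:", can_pickle(top_level))
    print("nested function picklable:", can_pickle(make_nested()))
```

This is why moving single() to module level resolves the error: the worker process can then find the function by importing it by name.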

How do I parallelize a simple Python loop?

Using multiple threads on CPython won't give you better performance for pure-Python code due to the global interpreter lock (GIL). I suggest using the multiprocessing module instead. Note that this won't work in the interactive interpreter. To avoid the usual FUD around the GIL: There wouldn't be any advantage to using threads for this example anyway. You want to …
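The excerpt's own code is not reproduced here, so the following is only an illustrative sketch of replacing a loop with multiprocessing.Pool. The loop body calc_stuff and the helpers run_serial and run_parallel are hypothetical placeholders. The __main__ guard matters: worker processes may re-import the main module, which is also why this pattern does not work when typed into the interactive interpreter.

```python
from multiprocessing import Pool


def calc_stuff(n):
    # Hypothetical CPU-bound loop body: sum of squares below n.
    return sum(i * i for i in range(n))


def run_serial(inputs):
    # The original, single-process loop.
    return [calc_stuff(n) for n in inputs]


def run_parallel(inputs, processes=4):
    # Pool.map splits the inputs across worker processes and returns
    # the results in the same order as the inputs.
    with Pool(processes=processes) as pool:
        return pool.map(calc_stuff, inputs)


if __name__ == "__main__":
    inputs = [200_000 + k for k in range(8)]
    assert run_parallel(inputs) == run_serial(inputs)
    print("parallel and serial results match")
```

For very small loop bodies the cost of starting processes and pickling arguments can outweigh the gain, so it is worth timing both versions on real inputs.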

What is the difference between concurrency and parallelism?

Concurrency is when two or more tasks can start, run, and complete in overlapping time periods. It doesn't necessarily mean they'll ever both be running at the same instant; multitasking on a single-core machine is one example. Parallelism is when tasks literally run at the same time, e.g., on a multicore processor. Quoting Sun's Multithreaded Programming Guide: Concurrency: A condition that exists when at least two …
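To make the distinction concrete, here is a small sketch of concurrency without parallelism: two asyncio tasks share a single thread and take turns, so their lifetimes overlap even though only one of them runs at any instant. The names worker and main and the step counts are illustrative choices, not taken from the quoted guide.

```python
import asyncio


async def worker(name, steps, log):
    for i in range(steps):
        log.append((name, i))
        # Hand control back to the event loop so the other task can run.
        await asyncio.sleep(0)


async def main():
    log = []
    # Both tasks are in progress over the same period, on one thread.
    await asyncio.gather(worker("A", 3, log), worker("B", 3, log))
    return log


if __name__ == "__main__":
    for entry in asyncio.run(main()):
        print(entry)
```

The recorded steps of A and B are interleaved rather than one task finishing before the other starts. Running the two workers in separate processes on a multicore machine would be the parallel counterpart.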