Understanding Python Concurrency - Introduction
This post explores the core concepts of concurrency, so that we understand the terminology we use when we talk about it.
What is Concurrency?
From Wikipedia: concurrency is a property of systems in which several computations are executing simultaneously and potentially interacting with each other.
Simple examples of concurrent systems are a network server that processes multiple client requests and a number-crunching job running across several CPUs.
Multitasking
Multitasking means a computer running multiple processes over the same period of time. Even on a single-core machine you can run multiple programs, because your computer is multitasking. In that case, at any instant the computer is executing instructions from only one program, but it switches between programs so quickly that it gives you the illusion that you are running many programs at once. Even on a multicore machine, when the number of tasks is greater than the number of CPUs, your computer still performs multitasking.
Concurrency implies multitasking.
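The switching described above can be imitated in plain Python. This is only a toy sketch, not how an operating system actually schedules processes: it uses generators as stand-ins for programs, and the names `task` and `round_robin` are made up for illustration.

```python
from collections import deque

def task(name, steps):
    """A hypothetical program that does `steps` units of work, pausing after each."""
    for i in range(steps):
        yield f"{name}:{i}"

def round_robin(tasks):
    """Run tasks on a single 'CPU': one step of one task at a time, then switch."""
    queue = deque(tasks)
    trace = []
    while queue:
        current = queue.popleft()
        try:
            trace.append(next(current))   # execute one step of this task only
        except StopIteration:
            continue                      # task finished, do not reschedule it
        queue.append(current)             # send it to the back of the line
    return trace

trace = round_robin([task("A", 3), task("B", 2)])
print(trace)  # ['A:0', 'B:0', 'A:1', 'B:1', 'A:2']
```

Only one task runs at any instant, yet both make progress over the same period, which is the illusion a single-core machine gives you.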
What is Parallelism?
When we have multiple CPUs, each CPU can process a separate task at the same time. This is called parallelism.
Difference between Concurrency and Parallelism
This SO thread explains it much better than I ever can.
Concurrency is when two tasks can start, run, and complete in overlapping time periods. It doesn’t necessarily mean they’ll ever both be running at the same instant, e.g. multitasking on a single-core machine.
Parallelism is when tasks literally run at the same time, e.g. on a multicore processor.
Quoting Sun’s Multithreaded Programming Guide:
- Parallelism: A condition that arises when at least two threads are executing simultaneously.
- Concurrency: A condition that exists when at least two threads are making progress. A more generalized form of parallelism that can include time-slicing as a form of virtual parallelism.
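To make the parallel case concrete, here is a minimal sketch that hands CPU-heavy work to separate processes using the standard library's `concurrent.futures.ProcessPoolExecutor`. The functions `count_primes` and `count_in_parallel` are hypothetical names chosen for this example. Whether the workers really execute at the same instant depends on how many cores the machine has; on a single core the same code is still concurrent, just not parallel.

```python
import os
from concurrent.futures import ProcessPoolExecutor

def count_primes(limit):
    """Count primes below `limit` by trial division (deliberately CPU-heavy)."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

def count_in_parallel(limits):
    """Give each limit to a worker process; separate processes can occupy separate cores."""
    with ProcessPoolExecutor() as pool:
        return list(pool.map(count_primes, limits))

if __name__ == "__main__":
    # The guard matters: on some platforms worker processes re-import this module.
    print("CPUs reported by the OS:", os.cpu_count())
    print(count_in_parallel([10_000, 20_000, 30_000]))
```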
Nature of Programs
Programs typically execute by alternating between CPU processing and I/O handling. When a task performs I/O, it must wait (sleep) while the underlying system carries out the I/O operation and wakes the task up when it’s done.
A task is said to be CPU bound if it spends most of its time processing, with little I/O. A simple example is image processing.
A task is said to be I/O bound if it spends most of its time waiting for I/O. A simple example is file processing.
Typically most programs are I/O bound.
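The waiting in an I/O-bound task is what makes concurrency pay off: while one task sleeps, another can run. The sketch below fakes I/O with `time.sleep`, which is an assumption made to keep the example self-contained, and compares running four waits one after another against running them in threads. `fake_io`, `timed`, and `run_threaded` are illustrative names.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(delay):
    """Stand-in for an I/O call: the task just waits and uses almost no CPU."""
    time.sleep(delay)
    return delay

def timed(fn):
    """Return how many seconds it takes to call `fn` once."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start

delays = [0.1] * 4

# One after another: each wait has to finish before the next one starts.
sequential = timed(lambda: [fake_io(d) for d in delays])

def run_threaded():
    # One thread per task, so all the waits overlap.
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        list(pool.map(fake_io, delays))

threaded = timed(run_threaded)
print(f"sequential: {sequential:.2f}s, threaded: {threaded:.2f}s")
```

The threaded run should take roughly as long as a single wait rather than the sum of all four. A CPU-bound task would not get the same benefit from this approach, because it has no idle waiting for other tasks to fill.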
Nature of Concurrent Programs
Concurrent programs can come in different flavors.
- Tasks running in the same memory space within one program. They have simultaneous access to the same objects.
- Tasks running in separate processes.
- Tasks running on separate machines.
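The difference between the first two flavors can be observed directly. In this sketch (the names `items`, `append_item`, `run_in_thread`, and `run_in_process` are made up for illustration), the same function is run once in a thread and once in a child process. The third flavor, separate machines, is not covered here.

```python
import multiprocessing
import threading

items = []  # module-level object used to observe who shares memory with whom

def append_item():
    items.append("added")

def run_in_thread():
    """A thread lives in the parent's memory space, so its append is visible here."""
    items.clear()
    worker = threading.Thread(target=append_item)
    worker.start()
    worker.join()
    return list(items)

def run_in_process():
    """A child process changes its own copy of `items`, not the parent's."""
    items.clear()
    worker = multiprocessing.Process(target=append_item)
    worker.start()
    worker.join()
    return list(items)

if __name__ == "__main__":
    print("after thread: ", run_in_thread())
    print("after process:", run_in_process())
```

After the thread runs, the parent sees the new item. After the process runs, the parent's list should still be empty, because the child only modified its own copy. This is why tasks in separate processes need an explicit way to exchange data, while tasks sharing a memory space need care around simultaneous access.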
There is a fair bit of commentary on Dabeaz’s slides, but I will be skipping it here, as it seems more like fodder for programmer debates. I would still recommend going through it, if only to understand why “Python is slow” is often wrong.