Most programs and scripts we write today use threads or even processes. Not always, but it's rare for a programmer never to have heard the names of these creatures.
They are the butter and cheese on our toast, and they hold a special place in every programmer's heart 🙂

Knowing how to use them can be a real game changer.
In the cloud industry, for example, it gives you the basic understanding needed to reason about vertical and horizontal scaling.

Now, if you're already familiar with them, you may prefer to skip the next section and jump straight to the one after it 🙂

The purpose of this article is to show the basic use of threads and processes in Python, along with some other functionality related to threads.
Afterwards I want to show a specific issue with threads in Python and how to bypass it, once we understand its source.
MAY THE FORCE BE WITH US!

What is a Process/Thread

In order to understand them, we will take an indirect route and talk about an ordinary program on our computer.

When we open an application, the OS is responsible for loading it into memory, creating a process for it, and then executing it.
Each process has its own memory, which is divided into stack, heap, text, and data sections.
I don't want to dig deep into those memory sections; just keep in mind that they are the memory allocated for the process to work with.

So wait... if we have a process that is responsible for executing our program and has its own memory, why do we need this weird thing called a thread?

GREAT QUESTION!

Well, a thread is the smallest unit of execution inside a process. Every process starts with a main thread that runs the main program, like main() in our main.c when programming in C.
A process can contain many threads, but we need to make sure we don't create too many: the OS and the CPU have to context switch between them and give each one a time slice on the CPU, so too many threads means a lot of extra work for our CPU.
Each PC has its own CPU with its own number of cores, so we need to keep our hardware limitations in mind and behave accordingly so it will not crash 🙂
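To get a feeling for those hardware limits, the standard library can tell us how many logical cores the machine reports. This is only a small sketch; the helper name `describe_hardware` is made up for this example.

```python
import os
import threading


def describe_hardware():
    """Return the logical core count and the number of live threads."""
    # os.cpu_count() may return None when the count cannot be determined.
    logical_cores = os.cpu_count()
    return {
        "logical_cores": logical_cores,
        # Counts the threads currently alive in this process (at least the main one).
        "live_threads": threading.active_count(),
    }


if __name__ == "__main__":
    print(describe_hardware())
```

The number it prints depends on your machine, so treat it as a rough upper bound for how much work can truly run at the same time.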

Each process running on our OS is scheduled by the OS scheduler, which is responsible for waking it up and changing its state to ‘RUNNING’.
Today's CPUs have 2, 4, 8 or even more cores, and each core can execute one process at any given moment.
Each OS implementation decides how it times the execution of processes and how it prioritizes them; the main component responsible for that is called the scheduler.

That said, a single process can run on multiple cores, because it can have many threads running concurrently.
This means that at the same moment 2 or more threads can be executing code from our program.
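To see that several threads really live inside one process, we can have each thread report the process ID it runs in. A minimal sketch, assuming a made-up helper called `collect_thread_info`:

```python
import os
import threading


def collect_thread_info(thread_count=3):
    """Run a few threads and record which process each one belongs to."""
    records = []
    records_lock = threading.Lock()

    def worker():
        # Every thread appends its own name and the ID of the process it runs in.
        with records_lock:
            records.append((threading.current_thread().name, os.getpid()))

    threads = [
        threading.Thread(target=worker, name=f"Worker {i}")
        for i in range(1, thread_count + 1)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return records


if __name__ == "__main__":
    for name, pid in collect_thread_info():
        print(f"{name} runs in process {pid}")
```

Every worker reports the same process ID, because they all belong to the process that started them.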

One important thing to point out: today we have Hyper-Threading in our CPUs. It presents each physical core to the OS as 2 logical cores, which allows 2 threads (or processes) to be scheduled on the same physical core.
It’s a lot more complex but that’s the idea 🙂

Let’s dive a little inside some code

The common libraries are threading for creating threads and multiprocessing for creating processes.

The following example shows how to create a thread:

import threading
import time


def my_thread_func():
    # current_thread().name replaces the deprecated currentThread().getName()
    thread_name = threading.current_thread().name
    print(f"Thread '{thread_name}' is going to sleep")
    time.sleep(1)  # time.sleep() takes seconds, not milliseconds
    print(f"Thread '{thread_name}' has woken up")


def main():
    t1 = threading.Thread(target=my_thread_func, name='Worker 1')
    t2 = threading.Thread(target=my_thread_func, name='Worker 2')

    t1.start()
    t2.start()

    # Block until both threads have finished
    t1.join()
    t2.join()


if __name__ == '__main__':
    main()

Note: one worker may run ahead of the other, so the order of the lines can differ between runs. It's only for the example :)
Output:
    Thread 'Worker 1' is going to sleep
    Thread 'Worker 2' is going to sleep
    Thread 'Worker 1' has woken up
    Thread 'Worker 2' has woken up

The following example shows how to create a process:

from multiprocessing import Process, current_process
import time


def my_process_func():
    # Name given to the process when it was created
    process_name = current_process().name
    print(f"Process '{process_name}' is going to sleep")
    time.sleep(1)  # time.sleep() takes seconds, not milliseconds
    print(f"Process '{process_name}' has woken up")


def main():
    p1 = Process(target=my_process_func, name='Worker 1')
    p2 = Process(target=my_process_func, name='Worker 2')

    p1.start()
    p2.start()

    # Block until both processes have finished
    p1.join()
    p2.join()


if __name__ == '__main__':
    main()

Note: one worker may run ahead of the other, so the order of the lines can differ between runs. It's only for the example :)
Output:
    Process 'Worker 1' is going to sleep
    Process 'Worker 2' is going to sleep
    Process 'Worker 1' has woken up
    Process 'Worker 2' has woken up

Both scripts do the same thing.
They start threads/processes that print a message before going to sleep, then sleep, and finally print another message when they wake up.
The main script waits for them to finish their execution because we called join().
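To make the effect of join() a bit more visible, here is a small sketch that checks whether a thread is still alive before and after the call. The names `short_nap` and `join_demo` are made up for this example.

```python
import threading
import time


def short_nap():
    # Stand-in for real work: just sleep for a moment.
    time.sleep(0.2)


def join_demo():
    """Return whether the worker was alive before and after join()."""
    worker = threading.Thread(target=short_nap, name="Napper")
    worker.start()
    alive_before_join = worker.is_alive()  # the worker is still sleeping here
    worker.join()                          # block until the worker has finished
    alive_after_join = worker.is_alive()
    return alive_before_join, alive_after_join


if __name__ == "__main__":
    print(join_demo())
```

The call returns (True, False): the worker is alive while it sleeps, and once join() returns it has finished.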

They look almost exactly the same!
So what is the big difference?!
Well, remember we said that threads can run in parallel?
The ugly truth is that in Python they can't 🙂

Who does Python think he is?!

In order to understand that last big statement, let's talk briefly about Python.
Python is an interpreted language, meaning it is not compiled ahead of time into an executable that we then run whenever we want.
Instead, the script is compiled to bytecode when it is executed, and the interpreter then runs that bytecode.
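If you are curious what that bytecode looks like, the standard `dis` module can list the instructions of a function. A minimal sketch; the function `add` and the helper `bytecode_names` are made up for this example, and the exact instruction names differ between Python versions.

```python
import dis


def add(a, b):
    return a + b


def bytecode_names(func):
    """Return the names of the bytecode instructions CPython compiled for func."""
    return [instruction.opname for instruction in dis.get_instructions(func)]


if __name__ == "__main__":
    # Prints a human-readable listing of the bytecode for add().
    dis.dis(add)
```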

The main Python interpreter is CPython, the one that comes with the default installation of Python.
There are other Python implementations written in other languages, such as Jython (Java) and IronPython (C#).

Inside our dear friend CPython, and by inside I mean in the source code, there is a lock called the GIL.
The GIL (Global Interpreter Lock) is a mutex that a thread must hold while it executes Python bytecode inside the interpreter.

So wait… What is the GIL purpose again?

In older versions of CPython (Python 2), when the interpreter gets a time slice on the CPU, it executes 100 bytecode instructions of the compiled script for each thread before switching, TOPS!
Just to make sure you understand: theoretically, if the interpreter's time slice is worth 450 instructions, it will execute 100 bytecode instructions for each of 4 threads and 50 more for a fifth thread, which stops in the middle because the interpreter's own time slice has ended.
Since Python 3.2 the switch is time-based instead: a running thread is asked to release the GIL after a switch interval, which is 5 ms by default.
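On a modern CPython (3.2 and later) the hand-off between threads is governed by a time interval rather than an instruction count, and the `sys` module lets us inspect it. A minimal sketch; the helper name `switch_interval_ms` is made up for this example.

```python
import sys


def switch_interval_ms():
    """Return the interpreter's thread switch interval in milliseconds."""
    # sys.getswitchinterval() returns the interval in seconds.
    return sys.getswitchinterval() * 1000


if __name__ == "__main__":
    print(f"Switch interval: {switch_interval_ms():.1f} ms")
```

The value can be changed with sys.setswitchinterval(), although in my experience there is rarely a good reason to touch it.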

The GIL is responsible for locking the interpreter for whichever thread is currently running inside CPython.

Wait a minute... doesn't that mean only 1 thread can execute at a time inside the interpreter, even though it's a whole process?
I'm not joking: YES, IT DOES!
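One way to feel this yourself is to time a pure-Python, CPU-heavy function run twice in a row against the same work split over 2 threads. This is a rough sketch with made-up helper names (`count_down`, `time_sequential`, `time_threaded`). On a standard CPython build with the GIL I would expect the 2 timings to be roughly the same, but the numbers depend on your machine and Python build.

```python
import threading
import time


def count_down(n):
    # Pure-Python busy loop: CPU-bound, so it holds the GIL while it runs.
    while n > 0:
        n -= 1


def time_sequential(n, repeats=2):
    """Run count_down(n) `repeats` times in a row and return the elapsed seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        count_down(n)
    return time.perf_counter() - start


def time_threaded(n, repeats=2):
    """Run count_down(n) in `repeats` threads at once and return the elapsed seconds."""
    threads = [threading.Thread(target=count_down, args=(n,)) for _ in range(repeats)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 2_000_000
    print(f"sequential: {time_sequential(n):.2f}s")
    print(f"threaded:   {time_threaded(n):.2f}s")
```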

You might start asking yourself: there are many back-end, machine learning, UI, game and other applications written in Python.
How do they achieve fast computation, or in other words, how do they run such heavily CPU-bound programs?

Then what is the solution?!

Well, you actually saw the answer at the start of this article.
The answer is to do one of the following:

Use processes, since each process has its own interpreter.
Tell CPython, from C code, to release the GIL.
When doing so, you need to take more care to write lock-free code and/or use mutexes to prevent race conditions, and also to stay away from deadlocks.
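The mutex idea is worth seeing at the Python level too, because an operation like `value += 1` is not guaranteed to be atomic even with the GIL. Here is a minimal sketch of protecting shared state with `threading.Lock`; the class `SafeCounter` and the helper `count_with_threads` are made up for this example.

```python
import threading


class SafeCounter:
    """A counter whose increments are protected by a mutex."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may run the read-modify-write below.
        with self._lock:
            self.value += 1


def count_with_threads(thread_count=4, increments=10_000):
    """Increment one shared counter from several threads and return the total."""
    counter = SafeCounter()

    def worker():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    print(count_with_threads())
```

With the lock in place, 4 threads doing 10,000 increments each always end up at 40000.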

Some of the more experienced developers might say that processes are heavy for the OS to context switch, and they are right, but for now that's the solution for Python.
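The process-based option can be sketched with the standard library's `concurrent.futures.ProcessPoolExecutor`, which is one of several ways to do it (multiprocessing.Process from the earlier example works as well). The function `sum_of_squares` and the helper `run_in_processes` are made up for this example.

```python
from concurrent.futures import ProcessPoolExecutor


def sum_of_squares(n):
    """CPU-bound work: add up i*i for every i below n."""
    return sum(i * i for i in range(n))


def run_in_processes(inputs, workers=2):
    """Compute sum_of_squares for each input in separate worker processes."""
    # Each worker process has its own interpreter and therefore its own GIL.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(sum_of_squares, inputs))


if __name__ == "__main__":
    # The guard matters: worker processes may re-import this module.
    print(run_in_processes([1_000_000, 2_000_000]))
```

Keep in mind that arguments and results are pickled and sent between processes, so this pays off mostly when the computation is much heavier than the data being passed around.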

If you really worry about that context switching, you can alter CPython and change how it uses the GIL, so that multiple threads can run in parallel.

So what is the use of threads?!

Well, in programming there are usually 2 main kinds of programs:

CPU-bound: a program that runs intensively on the CPU, making long computations or running heavy algorithms.
I/O-bound: a program that spends most of its time on I/O devices, requesting read/write operations on things such as hard drives (HDD, SSD), the keyboard, the network, etc.

Usually when we perform an I/O operation we wait quite some time for a response from that I/O device.
While a thread waits on I/O, CPython releases the GIL, so other threads can run in the meantime.
So my suggestion, based on a variety of opinions and experience from online forums: use threads for I/O operations, and go for processes for CPU-bound programs.
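To illustrate the I/O side of that suggestion, here is a sketch that uses a thread pool to overlap several waits. The function `fake_download` is a made-up stand-in that simulates I/O with time.sleep(); in real code it would be a network or disk call. The helper `download_all` is made up for this example as well.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(item_id, delay=0.2):
    """Stand-in for an I/O call; time.sleep() releases the GIL while waiting."""
    time.sleep(delay)
    return f"item-{item_id}"


def download_all(item_ids, workers=4):
    """Run fake_download for every ID using a pool of threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps the results in the same order as the inputs.
        return list(pool.map(fake_download, item_ids))


if __name__ == "__main__":
    start = time.perf_counter()
    print(download_all([1, 2, 3, 4]))
    print(f"took {time.perf_counter() - start:.2f}s")
```

Because the 4 waits overlap, the whole batch should take about as long as a single wait instead of 4 of them back to back.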

I hope you've enjoyed reading this article and that it improved your knowledge of the subject. Thank you for your time and have a great day ☺

Sources and links: