Contact: Asaf Shelly

Thread

Systems today define the basic unit of execution as a Thread. In other words, a Task in the system is represented by a thread. Based on this, every process is considered either to be a type of thread or to contain an initial thread, because a process is a Task Domain that has its private copy of computer memory, system handles, and devices. A process also starts with an initial Task. If the process is a type of thread, the application ends when this thread exits. If the process is a container of threads, the application ends after the last thread exits.

Every thread starts at its main function, which is passed as a parameter when the thread is created. When the function returns, the thread exits. The main function of the initial thread of the process is called main, or WinMain for a Windows application.
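As a minimal sketch of this idea, the following uses C++'s std::thread rather than any specific operating system API. The names worker_main and run_worker are hypothetical and chosen only for illustration.

```cpp
#include <thread>

// Hypothetical entry function: the new thread starts executing here.
void worker_main(int input, int* result) {
    *result = input * 2;   // the thread's entire "work"
}                          // returning from the entry function ends the thread

// Creates one thread, passing the entry function and its arguments at creation.
int run_worker(int input) {
    int result = 0;
    std::thread worker(worker_main, input, &result);
    worker.join();         // wait until worker_main has returned
    return result;
}
```

Calling run_worker(21) from main starts a second thread, waits for its entry function to return, and hands back the value that thread computed.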

All threads live in the domain of the process which means that:

  • Threads share the same memory address space and thus all threads that belong to the same process have full access to any data in memory. (Note that Windows CE's Thread Migration is a special case).

  • Threads share the same system handles which include file handles, device handles, and even handles to Named Synchronization Objects.

  • Threads share the same Code and Data Segments. This allows several threads to execute the same routine, call the same functions, and even jump to execute another thread's main function. It also means that threads can see the data that other threads are using and even modify it. Currently, memory page protection is implemented by the CPU for processes only, so we cannot block one thread from accessing a memory page without blocking all threads from that page.

  • Every thread has its own Stack. This is a basic element of the thread: a special storage area that is private to that thread and is therefore considered to be thread-safe. On the other hand, the stack is a segment in the process's memory map, and since threads can see the entire memory that belongs to the process, all threads can see each other's stacks and even accidentally modify them.
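The points above about shared data and visible stacks can be sketched in a few lines. This is an illustration only, assuming C++'s std::thread; shared_counter and demo_shared_memory are hypothetical names.

```cpp
#include <thread>

int shared_counter = 0;   // global: lives in the process's data segment

// Hypothetical demo: a second thread writes to a global and to a local
// variable that lives on the calling thread's stack.
int demo_shared_memory() {
    int on_my_stack = 1;                 // local variable on the caller's stack
    std::thread other([&on_my_stack] {
        shared_counter = 10;             // same address space: globals are visible
        on_my_stack = 20;                // nothing stops a write to another thread's stack
    });
    other.join();                        // join orders the writes before the reads below
    return shared_counter + on_my_stack;
}
```

The second thread needs nothing more than an address to change a variable on the first thread's stack, which is why a stack is private only by convention.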

There is one exception to the rule that threads share the same handles with other threads: MUTEX objects. A MUTEX cannot have two owners at the same time, so there is only one owner thread for a MUTEX object. All threads may use the same system handle to the MUTEX, but only one thread is the owner while it is locked. This means that the operating system (or synchronization library) can automatically clean up after a dead thread, unlocking and releasing the MUTEX object when the thread terminates unexpectedly or exits without releasing it. On Windows, the WaitForSingleObject API has a special return value (WAIT_ABANDONED) to indicate that the MUTEX was last released by the system, so the previous owner may have terminated without completing its work with the resource.

The most fundamental idea of a thread is that it keeps running without any interruption. At least that is what the thread thinks. In the real world the thread is paused to allow other threads to run. This is called a Context Switch. A context switch can happen because several threads need to run and each should get its share of time on the CPU (on a real-time OS this applies to threads of the same priority). Most often, though, the thread relinquishes the CPU of its own free will by calling Sleep or a Wait function. Waiting for any object that is not fulfilled immediately results in a Context Switch.
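The same ownership idea exists outside Windows. The sketch below is not the Windows API described above. It assumes a Linux or other POSIX system that supports robust mutexes, where a lock call returns EOWNERDEAD if the previous owner terminated while holding the mutex. The function name owner_died_while_holding is hypothetical.

```cpp
#include <cerrno>
#include <pthread.h>
#include <thread>

// Returns true if the lock call reported that the previous owner thread
// terminated while still holding the mutex.
bool owner_died_while_holding() {
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);

    pthread_mutex_t mutex;
    pthread_mutex_init(&mutex, &attr);
    pthread_mutexattr_destroy(&attr);

    // This thread becomes the owner and exits without unlocking.
    std::thread careless([&mutex] { pthread_mutex_lock(&mutex); });
    careless.join();

    int rc = pthread_mutex_lock(&mutex);   // the system hands over the lock...
    bool abandoned = (rc == EOWNERDEAD);   // ...and reports the dead owner
    if (abandoned) {
        // The protected data may be half-updated; after checking or
        // repairing it, mark the mutex as usable again.
        pthread_mutex_consistent(&mutex);
    }
    pthread_mutex_unlock(&mutex);
    pthread_mutex_destroy(&mutex);
    return abandoned;
}
```

The EOWNERDEAD result plays the same role as the special return value on Windows: the caller gets the lock, together with a warning that the resource may be in an inconsistent state.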
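A voluntary wait can be sketched as follows. This is an illustration that assumes C++'s std::condition_variable in place of an operating system Wait function; wait_for_value is a hypothetical name.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical demo: the waiter blocks until the signaler publishes a value.
int wait_for_value() {
    std::mutex m;
    std::condition_variable cv;
    bool ready = false;
    int value = 0;
    int seen = -1;

    std::thread waiter([&] {
        std::unique_lock<std::mutex> lock(m);
        // If the condition is not yet true, the thread blocks here and the
        // scheduler is free to run another thread in its place.
        cv.wait(lock, [&] { return ready; });
        seen = value;
    });

    std::thread signaler([&] {
        {
            std::lock_guard<std::mutex> lock(m);
            value = 7;
            ready = true;
        }
        cv.notify_one();   // makes the waiter runnable again
    });

    waiter.join();
    signaler.join();
    return seen;
}
```

If the signaler happens to run first, the condition is already fulfilled and the waiter does not block at all, which matches the rule that only a wait that is not fulfilled immediately gives up the CPU.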

The code has to be aware that execution could be paused for a Context Switch and then resumed. This can happen at any time, between any two CPU instructions (Assembly commands). See Atomic Operations for more about this.

Here is a simple piece of code that is executed by a thread:
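An operation that must not be split by a Context Switch has to be performed as a single indivisible step. As a minimal sketch, the following uses C++'s std::atomic, which the article itself does not mention; count_atomically is a hypothetical name.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Each thread adds 1 to a shared counter `per_thread` times.
long count_atomically(int threads, int per_thread) {
    std::atomic<long> counter{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                counter.fetch_add(1);   // read, add, and write as one indivisible step
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter.load();
}
```

Because each fetch_add is indivisible, no increment can be lost, no matter where the scheduler pauses a thread.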

A = A + 1

B = B + 1

The thread that runs this may have a Context Switch between the two lines. There is nothing wrong with that. The problem is that another thread may try to use the same resources A and B. As global variables these can be shared by several threads.

If two threads running on the same CPU (core) race each other so that the first thread is paused between the lines and then the other starts working, there is no real problem.

See this interleaving, listed in order of execution:

Thread 1:  A = A + 1
Thread 2:  A = A + 1
Thread 2:  B = B + 1
Thread 1:  B = B + 1

Here is the calculated result:

 

Statement    Result
(start)      A = 0 , B = 0
A = A + 1    A = 1
A = A + 1    A = 2
B = B + 1    B = 1
B = B + 1    B = 2

The second thread interrupted the execution of the first and then the first resumed. There was no problem here because the modification sequence was already complete.

The problem is that a Context Switch can occur inside the line, between any two CPU instructions.

Here is what happened above:

Statement    CPU instructions    Value in memory
A = A + 1    Read A              A = 0
             Add 1
             Write A             A = 1
A = A + 1    Read A              A = 1
             Add 1
             Write A             A = 2
B = B + 1    Read B              B = 0
             Add 1
             Write B             B = 1
B = B + 1    Read B              B = 1
             Add 1
             Write B             B = 2

The Context Switch can come anywhere between any two instructions on the middle column. Here is an example:

Statement    CPU instructions    Value in memory
A + 1        Read A              A = 0
             Add 1
A = A + 1    Read A              A = 0
             Add 1
             Write A             A = 1
B = B + 1    Read B              B = 0
             Add 1
             Write B             B = 1
A =          Write A             A = 1
B = B + 1    Read B              B = 1
             Add 1
             Write B             B = 2

The end result is that A is set to 1 and B is set to 2: one of the two increments of A is lost. This is because the CPU performs the addition on an internal Register. The CPU reads the data from memory into the Register, performs the addition there, and then writes the data back to memory. If one thread reads the data from memory between another thread's read and write, we have a synchronization problem.

The problem stems from the basic definition of a thread as a system object that has its own copy of the stack and its own copy of the CPU registers, state, and data. The idea of a thread is a basic execution unit that 'thinks' it has its own private CPU, just as a process 'thinks' it has its own memory address space even though it is only Virtual Memory and the real memory is shared with other processes on the system.
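The interleaving described here can be replayed deterministically. The sketch below does not use real threads. It is a single-threaded simulation in which each simulated thread has its own private register, and the steps are executed in exactly the order given above. All names (Memory, SimThread, replay_interleaving) are hypothetical.

```cpp
#include <utility>

// Shared memory, visible to both simulated threads.
struct Memory { int A = 0; int B = 0; };

// Each simulated thread has a private register, just as a real thread
// has its own saved copy of the CPU registers.
struct SimThread { int reg = 0; };

// Replays the interleaving step by step and returns the final (A, B).
std::pair<int, int> replay_interleaving() {
    Memory mem;
    SimThread t1, t2;

    // Thread 1 starts A = A + 1 ...
    t1.reg = mem.A;                                // Read A  (register holds 0)
    t1.reg += 1;                                   // Add 1   (register holds 1)
    // ... Context Switch: thread 2 runs both statements to completion.
    t2.reg = mem.A; t2.reg += 1; mem.A = t2.reg;   // A = A + 1, memory A = 1
    t2.reg = mem.B; t2.reg += 1; mem.B = t2.reg;   // B = B + 1, memory B = 1
    // Context Switch back: thread 1 resumes with its stale register.
    mem.A = t1.reg;                                // Write A: memory A = 1 again
    t1.reg = mem.B; t1.reg += 1; mem.B = t1.reg;   // B = B + 1, memory B = 2
    return {mem.A, mem.B};
}
```

The function returns A = 1 and B = 2: thread 2's update to A is overwritten by the value thread 1 computed before it was paused.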

Every thread therefore has:

  • Private Stack

  • Private copy of CPU registers (including instruction pointer)

  • Private CPU state

  • Private CPU data

  • Private system resources

  • The thread itself is a system object

  • An owner process to which it belongs and in whose domain it lives

  • Execution Priority and Quantum

  • System Synchronization Queue

The System Synchronization Queue holds a list of synchronization objects that can be waited for, for example the list of pending APCs (see APC for more information).

The system or library can also implement a user input queue so that threads can communicate with each other. Windows creates and attaches such a queue the first time the thread waits for an event on the queue by calling an API function such as GetMessage or PeekMessage. Other operating systems and libraries may require that we explicitly create and attach the thread's event queue.

It is possible to set the thread's Affinity Mask to make it run on a specific CPU and CPU Core (on Windows, through the SetThreadAffinityMask API).

A thread's Priority and Affinity are usually inherited from the process to which it belongs. This can mean inheriting the default settings. It can be a limitation, for example restricting the thread's Affinity to a subset of the process's. Or it can mean that the process supplies a base priority, and any value set on the thread is relative to that base.
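Where the queue has to be created explicitly, it can be as small as the following sketch. This is not the Windows message queue. It is a hypothetical MessageQueue class built on C++'s std::mutex and std::condition_variable, and the quit value 0 is an assumption made for this example.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical minimal event queue owned by one receiving thread.
class MessageQueue {
public:
    void post(int msg) {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push(msg);
        }
        cv_.notify_one();   // wake the receiver if it is waiting
    }
    int get() {             // blocks until a message is available
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty(); });
        int msg = q_.front();
        q_.pop();
        return msg;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<int> q_;
};

// The receiver thread sums incoming messages until it sees the quit value 0.
int sum_posted_messages() {
    MessageQueue queue;
    int sum = 0;
    std::thread receiver([&] {
        for (int msg = queue.get(); msg != 0; msg = queue.get()) sum += msg;
    });
    queue.post(1);
    queue.post(2);
    queue.post(3);
    queue.post(0);          // ask the receiver to stop
    receiver.join();
    return sum;
}
```

The receiving loop has the same shape as a message loop: it blocks in get while the queue is empty and handles one message at a time.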
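As a rough sketch of setting affinity within the limits of the owning process, the following pins the calling thread to a single CPU. The Windows branch uses GetProcessAffinityMask and SetThreadAffinityMask. The other branch assumes Linux and uses sched_getaffinity and sched_setaffinity as the closest equivalent. The helper name pin_current_thread_to_first_allowed_cpu is hypothetical.

```cpp
#if defined(_WIN32)
#include <windows.h>
#else
#include <sched.h>
#endif

// Pins the calling thread to the lowest-numbered CPU it is currently
// allowed to run on. Returns true on success.
bool pin_current_thread_to_first_allowed_cpu() {
#if defined(_WIN32)
    DWORD_PTR process_mask = 0, system_mask = 0;
    if (!GetProcessAffinityMask(GetCurrentProcess(), &process_mask, &system_mask) ||
        process_mask == 0) {
        return false;
    }
    // Isolate the lowest set bit: the thread mask must be a subset of the
    // process mask, otherwise the call fails.
    DWORD_PTR lowest_bit = process_mask & (~process_mask + 1);
    return SetThreadAffinityMask(GetCurrentThread(), lowest_bit) != 0;
#else
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    // A pid of 0 means "the calling thread".
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0) return false;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &allowed)) {
            cpu_set_t only_this;
            CPU_ZERO(&only_this);
            CPU_SET(cpu, &only_this);
            return sched_setaffinity(0, sizeof(only_this), &only_this) == 0;
        }
    }
    return false;
#endif
}
```

Choosing the CPU from the set that is already allowed, instead of hard-coding CPU 0, keeps the request inside whatever limit the process or its environment imposes.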