In serial Sather there is only one thread of execution; in pSather there may be many. Multiple threads are similar to multiple serial Sather programs executing concurrently, except that threads share the variables of a single namespace. A new thread is created by executing a fork, which may be a par or fork statement (see par and fork statements), a parloop statement (see parloop statement), or an attach (see Attach statement). The new thread is a child of the forking thread, which is the child's parent. pSather provides operations that can block a thread, making it unable to execute statements until some condition occurs. pSather threads that are not blocked will eventually run, but there is no other constraint on the order of execution of statements between threads that are not blocked. Threads no longer exist once they terminate. When a pSather program begins execution it has a single thread corresponding to the main routine (see main).
Serial Sather defines a total order of execution of the program's statements; in contrast, pSather defines only a partial order. This partial order is defined by the union of the constraints implied by the consecutive execution order of statements within single threads and pSather synchronization operations between statements in different threads. As long as this partial order appears to be observed it is possible for a pSather implementation to overlap multiple operations in time, so a child thread may run concurrently with its parent and with other children. Using threads may render programs nondeterministic. Preconditions, postconditions, and class invariants (See Safety Features) may not work as intended when originally serial code is used with multiple threads.
The threaded extension may be implemented without the synchronization extension. This is only useful with data parallel code, in which it is not possible for threads to affect each other through side effects. Platforms may interpret such data parallelism in different ways, such as an opportunity for vectorization, or by executing only one 'thread' at a time.
Example 16-1. Example:
| par fork ... end end | 
| fork_statement ==> fork statement_list end | 
| par_statement ==> par statement_list end | 
Threads may be created with the fork statement, which must be syntactically enclosed in a par statement, which also implicitly creates a thread. When a fork statement is executed, it forks a body thread to execute the statements in its body. Local variables that are declared outside the body of the innermost enclosing par statement are shared among all threads in the par body. All threads created by a fork must complete before execution continues past the par. The rules for memory consistency apply to body threads, so they may not see a consistent picture of the shared variables unless they employ explicit synchronization (See Memory consistency).
Each body thread receives a unique copy of every local declared in the innermost enclosing par body. When body threads begin, these copies have the value that the locals did at the time the fork statement was executed. Changes to a thread's copy of these variables are never observed by other threads. Iterators may not occur in a fork or par statement unless they are within an enclosed loop. 'quit', 'yield', and 'return' are not permitted in a par or fork body.
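The iterator rule and the copy-at-fork rule can be seen together in a loop that forks one body thread per iteration. The following is a minimal sketch, assuming the serial Sather iterator `upto!` and the `#OUT` output idiom; the printed text is purely illustrative.

```sather
par
   loop
      -- The iterator call is permitted because it sits inside a loop
      -- that is itself enclosed in the par body.
      i ::= 0.upto!(3);
      fork
         -- Each body thread has its own copy of i, holding the value
         -- i had when this fork statement was executed.
         #OUT + "body for i = " + i + "\n";
      end
   end
end
```

Because the body threads are not ordered with respect to one another, the lines may be printed in any order.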
As a generalization of serial Sather, it is a fatal error if an exception occurs in a thread which is not handled within that thread by some protect statement. Because par and fork bodies are executed as separate threads, an unhandled exception in their bodies is a fatal error.
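To keep an exception from being fatal, it has to be caught inside the body thread that raises it. The sketch below assumes that the serial Sather protect statement is used unchanged inside a fork body; `do_work` is a hypothetical routine introduced only for illustration.

```sather
par
   fork
      protect
         do_work;   -- hypothetical routine that may raise an exception
      when $STR then
         -- Handled here, inside the body thread that raised it.
         #ERR + "fork body failed: " + exception.str + "\n";
      end
   end
end
```

An exception whose type is not matched by any `when` clause is still unhandled within the thread, and so remains a fatal error.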
NOTE: The reason for having an implicit thread associated with the par statement is subtly related to the semantics of the sync statement, which requires synchronizing with all executing threads attached to the gate associated with the par statement. A potential problem is that some parloop iterations (more generally, the bodies of any enclosed forks) could reach a sync statement before all parloop iterations have been initiated and attached to the controlling gate. Threads that were forked earlier might then erroneously avoid synchronization, since their synchronization partner threads have not yet been started or attached to the par gate. This problem is avoided because the par statement itself has an implicit thread responsible for executing the sequential portion of the par body, which includes the creation and attaching of all the syntactically enclosed fork bodies. Thus, none of the threads forked within the par body may pass a sync until the thread associated with the sequential portion of the par has completed (unless there is a sync within the sequential portion of the par).
Example 16-2. In this code A and B can execute concurrently. After both A and B complete, C and D can execute concurrently. E must wait for A, B, C, and D to terminate before executing.
| par
   par
      fork A end;
      B
   end;
   fork C end;
   D
end;
E | 
Example 16-3. In this code, 'outer' is declared outside the par, so this variable is shared by the forked thread. However, because 'inner' is inside the par, the fork body receives its own local copy at the time of the fork.
| outer:INT;
par
   inner:INT;
   fork
      -- fork body
   end
end |
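The same skeleton with assignments filled in may make the difference between the shared and the copied variable more concrete. This is an illustrative sketch only; the initial values and assignments are assumptions added here and are not part of Example 16-3.

```sather
outer:INT := 1;            -- declared outside the par: shared
par
   inner:INT := 10;        -- declared in the par body: copied at each fork
   fork
      -- This body thread starts with its own copy, inner = 10.
      inner := inner + 1;  -- changes only this thread's copy
      outer := 2;          -- writes the shared variable; when other threads
                           -- observe the write is governed by the memory
                           -- consistency rules
   end;
   inner := 20;            -- never observed by the body thread forked above
end
-- Execution continues here only after the forked thread has completed.
```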