Multithreading

This collection of routines lets you create multiple, independent threads.

Each thread has its own current statement being executed, its own subroutine call stack, and its own set of private variables. The local (file-level) and global variables of a program are shared amongst all threads, but require careful locking. Threads can run at the same time as other threads, and thread switching can occur at any point in the control flow.

The single advantage that threads have over the simpler multitasking is that a thread can wait for a lock, event, signal, notification, or network access to complete without stalling any other threads. In multitasking any of those will stall the entire application, because the offending task does not (and cannot) call task_yield() while it is itself in such a stalled wait state. If short timeouts and frequent yielding are not enough, and your application still feels sluggish and unresponsive, you may have to resort to multithreading. Using multithreading unnecessarily, on the other hand, is just plain foolish.

Multithreading is widely considered the most difficult general purpose thing a programmer ever has to master, though there are of course far harder domain-specific problems. The following rules apply when programming with threads.
  • Extensive locking is required on anything that can be modified by more than one thread. In phix this includes the hidden reference counts on literal constants, and private variables that have been initialised from or contain copies of anything similar. You may need to limit the number of strings, sequences, and floats that are shared between threads, to prevent that getting too painful.
  • All GUI updates must be performed by the main thread. If a worker thread wants an update to appear in the GUI, it should send an appropriate (user-defined) message to the main thread rather than attempt to perform the update itself.
  • Debugging is typically at least ten times harder in multithreaded code than non-multithreaded code. If at all possible, new code should be written in a (slow) single threaded test harness environment and converted to use multithreading only when performance issues compel it, though I freely admit in many cases that may simply not be practical.
  • Certain programming techniques (on Windows) pretty much demand the use of threads, for example FileDirectoryChangeNotification and full drag and drop. Some bits of the latter are quite easy but, trust me, a full COM-based implementation is decidedly non-trivial: try typing "catch22 drag and drop" into your favourite search engine and reading the six-part tutorial.
  • When adding multithreading to an application, the entire code base needs to be made thread-safe, including any existing multitasking. It should be sufficient to avoid sharing any references to variables or data between the new code and the existing non-thread-safe code. As above that includes literal constants such as "hello" and 3.14159265358979323846.
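
As a minimal sketch of the locking rule above, the following has two worker threads increment one shared file-level counter, with every update wrapped in a critical section. The names counter, counter_cs, worker and PER_THREAD are purely illustrative. The call shapes (create_thread taking a routine_id and a parameter sequence, wait_thread taking a sequence of thread handles) are assumed to match the demos in demo\rosetta, so check the individual routine pages before relying on them.

```phix
-- Sketch only: all names below are hypothetical, not part of any library.
constant PER_THREAD = 10000
constant counter_cs = init_cs()     -- one critical section guards counter
integer counter = 0                 -- file-level, hence shared by all threads

procedure worker()
    for i=1 to PER_THREAD do
        enter_cs(counter_cs)        -- waits here if the other worker holds the lock
        counter += 1                -- only one thread at a time executes this
        leave_cs(counter_cs)
    end for
    exit_thread(0)
end procedure

constant threads = {create_thread(routine_id("worker"),{}),
                    create_thread(routine_id("worker"),{})}
wait_thread(threads)                -- block until both workers have finished
delete_cs(counter_cs)
printf(1,"counter = %d\n",counter)
```

Note that only integers are shared or passed between the threads here, which sidesteps the hidden reference counts on strings, sequences, and floats mentioned above.
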
A good example of a program that needed multithreading was the File Manager I developed at Online50. The basic idea was that a business could securely share its accounting data 24/7 with its accountants and other branches. Most of that was done through the sage50 accounting package; the File Manager was like Windows File Explorer, but worldwide. A remote client, anywhere in the world, would log in to the Online50 server, in London, and the File Manager would display:
  • local files on the hard drive(s) of the server
  • remote files on the hard drive(s) of the client
  • management info from a network attached database
Without multithreading, the program would be completely unresponsive until all this information had been collected, which ranged from 15 seconds to several minutes. The real killer was the client files, which would take at least 10 or 20 seconds per drive, longer if the client had their own network attached drives, and each would have to be loaded sequentially.

With multithreading, the program would appear and be responsive immediately, with the local files appearing in a fraction of a second, management info in 2 to 5 seconds, and the client directories appearing soon after. Each request was initiated simultaneously, hence the total loading time depended on the slowest component, rather than the sum of all the component loading times. Even for the worst use case (a user on the other side of the planet accessing network attached files via a server in London) it was at least 3 or 4 times faster, and most of the time it felt a thousand times faster.
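
The "slowest component, rather than the sum" effect can be sketched as below, with sleep() standing in for blocking i/o. The load routine and the delays of 1, 3 and 2 seconds are invented for illustration, and the create_thread and wait_thread call shapes are assumed as in the demo\rosetta examples.

```phix
-- Sketch only: load() and its delays are hypothetical stand-ins for real i/o.
procedure load(integer seconds)
    sleep(seconds)      -- pretend to wait on a disk, network, or database
    exit_thread(0)
end procedure

atom t0 = time()
sequence threads = {create_thread(routine_id("load"),{1}),  -- local files
                    create_thread(routine_id("load"),{3}),  -- client files
                    create_thread(routine_id("load"),{2})}  -- management info
wait_thread(threads)    -- returns once the slowest loader has finished
printf(1,"elapsed: %3.1f seconds\n",time()-t0)
```

Because the three waits overlap, one would expect the elapsed time to be close to the longest delay (about 3 seconds) rather than the 6 seconds that loading sequentially would take.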

In contrast, multitasking would have been completely ineffective, since each task would stall inside blocked i/o without being able to invoke task_yield() to let other things carry on. At the very least, the program would have been significantly complicated by attempts to use asynchronous i/o, which is effectively the same thing as multithreading anyway.

The files demo\rosetta\AtomicUpdates.exw and demo\rosetta\DiningPhilosophers.exw contain examples of use.

init_cs - create a new critical section
delete_cs - delete a critical section
enter_cs - begin mutually exclusive execution
try_cs - begin mutually exclusive execution, but only if the critical section lock can be obtained instantly
leave_cs - end mutually exclusive execution
create_thread - create a new thread
suspend_thread - suspend execution of a thread
resume_thread - resume execution of a thread
exit_thread - terminate execution of the current thread
get_thread_exitcode - get the exit code of a thread, or STILL_ACTIVE
wait_thread - wait for a thread or set of threads to terminate
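
The rough sketch below shows how try_cs and get_thread_exitcode might be combined so that the main thread never blocks. The names job and busy_cs and the exit code 42 are hypothetical. The signatures are assumptions inferred from the one-line summaries above: try_cs returning true or false, and get_thread_exitcode taking a thread handle and returning STILL_ACTIVE while the thread is running. Consult the individual routine pages for the definitive forms.

```phix
-- Sketch only: job, busy_cs and the exit code 42 are hypothetical.
procedure job()
    sleep(0.5)              -- stand-in for some real work
    exit_thread(42)         -- exit code made available to other threads
end procedure

constant busy_cs = init_cs()
atom hThread = create_thread(routine_id("job"),{})

if try_cs(busy_cs) then     -- never waits: true only if the lock was free
    -- ... brief exclusive work goes here ...
    leave_cs(busy_cs)
end if

while get_thread_exitcode(hThread)=STILL_ACTIVE do
    sleep(0.1)              -- the main thread could do other work here
end while
printf(1,"exit code: %d\n",get_thread_exitcode(hThread))
delete_cs(busy_cs)
```

Where nothing useful can be done in the meantime, a plain wait_thread(hThread) is simpler than polling.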