Sunday, March 29, 2009

Software transactional memory

Software transactional memory (STM) applies the concept of database transactions to program variables. In a simple implementation, a thread executes a transaction without regard to concurrent threads, recording what it reads and what it changes as it proceeds. At the end of the transaction, it checks whether any of the variables it accessed were concurrently altered. If they were, it rolls back all of its changes and retries the transaction from the start. If they were not, the transaction succeeds and its changes become permanent.
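To make the scheme concrete, here is a minimal sketch in Python. It is not any particular STM library: the names `TVar`, `Transaction` and `atomically` are my own, writes are buffered until commit (so "rolling back" just means discarding the buffer), and a single global lock serializes the commit step, which is a simplification a real implementation would try hard to avoid.

```python
import threading

class TVar:
    """A transactional variable: a value plus a version counter (hypothetical name)."""
    def __init__(self, value):
        self.value = value
        self.version = 0

# Assumption for brevity: one global lock serializes all commits.
_commit_lock = threading.Lock()

class Transaction:
    def __init__(self):
        self.reads = {}   # TVar -> version seen at first read
        self.writes = {}  # TVar -> value to install on commit

    def read(self, tvar):
        if tvar in self.writes:          # see our own pending write
            return self.writes[tvar]
        # Record the version before reading the value, so a concurrent
        # commit in between is caught at validation time.
        self.reads.setdefault(tvar, tvar.version)
        return tvar.value

    def write(self, tvar, value):
        self.writes[tvar] = value        # buffered, invisible to other threads

    def try_commit(self):
        with _commit_lock:
            # Validate: nothing we read may have changed since we read it.
            for tvar, seen in self.reads.items():
                if tvar.version != seen:
                    return False         # conflict: discard buffered writes
            for tvar, value in self.writes.items():
                tvar.value = value
                tvar.version += 1
            return True

def atomically(body):
    """Run body(tx) until it commits without a conflict."""
    while True:
        tx = Transaction()
        result = body(tx)
        if tx.try_commit():
            return result

# Usage: move 30 units from a to b as one transaction.
a, b = TVar(100), TVar(0)

def transfer(tx):
    tx.write(a, tx.read(a) - 30)
    tx.write(b, tx.read(b) + 30)

atomically(transfer)
```

Note that nothing stops `body` from running on a mix of old and new values before `try_commit` gets a chance to reject it.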

I don't like this naive approach because it gives up consistency. Because other transactions may change a transaction's inputs while it is still running, it may see a mix of old and new values that never existed together, triggering foolish actions. The Wikipedia article gives an example of an inconsistency that sends one thread into an infinite loop, from which there is no good or general way to recover.
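Here is a small, deterministic illustration of that kind of inconsistency, in the spirit of the Wikipedia example. The invariant is x == y. Rather than rely on thread timing, the sketch places the writer's commit by hand between the reader's two reads; the names are made up for the illustration.

```python
# Shared state with the invariant x == y.
shared = {"x": 0, "y": 0}

def writer_commit():
    """A complete writer transaction: bump both variables together."""
    shared["x"] += 1
    shared["y"] += 1

def reader_with_interleaving():
    """Reader transaction with the writer's commit forced between its reads."""
    x_seen = shared["x"]   # read x before the writer commits
    writer_commit()        # hand-placed interleaving, standing in for another thread
    y_seen = shared["y"]   # read y after the writer commits
    return x_seen, y_seen

x_seen, y_seen = reader_with_interleaving()
inconsistent = x_seen != y_seen
print(x_seen, y_seen, inconsistent)
```

The reader would be rejected when it tried to commit, but it has already acted on a state that never existed. If its body were "loop forever while x != y", it would never reach the commit check at all.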

Consistency could be provided by the standard means used by relational databases—snapshot copies, global version counters, etc.—but the cost of locking on large numbers of processors would be prohibitive.

STM strikes me as vulnerable to hijinks, since the first transaction to finish wins. A network attacker simply has to identify a fast, innocuous transaction that writes to the same data used by an important transaction that takes a long time to complete. The attacker can prevent the slow transaction from ever completing by sending requests at intervals slightly shorter than the slow transaction's running time. One could say "Well, that is the system engineer's fault for putting an atomic { ... } clause around such a long process", but that dodges two facts: (1) mistakes happen when complicated systems are being extended, and (2) any slowness larger than network jitter opens the vulnerability.
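The arithmetic of the attack can be checked with a toy model. The sketch below is a simplification of my own, not a measurement of any real system: time is in integer ticks, the attacker's fast transaction commits at every multiple of `attack_period`, and the slow transaction aborts if any such commit lands during its run, then retries immediately.

```python
def long_tx_commits(duration, attack_period, horizon):
    """Return the start time of the first attempt of the slow transaction
    that commits, or None if no attempt commits before `horizon`.

    Toy model: attacker commits happen at attack_period, 2*attack_period, ...
    An attempt starting at `start` aborts if an attacker commit falls in
    the interval (start, start + duration].
    """
    start = 0
    while start + duration <= horizon:
        end = start + duration
        # Attacker commits at times <= t number t // attack_period.
        conflicts = end // attack_period - start // attack_period
        if conflicts == 0:
            return start
        start = end  # conflict found at commit time: retry from scratch
    return None

print(long_tx_commits(duration=10, attack_period=9, horizon=10_000))
print(long_tx_commits(duration=10, attack_period=11, horizon=10_000))
```

In this model, an attacker period even one tick shorter than the slow transaction guarantees a conflict in every attempt, so the slow transaction never commits however long it keeps retrying.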

The more I think about concurrency on computers with many cores, the more sensible message passing seems. A datum lives on a single core, or in a small group of cores, and all manipulation of it is by requests that travel to the datum's personal hardware. Global synchronization across the many-core computer is just too expensive.
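A rough sketch of the message-passing style, again in Python. A thread stands in for the datum's "personal hardware" (Python threads do not map onto cores this way, so this only shows the shape of the idea), and the class name and request vocabulary are invented for the example. Only the owner thread ever touches the value; everyone else sends it requests.

```python
import queue
import threading

class CounterOwner:
    """Owns one integer; only its own thread reads or writes it."""
    def __init__(self):
        self._inbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        value = 0  # the datum lives only inside this thread
        while True:
            op, arg, reply = self._inbox.get()
            if op == "stop":
                reply.put(value)
                return
            if op == "add":
                value += arg
            reply.put(value)  # "get" falls through and just reports the value

    def request(self, op, arg=0):
        """Send a request to the owner and wait for its reply."""
        reply = queue.Queue(maxsize=1)
        self._inbox.put((op, arg, reply))
        return reply.get()

owner = CounterOwner()
owner.request("add", 5)
owner.request("add", 7)
total = owner.request("get")
```

Requests are handled one at a time in arrival order, so there is nothing to validate and nothing to retry; the cost is a round trip to the owner for every operation.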
