Category Archives: Programming Languages

Hot off the Presses (Dthreads & Sheriff)

Camera-ready versions of recent pubs from our research group:
SOSP 2011: Dthreads: Efficient and Deterministic Multithreading, Tongping Liu, Charlie Curtsinger, and Emery Berger.
OOPSLA 2011: Sheriff: Precise Detection and Automatic Mitigation of False Sharing, Tongping Liu and Emery Berger.

Dthreads: Efficient Deterministic Multithreading

Dthreads: Efficient Deterministic Multithreading,
Tongping Liu, Charlie Curtsinger, Emery D. Berger
SOSP 2011

[paper (PDF)] [source code] [YouTube video of presentation] [PPT slides]

Multithreaded programming is notoriously difficult to get right. A key problem is non-determinism, which complicates debugging, testing, and reproducing errors. One way to simplify multithreaded programming is to enforce deterministic execution, but current deterministic systems for C/C++ are incomplete or impractical. These systems require program modification, do not ensure determinism in the presence of data races, do not work with general-purpose multithreaded programs, or run up to 8.4× slower than pthreads.

This paper presents Dthreads, an efficient deterministic multithreading system for unmodified C/C++ applications that replaces the pthreads library. Dthreads enforces determinism in the face of data races and deadlocks. Dthreads works by exploding multithreaded applications into multiple processes, with private, copy-on-write mappings to shared memory. It uses standard virtual memory protection to track writes, and deterministically orders updates by each thread. By separating updates from different threads, Dthreads has the additional benefit of eliminating false sharing. Experimental results show that Dthreads substantially outperforms a state-of-the-art deterministic runtime system, and for a majority of the benchmarks evaluated here, matches and occasionally exceeds the performance of pthreads.
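
To make the mechanism concrete, here is a minimal sketch in C of the building block the abstract mentions: using standard virtual memory protection to track which pages get written. This is an illustration of the general technique, not Dthreads itself; all the names are made up, and a real system would do this per thread-turned-process and then commit the dirty pages in a deterministic order.

#include <signal.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define REGION_PAGES 16

static char *region;
static size_t page_size;
static int dirty[REGION_PAGES];   /* which pages were written this "turn" */

static void on_write_fault(int sig, siginfo_t *info, void *ctx) {
    (void)sig; (void)ctx;
    char *addr = (char *)info->si_addr;
    if (addr >= region && addr < region + REGION_PAGES * page_size) {
        size_t page = (size_t)(addr - region) / page_size;
        dirty[page] = 1;
        /* Re-enable writes to this page so the faulting store can retry. */
        mprotect(region + page * page_size, page_size, PROT_READ | PROT_WRITE);
    } else {
        _exit(1);   /* a genuine segfault somewhere else */
    }
}

int main(void) {
    page_size = (size_t)sysconf(_SC_PAGESIZE);
    region = mmap(NULL, REGION_PAGES * page_size, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    struct sigaction sa = {0};
    sa.sa_sigaction = on_write_fault;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, NULL);

    /* Start of a synchronization interval: write-protect everything. */
    mprotect(region, REGION_PAGES * page_size, PROT_READ);

    region[0] = 'x';                  /* faults once, marks page 0 dirty */
    region[5 * page_size + 7] = 'y';  /* faults once, marks page 5 dirty */

    for (int i = 0; i < REGION_PAGES; i++)
        if (dirty[i])
            printf("page %d was modified\n", i);
    return 0;
}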

 Related post: SHERIFF: Precise Detection and Automatic Mitigation of False Sharing

SHERIFF: Precise Detection and Automatic Mitigation of False Sharing

Sheriff: Precise Detection and Automatic Mitigation of False Sharing
Tongping Liu, Emery D. Berger, OOPSLA 2011

False sharing is an insidious problem for multithreaded programs running on multicore processors, where it can silently degrade performance and scalability. Previous tools for detecting false sharing are severely limited: they cannot distinguish false sharing from true sharing, have high false positive rates, and provide limited assistance to help programmers locate and resolve false sharing.
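
For readers who haven't run into it, here is a minimal, hypothetical example of false sharing in C: two threads update logically independent counters that happen to sit on the same cache line, so every update forces the other core's copy of that line to be invalidated even though the threads never touch the same data.

#include <pthread.h>
#include <stdio.h>

/* Both counters fit in a single 64-byte cache line, so updates from the two
 * threads ping-pong the line between cores. Padding or aligning each counter
 * to its own cache line removes the false sharing, at the cost of memory. */
struct counters {
    long a;   /* written only by thread 1 */
    long b;   /* written only by thread 2, but shares a's cache line */
};

static struct counters c;

static void *bump_a(void *arg) {
    (void)arg;
    for (long i = 0; i < 100000000L; i++) c.a++;
    return NULL;
}

static void *bump_b(void *arg) {
    (void)arg;
    for (long i = 0; i < 100000000L; i++) c.b++;
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, bump_a, NULL);
    pthread_create(&t2, NULL, bump_b, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("a=%ld b=%ld\n", c.a, c.b);
    return 0;
}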

This paper presents two tools that attack the problem of false sharing: SHERIFF-DETECT and SHERIFF-PROTECT. Both tools leverage a framework we introduce here called SHERIFF. SHERIFF breaks out threads into separate processes, and exposes an API that allows programs to perform per-thread memory isolation and tracking on a per-page basis. We believe SHERIFF is of independent interest.

SHERIFF-DETECT finds instances of false sharing by comparing updates within the same cache lines by different threads, and uses sampling to rank them by performance impact. SHERIFF-DETECT is precise (no false positives), runs with low overhead (on average, 20%), and is accurate, pinpointing the exact objects involved in false sharing. We present a case study demonstrating SHERIFF-DETECT’s effectiveness at locating false sharing in a variety of benchmarks.

Rewriting a program to fix false sharing can be infeasible when source is unavailable, or undesirable when padding objects would unacceptably increase memory consumption or further worsen runtime performance. SHERIFF-PROTECT mitigates false sharing by adaptively isolating shared updates from different threads into separate physical addresses, effectively eliminating most of the performance impact of false sharing. We show that SHERIFF-PROTECT can improve performance for programs with catastrophic false sharing by up to 9x, without programmer intervention.

I’m a Mac (or, “Emery Inside”)

I’m a Mac (though I prefer John Hodgman)

I used to be a PC guy, but have completely gone Mac (MacBook Air, Mac Mini, iPhone, iPad, Jobs Distortion Field Glasses, etc.). But Mac went Emery before Emery went Mac! Proof below:

From http://www.opensource.apple.com/source/Libc/Libc-594.9.1/gen/magazine_malloc.c:

/*
Multithread enhancements for “tiny” allocations introduced February 2008.
These are in the spirit of “Hoard”. See:
Berger, E.D.; McKinley, K.S.; Blumofe, R.D.; Wilson, P.R. (2000).
“Hoard: a scalable memory allocator for multithreaded applications”.
ACM SIGPLAN Notices 35 (11): 117-128. Berger2000.

Retrieved on 2008-02-22.
*/

Winning the War on Bugs

This is a draft version of an article to appear in our departmental newsletter, Significant Bits (with links added).

Nearly all software ships with known bugs, and more lurk in the code waiting to be discovered. Some bugs are benign; for example, a page might not display correctly in a browser. But more serious bugs cause programs to crash unexpectedly or leave them vulnerable to attack by hackers. These bugs are difficult for programmers to find and fix. Even when a bug is critical and security-sensitive, an average of one month passes between the initial bug report and the delivery of a patch.

Rather than waiting for programmers to fix their bugs, or for hackers to find and exploit them, Professor Emery Berger’s group is designing systems to make software bug-proof. These systems allow buggy programs to run correctly, make them resistant to attack, and even automatically find and fix certain bugs. This work, developed jointly with Ben Zorn at Microsoft Research, was an important influence on the design of the Fault-Tolerant Heap that today makes Windows 7 more resistant to errors.

Defending Against Bugs

Berger and Zorn first developed an error-resistant system called DieHard, inspired by the film featuring Bruce Willis as an unstoppable cop.

DieHard attacks the widespread problem of memory errors. Programs written in the C and C++ programming languages – the vast majority of desktop, mobile, and server applications – are susceptible to memory errors. These bugs can lead to crashes, erroneous execution, and security vulnerabilities, and are notoriously costly to repair.

Berger uses a real-estate analogy to explain the problem of memory errors. Almost everything done on a computer uses some amount of memory—each graphic on an open Web page, for example—and when a program is running, it is constantly “renting houses” (chunks of memory) to hold each item, and putting them back on the market when they are no longer needed. Each “house” has only enough square footage for a certain number of bytes.

Programmers can make a wide variety of mistakes when managing their memory. They can unwittingly rent out houses that are still occupied (a dangling pointer error). They can ask for less space than they need, so items will spill over into another “house” (a buffer overflow). A program can even place a house up for rent multiple times (a double free), or try to rent out a house that doesn’t exist (an invalid free), leading to havoc when the renter shows up. These mistakes can make programs suddenly crash, or worse: they can make a computer exploitable by hackers.
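
For the programmers in the audience, here is what a few of those mistakes look like in real C code. This is a deliberately buggy sketch for illustration only:

#include <stdlib.h>
#include <string.h>

int main(void) {
    char *buf = malloc(8);
    strcpy(buf, "this string is far too long");  /* buffer overflow: spills past 8 bytes */

    free(buf);
    buf[0] = 'x';   /* dangling pointer: writing to a "house" already back on the market */
    free(buf);      /* double free: putting the same house up for rent twice */

    int x;
    free(&x);       /* invalid free: this "house" was never rented from the heap */
    return 0;
}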

The way “addresses” are assigned also makes computers vulnerable. Houses (memory locations) with especially desirable valuables, like passwords, will always be on the same lot on the same street. If hackers can locate a password once, they can easily locate the password’s address on anyone’s version of the same program.

DieHard attacks these problems in several ways. First, it completely prevents certain memory errors, like double and invalid frees, from having any effect. DieHard keeps important information, like which houses are rented and which are not (heap metadata), out of a hacker’s reach. Most importantly, DieHard randomly assigns addresses—a password that has a downtown address in one session may be in the suburbs next time around. This randomization not only adds security but also increases resilience to errors, reducing the odds that dangling pointer errors or small or moderate overflows will have any effect.

Exterminating the Bugs

While Professor Berger is more than pleased that the DieHard work has influenced the Windows 7 Fault-Tolerant Heap, he hopes that Microsoft will also adopt the technology that he, Zorn, and his Ph.D. student Gene Novark designed next, called Exterminator. Exterminator not only finds errors but also automatically fixes them. It uses a variant of DieHard (called DieFast) that constantly scans memory looking for signs of errors. DieFast places “canaries” – specific random numbers – in unused memory. Just as in a coal mine, a “dead” canary means trouble. When DieFast discovers a dead canary, it triggers a report containing the state of memory.
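
The canary idea itself is simple enough to sketch in a few lines of C. This is only an illustration of the general technique, not DieFast's implementation; in particular, the fixed value below stands in for the random canary a real system would choose.

#include <stdint.h>
#include <stdio.h>

#define CANARY 0xDEADBEEFCAFEBABEULL   /* DieFast would use a random value instead */

static void fill_canary(uint64_t *words, size_t n) {
    for (size_t i = 0; i < n; i++) words[i] = CANARY;
}

static int canary_intact(const uint64_t *words, size_t n) {
    for (size_t i = 0; i < n; i++)
        if (words[i] != CANARY) return 0;   /* a "dead" canary: someone wrote here */
    return 1;
}

int main(void) {
    uint64_t freed_slot[4];
    fill_canary(freed_slot, 4);       /* memory that should stay untouched */

    freed_slot[2] = 42;               /* simulate a stray write through a dangling pointer */

    if (!canary_intact(freed_slot, 4))
        printf("dead canary: possible memory error; capture a heap report here\n");
    return 0;
}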

Exterminator next applies forensic analysis to these reports. With information gleaned from several users running a buggy program, Exterminator can pinpoint the source and location of memory errors. From that point on, Exterminator protects the program from that error by “padding” buggy memory requests to prevent overflows, and delaying premature relinquishing of memory to prevent dangling pointer errors.

Berger notes that since Microsoft already gathers information when programs crash, using techniques similar to those in Exterminator would be a natural next step to quickly find and fix memory errors.

Professor Berger is now tackling the problem of concurrency errors – bugs that are becoming more common with the widespread adoption of multicore CPUs. His group recently developed Grace, a system that prevents concurrency errors in C and C++ programs, and Berger hopes that some version of it will also gain widespread adoption as part of an arsenal to protect programs from bugs.

New Hoard release

I just released a new version of Hoard with lots of changes (one of which I described in an earlier post).

This release incorporates a number of fixes and improvements, including:

  • Better per-thread allocation, improving speed on Unix platforms that do not support __thread, a fast way to do thread-specific data (see the sketch after this list)
  • Added interception of Solaris threads API (thr_*) to work with older Solaris programs (the subject of the earlier post)
  • Fixed a rare race condition accidentally introduced in 3.7.1
  • Increased checking to protect against heap corruption or other errors
  • Hoard now uses GNU-supported hooks on platforms with glibc (especially Linux), allowing it to work better with legacy programs (even emacs!)
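
For the curious, here is a small, purely illustrative example of the __thread storage class mentioned in the first item: each thread gets its own copy of the variable, with no locks and no pthread_getspecific lookups. On platforms without __thread, this has to be emulated, which is what the per-thread allocation improvement speeds up.

#include <pthread.h>
#include <stdio.h>

static __thread long counter = 0;   /* one independent copy per thread */

static void *worker(void *arg) {
    long n = (long)arg;
    for (long i = 0; i < n; i++)
        counter++;                  /* no locking needed: the data is thread-local */
    printf("this thread counted to %ld\n", counter);
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, (void *)1000L);
    pthread_create(&t2, NULL, worker, (void *)2000L);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}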