<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Emery Blogger</title>
	<atom:link href="http://emeryblogger.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://emeryblogger.com</link>
	<description>Emery Berger&#039;s Blog</description>
	<lastBuildDate>Fri, 03 May 2013 11:31:44 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='emeryblogger.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://0.gravatar.com/blavatar/460ede314b5608a4bd37bd7bfefda455?s=96&#038;d=http%3A%2F%2Fs2.wp.com%2Fi%2Fbuttonw-com.png</url>
		<title>Emery Blogger</title>
		<link>http://emeryblogger.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://emeryblogger.com/osd.xml" title="Emery Blogger" />
	<atom:link rel='hub' href='http://emeryblogger.com/?pushpress=hub'/>
		<item>
		<title>New Scientist coverage of our AutoMan project</title>
		<link>http://emeryblogger.com/2012/12/06/new-scientist-coverage-of-our-automan-project/</link>
		<comments>http://emeryblogger.com/2012/12/06/new-scientist-coverage-of-our-automan-project/#comments</comments>
		<pubDate>Thu, 06 Dec 2012 18:19:35 +0000</pubDate>
		<dc:creator>emeryberger</dc:creator>
				<category><![CDATA[Crowdsourcing]]></category>
		<category><![CDATA[Programming Languages]]></category>

		<guid isPermaLink="false">http://emeryblogger.com/?p=285</guid>
		<description><![CDATA[The New Scientist has just published an article covering our AutoMan project, which makes it possible to program with people. Full article below. Reasonably accurate, though it&#8217;s my team, not Dan&#8217;s [...]]]></description>
				<content:encoded><![CDATA[<p>The New Scientist has <a href="http://www.newscientist.com/article/mg21628945.500-your-next-boss-could-be-a-computer.html">just published an article</a> covering our <a href="http://www.automan-lang.org">AutoMan</a> project, which makes it possible to <em>program with people</em>. Full article below. Reasonably accurate, though it&#8217;s <em>my</em> team, not Dan&#8217;s <img src='http://s0.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> . Also on the project are my student <a href="http://www.cs.umass.edu/~charlie">Charlie Curtsinger</a> and my UMass colleague <a href="http://www.cs.umass.edu/~mcgregor">Andrew McGregor</a>.</p>
<p><strong><span id="more-285"></span>Your next boss could be a computer</strong></p>
<div id="pgtop">
<ul>
<li>06 December 2012 by <a href="http://www.newscientist.com/search?rbauthors=Douglas+Heaven"><b>Douglas Heaven</b></a></li>
<li>Magazine issue <a href="http://www.newscientist.com/issue/2894">2894</a>. <a href="http://subscribe.newscientist.com/bundles.aspx?prom=6005&amp;intcmp=SUBS-nsarttop&amp;promcode=6005"><b>Subscribe and save</b></a></li>
</ul>
</div>
<div id="hldmain">
<div id="hldcontent">
<div id="maincol">
<p><i><a href="http://emeryblogger.com/2012/12/06/new-scientist-coverage-of-our-automan-project/mg21628945-500-1_300/" rel="attachment wp-att-299"><img class="alignright size-full wp-image-299" alt="mg21628945.500-1_300" src="http://emeryblogger.files.wordpress.com/2012/12/mg21628945-500-1_300.jpg?w=470"   /></a>Software that delegates tricky problems to human workers is changing the nature of crowdsourcing</i></p>
<p>&#8220;I&#8217;D RATHER have a computer as my boss than a jerk,&#8221; says Daniel Barowy. To that end he has created AutoMan, the first fully automatic system that can delegate tasks to human workers via crowdsourcing platforms such as Amazon&#8217;s Mechanical Turk.</p>
<p>Artificial intelligence is improving all the time, but computers still struggle to complete certain tasks that are easy for us, such as quickly reading a car&#8217;s license plate or translating a joke. To get round this, people can post such tasks on platforms like Mechanical Turk for others to complete. Barowy wanted to automate this process &#8211; and AutoMan was born.</p>
<p>&#8220;We think of it as a new kind of computing,&#8221; says Barowy, a computer scientist at the University of Massachusetts, Amherst. &#8220;It changes the kind of things you can do.&#8221;</p>
<p>Barowy and colleagues designed AutoMan to send out jobs, manage workers, accept or reject work and make payments. &#8220;You&#8217;re replacing people&#8217;s bosses with a computer,&#8221; he says.</p>
<p>The quality guarantee is the most important contribution of the work, says Barowy. &#8220;Without a mechanism for addressing the quality of worker output, full automation is not possible.&#8221;</p>
<p>Unlike existing crowdsourcing platforms, AutoMan doesn&#8217;t attempt to predict the reliability of its workers based on their previous performance. Instead, if it is not sure it has the correct answer, it keeps on posting the same job, upping the fee each time, until it is confident that it does.</p>
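<p>A minimal Python sketch of the "repost until confident" loop described above. It is not AutoMan's actual algorithm or API: the function name, the agreement threshold, the minimum vote count, and the fee-doubling schedule are all assumptions made for illustration, and worker answers are supplied as canned rounds rather than fetched from a crowdsourcing platform.</p>

```python
from collections import Counter


def ask_until_confident(rounds, start_fee_cents=5, threshold=0.75, min_votes=3):
    """Collect answers round by round until one answer clearly dominates.

    rounds: iterable of lists of answers, one list per posting of the job
            (a hypothetical stand-in for real worker responses).
    Returns (answer, fee_of_last_posting_in_cents, total_votes), or
    (None, fee, total_votes) if the rounds run out before any answer
    reaches the agreement threshold.
    """
    votes = Counter()
    fee = start_fee_cents
    for i, answers in enumerate(rounds):
        if i > 0:
            fee *= 2  # assumed schedule: double the fee on each repost
        votes.update(answers)
        if not votes:
            continue  # nobody has answered yet, so repost
        total = sum(votes.values())
        answer, count = votes.most_common(1)[0]
        if total >= min_votes and count / total >= threshold:
            return answer, fee, total
    return None, fee, sum(votes.values())
```

<p>With the hypothetical rounds <tt>[["ABC123", "ABC128"], ["ABC123", "ABC123"]]</tt>, the first posting ends in a tie, so the job is reposted at double the fee; after the second posting three of four votes agree and the loop stops.</p>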
<p>&#8220;One way to think about it is that it saves the interesting parts, the creative parts, or the fun parts for people,&#8221; says Barowy. &#8220;It&#8217;s really the best of both worlds. You have the computer doing the grunt work.&#8221;</p>
<p>AutoMan could be used by developers of apps like <a href="http://vizwiz.org/">VizWiz</a>, in which blind people take a photo of their surroundings and receive a description of the scene. The algorithm could be incorporated into the app, sending the photos to crowdworkers, choosing the correct descriptions and sending them back to the app&#8217;s user.</p>
<p>Of course, human labour doesn&#8217;t come free. AutoMan will be given a budget by the app developer and be programmed to keep costs down. Quicker &#8211; or higher quality &#8211; responses will cost more but AutoMan will manage all of this automatically. Anyone using such hybrid software wouldn&#8217;t know whether they were interacting with a machine or a crowd of humans &#8211; or both.</p>
<p>So how do Mechanical Turk workers feel about being directly employed by a computer? Barowy has received positive feedback so far. When a human boss rejects your work, it can feel personal or unfair. But that&#8217;s not the case with AutoMan. &#8220;People ended up liking the system because it&#8217;s impartial,&#8221; he says. The team presented the work at the <a href="http://splashcon.org/2012/program/oopsla-research-papers/471-programming-support-i">OOPSLA</a> conference in Tucson, Arizona, last month.</p>
<p>&#8220;Any programmer could pick this up and use it,&#8221; says <a href="http://hci.stanford.edu/msb/">Michael Bernstein</a> of Stanford University in California. &#8220;That&#8217;s a really powerful thing.&#8221; Bernstein has developed hybrid computational systems himself, such as <a href="http://projects.csail.mit.edu/soylent/">Soylent</a>, a word processor that uses crowd workers to edit text.</p>
<p>Barowy&#8217;s team hopes that their system will make crowdsourcing mainstream, with software delegating tasks to human workers around the globe. &#8220;AutoMan might even help grow a new class of jobs that could become a new sector of the world economy,&#8221; says team member <a href="http://plasma.cs.umass.edu/emery/">Emery Berger</a>, also at the University of Massachusetts.</p>
<div>
<h3 id="bx289455B1">People power makes Google work</h3>
<p>Google likes to give the impression that it organises the world&#8217;s information using algorithms alone, but the manual for its human raters tells the true story. Google&#8217;s small army of home workers have a big say in what sites we are offered when we type in a search term.</p>
<p>The manual, revealed by technology website The Register, gives instructions on how raters should judge whether a set of search results matches a user&#8217;s intention. They are also asked to make calls on a website&#8217;s &#8220;relevance&#8221; &#8211; something that popular myth suggests is handled by the PageRank algorithm alone &#8211; and &#8220;quality&#8221;. Raters are told to look for websites with content that is less than four months old.</p>
</div>
</div>
</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://emeryblogger.com/2012/12/06/new-scientist-coverage-of-our-automan-project/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="http://emeryblogger.files.wordpress.com/2012/12/mg21628945-500-1_300.jpg?w=150" />
		<media:content url="http://emeryblogger.files.wordpress.com/2012/12/mg21628945-500-1_300.jpg?w=150" medium="image">
			<media:title type="html">mg21628945.500-1_300</media:title>
		</media:content>

		<media:content url="http://2.gravatar.com/avatar/ef7468f0b37dca98b539f321225ee583?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif&#38;r=G" medium="image">
			<media:title type="html">emeryberger</media:title>
		</media:content>

		<media:content url="http://emeryblogger.files.wordpress.com/2012/12/mg21628945-500-1_300.jpg" medium="image">
			<media:title type="html">mg21628945.500-1_300</media:title>
		</media:content>
	</item>
		<item>
		<title>Me on PBS, Explaining Cyberattacks on Banks</title>
		<link>http://emeryblogger.com/2012/12/04/me-on-wgby-explaining-cyberattacks-on-banks/</link>
		<comments>http://emeryblogger.com/2012/12/04/me-on-wgby-explaining-cyberattacks-on-banks/#comments</comments>
		<pubDate>Tue, 04 Dec 2012 22:08:04 +0000</pubDate>
		<dc:creator>emeryberger</dc:creator>
				<category><![CDATA[Security]]></category>

		<guid isPermaLink="false">http://emeryblogger.com/?p=277</guid>
		<description><![CDATA[Me on PBS, Explaining Cyberattacks on Banks My latest appearance on our local PBS affiliate WGBY&#8217;s program Connecting Point, this time explaining cyberattacks on banks (not a how-to!) &#8212; first [...]]]></description>
				<content:encoded><![CDATA[<p><a title="Me on WGBY, Explaining Cyberattacks on Banks" href="http://video.wgby.org/video/2304224656">Me on PBS, Explaining Cyberattacks on Banks</a></p>
<p><a href="http://video.wgby.org/video/2304224656"><img class="alignright  wp-image-389" alt="Screen Shot 2013-01-05 at 6.28.09 PM" src="http://emeryblogger.files.wordpress.com/2012/12/screen-shot-2013-01-05-at-6-28-09-pm.png?w=282&#038;h=176" width="282" height="176" /></a>My latest appearance on our local PBS affiliate WGBY&#8217;s program Connecting Point, this time explaining cyberattacks on banks (not a how-to!) &#8212; first segment.</p>
]]></content:encoded>
			<wfw:commentRss>http://emeryblogger.com/2012/12/04/me-on-wgby-explaining-cyberattacks-on-banks/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/ef7468f0b37dca98b539f321225ee583?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif&#38;r=G" medium="image">
			<media:title type="html">emeryberger</media:title>
		</media:content>

		<media:content url="http://emeryblogger.files.wordpress.com/2012/12/screen-shot-2013-01-05-at-6-28-09-pm.png" medium="image">
			<media:title type="html">Screen Shot 2013-01-05 at 6.28.09 PM</media:title>
		</media:content>
	</item>
		<item>
		<title>Most Influential Paper of OOPSLA 2002: &#8220;Reconsidering Custom Memory Allocation&#8221;</title>
		<link>http://emeryblogger.com/2012/10/28/most-influential-oopsla2012/</link>
		<comments>http://emeryblogger.com/2012/10/28/most-influential-oopsla2012/#comments</comments>
		<pubDate>Sun, 28 Oct 2012 23:07:06 +0000</pubDate>
		<dc:creator>emeryberger</dc:creator>
				<category><![CDATA[Memory Management]]></category>

		<guid isPermaLink="false">http://emeryblogger.com/?p=269</guid>
		<description><![CDATA[Our paper, Reconsidering Custom Memory Allocation, was just granted the Most Influential OOPSLA Paper award (given ten years after the paper appeared). Here&#8217;s the citation for the award. Custom memory [...]]]></description>
				<content:encoded><![CDATA[<p>Our paper, <a href="http://dl.acm.org/citation.cfm?id=582421">Reconsidering Custom Memory Allocation</a>, was just granted the <a href="http://www.sigplan.org/Awards/Conferences/OOPSLA/Description">Most Influential OOPSLA Paper</a> award (given ten years after the paper appeared). Here&#8217;s the citation for the award.<a href="http://www.sigplan.org/Awards/Conferences/OOPSLA/Main"><img class="alignright  wp-image-356" alt="Influential-Paper-OOPSLA" src="http://emeryblogger.files.wordpress.com/2011/12/influential-paper-oopsla.jpg?w=233&#038;h=312" width="233" height="312" /></a></p>
<blockquote><p><em>Custom memory management is often used in systems software for the purpose of decreasing the cost of allocation and tightly controlling memory footprint of software. Until 2002, it was taken for granted that application-specific memory allocators were superior to general purpose libraries. Berger, Zorn and McKinley&#8217;s paper demonstrated through a rigorous empirical study that this assumption is not well-founded, and gave insights into the reasons why general purpose allocators can outperform handcrafted ones. The paper also stands out for the quality of its empirical methodology.</em></p></blockquote>
<p>I am grateful to OOPSLA not only for the check for $333.33 <img src='http://s0.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> , but also for giving me the chance to publicly stand up and thank my wonderful co-authors: my excellent colleague <a href="http://research.microsoft.com/en-us/people/zorn/">Ben Zorn</a> and my awesome advisor, <a href="http://research.microsoft.com/en-us/people/mckinley/">Kathryn McKinley</a> (both now at Microsoft Research). The original paper actually did a bit more than the citation &#8211; here&#8217;s the abstract from the original paper.</p>
<blockquote><p><em>Programmers hoping to achieve performance improvements often use custom memory allocators. This in-depth study examines eight applications that use custom allocators. Surprisingly, for six of these applications, a state-of-the-art general-purpose allocator (the Lea allocator) performs as well as or better than the custom allocators. The two exceptions use regions, which deliver higher performance (improvements of up to 44%). Regions also reduce programmer burden and eliminate a source of memory leaks. However, we show that the inability of programmers to free individual objects within regions can lead to a substantial increase in memory consumption. Worse, this limitation precludes the use of regions for common programming idioms, reducing their usefulness.</em></p>
<p><em>We present a generalization of general-purpose and region-based allocators that we call <strong>reaps</strong>. Reaps are a combination of regions and heaps, providing a full range of region semantics with the addition of individual object deletion. We show that our implementation of reaps provides high performance, outperforming other allocators with region-like semantics. We then use a case study to demonstrate the space advantages and software engineering benefits of reaps in practice. Our results indicate that programmers needing fast regions should use reaps, and that most programmers considering custom allocators should instead use the Lea allocator.</em></p></blockquote>
<p>Slight correction: they should instead use <a href="http://www.hoard.org">Hoard</a> <img src='http://s0.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> .</p>
]]></content:encoded>
			<wfw:commentRss>http://emeryblogger.com/2012/10/28/most-influential-oopsla2012/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="http://emeryblogger.files.wordpress.com/2011/12/influential-paper-oopsla.jpg?w=111" />
		<media:content url="http://emeryblogger.files.wordpress.com/2011/12/influential-paper-oopsla.jpg?w=111" medium="image">
			<media:title type="html">Influential-Paper-OOPSLA</media:title>
		</media:content>

		<media:content url="http://2.gravatar.com/avatar/ef7468f0b37dca98b539f321225ee583?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif&#38;r=G" medium="image">
			<media:title type="html">emeryberger</media:title>
		</media:content>

		<media:content url="http://emeryblogger.files.wordpress.com/2011/12/influential-paper-oopsla.jpg" medium="image">
			<media:title type="html">Influential-Paper-OOPSLA</media:title>
		</media:content>
	</item>
		<item>
		<title>ACM Queue article: &#8220;Software Needs Seatbelts and Airbags&#8221;</title>
		<link>http://emeryblogger.com/2012/07/24/acm-queue-article-software-needs-seatbelts-and-airbags/</link>
		<comments>http://emeryblogger.com/2012/07/24/acm-queue-article-software-needs-seatbelts-and-airbags/#comments</comments>
		<pubDate>Tue, 24 Jul 2012 15:45:54 +0000</pubDate>
		<dc:creator>emeryberger</dc:creator>
				<category><![CDATA[Concurrency]]></category>
		<category><![CDATA[Fault Tolerance]]></category>
		<category><![CDATA[Memory Management]]></category>
		<category><![CDATA[Programming Languages]]></category>

		<guid isPermaLink="false">http://emeryblogger.com/?p=267</guid>
		<description><![CDATA[(Based on an earlier blog post.) ACM Queue, July 2012 - http://queue.acm.org/detail.cfm?id=2333133 Software Needs Seatbelts and Airbags Finding and fixing bugs in deployed software is difficult and time-consuming. Here are some [...]]]></description>
				<content:encoded><![CDATA[<p>(Based on an <a title="Software Needs Seatbelts and Airbags" href="http://emeryblogger.com/2012/05/31/software-needs-seatbelts-and-airbags/">earlier blog post</a>.)</p>
<p>ACM Queue, July 2012 - <a href="http://queue.acm.org/detail.cfm?id=2333133">http://queue.acm.org/detail.cfm?id=2333133</a></p>
<h1>Software Needs Seatbelts and Airbags</h1>
<h2>Finding and fixing bugs in deployed software is difficult and time-consuming. Here are some alternatives.</h2>
<h3>EMERY D. BERGER, UNIVERSITY OF MASSACHUSETTS, AMHERST</h3>
<p>Like death and taxes, buggy code is an unfortunate fact of life. Nearly every program ships with known bugs, and probably all of them end up with bugs that are discovered only post-deployment. There are many reasons for this sad state of affairs.</p>
<p>One problem is that many applications are written in memory-unsafe languages. Variants of C, including C++ and Objective-C, are especially vulnerable to memory errors such as buffer overflows and dangling pointers (use-after-free bugs). Two of these are in the SANS Top 25 list: buffer copy without checking size of input (<a href="http://cwe.mitre.org/top25/index.html#CWE-120">http://cwe.mitre.org/top25/index.html#CWE-120</a>) and incorrect calculation of buffer size (<a href="http://cwe.mitre.org/top25/index.html#CWE-131">http://cwe.mitre.org/top25/index.html#CWE-131</a>); see also heap-based buffer overflow (<a href="http://cwe.mitre.org/data/definitions/122.html">http://cwe.mitre.org/data/definitions/122.html</a>) and use after free (<a href="http://cwe.mitre.org/data/definitions/416.html">http://cwe.mitre.org/data/definitions/416.html</a>).</p>
<p>These bugs, which can lead to crashes, erroneous execution, and security vulnerabilities, are notoriously challenging to repair.</p>
<h3>SAFE LANGUAGES: NO PANACEA</h3>
<p>Writing new applications in memory-safe languages such as Java instead of C/C++ would help mitigate these problems. For example, because Java uses garbage collection, Java programs are not susceptible to use-after-free bugs; similarly, because Java always performs bounds-checking, Java applications cannot suffer memory corruption caused by buffer overflows.</p>
<p>That said, safe languages are no cure-all. Java programs still suffer from buffer overflows and null pointer dereferences, although they throw an exception as soon as they happen, unlike their C-based counterparts. The common recourse to these exceptions is to abort execution and print a stack trace (even to a Web page!). Java is also just as vulnerable as any other language to concurrency errors such as race conditions and deadlocks.</p>
<p>There are both practical and technical reasons not to use safe languages. First, it is generally not feasible to rewrite existing code because of the cost and time involved, not to mention the risk of introducing new bugs. Second, languages that rely on garbage collection are not a good fit for programs that need high performance or that make extensive use of available physical memory, since garbage collection always requires extra memory.<sup>6</sup> These include operating-system-level services, database managers, search engines, and physics-based games.</p>
<h3>ARE TOOLS THE ANSWER?</h3>
<p>While tools can help, they cannot catch all bugs. Static analyzers have made enormous strides in recent years, but many bugs remain out of reach. Rather than swamp developers with false positive reports, most modern static analyzers report far fewer bugs than they could. In other words, they trade false negatives (failing to report real bugs) for lower false positive rates. That makes these tools more usable, but it also means they will fail to report real bugs. Dawson Engler and his colleagues made exactly this choice for Coverity’s “unsound” static analyzer.<sup>4</sup></p>
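<p>The trade-off can be made concrete with a small Python sketch. The confidence scores and ground-truth labels below are invented for illustration, and ranking warnings by a single score is an assumption, not a description of how any particular analyzer works.</p>

```python
def report_stats(warnings, threshold):
    """Count outcomes when only warnings scoring >= threshold are reported.

    warnings: list of (confidence, is_real_bug) pairs; the scores and labels
              are made up for illustration.
    Returns (reported, false_positives, false_negatives).
    """
    reported = [(c, real) for c, real in warnings if c >= threshold]
    false_positives = sum(1 for _, real in reported if not real)
    false_negatives = sum(1 for c, real in warnings if real and c < threshold)
    return len(reported), false_positives, false_negatives


# Hypothetical analyzer output: three real bugs, three spurious warnings.
WARNINGS = [(0.9, True), (0.8, True), (0.6, False),
            (0.5, True), (0.3, False), (0.2, False)]
```

<p>Reporting everything (<tt>threshold=0.0</tt>) yields six reports, three of them spurious, and misses nothing. Raising the threshold to <tt>0.7</tt> yields two reports and no spurious ones, but one real bug now goes unreported.</p>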
<p>The state of the art in testing tools has also advanced dramatically in the past decade. Randomized fuzz testing can be combined with static analysis to explore paths that lead to failure. These tools are now in the mainstream: for example, Microsoft’s Driver Verifier can test device-driver code for a variety of problems and now includes randomized concurrency stress testing.</p>
<p>As Dijkstra famously remarked, however, “Program testing can be used to show the presence of bugs, but never to show their absence!” At some point, testing will fail to turn up new bugs, which will unfortunately be discovered only after the software has shipped.</p>
<h3>FIXING BUGS: RISKY (AND SLOW) BUSINESS</h3>
<p>Finding the bugs is only the first step. Once a bug is found, whether by inspection, testing, or analysis, fixing it remains a challenge. Any bug fix must be undertaken with extreme care, since any new code runs the risk of introducing yet more bugs. Developers must construct and carefully test a patch to ensure that it fixes the bug without introducing any new ones. This can be costly and time-consuming. For example, the average time between the discovery of a <em>remotely exploitable</em> memory error and the release of a patch for enterprise applications is 28 days, according to Symantec.<sup>12</sup></p>
<p>At some point, fixing certain bugs simply stops making economic sense. Tracking their source is often difficult and time-consuming, even when the full memory state and all inputs to the program are available. Obviously, showstopper bugs must be fixed. For other bugs, the benefits of fixing them may be outweighed by the risks of creating new bugs and the costs in programmer time and delayed deployment.</p>
<h3>DEBUGGING AT A DISTANCE</h3>
<p>Once the faulty software has been deployed, the problem of chasing down and repairing bugs becomes exponentially more difficult. Users rarely provide detailed bug reports that allow developers to reproduce the problem.</p>
<p>For deployed software on desktops or mobile devices, getting enough information to find a bug can be difficult. Sending an entire core file is generally impractical, especially over a mobile connection. Typically, the best one can hope for is some logging messages and a minidump consisting of a stack trace and information about thread contexts.</p>
<p>Even this limited information can provide valuable clues. If a particular function appears on many stack traces observed during crashes, then that function is a likely culprit. Microsoft Windows includes an application debugger (formerly Watson, now Windows Error Reporting) that is used to perform this sort of triage not only for Microsoft but also for third-party applications via Microsoft’s Winqual program. Google also has made available a cross-platform tool called Breakpad that can be used to provide similar services for any application.</p>
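<p>The triage idea in the previous paragraph can be sketched in a few lines of Python. This is only an illustration of the counting heuristic, not how Windows Error Reporting or Breakpad work internally; the function and frame names are hypothetical.</p>

```python
from collections import Counter


def rank_suspects(stack_traces):
    """Rank functions by how many crash stack traces they appear in.

    stack_traces: list of traces, each a list of function names
                  (innermost frame first). Names here are hypothetical.
    A function is counted at most once per trace, so deep recursion in a
    single crash does not inflate its score.
    """
    seen_in = Counter()
    for trace in stack_traces:
        seen_in.update(set(trace))
    # Sort by descending count, then by name for a stable, readable order.
    return sorted(seen_in.items(), key=lambda item: (-item[1], item[0]))


TRACES = [
    ["parse_header", "read_packet", "main"],
    ["parse_header", "parse_header", "handle_request", "main"],
    ["render", "main"],
]
```

<p>Entry points such as <tt>main</tt> appear in every trace, so a practical triage would discount frames that are common to all crashes and look at the next candidates, here <tt>parse_header</tt>.</p>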
<p>For many bugs, however, the kind of information that these tools provide is of limited value. For example, memory-corruption errors often do not trigger failures until millions of instructions past the point of the actual error, making stack traces useless. The same is generally true for null dereference exceptions, where the error often happens long after the null pointer was stored.</p>
<h3>CAPTAIN’S LOG: NOT ENOUGH INFORMATION</h3>
<p>On servers, the situation is somewhat better. Server applications typically generate log messages that may contain clues about why a program failed. Unfortunately, log files can be unmanageably large. Poring over logs and trying to correlate them to the source code can be extremely time-consuming. Even worse, that work may yield no useful results because the logs are incomplete; that is, they may not provide enough information to narrow down the source of a particular error because there were not enough log messages, or not the right kind.</p>
<p>Recent work at the University of Illinois and the University of California San Diego may lead to the development of tools that address some of these problems: SherLog<sup>13</sup> automates the process of tracing bugs from log messages to buggy source-code paths; and LogEnhancer<sup>14</sup> automatically extends log messages with information to help post-crash debugging. (More information on logging appears in a 2011 <em>ACM Queue</em> article, <a href="http://queue.acm.org/detail.cfm?id=2082137">Advances and Challenges in Log Analysis</a>.<sup>11</sup>)</p>
<p>Despite these advances, finding bugs has actually become harder than ever. It was already challenging to find bugs when programs were sequential, but now, with multithreaded programs, asynchrony, and multiple cores, the situation has become far worse. Every execution of these <em>nondeterministic</em> programs is completely different from the last because of different timing of events and thread interleavings. This situation makes reproducing bugs impossible even with a complete log of all input events—something that would be too expensive to record in practice, anyway.</p>
<h3>BUMPERS, SEATBELTS, AND AIRBAGS</h3>
<p>Let’s shift gears for a moment to talk about cars (we’ll get back to talking about software in a minute). As an analogy for the current situation, consider when cars first came onto the scene. For years, safety was an afterthought at best. When designing new cars, the primary considerations were aesthetics and high performance (think tailfins and V-8 engines).</p>
<p>Eventually, traffic fatalities led legislators and car manufacturers to take safety into account. Seatbelts became required standard equipment in U.S. cars in the late 1960s, bumpers in the 1970s, and airbags in the 1980s. Modern cars incorporate a wide range of safety features, including laminated windshields, crumple zones, and antilock braking systems. It is now practically unthinkable that any company would ship a car without these essential safety features.</p>
<p>The software industry is in a position similar to that of the automobile industry of the 1950s, delivering software with lots of horsepower and tailfins but no safety measures of any kind. Today’s software even comes complete with decorative spikes on the steering column to make sure that users will suffer if their applications crash.</p>
<h3>DRUNK DRIVING THROUGH A MINEFIELD</h3>
<p>The potent cocktail of manual memory management mixed with unchecked memory accesses makes C and C++ applications susceptible to a wide range of memory errors. These errors can cause programs to crash or produce incorrect results. Attackers are also frequently able to exploit these memory errors to gain unauthorized access to systems. Since the vast majority of objects accessed by applications are on the heap, heap-related errors are especially serious.</p>
<p>Numerous memory errors happen when programs incorrectly free objects. Dangling pointers arise when a heap object is freed while it is still live, leading to use-after-free bugs. Invalid frees happen when a program deallocates an address that was never returned by the allocator, for example by inadvertently freeing a stack object or an address in the middle of a heap object. Double frees occur when a heap object is deallocated multiple times without an intervening allocation. This error may at first glance seem innocuous but, in many cases, leads to heap corruption or program termination.</p>
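<p>To make the three free-related errors concrete, here is a toy Python model of a heap that classifies each call to <tt>free</tt> instead of corrupting its state. It is a sketch under simplifying assumptions (addresses are plain integers handed out by a bump pointer and never reused); the class and method names are hypothetical and do not correspond to any real allocator.</p>

```python
class CheckedHeap:
    """Toy model of an allocator that classifies free-related errors."""

    def __init__(self):
        self._next_addr = 0x1000   # arbitrary starting address
        self._live = {}            # start address -> size of live objects
        self._freed = set()        # start addresses that have been freed

    def malloc(self, size):
        if size <= 0:
            raise ValueError("size must be positive")
        addr = self._next_addr
        self._next_addr += size    # bump allocation: addresses never reused
        self._live[addr] = size
        return addr

    def free(self, addr):
        """Return 'ok', 'double free', or 'invalid free'."""
        if addr in self._live:
            del self._live[addr]
            self._freed.add(addr)
            return "ok"
        if addr in self._freed:
            return "double free"   # already deallocated, no new allocation since
        return "invalid free"      # never returned by malloc (stack or interior address)

    def is_dangling(self, addr):
        """True if addr refers to an object that has already been freed."""
        return addr in self._freed
```

<p>Freeing the same address twice is reported as a double free, freeing an address in the middle of an object is reported as an invalid free, and any pointer still held to a freed object is dangling.</p>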
<p>Other memory errors have to do with the use of allocated objects. When an object is allocated with a size that is not large enough, an out-of-bound error can occur when the memory address to be read or written lies outside the object. Out-of-bound writes are also known as buffer overflows. Uninitialized reads happen when a program reads memory that has never been initialized; in many cases, uninitialized memory contains data from previously allocated objects.</p>
<p>Given that the industry knows it is shipping software with bugs and that the terrain is dangerous, it might make sense to equip the product with seatbelts and airbags. The ideal would be to have both resilience and prompt corrective action for any problem that surfaces in the deployed applications.</p>
<p>Let’s focus on C/C++/Objective-C applications—the lion’s share of applications running on servers, desktops, and mobile platforms—and memory errors, the number-one headache for applications written in these languages. Safety-equipped memory allocators can play a crucial role in protecting software against crashes.</p>
<h3>THE GARBAGE COLLECTION SAFETY NET</h3>
<p>The first class of errors—those that happen because of the misuse of <tt>free</tt> or <tt>delete</tt>—can be remedied directly by using garbage collection. Garbage collection works by reclaiming only the objects that it allocated, eliminating invalid frees. It reclaims objects only once those objects can no longer be reached by traversing pointers from the “roots”: the globals and the stack. That eliminates dangling pointer errors, since by definition there can’t be any pointers to reclaimed objects. Since it naturally reclaims these objects only once, a garbage collector also eliminates double frees.</p>
<p>While C and C++ were not designed with garbage collection in mind, it is possible to plug in a conservative garbage collector and entirely prevent free-related errors. The word <em>conservative</em> here means that because the garbage collector doesn’t necessarily know which values are pointers (since we are in C-land), it conservatively assumes that if a value looks like a pointer (it is in the right range and properly aligned) and acts like a pointer (it points only to valid objects), then it may be a pointer.</p>
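<p>The "looks like a pointer, acts like a pointer" test above can be sketched as a small predicate. This is only an illustration under stated assumptions, not the Boehm-Demers-Weiser implementation: the heap bounds, the 8-byte alignment, the table of live objects, and the name <tt>may_be_pointer</tt> are all hypothetical, and the sketch assumes that a value pointing anywhere inside a live object counts as a possible pointer.</p>

```python
# Hypothetical heap layout used only for this sketch.
HEAP_START = 0x1000
HEAP_END = 0x2000
ALIGNMENT = 8  # assumed pointer alignment in bytes


def may_be_pointer(value, live_objects):
    """Conservatively decide whether an arbitrary word might be a pointer.

    live_objects maps each live object's start address to its size in bytes.
    """
    # "Looks like a pointer": in the heap's address range and properly aligned.
    if not (HEAP_START <= value < HEAP_END):
        return False
    if value % ALIGNMENT != 0:
        return False
    # "Acts like a pointer": it lands inside some live object.
    return any(start <= value < start + size
               for start, size in live_objects.items())


live = {0x1000: 32, 0x1040: 16}
for word in (0x1008, 0x1009, 0x1030, 42):
    print(hex(word), may_be_pointer(word, live))
```

<p>Any word that passes the test keeps its target object alive, which is why a conservative collector can retain some garbage but never reclaims an object that is still reachable.</p>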
<p>The Boehm-Demers-Weiser conservative garbage collector is an excellent choice for this purpose: it is reasonably fast and space efficient and can be used to replace memory allocators by configuring it to treat calls to free as NOPs (no operations).</p>
<p>Of course, garbage collection may not be suitable for all C/C++ applications. It can slow down some applications and lead to unpredictable pauses or jitter. It also may not be appropriate in resource-constrained contexts such as mobile devices, since garbage collection generally requires more memory than explicit memory management to provide the same throughput.<sup>6</sup> For example, conservative garbage collection in Objective-C is available on the desktop but not on iOS (iPhones or iPads).</p>
<h3>SLIPPING THROUGH THE NET</h3>
<p>While garbage collectors eliminate free-related errors, they cannot help prevent the second class of memory errors: those that have to do with the misuse of allocated objects such as buffer overflows.</p>
<p>Runtime systems that can find buffer overflows often impose staggeringly high overheads, making them not particularly suitable for deployed code. Tools such as Valgrind’s MemCheck are incredibly comprehensive and useful, but they are heavyweight by design and slow execution by orders of magnitude.<sup>8</sup></p>
<p>Compiler-based approaches can substantially reduce overhead by avoiding unnecessary checks, though they entail recompiling all of an application’s code, including libraries. Google has recently made available AddressSanitizer (<a href="http://code.google.com/p/address-sanitizer/wiki/AddressSanitizer">http://code.google.com/p/address-sanitizer/wiki/AddressSanitizer</a>), a combination of compiler and runtime technology that can find a number of bugs, including overflows and use-after-free bugs. While AddressSanitizer is much faster than Valgrind, its overhead remains relatively high (around 75 percent), making it useful primarily for testing.</p>
<p>All of these approaches are based on the idea that the best thing to do upon encountering an error is to abort immediately. This is the classic approach of just popping up a window indicating that something terrible has happened and would you like it to send a note home to Redmond / Cupertino, etc. This fail-stop behavior is certainly desirable in testing, but it is not usually what users want. Most application programs are not safety-critical systems, and aborting them in midstream can be an unpleasant experience for users, especially if it means they lose their work. In short, users generally would prefer that their applications be fault tolerant whenever possible.</p>
<h3>BOHR VERSUS HEISENBERG</h3>
<p>In fact, the exact behavior users do <em>not</em> want is for an error to happen consistently and repeatedly. In his classic 1985 article, “Why do computers stop and what can be done about it?”<sup>5</sup> Jim Gray drew a distinction between two kinds of bugs. The first kind are bugs that behave predictably and repeatedly: they occur every time the program encounters the same inputs and goes through the same sequence of steps. These are <em>Bohrbugs</em>, named for the Bohr atom, by analogy with the classical atomic model where electrons circle around the nucleus in planetary-like orbits. Bohrbugs are great when debugging a program, since they are easy to reproduce, which makes their root causes easier to find.</p>
<p>The second kind of bug is the Heisenbug, named for Heisenberg’s Uncertainty Principle and meant to connote the inherent uncertainty of quantum mechanics. Heisenbugs are unpredictable and cannot be reliably reproduced. The most common Heisenbugs these days are concurrency errors (a.k.a. race conditions), which depend on the order and timing of scheduling events to appear. Heisenbugs are also often sensitive to the <em>observer effect</em>; attempts to find the bug by inserting debugging code or running in a debugger often disrupt the sequence of events that led to the bug, making it go away.</p>
<p>Jim Gray makes the point that Bohrbugs are great for debugging, but users would rather their bugs be Heisenbugs. Why? Because Bohrbugs are showstoppers for users: every time the user does the same thing, he or she will encounter the same bug. With Heisenbugs, on the other hand, the bugs often go away when you run the program again. This is a perfect match for the way users already behave on the Web. If they go to a Web page and it fails to respond, they just click “refresh” and that usually solves the problem.</p>
<p>Thus, one way to make life better for users is to convert Bohrbugs into Heisenbugs—if we can figure out how to do that.</p>
<h3>DEFENSIVE DRIVING WITH DIEHARD</h3>
<p>My graduate students at the University of Massachusetts Amherst and I, in collaboration with my colleague Ben Zorn at Microsoft Research, have been working for the past few years on ways to protect programs from bugs. The first fruit of that research is a system called DieHard (<a href="http://diehard-software.org/">http://diehard-software.org/</a>) that makes memory errors less likely to impact users. DieHard eliminates some errors entirely and converts the others into (rare) Heisenbugs.</p>
<p>To see how DieHard works, let’s go back to the car analogy. One way to make cars less likely to crash into each other is to space them farther apart, providing adequate braking distance in case something goes wrong. DieHard provides this “defensive driving” by taking over all memory-management operations and allocating objects in a space larger than required.</p>
<p>This de facto padding increases the odds that a small overflow will end up in unallocated space where it can do no harm. DieHard, however, doesn’t just add a fixed amount of padding between objects. That would provide great protection against overflows that are small enough, and zero protection against the others. In other words, those overflows would still be Bohrbugs.</p>
<p>Instead, DieHard provides <em>probabilistic memory safety</em> by randomly allocating objects on the heap. DieHard adaptively sizes its heap to be a bit larger than the maximum needed by the application; by default, it is one-third larger.<sup>2,3</sup> DieHard allocates memory from increasingly large chunks called <em>miniheaps</em>.</p>
<p>By randomly allocating objects across all the miniheaps, DieHard makes many memory overflows benign, with a probability that naturally declines as the overflow increases in size and the heap becomes full. The effect is that, in most cases when running DieHard, a small overflow is likely to have no effect.</p>
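<p>To make the trend concrete, here is a deliberately simplified model, not the analysis from the DieHard papers. It assumes that every slot adjacent to the overflowing object is independently live with probability equal to the heap's fullness, so an overflow that tramples <tt>k</tt> slots is harmless only if all <tt>k</tt> are free. The function name and parameters are hypothetical.</p>

```python
def p_overflow_benign(fullness, slots_overwritten):
    """Chance that an overflow touches only free slots (simplified model).

    fullness: fraction of heap slots holding live objects, between 0 and 1.
    slots_overwritten: number of neighboring slots the overflow writes into.
    """
    if not 0.0 <= fullness <= 1.0:
        raise ValueError("fullness must be between 0 and 1")
    if slots_overwritten < 0:
        raise ValueError("slots_overwritten must be non-negative")
    # Each overwritten slot is free with probability (1 - fullness).
    return (1.0 - fullness) ** slots_overwritten


for fullness in (0.25, 0.5, 0.75):
    row = [round(p_overflow_benign(fullness, k), 3) for k in (1, 2, 4)]
    print(fullness, row)
```

<p>Under this model the probability falls as either the overflow grows or the heap fills, which matches the qualitative behavior described above.</p>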
<p>DieHard’s random allocation approach also reduces the likelihood of the <tt>free</tt>-related errors that garbage collection addresses. DieHard uses bitmaps, stored outside the heap, to track allocated memory. A bit set to 1 indicates that a given block is in use; 0 means it is available.</p>
<p>This use of bitmaps to manage memory eliminates the risk of double frees, since resetting a bit to 0 twice is the same as resetting it once. Keeping the heap metadata separate from the data in the heap also makes it impossible to inadvertently corrupt that metadata by overwriting heap data.</p>
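<p>A toy version of a bitmap-managed, randomly probed miniheap shows why a double free is harmless here. This is a sketch, not DieHard's code: the class name, the slot-index interface, and the decision to silently ignore out-of-range frees are assumptions made for illustration.</p>

```python
import random


class BitmapMiniheap:
    """Toy miniheap: slots are tracked by a bitmap kept apart from the data."""

    def __init__(self, num_slots, seed=None):
        self.in_use = [False] * num_slots  # True means the slot is allocated
        self.live = 0
        self.rng = random.Random(seed)

    def malloc(self):
        """Return a randomly chosen free slot index, or None if full."""
        if self.live == len(self.in_use):
            return None
        # Probe random slots until a free one turns up; probing is cheap
        # as long as the heap is kept well below full.
        while True:
            slot = self.rng.randrange(len(self.in_use))
            if not self.in_use[slot]:
                self.in_use[slot] = True
                self.live += 1
                return slot

    def free(self, slot):
        """Clear the slot's bit; clearing an already-clear bit changes nothing."""
        if not (0 <= slot < len(self.in_use)):
            return  # sketch assumption: ignore frees of slots we never handed out
        if self.in_use[slot]:
            self.in_use[slot] = False
            self.live -= 1


heap = BitmapMiniheap(num_slots=8, seed=1)
obj = heap.malloc()
heap.free(obj)
heap.free(obj)  # double free: the bitmap and the live count are unchanged
print(heap.live, heap.in_use.count(True))
```

<p>Because the only state a free touches is one bit outside the heap, there is no free list or object header for a second free to damage.</p>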
<p>Most importantly, DieHard drastically reduces the risk of dangling pointer errors, which effectively go away. If the heap has 1 million freed objects, the chance that you will immediately reuse one that was just freed is literally one in a million. Contrast this with most allocators, which immediately reclaim freed objects. With DieHard, even after 10,000 reallocations, there is still a 99 percent chance that the dangled object will not be reused.</p>
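<p>The 99 percent figure can be checked with a short calculation. The sketch assumes each allocation picks uniformly among 1 million free slots and that the pool of free slots stays at that size throughout, which is a simplification; the function name is hypothetical.</p>

```python
def p_not_reused(free_slots, allocations):
    """Probability that one particular freed slot survives `allocations`
    uniformly random picks from a pool of `free_slots` free slots."""
    per_allocation_miss = 1.0 - 1.0 / free_slots
    return per_allocation_miss ** allocations


p = p_not_reused(free_slots=1_000_000, allocations=10_000)
print(f"{p:.4f}")  # roughly 0.99
```

<p>The same function shows how quickly protection erodes on a small heap: with only 1,000 free slots, the survival probability after 10,000 allocations is effectively zero.</p>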
<p>Because it performs its allocation in (amortized) constant time, DieHard can provide added safety with very little additional cost in performance. For example, using it in a browser causes no perceivable performance impact.</p>
<p>At Microsoft Research, tests with a variant of DieHard prevented about 30 percent of all bugs in the Microsoft Office database with no perceivable impact on performance. Beginning with Windows 7, Microsoft Windows now ships with an FTH (fault-tolerant heap) that was directly inspired by DieHard. Normally, applications use the default heap, but after a program crashes more than a certain number of times, the FTH takes over. Like DieHard, the FTH manages heap metadata separately from the heap. It also adds padding and delays allocations, though it does not provide DieHard’s probabilistic fault tolerance because it does not randomize allocations or deallocations. The FTH approach is especially attractive because it acts like an airbag: effectively invisible and cost-free when everything is fine, but providing protection when needed.</p>
<h3>EXTERMINATING THE BUGS</h3>
<p>Tolerating bugs is one way to improve the effective quality of deployed software. It would be even better if somehow the software could not only tolerate faults but also correct them. We developed Exterminator, a follow-on system to DieHard, that does exactly that.<sup>9,10</sup></p>
<p>Exterminator uses a version of DieHard extended to detect errors (called DieFast). While DieHard probabilistically tolerates errors, DieFast also probabilistically detects them. When Exterminator discovers an error, it dumps a <em>heap image</em> that contains the complete state of the heap without the actual contents. Exterminator then processes one or more heap images to locate the source of the error. Randomization means that each heap image will have completely shuffled objects, making it possible for Exterminator to apply a probabilistic error-isolation algorithm to identify buffer overflows and dangling pointer errors based on the resulting patterns of heap corruption. It can then compute what kind of error happened and where it occurred.</p>
<p>Not only does Exterminator send this information back to the programmers so they can repair the software, but it also automatically corrects the errors via <em>runtime patches</em>. For example, if it detects that a certain object was responsible for a buffer overflow of eight bytes, it will always allocate such objects (distinguished by their call site and size) with an eight-byte pad. If an object is prematurely freed, Exterminator will defer its reclamation. Exterminator can learn from the results of multiple runs or multiple users, so it could be used to proactively push out patches to prevent other users from experiencing errors it has already detected elsewhere.</p>
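<p>One way to picture a runtime patch for overflows is a lookup table keyed by allocation call site and requested size. The sketch below is an illustration only, not Exterminator's data structure: the class, its methods, and the string call-site labels are hypothetical, and deferred frees are left out to keep it short.</p>

```python
class PatchTable:
    """Toy runtime-patch table: extra padding per (call site, size) pair."""

    def __init__(self):
        self.pad_bytes = {}  # (call_site, size) -> bytes of padding to add

    def record_overflow(self, call_site, size, overflow_bytes):
        """Remember the largest overflow seen for this kind of object."""
        key = (call_site, size)
        self.pad_bytes[key] = max(self.pad_bytes.get(key, 0), overflow_bytes)

    def padded_size(self, call_site, size):
        """Size to actually allocate, including any learned padding."""
        return size + self.pad_bytes.get((call_site, size), 0)

    def merge(self, other):
        """Combine patches learned in another run or by another user."""
        for (call_site, size), pad in other.pad_bytes.items():
            self.record_overflow(call_site, size, pad)


local = PatchTable()
local.record_overflow("parse_header", 64, 8)

remote = PatchTable()
remote.record_overflow("copy_name", 32, 4)

local.merge(remote)
print(local.padded_size("parse_header", 64))
print(local.padded_size("copy_name", 32))
print(local.padded_size("unpatched_site", 64))
```

<p>Keeping patches as plain data rather than code is what makes it plausible to share them across runs and users, as the paragraph above describes.</p>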
<h3>THE FUTURE: SAFER, SELF-REPAIRING SOFTWARE</h3>
<p>My group and others (notably Martin Rinard at MIT, Vikram Adve at the University of Illinois, Yuanyuan Zhou at UC-San Diego, Shan Lu at the University of Wisconsin, and Luis Ceze and Dan Grossman at the University of Washington) have made great strides in building safety systems for other classes of errors. We have recently published work on systems that prevent concurrency errors, some of which we can eliminate automatically (<a href="http://plasma.cs.umass.edu/emery/software">http://plasma.cs.umass.edu/emery/software</a>).</p>
<p>Grace is a runtime system that eliminates concurrency errors for concurrent programs that use fork-join parallelism. It hijacks the threads library, converting threads to processes “under the hood,” and uses virtual memory mapping and protection to enforce behavior that gives the illusion of a sequential execution, even on a multicore processor.<sup>1</sup> Dthreads (deterministic threads) is a full replacement for the Posix threads library that enforces deterministic execution for multithreaded code.<sup>7</sup> In other words, a multithreaded program running with Dthreads never has races; every execution with the same inputs generates the same outputs. Dthreads uses similar mechanisms to Grace and adds deterministic locking and diff-based resolution of conflicts, allowing it to support arbitrary multithreaded programs.</p>
<p>We look forward to a day in the not-too-distant future when such safer runtime systems are the norm. Just as we can now barely imagine cars without their many safety features, we are finally adopting a similar philosophy for software. Buggy software is inevitable, and when possible we should deploy safety systems that reduce their impact on users.</p>
<h4>REFERENCES</h4>
<p>1. Berger, E. D., Yang, T., Liu, T., Novark, G. 2009. Grace: safe multithreaded programming for C/C++. In <em>Proceedings of the 24th ACM SIGPLAN Conference on Object-oriented Programming Systems Languages and Applications (OOPSLA’09)</em>: 81–96.</p>
<p>2. Berger, E. D., Zorn, B. G. 2006. DieHard: probabilistic memory safety for unsafe languages. In <em>Proceedings of the 2006 ACM SIGPLAN Conference on Programming Language Design and </em><em>Implementation (PLDI)</em>: 158–168.</p>
<p>3. Berger, E. D., Zorn, B. G. 2007. Efficient probabilistic memory safety. Technical Report UMCS TR-2007-17, Department of Computer Science, University of Massachusetts, Amherst.</p>
<p>4. Bessey, A., Block, K., Chelf, B., Chou, A., Fulton, B., Hallem, S., Henri-Gros, C., Kamsky, A., McPeak, S., Engler, D. 2010. A few billion lines of code later: using static analysis to find bugs in the real world. <em>Communications of the ACM</em> 53(2): 66–75.</p>
<p>5. Gray, J. 1985. Why do computers stop and what can be done about it? Tandem TR-85.7; <a href="http://www.hpl.hp.com/techreports/tandem/TR-85.7.html">http://www.hpl.hp.com/techreports/tandem/TR-85.7.html</a>.</p>
<p>6. Hertz, M., Berger, E. D. 2005. Quantifying the performance of garbage collection vs. explicit memory management. In <em>Proceedings of the 20th Annual ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications (OOPSLA’05)</em>: 313–326.</p>
<p>7. Liu, T., Curtsinger, C., Berger, E.D. 2011. Dthreads: efficient deterministic multithreading. In <em>Proceedings of the 23rd ACM Symposium on Operating Systems Principles</em> <em>(SOSP ’11)</em>: 327–336.</p>
<p>8. Nethercote, N., Seward, J. 2007. Valgrind: a framework for heavyweight dynamic binary instrumentation. In <em>Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI)</em>: 89–100.</p>
<p>9. Novark, G., Berger, E. D., Zorn, B. G. 2007. Exterminator: automatically correcting memory errors with high probability. In <em>Proceedings of the ACM SIGPLAN Conference on Programming Language </em><em>Design and Implementation (PLDI)</em>: 1–11.</p>
<p>10. Novark, G., Berger, E. D., Zorn, B. G. 2008. Exterminator: automatically correcting memory errors with high probability. <em>Communications of the ACM</em> 51(12): 87–95.</p>
<p>11. Oliner, A., Ganapathi, A., Xu, W. 2011. Advances and challenges in log analysis. <em>ACM Queue</em> 9(12); <a href="http://queue.acm.org/detail.cfm?id=2082137">http://queue.acm.org/detail.cfm?id=2082137</a>.</p>
<p>12. Symantec. 2006. Internet security threat report, volume X (September); <a href="http://www.symantec.com/threatreport/archive.jsp">http://www.symantec.com/threatreport/archive.jsp</a>.</p>
<p>13. Yuan, D., Mai, H., Xiong, W., Tan, L., Zhou, Y., Pasupathy, S. 2010. Sherlog: error diagnosis by connecting clues from runtime logs. In <em>Proceedings of the 15th Architectural Support for Programming </em><em>Languages and Operating Systems</em> <em>(ASPLOS ’10):</em> 143–154.</p>
<p>14. Yuan, D., Zheng, J., Park, S., Zhou, Y., Savage, S. 2012. Improving software diagnosability via log enhancement. <em>ACM Transactions on Computer Systems</em> 30(1): 4:1–4:28.</p>
<p>LOVE IT, HATE IT? LET US KNOW</p>
<p><a href="mailto:feedback@acmqueue.com">feedback@queue.acm.org</a></p>
<p><strong>EMERY BERGER</strong> is an associate professor in the department of computer science at the University of Massachusetts, Amherst, where he leads the PLASMA lab. He graduated with a Ph.D. in computer science from the University of Texas at Austin in 2002 and has spent several years as a visiting scientist at Microsoft Research and a senior researcher at the Universitat Politecnica de Catalunya/Barcelona Supercomputing Center. Berger’s research spans programming languages, runtime systems, and operating systems, with a particular focus on systems that transparently improve reliability, security, and performance. He is the creator of various widely used software systems including Hoard and DieHard. He is a senior member of ACM and an associate editor of the <em>ACM Transactions on Programming Languages and Systems</em>.</p>
<br />  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/emeryblogger.wordpress.com/267/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/emeryblogger.wordpress.com/267/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=emeryblogger.com&#038;blog=9784426&#038;post=267&#038;subd=emeryblogger&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://emeryblogger.com/2012/07/24/acm-queue-article-software-needs-seatbelts-and-airbags/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:thumbnail url="http://emeryblogger.files.wordpress.com/2012/07/082012_cacmpg49_software-needs-t-large.jpg?w=150" />
		<media:content url="http://emeryblogger.files.wordpress.com/2012/07/082012_cacmpg49_software-needs-t-large.jpg?w=150" medium="image">
			<media:title type="html">082012_CACMpg49_Software-Needs-t.large</media:title>
		</media:content>

		<media:content url="http://2.gravatar.com/avatar/ef7468f0b37dca98b539f321225ee583?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif&#38;r=G" medium="image">
			<media:title type="html">emeryberger</media:title>
		</media:content>
	</item>
		<item>
		<title>Software Needs Seatbelts and Airbags</title>
		<link>http://emeryblogger.com/2012/05/31/software-needs-seatbelts-and-airbags/</link>
		<comments>http://emeryblogger.com/2012/05/31/software-needs-seatbelts-and-airbags/#comments</comments>
		<pubDate>Thu, 31 May 2012 14:16:34 +0000</pubDate>
		<dc:creator>emeryberger</dc:creator>
				<category><![CDATA[Concurrency]]></category>
		<category><![CDATA[Fault Tolerance]]></category>
		<category><![CDATA[Memory Management]]></category>
		<category><![CDATA[Programming Languages]]></category>
		<category><![CDATA[software-development]]></category>

		<guid isPermaLink="false">http://emeryblogger.com/?p=248</guid>
		<description><![CDATA[(This post is a draft version of an article slated to appear in ACM Queue.) Finding and fixing bugs in deployed software is difficult and time-consuming: here are some alternatives. [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=emeryblogger.com&#038;blog=9784426&#038;post=248&#038;subd=emeryblogger&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<div>
<p><span style="color:#000000;">(This post is a draft version of an article slated to appear in ACM Queue.)</span></p>
<p><em>Finding and fixing bugs in deployed software is difficult and time-consuming: here are some alternatives.</em></p>
<h3>Death, Taxes, and Bugs.</h3>
<p><img class="alignnone" src="http://www.danzfamily.com/archives/blogphotos/06/403-death-and-taxes.gif" alt="" width="388" height="225" /></p>
<p>Like death and taxes, buggy code is an unfortunate fact of life. Nearly every program ships with known bugs, and probably all of them end up with bugs that are only discovered post-deployment. There are many reasons for this sad state of affairs.</p>
<h3>Unsafe Languages.</h3>
<p>One problem is that many applications are written in memory-unsafe languages. Variants of C, including C++ and Objective-C, are especially vulnerable to memory errors like buffer overflows and dangling pointers (use-after-free bugs). These bugs, which can lead to crashes, erroneous execution, and security vulnerabilities, are notoriously challenging to repair.</p>
<h3>Safe Languages: No Panacea.</h3>
<p><img class="alignnone" src="http://www.kombuchakamp.com/wp/wp-content/uploads/2010/04/panacea.jpg" alt="" width="246" height="205" /></p>
<p>Writing new applications in memory-safe languages like Java instead of C/C++ would go some way towards mitigating these problems. For example, because Java uses garbage collection, Java programs are not susceptible to use-after-free bugs; similarly, because Java always performs bounds-checking, Java applications cannot suffer memory corruption due to buffer overflows.</p>
<p>That said, safe languages are no cure-all. Java programs still suffer from buffer overflows and null pointer dereferences, though they throw an exception as soon as they happen, unlike their C-based counterparts. The common recourse to these exceptions is to abort execution and print a stack trace (even to a web page!). Java is also just as vulnerable as any other language to concurrency errors like race conditions and deadlocks.</p>
<p>There are both practical and technical reasons not to use safe languages. First, it is generally not feasible to rewrite existing code because of the cost and time involved, not to mention the risk of introducing new bugs. Second, languages that rely on garbage collection are not a good fit for programs that need high performance or that make extensive use of available physical memory, since garbage collection always requires some extra memory [5]. These include OS-level services, database managers, search engines, and physics-based games.</p>
</div>
<div>
<h3>Are Tools the Answer?</h3>
<p><img class="alignnone" src="http://emeryblogger.files.wordpress.com/2012/05/toolbox.jpg?w=350&#038;h=277" alt="" width="350" height="277" /></p>
<p>While tools can help, they too cannot catch all bugs. Static analyzers have made enormous strides in recent years, but many bugs remain out of reach. Rather than swamp developers with false positive reports, most modern static analyzers report far fewer bugs than they could. In other words, they accept more false negatives (failing to report real bugs) in exchange for lower false positive rates. That makes these tools more usable, but also means that they will fail to report real bugs. Dawson Engler and his colleagues made exactly this choice for Coverity’s “unsound” static analyzer (see the Communications of the ACM article on their experiences, <em>A Few Billion Lines of Code Later: Using Static Analysis to Find Bugs in the Real World</em> [4]).</p>
<h3>Testing is Good but Not Enough.</h3>
<p><a href="http://emeryblogger.files.wordpress.com/2012/05/testing-code-is-for-wimps-real-men-test-in-production.jpg"><img class="alignright size-medium wp-image-254" title="testing-code-is-for-wimps-real-men-test-in-production" src="http://emeryblogger.files.wordpress.com/2012/05/testing-code-is-for-wimps-real-men-test-in-production.jpg?w=259&#038;h=300" alt="" width="259" height="300" /></a></p>
<p>The state of the art in testing tools has also advanced dramatically in the last decade. Randomized fuzz testing can be combined with static analysis to drive tests to explore paths that lead to failure. These tools are now in the mainstream: for example, <a href="http://msdn.microsoft.com/en-us/library/windows/hardware/gg487310.aspx">Microsoft’s Driver Verifier</a> can test device driver code for a wide variety of problems, and now includes randomized concurrency stress testing.</p>
<p>But as Dijkstra famously remarked, “Program testing can be used to show the presence of bugs, but never to show their absence!” At some point, testing will fail to turn up new bugs, which will unfortunately be discovered only once the software has shipped.</p>
<h3>Fixing Bugs: Risky (and Slow) Business.</h3>
<p><img class="alignright" src="http://emeryblogger.files.wordpress.com/2012/05/riskybusiness.jpg?w=270&#038;h=180" alt="" width="270" height="180" /></p>
<p>Finding the bugs is only the first step. Once a bug is found, whether by inspection, testing, or analysis, fixing it remains a challenge. Any bug fix must be undertaken with extreme care, since any new code runs the risk of introducing yet more bugs. Developers must construct and carefully test a patch to ensure that it fixes the bug without introducing any new ones. This process can be costly and time-consuming. For example, according to Symantec, the average time between the discovery of a remotely exploitable memory error and the release of a patch for enterprise applications is 28 days [11].</p>
<h3>Cut Bait and Ship.</h3>
<p>At some point, it simply stops making economic sense to fix certain bugs. Tracking their source is often difficult and time consuming, even when the full memory state and all inputs to the program are available. Obviously, show-stopper bugs must be fixed. For other bugs, the benefits of fixing them may be outweighed by the risks of creating new bugs and the costs in programmer time and in delayed deployment.</p>
<h3>Debugging at a Distance.</h3>
<p><img class="alignnone" src="http://images.nationalgeographic.com/wpf/media-live/photos/000/064/cache/color-earthrise_6429_600x450.jpg" alt="" width="360" height="270" /></p>
<p>Once that faulty software has been deployed, the problem of chasing down and repairing bugs becomes far more difficult. Users rarely provide detailed bug reports that allow developers to reproduce the problem.</p>
<p>For deployed software on desktops or mobile devices, getting enough information to find a bug can be difficult. Sending an entire core file is generally impractical, especially over a mobile connection. Typically, the best one can hope for is some logging messages and a minidump consisting of a stack trace and information about thread contexts.</p>
</div>
<div>
<p>Even this limited information can provide valuable clues. If a particular function appears on many stack traces observed during crashes, then that function is a likely culprit. Microsoft Windows <a href="http://www.microsoft.com/about/technicalrecognition/watson-technologies-team.aspx">includes an application debugger</a> (formerly “Watson”, now “Windows Error Reporting”) that is used to perform this sort of triage not only for Microsoft but also for third-party applications via Microsoft’s <a href="http://winqual.microsoft.com/">Winqual program</a>. Google also has made available <a href="http://code.google.com/p/google-breakpad/">a cross-platform tool called Breakpad</a> that can be used to provide similar services for any application.</p>
<p>However, for many bugs, the kind of information that these tools provide is of limited value. For example, memory corruption errors often trigger failures millions of instructions past the point of the actual error, making stack traces useless. The same is generally true for null dereference exceptions, where the error often happens long after the null pointer was stored.</p>
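<p>The stack-trace triage described above (a function that shows up in many crash traces is a likely culprit) amounts to a frequency count. The following sketch is not how Windows Error Reporting or Breakpad work internally; the function name and the sample traces are made up for illustration.</p>

```python
from collections import Counter


def rank_suspects(stack_traces):
    """Rank functions by the number of crash traces they appear in.

    stack_traces: a list of traces, each a list of function names.
    Returns (function, crash_count) pairs, most frequent first; ties are
    broken alphabetically so the ordering is stable.
    """
    counts = Counter()
    for trace in stack_traces:
        # Count each function once per crash, even if it recurses.
        counts.update(set(trace))
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


crashes = [
    ["copy_field", "parse_record", "read_file"],
    ["copy_field", "render_row"],
    ["copy_field", "parse_record"],
    ["flush_cache"],
]
for function, crash_count in rank_suspects(crashes):
    print(function, crash_count)
```

<p>As the preceding paragraph notes, this ranking only points at where the program finally failed, so for memory corruption the top-ranked function may be far from the code that caused the damage.</p>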
<h3>Captain’s Log: Not Enough Information.</h3>
<p><img class="alignleft" src="http://www.eastsidepatch.com/wp-content/uploads/2010/02/capt-kirk_yellowCU-001_1196284873.jpg" alt="" width="288" height="215" />On servers, the situation is somewhat improved. Server applications typically generate log messages, which may contain clues to why a program failed. Unfortunately, log files can be unmanageably large. Poring over logs and trying to correlate them to the source code can be extremely time-consuming. Even worse, that work may yield no useful results because the logs are incomplete: there may be too few log messages, or messages of the wrong kind, to narrow down the source of a particular error. Recent work at Illinois and UC San Diego may lead to tools that address some of these problems; SherLog [12] automates the process of tracing back bugs from log messages to buggy source code paths, and LogEnhancer [13] automatically extends log messages with information to help post-crash debugging. More information on logging appears in a recent Queue article, <em>Advances and Challenges in Log Analysis</em> [10].</p>
<h3>God Doesn’t Play Dice, but Your Computer Does.</h3>
<p>Despite these advances, finding bugs has actually become harder than ever. Back when many programs were sequential it was already challenging to find bugs, but now the situation has become far worse. Multithreaded programs, asynchrony, and multiple cores are now a fact of life. Every execution of these non-deterministic programs is completely different from the last because of different timing of events and thread interleavings. This situation makes reproducing bugs impossible even with a complete log of all input events—something that would be too expensive to record in practice anyway.</p>
<h3>Bumpers, Seatbelts and Airbags.</h3>
<p>Let’s shift gears for a moment to talk about cars (we’ll get back to talking about software in a minute). As an analogy for the situation we find ourselves in, consider when cars first came onto the scene. For years, safety was an afterthought at best. When designing new cars, the primary considerations were aesthetics and high performance (think tailfins and V-8 engines).</p>
</div>
<div>
<p><img class="alignnone" src="http://www.automotive-stock-images.com/photos/cadillac-series-62-convertible-1959-rear-tailfins.jpg" alt="" width="409" height="275" /></p>
<p style="text-align:left;">Eventually, traffic fatalities led legislators and car manufacturers to take safety into account. Seatbelts became required standard equipment in US cars in the late 1960s, bumpers in the 1970s, and airbags in the 1980s. Modern cars incorporate a wide range of safety features, including laminated windshields, crumple zones, and anti-lock braking systems. It is now practically unthinkable that anyone would ship a car without these essential safety features.</p>
<p style="text-align:left;">However, we routinely ship software with no safety measures of any kind. We are in a position similar to that of the automobile industry of the 1950s, delivering software with lots of horsepower and tailfins, complete with decorative spikes on the steering column to make sure that the user will suffer if their application crashes.</p>
<h3>Drunk-Driving Through a Minefield.</h3>
<p style="text-align:center;"><img class="aligncenter" src="http://farm3.staticflickr.com/2202/2060704683_315929dddd_b.jpg" alt="" width="614" height="410" /></p>
<p style="text-align:left;">The potent cocktail of manual memory management mixed with unchecked memory accesses makes C and C++ applications susceptible to a wide range of memory errors. These errors can cause programs to crash or produce incorrect results. Attackers are also frequently able to exploit these memory errors to gain unauthorized access to systems. Since the vast majority of objects accessed by applications are on the heap, heap-related errors are especially serious.</p>
<p style="text-align:left;">Numerous memory errors happen when programs incorrectly free objects. <em>Dangling pointers</em> arise when a heap object is freed while it is still live, leading to use-after-free bugs. <em>Invalid frees</em> happen when a program deallocates an object that was never returned by the allocator, for example by inadvertently freeing a stack object or an address in the middle of a heap object. <em>Double frees</em> occur when a heap object is deallocated multiple times without an intervening allocation. This error may at first glance seem innocuous but, in many cases, it leads to heap corruption or program termination.</p>
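<p>To make the three free-related errors concrete, here is a minimal sketch of a toy allocator wrapper that classifies each call to free instead of blindly performing it. The names tracked_malloc and tracked_free are hypothetical, and the fixed-size table and the decision never to hand memory back are simplifications for illustration, not how any production allocator works.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum free_result { FREE_OK, FREE_DOUBLE, FREE_INVALID };

#define MAX_TRACKED 64

/* One entry per object ever handed out by tracked_malloc (hypothetical name). */
struct tracked { void *ptr; int live; };
static struct tracked table[MAX_TRACKED];
static size_t tracked_count = 0;

void *tracked_malloc(size_t size) {
    if (tracked_count == MAX_TRACKED) return NULL;  /* toy limit reached */
    void *p = malloc(size);
    if (p != NULL) {
        table[tracked_count].ptr = p;
        table[tracked_count].live = 1;
        tracked_count++;
    }
    return p;
}

/* Classifies a free instead of performing it. The memory is deliberately never
   returned to malloc, so an address can never be handed out twice. */
enum free_result tracked_free(void *p) {
    assert(tracked_count <= MAX_TRACKED);
    for (size_t i = 0; i < tracked_count; i++) {
        if (table[i].ptr == p) {
            if (!table[i].live) return FREE_DOUBLE;  /* already freed once */
            table[i].live = 0;
            return FREE_OK;
        }
    }
    /* Never returned by tracked_malloc: a stack address, an interior pointer, ... */
    return FREE_INVALID;
}
```

<p>Freeing a stack variable or an address in the middle of an object is reported as FREE_INVALID, and a second free of the same object as FREE_DOUBLE. A dangling pointer is whatever the program still holds after a free that returned FREE_OK.</p>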
<p>Other memory errors have to do with the use of allocated objects. When an object is allocated with a size that is not large enough, an <em>out-of-bound error</em> can occur when the memory address to be read or written lies outside the object. Out-of-bound writes are also known as <em>buffer overflows</em>. <em>Uninitialized reads</em> happen when a program reads memory that has never been initialized; in many cases, uninitialized memory contains data from previously-allocated objects.</p>
<h3>Airbags for Your Applications.</h3>
<p><img class="alignnone" src="http://www.sbboron.com/images/airbag.jpg" alt="" width="394" height="278" /></p>
<p>Given that we know we will be shipping software with bugs and that the terrain is dangerous, it might make sense to equip it with seatbelts and airbags. What we’d like is to have both resilience and prompt corrective action for any problem that surfaces in our deployed applications.</p>
<p>Let’s focus on C/C++/Objective-C applications—the lion’s share of applications running on servers, desktops, and mobile platforms—and memory errors, the number one headache for applications written in these languages. Safety-equipped memory allocators can play a crucial role in helping to protect your software against crashes.</p>
<h3>The Garbage Collection Safety Net.</h3>
<p><img class="alignnone" src="http://www.gulfstreamshadecovers.com/assets/images/products/safety-net1.jpg" alt="" width="360" height="239" /></p>
<p>Dealing with the first class of errors—those that happen because of the misuse of free or delete—can be remedied directly by using garbage collection. Garbage collection works by only reclaiming objects that it allocated, eliminating invalid frees. It only reclaims objects once there is no way to reach those objects anymore by traversing pointers from the “roots”: the globals and the stack. That eliminates dangling pointer errors, since by definition there can’t be any pointers around to reclaimed objects. Since it naturally only reclaims these objects once, a garbage collector also eliminates double frees.</p>
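<p>The reachability rule can be sketched in a few lines of C. This is only the mark phase of a toy collector, assuming every object has at most two outgoing pointers and that the roots are handed over as an array. The struct layout and function names are made up for illustration.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 2

/* A toy heap object: a mark bit plus the pointers it holds (assumed layout). */
struct object {
    bool marked;
    struct object *children[MAX_CHILDREN];
};

/* Marks obj and everything reachable from it by following pointers. */
static void mark(struct object *obj) {
    if (obj == NULL || obj->marked) return;  /* the marked check also stops cycles */
    obj->marked = true;
    for (size_t i = 0; i < MAX_CHILDREN; i++) mark(obj->children[i]);
}

/* Marks everything reachable from the roots (standing in for globals and the stack). */
void mark_from_roots(struct object **roots, size_t root_count) {
    assert(roots != NULL || root_count == 0);
    for (size_t i = 0; i < root_count; i++) mark(roots[i]);
}

/* Only objects the mark phase never reached may be reclaimed. */
bool is_garbage(const struct object *obj) {
    return !obj->marked;
}
```

<p>An object that points at live data but is not itself reachable from a root stays unmarked, so it is the only kind of object a collector built this way would reclaim.</p>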
</div>
<div>
<p>While C and C++ were not designed with garbage collection in mind, it is possible to plug in a “conservative” garbage collector and entirely prevent free-related errors. The word “conservative” here means that because the garbage collector doesn’t necessarily know what values are pointers (since we are in C-land), it conservatively assumes that if a value looks like a pointer (it is in the right range and properly aligned), and it acts like a pointer (it only points to valid objects), then it may be a pointer.</p>
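<p>As a rough illustration of the “looks like a pointer, acts like a pointer” test, the sketch below checks a raw word against a toy heap of fixed-size objects. The heap layout, the sizes, and the function names are assumptions made for this example. A real conservative collector uses its own heap metadata and may also accept pointers into the interior of objects.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define OBJECT_SIZE 16
#define OBJECT_COUNT 8

/* Toy heap of fixed-size objects plus a table of which ones are allocated. */
static unsigned char heap[OBJECT_SIZE * OBJECT_COUNT];
static bool in_use[OBJECT_COUNT];

void set_in_use(size_t index, bool allocated) {
    assert(index < OBJECT_COUNT);
    in_use[index] = allocated;
}

/* Conservative test: true if the word could be a pointer to a live object. */
bool may_be_pointer(uintptr_t value) {
    uintptr_t start = (uintptr_t)heap;
    uintptr_t end = start + sizeof heap;
    if (value < start || value >= end) return false;       /* not in the heap's range */
    if ((value - start) % OBJECT_SIZE != 0) return false;  /* not at an object boundary */
    return in_use[(value - start) / OBJECT_SIZE];          /* must name a valid object */
}
```

<p>Note that an ordinary integer that happens to pass all three checks is also treated as a pointer. That false positive is the price of being conservative: it can keep a dead object alive, but it never frees a live one.</p>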
<p><a href="http://www.hpl.hp.com/personal/Hans_Boehm/gc/">The Boehm-Demers-Weiser conservative garbage collector</a> is an excellent choice for this purpose: it is reasonably fast and space-efficient, and can be used to directly replace memory allocators by configuring it to treat calls to free as NOPs.</p>
<h3>Slipping Through the Net.</h3>
<p><img class="alignnone" src="http://www.svenutcke.de/images/manhole.jpg" alt="" width="384" height="256" /></p>
<p>While garbage collectors eliminate free-related errors, they cannot help prevent the second class of memory errors: those that have to do with the misuse of allocated objects such as buffer overflows.</p>
<p>Runtime systems that can find buffer overflows often impose staggeringly high overheads, making them not particularly suitable for deployed code. Tools like <a href="http://valgrind.org/info/tools.html#memcheck">Valgrind’s MemCheck</a> are incredibly comprehensive and useful, but are heavyweight by design and slow execution by orders of magnitude [7].</p>
<p>Compiler-based approaches can reduce overhead substantially by avoiding unnecessary checks, though they entail recompiling all of an application’s code, including libraries. Google has recently made available <a href="http://code.google.com/p/address-sanitizer/wiki/AddressSanitizer">AddressSanitizer</a>, a combination of compiler and runtime technology that can find a number of bugs, including overflows and use-after-free bugs. While it is much faster than Valgrind, its overhead remains relatively high (around 75%), making it primarily useful for testing.</p>
<h3>Your Program Has Encountered An Error. Goodbye, Cruel World.</h3>
<p><img class="alignnone" src="http://www.fugly.com/media/IMAGES/Random/goodbye_cruel_world.jpg" alt="" width="450" height="341" /></p>
<p>All of these approaches are based on the idea that the best thing to do upon encountering an error is to abort immediately. This fail-stop behavior is certainly desirable in testing. However, it is not usually what your users want. Most application programs are not safety-critical systems, and aborting them in midstream can be an unpleasant experience for users.</p>
<p>Suppose you have been working on a Microsoft Word document for hours (and for some mysterious reason, auto-save has not been turned on). If Microsoft Word suddenly discovers that some error has occurred, what should it do? It could just pop up the window indicating that something terrible has happened and would you like it to send a note home to Redmond. That might be the best thing to do from a debugging perspective, but most people would prefer that Word do its damndest to save the current document rather than fall on its sword if it discovers a possible error. In short, users generally would prefer that their applications be fault tolerant whenever possible.</p>
<h3>Bohr versus Heisenberg.</h3>
<p><img class="alignnone" src="http://emeryblogger.files.wordpress.com/2012/05/1.png?w=345&#038;h=300" alt="" width="345" height="300" /></p>
<p>In fact, the exact behavior users do not want is for an error to happen consistently and repeatably. In his classic 1985 article “Why do computers stop and what can be done about it”, Jim Gray drew a distinction between two kinds of bugs. The first kind are bugs that behave predictably and repeatably: they occur every time the program encounters the same inputs and goes through the same sequence of steps. These are <em>Bohr bugs</em>, by analogy with the classical atomic model where electrons circle around the nucleus in planetary-like orbits. Bohr bugs are great when debugging a program, since they make it easier to reproduce the bug and find its root cause.</p>
</div>
<div>
<p><img class="alignnone" src="http://www.simonb.com/media/uploads/2009/07/10/heisenbug.png" alt="" width="400" height="412" /></p>
<p>The second kind of bugs are <em>Heisenbugs</em>, which are unpredictable and cannot be reliably reproduced; the name is meant to connote the inherent uncertainty in quantum mechanics. The most common Heisenbugs these days are concurrency errors, a.k.a. race conditions, which depend on the order and timing of scheduling events to appear. Heisenbugs are also often sensitive to the observer effect; attempts to find the bug by inserting debugging code or running in a debugger often disrupt the sequence of events that led to the bug, making it go away.</p>
<p>Jim Gray makes the point that while Bohr bugs are great for debugging, what users want are Heisenbugs. Why? Because a Bohr bug is a showstopper for the user: every time the user does the same thing, they will encounter the same bug. But with Heisenbugs, the bugs often go away when you run the program again. If a program crashes, and the problem is a Heisenbug, then running the program again is likely to work. This is a perfect match for the way users already behave on the Web. If they go to a web page and it fails to respond, they just click “refresh” and that usually solves the problem.</p>
<p>So one way we can make life better for users is to convert Bohr bugs into Heisenbugs, if we can figure out how to do that.</p>
<h3>Defensive Driving with DieHard.</h3>
<p>My graduate students at the University of Massachusetts Amherst and I, in collaboration with my colleague Ben Zorn at Microsoft Research, have been working for the past few years on ways to protect programs from bugs. The first fruit of that research is a system called <strong><a href="http://diehard-software.org/">DieHard</a></strong> that makes memory errors less likely to impact users. DieHard eliminates some errors entirely and converts the others into (rare) Heisenbugs.</p>
<p><img class="alignright" src="http://www.rifftrax.com/files/iriffs-posters/die-hard-poster.jpg" alt="" width="240" height="343" /></p>
<p>To explain how DieHard works, let’s go back to the car analogy. One way to make it less likely for cars to crash into each other is for them to be spaced further apart, providing adequate braking distance in case something goes wrong. DieHard provides this “defensive driving” by taking over all memory management operations and allocating objects in a space larger than required.</p>
<p>This de facto padding increases the odds that a small overflow will end up in unallocated space where it can do no harm. However, DieHard doesn’t just add a fixed amount of padding between objects. That would provide great protection against overflows that are small enough, and zero protection against the others. In other words, those overflows would still be Bohr bugs.</p>
<p>Instead, DieHard provides probabilistic memory safety by randomly allocating objects on the heap. DieHard adaptively sizes its heap to be a bit larger than the maximum needed by the application; by default, it is one-third larger [2, 3]. DieHard allocates memory from increasingly large chunks that we call miniheaps.</p>
</div>
<div>
<p>By randomly allocating objects across all the miniheaps (see <a href="http://emeryblogger.files.wordpress.com/2012/05/new-heap-diagram.pdf">this diagram for a detailed view</a>), DieHard makes many memory overflows benign, with a probability that naturally declines as the overflow increases in size and as the heap becomes full. The effect is that, in most cases when running with DieHard, a small overflow is likely to have no effect.</p>
<p>DieHard’s random allocation approach also reduces the likelihood of the free-related errors that garbage collection addresses. DieHard uses bitmaps, stored outside the heap, to track allocated memory. A bit set to ’1’ indicates that a given block is in use, and ’0’ that it is available.</p>
<p>This use of bitmaps to manage memory eliminates the risk of double frees, since resetting a bit to zero twice is the same as resetting it once. Keeping the heap metadata separate from the data in the heap makes it impossible to inadvertently corrupt the heap itself.</p>
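<p>Here is a minimal sketch of the idea behind randomized, bitmap-managed allocation. It models a single fixed-size miniheap with 64 slots; this is not DieHard’s actual code. The slot size, the cap of three quarters of the slots being live, the use of rand(), and all names are assumptions chosen to keep the example short.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SLOT_SIZE 32
#define SLOT_COUNT 64
#define MAX_LIVE 48  /* keep a quarter of the slots free (arbitrary for this sketch) */

static unsigned char slots[SLOT_COUNT][SLOT_SIZE];  /* object storage only */
static uint64_t bitmap;     /* one bit per slot, stored outside the slots */
static size_t live_count;

/* Picks slots at random until it finds a free one, then sets its bit. */
void *random_alloc(void) {
    if (live_count >= MAX_LIVE) return NULL;  /* spare room keeps probing short */
    for (;;) {
        size_t i = (size_t)rand() % SLOT_COUNT;
        uint64_t bit = UINT64_C(1) << i;
        if ((bitmap & bit) == 0) {
            bitmap |= bit;
            live_count++;
            return slots[i];
        }
    }
}

/* Clears the slot's bit. Invalid and repeated frees are silently ignored. */
void random_free(void *p) {
    uintptr_t addr = (uintptr_t)p;
    uintptr_t start = (uintptr_t)slots;
    if (addr < start || addr >= start + sizeof slots) return;  /* not from this heap */
    size_t offset = (size_t)(addr - start);
    if (offset % SLOT_SIZE != 0) return;                       /* middle of an object */
    uint64_t bit = UINT64_C(1) << (offset / SLOT_SIZE);
    if (bitmap & bit) {                                        /* double free: no-op */
        assert(live_count > 0);
        bitmap &= ~bit;
        live_count--;
    }
}
```

<p>Because the only allocator state is the bitmap and a counter, neither of which lives next to the objects, an overflow that scribbles past the end of a slot can damage a neighboring object but not the allocator’s own bookkeeping.</p>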
<p>Most importantly, DieHard drastically reduces the risk of dangling pointer errors, which effectively go away. If the heap has one million freed objects, the chance that you will immediately reuse one that was just freed is literally one in a million. Contrast this with most allocators, which immediately reclaim freed objects. With DieHard, even after 10,000 reallocations, there is still a 99% chance that the dangled object will not be reused.</p>
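<p>The 99% figure can be checked with a few lines of arithmetic. The sketch below assumes a simplified model in which every allocation picks uniformly among a constant number of free slots, so the freed object survives each allocation with probability 1 - 1/F. The function name is made up for this example.</p>

```c
#include <assert.h>

/* Probability that one particular freed object is still untouched after the
   given number of allocations, each choosing uniformly among free_objects slots. */
double survival_probability(double free_objects, unsigned allocations) {
    assert(free_objects >= 1.0);
    double p = 1.0;
    for (unsigned i = 0; i < allocations; i++) {
        p *= 1.0 - 1.0 / free_objects;  /* missed by this allocation too */
    }
    return p;
}
```

<p>With one million free objects and 10,000 allocations this evaluates to roughly 0.990, in line with the claim above.</p>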
<p>Because it performs its allocation in (amortized) constant time, DieHard can provide added safety with very little additional cost in performance. For example, using it in a browser results in no perceivable performance impact.</p>
<h3>Tolerating Faults FTW with FTH.</h3>
<p>At Microsoft Research, tests with a variant of DieHard resolved about 30% of all bugs in the Microsoft Office database, while having no perceivable impact on performance. Beginning with Windows 7, Microsoft Windows <a href="http://msdn.microsoft.com/en-us/library/dd744764(v=vs.85).aspx">now ships with a Fault-Tolerant Heap (FTH)</a> that was directly inspired by DieHard. Normally, applications use the default heap, but after a program crashes more than a certain number of times, the Fault-Tolerant Heap takes over. Like DieHard, the Fault-Tolerant Heap manages heap metadata separately from the heap. It also adds padding and delays allocations, though it does not provide DieHard’s probabilistic fault tolerance because it does not randomize allocations or deallocations. The Fault-Tolerant Heap approach is especially attractive because it acts like an airbag: effectively invisible and cost-free when everything is fine, but providing protection when it is needed.</p>
<h3>Exterminating the Bugs.</h3>
<p><img class="alignright" src="http://people.cs.umass.edu/~emery/exterminator.png" alt="" width="353" height="200" /></p>
<p>Tolerating bugs is one way to improve the effective quality of deployed software. It would be even better if somehow the software could not only tolerate faults but also correct them. A follow-on to DieHard, called <strong>Exterminator</strong>, does exactly that [8, 9]. Exterminator uses a version of DieHard extended to detect errors, and uses statistical inference to compute what kind of error happened and where the error occurred. Exterminator not only can send this information back to programmers for them to repair the software, but it also automatically corrects the errors via runtime patches. For example, if it detects that a certain object was responsible for a buffer overflow of 8 bytes, it will always allocate such objects (distinguished by their call site and size) with an 8-byte pad. Exterminator can learn from the results of multiple runs or multiple users, so it could be used to proactively push out patches to prevent other users from experiencing errors it has already detected elsewhere.</p>
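<p>To give a feel for what a runtime patch could look like, here is a hypothetical patch table keyed by allocation call site and requested size. This is a sketch of the idea only. The names, the fixed-size table, and the choice to keep the largest overflow seen so far are all assumptions, and Exterminator’s real data structures are described in the papers cited above.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PATCHES 32

/* One patch: objects of this size allocated at this call site get extra bytes. */
struct patch { uintptr_t call_site; size_t size; size_t pad; };
static struct patch patches[MAX_PATCHES];
static size_t patch_count = 0;

static struct patch *find_patch(uintptr_t call_site, size_t size) {
    assert(patch_count <= MAX_PATCHES);
    for (size_t i = 0; i < patch_count; i++) {
        if (patches[i].call_site == call_site && patches[i].size == size)
            return &patches[i];
    }
    return NULL;
}

/* Records a detected overflow. Returns 0 on success, -1 if the table is full. */
int record_overflow(uintptr_t call_site, size_t size, size_t overflow_bytes) {
    struct patch *p = find_patch(call_site, size);
    if (p == NULL) {
        if (patch_count == MAX_PATCHES) return -1;
        p = &patches[patch_count++];
        p->call_site = call_site;
        p->size = size;
        p->pad = 0;
    }
    if (overflow_bytes > p->pad) p->pad = overflow_bytes;  /* keep the largest seen */
    return 0;
}

/* Size the allocator should actually reserve for a request from this call site. */
size_t patched_size(uintptr_t call_site, size_t size) {
    const struct patch *p = find_patch(call_site, size);
    return p == NULL ? size : size + p->pad;
}
```

<p>After an 8-byte overflow is recorded for 24-byte objects from a given call site, later requests from that site are served as 32 bytes, while requests from other sites or of other sizes are left alone.</p>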
</div>
<div>
<h3>The Future: Safer, Self-Repairing Software.</h3>
<p><img class="alignnone" src="http://gallery.photo.net/photo/6522423-lg.jpg" alt="" width="266" height="200" /></p>
<p>My group and others (notably Martin Rinard at MIT, Vikram Adve at Illinois, Yuanyuan Zhou at UC-San Diego, Shan Lu at Wisconsin, and Luis Ceze and Dan Grossman at Washington) have made great strides in building safety systems for other classes of errors. We have recently published work on <a href="http://dthreads.org">systems that prevent concurrency errors</a>, some of which we can eliminate automatically. <strong>Grace</strong> is a runtime system that eliminates concurrency errors for concurrent programs that use “fork-join” parallelism. It hijacks the threads library, converting threads to processes “under the hood”, and uses virtual memory mapping and protection to enforce behavior that gives the illusion of a sequential execution, even on a multicore processor [1]. <strong>Dthreads</strong> (“Deterministic Threads”) is a full replacement for the POSIX threads library that enforces deterministic execution for multithreaded code [6]. In other words, a multithreaded program running with dthreads never has races; every execution with the same inputs generates the same outputs.</p>
<p>We look forward to a day in the not too distant future when such safer runtime systems are the norm. Just as we can now barely imagine cars without their myriad of safety features, we are finally adopting a similar philosophy for software. Buggy software is inevitable, and when possible, we should deploy safety systems that reduce its impact on users.</p>
<h4>References</h4>
<ol>
<li>E. D. Berger, T. Yang, T. Liu, and G. Novark. <a href="http://doi.acm.org/10.1145/1640089.1640096">Grace: safe multithreaded programming for C/C++</a>. In S. Arora and G. T. Leavens, editors, OOPSLA, pages 81–96. ACM, 2009.</li>
<li>E. D. Berger and B. G. Zorn. <a href="http://doi.acm.org/10.1145/1133981.1134000">DieHard: Probabilistic memory safety for unsafe languages</a>. In Proceedings of the 2006 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 158–168, New York, NY, USA, 2006. ACM Press.</li>
<li>E. D. Berger and B. G. Zorn. Efficient probabilistic memory safety. Technical Report UMCS TR-2007-17, Department of Computer Science, University of Massachusetts Amherst, Mar. 2007.</li>
<li>A. Bessey, K. Block, B. Chelf, A. Chou, B. Fulton, S. Hallem, C. Henri-Gros, A. Kamsky, S. McPeak, and D. Engler. A few billion lines of code later: using static analysis to find bugs in the real world. Commun. ACM, 53(2):66–75, Feb. 2010.</li>
<li>M. Hertz and E. D. Berger. <a href="http://doi.acm.org/10.1145/1094811.1094836">Quantifying the performance of garbage collection vs. explicit memory management</a>. In OOPSLA ’05: Proceedings of the 20th annual ACM SIGPLAN conference on Object oriented programming, systems, languages, and applications, pages 313–326, New York, NY, USA, 2005. ACM Press.</li>
<li>T. Liu, C. Curtsinger, and E. D. Berger. <a href="http://doi.acm.org/10.1145/2043556.2043587">Dthreads: Efficient Deterministic Multithreading</a>. In Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles, SOSP ’11, pages 327–336, New York, NY, USA, 2011. ACM.</li>
<li>N. Nethercote and J. Seward. Valgrind: A framework for heavyweight dynamic binary instrumentation. In Proceedings of 2007 ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 89–100. ACM Press, June 2007.</li>
<li>G. Novark, E. D. Berger, and B. G. Zorn. <a href="http://doi.acm.org/10.1145/1250734.1250736">Exterminator: automatically correcting memory errors with high probability</a>. In Proceedings of the 2007 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 1–11, New York, NY, USA, 2007. ACM Press.</li>
<li>G. Novark, E. D. Berger, and B. G. Zorn. <a href="http://doi.acm.org/10.1145/1409360.1409382">Exterminator: Automatically correcting memory errors with high probability</a>. Communications of the ACM, 51(12):87–95, 2008.</li>
<li>A. Oliner, A. Ganapathi, and W. Xu. Advances and challenges in log analysis. Commun. ACM, 55(2):55–61, Feb. 2012.</li>
<li>Symantec. Internet security threat report. <a href="http://www.symantec.com/enterprise/threatreport/index.jsp">http://www.symantec.com/enterprise/threatreport/index.jsp</a>, Sept. 2006.</li>
<li>D. Yuan, H. Mai, W. Xiong, L. Tan, Y. Zhou, and S. Pasupathy. Sherlog: error diagnosis by connecting clues from run-time logs. In Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems, ASPLOS ’10, pages 143–154, New York, NY, USA, 2010. ACM.</li>
<li>D. Yuan, J. Zheng, S. Park, Y. Zhou, and S. Savage. Improving software diagnosability via log enhancement. ACM Trans. Comput. Syst., 30(1):4:1–4:28, Feb. 2012.</li>
</ol>
</div>
<br />  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/emeryblogger.wordpress.com/248/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/emeryblogger.wordpress.com/248/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=emeryblogger.com&#038;blog=9784426&#038;post=248&#038;subd=emeryblogger&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://emeryblogger.com/2012/05/31/software-needs-seatbelts-and-airbags/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:thumbnail url="http://emeryblogger.files.wordpress.com/2012/05/airbag.jpg?w=150" />
		<media:content url="http://emeryblogger.files.wordpress.com/2012/05/airbag.jpg?w=150" medium="image">
			<media:title type="html">airbag</media:title>
		</media:content>

		<media:content url="http://2.gravatar.com/avatar/ef7468f0b37dca98b539f321225ee583?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif&#38;r=G" medium="image">
			<media:title type="html">emeryberger</media:title>
		</media:content>

		<media:content url="http://www.danzfamily.com/archives/blogphotos/06/403-death-and-taxes.gif" medium="image" />

		<media:content url="http://www.kombuchakamp.com/wp/wp-content/uploads/2010/04/panacea.jpg" medium="image" />

		<media:content url="http://emeryblogger.files.wordpress.com/2012/05/toolbox.jpg?w=300" medium="image" />

		<media:content url="http://emeryblogger.files.wordpress.com/2012/05/testing-code-is-for-wimps-real-men-test-in-production.jpg?w=259" medium="image">
			<media:title type="html">testing-code-is-for-wimps-real-men-test-in-production</media:title>
		</media:content>

		<media:content url="http://emeryblogger.files.wordpress.com/2012/05/riskybusiness.jpg?w=300" medium="image" />

		<media:content url="http://images.nationalgeographic.com/wpf/media-live/photos/000/064/cache/color-earthrise_6429_600x450.jpg" medium="image" />

		<media:content url="http://www.eastsidepatch.com/wp-content/uploads/2010/02/capt-kirk_yellowCU-001_1196284873.jpg" medium="image" />

		<media:content url="http://www.automotive-stock-images.com/photos/cadillac-series-62-convertible-1959-rear-tailfins.jpg" medium="image" />

		<media:content url="http://farm3.staticflickr.com/2202/2060704683_315929dddd_b.jpg" medium="image" />

		<media:content url="http://www.sbboron.com/images/airbag.jpg" medium="image" />

		<media:content url="http://www.gulfstreamshadecovers.com/assets/images/products/safety-net1.jpg" medium="image" />

		<media:content url="http://www.svenutcke.de/images/manhole.jpg" medium="image" />

		<media:content url="http://www.fugly.com/media/IMAGES/Random/goodbye_cruel_world.jpg" medium="image" />

		<media:content url="http://emeryblogger.files.wordpress.com/2012/05/1.png?w=300" medium="image" />

		<media:content url="http://www.simonb.com/media/uploads/2009/07/10/heisenbug.png" medium="image" />

		<media:content url="http://www.rifftrax.com/files/iriffs-posters/die-hard-poster.jpg" medium="image" />

		<media:content url="http://people.cs.umass.edu/~emery/exterminator.png" medium="image" />

		<media:content url="http://gallery.photo.net/photo/6522423-lg.jpg" medium="image" />
	</item>
		<item>
		<title>Doppio (JVM in JavaScript) @ Strange Loop.</title>
		<link>http://emeryblogger.com/2012/05/14/doppio-jvm-in-javascript-strange-loop/</link>
		<comments>http://emeryblogger.com/2012/05/14/doppio-jvm-in-javascript-strange-loop/#comments</comments>
		<pubDate>Tue, 15 May 2012 00:08:13 +0000</pubDate>
		<dc:creator>emeryberger</dc:creator>
				<category><![CDATA[Java]]></category>
		<category><![CDATA[JavaScript]]></category>
		<category><![CDATA[Programming Languages]]></category>
		<category><![CDATA[jvm specification]]></category>
		<category><![CDATA[software-development]]></category>
		<category><![CDATA[strange loop]]></category>

		<guid isPermaLink="false">http://emeryblogger.com/?p=242</guid>
		<description><![CDATA[Strange Loop will be featuring a talk on Doppio, the JVM in Javascript (course project for my grad class gone wild). https://thestrangeloop.com/sessions/doppio-building-a-jvm-in-the-browser Doppio: Building a JVM in the Browser Modern browsers provide sandboxed versions of many native [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=emeryblogger.com&#038;blog=9784426&#038;post=242&#038;subd=emeryblogger&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p style="text-align:justify;"><a href="http://emeryblogger.com/2012/05/14/doppio-jvm-in-javascript-strange-loop/screen-shot-2012-12-07-at-2-35-51-pm/#main" rel="attachment wp-att-307"><img class="size-full wp-image-307 alignleft" alt="Screen Shot 2012-12-07 at 2.35.51 PM" src="http://emeryblogger.files.wordpress.com/2012/05/screen-shot-2012-12-07-at-2-35-51-pm.png?w=470"   /></a>Strange Loop will be featuring a talk on <a href="http://int3.github.com/doppio/">Doppio</a>, the JVM in Javascript (<a href="http://plasma.cs.umass.edu/emery/grad-systems-project-1">course project for my grad class</a> gone wild).</p>
<p><a href="https://thestrangeloop.com/sessions/doppio-building-a-jvm-in-the-browser" target="_blank" rel="nofollow nofollow">https://thestrangeloop.com/sessions/doppio-building-a-jvm-in-the-browser</a></p>
<blockquote>
<h3>Doppio: Building a JVM in the Browser</h3>
<p>Modern browsers provide sandboxed versions of many native system interfaces, such as graphics rendering and a filesystem. Thus in theory we should be able to replicate most of the desktop experience on the web — except for the fact that a great deal of applications are not written in Javascript. Enabling the browser to run other languages would add whole classes of applications to the web platform.</p>
<p>Doppio is an effort to bring the JVM languages to the web by implementing a JVM in Coffeescript. While the JVM specification is technically language-agnostic, the original JVM is written in C and C++, and its architecture reflects that. We’ll discuss some of the challenges of implementing the spec and porting the libraries to a high-level, non-systems language, particularly when it does not expose threads directly. We’ll also talk about how NodeJS was invaluable for development.</p></blockquote>
<br />  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/emeryblogger.wordpress.com/242/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/emeryblogger.wordpress.com/242/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=emeryblogger.com&#038;blog=9784426&#038;post=242&#038;subd=emeryblogger&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://emeryblogger.com/2012/05/14/doppio-jvm-in-javascript-strange-loop/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/ef7468f0b37dca98b539f321225ee583?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif&#38;r=G" medium="image">
			<media:title type="html">emeryberger</media:title>
		</media:content>

		<media:content url="http://emeryblogger.files.wordpress.com/2012/05/screen-shot-2012-12-07-at-2-35-51-pm.png" medium="image">
			<media:title type="html">Screen Shot 2012-12-07 at 2.35.51 PM</media:title>
		</media:content>
	</item>
		<item>
		<title>Internet Hacker Danger!</title>
		<link>http://emeryblogger.com/2012/05/08/internet-hacker-danger/</link>
		<comments>http://emeryblogger.com/2012/05/08/internet-hacker-danger/#comments</comments>
		<pubDate>Tue, 08 May 2012 15:35:57 +0000</pubDate>
		<dc:creator>emeryberger</dc:creator>
				<category><![CDATA[Security]]></category>

		<guid isPermaLink="false">http://emeryblogger.com/?p=237</guid>
		<description><![CDATA[Maybe not that scary, but probably should check your computer just in case: me on WGBY explaining about the DNS Changer Trojan Horse: UMass Professor Emery Berger discusses recent reports of [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=emeryblogger.com&#038;blog=9784426&#038;post=237&#038;subd=emeryblogger&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>Maybe not <em>that </em>scary, but probably should check your computer just in case: <a href="http://vimeo.com/41774896">me on WGBY</a> explaining about the <a href="http://www.dcwg.org/">DNS Changer Trojan Horse</a>:</p>
<blockquote><p><a href="https://vimeo.com/41774896"><img class="alignright  wp-image-362" alt="Screen Shot 2013-01-05 at 5.49.50 PM" src="http://emeryblogger.files.wordpress.com/2012/05/screen-shot-2013-01-05-at-5-49-50-pm.png?w=233&#038;h=133" width="233" height="133" /></a>UMass Professor Emery Berger discusses recent reports of internet hacking that could leave thousands of Americans with “infected” internet connections and lead to internet disconnection this July. How can you know if your computer is affected?</p></blockquote>
<br />  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/emeryblogger.wordpress.com/237/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/emeryblogger.wordpress.com/237/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=emeryblogger.com&#038;blog=9784426&#038;post=237&#038;subd=emeryblogger&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://emeryblogger.com/2012/05/08/internet-hacker-danger/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="http://emeryblogger.files.wordpress.com/2012/05/screen-shot-2013-01-05-at-5-49-50-pm.png?w=150" />
		<media:content url="http://emeryblogger.files.wordpress.com/2012/05/screen-shot-2013-01-05-at-5-49-50-pm.png?w=150" medium="image">
			<media:title type="html">Screen Shot 2013-01-05 at 5.49.50 PM</media:title>
		</media:content>

		<media:content url="http://2.gravatar.com/avatar/ef7468f0b37dca98b539f321225ee583?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif&#38;r=G" medium="image">
			<media:title type="html">emeryberger</media:title>
		</media:content>

		<media:content url="http://emeryblogger.files.wordpress.com/2012/05/screen-shot-2013-01-05-at-5-49-50-pm.png" medium="image">
			<media:title type="html">Screen Shot 2013-01-05 at 5.49.50 PM</media:title>
		</media:content>
	</item>
		<item>
		<title>Doppio: run JVM bytecodes in your browser without any plug-ins</title>
		<link>http://emeryblogger.com/2012/04/30/doppio-run-jvm-bytecodes-in-your-browser-without-any-plug-ins/</link>
		<comments>http://emeryblogger.com/2012/04/30/doppio-run-jvm-bytecodes-in-your-browser-without-any-plug-ins/#comments</comments>
		<pubDate>Mon, 30 Apr 2012 19:37:15 +0000</pubDate>
		<dc:creator>emeryberger</dc:creator>
				<category><![CDATA[Java]]></category>
		<category><![CDATA[JavaScript]]></category>
		<category><![CDATA[Programming Languages]]></category>

		<guid isPermaLink="false">http://emeryblogger.com/2012/04/30/doppio-run-jvm-bytecodes-in-your-browser-without-any-plug-ins/</guid>
		<description><![CDATA[Doppio: Java on Coffeescript &#8211; run JVM bytecodes in your browser without any plug-ins. Best of this year&#8217;s crop: a project from my graduate systems class. Runs Rhino and javac!<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=emeryblogger.com&#038;blog=9784426&#038;post=235&#038;subd=emeryblogger&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p><a href="http://int3.github.com/doppio/">Doppio: Java on Coffeescript</a> &#8211; run JVM bytecodes in your browser <a href="http://int3.github.com/doppio/about.html">without any plug-ins</a>. Best of this year&#8217;s crop: a project from my <a href="http://plasma.cs.umass.edu/emery/grad-systems">graduate systems class</a>. Runs Rhino and javac!</p>
<br />  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/emeryblogger.wordpress.com/235/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/emeryblogger.wordpress.com/235/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=emeryblogger.com&#038;blog=9784426&#038;post=235&#038;subd=emeryblogger&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://emeryblogger.com/2012/04/30/doppio-run-jvm-bytecodes-in-your-browser-without-any-plug-ins/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="http://emeryblogger.files.wordpress.com/2012/05/screen-shot-2012-12-07-at-2-35-51-pm.png?w=93" />
		<media:content url="http://emeryblogger.files.wordpress.com/2012/05/screen-shot-2012-12-07-at-2-35-51-pm.png?w=93" medium="image">
			<media:title type="html">Screen Shot 2012-12-07 at 2.35.51 PM</media:title>
		</media:content>

		<media:content url="http://2.gravatar.com/avatar/ef7468f0b37dca98b539f321225ee583?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif&#38;r=G" medium="image">
			<media:title type="html">emeryberger</media:title>
		</media:content>
	</item>
		<item>
		<title>Me on PBS: &#8220;Cyberterrorism&#8221; Interview</title>
		<link>http://emeryblogger.com/2012/02/29/cyberterrorism-interview-on-pbs/</link>
		<comments>http://emeryblogger.com/2012/02/29/cyberterrorism-interview-on-pbs/#comments</comments>
		<pubDate>Wed, 29 Feb 2012 19:00:55 +0000</pubDate>
		<dc:creator>emeryberger</dc:creator>
				<category><![CDATA[Security]]></category>

		<guid isPermaLink="false">http://emeryblogger.com/?p=228</guid>
		<description><![CDATA[Me on WGBY: &#8220;cyberterrorism&#8221;, DoS attacks, botnets, SOPA + my not at all cynical take on politicians and the legislative process!]]></description>
				<content:encoded><![CDATA[<p><a title="Emery Berger interview on WGBY (Connecting Point)" href="https://vimeo.com/37667495">Me on WGBY</a>: &#8220;cyberterrorism&#8221;, DoS attacks, botnets, SOPA + my not at all cynical take on politicians and the legislative process!</p>
]]></content:encoded>
			<wfw:commentRss>http://emeryblogger.com/2012/02/29/cyberterrorism-interview-on-pbs/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/ef7468f0b37dca98b539f321225ee583?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif&#38;r=G" medium="image">
			<media:title type="html">emeryberger</media:title>
		</media:content>
	</item>
		<item>
		<title>Take a Stand for Double-Blind Reviewing!</title>
		<link>http://emeryblogger.com/2012/02/01/take-a-stand-for-double-blind-reviewing/</link>
		<comments>http://emeryblogger.com/2012/02/01/take-a-stand-for-double-blind-reviewing/#comments</comments>
		<pubDate>Thu, 02 Feb 2012 01:46:48 +0000</pubDate>
		<dc:creator>emeryberger</dc:creator>
				<category><![CDATA[Academia]]></category>

		<guid isPermaLink="false">http://emeryblogger.com/?p=223</guid>
		<description><![CDATA[I have long been a proponent of double-blind reviewing. People suffer from expectation bias, and double-blind reviewing is a tried and true approach to combat it. I adopted double-blind reviewing when [...]]]></description>
				<content:encoded><![CDATA[<p>I have long been a proponent of double-blind reviewing. People suffer from <a href="http://en.wikipedia.org/wiki/Expectation_bias">expectation bias</a>, and double-blind reviewing is a tried and true approach to combat it. I adopted double-blind reviewing when I co-chaired VEE 2010 and just recently for <a href="http://plasma.cs.umass.edu/emery/wodet-3">WoDet 3</a>, and have decided to take a stand to sway more program committees to implement it. Join me!</p>
<p><strong>When asked to serve on a PC, agree only if double-blind reviewing is used.</strong></p>
<p>This approach doesn&#8217;t always work, but the fact is that most program chairs simply had not considered it and are happy to adopt it. My advisor <a href="http://www.cs.utexas.edu/~mckinley/">Kathryn McKinley</a>&#8217;s <a href="http://www.cs.utexas.edu/users/mckinley/notes/blind.html">case for double-blind</a> and <a href="http://www.cs.umd.edu/~mwh/">Mike Hicks</a>&#8217; <a href="http://www.cs.umd.edu/~mwh/dbr-faq.html">fantastic FAQ</a> on the topic make excellent ammunition. I suggested it to <a href="http://www.cs.cmu.edu/~tcm/">Todd Mowry</a> and he implemented it for ASPLOS 2011; <a href="http://www.cse.ohio-state.edu/~saday/">P. Sadayappan</a> did the same for PPoPP 2012 (I am grateful to both for their patience!).</p>
<p>But there has been some backsliding; double-blind reviewing is <em>not</em> going to be used for POPL 2013, despite the overwhelmingly positive response of the POPL 2012 committee members.</p>
<p>So the next time you get asked to serve on a PC, at least bring it up. Let&#8217;s help make this a standard practice across our community.</p>
]]></content:encoded>
			<wfw:commentRss>http://emeryblogger.com/2012/02/01/take-a-stand-for-double-blind-reviewing/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
	
		<media:thumbnail url="http://emeryblogger.files.wordpress.com/2012/02/blind-justice.jpg?w=87" />
		<media:content url="http://emeryblogger.files.wordpress.com/2012/02/blind-justice.jpg?w=87" medium="image">
			<media:title type="html">blind-justice</media:title>
		</media:content>

		<media:content url="http://2.gravatar.com/avatar/ef7468f0b37dca98b539f321225ee583?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif&#38;r=G" medium="image">
			<media:title type="html">emeryberger</media:title>
		</media:content>
	</item>
	</channel>
</rss>
