False sharing is an insidious problem for multithreaded programs running on multicore processors, where it can silently degrade performance and scalability. Previous tools for detecting false sharing are severely limited: they cannot distinguish false sharing from true sharing, have high false positive rates, and provide limited assistance to help programmers locate and resolve false sharing.
This paper presents two tools that attack the problem of false sharing: SHERIFF-DETECT and SHERIFF-PROTECT. Both tools leverage a framework we introduce here called SHERIFF. SHERIFF breaks out threads into separate processes, and exposes an API that allows programs to perform per-thread memory isolation and tracking on a per-page basis. We believe SHERIFF is of independent interest.
SHERIFF-DETECT ﬁnds instances of false sharing by comparing updates within the same cache lines by different threads, and uses sampling to rank them by performance impact. SHERIFF-DETECT is precise (no false positives), runs with low overhead (on average, 20%), and is accurate, pinpointing the exact objects involved in false sharing. We present a case study demonstrating SHERIFF-DETECT’s effectiveness at locating false sharing in a variety of benchmarks.
Rewriting a program to ﬁx false sharing can be infeasible when source is unavailable, or undesirable when padding objects would unacceptably increase memory consumption or further worsen runtime performance. SHERIFF-PROTECT mitigates false sharing by adaptively isolating shared updates from different threads into separate physical addresses, effectively eliminating most of the performance impact of false sharing. We show that SHERIFF-PROTECT can improve performance for programs with catastrophic false sharing by up to 9x, without programmer intervention.