
Critical Section Contention

If you've done any sort of multi-threaded programming on Windows, chances are you've worked with CRITICAL_SECTIONs. They are lightweight and effective. However, if your critical sections are causing too much contention, they might be the cause of serious performance problems. While working on Network Identity Manager, I was curious about how all of its hundreds of critical sections are doing. Network Identity Manager is a multi-threaded plug-in based application that is often prone to contention during certain operations. But how bad is it?

Critical section contention is a fact of life; arbitrating between competing threads is exactly what critical sections are for. However, each time it happens, the application may pay a fairly hefty performance penalty, because the losing thread has to enter a wait state instead of acquiring the critical section immediately. We are only interested in keeping contention in check. As such, we would like to know which critical sections are experiencing the highest contention levels so we can evaluate design changes that may help alleviate it.

A thorough treatment of the semantics of a critical section is beyond the scope of this post. In addition to MSDN, I found this article quite informative. MSDN doesn't discuss the internals, but the article does. The information we are looking for isn't in the RTL_CRITICAL_SECTION structure, but in the RTL_CRITICAL_SECTION_DEBUG structure. Each CRITICAL_SECTION object has an associated RTL_CRITICAL_SECTION_DEBUG structure that is allocated during the InitializeCriticalSection() call. Among several other bits of information contained therein is the ContentionCount field:

EntryCount/ContentionCount These fields are both incremented at the same time and for the same reason. It is the number of threads that have entered a wait state because they could not immediately acquire the critical section. Unlike the LockCount and RecursionCount fields, these fields are never decremented.

(From here.  Note the disclaimer at the top of the linked article.)
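
To make the bookkeeping concrete, here is a minimal sketch of a lock that counts contention the way the quoted description says ContentionCount does. It is a toy written in Python purely for illustration, not how Windows implements critical sections; the names CountingLock and demo are made up for this example.

```python
import threading
import time


class CountingLock:
    """Toy lock that mimics the ContentionCount bookkeeping (illustration only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._count_guard = threading.Lock()
        self.contention_count = 0  # only ever incremented, never decremented

    def acquire(self):
        # Fast path: take the lock immediately if nobody holds it.
        if self._lock.acquire(blocking=False):
            return
        # Slow path: the lock is held, so this thread is about to wait.
        # Record the contention first, then block until the lock is free.
        with self._count_guard:
            self.contention_count += 1
        self._lock.acquire()

    def release(self):
        self._lock.release()


def demo(workers=3):
    """Hold the lock while `workers` threads pile up on it; return the count."""
    lock = CountingLock()
    lock.acquire()  # uncontended, so the counter stays at zero

    def worker():
        lock.acquire()
        lock.release()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    # Wait until every worker has failed the fast path and been counted.
    while lock.contention_count < workers:
        time.sleep(0.001)
    lock.release()
    for t in threads:
        t.join()
    return lock.contention_count


if __name__ == "__main__":
    print("contention count:", demo(3))
```

Only the threads that could not get the lock right away are counted, and the counter never goes back down, which is why the value depends on how long the application has been running.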

Now that we have this figured out, what we want is a way to traverse the critical section list for our process and dump just the ones that have a high contention count. This is where WinDbg comes in. It is my debugger of choice. Not really because of how user-friendly it is (it's not), but because it is lightweight and functional. On my not-so-new Windows XP box, it takes about a minute for Visual Studio to start, while WinDbg starts in a snap and is ready to debug pretty much as soon as I finish typing its name. Performance aside, it is noticeably less usable than Visual Studio. WinDbg doesn't pull any punches, even for your smallest typos.

Anyway, back to the task at hand...

Using the information gathered from the aforementioned article (which, as the disclaimer explains, is not to be relied upon, as it is subject to change), we will begin at ntdll!RtlCriticalSectionList and work our way down the list of critical sections, examining each and dumping those that match our criteria (high contention count). That's it! The procedure is as follows:

  1. Run the application under WinDbg. Actually, this step isn't strictly necessary, since the accounting that goes into maintaining the contention counts happens anyway. We can let the application run outside the debugger and then attach the debugger when we are ready to start examining the critical sections.
  2. Execute our script to traverse the critical section list.
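
The traversal itself is just a walk over a circular linked list with some offset arithmetic. The sketch below simulates it in Python against a fake memory map. It is not the WinDbg script and does not touch a real process. The offsets (+0x004, +0x008, +0x014) are read off the 32-bit Windows XP structure dump shown later in this post and may differ on other versions; the list head address, the two extra records and their counts are made up for illustration.

```python
# Offsets within RTL_CRITICAL_SECTION_DEBUG, taken from the 32-bit XP dump
# in this post. Treat them as assumptions for any other Windows version.
CRITICAL_SECTION_OFFSET = 0x004
PROCESS_LOCKS_LIST_OFFSET = 0x008
CONTENTION_COUNT_OFFSET = 0x014


def build_fake_memory(list_head, records):
    """Build a dict of address -> 32-bit value holding a circular list.

    records is a list of (debug_addr, critsec_addr, contention_count).
    Only the Flink pointers are modelled; Blink is not needed for the walk.
    """
    memory = {}
    entries = [list_head] + [d + PROCESS_LOCKS_LIST_OFFSET for d, _, _ in records]
    for i, addr in enumerate(entries):
        memory[addr] = entries[(i + 1) % len(entries)]  # Flink
    for debug, critsec, count in records:
        memory[debug + CRITICAL_SECTION_OFFSET] = critsec
        memory[debug + CONTENTION_COUNT_OFFSET] = count
    return memory


def find_contended(memory, list_head, threshold=20):
    """Walk the list and return records whose count exceeds the threshold."""
    hits = []
    entry = memory[list_head]  # first Flink
    while entry != list_head:  # the list is circular, so stop at the head
        # The list entry is embedded in the debug structure, so step back
        # to the start of the structure before reading its fields.
        debug = entry - PROCESS_LOCKS_LIST_OFFSET
        count = memory[debug + CONTENTION_COUNT_OFFSET]
        if count > threshold:
            hits.append((debug, memory[debug + CRITICAL_SECTION_OFFSET], count))
        entry = memory[entry]  # follow Flink to the next entry
    return hits


# Hypothetical stand-in for the address of ntdll!RtlCriticalSectionList.
LIST_HEAD = 0x7C97C0C8
RECORDS = [
    (0x00190FE0, 0x10050000, 3),     # made-up neighbour, low contention
    (0x00192FE0, 0x1004A6A0, 0x40),  # the cs_ident record from this post
    (0x00194FE0, 0x10051000, 21),    # made-up neighbour, just over the default
]
memory = build_fake_memory(LIST_HEAD, RECORDS)

if __name__ == "__main__":
    for debug, critsec, count in find_contended(memory, LIST_HEAD):
        print(f"debug={debug:#010x} critsec={critsec:#010x} contention={count}")
```

The default threshold of 20 mirrors the script's default, and the comparison is strict: a record is reported only when its count is higher than the threshold.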

What script, you ask? The full script is given at the end of this post. You can copy and paste it into a file somewhere on disk and invoke it like so:

> $$>a< c:\path-to-script\CritSecContention.windbg [threshold]


The [threshold] value is what gets compared with the ContentionCount field. If the contention count is higher than the threshold, the critical section and its initialization stack trace (if stack traces are enabled) are displayed.  If you don’t supply a threshold, a default value is used.  Currently this is 20, though it is best to try out different values and choose what works best for your application.



Why the ugly syntax? I really would like to know why WinDbg made invocation of scripts have such an irritating syntax. The $$>a< command means that we are invoking the script with arguments, because for some reason that has to have a completely different command from when you want to invoke a script without arguments.  This funky command also suppresses the echoing of every command in the script and allows the script to have multi-line commands.  It manages this by coalescing the whole program into a single command block before running it, whereas the plain $< and $$< commands run the file line by line and echo each command as it executes.  (Scripts are called command programs in the WinDbg documentation.)  Woe betide those who try to use single-line comments (the “* comment”) in a coalesced command program: once everything is on one line, the comment swallows whatever follows it.  Whoever came up with this really needs to take a look at gdb.



After the initial shock of the abhorrent syntax wears off, you’ll notice that the output looks like this:



-------------------------------------------------
struct _RTL_CRITICAL_SECTION_DEBUG * 0x00192fe0
+0x000 Type : 0
+0x002 CreatorBackTraceIndex : 0x9e
+0x004 CriticalSection : 0x1004a6a0 _RTL_CRITICAL_SECTION
+0x008 ProcessLocksList : _LIST_ENTRY [ 0x194fe8 - 0x190fe8 ]
+0x010 EntryCount : 0x40
+0x014 ContentionCount : 0x40
+0x018 Spare : [2] 0xc0c0c0c0
-----------------------------------------
Critical section = 0x1004a6a0 (nidmgr32!cs_ident+0x0)
DebugInfo = 0x00192fe0
NOT LOCKED
LockSemaphore = 0x3C4
SpinCount = 0x00000000


Stack trace for DebugInfo = 0x00192fe0:

0x7c911583: ntdll!RtlInitializeCriticalSectionAndSpinCount+0xC9
0x7c91162c: ntdll!RtlInitializeCriticalSection+0xF
0x7c809f9f: kernel32!InitializeCriticalSection+0xE
0x1001b2ae: nidmgr32!kcdbint_ident_init+0xE
0x1001ec89: nidmgr32!kcdb_init+0x29
0x1001ee00: nidmgr32!kcdb_process_attach+0x10
0x10001032: nidmgr32!DllMain+0x32
0x100378ad: nidmgr32!__DllMainCRTStartup+0xCD
0x100377d1: nidmgr32!_DllMainCRTStartup+0x21
0x7c90118a: ntdll!LdrpCallInitRoutine+0x14
0x7c91c4fa: ntdll!LdrpRunInitializeRoutines+0x344
0x7c9211b4: ntdll!LdrpInitializeProcess+0x1131
0x7c9210af: ntdll!_LdrpInitialize+0x183
0x7c90e457: ntdll!KiUserApcDispatcher+0x7


As you can see from this particular example, the script found that cs_ident has a contention count of 0x40 (64). This may be perfectly normal or quite excessive. It all depends on the load and how long you've run the application. In this case, the contention counts weren't that bad and my worries were unfounded.
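
The counts are printed in hexadecimal, which makes a long dump tedious to compare by eye. If you save the debugger output to a text file, a few lines of Python can pull out the counts and rank them. This is a rough sketch that assumes the output looks exactly like the sample above; the function name rank_contention is made up, and the sample text is trimmed from the dump in this post.

```python
import re

# Patterns for the two lines we care about in each record of the dump.
COUNT_RE = re.compile(r"ContentionCount\s*:\s*(0x[0-9a-fA-F]+)")
NAME_RE = re.compile(r"Critical section\s*=\s*0x[0-9a-fA-F]+\s*\(([^)]+)\)")


def rank_contention(log_text):
    """Return (symbol, contention_count) pairs, highest count first."""
    results = []
    pending = None  # count seen in the structure dump, waiting for its name
    for line in log_text.splitlines():
        match = COUNT_RE.search(line)
        if match:
            pending = int(match.group(1), 16)  # counts are printed in hex
            continue
        match = NAME_RE.search(line)
        if match and pending is not None:
            results.append((match.group(1), pending))
            pending = None
    return sorted(results, key=lambda item: item[1], reverse=True)


SAMPLE = """\
struct _RTL_CRITICAL_SECTION_DEBUG * 0x00192fe0
+0x010 EntryCount : 0x40
+0x014 ContentionCount : 0x40
-----------------------------------------
Critical section = 0x1004a6a0 (nidmgr32!cs_ident+0x0)
DebugInfo = 0x00192fe0
"""

if __name__ == "__main__":
    for symbol, count in rank_contention(SAMPLE):
        print(f"{count:6d}  {symbol}")
```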



One thing to note is that for the script to work, symbols for ntdll.dll must be present. Otherwise the script won't be able to find the beginning of the per-process critical section list.  Look here for information on how you can configure WinDbg to automatically download debug symbol files.



Happy debugging!



Source code for CritSecContention.windbg follows:

