Wednesday 18 July 2012

Investigating issues with locks

Tadammm

The last post in my "Windbg bible" series :) Let's finish it and then I'll go for something new and more complicated.
What does "lock" mean? In simple words, I call it a "lock" situation when several threads request the same resource (an object, a file) while that resource is already locked by some other thread. While this thread holds the lock and works on the resource, the other threads cannot touch it and just spend their time waiting.
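To make the definition above concrete, here is a minimal Python sketch of the same situation (not the .NET code from the real cases). The names `resource_lock`, `worker` and `run_demo` are hypothetical and exist only for this illustration: several threads ask for one lock, one of them gets it, and the rest just wait.

```python
import threading
import time

resource_lock = threading.Lock()   # hypothetical guard for one shared resource
waited = {}                        # thread name -> seconds spent waiting for the lock

def worker(name, hold_seconds):
    start = time.monotonic()
    with resource_lock:            # blocks while another thread holds the lock
        waited[name] = time.monotonic() - start
        time.sleep(hold_seconds)   # simulate work on the locked resource

def run_demo(thread_count=3, hold_seconds=0.2):
    """Start several threads that all want the same lock and report their waits."""
    waited.clear()
    threads = [
        threading.Thread(target=worker, args=(f"worker-{i}", hold_seconds))
        for i in range(thread_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return dict(waited)
```

Calling `run_demo()` returns the waiting time per thread: the first thread to take the lock waits almost nothing, while every later thread waits for all the holders in front of it.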

What are the symptoms of a lock issue?

Monday 16 July 2012

Investigating High Processor Usage issues

If you only knew how much time it took me to write this post! First of all, it was hard to start, as I sometimes get a bit bored writing about things I have done many times already. On top of that, the first version of the post was washed away by an incorrectly closed browser. Oh no!.. Let's start again...

Today we are talking about heavy CPU load. How does it all begin? Usually a customer complains that their solution suddenly started to eat 100% CPU. Actually, it doesn't have to be 100%: any value higher than the standard CPU consumption for this process counts. Requests are served slowly or not at all.
How to gather dumps in these situations is described here. But often it is even better to gather dumps simply by using the context menu of the w3wp process in Task Manager.

Wednesday 11 July 2012

Investigating issues with Unmanaged Memory. First steps.


I want to write this blog post really fast, before I forget everything, even though I have absolutely no time and need to work on other cases :( Let's count this as part of my work, okay?

Prologue.

You can scroll down to scene 1 if you want to concentrate on the concrete steps.

This case had a long story. It started with OOM troubles in a 32-bit process, when the 32-bit mode was considered the main limitation for a large solution. BTW, my troubles with opening dumps at that point resulted in this blog post, if you remember it.
After the switch to 64-bit, the solution ran smoothly for some time, and we even closed the support case. Suddenly (oh, I love this word!) our poor process had eaten 66 GB of memory, which is unbelievably huge. Actually, it was all of the memory available: 16 GB of RAM and 50 GB of page file.


The customer managed to gather a dump when the process memory usage was at something like 6 GB. I was almost sure that the solution was struggling because of this bad guy, the Lucene analyzer. Luckily, we have a script that counts all the data structures related to this issue (and also counts cache sizes), so I thought I would just run this script on the dump and confirm my assumptions.


I was a bit shocked when the script showed only 140 MB of Lucene-related things in memory. Cache sizes were in a healthy range. So this was something new. I loaded our good old friend WinDBG and started analyzing memory as usual.

Wednesday 4 July 2012

Investigating dumps with exception


A dump that contains a concrete exception is useful in the case of "Sometimes"-issues. A "Sometimes"-issue is an issue whose description starts with one of the words: sometimes, rarely, suddenly etc. Do you recall some such cases, and how often they are closed as not reproducible? A memory dump is your friend here :)

For any exception I would suggest gathering a dump the same way as for an OOM exception, which is described here. Do not forget to load the needed modules and symbols, and after that you are ready to start the investigation.


Actually, a dump with a concrete exception is the simplest situation.

Tuesday 3 July 2012

Pocket-size vocabulary


In this post I'm going to keep some useful WinDBG commands and tips. Simple and short :) I'll update it from time to time.


  • .time shows the date and time when the dump was created.
  • Opening a new dump. If you need to open another dump in the same WinDBG window, choose Debug -> Stop Debugging, or press Shift+F5.
  • The question mark (?) command converts between hexadecimal and decimal:
0:082> ?1e74
Evaluate expression: 7796 = 00001e74
0:082> ?0n7796
Evaluate expression: 7796 = 00001e74

0n is the prefix for decimal numbers
0x is the prefix for hexadecimal numbers
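If you want to double-check such a conversion outside the debugger, the same arithmetic can be sketched in Python. This is only an illustration, not how WinDBG works internally: the helper name `evaluate` is hypothetical, it handles a single non-negative literal only, and it assumes that a number without a prefix is hexadecimal and that the result is printed with 8 hex digits, as in the session above.

```python
def evaluate(expression: str) -> str:
    """Mimic the shape of the `?` output for one numeric literal."""
    text = expression.strip().lower()
    if text.startswith("0n"):
        value = int(text[2:], 10)   # 0n prefix: decimal
    elif text.startswith("0x"):
        value = int(text[2:], 16)   # 0x prefix: hexadecimal
    else:
        value = int(text, 16)       # assumption: no prefix means hexadecimal
    return f"Evaluate expression: {value} = {value:08x}"
```

For example, `evaluate("1e74")` and `evaluate("0n7796")` both return the line `Evaluate expression: 7796 = 00001e74`.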
  • There is an autocompletion option in WinDBG: you can start typing !ana, then press Tab, and it will be completed to !analyze
  • .chain is a command that lists the currently loaded extensions.
  • .unload is a command that unloads one of the currently loaded extensions, for example to replace it with another version of the same extension. Make sure you pass the full name of the extension as a parameter, exactly in the form shown in the .chain output.
  • !sam c:\temp exports all DLLs from the current dump to the location you specify as a parameter. Note that this command comes from the psscor DLLs, so if you have loaded sos instead of psscor, it won't work.


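  • .logopen /t c:\temp\Output.txt - use this if you want to save the output of your next commands to a file. The /t option appends the process ID and the current date and time to the file name.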

Types of "I need your dump"-issues

So, you have installed WinDBG, you know how to create a dump (one, two and in the middle of three), how to load it into WinDBG and how to load the necessary modules and symbols. What's next?

There are lots of different commands to work with WinDBG, and at first you can feel lost in all this information. What should be your next step?
[Image] That's how I see the dump investigation process :)

First of all, you need to categorize the problem that you have. I divide "dump"-issues into several categories:

1. Exception. Simply an exception.

2. High processor load. This means that your w3wp process takes up to 100% of processor time. From my point of view, any number higher than 20% requires attention.

3. Memory issues. Out of memory exceptions and generally high memory usage. Additional ideas.

4. Locks. Problems with locks appear when you have several threads trying to access a resource that is locked by another thread.
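The four categories above can be written down as a tiny triage helper. This is only a sketch to show the idea, not a real diagnostic tool: the `Symptoms` fields and the `categorize` function are hypothetical, the 20% threshold is the rule of thumb from the list, and treating "requests hang while CPU stays low" as a sign of locks is my simplifying assumption.

```python
from dataclasses import dataclass
from typing import List

CPU_ATTENTION_PERCENT = 20   # rule of thumb from the list above

@dataclass
class Symptoms:              # hypothetical container, not part of any tool
    exception_logged: bool = False
    cpu_percent: float = 0.0
    out_of_memory: bool = False
    memory_growing: bool = False
    requests_hanging: bool = False

def categorize(s: Symptoms) -> List[str]:
    """Map observed symptoms to the dump-issue categories."""
    categories = []
    # OOM exceptions are counted as memory issues, not as plain exceptions
    if s.exception_logged and not s.out_of_memory:
        categories.append("Exception")
    if s.cpu_percent > CPU_ATTENTION_PERCENT:
        categories.append("High processor load")
    if s.out_of_memory or s.memory_growing:
        categories.append("Memory issues")
    # assumption: hanging requests with low CPU hint at threads waiting on locks
    if s.requests_hanging and s.cpu_percent <= CPU_ATTENTION_PERCENT:
        categories.append("Locks")
    return categories
```

For instance, `categorize(Symptoms(cpu_percent=95))` gives `["High processor load"]`. A real case can of course land in more than one category at once.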




For each of these problems I'm going to create a separate post and add links to this kind of "Contents" page... Somebody push me, pleeeeease :)

UPD. Finished, yeeeey :)