health service handle count threshold for exchange mp

another admin pointed out something very odd with this particular monitor.  apparently, the monitor has some overrides that change the threshold in certain scenarios.  to start, the monitor description:

This monitor ensures that the "Process\Handle Count" counter for the HealthService.exe process does not exceed a set threshold over a series of consecutive samples.  If the conditions are met this monitor will change to a critical state, which will then roll up to the "Health Service State" monitor.  The "Health Service State" monitor is configured to run a recovery when its state is critical, which will automatically attempt to restart the Health Service.

basically once you breach this number, the health service restarts.  this is typically a good thing since you’re keeping it maintained.  now, flip to the overrides.
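to make the "consecutive samples" part concrete, here's a minimal sketch in python.  it is not the monitor's actual implementation.  the function name, the sample values, and the number of consecutive samples (3) are all assumptions for illustration, since the description above doesn't state how many samples are required.

```python
from typing import Iterable


def is_critical(samples: Iterable[int], threshold: int, consecutive: int) -> bool:
    """Return True once `consecutive` samples in a row exceed `threshold`."""
    streak = 0
    for value in samples:
        # a sample at or below the threshold resets the streak
        streak = streak + 1 if value > threshold else 0
        if streak >= consecutive:
            return True
    return False


# hypothetical handle counts sampled from HealthService.exe
handle_counts = [5200, 6100, 6300, 5900, 6400, 6500, 6700]

# assumed: 3 consecutive samples over the threshold flips the monitor to critical
print(is_critical(handle_counts, threshold=6000, consecutive=3))
```

a single spike over the threshold does nothing here.  only a sustained run of high samples trips the state change, which is what would roll up and kick off the restart recovery.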

[screenshot: overrides summary for the monitor]

notice that there’s an exchange 2007 computer group override where the value is 5000.  try to edit this override.  you should get a similar screen.

[screenshot: override properties for the monitor]

notice how the value of 5000 doesn’t show up here.  interesting that it would even be set at 5000 since 6000 would seem a better rounded number for most agents.  so why would the exchange 2007 computer group want a lower threshold?  mysterious…

not really -- if you know the history.  turns out at one point the threshold was set to some wacky low number.  i don’t have a back rev environment to go pull the actual value.  let’s just say it was 200.  with this value in place, the exchange mp couldn’t reliably operate in large-scale environments because the health service kept restarting.  the override comes from the exchange mp, forcing the threshold to a much higher, more realistic value.

this makes complete sense except the value is lower than what is shown in the screenshot above, right?  actually … the default of 6000 was introduced in the latest operations manager 2007 core mp, which was released after the exchange mp.
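the timeline is easier to see as a tiny python sketch.  this is a simplified model, not how operations manager actually resolves overrides (real precedence has more rules than "group override beats default").  the function name is hypothetical, and 200 is the stand-in value from earlier in this post, not the real historical default.

```python
from typing import Optional


def effective_threshold(default: int, group_override: Optional[int]) -> int:
    """Simplified model: a group override, when present, replaces the default."""
    return group_override if group_override is not None else default


EXCHANGE_GROUP_OVERRIDE = 5000  # shipped with the exchange mp

# (label, monitor default) pairs; 200 is the stand-in guess from this post
core_mp_versions = [
    ("older core mp", 200),
    ("latest core mp", 6000),
]

for label, default in core_mp_versions:
    exchange = effective_threshold(default, EXCHANGE_GROUP_OVERRIDE)
    everyone_else = effective_threshold(default, None)
    print(f"{label}: exchange group={exchange}, other agents={everyone_else}")
```

against the older default the override raises the threshold for the exchange group.  against the newer default of 6000 the very same override ends up lowering it, which is the oddity in the overrides view.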

oh by the way, you’ll see this same behavior in the health service private bytes threshold monitor.  (thanks guys!)
