O R G A N I C / F E R T I L I Z E R: 05.08

May 29, 2008

how to audit maintenance mode in mom 2005

what i've discovered is that there are tons of postings, documentation, scripts and such on how to put a machine into maintenance mode -- and bring it back out -- by ui, a cmd shell tool, a script, etc.  however, there's really not much describing how to find out how a machine got into maintenance mode in the first place.

it's actually very simple because mom likes to log the crap out of everything.  there are four relevant event ids that you should be aware of.  once you know them, you can create your own event views or reports to take a look at this data.

the first set of ids is logged when maintenance mode is set from the console:

  • 10015 - maintenance mode start (details are in the parameters tab)
  • 10016 - maintenance mode stop

the second set of ids is logged when maintenance mode is set by the cmd line tool or anything else that accesses the sdk (the sms client, for example):

  • 22153 - maintenance mode start request (details are in the description)
  • 22154 - maintenance mode stop request
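
if you want to fold those four ids into a report, a lookup table is about all it takes.  here's a minimal python sketch; the function name and the tuple layout are made up for illustration, only the ids and their meanings come from the lists above.

```python
# event ids from the two lists above, mapped to (where it was set, what happened)
MAINTENANCE_MODE_EVENTS = {
    10015: ("console", "start"),
    10016: ("console", "stop"),
    22153: ("cmd line tool / sdk", "start request"),
    22154: ("cmd line tool / sdk", "stop request"),
}


def describe_maintenance_event(event_id):
    """Return (origin, action) for a maintenance mode event id, or None for anything else."""
    return MAINTENANCE_MODE_EVENTS.get(event_id)
```

anything that isn't one of the four ids comes back as None, so it's easy to skip unrelated events.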

more details?  here's the knowledge base article.

May 12, 2008

monitoring event storms in mom 2005 ...

as an avid mom administrator, i'm sure you check your total # of events per day, right?  this is really the only way to know if events are firing off in a manner that's out of control or heading that way.  some events aren't set to alert when they're picked up, so there's a high likelihood that this could be happening in your environment without your knowledge.

if you're old-school, you can walk in every morning, get your cup of coffee, fire up a t-sql query program and execute something like this:

SELECT COUNT(*) AS 'Event Count'
FROM SDKEventView
WHERE TimeGenerated > GetDate()-1

a successful execution will bring back the total number of events from the last 24 hours.  i suppose if that's all you need, you can stop reading right here and go back to the way you were.  but if you're a lazy, little squirrel then keep reading.
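
one thing worth knowing about GetDate()-1: subtracting 1 from a datetime in t-sql moves it back exactly one day, so the filter is a rolling 24-hour window that ends the moment you run the query, not the previous calendar day.  the same arithmetic in python, as a rough sketch (the function name is hypothetical):

```python
from datetime import datetime, timedelta


def rolling_window_start(now):
    """Mimic GetDate()-1: the cutoff is exactly 24 hours before 'now'."""
    return now - timedelta(days=1)


# run at 08:30 on may 12 and the cutoff lands at 08:30 on may 11
cutoff = rolling_window_start(datetime(2008, 5, 12, 8, 30))
print(cutoff)  # 2008-05-11 08:30:00
```

so if you run it at a different time each morning, the windows won't line up from day to day.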

still reading, aren't you? :)  since this method can be a monotonous pain in the ass, try something a little more automated.  you have mom for crying out loud.  make it work.  instead of walking in every morning and doing this ourselves, wouldn't it be so much more productive to have mom do it for us?  with that in mind, all we need is a script.  mom will handle the rest. 

for that, here's a script that does the exact same thing as above with an added feature.  it will take the event count, throw it into the performance data stream and write it to the database as a performance counter.  we can do all kinds of things with it in that format.  to start off... here's the script:

Dim objConn
Dim objRS

Set objConn=CreateObject("ADODB.Connection")
Set objRS = CreateObject("ADODB.Recordset")

sDataBase = ScriptContext.Parameters.Get("DataBaseName")
sMgmtGroup = ScriptContext.Parameters.Get("ManagementGroup")

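'Build the connection string (the DataBaseName parameter is used as the sql server name)
objConn.ConnectionString = _
     "Driver={SQL Server};" & _
     "Server=" & sDataBase & ";" & _
     "Database=Onepoint;" & _
     "Trusted_Connection=yes;" 'assumption: windows authentication under the account the script runs as

'Open the database connection
objConn.Open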

'Define the SQL query
strSQLQuery =       "SELECT COUNT(*) AS 'Event Count' " & _
                       "FROM SDKEventView " & _
                       "WHERE TimeGenerated > GetDate()-1"

'Open the recordset
objRS.Open strSQLQuery, objConn ', adOpenForwardOnly, adLockReadOnly, adCmdText

iTotalEvents = objRS.Fields(0)

'Close up the objects
objRS.Close
objConn.Close
Set objRS = Nothing
Set objConn = Nothing

CreatePerfData "MOM",sMgmtGroup,"Total Events",iTotalEvents

Sub CreatePerfData(strObjectName,strInstanceName,strCounterName,numValue)
    Set objPerfData = ScriptContext.CreatePerfData
    objPerfData.ObjectName = strObjectName
    objPerfData.InstanceName = strInstanceName
    objPerfData.CounterName = strCounterName
    objPerfData.Value = numValue
    ScriptContext.Submit objPerfData
End Sub

it's not going to work if you just copy and paste it into mom.  you'll have to create a script in the "scripts" container (no, really?) and add some parameters.  if you look at the script, you'll see the parameters you need.  they're the names passed to ScriptContext.Parameters.Get.

  • DataBaseName
  • ManagementGroup

once you've got the script in there, create a rule group, attach it to your mom management server computer group and create a timed provider to execute the script.  i set mine to synchronize every 24 hours at 00:01 just to make sure that it executes around the same time every day.  for the responses, add the script you created above with the parameters filled out to match your environment.

once the script executions begin, you'll be able to graph the performance counter of your total events.  it should come in like this:

  • performance object name: MOM
  • performance instance: [ManagementGroup]
  • performance counter: Total Events
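
once a few days of that counter have been collected, spotting a storm is just a matter of comparing the newest value against what's normal.  a rough python sketch of the idea; the function name, the sample numbers and the 2x threshold are all made up, so tune them to your own environment:

```python
def looks_like_event_storm(daily_totals, factor=2.0):
    """Flag the newest daily total if it exceeds factor x the average of the earlier days."""
    if len(daily_totals) < 2:
        return False  # not enough history to compare against
    *history, latest = daily_totals
    baseline = sum(history) / len(history)
    return latest > factor * baseline


# hypothetical daily "Total Events" values, oldest first
print(looks_like_event_storm([41000, 39500, 40200, 97000]))  # True
print(looks_like_event_storm([41000, 39500, 40200, 43000]))  # False
```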

when you graph it out, you'll be able to see the results of your labor: