
powershell: reducing processing time (niche case)

why the caveat? it's important to note that my savings are based on switching out just one simple little thing. there's no magic here. there's no fountain of knowledge. those accolades are for the likes of snover and wilson.

the synopsis is simple. i was asked to create a very specific user list. the specifications were such that i had to consider custom objects to store the information. here are the requirements:
  • must be a csv formatted file
  • must have headers that match a specified string
  • must contain columns even if the value is empty
  • must decode the manager dn to the manager's employee id
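a minimal sketch of how requirements like these could be met with select-object and calculated properties. the header names, the sample users, and the $managerIds lookup table are all made up for illustration; this is not the script described in this post.

```powershell
# hypothetical sample data; a real run would pull these from the directory
$users = @(
    (New-Object psobject -Property @{ samaccountname = 'jdoe';   employeeid = '1001'; manager = 'cn=boss,dc=mydomain,dc=com' }),
    (New-Object psobject -Property @{ samaccountname = 'asmith'; employeeid = '1002'; manager = $null })
)

# hypothetical stand-in for the manager dn -> employee id lookup
$managerIds = @{ 'cn=boss,dc=mydomain,dc=com' = '9001' }

# calculated properties set the exact header text and the column order
$columns = @(
    @{ Name = 'User ID';             Expression = { $_.samaccountname } },
    @{ Name = 'Employee ID';         Expression = { $_.employeeid } },
    @{ Name = 'Manager Employee ID'; Expression = { if ($_.manager) { $managerIds[$_.manager] } } }
)

# every row gets every column, even when the value is empty
$rows = $users | Select-Object -Property $columns
$csv  = $rows | ConvertTo-Csv -NoTypeInformation
```

piping $rows to export-csv -notypeinformation instead would write the same text to a file.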
after spending a little time getting the formatting right, i realized that performance was just terrible. i admit i created it in the laziest way possible. i mean, that is what scripting is about, right? saving time?

for processing a thousand users and creating a thousand custom objects, it was okay since the span of time was relatively short. when i raised it to 30000ish, the performance issue became evident. i did the most logical thing... i consulted the expansive, ever-reaching power of the internet and found an assortment of suggestions for speeding up custom objects -- from the select method to the hashtable method to out-of-my-range complicated c# methods.

i tried a few different things over the next few days when i had time. i broke up the collection of users into smaller chunks, streamlined my ldap filters, tried rearranging things... none of them impacted the performance -- at all. i even removed the custom object requirement entirely (or so i thought since piping to select-object is a way to create custom objects).

i finally spent some time trying to understand where else i could be having performance hits. i narrowed it down to one other place -- the conversion of the manager dn to the manager employee id. some background: active directory stores a user's manager as a forward link to the manager's user object. this means all you have to do is follow the link. so in essence, once i know the manager value of a user, i can just query for the manager object and retrieve the employee id of that object. easy! unfortunately, each time i did this, it would take a few ticks for it to come back.
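to put a number on a per-lookup cost like this, measure-command can time a batch of calls. a rough sketch only: get-manageridstub is a hypothetical stand-in that just sleeps, and the dn is invented, so the real lookup would need to be swapped in to get a meaningful figure.

```powershell
# hypothetical stand-in for the real directory lookup
function Get-ManagerIdStub {
    param([string]$ManagerDn)
    Start-Sleep -Milliseconds 20   # simulate a slow round trip
    return '12345'
}

$calls = 10

# time the whole batch, then average it out per call
$elapsed = Measure-Command {
    1..$calls | ForEach-Object {
        Get-ManagerIdStub -ManagerDn 'cn=some manager,dc=mydomain,dc=com' | Out-Null
    }
}

$perCallMs = $elapsed.TotalMilliseconds / $calls
'{0:N1} ms per lookup' -f $perCallMs
```

multiplying the per-call figure by the number of users gives a rough estimate of the total run time before committing to the full run.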

i don't know for certain how many "a few" is. in desperation, i blamed my old desktop and sought out something more powerful, a performance-purpose desktop with 4 cores and 16gb of ram. i tried running it there. i kicked it off around lunch time. i came in the next morning and checked to see how it was doing. still running. finally, after another hour or so, it stopped, presenting these cheery results:

Days              : 0
Hours             : 18
Minutes           : 40
Seconds           : 20
Milliseconds      : 306
Ticks             : 672203068902
TotalDays         : 0.778012811229167
TotalHours        : 18.6723074695
TotalMinutes      : 1120.33844817
TotalSeconds      : 67220.3068902
TotalMilliseconds : 67220306.8902
i had been pondering the idea of switching out what i was using for the ldap lookups to something else to see if the cmdlet itself was the problem. i forgot to check it more often than i remembered (if that's possible, otherwise reverse what i said). well, after the results above, i was finally at the place where i couldn't look away. i had no more excuses or distractions. after searching around for all of 47 seconds, i found the information i needed, switched out the call, and ran it. i would periodically look over to check on it, so when i realized it was done before my lunch break ended, i was amazed. results:
Days              : 0
Hours             : 0
Minutes           : 19
Seconds           : 56
Milliseconds      : 468
Ticks             : 11964681595
TotalDays         : 0.0138480111053241
TotalHours        : 0.332352266527778
TotalMinutes      : 19.9411359916667
TotalSeconds      : 1196.4681595
TotalMilliseconds : 1196468.1595
yeah! that's right. i cut the execution time by a factor of about 56 (roughly 98% less time). :) i think it's also worth noting that the method that took 18 hours also used just about all available ram on my old desktop (beyond 12gb on the performance machine) and at least 30-40% cpu the entire time. so what was it i switched out, you ask? watch your wordwrap...
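a quick way to check the size of that speedup, using the Ticks values from the two results above. the variable names are only for illustration.

```powershell
# run times rebuilt from the Ticks values in the two result listings
$before = [timespan]::FromTicks(672203068902)   # about 18.7 hours
$after  = [timespan]::FromTicks(11964681595)    # about 19.9 minutes

# how many times faster the second run was
$speedup = $before.TotalSeconds / $after.TotalSeconds

# how much of the original run time was eliminated, as a percentage
$reduction = (1 - $after.TotalSeconds / $before.TotalSeconds) * 100

'{0:N1}x faster, {1:N1}% less time' -f $speedup, $reduction
```

the ratio comes out to roughly 56, which is a reduction in run time of roughly 98%.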

the original:
(get-qaduser $_.manager -searchroot "dc=mydomain,dc=com" -DontUseDefaultIncludedProperties -includedproperties employeeid).employeeid

the new:

that is my very long-winded way of saying that get-qaduser was the culprit. it's not that it's bad. it's great when you're pulling objects in one fell swoop. calling it repeatedly to go after objects one at a time proved inordinately slow. in this case, using adsi directly won out -- in a big way.
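the snippet under "the new:" did not survive on this page. purely as a hedged illustration, a direct adsi lookup of this kind might look something like the sketch below. get-manageremployeeid is a hypothetical helper name, and this is an assumption about the general shape of such a call, not the original line.

```powershell
# sketch only: binds straight to the manager object by its dn through the
# [adsi] type accelerator instead of running a search cmdlet per user
function Get-ManagerEmployeeId {
    param([string]$ManagerDn)

    # users without a manager have nothing to look up
    if (-not $ManagerDn) { return $null }

    $manager = [adsi]"LDAP://$ManagerDn"

    # adsi hands back a value collection; interpolating it yields the text
    return "$($manager.employeeID)"
}

# hypothetical usage inside the per-user pipeline:
#   Get-ManagerEmployeeId -ManagerDn $_.manager
```

binding by dn needs a windows machine that can reach the domain, so the lookup itself cannot be exercised anywhere else.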

