Background – The Problem
I have been playing a lot with Splunk recently. If you’re not familiar with the product, it is a horizontally scalable database that leverages map-reduce to give you real-time analytics about your data. That’s probably a topic for another day, but the relevant part is that they have an agent that can run on a Windows desktop called the Universal Forwarder. They also have a PowerShell SDK that lets you send data to Splunk via PowerShell. Again, the details of these topics should be saved for another day. The topic for today is that in order to send data from my system with any regularity, I run into a well-known problem with PowerShell performance: starting powershell.exe takes longer than I’d like, and it incurs a bit of a CPU hit. Both are unacceptable to me if I’m going to run these scripts on every desktop in my enterprise. This is especially true when you consider that a good proportion of those will be virtual desktops sharing CPU with each other.
The Solution
I’ve been thinking about this problem a lot, and I have a trimmed-down script-only version of my proposed solution. The technique is not that hard to follow. The first step is to create a PowerShell script that will run indefinitely. The script has the following requirements:
- It should read through a directory for scripts. If a script exists, it should execute it in its current runspace.
- The order in which the scripts are entered in the queue directory matters, i.e., the first script in should be the first script run.
- After every script is run, it should remove the variables that were created by the script and then call [gc]::Collect() to ensure that memory does not become unmanageable. This is the magic part of the script. For many, this post may be worth this snippet alone :) You can use this technique any time your PowerShell session is using more RAM than you’d like.
- It should allow an initialization script to run so that you can load any global variables that should not be deleted or modules that should stay loaded in the session.
- It should sleep for a configurable number of seconds between runs.
- Parameters should consist of the queue directory name, the initialization script, and the delay between retries in the loop.
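The variable-cleanup requirement is worth seeing in isolation, because it is useful in any long-lived PowerShell session. A minimal sketch of the technique (the `$baseline` and `$bigData` names are just for illustration):

```powershell
# Snapshot the variables that exist before any work runs
$baseline = dir variable: | select -ExpandProperty Name

# ... run an arbitrary script in this runspace; suppose it creates a large variable ...
$bigData = 1..1000000

# Remove anything that wasn't in the baseline, then force garbage collection
foreach ($name in (dir variable: | select -ExpandProperty Name)) {
    if ($baseline -notcontains $name -and 'baseline','name' -notcontains $name) {
        Remove-Variable $name -ErrorAction SilentlyContinue
    }
}
[gc]::Collect()
```

Note that the snapshot variable itself (and the loop variable) must be excluded from removal, which is exactly why the full script below maintains an explicit exclusion list.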
The Script
The end result is a script called longrunning.ps1 (for lack of any thought put into the name) that looks like this:
param(
    [Parameter(Mandatory=$true,Position=0)]
    [ValidateScript({Test-Path $_ -PathType Container})]
    [string] $QueueDirectory,

    [Parameter(Mandatory=$false)]
    [ValidateScript({Test-Path $_ -PathType Leaf})]
    [string] $InitializationScript,

    [Parameter(Mandatory=$false)]
    [int] $SleepSeconds = 15
)

if ($InitializationScript) {
    Write-Verbose "Dot sourcing $InitializationScript"
    . $InitializationScript
}

Write-Verbose "Capturing the list of variables in the session so they are not removed between executions"
$vars = dir variable: | select -ExpandProperty Name
# There are a few variables that get set in this script, and a few others that will be seen when called as a script
$vars += ('args','input','MyInvocation','PSBoundParameters','PSDebugContext','file','vars','files','foreach')

# Enter the infinite loop
while ($true) {
    $files = dir $QueueDirectory -File -Filter *.ps1 | sort LastWriteTime
    if ($files) {
        foreach ($file in $files) {
            Write-Verbose ('Reading {0}' -f $file.FullName)
            # ReadAllText closes its file handle immediately, so the file can be deleted right away
            $content = [System.IO.File]::ReadAllText($file.FullName)
            Write-Verbose ('Executing {0}' -f $file.FullName)
            Invoke-Expression $content
            # Delete each script as soon as it has run so it cannot run twice
            Write-Verbose ('Deleting {0}' -f $file.FullName)
            del $file.FullName
        }
        # Remove any variables the queued scripts created, then force garbage collection
        $newvars = dir variable: | select -ExpandProperty Name
        foreach ($var in $newvars) {
            if ($vars -notcontains $var) {
                Write-Verbose ('Removing ${0}' -f $var)
                Remove-Variable $var
            }
        }
        Write-Verbose 'Garbage Collection'
        [gc]::Collect()
    }
    else {
        sleep $SleepSeconds
    }
}
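With the loop running, queuing work is just a file copy; the loop picks scripts up in LastWriteTime order. A minimal sketch (send-splunkdata.ps1 is a hypothetical script; the queue path matches the one in my scheduled task):

```powershell
# Enqueue a script for the long-running session to execute on its next pass
Copy-Item C:\scripts\send-splunkdata.ps1 -Destination D:\DropBox\scripts\longrunning\queue
```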
The SCHTASKS job
Here’s an export of the schtasks xml I am using to ensure that it runs constantly. I even have it set to restart every 24 hours, but that may not be necessary.
<?xml version="1.0" encoding="UTF-16"?>
<Task version="1.2" xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task">
  <RegistrationInfo>
    <Date>2012-06-25T16:30:49.0052527</Date>
    <Author>TOENUFF\Administrator</Author>
  </RegistrationInfo>
  <Triggers>
    <CalendarTrigger>
      <Repetition>
        <Interval>PT5M</Interval>
        <StopAtDurationEnd>false</StopAtDurationEnd>
      </Repetition>
      <StartBoundary>2012-06-25T16:26:37.0340001</StartBoundary>
      <ExecutionTimeLimit>P1D</ExecutionTimeLimit>
      <Enabled>true</Enabled>
      <ScheduleByDay>
        <DaysInterval>1</DaysInterval>
      </ScheduleByDay>
    </CalendarTrigger>
  </Triggers>
  <Principals>
    <Principal id="Author">
      <UserId>TOENUFF\Administrator</UserId>
      <LogonType>S4U</LogonType>
      <RunLevel>HighestAvailable</RunLevel>
    </Principal>
  </Principals>
  <Settings>
    <MultipleInstancesPolicy>IgnoreNew</MultipleInstancesPolicy>
    <DisallowStartIfOnBatteries>false</DisallowStartIfOnBatteries>
    <StopIfGoingOnBatteries>true</StopIfGoingOnBatteries>
    <AllowHardTerminate>true</AllowHardTerminate>
    <StartWhenAvailable>false</StartWhenAvailable>
    <RunOnlyIfNetworkAvailable>false</RunOnlyIfNetworkAvailable>
    <IdleSettings>
      <StopOnIdleEnd>true</StopOnIdleEnd>
      <RestartOnIdle>false</RestartOnIdle>
    </IdleSettings>
    <AllowStartOnDemand>true</AllowStartOnDemand>
    <Enabled>true</Enabled>
    <Hidden>false</Hidden>
    <RunOnlyIfIdle>false</RunOnlyIfIdle>
    <WakeToRun>false</WakeToRun>
    <ExecutionTimeLimit>PT0S</ExecutionTimeLimit>
    <Priority>7</Priority>
    <RestartOnFailure>
      <Interval>PT15M</Interval>
      <Count>4</Count>
    </RestartOnFailure>
  </Settings>
  <Actions Context="Author">
    <Exec>
      <Command>powershell.exe</Command>
      <Arguments>-windowstyle hidden -file D:\DropBox\scripts\longrunning\longrunning.ps1 -queuedirectory d:\dropbox\scripts\longrunning\queue -InitializationScript d:\DropBox\scripts\longrunning\init.ps1</Arguments>
    </Exec>
  </Actions>
</Task>
You can load the above by running
schtasks /create /xml d:\pathtoabovexml.xml
Controlling What Gets Run
Finally, to control when things are run, we obviously cannot rely on PowerShell because we’ll be introducing the overhead we are trying to avoid. Instead you can use schtasks again to copy your scripts into the queue directory at the intervals you expect them to run. Mind you, this does not ensure that the script runs at the specified time. It only ensures that it is scheduled to run. Alternatively, you could copy files directly into the directory from some remote server that controls what is run, but for my purposes the schtasks solution is fine.
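As a sketch of that second scheduled task, something like the following would copy a script into the queue every 15 minutes (the task name, script name, and source path are illustrative; the queue path is the one from my setup):

```
schtasks /create /tn "Queue-SendSplunkData" /sc minute /mo 15 /tr "cmd /c copy C:\scripts\send-splunkdata.ps1 D:\DropBox\scripts\longrunning\queue"
```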
Discussion
I mentioned at the beginning of this post that this is a script-only interpretation of my solution. I originally wanted to create this as a C# compiled service that created a PowerShell runspace and managed it nearly exactly the way I’m doing it in the script. The truth is that so far the technique I’m using seems to be extremely reliable. I’m sure I’ll hit snags along the way, but for now the technique is sound and the problem is solved. Whether I’ll propose this as a production solution is TBD, but I’m happy to see my dream realized.
