
Tag Archives: PowerShell

More PowerShell work today…

I’ve been working with the output from a Ruby script that we use to access Solr indexes for customer data. As I wrote previously, I’ve been able to use the ISE to create command lines that feed job names to this Ruby script and determine whether each job was successfully ingested into Solr when it was first processed.

Using the ISE, I was able to assemble scripts – well, I say scripts; they were really just lists of command lines – but it still saved a lot of work to use the PowerShell ISE’s command pane to write script lines into the script pane above, then save the script pane as a PowerShell script on the server where the commands needed to be run.

But I want more.

Today, I started work on scripting a tool to take a specified day (which would default to the current day) and call the Ruby script itself for all the jobs processed on the specified day. I started with a script to look up just one job (the job name was hardcoded into the script) while I figured out how to invoke the Ruby script, get the output from the Ruby script, and search the output for the information I needed.
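
The day parameter will probably end up looking something like this (just a sketch – the parameter name is my own placeholder, though the yyyyMMdd format matches what the Ruby command lines below use):

    param(
        # day to query, formatted the way the Ruby script expects;
        # defaults to today when no day is specified
        [string]$Day = (Get-Date -Format 'yyyyMMdd')
    )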

I discovered that the PowerShell tool I needed to capture the output of the Ruby script was Out-String.

Specifically, the heart of the new functionality involves a script line like this:

$result = jruby.exe PARAM1 PARAM2 PARAM3 "20160923 <job#>" | Out-String

The jruby.exe portion of the line (everything before the pipe) is the same command-line format as the individual lines of the query scripts I’ve been writing. Piping to Out-String captures the jruby output and introduces it into the PowerShell pipeline, and assigning the result to $result routes that pipeline output into a variable where I can search for the number of hits.

I then wrapped the essential parts of this code inside a Get-ChildItem | ForEach-Object loop. Instead of having the Ruby command line as a line of the script, I built up a string expression and used Invoke-Expression to call it:

Get-ChildItem <local\path\to\job\data\directories> | ForEach-Object {
    $expression = 'jruby.exe PARAM1 PARAM2 PARAM3 ' +
        '"20160923 ' + $_.PSChildName.ToString() + '"' + "`n"
    $result = Invoke-Expression $expression | Out-String
    <code to parse $result and extract the number of Solr hits>
}

I added the extracted hits data to a variable called $Body, which I could use after the main part of the script was done as the -Body of an email to be sent through our processing mail server.

I also created an empty array before starting the Get-ChildItem | ForEach-Object cycle. Inside the cycle, whenever I found a line of output that showed 0 bytes returned from Solr, I added the associated job number to a list of $zeroByteJobs, then at the end used the job numbers to find the information needed for file of Solr addonly command lines, that I then assembled into another script that I ran separately to add those jobs to Solr.
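
Stitched together, the loop ends up looking roughly like this – the "0 bytes" match string is a stand-in for whatever the Ruby script actually prints when Solr returns nothing:

    $Body = ""
    $zeroByteJobs = @()
    Get-ChildItem <local\path\to\job\data\directories> | ForEach-Object {
        $job = $_.PSChildName.ToString()
        $expression = 'jruby.exe PARAM1 PARAM2 PARAM3 "20160923 ' + $job + '"'
        $result = Invoke-Expression $expression | Out-String

        # scan the captured output line by line
        foreach ($line in ($result -split "`n")) {
            if ($line -match "Number of hits found") { $Body += $job + ": " + $line + "`n" }
            if ($line -match "0 bytes")              { $zeroByteJobs += $job }
        }
    }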

I do have one really nagging problem – the Ruby script burps out a recommendation to use ansicon to improve the Ruby command-line experience. This output was not captured by the $result = … | Out-String combination. Thinking about it after I got home, though, it occurs to me that it might be generated as an error, and if so the way to deal with it will be to redirect it using PowerShell error stream redirection.
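
If that hunch is right, redirecting the error stream should either silence the nag or fold it into the captured output – something like this (untested as of this writing):

    # discard anything jruby writes to stderr
    $result = jruby.exe PARAM1 PARAM2 PARAM3 "20160923 <job#>" 2>$null | Out-String

    # or merge stderr into the captured output instead of discarding it
    $result = jruby.exe PARAM1 PARAM2 PARAM3 "20160923 <job#>" 2>&1 | Out-String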

 

A couple of days ago, I talked about using the ISE command line to create script lines in the active script tab.

My process was:

  1. Switch to the email program,
  2. Copy the job name or full line out of the error email,
  3. Switch back to the PowerShell ISE, and
  4. Up-Arrow to create each new line

That was four steps – FOR EACH LINE.

That’s WAY too much like work. If only I could get the entire list at once…

Well, actually, I can.

The unique job name that I copied out of the error emails for each line is also the name used for the directory containing that job’s processed files on our processing server. I can build the query using the name of the directory, instead of copying each problem job name out of the error email.

But how do I know which directories contain jobs that generated an error email?

Actually, it’s better not to know. We’re better off running queries against ALL the jobs – that way, we can catch all the jobs that failed to get ingested into the full-text indexing/search app we use, even if the respective error email goes missing.

So to create a script that runs a query against every job processed in a day:

a. Start the PowerShell ISE – if the current script (file) tab isn’t empty, create a New one
b. In the ISE command line pane:
PS P:\> $PCE = $PSISE.CurrentFile.Editor
PS P:\> $solr_cl = 'jruby solring.rb Cust-37 PROD hits MEMBERS "20160916 '
PS P:\>   # $solr_cl is the command line for the bulk of the Solr query –
PS P:\>   # oh, and it’s not quite the query that I use – that’s proprietary info, of course
PS P:\> $PCE.Text += Get-ChildItem <path-to-data>\2016\201609\20160916 |
ForEach-Object { $solr_cl + $_.PSChildName.ToString() + '"' + "`n"}
PS P:\>

So instead of taking four steps for each line of the query script, I can create the entire query script using just three steps. PowerShell FTW!

But That’s Not All! You Also Get…

Sure, it’s great that I no longer have to copy lines out of individual emails to create a hits query script for all the jobs processed in a given day, but what about creating the addonly query to ingest the missing jobs into Solr?

Well, as it happens, our Solr services were down on 9/16, so all the jobs failed and needed to be added. Also as it happened, there were 121 jobs processed on 9/16, so the hits query was 121 lines long.

The MEMBERS parameter in the Solr command line for the hits query corresponds to a MEMBERS.TXT file that contains web menu lines used by our web app. Each line of an addonly query uses the same format as the lines in the MEMBERS.TXT file.

So, to create the addonly query, I opened a New (blank) script file tab in the ISE, then entered:

PS P:\> $addonly_cl = 'jruby solring.rb Cust-37 PROD addonly MEMBERS '
PS P:\> Get-Content <path_to>\MEMBERS.TXT |
Select-Object -Last 121 |
ForEach-Object {$PCE.Text += $addonly_cl + $_ + "`n"}
PS P:\> # names have been changed to protect proprietary information

But Wait! There’s More!

So I’ve cut down the process of creating these Solr scripts from 4 steps per line to 2 or 3 steps for the whole script – but what about the output?

Previously, I was watching the results of each query, and logging each result in our trouble ticket system. I built in a 15-second delay (using Start-Sleep 15) for each line, and bounced back and forth between our trouble ticket system (in a web browser on my PC’s desktop) and the processing server where I had to run the query.

Again, that’s WAY too much like work.

The results of each hits or addonly query are logged to individual text files in a directory on the processing server. This directory is not shared – however, the processing server can send email using our processing mail server.

So:

  • I connected to the processing server (using Remote Desktop),
  • ran the hits query (to verify that all the jobs needed to be ingested using the addonly query)
  • ran the addonly query and watched as each job seemed to be added successfully, then
  • ran the hits query again (starting at 5:45 PM) to verify successful ingestion

I then used PowerShell to create and send an email report of the results:

PS C:\> $SMTPurl = "<URL of processing email server>"
PS C:\> $To = "<me@myjob.com>", "<someone_else@myjob.com>"
PS C:\> $From = "<our_group_email_address@myjob.com>"
PS C:\> $Subject = "Cust-37 hits for 20160916 <job#1>..<job#121> (after addonly)"
PS C:\> $Body = ""
PS C:\> $i = 0
PS C:\> Get-ChildItem <path-to-log-files>\solr_20160916*.log |
Where-Object{ $_.LastWriteTime -gt (Get-Date "9/16/2016 5:45 PM") } |
Get-Content | ForEach-Object {
if ($_ -match "Getting" ) { $Body += ($i++).ToString() + ": " + $_ + "`n"}
if ($_ -match "Number of hits found" ) { $Body += $_ + "`n`n" }
   }
PS C:\> Send-MailMessage -SmtpServer $SMTPurl `
-To $To -From $From -Subject $Subject -Body $Body

Shazam! I (and the other tech in this project) got a nice summary report, by the numbers.

What’s next?

Well, these were all commands entered at the PowerShell command line, either in the ISE (on my desktop) or in a regular PowerShell prompt (on the processing server). One obvious improvement would be to create a couple of script-building scripts (how meta!) that I (or someone else) could run to create the query scripts, and a separate script (to be run on the processing server) to generate the summary email.

What if (as is usually the case) only a few of the jobs need to have addonly queries created to be re-ingested? Well, the brute-force way would be to create the addonly query with all the jobs included, then manually edit it, deleting all the lines where the initial ingestion was a success.

But the slick way would be to scan the query results log files, get the job numbers of the jobs that failed to ingest, and pull only the corresponding lines from the MEMBERS.TXT file.

(Spoiler: one way would be to append the name of each failed job to a single pattern string, then get the contents of MEMBERS.TXT, extract the lines that -match that pattern, and use those lines to create the addonly query.

It might be faster, though, to load the lines of MEMBERS.TXT into a hashtable with the job number as the key and the entire line as the value, then look up the line corresponding to each failed job.)
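
A rough sketch of the hashtable idea (the assumption that the job number is the first whitespace-delimited field of each MEMBERS.TXT line is mine – the real format is proprietary):

    # build a lookup of MEMBERS.TXT lines, keyed by job number
    $members = @{}
    Get-Content <path_to>\MEMBERS.TXT | ForEach-Object {
        $jobNumber = ($_ -split '\s+')[0]
        $members[$jobNumber] = $_
    }

    # then emit an addonly line for each job that failed to ingest
    $failedJobs | ForEach-Object {
        $PCE.Text += $addonly_cl + $members[$_] + "`n"
    }

Here $failedJobs stands in for the list of job numbers scraped from the query result logs; $addonly_cl and $PCE are the same variables used in the addonly example above.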

Something that’s been bugging me for a while is that I couldn’t remember how to continue lines in the ISE command pane.

Of course, in a regular PowerShell window, you can just press Enter to continue entering a command on the next line:

PS C:\Users\Owner> 1..10 | Where-Object{ ($_ % 2) -eq 0 }
2
4
6
8
10
PS C:\Users\Owner> 1..10 | Where-Object{
>> ($_ % 2) -eq 0 }
2
4
6
8
10

You can press Enter after a “|” to continue a pipeline on the next line, after a “,” when entering a list of items, after an opening brace “{” when starting a script block, and immediately after a “`” (a backtick) anywhere within a line (except inside a string literal).
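
For instance, here’s the backtick form at a regular prompt:

    PS C:\Users\Owner> Write-Output `
    >> "split across two lines"
    split across two lines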

But if you try to use Enter in the command pane of the ISE, you’ll get an error. For example, when I try to break a line after the opening brace of Where-Object (as in the example above), I get:

PS P:\> 1..10 | Where-Object {
Missing closing '}' in statement block.
    + CategoryInfo…

Very annoying.

I finally (after a couple weeks of suffering along without continuation lines in my recent orgy of ISE work) tracked down the keystroke needed – it’s Shift-Enter in the ISE.

Enter in the regular prompt, Shift-Enter in the ISE. Enter in the regular prompt, Shift-Enter in the ISE. Enter in the regular prompt, Shift-Enter in the ISE…

One of the things I do at work involves creating scripts to run a Ruby script. For each line in each script I create, I have to:

  1. Go to our secondary email program
  2. Copy a job name (a word, basically) or an entire line from an error email
  3. Change to an editing program (like the PowerShell Integrated Scripting Environment, a.k.a. ISE)
  4. Create a line with the Ruby command line and space for the text I’ve copied from the error email, and add the error email text to that line

Today, I had almost 120 lines to create this way – some of them in two versions.

Previously, I did it all by hand – I duplicated the Ruby script portion as many times as I needed, then copied and pasted the error text.

Today, though, contemplating the 200+ lines to create, I decided to dig a bit deeper into the PowerShell ISE.

I discovered I could open a PowerShell script file from disk using:

PS P:\> New-Item -ItemType File test0.ps1
PS P:\> $PSISE.CurrentPowerShellTab.Files.Add("P:\test0.PS1")

I then found I could access the open files in the tabbed script panes using standard array indexing notation:

PS P:\> $PSISE.CurrentPowerShellTab.Files[7]                # or [0], [1], [2], etc

With a little more experimentation, I found I could assign the tabbed script to a variable, Save its contents from the command line, and update the Text in the Editor property of the script pane:

PS P:\> $file0 = $PSISE.CurrentPowerShellTab.Files[7]
PS P:\> $file0.Editor.Text = "hello, world"
PS P:\> $file0.Editor.Text += "`n" + "Goodbye, cruel world"
PS P:\> $file0.Save()
PS P:\> Get-Content P:\test0.ps1
hello, world
Goodbye, cruel world

PS P:\>

I then discovered that I could access the current tab directly, without having to use the array indexing notation, or assign the tabbed script to a variable:

PS P:\> $PSISE.CurrentFile.Editor.Text = ""

I then adapted Recipe 8.3 (“Read and Write from the Windows Clipboard”) from the Windows PowerShell Cookbook to write a one-liner:

PS P:\> function Get-Clipboard { Add-Type -Assembly PresentationCore; [Windows.Clipboard]::GetText() }

Finally, I put the Ruby script lines into variables (for example, $ruby_script_1) and defined a new variable $PCE:

PS P:\> $PCE = $PSISE.CurrentFile.Editor

And used the results to add lines to the currently selected script tab:

PS P:\> $PCE.Text += $ruby_script_1 + (Get-Clipboard) + "`n"
# the `n is the PowerShell way to specify a newline character

Now, I just had to

  1. Switch to the email program,
  2. Copy the job name or full line out of the error email,
  3. Switch back to the PowerShell ISE, and
  4. Up-Arrow to create each new line

It looks like the same number of steps – but there’s a lot fewer keypresses, so…WIN!!!

A few weeks ago, I worked my way through a Lynda.com course on the Windows PowerShell ISE.

The current version of the ISE has an Add-On menu item that links directly to the PowerShell ISE Add-Ons page at microsoft.com.

One add-on was particularly interesting to me, since I’m trying to bring what I know of proper software engineering practice to bear on the PowerShell scripting I do.

This particular add-on, ISE_Cew, is a package that includes both Pester and Git functions. It’s just the thing if you’re looking to add Behavioral Testing (Pester) and version control (Git) to your PowerShell workflow.

I’ve been diving back in to – well, dipping my toe in the chilly waters of – PowerShell for some scripting here at my Data Processing job.

Several years ago, I learned the hard way (i.e., after writing a couple hundred lines of Ruby script) that although much of our processing automation was written without unit tests, that does NOT apply to any automation that *I* want to write. Not if I want to put it into production, that is.

I resisted unit testing and TDD for some time (Why? Well, that’s a story for another time), but I finally got testing religion last year with some Python scripting.

I could continue with the Python, but I think PowerShell is a better fit for our environment here.

Most modern programming languages offer several testing frameworks to choose from, but for PowerShell there’s only one that I know of – Pester.

Pester can be installed through NuGet or downloaded from GitHub.
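
For what it’s worth, on PowerShell 5 (or anywhere PowerShellGet is available) the install can be a one-liner – I haven’t tried this on our locked-down processing servers yet, so treat it as a sketch:

    # install Pester from the PowerShell Gallery for the current user only
    Install-Module -Name Pester -Scope CurrentUser

    # confirm the module loads
    Import-Module Pester
    Get-Module Pester | Select-Object Name, Version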

I’m not going to repeat any Pester examples here – you can find plenty of “Getting Started” guides on the web (the Technet guide, for example).

While looking for the Technet link, I found this post courtesy of Matt Wrock’s Hurry Up and Wait blog:

Why TDD for PowerShell? Or why pester? Or why unit test a “scripting” language?

Matt’s blog is subtitled “Tales from an Automation Engineer”, so his perspective on testing is a little different from the usual software testing guru. In particular, he points out that when it comes to infrastructure (and Data Processing, IMO), the things that are mocked / “stubbed out” in most software development environments are the things that we want to test:


But infrastructure code is different

Ok. So far I don’t think anything in this post varies with infrastructure code. As far as I am concerned, these are pretty universal rules to testing. However, infrastructure code IS different…

If I mock the infrastructure, what’s left?

So when writing more traditional style software projects (whatever the hell that is but I don’t know what else to call it), we often try to mock or stub out external “infrastructureish” systems. File systems, databases, network sockets – we have clever ways of faking these out and that’s a good thing. It allows us to focus on the code that actually needs testing.

However …if I mock away all of these layers, I may fall into the trap where I am not really testing my logic.

More integration tests

One way in which my testing habits have changed when dealing with infrastructure code is I am more willing to sacrifice unit tests for integration style tests…If I mock everything out I may just end up testing that I am calling the correct API endpoints with the expected parameters. This can be useful to some extent but can quickly start to smell like the tests just repeat the implementation.

Typically I like the testing pyramid approach of lots and lots of unit tests under a relatively thin layer of integration tests. I’ll fight to keep that structure but find that often the integration layer needs to be a bit thicker in the infrastructure domain. This may mean that coverage slips a bit at the unit level but some unit tests just don’t provide as much value and I’m gonna get more bang for my buck in integration tests.

Matt’s opinion accords with my intuition about my Data Processing environment. In the DP realm, the part of the script that can be tested without accessing the production environment (or at least a working model of the production environment) can be trivial. This is probably the main reason our existing production automation doesn’t have full testing coverage. (Well, that, and the fact that as far as I know there’s no testing framework for the automation software we use).

So I think my approach will be something like Matt’s – unit test where it’s useful and non-trivial, and more integration tests (a “thicker layer” as Matt says) to get full (or at least adequate) coverage.

PowerShell originally started as a project called “Monad” within Microsoft.

The original Monad Manifesto[PDF] was written by Jeffrey Snover back in August 2002.

BTW, one of the major influences on Monad was a paper by John Ousterhout:
Scripting: Higher-Level Programming for the 21st Century[PDF]

It’s interesting to read Snover’s original manifesto and see how much of the original vision made it into PowerShell (and how much didn’t).

(originally posted at edward.spurlock.cc)

There comes a time in every programmer’s life when s/he has to strike out on his/her own, writing new code (instead of typing in examples from books / websites). That time has come now for me with regards to PowerShell.

But first, I have to set up my working environment.

Here at work, we have a common (i.e., shared) network directory on our Production resource server. There were no PowerShell utilities in the directory (probably because I think I’m the first person to do anything serious with PowerShell here, with the possible exception of the IT guys – and they don’t use the Production resource server).

However, it occurred to me that that common directory (call it N:\common\utils, because that’s not its name) would be a good place to put modules meant to be shared.

How do I tell PowerShell to look for modules there, without having to specify this every time I start PowerShell?

For now, I just:

  1. created a PS subdirectory in N:\common\utils (PS for PowerShell, of course)
  2. Started PowerShell on my PC and created a profile file at the path given by $profile (per Recipe 1.6 from the Windows PowerShell Cookbook):
    New-Item -type file -force $profile
  3. edited the profile file using Notepad.exe:
    notepad $profile
  4. and added a line to add the common directory to the PSModulePath environment variable:
    $env:PSModulePath += ";N:\common\utils\PS"
    (the leading semicolon is the path separator between entries)
  5. exited notepad, saving $profile on the way out.

Now, whenever I start PowerShell, the $profile runs and adds the PS shared folder to the module search path.
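
A quick sanity check after restarting PowerShell (just confirming the change took – the shared folder should show up as the last entry):

    PS P:\> $env:PSModulePath -split ';'
    ...
    N:\common\utils\PS
    PS P:\> Get-Module -ListAvailable    # modules placed in N:\common\utils\PS should now be listed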

To do the same thing (less step 1) for the Windows PowerShell ISE, I consulted Microsoft Technet article How to Use Profiles in Windows PowerShell ISE, which suggests wrapping the New-Item in step 2 in an if statement to prevent overwriting an existing profile, and using the ISE to edit the resulting profile file:

  1. (PS subdirectory already created in N:\common\utils)
  2. Started the PowerShell ISE and created a profile file in $profile:
    if (!(test-path $profile)) {New-Item -type file -path $profile -force}
  3. edited the profile file (using the ISE editor):
    psEdit $profile
  4. and added the same line to add the common directory to PSModulePath:
    $env:PSModulePath += ";N:\common\utils\PS"
  5. then closed the ISE editor tab, saving the ISE $profile file on the way out

Now I just have to figure out modules and module manifests…

(originally posted at edward.spurlock.cc)

A lot of our Production Quality Control (QC) operations where I work require checking that data has been uploaded to one of our websites, using either one of our internal tools, or our backdoor access to one of our customer-facing sites. This is all right when we’re checking a couple of customer jobs, but gets tedious VERY quickly for routine QC of dozens or hundreds of customer jobs.

A web app works well as a manual tool (“enter text to be searched for in this box, click Search, click the link for desired item in the list of items matching your search term…”), but our internal tools and customer-facing sites were never designed to be scripted.

For a while, when I was working with more cross-platform scripting languages, I was looking at Selenium. Selenium allows you to control popular web browsers from a number of programming languages, including Python, Ruby, and C# – but not directly from PowerShell. It would be possible to write my own PowerShell wrapper for C# to control Selenium, but I don’t have any experience extending PowerShell with C#, and since we’re not a C# shop, I think that would be very fragile from a long-term maintenance standpoint.

Anyway, unlike the typical Selenium application, our Production QC ops aren’t testing a single web app across multiple browsers. We’re searching for a multitude of data items, but we only have to find each one once, in a single web browser. A more robust solution would be to use something more native to PowerShell to control a single browser – which could even be Internet Explorer (perish the thought!).

I Googled “powershell web browser automation” and came up with a number of possibilities.

Web UI Automation with Windows PowerShell is an MSDN article from 2008 that talks about using COM to control Internet Explorer, which is something I’ve dabbled in using VBScript. My first experiment with the method wasn’t successful, though, so I looked for troubleshooting info for COM in my handy copy of Windows PowerShell in Action. As it happens, the book illustrates COM with an example of “…a COM-based tool to help manage…browser windows,” so the book probably offers a more fertile field for further research.
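
For reference, the COM approach boils down to something like this – a minimal sketch (the example.com URL is obviously a stand-in), not the code from my failed experiment:

    # drive Internet Explorer through its COM automation interface
    $ie = New-Object -ComObject InternetExplorer.Application
    $ie.Visible = $true
    $ie.Navigate("http://www.example.com")

    # wait until the page finishes loading (ReadyState 4 = complete)
    while ($ie.Busy -or $ie.ReadyState -ne 4) { Start-Sleep -Milliseconds 200 }

    # poke at the loaded document - for example, grab the page title
    $ie.Document.title

    $ie.Quit()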

A post on StackOverflow then led me to WatiN – Web Application Testing in .Net. WatiN allows control of Internet Explorer AND Firefox, so it might be even better than using COM.

(originally posted at edward.spurlock.cc)

Another resource I’ve been mining in the last week or so: PowerShell.org, an independent community for PowerShell users.

Something at PowerShell.org that you won’t find at every other PowerShell resource: the PowerScripting Podcast, currently on episode 289. As I get time, I’m paging back through the archives to find episodes that are of interest to me (a relatively new PowerShell user).

(originally posted at edward.spurlock.cc)