A couple of days ago, I talked about using the ISE command line to create script lines in the active script tab.

My process was:

  1. Switch to the email program,
  2. Copy the job name or full line out of the error email,
  3. Switch back to the PowerShell ISE, and
  4. Up-Arrow to create each new line

That was four steps – FOR EACH LINE.

That’s WAY too much like work. If only I could get the entire list at once…

Well, actually, I can.

The unique job name that I copied out of the error emails for each line is also the name used for the directory containing that job’s processed files on our processing server. I can build the query using the name of the directory, instead of copying each problem job name out of the error email.

But how do I know which directories contain jobs that generated an error email?

Actually, it’s better not to know. We’re better off running queries against ALL the jobs – that way, we can catch all the jobs that failed to get ingested into the full-text indexing/search app we use, even if the respective error email goes missing.

So to create a script that runs a query against every job processed in a day:

a. Start the PowerShell ISE – if the current script (file) tab isn’t empty, create a New one
b. In the ISE command line pane:
PS P:\> $PCE = $psISE.CurrentFile.Editor
PS P:\> $solr_cl = 'jruby solring.rb Cust-37 PROD hits MEMBERS "20160916 '
PS P:\>   # $solr_cl is the command line for the bulk of the Solr query –
PS P:\>   # oh, and it's not quite the query that I use – that's proprietary info, of course
PS P:\>   # -join '' keeps PowerShell from putting a space in front of every line after the first
PS P:\> $PCE.Text += (Get-ChildItem <path-to-data>\2016\201609\20160916 |
ForEach-Object { $solr_cl + $_.PSChildName + '"' + "`n" }) -join ''
PS P:\>

So instead of taking four steps for each line of the query script, I can create the entire query script using just three steps. PowerShell FTW!

But That’s Not All! You Also Get…

Sure, it’s great that I no longer have to copy lines out of individual emails to create a hits query script for all the jobs processed in a given day, but what about creating the addonly query to ingest the missing jobs into Solr?

Well, as it happens, our Solr services were down on 9/16, so all the jobs failed and needed to be added. Also as it happened, there were 121 jobs processed on 9/16, so the hits query was 121 lines long.

The MEMBERS parameter in the Solr command line for the hits query corresponds to a MEMBERS.TXT file that contains web menu lines used by our web app. Each line of an addonly query uses the same format as the lines in the MEMBERS.TXT file.

So, to create the addonly query, I opened a New (blank) script file tab in the ISE, then entered:

PS P:\> $PCE = $psISE.CurrentFile.Editor   # point $PCE at the new tab, not the hits script
PS P:\> $addonly_cl = 'jruby solring.rb Cust-37 PROD addonly MEMBERS '
PS P:\> Get-Content <path_to>\MEMBERS.TXT |
Select-Object -Last 121 |
ForEach-Object { $PCE.Text += $addonly_cl + $_ + "`n" }
PS P:\> # names have been changed to protect proprietary information

But Wait! There’s More!

So I’ve cut down the process of creating these Solr scripts from 4 steps per line to 2 or 3 steps for the whole script – but what about the output?

Previously, I was watching the results of each query, and logging each result in our trouble ticket system. I built in a 15-second delay (using Start-Sleep 15) for each line, and bounced back and forth between our trouble ticket system (in a web browser on my PC’s desktop) and the processing server where I had to run the query.

Again, that’s WAY too much like work.

The results of each hits or addonly query are logged to individual text files in a directory on the processing server. This directory is not shared – however, the processing server can send email using our processing mail server.

So:

  • I connected to the processing server (using Remote Desktop),
  • ran the hits query (to verify that all the jobs needed to be ingested using the addonly query),
  • ran the addonly query and watched as each job seemed to be added successfully, then
  • ran the hits query again (starting at 5:45 PM) to verify successful ingestion.

I then used PowerShell to create and send an email report of the results:

PS C:\> $SMTPurl = <URL of processing email server>
PS C:\> $To = "<me@myjob.com>", "<someone_else@myjob.com>"
PS C:\> $From = "<our_group_email_address@myjob.com>"
PS C:\> $Subject = "Cust-37 hits for 20160916 <job#1>..<job#121> (after addonly)"
PS C:\> $Body = ""
PS C:\> $i = 0
PS C:\> Get-ChildItem <path-to-log-files>\solr_20160916*.log |
Where-Object { $_.LastWriteTime -gt (Get-Date "9/16/2016 5:45 PM") } |
Get-Content | ForEach-Object {
    if ($_ -match "Getting") { $Body += ($i++).ToString() + ": " + $_ + "`n" }
    if ($_ -match "Number of hits found") { $Body += $_ + "`n`n" }
}
PS C:\> Send-MailMessage -SmtpServer $SMTPurl `
-To $To -From $From -Subject $Subject -Body $Body

Shazam! I (and the other tech in this project) got a nice summary report, by the numbers.

What’s next?

Well, these were all commands entered at the PowerShell command line, either in the ISE (on my desktop) or in a regular PowerShell prompt (on the processing server). One obvious improvement would be to create a couple of script-building scripts (how meta!) that I (or someone else) could run to create the query scripts, and a separate script (to be run on the processing server) to generate the summary email.
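
Here's a rough sketch of what the hits half of such a script-building script could look like. The function name New-SolrHitsLine and its parameters are made up for illustration, and the job names in the usage lines are placeholders, not real jobs. It only builds the text lines; where they end up (an ISE tab, a .ps1 file) is left open.

```powershell
# Hypothetical helper: builds one hits-query line per job name.
# Name and parameters are illustrative, not part of any existing script.
function New-SolrHitsLine {
    param(
        [string[]] $JobName,     # directory names, one per job
        [string]   $DateStamp,   # e.g. '20160916'
        [string]   $Prefix = 'jruby solring.rb Cust-37 PROD hits MEMBERS'
    )
    foreach ($job in $JobName) {
        # Same shape as the lines built at the ISE prompt:
        #   <prefix> "<date> <job name>"
        '{0} "{1} {2}"' -f $Prefix, $DateStamp, $job
    }
}

# Usage sketch with placeholder job names
$lines = @(New-SolrHitsLine -JobName 'JOB-0001', 'JOB-0002' -DateStamp '20160916')
$lines
```

In real use, the job names would presumably come from the day's data directory (for example, the Name property of what Get-ChildItem returns for it), and the resulting lines could be joined with "`n" and appended to the editor text as before.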

What if (as is usually the case) only a few of the jobs need to have addonly queries created to be re-ingested? Well, the brute-force way would be to create the addonly query with all the jobs included, then manually edit it, deleting all the lines where the initial ingestion was a success.

But the slick way would be to scan the query results log files, get the job numbers of the jobs that failed to ingest, and pull only the corresponding lines from the MEMBERS.TXT file.
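
A minimal sketch of that log-scanning step follows. Big caveat: the real log layout isn't shown in this post, so the format below is an assumption. It supposes that each job produces a "Getting <job name>" line followed by a "Number of hits found: <count>" line, and that zero hits means the job failed to ingest. The function name Get-FailedJob and the sample lines are invented for illustration.

```powershell
# Hypothetical helper: returns the names of jobs whose hits query found nothing.
# ASSUMED log format (not taken from the real logs):
#   Getting <job name>
#   Number of hits found: <count>
function Get-FailedJob {
    param([string[]] $LogLine)

    $current = $null
    foreach ($line in $LogLine) {
        if ($line -match '^Getting\s+(\S+)') {
            # Remember which job the next result line belongs to
            $current = $Matches[1]
        }
        elseif ($line -match 'Number of hits found\D*(\d+)') {
            # Zero hits is treated as "failed to ingest" (an assumption)
            if ($current -and [int]$Matches[1] -eq 0) { $current }
            $current = $null
        }
    }
}

# Made-up sample lines, for illustration only
$sample = @(
    'Getting JOB-0001'
    'Number of hits found: 12'
    'Getting JOB-0002'
    'Number of hits found: 0'
)
$failed = @(Get-FailedJob -LogLine $sample)
$failed
```

Against real logs, the input would come from Get-Content on the day's log files, and both regular expressions would need adjusting to whatever the log lines actually look like.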

(Spoiler: one way would be to append the name of each failed job to a single string, then get the contents of MEMBERS.TXT, extracting the lines that -match the string of failed jobs, and use those lines to create the addonly query.

It might be faster, though, to hash the lines of MEMBERS.TXT with the job number as the key and the entire line as the value, then return the entire line corresponding to each failed job.)
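
Both ideas can be sketched in a few lines. This is one reading of the spoiler, not tested against the real files: the layout of MEMBERS.TXT isn't shown here, so the sample assumes each line starts with the job name followed by a '|', and the job names and menu text are invented. The variable names ($members, $failed, $index) are illustrative too.

```powershell
# ASSUMPTION: each MEMBERS.TXT line starts with the job name followed by '|'.
# The real layout is not shown in the post, so the split below is a placeholder.
$members = @(
    'JOB-0001|First menu line'
    'JOB-0002|Second menu line'
    'JOB-0003|Third menu line'
)
$failed = @('JOB-0002', 'JOB-0003')   # made-up failed job names

# Approach 1: join the failed job names into one regex alternation,
# then let -match filter the whole array of lines at once.
$pattern = ($failed | ForEach-Object { [regex]::Escape($_) }) -join '|'
$byMatch = @($members -match $pattern)

# Approach 2: hashtable keyed on job name, one lookup per failed job.
$index = @{}
foreach ($line in $members) {
    $key = ($line -split '\|', 2)[0]   # text before the first '|'
    $index[$key] = $line
}
$byHash = @($failed |
    Where-Object { $index.ContainsKey($_) } |
    ForEach-Object { $index[$_] })

# Either result can then be turned into addonly query lines
$addonly_cl = 'jruby solring.rb Cust-37 PROD addonly MEMBERS '
$addonlyLines = @($byHash | ForEach-Object { $addonly_cl + $_ })
$addonlyLines
```

The regex version is shorter, but it matches the job name anywhere in the line, so a job name that happens to be a substring of another one would pull in extra lines. The hashtable version matches on the exact key and skips any failed job that has no entry in MEMBERS.TXT.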
