I’ve been diving back into – well, dipping my toe in the chilly waters of – PowerShell for some scripting here at my Data Processing job.
Several years ago, I learned the hard way (i.e., after writing a couple hundred lines of Ruby script) that although much of our processing automation was written without unit tests, that exemption does NOT extend to any automation that *I* want to write. Not if I want to put it into production, that is.
I resisted unit testing and TDD for some time (Why? Well, that’s a story for another time), but I finally got testing religion last year with some Python scripting.
I could continue with the Python, but I think PowerShell is a better fit for our environment here.
Most modern programming languages offer a choice of testing frameworks, but for PowerShell there’s only one that I know of – Pester.
Pester can be installed through NuGet or downloaded from GitHub.
I’m not going to repeat any Pester examples here – you can find plenty of “Getting Started” guides on the web. For example,
While looking for the Technet link, I found this post courtesy of Matt Wrock’s Hurry Up and Wait blog:
Why TDD for PowerShell? Or why pester? Or why unit test a “scripting” language?
Matt’s blog is subtitled “Tales from an Automation Engineer”, so his perspective on testing is a little different from that of the usual software testing guru. In particular, he points out that when it comes to infrastructure (and Data Processing, IMO), the things that are mocked / “stubbed out” in most software development environments are the very things that we want to test:
But infrastructure code is different
Ok. So far I don’t think anything in this post varies with infrastructure code. As far as I am concerned, these are pretty universal rules to testing. However, infrastructure code IS different…
If I mock the infrastructure, what’s left?
So when writing more traditional style software projects (whatever the hell that is but I don’t know what else to call it), we often try to mock or stub out external “infrastructureish” systems. File systems, databases, network sockets – we have clever ways of faking these out and that’s a good thing. It allows us to focus on the code that actually needs testing.
However …if I mock away all of these layers, I may fall into the trap where I am not really testing my logic.
More integration tests
One way in which my testing habits have changed when dealing with infrastructure code is I am more willing to sacrifice unit tests for integration style tests…If I mock everything out I may just end up testing that I am calling the correct API endpoints with the expected parameters. This can be useful to some extent but can quickly start to smell like the tests just repeat the implementation.
Typically I like the testing pyramid approach of lots and lots of unit tests under a relatively thin layer of integration tests. I’ll fight to keep that structure but find that often the integration layer needs to be a bit thicker in the infrastructure domain. This may mean that coverage slips a bit at the unit level but some unit tests just don’t provide as much value and I’m gonna get more bang for my buck in integration tests.
Matt’s opinion accords with my intuition about my Data Processing environment. In the DP realm, the part of a script that can be tested without accessing the production environment (or at least a working model of the production environment) can be trivial. This is probably the main reason our existing production automation doesn’t have full test coverage. (Well, that, and the fact that as far as I know there’s no testing framework for the automation software we use.)
So I think my approach will be something like Matt’s – unit test where it’s useful and non-trivial, and more integration tests (a “thicker layer” as Matt says) to get full (or at least adequate) coverage.
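To make that contrast concrete, here is a minimal sketch in Python rather than Pester. Everything in it is an assumption made up for illustration: `archive_reports` is a hypothetical file-shuffling function of the kind a DP script might contain, not anything from our actual automation. The first test mocks the file system away and ends up checking little more than which call was made; the second runs against a real temporary directory and checks the outcome.

```python
import tempfile
from pathlib import Path
from unittest import mock


def archive_reports(src: Path, dest: Path) -> int:
    """Hypothetical example: move every *.csv from src into dest, return the count."""
    dest.mkdir(parents=True, exist_ok=True)
    moved = 0
    for csv_file in sorted(src.glob("*.csv")):
        csv_file.rename(dest / csv_file.name)
        moved += 1
    return moved


def test_mocked() -> None:
    """Fully mocked: this only proves glob() was called with the expected pattern."""
    src, dest = mock.MagicMock(), mock.MagicMock()
    src.glob.return_value = []  # pretend the source directory is empty
    assert archive_reports(src, dest) == 0
    src.glob.assert_called_once_with("*.csv")


def test_integration() -> None:
    """Integration style: exercise a real (temporary) directory tree."""
    with tempfile.TemporaryDirectory() as tmp:
        src, dest = Path(tmp) / "in", Path(tmp) / "archive"
        src.mkdir()
        (src / "a.csv").write_text("x")
        (src / "notes.txt").write_text("y")
        assert archive_reports(src, dest) == 1
        assert (dest / "a.csv").exists()       # the CSV was moved
        assert not (src / "a.csv").exists()    # and is gone from the source
        assert (src / "notes.txt").exists()    # non-CSV files are left alone


if __name__ == "__main__":
    test_mocked()
    test_integration()
```

The mocked test mostly restates the implementation, which is the smell Matt describes; the temporary-directory test is slower but actually verifies what ended up on disk.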