Sunday, December 7, 2014

Thoughts on Windows Scripting


For the past few weeks, I haven't been posting much because my attention has been diverted elsewhere. The picture above is a little sneak preview of what I've been working on, which is related to my previous efforts in solving Project Euler problems using Windows batch script. It's almost ready for me to share with the world (if you look around, you'll find it, since I have it hosted publicly in the cloud), but there are a few i's I'd like to dot and a few t's I'd like to cross before I really promote this thing.

And what a thing it is! But I'm not ready to talk about it just yet.

Instead, I would like to talk about the general topic of Windows scripting. For years, my attention has been focused on small business IT - in that environment, there's a limited amount of automation available to the IT professional, at least as long as the customer is hosting their own equipment. Though a clever IT shop can probably find some way to leverage tools like Puppet and Chef to make on-site deployments more consistent, it's a little difficult to avoid treating servers like pets instead of cattle when each "farm" only has one or two "cattle" and the "farmers" don't like to share (to stretch the already overwrought metaphor a bit).

In that environment, the limited nature of old-school Windows batch scripting - a language that traces its lineage all the way back to the early days of DOS - wasn't immediately apparent. Oh sure, there was the occasional heated discussion over whether it was more intuitive to suffer through RUNDLL32's myriad flags and byzantine syntax, or through VBScript's ridiculously verbose but at least sort of intelligible at a glance AddPrinterConnection method, or to just give up entirely and use KiXtart, but the results usually didn't matter all that much in the end. By the time you were done writing a script, testing it, and rolling it out, most scripts just weren't worth the time. So, every client would have a script or two with some "NET USE" commands, some clients with more dedicated IT professionals might have a slightly more complicated batch script with a control structure in it (I was notorious about this, though not nearly as notorious as some of my coworkers), and that was pretty much it. In our context, PowerShell was observed, tried out, and then promptly ignored as much as humanly possible - it took more typing than CMD, we couldn't count on it being on every machine we wanted to automate (it wasn't installed by default until Windows 7/Server 2008 R2), and it didn't do anything we weren't already doing for ourselves.

These days, though, I'm dealing with a somewhat larger environment. It's still not large enough to really see the benefits of PowerShell firsthand - we're talking about fewer than ten servers, virtual or otherwise, and approximately 250 PCs - but it's big enough that I can certainly imagine what a larger environment would look like and some of the challenges I would have keeping things halfway consistent and manageable. That's part of the reason I've been doing the Project Euler exercises in the first place - at some point, my scripting chops will need to be ready for a larger environment. While working on my exercises, though, I've run into some pretty serious difficulties - difficulties I've been able to overcome, mind you, but ones that definitely show that CMD is, shall we say, a product of its time:

  • CMD can't math. No, seriously, ask it what 2/3 is:

    set /a _test=2/3
    0


    That's right - no floating point support. Decimals aren't a thing in CMD, and neither is rounding.  
  • CMD's roots as a scripting language cutting its teeth in the heyday of BASIC really show. Functions are kind of a thing, if you're sort of creative about it, but it's pretty clear that they were bolted on well after the fact. There's exactly one loop structure - if you want a do/while loop, you're going to have to get creative with GOTOs and labels. 
  • Delayed Expansion makes a lot of things possible - without it, I don't think I could functionally use a FOR loop or an IF statement in my Project Euler code. That doesn't mean it makes things easy, though. Divining the pattern between "the interpreter will expand this variable as each line runs, like it does by default" and "the interpreter will read the whole block ahead of time and only parrot back what the variable held before the block started" is a royal pain. (General rule - if what you're doing happens between parentheses, you probably want Delayed Expansion if you're used to any other programming language ever.)
  • You know what's awesome? When you can do things like this:

    set /a _test=2000000000^2

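    And you get a result like this:
    Invalid number. Numbers are limited to 32-bits of precision.

    (Strictly speaking, that caret never gets the chance to be an operator: unquoted, CMD treats ^ as its escape character, so SET /A is handed the literal 20000000002, which doesn't fit in 32 bits. Quoted, ^ would be a bitwise XOR rather than an exponent, since SET /A doesn't have one of those either.)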

    But if you do something like this:

    set /a _test=2000000000*2

    You end up with this:
    -294967296

    Neat, huh? I agree, Microsoft - consistent error checking is for chumps.
  • Arrays? They don't exist. No, really - they don't. Except... if you're really creative... and abuse Delayed Expansion a bit... you can kind of fake them, as long as you're only querying their values in a FOR loop. See, if you try to trick the interpreter out by doing something like:

    SET _test0=Foo
    SET _zero=0

    ECHO:%_test%%_zero%


    You'll just get 0 because it'll attempt to extract the value of _test (which doesn't exist), then the value of _zero (which is 0). However, if you do something like this:

    FOR /L %%G IN (0,1,0) DO (
    ECHO:!_test%%G!
    )


    It'll actually work (as long as Delayed Expansion is enabled): the interpreter resolves %%G first, then resolves everything between the exclamation points as one variable. Spooky, eh? If it helps, the rest of CMD's syntax is every bit this consistent.
  • I don't expect a scripting language to have a firm type casting system - quite the contrary, in fact. However, CMD's is particularly... interesting. For example, let's say you want to calculate the elapsed time between the start and end of a script. Easy enough:
SET _TimeStart=%time%
CALL RandomScript.bat
SET _TimeEnd=%time%

FOR /F "tokens=1-4 delims=:." %%G IN ("%_TimeStart%") DO (
SET _HStart=%%G
SET _MStart=%%H
SET _SStart=%%I
SET _mSStart=%%J
)

FOR /F "tokens=1-4 delims=:." %%K IN ("%_TimeEnd%") DO (
SET _HEnd=%%K
SET _MEnd=%%L
SET _SEnd=%%M
SET _mSEnd=%%N
)

SET /A _mSElapsed=_mSEnd-_mSStart


:: Insert conditional to handle occasions where _mSElapsed is negative because the script took longer than a second or it started near the end of a second.


SET /A _SElapsed=_SEnd-_SStart
:: Insert conditional... etc.
...


And so on. Well, if you tried to write the script like that and ran it, two things would happen:
  1. If that code lives inside a parenthesized block (the body of a FOR loop or an IF statement, say), your failure to use Delayed Expansion when you called the %time% variable would result in the script always returning 0, since the interpreter would expand both %time% references just once, up front, when it parsed the block.
  2. If any of the time blocks (HH:MM:SS.MS) have a zero at the start (say, 12:09:14.54 - _MWhatever would equal "09"), CMD won't interpret the result as "9" when you ask it to do some math against it. Oh no. Instead... well, I'll just let Microsoft explain this one:

    Numeric values are decimal numbers, unless prefixed by 0x for hexadecimal numbers, and 0 for octal numbers. So 0x12 is the same as 18 is the same as 022. Please note that the octal notation can be confusing: 08 and 09 are not valid numbers because 8 and 9 are not valid octal digits.

    That's right - "09" suddenly becomes 9-base-8, which seriously doesn't exist and will lead to some fascinating results.
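Both behaviors are easy to reproduce in isolation. Here's a minimal sketch, assuming it's saved as a batch file and run from CMD; the variable names (_count, _octal, _hex) are made up for the demonstration and aren't part of the timing script. The loop shows percent expansion being frozen when the block is parsed while Delayed Expansion tracks the live value, and the last few lines confirm the prefixes from Microsoft's note.

```batch
@echo off
setlocal EnableDelayedExpansion

rem 1. Percent expansion happens once, when the whole parenthesized block
rem    is parsed. Delayed expansion reads the value as each line runs.
set "_count=0"
for /L %%G in (1,1,3) do (
    set /a "_count+=1"
    echo pass %%G: percent=%_count% delayed=!_count!
)

rem 2. A leading 0 means octal and a leading 0x means hexadecimal.
set /a "_octal=022"
set /a "_hex=0x12"
echo octal 022 = %_octal%, hex 0x12 = %_hex%
```

Every pass prints percent=0 while the delayed value counts 1, 2, 3, and both 022 and 0x12 come out as 18, exactly as advertised.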
So, you end up writing a bunch of code like this to convince SET to, no, seriously, save the damned number as a base-10 number, please and thank you very much:

IF %_HEnd:~0,1% EQU 0 (
SET _HEnd=%_HEnd:~1,1%
)
IF %_MEnd:~0,1% EQU 0 (
SET _MEnd=%_MEnd:~1,1%
)
IF %_SEnd:~0,1% EQU 0 (
SET _SEnd=%_SEnd:~1,1%
)
IF %_mSEnd:~0,1% EQU 0 (
SET _mSEnd=%_mSEnd:~1,1%
)
IF %_HStart:~0,1% EQU 0 (
SET _HStart=%_HStart:~1,1%
)
IF %_MStart:~0,1% EQU 0 (
SET _MStart=%_MStart:~1,1%
)
IF %_SStart:~0,1% EQU 0 (
SET _SStart=%_SStart:~1,1%
)
IF %_mSStart:~0,1% EQU 0 (
SET _mSStart=%_mSStart:~1,1%
)

For those playing along at home, that's just a series of statements that say, if the first character in the variable is a zero, throw that zero in the trash can and just keep the last digit.
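For what it's worth, the zero-stripping can probably be squeezed down quite a bit. The following is only a sketch, not the script I actually ran: it swaps the IF blocks for a different trick (glue 100 onto the front of each field so it can't start with a zero, then take the remainder modulo 100), and it leans on the same resolve-%%G-first behavior as the fake arrays above. The sample values are hypothetical stand-ins for what the FOR /F loops would have parsed, and it assumes every field contains nothing but digits; a leading space on single-digit hours, if your locale produces one, would need to be trimmed first.

```batch
@echo off
setlocal EnableDelayedExpansion

rem Hypothetical sample values standing in for the parsed time fields.
set "_HStart=09"
set "_MStart=08"
set "_SStart=59"
set "_mSStart=95"
set "_HEnd=09"
set "_MEnd=09"
set "_SEnd=01"
set "_mSEnd=05"

rem Prefix each field with 100 so it cannot start with 0, then take the
rem remainder modulo 100 to get the original value back as a decimal.
for %%V in (_HStart _MStart _SStart _mSStart _HEnd _MEnd _SEnd _mSEnd) do (
    set /a "%%V=100!%%V! %% 100"
)

rem Collapse each timestamp into centiseconds since midnight.
set /a "_Start=((_HStart*60+_MStart)*60+_SStart)*100+_mSStart"
set /a "_End=((_HEnd*60+_MEnd)*60+_SEnd)*100+_mSEnd"
set /a "_Elapsed=_End-_Start"

rem A run that crosses midnight comes out negative, so add one day back.
if %_Elapsed% LSS 0 set /a "_Elapsed+=24*60*60*100"

rem No floating point, so split the total into whole seconds and a
rem two-digit fraction and print them side by side.
set /a "_Sec=_Elapsed/100, _Cs=_Elapsed %% 100"
set "_Cs=0%_Cs%"
echo Elapsed: %_Sec%.%_Cs:~-2% seconds
```

With the sample values (09:08:59.95 to 09:09:01.05), this prints "Elapsed: 1.10 seconds". Converting everything to a single count of centiseconds also means there's only one negative-number conditional to write instead of one per field.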

Yeah. Now imagine having a thousand or so servers to manage and putting up with this. Given the utter lack of responsible error handling and the head-scratching syntax parsing, it's only a matter of time before somebody writes a script that they think looks up a system's MAC address in a text file somewhere and uses netsh to assign an IP address and computer name based on that MAC address, only to have it assign the same IP address to every system or just refuse to properly parse a MAC address's delimiters in some weird corner case that magically converts it into octal or hexadecimal or base 23.7 or Roman numerals or the numbering system of Mayan moisture evaporators or something. Compared to this, bash and its ilk must have seemed like a revelation to anyone even slightly seriously interested in server automation.

This doesn't even get into how there are several corners of Windows that are virtually impossible to get to via CMD unless you feel like directly editing registry values with REG (given everything covered so far, what could possibly go wrong?) or you feel like stepping into the object-oriented nightmare that is VBScript[1]. Something needed to be done. Windows needed a proper, honest-to-goodness scripting language that acknowledged some progress in interpreted languages had been made since the first term of the Reagan administration.

I'm glad Microsoft took care of that, even if it's still something that remains mostly tangential to my day-to-day administration experience. For now.

*****
1. That five line example right there in CMD turns into:

NET USE H: \\myserver\users /PERSISTENT:NO

And my former coworkers wondered why I preferred CMD, even with its flaws.
