S.Lott-Software Architect: November 2015

Tuesday, November 24, 2015

Coding Camp vs. Computer Science

Step 1, read this: "Dear GeekWire: A coding bootcamp is not a replacement for a computer science degree". It's short, it won't hurt.

I got this comment.

"The world runs in legacy code and the cs degrees focus on leading edge

Most of what is learned in cs [is] never used in the mainstream of business

Much of computer work is repetitive and uninviting to upwardly mobile people who generally are moving up not improving the breed"

I disagree. A lot.

"The world runs in legacy code." First, this is reductionist: everything that's been pushed to GitHub is now a "legacy".

Does "legacy" mean "old, bad code?" If so, only CS grads will be equipped to make that judgement.
Does "legacy" mean "COBOL?" If so, only CS grads will be able to articulate the problems with COBOL and make a rational plan to replace it with Microservices.
Does "legacy" mean "not very interesting?" We'll return to this.

"CS degrees focus on leading edge." Not really true at all. The foundations of CS: data structures and algorithms, logic, and computability, haven't changed much since the days of Alan Turing and John von Neumann. They're highly relevant and form the core of a sensible curriculum.

The "leading edge" would be some Java 1.8 nonsense or some Angular JS hokum. The kind of thing that comes and goes. The point of CS education is to make languages and language features just another thing, not something special and unique. A little CS background allows a programmer to lump all SQL databases into a broad category and deal with them sensibly. A Code Camp grad who only knows SQLite may have trouble seeing that Oracle is superficially different but fundamentally similar.

"cs is never used in the mainstream of business." True for some businesses. This is completely true for those businesses where "legacy" means "not very interesting."

There is a great deal of not very interesting legacy code that fails to leverage a data structure more advanced than the flat file. This code is a liability, not an asset. The managers that let this happen probably didn't have a strong CS background and hired Code Camp graduates (because they're inexpensive) and created a huge pile of very bad code.

I've met these people and worked at these companies. It's a bad thing. The "leadership" that created such a huge pile of wasteful code needs to be fired. The "all that bad coded evolved during the 70's and 80's" isn't a very good excuse. A large amount of not interesting code can be replaced with a small amount of interesting code quickly and with almost zero risk.

Any company that's unable to pursue new lines of business because -- you know -- we've always done X and it's expensive to pivot to Y is deranged. They're merely holding onto their niche because they're paralyzed by fear of innovation=failure.

"Much of computer work is repetitive". False. It's made repetitive by unimaginative management types who like to manage repetitive work. If you've done it twice, you need to be prepared to distinguish coincidence from pattern. When you've done it three times, that's a pattern, and you need to automate it. If you do it a fourth time, you're missing the opportunity to automate, wasting money instead of investing it.

"Much of computer work is ... uninviting to upwardly mobile people" Only in places where repetitive is permitted to exist. If repetitive is not permitted, upward mobility will be the norm for the innovators.

"people who generally are moving up not improving the breed". I get this. The smart people move on. All we have left in this company are Code Camp graduates and their managers who value repetitive work and large volumes of not interesting code.

Improving the Breed means what?

Hiring CS graduates instead of Code Camp kiddies.

Navigation: Latitude, Longitude, Haversine, and all that

For a few years, I was a tech nomad. See Team Red Cruising for some stories of life on a sailboat. Warning: it's pretty dull.

As a tech nomad, I lived and died (literally) by my ability to navigate. Modern GPS devices make the dying part relatively unlikely. So, let's not oversell the danger aspect of this.

The prudent mariner plans a long voyage with a great deal of respect for the many things which can go wrong. One aspect of this is to create a "Float Plan". Read more about it here: http://floatplancentral.cgaux.org.

The idea is to create a summary of the voyage, provide that summary to trusted shore crew, and then check in periodically so that the shore crew can confirm that you're making progress safely. Failure to check in is an indicator of a problem, and action needs to be taken. We use a SPOT Messenger to check in at noon (and sometimes at waypoints.)

Creating a float plan involved an extract of the waypoints from our navigation software (GPS NavX). I would enrich the list of waypoints with estimated travel time between the points. Folding in a departure time would lead to a schedule that could be tracked. I also include some navigation hints in the form of a bearing between waypoints so we know which way to steer to find the next point.

The travel time is the distance (in nautical miles) coupled with an assumption about speed (5 knots.) It's a really simple thing. But the core haversine calculation is not a first-class part of any spreadsheet app. Because of the degrees-to-radians conversions required, and the common practice of annotating degrees with a lot of internal punctuation (38°54ʹ57″ 077°13ʹ36″), it becomes right awkward to simply implement this as a spreadsheet.

Some clever software has a good planning mode. The chartplotter on the boat can do a respectable job of estimating time between waypoints. But. It's not connected to a computer or the internet. So we can't upload that information in the form of a float plan. The idea of copying the data from the chart plotter to a spreadsheet is fraught with errors.

Navtools

Enter navtools. This is a library that I use to transform a route into a .csv with distances and bearings that I can use to create a useful float plan. I can add an estimated arrival time calculation so that a change to departure time creates the entire check-in schedule.

This isn't a sophisticated GUI app. It's just enough software to transform a GPS NavX extract file into a more useful form. The GUI was a spreadsheet (i.e., Numbers.) From this we created a PDF with the details.

Practically, we don't have good connectivity on the boat. So we would create a number of alternative plans ("leave tomorrow", "leave the day after", "leave next Monday", etc.) we would go ashore, find a coffee shop, and email the various plans to ourselves. They could sit in our inbox, waiting for weather and tide to be favorable.

Then, when the weather and tides were finally aligned, we could forward the relevant details to our trusted shore crew. This was a quick spurt of cell phone connectivity to forward an email. It worked out well. When the scheduled departure time arrived, we'd coax Mr. Lehman to life, raise the anchor and away.

Literate Programming

This is an exercise in literate programming. The code that's executed and the HTML documentation are both derived from source ReStructured Text (RST) documents. The documentation for the navigation module includes the math along with the code that implements the math.

I have to say that I'm enthralled with the intimate connection between requirements, design, and implementation that literate programming embodies.

I'm excited to (finally) publish the thing to GitHub. See https://github.com/slott56/navtools. I'm looking at some other projects that require the navtools module. What I wind up doing is copying and pasting the navigation calculation module into other projects. I had something like three separate copies on my laptop. It was time to fold all of the features together, delete the clones, and focus on one authoritative copy going forward.

I still have to remove some crufty old code. One step at a time. First, get all the tests to pass. Then expunge the old code. Then make progress on the other projects that leverage the navtools.navigation module.

Tuesday, November 17, 2015

Events: PyCon 2016, OSCon 2016

Many years ago ('07?) I went to my first PyCon. My situation changed and I didn't get to another PyCon until last year.

The story is a kind of major dumbosity. In '07 I could expense the trip as education. In '08, I'd lost that feature of my employment. After that I was actively figuring out how to be self-employed as a writer and technomad, and completely took my eye off the various kinds of tax deductions and sponsorship opportunities that I might have leveraged. It was too complex, arbitrary, and bewildering for me.

PyCon is an energizing event. I can't say enough good things about attending session after session on Python and the Python-related ecosystem. In particular, it's a joy to see people pitching their solutions to complex problems.

Here's a reminder: https://us.pycon.org/2016/

Since I do some work for O'Reilly media -- if a pair of webcasts count as work -- I think I want to see if I can finagle my way into OSCon, also.

Here's the reminder: http://conferences.oreilly.com/oscon/open-source

I think I can leverage some material from Functional Python Programming to create an interesting tutorial. My webcast on the five kinds of Python functions can expand into a bunch of hands-on-keyboard exercises to build examples of each kind of callable thingy.

Proposals are in. Waiting for comments. Fingers crossed.

Tuesday, November 10, 2015

Formatting Strings and the str.format() family of functions -- Python 3.4 Notes

I have to be clear that I am obsessed with the str.format() family of functions. I've happily left the string % operator behind. I recently re-discovered the vars() function.

My current go-to technique for providing debugging information is this:

print( "note: local={local!r}, this={this!r}, that={that!r}".format_map(vars)) )

I find this to be handy and expressive. It can be replaced with logging.debug() without a second thought. I can readily expand what's being dumped because all locals are provided by vars().

I also like this as a quick and dirty starting point for a class:

def __repr__(self):
    return "{__class__.__name__}(**{state!r})".format(__class__=self.__class__, state=vars(self))

This captures the name and state. But. There are nicer things we can do. One of the easiest is to use a helper function to reformat the current state in keyword parameter syntax, like this:

def args(obj):
    return ", ".join( "{k}={v!r}".format(k=k,v=v) for k,v in vars(obj).items())

This allows us to dump an object's state in a slightly nicer format. We can replace vars(self) with args(self) in our __repr__ method. We've dumped the state of an object with very little class-specific code. We can focus on the problem domain without having to wrestle with Python considerations.

Format Specifications

The use of !r for formatting is important. I've (frequently) messed up and used things like :s where data might be None. I've discovered that -- starting in Python 3.4 -- the :s format is unhappy with None objects. Here's the exhaustive enumeration of cases.

>>> "{0} {1}".format("s",None)
's None'
>>> "{0:s} {1:s}".format("s",None)
Traceback (most recent call last):
  File "", line 1, in 
    "{0:s} {1:s}".format("s",None)
TypeError: non-empty format string passed to object.__format__
>>> "{0!s} {1!s}".format("s",None)
's None'
>>> "{0!r} {1!r}".format("s",None)
"'s' None"

Many things are implicitly converted to strings. This happens in a lot of places. Python is riddled with str() function evaluations. But they aren't everywhere. Python 3.3 had one that was removed for Python 3.4 and up.

Bottom Line: be careful where you use :s formatting. It may do less than you think it should do.

Tuesday, November 3, 2015

Needlessly Redundant Overcommunication and DevOps

At the "day job" I use a Windows laptop. It was essential for a project I might have started, but didn't. So now I'm stuck with it until the budgetary gods deem that it's been paid for and I can request something more useful. Mostly, however, Windows is fine. It doesn't behave too badly and most of the awful "features" are concealed by Python's libraries.

This is context for a strange interaction today. It seems to exemplify DevOps and the cruddy laptop problem.

The goofy Microsoft Office Communicator -- the one that's so often used instead of a good chat program like Slack or HipChat -- pinged. The message went something like this.

"I sent you an email just now. Can you read it and reply?"

I was stunned. Too stunned to save the text. This is either someone being aggressive almost to a point that hints at rudeness, or someone vague on how email works. Let's assume the second option. I can only reply, "I agree with you, that is how email works."

The email was a kind of vague question about server provisioning. It was something along the lines of

"Do we provision our own server with Ansible or Chef? Or is there a team to provision servers for us? ..."

It went on to describe details of a fantasy world where someone would write Chef scripts for them. The rest of the email mostly ignored the first question entirely.

The Real Question

If you're familiar with DevOps as a concept, then server provisioning is -- like most problems -- something that the developers need to solve. Technical Support folks may provide tools (Ansible, for example) to help build the server, but there aren't a room full of support people waiting for your story ("make me a server") to appear on their Kanban board.

Indeed, there was never the kind of support implied in the email, even in non-DevOps organizations. In a "traditional" Dev-vs.-Ops organization, the folks that built servers were (a) overbooked, (b) uninterested in the details of our particular problem, or (c) only grudgingly let us use an existing server that doesn't quite fit our requirements. They rarely built servers for us.

Reason A, of course, is business as usual. Unless we're the Hippo (Highest Paid Person in the Organization,) there's always some other project that's somehow more important than whatever foolishness we're engaged in. How many times have we been told that "The STARS Project is tying up all our resources. It will be 90 days before..."? Gotcha. The bad part about this situation is when the person paying the bills says to me "You need to make them respond." How -- precisely -- do you propose that I change the internal reward system of the ops people?

We could label this as a passive-aggressive approach. They're waiting for us establish a schedule so that they can shoot it down. Or maybe that's reading way too much into the situation. Maybe they're really just overbooked.

Regarding reason B. Years ago, I had a hilarious interaction where we sent a stream of emails explaining our server requirements. The emails were not exactly ignored. But. When we asked about the status of our servers, the person responsible for the team brought a yellow pad and wrote down the requirements. I read the email to them. Without a trace of embarrassment, they wrote down what I was reading from an email. (It was long enough ago, that we didn't have laptops, and I had a hard-copy of the email. They refused the hard-copy. I had to read it. Really.)

Were they clueless about how email works? Or. Was this a kind of passive-aggressive approach to architecture where our input was discounted to zero because it didn't count until they wrote it on a their yellow pad? The behavior was bizarre.

Something similar happened with another organization. We made server recommendations. They didn't like the server recommendations. Not because the recommendations seemed wrong, but because we didn't have a formal sciency-seeming methodology for fantasizing about servers that were required to support the fantasy software which hadn't been written yet. They felt it necessary to complain. And when we talked with hardware vendors, they felt it necessary to customize the cheap commodity servers.

[It got weirder. They were convinced that a server farm needed to be designed from the bottom up. I endured a lecture on how a properly sciency-seeming methodology started by deciding on L1 and L2 cache sizing, bus timing, and worked through memory allocation and then I slowly grew to see that they had no clue what they were talking about when buying commodity servers by the rack-full for software that doesn't exist yet.]

We all know about reason C. The reason for DevOps is to avoid being stuffed into a kind of random server where there are upgrades that we all have to agree on. Or -- worse -- a server that can't be upgraded because no one will agree. A single app team vetos all changes.

"We can't install Anaconda 3 because we know that Python 3 is incompatible with Python 2"...

What?

I stopped understanding at that point. It seemed like the rest of the answer amounted to "having the second Anaconda on a separate path could lead to problems. It can't be proven that no problems will arise, so we'll assume that -- somehow -- PATH settings will get altered randomly and a Python 2 job will crash because it accidentally had the wrong PATH and accidentally ran with Python 3."

It was impossible to explain that this is a non-problem. Their response was "But we can't be sure." That's the last resort of someone who refuses to change. And it's the final answer. Even if you do a proof-of-concept, they'll find reasons to doubt the POC's results because they can't be sure the POC mirrors production.

The Real Answer

The answer to the original Ping and the Email was "You're going to do this yourself." I included links to four or five corporate missives on Chef, Ansible, DevOps, and how to fill in the form for a cloud server.

I have my doubts -- though -- that this would be seen as helpful.

They may not be happy because they don't get to use Communicator and Email and someone else's Kanban board to get this done. They don't get to ask someone else what they're doing and why they're not getting it done on time. They don't get to second-guess their technical decisions. They actually have to do it. And that may not work out well.

The truly passive-aggressive don't seem to do things by themselves. It appears to me that they spend a lot of time looking for reasons to stall. Either they need to get more information or get organized or they need to have some kind more official "permission" to proceed. Lacking any further information, I chalk it up to them only feeling successful when they've found the flaws in what someone else did.

It's challenging sometimes to make it clear that a rambling email asking for someone else to help is going nowhere. A Communicator ping followed by an email isn't actually getting anything done. It's essentially stalling, waiting for more information, getting organized, or waiting for permission. Overcommunication can become a stalling tactic or maybe a way to avoid responsibility.

I'm stuck with a cruddy laptop because the budget gods have laid down some laws that don't make a lick of technical sense. I think that the short-sighted "use it until it physically wears out" might be more costly than "find the right tool, we'll recycle the old one appropriately." In the same way, the shared server world view is clearly costly. We shouldn't share a server "because it's there."

The move to DevOps allows us to build a server rather than discuss building a server.

I want a DevOps parallel for my developer workstation. I don't want permission or authorization. I don't want to overcommunicate with the budget gods. I want a workstation unencumbered by permission-seeking.

Bio and Publications