S.Lott-Software Architect

Tuesday, November 3, 2015

Needlessly Redundant Overcommunication and DevOps

At the "day job" I use a Windows laptop. It was essential for a project I might have started, but didn't. So now I'm stuck with it until the budgetary gods deem that it's been paid for and I can request something more useful. Mostly, however, Windows is fine. It doesn't behave too badly and most of the awful "features" are concealed by Python's libraries.

This is context for a strange interaction today. It seems to exemplify DevOps and the cruddy laptop problem.

The goofy Microsoft Office Communicator -- the one that's so often used instead of a good chat program like Slack or HipChat -- pinged. The message went something like this.

"I sent you an email just now. Can you read it and reply?"

I was stunned. Too stunned to save the text. This is either someone being aggressive almost to a point that hints at rudeness, or someone vague on how email works. Let's assume the second option. I can only reply, "I agree with you, that is how email works."

The email was a kind of vague question about server provisioning. It was something along the lines of

"Do we provision our own server with Ansible or Chef? Or is there a team to provision servers for us? ..."

It went on to describe details of a fantasy world where someone would write Chef scripts for them. The rest of the email mostly ignored the first question entirely.

The Real Question

If you're familiar with DevOps as a concept, then server provisioning is -- like most problems -- something that the developers need to solve. Technical Support folks may provide tools (Ansible, for example) to help build the server, but there aren't a room full of support people waiting for your story ("make me a server") to appear on their Kanban board.

Indeed, there was never the kind of support implied in the email, even in non-DevOps organizations. In a "traditional" Dev-vs.-Ops organization, the folks that built servers were (a) overbooked, (b) uninterested in the details of our particular problem, or (c) only grudgingly let us use an existing server that doesn't quite fit our requirements. They rarely built servers for us.

Reason A, of course, is business as usual. Unless we're the Hippo (Highest Paid Person in the Organization,) there's always some other project that's somehow more important than whatever foolishness we're engaged in. How many times have we been told that "The STARS Project is tying up all our resources. It will be 90 days before..."? Gotcha. The bad part about this situation is when the person paying the bills says to me "You need to make them respond." How -- precisely -- do you propose that I change the internal reward system of the ops people?

We could label this as a passive-aggressive approach. They're waiting for us establish a schedule so that they can shoot it down. Or maybe that's reading way too much into the situation. Maybe they're really just overbooked.

Regarding reason B. Years ago, I had a hilarious interaction where we sent a stream of emails explaining our server requirements. The emails were not exactly ignored. But. When we asked about the status of our servers, the person responsible for the team brought a yellow pad and wrote down the requirements. I read the email to them. Without a trace of embarrassment, they wrote down what I was reading from an email. (It was long enough ago, that we didn't have laptops, and I had a hard-copy of the email. They refused the hard-copy. I had to read it. Really.)

Were they clueless about how email works? Or. Was this a kind of passive-aggressive approach to architecture where our input was discounted to zero because it didn't count until they wrote it on a their yellow pad? The behavior was bizarre.

Something similar happened with another organization. We made server recommendations. They didn't like the server recommendations. Not because the recommendations seemed wrong, but because we didn't have a formal sciency-seeming methodology for fantasizing about servers that were required to support the fantasy software which hadn't been written yet. They felt it necessary to complain. And when we talked with hardware vendors, they felt it necessary to customize the cheap commodity servers.

[It got weirder. They were convinced that a server farm needed to be designed from the bottom up. I endured a lecture on how a properly sciency-seeming methodology started by deciding on L1 and L2 cache sizing, bus timing, and worked through memory allocation and then I slowly grew to see that they had no clue what they were talking about when buying commodity servers by the rack-full for software that doesn't exist yet.]

We all know about reason C. The reason for DevOps is to avoid being stuffed into a kind of random server where there are upgrades that we all have to agree on. Or -- worse -- a server that can't be upgraded because no one will agree. A single app team vetos all changes.

"We can't install Anaconda 3 because we know that Python 3 is incompatible with Python 2"...

What?

I stopped understanding at that point. It seemed like the rest of the answer amounted to "having the second Anaconda on a separate path could lead to problems. It can't be proven that no problems will arise, so we'll assume that -- somehow -- PATH settings will get altered randomly and a Python 2 job will crash because it accidentally had the wrong PATH and accidentally ran with Python 3."

It was impossible to explain that this is a non-problem. Their response was "But we can't be sure." That's the last resort of someone who refuses to change. And it's the final answer. Even if you do a proof-of-concept, they'll find reasons to doubt the POC's results because they can't be sure the POC mirrors production.

The Real Answer

The answer to the original Ping and the Email was "You're going to do this yourself." I included links to four or five corporate missives on Chef, Ansible, DevOps, and how to fill in the form for a cloud server.

I have my doubts -- though -- that this would be seen as helpful.

They may not be happy because they don't get to use Communicator and Email and someone else's Kanban board to get this done. They don't get to ask someone else what they're doing and why they're not getting it done on time. They don't get to second-guess their technical decisions. They actually have to do it. And that may not work out well.

The truly passive-aggressive don't seem to do things by themselves. It appears to me that they spend a lot of time looking for reasons to stall. Either they need to get more information or get organized or they need to have some kind more official "permission" to proceed. Lacking any further information, I chalk it up to them only feeling successful when they've found the flaws in what someone else did.

It's challenging sometimes to make it clear that a rambling email asking for someone else to help is going nowhere. A Communicator ping followed by an email isn't actually getting anything done. It's essentially stalling, waiting for more information, getting organized, or waiting for permission. Overcommunication can become a stalling tactic or maybe a way to avoid responsibility.

I'm stuck with a cruddy laptop because the budget gods have laid down some laws that don't make a lick of technical sense. I think that the short-sighted "use it until it physically wears out" might be more costly than "find the right tool, we'll recycle the old one appropriately." In the same way, the shared server world view is clearly costly. We shouldn't share a server "because it's there."

The move to DevOps allows us to build a server rather than discuss building a server.

I want a DevOps parallel for my developer workstation. I don't want permission or authorization. I don't want to overcommunicate with the budget gods. I want a workstation unencumbered by permission-seeking.

Tuesday, October 27, 2015

The Internet of Things

Wunderbar.

A whole bunch of nicely integrated data collection modules.

I prefer to hack around with Arduino. I'm not sure why -- perhaps it's the lure of building approximately from scratch.

But this is very cool. No soldering. Just start gathering data.

I have a half-built Arduino-based device to measure the position of the steering quadrant on a sailboat. I really need to take the next few steps and finalize the design so that I can order a few boards from Fitzring and try it out for real. I've had it in pieces here and there for about 3 years. The open issue was (and still is) a digital potentiometer that sets the output voltage level. I think I have the right chip for this. I think I have the wrong resistors that adjust the voltage into the proper range. The response curve for the parts I rigged up (years ago) weren't linear enough.

Then I moved. And moved again. And wrote a bunch of books on Python. And I'm about to move again. I need to finish this and get it off my desk. Literally.

The good news is that I took careful notes. Including pictures. So I can break out the boards and mess around a bit. I have three breadboards covered with jumpers, LED's, buttons, and stuff all piled up around the laptop.

The Wunderbar has an light/color/proximity sensor. I've built just the proximity sensor with an Arduino. Reporting the output as resistance that can be used on 12V boat systems as the stumbling block for me.

After the next move... (Something I've said before.)

Tuesday, October 20, 2015

Why Computer Science for All is good for all

An Open Letter from the Nation’s Tech and Business Leaders: Why Computer Science for All is good for all.

"These are the skills and competencies that will power the growth of every industry..."

Civic leaders and educators need to be in on this. And professionals who have skills to share need to be in on this also. It's not limited to New York City. It's a nationwide (perhaps world-wide) need for skills. There are a lot of talented people. Some of them haven't had the right sequence of opportunities to realize their talents.

Wednesday, October 14, 2015

Chapters to Edit: What do I do instead?

I'm starting to get chapters back from the technical reviewers. This is an important part of the writing process: correcting my mistakes and clarifying things that confused the reviewers.

Packt has had a uniformly excellent cadre of technical reviewers. At this point, I've worked with something like a dozen people on four different books. It's been great (for me) to get detailed, specific feedback point by point.

Instead of working on my reviewed chapters, however, I'm browsing. It's Python Week.

https://www.packtpub.com/packt/offers/pythonweek

I'll get to the chapters Thursday, I think.

Tuesday, October 13, 2015

Wait, there's more Python goodness from Packt

This just in...

Here's a link to the actual Python Week page, with all the deals there for the week: https://www.packtpub.com/packt/offers/pythonweek

They also have a week of free Python books too, which change daily: https://www.packtpub.com/packt/offers/free-learning/

Feel free to ruthlessly export their largess and build your personal technical library.

Monday, October 12, 2015

Python Week at Packt Publishing

Go to https://www.packtpub.com.

Look for the deal of the week.

Get 20% off Python titles.

You're welcome.

Tuesday, October 6, 2015

Today's Milestone: Refactoring and Django Migrations

Once upon a time, when today's old folks were young, we'd debate the two project strategies: Hard Part Do Later (HPDL) vs. Hard Part First (HPF).

The HPDL folks argued that you could pick away at the hard part until -- eventually -- it wasn't hard any more. This doesn't often work out well in practice, but a lot of people like it. Sometimes the attempt to avoid the hard part makes it harder.

The HPF folks, on the other hand, recognized that solving the hard problem correctly, may make the easy problems even easier. It may not, but either way, the hard part was done.

The debate would shift to what -- exactly -- constituted the hard part. Generally, what one person finds hard, another person has already done several times before. It's the part that no one has done before that eventually surfaces as being truly hard.

Young kids today (get off my lawn!) often try to make the case that an Agile approach finesses the "hard part" problem. We define a Minimally Viable Product (MVP) and we (magically) don't have to worry about doing the hard part first or last.

They're wrong.

If the MVP happens to include the hard part, we're back a HPF. If the MVP tries to avoid the hard part, we're looking at HPDL.

The Novelty Factor

Agile methods don't change things. We still have to Confront the Novelty (CTN™). Either it's new technology or it's a new problem domain or a new solution to an existing problem domain. Something must be novel, or we wouldn't be writing software, we'd be downloading it.

I'm a HPF person. If you set the hard part aside to do later, all the things you do instead become constraints, limiting your choices for solving the hard part that comes later. In some rare cases, you can decompose the hard part and solve it in pieces. The decomposition is simply Hard Part First through Decomposition (HPFtD™) followed by Prioritize the Pieces (PtP™) and another round of Hard Part First.

Today, we're at a big milestone in the HPF journey.

The application's data model is simple. However.

The application has a complex pipeline of processing to get from source data to the useful data model.

A strict (and dumb) MVP approach would skip building the complex pipeline and assume that it was magically implemented somehow.

A slightly smarter MVP approach uses some kind of technical spike solution to handle the complex pipeline. We do that manually until we get past MVP and decide to implement the pipeline in something more final and complete.

My HPF strategy tackles the complex pipeline because we have to build it anyway and it's hard. We don't have to build all of it. Just enough to lay out the happy path.

The milestone?

It's time to totally refactor because -- even doing the hard part first -- we have the wrong things in the wrong places. Django application boundaries generally follow the "resources". It's a lot like designing a RESTful API. Define the resources, cluster them together in some kind of ontology that provides a meaningful hierarchy.

Until -- of course -- you get past the problem domain novelty and realize that some portion of the hierarchy is going to become really lopsided. It needs to be restructured so we have a flat group of applications.

Wait. What?

Flatten?

Yes.

When we have a Django application model that's got eleventy-kabillion classes, it's too big. Think the magic number 7±2: there's a limit to our ability to grasp a complex model.

Originally, we thought we'd have apps "A", "B", and "C". However. "A" turned out to be more complex than it seemed when we initially partitioned the apps. Based on the way the classes are named and clustered in the model file, it's clear that we have an internal structure is struggling to emerge. There are too many comments and high-level organizational hints in the docstrings.

It looks like this might be the model that's emerging:

Former A

A1
Conceptual A2

This means that there will be classes in A3 that depend on separate apps A2a and A2b. Further, A2 is really just a concept that unifies the design; it doesn't need to be implemented as a proper app. Both A2a and A2b depend on A1. A3 depends on A2a, A2b, and A1.

Ugh. Refactoring. And the associated migrations.

Django allows us to have nested apps. But. Do we really want to go there? Is a nested collection of packages really all that helpful?

Or.

Would it be better to flatten the whole thing, and simply annotate the dependencies among apps?

The Zen Of Python suggests that Flat is Better than Nested.

The hidden benefit of Flat is that the Liskov Substitution Principle is actually a bit easier to exploit. Yes, we have a tangled web of dependencies, but we're slightly less constrained when all of the Django apps are peers. Yes, many things will depend on the A1 app, but that will be less of a problem than the current pile of classes is.

The important part here is to start again. This means I need to discard the spike database and discard the history of migrations to date. I always hate disrupting my development databases, since it has test cases I know and remember.

That's the disruptive milestone for me: discarding the old database and starting again.