Tuesday, May 19, 2015

More Thoughts on the friction of DevOps

Read this: How 'DevOps' is Killing the Developer

My pull-out quote:
This is why we see so many developers that can't pass FizzBuzz: they never really had to write any code.
I agree: It appears that DevOps may be more symptom than solution.

I have one tiny objection to an otherwise excellent series of points: I don't like the totem pole analogy.

I prefer a supply-chain:
  • Release Engineers respond to user needs.
  • Quality Engineers respond to the Release Engineers' needs for assurance that something is fit for use.
  • Developers respond to Release Engineers by providing software.
  • Similarly, procurement folks may purchase or lease or download and pay royalties for software. 
I think of it like this:

Developer ⇒ QE ⇒ RE ⇒ Users

No top-to-bottom. More a sequence of more-or-less peers.

I still agree with the central tenet: a developer is able to march the software from concept to user. We don't really expect QE or RE to create software. We might expect some skill sharing between QE and RE.

Many years ago, I posted this: IT’s Drive to Self-Destruction, which is random and whiny but related to this point about DevOps. The idea is that key developers create competitive advantage. Release Engineers put it in the hands of users. Both are important. Without creation there's no deployment. Without deployment, creators can be diverted to do the deploying; things still move forward, but more slowly.

The key point is this:
If a developer is spending time with DevOps (and TechOps) trying to get stuff deployed, who's developing the Next Big Thing? 

Tuesday, May 12, 2015

Class Design Strategies -- analysis vs. synthesis

The conventional wisdom on class design is to model real-world things. This is the Domain-Driven Design Approach. It's what we used to teach as Rumbaugh's OMT (prior to the Three Amigos creating UML.)

The idea is simple: Look at the real world objects. Describe them.

Classes will have attributes and behaviors. They will have relationships. Rumbaugh was very careful about keeping object and class separate. A class had associations; an object had links. The association was the abstraction; the link was a concrete implementation. A class offered an operation; an object provided a method as an implementation.

As powerful as this is, I'm not sure it's the final word.

The only problem it has is that people often get confused by "real world objects." I've seen a number of places where folks completely fail to distinguish Enterprise IT implementation choices from actual things that reflect actual objects in the actual world.

Users and Credentials, for example. Users are real human beings. You find them in the hallways, standing around. They take up space in conference rooms. Credentials are a peculiar security-focused way to summarize a person as something public (username) and something private (password.) You don't find a stack of credentials tying up a room at the end of the hallway. Indeed, you can't physically stack credentials. While something a user knows is important, it isn't the entirety of a User. The attributes and behaviors of credentials aren't a good model for a User. But you still have this argument periodically when developing a class model or a noSQL database document model.

I'd like to emphasize that this is -- as far as I can tell -- the only problem with domain driven modeling. Some people don't see the domain very clearly because they tend to stick to a technology-driven world view.

However. That doesn't mean that drawing on the white board is the only way to discover the domain.

Building Classes from Functions

Allow me to suggest a heretical alternative to the whiteboard.

Once upon a time, the whiteboard was the only way to do object modeling. The successor to the whiteboard (I use Argo UML as well as Pocketworks yuml) is a diagramming tool that -- ideally -- helps you understand the domain before committing to the cost and complexity of writing code.

Wait a second, though.

The "cost and complexity of writing code"? Java programmers know what I mean. If you don't have your classes understood, you should not start slapping code together.

Python programmers have no idea what "cost and complexity of writing code" means. They slap classes together faster than I can draw the damn pictures on Argo.

Indeed, the pictures can become a kind of burden. The picture shows "x.X", therefore, the module must include "x.X". Even though there might be a better way using classes in separate modules "a.Y" and "b.Z". But changing the cluster of pictures that comprises a fairly complete UML diagram isn't easy.

[Clearly, this depends on how much you tried to show. If your diagrams are really spare, refactoring is no problem. If you include parts of the object model in the component diagram or activity diagram, you're in trouble.]

This leads to an alternative to the whiteboard. And the diagramming tool.

Code. [Cue Orchestral Hit: Ta-daaa!]

Yes. Code. [Cue Orchestral Hit: Ta-ta-ta-daaa!]

When you can slap together a spike solution in Python you have a sensible alternative to the whiteboard.

You can build some classes, write some demonstration code to show how they work together. Don't like it? Start again from another base of classes. You can do this as a Mob Programming exercise. It fits somewhere between grooming a story and finishing an MVP release. Indeed, it may be a good way to do specific, concrete grooming.

In some cases, though, you can't build classes. You don't really know (or can't agree) on what the real world things are.

Rather than debate, shift the focus. Just write functions.

In Python, this is easy, since functions are first-class inhabitants of the programming model. In Java, this isn't easy at all. Functions aren't proper things; they must be part of a class; and if you can't agree on what the classes are, you have the Java stalemate. [Yes, Java 8 introduces lambda expressions, which soften this.]
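For instance (a tiny, hypothetical sketch, with made-up function names): a processing pipeline can be assembled from bare functions long before anyone agrees on a class.

```python
# Functions are ordinary objects in Python: they can be stored in
# lists, passed as arguments, and composed before any class exists.
def parse(text):
    return text.strip().split(",")

def validate(fields):
    return all(f != "" for f in fields)

# A pipeline is just a list of functions -- no class required yet.
pipeline = [parse, validate]

def run(text, steps):
    data = text
    for step in steps:
        data = step(data)
    return data

print(run(" a,b,c ", pipeline))  # True
```

If a class eventually emerges here, it will be because the functions revealed common behavior, not because a diagram demanded one.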

How This Works In Practice

In many cases, it makes sense to punt on the "big picture." You're not really sure what you even have.  Yes, you know you have eight individual CSV files that reflect events that happened somewhere in cyberspace. (Let's just say they were the output from stored procedure triggers; the only record of changes made to crucial data.)

You can wring your hands over the eight-table join required to reconstruct the pre-change and post-change views of the objects. You can wring your hands over the way it's really three (or is it four?) different navigation paths from I to II to IV to (V to VI to VII) union I to II to IV to (V to VI to VII) union I to III to IV to oh my god I'm so confused.


You can get the sample data.  You can read it using the CSV module.

DictReader can be awkward. It can be fixed, however. If your column titles are legal Python variables, you can use the following function to create a namespace reader from a DictReader. This allows you to say row.ATTRIBUTE instead of row['ATTRIBUTE'].

from types import SimpleNamespace

def nsreader(dictreader):
    return (SimpleNamespace(**row) for row in dictreader)
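A self-contained exercise of this reader, using a hypothetical two-column file (the column names are made up for illustration):

```python
import csv
import io
from types import SimpleNamespace

def nsreader(dictreader):
    # Wrap each dict row in a SimpleNamespace: row.KEY instead of row['KEY'].
    return (SimpleNamespace(**row) for row in dictreader)

# Stand-in for one of the CSV extracts.
sample = io.StringIO("KEY,AMOUNT\nk1,100\nk2,250\n")
rows = list(nsreader(csv.DictReader(sample)))
print(rows[0].KEY, rows[0].AMOUNT)  # k1 100
```

Note that csv gives us strings; conversions to int or Decimal can wait until we know which columns matter.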

We can then turn to working out the various join algorithms on real data. Each step builds objects based on types.SimpleNamespace.

You start with simplest possible join algorithm: load everything into dictionaries based on their keys.

I_map = { row.KEY: row for row in nsreader(table_I_dict_reader) }
II_map = { row.SOMETHING: row for row in nsreader(table_II_dict_reader) }

Once you have sample data in memory, you can figure out what the actual, working relationships are. You can tinker with navigation paths through the tangled mess of tables. You can explore that data. You can do data profiling to find out how many misses there are.

If the tables are smallish (10,000 rows each) it all fits in memory nicely. No need for fancy database connections and no need to reason out join algorithms that don't tie up too much memory. You're not writing a database server. You're writing an application.
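A sketch of the kind of profiling this enables. The table contents and key names here are hypothetical stand-ins for the real CSV extracts:

```python
from types import SimpleNamespace

# Hypothetical rows standing in for two of the extracts.
table_I = [SimpleNamespace(KEY="a", DATA=1), SimpleNamespace(KEY="b", DATA=2)]
table_II = [SimpleNamespace(SOMETHING="a", MORE=10), SimpleNamespace(SOMETHING="c", MORE=30)]

# The simplest possible join: everything in dictionaries, keyed.
I_map = {row.KEY: row for row in table_I}
II_map = {row.SOMETHING: row for row in table_II}

# Data profiling: how many rows of I have a match in II? How many miss?
hits = sum(1 for k in I_map if k in II_map)
misses = sum(1 for k in I_map if k not in II_map)
print(hits, misses)  # 1 1
```

With the real data loaded this way, each candidate navigation path is one more dictionary comprehension, cheap to write and cheap to throw away.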

Look For Common Features

The design issue for classes is locating common features: common attributes or common methods. We often start down the road of common attributes. Because. Well... it seems logical.

Focus on attributes is a bias.

Classification of objects isn't based mostly on attributes. It isn't even a 50-50 split between attributes and behaviors.

We tend to focus on attributes -- I think -- out of habit. Data structures mean "common data", right? Databases include tables of commonly-structured data.

But this isn't a requirement -- nor is it even important. It's just a habit.

We can conceive of a class hierarchy based around common behavior, too. This may require a very flexible collection of attributes. On the other hand, there's no a priori reason not to define classes based on their behavior.

That's why the idea of building functions first doesn't seem too far-fetched.

First, we can build working functions.  We can have test cases and everything.

Then we can look for commonality. We can refactor into classes. We can start with a Flyweight design pattern. As common attributes emerge, we can refactor to store more state in the class, and less state somewhere else. The API changes while we do this.
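One possible shape for that refactoring (the names here are invented for illustration): a working function first, then a Flyweight-style class whose behavior is local but whose state still lives in an external namespace.

```python
from types import SimpleNamespace

# Step 1: a standalone, testable function; state is passed in explicitly.
def balance(account):
    return sum(t.amount for t in account.transactions)

# Step 2: a Flyweight-style class. Behavior moves into the class;
# the bulk of the state remains in the shared external namespace.
class Account:
    def __init__(self, ns):
        self.ns = ns  # a reference to external state, not a copy
    def balance(self):
        return sum(t.amount for t in self.ns.transactions)

data = SimpleNamespace(
    transactions=[SimpleNamespace(amount=100), SimpleNamespace(amount=-30)]
)
print(balance(data))            # 70
print(Account(data).balance())  # 70
```

Later refactorings can pull attributes into Account one at a time, as the data proves they really are common.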

Then we examine it for the "is this a thing" criteria. Last, not first. We may need to make a few more tweaks to reflect the thing we discovered scattered around the functions. The thing may be a checklist or a recipe or a procedure: something active instead of simply stateful.

This tends to make RESTful web services a bit of a head scratcher. If we have an active thing, what is the state that we can represent and transfer? The state may be very small; the active agency may be quite sophisticated. This shouldn't be too baffling, but it can be confusing when the GET request response is either 200 or 403/409: OK or Forbidden/Conflict. Or there are multiple shades of 200: 200 OK with a body that indicates success, vs. 200 OK with a body that indicates something more needs to be done, vs. warnings, vs. exceptions, vs. other nuanced results.
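A sketch of untangling those nuanced responses. The status codes come from the discussion above; the body keys and outcome labels are hypothetical:

```python
# An active resource's outcome is spread across status code and body.
# This function makes the layered interpretation explicit.
def interpret(status, body):
    if status in (403, 409):
        return "refused"            # Forbidden or Conflict
    if status == 200:
        if body.get("warnings"):
            return "ok-with-warnings"
        if body.get("more_required"):
            return "incomplete"     # 200, but something more must be done
        return "ok"
    return "unexpected"

print(interpret(200, {"warnings": ["stale data"]}))  # ok-with-warnings
print(interpret(409, {}))                            # refused
```

The point isn't this particular mapping; it's that an active, behavior-rich resource forces the client to read two layers of result, not one.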

Summary -- tl;dr

I think there's a place for code-first design. Build something to explore the space and learn about the problem domain. Refactor. Or Delete and Start Again. In modern languages (i.e., Python) code is cheap. Whiteboard design may not actually save effort.

I think there's a place for building functions and refactoring them into classes. I think the Java pre-8 "All Classes or Burn In Hell" approach is misleading. Functional programming and languages like Python show that functions should be a first-class part of programming.

I think there's too much emphasis on stateful objects. The DDD warnings about "anemic" classes seem to come from a habitual over-emphasis on state and an under-emphasis on operations. I think that active classes (as much as they push the REST envelope) might be a good thing.

Tuesday, May 5, 2015

Scrum, Agile, and Modern Tools

Required Reading: https://www.pandastrike.com/posts/20150304-agile

My takeaway quote? "Scrum lags behind the modern toolchain enough that there can be a Potemkin village vibe to the whole thing."

I was clued into this from another takeaway quote someone seconded on Twitter: "Waterfall used too much written communication, but Agile doesn't use enough."

Also read this: http://caines.ca/blog/2014/12/02/i-dont-miss-the-sprint/

Is "sprint" misleading? What about "sprint commitment?"

I'm not sure I object to "sprint" per se.

But I have seen "sprint commitment" turned into an organizational problem, removing what could have been a helpful tool. Folks who start harping on sprint commitments in the sense of "we committed to this, will we meet the deadline?" tend to create a toxic environment. I think the people who hype commitment the most really liked the non-Agile environments: they try to bend Agile to meet their Waterfall concepts.

The problem is the word. A "sprint commitment" shouldn't be used like a legally binding "do it or pay penalties" commitment. It should be a metric used to gauge progress. More like a "sprint outcome".

The commitment hype can lead to stories, epics and detailed technical tasks getting muddied up terribly. The story becomes an epic. Little tiny technical tasks get inflated into big important stories. A proper user story gets replaced with nonsense about prepping a database for production rollout, or resolving defects found in QA, or things that -- obviously -- aren't user stories, but are taking up a lot of time.

When it appears that a story is going nowhere, the scrum master breaks it down into things that have status which changes frequently. The sense of end-user meaning behind the actual story gets lost in a haze of technical considerations and tasks that show activity rather than accomplishment or value.

"As an actuary, I want to know that the developers have written syntactically correct DML for my database, so that the product owner doesn't have to wait as long for the DBAs to build the database."


Tuesday, April 28, 2015

Software Subscription fee? Or just Vapor-ware?

From a sound engineer I know:
This is the new Pro Tools, it's now a subscription, for a mere $200 a year to stay current. However, if you peruse the list of new features, they all say "coming soon". You pay a subscription fee today for features that aren't even live yet. It's like I'm paying for an open beta of PT12 
I feel this is becoming a more prevalent and systemic problem in software development. Am I wrong? What is going on here?
Is it right to ask a fee in advance for features to be developed in the future? Is this a sustainable business model? Or will customers eventually rebel and switch to another product with better licensing terms and conditions?

Further, this clearly isn't a "purchase" in any traditional sense. Software has always lived in a weird zone where you don't really purchase the software in any sense, you purchase a right to use the software under limited terms and conditions spelled out in an end-user agreement that you probably didn't read. Since you didn't actually purchase it, you can't reverse engineer it or modify it.

It's my understanding that Avid's Pro Tools is the gold standard in the sound engineering industry. It seems unlikely that open source startups will catch up to the power of Avid's product offerings and undercut their pricing.

But -- of course -- who could imagine that MySQL or PostgreSQL would erode significant market share from Oracle or IBM?  Indeed, how much market share has been consumed by SQLite?

Software prices appear to have plummeted over the decades. In some cases, the drop is real and the root cause seems to be open source tools to build software. In other cases, the drop is a spreading of costs via subscription services like this one.

What if Avid doesn't deliver on their promises? You've forked over $200 and didn't get the promised new features. What now?

Tuesday, April 14, 2015

Java vs Python

Seems silly at first.

[The original post embeds an infographic comparing Java and Python, not reproduced here.]

It's mostly true. It's also incomplete.

For example, "no tuples in Java" ignores http://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/tuple/package-summary.html.

It seems like the only interesting takeaway from the infographic is that Java syntax is wordy.

One hesitates at a detail-by-detail comparison. Does it help to match the 20 or so statements in Python against equivalents in Java?  Does it help to match the byzantine complexity of Java public/protected/private against something that doesn't exist in Python? Does it even make sense to try and compare the complexity of Java annotations with anything? What about Python meta-programming? The fact that we can easily overload Python operators?

Perhaps it's only because I'm an expert in both languages that I hesitate to try point-by-point comparison.

Tuesday, April 7, 2015

Going to PyCon 2015

In Montreal! How cool is that?

I'll be working for my current employer, also a sponsor, to locate Python talent.

I'll have a few copies of my books that I can give away.

Most importantly, the promotional code PYCON_LOTT gives 50% off my Packt titles and runs from April 7th to April 14th.