S.Lott-Software Architect: October 2009

Saturday, October 31, 2009

Open Source in the News

Whitehouse.gov is publicly using open source tools.

See Boing Boing blog entry. Plus Huffington Post blog entry.

Most importantly, read this from O'Reilly.

Many places are using open source in stealth mode. Some even deny it. Ask your CIO what the policy on open source is, then check to see if you're using Apache. Often, this is an oops -- policy says "no", practice says "yes".

For a weird perspective on open source, read Binstock's Integration Watch piece in SD Times: From Open Source to Commercial Quality. "rigor is the quality often missing from OSS projects". "Often" missing? I guess Binstock travels in wider circles and sees more bad open source software than I do. The stuff I work with is very, very high quality. Python, Django, Sphinx, the LaTeX stack, PIL, Docutils -- all seem to be outstandingly good.

I guess "rigor" isn't an obvious tangible feature of the software. Indeed, I'm not sure what "commercial quality" means if open source lacks this, also.

All of the commercial software I've seen has been in-house developed stuff. Because there's so much of it, it must have these elusive "rigor" and "commercial quality" features that Binstock values so highly. Yet, the software is really bad: it barely works and they can't maintain it.

My experience is different from Binstock's. Also, most of the key points in his article are process points, not software quality issues. My experience is that the open source product exceeds commercial quality. Since there's no money to support a help desk or marketing or product ownership, the open source process doesn't offer all the features of a commercial operation.

Wednesday, October 28, 2009

Painful Python Import Lessons

Python's packages and modules are -- generally -- quite elegant.

They're relatively easy to manage. The __init__.py file (to make a module into a package) is very elegant. And stuff can be put into the __init__.py file to create a kind of top-level or header module in a larger package of modules.

To a limit.

It took hours, but I found the edge of the envelope. The hard way.

We have a package with about 10 distinct Django apps. Each Django app is -- itself -- a package. Nothing surprising or difficult here.

At first, just one of those apps used a couple of fancy security-related functions to assure that only certain people could see certain things in the view. It turns out that merely being logged in (and a member of the right group) isn't enough. We have some additional context choices that you must make.

The view functions wind up with a structure that looks like this.

@login_required
def someView( request, object_id, context_from_URL ):
   no_good = check_other_context( context_from_URL )
   if no_good is not None: return no_good
   still_no_good = check_session()
   if still_no_good is not None: return still_no_good
   # you get the idea

At first, just one app had this feature.

Then, it grew. Now several apps need to use check_session and check_other_context.

Where to Put The Common Code?

So, now we have the standard architectural problem of refactoring upwards. We need to move these functions somewhere accessible. It's above the original app, and into the package of apps.

The dumb, obvious choice is the package-level __init__.py file.

Why this is dumb isn't obvious -- at first. This file is implicitly imported. Doesn't seem like a bad thing. With one exception.

The settings.

If the settings file is in a package, and the package-level __init__.py file has any Django stuff in it -- any at all -- that stuff will be imported before your settings have finished being imported. Settings are loaded lazily -- as late as possible. However, in the process of loading settings, there are defaults, and Django may have to use those defaults in order to finish the import of your settings.

This leads to the weird situation that Django is clearly ignoring fundamental things like DATABASE_ENGINE and similar settings. You get the dummy database engine, Yet, a basic from django.conf import settings; print settings.DATABASE_ENGINE shows that you should have your expected database.

Moral Of the Story

Nothing with any Django imports can go into the package-level __init__.py files that may get brought in while importing settings.

Monday, October 26, 2009

Process Not Working -- Must Have More Process

After all, programmers are all lazy and stupid.

Got his complaint recently.

"Developers on a fairly routine basis check in code into the wrong branch."

Followed by a common form of the lazy and stupid complaint. "Someone should think about which branch is used for what and when." Clearly "someone" means the programmers and "should think about" means are stupid.

This was followed by the "more process will fix this process problem" litany of candidate solutions.

"Does CVS / Subversion have a knob which provides the functionality to

prevent developers from checking code into a branch?"

"Is there a canonical way to organize branches?" Really, this means something like what are the lazy, stupid programmers doing wrong?

Plus there where rhetorical non-questions to emphasize the lazy, stupid root cause. "Why is code merging so hard?" (Stupid.) "If code is properly done and not coupled, merging should be easy?" (Lazy; a better design would prevent this.) "Perhaps the developers don't understand the code and screw up the merge?" (Stupid.) "If the code is not coupled, understanding should be easy?" (Both Lazy and Stupid.)

Root Cause Analysis

The complaint is about process failure. Tools do not cause (or even contribute) to process failure. There are two possible contributions to process failure: the process and the people.

The process could be flawed. There could be no earthly way the programmers can locate the correct branch because (a) it doesn't exist when they need it or (b) no one told them which branch to use.

The people could be flawed. For whatever reason, they refuse to execute the process. Perhaps they know a better way, perhaps they're just being jerks.

Technical means will not solve either root cause problem. It will -- generally -- exacerbate it. If the process is broken, then attempting to create CVS / Subversion "controls" will end in expensive, elaborate failure. Either they can't be made to work, or (eventually) someone will realize that they don't actually solve the problem. On the other hand, if the people are broken, they'll just subvert the controls in more interesting, silly and convoluted ways.

My response -- at the time -- was not "analyze the root causes". When I first got this, I could only stare at it dumbfounded. My answer was "You're right, your developers are lazy and stupid. Good call. Add more process to overcome their laziness and stupidity."

After all, the questioner clearly knows -- for a fact -- that more process helps fix a broken organization. The questioner must be convinced that actually talking to people will never help.

The question was not "what can I do?" The question was "can I control these people through changes to CVS?" There's a clear presumption of "process not working -- must have more process."

The better response from me should have been. "Ask them what the problem is." I'll bet dollars against bent pins that no one tells them which branch to use in time to start work. I'll bet they're left guessing. Also, there's a small chance that these are off-shore developers and communication delays make it difficult to use the correct branch. There may be no work-orders, just informal email "communication" between on-shore and off-shore points-of-contact (and, perhaps, the points-of-contact aren't decision-makers.)

Bottom Line. If people can't make CVS work, someone needs to talk to them to find out why. Someone does not need to invent more process to control them.

Thursday, October 22, 2009

Breaking into Agile

I had a recent conversation with some folks who were desperate to "processize" everything. They were asking about Scrum Master certification and what standards organizations define the "official" Scrum method.

Interestingly, I also saw a cool column in Better Software magazine, called "Scrumdamentalism" on the same basic question.

In my conversation, I referred them to the Agile Manifesto. My first point was that process often gets in the way of actual progress. Too much process focus lifts up "activity" in place of "accomplishment".

My second point, however, was that the Agile Manifesto and the Scrum method are responses to a larger problem. Looking for a process isn't an appropriate response to the problem.

The One True Scrum Quest

Claiming that there's one true Scrum method and everything else is "not scrum" is an easy mental habit. The question gets asked on Stack Overflow all the time. The questions are usually one of two kinds.

What's the "official" or "best practice" Scrum method and how do I define a process that rigidly enforces this for my entire team of 20?
We are doing our design/code/test/integration/release in a way that diverges from the "official" form in the Ken Schwaber and Mike Beedle book. Or it diverges from the Eclipse version. Or it diverges from the Control Chaos overview. Or the Mountain Goat version. Or the C2 Wiki version. Or this version. Is it okay to diverge from the "standard"?

Sigh. The point of Agile is that we should value "Individuals and interactions over processes and tools". The quest for "One True Scrum" specifically elevates the process above the people.

In The Real World

The biggest issue is that the Agile Manifesto is really a response to some fundamental truths about software development.

In management fantasy world, a "project" as a fixed, definite, limited, clearly articulated scope. From this fixed scope, we can then document "all" the requirements (business and technical). This requirements document is (a) testable against the scope, (b) necessary for all further work and (c) sufficient for design, code, test and transition to production. That's not all. And -- in order to make a point later on -- I'll continue to enumerate the fantasies. The fantasy continues that someone can create a "high-level design" or "specification" that is (a) testable against the requirements, (b) necessary for all further work and (c) sufficient to code, test and transition to production. We can then throw this specification over the transom into another room where a "coder" will "cut code" that matches the specification. The code production happens at a fixed, knowable rate with only small random variation based on risk of illness. The testing, similarly, can be meticulously scheduled and will happen precisely as planned. Most "real-world" (management fantasy) projects do not leave any time for rework after testing -- because rework won't happen. If it won't happen, why test? Finally, there will be no technology transfer issues because putting a freshly-written program into production is the same as installing a game from a DVD.

Managers like to preface things with "In The Real World". As in "In The Real World we need to know how long it will take you to write this."

The "in the real world" speech always means "In My Management Fantasy Land." The reason it's always a fantastic speech is because software development involves the unknowable. I'm not talking about some variable in a formula with a value that's currently unknown. I'm talking about predicting the unknowable future.

The Agile Response to Reality

In the Real real world, software development is extraordinarily hard.

Consider this: the computer clock runs in 100-nanosecond increments (1.0E-7). We expect an application to run 24x7x0.999 = 6.04E5 seconds. That's from 100-nano to half-million: about 12 orders of magnitude to keep in one's head.

Consider this: storage in a largish application may span almost a terabyte, (1.0E12). From bytes to terabytes: about 12 orders of magnitude to keep in one's head.

Consider this: a web application written in a powerful framework (Django) requires one to know the following languages and frameworks. Shell script, Apache Config, Python, Django Templates, SQL, HTML, CSS, Javascript, HTTP (the protocol is it's own language), plus the terminology of the problem domain. That's 9 distinct languages. We also have the OS, TCP/IP Apache, mod_wsgi, Django, Python, browser and our application as distinct frameworks. That's 8 distinct framework API's to keep in one's head.

Consider this: the users can't easily articulate their problem. The business analyst is trying to capture enough information to characterize the problem. The users, the analyst, the project manager (and others outside the team) all have recommendations for a solution, polluting the problem description with "solution speak" that's only adds confusion.

In the Management Fantasy "Real World", this is all knowable and simple. In the Real Real World, this is rather hard.

Adapting to Reality

Once we've recognized that software development is hard, we have several responses.

Deny. Claim that software developers are either lazy or stupid (or both). Give them pep-talks that begin "in the real world" and hope that they cough up the required estimates because they're motivated by being told that software development "in the real world" isn't all that hard.
Processize(tm). Claim that software development is a process that can be specified to a level where even lazy, stupid programmers can step through the process and create consistent results.
Adapt. Adapting to the complexity of software development requires asking, "what -- if anything -- expedites software development?"

What Do We Need to Succeed?

There are essentially two domains of knowledge required to create software: the problem domain and the solution domain.

Problem Domain. This is the "business rules", the "scope", the "requirements", the "purpose", etc. We have the features and functions. What the software does. The value it creates. The "what", "who", "where", "when" and "why".
Solution Domain. This is the technology that makes it go. The time and space dimensions (all 12 orders of magnitude in each dimension), all the languages and all the frameworks. The "how".

The issue is this:

We don't start out with complete knowledge of problem and solution.

At the start of the project -- when we're asked to predict the future -- we can never know the whole problem, nor can we ever know the whole solution we're about to try and build.

What we need is this:

Put Problem Domain and Solution Domain knowledge into one person's head.

The question then becomes "Who's head?"

We have two choices:

Non-Programmers. We can try to teach the various non-programmers all the solution domain stuff. We can make the project manager, business analyst, end-users, executive sponsor -- everyone -- into programmers so that they have problem domain and solution domain knowledge.
Programmers. We can try to impart the problem domain knowledge on the programmers. If we're seriously going to do this, we need to remove the space between programmer and problem.

That's the core of the Agile Response: Close the gap between Problem Domain and Solution Domain by letting programmers understand the problem.

The Bowl of Bananas Solution(tm)

"But wait", managers like to say, "in the real world, we can't just let you play around until you claim you're done. We have to monitor your activity to make sure that you're making 'progress' toward a 'solution'."

In the Real real world, you can't define the "problem", much less test whether anything is -- or is not -- a solution. I could hand most managers a bowl of bananas and they would not be able to point to any test procedure that would determine if the bowl of bananas solves or fails to solve the user's problems.

Most project scope documents, requirements documents, specifications, designs, etc., require extensive tacit problem domain knowledge to interpret them. Given a bowl of bananas, the best that we can do is say "we still have the problem, so this isn't a solution." Our scope statements and requirements and test procedures all make so many assumptions about the problem and the solution that we can't even figure out how evaluate an out-of-the-box response -- like a bowl of bananas.

In the Real real world, management in organization A demands that information be kept in a one database. Management organization B has a separate database for reasons mired in historical animosity and territorial scent-marking. Management in yet another organization wants them "unified" or "reconciled" and demands that someone manually put the data into spreadsheets. This morphs into requirements for a new application "system" to unify this data, making the results look like poorly-design spreadsheets. This morphs into a multi-year project to create a "framework" for data integration that maintains the poorly-designed spreadsheet as part of the "solution".

A quick SQL script to move data from A to B (or B to A) is the bowl-0f-bananas solution. It cannot be evaluated (or even considered) because it isn't a framework, system or application as specified in the scope document for the data integration framework.

This is the problem domain knowledge issue. It's so hard to define the problem, that we can't trust the executive sponsor, the program office, the project managers, the business analysts or anyone to characterize the problem for the developers.

The problem domain knowledge is so important that we need to allow programmers to interact with users so that both the problem and the solution wind up in the programmer's head.

Wednesday, October 21, 2009

Unit Test Naming [Updated]

Just stumbled across several blog postings on unit test naming.

Essentially the TestCase will name the fixture. That's pretty easy to understand.

The cool part is this: each test method is a two-part clause: condition_"should"_result or "when"_condition_"then"_result.

See https://wiki.openmrs.org/display/docs/Unit+Testing+With+at-should+Annotation,

Or possibly "method_state_behavior".

See http://osherove.com/blog/2005/4/3/naming-standards-for-unit-tests.html

What a handy way to organize test cases. Only took me four years to figure out how important this kind of thing is.

[Updated to follow moved links.]

Friday, October 16, 2009

Django Capacity Planning -- Reading the Meta Model

I find that some people spend way too much time doing "meta" programming. I prefer to use someone's framework rather than (a) write my own or (b) extend theirs. I prefer to learn their features (and quirks).

Having disclaimed an interest in meta programming, I do have to participate in capacity planning.

Capacity planning, generally, means canvassing applications to track down disk storage requirements.

Back In The Day

Back in the day, when we wrote SQL by hand, we were expected to carefully plan all our table and index use down to the kilobyte. I used to have really sophisticated spreadsheets for estimating -- to the byte -- Oracle storage requirements.

Since then, the price of storage has fallen so far that I no longer have to spend a lot of time carefully modelling the byte-by-byte storage allocation. The price has fallen so fast that some people still spend way more time on this than it deserves.

Django ORM

The Django ORM obscures the physical database design. This is a good thing.

For capacity planning purposes, however, it would be good to know row sizes so that we can multiply by expected number of rows and cough out a planned size.

Here's some meta-data programming to extract Table and Column information for the purposes of size estimation.


import sys
from django.conf import settings
from django.db.models.base import ModelBase

class Table( object ):
   def __init__( self, name, comment="" ):
       self.name= name
       self.comment= comment
       self.columns= {}
   def add( self, column ):
       self.columns[column.name]= column
   def row_size( self ):
       return sum( self.columns[c].size for c in self.columns ) + 1*len(self.columns)

class Column( object ):
   def __init__( self, name, type, size ):
       self.name= name
       self.type= type
       self.size= size

sizes = {
   'integer': 4,
   'bool': 1,
   'datetime': 32,
   'text': 255,
   'smallint unsigned': 2,
   'date': 24,
   'real': 8,
   'integer unsigned': 4,
   'decimal': 40,
}
def get_size( db_type, max_length ):
   if max_length is not None:
       return max_length
   return sizes[db_type]

def get_schema():
   tables = {}
   for app in settings.INSTALLED_APPS:
       print app
       try:
           __import__( app + ".models" )
           mod= sys.modules[app + ".models"]
           if mod.__doc__ is not None:
               print mod.__doc__.splitlines()[:1]
           for name in mod.__dict__:
               obj = mod.__dict__[name]
               if isinstance( obj, ModelBase ):
                   t = Table( obj._meta.db_table, obj.__doc__ )
                   for fld in obj._meta.fields:
                       c = Column( fld.attname, fld.db_type(), get_size(fld.db_type(), fld.max_length) )
                       t.add( c )
                   tables[t.name]= t
       except AttributeError, e:
           print e
   return tables

if __name__ == "__main__":
   tables = get_schema()
   for t in tables:
       print t, tables[t].row_size()

This shows how we can get table and column information without too much pain. This will report an estimated row size for each DB table that's reasonably close.

You'll have to add storage for indexes, also. Further, many databases leave free space within each physical block, making the actual database much larger than the raw data.

Finally, you'll need extra storage for non-database files, logs and backups.

Wednesday, October 14, 2009

Unit Testing in C

I haven't written new C code since the turn of the millennium. Since then it's been almost all Java and Python. Along with Java and Python come JUnit and Python's unittest module.

I've grown completely dependent on unit testing.

I'm looking at some C code, and I want a unit testing framework. For pure C, I can find things like CuTest and CUnit. The documentation makes them look kind of shabby. Until I remembered what a simplistic language C is. Considering what they're working with, they're actually very cool.

I found a helpful posting on C++ unit testing tools. It provided some insight into C++. But this application is pure C.

I'm interested in replacing the shell script in CuTest with a Python application that does the same basic job. That's -- perhaps -- a low-value add-on. Perhaps I should look at CUnit and stay away from replacing the CuTest shell script with something a bit easier to maintain.

Monday, October 12, 2009

Sometimes the universe appears multidimensional -- but isn't

Had a knock-down drag-out fight with another architect recently over "status" and "priority".

She claimed that the backlog priority and the status where the same thing. I claimed that you can easily have this.

Priority: 1, Status: Not Started

Priority: 2, Status: In Process

Priority: 3, Status: Completed

See? It's obvious that they're independent dimensions.

She said that it's just as obvious that you're doing something wrong.

Here's her point:

If you have priority 1 items that aren't in process now, then they're really priority 2. Fix them to honestly say priority 2.
If you have priority 2 items that "somehow" jumped ahead of priority 1 items, they were really priority 1. Fix them to say priority 1. And don't hand her that "in the real world, you have managers or customers that invert the priorities". Don't invert the priorities, just change them and be honest about it.
The only items that are done must have been priority 1, passed through an "in-process" state and then got finished. Once they're done, they're not priority 1 any more. They're just done.
Things that hang around in "in-process, not done" have two parts. The part that's done, and some other part that's in the backlog and not priority 1.

She says that priority and status are one thing with the following values.

Done.
Priority 1 = in process right now.
Priority 2 = will be in process next. Not eventually. Next.
Priority 3 through ∞ = eventually, in order by priority.

Any more complex scheme is simply misleading (Priority 1 not being done right now? Is it a resource issue? A priority issue? Why aren't you doing it?)

Tuesday, October 6, 2009

Flattening Nested Lists -- You're Doing It Wrong

On StackOverflow you can read numerous questions on "flattening" nested lists in Python.

They all have a similar form.

"How do I flatten this list [ [ 1, 2, 3 ], [ 4, 5, 6 ], ... , [ 98, 99, 100 ] ]?"

The answers include list comprehensions, itertools, and other clever variants.

~~All~~ Much of which is ~~simply wrong~~ inappropriate.

You're Doing it Wrong

The only way to create a nested list is to append a list to a list.

theList.append( aSubList )

You can trivially replace this with the following

theList.extend( aSubList )

Now, your list is created flat. If it's created flat, you never need to flatten it.

Obscure Edge Cases

Sometimes it may be necessary to have both a flattened and an unflattened list. I'm unclear on when or how this situation arises, but this may be edge case that makes some of itertools handy.

For the past 3 decades, I've never seen the "both nested and not nested" use case, so I can't fathom why or how this would arise.

Visiting a Tree

Interestingly, a tree visitor has a net effect somewhat like "flattening". However, it does not actually create an intermediate flat structure. It simply walks the structure as it exists. This isn't a proper use case for transforming a nested list structure to a flat structure. Indeed, this is a great example of why nested structures and flat structures are quite separate use cases.

Monday, October 5, 2009

Code Kata : Analyze A Hard Drive

This isn't computer forensics; it's something much simpler.

A colleague has been struck down with a disease (or won the lottery) and won't be back to work any time soon. Worse, they did not use SVN to do daily check-ins. Their laptop has the latest and greatest. As well as all experiments, spike solutions, and failed dead-ends.

You have their hard drive mounted in an enclosure and available as /Volumes/Fredslaptop or F: if you're a windows programmer.

There are, of course, thousands of directories. Not all of which are terribly useful.

Step 1 - find the source. Write a small utility to locate all directories which contain "source". Pick a language you commonly work in. For C programmers, you might be looking for .c, .cpp, .h, and .hpp files. For Python programmers, you're looking for .py files. For Java programmers, you're looking for .java, and .class files.

Step 2 - get information. For each directory that appears to have source, we want to know the number of source files, the total number of lines in all those source files, and the most recent modification time for those files. This is a combination of the output from wc and ls -t.

Step 3 - produce a useful report. To keep your team informed, create a .CSV file, which can be loaded into a spreadsheet that summarizes your findings.

Friday, October 2, 2009

Agile Methods and "Total Cost"

Many folks ask about Agile project planning and total cost. As our internal project managers wrestle with this, there are a lot of questions.

Mostly these questions are rejections of incremental delivery ("All or Nothing") or rejections of flexibility ("Total Total Cost"). We'll look at these rejections in detail.

Traditional ("waterfall") project planning creates a master plan, with all skills, all tasks, all effort, and all costs. It was easy to simply add it up to a total cost.

Software development, unlike -- for example -- carpentry, has serious unknowns. Indeed software development has so many unknowns that it's not possible to compare software project management with the construction trades.

A carpenter has a task ("frame up these rooms") that has an absolute boundary with no unknown deliverables. No one says things like "we need to separate the functions of these users from those users." They say "build a wall, surface dry-wall, tape, paint, add molding." The carpenter measures, and knows precisely the materials required.

The carpenter rarely has new technology. The pace of change is slow. A carpenter may switch from hand-held nails to a nail gun. It's still nails. The carpenter may switch from wooden 2x4's to metal supports. It's still vertical members and nails. The carpenter may switch brands of wall-board. It's still wall-board.

The consequence of this is that -- for software projects -- Total Cost Is Hard To Predict.

Hack-Arounds

Total cost is hard to predict, but we try to do it anyway. What we do is add "risk factors" to inflate our estimate. We add risk factors for the scope of delivery. We add risk factors for our ability to deliver.

We can organize these risk factors into several subtle buckets. The COCOMO model breaks scope down into three Product Attributes and four Hardware Attributes. It breaks delivery down into five Personnel Attributes and three Project Attributes.

This is a hack-around because we simply cannot ever know the final scope, nor can we ever know our ability to deliver. We can't know our ability to deliver because the team is constantly changing. We should not cope with this expected constant state of flux by writing an elaborate plan and then reporting our failure to meet that plan. That's stupid.

Worse still, we can't know the scope because it's usually a fabric of lies.

Scope Issue 1: "Required"

Customers claim that X, Y and Z are "required". Often, they have no idea what "required" even means. I spent a fruitless hour with a customer that had a 24×7 requirement. I said, "you haven't purchased hardware that will give you 24×7, so we're submitting this change order to remove it from the requirements."

They said, "It's more of a goal. We don't want to remove it."

I said, "It cannot be achieved. You will not pay us because we will fail. Can we remove it and rewrite it as a 'goal'?"

They said, "No need to remove it: we wouldn't failure to meet that requirement as a 'failure'."

"Okay," I said, "what's the minimum you'll put up with before suing us for failing?"

They couldn't answer that. They had no "required" up-time and could not determine what was "required". They had a goal, but no minimum that would trigger labeling the project a failure.

Of course, the project failed. But not because of up-time. There were dozens of these kinds of poorly-worded requirements that weren't really required.

Scope Issues 2: "The Game"

I worked with some users who were adept at gaming IT. They knew that IT was utterly incapable of delivering everything in the requirements document. They knew this and planned on it.

Also, the users knew that a simple solution would not "add enough value"; a simple solution would get rejected by the governance committee. They knew this and planned on it also.

The users would write amazing, fabulous, wondrous requirements, knowing that some of them were sacrificial. The extra requirements were there to (1) force IT to commit serious resources to the project and (2) convince governance that the software "added enough value".

IT spent (wasted?) hours planning, architecting, designing, estimating and tracking progress against all of the requirements. Then, when we got to acceptance testing, there were numerous "requirements" that were not required, nor even desired. They were padding.

What To Do?

Okay. Scope and delivery are unknowable. Fine. In spite of this, what do we do to provide a reasonable estimate of development effort?

Gather the "requirements" or "desires" or "wishes" or "epics" or "stories" or whatever you've got that provides some scope definition. This is the "analysis" or "elaboration" phase. Define "what", but not "how". Clearly define the business problem to be solved. Avoid solution-speak like "database", "application server", and the like.
Decompose. Define a backlog of sprints based on what you know. If necessary, dig into some analysis details to provide more information on the sprints. Jiggle the sprints around to get a consistent size and effort.
Prioritize based on your best understanding. Define some rational ordering to the sprints and releases. Provide some effort estimate for the first few releases. This estimate is simply the sum of the sprint costs. The sprints should be all about the same effort and about the same cost. About. Not exactly. Fine tune as necessary.
Prioritize again with the users. Note that the sprint costs and the sprints required to release are all in their face. They can adjust the order only. Cost is not negotiable. It's largely fixed.

Rejection 1: All Or Nothing

One weird discussion point is the following: "Until release X, this is all useless. You may as well not do release 1 to X-1, those individual steps are of no value."

This is not true, but it's a way some folks try to reject the idea of incremental releases.

You have two possible responses.

"Okay." In this case, you still create the releases, you just don't deliver them. We watched two members of the customer's management team argue about the all-or-nothing issue. One bone-head kept repeating that it was all-or-nothing. Everyone else claimed that Release 1 and 2 were really helpful, it was release 3 to X-1 that were not so useful.
"What not?" In this case, you suspect that the priorities are totally wrong and -- for some reason -- the customer is unwilling to put them in the correct order.

Everything can be prioritized. Something will be delivered first. At the very least, you can play this trump card. "We need to do incremental releases to resolve any potential problems with delivery and turn-over."

Rejection 2: Total Total Cost

The most frustrating conversations surround the "total cost" issue.

The trick to this is the prioritization conversation you had with your users and buyers. Step 4, above.

You gave them the Release - Sprint - Cost breakdown.

You walked through it to put the releases and sprints into the correct order.

What you have to do is add another column to the spread-sheet: "Running Cost". The running cost column is the sum of the sprint costs. Each running cost number is a candidate total cost. It's just that simple.

It takes several tries to get everyone's head wrapped around the concept.

Customer Control

You know the concept has started to sink in when the customer finally agrees that they can pull the plug on the project after any sprint. They grudgingly admit that perhaps they control the costs.

You know they really get it when they finally say something like this.

"We can stop at any time? Any time? In that case, the priority is all wrong. You need to do X first. If we were -- hypothetically -- going to cancel the project, X would create the most value. Then, after that, you have to do Z, not Y. If we cancel after X and Z, we've solved most of the real problems."

When they start to go though hypothetical project cancelation scenarios with you, then they get the way that they control the total cost.

This tends to avoid the tedious of negotiations where the customer then changes the requirements to meet their budget. Nothing is more awful than a customer who has solicited bids via a Request for Proposal (RFP) process. They liked our bid, but realized that they'd asked for too much, and want to reduce the scope, but don't have priorities or cost-per-release information.

If you do the priorities interactively -- with the customer -- there's no "negotiation". It's just decision-making on their part.

Thursday, October 1, 2009

Agile Project Management

Got this question recently.

"Any suggestions on PM tools that meet the following considerations

1) Planning

2) Estimating

3) Tracking (allowing both PM input and developer input)

4) Reporting

5) Support both Agile and Waterfall projects

6) Releases

7) Bug fixes (probably just another type of backlog)"

Agile PM requires far less planning than you're used to.

1. A "backlog" which is best done on a spreadsheet.

2. Daily standup meetings which last no more than 15 minutes at the absolute longest.

And that's about it.

Let's look at these expectations in some detail. This is important because Agile PM is a wrenching change from waterfall PM.

Planning

There are two levels of detail in planning. The top level is the overall backlog. This is based on the "complete requirements" (hahaha, as if such a thing exists). You have an initial planning effort to decompose the "requirements" into a workable sequence of deliverables and sprints to build those deliverables. Don't over-plan -- things will change. Don't invest 120 man-hours of effort into a plan that the customer will invalidated with their first change request. Just decompose into something workable. Spend only a few days on this.

The most important thing is to prioritize. The backlog must always be kept in priority order. The most important things to do next are at the top of the backlog. At the end of every sprint, you review the priorities and change them so that the next thing you do is the absolutely most valuable thing you can do. At any time, you can stop work, and you have done something of significant value. At any time, you can review the next few sprints and describe precisely how valuable those sprints will be.

The micro level of detail is the next few deliverables. No more than four to six. Don't over-plan. Review the deliverables in the backlog, correcting, expanding, combining and refining as necessary to create something that will be of value. List the sprints to build those deliverables. Try to keep each sprint in the four week range. This is really hard to do at first, but after a while you develop a rhythm based on features to be built and skills of the team. You don't know enough going in, so don't over-plan. After the first few sprints you'll learn a lot about the business problem, the technology and the team.

Estimating

Rule 1: don't. Rule 2: the estimate is merely the burn rate (cost per sprint) times the number of sprints. Each sprint involves the whole team building something that *could* be put into production. A team of 5 with 4 week sprints is a cost of 5*40*4 (800 man-hours).

Each sprint, therefore, has a cost of 800 man-hours. Period. The overall project has S sprints. If the project runs more than a year, stop. Stop. The first year is all you can rationally estimate this way. Future years are just random numbers. 5*40*50 = 10,000 man-hours.

Details don't matter because each customer change will invalidate all of your carefully planned schedules. Just use sprints and simple multiplies. It's *more* accurate since it reflects the actual level of unknowns.

What about "total cost"? First, define "total". When the project starts is the "complete requirements" (hahahaha, as if such a thing actually exists). Then, with each customer change, this changes. Further, half the requirements are merely "nice-to-haves". Since they're merely nice, they're low priority -- at the bottom of the backlog.

Since each sprint creates something deliverable, you can draw a line under any sprint, call it "done" and call that the "total cost". Any sprint. Any. There are as many different total costs as there are sprints, and all of them are right.

Tracking

I don't know what this is. I assume it's "tracking progress of tasks against a plan". Since the tasks are not planned at a low level of detail, there's nothing to "track".

You have a daily stand-up. People commit to do something that day. The next day you find out if they finished or didn't finish. This isn't a "tool thing". It's a conversation. Done in under 15 minutes.

Two things can happen during this brief conversation.

- Things are progressing as hoped. The sprint will include all hoped-for features.

- Things are not progressing as hoped. The sprint may not include some feature, or will include an incomplete implementation. The sprint will never have bugs -- quality is not sacrificial. Features are sacrificial.

There's no management intervention possible. The sprint will have what it will have. Nothing can change that. More people won't help. Technology changes won't help. Design changes won't help. You're mid-sprint. You can only finish the sprint.

AFTER the sprint is over, and you've updated the backlog and fixed the priorities, you might want to consider design changes or technology changes.

Reporting

What? To Whom? Each sprint is a deliverable. The report is "Done".

The backlog is a shared document that the users "own" and you use to assure that the next sprint is the next most important thing to do.

Support both Agile and Waterfall projects

Not possible. Incompatible at a fundamental level. You can't do both with one tool because you don't use tools for Agile projects. You just use spreadsheets.

Releases

Some sprints are release sprints. They're no different (from a management perspective) than development sprints. This is just CM.

Bug fixes

Probably just another type of backlog. Correct.

S.Lott-Software Architect

Moved

Moved. See https://slott56.github.io. All new content goes to the new site. This is a legacy, and will likely be dropped five years after the last post in Jan 2023.