Moved

Moved. See https://slott56.github.io. All new content goes to the new site. This is a legacy site, and it will likely be dropped five years after the last post, in January 2023.


Tuesday, April 25, 2017

Modules vs. Monoliths vs. Microservices:

Dan Bader (@dbader_org)
Worth a read: "Modules vs. Microservices" (and how to find a middle ground) oreilly.com/ideas/modules-…

"don't trick yourself into a microservices-only mindset"

Thanks for sharing.

The referenced post gives you the freedom to have a "big-ish" microservice. My current example has four very closely-related resources. There's agony in decomposing these into separate services. So we have several distinct Python modules bound into a single Flask container.

Yes. We lack the advertised static type checking for module boundaries. But that kind of static type checking doesn't really solve the important problems, since the issues are always semantic and can only be found with unit tests, integration tests, and Gherkin-based acceptance testing (see Python BDD: https://pypi.python.org/pypi/pytest-bdd and https://pypi.python.org/pypi/behave/1.2.5).

We walk a fine line. How tightly coupled are these resources? Can they actually be used in isolation? What do the possible future changes look like? Where is the swagger.json going to change?

It's helpful to have both options on the table.

Thursday, June 12, 2014

TDD, API Design and Refactoring

See this short discussion on a Stingray Reader feature:
https://sourceforge.net/p/stingrayreader/discussion/COBOL/thread/d2132851/?limit=25#2a3a

This turned into an exercise in pure TDD.

<rant>
I'm not a fan of applying TDD in a strict, death-march fashion.

I see the comments on Stack Overflow that indicate that some folks feel strongly that strict TDD is somehow helpful. While "test before code" is laudable and often helpful, there's no royal road to good software.

Design involves a great deal of back and forth between code and test. A great deal.

It's logically impossible to write a test without having thought about the code. In order to write the test first, there must be a notional API against which the test is written. Anyone who requires that the test file must be written before the notional class or module is just playing at petty tyranny.

The notional design -- the rough outline of the class or module -- can be written into a file before any tests. It's okay. It is still test-driven because the considerations of testability drove the design process.

In particular, when starting "from scratch" -- with nothing -- writing tests first is senseless. Some module or package structure must exist for the test modules to import.

</rant>
Rant aside, there are still circumstances where the tests really do come before any code.

In this case, the requested functionality was quite difficult to visualize. However, it was possible to cobble together a test case that simplified the problem down to something like this:


01 Some-Record.
     05 Header PIC XXX.
     05 Body PIC X(17).

01 ABC-Segment.
     05 Field-ABC PIC X(17).

01 DEF-Segment.
     05 Field-DEF PIC X(17).


In COBOL, the program would use logic like IF Header EQUALS "ABC" THEN MOVE Body TO ABC-Segment. We need a way to handle something like this in Python so that we can parse the EBCDIC COBOL data.

This summarized example allowed construction of a test case that made use of an API that might have existed. I was pretty sure I had a test case that showed an approach.
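
Here's a rough sketch of the idea behind that test case. It is not the actual Stingray Reader API -- every name here is invented for illustration -- but it shows the shape of the approach: pick a segment layout based on the Header field, then decode the Body with that layout.

import unittest

# Hypothetical layouts: segment field name and the slice of the Body it occupies.
SEGMENT_LAYOUTS = {
    "ABC": ("Field-ABC", slice(0, 17)),
    "DEF": ("Field-DEF", slice(0, 17)),
}

def parse_record(data):
    """Split a record into Header and Body, then decode the Body
    using the layout selected by the Header value."""
    header, body = data[:3], data[3:20]
    name, field_slice = SEGMENT_LAYOUTS[header]
    return {"Header": header, name: body[field_slice]}

class TestHeaderDispatch(unittest.TestCase):
    def test_should_use_abc_layout_when_header_is_abc(self):
        row = parse_record("ABC" + "abc body".ljust(17))
        self.assertEqual("abc body".ljust(17), row["Field-ABC"])
    def test_should_use_def_layout_when_header_is_def(self):
        row = parse_record("DEF" + "def body".ljust(17))
        self.assertEqual("def body".ljust(17), row["Field-DEF"])

if __name__ == "__main__":
    unittest.main()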

What Actually Happened

Since the application already had 178 unit tests, there was plenty of structure that worked.

The single new unit test relied on a notional API that wasn't really in place. The new test bombed grotesquely.

There are two solutions:

  • Modify the test.
  • Fix the notional API so that it works properly.

I started out chasing the second option. I tweaked some things. More tests failed. I tweaked some more things. The new test finally passed, but another test was failing.

Some careful study of the failing test revealed that my approach was wrong. Way wrong.

The notional API was a bad idea.

The tweaks to make it work were a worse idea.

Back to the Lab Bench

At this point, I had made enough changes that the only thing to do was copy the new test and revert the local changes in Git to unwind the awful mistakes.

Starting again, I had a slightly better grip on the relevant code. I had a failing test. I tried a different approach that wasn't quite so inventive. This meant modifying the test.

I actually went through a few iterations of the test, using the test method as a kind of lab bench.

A more Pythonic approach to the lab bench is to work from the >>> prompt. I think that all of the exemplary projects use the >>> prompt examples in their documentation. This is a way to narrow and clarify the API. As projects get big, they can sprawl. New features can wind up with many imports to pick and choose elements from existing modules.
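
For example, a >>> session that is comfortable to type also pastes straight into a docstring, where doctest turns it into a regression test. This is a hypothetical sketch, not the real project's API, just to show the shape:

def segment_name(record):
    """Return the segment selected by a record's 3-character header.

    >>> segment_name("ABC" + "x" * 17)
    'ABC-Segment'
    >>> segment_name("DEF" + "x" * 17)
    'DEF-Segment'
    """
    return record[:3] + "-Segment"

if __name__ == "__main__":
    import doctest
    doctest.testmod()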

When it becomes difficult to use the >>> prompt as the lab bench, that's a sign that the API is too complex. Refactoring must happen.

Using the unit test framework as the lab bench was a hint that something had drifted out of tolerance.

However. I did get a test which passed. Yay. Sort of.

The test code was hideous.

TDD and API Design

The point of TDD, however, is that we have a working suite of tests. Refactoring won't break anything.

The point was that the hideous API could be rewritten into something that both

  • Passed all the tests, and
  • Was usable at the >>> prompt.
It's difficult to express how valuable the Python >>> prompt is to help clarify API design issues.

The rule is this:

If the API doesn't make sense at the >>> prompt, it's incomprehensible.

Sadly, Java doesn't have this kind of boundary. Java programming can spin into quite complex API's, limited only by the laziness of the programmer who avoids refactoring.

Or the malice of the programmer's manager in not allowing time to refactor.

Thursday, June 14, 2012

IBM RAMAC Device: 5 MB

Check out this picture.  http://www.petapixel.com/2011/12/27/what-5mb-of-storage-looked-like-in-1956/

Random Reminiscing Follows

When I was in college (1974-1978) 64K of RAM was the size of a refrigerator.

By 1982, 64K of RAM was an Apple ][+ fully tricked out with the 16K expansion card.

I vaguely remember working with a "tower" device that was a 5MB disk drive.  Think Mac Pro case to hold a disk drive.

By 1985 or so, 128K of RAM was a Macintosh, and a 5MB disk drive was a big desktop console box.  Smaller than a tower.  Irritating because it took up so much real estate compared with the Mac itself, which was designed to take up an 8 x 11 space (not including keyboard or mouse).

Now 5MB is round-off error.

What's Important?

Has computing gotten that much better?

Old folks (like me) reminisce about running large companies on small computers.  Now, we can't even get our coffee in the morning without the staggering capabilities of a 32GB iPhone.

Old folks, however, are sometimes mired in bad waterfall development methods and don't understand the value of test-driven development.  While the hardware is amazing, the development tools and techniques have improved, also.

Thursday, June 7, 2012

Stingray Schema-Based File Reader

Just updated the Stingray Reader.  There was an egregious error (and a missing test case).  I fixed the error, but didn't add a test case to cover the problem.

It's simple laziness.  TDD is quite clear on how to tackle this kind of thing.  Write the missing test case (which will now fail).  Then make the code change.

But the code change was so simple.

Thursday, February 17, 2011

TDD -- From SME Spreadsheet to TestCase to Code

In "Unit Test Case, Subject Matter Experts and Requirements" I suggested that it's often pretty easy to get a spreadsheet of full-worked out examples from subject-matter experts. Indeed, if your following TDD, that spreadsheet of examples is solid gold.

Let's consider something relatively simple. Let's say we're working on some fancy calculations. Our users explain until they're blue in the face. We take careful notes. We think we understand. To confirm, we ask for a simple spreadsheet with inputs and outputs.

We get something like the following. The latitudes and longitudes are inputs. The ranges and bearings are outputs. [The math can be seen at "Calculate distance, bearing and more between Latitude/Longitude points".]

Latitude 1 | Longitude 1 | Latitude 2 | Longitude 2 | range   | bearing
50 21 50N  | 004 09 25W  | 42 21 04N  | 071 02 27W  | 2805 nm | 260 07 38

Only it has a few more rows with different examples. Equator Crossing. Prime Meridian Crossing. All the usual suspects.

TDD Means Making Test Cases

Step one, then, is to parse the spreadsheet full of examples and create some domain-specific examples. Since it's far, far easier to work with .CSV files, we'll presume that we can save the carefully-crafted spreadsheet as a simple .CSV with the columns shown above.

Step two will be to create working Python code from the domain-specific examples.

The creation of test cases is a matter of building some intermediate representation out of the spreadsheet. This is where plenty of parsing and obscure special-case data handling may be necessary.

from __future__ import division
import csv
from collections import namedtuple
import re

latlon_pat= re.compile(r"(\d+)\s+(\d+)\s+(\d+)([NSWE])")
def latlon( txt ):
    """Parse 'dd mm ssH' into (decimal degrees, hemisphere)."""
    match= latlon_pat.match( txt )
    d, m, s, h = match.groups()
    return float(d)+float(m)/60+float(s)/3600, h

angle_pat= re.compile(r"(\d+)\s+(\d+)\s+(\d+)")
def angle( txt ):
    """Parse 'ddd mm ss' into decimal degrees."""
    match= angle_pat.match( txt )
    d, m, s = match.groups()
    return float(d)+float(m)/60+float(s)/3600

range_pat= re.compile(r"(\d+)\s*(\D+)")
def range( txt ):
    """Parse a distance like '2805 nm' into (value, units)."""
    match= range_pat.match( txt )
    d, units = match.groups()
    return float(d), units

RangeBearing= namedtuple("RangeBearing","lat1,lon1,lat2,lon2,rng,brg")

def test_iter( filename="sample_data.csv" ):
    """Yield a RangeBearing for each row of the SME's spreadsheet."""
    with open(filename,"r") as source:
        rdr= csv.DictReader( source )
        for row in rdr:
            print row
            tc= RangeBearing(
                latlon(row['Latitude 1']), latlon(row['Longitude 1']),
                latlon(row['Latitude 2']), latlon(row['Longitude 2']),
                range(row['range']),
                angle(row['bearing'])
            )
            yield tc

for tc in test_iter():
    print tc

This is long, but it handles a lot of the formatting vagaries that users are prone to.

From Abstract to TestCase

Once we have a generator that builds test cases as abstract examples, generating code for Java or Python or anything else is just a little template-fu.

   
from string import Template

testcase= Template("""
class Test_${name}( unittest.TestCase ):
    def setUp( self ):
        self.p1= LatLon( lat=GlobeAngle(*$lat1), lon=GlobeAngle(*$lon1) )
        self.p2= LatLon( lat=GlobeAngle(*$lat2), lon=GlobeAngle(*$lon2) )
    def test_should_compute( self ):
        d, brg = range_bearing( self.p1, self.p2, R=$units )
        self.assertEquals( $dist, int(d) )
        self.assertEquals( $brg, map(int,map(round,brg.deg)) )
""")

for name, tc in enumerate( test_iter() ):
    units= tc.rng[1].upper()
    dist= tc.rng[0]
    code= testcase.substitute( name=name, dist=dist, units=units, **tc._asdict() )
    print code

This shows a simple template with values filled in. Often, we have to generate a hair more than this. A few imports and a "unittest.main()" are usually sufficient to transform a spreadsheet into unit tests that we can confidently use for test-driven development.
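
To make that concrete, the extra wrapping might look like the following sketch. It relies on the test_iter and testcase definitions above; the navigation module name (and whatever constants the generated code needs, such as the units names) is an assumption for illustration.

preamble = """\
import unittest
from navigation import LatLon, GlobeAngle, range_bearing   # assumed module name
"""
postamble = """
if __name__ == "__main__":
    unittest.main()
"""

with open("test_range_bearing.py", "w") as target:
    target.write(preamble)
    for name, tc in enumerate(test_iter()):
        units = tc.rng[1].upper()
        dist = tc.rng[0]
        target.write(testcase.substitute(
            name=name, dist=dist, units=units, **tc._asdict()))
    target.write(postamble)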

Tuesday, February 8, 2011

Unit Test Case, Subject Matter Experts and Requirements

Here's a typical "I don't like TDD" question: the topic is "Does TDD really work for complex projects?"

Part of the question focused on the difficulty of preparing test cases that cover the requirements. In particular, there was some hand-wringing over conflicting and contradictory requirements.

Here's what's worked for me.

Preparation. The users provide the test cases as a spreadsheet showing the business rules. The columns are attributes of some business document or case. The rows are specific test cases. Users can (and often will) do this at the drop of a hat. Often complex, narrative requirements written by business analysts are based on such a spreadsheet.

This is remarkably easy for most users to produce. It's just a spreadsheet (or multiple spreadsheets) with concrete examples. It's often easier for users to make concrete examples than it is for them to write more general business rules.

Automated Test Case Construction

Here's what can easily happen next.

Write a Python script to parse the spreadsheet and extract the cases. There will be some ad-hoc rules, inconsistent test cases, small technical problems. The spreadsheets will be formatted poorly or inconsistently.

Once the cases are parsed, it's easy to then create a unittest.TestCase template of some kind. Use Jinja2 or even Python's string.Template class to rough out the template for the test case. The specifics get filled into the unit test template.

The outline of test case construction is something like this. Details vary with target language, test case design, and overall test case packaging approach.

t = SomeTemplate()
for case_dict in testCaseParser( "some.xls" ):
    code= t.render( **case_dict )
    with open( testcaseName(**case_dict), 'w' ) as result:
        result.write( code )

You now have a fully-populated tree of unit test classes, modules and packages built from the end-user source documents.

You have your tests. You can start doing TDD.

Scenarios

One of the earliest problems you'll have is test case spreadsheets that are broken. Wrong column titles, wrong formatting, something wrong. Go meet with the user or expert that built the spreadsheet and get the thing straightened out.

Perhaps there's some business subtlety to this. Or perhaps they're just careless. What's important is that the spreadsheets have to be parsed by simple scripts to create simple unit tests. If you can't arrive at a workable solution, you have Big Issues, and it's better to resolve them now than to press on to implementation with a user or SME who's uncooperative.

Another problem you'll have is that tests will be inconsistent. This will be confusing at first because you've got code that passed one test, and fails another test and you can't tell what the differences between the tests are. You have to go meet with the users or SME's and resolve what the issue is. Why are the tests inconsistent? Often, attributes are missing from the spreadsheet -- attributes they each assumed -- and attributes you didn't have explicitly written down anywhere. Other times there's confusion that needs to be resolved before any programming should begin.

The Big Payoff

When the tests all pass, you're ready for performance and final acceptance testing. Here's where TDD (and having the users own the test cases) pays out well.

Let's say we're running the final acceptance test cases and the users balk at some result. "Can't be right" they say.

What do we do?

Actually, almost nothing. Get the correct answer into a spreadsheet somewhere. The test cases were incomplete. This always happens. Outside TDD, it's called "requirements problem" or "scope creep" or something else. Inside TDD, it's called "test coverage" and some more test cases are required. Either way, test cases are always incomplete.

It may be that they're actually changing an earlier test case. Users get examples wrong, too. Either way (omission or error) we're just fixing the spreadsheets, regenerating the test cases, and starting up the TDD process with the revised suite of test cases.

Bug Fixing

Interestingly, a bug fix after production roll-out is no different from an acceptance test problem. Indeed it's no different from anything that's happened so far.

A user spots a bug. They report it. We ask for the concrete example that exemplifies the correct answer.

We regenerate the test cases from the spreadsheets and start doing development. 80% of the time, the new example is actually a change to an existing example. And since the users built the example spreadsheets with the test data, they can maintain those spreadsheets to clarify the bugs. 20% of the time it's a new requirement. Either way, the test cases are as complete and consistent as the users are capable of producing.

Thursday, June 24, 2010

TDD and Python

First, let me say that TDD rocks.

Few things are as much fun as (1) writing a test script for a feature, and then (2) debugging the feature incrementally until it passes the test. It's fun because a great deal of hand-wringing and over-thinking is taken off the table.

To paraphrase Obi-Wan Kenobi:

Use The Test, Luke.

The essence of TDD is a pleasant two-step process: write tests, write code.

However, leaving things at this simplistic level isn't appropriate.

Code Quality

Most folks describe TDD as a 3-step process. I like to call this "red-green-gold" (The Lithuanian Flag version of TDD.)
  1. Tests don't pass (red).
  2. Tests pass (green).
  3. Refactor the code until things look good (gold).
The point here is that once you have tests that pass, you can trivially engage in refactoring and other engineering tasks to improve the overall quality of the code. You can optimize or make it more readable or more reusable without breaking it.

Even this isn't quite right.

Test Quality

The issue with a too-simplistic view of TDD is that we walk a fine line.
  • Over-engineering the tests.
  • Under-engineering the tests.
We can -- trivially -- fall into the trap of wringing our hands over every potential nuance of our new piece of code. We can be stalled writing tests. Often we hear complaints from folks who fall into this trap. They spend too much time writing tests and indict all of TDD because they dove into details too early in the process.

We can -- equally easily -- fall into the trap of failing to write suitably robust tests for our software.

TDD is really a 3+1 step process.
  1. Write tests, which don't pass (Red).
  2. Write code until tests pass (Green).
  3. (a) Clean up code to improve quality features. (b) Expand tests to add an appropriate level of robustness.
The operative word here is "appropriate".

Costs and Benefits

Some modules -- because of risk or complexity or visibility -- require extensive testing. Some modules don't require this.

Interestingly, portability -- even in Python -- requires some care in testing. It turns out that MySQL and SQLite are not completely identical in their behavior.

Omitting an order-by in a query can "work by accident" in one database and fail in another. So we need appropriate testing to ferret out these RDBMS-specific issues. Until we have the appropriate level of testing we have an application that works in SQLite but fails in MySQL.
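
Here's a minimal sketch of the kind of test that ferrets out an ordering assumption. The table and rows are invented; the point is that the test asserts the order explicitly instead of trusting whatever the engine happens to return.

import sqlite3
import unittest

class TestExplicitOrdering(unittest.TestCase):
    def setUp(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE event (id INTEGER, name TEXT)")
        self.db.executemany(
            "INSERT INTO event VALUES (?, ?)",
            [(2, "second"), (1, "first"), (3, "third")],
        )

    def test_should_return_events_in_id_order(self):
        # Without the ORDER BY, SQLite may "work by accident"; MySQL may not.
        rows = self.db.execute(
            "SELECT id, name FROM event ORDER BY id").fetchall()
        self.assertEqual([1, 2, 3], [r[0] for r in rows])

if __name__ == "__main__":
    unittest.main()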

The initial gut reaction can sometimes be "TDD failed us".

But this isn't true. TDD actually helped us by (1) identifying code which passed on one platform and failed on another, and (2) leading us to beef up all tests which depend on ordering. Pleasantly, there aren't many.

Tuesday, April 27, 2010

Yet More Praise for Unit Tests

I can't say enough good things about TDD.

But I'll try.

Due to an epic failure to read the documentation (this, specifically) I couldn't get our RESTful web services to work in Apache.

The entire application system has pretty good test coverage. I use Python's unittest to do integration testing. A test module spins up a Django test server; each TestCase uses the RESTful API library to access the web servers through a variety of use cases.

However. This integration isn't done through Apache and mod_wsgi. It's done using Django's stand-alone testserver capability.

As I noted recently, Apache doesn't like to give up the HTTP Authorization header. So, the real deployment on our real servers didn't really work.

The Blame Game

At this point there are lots of things we can blame. Let's start blaming the process.
  1. TDD didn't help. By now it should be obvious that TDD is a complete waste of time because it didn't uncover this obvious integration issue. There's no justification for TDD.
  2. The Unit Testing framework didn't help. It's a completely blown unit. Unit testing is oversold as a technology.
  3. Reliance on "testing" is stupid. There's no point in even attempting to "test" software, since it still broke when we tried to deploy it. Testing simply doesn't uncover enough problems.
Clearly, we need a Bold New Process to solve and prevent problems like this.

Seriously

Search Stack Overflow for "Justification of TDD" or "ROI of Unit Testing" and those kinds of loaded questions and you'll find folks that are angry that software development is hard and TDD or Unit Testing or a slick IDE or a Dynamic Language or REST or SOAP or something didn't make software easy.

There is no Pixie Dust. You've been told. Stop searching for it. Software is hard. Unit Testing helps, but doesn't make it less hard.

Unit Testing to the Rescue

Our code coverage is -- at best -- middlin'. I don't have counts, nor do I actually care what the lines of code number is. Code coverage can devolve to numerosity. The method function coverage and use case coverage is more interesting. A "logic path coverage" might be helpful. But I'm sure our coverage is far from complete.

So there we were.
  1. Hundreds of unit tests pass.
  2. A suite of a half-dozen "integration" scripts (over a dozen TestCases) pass.
  3. Real Apache deployment fails because I couldn't figure out how to get mod_wsgi to pass the HTTP Authorization header. Even though it's clearly and simply documented. [I was busy focusing on Apache; mod_wsgi solves the problem handily.]
What I did was copy a page from AWS and put the digested authentication information in a query string. In one sense, this is a huge change to the API's -- it's visible. In another sense this is a minor tweak to the application.

The RESTful web services all rely on an authenticator object. The change amounted to a new subclass of this authenticator. Plus some refactoring to locate the digest in the query string. This is a tightly focused change in authentication and the client library. About two days of work to subclass and refactor the auth.rest module.
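
The shape of that change looks roughly like the sketch below. The class names are invented -- the post doesn't show the real auth.rest classes -- but the idea is a subclass that locates the digest in the query string instead of the Authorization header.

try:
    from urllib.parse import parse_qs    # Python 3
except ImportError:
    from urlparse import parse_qs        # Python 2

class Authenticator(object):
    """Hypothetical base class: pull the credentials out of the WSGI environ."""
    def credentials(self, environ):
        return environ.get("HTTP_AUTHORIZATION") or None

class QueryStringAuthenticator(Authenticator):
    """Locate the digest in the query string, AWS-style, for deployments
    where Apache/mod_wsgi won't pass the Authorization header through."""
    def credentials(self, environ):
        query = parse_qs(environ.get("QUERY_STRING", ""))
        values = query.get("_auth", [])    # "_auth" is an assumed parameter name
        return values[0] if values else None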

Success Factors

Because of TDD and a suite of unit tests, many things went really, really well.
  1. I could extend the test script for the auth.rest module to include the new authentication-via-query-string mechanism. Having tests that failed made it really easy to refactor and subclass until the tests passed. Then I could refactor some more to simplify the resulting modules.
  2. I could rerun the unittest suite, including the various "integration" tests (tests that had everything but Apache) to be sure everything still worked. Believe it or not, there were actual problems uncovered by this. Specifically, some tests didn't properly use the web services API library. The library had changed, but was mostly backwards compatible, so the tests had continued to work. The latest round of changes broke backwards compatibility, and some tests now failed.
  3. Despair did not set in. There were issues: sales folks were in total panic because the whole "house of cards" architecture had collapsed. A working test suite makes a compelling case that the application -- generally -- is still sound. We're just stumbling on an Apache deployment issue. In one sense, it's a "show stopper", but in another sense it's just a Visible But Minor (VBM™) hurdle.

Wednesday, October 21, 2009

Unit Test Naming [Updated]

Just stumbled across several blog postings on unit test naming.

Essentially the TestCase will name the fixture. That's pretty easy to understand.

The cool part is this: each test method is a two-part clause: condition_"should"_result or "when"_condition_"then"_result.


Or possibly "method_state_behavior".


What a handy way to organize test cases. Only took me four years to figure out how important this kind of thing is.
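
A hypothetical example of these patterns, just to make them concrete; the method bodies are elided because the names carry the specification.

import unittest

class TestRangeBearing(unittest.TestCase):      # the TestCase names the fixture

    # condition_"should"_result
    def test_identical_points_should_give_zero_range(self):
        pass

    # "when"_condition_"then"_result
    def test_when_points_share_a_meridian_then_bearing_is_due_north_or_south(self):
        pass

    # method_state_behavior
    def test_range_bearing_antipodal_points_returns_half_circumference(self):
        pass

if __name__ == "__main__":
    unittest.main()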

[Updated to follow moved links.]

Wednesday, October 14, 2009

Unit Testing in C

I haven't written new C code since the turn of the millennium. Since then it's been almost all Java and Python. Along with Java and Python come JUnit and Python's unittest module.

I've grown completely dependent on unit testing.

I'm looking at some C code, and I want a unit testing framework. For pure C, I can find things like CuTest and CUnit. The documentation makes them look kind of shabby. Until I remembered what a simplistic language C is. Considering what they're working with, they're actually very cool.

I found a helpful posting on C++ unit testing tools. It provided some insight into C++. But this application is pure C.

I'm interested in replacing the shell script in CuTest with a Python application that does the same basic job. That's -- perhaps -- a low-value add-on. Perhaps I should look at CUnit and stay away from replacing the CuTest shell script with something a bit easier to maintain.
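
For what it's worth, the Python replacement could be a small script along these lines. This is a sketch of the idea, not a drop-in for CuTest's shell script: the regex, the file-naming pattern, and the generated runner are simplified assumptions.

import re
import glob

# Match CuTest-style test functions: void TestSomething(CuTest *tc)
TEST_DEF = re.compile(r"^void\s+(Test\w+)\s*\(\s*CuTest\s*\*", re.MULTILINE)

def find_tests(pattern="*_test.c"):
    """Yield (filename, test function name) pairs from the C sources."""
    for filename in sorted(glob.glob(pattern)):
        with open(filename) as source:
            for name in TEST_DEF.findall(source.read()):
                yield filename, name

def make_runner(tests, target="AllTests.c"):
    """Write a simple CuTest suite builder that adds every discovered test."""
    with open(target, "w") as out:
        out.write('#include "CuTest.h"\n\n')
        for _, name in tests:
            out.write("void %s(CuTest *tc);\n" % name)
        out.write("\nCuSuite *MakeSuite(void) {\n")
        out.write("    CuSuite *suite = CuSuiteNew();\n")
        for _, name in tests:
            out.write("    SUITE_ADD_TEST(suite, %s);\n" % name)
        out.write("    return suite;\n}\n")

if __name__ == "__main__":
    make_runner(list(find_tests()))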

Monday, September 28, 2009

Duct Tape Programmers

See Joel On Software: The Duct Tape Programmer: he lauds the programmer who gets stuff done with "duct tape and WD-40".

Here's why: "Shipping is a feature. A really important feature. Your product must have it."

Dave Drake sent the link along with the following:

This "speaks of coding for the rest of us, who are not into building castles in the air, but getting the job done. Not that there is anything wrong with better design, cleaner APIs, well-defined modularity to ease the delegation of coding as well as post-delivery maintenance. But damn, I wish I had a nickel for every time I sat in a design meeting where we tried to do something the fancy way, and it broke in the middle of the development cycle, or testing, or even the builds, and always in the demos."

However

There is one set of quotes that falls somewhere on the continuum of wrong, misleading and flamebait.

"And unit tests are not critical. If there’s no unit test the customer isn’t going to complain about that."

This -- in my experience -- is wrong. For Joel or the author of the quote (Jamie Zawinski) this may be merely misleading because it was taken out of context.

It's absolutely false that customers won't complain about missing unit tests. When things don't work, customers complain. And one of the surest ways to make things actually work is to write unit tests.

I suppose that genius-level programmers don't need to test. The rest of us, however, need to write unit tests.

Unit Testing Dogma

On Stack Overflow there are some questions that illustrate the value of misinformation on unit testing. On one end, we have Zawinski (and others) who says that Unit Tests don't create enough value. On the other end we have questions that indicate the slavish adherence to some unit test process is essential.

See How to use TDD correctly to implement a numerical method? The author of the question seems to think that TDD means "decompose the problem into very small cases, write one test for each very small case, and then code for just that one case and no others." I don't know where this process came from, but it sounds like far too much work for the value created.

It's unfair to say that unit testing doesn't add value and claim that customers don't see the unit tests. They emphatically do see unit tests when they see software that works. Customers don't see unit tests in detail. They don't see dogmatic process-oriented software development.

When there are no tests, the customer sees shoddy quality. When the process (or the schedule) trumps the feature-set being delivered, the customer sees incomplete or low-quality deliverables.

Conclusion

The original blog post said -- clearly -- that gold-plated technology doesn't create any value.

The blog post also pulled out a quote that said -- incorrectly -- that unit tests don't create enough value.

Wednesday, July 1, 2009

Test-Driven Reverse Engineering and Perniciously Bad Code

I've done a fair amount of reverse engineering over the years.

In the early days, you went from code to specification to new code. It took forever and the problems you uncovered -- well -- they often derailed the project.

Recently, I used a TDD-like approach. Each piece of legacy code was turned into some Java code with some associated unit tests. Further, the users were able to cough up a canonical set of acceptance tests. These were turned into unit tests, and it wasn't too difficult to meet in the middle with plenty of testing for each piece of legacy conversion.

Given some subsequent experience, it turns out that user acceptance tests are essential to success in reverse engineering. Without user acceptance tests being provided up front, reverse engineering is a nightmare.

Mystery Code

Today's issue is legacy code that is -- frankly -- incompetently done. As a bonus, the user organization is a little vague on what it's supposed to do. They trust it, but they can't verify it. There are no official test cases.

The only explanation we can get is a demo. And because of the user's workload, we're only getting one of these. Limited to an hour. AFAIK, the only way we can test the conversion is to run it head-to-head with the legacy and take notes as the users complain about the differences.

There will be no easy way to create up-front acceptance tests to drive development. We'll have to take careful notes during the demo and transform the demo script into test results we can use.

Worse Still

What's worse is the incompetent coding. How bad can code be? Let me count the ways:
  1. Globals. Anyone who thinks a global is a legal programming construct needs to find a new career. A module that declares all the globals just compounds the horror. Everything is scopeless and could be used anywhere. There's no "interface" to anything, it's just a puddle of grey goo.
    • Using globals means functions have side-effects. They update global variables more-or-less spontaneously.
    • Using globals also means that all kinds of things may have hysteresis. You call it once, it does one thing. You call it again, it does something different.
  2. Random SQL. Anyone who thinks that SQL statements can be dropped in any random place in the application needs to find a new career. MVC is essential for segregating the SQL away from the View. View functions shouldn't query stuff that should have been part of the model; if they do, it means the model is incomplete -- and possibly in the wrong state. It also means that view functions are slow and possibly not strictly idempotent -- every time you refresh, a value in the view could diverge from the value in the "official" model.
  3. Copy-and-Paste coding. How hard can it be to put common code into a function? Apparently, it's nearly impossible. If you're copying and pasting common code, stop now. There's no excuse. It just raises the cost of maintenance and conversion through the roof.
  4. No Change Control. Or rather, the change control is to leave all previous versions of the code in place as comment blocks. For each line of real code, there are two lines of previous versions commented out. I don't care what it was; I want to know what it is. If you can't use SVN, or even VSS, you need to find another career.
There. I feel better. Back to trying to figure out what this application really does.

Monday, June 8, 2009

A "Don't Break the Build" Tip for Solo Python Developers

One of the Agile practices is Continuous Integration.  Fowler suggests that everyone commits every day.  Elssamadisy's book includes specific advice on why a daily check-in helps.

Some folks call this the "Don't Break the Build" practice.

But what does that mean for Python where there is no build?  And what does it mean for a solo developer where there aren't any consequences?

The No-Build Build

The C++, Java, C# folks all have a really important, multi-step daily build.  The code has to compile; it has to be packaged into JAR's (or DLL's or whatever).  Perhaps higher-level packages like WAR's or EAR's need to be built.  Then you can run unit tests.

We Python folks don't have anything between code and unit test -- there's no real packaging.  This makes the daily build practice seem a little silly.

However, the daily "commit and run all the tests" is perhaps more important in Python than it is in Java (or C++ or C#.)  Even without any actual build activity, the daily build is still an essential practice.

Things Go Wrong

In Python, you've got two fundamental things which a daily check-in will spot.
  1. Bugs.  All of the logic errors that a daily unit test will spot.
  2. Bad Refactoring.  This is more subtle.  Not all refactoring errors lead directly to a bug that you can detect.  Indeed, there is a significant refactoring problem that I fight with weekly.
No Sense of Commitment

Refactoring is central to Agile development.  It is inevitable that you realize that you've misnamed, misplaced or overused some module or package and need to either rename it or delete it.

In Python, you've got to use `grep` (or something similar) to check your application for a clean change in names.  And you've got to double-check by using SVN to delete or rename the module.

Adding a new module, however, is more subtle.  Adding a new module is easy and quick.  You write it, you use it, you unit test and you're good to go.

Except, of course, if you forget to check it into SVN.  If it's not in SVN, it will still pass all your local unit tests.  It's those "daily build" unit tests that will break on a missing module.

VM To The Rescue

Solo developers, of course, have trouble with the nightly build.   First, they can skip it.  Second, and more important for folks saddled with Windows, you don't often have a clean QA user separate from you, the developer.

A VM is a very, very nice thing to have.  You fire up VMWare (or similar player) and run your daily build in a separate machine.  For a solo developer, you can do the following:
  1. Make changes, unit test.
  2. Commit the changes.
  3. Fire up the VM.  Do an SVN UP.  Run the unit tests again.
When a Python app crashes and burns on the VM, 80% of the time it's a missing commit.  The rest of the time it's a configuration difference between development and QA.  A sketch of that QA re-run step follows.
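
Even the VM step can be a tiny script.  This is a sketch with assumed paths and test layout; the point is only that the QA pass runs an update and then the whole suite, the same way the daily build would.

import subprocess
import sys

def qa_build(workdir="/home/qa/project"):      # assumed checkout location
    # Bring the QA working copy up to date with whatever was committed.
    subprocess.check_call(["svn", "up"], cwd=workdir)
    # Run the whole test suite, exactly as the daily build would.
    return subprocess.call(
        [sys.executable, "-m", "unittest", "discover", "-s", "tests"],
        cwd=workdir)

if __name__ == "__main__":
    raise SystemExit(qa_build())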

Now you can -- confidently -- turn code over to a sysadmin, knowing that it actually will work.