Moved. See All new content goes to the new site. This is a legacy, and will likely be dropped five years after the last post in Jan 2023.

Thursday, June 30, 2011

Implementing the Unsubscribe User Story

I've been unsubscribing from some junk email recently.

The user story is simple: As a not-very-interested person, I want to get off your dumb-ass mailing list so that I don't have to flag your crap as spam any more.

The implementations vary from good to evil.  Here's what I've found.

The best sites have an unsubscribe link that simply presents the facts -- you are unsubscribed.  I almost feel like re-subscribing to a site that handles this use case so well.

The first level of crap is a site which forces me to click an OK or Unsubscribe button to confirm that I really want to unsubscribe and wasn't clicking the tiny little links at the end of the message randomly.

The deeper level of "marketing" crap is a form that allows me to "configure my subscription settings".  This is done by some marketing genius who wanted to "offer additional value" rather than simply do what I asked.  This is a hateful (but not yet evil) practice.  I don't want to "configure" my settings.  I want out.

The third-from-worst is a form in which I must enter my email address.  What?  I have several email aliases that redirect to a common mailbox.  I have to -- what? -- guess which of the aliases was used?  This is pernicious because I can make a spelling mistake and they can continue to send me dunning email.  This fill-in-the-blanks unsubscribe is simply evil because it gives them plausible deniability when the continue to send me email.  It's now my fault that I didn't spell my name correctly.

The next-to-worst is a "mailto:" link that jumps into my emailer.  I have to -- what? -- fill in the magic word "Complete" somewhere?  You're kidding, right?  This is so 1980's-vintage listserv that I'm hoping these companies can be sued because they failed to actually unsubscribe folks.  Again, this gives the spammer a legitimate excuse because I failed to do the arcane step properly.

The worst is no link at all.  Just instructions explaining that an email must be send with the magic word "Complete" or "Unsubscribe" in the subject or body.  Because I use aliases, this will probably not unsubscribe anything useful, but will only unsubscribe my outbound email address.  This is the worst kind of evil.  In a way, it meets the user story.  But only in a very, very oblique way.

Monday, June 27, 2011

Simplicity vs. Depth

During  chapter technical reviews, the question of technical depth has come up time and again.  Essentially, in every single chapter.

In the older Building Skills in Python book, there are a number of topics that feel "digressive" to the reviewer and editor.  Too much depth.

However, there are a number of Python tutorials, many of which are very shallow.  I'd like to find a way to retain the technical depth, without it feeling "digressive".

Choice 1.  Split each chapter into different "basic" and "advanced" sections.  This would retain a sensible outline of parts (Language Fundamentals, Data Structures, Classes, Modules and a bunch of advanced projects) and chapters within each part.  Some chapters would still have to be split because a number of "advanced" concepts (i.e. alternative function argument passing with * and **) really has to be delayed until after an appropriate data structure chapter.

Choice 2.  Separate material two kinds of chapters "basic" and "pro".  This would lead to a "basics" thread for n00bz (read all the "basics" chapters) and an "pro" thread for professionals where you'd just read all the chapters in order without skipping.    This would create some more chapters, but each chapter would be shorter and more focused.


Tuesday, June 21, 2011


Just started learning about "Hackerspace".

Without really knowing what I was doing, I fell into the 757 Labs Hackerspace.

The 757 Python Users' Group, specifically.

What a great idea.  Bright people.  Interested in the same area of technology.

It's like hanging around with sailors at a marina.

Thursday, June 9, 2011

An Object-Lesson in How to Stifle Innovation

Read this: How Ma Bell Shelved the Future for 60 Years.
AT&T firmly believed that the answering machine, and its magnetic tapes, would lead the public to abandon the telephone.
How many good ideas are set aside by managers who simply don't have a clue what users actually want?

How many great IT projects are rejected because of this kind of delusional paranoia?

Tuesday, June 7, 2011

Multithreading -- Fear, Uncertainty and Doubt

Read this: "How to explain why multi-threading is difficult".

We need to talk. This is not that difficult.

Multi-threading is only difficult if you do it badly. There are an almost infinite number of ways to do it badly. Many magazines and bloggers have decided that the multithreading hurdle is the Next Big Thing (NBT™). We need new, fancy, expensive language and library support for this and we need it right now.

Parallel Computing is the secret to following Moore's Law. All those extra cores will go unused if we can't write multithreaded apps. And we can't write multi-threaded apps because—well—there are lots of reasons, split between ignorance and arrogance. All of which can be solved by throwing money after tools. Right?


One thing that makes multi-threaded applications error-prone is simple arrogance. There are lots and lots of race conditions that can arise. And folks aren't trained to think about how simple it is to have a sequence of instructions interrupted at just the wrong spot. Any sequence of "read, work, update" operations will have threads doing reads (in any order), threads doing the work (in any order) and then doing the updates in the worst possible order.

Compound "read, work, update" sequences need locks. And the locations of the locks can be obscure because we rarely think twice about reading a variable. Setting a variable is a little less confusing. Because we don't think much about reads, we fail to see the consequences of moving the read of a variable around as part of an optimization effort.


The best kind of lock is not a mutex or a semaphore. It surely isn't an RDBMS (but God knows, numerous organizations have used an RDBMS as a large, slow, complex and expensive message queue.)

The best kind of lock seems to be a message queue. The various concurrent elements can simply dequeue pieces of data, do their tasks and enqueue the results. It's really elegant. It has many, simple, uncoupled pieces. It can be scaled by increasing the number of threads sharing a queue.

A queue (read with an official "get") means that the reads aren't casually ignored and moved around during optimization. Further, the creation of a complex object can be done by one thread which gets pieces of data from a queue shared by multiple writers. No locking on the complex object.

Using message queues means that there's no weird race condition when getting data to start doing useful work; a get is atomic and guaranteed to have that property. Each thread gets an thread-local, thread-safe object. There's no weird race condition when passing a result on to the next step in a pipeline. It's dropped into the queue, where it's available to another thread.

Dining Philosophers

The Dining Philosophers Code Kata has a queue-based solution that's pretty cool.

A queue of Forks can be shared by the various Philosopher threads. Each Philosopher must get two Fork resources from the queue, eat, philosophize and then enqueue the two Forks again. It's quite short, easy to write and easy to demonstrate that it must work.

Perhaps the hardest thing is designing the Dining Room (also know as the Waiter, Conductor or Footman) that only allows four of the five philosophers to dine concurrently. To do this, a departing Philosopher must enqueue themselves into a "done eating" queue so that the next waiting Philosopher can be seated.

A queue-based solution is delightfully simple. 200 or so lines of code including docstrings comments so that the documentation looked nice, too.

Additional Constraints

The simplest solution uses a single queue of anonymous Forks. A common constraint is to insist that each Philosopher use only the two adjacent forks. Philosopher p can use forks (p+1 mod 5) and (p-1 mod 5).

This is pleasant to implement. The Philosopher simply dequeues a fork, checks the position, and re-enqueues it if it's a wrong fork.

FUD Factor

I think that the publicity around parallel programming and multithreaded applications is designed to create Fear, Uncertainty and Doubt (FUD™).
  1. Too many questions on StackOverflow seem to indicate that a slow program might magically get faster if somehow threads where involved. For programs that involve scanning the entire hard drive or downloading Wikipedia or doing a giant SQL query, the number of threads has little relevance to the real work involved. These programs are I/O bound; since threads must share the I/O resources of the containing process, multi-threading won't help.
  2. Too many questions on StackOverflow seem to have simple message queue solutions. But folks seem to start out using inappropriate technology. Just learn how to use a message queue. Move on.
  3. Too many vendors of tools (or languages) are pandering to (or creating) the FUD factor. If programmers are made suitably fearful, uncertain or doubtful, they'll lobby for spending lots of money for a language or package that "solves" the problem.
Sigh. The answer isn't software tools, it's design. Break the problem down into independent parallel tasks and feed them from message queues. Collect the results in message queues.

Some Code

class Philosopher( threading.Thread ):
    """A Philosopher.  When invited to dine, they will
    cycle through their standard dining loop.
    -   Acquire two forks from the fork Queue
    -   Eat for a random interval
    -   Release the two forks
    -   Philosophize for a random interval
    When done, they will enqueue themselves with
    the "footman" to indicate that they are leaving.
    def __init__( self, name, cycles=None ):
        """Create this philosopher.
        :param name: the number of this philosopher.  
            This is used by a subclass to find the correct fork.
        :param cycles: the number of cycles they will eat.
            If unspecified, it's a random number, u, 4 <= u < 7
        super( Philosopher, self ).__init__() name
        self.cycles= cycles if cycles is not None else random.randrange(4,7)
        self.log= logging.getLogger( "{0}.{1}".format(self.__class__.__name__, name) ) "cycles={0:d}".format( self.cycles ) )
        self.forks= None
        self.leaving= None
    def enter( self, forks, leaving ):
        """Enter the dining room.  This must be done before the 
        thread can be started.
        :param forks: The queue of available forks
        :param leaving: A queue to notify the footman that they are
        self.forks= forks
        self.leaving= leaving
    def dine( self ):
        """The standard dining cycle: 
        acquire forks, eat, release forks, philosophize.
        for cycle in range(self.cycles):
            f1= self.acquire_fork()
            f2= self.acquire_fork()
            self.release_fork( f1 )
            self.release_fork( f2 )
        self.leaving.put( self )
    def eat( self ):
        """Eating task.""" "Eating" )
        time.sleep( random.random() )
    def philosophize( self ):
        """Philosophizing task.""" "Philosophizing" )
        time.sleep( random.random() )
    def acquire_fork( self ):
        """Acquire a fork.
        :returns: The Fork acquired.
        fork= self.forks.get()
        return fork
    def release_fork( self, fork ):
        """Acquire a fork.
        :param fork: The Fork to release.
        fork.held_by= None
        self.forks.put( fork )
    def run( self ):
        """Interface to Thread.  After the Philosopher
        has entered the dining room, they may engage
        in the main dining cycle.
        assert self.forks and self.leaving

The point is to have the dine method be a direct expression of the Philosopher's dining experience.  We might want to override the acquire_fork method to permit different fork acquisition strategies.

For example, a picky philosopher may only want to use the forks adjacent to their place at the table, rather than reaching across the table for the next available Fork.

The Fork, by comparison, is boring.

class Fork( object ):
    """A Fork.  A Philosopher requires two of these to eat."""
    def __init__( self, name ):
        """Create the Fork.
        :param name: The number of this fork.  This may 
            be used by a Philosopher looking for the correct Fork.
        """ name
        self.holder= None
        self.log= logging.getLogger( "{0}.{1}".format(self.__class__.__name__, name) )
    def held_by( self ):
        """The Philosopher currently holding this Fork."""
        return self.holder
    def held_by( self, philosopher ):
        if philosopher:
   "Acquired by {0}".format( philosopher ) )
   "Released by {0}".format( self.holder ) )
        self.holder= philosopher

The Table, however, is interesting.  It includes the special "leaving" queue that's not a proper part of the problem domain, but is a part of this particular solution.

class Table( object ):
    """The dining Table.  This uses a queue of Philosophers
    waiting to dine and a queue of forks.
    This sets Philosophers, allows them to dine and then
    cleans up after each one is finished dining.
    To prevent deadlock, there's a limit on the number
    of concurrent Philosophers allowed to dine.
    def __init__( self, philosophers, forks, limit=4 ):
        """Create the Table.
        :param philosophers: The queue of Philosophers waiting to dine.
        :param forks: The queue of available Forks.
        :param limit: A limit on the number of concurrently dining Philosophers.
        self.philosophers= philosophers
        self.forks= forks
        self.limit= limit
        self.leaving= Queue.Queue()
        self.log= logging.getLogger( "table" )
    def dinner( self ):
        """The essential dinner cycle:
        admit philosophers (to the stated limit);
        as philosophers finish dining, remove them and admit more;
        when the dining queue is empty, simply clean up.
        self.at_table= self.limit
        while not self.philosophers.empty():
            while self.at_table != 0:
                p= self.philosophers.get()
       p )
            # Must do a Queue.get() to wait for a resource
            p= self.leaving.get()
            self.excuse( p )
        assert self.philosophers.empty()
        while self.at_table != self.limit:
            p= self.leaving.get()
            self.excuse( p )
        assert self.at_table == self.limit
    def seat( self, philosopher ):
        """Seat a philosopher.  This increments the count 
        of currently-eating Philosophers.
        :param philosopher: The Philosopher to be seated.
        """ "Seating {0}".format( )
        philosopher.enter( self.forks, self.leaving)
        self.at_table -= 1 # Consume a seat
    def excuse( self, philosopher ):
        """Excuse a philosopher.  This decrements the count 
        of currently-eating Philosophers.
        :param philosopher: The Philosopher to be excused.
        philosopher.join() # Cleanup the thread "Excusing {0}".format( )
        self.at_table += 1 # Release a seat

The dinner method assures that all Philosophers eat until they are finished.  It also assures that four Philosophers sit at the table and when one finishes, another takes their place.  Finally, it also assures that all Philosophers are done eating before the dining room is closed.

Friday, June 3, 2011

Changed the Page Template

The "default" template I chose was too narrow for presenting code samples.  Changed it.