<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-684183198890094283</id><updated>2012-01-27T08:19:50.990-05:00</updated><category term='rst'/><category term='continuous integration'/><category term='meetup'/><category term='tools'/><category term='package'/><category term='SQL'/><category term='stingray reader'/><category term='API Design'/><category term='C'/><category term='texlive'/><category term='PL/SQL'/><category term='use case'/><category term='search optimization'/><category term='SQLite'/><category term='applet'/><category term='open source'/><category term='data warehouse'/><category term='hadoop'/><category term='database administration'/><category term='#SOMeetup'/><category term='HTTP'/><category term='stackoverflow'/><category term='module'/><category term='inheritance'/><category term='encryption'/><category term='DOM'/><category term='epydoc'/><category term='css'/><category term='defensive programming'/><category term='spam'/><category term='sales'/><category term='ORM'/><category term='procedural programming'/><category term='metaphone'/><category term='unicode'/><category term='like'/><category term='performance'/><category term='Apache'/><category term='integer'/><category term='xhtml'/><category term='beautiful soup'/><category term='humor'/><category term='#SODevDays'/><category term='interpreted'/><category term='xlsm'/><category term='xml'/><category term='conway&apos;s law'/><category term='software process improvement'/><category term='literate programming'/><category term='macintosh'/><category term='threads'/><category term='multiprocessing'/><category term='java'/><category term='refactoring'/><category 
term='disruption'/><category term='vmware'/><category term='security'/><category term='pyWeb'/><category term='knowledge capture'/><category term='UML'/><category term='data conversion'/><category term='F#'/><category term='assert statement'/><category term='algorithm'/><category term='jinja'/><category term='pdf'/><category term='aphorism'/><category term='Programming Languages'/><category term='COBOL'/><category term='bbedit'/><category term='database design'/><category term='Django'/><category term='innovation'/><category term='HTML'/><category term='hackerspace'/><category term='unit testing'/><category term='TeX'/><category term='regular expressions'/><category term='waterfall'/><category term='multithreaded'/><category term='architecture'/><category term='floating-point'/><category term='restructuredtext'/><category term='blogging'/><category term='dimensional data'/><category term='ide'/><category term='WebServices'/><category term='design patterns'/><category term='ETL'/><category term='dynamic'/><category term='apple'/><category term='code-kata'/><category term='ESB'/><category term='sphinx'/><category term='retail'/><category term='reverse engineering'/><category term='risk'/><category term='template'/><category term='complexity'/><category term='SOA'/><category term='delegation'/><category term='triggers'/><category term='xlsx'/><category term='ctypes'/><category term='star-schema'/><category term='COCOMO'/><category term='spreadsheet'/><category term='excel'/><category term='content management'/><category term='agile'/><category term='python'/><category term='analysis'/><category term='business rules'/><category term='python 3'/><category term='iWeb'/><category term='tdd'/><category term='building skills books'/><category term='polymorphism'/><category term='test-driven reverse engineering'/><category term='JUnit'/><category term='map-reduce'/><category term='csv'/><category term='scons'/><category term='learning'/><category term='markup'/><category 
term='noSQL'/><category term='docutils'/><category term='estimating'/><category term='OO design'/><category term='capacity planning'/><category term='CLI'/><category term='zipfile'/><category term='stored procedures'/><category term='cheetah'/><category term='REST'/><category term='mac os x'/><category term='spi'/><category term='RDBMS'/><category term='Fossil'/><category term='TCO'/><category term='VB'/><category term='configuration management'/><category term='queue'/><category term='C#'/><category term='PHP'/><category term='scrum'/><category term='xlrd'/><category term='anti-if'/><category term='soundex'/><category term='functional programming'/><category term='sgml'/><category term='project management'/><category term='iPad'/><category term='numerosity'/><category term='LaTeX'/><category term='scripted'/><title type='text'>S.Lott-Software Architect</title><subtitle type='html'>Rants on the daily grind of building software.</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><link rel='next' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default?start-index=101&amp;max-results=100'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><generator version='7.00' 
uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>245</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-5866564886486072962</id><published>2012-01-26T08:00:00.000-05:00</published><updated>2012-01-26T08:00:08.487-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='functional programming'/><category scheme='http://www.blogger.com/atom/ns#' term='analysis'/><category scheme='http://www.blogger.com/atom/ns#' term='python 3'/><category scheme='http://www.blogger.com/atom/ns#' term='Apache'/><title type='text'>Apache Log Parsing</title><content type='html'>How much do I love Python? &amp;nbsp;Consider this little snippet that parses Apache logs.&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;pre&gt;&lt;code&gt;&lt;br /&gt;import re&lt;br /&gt;from collections import defaultdict, namedtuple&lt;br /&gt;&lt;br /&gt;format_pat= re.compile( &lt;br /&gt;    r"(?P&amp;lt;host&amp;gt;[\d\.]+)\s" &lt;br /&gt;    r"(?P&amp;lt;identity&amp;gt;\S*)\s" &lt;br /&gt;    r"(?P&amp;lt;user&amp;gt;\S*)\s"&lt;br /&gt;    r"\[(?P&amp;lt;time&amp;gt;.*?)\]\s"&lt;br /&gt;    r'"(?P&amp;lt;request&amp;gt;.*?)"\s'&lt;br /&gt;    r"(?P&amp;lt;status&amp;gt;\d+)\s"&lt;br /&gt;    r"(?P&amp;lt;bytes&amp;gt;\S*)\s"&lt;br /&gt;    r'"(?P&amp;lt;referer&amp;gt;.*?)"\s' # [SIC]&lt;br /&gt;    r'"(?P&amp;lt;user_agent&amp;gt;.*?)"\s*' &lt;br /&gt;)&lt;br /&gt;&lt;br /&gt;Access = namedtuple('Access',&lt;br /&gt;    ['host', 'identity', 'user', 'time', 'request',&lt;br /&gt;    'status', 'bytes', 'referer', 'user_agent'] )&lt;br /&gt;&lt;br /&gt;def access_iter( source_iter ):&lt;br /&gt;    for log in source_iter:&lt;br /&gt;        for line in (l.rstrip() for l in log):&lt;br /&gt;            match= format_pat.match(line)&lt;br /&gt;            if match:&lt;br /&gt;                yield Access( **match.groupdict() 
)&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;That's about it. &amp;nbsp;The access log rows are now first-class Access-class objects that can be processed pleasantly by high-level Python applications.&lt;br /&gt;&lt;br /&gt;Cool things.&lt;br /&gt;&lt;ol&gt;&lt;li&gt;The adjacent string concatenation means that the regular expression can be broken up into bits to make it readable.&lt;/li&gt;&lt;li&gt;When the named tuple attributes match the regular expression names, we can trivially turn the &lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;match.groupdict()&lt;/span&gt; into a named tuple.&amp;nbsp;&lt;/li&gt;&lt;li&gt;By using a generator, the other parts of the application can simply loop through the results without tying up memory to create vast intermediate structures.&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;A couple of years back, a sysadmin was trying to justify spending money on a log analyzer product. &amp;nbsp;I suggested they (at the very least) get an open source log analyzer.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I also suggested that they learn Python and save themselves the pain of working with a (potentially) complex tool. 
&amp;nbsp;Given this as a common library module, log analysis applications are remarkably easy to write.&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-5866564886486072962?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/5866564886486072962/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2012/01/apache-log-parsing.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/5866564886486072962'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/5866564886486072962'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2012/01/apache-log-parsing.html' title='Apache Log Parsing'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-894709887996562185</id><published>2012-01-24T08:00:00.001-05:00</published><updated>2012-01-24T08:00:05.098-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='building skills books'/><title type='text'>Building Skills in Programming</title><content type='html'>I've revised (and streamlined) my &lt;i&gt;Building Skills in Programming&lt;/i&gt; book.&lt;br /&gt;&lt;br /&gt;The 2.6.2. edition will simply replace the 2.6.1. 
edition, leading to the possibility of broken bookmarks because of the changes.&lt;br /&gt;&lt;br /&gt;Currently, the non-programmer book accounts for under 10% of the hits on the &lt;a href="http://www.itmaybeaback.com/book"&gt;http://www.itmaybeaback.com/book&lt;/a&gt; site. &amp;nbsp;Consequently, I'm not &lt;i&gt;very&lt;/i&gt; worried about the breakage. &amp;nbsp;I know someone will hate me for messing with the content just as they were starting to understand it.&lt;br /&gt;&lt;br /&gt;I'm indebted to all my readers for the numerous suggestions, corrections and compliments that I've received. &amp;nbsp;In order to simplify the correction process, I've put the source onto SourceForge. &amp;nbsp;See the &lt;a href="http://sourceforge.net/projects/progbook-py26/" target="_blank"&gt;Programming Book-Python 2.6&lt;/a&gt; project.&lt;br /&gt;&lt;br /&gt;The next step will be to add a PayPal donations button. &lt;br /&gt;&lt;br /&gt;And... If I can get the PDF into really good shape, I may post it on &lt;a href="http://www.lulu.com/" target="_blank"&gt;Lulu&lt;/a&gt; for folks who really want a hardcopy.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Traffic&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;For a random 2-day period, the usage looks like this:&lt;br /&gt;&lt;br /&gt;246 distinct "users" (really IP addresses).&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;{'html only': 161,&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;('both', 'oodesign-java-2.1'): 9,&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;('both', 'oodesign-python-2.1'): 5,&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, 
monospace;"&gt;&amp;nbsp;('both', 'programming-2.6'): 3,&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;('both', 'python-2.6'): 7,&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;('pdf only', 'oodesign-java-2.1'): 21,&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;('pdf only', 'oodesign-python-2.1'): 14,&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;('pdf only', 'programming-2.6'): 11,&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;('pdf only', 'python-2.6'): 37}&lt;/span&gt;&lt;br /&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;br /&gt;The "both" is a count of users reading HTML as well as the identified PDF editions.&lt;br /&gt;For the "html only" and "both" users, there's a detailed list of particular books and sections. &amp;nbsp;Too large and boring to repeat here.&lt;br /&gt;&lt;br /&gt;One interesting part is this detail:&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;82.128.23.63 {'programming-2.6': 37, 'oodesign-java-2.1': 28, 'python-2.6': 37, 'oodesign-python-2.1': 24}&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;Apparently, Nigeria needs a &lt;b&gt;lot&lt;/b&gt; of copies of the PDF. &amp;nbsp;I think I might want to block them, because this can't be anything sensible except endless polling by some botnet script.&lt;br /&gt;&lt;br /&gt;Since we don't drop off cookies, we can't really identify user sessions. 
&amp;nbsp;Maybe in the future, I'll wrap the static content download with a simple WSGI application to drop off and collect cookies to track users instead of IP addresses.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-894709887996562185?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/894709887996562185/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2012/01/building-skills-in-programming.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/894709887996562185'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/894709887996562185'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2012/01/building-skills-in-programming.html' title='Building Skills in Programming'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-4409072006127914286</id><published>2012-01-19T08:00:00.000-05:00</published><updated>2012-01-19T08:00:09.522-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='csv'/><category scheme='http://www.blogger.com/atom/ns#' term='unicode'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>Python 2.7 CSV files with Unicode Characters</title><content type='html'>The &lt;span class="Apple-style-span" style="font-family: 
'Courier New', Courier, monospace;"&gt;csv&lt;/span&gt; module in Python 2.7 is more-or-less hard-wired to work with ASCII and only ASCII.&lt;br /&gt;&lt;br /&gt;Sadly, we're often confronted with CSV files that include Unicode characters. &amp;nbsp;There are numerous Stack Overflow questions on this topic. &amp;nbsp;&lt;a href="http://stackoverflow.com/search?q=python+csv+unicode"&gt;http://stackoverflow.com/search?q=python+csv+unicode&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;What to do? &amp;nbsp;Since &lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;csv&lt;/span&gt; is married to seeing ASCII/bytes, we must explicitly decode the column values.&lt;br /&gt;&lt;br /&gt;One solution is to wrap&amp;nbsp;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;csv.DictReader&lt;/span&gt;, something like the following. &amp;nbsp;We need to decode each individual column before attempting to do anything with the value.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;code&gt;import codecs&lt;br /&gt;import csv&lt;br /&gt;&lt;br /&gt;class UnicodeDictReader( object ):&lt;br /&gt;    def __init__( self, *args, **kw ):&lt;br /&gt;        self.encoding= kw.pop('encoding', 'mac_roman')&lt;br /&gt;        self.reader= csv.DictReader( *args, **kw )&lt;br /&gt;    def __iter__( self ):&lt;br /&gt;        decode= codecs.getdecoder( self.encoding )&lt;br /&gt;        for row in self.reader:&lt;br /&gt;            t= dict( (k,decode(row[k])[0]) for k in row )&lt;br /&gt;            yield t&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;This new object is an iterable which contains a DictReader.  We could subclass DictReader, also.  
&lt;br /&gt;&lt;br /&gt;The use case, then, becomes something simple like this.&lt;br /&gt;&lt;pre&gt;&lt;code&gt;with open("some.csv","rU") as source:&lt;br /&gt;    rdr= UnicodeDictReader( source )&lt;br /&gt;    for row in rdr:&lt;br /&gt;        pass # process the row here&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;We can now get Unicode characters from a CSV file.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-4409072006127914286?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/4409072006127914286/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2012/01/python-27-csv-files-with-unicode.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/4409072006127914286'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/4409072006127914286'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2012/01/python-27-csv-files-with-unicode.html' title='Python 2.7 CSV files with Unicode Characters'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-4317451399314769113</id><published>2012-01-17T08:00:00.000-05:00</published><updated>2012-01-17T08:00:05.590-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='csv'/><category scheme='http://www.blogger.com/atom/ns#' 
term='python'/><title type='text'>Python 3.2 CSV Module -- Very, very nice</title><content type='html'>A common (and small) task is reformatting a file that's in some variant of CSV. &amp;nbsp;It could be a SQL database extract, or an export from an application that works well with CSV files.&lt;br /&gt;&lt;br /&gt;In Python 2.x, a CSV file with Unicode was a bit of a problem. &amp;nbsp;The CSV module isn't happy with Unicode. &amp;nbsp;The documentation is quite clear that many files need to be opened with a mode of &lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;'rb'&lt;/span&gt; to correctly handle Windows line-endings.&lt;br /&gt;&lt;br /&gt;Because of this, a CSV file with Unicode required using an explicit decoder on the individual columns (not the line as a whole!)&lt;br /&gt;&lt;br /&gt;But with Python 3.2, that's all behind us.&lt;br /&gt;&lt;br /&gt;Here's something I did recently. &amp;nbsp;The file has six columns that are relevant. &amp;nbsp;One of them (the "NOTE" column) has a big block of text with details buried inside using a kind of RST markup. 
&amp;nbsp;The data might be three lines with a value like this "&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;words words\n:budget: 1500\nwords words&lt;/span&gt;".&lt;br /&gt;&lt;br /&gt;The file is UTF-8, and the words have non-ASCII unicode characters randomly through it.&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;pre&gt;&lt;code&gt; &lt;br /&gt;def details( source ):&lt;br /&gt;    relevant = ( "TASK", "FOLDER", "CONTEXT", "PRIORITY", "STAR", )&lt;br /&gt;    parse= "NOTE"&lt;br /&gt;    data_pat= re.compile( r"^:(\w+):\s*(.*)\s*$" )&lt;br /&gt;    rdr= csv.DictReader( source )&lt;br /&gt;    for row in rdr:&lt;br /&gt;        txt= row[parse]&lt;br /&gt;        lines= ( data_pat.match(l) for l in txt.splitlines() )&lt;br /&gt;        matches= ( m.groups() for m in lines if m )&lt;br /&gt;        result= dict( (k, row[k]) for k in relevant) &lt;br /&gt;        result.update( dict(matches) )&lt;br /&gt;        yield result&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;How much do I love Python?  Let me count the ways.&lt;br /&gt;&lt;ol&gt;&lt;li&gt;The assignment of &lt;i&gt;lines&lt;/i&gt;&amp;nbsp;on line 8 was fun. &amp;nbsp;The "NOTE" column, in &lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;row[parse]&lt;/span&gt;, contains the extra fields. &amp;nbsp;They'll be on a separate line with the &lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;:word:value&lt;/span&gt; format as shown in the &lt;i&gt;data_pat&lt;/i&gt;&amp;nbsp;pattern. &amp;nbsp;We create a generator which will split the text field into lines and apply the pattern to each line.&lt;/li&gt;&lt;li&gt;The assignment to &amp;nbsp;&lt;i&gt;matches&lt;/i&gt;&amp;nbsp;on line 9 was equally fun. 
&amp;nbsp;If the &lt;i&gt;lines&lt;/i&gt;&amp;nbsp;generator produced a match object, the &lt;i&gt;matches&lt;/i&gt;&amp;nbsp;generator will gather the two groups from the line.&lt;/li&gt;&lt;li&gt;The assignment to &lt;i&gt;result&lt;/i&gt;&amp;nbsp;creates a dictionary from the relevant columns. &amp;nbsp;&lt;/li&gt;&lt;li&gt;The second assignment to &lt;i&gt;result&lt;/i&gt;&amp;nbsp;updates this dictionary with data parsed out of the "NOTE" column.&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;That makes it quite pleasant (and fast) to process an extract file, reformatting a "big blob of text" into individual columns.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The rest of the app boils down to this.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;pre&gt;&lt;code&gt;import csv&lt;br /&gt;import io&lt;br /&gt;import sys&lt;br /&gt;&lt;br /&gt;def rewrite( input, target=sys.stdout ):&lt;br /&gt;    with io.open(input, 'r', encoding='UTF-8') as source:&lt;br /&gt;        data= list( details( source ) )&lt;br /&gt;    headers= set( k for row in data for k in row  )&lt;br /&gt;    wtr= csv.DictWriter( target, sorted(headers) )&lt;br /&gt;    wtr.writeheader( )&lt;br /&gt;    wtr.writerows( data )&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;This gathers the raw data into a big old sequence in memory, and then writes that big old sequence back out to a file. &amp;nbsp;If we knew the headers buried in the "NOTE" field, we could do the entire thing in a single pass just using generators.&lt;br /&gt;&lt;br /&gt;We have to explicitly provide the encoding because the file was created via a download and the encoding isn't properly set on the client machine. &amp;nbsp;The important thing is that we &lt;i&gt;can&lt;/i&gt;&amp;nbsp;do this when it's necessary. 
&amp;nbsp;And we no longer have to explicitly decode fields.&lt;br /&gt;&lt;br /&gt;Since we don't know the headers in the "NOTE" field, we're forced to create the &lt;i&gt;headers&lt;/i&gt;&amp;nbsp;set by examining each row dictionary for its keys.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-4317451399314769113?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/4317451399314769113/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2012/01/python-32-csv-module-very-very-nice.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/4317451399314769113'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/4317451399314769113'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2012/01/python-32-csv-module-very-very-nice.html' title='Python 3.2 CSV Module -- Very, very nice'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-8850433145632125947</id><published>2012-01-12T09:18:00.000-05:00</published><updated>2012-01-18T05:48:05.210-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='multiprocessing'/><title type='text'>Multiprocessing and Shared Objects [Revised]</title><content type='html'>Read this: 
&lt;a href="http://eli.thegreenplace.net/2012/01/04/shared-counter-with-pythons-multiprocessing/" target="_blank"&gt;Shared Counter with Python Multiprocessing&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Brilliant. &amp;nbsp;Thank you for this.&lt;br /&gt;&lt;br /&gt;Too many of the questions on StackOverflow that include multi-threading are better approached as multi-processing. &amp;nbsp;In Linux, there are times when all threads of a single process are stopped while the process (as a whole) waits for system services to complete. &amp;nbsp;It's a consequence of the way select and poll work. &amp;nbsp;An example of the kind of sophisticated design required to avoid this can be found &lt;a href="http://www.kircher-schwanninger.de/michael/publications/lf.pdf" target="_blank"&gt;here&lt;/a&gt;. &amp;nbsp;Most I/O-intensive applications should be done via multi-processing, not multi-threading.&lt;br /&gt;&lt;br /&gt;And. &amp;nbsp;The kind of shared objects that multi-threading allows are often rare and require locks.&lt;br /&gt;&lt;br /&gt;So, simplify your life. &amp;nbsp;When you hear about "threads", replace the word with "processes" and move on. &amp;nbsp;The implementation will be much nicer.&lt;br /&gt;&lt;br /&gt;The standard gripe is that process creation is so expensive, and thread creation is relatively cheap. &amp;nbsp;All true. 
&amp;nbsp;That's why folks use process pools: to amortize the creation cost over a long period of operation.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-8850433145632125947?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/8850433145632125947/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2012/01/multiprocessing-and-shared-objects.html#comment-form' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/8850433145632125947'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/8850433145632125947'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2012/01/multiprocessing-and-shared-objects.html' title='Multiprocessing and Shared Objects [Revised]'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-8143257266816057911</id><published>2012-01-10T08:00:00.000-05:00</published><updated>2012-01-10T08:00:14.902-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='software process improvement'/><category scheme='http://www.blogger.com/atom/ns#' term='project management'/><title type='text'>Innovation is the punctuation at the end of a string of failures</title><content type='html'>Read this in Forbes: "&lt;a 
href="http://www.forbes.com/sites/work-in-progress/2011/05/05/innovations-return-on-failure-rof/" target="_blank"&gt;Innovation's Return on Failure: ROF&lt;/a&gt;".&lt;br /&gt;&lt;br /&gt;Also, this: "&lt;a href="http://blogs.hbr.org/berkun/2008/10/the-necessity-of-failure-in-in.html" target="_blank"&gt;The Necessity of Failure in Innovation (+ more on CDOs)&lt;/a&gt;".&lt;br /&gt;&lt;br /&gt;This, too: "&lt;a href="http://www.scottberkun.com/blog/2006/why-innovation-efforts-fail/" target="_blank"&gt;Why innovation efforts fail&lt;/a&gt;".&lt;br /&gt;&lt;br /&gt;While we're at it: "&lt;a href="http://commscopeblogs.com/2011/12/21/accepting-failure-is-key-to-good-overall-returns-on-high-risk-development-programs/" target="_blank"&gt;Accepting Failure is Key to Good Overall Returns on High-Risk Development Programs&lt;/a&gt;".&lt;br /&gt;&lt;br /&gt;I can't say enough about the value of "failure". &amp;nbsp;The big issue here is the label.&lt;br /&gt;&lt;br /&gt;A project with a grand scope and strategic vision gets changed to make it smaller and more focused. &amp;nbsp;Did it "fail" to deliver on the original requirements? &amp;nbsp;Or did someone learn that the original grand scope was wrong?&lt;br /&gt;&lt;br /&gt;A project that changes isn't failure. &amp;nbsp;It's just lessons learned. &amp;nbsp;Canceling, re-scoping, de-scoping, and otherwise modifying a project is what innovation looks like. &amp;nbsp;It should not be counted as a "failure". 
&lt;br /&gt;&lt;br /&gt;A project "&lt;a href="http://en.wikipedia.org/wiki/Death_march_(project_management)" target="_blank"&gt;Death March&lt;/a&gt;" occurs because failure is not an option, and change will be labeled as failure.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-8143257266816057911?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/8143257266816057911/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2012/01/innovation-is-punctuation-at-end-of.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/8143257266816057911'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/8143257266816057911'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2012/01/innovation-is-punctuation-at-end-of.html' title='Innovation is the punctuation at the end of a string of failures'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-5215824677425479405</id><published>2012-01-05T08:00:00.000-05:00</published><updated>2012-01-05T08:00:06.569-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='software process improvement'/><title type='text'>Self-Direction, Mastery, Purpose</title><content type='html'>Watch this:&amp;nbsp;&lt;span class="Apple-style-span" style="font-family: 
'lucida grande', tahoma, verdana, arial, sans-serif; font-size: 11px; line-height: 14px;"&gt;&lt;a href="http://www.youtube.com/watch?v=u6XAPnuFjJc" rel="nofollow nofollow" style="color: #3b5998; cursor: pointer; text-decoration: underline;" target="_blank"&gt;&lt;span&gt;http://www.youtube.com/&lt;/span&gt;&lt;wbr&gt;&lt;/wbr&gt;&lt;span class="word_break" style="display: inline-block;"&gt;&lt;/span&gt;watch?v=u6XAPnuFjJc&lt;/a&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Brilliant summary of what &lt;b&gt;really&lt;/b&gt; motivates people.&lt;br /&gt;&lt;br /&gt;The most important advice: provide a sense of purpose and get out of people's way so that they can do the right thing.&lt;br /&gt;&lt;br /&gt;Micromanagement, incentives, annual performance reviews and the like aren't as useful as providing the sense of purpose, the opportunity for mastery and the freedom of self-direction.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-5215824677425479405?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/5215824677425479405/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2012/01/self-direction-mastery-purpose.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/5215824677425479405'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/5215824677425479405'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2012/01/self-direction-mastery-purpose.html' title='Self-Direction, Mastery, Purpose'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-3819624562624360978</id><published>2012-01-03T08:00:00.000-05:00</published><updated>2012-01-03T08:00:02.401-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='waterfall'/><category scheme='http://www.blogger.com/atom/ns#' term='agile'/><category scheme='http://www.blogger.com/atom/ns#' term='software process improvement'/><title type='text'>Epic indictment of Waterfall Methods</title><content type='html'>Saw this recently.&lt;br /&gt;&lt;blockquote class="twitter-tweet"&gt;Why do most gov websites look like they were created by someone's 10 year old nephew yet cost millions to make?&lt;br /&gt;— Steve Dekorte (@stevedekorte) &lt;a data-datetime="2011-12-29T02:05:06+00:00" href="https://twitter.com/stevedekorte/status/152208488716701696"&gt;December 29, 2011&lt;/a&gt;&lt;/blockquote&gt;&lt;script charset="utf-8" src="//platform.twitter.com/widgets.js"&gt;&lt;/script&gt;&lt;br /&gt;I think this aptly summarizes the results of a waterfall methodology.&lt;br /&gt;&lt;ol&gt;&lt;li&gt;You wrote a lot of requirements, not fully understanding the actors or their use cases.&lt;/li&gt;&lt;li&gt;Your vendor implemented those requirements because they were contractual obligations not because they created value for the actors.&lt;/li&gt;&lt;/ol&gt;The government's not the only offender.&amp;nbsp; They're just more visible and more bound up in a legally-mandated purchasing cycle that makes the waterfall desirable and more Agile methods undesirable. 
&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-3819624562624360978?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/3819624562624360978/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2012/01/epic-indictment-of-waterfall-methods.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/3819624562624360978'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/3819624562624360978'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2012/01/epic-indictment-of-waterfall-methods.html' title='Epic indictment of Waterfall Methods'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-5463640826337493501</id><published>2011-12-29T08:00:00.000-05:00</published><updated>2011-12-29T08:00:00.592-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='HTTP'/><category scheme='http://www.blogger.com/atom/ns#' term='REST'/><category scheme='http://www.blogger.com/atom/ns#' term='security'/><title type='text'>LANGSEC: Language-theoretic Security</title><content type='html'>Wow. &amp;nbsp;Just wow. 
&amp;nbsp;See "&lt;a href="http://www.cs.dartmouth.edu/~sergey/langsec/occupy/" target="_blank"&gt;LANGSEC explained in a few slogans&lt;/a&gt;".&lt;br /&gt;&lt;br /&gt;Short, easy-to-grasp explanation of why complex protocols create new problems. &lt;br /&gt;&lt;br /&gt;I'm happy with REST and the stack of stuff under it (HTTP, TCP/IP, etc.)&lt;br /&gt;&lt;br /&gt;Once upon a time (2001), I invented my own version of a RESTful protocol outside HTTP. &amp;nbsp;That was cool. &amp;nbsp;Very simple, and very fast. &amp;nbsp;But relatively inflexible. &amp;nbsp;The syntax was more like FTP and SMTP; the semantics were mostly just CRUD rules and RESTful state transfers. &lt;br /&gt;&lt;br /&gt;I was way too dumb to leverage HTTP methods and the genius of a URI.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-5463640826337493501?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/5463640826337493501/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/12/langsec-language-theoretic-security.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/5463640826337493501'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/5463640826337493501'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/12/langsec-language-theoretic-security.html' title='LANGSEC: Language-theoretic Security'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-2409424030454707055</id><published>2011-12-27T08:00:00.000-05:00</published><updated>2011-12-27T08:00:11.343-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='software process improvement'/><title type='text'>Technology Refresh</title><content type='html'>&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;I've been refurbishing an older project -- written in 2008. &amp;nbsp;Probably with Django 1.0.1. &amp;nbsp;Certainly with Python 2.5.&lt;/div&gt;&lt;br /&gt;The &lt;a href="https://docs.djangoproject.com/en/1.3/releases/#release" target="_blank"&gt;Django 1.3 release&lt;/a&gt; has been around since March. &amp;nbsp;The change&amp;nbsp;underscored the importance of technology refresh.&lt;br /&gt;&lt;br /&gt;The best part was deleting code. &amp;nbsp;There were two significant reasons.&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;The &lt;a href="https://docs.djangoproject.com/en/1.3/ref/django-admin/#testserver-fixture-fixture" target="_blank"&gt;testserver command&lt;/a&gt;&amp;nbsp;allowed me to eliminate a bunch of low-value test harness code. &amp;nbsp; Without this command, we had to create our own test database, start a server, run integration tests, and then kill the server. &amp;nbsp;With this command, we simply start and kill the server.&lt;/li&gt;&lt;li&gt;The RESTful web services can be securely integrated into the main web application. &amp;nbsp;A simple piece of middleware can authenticate requests based on headers containing &lt;a href="http://forgerock.com/openam.html" target="_blank"&gt;ForgeRock OpenAM&lt;/a&gt; tokens. 
&amp;nbsp;It may be that this was &lt;i&gt;always&lt;/i&gt; a feature of Django, but over the last few years, we've figured out how to exploit it with simple middleware.&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;Few things are better than removing old code and replacing it with code written (and tested) by someone else.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In addition to the deletes, we also rearranged some of the dependencies. &amp;nbsp;We had (incorrectly) thought of the Django project as somehow central or essential. &amp;nbsp;It turns out that a bunch of other Python libraries were actually core to the application. &amp;nbsp;The Django web presentation was just one of the sensible use cases. &amp;nbsp;A suite of command-line apps could also be built around the underlying libraries.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In addition to this cleanup, we also replaced the documentation with a new Sphinx project. &amp;nbsp;The project originally used &lt;a href="http://epydoc.sourceforge.net/" target="_blank"&gt;Epydoc&lt;/a&gt; markup. &amp;nbsp;This meant that every single docstring had to be rewritten to use RST markup. &amp;nbsp;The upside of this is that we corrected numerous errors.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;There Was Pain&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This wasn't without some pain. &amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Was the cost worth the effort? &amp;nbsp;That's the real question here.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I think that many IT managers adopt a silly "If it ain't broke, don't fix it" policy that focuses on short-term cost and short-term value. &amp;nbsp;It ignores long-term accrual from even tiny short-term cost savings.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here are two important lessons. 
&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;Money saved today is saved forever.&lt;/li&gt;&lt;li&gt;Savings accrue. &amp;nbsp;Forever.&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;It's important to avoid short-term thinking about cost and benefit.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-2409424030454707055?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/2409424030454707055/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/12/technology-refresh.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/2409424030454707055'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/2409424030454707055'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/12/technology-refresh.html' title='Technology Refresh'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-7782133663327178487</id><published>2011-12-20T08:00:00.000-05:00</published><updated>2011-12-20T08:00:11.420-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='HTML'/><category scheme='http://www.blogger.com/atom/ns#' term='css'/><title type='text'>Color Schemes</title><content type='html'>I worked with this a few 
years ago to tweak up some web pages.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://colorschemedesigner.com/"&gt;http://colorschemedesigner.com/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I just rediscovered it.&amp;nbsp; It's a cool toy.&amp;nbsp; You get some colors that all "go" together.&amp;nbsp; If you're careful with your CSS definitions, you can give people this page and let them fuss around until they're positively silly with color palettes.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-7782133663327178487?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/7782133663327178487/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/12/color-schemes.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/7782133663327178487'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/7782133663327178487'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/12/color-schemes.html' title='Color Schemes'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-6187160302792979844</id><published>2011-12-15T08:00:00.000-05:00</published><updated>2011-12-15T08:00:13.207-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='security'/><title type='text'>Good Summary of Bad Security 
Assumptions</title><content type='html'>This isn't the &lt;a href="https://www.owasp.org/index.php/Category:OWASP_Top_Ten_Project" target="_blank"&gt;OWASP Top 10 &lt;/a&gt;list, but it's still very handy.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.eweek.com/c/a/Security/Top-10-Dumb-Computer-Security-Notions-and-Myths-740587/" target="_blank"&gt;Top 10 Dumb Computer Security Notions&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;I'm particularly fond of the "security can't be perfect; since it can't be perfect, why bother?" approach.&lt;br /&gt;&lt;br /&gt;One other notion that amuses me is the silliness of changing a password every 90 days.&amp;nbsp; The argument is that "it's harder to hit a moving target".&amp;nbsp; That's obviously false.&amp;nbsp; A good rainbow table and a bad password without salt can be broken in about half an hour.&amp;nbsp; There's no "moving target" here.&amp;nbsp; At 30 minutes to crack a password, the only way the target can appear to move is making every password a 1-time-only password based on some kind of external source (like a token generator.)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-6187160302792979844?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/6187160302792979844/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/12/good-summary-of-bad-security.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6187160302792979844'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6187160302792979844'/><link rel='alternate' 
type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/12/good-summary-of-bad-security.html' title='Good Summary of Bad Security Assumptions'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-3528332916665988431</id><published>2011-12-13T08:00:00.000-05:00</published><updated>2011-12-13T08:00:02.883-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='unit testing'/><category scheme='http://www.blogger.com/atom/ns#' term='architecture'/><title type='text'>The need for ping</title><content type='html'>Years ago, when designing an interface to a vendor's web services, I did the following. &amp;nbsp;This isn't a genius move, but it's worth emphasizing how important it is. &amp;nbsp;And what's most important isn't technical.&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;I built a simple &lt;a href="http://c2.com/cgi/wiki?SpikeSolution" target="_blank"&gt;spike solution&lt;/a&gt; to access their service.&lt;/li&gt;&lt;li&gt;I morphed this into a "sanity check" to be sure that their service really was working. &amp;nbsp;Mostly, I cleaned up the code so that it was testable and deliverable without embarrassment.&lt;/li&gt;&lt;li&gt;I morphed this into a "diagnostic tool" to bypass the higher levels of the application and simply access the vendor (and optionally dump the results) to help determine what wasn't working. &amp;nbsp;This involved adding the dump option to the sanity check and renaming the command-line application.&lt;/li&gt;&lt;li&gt;I morphed this into a "credentials check and diagnostic tool". &amp;nbsp;This was -- ahem -- merely taking the hard-wired credentials out of the application. &amp;nbsp;Yes. 
&amp;nbsp;The first versions had hard-wired credentials.&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;That brings us to the version in use today. &amp;nbsp;The "vendor ping" application.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The default behavior is a credentials check.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;One optional behavior is to dump the interface details.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Another optional behavior is to allow selection among a small number of simple interactions just to be sure things are working.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Unplanned Work&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What's important here isn't that I did all this. &amp;nbsp;What's important is that the deliverables, user stories and project plans didn't include this little nugget of high-value goodness.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It gets run fairly frequently in crunch situations. &amp;nbsp;The actor in the story ("As system admin...") is rarely considered as a first-class user of the application. 
&amp;nbsp;Yet, the admin is a first-class user, and needs to have proper user stories for confirming that the application is working properly.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-3528332916665988431?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/3528332916665988431/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/12/need-for-ping.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/3528332916665988431'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/3528332916665988431'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/12/need-for-ping.html' title='The need for ping'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-2556054831733186094</id><published>2011-12-09T08:00:00.000-05:00</published><updated>2011-12-09T08:00:07.725-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='functional programming'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='Programming Languages'/><title type='text'>Statically Typed Language Nonsense</title><content type='html'>Read this: "&lt;a href="http://www.sdtimes.com/l/36103" 
target="_blank"&gt;Here Comes Functional Programming&lt;/a&gt;" by Larry O'Brien in SD Times.&lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;people who should know better continue to assert that statically typed languages are "safer, because the compiler can catch errors that otherwise wouldn't show up until runtime." While it's true a statically typed language can detect that you've assigned a string to a double without running your code, no type system is so strict that it can substitute for a test suite, and if you have a test suite, type-assignment errors are discovered and precisely diagnosed with little difficulty.&lt;/blockquote&gt;Thank you. &amp;nbsp; A language like Python, which lacks static type declarations for variables, is not evil or an accident waiting to happen.&lt;br /&gt;&lt;br /&gt;The article is about functional languages. &amp;nbsp;But the static declaration statement is universally true.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-2556054831733186094?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/2556054831733186094/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/12/statically-typed-language-nonsense.html#comment-form' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/2556054831733186094'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/2556054831733186094'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/12/statically-typed-language-nonsense.html' title='Statically Typed Language Nonsense'/><author><name>Steven 
Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-3908466005259772222</id><published>2011-12-06T08:00:00.000-05:00</published><updated>2011-12-06T08:00:00.517-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='blogging'/><title type='text'>I'm Confused by this Marketing Ploy</title><content type='html'>Got this a few weeks back. &lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;My job is to persuade bloggers to link to our site.&amp;nbsp;&lt;/blockquote&gt;&lt;blockquote class="tr_bq"&gt;I really love my job! We have a friendly team and good management, but unfortunately I have no idea how to convince a blogger to link to us, I'm afraid I might lose my job because of it :(&amp;nbsp;&lt;/blockquote&gt;&lt;blockquote class="tr_bq"&gt;And that is why, instead of sending letters to thousands of different blogs, I am reading yours.&lt;/blockquote&gt;Couldn't parse it.&lt;br /&gt;&lt;br /&gt;It seems to be a calculated Pity Ploy. &amp;nbsp;"I'm afraid I might lose my job...I am reading your [blog]."&lt;br /&gt;&lt;br /&gt;The product seemed cool enough. 
&amp;nbsp;The pitch, however, was too sketchy for me.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-3908466005259772222?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/3908466005259772222/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/12/im-confused-by-this-marketing-ploy.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/3908466005259772222'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/3908466005259772222'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/12/im-confused-by-this-marketing-ploy.html' title='I&apos;m Confused by this Marketing Ploy'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-4550102033945251668</id><published>2011-12-01T08:00:00.000-05:00</published><updated>2011-12-01T08:00:14.487-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='agile'/><category scheme='http://www.blogger.com/atom/ns#' term='software process improvement'/><category scheme='http://www.blogger.com/atom/ns#' term='project management'/><title type='text'>Agile "Religion" Issues</title><content type='html'>See this&amp;nbsp;&lt;a href="http://www.pmhut.com/limitations-of-agile-software-development" target="_blank"&gt;Limitations of Agile Software 
Development&lt;/a&gt;&amp;nbsp;and this&amp;nbsp;&lt;a href="http://slott-softwarearchitect.blogspot.com/2011/10/agile-religion-what.html" target="_blank"&gt;The Agile "Religion" -- What?&lt;/a&gt;. &amp;nbsp;What's important is that the limitations of Agile are not limitations. &amp;nbsp;They're (mostly) intentional roadblocks to Agile.&lt;br /&gt;&lt;br /&gt;Looking for "limitations" in the Agile approach misses the point of Agile in several important ways.&lt;br /&gt;The most important problem with this list of "limitations" is that five of the six issues are simply anti-Agile positions that a company can take.&lt;br /&gt;&lt;br /&gt;In addition to being anti-Agile, a company can be anti-Test Driven Development. &amp;nbsp;They can be Anti-Continuous Integration. &amp;nbsp;They can be Anti-NoSQL. &amp;nbsp;There are lots of steps a company can take to subvert any given development practice. &amp;nbsp;Taking a step against a practice does not reveal a limitation.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;"1. A team of stars...&amp;nbsp;it takes more than the average Joe to achieve agility"&lt;/b&gt;. &amp;nbsp; This is not a specific step against agility. &amp;nbsp;I chalk this up to a project manager who really likes autocratic rule. &amp;nbsp;It's also possible that this is from a project manager that's deeply misanthropic. &amp;nbsp;Either way, the underlying assumption is that developers are somehow too dumb or disorganized to be trusted. &lt;br /&gt;&lt;br /&gt;Agile only requires working toward a common goal. &amp;nbsp;I can't see how a project manager is an &lt;i&gt;essential&lt;/i&gt; feature of working toward a common goal. &amp;nbsp;A manager may make things more clear or more efficient, but that's all. &amp;nbsp;Indeed, the "clarity" issue is emphasized in most Agile methods: a "Scrum Master" is part of the team specifically to foster clarity of purpose.&lt;br /&gt;&lt;br /&gt;Further, some Agile methods require a Product Owner to clarify the team's direction. 
&lt;br /&gt;&lt;br /&gt;"A team of stars" is emphatically &lt;b&gt;not&lt;/b&gt; required. &amp;nbsp;The experience of folks working in Agile environments confirms this. &amp;nbsp;Real, working Agile teams really are average.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;"2. Fit with organizational culture"&lt;/b&gt;. &amp;nbsp;This has nothing to do with Agile methods. &amp;nbsp;This is just a sweeping (and true) generalization about organizations. &amp;nbsp;An organization that refuses autonomy and refuses flexibility can't use Agile methods. &amp;nbsp;An organization that refuses to create a "Big Design Up Front" can't use a traditional waterfall method and &lt;b&gt;must&lt;/b&gt;&amp;nbsp;use Agile methods. &lt;br /&gt;&lt;br /&gt;Organizational fit is not a limitation of Agile. &amp;nbsp;It's just a fact about people.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;"3. Small team...Assuming that large projects tend to require large teams, this restriction naturally extends to project size."&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The assumption simply contradicts Agile principles. &amp;nbsp;It's not a "limitation" at all. &amp;nbsp;Large projects (with large numbers of people) have a number of smaller teams. &amp;nbsp;I've seen projects with over a dozen parallel Agile teams. &amp;nbsp;This means that in addition to a dozen daily scrums, there's also a scrum-of-scrums by the scrum masters.&lt;br /&gt;&lt;br /&gt;Throwing out the small team isn't a limitation of Agile. &amp;nbsp;It's a failure to understand Agile. &amp;nbsp;A project with many small teams works quite well. &amp;nbsp;It's not "religion". &amp;nbsp;It's experience. &lt;br /&gt;&lt;br /&gt;A single large team has been shown (for the last few decades) to be expensive and risky. &lt;br /&gt;&lt;br /&gt;&lt;b&gt;"4. Collocated team...We can easily think of a number of situations where this limitation prevents using agile:"&lt;/b&gt; &amp;nbsp;These are not limitations of Agile, but outright refusals to follow Agile principles. 
&amp;nbsp;Specifically:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;"&lt;b&gt;Office space organized by departments&lt;/b&gt;" is not a limitation of Agile. &amp;nbsp;That's a symptom of an organization that refuses to be Agile. &amp;nbsp;See #2 above; this indicates a bad fit with the culture. &amp;nbsp;An organization that doesn't have space organized by department might have trouble executing a traditional waterfall method.&lt;/li&gt;&lt;li&gt;"&lt;b&gt;Distributed environment&lt;/b&gt;" is not a limitation of Agile. &amp;nbsp;Phones work. &amp;nbsp;Skype works.&lt;/li&gt;&lt;li&gt;"&lt;b&gt;Subcontracting... We have to acknowledge that there is no substitute for face-to-face&lt;/b&gt;". &amp;nbsp;Actually, subcontracting is irrelevant. &amp;nbsp;Further, subcontracting is not a synonym for a failure to be collocated. &amp;nbsp;When subcontractors are located remotely, phones still work. &amp;nbsp;Skype works better and is cheaper. &amp;nbsp;&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;&lt;b&gt;"5. Where’s my methodology?"&lt;/b&gt; &amp;nbsp;This is hard to sort out, since it's full of errors. &amp;nbsp;Essentially, this appears to be a claim that a well-defined, documented process is somehow &lt;i&gt;essential&lt;/i&gt; to software development. &amp;nbsp;Experience over the last few decades is quite clear that the written processes and the work actually performed diverge a great deal. &amp;nbsp;Most of the time, what people do is not documented, and the documented process has no bearing on what people actually do. &amp;nbsp;A documented process -- in most cases -- appears irrelevant to the work actually done.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Agile is not chaos. &amp;nbsp;It's a change in the rules to de-emphasize unthinking adherence to a plan and replace this with focus on working software. 
&amp;nbsp;Well-organized software analysis, design, code and test still exist even without elaborately documented (and irrelevant) process definitions.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"&lt;b&gt;6. Team ownership vs. individual accountability...&amp;nbsp;how can we implement it since an organization’s performance-reward system assesses individual performance and rewards individuals, not teams...?&lt;/b&gt;" &amp;nbsp;Again, the assumption ("performance-reward system assesses individual performance") is simply a rejection of Agile principles. &amp;nbsp;It's not a limitation of Agile, it's an intentional step away from an Agile approach. &amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If an organization insists on individual performance metrics, see #2. &amp;nbsp;The culture is simply antithetical to Agile. Agile still works; the organization, however, is taking active steps to subvert it.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Agile isn't a religion. &amp;nbsp;It doesn't suffer from hidden or ignored "limitations".&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;"But did we question the assumption that Agile was indeed superior to traditional methodologies?"&lt;/b&gt; &amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The answer is "yes". &amp;nbsp;A thousand times yes. &amp;nbsp;The whole reason for Agile approaches is specifically and entirely because of folks questioning traditional methodologies. &amp;nbsp;Traditional command-and-control methodologies have a long history of not working out well for software development. &amp;nbsp;The Agile Manifesto is a result of examining the failures of traditional methods.&lt;br /&gt;&lt;br /&gt;A traditional "waterfall" methodology works when there are few unknowns. &amp;nbsp;Construction projects, for example, rarely have the kinds of unknowns that software development has. 
&amp;nbsp;Construction usually involves well-known techniques applied to well-documented plans to produce a well-understood result. &amp;nbsp;Software development rarely involves so many well-known details. &amp;nbsp;Software development is 80% design and 20% construction. &amp;nbsp;And the design part involves 80% learning something new and 20% applying experience.&lt;br /&gt;&lt;br /&gt;Agile is not Snake Oil. &amp;nbsp;It's not something to be taken on faith. &amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The Agile community exists for exactly one reason. &amp;nbsp;Agile methods work.&lt;br /&gt;&lt;br /&gt;Agile isn't a money-making product or service offering. &amp;nbsp;Agile -- itself -- is free. &amp;nbsp;Some folks try to leverage Agile techniques to sell supporting products or services, but Agile isn't an IBM or Oracle product. &amp;nbsp;There are no "backers". &amp;nbsp;There's no trail of money to see who profits from Agility.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Folks have been questioning "traditional" methodologies for years. &amp;nbsp;Why? &amp;nbsp;Because "traditional" waterfall methodologies are a crap-shoot. &amp;nbsp;Sometimes they work and sometimes they don't work. &amp;nbsp;The essential features of long term success are summarized in the &lt;a href="http://agilemanifesto.org/" target="_blank"&gt;Agile Manifesto&lt;/a&gt;. 
&amp;nbsp;Well-run projects all seem to have certain common features; the features of well-run projects form the basis for the Agile methods.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-4550102033945251668?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/4550102033945251668/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/12/agile-religion-issues.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/4550102033945251668'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/4550102033945251668'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/12/agile-religion-issues.html' title='Agile &quot;Religion&quot; Issues'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-4416962620766638201</id><published>2011-11-29T08:00:00.000-05:00</published><updated>2011-11-29T08:00:03.002-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='tools'/><title type='text'>The Value of Microsoft's Tools</title><content type='html'>See Andrew Binstock's "&lt;a href="http://drdobbs.com/windows/231700224#" target="_blank"&gt;Windows 8: Microsoft's Development Re-Do&lt;/a&gt;".&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;The costs of 
these migrations has been enormous and continues to accumulate...&lt;/blockquote&gt;I can only rub my hands with glee and engage in shameless "I Told You So" self-congratulations.&lt;br /&gt;&lt;br /&gt;Only you can prevent being held hostage by Microsoft.&lt;br /&gt;&lt;br /&gt;More than once, I've observed that a strategy of using only proprietary tools would be expensive and complex. &amp;nbsp;And every time, the folks I was talking to trivialized my concerns as hardly worth considering. &lt;br /&gt;&lt;br /&gt;I've seen orphaned software: it only compiles on an old version of Visual Studio. &amp;nbsp; I've seen software orphaned so badly that it can only be compiled on one creaky old PC. &amp;nbsp;The cost to convert was so astronomical that the customer preferred to hope for a product to arise somewhere in the marketplace. &amp;nbsp;When no suitable product appeared over the decades, the problem reached palpable Pants On Fire (POF) levels of panic. &amp;nbsp;All due to the hidden costs of Microsoft's tools.&lt;br /&gt;&lt;br /&gt;I've even been told that VB is a terrible language, but Visual Studio makes it acceptable.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-4416962620766638201?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/4416962620766638201/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/11/value-of-microsofts-tools.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/4416962620766638201'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/4416962620766638201'/><link rel='alternate' 
type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/11/value-of-microsofts-tools.html' title='The Value of Microsoft&apos;s Tools'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-6008102324541986567</id><published>2011-11-24T08:00:00.000-05:00</published><updated>2011-11-28T07:37:28.459-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='software process improvement'/><category scheme='http://www.blogger.com/atom/ns#' term='project management'/><title type='text'>Justification of Project Staffing</title><content type='html'>I really dislike being asked to plan a project.&amp;nbsp; It's hard to predict the future accurately.&lt;br /&gt;&lt;br /&gt;In spite of the future being -- well -- the future, and utterly unknowable, we still have to have the following kinds of discussions.&lt;br /&gt;&lt;br /&gt;Me: "It's probably going to take a team of six."&lt;br /&gt;&lt;br /&gt;Customer: "We don't really have the budget for that.&amp;nbsp; You're going to have to provide a lot of justification for a team that big."&lt;br /&gt;&lt;br /&gt;What's wrong with this picture?&amp;nbsp; Let's enumerate.&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Customer is paying me for my opinion based on my experience.&amp;nbsp; If they want to provide me with the answers, I have a way to save them a lot of money.&amp;nbsp; Write their own project plan with their own answers and leave me out of it.&lt;/li&gt;&lt;li&gt;I've already provided all the justification there is.&amp;nbsp; I'm predicting the future here.&amp;nbsp; Software projects are not simple Rate-Time-Distance fourth-grade math problems.&amp;nbsp; They involve an unknown number of unknowns.&amp;nbsp; I 
can't provide a "lot" of justification because there isn't any indisputable basis for the prediction.&lt;/li&gt;&lt;li&gt;I don't know the people. The customer -- typically -- hasn't hired them yet.&amp;nbsp; Since I don't know them, I don't know how "productive" they'll be.&amp;nbsp; They could hire a dozen n00bz who can't find their asses blindfolded even using both hands.&amp;nbsp; Or.&amp;nbsp; They could hire two singular geniuses who can knock the thing out in a weekend.&amp;nbsp; Or.&amp;nbsp; They could hire a half-dozen arrogant SOB's who refuse to follow my recommendations.&amp;nbsp; &lt;/li&gt;&lt;li&gt;They're going to do whatever they want no matter what I say.&amp;nbsp; Seriously.&amp;nbsp; I could say "six".&amp;nbsp; They could argue that I should rewrite the plan to say "four" without changing the effort and duration.&amp;nbsp; Why ask me to change the plan?&amp;nbsp; A customer can only do what they &lt;b&gt;know &lt;/b&gt;to be the right thing.&amp;nbsp; &lt;/li&gt;&lt;/ol&gt;&lt;b&gt;Doing the Right Thing&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Let's return to that last point.&amp;nbsp; A customer project manager can only do what they absolutely &lt;b&gt;know &lt;/b&gt;is the right thing.&amp;nbsp; I can suggest all kinds of things.&amp;nbsp; If they're too new, too different, too disturbing, they're going to get ignored.&lt;br /&gt;&lt;br /&gt;Indeed, since people have such a huge &lt;a href="http://en.wikipedia.org/wiki/Confirmation_bias" target="_blank"&gt;Confirmation Bias&lt;/a&gt;, it's very, very hard to introduce anything new.&amp;nbsp; A customer doesn't bring in consultants without having already sold the idea that a software development project is in the offing.&amp;nbsp; They justify spending a few thousand on consulting by establishing some overall, ball-park, big-picture budget and showing that the consulting fees are just a small fraction of the overall.&lt;br /&gt;&lt;br /&gt;As consultants, we have to guess this overall, ball-park, 
big-picture budget accurately, or the project will be shut down.&amp;nbsp; If we guess too high, then the budget is out of control, or the scope isn't well-enough defined, or some other smell will stop all progress.&amp;nbsp; If we guess too low, then we have to lard on additional work to get back to the original concept.&lt;br /&gt;&lt;br /&gt;Architectures, components and techniques &lt;b&gt;all &lt;/b&gt;have to meet expectations. A customer that isn't familiar with test-driven development, for example, will have an endless supply of objections.&amp;nbsp; "It's unproven."&amp;nbsp; "We don't have the budget for all that testing."&amp;nbsp; "We're more comfortable with our existing process."&lt;br /&gt;&lt;br /&gt;The final trump card is the passive-aggressive "I'll have to see the detailed justification."&amp;nbsp; It means "Don't you dare."&amp;nbsp; But it sounds just like passive acceptance.&lt;br /&gt;&lt;br /&gt;Since project managers can only do what they know is right, they'll find lots of ways of subverting the new and unfamiliar.&lt;br /&gt;&lt;br /&gt;If they don't like the architecture, the first glitch or delay or problem will immediately lead to a change in direction to yank out the new and replace it with the familiar.&lt;br /&gt;&lt;br /&gt;If they don't like a component, they'll find numerous great reasons to rework that part of the project to remove the offending component.&lt;br /&gt;&lt;br /&gt;If they don't like a technique (e.g., Code Walk Throughs) they'll subvert it.&amp;nbsp; Either not schedule them.&amp;nbsp; Or cancel them because there are "more important things to do."&amp;nbsp; Or interrupt them to pull people out of them.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Overcoming the Confirmation Bias&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;I find the process of overcoming the confirmation bias to be tedious.&amp;nbsp; Some people like the one-on-one "influencing" role.&amp;nbsp; It takes patience and time to overcome the confirmation bias so that the 
customer is open to new ideas.&amp;nbsp; I just don't have the patience.&amp;nbsp; It's too much work to listen patiently to all the objections and slowly work through all the alternatives.&lt;br /&gt;&lt;br /&gt;I've worked with folks who really relish this kind of thing.&amp;nbsp; Endless one-on-one meetings.&amp;nbsp; Lots of pre-meetings and post-meetings and reviews of drafts.&amp;nbsp; I suppose it's rewarding.&amp;nbsp; Sigh.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-6008102324541986567?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/6008102324541986567/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/11/justification-of-project-staffing.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6008102324541986567'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6008102324541986567'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/11/justification-of-project-staffing.html' title='Justification of Project Staffing'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-5779748183606941608</id><published>2011-11-22T08:00:00.000-05:00</published><updated>2011-11-22T08:00:06.744-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' 
term='learning'/><category scheme='http://www.blogger.com/atom/ns#' term='building skills books'/><title type='text'>How to Learn</title><content type='html'>A recent question. &lt;br /&gt;&lt;blockquote class="tr_bq"&gt;i came up with two options.&lt;br /&gt;&amp;nbsp;1. &amp;nbsp;building skills 1 (+ other references)... then algorithms &amp;amp; data&lt;br /&gt;structures.... then your books 2 &amp;amp; 3&lt;br /&gt;&lt;br /&gt;or&lt;br /&gt;&lt;br /&gt;&amp;nbsp;2. &amp;nbsp;your three books 1,2 &amp;amp; 3... then algo &amp;amp; ds&lt;br /&gt;&lt;br /&gt;kindly help me decide so i can start soon.&amp;nbsp;&lt;/blockquote&gt;&lt;br /&gt;I have two pieces of advice. &lt;br /&gt;&lt;br /&gt;First.&amp;nbsp; Programming is a &lt;i&gt;language &lt;/i&gt;skill.&amp;nbsp; Just like English.&amp;nbsp; If you can't get the English right, the odds of getting Python, Java, HTML or SQL right are considerably reduced.&amp;nbsp; Please, please, please take more care in grammar, syntax and punctuation.&amp;nbsp; Otherwise, your future as a programmer doesn't look very good.&amp;nbsp; For example, the personal pronoun is spelled "I".&amp;nbsp; In the 20th century, we spell out "and"; we stopped writing "&amp;amp;" as a stand-in for the Latin "&lt;i&gt;et&lt;/i&gt;" centuries ago.&amp;nbsp; Also, ellipses ("...") shouldn't be used except when eliding part of a quote.&amp;nbsp; Clarity and precision actually matter.&lt;br /&gt;&lt;br /&gt;Second, and more relevant, your two choices don't really amount to a significant difference.&amp;nbsp; If you're waiting around for advice, you're wasting your time.&amp;nbsp; Both sequences are good ideas. It's more important to get started than it is to carefully choose the precise and exact course of study. Just start doing something immediately. &lt;br /&gt;&lt;br /&gt;Learning to program is a life-long exercise. There will &lt;b&gt;always &lt;/b&gt;be more to learn. Start as soon as you can. 
The exact choices don't matter.&amp;nbsp; Why?&amp;nbsp; Because, eventually, you'll read &lt;b&gt;all &lt;/b&gt;of those books plus&amp;nbsp; &lt;b&gt;many, many&lt;/b&gt; others.&lt;br /&gt;&lt;br /&gt;Spend less time waiting for advice and more time studying.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-5779748183606941608?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/5779748183606941608/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/11/how-to-learn.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/5779748183606941608'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/5779748183606941608'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/11/how-to-learn.html' title='How to Learn'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-6980967844911476482</id><published>2011-11-17T08:00:00.000-05:00</published><updated>2011-11-17T08:00:10.546-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='inheritance'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='OO design'/><category 
scheme='http://www.blogger.com/atom/ns#' term='delegation'/><title type='text'>More On Inheritance vs. Delegation</title><content type='html'>Emphasis on the "More On" as in "Moron".&amp;nbsp; This is a standard design error story.&amp;nbsp; The issue is that inheritance happens along an "axis" or "dimension" where the subclasses are at different points along that axis.&amp;nbsp; Multi-dimensional inheritance is an EPIC FAIL.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Context&lt;/b&gt; &lt;br /&gt;&lt;br /&gt;Data warehouse processing can involve a fair amount of "big batch" programs.&amp;nbsp; Loading 40,000 rows of econometric data in a single swoop, updating dimensions and loading facts, for example.&amp;nbsp;&lt;br /&gt;&lt;br /&gt;When you get data from customers and vendors, you have endless file-format problems.&amp;nbsp; To assure that things will work, each of these big batch programs has at least two operating modes.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Validate.&amp;nbsp; Go through all the motions.&amp;nbsp; Except.&amp;nbsp; Don't commit any changes to the database; don't make any filesystem changes.&amp;nbsp; (i.e., write the new files, but don't do the final renames to make the files current.)&lt;/li&gt;&lt;li&gt;Load.&amp;nbsp; Go through all the motions including a complete commit to the database and any filesystem changes.&lt;/li&gt;&lt;/ul&gt;&lt;b&gt;Problem&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;What's the difference between the two modes?&amp;nbsp; Clearly, one is a subclass of the other.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Load can be the superclass.&amp;nbsp; The Validate subclass simply replaces the save methods with stubs that do nothing.&lt;/li&gt;&lt;li&gt;Validate can be the superclass.&amp;nbsp; The Load subclass simply implements the save method stubs with methods that do something useful.&lt;/li&gt;&lt;/ul&gt;Simple, right?&lt;br /&gt; &lt;br /&gt;Wrong.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;What Doesn't Work&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;This design has a 
smell.&amp;nbsp; The smell is that we can't easily extend the overall processing to include an additional feature.&amp;nbsp;&lt;br /&gt;&lt;br /&gt;Why not?&amp;nbsp; &lt;br /&gt;&lt;br /&gt;This design has the persistence feature set as the inheritance axis or dimension.&amp;nbsp; This is kind of limited.&amp;nbsp; We really want a different feature set for inheritance.&lt;br /&gt;&lt;br /&gt;Consider a Validate for two dimensions (Company and Time) that loads econometric facts.&amp;nbsp; It has stub "save" methods.&lt;br /&gt;&lt;br /&gt;We subclass the Validate to create the proper Load for these two dimensions and one fact.&amp;nbsp; We replace the stub save methods with proper database commits.&amp;nbsp;&lt;br /&gt;&lt;br /&gt;After the actuaries think for a while, suddenly we have a file which includes an additional dimension (i.e., business location) or an additional fact (i.e., econometric data at a different level of granularity).&amp;nbsp; What now?&amp;nbsp; If we subclass Validate to add the dimension or fact, we have a problem.&amp;nbsp; We have to repeat the Load subclass methods for the new, extended Load.&amp;nbsp; Oops.&lt;br /&gt;&lt;br /&gt;If we subclass Load to add the dimension or fact, we have a problem.&amp;nbsp; We have to repeat the Validate stubs in the new extended Load to make it into a Validate.&amp;nbsp; Oops.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Recognizing Delegation&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;It's difficult to predict inheritance vs. 
delegation design problems.&lt;br /&gt;&lt;br /&gt;The hand-waving advice is to consider the &lt;i&gt;essential &lt;/i&gt;features of the object.&amp;nbsp; This isn't too helpful.&amp;nbsp; Often, we're so focused on the database design that persistence seems essential.&lt;br /&gt;&lt;br /&gt;Experience shows, however, that some things are not essential.&amp;nbsp; Persistence, for example, is one of those things that should &lt;i&gt;always &lt;/i&gt;be delegated.&lt;br /&gt;&lt;br /&gt;Another thing that should always be delegated is the more general problem of representation: JSON, XML, etc., should rely on delegation since this is never essential.&amp;nbsp; There's always another representation for data.&amp;nbsp; Representation is always independent of the object's essential internal state changes.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Consequence&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;In my case, I've got about a dozen implementations using a clunky inheritance that had some copy-and-paste programming.&amp;nbsp; Oops.&lt;br /&gt;&lt;br /&gt;I'm trying to reduce that technical debt by rewriting each to be a proper delegation.&amp;nbsp; With good unit test coverage, there's no real technical risk.&amp;nbsp; Just the tedium of fixing the same mistake that I rushed into production twelve separate times.&amp;nbsp;&lt;br /&gt;&lt;br /&gt;Really.&amp;nbsp; Colossally dumb.&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-6980967844911476482?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/6980967844911476482/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/11/more-on-inheritance-vs-delegation.html#comment-form' title='3 Comments'/><link rel='edit' 
type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6980967844911476482'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6980967844911476482'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/11/more-on-inheritance-vs-delegation.html' title='More On Inheritance vs. Delegation'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-6679365481169431928</id><published>2011-10-25T08:00:00.000-04:00</published><updated>2011-10-25T08:00:09.900-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='ctypes'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='vmware'/><title type='text'>VMware, VIX and PyVIX2</title><content type='html'>The topic of VMware came up at my local &lt;a href="http://www.meetup.com/757-Python-Users-Group/"&gt;757 Python Users Group&lt;/a&gt;.&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A common administrative need is to control VM farms. &amp;nbsp;While there are a number of pointy-clicky GUI tools, VMware offers the VIX library to permit writing scripts to control VM's.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's some information we looked at recently on &lt;a href="http://www.meetup.com/757-Python-Users-Group/pages/PyVIX2_and_VMware_Control/"&gt;PyVIX2 and VMware&lt;/a&gt;.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The idea behind PyVIX2 is to provide a relatively simple Python binding to VIX. 
&amp;nbsp; &amp;nbsp;This, too, is a command-line interface, following on the heels of&amp;nbsp;&lt;a href="http://www.blogger.com/blogger.g?blogID=684183198890094283#editor/target=post;postID=2609889238262822318"&gt;More Command-Line Goodness&lt;/a&gt;&amp;nbsp;and &lt;a href="http://www.blogger.com/blogger.g?blogID=684183198890094283#editor/target=post;postID=6953887040859784637"&gt;Command-Line Applications&lt;/a&gt;.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-6679365481169431928?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/6679365481169431928/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/10/vmware-vix-and-pyvix2.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6679365481169431928'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6679365481169431928'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/10/vmware-vix-and-pyvix2.html' title='VMware, VIX and PyVIX2'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total><georss:featurename>Norfolk, VA, USA</georss:featurename><georss:point>36.8507689 -76.2858726</georss:point><georss:box>36.6474749 -76.6017296 37.0540629 
-75.97001560000001</georss:box></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-530740292495737576</id><published>2011-10-20T08:00:00.000-04:00</published><updated>2011-10-20T15:38:07.621-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='waterfall'/><category scheme='http://www.blogger.com/atom/ns#' term='agile'/><category scheme='http://www.blogger.com/atom/ns#' term='software process improvement'/><title type='text'>The Agile "Religion" -- What?</title><content type='html'>Received "it seems that software development&amp;nbsp;has caught the agile religion. Personally, I have an issue w/ being&amp;nbsp;unimodal."&lt;br /&gt;&lt;br /&gt;What?&lt;br /&gt;&lt;br /&gt;First. &amp;nbsp;"agile religion". &amp;nbsp;As in the deprecating statement: &lt;i&gt;Agile is nothing more than a religion&lt;/i&gt;?&amp;nbsp; As in &lt;i&gt;Agile is nothing more than a vague religious practice with no tangible value to an organization&lt;/i&gt;?&amp;nbsp; Interesting, I guess. &lt;br /&gt;&lt;br /&gt;I'm assuming that the author did not read the &lt;a href="http://agilemanifesto.org/"&gt;Manifesto for Agile Software Development&lt;/a&gt;. &amp;nbsp;Or--worse-- they read it and find that the four values (Individuals and interactions,&amp;nbsp;Working software,&amp;nbsp;Customer collaboration and&amp;nbsp;Responding to change) are just of no tangible value.&lt;br /&gt;&lt;br /&gt;That's alarming. &amp;nbsp;Really. &amp;nbsp;The alternative (processes and tools,&amp;nbsp;comprehensive documentation,&amp;nbsp;contract negotiation, and&amp;nbsp;following a plan without regard to changes) seems like it's a recipe for cost, risk, low-value work and a cancelled project. 
&amp;nbsp;Indeed, it seems like non-Agile project management is the best way to get to the fabled "Software Crisis" where lots of money gets spent but little of value gets created.&lt;br /&gt;&lt;br /&gt;Further, it seems that &lt;b&gt;all&lt;/b&gt; modifications of the classic waterfall method (e.g., spiral method as a prime example) specifically create "iterative, incremental" approaches to software development. &amp;nbsp;That is, everything that's not a strict (brain-dead) waterfall has some elements of Agile.&lt;br /&gt;&lt;br /&gt;This causes me to think that Agile isn't a religion. &amp;nbsp;It causes me to think that Waterfall methods were a religious practice of no tangible value. &amp;nbsp;All the methodology experiments over the last 15 years have been ways of introducing flexibility (agility, brains) into a foolishly inflexible methodology definition. &lt;br /&gt;&lt;br /&gt;Indeed, it appears that the heavy-weight waterfallish methods are an attempt to replace thinking with process. &amp;nbsp;And it didn't work. &amp;nbsp;So, we have to go back to the thinking part. &amp;nbsp;Only, we call it Agile now.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Religious Wars&lt;/b&gt;.&lt;br /&gt;&lt;br /&gt;Second. &amp;nbsp;"agile religion" (again). &amp;nbsp;As in &lt;i&gt;methodology discussions are just religious wars&lt;/i&gt;?&amp;nbsp; As in &lt;i&gt;methodology discussions are just quibbling over no-value details&lt;/i&gt;? &amp;nbsp;Some folks may get this impression of making a choice between Agile vs. Non-Agile methods. &amp;nbsp;I think that those folks haven't actually had the opportunity to work from a prioritized backlog and build the most valuable part first. &amp;nbsp;I think that someone who thinks Agile is just a religious war hasn't been allowed to fix a broken project plan based on lessons learned during the first release.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Agility&lt;/b&gt;.&lt;br /&gt;&lt;br /&gt;Third. &amp;nbsp; "unimodal". 
&amp;nbsp;As in &lt;i&gt;being exclusively Agile is bad&lt;/i&gt;? &amp;nbsp;As in &lt;i&gt;sometimes you need to have a rigid, unyielding process that sticks strictly to the schedule irrespective of changes which may occur&lt;/i&gt;? &amp;nbsp;That doesn't seem rational.&lt;br /&gt;&lt;br /&gt;Change happens. &amp;nbsp;Forcing the inevitable changes to conform to some farcical schedule made up by people who didn't have all the details seems silly. &amp;nbsp;Making contract negotiation the focal point of response to change seems like a waste of effort. &amp;nbsp;Trying to document everything so completely that all possible changes are already accounted for seems impossible. &amp;nbsp;And replacing change with a process that regulates change seems -- perhaps -- unhinged.&lt;br /&gt;&lt;br /&gt;There were some links and some charts and graphs attached. &amp;nbsp;I couldn't get past the two sentences above to see if there was, perhaps, something more to it. &amp;nbsp;All I could do was respond with a request for clarification that didn't involve the trivialization of Agile methods. &amp;nbsp;It doesn't seem sensible to try and remove the human element from software development. &lt;br /&gt;&lt;br /&gt;I'll provide whatever follow-up clarification surfaces on this topic. &amp;nbsp;It's interesting to see if the "agile religion" was misplaced, or if there are folks who think that responding to the messiness of real software development is a bad idea.&lt;br /&gt;&lt;br /&gt;We tried the waterfall method. &amp;nbsp;And it didn't work very well. &amp;nbsp;Agile isn't a "religion". 
&amp;nbsp;It's a simple acknowledgement that reality is messy.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-530740292495737576?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/530740292495737576/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/10/agile-religion-what.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/530740292495737576'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/530740292495737576'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/10/agile-religion-what.html' title='The Agile &quot;Religion&quot; -- What?'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-2609889238262822318</id><published>2011-10-13T08:00:00.000-04:00</published><updated>2011-10-13T08:00:04.738-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='stingray reader'/><category scheme='http://www.blogger.com/atom/ns#' term='CLI'/><title type='text'>More Command-Line Goodness</title><content type='html'>In &lt;a href="http://slott-softwarearchitect.blogspot.com/2011/10/command-line-applications.html"&gt;Command-Line Applications&lt;/a&gt;, we looked at a Python main-import switch which boiled 
down to this.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;for file in args.file:&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; with open( file, "r" ) as source:&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; process_file( source, args )&lt;/span&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The point was that each distinct file on the command-line was processed in a more-or-less uniform way by a single function that does the "real work"&amp;nbsp;for that input file.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It turns out that we often have flat files which are spreadsheets or spreadsheet-like. &amp;nbsp; Indeed, for some people (and some organizations) the spreadsheet is their preferred user interface. &amp;nbsp;As I've said before,&amp;nbsp;&lt;/div&gt;&lt;blockquote&gt;Spreadsheets are the universal user interface. Everyone likes them, they're almost inescapable. And they work. There's no reason to attempt to replace the spreadsheet with a web page or a form or a desktop application. It's easier to cope with spreadsheet vagaries than to replace them.&lt;/blockquote&gt;&lt;div&gt;They have problems, but they are surprisingly common. &amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Enter &lt;a href="http://sourceforge.net/p/stingrayreader/home/Stingray%20--%20Schema-Based%20File%20Reader/"&gt;Stingray Reader&lt;/a&gt;. &amp;nbsp;This is a small Python library to make it easy to have programs which read workbooks--collections of spreadsheets--or spreadsheet-like files with a degree of transparency. &amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;And. 
&amp;nbsp;It allows a clean command-line interface.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;With a little care, we can reduce the main-import switch to something like this.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;if __name__ == "__main__":&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; logging.basicConfig( stream=sys.stderr )&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; args= parse_args()&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; logging.getLogger().setLevel( args.verbosity )&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;b&gt;&amp;nbsp; &amp;nbsp; builder= make_builder( args )&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; try:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; for file in args.file:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;b&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; with workbook.open_workbook( file ) as source:&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;b&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; process_workbook( source, builder 
)&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; status= 0&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; except Exception as e:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; logging.exception( e )&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; status= 3&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; logging.shutdown()&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; sys.exit( status )&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The bold lines are specific to workbook ("spreadsheet") processing. &amp;nbsp;A "builder" creates application-specific Python objects from spreadsheet rows. &amp;nbsp;The "&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;workbook.open_workbook&lt;/span&gt;" is a function that builds a workbook reader based on the file name. &amp;nbsp;It can handle a number of file types. 
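&lt;br /&gt;&lt;br /&gt;As a rough illustration only (this is not Stingray Reader's actual implementation), a name-based factory of this kind might be sketched as follows. The CSVWorkbook class, the READERS registry and this particular open_workbook body are hypothetical names invented for the sketch; it handles a single file type and raises KeyError for anything else.&lt;br /&gt;&lt;pre&gt;
```python
import contextlib
import csv
import os


class CSVWorkbook:
    """Hypothetical reader: treats a CSV file as a one-sheet workbook."""

    def __init__(self, file_obj):
        self.file_obj = file_obj

    def rows(self):
        # Each row comes back as a list of strings.
        return csv.reader(self.file_obj)


# Hypothetical registry mapping file extensions to reader classes.
READERS = {".csv": CSVWorkbook}


@contextlib.contextmanager
def open_workbook(name):
    """Choose a reader from the file name's extension; close the file on exit."""
    extension = os.path.splitext(name)[1].lower()
    reader_class = READERS[extension]  # KeyError for an unsupported file type
    with open(name, "r", newline="") as source:
        yield reader_class(source)
```
&lt;/pre&gt;Wrapping the factory in a context manager is what lets the main program use a with statement and leave the file-closing to the library.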
&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The &lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;process_workbook&lt;/span&gt; function is the "real work" function that handles a workbook of individual spreadsheets (or a spreadsheet-like file).&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-2609889238262822318?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/2609889238262822318/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/10/more-command-line-goodness.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/2609889238262822318'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/2609889238262822318'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/10/more-command-line-goodness.html' title='More Command-Line Goodness'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-7783926205225519594</id><published>2011-10-11T08:00:00.000-04:00</published><updated>2011-10-11T08:00:11.143-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='software process improvement'/><category scheme='http://www.blogger.com/atom/ns#' term='complexity'/><title type='text'>A smoothly operating, 
well-oiled engine for failure</title><content type='html'>It occurs to me that much of "Big IT" creates a well-oiled organization that makes broken software seem acceptable. The breakage is wrapped in layers of finely-tuned process.&lt;br /&gt;&lt;br /&gt;Consider a typical Enterprise Application.&amp;nbsp; There's a help desk, ticket tracking, a user support organization that does "ad-hoc" processing, and a development organization to handle bug fixes and enhancement requests.&amp;nbsp; All those people doing all that work.&lt;br /&gt;&lt;br /&gt;Why?&lt;br /&gt;&lt;br /&gt;If people need all that support, then the application is -- from a simplistic view -- broken.&lt;br /&gt;&lt;br /&gt;The organization, however, has coped with the broken application by wrapping it in layers of people, process, tools, technology, management and funding.&amp;nbsp; The end users have a problem, they call the help desk, and the machine kicks in to resolve their problem.&lt;br /&gt;&lt;br /&gt;It is a given -- a going-in assumption -- a normal, standard expectation that any enterprise software is so broken that a huge organization will be essential for pressing forward.&amp;nbsp; It is expected that good software cannot be built. &lt;br /&gt;&lt;br /&gt;We're asked to help a client create a sophisticated plan for the &lt;span class="yshortcuts" id="lw_1317811546_1"&gt;New Enterprise&lt;/span&gt; App support organization.&amp;nbsp; Planning this organization feels like planning for various kinds of known, predicted, expected failures. 
Failure is the expectation.&amp;nbsp; Broken is the standard operating mode.&lt;br /&gt;&lt;br /&gt;Consider a typical non-Enterprise Application.&amp;nbsp; Let's say, the GNU C compiler.&amp;nbsp; Or Python.&amp;nbsp; Or Linux.&amp;nbsp; An almost entirely volunteer organization, no help desk, no trouble tickets, no elaborate support organization plan.&amp;nbsp; Yet.&amp;nbsp; These products actually work flawlessly.&amp;nbsp; They're not wrapped in a giant organization.&lt;br /&gt;&lt;br /&gt;Why is the bar for acceptability so low for "Enterprise" applications?&amp;nbsp; Why is this tolerated?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-7783926205225519594?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/7783926205225519594/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/10/smoothly-operating-well-oiled-engine.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/7783926205225519594'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/7783926205225519594'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/10/smoothly-operating-well-oiled-engine.html' title='A smoothly operating, well-oiled engine for failure'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-6953887040859784637</id><published>2011-10-06T08:00:00.000-04:00</published><updated>2011-10-06T08:00:07.785-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='java'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='CLI'/><title type='text'>Command Line Applications</title><content type='html'>I'm old -- I admit it -- and I feel that command-line applications are still very, very important.  Linux, for example, is packed full of almost innumerable command-line applications.  In some cases, the Linux GUI tools are just wrappers around the underlying command-line applications.&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For many types of high-volume data processing, command-line applications are essential.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I've seen command-line applications done very badly.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Overusing Main&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;When writing OO programs, it's absolutely essential that the OS interface (&lt;span class="Apple-style-span" style="font-family: 'courier new';"&gt;public static void main&lt;/span&gt; in Java or the &lt;span class="Apple-style-span" style="font-family: 'courier new';"&gt;if __name__ == "__main__":&lt;/span&gt; block in Python) does as little as possible.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A good command-line program has the underlying tasks or actions defined in some easy-to-work-with class hierarchy built on the &lt;b&gt;Command&lt;/b&gt; design pattern.  
The actual main program part does just a few things:  gather the relevant environment variables, parse command-line options and arguments, identify the configuration files, and initiate the appropriate commands.  Nothing application-specific.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;When the main method does application-specific work, that application functionality is buried in a method that's particularly hard to reuse.  It's important to keep the application functionality away from the OS interface.&lt;br /&gt;&lt;br /&gt;I'm finding that main programs should look something like this:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;if __name__ == "__main__":&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; logging.basicConfig( stream=sys.stderr )&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; args= parse_args()&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; logging.getLogger().setLevel( args.verbosity )&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; try:&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; for file in args.file:&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; with open( file, "r" ) as source:&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 
&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; process_file( source, args )&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; status= 0&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; except Exception as e:&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; logging.exception( e )&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; status= 3&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; logging.shutdown()&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; sys.exit( status )&lt;/span&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;That's it. &amp;nbsp;Nothing more in the top-level main program. 
&amp;nbsp;The &lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;process_file&lt;/span&gt; function becomes a reusable "command" and something that can be tested independently.&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-6953887040859784637?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/6953887040859784637/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/10/command-line-applications.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6953887040859784637'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6953887040859784637'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/10/command-line-applications.html' title='Command Line Applications'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-7723070148206892989</id><published>2011-10-04T08:00:00.000-04:00</published><updated>2011-10-05T06:42:11.961-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='business rules'/><title type='text'>"Hard Coding" Business Rules</title><content type='html'>See this: "&lt;a 
href="http://www.sdtimes.com/STOP_HARD_CODING_BUSINESS_RULES/By_David_Rubinstein/About_BUSINESSDEVELOPERS_and_BUSINESSRULES/35919"&gt;Stop hard-coding business rules&lt;/a&gt;" in SD Times.&lt;br /&gt;&lt;br /&gt;Here's what's exasperating: "Memo to developers: Stop hard-coding business rules into applications. Use business rules engines instead."&lt;br /&gt;&lt;br /&gt;Business Rules Engines? &amp;nbsp;You mean Python?&lt;br /&gt;&lt;br /&gt;It appears that they don't mean Python.&lt;br /&gt;&lt;br /&gt;"Developers can use [a BPM suite or rules engine] and be more productive, so long as they don’t use C# or Java as a default for development". &lt;br /&gt;&lt;br /&gt;I'm guessing that by "C# or Java" they mean "a programming language" and I would bet that Python is included in "bad" languages for development.&lt;br /&gt;&lt;br /&gt;Python has all the simplicity and expressive power of a Domain-Specific Language (DSL) for business rules.&lt;br /&gt;&lt;br /&gt;Don't hard-code business rules in Java. 
&amp;nbsp;Code them in an interpreted language like Python.&lt;br /&gt;&lt;br /&gt;Also, don't be misled by any claims that business analysts or (weirdly) users can somehow "code" business rules.&amp;nbsp; They can't (and mostly, they won't).&amp;nbsp; That's why SD Times wisely says "Developers".&amp;nbsp; That's how coding gets done.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-7723070148206892989?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/7723070148206892989/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/10/hard-coding-business-rules.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/7723070148206892989'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/7723070148206892989'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/10/hard-coding-business-rules.html' title='&quot;Hard Coding&quot; Business Rules'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-3891157909616110819</id><published>2011-09-29T08:00:00.000-04:00</published><updated>2011-09-29T08:00:07.853-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='project management'/><category scheme='http://www.blogger.com/atom/ns#' term='estimating'/><title type='text'>The Politics 
of Estimating</title><content type='html'>Computerworld, September 12, page 10.&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;b&gt;Microburst&lt;/b&gt;IT Disasters&lt;br /&gt;According to a study of 1,471 big IT projects, 15% turn out to be money pits, with cost overruns averaging 200%.&lt;/blockquote&gt;&lt;br /&gt;How is this a politically-charged statement? &amp;nbsp;We hear this kind of thing all the time.&lt;br /&gt;&lt;br /&gt;As developers (or project leaders) we're failing to execute.&lt;br /&gt;&lt;br /&gt;Right? &lt;br /&gt;&lt;br /&gt;Hogwash.&lt;br /&gt;&lt;br /&gt;An "overrun" is isomorphic to "badly justified" or "badly budgeted" or "oversold to executive sponsors". &lt;br /&gt;&lt;br /&gt;An "overrun" can be a failure to use (or even permit) realistic estimates. &amp;nbsp;It may reflect an executive sponsor restating objectives to make the project large enough to justify it. &amp;nbsp;An overrun can mean anything.&lt;br /&gt;&lt;br /&gt;Calling it an overrun is a way to label it as "failure to execute".&lt;br /&gt;&lt;br /&gt;I prefer to call it a failure of vision (or whatever it is executive sponsors do). &amp;nbsp;It's more likely to be an under-estimate than it is to be an over-run.&lt;br /&gt;&lt;br /&gt;After all, how many times have we been told to reduce an estimate? 
&amp;nbsp;How many times have folks gotten their "attaboys" and "attagirls" for "sharpening their pencils" and reducing the proposal to the smallest amount that the customer would approve?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-3891157909616110819?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/3891157909616110819/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/09/politics-of-estimating.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/3891157909616110819'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/3891157909616110819'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/09/politics-of-estimating.html' title='The Politics of Estimating'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-8558282619627554973</id><published>2011-09-27T09:22:00.000-04:00</published><updated>2011-09-27T18:58:32.673-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='design patterns'/><category scheme='http://www.blogger.com/atom/ns#' term='architecture'/><title type='text'>Threads and I/O</title><content type='html'>Threads don't promote concurrent I/O. 
&lt;br /&gt;&lt;br /&gt;Kernel threads may. &amp;nbsp;Most of us write user threads. &amp;nbsp;Here's a great summary under &lt;a href="http://en.wikipedia.org/wiki/Thread_(computer_science)"&gt;Thread (Computer Science)&lt;/a&gt;.&lt;br /&gt;&lt;blockquote&gt;However, the use of blocking system calls in user threads (as opposed to kernel threads) or fibers can be problematic. If a user thread or a fiber performs a system call that blocks, the other user threads and fibers in the process are unable to run until the system call returns. A typical example of this problem is when performing I/O: most programs are written to perform I/O synchronously. When an I/O operation is initiated, a system call is made, and does not return until the I/O operation has been completed. In the intervening period, the entire process is "blocked" by the kernel and cannot run, which starves other user threads and fibers in the same process from executing.&lt;/blockquote&gt;&lt;br /&gt;The point is this.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;If it involves I/O, multi-threading doesn't help. 
&amp;nbsp;Processes do.&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;If it involves computation,&amp;nbsp;multi-threading&amp;nbsp;may help.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-8558282619627554973?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/8558282619627554973/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/09/threads-and-io.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/8558282619627554973'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/8558282619627554973'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/09/threads-and-io.html' title='Threads and I/O'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-9053119301349435720</id><published>2011-09-22T08:00:00.000-04:00</published><updated>2011-09-22T08:00:07.996-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='unit testing'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>"Strict" Unit Testing -- Everything In Isolation Is Too Much Work</title><content type='html'>Folks like to claim that unit testing absolutely requires each class be tested in isolation using mocks for all dependencies. 
&amp;nbsp;This is a noble aspiration, but doesn't work out perfectly well in Python.&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;First, "unit" is intentionally vague. &amp;nbsp;It could be a class, a function, a module or a package. &amp;nbsp;It's a "unit" of code. &amp;nbsp;Anything could be considered a "unit".&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Second--and more important--the extensive mocking isn't fully appropriate for Python programming. &amp;nbsp;Mocks are very helpful in statically-typed languages where you must be very fussy about assuring that all of the interface definitions are carefully matched up. &amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In Python, duck typing allows a mock to be defined quite trivially. &amp;nbsp;A mock library isn't terribly helpful, since it doesn't reduce the code volume or complexity in any meaningful way.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Dependencies without Injection&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The larger issue with trying to unit test in Python with mock objects is the impact of change.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We have some class with an interface.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;class AppFeature( object ):&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; def app_method( self, anotherObject ):&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;i&gt;etc.&lt;/i&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br 
/&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;class AnotherClass( object ):&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; def another_method( self ):&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;i&gt;etc.&lt;/i&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We've properly used dependency injection to make &lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;AppFeature&lt;/span&gt; depend on an instance of &lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;AnotherClass&lt;/span&gt;. &amp;nbsp;This means that we're &lt;i&gt;supposed&lt;/i&gt; to create a mock of &lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;AnotherClass&lt;/span&gt; to test the &lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;AppFeature&lt;/span&gt;.&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;class MockAnotherClass( object ):&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; def another_method( self ):&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;i&gt;etc.&lt;/i&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In Python, this mock isn't a best practice. &amp;nbsp;It can be helpful. 
&amp;nbsp;But adding a mock can also be confusing and misleading.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Refactoring Scenario&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Consider the situation where we're refactoring and change the interface to &lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;AnotherClass&lt;/span&gt;. &amp;nbsp;We modify &lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;another_method&lt;/span&gt; to take an additional argument, for example.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;How many mocks do we have? &amp;nbsp;How many need to be changed? &amp;nbsp;What happens when we miss one of the mocks and have the mysterious Isolated Test Failure? &amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;While we can use a naming convention and grep to locate the mocks, this can (and does) get murky when we've got a mock that replaces a complex cluster of objects with a simple &lt;b&gt;Facade&lt;/b&gt; for testing purposes. &amp;nbsp;Now, we've got a mock that doesn't trivially replace the mocked class.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Alternative: Less Strict Mocking&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In Python--and other duck typing languages--a less mock-heavy approach seems more productive. &amp;nbsp;The goal of testing &lt;b&gt;every&lt;/b&gt; class in isolation surrounded by mocks needs to be relaxed. &amp;nbsp;A more helpful approach is to work up through the layers.&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;Test the "low-level" classes--those with few or no dependencies--in isolation. &amp;nbsp;This is easy because they're already isolated by design.&lt;/li&gt;&lt;li&gt;The classes which depend on these low-level classes can simply use the low-level classes without shame or embarrassment. 
&amp;nbsp;The low-level classes work. &amp;nbsp;Higher-level classes can depend on them. &amp;nbsp;It's okay.&lt;/li&gt;&lt;li&gt;In some cases, mocks are required for particularly complex or difficult classes. &amp;nbsp;Nothing is wrong with mocks. &amp;nbsp;But fussy overuse of mocks does create additional work.&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;The benefits of this are:&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;The layered architecture is tested the way it's actually used. &amp;nbsp;The low-level classes are tested in isolation as well as being tested in conjunction with the classes that depend on them.&lt;/li&gt;&lt;li&gt;It's easier to refactor. &amp;nbsp;The design changes aren't propagated into mocks.&lt;/li&gt;&lt;li&gt;Layer boundaries can be more strictly enforced. &amp;nbsp;Circularities are exposed in a more useful way through the dependencies and layered testing.&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;We still need to work out proper dependency injection. &amp;nbsp;If we try to mock every dependency, we are forced to confront every dependency in glorious detail. 
&amp;nbsp;If we don't mock every single dependency, we can slide by without properly isolating our design.&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-9053119301349435720?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/9053119301349435720/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/09/strict-unit-testing-everything-in.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/9053119301349435720'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/9053119301349435720'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/09/strict-unit-testing-everything-in.html' title='&quot;Strict&quot; Unit Testing -- Everything In Isolation Is Too Much Work'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-7654921610024458412</id><published>2011-09-13T08:00:00.000-04:00</published><updated>2011-09-13T08:06:44.320-04:00</updated><title type='text'>BOSSIE Awards</title><content type='html'>&lt;a href="http://web2py.com/"&gt;web2py&lt;/a&gt; wins an &lt;a href="http://www.infoworld.com/d/open-source-software/bossie-awards-2011-the-best-open-source-application-development-software-171759-0&amp;amp;current=10&amp;amp;last=11#slideshowTop"&gt;award &lt;/a&gt;-- 
cool.&lt;br /&gt;&lt;br /&gt;I really like &lt;a href="https://www.djangoproject.com/"&gt;Django&lt;/a&gt;.&amp;nbsp; But... &lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-7654921610024458412?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/7654921610024458412/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/09/bossie-awards.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/7654921610024458412'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/7654921610024458412'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/09/bossie-awards.html' title='BOSSIE Awards'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-2303791616258570172</id><published>2011-09-08T05:00:00.000-04:00</published><updated>2011-09-08T05:00:08.929-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='data conversion'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='#SODevDays'/><category scheme='http://www.blogger.com/atom/ns#' term='database administration'/><title type='text'></title><content type='html'>&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;a 
href="http://dl.dropbox.com/u/16180370/speaker-button.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://dl.dropbox.com/u/16180370/speaker-button.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;I was going to be talking about Schema Migration, tacit knowledge, and -- of course -- Python.&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;The hard part would have been avoiding a LONG rant on how devilishly hard the problem really is.&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;Apparently, however, DevDays is cancelled.&amp;nbsp; Sigh.&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-2303791616258570172?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/2303791616258570172/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/09/i-was-going-to-be-talking-about-schema.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/2303791616258570172'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/2303791616258570172'/><link rel='alternate' type='text/html' 
href='http://slott-softwarearchitect.blogspot.com/2011/09/i-was-going-to-be-talking-about-schema.html' title=''/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-3321374786658388187</id><published>2011-09-01T08:00:00.052-04:00</published><updated>2011-09-01T08:00:08.445-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='star-schema'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='data warehouse'/><title type='text'>Data Warehousing and SQL -- Tread Carefully</title><content type='html'>&lt;br /&gt;"Are you implying that a scalable Data Warehouse solution could be implemented using Python and serialised files?"&lt;br /&gt;&lt;br /&gt;Not "implying". &amp;nbsp;I'm trying to state it as clearly as I can.&lt;br /&gt;&lt;br /&gt;A scalable data warehouse solution involves a lot of flat-file processing.&lt;br /&gt;&lt;br /&gt;ETL, for example, is mostly a flat-file pipeline. &amp;nbsp;It starts with a source application extract (to create a flat file) and proceeds through a number of transformation steps to filter, cleanse, recode, conform dimensions, and eventually relate facts to dimensions. &amp;nbsp;This is generally very, very fast when done with simple flat files and considerably slower when done with a database.&lt;br /&gt;&lt;br /&gt;This is the "Data Warehouse Bus" that Kimball describes in chapter 9 of &lt;i&gt;The Data Warehouse Lifecycle Toolkit&lt;/i&gt;. &lt;br /&gt;&lt;br /&gt;Ultimately, the cleansed, conformed files will lie around in a "staging area" forever. 
&amp;nbsp;When a datamart is built, then a subset of these files can be (rapidly) loaded into an RDBMS for query processing.&lt;br /&gt;&lt;br /&gt;&lt;div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;Doing this in Python is no different from doing it in Java, C++ or (for that matter) &lt;a href="http://www.syncsort.com/"&gt;Syncsort&lt;/a&gt;. &amp;nbsp;Yes. &amp;nbsp;You can build a data warehouse using processing steps written around Syncsort and be quite successful.&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;The important part of this is to recognize the following.&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;When trying to do data warehouse flat-file processing in C++ (or Java) you have the ongoing schema maintenance issue. &amp;nbsp;The source data changes. &amp;nbsp;You must tweak the schema mapping from source to warehouse. &amp;nbsp;You can encode this schema mapping as property files or some such, or you can simply use an interpreted language like Python and encode the mappings as Python code.&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;The "Data Warehouse Bus" is a lot of applications that are trivially written as simple, parallel, multi-processing, small, read-match-write programs. &amp;nbsp;Forget threads. &amp;nbsp;Simply use heavy-weight, OS-level processes so that you can maximize the I/O bandwidth. 
&amp;nbsp;(Remember: &lt;b&gt;when one thread makes an I/O request, the entire process waits&lt;/b&gt;; an I/O-bound application isn't helped by multi-threading.)&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; with open('some_data','rb') as source:&lt;/span&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; rdr= csv.DictReader( source )&lt;/span&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; wtr= csv.DictWriter( sys.stdout, some_schema )&lt;/span&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; for row in rdr:&lt;/span&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; if exclude( row ): continue&lt;/span&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; clean = cleanse( row 
)&lt;/span&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; wtr.writerow( clean )&lt;/span&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;This example writes to stdout so that it can be connected in a pipeline with other steps in the processing. &amp;nbsp;Programs running in an OS pipeline run concurrently. &amp;nbsp;They tie up all the cores available without any real programming effort other than decomposing the problem into discrete parallel steps that apply to each row being touched.&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;Simple file processing is much, much faster than SQL processing. &amp;nbsp;Why? &amp;nbsp;No overheads for locking or buffer pooling or rollback segments, or logging, or after-image journaling or deadlock detection, etc.&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;Note that a data warehouse database has no need for sophisticated locking. &amp;nbsp;All of the "updates" are bulk loads. &amp;nbsp;80% of the activity is "insert". 
&amp;nbsp;With some Slowly Changing Dimension (SCD) operations there is a trivial status-change update, but this can be handled with a single database-wide lock during insert.&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;The primary reason for using SQL is to handle "SELECT something ... GROUP BY" queries. &amp;nbsp;SQL does this reasonably well most of the time. &amp;nbsp;Python does it pretty well, also.&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; sum_col1 = defaultdict( float )&lt;/span&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; count_group = defaultdict( int )&lt;/span&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; with connection.cursor() as c:&lt;/span&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; c.execute( "SELECT COL1, GROUP FROM..." 
)&lt;/span&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; for row in c.fetchall():&lt;/span&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; sum_col1[row.group] += row.col1&lt;/span&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; count_group[row.group] += 1&lt;/span&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; print( sum_col1, count_group )&lt;/span&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; 
margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;That's clearly wordier than SQL. &amp;nbsp;But not &lt;i&gt;much&lt;/i&gt; wordier. &amp;nbsp;The SELECT statement embedded in the Python is simpler because it omits the GROUP BY clause. &amp;nbsp;Since it's simpler, it's more likely to benefit from being reused in the RDBMS.&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;The Python may actually run &lt;i&gt;faster&lt;/i&gt;&amp;nbsp;than a pure SQL query because it avoids the (potentially expensive) RDBMS sort step. &amp;nbsp;The Python defaultdict (or Java HashMap) is how we avoid sorting. &amp;nbsp;If we need to present the keys in some kind of user-friendly order, we have limited the sort to just the distinct key values, not the entire join result.&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;Because of the huge cost of GROUP BY, there are two hack-arounds. &amp;nbsp;One is "materialized views". &amp;nbsp;The idea is that a group-by view is updated when the base tables are updated to avoid the painful cost of sorting at query time. &amp;nbsp;The other is reporting tools that are "aggregate aware". &amp;nbsp;They can leverage the materialized view to avoid the sort.&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;How about we avoid all the conceptual overhead of materialized views and aggregate-aware reporting? 
Instead we can write simple Python procedures that do the processing we want.&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;b&gt;Bottom Line&lt;/b&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;Data Warehouse does not imply SQL. &amp;nbsp;Indeed, it doesn't even suggest SQL except for datamart processing of flexible ad-hoc queries where there's enough horsepower to endure all the sorting.&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-3321374786658388187?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/3321374786658388187/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/09/data-warehousing-and-sql-tread.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/3321374786658388187'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/3321374786658388187'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/09/data-warehousing-and-sql-tread.html' title='Data Warehousing and SQL -- Tread Carefully'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-454955541011989157</id><published>2011-08-04T08:00:00.000-04:00</published><updated>2011-08-04T08:00:11.267-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='csv'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>Brain-Damaged Data</title><content type='html'>We process a fair amount of externally-prepared datasets.&amp;nbsp; 40,000 rows of econometric data that we purchased from a third party.&amp;nbsp; Mostly, the data is in a usable format: .CSV or .XLSX. &lt;br /&gt;&lt;br /&gt;Once in a while, we get CSV with | (pipe).&amp;nbsp; A few times, we got fixed-format COBOL-style records.&lt;br /&gt;&lt;br /&gt;Recently, we got a CSV-with-pipe that included 2 records with embedded &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;\n&lt;/span&gt; sequences in the middle of a CSV row of data.&amp;nbsp; Really.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Painful Elimination&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;There are two ways to "eliminate" this problem.&amp;nbsp; &lt;br /&gt;&lt;ul&gt;&lt;li&gt;Subclass our input processing to handle this special CSV-with-pipe case.&lt;/li&gt;&lt;li&gt;Actually read and parse the source file creating a clean intermediate file that we can simply process with an existing CSV-with-pipe configuration.&lt;/li&gt;&lt;/ul&gt;I elected to do the first.&amp;nbsp; The second is (to my mind) an auditing nightmare because we would have touched the file.&amp;nbsp; We would have to prove that we didn't disturb any other fields.&amp;nbsp; While not impossible, it becomes a very strange special case for this one-and-only file.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;CSV Simplicity&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The CSV module's epic simplicity makes it easy to work around this kind of goofy 
data.&amp;nbsp; Our subclass for this case had the following extra foolishness put in:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;code&gt;    def make_reader( self ):&lt;br /&gt;        def filter_damage( aFile ):&lt;br /&gt;            file_iter= iter(aFile)&lt;br /&gt;            for row in file_iter:&lt;br /&gt;                if row.rfind('"') &amp;gt;= len(row)-3:&lt;br /&gt;                    logger.error( "Damaged Line: %r", row )&lt;br /&gt;                    rest= next(file_iter)&lt;br /&gt;                    line= row[:row.rfind('"')] + rest[3:]&lt;br /&gt;                    logger.warning( "Repaired Line: %r", line )&lt;br /&gt;                    yield line&lt;br /&gt;                else:&lt;br /&gt;                    yield row&lt;br /&gt;        tweaked_file= filter_damage( self.sourceFile )&lt;br /&gt;        return csv.reader( tweaked_file, delimiter='|', doublequote=False, escapechar='"' )&lt;br /&gt;&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;That's it.&amp;nbsp; Since the Python CSV reader merely wants an iterator over lines, we can (with a simple generator function) provide the necessary "iterator-over-lines".&amp;nbsp;&lt;br /&gt;&lt;br /&gt;Delightful.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Apology&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The murky-looking &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;row.rfind('"') &amp;gt;= len(row)-3&lt;/span&gt; condition is one of those consequences of trying to find just a few irregular line endings in an otherwise regular file.&amp;nbsp; For CSV processing, files often have to be opened in &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;"rb"&lt;/span&gt; mode because they originate from (or will be used with) MS-Excel.&amp;nbsp; This makes the damaged line-ending either &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;'"\n'&lt;/span&gt; or maybe &lt;span 
style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;'"\r\n'&lt;/span&gt;.&amp;nbsp; Rather than spend too much time negotiating with Python's universal newline and &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;"rb"&lt;/span&gt; mode, it's slightly easier to look for a &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;'"'&lt;/span&gt; near the end.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;We're hoping this is a one-time-only subclass that we can safely ignore in the future.&amp;nbsp; If hope is dashed, it's a distinct subclass, so it's easily reused and didn't break anything else.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-454955541011989157?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/454955541011989157/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/08/brain-damaged-data.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/454955541011989157'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/454955541011989157'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/08/brain-damaged-data.html' title='Brain-Damaged Data'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-7728224745223378524</id><published>2011-07-26T08:00:00.005-04:00</published><updated>2011-07-26T08:00:03.579-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='template'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>One of Those Things</title><content type='html'>Check out this question on Stack Overflow: "&lt;a href="http://stackoverflow.com/questions/6789230/python-replace-a-string-by-a-float-in-txt-file/6789735#6789735"&gt;Python: replace a string by a float in txt file&lt;/a&gt;".&lt;br /&gt;&lt;br /&gt;The question is confusing, but it appears to be a longish and confused description of simple formatting or template substitution. &amp;nbsp;It's hard to be sure, but it sounds like one of Those Things™ (TT).&lt;br /&gt;&lt;br /&gt;Most of Those Things (TT) are standard problems with standard solutions. &amp;nbsp;Until you've seen a lot of TTs, it seems like your problem is unique and special. &amp;nbsp;It's hard to see TTs for what they are.&lt;br /&gt;&lt;br /&gt;In this case, the problem appears to be solved by Python's &lt;a href="http://docs.python.org/library/string.html#template-strings"&gt;string.Template&lt;/a&gt; class with minor modifications. 
&amp;nbsp;The documentation for customizing string.Template isn't clear, so here's an example.&lt;br /&gt;&lt;br /&gt;&lt;tt&gt;&lt;br /&gt;from string import Template&lt;br /&gt;class MyTemplate( Template ): &lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;delimiter= '@'&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;pattern= r"@(?P&amp;lt;escaped&amp;gt;@)|@(?P&amp;lt;named&amp;gt;[_a-z][_a-z0-9]*)@|@(?P&amp;lt;braced&amp;gt;[_a-z][_a-z0-9]*)@|@(?P&amp;lt;invalid&amp;gt;)"&lt;br /&gt;&lt;/tt&gt;&lt;br /&gt;&lt;br /&gt;That appears to be the standard solution to the standard problem. &amp;nbsp;Define a new delimiter ('@') and some slightly different delimiter parsing rules and away you go.&lt;br /&gt;&lt;br /&gt;This can be used as follows to replace any '@x@' variables in any template file. &amp;nbsp;What's important is that very little actual code is needed, since it's one of Those Things that's already been solved.&lt;br /&gt;&lt;br /&gt;&lt;code&gt;with open( 'a.txt', 'r' ) as source:&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;t = MyTemplate(source.read())&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;result= t.substitute( x=15 )&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print( result )&lt;br /&gt;&lt;/code&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-7728224745223378524?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/7728224745223378524/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/07/one-of-those-things.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/7728224745223378524'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/684183198890094283/posts/default/7728224745223378524'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/07/one-of-those-things.html' title='One of Those Things'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-6394018449446782164</id><published>2011-07-21T08:00:00.001-04:00</published><updated>2011-07-21T11:35:23.367-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='use case'/><category scheme='http://www.blogger.com/atom/ns#' term='spam'/><title type='text'>Spam Email Footers</title><content type='html'>I don't want the spamilicious email.&amp;nbsp; I'm trying to actually unsubscribe.&lt;br /&gt;&lt;br /&gt;The footer says "If you are not the intended recipient, you are hereby notified that any  dissemination, distribution or copying of any information contained in or  attached to this communication is strictly prohibited. If you have received this  message in error, please notify the sender immediately and delete the material  from any computer."&lt;br /&gt;&lt;br /&gt;I don't feel like the intended recipient because it's just irrelevant junk.&amp;nbsp; Perhaps you should not have disseminated, distributed, copied or sent me this.&amp;nbsp; Wouldn't that have been simpler? Keep it to yourself?&lt;br /&gt;&lt;br /&gt;I also think I've received the message in error.&amp;nbsp; Since I don't want the damn thing. 
And that means that I have to delete it?&amp;nbsp; Why can't you stop sending it?&amp;nbsp; Wouldn't that be simpler for both of us?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-6394018449446782164?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/6394018449446782164/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/07/spam-email-footers.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6394018449446782164'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6394018449446782164'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/07/spam-email-footers.html' title='Spam Email Footers'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-3880489255756903638</id><published>2011-07-18T06:06:00.001-04:00</published><updated>2011-07-18T06:06:33.810-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='meetup'/><title type='text'>757 Python User's Group Meetup</title><content type='html'>Wednesday night.&amp;nbsp; At 757 Labs.&amp;nbsp; Be there.&lt;br /&gt;&lt;br /&gt;Here's the details on &lt;a href="http://www.meetup.com/757-Python-Users-Group/events/22825011/"&gt;meetup.com&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Lacking any other agenda, I'll do some more 
presentation on the supreme coolness of Django.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-3880489255756903638?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/3880489255756903638/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/07/757-python-users-group-meetup.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/3880489255756903638'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/3880489255756903638'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/07/757-python-users-group-meetup.html' title='757 Python User&apos;s Group Meetup'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-6232439180561605041</id><published>2011-07-12T08:00:00.004-04:00</published><updated>2011-07-12T08:00:08.187-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='building skills books'/><title type='text'>I almost wet myself</title><content type='html'>Someone sent me this: "&lt;a href="http://mygisblog.wordpress.com/2010/05/03/building-skills-in-python-steven-f-lott/"&gt;“Building Skills in Python” – Steven F. 
Lott&lt;/a&gt;".&lt;br /&gt;&lt;br /&gt;I had a vague idea that this book would get some traction. &amp;nbsp;This response was surprising. &amp;nbsp;I guess I should get to work on the upgrades. &amp;nbsp;And focus on the "no-nonsense" comment.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-6232439180561605041?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/6232439180561605041/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/07/i-almost-wet-myself.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6232439180561605041'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6232439180561605041'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/07/i-almost-wet-myself.html' title='I almost wet myself'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-669529689726934704</id><published>2011-07-07T08:00:00.003-04:00</published><updated>2011-07-07T17:20:44.858-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='security'/><title type='text'>Security Vulnerabilities</title><content type='html'>Just saw this for the first time today: &amp;nbsp;&lt;a href="http://cwe.mitre.org/top25/"&gt;http://cwe.mitre.org/top25/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I'd always 
relied on this:&amp;nbsp;&lt;a href="https://www.owasp.org/index.php/Category:OWASP_Top_Ten_Project"&gt;https://www.owasp.org/index.php/Category:OWASP_Top_Ten_Project&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Both are really good lists of security vulnerabilities.&lt;br /&gt;&lt;br /&gt;I once had to listen to a DBA tell me that "we don't know what we don't know" as a way of saying that there was no way to be sure that a web app was "secure". &amp;nbsp;That comment led the project manager to go through the classic "risk exposure" exercise (and hours of discussion) to determine that security mattered. &amp;nbsp;We defined the risks, the costs and the probability of occurrence so that we could document all kinds of potential exposures or something.&lt;br /&gt;&lt;br /&gt;Instead of hand-wringing, these kinds of simple lists of the common vulnerabilities provide actionable steps for design, code, test and audit of operations. &amp;nbsp;Further, they guide selection, configuration and operation of web server technology to assure that the vulnerabilities are addressed.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-669529689726934704?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/669529689726934704/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/07/security-vulnerabilities.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/669529689726934704'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/669529689726934704'/><link rel='alternate' type='text/html' 
href='http://slott-softwarearchitect.blogspot.com/2011/07/security-vulnerabilities.html' title='Security Vulnerabilities'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-6499496791358250144</id><published>2011-06-30T08:00:00.001-04:00</published><updated>2011-06-30T08:00:01.141-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='use case'/><title type='text'>Implementing the Unsubscribe User Story</title><content type='html'>I've been unsubscribing from some junk email recently.&lt;br /&gt;&lt;br /&gt;The user story is simple: As a not-very-interested person, I want to get off your dumb-ass mailing list so that I don't have to flag your crap as spam any more. &lt;br /&gt;&lt;br /&gt;The implementations vary from good to evil. &amp;nbsp;Here's what I've found.&lt;br /&gt;&lt;br /&gt;The best sites have an unsubscribe link that simply presents the facts -- you are unsubscribed. &amp;nbsp;I almost feel like re-subscribing to a site that handles this use case so well.&lt;br /&gt;&lt;br /&gt;The first level of crap is a site which forces me to click an OK or Unsubscribe button to confirm that I really want to unsubscribe and wasn't clicking the tiny little links at the end of the message randomly.&lt;br /&gt;&lt;br /&gt;The deeper level of "marketing" crap is a form that allows me to "configure my subscription settings". &amp;nbsp;This is done by some marketing genius who wanted to "offer additional value" rather than simply do what I asked. &amp;nbsp;This is a hateful (but not yet evil) practice. &amp;nbsp;I don't want to "configure" my settings. 
&amp;nbsp;I want out.&lt;br /&gt;&lt;br /&gt;The third-from-worst is a form in which I must enter my email address. &amp;nbsp;What? &amp;nbsp;I have several email aliases that redirect to a common mailbox. &amp;nbsp;I have to -- what? -- guess which of the aliases was used? &amp;nbsp;This is pernicious because I can make a spelling mistake and they can continue to send me dunning email. &amp;nbsp;This fill-in-the-blanks unsubscribe is simply evil because it gives them plausible deniability when they continue to send me email. &amp;nbsp;It's now my fault that I didn't spell my name correctly.&lt;br /&gt;&lt;br /&gt;The next-to-worst is a "mailto:" link that jumps into my emailer. &amp;nbsp;I have to -- what? -- fill in the magic word "Complete" somewhere? &amp;nbsp;You're kidding, right? &amp;nbsp;This is so 1980's-vintage listserv that I'm hoping these companies can be sued because they failed to actually unsubscribe folks. &amp;nbsp;Again, this gives the spammer a legitimate excuse because I failed to do the arcane step properly.&lt;br /&gt;&lt;br /&gt;The worst is no link at all. &amp;nbsp;Just instructions explaining that an email must be sent with the magic word "Complete" or "Unsubscribe" in the subject or body. &amp;nbsp;Because I use aliases, this will probably not unsubscribe anything useful, but will only unsubscribe my outbound email address. &amp;nbsp;This is the worst kind of evil. &amp;nbsp;In a way, it meets the user story. 
&amp;nbsp;But only in a very, very oblique way.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-6499496791358250144?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/6499496791358250144/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/06/implementing-unsubscribe-user-story.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6499496791358250144'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6499496791358250144'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/06/implementing-unsubscribe-user-story.html' title='Implementing the Unsubscribe User Story'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-7522286999357785770</id><published>2011-06-27T18:05:00.001-04:00</published><updated>2011-06-28T08:39:24.568-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>Simplicity vs. Depth</title><content type='html'>During &amp;nbsp;chapter technical reviews, the question of technical depth has come up time and again. 
&amp;nbsp;Essentially, in every single chapter.&lt;br /&gt;&lt;br /&gt;In the older &lt;a href="http://homepage.mac.com/s_lott/books/python/html/index.html"&gt;Building Skills in Python&lt;/a&gt; book, there are a number of topics that feel "digressive" to the reviewer and editor. &amp;nbsp;Too much depth.&lt;br /&gt;&lt;br /&gt;However, there are a number of Python tutorials, many of which are very shallow. &amp;nbsp;I'd like to find a way to retain the technical depth, without it feeling "digressive". &lt;br /&gt;&lt;br /&gt;Choice 1. &amp;nbsp;Split each chapter into different "basic" and "advanced" sections. &amp;nbsp;This would retain a sensible outline of parts (Language Fundamentals, Data Structures, Classes, Modules and a bunch of advanced projects) and chapters within each part. &amp;nbsp;Some chapters would still have to be split because a number of "advanced" concepts (e.g. alternative function argument passing with &lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;*&lt;/span&gt; and &lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;**&lt;/span&gt;) really have to be delayed until after an appropriate data structure chapter.&lt;br /&gt;&lt;br /&gt;Choice 2. &amp;nbsp;Separate material into two kinds of chapters: "basic" and "pro". &amp;nbsp;This would lead to a "basics" thread for n00bz (read all the "basics" chapters) and a "pro" thread for professionals where you'd just read all the chapters in order without skipping. 
&amp;nbsp; &amp;nbsp;This would create some more chapters, but each chapter would be shorter and more focused.&lt;br /&gt;&lt;br /&gt;It's&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-7522286999357785770?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/7522286999357785770/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/06/simplicity-vs-depth.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/7522286999357785770'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/7522286999357785770'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/06/simplicity-vs-depth.html' title='Simplicity vs. 
Depth'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-4359123406046929917</id><published>2011-06-21T08:00:00.003-04:00</published><updated>2011-06-21T08:00:01.633-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='hackerspace'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>Hackerspace</title><content type='html'>Just started learning about "&lt;a href="http://en.wikipedia.org/wiki/Hackerspace"&gt;Hackerspace&lt;/a&gt;".&lt;br /&gt;&lt;br /&gt;Without really knowing what I was doing, I fell into the &lt;a href="http://757labs.org/"&gt;757 Labs&lt;/a&gt; Hackerspace.&lt;br /&gt;&lt;br /&gt;The &lt;a href="http://www.meetup.com/757-Python-Users-Group/"&gt;757 Python Users' Group&lt;/a&gt;, specifically.&lt;br /&gt;&lt;br /&gt;What a great idea. &amp;nbsp;Bright people. &amp;nbsp;Interested in the same area of technology. 
&lt;br /&gt;&lt;br /&gt;It's like hanging around with sailors at a marina.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-4359123406046929917?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/4359123406046929917/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/06/hackerspace.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/4359123406046929917'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/4359123406046929917'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/06/hackerspace.html' title='Hackerspace'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-33940033002357465</id><published>2011-06-09T08:00:00.001-04:00</published><updated>2011-06-09T08:00:23.504-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='innovation'/><title type='text'>An Object-Lesson in How to Stifle Innovation</title><content type='html'>Read this:&amp;nbsp;&lt;a href="http://gizmodo.com/5691604/how-ma-bell-shelved-the-future-for-60-years"&gt;How Ma Bell Shelved the Future for 60 Years&lt;/a&gt;.&lt;br /&gt;&lt;blockquote&gt;AT&amp;amp;T firmly believed that the answering machine, and its magnetic tapes, would lead the public&amp;nbsp;to abandon the 
telephone.&lt;/blockquote&gt;How many good ideas are set aside by managers who simply don't have a clue what users actually want?&lt;br /&gt;&lt;br /&gt;How many great IT projects are rejected because of this kind of delusional paranoia?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-33940033002357465?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/33940033002357465/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/06/object-lesson-in-how-to-stifle.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/33940033002357465'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/33940033002357465'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/06/object-lesson-in-how-to-stifle.html' title='An Object-Lesson in How to Stifle Innovation'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-6748578157990797717</id><published>2011-06-07T14:10:00.005-04:00</published><updated>2011-06-07T14:10:00.546-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='multithreaded'/><category scheme='http://www.blogger.com/atom/ns#' term='queue'/><category scheme='http://www.blogger.com/atom/ns#' term='design patterns'/><category scheme='http://www.blogger.com/atom/ns#' 
term='code-kata'/><title type='text'>Multithreading -- Fear, Uncertainty and Doubt</title><content type='html'>Read this: "&lt;a href="http://programmers.stackexchange.com/questions/81003/how-to-explain-why-multi-threading-is-difficult/81008#81008"&gt;How to explain why multi-threading is difficult&lt;/a&gt;".&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We need to talk.   This is not that difficult.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Multi-threading is only difficult if you do it badly.  There are an almost infinite number of ways to do it badly.  Many magazines and bloggers have decided that the multithreading hurdle is the Next Big Thing (NBT™).  We need new, fancy, expensive language and library support for this and we need it right now.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a href="http://en.wikipedia.org/wiki/Parallel_computing"&gt;Parallel Computing&lt;/a&gt; is the secret to following Moore's Law.  All those extra cores will go unused if we can't write multithreaded apps.   And we can't write multi-threaded apps because—well—there are lots of reasons, split between ignorance and arrogance.  All of which can be solved by throwing money after tools.  Right?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Arrogance&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;One thing that makes multi-threaded applications error-prone is simple arrogance.  There are lots and lots of race conditions that can arise.  And folks aren't trained to think about how simple it is to have a sequence of instructions interrupted at just the wrong spot.   Any sequence of "read, work, update" operations will have threads doing reads (in any order), threads doing the work (in any order) and then doing the updates in the worst possible order.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Compound "read, work, update" sequences need locks.  
And the locations of the locks can be obscure because we rarely think twice about reading a variable.  Setting a variable is a little less confusing.  Because we don't think much about reads, we fail to see the consequences of moving the read of a variable around as part of an optimization effort. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Ignorance&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The best kind of lock is not a mutex or a semaphore.  It surely isn't an RDBMS (but God knows, numerous organizations have used an RDBMS as a large, slow, complex and expensive message queue.)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The best kind of lock seems to be a message queue.  The various concurrent elements can simply dequeue pieces of data, do their tasks and enqueue the results.  It's really elegant.  It has many, simple, uncoupled pieces.  It can be scaled by increasing the number of threads sharing a queue.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A queue (read with an official "get") means that the reads aren't casually ignored and moved around during optimization.  Further, the creation of a complex object can be done by one thread which gets pieces of data from a queue shared by multiple writers.  No locking on the complex object.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Using message queues means that there's no weird race condition when getting data to start doing useful work; a get is atomic and &lt;i&gt;guaranteed&lt;/i&gt; to have that property.  Each thread gets a thread-local, thread-safe object.  There's no weird race condition when passing a result on to the next step in a pipeline.  
It's dropped into the queue, where it's available to another thread.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Dining Philosophers&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The &lt;a href="http://en.wikipedia.org/wiki/Dining_philosophers_problem"&gt;Dining Philosophers&lt;/a&gt; Code Kata has a queue-based solution that's pretty cool.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A queue of Forks can be shared by the various Philosopher threads.  Each Philosopher must get two Fork resources from the queue, eat, philosophize and then enqueue the two Forks again.  It's quite short, easy to write and easy to demonstrate that it &lt;i&gt;must&lt;/i&gt; work.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Perhaps the hardest thing is designing the Dining Room (also known as the Waiter, Conductor or Footman) that only allows four of the five philosophers to dine concurrently.  To do this, a departing Philosopher must enqueue themselves into a "done eating" queue so that the next waiting Philosopher can be seated.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A queue-based solution is delightfully simple.  200 or so lines of code including docstring comments so that the documentation looked nice, too.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Additional Constraints&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The simplest solution uses a single queue of anonymous Forks.  A common constraint is to insist that each Philosopher use only the two adjacent forks.  Philosopher &lt;i&gt;p&lt;/i&gt; can use forks (&lt;i&gt;p&lt;/i&gt;+1 mod 5) and  (&lt;i&gt;p-&lt;/i&gt;1 mod 5).  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This is pleasant to implement. The Philosopher simply dequeues a fork, checks the position, and re-enqueues it if it's a wrong fork.   
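&lt;br /&gt;&lt;br /&gt;Here's a minimal sketch of that dequeue, check, re-enqueue step, separate from the Philosopher class. &amp;nbsp;It assumes Python 3 (the &lt;tt&gt;queue&lt;/tt&gt; module). &amp;nbsp;The &lt;tt&gt;Fork&lt;/tt&gt; namedtuple and the &lt;tt&gt;adjacent_positions&lt;/tt&gt; and &lt;tt&gt;acquire_adjacent_fork&lt;/tt&gt; functions are hypothetical names made up for illustration; they are not necessarily how the real code does it.&lt;br /&gt;&lt;pre&gt;
```python
import queue
from collections import namedtuple

# Hypothetical stand-in for a Fork class: only a position is needed here.
Fork = namedtuple("Fork", ["position"])

def adjacent_positions(p, seats=5):
    """Positions of the two forks next to philosopher p."""
    return {(p + 1) % seats, (p - 1) % seats}

def acquire_adjacent_fork(forks, p, seats=5):
    """Dequeue forks until one adjacent to philosopher p turns up.

    A wrong fork goes straight back on the queue.
    """
    wanted = adjacent_positions(p, seats)
    while True:
        fork = forks.get()
        if fork.position in wanted:
            return fork
        forks.put(fork)  # wrong fork: re-enqueue it and try again

# Five forks on the table, positions 0 through 4.
forks = queue.Queue()
for position in range(5):
    forks.put(Fork(position))

# Philosopher 2 may only use forks 1 and 3.
first = acquire_adjacent_fork(forks, 2)
second = acquire_adjacent_fork(forks, 2)
```
&lt;/pre&gt;Philosopher 2 puts back forks 0 and 2 and keeps forks 1 and 3. &amp;nbsp;Note that this sketch spins when neither adjacent fork is in the queue; with real threads, a short sleep after the re-enqueue is one way to be polite about it.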
&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;FUD Factor&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I think that the publicity around parallel programming and multithreaded applications is designed to create Fear, Uncertainty and Doubt (FUD™).&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;Too many questions on StackOverflow seem to indicate that a slow program might magically get faster if somehow threads were involved.  For programs that involve scanning the entire hard drive or downloading Wikipedia or doing a giant SQL query, the number of threads has little relevance to the real work involved.  These programs are I/O bound; since threads must share the I/O resources of the containing process, multi-threading won't help.&lt;/li&gt;&lt;li&gt;Too many questions on StackOverflow seem to have simple message queue solutions.  But folks seem to start out using inappropriate technology.  Just learn how to use a message queue.  Move on.&lt;/li&gt;&lt;li&gt;Too many vendors of tools (or languages) are pandering to (or creating) the FUD factor.  If programmers are made suitably fearful, uncertain or doubtful, they'll lobby for spending lots of money for a language or package that "solves" the problem.&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;Sigh.  The answer isn't software tools, it's design.  Break the problem down into independent parallel tasks and feed them from message queues.  Collect the results in message queues.&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;b&gt;Some Code&lt;/b&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;code&gt;&lt;pre&gt;class Philosopher( threading.Thread ):&lt;br /&gt;    """A Philosopher.  
When invited to dine, they will&lt;br /&gt;    cycle through their standard dining loop.&lt;br /&gt;    &lt;br /&gt;    -   Acquire two forks from the fork Queue&lt;br /&gt;    -   Eat for a random interval&lt;br /&gt;    -   Release the two forks&lt;br /&gt;    -   Philosophize for a random interval&lt;br /&gt;    &lt;br /&gt;    When done, they will enqueue themselves with&lt;br /&gt;    the "footman" to indicate that they are leaving.&lt;br /&gt;    """&lt;br /&gt;    def __init__( self, name, cycles=None ):&lt;br /&gt;        """Create this philosopher.&lt;br /&gt;        &lt;br /&gt;        :param name: the number of this philosopher.  &lt;br /&gt;            This is used by a subclass to find the correct fork.&lt;br /&gt;        :param cycles: the number of cycles they will eat.&lt;br /&gt;            If unspecified, it's a random number, u, 4 &amp;lt;= u &amp;lt; 7&lt;br /&gt;        """&lt;br /&gt;        super( Philosopher, self ).__init__()&lt;br /&gt;        self.name= name&lt;br /&gt;        self.cycles= cycles if cycles is not None else random.randrange(4,7)&lt;br /&gt;        self.log= logging.getLogger( "{0}.{1}".format(self.__class__.__name__, name) )&lt;br /&gt;        self.log.info( "cycles={0:d}".format( self.cycles ) )&lt;br /&gt;        self.forks= None&lt;br /&gt;        self.leaving= None&lt;br /&gt;    def enter( self, forks, leaving ):&lt;br /&gt;        """Enter the dining room.  
This must be done before the &lt;br /&gt;        thread can be started.&lt;br /&gt;        &lt;br /&gt;        :param forks: The queue of available forks&lt;br /&gt;        :param leaving: A queue to notify the footman that they are&lt;br /&gt;            done.&lt;br /&gt;        """&lt;br /&gt;        self.forks= forks&lt;br /&gt;        self.leaving= leaving&lt;br /&gt;    def dine( self ):&lt;br /&gt;        """The standard dining cycle: &lt;br /&gt;        acquire forks, eat, release forks, philosophize.&lt;br /&gt;        """&lt;br /&gt;        for cycle in range(self.cycles):&lt;br /&gt;            f1= self.acquire_fork()&lt;br /&gt;            f2= self.acquire_fork()&lt;br /&gt;            self.eat()&lt;br /&gt;            self.release_fork( f1 )&lt;br /&gt;            self.release_fork( f2 )&lt;br /&gt;            self.philosophize()&lt;br /&gt;        self.leaving.put( self )&lt;br /&gt;    def eat( self ):&lt;br /&gt;        """Eating task."""&lt;br /&gt;        self.log.info( "Eating" )&lt;br /&gt;        time.sleep( random.random() )&lt;br /&gt;    def philosophize( self ):&lt;br /&gt;        """Philosophizing task."""&lt;br /&gt;        self.log.info( "Philosophizing" )&lt;br /&gt;        time.sleep( random.random() )&lt;br /&gt;    def acquire_fork( self ):&lt;br /&gt;        """Acquire a fork.&lt;br /&gt;        &lt;br /&gt;        :returns: The Fork acquired.&lt;br /&gt;        """&lt;br /&gt;        fork= self.forks.get()&lt;br /&gt;        fork.held_by= self.name&lt;br /&gt;        return fork&lt;br /&gt;    def release_fork( self, fork ):&lt;br /&gt;        """Release a fork.&lt;br /&gt;        &lt;br /&gt;        :param fork: The Fork to release.&lt;br /&gt;        """&lt;br /&gt;        fork.held_by= None&lt;br /&gt;        self.forks.put( fork )&lt;br /&gt;    def run( self ):&lt;br /&gt;        """Interface to Thread.  
After the Philosopher&lt;br /&gt;        has entered the dining room, they may engage&lt;br /&gt;        in the main dining cycle.&lt;br /&gt;        """&lt;br /&gt;        assert self.forks and self.leaving&lt;br /&gt;        self.dine()&lt;/pre&gt;&lt;/code&gt; &lt;/div&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;The point is to have the &lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;dine&lt;/span&gt; method be a direct expression of the Philosopher's dining experience. &amp;nbsp;We might want to override the&amp;nbsp;&lt;span class="Apple-style-span" style="font-family: monospace; white-space: pre;"&gt;acquire_fork &lt;/span&gt;&lt;span class="Apple-style-span" style="white-space: pre;"&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;method to permit different fork acquisition strategies. &lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="white-space: pre;"&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="white-space: pre;"&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;For example, a picky philosopher may only want to use the forks adjacent to their place at the table, rather than reaching across the table for the next available Fork.&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The Fork, by comparison, is boring.&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;class Fork( object ):&lt;br /&gt;    """A Fork.  A Philosopher requires two of these to eat."""&lt;br /&gt;    def __init__( self, name ):&lt;br /&gt;        """Create the Fork.&lt;br /&gt;        &lt;br /&gt;        :param name: The number of this fork.  
This may &lt;br /&gt;            be used by a Philosopher looking for the correct Fork.&lt;br /&gt;        """&lt;br /&gt;        self.name= name&lt;br /&gt;        self.holder= None&lt;br /&gt;        self.log= logging.getLogger( "{0}.{1}".format(self.__class__.__name__, name) )&lt;br /&gt;    @property&lt;br /&gt;    def held_by( self ):&lt;br /&gt;        """The Philosopher currently holding this Fork."""&lt;br /&gt;        return self.holder&lt;br /&gt;    @held_by.setter&lt;br /&gt;    def held_by( self, philosopher ):&lt;br /&gt;        if philosopher:&lt;br /&gt;            self.log.info( "Acquired by {0}".format( philosopher ) )&lt;br /&gt;        else:&lt;br /&gt;            self.log.info( "Released by {0}".format( self.holder ) )&lt;br /&gt;        self.holder= philosopher&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;The Table, however, is interesting. &amp;nbsp;It includes the special "leaving" queue that's not a proper part of the problem domain, but is a part of this particular solution.&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;class Table( object ):&lt;br /&gt;    """The dining Table.  
This uses a queue of Philosophers&lt;br /&gt;    waiting to dine and a queue of forks.&lt;br /&gt;    &lt;br /&gt;    This seats Philosophers, allows them to dine and then&lt;br /&gt;    cleans up after each one is finished dining.&lt;br /&gt;    &lt;br /&gt;    To prevent deadlock, there's a limit on the number&lt;br /&gt;    of concurrent Philosophers allowed to dine.&lt;br /&gt;    """&lt;br /&gt;    def __init__( self, philosophers, forks, limit=4 ):&lt;br /&gt;        """Create the Table.&lt;br /&gt;        :param philosophers: The queue of Philosophers waiting to dine.&lt;br /&gt;        :param forks: The queue of available Forks.&lt;br /&gt;        :param limit: A limit on the number of concurrently dining Philosophers.&lt;br /&gt;        """&lt;br /&gt;        self.philosophers= philosophers&lt;br /&gt;        self.forks= forks&lt;br /&gt;        self.limit= limit&lt;br /&gt;        self.leaving= Queue.Queue()&lt;br /&gt;        self.log= logging.getLogger( "table" )&lt;br /&gt;    def dinner( self ):&lt;br /&gt;        """The essential dinner cycle:&lt;br /&gt;        admit philosophers (to the stated limit);&lt;br /&gt;        as philosophers finish dining, remove them and admit more;&lt;br /&gt;        when the dining queue is empty, simply clean up.&lt;br /&gt;        """&lt;br /&gt;        self.at_table= self.limit&lt;br /&gt;        while not self.philosophers.empty():&lt;br /&gt;            # Stop seating when the waiting queue runs dry, so that fewer&lt;br /&gt;            # Philosophers than seats does not block on get() forever.&lt;br /&gt;            while self.at_table != 0 and not self.philosophers.empty():&lt;br /&gt;                p= self.philosophers.get()&lt;br /&gt;                self.seat( p )&lt;br /&gt;            # Must do a Queue.get() to wait for a resource&lt;br /&gt;            p= self.leaving.get()&lt;br /&gt;            self.excuse( p )&lt;br /&gt;        assert self.philosophers.empty()&lt;br /&gt;        while self.at_table != self.limit:&lt;br /&gt;            p= self.leaving.get()&lt;br /&gt;            self.excuse( p )&lt;br /&gt;        assert self.at_table == self.limit&lt;br /&gt;    def seat( self, philosopher 
):&lt;br /&gt;        """Seat a philosopher.  This increments the count &lt;br /&gt;        of currently-eating Philosophers.&lt;br /&gt;        &lt;br /&gt;        :param philosopher: The Philosopher to be seated.&lt;br /&gt;        """&lt;br /&gt;        self.log.info( "Seating {0}".format(philosopher.name) )&lt;br /&gt;        philosopher.enter( self.forks, self.leaving)&lt;br /&gt;        philosopher.start()&lt;br /&gt;        self.at_table -= 1 # Consume a seat&lt;br /&gt;    def excuse( self, philosopher ):&lt;br /&gt;        """Excuse a philosopher.  This decrements the count &lt;br /&gt;        of currently-eating Philosophers.&lt;br /&gt;        &lt;br /&gt;        :param philosopher: The Philosopher to be excused.&lt;br /&gt;        """&lt;br /&gt;        philosopher.join() # Cleanup the thread&lt;br /&gt;        self.log.info( "Excusing {0}".format(philosopher.name) )&lt;br /&gt;        self.at_table += 1 # Release a seat&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;The &lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;dinner&lt;/span&gt; method assures that all Philosophers eat until they are finished. &amp;nbsp;It also assures that four Philosophers sit at the table and when one finishes, another takes their place. 
&amp;nbsp;Finally, it also assures that all Philosophers are done eating before the dining room is closed.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-6748578157990797717?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/6748578157990797717/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/06/multithreading-fear-uncertainty-and.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6748578157990797717'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6748578157990797717'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/06/multithreading-fear-uncertainty-and.html' title='Multithreading -- Fear, Uncertainty and Doubt'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-3730138482225796254</id><published>2011-06-03T08:52:00.000-04:00</published><updated>2011-06-03T08:52:45.142-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='blogging'/><title type='text'>Changed the Page Template</title><content type='html'>The "default" template I chose was too narrow for presenting code samples. 
&amp;nbsp;Changed it.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-3730138482225796254?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/3730138482225796254/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/06/changed-page-template.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/3730138482225796254'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/3730138482225796254'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/06/changed-page-template.html' title='Changed the Page Template'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-6736350237541488525</id><published>2011-05-26T08:00:00.001-04:00</published><updated>2011-05-26T08:00:00.422-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='database design'/><category scheme='http://www.blogger.com/atom/ns#' term='code-kata'/><title type='text'>Code Kata : "Simple" Database Design</title><content type='html'>Here's a pretty simple set of use cases for a code-kata database application.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This is largely transactional, not analytical.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;It's a simple inventory of ingredients, recipes and 
locations.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Context&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;42' sailboat.&lt;/li&gt;&lt;li&gt;Lots of places to keep stuff.  Lots.&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;div&gt;Stuff gets lost or misplaced.  It's helpful to marry recipes with ingredients to use up the last of something before it goes bad and stinks up the boat.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Actor is essentially the cook.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Use Cases&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;Perishables to be eaten soon?&lt;/li&gt;&lt;li&gt;Shopping list for specific recipes.&lt;/li&gt;&lt;li&gt;Where did I put that?&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Model&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a href="http://yuml.me/diagram/scruffy/class/[Ingredient]m-n[Recipe],%20[Ingredient]1-n[On-Hand],%20[On-Hand]n-1[Location]." onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img style="float:right; margin:0 0 10px 10px;cursor:pointer; cursor:hand;width: 621px; height: 152px;" src="http://yuml.me/diagram/scruffy/class/[Ingredient]m-n[Recipe],%20[Ingredient]1-n[On-Hand],%20[On-Hand]n-1[Location]." border="0" alt="" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;Ingredient.  A generic description: "lime", "coconut".  Not too much more is needed.  A "food safety" notation (refrigeration required, etc.) is a helpful attribute.  Maybe a "food group" or other nutrition information.&lt;/li&gt;&lt;li&gt;Location.  A text description of where things can be stored.  This shouldn't have too many attributes, because boats aren't big grids.  Phrases like "port saloon upper cabinet", or "galley outer cooler" make sense to folks who live on the boat.&lt;/li&gt;&lt;li&gt;On Hand.  
This is simply ingredient, location and a measurement of some kind.  Example: 3 limes in the starboard galley center cooler.  There's a lot of magic around units and unit conversion that can be fun.   But that strays outside the database domain.&lt;/li&gt;&lt;li&gt;Recipe.  Example: "One of sour, two of sweet, three of strong, and four of weak.", lime, simple syrup, rum, water.  Plain text using a lightweight markup is what's required here.  Along with a  many-to-many relationship with ingredients.  This is not carefully defined above because it should be done as a "more advanced" exercise.&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;div&gt;I think this has the right amount of complexity and isn't very abstract.  Since the use cases are pretty obvious to anyone who's cooked or been to a grocery store, use case details aren't essential.&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-6736350237541488525?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/6736350237541488525/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/05/code-kata-simple-database-design.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6736350237541488525'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6736350237541488525'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/05/code-kata-simple-database-design.html' title='Code Kata : &quot;Simple&quot; Database Design'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-6067953643184639962</id><published>2011-05-25T08:49:00.004-04:00</published><updated>2011-05-25T08:56:33.454-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='stackoverflow'/><category scheme='http://www.blogger.com/atom/ns#' term='meetup'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>Meetup Tonight</title><content type='html'>Tonight (May 25th).  Red Dog.  Colley Ave.  Ghent.  I'll be wearing my Stack Overflow shirt.  I'll be there about 7.  I know that at least one other person won't be there until 8.  &lt;div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The &lt;a href="http://www.meetup.com/stackoverflow/Hampton-VA/105118/"&gt;Meetup&lt;/a&gt; link.&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I like this meetup idea a lot.  Probably because the WFH life-style is a little isolating.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;There's the small "&lt;a href="http://www.meetup.com/stackoverflow/Hampton-VA/"&gt;Hampton Stack Overflow Community&lt;/a&gt;".  We have a common interest in Stack Overflow.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Also, there's the &lt;a href="http://www.meetup.com/757-Python-Users-Group/"&gt;757 Python Users Group&lt;/a&gt;.  We have a common interest in Python.  I've decided to become the "official" organizer for this.  
I'm going to join the &lt;a href="http://757labs.org/"&gt;757 Labs&lt;/a&gt; Hackerspace, also.&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-6067953643184639962?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/6067953643184639962/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/05/meetups.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6067953643184639962'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6067953643184639962'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/05/meetups.html' title='Meetup Tonight'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-5241556968796287875</id><published>2011-05-24T08:00:00.001-04:00</published><updated>2011-05-24T08:00:01.743-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='waterfall'/><category scheme='http://www.blogger.com/atom/ns#' term='agile'/><category scheme='http://www.blogger.com/atom/ns#' term='software process improvement'/><category scheme='http://www.blogger.com/atom/ns#' term='scrum'/><title type='text'>Agility and following a "Strictly Agile" approach</title><content type='html'>I've seen some discussion on Stack Overflow that is best characterized by the 
question: "What is Strictly Agile?", or "What's the Official Agile Approach?".&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Someone shared this with me recently: "&lt;a href="http://radar.oreilly.com/2011/05/process-kills-developer-passion.html"&gt;Process kills developer passion&lt;/a&gt;".&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I have also heard some great complaints about organizations that claim "Agile" and actually do nothing of the kind.  In some cases it's not a "crunchy agile shell" around a waterfall process; it's a simple lie.  Nothing about the process is Agile except a manager insisting that all the status reporting, planning and unprioritized lists of random requirements are Agile.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Finally, I got this weird suggestion: "consider writing a blog about how to test if you are agile or not".  It's weird because testing for Agile is like testing for breathing; it's like testing for flammability.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;The Agile Test&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Testing if your project is Agile can be done two ways.&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;&lt;b&gt;Practical&lt;/b&gt;.  Make a change to the project.  Any change.  Requirements, architecture, due dates, staff, &lt;i&gt;anything&lt;/i&gt;.  Does it derail?  If so, it wasn't very Agile, was it?&lt;/li&gt;&lt;li&gt;&lt;b&gt;Theoretical&lt;/b&gt;.  Reread the &lt;a href="http://agilemanifesto.org/"&gt;Agile Manifesto&lt;/a&gt;.  Make a score card that evaluates the project on each of the eight basic criteria in the Agile manifesto.  Convene all the project stakeholders.  Conduct careful surveys and have structured walkthroughs to determine the degree of Agility surrounding each person, deliverable, collaborative relationship and issue.&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;An important point is that Agile is not absolute.  
Some practices are more Agile than others.  There's no "strictly" Agile.    There are ways to make a project more Agile; that is, it can effectively cope with change.  There are ways to make a project less Agile; that is, change causes problems and can derail the project completely.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The canonical example is a missing, misstated or contradictory requirement that gets uncovered after coding and during user acceptance test.  Clearly, that feature has been built and is absolutely wrong.  What happens next?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Agile?  The product can be released with the broken feature relegated to the next release.  A hack is put in to remove the buttons or menu items or links until they work.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Not Agile?  Everyone works around the clock to make that feature work no matter what.  Paraphrasing Admiral Farragut:   "Technical debt be damned.  Development must proceed full speed ahead."  All of this irrespective of the relative value of what's being developed.  Schedule comes first; features second.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;How Much Process?&lt;/b&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The "Process Kills..." blog entry repeats the observation that a lot of carefully-defined process isn't really all that helpful.  It identifies a cause ("process kills passion") that can be true, but is largely irrelevant.  Process is, essentially, work that's not focused on delivering anything of real value.  
Complex processes are "meta" work; it's work focused on IT internals; it's work that creates no value for the users of the software; work that replaces the more valuable elements of the Agile Manifesto.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;One can argue that processes, documentation, contracts and plans "assure" success or demonstrate some level of quality.  To an extent all the process and meta-work creates trust that, eventually, the resulting software product will solve the original problem.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The mistake is that non-Agile methods use a series of surrogates (processes, documentation, contracts and plans) instead of actual software.  The point of Agile methods is to release software early and often and avoid using surrogates.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Key Points of Agile&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;/div&gt;&lt;div&gt;Here are the key points of the Agile Manifesto.&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;&lt;b&gt;Individuals and interactions&lt;/b&gt; over processes and tools.   A more Agile project will use the best people and encourage them to talk amongst themselves.  A less Agile project will write a lot of things (which folks don't have time or reward for reading.)  There will be misunderstandings, leading to large, boring meetings where someone reads PowerPoint slides to other folks to try and clear up misunderstandings.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Working software&lt;/b&gt; over comprehensive documentation.  A more Agile project uses frequent release cycles of incremental software.  A less Agile project attempts to gather all requirements, do all design and then try to do all the coding even though the requirements have already been found to be less than crystal clear.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Customer collaboration&lt;/b&gt; over contract negotiation.  
A more Agile project uses constant contact with the customer and product owner to refine and prioritize the requirements.  A less Agile project uses a complex change control process to notify everyone of a requirements change, which leads to design and code changes, and has cost and schedule impact that must be carefully planned and documented.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Responding to change&lt;/b&gt; over following a plan.  A more Agile project uses incremental releases, conversation and a modicum of discipline to build things of value.  Just because someone thought it should be included in the requirements doesn't mean the feature is really required.&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;&lt;b&gt;The "Process Kills Passion?" Question&lt;/b&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The Process Kills Passion blog lists a bunch of things that, it appears, some folks find burdensome:&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;Doing full TDD, writing your tests before you wrote any implementing code.&lt;/li&gt;&lt;li&gt;Requiring some arbitrary percentage of code coverage before check-in.&lt;/li&gt;&lt;li&gt;Having full code reviews on all check-ins.&lt;/li&gt;&lt;li&gt;Using tools like Coverity to generate code complexity numbers and requiring developers to refactor code that has too high a complexity rating.&lt;/li&gt;&lt;li&gt;Generating headlines, stories and tasks.&lt;/li&gt;&lt;li&gt;Grooming stories before each sprint.&lt;/li&gt;&lt;li&gt;Sitting through planning sessions.&lt;/li&gt;&lt;li&gt;Tracking your time to generate burn-down charts for management.&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;This list has three different collections of practices.&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;&lt;b&gt;Good&lt;/b&gt;.  TDD, code reviews, generating headlines, stories and tasks, grooming stories before each sprint and doing some planning for each sprint are all simply good ideas.  They must be done.  
"Pure Coding" is not a good way to invest time.  Planning and then coding is much smarter, no matter how boring planning appears.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Difficult&lt;/b&gt;.  Test code coverage can be helpful, but can also devolve to empty numerosity.  20% more coverage doesn't mean 20% fewer bugs.  Nor does it mean 20% less chance of uncovering a bug at run time.   Code complexity ratings are also fussy because they don't have a direct correlation with much.  They &lt;b&gt;must&lt;/b&gt; be done and used to prioritize work that will reduce technical debt.  But mindless thresholds are for cowards who don't want to mediate deep technical discussions.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Silly&lt;/b&gt;.  Creating burn-down charts for management shouldn't be necessary.  Everyone must read and understand the backlog.  Everyone should build the summary charts they want from the backlog.  The product owner or even the eventual customer should do this on their own.  They must be given a profound level of ownership of the features and the process for creating software.  &lt;/li&gt;&lt;/ul&gt;&lt;div&gt;I don't agree that process kills passion.  I think there's a fine line between playing with software development and building software of value.  I think that valuable software requires some discipline and requires executing a few burdensome tasks (like TDD) that create real value.  Assuring 80% or 100% code coverage doesn't always create real value.  
Spending time keeping the backlog precise and complete is good; spending time making pictures is less good.&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-5241556968796287875?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/5241556968796287875/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/05/agility-and-following-strictly-agile.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/5241556968796287875'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/5241556968796287875'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/05/agility-and-following-strictly-agile.html' title='Agility and following a &quot;Strictly Agile&quot; approach'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-4447103513759576655</id><published>2011-05-19T08:00:00.000-04:00</published><updated>2011-05-19T08:00:03.414-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='UML'/><category scheme='http://www.blogger.com/atom/ns#' term='markup'/><category scheme='http://www.blogger.com/atom/ns#' term='rst'/><title type='text'>Creating UML</title><content type='html'>I'm a big fan of plain-text tools.  Source Code.  ReStructuredText.  LaTeX.  
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I'm not a big fan of proprietary file formats and document formats that are difficult or impossible to decode.  JSON and XML rock.  .XLS files are painful and difficult to work with.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;UML Diagrams are a particularly odious problem.  To see a diagram it has to be PNG or PDF or some other graphic format that's optimized for storage and display, but not really optimized for editing.  SVG is a text vector markup language, but it's painful because it's so generalized.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Recently, I found two text-to-UML tools that are exciting prospects.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;First, there's &lt;a href="http://yuml.me/"&gt;YUML.me&lt;/a&gt;.  This draws pretty nice, if simple, diagrams that you can work with relatively little pain.  It's slow and limited.  But it works for simple diagrams.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;img src="http://yuml.me/diagram/scruffy;scale:75/usecase/[Author]-(write%20text),%20(render%20image)-[YUML],%20[Author]-(share%20link)." /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The best part is that the image is rendered from the URL as plain text.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'lucida grande';"&gt;http://yuml.me/diagram/scruffy/usecase/[Author]-(write text), (render image)-[YUML], [Author]-(share link).&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;YUML supports simple use case diagrams, simple class diagrams and really simple activity diagrams.  It covers a few bases with a pleasant level of flexibility.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The other tool is &lt;a href="http://plantuml.sourceforge.net/"&gt;Plant UML&lt;/a&gt;.  "PlantUML is used to draw UML diagram, using a simple and human readable text description."  
&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The online &lt;a href="http://www.plantuml.com/plantuml/"&gt;Plant UML Server&lt;/a&gt; allows a flexible no-software-on-the-desktop way to play with their markup language.   The text of the image is not in the URL here, since the text is so much more complex.&lt;/div&gt;&lt;div&gt;&lt;img src="http://www.plantuml.com:80/plantuml/img/it8iBSd8Bx9IqDMrKz08paWiIbNmoSpBrkJbiaAH2Y_AB4bL24cjA05A0IL3ylDpe591gNafgKKAdhc9wQcQ0000" /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The best part of this is that the pictures come from plain text.  &lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;The plain text is trivial to put under configuration control.&lt;/li&gt;&lt;li&gt;Plain text system descriptions are easy to write with simple markup.&lt;/li&gt;&lt;li&gt;Plain text documentation of existing software can be derived from simple source analysis.&lt;/li&gt;&lt;li&gt;Plain text design documents can generate some elements of the source code&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-4447103513759576655?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/4447103513759576655/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/05/creating-uml.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/4447103513759576655'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/4447103513759576655'/><link rel='alternate' type='text/html' 
href='http://slott-softwarearchitect.blogspot.com/2011/05/creating-uml.html' title='Creating UML'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-8894022160940172690</id><published>2011-05-18T13:32:00.002-04:00</published><updated>2011-05-18T13:36:27.312-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='meetup'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>The 757 Python User's Group</title><content type='html'>&lt;a href="http://www.meetup.com/757-Python-Users-Group/"&gt;http://www.meetup.com/757-Python-Users-Group/&lt;/a&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I'm looking forward to meeting other Python developers in Hampton Roads.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Tonight.  7:00 PM.  
See you there.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-8894022160940172690?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/8894022160940172690/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/05/757-python-users-group.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/8894022160940172690'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/8894022160940172690'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/05/757-python-users-group.html' title='The 757 Python User&apos;s Group'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-4422181590415440732</id><published>2011-05-17T08:00:00.000-04:00</published><updated>2011-05-17T08:00:09.350-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='API Design'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>Decisions and Consequences</title><content type='html'>A single poorly-made decision can have profound ripple-effects.  Once you're stuck with it, you make accommodations, hacks and work-arounds.  
Eventually, things work, but the result is less than ideal.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Changing tack requires sometimes pervasive rework to the application.  How can we reduce the risks and improve the value created?&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;A Recent Example&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;When dealing with bulk econometric data (Bloomberg, D&amp;amp;B, Moody's, etc.) you get BIG files with lots of fields.  Depending on what you're paying for, the file layouts are frequently different even though the content is similar.  I'm a big fan of plain-old CSV data.  Even the tab-delimited variant of CSV is not bad to work with.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Further, most vendors will slap some heading rows on the file so that the column names are--more or less--identified.  Surprisingly, this doesn't work out well in practice because there are often multiple columns with the same name.  Sigh.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Using Python's &lt;span class="Apple-style-span"  style="font-family:'lucida grande';"&gt;&lt;a href="http://docs.python.org/release/3.1.3/library/csv.html"&gt;csv&lt;/a&gt;&lt;/span&gt; library module lets us cope with CSV (and tab-delim) quite gracefully.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What's wrong with that decision?  Nothing.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Variant Column Names&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The question arises when you've purchased several files of econometric data and the column names are slightly different.  This happens with a single vendor and across vendors.  It's part of the game that can't easily be avoided.  Column names vary.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What to do?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's the less-than-ideal decision. 
 &lt;b&gt;Make the column names a parameter&lt;/b&gt;.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In Python, this is not terribly difficult.  The &lt;span class="Apple-style-span"  style="font-family:'lucida grande';"&gt;csv&lt;/span&gt; module's &lt;span class="Apple-style-span"  style="font-family:'lucida grande';"&gt;DictReader&lt;/span&gt; provides us a dictionary for each row.  Each column name becomes a key.  We can access the fields with &lt;span class="Apple-style-span"  style="font-family:'lucida grande';"&gt;some_row['this_field']&lt;/span&gt; and &lt;span class="Apple-style-span"  style="font-family:'lucida grande';"&gt;some_row['that_field']&lt;/span&gt;.  How bad can it be?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The extra punctuation is fairly hideous.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;More important, however, is the nature of the metadata.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Consequence One -- Dynamic Metadata&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Dynamic metadata, in this case, means that any indexing of the data is done based on character string column names.  
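&lt;br /&gt;&lt;br /&gt;A minimal sketch of what that looks like with &lt;span class="Apple-style-span"  style="font-family:'lucida grande';"&gt;csv.DictReader&lt;/span&gt;.  The sample data, the column names and the &lt;span class="Apple-style-span"  style="font-family:'lucida grande';"&gt;build_index&lt;/span&gt; helper are hypothetical, made up only for illustration; this is not the code from the actual application.
&lt;pre&gt;
```python
import csv
import io
from collections import defaultdict

# Hypothetical vendor extract; real files have far more columns.
SAMPLE = "ticker,rating,sector\nAAA,A1,Tech\nBBB,A1,Energy\nCCC,B2,Tech\n"


def build_index(rows, index_columns):
    """Build one index per entry in index_columns.

    index_columns maps an index name to the column name it is keyed on.
    Both are plain strings supplied at run time.
    """
    index = {index_name: defaultdict(list) for index_name in index_columns}
    for row in rows:
        for index_name, column_name in index_columns.items():
            # The column name is data, not code: a misspelled name only
            # shows up as a KeyError when the file is actually processed.
            index[index_name][row[column_name]].append(row)
    return index


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
index = build_index(rows, {"by_rating": "rating", "by_sector": "sector"})

print([row["ticker"] for row in index["by_rating"]["A1"]])
print([row["ticker"] for row in index["by_sector"]["Tech"]])
```
&lt;/pre&gt;
Nothing in this sketch says which indexes exist or which columns they use; that is only known once the caller's dictionary of names is in hand.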
&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'lucida grande';"&gt;&lt;/span&gt;&lt;/div&gt;&lt;blockquote&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'lucida grande';"&gt;index[index_name][row[column_name]].append( row )&lt;/span&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;div&gt;That's rather more complex than the alternative where the metadata has a fixed definition.&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'lucida grande';"&gt;&lt;/span&gt;&lt;/div&gt;&lt;blockquote&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'lucida grande';"&gt;some_index[row.column].append( row )&lt;/span&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;div&gt;&lt;b&gt;Consequence Two -- Murky ORM&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Once we have dynamic metadata, we're largely frozen out of ordinary SQL database implementations.  We don't know the column names, we don't know the indices.  We can't do simple &lt;span class="Apple-style-span"  style="font-family:'lucida grande';"&gt;CREATE TABLE&lt;/span&gt; statements because we don't really have the column names until we open the working files.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We have to grub through all the code to find out where the dynamic mapping is reasoned out.  Once we find that, we can then consider how to make the metadata fixed enough to tackle a SQL database.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We could, of course, generate the SQL &lt;span class="Apple-style-span"  style="font-family:'lucida grande';"&gt;CREATE INDEX&lt;/span&gt; statements on-the-fly.  There's nothing wrong with it.  But it slows down analysis and decision-making when we're not sure what indexes there are or what leads to a choice of index.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What's important here is that we want to use SQLite because it ships with Python.  
We want our application to &lt;i&gt;use&lt;/i&gt; an ORM (like &lt;a href="http://www.sqlalchemy.org/"&gt;SQLAlchemy&lt;/a&gt; or &lt;a href="http://sqlobject.org/"&gt;SQLObject&lt;/a&gt;).  We don't want our application to &lt;i&gt;become&lt;/i&gt; a kind of ORM because of the dynamic SQL and dynamic column names.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Cleanup&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The cleanup road is clear.  &lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;Map all variant inputs to one common structure.  Rather than work with raw dictionaries from csv, map each row to a standard set of names.  For now, we can replace the dictionaries with named tuples to prepare for a migration to an ORM when that's possible.&lt;/li&gt;&lt;li&gt;Replace the &lt;span class="Apple-style-span"  style="font-family:'lucida grande';"&gt;row['some field']&lt;/span&gt; syntax with &lt;span class="Apple-style-span"  style="font-family:'lucida grande';"&gt;row.some_field&lt;/span&gt; syntax.  Of course, there's a lot of this.  This is a pervasive change.&lt;/li&gt;&lt;li&gt;Find all the dynamic index creation and refactor that into a more static "database-like" place for now.&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;Item 1 is pretty easy to unit test.  We're adding a function to map from dynamic names to fixed names.  Nothing much to this testing-wise.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Item 2 requires unit tests with really good code coverage or there's no earthly way we can be sure that each mapping-syntax name has been transformed into an attribute-syntax name.&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Item 3 barely requires testing.  
Indexes and other features are performance enhancements that can be removed and added without altering functionality.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-4422181590415440732?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/4422181590415440732/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/05/decisions-and-consequences.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/4422181590415440732'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/4422181590415440732'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/05/decisions-and-consequences.html' title='Decisions and Consequences'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-3350049133529838386</id><published>2011-05-12T08:00:00.001-04:00</published><updated>2011-05-15T12:38:06.971-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='use case'/><category scheme='http://www.blogger.com/atom/ns#' term='analysis'/><title type='text'>A Taxonomy of Use Case Errors</title><content type='html'>&lt;div&gt;First, the definition.  A use case describes an actor's interaction with a system to create business value.  
There are three parts: Actor, Interaction and Business Value.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;1.  Not Interactive.&lt;/div&gt;&lt;div&gt;1.1.  The use case is just features and technical attributes with no actor interaction expressed.&lt;/div&gt;&lt;div&gt;1.2.  The use case is just algorithms and processing with no connection to an actor or a goal.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;2.  No Business Value.&lt;/div&gt;&lt;div&gt;2.1.  Incomplete&lt;/div&gt;&lt;div&gt;2.1.1.  The use case focuses on sequential operations with no value or goal.&lt;/div&gt;&lt;div&gt;2.1.2.  The use case simply follows existing precedent without supporting actual business goals.  It "paves the cow path".&lt;/div&gt;&lt;div&gt;2.2.  Non-Specific&lt;/div&gt;&lt;div&gt;2.2.1.  The use case is a result of free-running imagination; it conflates "possible" with "required".  It contains descriptions of interactions which could happen or would be nice to happen.&lt;/div&gt;&lt;div&gt;2.3.  Covers the Technology Only&lt;/div&gt;&lt;div&gt;2.3.1.  The solution technology is conflated with the business problem.  Words like "database" or "foreign key" or "error log" or other solution technology are central.&lt;/div&gt;&lt;div&gt;2.4.  Contradictory&lt;/div&gt;&lt;div&gt;2.4.1.  The use case goal contradicts other goals.  &lt;/div&gt;&lt;div&gt;2.4.2.  The use case sequence is inconsistent with the stated goal.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;3.  
No Actor.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-3350049133529838386?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/3350049133529838386/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/05/taxonomy-of-use-case-errors.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/3350049133529838386'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/3350049133529838386'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/05/taxonomy-of-use-case-errors.html' title='A Taxonomy of Use Case Errors'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-7986009999850561510</id><published>2011-05-10T08:00:00.001-04:00</published><updated>2011-05-10T08:00:01.273-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='OO design'/><category scheme='http://www.blogger.com/atom/ns#' term='procedural programming'/><title type='text'>The Ubiquitous Object</title><content type='html'>Objects are everywhere.  &lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Weirdly, some people can't see them.  
I guess they live in a rarified, HP Lovecraftian world of pure action inhabited by amorphous things that can't be properly called "beings" but rather "doings" because they're pure activity with no existence.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Read "&lt;a href="http://www.hplovecraft.com/writings/texts/fiction/hy.asp"&gt;Hypnos&lt;/a&gt;".  "They were sensations, yet within them lay unbelievable elements of time and space—things which at bottom possess no distinct and definite existence."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Got this comment the other day.&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;/div&gt;&lt;/div&gt;&lt;blockquote&gt;&lt;div&gt;&lt;div&gt;... doing procedural code correctly when you don't want to be bothered w/ OO is a separate and big enough topic that warrants its own book or monograph.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;div&gt;I guess that means that objects, and the reality that they model, are a "bother"—a pitfall to be avoided—a cost with no benefit.  This is not the first time I've heard this, and—like Lovecraft—it leads me to wonder how such a rich and weird phantasy world gets constructed.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I had a project manager exclaim "You don't need more than seven or eight objects to write any application."  I didn't press the person on that point.  I assumed that they were talking about classes (not objects) and, further, had conflated class with "elaborate module-like library packed with amazing features". Or maybe they conflated class with package.  Or something.   It's hard for me to dig into misapprehensions and false assumptions without being rude.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;There are a surprising number of misapprehensions.  I'm occasionally tempted to turn &lt;a href="http://www.nltk.org/"&gt;NLTK&lt;/a&gt; loose on all questions tagged "Python" on Stack Overflow.  
With some patient reading, I think I could develop a taxonomy of OO confusion.   However, let's just focus on this comment.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;The Bother Factor&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Why is OO a "bother"?  &lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;I've been told that OO programming is &lt;i&gt;different&lt;/i&gt;.  Different from what?  From procedural programming without objects, I guess.&lt;/li&gt;&lt;li&gt;I've been told that some problems are a better fit for OO, and some problems aren't a good fit for OO.  This is hard to parse because it makes the more profound claim that some problems weirdly don't involve any "objects", just pure actions.&lt;/li&gt;&lt;li&gt; The &lt;a href="http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch"&gt;Object-Relational Impedance Mismatch&lt;/a&gt; problem somehow indicts object-oriented programming as unsuitable when there's a relational database involved.&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;Let's look at some of these in a little depth to see the underlying fallacies.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Procedural Is More Fundamental&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This is subtle and pernicious.  An OO language contains within it a procedural language.  Because of this, we can use Java, C++ or Python to write Fortran-like (or VB-like) crapola code.  It's possible to write everything in a single, massive, static class with piles of random global variables, long lists of disorganized methods, and "adaptation via block comment" buffoonery.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Some folks object to characterizing procedural programming as random, disorganized or buffoonery.  
They tell me that a purely procedural program can be neat and well organized with tidy, focused modules that have narrowly-defined responsibilities, no global variables and clever techniques like pointer-to-function to support adaptation.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Wait.  The idea of tidy, focused modules with narrowly-defined responsibilities is exactly what a class is.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This is important.  &lt;b&gt;All good procedural programming is isomorphic to object-oriented programming minus the class definitions&lt;/b&gt;.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Procedural isn't "fundamental".  It's just "fragmentary".  Procedural programming is a subset of object-oriented programming.  Not a foundation.  We can, for example, do functional-style object-oriented programming by using immutable objects.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Some Problems Aren't A Good Fit&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Claiming that there are problems which don't fit the object-oriented paradigm is false.  Or such a claim hearkens to a more elaborate ontology in which existence somehow doesn't matter.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This question is typical:  "&lt;a href="http://stackoverflow.com/questions/178262/what-should-be-oo-and-what-shouldnt"&gt;What should be OO and what shouldn't?&lt;/a&gt;"&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;When a program "runs" or "executes" there is state change.  In a lazy functional world, state change is characterized by the creation and destruction of immutable objects: the new "4" that's created by "2+2".  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In order for there to be state, there must be an object that has a state of being.  Objects are inherent in doing any computing of any kind.  
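&lt;br /&gt;&lt;br /&gt;Python, at least, makes this easy to check.  The following is only a small sketch with made-up variable names: a few lines of seemingly "procedural" code in which every value is an object of a built-in class.
&lt;pre&gt;
```python
# "Procedural" statements, no class definitions of our own.
total = 2 + 2          # the result is an immutable int object
names = []             # a mutable list object
names.append("row")    # state change via a method of a built-in class

# Every value is an instance of some class, and ultimately of object.
print(type(total).__name__, isinstance(total, object))
print(type(names).__name__, isinstance(names, object))

# The "+" operator is itself a method call on the int object.
print((2).__add__(2) == total)
```
&lt;/pre&gt;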
&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Some folks like to lift up stored procedures or shell scripts as "important" examples of non-OO programming.  Mostly, these just show that a non-OO language can persist for a long time because clever programmers can work around a lot of limitations.  (&lt;a href="http://en.wikipedia.org/wiki/Turing_completeness"&gt;Turing Completeness&lt;/a&gt; is a necessary pre-condition; not a desirable feature set.)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;[And yes, I've written multiple-thousand line shell scripts so customers can avoid paying a license fee for a proper compiler.  Just because it &lt;i&gt;can&lt;/i&gt; be done doesn't mean it &lt;i&gt;should&lt;/i&gt; be done.]&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This is important.  &lt;b&gt;All Programming Involves Objects&lt;/b&gt;.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;There are really just two "paradigm" decisions.  Does the problem involve &lt;i&gt;new&lt;/i&gt; class definitions or can it be done using built-in classes?  Does the problem involve mutable objects or immutable objects?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Software that uses only built-in classes is termed "procedural".  Software that uses only immutable objects is termed "functional".  Software that uses mutable objects is mistakenly termed "object-oriented".&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Object-Relational Mismatch&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This isn't really very interesting, no matter how many times people like to flog it.  Use an ORM.  Move on.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Further, it's important to recognize that normalization, foreign keys, cascading deletes and other malarky are hacks imposed on us by several relational database limitations.  These are not &lt;i&gt;essential&lt;/i&gt; parts of any problem.  
&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I don't know how many times I've had to answer the "how do I do foreign keys in Java/C++/Python?" question.  The answer is always the same: foreign keys are a hack-around because there are no proper object references in a relational database.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;What's Left?&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In spite of the obvious logic that OO is central, there is always a residual "It's a bother" sense from folks whose first language was not an OO language.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;As far as I can tell, the "bother" stems from simple ignorance of what's &lt;i&gt;really&lt;/i&gt; going on.  Many programmers can't articulate any design principles.  Yet, they tend to follow some principles rather closely.  Ask them what they're doing.  Read their code.  Almost everyone who codes has some set of fundamental principles.   (The few exceptions are people who seem to write code more-or-less randomly and still manage to arrive at something that appears to "work"; these people do exist and are very scary.)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Many programmers don't follow &lt;b&gt;all&lt;/b&gt; of the &lt;a href="http://en.wikipedia.org/wiki/Solid_(object-oriented_design)"&gt;SOLID Principles&lt;/a&gt;. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Many programmers follow the SOLID principles using different nomenclature.  The SOLID initials and acronyms are just one goofy terminology.  There are more principles than these, and the principles can have other names. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What's important is that (with rare exceptions) &lt;b&gt;all&lt;/b&gt; programmers follow some of the SOLID principles.  Some follow all of them.  Some follow numerous additional principles beyond these.  
Some give their principles other names.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The folks who claim OO programming is a "bother" just don't happen to recognize that they're already following some of the SOLID principles and actually doing OO programming with built-in classes.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Doing Procedural Programming Correctly&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Bottom Line: "doing procedural code correctly" is simply OO programming using only built-in classes.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It's not a "big" topic.  It's entirely an exercise in learning how to apply someone else's nomenclature to one's existing principles.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-7986009999850561510?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/7986009999850561510/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/05/ubiquitous-object.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/7986009999850561510'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/7986009999850561510'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/05/ubiquitous-object.html' title='The Ubiquitous Object'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-3303173747753270700</id><published>2011-05-03T08:00:00.001-04:00</published><updated>2011-05-03T08:00:16.693-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='OO design'/><category scheme='http://www.blogger.com/atom/ns#' term='procedural programming'/><title type='text'>The curse of procedural design</title><content type='html'>After reverse engineering procedural code in C, VB or even Python, I'm finding that procedural programming inevitably leads to bad, bad code-rot.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Consider some of the common design patterns.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Strategy&lt;/b&gt;.  Confronted with alternative strategy choices, a purely procedural code solution is either &lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;If-statements everywhere the strategy is involved.&lt;/li&gt;&lt;li&gt;Block comments.  (Pre-processor &lt;span class="Apple-style-span"  style="font-family:'lucida grande';"&gt;#if&lt;/span&gt; statements are the logical equivalent of block comments plus a tool to move them around just prior to compilation.)&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;These lack flexibility and seem to devolve into a quagmire of mystery.   The if-statements often become tangled and complex.  More importantly, some strategy choices — which are unused — may not be maintained at all.  Of course, the block comments are never maintained.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Command&lt;/b&gt;.  Often a command design requires a "code" or "label" and a big-old sequential switch (BOSS™) statement to select among the procedures which implement the various commands.  
Once "composite" commands are introduced, this devolves into nonsense.  Ideally, it's a simple recursion, where a composite command simply invokes the sub-commands.  However, folks get nervous about recursion and try to write weird loops.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;State&lt;/b&gt;.  A state design always seems to involve labels or codes for the state names and a slightly different big-old-state-switch (BOSS™, no accident that this is the same acronym) to sort out the variant behaviors in the distinct states.  This shouldn't become too confusing.  After all, Turing machines and other mathematical abstractions give us a strong hint on how we should proceed.   &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The problem with stateful procedural programming is that the state changes can be hidden everywhere.  In the Really Bad Languages, variables can change values without an assignment statement!  In the Not Bad Languages, we can track down the various assignment statements and try to reason out the state changes.  Procedural code -- without a lot of adult supervision -- never seems to encapsulate state change with the same in-your-face clarity that OO programs do.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;I Could Go On&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The point is this.  While procedural programming &lt;i&gt;could&lt;/i&gt; be done well, there appear to be a lot of obstacles inherent in the paradigm.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The best procedural programming I've seen has always been very object-oriented.  Each procedure or function had a distinct data structure it worked with; they were all closely related by virtue of naming or file structure; much like a class definition.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I'm starting to wonder if my Building Skills books are taking the right approach.  
I start with the procedural aspects of Python.  I'm beginning to feel that this may be a disservice to the n00bz.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Perhaps it's better to swap the order of the sections and start with the various Pythonic data structures and introduce the various statements sort of "casually" as part of demonstrating how a data structure is supposed to be used.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-3303173747753270700?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/3303173747753270700/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/05/curse-of-procedural-design.html#comment-form' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/3303173747753270700'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/3303173747753270700'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/05/curse-of-procedural-design.html' title='The curse of procedural design'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-3756978367695896957</id><published>2011-04-30T08:00:00.001-04:00</published><updated>2011-04-30T08:00:06.902-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='ide'/><category 
scheme='http://www.blogger.com/atom/ns#' term='tools'/><category scheme='http://www.blogger.com/atom/ns#' term='Programming Languages'/><title type='text'>Language, Tools, Chickens, Eggs, Java and Python</title><content type='html'>Too much of programming is intimately tied up with the tools to support the development of the software.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Example 1.  I was told -- with absolute and fierce conviction -- that VB may suck as a language, but Visual Studio more than makes up for the obvious problems.  For some people, &lt;b&gt;Tools Trump Language&lt;/b&gt;.   Sadly, I've also had customers with ancient code they could no longer compile or maintain because the tools were out of support.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;On Stack Overflow, you can read questions like this: "&lt;a href="http://stackoverflow.com/questions/81584/what-ide-to-use-for-python"&gt;What IDE to use for Python?&lt;/a&gt;".  In spite of this question's immense popularity, it gets re-asked all the time.  Search for "Python IDE" to see endless duplicates.  One of the most common duplicate forms of this question asks (or demands) code completion.  As if there are folks who cannot write code without code completion.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Chickens and Eggs&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The issue with sophisticated IDE's (like Eclipse, NetBeans, and even Komodo) is that you have to learn the tools before learning the language.   Until you know something about the language, the tools, of course, are useless.  Worse, Eclipse is for "enterprise" applications and is so fat with bells (and whistles) that it's hard to determine what to use and what it means.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So the tool is a prerequisite for the language.  
But the language is a prerequisite for the tool.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;How to cut the Gordian Knot?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;First Principles&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Irrespective of the "Visual Studio makes VB not suck" crowd, language comes first -- and last -- and fills all the spaces in between.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Language is everything.  Software is merely encoded knowledge.  The language of that encoding is how we determine meaning; how we argue about correctness, adaptability, maintainability and security.  Tools don't endure -- they come and go -- but the language remains.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The only thing more important than the language is the data itself.  But that's another rant.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Proof, of course, is available everywhere except in VB circles.   For non-proprietary languages (Java, Python, etc., etc.) there are a large number of competing tools.  One language, many tools.  Take the hint.  Language is important.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Yes, some tools are so flexible, they cover several languages.  But there's no universal tool any more than there's a universal language.  And the bias is clearly very, very many tools for a given language and only a few languages for a given tool.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;How To Start&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Language comes first.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For Python, that's easy.  Run Python, type code at the &amp;gt;&amp;gt;&amp;gt; prompt, and you're learning. Python comes with IDLE, which is a minimalist IDE.  It will get anyone started.  Later, they can try other IDE's.    
&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For Java, however, that's not that easy.  It isn't, however, impossible to get started.  It's just challenging.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Option 1 -- Bare Knuckles.  It's possible to edit text and run the &lt;span class="Apple-style-span"  style="font-family:'lucida grande';"&gt;javac&lt;/span&gt; compiler to learn a great deal of Java without an IDE.  It's not a bad idea.  It will get complex to manage projects with more than a few files.   &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Eventually that's what &lt;a href="http://ant.apache.org/"&gt;Ant&lt;/a&gt;, &lt;a href="http://maven.apache.org/"&gt;Maven&lt;/a&gt; and &lt;a href="http://www.scons.org/"&gt;SCons&lt;/a&gt; are for.  But that's not a good place to start.  Again, the tools don't make sense until you start writing things big enough that the tools actually help.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Option 2 -- Succession of IDE's.  It's probably best to start with a very simple IDE for Java.  Something like &lt;a href="http://www.activestate.com/komodo-edit"&gt;Komodo Edit&lt;/a&gt;, &lt;a href="http://macromates.com/"&gt;TextMate&lt;/a&gt; or &lt;a href="http://www.barebones.com/products/bbedit/index.html"&gt;BBEdit&lt;/a&gt;.  There are a lot of choices, but the idea is to find something little more than a text editor with a few tools.  I've used these and like their relative simplicity.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The &lt;a href="http://www.javawide.org/index.php/Main_Page"&gt;JavaWIDE&lt;/a&gt; toolset might be helpful.  I haven't used it, but some folks suggest that it simplifies the language learning.  
Later, a "regular" desktop IDE can be used.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Later, one can move to &lt;a href="http://netbeans.org/"&gt;NetBeans&lt;/a&gt; or &lt;a href="http://www.eclipse.org/"&gt;Eclipse&lt;/a&gt;.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Classrooms and Autodidacts&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In the classroom, it's easy to demonstrate NetBeans and answer questions.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For autodidacts, however, choosing the wrong tool leads to endless confusion.  The chicken and egg issue isn't clarified by wasting time trying to install and use a tool that's too sophisticated for a n00b.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;N00b autodidacts really need to start with a simple text-editor.  They need to use `javac` to compile and `java` to run the resulting class.  For the first week or two, this will do.  Once past the fundamentals, however, IDE selection can start to make sense.  A BBEdit/TextMate/Komodo thing should be next.  This is good for perhaps a year or more.  Then, when doing "real" programming, a heavier-weight tool makes sense.  
&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-3756978367695896957?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/3756978367695896957/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/04/language-tools-chickens-eggs-java-and.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/3756978367695896957'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/3756978367695896957'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/04/language-tools-chickens-eggs-java-and.html' title='Language, Tools, Chickens, Eggs, Java and Python'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-2013339190657751867</id><published>2011-04-19T08:00:00.000-04:00</published><updated>2011-04-19T08:00:03.000-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='test-driven reverse engineering'/><category scheme='http://www.blogger.com/atom/ns#' term='unit testing'/><category scheme='http://www.blogger.com/atom/ns#' term='reverse engineering'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>Test-Driven Reverse Engineering (TDRE)</title><content type='html'>Another case study on TDRE.&lt;div&gt;&lt;br 
/&gt;&lt;/div&gt;&lt;div&gt;Provided: 2,938 lines of Python code which process a handful of large files to create a number of outputs.  [Details can't be disclosed.]&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Objective: Refactor to distinguish between the overall sequence of transformational steps and the details of each individual step.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Observations&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The code is almost purely procedural.  There are 11 class definitions.  6 of these wrap built-in types with type conversion and null-handling.  1 is a new exception.  1 is a generic "table" that essentially duplicates features of SQLite.  The remaining 3 are actually part of the problem domain.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;One reason for reverse engineering is that the code has reached an intellectual limit.  It's small, but "dense" with highly-optimized processing steps.  The &lt;a href="http://en.wikipedia.org/wiki/Cohesion_(computer_science)#Types_of_cohesion"&gt;cohesion type&lt;/a&gt; is almost all "Temporal".  Processing is grouped into successive processing loops; each loop contains a cluster of processing steps.  Consequently, it's quite hard to tease apart the algorithm to get a "big picture" of what's going on.  It's just a dense stand of trees.  No forest.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Another reason for reverse engineering is to support the endless adaptation and modification of the code base.  The program is a kind of "spreadsheet on steroids".  This isn't a simplistic collection of cells and formulæ that permits simple what-if analysis.  This is a more complex set of formulæ that would be challenging (but not impossible) to implement as a spreadsheet.  
The use case, however, is the spreadsheet use case:  think, tweak, create results, repeat.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;TDRE Approach&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Start with an &lt;b&gt;Initial Survey&lt;/b&gt; of the legacy code base and sample files.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Create an Outline&lt;/b&gt; or "sketch" of the domain model and main program.  This will be a module (or a package) with comments and some preliminary class definitions.  Little more.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Pick a Processing Step&lt;/b&gt; in the legacy code.  This often requires creating processing summaries of the legacy code.  Most legacy code is procedural, so the processing tends to be sequential in nature.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Instrument the Legacy Code&lt;/b&gt; with print statements to gather data.  This can be simple.  The output can be challenging to interpret.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;    with open("tdre_results_1","w") as tdre:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;        # some legacy processing&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;        print( "Case:", foo, bar, ", Expect:", baz, file=tdre )&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;From the output, &lt;b&gt;Build Unit Test Cases&lt;/b&gt;.  Fill in parts of the processing sequence and domain model.   
Debug code until the tests pass.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Initial Survey&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The &lt;b&gt;Initial Survey&lt;/b&gt; locates several things.&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;The usable, working modules.  It appears that all reverse engineering involves a code base with dead or unused code.  Even a small project (3,000 lines) will have a remarkable amount of dead code.&lt;/li&gt;&lt;li&gt;Priorities for the implemented functionality.  Not every "main" module is relevant.  &lt;/li&gt;&lt;li&gt;Example inputs and outputs.&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;If the software cannot be run (as is the case with organically developed systems that depend on large, complex corporate databases), then the example inputs and outputs may not actually match the software.  If the software can be run, it should be run and the actuals compared against the samples to confirm that the code base supplied really produced the sample outputs.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Expect that the provided legacy code is slightly different from the code in production use.  In some cases, this cannot be resolved; for example, when the executables are older than the source.  In other cases, the code matches and no further work is required to establish the legacy baseline.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The sample outputs point in the direction of an acceptance test case.  The sample output cannot be taken literally as the one-and-only acceptance test.  While it's desirable for reverse engineering to reproduce the sample output, most reverse engineering will involve enhancements or bug fixes.  
Expect that errors will be found (or may be known to exist) in the sample output.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Create Outline&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The outline is -- initially -- just generic MVP.  There must be a domain model, some "presenter" that has the application logic,  and some "view" for displaying the outputs.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In our case study, above, the "view" is a collection of (mostly text) output files.  The model was undefined in the legacy code, which was all "presenter" application logic.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The goal was to extract the underlying model, break the application "presenter" logic into two layers (forest and trees) and build some views for each of the output files.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Pick a Processing Step&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This can be challenging, depending on the legacy code base.  There are two paths through a procedural code base. &lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;Back to Front.  Start with the final results and unit test the final steps based on previous steps that will be defined later.&lt;/li&gt;&lt;li&gt;Front to Back.  Start with the first recognizable intermediate result based on the input files.  
Unit test the initial steps.&lt;/li&gt;&lt;/ul&gt;It's more rewarding to work front-to-back because progress can be shown a little more clearly.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A better architecture can be created by working back-to-front since dependencies are easier to understand.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Unit Test Volume, Edges and Corners&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;There are two unit test design challenges when doing reverse engineering.&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;Volume.  The sample data can be large.  100,000 rows of sample data is too many to test.  Finding a "representative" subset is difficult.  Generally, arbitrary subsets have to be used to get started.  Once the application mostly works, more refined unit tests need to be created.&lt;/li&gt;&lt;li&gt;Edge and Corner Cases.  While the code may be riddled with &lt;b&gt;if&lt;/b&gt;-statements, it can still be difficult to locate sample inputs that exercise the various conditions in the code.  It's risky to create data -- we have to assume that the legacy code does unexpected things.   
In many cases, print statements have to be put into complex if statements to locate any actual data that exercises that logic path.&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;Once the unit tests are built, this is just Test-Driven Development (TDD).&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-2013339190657751867?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/2013339190657751867/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/04/test-driven-reverse-engineering-tdre.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/2013339190657751867'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/2013339190657751867'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/04/test-driven-reverse-engineering-tdre.html' title='Test-Driven Reverse Engineering (TDRE)'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-984145871185067652</id><published>2011-04-05T08:00:00.000-04:00</published><updated>2011-04-05T08:00:03.740-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='stackoverflow'/><category scheme='http://www.blogger.com/atom/ns#' term='API Design'/><category scheme='http://www.blogger.com/atom/ns#' term='OO design'/><category 
scheme='http://www.blogger.com/atom/ns#' term='algorithm'/><title type='text'>Performance Discussions and Software Design</title><content type='html'>Read this first: "&lt;a href="http://news.ycombinator.com/item?id=2375750"&gt;There is something I find interesting about online discussions around performance issues&lt;/a&gt;..."  It's about Stack Overflow, specifically.  Apparently, someone didn't get their question answered and decided it was better to gripe than to rewrite the question. &lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Let's look at their response in pieces.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"people try to gang up".   Since there's almost no social networking capability, this is a bit much to attribute to people responding to a poorly-worded question.  But, if you've worked all day on  a bad  solution to a poorly-conceived problem, it can &lt;i&gt;feel&lt;/i&gt; like being ganged up on.  When reality leaks in, it can feel unpleasant.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Hint 1.  There are no gangs.  It's possible that the question really is poorly written.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"cookie-cutter, patronizing, zero-information responses".  I'm guessing these are comments suggesting the approach is bad and asking for clarification.  I run afoul of this often because I feel compelled to post comments asking for clarification.  Some folks just don't like to clarify.  More than once I've been told that their question was very clear.   Since I'm asking for clarification, it seems odd to insist the question is perfect.  Worse, of course, is asking for help on Stack Overflow, but refusing to clarify the help required.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Hint 2.  Clarify.  Please.  
Don't insist that the question is perfect.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"they assume, without any basis, that the person has not (a) benchmarked the code,"  When the question has no benchmark data, this isn't an assumption.  It's a response to the lack of benchmark data.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Hint 3.  Provide the facts.  Don't complain when folks ask for facts.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"(b) is obviously running an inferior algorithm".  Again, this isn't an assumption.  It's the response to an incomplete question where the algorithm isn't provided.  Also, it's a common response to questions where the algorithm really is inferior. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Hint 4.  Consider that -- even after spending days banging your head against the wall -- your question might be poorly-written and require both benchmark data and an algorithm.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"advice about premature optimization... is a well assimilated folklore by now and I dont see how repeating that adds value".  Without measurements, profiler results and benchmark data, this is our only possible response.  &lt;i&gt;After&lt;/i&gt; the profiler results are posted, this advice really is useless.  &lt;i&gt;Before&lt;/i&gt; profiler results are posted, this advice often turns out to be essential.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Hint 5.  Whatever you might know is not well-assimilated folklore on Stack Overflow as a whole.  We don't know you, sadly.  We don't know how much you know.  To avoid useless advice, provide evidence -- in the question -- that the advice has &lt;i&gt;already&lt;/i&gt; been followed.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"better served if the discussion shifted to ... Pointed out possible bottlenecks ahead of time,"  Wouldn't that be nice?  What's a "possible bottleneck"?  
It's a badly-designed algorithm.  So, the responses to performance questions have to be focused on algorithm choice right away.  That means details on the code being used, and profiling information.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Hint 6.  There is no hint 6.  This would simply repeat hints 3 and 5.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"regardless of the fact whether the code construct is actually a bottleneck in the application or not, it is always good to know what the more efficient alternatives are... there is something called intellectual curiosity."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Reducing a question to a hand-waving hypothetical doesn't &lt;i&gt;improve&lt;/i&gt; the question.  It doesn't &lt;i&gt;rationalize &lt;/i&gt;a poor question.  The question still needs to be clarified.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Hint 7.  If the question raises a lot of comments and useless advice, please &lt;b&gt;rewrite&lt;/b&gt; the question.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-984145871185067652?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/984145871185067652/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/04/performance-discussions-and-software.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/984145871185067652'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/984145871185067652'/><link rel='alternate' type='text/html' 
href='http://slott-softwarearchitect.blogspot.com/2011/04/performance-discussions-and-software.html' title='Performance Discussions and Software Design'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-1307163808249292925</id><published>2011-03-31T08:57:00.004-04:00</published><updated>2011-03-31T08:59:57.191-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='stackoverflow'/><category scheme='http://www.blogger.com/atom/ns#' term='meetup'/><category scheme='http://www.blogger.com/atom/ns#' term='#SOMeetup'/><title type='text'>StackOverflow Meetup</title><content type='html'>See &lt;a href="http://www.meetup.com/stackoverflow/Hampton-VA/"&gt;http://www.meetup.com/stackoverflow/Hampton-VA/&lt;/a&gt; for information on next week's Stack Exchange meetup.  &lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For other events near you, see the &lt;a href="http://www.meetup.com/stackoverflow/"&gt;Meetup&lt;/a&gt; page.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I'll be wearing my "official" Stack Overflow shirt.  
I'll try to grab a seat by the door, also, to be easy to find.&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;#SOMeetup&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-1307163808249292925?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/1307163808249292925/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/03/stackoverflow-meetup.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/1307163808249292925'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/1307163808249292925'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/03/stackoverflow-meetup.html' title='StackOverflow Meetup'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-7265074121335223277</id><published>2011-03-30T16:36:00.004-04:00</published><updated>2011-03-30T16:55:29.877-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='architecture'/><category scheme='http://www.blogger.com/atom/ns#' term='refactoring'/><title type='text'>Code Deletion</title><content type='html'>A joyous milestone today.  
Removed much of our pre-&lt;a href="https://bitbucket.org/jespern/django-piston/wiki/Home"&gt;Piston&lt;/a&gt; RESTful web services code.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We started with the &lt;a href="http://code.google.com/p/django-rest-interface/"&gt;Django-REST Interface&lt;/a&gt;.  While nice, it imposed a number of restrictions that were onerous.  In particular, we have a lot of non-model responses.  They're model-like data that we serialize to be compatible with Django, but without actually being first-class Django Model objects. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In order to provide a generic, but detailed "status" message, we actually defined a Model that we never instantiated in the database.  We'd build (but not save) instances, just to make it easy to serialize them.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What a hack.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;To further complicate things, I failed to really understand the way that the alternate user authentication sources worked, and how much of the Django authentication process was better handled through middleware.  Failing to fully understand that, I wrote too much code.  We tinkered with incoming requests to extract HTTP Authorization headers.  We tinkered to handle Amazon-style key/signature values in the GET or POST.  And we tinkered to handle OpenAM authentication cookies.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Too much code.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;And it gets worse.   I tried to use &lt;a href="http://docs.python.org/library/urllib2.html"&gt;urllib2&lt;/a&gt; for a wide variety of RESTful requests.  This means more than GET and POST.  That was a mistake.  &lt;a href="http://docs.python.org/library/httplib.html"&gt;httplib&lt;/a&gt; works out a little better for doing RESTful web services requests.   If you don't have a lot of complex proxy server handling.  
And if you don't have a lot of complex authentication.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In our case, urllib2 was handling the 401 retries and cookies, and also had some extra handler code to treat a 201 Created response as a non-error (by default, urllib2 gagged on 201 Created).  Also, urllib2 appears to be lazy and doesn't send everything or close the sockets in the event of a problem.  This makes unit testing just a bit more complex than necessary.  Also, urllib2 required a couple of monkey patches to let us use PUT and DELETE without problems.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Needless Complexity.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It turns out that handling a 401 retry in httplib isn't really all that difficult.   That ended the use case for urllib2.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What's nice is &lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;Being able to unmake some bad decisions.  &lt;/li&gt;&lt;li&gt;Rerunning the entire unit test suite to ferret out the remaining concealed dependencies.&lt;/li&gt;&lt;li&gt;Removing hack-arounds, volume and complexity.&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;We still have a lot of work to make full use of Piston.  That will lead to removing yet more code.  It will, however, also change the API's slightly because the ".../xml/..." URL's will have a different format and we'll introduce ".../django/..." 
URL's which will have the current format.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-7265074121335223277?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/7265074121335223277/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/03/code-deletion.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/7265074121335223277'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/7265074121335223277'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/03/code-deletion.html' title='Code Deletion'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-5769464489323697980</id><published>2011-03-29T08:00:00.003-04:00</published><updated>2011-03-30T14:25:35.789-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='python 3'/><category scheme='http://www.blogger.com/atom/ns#' term='mac os x'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>Where is Python Used? 
(Update)</title><content type='html'>This is a fair-to-partly silly question that shows up in places like StackOverflow once in a while.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Python is used widely and pretty heavily.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It's a built-in feature of many operating systems in common use.   The exception, of course, is Windows.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I just found out -- the hard way -- that Python 2.6 is an integral part of &lt;a href="http://www.apple.com/ilife/"&gt;Apple's iLife&lt;/a&gt; suite of products. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Important safety tip for Mac OS X users.  &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;System.Library.Frameworks&lt;/span&gt; should not be touched.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Also, it helps to get used to the idea of typing &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;python3&lt;/span&gt; on the command-line.   Further, it helps to skip Python 3.1 and go straight to Python 3.2.  
&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Python 3.2 has &lt;a href="http://docs.python.org/py3k/library/argparse.html"&gt;argparse&lt;/a&gt; and the new dictionary-based configuration of &lt;a href="http://docs.python.org/py3k/library/logging.config.html#logging.config.dictConfig"&gt;logging&lt;/a&gt;.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-5769464489323697980?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/5769464489323697980/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/03/where-is-python-used.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/5769464489323697980'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/5769464489323697980'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/03/where-is-python-used.html' title='Where is Python Used? 
(Update)'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-814144738402600690</id><published>2011-03-15T08:00:00.000-04:00</published><updated>2011-03-15T08:00:19.030-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='use case'/><title type='text'>XBox Live -- Can't Unsubscribe</title><content type='html'>Here's a lack of a use case for you.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Someone -- fraudulently -- used my email address to subscribe to XBox live.  I cannot remedy this.  Apparently, neither can Microsoft.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I get spam from XBox.  I change my passwords all over the place.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I go to the XBox live web site to cancel this fraudulent account.  I can't.  There's no place to do that.  I cannot cancel the account because it can only be done through the XBox console.  Except -- of course -- in the case of fraud, the email user doesn't have a console.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So I call the help desk.  "Please remove my email from this account that fraudulently uses it."  They can't.  Absolutely can't.  All I can do is route xbox.com email into the spam folder.  That's it. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Nice help desk agent.  Doing the best she can.  But, she cannot find the email address and disconnect me from spam or XBox or XBox live.  Someone at the console needs to do that.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;How do we contact the person at the XBox Console?  
Can't send them email -- it goes to me!&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Somehow, someone at Microsoft has to call "beezyNdetroit" on the phone (I guess) and break the bad news to them that they're fraudulently using one of my email addresses for their XBox spam.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-814144738402600690?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/814144738402600690/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/03/xbox-live-cant-unsubscribe.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/814144738402600690'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/814144738402600690'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/03/xbox-live-cant-unsubscribe.html' title='XBox Live -- Can&apos;t Unsubscribe'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-1148551521060053511</id><published>2011-03-10T08:00:00.000-05:00</published><updated>2011-03-10T08:00:13.483-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='software process improvement'/><category scheme='http://www.blogger.com/atom/ns#' term='innovation'/><title type='text'>To Robert Fulton, Regarding the "Steam 
Boat"</title><content type='html'>&lt;blockquote&gt;"What sir, would you make a ship sail against the wind and currents by lighting a bonfire under her deck? I pray you excuse me. I have no time to listen to such nonsense."&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;-- Napoleon Bonaparte&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;div&gt;There's no authoritative source for this quote.  Since Fulton was commissioned to build a submarine and did build a steam-powered boat in France, it's unlikely for this quote to be actually true.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A great list of related quotes: &lt;a href="http://www.av8n.com/physics/ex-cathedra.htm"&gt;Famous Authoritative Pronouncements&lt;/a&gt;.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-1148551521060053511?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/1148551521060053511/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/03/to-robert-fulton-regarding-steam-boat.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/1148551521060053511'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/1148551521060053511'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/03/to-robert-fulton-regarding-steam-boat.html' title='To Robert Fulton, Regarding the &quot;Steam Boat&quot;'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-2244129239643076632</id><published>2011-03-03T08:00:00.001-05:00</published><updated>2011-03-15T15:44:06.104-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='software process improvement'/><title type='text'>Improving the "Velocity" of IT</title><content type='html'>Check this out: "&lt;a href="http://www.informationweek.com/news/global-cio/interviews/showArticle.jhtml?articleID=229218781"&gt;IT Is Too Darn Slow&lt;/a&gt;".&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This article is packed with helpful advice on how to improve "velocity" and the pace of innovation.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Good quotes: "Once IT decides to focus on speed, two obstacles get in the way: security and governance."  This is important.  Manage security without it becoming an impediment.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;One of the most important things not mentioned here is the idea that internal security must be a simple, cheap commodity.  An identity manager and SSO framework needs to be standard, available, and well-understood.  Each project shouldn't involve head-scratching and deep thinking about security.  Like an OS and a file system, security infrastructure should be a given.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Another good quote: "IT can't set priorities for 10 projects spread over the next two years because, once projects one, two, and three are done, that will change what would have been four, five, and six."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What isn't provided is a way to handle the accounting practices that always seem to take IT hostage.  The idea of "capital" vs. "expense" can serve as a weird artificial boundary on innovative projects.  
Lots of innovation gets chopped off when the "capital" budget is spent, and we switch over to "expense" where we can't invent anything new.  It's the same people.  Yet, the money is "different".  &lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-2244129239643076632?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/2244129239643076632/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/03/improving-velocity-of-it.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/2244129239643076632'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/2244129239643076632'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/03/improving-velocity-of-it.html' title='Improving the &quot;Velocity&quot; of IT'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-5120057498208455859</id><published>2011-03-01T08:00:00.000-05:00</published><updated>2011-03-01T08:00:09.613-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='software process improvement'/><title type='text'>DIY and the Dumb Rules of Prevention</title><content type='html'>Check out this little item in eWeek.  
"&lt;a href="http://www.eweek.com/c/a/Enterprise-Applications/Transforming-DIY-Projects-from-the-Painful-to-the-Productive-382595/"&gt;Transforming DIY Projects from the Painful to the Productive&lt;/a&gt;".&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;As an outside consultant in a large number of organizations, I've seen a lot of DIY projects—what we used to call "end user computing".  Indeed, I've even been hired by the user organizations because IT was ineffective.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;But IT gets in its own way.  There are some Dumb Rules of IT that create an environment in which software development and project management are nightmares of burdensome inefficiency.  Why?  How?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;How is easy.  IT has too much process.  I sat with angry users at meetings where they begged me to influence the CIO to stop making everything into a large-scale, heavy-weight, closely-monitored "project".  All that the users wanted were simple conversations on how best to interface with existing applications and frameworks.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;They didn't want a "project".&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Root Cause&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Why does IT anger the users with too much process?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;That's sometimes hard to see when we're an insider.  From end-users, I learned that IT's response to the inherent complexity of software development is to treat the act of creating software solutions as if the development work were—itself—just more software.  Software executed by people and organizations.  
They write process the way they write code.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's Dumb Rule 1 of success in IT:  "&lt;b&gt;When in doubt, define a process.&lt;/b&gt;"&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;And that rule is often just wrong.  Users hate it.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Process is wrong?  For software development, yes, too much process is a bad thing.  Why do you think we had to invent Agile methods?  The way we create complex software is not helped in any way by having a large-scale, heavy-weight process defined.  Adding process steps to the already complex task of creating software merely slows the work down.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Not all processes are bad.  Processes to create software are demonstrably unhelpful.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Suggestion One: Lighten UP&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"The first step is to bring business developers out of the shadows and into the limelight."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This is important, and difficult.  IT is so enamored of large processes that simply embracing non-IT developers means wrapping them in processes they don't need and can't comply with. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;While good advice, this isn't the &lt;i&gt;real&lt;/i&gt; first step.   &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The real first step is &lt;b&gt;stop weighing down everything with overly-defined, heavy-weight process definitions&lt;/b&gt;.  Embrace Agile development where there are sprints and releases and leave it at that.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;DIY folks who are not in IT can understand sprint and release, since that's how they work anyway.  IT needs to help them get started and help them release.  The rest has to be hands-off.  
Inside corporate IT and outside the corporate IT enclave.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Suggestion Two: Open The Door&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"Next, it’s important for IT to take the lead in providing tools and guidelines that make it easy for business developers to start off on the right foot."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;While also good advice, it's almost impossible to actually do.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Why?  IT folks who have drunk the Proprietary (i.e., Microsoft) Kool Aid laced with contractual terms and conditions won't let end-users have expensive Visual Studio toolkits.  And the true believers in Microsoft hegemony can't let users have non-Microsoft tools.  All the large IT organizations that are all-singing, all-dancing, all-Microsoft can't empower end users very well at all.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;They offer up Sharepoint and MS-Access and then shut down all further access to tools.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's Dumb Rule 2 of success in IT: "&lt;b&gt;Measure Success By Licensing Fees&lt;/b&gt;".&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Visual Studio works because it's expensive.  Right?  Oracle is "enterprise" scale because it's expensive.  SQLite is unacceptable because there's no support, right?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;To provide tools and guidelines, Corporate IT has to &lt;b&gt;either (1) stop paying so much for tools or (2) budget for user departments to have access to real tools or both&lt;/b&gt;.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If we don't allow the DIY folks a decent tool set, they'll still create software using crappy tools.  They're remarkably capable of building vast shadow systems in MS-Access.  
Seven interlocked Access databases that all magically work together on a single desktop is the kind of thing they'll do if we don't help them do something better.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;b&gt;Suggestion Three: Stop Driving and Start Guiding&lt;/b&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;DIY folks want first class access to servers, data and tools.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"IT can take the lead in providing guidelines and even code snippets or basic scripts that work with platforms that are commonly used to create DIY projects in your organization."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"Provide small doses of advice on how to design applications for scale and security that are led by your professional-development staff. "&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This can't happen, of course.  The reason is that we can't "trust" the end users.  If we give them access to the data, they'll make a total mess of it.  We &lt;b&gt;know&lt;/b&gt; they'll make a mess of it because we have columns in the database where the users have stopped entering the data we carefully defined in the data dictionary and started entering other data.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;i&gt;Why, oh why do end users corrupt the columns in the database?&lt;/i&gt; We tear at our hair and rend our clothes asking this.  Everyone asks this.  It's the great rhetorical argument against allowing users into the hallowed halls of IT.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;No one tries to &lt;i&gt;answer&lt;/i&gt; it, of course.  It's a rhetorical question.  The answer is embarrassing. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Users corrupt and misuse the database because a small change to add a column is a 9-month delay preceded by endless useless meetings, endless useless IT process.  
While waiting for IT to make a simple change, it's easier to simply enter the data into another unused field.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;IT creates its own nightmare by being unable to adapt to business change.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;Here's Dumb Rule 3 of success in IT: "&lt;b&gt;Control The Data&lt;/b&gt;".&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Rather than adapt to business changes, wrap them in process as a way to say "no".  Have elaborate processes.  Complex budget negotiations.  Project prioritization meetings.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Yes, the resources are finite.  More process, however, means fewer resources devoted to solving business problems.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;Code that does updates needs a lot of code review and testing no matter who wrote it.  If end-users have a DIY project, IT should be deeply involved in the QA process.  End users don't particularly like the kind of rigorous, expensive, sophisticated testing that IT imposes on itself.  That doesn't remove the need to engage the DIY folks in some kind of test plan.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Making Progress&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;IT needs to offer a menu of services including "secure, reliable and available data and processing resources".  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;IT needs to have a simple, easy-to-understand quality threshold for software.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Programming, software purchase, integration, installation and configuration and the like can be done by IT or by anyone else who meets the simple quality threshold.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A test plan.  Evidence that the tests are passed.  
Compatibility with some standards.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It should be very, very simple so that DIY can be made to work efficiently.  Since it's going to happen anyway, it's better to do it well than to fight against it with Dumb Rules.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-5120057498208455859?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/5120057498208455859/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/03/diy-and-dumb-rules-of-prevention.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/5120057498208455859'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/5120057498208455859'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/03/diy-and-dumb-rules-of-prevention.html' title='DIY and the Dumb Rules of Prevention'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-451626523715271072</id><published>2011-02-17T08:00:00.000-05:00</published><updated>2011-02-17T08:00:11.451-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='unit testing'/><category scheme='http://www.blogger.com/atom/ns#' term='tdd'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>TDD 
-- From SME Spreadsheet to TestCase to Code</title><content type='html'>In "&lt;a href="http://slott-softwarearchitect.blogspot.com/2011/02/unit-test-case-subject-matter-experts.html"&gt;Unit Test Case, Subject Matter Experts and Requirements&lt;/a&gt;" I suggested that it's often pretty easy to get a spreadsheet of fully worked-out examples from subject-matter experts.  Indeed, if you're following TDD, that spreadsheet of examples is solid gold.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Let's consider something relatively simple.  Let's say we're working on some fancy calculations.  Our users explain until they're blue in the face.  We take careful notes.  We &lt;i&gt;think&lt;/i&gt; we understand.   To confirm, we ask for a simple spreadsheet with inputs and outputs.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We get something like the following.  The latitudes and longitudes are inputs.  The ranges and bearings are outputs.  [The math can be seen at "&lt;a href="http://www.movable-type.co.uk/scripts/latlong.html"&gt;Calculate distance, bearing and more between Latitude/Longitude points&lt;/a&gt;".]&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;table border="1"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Latitude 1&lt;/th&gt;&lt;th&gt;Longitude 1&lt;/th&gt;&lt;th&gt;Latitude 2&lt;/th&gt;&lt;th&gt;Longitude 2&lt;/th&gt;&lt;th&gt;range&lt;/th&gt;&lt;th&gt;bearing&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt; &lt;tbody&gt;&lt;tr&gt;&lt;td&gt;50 21 50N&lt;/td&gt;&lt;td&gt;004 09 25W&lt;/td&gt;&lt;td&gt;42 21 04N&lt;/td&gt;&lt;td&gt;071 02 27W&lt;/td&gt;&lt;td&gt;2805 nm&lt;/td&gt;&lt;td&gt;260 07 38&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Only it has a few more rows with different examples.  Equator Crossing.  Prime Meridian Crossing.  
All the usual suspects.&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;TDD Means Making Test Cases&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Step one, then, is to parse the spreadsheet full of examples and create some domain-specific examples.  Since it's far, far easier to work with .CSV files, we'll presume that we can save the carefully-crafted spreadsheet as a simple .CSV with the columns shown above.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Step two will be to create working Python code from the domain-specific examples.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The creation of test cases is a matter of building some intermediate representation out of the spreadsheet.  This is where plenty of parsing and obscure special-case data handling may be necessary.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;code&gt;&lt;pre&gt;from __future__ import division&lt;br /&gt;import csv&lt;br /&gt;from collections import namedtuple&lt;br /&gt;import re&lt;br /&gt;&lt;br /&gt;latlon_pat= re.compile("(\d+)\s+(\d+)\s+(\d+)([NSWE])")&lt;br /&gt;def latlon( txt ):&lt;br /&gt;  match= latlon_pat.match( txt )&lt;br /&gt;  d, m, s, h = match.groups()&lt;br /&gt;  return float(d)+float(m)/60+float(s)/3600, h&lt;br /&gt;angle_pat= re.compile("(\d+)\s+(\d+)\s+(\d+)")&lt;br /&gt;def angle( txt ):&lt;br /&gt;  match= angle_pat.match( txt )&lt;br /&gt;  d, m, s = match.groups()&lt;br /&gt;  return float(d)+float(m)/60+float(s)/3600&lt;br /&gt;range_pat= re.compile("(\d+)\s*(\D+)")&lt;br /&gt;def range( txt ):&lt;br /&gt;  match= range_pat.match( txt )&lt;br /&gt;  d, units = match.groups()&lt;br /&gt;  return float(d), units&lt;br /&gt;&lt;br /&gt;RangeBearing= namedtuple("RangeBearing","lat1,lon1,lat2,lon2,rng,brg")&lt;br /&gt;&lt;br /&gt;def test_iter( filename="sample_data.csv" ):&lt;br /&gt;  with open(filename,"r") as source:&lt;br /&gt;      rdr= csv.DictReader( source )&lt;br 
/&gt;      for row in rdr:&lt;br /&gt;          print row&lt;br /&gt;          tc= RangeBearing(&lt;br /&gt;              latlon(row['Latitude 1']),  latlon(row['Longitude 1']),&lt;br /&gt;              latlon(row['Latitude 2']),  latlon(row['Longitude 2']),&lt;br /&gt;              range(row['range']),&lt;br /&gt;              angle(row['bearing'])&lt;br /&gt;              )&lt;br /&gt;          yield tc&lt;br /&gt;    &lt;br /&gt;for tc in test_iter():&lt;br /&gt;  print tc&lt;/pre&gt;&lt;/code&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This is long, but, it handles a lot of the formatting vagaries that users are prone to.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;From Abstract to TestCase&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Once we have a generator to build test cases as abstract examples, generating code for Java or Python or anything else is just a little template-fu.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;code&gt;&lt;pre&gt;   &lt;br /&gt;from string import Template&lt;br /&gt;testcase= Template("""&lt;br /&gt;class Test_${name}( unittest.TestCase ):&lt;br /&gt;   def setUp( self ):&lt;br /&gt;       self.p1= LatLon( lat=GlobeAngle(*$lat1), lon=GlobeAngle(*$lon1) )&lt;br /&gt;       self.p2= LatLon( lat=GlobeAngle(*$lat2), lon=GlobeAngle(*$lon2) )&lt;br /&gt;   def test_should_compute( self ):&lt;br /&gt;       d, brg = range_bearing( self.p1, self.p2, R=$units )&lt;br /&gt;       self.assertEquals( $dist, int(d) )&lt;br /&gt;       self.assertEquals( $brg, map(int,map(round,brg.deg)))&lt;br /&gt;""")&lt;br /&gt;for name, tc in enumerate( test_iter() ):&lt;br /&gt;   units= tc.rng[1].upper()&lt;br /&gt;   dist= tc.rng[0]&lt;br /&gt;   code= testcase.substitute( name=name, dist=dist, units=units, **tc._asdict()  )&lt;br /&gt;   print code&lt;/pre&gt;&lt;/code&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This shows a simple template with values filled in.  
Often, we have to generate a hair more than this.  A few imports and a "unittest.main()" are usually sufficient to transform a spreadsheet into unit tests that we can confidently use for test-driven development. &lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-451626523715271072?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/451626523715271072/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/02/tdd-from-sme-spreadsheet-to-testcase-to.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/451626523715271072'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/451626523715271072'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/02/tdd-from-sme-spreadsheet-to-testcase-to.html' title='TDD -- From SME Spreadsheet to TestCase to Code'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-5858933577557131937</id><published>2011-02-08T08:00:00.000-05:00</published><updated>2011-02-08T08:00:13.898-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='unit testing'/><category scheme='http://www.blogger.com/atom/ns#' term='tdd'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>Unit Test Case, Subject Matter Experts and 
Requirements</title><content type='html'>Here's a typical "I don't like TDD" question: the topic is "&lt;a href="http://programmers.stackexchange.com/questions/41773/does-tdd-really-work-for-complex-projects"&gt;Does TDD really work for complex projects?&lt;/a&gt;"&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Part of the question focused on the difficulty of preparing test cases that cover the requirements.  In particular, there was some hand-wringing over conflicting and contradictory requirements.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's what's worked for me.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Preparation&lt;/b&gt;.  The users provide the test cases as a spreadsheet showing the business rules.  The columns are attributes of some business document or case.  The rows are specific test cases.  Users can (and often will) do this at the drop of a hat.  Often, complex narrative requirements written by business analysts are based on such a spreadsheet.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This is remarkably easy for most users to produce.  It's just a spreadsheet (or multiple spreadsheets) with concrete examples.  It's often easier for users to make concrete examples than it is for them to write more general business rules.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Automated Test Case Construction&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's what can easily happen next.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Write a Python script to parse the spreadsheet and extract the cases.  There will be some ad-hoc rules, inconsistent test cases, and small technical problems.  
The spreadsheets will be formatted poorly or inconsistently.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Once the cases are parsed, it's easy to then create a &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;unittest.TestCase&lt;/span&gt; template of some kind.  Use &lt;a href="http://jinja.pocoo.org/"&gt;Jinja2&lt;/a&gt; or even Python's &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;string.Template&lt;/span&gt; class to rough out the template for the test case.  The specifics get filled into the unit test template.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The outline of test case construction is something like this.  Details vary with target language, test case design, and overall test case packaging approach.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;code&gt;&lt;pre&gt;t = SomeTemplate()&lt;br /&gt;for case_dict in testCaseParser( "some.xls" ):&lt;br /&gt;  code= t.render( **case_dict )&lt;br /&gt;  with open(testcaseName(**case_dict ),'w') as result:&lt;br /&gt;     result.write( code )&lt;/pre&gt;&lt;/code&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;You now have a fully-populated tree of unit test classes, modules and packages built from the end-user source documents.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;You have your tests.  You can start doing TDD.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Scenarios&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;One of the earliest problems you'll have is test case spreadsheets that are broken.  Wrong column titles, wrong formatting, something wrong.  Go meet with the user or expert that built the spreadsheet and get the thing straightened out.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Perhaps there's some business subtlety to this.  Or perhaps they're just careless.  
What's important is that the spreadsheets have to be parsed by simple scripts to create simple unit tests.  If you can't arrive at a workable solution, you have Big Issues and it's better to resolve them now than to try to press on to implementation with a user or SME that's uncooperative.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Another problem you'll have is that tests will be inconsistent.  This will be confusing at first because you've got code that passes one test and fails another, and you can't tell what the differences between the tests are.  You have to go meet with the users or SME's and resolve what the issue is.  Why are the tests inconsistent?  Often, attributes are missing from the spreadsheet -- attributes they each assumed -- and attributes you didn't have explicitly written down anywhere.  Other times there's confusion that needs to be resolved before any programming should begin.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;The Big Payoff&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;When the tests all pass, you're ready for performance and final acceptance testing.  Here's where TDD (and having the users own the test cases) pays off well.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Let's say we're running the final acceptance test cases and the users balk at some result.  "Can't be right" they say.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What do we do?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Actually, almost nothing.  Get the correct answer into a spreadsheet somewhere.  The test cases were incomplete.  This always happens.  Outside TDD, it's called "requirements problem" or "scope creep" or something else.  Inside TDD, it's called "test coverage" and some more test cases are required.  
Either way, test cases are always incomplete.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It may be that they're actually changing an earlier test case.  Users get examples wrong, too.  Either way (omission or error) we're just fixing the spreadsheets, regenerating the test cases, and starting up the TDD process with the revised suite of test cases.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Bug Fixing&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Interestingly, a bug fix after production roll-out is no different from an acceptance test problem.  Indeed it's no different from anything that's happened so far.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A user spots a bug.  They report it.  We ask for the concrete example that exemplifies the correct answer.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We regenerate the test cases from the spreadsheets and start doing development.  80% of the time, the new example is actually a change to an existing example.  And since the users built the example spreadsheets with the test data, they can maintain those spreadsheets to clarify the bugs.  20% of the time it's a new requirement.  
Either way, the test cases are as complete and consistent as the users are capable of producing.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-5858933577557131937?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/5858933577557131937/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/02/unit-test-case-subject-matter-experts.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/5858933577557131937'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/5858933577557131937'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/02/unit-test-case-subject-matter-experts.html' title='Unit Test Case, Subject Matter Experts and Requirements'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-2597081905915350883</id><published>2011-02-02T08:00:00.000-05:00</published><updated>2011-02-02T08:00:08.162-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='database design'/><category scheme='http://www.blogger.com/atom/ns#' term='SQL'/><category scheme='http://www.blogger.com/atom/ns#' term='noSQL'/><title type='text'>Escaping the Relational Schema Trap</title><content type='html'>We're struggling with our Relational Schema.  
We're not alone, of course; everyone struggles with the relational model.  The technology imposes difficult limitations and we work around them.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;There's kind of a 4-step process through which the relational schema erodes into irrelevance.  The concept of a schema is not irrelevant.  It's the rigid relational schema that's a problem.&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Many DBA's will say that the relational model is the ultimate in flexibility.  They're right, but they're missing the point.  The relational database clearly separates the physical storage from the logical model as seen in tables and columns.  It's flexible, but the presence of a rigid relational schema limits the pace of business change.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"Clearly," the DBA says, "you don't know how to use &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;ALTER&lt;/span&gt;."  I beg to differ.  I can use &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;ALTER&lt;/span&gt;; however, it doesn't permit the broad, sweeping scope of change that the business demands.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In order to attempt to match the pace of business change, we're using an ORM layer.  This allows us to fabricate methods and properties left, right and center.  We can tackle some pretty big problems with simple code changes.  This, however, is no longer helping.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Straws and Camels&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;When designing a database, we have to be cognizant of the nature and tempo of change.  In highly-regulated, very settled business applications (back-office accounting, for example) the data model is well known.  Changes are mostly distinctive reporting changes and the tempo is pretty lethargic.  It's the back office.  
Sorry, but innovation rarely happens there.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Each change is just another handful of straw thrown on the camel's back.  It happens fairly slowly.  And there aren't many surprises.  Hacks, workarounds and technical debt accumulate slowly.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In innovative, novel, experimental businesses, however, the nature and tempo are very different.  The changes are disruptive, "what are you saying?" kinds of changes.  They are "throw out the bathwater, the babies, the cribs and fire the nursemaid" kinds of changes.  The tempo is a semi-annual "reinvent everything."  Hacks, workarounds and technical debt get out of control suddenly.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;blockquote&gt;&lt;div&gt;&lt;b&gt;Important Lesson Learned&lt;/b&gt;.  When the customer misunderstands the offering and asks for something completely senseless, it's good to listen and try to build that -- even if it wasn't what you were offering.  In some cases, the original offering was too complex or contrived.  In other cases, the offering didn't create enough value.  But when you offer &lt;b&gt;[X]&lt;/b&gt; and the customer asks how much it will cost for &lt;b&gt;[Y]&lt;/b&gt;, you have disruptive, sudden, and surprising database changes.&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;div&gt;This is bales of hay thrown onto an unprepared camel.  Backs can get broken.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Coping&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;One common coping strategy is SQL &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;ALTER&lt;/span&gt; statements to fiddle with the logical model.  
This has to be coupled with &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;CREATE TABLE AS SELECT&lt;/span&gt; scripts to do open-heart surgery on the logical model, married with modified ORM definitions.  This requires some careful "schema versioning" techniques.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Another coping strategy is lots of "Expansion" columns in the tables.  These can be renamed and repurposed without physical storage changes.  The rows haven't physically changed, but the column name morphed from "EXPANSION_INT_01" to "Some_Real_Attribute".  This doesn't eliminate the need for &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;CREATE TABLE AS SELECT&lt;/span&gt; scripts to do open-heart surgery.  It still requires some careful "schema versioning" techniques to be sure that the ORM layer matches the logical schema.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A third -- and perhaps most popular -- coping strategy is manpower.  Just having dedicated DBA's and maintenance programmers is a common way to handle this.   Some folks object, saying that a large staff isn't a way to "cope with change" but is a basic "cost of doing business".&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It's false, by the way, to claim that dedicated DBA's are essential.  A solo developer can design and implement a database and application software with no help at all.  Indeed, in most organizations, developers design and build databases, then turn them over to DBA's for operational support.  If the nature of change is minor and the tempo of change is slow, a solo developer can deal perfectly well with the database.  A dedicated DBA is someone we &lt;b&gt;add&lt;/b&gt; when the developer gets swamped by too much change.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;(Some DBA's like to claim that the developers never get normalization or indexing correct.  
I counter with the observation that some DBA's don't get this right, either.  DBA's aren't &lt;b&gt;essential&lt;/b&gt;.  They're a popular way to cope with the nature and tempo of change.)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In the ORM world, there are schema migration toolkits.  Projects like &lt;a href="https://storm.canonical.com/"&gt;Storm&lt;/a&gt;, this &lt;a href="http://code.djangoproject.com/wiki/SchemaEvolution"&gt;list&lt;/a&gt; for Django, &lt;a href="http://www.embarcadero.co.uk/products/db-change-manager-xe"&gt;Embarcadero Change Manager&lt;/a&gt; for Oracle, and numerous others attempt to support the schema evolution and change management problem.  All of this is a clever way to cope with a problem inherent in our choice of technology.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Chaos Theory&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Rather than invent clever coping mechanisms, let's take a step back.  If we're inventing technology to work around the fixed relational schema, it might be time to rethink the relational schema.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"Oh noes," DBA's cry, "we must have a fixed logical model otherwise chaos ensues."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Really?  How come we're always altering that schema?  How come we're always adding tables and restructuring the tables?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"Oh that?  That's 'controlled change'," the DBA responds.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;No, that's slow chaos.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's how it plays out.  We have a disruptive change.  We negotiate with the DBA's to restructure the database.  And the test database.  And the QA database.  We do the development database without any help from the DBA's.  We fix the ORM layers.  
We unit test the changes.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Then we plan and coordinate the production rollout of this change with the DBA's.  Note.  We already made the change in development.  We're not allowed to make the change in production.  The DBA's then suggest design alternatives.  Normalization isn't "right".  Or there are physical changes that need to be declared in the table definitions.  We redo the development database.  And the ORM layer.  And rerun the unit tests.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Because the production database couldn't be touched -- and we had paying customers -- we copied production data into a development database and started doing "production" in development.  Now that we're about to make the official production change, we have two databases.  The official database content is out-of-date.  The development database is a mixture of live production and test data.  Sigh. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Rethinking Schema&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If the schema is a problem, perhaps we can live without it.  Enter NoSQL databases.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's how you start down the slippery slope.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Phase I&lt;/b&gt;.&lt;b&gt; &lt;/b&gt;You need a fairly radical database change.  Rather than wait weeks for the DBA's, you ask for a single "BLOB" column.  You take the extra data elements for the radical change, JSON encode them, and store the JSON representation in the BLOB field.  Now you have a "subschema" buried inside a single BLOB column.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Since this is a simple &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;ALTER&lt;/span&gt;, the DBA's will do it without a lot of negotiation or delay.  
You have a hybrid database with a mixture of schema and noSQL.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Phase II&lt;/b&gt;.  You need an even more radical change.  Rather than wait weeks for the DBA's, you ask for a few tables that have just a primary key and a BLOB column.    You've basically invented a document-structured database inside SQL, bypassing the SQL schema entirely.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Phase III&lt;/b&gt;.  While waiting for the Phase II changes to be implemented, you convert the customer data from their obscure, stupid format into a simple sequential file of JSON documents and write your own simple map-reduce algorithms in Python.  Sure, performance is poor, but you're up and running without any database overheads.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Phase IV&lt;/b&gt;.  Start looking for alternatives.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;a href="http://www.mongodb.org/display/DOCS/MongoDB,+CouchDB,+MySQL+Compare+Grid"&gt;MongoDB, CouchDB, MySQL Compare Grid&lt;/a&gt; &lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This MongoDB looks really nice.  &lt;a href="http://api.mongodb.org/python/1.7%2B/tools.html#framework-tools"&gt;PyMongo&lt;/a&gt; offers lots of hints and guidance.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;At least one person is looking at &lt;a href="https://github.com/vpulim/mango"&gt;mango&lt;/a&gt;, a MongoDB database adapter for Django.   For us, this isn't the best idea.  We use OpenAM for identity management, so our Users and Sessions are simply cloned from OpenAM by an &lt;a href="http://docs.djangoproject.com/en/dev/ref/authbackends/"&gt;authentication backend&lt;/a&gt; that gets the user from OpenAM.  
SQLite works fine for this.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We think we can use Django's ORM and a relational database for User and Session.  For everything else, we need to look closely at MongoDB.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Wins and Losses&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The big win is the ability to handle disruptive change a little bit more gracefully.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The big loss in switching away from the Django ORM is that we lose the built-in admin pages.  We have to build admin Forms and view functions.  While this is a bit of a burden, we've already customized every model form heavily.  Switching from ModelForm to Form and adding the missing fields isn't much additional work. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The biggest issue with document-oriented data models is assuring that the documents comply with some essential or core schema.  Schemas are inescapable.  The question is more a matter of how the schema limits change.  Having a Django Form to validate JSON documents for the "essential" features is far more flexible than having a Django Model class and a mapping to a relational database.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Schema migration becomes a non-issue until we have to expand the essential schema, which changes the validation rules, and may render old documents retroactively invalid.  
This is not a new problem -- Relational folks cope with this, also -- but if it's the &lt;i&gt;only&lt;/i&gt; problem, then we may have streamlined the process of making disruptive business changes.&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-2597081905915350883?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/2597081905915350883/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/02/escaping-relational-schema-trap.html#comment-form' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/2597081905915350883'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/2597081905915350883'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/02/escaping-relational-schema-trap.html' title='Escaping the Relational Schema Trap'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-6803591033474605403</id><published>2011-01-27T08:00:00.000-05:00</published><updated>2011-01-27T08:00:18.159-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='database design'/><category scheme='http://www.blogger.com/atom/ns#' term='architecture'/><title type='text'>FAERIE DUST™</title><content type='html'>&lt;div&gt;Here's how to recognize a &lt;b&gt;Faerie Dust&lt;/b&gt; 
request:&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;We have identified a problem. It can be with almost anything: scalability, reliability, auditability, any Quality Measure. &lt;/li&gt;&lt;li&gt;We're pursuing a specific technology. Typically, something that has the lowest impact on our architecture.&lt;/li&gt;&lt;li&gt;We can't address anything other than this specific technology variation -- we can't change the application software or buy hardware.&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;&lt;div&gt;Once we're in the &lt;b&gt;Faerie Dust&lt;/b&gt; realm, what can we do?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Laughing doesn't help. They have a serious problem, they need a solution. The fact that they won't address the cause isn't completely relevant -- we have to work on the denial, anger, negotiation, depression cycle first. Hopefully skipping past the anger, or assuring the anger is directed elsewhere.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Helping doesn't help. If we join the quest for their Faerie Dust, what will we accomplish? We'll burn billable hours to -- eventually -- reach an equivocal non-solution with a complex write-up and recommendations that won't be implemented.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Not helping doesn't help. If we obstinately refuse to join the quest for the Faerie Dust... well... then we've done nothing. We haven't advanced their understanding of their problem.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What's left? Is there a middle road that allows us to join the Faerie Dust quest, but still point out the side roads, other monsters and other treasures along the way?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Perhaps there is, but it would require a kind of saintly patient persistence. We would have to start with an enumeration of problem causes, prioritize them, and then focus on their selected bit of Faerie Dust. 
My idea is that enumerating the possible causes allows us to identify the missed opportunities, and the possible magnitude of the improvement from fixing something essential (algorithm or data structure) instead of throwing up window-dressing to cover problems in something inessential (reducing the time required for a table scan).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Example&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's a concrete example of Faerie Dust.&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;Pick a data model that doesn't fit the use cases; for example, lump many discrete details into a single text field that has "rich semantic content".  Work around this mistake by using wild-card matches.&lt;/li&gt;&lt;li&gt;Complain about performance and dig into nuanced details of the LIKE clause and full-text search.  Spend lots of study time on LIKE clause processing and how to improve performance.&lt;/li&gt;&lt;li&gt;Refuse to discuss the actual use case or the mismatch between data structures and requirements.&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;The design didn't match the use cases.  
Faerie Dust won't help.&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-6803591033474605403?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/6803591033474605403/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/01/faerie-dust.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6803591033474605403'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6803591033474605403'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/01/faerie-dust.html' title='FAERIE DUST™'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-5707519304316477050</id><published>2011-01-25T08:00:00.001-05:00</published><updated>2011-01-25T08:00:03.783-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='database design'/><category scheme='http://www.blogger.com/atom/ns#' term='architecture'/><category scheme='http://www.blogger.com/atom/ns#' term='SQL'/><title type='text'>Wild-Card (LIKE-clause) searches are slow.  What to do?</title><content type='html'>Patient: "Doctor, doctor, it hurts when I do this."&lt;div&gt;Doctor: "Then don't do that."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I got an email with hundreds of words of content.  
This part made sense: "...doing wild card searches using Oracle's database engine and are wondering why is it so slow and how do they make it go faster."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The rest made very little sense at all.  The programmer in question immediately dove into nuances of indexing, Oracle pattern matching, Oracle Text Query and other technical questions.  The entire focus was on the technical ins-and-outs.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Not a single word on &lt;b&gt;why&lt;/b&gt; wildcards were even being used in the first place.  Wildcards appear to solve a business problem; the business problem was never mentioned.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Use Case for Wildcards&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;After some back-and-forth, the use case emerged.   We'll address it below.  Essentially, the invoices have names (really) that have "rich semantic content".  These invoice names have the form "{customer} {time period} {offering}".  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Apparently, the use case is "slice-and-dice" queries.  All invoices for a given customer; all invoices in a given time period; all invoices for a given offering; various combinations. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Really.  Rather than provide discrete dimensions and use a star schema, they've (a) combined all attributes into a single free-text field and (b) used wild-card searches and now (c) want to complain about it.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We'll return to this use case below.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Basic Rules&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here are the two rules.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Wild Cards Are The Last Resort For Human-Friendly Search.  
&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Outside&lt;/b&gt;&lt;b&gt; Human-Friendly Search&lt;/b&gt;&lt;b&gt;, Wild Cards Are Useless.&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Let's look at rule 1: &lt;b&gt;Wild Cards Are The Last Resort For Human-Friendly Search&lt;/b&gt;.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;When a person enters a search string on a web page, we have two choices.  &lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;Trust them to enter the exact field as it appears in the database&lt;/li&gt;&lt;li&gt;Presume that people are fallible and cannot be trusted to enter the exact field. &lt;/li&gt;&lt;/ol&gt;&lt;div&gt;In case #1 (exact match) we might be using an account number, shipping number, an invoice number or some kind of surrogate key.  In this case, we do simple equality checks.  If the user can't get it right, bummer.  In many cases, this is appropriate to prevent snooping.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In case #2 (partial match), we're forced to use some kind of SQL LIKE clause for the human-friendly search.  We have several implementation choices, some in the database, some out of the database.  Some in-the-database solutions benefit from clever indexing.  Many in-the-database solutions are pretty slow.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Yes, an out-of-the-database solution may actually be faster.  Until we benchmark, we can't know.  There's no trivial rule that says the database always does search faster.  For real speed, we may have to resort to a hybrid solution.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Search Optimization&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We might create a small RESTful server for our searchable text fields.  This is a cache; the server should handle CRUD rules to assure cache coherence.  
This search server can use a Regular Expression engine, or perhaps compute &lt;a href="http://en.wikipedia.org/wiki/Levenshtein_distance"&gt;Levenshtein distances&lt;/a&gt; or whatever makes sense to optimize user-oriented search.   &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If we're searching in larger chunks of text, we might want to use a commercial full-text search.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What's essential about this plan is that we're looking at &lt;i&gt;application-specific optimizations&lt;/i&gt;.  People need flexibility for &lt;i&gt;specific&lt;/i&gt; reasons.  It's important to look at the &lt;i&gt;actual&lt;/i&gt; use cases where a person cannot make an exact match lookup.  What problems do they have?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;An application may have to deal with customer names.  These are often difficult to spell consistently.  (Is it "AT&amp;amp;T" or "ATT"?)  For this kind of thing Levenshtein Distance might make more sense than wild-card searches.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;An application may have to deal with time periods.  "2010",  "2Q 2010", "July 2010", etc.  This is best handled by decomposing time periods into discrete fields and doing appropriate exact match on the specific, relevant fields.  The issue is that there are a lot of formulations and some text parsing can be better than a form with a million drop-downs.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;An application may have to deal with oddly-named offerings.  Marketing calls it one thing.  Sales folks call it another.  The customer's invoice may call it a third, and the help desk may not use any of those phrases.  This may benefit from wild-cards. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Note that we're looking at the &lt;i&gt;business&lt;/i&gt; issues.  
Not the technology issues.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;b&gt;Design Errors&lt;/b&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;/div&gt;&lt;div&gt;The proper use for LIKE is only to optimize the human-friendly search.  Nothing else.  Which brings us to rule 2, &lt;b&gt;Outside Human-Friendly Search, Wild Cards are Useless&lt;/b&gt;.&lt;/div&gt;&lt;div&gt;&lt;b&gt;&lt;span class="Apple-style-span" style="font-weight: normal;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;div&gt;Outside human search, every wild-card in a SQL statement indicates a serious database design error.  Serious?  Error?  Yes.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;LIKE clauses outside human search indicate a failure to create a design in first normal form (1NF).  A field which is used in a LIKE clause has multiple parts, and should have been decomposed into pieces.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Decomposing a multi-part attribute isn't always trivial.  There are two cases.&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;Simple, regular format or punctuation.  For example, SSN, US Phones or ZIP codes: 123-45-6789 or (123)555-1234 or 12345-1234.  &lt;/li&gt;&lt;li&gt;Complex, irregular format or punctuation.  In this case, we have disjoint subtypes in a single table.  Most manufacturing part numbers suffer from this.&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;In case 1, we have two choices:  fully decompose or denormalize.  In case 2, we can only denormalize because the rules are irregular.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The decomposition solution does not have to lead to a hideous user interface.  We can have a web page with a single text field for phone numbers.  We can parse that string and decompose the phone number into area code, exchange and number for purposes of database storage.  
We don't have to thoughtlessly force the users to decompose a field that they don't see as being in three parts.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The denormalization solution means that we have to do some calculation when we accept the input value.  We save the full field, plus we extract the various sub-fields based on whatever hellish, complex rules we're faced with.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Implementation Choices&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Whenever we have a single text field with "rich semantic content" (i.e., combines multiple disjoint attributes like customer, time period and offering) what we're seeing is a clever way to push database design onto the users.  The expectation is that IT will (1) understand the use cases, (2) provide a proper design and (3) optimize performance around that design.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A big text field and wild-card search (and the attendant email traffic) indicates an explicit unwillingness to discuss the real use cases, unwillingness to do design, and a lame hope that somehow wild-card searches can magically be made faster through magical indexing or other super-natural techniques.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The "rich semantic content" field can be decomposed one of two ways.&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;In the GUI.  Add drop-downs so users pick the customer, time period, and product offering information.  &lt;/li&gt;&lt;li&gt;In the Application.   Parse the big text field into smaller text fields that don't require wild-card search.&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;div&gt;There isn't any magic.  
If wild-card searches are too slow, they have to be replaced.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Benefits?&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The benefit of decomposing (or denormalizing) a complex field is that we can eliminate LIKE processing and wild-cards.   Instead of "&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;LONG_TEXT_FIELD LIKE '%2Q 2010%'&lt;/span&gt;", we can do "&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;DATE.QUARTER=2 AND DATE.YEAR=2010&lt;/span&gt;".&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;All the technical folderol related to indexing and full-text search and database regular expression engines goes right out the window.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The cost is that we have to "wrap" the INSERT and UPDATE processing in a class definition that does the denormalization.  That's what a data model layer is for: these kinds of business rules.  The insert/update cost, BTW, will be microscopic compared to the number of SELECTs.  
The extra time spent at INSERT will be handsomely amortized over all the simplified SELECT operations.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-5707519304316477050?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/5707519304316477050/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/01/wild-card-like-clause-searches-are-slow.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/5707519304316477050'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/5707519304316477050'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/01/wild-card-like-clause-searches-are-slow.html' title='Wild-Card (LIKE-clause) searches are slow.  What to do?'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-1361050471201578048</id><published>2011-01-24T08:00:00.000-05:00</published><updated>2011-01-24T08:00:04.185-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>Python Jobs</title><content type='html'>Looked interesting enough for someone to email it to me.  Still can't figure out why they sent it. 
Posting this feels like advertising, so perhaps I should charge them a promotional fee.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Web Services - Mobile Developer&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"A minimum of 2 years of Web Service development, preferably using an established Python-based framework (Django or Pylons).&lt;/div&gt;&lt;div&gt;Demonstrated proficiency with Python is a must."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a href="http://www.maxhire.net/cp/?E85B6B361D43515B7E571E2877501F6A02627C49"&gt;http://www.maxhire.net/cp/?E85B6B361D43515B7E571E2877501F6A02627C49&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Maybe they sent it as confirmation that Python is somehow a "real" technology.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-1361050471201578048?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/1361050471201578048/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/01/python-jobs.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/1361050471201578048'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/1361050471201578048'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/01/python-jobs.html' title='Python Jobs'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-8464515279417316211</id><published>2011-01-11T08:00:00.000-05:00</published><updated>2011-01-11T08:00:00.090-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='REST'/><category scheme='http://www.blogger.com/atom/ns#' term='architecture'/><title type='text'>Client-Server Partitioning</title><content type='html'>I have slowly grown to love RESTful web services.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I was asked about a nearly empty section in the code repository labeled "Java client".  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"Yes," I said, "it's a place-holder for a Java package that includes classes to wrap our RESTful web services."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"Really?" I was asked, "Why?  We use FLEX for the client, not Java Applets."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"Today we use FLEX.  In the past, we weren't sure.  We have a complete Python client library.  Indeed, the original concept was to support our customer's building their own web site to use our web services.  The Java package would plug into a J2EE web app.  No one wanted that, so we built FLEX clients instead."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Then they said that the super-clean separation between all these clients and the RESTful server was taking "flexibility to a whole new level."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I pointed them to this.  &lt;a href="http://www.cs.utexas.edu/~EWD/transcriptions/EWD03xx/EWD340.html"&gt;EWD340: The Humble Programmer&lt;/a&gt;.  
It has this killer quote:&lt;/div&gt;&lt;div&gt;&lt;blockquote&gt;The competent programmer is fully aware of the strictly limited size of his own skull; therefore he approaches the programming task in full humility, and among other things he avoids clever tricks like the plague.&lt;/blockquote&gt;&lt;/div&gt;&lt;div&gt;I find that this really helps keep the focus on simplicity.  I suppose that it leads to flexibility, but that's not the real point.  The real point was simplicity.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-8464515279417316211?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/8464515279417316211/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/01/client-server-partitioning.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/8464515279417316211'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/8464515279417316211'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/01/client-server-partitioning.html' title='Client-Server Partitioning'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-684838042749034546</id><published>2011-01-06T08:00:00.001-05:00</published><updated>2011-01-06T08:00:11.278-05:00</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='java'/><category scheme='http://www.blogger.com/atom/ns#' term='PHP'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='numerosity'/><title type='text'>Java PHP Python -- Which is "Faster In General"?</title><content type='html'>Sigh.  What a difficult question.  There are numerous incarnations on StackOverflow.  All nearly unanswerable.  The worst part is questions where they add the "in general" qualifier.  Which is "faster in general" is essentially impossible to answer.  And yet, the question persists.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;There are three rules for figuring out which is faster.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;And there are three significant problems that make these rules inescapable.&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Rule One.  Languages don't have speeds.  Implementations have speeds.  &lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Info on &lt;a href="http://en.wikipedia.org/wiki/Benchmark_(computing)"&gt;benchmarking&lt;/a&gt;.  The idea of a benchmark is to have a single, standard suite of source code, which can be used to compare compilers, run-time libraries or hardware.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Having a standard suite of source is essential because it provides a basis for comparison.  A single benchmark source is the fixed reference.  We don't compare the top of the Empire State Building with the top of the Stratosphere in Las Vegas without specifying whether we care about height above the ground or height above sea level.  
There has to be some fixed point of comparison, some common attribute, or the measurements devolve into mere numerosity.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Once we have a basis for comparison (one single body of source code), the other attributes are degrees of freedom; the measurements we make will include the other attributes.  This will allow a rational statement of what the experimental results were.   We can then compare these various free attributes against each other.  For details look at something like the &lt;a href="http://www.cs.cmu.edu/~jch/java/microbench.html"&gt;Java Micro Benchmark&lt;/a&gt;.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Rule Two.  Statistics Aren't a Panacea.&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The reason there's no "in general" comparison among languages is that there are too many degrees of freedom to make any kind of rational comparison.  We can make irrational comparisons, but that's the trap of numerosity -- throwing numbers around.  1250 vs. 1149, 1300 vs. 3177.  What do they mean?  Height above ground?  Height above sea level?  What's being measured?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;There's a huge problem with claiming that statistics will yield an answer to which language implementation is faster "in general".  We need some population that we can sample and measure.  &lt;b&gt;Problem 1&lt;/b&gt;: What population are we measuring?  It can't be "programs": we can't compare &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;grep&lt;/span&gt; against Apache &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;httpd&lt;/span&gt;.  Those two programs have almost no common features.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What makes the population of programs difficult to define is the language differences.  
If we're trying to compare PHP, Python and Java, we need to find a program which somehow -- magically -- is common across all three languages.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;The Basis For Comparison&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Finding common programs degenerates into &lt;b&gt;Problem 2&lt;/b&gt;: what programs could be comparable?  For example, we have the Tomcat application, written in Java.  We wouldn't want to write Tomcat in Python (since Tomcat is a Java Servlet container).  We could probably write something Tomcat-like in PHP, but why try?  So we can't just grab programs randomly.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;At this point, we devolve to subjectivity.  We need to find some kind of problem domain in which these languages overlap.  This gets icky.  Clearly, big servers aren't a good problem domain.  Almost as clearly, command-line applications aren't the best idea.  PHP does run from the command-line, but it's always contrived-looking because it doesn't exploit PHP's strengths.  So we wind up looking at web applications because that's where PHP excels.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Web applications?  Subjective?  Correct.  PHP is a language plus a web application framework bundled together.  Java and Python -- by themselves -- are just languages and require a framework.  Which Java (and Python) framework is &lt;i&gt;identical&lt;/i&gt; to PHP's framework?  Spring, Struts, Django, Pylons?   None of these reflects a code base that's even remotely similar.  Maybe Java JSP is similar enough to PHP.  For Python there are several implementations.  Sigh.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Crappy Program Problem&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We can't easily compare programs because we're really comparing implementations of an algorithm.   
This leads to &lt;b&gt;Problem 3&lt;/b&gt;: we picked a poor algorithm or did a lousy job of implementing it in the target language.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In order to be "comparable", we don't want to exploit highly-optimized or unique features of a language.  So we try to be generic.  This is fraught with risks.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For example, Java and PHP don't have list comprehensions.  Do we forbid them from our Python benchmark?  In Python, everything is a reference; assignment never copies a value.  If we pick an algorithm implementation which depends on copying objects, Java may appear to excel.  If we pick an algorithm implementation which depends on sharing references, Python may appear to excel.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Somehow we have to get past language differences and programmer mistakes.  What to do?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Synthetic Benchmarks&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Since we can't easily find comparable programs -- as whole programs -- we're left with the need to create some kind of benchmark based on language primitives.  Statements or expressions or something.  We can try to follow the &lt;a href="http://en.wikipedia.org/wiki/Whetstone_(benchmark)"&gt;Whetstone&lt;/a&gt;/&lt;a href="http://en.wikipedia.org/wiki/Dhrystone"&gt;Dhrystone&lt;/a&gt; approach of analyzing a bunch of programs to find the primitive constructs and their relative frequency.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's the plan.  We'll take 100 PHP programs, 100 Java programs and 100 Python programs and analyze them to find the relative frequency of different kinds of statements.  What then?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The goal is to create one source that reflects the statements actually used in the 300 programs we analyzed.  
In three different languages.  Hmmm...  Okay.  We'll need to create a magical mapping among the statement constructs in the three languages.  Well, that's hard.  The languages aren't terribly comparable.  A Python expression using a List Comprehension is the same thing as a multi-statement Java loop.  Rats.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The languages aren't very comparable at the statement level at all.  And if we force them to be comparable, we're not comparing real programs, but an artificial mapping.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Virtual Machine Benchmarks&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Since we can't compare the languages at the program level or the statement level, what's left?  Clearly, the underlying interpreter is what we care about.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We're really comparing the Java Virtual Machine, the PHP interpreter and the Python interpreter.    That's what we really care about.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;And life is simple because we can compare Java, The &lt;a href="https://www.projectzero.org/php/"&gt;Project Zero PHP Interpreter&lt;/a&gt; based on the JVM and &lt;a href="http://www.jython.org/"&gt;Jython&lt;/a&gt;.  We can look at "compiled" PHP, Java Class Files and Python .PYC files to find the VM primitives used by each language and then -- what?  Compare the run-time of the various VM primitives?  No, that's silly, since the run-times are all JVM run-times.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;What We're Left With&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The very best we can do is to compare the statistical distribution of the VM instructions created by Java, PHP or Jython compilers.   We could note that maybe PHP or Python uses too many "slow" VM instructions, where Java used more "fast" VM instructions.   
That would be an "in general" comparison.  Right?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;See?  You can &lt;a href="http://www.howtomeasureanything.com/"&gt;measure anything&lt;/a&gt;.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In this case, the compiler itself is a degree of freedom.  Sadly, we're not comparing languages "in general".  We're comparing the bytecodes created by various compilers.  We're actually comparing compilers and compiler optimizations of the bytecode.  Sigh.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;That's not what we were hoping for.  We were hoping for some kind of "in general" comparison of the language, not the JVM compiler.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Java has pretty sophisticated optimization.  Python, however, eschews optimization.  PHP has its own weird issues.  See this paper from Rob Nicholson from the Project Zero project on how to &lt;a href="http://wiki.jvmlangsummit.com/pdf/24_Nicholson_p8.pdf"&gt;implement PHP in the JVM&lt;/a&gt;.  PHP doesn't fit the JVM as well as Python does.  So there's a weird bias.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Rule Three.  Benchmarking Is Hard.&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;There is no "in general" comparison of programming languages.  All that we can do is benchmark something specific.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It works like this.&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;Stop quibbling about language performance "in general".&lt;/li&gt;&lt;li&gt;Find something specific and concrete we plan to implement. &lt;/li&gt;&lt;li&gt;Actually write the performance-critical piece in Java, PHP, Python, Ruby, whatever.  Yes.  Build it several times.  Really.  We don't want to use "language-independent" or "common" features.  We want to optimize ruthlessly: use the language the way it was meant to be used, and 
use the various unique-to-the-language features correctly and completely.&lt;/li&gt;&lt;li&gt;Actually run the performance-critical piece to get actual timings.&lt;/li&gt;&lt;li&gt;Since run-time libraries and hardware are degrees of freedom, we have to use multiple run-time libraries, multiple compiler optimization settings and multiple hardware configurations to make a proper decision on which language to use for our specific problem.&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;&lt;div&gt;Now we know something about our specific problem domain and the available languages.  That's the best we can do.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We can only compare a specific problem, with a specific algorithm.  That's the basis for all benchmark comparisons.  Since each implementation was well-done and properly optimized, the degree of freedom is the language -- and the run-time implementation of that language -- and the selected OS and hardware.  &lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-684838042749034546?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/684838042749034546/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/01/java-php-python-which-is-faster-in.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/684838042749034546'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/684838042749034546'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/01/java-php-python-which-is-faster-in.html' title='Java PHP Python -- Which is 
&quot;Faster In General&quot;?'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-3820990360710711788</id><published>2011-01-06T07:50:00.002-05:00</published><updated>2011-01-06T07:56:01.529-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>Search For Expertise</title><content type='html'>&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, 'Nimbus Sans L', sans-serif; font-size: 13px; line-height: 15px; "&gt;I'm looking for an unbiased Python expert to help with a book I'm working on.  We need "&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: arial, sans-serif; font-size: 13px; border-collapse: collapse; color: rgb(51, 51, 51); "&gt;an unbiased python expert with a keen eye for detail."&lt;/span&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, 'Nimbus Sans L', sans-serif; font-size: 13px; line-height: 15px; "&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, 'Nimbus Sans L', sans-serif; font-size: 13px; line-height: 15px; "&gt;The role is technical reviewer.  I've never done this before, but it appears that the tech reviewer is a paid position somewhere in a publication pipeline along with copy editing and other production steps.  
It doesn't sound like a full-time job, since the chapters trickle through the pipeline slowly.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, 'Nimbus Sans L', sans-serif; font-size: 13px; line-height: 15px; "&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, 'Nimbus Sans L', sans-serif; font-size: 13px; line-height: 15px; "&gt;I'm guessing that you'd get to correct the misstatements and run all the code examples.&lt;/span&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-3820990360710711788?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/3820990360710711788/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/01/search-for-expertise.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/3820990360710711788'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/3820990360710711788'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/01/search-for-expertise.html' title='Search For Expertise'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-6001780703559266878</id><published>2011-01-04T08:00:00.002-05:00</published><updated>2011-01-04T08:00:13.720-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='unit testing'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>Integration Testing, unittest and Python 2.7</title><content type='html'>Many folks use Python's &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;unittest&lt;/span&gt; module for integration testing. It sometimes leads to whining and hand-wringing, but it is very effective.&lt;br /&gt;&lt;br /&gt;Ordinary "unit" tests use mocks and focus on a class or a module more-or-less in isolation.  The purists say "complete isolation".  But that's sometimes unrealistic.  A class that's part of a &lt;b&gt;State&lt;/b&gt; design pattern or  a &lt;b&gt;Strategy &lt;/b&gt;design pattern is often so trivial that "pure" unit tests aren't really very enlightening.  Using the stateful class along with its state class hierarchy is usually far more interesting and helpful.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I argue that it's still "unit" testing because parts of the application have been extracted from their processing context.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;OS and Module Mocks&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Some folks will hand-wring over the use of OS APIs or other library APIs in a unit test.  They feel that the OS APIs should be completely mocked in order to call it a "unit" test.  
In some cases, simplistic mocks can be created, but for most use cases, the mock grows to hellish complexity.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For example, we have a large number of tests which depend on Python's &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;csv&lt;/span&gt; module.   While we could mock &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;csv&lt;/span&gt;&lt;span class="Apple-style-span"&gt; &lt;/span&gt;to produce row after row of mocked data, it seems much simpler to trust that our infrastructure works, use &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;csv&lt;/span&gt;&lt;span class="Apple-style-span"&gt; &lt;/span&gt;and simply provide files of test data in CSV format.  The file is part of the test fixture and is locked up in the source code repository.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Is this "unit" testing when we're integrated with some of the underlying modules?  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We don't separately test &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;csv&lt;/span&gt;.  We simply trust that library modules have their own tests.  If we're willing to trust the already-supplied tests for &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;csv&lt;/span&gt;, why not use it in our tests?  Why mock something we've decided to trust?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;[For that matter, we also trust Python, the OS and the &lt;span class="Apple-style-span"&gt;unittest &lt;/span&gt;module itself.  
Why draw lines between Python, &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;unittest&lt;/span&gt;&lt;span class="Apple-style-span"&gt; &lt;/span&gt;and &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;csv&lt;/span&gt;?]&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;ORM and RDBMS Mocks&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A similar kind of worry comes with an ORM layer.  Somehow, the "trust" factor starts to break down here.  Folks want to mock the ORM layer.  Or -- worse -- they want to try and mock the RDBMS layer so that they're testing the application and ORM layer.  This is another artificial distinction between stuff we'll trust (the RDBMS layer) and stuff we feel we must write additional tests for (the ORM layer).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It's so much nicer to adopt the Django philosophy of building a test database as part of the test fixture.  The ORM and RDBMS are integrated into the test itself.  The thing that's "mocked" is the data which gets pre-loaded into the database.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Outside of Django, this can be a bit complex to set up properly.  You need to create a temporary database, execute the SQL DDL, load the fixture data.  This is an annoying bit of code to write the first time.  However, it has handsome rewards because the "unit" testing includes the ORM and RDBMS at a higher level where it's easier to work with.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Integration Testing&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We use the &lt;span class="Apple-style-span"&gt;unittest &lt;/span&gt;module to do integration testing also.  In Python 2.6, this involved a fair amount of work.  We had to start our RESTful server, run the unit test, and then kill the server.  
Because of the way the &lt;span class="Apple-style-span"&gt;subprocess &lt;/span&gt;module works, we can't be completely sure the server is running, so we simply cheat and use a &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;time.sleep(12)&lt;/span&gt; to wait for the DB to be built and loaded.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Python 2.7 adds module-level test fixtures.  It checks for &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;setUpModule&lt;/span&gt;&lt;span class="Apple-style-span"&gt; &lt;/span&gt;and &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;tearDownModule&lt;/span&gt; functions.  I've been gleefully revising all our unit tests to make ready for this.  Our previous testing needed pervasive (but minor) rewrites to refactor the database setup and tear down and provide proper names.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Once that's in place, our big test shell script can be simplified down to a little Python &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;unittest&lt;/span&gt; module.  This module will build a &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;unittest.TestSuite&lt;/span&gt; from all the other test modules (there are dozens) and simply execute the suite as a single, integrated whole.  
&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-6001780703559266878?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/6001780703559266878/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/01/integration-testing-unittest-and-python.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6001780703559266878'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6001780703559266878'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2011/01/integration-testing-unittest-and-python.html' title='Integration Testing, unittest and Python 2.7'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-2057628327623730990</id><published>2010-12-30T13:29:00.002-05:00</published><updated>2010-12-30T13:31:55.217-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='literate programming'/><category scheme='http://www.blogger.com/atom/ns#' term='pyWeb'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>pyWeb Literate Programming Tool | Download pyWeb Literate Programming Tool software for free at SourceForge.net</title><content type='html'>I've (finally) updated the pyWeb Literate Programming Tool.&lt;div&gt;&lt;br 
/&gt;&lt;/div&gt;&lt;div&gt;There were feature requests and bug reports.  Much to do.  Sadly, I'm really slow at doing it.&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a href="https://sourceforge.net/projects/pywebtool/?sms_ss=blogger&amp;amp;at_xt=4d1ccf8b638ad5eb%2C0"&gt;pyWeb Literate Programming Tool | Download pyWeb Literate Programming Tool software for free at SourceForge.net&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-2057628327623730990?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/2057628327623730990/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/12/pyweb-literate-programming-tool.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/2057628327623730990'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/2057628327623730990'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/12/pyweb-literate-programming-tool.html' title='pyWeb Literate Programming Tool | Download pyWeb Literate Programming Tool software for free at SourceForge.net'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-2135070313911642840</id><published>2010-12-30T08:00:00.000-05:00</published><updated>2010-12-30T08:00:02.477-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='C'/><category scheme='http://www.blogger.com/atom/ns#' term='java'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='Programming Languages'/><title type='text'>Top Language Skills</title><content type='html'>Check out this item on eWeek: &lt;a href="http://www.eweek.com/c/a/Application-Development/Java-C-C-Top-18-Programming-Languages-for-2011-480790/?kc=EWWHNEMNL12272010STR1"&gt;Java, C, C++: Top Programming Languages for 2011 - Application Development - News &amp;amp; Reviews - eWeek.com&lt;/a&gt;.  &lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The presentation starts with Java, C, C++, C# -- not surprising.  These are clearly the most popular programming languages.  These seem to be the first choice made by many organizations.  &lt;/div&gt;&lt;div&gt;In some cases, it's also the last choice.  Many places are simply "All C#" or "All Java" without any further thought.  This parallels the "All COBOL" mentality that was so pervasive when I started my career.  The "All Singing-All Dancing-All One Language" folks find the most shattering disruptions when their business is eclipsed by competitors with language and platform as a technical edge.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The next tier of languages starts with JavaScript, which is expected.  Just about every web site in common use has some JavaScript somewhere.  
Browsers being what they are, there's really no viable alternative.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Weirdly, Perl is 6th.  I say weirdly because the &lt;a href="http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html"&gt;TIOBE Programming Community Index&lt;/a&gt; puts Perl much further down the popularity list.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;PHP is next.  Not surprising.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Visual Basic weighs in above Python.  Being above Python is less weird than seeing Perl in 6th place.  This position is closer to the TIOBE index.  It is distressing to think that VB is still so wildly popular.  I'm not sure what VB's strong suit is.  C# seems to have every possible advantage over VB.  Yet, there it is.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Python and Ruby are the next two.  Again, this is more-or-less in the order I expected to see them.   This is the second tier of languages: really popular, but not in the same league as Java or one of the innumerable C variants.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;After this, they list Objective-C as number 11.  This language is tied to Apple's iOS and MacOS platforms, so its popularity (like C# and VB) is driven in part by platform popularity.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Third Tier&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Once we get past the top 10 Java/C/C++/C#/Objective-C and PHP/Python/Perl/Ruby/JavaScript tier, we get into a third realm of languages that are less popular, but still garnering a large community of users.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;ActionScript.  A little bit surprising.  But -- really -- it fills the same client-side niche as JavaScript, so this makes sense.  
Further, almost all ActionScript-powered pages will also have a little bit of JavaScript to help launch things smoothly.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Now we're into interesting -- "perhaps I should learn this next" -- languages: Groovy, Go, Scala, Erlang, Clojure and F#.  Notable by their absence are Haskell, Lua and Lisp.  These seem like languages to learn in order to grab the good ideas that make them both popular and distinctive from Java or Python.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-2135070313911642840?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/2135070313911642840/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/12/top-language-skills_30.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/2135070313911642840'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/2135070313911642840'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/12/top-language-skills_30.html' title='Top Language Skills'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-7696441794024773171</id><published>2010-12-28T08:00:00.000-05:00</published><updated>2010-12-28T08:00:04.583-05:00</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>Amazing Speedup</title><content type='html'>A library had unit tests that ran for almost 600 seconds.  Two small changes dropped the run time to 26 seconds.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I was amazed.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Step 1.  I turned on &lt;span class="Apple-style-span"&gt;cProfile&lt;/span&gt;.  I added two functions to the slowest unit test module.&lt;/div&gt;&lt;div&gt;&lt;code&gt;&lt;pre&gt;def profile():&lt;br /&gt;    import cProfile&lt;br /&gt;    cProfile.run( 'main()', 'the_slow_module.prof' )&lt;br /&gt;    report()&lt;br /&gt;&lt;br /&gt;def report():&lt;br /&gt;    import pstats&lt;br /&gt;    p = pstats.Stats( 'the_slow_module.prof' )&lt;br /&gt;    p.sort_stats('time').print_callees(24)&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;/div&gt;&lt;div&gt;Now I can add profiling or simply review the report.  Looking at the "callees" provided some hints as to why a particular method was so slow.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Step 2.  I replaced &lt;span class="Apple-style-span"&gt;ElementTree &lt;/span&gt;with &lt;span class="Apple-style-span"&gt;&lt;span class="Apple-style-span"&gt;cElementTree&lt;/span&gt; &lt;/span&gt;(duh.)  Everyone &lt;i&gt;should &lt;/i&gt;know this.  I didn't realize how much this mattered.  The trick is to note how much time was spent doing XML parsing.  In the case of this unit test suite, it was a LOT of time.  In the case of the overall application that uses this library, that won't be true.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Step 3.  The slowest method was assembling a list.  It did a lot of list.append() and list.__len__().  
It looked approximately like the following.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;code&gt;&lt;pre&gt;def something( self ):&lt;br /&gt;    result= []&lt;br /&gt;    for index, value in some_source:&lt;br /&gt;        while len(result)+1 != index:&lt;br /&gt;            result.append( None )&lt;br /&gt;        result.append( SomeClass( value ) )&lt;br /&gt;    return result&lt;/pre&gt;&lt;/code&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This is easily replaced by a generator.  The API changes, so every use of this method may need to be modified to use the generator instead of the list object.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;code&gt;&lt;pre&gt;def something_iter( self ):&lt;br /&gt;    counter= 0&lt;br /&gt;    for index, value in some_source:&lt;br /&gt;        while counter+1 != index:&lt;br /&gt;            yield None&lt;br /&gt;            counter += 1&lt;br /&gt;        yield SomeClass( value )&lt;br /&gt;        counter += 1&lt;/pre&gt;&lt;/code&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The generator was significantly faster than list assembly.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Two minor code changes and a significant speed-up.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-7696441794024773171?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/7696441794024773171/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/12/amazing-speedup.html#comment-form' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/7696441794024773171'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/684183198890094283/posts/default/7696441794024773171'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/12/amazing-speedup.html' title='Amazing Speedup'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-6832080079949392696</id><published>2010-12-23T08:00:00.001-05:00</published><updated>2010-12-27T11:48:26.661-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='anti-if'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>The Anti-IF Campaign</title><content type='html'>Check this out: &lt;a href="http://www.antiifcampaign.com/"&gt;http://www.antiifcampaign.com/&lt;/a&gt;.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I'm totally in favor of reducing complexity.  I've seen too many places where a &lt;b&gt;Strategy&lt;/b&gt; or some other kind of &lt;b&gt;Delegation&lt;/b&gt; design pattern should have been used.  Instead a cluster of if-statements was used.  Sometimes these if-statements suffer copy-and-paste repetition because someone didn't recognize the design pattern.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What's important is that the &lt;b&gt;if&lt;/b&gt; statement -- in general -- isn't the issue.  
The anti-if folks are simply demanding that folks don't use &lt;b&gt;if&lt;/b&gt; as a stand-in for proper polymorphism.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Related Issues&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Related to abuse of the &lt;b&gt;if&lt;/b&gt; statement is abuse of the &lt;b&gt;else&lt;/b&gt; clause.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;My pet peeve is code like this.&lt;/div&gt;&lt;div&gt;&lt;code&gt;&lt;pre&gt;if condition1:&lt;br /&gt;   work&lt;br /&gt;elif condition2:&lt;br /&gt;   work&lt;br /&gt;elif condition3:&lt;br /&gt;   work&lt;br /&gt;else:&lt;br /&gt;   what condition applies here?&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;When the various conditions share common variables it can be very difficult to deduce the condition that applies for the &lt;b&gt;else&lt;/b&gt; clause.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;My suggestion is to &lt;b&gt;Avoid Else&lt;/b&gt;.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Write it like this.&lt;/div&gt;&lt;div&gt;&lt;code&gt;&lt;pre&gt;if condition1:&lt;br /&gt;   work&lt;br /&gt;elif condition2:&lt;br /&gt;   work&lt;br /&gt;elif condition3:&lt;br /&gt;   work&lt;br /&gt;elif not (condition1 or condition2 or condition3):&lt;br /&gt;   work&lt;br /&gt;else:&lt;br /&gt;   raise AssertionError( "Oops.  Design Error.  Sorry" )&lt;/pre&gt;&lt;/code&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Then you'll know when you've screwed up.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;[&lt;b&gt;Update&lt;/b&gt;]&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Using an &lt;b&gt;assert &lt;/b&gt;coupled with an &lt;b&gt;else &lt;/b&gt;clause is a kind of code-golf optimization that doesn't seem to help much.  An &lt;b&gt;elif &lt;/b&gt;will have the same conditional expression as the &lt;b&gt;assert&lt;/b&gt; would have.   
But the comment did lead to rewriting this to use &lt;span class="Apple-style-span" &gt;AssertionError &lt;/span&gt;instead of vague, generic &lt;span class="Apple-style-span" &gt;Exception&lt;/span&gt;.&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-6832080079949392696?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/6832080079949392696/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/12/anti-if-campaign.html#comment-form' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6832080079949392696'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6832080079949392696'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/12/anti-if-campaign.html' title='The Anti-IF Campaign'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-7468617985208420937</id><published>2010-12-14T08:00:00.000-05:00</published><updated>2010-12-14T08:00:04.583-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Django'/><category scheme='http://www.blogger.com/atom/ns#' term='template'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='architecture'/><title type='text'>Code Base Fragmentation -- 
Again</title><content type='html'>Check this out: "&lt;a href="http://pydanny.blogspot.com/2010/12/stupid-template-languages.html"&gt;Stupid Template Languages&lt;/a&gt;".  &lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Love this: "The biggest annoyance I have with smart template languages (Mako, Genshi, Jinja2, PHP, Perl, ColdFusion, etc) is that you have the capability to mix core business logic with your end views, hence violating the rules of Model-View-Controller architecture."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Yes, too much power in the template leads to code base fragmentation:  critical information is not in the applications, but is pushed into the presentation.  This also happens with stored procedures and triggers.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I love the questions on Stack Overflow (like &lt;a href="http://stackoverflow.com/questions/2115869/calling-python-function-in-django-template"&gt;this&lt;/a&gt; one) asking how to do something super-sophisticated in the Django Template language.  And the answer is often "Don't.  That's what view functions are for."   
&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-7468617985208420937?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/7468617985208420937/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/12/code-base-fragmentation-again.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/7468617985208420937'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/7468617985208420937'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/12/code-base-fragmentation-again.html' title='Code Base Fragmentation -- Again'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-3069273180232471323</id><published>2010-12-09T08:00:00.000-05:00</published><updated>2010-12-09T08:00:02.815-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='API Design'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='architecture'/><title type='text'>The Wrapper vs. Library vs. Aspect Problem</title><content type='html'>Imagine that we've got a collection of applications used by customers to provide data, a collection of applications we use to collect data from vendors.  
We've got a third collection of analytical tools.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Currently, they share a common database, but the focus, use cases, and interfaces are different.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Okay so far?   Three closely-related groups or families of applications.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We need to introduce a new cross-cutting capability.  Let's imagine that it's something central like using &lt;a href="http://celeryproject.org/"&gt;celery&lt;/a&gt; to manage long-running batch jobs.  Clearly, we don't want to just hack celery features into all three families of applications.  Do we?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Choices&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It appears that we have three choices.&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;A "wrapper" application that unifies all the application families and provides a new central application.  Responsibilities shift to the new application.&lt;/li&gt;&lt;li&gt;A site-specific library that layers some common features so that our various families of applications can be more consistent.  This involves less of a responsibility shift.&lt;/li&gt;&lt;li&gt;An "aspect" via Aspect-Oriented programming techniques.  Perhaps some additional decorators added to the various applications to make them use the new functionality in a consistent way.&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;&lt;b&gt;Lessons Learned&lt;/b&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Adding a new application to be an overall wrapper turned out to be a bad idea.  After implementing it, it was difficult to extend.  We had two dimensions of extension.&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;The workflows in the "wrapper" application needed constant tweaking as the other applications evolved.  
Every time we wanted to add a step, we had to update the real application and also update the wrapper.  Python has a lot of introspection, but these aren't technical changes, these are user-visible workflow changes. &lt;/li&gt;&lt;li&gt;Introducing new data types and file formats was painful.  The responsibility for this is effectively split between the wrapper and the underlying applications.  The wrapper merely serves to dilute the responsibilities.&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;&lt;b&gt;Libraries/Aspects&lt;/b&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It appears that new common features are almost always new aspects of existing applications.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What makes this realization painful is the process of retrofitting a supporting library into multiple, existing applications.  It seems like a lot of cut-and-paste to add the new &lt;code&gt;import&lt;/code&gt; statements, add the new decorators and lines of code.  However, it's a &lt;i&gt;pervasive&lt;/i&gt; change.  The point is to add the common decorator in all the right places.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Trying to "finesse" a pervasive change by introducing a higher-level wrapper isn't a very good idea.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A pervasive change is simply a lot of changes and regression tests.  
Okay, I'm over it.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-3069273180232471323?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/3069273180232471323/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/12/wrapper-vs-library-vs-aspect-problem.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/3069273180232471323'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/3069273180232471323'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/12/wrapper-vs-library-vs-aspect-problem.html' title='The Wrapper vs. Library vs. Aspect Problem'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-7224072269345262121</id><published>2010-12-07T08:00:00.001-05:00</published><updated>2010-12-07T08:00:00.598-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='building skills books'/><title type='text'>Intuition and Experience</title><content type='html'>First, read &lt;a href="http://www.cs.utexas.edu/users/EWD/ewd08xx/EWD800.PDF"&gt;EWD800&lt;/a&gt;.  
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It has harsh things to say about relying on &lt;i&gt;intuition&lt;/i&gt; in programming.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Stack Overflow is full of questions where someone takes their experience with one language and applies it incorrectly and inappropriately to another language.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I get email, periodically, also on this subject.  I got one recently on the question of "cast", "coercion" and "conversion" which I found incomprehensible for a long time.  I had to reread EWD800 to realize that someone was relying on some sort of vague intuition; it appears that they were desperate to map Java (or C++) concepts onto Python.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Casting&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In my Python 2.6 book, I use the word "cast" exactly twice.  In the same paragraph.  Here it is.&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;blockquote&gt;&lt;div&gt;This also means the "casting" an object to match the declared type&lt;/div&gt;&lt;div&gt;of a variable isn't meaningful in Python.  You don't use C++ or Java-style&lt;/div&gt;&lt;div&gt;casting.&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;div&gt;I thought that would be enough information to close the subject.  I guess not.  It appears that some folks have some intuition about type casting that they need to see reflected in other languages, no matter how inappropriate the concept is.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The email asked for "a nice summary with a simple specific example to hit the point home."&lt;br /&gt;It's quite hard to provide an example of something that doesn't exist.  But, I guess, intuition provides a strong incentive to see things which aren't there.   I'm not sure how to word it more strongly or clearly.  
I hate to devolve into blow-by-blow comparison between languages because there are concepts that don't map.  I'll work on being more forceful on casting.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Coercion&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The words coercion (and coerce) occur more often, since they're sensible Python concepts.  After all, Python 2 has formal type coercion rules.  See "&lt;a href="http://docs.python.org/reference/datamodel.html#coercion-rules"&gt;Coercion Rules&lt;/a&gt;".  I guess my summary ("Section 3.4.8 of the Python Language Reference  covers this in more detail; along with the caveat that the Python 2 rules have gotten too complex.") wasn't detailed or explicit enough.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The relevant quote from the Language manual is this: "As the language has evolved, the coercion rules have become hard to document precisely; documenting what one version of one particular implementation does is undesirable. Instead, here are some informal guidelines regarding coercion. In Python 3.0, coercion will not be supported."   &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I guess I could provide examples of coercion.  However, the fact that it is going to be expunged from the language seems to indicate that it isn't deeply relevant.  It appears that some readers have an intuition about coercion that requires some kind of additional details.  I guess I have to include the entire quote to dissuade people from relying on their intuition regarding coercion.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Further, the request for "a nice summary with a simple specific example to hit the point home" doesn't fit well with something that -- in the long run -- is going to be removed.  Maybe I'm wrong, but omitting examples entirely seemed to hit the point home.  
&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Conversion&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Conversion gets its own section, since it's sensible in a Python context.  I kind of thought that a whole section on conversion would cement the concepts.  Indeed, there are (IMO) too many examples of conversions in the conversion section.  But I guess that showing all of the numeric conversions somehow wasn't enough.  I have certainly failed at least one reader.  However, I can't imagine what more could be helpful, since it is -- essentially -- an exhaustive enumeration of all conversions for all built-in numeric types. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What I'm guessing is that (a) there's some lurking intuition and (b) Python doesn't match that intuition.  Hence the question -- in spite of exhaustively enumerating the conversions.  I'm not sure what more can be done to make the concept clear.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;It appears that all those examples weren't "nice", "simple" or "specific" enough.  Okay.  
I'll work on that.&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-7224072269345262121?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/7224072269345262121/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/12/intuition-and-experience.html#comment-form' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/7224072269345262121'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/7224072269345262121'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/12/intuition-and-experience.html' title='Intuition and Experience'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-4418758013666730567</id><published>2010-12-02T08:00:00.000-05:00</published><updated>2010-12-02T08:00:11.256-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='open source'/><category scheme='http://www.blogger.com/atom/ns#' term='agile'/><category scheme='http://www.blogger.com/atom/ns#' term='software process improvement'/><title type='text'>More Open Source and More Agile News</title><content type='html'>ComputerWorld, November 22, 2010, has this: "&lt;a href="http://www.computerworld.com/s/article/9197420/Open_source_grows_up"&gt;Open Source Grows Up&lt;/a&gt;".  
The news of the weird is "It's clear that open-source software has moved beyond the zealotry phase."   I wasn't aware this phase existed. I hope to see the project plan with "zealotry" in it.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The &lt;i&gt;real&lt;/i&gt; news is "More than two-thirds (69%) of the respondents said they expect to increase their investments in open source."  That's cool.   &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Be sure to read the sidebar "Many Enterprises Aren't Giving Back."  There's still a lot of concern over intellectual property.  I've seen a lot of corporate software -- it's not that good.  Most companies that are wringing their hands over losing control of their trade secrets should really be wringing their hands because their in-house software won't measure up to open-source standards.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I like this other quote: 'Five years ago, the South Carolina government was "considering writing a policy to prohibit or at least 'control' open source".' I like the "Must Control Open Source" feeling that IT leadership has.  Without this mysterious "control", the organization could be swamped by software it didn't write.  How's that different from being swamped by software products that involve contracts and fees?  And require &lt;a href="http://en.wikipedia.org/wiki/Patch_Tuesday"&gt;Patch Tuesday&lt;/a&gt;?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Agility&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;SD Times has two articles on Agile methods.  Both on the front page of a print publication.  That's how you know the technique has "arrived".  
&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;First, there's "&lt;a href="http://www.sdtimes.com/VERSIONONE_SURVEY_FINDS_AGILE_KNOWLEDGE_AND_USE_ON_THE_RISE/By_Katie_Serignese/About_AGILE_and_VERSIONONE/34936"&gt;VersionOne survey finds agile knowledge and use on the rise&lt;/a&gt;".  My favorite quote: "Interestingly, management support, the ability to change organizational culture and general resistance to change, remained at the forefronts of participants’ minds when indicating barriers to further agile adoption."  I like the management barriers.  I like it when management tries to exert more 'control' over a process (like software creation) that's so poorly understood.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's the companion piece, "&lt;a href="http://www.sdtimes.com/link/34886"&gt;For agile success, leaders must let teams loose&lt;/a&gt;".  This is all good advice.  Particularly, this: '"It’s hard to not command and control, but leadership is not about managing work. It’s about creating a capable organization that can manage work," [Rick Simmons] added.'  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If you're micro-managing, you're not building an organization.  Excellent advice.  However, tell that to the financial control crowd.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Budgets and "Control"&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Finally, be sure to read this by Frank Hayes in ComputerWorld: "&lt;a href="http://www.computerworld.com/s/article/352786/Big_Projects_Done_Small"&gt;Big Projects, Done Small&lt;/a&gt;".  Here are the relevant quotes: "The logical conclusion: We should break up all IT projects into sub-million-dollar pieces."  "The political reality: Everybody wants multimillion-dollar behemoths."  
"...huge projects get big political support."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In short, Agile is the right thing to do until you're trying to get approval.  Bottom line: use Agile methods.  But for purposes of pandering to executives who want to see large numbers with lots of zeroes, it's often necessary to write giant project "plans" that you don't actually use.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Go ahead, write waterfall plans.  Don't feel guilty or conflicted.  Some folks won't catch up with Agility because they think "Control" is better.  Pander to them.  It's okay.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-4418758013666730567?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/4418758013666730567/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/12/more-open-source-and-more-agile-news.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/4418758013666730567'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/4418758013666730567'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/12/more-open-source-and-more-agile-news.html' title='More Open Source and More Agile News'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-6383451867098837228</id><published>2010-11-30T08:00:00.000-05:00</published><updated>2010-11-30T08:00:05.411-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='knowledge capture'/><category scheme='http://www.blogger.com/atom/ns#' term='analysis'/><category scheme='http://www.blogger.com/atom/ns#' term='reverse engineering'/><title type='text'>Questions, or, How to Ask For Help</title><content type='html'>Half the fun on Stack Overflow is the endless use of closed-ended questions.  "Can I do this in Python?" being so common and so hilarious.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The answer is "Yes."  You &lt;i&gt;can&lt;/i&gt; do it.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Perhaps that's not the question they really meant to ask.&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;See "&lt;a href="http://polaris.gseis.ucla.edu/jrichardson/dis220/openclosed.htm"&gt;Open versus Closed Ended Questions&lt;/a&gt;" for a great list of examples.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Closed-Ended Questions have short answers, essentially yes or no.  &lt;a href="http://www.mediacollege.com/journalism/interviews/leading-questions.html"&gt;Leading Questions&lt;/a&gt; and presuming questions are common variations on this theme.  A closed-ended question is sometimes called "dichotomous" because there are only two choices.  
They can also be called "saturated", possibly because all the possible answers are laid out in the question.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Asking Questions&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The most important part of asking questions is to go through a few steps of preparation.&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;&lt;b&gt;Search&lt;/b&gt;.  Use Google, use the Stack Overflow search.  A huge number of people seem to bang questions into Stack Overflow without taking the time to see if they've been asked (and answered) already.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Define Your Goal&lt;/b&gt;.  Seriously.  Write down your objective.  In words.  Be sure the goal includes an active-voice verb -- something you want to be able to &lt;b&gt;do&lt;/b&gt;.  If you want to be able to write code, write down the words "I want to write code for [X]".  If you want to be able to tell the difference between two nearly identical things, write down the words "I want to distinguish [Y] from [Z]".  When in doubt, use active voice verbs to write down the thing you want to do.  Focus on actions you want to take. &lt;/li&gt;&lt;li&gt;&lt;b&gt;Frame Your Question&lt;/b&gt;.  Rewrite your goal into a sentence by changing the fewest words.  90% of the time, you'll switch "I want to" to "How do I".  The rest of the time, you'll have to think for a moment because your goal didn't make sense.  If your goal is not an active-voice verb phrase (something you want to &lt;b&gt;do&lt;/b&gt;) then you'll have trouble with the rewrite.&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;In some cases, folks will skip one or more steps.  Hilarity Ensues.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Leading/Presuming Questions&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Another form of closed-ended question is the veiled complaint.  
"Why doesn't Python do [X] the way Perl/PHP/Haskell/Java/C# does it?"&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Essentially, this is "my favorite other language has a feature Python is missing."  The question boils down to, "Why is [Y] not like [Z]?" Often it's qualified by some feature, but the question is the same: "Regarding [X], why is Python not like language [Z]?"&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The answer is "Because they're different."  The two languages are not the same, that's why there's a difference.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This leads to "probing" questions of no real value.  "Why did Python designers decide to leave out [X]" and other variants on this theme.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If the answer was "Because they're evil gnomes" what does it matter?   If the answer was "because it's inefficient" how does that help?  Feature [X] is still missing, and all the "why?" questions won't really help add it back into the language.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It's possible that there's a legitimate question hidden under the invective.  It might be "How do I implement [X] in Python?  For examples, see Perl/PHP/Haskell/Java/C#."  Notice that this question is transformed into an active-voice verb: "implement".&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If we look at the three-step question approach above, there's no active-voice verb behind a "why question".   What you "know" isn't really all that easy to provide answers for.  Knowledge is simply hard to provide.  Questions about what you want to &lt;b&gt;do&lt;/b&gt; are much, much easier to answer.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Probing/Confirming Questions&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Another category is the "questions" that post a pile of details looking for confirmation.  
There are three common variations.&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;&lt;b&gt;tl;dr&lt;/b&gt;.  The wealth of detail was overwhelming.  I'm a big fan of the "detail beat-down".  It seems like some folks don't need to summarize.   There appear to be people with massive brains that don't need models, abstractions or summaries, but are perfectly capable of coping with endless details.  It would be helpful if these folks could "write down" to those of us with small brains who need summaries. &lt;/li&gt;&lt;li&gt;No question at all, or the question is a closed-ended "Do you agree?"  An answer of "No." is probably not what they wanted.  But what can you do?  That's all they asked for.&lt;/li&gt;&lt;li&gt;Sometimes the question is "Any comments?"  This often stems from having no clear goal.  Generally, if you've done a lot of research and you simply want confirmation, there's no question there.  If you've got doubts, that means you need to do something to correct the problems.  &lt;/li&gt;&lt;/ul&gt;&lt;div&gt;Here's what is really important with &lt;b&gt;tl;dr&lt;/b&gt; questions: What do you want to do? &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;80% of the time, it's "Fix my big, complex tl;dr proposal to correct problem [X]."    [X] could be "security" or "deadlock" or "patent infringement" or "cost overrun" or "testability".&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's how to adjust this question from something difficult to answer to something good.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;You want to know if your tl;dr proposal has problem [X].   You're really looking for &lt;i&gt;confirmation&lt;/i&gt; that your tl;dr proposal is free from problem [X].   This is something you want to know -- but knowledge is not a great goal.  
It's too hard to guess what you don't know; lots of answers can provide &lt;i&gt;almost&lt;/i&gt; the right information.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Reframe your goal: drop knowledge and switch to action.  What do you want to &lt;b&gt;do&lt;/b&gt;?  You want to show that your tl;dr proposal is free from problem [X].  So ask that: "&lt;b&gt;How do I show my tl;dr proposal is free from problem [X]?&lt;/b&gt;"&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Once you write that down, you now have to focus your tl;dr proposal to get the answer to this question.  In many cases, you can pare things down to some relevant parts that can be shown to be free from problem [X].  In most cases, you'll uncover the problem on your own.  In other cases, you've got a good open-ended question to start a useful conversation that will give you something you can do.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-6383451867098837228?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/6383451867098837228/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/11/questions-or-how-to-ask-for-help.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6383451867098837228'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6383451867098837228'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/11/questions-or-how-to-ask-for-help.html' title='Questions, or, How to Ask For Help'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-252473287125596075</id><published>2010-11-23T08:00:00.001-05:00</published><updated>2010-11-23T08:00:12.176-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='open source'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>Open-Source, moving from "when" to "how"</title><content type='html'>Interesting item in the November 1 eWeek: "&lt;a href="http://www.eweek.com/c/a/Linux-and-Open-Source/Open-Source-Software-in-the-Enterprise-177312/"&gt;Open-Source Software in the Enterprise&lt;/a&gt;".&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's the key quote: "rather than asking if or when, organizations are increasingly focusing on how".&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Interestingly, the article then goes on to talk about licensing and intellectual property management.  I suppose those count, but they're fringe issues, only relevant to lawyers.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here are the two real issues:&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;Configuration Management&lt;/li&gt;&lt;li&gt;Quality Assurance&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;&lt;div&gt;Many organizations do things so poorly that open source software is unusable.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Configuration Management&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Many organizations have non-existent or very primitive CM.  They may have some source code control and some change management.  But the configuration of the test and production technology stacks is absolutely mystifying.  
No one can positively say what versions of what products are in production or in test.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The funniest conversations center on the interconnectedness of open source projects.  You don't just take a library and plug it in.  It's not like neatly-stacked laundry, all washed and folded and ready to be used.  Open Source software is more like a dryer full of a tangled collection of stuff that's tied in knots and suffers from major static cling.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"How do we upgrade [X]"?  You don't simply replace a component.  You create a new tech stack with the upgraded [X] and all of the stuff that's knotted together with [X].&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Changing from Python 2.5 to 2.6 changes any binary-compiled libraries like PIL or MySQL_python, mod_wsgi, etc.  These, in turn, may require OS library upgrades.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A tech stack must be a hallowed thing.  Someone must actively manage change to be sure the stacks are complete and consistent across the enterprise.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Quality Assurance&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;Many organizations have very weak QA.  They have an organization, but it has no authority and developers are permitted to run rough-shod over QA any time they use the magic words "the users demand it".  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The truly funny conversations center on how the organization can be sure that open source software works, or is free of hidden malware.  I've been asked how a client can vet an open source package to be sure that it is malware free.   
As if the client's Windows PC's are pristine works of art and the Apache POI project is just a logic bomb.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The idea that you might do acceptance testing on open source software always seems foreign to everyone involved.  You test your in-house software.  Why not test the downloaded software?  Indeed, why not test commercial software for which you pay fees?  Why does QA only seem to apply to in-house software?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Goals vs. Directions&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I think one other thing that's endlessly confusing is "Architecture is a Direction not a Goal."  I get the feeling that many organizations strive for a crazy level of stability where everything is fixed, unchanging and completely static (except for patches.)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The idea that we have systems on a new tech stack and systems on an old tech stack seems to lead to angry words and stalled projects.    However, there's really no sensible alternative.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We have tech stack [X.1], [X.2] and [X.3] running in production.  We have [X.4] in final quality assurance testing.  We have [X.5] in development.   The legacy servers running version 1 won't be upgraded, they'll be retired.  The legacy servers running version 2 may be upgraded, depending on the value of the new features vs. the cost of upgrading.  The data in the version 3 servers will be migrated to the version 4, and the old servers retired.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It can be complex.  The architecture is a direction in which most (but not all) servers are heading.  The architecture changes, and some servers catch up to the golden ideal and some servers never catch up.  
Sometimes the upgrade doesn't create enough value.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;These are "how" questions that are more important than studying the various licensing provisions.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-252473287125596075?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/252473287125596075/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/11/open-source-moving-from-when-to-how.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/252473287125596075'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/252473287125596075'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/11/open-source-moving-from-when-to-how.html' title='Open-Source, moving from &quot;when&quot; to &quot;how&quot;'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-6682537650271639810</id><published>2010-11-18T08:00:00.000-05:00</published><updated>2010-11-18T08:00:10.860-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='software process improvement'/><title type='text'>Software Patents</title><content type='html'>Here's an interesting news item: "&lt;a 
href="http://www.nytimes.com/external/gigaom/2010/11/12/12gigaom-red-hats-secret-patent-deal-and-the-fate-of-jboss-98607.html?ref=technology"&gt;Red Hat’s Secret Patent Deal and the Fate of JBoss Developers&lt;/a&gt;".&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's an ancient -- but still relevant -- piece from Tim O'Reilly: "&lt;a href="http://tim.oreilly.com/patents/index.csp"&gt;Software and Business Method Patents&lt;/a&gt;". &lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's a great article in Slate on the consequences of software patents.  "&lt;a href="http://www.slate.com/id/2135559/"&gt;Weapons of Business Destruction: How a tiny little 'patent troll' got BlackBerry in a headlock&lt;/a&gt;".&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The biggest issue with software patents is always the "non-obvious" issue.  Generally, this can be debated, so a prior art review is far more valuable.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;See "&lt;a href="http://spectrum.ieee.org/computing/software/peer-review-starts-for-software-patent-applications/0"&gt;Peer Review Starts for Software Patent Applications: IEEE Spectrum talks to the founder to Peer-to-Patent Beth Noveck&lt;/a&gt;".  This is where the rubber hits the road.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;To participate, see &lt;a href="http://www.peertopatent.org/"&gt;Peer To Patent&lt;/a&gt;.   Locate prior art and make patent trolls get real jobs.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Also, see this: "&lt;a href="http://www.economist.com/node/9719020?story_id=9719020"&gt;A patent improvement: Intellectual property: A new scheme will solicit comments via the internet to improve the vetting of patent applications&lt;/a&gt;". 
&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-6682537650271639810?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/6682537650271639810/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/11/software-patents.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6682537650271639810'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/6682537650271639810'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/11/software-patents.html' title='Software Patents'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-2707636007403811385</id><published>2010-11-11T08:00:00.000-05:00</published><updated>2010-11-11T08:00:13.080-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='map-reduce'/><category scheme='http://www.blogger.com/atom/ns#' term='RDBMS'/><category scheme='http://www.blogger.com/atom/ns#' term='hadoop'/><title type='text'>Hadoop and SQL/Relational Hegemony</title><content type='html'>Here's a nice article on why Facebook, Yahoo and eBay use Hadoop: "&lt;a href="http://www.forbes.com/2010/11/05/facebook-yahoo-ebay-technology-hadoop.html"&gt;Asking Any Question Of All Your Data&lt;/a&gt;".&lt;div&gt;&lt;br 
/&gt;&lt;/div&gt;&lt;div&gt;The article has one tiny element of pandering to the SQL hegemonists.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Yes, it sounds like a conspiracy theory, but it seems like there really are folks who will tell you that the relational database is effectively perfect for all data processing and should not be questioned.  To bolster their point, they often have to conflate all data processing into one amorphous void.  Relational transactions aren't central to all processing, just certain elements of data processing.  There, I said it.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's the pandering quote: "But this only works if the underlying data storage and compute engine is powerful enough to operate on a large dataset in a time-efficient manner".&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What?  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Is he saying that relational databases do not impose the same constraint?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Clearly, the RDBMS has the same "catch".  The relational database only works if "...the underlying data storage and compute engine is powerful enough to operate on a large dataset in a time-efficient manner."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Pandering?  Really?&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's why it seems like the article is pandering.  Because it worked.  It totally appealed to the target audience.  I saw this piece because a DBA -- a card-carrying member of the SQL Hegemony cabal -- sent me the link, and highlighted two things.  
The DBA highlighted the "powerful enough" quote.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;As if to say, "See, it won't happen any time soon, Hadoop is too resource intensive to displace the RDBMS."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Which appears to assume that the RDBMS isn't resource intensive.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Further, the DBA had to add the following.  "The other catch which is not stated is the skill level required of the people doing the work."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;As if to say, "It won't happen any time soon, ordinary programmers can't understand it."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Which appears to assume that ordinary programmers totally understand SQL and the relational model.  If they did understand SQL and the relational model perfectly, why would we have DBA's?  Why would we have performance tuning?  Why would we have DBA's adjusting normalization to correct application design problems?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Weaknesses&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So the weaknesses of Hadoop are that it (a) demands resources and (b) requires specialized skills.  Okay.  But isn't that the exact same weakness as the relational database?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Which causes me to ask why an article like this has to pander to the SQL cabal by suggesting that Hadoop requires a big compute engine?  
Or is this just my own conspiracy theory?&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-2707636007403811385?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/2707636007403811385/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/11/hadoop-and-sqlrelational-hegemony.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/2707636007403811385'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/2707636007403811385'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/11/hadoop-and-sqlrelational-hegemony.html' title='Hadoop and SQL/Relational Hegemony'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-5863277264739578581</id><published>2010-11-09T08:00:00.000-05:00</published><updated>2010-11-09T08:00:05.320-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='ETL'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>Data Mapping and Conversion Tools -- Sigh</title><content type='html'>Yes, ETL is interesting and important.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;But creating a home-brewed data mapping and conversion tool isn't interesting or important.  Indeed, it's just an attractive nuisance.  
Sure, it's fun, but it isn't valuable work.  The world doesn't need another ETL tool.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The core problem is talking management (and other developers) into a change of course.  How do we stop development of Yet Another ETL Tool (YAETLT)? &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;First, there are products like &lt;a href="http://www.talend.com/index.php"&gt;Talend&lt;/a&gt;, &lt;a href="http://www.cloveretl.com/products/community-edition"&gt;CloverETL&lt;/a&gt; and &lt;a href="http://www.pentaho.com/products/data_integration/"&gt;Pentaho&lt;/a&gt; open source data integration.   Open Source.  ETL.  Done.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Then, there's this list of &lt;a href="http://www.manageability.org/blog/stuff/open-source-etl"&gt;Open Source ETL products&lt;/a&gt; on the Manageability blog.  This list is all Java, but there's nothing wrong with Java.  There are a lot of jumping-off points in this list.  Most importantly, the world doesn't need another ETL tool.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's a piece on &lt;a href="http://www.information-management.com/issues/20060601/1088417-1.html"&gt;Open Source BI&lt;/a&gt;, just to drive the point home.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Business Rules&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The ETL tools must have rules.  Either simple field alignment or more complex transformations.  The rules can either be interpreted ("engine-based" ETL) or used to build a stand-alone program ("code-generating" ETL).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The engine-based ETL, when written in Java, is creepy.  We have a JVM running a Java app.  The Java app is an interpreter for a bunch of ETL rules.  Two levels of interpreter. 
Why?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Code-generating ETL, OTOH, is a huge pain in the neck because you have to produce reasonably portable code.  In Java, that's hard.  Your rules are used to build Java code; the resulting Java code can be compiled and run.  And it's often very efficient.  [Commercial products often produce portable C (or COBOL) so that they can be very efficient.  That's really hard to do well.]&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Code-generating, BTW, has an additional complication.  Bad Behavior.  Folks often tweak the resulting code.  Either because the tool wasn't able to generate all the proper nuances, or because the tool-generated code was inefficient in a way that's so grotesque that it couldn't be fixed by an optimizing compiler.   It happens that we can have rules that run afoul of the boilerplate loops. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Old-School Architecture&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;First, we need to focus on the "TL" part of ETL.  Our applications receive files from our customers.  We don't do the extract -- they do.   This means that each file we receive has a unique and distinctive "feature".  We have a clear SoW and examples.  That doesn't help.  Each file is an experiment in novel data formatting and &lt;a href="http://www.springerlink.com/content/vq07h7701u11852p/"&gt;Semantic Heterogeneity&lt;/a&gt;.   &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A common old-school design pattern for this could be called "The ETL Two-Step".  This design breaks the processing into "T" and "L" operations.  There are lots of unique, simple, "T" options, one per distinctive file format.  The output from "T" is a standardized file.  
A simple, standardized "L" loads the database from the standardized file.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Indeed, if you follow the &lt;b&gt;ETL Two Step&lt;/b&gt; carefully, you don't need to actually write the "L" pass at all.  You prepare files which your RDBMS utilities can simply load.  So the ETL boils down to "simple" transformation from input file to output file.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Folks working on YAETLT have to focus on just the "T" step.   Indeed, they should be writing Yet Another Transformation Tool (YATT) instead of YAETLT.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Enter the Python&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If all we're doing is moving data around, what's involved?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;code&gt;&lt;pre&gt;import csv&lt;br /&gt;result = {&lt;br /&gt;  'column1': None,&lt;br /&gt;  'column2': None,&lt;br /&gt;  # etc.&lt;br /&gt;}&lt;br /&gt;with open("source", newline="") as source:&lt;br /&gt;  rdr= csv.DictReader( source )&lt;br /&gt;  with open( "target", "w", newline="") as target:&lt;br /&gt;      wtr= csv.DictWriter( target, result.keys() )&lt;br /&gt;      for row in rdr:&lt;br /&gt;          result['column1']= row['some_column']&lt;br /&gt;          # some_func stands in for any mapping function&lt;br /&gt;          result['column2']= some_func( row['some_column'] )&lt;br /&gt;          # etc.&lt;br /&gt;          wtr.writerow( result )&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;That's really about it.  There appear to be 6 or 7 lines of overhead.  The rest is working code.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;But let's not be too dismissive of the overhead.  An ETL depends on the file format, summarized in the &lt;span class="Apple-style-span"  style="font-family:'lucida grande';"&gt;import&lt;/span&gt; statement.  
With a little care we can produce libraries similar to Python's csv that work with XLS directly, as well as XLSX and other formats.  Dealing with COBOL-style fixed layout files can also be boiled down to an importable module.  The &lt;span class="Apple-style-span"  style="font-family:'lucida grande';"&gt;import&lt;/span&gt; isn't overhead; it's a central part of the rules.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The file &lt;span class="Apple-style-span"  style="font-family:'lucida grande';"&gt;open&lt;/span&gt; functions could be seen as overhead.  Do we really need a full line of code when we could -- more easily -- read from stdin and write to stdout?  If we're willing to endure the inefficiency of processing one input file multiple times to create several standardized outputs, then we could eliminate the two &lt;span class="Apple-style-span"  style="font-family:'lucida grande';"&gt;with&lt;/span&gt; statements.  If, however, we have to merge several input files to create a standardized output file, the one-in-one-out model breaks down and we need the &lt;span class="Apple-style-span"  style="font-family:'lucida grande';"&gt;with&lt;/span&gt; statements and the &lt;span class="Apple-style-span"  style="font-family:'lucida grande';"&gt;open&lt;/span&gt; functions.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The &lt;span class="Apple-style-span"  style="font-family:'lucida grande';"&gt;for&lt;/span&gt; statement could be seen as needless overhead.  It goes without saying that we're processing the entire input file.  Unless, of course, we're merging several files.  Then, perhaps, it's not a simple loop that can be somehow implied.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;It's Just Code&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The point of Python-based ETL is that the problem "solved" by YATT isn't that interesting.  Python is an excellent transformation engine for ETL.   
Rather than write a fancy rule interpreter, just write Python.  Done.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We don't need a higher-level data transformation engine written in Java.  Emit simple Python code and use the Python engine.  (We could try to emit Java code, but it's not as simple and requires a rather complex supporting library.  Python's &lt;a href="http://www.voidspace.org.uk/python/articles/duck_typing.shtml"&gt;Duck Typing&lt;/a&gt; simplifies the supporting library.)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If we don't write a new transformation engine, but use Python, that leaves a tiny space left over for the YATT:  producing the ETL rules in Python notation.  Rather than waste time writing another engine, the YATT developers could create a GUI that drags and drops column names to write the assignment statements in the body of the loop.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;That's right, the easiest part of the Python loop is what we can automate.  Indeed, that's about all we can automate.  Everything else requires complex coding that can't be built as "drag-and-drop" functionality. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Transformations&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;There are several standard transformations.  &lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;Column order or name changes.  Trivial assignment statements handle this.&lt;/li&gt;&lt;li&gt;Mapping functions.  Some simple (no hysteresis, idempotent) function is applied to one or more columns to produce one or more columns.  This can be as simple as a data type conversion, or a complex calculation.&lt;/li&gt;&lt;li&gt;Filter.  Some simple function is used to include or exclude rows.&lt;/li&gt;&lt;li&gt;Reduction.  Some summary (sum, count, min, max, etc.) is applied to a collection of input rows to create output rows.  This is an ideal spot for Python generator functions.  
But there's rarely a simple drag-n-drop for these kinds of transformations.&lt;/li&gt;&lt;li&gt;Split.  One file comes in, two go out.  This breaks the stdin-to-stdout assumption.&lt;/li&gt;&lt;li&gt;Merge.  Two go in, one comes out.  This breaks the stdin-to-stdout assumption, also.  Further, the matching can be of several forms.  There's the multi-file merge when several similarly large files are involved.  There's the lookup merge when a large file is merged with smaller files.  Merging also applies to doing key lookups required to match natural keys to locate database FK's.&lt;/li&gt;&lt;li&gt;Normalization (or Distinct Processing).  This is a more subtle form of filter because the function isn't idempotent; it depends on the state of a database or output file.  We include the first of many identical items; we exclude the subsequent copies.  This is also an ideal place for Python generator functions.&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;Of these, only the first three are candidates for drag-and-drop.  And for mapping and filtering, we either need to write code or have a huge library of pre-built mapping and filtering functions.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Problems and Solutions&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The YATT problem has two parts.  Creating the rules and executing the rules.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Writing another engine to execute the rules is a bad idea.  Just generate Python code.  It's a delightfully simple language for describing data transformation.  It already works.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Writing a tool to create rules is a bad idea.  Just write the Python code and call it the rule set.  Easy to maintain.  Easy to test.  
Clear, complete, precise.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-5863277264739578581?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/5863277264739578581/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/11/data-mapping-and-conversion-tools-sigh.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/5863277264739578581'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/5863277264739578581'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/11/data-mapping-and-conversion-tools-sigh.html' title='Data Mapping and Conversion Tools -- Sigh'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-7464246651053819512</id><published>2010-11-04T08:00:00.000-04:00</published><updated>2010-11-04T08:00:04.898-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>Pythonic vs. "Clean"</title><content type='html'>This provokes thought:  "&lt;a href="http://nedbatchelder.com/blog/201011/pythonic.html"&gt;Pythonic&lt;/a&gt;".&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Why does Python have a "Pythonic" style?  
Why not "clean"?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Is it these lines from Tim Peters' "The Zen of Python" (a/k/a &lt;span class="Apple-style-span"  style="font-family:'lucida grande';"&gt;import this&lt;/span&gt;)&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;/div&gt;&lt;/div&gt;&lt;blockquote&gt;&lt;div&gt;&lt;div&gt;There should be one-- and preferably only one --obvious way to do it.&lt;/div&gt;&lt;div&gt;Although that way may not be obvious at first unless you're Dutch.&lt;/div&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;div&gt;Perhaps having a &lt;a href="http://www.python.org/dev/peps/pep-0008/"&gt;PEP 8&lt;/a&gt;, a &lt;a href="http://en.wikipedia.org/wiki/Guido_van_Rossum"&gt;BDFL&lt;/a&gt; (and &lt;a href="http://www.python.org/dev/peps/pep-0401/"&gt;FLUFL&lt;/a&gt;) means that there's a certain "pressure" to conform?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Or do we have higher standards than other languages?  Or less intellectual diversity?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I think that "pythonic" is just a catchy phrase that rolls off the tongue.  I think a similar concept exists in all languages, but there isn't a good phrase for it in most other languages.  Although Ned Batchelder has some really good suggestions.  (Except for C++, which should be "C-Posh-Posh" for really good coding style.)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;History&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;When I was a COBOL programmer, there were two buzz-phrases used.   "Clean" and "Structured".  Clean was poorly-defined and really just a kind of cultural norm.  In those days, each shop had a different opinion of "clean" and the lack of widespread connectivity meant that each shop had a more-or-less unique style.  
Indeed, as a traveling consultant, I helped refine and adjust those standards because of the wide variety of code I saw in my travels.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"Structured" is pretty much an absolute.  Each GOTO-like thing had to be reworked as properly nested IFs or PERFORMs.  No real issue there.  Except from folks who argued that "Structured" was slower than non-Structured.  A load of malarkey, but one I heard remarkably often.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;When I was a Fortran (and Ada) programmer, I worked for the military, where there were simply absolute standards for every feature of the source code.  Boring.  And no catchy buzz-word.  Just "Compliant" or "Wrong".&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Since it was the early '90s (and we were sequestered) we didn't have much Internet access.  Once in a while we'd have internal discussions on style where the details weren't covered by any standard.  Not surprisingly, they amounted to "&lt;a href="http://codegolf.com/"&gt;Code Golf&lt;/a&gt;" questions.  Ada has to be perfectly clear, which can be verbose, and some folks don't like clarity.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;When I became a C programmer, I found Thomas Plum's &lt;a href="http://www.amazon.com/Reliable-Data-Structures-Thomas-Plum/dp/091153704X"&gt;Reliable Data Structures in C&lt;/a&gt;.  That provided a really good set of standards.  The buzzword I used was "Reliable".  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The problem with C programming is that "Clean" and "Code Golf" get conflated all the time.  Folks write the craziest crap, claim it's "clean" and ignore the resulting obscurity.  Sigh.  
I wish folks would stick with "Reliable" or "Maintainable" rather than "Clean".&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;While doing Perl programming I noticed that some folks didn't seem to realize the golden rule.&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;blockquote&gt;&lt;div&gt;&lt;b&gt;No One Wins At Code Golf&lt;/b&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;div&gt;I don't know why.  Other than to think that some folks felt that Perl programs weren't "real" programs.  They were just "scripts" and could be treated with a casual contempt.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;When I learned Java, I noted that an objective was to have a syntax that was familiar.  It was a stated goal to have the Java style guidelines completely overlap with C and C++ style guidelines.  Fair enough.  Doesn't solve the "Code Golf" vs. "Clean" problem.  But it doesn't confound it with another syntax, either.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Python&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;From this history, I think that "Pythonic" exists because we have a BDFL with high standards.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-7464246651053819512?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/7464246651053819512/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/11/pythonic-vs-clean.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/7464246651053819512'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/7464246651053819512'/><link 
rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/11/pythonic-vs-clean.html' title='Pythonic vs. &quot;Clean&quot;'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-1004876987495087713</id><published>2010-11-02T08:00:00.002-04:00</published><updated>2010-11-02T08:00:07.271-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='learning'/><category scheme='http://www.blogger.com/atom/ns#' term='building skills books'/><title type='text'>"Might Be Misleading" is misleading</title><content type='html'>My books (&lt;a href="http://homepage.mac.com/s_lott/books/nonprogrammer.html#book-nonprogrammer"&gt;Building Skills in Programming&lt;/a&gt;, &lt;a href="http://homepage.mac.com/s_lott/books/python.html#book-python"&gt;Building Skills in Python&lt;/a&gt; and &lt;a href="http://homepage.mac.com/s_lott/books/oodesign.html#book-oodesign"&gt;Building Skills in OO Design&lt;/a&gt;) develop a steady stream of email.  [Also, as a side note, I need to move them to the me.com server, Apple is decommissioning the homepage.mac.com domain.]&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The mail falls into several buckets.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Thanks&lt;/b&gt;.  Always a delight.  Keep 'em coming.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Suggestions&lt;/b&gt;.  These are suggestions for new topics.  Recently, I've had a few requests for Python 3 coverage.  
I'm working with a publisher on this, and hope -- before too long -- to have real news.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Corrections&lt;/b&gt;.  I get a &lt;i&gt;lot&lt;/i&gt; of these.  A lot.  Keep 'em coming.  I didn't pay a copy editor; I tried to do it myself.  It's hard and I did a poor job.    More valuable than spelling corrections are technical corrections.  (I'm happy to report that I don't get as many of these.)  Technical corrections are the most serious kind of correction and I try to fix those as quickly as possible.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Source Code Requests&lt;/b&gt;.  No.  I don't supply any source.  If I send you the source, what skill did you build?  Asking for source?  Not a skill that has much value, IMHO.  If you want to learn to program, you have to create the source yourself.  That &lt;i&gt;is&lt;/i&gt; the job.  Sorry for making you work, but you have to &lt;i&gt;actually&lt;/i&gt; do the work.  There's no royal road to programming.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;The "Other" Bucket&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I get some emails that I file under "other" because they're so funny.  They have the following boilerplate.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"Code fragment [X] might be misleading because [Y]."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;First, it's a complaint, not a question.  That's not very helpful.  That's just complaining.  Without a suggested improvement, it's the worst kind of bare negativity.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The best part is that — without exception — the person sending the email was not misled.  They correctly understood the code examples.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Clearly, the issue isn't that the code is "misleading" in the usual sense of "lying" or "mendacious".  
If it was actually misleading, then (a) they wouldn't have understood it and (b) there'd be a proper question instead of a complaint.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Since they correctly understood it, what's misleading?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;User Interface Reviews&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In software development, we used to go through the "misleading" crap-ola in user interface reviews.  In non-Agile ("waterfall") development, we have to get every nuance, every aspect, every feature of the UI completely specified before we can move on.  Everyone has to hand-wring over every word, every font choice, field order, button placement, blah, blah and blah.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It seems like 80% of the comments are "label [X] might be misleading".  The least useful comment, of course, is this sort of comment with no suggestion.   The least useful reviewer is the person who (1) provides a negative comment and, when asked for an improvement, (2) calls a meeting of random people to come up with replacement text.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;[&lt;i&gt;Hint: If you eventually understood the misleading label, please state your understanding in your own words.  Often, hilarity ensues when their stated understanding cycles back to the original label.&lt;/i&gt;]&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The "label [X] might be misleading" comment is — perhaps — the most expensive nuisance comment ever.  Projects wind up spinning through warrens of rat-holes chasing down some verbiage that is acceptable.  After all, you can't go over the waterfall until the entire UI is specified, right?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Worse, of course, the best sales people do not interpose themselves into the sales process.  
They connect prospective customers with products (or services).  Really excellent sales people can have trouble making suggestions.  Their transparency is what makes them good.  It's not sensible demanding suggestions from them.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Underneath a "Might Be Misleading" comment, the person complaining completely understood the label.  They were not &lt;i&gt;actually&lt;/i&gt; misled at all.  If it was misleading, then (a) they wouldn't have understood it and (b) there'd be a proper question instead of a complaint.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Thank goodness for Agile product owners who can discard the bad kind of negativity.  The right thing to do is put a UI in front of more than one user and bypass the negativity with a consensus that the UI actually is usable and isn't really misleading.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Might Be Misleading&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The "Might be Misleading" comments are often code-speak for "I don't like it because..."  And the reason why is often "because I had to think."  I know that thinking is bad.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I understand that Krug's famous &lt;a href="http://www.sensible.com/"&gt;Don't Make Me Think&lt;/a&gt; is the benchmark in usability.   And I totally agree that some thinking is bad.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;There are two levels of thinking.&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;Thinking about the problem.&lt;/li&gt;&lt;li&gt;Thinking about the UI and how the UI models the problem.&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;div&gt;Krug's advice is clear.  Don't make users think about the UI and how the UI models the problem.  
Users still have to think about the problem itself.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In the case of UI labels which "Might Be Misleading", we have to figure out if it's the problem or the UI that folks are complaining about.  In many cases, parts of the problem are actually hard and no amount of UI fixup can ever make the problem easier.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Not Completely Accurate&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;One of the most common UI label complaints is that the label isn't "completely" accurate.  They seem to object to the fact that a UI label can only contain a few words and they have to actually &lt;i&gt;understand&lt;/i&gt; the few words.  I assume that folks who complain about UI labels also complain about light switches having just "on" and "off" as labels.  Those labels aren't "completely" accurate.  It should say "power on".  Indeed it should say "110V AC power connected".  Indeed it should say "110V AC power connected through load".  Indeed it should say "110V AC 15 A power connected via circuit labeled #4 through load with ground".  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Apparently this is news.  &lt;b&gt;Labels are Summaries&lt;/b&gt;.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;No label can be "completely" accurate.  You heard it here first.  Now that you've been notified, you can stop complaining about labels which "might be misleading because they're not completely accurate."  
They can't be "completely" accurate unless the label recapitulates the entire problem domain description and all source code leading to the value.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Apologies&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In too many cases of "Might Be Misleading," people are really complaining that they don't like the UI label (or the code example) because the problem itself is hard.  I'm sympathetic that the problem domain is hard and requires thinking.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Please, however, don't complain about what "Might Be Misleading".  Please try to focus on "Actually Is Misleading."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Before complaining, please clarify your understanding.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's the rule.  &lt;b&gt;If you eventually understood it, it may be that the problem itself is hard&lt;/b&gt;.  If the problem is hard, fixing the label isn't going to help, is it?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If the problem is hard, you have to think.  Some days are like that.  The UI designer and I apologize for making you think.  Can we move on now?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If the label (or example) really is &lt;b&gt;wrong&lt;/b&gt;, and you can correct it, that's a good thing.   Figure out what is actually misleading.  Supply the correction.  Try to escalate "Might Be Misleading" to "Actually Misled Someone".  Specifics matter.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Also, please remember that labels are summaries.  At some point, details must be elided.  If you have trouble with the concept of "summary", you can do this.  (1) Write down &lt;b&gt;all&lt;/b&gt; the details that you understand.  Omit nothing.  (2) Rank the details in order of importance.  
(3) Delete words to pare the description down to an appropriate length to fit in the UI.  When you're done, you have a suggestion.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-1004876987495087713?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/1004876987495087713/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/11/might-be-misleading-is-misleading.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/1004876987495087713'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/1004876987495087713'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/11/might-be-misleading-is-misleading.html' title='&quot;Might Be Misleading&quot; is misleading'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-1072967168626718098</id><published>2010-10-26T08:00:00.003-04:00</published><updated>2011-01-05T14:35:34.346-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>Python and the "Syntactic Whitespace Problem"</title><content type='html'>Check out this list of questions on Stack Overflow:  &lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;http://stackoverflow.com/search?q=%5Bpython%5D+whitespace+syntax&lt;/div&gt;&lt;div&gt;&lt;br 
/&gt;&lt;/div&gt;&lt;div&gt;About 10% of these are really just complaints about Python's syntax.   Almost every Stack Overflow question on Python's use of syntactic whitespace is really just a complaint.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's today's example:  "&lt;a href="http://stackoverflow.com/questions/3994765/python-without-whitespace-requirements"&gt;Python without whitespace requirements&lt;/a&gt;".&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's the money quote: "I could potentially be interested in learning Python but the whitespace restrictions are an absolute no-go for me."  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's the reality.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Everyone Indents Correctly All The Time In All Languages.&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Everyone.  All the time.  Always.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It's amazing how well, and how carefully people indent code.  Not Python code.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;All Code.  XML.  HTML.  CSS.  Java.  C++.  SQL.  All Code.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Everyone indents.  And they always indent &lt;b&gt;correctly&lt;/b&gt;.  It's truly amazing how well people indent.  
In particular, when the syntax doesn't require any indentation, they still indent beautifully.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Consider this snippet of C code.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;if( a == 0 )&lt;br /&gt;   printf( "a is zero" );&lt;br /&gt;   r = 1;&lt;br /&gt;else&lt;br /&gt;   printf( "a is non-zero" );&lt;br /&gt;   r = a % 2;&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Over the last few decades, I've probably spent a complete man-year reading code like that and trying to figure out why it doesn't work.  It's not easy to debug.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The indentation completely and accurately reflects the programmer's intention.  Everyone gets the indentation right.  All the time.  In every language.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;And people still complain about Python, even when they indent beautifully in other languages.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-1072967168626718098?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/1072967168626718098/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/10/python-and-syntactic-whitespace-problem.html#comment-form' title='14 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/1072967168626718098'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/1072967168626718098'/><link rel='alternate' type='text/html' 
href='http://slott-softwarearchitect.blogspot.com/2010/10/python-and-syntactic-whitespace-problem.html' title='Python and the &quot;Syntactic Whitespace Problem&quot;'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>14</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-8248728874460915022</id><published>2010-10-21T08:00:00.001-04:00</published><updated>2010-10-21T08:00:11.695-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='database design'/><category scheme='http://www.blogger.com/atom/ns#' term='triggers'/><title type='text'>Code Base Fragmentation</title><content type='html'>Here's what I love -- an argument that can only add cost and complexity to a project.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It sounds like this to me: "We need to fragment the code base into several different languages.  Some of the application programming simply &lt;b&gt;must&lt;/b&gt; be written in a language that's poorly-understood, with tools that are not widely available, and supported by a select few individuals that have exclusive access to this code.  
We haven't benchmarked the technical benefit."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Further, we'll create complex organizational roadblocks in every single project around this obscure, specialized, hard-to-support language.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Perhaps I'm wrong, but database triggers always seem to create more problems than they solve.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;They Totally Solve a Problem&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The most common argument boils down to application-specific &lt;a href="http://en.wikipedia.org/wiki/Aspect-oriented_programming"&gt;cross-cutting concerns&lt;/a&gt;.  The claim is that these concerns (logging, validation, data model integrity, whatever) can only be solved with triggers.  For some reason, though, these cross-cutting concerns can't be solved through ordinary software design.  I'm not sure why triggers are the only solution when ordinary OO design would be far simpler.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Some folks like to adopt the "multiple application programming languages" argument.  That is, ordinary OO design won't work because the code would have to be repeated in each language.  This is largely bunk.  It's mostly folks scent-marking territory and refusing to cooperate.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Step 1.  Write a library and share it.  It's hard to find a language that can't be used to write a sharable library.  It's easy to find an organization where the Visual C# programmers are not on speaking terms with the Java programmers and the isolated Python folks are pariahs.  This isn't technology.  Any one of the languages can create the necessary shared library.  A modicum of &lt;i&gt;cooperation&lt;/i&gt; would be simpler than creating triggers.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Step 2.  Get over it.  
"Duplicated" business logic is rampant in most organizations.  Now that you know about it, you can manage it.  You don't need to add Yet Another Language to the problem.  Just &lt;i&gt;cooperate&lt;/i&gt; to propagate the changes.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;They're Totally Essential To The Database&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The silly argument is that some business rules are "closer to" or "essential to" the database.  The reason I can call this silly is that when the data is converted to another database (or extracted to the data warehouse) the triggers aren't relevant or even needed.  If the triggers aren't part of "interpreting" or "using" the data, they aren't essential.  They're just convenient.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The data really &lt;b&gt;is&lt;/b&gt; separate from the processing.  And the data is far, far more valuable than the processing.   The processing really &lt;b&gt;is&lt;/b&gt; mostly application-specific.  Any processing that isn't specific to the application really &lt;b&gt;is&lt;/b&gt; a cross-cutting concern (see above).   There is no "essential" processing that's magically part of the data.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;What If...&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Life is simpler if all application programming is done in application programming languages.  And all triggers are just methods in classes.  And everyone just uses the class library they're supposed to use.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"But what if someone doesn't use the proper library?  A trigger would magically prevent problems."  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If someone refuses to use the application libraries, they need career coaching.  
As in "find another job where breaking the rules is tolerated."&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-8248728874460915022?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/8248728874460915022/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/10/code-base-fragmentation.html#comment-form' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/8248728874460915022'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/8248728874460915022'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/10/code-base-fragmentation.html' title='Code Base Fragmentation'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-5812460337223583748</id><published>2010-10-19T08:00:00.000-04:00</published><updated>2010-10-19T08:00:08.944-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='agile'/><category scheme='http://www.blogger.com/atom/ns#' term='software process improvement'/><category scheme='http://www.blogger.com/atom/ns#' term='architecture'/><category scheme='http://www.blogger.com/atom/ns#' term='refactoring'/><title type='text'>Technical Debt</title><content type='html'>Love this from Gartner.  
"&lt;a href="http://www.gartner.com/it/page.jsp?id=1439513"&gt;Gartner Estimates Global 'IT Debt' to Be $500 Billion This Year, with Potential to Grow to $1 Trillion by 2015&lt;/a&gt;".&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;NetworkWorld ran a quickie version of the story.  Gartner: &lt;a href="http://www.networkworld.com/news/2010/092310-global-it-debt.html"&gt;Global 'IT debt' hits $500 billion, on the way to $1 trillion&lt;/a&gt;.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;ComputerWorld -- to be proper journalists -- have to get a balancing quote.  Their version of the story is this: &lt;a href="http://www.computerworld.com/s/article/352022/Gartner_Warns_of_App_Maintenance_Debt_?intsrc=print_latest"&gt;Gartner warns of app maintenance 'debt'&lt;/a&gt;.   The balancing quote is the following:&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;blockquote&gt;&lt;div&gt;"There are many good reasons to NOT upgrade/modernize many applications, and I believe Gartner is out of line using words like 'debt' which have guilt associated with them,"  &lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;div&gt;"Guilt"?  That's a problem?  Why are we pandering to an organization's (i.e., CIO's) emotional response? &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I'm not sure that using a word like "debt" is a problem.  Indeed, I think they should ramp up the threat level on this and add words like "short-sighted" and "daft" and perhaps even "idiotic".  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Anyone who doesn't believe in (or doesn't understand) technical debt needs only to review the Y2K plans and budgets.  A bad technology decision led to a mountain of rework.  Yes, it was all successful, but it made IT budgeting difficult for years afterwards.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The rest of the organization was grumpy about having their projects stalled until after Y2K.  
IT created its own problems by letting the technology debt accumulate to a level where it was "fix or face an unacceptable risk of not being able to stay in business."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;How many other latent Y2K-like problems are companies ignoring?&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-5812460337223583748?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/5812460337223583748/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/10/technical-debt.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/5812460337223583748'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/5812460337223583748'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/10/technical-debt.html' title='Technical Debt'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-2818101826483067883</id><published>2010-10-13T08:00:00.000-04:00</published><updated>2010-10-14T07:46:24.190-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Django'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='security'/><title type='text'>Real Security Models</title><content 
type='html'>Lots of folks like to wring their hands over the Big Vague Concept (BVC™) labeled "security".&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;There's a lot of quibbling.  Let's move beyond BVC to the interesting stuff.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I've wasted hours listening to people identify risks and costs of something that's not very complex.  I've been plagued by folks throwing up the "We don't know what we don't know" objection to a web services interface.  This objection amounts to "We don't know every possible vulnerability; therefore we don't know how to secure it; therefore all architectures are bad and we should stop development right now!"  The &lt;a href="http://www.owasp.org/index.php/Category:OWASP_Top_Ten_Project"&gt;OWASP top-ten list&lt;/a&gt;, for some reason, doesn't sway them into thinking that security is actually manageable.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What's more interesting than quibbling over BVC, is determining the authorization rules.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Basics&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Two of the pillars of security are Authentication (who are you?) and Authorization (what are you allowed to do?)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Authentication is not something to be invented.  It's something to be used.  In our case, with an Apache/Django application, the Django authentication system works nicely for identity management.  It supports a simple model of users, passwords and profiles. &lt;/div&gt;&lt;div&gt; &lt;/div&gt;&lt;div&gt;We're moving to &lt;a href="https://opensso.dev.java.net/"&gt;Open SSO&lt;/a&gt;.  This takes identity management out of Django.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The point is that authentication is -- largely -- a solved problem.  Don't invent. It's solved and it's easy to get wrong.  
Download or license an established product for identity management and use it for all authentication.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Authorization&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The Authorization problem is always more nuanced, and more interesting, than Authentication.  Once we know who the user is, we still have to determine what they're really allowed to do.  This varies a lot.  A small change to the organization, or a business process, can have a ripple effect through the authorization rules.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In the case of Django, there is a "low-level" set of authorization tests that can be attached to each view function.  Each &lt;a href="http://docs.djangoproject.com/en/dev/ref/models/options/"&gt;model&lt;/a&gt; has an implicit set of three permissions (add, change and delete).  Each view function can test to see if the current user has the required permission.  This is done through a simple &lt;a href="http://docs.djangoproject.com/en/dev/topics/auth/#the-permission-required-decorator"&gt;permission_required&lt;/a&gt; decorator on each view function.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;However, that's rarely enough information for practical -- and nuanced -- problems.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The &lt;a href="http://docs.djangoproject.com/en/dev/topics/auth/#storing-additional-information-about-users"&gt;auth profile module&lt;/a&gt; can be used to provide additional authorization information.  In our case, we just figured out that we have some "big picture" authorizations.  For sales and marketing purposes, some clusters of features are identified as "products" (or "features" or "options" or something).  They aren't smallish things like Django models.  They aren't largish things like whole sites.  
They're intermediate things based on what customers like to pay for (and not pay for).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Some of these "features" map to Django applications.  That's easy.  The application view functions can all simply refuse to work if the user's contract doesn't include the option.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Sadly, however, some "features" are part of an application.  Drat.  We have two choices here.  &lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;Ensure that there's a "default" option and configure the feature or the default at run time.  For a simple class (or even a simple module) this isn't too hard.  Picking a class to instantiate at run time is pretty standard OO programming.&lt;/li&gt;&lt;li&gt;Rewrite the application to refactor it into two applications: the standard version and the optional version.  This can be hard when the feature shows up as one column in a displayed list of objects or one field in a form showing object details.  However, it's very Django to have applications configured dynamically in the &lt;span class="Apple-style-span"  style="font-family:'lucida grande';"&gt;settings&lt;/span&gt; file.&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;div&gt;Our current structure is simple: all customers get all applications.  We have to move away from that to mix-and-match applications on a per-customer basis.  And Django supports this elegantly.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Security In Depth&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This leads us to the "Defense in Depth" buzzword bingo.  We have SSL.  We have SSO.  We have high-level "product" authorizations.  We have fine-grained Django model authorizations.  
&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So far, all of this is done via Django group memberships, allowing us to tweak permissions through the &lt;span class="Apple-style-span"  style="font-family:'lucida grande';"&gt;auth&lt;/span&gt; module.  Very handy.  Very nice.  And we didn't invent anything new.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;All we invented was our high-level "product" authorization.  This is a simple many-to-many relationship between the Django Profile model and a table of license terms and conditions with expiration dates.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Django rocks.  The nuanced part is fine-tuning the available bits and pieces to match the marketing and sales pitch and the legal terms and conditions in the contracts and statements of work.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/684183198890094283-2818101826483067883?l=slott-softwarearchitect.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://slott-softwarearchitect.blogspot.com/feeds/2818101826483067883/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/10/real-security-models.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/2818101826483067883'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/684183198890094283/posts/default/2818101826483067883'/><link rel='alternate' type='text/html' href='http://slott-softwarearchitect.blogspot.com/2010/10/real-security-models.html' title='Real Security Models'/><author><name>Steven Lott</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh6.googleusercontent.com/-mXhxBa_vM4E/AAAAAAAAAAI/AAAAAAAAAAA/eNlmenT-DOs/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-684183198890094283.post-5658168550784726613</id><published>2010-10-04T08:00:00.000-04:00</published><updated>2010-10-04T08:00:09.295-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='xlsx'/><category scheme='http://www.blogger.com/atom/ns#' term='spreadsheet'/><category scheme='http://www.blogger.com/atom/ns#' term='excel'/><category scheme='http://www.blogger.com/atom/ns#' term='xml'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='xlsm'/><category scheme='http://www.blogger.com/atom/ns#' term='zipfile'/><title type='text'>.xlsm and .xlsx Files -- Finally Reaching Broad Use</title><content type='html'>For years, I've been using &lt;a href="http://poi.apache.org/"&gt;Apache POI&lt;/a&gt; in Java and &lt;a href="http://www.lexicon.net/sjmachin/xlrd.htm"&gt;XLRD &lt;/a&gt;in Python to read spreadsheets.  Finally, now that .XLSX and .XLSM files are in more widespread use, we can move away from those packages and their reliance on successful reverse engineering of undocumented features.&lt;div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Spreadsheets are -- BTW -- the universal user interface.  Everyone likes them, they're almost inescapable.  And they work.  There's no reason to attempt to replace the spreadsheet with a web page or a form or a desktop application.  It's easier to cope with spreadsheet vagaries than to replace them.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The downside is, of course, that users often tweak their spreadsheets, meaning that you never have a truly "stable" interface.  
However, transforming each row of data into a Python dictionary (or Java mapping) often works out reasonably well to make your application mostly immune to the common spreadsheet tweaks.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Most of the .XLSX and .XLSM spreadsheets we process can be trivially converted to CSV files.  It's manual, yes, but a quick audit can check the counts and totals.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Yesterday we got an .XLSM with more than 80,000 rows.  It couldn't be trivially converted to CSV by my installation of Excel.   &lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What to do?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Python to the Rescue&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Step 1.  Read the standards.  Start with the Wikipedia article: "&lt;a href="http://en.wikipedia.org/wiki/Office_Open_XML"&gt;Office Open XML&lt;/a&gt;".  Move to the &lt;a href="http://www.ecma-international.org/publications/standards/Ecma-376.htm"&gt;ECMA 376&lt;/a&gt; standard.  &lt;/div&gt;&lt;div&gt;
