Tuesday, August 28, 2012

Password Encryption -- Short Answer: Don't.

First, read this.    Why passwords have never been weaker—and crackers have never been stronger.

There are numerous important lessons in this article.

One of the small lessons is that changing your password every sixty or ninety days is farcical.  The rainbow table algorithms can crack a badly-done password in minutes.  Every 60 days, the cracker has to spend a few minutes breaking your new password.  Why bother changing it?  It only annoys the haxorz; they'll be using your account within a few minutes.  However.  That practice is now so ingrained that it's difficult to dislodge from the heads of security consultants.

The big lesson, however, is profound.

Work Experience

Recently, I got a request from a developer on how to encrypt a password.  We have a Python back-end and the developer was asking which crypto package to download and how to install it.

"Crypto?" I asked.  "Why do we need crypto?"

"To encrypt passwords," they replied.

I spat coffee on my monitor.  I felt like hitting Caps Lock in the chat window so I could respond like this: "NEVER ENCRYPT A PASSWORD, YOU DOLT."

I didn't, but I felt like it.

Much Confusion

The conversation took hours.  Chat can be slow that way.  Also, I can be slow because I need to understand what's going on before I reply.  I'm a slow thinker.  But the developer also needed to try stuff and provide concrete code examples, which takes time.

At the time, I knew that passwords must be hashed with salt.  I hadn't read the Ars Technica article cited above, so I didn't know why computationally intensive hash algorithms are best for this.

We had to discuss hash algorithms.

We had to discuss algorithms for generating unique salt.

We had to discuss random number generators and how to use an entropy source for a seed.

We had to discuss http://www.ietf.org/rfc/rfc2617.txt in some depth, since the algorithms in section 3.2.2. show some best practices in creating hash summaries of usernames, passwords, and realms.

All of this was, of course, side topics before we got to the heart of the matter.

What's Been Going On

After several hours, my "why" questions started revealing things.  The specific user story, for example, was slow to surface.

Why?

Partly because I didn't demand it early enough.

But also, many technology folks will conceive of a "solution" and pursue that technical concept no matter how difficult or bizarre.  In some cases, the concept doesn't really solve the problem.

I call this the "Rat Holes of Lost Time" phenomena: we chase some concept through numerous little rat-holes before we realize there's a lot of activity but no tangible progress.  There's a perceptual narrowing that occurs when we focus on the technology.  Often, we're not actually solving the problem.
IT people leap past the problem into the solution as naturally as they breathe. It's a hard habit to break.
It turned out that they were creating some additional RESTful web services.  They knew that the RESTful requests needed proper authentication.  But, they were vague on the details of how to secure the new RESTful services.

So they were chasing down their concept: encrypt a password and provide this encrypted password with each request.  They were half right, here.  A secure "token" is required.  But an encrypted password is a terrible token.

Use The Framework, Luke

What's most disturbing about this is the developer's blind spot.

For some reason, the existence of other web services didn't enter into this developer's head.  Why didn't they read the code for the services created on earlier sprints?

We're using Django.  We already have a RESTful web services framework with a complete (and high quality) security implementation.

Nothing more is required.  Use the RESTful authentication already part of Django.

In most cases, HTTPS is used to encrypt at the socket layer.  This means that Basic Authentication is all that's required.  This is a huge simplification, since all the RESTful frameworks already offer this.

The Django Rest Framework has a nice authentication module.

When using Piston, it's easy to work with their Authentication handler.

It's possible to make RESTful requests with Digest Authentication, if SSL is not being used.  For example, Akoha handles this.  It's easy to extend a framework to add Digest in addition to Basic authentication.

For other customers, I created an authentication handler between Piston and ForgeRock OpenAM so that OpenAM tokens were used with each RESTful request.  (This requires some care to create a solution that is testable.)

Bottom Lines

Don't encrypt passwords.  Ever.

Don't write your own hash and salt algorithm.  Use a framework that offers this to you.

Read the Ars Technica article before doing anything password-related.

Tuesday, August 21, 2012

Performance "Tuning": running in 1/100th the time

For the 757 Python Meetup group, someone proposed looking at some Python code they had which was slow.  The code implemented a variation on the Elo chess rating system.  It applied the ratings to other sports, and used the points scored as well as basic win/lose/tie to work out a ranking for sports teams.  Very clever stuff.

But.

It was described as horribly slow. Since it was only 400 lines of code, it was a great subject for review in a Python meetup.  I would be able to show some tweaks and performance tips.

Hypothetical Step 1:  Profile

My first thought was to run the profiler and see what popped up as a possible root cause of slowness.

Generally, there are three kinds of performance problems.

  • Wrong data structure.  Or wrong algorithm.  (They're simply opposite sides of the same coin.)  This generally leads to dramatic, earth-shattering improvements.
  • Piss-Poor Programming Practices.  This kind of fine-tuning yields minuscule improvements.  In some cases, no measurable change at all.
  • Bugs.  Every time I've been asked to improve "working" code, I find that it has bugs.  Every time.  These can exacerbate performance problems.  Interestingly, most of these are readily apparent during an initial survey:  i.e., while simply reading the code to see how it works.  Trying to create unit tests for the purposes of refactoring often reveals additional bugs.
I didn't know what I was going to see in this 400-line program.

Hypothetically, profiling will help show what kind of problems we have.

Prior to profiling, of course, we need to run the darn thing.  Thankfully, they provided some typical input files.  For example, 1979 High School Football season results.  159 lines of patiently gathered teams and scores.

It Didn't Run

When we tried to run it with the profiler, we found that we had a bug.  Tracking down the bug, revealed the essential performance problem, also.

The core problem was a failure to make use of Python data structures.  

This manifested itself in a number of really bad design errors.  Plus, it presented as the actual, show-stopping, serious bug.

In order to work out team rankings, the program kept a list of teams.

I'll repeat that for folks who are skimming.
In order to work out team rankings, the program kept a list of teams.
Not a dict mapping from team name to team details.  But a simple list.  Locating a team in the list meant iterating through the list, looking for the matching team.  Really.

for t in team_list:
    if t['name'] == target_name:
        process( t )

This kind of thing was repeated in more than one place.

And one of those repeats had the bug in it.

What we have here is "wrong data structure".  Replacing a list with a dict will have earth-shattering impact on performance.

The Bug

The bug, BTW, was a lookup loop which had the additional requirement of adding missing teams.  It tried to use the for-else structure.  This was the intended code (not the actual code).

for t in team_list:
    if t['name'] == target_name:
        return
else:
    t['name']= init_new_team(target_name)
    
This is a first-class bit of Python.  An else clause on a for statement is a real feature of the language.  However, it's obscure enough that it's easy to get wrong.

However, it's also unique to Python, and the kind of thing that can lead to confusion.  I discourage it's use.  

Test-Driven Reverse Engineering

We're doing TDRE on a little piece of recreational programming.  This means that we need to fabricate unit tests for code that is purported to work.  Some folks like to call this "exploratory testing" (a phrase which is contradictory, and consequently meaningless.)  We're "exploring".  We're not "testing".

Once the core bug is fixed, we can run sample data through the application to get "big picture" results.  We can extract bits from the sample data and exercise various functions in isolation to determine what they actually do now, bugs included.

Since this is so simple--and we're going to review it during a 2-hour meet up--and there's nothing billable going on--we can get by with a few really simple unit tests.  We'll run the darn thing once to create some expected output.  We save the output and use that single output log as the expected results from each refactoring step.

More complex applications require more unit tests.  For a billable TDRE effort last year, I had to create 122 unit tests to successfully refactor a program of over 6,000 lines of code.  It took several weeks.

Real Step 1: Fix The Data Structure

Before profiling (but after running to capture some output) we have to fix the essential data structure.  Repeated scanning of a list is never going to perform well.
The whole point of the study of "Data Structures" is to prevent (or optimize) search.
In this case, we can prevent linear search of a list by using a dictionary.  That, clearly, is job one.

It was a pervasive rewrite.   Essentially, everything in this little program included a goofy loop to lookup a team in the list.  The Elo algorithm itself, which is already O(n2), is dragged down by using the linear search for the opposing team four more times, making it O(n3).

Cyclomatic Complexity

One of the big "issues" is the use of if statements throughout the scoring algorithm.  An if statement creates Cyclomatic Complexity and can lead to performance problems.  Generally, if statements should be avoided.

This algorithm applies some normalization factors to reconcile scoring with win/loss numbers in different sports.  Basketball, specifically, involves generally high scores.  Since there are 2-point and 3-point scoring opportunities, a factor is used to normalize the points into "goals".  Football, similarly, has numerous scoring opportunities with values of 1, 2, 3 and 6 points; the scores here are also normalized.

This normalization was done with an if statement that was evaluated deep inside the Elo algorithm.  Repeatedly. Evaluated.

The two functions that handled the normalizations, plus the normalization factors, are ideal candidates for OO design.  There's a clear hierarchy of classes here.  A superclass handles most sports, and two polymorphic subclasses handle football and basketball normalization.

The if statement is now "pushed up" to the very beginning of the program where an instance of the sports normalization object is created.  This object's methods are then used by the Elo algorithm to normalize scores.

Icing on the Cake

Once we've fixed the bug and replaced a list with a dict, everything else is merely icing.
Some other OO changes.
  1. The "Team" information should not be a flat, anonymous dictionary.  It should be a proper class definition with proper attributes.  There aren't many methods, so it's easy to create. 
  2. The "Game" information is read by csv.DictReader.  However, it should not remain a simple, anonymous dict.  As with a Team, a simple class can be created to handle Game.
  3. The overall structure of the application needs to be broken into two sections.  The command-line interface parses options, opens files, and generally gets everything set up.  The actual ranking algorithm should be a function that is given an open file-like object plus the Sport object for normalization.  This allows the ranking algorithm to be reused in other contexts than the command-line (i.e. a web service).
A more subtle OO design point is the question of "mutability".  A Team in this application is little more than a name.   There are also numerous "stateful" values that are part of the Elo algorithm.   A Game, similarly, is an immutable pair of teams and scores.  However, it has some mutable values that are part of the Elo algorithm.  

Really, we have immutable Team and GameHistory objects, plus a few values that are used as part of the Elo calculation.  I'm a big fan of disentangling these mutable and immutable objects from each other.  

I suspect that the Elo algorithm doesn't really need to update the "state" of an object.  I suspect that it actually creates (and discards) a number of immutable candidate ranking objects.  The iteration that leads to convergence might be a matter of object creation rather than update.  I didn't make this change, since it required real work and we were out of time.

Bottom Line

The more times I do TDRE to improve performance, the more I realize that it's all about bugs and data structures.

This recreational application took 45-60 seconds to process one year's record of games for a given league.  It now takes less than 0.2 seconds to do the same volume of work.  Two test cases involving a complete run of 159 records runs in 0.411 seconds.  That's 1/100th the time simply from switching data structures.

The idea of "tweaking" a working program to improve performance is generally misleading.  It might happen, but the impact is minuscule at best.  

Here's the checklist for getting 100:1 improvements.
  • Remove searches.  
  • Remove deeply-nested if statements.  
Generally, reduce Cyclomatic Complexity.

Tuesday, August 7, 2012

How Expensive is a "Waterfall" Project Plan?

It's impossible to step into the same river twice; other waters are flowing toward the sea.  It's impossible to do "head-to-head" project comparisons.  You can't have the same people doing the same thing with only one constraint changed.  You can try to resort to having similar people doing the exact same thing.

It's impossible to provide trivial quantification of the costs of a waterfall approach.  However, there's a lot of places where "up front" engineering adds cost and risk.  A waterfall plan adds friction, zero-value administrative costs and (sometimes) extra work.

For software development that doesn't involve "toy" projects, but apply to the magical "real world" projects, it's difficult to justify building the same "real world" thing twice.  It's hard to create a detailed and purely economic analysis of this question.  Therefore, we have to resort to hand-waving and speculation.

[I put "real world" in quotes, because the "real world" is a kind of dog-whistle phrase that only project managers understand.  It appears to mean "projects that fit my preconceived notions" or "projects that can't be done any other way because of political considerations" or "projects constrained by a customer to have a farcical structure".  Another variant on this dog-whistle phrase is "the realities of running a business."  There's something in their real world that they (a) can't define and (b) prevents any change.]

A few months back, I saw a http://prezi.com presentation that was intended to drive architecture, cost and schedule conversations.  The authors were happy.  I was left relatively mystified.

There are three relevant issues that don't fit well into a prezi presentation.
  1. Open Technology questions.  If it's new software development, there are unanswered software API, performance, quality, fitness-for-purpose questions.  If it's a clone or a clean-room reverse engineering, then there may be few questions.  Otherwise, there must be unknowns.
  2. Open User Story questions.  If it's new software development, then the user stories are not complete, consistent and optimized.  There must be something missing or extra.
  3. Open User Experience questions.  I'm not a UX person.  Tinkering with an Arduino has taught me that even in the most prosaic, simplistic project there is a UX element.  
It's easy to trivialize this as "details" or "supporting information" omitted from the presentation.  Actually, it's considerably more interesting that simply elaborating details.  Indeed, the missing stuff is often more important that the elements of the presentation than were provided.

The Standard Gaps

While a presentation can show some elements of UX, it's not interactive.  It cannot provide any useful depth on the UX.  Failure to do UX exploration (with real interaction) is crazy.  Making assumptions about UX is more likely to create cost and risk.

User Stories can be kicked around in a presentation.  However.  Once the highest-priority user stories are built, and lessons learned about the users, the problem and the emerging solution, then the other user stories can (and should) change based on those lessons learned.  Assuming a list of user stories and then committing to a specific schedule enshrines the costs in a bad plan.  And we all know that the plan (and the dates) are sacred once written down.

Software does not exist in a vacuum.

I'll repeat that, since it's a hidden assumption behind the waterfall.


Software does not exist in a vacuum.

You have a Heisenberg problem when you start building software and releasing it incrementally.  The presence of the new software, even in a first-release, beta-test, barely-working mode changes the ecosystem in which the software exists.  Other applications can (and will) be modified to adjust to the new software.  Requirements can (and should) change.

A waterfall project plan exists to prevent and stifle change.  A long list of tasks and dates exists to assure that people perform those tasks on those dates.  This is done irrespective of value created and lessons learned.

Technology changes.  Our understanding of the technology changes.  One wrong assumption about the technology invalidates the deliverables for a sprint.  In a project with multiple people, multiple assumptions and multiple sprints, this effect is cumulative.  Every assumption by every person for every technology is subject to (considerable) error.

Every technology assumption must be looked at as a needless cost that's being built into the project.

Recently, I've been working on a project with folks that don't know Django very well.  Their assumptions are -- sometimes -- alarming.  Part way through a sprint, I got a raft of technical questions on encrypting passwords.  It's hard to state it strongly enough: never encrypt a password.  What's more useful is this: always hash passwords.  The original password must be unrecoverable.  There are lots of clever hashing techniques.  Some of which are already part of Django.  A significant portion of the sprint (appeared) to be based on reinventing a feature already part of Django.

Do the math: a few wrong assumptions made during project planning are canonized forever as requirements in the project.  With a waterfall project, they're not easy to change.  Project managers are punished for changing the project.  You can't increase the project deadline; that's death.  You can't decrease it, either: you get blamed for sand-bagging the estimates.

Arduino Technology Lessons Learned

After buying several 74HC595 shift registers from Sparkfun, I realized that my assumptions about the interfaces were utterly wrong.  I needed to solder a mix of right-angle header and straight-through headers onto the breakout boards.  I tried twice to order the right mix headers.  It seems so simple.  But assumptions about the technology are often wrong.

This is yet more anecdotal evidence that all assumptions must be treated as simple lies.  Some managers like to treat assumptions as  "possibly true" statements; i.e., these are statements that are unlikely to be false.  This is wrong.  Assumptions always have a very high probability of being false, since they're not based on factual knowledge.

Some managers like to treat assumptions as "boundary conditions":  i.e.,  if the assumption is true, then the project will go according to plan.  Since all of the assumptions will be prove to be incorrect, this seems like simple psychosis.  

[Interestingly, the assumptions-are-boundary folks like to play the "real world" card: "In the real world, we need to make assumptions and go forward."  Since all assumptions will be shown to be incorrect, why make them?  Wouldn't it be more rational to say "we need to explore carefully by addressing high-righ unknowns first"?  Wouldn't it be better to both gather facts and build an early release of the software at the same time?]

Arduino User Story Assumptions

After building a prototype that addressed two of the user stories, it slowly became clear that the third user story didn't actually exist.

Sure, the words were there on paper.  Therefore, there's a user story.

But. 

There was nothing to actually do.  

The whole "As a [role], I need [feature] so that I can [benefit]" that was written on day one was based on a failure to understand precisely how much information was trivially available.  The benefit statement was available with a small software change and no separate user story.  And no separate hardware to support that user story.

Exploration and tinkering reduced the scope of the work.  In the real world.

[In the "real world" where waterfall is considered important, exploration is described as unbounded spending of limited resources.  In the real real world, money must be spent; it can either be long hand-wringing meetings or it can be prototype development.]

The user story (written before a prototype existed) was based on a failure to fully understand the UX.  The only way to fully understand the UX is to build it.  Education costs money.

Arduino UX Learnings

Perhaps the most important issue here is UX.

Once upon a time, UX was expensive, difficult and complex.  So difficult that special-purpose prototyping tools were created to make it possible to create a preliminary UX that could be used to confirm UX and user stories.

This UX prototyping effort was real money spent as part of "requirements" or "design"; it created documentation that flowed over the waterfall for subsequent development.

This notion is obsolete.  And has been obsolete for several years.

UX is now so easy to build that it makes much more sense to build two (or three) competing UX's and compare them to see which is actually better.

Indeed, it makes a lot of sense to build one UX and release it; make a real effort at solving the user's problems.  Then build a second UX for A/B testing purposes to see if there's room for improvement.

I'll repeat that for folks who really like the waterfall.

It's now cheaper to actually build two than it is to write detailed requirements for one.

[In the "real world", this is deprecated as "wasting time playing with the UX".  As if written requirements based on nothing more than a whiteboard are more "real" than hands-on experience with the UX.]

You can prove this to yourself by actually observing actual UX developers knocking out pages and screens.  Long "requirements gathering" meetings with prospective users amount to a waste of time and money.  Long "brainstorming" sessions, similarly, are wasteful.  Short, ongoing conversations, a build, a release, and a follow-up review has more value, and costs less.  

Do the math.  Several users and business analysts in several multiple-hour meetings costs how much?

A pair of developers banging out fully-functioning, working UX for a use case costs how much?

A slavish adherence to waterfall development creates "real world" costs.