Tuesday, January 20, 2015

Webcast Wednesday

Be there: http://www.oreilly.com/pub/e/3255

Of course, I've got too many slides. 58 slides for a 60-minute presentation. That's really about 2 hours of material. Unless people have questions; then it's a half-day seminar.


I think I've gone waaaay too far on this. But it's my first one, and I'd hate to burn through all eight slides, take a few questions and be done too soon.

If this goes well, perhaps I'll see if I can come up with other 1-hour topics.

I worry a great deal about rehashing the obvious.

On the other hand, I'm working with a room full of newbies, and I think I could spend several hours on each of their questions.

And straightening out their confusions.

Case in point. Not directly related to the webcast.

One of my colleagues had seen a webcast which described Python's &, |, and ~ operators, comparing  them with and, or and not.

I'm not 100% sure, but... I think that this webcast -- I'm getting this second-hand; it's just hearsay -- showed that there's an important equivalence between and and &.

This is true, but hopelessly obscure. Since & has a higher priority than the comparison operators, there will be serious confusion when one fails to parenthesize properly.

Examples like this abound:

>>> 3 == 3 & 4 < 5
False
>>> (3 == 3) & (4 < 5)
True


Further, the fact that & can't short-circuit had become confusing to the colleague. I figured out some of what was going on when trying to field some seemingly irrelevant questions on "Why are some operators more efficient?" and "How do you know which to use?"
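Here's a small sketch of the short-circuit difference. The noisy() helper is made up for this illustration; it just records which operands actually get evaluated.

```python
def noisy(label, value, log):
    """Hypothetical helper: note that this operand was evaluated, then return it."""
    log.append(label)
    return value

# 'and' stops as soon as the left operand settles the answer.
log_and = []
result_and = noisy("left", False, log_and) and noisy("right", True, log_and)

# '&' is an ordinary binary operator: both operands are always evaluated.
log_amp = []
result_amp = noisy("left", False, log_amp) & noisy("right", True, log_amp)

print(result_and, log_and)   # False ['left']
print(result_amp, log_amp)   # False ['left', 'right']
```

Same answer, different amount of work. That's where the "which is more efficient?" question comes from, and it's the wrong question: the two operators aren't interchangeable when the right-hand side is costly or has side effects.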

Um. That's not really the point. There's no confusion if you set the bit-fiddling operators aside.

The point is that and, or, not, and the if-else conditional expression live in their own domain of boolean values. The fact that &, |, ^, and ~ will also operate on boolean values is a kind of weird duplication, not a useful feature. The arithmetic operators also work on booleans. Weirdly.

The Python rules are the rules; it makes sense for True&True to yield True. Results depend on the operands. It would be wrong in that sense for True&True to be 1. But it would also fit the concept of these operators a little better if they always coerced bool to int. This happens for * and +: True+True == 2.

Why can't it be true for & and |? It would reduce potential confusion. 
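A few assertions to pin down the behavior described above. This is a sketch against CPython 3; the note about ~ being deprecated on bool reflects my reading of the 3.12 documentation.

```python
# Two bools combined with &, | or ^ give a bool back.
assert (True & True) is True
assert (True | False) is True
assert (True ^ True) is False

# Mix in a plain int and the result falls back to int.
assert type(True & 1) is int

# The arithmetic operators always coerce bool to int.
assert True + True == 2
assert True * 3 == 3

# ~ gets no special treatment: it is plain integer inversion.
# (Python 3.12 and later emit a DeprecationWarning for ~ on a bool.)
flag = True
assert ~flag == -2
```

So & and | preserve bool only when both operands are bool, while + and * never do.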

I'm sure the person who implemented __and__(), __or__(), and __xor__() for bool was happy to create a parallel universe between and and &. (The ~ operator didn't get the same treatment: ~True is -2, not False.) I'm not sure I agree.

And perhaps I should have a webcast on Python logic. It seems like a rehash of fundamentals to me. But I have colleagues confused by fundamentals. So perhaps I'm way wrong about what's fundamental and what's useful information.

Thursday, January 15, 2015

Chapter 12 Alternate Example - Normalization and Decorators

In the forthcoming Functional Python Programming (https://www.packtpub.com/application-development/functional-python-programming) I was pressured by one of the technical reviewers to create a better example of composite function creation with decorators.

This was a difficult request. First, of course, "better" is poorly defined. More importantly, the example in the book is extensive and includes the edge-of-the-envelope "don't do this in real code" parts, too. It's important to be thorough. Finally, it's real-world data cleansing code. It's important to be pragmatic, but it's kind of boring. I really do beat it into submission showing simple decorators, parameterized decorators, and crazy, obscurely bad decorators.

In this case, "better" might simply mean "less thorough."

But, perhaps "better" means "less focused on cleansing and more focused on something else."

On Decoration

The essence of the chapter -- and the extensive example -- is that we can use decorators as higher-order functions to build composite functions.

Here's an alternative example. This will combine z-score normalization with another reduction function. Let's say we're doing calculations that require us to normalize a set of data points before using them in some reduction.

Normalizing is the process of scaling a value by the mean and standard deviation of the collection. Chapter 4 covers this in some detail. Reductions like creating a sum are the subject of Chapter 6. I won't rehash the details of these topics in this blog post.
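For readers without the book handy, here's a minimal sketch of z-score normalization. It uses the standard library's statistics module as a stand-in for the Chapter 4 functions, and it assumes the population standard deviation; that assumption is consistent with the output shown later in this post, but the book's own definitions aren't reproduced here.

```python
from statistics import mean, pstdev

def z_scores(data):
    """Scale each value by the mean and (population) standard deviation."""
    m = mean(data)
    s = pstdev(data)   # assumption: population form, not the sample form
    return [(x - m) / s for x in data]

samples = [2, 4, 4, 4, 5, 5, 7, 9]
print(z_scores(samples))
# [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```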

Here's another use of decorators to create a composite function.

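def normalize( mean, stdev ):
    # z-score of a single value; mean and stdev are fixed when the decorator is built.
    z_score = lambda x: (x-mean)/stdev
    def concrete_decorator( function ):
        def wrapped( data_arg ):
            # map() is lazy in Python 3; the decorated function consumes it.
            z = map( z_score, data_arg )
            return function( z )
        return wrapped
    return concrete_decorator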

The essential feature of the @normalize(mean, stdev) decorator is that it applies the normalization to the vector of argument values before passing them to the original function. We can use it like this.

>>> d = [ 2, 4, 4, 4, 5, 5, 7, 9 ]
>>> from Chapter_4.ch04_ex4 import mean, stdev
>>> m_d, s_d =  mean(d), stdev(d)
>>> @normalize(m_d, s_d)
... def norm_list(d):
...     return list(d)
...
>>> norm_list(d)
[-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]

We've created a norm_list() function which applies a normalization to the given values. This function is a composite of normalization plus list().

Clearly, parts of this are deranged. We can't even define the norm_list() function until we have mean and standard deviation parameters for the samples. This doesn't seem appropriate.

Here's a slightly more interesting composite function. This combines normalization with sum().

>>> @normalize(m_d, s_d)
... def norm_sum(d):
...     return sum(d)
...
>>> norm_sum(d)
0.0

We've defined the normalized sum function and applied it to a vector of values. The normalization has parameters applied. Those parameters are relatively static compared with the parameters given to the composite function.

It's still a bit creepy because we can't define norm_sum() until we have the mean and standard deviation.

It's not clear to me that a more mathematical example is going to be better. Indeed, the limitation on decorators seems to be this:

  • The original (decorated) function can have lots of parameters;
  • The functions being composed by the decorator must either have no parameters, or have very static "configuration" parameters.

If we try to compose functions in a more general way -- all of the functions have parameters -- we're in for problems. That's why the data cleansing pipeline seems to be the ideal use for decorators.
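For contrast, here's a sketch of the "no parameters" case from the list above. The self_normalize name is hypothetical and this isn't from the book; it computes the mean and standard deviation from whatever data arrives, using statistics.mean and statistics.pstdev as stand-ins for the Chapter 4 functions.

```python
from functools import wraps
from statistics import mean, pstdev

def self_normalize(function):
    """Hypothetical decorator: z-score the data using its own mean and stdev."""
    @wraps(function)
    def wrapped(data_arg):
        data = list(data_arg)              # materialize; we need several passes
        m, s = mean(data), pstdev(data)
        return function((x - m) / s for x in data)
    return wrapped

@self_normalize
def norm_list(d):
    return list(d)

@self_normalize
def norm_max(d):
    return max(d)

print(norm_list([2, 4, 4, 4, 5, 5, 7, 9]))
print(norm_max([2, 4, 4, 4, 5, 5, 7, 9]))   # 2.0
```

This can be defined before any data exists, but only because the normalization has nothing left to configure. The moment it needs arguments of its own, we're back to the static "configuration" case.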

Thursday, January 8, 2015

The Python Challenge

See http://www.pythonchallenge.com

Addicting. For folks (like me) who like this kind of thing. For others, perhaps just dumb. Or infuriating.

Years ago -- many, many years ago -- I vaguely remember a similar game with a name like "insanity" or something like that. Now there's http://www.notpron.com and http://www.weffriddles.com. All of these are "show the page source" HTML games. These games are a kind of steganography: the page your browser renders isn't what you need to see.

What's important about the Python Challenge is that it's not specifically about Python. Any programming language would do. Although I suspect that folks who don't know Python will have a difficult time with some of the puzzles. I found that having Pillow was essential for problems 7 and 11. I'm sure there are packages as powerful as PIL/Pillow for other languages.

Also, one of the hints included dated Python 2.7 code. The rest of the problems, however, seem to fit perfectly well with Python 3.4.

I wasted a morning getting to challenge 11. It was a ton of fun.

Challenge 12 was the first of the show-stoppers. The hint "evil1.jpg" is beyond subtle. Let me add this hint: This is the first puzzle where the pictures have digits. Perhaps there are related pictures.

I spent hours studying and rearranging and filtering and enhancing evil1.jpg before I finally broke down and searched for a hint. The hint -- of course -- included the whole solution, so I had to skim the code to figure out what I'd missed.

Challenges 14, 15, and 16 require additional hints, also. 14, for example, needs a reminder that the pixels need to be spiraled. Challenge 15 requires only minimal programming and a lot of Google searching for famous people's birthdays. Challenge 16's hint is as opaque as the picture. It involves restructuring the image. But. I had to resort to reading more of the solution at http://intelligentgeek.blogspot.com/2006/03/python-challenge-16-ahh-i-finally.html than for other problems.

I have chapters to review. I really shouldn't be playing around with silliness like this.

In spite of that, let me just say, that reading about the "Look-and-Say" sequence was a bunch of fun. See http://oeis.org/A005150. Whatever you do, avoid reading this: http://archive.lib.msu.edu/crcmath/math/math/c/c671.htm; it won't help you with the Python Challenge at all. But it's interesting. And a huge time-waster. This particular challenge was more like Project Euler problems. [Project Euler is back up and running, BTW.]

Here's my variation on the Conway sequence theme:

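    def say( digits ):

        def run_lengths(digits):
            d_iter= iter(digits)
            # Seed the buffer with the head of the sequence.
            c, d0 = 1, next(d_iter)
            for d in d_iter:
                if d0 == d:
                    c += 1
                else:
                    # The run ended: emit it and start counting the next one.
                    yield str(c)+d0
                    c, d0= 1, d
            # Emit the final run.
            yield str(c)+d0

        return "".join(run_lengths(digits))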

I'm a fan of generator functions. A big fan.

The interesting part is that we can do run-length encoding for the look-and-say function relatively simply using the "buffered generator" design pattern.

1. Seed the buffer with the head of the sequence, next(d_iter)
2. For each item in the tail of the sequence:
    a. If it matches, count.
    b. If it doesn't match, yield the interim reduction and reset the counter.
3. Yield the tail reduction.
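The steps above can be sketched in a more general form. This is a variation written for illustration: it yields (count, item) pairs for any iterable, and it adds a guard for empty input that the numbered steps don't mention.

```python
def run_lengths(items):
    """Yield (count, item) for each run of equal adjacent items."""
    item_iter = iter(items)
    try:
        count, head = 1, next(item_iter)   # 1. seed the buffer
    except StopIteration:
        return                             # empty input: nothing to yield
    for item in item_iter:                 # 2. walk the tail
        if item == head:
            count += 1                     # 2a. it matches: count
        else:
            yield count, head              # 2b. no match: emit, then reset
            count, head = 1, item
    yield count, head                      # 3. emit the final run

def say(digits):
    """One look-and-say step built on the generic run-length generator."""
    return "".join(str(c) + d for c, d in run_lengths(digits))

term = "1"
terms = [term]
for _ in range(5):
    term = say(term)
    terms.append(term)
print(terms)   # ['1', '11', '21', '1211', '111221', '312211']
```

The standard library's itertools.groupby() covers much the same ground; writing the generator by hand just makes the buffering explicit.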

This design pattern seems to occur in a number of contexts outside games and abstract math.

Thursday, January 1, 2015

eLearning eXtravaganza

Visit Packt Publishing today for the $5 eBook Bonanza.

What better way to celebrate the new year?

Read. Learn. Grow.

Find out more at http://www.packtpub.com/packt5dollar


Thursday, December 25, 2014

Intro to Python Tutorial


A very nice tutorial. It's focused on a specific problem. It covers the solution technology in some depth. I think the focus and depth features are important. It's often tempting to cover the technical features without really solving a problem.

In the "real world," we're often pressured to put the first MVP into production and move on to the next problem. I put "real world" in "scare quotes" because this approach is as dumb as a bag of hammers. Managers who insist on installing or shipping the first Minimally Viable Product are essentially purchasing technical debt instead of a solution.

I like the tutorial because it includes additional aspects like quality assurance. It's called "Defensive Programming," but it's really QA. I like to call it "fit and finish." The job's not over until there are automated tests to demonstrate that it's over.

The Software Carpentry site as a whole looks quite good. It seems to have numerous high-quality tutorials.

Tuesday, December 23, 2014

Packt Deals

Okay. This seems shameless. But.

Here's the link http://bit.ly/1zg0mpA straight to my book information page on www.PacktPub.com

I'm slowly coming to grips with the reality of marketing.

Friday, December 19, 2014

Dev of the Week


Yes. Everyone is famous for 15 minutes.

And. "On the Web, everyone will be famous to fifteen people."