Bio and Publications

Thursday, December 25, 2014

Intro to Python Tutorial

http://sophieclayton.github.io/2015-01-15-uw/novice/python/index.html

A very nice tutorial. It's Focused on a specific problem. It covers the solution technology in some depth. I think the focus and depth features are important. It's often tempting to cover the technical features without really solving a problem.

In the "real world," we're often pressured to put the first MVP into production and move on to the next problem. I put "real world" in "scare quotes" because this approach is as dumb as a bag of hammers. Managers who insist on installing or shipping the first Minimally Viable Product are essentially purchasing technical debt instead of a solution.

I like the tutorial because it includes additional aspects like quality assurance. It's called "Defensive Programming," but it's really QA. I like to call it "fit and finish." The job's not over until there are automated tests to demonstrate that its over.

The Software Carpentry site as a whole looks quite good. It seems to have numerous high-quality tutorials.

Tuesday, December 23, 2014

Packt Deals

Okay. This seems shameless. But.

Here's the link http://bit.ly/1zg0mpA straight to my book information page on www.PacktPub.com


I'm slowly coming to grips with the reality of marketing.

Friday, December 19, 2014

Dev of the Week

http://java.dzone.com/articles/dev-week-steven-lott

Yes. Everyone is famous for 15 minutes.

And. "On the Web, everyone will be famous to fifteen people."

Thursday, December 18, 2014

Making Learning Accessible

Visit Packt Publishing today for the $5 eBook Bonanza.  https://www.packtpub.com.

eBooks and videos at a discount through something like the 6th of January.

We autodidacts are rejoicing.

Specifically, I can look at some of the Scala and Hadoop titles. I'm working with folks who have Hadoop but I've heard rumors that they're leaning toward Scala, also. Does that mean Apache Spark? Or does it mean Scalding?

I'm biased toward using Python with Hadoop; but I appear to be in the minority on this. Time to do some additional learning.


Tuesday, December 16, 2014

The Getting Started Problem

How does one get started developing software? What's the first step?

When you come to this craft -- or sullen art -- without a background except as a user, how do you get started writing code?

It's not easy. Indeed, developing software may be one the hardest things there is. Really, really hard.

Why? Consider the orders of magnitude involved. From sub-microsecond clock speeds to software that's supposed to continue running for 8,763 hours a year without interruption. That's 31,547,269 seconds. Isn't that about 15 orders of magnitude?

Or consider scope of storage. We wrangle over bytes in a dataset that spans terabytes. That's 12 orders of magnitude.

When engineers build a 13,000' long bridge, are they looking at it from scales of 10±5? Do they even care what's 21 miles away? They might care about things at the scale of 10-5, since that's about an inch. But 10-7? 100th of an inch? I could be wrong, but I have doubts.

I won't go so far as to say bridge building is particularly easy. It's safety critical work. People die when things go wrong. Consequently, it's regulated by civil engineering standards. Bridge designs are limited to proven patterns. You can't spring something new on the world and expect anyone to pay money for it or trust their life to it.

If you're with me so far, you see my point: software is different. And that makes it particularly hard. People do learn elements of it. How does this happen?

Two Paths Diverge

I see two separate paths:

  • More formal, and
  • Less formal.
The more formal path includes the kind of curriculum you find at big CS schools. Formal treatment of algorithms and data structures. Logic and Computable Functions. The essentials of Turing Completeness.


The less formal path starts with -- essentially -- random hacking around, trying to get stuff to work. Some folks argue that a curriculum of structured exercises isn't "random" hacking around. I suggest that a curriculum of structured exercises can be the formal path concealed under a patina of hackeriness. On the other hand, a set of exercises can be successful at training programmers; if it doesn't follow a formalized structure, it's merely a small step from random. 

[Random doesn't mean "bad;" it means "informal" and "unstructured."]

Some folks learn well in a formal, structured approach. They like axiomatic definitions of computability, and they can get a grip on how to map the abstractions of computing to specific languages and problem domains. They read content at http://www.algorist.com and see applications of principles.

Other folks can be shown the formal background that makes their random hacking fit into a larger pattern. When shown how some things fit a larger pattern, they're often happy work in a new context with an expanded repertoire of data structures and algorithms. They read content at http://www.algorist.com and look for solutions to problems; the formal patterns will emerge eventually.

Not all folks respond well to having their informal notions challenged. Some folks have ingrained bad habits and prefer to fight to the death to avoid change. A sad state of affairs, but remarkably common. They didn't understand linked lists at some point and steadfastly refuse to use the java.util.LinkedList class. This is what software religious wars are about. Some trolls truly and deeply love an uniformed religious war. 

Chickens and Eggs

Is this a chicken-and-egg problem? 
  • You can't really appreciate the formal foundations until you have some hands-on coding experience.
  • You shouldn't dirty your hands with implementation details until you have the proper theoretical foundations.
That seems potentially reductionist and uninformative. Or. Perhaps there is a nugget of truth in this. Perhaps one is actually foundational.

Eggs, to be specific, show the fresh mutations. The egg comes first from a chicken-like precursor that's not properly a chicken. 

What's that precursor to programming in Python? CS Fundamentals? Hacking around? I suggest that the way we acquire languages is important here.

Language Skills

Software languages are a small step from natural languages. As with learning natural languages, formal grammar may not be as helpful as engaging in conversations. Indeed, for natural languages, formal grammars are an afterthought. They're something we discover about a corpus. We impose the discovered grammar rules on ourselves (and others) to be understood in a context of other writing (and speaking.) 

Natural language grammar isn't timeless and immutable. People throw their hands up in despair at the erosion of grammar and language. They're -- of course -- just being reactionary. Language evolves. The loudest complainers are the ones who didn't pay attention for a long time and suddenly (somehow) realized the don't know what "WTF" means. LOL.

With an artificial language, the grammar is formalized. It has a first-class existence in compilers, interpreters and other tools. 

However, I think the bits of our brain that assimilate grammar work best from concrete examples. A formal grammar definition -- while helpful -- isn't the way to start. I think that a less formal, "try this" suite of exercises is perhaps the best way to learn to program.

As an author, I'm beholden to my publisher's notions of what sells. Examples sell. See almost everything from Packt. Working examples are solid gold. 

These are not necessarily problems for the reader to tackle and solve. They're examples to study.

The conundrum with attempting to solve problems is the attempting part. It's hard to set out a list of "solve these problems and master programming" problems and hope folks get through them. What if they fail? Clearly, you'd provide answers. In that case, you'd be back at examples to study. Hmm.

I have intermittent interest in my older Building Skills in Python book. Partly because it's got extensive exercises in each chapter. I get donations. I get inquiries. The exercises seem to resonate in a small way.

I've done about 22 levels of the Python Challenge (I'll write about that separately.) It's not a great way to learn from scratch. You need to know a lot. And you need a lot of hints. 

I've done almost 70 levels of Project Euler. It might be a better way to learn programming because the easy problems are really easy. No guesswork. No riddles. No steganography. The answers are totally cut-and-dried, unambiguous, and absolute. However, there's no easy guidance for learners. Either you have an answer, and want help on improving it, or ... well ... you're stuck and frustrated. 

Structured Sequence of Exercises

What strikes me as a possibility here is a structured series of exercises that lay out the foundations of computer science as realized in a specific programming language.

Puzzle-style. With extensive hints. Background readings, too. But with absolutely right answers. And a score-keeping system to show where you stand. 

No tricky riddles. No quizzes to proceed. You could go on to advanced material without mastering the foundations, if you wanted.

I've got a bunch of exercises and examples in my Building Skills books. Plus some of the examples in my Packt books can be modified and repurposed. Plus. Projects like HamCalc contain a wealth of simple applications that can be adjusted to show CS fundamentals.

Perhaps relevant is this: https://www.google.com/edu/programs/exploring-computational-thinking/.   I'm not sure precisely how it fits, since it seems to be more aimed at providing a general background, rather than teaching programming language skills. They decompose the skills into four specific techniques. Here are specific techniques.
  • Decomposition: Breaking a task or problem into steps or parts.
  • Pattern Recognition: Make predictions and models to test.
  • Pattern Generalization and Abstraction: Discover the laws, or principles that cause these patterns.
  • Algorithm Design: Develop the instructions to solve similar problems and repeat the process.
Perhaps this is relevant: http://interactivepython.org/courselib/static/pythonds/index.html.  I haven't read this carefully, but it seems to be expository rather than exploratory.  It's really thorough. It has quizzes and self-checks. 

I think there's a big space for publishing lots simple recreational programming exercises as teaching tools. 

Thursday, December 11, 2014

Wow. Two-Word Question. Profound Insight.

I'm working on yet another Python book. This one looking at functional programming in Python. It doesn't really go with with Mastering Object-Oriented Python and Python for Secret Agents because the focus isn't on Python's strong suit.

In chapter one, a reviewer had this two-word question:

"yield from?"

What? What does "yield from" mean?

Oh.

Wow.

https://docs.python.org/3/whatsnew/3.3.html#pep-380-syntax-for-delegating-to-a-subgenerator

I had utterly missed this profound, important feature.

I guess I have been too blasé in skimming the release notes.

That's embarrassing.  And it only took two words to reveal my mistake.

I had to then review all 113 yield statements in 72 files of examples that go with the book.  That means most chapters will get touched to revise an example to show yield from iter instead of the older for x in iter: yield x template.

This also changes the Tail Call Optimization material. The explicit for was actually kind of nice for showing how TCO is implemented in Python. The yield from makes it a little less clear.

Some reviewers consider TCO so fundamental that it belongs in chapter 1. The omission of detailed analysis of Python's TCO approach was considered a significant flaw. Other reviewers seemed happy setting discussion of TCO aside for later.

The Functional Python Conundrum

This book is going to be difficult. The ratings from the reviewers were low. Really low. It looks like I've got a lot of work to do. Finding the target audience will be difficult.

One reviewer asked -- in effect -- why would someone who knew functional LISP ever use Python? I don't think there's a big audience of disgruntled LISP programmers, so that's not a relevant question.

Viewed from the other direction, it's hugely import. Why would a Python programmer adopt functional design patterns? That's the question that needs to be answered clearly.

And from the reviews of chapter 1, it wasn't addressed clearly enough.

Thursday, December 4, 2014

Architectural Principles, Spring Framework, and Jersey JAX-RS

See this: http://www.moschetti.org

Attended a meeting with Buzz. Not stated in his blog (in an obvious way) was something he said about not being a fan of big frameworks. I didn't write down his punchline, but it was a pretty pithy summary of the framework tradeoff.

IIRC, it was essentially this: you can wrestle with one or both of these technical problems.
  • Boilerplate Code
  • A Framework's Conceptual Model
Either you have to create your own libraries or you have to learn someone else's. This is in addition to wrestling with the business problem you're supposed to be solving.

Buzz's point seemed to be that you can often manage your own boilerplate more easily than you can come to grips with a framework. If one member of your sprint team handles reusable services, you can just ask them for a feature. You don't have to spend an hour reading other people's struggles.

After spending three months getting my brain wrapped around Spring Framework, I'm inclined toward partial, qualified agreement. Frameworks seem to have limited value until you're an expert in using them.

Layers and Layers

When wrestling with a new feature, you are forced to assume that you've understood its semantics. When you mock a framework element for test purposes, you're reduced to hope that your unit tests are sufficient. A unit tests of a mocked framework element only tests your assumptions. If you're not using the element's API correctly, your tests can't show that the framework will break or raise exceptions.

For new technology, you need to start with a technical spike to understand the framework. Then you can write unit tests that test against known framework behavior. Then you can write the real code that's based on the unit tests that are based on a spike that shows how the framework really works.

Using a technical spike for discovery and debugging can be challenging. You don't want to drag around your entire application just to create a spike. But you don't want to drop back to a trivial "hello world" spike that doesn't really apply to your context. You have to balance simplicity against realism.

For example, making JAX-RS requests to web services is aggravating to debug. You can spend many hours looking at boilerplate 401 and 404 errors wondering what's missing. You can't write the unit tests until you finally get something to work. Once you have something, you can replace real objects with mock objects.

If you already know JAX-RS features, it's easy. If you already know the RESTful service, it's not too bad. If you know neither JAX-RS nor the service, you don't have any clue which direction to turn. Did I misuse JAX-RS? Is something wrong in the request? Am I missing a required header? Did I leave something off the Accept header?

I finally had to give up creating spikes and debugging RESTful requests in Java. It turned out to be simpler to write a version of the REST client in Python. I used this to figure out how the real service really worked. Given a working Python spike, I could then save those interactions for WireMock.

Once I has a clue how the service worked, I could also write a mock server for some more sophisticated experiments.  This was useful for debugging problems based on a failure to understand JAX-RS.

Yes. Rather than struggle with the framework, I wrote the client once in Python and then rewrote the client again in Java. It seemed quicker than trying to debug it in Java.

One contributing factor is the 1m 30s build time in Maven. Compare that with interactive  Python at the >>> prompt.

Perhaps a smaller framework would have been better.