S.Lott-Software Architect: April 2014

Thursday, April 24, 2014

Literate Programming with pyWeb 2.3

Updates completed. See https://sourceforge.net/projects/pywebtool/ and http://pywebtool.sourceforge.net.

The list of changes is extensive.

However, the essential API and the markup language for creating literate programs haven't (significantly) changed. A few experimental features were replaced with first-class implementations.

The interesting (to me) bit is this sequence of events.

I started out using Leo and Interscript as literate programming tools. They worked. But they were large and clunky and I wasn't happy.

I wrote my own tool, not really getting the use cases.

I found pyLit and liked it a lot. For a long time, I liked it better than my own pyWeb tool.

Then I ran across some problem domains for which pyLit didn't work out well. It's not that I've abandoned pyLit, but I believe I'll focus more on pyWeb.

The Awkward Problem Domains

Here are the two awkward problem domains.

Historical Story Lines. In some cases, we want to describe a module or package based on the path of exploration. Rather than simply drop the design, we want to show the path followed which led to the design. This can be helpful for certain kinds of pedagogical exercises where we're steering the reader through a process.
Complex Packages that Don't Follow Python's Presentation Order. In some cases, we need to present things out of order. Python constrains us to have the docstring and imports first. Our class definitions must proceed in "dependency" order. But this may not be the best order for explanation. Sometimes, we want to start with the "def main():" function to explain why a class looks the way it does.

PyWeb handles these nicely. One of the handiest things for out-of-order presentation is this.

@d Some Class... @{
class TheClass:
    pass  # body omitted here
@}

This class uses the following imports.

@d Imports...@{
import this
import that
@}

We can then scatter imports through the documentation in the relevant places. And they follow the more interesting material.

When it comes to final assembly, we have this.


@o some_module.py @{
@<Imports for this module@>
@<Some Class that does the real work of this module@>
@}

This builds the module, tangling the imports into one cluster up front, and putting the class definition later.
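To make the tangling idea concrete, here's a toy sketch in Python. This is not pyWeb's implementation; the `define()` and `tangle()` functions are hypothetical names I made up for illustration, and the sketch uses full chunk names instead of the "..." abbreviations. It only shows how definitions scattered through a document accumulate under one name and then get expanded in the order the output chunk asks for.

```python
import re

# Toy model of tangling. NOT pyWeb's code; names here are hypothetical.
REFERENCE = re.compile(r"^(\s*)@<(.+?)@>\s*$")


def define(chunks, name, body):
    """Append body to the named chunk; repeated definitions accumulate."""
    chunks.setdefault(name, []).extend(body.splitlines())


def tangle(chunks, name, indent=""):
    """Expand a chunk, recursively replacing @<name@> reference lines."""
    lines = []
    for line in chunks[name]:
        match = REFERENCE.match(line)
        if match:
            # Indentation in front of a reference carries into the expansion.
            lines.extend(tangle(chunks, match.group(2), indent + match.group(1)))
        else:
            lines.append(indent + line if line else line)
    return lines


chunks = {}
# Definitions appear in explanation order, scattered through the document.
define(chunks, "Some Class", "class TheClass:\n    pass")
define(chunks, "Imports", "import this")
define(chunks, "Imports", "import that")
# The output chunk imposes the order Python requires.
define(chunks, "some_module.py", "@<Imports@>\n@<Some Class@>")

print("\n".join(tangle(chunks, "some_module.py")))
```

The two "Imports" definitions end up as one cluster ahead of the class, even though the class was defined first in the document.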

Thursday, April 17, 2014

Stingray 4.3 Update

See https://sourceforge.net/projects/stingrayreader/

Some small improvements to the COBOL DDE parsing.
A sensible demo program that shows how to read COBOL files.
A complete rewrite to Python 3.3.
Support for more COBOL syntax.
Support for Occurs Depending On.
Support for RECFM=F, RECFM=V and RECFM=VB legacy files.

The support for Occurs Depending On is a Big Sweaty Deal (BSD™). It breaks the essential structure for calculating offset and size of data items in a fixed file schema. It breaks it badly. We wind up with a fairly complex recursive calculation in the general case of variably located items.
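Here's a minimal sketch of why the calculation turns recursive. This is not Stingray's API; the `Item` class and `layout()` function are hypothetical, and the sketch assumes plain text records with display-format numeric counts (real files would be bytes, often EBCDIC). The point is that the offset of anything after an Occurs Depending On group can't be known until the count has been read from the record itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Item:
    """Hypothetical schema node: an elementary item or a group of items."""
    name: str
    size: int = 0                        # characters, elementary items only
    children: List["Item"] = field(default_factory=list)
    occurs: int = 1                      # fixed OCCURS count
    depending_on: Optional[str] = None   # name of the item holding the count


def layout(item, record, offset, counts, found):
    """Return the size item occupies at offset; log (name, offset, size) in found."""
    start = offset
    if item.depending_on is not None:
        repeat = counts[item.depending_on]   # value read earlier from this record
    else:
        repeat = item.occurs
    for _ in range(repeat):
        if item.children:
            for child in item.children:
                offset += layout(child, record, offset, counts, found)
        else:
            text = record[offset:offset + item.size]
            found.append((item.name, offset, item.size))
            if text.isdigit():
                counts[item.name] = int(text)  # may be needed by a later ODO
            offset += item.size
    return offset - start


schema = Item("ORDER", children=[
    Item("ITEM-COUNT", size=2),
    Item("LINE-ITEM", children=[Item("SKU", size=3)], depending_on="ITEM-COUNT"),
    Item("TRAILER", size=4),
])

for record in ("02AAABBBDONE", "03AAABBBCCCDONE"):
    found = []
    total = layout(schema, record, 0, {}, found)
    print(total, found[-1])
```

TRAILER lands at offset 8 in the first record and offset 11 in the second. With a fixed schema, that offset would be a constant computed once.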

We'll address ODS and Numbers spreadsheets with a somewhat cleaner implementation, also. I figured out how ElementTree QNames work. I regret the ignorant misuse of namespaces in previously posted code. This will be part of release 4.4 or later.
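For what it's worth, here's a small sketch of the QName idea using the standard library's `xml.etree.ElementTree`. The sample document is made up for illustration, and this is not the Stingray code; it assumes only the OpenDocument table namespace URI.

```python
import xml.etree.ElementTree as ET

TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"

# A made-up fragment in the shape of an ODS table.
document = f"""<table:table xmlns:table="{TABLE_NS}" table:name="Sheet1">
  <table:table-row><table:table-cell/><table:table-cell/></table:table-row>
</table:table>"""

root = ET.fromstring(document)

# A QName combines a namespace URI and a local name into "{uri}local" form.
row_tag = ET.QName(TABLE_NS, "table-row")
cell_tag = ET.QName(TABLE_NS, "table-cell")
name_attr = ET.QName(TABLE_NS, "name")

# ElementTree stores tags and attribute keys in the same "{uri}local" form,
# so the match doesn't depend on the prefix used in the file.
sheet_name = root.get(name_attr.text)
cell_count = sum(
    len(row.findall(cell_tag.text)) for row in root.iter(row_tag.text)
)
print(sheet_name, cell_count)
```

The prefix ("table:") is only a convenience in the source file; the URI is what identifies the element.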

Thursday, April 10, 2014

The SortedContainers Package for Python

See this: SortedContainers — sortedcontainers 0.6.0 documentation

Here's some text from the invitation.

You may find the performance comparison and implementation details interesting because it doesn't use any sophisticated tree data structure or balancing algorithms. It's a great example of taking advantage of what processors are good at rather than what theory says should be fast.

The documentation is extensive. The implementation details are interesting. The claim of being faster is supported nicely. I have two quibbles.

It actually does use a sophisticated tree data structure. A list of lists really is a kind of tree.
"rather than what theory says should be fast" doesn't make any sense to me at all.

A claim that Computer Science theory isn't right bothers me. If theory says some algorithm is fast, there are only two possibilities: (1) theory is actually right and it really is fast and the demonstration was incomplete or (2) the theory is incomplete, and the implementation extends (or replaces) the old theory; the implementation is new theory.

It's never the case that theory is "wrong." That fails to understand the role of theory.

It's always the case that an implementation either confirms theory or extends theory with new results.

To me, this package demonstrates one of two things.

The theory was incomplete and this package is a new theory that replaces the old, wrong theory.
The theory was right and this package demonstrates that the theory was right by being a good, solid, usable implementation.

I would suggest the second option here: this package shows the value of Python's list-of-lists as a high-performance technique for implementing sorted structures. It's not an example of "taking advantage of what processors are good at." This is an example of using Python properly to squeeze excellent performance out of the available structures.

The really important insight is this: "The sorted container types are implemented based on a single observation: bisect.insort is fast, really fast."

This is a profound observation. Read more here: http://www.grantjenks.com/docs/sortedcontainers/implementation.html
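Here's a toy sketch of the list-of-lists idea built on `bisect.insort`. This is not the sortedcontainers implementation; the class name and the `load` parameter are hypothetical, and the sketch only covers adding and iterating. It keeps every sublist short so each `insort` stays cheap, and it keeps a separate list of sublist maximums to find the right sublist by bisection.

```python
import bisect


class SortedListOfLists:
    """Toy sorted collection: a list of short, sorted sublists."""

    def __init__(self, load=4):
        self.load = load   # hypothetical maximum sublist length before a split
        self.lists = []    # sublists, each sorted, ascending overall
        self.maxes = []    # largest value of each sublist

    def add(self, value):
        if not self.lists:
            self.lists.append([value])
            self.maxes.append(value)
            return
        # First sublist whose maximum is >= value; otherwise the last sublist.
        pos = min(bisect.bisect_left(self.maxes, value), len(self.lists) - 1)
        sub = self.lists[pos]
        bisect.insort(sub, value)       # cheap because the sublist is short
        if len(sub) > self.load:        # split to keep sublists short
            half = len(sub) // 2
            self.lists[pos:pos + 1] = [sub[:half], sub[half:]]
            self.maxes[pos:pos + 1] = [sub[half - 1], sub[-1]]
        else:
            self.maxes[pos] = sub[-1]

    def __iter__(self):
        for sub in self.lists:
            yield from sub


values = [5, 1, 9, 3, 7, 2, 8, 6, 4, 0]
s = SortedListOfLists(load=4)
for v in values:
    s.add(v)
print(list(s))
```

The outer list of maximums plays the role of a one-level index over the sublists, which is why I call this a kind of tree.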

Thursday, April 3, 2014

Mastering Object-Oriented Python

See http://www.packtpub.com/mastering-object-oriented-python/book

Coming soon.

This is relatively deep, under-the-hood stuff for folks who want to master the Python feature set.

Here's the overview of what you get:

0 Some Preliminaries 3 examples, 56 lines
1 The __init__() Method 55 examples, 351 lines
2 Integrating Seamlessly with Python: Basic Special Methods 92 examples, 558 lines
3 Attribute Access, Properties, and Descriptors 33 examples, 310 lines
4 The ABC's of Consistent Design 18 examples, 108 lines
5 Using Callables and Contexts 17 examples, 214 lines
6 Creating Containers and Collections 50 examples, 438 lines
7 Creating Numbers 12 examples, 232 lines
8 Decorators And Mixins – Cross Cutting Aspects 39 examples, 233 lines
9 Serializing and Saving: JSON, YAML, Pickle, CSV and XML 77 examples, 648 lines
10 Storing and Retrieving Objects via shelve 34 examples, 272 lines
11 Storing and Retrieving Objects via SQLite 45 examples, 410 lines
12 Transmitting and Sharing Objects 38 examples, 388 lines
13 Configuration Files and Persistence 59 examples, 490 lines
14 The Logging and Warning Modules 46 examples, 343 lines
15 Designing for Testability 38 examples, 393 lines
16 Coping With The Command Line 42 examples, 222 lines
17 Module and Package Design 31 examples, 93 lines
18 Quality and Documentation 42 examples, 269 lines
Preface 3 examples, 12 lines
Bonus Chapter 1 Archives and Directories 11 examples, 119 lines
Bonus Chapter 2 Case Study: Document Analysis 39 examples, 308 lines

824 examples, 6467 lines

Yes. That's a lot of code. It's relentless.

Moved

Moved. See https://slott56.github.io. All new content goes to the new site. This is a legacy site, and will likely be dropped five years after the last post in Jan 2023.