Monday, November 30, 2009
Thursday, November 26, 2009
Tuesday, November 24, 2009
Thursday, November 19, 2009
Sunday, November 15, 2009
The ORM layer "hides" the database, right?
We never have to think about persistence, right? It just magically "happens."
Here are some quotes from a recent email:
"Somehow people are surprised that we would have performance issues. Somehow people are surprised that now that we are putting humpy/dumpy together that we would have to go back and look at how we have partitioned the system."
I'm not sure what all of that means, except that the author appears to believe some mysterious "people" treat performance as a secondary consideration.
I don't have a lot of technical details, just a weird ranting list of complaints, including the following.
"... the root cause of the performance issue was that each call to the component did a very small amount of work. So, they were having to make 10 calls to 10 different components to gather useful info. Even though each component calls was quick (something like 0.1 second), to populate the gui screen, they had to make 15 of them."
ORM Is A "Silver Bullet" -- It Solves All Our Problems
If you think that you can adopt some architectural component and then program without further regard for what that component actually does, stop coding now and find another job. Seriously.
If you think you don't have to consider performance, please save us from having to clean up your mess.
I'm repeatedly shocked at people who claim that some particular ORM (e.g., Hibernate) was unacceptable because of poor performance.
Hint 1: ORM == Mapping. Not Magic. Mapping.
The mapping is from low-rent relational row-column (with no usable collections) to object instances. That's all. Just mapping rows to objects. No magic. Object collections and SQL foreign keys are cleverly exchanged using specific techniques that must be understood to be used.
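To make "just mapping" concrete, here is a minimal sketch that uses the standard sqlite3 module instead of any real ORM. The `author` and `book` tables, the two classes, and the `load_author` helper are all hypothetical names invented for illustration. The point is only that a foreign key on the SQL side becomes a collection on the object side, and that building the collection costs a query.

```python
import sqlite3
from dataclasses import dataclass, field
from typing import List


@dataclass
class Book:
    id: int
    title: str


@dataclass
class Author:
    id: int
    name: str
    books: List[Book] = field(default_factory=list)  # the "collection" side


def make_db():
    """Build a tiny in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT,
                           author_id INTEGER REFERENCES author(id));
        INSERT INTO author VALUES (1, 'Author A');
        INSERT INTO book VALUES (10, 'Book 1', 1);
        INSERT INTO book VALUES (11, 'Book 2', 1);
    """)
    return conn


def load_author(conn, author_id):
    """Map one author row, plus its foreign-key rows, to objects."""
    row = conn.execute(
        "SELECT id, name FROM author WHERE id = ?", (author_id,)
    ).fetchone()
    author = Author(*row)
    # The foreign key book.author_id is what turns into author.books.
    rows = conn.execute(
        "SELECT id, title FROM book WHERE author_id = ? ORDER BY id",
        (author_id,),
    )
    author.books = [Book(*r) for r in rows]
    return author
```

Calling `load_author(make_db(), 1)` returns an `Author` whose `books` list holds two `Book` objects. Two queries ran to build it; nothing magic happened.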
Hint 2: Encapsulation != Ignorance.
OO design frees us from "implementation details". This does not mean that it frees us from performance considerations. Performance is not an "implementation detail". The performance considerations of class encapsulation are central to the very idea of encapsulation.
One central reason we have object-oriented design is to separate performance from programming nuts and bolts. We want to be able to pick and choose alternative class definitions based on performance considerations.
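A small sketch of what picking between class definitions can look like. Both classes below are hypothetical and assume unique names; they expose the same `find` method, so the calling code does not change when we swap one for the other. Only the performance does.

```python
class ListDirectory:
    """Keeps (name, value) pairs in a list; each lookup is a linear scan."""

    def __init__(self, pairs):
        self._pairs = list(pairs)

    def find(self, name):
        for key, value in self._pairs:
            if key == name:
                return value
        raise KeyError(name)


class DictDirectory:
    """Same interface, but indexes the pairs in a dict for hashed lookup."""

    def __init__(self, pairs):
        self._index = dict(pairs)

    def find(self, name):
        return self._index[name]


def total_for(directory, names):
    """Client code: depends only on find(), not on how it is implemented."""
    return sum(directory.find(n) for n in names)
```

The encapsulation hides how `find` works. It does not excuse us from knowing that one version scans and the other hashes.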
ORM saves us from writing mappings from column names to class instances. It saves us from writing SQL. It doesn't remove the need to think about what's actually going on.
If an attribute is implemented as a property that actually does a query, we need to pay attention to this. We need to read the API documentation, know what features of a class do queries, and think about how to manage this.
If we don't know, we need to write experiments and spikes to demonstrate what is happening. Reading the SQL logs should be done early in the architecture definition.
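Here is one way such a spike might look. It is a sketch under assumptions, not a recipe for any particular ORM: it uses the standard sqlite3 module, the `customer` and `invoice` tables are invented, and the `Invoice` class is a hand-written stand-in for an ORM class whose attribute is really a query. `Connection.set_trace_callback` plays the role of the SQL log.

```python
import sqlite3


def make_db():
    """Ten customers with one invoice each (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE invoice (id INTEGER PRIMARY KEY,
                              customer_id INTEGER, total REAL);
    """)
    conn.executemany("INSERT INTO customer VALUES (?, ?)",
                     [(i, "customer %d" % i) for i in range(1, 11)])
    conn.executemany("INSERT INTO invoice VALUES (?, ?, ?)",
                     [(i, i, 10.0 * i) for i in range(1, 11)])
    conn.commit()
    return conn


class Invoice:
    def __init__(self, conn, id, customer_id, total):
        self._conn = conn
        self.id = id
        self.customer_id = customer_id
        self.total = total

    @property
    def customer_name(self):
        # Looks like a plain attribute; actually runs a query on every access.
        row = self._conn.execute(
            "SELECT name FROM customer WHERE id = ?", (self.customer_id,)
        ).fetchone()
        return row[0]


def names_lazy(conn):
    """Fetch the invoices, then touch the query-backed property on each."""
    rows = conn.execute(
        "SELECT id, customer_id, total FROM invoice ORDER BY id")
    invoices = [Invoice(conn, *row) for row in rows]
    return [inv.customer_name for inv in invoices]


def names_joined(conn):
    """Fetch the same names with a single join."""
    sql = ("SELECT c.name FROM invoice i "
           "JOIN customer c ON c.id = i.customer_id ORDER BY i.id")
    return [row[0] for row in conn.execute(sql)]


def count_selects(conn, func):
    """Run func(conn) with statement tracing on; return (result, SELECT count)."""
    log = []
    conn.set_trace_callback(log.append)
    try:
        result = func(conn)
    finally:
        conn.set_trace_callback(None)
    selects = [s for s in log if s.lstrip().upper().startswith("SELECT")]
    return result, len(selects)
```

Running `count_selects(conn, names_lazy)` and `count_selects(conn, names_joined)` against the same database returns the same names, but the lazy version logs one SELECT for the invoices plus one per invoice, while the join logs one. That difference is exactly the thing to find out before the architecture is settled, not after the GUI is slow.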
You can't write random code and complain that the performance isn't very good.
If you think you should be able to write code without thinking and understanding what you're doing, you need to find a new job.
Tuesday, November 10, 2009
import re

# Fix style="background-image:url("url")"
# Replacing the inner quotes with &quot; keeps them from ending the style attribute.
background_image = re.compile(r'background-image:url\("([^"]+)"\)')
def fix_background_image( match ):
    return 'background-image:url(&quot;%s&quot;)' % ( match.group(1) )

# Fix src="url name="name""
bad_img = re.compile( r'src="([^ ]+) name="([^"]+)""' )
def fix_bad_img( match ):
    return 'src="%s" name="%s"' % ( match.group(1), match.group(2) )
fix_style_quotes = [
The "fix_style_quotes" sequence is provided to the BeautifulSoup constructor as the markupMassage value.
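BeautifulSoup 3 documents markupMassage as a list of (regular expression, replacement function) pairs that are applied to the markup before parsing. The sketch below imitates that pre-parse step with the standard library alone, so it runs without BeautifulSoup installed. The `apply_massage` helper, the `example_fixes` list, and the sample markup are assumptions made for illustration; only the `bad_img` pattern and its fix function come from the code above.

```python
import re

# From the post: repairs src="url name="name"" attributes.
bad_img = re.compile(r'src="([^ ]+) name="([^"]+)""')


def fix_bad_img(match):
    return 'src="%s" name="%s"' % (match.group(1), match.group(2))


# Hypothetical list of (pattern, function) pairs for this sketch only.
example_fixes = [(bad_img, fix_bad_img)]


def apply_massage(markup, fixes):
    """Apply each (pattern, function) pair in order, as a pre-parse cleanup."""
    for pattern, function in fixes:
        markup = pattern.sub(function, markup)
    return markup


broken = '<img src="logo.png name="logo"">'
print(apply_massage(broken, example_fixes))
# prints: <img src="logo.png" name="logo">
```

Keeping each fix as a small (pattern, function) pair means a new kind of broken markup costs one more pair, not a rewrite of the parser setup.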
Friday, November 6, 2009
Wednesday, November 4, 2009
def clean_directives( page ):
    """
    Stupid Microsoft "Directive"-like comments!
    Must remove all <!--[if...]>...<![endif]--> sequences. Which can be nested.
    Must remove all <![if...]>...<![endif]> sequences. Which appear to be the nested version.
    """
    if_endif_pat= re.compile( r"(\<!-*\[if .*?\]\>)|(<!\[endif\]-*\>)" )
    for m in if_endif_pat.finditer( page ):
        if "[if" in m.group(0):
            if start is not None:
        elif "[endif" in m.group(0):
            if len(context) == 0:
        if start is not None:
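The nesting logic can be sketched in full as follows. This is an assumed reconstruction, not the original function: it reuses the same pattern (minus the unneeded backslashes), but the name `strip_directives`, the depth counter, and the choice to raise `ValueError` on unbalanced markers are guesses at one reasonable way to do it.

```python
import re

# Same idea as the pattern above: group 1 is an opening [if ...] marker,
# group 2 is a closing [endif] marker, each with optional comment dashes.
if_endif_pat = re.compile(r"(<!-*\[if .*?\]>)|(<!\[endif\]-*>)")


def strip_directives(page):
    """Return page with every [if]...[endif] span removed, nested or not."""
    kept = []   # pieces of text that sit outside any directive
    depth = 0   # number of [if ...] markers currently open
    start = 0   # offset where the current run of kept text begins
    for m in if_endif_pat.finditer(page):
        if m.group(1) is not None:
            if depth == 0:
                # Entering the outermost directive: keep the text before it.
                kept.append(page[start:m.start()])
            depth += 1
        else:
            if depth == 0:
                raise ValueError("unmatched [endif] at offset %d" % m.start())
            depth -= 1
            if depth == 0:
                # Left the outermost directive: kept text resumes here.
                start = m.end()
    if depth != 0:
        raise ValueError("unclosed [if] directive")
    kept.append(page[start:])
    return "".join(kept)
```

For example, `strip_directives('a<!--[if gte mso 9]>x<![if !vml]>y<![endif]>z<![endif]-->b')` returns `'ab'`. A single non-nesting regular expression substitution cannot do this, because the first `[endif]` it meets belongs to the inner directive.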
- The "DBA as Bottleneck" problem. In short, the DBAs take projects hostage while the development team waits for stored procedures to be written, corrected, performance tuned or maintained.
- The "Data Cartel" problem. The DBAs own parts of the business process. They refuse (or complicate) changes to fundamental business rules for obscure database reasons.
- The "Unmaintainability" problem. The stored procedures (and triggers) have reached a level of confusion and complexity that means that it's easier to drop the application and install a new one.
- The "Doesn't Break the License" problem. For some reason, the interpreted and source-code nature of stored procedures makes them the first candidate for customization of purchased applications. Worse, the feeling is that doing so doesn't (or won't) impair the support agreements.