CodeIgniter…Meet Minify

As a follow-up to one of my previous posts, I wanted to go through how I got CodeIgniter and Minify to play nice with each other. Hopefully this will make someone else’s life easier. For those not using CodeIgniter this post might be either confusing or boring. Or both, I guess.

My approach might seem code-heavy compared to other solutions, but it has the virtue of requiring only a small change to a single file that is included by every page on your site. That’s typically not a problem, since the first thing I do when I’m working on a site is break out the common elements such as the <html> and <head> tags into their own included header file.

In CodeIgniter I created a library called MY_Includes.php (/system/application/libraries/MY_Includes.php). This is the core class that contains the mappings between each controller and the JavaScript and CSS files required by the view that controller loads. Obviously this implies an extra step. If I create a new JavaScript or CSS file I can’t go into the globally included header file and add a <script> or <link> tag there - I have to edit MY_Includes.php to map the JavaScript or CSS file to that particular view. Yeah, it seems weird to edit a PHP file to add a CSS or JavaScript file, but there are a couple of different factors at work here and this solution made the most sense to me. The big win was that it helped integrate Minify into my codebase with minimal effort.

You can see an edited version of MY_Includes.php here. I wanted to walk through this code a bit to highlight the important parts, but hopefully it’s readable on its own.

First, you’ll notice the constructor requires the name of the controller that was invoked. I’ll show you how I get that later on, but essentially the whole class relies on that piece of information. My application is fairly linear in the sense that once I know the controller’s name I know (barring exception cases) which view will be invoked.

This in turn allows me to map controllers directly to JS and CSS files, which is why you’ll see the init method set up 2 hashes containing the JS and CSS files that I have access to, jsFilesHave and cssFilesHave. The key in the hash is a logical name I will use when adding the file to a view. This will improve readability and reduce errors and maintenance. The value in the hash is a string that specifies where the corresponding source file can be found. This is relative to the web root and is of a form that Minify understands. Whenever I create a new JS or CSS file I have to first add it to one of these hashes so that I can refer to it later in the file.

One other note on the init - I’m not sure if I needed to, but I found it easiest to break with the CodeIgniter way of doing things and issue a PHP include statement to tell the class where to find the Minify source in the below snippet from that method.

//from minify examples:
//Add the location of Minify's "lib" directory to the include_path.
ini_set('include_path', '/home/vdibart/minify/lib/.:' . ini_get('include_path'));
require 'Minify/Build.php';
require 'Minify.php';

After init, the constructor will call compileTags. This is the heart of the logic. You can see it populate the cssFilesNeed and jsFilesNeed hashes, first with the files that are common to all views and then the ones depending on which controller was invoked.
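To make the shape of that logic concrete, here is a minimal sketch of the idea in Python. It is not the PHP from MY_Includes.php: the names echo the hashes described above, but every file name, path, and controller entry is invented for illustration.

```python
# Hypothetical sketch of the mapping idea; not the real MY_Includes.php.

# Logical name -> source path relative to the web root (paths are invented).
JS_FILES_HAVE = {
    "jquery": "js/jquery.js",
    "site": "js/site.js",
    "register": "js/register.js",
}
CSS_FILES_HAVE = {
    "base": "css/base.css",
    "forms": "css/forms.css",
}

# Files every view needs, then the extras for each controller/method pair.
COMMON = {"js": ["jquery", "site"], "css": ["base"]}
PER_CONTROLLER = {
    "member/register": {"js": ["register"], "css": ["forms"]},
}


def compile_tags(page_name):
    """Return the JS and CSS source paths needed for one controller."""
    extras = PER_CONTROLLER.get(page_name, {})
    js_names = COMMON["js"] + extras.get("js", [])
    css_names = COMMON["css"] + extras.get("css", [])
    # Looking names up in the "have" hashes fails loudly (KeyError) on a typo.
    return {
        "js": [JS_FILES_HAVE[name] for name in js_names],
        "css": [CSS_FILES_HAVE[name] for name in css_names],
    }
```

Going through logical names means a file that moves on disk only needs to be changed in one place, and a controller with no entry of its own simply gets the common files.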

Determining which controller was invoked is fairly straightforward. The following code is at the top of my globally included header file:

//for globally included header file
//so we know which CSS or JS files to include
$pageName = $this->uri->segment(1, 0);
$pageName .= "/" . $this->uri->segment(2, "index");
$this->load->library('MY_Includes', $pageName);

So if the controller was “http://www.mysite.com/member/register”, this code will pass “member/register” to the constructor of my class. Later on in the same header file I have the following 2 lines, which will extract the appropriate CSS and JS links:

<!-- for globally included header file -->
<link rel="stylesheet" href="<?= $this->CI->my_includes->cssTag(); ?>" type="text/css" media="screen" />
<script src="<?= $this->CI->my_includes->jsTag(); ?>" type="text/javascript" charset="utf-8"></script>

Switching back to the source code of MY_Includes.php, you can see those 2 methods invoke Minify to build the included files and then return a URL that can be used to retrieve the files. There’s a little bit of work in each of those to make the URL look like something that CodeIgniter will work with. So once the PHP executes the above tags will look like this in the final source code for the page:

<link rel="stylesheet" href="http://www.mysite.com/includetag/css/member-register/1222014216" type="text/css" media="screen" />
<script src="http://www.mysite.com/includetag/js/member-register/1222098068" type="text/javascript" charset="utf-8"></script>
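As a rough illustration of how a URL of that shape can be put together, here is a small Python sketch. It is not the PHP in MY_Includes.php. Both function names are hypothetical, and the sketch assumes the trailing number is a Unix timestamp for when the bundle last changed; the tags above only show that a number is appended.

```python
def page_name_from_segments(segments):
    """Mimic the header snippet: segment 1, then segment 2 defaulting to "index"."""
    first = segments[0] if len(segments) > 0 else "0"  # like segment(1, 0)
    second = segments[1] if len(segments) > 1 else "index"  # like segment(2, "index")
    return f"{first}/{second}"


def include_url(base_url, kind, page_name, last_modified):
    """Build a URL like <base>/includetag/css/member-register/1222014216."""
    if kind not in ("css", "js"):
        raise ValueError(f"unsupported include type: {kind}")
    # A slash inside the page name would add extra URI segments, so flatten it.
    slug = page_name.replace("/", "-")
    return f"{base_url.rstrip('/')}/includetag/{kind}/{slug}/{int(last_modified)}"
```

Flattening "member/register" to "member-register" keeps the page name in a single URI segment, so the controller on the other end can read the type, the page, and the number from fixed positions.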

So each rendered page on my site has only 1 CSS file and 1 JS file included. And those files are minimized and cached. All of that is due to Minify. But you’ll notice there’s one piece of the puzzle still missing. The above <link> and <script> tags refer back to my site, and there has to be something that knows how to interpret those URLs and return the appropriate CSS or JavaScript data. It turns out that “includetag” is a CodeIgniter controller that I created. I’ve included the source code here. There’s not a ton to say about it: the controller loads the same MY_Includes.php library, which interfaces with Minify to retrieve the CSS or JS file and return it to the client.

Hopefully there’s enough to get you through to a working version. To summarize the steps:

  1. Download MY_Includes.php (here) and put it in your /system/application/libraries directory
  2. Edit the init method inside of MY_Includes.php to include the correct path to your Minify installation
  3. Edit the init method inside of MY_Includes.php to include your CSS and JS files
  4. Edit the compileTags method inside of MY_Includes.php to include the correct files for each controller
  5. Download includetag.php (here) and put it in your /system/application/controllers directory
  6. Add the two code fragments commented with “for globally included header file” above to the appropriate file in your application
  7. Fire it up
  8. Feel free to post a comment if you have troubles and I’ll walk you through it or edit the post to fix any errors as needed.


One Flag to Rule Them All

So right now I’m looking at a table that has at least 3 different columns that control whether the particular row is displayed on the front end. In some cases that’s unavoidable, but it has to be kept in check.

Maybe you can tell me what the difference is between the intent of these columns: status (e.g. pending, active, canceled) and should_display (0 or 1). In addition to that, there’s one part of the code that will ignore a record if one of the FK columns is null but will consider it if it’s non-null.

This is madness. I now have to piece together which columns are significant to which consumers of the data. And then I have to figure out the magical combination of values to make the row appear on the front end. This leads me to some quick rules for database flags:

  • Limit the number of display flags to as few as possible. I usually use an is_active or display_order column to determine whether the row should be retrieved. There will be cases where the row should be retrieved by one consumer and not another, but there should never be more than one column that does almost the same thing.
  • Use descriptive column names. The ones above are too general. is_active tells me exactly what I need to know.
  • You can use a nullable timestamp column to do both boolean checks and date-triggered checks. In other words, if the column is null it means the row is still valid. If it’s not null you have to check it against the current timestamp. This saves a duplicated column and is fairly easy to get across.
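
The nullable timestamp rule is probably easiest to see in code. This is a minimal Python sketch of the check; the column name expires_ts is hypothetical and stands in for whatever the nullable timestamp column is called.

```python
from datetime import datetime


def is_row_visible(expires_ts, now=None):
    """One nullable timestamp doubling as a flag and a date trigger.

    expires_ts stands in for a hypothetical nullable column:
    None (NULL) means the row is valid indefinitely; otherwise the row
    is valid only until that moment.
    """
    if expires_ts is None:
        return True
    if now is None:
        now = datetime.now()
    return now < expires_ts
```

In a query the same test is a single predicate: the column is null, or it is later than the current timestamp.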

In Praise of Minify

Having read High Performance Web Sites, I figured I’d take a little time out of the development of new features on my side project to look at some basic performance issues. The first stop was YSlow, the Firefox plugin that works with Firebug to give you a simple report on how you rate on the Yahoo! performance scale. Mine being a tiny site, the report before any optimizations was decent but not great. There was definitely room for improvement so I figured I’d put some of the advice I’ve read recently into practice.

The first optimization was very easy. I made sure my images were sufficiently cached by adding a quick .htaccess file in the directory where my images are stored on the server. I saw 2 different techniques for doing this. One was based on file extension, such as the technique discussed here. The second was based on the file’s content-type, which was discussed here. On the margin the one based on content-type seemed a safer bet. That way if I have a file that’s incorrectly named it will still get cached.

The next step was to try to improve my JavaScript and CSS includes. As mentioned in High Performance Web Sites, the files should be minimized in order to save bandwidth. They should have far future expires headers so that the browser doesn’t request them after the first visit. And the number of includes should be limited so that there are fewer requests that need to be made. Luckily someone much smarter than I already developed just about the perfect solution to all those issues and more. The Minify library for PHP is one of those pieces of code that does exactly what I was hoping it would do in exactly the way I was hoping it would. And to boot it required as little effort to integrate into my existing code base as could reasonably be expected. I recommend that anyone running even a small site on their own take a look at Minify. There’s absolutely no reason not to be using this wonderful little library. None. Go out right now and do it.

There was one snag in the process of integrating Minify with my project. As I’ve mentioned, I’m using the CodeIgniter framework. It turns out that Minify and CodeIgniter needed a little bit of coaxing to work together, but nothing that got too messy. I’m going to leave that discussion for my next post, which will hopefully not take 4 months to write :)


Magic Button Syndrome

If there’s one concept I’ve fought my entire career it’s that there can be, or even should be, a way to make everything work “automagically”, a term the afflicted developers use lovingly. I recently christened this the “Magic Button Syndrome”.

Usually a bunch of fairly smart developers sit in a room and start dreaming of how a system might work. “We have to make sure we can easily modify the configuration,” one might say. “We should have a means to generate the configuration based on some other configuration file,” another might respond. “Let’s use annotations to make sure that the configurations stay in sync across versions,” someone else might suggest. Yet another person might think it’d be wonderfully cool if you could auto-inject annotations somehow.

Their triumphant moment comes when the CTO is standing over their shoulder screaming about something that needs to be fixed ASAP and they nonchalantly say, “Oh, I can fix that, one second.” They turn to their machines dramatically, edit one or two lines somewhere, smack the return key, twiddle their thumbs, reload the page, and then smile. “No big deal,” they’ll say with a smirk on their face. That’s it. That’s what they live for. They want that one Magic Button moment.

It sounds foolish, but there are plenty of developers like that out there. For these people, it makes perfect sense that if you can automate something little, automating something bigger containing tons of moving parts must be even better. Eventually the automation will reach singularity in the Magic Button.

The problem is that automation suffers from the same law of diminishing returns as does traveling at the speed of light. It takes an infinite amount of energy to accelerate a particle with any mass to the speed of light. In the same way, it takes an infinite amount of energy to create that Magic Button. Not that it stops people from trying. Sure, changing one of the thousands of options that are contained in a config file or database is easy. But if you’ve worked on systems like these you know that doing anything outside of the realm of what the system was designed to do is absolutely, unbearably painful. That Magic Button hides layers of abstraction upon abstraction upon abstraction. Just when you begin to understand what a piece of code does you realize you forgot what code is calling it. In the effort to make something of uber-value, no single component makes any sense.

You find it takes people months to really understand the system. Changes take weeks to test, and lead to repercussions that no one really ever expected. Once all the original developers are gone everyone starts to realize the system needs to be redesigned. It’s become like the pyramids - beautiful, absolutely brilliantly designed, but a total mystery. This time we’ll do it differently. In Ruby maybe. And auto generate all the documentation using XML…


Wherein I Question the Usefulness of MVC

I decided to use CodeIgniter for a PHP project that I’m working on. CodeIgniter is an MVC framework, not too unlike CakePHP. At least I imagine they’re very similar, but I can’t say for sure as the reason I chose CodeIgniter over CakePHP was that the CakePHP documentation is a mess and I didn’t have time to wade through it. CodeIgniter has been fairly easy to work with so far. I’m sure there are tons of CodeIgniter reviews by developers like me out there, so I won’t bore you with that just yet (future post!).

This post is about Model-View-Controller (MVC) architecture. Like any developer, I’ve read countless retellings of why patterns and MVC are good for your code. True to form, I think those claims are overblown. I’ve worked with people that do everything “By the Book” and I’ve worked with people that hack everything together as best they can. Seeing both sides of it I honestly can’t say that one made my life any better than the other. Unstructured code, if kept reined in to some degree, can be incredibly flexible and allow you to be agile in the face of rapidly-changing priorities.

For instance, I’m not above having SQL statements in a JSP file. I don’t love it. I try to avoid it if it’s going to get messy. But I don’t think it’s something to be embarrassed about. I can’t tell you how many times I’ve been able to move a change out in minutes rather than weeks because I was able to tweak a query in the JSP. No, it’s not “By the Book”. But it works, and in the end that’s what you get paid for.

My general rule of thumb is that the closer to the end user your code is the more flexible it has to be. Consider the following range of technologies that flow from the user end to the server side: HTML/CSS, JavaScript, PHP/Java/Ruby, PL/SQL, database schema. HTML needs to be more flexible than Java, which needs to be more flexible than the database schema. So for every 1000 times you tweak your HTML or CSS, you might need to make a couple of changes to your backend Java. Sounds reasonable.

So coming back to MVC, one thing I’ve never understood is why the controller is responsible for selecting which view is invoked. This seems fundamentally flawed to me. In a language like Java the controller is a servlet compiled into a jar file somewhere. To change the behavior of that file you have to go through an entire release process: change code, test, promote to QA, test, promote to production, test. At MLB, a change like that took about 2 weeks from start to finish. (Obviously the situation is a little different if you use PHP, which is why I’ve decided to use an MVC framework for the PHP project).

In essence, it’s like the backend developers are saying “Move aside HTML, let the big boys make the call. We know better which file should be displayed”. You know what, they don’t and they shouldn’t. Yes, I know about Front Controllers. Yawn. Yes, I know you could easily write the system such that the flow through the views is configured using XML so it can be changed on the fly, as they did at MLB. Snore. Don’t get me started on XML for configuration. These are all solutions in search of a problem. These things can be done, but no one has really ever convinced me that they need to be done. Agility requires simplicity. Simplicity can’t be configured with XML.


Where Go Older Developers?

It seems like a long time since I was the youngest guy in the office. Those days were pretty golden. The young turk is the one that everyone lives vicariously through. I remember recounting stupid drinking stories to a cube-full of older developers eager to relive their own youthful transgressions. The young turk can make mistakes that older guys would get reamed for. And he can get away with unsavory behavior because his unofficial responsibilities include comic relief. The young turk has it pretty good.

These days, I’m almost always one of the oldest guys on the team. At R/GA there are some team members about my age, give or take a year, but most of them are between 4 and 10 years younger. Of course, no one would be stupid enough to be caught arguing that age should imply much in terms of ability. The point is that I’ve spent a lot of time in the past couple of years wondering where all the old developers go. At R/GA there can’t be more than 5 developers (including the front-end guys) 35 and older. This is in a company with probably 150 or more tech people. At MLB I’d bet most of the tech group was under 32. With the exception of the CTO, I can’t think of anyone who would have been older than about 38. This is out of about 100 tech people. I have this fear that I’m going to just start melting away when I turn 38. Or worse - be forced to move into human resources. I mean, I know coding is a young man’s game, but this is absurd.

I suppose some become middle management, but since an organization might only need 1 or 2 managers for every 10 developers there are fewer of these to be found too. Lately I’ve been wondering if the older folk need something more stable and find themselves moving to investment banks. I can’t really say because I’ve avoided those kinds of jobs my whole career. Maybe they freelance, or start their own companies. Maybe they get fed up and go teach. Damned if I know. All I know is that it’s starting to freak me out a bit. How much longer until the developer police show up at my door and take me out to the shooting range? Come to think of it I’ve been hearing voices coming from my O’Reilly books - what if they’re really made out of ground-up developer parts? It’s starting to make a certain bit of sense.


The request_token Pattern

The idea behind the request token is another one of those simple-but-powerful patterns that I’ve come to rely on in various systems. I’ll jump right into an example of a case where I wanted to use it but alas I didn’t get to make the change before I left the job.

The architecture was a simple producer-consumer model. Some piece of the system was responsible for placing a row into a table and another was responsible for finding those rows and processing them. As it turns out, the system required many more consumers than producers, which I realize is not all that uncommon.

(Before you go screaming at me about “enterprise” solutions like Oracle’s Advanced Queueing or JMS, that’s not entirely the point. It’s incidental that this situation looks like a producer-consumer problem, but this pattern is more generally useful. So bear with me and think about how to apply it elsewhere.)

So, applying it to an email system where one piece of the system generates the emails and dumps them into a table and another piece of the system takes them out and sends them, you might have a table that looks like this:


CREATE TABLE email_jobs
(id NUMBER NOT NULL
,email_to VARCHAR2(255) NOT NULL
,email_subject VARCHAR2(255) NOT NULL
,email_body VARCHAR2(255) NOT NULL
,insert_ts DATE DEFAULT SYSDATE NOT NULL
,update_ts DATE
,processed_ts DATE -- stays NULL until a consumer has processed the row
);

You can imagine the consumer might wake up, ask for the oldest 10 items in the table, send them off in batch, and then go back to sleep. As you might expect, I had a recurring problem where 2 consumers were both attempting to pull the same item from the table and process it. In the above case, a bug like that might lead to the person getting 2 identical emails, which no one wants. There are ways to protect against these kinds of things at the level, but in reality you just want to ensure that no 2 consumers get the same item.

Enter the request token. With this, each consumer produces a unique identifier and marks the rows that it wants with that value. It then requests only the rows with that token, making it virtually impossible to have the same row processed by 2 different consumers.


CREATE TABLE email_jobs
(id NUMBER NOT NULL
,email_to VARCHAR2(255) NOT NULL
,email_subject VARCHAR2(255) NOT NULL
,email_body VARCHAR2(255) NOT NULL
,request_token VARCHAR2(255)
,insert_ts DATE DEFAULT SYSDATE NOT NULL
,update_ts DATE
,processed_ts DATE -- stays NULL until a consumer has processed the row
);

Notice the addition of the request_token column. On the application side:


//produces a unique number
$token = generate_token();

//mark some rows with the token - only where the request_token is already null - important!
UPDATE email_jobs SET request_token = $token WHERE <….find oldest rows…> AND request_token IS NULL;

//do this so other consumers won't see these rows
COMMIT;

//go back and find the ones that you marked
SELECT ej.id FROM email_jobs ej WHERE ej.request_token = $token;

Even if more than one process hits that table at the same time, the null check means a row that one consumer has already marked and committed is skipped by the others, so each row ends up with exactly one token. Therefore, unless your application is sensitive to the number of rows each consumer processes, this is completely safe in that it won’t lead to multiple consumers processing the same row.
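
For anyone who wants to poke at the marking step, here is a runnable sketch of the pseudocode above in Python with SQLite. The table is a cut-down stand-in for email_jobs, "oldest" is approximated by the lowest id, uuid4 stands in for generate_token(), and the helper names are hypothetical. It runs the consumers one after another, so it shows the mark-then-select flow and says nothing about how Oracle behaves under real concurrency.

```python
import sqlite3
import uuid


def make_queue(conn, job_count):
    """Create a cut-down email_jobs table and fill it with pending jobs."""
    conn.execute(
        "CREATE TABLE email_jobs ("
        " id INTEGER PRIMARY KEY,"
        " email_to TEXT NOT NULL,"
        " request_token TEXT)"
    )
    conn.executemany(
        "INSERT INTO email_jobs (email_to) VALUES (?)",
        [(f"user{n}@example.com",) for n in range(job_count)],
    )
    conn.commit()


def claim_batch(conn, batch_size=10):
    """Mark up to batch_size unclaimed rows with a fresh token, then read them back."""
    token = uuid.uuid4().hex  # stand-in for generate_token()
    with conn:  # the connection context manager commits on success
        conn.execute(
            "UPDATE email_jobs SET request_token = ?"
            " WHERE request_token IS NULL"
            " AND id IN (SELECT id FROM email_jobs"
            "            WHERE request_token IS NULL"
            "            ORDER BY id LIMIT ?)",
            (token, batch_size),
        )
    rows = conn.execute(
        "SELECT id FROM email_jobs WHERE request_token = ? ORDER BY id",
        (token,),
    ).fetchall()
    return token, [row[0] for row in rows]
```

Calling claim_batch repeatedly against a table of 25 jobs hands out batches of 10, 10, and 5 ids with no overlap, and then an empty batch once nothing is left unmarked.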

In general, the request token pattern pre-marks some data so that it’s easy to find later on. Another example that I’ve used in the past is in account creation. What frequently happens is that you have to insert a row and then update it soon after. The problem is that the insert generates a new unique ID that the update needs to know, but sometimes doesn’t. My solution has been to pass a request token to the code that does the insert and then pass that same value to the code that does the update. As long as the request token is unique they should both be able to address the correct row.

At this point you might have the idea to create the request_token column with a UNIQUE constraint so that no two rows can have the same value. Not so fast. In an even more useful case, there have been times when I’ve had to create a bunch of rows and then manipulate them in bulk. So, for instance, create a bunch of new accounts and set their email address to the same value. Without a column like the request_token, you’d potentially have nothing to group them by except for an insert_ts or similar column. With the request_token, it becomes a very easy thing to do.
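
Here is a small sketch of that bulk case in Python with SQLite. The accounts table, its columns, and the function names are all hypothetical; the point is only that the update addresses rows by token and never needs the generated ids.

```python
import sqlite3
import uuid


def make_accounts_table(conn):
    """Hypothetical accounts table for the sketch."""
    conn.execute(
        "CREATE TABLE accounts ("
        " id INTEGER PRIMARY KEY,"
        " username TEXT NOT NULL,"
        " email TEXT,"
        " request_token TEXT)"  # deliberately not UNIQUE
    )


def create_accounts(conn, usernames):
    """Insert several accounts that all share one request token."""
    token = uuid.uuid4().hex
    with conn:
        conn.executemany(
            "INSERT INTO accounts (username, request_token) VALUES (?, ?)",
            [(name, token) for name in usernames],
        )
    return token


def set_email_for_token(conn, token, email):
    """Update every row from one request without knowing any generated id."""
    with conn:
        cursor = conn.execute(
            "UPDATE accounts SET email = ? WHERE request_token = ?",
            (email, token),
        )
    return cursor.rowcount
```

The single-row account creation case is the same code with a list of one username, which is why a UNIQUE constraint on the column would get in the way.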


Be A Data Integrity Watchdog

Funny thing happens when you start to put data into a database. It becomes important. At one point it might have seemed like a nice idea to save the visitor’s IP address. Slowly, as the system evolves, little branches of code pop up around the fact that the IP address is populated. Suddenly you find yourself in a position where you have to protect that piece of data. You can’t sit by idly and let that improperly formatted IP address bring down the whole system. You have to guard your system against these intrusions.

And the intrusions will happen. I’ve designed a number of large systems, and the only common denominator is that somehow, at one point or another, at least some of the data will get corrupted. Transactions fail, databases crash, bugs show up in the margins, users enter in stupid information, or hackers attack. I had been spending time thinking about this issue at The Sporting News, but it wasn’t until MLB that it really congealed into something useful.

In the fantasy baseball domain, a typical roster transaction leads to the addition or removal of a player from a particular manager’s roster of players. If the player is added to manager P’s roster he should not be available for any other manager in that league. If he’s removed from P’s roster he should be available to all other managers (including P). A player can’t be on more than one manager’s roster at the same time.

It turns out that every once in a while something hiccups and one of these rules is violated. Over the years I learned that the single most important step to fixing the problem is to make sure it doesn’t get worse. So, for instance, imagine a manager attempts to drop a player from his roster but something goes wrong. The system shows that the player is still on the roster, but he’s also technically available to others. Now that the data is corrupt it’s crucial that the system not allow the player’s status to be modified any further. It can very quickly become an impossible problem to solve if the player is picked up by another manager, then traded to another team, then dropped, etc.

I’ve spent many, many hours fixing transactions by hand. I worked on fantasy applications for over 7 years continuously, and in that time I can’t remember a single year where I didn’t have to fix at least some transactions by hand. Let me tell you, it suuuuucks. It really suuuuucks. Sucks and blows.

With that in mind, I developed a scheme where, instead of waiting to hear via the message boards that some player is on two teams, I take matters into my own hands. I designed the system to check each player involved in a transaction for corruption immediately after the transaction is committed. If I detect that one of the fundamental rules was broken (e.g. the player is owned by more than one manager), the player is immediately frozen. No further transactions on that player are allowed until an administrator can come in and fix the issue.
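
The check-and-freeze step might look something like the following Python and SQLite sketch. The real schema isn't shown in this post, so the roster and frozen_players tables and all three function names are invented, and only the "owned by more than one manager" rule is checked.

```python
import sqlite3


def make_schema(conn):
    """Hypothetical cut-down tables for the sketch."""
    conn.execute(
        "CREATE TABLE roster (league_id INTEGER, manager_id INTEGER, player_id INTEGER)"
    )
    conn.execute(
        "CREATE TABLE frozen_players ("
        " league_id INTEGER, player_id INTEGER,"
        " PRIMARY KEY (league_id, player_id))"
    )


def freeze_corrupt_players(conn, league_id, player_ids):
    """Run right after a roster transaction commits; returns the players frozen."""
    frozen = []
    with conn:
        for player_id in player_ids:
            (owners,) = conn.execute(
                "SELECT COUNT(DISTINCT manager_id) FROM roster"
                " WHERE league_id = ? AND player_id = ?",
                (league_id, player_id),
            ).fetchone()
            if owners > 1:  # rule broken: owned by more than one manager
                conn.execute(
                    "INSERT OR IGNORE INTO frozen_players VALUES (?, ?)",
                    (league_id, player_id),
                )
                frozen.append(player_id)
    return frozen


def is_frozen(conn, league_id, player_id):
    """Transaction code would check this first and refuse to touch frozen players."""
    row = conn.execute(
        "SELECT 1 FROM frozen_players WHERE league_id = ? AND player_id = ?",
        (league_id, player_id),
    ).fetchone()
    return row is not None
```

The check only looks at the players involved in the transaction that just committed, which is what keeps the incremental cost small.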

So for a small incremental cost I’ve bought myself some peace of mind. And I can absolutely tell you that it paid off, time and time again. It just took a different way of thinking about the problem - being proactive versus reactive, protecting the integrity of all that data.


What’s So Great about PL/SQL

I thought I’d start a loose series on PL/SQL for server-side developers. As a developer who has had to defend my use of PL/SQL in various systems over the years, I have some pretty strong feelings about what it brings to the table. I think of PL/SQL as a first-class language. That’s not to say that it can be used wherever Java or PHP are. What I love most about PL/SQL is that it fills some major gaps that Java and PHP (and most other traditional programming languages) have. When it comes to manipulating the database, anything Java can do PL/SQL can do better. In the context of the modern database application, that means that PL/SQL is an essential piece of any system.

Server-side developers in general have some serious hangups about PL/SQL. For one, it looks weird. What? No braces?!?!? Impossible!

Look, it’s not Java. Heck, it’s not even PHP. PL/SQL is its own beast, and you have to learn how to pet that beast so it doesn’t turn on you (and take your database down with it). If you think of PL/SQL as a simple means to tie some logic around DML operations (select/insert/update/delete) it begins to make a lot more sense. It’s not supposed to be elegant. It’s not supposed to require hours and hours poring over thick books with fancy titles. It’s supposed to help you build better database applications, and at this I believe it excels.

So what’s so great about PL/SQL? Here are my canned responses to that question whenever some upstart developer starts spewing the crap he read out of his textbooks:

  1. PL/SQL is compiled in the database. It always amuses me that a community like Java, which lives and dies by strong compile-time typing, is perfectly willing to let a major component of their application be loosely typed. You know all those JDBC calls/Hibernate mappings/iBatis queries? Little news for you Java dude. They’re completely unchecked. Put in terms you might understand - when you enter in a period, there’s no code assist to help you figure out how to complete the query. If I go in and modify the database in a few discrete ways your app will crash and burn. And you probably won’t realize this until a user sends a nasty email about why they can’t access the product they purchased. Not the case with PL/SQL. Since Oracle keeps them compiled in the database, you (or more likely the DBA) will know immediately if something changes in such a way that breaks the procedure or package.
  2. Since they’re compiled in the database they will run orders of magnitude faster than the corresponding queries requested by a client application. The important concept here is called context switching. In short, it turns out all those trips back and forth to the database tend to slow things down. It’s much much quicker to bundle up related queries in a procedure and make one call to the procedure. I once had an argument with a Java developer about result set sorting. He was convinced that it was much faster to sort a list of objects in Java than it would be to have the database do the ORDER BY and return the results. I like the guy, but that’s just insane. The overhead of fetching each of those rows and then doing some lame bubble sort on them is astronomical. But this is the kind of thinking that infests the server side community. It’s borne out of ignorance, sometimes willful, of what a database can do.
  3. Another benefit of being in the database - they can be used by any client, not just ones written in Java (or PHP, etc.). When they talk about code reuse Java developers apparently don’t consider these kinds of issues. I’m sure it’s a wonderful learning experience to write a shipping cost calculator in Java, PHP, and JavaScript, but wouldn’t it make more sense to write it in PL/SQL once and then use it everywhere? Just a thought.
  4. Believe it or not, most of the good DBAs I’ve worked with prefer complex logic to be wrapped up somewhere they can keep an eye on it. Remember, if something breaks at 3am they’re the ones who will get paged. Having all that business logic tucked away in a jar file somewhere makes them nervous. And when things do go bad they can help a lot more when the code is in the database. It’s better for everyone.
  5. It takes about 15 minutes to learn enough PL/SQL to export some logic to the database. Sure, PL/SQL goes deeper than that, but any curly-brace type programmer should be able to absorb the concepts easily.
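
To make points 1, 3 and 5 a little more concrete, here’s a rough sketch of the shipping cost idea. The shipping_rates table, the shipping_cost function and the rates themselves are all made up for illustration, and the trailing slash assumes you’re running this from SQL*Plus or a similar client.

```sql
-- Hypothetical rate table; names and numbers are illustrative only.
CREATE TABLE shipping_rates (
    service_level VARCHAR2(10) PRIMARY KEY,   -- e.g. 'STANDARD', 'EXPRESS'
    base_fee      NUMBER(8,2)  NOT NULL,
    fee_per_kg    NUMBER(8,2)  NOT NULL
);

INSERT INTO shipping_rates VALUES ('STANDARD', 5.00, 1.50);
INSERT INTO shipping_rates VALUES ('EXPRESS', 10.00, 3.00);
COMMIT;

CREATE OR REPLACE FUNCTION shipping_cost (
    p_service_level IN shipping_rates.service_level%TYPE,
    p_weight_kg     IN NUMBER
) RETURN NUMBER
IS
    -- %TYPE ties these variables to the table's column definitions,
    -- so the compiler checks them against the real schema.
    l_base_fee   shipping_rates.base_fee%TYPE;
    l_fee_per_kg shipping_rates.fee_per_kg%TYPE;
BEGIN
    SELECT base_fee, fee_per_kg
      INTO l_base_fee, l_fee_per_kg
      FROM shipping_rates
     WHERE service_level = p_service_level;

    RETURN ROUND(l_base_fee + l_fee_per_kg * p_weight_kg, 2);
EXCEPTION
    WHEN NO_DATA_FOUND THEN
        RAISE_APPLICATION_ERROR(
            -20001,
            'Unknown service level: ' || p_service_level);
END shipping_cost;
/

-- Any client that can run SQL can call it, whatever language it is written in.
SELECT shipping_cost('EXPRESS', 10) AS cost FROM dual;

-- If someone later drops or renames a column the function depends on,
-- Oracle marks the function INVALID, and this dictionary query shows it.
SELECT object_name, object_type, status
  FROM user_objects
 WHERE status = 'INVALID';
```

The last query is the part a DBA would care about: the breakage shows up in the data dictionary as soon as the schema changes, not when a customer hits the broken code path.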

Now, I’m not going to say it’s all win-win. Moving business logic into the database has a dramatic effect on system design. You’ll find a lot less justification for something like Hibernate, for instance (ok, maybe that’s a win). I’ve been through this a couple of times, and I know it’s hard to find the appropriate place to draw the line in terms of what gets moved into the database. Should you go balls to the wall and have the database return cursors for select statements? I usually don’t, but I have in some instances. Should every insert/update/delete be wrapped in a stored proc? Again, not an easy call.

In my most recent fantasy baseball app, I let the client code only insert into temporary tables, and then called a stored proc to validate the data and move it into the destination table. People look at me like I’m crazy when I tell them about this. But you know what? I’d do it again. If you buy the premise that inserts are dangerous because Java code can’t type-check them, then it’s the right way to do it. Temporary tables and stored procedures are much easier to change than Java code at most “serious” companies. It’s a matter of necessity to do it that way.
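
This isn’t the code from that app, but a minimal sketch of the pattern might look something like the following. Every name in it (roster_moves, roster_moves_stage, apply_roster_moves, app_user) is invented for illustration, and the validation rules are deliberately simplistic.

```sql
-- Destination table: strict constraints, never touched directly by clients.
CREATE TABLE roster_moves (
    team_id   NUMBER       NOT NULL,
    player_id NUMBER       NOT NULL,
    move_type VARCHAR2(10) NOT NULL,   -- 'ADD' or 'DROP'
    moved_on  DATE         NOT NULL
);

-- Staging table: same shape, no constraints. Rows are private to the
-- session and disappear at commit.
CREATE GLOBAL TEMPORARY TABLE roster_moves_stage (
    team_id   NUMBER,
    player_id NUMBER,
    move_type VARCHAR2(10),
    moved_on  DATE
) ON COMMIT DELETE ROWS;

CREATE OR REPLACE PROCEDURE apply_roster_moves
IS
    l_bad_rows PLS_INTEGER;
BEGIN
    -- Validate the whole batch before anything is moved.
    SELECT COUNT(*)
      INTO l_bad_rows
      FROM roster_moves_stage
     WHERE team_id   IS NULL
        OR player_id IS NULL
        OR moved_on  IS NULL
        OR move_type IS NULL
        OR move_type NOT IN ('ADD', 'DROP');

    IF l_bad_rows > 0 THEN
        RAISE_APPLICATION_ERROR(
            -20001,
            l_bad_rows || ' staged roster move(s) failed validation');
    END IF;

    -- Everything checked out: move the batch into the destination table.
    INSERT INTO roster_moves (team_id, player_id, move_type, moved_on)
    SELECT team_id, player_id, move_type, moved_on
      FROM roster_moves_stage;

    -- Clear the staging rows so the procedure can be called again
    -- in the same transaction.
    DELETE FROM roster_moves_stage;
END apply_roster_moves;
/

-- The application account (hypothetical) gets exactly two privileges.
GRANT INSERT  ON roster_moves_stage TO app_user;
GRANT EXECUTE ON apply_roster_moves TO app_user;

-- What the client code does (shown unqualified; app_user would need
-- synonyms or the owning schema's prefix on both names):
INSERT INTO roster_moves_stage (team_id, player_id, move_type, moved_on)
VALUES (1, 42, 'ADD', SYSDATE);

BEGIN
    apply_roster_moves;
END;
/
COMMIT;
```

Because the procedure runs with its owner’s rights by default, the client never needs insert rights on roster_moves itself. Changing a validation rule means recompiling one procedure, not redeploying the application.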

Hopefully I’ve covered the “whys” of PL/SQL convincingly enough. In a future post I’ll cover some basics of the “hows”.

Wipe Your Feet Before You Come Into My House

My code is my house. I spend a lot of time in it. I fix it up, take care of it lovingly. I indent appropriately and actually spend time spacing out sections so they’re pleasing to the eye. I do this only partly because I’m obsessive compulsive. My greater motivation is that I really feel that these things matter.

Think about the word “code” for a minute. I love the word. I am a coder. I write code. What is code? Code is something that means something to the person that writes it, means something to some people/machines that read it, but means nothing to people who don’t know how to read it. Code is inherently cryptic. So the act of writing code is a struggle against entropy. Over time the code’s intent will change, its implementation will be less clear, or its documentation will drift out of sync with the actual representation.

As when you move into a house, code will never be as nice as it is on day one. Something breaks and you have to fix it quickly, leaving a hole in the wall. People come to visit and leave their shit around. Perfect code never stays perfect. So it’s critical that on day one the code is as clean and clear as it can be. And you should expect to do periodic improvements to keep entropy at bay.

Speaking practically, this implies a number of things. First and foremost, formatting matters. Spacing matters. These things help someone else determine the intention of the code you are writing. Related sections should be grouped together with spacing so someone reading knows what can be moved around and what should stay together. The goal is to make the code as pleasing to someone else’s eye as possible. We all know you are very clever, but a single line that chains together 50 method calls is impossible to decipher. Break it up and I’ll respect you more because you did it for me, not for you.

Everyone has a favorite format. The religious wars about curly braces probably consume half the storage space on Slashdot’s servers. I’m not entirely above it - I’m infamous for reformatting code when I take control of it. But if I’m just visiting someone else’s code I have a strict policy that the code I write should be indistinguishable (as much as possible) from theirs. This means formatting it the way they do. Using the same naming conventions. Following their capitalization scheme. The point isn’t to show others how superior my formatting is. It’s to make sure that someone else reading the code doesn’t have an aneurysm.
