Say it Ain’t So Memcache

I will never claim that application profiling and stress testing are my strong points, but I’m having a really difficult time understanding the results of some tests I’ve been performing on my application.

Here’s the setup. My application is on two 256 slices: one running nginx and PHP through FastCGI, the other running MySQL. Outside of things like monit and munin, there is nothing else running on these slices. Perfect time to do some stress testing. The application is fairly database heavy, so I long ago decided to integrate memcache with an eye towards boosting performance. Or so I thought (notice ominous foreshadowing).

My strategery with memcache is to never assume that it is either running, working, or contains the data I need. So my app will use it if it’s there but will carry on unaffected if it’s not. I left hooks in the app to be able to shut off memcache through config changes for cases where I’m testing via XAMPP and don’t have memcache running locally. This turned out to be very useful.
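A minimal sketch of that "use it if it’s there" approach, assuming the PECL Memcache extension. The names CACHE_ENABLED, cache_connect() and cached_fetch() are made up for illustration and are not from my actual code:

```php
<?php
// Hypothetical config switch: set to false when no memcached is available
// (for example when testing locally under XAMPP).
define('CACHE_ENABLED', true);

/**
 * Returns a connected Memcache object, or null if caching is switched off,
 * the extension is not installed, or the server cannot be reached.
 */
function cache_connect($host = '127.0.0.1', $port = 11211)
{
    if (!CACHE_ENABLED || !class_exists('Memcache')) {
        return null;
    }
    $cache = new Memcache();
    // @ hides the warning on a failed connection; only the boolean result matters here.
    return @$cache->connect($host, $port) ? $cache : null;
}

/**
 * Tries the cache first and falls back to $loader (e.g. a database query).
 * $cache may be null, in which case the loader is always used.
 */
function cached_fetch($cache, $key, $loader, $ttl = 300)
{
    if ($cache !== null) {
        $value = $cache->get($key);
        if ($value !== false) {   // Memcache::get() returns false on a miss
            return $value;
        }
    }
    $value = $loader();
    if ($cache !== null) {
        $cache->set($key, $value, 0, $ttl);
    }
    return $value;
}
```

One caveat of this sketch: a legitimately cached value of false looks the same as a miss, so it would be reloaded every time.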

I have a third slice (which runs this blog and a couple of other smaller sites) that I installed http_load on. I used this box to drive the load tests.

One thing about http_load is that it doesn’t understand cookies. You provide it a URL or list of URLs and it just whacks on them until the server breaks. That poses a problem for apps like mine where being logged in is essential to the experience. So I had to make a few changes to the application to support a load testing mode. Once I change to this mode it will take the session identifier out of my config file instead of the cookie. No muss, no fuss, no meaningful change to the app’s behavior while in test mode, which is essential to ensure I’m comparing apples to apples.
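Roughly, the load testing mode boils down to something like the sketch below. The config keys and the cookie name are assumptions for illustration, not the real ones from my app:

```php
<?php
/**
 * Picks the session identifier for the current request.
 * In load-test mode it comes from config, because http_load sends no cookies;
 * otherwise it comes from the cookie as usual.
 * 'load_test_mode', 'load_test_session_id' and 'session_id' are hypothetical names.
 */
function resolve_session_id(array $config, array $cookies)
{
    if (!empty($config['load_test_mode'])) {
        return isset($config['load_test_session_id'])
            ? $config['load_test_session_id']
            : null;
    }
    return isset($cookies['session_id']) ? $cookies['session_id'] : null;
}
```

Everything downstream of this function sees a session identifier either way, which is what keeps the test-mode behavior comparable to normal traffic.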

OK, enough setup. Here’s one of my test scripts:

http_load -parallel 5 -seconds 30 test.url > test.out

So, run 5 parallel connections for 30 seconds. While that’s going on I’m checking top on my nginx and MySQL slices. First thing I notice - MySQL is pretty much sleeping through the test. Good news. Load on that slice barely breaks above .2. But the FastCGI processes on the nginx box launch to the top and hog CPU and memory at an alarming rate. Before the 30 seconds is over, load on the nginx box is over 3. Not good. End result was about 27 requests per second. Not horrible, but there’s no way the box could maintain that kind of load long term. Next I ran this test:

http_load -rate 20 -seconds 30 test.url > test.out

Which simulates 20 requests a second. So what I’m trying to do there is find a reasonable amount of traffic that will stress the server but not kill it. That seemed to be about the breaking point. The server handled 20 requests a second with some negative effects but it seemed it could be stable at that level.

So, had to do some thinking. In an effort to cheer myself up, I figured I’d disable memcache and see how bad it would be without its help. If I got between 20 and 30 with memcache surely I’d only get between 15 and 20 without it.

Well, guess what, nerd. Not so much. To my amazement, the result came in around 36 requests per second without memcache. Not only that, CPU consumption by the FastCGI processes was reduced and their memory consumption was totally normal. Beyond that, load on the database server didn’t budge.

It’s almost like memcache is penalizing me. Things got a little weirder when I commented out the code to pull some objects out of memcache but left others in. The results got down to 10 requests per second, which is nearly unbearable. I wish I had a conclusive summary to give, but right now my thinking is that connecting to memcache and hydrating objects costs more than just getting the data from the database. Or maybe I’m just overusing memcache - storing and retrieving too many small objects, for example.

So for the meantime I’m running without memcache, despite the hours and hours of work I put in to integrate it, and all the hopes and dreams of the children.

0 comments

Year 2009 Alert

Note to LAMP interviewees. It’s now 2009. Boasting that you wrote your own PHP framework is ridiculous and unimpressive. I’m sure you’re very clever but no, I will never agree to letting my project use 10K lines of unproven code that’s running your blog. Yea, we all can write (and have written) a database interface class. Still not interested. Sorry.

Also, it might be a good idea to have some understanding of what the term “unit testing” means.

0 comments

CodeIgniter Autoloading and Performance

Got some interesting results tonight from my adventures with xdebug and CodeIgniter, specifically with the autoloading feature.

I had run xdebug to collect stats on my app’s landing page, the page where all users will be redirected after login. I’d naturally expect this to be one of the most heavily visited pages, so it has to be as optimized as possible. After running the results of xdebug’s profiler ("xdebug.profiler_enable=On" in php.ini) through WinCacheGrind I found something like 300+ calls being made to the method CodeIgniter uses to load a library file/class. I had long suspected that liberal use of $CI->load->library('MY_Blah') wasn’t necessarily good practice, but I didn’t suspect it could have been that bad.

So I decided to put my most frequently loaded libraries into autoload.php and remove any calls to load them in my libraries, controllers, and views. The difference was noticeable, and a second pass through xdebug and WinCacheGrind proved the improvement was real. I tried not to go overboard by loading too many classes, and it seems like I was able to strike the right balance by autoloading fewer than ten of my dozens of classes.
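For reference, the change lives in CodeIgniter’s autoload config file (config/autoload.php under the application directory). The fragment below is only a sketch: 'database' and 'session' are stock CodeIgniter libraries, and the other two names are hypothetical stand-ins for my own classes.

```php
<?php
// Libraries listed here are loaded once per request by CodeIgniter,
// so individual $this->load->library() calls for them can be removed.
// 'my_includes' and 'my_user' are made-up names used for illustration.
$autoload['libraries'] = array(
    'database',
    'session',
    'my_includes',
    'my_user',
);
```

The trade-off is that every autoloaded class is paid for on every page, whether that page uses it or not, which is why I kept the list short.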

Another interesting result was integrating memcache to save some of the objects that are frequently loaded on the landing page. These objects are for the most part shared across all users on the site. For some reason after I integrated memcache the memory usage for the controller (according to CodeIgniter) went up to around 8MB from 2MB. Very weird results that I’m going to have to think about. Database load on the page is near nothing, which is good news. I’m assuming the problem is in copying the objects out of memcache and creating PHP objects out of them.

Guess I’ll be doing some more profiling.

0 comments

Scary Moments in Administering nginx

So I was trying to install xdebug on my Slicehost slice and I couldn’t get the damn module to load.

I was following these instructions - installed via PECL, added the line to php.ini, restarted the web server, etc. Nothing. Wasn’t showing up in either “php -m” or the output of phpinfo().

So then I decided to compile from source, using instructions on the same page. Now it actually got worse. I was getting a 502 error and this in the logs:

2009/06/04 16:50:46 [error] 4461#0: *1 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: XXX.XXX.XXX, server: myserver.com, request: "GET /index.php HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "myserver.com"

Begin freakout.

Nothing was working. Bounce web server. Nothing. Bounce slice. Nothing.

Continue freakout.

Don’t totally know why I decided to restart fastcgi, but sweet mother that worked. And not only that, xdebug was loading as expected.

sudo /etc/init.d/php-fastcgi restart

End freakout.

(To be more precise, I restarted nginx first and then restarted fastcgi.) Hope that helps someone out there. Certainly scared the tuna salad out of me for a good half hour. Oh the joys of system administration for developers.

0 comments

Virgin Servers and Nerd Pr0n

As weird as it sounds, for legal reasons I had to move my side project onto a server I have root access to. This posed some serious problems for me. I’ve developed tons of sites, and I’m no stranger to a command prompt, but a sysadmin I am not. Postfix? iptables? munin? monit? The extent of my exposure to the intricacies of system administration was whatever was available from within cPanel plus whatever I could change from my shell account. Admittedly, this was limited, if not comfortable.

After hemming and hawing and researching I settled on a VPS setup on Slicehost, which was recently purchased by Rackspace. I expected a painful transition to a self-managed server, and honestly it wasn’t all shits and giggles, but the experience was (and is)….Amazing. Liberating. Invigorating. Confidence building.

I got plenty of help. The Slicehost tutorial articles were an incredible resource. I could have barely done it without them. I also got some key help from A. DeRose (see profile here) an ex-coworker who had recently worked through moving the Tripology site to Slicehost. Even still, I learned an incredible amount about how to configure and run a server. It’s an experience I wish all developers could have at least once.

Almost on a whim I decided to use nginx as my web server instead of Apache. Nginx is stupid fast. I don’t really have anything against Apache, but I can appreciate how simple nginx is to install and configure. For basic web sites, it makes Apache seem like a big fat mouth-breathing mooch that won’t leave your apartment. I haven’t regretted that decision yet, and I don’t expect that I ever will.

I also heartily recommend Monit, Munin, and apticron. Between those three I feel that if something happens to the server that I need to know about, I’ll be the first to know. Lastly, I can recommend Pingdom as an external third-party service to make sure the server is responding.

The most exciting part of all of this is that after all these years of nerdom there’s still something to learn. It’s what geeks like us live for.

Update: Don’t know how I managed to forget this article for setting up Ubuntu on Slicehost. This article was pretty much my bible for 2 days. Whoever wrote that deserves a special place in nerd heaven as far as I’m concerned. Great stuff.

0 comments

PHP Frameworks and Scaling

This article covering a talk by Rasmus Lerdorf on the issue of scaling PHP and PHP Frameworks is a couple of months old but keeps popping up on Delicious. There’s a lot to disagree with there, especially if you read through the comments by other people who were in the audience for the talk, but the thing that stood out for me was the benchmarking covered in the “Hello World” section.

Please please please tell me the reporter misunderstood Rasmus’ point, as some have suggested in the comments section. I can only hope Rasmus (who, btw, is a vocal member of the Nike+ Running community that I work on at R/GA and has written the SlowGeek site to improve on some features he felt we were lacking) would not make a big deal out of such a naive example. So you’re telling me that “Hello World” printed out directly from a PHP file is orders of magnitude quicker than “Hello World” printed out through a PHP Framework? Amazing. I will immediately concede that if your intention is to write and scale a “Hello World” application CodeIgniter or other PHP frameworks are not for you.

It’s like saying that it takes longer to get to the corner store if you take your helicopter as opposed to walking. Sure, you have to suit up, start up the rotors, take off, find a good landing spot, step over the dead bodies, etc. But that’s not practical, is it? Neither is a “Hello World” example.

Look, I’m perfectly willing to accept that CodeIgniter will be slower than a hand-crafted framework when all is said and done. That alone is the reason I don’t use any ORM in my code. It can’t possibly be faster. But for something at my scale it makes configuration, development, and maintenance very easy. And in the wise words of the fine developers at Coding Horror, programmers are expensive and hardware is cheap. To a certain scale, you can just throw hardware at your problem.

That said, I’m sure Rasmus understands that and the reporter took him out of context. Right….?

4 comments

CodeIgniter…Meet Minify

As a followup to one of my previous posts I wanted to go through how I managed to get CodeIgniter and Minify to play nice with each other. Hopefully this will make someone else’s life easier. For those not using CodeIgniter this post might be either confusing or boring. Or both I guess.

My approach might seem code-heavy compared to other solutions, but it has the virtue of requiring only a small change to a single file that would be included by all pages on your site. That’s typically not a problem since the first thing I do when I’m working on a site is to break out the common elements such as the <html> and <head> tags to their own included header file.

In CodeIgniter I created a library called MY_Includes.php (/system/application/libraries/MY_Includes.php). This is the core class. It contains the mappings between each controller and the JavaScript and CSS files required by the view that controller will load. Obviously this implies an extra step. If I create a new JavaScript or CSS file I can’t go into the globally included header file and add a <script> or <link> tag there - I have to edit MY_Includes.php to map the JavaScript or CSS file to that particular view. Yea, it seems weird to edit a PHP file to add a CSS or JavaScript file, but there are a couple of different factors at work here and this solution made the most sense to me. The big win was that it helped integrate Minify into my codebase with minimal effort.

You can see an edited version of MY_Includes.php here. I wanted to walk through this code a bit to highlight the important parts, but hopefully it’s readable on its own.

First, you’ll notice the constructor requires the name of the controller that was invoked. I’ll show you how I get that later on, but essentially the whole class relies on that piece of information. My application is fairly linear in the sense that once I know the controller’s name I know (barring exception cases) which view will be invoked.

This in turn allows me to map controllers directly to JS and CSS files, which is why you’ll see the init method set up 2 hashes containing the JS and CSS files that I have access to, jsFilesHave and cssFilesHave. The key in the hash is a logical name I will use when adding the file to a view. This will improve readability and reduce errors and maintenance. The value in the hash is a string that specifies where the corresponding source file can be found. This is relative to the web root and is of a form that Minify understands. Whenever I create a new JS or CSS file I have to first add it to one of these hashes so that I can refer to it later in the file.

One other note on the init - I’m not sure if I needed to, but I found it easiest to break with the CodeIgniter way of doing things and issue a PHP include statement to tell the class where to find the Minify source in the below snippet from that method.

//from minify examples:
//Add the location of Minify's "lib" directory to the include_path.
ini_set('include_path', '/home/vdibart/minify/lib/.:' . ini_get('include_path'));
require 'Minify/Build.php';
require 'Minify.php';

After init, the constructor will call compileTags. This is the heart of the logic. You can see it populate the cssFilesNeed and jsFilesNeed hashes, first with the files that are common to all views and then the ones depending on which controller was invoked.
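To make the shape of the class concrete, here is a stripped-down sketch of the init and compileTags logic described above. It is not the real MY_Includes.php: the class name, file paths, logical keys and the jsFiles()/cssFiles() accessors are made up for illustration, and the hand-off to Minify is left out entirely.

```php
<?php
// Illustrative sketch only. Paths, keys and controller names are hypothetical.
class IncludesSketch
{
    private $jsFilesHave  = array();  // every JS file the site knows about
    private $cssFilesHave = array();  // every CSS file the site knows about
    private $jsFilesNeed  = array();  // JS files required by the current page
    private $cssFilesNeed = array();  // CSS files required by the current page

    public function __construct($pageName)
    {
        $this->init();
        $this->compileTags($pageName);
    }

    // Registers the available files: logical name => path relative to the web root.
    private function init()
    {
        $this->jsFilesHave = array(
            'common'   => '/js/common.js',
            'register' => '/js/register.js',
        );
        $this->cssFilesHave = array(
            'base'  => '/css/base.css',
            'forms' => '/css/forms.css',
        );
    }

    // Starts with the files common to all views, then adds the per-controller ones.
    private function compileTags($pageName)
    {
        $js  = array('common');
        $css = array('base');

        switch ($pageName) {
            case 'member/register':
                $js[]  = 'register';
                $css[] = 'forms';
                break;
        }

        foreach ($js as $name) {
            $this->jsFilesNeed[$name] = $this->jsFilesHave[$name];
        }
        foreach ($css as $name) {
            $this->cssFilesNeed[$name] = $this->cssFilesHave[$name];
        }
    }

    public function jsFiles()
    {
        return array_values($this->jsFilesNeed);
    }

    public function cssFiles()
    {
        return array_values($this->cssFilesNeed);
    }
}
```

Referring to files by logical name means a typo in compileTags fails loudly with an undefined index notice instead of silently producing a bad path.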

Determining which controller was invoked is fairly straightforward. The following code is at the top of my globally included header file:

//for globally included header file
//so we know which CSS or JS files to include
$pageName = $this->uri->segment(1, 0);
$pageName .= "/" . $this->uri->segment(2, "index");
$this->load->library('MY_Includes', $pageName);

So if the controller was “http://www.mysite.com/member/register”, this code will pass “member/register” to the constructor of my class. Later on in the same header file I have the following 2 lines, which will extract the appropriate CSS and JS links:

<!-- for globally included header file -->
<link rel="stylesheet" href="<?= $this->CI->my_includes->cssTag(); ?>" type="text/css" media="screen" />
<script src="<?= $this->CI->my_includes->jsTag(); ?>" type="text/javascript" charset="utf-8"></script>

Switching back to the source code of MY_Includes.php, you can see those 2 methods invoke Minify to build the included files and then return a URL that can be used to retrieve the files. There’s a little bit of work in each of those to make the URL look like something that CodeIgniter will work with. So once the PHP executes the above tags will look like this in the final source code for the page:

<link rel="stylesheet" href="http://www.mysite.com/includetag/css/member-register/1222014216" type="text/css" media="screen" />
<script src="http://www.mysite.com/includetag/js/member-register/1222098068" type="text/javascript" charset="utf-8"></script>

So each rendered page on my site has only 1 CSS file and 1 JS file included. And those files are minimized and cached. All of that is due to Minify. But you’ll notice there’s one piece of the puzzle still missing. The above <link> and <script> tags refer back to my site, and there has to be something that knows how to interpret that and return the appropriate CSS or JavaScript data. It turns out that “includetag” is a CodeIgniter controller that I created. I’ve included the source code here. There’s not a ton to mention here. The controller loads the exact same helper class, MY_Includes.php, which interfaces with Minify to retrieve the CSS or JS files and return them to the client.

Hopefully there’s enough to get you through to a working version. To summarize the steps:

  1. Download MY_Includes.php (here) and put it in your /system/application/libraries directory
  2. Edit the init method inside of MY_Includes.php to include the correct path to your Minify installation
  3. Edit the init method inside of MY_Includes.php to include your CSS and JS files
  4. Edit the compileTags method inside of MY_Includes.php to include the correct files for each controller
  5. Download includetag.php (here) and put it in your /system/application/controllers directory
  6. Add the two code fragments commented with “for globally included header file” above to the appropriate file in your application
  7. Fire it up
  8. Feel free to post a comment if you have troubles and I’ll walk you through it or edit the post to fix any errors as needed.

    13 comments

One Flag to Rule Them All

So right now I’m looking at a table that has at least 3 different columns that control whether the particular row is displayed on the front end. In some cases that’s unavoidable, but it has to be kept in check.

Maybe you can tell me what the difference is between the intent of these columns: status (e.g. pending, active, canceled) and should_display (0 or 1). In addition to that, there’s one part of the code that will ignore a record if one of the FK columns is null but will consider it if it’s non null.

This is madness. I now have to piece together which columns are significant to which consumers of the data. And then I have to figure out the magical combination of values to make the row appear on the front end. This leads me to some quick rules for database flags:

  • Limit the number of display flags to as few as possible. I usually use an is_active or display_order column to determine whether the row should be retrieved. There will be cases where the row should be retrieved by one consumer and not another, but there should never be more than one column that does almost the same thing.
  • Use descriptive column names. The ones above are too general. is_active tells me exactly what I need to know.
  • You can use a nullable timestamp column to do both boolean checks and date-triggered checks. In other words, if the column is null it means the row is still valid. If it’s not null you have to check it against the current timestamp. This saves a duplicated column and is fairly easy to get across.
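The nullable timestamp rule can be sketched like this. I’m assuming the column is an expiry time called expires_at (a made-up name) and that a row stops being valid once that time has passed:

```php
<?php
/**
 * Decides whether a row should be shown, based on a nullable timestamp column.
 * $expiresAt is the raw column value (hypothetical column name: expires_at),
 * either null or a 'YYYY-MM-DD HH:MM:SS' string. $now is a Unix timestamp.
 */
function row_is_active($expiresAt, $now)
{
    if ($expiresAt === null) {
        return true;                      // no expiry set: the row is still valid
    }
    return strtotime($expiresAt) > $now;  // valid only until the expiry passes
}
```

The same check can usually be pushed into the query itself with a WHERE clause along the lines of "expires_at IS NULL OR expires_at > NOW()", which keeps expired rows from ever reaching PHP.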
1 comments

In Praise of Minify

Having read High Performance Web Sites, I figured I’d take a little time out of the development of new features on my side project to look at some basic performance issues. The first stop was YSlow, the Firefox plugin that works with Firebug to give you a simple report on how you rate on the Yahoo! performance scale. Mine being a tiny site, the report before any optimizations was decent but not great. There was definitely room for improvement so I figured I’d put some of the advice I’ve read recently into practice.

The first optimization was very easy. I made sure my images were sufficiently cached by adding a quick .htaccess file in the directory where my images are stored on the server. I saw 2 different techniques for doing this. One was based on file extension, such as the technique discussed here. The second was based on the file’s content-type, which was discussed here. On the margin the one based on content-type seemed a safer bet. That way if I have a file that’s incorrectly named it will still get cached.

The next step was to try to improve my JavaScript and CSS includes. As mentioned in High Performance Web Sites, the files should be minimized in order to save bandwidth. They should have far future expires headers so that the browser doesn’t request them after the first visit. And the number of includes should be limited so that there are fewer requests that need to be made. Luckily someone much smarter than I already developed just about the perfect solution to all those issues and more. The Minify library for PHP is one of those pieces of code that does exactly what I was hoping it would do in exactly the way I was hoping it would. And to boot it required as little effort to integrate into my existing code base as could reasonably be expected. I recommend that anyone running even a small site on their own take a look at Minify. There’s absolutely no reason not to be using this wonderful little library. None. Go out right now and do it.

There was one snag in the process of integrating Minify with my project. As I’ve mentioned, I’m using the CodeIgniter framework. It turns out that Minify and CodeIgniter needed a little bit of coaxing to work together, but nothing that got too messy. I’m going to leave that discussion for my next post, which will hopefully not take 4 months to write :)

3 comments

Magic Button Syndrome

If there’s one concept I’ve fought my entire career it’s that there can be, or even should be, a way to make everything work “automagically”, a term the afflicted developers use lovingly. I recently christened this the “Magic Button Syndrome”.

Usually a bunch of fairly smart developers sit in a room and start dreaming of how a system might work. “We have to make sure we can easily modify the configuration,” one might say. “We should have a means to generate the configuration based on some other configuration file,” another might respond. “Let’s use annotations to make sure that the configurations stay in sync across versions,” someone else might suggest. Yet another person might think it’d be wonderfully cool if you could auto-inject annotations somehow.

Their triumphant moment comes when the CTO is standing over their shoulder screaming about something that needs to be fixed ASAP and they nonchalantly say, “Oh, I can fix that, one second.” They turn to their machines dramatically, edit one or two lines somewhere, smack the return key, twiddle their thumbs, reload the page, and then smile. “No big deal,” they’ll say with a smirk on their face. That’s it. That’s what they live for. They want that one Magic Button moment.

It sounds foolish, but there are plenty of developers like that out there. For these people, it makes perfect sense that if you can automate something little, automating something bigger containing tons of moving parts must be even better. Eventually the automation will reach singularity in the Magic Button.

The problem is that automation suffers from the same law of diminishing returns as does traveling at the speed of light. It takes an infinite amount of energy to accelerate a particle with any mass to the speed of light. In the same way, it takes an infinite amount of energy to create that Magic Button. Not that it stops people from trying. Sure, changing one of the thousands of options that are contained in a config file or database is easy. But if you’ve worked on systems like these you know that doing anything outside of the realm of what the system was designed to do is absolutely, unbearably painful. That Magic Button hides layers of abstraction upon abstraction upon abstraction. Just when you begin to understand what a piece of code does you realize you forgot what code is calling it. In the effort to make something of uber-value, no single component makes any sense.

You find it takes people months to really understand the system. Changes take weeks to test, and lead to repercussions that no one really ever expected. Once all the original developers are gone everyone starts to realize the system needs to be redesigned. It’s become like the pyramids - beautiful, absolutely brilliantly designed, but a total mystery. This time we’ll do it differently. In Ruby maybe. And auto generate all the documentation using XML…

0 comments
