Archive for the ‘php’ Category

Twitter flies off the rails

Thursday, May 1st, 2008

According to noted internet pundit Arrington, Twitter has made the call to rewrite and move off Ruby on Rails. Not really a shock: Rails is fundamentally not going to scale up past a certain point.

The uncharitable part of me (and the part that is confident nobody reads this blog) thinks that the vast majority of Rails developers have never been around the kind of scale we’re talking about when we say Rails doesn’t scale. Especially when I see something like this:
Image from a slide show: Scaling rails is easy

I love that items 2, 4 and 5 are REALLY REALLY hard.

More likely, though, those developers haven’t faced an app that is particularly hard to scale. I think the issue is less “does my framework scale” and more “is my application going to be hard to scale”.

Twitter is a great example of the kind of app that is very hard to scale, regardless of the platform. Rails is probably not the culprit here, but I doubt it is helping at all. In fact, at this point I would suspect Twitter will do what everyone ends up doing when this happens: a PHP frontend talking to a web service driven by Java (or C). Because it turns out when the problems get real hard you don’t want a scripting language to handle them.

Rails isn’t bad, but it isn’t “easy” to scale, except in cases where the application is easy to scale (content). Rails is designed by small, custom app developers for small developers. It provides one major benefit: it makes it faster to develop a certain class of applications. That’s it. So if time is money (you are a small dev shop) Rails will make you money. Otherwise Rails gives you very little. I don’t think it would be a bad choice for most apps, but I don’t think it’s the panacea many Rails advocates would claim.

Your Favorite Language Sucks

Saturday, April 19th, 2008

I’ve noticed at the few events I go to that have a lot of consultants running around that the de facto standard for freelance developers has become Ruby on Rails. I think this is very sensible actually, because Rails was written by and for these kinds of people, where getting something that works well enough running in the shortest amount of time possible is the name of the game. Rails (and the subset of Ruby that Rails uses) has also exposed a lot of people to object-oriented concepts that have been around for a while. And the Rails implementation is a really good one for wrapping your brain around how and why object-oriented design is useful.

That said, the one thing that annoys me about this whole thing (and a lot of the webdev blogosphere) is that a vocal subset of Rails people are religious about it. They like to come into any situation and immediately begin advocating for why Rails is magic and should be adopted for everything. This of course has nothing to do with Rails itself, which as far as I can tell never claims to be anything other than what it is: a rapid application development framework. By rights it should be putting the competition (mostly .NET) out of business: it’s probably the best RAD framework out there. But it isn’t good for everything.

Backend work at Yahoo tends to be written in C++ or Java. Much more time consuming for development, but necessary to support the kind of scale our applications need. Frontend work is done in PHP, but not because PHP is super-awesome. Rather, PHP is easily extended with compiled extensions, is quite fast and flexible for the front end, and is easy enough for experienced developers to get up to speed with.

There have been some attempts to get Rails going at Yahoo, with mixed results. The fact is that for a company with thousands of developers, a robust platform and massive scale, Rails doesn’t have much to offer. That’s because it’s the wrong tool for the job. Not because it is the sux0rz, or because PHP/Java is the roxorz. It’s because any competent developer should be able to get the job done in any language, provided it’s the right tool for the situation.

If you need to write a run-of-the-mill database-driven application and get it done yesterday, Rails is the tool for you. If you have Microsoft servers and need to do the same, ASP.NET is probably the tool for you. If you have Microsoft servers, a lot of time and expertise, you should probably throw out the ASP.NET framework and write your own thing in C# for the .NET runtime. If you are Yahoo, Google, Amazon or aspire to be, you should decouple your frontend from your backend, write your front end in a nice template-focused language like PHP or Python and design a nice REST interface for the backend, implemented however (even Rails if you like) with the understanding that as you grow you might need to replace the backend.

At least, that’s my opinion. Others have succeeded with any old stack. Wikipedia is pure LAMP, Amazon is JSP on the frontend and Twitter runs on Rails. That’s probably strong evidence that it doesn’t even matter that much what you use, as long as you design the thing to solve your problem. The rest is just details.

Caching Web Services with APC

Wednesday, October 24th, 2007

Everybody loves web services. It seems like the perfect architecture: back-end (database) stuff is handled independently, with front-end servers communicating with the back end via HTTP. The benefits of the RESTful way of building web apps are well documented elsewhere. There is one problem with RESTful architecture, especially if you are using PHP. There are two major consequences of this. The first, most obvious one is that when you are making web service calls your app cannot be any faster than the web service. If the web service is slow, your site is slow.

There is another consequence that you won’t notice unless you get a lot of traffic. If your server is getting three requests per second or more you will notice that your CPU is suddenly pegged. For most people, when they get this kind of traffic they just buy more hardware, but if your site depends on web services, those calls may be the culprit for all the CPU load.

It turns out that web service requests are actually very expensive for the server. If you use curl you typically use it synchronously: you make a request, wait for the response and then process the data received. This doesn’t sound difficult, but the trouble is that the PHP process handling the request just sits there waiting, and it can’t be put to any other use until the response arrives. So even though the PHP curl operations aren’t that much work for the processor, under load they can quickly overwhelm the server.

There is an easy solution of course, and that is caching. On the server most developers know about caching pages. Unfortunately a lot of modern applications deliver a custom experience for the user that cannot be cached. But your web service calls can. Say you are making some sort of custom news page that aggregates a user’s favorite news and Flickr feeds. You cannot cache the HTML output, because every user is different. You can however cache the RSS feeds you are using to build the page.
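To make that concrete, here is a rough sketch of the idea, written in Python only to keep it short. Everything named here (`render_page`, `fetch_feed`, `feed_cache`) is made up for illustration, and the plain dictionary is a stand-in for whatever cache you actually use. The point is just that the page is built fresh for each user while the feeds behind it are fetched once and shared.

```python
def render_page(user, feed_urls, fetch_feed, feed_cache):
    """Build a per-user page, reusing any feed that was already fetched.

    fetch_feed is a hypothetical callable (url -> list of item titles) that
    stands in for the real web service call; feed_cache is a plain dict
    shared between requests.
    """
    sections = []
    for url in feed_urls:
        if url not in feed_cache:
            # Only the first user who wants this feed pays for the call.
            feed_cache[url] = fetch_feed(url)
        sections.append(f"{url}: {', '.join(feed_cache[url])}")
    # The page itself is never cached, because it differs per user.
    return f"Page for {user}\n" + "\n".join(sections)
```

If two users both subscribe to the same news feed, the second page is assembled without touching the web service for that feed at all. This sketch never expires anything, so it only shows what gets shared, not how long to keep it.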

This only works if the web service calls aren’t all different, which is why the RSS aggregator is a good example. I assume that many of the feeds would be popular with lots of different users. My approach is to generate an md5 hash of the feed URL and use that as the id for the cache entry. That way any time a user requests the same feed you know it and can serve it from cache rather than hitting the web service again.

If you are on a low-traffic site or a shared hosting plan, the best way to do this is to just cache in the file system the same way you would cache a dynamic HTML page. Just make sure the unique identifier for the file is that hash, and that you expire the cache often enough that the content is fresh, but not so often that you get no benefit from caching.

If you have a high-traffic site you already know that caching to disk is insanely expensive (PHP’s file_exists() is very expensive), so you will want to cache in memory. The good news is that APC (Alternative PHP Cache) exists and is awesome. It is not as awesome as memcached (which can be shared across servers, while APC’s cache is local to each machine), but for many cases it will be just as effective. When I recently used this technique I actually did all my processing on the web service response before caching a serialized PHP object, so I could save the time of doing that on every request.
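Here is a minimal sketch of that approach, again in Python rather than PHP and under a few assumptions. The class name `FeedCache`, the `fetch` and `process` callables and the 300 second default lifetime are all invented for the example, and the dictionary stands in for APC (in PHP you would reach for `apc_store()` with a TTL and `apc_fetch()` instead). It shows the three pieces: an md5 of the URL as the cache id, an expiry time, and caching the processed result rather than the raw response.

```python
import hashlib
import time


class FeedCache:
    """In-memory cache of processed web service responses, keyed by URL hash."""

    def __init__(self, fetch, process, ttl_seconds=300, clock=time.time):
        self._fetch = fetch        # hypothetical callable: url -> raw response body
        self._process = process    # hypothetical callable: raw body -> processed object
        self._ttl = ttl_seconds    # assumed lifetime; tune for freshness vs. hit rate
        self._clock = clock        # injectable so expiry can be tested without waiting
        self._entries = {}         # cache id -> (expires_at, processed value)

    @staticmethod
    def key_for(url):
        # md5 is used only as a compact identifier here, not for security.
        return hashlib.md5(url.encode("utf-8")).hexdigest()

    def get(self, url):
        key = self.key_for(url)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no web service call, no reprocessing
        # Missing or expired: call the service once and cache the processed result.
        value = self._process(self._fetch(url))
        self._entries[key] = (now + self._ttl, value)
        return value
```

Wiring it up would look something like `cache = FeedCache(fetch=download, process=parse_rss)`, where `download` and `parse_rss` are whatever you already use, and then calling `cache.get(feed_url)` wherever the page needs a feed. The file system variant is the same shape, with the hash as the file name and the file’s age checked against the lifetime.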

The performance benefits can be amazing. And this is way easier to implement than memcached.

Back from the wilderness

Sunday, September 23rd, 2007

This article covers a lot of my issues with Rails. I think PHP has taken a real beating over the last few years. This beating has been good for the language and good for developers. Exploring Rails taught me a lot about the MVC pattern and left me with some great ideas for how to develop web apps. But I came back to PHP because I found the framework often didn’t want to do what I wanted, which always slows things down.

There will always be a place for Rails in the world of rapid development. I think it is a great solution for a small consultancy that needs to produce solid apps, fast. But for large-scale development PHP is going to continue to be the standard scripting language.