<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: Twitter flies off the rails</title>
	<atom:link href="http://stephenwoods.net/wordpress/2008/05/01/twitter-flies-off-the-rails/feed/" rel="self" type="application/rss+xml" />
	<link>http://stephenwoods.net/wordpress/2008/05/01/twitter-flies-off-the-rails/</link>
	<description>Wherein I discuss whatever</description>
	<pubDate>Wed, 20 Aug 2008 09:19:11 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5</generator>
		<item>
		<title>By: admin</title>
		<link>http://stephenwoods.net/wordpress/2008/05/01/twitter-flies-off-the-rails/#comment-149</link>
		<dc:creator>admin</dc:creator>
		<pubDate>Fri, 02 May 2008 16:11:54 +0000</pubDate>
		<guid isPermaLink="false">http://stephenwoods.net/wordpress/?p=56#comment-149</guid>
		<description>I think you are right. I worked on an app that delivered emails that I think was pretty similar. We sent out "triggered" emails to about 8 million users every day. We had an event server written in C that watched the db and output a feed of change events. The notification machine ran a PHP script that built each email and called an email service to send the messages. We ran it on, I think, 4 machines total (mirrored for continuity in two colos). It took about five hours for us to send out the messages, but that was mostly because the database queries for each email were quite complex.</description>
		<content:encoded><![CDATA[<p>I think you are right. I worked on an app that delivered emails that I think was pretty similar. We sent out &#8220;triggered&#8221; emails to about 8 million users every day. We had an event server written in C that watched the db and output a feed of change events. The notification machine ran a PHP script that built each email and called an email service to send the messages. We ran it on, I think, 4 machines total (mirrored for continuity in two colos). It took about five hours for us to send out the messages, but that was mostly because the database queries for each email were quite complex.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Vidar Hokstad</title>
		<link>http://stephenwoods.net/wordpress/2008/05/01/twitter-flies-off-the-rails/#comment-148</link>
		<dc:creator>Vidar Hokstad</dc:creator>
		<pubDate>Fri, 02 May 2008 09:54:03 +0000</pubDate>
		<guid isPermaLink="false">http://stephenwoods.net/wordpress/?p=56#comment-148</guid>
		<description>Actually, scaling an app of Twitter's type is really quite simple. Messaging is simple overall. I don't even like Rails, but frankly if someone fails to scale an app like Twitter with Rails, the problem is in front of the keyboard...

This is a classic multi-tier type app, where scaling Rails should be as simple as adding more boxes - there's simply no need for more than an absolutely minimal state on the frontends at all, certainly nothing you couldn't trivially keep in memcached sessions. If they do something stupid, like trying to shovel lots of database updates in via Rails, then they will of course run into problems (and one of my dislikes about Rails is that it makes shooting yourself in the foot far easier than it needs to be).

For the backend, messaging of the type Twitter does isn't hard; it's a simple matter of setting up a mesh of relatively simple servers. I commented on the TechCrunch thread that frankly there are off-the-shelf alternatives that will handle this nicely, notably a number of Jabber servers, which have the advantage that they're designed from the ground up to be distributed, so all that's needed is to farm the userbase out over a large enough number of domains - the users never need to know. If they suffer from not-invented-here syndrome, writing an efficient messaging server in Ruby isn't hard - I built one in less than 700 lines that would be trivial to mesh and that I tested up to around 2-3 million messages a day.

Persistence adds complexity, but nothing that can't quite simply be handled by a partitioned database, or even an e-mail-like system - I've done messaging queues handling huge volumes layered over e-mail before, and it sounds ugly, but it's trivial to scale and make reliable because the protocols are all well understood and you can cherry-pick from so many components.

Back to the Ruby messaging server: At the 2-3 million messages/day level it took less than 10% of a single 2GHz Xeon core. Conservatively scaling that to at least 15 million a day on a single core brings us to between 120 and 240 million messages a day per server (assuming 8 or 16 core boxes, which seem to be the most cost-effective now). Assuming peaks etc. we could be generous and "only" try to handle 100 million a day per 16 core box. That's less than 25 cents/month per million or so delivered messages if you go for managed hosting of rented servers, including outbound bandwidth.

I've run into my share of nasty scaling problems with Ruby, but only with things that are actually tricky to handle. This isn't one of them.

And yes, when the problems get really hard, I do want a "scripting language" like Ruby to handle them - it means the easy things get out of the way trivially easily, and I can focus on the few hard bits, even if that means dipping down to C to write a tiny extension to speed something up, or slot in a tiny server in whatever other language I decide to use.

Otherwise I agree with you that Rails isn't by any means the panacea that Rails advocates tend to claim, but not because of Twitter.</description>
		<content:encoded><![CDATA[<p>Actually, scaling an app of Twitter&#8217;s type is really quite simple. Messaging is simple overall. I don&#8217;t even like Rails, but frankly if someone fails to scale an app like Twitter with Rails, the problem is in front of the keyboard&#8230;</p>
<p>This is a classic multi-tier type app, where scaling Rails should be as simple as adding more boxes - there&#8217;s simply no need for more than an absolutely minimal state on the frontends at all, certainly nothing you couldn&#8217;t trivially keep in memcached sessions. If they do something stupid, like trying to shovel lots of database updates in via Rails, then they will of course run into problems (and one of my dislikes about Rails is that it makes shooting yourself in the foot far easier than it needs to be).</p>
<p>For the backend, messaging of the type Twitter does isn&#8217;t hard; it&#8217;s a simple matter of setting up a mesh of relatively simple servers. I commented on the TechCrunch thread that frankly there are off-the-shelf alternatives that will handle this nicely, notably a number of Jabber servers, which have the advantage that they&#8217;re designed from the ground up to be distributed, so all that&#8217;s needed is to farm the userbase out over a large enough number of domains - the users never need to know. If they suffer from not-invented-here syndrome, writing an efficient messaging server in Ruby isn&#8217;t hard - I built one in less than 700 lines that would be trivial to mesh and that I tested up to around 2-3 million messages a day.</p>
<p>Persistence adds complexity, but nothing that can&#8217;t quite simply be handled by a partitioned database, or even an e-mail-like system - I&#8217;ve done messaging queues handling huge volumes layered over e-mail before, and it sounds ugly, but it&#8217;s trivial to scale and make reliable because the protocols are all well understood and you can cherry-pick from so many components.</p>
<p>Back to the Ruby messaging server: At the 2-3 million messages/day level it took less than 10% of a single 2GHz Xeon core. Conservatively scaling that to at least 15 million a day on a single core brings us to between 120 and 240 million messages a day per server (assuming 8 or 16 core boxes, which seem to be the most cost-effective now). Assuming peaks etc. we could be generous and &#8220;only&#8221; try to handle 100 million a day per 16 core box. That&#8217;s less than 25 cents/month per million or so delivered messages if you go for managed hosting of rented servers, including outbound bandwidth.</p>
<p>I&#8217;ve run into my share of nasty scaling problems with Ruby, but only with things that are actually tricky to handle. This isn&#8217;t one of them.</p>
<p>And yes, when the problems get really hard, I do want a &#8220;scripting language&#8221; like Ruby to handle them - it means the easy things get out of the way trivially easily, and I can focus on the few hard bits, even if that means dipping down to C to write a tiny extension to speed something up, or slot in a tiny server in whatever other language I decide to use.</p>
<p>Otherwise I agree with you that Rails isn&#8217;t by any means the panacea that Rails advocates tend to claim, but not because of Twitter.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

