Question of the day: Where would you start with Twitter?
Dare Obasanjo dissects the Twitter problem here and here as do others on TechMeme.
Dare explains the costs and benefits of push and pull implementations and of single-instance storage systems, and suggests that a naive implementation of either is probably not the way to go for a service like Twitter.
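For readers who haven't followed the push/pull discussion, here is a minimal in-memory sketch of the two naive models. Every class and method name is made up for illustration, and none of this reflects Twitter's actual code. The pull version stores each post once (single-instance storage) and pays at read time; the push version copies each post into every follower's inbox and pays at write time.

```python
from collections import defaultdict


class PullTimeline:
    """Fan-out on read: each post is stored once; timelines are built on request."""

    def __init__(self):
        self.followees = defaultdict(set)  # reader -> authors they follow
        self.posts = defaultdict(list)     # author -> [(seq, author, text)]
        self.seq = 0                       # global counter used for ordering

    def follow(self, reader, author):
        self.followees[reader].add(author)

    def post(self, author, text):
        # Cheap write: one append, no matter how many followers exist.
        self.seq += 1
        self.posts[author].append((self.seq, author, text))

    def timeline(self, reader):
        # Expensive read: gather and sort the posts of every followed author.
        merged = [p for a in self.followees[reader] for p in self.posts[a]]
        return [(a, t) for _, a, t in sorted(merged, reverse=True)]


class PushTimeline:
    """Fan-out on write: each post is copied into every follower's inbox."""

    def __init__(self):
        self.followers = defaultdict(set)  # author -> readers following them
        self.inboxes = defaultdict(list)   # reader -> [(author, text)], oldest first

    def follow(self, reader, author):
        # Note: earlier posts are not backfilled in this sketch.
        self.followers[author].add(reader)

    def post(self, author, text):
        # Expensive write: one copy per follower.
        for reader in self.followers[author]:
            self.inboxes[reader].append((author, text))

    def timeline(self, reader):
        # Cheap read: the inbox is already assembled; show newest first.
        return list(reversed(self.inboxes[reader]))
```

Both classes return the same timeline for the same sequence of follows and posts; what differs is where the work lands. An author with a huge follower count makes the push write expensive, while a reader who follows a huge number of authors makes the pull read expensive, which is roughly why neither naive version holds up on its own.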
Well, of course. No off-the-shelf implementation is going to work best, and no off-the-shelf implementation model is going to be optimal either. I think this is what most people are suggesting when they say Ruby on Rails is part of the issue: ROR is biased towards a handful of programming models. No way Dare can convince me otherwise. This isn’t to say C++ isn’t biased too. Where are its parallel processing capabilities, for instance? Enough about languages though.
The issue that I think Dare is getting at, and that the Twitter team is now describing, is that operationally they have a brittle system. I don’t know how many times I’ve seen this happen over the years. If it takes too much effort to deploy, testing is a significant pain for a small venture and it doesn’t get done. There’s not even close to enough time. If it takes too much to maintain, it’s going to scale poorly too.
A related aside: Content management systems have been a sweet spot for databases. Rolling back the clock, I didn’t quite see the benefit. It seemed just as easy to manage things by hand. The cost of managing and deploying the database was greater than the cost of manually managing files on a server or two. However, as more and more updates are made every day or every hour (a.k.a. blogs), the benefit grows rapidly. What I didn’t get initially was that this incremental benefit had a crossover point, and that it was approaching rapidly.
One reason I didn’t latch onto the benefit early enough? It was incremental.
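To put that crossover in rough terms, here is a back-of-the-envelope model. The symbols are mine and purely illustrative: $F$ is the fixed daily cost of deploying and running the database, $c_m$ is the cost of one manual update, $c_d < c_m$ is the cost of one database-backed update, and $n$ is the number of updates per day.

```latex
\begin{align*}
C_{\text{manual}}(n) &= c_m \, n \\
C_{\text{db}}(n)     &= F + c_d \, n \\
C_{\text{db}}(n) < C_{\text{manual}}(n)
  &\iff n > n^{*} = \frac{F}{c_m - c_d}
\end{align*}
```

Below $n^{*}$ the manual approach really is cheaper, which is what I saw early on. Past it, the database's advantage grows by $c_m - c_d$ with every additional update, so a small per-update difference adds up quickly once updates become frequent.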
Here’s the thing: I don’t think there’s a similar incremental approach to designing a system like Twitter, at least at this point. ROR, C++, it doesn’t matter. It’s going to take some concerted thinking to build an efficient system. I’ll take a bit of this back. A simple Twitter system can be built using any collection of tools. Whether it can grow over time or handle significant load is another issue.
That being said, I think it’s quite feasible to build a non-brittle system that scales up. I don’t think this is a hard problem. I’ve never done it before, but I don’t see how it’s a big deal iff the cache, messaging, and indexing are there by design. I’m placing these first for a reason. Now maybe some existing tools can be used to take shortcuts here or there, but I don’t think they can be the core. I might be wrong, as I was with early CMS. Maybe there’s something incremental going on with database technology that I’m missing. Could be.
My gut is telling me, though, that the core is something more dedicated. The databases and the PHP, ASP, ROR, etc. stuff make up the non-DNA features.
One other point that I think hints at the operational/brittleness challenges Twitter is having: Notice the amount of downtime since their lead developer left a few weeks back. My guess is he had a good mental model of the system, sufficient to repair things as they “failed.” However, there are so many steps on their abstraction ladder at this point that it’s challenging for others to figure out. With things being brittle, it’s hard for others to poke around the system without breaking more things. They’ll get it though. It’ll just take some time and a little pruning here and there to simplify things.