EDIT 1: In v1 of this post some of my wording was misinterpreted to mean we don't care about scaling chat. We do! In fact, late last night we released some improvements that you can read about in the follow-up piece, Chat Scalability Improvements.
EDIT 2: Added a new third paragraph, bumped original third down to fourth with slight edits - notably removing that statement :) Also edited "Edit 1".
I joined Justin.tv Inc three years ago as the first developer to work on our new gaming vertical, "Xarth". Six weeks of hard graft by Jacob and me, plus a last-minute name change, and Twitch was born. I'm constantly humbled by how big we've become, and as part of trying to deal with the shock I've spent a while thinking about why certain parts of our product work. "Twitch Plays Pokemon" is the perfect segue into discussing one of the core parts of our experience: "Live Chat".
Chat has been a constant scaling PITA. We are pretty sure we're one of the largest IRC networks in the world (at time of writing, a low tide, we have nearly 500k concurrent users). It is also one of the strongest reasons to use, and return to, Twitch. Our broadcasters are the reason we have a community, but they'd not be able to interact with their viewers were it not for chat, so I think of it as our community fulcrum. But why would something so simple be so successful? There are far more compelling features on sites that have had far less success than our chat feature. I believe the answer is fairly simple: text chat is a lowest common denominator technology. It is a platform that permits others to build on top of it. Similar to how Twitter succeeded due to its API and the clients that third parties developed for it, our chat succeeds because IRC is incredibly easy to integrate with and our custom bindings are few and far between, so you get the same experience in our client as you do in your IRC client. We've had people build auto-moderation tools, poll tools, random chatter selection tools (for things like "enter a pool to play with this pro if you subscribe to their channel"), and now massively multiplayer Pokemon! Who'd have thunk it, eh?
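To give a feel for how little it takes to build on top of plain IRC, here is a minimal sketch of a poll tool. It assumes only the standard IRC line format (":nick!user@host PRIVMSG #channel :text"); the function names, the one-vote-per-nick rule and the sample lines are made up for illustration and are not taken from any real third-party tool.

```python
from collections import Counter


def parse_privmsg(line):
    """Parse a raw IRC PRIVMSG line into (nick, channel, text), or None."""
    if not line.startswith(":"):
        return None  # no prefix, e.g. "PING :server"
    prefix, _, rest = line[1:].partition(" ")
    command, _, params = rest.partition(" ")
    if command != "PRIVMSG":
        return None
    channel, _, text = params.partition(" :")
    nick = prefix.split("!", 1)[0]
    return nick, channel, text.rstrip("\r\n")


def tally_poll(lines, options):
    """Count one vote per nick; a later vote replaces an earlier one."""
    votes = {}
    for line in lines:
        parsed = parse_privmsg(line)
        if parsed is None:
            continue
        nick, _channel, text = parsed
        choice = text.strip().lower()
        if choice in options:
            votes[nick] = choice
    return Counter(votes.values())


if __name__ == "__main__":
    sample = [
        ":alice!alice@host PRIVMSG #chan :left",
        ":bob!bob@host PRIVMSG #chan :right",
        ":alice!alice@host PRIVMSG #chan :right",
        "PING :server",
    ]
    print(tally_poll(sample, {"left", "right"}))
```

The same few lines of parsing are all a moderation bot or a random chatter picker would need as a starting point, which is rather the point.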
TPP puts a very new type of stress on our system: massive numbers of inbound messages. Up until now the major scaling challenge has been delivering one message to the many viewers of a channel. Deciding whether we should deliver a message is fairly heavy; we need to answer questions like "is this person a subscriber?", "have they been banned?", and so on. With such a large increase in inbound message volume we're seeing parts of that pipeline struggle (caches, DB access, etc.).
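As a rough illustration of why every inbound message carries a cost, here is a minimal sketch of a per-message delivery check with a cache in front of a slow lookup. Everything in it is an assumption for illustration: the class name, the shape of the user state, the subscribers-only switch and the unbounded dictionary cache are hypothetical and do not describe our actual pipeline.

```python
class DeliveryFilter:
    """Decide whether a chat message should be delivered (illustrative only)."""

    def __init__(self, lookup_user):
        # lookup_user(channel, nick) stands in for a slow DB call and
        # returns a dict like {"banned": bool, "subscriber": bool}.
        self._lookup_user = lookup_user
        self._cache = {}
        self.lookups = 0  # how many times we had to hit the slow path

    def _user_state(self, channel, nick):
        key = (channel, nick)
        if key not in self._cache:
            self.lookups += 1
            self._cache[key] = self._lookup_user(channel, nick)
        return self._cache[key]

    def should_deliver(self, channel, nick, subscribers_only=False):
        state = self._user_state(channel, nick)
        if state["banned"]:
            return False
        if subscribers_only and not state["subscriber"]:
            return False
        return True


if __name__ == "__main__":
    def fake_db(channel, nick):
        return {"banned": nick == "troll", "subscriber": nick == "fan"}

    f = DeliveryFilter(fake_db)
    for _ in range(1000):
        f.should_deliver("#tpp", "fan")
    print(f.lookups)
```

With one message fanned out to many viewers, a check like this runs once per message. With many distinct chatters all sending at once, the cache sees far more distinct keys and the slow path gets hit much more often.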
For use cases like TPP there are a number of potential solutions that we've posited in the past. One such idea was "kappa aggregation" where, instead of delivering the billions of kappas we deliver today, we would deliver one message every N seconds containing the count of kappas. While many of these potential improvements would help the overall scalability of the system, they would limit creative potential, because the current free-form stream of bits would be rewritten into something with more structure. The correct thing to do, and the thing we're constantly doing, is to double down on the scalability of our chat system.
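For the curious, the "kappa aggregation" idea can be sketched in a few lines. This is a toy version under stated assumptions, not something we run: the function name, the (timestamp, text) input shape and the whitespace-based matching of the "Kappa" token are all made up for illustration.

```python
from collections import defaultdict


def aggregate_kappas(messages, window_seconds, token="Kappa"):
    """Collapse messages into one (window_start, count) pair per window.

    messages: iterable of (timestamp_seconds, text) pairs.
    Windows with no matching token produce no output at all.
    """
    counts = defaultdict(int)
    for timestamp, text in messages:
        n = text.split().count(token)
        if n:
            window_start = int(timestamp // window_seconds) * window_seconds
            counts[window_start] += n
    return sorted(counts.items())


if __name__ == "__main__":
    chat = [
        (0.5, "Kappa"),
        (1.2, "Kappa Kappa"),
        (4.9, "hello"),
        (5.0, "Kappa"),
        (11.0, "lol Kappa"),
    ]
    print(aggregate_kappas(chat, window_seconds=5))
```

Note what gets thrown away: who said it, what else was in the message, and the exact timing. That loss of raw, free-form text is exactly the cost to creativity described above.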