Hacker News

I'm sure LinkedIn could have cut servers without switching to node. Take your first, crappy implementation and rewrite it in the same language and you'll probably still see at least 10x improvement, if not 20.


This times a thousand. When you first write the software you're basing it on expectations, no matter how well you plan; when you rewrite it you're coming at it with real-world knowledge of the pain points, so of course it's going to be better in terms of performance.


"This times a thousand" would be an improvement by a factor of 10000. The improvement should definitely not be that large.


Thanks for the explanation, skeletonjelly. I think the reason I misunderstood is that I don't hang out on the Internet enough: a friend explained to me that "This." and "this times a thousand" are Internet slang that mean "I agree" (the second presumably means something like "On a scale from 1 to infinity my agreement level is 1000." :). As far as I know these expressions aren't used verbally, which would explain why I took it literally.


It means repeat it a thousand times for emphasis.


You're getting downvoted because the poster was referring to the gravity/importance of the message.


We don't all have English as our first language, and there's no problem with that. I'd have replied exactly as you did if I thought he literally meant "this times a thousand".


I included an extra bit of improvement to deal with inflation rates.


LOL, apparently sarcasm and witty banter are not HN's forte :P


I disagree. When you change state management to client side, you are making a fundamental architecture shift that is significant enough to remove a lot of server-side overhead. What makes you think refactoring code is going to give you a 10x improvement in efficiency? If your code is that bad, you should get rid of the developers along with the code.


>I disagree. When you change state management to client side, you are making a fundamental architecture shift that is significant enough to remove a lot of server-side overhead.

In most cases, you're not. You're mostly making a big logic-spaghetti mess on both the client and the server AND making your pages load slower, especially on the initial load (client performance, JS loading times, etc.).

>What makes you think refactoring code is going to give you a 10x improvement in efficiency?

Because even correctly implementing just the caching layer, with nested/micro-caching, can give you up to 1000x improvement in efficiency in the first place.
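(For illustration, a minimal sketch of the kind of micro-cache meant here, in Node; the key, the 1-second TTL, and `renderFragment` are all made up:)

```javascript
// A tiny TTL micro-cache: even a 1-second TTL collapses thousands of
// identical computations per second into one, which is where the big
// multipliers come from.
const cache = new Map();

function cached(key, ttlMs, compute) {
  const hit = cache.get(key);
  if (hit && hit.expires > Date.now()) return hit.value;
  const value = compute();
  cache.set(key, { value, expires: Date.now() + ttlMs });
  return value;
}

// Hypothetical expensive render; with the cache it runs at most once
// per second per key, no matter how many requests arrive.
let renders = 0;
const renderFragment = () => { renders += 1; return '<ul>...</ul>'; };
for (let i = 0; i < 1000; i++) cached('front-page', 1000, renderFragment);
```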


It's true that moving more logic and state management to the client side often increases page load times - but it also (potentially) allows you to make certain interactions much faster.

When you have real data on the client instead of just a bunch of markup, you can be a lot smarter about how and when you make additional AJAX requests. Optimistic updates can make a huge difference in perceived performance.
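(The optimistic-update pattern is simple to sketch; `likePost` and the state shape here are toy examples, not anyone's real code:)

```javascript
// Optimistic update: mutate local state immediately for instant UI
// feedback, then roll back if the server request ultimately fails.
function likePost(state, postId, sendRequest) {
  const previous = state.likes[postId] || 0;
  state.likes[postId] = previous + 1;   // shown to the user right away
  return sendRequest(postId).catch(() => {
    state.likes[postId] = previous;     // reconcile on failure
  });
}
```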


>> Take your first, crappy implementation and rewrite it in the same language and you'll probably still see at least 10x improvement.

You will probably get the same mess. There is plenty of literature about that. E.g.: a complete rewrite is what killed Netscape


As bad as Netscape 4 was, what killed the company was their decision to charge for the browser while their monopolistic competitor was giving it away for free.


When a competitor undercuts your pricing with free, not doing anything for three years is unlikely to be the optimal response...


Actually, the competitor did it deliberately, in order to kill Netscape's business model. This affected Opera too.


I think this is taking the wrong lesson from Netscape. If you do a greenfield rewrite of a multi-million-line application you probably are going to have a bad time, but people successfully rewrite smaller applications or portions of an application all the time. Since this was just a portion of the total functionality of LinkedIn, they incurred much less risk than a ground-up rewrite would have.


Don't rewrite the whole thing at once! Cut off logical sections and fix them one at a time.


>You will probably get the same mess. There is plenty of literature about that. E.g.: a complete rewrite is what killed Netscape

That's a meme started by a Joel Spolsky article, and maybe an allusion to the "second system effect" (which is about something else altogether). Hardly "plenty of literature about that".

Actually, the complete rewrite might have killed Netscape, but it saved Mozilla and Firefox. And Netscape was a multi-million-line web browser engine, with a JavaScript interpreter, a full mail client, and a WYSIWYG editor thrown in. And multi-platform in C/C++ to boot.

That is, something an order of magnitude more difficult than LinkedIn or 99% of web properties out there.

There have been TONS of successful rewrites. Especially in the web space, it's almost trivial to rewrite your webapp or parts of it. To name but a few:

Twitter, the new Digg, SoundCloud, Basecamp, etc etc.


It's got a slightly longer history than just being a Joel Spolsky meme: http://en.wikipedia.org/wiki/Second-system_effect


Noticed how I already wrote about that? To quote:

"That's a meme started by a Joel Spolsky article, and maybe an allusion to the "second system effect" (which is about something else altogether)."

That said, the "second system effect" is not about merely rewriting risks, but especially about architecture and design choices. From the very wikipedia article:

"""People who have designed something only once before, try to do all the things they "did not get to do last time," loading the project up with all the things they put off while making version one, even if most of them should be put off in version two as well."""

That is, if you design your rewrite _without_ wanting to build a bigger, more involved product, but merely a cleaner and more cleanly made product, this does not apply.

Another quote from the very article: """The second-system effect refers to the tendency of small, elegant, and successful systems to have elephantine, feature-laden monstrosities as their successors."""

This is not the case we refer to here. Netscape by version 4 had become an ungodly mess (and even before that), not a "small, elegant" system. And Mozilla/Firefox, the rewrite, is cleaner and more elegant than Netscape was.

Consider a 100 line Python script. People can rewrite it from scratch in 100 different ways, while improving upon it with no problem. At some point of complexity this stops being true, but Brooks was talking about huge projects, built by enormous teams, like OS/360 and such. Not some 20K - 100K line web project.


the complete rewrite might have killed Netscape, but it saved Mozilla and Firefox.

What do you mean?


I mean that while Netscape, the company, succumbed while waiting to put out their new competitive browser, we now have the Mozilla Foundation and Firefox.

We wouldn't have those if it wasn't for the rewrite. The old code was a mess even before version 4 (by its developers' own admission), and it could never get to the point of competing in the engine space ever again.

That is, with the old Netscape rendering engine it would not be possible to extend it to compete in the modern HTML5/Canvas/GPU acceleration/CSS3/add-ons/separate contexts for each tab/etc era.


And since we are talking here more about technology, and not business models, we should more or less regard the Netscape rewrite as a success story. They did NOT produce a system that was over-engineered or that failed to perform well. Indeed it took a tremendous amount of market share. Now of course they complain about the code again but the situation is far from desperate.


Is moving from Rails to Node not a complete rewrite?


It doesn't count as a rewrite when the rewrite is in a language that's cooler than the original version.


They just wrote a Ruby interpreter for Node.js. Node.js is just that fast.


Where can I see this?


You'll have to wait until April 1.


I'm with you on this one. I'm currently in the process of re-platforming and I'm noticing considerable gains just from revising the way certain processes are done. You find a lot of "wtf was I thinking".


My thoughts exactly -- going from 30 to 3 servers is no joke and it couldn't just be because of a move to node.js


Node has some pretty unique benefits. I save a good $1000/month from switching from .NET on dedicateds to NodeJS on a PaaS (Heroku), and that was after 2-3 years of writing, rewriting, and optimizing the .NET stuff.

Biggest improvements came from persistent connections to redis/mongodb and polling for updated information independently of requests, so there were no cached-or-fetch shenanigans at all in some areas.
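(A rough sketch of that request-independent polling; `db.fetchAll` is a made-up stand-in for the real redis/mongodb client call:)

```javascript
// One persistent client plus a background refresh loop: request handlers
// read straight from local memory and never do cache-or-fetch on the hot path.
let snapshot = {};

async function refresh(db) {
  snapshot = await db.fetchAll();      // single round-trip, off the request path
}

function start(db, intervalMs) {
  refresh(db);                         // warm the snapshot immediately
  return setInterval(() => refresh(db), intervalMs);
}

// A request handler is now just a memory read:
const handler = (key) => snapshot[key];
```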


>> " I save a good $1000/month from switching from .NET on dedicateds to NodeJS on a PaaS (Heroku)"

I'm really puzzled by this. A single large dedicated .NET box (16GB RAM, 6-core processor, 1TB RAID, etc.) runs $150 a month. When I was looking at PaaS, Heroku came in at more than five times that for similar capability. $1,000 a month gets me SIX of these dedicated boxes that can be tied together as needed.

Why was your dedicated setup so much more expensive that you could _save_ that much, let alone why were you spending that much in the first place?


What were the benefits you saw in switching from .NET to Node? Was it in basic code structure/complexity or performance?

I'd actually be very surprised if NodeJS running on Heroku (which is built on EC2) performs better than compiled .NET code running on dedicated Windows hardware.


The major benefits were persistent connections and background fetching of data - a lot of my requests serve data directly from local memory instead of hitting local or shared caches and databases.

The equivalent in .NET, I guess, would be BackgroundWorkers that independently prefetch the data required most of the time, but I could never get them to Just Work.

Specifically for my use case 99+ percent of requests receive some data, do some light manipulations and then push the data to redis about 8,000 to 12,000 times a second. With .NET I could only push to locally running software instances because anything remote couldn't keep up with the connection volume (without throwing even more hardware at the problem).


Interesting. Thanks for the answer. I'm a little surprised that .NET can't juggle network connections very well but in retrospect, I probably shouldn't be.


I think it's just whatever .NET's doing to pool them that has some tiny bit of overhead that doesn't matter most of the time.


Interesting.. how often do you pull from the db's? Do you think about it as a write-through cache?


Every 30 or 60 seconds + the data's timestamped so it's usually a pretty minimal refresh.

It's really just like using memcached or the built-in .NET caching rather than a write-through cache: new data reaches each dyno either via the periodic refresh or when it tries to create a record and finds it already exists. Writes are done immediately and the caches don't get updated, because there are 8-16 dynos and why bother updating only the single dyno that created it.

I did originally use redis pub/sub to push out updates to everything but I ended up removing it because it was unnecessary.

Here's some example code, it pre-fetches all the leaderboards (not scores) every 30 seconds: http://pastebin.com/asq6eExu

Higher up in the same script is the api for the leaderboard data with stuff like: http://pastebin.com/gsfvsZsv
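(The timestamped refresh described above might look roughly like this; `fetchSince` is a stand-in for the real query, not the poster's actual code:)

```javascript
// Timestamp-delta refresh: each cycle asks only for rows changed since
// the previous one, so most 30-second refreshes are near-empty.
function makeRefresher(fetchSince, cache) {
  let lastSeen = 0;
  return async function refresh() {
    const rows = await fetchSince(lastSeen);
    for (const row of rows) {
      cache.set(row.id, row);
      if (row.updatedAt > lastSeen) lastSeen = row.updatedAt;
    }
    return rows.length;                // how much actually changed
  };
}
```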


This article actually gives details of their first implementation which suggest that they could have cut servers without switching to node:

http://ikaisays.com/2012/10/04/clearing-up-some-things-about...


Yeah, as the article notes, the old version used HTML and the new version uses binary blobs.

This is hardly surprising.



