I've been using NodeJS in a few side projects lately, and have come to
like it quite a bit for certain types of applications: specifically
applications that need to do a lot of I/O in memory constrained
environments. A recent one is Wikitweets [1] which provides a real
time view of tweets on Twitter that reference Wikipedia. Similarly
Wikistream [2] monitors ~30 Wikimedia IRC channels for information
about Wikipedia articles being edited and publishes them to the Web.

For both these apps the socket.io library for NodeJS provided a really
nice abstraction for streaming data from the server to the client
using a variety of mechanisms: web sockets, flash socket, long
polling, JSONP polling, etc. NodeJS' event driven programming model
made it easy to listen to the Twitter stream, or the ~30 IRC channels,
while simultaneously holding open socket connections to browsers to
push updates to, all from within one process. Doing this sort of thing
in a more typical web application stack like Apache or Tomcat, where
each client connection is a new thread or process, can get very
expensive in terms of memory.
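
To make the single-process fan-out idea concrete, here is a minimal
sketch using only Node's built-in events module. It is not the
Wikistream or Wikitweets code: the names updates and connectClient, and
the in-memory "inbox" arrays standing in for browser connections, are
assumptions made up for illustration.

```javascript
const { EventEmitter } = require('events');

// One emitter stands in for the upstream source (Twitter stream, IRC, ...).
const updates = new EventEmitter();

// Hypothetical helper: register a client and return a disconnect function.
// A real app would write to a socket here instead of pushing to an array.
function connectClient(name, inbox) {
  const onUpdate = (item) => inbox.push(`${name} got ${item}`);
  updates.on('update', onUpdate);
  return () => updates.removeListener('update', onUpdate);
}

const inboxA = [];
const inboxB = [];
const disconnectA = connectClient('A', inboxA);
connectClient('B', inboxB);

// Every connected client sees the first update; only B sees the second.
updates.emit('update', 'edit: Node.js');
disconnectA();
updates.emit('update', 'edit: Perl');
```

Each connected client costs one listener function rather than a thread
or process, which is where the memory savings come from.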

If you've done any JavaScript programming in the browser, NodeJS will
seem familiar because of its extensive use of callbacks. This can take
some getting used to, but it can be a real win in some cases,
especially in applications that are more I/O bound than CPU bound.
Ryan Dahl (the creator of NodeJS) gave a presentation [4] to a PHP
group last year which does a really nice job of describing how NodeJS
is different, and why it might be useful for you. If you are new to
event driven programming I wouldn't underestimate how much time you
might spend feeling like you are turning your brain inside out.
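
A small sketch of what that inside-out feeling looks like in practice.
The fetchArticle function is a made-up stand-in for a slow I/O call (it
is not a real NodeJS API); it follows the usual NodeJS convention of a
callback that takes (err, result).

```javascript
const order = [];

// Hypothetical stand-in for slow I/O: it returns immediately and calls
// back with (err, result) on a later turn of the event loop.
function fetchArticle(title, callback) {
  setImmediate(() => callback(null, `contents of ${title}`));
}

order.push('request issued');
fetchArticle('Node.js', (err, body) => {
  if (err) throw err;
  order.push(`callback ran: ${body}`);
});
// This line runs before the callback above, not after it.
order.push('still running');
```

The code after the fetchArticle call runs first, and the result only
shows up later inside the callback. Nothing blocks while waiting, which
is why this style suits I/O bound applications.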

In general I was really pleased with the library support in NodeJS,
and the amount of activity there is in the community. The ability to
run the same code on the server as in the browser might be of some
interest. Also, being able to use libraries like jQuery or PhantomJS in
command line programs is pretty interesting for things like screen
scraping the tagsoup HTML that is so prevalent on the Web.

If you end up needing to do RDF and XML processing from within NodeJS
and you aren't finding good library support you might want to find
databases (Sesame, eXist, etc) that have good HTTP APIs and use
something like request [5] if there isn't already support for it. I
wrote up why NodeJS was fun to use for Wikistream on my blog if you
are interested [6].
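
As a rough sketch of the HTTP API approach, here is what querying a
database's SPARQL endpoint might look like. It uses Node's built-in
http module rather than request, and it assumes the endpoint takes the
query in a "query" parameter and can return SPARQL JSON results. The
function names and any endpoint URL you pass in are assumptions for
illustration, so check your database's documentation for the real
details.

```javascript
const http = require('http');

// Build the URL for a SPARQL query. The endpoint is whatever your
// database exposes; nothing here is specific to one product.
function sparqlUrl(endpoint, query) {
  const url = new URL(endpoint);
  url.searchParams.set('query', query);
  return url;
}

// Send the query and pass the parsed JSON to a (err, result) callback.
function runQuery(endpoint, query, callback) {
  const options = { headers: { Accept: 'application/sparql-results+json' } };
  http.get(sparqlUrl(endpoint, query), options, (res) => {
    let body = '';
    res.setEncoding('utf8');
    res.on('data', (chunk) => { body += chunk; });
    res.on('end', () => {
      let parsed;
      try {
        parsed = JSON.parse(body);
      } catch (e) {
        return callback(e);
      }
      callback(null, parsed);
    });
  }).on('error', callback);
}
```

The RDF or XML heavy lifting stays in the database, and NodeJS only
has to shuttle JSON around.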

I recommend you try doing something small to get your feet wet with
NodeJS before diving in with the rewrite. Good luck!

//Ed

[1] http://wikitweets.herokuapp.com
[2] http://wikistream.inkdroid.org
[3] http://inkdroid.org/journal/2011/11/07/an-ode-to-node/
[4] http://www.youtube.com/watch?v=jo_B4LTHi3I
[5] https://github.com/mikeal/request
[6] http://inkdroid.org/journal/2011/11/07/an-ode-to-node/

On Tue, May 8, 2012 at 5:24 PM, Randy Fischer <[log in to unmask]> wrote:
> On Mon, May 7, 2012 at 11:17 PM, Ethan Gruber <[log in to unmask]> wrote:
>
>>
>>
>> It was recently suggested to me that a project I am working on may adopt
>> node.js for its architecture (well, be completely re-written for node.js).
>> I don't know anything about node.js, and have only heard of it in some
>> passing discussions on the list.  I'd like to know if anyone on code4lib
>> has experience developing in this platform, and what their thoughts are on
>> it, positive or negative.
>
>
>
> It's a very interesting project - I think of it as kind of non-preemptive
> multitasking framework, very much like POE in the Perl world, but with a
> more elegant way of managing the event queue.
>
> Where it could shine is that it accepts streaming, non-blocking HTTP
> requests.  So for large PUTs and POSTs, it could be a real win (most other
> web-server arrangements are going to require completed uploads of the
> request, followed by a hand-off to your framework of an opened file
> descriptor to a temporary file).
>
> My naive tests with it a year or so ago gave inconsistent results, though
> (sometimes the checksums of large PUTs were right, sometimes not).
>
> And of course to scale up, do SSL, etc, you'll really need to put something
> like Apache in front of it - then you lose the streaming capability.  (I'd
> love to hear I'm wrong here).
>
>
> -Randy Fischer