At Sat, 12 Jul 2008 10:46:06 -0400,
Godmar Back wrote:
>
> Min, Eric, and others working in this domain -
>
> have you considered designing your software as a scalable web service
> from the get-go, using such frameworks as Google App Engine? You may
> be able to use Montepython for the CRF computations
> (http://montepython.sourceforge.net/)
>
> I know Min offers a WSDL wrapper around their software, but that's
> simply a gateway to one single-machine installation, and it's not
> intended as a production service at that.

Thanks for the link to montepython. It looks like it might be a good
tool for me to learn more about machine learning.

As for my citation metadata extractor, once the training data is
generated it would be trivial to scale it; there is no shared state.
All that is really needed is an implementation of the Viterbi
algorithm, & there is one (in pure Python) on the Wikipedia page; it
is about 20 lines of code.

So presumably it could be scaled on the Google App Engine pretty
easily. But it could be scaled on anything pretty easily; all you need
is a load balancer and however many servers are necessary (not many, I
would think).

best,
Erik Hetzner