NewspaperARCHIVE evaluates Hadoop/Solr – decides to go with Exalead

March 31st, 2010 by Market'&Co

Recently, Exalead made an announcement, “Scales with Exalead”, regarding one of our newest clients, NewspaperARCHIVE.

We mentioned in the press release that NewspaperARCHIVE set out to replace Autonomy search, and that they looked at a number of alternatives, including the open source search software Solr.

What we didn’t mention in the press release is that NewspaperARCHIVE also evaluated a combination of Solr and Hadoop before purchasing an Exalead license.

A few facts about that Solr/Hadoop evaluation that we didn’t mention in the press release:

– The NewspaperARCHIVE database contains just over 100 million newspaper pages, each averaging about 6,000 words – a total of roughly 600 billion terms or 2 TB of text.

– NewspaperARCHIVE already owned a number of midrange servers (HP ProLiant DL300 servers).

– NewspaperARCHIVE decided that the only way Solr/Hadoop would work better was if they purchased a large number of new commodity servers to run Solr/Hadoop.

– With Exalead, NewspaperARCHIVE was able to produce more efficient results on the existing server farm.
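
For a sense of scale, the corpus figures in the list above can be checked with some back-of-the-envelope arithmetic. This is only a rough sketch: the inputs are the rounded numbers quoted above, the variable names are ours, and we assume "2 TB" means decimal terabytes.

```python
# Rounded figures quoted in the post; all values are approximate.
pages = 100_000_000        # "just over 100 million newspaper pages"
words_per_page = 6_000     # "each averaging about 6,000 words"
text_bytes = 2 * 10**12    # "2 TB of text" (assumption: decimal terabytes)

# Total number of terms across the whole archive.
total_terms = pages * words_per_page

# Average amount of text per page implied by the two totals.
bytes_per_page = text_bytes // pages

print(f"{total_terms:,} terms in total")
print(f"{bytes_per_page:,} bytes of text per page on average")
```

The product of the first two figures gives the 600 billion terms quoted above; the per-page size is simply what the rounded totals imply, not a measured value.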

How so, you may ask. Well, Exalead built a distributed computation layer equivalent to Hadoop/map-reduce for our web search engine. We call it dSort. For NewspaperARCHIVE, Exalead’s dSort technology accomplished everything Solr/Hadoop offered in terms of distributed computing, but managed it more efficiently. And more importantly, Exalead’s built-in semantic processing capabilities ensured that customers of NewspaperARCHIVE would get significantly more relevant results. In my conversations with the NewspaperARCHIVE team, they’ve said they’re very happy. And so are we.
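
For readers unfamiliar with the map-reduce pattern mentioned above, here is a minimal, single-process sketch of the general idea, applied to building a small inverted index. It is a generic illustration only: it does not show how dSort or Hadoop are implemented, and every function name and sample page in it is hypothetical.

```python
from collections import defaultdict
from itertools import chain


def map_page(page_id, text):
    """Map step: emit one (term, page_id) pair per word on a page."""
    return [(term.lower(), page_id) for term in text.split()]


def shuffle(pairs):
    """Group emitted page ids by term, as a framework does between phases."""
    groups = defaultdict(list)
    for term, page_id in pairs:
        groups[term].append(page_id)
    return groups


def reduce_term(term, page_ids):
    """Reduce step: collapse page ids into a sorted, de-duplicated list."""
    return term, sorted(set(page_ids))


def build_index(pages):
    """Run map, shuffle, and reduce in sequence over a dict of pages."""
    mapped = chain.from_iterable(
        map_page(page_id, text) for page_id, text in pages.items()
    )
    return dict(reduce_term(term, ids) for term, ids in shuffle(mapped).items())


# Hypothetical sample pages, invented for this illustration.
pages = {1: "Fire destroys mill", 2: "Mill reopens after fire"}
index = build_index(pages)
print(index["fire"])
```

In a real distributed system, the map and reduce steps would run in parallel across many machines, and the shuffle would move data between them; the sketch runs everything in one process to keep the pattern visible.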


  1. Otis Gospodnetic says: April 7th, 2010

    I’ll bite:
    I’d love to see the research, numbers, etc. used to reach these conclusions. Without that, this is potentially FUD.


