Archive for July, 2007

July 13th, 2007

A Hybrid Web/Database Search Engine for France’s No. 1 Phone Info Service

In France, people have a daily habit of dialing 118 218 to get the number of their favorite Thai restaurant, to find out what’s playing at the local cinema, or to scout out a new dentist.

Now this telephone information service has launched a beta website at www.118218.fr that offers visitors a new, free way to access its immense database.

Powered by Exalead’s search technology, this new hybrid web/database directory offers rich visuals and practical search refinement tools to help users get the right information fast.


July 12th, 2007

Fresh Access to 13 Million Scientific Articles

Exalead has reached an agreement with the Institute of Scientific and Technical Information (INIST) that will allow Exalead to offer its public search engine users access to INIST’s massive database of English, French, Spanish and Italian-language scientific articles.


Exa-searchers will soon be able to explore this rich resource using Exalead’s practical search refinement tools, like dynamically generated lists of related terms and concepts.

July 12th, 2007

The Return of the Desert Samaritans

Do you remember our post on March 27
about Florian, one of our R&D Engineers, and his cast of fellow antique car
lovers? They were plotting to brave the Moroccan desert in their vintage rides
for a good cause, traveling 1500+ miles (2500 kilometers) in
15 days to bring school supplies to children in 14 Moroccan cities.

Well, they returned victorious!

Carole&Co

July 12th, 2007

Exalead Awarded Prize for Innovation by Microsoft at Beijing Event

ChinICT
is the premier venue for top information technology innovators from China and Europe.
Each year at this event, ChinICT recognizes the most innovative and fastest
growing IT enterprises from these two regions. This year, Exalead was named the
2007 Microsoft-ChinICT Award Winner.


Carole&Co

July 11th, 2007

The Road to Better Site Indexing – Episode 2

Next in our
series: “The Ballot Box Stuffers,” a.k.a. link farms.

What’s a link
farm? Let’s look at the randomly chosen site http://www.rc-car.ravemart.com. At
first glance, this remote control car ecommerce site seems to be a typical
small biz site plying its wares in the typical way.

Now take a closer
look by clicking on some of the text links at the bottom of the page. You’ll
see this site has decided to aid the search engines by providing links to
thousands of its friends’ sites. Click on the generically titled “Link” and
you’ll find links to the company’s “Partners” under categories such as Debt
Consolidation, Vitamins, Legal Services and Sweepstakes. Or click on
“Sponsored Links” for fast access to Cheap Viagra, Casino And Poker
News, Psychic Readings, or Online Dating Services.

Similarly, visit www.all-carpets.com where under “Resources” you’ll not only find links
for Home Flooring and Kitchen Design (not too far off base), but also Cash Back
Credit Cards, South African Zulu Culture, and Offshore Banking (uh-oh, we’re in
left field now…).

Even larger
companies and organizations may participate in these types of link programs,
where all members link to all other members, often regardless of the relevance
of the content, in the somewhat desperate hope of augmenting their popularity
and hence appearing higher in search results.
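
To make the "all members link to all other members" pattern concrete, here is a minimal Python sketch that measures how densely a group of sites link to one another. It is an illustration only, not a description of how any search engine actually detects link farms; the function name `link_density` and the `.example` site names are invented for this example.

```python
from itertools import permutations


def link_density(links, sites):
    """Fraction of ordered site pairs (src, dst) where src links to dst."""
    sites = list(sites)
    if len(sites) < 2:
        return 0.0
    pairs = list(permutations(sites, 2))
    linked = sum(1 for src, dst in pairs if dst in links.get(src, set()))
    return linked / len(pairs)


# Hypothetical link program: every member links to every other member.
farm = {
    "cars.example": {"carpets.example", "casino.example"},
    "carpets.example": {"cars.example", "casino.example"},
    "casino.example": {"cars.example", "carpets.example"},
}

# Hypothetical ordinary sites: only a couple of links between them.
ordinary = {
    "cars.example": {"carpets.example"},
    "carpets.example": set(),
    "casino.example": {"cars.example"},
}

farm_density = link_density(farm, farm.keys())
ordinary_density = link_density(ordinary, ordinary.keys())
print(farm_density, ordinary_density)
```

A density close to 1 across sites on unrelated topics is the kind of signal the paragraph above hints at; a real system would of course weigh many other factors.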

If you’re a web
searcher trying to find quality results, don’t worry, we remain vigilant ;-).
If you’re a site owner, avoid participating in link farms. You may find the
strategy backfires as your site is demoted or even dropped from some search
engine databases. Instead, seek out quality reciprocal links with sites with
which you share a genuine relationship, and you’ll build the kind of true
popularity both visitors and search engines appreciate.

Next episode: The
Integration of RSS Feeds.

Sebastien, Web
Team Head Chef

July 11th, 2007

The Road to Better Site Indexing – Introduction and Episode 1

The question that usually follows “How
can I make my site appear at the top of search engine results?” is “Why don’t
search engines index all my pages?”

First, you should know that pages
accessible only through JavaScript or through form submissions are not
reachable by search engines and therefore they cannot be indexed. And there is
no means for a search engine to know whether it’s missing some pages in a site,
whether the missing page count is 10 or 10,000 (outside of site maps, which I
will discuss in a future post).
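
As a rough illustration of why such pages go unseen, here is a minimal Python sketch of the link extraction step of a very simple crawler, using only the standard library. It is an assumption-laden toy, not how any particular search engine works: the class name `LinkExtractor`, the helper `extract_links`, and the sample page are all made up for this example.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href targets of <a> tags, as a very simple crawler might."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # Skip javascript: pseudo-links; they are not fetchable addresses.
            if name == "href" and value and not value.lower().startswith("javascript:"):
                self.links.append(value)


def extract_links(html):
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links


# Hypothetical page: one plain link, two JavaScript-only links, one form.
SAMPLE_PAGE = """
<a href="/products.html">Products</a>
<a href="javascript:openPage('hidden.html')">Hidden</a>
<span onclick="location.href='/also-hidden.html'">Also hidden</span>
<form action="/search" method="post"><input name="q"></form>
"""

found = extract_links(SAMPLE_PAGE)
print(found)
```

Only the plain link is discovered. The two pages reachable through JavaScript and whatever sits behind the form never enter the list of addresses to visit.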

Next, let’s refresh ourselves on the
fundamental methods search engines use to find the pages they index: 1) They
follow a submission made by a human being (0.0001% of cases), or 2) They follow
a link from another page. Therefore, if there is a link to a given page, the
probability that it will be indexed is high. Conversely, a personal site with
no external links to it has little chance of being indexed by a search engine.
So more links are always better, right?

Not necessarily. It should be understood that from the point of view of
a search engine, a risk arises not from a dearth of links, but from too many.
Why? Because search engines seek to provide the most relevant results for
visitors, returning pages with the content most likely to match visitors’ needs
and expectations. A site that arrived at the top of the results solely because
there were tens of thousands of links to it would not pass this test. In fact,
an overabundance of external links may indicate a “spamming” campaign aimed at
search engines and be an indicator of poor site quality.

Here are two cases of what we’ll call legitimate ‘overabundance,’ an overabundance of links due to valid, non-spamming factors that can be properly managed by search engines.


 

Case 1: User Sessions

When you visit
an e-commerce site, unique “session” information will often be assigned to your
computer. This information uniquely identifies your particular connection and
visit. It may include, for example, a unique ID for your computer and a code
for your browser version or geographic location.

This session information tracks your movements,
preferences and selections as you navigate a site. This is not for nefarious
ends, but is rather used to perform practical tasks like maintaining items in
your shopping cart, showing prices in your local currency or displaying a list
of products you’ve viewed. This session information is most often added to the end
of the URL (web address) for every page you visit.

For instance, say you are visiting Amazon.com and you
navigate to a Stanley wrench set. The URL displayed in your browser is

http://www.amazon.com/Stanley-92-716-Combination-Wrench-22-Piece/dp/B000JPUCT0/ref=sr_1_7/002-6118145-0432018?ie=UTF8&s=hi&qid=1181650669&sr=1-7

Only the first part of the URL,

http://www.amazon.com/Stanley-92-716-Combination-Wrench-22-Piece/dp/B000JPUCT0/,
is needed to locate the product information for this wrench set. The rest,
“ref=sr_1_7/002-6118145-0432018?ie=UTF8&s=hi&qid=1181650669&sr=1-7”

is session information for your
particular visit.

A search engine may come across thousands of links
like the longer address, each of which may appear different because unique session
information is appended, and because each may show different user-dependent
content such as navigation history, promotions, or recommended products. But
any search engine worth its salt can pick out the essential URL shared by these
repetitive addresses, and will know this is not a case of spamming.
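
Here is a minimal Python sketch of that idea, reducing session-laden addresses to one essential URL. It rests on two loud assumptions drawn only from the wrench set example: that a path segment starting with "ref=" marks where the session information begins, and that the query string carries nothing but per-visit data. The function name `essential_url` and the second address are invented for illustration; a real search engine would need far more robust rules.

```python
from urllib.parse import urlsplit, urlunsplit


def essential_url(url):
    """Return the URL without its per-visit tail (illustrative heuristic only)."""
    parts = urlsplit(url)
    kept = []
    truncated = False
    for segment in parts.path.split("/"):
        # Assumption: a path segment starting with "ref=" marks where the
        # session information begins, as in the wrench set example.
        if segment.startswith("ref="):
            truncated = True
            break
        kept.append(segment)
    path = "/".join(kept)
    if truncated and not path.endswith("/"):
        path += "/"
    # Assumption: the query string and fragment hold only per-visit data.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


visits = [
    # The address quoted in this post.
    "http://www.amazon.com/Stanley-92-716-Combination-Wrench-22-Piece/dp/B000JPUCT0/"
    "ref=sr_1_7/002-6118145-0432018?ie=UTF8&s=hi&qid=1181650669&sr=1-7",
    # A made-up second visit with different session values.
    "http://www.amazon.com/Stanley-92-716-Combination-Wrench-22-Piece/dp/B000JPUCT0/"
    "ref=sr_1_2/000-0000000-0000000?ie=UTF8&s=hi&qid=1&sr=1-2",
]

unique_pages = {essential_url(u) for u in visits}
print(unique_pages)
```

Both addresses collapse to the same product page, so they count as one page rather than two.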


Case 2: Calendar Menus

Some sites let you navigate through their
content by clicking on a calendar. For example, you may be able to peruse news
articles or events on a site by choosing a date or date range.

Such menus generate links like:
http://www.ecvd.eu/index.php?option=com_events&task=view_month&Itemid=32&year=2011&month=09&day=12

A competent search engine will know which
of these types of links returns valid content and which does not, and what
baseline URL should be included in a search index. In other words, having a
zillion external links for events on dates from 1950 to 2060 for a site with ten
events will definitely not boost that site’s ranking ;-).
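
To sketch how a flood of calendar links can collapse to one baseline URL, here is a small Python example. It assumes the date is carried by query parameters named year, month and day, as in the sample link above; the function name `baseline_url` and the generated date range are illustrative only, and deciding which dates actually return content is a separate problem this sketch does not address.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these parameter names carry the calendar date, as in the sample link.
DATE_PARAMS = {"year", "month", "day"}


def baseline_url(url):
    """Drop the date parameters and keep everything else in its original order."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in DATE_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


BASE = "http://www.ecvd.eu/index.php?option=com_events&task=view_month&Itemid=32"

# One link per month from 1950 through 2060, like an endless calendar menu.
calendar_links = [
    f"{BASE}&year={year}&month={month:02d}&day=01"
    for year in range(1950, 2061)
    for month in range(1, 13)
]

baselines = {baseline_url(link) for link in calendar_links}
print(len(calendar_links), "links ->", len(baselines), "baseline URL")
```

More than a thousand distinct-looking links reduce to a single baseline address.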

Now you may say these two cases look like
easy ones for a search engine to manage, and you’d be right. The real
difficulties arise from the following three cases, because (scoop!) there are
unscrupulous people out there ready to do anything to improve their search
engine ranking. You’ve most likely encountered their handiwork when using a
search engine other than Exalead.

You run your search and click on a page you
think is relevant, only to encounter an endless list of meaningless links or
keywords, a pastiche of content “borrowed” from other more relevant sites, or
an endless loop of promising links that ultimately go nowhere.

These types of pages are generated by the
folks at the top of our list of ballot-box stuffers, those trying to improve
their search engine rank through:

* Link farms and keyword stuffing,

* Content scraping, including the abuse of
RSS Feeds, and

* Creating content labyrinths.

We’ll be covering these tactics in upcoming
episodes. In the meantime, you can see why search engines may need to limit the
number of pages they index for a site. This ‘quota’ is determined based on the
site’s reputation, the duplication of its content, and a thousand other
parameters, all factored in an attempt to keep the game honest so web searchers
get the most relevant search results possible.

 

Sebastien, Head Chef, Web Team

July 10th, 2007

Add Exalead to your Netvibes Home Page

Exalead is now available in Netvibes’ standard search engine list.
To add Exalead to your Netvibes “Web Search” widget, choose “Manage Search
Engines…” from the pull-down menu of the active search engine, then click on
Exalead.


Now the whole Netvibes community will have a chance to discover
The Other Search Engine. Welcome Netvibers, and Happy Searching!

July 10th, 2007

Exalead Powering Search on the Le Monde Informatique Website

Le Monde Informatique, a major weekly information technology publication from the IDG
group, has deployed Exalead as the search engine for its portal LeMondeInformatique.fr.

This new
engine provides unified access to all site sections (Technology, IT Economy,
Small Business Solutions, Jobs, Videos, etc.) and offers users practical tools
for refining their search results, such as filtering results by the type of
document (News, Special Reports, Whitepapers, etc.), publication date, author
or related keywords.


Carole&Co

July 10th, 2007

Doona, a Click-‘N-Give Search Engine for Humanitarians


Playing off the words “don” and “donner” – respectively “donation” and “to give” in French – the young students who created the nonprofit Doona ( www.doona.fr/ ) are doing their part to prove that the small acts of many can add up to meaningful ‘change’ in both senses of the word.


The Doona search engine sets aside a small amount of money each time a visitor clicks on one of the site’s sponsored links. Doona then distributes this money monthly to a nonprofit selected by the other nonprofits in the Doona community.


As Doona selected Exalead to power its web search, you know in advance the results will be right on target, so get clicking now in support of this growing humanitarian community (and yes! you can search in English :-).

 

 

Carole&Co

July 10th, 2007

IDOO Expresses Itself with Exalead

IDOO is a global provider of free web services that let IDOO users show off their talents via website, blog, and video publishing. IDOO also offers free email and a social network of IDOO buddies. IDOO rewards talent by sharing ad revenue with its users: the more popular your contributions, the more money you earn!


IDOO has integrated the Exalead search engine into its international web portal, www.idoo.com, enabling visitors to search the web in English, French, Spanish and Portuguese.

 

 



Carole&Co