<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Exalead Blog &#187; Programming</title>
	<atom:link href="http://blog.exalead.com/category/programming/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.exalead.com</link>
	<description>The blog of Exalead</description>
	<lastBuildDate>Fri, 20 Nov 2009 16:29:57 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Transforming a demo into a full-scale production-ready application</title>
		<link>http://blog.exalead.com/2009/11/10/transforming-a-demo-into-a-full-scale-production-ready-application/</link>
		<comments>http://blog.exalead.com/2009/11/10/transforming-a-demo-into-a-full-scale-production-ready-application/#comments</comments>
		<pubDate>Tue, 10 Nov 2009 09:18:17 +0000</pubDate>
		<dc:creator>Sébastien</dc:creator>
				<category><![CDATA[Exalabs]]></category>
		<category><![CDATA[New products & features]]></category>
		<category><![CDATA[Products]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[chromatik]]></category>
		<category><![CDATA[color search]]></category>
		<category><![CDATA[image search]]></category>
		<category><![CDATA[labs]]></category>
		<category><![CDATA[production]]></category>

		<guid isPermaLink="false">http://blog.exalead.com/?p=1053</guid>
		<description><![CDATA[Jean Marc brought you  a very delightful post about Chromatik last week with a lot of beautiful images. I will now describe in more detail how it was built. As with the DVD you perhaps watched last night, I am afraid there will be fewer big special effects in this blog than in Jean [...]]]></description>
			<content:encoded><![CDATA[<p>Jean Marc brought you <a href="../2009/10/27/chromatik-adds-color-to-exalead%E2%80%99s-image-search/" target="_blank">a very delightful post about Chromatik</a> last week with a lot of beautiful images. I will now describe in more detail how it was built. As with the DVD you perhaps watched last night, I am afraid there will be fewer big special effects in this post than in Jean Marc’s, but I hope to give you an insightful view of what happened behind the curtain.</p>
<p><a href="http://chromatik.labs.exalead.com/" target="_blank">Chromatik</a> was an elaborate demo, the result of a long effort on both the back-end and the front-end. It indexes one million images, and a color signature was built and indexed for each of them. Our intuitive user interface exploits this index to help you filter and select images by choosing a combination of colors, luminosity or text.</p>
<p>So many people tried and liked the Chromatik demo that we received several requests to integrate it into the official Exalead search site. And because the demo ran smoothly and relatively bug-free, our friends thought it would be a piece of cake. Of course, it was a bit more work than it looked. So where are the challenges?</p>
<p><strong>1)  The front-end side</strong></p>
<p>A lot of questions need to be answered:</p>
<ul>
<li>How will I adapt the GUI of my application to integrate the new features?</li>
<li>Are all these new features necessary?</li>
<li>What feedback have we received on the different features?</li>
<li>What is the added value of these features?</li>
</ul>
<p>The answers to these questions determine how much space on the GUI we will devote to surfacing these features.</p>
<p><strong>2) The back-end side</strong></p>
<p>Let’s begin with a little theory:</p>
<p><strong>Theorem of the factor 10 effect:</strong><br />
<em>No matter how good a developer you are, if non-trivial code has been designed and tested with only N elements, it won’t work without modifications when applied to 10 * N elements.<br />
</em><br />
<strong>Proof: </strong>Rather simple: if you don’t believe it, try it yourself…</p>
<p>In this case we wanted a factor of 1000, so we knew some adjustments would be needed. The advantage of knowing this theorem is that you can anticipate, and the experience we have gathered at Exalead with these situations lets us identify most of the bottlenecks in advance.</p>
<p><strong>Example 1:</strong> Chromatik needed 300MB of RAM, which is quite good for 1M images. But multiply this number by 2000 and you get 600GB of RAM, which is quite large even if the final index is distributed over multiple machines.<br />
We therefore decided to reduce the richness of the colors while maintaining usability, to migrate from version 4.6 to version 5.0 of Exalead CloudView, and to use a more compressed encoding. In the end, the index now costs only 9GB.</p>
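<p>The arithmetic behind these figures can be checked with a short Python sketch. It assumes decimal units (1MB = 10^6 bytes) and memory that grows linearly with the number of images; both are simplifications made for illustration, not statements about how CloudView actually allocates memory.</p>
<pre>
```python
# Back-of-the-envelope check of the memory figures quoted in Example 1.
# Assumptions (not from the post): decimal units and linear growth.
MB = 10**6
GB = 10**9

demo_images = 1_000_000          # images indexed by the demo
demo_ram_bytes = 300 * MB        # RAM used by the demo
scale_factor = 2000              # multiplier quoted in the post

target_images = demo_images * scale_factor

# Memory per image in the demo, then the naive projection at full scale.
bytes_per_image_demo = demo_ram_bytes / demo_images
naive_ram_gb = bytes_per_image_demo * target_images / GB

# Memory per image once the index fits in 9GB.
final_ram_bytes = 9 * GB
bytes_per_image_final = final_ram_bytes / target_images

print(f"demo: {bytes_per_image_demo:.1f} bytes per image")
print(f"naive projection: {naive_ram_gb:.0f} GB")
print(f"final: {bytes_per_image_final:.1f} bytes per image")
```
</pre>
<p>Under these assumptions the per-image budget drops from 300 bytes to 4.5 bytes, which gives an idea of how much the reduced color richness and the more compact encoding had to save.</p>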
<p><strong>Example 2:</strong> When you want to analyze two billion images, you need robust code, meaning code that can handle all sorts of images, even those that do not comply with their format specification (RFC). That is not so easy when even the most widely used library in the world for basic image manipulation can crash on some images, <a href="http://bugs.libgd.org/?do=details&amp;task_id=86" target="_blank">as we reported</a>.<br />
This run also exposed some bugs in our own code that we had not seen before and therefore had to fix.</p>
<p><strong>Example 3:</strong> The demo was initially a single-machine application. We needed the distributed system framework included in the CloudView technology to run the whole process of extracting, crawling and indexing in only a few weeks. This framework really helped us transform the single-machine demo into a fully load-balanced and monitored application. This use case is a little different from our standard www.exalead.com chain, so we discovered and tweaked a few cumbersome points in the code.</p>
<p>The purpose of this integration was to offer a new service to the users of the exalead.com search engine and improve the robustness of the Chromatik technology. We now better understand the impact of different tweaks on color indexing.</p>
<p>Transforming a demo into a real product is not as easy as it seems. I hope this post helps you understand why a lot of companies only show you demos but never real live applications.</p>
<p>At Exalead, we don’t sell demos to our customers; we sell tested and robust solutions. We make sure we work hard to test and uncover all the issues so our customers’ implementations go smoothly.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.exalead.com/2009/11/10/transforming-a-demo-into-a-full-scale-production-ready-application/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Map the Web with Gephi</title>
		<link>http://blog.exalead.com/2008/11/03/map-the-web-with-gephi/</link>
		<comments>http://blog.exalead.com/2008/11/03/map-the-web-with-gephi/#comments</comments>
		<pubDate>Mon, 03 Nov 2008 17:15:27 +0000</pubDate>
		<dc:creator>Carole</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Tips and tricks]]></category>

		<guid isPermaLink="false">http://blog.exalead.com/2008/11/03/map-the-web-with-gephi/</guid>
		<description><![CDATA[Innovation is a leading priority for Exalead. That is why the company often gives its support to external initiatives like this project set up by students from U.T.C. that developed Gephi, in collaboration with WebAtlas association. Gephi is an open source software under GPL3 license that enables 3D networks graphics manipulation, exploration and visualization.

What is [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Innovation</strong> is a leading priority for <strong>Exalead</strong>. That is why the company often supports external initiatives such as this project by students from <a href="http://www.utc.fr/the_university/index.php" target="_blank">U.T.C.</a>, who developed <a href="http://gephi.org/" target="_blank"><strong>Gephi</strong></a> in collaboration with the <strong><a href="http://webatlas.fr/" target="_blank">WebAtlas</a></strong> association. <strong>Gephi</strong> is <strong>open source software</strong> under the GPL3 license that enables <strong>manipulation, exploration and visualization of 3D network graphs.</strong></p>
<p style="text-align: center"><a href="http://web-mining.fr/files/droit_auteur/carto_droit_auteur_generale.pdf" target="_blank"><img src="http://web-mining.fr/files/droit_auteur/Map1.png" alt="Carte DPI" /></a></p>
<p><em>What is this graphic about?<br />
</em>It represents a <strong>semantic analysis</strong> of the relationships between terms used on the Web to talk about <strong>Intellectual Property Rights</strong> in French. Each <strong>node</strong> symbolizes a word or a group of words, and each <strong>edge</strong> connects two expressions when they are <u>co-cited in more than 120,000 web pages</u>. Each color refers to a <strong>&#8220;semantic cluster&#8221;</strong>, which is a group of words that concern the same topic.</p>
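<p>The edge rule described above can be sketched in a few lines of Python. The terms and page counts below are invented for illustration and do not come from the Exalead index; only the 120,000-page threshold is taken from the text.</p>
<pre>
```python
# Hypothetical co-citation counts: number of web pages where both terms appear.
# These terms and figures are made up for the example.
co_citations = {
    ("copyright", "piracy"): 450_000,
    ("copyright", "patent"): 130_000,
    ("patent", "piracy"): 40_000,
    ("patent", "innovation"): 200_000,
}

THRESHOLD = 120_000  # threshold quoted in the post


def build_edges(counts, threshold):
    """Keep a pair of terms as an edge only if it is co-cited in more than `threshold` pages."""
    return sorted(pair for pair, pages in counts.items() if pages > threshold)


def build_nodes(edges):
    """Every term that takes part in at least one edge becomes a node."""
    return sorted({term for edge in edges for term in edge})


edges = build_edges(co_citations, THRESHOLD)
nodes = build_nodes(edges)
print(nodes)
print(edges)
```
</pre>
<p>In this toy data set the pair ("patent", "piracy") falls below the threshold, so no edge is drawn between those two terms even though both remain in the graph through their other connections.</p>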
<p><em>How can I get this type of graphic?</em><br />
After an <strong>extraction of related terms found in Exalead databases</strong> and a manual filtering phase, the project team receives a <strong><a href="http://gephi.org/wp-content/uploads/2008/10/ipr-semantic-graphe.gdf" target="_blank">GDF file with ordered data</a></strong>. <strong>Gephi</strong> then processes this file with a specific algorithm to produce the <strong>data “spatialization”</strong>. Finally, color filters highlight the different semantic clusters.</p>
<p>Here is one of the first demonstrations of <strong>Gephi</strong> with <strong>real-time spatialization of several keyword clusters.</strong> In this video, blue nodes belong to a “genetics” cluster, orange nodes relate to terms about biology and laboratories, green ones cover words about the controversy over GMOs, and purple nodes relate to innovation and research development in biotechnology.</p>
<p><center><a href="http://blog.exalead.fr/wp-content/uploads/2008/10/processus_expansion_raffinement1.JPG" alt="processus_expansion_raffinement" width="382" height="270"><br />
<object width="400" height="251"><param name="allowfullscreen" value="true"></param><param name="allowscriptaccess" value="always"></param><param name="movie" value="http://vimeo.com/moogaloop.swf?clip_id=2035117&amp;server=vimeo.com&amp;show_title=1&amp;show_byline=1&amp;show_portrait=0&amp;color=&amp;fullscreen=1"></param>	<embed src="http://vimeo.com/moogaloop.swf?clip_id=2035117&amp;server=vimeo.com&amp;show_title=1&amp;show_byline=1&amp;show_portrait=0&amp;color=&amp;fullscreen=1" type="application/x-shockwave-flash" allowfullscreen="true" allowscriptaccess="always" width="400" height="251"></embed></object><br />
</a><a href="http://vimeo.com/2035117?pg=embed&amp;sec=2035117">Gephi &#8211; Dynamic demo</a> from <a href="http://vimeo.com/user861314?pg=embed&amp;sec=2035117">gephi</a> on <a href="http://vimeo.com?pg=embed&amp;sec=2035117">Vimeo</a></center><center> </center><strong>Congratulations</strong> to the project team for this <strong>great web mapping tool!</strong><br />
Feel free to visit the <a href="http://gephi.org/" target="_blank">Gephi website</a> for more information and to <a href="http://gephi.org/support/demo/" target="_blank">test this software</a>.<br />
If you are interested in this subject, you should know that <strong>the team is still recruiting</strong>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.exalead.com/2008/11/03/map-the-web-with-gephi/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Guide for Webmasters: Part 1, Making the Most of Your Content</title>
		<link>http://blog.exalead.com/2008/03/17/guide-for-webmasters-part-1-making-the-most-of-your-content/</link>
		<comments>http://blog.exalead.com/2008/03/17/guide-for-webmasters-part-1-making-the-most-of-your-content/#comments</comments>
		<pubDate>Mon, 17 Mar 2008 10:39:44 +0000</pubDate>
		<dc:creator>Sébastien</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Tips and tricks]]></category>

		<guid isPermaLink="false">http://blog.exalead.com/2008/03/17/guide-for-webmasters-part-1-making-the-most-of-your-content/</guid>
		<description><![CDATA[Interested in improving the visibility of your site on our engine? Hopefully this series of posts will help.
First up: answers to the two most frequently posed webmaster questions:
1) Why doesn’t my site appear (or why does it only partially appear) when I do a site search (i.e., typing “site: mysitename.com” in the search box)?
  [...]]]></description>
			<content:encoded><![CDATA[<p class="MsoNormal"><span lang="EN-GB">Interested in improving the visibility of your site on our engine? Hopefully this series of posts will help.</span></p>
<p class="MsoNormal"><span lang="EN-GB">First up: answers to the two most frequently asked webmaster questions:</span></p>
<p class="MsoNormal" style="margin-left: 36pt; text-indent: -18pt"><strong><span lang="EN-GB"><span>1) </span></span></strong><span lang="EN-GB"><strong>Why doesn’t my site appear (or why does it only partially appear) when I do a site search (i.e., typing “site:mysitename.com” in the search box)?</strong></span></p>
<p class="MsoNormal" style="margin-left: 18pt"><span lang="EN-GB">All or part of your site may be inaccessible to our robots. Try the following to improve the indexing of your site:</span></p>
<ul>
<li><span lang="EN-GB">Make sure that all pages are accessible by at least one static link.</span></li>
<li><span lang="EN-GB">Place links to your most important content on every page of your site.</span></li>
<li><span lang="EN-GB">Keeping in mind that certain dynamic pages can’t be accessed by our robots, move content as needed to static (or simply more accessible) pages (see “<a href="http://blog.exalead.com/2007/07/11/the-road-to-better-site-indexing-%e2%80%93-introduction-and-episode-1/">The Road to Better Site Indexing – Introduction and Episode 1</a>”).</span></li>
<li><span lang="EN-GB">Be sure the robots.txt file in your root directory is not blocking access to our crawler (use our <a href="http://www.exalead.com/search?action=displayRobotCheckerForm">robot checker form</a> to test accessibility).</span></li>
<li><span lang="EN-GB">Create a site map (see “<a href="http://blog.exalead.com/2007/08/28/episode-4-sitemaps-based-on-a-true-story/">The Road to Better Site Indexing: Episode 3, Sitemaps</a>”) and <a href="http://www.exalead.com/search/submitYourSitePage">submit it on our site</a>.</span></li>
</ul>
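<p class="MsoNormal"><span lang="EN-GB">If you prefer to test a robots.txt file locally before using the robot checker form, the Python standard library can evaluate its rules. The sketch below is only an illustration: the robots.txt content is hypothetical, and the user-agent name “Exabot” is an assumption made for the example, so check the name our crawler actually announces before relying on it.</span></p>
<pre>
```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. "Exabot" is assumed to be the crawler's
# user-agent name for the purpose of this example.
ROBOTS_TXT = """\
User-agent: Exabot
Disallow: /private/

User-agent: *
Disallow: /tmp/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt rules let `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


home_ok = is_allowed(ROBOTS_TXT, "Exabot", "http://www.mysitename.com/index.html")
private_ok = is_allowed(ROBOTS_TXT, "Exabot", "http://www.mysitename.com/private/page.html")
print(home_ok, private_ok)
```
</pre>
<p class="MsoNormal"><span lang="EN-GB">A page that comes back as blocked here would be invisible to a crawler that honours the same rules, which is one common reason for a site appearing only partially.</span></p>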
<p class="MsoNormal" style="margin-left: 36pt; text-indent: -18pt"><strong><span lang="EN-GB"><span>2) </span></span></strong><span lang="EN-GB"><strong>Why doesn’t my site appear for a given keyword?</strong></span></p>
<ul>
<li><span lang="EN-GB">First, check that the keyword is in our index for your site. Enter the keyword in the search field, along with “site:mysitename.com” to limit the search for that keyword to just your site (replacing “mysitename.com” with your domain name, of course). If it is not indexed, follow the steps for question 1 above.</span></li>
<li><span lang="EN-GB">Refine the keywords in your site so they are as specific as possible. The keyword you are checking may be too general, so that sites that are larger, more relevant and/or more popular rank ahead of yours for that keyword.</span></li>
<li><span lang="EN-GB">Verify that the content of your site corresponds well to the keyword. It’s not enough for a keyword to simply appear; it must be integrally related to the rest of the site content.</span></li>
</ul>
<p class="MsoNormal"><span lang="EN-GB">You&#8217;ll find further info on keyword relevancy in “<a href="http://blog.exalead.com/2007/06/04/search-engine-optimization-seo-more-old-school-than-you-think/">Search Engine Optimization (SEO): More Old-School Than You Think</a>.”</span></p>
<p class="MsoNormal"><span lang="EN-GB">And be careful out there! Stick to keeping your content fresh and relevant for your target audience. Resorting to tricks like hidden text, duplicate content, spam link exchanges or other such tactics to improve your ranking could get you banned from our index (for more info, see “<a href="http://blog.exalead.com/2007/07/11/the-road-to-better-site-indexing-%e2%80%93-episode-2/">The Road to Better Site Indexing – Episode 2</a>”).</span></p>
<p><span lang="EN-GB">You’ll also find <a href="http://www.exalead.com/about/document/53#22">general webmaster tips</a> in our site’s help pages.</span></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.exalead.com/2008/03/17/guide-for-webmasters-part-1-making-the-most-of-your-content/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Video Search Update, Part 2: New Sites Indexed</title>
		<link>http://blog.exalead.com/2008/01/15/video-search-update-part-2-new-sites-indexed/</link>
		<comments>http://blog.exalead.com/2008/01/15/video-search-update-part-2-new-sites-indexed/#comments</comments>
		<pubDate>Tue, 15 Jan 2008 10:17:36 +0000</pubDate>
		<dc:creator>Carole</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://blog.exalead.com/2008/01/15/video-search-update-part-2-new-sites-indexed/</guid>
		<description><![CDATA[In our Video Search Update, Part 1, we told you how we broadened our index to include your direct submissions. We have now enlarged the index once again, adding these popular sites:

zdnet.fr
thatvideosite.com
comedycentral.com
videonetart.com
AskANinja.com
wat.tv
wideo.fr
blip.tv
veoh.com
onowa.com
nasa.gov
video.on.nytimes.com
latelelibre.fr
sports.espn.go.com
feeds.reuters.com
stage6.divx.com
stupidvideos.com
livevideo.com
video.lequipe.fr
archive.org
channels.ourmedia.org
revver.com

Now that the RSS mode has been activated, all that’s needed to add a new source is to locate its corresponding RSS feed. So [...]]]></description>
			<content:encoded><![CDATA[<p><span lang="EN-GB">In our Video Search Update, <a href="http://blog.exalead.com/2008/01/11/video-search-update-part-1-submit-your-video/" title="Video Search Update 1">Part 1</a>, we told you how we broadened our index to include your direct submissions. We have now enlarged the index once again, adding these popular sites:</span></p>
<p align="center">
<table class="MsoNormalTable" style="border-collapse: collapse" align="center" border="0" cellpadding="0" cellspacing="0">
<tr>
<td style="border: 1pt solid windowtext; padding: 0cm 5.4pt; width: 98.6pt" valign="top" width="131">
<p class="MsoNormal" style="text-align: center" align="center"><span style="font-size: 10pt; font-family: Arial">zdnet.fr</span></p>
</td>
<td style="border-style: solid solid solid none; border-color: -moz-use-text-color; border-width: 1pt 1pt 1pt medium; padding: 0cm 5.4pt; width: 117pt" valign="top" width="156">
<p class="MsoNormal" style="text-align: center" align="center"><span style="font-size: 10pt; font-family: Arial">thatvideosite.com</span></p>
</td>
</tr>
<tr>
<td style="border-style: none solid solid; border-color: -moz-use-text-color windowtext windowtext; border-width: medium 1pt 1pt; padding: 0cm 5.4pt; width: 98.6pt" valign="top" width="131">
<p class="MsoNormal" style="text-align: center" align="center"><span style="font-size: 10pt; font-family: Arial">comedycentral.com</span></p>
</td>
<td style="border-style: none solid solid none; border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-width: medium 1pt 1pt medium; padding: 0cm 5.4pt; width: 117pt" valign="top" width="156">
<p class="MsoNormal" style="text-align: center" align="center"><span style="font-size: 10pt; font-family: Arial">videonetart.com</span></p>
</td>
</tr>
<tr>
<td style="border-style: none solid solid; border-color: -moz-use-text-color windowtext windowtext; border-width: medium 1pt 1pt; padding: 0cm 5.4pt; width: 98.6pt" valign="top" width="131">
<p class="MsoNormal" style="text-align: center" align="center"><span style="font-size: 10pt; font-family: Arial">AskANinja.com</span></p>
</td>
<td style="border-style: none solid solid none; border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-width: medium 1pt 1pt medium; padding: 0cm 5.4pt; width: 117pt" valign="top" width="156">
<p class="MsoNormal" style="text-align: center" align="center"><span style="font-size: 10pt; font-family: Arial">wat.tv</span></p>
</td>
</tr>
<tr>
<td style="border-style: none solid solid; border-color: -moz-use-text-color windowtext windowtext; border-width: medium 1pt 1pt; padding: 0cm 5.4pt; width: 98.6pt" valign="top" width="131">
<p class="MsoNormal" style="text-align: center" align="center"><span style="font-size: 10pt; font-family: Arial">wideo.fr</span></p>
</td>
<td style="border-style: none solid solid none; border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-width: medium 1pt 1pt medium; padding: 0cm 5.4pt; width: 117pt" valign="top" width="156">
<p class="MsoNormal" style="text-align: center" align="center"><span style="font-size: 10pt; font-family: Arial">blip.tv</span></p>
</td>
</tr>
<tr>
<td style="border-style: none solid solid; border-color: -moz-use-text-color windowtext windowtext; border-width: medium 1pt 1pt; padding: 0cm 5.4pt; width: 98.6pt" valign="top" width="131">
<p class="MsoNormal" style="text-align: center" align="center"><span style="font-size: 10pt; font-family: Arial">veoh.com</span></p>
</td>
<td style="border-style: none solid solid none; border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-width: medium 1pt 1pt medium; padding: 0cm 5.4pt; width: 117pt" valign="top" width="156">
<p class="MsoNormal" style="text-align: center" align="center"><span style="font-size: 10pt; font-family: Arial">onowa.com</span></p>
</td>
</tr>
<tr>
<td style="border-style: none solid solid; border-color: -moz-use-text-color windowtext windowtext; border-width: medium 1pt 1pt; padding: 0cm 5.4pt; width: 98.6pt" valign="top" width="131">
<p class="MsoNormal" style="text-align: center" align="center"><span style="font-size: 10pt; font-family: Arial">nasa.gov</span></p>
</td>
<td style="border-style: none solid solid none; border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-width: medium 1pt 1pt medium; padding: 0cm 5.4pt; width: 117pt" valign="top" width="156">
<p class="MsoNormal" style="text-align: center" align="center"><span style="font-size: 10pt; font-family: Arial">video.on.nytimes.com</span></p>
</td>
</tr>
<tr>
<td style="border-style: none solid solid; border-color: -moz-use-text-color windowtext windowtext; border-width: medium 1pt 1pt; padding: 0cm 5.4pt; width: 98.6pt" valign="top" width="131">
<p class="MsoNormal" style="text-align: center" align="center"><span style="font-size: 10pt; font-family: Arial">latelelibre.fr</span></p>
</td>
<td style="border-style: none solid solid none; border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-width: medium 1pt 1pt medium; padding: 0cm 5.4pt; width: 117pt" valign="top" width="156">
<p class="MsoNormal" style="text-align: center" align="center"><span style="font-size: 10pt; font-family: Arial">sports.espn.go.com</span></p>
</td>
</tr>
<tr>
<td style="border-style: none solid solid; border-color: -moz-use-text-color windowtext windowtext; border-width: medium 1pt 1pt; padding: 0cm 5.4pt; width: 98.6pt" valign="top" width="131">
<p class="MsoNormal" style="text-align: center" align="center"><span style="font-size: 10pt; font-family: Arial">feeds.reuters.com</span></p>
</td>
<td style="border-style: none solid solid none; border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-width: medium 1pt 1pt medium; padding: 0cm 5.4pt; width: 117pt" valign="top" width="156">
<p class="MsoNormal" style="text-align: center" align="center"><span style="font-size: 10pt; font-family: Arial">stage6.divx.com</span></p>
</td>
</tr>
<tr>
<td style="border-style: none solid solid; border-color: -moz-use-text-color windowtext windowtext; border-width: medium 1pt 1pt; padding: 0cm 5.4pt; width: 98.6pt" valign="top" width="131">
<p class="MsoNormal" style="text-align: center" align="center"><span style="font-size: 10pt; font-family: Arial">stupidvideos.com</span></p>
</td>
<td style="border-style: none solid solid none; border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-width: medium 1pt 1pt medium; padding: 0cm 5.4pt; width: 117pt" valign="top" width="156">
<p class="MsoNormal" style="text-align: center" align="center"><span style="font-size: 10pt; font-family: Arial">livevideo.com</span></p>
</td>
</tr>
<tr>
<td style="border-style: none solid solid; border-color: -moz-use-text-color windowtext windowtext; border-width: medium 1pt 1pt; padding: 0cm 5.4pt; width: 98.6pt" valign="top" width="131">
<p class="MsoNormal" style="text-align: center" align="center"><span style="font-size: 10pt; font-family: Arial">video.lequipe.fr</span></p>
</td>
<td style="border-style: none solid solid none; border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-width: medium 1pt 1pt medium; padding: 0cm 5.4pt; width: 117pt" valign="top" width="156">
<p class="MsoNormal" style="text-align: center" align="center"><span style="font-size: 10pt; font-family: Arial">archive.org</span></p>
</td>
</tr>
<tr>
<td style="border-style: none solid solid; border-color: -moz-use-text-color windowtext windowtext; border-width: medium 1pt 1pt; padding: 0cm 5.4pt; width: 98.6pt" valign="top" width="131">
<p class="MsoNormal" style="text-align: center" align="center"><span style="font-size: 10pt; font-family: Arial">channels.ourmedia.org</span></p>
</td>
<td style="border-style: none solid solid none; border-color: -moz-use-text-color windowtext windowtext -moz-use-text-color; border-width: medium 1pt 1pt medium; padding: 0cm 5.4pt; width: 117pt" valign="top" width="156">
<p class="MsoNormal" style="text-align: center" align="center"><span style="font-size: 10pt; font-family: Arial">revver.com</span></p>
</td>
</tr>
</table>
<p class="MsoNormal"><span lang="EN-GB"><br />
Now that the RSS mode has been activated, all that’s needed to add a new source is to locate its corresponding RSS feed. So if you find a good video source, <a href="mailto:exablog@exalead.com">send us the feed URL</a>.</span></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.exalead.com/2008/01/15/video-search-update-part-2-new-sites-indexed/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Video Search Update, Part 1: Submit Your Video!</title>
		<link>http://blog.exalead.com/2008/01/11/video-search-update-part-1-submit-your-video/</link>
		<comments>http://blog.exalead.com/2008/01/11/video-search-update-part-1-submit-your-video/#comments</comments>
		<pubDate>Fri, 11 Jan 2008 14:06:10 +0000</pubDate>
		<dc:creator>Carole</dc:creator>
				<category><![CDATA[New products & features]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://blog.exalead.com/2008/01/11/video-search-update-part-1-submit-your-video/</guid>
		<description><![CDATA[After having indexed Dailymotion, Youtube and Metacafe, we decided to enlarge our index by enabling you to submit your videos to our crawler directly.
We currently support the Media RSS format, adopted by the majority of video content distributors: http://en.wikipedia.org/wiki/Media_RSS
All you need to do is send us your feed URL so our crawler can fetch it.
Once [...]]]></description>
			<content:encoded><![CDATA[<p>Having indexed Dailymotion, YouTube and Metacafe, we decided to enlarge our index by enabling you to submit your videos to our crawler directly.</p>
<p>We currently support the Media RSS format, adopted by the majority of video content distributors: <a href="http://en.wikipedia.org/wiki/Media_RSS">http://en.wikipedia.org/wiki/Media_RSS</a></p>
<p>All you need to do is <a href="mailto:exablog@exalead.com">send us your feed URL</a> so our crawler can fetch it.</p>
<p>Once your feed is submitted, our crawler will check back regularly to verify that the video is still available.</p>
<p>An example of a video feed:</p>
<p>&lt;?xml version=&quot;1.0&quot; encoding=&quot;utf-8&quot;?&gt;<br />
&lt;rss version=&quot;2.0&quot; xmlns:media=&quot;http://search.yahoo.com/mrss/&quot;&gt;<br />
&lt;channel&gt;<br />
&lt;title&gt;My site&lt;/title&gt;<br />
&lt;link&gt;http://www.mysite.com/rss/mrss.xml&lt;/link&gt;<br />
&lt;description&gt;Videos published on my site&lt;/description&gt;<br />
&lt;item&gt;<br />
&lt;author&gt;jane56&lt;/author&gt;<br />
&lt;title&gt;Interview with Tom&lt;/title&gt;<br />
&lt;link&gt;http://www.mysite.com/video/1&lt;/link&gt;<br />
&lt;description&gt;Tom responds to my questions about the new product.&lt;/description&gt;<br />
&lt;guid isPermaLink=&quot;true&quot;&gt;http://www.mysite.com/video/1&lt;/guid&gt;<br />
&lt;pubDate&gt;Sun, 25 Nov 2007 08:42:00 +0000&lt;/pubDate&gt;<br />
&lt;media:content url=&quot;http://www.mysite.com/player/1/interview_de_tom.swf&quot;<br />
type=&quot;application/x-shockwave-flash&quot; duration=&quot;325&quot; /&gt;<br />
&lt;media:thumbnail url=&quot;http://www.mysite.com/vimages/1.jpg&quot; width=&quot;340&quot;<br />
height=&quot;250&quot; /&gt;<br />
&lt;media:keywords&gt;Tom, interview, new&lt;/media:keywords&gt;<br />
&lt;media:rating scheme=&quot;urn:simple&quot;&gt;nonadult&lt;/media:rating&gt;<br />
&lt;media:category&gt;Entertainment&lt;/media:category&gt;<br />
&lt;/item&gt;<br />
&lt;/channel&gt;<br />
&lt;/rss&gt;</p>
<p>&lt;guid&gt;:<br />
- The guid tag contains the URL of a page where the video can be found. When users run a search and click on a result, they will be directed to this URL.</p>
<p>&lt;media:thumbnail&gt;:<br />
- The thumbnail tag contains a link to a descriptive image for the video.</p>
<p>&lt;pubDate&gt;:<br />
- This tag is for the publication date of the video.</p>
<p>&lt;media:content&gt;:<br />
- This tag contains a direct link to the video. The ‘type’ attribute is a standard video MIME type.</p>
<p>&lt;media:keywords&gt;:<br />
- A list of keywords associated with the video.</p>
<p>&lt;media:category&gt;:<br />
- One or more categories associated with the video.</p>
<p>&lt;media:rating scheme=&quot;urn:simple&quot;&gt;:<br />
- Indicates if the content is ‘adult’ or ‘nonadult’ (suitable for minors) in nature.</p>
<p>This list is not exhaustive. See <a href="http://search.yahoo.com/mrss">http://search.yahoo.com/mrss</a> for the full specification.</p>
<p><a href="mailto:exablog@exalead.com">Contact us</a> if you have any technical questions!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.exalead.com/2008/01/11/video-search-update-part-1-submit-your-video/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>The Road to Better Site Indexing: Episode 3, Sitemaps (based on a true story)</title>
		<link>http://blog.exalead.com/2007/08/28/episode-4-sitemaps-based-on-a-true-story/</link>
		<comments>http://blog.exalead.com/2007/08/28/episode-4-sitemaps-based-on-a-true-story/#comments</comments>
		<pubDate>Tue, 28 Aug 2007 14:21:43 +0000</pubDate>
		<dc:creator>Sébastien</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Tips and tricks]]></category>

		<guid isPermaLink="false">http://blog.exalead.com/2007/08/28/episode-4-sitemaps-based-on-a-true-story/</guid>
		<description><![CDATA[
In our prior episodes:
The crawler known as “Bot” travels across the web, moving from page to page and site to site by following links he discovers along the way. But Bot isn’t the type to let himself be led about aimlessly. He tries to imitate his hero Humphrey Bogart, who never shied away from a [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.exalead.com/wikipedia/results?q=Humphrey%20Bogart" target="_blank"><img src="wp-content/imported/images/en_US/humphreybogart.jpg" title="Humphreybogart" alt="Humphrey Bogart" style="margin: 0px 0px 5px 5px; float: right" border="0" /></a><em><br />
In our prior episodes:<br />
The crawler known as “Bot” travels across the web, moving from page to page and site to site by following links he discovers along the way. But Bot isn’t the type to let himself be led about aimlessly. He tries to imitate his hero Humphrey Bogart, who never shied away from a tangled web yet always managed to stay on the right track.</em></p>
<p>But being a perfectionist, Bot wasn’t entirely satisfied with his own method. Was he overlooking a significant thread? Leaving an important page unturned? He had a hunch he could do better.</p>
<p>Leaving important content in the dustbin of unindexed pages was just the sort of slip-up that really peeved Bot’s equally perfectionist client Betty, a.k.a. “The Webmaster.” Betty had specifically called on Bot to crawl her entire site, and Bot had missed several pages.</p>
<p>To get their relationship back on the right track, Bot had an idea: he would ask Betty to tell him flat out everything she wanted him to know about her site. And being a guy always in the know, Bot knew just what tool Betty could use to set the record straight: a sitemap.<br />
He proposed; she accepted.</p>
<p>Now Betty can rest easy knowing all the content she wants to share with the world will be indexed. And just what is this handy tool known as a sitemap?<br />
It’s actually not much more than a laundry list of links, and constructing one is a snap. You simply create a text file listing the URLs you want indexed, along with any key facts you want Bot to know (like how often a page is updated), and place it on your web site, for example at the root: http://www.example.com/sitemap.xml. Then give Bot its location in your robots.txt file.</p>
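<p>To make the robots.txt step concrete, here is a minimal sketch, in Javascript, of how a crawler might pick sitemap locations out of a robots.txt file. It assumes the <code>Sitemap:</code> directive described on sitemaps.org; the function name <code>findSitemapUrls</code> and the sample file are invented for illustration, and this is not a description of how Bot actually does it.</p>

```javascript
// Hypothetical helper: returns the sitemap URLs declared in a robots.txt body.
function findSitemapUrls(robotsTxt) {
  var urls = [];
  robotsTxt.split(/\r?\n/).forEach(function (line) {
    // Match "Sitemap: URL" in any letter case, ignoring surrounding spaces.
    var match = /^\s*sitemap\s*:\s*(\S+)/i.exec(line);
    if (match) {
      urls.push(match[1]);
    }
  });
  return urls;
}

// Sample robots.txt content (made up for this example).
var robotsTxt = [
  "User-agent: *",
  "Disallow: /private/",
  "Sitemap: http://www.example.com/sitemap.xml"
].join("\n");

var sitemapUrls = findSitemapUrls(robotsTxt);
```

<p>With the sample file above, <code>sitemapUrls</code> holds the single address declared on the last line.</p>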
<p>Sitemaps can be written in XML (the preferred method), or communicated via syndication feeds or simple text files. A sitemap in XML looks something like this:</p>
<p>&lt;urlset xmlns=&quot;http://www.sitemaps.org/schemas/sitemap/0.9&quot;&gt;<br />
&lt;url&gt;<br />
&lt;loc&gt;http://www.example.com/&lt;/loc&gt;<br />
&lt;lastmod&gt;2005-01-01&lt;/lastmod&gt;<br />
&lt;changefreq&gt;monthly&lt;/changefreq&gt;<br />
&lt;priority&gt;0.8&lt;/priority&gt;<br />
&lt;/url&gt;<br />
&lt;url&gt;<br />
&lt;loc&gt;http://www.example.com/catalog?item=12&amp;amp;desc=vacation_hawaii&lt;/loc&gt;<br />
&lt;changefreq&gt;weekly&lt;/changefreq&gt;<br />
&lt;/url&gt;<br />
&lt;url&gt;<br />
&lt;loc&gt;http://www.example.com/catalog?item=83&amp;amp;desc=vacation_usa&lt;/loc&gt;<br />
&lt;/url&gt;<br />
&lt;/urlset&gt;</p>
<p>You can visit <a href="http://www.sitemaps.org/" target="_blank">http://www.sitemaps.org/</a> for all the details. It’s the official site of the Sitemaps protocol, which was first proposed by Google, then fleshed out through discussions with MSN, Yahoo and Ask. It’s now the standard adopted by Google, Yahoo, Ask, and, as of July 2007, Exalead.<br />
But bad guys, consider yourselves forewarned: Bot knows that not every webmaster is as straight up as Betty. He stays a step ahead of nefarious sitemap tricks, checking out every list of links spun his way and skipping right over bum lists.</p>
<p>Sébastien</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.exalead.com/2007/08/28/episode-4-sitemaps-based-on-a-true-story/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Road to Better Site Indexing – Episode 2</title>
		<link>http://blog.exalead.com/2007/07/11/the-road-to-better-site-indexing-%e2%80%93-episode-2/</link>
		<comments>http://blog.exalead.com/2007/07/11/the-road-to-better-site-indexing-%e2%80%93-episode-2/#comments</comments>
		<pubDate>Wed, 11 Jul 2007 14:27:00 +0000</pubDate>
		<dc:creator>Sébastien</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Tips and tricks]]></category>

		<guid isPermaLink="false">http://blog.exalead.com/2007/07/11/the-road-to-better-site-indexing-%e2%80%93-episode-2/</guid>
		<description><![CDATA[Next in our
series: “The Ballot Box Stuffers,” a.k.a. link farms. 
What’s a link
farm? Let’s look at the randomly chosen site http://www.rc-car.ravemart.com. At
first glance, this remote control car ecommerce site seems to be a typical
small biz site plying its wares in the typical way.
 
Now take a closer
look by clicking on some of the text links [...]]]></description>
			<content:encoded><![CDATA[<p style="margin: 0cm 0cm 0.0001pt"><span lang="EN-GB">Next in our<br />
series: “The Ballot Box Stuffers,” a.k.a. link farms. </span></p>
<p style="margin: 0cm 0cm 0.0001pt"><span lang="EN-GB">What’s a link<br />
farm? Let’s look at the randomly chosen site <a href="http://www.rc-car.ravemart.com/">http://www.rc-car.ravemart.com</a>. At<br />
first glance, this remote control car ecommerce site seems to be a typical<br />
small biz site plying its wares in the typical way.</span></p>
<p style="margin: 0cm 0cm 0.0001pt"><span lang="EN-GB"><o:p> </o:p></span></p>
<p style="margin: 0cm 0cm 0.0001pt"><span lang="EN-GB">Now take a closer<br />
look by clicking on some of the text links at the bottom of the page. You’ll<br />
see this site has decided to aid the search engines by providing links to<br />
thousands of its friends’ sites. Click on the generically titled “Link” and<br />
you’ll find links to the company’s “Partners” under categories such as Debt<br />
Consolidation, Vitamins, Legal Services and Sweepstakes. Or click on<br />
&#8220;Sponsored Links&#8221; for fast access to Cheap Viagra, Casino And Poker<br />
News, Psychic Readings, or Online Dating Services.</span></p>
<p style="margin: 0cm 0cm 0.0001pt"><span lang="EN-GB"><o:p> </o:p></span></p>
<p style="margin: 0cm 0cm 0.0001pt"><span lang="EN-GB">Similarly, visit <a href="http://www.all-carpets.com/resources/index.php">www.all-carpets.com</a>  where under “Resources” you’ll not only find links<br />
for Home Flooring and Kitchen Design (not too far off base), but also Cash Back<br />
Credit Cards, South African Zulu Culture, and Offshore Banking (uh-oh, we’re in<br />
left field now…).</span></p>
<p style="margin: 0cm 0cm 0.0001pt"><span lang="EN-GB"><o:p> </o:p></span></p>
<p style="margin: 0cm 0cm 0.0001pt"><span lang="EN-GB">Even larger<br />
companies and organizations may participate in these types of link programs,<br />
where all members link to all other members, often regardless of the relevance<br />
of the content, in the somewhat desperate hope of augmenting their popularity<br />
and hence appearing higher in search results.</span></p>
<p style="margin: 0cm 0cm 0.0001pt"><span lang="EN-GB"><o:p> </o:p></span></p>
<p style="margin: 0cm 0cm 0.0001pt"><span lang="EN-GB">If you’re a web<br />
searcher trying to find quality results, don’t worry, we remain vigilant <img src='http://blog.exalead.com/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> .<br />
If you’re a site owner, avoid participating in link farms. You may find the<br />
strategy backfires as your site is demoted or even dropped from some search<br />
engine databases. Instead, seek out quality reciprocal links with sites with<br />
which you share a genuine relationship, and you’ll build the kind of true<br />
popularity both visitors and search engines appreciate.</span></p>
<p style="margin: 0cm 0cm 0.0001pt"><span lang="EN-GB">Next episode: The<br />
Integration of RSS Feeds. </span></p>
<p style="margin: 0cm 0cm 0.0001pt"><span lang="EN-GB">Sebastien, Web<br />
Team Head Chef</span></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.exalead.com/2007/07/11/the-road-to-better-site-indexing-%e2%80%93-episode-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Road to Better Site Indexing – Introduction and Episode 1</title>
		<link>http://blog.exalead.com/2007/07/11/the-road-to-better-site-indexing-%e2%80%93-introduction-and-episode-1/</link>
		<comments>http://blog.exalead.com/2007/07/11/the-road-to-better-site-indexing-%e2%80%93-introduction-and-episode-1/#comments</comments>
		<pubDate>Wed, 11 Jul 2007 14:20:00 +0000</pubDate>
		<dc:creator>Sébastien</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Tips and tricks]]></category>

		<guid isPermaLink="false">http://blog.exalead.com/2007/07/11/the-road-to-better-site-indexing-%e2%80%93-introduction-and-episode-1/</guid>
		<description><![CDATA[The question that usually follows “How
can I make my site appear at the top of search engine results?” is “Why don’t
search engines index all my pages?”
First, you should know that pages
accessible only through JavaScript or through form submissions are not
reachable by search engines and therefore they cannot be indexed. And there is
no means for a [...]]]></description>
			<content:encoded><![CDATA[<p class="MsoBodyText"><span lang="EN-GB">The question that usually follows “How<br />
can I make my site appear at the top of search engine results?” is “Why don’t<br />
search engines index all my pages?”</span></p>
<p class="MsoNormal"><span lang="EN-GB">First, you should know that pages<br />
accessible only through JavaScript or through form submissions are not<br />
reachable by search engines and therefore cannot be indexed. And a search<br />
engine has no way of knowing whether it is missing pages on a site, be it<br />
10 pages or 10,000 (outside of sitemaps, which I will discuss in a future<br />
post).</span></p>
<p class="MsoBodyText"><span lang="EN-GB">Next, let’s refresh ourselves on the<br />
fundamental methods search engines use to find the pages they index: 1) They<br />
follow a submission made by a human being (0.0001% of cases), or 2) They follow<br />
a link from another page. Therefore, if there is a link to a given page, the<br />
probability that it will be indexed is high. Conversely, a personal site with<br />
no external links to it has little chance of being indexed by a search engine.<br />
So more links are always better, right?</span></p>
<p class="MsoBodyText"><span lang="EN-GB">Not necessarily. It should be understood that from the point of view of<br />
a search engine, a risk arises not from a dearth of links, but from too many.<br />
Why? Because search engines seek to provide the most relevant results for<br />
visitors, returning pages with the content most likely to match visitors’ needs<br />
and expectations. A site that arrived at the top of the results solely because<br />
there were tens of thousands of links to it would not pass this test. In fact,<br />
an overabundance of external links may indicate a “spamming” campaign aimed at<br />
search engines and be an indicator of poor site quality.</span></p>
<p>Here are two cases of what we’ll call legitimate ‘overabundance,’ an overabundance of links due to valid, non-spamming factors that can be properly managed by search engines.</p>
<p><strong>Case 1: User Sessions</strong></p>
<p class="MsoBodyText"><span lang="EN-GB">When you visit<br />
an e-commerce site, unique “session” information will often be assigned to your<br />
computer. This information uniquely identifies your particular connection and<br />
visit. It may include, for example, a unique ID for your computer and a code<br />
for your browser version or geographic location. </span></p>
<p class="MsoBodyText"><span lang="EN-GB">This session information tracks your movements,<br />
preferences and selections as you navigate a site. This is not for nefarious<br />
ends, but is rather used to perform practical tasks like maintaining items in<br />
your shopping cart, showing prices in your local currency or displaying a list<br />
of products you’ve viewed. This session information is most often added to the end<br />
of the URL (web address) for every page you visit.</span></p>
<p class="MsoBodyText"><span lang="EN-GB">For instance, say you are visiting Amazon.com and you<br />
navigate to a Stanley wrench set. The URL displayed in your browser is:</span></p>
<p class="MsoBodyText"><span lang="EN-GB"><a href="http://www.amazon.com/Stanley-92-716-Combination-Wrench-22-Piece/dp/B000JPUCT0/ref=sr_1_7/002-6118145-0432018?ie=UTF8&amp;s=hi&amp;qid=1181650669&amp;sr=1-7">http://www.amazon.com/Stanley-92-716-Combination-Wrench-22-Piece/dp/B000JPUCT0/<br />
ref=sr_1_7/002-6118145-0432018?ie=UTF8&amp;s=hi&amp;qid=1181650669&amp;sr=1-7</a>.<br />
</span></p>
<p class="MsoBodyText"><span lang="EN-GB">Only the first part of the URL,<br />
<a href="http://www.amazon.com/Stanley-92-716-Combination-Wrench-22-Piece/dp/B000JPUCT0/">http://www.amazon.com/Stanley-92-716-Combination-Wrench-22-Piece/dp/B000JPUCT0/</a>,<br />
is needed to locate the product information for this wrench set. The rest,<br />
“ref=sr_1_7/002-6118145-0432018?ie=UTF8&amp;s=hi&amp;qid=1181650669&amp;sr=1-7”,<br />
is session information for <em>your</em> particular visit.</span></p>
<p class="MsoBodyText"><span lang="EN-GB">A search engine may come across thousands of links<br />
like the longer address, each of which may appear different because unique session<br />
information is appended, and because each may show different user-dependent<br />
content such as navigation history, promotions, or recommended products. But<br />
any search engine worth its salt can distinguish the essential URL from its<br />
repetitive variants, and will know this is not a case of spamming.</span></p>
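<p>As a rough illustration of the idea, here is a minimal sketch in Javascript that reduces session-laden addresses to their essential URL and keeps one copy of each. The marker list, the function names and the second sample address are assumptions made up for this example; a real search engine relies on far more elaborate, site-specific rules.</p>

```javascript
// Hypothetical markers that, in this sketch, signal the start of
// per-visit data in a URL. Real crawlers would use richer rules.
var SESSION_MARKERS = ["/ref=", ";jsessionid="];

// Cuts the URL at the earliest session marker, if any is present.
function essentialUrl(url) {
  var cut = url.length;
  SESSION_MARKERS.forEach(function (marker) {
    var at = url.indexOf(marker);
    if (at !== -1) {
      cut = Math.min(cut, at);
    }
  });
  var base = url.slice(0, cut);
  // Keep exactly one trailing slash so that variants compare equal.
  return base.charAt(base.length - 1) === "/" ? base : base + "/";
}

// Collapses a list of visited links into the distinct essential URLs.
function uniqueEssentialUrls(urls) {
  var seen = {};
  var result = [];
  urls.forEach(function (url) {
    var key = essentialUrl(url);
    if (!seen.hasOwnProperty(key)) {
      seen[key] = true;
      result.push(key);
    }
  });
  return result;
}

// Two visits to the same product page; the second session ID is invented.
var visits = [
  "http://www.amazon.com/Stanley-92-716-Combination-Wrench-22-Piece/dp/B000JPUCT0/ref=sr_1_7/002-6118145-0432018?qid=1181650669",
  "http://www.amazon.com/Stanley-92-716-Combination-Wrench-22-Piece/dp/B000JPUCT0/ref=sr_1_2/104-0000000-0000000?qid=1181650999"
];
var distinctPages = uniqueEssentialUrls(visits);
```

<p>Both visits collapse to the same product address, so <code>distinctPages</code> contains a single entry.</p>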
<p><strong>Case 2: Calendar Menus</strong></p>
<p class="MsoNormal"><span lang="EN-GB">Some sites let you navigate through their<br />
content by clicking on a calendar. For example, you may be able to peruse news<br />
articles or events on a site by choosing a date or date range.</span></p>
<p class="MsoNormal"><span lang="EN-GB">Such menus generate links like:<br />
<a href="http://www.ecvd.eu/index.php?option=com_events&amp;task=view_month&amp;Itemid=32&amp;year=2011&amp;month=09&amp;day=12">http://www.ecvd.eu/index.php?option=com_events&amp;task=view_month&amp;Itemid=32&amp;year=2011<wbr />&amp;month=09&amp;day=12</a></span></p>
<p class="MsoNormal"><span lang="EN-GB">A competent search engine will know which<br />
of these types of links returns valid content and which does not, and what<br />
baseline URL should be included in a search index. In other words, having a<br />
zillion external links for events on dates from 1950 to 2060 for a site with ten<br />
events will definitely not boost that site’s ranking <img src='http://blog.exalead.com/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> .</span></p>
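<p>Here is a minimal sketch, in Javascript, of how calendar links could be folded back onto a single baseline URL. It assumes a runtime that provides the standard <code>URL</code> class; the list of date parameters and the function names are assumptions based only on the sample link above, not a description of what any search engine actually does.</p>

```javascript
// Hypothetical list of query parameters that only select a date.
var DATE_PARAMS = ["year", "month", "day"];

// Builds a calendar link shaped like the sample above.
function calendarLink(year, month, day) {
  var url = new URL("http://www.ecvd.eu/index.php");
  url.searchParams.set("option", "com_events");
  url.searchParams.set("task", "view_month");
  url.searchParams.set("Itemid", "32");
  url.searchParams.set("year", year);
  url.searchParams.set("month", month);
  url.searchParams.set("day", day);
  return url.toString();
}

// Drops the date parameters, leaving the baseline URL of the calendar.
function baselineUrl(link) {
  var url = new URL(link);
  DATE_PARAMS.forEach(function (name) {
    url.searchParams.delete(name);
  });
  return url.toString();
}

// Links for dates more than a century apart share one baseline.
var first = baselineUrl(calendarLink("1950", "01", "01"));
var last = baselineUrl(calendarLink("2060", "12", "31"));
var sameBaseline = first === last;
```

<p>However many dates the menu can generate, every link reduces to the same baseline, which is the only address worth considering for the index.</p>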
<p class="MsoNormal"><span lang="EN-GB">Now you may say these two cases look like<br />
easy ones for a search engine to manage, and you’d be right. The real<br />
difficulties arise from the following three cases, because (scoop!) there are<br />
unscrupulous people out there ready to do anything to improve their search<br />
engine ranking. You’ve most likely encountered their handiwork when using a<br />
search engine other than Exalead.</span></p>
<p class="MsoNormal"><span lang="EN-GB">You run your search and click on a page you<br />
think is relevant, only to encounter an endless list of meaningless links or<br />
keywords, a pastiche of content “borrowed” from other more relevant sites, or<br />
an endless loop of promising links that ultimately go nowhere. </span></p>
<p class="MsoNormal"><span lang="EN-GB">These types of pages are generated by the<br />
folks at the top of our list of ballot-box stuffers, those trying to improve<br />
their search engine rank through:</span></p>
<p class="MsoNormal"><span lang="EN-GB">* Link farms and keyword stuffing, </span></p>
<p class="MsoNormal"><span lang="EN-GB">* Content scraping, including the abuse of<br />
RSS Feeds, and </span></p>
<p class="MsoNormal"><span lang="EN-GB">* Creating content labyrinths.</span></p>
<p class="MsoNormal"><span lang="EN-GB">We’ll be covering these tactics in upcoming<br />
episodes. In the meantime, you can see why search engines may need to limit the<br />
number of pages they index for a site. This ‘quota’ is determined based on the<br />
site’s reputation, the duplication of its content, and a thousand other<br />
parameters, all factored in an attempt to keep the game honest so web searchers<br />
get the most relevant search results possible.</span></p>
<p class="MsoNormal">&nbsp;</p>
<p class="MsoNormal"><span lang="EN-GB">Sebastien, Head Chef, Web Team</span></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.exalead.com/2007/07/11/the-road-to-better-site-indexing-%e2%80%93-introduction-and-episode-1/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Learning Javascript &#8211; Part I</title>
		<link>http://blog.exalead.com/2007/06/13/learning-javascript-part-i/</link>
		<comments>http://blog.exalead.com/2007/06/13/learning-javascript-part-i/#comments</comments>
		<pubDate>Wed, 13 Jun 2007 15:38:55 +0000</pubDate>
		<dc:creator>Carole</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://blog.exalead.com/2007/06/13/learning-javascript-part-i/</guid>
		<description><![CDATA[Many developers hate Javascript. Nevertheless, they are asked every day to &#8220;add a bit of AJAX on the website&#8221;.
This post is dedicated to these developers.
Javascript is a rich language, it is object oriented and easy to learn.
The basics
The Mozilla Developer Connection is the best documentation source for Javascript. To learn the syntax and the basic [...]]]></description>
			<content:encoded><![CDATA[<p>Many developers hate Javascript. Nevertheless, they are asked every day to &#8220;add a bit of AJAX on the website&#8221;.</p>
<p>This post is dedicated to these developers.</p>
<p>Javascript is a rich language: it is object-oriented and easy to learn.</p>
<p><strong>The basics</strong><br />
The <a href="http://developer.mozilla.org">Mozilla Developer Center</a> is the best documentation source for Javascript. To learn the syntax and the basic data types, read <a href="http://developer.mozilla.org/en/docs/A_re-introduction_to_JavaScript">A re-introduction to Javascript</a>.</p>
<p><strong>Objects</strong><br />
In Javascript, an object is basically a hash. The simplest way to create an object is:</p>
<pre>
var o = {};                    // an empty object
o.name = "hello";              // add a property
o.setName = function(name) {   // add a method
  this.name = name;
};</pre>
<p>In this example, <code>o</code> is an object, its <code>name</code> property is set to <code>hello</code>, and its <code>setName</code> property is a function.</p>
<p><strong>Classes and inheritance.</strong><br />
Javascript handles the notion of class in an unusual way:</p>
<ul>
<li>a Class is an object of type <code>Function</code></li>
<li>this object has a magic member called <code>prototype</code></li>
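<li>when the function is invoked with the <code>new</code> operator, a new empty object is created and linked to the prototype, so it inherits the prototype&#8217;s members without copying them; the function then runs as the constructor, with <code>this</code> bound to the new object.</li>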
</ul>
<p>Example:</p>
<pre>
var MyClass = function() {
  // constructor: runs each time "new MyClass()" is called
  this.name = "default name";
};
MyClass.prototype = {
  // prototype: members shared by every instance
  setName: function(name) {
    this.name = name;
  }
};

var o = new MyClass();
alert(o.name); // "default name"
o.setName("new name");
alert(o.name); // "new name"</pre>
<p>There are multiple ways to write classes. For a full overview, read the excellent <a href="http://www.crockford.com/javascript/inheritance.html">Classical inheritance in Javascript</a>.</p>
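<p>One detail is worth seeing in action: methods are looked up on the prototype at call time rather than stored on each object. The sketch below uses a made-up <code>Counter</code> class to show it.</p>

```javascript
// Counter is a hypothetical class used only for this illustration.
var Counter = function () {
  this.count = 0;
};
Counter.prototype.increment = function () {
  this.count += 1;
  return this.count;
};

var c = new Counter();

// Added after c was created, yet c can call it: the lookup goes
// through the prototype when the method is invoked.
Counter.prototype.reset = function () {
  this.count = 0;
};

c.increment();
c.increment();
c.reset();

// false: increment lives on the prototype, not on c itself.
var ownsIncrement = c.hasOwnProperty("increment");
```

<p>Because instances share the prototype, a method added later is immediately available to objects that already exist.</p>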
<p><strong>Access control</strong><br />
In Javascript, every property of an object is public. The simplest way to preserve the notion of privacy is through convention. For instance, at Exalead we prefix private members with an underscore.</p>
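<p>A small illustration of the underscore convention follows. The <code>Account</code> class and its members are invented for this example; nothing in the language enforces the rule, which is exactly why it is only a convention.</p>

```javascript
// Account is a hypothetical class used only for this illustration.
var Account = function (owner) {
  this.owner = owner;   // public member
  this._balance = 0;    // "private" by convention only
};

Account.prototype.deposit = function (amount) {
  this._balance += amount;
};

Account.prototype.getBalance = function () {
  return this._balance;
};

var account = new Account("Betty");
account.deposit(50);
var balance = account.getBalance();

// Nothing stops outside code from reading account._balance directly;
// the underscore only tells other developers not to.
var stillReachable = account._balance === balance;
```

<p>Callers are expected to go through <code>deposit</code> and <code>getBalance</code> and to leave <code>_balance</code> alone.</p>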
<p><strong>Everything Dynamic</strong><br />
In Javascript, almost everything can be changed at any time, even methods. This is particularly useful when coding event-based user interfaces:</p>
<pre>
o.onReceiveSomeEvent = function() {
// do something
};</pre>
<p>It is even possible to enrich basic data types with your own methods by adding methods to their prototype.</p>
<p>Example:</p>
<pre>
// true when the string is empty or contains only whitespace
String.prototype.blank = function() {
  return /^\s*$/.test(this);
};
"hello".blank(); // false
"     ".blank(); // true</pre>
<p><strong>The DOM &#8211; Document Object Model</strong></p>
<p>Javascript runs in the browser, inside an HTML page. A representation of that page &#8211; the DOM &#8211; is available to Javascript through the global <code>window</code> object, as <code>window.document</code>. Of course, it would be too easy if the DOM were identical between browsers. How, then, do people write cross-browser code? There are at least two approaches:</p>
<p>Approach #1:</p>
<pre>
if (navigator.appName == "Microsoft Internet Explorer"
    &amp;&amp; navigator.appVersion &gt;= "4.0") {
  element.attachEvent("onclick", function() {alert("click")});
}

if (navigator.appName != "Microsoft Internet Explorer"
    &amp;&amp; navigator.appName != "Netscape") {
  element.addEventListener("click", function() {alert("click")});
}</pre>
<p>This is the most obvious approach, and the least elegant and scalable. It makes sharing code a nightmare and will inevitably make you hate Javascript. Don&#8217;t use it, ever.</p>
<p>Approach #2:</p>
<pre>
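// Feature detection: test for the method itself, not the browser name.
if (element.addEventListener) {
  // W3C event model
  element.addEventListener("click", function() {alert("click")}, false);
} else if (element.attachEvent) {
  // older Internet Explorer event model
  element.attachEvent("onclick", function() {alert("click")});
}</pre>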
<p>This is a lot better. This approach uses one of Javascript&#8217;s strengths: testing for the existence of a function before calling it. It provides maximal compatibility with minimal browser knowledge.</p>
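<p>The same test can be wrapped once in a small helper so that the rest of the code never has to think about it. The sketch below is one possible shape for it; the name <code>addEvent</code>, its return value and the stand-in element are assumptions made for this example, not part of any library.</p>

```javascript
// Hypothetical helper: registers a handler with whichever event API
// the element offers, and reports which one was used.
function addEvent(element, type, handler) {
  if (element.addEventListener) {
    element.addEventListener(type, handler, false);
    return "addEventListener";
  }
  if (element.attachEvent) {
    element.attachEvent("on" + type, handler);
    return "attachEvent";
  }
  // Last resort: the plain "onclick"-style property.
  element["on" + type] = handler;
  return "property";
}

// Stand-in for an element that only offers attachEvent, so the
// sketch can run outside a browser.
var fakeIeElement = {
  handlers: {},
  attachEvent: function (name, handler) {
    this.handlers[name] = handler;
  }
};

var used = addEvent(fakeIeElement, "click", function () {});
```

<p>In a page you would pass a real element instead, for example one returned by <code>document.getElementById</code>.</p>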
<p>That&#8217;s all for today folks! In my next post, we&#8217;ll dissect the <a href="http://prototypejs.org/">Prototype</a> Javascript Framework.</p>
<p>- <strong>Damucho</strong>, for the WebDev team</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.exalead.com/2007/06/13/learning-javascript-part-i/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
