Everything you’ve ever wanted to know about real-time search

How to get your publication ready for search’s next generation.

Nearly ten years after Google revolutionized search, the way the Internet is organized is changing again.

Many users are no longer content to wait for information to be “crawled” by search engines. Now, links are just as likely to be discovered through Twitter and Facebook as they are through Google, making the real-time Web a substantial driver of traffic for sites that can ensure their content is indexed and pushed to real-time services quickly.

“The old model is that you have to go to a search engine and branch out from there, and that model is not going away,” says Collecta CEO Gerry Campbell. “But real-time search is a little bit different. It makes information seeking continuous.”

With a real-time search engine like Collecta, when a user searches for a term, the engine continually updates the page with new links until the user hits the pause button or leaves the page. That makes real-time search ideal for topics like current events, breaking news, financial news, and sports.

Real-time search also constantly checks feeds for updates; some engines can update their index in less than a second after a website is updated.
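As a rough illustration of what checking a feed for updates involves, the Python sketch below parses an RSS document and reports the items whose GUIDs have not been seen before. The sample feed, the function name new_items and the idea of tracking GUIDs in a set are assumptions made for this example, not a description of how Collecta or any other engine actually works.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up RSS feed used only for this sketch.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Publication</title>
    <item><guid>post-1</guid><title>First story</title></item>
    <item><guid>post-2</guid><title>Second story</title></item>
  </channel>
</rss>"""


def new_items(feed_xml, seen_guids):
    """Return titles of items not seen before and record their GUIDs."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        guid = item.findtext("guid")
        if guid and guid not in seen_guids:
            seen_guids.add(guid)
            fresh.append(item.findtext("title"))
    return fresh


seen = {"post-1"}  # pretend the first story was indexed on an earlier check
print(new_items(SAMPLE_FEED, seen))
```

Run on a schedule, a check like this only ever finds out about a story at the next poll, which is the delay push-based approaches are meant to remove.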

For example, Campbell says he was sitting at his laptop watching this year’s MTV Video Music Awards when Kanye West infamously stormed the stage to interrupt Taylor Swift’s acceptance speech. Using Collecta, he saw the conversation move from Twitter to blogs to mainstream media outlets in the span of 20 minutes, with each medium adding context and reporting to the news.

As of now, real-time search is in its infancy, not because the technology is unavailable, but because most consumers are unaware of its existence.

“Users have yet to establish where they will be experiencing real-time information,” says Campbell. Now, users have to navigate to a search engine page and type in a query. But in the future, real-time search could add to the already ubiquitous nature of the Internet.

For example, a real-time search engine should be able to detect what you are watching on TV and provide the real-time conversation alongside the video.  Or on your mobile device, a real-time search application could figure out your location and stream you nearby restaurants without refreshing the page.

Four steps to optimize your content for real-time search

So what does this all mean for publishers? With traditional search engines, many publishers hire search optimization experts to help ensure their content appears high in search results for relevant keywords. However, with real-time search, content is usually ranked by the date it was published or by the number of people who shared the link on social networks.
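To make the two ranking approaches concrete, here is a minimal Python sketch that orders the same set of links first by publication time and then by share count. The records, field names and numbers are invented for illustration; real engines are likely to blend these signals with others.

```python
from datetime import datetime

# Hypothetical link records; the field names are assumptions for this sketch.
links = [
    {"url": "http://example.com/a", "published": datetime(2009, 9, 14, 9, 0), "shares": 12},
    {"url": "http://example.com/b", "published": datetime(2009, 9, 14, 9, 5), "shares": 3},
    {"url": "http://example.com/c", "published": datetime(2009, 9, 14, 8, 30), "shares": 40},
]


def rank_by_recency(items):
    """Newest links first."""
    return sorted(items, key=lambda item: item["published"], reverse=True)


def rank_by_shares(items):
    """Most-shared links first."""
    return sorted(items, key=lambda item: item["shares"], reverse=True)


print([item["url"] for item in rank_by_recency(links)])
print([item["url"] for item in rank_by_shares(links)])
```

In this made-up data the newest link is the least shared, so the two orderings disagree. Keyword placement plays no part in either one.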

This has the potential to change the SEO game from search engines “pulling” your content through spiders to the “pushing” or pinging of content by publishers when an RSS feed is updated.

“The adoption of push-based technology is the crux of the real-time Web,” says Campbell.

Here’s a quick roadmap to optimizing your site and your content for the next generation of search:

Pick your technology. According to Campbell, there are three technologies used by publishers to help push their content out quickly:

  • XMPP – Used by Google Wave, this is similar to instant messaging technology. XMPP is an open standard but requires a bit of know-how to implement.

  • RSSCloud – A technology developed by Dave Winer, the creator of RSS, that is easy to implement. By changing the sources your RSS feed pings, your content can be updated almost instantly on real-time search engines. Many content management systems have RSSCloud plugins. The site even offers an implementation guide.

  • Pubsubhubbub – The unfortunately named Pubsubhubbub is a decentralized, open-source technology designed to quickly ping servers.
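For a sense of how small a “ping” is, the Python sketch below builds the notification a publisher sends to a Pubsubhubbub hub when a feed changes: a form-encoded POST carrying hub.mode=publish and the feed’s URL in hub.url, as described in the protocol’s published specification. The hub and feed addresses and the function name are placeholders, and the request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

HUB_URL = "https://hub.example.com/"           # hypothetical hub endpoint
FEED_URL = "https://www.example.com/feed.xml"  # hypothetical feed URL


def build_publish_ping(hub_url, feed_url):
    """Build the POST that tells a hub a feed has new content."""
    body = urlencode({"hub.mode": "publish", "hub.url": feed_url}).encode("ascii")
    return Request(
        hub_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


ping = build_publish_ping(HUB_URL, FEED_URL)
print(ping.get_method(), ping.full_url)
print(ping.data.decode("ascii"))
# To send it for real, pass the request to urllib.request.urlopen().
```

The hub, not the publisher, then takes on the work of fetching the updated feed and passing the new entries on to subscribers such as search engines.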

Implement it. For your development department, implementing any of the protocols shouldn’t take one person more than a week. Two of them involve simply changing which servers your RSS feed pings when it updates.
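Part of that work is usually a small change to the feed itself, so subscribers know where to listen. The Python sketch below adds a Pubsubhubbub hub link and an RSS 2.0 cloud element to a sample feed. The element and attribute names come from the Pubsubhubbub and RSS 2.0 specifications, but every value shown (hub address, cloud domain, port, path, protocol) is a placeholder, and in practice a content management system plugin may make this change for you.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ET.register_namespace("atom", ATOM_NS)

# A made-up feed used only for this sketch.
FEED = """<rss version="2.0">
  <channel>
    <title>Example Publication</title>
    <link>https://www.example.com/</link>
  </channel>
</rss>"""


def advertise_push_endpoints(feed_xml, hub_url, cloud_domain):
    """Add a Pubsubhubbub hub link and an RSS <cloud> element to a feed."""
    root = ET.fromstring(feed_xml)
    channel = root.find("channel")
    # Pubsubhubbub discovery: a link with rel="hub" pointing at the hub.
    ET.SubElement(channel, f"{{{ATOM_NS}}}link", {"rel": "hub", "href": hub_url})
    # RSS 2.0 <cloud> element; every value here is a placeholder.
    ET.SubElement(channel, "cloud", {
        "domain": cloud_domain,
        "port": "80",
        "path": "/rsscloud/pleaseNotify",
        "registerProcedure": "",
        "protocol": "http-post",
    })
    return ET.tostring(root, encoding="unicode")


print(advertise_push_endpoints(FEED, "https://hub.example.com/", "cloud.example.com"))
```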

Reach out to search engines. Because the technology is so new, most search engines are adding feeds manually. At the end of this article is a list of real-time search engines. Contact them about syndicating your content.

“People are excited to take on this stuff. The new search engines that are trying to be timely would be really interested in your content,” says Campbell.

Make sure you have an active presence on Twitter. Though certainly not a requirement, having a well-established Twitter account can help your search ranking on “social search” sites that place emphasis on the number of times a link has been shared or retweeted.

The sources:

Collecta claims to have 10,000 feeds supplying its search engine with content. Other real-time search engines rely mostly on Twitter. Here are the most popular sources used by real-time search engines:

Status updates – More specifically, Facebook status updates. Although Facebook statuses are only available to the author’s friends, that doesn’t stop them from being useful inside the Facebook ecosystem. Nothing revolutionary has yet been done with the data, but Facebook has shown through features like its Happiness Index that it records every status update. Search engines will soon be able to let you enter your Facebook information so that your friends’ activity influences your searches and their status updates and postings are included in your results.

Micro-blogging – Some real-time search engines place more emphasis on tweets while others index the links shared over Twitter. Any way you slice it, the road to real-time search goes directly through Twitter. The site’s status updates are public, tagged with metadata and searchable, making it the primary source for all of the burgeoning real-time search engines. Micro-blogging also-ran Jaiku receives some attention here as well.

Social Bookmarking – The content appearing on social bookmarking sites like Delicious and Digg isn’t exactly real-time, as articles need to get bookmarked dozens of times before even seeing the front page of either site. However, the sites are a great way to determine what is popular this minute, even if the content they index is not new.

Flickr – The photo sharing site may not seem to operate in real-time, but many event attendees upload photos to Flickr with an easily searchable tag while the event is taking place.

Individual Publishers – This is where you come in. Many publishers such as Time and Entertainment Weekly have provided a real-time feed to Collecta.

The major real-time players:

OneRiot – The growing powerhouse in the real-time search market is OneRiot. The company has licensed its technology to Microsoft and Yahoo and is leading the charge to mobile real-time search. OneRiot indexes micro-blogging and social bookmarking sites and ranks the links based on the number of times each link has been shared.

Collecta – Collecta ups the ante by also searching blog posts and blog comments. The site can also run several real-time searches in a single window.

Topsy and Tweetmeme – Both sites rank search results by the number of times the link has been shared on Twitter. They are best for a look into what’s popular now, not necessarily the most recent content.

Google – Google co-founder Larry Page has admitted the company has fallen behind Twitter in the real-time search market, though the two companies have recently entered a content-sharing agreement. Page pledged to do a better job, and with the unveiling of the company’s latest “Caffeine” update, Google is just beginning to throw its muscle into the market.

A list of Real-Time search engines from ReadWriteWeb:

Comments

TipTop

I am surprised that one of the most heavily used real-time search engines TipTop at http://FeelTipTop.com is not mentioned in your otherwise excellent write-up.

Sorry about that, I added it to the list.

Thanks

I appreciate this greatly, Sean. I'd love to tell you & your readers more about TipTop, the only real-time, semantic, social search engine. Please let me know if you want to chat sometime about some good ways to do this. Thanks.

About Sean Blanda

Sean Blanda is an editor of eMedia Vitals and a writer based out of the Fishtown neighborhood in Philadelphia. Named by UWIRE as one of the top 100 young journalists in 2008, he has served as Web Editor of several publications, including the Philadelphia City Paper.

He has also been published in the Philadelphia Daily News, Philadelphia Inquirer and the Wilmington News Journal. He is the lead organizer of the national BarCamp News Innovation in Philadelphia.

Sean also co-founded and writes for Technically Philly, a news site that covers the technology industry in Philadelphia.