
    Microsoft Research Automates Hunt for Search Engine Spam

    Written by

    Ryan Naraine
    Published July 13, 2006

      Researchers at Microsoft are working on an ambitious new project to hunt down and neutralize large-scale search engine spammers.

      The Redmond, Wash., software giant's Cybersecurity and Systems Management Research Group has taken the wraps off Strider Search Defender, an experimental project that automates the discovery of search spammers through non-content analysis.

      The project integrates technology from two previous Microsoft Research prototypes—Strider HoneyMonkey and Strider URL Tracer—and promises a new approach to removing junk results from search engine queries.

      “The Web is so badly spammed, you can find a spam site on just about every search query,” said Yi-Min Wang, the researcher heading up the project at Microsoft, in an interview with eWEEK. “We think this approach can pinpoint the big spammers and use their own tactic against them.”

      According to data from Automattic Kismet, a tool that helps bloggers thwart comment spammers, a whopping 93 percent of all blog comments are spam. With Strider Search Defender, Wang's team is taking a context-based approach that uses URL-redirection analysis to pinpoint spammers.

      “For the spammers to be successful, they have to post millions of fake comments on message boards and blogs. That's the only way to get picked up by search engines. If we can find a way to pinpoint them before they get indexed by search engines, the problem is solved,” Wang said.

      “They want to be found by search engines, that's why they're spamming. Well, now we're finding you,” he added.

      The problem is tied to the use of spam blogs, or splogs, to earn money from pay-per-click advertising programs offered by Google, Yahoo and MSN. Content on fake blogs often contains text stolen from legitimate Web sites and includes an unusually high number of links to sites associated with the splog creator. The sole purpose is to boost the search engine rank of the affiliated sites and cash in on ad impressions from unsuspecting surfers.

      During the early stages of the Microsoft research, Wang discovered that successful large-scale spammers create a huge number of “doorway pages” on reputable domains to trick search engine users into clicking on a fake site. It is well known that Google's BlogSpot, Yahoo's GeoCities and AOL's Hometown services are all used by spammers to create doorway pages.

      The doorway pages are then spammed to millions of forums, blog comments and archived newsgroups, pushing the pages up the search engine results for certain target keywords. A user clicking on a doorway-page link in search listings gets redirected to a target page controlled by the spammer or, in some cases, Wang explained, the browser is instructed to either redirect to or fetch ad listings operated by the spammer.

      The Microsoft Research team is now proposing to treat each spam page as a dynamic program rather than a static page and use a “monkey program” to analyze the traffic resulting from visiting each page with an actual browser. “By identifying those domains that serve target pages for a large number of doorway pages, we can catch major spammers' domains together with all their doorway pages and doorway domains,” Wang explained.

      Strider Search Defender starts with a seed list of confirmed spam URLs and uses a homegrown tool called Spam Hunter to run link queries on search engines. This is an automated process that pinpoints the forums and guest books on which the known spam URLs were posted. On these pages, additional spam links are scraped to automatically generate a list of spam URLs.
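
The expansion step described above can be illustrated with a minimal Python sketch. It is not Microsoft's Spam Hunter code: the function names, the `PAGES` dictionary and the `.example` URLs are all hypothetical, and the pages are supplied in memory rather than located through search engine link queries. The idea shown is only that any link sharing a page with a known spam URL becomes a candidate for further checking.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return all anchor hrefs found in the given HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


def expand_seed_list(seed_urls, pages):
    """Return candidate spam URLs that share a page with a known seed URL.

    `pages` maps a forum or guest-book URL to its HTML (assumed input;
    a real system would have to fetch these pages itself).
    """
    seeds = set(seed_urls)
    candidates = set()
    for html in pages.values():
        links = set(extract_links(html))
        if links & seeds:  # the page was hit by a known spammer
            candidates |= links - seeds
    return candidates


# Hypothetical sample data: one spammed forum thread, one clean guest book.
PAGES = {
    "http://forum.example/thread/1": (
        '<a href="http://seed.example/a">x</a>'
        '<a href="http://new1.example/b">y</a>'
    ),
    "http://guestbook.example/page": '<a href="http://clean.example/">z</a>',
}
CANDIDATES = expand_seed_list(["http://seed.example/a"], PAGES)
```

The result is only a list of candidates. Legitimate links can appear on the same page as spam, which is why a separate filtering pass is still needed.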

      To filter out false positives, Microsoft feeds the list of potential spam URLs to the Strider URL Tracer, a tool released earlier this year by Microsoft to help trademark owners find typo-squatting domains of their Web sites.

      Using the URL Tracer, Wang's team can launch an actual browser to visit each URL and record all secondary URLs visited as a result. At the end of that automated scan, the researchers can figure out which target-page domains are associated with a large number of doorway-page URLs.
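
A rough sketch of that final aggregation step, assuming the browser scan has already produced a log mapping each doorway URL to the secondary URLs it triggered. The log format, the function names and the `.example` hosts are assumptions made for illustration, and the sketch compares full host names rather than registered domains, which a real tool would likely need to handle more carefully.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def domain_of(url):
    """Return the lowercase host name of a URL (simplified: full host, not registered domain)."""
    return (urlsplit(url).hostname or "").lower()


def rank_target_domains(trace_log):
    """Rank third-party domains by how many distinct doorway URLs lead to them.

    `trace_log` maps a doorway URL to the list of secondary URLs recorded
    while visiting it (hypothetical format).
    """
    doorways_by_target = defaultdict(set)
    for doorway, secondary_urls in trace_log.items():
        doorway_domain = domain_of(doorway)
        for url in secondary_urls:
            target = domain_of(url)
            # Ignore traffic that stays on the doorway's own host.
            if target and target != doorway_domain:
                doorways_by_target[target].add(doorway)
    return sorted(
        ((target, len(doorways)) for target, doorways in doorways_by_target.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Hypothetical trace log for three doorway pages.
TRACE_LOG = {
    "http://blog-a.example.com/": ["http://ads.target-one.example/x"],
    "http://blog-b.example.com/": [
        "http://ads.target-one.example/y",
        "http://blog-b.example.com/style.css",
    ],
    "http://blog-c.example.com/": ["http://other.target-two.example/z"],
}
RANKED = rank_target_domains(TRACE_LOG)
```

Counting distinct doorway URLs per target, rather than raw requests, keeps a single chatty page from inflating a domain's rank.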

      In one scenario, Wang said the Spam Hunter collected more than 17,000 BlogSpot URLs and fed them into the URL Tracer. The group was able to identify the top 25 target-page domains that are behind the Google-hosted splogs. The top six are particularly active, Wang said, identifying them as s-e-arch.com, speedsearcher.net, abcsearcher.com, eash.info, paysefeed.net and veryfastsearch.com, which collectively were responsible for approximately 45 percent of the BlogSpot URLs.
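
The "approximately 45 percent" figure is a concentration measure: the share of all collected doorway URLs that lead to the most active target domains. A small Python sketch of that arithmetic follows. The counts are invented for illustration and are not the study's data, and the sketch assumes each doorway URL is attributed to exactly one target domain.

```python
def top_k_share(doorway_counts, total_doorways, k):
    """Fraction of all doorway URLs attributed to the k most active target domains."""
    if total_doorways <= 0:
        raise ValueError("total_doorways must be positive")
    top = sorted(doorway_counts.values(), reverse=True)[:k]
    return sum(top) / total_doorways


# Hypothetical counts of doorway URLs per target domain.
COUNTS = {"t1.example": 30, "t2.example": 20, "t3.example": 10, "t4.example": 5}
SHARE = top_k_share(COUNTS, total_doorways=100, k=2)
```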

      Wang said the Strider Search Defender project has already helped to remove junk results from MSN Search. “The more widely spammed a URL is, the easier it is for the Spam Hunter to find it. Once a spammed forum is identified, it becomes a HoneyForum that can be used to capture new spam URLs in new comment postings,” he said. “Ideally, since there is a delay between spamming and its effect on search engine results, our spam hunter should be able to identify new spam URLs and notify the search engine before the URLs enter top search results.”
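
The HoneyForum idea Wang describes can be sketched as a simple watcher that remembers which links it has already seen on a spammed forum and reports only the new ones. The class name, its methods and the sample URLs below are hypothetical. How the comment links are collected and how new URLs are reported to a search engine are left out.

```python
class HoneyForum:
    """Track a forum known to be spammed and surface newly posted links."""

    def __init__(self, forum_url, known_links=()):
        self.forum_url = forum_url
        self.seen = set(known_links)

    def observe(self, comment_links):
        """Record links from the latest comments; return unseen ones in first-seen order."""
        fresh = []
        for link in comment_links:
            if link not in self.seen:
                self.seen.add(link)
                fresh.append(link)
        return fresh


# Hypothetical usage: two successive scans of the same forum thread.
forum = HoneyForum("http://forum.example/thread/1", ["http://old-spam.example/a"])
FIRST = forum.observe(["http://old-spam.example/a", "http://new-spam.example/b"])
SECOND = forum.observe(["http://new-spam.example/b"])
```

Links returned by `observe` are still only candidates. In the workflow the article describes, they would be traced and checked before anything is reported.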
