
    No Bots Allowed!

    Written by eWEEK EDITORS
    Published April 16, 2001

      One look at Napster, and it's easy to see how new Internet technologies can stir up major conflict. But a looming controversy on the Net might center on a piece of Internet plumbing that's now 7 years old — ancient in Web terms.

      The Robot Exclusion Standard is arcane enough that probably only hard-core site developers and search engine specialists ever even think about it. But just because it's been overlooked doesn't mean it's insignificant. The standard figured into at least one important legal dispute on the Net last year, and chances are good it will soon surface again.

      As its name suggests, the Robot Exclusion Standard was created to govern robots — computer programs that surf the Web without human supervision. Among other purposes, robots (or “bots”) are used to compile the vast Web databases that make search engines possible.

      Yet not everyone appreciates robots' efficiency. After all, using a robot, someone can extract content from a site without viewing a single ad. A competitor can make an instant copy of your Web site's data assets. Robots can also harm a site operationally, when they surf a site more quickly than a server can serve pages. So in 1994, when the Web was still very much an uncharted frontier, a group of developers cobbled together the Robot Exclusion Standard, which lets Web site owners tell robots where to go and where not to go on their sites.

      While the standard created a workable compromise between bot developers and Web sites, it has not eliminated the potential for conflict when their interests collide.

      Last year, eBay took auction aggregator Bidder's Edge to court for sending robots to crawl the eBay site. A federal judge granted an injunction against Bidder's Edge, relying partly on eBay's use of the Robot Exclusion Standard.

      Bidder's Edge has shut down its Web service, and the companies have settled their dispute.

      How could Bidder's Edge's robots crawl eBay's site in the first place if eBay had taken steps to keep them out? The answer lies in how the Robot Exclusion Standard works. When it first enters a site, a visiting Web robot will ask the site for its "robots.txt" file. The file contains commands that tell robots which directories they're not permitted to visit. (You can view CNN.com's file at www.cnn.com/robots.txt.)
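A robots.txt file is ordinary plain text. The sketch below is a hypothetical example (not any real site's file, and the directory names are invented) showing the two directives the standard defines: a `User-agent` line naming which robots a rule block applies to, and `Disallow` lines listing the paths those robots are asked to stay out of.

```text
# Hypothetical robots.txt, for illustration only

# Rules for all robots
User-agent: *
Disallow: /cgi-bin/
Disallow: /private/

# A specific robot asked to stay out of the entire site
User-agent: BadBot
Disallow: /
```

An empty `Disallow:` line would mean the named robot may visit everything; `Disallow: /` asks it to visit nothing.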

      However, the standard relies entirely on the courtesy of the visiting robot. It's completely optional. Nothing prevents robots from simply ignoring the directives in a robots.txt file — and many robots do just that. In that sense, a robots.txt file is less like a locked door than a "no entry" sign hanging in an open doorway.
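The "courtesy" here is literally a check the robot's author chooses to perform. As a present-day sketch (using Python's standard-library robots.txt parser; the robots.txt content and URLs below are hypothetical), a well-behaved robot consults the file before every fetch — and nothing but the programmer's good manners makes it do so:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a site might serve; a real robot would
# download this from http://example.com/robots.txt first.
ROBOTS_TXT = """\
User-agent: *
Disallow: /auctions/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def may_crawl(url: str, user_agent: str = "PoliteBot") -> bool:
    """A courteous robot calls this before fetching a page.

    Nothing enforces the check: a rude robot simply skips it.
    """
    return parser.can_fetch(user_agent, url)

print(may_crawl("http://example.com/auctions/item1"))  # False: disallowed
print(may_crawl("http://example.com/index.html"))      # True: allowed
```

The "no entry sign" nature of the standard is visible in the code: `can_fetch` only reports what the site asked for, and the robot is free to ignore the answer.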

      That's how the creators of the standard intended it, says Martijn Koster, who helped develop the standard and maintains it to this day. Koster, a software engineer at Excite@Home, says the developers' main concern was preventing underpowered Web servers from being overloaded by out-of-control bots.

      "It's about server administrators providing information to a robot — that's where it ends," he says. The standard was never intended to be used to simply bar robots from accessing specific content, or to guarantee that visiting robots would comply with the robots.txt directives.

      Hands Off

      Last years eBay decision, however, suggests that where the standard left off, the courts may be willing to step in by granting legal force to the Robot Exclusion Standard.

      "We likened it to a no trespassing sign, and the court agreed," says Jay Monahan, eBay's legal counsel for intellectual property issues. eBay cited several other factors in its case, but Monahan says he saw the Robot Exclusion Standard as an important part of the company's argument.

      Koster has mixed feelings about sites employing the Robot Exclusion Standard to discriminate between bots and human users. Server operators should be able to turn to the Robot Exclusion Standard to curtail abuse, yet Koster also thinks using a robots.txt file merely to prevent bots from getting at publicly accessible content threatens the openness of the Internet.

      "I don't think that's in the spirit of free information exchange," Koster says. Some robots may have legitimate reasons to ignore robot exclusion directives. For example, he says, a company might use robots to hunt for copyright-infringing content.

      A bot should be allowed to view any publicly accessible pages as long as it's not harming a Web site, says Wolfgang Tolle, chief technology officer at Cyveillance. Cyveillance offers digital asset scouring services exactly of the sort Koster describes, although Tolle says Cyveillance's robots comply with the Robot Exclusion Standard.

      Another factor that may complicate legal arguments is that the Robot Exclusion Standard isn't a "standard" in the strictest sense, since it isn't sanctioned by any authority.

      Koster says he doesn't intend to push the Robot Exclusion Standard through a standards group, such as the World Wide Web Consortium, which might make it a required part of Web software. As an informal convention, it may have less weight in a courtroom — and to those who want the Internet to remain open, Koster says, that might not be such a bad thing.
