      3 New Rules to Block AI Bots from Invading Your Websites

      Written by

      Sam Rinko
      Published November 28, 2024

        eWEEK content and product recommendations are editorially independent. We may make money when you click on links to our partners. Learn More.

        Microsoft executives recently submitted a proposal to the Internet Engineering Task Force (IETF) that would protect web data by adding rules that distinguish AI training bots from other bots, letting website owners block unwanted AI web crawlers. Because AI models require massive amounts of training data, AI companies typically collect that data from public websites by sending AI crawlers across blog posts, product pages, videos, and other forms of web content.

        While tech companies argue that AI bots should be able to crawl publicly available data just like search engines, some website owners see it as an invasion of privacy, as they never consented to this data extraction. The new proposal offers three methods for blocking AI crawlers from invading your website: new robots.txt rules, application layer response header rules, and a Robots HTML meta tag.

        New Robots.txt Rules

        Microsoft’s proposal would create additional rules for the robots.txt file websites use to instruct web crawlers and search engine bots about which parts of the site they can and cannot crawl.

        “While the Robots Exclusion Protocol enables service owners to control how, if at all, automated clients known as crawlers may access the URIs on their services as defined by [RFC8288],” the draft proposal says, “the protocol doesn’t provide controls on how the data returned by their service may be used in training generative AI foundation models. Application developers are requested to honor these tags.”

        The proposal suggests the following values for controlling how AI bots interact with websites:

        • DisallowAITraining: Tells the parser not to use data for AI training
        • AllowAITraining: Tells the parser the data can be used for AI training

        These rules use the same path-matching logic as the standard Allow and Disallow rules, and the rule names are case-insensitive.
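As a rough illustration of how a crawler might read these two rules, here is a minimal Python sketch. It assumes plain path prefixes (no wildcards), merges every group whose User-agent line matches, and lets the longest matching path win, with AllowAITraining winning a tie. The function name `ai_training_allowed` and the sample file are hypothetical, and the draft itself may specify precedence differently.

```python
def ai_training_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return False if the best-matching AI-training rule disallows `path`.

    Simplified sketch: prefix matching only, all matching groups are merged.
    """
    applies = False           # does the current group target this agent?
    in_agent_lines = False    # are we in a run of consecutive User-agent lines?
    best_len, allowed = -1, True   # no matching rule means no restriction

    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        key = key.lower()     # rule names are case-insensitive

        if key == "user-agent":
            match = value == "*" or value.lower() in user_agent.lower()
            # Consecutive User-agent lines belong to the same group.
            applies = match or (in_agent_lines and applies)
            in_agent_lines = True
            continue

        in_agent_lines = False
        if applies and key in ("disallowaitraining", "allowaitraining"):
            is_allow = key == "allowaitraining"
            if value and path.startswith(value):
                longer = len(value) > best_len
                tie_for_allow = len(value) == best_len and is_allow
                if longer or tie_for_allow:
                    best_len, allowed = len(value), is_allow
    return allowed


# Hypothetical robots.txt: block training everywhere except /public/.
SAMPLE = """\
User-agent: *
DisallowAITraining: /
AllowAITraining: /public/
"""

if __name__ == "__main__":
    print(ai_training_allowed(SAMPLE, "examplebot", "/public/post"))
    print(ai_training_allowed(SAMPLE, "examplebot", "/private/post"))
```

A file with no matching rule leaves training unrestricted in this sketch. Whether that is the right default is a policy choice the article does not settle.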

        Application Layer Response Header

        The proposal also states that website owners should be able to set these same rules via an application-layer response header, that is, an HTTP header the server sends back along with a web resource. A crawler can read response headers without downloading the actual content, for example by sending a HEAD request. As with robots.txt, the rules are not case-sensitive.
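The Python sketch below shows one way a crawler could look for these rules in the headers of a response it has already received. The article does not give the header's name, so `Example-Robots-Directive` is a made-up placeholder, and the comma-separated value format is also an assumption. Only the two rule names come from the proposal.

```python
from typing import Mapping, Optional

# Hypothetical header name, for illustration only; the article does not
# say which header the proposal would use.
EXAMPLE_HEADER = "Example-Robots-Directive"


def directive_from_headers(
    headers: Mapping[str, str], header_name: str = EXAMPLE_HEADER
) -> Optional[str]:
    """Return 'disallow', 'allow', or None when no AI-training rule is set."""
    wanted = header_name.lower()
    for name, value in headers.items():
        if name.lower() != wanted:       # HTTP header names are case-insensitive
            continue
        # Assumption: several directives may be comma-separated in one value.
        for token in value.split(","):
            token = token.strip().lower()    # rule names are case-insensitive
            if token == "disallowaitraining":
                return "disallow"
            if token == "allowaitraining":
                return "allow"
    return None


if __name__ == "__main__":
    response_headers = {
        "Content-Type": "text/html",
        "example-robots-directive": "DISALLOWAITRAINING",
    }
    print(directive_from_headers(response_headers))
```

The function takes a plain mapping rather than making a network call, so it can be used with whatever HTTP client the crawler already has.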

        Robots HTML Meta Tag

        The third way the proposal offers to block AI crawlers from a website is to use the following HTML meta tags:

        • <meta name="robots" content="DisallowAITraining">
        • <meta name="examplebot" content="AllowAITraining">
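To show how a crawler could act on these tags, here is a small Python sketch built on the standard library's `html.parser`. It assumes a tag addressed to a specific bot (such as `examplebot`) takes precedence over the generic `robots` tag, and that a page with neither rule is unrestricted. Both are assumptions rather than behavior the article confirms, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect content tokens from <meta> tags named 'robots' or the bot's name."""

    def __init__(self, bot_name: str):
        super().__init__()
        self.names = {"robots", bot_name.lower()}
        self.directives = {}   # meta name -> list of lowercase content tokens

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        name = attr.get("name", "").lower()
        if name in self.names:
            tokens = [t.strip().lower() for t in attr.get("content", "").split(",")]
            self.directives.setdefault(name, []).extend(tokens)


def meta_allows_training(html: str, bot_name: str) -> bool:
    """Return False if the page's meta tags disallow AI training for this bot."""
    parser = RobotsMetaParser(bot_name)
    parser.feed(html)
    # Assumption: a bot-specific tag is checked before the generic one.
    for name in (bot_name.lower(), "robots"):
        tokens = parser.directives.get(name, [])
        if "allowaitraining" in tokens:
            return True
        if "disallowaitraining" in tokens:
            return False
    return True   # no rule found: treated as unrestricted in this sketch


PAGE = (
    '<html><head>'
    '<meta name="robots" content="DisallowAITraining">'
    '<meta name="examplebot" content="AllowAITraining">'
    '</head><body></body></html>'
)

if __name__ == "__main__":
    print(meta_allows_training(PAGE, "examplebot"))
    print(meta_allows_training(PAGE, "otherbot"))
```

With the two example tags from the article on one page, this sketch lets `examplebot` train on the page and blocks every other bot.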

        If the proposal's recommendations are adopted, website owners will have more control over whether AI bots can use their web pages as training data, provided AI developers honor the rules as requested. If enough website owners take advantage of these new rules and restrict AI bots, generative AI development could slow down.


        Sam Rinko
        Sam is a former SaaS sales rep turned technology journalist. He spent his career selling real estate technology to C-suite executives before switching over to writing, where he now covers a variety of enterprise IT topics. Sam specializes in today’s emerging tech, including AI, large language models, machine learning, and related technologies.