    Google Unveils Open-Source Gumbo HTML Parser Tool

    Written by

    Todd R. Weiss
    Published August 15, 2013

      Google is adding another open-source tool for developers with the release of its Gumbo HTML parser, which is a C implementation of the HTML5 parsing algorithm.

      The open-source code release was announced in an Aug. 14 post by Jonathan Tang, of the search features team, on the Google Open Source Blog.

      “One of the big accomplishments of the HTML5 standard was to standardize the HTML-parsing algorithm, so that all browsers see the same HTML document in the same way,” wrote Tang. “So far, most implementations of this algorithm have either been tied to specific browsers or rendering engines, or they’ve been written in specific scripting languages. This makes it hard to write quick one-off tools to manipulate and clean up HTML if you don’t happen to be working in a language that already has an HTML5-compatible parsing library.”

      That’s where Gumbo can be helpful, because it gives developers “a simple library that can serve as a basic building block for linters, refactoring tools, templating languages, page analysis and other small programs that need to manipulate HTML,” wrote Tang. “It’s written in pure C for ease of interfacing with other languages, and has no outside dependencies. Gumbo was built from the start to support source locations and correlating nodes in the parse tree with positions in the original text.”

      Developers can find additional details about Gumbo and its use, installation and more on the Gumbo project page.

      Gumbo conforms fully to the HTML5 specification and is robust and resilient to bad input, according to the Gumbo project page on GitHub. It also supports source locations and pointers back to the original text, and it has been tested on more than 2.5 billion pages from Google’s index.
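
      To make the idea of "source locations" concrete, here is a minimal sketch of correlating parsed tags with positions in the original text. It deliberately does not use Gumbo: it relies on Python's standard-library html.parser, which is not an HTML5-conformant parser, and the names TagLocator and locate_tags are hypothetical ones chosen for this illustration only.

```python
from html.parser import HTMLParser


class TagLocator(HTMLParser):
    """Records (tag, line, column) for every start tag encountered."""

    def __init__(self):
        super().__init__()
        self.locations = []

    def handle_starttag(self, tag, attrs):
        # getpos() returns the 1-based line number and 0-based column
        # at which the tag currently being handled begins.
        line, column = self.getpos()
        self.locations.append((tag, line, column))


def locate_tags(html):
    """Return a list of (tag, line, column) tuples for the given markup."""
    parser = TagLocator()
    parser.feed(html)
    parser.close()
    return parser.locations


if __name__ == "__main__":
    sample = "<html>\n  <body>\n    <h1>Hello</h1>\n  </body>\n</html>"
    for tag, line, column in locate_tags(sample):
        print(f"<{tag}> starts at line {line}, column {column}")
```

      A linter or refactoring tool of the kind Tang describes could use position data like this to point at the exact spot in a file that needs attention. Gumbo's own C API exposes its position information differently, so the project page remains the reference for real usage.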

      Gumbo is just one of many open-source tools and projects that Google has released to software developers in recent months.

      In June, Google released its open-source Cloud Playground environment where developers can quickly try out ideas on a whim, without having to commit to setting up a local development environment that’s safe for testing coding experiments outside the production infrastructure. The new Cloud Playground is presently limited to supporting Python 2.7 App Engine apps.

      Also in June, Google opened its Google Maps Engine API to developers so they can build consumer and business applications that incorporate the features and flexibility of Google Maps. By using the Maps Engine API, developers can now use Google’s cloud infrastructure to add their data on top of a Google Map and share that custom mash-up with consumers, employees or other users. The maps can then be shared internally by companies or organizations or be published on the Web.

      In May, Google’s Go open-source programming language was updated to Version 1.1, bringing developers new capabilities and performance improvements such as a race detector for finding concurrency bugs and new standard library functionality. Go 1.1 arrived 14 months after the release of the original 1.0 version in March 2012.

      There had been two minor “point releases” in between, but they fixed only critical issues and didn’t amount to a reworking of the language. The new version includes significant performance-related improvements, according to the release announcement, including optimizations in the compiler and linker, garbage collector, goroutine scheduler, map implementation and parts of the standard library.

      In April, Google released the open-source Android-based kernel code for its Glass project to encourage software developers to begin building Google Glass apps in earnest.

      In January, Google announced that it was moving its Google Cloud Platform (GCP) over to the GitHub collaborative development environment to make it easier for software developers to contribute and continue the evolution of GCP. The GCP program has been growing since Google unveiled a new partner program in July 2012 to help business clients discover all of Google’s available cloud services. GitHub is a rapidly growing collaborative software development platform for public and private code-sharing and hosting.
