Google Releases Open Source Tool That Checks Postgres Backup Integrity

Google’s page verification tool can help organizations discover data loss and corruption earlier in the change cycle, company says.

PostgreSQL Database Tool

Google has released a new open-source tool for verifying PostgreSQL (Postgres) database backups. 

Enterprises using PostgreSQL can use the tool to verify whether any data corruption or data loss has occurred when backing up their databases. Google already uses the tool for customers of Google Cloud SQL for Postgres. Starting this week, it is also available as open source code.

Brett Hesterberg, product manager at Google's cloud unit, and Alexis Guajardo, a senior software engineer at the company, described the new tool as a command-line utility that administrators can execute against a Postgres database.

"Since PostgreSQL version 9.3, it’s been possible to enable checksums on data pages to avoid ignoring data corruption," Hesterberg and Guajardo wrote in a blog July 11. "However, with the release of this utility, you can now verify all data files, online or offline," they said. 

A checksum, in the database context, is a small piece of data that administrators use to determine whether any errors or data corruption occurred while data was being backed up or transmitted. Though such errors are fairly common when deploying changes to a database, many organizations do not verify database backups, Hesterberg and Guajardo said. As a result, data loss is often one of the biggest risks organizations encounter when making database changes and backups, they said.

Google developed the Postgres page verification tool internally to mitigate issues stemming from Postgres database backups. The goal was to catch data loss and corruption early in the change cycle.

By releasing the tool to the open source community, Google is enabling other organizations using Postgres to guard against data loss and corruption during backups as well, Hesterberg and Guajardo noted. Organizations that use the tool to verify database backups will gain greater assurance that their backups are error-free in the event of a disaster.

In documentation, Google described the page verification tool as helping administrators verify checksums on PostgreSQL data pages without having to load each page to a shared cache. In order to use it, administrators must enable checksums when they initialize a new Postgres database cluster.  
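As a rough illustration of that prerequisite, the Python sketch below builds an `initdb` invocation with the standard `--data-checksums` option and checks the output of `pg_controldata` for the checksum setting. The function names and the example path are hypothetical and are not part of Google's tool. The parser assumes that `pg_controldata` reports a "Data page checksum version" line, with a nonzero value meaning checksums are on.

```python
def build_initdb_command(data_dir: str) -> list[str]:
    """Return an initdb command line that enables data page checksums.

    data_dir is a hypothetical path to the new cluster's data directory.
    """
    return ["initdb", "--data-checksums", "-D", data_dir]


def checksums_enabled(controldata_output: str) -> bool:
    """Report whether pg_controldata output indicates checksums are on.

    Assumes a line of the form 'Data page checksum version: N',
    where a nonzero N means checksums are enabled.
    """
    for line in controldata_output.splitlines():
        if line.startswith("Data page checksum version:"):
            return int(line.split(":", 1)[1].strip()) != 0
    # No such line found: treat checksums as not enabled.
    return False
```

The returned list can be passed to `subprocess.run` on a machine where the PostgreSQL binaries are installed. Text captured from `pg_controldata` can be handed to `checksums_enabled` to confirm the setting before running any verification.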

Once checksums are enabled, the tool computes its own checksum for each data page and compares it with the checksum Postgres stored for that page to confirm they are identical. Where the two checksums differ, the tool identifies the database page in which the error exists.
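The compare-and-report loop described above can be sketched in a few lines of Python. This is a simplified model, not Google's implementation. It uses a CRC-16 from Python's `binascii` module as a stand-in for PostgreSQL's actual page checksum algorithm. It also assumes an 8,192-byte page (the PostgreSQL default) with a 16-bit checksum field at byte offset 8 of the page header, and all function names are hypothetical.

```python
import binascii
import struct

PAGE_SIZE = 8192      # PostgreSQL's default block size
CHECKSUM_OFFSET = 8   # assumed offset of the 16-bit checksum in the page header


def toy_checksum(page: bytes, block_number: int) -> int:
    """Stand-in checksum, NOT PostgreSQL's real algorithm.

    CRC-16 over the page with the checksum field zeroed, mixed with the
    block number so a page copied to the wrong position is also detected.
    """
    zeroed = page[:CHECKSUM_OFFSET] + b"\x00\x00" + page[CHECKSUM_OFFSET + 2:]
    return binascii.crc_hqx(zeroed, 0) ^ (block_number & 0xFFFF)


def stamp_page(page: bytes, block_number: int) -> bytes:
    """Write the checksum into the page, as the database would on flush."""
    checksum = struct.pack("<H", toy_checksum(page, block_number))
    return page[:CHECKSUM_OFFSET] + checksum + page[CHECKSUM_OFFSET + 2:]


def find_corrupt_pages(data: bytes) -> list[int]:
    """Return block numbers whose stored checksum differs from the recomputed one."""
    bad = []
    for block_number in range(len(data) // PAGE_SIZE):
        start = block_number * PAGE_SIZE
        page = data[start:start + PAGE_SIZE]
        (stored,) = struct.unpack_from("<H", page, CHECKSUM_OFFSET)
        if stored != toy_checksum(page, block_number):
            bad.append(block_number)
    return bad


# Build a three-page "data file", then flip one byte in the second page.
clean = b"".join(stamp_page(bytes([n + 1]) * PAGE_SIZE, n) for n in range(3))
damaged = bytearray(clean)
damaged[PAGE_SIZE + 100] ^= 0xFF

print(find_corrupt_pages(clean))           # no mismatches expected
print(find_corrupt_pages(bytes(damaged)))  # reports the damaged block
```

Because the check reads raw bytes from a file, it does not need the database server to load pages into its cache. That is the property the documentation highlights.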

The Postgres page verification tool can run against a database continuously, but doing so can degrade performance. Google therefore recommends incorporating the tool into the backup process and running it on a separate server, Guajardo and Hesterberg stated.

The page verification tool can be run against online or offline databases and is fully integrated into Cloud SQL, Google's cloud-hosted database service. Organizations interested in using the tool can download it from either Google's Open Source site or GitHub.

Jaikumar Vijayan

Vijayan is an award-winning independent journalist and tech content creation specialist covering data security and privacy, business intelligence, big data and data analytics.