Facebook Open-Sources Hack Codegen

Facebook has open-sourced its library for automatically generating Hack code. Hack is a more scalable version of PHP, developed at Facebook.

Facebook announced it is open-sourcing its Hack Codegen library for generating Hack code and writing it into signed files that prevent undesired modifications.

Hack is a programming language Facebook developed for its HipHop Virtual Machine (HHVM) that interoperates with PHP. Hack reconciles the fast development cycle of PHP with the discipline provided by static typing, while adding many features commonly found in other modern programming languages, Facebook engineers said.

Hack provides instantaneous type checking by incrementally checking a developer’s files as they edit them. It typically runs in less than 200 milliseconds, making it easy for developers to integrate it into their development workflow without introducing a noticeable delay.

HHVM is an open-source virtual machine designed for executing programs written in Hack and PHP. HHVM uses a just-in-time (JIT) compilation approach to achieve superior performance while maintaining the development flexibility that PHP provides, Facebook said.

With much of its initial infrastructure based on PHP, Facebook developed Hack to help developers avoid many of the pitfalls of PHP development. “Every PHP programmer is familiar with day-to-day tasks that can be tricky or cumbersome,” wrote Julien Verlaguet and Alok Menghrajani in a post on the Facebook engineering blog when the company released Hack last year. Verlaguet is still a member of Facebook’s engineering staff working on Hack, while co-designer Menghrajani has since left the company.

“Traditionally, dynamically typed languages allow for rapid development but sacrifice the ability to catch errors early and introspect code quickly, particularly on larger codebases,” the duo wrote in the post from last year. “Conversely, statically typed languages provide more of a safety net, but often at the cost of quick iteration. We believed there had to be a sweet spot. Thus, Hack was born. We believe that it offers the best of both dynamically typed and statically typed languages, and that it will be valuable to projects of all sizes.”

Moreover, “Hack has sought to solve two important problem areas for PHP, namely improving code quality as code volumes scale up (e.g. by adding stronger type annotation), and improving the performance of Web applications (e.g. by improving compilation to the native hardware),” said Al Hilwa, an analyst with IDC, in a report about Facebook's software development prowess.

“Hack and HHVM have also contributed to open source and have had a significant impact on the PHP world which has historically moved very slowly in evolving the PHP programming language and its underlying technology,” Hilwa added. “Hack/HHVM is yet another important dimension where Facebook has been flexing its technology muscles and sharing with the world the fruits of its innovation. Hack and HHVM have improved the state of PHP and in the process improved the state of the art in server-side Web software development.”

Enter Hack Codegen, which automatically generates Hack code.

“Being able to generate code through automated code generation allows programmers to increase the level of abstraction by making frameworks that are declarative and that are translated into high-quality Hack code,” said Alejandro Marcu, a software engineer at Facebook working on Hack, in an August 20 blog post. “We've been using Hack Codegen at Facebook for a while. After seeing so much internal success, we open-sourced this library so that more people could take advantage of it.”

Marcu said before Facebook created Hack Codegen, the company generated code through concatenating strings and a few helper functions. “On the product infrastructure team at Facebook, we had been looking into how to improve one of our internal systems for reading and writing data,” he said.

“The solution we came up with was to write a higher-level abstraction, a schema, that would hold a detailed description of the object type,” Marcu wrote in the post. “Then, we could write a script that would generate the node, mutator, loader, tests, etc., directly from the schema, as well as to set up the storage (e.g., MySQL db).”
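The schema-driven approach Marcu describes can be illustrated with a minimal sketch. It is written in Python for brevity and is not the Hack Codegen API: the `ClassBuilder` helper, the dictionary schema format, and the `User` example are all hypothetical names invented for this illustration. The point is only that a declarative description goes in and Hack source text comes out, with the formatting rules kept in one place.

```python
from dataclasses import dataclass, field


@dataclass
class ClassBuilder:
    """Hypothetical builder for illustration; not the Hack Codegen API."""

    name: str
    properties: list = field(default_factory=list)

    def add_property(self, hack_type: str, prop: str) -> "ClassBuilder":
        # Record one (type, name) pair; returning self allows chaining.
        self.properties.append((hack_type, prop))
        return self

    def render(self) -> str:
        # Emit a Hack class whose constructor declares every property.
        lines = [
            "<?hh // strict",
            "",
            f"final class {self.name} {{",
            "  public function __construct(",
        ]
        for hack_type, prop in self.properties:
            lines.append(f"    private {hack_type} ${prop},")
        lines.append("  ) {}")
        lines.append("}")
        return "\n".join(lines) + "\n"


def generate_from_schema(schema: dict) -> str:
    """Turn a declarative schema (assumed format) into Hack source text."""
    builder = ClassBuilder(schema["name"])
    for prop, hack_type in schema["fields"].items():
        builder.add_property(hack_type, prop)
    return builder.render()


# Assumed schema shape: a class name plus a mapping of field name to Hack type.
USER_SCHEMA = {"name": "User", "fields": {"id": "int", "name": "string"}}
GENERATED = generate_from_schema(USER_SCHEMA)
print(GENERATED)
```

Adding a field to the schema changes the generated class without anyone touching the output by hand, which is the property the team was after.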

Marcu noted that the team realized early on that they would need a good library to generate code, since concatenating strings to generate code doesn't really scale. “At the time, we didn't do that much code generation at FB, mostly dumping values into arrays, so we didn't have any good tools except for signing files,” he said.
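The idea behind signing a generated file can be sketched briefly as well. The example below is an assumption-laden illustration in Python, not Facebook's actual signature format: the `@signature<<...>>` marker, the use of SHA-256, and the function names are all hypothetical. The generator hashes the file content with a fixed placeholder in the signature slot, then writes the hash into that slot; a checker reverses the substitution and compares. Strictly speaking, this detects manual edits rather than blocking them.

```python
import hashlib
import re

# Hypothetical marker format; the real library's format may differ.
PLACEHOLDER = "@signature<<" + "0" * 64 + ">>"
SIGNATURE_RE = re.compile(r"@signature<<([0-9a-f]{64})>>")


def _digest(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def sign(body: str) -> str:
    """Prepend a header comment carrying a hash of the whole file."""
    # Simplification: the header is placed on the first line of the file.
    unsigned = f"// {PLACEHOLDER}\n{body}"
    # Hash the file with the placeholder in place, then fill in the hash.
    signature = f"@signature<<{_digest(unsigned)}>>"
    return unsigned.replace(PLACEHOLDER, signature, 1)


def is_unmodified(signed: str) -> bool:
    """Return True only if the embedded hash matches the current content."""
    match = SIGNATURE_RE.search(signed)
    if match is None:
        return False
    # Put the placeholder back so the hash is computed over the same text.
    restored = signed[: match.start()] + PLACEHOLDER + signed[match.end():]
    return _digest(restored) == match.group(1)


signed_file = sign("function answer(): int { return 42; }\n")
tampered_file = signed_file.replace("42", "43")
print(is_unmodified(signed_file), is_unmodified(tampered_file))
```

A check like this could run in a lint step or commit hook so that hand edits to generated files are flagged and the change is made in the generator instead.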

That motivated the engineering team to complete the Hack Codegen project. Over the past year and a half, Facebook migrated nearly its entire PHP codebase to Hack, thanks to both organic adoption and a number of homegrown refactoring tools.

“After seeing so much internal use of Hack Codegen for diverse applications, it's our pleasure to open-source this library for the external community to use,” Marcu said. The library is available on GitHub.

“It is interesting that we are getting back into automated code generation which was a big deal in a different era,” Hilwa said. “I think Codegen is a valuable new tool that can help developer productivity in the right settings. The issues typically for automated code generation have typically been learning the abstraction itself, and then the ability to accommodate changes in the generated code. Newer technologies are constantly innovating around these issues.”