LinkedIn Open-Sources Rocket Data, LayoutKit and More

LinkedIn has open-sourced a series of key infrastructure technologies, including its Rocket Data caching system.


LinkedIn, which builds much of the infrastructure software that runs its social networking service, this week continued its strategy of open-sourcing technology by contributing Rocket Data to the community.

Rocket Data is a non-blocking, immutable model management system with a persistent synchronization layer. It can use any cache and has a simple API that hooks up easily to key-value stores, said Peter Livesey, a LinkedIn staff engineer, in a blog post.

Livesey said that in early 2015, LinkedIn started rewriting its flagship mobile application and wanted a caching system that would present content to the user while data was loading from the network.

LinkedIn looked at Core Data, Apple's object graph and persistence framework, which is commonly used for this problem, he said. The company has used Core Data in some applications, but Livesey said his team found it lacking for their needs. Among their concerns were that Core Data models are not thread-safe, that the framework does not scale well in large applications and that the database needs to be migrated whenever the schema changes.

"Core Data is a powerful framework, but it pays for this power with complexity," Livesey said. "The framework is notorious for crashing the application when something goes wrong."

Also, Livesey's team wanted to use immutable models. "Core Data's programming model relies on mutability, but we wanted to embrace immutability," he said.

For the new caching system, LinkedIn decided on the following requirements: immutable, thread-safe models; consistency among models both in memory and in the cache, so that when a model is updated, all other instances of that model are updated as well; non-blocking access on all reads and writes; a simple eviction strategy; the ability to scale well with a large number of model types, schema changes and listeners; and automatic migrations.
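The first two requirements, immutable models that stay consistent everywhere they are displayed, can be illustrated with a minimal Python sketch. This is not Rocket Data's API (Rocket Data is an iOS library); the `Profile` model, the `ConsistencyManager` class and its methods are hypothetical names invented for illustration, and the sketch is single-threaded, so it does not show the non-blocking or persistence requirements.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Profile:
    """Immutable model. The name and fields are hypothetical."""
    id: str
    name: str
    headline: str


class ConsistencyManager:
    """Hypothetical helper: stores the latest version of each model
    and pushes every new version to all registered listeners."""

    def __init__(self) -> None:
        self._cache: Dict[str, Profile] = {}
        self._listeners: Dict[str, List[Callable[[Profile], None]]] = {}

    def listen(self, model_id: str, callback: Callable[[Profile], None]) -> None:
        # A screen registers interest in one model ID.
        self._listeners.setdefault(model_id, []).append(callback)

    def update(self, model: Profile) -> None:
        # Models are never mutated; a new value replaces the old one
        # and is handed to every listener for that ID.
        self._cache[model.id] = model
        for callback in self._listeners.get(model.id, []):
            callback(model)

    def get(self, model_id: str) -> Optional[Profile]:
        return self._cache.get(model_id)


# Two "screens" showing the same model stay in sync after an update.
manager = ConsistencyManager()
screens: Dict[str, Profile] = {}
manager.listen("42", lambda m: screens.__setitem__("feed", m))
manager.listen("42", lambda m: screens.__setitem__("profile", m))

original = Profile(id="42", name="Ada", headline="Engineer")
manager.update(original)
manager.update(replace(original, headline="Staff Engineer"))
```

Because each update produces a new frozen value rather than changing the old one, a reader on another thread can never observe a half-modified model, which is the property the team said Core Data's mutable models lacked.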

The team then looked at other options, including a simple URL cache, Realm and simply serializing the models to disk. However, none of these addressed the team's requirements or kept immutable models consistent, so the company set about building Rocket Data.

"Using this caching system, we've been able to easily add caching to all features with very little additional work from developers," Livesey said in his post. "The cache and data providers are automatically kept consistent across screens. Despite updating the schemas of multiple models every week, we have never needed to add code for any migrations. And best of all, our application has never crashed due to a Core Data exception."

Last month, LinkedIn open-sourced LayoutKit, a declarative view layout library for iOS applications, said Nick Snyder, a LinkedIn engineer and co-creator of LayoutKit, in a blog post.

Snyder said the performance of LinkedIn's iOS app was lagging and the team realized that the main thread was spending a significant amount of time running Auto Layout. Auto Layout is a layout engine provided by iOS that dynamically calculates the size and position of views on the screen by solving a system of constraints, he said.

Initially, they tweaked the code and got some performance improvement, but not enough. They also considered manually writing layout code, "but we had also learned that manual layout code can be difficult to maintain," Snyder said. "We knew that an ideal solution would encapsulate the layout calculations into reusable components while maintaining good performance."

Snyder said LinkedIn had four requirements for a new layout engine: It needed to be fast; it needed to support the Swift programming language because it is used extensively at LinkedIn; it needed to be maintained and have non-trivial adoption; and it needed to have an acceptable license.

"We were not able to find a project that had all of the features we were looking for, so we built LayoutKit," Snyder said.

According to Snyder, LayoutKit is fast, asynchronous, declarative and cacheable. It also is tested and production ready, he said. LinkedIn open-sourced the technology at the end of June.
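To make "declarative and cacheable" concrete, here is a rough Python sketch of the general idea: a layout is described as immutable data, and a pure function turns that description into frames, so the result can be computed off the main thread and memoized. This is not LayoutKit's API or algorithm (LayoutKit is a Swift library); `Label`, `Stack`, `Frame`, `measure` and `arrange` are hypothetical names, and the fixed-width text measurement is a simplifying assumption.

```python
from dataclasses import dataclass
from functools import lru_cache
from typing import Tuple


@dataclass(frozen=True)
class Label:
    """Hypothetical leaf layout; assumes fixed-width glyphs."""
    text: str
    char_width: int = 8
    line_height: int = 20


@dataclass(frozen=True)
class Stack:
    """Hypothetical vertical stack of labels."""
    children: Tuple[Label, ...]
    spacing: int = 4


@dataclass(frozen=True)
class Frame:
    x: int
    y: int
    width: int
    height: int


def measure(label: Label, max_width: int) -> Tuple[int, int]:
    # Wrap the text into as many lines as needed to fit max_width.
    chars_per_line = max(1, max_width // label.char_width)
    lines = max(1, -(-len(label.text) // chars_per_line))  # ceiling division
    width = min(max_width, len(label.text) * label.char_width)
    return width, lines * label.line_height


@lru_cache(maxsize=128)
def arrange(stack: Stack, max_width: int) -> Tuple[Frame, ...]:
    # Pure function of immutable inputs: no views are touched, so the
    # result can be computed on any thread and cached by its arguments.
    frames = []
    y = 0
    for child in stack.children:
        width, height = measure(child, max_width)
        frames.append(Frame(0, y, width, height))
        y += height + stack.spacing
    return tuple(frames)


layout = Stack(children=(Label("Hello"), Label("x" * 30)))
frames = arrange(layout, 80)
```

Calling `arrange(layout, 80)` a second time returns the cached tuple without recomputing anything, which is the kind of saving a constraint solver that runs on every pass cannot offer.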

Also at the end of last month, LinkedIn open-sourced URL-Detector, a Java library that detects and normalizes URLs in text. Because of the scale of its service, LinkedIn checks hundreds of thousands of URLs for malware and phishing every second.

"In order to guarantee that our members have a safe browsing experience, all user-generated content is checked by a backend service for potentially dangerous content," said Tzu-Han Jan, a senior software engineer at LinkedIn, in a blog post. "As a prerequisite for us to be able to check URLs for bad content at this scale, we need to be able to extract URLs in text at scale."

To handle this at scale, LinkedIn created its URL-Detector.
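What "detect and normalize URLs in text" means in practice can be shown with a small Python sketch. It is not URL-Detector's algorithm or API (that library is written in Java); it uses a deliberately naive regular expression that only finds `http` and `https` URLs, and the function names `detect_urls` and `normalize` are hypothetical.

```python
import re
from typing import List
from urllib.parse import urlsplit, urlunsplit

# Naive pattern (an assumption for this sketch): an http(s) scheme
# followed by any run of characters that are not whitespace, angle
# brackets or quotes.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def normalize(url: str) -> str:
    # Lowercase the scheme and host, drop the fragment and any
    # user:password part, and use "/" for an empty path.
    # Note: urlsplit and .port can raise ValueError on malformed input.
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    return urlunsplit((parts.scheme.lower(), netloc, parts.path or "/", parts.query, ""))


def detect_urls(text: str) -> List[str]:
    urls = []
    for match in URL_PATTERN.finditer(text):
        # Trailing punctuation usually belongs to the sentence, not the URL.
        candidate = match.group(0).rstrip(".,;:!?)")
        urls.append(normalize(candidate))
    return urls


found = detect_urls("See HTTP://Example.COM/a?b=1#frag, and https://linkedin.com.")
```

Normalizing before checking means that differently written forms of the same address, such as mixed-case hosts, map to one key, so a backend service can look each distinct URL up once.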