Alluxio: Product Overview and Insight

eWEEK PRODUCT OVERVIEW AND INSIGHT: Alluxio develops open source data orchestration software for the cloud.


Today: Alluxio (data orchestration software)

Company description

Proven in production at global web scale for modern data services, Alluxio is the developer of open source data orchestration software for the cloud. Alluxio moves data closer to big data and machine learning compute frameworks across clusters, regions, clouds and countries, providing memory-speed data access to files and objects. Venture-backed by Andreessen Horowitz and Seven Seas Partners, Alluxio was founded at UC Berkeley’s AMPLab by the creators of the Tachyon open source project and launched in 2015. The company is based in San Mateo, Calif.; Haoyuan Li is its founder and CEO.

Markets: Alluxio's intelligent data tiering and data management deliver consistent high performance to customers across industries, including financial services, high tech, retail, healthcare and telecommunications.

Product and Services

Alluxio claims that its data orchestration platform solves the challenges that have emerged from decoupled architectures in modern workloads. With the rise of compute-intensive workloads and cloud adoption, enterprises must be able to scale compute independently from storage.

Alluxio sits as a layer between compute and storage, bringing data sets closer to compute. It addresses the problems of decoupling by bringing data locality, data accessibility and data elasticity to compute across data silos, zones, regions and even clouds.
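
The idea of a layer between compute and storage can be pictured as a read-through cache. The Python below is a toy illustration only and does not represent Alluxio's implementation; the class name ReadThroughLayer, the store_reads counter and the dictionary standing in for a remote object store are all assumptions made for this sketch.

```python
from typing import Callable, Dict


class ReadThroughLayer:
    """Toy cache that sits between a compute job and a slower backing store."""

    def __init__(self, fetch_from_store: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_store      # hypothetical call into remote storage
        self._cache: Dict[str, bytes] = {}  # copies of data held close to compute
        self.store_reads = 0                # number of trips to the backing store

    def read(self, path: str) -> bytes:
        # Serve from the local copy when one exists.
        if path in self._cache:
            return self._cache[path]
        # Otherwise fetch once from the backing store and keep a copy.
        data = self._fetch(path)
        self.store_reads += 1
        self._cache[path] = data
        return data


# A dictionary stands in for an object store in this sketch.
remote_store = {"s3://bucket/events.csv": b"id,value\n1,42\n"}
layer = ReadThroughLayer(lambda path: remote_store[path])

first = layer.read("s3://bucket/events.csv")   # fetched from the store
second = layer.read("s3://bucket/events.csv")  # served from the layer
```

In this sketch the second read never touches the backing store, which is the data-locality benefit the paragraph above describes in general terms.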

Key Features

Alluxio 2.0 provides next-gen data orchestration innovation for multi-cloud with: 

  • Policy-driven data management for automating data movement across storage systems 
  • Highly efficient data movement across cloud stores such as AWS S3 and Google Cloud Storage (GCS), so expensive operations on object stores are seamless to compute frameworks

It optimizes compute data access for cloud analytics with:

  • Compute-focused cluster partitioning, so users can partition a cluster by framework, which reduces data transfer costs
  • Integration with external data sources over REST so users can bring in data from web-based sources to aggregate in Alluxio and run analytics

And other key 2.0 features include:

  • Alluxio Data Service, a distributed clustered service that enables data operations such as replication and persistence for high performance and massive scale
  • Adaptive replication for increased data locality, which lets users configure a range for the number of copies of data stored in Alluxio; the copies are then managed automatically
  • High availability with embedded journal, a new fault-tolerance and high-availability mode for file and object metadata that uses the Raft consensus algorithm and is independent of any external storage system; it is particularly useful for abstracting object storage. [Editor’s note: Raft is a distributed consensus algorithm designed to be easy to understand. It solves the problem of getting multiple servers to agree on a shared state even in the face of failures.]
  • A POSIX-compatible API so that frameworks such as TensorFlow, Caffe and other Python-based models can directly access data from any storage system via Alluxio using traditional file system access.
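
As an illustration of what traditional file system access means for Python code, here is a minimal sketch. It assumes the storage has already been exposed as a local directory; the mount point /mnt/alluxio, the file name and the helper count_lines are hypothetical and do not come from Alluxio's documentation.

```python
from pathlib import Path


def count_lines(mount_root: str, relative_path: str) -> int:
    """Count lines in a file using only standard file system calls."""
    path = Path(mount_root) / relative_path
    # Ordinary open/read; nothing here is specific to any storage system.
    with path.open("r", encoding="utf-8") as handle:
        return sum(1 for _ in handle)


# Example call (hypothetical mount point and file name):
# count_lines("/mnt/alluxio", "training/labels.txt")
```

The point of the sketch is that the calling code needs no storage-specific client library, only the path under which the data is visible.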

Insight and Analysis

No recorded product reviews as of Aug. 28, 2020. We will update soon.

List of current customers: Alluxio is used today at Alibaba Cloud, China Unicom, Development Bank of Singapore, WalmartLabs, Ryte, Bazaarvoice, Rakuten, EA and many more. 

Other key players in this market:

Databricks (Spark), Starburst Data (Presto), Qubole, Hammerspace, Scality


Software: Cloud, on-premises and hybrid 


For deployment in the cloud, Alluxio offers pay-as-you-go pricing (per software instance/hour).

For enterprise deployments, the company also offers software subscriptions.

Contact information for potential customers: 

Email:  [email protected]
