Today: Company Name: Alluxio (data orchestration software)
Alluxio develops open source data orchestration software for the cloud that has been proven in production at global web scale for modern data services. Alluxio moves data closer to big data and machine learning compute frameworks across clusters, regions, clouds and countries, providing memory-speed data access to files and objects. Venture-backed by Andreessen Horowitz and Seven Seas Partners, Alluxio was founded at UC Berkeley’s AMPLab by the creators of the Tachyon open source project and launched in 2015. The company is based in San Mateo, Calif.; Haoyuan Li is its founder and CEO.
Markets: Intelligent data tiering and data management deliver consistent high performance to customers across all industries including financial services, high tech, retail, healthcare and telecommunications.
Product and Services
Alluxio claims that its data orchestration platform solves the challenges created by decoupling compute from storage in modern workloads. With the rise of compute-intensive workloads and cloud adoption, enterprises must be able to scale compute independently from storage.
Alluxio sits as a layer between compute and storage, bringing data sets closer to compute. It addresses the decoupling problem by providing data locality, data accessibility and data elasticity to compute across data silos, zones, regions and even clouds.
Alluxio 2.0 provides next-gen data orchestration innovation for multi-cloud with:
- Policy-driven data management for automating data movement across storage systems
- Highly efficient data movement across cloud stores such as AWS S3 and Google Cloud Storage (GCS), so that expensive operations on object stores are seamless to compute frameworks
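To make the idea of policy-driven data movement concrete, here is a minimal sketch of how a rule such as "move files that have been idle for 30 days from one store to another" could be evaluated. This is not Alluxio's API or policy syntax; the `FileInfo` and `MovePolicy` classes, the `plan_moves` function and the store names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass
class FileInfo:
    """Hypothetical metadata record for one file."""
    path: str
    store: str          # name of the storage system currently holding the file
    last_access: float  # Unix timestamp of the most recent read


@dataclass
class MovePolicy:
    """Hypothetical rule: move idle files from `source` to `target`."""
    source: str
    target: str
    max_idle_days: float


def plan_moves(files, policy, now):
    """Return (path, source, target) for every file the policy selects.

    A file is selected when it lives in the policy's source store and
    was last accessed before the idle cutoff.
    """
    cutoff = now - policy.max_idle_days * SECONDS_PER_DAY
    return [
        (f.path, policy.source, policy.target)
        for f in files
        if f.store == policy.source and f.last_access < cutoff
    ]


if __name__ == "__main__":
    now = 100 * SECONDS_PER_DAY
    files = [
        FileInfo("/logs/recent.parquet", "s3-hot", 95 * SECONDS_PER_DAY),
        FileInfo("/logs/old.parquet", "s3-hot", 10 * SECONDS_PER_DAY),
    ]
    policy = MovePolicy(source="s3-hot", target="gcs-archive", max_idle_days=30)
    for move in plan_moves(files, policy, now):
        print(move)
```

The sketch only plans the moves; an actual orchestration layer would also carry out the copies and keep the compute-facing path unchanged, which is the part the bullets above describe as seamless to compute frameworks.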
It optimizes compute data access for cloud analytics with:
- Compute-focused cluster partitioning, so users can partition a cluster by framework, which reduces data transfer costs
- Integration with external data sources over REST so users can bring in data from web-based sources to aggregate in Alluxio and run analytics
And other key 2.0 features include:
- Alluxio Data Service, a distributed clustered service that enables data operations such as replication and persistence for high performance and massive scale
- Adaptive replication for increased data locality, which lets users configure a range for the number of copies of data stored in Alluxio; the copies are then managed automatically
- High availability with embedded journal, a new fault-tolerance and high-availability mode for file and object metadata that uses the Raft consensus algorithm and is independent of any external storage system; it is particularly useful for abstracting object storage. [Editor’s note: Raft is a distributed consensus algorithm that was designed to be easily understood. It solves the problem of getting multiple servers to agree on a shared state even in the face of failures.]
- A POSIX-compatible API so that frameworks such as TensorFlow, Caffe and other Python-based frameworks can directly access data from any storage system via Alluxio using traditional file system access.
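As an illustration of what "traditional file system access" means for the POSIX-compatible API in the last bullet, the sketch below reads data with nothing but Python's standard file calls. It assumes the Alluxio namespace has already been mounted at a local directory; the mount path `/mnt/alluxio` and the helper names `read_head` and `list_files` are hypothetical, and the demo uses a temporary directory as a stand-in so it runs without any Alluxio installation.

```python
import os
import tempfile


def read_head(mount_root, relative_path, num_bytes=64):
    """Read the first bytes of a file using ordinary file I/O."""
    full_path = os.path.join(mount_root, relative_path)
    with open(full_path, "rb") as f:
        return f.read(num_bytes)


def list_files(mount_root):
    """Return the sorted names of regular files directly under the mount root."""
    return sorted(
        name
        for name in os.listdir(mount_root)
        if os.path.isfile(os.path.join(mount_root, name))
    )


if __name__ == "__main__":
    # In a real deployment, mount_root would be the local mount point
    # (for example the hypothetical /mnt/alluxio). A temporary directory
    # stands in for it here.
    with tempfile.TemporaryDirectory() as mount_root:
        with open(os.path.join(mount_root, "train.csv"), "wb") as f:
            f.write(b"label,feature\n1,0.5\n")
        print(list_files(mount_root))
        print(read_head(mount_root, "train.csv", num_bytes=13))
```

The point of the design is that the calling code contains no storage-specific client: a training script that already reads local files would need only a different path.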
Insight and Analysis
No recorded product reviews as of Aug. 28, 2020. We will update soon.
Other key players in this market:
For deployment in the cloud, Alluxio offers pay-as-you-go pricing (per software instance/hour)
For enterprise deployments, the company also offers software subscriptions:
Contact information for potential customers:
eWEEK is building a new IT products and services section that encompasses most of the categories that we cover on our site. In it, we will spotlight the leaders in each sector, which include enterprise software, hardware, security, on-premises-based systems and cloud services. We also will add promising new companies as they come into the market.