
    Why Object Storage Can Be Optimal for AI, Machine Learning Workloads

    By Chris Preimesberger - November 24, 2017

      If IT were a television show, it would be “Hoarders.” Organizations are creating and storing more and more data every day, and they’re having difficulty finding effective places to put it all.

      In fact, according to research by IDC, by 2020 we will hit the 44 zettabyte mark, with about 80 percent of the data not in databases. With such unprecedented data growth, IT teams are looking for flexible, scalable, easily manageable ways to preserve and protect that data. This is where object storage shines.

      Object storage (also known as object-based storage) is a storage architecture that manages data as objects. This contrasts with file systems, which manage data as a file hierarchy, and with block storage, which manages data as blocks within sectors and tracks. Each object typically includes the data itself, a variable amount of metadata, and a globally unique identifier.
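
The object model described above (data, metadata and a globally unique identifier in a flat namespace) can be illustrated with a minimal in-memory sketch. The names `StoredObject` and `FlatObjectStore` are hypothetical and chosen only for illustration; this is not any vendor's API, and a real system would persist and replicate the objects.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """One object: the payload plus a free-form metadata dictionary."""
    data: bytes
    metadata: dict = field(default_factory=dict)


class FlatObjectStore:
    """Toy in-memory store (hypothetical): a flat namespace keyed by unique ID."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes, **metadata) -> str:
        # The store, not the caller, assigns the identifier.
        object_id = str(uuid.uuid4())
        self._objects[object_id] = StoredObject(data, dict(metadata))
        return object_id

    def get(self, object_id: str) -> StoredObject:
        # Lookup is by ID only; there is no directory path to walk.
        return self._objects[object_id]


store = FlatObjectStore()
oid = store.put(b"frame-0001", source="dashcam", label="training")
print(oid, store.get(oid).metadata)
```

Because every object is addressed by its identifier rather than by a position in a directory tree, the namespace has no hierarchy to rebalance as it grows.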

      Companies that specialize in—or at least offer—object storage options include Cloudian, Pure Storage, Digital Ocean, IBM/Cisco, Dell EMC Virtustream, Spectra Logic, SwiftStack, Qumulo, Minio, NetApp, Hitachi Data Systems, Cohesity and Veritas, among others.

      Michael Tso, CEO and co-founder of Cloudian and a man who knows his market well, provided eWEEK with industry information on exactly why he believes object storage systems are the most efficient for big data-type workloads, including the machine learning and artificial intelligence use cases that are becoming more common all the time.

      Here are eight specific storage requirements of these data sets, and why AI and ML applications demand the data management capabilities supplied by enterprise object storage solutions.

      Storage Requirement No. 1: Scalability

      AI systems can process vast amounts of data in a short timeframe. Furthermore, larger data sets deliver better algorithms. The combination drives significant storage demands. Microsoft taught computers to speak using five years of continuous speech recordings. Tesla is teaching cars to drive with 1.3 billion miles of driving data. Managing these data sets requires a storage system that can scale without limits.

      How Object Storage Helps: Object storage is the only storage type that scales limitlessly within a single namespace. Plus, the modular design allows storage to be added at any time, so you can scale with demand, rather than ahead of demand.

      Storage Requirement No. 2: Cost Efficiency

      A useful storage system must be both scalable and affordable, two attributes that don’t always co-exist in enterprise storage: historically, highly scalable systems have been more expensive on a cost/capacity basis.

      How Object Storage Helps: Object storage is built on the industry’s lowest-cost hardware platform. Add in low management overhead and space-saving data compression features, and the result is a cost 70 percent lower than that of traditional enterprise disk storage.

      Storage Requirement No. 3: Software-defined Storage Options

      Vast data sets will sometimes require hyperscale data centers with purpose-built server architectures already in place. Other configurations may benefit from the simplicity of pre-configured appliances. 

      How Object Storage Helps: Object storage keeps your deployment options open, with your choice of storage appliances or software-defined storage.

      Storage Requirement No. 4: Hybrid Architecture

      Different data types have varying performance requirements, and the hardware must reflect that. Systems must include the right mix of storage technologies to meet the simultaneous needs for scale and performance, rather than a homogeneous approach that will fall short.

      How Object Storage Helps: Object storage employs a hybrid architecture, with spinning disks for user data and SSDs for performance-sensitive metadata, thus optimizing cost and performance.

      Storage Requirement No. 5: Parallel Architecture

      For data sets that grow without limits, a parallel-access architecture is essential. Otherwise, the system will develop choke points that limit growth.

      How Object Storage Helps: Object storage employs a shared-nothing cluster architecture, which means that all parts of the system work in parallel. Data throughput grows continuously as the system expands.
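
As a rough illustration of why a shared-nothing design avoids a central choke point, the sketch below places each object on a node purely by hashing its key, so any client can compute the location without consulting a coordinator. The function `node_for`, the key names and the four-node cluster are assumptions made for this example, not a description of a specific product.

```python
import hashlib
from collections import Counter


def node_for(key: str, node_count: int) -> int:
    """Pick a node from the key alone (hypothetical placement rule)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the hash as an integer, then map it to a node.
    return int.from_bytes(digest[:8], "big") % node_count


# Spread 10,000 made-up object keys over an assumed four-node cluster.
keys = [f"recording-{i:05d}" for i in range(10_000)]
placement = Counter(node_for(k, 4) for k in keys)
print(sorted(placement.items()))
```

Each node ends up with roughly a quarter of the keys, so reads and writes are spread across all nodes. Simple modulo placement relocates most keys when the node count changes, which is why production systems generally use schemes such as consistent hashing instead.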

      Storage Requirement No. 6: Data Durability

      Backing up a multi-petabyte training data set is not feasible; it would usually be cost- and time-prohibitive. But you can’t leave it unprotected, either. Instead, the storage system needs to be self-protecting.

      How Object Storage Helps: Object storage is designed with redundancy built-in, so data is protected without requiring a separate backup process. Furthermore, you can select the level of data protection needed for each data type to optimize efficiency. Systems can be configured to tolerate multiple node failures or even the loss of an entire data center. 
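
The trade-off behind choosing a protection level per data type can be made concrete with a little arithmetic. The sketch below compares two common redundancy schemes, full replication and erasure coding. The article does not say which schemes a given product uses, so both the schemes and the parameter values here are assumptions for illustration.

```python
def replication(copies: int) -> dict:
    """Keep `copies` full copies of every object."""
    return {
        "raw_per_usable": float(copies),   # raw capacity consumed per usable byte
        "failures_tolerated": copies - 1,  # copies that can be lost with data intact
    }


def erasure_coding(data_fragments: int, parity_fragments: int) -> dict:
    """Split each object into k data fragments plus m parity fragments."""
    total = data_fragments + parity_fragments
    return {
        "raw_per_usable": total / data_fragments,
        "failures_tolerated": parity_fragments,
    }


usable_tb = 1000  # assumed size of the data set, in terabytes
for name, scheme in [("3x replication", replication(3)),
                     ("4+2 erasure coding", erasure_coding(4, 2))]:
    raw_tb = usable_tb * scheme["raw_per_usable"]
    print(f"{name}: {raw_tb:.0f} TB raw, "
          f"tolerates {scheme['failures_tolerated']} failures")
```

Under these assumptions both schemes survive two simultaneous failures, but replication needs 3,000 TB of raw capacity for 1,000 TB of data while the erasure-coded layout needs 1,500 TB.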

      Storage Requirement No. 7: Data Locality

      While some training data will reside in the cloud, much of it will remain in the data center for a variety of reasons: performance, cost, and regulatory compliance are three of them. To be competitive, on-premises storage must offer the same cost and scalability benefits as its cloud-based counterpart.

      How Object Storage Helps: Object storage is the storage of the cloud. Many cloud providers use it as their public cloud infrastructure. Cloud scalability and economics are now available on-premises.

      Storage Requirement No. 8: Cloud Integration

      Regardless of where data resides, cloud integration will still be an important requirement for two reasons. First, much of the AI/ML innovation is occurring in the cloud. On-premises systems that are cloud-integrated will provide the greatest flexibility to use cloud-native tools. Second, we are likely to see a fluid flow of data to and from the cloud as information is generated and analyzed. An on-premises solution should simplify that flow, not limit it.

      How Object Storage Helps: Object storage should be cloud-integrated in three ways. First, solutions may employ the S3 API, the de facto standard language of cloud storage. Second, they may facilitate tiering to and from Amazon, Google, and Microsoft public clouds, and let you view local and cloud-based data within a single namespace. Third, data stored to the cloud should be accessible directly from cloud-based applications. This bi-modal access lets you employ both cloud and on-prem resources interchangeably.
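
The idea of viewing local and cloud-based data within a single namespace can be sketched as follows. The class `TieredNamespace` is hypothetical, and two plain dictionaries stand in for an on-premises cluster and a public-cloud bucket. A real deployment would talk to both tiers through the S3 API rather than in-memory structures.

```python
class TieredNamespace:
    """Toy model (hypothetical): one key space spanning a local and a cloud tier."""

    def __init__(self):
        self.local = {}  # stand-in for the on-premises object store
        self.cloud = {}  # stand-in for a public-cloud bucket

    def put(self, key: str, data: bytes) -> None:
        # New data lands on the local tier first.
        self.local[key] = data

    def tier_out(self, key: str) -> None:
        # Move an object to the cloud tier; its key does not change.
        self.cloud[key] = self.local.pop(key)

    def get(self, key: str) -> bytes:
        # Callers never need to know which tier currently holds the object.
        if key in self.local:
            return self.local[key]
        return self.cloud[key]

    def keys(self) -> list:
        # A single listing covers both tiers.
        return sorted(set(self.local) | set(self.cloud))


ns = TieredNamespace()
ns.put("training/2017-10.bin", b"older batch")
ns.put("training/2017-11.bin", b"newer batch")
ns.tier_out("training/2017-10.bin")
print(ns.keys(), ns.get("training/2017-10.bin"))
```

The application keeps using the same key before and after the move, which is the property that makes data flow to and from the cloud transparent in this simplified model.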

      Realizing the full potential of AI/ML requires an infrastructure that supports innovation. Today’s object storage solutions should deliver the scalability, cost efficiency and interoperability that enhance the capabilities of these emerging technologies.

      Chris Preimesberger
      https://www.eweek.com/author/cpreimesberger/
      Chris J. Preimesberger is Editor-in-Chief of eWEEK and responsible for all the publication's coverage. In his 16 years and more than 5,000 articles at eWEEK, he has distinguished himself in reporting and analysis of the business use of new-gen IT in a variety of sectors, including cloud computing, data center systems, storage, edge systems, security and others. In February 2017 and September 2018, Chris was named among the 250 most influential business journalists in the world (https://richtopia.com/inspirational-people/top-250-business-journalists/) by Richtopia, a UK research firm that used analytics to compile the ranking. He has won several national and regional awards for his work, including a 2011 Folio Award for a profile (https://www.eweek.com/cloud/marc-benioff-trend-seer-and-business-socialist/) of Salesforce founder/CEO Marc Benioff--the only time he has entered the competition. Previously, Chris was a founding editor of both IT Manager's Journal and DevX.com and was managing editor of Software Development magazine. He has been a stringer for the Associated Press since 1983 and resides in Silicon Valley.
