    Why Enterprises Struggle with Cloud Data Lakes

    Written by

    eWEEK EDITORS
    Published October 19, 2020

      Cloud data lakes are an increasingly critical complement to enterprises’ data warehouses. Their promise now goes well beyond more traditional notions of centralized storage to also incorporate what’s needed for data ingestion, analytics, data engineering, and artificial intelligence and machine learning initiatives. Get it right, and you unlock higher-value data insights that can steer myriad business initiatives for the long haul. See it go off the rails and … well, at least you’re not alone.

      Gartner attracted headlines for pegging the failure rate of data lake and big data projects at 60% back in 2016, only to revise that up to 85% the following year. Yet demand for data lakes has only increased, since enterprises understand the need.

      Gartner research from earlier this year found that 52% of enterprises plan to invest in a data lake within the next two years. Whether the failure rate stays high depends on overcoming acute challenges. Talent is expensive and isn’t all that scalable for most organizations anyway.

      Investments in cloud data lakes can run well into seven figures

      Achieving a production cloud data lake deployment can typically take six to nine months of development. Annual investments required to maintain implementations can reach well into the seven figures, requiring teams of DevOps engineers, security pros and cloud experts. These requirements and the complexity they represent quickly prove overwhelming for enterprises seeking the benefits of cloud data lakes without the correct plan of attack in hand.

      The most common cloud data lake setup–the do-it-yourself variety embraced by early adopters and digital natives–usually entails utilizing a native cloud PaaS stack that’s been assembled from a vast and labyrinthine array of technologies. Teams must navigate the many hundreds of PaaS choices and architectures available, while addressing the developmental, operational and security requisites of integrating each solution within their DIY cloud data lake implementation. Too often this task proves to be more than organizations without fully capable (and yes, costly) DevOps teams can realistically manage.

      In this edition of eWEEK Data Points, Lovan Chetty, Vice-President at instant cloud data lake provider Cazena, shares his industry insight into five specific struggles that enterprises must plan for in their cloud data lake journey.

      Data Point No. 1: Achieving end-to-end orchestration of cloud data lakes isn’t easy.

      To stand up a cloud data lake as a single, integrated and complete production instance, a disparate stack of technologies must be orchestrated and validated. This includes cloud storage, data ingestion, compute engines, security controls, identity management, network access and analytics tools. The fact that several components of the data lake stack may be on-premises–such as analytics tools, analytics users and data sources–is usually a big contributor to the challenge of achieving functional, efficient end-to-end orchestration across a hybrid environment.

      Data Point No. 2: Runaway costs and performance degradation occur without relentless monitoring and management.

      The success of any cloud data lake project hinges on continual changes to maximize performance, reliability and cost efficiency. Each of these variables requires constant and detailed monitoring and management of end-to-end workloads. Consider the evolution of data processing engines and the importance of leveraging the most advantageous opportunities around price and performance. Managing workload price performance and cloud cost optimization is just as crucial to cloud data lake implementations, where costs can and will quickly get out of hand if proper monitoring and management aren’t in place.
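
As a rough illustration of the kind of check that such monitoring implies, here is a minimal Python sketch that flags workloads whose daily spend runs past budget. The workload names, dollar figures and the 10% tolerance are all hypothetical; a real deployment would pull actual spend from its cloud provider's billing data rather than hard-coded values.

```python
from dataclasses import dataclass


@dataclass
class WorkloadSpend:
    """Daily spend for one data lake workload (all figures hypothetical)."""
    name: str
    daily_budget: float  # agreed budget, in dollars per day
    daily_actual: float  # observed spend, in dollars per day


def over_budget(workloads, tolerance=0.10):
    """Return names of workloads whose spend exceeds budget by more than `tolerance`."""
    flagged = []
    for w in workloads:
        if w.daily_actual > w.daily_budget * (1 + tolerance):
            flagged.append(w.name)
    return flagged


# Hypothetical sample data, for illustration only.
workloads = [
    WorkloadSpend("nightly-ingest", daily_budget=200.0, daily_actual=205.0),
    WorkloadSpend("ad-hoc-analytics", daily_budget=300.0, daily_actual=450.0),
]

print(over_budget(workloads))  # prints ['ad-hoc-analytics']
```

Running a check like this on a schedule, rather than reviewing the bill at month's end, is one simple way to catch a runaway workload early.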

      Data Point No. 3: Ensuring security, regulatory compliance and governance can be tricky.

      Public cloud resources aren’t private by default. Securing a production cloud data lake requires extensive configuration and customization efforts–especially for enterprises that must comply with specific regulatory and governance mandates (HIPAA, PCI DSS, GDPR, etc.). Achieving the requisite data safeguards often means enlisting experienced and dedicated teams who are equipped to lock down cloud resources and restrict access to authorized and credentialed users only.

      Once robust authorization and access controls are in place that protect a cloud data lake, teams must continue to monitor and control a fluid analytic environment containing sensitive and confidential data while also enforcing regulatory compliance and effective data governance policies.
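
The idea of restricting access to authorized users, and then continuing to monitor what is requested, can be sketched in a few lines of Python. This is a simplified, deny-by-default illustration only: the user names, dataset names and the `GRANTS` table are hypothetical, and a production data lake would rely on its cloud provider's identity and access management services rather than an in-memory dictionary.

```python
# Hypothetical grants: each user maps to the set of datasets they may read.
GRANTS = {
    "analyst_a": {"sales_curated"},
    "engineer_b": {"sales_curated", "sales_raw"},
}


def can_read(user, dataset, grants=GRANTS):
    """Deny by default: allow a read only when an explicit grant exists."""
    return dataset in grants.get(user, set())


def denied_requests(requests, grants=GRANTS):
    """Return the (user, dataset) requests that would be refused, for review."""
    return [(u, d) for u, d in requests if not can_read(u, d, grants)]


# Hypothetical access requests, for illustration only.
requests = [
    ("analyst_a", "sales_curated"),
    ("analyst_a", "sales_raw"),
    ("contractor_c", "sales_raw"),
]

print(denied_requests(requests))
# prints [('analyst_a', 'sales_raw'), ('contractor_c', 'sales_raw')]
```

The point of the default is that an unknown user or an unlisted dataset is refused without anyone having to write a rule for it.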

      Data Point No. 4: Hybrid architecture and multi-cloud support are required.

      There’s dangerous potential for disconnect between on-prem data sources and analytics users on one side, and the cloud data lake resources in the public cloud on the other. Cloud data lakes are typically part of a hybrid architecture that enables them to function as an extension of enterprise data environments. Cloud data lakes need to support data pipelines that span on-premises sources as well as cloud and third-party data sources and processes. Ensuring that global analytics users can leverage cloud data lakes seamlessly takes close monitoring, management and security backed by robust end-to-end SLAs.

      Enabling multi-cloud architecture also expands the utility and possibilities of cloud data lakes, ensuring their portability across cloud providers without disruptions to data flows or analytics workflows. Enterprises that will need multi-cloud options should build this capability into their platform from the beginning; it’s far more challenging to add later.

      Data Point No. 5: Those who need the data need to be able to get to it themselves.

      Enterprises do well to focus on the users their data lakes are for: the data scientists, data engineers, analysts and product teams that coax out insights and press them into action. A strong definition of a successful cloud data lake is one that enables those who need the data to rapidly access, consume and put data to work–no matter their tools or location.

      While challenging to implement, self-service functionality is essential for allowing users without development or operations skillsets to still leverage the cloud data lake using existing AI/ML, BI, search, and data tools (or even new ones hosted with the data lake itself).

      Data Point No. 6: The Takeaway

      While each of the capabilities discussed is a must-have for a modern cloud data lake, they also show the cost and implementation challenges enterprises must navigate to get there. It is estimated that enterprises spend $5 to $6 on DevOps expertise for each dollar invested in the cloud stack. In shaping their cloud data lake approach, enterprises should seek out a strategy that can simplify these challenges and reduce the need for expertise.
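
To make that ratio concrete, the short Python sketch below applies the estimated 5-to-6 multiplier to a cloud stack budget. The $200,000 input is a hypothetical figure chosen only for illustration; it does not come from the estimate itself.

```python
def expertise_cost_range(cloud_stack_spend, low_ratio=5, high_ratio=6):
    """Apply the estimated 5-to-6 expertise multiplier to a cloud stack budget."""
    return cloud_stack_spend * low_ratio, cloud_stack_spend * high_ratio


# Hypothetical annual cloud stack spend, in dollars.
low, high = expertise_cost_range(200_000)

print(f"${low:,} to ${high:,}")  # prints $1,000,000 to $1,200,000
```

On those assumptions, even a relatively modest cloud stack budget implies a seven-figure annual spend on expertise.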

      By lowering these hurdles–or avoiding them altogether–enterprises can have their data science and analytics teams reap the benefits of cloud data lake implementations while operating as efficiently as possible and within their budgets.
