Predictions 2018: How GDPR is Forcing Big Changes in Storage

Experts believe we might see things like machine-learning-assisted storage and de-archiving become commonplace in the coming year.


Storage is home base for all of IT. If data doesn’t have a place to live, well, guess what? It’s dead, and if data dies, so does the internet, IT systems, television networks, the world economy and our lives in general. In other words, nothing too bad will happen, right?

Technology in the data protection and storage sector is evolving at a frighteningly fast pace, with impressive new media coming into the market all the time (in-memory, 3D XPoint, NVMe, flash), sophisticated new storage software and devices available to administer it, and near-infinite capacity via clouds and, yes, digital tape to handle the bulk of it all.

Did you know that a single cartridge of digital tape today can hold hundreds of terabytes of data? (IBM's latest tape cartridge offering holds 330TB.) Solid-state and conventional hard drives aren't anywhere near that kind of capacity, although there are strings attached with tape (linear-only access, possible magnetic issues, separate mechanical systems required, etc.).

However, most companies are investing a lot more capital (than for an old-fashioned tape system) to house mere dozens of terabytes of data on high-end digital systems using flash and hard drive media. Of course, there are many trade-offs in both cases; results will vary according to use case, and this isn’t the time or place to get into a comparison argument.

Biggest Storage Industry Change of 2018: GDPR

Perhaps the biggest international storage issue next year will be that the European Union's GDPR (General Data Protection Regulation) goes into effect May 25, 2018.

As Datos IO Vice President Peter Smails told eWEEK: "Data-aware data management becomes table stakes in 2018. The GDPR is the most sweeping change to data protection in the past 20 years. Under the new set of regulations, both U.S. and European companies will need to demonstrate compliance when it comes to managing, storing and sharing data--no matter how massive the data sets. Security-wise, companies will have to report data breaches within 72 hours of their knowledge of them.

"One of the biggest issues next year will be GDPR Article 17, which enables a user's right to be forgotten, which will increase demand for storage and data management solutions that are data-aware. Whether it’s application-specific backup and recovery to protect against ransomware, or intelligent query-based data movement to support test/dev, CI/CD, or GDPR initiatives, organizations will require data management solutions that are data aware and enable them to protect, mobilize, and monetize their data across any cloud boundaries.”
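
Smails' point about Article 17 is concrete: honoring an erasure request means finding and deleting a user's records wherever they live, and being able to demonstrate afterward what was removed. A minimal sketch of that idea follows; the store names and record layouts are hypothetical, not drawn from any particular product.

```python
# Hypothetical sketch of a "right to be forgotten" erasure pass.
# Store names and record fields are illustrative assumptions.

def erase_user(stores, user_id):
    """Remove every record tied to user_id across all stores and
    return an audit trail (GDPR compliance must be demonstrable)."""
    audit = {}
    for name, records in stores.items():
        kept = [r for r in records if r.get("user_id") != user_id]
        audit[name] = len(records) - len(kept)  # count of erased records
        stores[name] = kept
    return audit

stores = {
    "primary_db": [{"user_id": 1, "email": "a@x"}, {"user_id": 2, "email": "b@x"}],
    "backup_snapshot": [{"user_id": 1, "email": "a@x"}],
}
audit = erase_user(stores, user_id=1)  # {"primary_db": 1, "backup_snapshot": 1}
```

A real deployment would also have to reach into replicas, archives, and backup copies, which is precisely why the "data-aware" tooling Smails describes matters.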

eWEEK is in regular touch with experts throughout this market. Here are some of their predictions for the data storage sector in 2018.

Lance Smith, CEO of Primary Data: Storage services become bespoke with machine learning. “With the ability to add software that collects metadata from application usage, analyzes metadata across the enterprise storage infrastructure, and aligns data requirements to storage system capabilities, admins gain the power of machine-learning intelligence. Rather than applying a one-size-fits-all solution, IT can custom-match business needs with existing storage resources or incrementally add new ones, becoming a much more strategic operation. Over time, data management software with machine-learning intelligence can help admins continue to refine their policies and fine-tune how they ensure both performance and savings by optimizing how they leverage their unique infrastructure.”
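
The tier matching Smith describes boils down to a policy that maps per-file access metadata to storage media. The sketch below uses hand-written thresholds as a stand-in for what a trained model would infer from access patterns; the field names and cutoffs are illustrative assumptions.

```python
# Toy sketch of metadata-driven tiering. The thresholds are illustrative
# stand-ins for a policy a machine-learning model would learn over time.

def assign_tier(meta):
    """Map a file's access metadata to a storage tier."""
    if meta["reads_per_day"] > 100:
        return "flash"   # hot data: low-latency media
    if meta["days_since_access"] > 365:
        return "tape"    # cold data: cheap archival media
    return "hdd"         # warm data: capacity disk

files = {
    "logs/app.log":   {"reads_per_day": 500, "days_since_access": 0},
    "archive/q1.tar": {"reads_per_day": 0,   "days_since_access": 700},
    "docs/spec.pdf":  {"reads_per_day": 3,   "days_since_access": 10},
}
placement = {path: assign_tier(m) for path, m in files.items()}
```

The "machine learning" part of the prediction is about refining those cutoffs automatically as workloads change, rather than leaving admins to guess them.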

Molly Presley, Vice-President of Global Marketing, Quantum:

  • Data-driven organizations demand intelligence in their storage. “Expect greater emphasis on tools to maximize content value, primarily in the form of AI to increase intelligence about content, and data visualization software to improve access to data and reduce the cost of storing it. As data is increasingly used for strategic decision-making--or, in cases where data is the product itself--storage strategies will become a more foundational consideration.”
  • Tape steps in as ransomware savior. “With the rising tide of ransomware and malware in the news, tape will increasingly be a valued element of many data protection solutions because it offers an offline 'air-gapped' backup copy.”
  • More applications become ripe for containerization. “As new applications are being written for the container/cloud-native world, existing applications will increasingly be updated for cloud-native architectures.”
  • Intelligent file systems will push object storage in the corner where it belongs. “File systems will be increasingly leveraged for data ingest and data management. As a result, object storage will be pushed to the capacity and retention part of storage architectures, where it arguably is best suited.”
  • Cloud-native architectures to gain traction. “As cloud-native architectures become an increasingly common term in the IT vocabulary, look for organizations such as the Cloud Native Computing Foundation (CNCF) to become more influential.”

Peter Godman, CTO of Qumulo:

  • SSD won't be cheaper than HDD. “In 2016, every all-flash vendor in the world claimed that solid-state disks are now cheaper than HDDs, based on two nonsensical claims: (1) that all data is compressible and deduplicable, and (2) that compression and dedupe don't apply to HDDs. Western Digital's MAMR (microwave-assisted magnetic recording) announcement makes it clear that the ratio of NAND flash capacity cost to HDD capacity cost will remain close to 10x for years to come.”
  • “All major cloud vendors will start to build or buy file storage, as they realize that quality, scalable file storage is essential for capturing compute-intensive workloads.”
  • “SATA and SAS SSDs will rapidly disappear as the cost of NVMe (non-volatile memory express) converges with that of SATA/SAS.”
  • “AMD will make rapid inroads in all-flash storage due to Epyc's enormous PCIe (Peripheral Component Interconnect Express) bandwidth, and ARM will wait on the server sidelines for another year.”

Peter Smails, Vice President, Marketing and Business Development, Datos IO: Non-relational databases dominate the hybrid cloud world. “The biggest database event in 2017 was the MongoDB IPO, representing strong market validation that non-relational databases are becoming the lingua franca of data in the hybrid cloud world. In the new year, this new breed of modern databases, including DynamoDB, DataStax, and Couchbase, will continue to solidify their positions as the standard platform for modern applications.”

Guy Churchward, CEO, DataTorrent: The data lake is recognized as an antiquated catch-all analytics approach and an Achilles heel for the fast insights and actions needed to compete.
“With the advent of the internet of things, data growth is set to accelerate. Data sources are shifting from humans to machines as workloads move from web to mobile to machine-generated. This has created a dire need to scale out data pipelines in a cost-effective way. Big data and cloud ecosystems have realized that they cannot be just about search indexing or data warehousing; they need to service all enterprise data flow, be it human-generated (web, mobile) or machine-generated. The ability to respond in real time provides a dramatic competitive advantage: an enterprise that can do predictive analytics in real time will gain an edge over one that cannot.

“Big data needs to be real-time, agile, and operable. Moreover, to get real-time responsiveness, agility, and to some extent operability back, data lakes cannot sit in the middle of this data flow. The data lake served companies fantastically well through the data-at-rest, batch era, but back in 2015 it started to become clear this architecture was being overused; it has now become the Achilles heel of ‘real’ real-time data analytics. Parking data first, then analyzing it, immediately puts companies at a massive disadvantage. When it comes to gaining insights and taking action as fast as compute allows, companies relying on stale event data create a total eclipse of visibility, action, and any possible immediate remediation. This is one area where ‘good enough’ will prove strategically fatal!”

David Friend, CEO and co-founder of Wasabi:

  • Cloud storage will feature commodity pricing that collapses artificial pricing for storage tiers.
    “Similar to electricity, cloud storage will become a commodity in 2018. By 2020, the estimated amount of data stored will be at least 50 times larger than it was in 2010, further driving the need for a standardized, cost-effective, one-size-fits-all storage solution. On-premises storage, artificial pricing and storage tiers from major cloud players like AWS and Azure, and vendor lock-in are on the way to extinction.”
  • New laws and regulations for data governance and compliance will drive the move to cloud storage.
    “With new regulations expected to go into full effect in 2018, such as the General Data Protection Regulation (GDPR), the manner in which companies store their data will become more unified. While the regulation technically applies to European Union countries, many U.S. companies with business--even if it’s strictly online--and data in the EU will find themselves making changes in how and where they store data in 2018. Patient health care records must comply with HIPAA (Health Insurance Portability and Accountability Act of 1996), and CJIS (FBI Criminal Justice Information Services Division) data must adhere to strict guidelines, further hastening the journey to cloud storage.”
  • The “Internet of things” will generate more data than we realize.
    “In a 2017 Cisco Internet of Things (IoT) study, nearly 75 percent of executives reported using IoT data to improve their businesses, with 95 percent planning to launch an IoT business within three years. In 2018, this data will surpass expectations, and there will be a huge incentive to store more data for future analysis.”
  • The concept of de-archiving will emerge. “Video is by far the biggest data type on the internet by volume. What many people aren’t aware of, however, is how the media and entertainment industry has been storing its own video content in the form of tapes, with tens of thousands of old shows, newscasts, unreleased feature films, etc. sitting in Hollywood basements and warehouses. Assets that are sitting in dead storage are hard to monetize. Moving these assets back into hot storage allows them to be marketed through all the new streaming channels available today. In 2018, the media and entertainment industry will embrace ‘de-archiving’ and create new revenue streams from old content.”

Monte Zweben, CEO of Splice Machine:

  • Online Predictive Processing (OLPP) emerges as a new approach to combining OLTP, OLAP, streaming, and machine learning in one platform.
  • AI is the new big data: “Companies race to do it whether they know they need it or not.”
  • “The Hadoop era of disillusionment hits full stride, with many companies drowning in their data lakes, unable to get an ROI because of the complexity of duct-taping together Hadoop-based compute engines.”
  • “SQL is reborn as many companies realize their Hadoop-based data lakes need traditional database operations, such as in-place record updates and indexes to power applications.”
  • “The state-of-the art for OLPP databases will be indexed by rows for fast access and updates, but stored in columnar encodings for massive storage savings and scan speeds for analytics.”
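
That last prediction (rows indexed for fast point access, values stored column-wise for cheap analytic scans) can be illustrated with a toy structure. The class name and API below are invented for illustration and are not how any shipping OLPP database exposes this.

```python
# Minimal sketch of the row-index-over-columnar-storage idea:
# a primary-key index gives OLTP-style point lookups, while
# column-oriented lists give OLAP-style scans without touching whole rows.

class HybridTable:
    def __init__(self, columns):
        self.cols = {c: [] for c in columns}   # columnar storage
        self.index = {}                        # primary key -> row position

    def insert(self, key, row):
        self.index[key] = len(next(iter(self.cols.values())))
        for c in self.cols:
            self.cols[c].append(row[c])

    def lookup(self, key):
        """OLTP-style point read: reassemble one row via the index."""
        pos = self.index[key]
        return {c: v[pos] for c, v in self.cols.items()}

    def scan_sum(self, column):
        """OLAP-style aggregate: scan a single column end to end."""
        return sum(self.cols[column])

t = HybridTable(["id", "amount"])
t.insert(1, {"id": 1, "amount": 10})
t.insert(2, {"id": 2, "amount": 32})
```

Production systems add the columnar encodings Zweben mentions (dictionary or run-length compression per column), which is where the storage savings come from.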

Chris Colotti, Field CTO at Tintri:
“2018 will be the year that disaster recovery (DR) moves from being a secondary issue to a primary focus. The last 12 months have seen Mother Nature throw numerous natural disasters at us, which has magnified the need for a formal DR strategy. The challenge is that organizations are struggling to find DR solutions that work simply at scale. Achieving that has become something of a white whale, but there are platforms designed to scale and protect workloads wherever they are--on-premises or in the public cloud.”

Chris J. Preimesberger

Chris J. Preimesberger is Editor of Features & Analysis at eWEEK, responsible in large part for the publication's coverage areas. In his 12 years and more than 3,900 stories at eWEEK, he...