Making sure clean data is flowing inside an enterprise is like keeping a close eye on a motor vehicle’s oil level and quality; you do not want to “run dirty” in either case, because all it can do is gum up the works.
Clean data can be described as complete, accurate, relevant, up to date and uncorrupted, whether it lives in a record set, table, object storage volume or database, with duplicates kept only for required backup purposes.
Clean data is critical for accurate business analysis, yet many enterprises still rely on manual, inefficient processes to clean and prepare it. That is the key finding of a new survey released May 17 by data management toolmaker Trifacta.
Global Survey About Data Prep Practices
Trifacta knows this market. Its data preparation platform helps data analysts explore, assess and refine data for analysis and solve the big problems of their business.
The San Francisco-based company conducted a global data preparation survey of nearly 300 data professionals to identify the challenges hindering organizations’ use of data and analytics.
Key findings from the survey include:
- Overreliance on IT resources for data preparation costs organizations billions. Sixty-five percent of IT professionals spend half or more of their time at work on data quality assurance, cleanup or preparation. Based upon Glassdoor salary estimates and IDC’s estimation that there are 18 million IT operations and management professionals globally, organizations are spending approximately $500 billion annually on data preparation.
- Fifty-nine percent of respondents (both IT professionals and data analysts) believe that the majority of the data analysts in their organization are dependent on IT resources to prepare or access data.
- Eighty-three percent of analysts believe they would be able to derive increased value from their analysis projects with a decreased dependency on IT.
- Unnecessary iteration between business users and IT exacerbates the cost of data preparation. Analysts who depend on IT to prepare data often request modifications to their initial requirements, the survey found, likely due to unanticipated findings from the raw data contents. Eighty-two percent of analysts said they regularly go back to IT with new requirements. This includes 11 percent who said they always do this.
- Excel continues to be the primary tool for data preparation: 37 percent of data analysts and 30 percent of IT professionals use it more than other tools to prepare data. Trifacta predicted that a reliance on manually driven data preparation tools such as Excel will continue to delay data initiatives and deter new insights.
- Analysts recognize that the time-consuming nature of data preparation is a detriment to their organizations: 58 percent believe that the overall time spent on data QA or data cleansing costs their organization more money than it delivers in value.
- Data analysts are also spending too much time preparing data: 92 percent would choose to focus on another analytic activity rather than data preparation, yet 65 percent are spending at least half their time preparing data for analytic use.
- Critical data is at risk. Even though data privacy concerns abound in today’s business landscape, 74 percent of data analysts confess that their individual computers are one of the top three places they store data, and 56 percent of IT professionals say the same thing.
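The $500 billion estimate in the first finding can be sanity-checked with simple arithmetic. The survey summary does not state the salary figure or the exact method Trifacta used, so the sketch below works backward: it assumes, purely for illustration, that the 65 percent of IT professionals in question spend exactly half their time on data preparation and that no one else spends any, then solves for the average salary that would make the total come out to $500 billion. The function name and the 0.5 time fraction are assumptions, not part of the survey.

```python
# Figures quoted in the survey summary.
IT_PROFESSIONALS = 18_000_000   # IDC estimate of IT operations and management staff worldwide
SHARE_HEAVY_PREP = 0.65         # share spending half or more of their time on data prep
ANNUAL_SPEND_USD = 500e9        # approximate annual cost cited by Trifacta

# Assumption (not from the survey): everyone in the 65 percent group spends
# exactly half of their time on preparation, and nobody else spends any.
ASSUMED_TIME_FRACTION = 0.5


def implied_average_salary(spend, headcount, share, time_fraction):
    """Back-solve the average salary that makes the cost estimate add up."""
    # Full-time equivalents devoted to data preparation under the assumption above.
    prep_fte = headcount * share * time_fraction
    return spend / prep_fte


salary = implied_average_salary(
    ANNUAL_SPEND_USD, IT_PROFESSIONALS, SHARE_HEAVY_PREP, ASSUMED_TIME_FRACTION
)
print(f"Implied average salary: ${salary:,.0f}")
```

Under these simplified assumptions the implied average salary comes out to roughly $85,000 a year. Because many respondents spend more than half their time on preparation, the true implied figure under the survey's own method could be lower.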
Trifacta conducted a global survey of 294 individuals who prepare data—179 IT professionals who prepare data for a group of business users and 115 data analysts who prepare and/or analyze data for themselves. The survey was conducted between April 4 and April 13, 2018.