Microsoft Power BI Self-Service Data Preparation Gets Big Data Boost

Power BI's updated self-service data preparation tools will allow Excel experts to analyze big data using Power Query.


Microsoft has upgraded the self-service data preparation capabilities in Power BI to help business users extract more meaningful insights from big data stored in their software-as-a-service applications and other sources using familiar tools.

Growing demand for SaaS business applications has created a problem for organizations looking to create a data-driven culture in the workplace, according to Arun Ulagaratchagan, general manager of Power BI Engineering at Microsoft. Those applications become data silos, each requiring specialized IT expertise to convert its data into information that the organization's business intelligence tools can use.

As part of a series of updates that Microsoft will be rolling out in preview beginning in July, Power BI users will be able to more easily incorporate big data, including web analytics and information generated by Internet of Things (IoT) deployments, into their models, dashboards and reports, Ulagaratchagan announced. This is made possible by expanding the boundaries of what Power Query can do.

Power Query, a familiar tool among Excel users, lets them extract, combine, transform and clean data from multiple sources, and then import the results in a format that works with Excel. It supports a variety of enterprise data sources, including Oracle, IBM DB2, Sybase and, of course, Microsoft's own SQL Server database.
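
For readers unfamiliar with that kind of workflow, the extract, clean and combine steps can be sketched in a few lines of Python. This is only an analogy under stated assumptions: Power Query uses its own query editor and formula language, not Python, and the sample data, column names and function names below are hypothetical and do not come from Microsoft.

```python
import csv
import io

# Hypothetical sample data standing in for two separate sources.
CRM_CSV = """customer_id,name,region
1, Alice ,west
2,Bob,EAST
2,Bob,EAST
"""

ORDERS_CSV = """customer_id,amount
1,120.50
2,80
3,15
"""


def extract(text):
    """Read CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Trim whitespace, lowercase the region, and drop exact duplicate rows."""
    seen, result = set(), []
    for row in rows:
        tidy = {key: value.strip() for key, value in row.items()}
        tidy["region"] = tidy["region"].lower()
        fingerprint = tuple(sorted(tidy.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            result.append(tidy)
    return result


def combine(customers, orders):
    """Join orders to customers on customer_id, skipping unmatched orders."""
    by_id = {c["customer_id"]: c for c in customers}
    merged = []
    for order in orders:
        customer = by_id.get(order["customer_id"])
        if customer is not None:
            # Transform step: convert the amount from text to a number.
            merged.append({**customer, "amount": float(order["amount"])})
    return merged


customers = clean(extract(CRM_CSV))
orders = extract(ORDERS_CSV)
report = combine(customers, orders)
for row in report:
    print(row)
```

The sketch keeps each stage as a separate function so the order of operations is visible; a tool like Power Query presents comparable steps through a graphical editor rather than code the analyst has to write by hand.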

For Power BI, Ulagaratchagan and his team looked into taking the technology and "extending it to very large data volumes, trillions of rows of data," he told eWEEK, allowing customers to pump that information directly into the Power BI service. But Microsoft is banking on more than making massive data volumes more accessible in order to drive Power BI adoption and bring more non-data scientists into the fold.

The company is targeting business analysts, "people who are not developers, people with advanced Excel skills," with user-friendly tools like Power Query and new ways of spurring collaboration, Ulagaratchagan said. This includes unifying data access between Power BI and the silo-busting Azure Data Lake Storage Gen2 service.

Introduced in late June, Azure Data Lake Storage Gen2 brings together Microsoft's scale-out cloud object storage platform and HDFS (Hadoop Distributed File System) for advanced big data analytics. It now serves as "the unifying layer where the data accumulates [and where] the data is unified across different roles in the enterprise," enabling all Power BI users to work on the same data and allowing data scientists and engineers to kick off advanced analytics and AI projects with richer, more comprehensive sets of data, explained Ulagaratchagan.

Power BI is also gaining support for the Common Data Model, a standardized and extensible group of data schemas, allowing other Microsoft and third-party applications to use data already processed by Power BI. Additionally, the service is bulking up with larger size limits, incremental refresh capabilities and other features that will allow customers to process larger datasets while maintaining responsive application performance.

Other new additions include new Power BI Premium deployment options that help organizations meet their data residency requirements and "pixel-perfect" enterprise reporting capabilities based on SQL Server Reporting Services. The product will also soon support the XMLA (XML for Analysis) protocol, enabling application lifecycle management use cases with the SSAS (SQL Server Analysis Services) toolkit and access to Power BI data from third-party BI software.

Pedro Hernandez

Pedro Hernandez is a contributor to eWEEK and the IT Business Edge Network, the network for technology professionals. Previously, he served as a managing editor for the network of...