Microsoft Azure Search Scours Unstructured Data | eWeek

Feb 12, 2016

After enabling Azure Search on cloud databases, Microsoft is now turning its attention to unstructured data.

Responding to customer demand, Microsoft has released a preview version of its Search indexer for Azure Blob Storage, the company’s cloud-based unstructured data storage service, Eugene Shvets, a Microsoft Azure Search senior software engineer, announced on Feb. 9. “Our indexers for Azure SQL Database and DocumentDB have been a hit with customers, and many of them have asked us to build similar magic for Azure Blob Storage,” he said.

The indexer is intended to spare customers the challenges of extracting text from “blobs,” added Shvets. “Formats like PDF and DOC/XLS are binary and difficult to parse; content type detection and metadata extraction can be non-trivial tasks. Good tools exist, but integrating them into an indexing workflow still takes considerable effort and saddles customers with a bunch of code and infrastructure to maintain,” he stated.

The Azure Search blob indexer can extract text and metadata from PDF files, along with several Office document file formats (DOCX/DOC, XLSX/XLS, PPTX/PPT and MSG). The indexer also works on HTML, XML, ZIP, EML and, of course, plain text files. Instructions on setting up blob indexing are available in a company blog post.
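For readers who want a feel for what setting up blob indexing involves, the following is a minimal Python sketch, not Microsoft's instructions. It only builds the two JSON request bodies that the Azure Search REST API generally expects for a blob data source and an indexer, and does not send them. The resource names, the connection-string placeholder, the two-hour schedule and the `<api-version>` placeholder are all assumptions made for illustration; the correct API version for the preview should be taken from Microsoft's documentation.

```python
import json


def blob_data_source(name, connection_string, container):
    """Request body describing a blob container as an Azure Search data source."""
    return {
        "name": name,
        "type": "azureblob",
        "credentials": {"connectionString": connection_string},
        "container": {"name": container},
    }


def blob_indexer(name, data_source_name, target_index, interval="PT2H"):
    """Request body for an indexer that crawls the data source on a schedule.

    `interval` is an ISO 8601 duration; two hours is an arbitrary example.
    """
    return {
        "name": name,
        "dataSourceName": data_source_name,
        "targetIndexName": target_index,
        "schedule": {"interval": interval},
    }


def endpoint(service, collection, api_version):
    """URL for a collection ("datasources" or "indexers") on a search service."""
    return (
        f"https://{service}.search.windows.net/"
        f"{collection}?api-version={api_version}"
    )


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    source = blob_data_source("docs-source", "<storage-connection-string>", "docs")
    indexer = blob_indexer("docs-indexer", source["name"], "docs-index")
    for collection, body in (("datasources", source), ("indexers", indexer)):
        print("POST", endpoint("my-service", collection, "<api-version>"))
        print(json.dumps(body, indent=2))
```

Each body would be sent as an HTTP POST with the service's admin key in an `api-key` header, and the target index must already exist with fields that can hold the extracted text and metadata.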

For administrators seeking more information about their Azure virtual machines (VMs), Microsoft also announced a new Log Analytics capability this week. “Log Analytics (OMS) brings the power of Microsoft’s new cloud-based management solution, Operations Management Suite [OMS], right into the Azure portal allowing you to provision a brand new OMS workspace, link workspaces to Azure subscriptions, and on-board Azure VMs directly to the OMS service,” blogged Anurag Gupta, a Microsoft Open Source Technology Center program manager.

Microsoft also issued two new Azure Resource Manager (ARM) templates, Gupta said. “These templates allow you to quickly deploy a brand-new Windows or Linux VM that instantly on-boards to the OMS service.”

Finally, Microsoft published new documentation and a code sample on GitHub for developers kicking off Azure-powered Internet of things projects that involve pulling data in from public data feeds.

“There are a number of documentation articles and code samples on pushing data from devices you control to Azure and for analyzing in combination with other streaming or static data,” said Spyros Sakellariadis, principal program manager at Microsoft Azure Machine Learning, in a Feb. 11 announcement. “What’s not as well documented is how to pull data from a public Website you don’t control, then push that data into an Azure Event Hub,” he added, referring to Microsoft’s telemetry-ingestion service.

“A recent article and code sample I produced with Dinar Gainitdinov shows how building a simple application with a few lines of C#, does this,” continued Sakellariadis. Modified versions of the code have been used to combine “real-time motor vehicle data with maintenance records in Microsoft Dynamics” and “to analyze how traffic in the Seattle region was affected by the weather,” he said.
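The pull-then-push pattern Sakellariadis describes can be sketched in a few lines. The example below is a Python illustration of the general idea, not the C# sample he and Gainitdinov published. It assumes the public feed returns JSON, and it keeps the Event Hub side behind a caller-supplied `send` function so the sketch runs without any Azure dependency. The function names and the message envelope are hypothetical.

```python
import json
import urllib.request


def fetch_feed(url, timeout=10.0):
    """Download a JSON feed and return its records as a list."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        payload = json.loads(response.read().decode("utf-8"))
    # Some feeds return a single object rather than an array.
    return payload if isinstance(payload, list) else [payload]


def to_messages(records, source):
    """Wrap each record in a small envelope and serialize it to JSON text."""
    return [
        json.dumps({"source": source, "record": record}, sort_keys=True)
        for record in records
    ]


def pull_and_push(fetch, send, source):
    """Run one cycle: pull records with `fetch`, hand the messages to `send`.

    `fetch` takes no arguments and returns a list of records.
    `send` takes a list of JSON strings; in a real deployment it would wrap
    an Event Hub client (assumption: any client that accepts text payloads).
    Returns the number of messages forwarded.
    """
    messages = to_messages(fetch(), source)
    if messages:
        send(messages)
    return len(messages)


if __name__ == "__main__":
    # Stand-ins for a real feed and a real Event Hub sender.
    sample = lambda: [{"sensor": "a", "value": 1}, {"sensor": "b", "value": 2}]
    count = pull_and_push(sample, lambda batch: print("\n".join(batch)), "demo-feed")
    print("forwarded", count, "messages")
```

In practice `fetch` would be something like `lambda: fetch_feed(feed_url)` called on a timer, and `send` would batch the strings into whichever Event Hub client the project uses.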
