Microsoft announced the preview of a lightning-fast data exploration service called Azure Data Explorer at Ignite in 2018. It is a PaaS offering from Azure providing an end-to-end solution for data exploration.
This service was developed to help businesses get quick insights from their data and make critical business decisions. It can also be used on streaming data to identify patterns, statistics, anomalies, and outliers, and even to diagnose issues.
Azure Data Explorer, abbreviated as ADX, has very fast indexing and a very powerful query language for working with the data, known as Kusto Query Language (KQL). One interesting fact is that this service was already being used by many individuals and teams within Microsoft itself because of its rich features, which we will discuss shortly.
ADX is capable of performing analysis on large volumes of data from heterogeneous sources, such as custom applications, IoT devices, diagnostic logs, and other streaming data sources. This data can be structured, unstructured, or free text.
Data within ADX is organized in relational tables within a database, and each table has a strongly typed schema. ADX can scale quickly depending on the ingested data volume and query load.
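To make the idea of a strongly typed table schema more concrete, here is a minimal sketch in Python that builds a Kusto `.create table` management command from a list of column names and types. The table name `DeviceTelemetry`, its columns, and the helper `create_table_command` are hypothetical and exist only for illustration. The sketch only produces the command text; running it against a cluster is out of scope here.

```python
# Scalar types accepted by this sketch (a subset of the Kusto scalar types).
KUSTO_SCALAR_TYPES = {
    "bool", "datetime", "decimal", "dynamic", "guid",
    "int", "long", "real", "string", "timespan",
}


def create_table_command(table, columns):
    """Build a `.create table` command from (column name, Kusto type) pairs.

    Hypothetical helper: it only assembles text and rejects unknown types.
    """
    for name, kusto_type in columns:
        if kusto_type not in KUSTO_SCALAR_TYPES:
            raise ValueError(f"unknown Kusto type {kusto_type!r} for column {name!r}")
    column_list = ", ".join(f"{name}:{kusto_type}" for name, kusto_type in columns)
    return f".create table {table} ({column_list})"


# Hypothetical table for IoT readings; every column carries an explicit type.
command = create_table_command(
    "DeviceTelemetry",
    [("Timestamp", "datetime"), ("DeviceId", "string"), ("Temperature", "real")],
)
print(command)
# .create table DeviceTelemetry (Timestamp:datetime, DeviceId:string, Temperature:real)
```

Because every column declares its type up front, a mistyped type name is caught before the command is ever sent.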
Data Warehousing Workflow
I already mentioned that data can be ingested in Azure Data Explorer from heterogeneous sources, which include line-of-business applications, CRM applications, graphs and images, social media, IoT applications, and even other cloud applications.
There are many ingestion methods available for different scenarios and circumstances. For example, data can be ingested using managed pipelines such as Event Hubs and IoT Hubs, from blob storage using Event Grid, or using connectors and plugins such as Power Automate, Azure Data Factory, and Kafka. Data ingestion can also be done programmatically through custom code using the already optimized SDKs, such as .NET, Python, R, Java, Node.js, and Go, or the REST API. We will discuss more about data ingestion in part 8 of this series.
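As a small illustration of ingestion driven from custom code, the sketch below assembles an `.ingest inline` management command, which pushes a few CSV rows directly into a table. This command is meant for ad hoc testing rather than production pipelines, and the sketch only builds the command text. The table name `DeviceTelemetry`, the sample rows, and the helper `inline_ingest_command` are hypothetical. A real application would more likely use one of the SDK ingestion clients or a managed pipeline.

```python
import csv
import io


def inline_ingest_command(table, rows):
    """Build an `.ingest inline` command whose payload is CSV text.

    Hypothetical helper for illustration: it formats rows with the csv
    module so that values containing commas or quotes are escaped.
    """
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    payload = buffer.getvalue().rstrip("\n")
    return f".ingest inline into table {table} <|\n{payload}"


# Made-up sample readings for a hypothetical telemetry table.
rows = [
    ("2018-09-24T10:00:00Z", "device-01", 21.5),
    ("2018-09-24T10:01:00Z", "device-02", 22.1),
]
print(inline_ingest_command("DeviceTelemetry", rows))
```

The order of values in each row has to match the column order of the target table, since the inline payload carries no column names.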
The ingestion of data involves orchestration and monitoring. It is followed by data exploration, where the data is queried from the data store. At this stage, either data mining produces insights or algorithmic results are generated as part of the data product.
The model is then tested in iterations to verify the results, and it is shared with others for verification and validation. Once validation has passed, the model is put to use and its results can be served for business benefit. The results can be in the form of Power BI dashboards, Excel files, etc.
ADX integrates with other major services to provide an end-to-end data solution that includes all the steps of data analytics. As the name suggests, it plays a critical role in the complete flow by performing the data exploration step on the large volumes of ingested data.
Part – 1: Data Science Overview
Part – 3: Azure Data Explorer Features
Part – 4: Azure Data Explorer Service Capabilities
Part – 5: Creating the ADX Environment
Part – 6: The Kusto Query Language
Part – 7: Data Obfuscation in Kusto Query Language
Part – 8: Data Ingestion Preparation: Schema Mapping
Part – 9: Overview of data ingestion in Azure Data Explorer
Part – 10: Managing Azure Data Explorer Cluster