Dedicated SQL Pool vs. Serverless SQL—Comparison and Uses
Written by Imran Abdul Rauf
Technical Content Writer
February 10, 2022
How is the term serverless misleading?
It’s a managed service through which a provider handles server infrastructure. Serverless architecture, commonly known as serverless computing, defines a cloud computing implementation system in which the service provider provides machine resources and manages servers for customers’ demands.
In short, the technical personnel responsible aren’t bound to handle servers, capacity planning, or maintenance and support tasks, but they’re in charge of the logic aspects only.
Serverless SQL pool with Azure Synapse Analytics
Serverless SQL pool is a query service in your data lake that facilitates users to access data by the following workarounds:
- A known T-SQL syntax to query data without any requirement of copying or loading data within a specialized location.
- Integrated connectivity through the T-SQL interface, which offers a variety of BI and ad-hoc querying tools and popular drivers.
Serverless SQL pool works as a distributed data processing system primarily used for large-scale data and computations jobs.
Serverless SQL pool benefits
Serverless SQL pool offers many benefits for teams and users looking to explore data and acquire insights from the data lake, optimize their current data transformation pipeline. These are the situations best suited for using a serverless SQL pool endpoint.
- Fundamental discovery and exploration:Inquire the data in different formats like Parquet, JSON, CSV, etc., in the data lake. The purpose is to make action plans for extracting valuable insights.
- Logical data warehouse: Issues a relational abstraction over the disparate data without transforming and relocating data, users get the view updated data at all times.
- Data transformation: Transform the data in the lake through T-SQL in a convenient, scalable manner. The data is then added to BI and other tools or incorporated into a relational data store such as Azure SQL Database, Synapse SQL Database, etc.
Other personnel who can leverage serverless SQL pool are:
- Data scientists can promptly examine the content and structure of the lake data by using the OPENROWSET and automatic schema inference features.
- Data engineers can explore the lake, transform, create, and simplify their data transformation pipelines through the pool.
- Data analysts can explore the data and Spark external tables through T-SQL language and other familiar tools, which can be further connected with serverless SQL pool.
- Business intelligence experts can instantly generate BI reports over Spark tables and data in the lake.
Dedicated SQL pool vs. serverless SQL
Dedicated SQL pool
The SQL pool was known as Azure SQL Data Warehouse, before becoming a part of Synapse. When using Synapse SQL, a dedicated pool presents several provisioned analytical resources. A dedicated SQL is a big data solution that provides data storage in a relational table with columnar storage, and it improves query performance and limits the storage cost. A dedicated SQL server is measured in data warehousing units (DWU). After the data is stored in the dedicated SQL pool, users can utilize it for analytics on a large scale.
Serverless SQL pool
A serverless SQL pool stores and computes large-scale data in a distributing data processing system. The Azure serverless SQL pool doesn’t mandate users to set up infrastructure and maintain clusters. Instead, it uses a pay-per-use model, which means there is no cost incurred for resources not used, and the charge made is based on the data processed by each query run.
|Dedicated SQL Pool||Serverless SQL Pool|
|Allows you to run a query and ingest data from the business’s data lake files||Allows users to query their data lake files|
|Demands an infrastructure set up||Not required to set up an infrastructure or maintain clusters|
|Need to acquire dedicated servers before conducting any operation||Seamless data transformation and exploration without any infrastructure in place|
|Data is stored in relational tables||Data is stored in Data Lake|
|Cost is managed by pausing SQL pool or scaling down warehouse||Cost is automatically handled and invoiced on a pay-per-query basis|
|Cost is incurred for the resources reserved||Cost is incurred for the data processed per query|
|Pay per DWU provisioned||Pay per TB Processed|
Unlike dedicated SQL pools where the user pays for reserved resources at a pre-decided scale, the cost of serverless SQL is calculated per TB or data processed based on the queries run. The pricing of serverless SQL pool is around 5 USD per TB of data processed, but costs can quickly rise if the queries run are above large multi TB datasets.
Some users might intentionally run large queries using a pay-per-query price model, but you can always manage your expense. The Cost Control feature in Azure Synapse analytics allows users to determine their budgets based on daily, weekly, and monthly spend. Users can configure the feature through Synapse Studio or the stored process in T-SQL.
Use cases for Synapse serverless SQL pool
Ad hoc analysis and data discovery
- It is used to explore analytical data as an alternate to Apache Spark, particularly for personnel already hands-on with T-SQL, without any need for cluster configuration.
- Ad hoc queries are run for data analysis and profiling purposes to work out transformation demands for new data sources added.
- Providing data to businesses for initial analysis before the ETL pipelines development.
- Conducting ad hoc analysis for historical analysis ingested in the data lake and comparing with the existing data in the data warehouse.
Data lake analytics
- Use lakehouse data via BI tools to report for transformations required.
- Leveraging the data lake as a full-fledged logical data warehouse, running queries similar to the DW through equal latency and throughput. Creating the data warehouse straight on top of the data lake helps prevent data duplication in relational databases like Azure SQL DB. Users can create views over the data lake files in T-SQL through a serverless SQL pool.
Azure Synapse Analytics is a product of Microsoft Azure used for data warehousing and big data analytics purposes. In Azure, technical personnel can choose various technologies such as Azure Synapse SQL vs. Apache Spark and Dedicated SQL vs. Serverless SQL. Each technology suite comes with its own set of features, benefits, and use cases.
Royal Cyber helps clients leverage data analytics for better productivity, risk and compliance, planning and monitoring resources, and more through enterprise-wide consulting.