Switch content of the page by the Role togglethe content would be changed according to the role
T-SQL Querying, 1st edition
Published by Microsoft Press (March 6, 2015) © 2015
- Itzik Ben-Gan
- Adam Machanic
- Dejan Sarka
- Kevin Farlee
eTextbook
$57.99
- Available for purchase from all major ebook resellers, including InformIT.com.
- To request a review copy, click on the "Request a Review Copy" button.
$47.99
- A print text (hardcover or paperback)
- Free shipping
- Also available for purchase as an ebook from all major ebook resellers, including InformIT.com
This book gives students a detailed look at the internal architecture of T-SQL and a comprehensive programming reference. Readers will tackle the toughest set-based querying and query tuning problems—guided by an author team with in-depth, inside knowledge of T-SQL. This book deepens student understanding of architecture and internals—and gives practical approaches and advanced techniques to optimize code performance. The book covers many unique techniques that provide highly efficient solutions for common challenges.
- Covers moving from procedural programming to the language of sets and logic
- Shows how to optimize query tuning with a top-down methodology
- Assesses algorithmic complexity to predict performance
- Compares data-aggregation techniques, including new grouping sets
Foreword xv
Introduction xvii
Chapter 1: Logical query processing 1
Logical query-processing phases 3
Logical query-processing phases in brief 4
Sample query based on customers/orders scenario 6
Logical query-processing phase details 8
Step 1: The FROM phase 8
Step 2: The WHERE phase 14
Step 3: The GROUP BY phase 15
Step 4: The HAVING phase 16
Step 5: The SELECT phase 17
Step 6: The ORDER BY phase 20
Step 7: Apply the TOP or OFFSET-FETCH filter 22
Further aspects of logical query processing 26
Table operators 26
Window functions 35
The UNION, EXCEPT, and INTERSECT operators 38
Conclusion 39
Chapter 2: Query tuning 41
Internals 41
Pages and extents 42
Table organization 43
Tools to measure query performance 53
Access methods 57
Table scan/unordered clustered index scan 57
Unordered covering nonclustered index scan 60
Ordered clustered index scan 62
Ordered covering nonclustered index scan 63
The storage engine’s treatment of scans 65
Nonclustered index seek + range scan + lookups 81
Unordered nonclustered index scan + lookups 91
Clustered index seek + range scan 93
Covering nonclustered index seek + range scan 94
Cardinality estimates 97
Legacy estimator vs. 2014 cardinality estimator 98
Implications of underestimations and overestimations 99
Statistics 101
Estimates for multiple predicates 104
Ascending key problem 107
Unknowns 110
Indexing features 115
Descending indexes 115
Included non-key columns 119
Filtered indexes and statistics 120
Columnstore indexes 123
Inline index definition 130
Prioritizing queries for tuning with extended events 131
Index and query information and statistics 134
Temporary objects 139
Set-based vs. iterative solutions 149
Query tuning with query revisions 153
Parallel query execution 158
How intraquery parallelism works 158
Parallelism and query optimization 175
The parallel APPLY query pattern 181
Conclusion 186
Chapter 3: Multi-table queries 187
Subqueries 187
Self-contained subqueries 187
Correlated subqueries 189
The EXISTS predicate 194
Misbehaving subqueries 201
Table expressions 204
Derived tables 205
CTEs 207
Views 211
Inline table-valued functions 215
Generating numbers 215
The APPLY operator 218
The CROSS APPLY operator 219
The OUTER APPLY operator 221
Implicit APPLY 221
Reuse of column aliases 222
Joins 224
Cross join 224
Inner join 228
Outer join 229
Self join 230
Equi and non-equi joins 230
Multi-join queries 231
Semi and anti semi joins 237
Join algorithms 239
Separating elements 245
The UNION, EXCEPT, and INTERSECT operators 249
The UNION ALL and UNION operators 250
The INTERSECT operator 253
The EXCEPT operator 255
Conclusion 257
Chapter 4: Grouping, pivoting, and windowing 259
Window functions 259
Aggregate window functions 260
Ranking window functions 281
Offset window functions 285
Statistical window functions 288
Gaps and islands 291
Pivoting 299
One-to-one pivot 300
Many-to-one pivot 304
Unpivoting 307
Unpivoting with CROSS JOIN and VALUES 308
Unpivoting with CROSS APPLY and VALUES 310
Using the UNPIVOT operator 312
Custom aggregations 313
Using a cursor 314
Using pivoting 315
Specialized solutions 316
Grouping sets 327
GROUPING SETS subclause 328
CUBE and ROLLUP clauses 331
Grouping sets algebra 333
Materializing grouping sets 334
Sorting 337
Conclusion 339
Chapter 5: TOP and OFFSET-FETCH 341
The TOP and OFFSET-FETCH filters 341
The TOP filter 341
The OFFSET-FETCH filter 345
Optimization of filters demonstrated through paging 346
Optimization of TOP 346
Optimization of OFFSET-FETCH 354
Optimization of ROW_NUMBER 358
Using the TOP option with modifications 360
TOP with modifications 360
Modifying in chunks 361
Top N per group 363
Solution using ROW_NUMBER 364
Solution using TOP and APPLY 365
Solution using concatenation (a carry-along sort) 366
Median 368
Solution using PERCENTILE_CONT 369
Solution using ROW_NUMBER 369
Solution using OFFSET-FETCH and APPLY 370
Conclusion 371
Chapter 6: Data modification 373
Inserting data 373
SELECT INTO 373
Bulk import 376
Measuring the amount of logging 377
BULK rowset provider 378
Sequences 381
Characteristics and inflexibilities of the identity property 381
The sequence object 382
Performance considerations 387
Summarizing the comparison of identity with sequence 394
Deleting data 395
TRUNCATE TABLE 395
Deleting duplicates 399
Updating data 401
Update using table expressions 402
Update using variables 403
Merging data 404
MERGE examples 405
Preventing MERGE conflicts 408
ON isn't a filter 409
USING is similar to FROM 410
The OUTPUT clause 411
Example with INSERT and identity 412
Example for archiving deleted data 413
Example with the MERGE statement 414
Composable DML 417
Conclusion 417
Chapter 7: Working with date and time 419
Date and time data types 419
Date and time functions 422
Challenges working with date and time 434
Literals 434
Identifying weekdays 436
Handling date-only or time-only data with DATETIME and SMALLDATETIME 439
First, last, previous, and next date calculations 440
Search argument 445
Rounding issues 447
Querying date and time data 449
Grouping by the week 449
Intervals 450
Conclusion 471
Chapter 8: T-SQL for BI practitioners 473
Data preparation 473
Sales analysis view 474
Frequencies 476
Frequencies without window functions 476
Frequencies with window functions 477
Descriptive statistics for continuous variables 479
Centers of a distribution 479
Spread of a distribution 482
Higher population moments 487
Linear dependencies 495
Two continuous variables 495
Contingency tables and chi-squared 501
Analysis of variance 505
Definite integration 509
Moving averages and entropy 512
Moving averages 512
Entropy 518
Conclusion 522
Chapter 9: Programmable objects 525
Dynamic SQL 525
Using the EXEC command 525
Using the sp_executesql procedure 529
Dynamic pivot 530
Dynamic search conditions 535
Dynamic sorting 542
User-defined functions 546
Scalar UDFs 546
Multistatement TVFs 550
Stored procedures 553
Compilations, recompilations, and reuse of execution plans 554
Table type and table-valued parameters 571
EXECUTE WITH RESULT SETS 573
Triggers 575
Trigger types and uses 575
Efficient trigger programming 581
SQLCLR programming 585
SQLCLR architecture 586
CLR scalar functions and creating your first assembly 588
Streaming table-valued functions 597
SQLCLR stored procedures and triggers 605
SQLCLR user-defined types 617
SQLCLR user-defined aggregates 628
Transaction and concurrency 632
Transactions described 633
Locks and blocking 636
Lock escalation 641
Delayed durability 643
Isolation levels 645
Deadlocks 657
Error handling 662
The TRY-CATCH construct 662
Errors in transactions 666
Retry logic 669
Conclusion 670
Chapter 10: In-Memory OLTP 671
In-Memory OLTP overview 671
Data is always in memory 672
Native compilation 673
Lock and latch-free architecture 673
SQL Server integration 674
Creating memory-optimized tables 675
Creating indexes in memory-optimized tables 676
Clustered vs. nonclustered indexes 677
Nonclustered indexes 677
Hash indexes 680
Execution environments 690
Query interop 690
Natively compiled procedures 699
Surface-area restrictions 703
Table DDL 703
DML 704
Conclusion 705
Chapter 11: Graphs and recursive queries 707
Terminology 707
Graphs 707
Trees 708
Hierarchies 709
Scenarios 709
Employee organizational chart 709
Bill of materials (BOM) 711
Road system 715
Iteration/recursion 718
Subgraph/descendants 719
Ancestors/path 730
Subgraph/descendants with path enumeration 733
Sorting 736
Cycles 740
Materialized path 742
Maintaining data 743
Querying 749
Materialized path with the HIERARCHYID data type 754
Maintaining data 756
Querying 763
Further aspects of working with HIERARCHYID 767
Nested sets 778
Assigning left and right values 778
Querying 784
Transitive closure 787
Directed acyclic graph 787
Conclusion 801
Index 803
Introduction xvii
Chapter 1: Logical query processing 1
Logical query-processing phases 3
Logical query-processing phases in brief 4
Sample query based on customers/orders scenario 6
Logical query-processing phase details 8
Step 1: The FROM phase 8
Step 2: The WHERE phase 14
Step 3: The GROUP BY phase 15
Step 4: The HAVING phase 16
Step 5: The SELECT phase 17
Step 6: The ORDER BY phase 20
Step 7: Apply the TOP or OFFSET-FETCH filter 22
Further aspects of logical query processing 26
Table operators 26
Window functions 35
The UNION, EXCEPT, and INTERSECT operators 38
Conclusion 39
Chapter 2: Query tuning 41
Internals 41
Pages and extents 42
Table organization 43
Tools to measure query performance 53
Access methods 57
Table scan/unordered clustered index scan 57
Unordered covering nonclustered index scan 60
Ordered clustered index scan 62
Ordered covering nonclustered index scan 63
The storage engine’s treatment of scans 65
Nonclustered index seek + range scan + lookups 81
Unordered nonclustered index scan + lookups 91
Clustered index seek + range scan 93
Covering nonclustered index seek + range scan 94
Cardinality estimates 97
Legacy estimator vs. 2014 cardinality estimator 98
Implications of underestimations and overestimations 99
Statistics 101
Estimates for multiple predicates 104
Ascending key problem 107
Unknowns 110
Indexing features 115
Descending indexes 115
Included non-key columns 119
Filtered indexes and statistics 120
Columnstore indexes 123
Inline index definition 130
Prioritizing queries for tuning with extended events 131
Index and query information and statistics 134
Temporary objects 139
Set-based vs. iterative solutions 149
Query tuning with query revisions 153
Parallel query execution 158
How intraquery parallelism works 158
Parallelism and query optimization 175
The parallel APPLY query pattern 181
Conclusion 186
Chapter 3: Multi-table queries 187
Subqueries 187
Self-contained subqueries 187
Correlated subqueries 189
The EXISTS predicate 194
Misbehaving subqueries 201
Table expressions 204
Derived tables 205
CTEs 207
Views 211
Inline table-valued functions 215
Generating numbers 215
The APPLY operator 218
The CROSS APPLY operator 219
The OUTER APPLY operator 221
Implicit APPLY 221
Reuse of column aliases 222
Joins 224
Cross join 224
Inner join 228
Outer join 229
Self join 230
Equi and non-equi joins 230
Multi-join queries 231
Semi and anti semi joins 237
Join algorithms 239
Separating elements 245
The UNION, EXCEPT, and INTERSECT operators 249
The UNION ALL and UNION operators 250
The INTERSECT operator 253
The EXCEPT operator 255
Conclusion 257
Chapter 4: Grouping, pivoting, and windowing 259
Window functions 259
Aggregate window functions 260
Ranking window functions 281
Offset window functions 285
Statistical window functions 288
Gaps and islands 291
Pivoting 299
One-to-one pivot 300
Many-to-one pivot 304
Unpivoting 307
Unpivoting with CROSS JOIN and VALUES 308
Unpivoting with CROSS APPLY and VALUES 310
Using the UNPIVOT operator 312
Custom aggregations 313
Using a cursor 314
Using pivoting 315
Specialized solutions 316
Grouping sets 327
GROUPING SETS subclause 328
CUBE and ROLLUP clauses 331
Grouping sets algebra 333
Materializing grouping sets 334
Sorting 337
Conclusion 339
Chapter 5: TOP and OFFSET-FETCH 341
The TOP and OFFSET-FETCH filters 341
The TOP filter 341
The OFFSET-FETCH filter 345
Optimization of filters demonstrated through paging 346
Optimization of TOP 346
Optimization of OFFSET-FETCH 354
Optimization of ROW_NUMBER 358
Using the TOP option with modifications 360
TOP with modifications 360
Modifying in chunks 361
Top N per group 363
Solution using ROW_NUMBER 364
Solution using TOP and APPLY 365
Solution using concatenation (a carry-along sort) 366
Median 368
Solution using PERCENTILE_CONT 369
Solution using ROW_NUMBER 369
Solution using OFFSET-FETCH and APPLY 370
Conclusion 371
Chapter 6: Data modification 373
Inserting data 373
SELECT INTO 373
Bulk import 376
Measuring the amount of logging 377
BULK rowset provider 378
Sequences 381
Characteristics and inflexibilities of the identity property 381
The sequence object 382
Performance considerations 387
Summarizing the comparison of identity with sequence 394
Deleting data 395
TRUNCATE TABLE 395
Deleting duplicates 399
Updating data 401
Update using table expressions 402
Update using variables 403
Merging data 404
MERGE examples 405
Preventing MERGE conflicts 408
ON isn't a filter 409
USING is similar to FROM 410
The OUTPUT clause 411
Example with INSERT and identity 412
Example for archiving deleted data 413
Example with the MERGE statement 414
Composable DML 417
Conclusion 417
Chapter 7: Working with date and time 419
Date and time data types 419
Date and time functions 422
Challenges working with date and time 434
Literals 434
Identifying weekdays 436
Handling date-only or time-only data with DATETIME and SMALLDATETIME 439
First, last, previous, and next date calculations 440
Search argument 445
Rounding issues 447
Querying date and time data 449
Grouping by the week 449
Intervals 450
Conclusion 471
Chapter 8: T-SQL for BI practitioners 473
Data preparation 473
Sales analysis view 474
Frequencies 476
Frequencies without window functions 476
Frequencies with window functions 477
Descriptive statistics for continuous variables 479
Centers of a distribution 479
Spread of a distribution 482
Higher population moments 487
Linear dependencies 495
Two continuous variables 495
Contingency tables and chi-squared 501
Analysis of variance 505
Definite integration 509
Moving averages and entropy 512
Moving averages 512
Entropy 518
Conclusion 522
Chapter 9: Programmable objects 525
Dynamic SQL 525
Using the EXEC command 525
Using the sp_executesql procedure 529
Dynamic pivot 530
Dynamic search conditions 535
Dynamic sorting 542
User-defined functions 546
Scalar UDFs 546
Multistatement TVFs 550
Stored procedures 553
Compilations, recompilations, and reuse of execution plans 554
Table type and table-valued parameters 571
EXECUTE WITH RESULT SETS 573
Triggers 575
Trigger types and uses 575
Efficient trigger programming 581
SQLCLR programming 585
SQLCLR architecture 586
CLR scalar functions and creating your first assembly 588
Streaming table-valued functions 597
SQLCLR stored procedures and triggers 605
SQLCLR user-defined types 617
SQLCLR user-defined aggregates 628
Transaction and concurrency 632
Transactions described 633
Locks and blocking 636
Lock escalation 641
Delayed durability 643
Isolation levels 645
Deadlocks 657
Error handling 662
The TRY-CATCH construct 662
Errors in transactions 666
Retry logic 669
Conclusion 670
Chapter 10: In-Memory OLTP 671
In-Memory OLTP overview 671
Data is always in memory 672
Native compilation 673
Lock and latch-free architecture 673
SQL Server integration 674
Creating memory-optimized tables 675
Creating indexes in memory-optimized tables 676
Clustered vs. nonclustered indexes 677
Nonclustered indexes 677
Hash indexes 680
Execution environments 690
Query interop 690
Natively compiled procedures 699
Surface-area restrictions 703
Table DDL 703
DML 704
Conclusion 705
Chapter 11: Graphs and recursive queries 707
Terminology 707
Graphs 707
Trees 708
Hierarchies 709
Scenarios 709
Employee organizational chart 709
Bill of materials (BOM) 711
Road system 715
Iteration/recursion 718
Subgraph/descendants 719
Ancestors/path 730
Subgraph/descendants with path enumeration 733
Sorting 736
Cycles 740
Materialized path 742
Maintaining data 743
Querying 749
Materialized path with the HIERARCHYID data type 754
Maintaining data 756
Querying 763
Further aspects of working with HIERARCHYID 767
Nested sets 778
Assigning left and right values 778
Querying 784
Transitive closure 787
Directed acyclic graph 787
Conclusion 801
Index 803
Itzik Ben-Gan is a mentor for and co-founder of SolidQ. A SQL Server Microsoft MVP (Most Valuable Professional) since 1999, Itzik has delivered numerous training events around the world focused on T-SQL querying, query tuning, and programming. Itzik has authored several T-SQL books as well as articles for SQL Server Pro, SolidQ Journal, and MSDN. Itzik’s speaking activities include TechEd, SQLPASS, SQL Server Connections, SolidQ events, and various user groups around the world. Itzik is the author of SolidQ’s Advanced T-SQL Querying, Programming and Tuning, and T-SQL Fundamentals courses, along with being a primary resource within the company for its T-SQL-related activities.
Dejan Sarka, MCT and SQL Server MVP, is an independent consultant, trainer, and developer focusing on database and business intelligence applications. His specialties are advanced topics like data modeling, data mining, and data quality. On these tough topics, he works and researches together with SolidQ and the Data Quality Institute. He is the founder of the Slovenian SQL Server and .NET Users Group. Dejan Sarka is the main author or coauthor of 11 books about databases and SQL Server, with more to come. He also has developed and is continuing to develop many courses and seminars for SolidQ and Microsoft. He has been a regular speaker at many conferences worldwide for more than 15 years, including Microsoft TechEd, PASS Summit, and others.
Adam Machanic is a Boston-based SQL Server developer, writer, and speaker. He focuses on large-scale data warehouse performance and development and is author of the award-winning SQL Server monitoring stored procedure, sp_WhoIsActive. Adam has contributed to several books on SQL Server, including T-SQL Querying (Microsoft Press, 2015) and SQL Server 2008 Internals (Microsoft Press, 2009). He regularly speaks at international conferences and training events on a variety of SQL Server topics.
Kevin Farlee has over 25 years in the industry, in both database and storage-management software. In his current role as a Storage Engine Program Manager on the Microsoft SQL Server team, he brings these threads together. His current projects include the SQL Server Project “Hekaton” In-Memory OLTP feature.
Dejan Sarka, MCT and SQL Server MVP, is an independent consultant, trainer, and developer focusing on database and business intelligence applications. His specialties are advanced topics like data modeling, data mining, and data quality. On these tough topics, he works and researches together with SolidQ and the Data Quality Institute. He is the founder of the Slovenian SQL Server and .NET Users Group. Dejan Sarka is the main author or coauthor of 11 books about databases and SQL Server, with more to come. He also has developed and is continuing to develop many courses and seminars for SolidQ and Microsoft. He has been a regular speaker at many conferences worldwide for more than 15 years, including Microsoft TechEd, PASS Summit, and others.
Adam Machanic is a Boston-based SQL Server developer, writer, and speaker. He focuses on large-scale data warehouse performance and development and is author of the award-winning SQL Server monitoring stored procedure, sp_WhoIsActive. Adam has contributed to several books on SQL Server, including T-SQL Querying (Microsoft Press, 2015) and SQL Server 2008 Internals (Microsoft Press, 2009). He regularly speaks at international conferences and training events on a variety of SQL Server topics.
Kevin Farlee has over 25 years in the industry, in both database and storage-management software. In his current role as a Storage Engine Program Manager on the Microsoft SQL Server team, he brings these threads together. His current projects include the SQL Server Project “Hekaton” In-Memory OLTP feature.
Need help? Get in touch
Play
Privacy and cookies
By watching, you agree Pearson can share your viewership data for marketing and analytics for one year, revocable upon changing cookie preferences. Disabling cookies may affect video functionality. More info...