LiuLyndon Look For: [SQL] Compare Operator - 比較基本索引型態成本效率

各種索引的比較，如下：

主要分析不同索引鍵，所包含可能執行的種類
掃描：從第一筆資料讀至最後一筆資料才會停止
搜尋：利用相關的 "索引" 快速找到符合的資料

查詢必定使用 where 條件篩選，以及 "索引" 中 "鍵值資料行" (Key Column) / "鍵值屬性" (Key Attribute)

索引種類	掃描(scan)	搜尋(seek)	是否要考量輸出資料行
Heap	O
Clustered Index	O	O
NonClustered Index	O	O	O
NonClustered ColumnStore Index	O	O	O

資料庫使用 Adventure Works for SQL Server 2012 的資料表 Production.Product 產生 4 種不同資料表的索引種類

tbProduct01：Heap
tbProduct02：Clustered Index
tbProduct03：NonClustered Index
tbProduct04：NonClustered ColumnStore Index

CASE 1. 沒有 where 條件查詢，只輸出 "Key Column"

查詢

select ProductID from tbProduct01
select ProductID from tbProduct02
select ProductID from tbProduct03
select ProductID from tbProduct04

查詢結果

Result：

"Heap" 在查詢需輸出不需要的 "資料行" 越多，負擔會越高相對成本也會提高
"Heap" 如查詢資料表包含太多 "資料行" 會造能效率不彰的原因，建議將一個資料表中比較不常用的 "資料行"，切至另一個資料表來儲存，將會降低查詢的成本，稱為 "Vertical Partitioning" (垂直資料分割)
如果 "Clustered Index" 包括資料行，剛好全部都要輸出資料行，這將會是最佳情況
"NonClustered Index" 與 "NonClustered ColumnStore Index" 可視為 "輕量型資料表"

CASE 2. Where 條件沒利用索引中 "鍵值資料行"

查詢案例 1

select ProductID from tbProduct01 where Name = 'Reflector'
select ProductID from tbProduct02 where Name = 'Reflector'
select ProductID from tbProduct03 where Name = 'Reflector'
select ProductID from tbProduct04 where Name = 'Reflector'

查詢案例 1 結果

查詢 3：因 "Name" 不在索引內，所以 Query Optimizer 採用 "資料表掃描" 取代使用 "索引掃描" (NonClustered)
查詢 4：因 "NonClustered ColumnStore Index" 包含 "Name" 的索引資訊，此成本最低

查詢案例 2：使用 "Table Hints" (資料表提示) 來引響 "執行計畫"

select ProductID from tbProduct01 where Name = 'Reflector'
select ProductID from tbProduct02 with(index(clIndex)) where Name = 'Reflector'
select ProductID from tbProduct03 with(index(nclIndex)) where Name = 'Reflector'
select ProductID from tbProduct04 with(index(nclIndex)) where Name = 'Reflector'

查詢案例 2 結果

查詢 3：因不具有 "Name" 的屬性，所以除了使用 "NonClustered Index" 外，還多出 "RID 查閱" (Heap) 及 "巢狀迴圈" (內部連結) 兩個運算，因此查詢成本多出很多

Result：

資料量大時，強行使用 "NonClustered Index"，反而因使用 "查閱" ("RID 查閱" (Heap) 及 "索引鍵查閱" (Clustered) ) 相關造成反效果
如硬碟空間不是問題，適當使用 "NonClustered ColumnStore Index" 是不錯的選項

CASE 3. Where 條件使用"索引鍵資料行"，並只輸出 "鍵值資料行" (Key Column)

查詢

select ProductID from tbProduct01 where ProductID = 777
select ProductID from tbProduct02 where ProductID = 777
select ProductID from tbProduct03 where ProductID = 777
select ProductID from tbProduct04 where ProductID = 777

查詢結果

Result：

不具任何索引資料表為 "Heap"，查詢時全部都採用 "掃描"，執行效率表現最差
建議每資料表都應該建立 "Clustered Index"，確保某些條件下，有較好的查詢效率
而查詢 2, 3, 4 輸出資料行是索引中的 "鍵值資料行"，相對成本都是相同

CASE 4. Where 條件使用"索引鍵資料行"，並只輸出 "鍵值資料行" (Key Column) 與 "非鍵值資料行"

查詢

select ProductID, Name from tbProduct01 where ProductID = 777
select ProductID, Name from tbProduct02 where ProductID = 777
select ProductID, Name from tbProduct03 where ProductID = 777
select ProductID, Name from tbProduct04 where ProductID = 777

查詢結果

Result：

"查閱" 運算子將會照成額外的查詢成本
"搜尋" 查詢 2, 4 相對成本都為 13 %，無關 "資料行" 多少

CASE 5. Where 條件使用"索引鍵資料行"，輸出資料行超出 "NonClustered ColumnStore Index" 所包含資料行

查詢

select ProductID, Name, Color from tbProduct01 where ProductID = 777
select ProductID, Name, Color from tbProduct02 where ProductID = 777
select ProductID, Name, Color from tbProduct03 where ProductID = 777
select ProductID, Name, Color from tbProduct04 where ProductID = 777

查詢結果

Result：

NonClustered ColumnStore Index 包含的 "資料行' 不足提供給查詢使用，此需進行 "查閱"，因而提高查詢成本
建議對大部分會使用 "資料行"，利用 "NonClustered ColumnStore Index" 的特性來強化索引效能，因 "NonClustered ColumnStore Index" 可實質納入更多 "資料行"，運用上會更彈性
"NonClustered ColumnStore Index" 缺點會耗費更多儲存空間

CASE 6. 使用 (sp_spaceused) 查詢不同種類的索引所佔用儲存空間

查詢

exec sp_spaceused tbProduct01
exec sp_spaceused tbProduct02
exec sp_spaceused tbProduct03
exec sp_spaceused tbProduct04

查詢結果

Result：

建立索引可降低查詢執行成本，但會提高儲存空間成本，當索引建立越多，所佔的硬碟空間會越多
"NonClustered ColumnStore Index" 額外耗費的空間更多，但可額外納入更多 "資料行"

REF：

SQL Server 效能調校

LiuLyndon Look For

2017年10月8日星期日

[SQL] Compare Operator - 比較基本索引型態成本效率

CASE 1. 沒有 where 條件查詢，只輸出 "Key Column"

CASE 2. Where 條件沒利用索引中 "鍵值資料行"

CASE 3. Where 條件使用"索引鍵資料行"，並只輸出 "鍵值資料行" (Key Column)

CASE 4. Where 條件使用"索引鍵資料行"，並只輸出 "鍵值資料行" (Key Column) 與 "非鍵值資料行"

CASE 5. Where 條件使用"索引鍵資料行"，輸出資料行超出 "NonClustered ColumnStore Index" 所包含資料行

CASE 6. 使用 (sp_spaceused) 查詢不同種類的索引所佔用儲存空間

沒有留言:

張貼留言

2017年10月8日 星期日

[SQL] Compare Operator - 比較基本索引型態成本效率

CASE 1. 沒有 where 條件查詢，只輸出 "Key Column"

CASE 2. Where 條件沒利用索引中 "鍵值資料行"

CASE 3. Where 條件使用"索引鍵資料行"，並只輸出 "鍵值資料行" (Key Column)

CASE 4. Where 條件使用"索引鍵資料行"，並只輸出 "鍵值資料行" (Key Column) 與 "非鍵值資料行"

CASE 5. Where 條件使用"索引鍵資料行"，輸出資料行超出 "NonClustered ColumnStore Index" 所包含資料行

CASE 6. 使用 (sp_spaceused) 查詢不同種類的索引所佔用儲存空間

沒有留言:

張貼留言

2017年10月8日星期日