없이 나는 TPC-H 데이터베이스 및 워크로드를하고 난 (TPC-H 쿼리 21) 다음 쿼리를 처리 해요 :SQL Server 데이터베이스 튜닝 자체 조인 비정규
SELECT TOP 100
S_NAME, COUNT(*) AS NUMWAIT
FROM SUPPLIER S
JOIN NATION N ON S.S_NATIONKEY = N.N_NATIONKEY AND N.N_NAME = 'JORDAN'
JOIN LINEITEM L1 ON S.S_SUPPKEY = L1.L_SUPPKEY AND L1.L_RECEIPTDATE > L1.L_COMMITDATE
JOIN ORDERS O ON O.O_ORDERKEY = L1.L_ORDERKEY AND O.O_ORDERSTATUS = 'F'
WHERE EXISTS (
SELECT *
FROM LINEITEM L2
WHERE L2.L_ORDERKEY = L1.L_ORDERKEY AND
L2.L_SUPPKEY <> L1.L_SUPPKEY
)
GROUP BY S_NAME
ORDER BY NUMWAIT DESC, S_NAME
쿼리 문제를보다 분명하게하기 위해 원본으로 약간 수정되었습니다. 쿼리의 아이디어는 여러 공급 업체 주문을 기다리는 공급 업체 이름을 찾는 것입니다. 쿼리는 특정 국가의 공급 업체를 식별합니다. 이 쿼리의 주요 문제점은 기본적으로 존재하는 여러 공급 업체 주문을 식별하는 것입니다.
나는 다음과 같은 인덱스를 작성하는 경우 :
CREATE VIEW LINEITEM_VIOLATE1 WITH SCHEMABINDING AS
SELECT L_ORDERKEY, L_LINENUMBER, L_SUPPKEY
FROM dbo.LINEITEM
WHERE L_RECEIPTDATE > L_COMMITDATE
GO
CREATE UNIQUE CLUSTERED INDEX noncluster_idx_lineitem_orderkey1
ON LINEITEM_VIOLATE1(L_ORDERKEY, L_LINENUMBER, L_SUPPKEY)
GO
CREATE NONCLUSTERED INDEX nonclustered_idx_lineitem_suppkey1
ON [dbo].[LINEITEM_VIOLATE1] ([L_SUPPKEY])
INCLUDE ([L_ORDERKEY])
GO
CREATE NONCLUSTERED INDEX change2
ON [dbo].[ORDERS] ([O_ORDERSTATUS])
INCLUDE ([O_ORDERKEY])
GO
CREATE NONCLUSTERED INDEX change3
ON [dbo].[LINEITEM] ([L_ORDERKEY])
INCLUDE ([L_SUPPKEY])
GO
나는 다음과 같은 결과가 꽤 좋은 쿼리 계획을 얻을 :
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'ORDERS'. Scan count 1, logical reads 1719, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LINEITEM'. Scan count 137252, logical reads 437982, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LINEITEM_VIOLATE1'. Scan count 362, logical reads 1484, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'SUPPLIER'. Scan count 1, logical reads 205, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'NATION'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 938 ms, elapsed time = 1134 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
작업, 두 개 이상의 고유 한 공급 업체가 주문을 찾고, 시간의 절반 이상을 소비합니다. 쿼리 계획 (자체 조인은 79 %의 비용이 소요되는 것으로 추정 됨)에서 LINEITEM의 여러 논리적 읽기에서 관찰 할 수 있으며 또한 CPU 사용 시간이 반으로 줄어든 exist 절을 테스트 해 보았습니다. 이 하위 쿼리에는 자체 조인이 포함되어 있으므로 인덱싱 된 뷰를 만들 수 없습니다. 비정규 화하지 않고 최대한의 성능을 얻고 싶습니다. 이 쿼리의 성능을 더욱 향상시키는 방법을 알고 있습니까? 통찰력을 가져 주셔서 감사합니다.
난 그냥 쿼리 계획의 텍스트 표현을 추가합니다
100 1 SELECT TOP 100 S_NAME, COUNT(*) AS NUMWAIT FROM SUPPLIER S JOIN NATION N ON S.S_NATIONKEY = N.N_NATIONKEY AND N.N_NAME = 'JORDAN' JOIN LINEITEM L1 ON S.S_SUPPKEY = L1.L_SUPPKEY AND L1.L_RECEIPTDATE > L1.L_COMMITDATE JOIN ORDERS O ON O.O_ORDERKEY = L1.L_ORDERKEY AND O.O_ORDERSTATUS = 'F' WHERE EXISTS (SELECT * FROM LINEITEM L2 WHERE L2.L_ORDERKEY = L1.L_ORDERKEY AND L2.L_SUPPKEY <> L1.L_SUPPKEY ) GROUP BY S_NAME ORDER BY NUMWAIT DESC, S_NAME ----- 1 1 0 NULL NULL NULL NULL 100 NULL NULL NULL 49,24822 NULL NULL SELECT 0 NULL
100 1 |--Sort(TOP 100, ORDER BY:([Expr1016] DESC, [S].[S_NAME] ASC)) 1 2 1 Sort TopN Sort TOP 100, ORDER BY:([Expr1016] DESC, [S].[S_NAME] ASC) NULL 100 0,01126126 0,5933996 36 49,24822 [S].[S_NAME], [Expr1016] NULL PLAN_ROW 0 1
0 0 |--Compute Scalar(DEFINE:([Expr1016]=CONVERT_IMPLICIT(int,[Expr1024],0))) 1 3 2 Compute Scalar Compute Scalar DEFINE:([Expr1016]=CONVERT_IMPLICIT(int,[Expr1024],0)) [Expr1016]=CONVERT_IMPLICIT(int,[Expr1024],0) 9931,421 0 0,6267112 36 48,64356 [S].[S_NAME], [Expr1016] NULL PLAN_ROW 0 1
362 1 |--Hash Match(Aggregate, HASH:([S].[S_NAME]), RESIDUAL:([SUPPLIER].[S_NAME] as [S].[S_NAME] = [SUPPLIER].[S_NAME] as [S].[S_NAME]) DEFINE:([Expr1024]=COUNT(*))) 1 4 3 Hash Match Aggregate HASH:([S].[S_NAME]), RESIDUAL:([SUPPLIER].[S_NAME] as [S].[S_NAME] = [SUPPLIER].[S_NAME] as [S].[S_NAME]) [Expr1024]=COUNT(*) 9931,421 0 0,6267112 36 48,64356 [S].[S_NAME], [Expr1024] NULL PLAN_ROW 0 1
63631 1 |--Hash Match(Inner Join, HASH:([L1].[L_ORDERKEY])=([O].[O_ORDERKEY])) 1 5 4 Hash Match Inner Join HASH:([L1].[L_ORDERKEY])=([O].[O_ORDERKEY]) NULL 48872,45 0 4,75613 32 48,01685 [S].[S_NAME] NULL PLAN_ROW 0 1
132283 1 |--Nested Loops(Left Semi Join, OUTER REFERENCES:([L1].[L_ORDERKEY], [L1].[L_SUPPKEY], [Expr1023]) OPTIMIZED WITH UNORDERED PREFETCH) 1 6 5 Nested Loops Left Semi Join OUTER REFERENCES:([L1].[L_ORDERKEY], [L1].[L_SUPPKEY], [Expr1023]) OPTIMIZED WITH UNORDERED PREFETCH NULL 64854 0 0,3012108 36 41,18718 [S].[S_NAME], [L1].[L_ORDERKEY] NULL PLAN_ROW 0 1
137252 1 | |--Nested Loops(Inner Join, OUTER REFERENCES:([S].[S_SUPPKEY], [Expr1022]) OPTIMIZED WITH UNORDERED PREFETCH) 1 10 6 Nested Loops Inner Join OUTER REFERENCES:([S].[S_SUPPKEY], [Expr1022]) OPTIMIZED WITH UNORDERED PREFETCH NULL 72060 0 0,3012108 40 1,893772 [S].[S_NAME], [LINEITEM_VIOLATE1].[L_ORDERKEY], [LINEITEM_VIOLATE1].[L_SUPPKEY] NULL PLAN_ROW 0 1
362 1 | | |--Hash Match(Inner Join, HASH:([N].[N_NATIONKEY])=([S].[S_NATIONKEY])) 1 13 10 Hash Match Inner Join HASH:([N].[N_NATIONKEY])=([S].[S_NATIONKEY]) NULL 400 0 0,06381615 36 0,2325338 [S].[S_SUPPKEY], [S].[S_NAME] NULL PLAN_ROW 0 1
1 1 | | | |--Table Scan(OBJECT:([NATION] AS [N]), WHERE:([NATION].[N_NAME] as [N].[N_NAME]='JORDAN')) 1 14 13 Table Scan Table Scan OBJECT:([NATION] AS [N]), WHERE:([NATION].[N_NAME] as [N].[N_NAME]='JORDAN') [N].[N_NATIONKEY] 1 0,003125 0,0001845 36 0,0033095 [N].[N_NATIONKEY] NULL PLAN_ROW 0 1
10000 1 | | | |--Table Scan(OBJECT:([SUPPLIER] AS [S])) 1 15 13 Table Scan Table Scan OBJECT:([SUPPLIER] AS [S]) [S].[S_SUPPKEY], [S].[S_NAME], [S].[S_NATIONKEY] 10000 0,1542361 0,011157 40 0,1653931 [S].[S_SUPPKEY], [S].[S_NAME], [S].[S_NATIONKEY] NULL PLAN_ROW 0 1
137252 362 | | |--Index Seek(OBJECT:([LINEITEM_VIOLATE1].[nonclustered_idx_lineitem_suppkey1]), SEEK:([LINEITEM_VIOLATE1].[L_SUPPKEY]=[SUPPLIER].[S_SUPPKEY] as [S].[S_SUPPKEY]) ORDERED FORWARD) 1 17 10 Index Seek Index Seek OBJECT:([LINEITEM_VIOLATE1].[nonclustered_idx_lineitem_suppkey1]), SEEK:([LINEITEM_VIOLATE1].[L_SUPPKEY]=[SUPPLIER].[S_SUPPKEY] as [S].[S_SUPPKEY]) ORDERED FORWARD [LINEITEM_VIOLATE1].[L_ORDERKEY], [LINEITEM_VIOLATE1].[L_SUPPKEY] 180,15 0,003125 0,000355165 15 1,360028 [LINEITEM_VIOLATE1].[L_ORDERKEY], [LINEITEM_VIOLATE1].[L_SUPPKEY] NULL PLAN_ROW 0 400
132283 137252 | |--Index Seek(OBJECT:([LINEITEM].[change3] AS [L2]), SEEK:([L2].[L_ORDERKEY]=[LINEITEM].[L_ORDERKEY] as [L1].[L_ORDERKEY]), WHERE:([LINEITEM].[L_SUPPKEY] as [L2].[L_SUPPKEY]<>[LINEITEM].[L_SUPPKEY] as [L1].[L_SUPPKEY]) ORDERED FORWARD) 1 23 6 Index Seek Index Seek OBJECT:([LINEITEM].[change3] AS [L2]), SEEK:([L2].[L_ORDERKEY]=[LINEITEM].[L_ORDERKEY] as [L1].[L_ORDERKEY]), WHERE:([LINEITEM].[L_SUPPKEY] as [L2].[L_SUPPKEY]<>[LINEITEM].[L_SUPPKEY] as [L1].[L_SUPPKEY]) ORDERED FORWARD NULL 1 0,003125 0,0001614009 11 38,95377 NULL NULL PLAN_ROW 0 72060
729413 1 |--Index Seek(OBJECT:([ORDERS].[change2] AS [O]), SEEK:([O].[O_ORDERSTATUS]='F') ORDERED FORWARD) 1 24 5 Index Seek Index Seek OBJECT:([ORDERS].[change2] AS [O]), SEEK:([O].[O_ORDERSTATUS]='F') ORDERED FORWARD [O].[O_ORDERKEY] 729413 1,271023 0,8025113 11 2,073534 [O].[O_ORDERKEY] NULL PLAN_ROW 0 1
답변 해 주셔서 감사합니다. 불행히도 LINEITEM은 가장 큰 테이블이므로 인라인보기는 그 자체로 매우 비용이 많이 듭니다. 인덱스를 사용하여이 쿼리를 실행하면 원래 솔루션과 비교할 때 두 배의 시간이 필요합니다. –