October 2, 2014 at 8:57 am
Hi all, I'm encountering a strange situation with SS2K5. I'm running a query joining two tables with a left outer join. Most of the fields are being formatted using UDFs, since the result of this query needs to satisfy strict requirements.
If I run this query including an ORDER BY clause, the query is returning more rows than if I don't use ORDER BY.
Has anyone even encountered such a situation?
Thanks for any help!
George
October 2, 2014 at 9:07 am
ORDER BY won't change the number of rows, it has to be something else. Could you post the queries that you're using?
October 2, 2014 at 9:28 am
I've never heard of such a thing, but am incredibly curious to see it. Please post the DDL and query you're using.
October 2, 2014 at 9:30 am
Hi, this is the DDL (It's actually a sproc):
ALTER PROCEDURE [dbo].[sp_aqr_createTape_RRE] (@snapshot SMALLDATETIME)
AS
BEGIN
DELETE FROM dbo.tb_aqr_RRE_tape_temp WHERE R_SNAPF = @snapshot
DECLARE @count BIGINT
SET @count = (SELECT COUNT(*) FROM dbo.tb_aqr_RRE_tape_temp)
IF ISNULL(@count, 0) = 0
DBCC checkident(tb_aqr_RRE_tape_temp, reseed, 0)
INSERT INTO
dbo.tb_aqr_RRE_tape_temp
(
R_SNAPF,
R_ENTITY,
R_SYSTEM,
R_COUNTR,
R_BRANCH,
R_IDFF,
R_IDFD,
R_GEOGF,
S_AQRASF,
S_CRR,
B_RESMAT,
B_CHAN,
B_PROD,
B_CURR,
B_EFFRAT,
B_CURRAT,
B_PROT,
E_ONBAL,
E_OFFBAL,
E_CCF,
E_LIR,
S_NPEINT,
S_NPEEBA,
S_NPE12M,
D_DPD,
FO_INT,
R_INTRAT,
A_FAIRVA,
P_SPECF,
P_IBNRF,
R_IDCC,
C_TYPE,
C_COUNTR,
C_REGION,
C_VAL,
C_FLAG,
C_DATE,
C_COVER,
C_VALINS,
P_PROVF
)
SELECT
-- R_SNAPF
@snapshot,
-- R_ENTITY
CASE
WHEN LEN(LTRIM(RTRIM(ISNULL(L.R_ENTITY, '')))) = 0
THEN 'MISS'
ELSE
L.R_ENTITY
END,
-- R_SYSTEM
CASE
WHEN LEN(LTRIM(RTRIM(ISNULL(L.R_SYSTEM, '')))) = 0
THEN 'MISS'
ELSE
L.R_SYSTEM
END,
-- R_COUNTR
CASE
WHEN LEN(LTRIM(RTRIM(ISNULL(L.R_COUNTR, '')))) = 0
THEN 'MISS'
ELSE
L.R_COUNTR
END,
-- R_BRANCH
CASE
WHEN LEN(LTRIM(RTRIM(ISNULL(L.R_BRANCH, '')))) = 0
THEN 'MISS'
ELSE
L.R_BRANCH
END,
-- R_IDFF
CASE
WHEN LEN(LTRIM(RTRIM(ISNULL(L.R_IDFF, '')))) = 0
THEN CAST('MISS' AS NVARCHAR (255))
ELSE
CAST(L.R_IDFF AS NVARCHAR (255))
END,
-- R_IDFD
CASE
WHEN LEN(LTRIM(RTRIM(ISNULL(L.R_IDFD, '')))) = 0
THEN CAST('MISS' AS NVARCHAR (255))
ELSE
CAST(L.R_IDFD AS NVARCHAR (255))
END,
-- R_GEOGF
CASE
WHEN LEN(ISNULL(L.R_GEOGD, '')) = 0
THEN CAST('MISS' AS NVARCHAR (255))
ELSE CAST(L.R_GEOGD AS NVARCHAR (255))
END,
-- S_AQRASF
CASE
WHEN LEN(LTRIM(RTRIM(ISNULL(C.S_AQRASF, '')))) = 0
THEN CAST('MISS' AS NVARCHAR (255))
ELSE CAST(C.S_AQRASF AS NVARCHAR (255))
END,
-- S_CRR
CASE
WHEN LEN(ISNULL(dbo.udf_aqr_get_CRR (L.R_IDFD, L.R_SYSTEM), '')) = 0 THEN 'N/A'
ELSE dbo.udf_aqr_get_CRR (L.R_IDFD, L.R_SYSTEM)
END,
-- B_RESMAT
CASE
WHEN ISNULL(DATEDIFF(MONTH, @snapshot, L.B_RESMAT), 0) < 0 THEN '0'
ELSE CAST(ISNULL(DATEDIFF(MONTH, @snapshot, L.B_RESMAT), 0) AS NVARCHAR (255))
END,
-- B_CHAN
CASE
WHEN LEN(ISNULL(C.B_CHAN, '')) = 0
THEN 'MISS'
ELSE C.B_CHAN
END,
-- B_PROD
CASE
WHEN SUBSTRING(LTRIM(RTRIM(L.R_IDFF)), 3, 3) = 'JCB'
THEN 'SPG'
WHEN SUBSTRING(LTRIM(RTRIM(L.R_IDFF)), 3, 3) = 'JCK'
THEN 'LC'
WHEN LEN(ISNULL(L.B_PROD, '')) = 0
THEN 'MISS'
ELSE
L.B_PROD
END,
-- B_CURR
CASE
WHEN LEN(ISNULL(L.B_CURR, '')) = 0
THEN 'MISS'
ELSE L.B_CURR
END,
-- B_EFFRAT
dbo.udf_aqr_format_numeric_pcent(CAST(L.B_EFFRAT AS NVARCHAR (255))),
-- B_CURRAT
dbo.udf_aqr_format_numeric_pcent(CAST(L.B_CURRAT AS NVARCHAR (255))),
-- B_PROT
'N/A',
-- E_ONBAL
dbo.udf_aqr_format_numeric(L.E_ONBAL),
-- E_OFFBAL
dbo.udf_aqr_format_numeric(L.E_OFFBAL),
-- E_CCF
dbo.udf_aqr_get_CCF(L.R_IDFF),
-- E_LIR
'99999999999',
-- dbo.udf_aqr_get_E_LIR (L.R_IDFF),
-- S_NPEINT
CASE
WHEN dbo.udf_aqr_get_DPD(C.R_IDFF, L.R_SYSTEM, @snapshot) > 90 OR dbo.udf_aqr_get_RiskRating(C.R_IDFF, @snapshot) LIKE 'D%' THEN 'NP'
WHEN dbo.udf_aqr_get_RiskRating(C.R_IDFF, @snapshot) LIKE 'D%' THEN 'NP'
ELSE 'PE'
END,
-- S_NPEEBA
CASE
WHEN dbo.udf_aqr_get_DPD(C.R_IDFF, L.R_SYSTEM, @snapshot) > 90 OR dbo.udf_aqr_get_RiskRating(C.R_IDFF, @snapshot) LIKE 'D%' THEN 'NP'
WHEN dbo.udf_aqr_get_RiskRating(C.R_IDFF, @snapshot) LIKE 'D%' THEN 'NP'
ELSE 'PE'
END,
-- S_NPE12M
CASE
WHEN dbo.udf_aqr_get_NPE12M(L.R_IDFF, DATEADD(YEAR, -1, @snapshot) ,@snapshot) = 1
THEN 'Y'
ELSE
'N'
END,
-- D_DPD
CAST(dbo.udf_aqr_get_DPD(L.R_IDFF, L.R_SYSTEM, @snapshot) AS NVARCHAR (255)),
-- FO_INT
CASE
WHEN LEN(LTRIM(RTRIM(ISNULL(C.FO_INT, '')))) = 0
THEN 'N/A'
ELSE C.FO_INT
END,
-- R_INTRAT
'11111111111',
-- A_FAIRVA
CAST('N/A' AS NVARCHAR (255)),
-- P_SPECF
dbo.udf_aqr_format_numeric(C.P_SPECF),
-- P_IBNRF
dbo.udf_aqr_format_numeric(C.P_IBNRF),
-- R_IDCC
CASE
WHEN LEN(LTRIM(RTRIM(ISNULL(C.R_IDCC, '')))) = 0
THEN 'MISS'
ELSE
C.R_IDCC
END,
-- C_TYPE
CASE
WHEN LEN(LTRIM(RTRIM(ISNULL(C.C_TYPE, '')))) = 0
THEN 'MISS'
ELSE
C.C_TYPE
END,
-- C_COUNTR
CASE
WHEN UPPER(LTRIM(RTRIM(C.LBM_COUNTRY))) = 'GOZO'
THEN dbo.udf_aqr_get_country_ISO('MALTA')
ELSE
ISNULL(dbo.udf_aqr_get_country_ISO(UPPER(LTRIM(RTRIM(C.LBM_COUNTRY)))), 'MISS')
END,
-- C_REGION
CASE
WHEN UPPER(LTRIM(RTRIM(C.LBM_COUNTRY))) = 'GOZO'
THEN dbo.udf_aqr_get_country_ISO('MALTA')
ELSE
ISNULL(dbo.udf_aqr_get_country_ISO(UPPER(LTRIM(RTRIM(C.LBM_COUNTRY)))), 'MISS')
END,
-- C_VAL
dbo.udf_aqr_format_numeric(CAST(dbo.udf_aqr_C_VAL(C.R_IDFD, C.R_IDCC) AS NVARCHAR(255))),
-- C_FLAG
dbo.udf_aqr_format_C_FLAG(C.C_FLAG),
-- C_DATE
CASE
WHEN LEN(ISNULL(CAST(dbo.udf_aqr_C_DATE(C.R_IDFF) AS NVARCHAR (10)), '')) = 0
THEN 'MISS'
ELSE
dbo.udf_aqr_leftPad(CAST(DATEPART(dd, dbo.udf_aqr_C_DATE(C.R_IDFF)) AS NVARCHAR(2)), '0', 2) + '/' + dbo.udf_aqr_leftPad(CAST(DATEPART(mm, dbo.udf_aqr_C_DATE(C.R_IDFF)) AS NVARCHAR(2)), '0', 2) + '/' + dbo.udf_aqr_leftPad(CAST(DATEPART(yyyy, dbo.udf_aqr_C_DATE(C.R_IDFF)) AS NVARCHAR(4)), '0', 4)
END,
-- C_COVER
dbo.udf_aqr_format_numeric(CAST(dbo.udf_aqr_C_COVER(C.R_IDFD, C.R_IDCC) AS NVARCHAR(255))),
-- C_VALINS
CASE
WHEN LEN(LTRIM(RTRIM(ISNULL(C.C_VALINS, '')))) = 0
THEN 'MISS'
ELSE
C.C_VALINS
END,
-- P_PROVF
CASE
WHEN C.P_PROVF = 'IS' OR C.P_PROVF = 'NI' THEN dbo.udf_aqr_format_P_PROVD(C.P_PROVF)
WHEN C.P_SPECF <> '11111111111' OR C.P_SPECF <> '99999999999' THEN 'IS'
ELSE 'NI'
END
FROM
dbo.tb_aqr_bulkimport_loantape L
LEFT OUTER JOIN dbo.tb_aqr_consolidated_2014 C
ON LTRIM(RTRIM(L.R_IDFF)) = LTRIM(RTRIM(C.R_IDFF))
WHERE
dbo.udf_aqr_debtor_is_retailRRE (L.R_IDFD) = 1
ORDER BY
L.R_IDFD,
L.R_IDFF
October 2, 2014 at 9:59 am
There's no way that your ORDER BY will change the results, it must be something else.
Your query is full of UDFs that will affect performance in a severe way, specially with the one in the WHERE clause and the functions on your JOIN clause.
All your LEN(LTRIM(RTRIM(ISNULL(String, '')))) = 0 can be simplified to LEN( String).
SELECT String,
LEN(LTRIM(RTRIM(ISNULL(String, '')))),
LEN( String)
FROM (VALUES('a'),
(' a'),
('a '),
(''),
(' '),
(NULL))x(String)
There are so much improvements that can be made to this code that a forum post wouldn't be enough.
Here's a formatted version if anyone else wants to take a look.
ALTER PROCEDURE [dbo].[sp_aqr_createTape_RRE] (@snapshot SMALLDATETIME)
AS
BEGIN
DELETE
FROM dbo.tb_aqr_RRE_tape_temp
WHERE R_SNAPF = @snapshot
DECLARE @count BIGINT
SET @count = (
SELECT COUNT(*)
FROM dbo.tb_aqr_RRE_tape_temp
)
IF ISNULL(@count, 0) = 0
DBCC CHECKIDENT (
tb_aqr_RRE_tape_temp
,reseed
,0
)
INSERT INTO dbo.tb_aqr_RRE_tape_temp (
R_SNAPF
,R_ENTITY
,R_SYSTEM
,R_COUNTR
,R_BRANCH
,R_IDFF
,R_IDFD
,R_GEOGF
,S_AQRASF
,S_CRR
,B_RESMAT
,B_CHAN
,B_PROD
,B_CURR
,B_EFFRAT
,B_CURRAT
,B_PROT
,E_ONBAL
,E_OFFBAL
,E_CCF
,E_LIR
,S_NPEINT
,S_NPEEBA
,S_NPE12M
,D_DPD
,FO_INT
,R_INTRAT
,A_FAIRVA
,P_SPECF
,P_IBNRF
,R_IDCC
,C_TYPE
,C_COUNTR
,C_REGION
,C_VAL
,C_FLAG
,C_DATE
,C_COVER
,C_VALINS
,P_PROVF
)
SELECT @snapshot
,CASE
WHEN LEN(LTRIM(RTRIM(ISNULL(L.R_ENTITY, '')))) = 0
THEN 'MISS'
ELSE L.R_ENTITY
END
,CASE
WHEN LEN(LTRIM(RTRIM(ISNULL(L.R_SYSTEM, '')))) = 0
THEN 'MISS'
ELSE L.R_SYSTEM
END
,CASE
WHEN LEN(LTRIM(RTRIM(ISNULL(L.R_COUNTR, '')))) = 0
THEN 'MISS'
ELSE L.R_COUNTR
END
,CASE
WHEN LEN(LTRIM(RTRIM(ISNULL(L.R_BRANCH, '')))) = 0
THEN 'MISS'
ELSE L.R_BRANCH
END
,CASE
WHEN LEN(LTRIM(RTRIM(ISNULL(L.R_IDFF, '')))) = 0
THEN CAST('MISS' AS NVARCHAR(255))
ELSE CAST(L.R_IDFF AS NVARCHAR(255))
END
,CASE
WHEN LEN(LTRIM(RTRIM(ISNULL(L.R_IDFD, '')))) = 0
THEN CAST('MISS' AS NVARCHAR(255))
ELSE CAST(L.R_IDFD AS NVARCHAR(255))
END
,CASE
WHEN LEN(ISNULL(L.R_GEOGD, '')) = 0
THEN CAST('MISS' AS NVARCHAR(255))
ELSE CAST(L.R_GEOGD AS NVARCHAR(255))
END
,CASE
WHEN LEN(LTRIM(RTRIM(ISNULL(C.S_AQRASF, '')))) = 0
THEN CAST('MISS' AS NVARCHAR(255))
ELSE CAST(C.S_AQRASF AS NVARCHAR(255))
END
,CASE
WHEN LEN(ISNULL(dbo.udf_aqr_get_CRR(L.R_IDFD, L.R_SYSTEM), '')) = 0
THEN 'N/A'
ELSE dbo.udf_aqr_get_CRR(L.R_IDFD, L.R_SYSTEM)
END
,CASE
WHEN ISNULL(DATEDIFF(MONTH, @snapshot, L.B_RESMAT), 0) < 0
THEN '0'
ELSE CAST(ISNULL(DATEDIFF(MONTH, @snapshot, L.B_RESMAT), 0) AS NVARCHAR(255))
END
,CASE
WHEN LEN(ISNULL(C.B_CHAN, '')) = 0
THEN 'MISS'
ELSE C.B_CHAN
END
,CASE
WHEN SUBSTRING(LTRIM(RTRIM(L.R_IDFF)), 3, 3) = 'JCB'
THEN 'SPG'
WHEN SUBSTRING(LTRIM(RTRIM(L.R_IDFF)), 3, 3) = 'JCK'
THEN 'LC'
WHEN LEN(ISNULL(L.B_PROD, '')) = 0
THEN 'MISS'
ELSE L.B_PROD
END
,CASE
WHEN LEN(ISNULL(L.B_CURR, '')) = 0
THEN 'MISS'
ELSE L.B_CURR
END
,dbo.udf_aqr_format_numeric_pcent(CAST(L.B_EFFRAT AS NVARCHAR(255)))
,dbo.udf_aqr_format_numeric_pcent(CAST(L.B_CURRAT AS NVARCHAR(255)))
,'N/A'
,dbo.udf_aqr_format_numeric(L.E_ONBAL)
,dbo.udf_aqr_format_numeric(L.E_OFFBAL)
,dbo.udf_aqr_get_CCF(L.R_IDFF)
,'99999999999'
,CASE
WHEN dbo.udf_aqr_get_DPD(C.R_IDFF, L.R_SYSTEM, @snapshot) > 90
OR dbo.udf_aqr_get_RiskRating(C.R_IDFF, @snapshot) LIKE 'D%'
THEN 'NP'
WHEN dbo.udf_aqr_get_RiskRating(C.R_IDFF, @snapshot) LIKE 'D%'
THEN 'NP'
ELSE 'PE'
END
,CASE
WHEN dbo.udf_aqr_get_DPD(C.R_IDFF, L.R_SYSTEM, @snapshot) > 90
OR dbo.udf_aqr_get_RiskRating(C.R_IDFF, @snapshot) LIKE 'D%'
THEN 'NP'
WHEN dbo.udf_aqr_get_RiskRating(C.R_IDFF, @snapshot) LIKE 'D%'
THEN 'NP'
ELSE 'PE'
END
,CASE
WHEN dbo.udf_aqr_get_NPE12M(L.R_IDFF, DATEADD(YEAR, - 1, @snapshot), @snapshot) = 1
THEN 'Y'
ELSE 'N'
END
,CAST(dbo.udf_aqr_get_DPD(L.R_IDFF, L.R_SYSTEM, @snapshot) AS NVARCHAR(255))
,CASE
WHEN LEN(LTRIM(RTRIM(ISNULL(C.FO_INT, '')))) = 0
THEN 'N/A'
ELSE C.FO_INT
END
,'11111111111'
,CAST('N/A' AS NVARCHAR(255))
,dbo.udf_aqr_format_numeric(C.P_SPECF)
,dbo.udf_aqr_format_numeric(C.P_IBNRF)
,CASE
WHEN LEN(LTRIM(RTRIM(ISNULL(C.R_IDCC, '')))) = 0
THEN 'MISS'
ELSE C.R_IDCC
END
,CASE
WHEN LEN(LTRIM(RTRIM(ISNULL(C.C_TYPE, '')))) = 0
THEN 'MISS'
ELSE C.C_TYPE
END
,CASE
WHEN UPPER(LTRIM(RTRIM(C.LBM_COUNTRY))) = 'GOZO'
THEN dbo.udf_aqr_get_country_ISO('MALTA')
ELSE ISNULL(dbo.udf_aqr_get_country_ISO(UPPER(LTRIM(RTRIM(C.LBM_COUNTRY)))), 'MISS')
END
,CASE
WHEN UPPER(LTRIM(RTRIM(C.LBM_COUNTRY))) = 'GOZO'
THEN dbo.udf_aqr_get_country_ISO('MALTA')
ELSE ISNULL(dbo.udf_aqr_get_country_ISO(UPPER(LTRIM(RTRIM(C.LBM_COUNTRY)))), 'MISS')
END
,dbo.udf_aqr_format_numeric(CAST(dbo.udf_aqr_C_VAL(C.R_IDFD, C.R_IDCC) AS NVARCHAR(255)))
,dbo.udf_aqr_format_C_FLAG(C.C_FLAG)
,CASE
WHEN LEN(ISNULL(CAST(dbo.udf_aqr_C_DATE(C.R_IDFF) AS NVARCHAR(10)), '')) = 0
THEN 'MISS'
ELSE dbo.udf_aqr_leftPad(CAST(DATEPART(dd, dbo.udf_aqr_C_DATE(C.R_IDFF)) AS NVARCHAR(2)), '0', 2) + '/' + dbo.udf_aqr_leftPad(CAST(DATEPART(mm, dbo.udf_aqr_C_DATE(C.R_IDFF)) AS NVARCHAR(2)), '0', 2) + '/' + dbo.udf_aqr_leftPad(CAST(DATEPART(yyyy, dbo.udf_aqr_C_DATE(C.R_IDFF)) AS NVARCHAR(4)), '0', 4)
END
,dbo.udf_aqr_format_numeric(CAST(dbo.udf_aqr_C_COVER(C.R_IDFD, C.R_IDCC) AS NVARCHAR(255)))
,CASE
WHEN LEN(LTRIM(RTRIM(ISNULL(C.C_VALINS, '')))) = 0
THEN 'MISS'
ELSE C.C_VALINS
END
,CASE
WHEN C.P_PROVF = 'IS'
OR C.P_PROVF = 'NI'
THEN dbo.udf_aqr_format_P_PROVD(C.P_PROVF)
WHEN C.P_SPECF <> '11111111111'
OR C.P_SPECF <> '99999999999'
THEN 'IS'
ELSE 'NI'
END
FROM dbo.tb_aqr_bulkimport_loantape L
LEFT OUTER JOIN dbo.tb_aqr_consolidated_2014 C ON LTRIM(RTRIM(L.R_IDFF)) = LTRIM(RTRIM(C.R_IDFF))
WHERE dbo.udf_aqr_debtor_is_retailRRE(L.R_IDFD) = 1
ORDER BY L.R_IDFD
,L.R_IDFF
October 2, 2014 at 10:17 am
Luis Cazares (10/2/2014)
There's no way that your ORDER BY will change the results, it must be something else.
+ 1. I don't see how an ORDER BY clause can possibly change the rows being returned. Are rows being written to the table between the times you run it? That's the only thing I can think of that would cause the problem you describe.
Luis Cazares (10/2/2014)
Your query is full of UDFs that will affect performance in a severe way, specially with the one in the WHERE clause and the functions on your JOIN clause.
Oh my, this is so very true. It looks like part of an ETL process where you're importing and massaging the data before writing it to a permanent table. If this is true, you've already imported the data at this point. You'd do well to fire UPDATE statements against the data to trim off the unwanted spaces.
I think Luis was being very nice is saying "affect performance in a severe way". I would say that they're going to kill it. Give some thought to breaking down all the steps you're doing here into more manageable pieces. The adage "divide and conquer" comes to mind. If you can convert some of the scalar UDFs into ITVFs, that will help tremendously. You can also add some columns to your intermediate tables and calculate some of this stuff in-place ahead of time. This will also help because you'll be able to verify your data at each step along the way and moving on to the next one until you know they're all good. In the end, your heavy lifting should already be done before you write to your production table.
October 3, 2014 at 2:00 am
Hi, thanks for the tips. The query itself does not take too long to run - < 1m.
I've decided to 'debug' the DDL by remming out field by field. With my luck, the 'issue' was with the last-but-one field (VALINS). The only difference I noticed with this field was that it was of type NVARCHAR(MAX), while the other fields are all of type NVARCHAR(255), which is more that enough. I changed the data type to match the other fields, and the 'missing' records are now being output.
I still can't understand why this has been happening, and would appreciate any advice!
Thanks again and regards,
George
October 3, 2014 at 11:59 am
Along with Luis and Ed I can't find anything obvious in your query that would cause there to be different rows to be returned based on the addition of an ORDER BY clause because ORDER BY shouldn't ever affect the number of rows returned.
I'd suggest the following:
1. Capture and post the execution plans for the query both with and without the ORDER BY. Seeing what the optimizer is doing may be helpful.
2. Just run the SELECT without the 2nd table using the same criteria with and without the ORDER BY to see what happens.
3. Run the query without all the UDF's in the SELECT and see what happens.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply
This website stores cookies on your computer.
These cookies are used to improve your website experience and provide more personalized services to you, both on this website and through other media.
To find out more about the cookies we use, see our Privacy Policy