I have been witnessing a very strange behaviour of ROW_NUMBER() PARTITION BY clause in complex join and nested select or CTE statements over large volumes of data with Max Degree of Parallelism <> 1
The ROW_NUMBER() or RANK() were incorrectly marking records within a partiton, and that behaviour was nondeterministic.
Let's say I have three tables girls, shoppingmall and times girls and shoppingmall have from and to dates, and times table has a row for every date. shoppingmall has shopid which can appear in different mallid.
ex: Now, I want to get first mallid where the girl was shopping for every day. I join girls table with times by from and to. Then I create a nested select from shoppingmall also joined to times by from and to, which would use ROW_NUMBER () PARTITION BY in order to give me first mallid for a given date and given shop.
select * into #resultset
from girls
left join times ON times.timeid >= girls.datefrom and times.timeid < girls.dateto
left join (select mallid, shopid as shopidx, times.timeid as timeidx, ROW_NUMBER() OVER (PARTITION BY shopid, times.timeid ORDER BY ordering ASC) Lenin
FROM shoppingmall INNER JOIN times ON times.timeid between shoppingmall.datefrom and shoppingmall.dateto) AS trip
ON girls.shopid = trip.shopidx AND times.timeid = trip.timeidx and Lenin = 1
Now, what I should get is only one row per date and per shopid from shoppingmall
However, if I run this query on a very large dataset ROW_NUMBER() occasionally marks as 1 two rows for same shopid and timeid!
As soon as I add a very restrictive WHERE condition which limits the dataset, ex: WHERE girls.girlid = "Kathy", the query starts behaving correctly (whereas without strict WHERE and on the same record it was marking two rows PARTITIONED BY shopid and timeid as 1)
Also, if I fist insert result of inner select in the temporary table, and then use the table in the join - results are correct.
I am running on SQL Server 2008 10.0.2531.0
I will rewrite all my inner selects to use temporary tables instead of nested selects and CTE, but still I do not like the unexpected and unexplainable behaviour of PARTITION BY.
I guess that this behaviour is due to parallelism because when I changed the Max Degree of Paralellism to 1 this behaviour disappeared and PARTITION BY started working correctly?