April 2, 2008 at 7:35 am
Im trying to do a number of different aggregate calculations on a rather large recordset (> 2*10^6 records). It's rather complex to supply example data, but I'll try to keep the question as concise as possible 😎
Here are the tables to be calculated:
CREATE TABLE [mcmain].[venpart](
[vp_ordid] [char](30) NOT NULL,
[vp_volc] [char](8) NOT NULL,
[vp_brprice] [numeric](16, 4) NULL,
[vp_prcdisc] [numeric](5, 2) NULL,
[vp_ntprice] [numeric](16, 4) NULL,
[vp_legacy] [bit] NULL DEFAULT ((0)),
[vp_remove] [bit] NULL DEFAULT ((0)),
[vp_notmod] [bit] NULL DEFAULT ((0)),
[vp_bprperc] [numeric](16, 4) NULL,
[vp_nprperc] [numeric](16, 4) NULL,
[vp_newitem] [bit] NULL DEFAULT ((0)),
[vp_rowid] [int] IDENTITY(1,1) NOT NULL,
CONSTRAINT [vmpart] PRIMARY KEY CLUSTERED
(
[vp_ordid] ASC,
[vp_volc] ASC
))
CREATE TABLE [mcmain].[vpmod](
[vm_ordid] [char](30) NOT NULL,
[vm_volc] [char](8) NOT NULL,
[vm_brprice] [numeric](16, 4) NULL,
[vm_prcdisc] [numeric](5, 2) NULL,
[vm_ntprice] [numeric](16, 4) NULL,
[vm_legacy] [bit] NULL DEFAULT ((0)),
[vm_remove] [bit] NULL DEFAULT ((0)),
[vm_notmod] [bit] NULL DEFAULT ((0)),
[vm_bprperc] [numeric](16, 4) NULL,
[vm_nprperc] [numeric](16, 4) NULL,
[vm_newitem] [bit] NULL DEFAULT ((0)),
[vm_rowid] [int] IDENTITY(1,1) NOT NULL,
CONSTRAINT [vmpart] PRIMARY KEY CLUSTERED
(
[vm_ordid] ASC,
[vm_volc] ASC
))
Here's the receiving table:
CREATE TABLE [mcmain].[volcdet](
[vd_volcnmr] [char](8) NULL,
[vd_upnmr] [char](9) NULL,
[vd_qtyrd] [numeric](9, 0) NULL,
[vd_qtyorg] [numeric](9, 0) NULL,
[vd_qtyproc] [numeric](9, 0) NULL,
[vd_qtyremn] [numeric](9, 0) NULL,
[vd_qtyrej] [numeric](9, 0) NULL,
[vd_qtybrsd] [numeric](9, 0) NULL,
[vd_qtyblow] [numeric](9, 0) NULL,
[vd_qtybsam] [numeric](9, 0) NULL,
[vd_qtynrsd] [numeric](9, 0) NULL,
[vd_qtynlow] [numeric](9, 0) NULL,
[vd_qtynsam] [numeric](9, 0) NULL,
[vd_qtylegc] [numeric](9, 0) NULL,
[vd_qtyremv] [numeric](9, 0) NULL,
[vd_qtynew] [numeric](9, 0) NULL,
[vd_qtysel] [numeric](9, 0) NULL,
[vd_qtydscg] [numeric](9, 0) NULL,
[vd_qtydscn] [numeric](9, 0) NULL,
[vd_qtydscd] [numeric](9, 0) NULL,
[vd_qbmaxed] [numeric](9, 0) NULL,
[vd_qbmined] [numeric](9, 0) NULL,
[vd_qnmaxed] [numeric](9, 0) NULL,
[vd_qnmined] [numeric](9, 0) NULL,
[vd_avgbprc] [numeric](12, 4) NULL,
[vd_avgnprc] [numeric](12, 4) NULL,
[vd_notmod] [bit] NULL DEFAULT ((0)),
[vd_rowid] [int] IDENTITY(1,1) NOT NULL
)
Here are the calculations:
DECLARE @plannummer VarChar(16)
SET @plannummer = '20080402'
-- Bruto price raise calculation
UPDATE mcmain.volcdet
SET vd_qtybrsd = (SELECT COUNT(*) FROM mcmain.vpmod
INNER JOIN mcmain.venpart ON vm_ordid = vp_ordid AND vm_volc = vp_volc
WHERE vm_brprice > vp_brprice
AND vm_volc = vd_volcnmr
AND vm_notmod = 0)
WHERE vd_upnmr = @plannummer
-- Bruto prijsdalers berekenen
UPDATE mcmain.volcdet
SET vd_qtyblow = (SELECT COUNT(*) FROM mcmain.vpmod
INNER JOIN mcmain.venpart ON vm_ordid = vp_ordid AND vm_volc = vp_volc
WHERE vm_brprice < vp_brprice
AND vm_volc = vd_volcnmr
AND vm_notmod = 0)
WHERE vd_upnmr = @plannummer
-- Bruto gelijk berekenen
UPDATE mcmain.volcdet
SET vd_qtybsam = (SELECT COUNT(*) FROM mcmain.vpmod
INNER JOIN mcmain.venpart ON vm_ordid = vp_ordid AND vm_volc = vp_volc
WHERE vm_brprice = vp_brprice
AND vm_volc = vd_volcnmr
AND vm_notmod = 0)
WHERE vd_upnmr = @plannummer
-- Bruto stijgings percentage berekenen
UPDATE mcmain.volcdet
SET vd_avgbprc = (SELECT avg(vm_bprperc) FROM mcmain.vpmod
WHERE vm_notmod = 0)
WHERE vd_upnmr = @plannummer
Question:
Is it possible (and if so, how?) to speed up this process, e.g. by reducing the roundtrips? At the moment it's taking around 6 minutes for each calculation.
The parts in bold suggest something might be combined?
Any tips, help, insight would be appreciated.
Cees Cappelle
April 2, 2008 at 10:44 am
I'm not clear on what the end result of this is. I see the tables, I see a the queries, but without a sample of the starting data and the desired end result, it's rough to work with this. Can you provide a few rows of sample data for each table?
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
April 3, 2008 at 2:55 am
Ok, here goes. The code below has 4 Update statements for the same table. Is it possible to setup some statements doing effectively the same, but reduce the round trips and thereby the processing time? The example has only 10 records, whereas vpmod and venpart each have (not identical) 2*10^6 records.
TRUNCATE TABLE mcmain.vpmod
TRUNCATE TABLE mcmain.venpart
TRUNCATE TABLE mcmain.volcdet
INSERT INTO mcmain.venpart (vp_ordid
, vp_volc
, vp_brprice
, vp_prcdisc
, vp_ntprice
, vp_legacy
, vp_remove
, vp_notmod)
SELECT '0000 3 10 ','23.05 ',0.1370,80.00,0.0274,0,0,NULLUNION ALL
SELECT '0000 3 16 ','23.05 ',0.1610,80.00,0.0322,0,0,NULL UNION ALL
SELECT '0000 3 20 ','23.05 ',0.1590,80.00,0.0318,0,0,NULL UNION ALL
SELECT '0000 3 25 ','23.05 ',0.1790,80.00,0.0358,0,0,NULL UNION ALL
SELECT '0000 4 10 ','23.05 ',0.1790,80.00,0.0358,0,0,NULL UNION ALL
SELECT '0000 4 16 ','23.05 ',0.2260,80.04,0.0451,0,0,NULL UNION ALL
SELECT '0000 4 20 ','23.05 ',0.2350,80.00,0.0470,0,0,NULL UNION ALL
SELECT '0000 4 25 ','23.05 ',0.2525,80.00,0.0505,0,0,NULL UNION ALL
SELECT '0000 5 40 ','23.05 ',0.6350,80.00,0.1270,0,0,NULL UNION ALL
SELECT '0000 5 50 ','23.05 ',0.7500,80.00,0.1500,0,0,NULL
INSERT INTO mcmain.vpmod (vm_ordid
, vm_volc
, vm_brprice
, vm_prcdisc
, vm_ntprice
, vm_legacy
, vm_remove
, vm_notmod)
SELECT '0000 3 10 ','23.05 ',0.1510,81.85,0.0274,0,0,1 UNION ALL
SELECT '0000 3 16 ','23.05 ',0.1775,81.86,0.0322,0,0,1 UNION ALL
SELECT '0000 3 20 ','23.05 ',0.1750,81.83,0.0318,0,0,1 UNION ALL
SELECT '0000 3 25 ','23.05 ',0.1970,81.83,0.0358,0,0,1 UNION ALL
SELECT '0000 4 10 ','23.05 ',0.1970,81.83,0.0358,0,0,1 UNION ALL
SELECT '0000 4 16 ','23.05 ',0.2490,81.89,0.0451,0,0,1 UNION ALL
SELECT '0000 4 20 ','23.05 ',0.2590,81.85,0.0470,0,0,1 UNION ALL
SELECT '0000 4 25 ','23.05 ',0.2780,81.83,0.0505,0,0,1 UNION ALL
SELECT '0000 5 40 ','23.05 ',0.7000,81.86,0.1270,0,0,1 UNION ALL
SELECT '0000 5 50 ','23.05 ',0.8250,81.82,0.1500,0,0,1
INSERT INTO mcmain.volcdet (vd_volcnmr
, vd_upnmr
, vd_qtyrd
, vd_qtyorg
, vd_qtyproc
, vd_qtyremn
, vd_qtyrej
, vd_qtybrsd
, vd_qtyblow
, vd_qtybsam
, vd_qtynrsd
, vd_qtynlow
, vd_qtynsam
, vd_qtylegc
, vd_qtyremv
, vd_qtynew
, vd_qtysel
, vd_qtydscg
, vd_qtydscn
, vd_qtydscd
, vd_qbmaxed
, vd_qbmined
, vd_qnmaxed
, vd_qnmined
, vd_avgbprc
, vd_avgnprc
, vd_notmod)
VALUES ('23.05 ','20080402',0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0)
-- Bruto price raise calculation
UPDATE mcmain.volcdet
SET vd_qtybrsd = (SELECT COUNT(*) FROM mcmain.vpmod
INNER JOIN mcmain.venpart ON vm_ordid = vp_ordid AND vm_volc = vp_volc
WHERE vm_brprice > vp_brprice
AND vm_volc = vd_volcnmr
)
-- Bruto prijsdalers berekenen
UPDATE mcmain.volcdet
SET vd_qtyblow = (SELECT COUNT(*) FROM mcmain.vpmod
INNER JOIN mcmain.venpart ON vm_ordid = vp_ordid AND vm_volc = vp_volc
WHERE vm_brprice < vp_brprice
AND vm_volc = vd_volcnmr
)
-- Bruto gelijk berekenen
UPDATE mcmain.volcdet
SET vd_qtybsam = (SELECT COUNT(*) FROM mcmain.vpmod
INNER JOIN mcmain.venpart ON vm_ordid = vp_ordid AND vm_volc = vp_volc
WHERE vm_brprice = vp_brprice
AND vm_volc = vd_volcnmr
)
-- Bruto stijgings percentage berekenen
UPDATE mcmain.volcdet
SET vd_avgbprc = (SELECT avg(vm_bprperc) FROM mcmain.vpmod
)
SELECT vd_volcnmr, vd_qtybrsd, vd_qtyblow, vd_qtybsam, vd_avgbprc FROM mcmain.volcdet
I've deleted extraneous where statements so it can run on its own. The WHERE statements suggest to me that there might be something there that could be combined?
Cees Cappelle
April 3, 2008 at 6:48 am
hi cees,
can you edit your previous post and break 'INSERT INTO mcmain.volcdet (vd_volcnmr, vd_upnmr, vd...' onto multiple lines? that will make your post narrower and easier to read.
multiple columns can be udpated in a single statement, so you could just merge all 4 statements into:
UPDATE mcmain.volcdet
SET vd_qtybrsd = (SELECT ...),
vd_qtyblow = (SELECT ...),
vd_qtybsam = (SELECT ...),
vd_qtybrsd = (SELECT ...)
WHERE vd_upnmr = @plannummer
but i'm sure your IO would be excessive. the code below consolidates the < = > subqueries into a single query. (not tested).
UPDATE mcmain.volcdet
SET vd_qtybrsd = X.ct_brsd,
vd_qtyblow = X.ct_blow,
vd_qtybsam = X.ct_sam
FROM mcmain.volcdet INNER JOIN
(SELECT vm_volc,
count(case when vm_brprice > vp_brprice then 1 end) as ct_brsd,
count(case when vm_brprice < vp_brprice then 1 end) as ct_blow,
count(case when vm_brprice = vp_brprice then 1 end) as ct_sam
FROM mcmain.vpmod
INNER JOIN mcmain.venpart ON vm_ordid = vp_ordid AND vm_volc = vp_volc
WHERE vm_notmod = 0
GROUP BY vm_volc) as X ON vm_volc = vd_volcnmr
WHERE vd_upnmr = @plannummer
The avgbprc update will still need it's own subquery.
April 3, 2008 at 11:04 am
Edited the post as requested. I'll try your suggestion tomorrow. Thanks for the tip.
btw Any new possibilities in SQL2008 that might help?
Cees Cappelle
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply