April 1, 2015 at 3:10 am
Hello,
I appeal for help on a sql method to eliminate values considered as "duplicate" on each input rows in a table.
In summary here is the schematics:
A table that collects data "Service" 40 orders (services) possible by invoice
N°Invoice - Service1 - Service2 - Service3 - Service4- Service5 - up to 40 .... Etc...
000001 Null VAA VAA SFG SFG
000002 YBB YBB SFU Null Null
000003 Null Null SAE SFU SFU
000004 VDP VDP VBB VBB VAA
etc ....
The goal is to get this following table B
N°invoice - Service1 - Service2 - Service3 - Service4 - Service5 ........ etc ..
000001 VAA Null Null Null SFG
000002 Null Null Null YBB SFU
000003 Null Null Null SAE SFU
000004 VDP VBB Null Null VAA
Invoices must be addressed once to the service because we just need one validation by services for all no matter if it is mentioned 2, 5 or 10 times on the invoice.
The brutal solution that I found : (ex: to test the update service3, we have to look in a temp table the 2 previous values):
with temp as (select N°facture,service1 as serv1, service2 as serv2) from table A
insert into table B (N°facture,Service3)
select a.N°Invoice, a.Service3 as serv3
from TableA a
Left Join temp
on temp.N°Invoice=a.N°facture
Where serv1<>a.service3 And serv2<>a.service3
The problem with this method is that you test the 40 services one by one .. that becomes undrinkable writing from the 6th or 7th already ...:blink:
I think there must be a faster method (rownum..)
April 1, 2015 at 10:09 am
In a word, Normalize your table. Then this is stupid easy. Repeating fields are going to make your life miserable.
April 1, 2015 at 10:29 am
As previously stated, you need to normalize your table. If you can't change the schema, you still need to code the normalization and denormalize again. Here's a sample on how to do it. It will be easier to complete it than your previous option.
CREATE TABLE #Sample(
Invoice char(6),
Service1 char(6), Service2 char(6), Service3 char(6), Service4 char(6), Service5 char(6))
INSERT INTO #Sample
VALUES
('000001', Null, 'VAA', 'VAA', 'SFG', 'SFG'),
('000002', 'YBB', 'YBB', 'SFU', Null, Null),
('000003', Null, Null, 'SAE', 'SFU', 'SFU'),
('000004', 'VDP', 'VDP', 'VBB', 'VBB', 'VAA');
WITH CTE AS(
SELECT Invoice,
MIN(ServiceID) ServiceID,
Value
FROM #Sample
CROSS APPLY (SELECT 1, Service1 UNION ALL
SELECT 2, Service2 UNION ALL
SELECT 3, Service3 UNION ALL
SELECT 4, Service4 UNION ALL
SELECT 5, Service5)x(ServiceID,Value)
GROUP BY Invoice, Value
)
SELECT Invoice,
MAX( CASE WHEN ServiceID = 1 THEN Value END) Service1,
MAX( CASE WHEN ServiceID = 2 THEN Value END) Service2,
MAX( CASE WHEN ServiceID = 3 THEN Value END) Service3,
MAX( CASE WHEN ServiceID = 4 THEN Value END) Service4,
MAX( CASE WHEN ServiceID = 5 THEN Value END) Service5
FROM CTE
GROUP BY Invoice;
GO
DROP TABLE #Sample
April 1, 2015 at 11:29 am
You can prevent duplication of service codes by leveraging an INSTEAD OF INSERT, UPDATE trigger where each service column is conditionally populated depending on whether a preceeding service column has the same code. For example:
CREATE TABLE MyInvoice
(
Invoice char(6) not null primary key
, Service1 char(6) null
, Service2 char(6) null
, Service3 char(6) null
, Service4 char(6) null
, Service5 char(6) null
);
GO
CREATE TRIGGER MyInvoiceTrigger on MyInvoice
INSTEAD OF INSERT
AS
BEGIN
INSERT INTO MyInvoice SELECT
Invoice
, Service1
, case when Service2
not in (isnull(Service1,''))
then Service2 else null end
, case when Service3
not in (isnull(Service1,''),isnull(Service2,''))
then Service3 else null end
, case when Service4
not in (isnull(Service1,''),isnull(Service2,''),isnull(Service3,''))
then Service4 else null end
, case when Service5
not in (isnull(Service1,''),isnull(Service2,''),isnull(Service3,''),isnull(Service4,''))
then Service5 else null end
FROM inserted;
END;
GO
insert into MyInvoice values ('001','A','B',null,null,null);
insert into MyInvoice values ('002','A','B','A','C',null);
insert into MyInvoice values ('003','A','B',null,'C','A');
select * from MyInvoice;
Invoice Service1 Service2 Service3 Service4 Service5
001 A B NULL NULL NULL
002 A B NULL C NULL
003 A B NULL C NULL
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
April 1, 2015 at 11:44 am
Many thanks for your reply.
And yes it s not possible to normalize as the data come from pdf captured areas from invoices.
Each invoices could have 40 services it s depend how many purchase order are booked on it.
And the ’null’ result correspond to a normal situation .
The service code show a discrepencie between the PO and the invoice.
So the manager need to received it and approved . So if he had 10 discrepencies he just need to received the document once.
That why i need to remouving duplicates by invoices...
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply