October 7, 2014 at 12:01 pm
Hello all,
I have a document database, which contains values for:
DOC_NO
REV_NO
FILE_EXTENSION
There can be multiple "renditions" of the same file and I am trying to query which have a variation in the file type.
So, basically any records where the DOC_NO and REV_NO would have to be the same as another row, with a different FILE_EXTENSION value.
I have manged to return the entire list of values, but am having trouble filtering the list down to "unique rows" which have the above criteria.
Example Data:
DOC1, VER1, PDF
DOC1, VER1, PDF
DOC1, VER1, PDF
DOC1, VER1, DOC
DOC2, VER1, DWG
DOC2, VER2, DWG
DOC2, VER2, PDF
DOC3, VER1, XLS
DOC4, VER1, PDF
DOC4, VER1, XLS
DOC4, VER1, DOC
DOC4, VER2, XLS
DOC5, VER1, PDF
DOC5, VER1, DOC
DOC5, VER2, PDF
DOC5, VER2, PDF
DOC5, VER3, XLS
The values I would be looking for would be;
DOC1, VER1, PDF
DOC1, VER1, DOC
DOC2, VER2, DWG
DOC2, VER2, PDF
DOC4, VER1, PDF
DOC4, VER1, XLS
DOC4, VER1, DOC
DOC5, VER1, PDF
DOC5, VER1, DOC
Any assistance very much appreciated and thanks in advance for any replies.
October 7, 2014 at 12:21 pm
Open/import you document in Excel (comma delimited). Then go to the Data tab and there you have "Remove Duplicates" button. Click it and job is done.
Igor Micev,My blog: www.igormicev.com
October 7, 2014 at 12:22 pm
I'm not sure about the performance of this in a large scale, but I think I have the logic working.
;with RawData as
(
select 'DOC1' as DOC_NO, 'VER1' as REV_NO, 'PDF' as FILE_EXTENSION
union all select 'DOC1', 'VER1', 'PDF'
union all select 'DOC1', 'VER1', 'PDF'
union all select 'DOC1', 'VER1', 'DOC'
union all select 'DOC2', 'VER1', 'DWG'
union all select 'DOC2', 'VER2', 'DWG'
union all select 'DOC2', 'VER2', 'PDF'
union all select 'DOC3', 'VER1', 'XLS'
union all select 'DOC4', 'VER1', 'PDF'
union all select 'DOC4', 'VER1', 'XLS'
union all select 'DOC4', 'VER1', 'DOC'
union all select 'DOC4', 'VER2', 'XLS'
union all select 'DOC5', 'VER1', 'PDF'
union all select 'DOC5', 'VER1', 'DOC'
union all select 'DOC5', 'VER2', 'PDF'
union all select 'DOC5', 'VER2', 'PDF'
union all select 'DOC5', 'VER3', 'XLS'
)
, Multiples as
(
selectDOC_NO, REV_NO
fromRawData
group byDOC_NO, REV_NO
having COUNT(distinct FILE_EXTENSION) > 1
)
select distinct
RD.DOC_NO
,RD.REV_NO
,RD.FILE_EXTENSION
from
RawDataRD
joinMultiplesMon M.DOC_NO = RD.DOC_NO and M.REV_NO = RD.REV_NO
order by
RD.DOC_NO
,RD.REV_NO
,RD.FILE_EXTENSION
October 7, 2014 at 2:37 pm
Thanks Igor, that seems to have done the job, just using Excel...
I'm sure it's doable in SQL, but as I'm reporting in Excel anyway, it's easy enough to do the filtering in there 🙂
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply