May 10, 2010 at 9:05 am
say i have a race table
race race description
i American indian
N Asian
P Pasafic Ilander
B African American
w White.
i am getting some data from a new file which has race column in it.
for a given record i can have multiple values for race in that file.
say inp (Not comma separated)
another example is PBW .(Not comma seperated)
so how can i test whether the values inp or PBW are allowable values
by referencing race table.
Thanks In advance.
May 10, 2010 at 11:05 am
Dont worry i was done.
create function allowableRaceValues(@Raceval varchar(20))
Returns int
AS
BEGIN
declare @output int
declare @ie varchar(5)
set @output=1
declare @i int
declare @ctvar int
set @i=1
set @ctvar=len(LTRIM(RTRIM(@Raceval)))
while (@ctvar>@i)
Begin
set @ie = RTRIM(LTRIM(substring(@Raceval, @i, 1)))
IF(@ie) IN ('I','N','A','B','P')
set @i=@i+1
ELSE
BEGIN
set @output=0
set @i=@i+1
END
END
RETURN @output
END
Thanks
For Checking.
May 10, 2010 at 2:09 pm
You might want to have a look at the Tally table article referenced in my signature. There is a split string function included you could modify to meet your requirements.
This way you could avoid using a loop. It might not make a huge difference on your current scenario but once you know about a non-looping solution you might benefit from it in the future.
I'm not going to comment on the file structure you have to deal with though... 😉
May 10, 2010 at 7:58 pm
As much as I love working with the Tally table, it just doesn't need to be that complicated for this. In fact, the CASE formula in the following could be used as a column constraint or computed column if you needed it to be...
--===== Create a spot of test data. This is NOT a part of the solution.
DECLARE @test-2 TABLE (RowNum INT IDENTITY(1,1), RaceVal VARCHAR(10))
INSERT INTO @test-2
(RaceVal)
SELECT 'INABP' UNION ALL
SELECT 'INCBP' UNION ALL
SELECT 'INABPQ' UNION ALL
SELECT 'IN' UNION ALL
SELECT 'PQ' UNION ALL
SELECT 'Q'
--===== Solve the problem
SELECT RowNum, RaceVal,
CASE WHEN RaceVal LIKE '%[^INABPW]%' THEN 0 ELSE 1 END AS IsAllValid --<<<LOOK!
FROM @test-2
--Jeff Moden
Change is inevitable... Change for the better is not.
May 11, 2010 at 10:44 am
Jeff Moden (5/10/2010)
As much as I love working with the Tally table, it just doesn't need to be that complicated for this. In fact, the CASE formula in the following could be used as a column constraint or computed column if you needed it to be...
... assuming the race table will not be expanded with addtl. rows, I'd agree...
But what would happen if there's a need to another race, e.g. Africans or Europeans?
Since the sample data are based on a table I thought a more generic solution would be preferred over hard coded values, even though the function provided by the OP did hard code the values too...
As an alternative your code could be changed into a dynamic version but at that point I think the tally approach would be the better choice. Therefore I recommended it in the first place.
May 12, 2010 at 12:30 pm
Joe Celko (5/12/2010)
Don't use the race codes table; remove the code from the string and see if it has left-overs:REPLACE('I', '',
REPLACE('N', '',
REPLACE('P', '',
REPLACE('B', '',
REPLACE ( 'W', '', mutli_race_code),
),),),) = ''
How would you deal with another race to be added?
If the race table would still exist (because it might be used for other purposes) the hard coded version would reject a valid combination. Wouldn't that be considered as an inconsistent DB?
When using the hard coded version the reace table should be removed, I think...
May 12, 2010 at 1:51 pm
lmu92 (5/11/2010)
Jeff Moden (5/10/2010)
As much as I love working with the Tally table, it just doesn't need to be that complicated for this. In fact, the CASE formula in the following could be used as a column constraint or computed column if you needed it to be...... assuming the race table will not be expanded with addtl. rows, I'd agree...
But what would happen if there's a need to another race, e.g. Africans or Europeans?
Since the sample data are based on a table I thought a more generic solution would be preferred over hard coded values, even though the function provided by the OP did hard code the values too...
As an alternative your code could be changed into a dynamic version but at that point I think the tally approach would be the better choice. Therefore I recommended it in the first place.
I made a hardcoded solution only because that's what had been presented so far but I agree with you. Hard coding of this nature is a "Bozo-no-no".
I'm still believe that the use of the Tally table here would be slower across a million rows than if the codes were stored in a nice easily updateable table and a single concatenation was done to support the LIKE. To be sure, I've not tested that premise so I could certainly be wrong.
In truth, even that is the incorrect implementation... the column should be normalized.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 12, 2010 at 1:55 pm
Joe Celko (5/12/2010)
Don't use the race codes table; remove the code from the string and see if it has left-overs:REPLACE('I', '',
REPLACE('N', '',
REPLACE('P', '',
REPLACE('B', '',
REPLACE ( 'W', '', mutli_race_code),
),),),) = ''
In essence, that's what "RaceVal LIKE '%[^INABPW]%'" does. Either way (and I'm pretty sure you'll agree), a better thing to do would be for them to normalize the column instead.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 12, 2010 at 2:19 pm
Jeff Moden (5/12/2010)
...I'm still believe that the use of the Tally table here would be slower across a million rows than if the codes were stored in a nice easily updateable table and a single concatenation was done to support the LIKE. To be sure, I've not tested that premise so I could certainly be wrong.
In truth, even that is the incorrect implementation... the column should be normalized.
Would the following code (based on your fine sample code) be close to what you mean when talking about a single concatenation?
--===== Create a spot of test data. This is NOT a part of the solution.
DECLARE @test-2 TABLE (RowNum INT IDENTITY(1,1), RaceVal VARCHAR(10))
INSERT INTO @test-2
(RaceVal)
SELECT 'INABP' UNION ALL
SELECT 'INCBP' UNION ALL
SELECT 'INABPQ' UNION ALL
SELECT 'IN' UNION ALL
SELECT 'PQ' UNION ALL
SELECT 'Q'
--===== Solve the problem (hardcoded)
SELECT RowNum, RaceVal,
CASE WHEN RaceVal LIKE '%[^INBP]%' THEN 0 ELSE 1 END AS IsAllValid --<<<LOOK!
FROM @test-2
--===== Solve the problem (table based) with a single concatenation
DECLARE @tbl TABLE (race CHAR(1), descr VARCHAR(30))
INSERT INTO @tbl (race, descr)
SELECT 'i', 'American indian' UNION ALL
SELECT 'N', 'Asian' UNION ALL
SELECT 'P', 'Pasafic Ilander' UNION ALL
SELECT 'B', 'African American' UNION ALL
SELECT 'w', 'White'
SELECT RowNum, RaceVal,
CASE WHEN RaceVal LIKE '%[^'+sub.conc_string+']%' THEN 0 ELSE 1 END AS IsAllValid
FROM @test-2
CROSS APPLY
(SELECT (SELECT '' + Race FROM @tbl t2 FOR XML PATH('')) AS conc_string) sub
May 12, 2010 at 3:16 pm
Yes and no. You're certainly on the correct tract with making a table but it should be a permanent table instead of one built on the fly. Second, the concatenation is correct but it would be executed for every row because of the CROSS APPLY. It would be much less resource intensive if the concatenation where calculated just once at the beginning of a proc and stored in a variable and then used in the LIKE.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 12, 2010 at 4:43 pm
Jeff Moden (5/12/2010)
Yes and no. You're certainly on the correct tract with making a table but it should be a permanent table instead of one built on the fly. Second, the concatenation is correct but it would be executed for every row because of the CROSS APPLY. It would be much less resource intensive if the concatenation where calculated just once at the beginning of a proc and stored in a variable and then used in the LIKE.
I decided to use a temp table for demonstration purpose. The OP already referred to a permanent table. But you're right, I should have explained it more detailed.
Regarding the CROSS APPLY:
Are you sure it will be executed for each and every row? In this case the CROSS APPLY could be rewritten as a CROSS JOIN to a table holding only one row (and one column):
SELECT RowNum, RaceVal,
CASE WHEN RaceVal LIKE '%[^'+sub.conc_string+']%' THEN 0 ELSE 1 END AS IsAllValid
FROM @test-2
CROSS JOIN
(SELECT (SELECT '' + Race FROM @tbl t2 FOR XML PATH('')) AS conc_string) sub
I don't see an execution for every row...
May 23, 2010 at 9:42 am
Sorry... I lost track of this one, Lutz.
Looking back at this one, I wonder what I was thinking. You're correct... it would appear that the XML path query is only being calculated once. It's easier to see in the execution plan if you convert the tables from Table Variables to Temp Tables.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply