April 3, 2007 at 4:18 am
Hello,
say you have a string of codes. E.g. XX/\/\/\/\#### all within one field
Then I want a column called X with a count of the number of X's in that string etc so you would have
X / \ # i
2 4 4 4 0
i doesnt exist in this field so it would come back with 0. Is there any way of doing this kind of query?
Thanks
Debbie
April 3, 2007 at 5:32 am
You could do it like this;
create table #x ( weirdcodes varchar(20) not null )
go
insert #x select 'XX/\/\/\/\####'
go
select weirdcodes,
(datalength(weirdcodes) - datalength(replace(weirdcodes, 'X', ''))) as 'X',
(datalength(weirdcodes) - datalength(replace(weirdcodes, '/', ''))) as '/',
(datalength(weirdcodes) - datalength(replace(weirdcodes, '\', ''))) as '\',
(datalength(weirdcodes) - datalength(replace(weirdcodes, '#', ''))) as '#',
(datalength(weirdcodes) - datalength(replace(weirdcodes, 'i', ''))) as 'i'
from #x
go
weirdcodes X / \ # i
-------------------- ----------- ----------- ----------- ----------- -----------
XX/\/\/\/\#### 2 4 4 4 0
(1 row(s) affected)
/Kenneth
April 3, 2007 at 5:47 am
Hi,
Thanks for that. I dont really want to go the temporary table route because I have got throusands of records that I need to do this too and then I would have to link the resulting columns back to the original columns.
Is there any way of doing this without the table. Im going to see if I can use the
select
(datalength(weirdcodes) - datalength(replace(weirdcodes, 'X', ''))) as 'X',
(datalength(weirdcodes) - datalength(replace(weirdcodes, '/', ''))) as '/',
(datalength(weirdcodes) - datalength(replace(weirdcodes, '\', ''))) as '\',
(datalength(weirdcodes) - datalength(replace(weirdcodes, '#', ''))) as '#',
(datalength(weirdcodes) - datalength(replace(weirdcodes, 'i', ''))) as 'i'
from #x
go
against the table itsself but Im still not sure what Im doing
Debbie
April 3, 2007 at 5:51 am
brilliant,
It works, I used the following within my query...
,
(datalength(wk.SESS_PER_WEEK) - datalength(replace(wk.SESS_PER_WEEK, 'X', ''))) as 'X',
(datalength(wk.SESS_PER_WEEK) - datalength(replace(wk.SESS_PER_WEEK, '/', ''))) as '/',
(datalength(wk.SESS_PER_WEEK) - datalength(replace(wk.SESS_PER_WEEK, '\', ''))) as '\',
(datalength(wk.SESS_PER_WEEK) - datalength(replace(wk.SESS_PER_WEEK, '#', ''))) as '#',
(datalength(wk.SESS_PER_WEEK) - datalength(replace(wk.SESS_PER_WEEK, 'i', ''))) as 'i'
Thanks for that!
Debbie
April 4, 2007 at 3:45 am
Yes, that was the point
The temptable was just for illustration.
You were supposed to substitute it with your actual table.
/Kenneth
August 10, 2007 at 1:16 pm
Datalength was a great way of solving this, however I have a scenario where this won't work. I had to query Exchange, and find the # of email addresses that are copied on emails. These are seperated with a ';' The problem is that sometimes a space follows the semicolon, and sometimes it doesn't. When the space follows the semicolon, it's counted using the solution above. So I wrote the function below, which essentially does the same thing:
--takes a string, a character
--returns the # of times the character appears in the string
create function [dbo].[fnCountCharInString] (@string varchar(500), @char varchar(1))
returns int
as
begin
declare @stringpos int
declare @charcount int
declare @ascii int
set @stringpos = 1
set @charcount = 0
set @ascii = ascii(@char)
while @stringpos <= Len(@string)
begin
if ascii(substring(@string,@stringpos,1)) = @ascii
begin
set @charcount = @charcount + 1
end
set @stringpos = @stringpos + 1
end
return @charcount
end
August 10, 2007 at 7:40 pm
I believe what you're looking for is a kind of "histogram". The other solutions are great but if there are any unexpected characters, the cross-tabs will miss them and, if I can, I avoid the RBAR associated with a function.
The first thing you need to do is make an nice little multi-use numbers table. It consists of nothing more than a well indexed column of sequential numbers and has a great number of uses. I call mine a "Tally" table because I use it to count and it sounds cooler than "Numbers". Here's how to make one (recommend you make it a permanent part of your programming arsenal)...
--===== Create and populate the Tally table on the fly
SELECT TOP 11000 --equates to more than 30 years of dates
IDENTITY(INT,1,1) AS N
INTO dbo.Tally
FROM Master.dbo.SysColumns sc1,
Master.dbo.SysColumns sc2
--===== Add a Primary Key to maximize performance
ALTER TABLE dbo.Tally
ADD CONSTRAINT PK_Tally_N
PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100
--===== Allow the general public to use it
GRANT SELECT ON dbo.Tally TO PUBLIC
Heh... notice that I hate loops and cursors so much that I even avoided them to make the Tally table
Ok... you say you have thousands of rows that you need to do a histogram for... let's make thousands of rows of test data... note that this is NOT part of the solution... this table is just to give everyone some test data so they can test my solution, if they want.
--===== Create a test table with a bunch of "wierd" codes in the SomeString column
-- Takes about 10 seconds to execute.
SELECT TOP 100000 --<<Create 100 THOUSAND rows of data
SomePK = IDENTITY(INT,1,1),
SomeString = CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*26+65))
+ CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*26+65))
+ CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*3+91))
+ CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*26+65))
+ CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*26+65))
+ CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*26+65))
+ CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*3+91))
+ CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*26+97))
+ CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*26+97))
+ CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*26+97))
+ SPACE(5) --Adds some trailing spaces just to show they aren't counted
INTO dbo.JBMTest
FROM Master.dbo.SysColumns sc1,
Master.dbo.SysColumns sc2 --Lack of join criteria makes this a CROSS-JOIN
--===== A table is not properly formed unless a Primary Key has been assigned
ALTER TABLE dbo.JBMTest
ADD PRIMARY KEY CLUSTERED (SomePK)
Like I said, the code above is just for creating a test table and is NOT part of the solution.
We're all set... the code to make a "Normalized Histogram" becomes super easy after that... and, it's pretty fast... does 100,000 rows of data (the whole table as a "set" all at once) in about 21 seconds and doesn't count trailing spaces. Here's the code... might be worth storing the results in a temp table so you don't need to keep recalculating...
--===== Create the NORMALIZED "histogram" using a join to the Tally table
-- Takes about 21 seconds
SELECT SomePK,
Character = SUBSTRING(source.SomeString,t.N,1),
[Count] = COUNT(*)
FROM dbo.JBMTest source,
dbo.Tally t
WHERE t.N <= LEN(source.SomeString)
GROUP BY source.SomePK,SUBSTRING(source.SomeString,t.N,1)
ORDER BY source.SomePK,Character
Neat, huh? It works by using a highly constrained triangular join on the Tally table to find each character as a single character substring... like I said, it's fast, too.
Obviously, you would change the table name from "JBMTest" to the actual name of your table and you'd have to change the column names to match your table, as well.
Please feel free to post back if you have any questions on this.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 13, 2007 at 3:20 am
Thanks for that,
I can think of a few other uses for this one so I will definitely give this a go!
Thanks again
August 13, 2007 at 6:42 am
...notice that I hate loops and cursors so much that I even avoided them... |
But still using non ANSI-92 joins eh
Far away is close at hand in the images of elsewhere.
Anon.
August 13, 2007 at 6:50 pm
Heh... just seeing if you're awake, David
Ah... old habits are hard to break
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply