December 12, 2007 at 2:08 am
hi! i have a problem on how to extract data to another table.
Example:
table_old
id(int):1
remarks(text):123-01,EX,2600,123-02,EX,2600
the result will be
table_new
ID t_no type Part
1 123-1 EX 2600
1 123-2 EX 2600
the order is always like that.
thanks in advance.
December 12, 2007 at 2:15 am
Use
select
into
from
(or)
Insert into
select
karthik
December 12, 2007 at 2:28 am
hi,i think my question is not clear,sorry for that.
the value of remaks is:
remarks(text):123-01,EX,2600,123-02,EX,2600
i have to manipulate this characters in remarks,
so the output will be:
table_new
ID t_no type Part
1 123-01 EX 2600
1 123-02 EX 2600
i have 4 fields in my new table.
id,t_no,type and Part
thanks...
------------my old post-------------
hi! i have a problem on how to extract data to another table.
Example:
table_old
id(int):1
remarks(text):123-01,EX,2600,123-02,EX,2600
the result will be
table_new
ID t_no type Part
1 123-1 EX 2600
1 123-2 EX 2600
the order is always like that.
thanks in advance.
December 12, 2007 at 9:45 pm
any help from this forum very much appreciated.
thanks
December 12, 2007 at 9:51 pm
Try searching this site for split functions. They'll take your single text value and return a set of the individual values.
i.e...
http://www.sqlservercentral.com/scripts/Miscellaneous/31913/
______________________________________________________________________
Personal Motto: Why push the envelope when you can just open it?
If you follow the direction given HERE[/url] you'll likely increase the number and quality of responses you get to your question.
Jason L. SelburgDecember 12, 2007 at 10:56 pm
thank you very much.i ask again if i can't do it right.
December 12, 2007 at 11:07 pm
karthikeyan (12/12/2007)
Useselect
into
from
(or)
Insert into
select
Good... you're trying to answer questions and that's a good thing. But go back and read the request... OP needs to split the data, too... do you know how to do that, Karthik?
--Jeff Moden
Change is inevitable... Change for the better is not.
December 13, 2007 at 2:50 am
hi!well i cannot do it.i hope all of you can help me again.
About jason post,he give a site wherein there's a code
on how to split a string(http://www.sqlservercentral.com/scripts/Miscellaneous/31913/)
,its ok,but its not the output i want,it's only one column,it should be three.
Here is the example:
table_old
id(int):1
remarks(text):123-01,EX,2600,123-02,EX,2600
the result will be
table_new
ID t_no type Part
1 123-01 EX 2600
1 123-02 EX 2600
thanks in advanced
December 16, 2007 at 11:17 pm
Jeff,
Sorry for late reply.
Solution:
Create table Extract
(
ID Int,
Remarks varchar(100)
)
insert into Extract
select 1,'123-01,EX,2600'
insert into Extract
SELECT 2,'123-01,EX,2600,123-02,EX,2600'
insert into Extract
SELECT 3,'123-01,EX,2600,123-02,EX,2600,123-02,EX,2600'
For example i have inserted 3 rows.Metnik wants to extract in the below format.
ID Type Exam Year
1 123-01 EX 2006
2 123-02 EX 2006
2 123-02 EX 2006
3 123-01 EX 2006
3 123-02 EX 2006
3 123-03 EX 2006
Query:
Declare @One int,@Two,@Three int,@MaxLength int,@Start int
Select @One = 1,@Two = 8,@Three = 11,@Start = 1
Select ID,Substring(Remarks,@One,6)Type,
Substring(Remarks,@Two,2)Ex,
Substring(Remarks,@Three,4)Year
into #Temp
from dbo.Extract
where 1= len(Remarks)/14
select @MaxLength = max(Len(Remarks)/14) from dbo.Extract
while @Start <= @MaxLength
Begin
Insert into #Temp
Select ID,Substring(Remarks,@One,6),
Substring(Remarks,@Two,2),
Substring(Remarks,@Three,4)
from dbo.Extract
where len(Remarks)/14 > 1
and Substring(Remarks,@One,6) <> '' -- To avoid unwanted repeation of loop say for
example Maxlength is 3,So if we didn't include this line it will produce a blank line for ID-2.But we no need to repeat the loop three times for ID-2.
select @One = @One + 15
select @Two = @One + 7
select @Three = @Two+10
Select @Start = @Start + 1
End
Select * from #Temp
Output:
-----------------------
ID Type Exam Year
1 123-01 EX 2006
2 123-02 EX 2006
2 123-02 EX 2006
3 123-01 EX 2006
3 123-02 EX 2006
3 123-03 EX 2006
-----------------------
Jeff,Actually i have tried to do this one without loop.But i am not able to do it without loop.Yes i tried to solve it by using spt_values table.But i can't.
My Questions:
1) Can you explain me how to accomplish it without using Loop especially i want to use spt_values table in this logic. is it possible ?
2) Can you give me your valuable suggestions about my Code ?
3) If any suggestion,Can you tell me in which area i have to made changes ?
Thanks in advance.
karthik
December 17, 2007 at 10:48 am
Karthik, it looks as if your code will fail if the length of any part changes because you've hard-coded each length. Also, I wouldn't use the spt_Values table in a production job because it only goes up to 255 in 2k and only up to 2047 in 2k5 ... like I said before, I only use it for examples.
Metnik and Karthik,
In order to do things like this while avoiding the loop and still maintaining high performance levels, you need to make a Tally table. A Tally table, also known as a Numbers table, contains nothing more than a very well indexed single column of numbers that start at 1 and go to some number. I maintain a Tally table with 11,000 number because 11,000 is larger than the number of characters in the VARCHAR data type and, if I use it for creating sequential dates, provides more than 30 years of days (typical mortgage is 30 years or less).
Here's how to make a Tally table...
--===== Create and populate the Tally table on the fly
SELECT TOP 11000 --equates to more than 30 years of dates
IDENTITY(INT,1,1) AS N
INTO dbo.Tally
FROM Master.dbo.SysColumns sc1,
Master.dbo.SysColumns sc2
--===== Add a Primary Key to maximize performance
ALTER TABLE dbo.Tally
ADD CONSTRAINT PK_Tally_N
PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100
--===== Allow the general public to use it
GRANT SELECT ON dbo.Tally TO PUBLIC
Once you have such a table permanently installed in your database (and you should have one because of all it's uses), this problem becomes downright simple to resolve... the following is the solution for SQL Server 2k and works well in 2k5. You could also convert the #Split table to a CTE in 2k5...
--===== Create and populate a test table.
-- This is NOT part of the solution.
IF OBJECT_ID('TempDB..#YourTable','U') IS NOT NULL
DROP TABLE #YourTable
CREATE TABLE #YourTable
(
ID INT,
Remarks VARCHAR(100)
)
INSERT INTO #YourTable
SELECT 1,'123-01,AX,2600' UNION ALL
SELECT 2,'123-02,BX,2602,123-03,CX,2603' UNION ALL
SELECT 3,'123-04,DX,2604,123-05,EX,2605,123-06,FX,2606'
GO
-----------------------------------------------------------------------------------
--===== Solve the problem ==========================
--===== Declare a variable to hold the delimiter
DECLARE @Delim CHAR(1)
SET @Delim = ','
--===== Create and populate a "split" table with all the "pieces" split out
IF OBJECT_ID('TempDB..#Split','U') IS NOT NULL
DROP TABLE #Split
SELECT ID,
Val = SUBSTRING(@Delim+h.Remarks+@Delim,
t.N+1,
CHARINDEX(@Delim, @Delim+h.Remarks+@Delim, t.N+1)-t.N-1),
Posit = t.N-LEN(REPLACE(LEFT(@Delim+h.Remarks+@Delim,t.N), @Delim, ''))
INTO #Split
FROM dbo.Tally t
RIGHT OUTER JOIN --Necessary in case Remarks is NULL
dbo.#YourTable h
ON SUBSTRING(@Delim+h.Remarks+@Delim, t.N, 1) = @Delim
AND t.N < LEN(@Delim+h.Remarks+@Delim)
--===== Reassemble the peices (Split table could be a CTE in 2k5)
SELECT s1.ID,
T_No = s1.Val,
Type = s2.Val,
Part = s3.Val
FROM #Split s1
INNER JOIN #Split s2 ON s2.ID = s1.ID AND s2.Posit = s1.Posit+1
INNER JOIN #Split s3 ON s3.ID = s1.ID AND s3.Posit = s1.Posit+2
WHERE s1.Posit%3 = 1
The Tally table takes the place of the loop...
--Jeff Moden
Change is inevitable... Change for the better is not.
December 17, 2007 at 8:02 pm
Actually, here's a solution that doesn't use a temp table...
--===== Solve the problem ==========================
--===== Declare a variable to hold the delimiter
DECLARE @Delim CHAR(1)
SET @Delim = ','
SELECT ID,
MAX(CASE WHEN Posit = 0 THEN Val END),
MAX(CASE WHEN Posit = 1 THEN Val END),
MAX(CASE WHEN Posit = 2 THEN Val END)
FROM (
SELECT ID,
Val = SUBSTRING(@Delim+h.Remarks+@Delim,
t.N+1,
CHARINDEX(@Delim, @Delim+h.Remarks+@Delim, t.N+1)-t.N-1),
Posit = (t.N-LEN(REPLACE(LEFT(@Delim+h.Remarks+@Delim,t.N), @Delim, ''))-1)%3,
Grp = (t.N-LEN(REPLACE(LEFT(@Delim+h.Remarks+@Delim,t.N), @Delim, ''))-1)/3
FROM dbo.Tally t
RIGHT OUTER JOIN --Necessary in case Remarks is NULL
dbo.#YourTable h
ON SUBSTRING(@Delim+h.Remarks+@Delim, t.N, 1) = @Delim
AND t.N < LEN(@Delim+h.Remarks+@Delim)
)d
GROUP BY ID,Grp
ORDER BY ID,Grp
--Jeff Moden
Change is inevitable... Change for the better is not.
December 18, 2007 at 6:26 pm
hi to all guys!wow!you have many feedback regarding my comment.
thank you very much,especially to Jeff.i haven't test your code yet.
i will give a feedback immediately.thanks again!:D
December 20, 2007 at 4:43 am
Jeff,
Really i am wondering about your skill.
Can you explain me your two different queries in detailed ?Also,
1. What do you mean by @Delim+Remarks+@Delim ?
i.e Why you have added 'comma(,)' in both side.
2. why you have used Right Outer Join ?
3. Why you have used 'syscolumns' while creating Tally tables ? is there any reason ? can't we use other system tables like sysobjects etc., ?
4.Why you have used Innner Join for the same table(i.e #split) ?
5. What do you mean by posit%3 ? Also why we are checking whether posit%3 = 1 (Can i use 2 or else ? )
Please explain me this part in detail.
I have checked your first code only.I didn't check your second code.
Once again,Thanks for teaching some best code to me.
karthik
December 20, 2007 at 9:37 am
karthikeyan (12/20/2007)
Jeff,Really i am wondering about your skill.
Can you explain me your two different queries in detailed ?Also,
1. What do you mean by @Delim+Remarks+@Delim ?
i.e Why you have added 'comma(,)' in both side.
2. why you have used Right Outer Join ?
3. Why you have used 'syscolumns' while creating Tally tables ? is there any reason ? can't we use other system tables like sysobjects etc., ?
4.Why you have used Innner Join for the same table(i.e #split) ?
5. What do you mean by posit%3 ? Also why we are checking whether posit%3 = 1 (Can i use 2 or else ? )
Please explain me this part in detail.
I have checked your first code only.I didn't check your second code.
Once again,Thanks for teaching some best code to me.
1. The reason for adding a comma to both sides is so that each "element" has a comma on each side and can be treated equally. In other words, the same code works if there's one element or many. For example, take the following lines of text that we need to parse...
Part1
Part1,Part2
Part1,Part2,Part3
... if I just leave it like that, then I would need to treat the "search" differently for each part because I wouldn't be able to just do a search for ",Partx,". I'd have to search for "Partx", "Partx,", ",Partx", and ",Partx,". If I simply add a comma to the beginning and the end of each line of text, all the parts end up looking like ",Partx," which makes the code real simple...
,Part1,
,Part1,Part2,
,Part1,Part2,Part3,
... see? Each and every "element" is surrounded by commas and can be treated the same way in code. No special programming.
2. Why have I used a Right Outer Join? Just like the comment in the code says, "Necessary in case Remarks is NULL". We want to return the ID of the source table even if there are no remarks so that we can show there are no remarks in the source table.
3. Why did I use Master.dbo.SysColumns instead of any other table in the creation of the Tally Table? The answer is that I could have... it depends on the number of rows you want in the Tally table. I frequently use the same method to build millions of rows of test data. I know that, even on a new instance of SQL Server, that SysColumns will never have less that 4,000 rows in it. That means that, thanks to the Cross-Join on itself, I'll always be able to get up to 4000*4000 or 16 million rows out of that table. Since it's in the Master data base, I know it's guaranteed to always be there with at least 4,000 rows in it.
4. Why did I Inner Join the same table 3 times in the first code? Because I knew I needed 3 "elements" to be returned for each ID. Actually, that example is bad code on my part... the second example is much more effecient.
5. What does "posit%3" mean. Well, first ya gotta look in Books Online and figure out that the "%" operator is the Modulo operator. If I evenly divide 2 by 3, the remainder is 2. If I evenly divide 8 by 3, the remainder is also 2. I use modulo to number the 3 elements of each split since the OP said the elements will always appear in groups of 3. If you run the subquery in the second bit of code, you'll see the result and then understand that the outer query is nothing more than a cross-tab that reassembles the peices of each "group" of elements.
Looking for Posit%3 = 1 happens to give me the "first" element of each group.
The second code I posted is much more interesting and effecient. It breaks up the elements in a fashion similar to the first but calculates the "position" and "group" of each element. For example, if a row has the following information on it (after I add a comma to each side)...
,Posit1Grp1,Posit2Grp1,Posit3Grp1,Posit1Grp2,Posit2Grp2,Posit3Grp2,
... we can plainly see that there are six elements of two groups of 3. If we turn that sideways, like I do in the code, we end up with the following (from the sub-query in the second code with a couple of columns added for explanation purposes)...
ID Val OrigPosit OrigPosit-1 Posit Grp
----------- ---------- ----------- ----------- ----------- -----------
1 Posit1Grp1 1 0 0 0
1 Posit2Grp1 2 1 1 0
1 Posit3Grp1 3 2 2 0
1 Posit1Grp2 4 3 0 1
1 Posit2Grp2 5 4 1 1
1 Posit3Grp2 6 5 2 1
Notice that the "OrigPosit" identifies the six individual elements and their positions on the orginal line of text. Notice that "OrigPosit-1" is nothing more than the original "Posit-1". I did that because Modulo (%) has a base of zero...
The "Posit" column is created by evenly dividing the "OrigPosit-1" column by 3 and taking the remainder (that's what Modulo does). The GRP column was formed by evenly dividing the "OrigPosit-1" by 3 and taking the quotient (answer from division) with no remainder which successfully identifies which group of 3 the elements came from (Group 0 is the first group)
The Cross-Tab in the Outer query simply reassembles that information by unique ID, Grp, and Posit.
Simple yet elegant Math.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 21, 2007 at 12:21 am
Jef,
Wow ! Really interesting! It opens my eye broadly.
Thanks for your explanation with good example.Once again,Really i am wondering about your skill.
I didn't check your second code.I am going to test it today,I will get back to you shortly.
Once again,Thanks a lot for your answer with good logic,elegant, and examples.
karthik
Viewing 15 posts - 1 through 15 (of 17 total)
You must be logged in to reply to this topic. Login to reply