July 5, 2011 at 4:38 pm
Hi all. I need help (again) trying to get the cursors out of my queries.
I have the following tables (Location and Segment):
CREATE TABLE [Location](
[Location_ID] [varchar](24) NOT NULL,
[Loc_KEY] [varchar](12) NOT NULL,
[Loc_MEAS] [decimal](6, 2) NOT NULL,
CONSTRAINT [PK_Location] PRIMARY KEY NONCLUSTERED
([Location_ID] ASC ))
CREATE TABLE [Segment](
[Segment_ID] [int] NOT NULL,
[BEG_LOC_ID] [varchar](24) NOT NULL,
[END_LOC_ID] [varchar](24) NOT NULL,
CONSTRAINT [PK_Segment] PRIMARY KEY NONCLUSTERED
( [Segment_ID] ASC ))
Multiple locations have a given KEY, but the combination of KEY and MEAS will uniquely identify a given location. A segment must have a Begin Location and an End Location. A segment's begin and end locations must have the same KEY. Segments do overlap so a single location can be associated with multiple segments. A location is associated with a segment if the location's MEAS value is between the begin location's MEAS value and the end location's MEAS value. A segment must have 2 to many locations, and a location can be associated with 1 to many segments.
My challenge is to populate the many-to-many table which associates locations and segments:
CREATE TABLE [Segment_Location](
[Location_ID] [varchar](24) NOT NULL,
[Segment_ID] [int] NOT NULL,
)
I have written a select statement that includes temp tables, cursors, etc. Its kinda ugly and takes 20 seconds to process 2000 test records. In Production, I will have to process tens of thousands of records. I need a quality upgrade for sure.
The 'english' translation of my current proc is:
- populate a cursor with each combination of segment id, begin location, and end location
- for each record in the cursor, execute a query which selects all the locations that
1) have the same KEY value as the current segment's begin location, and
2) have a MEAS value between the current segment's begin MEAS and end MEAS
- stuff the results of the select statement into a Temp table
- get the next record in the cursor
- after all records in cursor have been processed, insert all records that are in the temp table into the many-to-many table.
Any suggestions on how i can get rid of the cursor?
Thank you.
July 5, 2011 at 4:39 pm
You know, the people that help out here are all un-paid volunteers, so please HELP US HELP YOU. Providing the DDL scripts (CREATE TABLE, CREATE INDEX, etc.) for the tables affected, and INSERT statements to put some test data into those tables that shows your problem will go a long way in getting people to look at your issue and help you out. Please include code for what you have already tried. Don't forget to include what your expected results should be, based on the sample data provided. As a bonus to you, you will get tested code back. For more details on how to get all of this into your post, please look at the first link in my signature.
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
July 5, 2011 at 6:52 pm
Tom Rippetoe (7/5/2011)
The 'english' translation of my current proc is:- populate a cursor with each combination of segment id, begin location, and end location
- for each record in the cursor, execute a query which selects all the locations that
1) have the same KEY value as the current segment's begin location, and
2) have a MEAS value between the current segment's begin MEAS and end MEAS
- stuff the results of the select statement into a Temp table
- get the next record in the cursor
- after all records in cursor have been processed, insert all records that are in the temp table into the many-to-many table.
Any suggestions on how i can get rid of the cursor?
Yep... CROSS JOIN.
--Jeff Moden
Change is inevitable... Change for the better is not.
July 5, 2011 at 6:55 pm
Hi Wayne.
I apologize for being too casual and too quick on my original post. Below are some sample data and the code that i am currently using.
Sample data for the Location table:
INSERT INTO location Values ('abc_123', 'AAA', 10.00);
INSERT INTO location Values ('abc_124', 'AAA', 10.10);
INSERT INTO location Values ('abc_125', 'AAA', 10.20);
INSERT INTO location Values ('abc_126', 'AAA', 10.30);
INSERT INTO location Values ('abc_127', 'AAA', 10.40);
INSERT INTO location Values ('abc_128', 'AAA', 10.50);
INSERT INTO location Values ('abc_129', 'AAA', 10.60);
INSERT INTO location Values ('abc_130', 'AAA', 10.70);
INSERT INTO location Values ('abc_131', 'AAA', 10.80);
INSERT INTO location Values ('abc_132', 'AAA', 10.90);
INSERT INTO location Values ('abc_133', 'AAA', 11.00);
Sample data for the Segment table:
INSERT INTO Segment Values (1, 'abc_123', 'abc_128');
INSERT INTO Segment Values (2, 'abc_124', 'abc_128');
INSERT INTO Segment Values (3, 'abc_125', 'abc_129');
INSERT INTO Segment Values (4, 'abc_126', 'abc_132');
INSERT INTO Segment Values (5, 'abc_127', 'abc_133');
Expected data in the many-to-many table Segment_Location:
SELECT Location_ID, Segment_ID FROM Segment_Location
('abc_123', 1)
('abc_124', 1)
('abc_125', 1)
('abc_126', 1)
('abc_127', 1)
('abc_128', 1)
('abc_124', 2)
('abc_125', 2)
('abc_126', 2)
('abc_127', 2)
('abc_128', 2)
('abc_125', 3)
('abc_126', 3)
('abc_127', 3)
('abc_128', 3)
('abc_129', 3)
('abc_126', 4)
('abc_127', 4)
('abc_128', 4)
('abc_129', 4)
('abc_130', 4)
('abc_131', 4)
('abc_132', 4)
('abc_127', 5)
('abc_128', 5)
('abc_129', 5)
('abc_130', 5)
('abc_131', 5)
('abc_132', 5)
('abc_133', 5)
And here is a version of the cursor code i am using to populate the many-to-many table;
set nocount ON;
DECLARE @Loc_Seg TABLE (
LocId varchar(24),
SegId int
)
DECLARE @segId int, @key varchar(12), @seg_bmp decimal(5,2), @seg_emp decimal(5,2)
DECLARE Segment_Cursor CURSOR FOR SELECT seg.Segment_ID, bmp.Loc_KEY, bmp.Loc_MEAS, emp.Loc_MEAS
FROM Segment seg
INNER JOIN location bmp ON seg.BEG_LOC_ID = bmp.Location_ID
INNER JOIN location emp ON seg.END_LOC_ID = emp.Location_ID
OPEN Segment_Cursor
FETCH NEXT FROM Segment_Cursor INTO @segId, @key, @seg_bmp, @seg_emp
WHILE @@FETCH_STATUS = 0
BEGIN
INSERT INTO @Loc_Seg(LocId,SegId)
SELECT l.Location_ID, @segId
FROM Location l
WHERE l.Loc_KEY = @key AND l.Loc_MEAS BETWEEN @seg_bmp AND @seg_emp
FETCH NEXT FROM Segment_Cursor INTO @segId, @key, @seg_bmp, @seg_emp
END
CLOSE Segment_Cursor
DEALLOCATE Segment_Cursor
INSERT INTO Segment_Location(Location_ID, Segment_ID) SELECT locid, segid FROM @Loc_Seg
Thank you for your patience and time.
July 5, 2011 at 9:54 pm
Tom,
For the sample data and expected results that you provided, the following code works. However, there are no checks going on with the beginning/ending MEAS values. Would you like to create some sample data that makes us have to do this? (Don't forget the expected results for the sample data provided, so we can check to see if we've got it correct or not.)
INSERT INTO Segment_Location (Location_ID, Segment_Id)
SELECT l.Location_Id, s.Segment_Id
FROM Location l
JOIN Segment s
ON l.Location_ID BETWEEN s.BEG_LOC_ID AND s.END_LOC_ID;
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
July 5, 2011 at 10:00 pm
posted in error
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
July 5, 2011 at 10:08 pm
I was pretty excited when I saw your first response, then reality set in .... You are right; the solution works because of the simplistic data set i provided. I will put together a hopefully more robust example set and expected output, and post it here tonight.
Thanks again.
July 5, 2011 at 11:19 pm
Here is a slightly more robust collection of Location data (at least it breaks the sequential nature of the data and adds a second 'key' value):
INSERT INTO location Values ('abc_123', 'AAA', 10.00);
INSERT INTO location Values ('abc_124', 'AAA', 10.10);
INSERT INTO location Values ('abc_125', 'AAA', 10.20);
INSERT INTO location Values ('abc_126', 'AAA', 10.30);
INSERT INTO location Values ('abc_127', 'AAA', 10.40);
INSERT INTO location Values ('abc_128', 'AAA', 10.50);
INSERT INTO location Values ('abc_129', 'AAA', 10.60);
INSERT INTO location Values ('abc_130', 'AAA', 10.70);
INSERT INTO location Values ('abc_131', 'AAA', 10.80);
INSERT INTO location Values ('abc_132', 'AAA', 10.90);
INSERT INTO location Values ('abc_133', 'AAA', 11.00);
INSERT INTO location Values ('AAA_1100', 'AAA', 12.49);
INSERT INTO location Values ('rtyt_099', 'AAA', 55.48);
INSERT INTO location Values ('ghty_bbhg', 'AAA', 9.55);
INSERT INTO location Values ('yuiyyte', 'AAA', 10.44);
INSERT INTO location Values ('hjegeer', 'BBB', 52.00);
INSERT INTO location Values ('yuijree', 'BBB', 49.58);
INSERT INTO location Values ('hyjukit', 'BBB', 1.05);
INSERT INTO location Values ('thyjaae', 'BBB', 33.58);
INSERT INTO location Values ('bghty_1', 'BBB', 100.75);
INSERT INTO location Values ('r5t67tg', 'BBB', 23.98);
And some slightly more robust segment data:
INSERT INTO Segment Values (1, 'abc_123', 'abc_128');
INSERT INTO Segment Values (2, 'abc_124', 'abc_128');
INSERT INTO Segment Values (3, 'abc_125', 'abc_129');
INSERT INTO Segment Values (4, 'abc_126', 'abc_132');
INSERT INTO Segment Values (5, 'abc_127', 'abc_133');
INSERT INTO Segment Values (6, 'r5t67tg', 'bghty_1');
INSERT INTO Segment Values (7, 'hyjukit', 'yuijree');
INSERT INTO Segment Values (8, 'yuiyyte', 'abc_129');
INSERT INTO Segment Values (9, 'thyjaae', 'hjegeer');
INSERT INTO Segment Values (10, 'yuijree', 'hjegeer');
And the expected contents of the many-to-many table:
INSERT INTO Segment_Location Values ('abc_123', 1);
INSERT INTO Segment_Location Values ('abc_124', 1);
INSERT INTO Segment_Location Values ('abc_125', 1);
INSERT INTO Segment_Location Values ('abc_126', 1);
INSERT INTO Segment_Location Values ('abc_127', 1);
INSERT INTO Segment_Location Values ('yuiyyte', 1);
INSERT INTO Segment_Location Values ('abc_128', 1);
INSERT INTO Segment_Location Values ('abc_124', 2);
INSERT INTO Segment_Location Values ('abc_125', 2);
INSERT INTO Segment_Location Values ('abc_126', 2);
INSERT INTO Segment_Location Values ('abc_127', 2);
INSERT INTO Segment_Location Values ('yuiyyte', 2);
INSERT INTO Segment_Location Values ('abc_128', 2);
INSERT INTO Segment_Location Values ('abc_125', 3);
INSERT INTO Segment_Location Values ('abc_126', 3);
INSERT INTO Segment_Location Values ('abc_127', 3);
INSERT INTO Segment_Location Values ('yuiyyte', 3);
INSERT INTO Segment_Location Values ('abc_128', 3);
INSERT INTO Segment_Location Values ('abc_129', 3);
INSERT INTO Segment_Location Values ('abc_126', 4);
INSERT INTO Segment_Location Values ('abc_127', 4);
INSERT INTO Segment_Location Values ('abc_128', 4);
INSERT INTO Segment_Location Values ('yuiyyte', 4);
INSERT INTO Segment_Location Values ('abc_129', 4);
INSERT INTO Segment_Location Values ('abc_130', 4);
INSERT INTO Segment_Location Values ('abc_131', 4);
INSERT INTO Segment_Location Values ('abc_132', 4);
INSERT INTO Segment_Location Values ('abc_127', 5);
INSERT INTO Segment_Location Values ('abc_128', 5);
INSERT INTO Segment_Location Values ('yuiyyte', 5);
INSERT INTO Segment_Location Values ('abc_129', 5);
INSERT INTO Segment_Location Values ('abc_130', 5);
INSERT INTO Segment_Location Values ('abc_131', 5);
INSERT INTO Segment_Location Values ('abc_132', 5);
INSERT INTO Segment_Location Values ('abc_132', 5);
INSERT INTO Segment_Location Values ('r5t67tg', 6);
INSERT INTO Segment_Location Values ('thyjaae', 6);
INSERT INTO Segment_Location Values ('yuijree', 6);
INSERT INTO Segment_Location Values ('hjegeer', 6);
INSERT INTO Segment_Location Values ('bghty_1', 6);
INSERT INTO Segment_Location Values ('hyjukit', 7);
INSERT INTO Segment_Location Values ('r5t67tg', 7);
INSERT INTO Segment_Location Values ('thyjaae', 7);
INSERT INTO Segment_Location Values ('yuijree', 7);
INSERT INTO Segment_Location Values ('yuiyyte', 8);
INSERT INTO Segment_Location Values ('abc_128', 8);
INSERT INTO Segment_Location Values ('abc_129', 8);
INSERT INTO Segment_Location Values ('thyjaae', 9);
INSERT INTO Segment_Location Values ('yuijree', 9);
INSERT INTO Segment_Location Values ('hjegeer', 9);
INSERT INTO Segment_Location Values ('yuijree', 10);
INSERT INTO Segment_Location Values ('hjegeer', 10);
Thanks again for your time.
July 5, 2011 at 11:52 pm
Hi Jeff.
Thanks for your suggestion; it definitly sent me down what seems to be the right path. Not every location is associated with every segment so a 'straight-up' cross join did not work. I had to set a restriction on which locations were associated with a given segment, and that restriction is based on the segment's begin and end points as well as the locations' key.
For the test data i have so far (about 2k records), this query seems to do the trick. Anyway, it has gotten my query time down from 20 seconds to 6 seconds for the same set of test data.
WITH CTE AS (SELECT segment_ID, bmp.loc_Key as locKey, bmp.LOC_MEAS as bmp, emp.LOC_MEAS as emp
FROM segment s
INNER JOIN location bmp ON s.BEG_LOC_ID = bmp.Location_ID
INNER JOIN location emp ON s.END_LOC_ID = emp.Location_ID
)
SELECT location_ID, s.Segment_ID from location, segment s
INNER JOIN CTE c ON s.segment_id = c.segment_ID
WHERE
loc_meas BETWEEN c.bmp AND c.emp and c.locKey = loc_key
ORDER BY segment_id
July 6, 2011 at 2:13 pm
Tom,
So, are you happy with what you coded, or would you still like for someone to look into it further?
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
July 6, 2011 at 2:28 pm
Hi Wayne.
Thanks for the follow up.
I am always happy and eager to learn more techniques (for example, i had no idea that one could use 'between / and' in a JOIN statement) and about how to 'do sql' the right way. That being said, i recognize that your, and other's, time is voluntarily given so i dont want to ask for more than what is necessary. If you want to offer other approaches, i will use the info that you pass on to me, and pass it on to my workmates.
(I think that's a long winded way of answering your question with a 'yes, sure. please. but not if its a major drag or drain on your good will.')
Thank you.
July 9, 2011 at 4:31 pm
Tom Rippetoe (7/5/2011)
Hi Jeff.Thanks for your suggestion; it definitly sent me down what seems to be the right path. Not every location is associated with every segment so a 'straight-up' cross join did not work. I had to set a restriction on which locations were associated with a given segment, and that restriction is based on the segment's begin and end points as well as the locations' key.
For the test data i have so far (about 2k records), this query seems to do the trick. Anyway, it has gotten my query time down from 20 seconds to 6 seconds for the same set of test data.
WITH CTE AS (SELECT segment_ID, bmp.loc_Key as locKey, bmp.LOC_MEAS as bmp, emp.LOC_MEAS as emp
FROM segment s
INNER JOIN location bmp ON s.BEG_LOC_ID = bmp.Location_ID
INNER JOIN location emp ON s.END_LOC_ID = emp.Location_ID
)
SELECT location_ID, s.Segment_ID from location, segment s
INNER JOIN CTE c ON s.segment_id = c.segment_ID
WHERE
loc_meas BETWEEN c.bmp AND c.emp and c.locKey = loc_key
ORDER BY segment_id
Can't tell if what you've done is good or bad because things like loc_meas don't have an alias.
--Jeff Moden
Change is inevitable... Change for the better is not.
July 10, 2011 at 2:10 am
Hi Jeff.
Thank you for your follow up. I do greatly appreciate your time.
I am not sure if this is too much information, but hopefully it is helpful....
A 'segment' is a length of road that is uniquely defined by its begin location and end location. There can be 0 to many other 'locations' between the begin and end locations, but there must be one and only begin location and one and only one end location. By definition a 'segment' must be on one and only one roadway.
A 'location' represents a physical point on a road. That physical point is uniquely and exactly identified by the combination of its location key (a unique identifier for a given roadway) and its location measure, that is some distance, measured to the nearest 0.01 mile, along that roadway. Note that in the database we are recording 'locations of interest' - not every 0.01 mile point along a segment is 'interesting'. But there are many of those 0.01 mile points along a roadway that are 'interesting'. Each of those 'interesting' locations are recorded in the 'location' table.
My goal is to associate a set of 'locations' with a segment. Segments do overlap, so there is a many-to-many relationship between segments and locations. So, any location which exists between the segment's begin and end point needs to be associated with that segment.
Here is a hopefully 'improved' version of the same query. 'bmp' stands for 'beginning mile point', and 'emp' stands for 'ending mile point'. loc_key is the location key (unique road id) and loc_meas is the location measure (aka 0.01 mile point) along the roadway.
WITH cte AS
(
SELECT segment_ID, bmp.loc_Key AS locKey, bmp.loc_meas AS bmp, emp.loc_meas AS emp
FROM segment s
INNER JOIN location bmp ON s.beg_loc_id = bmp.location_id
INNER JOIN location emp ON s.end_loc_id = emp.location_id
)
SELECT l.location_id, s.Segment_id
FROM location l, segment s
INNER JOIN cte c ON s.segment_id = c.segment_ID
WHERE
l.loc_meas BETWEEN c.bmp AND c.emp AND c.locKey = l.loc_key
ORDER BY segment_id
August 5, 2011 at 7:31 pm
Damn. Tom, I'm sorry. I totally lost tract of this thread. Are you all set or do you still need some help?
--Jeff Moden
Change is inevitable... Change for the better is not.
August 8, 2011 at 10:07 am
Hello Jeff.
No worries at all. I think the code/querying is working pretty well now.
Thanks for the followup and all your time and advice.
Tom
Viewing 15 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply