December 10, 2009 at 7:42 am
I am trying to import a text file with 37 fields delimited by "/". The file comes from a live GPS feed from several devices, so it is constantly being updated. Each device is identified by a unique ID. The timestamp for each entry is in the format DDHHMMSS, so 2:10:45 PM on the 10th would be 10141045; no month or year is given.
Importing the data is easy. However, I need to parse some of the information, like the timestamp field, into a usable form (make it into a real timestamp that SQL can understand). In addition, I need to have this set up so that it runs automatically every couple of minutes and imports only the most recent data. Just to make things interesting, the script would need to make sure each line is complete, because it could attempt to import data while a new record is being written, in which case not all 37 fields would be there.
Finally, I need to identify the most current record for each device ID and indicate it somehow in the database (flag it in some way). We will be using the data in a GIS mapping program, so by identifying the most current record I can also show the breadcrumbs (previous locations). I only need to be able to see the previous 15 records or so, but I need to store all of the records long term. One idea I had was a field in the database that stores a number 1-16, with 1 being the most current record and 16 the oldest. However, I have no idea how to do this.
I am at a complete loss on how to do this. Any help would be greatly appreciated. Thanks.
December 14, 2009 at 1:01 pm
>>I need to parse some of the information, like the timestamp field, into a usable form (make it into a real timestamp that SQL can understand).
I would think you'd want a timestamp field defaulted to getdate() in your destination table. You could combine this with your DDHHMMSS format to generate a real datetime field that you could use for time- and date-based calculations. Since there is no month or year in the incoming data, you will need to ensure that records created in a particular month are imported as that month. This is a problem with your source data which I would consider serious: it will create integrity/timing problems. What if you import records in the first minute of January? You'll have to have some sort of logic to say those are really December records.
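One way to sketch that month-rollback logic, using the DDHHMMSS format from the original post (the variable names here are just illustrative, not from the actual feed):

```sql
-- Sketch: resolve a DDHHMMSS value against the current month, falling back
-- to the previous month when the day portion is ahead of today's date
-- (the "January import of December records" case).
declare @RawTS char(8);
select @RawTS = '31235910';  -- e.g. a Dec 31 record imported on Jan 1

declare @BaseMonth datetime;
select @BaseMonth = dateadd(month, datediff(month, 0, getdate()), 0);

-- If the feed's day-of-month is later than today's, assume last month.
if cast(substring(@RawTS, 1, 2) as int) > day(getdate())
    select @BaseMonth = dateadd(month, -1, @BaseMonth);

select dateadd(second, cast(substring(@RawTS, 7, 2) as int),
       dateadd(minute, cast(substring(@RawTS, 5, 2) as int),
       dateadd(hour,   cast(substring(@RawTS, 3, 2) as int),
       dateadd(day,    cast(substring(@RawTS, 1, 2) as int) - 1, @BaseMonth))));
```

This only handles the day-rollover heuristic; it assumes records never arrive more than a month late.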
>>I need to have this setup so that it automatically runs every couple of minutes
Not a problem; just schedule it at the desired frequency.
>> imports only the most recent data
Typically a series of files would be imported, but it sounds like you have one file that is constantly being written to. Correct? Do you have a unique way to identify each record? If so, you could import all the data into a staging table and then compare it, using NOT IN, to your destination table to identify the 'recent data'. Then import only the records from the staging table which don't already exist in your destination table.
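A rough sketch of that staging-table step, assuming a natural key of device ID plus raw timestamp (all table and column names here are illustrative; NOT EXISTS is used instead of NOT IN because it behaves better when NULLs are possible):

```sql
-- Sketch: move only new rows from staging into the destination table.
insert into dbo.GPSLog (DeviceID, RawTS, Lat, Long)
select s.DeviceID, s.RawTS, s.Lat, s.Long
from dbo.GPSStaging s
where not exists (select 1
                  from dbo.GPSLog d
                  where d.DeviceID = s.DeviceID
                    and d.RawTS    = s.RawTS);

truncate table dbo.GPSStaging;  -- clear staging for the next scheduled run
```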
>>the script would need to make sure each line is complete because it could attempt to import data
Typically you can't import data from an open file, for exactly this reason.
>>Finally, I need to identify the most current record for each device ID and indicate it somehow in the database
Select the max timestamp for each ID (assuming you can generate a valid datetime stamp).
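For example, something like this (a sketch; `dbo.GPSLog` and `LogTime` stand in for your real table and derived-datetime column):

```sql
-- Sketch: latest record per device, via max timestamp per DeviceID.
select d.*
from dbo.GPSLog d
join (select DeviceID, max(LogTime) as MaxTime
      from dbo.GPSLog
      group by DeviceID) m
  on m.DeviceID = d.DeviceID
 and m.MaxTime  = d.LogTime;
```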
>>show the breadcrumbs (previous locations). I only need to be able to see the previous 15 records or so, but I need to store all of the records long term. One idea I had was to use a field within the database that stores a number 1-16 with 1 being the most current record and 16 being the oldest
Bad idea. If you have valid data you shouldn't have to store a ranking in the table; use the TOP clause at query time instead.
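That is, instead of maintaining a 1-16 counter, just ask for the newest 16 rows per device when you need the breadcrumbs (again a sketch, with illustrative names):

```sql
-- Sketch: the 16 most recent fixes for one device, newest first.
declare @DeviceID int;
select @DeviceID = 42;  -- illustrative device ID

select top (16) DeviceID, LogTime, Lat, Long
from dbo.GPSLog
where DeviceID = @DeviceID
order by LogTime desc;
```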
December 14, 2009 at 1:08 pm
can you post some sample data you would be importing as well as the CREATE TABLE of the destination table?
you said the first two digits of 10141045 were the day, so it assumes the 10th of the current month; is that correct? You explained the rest of the field just fine.
Lowell
December 14, 2009 at 1:27 pm
also, here is an example of converting that unusual date format into a datetime field:
--results:
BeginOfMonth            DaMonth                 DaFinalDate
----------------------- ----------------------- -----------------------
2009-12-01 00:00:00.000 2009-12-10 00:00:00.000 2009-12-10 14:10:45.000
2009-12-01 00:00:00.000 2009-12-10 00:00:00.000 2009-12-10 14:11:37.000
2009-12-01 00:00:00.000 2009-12-10 00:00:00.000 2009-12-10 15:09:22.000
create table #import (ImportDateAsUnusualFormat char(10))
insert into #import
SELECT '10141045' UNION ALL
SELECT '10141137' UNION ALL
SELECT '10150922'
With MyCTE As (
--split the DDHHMMSS string into its integer parts
select
CONVERT(int,substring(ImportDateAsUnusualFormat,1,2)) As theDayOfMonth,
CONVERT(int,substring(ImportDateAsUnusualFormat,3,2)) As theHours,
CONVERT(int,substring(ImportDateAsUnusualFormat,5,2)) As theMinutes,
CONVERT(int,substring(ImportDateAsUnusualFormat,7,2)) As theSeconds
from #import )
SELECT
--first day of the current month (Dec 1, for example)
DATEADD(mm, DATEDIFF(mm,0,getdate()), 0) As BeginOfMonth,
--add theDayOfMonth - 1 to the beginning of the month (day 10 = Dec 10, not Dec 11)
DATEADD(dd,theDayOfMonth - 1,DATEADD(mm, DATEDIFF(mm,0,getdate()), 0)) As DaMonth,
--be lazy: convert hours/minutes to seconds, and just add them to the date we created
dateadd(ss,
(theHours * 3600) + (theMinutes * 60) + theSeconds,
DATEADD(dd,theDayOfMonth - 1,DATEADD(mm, DATEDIFF(mm,0,getdate()), 0))
) As DaFinalDate
From MyCTE
Lowell
December 14, 2009 at 1:48 pm
As far as turning the text-file timestamp into a real timestamp, take a look at this and see if it'll do what you need:
declare @TS char(8);
select @TS = '10141045';
select
dateadd(month, datediff(month, 0, getdate()), 0),
substring(@TS, 1, 2) - 1,
substring(@TS, 3, 2),
substring(@TS, 5, 2),
substring(@TS, 7, 2),
dateadd(second,
cast(substring(@TS, 7, 2) as int),
dateadd(minute,
cast(substring(@TS, 5, 2) as int),
dateadd(hour,
cast(substring(@TS, 3, 2) as int),
dateadd(day,
cast(substring(@TS, 1, 2) as int) - 1,
dateadd(month, datediff(month, 0, getdate()), 0)))));
The first part just creates a variable for testing. The query is split in two: the first half is just the pieces of the string, there to show you what I'm doing, and you'd get rid of it in the real query. The second part uses nested DateAdd functions on each piece of the "timestamp".
Instead of a variable, you'd use a column name, of course.
For the import, you'd want to import to a staging table. Then you can format the data the way you want, validate that you're only getting the new rows, and then add it to the target table.
For the numbering thing, I'd do that at runtime, not in the table.
Might look something like this:
-- Set up test/proof of concept
if object_id(N'tempdb..#T') is not null
drop table #T;
if object_id(N'tempdb..#GPSes') is not null
drop table #GPSes;
create table #T (
ID int identity primary key,
GPSID uniqueidentifier,
LogTime datetime,
Lat float,
Long float);
create table #GPSes (
ID uniqueidentifier);
insert into #GPSes (ID)
select newid()
from dbo.Numbers
where Number between 1 and 10;
with
Times (DT) as
(select dateadd(second, Number-1, '12/14/2009')
from dbo.Numbers
where Number between 1 and 3600)
insert into #T (GPSID, LogTime, Lat, Long)
select ID, DT, checksum(newid())%180, checksum(newid())%180
from #GPSes
cross join Times
order by ID, DT;
-- Query data (use your real columns and table name here)
;with
CTE as
(select
*,
row_number() over (partition by GPSID order by LogTime desc) as Row
from #T)
select *
from CTE
where Row <= 16
order by GPSID, Row desc;
Does that help?
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon