October 25, 2010 at 8:27 am
i have a table with the following fields
[GUID] [uniqueidentifier] ROWGUIDCOL NOT NULL,
[VehicleGUID] [uniqueidentifier] NOT NULL,
[TimeStamp] [datetime] NOT NULL,
[X] [float] NULL,
[Y] [float] NULL,
[Accuracy] [tinyint] NOT NULL,
[Speed] [float] NOT NULL,
[Heading] [smallint] NOT NULL,
[CreationDate] [datetime] NULL,
[Battery] [tinyint] NULL
the key of the table is (VehicleGUID,TimeStamp)
which holds GPS data of vehicles. the table holds over 50 million records and takes a lot of disk space (this table with all it's indexes and a few other fields not listed take over 40GB)
i have discovered that the Battery field changes roughly for every vehicle once an hour while the table itself has data every minute from the GPS for that vehicle. and i have also discovered that the battery field on some vehicles is null as the GPS does not send battery info.
to save disk space, i want to split the battery field to another table and i also want to keep only the value changes so i will have a table structure like this
GPS Data Table:
[GUID] [uniqueidentifier] ROWGUIDCOL NOT NULL,
[VehicleGUID] [uniqueidentifier] NOT NULL,
[TimeStamp] [datetime] NOT NULL,
[X] [float] NULL,
[Y] [float] NULL,
[Accuracy] [tinyint] NOT NULL,
[Speed] [float] NOT NULL,
[Heading] [smallint] NOT NULL,
[CreationDate] [datetime] NULL
Battery Changes Table:
[VehicleGUID] [uniqueidentifier] NOT NULL,
[TimeStamp] [datetime] NOT NULL,
[Battery] [tinyint] NULL,
the Battery Changes Table will hold only dates where the battery value actually changed.
obviously, this design will save a lot of disk space but i encountered a problem:
how do i run a query on the tables that will return data as it was before the table split (i.e. for every record in the GPS data table, find a record with the same vehicle Guid and a the closest earlier row in the Battery Changes Table and use the battery from that row)
i know that in this specific case the disk space saved is small but i am talking about the concept. i am going to implement this idea over fields which will save much more data per row
Dani Avni
OfficeCore
October 25, 2010 at 9:04 am
Can you supply some sample data (both what you have and how you want it to look)? I think I understand what you're looking for, but I don't have time to go through your table setup. If we can figure out from a quick glance what you want, we can help you out a lot easier.
Thanks!
+--------------------------------------------------------------------------------------+
Check out my blog at https://pianorayk.wordpress.com/
October 27, 2010 at 12:43 am
for simplicity assume i have this data
IDDateXYData
11/1/2000 00:00:00121
21/1/2000 00:01:00341
31/1/2000 00:02:00561
41/1/2000 01:00:00562
51/1/2000 01:01:00782
61/1/2000 01:02:009102
as you can see the Data field only changes twice (00:00 and 01:00). so i want to create two tables:
the actual data
IDDateXY
11/1/2000 00:00:0012
21/1/2000 00:01:0034
31/1/2000 00:02:0056
41/1/2000 01:00:0056
51/1/2000 01:01:0078
61/1/2000 01:02:00910
And another table for the changes in the Data field
DateData
1/1/2000 00:00:001
1/1/2000 01:00:002
my question is assuming i have these two tables, how do i write a query that will return results like the original table (i.e. Data field in every row)
Dani Avni
OfficeCore
November 10, 2010 at 7:32 am
I think what you are looking for is an INNER JOIN,
Try this query:
Select Table1.VehicleGUID, Table2.TimeStamp As [Battry Changed On] From GPSDataTable
Inner Join BattryChangeTable Table2 ON Table2.VehicleGUID = Table1.VehicleGUID
Vishal Gajjar
http://SqlAndMe.com
December 6, 2010 at 10:05 pm
Its purely inner join with "case when" condition.
column should be fetched from primary table when there is no record available for that vehicleGUID in battery table, otherwise you just need to get it from the battery table.
do you need query for it?
December 17, 2010 at 8:43 am
yes please. a sample would be much appriciated.
thanks.
Dani Avni
OfficeCore
December 29, 2010 at 11:48 pm
declare @positons table(
vehicle_ID smallint,
date datetime,
x smallint,
y smallint)
insert into @positons
select 1, '1/1/2000 00:00:00', 1, 2 union all
select 1, '1/1/2000 00:01:00', 3, 4 union all
select 1, '1/1/2000 00:02:00', 5, 6 union all
select 1, '1/1/2000 01:00:00', 5, 6 union all
select 1, '1/1/2000 01:01:00', 7, 8 union all
select 1, '1/1/2000 01:02:00', 9, 10
declare @battery_change table(
vehicle_ID smallint,
date datetime,
change smallint)
insert into @battery_change
select 1,'1/1/2000 00:00:00', 1 union all
select 1,'1/1/2000 01:00:00', 0
select p.*,(select change from @battery_change where (vehicle_id = p.vehicle_id) and date =(select max(date) from @battery_change bc where bc.date <= p.date and bc.vehicle_id = p.vehicle_id)) battery_change from @positons p
December 30, 2010 at 3:43 am
Regarding the table size itself:
I noticed you're using FLOAT quite often (e.g. for [X], [Y], and [Speed]). Float will use 8 byte.
You could change the Speed column to smallint (2 byte). I'm not sure if X and Y are integer values or not and what the scale is. But it might be possible to adjust those data types as well.
Regarding your indexes: Assuming SS2K5 or above, you might benefit from moving some columns to the INCLUDE part of the index. This will not only reduce the required space but will also improve performance (if done correctly) since the index will be more narrow.
December 30, 2010 at 4:10 am
Hi Lutz,
thanks for the message. these tables where built by me about 6-7 years ago when i didn't know too much SQL. when the tables where small and held thousands of records the bytes wasted weren't too much a deal. now i have tens of millions of records and suddenly every byte counts. i am in the process of complete rewrite of this table looking to remove every byte possible.
for example the speed can be even reduced to 1 byte (after all cars do not travel at speeds > 255 Km/h) and even if they do i can still store it in 1 byte which will hold the speed divided by 2 (permitting speeds of up to 510 Km/h)
Dani Avni
Dani Avni
OfficeCore
December 30, 2010 at 4:22 am
danavni (12/30/2010)
Hi Lutz,thanks for the message. these tables where built by me about 6-7 years ago when i didn't know too much SQL. when the tables where small and held thousands of records the bytes wasted weren't too much a deal. now i have tens of millions of records and suddenly every byte counts. i am in the process of complete rewrite of this table looking to remove every byte possible.
for example the speed can be even reduced to 1 byte (after all cars do not travel at speeds > 255 Km/h) and even if they do i can still store it in 1 byte which will hold the speed divided by 2 (permitting speeds of up to 510 Km/h)
Dani Avni
I hope you'll never have to track data for a PORSCHE on the German Autobahn... :-D:-D
(Max. speed >300km/h on the road. Trust me. Personal experience.)
I'd strongly vote against storing data in a non-natural format just to save one byte. If you forget to re-calculate it a single sproc or view you'll display plain wrong data.
I'm sure there's noone around without the experience you're going through right now... But the number of people actually rewriting their own code is much smaller... Good luck!!
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply