September 22, 2014 at 11:37 am
Does anyone have a function that determines the geometric function from a series of spatial points (geometric or geographic.)?
If not I am ready to build one ... I found a clean C++ implementation that I can translate to T-SQL.
Russ
September 22, 2014 at 1:07 pm
Haven't done Weiszfeld's algorithm in TSQL, would love giving it a go as it is a good example of iterative approach which potentially could be converted into a set based solution. Could you possibly share the C++ code?
😎
September 22, 2014 at 1:57 pm
I've had a quick look at this and I suspect while it could be done in T-SQL (painfully and slowly :-D), a CLR function built around your C++ would probably be a better option.
Here's a link to a python version Eirikur
September 22, 2014 at 2:25 pm
mickyT (9/22/2014)
I've had a quick look at this and I suspect while it could be done in T-SQL (painfully and slowly :-D), a CLR function built around your C++ would probably be a better option.Here's a link to a python version Eirikur
Thanks!
😎
September 22, 2014 at 4:42 pm
Looking at it further, it may not be so bad in t-sql.
Here's my crack at it. It currently uses a while loop because I'm having a few issues turning it into a recursive CTE. Might be able to sort something out later.
I also included a formula for a Geometric Mean.
-- Setup some sample data
--------------------------
SELECT * INTO sample
FROM (VALUES
(Geometry::Point(54,37,0))
,(Geometry::Point(45,75,0))
,(Geometry::Point(34,45,0))
,(Geometry::Point(87,124,0))
,(Geometry::Point(15,68,0))
,(Geometry::Point(32,84,0))
,(Geometry::Point(87,48,0))
,(Geometry::Point(01,78,0))
,(Geometry::Point(78,35,0))
,(Geometry::Point(125,78,0))
,(Geometry::Point(57,14,0))
) p(point);
-- Do that actual work
----------------------
DECLARE @tempX float = 0.0;
DECLARE @tempY float = 0.0;
DECLARE @iterations int = 50;
DECLARE @denomsum float;
SELECT @tempX = EXP(SUM(LOG(point.STX))/COUNT(*)),
@tempY = EXP(SUM(LOG(point.STY))/COUNT(*))
FROM sample;
WHILE @iterations > 0
BEGIN
SET @iterations -= 1;
SELECT @denomsum = SUM(1 / SQRT(POWER(@tempX - point.STX,2) + POWER(@tempY - point.STY,2)))
FROM sample;
SELECT @tempX = SUM((point.STX * (1 / SQRT(POWER(@tempX - point.STX,2) + POWER(@tempY - point.STY,2)))) / @denomsum),
@tempy = SUM((point.STY * (1 / SQRT(POWER(@tempX - point.STX,2) + POWER(@tempY - point.STY,2)))) / @denomsum)
FROM sample;
END;
-- Show the results
-------------------
SELECT point.STBuffer(2) from sample
UNION ALL
SELECT Geometry::Point(EXP(SUM(LOG(point.STX))/COUNT(*)), EXP(SUM(LOG(point.STY))/COUNT(*)), 0).STBuffer(4) -- GeometricMean
FROM sample
UNION ALL
SELECT Geometry::Point(@tempx, @tempy,0).STBuffer(6) -- GeometricMedian
;
September 22, 2014 at 4:47 pm
I pulled it off of another forum (what is the etiquette relating to that?). I'm better off converting it to T-SQL for my purpose as it will be used to populate a report in a third party app.
// header files for IO functions and math
#include <cstdio>
#include <cmath>
// the maximul value n can take
const int maxn = 50001;
// given a point (x, y) on a grid, we can find its left/right/up/down neighbors
// by using these constants: (x + dx[0], y + dy[0]) = upper neighbor etc.
const int dx[] = {-1, 0, 1, 0};
const int dy[] = {0, 1, 0, -1};
// controls the precision - this should give you an answer accurate to 3 decimals
const double eps = 0.001;
// input and output files
FILE *in = fopen("adapost2.in","r"), *out = fopen("adapost2.out","w");
// stores a point in 2d space
struct punct
{
double x, y;
};
// how many points are in the input file
int n;
// stores the points in the input file
punct a[maxn];
// stores the answer to the question
double x, y;
// finds the sum of (euclidean) distances from each input point to (x, y)
double dist(double x, double y)
{
double ret = 0;
for ( int i = 1; i <= n; ++i )
{
double dx = a.x - x;
double dy = a.y - y;
ret += sqrt(dx*dx + dy*dy); // classical distance formula
}
return ret;
}
// reads the input
void read()
{
fscanf(in, "%d", &n); // read n from the first
// read n points next, one on each line
for ( int i = 1; i <= n; ++i )
fscanf(in, "%lf %lf", &a.x, &a.y), // reads a point
x += a.x,
y += a.y; // we add the x and y at first, because we will start by approximating the answer as the center of gravity
// divide by the number of points (n) to get the center of gravity
x /= n;
y /= n;
}
// implements the solving algorithm
void go()
{
// start by finding the sum of distances to the center of gravity
double d = dist(x, y);
// our step value, chosen by experimentation
double step = 100.0;
// done is used to keep track of updates: if none of the neighbors of the current
// point that are *step* steps away improve the solution, then *step* is too big
// and we need to look closer to the current point, so we must half *step*.
int done = 0;
// while we still need a more precise answer
while ( step > eps )
{
done = 0;
for ( int i = 0; i < 4; ++i )
{
// check the neighbors in all 4 directions.
double nx = (double)x + step*dx;
double ny = (double)y + step*dy;
// find the sum of distances to each neighbor
double t = dist(nx, ny);
// if a neighbor offers a better sum of distances
if ( t < d )
{
update the current minimum
d = t;
x = nx;
y = ny;
// an improvement has been made, so
// don't half step in the next iteration, because we might need
// to jump the same amount again
done = 1;
break;
}
}
// half the step size, because no update has been made, so we might have
// jumped too much, and now we need to head back some.
if ( !done )
step /= 2;
}
}
int main()
{
read();
go();
// print the answer with 4 decimal points
fprintf(out, "%.4lf %.4lf", x, y);
return 0;
}
Eirikur Eiriksson (9/22/2014)
Haven't done Weiszfeld's algorithm in TSQL, would love giving it a go as it is a good example of iterative approach which potentially could be converted into a set based solution. Could you possibly share the C++ code?😎
September 22, 2014 at 4:53 pm
Nice, I'm going to check it out tomorrow morning. I was using Geometric Mean and comparing it to Absolute distance from a separate point (to measure efficiency). If I run this against my data (which will be around 5,000 samples) to make sure the geographic median always has closer sums of distances than the mean centroids and my facility location.
Russ
mickyT (9/22/2014)
Looking at it further, it may not be so bad in t-sql.Here's my crack at it. It currently uses a while loop because I'm having a few issues turning it into a recursive CTE. Might be able to sort something out later.
I also included a formula for a Geometric Mean.
September 22, 2014 at 6:06 pm
While I'm on a bit of a roll. Had a bit of success with a rCTE, but it does require a iTVF to be created.
Removed: Incorrect results
September 24, 2014 at 11:20 am
Thanks a bunch mickyT!!! I've added a user-defined table type to make this re-usable from my GIS database
CREATE TYPE udttPDXGeom AS TABLE (geompt GEOMETRY)
CREATE TYPE udttPDXGeog AS TABLE (geogpt GEOGRAPHY)
And added a function that does your math (allowing for iterations to be a parameter so it is flexible upon usage):
CREATE FUNCTION udfPDXGeomMedian (@geomPts udttPDXGeom READONLY, @iIterations NUMERIC)
RETURNS GEOMETRY AS
BEGIN
DECLARE @tempXFLOAT = 0.0;
DECLARE @tempYFLOAT = 0.0;
DECLARE @denomsumFLOAT;
SELECT@tempx = EXP(SUM(LOG(geompt.STX))/COUNT(*)),
@tempy = EXP(SUM(LOG(geompt.STY))/COUNT(*)) FROM @geomPts
WHILE @iIterations > 0
BEGIN
SET @iIterations -= 1;
SELECT @denomsum = SUM(1 / SQRT(POWER(@tempX - geompt.STX,2) + POWER(@tempY - geompt.STY,2)))
FROM @geomPts
SELECT @tempX = SUM((geompt.STX * (1 / SQRT(POWER(@tempX - geompt.STX,2) + POWER(@tempY - geompt.STY,2)))) / @denomsum),
@tempy = SUM((geompt.STY * (1 / SQRT(POWER(@tempX - geompt.STX,2) + POWER(@tempY - geompt.STY,2)))) / @denomsum)
FROM @geomPts;
END
RETURN GEOMETRY::Point(@tempx, @tempy,0)
END
So the usage is down to this for me ...
DECLARE @sample AS udttPDXGeom
INSERT INTO @sample
VALUES
(Geometry::Point(54,37,0))
,(Geometry::Point(45,75,0))
,(Geometry::Point(34,45,0))
,(Geometry::Point(87,124,0))
,(Geometry::Point(15,68,0))
,(Geometry::Point(32,84,0))
,(Geometry::Point(87,48,0))
,(Geometry::Point(01,78,0))
,(Geometry::Point(78,35,0))
,(Geometry::Point(125,78,0))
,(Geometry::Point(57,14,0))
SELECT geompt.STBuffer(2) FROM @sample UNION ALL
SELECT dbo.udfPDXGeomMedian(@sample, 50).STBuffer(6)
Do you think this would translate will to GEOGRAPHY data types? I have all of my locations in both the spatial "4326" SRID as well as the "2601" Stateplane planar coordinates ... but the google maps API only supports Lat/Long so I would have to translate in my code which is expensive.
Russ
September 24, 2014 at 11:39 am
Quick question, how accurate does this have to be? The reason for asking is that one can do a good approximation using the spatial data type methods and functions, happy to demonstrate if that's good enough.
I have looked at the python and the C++ samples, think mikyT has pretty much nailed that approach, would like to prove otherwise if I have time:-D (good job mikyT)
😎
September 24, 2014 at 11:42 am
Well that didn't go so well ....
CREATE FUNCTION udfPDXGeogMedian (@geogPts udttPDXGeog READONLY, @iIterations NUMERIC)
RETURNS GEOGRAPHY AS
BEGIN
DECLARE @tempXFLOAT = 0.0;
DECLARE @tempYFLOAT = 0.0;
DECLARE @denomsumFLOAT;
SELECT@tempx = EXP(SUM(LOG(geogpt.Long))/COUNT(*)),
@tempy = EXP(SUM(LOG(geogpt.Lat))/COUNT(*)) FROM @geogPts
WHILE @iIterations > 0
BEGIN
SET @iIterations -= 1;
SELECT @denomsum = SUM(1 / SQRT(POWER(@tempX - geogpt.Long,2) + POWER(@tempY - geogpt.Lat,2)))
FROM @geogPts
SELECT @tempX = SUM((geogpt.Lat * (1 / SQRT(POWER(@tempX - geogpt.Long,2) + POWER(@tempY - geogpt.Lat,2)))) / @denomsum),
@tempy = SUM((geogpt.Lat * (1 / SQRT(POWER(@tempX - geogpt.Long,2) + POWER(@tempY - geogpt.Lat,2)))) / @denomsum)
FROM @geogPts;
END
RETURN GEOGRAPHY::Point(@tempx, @tempy,0)
END
Sample data:
DECLARE @sampleGeog AS udttPDXGeog
INSERT INTO @sampleGeog
VALUES
(GEOGRAPHY::Point(44.478311,-122.62196,4326))
,(GEOGRAPHY::Point(44.4600192,-122.6435081,4326))
,(GEOGRAPHY::Point(44.5137321,-122.6458555,4326))
,(GEOGRAPHY::Point(44.505355,-122.58344,4326))
,(GEOGRAPHY::Point(44.518456,-122.6453,4326))
,(GEOGRAPHY::Point(44.4941156,-122.5793172,4326))
SELECT geogpt.STBuffer(100) FROM @sampleGeog
UNION ALL SELECT dbo.udfPDXGeogMedian(@sampleGeog, 500).STBuffer(300)
Yields:
(6 row(s) affected)
Msg 3623, Level 16, State 1, Line 13
An invalid floating point operation occurred.
September 24, 2014 at 11:46 am
The more accurate it is, the more money I will save :-). So for an area of 30 miles by 30 miles, I would like to get the Geographic Median within maybe a tenth of a mile. I'm dealing with ten of thousands of trips a year so every bit helps.
Russ
September 24, 2014 at 12:11 pm
busraker (9/24/2014)
I'm dealing with ten of thousands of trips a year so every bit helps.Russ
Sounds a little like travelling salesman here:-)
This isn't too high of a volume, cannot say much but these are kind of hourly type volumes where I'm coming from. I'll piece together an example and post it shortly when I have time.
😎
September 24, 2014 at 12:55 pm
Eirikur Eiriksson (9/24/2014)
Sounds a little like travelling salesman here:-)
This isn't too high of a volume, cannot say much but these are kind of hourly type volumes where I'm coming from. I'll piece together an example and post it shortly when I have time.
😎
The only miles I log are on the CPU bus 🙂
I'm thinking I will simply cast the log and lat as Geometries instead of Geographies and convert back to geography after getting the geometric median. It will be close enough I think, and may better then the Geometric Mean
September 24, 2014 at 1:04 pm
busraker (9/24/2014)
Well that didn't go so well ....
...
SELECT@tempx = EXP(SUM(LOG(geogpt.Long))/COUNT(*)),
@tempy = EXP(SUM(LOG(geogpt.Lat))/COUNT(*)) FROM @geogPts
...
END
Yields:
(6 row(s) affected)
Msg 3623, Level 16, State 1, Line 13
An invalid floating point operation occurred.
The floating point error will be due to the LOG getting a negative I think.
I suspect the geography version will not be accurate enough for you. It would depend on the north south spread of your data.
If you aren't adverse to using CLRs, there is a few libraries out there that can help with reprojections.
http://sqlspatialtools.codeplex.com/wikipage?title=Current%20Contents&referringTitle=Home
Viewing 15 posts - 1 through 15 (of 24 total)
You must be logged in to reply to this topic. Login to reply