Data storage and retrieval for geospatial data in SQL Server 2008


By:   |   Updated: 2010-01-22   |   Comments (2)   |   Related: More > Spatial Data Storage

Problem

The company I work for has migrated to SQL Server 2008 and we're ready to start planning for the use of geospatial data. One of the applications we want to use geospatial data in will assist order fulfillers in efficiently picking stock for orders. Our warehouse is fairly large, which geospatial data type should we use?

Solution

In a previous tip, we gave a high-level overview of the new geospatial data types in SQL Server 2008. Now it is time to talk about data storage for geospatial data and when to use which type.

It's not uncommon for geospatial datasets to grow to millions of records, particularly when housing data for points over a large area (like restaurants in the United States). In both the geometry and geography data types the field is equivalent to VARBINARY(MAX). Remember that the geometry data type is best used with a Euclidian or a flat coordinate system to house point, line, linestring and other object dimensions.

We will use the AdventureWorks database and assign product locations housed in the Production.Product table. Although the warehouse is very large in size the Earth's curvature has nil effect on the distances between products, so the geometry data type should be adequate. If we were to use the geography data type we would have to select a Spatial Reference Identifier (SRID); with the geometry data type, an SRID is not necessary (you can specify an SRID of 0 if so desired). When specifying a point value it looks like 22 characters are used for storage, so keep that in mind when planning the database size.

Our first step is to create a table called ProductLocation to house the point data.

CREATE TABLE Production.ProductLocation 
( 
     ProductID INT 
          CONSTRAINT PK_ProductLocation_ProductID PRIMARY KEY 
           CONSTRAINT FK_ProductLocation_ProductID_Product_ProductID 
           FOREIGN KEY REFERENCES Production.Product(ProductID) 
     , ProductLocation GEOMETRY NOT NULL 
     , CreatedDt DATETIME NOT NULL 
           CONSTRAINT dft_ProductLocation_CreatedDtNow DEFAULT(GETDATE()) 
     , CreatedBy VARCHAR(50) NOT NULL 
     , ModifiedDt DATETIME 
     , ModifiedBy VARCHAR(50) 
) 

The table's primary key is the ProductID, since there can only be one location for each product (in another scenario a product could have both a primary location and a location where surplus product could be housed). In addition, there is a foreign key relationship between the ProductID in ProductLocation and ProductID in the Product table. The cascading action on the foreign key is DELETE, since there would be no need to keep location information on a product that is no longer stocked.

For the example we will populate the ProductLocation table with locations for some of the flat washers and bike frames:

INSERT INTO Production.ProductLocation(ProductID,ProductLocation,CreatedBy,ModifiedBy) 
VALUES(342,geometry::STPointFromText('POINT(0 10)',0),'timcullen','timcullen') 
INSERT INTO Production.ProductLocation(ProductID,ProductLocation,CreatedBy,ModifiedBy) 
VALUES(343,geometry::STPointFromText('POINT(0 12)',0),'timcullen','timcullen') 
INSERT INTO Production.ProductLocation(ProductID,ProductLocation,CreatedBy,ModifiedBy) 
VALUES(344,geometry::STPointFromText('POINT(0 14)',0),'timcullen','timcullen') 
INSERT INTO Production.ProductLocation(ProductID,ProductLocation,CreatedBy,ModifiedBy) 
VALUES(345,geometry::STPointFromText('POINT(0 16)',0),'timcullen','timcullen') 
INSERT INTO Production.ProductLocation(ProductID,ProductLocation,CreatedBy,ModifiedBy) 
VALUES(346,geometry::STPointFromText('POINT(0 18)',0),'timcullen','timcullen') 
INSERT INTO Production.ProductLocation(ProductID,ProductLocation,CreatedBy,ModifiedBy) 
VALUES(347,geometry::STPointFromText('POINT(0 20)',0),'timcullen','timcullen') 
INSERT INTO Production.ProductLocation(ProductID,ProductLocation,CreatedBy,ModifiedBy) 
VALUES(348,geometry::STPointFromText('POINT(0 22)',0),'timcullen','timcullen') 
INSERT INTO Production.ProductLocation(ProductID,ProductLocation,CreatedBy,ModifiedBy) 
VALUES(349,geometry::STPointFromText('POINT(0 24)',0),'timcullen','timcullen') 
INSERT INTO Production.ProductLocation(ProductID,ProductLocation,CreatedBy,ModifiedBy) 
VALUES(748,geometry::STPointFromText('POINT(29 31)',0),'timcullen','timcullen') 
INSERT INTO Production.ProductLocation(ProductID,ProductLocation,CreatedBy,ModifiedBy) 
VALUES(739,geometry::STPointFromText('POINT(29 41)',0),'timcullen','timcullen') 
INSERT INTO Production.ProductLocation(ProductID,ProductLocation,CreatedBy,ModifiedBy) 
VALUES(740,geometry::STPointFromText('POINT(29 51)',0),'timcullen','timcullen') 
INSERT INTO Production.ProductLocation(ProductID,ProductLocation,CreatedBy,ModifiedBy) 
VALUES(742,geometry::STPointFromText('POINT(29 61)',0),'timcullen','timcullen') 
INSERT INTO Production.ProductLocation(ProductID,ProductLocation,CreatedBy,ModifiedBy) 
VALUES(741,geometry::STPointFromText('POINT(29 71)',0),'timcullen','timcullen') 

Now that we have populated the ProductLocation table we can evaluate the structure of the table. When we perform a simple select statement against the table we see that the product location is stored as a binary value, known as the Well-Known Binary Representation:

Records in the ProductLocation table

The binary representation is created by using a .NET Common Language Runtime data type. One other thing to note is that when a query is performed on a table with a geospatial data type an additional tab appears in the results area called Spatial Results. The data returned is represented on a grid:

Records shown in the Spatial Results tab

Since the geometry data in this table are point values, we can use the STX and STY functions to extract each coordinate. One interesting note about using these functions is that since they use the .NET CLR they are case-sensitive. If I execute the query with one of the characters in lower-case I receive the following message:

Error when executing function

When I change the case of the character I receive the intended results:

Results of the query to obtain coordinates

One other thing to note is that if the geometry column, in our case ProductLocation, is not queried the Spatial Results tab will not appear.

Next Steps


Last Updated: 2010-01-22


get scripts

next tip button



About the author
MSSQLTips author Tim Cullen Tim Cullen has been working in the IT industry since 2003 and currently works as a SQL Server Reports Developer.

View all my tips





Comments For This Article




Friday, January 22, 2010 - 12:26:02 PM - timothyrcullen Back To Top (4762)

Nice catch!!!  Somehow I missed that during the proof-read.  Thanks for bringing it to my attention.


Friday, January 22, 2010 - 7:45:24 AM - mharr Back To Top (4759)
A nice introductory article on spatial support on SQL Server 2008. But I want to correct a statement in the early part of the article: "Remember that the geometry data type houses point data only,...". That is not correct; geometry datatype can also hold curves (LineString), surfaces (Polygon), and MultiSurface and MultiCurve collections. That also makes it very effective for holding representations of regions (like counties or state boundaries), where spatial functions can be used to organize set data.


download


Recommended Reading

Calculate the Geographical Distance Between Two Cities in SQL Server

SQL Server Geography Data Type

Calculate Geographical Coding and Time Zone for an Address in SQL Server with PowerShell, T-SQL and Google Maps

SQL Server 2008 Geography and Geometry Data Types

SQL Server Stored Procedure to Calculate Area of a Polygon and Draw its Shape





get free sql tips
agree to terms


Learn more about SQL Server tools