Weather Station Tracking with SQL Server and Excel
Please illustrate several techniques for organizing and visualizing geographical data managed by SQL Server. I have weather station related data that would be helpful to visualize the location for each weather station on a map.
The National Weather Service (NWS) Cooperative Observer Program (COOP) is sometimes referred to as the nation's weather and climate observing network of, by and for the people. Weather observations are collected at thousands of weather stations across the United States and its territories. A prior tip (Analyze Historical Weather Data with SQL Server) introduced this data source. This tip illustrates several ways to visualize the geography of the weather stations with SQL Server. If you have data about geographically dispersed objects or events, such as weather stations, stores, or earthquakes, the demonstrations in this tip can help you visualize the geography for your data points as well as the numerical values associated with those points.
Metadata for COOP weather stations
The metadata for COOP weather stations can be obtained from a fixed-length text file (ccsl.txt) which can be downloaded from the NOAA Historical Observing Metadata Repository. NOAA is an abbreviation for the National Oceanic and Atmospheric Administration, which is the federal agency in which the National Weather Service resides. The layout of metadata for each station resides on a single row in ccsl.txt. Another link on the same web page enables the download of documentation for ccsl.txt contents in a file named CCSL_table.txt. Here is an image of the documentation file.
Some especially critical elements in the documentation include the following points.
- There are data for ten columns in the ccsl.txt file.
- The first column has the name STATEPROV that indicates a state or territory abbreviation in which a weather station is located.
- The last column has the name END DATE, which denotes the last date during which there was an observation at a station. An END DATE value of 99991231 denotes that a station is still being observed on a daily basis.
Here are two excerpts from the ccsl.txt file.
- The first excerpt is for the first five weather stations from the state
of Alaska (AK).
- The first station name is ADAK, and its COOP STATION ID is 500026; the last station name in the excerpt is AKUTAN, and its COOP STATION ID is 500144.
- None of the stations in the first excerpt are still in use because each row has an END DATE column value of other than 99991231, but there are other rows for Alaskan weather stations beyond those excerpted with an END DATE value of 99991231. This outcome reminds you that the source data are for weather stations that are currently in use as well as stations that are no longer in use.
- The second excerpt is for 16 weather stations at the end of the file.
- The first two rows are for locales in the state of Wyoming (WY).
- The last fourteen rows do not have a STATEPROV identifier value. These last fourteen rows are for locales that do not correspond to places for which the United States Postal Service (USPS) has a two-character state/province abbreviation.
- There is a list of the USPS abbreviations for states and territories at this web page.
- 8 of the rows are for weather stations that are still in use because their END DATE value is 99991231.
Both the first and second excerpts depict latitude and longitude values for weather stations in degrees (o), minutes ('), and seconds (") format. The seventh and eight column values denote, respectively, the latitude and longitude of a weather station.
- The latitude values for weather stations in Alaska and Wyoming are north of the equator; for this reason, the latitude values are positive.
- CANTON ISLAND and FUNAFUTI weather stations are south of the equator as indicated by their negative latitude values.
- All the negative longitude values are west of the prime meridian, which is the meridian passing through the Royal Observatory in Greenwich, England as well as the north and south poles. Weather stations with a positive longitude value are to the east of the prime meridian.
The following T-SQL script reads a subset of the contents from the ccsl.txt file into a SQL Server table.
- The code begins with a use statement to reference the noaa_data database as the default database. You can use any other database of your choice as the default database.
- The next code block creates a fresh version of the ImportedFileTable object. This table will be used to store the unparsed line values from the ccsl.txt file downloaded from the NOAA website.
- Then, a bulk insert statement reads the data from the ccsl.txt file in the noaa_data folder of the C drive. The ccsl.txt file in the noaa_data folder is manually re-named ccssl_data.txt to differentiate it from the downloaded COOP station text file.
- After populating the ImportedFileTable object, the code parses each line
and filters the resulting rows with a select…into statement.
- The select list items specify expressions for parsing the values on each row in the ImportedFileTable object into columns within the ccsl_still_in_use table within the dbo schema of the default database.
- The where clause filters the results set in the target table so that only rows with an END DATE column value of '99991231' are retained. Recall that this COOP code denotes that a weather station is still in operation. Other END DATE values denote the date in YYYYMMDD format for the last observation from a station.
use noaa_data go -- Collect info about ccsl weather stations for -- those still in use -- ccsl.txt file downloaded on May12,2020 -- Remove prior ImportedFileTable if EXISTS(SELECT * FROM sys.objects WHERE name = 'ImportedFileTable' and TYPE = 'u') DROP TABLE ImportedFileTable -- Create ImportedFileTable CREATE TABLE ImportedFileTable( textvalue varchar(max) ) GO -- Import text file for Comprehensive Cooperative Station List (ccsl) BULK INSERT ImportedFileTable FROM 'C:\noaa_data\ccsl_data.txt' begin try drop table dbo.ccsl_still_in_use end try begin catch print 'ccsl_still_in_use table not available to drop' end catch -- select all stations from 'C:\noaa_data\ccsl_data.txt' -- convert latitude and longitude from original string (LATITUDE and LONGITUDE) -- to string representation in dd.mmss (lat_dd.mmss and long_dd.mmss) -- station name starting with 'UTQIA' omitted -- because of lack of collation support for character in station name -- no explicit criterion to omit the station because it is no longer operational -- and the result set from this query is only for currently operating -- weather stations select left(textvalue,2) STATEPROV ,substring(textvalue,4,30) [STATION NAME] ,substring(textvalue,35,6) [COOP STATION ID] ,substring(textvalue,42,2) [CLIMATE DIVISION] ,substring(textvalue,45,30) [COUNTY NAME] ,substring(textvalue,76,8) [NCDC STATION ID] ,substring(textvalue,85,10) [LATITUDE] ,LEFT(substring(textvalue,85,10),3) + '.' + LEFT(substring(textvalue,89,3),2) + LEFT(substring(textvalue,92,3),2) [lat_dd.mmss] ,substring(textvalue,96,11) [LONGITUDE] ,LEFT(substring(textvalue,96,4),4) + '.' + LEFT(substring(textvalue,101,2),2) + LEFT(substring(textvalue,104,2),2) [long_dd.mmss] ,substring(textvalue,108,5) [ELEVATION] ,substring(textvalue,115,8) [END DATE] into dbo.ccsl_still_in_use from ImportedFileTable where substring(textvalue,115,8) = '99991231' -- station is still in operation
Once you have populated the ccsl_still_in_use table, you can query it to report data about the weather stations. Here is a simple script to display all the rows in the ccsl_still_in_use table. It orders its results set by stateprov in descending order and then by station name in ascending order.
-- sorted by stateprov, [station name], [end date], [coop station id] -- stations with no stateprov value are at the end of the list select * from dbo.ccsl_still_in_use order by stateprov desc, [station name]
Here are the first five rows from the results set from the preceding script. All five rows are for weather stations in Wyoming. The first two rows are for weather stations in Lincoln county. The next three rows are for weather stations in Teton, Fremont, and Carbon counties.
The next screen shot is for the last eight rows in the results set from the preceding script.
- The first two rows are for weather stations within Yakutat Borough county in the state of Alaska (AK).
- The next six rows are for weather stations that are not located in a locale with a USPS two-character state or territory abbreviation. You can, however, specify the location of these weather stations by the latitude and longitude of the station location.
Creating a heat map for the count of weather stations in a state
Now that the tip created a data source for weather stations including weather station identifiers and the state jurisdictions in which stations reside, we can create a heat map to represent by color shades the count of weather stations by state. There are three steps to accomplishing this task.
- Begin by counting the weather stations in each state from within SQL Server and manually copying the results set from SQL Server to a blank Excel worksheet.
- Select the copied data in the worksheet and choose to insert a filled map.
- Configure the map chart that results from the insert map command for display of the heat map results.
Here's a T-SQL script for counting the number of weather stations by state from the ccsl_still_in_use table created in the prior section.
- The code groups the table rows by STATEPROV field value with a group by clause.
- In its having clause, the script excludes rows that are not for a USPS state abbreviation.
- There are two select list items:
- STATEPROV is for the two-character state abbreviation.
- count(*) returns the count of weather stations for each STATEPROV value in the results set.
-- for Heat Map of COOP STATION Count by State select STATEPROV, count(*) [STATION COUNT] from [noaa_data].[dbo].[ccsl_still_in_use] group by STATEPROV having STATEPROV NOT IN('','AS','FM','GU','MH','MP','PR','PW','UM','VI') order by STATEPROV
Here is a table with the contents of results set from the query. There is one row for each of the fifty states. The states are represented by their two-character USPS state abbreviation. The left pane shows the first twenty-five rows, and the right pane shows the second twenty-five rows.
After selecting all the rows from the results set within SSMS, right-click in the Results tab and choose Copy with Header. Then, enter a fresh worksheet within an Excel workbook and paste the copied results set to columns A and B in the worksheet.
Next, select the copied data in columns A and B. Then, choose Insert, Maps, Filled Maps. These actions paste a basic version of the heat map on the center of the worksheet. See the Excel worksheet image below for the menu selections and the resulting heat map.
You can edit the map chart by changing its size, the shading color, and the background color. You can also assign a custom chart title. The following image shows an edited version of the basic chart with the cursor hovering over Rhode Island (RI). Hovering the mouse over a state causes a box for the state to pop up with the state name and the value for the count of weather stations in the state. As the hover box indicates, there are eight weather COOP stations in Rhode Island. This value derives from the contents in cell B40. The background shade for Rhode Island matches the shade for 8 in the heat legend to the right of the map.
Here is another example with the mouse cursor resting in the states of Texas (TX). The hover box indicates there are 535 weather stations in Texas. Both the shade and count of weather stations matches the maximum value on the heat scale legend to the right of the map.
Download the sample Heat Map for this tip.
Display from SQL Server of latitude and longitude coordinates for multiple points
The geography data type within SQL Server 2019 and some earlier versions of SQL Server supports the display of geographic points, such as weather station locations, by latitude and longitude decimal degree values. The latitude-longitude coordinates display from within a new spatial results tab. The spatial results tab can be selected in SSMS similarly to the results and messages tabs.
The lat_dd.mmss and long_dd.mmss values in the ccsl_still_in_use table are not decimal degree values. Instead, they are whole degree values followed by a period and a two-character field for minutes and then another two-character field for seconds. When you have source data for latitude and longitude values in degree-minute-second format, you must transform the degree-minute-second format to a decimal-degree format for use in SQL Server geographical data type functions. This requirement is common because many geographic data sources represent latitude/longitude values in degree-minute-second format as opposed to decimal degree format. The conversion transformation is as follows.
- The whole degrees value is represented as degrees in both formats.
- Designate the minutes value as the value in minutes divided by 60 to obtain a fraction of a degree. By convention, there are up to sixty minutes within one degree.
- Designate the seconds value as the value in seconds divided by 3600 to obtain a fraction of a degree. By convention, there are up to thirty-six hundred seconds in a degree.
- The minute and second fractional degree values are either added to or subtracted
from the degree value for a latitude or longitude based on the hemisphere for
a geography point.
- For any positive latitude value or longitude value, the fractional degrees for transformed minute and second values are added to the degree value for a point.
- For any negative latitude value or longitude value, the fractional degrees for transformed minute and second values are subtracted from the degree value for a point.
The following script shows the application of these guidelines for transforming the lat_dd.mmss and long_dd.mmss values for the 8824 weather stations in the ccsl_still_in_use table. The decimal degree version of the weather station latitude and longitude values are saved in the lat_long_dec_degree_by_station_id table within the dbo schema of the default database, which is in the noaa_data database in this tip.
- The data source for the script appears in the from clause.
- The destination (lat_long_dec_degree_by_station_id table) for the result set from the script is specified with the help of the into clause for the select statement.
- The select list has five items.
- COOP STATION ID value
- Lat_dd.mmss and long_dd.mmss values which come directly from the COOP program
- Lat_dec_degree and long_dec_degree values which are the transformed decimal degree values which are computed by case statements.
The case statements for calculating lat_dec_degree and long_dec_degree values have a parallel structure.
- Each case statement has two when clauses.
- The first when clause is for instances where the original degree value is greater than or equal to zero.
- The second when clause is for instances where the original degree value is negative.
- When the original degree value is greater than or equal to zero, then the transformed minute and second values are added to the original whole degree value to obtain the transformed value.
- When the original degree value is less than zero, then the transformed minute and second values are subtracted from the original whole degree value to obtain the transformed value.
-- compute latitude and longitude decimal degree values -- from station locales in degrees, minutes and seconds -- Remove prior lat_long_dec_degree_by_station_id table -- if a prior version already exists if EXISTS(SELECT * FROM sys.objects WHERE name = 'lat_long_dec_degree_by_station_id' and TYPE = 'u') DROP TABLE dbo.lat_long_dec_degree_by_station_id go -- do transformation calculations and -- store them by COOP STATION ID in the -- lat_long_dec_degree_by_station_id table select [COOP STATION ID] ,[lat_dd.mmss] ,[long_dd.mmss] , case when cast([lat_dd.mmss] as float) >= 0 then cast(substring([lat_dd.mmss],1,3) as float) + cast(substring([lat_dd.mmss],5,2) as float)/60 + cast(substring([lat_dd.mmss],7,2) as float)/3600 when cast([lat_dd.mmss] as float) < 0 then cast(substring([lat_dd.mmss],1,3) as float) - cast(substring([lat_dd.mmss],5,2) as float)/60 - cast(substring([lat_dd.mmss],7,2) as float)/3600 end lat_dec_degree , case when cast([long_dd.mmss] as float) >= 0 then cast(substring([long_dd.mmss],1,4) as float) + cast(substring([long_dd.mmss],6,2) as float)/60 + cast(substring([long_dd.mmss],8,2) as float)/3600 when cast([long_dd.mmss] as float) < 0 then cast(substring([long_dd.mmss],1,4) as float) - cast(substring([long_dd.mmss],6,2) as float)/60 - cast(substring([long_dd.mmss],8,2) as float)/3600 end long_dec_degree into dbo.lat_long_dec_degree_by_station_id from dbo.ccsl_still_in_use order by case when cast([lat_dd.mmss] as float) >= 0 then cast(substring([lat_dd.mmss],1,3) as float) + cast(substring([lat_dd.mmss],5,2) as float)/60 + cast(substring([lat_dd.mmss],7,2) as float)/3600 when cast([lat_dd.mmss] as float) < 0 then cast(substring([lat_dd.mmss],1,3) as float) - cast(substring([lat_dd.mmss],5,2) as float)/60 - cast(substring([lat_dd.mmss],7,2) as float)/3600 end
The next script shows the use of the decimal degree values from the lat_long_dec_degree_by_station_id table to plot weather station latitude and longitude values from the ccsl_still_in_use table. In particular, the code plots three weather stations located at Washington Reagan National airport, Washington Dulles International airport, and Baltimore-Washington International airport. All three of these airports are relatively proximate to Washington, DC.
- The from clause implements an inner join between the ccsl_still_in_use table and the lat_long_dec_degree_by_station_id table based on the COOP STATION ID value from both tables.
- The where clause restricts the output to three station id values – one for Washington Dulles International airport, another for Washington Reagan National airport, and the last station id value is for Baltimore-Washington International airport.
- All except one select list item are derived from the source tables.
- STATEPROV and COUNTY NAME identify jurisdictional locales for each of the three weather stations
- STATION NAME, and COOP STATION ID values reveal identifying fields for the weather stations in the COOP program
- Lat_dd.mmss and long_dd.mmss specify the latitude and longitude, respectively, of the weather stations in degrees, minutes, and seconds format
- Lat_dec_degree and long_dec_degree designate the latitude and longitude, respectively, of the weather stations in decimal degree format
- The remaining select list item has the label weather_station_locale. This item uses the Point geography data type. Notice the syntax for the Point geography data type relies on the lat_dec_degree and long_dec_degree field values. This data type specifies the location of the weather station to SQL Server. The constant value of 4326 trailing the long_dec_degree value denotes the most common spatial reference system identifier for referencing data across the entire world (or any portion of it).
-- plot the location of three airport weather stations -- via latitude/longitude values mapped to geographic points select ccsl_still_in_use.[STATEPROV] ,ccsl_still_in_use.[COUNTY NAME] ,ccsl_still_in_use.[STATION NAME] ,ccsl_still_in_use.[COOP STATION ID] ,ccsl_still_in_use.[lat_dd.mmss] ,ccsl_still_in_use.[long_dd.mmss] ,lat_long_dec_degree_by_station_id.lat_dec_degree ,lat_long_dec_degree_by_station_id.long_dec_degree ,geography::Point(lat_dec_degree, [long_dec_degree], 4326) weather_station_locale from [dbo].[ccsl_still_in_use] inner join dbo.lat_long_dec_degree_by_station_id on ccsl_still_in_use.[COOP STATION ID] = lat_long_dec_degree_by_station_id.[COOP STATION ID] where ccsl_still_in_use.[COOP STATION ID] in( 448903, -- for WASHINGTON DULLES INTL AP 448906, -- for WASHINGTON REAGAN NATL AP 180465 -- for BALTIMORE-WASHINGTON INTL AP )
Here is the results set from the preceding script. The results set consists of three rows – one for the weather station at each of the three airports. Notice that the spatial results tab appears between the results tab and the messages tab. The spatial results tab is only present when a results set contains a column of SQL Server point data type values, such as the values in the weather_station_locale column.
The spatial results tab presents a rectangular grid with three points – one point for each row in the preceding results set. The points plot the weather station locations in terms of latitude and longitude decimal values. The grid designates latitude values with horizontal lines. The grid also designates longitude values with vertical lines. Resting the cursor next to a point shows its matching column values, except for weather_station_locale, from the preceding screen shot. The following screen shots shows the hover box for each of the three points.
- The first screen shot identifies the point for Baltimore-Washington International airport.
- The second screen shot identifies the point for Washington Reagan National airport.
- The third screen shot identifies the point for Washington Dulles International airport.
A plot like the one displayed in the spatial results tab makes it easy to designate the positions of points relative to one another. For example, the Baltimore-Washington International airport is clearly north and east of the other two airports. On the other hand, the Washington Reagan National airport is south of the other two airports. Finally, the Washington Dulles International airport is west of the other two airports.
Calculating the distance in miles between weather stations and the White House
SQL Server 2019 offers the STDistance method for computing the LineString difference between two points. The general syntax for the method is
When used with the 4326 spatial reference identifier for specifying points, this method returns the meters between two points. You can convert the meters values to miles value by multiplying the returned meters value by 0.00062137116.
Based on the above guidelines, the following script returns the difference in miles between the White House in Washington, DC and the COOP weather stations on the grounds of the three airports that are near Washington, DC (Reagan, Dulles, and Baltimore-Washington).
- The script follows a declare with a set statement to designate the decimal degree latitude and longitude values for the White House in a local variable named @WhiteHouse.
- Next, a select statement with a nested query computes the distance between
the White House and each of the three airports.
- An inner query computes the geography point for each of the three airports.
- The outer query returns the STATION NAME for each airport weather station along with the distance between the weather station at the airport and the White House.
- Recall that the returned distance from the STDistance method is in meters; multiplying the return value by 0.00062137116 converts the distance from meters to miles.
-- calculating the distance (in miles) between weather stations -- at airports relatively near the White House -- decimal degree latitude and longitude used to -- White House locale declare @WhiteHouse geography; set @WhiteHouse = geography::Point( 38.697657, -77.036560, 4326) select [STATION NAME] ,@WhiteHouse.STDistance(weather_station_locale) * 0.00062137116 [miles to White House] from ( select ccsl_still_in_use.[STATION NAME] ,geography::Point(lat_dec_degree,long_dec_degree,4326) weather_station_locale from [noaa_data].[dbo].[ccsl_still_in_use] inner join dbo.lat_long_dec_degree_by_station_id on ccsl_still_in_use.[COOP STATION ID] = lat_long_dec_degree_by_station_id.[COOP STATION ID] where ccsl_still_in_use.[COOP STATION ID] in( 448903, 448906, 180465 ) ) [COOP STATION IDs relatively near the White House] order by [STATION NAME]
The following screen shot shows the results set from the preceding query. As you can see, the weather station at the Washington Reagan National airport is the closest airport to the White House, and the weather station at the Baltimore-Washington International airport is the farthest of the three airports from the White House. The weather station at the Washington Dulles International airport has a distance from the White House that is in between the nearest and farthest weather station.
I looked up on the internet the miles between each airport and the White House. I found some different values than those reported below. I surmise that a big part of the difference is that the COOP weather station within the grounds of each airport was not in the midst of passenger loading and disembarking locations. In any event, the miles from the weather stations to the White House were in the same order by airport whether I derived the distances from the COOP weather station latitude and longitude along with the STDistance method or an internet lookup approach for the distance between the airports and White House.
The data for this tip is based on source files from the NOAA that maintains a database of historical weather stations used for observing climate data, such as temperature, rain fall, and snow fall. The geography fields for the weather stations include state and county jurisdiction identifiers as well as latitude and longitude values. The tip includes three examples of geographical analysis for data in SQL Server, including a heat map, plotting points in the spatial results tab of SQL Server, and computing the distance between points in either meters or miles.
- If you have just a general interest in analyzing geographical data within SQL Server, you can simply bookmark this tip for future reference when you need to apply one or more of the three analysis types demonstrated here.
- If you currently have geographic data either at the jurisdictional level of state, county, or zip code or for the latitudes and longitudes of a set of points, then you can apply any of the three analysis frameworks demonstrated in this tip for your own data.
- If you have more advanced geographical analysis techniques beyond those illustrated in this tip, you might check the SQL Server documentation for the geography data type. The data type has a large array of methods beyond those illustrated in this tip.
About the author
View all my tips