# SQL Server Data Aggregation for Data with Different Sampling Rates

By:   |   Updated: 2014-12-04   |   Comments (3)   |   Related: More > Dates

##### Problem

In the emerging "Internet of Things", there are multitudes of devices collecting data at differing sampling rates. Integrating this data so the data has a common granularity in time is important to not only allow for accurate analysis and mining, but it will also aid in reducing the amount of data to be stored and processed.

##### Solution

In this tip, we will demonstrate how to use the T-SQL AVG function and GROUP BY clause to transform data collected from two devices sampling at 100Hz and 40Hz to one row per second. We will use T-SQL to generate simulated data for this example. Let's begin by creating two tables, one for Device A's data and another for Device B's data.

```create table dbo.tblDeviceA
(
pkDeviceA integer identity(1,1) primary key,
recordingTime datetime2,
xAxis decimal(5,3),
yAxis decimal(5,3),
zAxis decimal(5,3)
)

create table dbo.tblDeviceB
(
pkDeviceB integer identity(1,1) primary key,
recordingTime datetime2,
temperature decimal(5,2),
humidity decimal(5,2)
)
```

Next, we will populate the Device A table with our simulated data from a device sampling at 100Hz. The DATEADD function is incrementing the recording time by 10 milliseconds.

```declare @i as integer
declare @dateTime as datetime2
set @i=0
set @dateTime='2014-11-15 12:00:00.000';
while @i<1000
begin
insert into dbo.tblDeviceA values
set @[email protected]+1
end
go
```

After inserting the rows, we will select the top 10 rows from the table to verify. Now we will populate the Device B table with our simulated data from a device sampling at 40Hz. The DATEADD function is incrementing the recording time by 25 milliseconds.

```declare @i as integer
declare @dateTime as datetime2
set @i=0
set @dateTime='2014-11-15 12:00:00.000';
while @i<400
begin
insert into dbo.tblDeviceB values
set @[email protected]+1
end
go
```

After inserting the rows, we will select the top 10 rows from the table to verify. In these next steps we will roll up the data for each device to one row of data per second. An accepted way of doing this is to average the recorded data values within each second. In the T-SQL below, we are using the CONVERT function to remove the milliseconds from the date/time value. The style value of 120 returns the date in the yyyy-mm-dd hh:mi:ss(24h) format. The same CONVERT function is used in the GROUP BY and ORDER BY clauses. The AVG function is applied to each of the data columns to get an average value per second, and the CAST function is used to limit the returned values to three decimal places.

```select CONVERT(DATETIME, CONVERT(VARCHAR(20), recordingTime, 120)) as recordingTime,
cast(avg(xAxis) as decimal(5,3)) as averageXAxis,
cast(avg(yAxis) as decimal(5,3)) as averageYAxis,
cast(avg(zAxis) as decimal(5,3)) as averageZAxis
from dbo.tblDeviceA
group by CONVERT(DATETIME, CONVERT(VARCHAR(20), recordingTime, 120))
order by CONVERT(DATETIME, CONVERT(VARCHAR(20), recordingTime, 120))
```

The results from this query for Device A are shown below. We have reduced the data from 100 rows per second to 1 row per second. For Device B's data, we apply the same CONVERT function. Notice how the query does not need the sampling rate.

```select CONVERT(DATETIME, CONVERT(VARCHAR(20), recordingTime, 120)) as recordingTime,
cast(avg(temperature) as decimal(5,2)) as averageTemperature,
cast(avg(humidity) as decimal(5,2)) as averageHumidity
from dbo.tblDeviceB
group by CONVERT(DATETIME, CONVERT(VARCHAR(20), recordingTime, 120))
order by CONVERT(DATETIME, CONVERT(VARCHAR(20), recordingTime, 120))
```

The results from this query for Device B are shown below. We have reduced the data from 40 rows per second to 1 row per second. The last step in this tip will be to display the data from both devices together on one row. There are myriad ways to do this, but we will show one way that involves two subqueries with an INNER JOIN on the recording time columns.

```select
a.recordingTime,
a.averageXAxis,
a.averageYAxis,
a.averageZAxis,
b.averageTemperature,
b.averageHumidity
from
(
select CONVERT(DATETIME, CONVERT(VARCHAR(20), recordingTime, 120)) as recordingTime,
cast(avg(xAxis) as decimal(5,3)) as averageXAxis,
cast(avg(yAxis) as decimal(5,3)) as averageYAxis,
cast(avg(zAxis) as decimal(5,3)) as averageZAxis
from dbo.tblDeviceA
group by CONVERT(DATETIME, CONVERT(VARCHAR(20), recordingTime, 120))
)as A
inner join
(
select CONVERT(DATETIME, CONVERT(VARCHAR(20), recordingTime, 120)) as recordingTime,
cast(avg(temperature) as decimal(5,2)) as averageTemperature,
cast(avg(humidity) as decimal(5,2)) as averageHumidity
from dbo.tblDeviceB
group by CONVERT(DATETIME, CONVERT(VARCHAR(20), recordingTime, 120))
) as B
on A.recordingTime=B.recordingTime
order by A.recordingTime
```

The results from the above query shows the data from both devices integrated on one row. ##### Next Steps

Temporary tables can be used to stage the data during the integration process. Also, this tip assumes that the clocks on the devices are synchronized. The DATEADD function can be used to adjust the times if the device clocks are not synchronized. Finally, read more about using the functions and clauses in T-SQL that are mentioned within this article in the following tips from MSSQLTips.com.

• Temporary Tables and Table Variables
• Date and Time Conversions Using SQL Server
• Decimal Conversions in SQL Server Change Datatype Format Dr. Dallas Snider is an Assistant Professor in the Computer Science Department at the University of West Florida and has 18+ years of SQL experience.

View all my tips

Article Last Updated: 2014-12-04