SQL Server Index and Partition Space Usage

Problem

Recently, in SQL Server File and Filegroup Space Usage - Part 1, I wrote about enhancing the way we look at information about large tables, as well as the files and filegroups that contain them. The problem we have today is that there are a handful of queries you may need to fully analyze a scenario, but assembling them into one spot can be cumbersome.

Solution

I wanted to continue that earlier discussion by enhancing my initial queries with further information about indexes and partitions. If you remember, we had a table called PartExample, which had a partitioned clustering key, and at least one non-clustered index that was not partition-aligned. The output of my initial query didn’t make it obvious that any of the data from the table was on the primary filegroup:

As a start, I’d want to see something like this; a pivot table with a column for each index, and the space (in MB) that index occupies in each file:

This still points out that one of my files is the wrong size, but it also shows that the fill percentages are off, that the clustered index has no rows on the second file in the Part4 filegroup, and that ix_dt was accidentally created on the PRIMARY filegroup. But how do I get these results?

I can pre-aggregate the size per file from the dynamic management function, sys.dm_db_database_page_allocations (which, admittedly, I warn about using excessively in an earlier tip):

DECLARE @object_id int = OBJECT_ID(N'PartExample'SELECT 
  FileID  = extent_file_id, 
  IndexID = index_id, 
  SizeMB  = CONVERT(decimal(18,2), COUNT(allocated_page_page_id)*8.192/1024)
FROM sys.dm_db_database_page_allocations(DB_ID(), @object_id, NULL, NULL, N'LIMITED')
GROUP BY extent_file_id, index_id;

Results:

That definitely looks like something we could turn sideways, pivoting manually using IndexID (and since we know the corresponding names, we can easily set the column headers appropriately):

DECLARE @object_id int = OBJECT_ID(N'PartExample'WITH dist AS
(
  SELECT 
    FileID  = extent_file_id, 
    IndexID = index_id, 
    SizeMB  = CONVERT(decimal(18,2),COUNT(allocated_page_page_id)*8.192/1024)
  FROM sys.dm_db_database_page_allocations(DB_ID(), @object_id, NULL, NULL, N'LIMITED') AS pa
  GROUP BY extent_file_id, index_id
)
SELECT 
  p.FileID,
  [cix_pe (index_id = 1)] = p.[1],
  [ix_dt (2)]             = p.[2] 
FROM dist PIVOT (SUM(SizeMB) FOR IndexID IN ([1],[2])) AS

Results:

And you can see from this output how we could produce the screen shot above, outer joining those results to the original query and having those pivoted index columns alongside the file details:

DECLARE @object_id int = OBJECT_ID(N'PartExample'WITH dist AS
(
  SELECT FileID  = extent_file_id, 
         IndexID = index_id, 
         SizeMB  = CONVERT(decimal(18,2),COUNT(allocated_page_page_id)*8.192/1024)
  FROM sys.dm_db_database_page_allocations(DB_ID(), @object_id, NULL, NULL, N'LIMITED')
  GROUP BY extent_file_id, index_id
),
p AS
(
  SELECT FileID, [1], [2] FROM dist
  PIVOT (SUM(SizeMB) FOR IndexID IN ([1], [2])) AS p
),
finfo AS
(
  SELECT FG          = fg.name, 
         FileID      = f.file_id,
         LogicalName = f.name,
         [Path]      = f.physical_name, 
         FileSizeMB  = f.size/128.0, 
         UsedSpaceMB = CONVERT(bigint, FILEPROPERTY(f.[name], 'SpaceUsed'))/128.0
  FROM sys.database_files AS f
  INNER JOIN sys.filegroups AS fg ON f.data_space_id = fg.data_space_id
)
SELECT 
  [Filegroup] = f.FG, 
  f.FileID,     
  f.LogicalName,
  FileSizeMB  = CONVERT(decimal(18,2), f.FileSizeMB),
  FreeSpaceMB = CONVERT(decimal(18,2), f.FileSizeMB-f.UsedSpaceMB),
  [%]         = CONVERT(decimal(5,2), 100.0*(f.FileSizeMB-f.UsedSpaceMB)/f.FileSizeMB),
  [cix_pe (index_id = 1)] = p.[1],
  [ix_dt (2)]             = p.[2]
FROM finfo AS f LEFT OUTER JOIN p ON f.FileID = p.FileID
ORDER BY f.FileID;

Results:

In troubleshooting mode, though, I don't know all of the index details in advance, and I don't want to have to suss those out myself. I want to call a stored procedure that does that grunt work for me, by…

Building the PIVOT details dynamically

In the code sample above, I highlighted the portions of the eventual query that would need to be derived dynamically in order to combine the tabular and pivoted data into a single resultset. We can also dynamically add a partition count column for any index with more than one partition. I created the following stored procedure in a Utility database common across all of our instances, and it can be run from there without having to be in the right database context (you just need to pass in the right database name):

CREATE PROCEDURE dbo.AssessDistribution_ByTable
  @ObjectName     sysname,
  @SchemaName     sysname       = N'dbo',
  @DatabaseName   nvarchar(260) = NULL,
  @FileGroupName  nvarchar(260) = NULL 
AS
BEGIN
  SET NOCOUNT ONDECLARE @sql            nvarchar(max)  = N'SELECT @oi = OBJECT_ID(@on);',
          @ObjectID       int,
          @PivotColNames  nvarchar(max)  = N'',
          @PrettyHeaders  nvarchar(max)  = N'',
          @MaxHeaders     nvarchar(max)  = N'',
          @Context        nvarchar(1024) = COALESCE(QUOTENAME(@DatabaseName) + N'.', '')
                                           + N'sys.sp_executesql',
          @FullObjectName nvarchar(520)  = QUOTENAME(COALESCE(@SchemaName, N'dbo'))
                                           + N'.' + QUOTENAME(@ObjectName);
  EXEC @Context @sql, N'@on nvarchar(512), @oi int OUTPUT', @FullObjectName, @ObjectID OUTPUTIF @ObjectID IS NULL
  BEGIN
      RAISERROR(N'%s does not exist in db %s.', 11, 1, @FullObjectName, @DatabaseName);
      RETURN  END
  SET @sql = N'SELECT 
    @pcn += N'','' + QUOTENAME(index_id),
    @mh  += N'','' + QUOTENAME(index_id) + '' = MAX('' + QUOTENAME(index_id) + '')'',
    @ph  += N'', ['' + COALESCE([name],''(heap'') 
         +  N'' ('' + CASE WHEN index_id < 2 THEN ''id '' ELSE '''' END 
         + RTRIM(index_id) + '') size] = ps.['' + RTRIM(index_id) + '']''
         + CASE WHEN EXISTS (SELECT 1 FROM sys.partitions WHERE 
             [object_id] = @oi AND index_id = i.index_id AND partition_number > 1)
            THEN N'', [part cnt ('' + RTRIM(index_id) + N'')] = pc.['' + RTRIM(index_id) + '']'' 
            ELSE '''' END 
    FROM sys.indexes AS i WHERE [object_id] = @oi;'EXEC @Context @sql, 
       N'@oi int, @pcn nvarchar(max) OUTPUT, @ph nvarchar(max) OUTPUT, @mh nvarchar(max) OUTPUT',
       @ObjectID, @PivotColNames OUTPUT, @PrettyHeaders OUTPUT, @MaxHeaders OUTPUTSET @sql = N';WITH dst AS (
      SELECT FileID  = extent_file_id, 
             IndexID = index_id, 
             SizeMB  = CONVERT(decimal(18,2),COUNT(allocated_page_page_id)*8.192/1024),
             PartitionCount = COUNT(DISTINCT partition_id)
      FROM sys.dm_db_database_page_allocations(DB_ID(), @ObjectID, NULL, NULL, N''LIMITED'')
      GROUP BY extent_file_id, index_id
    ),
    ps AS (SELECT FileID, $pcn$ FROM dst PIVOT (SUM(SizeMB) FOR IndexID IN ($pcn$)) p),
    pc AS (SELECT FileID, $mh$ FROM (SELECT FileID, $pcn$ FROM dst 
           PIVOT (MAX(PartitionCount) FOR IndexID IN ($pcn$)) p) AS x GROUP BY FileID),
    finfo AS (
      SELECT FG          = fg.name, 
             FileID      = f.file_id,
             LogicalName = f.name,
             [Path]      = f.physical_name, 
             FileSizeMB  = f.size/128.0, 
             UsedSpaceMB = CONVERT(bigint, FILEPROPERTY(f.[name], N''SpaceUsed''))/128.0,
             GrowthMB    = CASE f.is_percent_growth WHEN 1 THEN NULL ELSE f.growth/128.0 END,
             MaxSizeMB   = NULLIF(f.max_size, -1)/128.0,
             DriveSizeMB = vs.total_bytes/1048576.0,
             DriveFreeMB = vs.available_bytes/1048576.0
      FROM sys.database_files AS f
      INNER JOIN sys.filegroups AS fg ON f.data_space_id = fg.data_space_id
      CROSS APPLY sys.dm_os_volume_stats(DB_ID(), f.file_id) AS vs
      WHERE fg.name = COALESCE(@FileGroupName, fg.name)
    )
    SELECT 
      [Filegroup] = f.FG, 
      f.FileID,     
      f.LogicalName,
      f.[Path],
      FileSizeMB  = CONVERT(decimal(18,2), f.FileSizeMB),
      FreeSpaceMB = CONVERT(decimal(18,2), f.FileSizeMB - f.UsedSpaceMB),
      [% Free]    = CONVERT(decimal(5,2), 100.0*(f.FileSizeMB - f.UsedSpaceMB) / f.FileSizeMB),
      GrowthMB    = COALESCE(RTRIM(CONVERT(decimal(18,2), GrowthMB)), ''% warning!''),
      MaxSizeMB   = CONVERT(decimal(18,2), f.MaxSizeMB)
      $ph$,
      DriveSizeMB = CONVERT(bigint, DriveSizeMB),
      DriveFreeMB = CONVERT(bigint, DriveFreeMB),
      [% Free]    = CONVERT(decimal(5,2), 100.0*(DriveFreeMB)/DriveSizeMB)
    FROM finfo AS f 
    LEFT OUTER JOIN ps ON f.FileID = ps.FileID
    LEFT OUTER JOIN pc ON f.FileID = pc.FileID
    ORDER BY [Filegroup], f.FileID;'SET @sql = REPLACE(REPLACE(REPLACE(@sql, 
             N'$ph$',  @PrettyHeaders), 
             N'$pcn$', STUFF(@PivotColNames, 1, 1, N'')),
             N'$mh$',  STUFF(@MaxHeaders, 1, 1, N''PRINT @sql;
  EXEC @Context @sql, N'@ObjectID int, @FileGroupName sysname', @ObjectID, @FileGroupName;
END
GO

When I run it with a call like this:

EXEC dbo.AssessDistribution_ByTable 
     @ObjectName   = N'PartExample', 
     @SchemaName   = N'dbo', 
     @DatabaseName = N'FGExample';

I get this output:

Results of file information correlated with distribution of data across those files

Which gets me pretty close to the original result I was looking for at the beginning of Part 1 – I just indicate an index is partition-aligned by the presence or absence of the partition count column, and show the MaxSize column in the wrong spot in the screen shot above:

Caveats

As with any complicated solution, there are some shortcomings here.

I'll mention again that sys.dm_db_database_page_allocations has some potential performance challenges. I would use this approach only when you are sure that you need to investigate the distribution of data across filegroups and data files.
I have not fully tested this with mount points or quotas; SQL Server may not always be a reliable source for information about percent of space free or room to grow.
I didn't look at how to integrate compression types into this output, nor did I look at columnstore or memory-optimized indexes, any of which can play a part in storage discrepancies.

Next Steps

This solution assumes I already know the problem table to investigate. If I only knew which database is having a space issue, could I change the procedure to pull details for the top table, or top n tables, in terms of space or row count? If I just know the instance is having trouble, could I even leave out the details of which database I need to investigate? Stay tuned for part 3! Until then, see these related tips and other resources:

Aaron Bertrand

Aaron Bertrand (@AaronBertrand) is a passionate technologist with industry experience dating back to Classic ASP and SQL Server 6.5. He also blogs at sqlblog.org.

MSSQLTips Awards: Author of the Year – 2016, 2023 | Leadership (200+ tips) – 2022

SQL Server Index and Partition Space Usage – Part 2

Building the PIVOT details dynamically

Caveats

Leave a ReplyCancel Reply