How To Find Space Used by Tables and Indexes in SQL Server Part 3


By:   |   Updated: 2020-05-26   |   Comments   |   Related: 1 | 2 | 3 | More > Database Administration

Problem

In Part 1, I wrote about discovering more details about large SQL Server tables, and in Part 2, I showed a SQL Server stored procedure that makes it easy to assemble all of this information into a single result set. That procedure, however, assumes you already know which SQL Server table(s) to investigate. What if you only know that the files for a specific SQL Server database are growing or seem misconfigured? Wouldn't it be nice to find the tables in that database with the largest size on disk or the highest number of rows?

Solution

I looked at the solution from Part 2, and realized that it would take little more than a wrapper procedure to accomplish my goal. The concept is that you pass in the name of the database, the number of tables you want to return, and whether you want to rank by size or by row count. The procedure would then run a query against either sys.partitions or sys.dm_db_database_page_allocations to determine the top n tables, and then – for each table in the list – call the original procedure. This would produce n result sets, each showing how a table is distributed across the files and filegroups.

Another approach could be to change the original procedure to pivot each table and its indexes into a single result set, but that may be more cumbersome both to create and to consume.

Before I show the code in the wrapper procedure I created, let's create a simple database with two filegroups, three files, and two tables, just to keep the results explanatory but digestible. Here is the database:

CREATE DATABASE gronk 
ON PRIMARY
  (name = N'gronk_data1'filename = N'/var/opt/mssql/data/gronk1.mdf',    size = 10 MB),
FILEGROUP SECONDARY
  (name = N'gronk_data2a', filename = N'/var/opt/mssql/data/gronk2a.mdf',   size = 10 MB),
  (name = N'gronk_data2b', filename = N'/var/opt/mssql/data/gronk2b.mdf',   size = 10 MB)
LOG ON
  (name = N'gronk_log',    filename = N'/var/opt/mssql/data/gronk_log.ldf', size = 10 MB);

Then two simple tables, one with more rows, and the other fewer rows, but more data:

USE gronk;
GO -- in the first table, we'll put a lot of rows, but small row size CREATE TABLE dbo.tblPrimary
(
  id int NOT NULL,
  CONSTRAINT pk_tblPrimary PRIMARY KEY (id),
  INDEX      ix_tblPrimary (id DESC)
) ON [PRIMARY];
GO INSERT dbo.tblPrimary(id) SELECT [object_id] FROM sys.all_objects;
  -- in the second table, we'll put fewer rows, but larger row size CREATE TABLE dbo.tblSecondary
(
  id int NOT NULL,
  filler char(4000) NOT NULL DEFAULT '',
  CONSTRAINT pk_tblSecondary PRIMARY KEY (id)
) ON [SECONDARY]; CREATE INDEX ix_tblSecondary ON dbo.tblSecondary(id DESC) INCLUDE (filler) ON [SECONDARY];
GO INSERT dbo.tblSecondary(id) SELECT TOP (1000) [object_id] FROM sys.all_objects;
GO

With those tables in place, I'll go back to the Utility database, and create the following wrapper procedure:

USE Utility;
GO CREATE PROCEDURE dbo.AssessDistribution_ForMultipleTables
  @DatabaseName   nvarchar(260),
  @TableCount     tinyint       = 5,
  @PrimaryFactor  char(4)       = 'rows' -- or 'size'
AS
BEGIN
  SET NOCOUNT ON;   IF DB_ID(@DatabaseName) IS NULL
  BEGIN
    RAISERROR(N'Database %s does not exist.', 11, 1, @DatabaseName);
    RETURN;
  END   DECLARE @sql            nvarchar(max) = N';WITH x AS (SELECT TOP (@TableCount) object_id, ',
          @SchemaName     sysname,
          @ObjectName     sysname,
          @RowOrPageCount int,
          @c              cursor,
          @Context        nvarchar(1024) = COALESCE(QUOTENAME(@DatabaseName) + N'.', N'')
                                           + N'sys.sp_executesql';   DECLARE @t table
  (
    rn             tinyint PRIMARY KEY,
    SchemaName     sysname,
    ObjectName     sysname,
    RowOrPageCount int
  );   IF @PrimaryFactor = 'rows'
  BEGIN
    SET @sql += N'c = SUM(rows) FROM sys.partitions';
  END   IF @PrimaryFactor = 'size'
  BEGIN
    SET @sql += N'c = COUNT(*) FROM sys.dm_db_database_page_allocations
                                    (DB_ID(), NULL, NULL, NULL, N''LIMITED'')';
  END   SET @sql += N'    WHERE OBJECTPROPERTY(object_id, N''IsMsShipped'') = 0
                      -- AND index_id IN (0,1)
                    GROUP BY object_id
                    ORDER BY c DESC
                 )
                 SELECT rn             = ROW_NUMBER() OVER (ORDER BY x.c DESC),
                        SchemaName     = s.name,
                        ObjectName     = o.name,
                        RowOrPageCount = x.c
                   FROM x
                   INNER JOIN sys.objects AS o ON x.object_id = o.object_id
                   INNER JOIN sys.schemas AS s ON o.schema_id = s.schema_id;'   INSERT @t EXEC @Context @sql, N'@TableCount int', @TableCount;   IF @@ROWCOUNT = 0
  BEGIN
    RAISERROR(N'No tables found.', 11, 1);
    RETURN;
  END   SET @c = cursor FORWARD_ONLY STATIC READ_ONLY FOR
    SELECT SchemaName, ObjectName, RowOrPageCount
    FROM @t ORDER BY rn;   OPEN @c;   FETCH NEXT FROM @c INTO @SchemaName, @ObjectName, @RowOrPageCount;   WHILE @@FETCH_STATUS <> -1
  BEGIN
    SELECT obj    = QUOTENAME(@SchemaName) + N'.' + QUOTENAME(@ObjectName),
           factor = @PrimaryFactor,
           num    = @RowOrPageCount;     EXEC Utility.dbo.AssessDistribution_ByTable
      @ObjectName    = @ObjectName,
      @SchemaName    = @SchemaName,
      @DatabaseName  = @DatabaseName;     FETCH NEXT FROM @c INTO @SchemaName, @ObjectName, @RowOrPageCount;
  END
END
GO

Now, if I call it with this syntax, asking for tables by row count:

EXEC Utility.dbo.AssessDistribution_ForMultipleTables 
     @DatabaseName  = N'gronk',
     @PrimaryFactor = 'rows';

I get these results:

Results from procedure call asking for tables ranked by number of rows

And if I switch to ranking tables by size:

EXEC Utility.dbo.AssessDistribution_ForMultipleTables 
     @DatabaseName  = N'gronk',
     @PrimaryFactor = 'size';

The results come back in the opposite order (and indicate the number of pages as opposed to the number rows):

Results from procedure call asking for tables ranked by number of pages

A few caveats:

  • This procedure sums the row count or size across all indexes. If you only want to account for the clustered index (or the table, if you have heaps), uncomment the -- AND index_id IN (0,1) line.
  • This solution is still untested with columnstore indexes or memory-optimized tables.
  • Because it bears repeating, sys.dm_db_database_page_allocations has some potential performance issues; if you ask for 10 tables, you're going to be calling this function 11 times. Just something to keep in mind.
Next Steps

See these related tips and other resources:



Last Updated: 2020-05-26


get scripts

next tip button



About the author
MSSQLTips author Aaron Bertrand Aaron Bertrand (@AaronBertrand) is a passionate technologist with industry experience dating back to Classic ASP and SQL Server 6.5. He is editor-in-chief of the performance-related blog, SQLPerformance.com, and also blogs at sqlblog.org.

View all my tips





Comments For This Article





download


Recommended Reading

SQL Server File and Filegroup Space Usage Part 1

SQL Server Index and Partition Space Usage - Part 2

How to read the SQL Server Database Transaction Log

How to rename a SQL Server database

Different ways to determine free space for SQL Server databases and database files





get free sql tips
agree to terms


Learn more about SQL Server tools