SQL Server File and Filegroup Space Usage

Problem

When investigating a disk space or large SQL Server table issue, I often want to know many pieces of information, and have to scramble to find all types of different queries that I’ve pieced together over time. I’ll want to answer some or all of the following questions:

How big is the table?
How many indexes, and how big are they?
Is the data partitioned?
How many filegroups and files are there?
How is the data distributed across the data files?
How big can the data files get?
How much room do the files have to grow in the current location?
Can some or all of the data be moved to different files / filegroups?
Should filegroups or files be added or moved?
Are these settings sensible and in policy?

Because there is no single SQL Server DMV that lays all of this out nicely for you, I can often hear a 1994 Dennis Hopper yelling at me into his flip phone: “Pop quiz, hot shot!” When there is a critical issue impacting the proverbial cash register, and the pressure is on, it would be nice to be able to get this information quickly and in one place.

Solution

After a couple of incidents like this, I put together a stored procedure that assembled all of this information for me. And I actually made two variations of the stored procedure: one that broke the information down for me based on a table I specified, and a more elaborate version that I could use when I didn’t already know which table(s) might be (at risk of) causing the issue.

But let’s start from the beginning. For a table I have already identified as “large” or a problem in terms of replication latency or other indicators, I want to know, as a starting point:

The name of each filegroup for the database
For each data file:
- The ID, logical name, and path
- Current size and free space (absolute and as %)
- Growth settings and max size
For each index (including the heap / clustered index):
- How much of that data lived in each file
- Partition distribution across those files
Whether the table was partitioned (wouldn’t be obvious from above if, say, one partition per filegroup)
Size of the drive each file is on, and how much free space on that drive (absolute and as %)

So as an output shape, that would look roughly like this:

You can get all of this data from different DMVs and system functions but, again, not in one place or in one tidy resultset. You’ll have to go separately to the following places to piece this together:

sys.filegroups
sys.database_files
FILEPROPERTY
sys.indexes
sys.partitions
sys.dm_db_database_page_allocations
sys.dm_os_volume_stats

Getting the individual pieces of information from each of these places is not overly complicated in isolation. To demonstrate with some examples, let’s first create a database:

CREATE DATABASE FGExample;
GO

USE FGExample;
GO

Example 1 – SQL Server Data Unevenly Distributed Across Files

Let’s say I have a scenario where a table’s data isn’t evenly distributed across the files (maybe a file was added later, and the index hasn’t rebuilt yet, and proportional fill is still favoring older files). To manufacture this scenario, we can add a new, unevenly distributed filegroup, and add two files:

ALTER DATABASE FGExample ADD FILEGROUP UnevenDist;
GO
 
ALTER DATABASE FGExample ADD FILE
(
  name=N'Uneven1', size=64,  filegrowth=20, maxsize=1024,
  filename=N'/var/opt/mssql/data/Uneven1.mdf'
)TO FILEGROUP UnevenDist;
 
ALTER DATABASE FGExample ADD FILE 
(
  name = N'Uneven2', size = 128, filegrowth = 10, maxsize = 1024, 
  filename = N'/var/opt/mssql/data/Uneven2.mdf'
) TO FILEGROUP UnevenDist;
GO

Now, we can create a table on that filegroup, and add some data:

CREATE TABLE dbo.tblUnevenDist
(
    id int NOT NULL, filler char(2000) NOT NULL DEFAULT '',
    CONSTRAINT PK_tblUnevenDist PRIMARY KEY(id) ON UnevenDist
) ON UnevenDist;
GO
 
CREATE INDEX IX_tblUnevenDist ON dbo.tblUnevenDist(id DESC) WHERE id > 0
  ON [PRIMARY];
GO

INSERT dbo.tblUnevenDist(id) SELECT [object_id] FROM sys.all_objects

Then, add a third file, which will be empty until something causes data to move there:

ALTER DATABASE FGExample ADD FILE 
(
  name = N'Uneven3', size = 96, filegrowth = 10, 
  filename = N'/var/opt/mssql/data/Uneven3.mdf'
) TO FILEGROUP UnevenDist;
GO

I can get the information about the filegroup and files this way:

-- get filegroup files
DECLARE @FileGroupName sysname = N'UnevenDist' 
;WITH src AS
(
  SELECT FG          = fg.name, 
         FileID      = f.file_id,
         LogicalName = f.name,
         [Path]      = f.physical_name, 
         FileSizeMB  = f.size/128.0, 
         UsedSpaceMB = CONVERT(bigint, FILEPROPERTY(f.[name], 'SpaceUsed'))/128.0, 
         GrowthMB    = CASE f.is_percent_growth WHEN 1 THEN NULL ELSE f.growth/128.0 END,
         MaxSizeMB   = NULLIF(f.max_size, -1)/128.0,
         DriveSizeMB = vs.total_bytes/1048576.0,
         DriveFreeMB = vs.available_bytes/1048576.0
  FROM sys.database_files AS f
  INNER JOIN sys.filegroups AS fg
        ON f.data_space_id = fg.data_space_id
  CROSS APPLY sys.dm_os_volume_stats(DB_ID(), f.file_id) AS vs
  WHERE fg.name = COALESCE(@FileGroupName, fg.name)
)
SELECT [Filegroup] = FG, FileID, LogicalName, [Path],
  FileSizeMB  = CONVERT(decimal(18,2), FileSizeMB),
  FreeSpaceMB = CONVERT(decimal(18,2), FileSizeMB-UsedSpaceMB),
  [%]         = CONVERT(decimal(5,2), 100.0*(FileSizeMB-UsedSpaceMB)/FileSizeMB),
  GrowthMB    = COALESCE(RTRIM(CONVERT(decimal(18,2), GrowthMB)), '% warning!'),
  MaxSizeMB   = CONVERT(decimal(18,2), MaxSizeMB),
  DriveSizeMB = CONVERT(bigint, DriveSizeMB),
  DriveFreeMB = CONVERT(bigint, DriveFreeMB),
  [%]         = CONVERT(decimal(5,2), 100.0*(DriveFreeMB)/DriveSizeMB)
FROM src
ORDER BY FG, LogicalName

Results (prettified in a spreadsheet):

A screenshot of a social media post Description automatically generated

Looking at this output visually immediately points out five problems. Four of them are highlighted above; from left to right:

The files are all different sizes - For even distribution, you want these to be the same, otherwise small files never get touched and become wasteful.
One file is almost completely empty - This indicates either the file was recently added or it is not getting selected by proportional fill.
One file has a bigger autogrowth setting - If this file grows next, it could become a hotspot because it will have more free space than any other files that only grow by half that amount.
One file doesn’t have a max size - There may be reasons for this, but if only one file is uncapped, it has a potential to become the single bottleneck when the other files get closer to their capacity.
There are three files in this filegroup - Usually, you want to have a number of files that correlates in some way to the number of cores – not necessarily 1:1, maybe 1:2 or 1:8, but some even number. In this case maybe the machine has a single-socket with 6 cores, but still worth investigating to be sure this configuration is optimal.

Example 2 - SQL Server File Percentage Auto Growth

Let’s add another filegroup and two files; this time, we’ll give them equal distribution, but we’ll "accidentally" configure one of the files to use percentage auto-growth (not exactly a best practice).

ALTER DATABASE FGExample ADD FILEGROUP EvenDist;
GO
 
ALTER DATABASE FGExample ADD FILE
(
    name = N'Even1', size = 32, filegrowth = 10%,
    filename = N'/var/opt/mssql/data/Even1.mdf'
)
TO FILEGROUP EvenDist;

ALTER DATABASE FGExample ADD FILE
(
    name = N'Even2', size = 32, filegrowth = 10,
    filename = N'/var/opt/mssql/data/Even2.mdf'
)
TO FILEGROUP EvenDist;

And we’ll insert the same data as before, to a new table on this filegroup:

CREATE TABLE dbo.tblEvenDist
(
    id int NOT NULL, filler char(2000) NOT NULL DEFAULT'',
    CONSTRAINT PK_tblEvenDist PRIMARY KEY(id) ON UnevenDist
) ON UnevenDist;
GO
 
CREATE INDEX IX_tblEvenDist ON dbo.tblUnevenDist(id DESC) WHERE id > 0
  ON [PRIMARY];
GO
 
INSERT dbo.tblEvenDist(id) SELECT [object_id] FROM sys.all_objects;

Changing the query "get filegroup files" from above to specify a different filegroup name (EvenDist in this case) as the parameter (or setting it to NULL so that all filegroups are returned), we see these (additional) rows:

This time we can easily see that the data is more evenly distributed, but we do get a little warning that one of the files has growth configured for percentage. In our scenario, the actual percentage is less important than creating an action item to correct that setting to a fixed size.

Example 3 - SQL Server Filegroup is Skewed

Sometimes, distribution skew among files can be normal and expected. Take the example where you have a partitioned table, and one partition on its own filegroup simply contains more data than another. This can happen for several reasons, mostly around data skew, including:

Sales have doubled year-over-year, but partitions are still defined as a year long.
Partition ranges are different (maybe we switched from annual to monthly).
Some partitions simply aren’t populated yet (say, data is still being back-filled, or partitions are already created to cover several years into the future).
Some partitions are compressed differently (maybe data compression only on some partitions or indexes, or different compression settings – like COLUMNSTORE_ARCHIVE – on older ones).

In addition, any partition could have individual files that are different sizes (or have a different percentage full), due to issues highlighted in the first two examples. You won’t always need to take any action, but you certainly can’t perform any analysis – never mind make any decisions – without visibility.

Let’s create a simple partitioned table with unequal distribution across files and across partitions. We can see where the existing query can help, and where it can fall short. First, we can create the filegroups where our partitions will be distributed:

ALTER DATABASE FGExample ADD FILEGROUP Part1;
ALTER DATABASE FGExample ADD FILEGROUP Part2;
ALTER DATABASE FGExample ADD FILEGROUP Part3;
ALTER DATABASE FGExample ADD FILEGROUP Part4;
GO
 
ALTER DATABASE FGExample ADD FILE 
(
 name = N'P1',   size = 16, filegrowth = 5,
  filename = N'/var/opt/mssql/data/P1.mdf'
) TO FILEGROUP Part1;

ALTER DATABASE FGExample ADD FILE 
(
 name = N'P2',   size = 16, filegrowth = 5,
  filename = N'/var/opt/mssql/data/P2.mdf'
) TO FILEGROUP Part2;

ALTER DATABASE FGExample ADD FILE 
(
 name = N'P3',   size = 16, filegrowth = 5,
  filename = N'/var/opt/mssql/data/P3.mdf'
) TO FILEGROUP Part3;
 
ALTER DATABASE FGExample ADD FILE 
(
 name = N'P4a',  size = 16, filegrowth = 5,
  filename = N'/var/opt/mssql/data/P4a.mdf'
) TO FILEGROUP Part4;

Now we need a partition schema and function:

CREATE PARTITION FUNCTION PFInt(int)
AS RANGE RIGHT FOR VALUES(10,20,30);
 
CREATE PARTITION SCHEME PSInt
AS PARTITION PFInt TO (Part1, Part2, Part3, Part4);

Then a table with the clustered index created on that scheme (and non-clustered indexes on PRIMARY):

CREATE TABLE dbo.PartExample
(
  id     int        NOT NULL, 
  dt     datetime   NOT NULL DEFAULT GETDATE(),
  filler char(4000) NOT NULL DEFAULT'',
  INDEX cix_pe CLUSTERED(id)
) ON PSInt(id);
GO
 
CREATE INDEX ix_pe ON dbo.PartExample(id DESC)
  ON[PRIMARY];
GO
 
CREATE INDEX ix_dt ON dbo.PartExample(dt)
  ON[PRIMARY];
GO

Some data:

INSERT dbo.PartExample(id) SELECT CASE 
  WHEN o >  1000000 THEN 5  --   331 rows
  WHEN o >  0       THEN 15 --   692 rows
  WHEN o > -10000   THEN 25 -- 3,106 rows
  ELSE 35 END               -- 6,779 rows
FROM (SELECT [object_id] FROM sys.all_columns) AS t(o);

And then an additional file to create some artificial skew in the Part4 filegroup:

ALTER DATABASE FGExample ADD FILE 
(
  name = N'P4b',  size = 16, filegrowth = 5, 
  filename = N'/var/opt/mssql/data/P4b.mdf'
) TO FILEGROUP Part4;

Now if we run the query "get filegroup files" again with the relevant filegroups we show these results:

Relevant filegroups after adding a partitioned table.

I’ve highlighted the two things that stand out: that one of the files in the Part4 filegroup has grown from its original configuration of 16MB, and that the other file – which hasn’t grown – is empty. One thing that’s not obvious in this resultset, though, is whether any indexes are non-partition-aligned, and it may not be something I know I should be looking for.

Next Steps

Stay tuned for part 2, where I’ll show how to introduce index information to this output without overwhelming the consumer – making it easy to spot all of these issues and more, without knowing what queries you should be running to do so. In the meantime, read on for related tips and other resources:

Aaron Bertrand

Aaron Bertrand (@AaronBertrand) is a passionate technologist with industry experience dating back to Classic ASP and SQL Server 6.5. He also blogs at sqlblog.org.

MSSQLTips Awards: Author of the Year – 2016, 2023 | Leadership (200+ tips) – 2022

SQL Server File and Filegroup Space Usage – Part 1

Example 1 – SQL Server Data Unevenly Distributed Across Files

Example 2 - SQL Server File Percentage Auto Growth

Example 3 - SQL Server Filegroup is Skewed

Leave a ReplyCancel Reply