Calculating Table Width in SQL Server

By:   |   Comments (6)   |   Related: > Database Design


Problem

I am concerned our database design is overly denormalized.  I believe we have some very wide tables which makes some of our coding very easy, but I am concerned about the data access and overall performance.  Before I go too far down the path of just changing some of these tables or trying to change the ways of my team for future database design, how can I find out how wide some of these tables really are based on the data as compared to the theoretical maximum size?  Can you provide a script or two that I can run on my SQL Server databases?  Can you also provide any resources for practical database design?  I know having some denormalization is reasonable, I am just not sure if some of the database design has gone to an extreme or not.

Solution

As far as calculating table width is concerned, a few different options are available.  Let's take a look at those scripts and then address the database design independently.

Option 1 - DBCC SHOWCONTIG

The DBCC SHOWCONTIG command reports row related information and is a viable source to consider.  This is achieved by using the WITH TABLERESULTS option.  Then the MinimumRecordSize, MaximumRecordSize and AverageRecordSize can be reviewed based on your need.

Simple DBCC SHOWCONTIG Command

USE AdventureWorks;
GO
DBCC SHOWCONTIG WITH TABLERESULTS;
GO
 

Keep in mind that this functionality of the DBCC SHOWCONTIG command is available with SQL Server 2000 and SQL Server 2005.  Although on busy SQL Server databases it is not recommended to be run during production hours.  It is advisable to run the command either during off hours, during a maintenance window or on a backed up version of the database.  Here is one point of reference - DBCC SHOWCONTIG Blocking Issues.

Option 2 - sys.dm_db_index_physical_stats

One of the new features with SQL Server 2005 are the dynamic management views and functions.  The one that will come in handy in this scenario is the sys.dm_db_index_physical_stats.  The beauty of the dynamic management views and functions is that they can be queried with a simple SELECT statement.  Here are a few different examples using the AdventureWorks SQL Server 2005 sample database:

sys.dm_db_index_physical_stats - Basic SELECT Statement

USE AdventureWorks;
GO
SELECT CAST(DB_NAME(DATABASE_ID) AS VARCHAR(20)) AS 'DatabaseName',
CAST(OBJECT_NAME([OBJECT_ID]) AS VARCHAR(20)) AS 'TableName',
index_id,
index_type_desc
,
alloc_unit_type_desc,
min_record_size_in_bytes,
max_record_size_in_bytes
,
avg_record_size_in_bytes
FROM SYS.DM_DB_INDEX_PHYSICAL_STATS (DB_ID('AdventureWorks'),NULL,NULL,NULL,'DETAILED');
GO
 

sys.dm_db_index_physical_stats - Basic SELECT Statement with an ORDER BY Clause

USE AdventureWorks;
GO
SELECT CAST(DB_NAME(DATABASE_ID) AS VARCHAR(20)) AS 'DatabaseName',
CAST(OBJECT_NAME([OBJECT_ID]) AS VARCHAR(20)) AS 'TableName',
index_id,
index_type_desc
,
alloc_unit_type_desc,
min_record_size_in_bytes,
max_record_size_in_bytes
,
avg_record_size_in_bytes
FROM SYS.DM_DB_INDEX_PHYSICAL_STATS (DB_ID('AdventureWorks'),NULL,NULL,NULL,'DETAILED')
ORDER BY avg_record_size_in_bytes DESC;
GO
 

Database Design Considerations

Here are some data points to consider when evaluating your database design situation i.e. wide tables:

  • Good or Bad - A wide table is not inherently a bad design depending on how it is used.  For reporting environments, some database designs are wide to meet reporting needs to avoid joins or provide a simple interface to users/reporting applications.
  • Eliminate Joins - In OLTP environments, data is duplicated in some circumstances to eliminate joining tables.  Depending on the circumstances and how the duplicated data is maintained, this may be a critical technique to ensure the user experience.
  • Repeating columns - To me this is typically a sure sign that the design was not thought about or it has evolved over time.  If a table has 3 or more columns with the same meaning i.e. product1, product2, product3, this is a good candidate for a one to many relationship with the current table and the new product table.  Another data point to consider is what will happen when a fourth or fifth product is ordered? 
  • Off the map - If a database consists of a few (wide) tables and all of the columns are text data types when particular columns would be better suited as integers, date time, etc. as opposed to just characters, then either the design was not thought out at all or some database design education is needed.
  • Check out the Next Steps section below for some additional points of reference related to the database design.
Next Steps

 



sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Jeremy Kadlec Jeremy Kadlec is a Co-Founder, Editor and Author at MSSQLTips.com with more than 300 contributions. He is also the CTO @ Edgewood Solutions and a six-time SQL Server MVP. Jeremy brings 20+ years of SQL Server DBA and Developer experience to the community after earning a bachelor's degree from SSU and master's from UMBC.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips



Comments For This Article




Monday, January 31, 2011 - 2:20:33 PM - Jeremy Kadlec Back To Top (12758)

Jay,

Unfortunately, I do not understand what you mean by this:

"row size matter for IO based upon Index present or not and Index is nothing but the name of the column for e.g. if table CUSTOMER as column - FirstNAME of Varchar(255) and i create Non-Clustered Index on that column where the Index Key will be "FirstNAME"."

In terms of the indexing references in your two posts, I do not think the tip references indexing best practices or any sort of indexing recommendations, so I am not sure what you are referencing there.

For this statement "So the point over here i am making is the table width which is 255 assuming one table with one column.", I am not sure if that is true or not.  In my example, varchar columns were wide, but only a fraction of the available storage.  So when I ran the queries from the tip, the column width was not 255 (as in your example), the min could be 5, the avg could be 25 and the max could be 35.  This was the case for the columns aggregating the width for the row and/or table.

Sorry I do not have a better response, I am not sure exactly sure how I can help you.  If you can post some additional information, I can see if I can clarify this any further.

Thank you,
Jeremy Kadlec


Monday, January 31, 2011 - 1:34:55 PM - Jay Back To Top (12757)

Should i expect feedback or ?

Thanks

Jay


Monday, January 31, 2011 - 8:49:07 AM - Jeremy Kadlec Back To Top (12751)

Jay,

Thank you for the post.

Thank you,
Jeremy Kadlec


Saturday, January 29, 2011 - 7:58:21 AM - Jay Back To Top (12743)

Hello:
 
I am sorry but still it is not clear. if column is varchar(500) then it is not going to store value 501 and the remaing will be truncated unless it is LOB where 1st row will store 16byte point for row-over-flow. but over here we are not talking about row over flow.
 
row size matter for IO based upon Index present or not and Index is nothing but the name of the column for e.g. if table CUSTOMER as column - FirstNAME of Varchar(255) and i create Non-Clustered Index on that column where the Index Key will be "FirstNAME".
 
So the point over here i am making is the table width which is 255 assuming one table with one column.
 
Thanks
Regards
Jay


Friday, January 28, 2011 - 8:36:42 PM - Jeremy Kadlec Back To Top (12742)

Jay,

Thank you for the post and feedback.  Sorry the tip is confusing to you.  That was not the intention.  Let's see if I can clarify things any for you.

The tip is intended to determine the width of a table based on the data types for its columns vs, the actual amount of data that is being stored per row, which is an aggregate of the columns.  When people talk about "wide tables" it is in reference to the amount of storage for a row and\or a large number of columns per table. 

For tables with a 100+ columns with variable data types this may or may not be a situation where the physical row widths are exceeded and row overflows come into play.  The queries in the tip were intended to give you a sense of how wide the rows for the table are independent of the amount of storage the table is actually designed to store, which is based on the column definitions.  

For example, a table can have 100 columns.  One column has an int data type and the remainder are a varchar(1000).  At the surface, you would assume the table is grossly denormalized and the physical row must be exceeded, so this is causing all sorts of performance problems.  In the situation I was in, a design change was needed, but the actual lengths of the rows were not exceeding the physical row limits.  There was some education needed to help the team (my customer) understand how to design a database and select the correct data types.  Unfortunately, this sort of design with tables with 100+ columns was the norm in 1 application and I worked with the Development Team to try to improve upon the design.

In terms of the BOL article you referenced, I think this is indexing related.  It is a good set of information that does reference indexing best practices (width, number, duplication, data distribution, etc.), but I did not see how it related to calculating table widths.

I hope this helps.  If anything is unclear, please post back your questions.

Thank you,
Jeremy Kadlec


Friday, January 28, 2011 - 6:25:51 PM - Jay Back To Top (12741)

Jeremy:

This article is good but bit confusing. artcile is about Table width which is more in respect to Table Column and the size of the column in restpect to datatype that column has.

Where for e.g. min, max size that DMV physical Stats shows is the record size. To my knowledge Table Width is more depends on how many columns it has & what kind of datatype & not to mentioned any if BLOB where row_over_flow might happen.

This link is more towards table width. please correct me if i am wrong which will help me to understand concept in better way. http://msdn.microsoft.com/en-us/library/ms191195.aspx

Thanks

Jay















get free sql tips
agree to terms