Learn more about SQL Server tools

mssqltips logo
 

Tutorials          DBA          Dev          BI          Career          Categories          Webcasts          Whitepapers          Today's Tip          Join

Tutorials      DBA      Dev      BI      Categories      Webcasts

DBA    Dev    BI    Categories

 

Calculating Table Width in SQL Server


By:   |   Read Comments (6)   |   Related Tips: More > Database Design

Attend these FREE SQL Server 2017 webcasts >> click to register


Problem
I am concerned our database design is overly denormalized.  I believe we have some very wide tables which makes some of our coding very easy, but I am concerned about the data access and overall performance.  Before I go too far down the path of just changing some of these tables or trying to change the ways of my team for future database design, how can I find out how wide some of these tables really are based on the data as compared to the theoretical maximum size?  Can you provide a script or two that I can run on my SQL Server databases?  Can you also provide any resources for practical database design?  I know having some denormalization is reasonable, I am just not sure if some of the database design has gone to an extreme or not.

Solution
As far as calculating table width is concerned, a few different options are available.  Let's take a look at those scripts and then address the database design independently.

Option 1 - DBCC SHOWCONTIG

The DBCC SHOWCONTIG command reports row related information and is a viable source to consider.  This is achieved by using the WITH TABLERESULTS option.  Then the MinimumRecordSize, MaximumRecordSize and AverageRecordSize can be reviewed based on your need.

Simple DBCC SHOWCONTIG Command

USE AdventureWorks;
GO
DBCC SHOWCONTIG WITH TABLERESULTS;
GO
 

Keep in mind that this functionality of the DBCC SHOWCONTIG command is available with SQL Server 2000 and SQL Server 2005.  Although on busy SQL Server databases it is not recommended to be run during production hours.  It is advisable to run the command either during off hours, during a maintenance window or on a backed up version of the database.  Here is one point of reference - DBCC SHOWCONTIG Blocking Issues.

Option 2 - sys.dm_db_index_physical_stats

One of the new features with SQL Server 2005 are the dynamic management views and functions.  The one that will come in handy in this scenario is the sys.dm_db_index_physical_stats.  The beauty of the dynamic management views and functions is that they can be queried with a simple SELECT statement.  Here are a few different examples using the AdventureWorks SQL Server 2005 sample database:

sys.dm_db_index_physical_stats - Basic SELECT Statement

USE AdventureWorks;
GO
SELECT CAST(DB_NAME(DATABASE_ID) AS VARCHAR(20)) AS 'DatabaseName',
CAST(OBJECT_NAME([OBJECT_ID]) AS VARCHAR(20)) AS 'TableName',
index_id,
index_type_desc
,
alloc_unit_type_desc,
min_record_size_in_bytes,
max_record_size_in_bytes
,
avg_record_size_in_bytes
FROM SYS.DM_DB_INDEX_PHYSICAL_STATS (DB_ID('AdventureWorks'),NULL,NULL,NULL,'DETAILED');
GO
 

sys.dm_db_index_physical_stats - Basic SELECT Statement with an ORDER BY Clause

USE AdventureWorks;
GO
SELECT CAST(DB_NAME(DATABASE_ID) AS VARCHAR(20)) AS 'DatabaseName',
CAST(OBJECT_NAME([OBJECT_ID]) AS VARCHAR(20)) AS 'TableName',
index_id,
index_type_desc
,
alloc_unit_type_desc,
min_record_size_in_bytes,
max_record_size_in_bytes
,
avg_record_size_in_bytes
FROM SYS.DM_DB_INDEX_PHYSICAL_STATS (DB_ID('AdventureWorks'),NULL,NULL,NULL,'DETAILED')
ORDER BY avg_record_size_in_bytes DESC;
GO
 

Database Design Considerations

Here are some data points to consider when evaluating your database design situation i.e. wide tables:

  • Good or Bad - A wide table is not inherently a bad design depending on how it is used.  For reporting environments, some database designs are wide to meet reporting needs to avoid joins or provide a simple interface to users/reporting applications.
  • Eliminate Joins - In OLTP environments, data is duplicated in some circumstances to eliminate joining tables.  Depending on the circumstances and how the duplicated data is maintained, this may be a critical technique to ensure the user experience.
  • Repeating columns - To me this is typically a sure sign that the design was not thought about or it has evolved over time.  If a table has 3 or more columns with the same meaning i.e. product1, product2, product3, this is a good candidate for a one to many relationship with the current table and the new product table.  Another data point to consider is what will happen when a fourth or fifth product is ordered? 
  • Off the map - If a database consists of a few (wide) tables and all of the columns are text data types when particular columns would be better suited as integers, date time, etc. as opposed to just characters, then either the design was not thought out at all or some database design education is needed.
  • Check out the Next Steps section below for some additional points of reference related to the database design.

Next Steps

 



Last Update:


signup button

next tip button



About the author
MSSQLTips author Jeremy Kadlec Since 2002, Jeremy Kadlec has delivered value to the global SQL Server community as an Edgewood Solutions SQL Server Consultant, MSSQLTips.com co-founder and Baltimore SSUG co-leader.

View all my tips





Post a comment or let the author know this tip helped.

All comments are reviewed, so stay on subject or we may delete your comment. Note: your email address is not published. Required fields are marked with an asterisk (*).

*Name    *Email    Notify for updates 


SQL tips:

*Enter Code refresh code     



Monday, January 31, 2011 - 2:20:33 PM - Jeremy Kadlec Back To Top

Jay,

Unfortunately, I do not understand what you mean by this:

"row size matter for IO based upon Index present or not and Index is nothing but the name of the column for e.g. if table CUSTOMER as column - FirstNAME of Varchar(255) and i create Non-Clustered Index on that column where the Index Key will be "FirstNAME"."

In terms of the indexing references in your two posts, I do not think the tip references indexing best practices or any sort of indexing recommendations, so I am not sure what you are referencing there.

For this statement "So the point over here i am making is the table width which is 255 assuming one table with one column.", I am not sure if that is true or not.  In my example, varchar columns were wide, but only a fraction of the available storage.  So when I ran the queries from the tip, the column width was not 255 (as in your example), the min could be 5, the avg could be 25 and the max could be 35.  This was the case for the columns aggregating the width for the row and/or table.

Sorry I do not have a better response, I am not sure exactly sure how I can help you.  If you can post some additional information, I can see if I can clarify this any further.

Thank you,
Jeremy Kadlec


Monday, January 31, 2011 - 1:34:55 PM - Jay Back To Top

Should i expect feedback or ?

Thanks

Jay


Monday, January 31, 2011 - 8:49:07 AM - Jeremy Kadlec Back To Top

Jay,

Thank you for the post.

Thank you,
Jeremy Kadlec


Saturday, January 29, 2011 - 7:58:21 AM - Jay Back To Top

Hello:
 
I am sorry but still it is not clear. if column is varchar(500) then it is not going to store value 501 and the remaing will be truncated unless it is LOB where 1st row will store 16byte point for row-over-flow. but over here we are not talking about row over flow.
 
row size matter for IO based upon Index present or not and Index is nothing but the name of the column for e.g. if table CUSTOMER as column - FirstNAME of Varchar(255) and i create Non-Clustered Index on that column where the Index Key will be "FirstNAME".
 
So the point over here i am making is the table width which is 255 assuming one table with one column.
 
Thanks
Regards
Jay


Friday, January 28, 2011 - 8:36:42 PM - Jeremy Kadlec Back To Top

Jay,

Thank you for the post and feedback.  Sorry the tip is confusing to you.  That was not the intention.  Let's see if I can clarify things any for you.

The tip is intended to determine the width of a table based on the data types for its columns vs, the actual amount of data that is being stored per row, which is an aggregate of the columns.  When people talk about "wide tables" it is in reference to the amount of storage for a row and\or a large number of columns per table. 

For tables with a 100+ columns with variable data types this may or may not be a situation where the physical row widths are exceeded and row overflows come into play.  The queries in the tip were intended to give you a sense of how wide the rows for the table are independent of the amount of storage the table is actually designed to store, which is based on the column definitions.  

For example, a table can have 100 columns.  One column has an int data type and the remainder are a varchar(1000).  At the surface, you would assume the table is grossly denormalized and the physical row must be exceeded, so this is causing all sorts of performance problems.  In the situation I was in, a design change was needed, but the actual lengths of the rows were not exceeding the physical row limits.  There was some education needed to help the team (my customer) understand how to design a database and select the correct data types.  Unfortunately, this sort of design with tables with 100+ columns was the norm in 1 application and I worked with the Development Team to try to improve upon the design.

In terms of the BOL article you referenced, I think this is indexing related.  It is a good set of information that does reference indexing best practices (width, number, duplication, data distribution, etc.), but I did not see how it related to calculating table widths.

I hope this helps.  If anything is unclear, please post back your questions.

Thank you,
Jeremy Kadlec


Friday, January 28, 2011 - 6:25:51 PM - Jay Back To Top

Jeremy:

This article is good but bit confusing. artcile is about Table width which is more in respect to Table Column and the size of the column in restpect to datatype that column has.

Where for e.g. min, max size that DMV physical Stats shows is the record size. To my knowledge Table Width is more depends on how many columns it has & what kind of datatype & not to mentioned any if BLOB where row_over_flow might happen.

This link is more towards table width. please correct me if i am wrong which will help me to understand concept in better way. http://msdn.microsoft.com/en-us/library/ms191195.aspx

Thanks

Jay


Learn more about SQL Server tools