SQL Server Extended Events using collect_database

By: Aaron Bertrand | Comments | Related: > Extended Events

Problem

We recently converted our schema change collection process from DDL triggers in every SQL Server database to a single Extended Events session. This was done both to simplify management and reduce security complexity. The DDL trigger had to write its output on behalf of the person running the DDL to a central log table. Now the SQL Server Extended Events session is capturing the data and its ability to write to any target does not hinge on the rights of the caller.

This all sounds fine, but we came upon a bit of an issue with a specific edge case. In this tip, I will explain the issue and the factors that went into deciding how we handled it.

Solution

The original SQL Server DDL trigger looked something like this; to keep it simple, let's pretend it just captures ALTER PROCEDURE:

CREATE TRIGGER TheDDLTrigger
ON DATABASE
FOR ALTER_PROCEDURE
AS
BEGIN
  DECLARE @x xml = EVENTDATA();
  INSERT DBALogDB.dbo.DDLEventLog(...) 
    SELECT DB_NAME() --, <ugly parsing I won't bother repeating>;
END

To replicate this using Extended Events, I created the following session (again simplified for brevity):

CREATE EVENT SESSION TheDDLSession 
ON SERVER 
ADD EVENT sqlserver.object_altered
(
    ACTION(sqlserver.server_principal_name, sqlserver.sql_text, ...)
    WHERE ([database_id] <> (2))
)
ADD TARGET ...;

Now, in order to capture this data to DDLEventLog, we have a background process running on a schedule that extracts all the new data from the latest .xel files and loads it into the table. (There's also a process that moves files we know we've consumed and deletes them 90 days later to keep the loading job linear.)

This revealed an edge case where it is possible to lose one aspect of the data: database name. When originally building the session, I thought database_id would be enough because the loading process could always look up the name from there. But this is brittle. Imagine the case where a user drops a table, drops the database, then the loading process pulls data from the session. That data about dropping the table contains just a database_id that no longer exists.

If we needed to audit, we could piece it together manually from other data.

We could fix this by changing the session slightly, adding collect_database_name:

...
ADD EVENT sqlserver.object_altered
(
    SET collect_database_name = (1)
    ACTION(...

We are collecting and storing the name of the database for every single event where, probably close to 100% of the time, the database_id value is both sufficient and valid.

The conundrum: should we care about the edge case?

This scenario is highly unlikely and even if it does happen, do we really care about DDL events that happened just before the database was dropped? Probably not.

Another scenario we didn't discuss, but that could be relevant to the same timing issue, is when a user drops a table then renames the database. If the session is collecting the database name, the audit data would reflect the name at the time of the drop. If, instead, the name isn't looked up until the loading process runs, then it would reflect the new name. Which one is correct? Is there an easy answer?

On the other hand, the extra data that would be captured to cover the edge case is negligible. So, other than introducing an additional deployment (and the testing around that), not much would be lost.

In the end we opted to leave the session as is, deciding that it wasn't worth deploying a fix. The main reason is that there is near-zero likelihood of either scenario happening. Your scenario may lean the other way, so I thought it would be interesting to consider the tradeoffs.

Next Steps

If you are collecting events and relying on database_id, you might want to consider adding collect_database_name to cover the edge cases I've described here. If you aren't collecting DDL events this way yet, make sure you consider these scenarios to help you decide whether you should collect the database name or not.

See these related tips and other resources relating to DDL triggers and Extended Events:

About the author

Aaron Bertrand (@AaronBertrand) is a passionate technologist with industry experience dating back to Classic ASP and SQL Server 6.5. He is editor-in-chief of the performance-related blog, SQLPerformance.com, and also blogs at sqlblog.org.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips

Using collect_database_name with SQL Server Extended Events

Problem

Solution

The conundrum: should we care about the edge case?

Next Steps

About the author

Comments For This Article