Basic SQL Server XML Querying

By:   |   Comments (18)   |   Related: > XML


Problem

We have XML data in a SQL Server database that we want to search efficiently, but are not sure of the best way to get the results we want. We need to find specific values in certain XML nodes- this will involve finding node values that match given substrings.  Check out this tip to learn more.

Solution

We will be working with the AdventureWorks2012 sample database - querying the Demographics XML column in the Person.Person table. To see what the Demographics XML data looks like, let's pull a sample Demographics field by selecting a random record:

--ordering by NEWID() function randomizes the sort order
SELECT TOP 1 Demographics 
FROM AdventureWorks2012.Person.Person
ORDER BY NEWID() 
GO

Basic XML Querying in SQL Server

If we expand the XML contents of this field, we see:

querying the Demographics XML column in the Person.Person table

It's evident that we are dealing with typed XML, since we can see a namespace declaration (xmlns="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/IndividualSurvey") inside the root node. This means that the XML data is associated with an XML schema that enforces data type and validation rules. We want to get a count of the surveys where the individual's education is listed as ‘Bachelors'. How should we do this? We can try a rather primitive method that finds the existence of substrings anywhere in the XML column by using the CHARINDEX function:

--filtering XML data using CHARINDEX function
SELECT COUNT(1)
FROM Person.Person
WHERE CHARINDEX('Bachelors',CONVERT(VARCHAR(MAX),Demographics),1) > 0

Querying with XQuery

This does return a count of the number of times the word ‘Bachelors' appears. However, it's quite clumsy - and worse yet, doesn't guarantee that the text ‘Bachelors' is a real XML value in the table - it could also be part of a node name. A much cleaner solution is to use XQuery.

Querying SQL Server Data with XQuery

XQuery is a language designed for querying XML data, and is not proprietary to SQL Server - it is used by many relational database management systems (RDBMS). A simple implementation of an XQuery solution is as follows:

--filtering XML data using XQuery
;WITH XMLNAMESPACES 
('http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/IndividualSurvey' AS ns)
SELECT COUNT(1)
FROM Person.Person
WHERE Demographics.exist('(/ns:IndividualSurvey/ns:Education[.="Bachelors"])')=1
GO

This may look a little complicated, but really isn't too hard to decipher. The WITH XMLNAMESPACES statement is needed to explicitly declare the namespace for in order to reference any XML node element or attribute, which we did in the WHERE clause (the Education node element). Another way to declare the namespace is to put the declaration in-line with the actual XQuery method that is being used:

--filtering XML data using XQuery with in-line namespace declaration
 SELECT COUNT(1)
FROM Person.Person
WHERE Demographics.exist('declare namespace ns=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/IndividualSurvey"; 
(/ns:IndividualSurvey/ns:Education[.="Bachelors"])')=1
GO

In both examples, we declared a namespace and aliased it as ‘ns', which is used as a prefix for each path step or node reference. We listed the node path steps from root node (IndividualSurvey) to destination node (Education).

Using the SQL Server XQuery exist() method

The exist() method is used in our query as the WHERE criterion to filter the XML data, finding XML records where the Education node's value is equal to the text ‘Bachelors'. The exist() method returns a bit value of 1 if the method finds a non-empty node whose element or attribute value matches the given criteria. Upon running the query, however, we get a record count of zero:

4

What happened? We've assumed too much - that the value we are looking for in the Education node will exactly match the text ‘Bachelors'. If we look at the sample Demographics XML fragment closely, we see that there is a space after the word. So, if we modify our query and run it again, we get the following:

--filtering XML data using XQuery - adding a space to the string
;WITH XMLNAMESPACES 
('http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/IndividualSurvey' AS ns)
SELECT COUNT(1)
FROM Person.Person
WHERE Demographics.exist('(/ns:IndividualSurvey/ns:Education[.="Bachelors "])')=1
GO

Using the contains() function

Using the SQL Server XQuery contains() function

How can we prevent this oversight from happening again? XQuery in SQL Server makes use of a contains() function that finds a string as a whole substring in a node value. The contains() function requires two arguments:

    arg1 - String value to test
    arg2 - Substring to look for

Using the contains() function, we don't have to be concerned with matching the string exactly:

--filtering XML data using the XQuery contains() function
;WITH XMLNAMESPACES 
('http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/IndividualSurvey' AS ns)
SELECT COUNT(1)
FROM Person.Person
WHERE Demographics.exist('(/ns:IndividualSurvey/ns:Education[contains(.,"Bachelors")])')=1
GO

XQuery in SQL Server makes use of a contains() function

We can easily expand our filter by additionally specifying that only records where the Gender node contains ‘F' are returned:

--filtering XML data using the XQuery contains() function in multiple WHERE criteria
;WITH XMLNAMESPACES 
('http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/IndividualSurvey' AS ns)
SELECT COUNT(1)
FROM Person.Person
WHERE Demographics.exist('(/ns:IndividualSurvey/ns:Education[contains(.,"Bachelors")])')=1
AND Demographics.exist('(/ns:IndividualSurvey/ns:Gender[contains(.,"F")])')=1
GO

Using the value() method

Our contains() function instances use what is called a self node axis abbreviation (‘.') as their first argument. This specifies that we want to search in the current node context (Gender or Education). The second argument is for the actual string we are searching for (‘Bachelors' or ‘F').

Using the SQL Server XQuery value() method

We assume that our query is returning the records that meet our criteria, but what if we want to see the actual values to verify this? Another XQuery method we can use for this is value(). The syntax for the method is as follows:

    value (XQuery, SQLType)

We'll employ this method to return node values for the Education and Gender nodes (only the first 3 records), along with the non-XML LastName field:

;WITH XMLNAMESPACES 
('http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/IndividualSurvey' AS ns)
SELECT TOP 3 
LastName,
Demographics.value('(/ns:IndividualSurvey/ns:Education)[1]', 'varchar(50)') AS Education,
Demographics.value('(/ns:IndividualSurvey/ns:Gender)[1]', 'char(1)') AS Gender
FROM Person.Person
WHERE Demographics.exist('(/ns:IndividualSurvey/ns:Education[contains(.,"Bachelors")])')=1
 AND Demographics.exist('(/ns:IndividualSurvey/ns:Gender[contains(.,"F")])')=1
GO

We've looked briefly at some basic ways to query XML data using XQuery

The value() method requires an XQuery path argument like the exist() method, but also needs a SQL Server data type argument. We are using ‘varchar(50)' and ‘char(1)' as the SQL Server data types for the Education and Gender node values.

Singleton Requirement

The value() method also enforces a singleton rule - requiring that the node element or attribute instance in a sequence be explicitly specified, even if there is only one. For example, there is only one Education node in our XML fragment, but we still have to designate a singleton by placing the ‘[1]' at the end of the value() method, thus indicating that we want the first Education node.

Summary

We've looked briefly at some basic ways to query XML data using XQuery. We introduced the exist() method as an effective XQuery method to use in the WHERE clause. We also experimented with the XQuery contains()function, using it to count XML records having a precise node string value match. Finally, we used a very common XQuery method - the value() method - to return XML values.

Next Steps
  • We performed XQuery method operations on some very simple XML fragments. We had no need to iterate multiple nodes to find matching values, for example. Had our XML data contained multiple Education nodes, our XQuery operations would not have been able to easily check every Education node for each record. We could have specified node instances using the singleton designator, but that would not have systematically searched every Education node in an XML fragment. For this and other advanced XML querying operations, the XQuery FLWOR statement provides effective solutions. FLWOR stands for FOR, LET, WHERE, ORDER BY, and RETURN. FLWOR is an important part of XQuery, and will be a necessary item to add to your XML querying skills.


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Seth Delconte Seth Delconte is a Microsoft Certified Trainer and Database Administrator, and works primarily on SQL Server development.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips



Comments For This Article




Sunday, January 20, 2019 - 5:42:48 PM - JL Back To Top (78825)

'Still looking for a good teacher on this rocket science. All of you assume people know what you know. Microsoft is notorious for this.

What is ;WITH XMLNAMESPACES?

What is Demographics.exist('(/ns:IndividualSurvey/ns:Education?

I need a visual of these query components against the actual XML being queried or even against the schema.

Nevermind. I'll figure it out in a week.


Friday, February 10, 2017 - 1:25:03 PM - David Back To Top (46168)

Great info! Thank you!


Monday, December 5, 2016 - 4:38:02 PM - Saima Back To Top (44901)

 Hi Seth

This article is amazing. Well explained. Thank you. 


Thursday, September 11, 2014 - 9:56:19 AM - John Keter Back To Top (34497)

i have an xml in sql server which has some xml header attached to xml i want to extract only xml part using sql query


Monday, July 28, 2014 - 3:17:55 PM - doupanpan Back To Top (33907)

Thanks Seth,

This article explains XML clearly, I enjoy reading it.

 


Thursday, May 8, 2014 - 4:24:30 PM - Greg Robidoux Back To Top (30682)

Hi Ron S, what version of SQL Server are you using?  Also, can you point out which part of the code did not work for you.

 

Thanks
Greg Robidoux
MSSQLTips Team Leader


Thursday, May 8, 2014 - 4:13:34 PM - Ron S Back To Top (30681)

Much of your code does not work.  The exist (XQuery) requires a string literal arguement and thereby, "ns:" is a literal and hence not found.  Please check your code before publishing or preserving it.  Wasted hours believing your code.


Friday, February 21, 2014 - 1:21:04 AM - ranjit Back To Top (29522)

Excellent Article Seth..Keep going


Monday, August 19, 2013 - 4:18:41 AM - Mujeeb Back To Top (26375)

Hi ,

I need a sql query to search a record of XML data type which has a particular value for a tag in the xml.

we have rows with different columns , one of the column is of XML data type.

i want only my record which contains a particular value in the XML


Tuesday, May 7, 2013 - 10:49:29 AM - Seth Delconte Back To Top (23763)

@Calvin - thanks!  This article shows some examples using both the nodes() and query() methods.  Hope it helps!

https://www.simple-talk.com/sql/database-administration/ad-hoc-xml-file-querying/


Wednesday, April 24, 2013 - 2:24:33 PM - Jeff Moden Back To Top (23544)

Nice article Seth. Very well done.

Not directed at Seth's article, I'm totally amazed that people would store such grossly denormalized data as XML in a table especially data like that found in AdventureWorks.  It's not hierarchical and XML provides absolutely no advantage in this particular case.  The data should be in separate columns in the same table.  I'd have to shoot a designer if anyone ever proposed using XML for such a thing.  I'd even settle for an EAV table instead of using XML in this fashion.


Tuesday, April 23, 2013 - 1:15:20 PM - Calvin Brown Back To Top (23526)

 

Thank you Seth. This has  been very helpful to me. I'm still searching for the other 2  methods ,nodes() and query() to gain a firm understanding in order to pass 70-433.


Friday, March 22, 2013 - 9:36:02 AM - Khurram Cheema Back To Top (22961)

Excellent, Thanks Seth


Monday, February 25, 2013 - 11:46:37 PM - Muhammad Ikram Back To Top (22418)

Nice informative article. God bless you, thanks for sharing knowledge.


Thursday, February 21, 2013 - 11:39:19 AM - Bob Back To Top (22343)

These kinds of posts are great for those of us who want to break into SQL Server DBA world.  I am a DB2 DBA but I don'w work with XML much.  Thank you, Seth.


Wednesday, February 20, 2013 - 10:19:32 AM - Cathy Back To Top (22308)

This is an excellent and very understandable write-up.   Thank you, Seth!


Wednesday, February 20, 2013 - 9:26:26 AM - Seth Delconte Back To Top (22305)

Thanks very much Armando!


Wednesday, February 20, 2013 - 9:13:00 AM - Armando Prato Back To Top (22304)

I enjoyed your previous tip on XML and I enjoyed this one just as much.... thanks!















get free sql tips
agree to terms