Learn more about SQL Server tools

mssqltips logo
giveaway
 

Tutorials          DBA          Dev          BI          Career          Categories          Webcasts          Whitepapers          Today's Tip          Join

Tutorials      DBA      Dev      BI      Categories      Webcasts

DBA    Dev    BI    Categories

 

Reporting on Sharepoint lists from SQL Reporting Services


By:   |   Read Comments (6)   |   Related Tips: > Sharepoint

Problem
Sharepoint is a great collaborative tool, but its internal reporting capability is limited. Wouldn't it be so powerful if you could report against one or more Sharepoint lists from Microsoft's Reporting Services? Or how about just having the ability to integrate the data into some other SQL application and/or data warehouse? There is very little documentation on the internal Sharepoint databases, so how can you gather data from Sharepoint and make meaningful reports using Reporting Services?

Disclaimer
This procedure is not for the faint of heart. When reporting directly against a production server, there could be performance implications. I highly recommend doing all testing in a non-production environment. I also recommend you check with your Sharepoint administrator and/or SQL DBA before doing any of this. (Hopefully both of those are you ;-). Furthermore, this procedure will defeat any internal Sharepoint security configurations, so you should be careful to consider what data your are letting out of the bottle. Check with business owner's before going live with anything.

This procedure works with Sharepoint services (WSS 2.0 or 3.0). I have no first hand knowledge about any of the full Portal Server products, but I imagine it is similar since the Portal product is based upon WSS.

Enough on the disclaimer. Let's proceed...

Solution

Scope:

The scope of this solution is limited to list data. This would include event/calendars, announcements, custom lists, issues, links, contacts. Document and picture libraries, discussions, wikis, and the like will not be discussed, but it's possible the techniques discussed may help with those as well.

Additionally, as mentioned in the disclaimer, we're not going to be able to get at the security information relating to the list, though we will be able to display the "Created By" and "Modified By" fields.

Materials needed:

  • Read only access to the Sharepoint database. This can be WSS2 or WSS3.
  • Another SQL database as a working database for reporting.
  • Plenty of coffee...and lot's of pizza.

Let's start with getting the data into a place where we can comfortably work on it. Create a new database on a test server.

Export the following tables and/or views from your production Sharepoint database to your new test database.

  • Webs
  • UserInfo
  • Lists *
  • UserData *

*IMPORTANT: In the case of WSS 3.0, these (with an asterisk *) are Views, not Tables.

In any event, get all these objects into your test (or working) database as tables.

These four tables contain all the information you need to report on your Sharepoint content. Below is a brief description of these tables/views.

Webs - This is simply a table listing all your Sharepoint "sites" and "sub-sites". (Not site collections.) The important fields are:

  • Id
  • FullURL
  • Title.

UserInfo - What you would expect. Pertinent fields are:

  • tp_ID
  • tp_Login
  • tp_Title
  • tp_email.

Lists - This is a table of all the Sharepoint lists in your Sharepoint database. All of them! Needed fields are:

  • tp_WebId
  • tp_ID
  • tp_Title
  • tp_Created
  • tp_Modified
  • tp_Fields

UserData - This is where all the content is stored. All of it! However...not so fast, buster. This is where the hard work begins. Because the content is stored in generically named fields.

What do I mean? I mean, let's say you have a list. Let's say it's a Sharepoint contact list. You will have fields like, FirstName, LastName, Business Phone, Cell Phone, Address, etc. However, if you peak, you will see that the actual database field names are more like nvarchar1, nvarchar2, nvarchar3, nvarchar4, ntext1, ntext2, datetime1, datetime2, int1, int2, int3, etc.

Now your are exploring the limits of Sharepoint. You will see exactly how many datetime fields you can have in a single list, just by looking at the number of datetime fields. (WSS3 extends the capability a bit with a concept called site columns...but that's outside the scope of this discussion as well.)

Mapping the field names

So how to map the named list fields with the generic database fields? Well as it turns out, all this information is stored in a field called tp_fields in the Lists table. But it's an ugly map.

Below is the contents of the tp_fields field for a list I created to illustrate this example. (Don't even try to read this yet. Skip down below it.

<FieldRef Name="ContentTypeId"/><FieldRef Name="Title" ColName="nvarchar1"/><FieldRef Name="_ModerationComments" ColName="ntext1"/><FieldRef Name="File_x0020_Type" ColName="nvarchar2"/><Field Type="Text" DisplayName="Single_line_of_text" Required="FALSE" MaxLength="255" ID="{61337d5c-35eb-4af3-9593-7611a617290c}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Single_line_of_text" Name="Single_line_of_text" ColName="nvarchar3" RowOrdinal="0"/><Field Type="Text" DisplayName="Single_line_of_text_two" Required="FALSE" MaxLength="255" ID="{a7da379f-7daf-4fba-b99d-8d9b2f17ade3}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Single_line_of_text_2" Name="Single_line_of_text_2" ColName="nvarchar4" RowOrdinal="0" Version="1"><Default/></Field><Field Type="Note" DisplayName="Multiple_lines_of_text_PlainText" Required="FALSE" NumLines="6" RichText="FALSE" Sortable="FALSE" ID="{5450f7ac-53cd-4760-b440-b8539750bdd2}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Multiple_lines_of_text_PlainText" Name="Multiple_lines_of_text_PlainText" ColName="ntext2" RowOrdinal="0"/><Field Type="Note" DisplayName="Multiple_lines_of_text_RichText" Required="FALSE" NumLines="6" RichText="TRUE" RichTextMode="Compatible" Sortable="FALSE" ID="{cdf42c72-2b38-4e45-bc03-811432cb3f6a}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Multiple_lines_of_text_RichText" Name="Multiple_lines_of_text_RichText" ColName="ntext3" RowOrdinal="0"/><Field Type="Note" DisplayName="Multiple_lines_of_text_EnhanceRichText" Required="FALSE" NumLines="6" RichText="TRUE" RichTextMode="FullHtml" IsolateStyles="TRUE" Sortable="FALSE" ID="{875cfe1f-df40-4741-8a70-b0df287a30e7}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Multiple_lines_of_text_EnhanceRi" Name="Multiple_lines_of_text_EnhanceRi" ColName="ntext4" RowOrdinal="0" Version="1"/><Field Type="Choice" DisplayName="Choice_dropdown" Required="FALSE" Format="Dropdown" FillInChoice="FALSE" ID="{59f15fae-20ee-45bc-aede-fb560e44a9cd}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Choice_dropdown" Name="Choice_dropdown" ColName="nvarchar5" RowOrdinal="0"><Default>Red</Default><CHOICES><CHOICE>Red</CHOICE><CHOICE>Ye llow</CHOICE><CHOICE>Blue</CHOICE></CHOICES></Field><Field Type="Number" DisplayName="Number_column" Required="FALSE" ID="{97935b88-1752-4f87-a571-34d42fe08b5e}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Number_column" Name="Number_column" ColName="float1" RowOrdinal="0"/><Field Type="Currency" DisplayName="Currency_column" Required="FALSE" LCID="1033" ID="{a2803d64-a1b5-4986-a651-fb1fdbe37532}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Currency_column" Name="Currency_column" ColName="float2" RowOrdinal="0"/><Field Type="DateTime" DisplayName="Date_and_Time" Required="FALSE" Format="DateOnly" ID="{3c394c06-51cb-48a6-8eaf-0c373e1c320f}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Date_and_Time" Name="Date_and_Time" ColName="datetime1" RowOrdinal="0"/><Field Type="Boolean" DisplayName="Yes_No" ID="{ffc5b6c2-d3ee-4a6d-9b4a-db938a1f8e7d}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Yes_No" Name="Yes_No" ColName="bit1" RowOrdinal="0"><Default>1</Default></Field><Field Type="User" DisplayName="Person_or_Group" List="UserInfo" Required="FALSE" ShowField="ImnName" UserSelectionMode="PeopleAndGroups" UserSelectionScope="0" ID="{feae87e2-dfb0-45ba-85d6-4d782d064987}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Person_or_Group" Name="Person_or_Group" ColName="int1" RowOrdinal="0"/><Field Type="URL" DisplayName="Hyperlink" Required="FALSE" Format="Hyperlink" ID="{5db05486-d8ff-42bc-82a8-fc6bc3228d88}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Hyperlink" Name="Hyperlink" ColName="nvarchar6" RowOrdinal="0" ColName2="nvarchar7" RowOrdinal2="0"/>

Whatever this stuff is, if you are good with Perl you might be able to parse it. But that's also outside the scope of this article. We're going to do it the dirty way.

The first thing to do is make it somewhat readable. I look for statements that begin with "<FieldRef Name=" or "<Field Type=" and put a line break in order to make the long spaghetti code readable. These mark the beginning of each field. Then I will mark in bold the mappings and make notations where I wanted to make comments. One more thing. My Sharepoint column names are things like "Single_line_of_text", "Multiple_lines_of_text" and "date_and_time", because I wanted to see the mapping of what Sharepoint calls 'field types' to actual database field types. These names are incidental to this discussion and was only done as a matter of interest to me. Hopefully, your Sharepoint field names will be a bit more meaningful.

Here comes the mapping...

<FieldRef Name="ContentTypeId"/><FieldRef Name="Title" ColName="nvarchar1"/>

<FieldRef Name="_ModerationComments" ColName="ntext1"/>
I don't really care about this one, but it's here for the sake of completion.

<FieldRef Name="File_x0020_Type" ColName="nvarchar2"/>
Don't care about this either.

<Field Type="Text" DisplayName="Single_line_of_text" Required="FALSE" MaxLength="255" ID="{61337d5c-35eb-4af3-9593-7611a617290c}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Single_line_of_text" Name="Single_line_of_text" ColName="nvarchar3" RowOrdinal="0"/>

<Field Type="Text" DisplayName="Single_line_of_text_two" Required="FALSE" MaxLength="255" ID="{a7da379f-7daf-4fba-b99d-8d9b2f17ade3}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Single_line_of_text_2" Name="Single_line_of_text_2" ColName="nvarchar4" RowOrdinal="0" Version="1"><Default/></Field>
Notice that the DisplayName no longer maps to StaticName or Name. That's because I changed the field name in Sharepoint, but Sharepoint keeps a history of the original name. A watchout is to look for DisplayName...not just Name.

<Field Type="Note" DisplayName="Multiple_lines_of_text_PlainText" Required="FALSE" NumLines="6" RichText="FALSE" Sortable="FALSE" ID="{5450f7ac-53cd-4760-b440-b8539750bdd2}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Multiple_lines_of_text_PlainText" Name="Multiple_lines_of_text_PlainText" ColName="ntext2" RowOrdinal="0"/>

<Field Type="Note" DisplayName="Multiple_lines_of_text_RichText" Required="FALSE" NumLines="6" RichText="TRUE" RichTextMode="Compatible" Sortable="FALSE" ID="{cdf42c72-2b38-4e45-bc03-811432cb3f6a}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Multiple_lines_of_text_RichText" Name="Multiple_lines_of_text_RichText" ColName="ntext3" RowOrdinal="0"/>

<Field Type="Note" DisplayName="Multiple_lines_of_text_EnhanceRichText" Required="FALSE" NumLines="6" RichText="TRUE" RichTextMode="FullHtml" IsolateStyles="TRUE" Sortable="FALSE" ID="{875cfe1f-df40-4741-8a70-b0df287a30e7}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Multiple_lines_of_text_EnhanceRi" Name="Multiple_lines_of_text_EnhanceRi" ColName="ntext4" RowOrdinal="0" Version="1"/>
Note that all types of text that is multple lines map to the ntext database fields.

<Field Type="Choice" DisplayName="Choice_dropdown" Required="FALSE" Format="Dropdown" FillInChoice="FALSE" ID="{59f15fae-20ee-45bc-aede-fb560e44a9cd}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Choice_dropdown" Name="Choice_dropdown" ColName="nvarchar5" RowOrdinal="0"><Default>Red</Default><CHOICES><CHOICE>Red</CHOICE><CHOICE>Ye llow</CHOICE><CHOICE>Blue</CHOICE></CHOICES></Field>
Note the choices for the dropdown are stored here.

<Field Type="Number" DisplayName="Number_column" Required="FALSE" ID="{97935b88-1752-4f87-a571-34d42fe08b5e}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Number_column" Name="Number_column" ColName="float1" RowOrdinal="0"/>

<Field Type="Currency" DisplayName="Currency_column" Required="FALSE" LCID="1033" ID="{a2803d64-a1b5-4986-a651-fb1fdbe37532}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Currency_column" Name="Currency_column" ColName="float2" RowOrdinal="0"/>

<Field Type="DateTime" DisplayName="Date_and_Time" Required="FALSE" Format="DateOnly" ID="{3c394c06-51cb-48a6-8eaf-0c373e1c320f}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Date_and_Time" Name="Date_and_Time" ColName="datetime1" RowOrdinal="0"/>

<Field Type="Boolean" DisplayName="Yes_No" ID="{ffc5b6c2-d3ee-4a6d-9b4a-db938a1f8e7d}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Yes_No" Name="Yes_No" ColName="bit1" RowOrdinal="0"><Default>1</Default></Field>

<Field Type="User" DisplayName="Person_or_Group" List="UserInfo" Required="FALSE" ShowField="ImnName" UserSelectionMode="PeopleAndGroups" UserSelectionScope="0" ID="{feae87e2-dfb0-45ba-85d6-4d782d064987}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Person_or_Group" Name="Person_or_Group" ColName="int1" RowOrdinal="0"/>

<Field Type="URL" DisplayName="Hyperlink" Required="FALSE" Format="Hyperlink" ID="{5db05486-d8ff-42bc-82a8-fc6bc3228d88}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Hyperlink" Name="Hyperlink" ColName="nvarchar6" RowOrdinal="0" ColName2="nvarchar7" RowOrdinal2="0"/>

So much for the mapping. You will use your map to create field aliases in your select statement.

Bringing it home

No all we have to do is link some tables and we're home-free. You can create a view and base your reports on the view of your Sharepoint data. Here's how the tables link up:

  • To get the title of your list, link UserData.tp_ListId to Lists.tp_ID
  • To get the site information (FullURL & Title) link Lists.tp_WebId to Webs.Id
  • To get user information link UserData.tp_Author and UserDate.tp_Editor to UserInfo.tp_Id. You will have to do this with twin aliased versions of UserInfo.

You might end up with something that looks like this:

SELECT USERDATA.NVARCHAR1        AS TITLE,
       USERDATA.TP_AUTHOR        AS AUTHORID,
       USERINFO_EDITORS.TP_TITLE AS EDITORNAME,
       USERINFO_AUTHORS.TP_EMAIL AS AUTHOREMAIL,
       USERDATA.TP_EDITOR        AS EDITORID,
       USERINFO_AUTHORS.TP_TITLE AS AUTHORNAME,
       USERDATA.TP_MODIFIED,
       USERDATA.TP_CREATED,
       USERDATA.NVARCHAR2        AS ADDRESS,
       USERDATA.NVARCHAR3        AS PHONE,
       USERDATA.NTEXT1           AS COMMENTS,
       USERDATA.NTEXT2           AS NOTES,
       USERDATA.NTEXT3           AS [STATUS REPORT],
       LISTS.TP_TITLE            AS LIST_TITLE,
       USERINFO_EDITORS.TP_LOGIN AS USERLOGIN,
       WEBS.FULLURL              AS SITEURL,
       WEBS.TITLE                AS SITETITLE
FROM   USERDATA
       INNER JOIN LISTS
         ON USERDATA.TP_LISTID = LISTS.TP_ID
       INNER JOIN USERINFO AS USERINFO_AUTHORS
         ON USERDATA.TP_AUTHOR = USERINFO_AUTHORS.TP_ID
       INNER JOIN USERINFO AS USERINFO_EDITORS
         ON USERDATA.TP_EDITOR = USERINFO_EDITORS.TP_ID
       INNER JOIN WEBS
         ON LISTS.TP_WEBID = WEBS.ID
WHERE  (LISTS.TP_ID = 'the-id-of-your-list')



To find the id of your list for the WHERE part of the statement above, use the following query:

SELECT LISTS.TP_TITLE AS LIST_TITLE,
       WEBS.FULLURL   AS SITEURL,
       WEBS.TITLE     AS SITETITLE,
       LISTS.TP_ID
FROM   LISTS
       INNER JOIN WEBS
         ON LISTS.TP_WEBID = WEBS.ID
WHERE  (LISTS.TP_TITLE = 'name-of-your-list')

Check the FullUrl in case there are duplicates.

Next Steps

  • Sharepoint is a great collaborative tool. Now you can make it even better using the robust reporting options in Microsoft SQL Reporting Services. Just think of what you can create!
  • Take a look at these other Reporting Services tips


Last Update:






About the author
MSSQLTips author Rob Fisch Rob Fisch has worked with SQL Server since version 6.5 as a dba, developer, report writer and data warehouse designer.

View all my tips





More SQL Server Solutions











Post a comment or let the author know this tip helped.

All comments are reviewed, so stay on subject or we may delete your comment. Note: your email address is not published. Required fields are marked with an asterisk (*).

*Name    *Email    Notify for updates 


SQL tips:

*Enter Code refresh code     



Wednesday, June 12, 2013 - 10:54:48 AM - Ernesto Back To Top

Does microsoft support an environment where you have SQL Queries running directly to the content Data Base??? Even just the SELECT * FROM??


Thursday, October 14, 2010 - 4:15:43 PM - rlively Back To Top
I think it's up to everyone to determine the method that works best for them, and I wasn't trying to change anyone's opinion. I just wanted to post the additional information regarding Microsoft's official stance (since it is a Microsoft product) to make sure everyone is aware.


Tuesday, October 12, 2010 - 6:02:35 PM - PhilB Back To Top
rlively, I do understand were Microsoft is coming from on the reading.

For me, I have been reading from the content database for years beginning with SharePoint Services 2 because it made my integration with other applications so much easier and cost effective to create and maintain for just reading databases.  If you are doing a nolock read then I don't see how the read can break a content database.  Also, it has never been an issue for my applications where I was worried about missing an update to data that SharePoint might have been in the middle of.  I think we just don't keep that kind of data in SharePoint.

I was actually thinking about not doing that anymore, especially when I saw what SharePoint 2010 looked like, but then I started using SPViews and they really seem to know the schema well.  I don't know.  I prefer this method for the simplicity of it.

Phil


Monday, August 30, 2010 - 12:05:16 AM - Adrian Back To Top
I've been using the i-PMO Data Miner 2009 for SharePoint. Makes the entire process vastly simpler, is an approach supported by microsoft, easily configured by business users and is as fast as SQL will go
http://www.i-pmo.com.au/Products/iPMODataMiner2009forSharePoint.aspx

Adrian


Monday, August 02, 2010 - 1:37:55 PM - rlively Back To Top
For full disclosure, please mention Microsoft's official policy regarding direct SharePoint content database reads:
10 Best Practices For Building SharePoint Solutions

http://msdn.microsoft.com/en-us/magazine/dd458798.aspx

Microsoft is very clear in its do***entation that you should not make calls directly to the SharePoint content and configuration databases.  Microsoft can change the underlying schema in these databases, and there can be more than one content database per Web application. Therefore, seemingly benign direct query operations can lead to brittle solutions.

SharePoint Database Access

http://msdn.microsoft.com/en-us/library/bb861829(office.12).aspx

Important: This type of customization is not supported.

the process for acquiring that data should be through the built-in SharePoint object model, and not by using any type of query to the database.

If a server component requires information from the database, it must get that data by using the appropriate items in the SharePoint object model, and not by trying to get the items from the data structures in the database through some query mechanism.

Support for changes to the databases that are used by Office server products and by Windows SharePoint Services

http://support.microsoft.com/kb/841057/en-us

In the event that unsupported read operations are discovered as part of a support call, the database will be considered to be in an unsupported state. To return the database to a supported state, the customer must cease all unsupported read activities, regardless of whether they are being performed manually or programmatically, and may need to a database restoration from the last known good backup that did not include unsupported read activities.

 http://blogs.msdn.com/b/mikefitz/archive/2005/04/01/404802.aspx

 “… even reading from the database can indeed have negative effects on performance and stability.  Just to name two such effects, I’ll paraphrase his comments:

1.       Deadlocks: direct SELECT statements against the database take shared read locks at the default transaction level, and can violate our lock hierarchy, leading to seemingly random failures.

2.       Incorrect data: if you avoid taking locks to avoid a deadlock, you’ll very likely observe the database in an intermediate state and retrieve incorrect data.

Microsoft won’t be able to help you … you will run the risk of breaking or degrading the performance/stability of technology we built to be fast and robust.  The odds of a database change in our next Service Pack aren’t small.  It’s that you’ll on your own if anything goes wrong.  Consider this carefully.  Our object model is a contract with you.  We’ve gone to a lot of trouble to ensure that using the OM results in stable and performant database interactions.  If anything’s amiss, we’re honor-bound to fix it.  Please, please use it.”



Monday, May 10, 2010 - 2:49:50 PM - PhilB Back To Top

I have been using SPViews from Sharpest Tools Software for a little while now for my integration and a little reporting.  Works well for me.


Learn more about SQL Server tools