![]() |
|
|
|
By: Rob Fisch | Read Comments (5) | Print Rob has worked with SQL Server since version 6.5 as a dba, developer, report writer and data warehouse designer. Related Tips: More |
|
Problem
Sharepoint is a great collaborative tool, but its internal reporting capability is limited. Wouldn't it be so powerful if you could report against one or more Sharepoint lists from Microsoft's Reporting Services? Or how about just having the ability to integrate the data into some other SQL application and/or data warehouse? There is very little documentation on the internal Sharepoint databases, so how can you gather data from Sharepoint and make meaningful reports using Reporting Services?
| Disclaimer This procedure is not for the faint of heart. When reporting directly against a production server, there could be performance implications. I highly recommend doing all testing in a non-production environment. I also recommend you check with your Sharepoint administrator and/or SQL DBA before doing any of this. (Hopefully both of those are you ;-). Furthermore, this procedure will defeat any internal Sharepoint security configurations, so you should be careful to consider what data your are letting out of the bottle. Check with business owner's before going live with anything. This procedure works with Sharepoint services (WSS 2.0 or 3.0). I have no first hand knowledge about any of the full Portal Server products, but I imagine it is similar since the Portal product is based upon WSS. Enough on the disclaimer. Let's proceed... |
Solution
Scope:
The scope of this solution is limited to list data. This would include event/calendars, announcements, custom lists, issues, links, contacts. Document and picture libraries, discussions, wikis, and the like will not be discussed, but it's possible the techniques discussed may help with those as well.
Additionally, as mentioned in the disclaimer, we're not going to be able to get at the security information relating to the list, though we will be able to display the “Created By” and “Modified By” fields.
Materials needed:
Let's start with getting the data into a place where we can comfortably work on it. Create a new database on a test server.
Export the following tables and/or views from your production Sharepoint database to your new test database.
*IMPORTANT: In the case of WSS 3.0, these (with an asterisk *) are Views, not Tables.
In any event, get all these objects into your test (or working) database as tables.
These four tables contain all the information you need to report on your Sharepoint content. Below is a brief description of these tables/views.
Webs – This is simply a table listing all your Sharepoint “sites” and “sub-sites”. (Not site collections.) The important fields are:
UserInfo – What you would expect. Pertinent fields are:
Lists – This is a table of all the Sharepoint lists in your Sharepoint database. All of them! Needed fields are:
UserData – This is where all the content is stored. All of it! However...not so fast, buster. This is where the hard work begins. Because the content is stored in generically named fields.
What do I mean? I mean, let's say you have a list. Let's say it's a Sharepoint contact list. You will have fields like, FirstName, LastName, Business Phone, Cell Phone, Address, etc. However, if you peak, you will see that the actual database field names are more like nvarchar1, nvarchar2, nvarchar3, nvarchar4, ntext1, ntext2, datetime1, datetime2, int1, int2, int3, etc.
Now your are exploring the limits of Sharepoint. You will see exactly how many datetime fields you can have in a single list, just by looking at the number of datetime fields. (WSS3 extends the capability a bit with a concept called site columns...but that's outside the scope of this discussion as well.)
Mapping the field names
So how to map the named list fields with the generic database fields? Well as it turns out, all this information is stored in a field called tp_fields in the Lists table. But it's an ugly map.
Below is the contents of the tp_fields field for a list I created to illustrate this example. (Don't even try to read this yet. Skip down below it.
<FieldRef Name="ContentTypeId"/><FieldRef Name="Title" ColName="nvarchar1"/><FieldRef Name="_ModerationComments" ColName="ntext1"/><FieldRef Name="File_x0020_Type" ColName="nvarchar2"/><Field Type="Text" DisplayName="Single_line_of_text" Required="FALSE" MaxLength="255" ID="{61337d5c-35eb-4af3-9593-7611a617290c}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Single_line_of_text" Name="Single_line_of_text" ColName="nvarchar3" RowOrdinal="0"/><Field Type="Text" DisplayName="Single_line_of_text_two" Required="FALSE" MaxLength="255" ID="{a7da379f-7daf-4fba-b99d-8d9b2f17ade3}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Single_line_of_text_2" Name="Single_line_of_text_2" ColName="nvarchar4" RowOrdinal="0" Version="1"><Default/></Field><Field Type="Note" DisplayName="Multiple_lines_of_text_PlainText" Required="FALSE" NumLines="6" RichText="FALSE" Sortable="FALSE" ID="{5450f7ac-53cd-4760-b440-b8539750bdd2}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Multiple_lines_of_text_PlainText" Name="Multiple_lines_of_text_PlainText" ColName="ntext2" RowOrdinal="0"/><Field Type="Note" DisplayName="Multiple_lines_of_text_RichText" Required="FALSE" NumLines="6" RichText="TRUE" RichTextMode="Compatible" Sortable="FALSE" ID="{cdf42c72-2b38-4e45-bc03-811432cb3f6a}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Multiple_lines_of_text_RichText" Name="Multiple_lines_of_text_RichText" ColName="ntext3" RowOrdinal="0"/><Field Type="Note" DisplayName="Multiple_lines_of_text_EnhanceRichText" Required="FALSE" NumLines="6" RichText="TRUE" RichTextMode="FullHtml" IsolateStyles="TRUE" Sortable="FALSE" ID="{875cfe1f-df40-4741-8a70-b0df287a30e7}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Multiple_lines_of_text_EnhanceRi" Name="Multiple_lines_of_text_EnhanceRi" ColName="ntext4" RowOrdinal="0" Version="1"/><Field Type="Choice" DisplayName="Choice_dropdown" Required="FALSE" Format="Dropdown" FillInChoice="FALSE" ID="{59f15fae-20ee-45bc-aede-fb560e44a9cd}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Choice_dropdown" Name="Choice_dropdown" ColName="nvarchar5" RowOrdinal="0"><Default>Red</Default><CHOICES><CHOICE>Red</CHOICE><CHOICE>Ye llow</CHOICE><CHOICE>Blue</CHOICE></CHOICES></Field><Field Type="Number" DisplayName="Number_column" Required="FALSE" ID="{97935b88-1752-4f87-a571-34d42fe08b5e}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Number_column" Name="Number_column" ColName="float1" RowOrdinal="0"/><Field Type="Currency" DisplayName="Currency_column" Required="FALSE" LCID="1033" ID="{a2803d64-a1b5-4986-a651-fb1fdbe37532}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Currency_column" Name="Currency_column" ColName="float2" RowOrdinal="0"/><Field Type="DateTime" DisplayName="Date_and_Time" Required="FALSE" Format="DateOnly" ID="{3c394c06-51cb-48a6-8eaf-0c373e1c320f}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Date_and_Time" Name="Date_and_Time" ColName="datetime1" RowOrdinal="0"/><Field Type="Boolean" DisplayName="Yes_No" ID="{ffc5b6c2-d3ee-4a6d-9b4a-db938a1f8e7d}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Yes_No" Name="Yes_No" ColName="bit1" RowOrdinal="0"><Default>1</Default></Field><Field Type="User" DisplayName="Person_or_Group" List="UserInfo" Required="FALSE" ShowField="ImnName" UserSelectionMode="PeopleAndGroups" UserSelectionScope="0" ID="{feae87e2-dfb0-45ba-85d6-4d782d064987}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Person_or_Group" Name="Person_or_Group" ColName="int1" RowOrdinal="0"/><Field Type="URL" DisplayName="Hyperlink" Required="FALSE" Format="Hyperlink" ID="{5db05486-d8ff-42bc-82a8-fc6bc3228d88}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Hyperlink" Name="Hyperlink" ColName="nvarchar6" RowOrdinal="0" ColName2="nvarchar7" RowOrdinal2="0"/>
Whatever this stuff is, if you are good with Perl you might be able to parse it. But that's also outside the scope of this article. We're going to do it the dirty way.
The first thing to do is make it somewhat readable. I look for statements that begin with “<FieldRef Name=” or “<Field Type=” and put a line break in order to make the long spaghetti code readable. These mark the beginning of each field. Then I will mark in bold the mappings and make notations where I wanted to make comments. One more thing. My Sharepoint column names are things like “Single_line_of_text”, “Multiple_lines_of_text” and “date_and_time”, because I wanted to see the mapping of what Sharepoint calls 'field types' to actual database field types. These names are incidental to this discussion and was only done as a matter of interest to me. Hopefully, your Sharepoint field names will be a bit more meaningful.
Here comes the mapping...
<FieldRef Name="ContentTypeId"/><FieldRef Name="Title" ColName="nvarchar1"/>
<FieldRef Name="_ModerationComments" ColName="ntext1"/>
I don't really care about this one, but it's here for the sake of completion.
<FieldRef Name="File_x0020_Type" ColName="nvarchar2"/>
Don't care about this either.
<Field Type="Text" DisplayName="Single_line_of_text" Required="FALSE" MaxLength="255" ID="{61337d5c-35eb-4af3-9593-7611a617290c}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Single_line_of_text" Name="Single_line_of_text" ColName="nvarchar3" RowOrdinal="0"/>
<Field Type="Text" DisplayName="Single_line_of_text_two" Required="FALSE" MaxLength="255" ID="{a7da379f-7daf-4fba-b99d-8d9b2f17ade3}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Single_line_of_text_2" Name="Single_line_of_text_2" ColName="nvarchar4" RowOrdinal="0" Version="1"><Default/></Field>
Notice that the DisplayName no longer maps to StaticName or Name. That's because I changed the field name in Sharepoint, but Sharepoint keeps a history of the original name. A watchout is to look for DisplayName...not just Name.
<Field Type="Note" DisplayName="Multiple_lines_of_text_PlainText" Required="FALSE" NumLines="6" RichText="FALSE" Sortable="FALSE" ID="{5450f7ac-53cd-4760-b440-b8539750bdd2}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Multiple_lines_of_text_PlainText" Name="Multiple_lines_of_text_PlainText" ColName="ntext2" RowOrdinal="0"/>
<Field Type="Note" DisplayName="Multiple_lines_of_text_RichText" Required="FALSE" NumLines="6" RichText="TRUE" RichTextMode="Compatible" Sortable="FALSE" ID="{cdf42c72-2b38-4e45-bc03-811432cb3f6a}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Multiple_lines_of_text_RichText" Name="Multiple_lines_of_text_RichText" ColName="ntext3" RowOrdinal="0"/>
<Field Type="Note" DisplayName="Multiple_lines_of_text_EnhanceRichText" Required="FALSE" NumLines="6" RichText="TRUE" RichTextMode="FullHtml" IsolateStyles="TRUE" Sortable="FALSE" ID="{875cfe1f-df40-4741-8a70-b0df287a30e7}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Multiple_lines_of_text_EnhanceRi" Name="Multiple_lines_of_text_EnhanceRi" ColName="ntext4" RowOrdinal="0" Version="1"/>
Note that all types of text that is multple lines map to the ntext database fields.
<Field Type="Choice" DisplayName="Choice_dropdown" Required="FALSE" Format="Dropdown" FillInChoice="FALSE" ID="{59f15fae-20ee-45bc-aede-fb560e44a9cd}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Choice_dropdown" Name="Choice_dropdown" ColName="nvarchar5" RowOrdinal="0"><Default>Red</Default><CHOICES><CHOICE>Red</CHOICE><CHOICE>Ye llow</CHOICE><CHOICE>Blue</CHOICE></CHOICES></Field>
Note the choices for the dropdown are stored here.
<Field Type="Number" DisplayName="Number_column" Required="FALSE" ID="{97935b88-1752-4f87-a571-34d42fe08b5e}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Number_column" Name="Number_column" ColName="float1" RowOrdinal="0"/>
<Field Type="Currency" DisplayName="Currency_column" Required="FALSE" LCID="1033" ID="{a2803d64-a1b5-4986-a651-fb1fdbe37532}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Currency_column" Name="Currency_column" ColName="float2" RowOrdinal="0"/>
<Field Type="DateTime" DisplayName="Date_and_Time" Required="FALSE" Format="DateOnly" ID="{3c394c06-51cb-48a6-8eaf-0c373e1c320f}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Date_and_Time" Name="Date_and_Time" ColName="datetime1" RowOrdinal="0"/>
<Field Type="Boolean" DisplayName="Yes_No" ID="{ffc5b6c2-d3ee-4a6d-9b4a-db938a1f8e7d}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Yes_No" Name="Yes_No" ColName="bit1" RowOrdinal="0"><Default>1</Default></Field>
<Field Type="User" DisplayName="Person_or_Group" List="UserInfo" Required="FALSE" ShowField="ImnName" UserSelectionMode="PeopleAndGroups" UserSelectionScope="0" ID="{feae87e2-dfb0-45ba-85d6-4d782d064987}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Person_or_Group" Name="Person_or_Group" ColName="int1" RowOrdinal="0"/>
<Field Type="URL" DisplayName="Hyperlink" Required="FALSE" Format="Hyperlink" ID="{5db05486-d8ff-42bc-82a8-fc6bc3228d88}" SourceID="{7334222f-e0a9-4cff-9108-ce06ae7b57fa}" StaticName="Hyperlink" Name="Hyperlink" ColName="nvarchar6" RowOrdinal="0" ColName2="nvarchar7" RowOrdinal2="0"/>
So much for the mapping. You will use your map to create field aliases in your select statement.
Bringing it home
No all we have to do is link some tables and we're home-free. You can create a view and base your reports on the view of your Sharepoint data. Here's how the tables link up:
You might end up with something that looks like this:
SELECT USERDATA.NVARCHAR1 AS TITLE,
USERDATA.TP_AUTHOR AS AUTHORID,
USERINFO_EDITORS.TP_TITLE AS EDITORNAME,
USERINFO_AUTHORS.TP_EMAIL AS AUTHOREMAIL,
USERDATA.TP_EDITOR AS EDITORID,
USERINFO_AUTHORS.TP_TITLE AS AUTHORNAME,
USERDATA.TP_MODIFIED,
USERDATA.TP_CREATED,
USERDATA.NVARCHAR2 AS ADDRESS,
USERDATA.NVARCHAR3 AS PHONE,
USERDATA.NTEXT1 AS COMMENTS,
USERDATA.NTEXT2 AS NOTES,
USERDATA.NTEXT3 AS [STATUS REPORT],
LISTS.TP_TITLE AS LIST_TITLE,
USERINFO_EDITORS.TP_LOGIN AS USERLOGIN,
WEBS.FULLURL AS SITEURL,
WEBS.TITLE AS SITETITLE
FROM USERDATA
INNER JOIN LISTS
ON USERDATA.TP_LISTID = LISTS.TP_ID
INNER JOIN USERINFO AS USERINFO_AUTHORS
ON USERDATA.TP_AUTHOR = USERINFO_AUTHORS.TP_ID
INNER JOIN USERINFO AS USERINFO_EDITORS
ON USERDATA.TP_EDITOR = USERINFO_EDITORS.TP_ID
INNER JOIN WEBS
ON LISTS.TP_WEBID = WEBS.ID
WHERE (LISTS.TP_ID = 'the-id-of-your-list')
To find the id of your list for the WHERE part of the statement above, use the following query:
SELECT LISTS.TP_TITLE AS LIST_TITLE,
WEBS.FULLURL AS SITEURL,
WEBS.TITLE AS SITETITLE,
LISTS.TP_ID
FROM LISTS
INNER JOIN WEBS
ON LISTS.TP_WEBID = WEBS.ID
WHERE (LISTS.TP_TITLE = 'name-of-your-list')
Check the FullUrl in case there are duplicates.
Next Steps
| Share: | Share | Tweet |
|
![]() |
![]() |
Free SQL Server Learning |
| Monday, May 10, 2010 - 2:49:50 PM - PhilB | Read The Tip |
|
I have been using SPViews from Sharpest Tools Software for a little while now for my integration and a little reporting. Works well for me. |
|
| Monday, August 02, 2010 - 1:37:55 PM - rlively | Read The Tip |
|
For full disclosure, please mention Microsoft's official policy regarding direct SharePoint content database reads: 10 Best Practices For Building SharePoint Solutions http://msdn.microsoft.com/en-us/magazine/dd458798.aspx Microsoft is very clear in its do***entation that you should not make calls directly to the SharePoint content and configuration databases. Microsoft can change the underlying schema in these databases, and there can be more than one content database per Web application. Therefore, seemingly benign direct query operations can lead to brittle solutions.
SharePoint Database Access http://msdn.microsoft.com/en-us/library/bb861829(office.12).aspx Important: This type of customization is not supported. the process for acquiring that data should be through the built-in SharePoint object model, and not by using any type of query to the database. If a server component requires information from the database, it must get that data by using the appropriate items in the SharePoint object model, and not by trying to get the items from the data structures in the database through some query mechanism. Support for changes to the databases that are used by Office server products and by Windows SharePoint Services http://support.microsoft.com/kb/841057/en-us In the event that unsupported read operations are discovered as part of a support call, the database will be considered to be in an unsupported state. To return the database to a supported state, the customer must cease all unsupported read activities, regardless of whether they are being performed manually or programmatically, and may need to a database restoration from the last known good backup that did not include unsupported read activities. http://blogs.msdn.com/b/mikefitz/archive/2005/04/01/404802.aspx “… even reading from the database can indeed have negative effects on performance and stability. Just to name two such effects, I’ll paraphrase his comments: 1. Deadlocks: direct SELECT statements against the database take shared read locks at the default transaction level, and can violate our lock hierarchy, leading to seemingly random failures. 2. Incorrect data: if you avoid taking locks to avoid a deadlock, you’ll very likely observe the database in an intermediate state and retrieve incorrect data. Microsoft won’t be able to help you … you will run the risk of breaking or degrading the performance/stability of technology we built to be fast and robust. The odds of a database change in our next Service Pack aren’t small. It’s that you’ll on your own if anything goes wrong. Consider this carefully. Our object model is a contract with you. We’ve gone to a lot of trouble to ensure that using the OM results in stable and performant database interactions. If anything’s amiss, we’re honor-bound to fix it. Please, please use it.” |
|
| Monday, August 30, 2010 - 12:05:16 AM - Adrian | Read The Tip |
|
I've been using the i-PMO Data Miner 2009 for SharePoint. Makes the entire process vastly simpler, is an approach supported by microsoft, easily configured by business users and is as fast as SQL will go http://www.i-pmo.com.au/Products/iPMODataMiner2009forSharePoint.aspx Adrian |
|
| Tuesday, October 12, 2010 - 6:02:35 PM - PhilB | Read The Tip |
|
rlively, I do understand were Microsoft is coming from on the reading. For me, I have been reading from the content database for years beginning with SharePoint Services 2 because it made my integration with other applications so much easier and cost effective to create and maintain for just reading databases. If you are doing a nolock read then I don't see how the read can break a content database. Also, it has never been an issue for my applications where I was worried about missing an update to data that SharePoint might have been in the middle of. I think we just don't keep that kind of data in SharePoint. I was actually thinking about not doing that anymore, especially when I saw what SharePoint 2010 looked like, but then I started using SPViews and they really seem to know the schema well. I don't know. I prefer this method for the simplicity of it. Phil |
|
| Thursday, October 14, 2010 - 4:15:43 PM - rlively | Read The Tip |
|
I think it's up to everyone to determine the method that works best for them, and I wasn't trying to change anyone's opinion. I just wanted to post the additional information regarding Microsoft's official stance (since it is a Microsoft product) to make sure everyone is aware. |
|
|
privacy | disclaimer | copyright | advertise | about authors | contribute | feedback | giveaways | user groups Some names and products listed are the registered trademarks of their respective owners. Edgewood Solutions LLC | MSSharePointTips.com | MSSQLTips.com |