File Handling in a SharePoint Document Library Using Python


By:   |   Updated: 2021-04-01   |   Comments   |   Related: > SharePoint


Problem

According to Microsoft, over 200,000 organizations and 190 million people have SharePoint for intranets, team sites, and content management (Microsoft, 2021). Nowadays, multiple versions of SharePoint are in use. Several open-source Python libraries can interact with SharePoint. People may come across compatibility issues when running Python programs to handle files in SharePoint document libraries. These Python programs usually upload, download, delete, and search for documents. IT professionals must spend much time on a trial and error learning process to solve the compatibility issues. They want to look at some sample programs and execute them in a free trial environment to speed up learning.

Solution

We first introduce the "Microsoft 365 Developer Program" that allows IT professionals to access a SharePoint Online environment (Graham, 2019). People can join the program without a credit card and any payment (Microsoft 365, 2021). At the time of this writing, all participants can get a free Microsoft 365 E5 developer subscription. We then work on a fictitious project for a digital publishing platform, like MSSQLTips.com, and provide Python scripts to handle files in a SharePoint document library. We use the Python library "Office365-REST-Python-Client" (Gremyachev, 2020) to connect SharePoint document libraries. The official site of this Python library provides many useful sample scripts. We can practice these Python scripts in the trial environment as well. The article comprises two parts, and readers who have an Office 365 environment can skip the first part:

In the first part, we go through the steps to create an office 365 account and access a SharePoint Online environment. To simulate a small corporate environment, we install sample data packs. The "Users" sample data pack creates 16 fictitious users, and the SharePoint sample data pack can add several SharePoint site templates. To allow Python programs to access SharePoint document libraries, we disable security defaults. However, at work, we should consult administrators to configure the security for Python program access.

The second part explores steps to establish a connection with a SharePoint document library using the Python library "Office365-REST-Python-Client 2.3.1." We then cover the process to upload a document to a SharePoint document library and add the document’s metadata. Next, we show steps to download and delete files from the library. The document library may have folders and sub-folders. We show a way of searching for files recursively. We also provide sample scripts to handle documents in sub-folders.

The author tested all the programs used in this article with Microsoft Visual Studio Community 2019 and Python 3.7 (64-bit) on Windows 10 Home 10.0 <X64>. During testing, all the Python programs were connected to the SharePoint Online environment provided by the "Microsoft 365 Developer Program." The web browser "Opera" was used to access the SharePoint document library.

1 – Creating a SharePoint Online Environment

An effective strategy to learn computer skills is to learn by doing. During practicing, we may want to separate our studying environments from working environments. This section creates a new environment (a sandbox) using no personal information except a cell phone number. We need a cell phone to receive a verification code when setting up our Microsoft 365 developer subscription. The Office 365 account registration process needs an email account. We sign up for a personalized email at https://www.mail.com. We also download the web browser "Opera" for accessing the information on the Internet.

1.1 Creating a Microsoft Account

At the time of this writing, we can access the landing page of the Microsoft 365 Developer Program at https://developer.microsoft.com/en-us/microsoft-365/dev-program. Figure 1 illustrates the landing page.

microsoft 365

Figure 1 The Landing page of the Microsoft 365 Developer Program

After clicking on the button "Join Now" on the landing page, we are asked to sign in to microsoftonline.com, as illustrated in Figure 2.

microsoft 365 sign in

Figure 2 The Microsoft account sign-in dialog

To create a new Microsoft account, we click on the link button "Create one!" The dialog for creating a Microsoft account appears, as shown in Figure 3. We enter an email address, click on the "Next" button, and then follow the instructions on the screens to complete the registration process. At the end of this step, we should have a Microsoft account.

microsoft 365 create account

Figure 3 The dialog to create a Microsoft account

1.2 Participating in the Microsoft 365 Developer Program

We go back to the Microsoft account sign-in page, illustrated in Figure 2. Following the instructions on the screens, we sign into the Microsoft account. During the signing-in process, the program asks us a few questions, as shown in Figure 4. Answer these questions and complete the form.

microsoft 365 create account

Figure 4 Join the Microsoft 365 developer program

We click on the button "Next" and follow instructions on the screens to complete the signing-in process. Then, we arrive at the profile page, as shown in Figure 5.

microsoft 365 training

Figure 5 The profile page of the Microsoft Developer Program

1.3Setting up an Office 365 Developer Subscription

As illustrated in Figure 5, the Microsoft 365 developer program membership entitles us to a free Microsoft 365 E5 subscription with all Office 365 apps, including SharePoint, OneDrive, Outlook/Exchange, Teams, Planner, Word, Excel, PowerPoint, and more (Microsoft 365, 2021). We use this subscription to build a SharePoint Online environment and run Python programs. We click on the "SET UP E5 SUBSCRIPTION" button on the web page to start the subscription process. A dialog appears, and it should look like Figure 6.

microsoft 365 developer subscription

Figure 6 Set up Microsoft 365 E5 developer subscription

We must select a country (or region), a username, a domain, and a password to complete the form. When the button "Continue" becomes enabled, we click on the button. The "Add phone number for security" dialog appears, as shown in Figure 7. We should use a valid cell phone number to receive a code. After we enter the code, the "Set up" button becomes enabled. We click on the button to continue the sign-up process.

microsoft 365 developer subscription

Figure 7 Add phone number for security

We follow instructions on the screens to complete the setup process. Then, we land on a dashboard page, as shown in Figure 8. The username and domain selected in Figure 6 form the administrator account. In the "Sample data packs" section, we can choose to install sample data packs. The Users sample data pack creates 16 fictitious users on the subscription, and the SharePoint sample data pack contains several site templates (Microsoft 365, 2021). We install the User data packs and a SharePoint team site template. The installation process may take a few minutes.

microsoft 365 developer subscription

Figure 8 A dashboard in the Microsoft 365 dev center

1.4 Turning off Security Defaults

By default, Microsoft automatically enforces a set of basic identity security mechanisms. To enable Python programs to access SharePoint sites, we disable the security defaults. However, at work, we should consult Office 365 administrators for any changes in security configurations. All configurations in this article are for demonstration. The goal is to show how to use Python scripts to handle files in a SharePoint document library. We present a quick and straightforward way to create a document library for Python applications to access. We should not apply this method in production.

We use the following steps to turn off the security defaults. First, let us sign into the site https://www.office.com/ using the administrator account. We then find the "App launcher" icon on the top left, as shown in Figure 9.  Clicking on the icon, we see a list of Apps in a pop-up pane. The "Admin" app icon appears in the pane. Alternatively, we can find the "Admin" app icon in the vertical toolbar on the left.

office 365 admin center

Figure 9 Click on the App launcher icon to show some frequently used Microsoft Apps

Clicking on the "Admin" icon, we land on the "Microsoft 365 admin center" page, as shown in Figure 10. We can expand the User menu to view the 16 fictitious users. We expect to find the "Azure Active Directory" menu item, but it is not visible. Clicking on the "Show all" button, we see a complete list of menu items.

office 365 admin center

Figure 10 The landing page of the Microsoft 365 admin center

After clicking on the "Show all" button, we expand the left pane and see the "Azure Active Directo…" menu item, as shown in Figure 11.

office 365 admin center

Figure 11 Expand the admin centers

When we click on the "Azure Active Directo…" menu item, a pop-up dialog shows up, as shown in Figure 12.

office 365 admin center

Figure 12 Use administrator account to log in Microsoft Azure

We select the administrator account to access the "Azure Active Directory admin center," and the landing page should look like Figure 13. On the left pane, there is an "Azure Active Directory" link button.

azure active directory admin center

Figure 13 Azure Active Directory admin center

We click on the link button "Azure Active Directory," then choose the "Properties" menu item. The "Manage Security defaults" link button appears, as shown in Figure 14. Clicking on the button, we can disable the security defaults on the right pane. Finally, we click on the "Save" button to save the change. We want to emphasize again that we should consult Office 365 administrators for any security configuration changes. This article introduces a quick way, but not a very practical one, to give access to Python applications.

azure active directory admin center

Figure 14 The process to disable security defaults

1.5 Creating a SharePoint Document Library

SharePoint is a website-based collaboration system that empowers business teams to work together and implement business process automation (Satapathi, 2020). A fictitious company ABC provides a digital publishing platform for writers. The company wants to use a SharePoint document library to manage articles and some corporate documents. After writers submit their articles through a web form, A Python program moves the articles to the SharePoint document library. Then, ABC editors review these articles. If articles do not fit the platform, the editors tag the articles with the "Rejected" label. Otherwise, the editors put the "Approved" label on the articles. Then, another Python program carries these articles to a file server for publication. The program places corporate documents in a sub-folder of the document library.

To create a SharePoint document library for the company ABC, we first need to build a SharePoint site. If we sign out of the site https://www.office.com/, we need to sign in again with the administrator’s account. We click on the "App launcher" icon. On the pop-up pane shown in Figure 9, we select the "SharePoint" app. We then land on a page shown in Figure 15. The "Create site" button on this page allows us to create a SharePoint site using the installed site templates. Microsoft provides SharePoint Online training tutorial: Create a site in SharePoint. Following their instructions, we create a SharePoint site, "ABC," and add a "Publications" document library.

create sharepoint site

Figure 15 Create a SharePoint Site

We add two columns to the library: Category and Status. The Category column is a free form text field. A writer can determine a category for his/her article during uploading process. Editors can recommend a category for an article. The Status column is a choice column with four options: Submitted, In Process, Rejected, and Approved. The default value of the choice column is "Submitted." After completing the setup, we upload a document to test the library. The document library should look like Figure 16. In the next section, we use Python scripts to upload, download, and delete documents. That section also covers steps to add values to the Category column, create a folder and handle files in the folder.

create sharepoint site

Figure 16 The SharePoint library with one document

2 – Handling Files in the SharePoint Document Library

At the ABC company, we use a Python program to move the uploaded files from a webserver to the SharePoint document library. After editors review articles and determine to publish them, we use another Python program to move the latest versions of the files from the document library to a file server. We also use Python code to upload some corporate documents to a sub-folder of the library. During development, we created a Python application project using Visual 2019 (Zhou, 2020). We create a virtual environment for this project using this requirement.txt:

certifi==2020.12.5
cffi==1.14.5
chardet==4.0.0
cryptography==3.4.6
idna==2.10
msal==1.10.0
Office365-REST-Python-Client==2.3.1
pip==21.0.1
pycparser==2.20
PyJWT==2.0.1
requests==2.25.1
setuptools==54.1.1
urllib3==1.26.3

The Python library "Office365-REST-Python-Client" comes with many examples, and we can study these examples at https://github.com/vgrem/Office365-REST-Python-Client/tree/master/examples/sharepoint. These examples cover most business requirements we may encounter at work. If we cannot find detailed information about a class or a method, we can view the source code at https://github.com/vgrem/Office365-REST-Python-Client/tree/master/office365/sharepoint. These Python scripts are also useful resources for us to study Python programming. In situation where some operations, for example, upload and download documents are used repeatedly in a project, it is good idea to create helper methods for them.

2.1 Connecting to the SharePoint Library

A document library is a special type of SharePoint list that contains documents and metadata (Katzer & Crawford, 2013). We can use the Python library "Office365-REST-Python-Client" to construct a file object or a list item object. The file object has a property "listItemAllFields" representing the corresponding list item. Likewise, the list item has a property "file" that allows us to access the associated file object. Since we created the document library and uploaded a file to the library already, we now use the following Python code to connect the SharePoint library, list all files, and access metadata through their associated list item properties:

fromoffice365.sharepoint.client_contextimportClientContext

SP_SITE_URL ='https://trainingcenter2021.sharepoint.com/sites/ABC/'
SP_DOC_LIBRARY ='Publications'
USERNAME ='sptraining@trainingcenter2021.onmicrosoft.com'
PASSWORD ='******' 

# 1. Create a ClientContext object and use the user’s credentials for authentication 
ctx =ClientContext(SP_SITE_URL).with_user_credentials(USERNAME, PASSWORD) 

# 2. Read file entities from the SharePoint document library 
files = ctx.web.lists.get_by_title(SP_DOC_LIBRARY).root_folder.files ctx.load(files)
ctx.execute_query()

# 3. loop through file entities
for filein files: 
  # 4. Access the file object properties 
  print(file.properties['Name'], file.properties['UniqueId'])
  # 5. Access list item object through the file object 
  item = file.listItemAllFields
  ctx.load(item) 
  ctx.execute_query() 
  print('Access metadata - Category: {0}, Status: {1}'.format(item.properties['Category'], item.properties['Status']))

# 4. The Output: 
# File Handling in SharePoint Document Library Using Python.docx 77819f08-5fbe-450f-9f9b-d3ae2862cbb5
# Access metadata - Category: Python, Status: Submitted 			

As aforementioned, a document library is a special type of SharePoint list. We also can find all list items in the library and then access the associated file objects. Like the preceding Python code block, we create a "ClientContext" object to represent the target SharePoint resources and use the user credentials for authentication. Then, the Python code retrieves a SharePoint list from the document library. A list item has an associated file object; therefore, we can access all documents in a library by accessing all the list items.

fromurllibimportparse 
fromoffice365.sharepoint.client_contextimportClientContext

SP_SITE_URL ='https://trainingcenter2021.sharepoint.com/sites/ABC/'
SP_DOC_LIBRARY ='Publications'
USERNAME ='sptraining@trainingcenter2021.onmicrosoft.com'
PASSWORD ='******' 

# 1. Create a ClientContext object and use the user’s credentials for authentication 
ctx =ClientContext(SP_SITE_URL).with_user_credentials(USERNAME, PASSWORD) 

# 2. Retrieve a SharePoint list from the document library 
oList = ctx.web.lists.get_by_title(SP_DOC_LIBRARY) 
items = oList.items 
ctx.load(items)
ctx.execute_query()

# 3. Look through list items
for itemin items: 
  # 4. Access list item properties. Parse file unique Id from the ServerRedirectedEmbedUri property of the list item 
  dic_query_string =parse.parse_qs(parse.urlsplit(item.properties['ServerRedirectedEmbedUri']).query)
  unique_id = dic_query_string['sourcedoc'][0][1:-1]
  print('Unique Id:{0}, Category: {1}, Status: {2}'.format(unique_id, item.properties['Category'], item.properties['Status']))
  # 5. Retrieve the file object from the list item 
  file = item.file 
  ctx.load(file) 
  ctx.execute_query() 
  print('Unique Id:{0}, Name: {1}'.format(file.properties['UniqueId'], file.properties['Name']))

# 6. The output: 
#Unique Id:77819f08-5fbe-450f-9f9b-d3ae2862cbb5, Category: Python, Status: Submitted 
#Unique Id:77819f08-5fbe-450f-9f9b-d3ae2862cbb5, Name: File Handling in SharePoint Document Library Using Python.docx 

We usually place all settings in a standalone configure file. Gremyachev gives an example of a setting file. We can also use Python ConfigParser to read configuration data from a file with a structure like Microsoft Windows INI files (Bodnar, 2020). Another method is to place the configuration data in a JSON file (TutsWiki, 2020). However, the code we presented here did not use any of these methods but put all Python statements in a single file. This coding style is for demonstration, and we should not apply this style in actual development.

2.2 Uploading Files to the Document Library

When a writer uploads his/her article to ABC company, the writer chooses the category "R Language" for the article. The web application places the article on the webserver. A Python program should transport this file to the SharePoint library and tag this article with the category "R Language." The program shown in the following code block implements these steps.

importos 
fromoffice365.sharepoint.client_contextimportClientContext
fromoffice365.sharepoint.listitems.caml.caml_queryimportCamlQuery 

SP_SITE_URL ='https://trainingcenter2021.sharepoint.com/sites/ABC/'
SP_DOC_LIBRARY ='Publications'
USERNAME ='sptraining@trainingcenter2021.onmicrosoft.com'
PASSWORD ='******' 

# Assuming the file path on the webserver 
file_path ="C:/temp/articles/Index Numbers Explained with Examples in R.docx" 
file_name =os.path.basename(file_path)

# 1. Create a ClientContext object and use the user’s credentials for authentication 
ctx =ClientContext(SP_SITE_URL).with_user_credentials(USERNAME, PASSWORD) 

# 2. Read the content of the file 
with open(file_path,'rb')as file_obj: 
  file_content = file_obj.read()

# 3. Upload the content to the SharePoint document library 
sp_folder = ctx.web.lists.get_by_title(SP_DOC_LIBRARY).root_folder 
sp_file = sp_folder.upload_file(file_name, file_content) ctx.execute_query()

# 4. Add metadata to the document by assigning a value to the Category property 
item = sp_file.listItemAllFields item.set_property("Category","R Language")
item.update()
ctx.execute_batch()			

We execute the code through Studio 2019 and then check the document library using the "Opera." The document "Index Numbers Explained with Examples in R.docx" should be added to the library. The Status column shows the default value "Submitted," and the Category column of the document should have a value of "R Language," as shown in Figure 17.

publications

Figure 17 Update Metadata of a file in the SharePoint library

2.3 Downloading Files from the Document Library

At the company ABC, the editors review articles and correct spelling, punctuation, and grammatical errors. When they work on an article in the document library, they change the article status from "Submitted" to "In Process." Then, they may change the status to "Approved" or "Rejected" when they complete reviewing. Next, a Python program moves the approved article to a file server for publication. The program first searches all articles with status "Approved" status. Then, the program downloads all these articles to the file server. The following Python code implements these steps.

importos 
fromoffice365.sharepoint.client_contextimportClientContext
fromoffice365.sharepoint.listitems.caml.caml_queryimportCamlQuery 
fromoffice365.sharepoint.files.fileimportFile 

SP_SITE_URL ='https://trainingcenter2021.sharepoint.com/sites/ABC/'
SP_DOC_LIBRARY ='Publications'
USERNAME ='sptraining@trainingcenter2021.onmicrosoft.com'
PASSWORD ='******' 
FILE_SERVER_PATH ="C:/temp/publications/"

# 1. Create a ClientContext object to connect the document library 
ctx =ClientContext(SP_SITE_URL).with_user_credentials(USERNAME, PASSWORD) 

# 2. Build a CAML query to find all list items that associate the approved articles 
qry_text ='''<Where>
                <Eq> 
                   <FieldRef Name='Status'/>
                   <Value Type='Text'>Approved</Value>
                </Eq>          
              </Where>'''
caml_query =CamlQuery.parse(qry_text)
caml_query.FolderServerRelativeUrl = SP_DOC_LIBRARY 

# 3. Retrieve list items based on the CAML query 
oList = ctx.web.lists.get_by_title(SP_DOC_LIBRARY) 
items = oList.get_items(caml_query) 
ctx.execute_query()

# 4. Loop through all list items
for itemin items: 
  # 5. Query the SharePoint document library to find the associated file of a list item 
  file = item.file 
  ctx.load(file) 
  ctx.execute_query() 
  # 6. Construct the file path on the file server 
  download_file_path =os.path.join(FILE_SERVER_PATH, file.properties['Name'])
  # 7. Download file from the document library to the file server 
  with open(download_file_path,'wb')as download_file: 
  download_file.write(file.read())
  print('File downloaded :{0}'.format(download_file_path))

The code block also shows the CAML syntax. The Collaborative Application Markup Language (CAML), an XML-based language, can query a SharePoint list (McCarthy, 2017). In the preceding code block, the simple CAML query definition defines a filter where the column field equals a value of "Approved" using the "Eq" element. We also can use CAML to customize SharePoint sites (Juvonen & Caputo, 2014).

2.4 Deleting Files from the Document Library

The ABC company does not want to maintain the rejected articles. The company uses a Python program to remove these files from the document library. Similar to the downloading process, the deleting process searches for all articles with the status "Rejected." Then, the program deletes all these articles.

fromoffice365.sharepoint.client_contextimportClientContext
fromoffice365.sharepoint.listitems.caml.caml_queryimportCamlQuery 
fromoffice365.sharepoint.files.fileimportFile 

SP_SITE_URL ='https://trainingcenter2021.sharepoint.com/sites/ABC/'
SP_DOC_LIBRARY ='Publications'
USERNAME ='sptraining@trainingcenter2021.onmicrosoft.com'
PASSWORD ='******' 

# 1. Create a ClientContext object to connect the document library 
ctx =ClientContext(SP_SITE_URL).with_user_credentials(USERNAME, PASSWORD) 

# 2 Build a CAML query to find all list items that associate the rejected articles 
qry_text ='''<Where>
                <Eq> 
                   <FieldRef Name='Status'/>
                   <Value Type='Text'>Rejected</Value>
                </Eq>
              </Where>'''
caml_query =CamlQuery.parse(qry_text)
caml_query.FolderServerRelativeUrl = SP_DOC_LIBRARY 
 
# 3 Retrieve list items based on the CAML query 
oList = ctx.web.lists.get_by_title(SP_DOC_LIBRARY) 
items = oList.get_items(caml_query) 
ctx.execute_query()

# 4 Loop through all list items
for itemin items: 
  # 5 Query the SharePoint document library to find the associated file 
  file = item.file 
  ctx.load(file) 
  ctx.execute_query() 
  # 6 Invoke the delete_object method to delete the file     
  file.delete_object() 
  ctx.execute_query() 
  print('File deleted :{0}'.format(file.properties['Name']))			

2.5 Handling Files in SharePoint Document Library Folders

A document library may contain folders, which are similar to folders in the Windows File System. To organize corporate documents, the ABC company wants to group various documents in different folders. In preceding sections, when we uploaded files to the Publication library, the files went to the root folder. We can create folders inside the root folder. These folders may have independent permissions.

2.5.1 Uploading Files to the Document Library Folder

The following Python code accesses the folder "Administrator" in the SharePoint library. If the folder does not exist, the program is going to create it. The Python library "Office365-REST-Python-Client" provides a folder class to represent a SharePoint folder. We use the folder object’s upload file method to upload file content.

importos  
fromoffice365.sharepoint.client_contextimportClientContext
fromoffice365.sharepoint.listitems.caml.caml_queryimportCamlQuery 

SP_SITE_URL ='https://trainingcenter2021.sharepoint.com/sites/ABC/'
SP_DOC_LIBRARY ='Publications'
SP_DOC_LIBRARY_FOLDER ='Administrator'
# The folder in the document library 
USERNAME ='sptraining@trainingcenter2021.onmicrosoft.com'
PASSWORD ='******' 

# 1. Assuming the location of the uploading file 
file_path ="C:/temp/administration/Request Form2.docx" 
file_name =os.path.basename(file_path)

# 2. Create a ClientContext object to access the SharePoint library 
ctx =ClientContext(SP_SITE_URL).with_user_credentials(USERNAME, PASSWORD) 

# 3. Access the SharePoint folder or create the folder if it doesn’t exist 
sp_folder_path ="/{0}/{1}".format(SP_DOC_LIBRARY,SP_DOC_LIBRARY_FOLDER)
sp_folder  =  ctx.web.ensure_folder_path(sp_folder_path).execute_query()

# 4. Read the content of the file 
with open(file_path,'rb')as file_obj: 
  file_content = file_obj.read()

# 5. Upload the file to the folder in the SharePoint library 
sp_file = sp_folder.upload_file(file_name, file_content) 
ctx.execute_query()

# 6. Add metadata to the uploaded document 
item = sp_file.listItemAllFields 
item.set_property("Category","Form")
item.update()
ctx.execute_batch()			

2.5.2 Downloading Files from the Document Library Folder

We want to download all forms (files marked with Category "Form") in the Administrator folder. We first create a folder object to represent the document library folder. The folder object tells us the folder's relative URL in SharePoint Online. A CAML query then uses this URL to find all files in the folder. Next, we download the file to the file server.

importos 
fromoffice365.sharepoint.client_contextimportClientContext
fromoffice365.sharepoint.listitems.caml.caml_queryimportCamlQuery 
fromoffice365.sharepoint.files.fileimportFile 

SP_SITE_URL ='https://trainingcenter2021.sharepoint.com/sites/ABC/'
SP_DOC_LIBRARY ='Publications'
SP_DOC_LIBRARY_FOLDER ='Administrator'    
# The folder in the document library 
USERNAME ='sptraining@trainingcenter2021.onmicrosoft.com'
PASSWORD  ='******' 
FILE_SERVER_PATH ="C:/temp/forms/"

# 1. Create a ClientContext object to connect the document library 
ctx =ClientContext(SP_SITE_URL).with_user_credentials(USERNAME, PASSWORD) 

# 2. Connect to the SharePoint folder and construct a SharePoint folder object 
sp_folder_path ="/{0}/{1}".format(SP_DOC_LIBRARY,SP_DOC_LIBRARY_FOLDER)
sp_folder = ctx.web.ensure_folder_path(sp_folder_path).execute_query()

# 3. Build a CAML query to find all list items that are in the "Form" category 
qry_text ='''<Where>
                <Eq> 
                   <FieldRef Name='Category'/>
                   <Value Type='Text'>Form</Value>
                </Eq>
              </Where>'''
caml_query =CamlQuery.parse(qry_text)
caml_query.FolderServerRelativeUrl = sp_folder.properties['ServerRelativeUrl']

# 4. Retrieve all list items based on the CAML query 
oList = ctx.web.lists.get_by_title(SP_DOC_LIBRARY) 
items = oList.get_items(caml_query) ctx.execute_query()

# 5. Loop through all list items
for itemin items: 
  # 6. Query the SharePoint document library to find the associated file 
  file = item.file 
  ctx.load(file) 
  ctx.execute_query() 
  # 7. Construct the file path on the file server 
  download_file_path =os.path.join(FILE_SERVER_PATH, file.properties['Name'])
  # 8. Download the file from the document library folder 
  with open(download_file_path,'wb')as download_file: 
       download_file.write(file.read())
  print('File downloaded :{0}'.format(file.properties['Name']))			

2.5.3 Deleting Files from the Document Library Folder

To delete a file from the document library folder, we should first find the list item that contains the file. From the list item, we can construct a file object that provides a delete object method. Next, we invoke the delete method to remove the file from the folder. The following code deletes all forms from the library.

importos 
fromoffice365.sharepoint.client_contextimportClientContext
fromoffice365.sharepoint.listitems.caml.caml_queryimportCamlQuery 
fromoffice365.sharepoint.files.fileimportFile 

SP_SITE_URL ='https://trainingcenter2021.sharepoint.com/sites/ABC/'
SP_DOC_LIBRARY ='Publications'
SP_DOC_LIBRARY_FOLDER ='Administrator'
USERNAME ='sptraining@trainingcenter2021.onmicrosoft.com'
PASSWORD ='******' 

# 1. Create a ClientContext object to connect the document library 
ctx =ClientContext(SP_SITE_URL).with_user_credentials(USERNAME, PASSWORD) 

# 2. Construct a SharePoint folder object 
sp_folder_path ="/{0}/{1}".format(SP_DOC_LIBRARY,SP_DOC_LIBRARY_FOLDER)
sp_folder =  ctx.web.ensure_folder_path(sp_folder_path).execute_query()

# 3. Build a CAML query to find all list items that associate to the form 
qry_text ='''<Where>
                <Eq> 
                   <FieldRef Name='Category'/>
                   <Value Type='Text'>Form</Value>
                </Eq>
              </Where>'''
caml_query =CamlQuery.parse(qry_text)
caml_query.FolderServerRelativeUrl = sp_folder.properties['ServerRelativeUrl']

# 4. Retrieve all list items based on the CAML query 
oList = ctx.web.lists.get_by_title(SP_DOC_LIBRARY) 
items = oList.get_items(caml_query) ctx.execute_query()

# 5. Loop through all list items
for itemin items: 
  # 6. Query the SharePoint document library to construct a file object 
  file = item.file 
  ctx.load(file) 
  ctx.execute_query() 
  # 7. Invoke the delete_object method to delete the file 
  file.delete_object() 
  ctx.execute_query() 
  print('File deleted :{0}'.format(file.properties["ServerRelativeUrl"]))			

2.5.4 Searching for Files in the Document Library Recursively

The ABC company may need to find all files being in the category "Approved." Since a document library has folders, and the folders may have sub-folders, the program should search for files recursively. When constructing a CAML query, we can specify the scope for returning list items and list folders. When we pass the scope "ViewScope.Recursive" to the CamlQuery.parse() function, the function searches for files that satisfy the query in all folders.

fromoffice365.sharepoint.client_contextimportClientContext
fromoffice365.sharepoint.listitems.caml.caml_queryimportCamlQuery,ViewScope

SP_SITE_URL ='https://trainingcenter2021.sharepoint.com/sites/ABC/'
SP_DOC_LIBRARY ='Publications'
USERNAME ='sptraining@trainingcenter2021.onmicrosoft.com'
PASSWORD ='******' 
 
# 1. Create a ClientContext object to connect the document library 
ctx =ClientContext(SP_SITE_URL).with_user_credentials(USERNAME, PASSWORD) 

# 2. Build a CAML query to find all list items being in the category "Approved" 
qry_text =''' <Where>                 
                       <Eq> 
                           <FieldRef Name='Status'/>
                           <Value Type='Text'>Approved</Value>
                       </Eq>                     
               </Where>'''


# 3. Define the scope "Recursive" that search for files in all the folders.
caml_query =CamlQuery.parse(qry_text,ViewScope.Recursive)
caml_query.FolderServerRelativeUrl = SP_DOC_LIBRARY 

# 4. Retrieve list items based on the CAML query 
oList = ctx.web.lists.get_by_title(SP_DOC_LIBRARY) 
items = oList.get_items(caml_query) ctx.execute_query()

# 5. Loop through all list items
for itemin items: 
  # 6. Query the SharePoint document library to find the associated file 
  file = item.file 
  ctx.load(file) 
  ctx.execute_query() 
  # 7. Find the Server Relative URL of the file 
  file_relative_url =file.properties["ServerRelativeUrl"]
  # 8. Print the Relative URL of the file 
  print('Relative File Path: {0}'.format(file_relative_url))

Summary

The article demonstrated how to set up a SharePoint Online environment and create a SharePoint document library. We then used Python scripts to handle files in the document library. These scripts also showed that the document library is a SharePoint list. To show how to use these scripts at work, we assumed a fictitious project that enabled business process automation for a digital publishing platform. We provided a proof-of-concept solution to the project.

We covered steps to create a Microsoft account, participate in the Microsoft 365 Developer Program, set up an Office 365 Developer subscription, and create a SharePoint site. For the sake of demonstration and simplicity, we turned off security so that Python programs could access the SharePoint site. We also installed sample data packs so that users could practice Python scripts in a small corporate environment. The article explored a few steps to add a SharePoint document library to the SharePoint site.

Next, the article showed how to connect to the SharePoint document library, upload files to the library, and download and delete files from the library. The Python programs provided a way of adding metadata to an uploaded document. We also demonstrated techniques to access SharePoint document library folders and handle files in the folders. The last piece of the Python code could search for files in the document library recursively.

Reference

Bodnar, J. (2020). Python ConfigParser tutorial. https://zetcode.com/python/configparser/.

Graham, L. (2019). Microsoft 365 Developer Program FAQ. https://docs.microsoft.com/en-us/office/developer-program/microsoft-365-developer-program-faq.

Juvonen, V. & Caputo, L. (2014). Introduction to Collaborative Application Markup Language (CAML). https://docs.microsoft.com/en-us/sharepoint/dev/schema/introduction-to-collaborative-application-markup-language-caml.

Katzer, M. & Crawford, D. (2013). Office 365: Migrating and Managing Your Business in the Cloud. Apress

McCarthy, C. (2017). Learn to Use CAML Query – Part 1. https://www.brightwork.com/blog/caml-joins-caml-syntax-brightwork.

Microsoft (2021). SharePoint. https://www.microsoft.com/en-ca/microsoft-365/sharepoint/collaboration/.

Microsoft 365 (2021). Join the Microsoft 365 Developer Program today! https://developer.microsoft.com/en-us/microsoft-365/dev-program.

Gremyachev, V. (2020). Office365-REST-Python-Client 2.3.1. https://pypi.org/project/Office365-REST-Python-Client/

Zhou, N. (2020). Learning Python in Visual Studio 2019. /sqlservertip/6442/learning-python-in-visual-studio-2019/.

Chan, E. (2013). Scopes in a CAML Query. http://www.codegrimoire.com/2013/08/scopes-in-caml-query.html.

Satapathi, S. (2020). Performing A CRUD Operation On A SharePoint List Using Python. https://www.c-sharpcorner.com/article/performing-crud-operation-on-sharepoint-list-using-python/.

TutsWiki (2020). WRITING AND READING JSON CONFIG FILE IN PYTHON. https://tutswiki.com/read-write-json-config-file-in-python/

Next Steps


Last Updated: 2021-04-01


get scripts

next tip button



About the author
MSSQLTips author Nai Biao Zhou Nai Biao Zhou is a Senior Software Developer with 20+ years of experience in software development, specializing in Data Warehousing, Business Intelligence, Data Mining and solution architecture design.

View all my tips



Comments For This Article





download





Recommended Reading

Reading SharePoint Lists with Integration Services 2017

Using a SharePoint List as a Data Source in SQL Server Reporting Services 2008 R2

Managing SharePoint Content Databases with PowerShell

Understanding SharePoint Exports to Excel

SharePoint Backup and Restore














get free sql tips
agree to terms