How to Install cosmicworks, a Sample Database, in Azure Cosmos DB

By: Koen Verbeeck   |   Updated: 2023-05-08


Problem

I'm trying to learn more about Azure Cosmos DB. As usual, you learn the most by actually working with the product. I would like to load some data into a container, write some SQL queries on that container, and see how the integration works with other Azure services. However, since each item in the container is represented as JSON, creating your own sample data can be quite cumbersome. Is there a ready-to-use data set I can utilize?

Solution

Azure Cosmos DB is a globally distributed, multi-model database service offered by Microsoft Azure. It provides a highly scalable and available platform for storing and querying large amounts of data using various data models, including NoSQL (sometimes also referred to as the "document" or SQL API), Column-Family (Cassandra), Graph (Gremlin), and Key-Value (Table) APIs. Recently, Microsoft has also added support for MongoDB and PostgreSQL. For more background information, check out the tip Introduction to Azure Cosmos DB database and the SQL API.

different APIs in cosmos DB

In this tip, we're exclusively using the Cosmos NoSQL API. There are multiple methods to get data into Azure Cosmos DB. If you already have some sample data in a relational database, you can try one of these methods:

  1. Use the Azure Cosmos DB Data Migration tool to import the data into your Cosmos DB container. You can find instructions on how to do this in the tip Migrating SQL Data into Azure Cosmos DB.
  2. You can create your own import using pipelines in Azure Data Factory. You might have to do a two-step process if your data is more complex than just one table with no nested values. This blog post explains the process: How to Store Normalized SQL Server Data into Azure Cosmos DB.

Another method is to use a sample database – called cosmicworks – that is already provided by Microsoft.

Prerequisites

If you don't have an Azure Cosmos DB account in your tenant, follow the prerequisite steps in the tip named Analyze Azure Cosmos DB data with Synapse Serverless SQL Pools to set up an account. If you don't have an Azure subscription, try Cosmos DB using the emulator. This tool allows you to develop and test Cosmos DB locally on your computer and is available to download: cosmosdb-emulator.

install azure cosmos db emulator

After you've run the installation wizard, the emulator will launch in your browser:

azure cosmos db emulator

With the emulator, you can test basic functionality. It's only possible to have a database with provisioned throughput; the serverless option is unavailable locally.

Installing Sample Data Using the Portal

You can install a sample database with a container holding some data with a couple of clicks. In the Data Explorer of your Azure Cosmos DB Account, you will see the following home screen:

home screen in data explorer

When you click the quick start option, it will open a dialog allowing you to create a sample container with the associated database:

quick start wizard

The sample container will contain 295 JSON documents holding product information.

sample data from the quick start wizard

In the emulator, you have a similar wizard:

quick start wizard in the emulator

However, this wizard creates a small Persons database.

sample db installation in wizard

This container only has four JSON documents with a very simple structure.

persons sample db in emulator

Installing Sample Data with the Command Line

If you want the full cosmicworks sample database in the emulator, or want to install the sample database programmatically instead of manually through the portal, you will need the cosmicworks NuGet package. If you have the .NET SDK installed on your machine, you can run the dotnet command from the prompt. Run the following command to install the cosmicworks package as a global tool:

dotnet tool install --global cosmicworks
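If you script the setup, it can help to skip the install when the tool is already present. The following is a minimal sketch under that assumption; the function names have_command and ensure_cosmicworks are made up for this example and are not part of the .NET SDK or the cosmicworks tool.

```shell
# Succeeds when the named command can be found on PATH.
have_command() {
  command -v "$1" >/dev/null 2>&1
}

# Installs the cosmicworks tool only when it is missing.
# (Hypothetical helper for this tip, not part of the tool.)
ensure_cosmicworks() {
  if ! have_command dotnet; then
    echo "dotnet not found; install the .NET SDK first" >&2
    return 1
  fi
  if have_command cosmicworks; then
    echo "cosmicworks is already installed"
    return 0
  fi
  dotnet tool install --global cosmicworks
}
```

Call ensure_cosmicworks to run the check. Global tools are placed in the .dotnet/tools folder of your user profile, so if the cosmicworks command is not found right after the install, open a new prompt so the updated PATH is picked up.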

Once the tool is installed, you can deploy a copy of the cosmicworks database to a Cosmos DB account. You will need the endpoint URI of the account and the account key (called the primary key in the emulator). For the emulator, both can be found in the same place on the quickstart overview page:

uri and key in the emulator

For a regular Azure Cosmos DB account, you can find the URI on the overview page of the account:

uri in the portal

On the Keys page, you can find the primary and secondary keys (you only need the first one) and the URI as well:

primary key in the portal

When you have found the necessary information, you can run the following command from the prompt:

cosmicworks --endpoint myendpoint --key myprimarykey --datasets product

This will load all the product sample documents to a database called cosmicworks in a container called products.

command line output of cosmicworks
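If you run the command often, you may prefer not to retype the endpoint and key each time. Below is a small sketch that reads them from environment variables. The variable names COSMOS_ENDPOINT and COSMOS_KEY and the function load_dataset are assumptions made for this example; the tool itself does not read them. The fallback endpoint is the emulator's default local address.

```shell
# COSMOS_ENDPOINT and COSMOS_KEY are assumed names; the tool does not read them itself.
# Falls back to the emulator's default local endpoint when no endpoint is set.
COSMOS_ENDPOINT="${COSMOS_ENDPOINT:-https://localhost:8081}"

# Loads one dataset; refuses to run when the key is missing.
load_dataset() {
  if [ -z "${COSMOS_KEY:-}" ]; then
    echo "COSMOS_KEY is not set" >&2
    return 1
  fi
  cosmicworks --endpoint "$COSMOS_ENDPOINT" --key "$COSMOS_KEY" --datasets "$1"
}
```

After setting COSMOS_KEY to your primary key, load_dataset product runs the same command as above. The options used here are the ones shown in this tip; if your version of the tool behaves differently, check its help output.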

Products is only one of several sample datasets. You can find more information on them in the GitHub project repo.

Multiple datasets available

By specifying the name of another dataset, you can install additional sample containers in the cosmicworks database (the data itself is modeled after the AdventureWorks sample database for SQL Server). As you can see, with the command line, you have more options for sample data than through the portal. The other datasets are also much larger. For example, the customers dataset contains over 50,000 documents:

count for customers container
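To load several datasets in one go, you can loop over their names and run the tool once per name. This is a sketch only: load_all and the LOADER variable are invented for this example, and the dataset names you pass should be taken from the repo documentation. Setting LOADER to echo gives a dry run that prints each command instead of executing it.

```shell
# Runs the tool once per dataset name and stops at the first failure.
# Usage: load_all <endpoint> <key> <dataset> [<dataset> ...]
# Set LOADER=echo for a dry run that only prints each command.
load_all() {
  endpoint="$1"
  key="$2"
  shift 2
  for dataset in "$@"; do
    ${LOADER:-} cosmicworks --endpoint "$endpoint" --key "$key" --datasets "$dataset" || return 1
  done
}
```

For example, load_all myendpoint myprimarykey product loads the products container, and any further dataset names can be appended to the same call. Keep in mind that the larger datasets take longer to upload.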

Installing Sample Data with Visual Studio

Instead of loading each dataset separately through the command prompt, you can load them all at once through a Visual Studio project. The cosmicworks GitHub repo contains a Visual Studio project with a program that will upload everything for you. This repo is a demo environment to showcase the capabilities of Azure Cosmos DB and how you can model normalized data from a relational database into Cosmos DB. Everything is licensed under the MIT license, so you can download the source code from the repo and run it to load the sample data.

In the Visual Studio solution, you need to fill in the URI and the primary key in the appSettings.json files.

appsettings.json files need the uri and the key

Then you must configure the modeling_demos project as the startup project.

configure project as startup

When you run the project (press F5), you will be presented with a menu. Press "k" to create the databases and the containers.

menu options for the cosmicworks demos

This will only work on an Azure Cosmos DB account with provisioned throughput or on the emulator. It will error out on a serverless account. Once the objects are created, press "l" to load all the sample data.

loading sample data through Visual Studio
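Since the create step fails on a serverless account, you may want to check the account type before you start. One possible way, assuming you have the Azure CLI installed (it is not otherwise used in this tip), is to look for the EnableServerless capability on the account. The function is_serverless and the names myaccount and myrg are placeholders for this sketch.

```shell
# Succeeds when the capability list (one name per line) contains EnableServerless.
is_serverless() {
  printf '%s\n' "$1" | grep -qx 'EnableServerless'
}

# Hypothetical usage; the account and resource group names are placeholders:
# caps=$(az cosmosdb show --name myaccount --resource-group myrg \
#          --query "capabilities[].name" --output tsv)
# if is_serverless "$caps"; then
#   echo "serverless account: the create step will fail"
# fi
```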

This might take a while. Running the program will create multiple databases with multiple containers.

full cosmicworks sample db

Each database represents a different iteration in the modeling process explained in a presentation. Check out the readme file of the repo for more information.

You now have multiple databases with sample data you can use to familiarize yourself with Azure Cosmos DB.


About the author
MSSQLTips author Koen Verbeeck Koen Verbeeck is a seasoned business intelligence consultant at AE. He has over a decade of experience with the Microsoft Data Platform in numerous industries. He holds several certifications and is a prolific writer contributing content about SSIS, ADF, SSAS, SSRS, MDS, Power BI, Snowflake and Azure services. He has spoken at PASS, SQLBits, dataMinds Connect and delivers webinars on MSSQLTips.com. Koen has been awarded the Microsoft MVP data platform award for many years.

This author pledges the content of this article is based on professional experience and not AI generated.
