Improve the Data Quality for Healthcare, Clinical and Life Sciences Projects with Melissa Informatics

By: Jeremy Kadlec   |   Updated: 2024-01-02


Every company needs high-quality data for decision making, and complex data sets make that goal harder to reach. The problem is exacerbated by numerous standards and synonyms around the globe, with data spread across numerous systems, varying formats and industry-specific terminology. For healthcare, clinical and life sciences organizations, merging, reconciling and enhancing data for electronic medical records, clinical trials and pharmaceutical applications is challenging. How can the technology teams at these organizations adopt a framework to improve the quality of complex data sets for the medical professionals they support?


Although many database platforms have the means to support healthcare and life sciences organizations, the reality is that data from electronic medical records, clinical trials and similar sources is difficult to consolidate, relate and mine for meaningful results. The data is often inconsistent because it comes from numerous systems supporting multi-decade studies, is housed in free-form text fields or unstructured formats, and can include over 100 synonyms and spellings for specific terms. Deriving value from the collected data requires significant financial resources and precious time from medical experts.

Melissa is aware of these challenges in the healthcare field and has been delivering data quality solutions for 35 years, including direct integration with relational database engines such as SQL Server and Oracle. Melissa provides tools for data profiling, validation, enrichment and more that integrate with SQL Server Integration Services (SSIS) and are familiar to many SQL Server professionals. Melissa also offers Unison, a user-friendly tool that lets power users, who are intimately familiar with the data, apply much of the same logic found in the SSIS Data Quality tools to resolve data quality issues.

Melissa recognized that these solutions solve broad data quality needs, but in particular industries such as healthcare some challenges remained unanswered. To complement the Melissa Data Quality tools for SQL Server Integration Services, Melissa Informatics delivers Machine Reasoning and Machine Learning solutions for complex and dynamic data sets, as well as mapping of industry-specific terminology for organizations across the globe. One implementation of this technology is in the healthcare and life sciences fields, where it improves the data quality of complex data sets for clinical trials, drug validation and other initiatives. The Melissa Informatics goals are:

  • Improve the quality and completeness of data in the healthcare industry
  • Reduce costs and increase efficiency of data management
  • Enable interoperability and searchable integration of data

This article is intended as a primer for database professionals who support medical researchers, analysts, physicians, clinicians, etc. and who want to understand how Melissa Informatics can improve their organization's data management capabilities. Let’s dive into the Melissa Informatics product and service offerings for the healthcare industry, designed to significantly reduce costs for clinical trials and pharmaceutical applications.

Melissa Informatics Process

From a process perspective, Melissa recommends first implementing Data Quality best practices, including data profiling, matching, enriching, etc., to build an accurate data set in the relational engine. This is accomplished with the Melissa integration with the SSIS tools, or with Unison for power users. Once the broad data quality best practices are in place, if more advanced data integration and analysis are a goal, Melissa recommends leveraging the Sentient Suite of tools. These tools combine SQL and semantic database technologies, including Machine Learning and Machine Reasoning algorithms. These semantic technologies provide advanced means to categorize and process data as well as discover relationships that were previously difficult to ascertain.
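As a rough illustration of the data profiling step mentioned above, the sketch below counts missing and distinct values in one column. It is a minimal Python example under stated assumptions, not the Melissa SSIS components or Unison: the `profile_column` function and the sample rows are hypothetical and invented for illustration only.

```python
from collections import Counter


def profile_column(rows, column):
    """Return simple profiling statistics for one column of a list of dict rows."""
    values = [row.get(column) for row in rows]
    # Treat None and blank or whitespace-only strings as missing.
    present = [v for v in values if v is not None and str(v).strip() != ""]
    # Compare case-insensitively so "Aspirin" and "aspirin " count as one value.
    normalized = [str(v).strip().lower() for v in present]
    counts = Counter(normalized)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "most_common": counts.most_common(1)[0] if counts else None,
    }


# Hypothetical sample rows, invented for illustration only.
sample_rows = [
    {"patient_id": 1, "drug": "Aspirin"},
    {"patient_id": 2, "drug": "aspirin "},
    {"patient_id": 3, "drug": ""},
    {"patient_id": 4, "drug": "ASA"},
    {"patient_id": 5, "drug": None},
]

profile = profile_column(sample_rows, "drug")
print(profile)
```

A profile like this shows where a column needs attention before matching or enrichment. Here two of five rows are missing a value, and "ASA" is counted as a separate value from "aspirin", which is the kind of synonym problem terminology mapping is meant to resolve.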

Generally, the solution from Melissa for healthcare and life sciences organizations is a hybrid of relational and semantic technologies. The relational engine provides a consistent means to perform data cleansing and enrichment as well as high-speed data access to support end users. The semantic model is used to understand and improve the complex data sets, easily build relationships and make new discoveries with the data. The final results are then loaded back into the relational engine so that end users can query them with high performance. Let’s walk through each of the products Melissa Informatics delivers for a final solution.
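Before moving on to the individual products, the hybrid pattern described above can be sketched in a few lines of Python. This is only an illustration of the general idea (derive new relationships from semantic-style facts, then store the results in a relational table), not how the Sentient Suite is implemented. The triples, the `infer_drug_disease_links` rule and the `drug_disease` table are hypothetical, and SQLite stands in for the relational engine.

```python
import sqlite3

# Hypothetical subject-predicate-object triples, invented for illustration.
triples = {
    ("drug_a", "targets", "gene_x"),
    ("gene_x", "associated_with", "disease_y"),
    ("drug_b", "targets", "gene_z"),
}


def infer_drug_disease_links(facts):
    """Derive (drug, disease) pairs by chaining targets -> associated_with."""
    targets = [(s, o) for s, p, o in facts if p == "targets"]
    associations = [(s, o) for s, p, o in facts if p == "associated_with"]
    return sorted(
        (drug, disease)
        for drug, gene in targets
        for assoc_gene, disease in associations
        if gene == assoc_gene
    )


def load_links(connection, links):
    """Store the derived pairs in a relational table for fast lookup."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS drug_disease (drug TEXT, disease TEXT)"
    )
    connection.executemany("INSERT INTO drug_disease VALUES (?, ?)", links)
    connection.commit()


# An in-memory SQLite database stands in for the relational engine.
connection = sqlite3.connect(":memory:")
load_links(connection, infer_drug_disease_links(triples))
rows = connection.execute(
    "SELECT disease FROM drug_disease WHERE drug = ?", ("drug_a",)
).fetchall()
print(rows)
```

The reasoning step runs once over the semantic facts, and end users then read the derived relationships with ordinary SQL.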

Melissa Informatics Sentient Suite

Healthcare data is unusually complex, with many types and forms of data. Unfortunately, this can easily lead to poor data quality and incomplete data sets, which in turn lead to errors and inefficient decision making. To address these needs, Melissa Informatics has built the Sentient Suite, a set of data integration tools for exploring data and running queries to discover data relationships:

  • Sentient Knowledge Explorer
  • Sentient Server - Web Query
  • Sentient Server - Applied Semantic Knowledgebase
  • Sentient Server - Data Manager

The Sentient Suite architecture is dynamic in order to support data discovery, harmonization and integration. This enables medical data scientists to correct the incomplete data sets and errors that lead to poor decisions, create new data relationships and discover data patterns, which drives innovation and discovery and makes data management more efficient.

Sentient Knowledge Explorer

Sentient Knowledge Explorer is a data integration and data exploration tool built on semantic technologies. This tool helps to build ontologies, which are models for data management and relationships. This enables technology and medical professionals to build a unified knowledge database for data analysis.

Sentient Web Query

Sentient Web Query is a web-based interface for medical professionals around the globe to run queries against any relational database, web service or Sentient data store in a secure and compliant manner. This tool also enables medical professionals to import data for comparison and further analysis.

Sentient Applied Semantic Knowledgebase

Sentient Applied Semantic Knowledgebase builds on customer data analysis to help researchers proactively address predictive biology and early stage drug development challenges with hypothesis testing, yielding better decision making.

Sentient Data Manager

Sentient Data Manager is a web-based application for data entry, importing, correction, delivery and reporting. This solution works with data from unstructured text, image data, web services, SQL and NoSQL databases, medical devices and other sources, providing medical professionals with the flexibility they need to efficiently conduct their research.

Melissa Informatics Knowledge Hub

Knowledge Hub is a massive knowledge database used for data enrichment in six key healthcare areas. This database includes content (verified integrated data), lexicons (industry specific terminology to normalize over 17 million terms) and ontologies (over 800 data models). The Knowledge Hub includes:

As an example, the Drugs Knowledge Hub includes data on 193,077 drugs to ensure your organization is using the correct terms in compliance with the FDA in the United States, RxNorm from the NIH, SNOMED CT internationally or other standards as needed. The data includes drugs that are commonly prescribed together, precautions when using a particular drug, and relationships between drugs, genes, proteins and diseases. Further, you are able to append links to 34,756 genes, 3,554 proteins and 58,351 diseases to your drug data.
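To make the idea of lexicon-based normalization and enrichment concrete, here is a minimal Python sketch. It does not use Knowledge Hub data or any Melissa API. The tiny `LEXICON`, the placeholder `LINKS` entries and the function names are all hypothetical and exist only to show the pattern: map each synonym to one preferred term, then append related entities to the record.

```python
# Hypothetical lexicon: each preferred term lists a few synonyms and brand names.
LEXICON = {
    "acetaminophen": ["paracetamol", "apap", "tylenol"],
    "ibuprofen": ["advil", "motrin"],
}

# Placeholder links, invented for illustration only (not real Knowledge Hub content).
LINKS = {
    "acetaminophen": ["GENE_PLACEHOLDER_1", "DISEASE_PLACEHOLDER_1"],
    "ibuprofen": ["GENE_PLACEHOLDER_2"],
}


def build_synonym_index(lexicon):
    """Map every known spelling (case-insensitive) to its preferred term."""
    index = {}
    for preferred, synonyms in lexicon.items():
        for term in [preferred, *synonyms]:
            index[term.casefold()] = preferred
    return index


def normalize_term(raw, index):
    """Return the preferred term for raw input, or None if it is unknown."""
    key = " ".join(raw.split()).casefold()  # collapse whitespace, ignore case
    return index.get(key)


def enrich(record, index, links):
    """Return a copy of the record with the preferred term and related links."""
    preferred = normalize_term(record["drug"], index)
    enriched = dict(record, preferred_drug=preferred)
    enriched["related"] = links.get(preferred, [])
    return enriched


index = build_synonym_index(LEXICON)
result = enrich({"patient_id": 7, "drug": "  Tylenol "}, index, LINKS)
print(result)
```

Unknown terms come back with `preferred_drug` set to `None` and an empty `related` list, so they can be routed to a person for review instead of being silently guessed.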

Below is a screen shot of one interactive Knowledge Hub visualization outlining the relationship between a drug, genes, ontologies and diseases. You can explore nodes, relationships and hierarchies.


The Melissa Informatics Knowledge Hub appends and enriches data ultimately giving healthcare organizations a more complete picture. Further, the Knowledge Hub validates and builds a coherent data set yielding accuracy and time savings.

Melissa Informatics Cloud APIs

Druginator, one of the Melissa Informatics Cloud APIs, is a web service for validating drug names, variants, dosages and spellings. Druginator accepts user input for a drug term, validates the drug, verifies the preferred name and enriches the data to ensure the output terms are standardized for global interoperability, including:

  • Drug Cleansing and Append
  • Drug Identification and Profiling
  • Drug Enrichment
  • Drug Semantics and Harmonization

Druginator saves time researching drugs, ensures data accuracy and reduces errors when cleaning up dirty data from electronic medical records or internal data.
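The kind of spelling validation described above can be illustrated locally with Python's standard `difflib` module. This sketch is not the Druginator API and makes no claim about how that service works. The `KNOWN_DRUGS` list, the `validate_drug_term` function and the 0.8 similarity cutoff are assumptions chosen for illustration.

```python
from difflib import get_close_matches

# Hypothetical reference list of preferred drug names, for illustration only.
KNOWN_DRUGS = ["ibuprofen", "acetaminophen", "amoxicillin"]


def validate_drug_term(raw, known=KNOWN_DRUGS, cutoff=0.8):
    """Classify raw input as valid, corrected or unknown against a reference list."""
    term = raw.strip().lower()
    if term in known:
        return {"input": raw, "status": "valid", "preferred": term}
    # get_close_matches returns the best candidates at or above the cutoff.
    matches = get_close_matches(term, known, n=1, cutoff=cutoff)
    if matches:
        return {"input": raw, "status": "corrected", "preferred": matches[0]}
    return {"input": raw, "status": "unknown", "preferred": None}


for raw in ["Ibuprofen", "ibuprofin", "zzzz"]:
    print(validate_drug_term(raw))
```

A higher cutoff reduces the risk of correcting one drug name into a different drug. In a clinical setting, any corrected term would likely still need review by a person.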


Melissa Informatics provides new opportunities for healthcare organizations facing data quality challenges resulting in poor decision making, missed opportunities and high data management costs. Melissa Informatics delivers comprehensive data discovery, harmonization, integration and research for healthcare with broad quality tools, including relational and semantic technologies for Machine Reasoning and Machine Learning, and Professional Services. This solution drives innovation and decision making at lower costs, enabling healthcare organizations to make data management a competitive advantage.

Product Spotlight sponsored by Melissa, makers of Melissa Informatics.

About the author
Jeremy Kadlec is a Co-Founder, Editor and Author at MSSQLTips with more than 300 contributions. He is also the CTO @ Edgewood Solutions and a six-time SQL Server MVP. Jeremy brings 20+ years of SQL Server DBA and Developer experience to the community after earning a bachelor's degree from SSU and master's from UMBC.

This author pledges the content of this article is based on professional experience and not AI generated.
