Clinical Research Beyond ICD Codes – Search Clinical Notes Now

February 26, 2024

In healthcare, precision and accuracy are paramount, especially when it comes to conducting clinical research. Traditionally, the International Classification of Diseases (ICD) has been the cornerstone for identifying and categorizing medical conditions. While ICD-10 codes offer a standardized way to classify diseases and health-related problems, they can fall short in capturing rare diagnoses and other clinical information that does not fit easily or consistently into an ICD category. Clinical notes, filled with detailed descriptions of symptoms, patient history, and contextual information, hold a treasure trove of insights, which can be accessed using the SKAN NLP self-service tool.

Dr. Ryan Hughes, a radiation oncologist specializing in head and neck cancer, recently used SKAN to identify patients for retrospective chart review. Dr. Hughes was seeking to build a cohort of patients diagnosed with Rosai-Dorfman disease, an uncommon histiocytic disorder. Rosai-Dorfman disease is typically coded as ICD-10 D76.3, “Other histiocytosis syndromes”, a code it shares with several other diseases and syndromes. A search for this ICD-10 code yielded 420 patients in the legacy Wake system. However, SKAN was able to narrow that down and more precisely identify 46 patients using text search of clinical notes. This drastically reduced the amount of time Dr. Hughes and his team needed for reviewing charts.

Max Oscherwitz, a medical student researching a dermatological condition known as nevus sebaceous, was also able to use SKAN to home in on 388 patients with this condition. A previous i2b2 search using ICD codes had yielded thousands of possible patients. These use cases demonstrate one of the key strengths of SKAN: its ability to identify diagnoses that may not be explicitly documented using ICD-10 codes. In many cases, clinicians may describe a patient's condition using descriptive language or clinical terminology that does not neatly align with existing diagnostic codes. Researchers can search for these terms to build research cohorts more easily, bridging the gap between clinical documentation and codified data.

SKAN allows researchers to freely search clinical notes for keyword terms in order to identify a cohort of interest. SKAN supports Boolean operators such as “and”, “or”, and “not”, allowing researchers to combine multiple queries to further refine results. The aggregate counts and demographic breakdowns returned from a search may be used for feasibility analysis and as inclusion criteria for data requests requiring additional data elements from the EHR and other sources. Researchers may also use it to view notes that have been de-identified using Natural Language Processing (NLP), for the purpose of confirming or adjusting their search criteria.
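To make the Boolean logic concrete, here is a minimal Python sketch of how “and”, “or”, and “not” conditions narrow a set of notes. It is an illustration only and does not reflect SKAN's actual interface or query syntax: the `search` function, its parameter names, and the sample notes are all hypothetical and invented for this example.

```python
import re

# Hypothetical, made-up notes for illustration; not real patient data.
NOTES = {
    "note-1": "Biopsy consistent with Rosai-Dorfman disease of the neck.",
    "note-2": "History of Langerhans cell histiocytosis; no Rosai-Dorfman features.",
    "note-3": "Nevus sebaceous noted on the scalp since birth.",
}


def contains(term, text):
    """Return True if the phrase appears in the text, ignoring case."""
    return re.search(re.escape(term), text, flags=re.IGNORECASE) is not None


def search(notes, all_of=(), any_of=(), none_of=()):
    """Return IDs of notes matching every all_of term (AND),
    at least one any_of term when given (OR), and no none_of term (NOT)."""
    hits = []
    for note_id, text in notes.items():
        if not all(contains(t, text) for t in all_of):
            continue  # failed an AND condition
        if any_of and not any(contains(t, text) for t in any_of):
            continue  # failed the OR condition
        if any(contains(t, text) for t in none_of):
            continue  # matched an excluded term
        hits.append(note_id)
    return hits


# "rosai-dorfman" AND NOT "langerhans"
rosai = search(NOTES, all_of=["rosai-dorfman"], none_of=["langerhans"])

# "nevus sebaceous" OR "rosai-dorfman"
either = search(NOTES, any_of=["nevus sebaceous", "rosai-dorfman"])

print(rosai)
print(either)
```

A plain keyword match like this one does not understand negated phrasing such as “no Rosai-Dorfman features”, which is one reason reviewing the returned notes to confirm or adjust the search criteria remains useful.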

Data-driven insights drive healthcare innovation, and SKAN enhances how we leverage clinical data to advance medical knowledge and improve patient outcomes. By transcending the limitations of traditional coding systems and harnessing the power of NLP, SKAN opens new avenues for exploration and discovery. Clinical note sets currently available for text search include radiology, pathology, and progress notes, with more on the way. Cohorts identified by researchers using SKAN can be used when requesting data extraction from the Office of Informatics. Visit our website or attend a twice-weekly open consultation session to learn more about SKAN, i2b2, and requesting data to support your research.