Document

Phenotyping Algorithms for Identifying Knee Surgeries after ACLR

Description

Objective: Up to 34% of individuals who undergo anterior cruciate ligament reconstruction (ACLR) require an unplanned subsequent knee surgery, including surgeries for failed ACLR, contralateral anterior cruciate ligament tear, meniscus or cartilage injury, and loss of motion. Electronic health record (EHR) data provides a rich resource for studying these procedures, but identification can be costly when chart review is used or inaccurate when claims-based codes are used. The purpose of this study was to determine the performance of 1) structured data (diagnoses and procedure codes) alone versus 2) structured data plus information extracted from unstructured operative reports using natural language processing (NLP) to identify cases of subsequent knee surgery after ACLR.

Methods: Individuals with and without Current Procedural Terminology (CPT) codes related to subsequent knee surgery were randomly selected from an overall dataset of 5,234 individuals. A clinician manually reviewed the EHR of each case to identify all subsequent knee surgeries and procedures following ACLR, and this review served as the gold standard for comparison. Recall, precision, and balanced F1 score were used to evaluate the performance of CPT codes alone versus algorithms utilizing CPT codes, diagnosis codes, and/or information extracted using NLP methods from operative reports to identify subsequent knee surgeries.
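As an illustration of the evaluation metrics named above, the following minimal sketch computes recall, precision, specificity, and F1 score by comparing algorithm output against gold-standard chart review. The function name, argument names, and example data are assumptions made for illustration only; they are not taken from the study.

```python
def classification_metrics(predicted, actual):
    """Compare algorithm flags with gold-standard labels (one boolean per individual).

    Hypothetical helper: `predicted` is the algorithm's output and `actual`
    is the chart-review label. Returns the metrics as a dict.
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)          # true positives
    fp = sum(1 for p, a in pairs if p and not a)      # false positives
    fn = sum(1 for p, a in pairs if not p and a)      # false negatives
    tn = sum(1 for p, a in pairs if not p and not a)  # true negatives

    # Guard against empty denominators by returning 0.0 in that case.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Balanced F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"recall": recall, "precision": precision,
            "specificity": specificity, "f1": f1}


# Made-up example: five individuals, algorithm flags versus chart review.
example_predicted = [True, True, False, False, True]
example_actual = [True, False, False, True, True]
example_metrics = classification_metrics(example_predicted, example_actual)
```

In this made-up example there are two true positives, one false positive, one false negative, and one true negative, so recall, precision, and F1 are each 2/3 and specificity is 1/2.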

Results: The records of 378 individuals were reviewed, including 328 cases in the derivation set (169 cases with subsequent knee surgery CPT codes; 159 without) and 50 in the validation set, all with subsequent knee surgery CPT codes. Using the presence or absence of identified CPT codes alone, individuals who did or did not undergo subsequent knee surgery were identified with > 0.98 specificity, recall, precision, and F1 score. When identifying the specific subsequent surgeries performed, only meniscus procedures were identified with > 0.9 performance metrics using CPT codes alone. The highest performance metrics for each subsequent surgery category were achieved using an algorithm that combined CPT codes, diagnosis codes, and/or information extracted from the operative report (F1 score ranged from 0.839 for cartilage procedures to 1 for posterior cruciate ligament procedures and synovial procedures).
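One way a combination algorithm of this kind might be structured is sketched below. This is not the study's algorithm: the decision rule (CPT evidence alone, or a diagnosis code corroborated by operative report text), the class and function names, the placeholder code values, and the keyword pattern are all assumptions chosen for illustration. A simple regular expression stands in for the NLP step and does not handle negation or context.

```python
import re
from dataclasses import dataclass, field


@dataclass
class SurgeryRecord:
    """Hypothetical container for one subsequent knee surgery."""
    cpt_codes: set = field(default_factory=set)
    dx_codes: set = field(default_factory=set)
    operative_report: str = ""


@dataclass
class CategoryRule:
    """Hypothetical evidence sources for one surgery category."""
    cpt_codes: set
    dx_codes: set
    report_pattern: re.Pattern


def flag_category(record, rule):
    """Flag the category if CPT evidence is present, or if a diagnosis
    code and a matching phrase in the operative report agree."""
    has_cpt = bool(record.cpt_codes & rule.cpt_codes)
    has_dx = bool(record.dx_codes & rule.dx_codes)
    has_text = bool(rule.report_pattern.search(record.operative_report))
    return has_cpt or (has_dx and has_text)


# Placeholder code values, not real CPT or diagnosis codes.
meniscus_rule = CategoryRule(
    cpt_codes={"CPT-X1"},
    dx_codes={"DX-Y1"},
    report_pattern=re.compile(r"\bmenisc(?:us|al) repair\b", re.IGNORECASE),
)

example_record = SurgeryRecord(
    dx_codes={"DX-Y1"},
    operative_report="A medial meniscal repair was performed.",
)
example_flag = flag_category(example_record, meniscus_rule)
```

Here the example record carries no matching procedure code, but the diagnosis code and the report text together support the category, so it is flagged. A different category could plausibly use a different rule, which is consistent with the "and/or" wording in the results.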

Conclusions: Procedure codes alone were sufficient for differentiating individuals who underwent a subsequent knee surgery from those who did not, but algorithms using a combination of procedure codes, diagnosis codes, and information extracted from the operative report are needed to reliably identify the specific procedures performed during subsequent knee surgeries after ACLR. Application of these methods, including the use of NLP, could improve the accuracy of data extraction from large medical record data sets to facilitate clinical research efforts.

Authors

Kathleen M. Poploski

University of Pittsburgh

Sahil Dadoo

University of Pittsburgh

Jeffrey B. Moorhead Jr

University of Pittsburgh

Scott Rothenberger

University of Pittsburgh

Jonathan D. Hughes

University of Pittsburgh

Volker Musahl

University of Pittsburgh

Richard D. Boyce

University of Pittsburgh

James J. Irrgang

University of Pittsburgh

ESSKA Continuous Professional Education Partners