Description
Introduction
Positioning landmarks on X-ray radiography images plays a pivotal role in post-surgical patient follow-up and the assessment of prosthesis alignment. The integration of artificial intelligence into medical imaging has demonstrated the potential for automating and standardizing landmark localization through deep learning models. However, the critical challenge lies in ensuring the robustness and generalizability of these algorithms to make them clinically reliable.
To this end, we introduce an artificial intelligence algorithm based on heatmap prediction for landmark positioning on X-ray radiographs of knee prostheses. Our study shows that the model improves the reliability of landmark prediction, surpassing the results of a previously developed regression algorithm, while mitigating overfitting when applied to data from an external clinical practice.
Materials and methods
We selected 4994 frontal and lateral knee X-ray radiographs showing a total knee arthroplasty, drawn from two different picture archiving and communication systems (PACS-1 with 3893 images and PACS-2 with 1101 images). For each image, orthopedic surgeons annotated 16 anatomical landmarks. PACS-1 was split into training (60%), validation (20%) and test (20%) sets, while PACS-2 was used only as an external test set.
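The data partition described above can be sketched as follows. This is a minimal illustration, not the study's actual procedure: the random image-level split, the seed, the function name split_ids and the placeholder image identifiers are all assumptions, since the abstract does not say how images were assigned to each subset (for example, whether images of the same patient were kept together).

```python
import random


def split_ids(image_ids, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle identifiers and cut them into train / validation / test lists.

    Hypothetical helper: the remaining fraction (here 20%) becomes the test set.
    """
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    n_train = int(train_frac * len(ids))
    n_val = int(val_frac * len(ids))
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]
    return train, val, test


# Placeholder identifiers standing in for the real images.
pacs1_ids = [f"pacs1_{i:04d}" for i in range(3893)]
pacs2_ids = [f"pacs2_{i:04d}" for i in range(1101)]

train_ids, val_ids, internal_test_ids = split_ids(pacs1_ids)
external_test_ids = pacs2_ids  # PACS-2 is never seen during training
```

Keeping PACS-2 entirely out of training and validation is what lets its error be read as a measure of generalization to another clinical site.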
The classic approach to landmark prediction uses a regression model that predicts the x and y coordinates of each landmark of interest. We investigated a second approach, the heatmap model: the neural network predicts a heatmap in which each pixel value corresponds to the probability that the pixel is the landmark of interest. For the regression model we used a ResNet-50, and for the heatmap model a U-Net architecture.
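Unlike the regression model, which outputs coordinates directly, a heatmap model needs a decoding step to turn each predicted heatmap into a landmark position. The abstract does not state how this is done; the sketch below assumes the simplest common choice, taking the pixel with the highest score. The function name heatmap_to_xy and the toy values are illustrative only.

```python
def heatmap_to_xy(heatmap):
    """Return the (x, y) pixel with the highest score in a 2D grid.

    `heatmap` is a list of rows; rows index y and columns index x.
    """
    best_value = float("-inf")
    best_xy = (0, 0)
    for y, row in enumerate(heatmap):
        for x, value in enumerate(row):
            if value > best_value:
                best_value = value
                best_xy = (x, y)
    return best_xy


# Toy 3 x 4 heatmap for a single landmark.
toy_heatmap = [
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.3, 0.9, 0.2],
    [0.0, 0.1, 0.2, 0.0],
]
landmark_xy = heatmap_to_xy(toy_heatmap)  # peak is at column 2, row 1
```

With 16 landmarks per image, the network would output 16 such maps and the decoding would be applied to each one independently.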
In Figure 1, we illustrate how heatmaps are generated for 3 knee X-rays from the landmarks placed by the orthopedic surgeons. We evaluated the mean squared error for each predicted landmark. We used the PyTorch library and data augmentation techniques to improve the results.
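As a rough illustration of these two steps, the sketch below builds a target heatmap from one annotated landmark and computes a mean squared error per landmark. Several points are assumptions not given in the abstract: the Gaussian shape of the target, the value of sigma, the choice to measure the error on pixel coordinates, and the function names. The study itself used PyTorch; plain Python is used here only to keep the example dependency-free.

```python
import math


def gaussian_heatmap(height, width, x, y, sigma=3.0):
    """Return a height x width grid that peaks at 1.0 on the landmark (x, y)."""
    two_sigma_sq = 2.0 * sigma ** 2
    return [
        [math.exp(-((col - x) ** 2 + (row - y) ** 2) / two_sigma_sq)
         for col in range(width)]
        for row in range(height)
    ]


def per_landmark_mse(predicted, annotated):
    """Mean squared coordinate error for each landmark, averaged over images.

    Both arguments are lists (one entry per image) of lists of (x, y)
    tuples (one entry per landmark).
    """
    n_images = len(predicted)
    n_landmarks = len(predicted[0])
    mse = []
    for k in range(n_landmarks):
        total = 0.0
        for i in range(n_images):
            px, py = predicted[i][k]
            ax, ay = annotated[i][k]
            total += (px - ax) ** 2 + (py - ay) ** 2
        mse.append(total / n_images)
    return mse


# One target heatmap for a landmark annotated at x=4, y=2 on an 8 x 10 image.
target = gaussian_heatmap(8, 10, x=4, y=2)

# Two images with two landmarks each (toy values).
predicted = [[(0, 0), (3, 4)], [(1, 1), (0, 0)]]
annotated = [[(0, 0), (0, 0)], [(1, 1), (0, 0)]]
errors = per_landmark_mse(predicted, annotated)
```

A wider sigma gives the network a smoother target that is easier to learn, at the cost of a less sharply localized peak.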
Results
Table 1 compares the results obtained with the two models on the two test sets. While the average error on PACS-1 remains stable, the results improve on PACS-2, showing that the heatmap algorithm generalizes better. In addition, the standard deviation of the error on the PACS-1 test set increases, which could point to overfitting of the regression model.
Table 1. Average error (mean +/- standard deviation) for each model on each test set.
- PACS-1 average error:
  - Regression model: 18.53 +/- 15.80
  - Heatmap model: 18.270 +/- 32.059
- PACS-2 average error:
  - Regression model: 44.40 +/- 38.56
  - Heatmap model: 26.204 +/- 31.839
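To make the size of the change easier to read, the relative reduction in average error can be derived from the reported means. The short sketch below is only arithmetic on the values listed above; the helper name relative_reduction is illustrative, and no statistical test is implied.

```python
def relative_reduction(baseline, new):
    """Fraction by which `new` is lower than `baseline`."""
    return (baseline - new) / baseline


# (regression mean error, heatmap mean error) as reported in the results.
mean_errors = {
    "PACS-1": (18.53, 18.270),
    "PACS-2": (44.40, 26.204),
}

reductions = {
    site: relative_reduction(regression, heatmap)
    for site, (regression, heatmap) in mean_errors.items()
}

for site, value in reductions.items():
    print(f"{site}: {value * 100:.1f}% lower average error with the heatmap model")
```

On these figures the heatmap model lowers the average error by about 1.4% on PACS-1 and about 41.0% on PACS-2, which matches the description of a stable internal result and a clear external improvement.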
Discussion
Artificial intelligence algorithms applied to medical imaging are a real resource for moving towards the personalized medicine of the future. Nonetheless, they need to generalize across different clinical cohorts to be effective and reliable. In this study, we showed the importance of exploring state-of-the-art algorithms more thoroughly to improve results. Because this study shows how reliably this AI method can locate the landmarks used for radiological measurements on knee prosthesis radiographs from two clinical sources, such technologies could be integrated into the orthopedic surgeon's routine.