An artificial intelligence model that predicts the onset of rheumatoid arthritis in patients with serologically undiagnosed arthritis.
A deep learning model using clinical data predicts rheumatoid arthritis onset in seronegative patients, enhancing diagnostic accuracy and reducing joint damage by identifying those at risk earlier.
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Patents
- Current Assignee / Owner
- Y&M DATA ANALYSIS SERVICE LLC
- Filing Date
- 2024-09-30
- Publication Date
- 2026-06-18
AI Technical Summary
Existing methods struggle to accurately predict the onset of rheumatoid arthritis in patients with seronegative undifferentiated arthritis, leading to delayed and inaccurate diagnoses, which can result in irreversible joint damage.
A pre-trained deep learning model uses clinical data such as age, sex, BMI, and blood test results to predict the likelihood of developing rheumatoid arthritis.
Enables earlier and more accurate prediction of rheumatoid arthritis onset, improving diagnostic efficiency and reducing the risk of joint destruction.
Abstract
Description
【Technical Field】
【0001】 The present invention relates to a method for predicting the onset of rheumatoid arthritis in patients with seronegative undifferentiated arthritis using a pre-trained artificial intelligence model constructed from clinical data.
【Background Art】
【0002】 Rheumatoid arthritis is a disease characterized by systemic polyarthritis and joint destruction, and delayed treatment may cause irreversible joint destruction. Patients with rheumatoid arthritis are often positive for autoantibodies such as rheumatoid factor and anti-CCP antibody, but about 20% of patients are negative for them (seronegative). Seronegative rheumatoid arthritis is often difficult to diagnose even for rheumatologists, and the sensitivity of the commonly used ACR/EULAR 2010 rheumatoid arthritis classification criteria for it is less than 30%. That is, more than 7 out of 10 patients who actually have rheumatoid arthritis are missed. In clinical practice, therefore, additional examinations such as contrast-enhanced MRI are often performed, or the diagnosis relies on the physician's "intuition", and diagnosis is often delayed. Because rheumatoid arthritis generally develops slowly, there is a period in which arthritis is present but a diagnosis of rheumatoid arthritis cannot yet be made; this state is called undifferentiated arthritis. About 50% of undifferentiated arthritis cases remit spontaneously and about 30% progress to rheumatoid arthritis. Predicting the risk of progression to rheumatoid arthritis in seronegative undifferentiated arthritis would make patient stratification and early intervention possible, but such prediction has been difficult.
【Prior Art Documents】
【Patent Documents】
【0003】 Prior art literature reports that, in a group of patients with undifferentiated arthritis that included seropositive patients, progression to rheumatoid arthritis could be predicted with over 75% accuracy by a machine learning model (support vector machine) built by integrating clinical information with DNA methylation profiling of peripheral blood mononuclear cells (Non-Patent Document 1). The inventors have also reported that progression to rheumatoid arthritis could be predicted with high accuracy using deep learning in a similar group of patients that included seropositive patients (Non-Patent Document 2). However, no artificial intelligence model has been reported for patients with seronegative undifferentiated arthritis, a group in which progression to rheumatoid arthritis is generally more difficult to predict.
【Non-Patent Documents】
【0004】
[Non-Patent Document 1] de la Calle-Fabregat C, Niemantsverdriet E, et al. Prediction of the Progression of Undifferentiated Arthritis to Rheumatoid Arthritis Using DNA Methylation Profiling. Arthritis Rheumatol. 2021 Dec;73(12):2229-2239. doi: 10.1002/art.41885. Epub 2021 Nov 2. PMID: 34105306.
[Non-Patent Document 2] Fujii T, Murata K, Onizawa H, et al. OP0190 A MACHINE LEARNING MODEL THAT PREDICTS RA PROGRESSION FROM UNDIFFERENTIATED ARTHRITIS -KURAMA AND ANSWER COHORT STUDY-. Annals of the Rheumatic Diseases 2023;82:126.
【Summary of the Invention】
【Problem to Be Solved by the Invention】
【0005】 This invention has been made in view of the above problems, and aims to output whether or not a patient with seronegative undifferentiated arthritis will develop rheumatoid arthritis in the future.
【Means for Solving the Problem】
【0006】 The trained model of the present invention takes patient data as input and determines whether or not the patient will develop rheumatoid arthritis in the future. The input consists of only 16 items that can be easily collected in daily clinical practice, such as the patient's age, sex, BMI, and blood test data. Using these inputs, the model acts as a classifier and outputs whether or not the patient will develop rheumatoid arthritis in the future.
【Effects of the Invention】
【0007】 By using this invention, it is possible to determine whether or not a patient will develop rheumatoid arthritis in the future based solely on clinical information easily obtained in daily medical practice. This enables earlier and more accurate diagnosis of rheumatoid arthritis, as well as more efficient medical care.
【Brief Description of the Drawings】
【0008】
[Figure 1] An explanatory diagram showing the neural network structure of the deep learning method described herein.
[Figure 2] An explanatory diagram showing the receiver operating characteristic curve for the training dataset of this disclosure.
[Figure 3] An explanatory diagram showing the confusion matrix for the training dataset of this disclosure.
[Figure 4] An explanatory diagram showing the confusion matrix for the validation dataset of this disclosure.
【Modes for Carrying Out the Invention】
【0009】 Training dataset: A total of 210 patients with seronegative undifferentiated arthritis were followed up to determine whether they eventually developed rheumatoid arthritis, and each patient was labeled accordingly. Of the 210 patients, 57 developed rheumatoid arthritis and 153 did not.
【0010】 Validation dataset: Because machine learning models, including the pre-trained model disclosed here, may overfit to trends in the training data, that is, to the trends of the facilities from which the training data were collected, validation was performed using a separate validation dataset. Data collected separately from the training data, from 125 patients with seronegative undifferentiated arthritis, of whom 40 developed rheumatoid arthritis and 85 did not, were used to verify the accuracy of the model.
【0011】 Creating the pre-trained model: A deep learning model, as shown in Figure 1, was defined and trained using the following input data, recorded at the time of diagnosis of seronegative undifferentiated arthritis: the patient's age, sex, BMI, number of swollen joints, number of tender joints, smoking history, family history of rheumatoid arthritis, pain scores assessed by the healthcare professional and by the patient, CDAI score, HAQ score, rheumatoid factor, anti-cyclic citrullinated peptide antibody level, matrix metalloproteinase 3, erythrocyte sedimentation rate, and C-reactive protein level. The model returns a value between 0 and 1 that reflects the likelihood of developing rheumatoid arthritis in the future.
【0012】 Figure 2 shows the receiver operating characteristic curve of the disclosed trained model. The area under the curve for the training dataset was 0.925. The dotted lines indicate the true positive rate and false positive rate of the model at a threshold of 0.4.
【0013】 With the threshold set to 0.4, performance on the training dataset was 80.7% sensitivity, 73.0% positive predictive value, and 86.7% accuracy. The corresponding confusion matrix is shown in Figure 3.
【0014】 The confusion matrix of the disclosed trained model for the validation dataset is shown in Figure 4. Performance on the validation dataset was 80% sensitivity, 47.1% positive predictive value, and 64.8% accuracy.
【Industrial Applicability】
【0015】 According to this disclosure, it is possible to predict whether or not a patient with seronegative undifferentiated arthritis will develop rheumatoid arthritis in the future, based solely on the patient's clinical data. Because this prediction uses only data typically collected in routine clinical practice, it is particularly useful in the medical field.
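The sensitivity, positive predictive value, and accuracy reported in paragraphs 0013 and 0014 all derive from confusion-matrix counts. The following Python sketch shows that arithmetic. The counts are an assumption: they are not read from Figures 3 and 4, but inferred as the integer counts that match the reported percentages and cohort sizes (57 and 153 patients for training, 40 and 85 for validation). The class name `ConfusionMatrix` is illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    tp: int  # developed RA, predicted to develop RA
    fp: int  # did not develop RA, predicted to develop RA
    fn: int  # developed RA, predicted not to develop RA
    tn: int  # did not develop RA, predicted not to develop RA

    @property
    def sensitivity(self) -> float:
        # Share of patients who developed RA that the model flagged.
        return self.tp / (self.tp + self.fn)

    @property
    def ppv(self) -> float:
        # Share of flagged patients who actually developed RA.
        return self.tp / (self.tp + self.fp)

    @property
    def accuracy(self) -> float:
        # Share of all patients classified correctly.
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)


# ASSUMPTION: counts inferred from the reported percentages and cohort sizes,
# not taken from the figures of the source document.
TRAINING = ConfusionMatrix(tp=46, fp=17, fn=11, tn=136)   # 57 RA, 153 non-RA
VALIDATION = ConfusionMatrix(tp=32, fp=36, fn=8, tn=49)   # 40 RA, 85 non-RA

for name, cm in (("training", TRAINING), ("validation", VALIDATION)):
    print(
        f"{name}: sensitivity={cm.sensitivity:.1%}, "
        f"ppv={cm.ppv:.1%}, accuracy={cm.accuracy:.1%}"
    )
```

Under these inferred counts, sensitivity is about the same on both datasets, and the lower positive predictive value and accuracy on the validation dataset come from a larger share of false positives.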
Claims
[Claim 1] An artificial intelligence model for medical professionals to predict whether patients with inflammatory arthritis that cannot be diagnosed (hereinafter referred to as undiagnosed arthritis) and who test negative for autoantibodies will develop rheumatoid arthritis in the future, wherein the model is trained using, as training data, physical findings data and blood test data of patients with undiagnosed arthritis who ultimately developed rheumatoid arthritis and of those who did not, wherein the model is a deep learning model with learned weighting coefficients, comprising a first input layer that receives the patient's physical findings data, blood test data, and matrix metalloproteinase 3 as input on a computer, three hidden layers, and an output layer that calculates the risk of developing rheumatoid arthritis, and wherein the model is trained to cause the computer to perform calculations based on the learned weighting coefficients on the physical findings data, blood test data, and matrix metalloproteinase 3 input to the input layer, and to output whether or not rheumatoid arthritis will develop in the future.
[Claim 2] The model according to claim 1, characterized in that the physical findings data includes the patient's age, sex, BMI (weight / (height squared)), number of swollen joints, number of tender joints, smoking history, family history of rheumatoid arthritis, pain scores assessed by the healthcare provider and by the patient, respectively, CDAI score (number of swollen joints + number of tender joints + healthcare provider-assessed pain score + patient-assessed pain score), and HAQ score (difficulty in daily living score).
[Claim 3] The model according to claim 1, characterized in that the blood test data includes rheumatoid factor, anti-cyclic citrullinated peptide antibody level, matrix metalloproteinase 3, erythrocyte sedimentation rate, and C-reactive protein level.
[Claim 4] The model according to claim 1, wherein the artificial intelligence model is trained using a machine learning algorithm based on past patient data and predicts the risk of developing rheumatoid arthritis.
[Claim 5] The model according to claim 4, characterized in that the machine learning algorithm is a neural network.
[Claim 6] The model according to any one of claims 1 to 5, characterized in that it includes an output layer that outputs to a medical professional whether or not rheumatoid arthritis will develop in the future.
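The architecture recited in Claim 1 (an input layer, three hidden layers, and an output layer that calculates risk), with the 16 inputs listed in Claims 2 and 3, can be illustrated with a minimal forward-pass sketch in Python. It is not the disclosed trained model, and the following are all assumptions, since the source does not specify them: the hidden-layer widths, the ReLU and sigmoid activations, the random placeholder weights, the feature names and their 0/1 encodings, the `>=` comparison at the threshold, and the patient values. Only the feature list, the layer count, the BMI and CDAI formulas, and the 0.4 threshold come from the text.

```python
import math
import random

# The 16 inputs named in Claims 2 and 3. Names and order are illustrative.
FEATURES = [
    "age", "sex", "bmi", "swollen_joints", "tender_joints", "smoking_history",
    "family_history_ra", "provider_pain_score", "patient_pain_score", "cdai",
    "haq", "rheumatoid_factor", "anti_ccp", "mmp3", "esr", "crp",
]
HIDDEN_SIZES = [32, 16, 8]  # ASSUMPTION: three hidden layers, widths not given
THRESHOLD = 0.4             # decision threshold reported in the description


def bmi(weight_kg, height_m):
    """BMI as defined in Claim 2: weight / (height squared)."""
    return weight_kg / height_m ** 2


def cdai(swollen, tender, provider_score, patient_score):
    """CDAI as defined in Claim 2: the sum of its four components."""
    return swollen + tender + provider_score + patient_score


def sigmoid(v):
    """Numerically stable logistic function with output in the range 0 to 1."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def init_layers(sizes, seed=0):
    """Random placeholder weights; a real model would load learned weights."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def forward(layers, x):
    """Dense layers with ReLU (assumed) and a sigmoid output unit (assumed)."""
    for i, (weights, biases) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        is_output = i == len(layers) - 1
        x = [sigmoid(v) for v in z] if is_output else [max(0.0, v) for v in z]
    return x[0]


def predict(layers, patient):
    """Return (risk score, True if the score reaches the threshold)."""
    risk = forward(layers, [float(patient[name]) for name in FEATURES])
    return risk, risk >= THRESHOLD


# Hypothetical patient record; every value here is made up for illustration.
patient = {
    "age": 58, "sex": 1, "bmi": bmi(60.0, 1.60), "swollen_joints": 3,
    "tender_joints": 4, "smoking_history": 0, "family_history_ra": 1,
    "provider_pain_score": 2.5, "patient_pain_score": 4.0,
    "cdai": cdai(3, 4, 2.5, 4.0), "haq": 0.5, "rheumatoid_factor": 5.0,
    "anti_ccp": 0.6, "mmp3": 85.0, "esr": 30.0, "crp": 1.2,
}

layers = init_layers([len(FEATURES)] + HIDDEN_SIZES + [1])
risk, will_develop = predict(layers, patient)
print(f"risk={risk:.3f}, predicted to develop RA: {will_develop}")
```

Because the weights are random, the printed risk has no clinical meaning. In practice the inputs would presumably also be standardized before entering the network, which the source does not describe.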