Chinese General Practice, 2025, Vol. 28, Issue 23: 2870-2877. DOI: 10.12114/j.issn.1007-9572.2024.0525

• Original Article •

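Study on Nomogram Prediction Model for Risk of Gastric Cancer

ZHOU Qian1, WU Xiaomin1,2, WANG Baohua1, YAN Ruohan1,2, YU Miao1,3, WU Jing1,*

  1. National Center for Chronic and Non-communicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing 100050, China
  2. School of Public Health, Tianjin Medical University, Tianjin 300070, China
  3. School of Public Health, Inner Mongolia Medical University, Hohhot 010107, China
  • Received: 2024-08-29 Revised: 2024-11-11 Published: 2025-08-15 Online: 2025-06-17
  • Corresponding author: WU Jing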

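  • Author contributions:

    ZHOU Qian proposed the main research objective, was responsible for the conception and design of the study, and wrote the manuscript; ZHOU Qian, WU Xiaomin, and WANG Baohua collected and organized the data, performed the statistical analysis, and prepared the figures and tables; ZHOU Qian, YAN Ruohan, and YU Miao revised the manuscript; WANG Baohua and WU Jing were responsible for quality control and review of the article, took overall responsibility for it, and provided supervision.

  • Funding:
    National Science and Technology Major Project (2023ZD0509800)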

Abstract:

Background

Gastric cancer (GC) poses a serious threat to the health of the Chinese population. Predicting an individual's risk of developing GC helps identify high-risk groups early, so that targeted interventions can be taken to prevent or delay GC progression.

Objective

To establish and validate a nomogram model for predicting an individual's risk of developing GC.

Methods

GC patients aged ≥40 years who were diagnosed between January 2020 and July 2021 in the cancer registry systems of 14 counties (districts) in Anhui, Henan, Shandong, and Jiangsu provinces were selected as the case group (684 cases). Members of the general population matched on gender, age, place of residence, and health status were enrolled at a 1∶2 ratio as the control group (1 368 controls). All subjects were randomly divided into a training set (1 641 subjects) and a validation set (411 subjects) at a ratio of 8∶2. Multivariate Logistic regression analysis was used to screen variables and build the nomogram prediction model. The receiver operating characteristic (ROC) curve of the model for predicting GC risk was plotted; discrimination was evaluated by the area under the ROC curve (AUC) and calibration by the Hosmer-Lemeshow test. The model was validated using the Bootstrap method, and decision curve analysis (DCA) was used to evaluate its clinical applicability.
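The split and evaluation steps named in the Methods can be illustrated with a minimal sketch in plain Python. It is not the authors' code: the toy labels and probabilities are invented, the helper names (`split_8_2`, `auc`, `brier_score`, `net_benefit`) are hypothetical, and the Hosmer-Lemeshow test and Bootstrap resampling are omitted. The split assumes the training size is the integer part of 80% of the sample, which is consistent with the reported 1 641 and 411.

```python
import random

def split_8_2(ids, seed=0):
    """Shuffle subject IDs and cut them 8:2 into training and validation sets."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)  # assumption: training size is rounded down
    return shuffled[:cut], shuffled[cut:]

def auc(y_true, y_prob):
    """AUC as the share of case/control pairs ranked correctly (ties count 0.5)."""
    cases = [p for y, p in zip(y_true, y_prob) if y == 1]
    controls = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def net_benefit(y_true, y_prob, threshold):
    """Decision-curve net benefit at one threshold probability."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

# 684 cases + 1 368 controls = 2 052 subjects, as reported in the Methods.
train, valid = split_8_2(range(684 + 1368))

# Invented toy predictions, only to exercise the metric functions.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]
toy_auc = auc(y_true, y_prob)
toy_brier = brier_score(y_true, y_prob)
toy_nb = net_benefit(y_true, y_prob, 0.5)
```

A decision curve is obtained by evaluating `net_benefit` over a grid of thresholds and comparing it with the treat-all and treat-none strategies.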

Results

Multivariate Logistic regression analysis showed that a preference for salty food (OR=1.690, 95%CI=1.333-2.142), a preference for hard food (OR=1.596, 95%CI=1.145-2.225), a preference for spicy food (OR=1.387, 95%CI=1.093-1.760), exposure to secondhand smoke (OR=1.880, 95%CI=1.473-2.399), frequently losing one's temper (OR=3.283, 95%CI=2.236-4.819), a history of stomach disease (OR=4.008, 95%CI=3.046-5.273), a history of cancer in first-degree relatives (OR=1.549, 95%CI=1.170-2.051), Helicobacter pylori (Hp) infection (OR=1.298, 95%CI=1.028-1.693), and a high-salt diet (OR=1.338, 95%CI=1.033-1.734) were independent risk factors for GC (P<0.05). Junior high school education (OR=0.616, 95%CI=0.468-0.811), high school education or above (OR=0.491, 95%CI=0.342-0.703), a regular diet (OR=0.542, 95%CI=0.405-0.726), and consumption of raw garlic or garlic sprouts (OR=0.501, 95%CI=0.394-0.636) were protective factors against GC (P<0.05). The AUC for predicting GC risk was 0.768 (95%CI=0.744-0.792) in the training set and 0.776 (95%CI=0.728-0.823) in the validation set. Bootstrap validation showed that the calibration curve agreed well with the actual curve (Brier score: training set=0.177; validation set=0.176). The Hosmer-Lemeshow test showed that the model fit well (training set: χ2=4.408, P=0.819; validation set: χ2=4.650, P=0.794). DCA showed that when the threshold was between 0.05 and 0.79, using the nomogram model to predict GC risk could provide a clinical benefit to patients.

Conclusion

The nomogram model constructed in this study can predict an individual's risk of developing GC, facilitates early identification of high-risk groups, and helps in formulating targeted, individualized interventions.

Key words: Gastric cancer, Risk factors, Nomogram, Prediction model, Case-control study, Logistic regression
