Chinese General Practice ›› 2025, Vol. 28 ›› Issue (19): 2407-2413. DOI: 10.12114/j.issn.1007-9572.2023.0512

Special topic: Collection of the Latest Breast Cancer Articles

• Original Article •

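Deep Learning Model-Assisted Prediction of Neoadjuvant Therapy Efficacy in Breast Cancer from Core Needle Biopsy Pathology Images

LUO Yunzhao, JIANG Hongchuan, XU Feng*

  1. Department of Breast Surgery, Beijing Chaoyang Hospital, Capital Medical University, Beijing 100020, China
  • Received: 2024-08-25 Revised: 2024-12-18 Published: 2025-07-05 Online: 2025-05-28
  • Corresponding author: XU Feng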

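  • Author contributions:

    LUO Yunzhao collected and annotated the clinical data and the biopsy pathology WSIs, performed the statistical analysis, built and tested the deep learning model, and wrote the first draft of the manuscript; JIANG Hongchuan proposed the clinical indicators to be studied, formulated the inclusion and exclusion criteria, and selected the study subjects; XU Feng conceived the study, designed the research protocol, was responsible for quality control and review of the study, and takes responsibility for the paper; all authors approved the final manuscript.

  • Funding:
    Beijing Hospitals Authority Youth Programme (QMS20210305)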

Predicting Response to Neoadjuvant Therapy in Breast Cancer Using Deep Learning on Primary Core Needle Biopsy Slides

LUO Yunzhao, JIANG Hongchuan, XU Feng*

  1. Department of Breast Surgery, Beijing Chaoyang Hospital of Capital Medical University, Beijing 100020, China
  • Received: 2024-08-25 Revised: 2024-12-18 Published: 2025-07-05 Online: 2025-05-28
  • Contact: XU Feng

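Abstract: Background Preoperative neoadjuvant therapy (NAT) is a standard treatment for locally advanced breast cancer, but only some patients are sensitive to NAT, so predicting efficacy before NAT begins is crucial. Previous studies that predicted NAT efficacy in breast cancer using statistical methods with clinical data, or deep learning methods with radiological images, performed poorly. Objective To train a deep learning model based on breast cancer core needle biopsy whole slide images (WSI), termed DL-CNB, using multiple instance learning (MIL), in order to predict pathological complete response (pCR) and visualize the relevant tumor regions. Methods In this retrospective study, clinical data and pre-NAT biopsy hematoxylin-eosin (HE) stained slides were collected from breast cancer patients treated with NAT at Beijing Chaoyang Hospital between April 2019 and April 2022. A total of 195 patients were selected according to the inclusion and exclusion criteria. Based on Miller-Payne (MP) grade, patients were divided into a pCR group (MP grade 5, n=40) and a non-pCR group (MP grades 1-4, n=155). The clinical data were analyzed first, and a logistic regression model of the factors influencing pCR was constructed. All WSIs were randomly split into a training set and a test set at a ratio of 4:1, and 25% of the training set was held out as a validation set. All tumor cell regions in each WSI were annotated, and the training set was prepared through sliding-window patch extraction, data filtering, data augmentation, and normalization. Five convolutional neural network models were compared, and the best one was selected as the feature extractor of DL-CNB. Training parameters were set and the DL-CNB model was trained. The model was then tested on the independent test set to evaluate the predictive value of DL-CNB. A heat map was drawn from the weights produced by the attention module to visualize the regions of the WSI that were important to the prediction. Results The proportions of patients with high histological grade, ER negativity, PR negativity, HER2 positivity, and high Ki-67 expression were higher in the pCR group than in the non-pCR group (P<0.05). Compared with HR+/HER2-, HR-/HER2+ (OR=10.189, 95%CI=3.225-32.187) and HR+/HER2+ (OR=3.349, 95%CI=1.152-9.737) predicted achievement of pCR (P<0.05). The logistic regression model had an area under the receiver operating characteristic curve (AUC) of 0.769 and an accuracy of 81.000%. On the independent test set, the DL-CNB model had an AUC of 0.914 and an accuracy of 84.211%. The tumor regions of one randomly selected WSI labeled non-pCR and one labeled pCR from the independent test set were visualized. Conclusion The DL-CNB model predicts pCR after neoadjuvant therapy from breast cancer biopsy WSIs and visualizes the important regions, and its predictions outperform those of the clinical data model. This study can therefore inform clinical decision-making for breast cancer patients who meet the indications for NAT and support individualized precision treatment, which is of great significance for improving patients' quality of life and expected survival.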

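Keywords: Breast neoplasms, Neoadjuvant therapy for breast cancer, Core needle biopsy whole slide images, Deep learning model, Multiple instance learning algorithm, Precision therapy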

Abstract:

Background

Preoperative neoadjuvant therapy (NAT) is a standard treatment for locally advanced breast cancer. However, only some patients are sensitive to NAT, so predicting treatment efficacy before NAT begins is important. Previous studies have predicted the efficacy of NAT in breast cancer using statistical methods combined with clinical data, or deep learning methods combined with medical imaging, but with unsatisfactory results.

Objective

To train a deep learning model based on core needle biopsy whole slide images (WSI) of breast cancer (DL-CNB) using multiple instance learning (MIL), in order to predict pathological complete response (pCR) and visualize the relevant tumor regions.
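As an illustration of how MIL can turn many patch-level features into one slide-level prediction with per-patch importance weights, the following is a minimal sketch of attention-based MIL pooling. It is an assumption about the general technique, not the authors' DL-CNB implementation: the function name `attention_pool`, the toy features, and the parameters `V` and `w` are all hypothetical, and in a real model `V` and `w` would be learned.

```python
import math

def attention_pool(patch_features, w, V):
    """Attention-based MIL pooling (illustrative sketch, not DL-CNB itself).

    Each patch feature h_k gets a score w . tanh(V h_k); a softmax over the
    scores yields weights a_k, and the slide embedding is sum_k a_k * h_k.
    """
    scores = []
    for h in patch_features:
        hidden = [math.tanh(sum(v_ij * h_j for v_ij, h_j in zip(row, h))) for row in V]
        scores.append(sum(w_i * t_i for w_i, t_i in zip(w, hidden)))
    # Numerically stable softmax over the patches in the bag.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(patch_features[0])
    embedding = [sum(a * h[d] for a, h in zip(weights, patch_features)) for d in range(dim)]
    return weights, embedding

# Toy bag: three patches with two-dimensional features (hypothetical values).
bag_features = [[0.1, 0.2], [0.9, 0.8], [0.4, 0.5]]
V = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical attention projection
w = [1.0, 1.0]                # hypothetical attention vector
weights, slide_embedding = attention_pool(bag_features, w, V)
print(weights, slide_embedding)
```

In a sketch like this, the per-patch `weights` are the quantities that could be mapped back to patch positions on the slide to draw a heat map.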

Methods

In this retrospective study, clinical data and pre-NAT biopsy hematoxylin-eosin (HE) stained slides were collected from breast cancer patients who received NAT at Beijing Chaoyang Hospital from April 2019 to April 2022. A total of 195 patients were selected according to the inclusion and exclusion criteria. Patients were divided into a pCR group (MP=5, n=40) and a non-pCR group (MP=1-4, n=155) according to Miller-Payne (MP) grading. The clinical data were analyzed, and a logistic regression model of the factors influencing pCR was constructed. All WSIs were randomly divided into a training set and a test set at a ratio of 4:1, and 25% of the training set was held out as a validation set. All tumor cell regions in each WSI were annotated, and the training set was prepared by sliding-window patch extraction, data filtering, data augmentation, and normalization. Five convolutional neural network models were compared, and the best-performing one was selected as the feature extractor of DL-CNB. Training parameters were set and the DL-CNB model was trained. The predictive value of DL-CNB was evaluated on the independent test set. To visualize the regions of the WSI that were important to the prediction, a heat map was drawn from the weights produced by the attention-based module.
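The 4:1 split followed by a 25% hold-out from the training portion works out to roughly 60% training, 20% validation, and 20% test. A minimal sketch of such a split is shown below; the function `split_slides`, the seed, and the 200 placeholder slide IDs are hypothetical and are not taken from the study, which does not state the number of WSIs in each set.

```python
import random

def split_slides(slide_ids, test_fraction=0.2, val_fraction=0.25, seed=0):
    """Random 4:1 train/test split, then hold out 25% of training for validation."""
    ids = list(slide_ids)
    random.Random(seed).shuffle(ids)  # fixed seed so the split is reproducible
    n_test = round(len(ids) * test_fraction)
    test, train_full = ids[:n_test], ids[n_test:]
    n_val = round(len(train_full) * val_fraction)
    val, train = train_full[:n_val], train_full[n_val:]
    return train, val, test

# Hypothetical slide identifiers, used only to show the resulting proportions.
slides = [f"slide_{i:03d}" for i in range(200)]
train, val, test = split_slides(slides)
print(len(train), len(val), len(test))
```

A purely random split like this one does not balance the pCR and non-pCR classes across sets; whether the study stratified by label is not stated.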

Results

The proportions of patients with high histological grade, ER-negative, PR-negative, and HER2-positive status, and high Ki-67 expression were higher in the pCR group than in the non-pCR group (P<0.05). Compared with the HR+/HER2- subtype, the HR-/HER2+ (OR=10.189, 95%CI=3.225-32.187) and HR+/HER2+ (OR=3.349, 95%CI=1.152-9.737) subtypes predicted achievement of pCR (P<0.05). The logistic regression model had an AUC of 0.769 and an accuracy of 81.000%. On the independent test set, the DL-CNB model had an AUC of 0.914 and an accuracy of 84.211%. One WSI labeled non-pCR and one labeled pCR were randomly selected from the independent test set, and their tumor regions were visualized.
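The reported odds ratios and confidence intervals can be sanity-checked under the assumption that they are Wald intervals from logistic regression, i.e. exp(β ± 1.96·SE). In that case the interval is symmetric on the log scale, so the geometric mean of its bounds should reproduce the odds ratio. The sketch below applies that check to the two reported subtypes; the function name `implied_from_ci` is hypothetical, and the Wald assumption is not confirmed by the abstract.

```python
import math

def implied_from_ci(odds_ratio, lower, upper, z=1.96):
    """Recover the coefficient, its SE, and the OR implied by a Wald 95% CI."""
    beta = math.log(odds_ratio)                        # log-odds coefficient
    se = (math.log(upper) - math.log(lower)) / (2 * z)  # standard error of beta
    implied_or = math.sqrt(lower * upper)              # geometric mean of the bounds
    return beta, se, implied_or

# Values as reported in the Results (OR, lower bound, upper bound).
reported = {
    "HR-/HER2+": (10.189, 3.225, 32.187),
    "HR+/HER2+": (3.349, 1.152, 9.737),
}
checks = {name: implied_from_ci(*vals) for name, vals in reported.items()}
for name, (beta, se, implied_or) in checks.items():
    print(f"{name}: beta={beta:.3f}, SE={se:.3f}, implied OR={implied_or:.3f}")
```

For both subtypes the implied odds ratio agrees with the reported value to within rounding, which is consistent with the Wald assumption.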

Conclusion

The DL-CNB model predicts pCR after neoadjuvant therapy from WSIs of breast cancer core needle biopsies and visualizes the regions that are important to the prediction. Its predictions outperformed those of the logistic regression model built on clinical data. The model can therefore inform clinical decision-making for breast cancer patients who meet the indications for NAT and support individualized precision treatment, which is of great significance for improving patients' quality of life and expected survival.

Key words: Breast cancer, Neoadjuvant therapy for breast cancer, Biopsy pathological WSI, Deep learning model, Multiple instance learning algorithm, Precision therapy
