Establishment and Empirical Study of a Mud Cake Formation Risk Prediction Model for Shield Machines Based on Imbalanced Samples

    Abstract:
    Objective Predicting cutterhead mud cake formation from shield machine data is valuable for ensuring tunnel construction safety and improving construction efficiency. When handling such small-sample data, traditional machine-learning models struggle to capture minority-class features, so they tend to learn the majority class and neglect the minority class, which degrades early-warning performance. It is therefore necessary to establish a mud cake formation risk prediction model for shield tunneling based on imbalanced samples and to validate it empirically.
    Method First, feature engineering is applied to remove shutdown data and identify stable excavation segments. Then, key features for mud cake prediction are selected by combining feature-importance evaluation with correlation analysis. On this basis, the Focal Loss function is embedded into an LSTM (long short-term memory) network to strengthen the model's attention to minority-class samples. A metro shield tunneling project in Changchun is used as a case study to verify the model's prediction accuracy.
    Result & Conclusion The preprocessing framework for raw EPB (earth pressure balance) shield tunneling data effectively improves data quality. Orthogonal tests determine the optimal hyperparameter combination for the Focal Loss function: modulating exponent γ = 1.000 and class weight for the true class αz = 0.750. On the same dataset and with the same hyperparameter settings, the traditional LSTM model achieves an F1-score of 0.724, whereas the LSTM model with Focal Loss achieves an F1-score of 0.982. This increase indicates that introducing the Focal Loss function effectively improves prediction performance on imbalanced samples.
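
    For readers unfamiliar with the loss, the following is a minimal sketch of a binary focal loss and of the F1-score used for evaluation. It is not the authors' implementation. It assumes the common formulation FL(p_t) = -α_t (1 - p_t)^γ log(p_t), assumes that mud cake formation is the positive (minority) class, and assumes that the reported class weight 0.750 applies to that class. The function names and the sample values are hypothetical.

```python
import math


def binary_focal_loss(p, y, gamma=1.0, alpha=0.75, eps=1e-7):
    """Focal loss for one sample (assumed binary formulation).

    p: predicted probability of the positive class (assumed: mud cake).
    y: true label, 1 for positive, 0 for negative.
    gamma, alpha: defaults follow the values reported in the abstract;
    applying alpha to the positive class is an assumption of this sketch.
    """
    p = min(max(p, eps), 1.0 - eps)            # avoid log(0)
    p_t = p if y == 1 else 1.0 - p             # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # weight of the true class
    # (1 - p_t) ** gamma shrinks the loss of well-classified samples
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_focal_loss(probs, labels, gamma=1.0, alpha=0.75):
    """Average focal loss over a batch of predictions."""
    losses = [binary_focal_loss(p, y, gamma, alpha) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)


def f1_score(y_true, y_pred):
    """F1-score of the positive class: 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical predictions: mostly negatives, one positive that is hard to classify.
    probs = [0.05, 0.10, 0.02, 0.30]
    labels = [0, 0, 0, 1]
    print("mean focal loss:", mean_focal_loss(probs, labels))
    print("F1 at threshold 0.5:", f1_score(labels, [int(p >= 0.5) for p in probs]))
```

    With γ = 0 and α = 0.5 the expression reduces to half the ordinary cross-entropy, so the two hyperparameters can be read as controls on how strongly easy samples are down-weighted and how much weight the minority class receives. In an LSTM classifier, a loss of this form would replace cross-entropy on the output probability of the final layer.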

     
