Prediction algorithm for failed batch jobs in co-located cloud

Authors: Lin Weiwei; Shi Fang; Li Yurui; Liu Fagui; Liu Jie; Peng Shaoliang; Wang James Z.
Source: Journal of National University of Defense Technology, 2022, 44(5): 71-79.
DOI: 10.11887/j.cn.202205008

Abstract

To reduce the risk posed by failed batch jobs in a co-located cloud, the K-means algorithm was used to divide batch jobs into four categories. Building on this classification, a two-layer nested classification model (TLNM) was proposed, and a prediction algorithm based on TLNM was implemented. Experimental results on the Ali Trace 2018 data set show that the receiver operating characteristic (ROC) curve of this algorithm is significantly better than those of other commonly used classifiers, and that the area under the ROC curve (AUC) reaches 0.978, indicating good classification performance. The recall rate reaches 0.951. The confusion matrix shows that the TLNM algorithm accurately predicts failed batch jobs. © 2022 National University of Defense Technology.
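The two headline metrics in the abstract, recall and AUC, can be illustrated with a minimal sketch. This is not the authors' implementation and does not reproduce TLNM; the function names and the toy labels and scores below are hypothetical and are not drawn from Ali Trace 2018. Label 1 is assumed to mean a failed batch job.

```python
def recall(y_true, y_pred):
    """Fraction of truly failed jobs (label 1) that were predicted as failed."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def auc(y_true, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen failed job receives a higher score than a randomly chosen
    successful job (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = failed job, 0 = successful job.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]  # predicted failure scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]    # assumed 0.5 threshold

print("recall:", recall(y_true, y_pred))
print("AUC:", auc(y_true, scores))
```

Recall depends on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two are reported together.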

Full text