Abstract

Computing the semantic similarity between words is an important problem in many research fields, including linguistics, natural language processing, and artificial intelligence. In this paper, existing semantic similarity metrics are first introduced and analyzed to determine their advantages and limitations. A new unsupervised, context-based semantic similarity metric for Uyghur is then proposed that takes the characteristics of the Uyghur language into account. The proposed metric is fully automatic, does not require any annotated knowledge resources, and can be applied to other languages. The metric is evaluated on the Miller & Charles data set under different feature weighting schemes and as a function of the number of Web documents used. Fifty Uyghur speakers took part in the experiments. The results show that the correlation scores between the context-based similarity metric and human judgments are significantly higher than those of the co-occurrence-based metrics.