Abstract
Similar text positioning is an important part of plagiarism detection. Existing positioning methods directly merge text or fingerprints to obtain similar passages, but interfering information within the similar text makes their positioning accuracy poor. This paper analyzes the semantic features of matched fingerprints and proposes a clustering method based on slope density for similar text positioning. The method converts the text-merging problem into a problem of clustering dense sample points, which improves both the efficiency and the accuracy of positioning. Experiments on the public PAN corpus show that the method performs better than the top three entries of PAN10. The method has been used on the South China University of Technology's feature professional teaching platform to detect plagiarism in homework.
- Affiliation