Combined Method for Summarizing Russian-Language Texts
Abstract
Incoming article date: 30.06.2025
This article presents a combined method for summarizing Russian-language texts that integrates extractive and abstractive approaches to overcome the limitations of each used alone. Before summarization, the text passes through three stages: preprocessing, comprehensive linguistic analysis using RuBERT, and clustering by semantic similarity. Summarization itself combines extractive selection with the TextRank algorithm and abstractive refinement with the RuT5 neural network model. Experiments on the Gazeta.Ru news corpus showed that the combined approach outperforms both purely extractive methods (such as TF-IDF and statistical methods) and purely abstractive methods (such as RuT5 and mBART) in terms of precision, recall, F-score, and ROUGE metrics.
Keywords: combined method, summarization, Russian-language texts, TextRank, RuT5
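The extractive stage named in the abstract can be illustrated with a minimal TextRank sketch. This is not the authors' implementation: the abstract does not state how sentence similarity is computed, so the sketch assumes bag-of-words cosine similarity, a naive regular-expression sentence splitter, and conventional defaults for the damping factor and iteration count. All function names (split_sentences, textrank, extract_summary) are hypothetical.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercased word tokens; \w matches Cyrillic letters in Python 3.
    return re.findall(r"\w+", sentence.lower())


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def textrank(sentences, damping=0.85, iterations=50):
    # Weighted PageRank over the sentence similarity graph.
    n = len(sentences)
    if n == 0:
        return []
    bags = [Counter(tokenize(s)) for s in sentences]
    weights = [
        [cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sums = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            rank = sum(
                weights[j][i] / out_sums[j] * scores[j]
                for j in range(n)
                if out_sums[j] > 0
            )
            updated.append((1 - damping) / n + damping * rank)
        scores = updated
    return scores


def extract_summary(text, k=2):
    # Keep the k highest-scoring sentences, restored to document order.
    sentences = split_sentences(text)
    scores = textrank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    sample = (
        "Кошка спит на диване. Кошка ест рыбу на диване. "
        "Собака и кошка спят. Погода сегодня ясная."
    )
    for sentence in extract_summary(sample, k=2):
        print(sentence)
```

In the combined method described above, sentences selected this way would then be passed to an abstractive model such as RuT5 for refinement. That step is omitted here because it depends on a specific pretrained checkpoint that the abstract does not name.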