
Using a word-similarity method to evaluate text vectorization algorithms

Abstract

Saygin A.A., Fedosin S.A.

Article received: 03.11.2024

The article reviews existing methods of data dimensionality reduction for training machine learning models of natural language. The concepts of text vectorization and word-form embedding are introduced. The text classification task is formulated, and the stages of classifier training are described. A classifying neural network is designed. A series of experiments is conducted to determine how reducing the dimensionality of word-form embeddings affects the quality of text classification. The evaluation results of the trained classifiers are compared.

Keywords: natural language processing, vectorization, word-form embedding, text classification, data dimensionality reduction, classifier