Engineering Journal of Don

Comparative Analysis of Methods of Knowledge Extraction from Texts for Building Ontologies
This article presents a comparative analysis of methods for extracting knowledge from texts in order to build ontologies. It reviews lexical, statistical, machine learning, deep learning, and ontology-oriented extraction approaches. Based on the study, recommendations are formulated for choosing the most effective methods depending on the specifics of the task and the type of data being processed.

Keywords: ontology, knowledge extraction, text classification, named entities, machine learning, semantic analysis, model
Using the determining the similarity of words method to evaluate text vectorization algorithms
- Saygin A.A.
- Fedosin S.A.
The article presents existing methods of data dimensionality reduction for training machine learning models of natural language. The concepts of text vectorization and word-form embedding are introduced. The text classification task is formulated, and the stages of classifier training are defined. A neural network classifier is designed. A series of experiments is conducted to determine how reducing the dimensionality of word-form embeddings affects the quality of text classification. The evaluation results of the trained classifiers are compared.

Keywords: natural language processing, vectorization, word-form embedding, text classification, data dimensionality reduction, classifier
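
The abstract above does not name the dimensionality reduction method used, so the following is only a minimal sketch of one common option, principal component analysis (PCA) computed through a singular value decomposition. The function name `pca_reduce`, the synthetic vectors, and the choice of NumPy are assumptions made for illustration and are not taken from the article.

```python
import numpy as np


def pca_reduce(embeddings, k):
    """Project embeddings onto their top-k principal components.

    Returns the reduced matrix and the fraction of variance retained.
    This is an illustrative sketch, not the article's method.
    """
    X = np.asarray(embeddings, dtype=float)
    centered = X - X.mean(axis=0)  # PCA requires mean-centered data
    # Rows of Vt are the principal directions, ordered by singular value.
    _, singular_values, Vt = np.linalg.svd(centered, full_matrices=False)
    reduced = centered @ Vt[:k].T
    variance = singular_values ** 2
    retained = variance[:k].sum() / variance.sum()
    return reduced, retained


# Synthetic stand-in for word-form embeddings (hypothetical data):
# 200 vectors of dimension 50 that lie close to a 5-dimensional subspace.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 5))
mixing = rng.normal(size=(5, 50))
vectors = latent @ mixing + 0.01 * rng.normal(size=(200, 50))

reduced, retained = pca_reduce(vectors, k=5)
print(reduced.shape, round(float(retained), 4))
```

In an experiment like the one the abstract describes, the reduced vectors would be fed to the classifier in place of the full embeddings, and classification quality would be compared across several values of `k`.
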
Using the determining the similarity of words method to evaluate text vectorization algorithms
- Saygin A.A.
- Fedosin S.A.
The article provides a brief description of existing methods for vectorizing natural-language texts. Evaluation by the word-similarity method is described. A comparative analysis of several vectorizer models is carried out. The process of selecting data for the evaluation is described. The performance results of the models are compared.

Keywords: natural language processing, vectorization, word-form embedding, semantic similarity, correlation
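
Word-similarity evaluation is usually understood as follows: the cosine similarity that a model assigns to each word pair is compared with human similarity ratings for the same pairs, and the agreement is summarized by a rank correlation. The sketch below assumes that reading and uses Spearman correlation; the toy embeddings, the ratings, and all function names are hypothetical and do not come from the article.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def evaluate(embeddings, gold_pairs):
    """Correlate model similarities with human ratings.

    Pairs containing a word missing from the vocabulary are skipped.
    """
    model_scores, human_scores = [], []
    for first, second, rating in gold_pairs:
        if first in embeddings and second in embeddings:
            model_scores.append(cosine(embeddings[first], embeddings[second]))
            human_scores.append(rating)
    return spearman(model_scores, human_scores)


# Hypothetical two-dimensional embeddings and human ratings (0-10 scale).
toy_embeddings = {
    "cat": [1.0, 0.0],
    "dog": [0.8, 0.2],
    "car": [0.0, 1.0],
    "truck": [0.3, 0.7],
}
toy_pairs = [
    ("cat", "dog", 9.0),
    ("car", "truck", 8.5),
    ("cat", "car", 1.0),
    ("dog", "truck", 2.0),
    ("cat", "plane", 0.5),  # skipped: "plane" is out of vocabulary
]

rho = evaluate(toy_embeddings, toy_pairs)
print(f"Spearman correlation: {rho:.3f}")
```

A higher correlation means the vectorizer orders word pairs more like human annotators do, which allows several models to be compared on the same set of rated pairs.
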
Applying DIANA hierarchical clustering to improve text classification quality
The article presents ways to improve the classification accuracy of normative reference information using hierarchical clustering algorithms.

Keywords: machine learning, artificial neural network, convolutional neural network, normative reference information, hierarchical clustering, DIANA
