Formation of a search query for finding information in a subject area using Zipf's law and the three-sigma rule

Abstract

Vakurin I.S., Tremasova L.A., Alyoshintsev A.V., Gadasin D.V.

Received: 30.12.2024

The load on data centers grows many times over each year as the number of Internet users increases. Users reach various resources and sources through search engines and services. Installing equipment that processes telecommunications traffic faster requires significant financial outlay and can also significantly increase data center downtime because of possible problems during routine maintenance. It is therefore more expedient to focus resources on improving the software rather than the hardware. The article presents an algorithm that can reduce the load on telecommunications equipment by restricting the search to a specific subject area and by exploiting features of natural language and the way words, sentences, and texts are formed in it. The query is analyzed by building a prefix tree and clustering, and by calculating the probability of occurrence of the desired word using the three-sigma rule and Zipf's law.

Keywords: Three Sigma Rule, Zipf's Law, Clusters, Language Analysis, Morphemes, Prefix Tree, Probability Distribution
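
The abstract names three building blocks (a prefix tree, Zipf's law, and the three-sigma rule) without saying how the article combines them. The Python sketch below only illustrates each block in isolation, under assumptions of our own: the names `PrefixTree`, `zipf_probabilities`, and `within_three_sigma` are hypothetical, the Zipf exponent `s = 1.0` is an assumed default, and none of this should be read as the authors' algorithm.

```python
import math


class PrefixTree:
    """Minimal prefix tree (trie) that stores word counts. Hypothetical helper."""

    def __init__(self):
        self.children = {}
        self.count = 0  # how many inserted words end at this node

    def insert(self, word):
        node = self
        for ch in word:
            node = node.children.setdefault(ch, PrefixTree())
        node.count += 1

    def words_with_prefix(self, prefix):
        """Return {word: count} for every stored word that starts with prefix."""
        node = self
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return {}
        found = {}
        stack = [(node, prefix)]
        while stack:
            current, text = stack.pop()
            if current.count:
                found[text] = current.count
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        return found


def zipf_probabilities(n_ranks, s=1.0):
    """Zipf's law: the probability of the word at rank r is proportional to 1 / r**s.

    The exponent s = 1.0 is an assumed default, not a value taken from the article.
    """
    weights = [1.0 / r ** s for r in range(1, n_ranks + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def within_three_sigma(values):
    """Three-sigma rule: flag each value that lies within mean +/- 3 * sigma."""
    mean = sum(values) / len(values)
    sigma = math.sqrt(sum((x - mean) ** 2 for x in values) / len(values))
    return [abs(x - mean) <= 3 * sigma for x in values]
```

One possible use, again an assumption rather than the article's procedure, is to insert the words of a subject-area corpus into the tree, look up candidates for a partially typed query with `words_with_prefix`, rank them by count, and compare the observed frequencies with `zipf_probabilities` while discarding values that `within_three_sigma` marks as outliers.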