Constrained Approximate Search Algorithms in Knowledge Discovery

Petrovic, Slobodan; Sidorova, Julia

Journal article

Published version

Open

Petrovic (899.5Kb)

Permanent link

https://hdl.handle.net/11250/2730328

Publication date

2020

Abstract

Knowledge discovery in big data is one of the most important applications of computing machinery today. Search is an essential part of all such procedures. Search algorithms must be extremely efficient, but at the same time knowledge discovery procedures must not produce too many false positives or false negatives. False positives require post-processing, which reduces the overall efficiency of the knowledge discovery procedures, while false negatives reduce their sensitivity. To reduce the false positive and false negative rates, this paper proposes applying constrained approximate search algorithms. An overview of search theory, exact and approximate, is given first, exposing the fundamentals of dynamic programming-based and bit-parallel approximate search algorithms without constraints. Then the introduction of constraints specific to various knowledge discovery procedures is explained, together with the subtleties of various applications, such as spam filtering and digital and network forensics (file carving, intrusion detection in hosts and networks). The advantages and disadvantages of applying such constrained search algorithms in knowledge discovery procedures are also discussed. A potential application in bioinformatics is outlined.
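
As a rough illustration of the unconstrained, dynamic programming-based approximate search mentioned in the abstract, the following minimal sketch uses the classic column-wise edit-distance recurrence with unit costs (often attributed to Sellers). It is not taken from the paper: the function name `approximate_search`, its parameters, and the choice of unit costs are assumptions made only for this example, and the constraints the paper proposes are not modelled here.

```python
def approximate_search(pattern, text, k):
    """Return the 0-based end positions in `text` where `pattern`
    matches some substring with at most `k` edit operations
    (insertion, deletion, substitution; unit cost each).

    Illustrative sketch only; names and cost model are assumptions.
    """
    m = len(pattern)
    # column[i] holds the smallest edit distance between pattern[:i]
    # and any substring of the text ending at the current position.
    column = list(range(m + 1))
    matches = []
    for j, ch in enumerate(text):
        prev_diag = column[0]  # value of column[i-1] from the previous text position
        column[0] = 0          # a match is allowed to start at any text position
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            new_value = min(
                prev_diag + cost,   # substitution or exact character match
                column[i] + 1,      # extra character in the text
                column[i - 1] + 1,  # pattern character missing from the text
            )
            prev_diag = column[i]
            column[i] = new_value
        if column[m] <= k:
            matches.append(j)
    return matches


if __name__ == "__main__":
    print(approximate_search("abc", "xxabcxx", 0))  # exact search (k = 0)
    print(approximate_search("abc", "xxabdxx", 1))  # one edit tolerated
```

With `k = 0` the routine reduces to exact search. Raising `k` lowers the chance of false negatives but admits more candidate positions, which is the trade-off the abstract describes and that the constraints are intended to address.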

Publisher

Lomonosov Moscow State University (MSU)

Journal

Modern Information Technologies and IT-Education

Unless otherwise stated, this item is licensed under Attribution 4.0 International