Journal: Mathematical Problems in Engineering

On Detecting and Removing Superficial Redundancy in Vector Databases


Abstract

A mathematical model is proposed to build an automated tool for vector databases that removes unnecessary data, computes the level of redundancy, and recovers both the original and the filtered database at any point in the process. A database of this type can be modeled as a directed graph and is therefore characterized by an adjacency matrix, so a record is no longer a row but a matrix. The problem of cleaning redundancies is then addressed from a theoretical point of view. Superficial redundancy is measured and filtered using the 1-norm of a matrix. Algorithms are presented in Python and MapReduce, and a case study on a real cybersecurity database is performed.
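The abstract does not define superficial redundancy precisely, so the following is only a minimal sketch under stated assumptions, not the paper's algorithm. It assumes that each record is a square adjacency matrix of the same size, and that two records are superficially redundant when the 1-norm (maximum absolute column sum) of their difference is zero. The function names `one_norm`, `remove_duplicates`, and `redundancy_level` are hypothetical, and the redundancy level shown here (the fraction of records dropped) is likewise an illustrative choice.

```python
def one_norm(matrix):
    """Matrix 1-norm: the maximum absolute column sum."""
    if not matrix or not matrix[0]:
        return 0
    columns = range(len(matrix[0]))
    return max(sum(abs(row[j]) for row in matrix) for j in columns)


def difference(a, b):
    """Entrywise difference of two matrices of the same shape."""
    return [[x - y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def remove_duplicates(records):
    """Keep the first occurrence of each record (assumed redundancy test).

    Returns (kept, origin), where origin[i] is the index in `kept` of the
    record that represents records[i]. The index map is what allows the
    original database to be rebuilt from the filtered one.
    """
    kept = []
    origin = []
    for record in records:
        for k, representative in enumerate(kept):
            # Assumption: a zero 1-norm of the difference marks a duplicate.
            if one_norm(difference(record, representative)) == 0:
                origin.append(k)
                break
        else:
            kept.append(record)
            origin.append(len(kept) - 1)
    return kept, origin


def redundancy_level(records, kept):
    """Illustrative measure: the fraction of records that were dropped."""
    return 1 - len(kept) / len(records) if records else 0.0


# Three records over a 3-node directed graph; the third repeats the first.
a = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
b = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
c = [row[:] for row in a]

records = [a, b, c]
kept, origin = remove_duplicates(records)
restored = [kept[k] for k in origin]  # rebuild the original database
```

Storing the index map alongside the filtered records is one simple way to keep both the original and the filtered database recoverable. The pairwise comparison is quadratic in the number of records, so a MapReduce version would more likely group records by a key derived from the matrix instead.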
