
Digging for Knowledge


Abstract

The "smile of a mother" is recognized instantly, whenever and wherever it appears. Why, then, is a PC so dumb, unable to recognize its user or the user's needs, whoever or whatever they may be? This paper postulates that such a six-W query-and-search system needs matching storage. The lament will soon be mended by a smarter PC or a smarter search engine: a networked computer working in data retrieval, feature extraction, reduction, and knowledge precipitation. Specifically, modern information storage and retrieval should follow the strategy of our brains, which are constantly overwhelmed by five pairs of identical "tapes" taken by the eyes, ears, and other senses: five high-fidelity sensors generate five pairs of high-definition tapes that produce seeing, hearing, and the rest of perception. This amounts to ten tapes recorded in unabridged fashion. How can we store them, and retrieve them when needed? We must reduce redundancy, enhance the signal-to-noise ratio, and fuse invariant features using a simple pair of set operations in the high-dimensional vector space: write by union, read by intersection. For example, retrieval requires (∪_i w_i) ∩ (∪_j W_j) ≠ ∅, where the query is phrased as the union of an imprecise or partial set of six W's (the lower-case w's), and the upper-case W's denote the archival storage of a primer tree. A simplified humanistic representation may be called the 6W space (who, what, where, when, why, how), also referred to as the Newspaper geometry. Mapping the 6W onto the 3W (World Wide Web) then becomes relatively easy, and digging for knowledge becomes efficient and robust through the set operations of union (writing) and intersection (reading), once a 6W query search engine is matched by 6W vector-index databases.
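The write-by-union, read-by-intersection scheme can be illustrated with a minimal sketch in Python (the class, item names, and attribute values here are hypothetical, not from the paper): each archived item is written into posting lists under its known 6W attribute values, and a query carrying only a partial set of lower-case w's is answered by intersecting the matching lists.

```python
# Hypothetical sketch of the 6W "write by union, read by intersection" idea.
# Each stored item is indexed under its known 6W attribute values; a query
# is an imprecise, partial set of lower-case w's.

W6 = ("who", "what", "where", "when", "why", "how")

class SixWIndex:
    def __init__(self):
        # One posting list per (dimension, value) pair: value -> set of item ids.
        self.postings = {w: {} for w in W6}
        self.items = {}

    def write(self, item_id, attrs):
        """Write by union: add the item id to every matching posting list."""
        self.items[item_id] = attrs
        for w, value in attrs.items():
            self.postings[w].setdefault(value, set()).add(item_id)

    def read(self, query):
        """Read by intersection: items matching ALL supplied partial w's."""
        hits = None
        for w, value in query.items():
            ids = self.postings[w].get(value, set())
            hits = ids if hits is None else hits & ids
        return hits or set()

index = SixWIndex()
index.write("doc1", {"who": "Alice", "where": "Paris", "when": "2009"})
index.write("doc2", {"who": "Alice", "where": "Tokyo", "when": "2009"})

# A partial {who, when} query intersects the archive non-trivially,
# i.e. (∪_i w_i) ∩ (∪_j W_j) ≠ ∅ : both documents match.
matches = index.read({"who": "Alice", "when": "2009"})
# Adding a third w narrows the intersection to doc1 alone.
narrowed = index.read({"who": "Alice", "where": "Paris"})
```

Storage grows by union as new items arrive, while retrieval cost stays bounded by the smallest posting list touched by the query.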
In fact, the Newspaper 6D geometry may be reduced further by PCA (Principal Component Analysis) eigenvector mathematics and mapped into a 2D causality space comprising the causes (What, How, Why) and the effects (Where, When, Who). If this hypothesis of brain strategy is true, one must then develop a 6W query language to support 6W-ordered storage of linkage pointers in high-dimensional space. In other words, the basic first-generation Google Web, with its 1-D statistical PageRank databases, can be mapped onto a nested 6W tree, in which each sub-6W branch stems from the prime 6W tree, using a system of automated text mining assisted by syntactic semantics to discern the 6W properties of a query. Goehl et al. have demonstrated previously that this is doable, though more tools may be needed to support knowledge extraction and automated feature reduction. In this paper we set out to demonstrate lossless down-sampling using the second-generation wavelet transform, the so-called 1-D Cartesian lifting scheme of Sweldens adopted by JPEG 2000. The loss of statistics, if any (as in PageRank and 1-D lifting), is a loss of geometric insight: for a 2-D vector time series such as video, the Cartesian product of 1-D liftings loses the insight into diagonal changes. We recommend two approaches: (i) knowledge extraction from archived web pages requires a smart system of automated text mining that analyzes data using the 6W concept, followed by PageRank and PCA statistics as an example of knowledge indexing; (ii) any new data is encouraged to be cast directly into the Newspaper 6W geometry, so that text mining can be bypassed and the dominant principal-value components can be shown to span the 2-D cause-and-effect subspace. The time evolution of that 2-D subspace, and of video, may then be compressed further by de-noising and by ridding it of irrelevance.
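The PCA reduction from the 6D Newspaper geometry to a dominant 2-D subspace can be sketched with NumPy (the corpus below and its two latent cause/effect factors are synthetic assumptions for illustration, not data from the paper): documents encoded as 6-vectors of 6W scores are projected onto the two leading eigenvectors of their covariance matrix.

```python
import numpy as np

# Hypothetical sketch: each document is a 6-vector of scores along the 6W
# axes (who, what, where, when, why, how). If two latent factors (cause and
# effect) drive those scores, PCA recovers a dominant 2-D subspace.

rng = np.random.default_rng(0)

# Synthetic corpus: 200 documents driven by two latent factors plus noise.
cause = rng.normal(size=(200, 1))
effect = rng.normal(size=(200, 1))
loadings = np.array([[0.1, 0.9], [0.8, 0.1], [0.2, 0.9],
                     [0.1, 0.8], [0.9, 0.2], [0.9, 0.1]])  # 6W x (cause, effect)
X = np.hstack([cause, effect]) @ loadings.T + 0.05 * rng.normal(size=(200, 6))

# PCA via eigen-decomposition of the sample covariance matrix.
Xc = X - X.mean(axis=0)
cov = Xc.T @ Xc / (len(X) - 1)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
top2 = eigvecs[:, ::-1][:, :2]           # two dominant eigenvectors

Y = Xc @ top2                            # 6-D -> 2-D projection
explained = eigvals[::-1][:2].sum() / eigvals.sum()
```

Under these assumptions nearly all of the variance falls in the 2-D projection `Y`, which is the sense in which the 6W geometry would collapse onto a cause/effect plane.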
Furthermore, using 2-D lifting, which preserves the statistical mean and variance, we can down-sample the 2-D image spanned by the cause (Why, How, What) and effect (Who, Where, When) subspaces. The 2-D image processing analyzes all four corner neighborhoods in parallel, producing a more accurate account of visual-edge changes in all directions (including the diagonal) than 1-D lifting along the x and y directions alone, without losing the original statistics such as the mean and higher-order statistics such as variance and kurtosis. Coupling this lossless down-sampling with a system of 6W-oriented databases can generate a superior pre-processing method for many large-scale applications. From a web user's standpoint, irrelevant data can be eliminated from search results if the data are preserved in a form readily indexed along the who and when axes. The benefits to business are immense: 6D storage facilities would separate the relevant from the irrelevant for the sake of privacy protection, and would automatically isolate invariant knowledge from temporal fluctuation, minimizing the memory used and in turn increasing search efficiency. The financial sector would gain immensely from real-time traceability of losslessly down-sampled data for interpretation and prediction, and CEOs could branch out into different industries by evaluating the trend-line interests of their target markets. The list goes on. The failure to act, however, will only lead to a stagnant and predictable industry in which any progress is dampened by a conservative mindset.
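The four-corner idea can be given a minimal sketch (a Haar-style 2x2 block transform standing in for the full 2-D lifting scheme, which the abstract does not specify in detail): the approximation band is the average of each 2x2 neighborhood, so the global mean is preserved exactly, while a dedicated diagonal band records exactly the changes that separate 1-D liftings along x and y would miss.

```python
import numpy as np

# Hypothetical sketch of mean-preserving 2-D down-sampling: every 2x2
# four-corner neighborhood is replaced by its average (approximation band),
# and three detail bands record row, column, and diagonal changes.

def lift_2d(img):
    a = img[0::2, 0::2]   # top-left corners
    b = img[0::2, 1::2]   # top-right corners
    c = img[1::2, 0::2]   # bottom-left corners
    d = img[1::2, 1::2]   # bottom-right corners
    approx = (a + b + c + d) / 4      # down-sampled image; mean is preserved
    detail_r = (a + b - c - d) / 4    # row-wise change (top vs bottom)
    detail_c = (a - b + c - d) / 4    # column-wise change (left vs right)
    detail_d = (a - b - c + d) / 4    # diagonal change (lost by 1-D x/y lifting)
    return approx, (detail_r, detail_c, detail_d)

rng = np.random.default_rng(1)
img = rng.random((8, 8))
approx, (dr, dc, dd) = lift_2d(img)

# The block average preserves the global mean up to floating point,
# and the four bands together reconstruct the corners losslessly,
# e.g. top-left corner a = approx + dr + dc + dd.
```

Because the four bands are an invertible linear map of the four corners, the down-sampling is lossless in the sense the abstract intends: no statistic of the original image is discarded, only re-organized.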
