System and Method for Matching Data Using Probabilistic Modeling Techniques
Abstract
A system and method for matching data using probabilistic modeling techniques are provided. The system includes a computer system and a data matching model/engine. The present invention precisely and automatically matches and identifies entities from approximately matching short text strings (e.g., company names, product names, addresses) by first pre-processing datasets with a near-exact matching model and a fingerprint matching model, and then applying a fuzzy text matching model. More specifically, the fuzzy text matching model applies an Inverse Document Frequency function to a simple data entry model and combines the result, through a probabilistic model, with one or more unintentional error metrics/measures and/or intentional spelling variation metrics/measures. The system can be autonomous and robust: it allows for variations and errors in text while penalizing the similarity score accordingly, which makes it possible to link datasets through text columns.
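The abstract does not disclose formulas, so the following Python sketch is only an illustration of the staged flow it describes: a near-exact pass, a fingerprint pass, and then an IDF-weighted fuzzy pass. Every name here (`normalize`, `fingerprint`, `idf_weights`, `fuzzy_score`, `match`), the smoothed IDF formula, the 0.8 threshold, and the use of `difflib.SequenceMatcher` as a stand-in for the error metrics are assumptions made for this example. In particular, the weighted average used below is a simplification and is not the probabilistic model referred to in the abstract.

```python
import math
import re
from collections import Counter
from difflib import SequenceMatcher


def normalize(text):
    """Near-exact key (assumed): lowercase, drop punctuation, collapse spaces."""
    return " ".join(re.sub(r"[^a-z0-9 ]+", " ", text.lower()).split())


def fingerprint(text):
    """Fingerprint key (assumed): sorted unique tokens, ignoring word order."""
    return " ".join(sorted(set(normalize(text).split())))


def idf_weights(names):
    """Smoothed inverse document frequency per token; rare tokens weigh more."""
    n = len(names)
    df = Counter(tok for name in names for tok in set(normalize(name).split()))
    return {tok: math.log((1 + n) / (1 + count)) + 1.0 for tok, count in df.items()}


def fuzzy_score(a, b, idf, default_idf=1.0):
    """IDF-weighted soft token overlap in [0, 1] (stand-in, not probabilistic)."""
    ta, tb = normalize(a).split(), normalize(b).split()
    if not ta or not tb:
        return 0.0

    def directed(src, dst):
        total = matched = 0.0
        for tok in src:
            weight = idf.get(tok, default_idf)
            # Character-level similarity tolerates typos and spelling variants.
            best = max(SequenceMatcher(None, tok, other).ratio() for other in dst)
            total += weight
            matched += weight * best
        return matched / total

    # Take the weaker direction so extra unmatched tokens lower the score.
    return min(directed(ta, tb), directed(tb, ta))


def match(query, candidates, idf, threshold=0.8):
    """Return (candidate, score, stage), trying the cheap stages first."""
    if not candidates:
        return None, 0.0, "no-match"
    for cand in candidates:
        if normalize(cand) == normalize(query):
            return cand, 1.0, "near-exact"
    for cand in candidates:
        if fingerprint(cand) == fingerprint(query):
            return cand, 1.0, "fingerprint"
    best = max(candidates, key=lambda cand: fuzzy_score(query, cand, idf))
    score = fuzzy_score(query, best, idf)
    if score >= threshold:
        return best, score, "fuzzy"
    return None, score, "no-match"


if __name__ == "__main__":
    names = ["Acme Corporation", "Globex Holdings Inc", "Initech Inc"]
    idf = idf_weights(names)
    for query in ["ACME  corporation.", "Corporation Acme", "Acme Corporaton"]:
        print(query, "->", match(query, names, idf))
```

In this sketch the two pre-processing stages resolve the easy cases at a score of 1.0, and only the remaining strings reach the fuzzy stage, where a common token such as "inc" contributes less to the score than a rare one such as "acme". A misspelled query still links to its entity, but with a similarity score below 1.0.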