Journal: MIS Quarterly

PRIVACY AND BIG DATA: SCALABLE APPROACHES TO SANITIZE LARGE TRANSACTIONAL DATABASES FOR SHARING


Abstract

Scalability and privacy are two critical dimensions that will ultimately determine how successful big data analytics can be. We present scalable approaches to address privacy concerns when sharing transactional databases. Although the benefits of sharing are well documented and the number of firms sharing transactional data has increased over the years, that number has grown more slowly than it could have. Concerns about revealing proprietary information have kept some retailers from sharing, despite the obvious advantages in an increasingly networked economy. In the context of sharing transactional data, sensitive information is typically based on relationships derived from frequently occurring itemsets, the results of surprisingly successful promotions by the retailer, or unexpected relationships the retailer identified while mining the data. Prior work in this area includes optimal approaches based on integer programming that maximize the accuracy of the shared database while hiding all sensitive itemsets. Although these approaches were shown to solve problems involving up to 10 million transactions, many transactional databases in the big data context are considerably larger, and the existing integer programming-based procedures do not scale well enough to solve these larger problems. Consequently, the extant literature offers no effective solution procedure for such databases.
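
To make the idea of hiding a sensitive itemset concrete, the following is a minimal Python sketch. It is not the integer programming formulation from prior work, nor the scalable approach the abstract announces; it is a naive greedy heuristic used only for illustration. It assumes that an itemset counts as hidden once its support (the number of transactions containing it) falls below a threshold. The names `support`, `sanitize_greedy`, and `min_support` are hypothetical and do not come from the paper.

```python
from typing import FrozenSet, Iterable, List, Set


def support(db: Iterable[Set[str]], itemset: FrozenSet[str]) -> int:
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for transaction in db if itemset <= transaction)


def sanitize_greedy(
    db: List[Set[str]],
    sensitive: List[FrozenSet[str]],
    min_support: int,
) -> List[Set[str]]:
    """Return a copy of `db` in which each sensitive itemset has support
    below `min_support` (the assumed definition of "hidden").

    Illustrative heuristic only: it removes one item from as many
    supporting transactions as needed and makes no attempt to maximize
    the accuracy of the shared database.
    """
    shared = [set(t) for t in db]  # leave the original database untouched
    for itemset in sensitive:
        # Number of supporting transactions that must be altered.
        excess = support(shared, itemset) - (min_support - 1)
        for transaction in shared:
            if excess <= 0:
                break
            if itemset <= transaction:
                # Drop the lexicographically smallest item (deterministic choice).
                transaction.discard(min(itemset))
                excess -= 1
    return shared


db = [{"bread", "milk"}, {"bread", "milk", "eggs"}, {"bread", "milk"}, {"eggs"}]
secret = frozenset({"bread", "milk"})
shared = sanitize_greedy(db, [secret], min_support=2)
print(support(db, secret), support(shared, secret))  # 3 1
```

Removing items can only lower the support of any itemset, so hiding one sensitive itemset never re-exposes another. The cost is that every removed item distorts the shared data, which is the accuracy loss that the optimization-based approaches described above aim to minimize.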
