journal6 ›› 2014, Vol. 35 ›› Issue (6): 38-41. DOI: 10.3969/j.issn.1007-2985.2014.06.010

• Computer Science •

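A Method for Screening Duplicates in ASP During Batch Data Import: The Case of Entering New-Student Information into the Database at Xiangxi Vocational and Technical College for Nationalities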

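GONG Shu

  1. (Xiangxi Vocational and Technical College for Nationalities, Jishou 416000, Hunan, China)
  • Publication date: 2014-11-25 Release date: 2014-11-27
  • About the author: GONG Shu (born 1979), male, from Fenghuang, Hunan; lecturer at Xiangxi Vocational and Technical College for Nationalities; main research area: computer applications.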

Methods of Duplication Screening for ASP Mass Data Storage: A Case Study of Enrollment Information Storage in Xiangxi Vocational and Technical College for Nationalities

 GONG  Shu   

  1. (Xiangxi Vocational and Technical College for Nationalities, Jishou 416000, Hunan, China)
  • Online:2014-11-25 Published:2014-11-27

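Abstract: When a database is being built, identifying duplicate data is critical to database management; without accurate key fields to compare against, deciding whether a record is a duplicate becomes very difficult. Traditional approaches to finding and deleting duplicates, such as hashing, fixed-size chunking, sliding-block techniques, variable-size chunking, and data fingerprinting, consume a large amount of system processing time and have low accuracy. To improve data-processing efficiency, a method is proposed for screening duplicates while batch data is being imported through ASP. Practice has verified the robustness and reliability of the method, which greatly reduces the heavy workload that database management places on operators.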

Keywords: duplicate removal, data cleaning, data verification, screened import, data warehouse, data export

Abstract: When a database is established, identifying duplicate data is crucial for its administration, and this is difficult without accurate keywords for reference. The commonly used methods (hashing, fixed-size partition detection, sliding-block detection, content-defined chunking detection, and data fingerprinting) require a large amount of processing time to detect and remove duplicates. This paper describes a method for ASP mass data storage with duplication screening at import time, and verifies the robustness and validity of the method. It is shown that the heavy workload of database management for operators can be greatly reduced.

Key words: duplication removal, data cleaning, data check, screening and storage, data warehouse, data export
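
The abstract does not give the screening procedure itself, so the following is only a minimal sketch of the general idea of screening duplicates at import time by comparing key fields. It is written in Python rather than ASP, and the names `record_key`, `screen_batch`, `id_no`, and `name` are hypothetical, chosen for illustration; this is not the paper's implementation.

```python
def record_key(record, key_fields):
    """Build a comparison key from the chosen fields.

    Values are trimmed and lower-cased so that trivial formatting
    differences do not hide a duplicate (an assumption of this sketch).
    """
    return tuple(str(record.get(field, "")).strip().lower() for field in key_fields)


def screen_batch(batch, existing_keys, key_fields):
    """Split a batch into records to insert and records rejected as duplicates.

    existing_keys holds the keys of rows already stored; keys accepted
    earlier in the same batch are also treated as existing.
    """
    seen = set(existing_keys)
    accepted, rejected = [], []
    for record in batch:
        key = record_key(record, key_fields)
        if key in seen:
            rejected.append(record)   # duplicate of a stored or earlier row
        else:
            seen.add(key)
            accepted.append(record)   # new row, safe to insert
    return accepted, rejected


if __name__ == "__main__":
    stored = {record_key({"id_no": "A002"}, ["id_no"])}
    incoming = [
        {"id_no": "A001", "name": "Li"},
        {"id_no": " a001 ", "name": "Li"},
        {"id_no": "A002", "name": "Wang"},
    ]
    to_insert, duplicates = screen_batch(incoming, stored, ["id_no"])
    print(len(to_insert), "accepted,", len(duplicates), "rejected")
```

The choice of key fields matters most here: as the abstract notes, without an accurate field to compare against, duplicate detection becomes unreliable, so a sketch like this only works when something like a unique student identifier is available.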
