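Thesis title (Chinese): | 汽车百货连锁销售数据仓库的设计与实现 |
Name: | |
Student ID: | 20070324 |
Confidentiality level: | Public |
Discipline code: | 081202 |
Discipline name: | Computer Software and Theory |
Student type: | Master's |
Degree year: | 2010 |
School/Department: | |
Major: | |
First supervisor: | |
Second supervisor: | |
Thesis title (foreign language): | Design and Implementation of Data Warehouse of Automotive Department Store Chain Sales |
Keywords (Chinese): | |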
Keywords (foreign language): | Data Warehouse; ETL tool; Semi-structured data; Structured data; Metadata |
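Abstract (Chinese): |
This thesis takes Feiyue Automotive Department Store Chain Sales & Service Company as its research background. Based on the needs of the company's decision-makers, it analyzes and designs a data warehouse logical model and physical structure suited to the company's decision support, and examines the practical application of data extraction, transformation and loading (ETL) and data cleaning. The result is a data warehouse, together with its supporting ETL tools, that meets the needs of decision analysis reports. The main research content and results are summarized as follows:
First, based on a survey of the enterprise's organizational structure, management processes and existing business systems, eight data warehouse subjects needed for decision support are analyzed and designed: VIP customers, purchasing, service, inventory goods, gross profit on goods, sales, accounts payable and accounts receivable. Logical and physical models for the eight subjects are built by combining the star schema and the snowflake schema. In view of the company's financial resources and investment, the data warehouse is implemented in an operational data store (ODS) environment.
Second, an ETL tool for XML semi-structured data is designed and implemented. The tool uses DOM objects to parse the XML data source and is designed around the enterprise's actual situation. It solves the problem of loading into the data warehouse the XML semi-structured data exported through the external output interface of the company's financial software. It also overcomes the inability of commercial ETL tools to extract and load XML documents into the data warehouse directly, and so meets the enterprise's practical needs.
Third, an ETL tool for structured data is designed and implemented. Most data in the company's current business systems is structured data held in SQL SERVER 2000 and ORACLE 9i. The structured-data ETL tool reserves an interface for user-defined data cleaning functions, which compensates for the non-extensible cleaning functions of commercial ETL tools. In addition, the metadata that supports the ETL tool retains the SQL text used for data extraction, which reduces the time spent on recompilation when similar SQL is executed again.
Experiments verify that the results of this research are feasible. Finally, data warehouse optimization techniques are discussed.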
|
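As an illustration of the DOM-based XML ETL step described in the abstract, the following is a minimal sketch and not the thesis's actual tool. The XML layout (`vouchers`, `voucher`, `supplier`, `amount`), the target table `fact_payable`, and all function names are assumptions made for the example, and SQLite stands in for the warehouse database.

```python
import sqlite3
from xml.dom import minidom

# Hypothetical finance-system export; element and attribute names are assumptions.
SAMPLE_XML = """<vouchers>
  <voucher id="1001"><supplier>ACME</supplier><amount>250.50</amount></voucher>
  <voucher id="1002"><supplier> Beta </supplier><amount>99</amount></voucher>
</vouchers>"""


def _text(parent, tag):
    """Return the text of the first descendant element named tag, or '' if absent."""
    elems = parent.getElementsByTagName(tag)
    if not elems or elems[0].firstChild is None:
        return ""
    return elems[0].firstChild.data


def extract(xml_text):
    """Extract: walk the DOM tree and yield one dict per <voucher> element."""
    doc = minidom.parseString(xml_text)
    for node in doc.getElementsByTagName("voucher"):
        yield {
            "id": node.getAttribute("id"),
            "supplier": _text(node, "supplier"),
            "amount": _text(node, "amount"),
        }


def transform(row):
    """Transform: convert types and strip stray whitespace."""
    return (int(row["id"]), row["supplier"].strip(), float(row["amount"]))


def load(conn, rows):
    """Load: write the cleaned rows into the (hypothetical) target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_payable "
        "(voucher_id INTEGER PRIMARY KEY, supplier TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO fact_payable VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(xml_text, conn):
    """Run extract, transform and load in sequence."""
    load(conn, [transform(r) for r in extract(xml_text)])
```

Calling `run_etl(SAMPLE_XML, sqlite3.connect(":memory:"))` loads both sample vouchers. A DOM parser holds the whole document in memory, which is convenient for exports of moderate size; very large files would likely call for a streaming parser instead.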
Abstract (foreign language): |
This thesis takes Feiyue Automotive Department Store Chain Sales & Service Company as its research background. Following the requirements of the company's decision-makers, it analyzes and designs the logical model and physical structure of a data warehouse for the company's decision support. It discusses in detail the key technologies of extraction, transformation and loading (ETL) and data cleaning. Finally, a data warehouse and related ETL tools that can support the required decision analysis reports are designed and implemented. The main contents and contributions of the thesis are summarized as follows:
First, based on an investigation of the enterprise's organization, management processes and business systems, eight data warehouse subjects suited to the company's decision support are analyzed and designed: VIP customers, purchasing, service, inventory goods, goods profits, sales, receivables and payables. Their logical and physical models are then built using a combination of the star model and the snowflake model. Finally, in line with the company's financial and investment situation, the data warehouse is implemented in an Operational Data Store (ODS) environment.
Second, an ETL tool for XML semi-structured data is designed and implemented. The XML data source is parsed with Document Object Model (DOM) objects, and the tool is built around the enterprise's actual conditions. It solves the problem of loading into the data warehouse the semi-structured data exported through the external output interface of the company's finance software system, and it addresses the inability of commercial ETL tools to extract and load XML files into the data warehouse directly. It therefore meets the actual needs of the company.
Third, an ETL tool for structured data is designed and implemented. Most of the data in the company's business systems is structured data held in SQL SERVER 2000 and ORACLE 9i. The tool reserves an interface for user-defined data cleaning functions, which makes up for the non-extensible data cleaning functions of commercial ETL tools. In addition, the metadata that supports the ETL tool retains the SQL statements used for extraction, which saves the time otherwise spent recompiling similar SQL.
Experiments show that the results of this research are feasible. Finally, optimization techniques for the data warehouse are discussed.
|
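The structured-data ETL idea in the abstract (a reserved interface for user-defined cleaning functions, plus extraction SQL kept as text in metadata) might look roughly like the sketch below. It is an assumption-laden illustration, not the thesis's design: the registry `CLEANERS`, the `JOB` metadata layout, and the tables `src_sales` and `dw_sales` are all hypothetical, and SQLite stands in for SQL SERVER 2000 and ORACLE 9i.

```python
import sqlite3

# Registry of user-defined cleaning functions, keyed by name (hypothetical design).
CLEANERS = {}


def cleaner(name):
    """Decorator that registers a cleaning function under the given name."""
    def register(fn):
        CLEANERS[name] = fn
        return fn
    return register


@cleaner("trim")
def trim(value):
    return value.strip() if isinstance(value, str) else value


@cleaner("upper")
def upper(value):
    return value.upper() if isinstance(value, str) else value


# Job metadata: the extraction SQL is stored as parameterized text, so the
# same statement text is reused on every run; cleaners are listed per column index.
JOB = {
    "extract_sql": "SELECT sku, qty FROM src_sales WHERE qty > ?",
    "load_sql": "INSERT INTO dw_sales (sku, qty) VALUES (?, ?)",
    "cleaners": {0: ["trim", "upper"]},
}


def run_job(src, dst, job, params=()):
    """Extract rows with the stored SQL, apply the named cleaners, load them.

    Returns the number of rows loaded.
    """
    loaded = 0
    for row in src.execute(job["extract_sql"], params):
        values = list(row)
        for col, names in job["cleaners"].items():
            for name in names:
                values[col] = CLEANERS[name](values[col])
        dst.execute(job["load_sql"], values)
        loaded += 1
    dst.commit()
    return loaded
```

A user would extend the tool by decorating a new function with `@cleaner("some_name")` and adding that name to a job's `cleaners` entry, without touching `run_job`. Keeping the SQL text parameterized and unchanged between runs is what typically lets a database reuse a previously compiled statement; whether and how that happens depends on the database engine.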
Chinese Library Classification number: | TP311.13 |
Release date: | 2011-04-11 |