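Thesis title (Chinese): | 分布式数据库查询优化的研究 |
Name: | |
Student ID: | 06297 |
Confidentiality level: | Public |
Discipline code: | 081203 |
Discipline name: | Computer Application Technology |
Student type: | Master's |
Degree year: | 2009 |
School/Department: | |
Major: | |
Primary advisor name: | |
Thesis title (English): | Research on Query Optimization of Distributed Database |
Keywords (Chinese): | |
Keywords (English): | Distributed database; Query optimization; Performance optimization |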
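Abstract (translated from Chinese): |
With the development of science and technology and the spread of computer networking, distributed database systems have gradually replaced centralized database systems and entered everyday life. Their widespread use, however, has brought query efficiency and performance problems with it, and query optimization has therefore become one of the active research topics in the distributed database field.
This thesis first introduces background on distributed databases, including data distribution, data fragmentation, join-related operations, and the distributed query process. It then briefly describes several conventional optimization algorithms: optimization based on equivalence transformations of relational algebra, join-based optimization, and semi-join-based optimization. Next, it analyzes in detail two widely used classes of optimization algorithms and their improved variants: SDD_1 and its improvements, and the hash-partitioned equi-join algorithm and its improvements. Finally, building on the study of these algorithms, it addresses the I/O and CPU costs that they rarely consider. Approaching the problem from the angle of repeated queries, it combines two main data structures with the LRU algorithm and a consistent hashing algorithm to design an optimization scheme based on a distributed query cache, referred to in this thesis as DCO (Distributed Cache Optimization), which is intended to improve the query throughput and response time of the whole distributed database system. The scheme mainly applies when the master site of the distributed database serves as the assembly site for distributed query results. Three groups of test cases were designed for the cache experiments, and the results demonstrate the effectiveness of the scheme.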
|
Abstract (English): |
With the development of science and technology and the growing popularity of computer networks, distributed databases have gradually replaced centralized databases and entered our daily lives. However, the wide application of distributed databases has raised issues of efficiency and performance, and distributed query optimization has become one of the active research areas in distributed databases.
This paper first introduces relevant background on distributed databases, such as data distribution, data fragmentation, join-related operations, and distributed query processing. It then discusses several conventional optimization algorithms: optimization based on the equivalence transformation rules of relational algebra, and optimization based on joins and semi-joins. Next, it analyzes in detail two widely used optimization algorithms: SDD_1 with its derivatives and improved algorithms, and hash-partitioned join optimization with its improved algorithm. Finally, on the basis of this research, the paper presents a new solution from the viewpoint of repeated queries, targeting the I/O and CPU costs that the above algorithms rarely address. The solution adopts two main data structures, combines them with the LRU algorithm and a consistent hashing algorithm, and is based on a distributed cache. It is called DCO (Distributed Cache Optimization) and is used to improve the throughput of the distributed database system as well as its query response time. The solution mainly applies to the case in which the main site of the distributed database is the assembly site for distributed query results. In the cache experiments, three sets of test cases were designed, and the experimental results show the effectiveness of the solution.
|
Chinese Library Classification (CLC) number: | TP311.138 |
Public release date: | 2010-03-31 |
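
The abstract describes DCO only at a high level: repeated queries are answered from a distributed cache that combines the LRU algorithm with consistent hashing. The following Python sketch illustrates that general idea and is not the thesis's implementation. The record does not say which two data structures the thesis uses, so the class names (`LRUCache`, `ConsistentHashRing`, `QueryCache`), the `replicas` parameter, the use of the SQL text as the cache key, and the `run_query` callback are all hypothetical.

```python
import bisect
import hashlib
from collections import OrderedDict

_MISSING = object()  # sentinel so that a cached None is still a hit


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key, default=None):
        if key not in self.entries:
            return default
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # drop the least recently used


class ConsistentHashRing:
    """Maps keys to nodes; each node gets several virtual points on the ring."""

    def __init__(self, nodes, replicas=64):
        points = []
        for node in nodes:
            for i in range(replicas):
                points.append((self._hash(f"{node}#{i}"), node))
        points.sort()
        self.hashes = [h for h, _ in points]
        self.nodes = [n for _, n in points]

    @staticmethod
    def _hash(text):
        return int(hashlib.sha256(text.encode("utf-8")).hexdigest(), 16)

    def node_for(self, key):
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self.hashes, self._hash(key)) % len(self.hashes)
        return self.nodes[idx]


class QueryCache:
    """Routes each query to one cache node and serves repeats from its LRU."""

    def __init__(self, nodes, capacity_per_node=128):
        self.ring = ConsistentHashRing(nodes)
        self.caches = {node: LRUCache(capacity_per_node) for node in nodes}

    def execute(self, sql, run_query):
        cache = self.caches[self.ring.node_for(sql)]
        result = cache.get(sql, _MISSING)
        if result is _MISSING:
            result = run_query(sql)  # cache miss: run the real distributed query
            cache.put(sql, result)
        return result
```

In this sketch a repeated query string always hashes to the same cache node, so the second request skips `run_query` entirely, which is where the I/O and CPU savings would come from. Cache invalidation after updates is deliberately left out.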