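Thesis title (Chinese): | 基于内容图像搜索引擎关键技术研究 (Research on Key Technologies of Content-Based Image Search Engines) |
Name: | |
Student ID: | 20070317 |
Confidentiality level: | Public |
Discipline code: | 081202 |
Discipline name: | Computer Software and Theory |
Student type: | Master's |
Degree year: | 2010 |
School/Department: | |
Major: | |
Primary advisor: | |
Thesis title (foreign language): | Research of Key Technology of Content-based Image Search Engine |
Keywords (Chinese): | content-based image search engine ; image feature extraction ; regional weighted entropy ; R*-tree index |
Keywords (foreign language): | |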
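Abstract (Chinese): |
Content-based image search is an important and challenging area of academic research. Developing practical content-based image search engines, uncovering the relationships between images, and studying the key technologies of such engines are of real practical significance. This thesis applies regional weighted entropy to image feature extraction and explores a new approach to indexing the image library of a content-based image search engine. Building on a study and comparison of the machine learning used in several commercial search engines, it explores machine learning methods suited to content-based image search engines and develops the corresponding software. The main work includes:
Based on an analysis of existing color-spatial image feature extraction algorithms, and combining the concept of image entropy with image segmentation, a new way of describing image entropy, regional weighted entropy, is proposed, and several of its properties are proved. An entropy performance evaluation index describes, from a probabilistic viewpoint, how the distribution of image entropy changes as the weights change; reasonable weights are then determined by taking into account the application's regions of interest and the weight granularity. Experiments show that the regional weighted entropy method describes image content with an accuracy more than 50% higher than the plain entropy method.
The concept of multidimensional indexing is applied to the content-based image search engine. Because of the characteristics of content-based image search, the index structures of existing text search engines cannot be used. This thesis adapts the R*-tree index so that it can be applied to content-based search. Multi-feature preprocessing normalizes the multiple feature values of an image to facilitate tree construction and querying; the R*-tree circular range query defines the similarity distance used in multi-feature image matching and thereby finds the leaf nodes containing similar images. Experiments show that retrieval time drops substantially with the R*-tree index, and that its time performance is better than that of a simple index structure.
Building on an analysis of machine learning in existing commercial search engines, and taking into account the characteristics of content-based image search, machine learning functions covering three aspects of the engine are designed and implemented, noticeably improving both its search efficiency and its accuracy.
Based on the above results, the content-based Web image search engine V2.0 system is designed and implemented. The system adopts key technologies of content-based image search engines, including regional weighted entropy for image feature extraction and the R*-tree multidimensional index structure. The implemented system meets the expected targets for accuracy and user response speed.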
|
Abstract (foreign language): |
Content-based image search is an important and challenging field of academic research. Developing practical content-based image search engines, finding the relationships between images, and studying the key technologies of such engines have real practical significance. This thesis applies regional weighted entropy to image feature extraction and explores a new approach to image database indexing for content-based image search engines. After studying and comparing the machine learning used by several commercial search engines, it explores machine learning methods suited to content-based image search engines and develops the corresponding software. The main contributions are as follows:
After analyzing current color-spatial image feature extraction algorithms, a new description of image entropy, named regional weighted entropy, is proposed; it combines the concept of image entropy with an image segmentation algorithm. Several properties of regional weighted entropy are proved. An entropy performance evaluation index describes, from a probabilistic point of view, how the distribution of image entropy changes as the weights change; reasonable weights are then determined by considering the application's regions of interest and the weight granularity. Experimental results show that the regional weighted entropy method describes image content with an accuracy more than 50% higher than that of the plain entropy method.
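The abstract does not give the formula for regional weighted entropy. As a rough illustration only, the sketch below assumes one plausible form: the Shannon entropy of each segmented region's pixel-value histogram, combined as a weighted average. The function names, the weighted-average form, and the normalization of the weights are assumptions made for this example, not the thesis's definition.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy (in bits) of the value histogram of a pixel sequence."""
    n = len(pixels)
    if n == 0:
        return 0.0
    counts = Counter(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def regional_weighted_entropy(regions, weights):
    """Weighted average of per-region entropies.

    ASSUMPTION: this weighted-average form is illustrative only; the
    thesis's actual definition is not stated in the abstract.
    regions: list of pixel-value sequences, one per segmented region.
    weights: one non-negative weight per region (e.g. higher for a
             region of interest); they are normalized to sum to 1.
    """
    if len(regions) != len(weights):
        raise ValueError("need exactly one weight per region")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return sum((w / total) * shannon_entropy(r)
               for r, w in zip(regions, weights))


# A two-valued region (entropy 1 bit) weighted 3:1 against a flat region
# (entropy 0 bits).
textured = [0, 0, 1, 1]
flat = [5, 5, 5, 5]
rwe = regional_weighted_entropy([textured, flat], [3, 1])
```

Under this assumed form, raising the weight of a region of interest makes its entropy dominate the descriptor, which is one way the weights could shift the entropy distribution as the abstract describes.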
The concept of multidimensional indexing is applied to content-based image search engines. The index structures of text-based search engines cannot be used because of the characteristics of content-based image search. In this thesis, the R*-tree index is adapted so that it can be applied to content-based image search engines. In multi-feature preprocessing, the multiple feature values of each image are normalized, which facilitates tree construction and querying. The R*-tree circular range query defines the similarity distance used in multi-feature image matching and thereby finds the leaf nodes that contain similar images. Experimental results show that retrieval time is greatly reduced with the R*-tree index and that its time performance is superior to that of a simple index structure.
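The normalization and circular range query can be illustrated with a small sketch. It assumes min-max scaling and Euclidean distance, neither of which the abstract specifies, and it scans the vectors linearly instead of traversing an R*-tree. The helper `min_dist_to_box` shows the standard bounding-box test a tree traversal could use to skip a subtree whose box lies farther than the query radius. All names here are hypothetical.

```python
import math


def min_max_normalize(vectors):
    """Scale every feature dimension to [0, 1] across the collection.

    ASSUMPTION: min-max scaling is used for illustration; the thesis's
    normalization scheme is not given in the abstract.
    """
    dims = len(vectors[0])
    lows = [min(v[d] for v in vectors) for d in range(dims)]
    highs = [max(v[d] for v in vectors) for d in range(dims)]

    def scale(x, lo, hi):
        # A constant dimension carries no information; map it to 0.
        return 0.0 if hi == lo else (x - lo) / (hi - lo)

    return [[scale(v[d], lows[d], highs[d]) for d in range(dims)]
            for v in vectors]


def min_dist_to_box(query, box_low, box_high):
    """Smallest Euclidean distance from a point to an axis-aligned box."""
    return math.sqrt(sum(max(lo - q, 0.0, q - hi) ** 2
                         for q, lo, hi in zip(query, box_low, box_high)))


def circle_range_query(query, vectors, radius):
    """Indices of vectors within `radius` of `query` (linear scan stand-in)."""
    return [i for i, v in enumerate(vectors) if math.dist(query, v) <= radius]


# Three images with two raw features on very different scales.
raw = [[0, 10], [5, 20], [10, 30]]
features = min_max_normalize(raw)
near = circle_range_query(features[1], features, 0.1)
```

Normalizing first keeps a feature with a large numeric range from dominating the distance, which is presumably why the preprocessing step precedes tree construction.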
Based on an analysis of the machine learning used by current commercial search engines and on the characteristics of content-based image search engines, machine learning functions covering three aspects of the engine are designed and implemented. With these functions, both the efficiency and the accuracy of the image search engine improve noticeably.
Based on the above research results, this thesis designs and implements the V2.0 system of a content-based Web image search engine. The system adopts several key technologies of content-based image search engines, such as regional weighted entropy for image feature extraction and the R*-tree multidimensional index structure. The implemented system meets the expected targets for accuracy and user response speed.
|
Chinese Library Classification number: | TP391 |
Release date: | 2011-04-11 |