View thesis information

Thesis title (Chinese):

 Research on Parallel Image Processing under the MapReduce Model (MapReduce模型下的图像并行化处理研究)

Name:

 Zhai Ruitao (翟锐涛)

Student ID:

 201408394    

Discipline code:

 085211    

Discipline name:

 Computer Technology

Student type:

 Master of Engineering

Degree year:

 2017    

School:

 College of Computer Science and Technology

Major:

 Computer Technology

Research direction:

 Big data parallel computing

First supervisor:

 Xue Hongye (薛弘晔)

First supervisor's institution:

 Xi'an University of Science and Technology (西安科技大学)

Thesis title (foreign language):

 Research on Image parallel Processing based on MapReduce Model    

Keywords (Chinese):

 Hadoop ; MapReduce model ; image parallelization

Keywords (foreign language):

 Hadoop ; MapReduce model ; image parallelization    

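Abstract (Chinese):
In recent years, with the rise of cloud computing and big data, the volume of data produced across networked application domains has grown rapidly and has already reached the petabyte level or higher. Within this data, the processing and storage of big image data has become a research focus in many fields. MapReduce is a highly reliable parallel programming framework that is commonly used for parallel computation over large volumes of data and offers good solutions to the problems of complex cluster environments. The MapReduce model, one of the cores of the Hadoop platform, uses this technique to process massive data and has achieved good results, but research on processing image files, and massive numbers of small image files in particular, is still immature. To address this problem, this thesis studies the parallel processing of images under the MapReduce model and presents a new data platform architecture that can serve as a basis for processing massive numbers of small image files. The main research content and results are as follows. First, the thesis surveys the background, current state, and significance of research on massive image data processing in the big data setting, and introduces the Hadoop ecosystem, including its core HDFS system and the MapReduce framework. Second, it gives a detailed design for improving the Hadoop system: it proposes a combined-split method that raises the efficiency of processing massive numbers of small images, and extends the MapReduce software framework so that it supports and processes image files well. Finally, it designs and implements a parallel K-means clustering algorithm for images and a parallel Sobel edge detection algorithm for images under the MapReduce model, and adds operations such as parallel histogram extraction for images under the MapReduce model. Experiments verify the feasibility of the extended MapReduce model and its efficiency in processing image files. Performance-metric analysis confirms the effectiveness of parallel image processing under the MapReduce model, providing a feasible solution for applications that process massive image data files on the Hadoop platform.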
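The combined-split idea named in the abstract can be pictured with a small sketch. The record does not describe the thesis's actual method, so what follows only illustrates the general technique of packing many small files into fewer, larger input splits (the approach taken by Hadoop's `CombineFileInputFormat`), so that one map task handles a batch of images instead of a single tiny file. The function name `pack_into_splits`, the file names, and the greedy packing rule are all assumptions made for illustration.

```python
from typing import Dict, List


def pack_into_splits(file_sizes: Dict[str, int], max_split_bytes: int) -> List[List[str]]:
    """Greedily group files into splits of at most max_split_bytes each.

    Hypothetical helper: a file larger than the limit ends up alone in its
    own (oversized) split rather than being cut into pieces.
    """
    splits: List[List[str]] = []
    current: List[str] = []
    current_bytes = 0
    for name, size in file_sizes.items():
        # Close the current split when adding this file would exceed the limit.
        if current and current_bytes + size > max_split_bytes:
            splits.append(current)
            current, current_bytes = [], 0
        current.append(name)
        current_bytes += size
    if current:
        splits.append(current)
    return splits


if __name__ == "__main__":
    # Made-up file sizes in bytes, for demonstration only.
    sizes = {"a.jpg": 40, "b.jpg": 40, "c.jpg": 40, "d.jpg": 10}
    print(pack_into_splits(sizes, max_split_bytes=100))
```

With one split per small file, the four files above would need four map tasks; packed this way they need two, which is the kind of scheduling overhead a combined split is meant to reduce.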
Abstract (foreign language):
In recent years, with the rise of cloud computing and big data, the amount of data generated by networked applications has grown rapidly and has reached the petabyte level or higher. Within this data, the processing and storage of big image data has become an active research topic. MapReduce is a highly reliable parallel programming framework that is commonly used for parallel computation over large amounts of data and copes well with complex cluster environments. MapReduce, one of the cores of the Hadoop platform, applies this technique to massive amounts of data and has achieved good results, but research on processing image files, especially large numbers of small image files, is still lacking. This thesis discusses the parallel processing of images under the MapReduce model and presents a new data platform architecture that can serve as a basis for processing massive numbers of small image files. The main contents and results are as follows. First, the thesis reviews the research background, current state, and significance of massive image data processing in the big data setting, and introduces the Hadoop ecosystem, including its core HDFS system and the MapReduce framework. Second, it details how the Hadoop system is improved: it proposes a combined-split method to raise the efficiency of processing small images and extends the MapReduce software framework so that it can support and process image files well. Finally, it designs and implements parallel K-means clustering and parallel Sobel edge detection for images under the MapReduce model, and adds parallel histogram extraction for images under the MapReduce model. Experiments verify the feasibility of the extended MapReduce model and its efficiency in processing image files. Performance-metric analysis verifies the effectiveness of parallel image processing under the MapReduce model, which provides a feasible solution for applications that process massive image data files on Hadoop.
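As a rough picture of how the histogram extraction named in the abstract fits the MapReduce model, the sketch below runs the two phases in plain Python rather than on Hadoop. It is an illustration under stated assumptions, not the thesis's implementation: images are assumed to be 8-bit grayscale and stored as nested lists, and the names `map_histogram`, `reduce_histograms`, and `global_histogram` are hypothetical.

```python
from functools import reduce
from typing import Iterable, List, Tuple

# Assumed representation: an image is a list of rows of gray levels in 0..255.
Image = List[List[int]]


def map_histogram(record: Tuple[str, Image]) -> List[int]:
    """Map step: turn one (name, image) record into a 256-bin partial histogram."""
    _name, image = record
    bins = [0] * 256
    for row in image:
        for pixel in row:
            bins[pixel] += 1
    return bins


def reduce_histograms(left: List[int], right: List[int]) -> List[int]:
    """Reduce step: merge two partial histograms by adding matching bins."""
    return [a + b for a, b in zip(left, right)]


def global_histogram(records: Iterable[Tuple[str, Image]]) -> List[int]:
    """Run the map step on every record, then fold the partial results together."""
    partials = map(map_histogram, records)
    return reduce(reduce_histograms, partials, [0] * 256)


if __name__ == "__main__":
    # Two tiny made-up images, for demonstration only.
    records = [("a", [[0, 255], [255, 7]]), ("b", [[7]])]
    hist = global_histogram(records)
    print(hist[0], hist[7], hist[255])
```

Because adding bins is associative and commutative, partial histograms can be merged in any order, which is what allows the map calls to run on different nodes independently.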
Chinese Library Classification number:

 TP391.41    

Open-access date:

 2017-06-13    
