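Thesis title (Chinese): | Research and Application of Association Rule Algorithms in an Online Recruitment System |
Name: | |
Student ID: | 06299 |
Confidentiality level: | Public |
Discipline code: | 081203 |
Discipline: | Computer Application Technology |
Student type: | Master's |
Degree year: | 2009 |
School/Department: | |
Major: | |
First supervisor: | |
Thesis title (foreign language): | The Application of Algorithm Association Rules in On-line Recruitment System |
Keywords (Chinese): | Association rules ; FP-growth algorithm ; HFP-growth algorithm ; DSFP-growth algorithm ; online recruitment system |
Keywords (foreign language): | Association Rules ; Algorithm FP-growth ; Algorithm HFP-growth ; DSFP-growth Algori |
Abstract (Chinese): |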
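With the rapid development of information technology and the widespread use of database technology in information management, databases across application domains now store large volumes of data. Discovering the hidden, previously unknown information in that data has become especially important, and data mining arose to address this problem. This thesis targets the shortcomings of the FP-growth algorithm in association rule mining, improves it at the implementation level, and applies it to an online recruitment system: data from the system is analyzed to predict recruiters' hiring patterns, the mined results are analyzed, and association rule mining is thereby applied to a practical problem.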
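This thesis improves the implementation of FP-growth in two respects. First, FP-growth is a relatively efficient algorithm that generates no candidate sets, but it still has to traverse the result set L many times. The improved HFP-growth algorithm addresses this by storing the data of L in a hash table that maps each item name to its support count; looking up an item's support count then only requires passing the item name to the hash table, which saves the time spent repeatedly traversing L. Experimental results show that the algorithm effectively reduces mining time and performs well in practical mining. Second, because both the data items and the conditional pattern base are processed by the FP-tree construction algorithm, the improved DSFP-growth algorithm unifies the data structure of the data items with that of the conditional pattern base, so that no data structure conversion is needed when conditional trees are generated. Experimental results show that the improved algorithm is more efficient than the original FP-growth algorithm.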
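The hash-table lookup described for HFP-growth can be illustrated with a minimal Python sketch. This is not the thesis's implementation, which this record does not show; the function names `build_support_table` and `order_transaction`, the sample transactions, and the tie-breaking by item name are all assumptions made for illustration. The idea shown is only that frequent items and their support counts are kept in a dictionary keyed by item name, so each lookup avoids scanning a list of results.

```python
from collections import Counter

def build_support_table(transactions, min_support):
    """Count every item once per transaction and keep the frequent ones
    in a dict (hash table) mapping item name -> support count."""
    counts = Counter(item for t in transactions for item in set(t))
    return {item: n for item, n in counts.items() if n >= min_support}

def order_transaction(transaction, support):
    """Drop infrequent items and sort the rest by descending support.
    Each support lookup is a dict access, not a scan of a result list.
    Ties are broken by item name (an assumption of this sketch)."""
    frequent = [item for item in set(transaction) if item in support]
    return sorted(frequent, key=lambda item: (-support[item], item))

# Hypothetical recruitment-style transactions (not data from the thesis).
transactions = [
    ["java", "sql", "english"],
    ["java", "sql"],
    ["java", "english"],
    ["python"],
]
support = build_support_table(transactions, min_support=2)
ordered = [order_transaction(t, support) for t in transactions]
```

The ordered transactions are what a standard FP-tree construction step would then insert; the dictionary replaces any repeated traversal of a list-based result set when a support count is needed.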
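These results are applied in practice to the MingZhi online recruitment system (名智网上招聘系统) to discover association rules among the system's attributes, that is, employers' hiring patterns. By summarizing and analyzing the mined results in light of practical work, the rules effectively help the relevant departments guide students' choice of major during admissions, reduce the blindness with which students choose majors, and optimize the structure of majors, thereby raising the employment rate.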
|
Abstract (foreign language): |
With the development of information technology and the application of database technology in information management, massive amounts of data have been stored in many kinds of databases. Finding the hidden, previously unknown information in these data has become very important, and data mining technology is an efficient solution to this problem. To address the deficiencies of the FP-growth algorithm, this dissertation improves its implementation and applies it to an on-line recruitment system. By analyzing data from the on-line recruitment system, recruiters' hiring rules are identified and the mined results are analyzed, so that association rule mining is used to solve a practical problem.
To address the weaknesses of traditional association rule mining algorithms, this dissertation improves the FP-growth algorithm in two respects. First, FP-growth is a highly efficient association rule mining algorithm that produces no candidate sets, but it still has to traverse the result set L many times. The improved HFP-growth algorithm stores the data of the result set L in a hash table that maps each item name to its support count; when a support count is needed, passing the item name to the hash table returns the corresponding count, which saves the time otherwise spent repeatedly traversing L. Experiments indicate that this algorithm reduces mining time and is considerably more efficient than the traditional algorithms. Second, both the data items and the conditional pattern base use the FP-tree construction algorithm. The improved DSFP-growth algorithm therefore unifies the data structure of the data items and that of the conditional pattern base, which eliminates the data structure conversion step when conditional trees are generated. Experimental results indicate that this algorithm is more efficient than the traditional FP-growth algorithm.
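One way to read the DSFP-growth idea of a unified data structure is sketched below in Python. The abstract does not describe the actual structure used in the dissertation, so everything here is an assumption: transactions and conditional pattern bases are both represented as lists of `(items, count)` pairs, and the hypothetical `build_tree` function accepts either without conversion. Pruning of infrequent items inside the conditional base is omitted to keep the sketch short.

```python
class Node:
    """One FP-tree node: an item, its count, a parent link and child links."""
    def __init__(self, item, parent):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}

def build_tree(weighted_itemsets):
    """Build a prefix tree from (items, count) pairs.
    Items in each pair are assumed to be ordered already."""
    root = Node(None, None)
    header = {}  # item name -> list of tree nodes holding that item
    for items, count in weighted_itemsets:
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header.setdefault(item, []).append(child)
            child.count += count
            node = child
    return root, header

def conditional_pattern_base(header, item):
    """Collect the prefix paths of `item`, in the same (items, count) form
    that build_tree consumes, so no conversion step is needed."""
    base = []
    for node in header.get(item, []):
        path = []
        parent = node.parent
        while parent is not None and parent.item is not None:
            path.append(parent.item)
            parent = parent.parent
        if path:
            base.append((path[::-1], node.count))
    return base

# Hypothetical ordered transactions, each with weight 1.
db = [
    (["java", "english", "sql"], 1),
    (["java", "sql"], 1),
    (["java", "english"], 1),
]
root, header = build_tree(db)
base = conditional_pattern_base(header, "sql")
cond_root, cond_header = build_tree(base)  # same builder, no conversion
```

Because the original database and every conditional pattern base share one representation, the same builder serves both the initial tree and each conditional tree in the recursion.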
The improved algorithms are applied to the database of the MingZhi on-line recruitment system, and association rules between job seekers and departments are obtained, that is, the recruitment rules of the companies. After the mining results are analyzed and combined with practical work, the rules can offer effective guidance for students choosing their specialty and help optimize the structure of specialties, thereby improving the employment rate.
|
Chinese Library Classification (CLC) number: | TP311.13 |
Release date: | 2010-03-31 |