
Improved FTWeightedHasht Apriori Algorithm for Big Data Using Hadoop-Map Reduce Model

Thesis title
Researcher
صارم مجاهد يحيي عمار
Year of approval
Thesis language
English
Abstract

One of the most significant problems in data mining is frequent itemset mining on big datasets. The best-known basic algorithm for mining frequent itemsets is Apriori. The frequent itemsets it extracts can serve as a basis for discovering knowledge, such as detecting previously unknown relationships and producing results that support decision making. When the data size is very large, both memory usage and computational cost become very expensive. In that case, the memory and CPU resources of a single processor are too limited, which makes the algorithm inefficient. Parallelizing and distributing the algorithm therefore improves its performance.
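As a point of reference, the level-wise idea behind classic Apriori can be sketched as follows. This is a minimal single-machine illustration, not the FTWeightedHashT algorithm proposed in the thesis; the function name `apriori`, its parameters, and the toy transactions are assumptions made for this example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets.

    Plain textbook Apriori (hypothetical helper for illustration only).
    `min_support` is an absolute count, not a fraction.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items and keep those meeting min_support.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count step: one pass over the data per level.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1

    return frequent


if __name__ == "__main__":
    baskets = [
        ["bread", "milk"],
        ["bread", "butter"],
        ["bread", "milk", "butter"],
        ["milk"],
    ]
    for itemset, count in apriori(baskets, min_support=2).items():
        print(sorted(itemset), count)
```

The count step scans the whole dataset once per level, which is the part that becomes costly on a single machine and that a MapReduce formulation distributes across nodes.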
In this research, a novel approach named “FTWeightedHashT” is presented for frequent itemset mining on big datasets. The proposed algorithm uses Hadoop-MapReduce to improve scalability and execution time. The execution times obtained in this research are 8040, 4280, 2170, 1030, 850, and 610 milliseconds on a standalone machine and on 2, 4, 8, 12, and 16 nodes, respectively. ANOVA was used to compare these results with previously reported results. Experiments on the Retail and Mushroom datasets showed an improvement of about 60% in execution time. The proposed algorithm can process big datasets efficiently on the Hadoop-MapReduce model with 16 nodes, which significantly reduces execution time and enhances the scalability of the Apriori algorithm.
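The timings quoted above can be turned into speedup and parallel-efficiency figures with simple arithmetic. The sketch below assumes the standalone machine counts as a single node; the derived values are illustrative calculations from the reported times, not figures stated in the thesis, and `speedup_table` is a hypothetical helper name.

```python
# Execution times in milliseconds as reported in the abstract.
# Assumption: the standalone machine is treated as 1 node.
TIMES_MS = {1: 8040, 2: 4280, 4: 2170, 8: 1030, 12: 850, 16: 610}


def speedup_table(times_ms):
    """Map node count -> (speedup, efficiency) relative to the 1-node time.

    speedup    = T(1) / T(n)
    efficiency = speedup / n
    """
    base = times_ms[1]
    return {n: (base / t, base / t / n) for n, t in times_ms.items()}


if __name__ == "__main__":
    for nodes, (speedup, efficiency) in speedup_table(TIMES_MS).items():
        print(f"{nodes:>2} nodes: speedup {speedup:5.2f}, efficiency {efficiency:4.2f}")
```

Under that assumption, the 16-node run is roughly 13 times faster than the standalone run, which corresponds to a parallel efficiency of about 0.82.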
