Manuscript Number : CSEIT1724152
Frequent Data Partitioning using Parallel Mining Item Sets in MapReduce
Authors(3) :- Chenna Venkata Suneel, Dr. K. Prasanna, Dr. M. Rudra Kumar
Traditional parallel algorithms are used for mining frequent itemsets. Existing parallel frequent itemset mining algorithms partition the data equally among the nodes and incur high communication and mining overheads. We address this problem with a data partitioning strategy based on Hadoop. The core of Apache Hadoop consists of a storage part, the Hadoop Distributed File System (HDFS), and a processing part, MapReduce. Hadoop divides files into large blocks and distributes them across the nodes of a cluster. With this strategy, the performance of existing parallel frequent-pattern mining improves. This paper surveys parallel algorithms for frequent itemset mining. We summarize the algorithms developed for frequent itemset mining, both those that generate candidates, such as Apriori, and those that do not, such as FP-growth. These algorithms lack mechanisms for load balancing, data distribution, reducing I/O overhead, and fault tolerance. The most efficient recent method is FiDoop, which uses the frequent items ultrametric tree (FIUT) and the MapReduce programming model. FIUT has four advantages. First, it reduces I/O overhead because it scans the database only twice. Second, only the frequent items in each transaction are inserted as nodes, which gives compressed storage. Third, FIUT offers an improved way to partition the database, which significantly reduces the search space. Fourth, frequent itemsets are generated by checking only the leaves of the tree rather than traversing the entire tree, which reduces computing time.
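The two database scans mentioned above can be illustrated with a minimal sketch. It assumes the transactions fit in memory as Python lists and that support is counted once per transaction; the function names `count_item_support` and `prune_transactions` and the sample data are hypothetical and are not taken from the paper. It covers only the scanning and pruning steps, not tree construction or the MapReduce jobs.

```python
from collections import Counter


def count_item_support(transactions):
    """Scan 1: count the number of transactions that contain each item."""
    support = Counter()
    for transaction in transactions:
        # set() so that an item repeated in one transaction counts once
        support.update(set(transaction))
    return support


def prune_transactions(transactions, support, min_support):
    """Scan 2: keep only the frequent items of each transaction."""
    frequent = {item for item, count in support.items() if count >= min_support}
    pruned = []
    for transaction in transactions:
        kept = sorted(set(transaction) & frequent)
        if kept:  # drop transactions left with no frequent item
            pruned.append(kept)
    return pruned


# Hypothetical sample data, for illustration only
transactions = [
    ["a", "b", "c"],
    ["a", "c"],
    ["a", "d"],
    ["b", "c", "e"],
]
support = count_item_support(transactions)
pruned = prune_transactions(transactions, support, min_support=2)
```

With a minimum support of 2, items `d` and `e` are dropped in the second scan, so only frequent items would later be stored as tree nodes.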
Keywords : Data Mining, Recommender Systems, Social Network
Published in : Volume 2 | Issue 4 | July-August 2017
Chenna Venkata Suneel
M.Tech. (PG Scholar), Dept of CSE, Annamacharya Institute of Technology & Sciences, Rajampet, Kadapa, Andhra Pradesh, India
Dr. K. Prasanna
Associate Professor, Dept of CSE, Annamacharya Institute of Technology & Sciences, Rajampet, Kadapa, Andhra Pradesh, India
Dr. M. Rudra Kumar
Professor, Dept of CSE, Annamacharya Institute of Technology & Sciences, Rajampet, Kadapa, Andhra Pradesh, India
Date of Publication : 2017-08-31
License: This work is licensed under a Creative Commons Attribution 4.0 International License.
Page(s) : 641-644
Publisher : Technoscience Academy