K Nearest Neighbor Joins And Mapreduce Process Enforcement For The Cluster Of Data Sets In Bigdata

Bibliographic Details
Main Authors: Md Shah, Wahidah, Othman, Mohd Fairuz Iskandar, Hussian Hassan, Ali Abdul, Talib, Mohammed Saad, Mohammed, Ali Abdul Jabbar
Format: Article
Language: English
Published: Institute of Advanced Scientific Research 2018
Online Access: http://eprints.utem.edu.my/id/eprint/21621/2/2018%20K%20Nearest%20Neighbor%20Joins%20and%20Mapreduce.pdf
http://eprints.utem.edu.my/id/eprint/21621/
https://www.researchgate.net/publication/325528226_K_Nearest_Neighbor_Joins_and_Mapreduce_Process_Enforcement_for_the_Cluster_of_Data_Sets_in_Bigdata
Description
Summary: K Nearest Neighbor joins (KNN joins) are regarded as primitive yet expensive operations in data mining. Efficient use of the KNN join has produced good results in finding related objects across two data sets held in huge databases. This is achieved by combining the K-Nearest Neighbor query with the join operation to find the distinct objects from different data sets. MapReduce is a programming model that combines a Map procedure and a Reduce method and is widely used in Big Data. It relies on a parallel, distributed algorithm to compute results on a cluster of data sets in Big Data. In this paper, KNN join and MapReduce are combined and applied to clusters of data sets in Big Data for knowledge discovery. Pinpointing specific data within the huge data sets stored in Big Data demands distributed, large-scale data processing. The paper focuses on generic steps for performing KNN join operations on MapReduce. These operations cover data partitioning, data pre-processing, and the necessary calculations. Combining KNN joins with MapReduce on Big Data data sets demonstrates a solution for complex computational analysis.
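As a rough illustration of the generic steps the summary names (partitioning, per-partition calculation, and merging), the sketch below simulates a brute-force KNN join in a MapReduce style using plain Python. It is not the method from the paper: the function names (partition, map_phase, reduce_phase, knn_join), the round-robin partitioning, the Euclidean distance, and the sample points are all assumptions chosen for illustration, and a real deployment would run the map and reduce steps on a framework such as Hadoop rather than in one process.

```python
import heapq
import math
from collections import defaultdict


def partition(points, n_blocks):
    """Split (id, coords) records into n_blocks round-robin blocks (assumed scheme)."""
    blocks = [[] for _ in range(n_blocks)]
    for i, rec in enumerate(points):
        blocks[i % n_blocks].append(rec)
    return blocks


def map_phase(r_block, s_block, k):
    """For each r in r_block, emit its k nearest candidates within one block of S."""
    for r_id, r_xy in r_block:
        cands = [(math.dist(r_xy, s_xy), s_id) for s_id, s_xy in s_block]
        yield r_id, heapq.nsmallest(k, cands)


def reduce_phase(r_id, candidate_lists, k):
    """Merge the per-block candidates for one r and keep the overall k nearest."""
    merged = [c for lst in candidate_lists for c in lst]
    return r_id, [s_id for _, s_id in heapq.nsmallest(k, merged)]


def knn_join(R, S, k, n_blocks=2):
    """Simulate map, shuffle (group by r_id), and reduce in a single process."""
    shuffled = defaultdict(list)
    for r_block in partition(R, n_blocks):
        for s_block in partition(S, n_blocks):
            for r_id, cands in map_phase(r_block, s_block, k):
                shuffled[r_id].append(cands)
    return dict(reduce_phase(r_id, lists, k) for r_id, lists in shuffled.items())


# Hypothetical sample data: two small 2-D data sets R and S.
R = [("r1", (0.0, 0.0)), ("r2", (10.0, 10.0))]
S = [("s1", (1.0, 0.0)), ("s2", (0.0, 2.0)),
     ("s3", (9.0, 10.0)), ("s4", (10.0, 13.0))]
result = knn_join(R, S, k=2)
```

Each map call only sees one block of R and one block of S, so the pairs of blocks can be processed independently; the reduce step is what guarantees that the final k neighbors for each object in R are chosen across all blocks of S.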