TY - JOUR
TI - Survey on Clustering High-Dimensional data using Hubness
AU - Miss. Archana Chaudahri
AU - Mr. Nilesh Vani
JO - International Journal of Scientific Research in Computer Science, Engineering and Information Technology
PB - Technoscience Academy
DA - 2020/01/05
PY - 2020
DO - https://doi.org/10.32628/CSEIT195671
UR - https://ijsrcseit.com/CSEIT195671
VL - 6
IS - 1
SP - 01
EP - 07
AB - Most data of interest in today's data-mining applications is complex and is usually represented by many different features. Such high-dimensional data is, by its very nature, often difficult for conventional machine-learning algorithms to handle. This is considered an aspect of the well-known curse of dimensionality. Consequently, high-dimensional data needs to be processed with care, and the design of machine-learning algorithms needs to take these factors into account. Furthermore, it has been observed that some of the properties that arise in high-dimensional data can in fact be exploited to improve overall algorithm design. One such phenomenon, related to nearest-neighbor learning methods, is known as hubness and refers to the emergence of very influential nodes (hubs) in k-nearest neighbor graphs. A crisp weighted voting scheme for the k-nearest neighbor classifier that exploits this notion has recently been proposed.