Density Based Visualization of Big Data with Graphical Processing Units
Document Type
Thesis
Degree Name
Master of Science (MS)
Department
Computer Science and Info Sys
Date of Award
Fall 2014
Abstract
The purpose of this study was to visualize data clusters using the OPTICS algorithm, with the help of Graphical Processing Units (GPUs) and Python (Python, 2001) as a high-level programming language, through a Graphical User Interface (GUI). The GUI is platform independent, since Python is supported by all major operating systems, such as Windows XP, 7, 8, Linux and Mac OS. Identifying clusters in large databases is not an easy computation for a Central Processing Unit (CPU): the calculations can take from minutes to hours, depending on the size and dimensionality of the input data. A GPU may have a large number of multiprocessors, each of which has several cores. CUDA (Compute Unified Device Architecture) (NVIDIA, 2006) is a parallel programming model developed by NVIDIA (NVIDIA, 2014) that runs on GPUs. For computations that parallelize well, CUDA code can run many times faster than an equivalent CPU implementation. By combining the high computational power of GPUs with the advantages of OPTICS, clustering results can be obtained much faster and more efficiently. In this study, large databases were divided into smaller parts and distributed among multiprocessors or GPUs, which calculated the results and passed the data back to the CPU that had invoked the operation. The tool we developed will help researchers in various fields such as astronomy, medicine, geology, biology and many more. Although implementations of OPTICS are provided by tools such as WEKA (WEKA, 1993) and KNIME (KNIME, 2006), there is no GPU-supported API in the literature. We found that our multiplatform software sped up OPTICS calculations and visualization by up to 24 times compared with the CPU version of the algorithm. From the user's perspective, the tool is simple to use and adaptable to different data formats, giving the user the option of using it in many kinds of analysis on various operating systems.
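The divide-and-distribute step described in the abstract can be illustrated with a minimal CPU-only sketch. It is not the thesis implementation: the function names, the chunk size, and the choice of the core distance as the per-chunk computation are assumptions made for illustration. The sketch also assumes an unbounded neighborhood radius and counts each point as its own first neighbor. In a GPU version, each chunk would be handed to a multiprocessor or device instead of being processed in a loop.

```python
import math


def split_into_chunks(n_points, chunk_size):
    """Return (start, stop) index ranges that together cover range(n_points)."""
    return [(start, min(start + chunk_size, n_points))
            for start in range(0, n_points, chunk_size)]


def core_distances_for_chunk(points, start, stop, min_pts):
    """Core distances for points[start:stop].

    Assumption: the core distance is the distance to the min_pts-th nearest
    neighbor, with the point itself counted as the first neighbor and no
    upper bound (eps) on the neighborhood radius.
    """
    result = []
    for i in range(start, stop):
        # Distances from point i to every point in the data set, ascending.
        dists = sorted(math.dist(points[i], q) for q in points)
        result.append(dists[min_pts - 1])
    return result


def core_distances(points, min_pts, chunk_size):
    """Split the work into chunks, process each chunk, and gather the results.

    The loop body is independent per chunk, which is what makes it a
    candidate for distribution across GPU multiprocessors (hypothetical here).
    """
    gathered = []
    for start, stop in split_into_chunks(len(points), chunk_size):
        gathered.extend(core_distances_for_chunk(points, start, stop, min_pts))
    return gathered


if __name__ == "__main__":
    sample = [(0.0, 0.0), (0.0, 1.0), (0.0, 3.0), (10.0, 10.0)]
    print(core_distances(sample, min_pts=2, chunk_size=2))
```

Because each chunk only reads the shared point set and writes its own slice of the output, the chunked result matches the unchunked one regardless of chunk size.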
Advisor
Mutlu Mete
Subject Categories
Computer Sciences | Physical Sciences and Mathematics
Recommended Citation
Periyapatna, Ramesh Shreyank, "Density Based Visualization of Big Data with Graphical Processing Units" (2014). Electronic Theses & Dissertations. 636.
https://digitalcommons.tamuc.edu/etd/636