Please use this identifier to cite or link to this item: http://dr.iiserpune.ac.in:8080/xmlui/handle/123456789/8370
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Mundankar, Ajinkya | -
dc.contributor.author | MANOHAR, SHARDUL | -
dc.date.accessioned | 2023-12-22T04:03:41Z | -
dc.date.available | 2023-12-22T04:03:41Z | -
dc.date.issued | 2023-11 | -
dc.identifier.citation | 40 | en_US
dc.identifier.uri | http://dr.iiserpune.ac.in:8080/xmlui/handle/123456789/8370 | -
dc.description.abstract | The primary goal of this project is a non-deep-learning solution for segmenting cells in tabular data, for tables with or without gridlines. We devised an algorithm based on K-Means clustering that segments cells regardless of whether gridlines are present. The approach identifies clusters of characters, which typically correspond to words or numbers, and computes the centre of mass of each cluster. The x and y coordinates of these centres are stored in separate arrays. We run K-Means clustering separately on the x coordinates and on the y coordinates, and choose the number of clusters 'k' from the range 1 to a predefined maximum ('max_k') using a novel selection method, because existing methods gave unsatisfactory results. K-Means clustering with the selected 'k' then yields the rows and the columns separately, and individual cells are identified as the intersections of these rows and columns. We also developed an alternative algorithm for tables that contain gridlines. It uses Canny edge detection and the Hough transform to detect lines, finds their intersection points, and uses those points to detect the gridlines, from which the table structure is reconstructed. | en_US
dc.language.iso | en | en_US
dc.subject | Table cell detection | en_US
dc.subject | Cell detection in tabular data | en_US
dc.title | Cell Detection in Tabular data | en_US
dc.type | Article | en_US
dc.description.embargo | No Embargo | en_US
dc.type.degree | BS-MS | en_US
dc.contributor.department | Dept. of Data Science | en_US
dc.contributor.registration | 20181104 | en_US
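
The row and column clustering step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the function names kmeans_1d and assign_cells are hypothetical, the word centres are assumed to have been computed already, and the numbers of rows and columns are supplied by the caller because the abstract does not describe the novel method used to select 'k'.

```python
from typing import List, Sequence, Tuple


def kmeans_1d(values: Sequence[float], k: int, iters: int = 100) -> List[int]:
    """Plain Lloyd's K-Means on 1-D data (hypothetical helper).

    Returns one label per value; labels are ranked so that label 0 is the
    cluster with the smallest centre.
    """
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of values")
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic initialisation: spread the starting centres over the sorted values.
    centres = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value joins its nearest centre.
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        # Update step: each centre moves to the mean of its members.
        new_centres = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centres.append(sum(members) / len(members) if members else centres[c])
        if new_centres == centres:
            break
        centres = new_centres
    # Rank clusters by centre so labels read left-to-right / top-to-bottom.
    order = sorted(range(k), key=lambda c: centres[c])
    rank = {c: r for r, c in enumerate(order)}
    return [rank[lab] for lab in labels]


def assign_cells(
    centres_xy: Sequence[Tuple[float, float]], n_rows: int, n_cols: int
) -> List[Tuple[int, int]]:
    """Cluster x and y coordinates separately; a cell is a (row, column) pair."""
    xs = [x for x, _ in centres_xy]
    ys = [y for _, y in centres_xy]
    cols = kmeans_1d(xs, n_cols)  # columns come from the x coordinates
    rows = kmeans_1d(ys, n_rows)  # rows come from the y coordinates
    return list(zip(rows, cols))


# Centres of mass of six words laid out as a 2 x 3 table (made-up coordinates).
word_centres = [(10, 5), (52, 6), (98, 4), (12, 31), (50, 30), (101, 29)]
print(assign_cells(word_centres, n_rows=2, n_cols=3))
# prints [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
```

Clustering the two axes independently keeps each problem one-dimensional, which is why no gridlines are needed: words that share a column have similar x coordinates even when nothing is drawn between them.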
Appears in Collections: MS THESES

Files in This Item:
File | Description | Size | Format
20181104_Shardul_Pramod_Manohar_MS_Thesis.pdf | MS Thesis | 3.62 MB | Adobe PDF

