dc.description.abstract |
Mixture models are widely used when the interest is in the underlying probability densities. In such cases, the focus is usually on estimating the component parameters and mixing probabilities while achieving a successful clustering. In real-life applications, however, the analysis becomes substantially more challenging for two reasons: first, the data may contain heterogeneous features of continuous, categorical, and even ordinal type; second, a large number of features makes the algorithm computationally expensive, and not all features contribute to the inference. Feature selection will be implemented to address the latter. The goal of this project is to summarize the existing methods and to develop model-based approaches that are robust and scalable. These approaches will enable model-based clustering while simultaneously selecting relevant features within heterogeneous data. The idea is to use a Bayesian framework with Gibbs sampling, a technique that is very popular for mixture models. We investigate Gaussian and categorical features in model-based clustering, assuming that the number of clusters is finite and does not grow with the sample size; such frameworks are called finite mixture models. The proposed methods are compared with the maximum likelihood approach, which uses the Expectation-Maximisation algorithm. |
en_US |