Digital Repository

Offspring GAN augments biased human genomic data

dc.contributor.author DAS, SUPRATIM
dc.contributor.author Shi, Xinghua
dc.coverage.spatial Northbrook en_US
dc.date.accessioned 2023-04-26T09:11:40Z
dc.date.available 2023-04-26T09:11:40Z
dc.date.issued 2022-08
dc.identifier.citation BCB '22: Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health, 50, 1-10. en_US
dc.identifier.isbn 9781450393867
dc.identifier.uri https://doi.org/10.1145/3535508.3545537 en_US
dc.identifier.uri http://dr.iiserpune.ac.in:8080/xmlui/handle/123456789/7753
dc.description.abstract Genomic data have long been used for trait association and disease risk prediction. In recent years, many such prediction models have been built using machine learning (ML) algorithms. Human genomic data and other biomedical data currently suffer from sampling biases with respect to ethnicity, as most of the data come from people of European ancestry. Smaller sample sizes for other population groups can lead to suboptimal results from ML-based prediction models for those populations. In precision medicine, suboptimal predictions for a particular group can have serious consequences, limiting a model's applicability to real-world problems. As data collection for those populations is time-consuming and costly, we suggest deep learning-based models for in-silico data enhancement. Existing Generative Adversarial Network (GAN) models for genomic data, such as the Population-scale Genomic conditional GAN (PG-cGAN), can generate realistic genomic data when trained on fairly unbiased data but fail when trained on biased data, where they encounter severe mode collapse. Our proposed model, Offspring GAN, can resolve the mode collapse issue even when trained on strongly biased genomic datasets. Our results demonstrate the ability of Offspring GAN to generate realistic and diverse label-aware data, which can augment limited real data to alleviate biases and disparities in genomic data. We also propose a privacy-preserving protocol using Offspring GAN to protect the privacy of genomic data. en_US
dc.language.iso en en_US
dc.publisher Association for Computing Machinery en_US
dc.subject Human genomic data en_US
dc.subject Offspring GAN en_US
dc.subject 2022 en_US
dc.title Offspring GAN augments biased human genomic data en_US
dc.type Conference Papers en_US
dc.contributor.department Dept. of Biology en_US
dc.identifier.sourcetitle BCB '22: Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health en_US
dc.publication.originofpublisher Foreign en_US

