Abstract:
Natural Language Processing (NLP) is one of the most challenging and rapidly growing
fields in artificial intelligence. It is concerned with interpreting human language and
deriving meaning from it. Common applications include sentiment analysis and the
classification of reviews from text data.
In this study, we present different language models to assign medical codes to electronic
health records. Medical codes (ICD codes) map diseases, injuries, health conditions and
surgical procedures to a set of universally recognisable alphanumeric codes. They have
become essential for tasks ranging from storing patient records to analysing health
statistics. They also have enormous financial importance in medical billing and insurance.
However, assigning codes to medical records is typically done manually and is error-prone
due to its complexity.
This work presents a comparative study of machine learning models of increasing complexity
for assigning ICD codes to given medical text. We believe this study can serve as a
baseline for further research and improvements.