Introduction: The number of social network users is on the rise, and the size of the user-generated contents is increasing as well. Analyzing the generated contents can lead to the attainment of a vast amount of information, such as users’ feelings on specific products or events, or personal information about life events. Aim:: The aim of this paper is to describe an model for detecting medical information present in generated contents, such as posts or comments. Results: The proposed model is based on the Unified Medical Language System (UMLS) and is tested on a dataset collected from Twitter and Facebook. The extracted information can be used to aid in the early detection of diseases or to supply commercial benefits to medical companies. Experimental results demonstrate that the proposed model achieves 94.6% accuracy and 87% precision. Conclusion: In this study, we attempted to extract clinical information present in UGC. Using the proposed model should involve a reliable dataset that contains most clinical expressions; the UMLS was a suitable dataset for our model.
Text similarity, Electronic health record, Facebook, Social network