{"id":2910,"date":"2025-07-03T10:24:14","date_gmt":"2025-07-03T13:24:14","guid":{"rendered":"https:\/\/icc.fcen.uba.ar\/?p=2910"},"modified":"2025-07-03T10:24:14","modified_gmt":"2025-07-03T13:24:14","slug":"low-cost-algorithms-for-clinical-notes-phenotype-classification-to-enhance-epidemiological-surveillance-a-case-study","status":"publish","type":"post","link":"https:\/\/icc.fcen.uba.ar\/en\/low-cost-algorithms-for-clinical-notes-phenotype-classification-to-enhance-epidemiological-surveillance-a-case-study\/","title":{"rendered":"Low-cost algorithms for clinical notes phenotype classification to enhance epidemiological surveillance: A case study"},"content":{"rendered":"<div class=\"fusion-fullwidth fullwidth-box fusion-builder-row-1 fusion-flex-container has-pattern-background has-mask-background nonhundred-percent-fullwidth non-hundred-percent-height-scrolling\" style=\"--awb-border-radius-top-left:0px;--awb-border-radius-top-right:0px;--awb-border-radius-bottom-right:0px;--awb-border-radius-bottom-left:0px;--awb-flex-wrap:wrap;\" ><div class=\"fusion-builder-row fusion-row fusion-flex-align-items-flex-start fusion-flex-content-wrap\" style=\"max-width:1144px;margin-left: calc(-4% \/ 2 );margin-right: calc(-4% \/ 2 );\"><div class=\"fusion-layout-column fusion_builder_column fusion-builder-column-0 fusion_builder_column_1_1 1_1 fusion-flex-column\" style=\"--awb-bg-size:cover;--awb-width-large:100%;--awb-margin-top-large:0px;--awb-spacing-right-large:1.92%;--awb-margin-bottom-large:20px;--awb-spacing-left-large:1.92%;--awb-width-medium:100%;--awb-order-medium:0;--awb-spacing-right-medium:1.92%;--awb-spacing-left-medium:1.92%;--awb-width-small:100%;--awb-order-small:0;--awb-spacing-right-small:1.92%;--awb-spacing-left-small:1.92%;\"><div class=\"fusion-column-wrapper fusion-column-has-shadow fusion-flex-justify-content-flex-start fusion-content-layout-column\"><div class=\"fusion-text fusion-text-1\"><p>Authors: J. Petri; P. Barcena Barbeira; M. Pesce; V. Xhardez; R. Laje; V. Cotik.<\/p>\n<p>Abstract:<br \/>\nObjective: Our study aims to enhance epidemic intelligence through event-based surveillance in an emerging pandemic context. We classified electronic health records (EHRs) from La Rioja, Argentina, focusing on predicting COVID-19-related categories in a scenario with limited disease knowledge, evolving symptoms, non-standardized coding practices, and restricted training data due to privacy issues.<\/p>\n<p>Methods: Using natural language processing techniques, we developed rapid, cost-effective methods suitable for implementation with limited resources. We annotated a corpus for training and testing classification models, ranging from simple logistic regression to more complex fine-tuned transformers.<\/p>\n<p>Results: The transformer-based, Spanish-adapted models BETO Cl\u00ednico and RoBERTa Cl\u00ednico, further pre-trained with an unannotated portion of our corpus, were the best-performing models (F1= 88.13% and 87.01%). A simple logistic regression (LR) model ranked third (F1=85.09%), outperforming more complex models like XGBoost and BiLSTM. Data classified as COVID-confirmed using LR and BETO Cl\u00ednico exhibit stronger time-series Pearson correlation with official COVID-19 case counts from the National Health Surveillance System (SNVS 2.0) in La Rioja province compared to the correlations observed between the International Code of Diseases (ICD-10) codes and the SNVS 2.0 data (0.840, 0.873, and 0.663, p-values \u22643\u00d710-7). Both models have a good Pearson correlation with ICD-10 codes assigned to the clinical notes for confirmed (0.940 and 0.902) and for suspected cases (0.960 and 0.954), p-values \u22641.7\u00d710-18.<\/p>\n<p>Conclusion: This study shows that simple, resource-efficient methods can achieve results comparable to complex approaches. BETO Cl\u00ednico and LR strongly correlate with official data, revealing uncoded confirmed cases at the pandemic&#8217;s onset. Our results suggest that annotating a smaller set of EHRs and training a simple model may be more cost-effective than manual coding. This points to potentially efficient strategies in public health emergencies, particularly in resource-limited settings, and provides valuable insights for future epidemic response efforts.<\/p>\n<p>More information:<br \/>\n<a href=\"https:\/\/doi.org\/10.1016\/j.jbi.2025.104795\" target=\"_blank\" rel=\"noopener\" data-saferedirecturl=\"https:\/\/www.google.com\/url?q=https:\/\/doi.org\/10.1016\/j.jbi.2025.104795&#038;source=gmail&#038;ust=1751633664025000&#038;usg=AOvVaw18mJX-QQy39OeC4bhFJbcT\">https:\/\/doi.org\/10.1016\/j.jbi.<wbr \/>2025.104795<\/a><\/p>\n<\/div><\/div><\/div><\/div><\/div>\n","protected":false},"excerpt":{"rendered":"","protected":false},"author":9,"featured_media":2911,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[98],"tags":[],"class_list":["post-2910","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-papers"],"_links":{"self":[{"href":"https:\/\/icc.fcen.uba.ar\/en\/wp-json\/wp\/v2\/posts\/2910","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/icc.fcen.uba.ar\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/icc.fcen.uba.ar\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/icc.fcen.uba.ar\/en\/wp-json\/wp\/v2\/users\/9"}],"replies":[{"embeddable":true,"href":"https:\/\/icc.fcen.uba.ar\/en\/wp-json\/wp\/v2\/comments?post=2910"}],"version-history":[{"count":1,"href":"https:\/\/icc.fcen.uba.ar\/en\/wp-json\/wp\/v2\/posts\/2910\/revisions"}],"predecessor-version":[{"id":2912,"href":"https:\/\/icc.fcen.uba.ar\/en\/wp-json\/wp\/v2\/posts\/2910\/revisions\/2912"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/icc.fcen.uba.ar\/en\/wp-json\/wp\/v2\/media\/2911"}],"wp:attachment":[{"href":"https:\/\/icc.fcen.uba.ar\/en\/wp-json\/wp\/v2\/media?parent=2910"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/icc.fcen.uba.ar\/en\/wp-json\/wp\/v2\/categories?post=2910"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/icc.fcen.uba.ar\/en\/wp-json\/wp\/v2\/tags?post=2910"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}