2007 — 2009 |
Elhadad, Noemie |
G08 Activity Code Description: A grant available to health-related institutions to improve the organization and management of health-related information using computers and networks. |
Librarian Infobutton Tailoring Environment @ Columbia University Health Sciences
DESCRIPTION (provided by applicant): Infobuttons are context-specific links, placed in clinical information systems, that retrieve context-specific information from on-line health resources. Infobutton Managers are systems that match context information to potential information needs and provide users with a selection of links to relevant resources. Several Infobutton Managers exist, serving a variety of institutions, and an HL7 standard for Infobutton Managers is in development. Current Infobutton Managers are maintained and tailored by systems personnel. The goal of the proposed project is to increase the use of the Infobutton Manager (IM). We propose to develop a Librarian Infobutton Tailoring Environment (LITE) that will allow institutional librarians to apply their expertise to specify to the Infobutton Manager the information needs of their users, determine appropriate on-line resources to address those needs, and construct appropriate queries for retrieving answers. We envision each institution charging someone with the responsibility for establishing, maintaining, and monitoring the questions and resources provided to the users of their clinical information systems. Our specific aims are to:
1. Conduct a community assessment of the infobutton management functions needed by institution librarians
2. Refine the planned features of LITE (as envisioned above) to address needs of institution librarians
3. Establish a forum for collecting feedback from institution librarians as LITE is developed
4. Develop LITE in an iterative manner, based on feedback from institution librarians
5. Develop a user manual and tutorial for training institution librarians in the use of LITE
6. Evaluate the usability of LITE by trained institution librarians
7. Evaluate the use of LITE at institutions that integrate the IM into their clinical information systems
8. Disseminate the results of the project
9. Promote the use of the IM and LITE
The result will be a resource that will help health care institutions address the problem of unresolved clinician information needs. We exploit existing information technology infrastructure to empower informatics professionals to leverage it for translating and disseminating research information. We seek to create and make available a more usable tool that will facilitate use of the IM by our own and other institutions.
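The matching step described in this abstract (context information in, a selection of resource links out) can be pictured with a small sketch. Everything below is hypothetical and for illustration only: the `TAILORING` table, the `infobutton_links` function, the example questions, and the `example.org` URLs are assumptions, not part of any actual Infobutton Manager or of the HL7 standard. The table stands in for the kind of entries a librarian might maintain through LITE.

```python
from urllib.parse import urlencode

# Hypothetical tailoring table: context type -> entries a librarian might maintain.
# Each entry is (question shown to the user, resource search URL, query parameter name).
TAILORING = {
    "medication": [
        ("What is the usual dose?", "https://resources.example.org/drug-search", "q"),
        ("What are the adverse effects?", "https://resources.example.org/adverse", "term"),
    ],
    "lab_result": [
        ("How should this result be interpreted?", "https://resources.example.org/labs", "test"),
    ],
}


def infobutton_links(context_type, term):
    """Return (question, url) pairs for the given clinical context.

    An unknown context type yields an empty list rather than an error,
    so the calling system can simply show no infobutton.
    """
    links = []
    for question, base_url, param in TAILORING.get(context_type, []):
        # urlencode handles escaping of spaces and special characters in the term.
        links.append((question, f"{base_url}?{urlencode({param: term})}"))
    return links


if __name__ == "__main__":
    for question, url in infobutton_links("medication", "warfarin sodium"):
        print(question, "->", url)
```

Keeping the questions, resources, and query templates in data rather than in code is what would let a non-programmer tailor them, which is the design point the abstract makes.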
|
1 |
2009 — 2010 |
Elhadad, Noemie |
R01 Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies. |
An NLP Approach to Generating Patient Record Summaries @ Columbia University Health Sciences
The long-term goal of this proposal is to enhance the manner in which physicians access, process and marshal medical information by providing them with an automatically generated, comprehensive, and up-to-date summary of the information appearing in a patient record. At the point of patient care, physicians must often rapidly process a potentially overwhelming quantity of information pertaining to a patient. Failure to do so effectively may lead to provision of suboptimal care. Some electronic health record systems provide an automatically produced "cover sheet" geared to help physicians with a broad overview of a given patient, but the information is derived from the structured data fields in the patient record, ignoring the valuable narrative text entered by clinicians over time. We are building upon our prior work in summarization and natural language processing and leveraging our expertise in cognitive research studying information needs and decision making of clinicians to build a patient record summarizer that gathers information from the narrative (unstructured) as well as the structured parts of the record. We focus on producing a summary for patients with kidney disease, as they often have a complex medical history with numerous conditions, procedures and medications. Providing a holistic, up-to-date summary of their chart would prove valuable to physicians in general and nephrologists in particular. The following three aims will be carried out: (1) conduct a formative study to determine how physicians prioritize and mentally represent relevant information when reviewing a patient chart; (2) create a set of automated methods to select salient pieces of information in the patient record and organize them into a coherent summary; and (3) evaluate the efficacy, efficiency and physician-user satisfaction associated with the use of the summarizer.
A primary strength of this proposal is that we are addressing the problem of information overload, a bottleneck in the use of electronic health records, and evaluating the impact of our solution on clinicians' actions and patients' health outcomes. Furthermore, we propose to use novel natural language processing, knowledge-based and data mining methods to extract and organize salient information. Finally, we contribute to informatics research by extending the electronic health record functionalities to go beyond a simple documentation-entry system towards a useful reference and decision-making tool for physicians.
|
1 |
2010 — 2013 |
Chapman, Wendy W.; Elhadad, Noemie; Savova, Guergana K. (co-PI) |
R01 Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies. |
Annotation, Development and Evaluation For Clinical Information Extraction @ University of California San Diego
DESCRIPTION (provided by applicant): Much of the clinical information required for accurate clinical research, active decision support, and broad-coverage surveillance is locked in text files in an electronic medical record (EMR). The only feasible way to leverage this information for translational science is to extract and encode the information using natural language processing (NLP). Over the last two decades, several research groups have developed NLP tools for clinical notes, but a major bottleneck preventing progress in clinical NLP is the lack of standard, annotated data sets for training and evaluating NLP applications. Without these standards, individual NLP applications abound without the ability to train different algorithms on standard annotations, share and integrate NLP modules, or compare performance. We propose to develop standards and infrastructure that can enable technology to extract scientific information from textual medical records, and we propose the research as a collaborative effort involving NLP experts across the U.S. To accomplish this goal, we will address three specific aims: Aim 1: Extend existing standards and develop new consensus standards for annotating clinical text in a way that is interoperable, extensible, and usable. Aim 2: Apply existing methods and tools, and develop new methods and tools where necessary for manually annotating a set of publicly available clinical texts in a way that is efficient and accurate. Aim 3: Develop a publicly available toolkit for automatically annotating clinical text and perform a shared evaluation to evaluate the toolkit, using evaluation metrics that are multidimensional and flexible.
|
0.954 |
2010 — 2014 |
Elhadad, Noemie |
N/A |
CDI-Type I: Collaborative Research: Gaining Knowledge From Other Patients: Structuring and Searching the Content of Health-Related Web Posts
Individuals with chronic diseases rely more and more on online forums, blogs, and mailing lists to exchange information, practical tips, and stories about their conditions and to get emotional support from their peers. While this type of social networking has become central to the daily lives and decision-making processes of many patients, there has been little research on the quality of the content it conveys, or on its use and impact in the fields of medicine and public health. On the patients' side, forums are surprisingly technologically poor: users often have no choice but to browse through massive numbers of posts while looking for a particular piece of information. The lack of appropriate tools to organize, analyze and ultimately understand the overwhelming number of health-related, patient-written posts hinders researchers from investigating this medium and hinders patients from using this medium to its full potential.
This project aims at helping both patients and health professionals access online patient-authored information by creating tools to search for information in patient forums. The proposed work spans several fields: natural language processing, data management, information retrieval, public health and behavioral medicine, and it will build the foundations for understanding peer patient posts available through online forums and mailing lists.
This proposal aims at bringing together information processing and medical understanding of patient-centric resources. The work in this project will process texts from an emerging medium, which directly addresses the immediate concerns of patients. The tools designed as part of this project will benefit the researchers who study the behaviors and information needs of patients online. These tools, such as the intelligent search engine for posts, will also enhance the experience of the patients themselves, who are avid users of this medium. In addition to the research agenda, this proposal presents an education plan consistent with the overall goal of bridging the gap between researchers in computer science and researchers in medical fields. In particular, a course is presented that introduces methods of intelligent information processing in the context of research questions important to the fields of public health and medicine.
|
1 |
2010 — 2011 |
Bantum, Erin O'Carroll; Elhadad, Noemie; Owen, Jason E |
R21 Activity Code Description: To encourage the development of new research activities in categorical program areas. (Support generally is restricted in level of support and in time.) |
Use of Natural Language Processing to Identify Linguistic Markers of Coping
DESCRIPTION (provided by applicant): Understanding mechanisms of action is key to improving psychosocial interventions for cancer and other chronic disease conditions. In cancer, emotional expression has been identified as one possible mediator of the effect of psychosocial intervention on patient-reported outcomes. However, scientific evaluations of psychological mechanisms of adjustment to cancer and other chronic diseases are constrained by limitations associated with self-report measures. Because self-care resources, peer-to-peer networks, and more recent forms of psychosocial intervention are increasingly being delivered online, linguistic and behavioral data can be used to characterize internal coping processes, social interactions, and other manifest behaviors. Few tools are currently available for harnessing text as a potential data source, and signal detection indices of existing tools leave room for considerable improvement in these methodologies (Bantum & Owen, 2009). In the present study, natural language processing and other tools of computational linguistics will be used to develop a machine-learning classifier to identify emotional expression in electronic text data. The aims of the study are: 1) to annotate a large text corpus from cancer survivors using an objective and reliable emotion-coding procedure, 2) to incorporate linguistic and psychological features into a machine-learning classification method and identify which of these features are most strongly associated with codes assigned by trained human raters, and 3) to develop combined psychological and natural language processing (NLP) methods for identifying linguistic markers of emotional coping behaviors. To accomplish these aims, a comprehensive corpus of emotionally-laden cancer communications will be developed from 5 existing linguistic datasets.
Five raters will be selected and undergo a rigorous training procedure for coding emotional expression using an emotion-coding system previously developed by the research team. Coding will take place using an Internet-based coding interface that will allow the investigators to continuously monitor inter-rater reliability. Simultaneous with the coding process, the investigators will link the electronic text data with key linguistic and psychological features, including Linguistic Inquiry and Word Count (LIWC), Affective Norms for English Words (ANEW), WordNet, part-of-speech tags, patterns of capitalization and punctuation, emoticons, and textual context. A machine-learning classifier, using tools of natural language processing, will then be applied to the text/feature data and validated against human-rated emotion codes. The long-term objective of this research is to advance a methodology for objectively identifying coping behavior, particularly emotional expression, in order to supplement self-report measures and improve scientific understanding of adjustment to chronic disease, trauma, or other psychological conditions. This work is essential for identifying mechanisms of action in psychosocial interventions for cancer survivors and others and has significance for the fields of medicine, psychology, computational linguistics, and artificial intelligence. PUBLIC HEALTH RELEVANCE: Identifying specific emotional, cognitive, and behavioral factors that contribute to adjustment to cancer and other chronic diseases is essential for being able to develop and improve effective interventions to promote health and well-being. To date, the study of these factors as mechanisms of action has been limited to self-report measures that may not correlate well with other more objective indicators.
The proposed study will improve our ability to identify mechanisms of action by supplementing self-report measures with objectively identified markers of coping behaviors such as emotional expression in natural language used by individuals living with cancer.
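Some of the features listed in this abstract (patterns of capitalization and punctuation, emoticons) are simple enough to illustrate. The sketch below is an assumption-laden example, not the investigators' feature extractor: the function name `surface_features`, the feature names, and the small emoticon pattern are all hypothetical, and lexicon-based features such as LIWC, ANEW, or WordNet are deliberately left out because they depend on external resources.

```python
import re

# Hypothetical, deliberately small emoticon pattern: eyes, optional nose, mouth.
# Covers forms such as :) ;-) :( =D :P and is not an exhaustive inventory.
EMOTICON_RE = re.compile(r"[:;=][-']?[)(DPp/\\|]")
WORD_RE = re.compile(r"[A-Za-z']+")


def surface_features(text):
    """Count a few surface cues that a downstream classifier could use as features."""
    words = WORD_RE.findall(text)
    # Words of two or more letters written entirely in capitals (e.g. "SO"),
    # excluding single letters such as the pronoun "I".
    all_caps = [w for w in words if len(w) > 1 and w.isupper()]
    return {
        "n_words": len(words),
        "n_all_caps": len(all_caps),
        "n_exclaim": text.count("!"),
        "n_question": text.count("?"),
        "n_emoticons": len(EMOTICON_RE.findall(text)),
    }


if __name__ == "__main__":
    post = "I am SO scared about the scan tomorrow!! :( But thank you all :)"
    print(surface_features(post))
```

In a pipeline like the one described, a dictionary of this kind would be computed per post and combined with the other feature families before being passed to the classifier and compared against the human-assigned emotion codes.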
|
0.955 |
2012 — 2018 |
Hirschberg, Julia; Chang, Shih-Fu (co-PI); Zeevi, Assaf (co-PI); Elhadad, Noemie; Rosenberg, Andrew; Levitan, Rebecca |
N/A |
IGERT: From Data to Solutions: A New PhD Program in Transformational Data & Information Sciences Research and Innovation
This Integrative Graduate Education and Research Traineeship (IGERT) provides Ph.D. students with the unique interdisciplinary training necessary to extract useful information from vast amounts of collected data. Consumer opinions, information on disease and its symptoms, and breaking information on social websites allow us to gather information on a scale previously unknown. However, such big data is useful only if we can extract information from it. Columbia and CUNY, in collaboration with international partners in Argentina and Brazil, propose the interdisciplinary training of students in making sense of big data. Researchers from Computer Science, Electrical Engineering, Psychology, and Statistics will partner with Biomedical Informatics, Business and Journalism, to educate the next generation of information scientists in solving real world problems. We will train students to extract information from multiple data types (text, video, audio) and familiarize them with a wide range of techniques for making sense of information from a world in which YouTube, Wikipedia, Facebook, and Twitter are supplanting newspapers, encyclopedias, television, and consumer surveys as everyday sources of information.
Broader Impacts: A curriculum based upon the concept of studio learning will integrate techniques from Business, Journalism, and Biomedical Informatics. Aided by advisors from large corporations, major research labs, and small start-up companies, the program will encourage IGERT trainees to pursue patents, and to apply their research in society. A second goal will be attracting more diverse students to information sciences by emphasizing real world applications, a supportive environment, and diverse faculty role models. The mentoring and career development activities proposed will help to retain this diverse population and prepare them for their future careers.
IGERT is an NSF-wide program intended to meet the challenges of educating U.S. Ph.D. scientists and engineers with the interdisciplinary background, deep knowledge in a chosen discipline, and the technical, professional, and personal skills needed for the career demands of the future. The program is intended to establish new models for graduate education and training in a fertile environment for collaborative research that transcends traditional disciplinary boundaries, and to engage students in understanding the processes by which research is translated to innovations for societal benefit.
|
1 |
2014 — 2018 |
Elhadad, Noemie; Wiggins, Chris |
N/A |
SCH: INT: Large-Scale Probabilistic Phenotyping Applied to Patient Record Summarization
This project creates novel methods and tools for the analysis of large-scale Electronic Health Record (EHR) data. Models of disease, or phenotypes, are derived from a large collection of patient characteristics, as recorded in the EHR. To assess their value and robustness in a clinical application, the phenotypes are incorporated into a longitudinal patient record summarization system for clinicians at the point of patient care.
The research for this project contributes to two inter-related outcomes: (i) a probabilistic graphical model of a patient record, specifically a Latent Dirichlet Allocation (LDA) model of the patient phenotypes. Models that can handle the heterogeneous data types in the EHR, along with their challenges, such as sparseness and artificial redundancy, are investigated. For the models to be useful in the clinical world, they must be interpretable by humans, easily adaptable for EHR-driven applications, and clinically relevant. This is achieved by encoding prior clinical knowledge in the models and learning from clinicians' feedback automatically; and (ii) a patient record summarizer for clinicians at the point of patient care. The summarizer leverages the probabilistic patient model and learns new models of salience through the clinicians' interactions with the deployed summarizer, in essence learning the relevance of different patient phenotypes. For the evaluation of the phenome model and the summarizer, particular care is given to assessing their value in a real-world clinical setting, at the point of care.
The research builds on and is translated into deliverables that are robust and interoperable with the EHR of a large hospital in New York City. If successful, the availability of interpretable and actionable patient models can substantially affect both EHR-driven research activities and patient care, through better tools for clinicians. Finally, the project introduces students in the field of medicine to STEM activities, while presenting exciting real-world applications to STEM students.
For further information see the project website at: http://people.dbmi.columbia.edu/noemie/phenosum
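For readers unfamiliar with the Latent Dirichlet Allocation model named in outcome (i), its standard generative form is written out below. Reading a "document" as a patient record, a "topic" as a phenotype, and a "word" as a recorded observation is an interpretation of the abstract and not something it spells out; the symbols $K$, $\alpha$, $\beta$, $\theta_d$, $\phi_k$, $z_{d,n}$ and $w_{d,n}$ are the conventional LDA notation, not notation taken from the project.

```latex
\begin{align*}
\phi_k &\sim \mathrm{Dirichlet}(\beta), & k &= 1, \dots, K
  && \text{(distribution over observations for phenotype } k\text{)} \\
\theta_d &\sim \mathrm{Dirichlet}(\alpha), & d &= 1, \dots, D
  && \text{(mixture of phenotypes for patient record } d\text{)} \\
z_{d,n} &\sim \mathrm{Categorical}(\theta_d), & n &= 1, \dots, N_d
  && \text{(phenotype assigned to the } n\text{-th observation)} \\
w_{d,n} &\sim \mathrm{Categorical}(\phi_{z_{d,n}}) & &
  && \text{(the observation itself)}
\end{align*}
```

Under this reading, the prior clinical knowledge mentioned in the abstract would plausibly enter through the priors or through constraints on $\phi_k$, but the abstract does not say how, so that remains an assumption.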
|
1 |
2016 — 2019 |
Elhadad, Noemie; Savova, Guergana K. (co-PI) |
R01 Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies. |
Extended Methods and Software Development For Health NLP @ Columbia University Health Sciences
PROJECT SUMMARY There is a deluge of health-related texts in many genres, from the clinical narrative to newswire and social media. These texts are diverse in content, format, and style, and yet they represent complementary facets of biomedical and health knowledge. Natural Language Processing (NLP) holds much promise to extract, understand, and distill valuable information from these overwhelmingly large and complex streams of data, with the ultimate goal to advance biomedicine and impact the health and wellbeing of patients. There have been a number of success stories in various biomedical NLP applications, but the NLP methods investigated are usually tailored to one specific phenotype and one institution, thus reducing portability and scalability. Moreover, while there has been much work in the processing of clinical texts, other genres of health texts, like narratives and posts authored by health consumers and patients, are lacking solutions to marshal and make sense of the health information they contain. Robust NLP solutions that answer the needs of biomedicine and health in general have not been fully investigated yet. A unified, data-science approach to health NLP enables the exploration of methods and solutions unprecedented up to now. Our vision is to unravel the information buried in the health narratives by advancing text-processing methods in a unified way across all the genres of texts. The crosscutting theme is the investigation of methods for health NLP (hNLP) made possible by big data, fused with health knowledge. Our proposal moves the field into exploring semi-supervised and fully unsupervised methods, which only succeed when very large amounts of data are leveraged and knowledge is injected into the methods with care. Our hNLP proposal also targets a key challenge of current hNLP research: the lack of shared software. We seek to provide a clearinghouse for software created under this proposal, and as such all developed tools will be disseminated.
Starting from the data characteristics of health texts and information needs of stakeholders, we will develop and evaluate methods for information extraction and information understanding. We will translate our research into the publicly available NLP software platform cTAKES, through robust modules for extraction and understanding across all genres of health texts. We will also demonstrate the impact of our methods and tools through several use cases, ranging from clinical point of care to public health, to translational and precision medicine, to participatory medicine. Finally, we will disseminate our work through community activities, such as challenges to advance the state of the art in health natural language processing.
|
1 |
2017 — 2019 |
Elhadad, Noemie |
N/A |
BD Spokes: Spoke: Northeast: Collaborative Research: Integration of Environmental Factors and Causal Reasoning Approaches For Large-Scale Observational Health Research
Vast quantities of health, environmental, and behavioral data are being generated today, yet they remain locked in digital silos. For example, data from health care providers, such as hospitals, provide a dynamic view of health of individuals and populations from birth to death. At the same time, government institutions and industry have released troves of economic, environmental, and behavioral datasets, such as indicators of income/poverty, adverse exposure (e.g., air pollution), and ecological factors (e.g., climate) to the public domain. How are economic, environmental, and behavioral factors linked with health? This project will put together numerous sources of large environmental and clinical data streams to enable the scientific community to address this question. By breaking current data silos, the broader scientific impacts will be wide. First, this effort will foster new routes of biomedical investigation for the big data community. Second, the project will enable discoveries that will have behavioral, economic, environmental, and public health relevance.
This project will aim to assemble a first-ever data warehouse containing numerous health/clinical, environmental, behavioral, and economic data streams to ultimately enable causal discovery between these data sources. First, the team will integrate numerous health data streams by leveraging the Observational Health Data Sciences and Informatics (OHDSI, www.ohdsi.org) network, a virtual data repository that contains millions of longitudinal patient measurements, such as drugs and disease diagnoses. Second, the team will build a centralized data warehouse that contains important environmental, behavioral, and economic data across the United States, such as the Environmental Protection Agency (EPA) air pollution AirData, the United States Census data on income and occupation statistics, and the National Oceanic and Atmospheric Administration (NOAA) data on climate and weather. Third, the team will disseminate emerging computational methods for causal inference and machine learning to enable researchers to find causal links between environmental, economic, behavioral, and clinical factors. The team will leverage our broad collaborative network consisting of academic big data researchers, federal-level institutes (e.g., EPA, NOAA), and hospitals (e.g., Partners HealthCare) to integrate these data and to disseminate cutting-edge machine learning tools. Lastly, the project will create training resources (e.g., interactive how-to guides), coordinate cross-institution student internships, and lead a hands-on workshop to demonstrate use of the integrated data warehouse. The ultimate goal of the project is to facilitate community-led and collaborative causal discovery through dissemination of integrated and open big data and analytics tools.
|
1 |
2019 — 2021 |
Elhadad, Noemie |
R01 Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies. |
PhendoPHL: A Data-Science Enabled Personal Health Library to Manage Endometriosis @ Columbia University Health Sciences
PROJECT SUMMARY Endometriosis is a chronic condition estimated to affect 10% of women of reproductive age. It places a very high burden on quality of life and productivity, and the self-management needs of women living with the disease are multiple. This project aims to design, develop, and evaluate a data-science enabled personal health library called PhendoPHL to support the self-management needs of women living with endometriosis. Grounded in self-determination theory and informed by user-centered design methods, PhendoPHL will enable exploration of health patterns through interactive visualizations of integrated clinical and self-tracked data, identify personalized temporal patterns and compare them to population norms through novel data-science methods, and provide actionable visualizations of data for shared decision making during patient-provider encounters. PhendoPHL builds on our existing work in novel informatics methods for endometriosis, and the extensive experience of our research team in designing and evaluating novel informatics interventions. The proposed work also fills a research gap in personal health informatics: the development and validation of novel computational methods to identify personalized and population-based patterns in clinical and self-monitoring data. Both types of data are critical to successful self-management and challenging from a computational standpoint because they are temporal, heterogeneous, and sparse. Using a mixed-methods evaluation study (standardized surveys, logfile analysis, Critical Incident Technique interviews, focus groups), we will study PhendoPHL's usability, assess the factors critical to user engagement and the perceived impact on self-determination and shared decision making, and examine generalizability to other reproductive chronic conditions in women's health.
|
1 |