2004 — 2010 |
Setoodehnia, Ali; Dobosiewicz, John; Gao, Jing; Yu, Xiaobo; Avirappattu, George |
N/A |
Epsilon Corps: Expanding STEM Talent Through Exploration, Mentoring and Sequenced Curricular Support
Epsilon Corps is enhancing STEM major recruitment and retention among the highly diverse students at Kean University, an urban, comprehensive, public university in northern New Jersey. The project focuses on progressive recruitment and engagement of incoming students, intensive support and STEM skill building, community building with synergistic interaction among all STEM students, and campus-wide, culture-shifting and institutional changes. Epsilon Corps has five interwoven components: 1) engaging both undecided and intended incoming STEM students as Epsilon Explorers and encouraging them to choose STEM majors through peer-led exploration activities in the Summer Epsilon Institute and in Special Sections of Freshman Seminar; 2) engaging prospective and declared STEM majors with sophomore status as Epsilon Scholars and enhancing their STEM skills through mentoring and tutoring in Special Sections of Research & Technology; 3) engaging STEM juniors and seniors as Epsilon Peer Mentors and Project Leaders and enhancing their mentoring, project development and leadership skills through activities in special sections of an interdisciplinary general education course; 4) using this sequenced curricular support structure to build and sustain a synergistic STEM community of active learners and peer mentors, known collectively as Epsilon Corps; and 5) creating a campus-wide science-friendly atmosphere with coherent motivational and support activities, programs and facilities (Epsilon Celebration Day, Epsilon Awards and Scholarships, Epsilon Outreach, Epsilon Web Platform, and Epsilon Activities Center) to promote student interest in STEM careers and enhance their success in STEM programs.
|
0.97 |
2012 — 2017 |
Zhang, Aidong; Gao, Jing |
N/A |
III: Small: Dynamic Social Network Mining: Feature Extraction, Modeling and Anomaly Detection
This project develops a general framework for anomaly detection in dynamic social networks that evolve in both links and nodes. The framework first captures the information in each timestamp of a dynamic network by mapping it into carefully selected graph kernel feature spaces. A dynamic modeling method is then designed to learn the dynamics of the target dynamic social network. Anomaly detection methods are finally developed to mine abnormal nodes in the dynamic network. The main innovation of the approaches is to represent dynamic networks by bags of attributes, including graph kernel features that epitomize details in each timestamp, learned latent variables for the dynamics of networks, and user-specified features that steer the attribute representation toward the target tasks. Based on this representation, the project designs innovative methods based on latent support vector machines and transfer learning to detect abnormal nodes.
This research provides a clear understanding of evolution patterns, including both normality and abnormality, in dynamic social networks. The approaches developed in this project help identify various abnormalities in daily life, including detecting spammers on websites, monitoring potentially dangerous activities in crime networks, identifying malicious sources of infection in disease networks, and many others. The successful modeling of such network dynamics can provide a scientific basis for explaining the appearance and disappearance of human relationships, improve the prospects for uncovering previously undiscovered evolution patterns in the social networking process, and help develop qualitative and quantitative algorithms for more applications.
|
0.913 |
2013 — 2017 |
Gao, Jing |
N/A |
III: Small: Collaborative Research: Conflicts to Harmony: Integrating Massive Data by Trustworthiness Estimation and Truth Discovery
Big data leads to big challenges, not only in the volume of data but also in its dynamics and variety. Multiple descriptions about the same set of objects or events from different sources unavoidably lead to data or information inconsistency. Then, among conflicting pieces of data or information, it is crucial to tell which data source is reliable or which piece of information is correct. Accurate information is referred to as the truth and the chance of a source providing accurate information is denoted as source reliability or trustworthiness. The objective of this project is to detect truths without supervision, by integrating source reliability estimation and truth finding. A unified framework is developed to model complex trustworthiness factors, heterogeneous data types, incremental and parallel computation, and source and data dependencies so that truth and trustworthiness can be inferred from multiple conflicting sources of heterogeneous, disparate, correlated, gigantic, scattered, and streaming data.
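The idea of jointly estimating source reliability and truths can be illustrated with a minimal sketch. This is not the project's framework: it assumes categorical claims only, uses a simple weighted vote for the truth step, and uses a smoothed negative-log error rate for the weight step. The function name `discover_truths` and its parameters are hypothetical.

```python
import math
from collections import defaultdict


def discover_truths(claims, iterations=10):
    """Illustrative truth discovery over categorical claims.

    claims: dict mapping source -> {object: claimed value}.
    Returns (truths, weights). The update rules here are assumptions
    chosen for simplicity, not the method described in the abstract.
    """
    weights = {source: 1.0 for source in claims}
    truths = {}
    for _ in range(iterations):
        # Truth step: each object takes the value with the largest
        # total weight among the sources that claim it.
        votes = defaultdict(lambda: defaultdict(float))
        for source, observations in claims.items():
            for obj, value in observations.items():
                votes[obj][value] += weights[source]
        truths = {obj: max(vals, key=vals.get) for obj, vals in votes.items()}

        # Weight step: sources that disagree with the current truths
        # more often receive lower weights (smoothed to avoid log(0)).
        for source, observations in claims.items():
            errors = sum(1 for obj, value in observations.items()
                         if truths[obj] != value)
            error_rate = (errors + 0.5) / (len(observations) + 1.0)
            weights[source] = -math.log(error_rate)
    return truths, weights
```

Calling `discover_truths` on a few conflicting sources returns one value per object plus a weight per source; sources that agree more often with the inferred truths end up with larger weights, with no labeled data involved.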
This project makes tangible contributions to data integration, information understanding and decision making, and benefits many applications where critical decisions have to be made based on the correct information extracted from diverse sources. Research results of this project are integrated into course materials and projects, and into training students and new generation researchers, especially female and minority students. For further information about this project, please refer to the project website: http://www.cse.buffalo.edu/~jing/truth.htm
|
0.913 |
2015 — 2018 |
Zhang, Aidong; Ramanathan, Murali; Hutson, Alan (co-PI); Freudenheim, Jo; Gao, Jing |
N/A |
III: Medium: High-Dimensional Interaction Analysis in Bio-Data Sets
Discovering interactions between the attributes in a data set provides insight into the underlying structure of the data and explains the relationships between the attributes. This project develops multi-disciplinary approaches that integrate computer science, statistics, and epidemiology techniques to mine interaction relationships among attributes and phenotypes (traits or class labels) in biological data sets. Specifically, this project develops innovative and statistically sound methodologies for mining novel interactions within attributes or between attributes and phenotypes to help identify critical factors in biological applications. In particular, the novel analysis methods can enable the genetic and environmental interactions underlying a range of complex diseases to be delineated. The research activities of this project can also promote the integration of biology, computer science, and statistics, which is highly significant to many applications.
The project will formulate various metrics that enable efficient pruning and searching in the multi-dimensional combinatorial space for identifying significant interaction relationships. This enables highly effective approaches that build search-based trees or identify highly correlated subspaces to detect meaningful local interactions, which may not be significant across the whole data set but interact strongly with traits on a subset of the data. It also enables comparison of data from multiple groups, such as groups defined by age, race, or other properties. It is important to find both common and different interactions in different groups so that effective methods can be developed for targeted groups. The methods developed will detect complex interactions between attributes in multiple groups simultaneously by capturing both their commonalities and differences in joint matrix factorization or deep learning models. These approaches are powerful for biological applications, such as detecting gene-gene and gene-environment interactions that lead to breast cancer. The concept of interaction is also ubiquitous and important in many scientific disciplines, ranging from economics, sociology, and physics to the pharmaceutical sciences. The novel approaches and analysis tools developed in this project are useful for finding interaction relationships between attributes, whether or not they are associated with phenotype labels. These approaches and tools are general and applicable to a variety of applications.
|
0.913 |
2016 — 2023 |
Gao, Jing |
N/A |
CAREER: Mining Reliable Information From Crowdsourced Data
With the proliferation of mobile devices and social media platforms, any person can publicize observations about any activity, event or object anywhere and at any time. The confluence of these enormous crowdsourced data can contribute to an inexpensive, sustainable and large-scale decision system that has never been possible before. Such a system could vastly improve the efficiency and cost of transportation, healthcare, and many other applications. The main obstacle in building such a system lies in the problem of information veracity, i.e., individual users might provide unreliable or even misleading information. This project identifies important research questions in the task of mining reliable information from noisy and unreliable crowdsourced data, and pursues an integrated research and education plan to address these questions. Through integrating data from various sources, this project addresses information veracity, which will benefit the many applications where crowdsourced data are ubiquitous but veracity can be suspect.
In particular, this project develops novel methods to mine reliable information by taking into consideration various properties of crowdsourcing: 1) Crowdsourcing platforms collect users' observations about certain objects. Other valuable information sources, such as spatial-temporal, user influence, and textual data, are leveraged to effectively detect reliable information from these observations. 2) Effective privacy protection and budget allocation mechanisms are designed to better motivate active crowdsourcing. These investigations are integrated with the exploration of both theoretical and practical aspects of the proposed methods. From the theoretical perspective, fundamental questions regarding the confidence in the estimated reliability and the convergence of the proposed methods are explored. From the practical perspective, the proposed methods are adapted to tackle challenging problems in various applications such as transportation, healthcare and education to enable new insights into these domains. In addition to the research advances, this project contributes to educational innovation, as the proposed methods are applied to educational methodologies such as peer assessment and question answering. Additional information about this project, including research results, publications, datasets, and software, can be found at http://www.cse.buffalo.edu/~jing/crowd.htm
|
0.961 |
2017 — 2019 |
Gao, Jing; Stefanone, Michael |
N/A |
EAGER: Collaborative: Understanding and Modeling Rumor Propagation For Vulnerability Assessment of Social Media Platforms
As social media becomes a primary news source, rumors can spread widely in a short time. In recent cases, the rapid spread of misinformation caused social panic, had a dramatic financial impact, and put individuals and communities at great risk. Even when correct information is eventually disseminated, large delays can have devastating consequences. To determine how vulnerable networks are to misinformation spread, and to develop effective proactive and reactive counter-measures, it is necessary to study rumor propagation. However, rumor propagation is challenging to model and capture due to its dynamic complexity and self-sustaining nature.
The rapid diffusion of rumors across online social networks is influenced by numerous factors from both local (user forwarding behavior) and global (network diffusion) perspectives. Diffusion depends on each user's local decision about whether to propagate the information or not. That decision is related to factors including trust relationships, information provenance, and content. Diffusion also depends on the global topology of networks, how users are interconnected, as well as the rate at which users propagate the content. From this global point of view, characterizing rumor propagation across networks requires accurate yet tractable mathematical models of diffusion. This project investigates rumor diffusion via social media from these two perspectives. The integration of social psychological and computer science methodologies in this project reveals propagation patterns in large-scale networks and the psychological motivations driving user behavior. This project contributes to better monitoring, detection, and ultimately prevention of the propagation of misinformation that undermines social stability and national security. Research and training opportunities are offered to students across multiple fields, including computer science, engineering, and social science.
|
0.913 |
2017 — 2019 |
Gao, Jing |
N/A |
EAGER: Medical Knowledge Graph Construction From Heterogeneous Sources
The objective of this project is to construct comprehensive medical knowledge graphs. Such knowledge graphs can satisfy users' growing needs for reliable medical information, facilitate effective communication between patients and doctors, and potentially help reduce the high costs of health care. This project tackles a series of unique challenges observed in medical domains, and develops effective approaches that can extract knowledge from the deluge of crowdsourced data to augment medical knowledge graphs. The proposed research advances the fields of knowledge graph construction and information trustworthiness analysis by developing novel methods that mine knowledge from unstructured data in medical domains.
This project investigates the problem of medical knowledge graph construction from the following perspectives: 1) Effective approaches are developed to take into account the semantic relations between medical terms during the extraction of reliable medical facts from noisy answers on healthcare question-answering websites; 2) A unified framework is designed to integrate information from heterogeneous data sources in the process of medical knowledge discovery; 3) The knowledge graph is completed by inferring new relations based on existing relations between medical terms in the graph. The proposed research is implemented in a system prototype that displays the extracted medical facts and the knowledge graph, which benefits online users who seek medical information. The proposed techniques are used to enhance educational methodologies. Research results of this project are integrated into course materials and projects that reinforce student training.
|
0.913 |
2017 — 2020 |
Su, Lu; Anas, Alex (co-PI); Gao, Jing; Qiao, Chunming (co-PI); Sadek, Adel |
N/A |
SCC-IRG Track 2: Towards Quality Aware Crowdsourced Road Sensing For Smart Cities
With nearly a billion automobiles on the road today, the current transportation systems have begun to show signs of serious strain, such as congestion, traffic accidents, road surface defects, and malfunctioning traffic regulation infrastructure. Therefore, it is of great importance to collect and disseminate road/traffic condition information accurately, efficiently, and in a timely manner. Traditionally, road and traffic monitoring is conducted through either stationary sensors or instrumented probe vehicles. Unfortunately, the prohibitively high deployment cost of such devices makes it impossible to achieve large-scale deployment, leading to limited road coverage and delayed information updates. To mitigate these problems, this project develops QuicRoad, a Quality of Information (QoI) aware crowdsourced road sensing system that can collect road/traffic information from a variety of sources, including smartphones, social media and transportation authorities (as well as future connected vehicles), and then distribute the collected information in real time. The PIs team up with local transportation agencies in the Buffalo-Niagara region on applications related to road surface and traffic condition monitoring, border crossing delay estimation, and incident management.
This project integrates across both social and technological research dimensions. In the technological dimension, it leads to a novel Quality of Information (QoI) aware information integration framework that can jointly optimize the estimation of the QoI of various sources and the information integration and decision-making processes. In the social dimension, it answers fundamental questions such as whether and to what degree the road/traffic condition information provided by the proposed QuicRoad system would change the social behavior of travelers. By seamlessly integrating the technological and social dimensions, the proposed research can not only improve the coverage and quality of assisted driving and road navigation services for travelers, but also support policy-making in traffic planning and operations by transportation authorities. The research will potentially benefit a wide spectrum of real-world road sensing applications aimed at improving road safety, mitigating traffic congestion, and reducing fuel consumption and emissions, and eventually contribute to building a sustainable society.
|
0.913 |
2021 — 2024 |
Gao, Jing; Dobler, Gregory (co-PI); Boukari, Hacene; Bianco, Federica; Tameze, Claude |
N/A |
Collaborative Research: HDR DSC: Delaware and Mid-Atlantic Data Science Corps
Data science is having an enormous impact on society and the economy. As the field of data science evolves, there is an increasing need for a qualified and diverse set of professionals with skills in data science. The Delaware and Mid-Atlantic Data Science Corps network will bring together three partner institutions from the Delaware Valley: the University of Delaware (UD), Delaware State University (DSU), and Lincoln University (LU). The resulting network will strengthen and promote interdisciplinary training in foundational and applied data science, both in the classroom and in projects with internal researchers and external collaborators. This project will provide equitable pedagogical opportunities in data science across students’ backgrounds, resources, and lived experiences, focusing on inclusion of student populations traditionally underrepresented in STEM. The program will offer traditional (classroom-based) and active (participatory workshops and research experiences) opportunities providing job-ready skills to students, including training in the ethics of data science, visualization and communication, and collaborative skills.
Each year of the program will focus on a different area of data-intensive pedagogy and research selected to be maximally relevant to one of the three partnering institutes, maximizing student engagement and empowering each institute to contribute domain expertise and data to support hands-on research opportunities for in-classroom and out-of-classroom activities. DSU and LU, two Historically Black Colleges and Universities with growing interests in data science, will leverage UD’s established portfolio of pedagogical and research initiatives in data science throughout the program to develop independent data science initiatives. A modular approach will be used that will allow the program to meet the needs of students at any point of their career path, regardless of previous exposure to data science and STEM. Coding and Stats Bootcamps will be offered to the uninitiated. Data science foundational classes will level the playing field and offer all students basic skills. Master Classes will enable participatory discussion and hands-on training in ethics, collaboration skills, and scientific communication, and facilitate engagement with external partners. Topical advanced classes and research opportunities will complete the program.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
|
0.943 |
2021 — 2023 |
Gao, Jing |
N/A |
III: Medium: Collaborative Research: Mining and Leveraging Knowledge Hypercubes For Complex Applications
A knowledge repository is a machine-readable structure that stores knowledge about various entities (e.g., organizations, events, genes) and facilitates efficient information seeking. In many domains, knowledge varies with respect to contexts, and the flat structure commonly adopted by existing knowledge repositories cannot capture the complicated knowledge associated with different contexts. To make knowledge resources more findable, accessible, interoperable, and reusable (FAIR), this project plans to conceptualize a new structure, the Knowledge Hypercube, for organizing and retrieving knowledge that could support complex applications in various domains. A knowledge hypercube organizes knowledge with respect to selected important dimensions (e.g., time, locations, conditions), and thus it allows people to easily access knowledge in any context, encapsulate distinctive entities and facts, and conduct cross-dimensional comparison and inference. This project impacts how people find and use knowledge, advances knowledge-based data analytics approaches, and benefits a wide range of domains that have vast literature and unsolved complex tasks by building a bridge between them. Knowledge hypercubes can also support educational innovation and contribute to educational tasks such as knowledge tracing.
The major objective of this proposal is to form a paradigm of mining knowledge hypercubes from massive collections of text documents and leveraging such hypercubes for complex exploration and prediction tasks. To meet this goal, this project tackles a series of technical challenges. First, to automatically construct a knowledge hypercube from massive texts, innovative weakly supervised approaches are designed to organize text documents based on the hypercube structure, extract open entity and relationship information, and organize cell-specific and cross-cell knowledge in a multi-dimensional manner. Second, novel refinement approaches are developed to automatically verify the information quality within and across cells in knowledge hypercubes by cross-checking within the hypercubes and with external information. Third, knowledge hypercubes motivate the development of new discovery and learning tasks. In particular, the project introduces an automatic knowledge search pipeline for leveraging knowledge hypercubes for downstream prediction tasks, and a hypothesis generation approach for scoring unknown associations between concepts. The planned paradigm is realized in two specific domains (i.e., biomedical and news events), demonstrating the power of knowledge hypercubes to enable new insights into these domains.
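The contrast between a flat repository and a structure organized by dimensions can be illustrated with a small sketch. The abstract does not specify a data model, so everything here is an assumption made for illustration: the class name `KnowledgeHypercube`, the representation of facts as tuples, and the exact-match query are all hypothetical.

```python
from collections import defaultdict


class KnowledgeHypercube:
    """Toy multi-dimensional fact store (illustrative assumption only)."""

    def __init__(self, dimensions):
        # Ordered dimension names, e.g. ("time", "location").
        self.dimensions = tuple(dimensions)
        # Each cell is keyed by one coordinate per dimension and
        # holds the set of facts observed in that context.
        self.cells = defaultdict(set)

    def add_fact(self, coords, fact):
        """Store a fact in the cell addressed by coords (dict: dim -> value)."""
        key = tuple(coords[d] for d in self.dimensions)
        self.cells[key].add(fact)

    def query(self, **fixed):
        """Return all facts in cells matching the fixed dimension values."""
        unknown = set(fixed) - set(self.dimensions)
        if unknown:
            raise KeyError(f"unknown dimensions: {sorted(unknown)}")
        result = set()
        for key, facts in self.cells.items():
            if all(key[self.dimensions.index(d)] == v for d, v in fixed.items()):
                result |= facts
        return result
```

Fixing one dimension and leaving the others free gives a slice of the cube, and comparing two slices (for example, two years) is one simple form of the cross-dimensional comparison mentioned above.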
|
0.961 |
2021 — 2024 |
Ferraro, Paul; Messer, Kent (co-PI); Gao, Jing; Ellis, Sean |
N/A |
Sustainable Agricultural Land Use Practices in Large-Scale Landscape Evolution
How farmers use and manage their fields impacts the environment at local, state, national, and global levels. The public and the private sectors have invested billions of dollars each year as short-term financial incentives to encourage the initial adoption of sustainable agricultural land use practices to reduce negative environmental impacts. Many agri-environmental programs promote using cover crops during the winter, a practice that reduces erosion, prevents fertilizer runoff from polluting waterways, provides pollinator services, and improves soil health. While long-term persistent use is required for sustainable agricultural land use practices to generate their public benefits, very little is known about whether short-term financial incentives can lead to long-term persistent change. This project investigates how the persistent application of cover crops varies over time and space, and assesses how this sustainable agricultural land use practice has resulted in the adoption of best management practices that promote sustainability. The findings enable scientists to better model the environmental impacts of changes in land use practices, and to provide advice on how to create landscapes that promote environmental sustainability and societal well-being by incentivizing and contracting for sustainable land use practices.
Changes in local-level land use practices can accumulate to shape landscape patterns and have important lasting impacts on the environment. To study how individual-level sustainable land use decisions aggregate to form large-scale landscape patterns that yield environmental benefits, this project analyzes a large, national dataset (2010-2020) with about 374 million observations of agricultural fields that together cover approximately 95% of planted acres for major commodities, along with state-level longitudinal data and expert opinions. The project generates new insights using machine learning techniques to identify associations between characteristics of related natural and human systems and field-level persistence of the sustainable practice. The insights can be translated for practitioners at federal and state agencies to better design incentive programs to yield the most public benefits per dollar invested.
|
0.943 |