1993 — 1996 |
Das, Gautam |
N/A |
RIA: Network and Polyhedral Approximations
This project will investigate approximation schemes for geometric objects. The geometric objects considered are Euclidean networks and polyhedra. Euclidean networks have diverse applications, such as in modeling geometric networks and circuit interconnections. Polyhedra are used to model almost any real space-filling object, such as machine parts, automobiles, and robots. The complexity of algorithms that manipulate these geometric objects often depends upon the complexity of the objects themselves. The approximation schemes to be studied will attempt to replace the original objects by simpler ones such that relevant properties of the original objects are retained by the simpler objects. The hope is that later computations will be sped up as a result. For dense Euclidean networks, a subset of the edges will be eliminated in such a manner that, in the sparser network, communication paths between any pair of vertices are not appreciably lengthened. For polyhedra, the surface will be replaced by a simpler one containing fewer planar pieces, yet retaining important original properties.
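The abstract does not say which sparsification scheme the project used. As an illustration only, the sketch below uses the well-known greedy spanner construction: edges are scanned from shortest to longest, and an edge is kept only if the network built so far has no path between its endpoints within a factor t of the edge length. The function names and the stretch parameter `t` are assumptions made for this example.

```python
import heapq
import math
from itertools import combinations


def shortest_path_length(adj, source, target, cutoff):
    """Dijkstra from source; returns the distance to target,
    or infinity if no path of length <= cutoff exists."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            # Paths longer than the cutoff can never justify skipping the edge.
            if nd <= cutoff and nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return math.inf


def greedy_spanner(points, t):
    """Return the edges (i, j) of a greedy t-spanner of the complete
    Euclidean graph on points (hypothetical helper for illustration)."""
    n = len(points)
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    adj = {i: [] for i in range(n)}
    kept = []
    for length, i, j in edges:
        # Keep the edge only if the current network stretches this pair too much.
        if shortest_path_length(adj, i, j, t * length) > t * length:
            adj[i].append((j, length))
            adj[j].append((i, length))
            kept.append((i, j))
    return kept
```

On the four corners of a unit square with t = 2, this keeps the four sides and drops both diagonals, because each diagonal (length √2) can be replaced by a two-edge detour of length 2, which is within the allowed 2√2.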
|
0.948 |
1999 — 2002 |
Maletic, Jonathan; Dasgupta, Dipankar (co-PI); Lin, King-Ip (co-PI); Das, Gautam |
N/A |
CISE Research Instrumentation: Instruments for Systems, Software, and Database Research
9818323 Maletic, Jonathan I. Dasgupta, Dipankar University of Memphis
Instruments for Systems, Software, and Database Research
This research instrumentation enables research projects in Geometric Techniques for Data Mining, Immunity-Based Intrusion Detection, Adaptive Indexing, and Software Reuse and Understanding.
To support the aforementioned projects, this award contributes to building an instrumentation infrastructure, a laboratory dedicated to specific research in systems, software, and databases, at the University of Memphis, Department of Mathematical Sciences. The funds will contribute to the purchase of a Sun Enterprise 250, some Sun Ultra 5's, and the supporting networking software facilities. The computational techniques in the first project will be used to handle data mining problems in time series, where the data is not geometric in nature but the solutions involve geometric structures and algorithms, and in spatial data mining, where the data itself is geometric in nature. Specific problems include similarity measures and searching in time series, and proximity and clustering problems in spatial data mining. The second project focuses on investigating immunological principles in designing a multi-agent system for network intrusion detection. The immune agents roam around the nodes of the network and monitor the situation, mutually recognize each other's activities, and produce specific actions, while learning and adapting to their environment dynamically. The third project explores techniques such as query information and automatically adjusting features to enable the indexing to respond to change more appropriately. The last project tries to improve software quality and productivity through software reuse using latent semantic analysis.
|
0.948 |
2008 — 2012 |
Das, Gautam; Athitsos, Vassilis |
N/A |
III-COR-Small: Collaborative Research: Time Series Subsequence Matching for Content-Based Access in Very Large Multimedia Databases @ University of Texas at Arlington
In a query-by-humming system, a user sings part of a melody and the computer identifies the songs that contain the melody. In a sign spotting system, a sign language user searches for occurrences of specific signs in a video database of sign language content. These are two example applications where users want to retrieve the best matching subsequences in a time series database given a query sequence. This project is developing methods for efficient subsequence matching in large time-series databases using the popular Dynamic Time Warping (DTW) distance measure. Embeddings are being designed that partially convert the subsequence matching problem into the much more manageable problem of similarity search in a vector space. This conversion allows leveraging the full arsenal of vector indexing and metric indexing methods for speeding up subsequence matching. The proposed methods will be applicable in a wide variety of time series domains, including, e.g., stock market modeling, seismic activity analysis, and sensor-based health monitoring. To showcase the commercial, social, and educational impact of the research, the project will produce three demonstration systems: a query-by-humming system, a handwritten document search-by-keyword system, and a sign spotting system. The results of the research are being integrated into these systems to achieve efficient retrieval in the presence of large amounts of data. The creation and dissemination of large, real-world datasets for these three systems will be an additional contribution of the project.
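For readers unfamiliar with DTW, the sketch below shows the textbook dynamic program for the DTW distance and its subsequence variant, in which a match may start and end anywhere in the longer series. It is a brute-force baseline under assumed conventions (one-dimensional sequences, absolute difference as the local cost), not the embedding-based method the project develops; the function names are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Classic DTW between two whole sequences, O(len(a) * len(b)) time."""
    m = len(b)
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix
    for x in a:
        cur = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(x - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or diagonal match.
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[m]


def best_subsequence_match(query, series):
    """Subsequence DTW: returns (cost, end_index) of the best-matching
    subsequence of series, where end_index is 0-based and inclusive."""
    m = len(series)
    prev = [0.0] * (m + 1)  # row 0 is all zeros: a match may start anywhere
    for x in query:
        cur = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(x - series[j - 1])
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    # The match may also end anywhere, so take the cheapest end position.
    end = min(range(1, m + 1), key=prev.__getitem__)
    return prev[end], end - 1
```

This scan costs time proportional to the product of the query length and the database length for every query, which is the expense that motivates converting the problem into vector-space similarity search.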
|
0.934 |
2008 — 2010 |
Zhang, Nan; Das, Gautam |
N/A |
SGER: Data Analytics Over Hidden Databases @ University of Texas at Arlington
Structured hidden databases are widely prevalent on the Web. They provide restricted form-like search interfaces that allow users to execute search queries by specifying desired attribute values of the sought-after tuples, and the system responds by returning a few (e.g., top-k) tuples that satisfy the selection conditions, sorted by a suitable ranking function. Although search interfaces for hidden databases are designed with focused search queries in mind, for certain applications it may be advantageous to infer more aggregated views of the data from the returned results of search queries. Such aggregated information will facilitate learning data distributions or building mining models, which can then be used to power and optimize a multitude of emerging data analytical applications.
This research involves developing effective techniques for performing data analytics, especially sampling, over hidden structured databases via their public interfaces. The outcomes include efficient algorithms for sampling hidden databases with a heterogeneous mix of data types, achievability results for sampling different types of search interfaces, and a prototypical toolset which demonstrates the sampling of real-world hidden databases. The ability to pose high-level analytical queries over hidden databases is needed by knowledge workers in a wide variety of corporations, governments, and security agencies. Parts of this project will be integrated into teaching and carried out by students as part of advanced class projects, which will potentially attract motivated students to pursue doctoral degrees. The project Web site (http://dbxlab.uta.edu/dataAnalytics.html) will be used for results dissemination.
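To make the sampling setting concrete, here is a minimal sketch of a random drill-down through a simulated top-k form interface over Boolean attributes. The interface simulator, attribute names, and function names are all assumptions for illustration; the abstract does not specify the project's algorithms. Note that this raw walk is not uniform: a tuple reached after d random choices from a result set of size s is returned with probability 2^-d / s, so a practical sampler would need a correction step on top of it.

```python
import random


def make_top_k_interface(rows, k):
    """Simulate a hidden database: a query returns at most k matching rows
    plus a flag saying whether more matches exist (hypothetical interface)."""
    def query(conditions):
        matches = [
            r for r in rows
            if all(r[attr] == value for attr, value in conditions.items())
        ]
        return matches[:k], len(matches) > k
    return query


def random_drill_down(query, attributes, rng):
    """Add random attribute conditions until the query no longer overflows,
    then return one of the visible rows (or None if the walk fails)."""
    conditions = {}
    remaining = list(attributes)
    while True:
        results, overflow = query(conditions)
        if not overflow:
            break
        if not remaining:
            return None  # still overflowing with every attribute fixed
        conditions[remaining.pop(0)] = rng.choice([0, 1])
    return rng.choice(results) if results else None
```

A caller would invoke `random_drill_down` repeatedly and discard `None` results, which correspond to walks that ended on an empty or unresolvable query.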
|
0.934 |
2009 — 2014 |
Das, Gautam |
N/A |
III: Small: Collaborative Research: Suppressing Sensitive Aggregates Over Hidden Web Databases, a Novel and Urgent Challenge @ University of Texas at Arlington
The objective of this project is to understand, evaluate, and contribute towards the suppression of sensitive aggregates over hidden databases. Hidden databases are widely prevalent on the Web, ranging from databases of government agencies, databases that arise in scientific and health domains, to databases that occur in the commercial world. They provide proprietary form-like search interfaces that allow users to execute search queries by specifying desired attribute values of the sought-after tuple(s), and the system responds by returning a few (e.g., top-k) satisfying tuples sorted by a suitable ranking function.
While owners of hidden databases would like to allow individual search queries, many also want to maintain a certain level of privacy for aggregates over their hidden databases. This has implications in the commercial domain (e.g., to prevent competitors from gaining strategic advantages) as well as in homeland-security related applications (e.g., to prevent potential terrorists from learning flight occupancy distributions). The PIs' prior work pioneered techniques to efficiently obtain approximate aggregates over hidden databases using only a small number of search queries issued via their proprietary front-end. Such powerful and versatile techniques may also be used by adversaries to obtain sensitive aggregates; thus defending against them becomes an urgent task requiring immediate attention. This project investigates techniques to suppress the sensitive aggregates while maintaining the usability of hidden databases for bona fide search users. In particular, it explores a solution space which spans all three components of a hidden database system: (1) the back-end hidden database, (2) the query processing module, and (3) the front-end search interface. The intellectual merit of the project is two-fold: (1) problem novelty: it initiates a new direction of research in information privacy, that of suppressing sensitive aggregates over hidden databases, and (2) solution novelty: it investigates a variety of promising techniques across the three components. The outcomes of this research have broader impacts on the nation's higher education system and high-tech industries. Parts of the project will be carried out by students of the University of Texas at Arlington and George Washington University as advanced class projects or individual research projects.
|
0.934 |
2010 — 2015 |
Das, Gautam; Li, Chengkai |
N/A |
III: Small: EntityEngine: a Query Engine for Entity-Relationship Queries Over Web Text @ University of Texas at Arlington
The continuous evolution of the Web has made it the primary knowledge source for many people. It has become an information repository full of entities (material or virtual) and descriptions of their properties and relationships. In discovering and exploring the entities that fascinate them, users need structured querying facilities, coupled with text retrieval capabilities, that explicitly deal with the entities, their properties, and their relationships.
In this project, the PIs investigate a novel declarative query mechanism, entity-relationship queries (ERQ), for users to discover and explore the rich structured and entity-centric information on the Web. The research objective is to produce general methods for efficient processing and optimization of entity-relationship queries and automatic ranking of query results, and to systematically develop a query engine for such queries. The methodology is to exploit the evidence of the co-occurrence of entities and keyword constraints, through the integration of DB and IR methods. A systematic approach will be taken to produce an automatic method for formulating ranking functions, efficient entity-centric indexing and index selection methods, and top-k query processing algorithms.
The research results will have broader impacts on the higher education system, high-tech industries, the scientific community, and the general public. The educational goal of the project is to be achieved by integrating research and educational efforts through these activities: broadening database curriculum; involving under-represented students and undergraduates in research; outreach; and publicly releasing the online demo, software, datasets, publications, and course materials.
For further information, see the project web page: http://idir.uta.edu/erq
|
0.934 |
2014 — 2019 |
Jiang, Hong (co-PI); Metsis, Vangelis (co-PI); Das, Gautam; Csallner, Christoph (co-PI); Makedon, Fillia; Kung, David; Mariottini, Gian Luca |
N/A |
I/UCRC Phase I: iPerform - I/UCRC for Assistive Technologies to Enhance Human Performance @ University of Texas at Arlington
The I/UCRC for Assistive Technologies (AT) will attract support for the advancement of AT research and promote innovation in academia that is driven by industrial needs. In work environments, ATs can improve work efficiency, identify safety risks, shorten the learning curve or worker training through simulations, and improve resource allocation, creativity and communication. In healthcare environments, AT tools can enhance sensory and cognitive capabilities, improve training & delivery, enable remote monitoring, delay physical and cognitive decline in chronic conditions, personalize rehabilitation, predict risks for the elderly who live alone, monitor sleep disorders, design better prosthetics or drugs, design better robotic assistants, smart wheelchairs, therapy games and tools to monitor mental/physiological conditions, such as depression, epilepsy, or heart problems. The seed projects conducted within the center will help advance basic research in computer vision, machine learning, user interfaces, brain imaging, human robot interaction, human computer interaction, virtual reality, simulation, and many other related research areas.
The projects conducted at the I/UCRC for Assistive Technologies will drive a broad spectrum of advances in areas such as worker productivity and safety, transportation, health, and company operations and intelligence, and will promote the development of AT research infrastructure. The center addresses real-world problems and thus can generate new jobs, products, and services, and can have an impact on all areas where human performance has the potential to improve. The center will play a role in enhancing the quality and diversity of AT professionals and prepare a future generation of competitive employee-scientists who can solve problems arising from unmet human needs. Through compelling projects, the center will also attract students to CSE fields. UTA and UTD have a strong record in training students and have ongoing NSF projects, e.g., on identifying software errors, analyzing facial expressions to identify arthritic pain, and performing efficient multimodal database searches.
|
0.934 |
2014 — 2017 |
Das, Gautam; Huang, Heng |
N/A |
SCH: EXP: Collaborative Research: Privacy-Preserving Framework for Publishing Electronic Healthcare Records @ University of Texas at Arlington
This project builds a novel privacy-preserving framework with both new algorithms and software tools to: 1) evaluate the effectiveness of current identifier-suppression techniques for Electronic Healthcare Record (EHR) data; 2) de-identify and anonymize EHR data to protect personal information without significantly reducing the utility of data for secondary data analysis. The proposed techniques eliminate the violation of privacy through re-identification, and facilitate the secondary usage, sharing, publishing and exchange of healthcare data without the risk of breaching protected health information (PHI). This new privacy-preserving framework injects the ICD-9-CM-aware constraint-based privacy-preserving techniques into EHRs to eliminate the threat of identifying an individual in the secondary use of research data. The proposed technique and development can be readily adapted to other types of healthcare databases in order to ensure privacy and prevent re-identification of published data. The project produces groundbreaking algorithms and tools for identifying privacy leakages and protecting personal privacy information in EHRs to improve healthcare data publishing. New privacy-preserving techniques developed in this project lead towards a new type of healthcare science for EHRs. The project also delivers fundamental advancements to engineering by showing how to integrate biomedical domain knowledge with a computationally advanced quantitative framework for preserving the privacy of published EHRs. HIPAA has established protocols and industry standards to protect the confidentiality of PHI. However, our results demonstrate that, even with regard to health data that meets HIPAA requirements, the risk of re-identification is not completely eliminated. By identifying the security vulnerabilities inherent in the HIPAA standards, our research develops a more rigorous security standard that greatly improves privacy protections by applying state-of-the-art algorithms.
The developed data privacy-preserving framework has significant implications for the future of US healthcare data publishing and related applications. Specifically, the transition from paper records to EHRs has accelerated significantly since the passage of the HITECH Act of 2009. The Act provides monetary incentives for the "meaningful use" of EHRs. As a result, the quality and quantity of healthcare databases have risen sharply, which has renewed the public's fear of a breach of privacy of their medical information. This research work is innovative and crucial not only for facilitating EHR data publishing, but also for enhancing the development and promotion of EHRs. On the educational front, this project facilitates the development of novel educational tools to construct entirely new courses and laboratory classes for healthcare, data privacy, data mining, and a wide range of applications. As a result, it enhances the current instructional methods for teaching data privacy and data mining, and has compelling biomedical and healthcare applications that can facilitate learning of computational algorithms. This project involves both undergraduate and graduate students in the three participating institutions. The PIs make a strong effort to engage minority graduate and undergraduate students in research activities in order to increase their exposure to cutting-edge research.
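The re-identification risk discussed in this abstract can be illustrated with a simple group-size check in the style of k-anonymity. This is a swapped-in, generic technique, not the project's ICD-9-CM-aware constraint method, whose details the abstract does not give. The record fields (`age`, `zip`, `icd9`), the generalization rules, and the sample records are all made up for this example; the only outside fact relied on is that the part of an ICD-9-CM code before the decimal point names its broader category.

```python
from collections import Counter


def generalize(record):
    """Coarsen quasi-identifiers: age to a decade band, ZIP code to its
    first three digits, ICD-9-CM code to the category before the decimal."""
    decade = (record["age"] // 10) * 10
    return (
        f"{decade}-{decade + 9}",
        record["zip"][:3],
        record["icd9"].split(".")[0],
    )


def group_sizes(records):
    """Count how many records share each generalized quasi-identifier tuple."""
    return Counter(generalize(r) for r in records)


def smallest_group(records):
    """The k for which the generalized table is k-anonymous; a value of 1
    means at least one record is unique and easiest to re-identify."""
    return min(group_sizes(records).values())


# Synthetic records, invented for illustration only.
records = [
    {"age": 34, "zip": "76010", "icd9": "250.01"},
    {"age": 37, "zip": "76019", "icd9": "250.40"},
    {"age": 62, "zip": "76013", "icd9": "401.9"},
]
```

In this toy table the first two records fall into the same generalized group, while the third is alone in its group, so further generalization or suppression would be needed before release under a k = 2 requirement.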
|
0.934 |
2017 — 2018 |
Basu Roy, Senjuti; Xu, Heng; Das, Gautam |
N/A |
Convergence HTF: Workshop on Converging Human and Technological Perspectives in Crowdsourcing Research @ Pennsylvania State Univ University Park
Intelligent, interactive, and highly networked machines -- with which people increasingly interact -- are a growing part of the landscape, particularly in regard to work. As automation today moves from the factory floor to knowledge and service occupations, research is needed to reap the benefits in increased productivity and increased job opportunities, and to mitigate social costs. The workshop supported by this award will promote the convergence of computer science, data management, machine learning, education, and the social and behavioral sciences to define key challenges and research imperatives of the nexus of humans, technology, and work. Convergence is the deep integration of knowledge, theories, methods, and data from multiple fields to form new and expanded frameworks for addressing scientific and societal challenges and opportunities. This convergence workshop addresses the future of work at the human-technology frontier.
The specific focus of this workshop is on crowdsourcing -- the production of networked knowledge from public participation. This is a new area of research, gaining attention from researchers who study human-computer interactions, data management, machine learning, human behavior, and business. This workshop will bring together researchers from these and other relevant communities to (1) synthesize the diverse perspectives found in these different fields, (2) integrate different knowledge, theories and data to create a transdisciplinary and convergent research roadmap, and (3) catalyze new research directions and advance scientific discovery and innovation in crowdsourcing research. The workshop will also contribute toward broadening participation in this area of research by proactively seeking inclusion of traditionally underrepresented persons.
|
0.927 |
2017 — 2019 |
Das, Gautam |
N/A |
EAGER: Data Analytics Over Location Based Services @ University of Texas at Arlington
Location Based Services are extremely popular, with millions of users making daily use of mapping services such as Google and Bing maps, as well as location based features integrated into numerous other systems such as Twitter and Yelp. Data analytics over the backend databases of these services can reveal interesting "big picture" information, such as geographic distribution of points of interest and regional variation in user behavior. In this project we show that such interesting data analytics can be performed by users whose access to the databases is limited via the available programming and query interfaces. The research results of this project will impact the nation's higher education system and high-tech industries. The ability to pose high-level analytical queries over location based services is needed by knowledge workers in a wide variety of corporations, governments, and security agencies. Parts of this project are being integrated into teaching, which will potentially attract motivated students to pursue doctoral degrees.
The research involves developing a suite of algorithms and techniques for understanding the opportunities and challenges of data analytics over location based services. The various data analytics and mining tasks considered include point and path aggregate estimation as well as dual mining over location based services limited by available data access interfaces. The research makes fundamental advancements to engineering by showing how to integrate theoretically-proven algorithms with application-specific details of real-world location based services. A data analytics prototype is also being developed and will be evaluated over several real-world location based services.
|
0.934 |
2020 — 2021 |
Das, Gautam; Cong, Zhen; Zhou, Yuan; Xu, Ling; Hagedorn, Aaron |
N/A |
Build and Broaden: Conference on Social Connections to Promote Individual and Community Resilience in Post-COVID-19 Society @ University of Texas at Arlington
This project consists of a conference, the objective of which is to build partnerships and collaborations among the University of Texas at Arlington (UTA), a Hispanic-Serving Institution (HSI), several other Minority-Serving Institutions (MSIs) in Texas, as well as other leading research institutions across the nation. The conference addresses fundamental research questions of the role of social connections as a key form of social capital for individuals and communities adapting to changes after high-impact disasters and extreme events. Collaborative opportunities among the social sciences, computer sciences, and engineering in innovative data, technology, and methods of studying and strengthening social connections are highlighted. The conference promotes interdisciplinary dialogues as the basis for a series of fruitful collaborations to address critical social needs as society recovers from the COVID-19 pandemic.
This hybrid conference includes a two-day live-streamed physical conference, two pre-conference virtual meeting sessions, and a series of post-conference follow-up virtual meetings. Each pre-conference virtual meeting consists of three speakers' presentations and an open discussion session with an estimated 25 participants. The physical conference includes three structured speaker sessions and three open discussion sessions, with an estimated 50 participants. Overall, speaker sessions include topics on 1) social connections as key social capital in coping with the COVID-19 crisis, 2) vulnerability and risks of COVID-19 and the mitigating impact of social connections among minority and high-risk populations, and 3) innovative methods in investigating the impact of social connections on vulnerability and resilience to COVID-19. Open discussion sessions include topics on 1) the unique roles of MSIs in leading research and dissemination among the most affected communities, and 2) exploring interdisciplinary collaborative opportunities and dialogues. A student-focused poster session is held at the physical meeting, with an estimated 50 poster presentations and an additional 50 visiting students. The poster session focuses on social connections as a critical component of resilience. Post-conference follow-up meetings and activities concern team building and how the impact of the conference may be sustained.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
|
0.934 |
2020 — 2023 |
Das, Gautam; Kim, Won Hwa |
N/A |
III: Small: Collaborative Research: An Optimization Framework for Designing Derived Attributes With Humans-in-the-Loop @ University of Texas at Arlington
Attribute design is one of the most challenging aspects of the Big Data Science pipeline, where raw attributes need to be transformed into easily interpretable attributes that can aid data scientists in ad-hoc data exploration and in building predictive models. Unfortunately, current automated techniques for attribute design do not offer adequate interpretability to the end user, and attribute design by human data scientists is painstakingly slow and heavily reliant on domain expertise. This project will develop a novel and transformative approach to enable an ensemble of amateur human workers to be involved in the computational loop for attribute design. It will benefit various domains that require effective applications of Big Data Science. In addition, the project will improve the economic well-being of the country by involving amateur workers and fostering systematic development of a data science workforce. On the educational front, the project will have significant education and outreach activities that span K-12 as well as graduate Data Science education.
The research involves developing a suite of algorithms and techniques for understanding the opportunities and challenges of involving an ensemble of human workers in attribute design, which is inspired by ensemble methods in Machine Learning. The main focus is on iterative methods to guide amateur human workers even with limited domain expertise to suggest new attributes for data exploration and predictive modeling. The research makes fundamental advancements to engineering by integrating theoretically-proven attribute design algorithms with application-specific details of real-world data science tasks. A prototype will be rigorously evaluated involving datasets from several application domains and human workers. The outcomes of this research will spur significant research in next generation human-in-the-loop computing, as well as impact data exploration over huge, high dimensional and unfathomable datasets.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
|
0.934 |