2008 — 2013
Greenhill, Lincoln (co-PI); Pfister, Hanspeter; Aspuru-Guzik, Alan
CDI-Type II: Scientific Computation for Astronomy, Neurobiology, and Chemistry Using Graphics Processing Units and Solid-State Storage
TECHNICAL ABSTRACT
Data-intensive science requires tera- and peta-scale computing. Hardware demands are high, and software, from system to application level, is highly specialized. As a result, scientific investigation is constrained. New and more accessible models for large-scale computing are required. The proposed program seeks to leverage new off-the-shelf computing technologies to develop concepts and tools (e.g., programming strategies, general-purpose and domain-specific libraries) that enable practical and transparent Scalable Heterogeneous Computing (SHC). Results will be applied to three cutting-edge challenges in radio astronomy, quantum chemistry, and neuroscience. These three applications share the need to process massive data streams. They broadly span a parameter space of processing challenges defined by complexity in computation, data volume, and throughput. Addressing these challenges requires scalable algorithms for massive datasets, systems with high-bandwidth memory access, and the ability to process high-throughput data streams. The proposed SHC strategies, tools, and optimizations will develop, for general scientific computing, the use of massively parallel graphics processing units (GPUs) and fast, low-power, large-volume solid-state storage (SSS) devices that are commercially available. The tools developed will be applied to the analysis of radio astronomy data generated by the Murchison Widefield Array, the development of an SHC-enabled molecular quantum chemistry code, and the Connectome project, an effort to make a complete map of the neuronal connectivity of mammalian brains. Education is tightly integrated into the SHC program at the undergraduate and graduate levels. Initiatives to disseminate results will include tutorials and documented open-source libraries as well as workshops.
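The central architectural idea above, streaming massive data through commodity GPUs, can be illustrated with a minimal sketch. This is an assumption-laden toy, not the project's code: CuPy is not named in the abstract, and the per-chunk FFT reduction is a stand-in kernel chosen only to show the pattern of staging chunks through the GPU on alternating streams.

```python
# Minimal sketch, assuming a CUDA-capable GPU and CuPy; the kernel and the
# double-buffered stream pattern are illustrative, not the project's design.
import numpy as np
import cupy as cp

def process_stream(chunks):
    """Reduce each chunk on the GPU, alternating two streams to overlap work."""
    streams = [cp.cuda.Stream(), cp.cuda.Stream()]  # double buffering
    results = []
    for i, chunk in enumerate(chunks):
        with streams[i % 2]:
            d = cp.asarray(chunk)                        # host-to-device copy
            results.append(cp.abs(cp.fft.fft(d)).sum())  # toy per-chunk kernel
    cp.cuda.Device().synchronize()
    return [float(r) for r in results]

# Example: a simulated stream of telescope-like voltage samples.
stream = (np.random.randn(1 << 20).astype(np.float32) for _ in range(8))
print(process_stream(stream))
```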
LAY ABSTRACT
Scientists engaged in data-intensive research, from astronomy to neuroscience, are in desperate need of new strategies and tools. This project approaches the challenge by leveraging new off-the-shelf hardware and software technologies in unique combinations. The project will bring massively parallel graphics processing units and fast solid-state storage devices together with traditional central processing units. Project staff will develop scalable algorithms that leverage this commercially available hardware to process massive data sets and streams of data. The project will use three major scientific challenges as testbeds for the development of these new approaches: a radio astronomy telescope called the Murchison Widefield Array; exploration of chemistry at the quantum level; and the Connectome, an effort to make a complete map of the neuronal connectivity in mammalian brains. Postdoctoral researchers from astronomy, chemistry, neuroscience, and computer science will work together at Harvard's Initiative in Innovative Computing, where they will combine experience from these domains to develop algorithms and code that broadly enable advances in science. The project will include tutorials, workshops, and documented code libraries, as well as a strong educational component.
2011 — 2016
Pfister, Hanspeter; Greenhill, Lincoln (co-PI); Aspuru-Guzik, Alan
CDI-Type II: Bridging the Computational Semantic Gap: A Demand-Driven Framework for Portal-Based Chemistry, Astronomy, and Neurobiology
We propose a new framework to enhance the quality of scientific computational research in the fields of chemistry, astronomy, and neurobiology. We make an analogy with demand-driven, portal-based online knowledge systems such as Google Search, Wolfram|Alpha, Expedia, and Orbitz. For example, travelers in the recent past relied heavily on travel agents who manipulated complicated flight databases and booking protocols. Nowadays, customers can perform searches and make reservations on their own via the web using a simple online interface, without any specialized knowledge of hotel logistics or airport codes. One of the greatest advantages is the freedom to explore the information space autonomously, potentially finding many more travel options and increasing overall competition and quality of results.
In our analogy, the application scientist is the traveler and the computer expert is the travel agent. Scientific computation is an accepted third pillar of scientific discovery, alongside traditional experiment and pen-and-paper theory. However, despite massive investment in world-class high-performance computing (HPC) facilities, it is very often restricted to a cadre of specialists who have mastered complicated software tools and hardware. Successful scientists who lack these computer skills face a semantic gap that impedes their potential. Our ultimate vision is to democratize the exploitation of these valuable HPC resources for a broader range of application scientists, inspired by the success of portal-based online services, which abstract away the technical details of a computer platform without sacrificing functionality. Our system will provide access to simulation and data-processing packages on HPC hardware and facilitate the imaging and visualization of complex data sets. Moreover, a natural-language interface and expert system, based on the Wolfram|Alpha approach, will help guide productive inquiry and interpretation of results. The selected drivers are: (1) quantum chemistry simulation of thousands of molecules, including the search for advanced materials; (2) imaging massive cosmological datasets from radio telescopes; and (3) analysis of high-resolution electron microscope brain images in computational neurobiology.
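To make the portal idea concrete, here is a deliberately toy sketch of the kind of thin layer such a system might place between a scientist's domain-level request and the HPC scheduler. Everything here is an assumption for illustration: the abstract does not specify the scheduler, and the package name and script layout are hypothetical placeholders (SLURM's sbatch is used only as a familiar example of a real batch system).

```python
# Toy sketch of the portal layer: turn a high-level request into a batch job,
# hiding scheduler details from the scientist. The job script contents and the
# package name are hypothetical; only the sbatch CLI itself is a real tool.
import subprocess
import textwrap

def submit_portal_request(package: str, inputs: str, hours: int = 1) -> str:
    """Translate a domain-level request into a (hypothetical) SLURM job."""
    script = textwrap.dedent(f"""\
        #!/bin/bash
        #SBATCH --time={hours}:00:00
        #SBATCH --job-name=portal-{package}
        {package} {inputs}
    """)
    with open("job.sh", "w") as f:
        f.write(script)
    result = subprocess.run(["sbatch", "job.sh"], capture_output=True, text=True)
    return result.stdout.strip()  # e.g. "Submitted batch job 12345"

# A chemist's view of the portal might then be a single call:
# submit_portal_request("quantum_chemistry_scan", "molecules.csv", hours=4)
```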
We envisage a broad impact of the work not only in many areas of fundamental research, but also in education. The program will be integrated into courses offered at Harvard and its Extension School, and outreach to minorities and the underprivileged will be accomplished through established mechanisms at the University. The portal format, coupled with an online database of historical results, naturally enables dissemination of research products and offers an ideal platform for training events and workshops.
2011 — 2015
Pfister, Hanspeter |
CGV: Large: Collaborative Research: Analyzing Images Through Time
This collaborative research project leverages the expertise of four research teams (IIS-1111415, Massachusetts Institute of Technology; IIS-1110955, Harvard University; IIS-1111398, Washington University; and IIS-1111534, Cornell University). Understanding time-varying processes and phenomena is fundamental to science and engineering. Due to tremendous progress in digital photography, images and videos (including images from webcams, time-lapse photography captured by scientists, surveillance videos, and Internet photo collections) are becoming an important source of information about our dynamic world. However, techniques for automated understanding and visualization of time-varying processes from images or videos are scarce and underdeveloped, requiring fundamentally new models and algorithms for representing changes over time. This research involves creating systems that enable modeling, analysis, and visualization of time-varying processes based on image data. These models and algorithms will form the basis for a new set of tools that can help answer important questions about how our environment is changing, how our cities are evolving, and what significant events are happening around the world.
Analyzing images over time poses fundamental new technical challenges. This project focuses on developing and demonstrating end-to-end systems consisting of (1) novel representations necessary to model time-varying image datasets; (2) algorithms for estimating long-range temporal correspondence in image datasets; (3) algorithms for decomposing image datasets into intuitive primitives such as shading, illumination, reflectance, and motion; (4) analysis tools for deriving higher level information from the decomposed representations (e.g., trends, repeated patterns, and unusual events); and (5) tools for visualization of the high-level information and methods for re-synthesis of image data.
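One building block from item (2), long-range temporal correspondence, can be sketched with a textbook technique: matching a patch across frames by normalized cross-correlation. This is an illustrative assumption, not the project's algorithm; patch size, search strategy, and the brute-force scan are chosen only for clarity.

```python
# Hedged sketch: brute-force normalized cross-correlation (NCC) patch matching,
# a standard baseline for temporal correspondence, not the project's method.
import numpy as np

def best_match(patch, frame):
    """Return the top-left (row, col) in frame best matching patch under NCC."""
    ph, pw = patch.shape
    p = (patch - patch.mean()) / (patch.std() + 1e-8)
    best, best_score = (0, 0), -np.inf
    for r in range(frame.shape[0] - ph + 1):
        for c in range(frame.shape[1] - pw + 1):
            w = frame[r:r + ph, c:c + pw]
            score = np.mean(p * (w - w.mean()) / (w.std() + 1e-8))
            if score > best_score:
                best, best_score = (r, c), score
    return best

# Example: track an 8x8 patch from one synthetic frame into the next.
f0, f1 = np.random.rand(64, 64), np.random.rand(64, 64)
print(best_match(f0[10:18, 20:28], f1))
```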
This work has the potential for significant impact in a broad range of areas where images are generated over time, e.g., in ecology, astronomy, urban planning, health, and many others. The results of this research will be broadly disseminated by making source code and datasets publicly available via the project web site (https://groups.csail.mit.edu/vision/image_time/) and by offering tutorials and organizing workshops at major conferences. The project provides educational opportunities and offers hands-on collaborative research experience to students at both the undergraduate and graduate levels across the four institutions.
2011 — 2015
Pfister, Hanspeter |
CGV: Small: Collaborative Research: From Virtual to Real
Wojciech Matusik, MIT, and Hanspeter Pfister, Harvard University
Novel and innovative digital output devices, such as stereoscopic TVs, passive (e-Ink) displays, and 3D printers, are entering the mass market. They are rapidly improving in quality and decreasing in price. This trend empowers users to consume and produce digital media like never before. However, while there has been tremendous progress in the hardware development of these output devices, the accompanying digital content creation software, algorithms, and tools are largely underdeveloped. For example, creating a 3D hardcopy of an animated computer graphics character is well beyond the reach of consumers, and approximating the character's appearance and deformation behavior using multi-material 3D printers is difficult or perhaps even impossible for professionals. The main issues are a lack of accurate previews of what the output will look like, a lack of standardization between devices with similar capabilities, and a lack of accurate conversion tools and algorithms to go from the virtual (i.e., the computer model) to the real (i.e., the physical output).
This research involves the development of a complete process and software framework for moving from abstract computer models to their physical counterparts efficiently and accurately. Designing this process poses the following fundamental computational challenges: (1) accurate and efficient simulation methods that can predict the properties and behavior of an output without physically generating it; (2) efficient methods to compute an output gamut that describes the physically realizable outputs for a given device; (3) general gamut mapping algorithms that convert abstract computer models to realizable points in the device gamut; and (4) accurate perceptual metrics that allow comparison of different output elements during gamut mapping. This research focuses on two emerging classes of important output devices: multi-view auto-stereoscopic displays and multi-material 3D printers. The research is creating a complete and general software architecture that will support both existing and future output devices.
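A minimal sketch of challenge (3), gamut mapping, under simplifying assumptions: the device gamut is represented as a finite set of realizable samples, and the perceptual metric of challenge (4) is approximated by Euclidean distance in CIELAB (the 1976 Delta E). Neither choice is taken from the proposal; both are standard simplifications.

```python
# Hedged sketch: project a target appearance onto the nearest physically
# realizable sample. The sampled gamut and the Lab-Euclidean metric are
# assumptions, not the project's actual gamut representation or metric.
import numpy as np

def map_to_gamut(target_lab, gamut_lab):
    """target_lab: (3,) CIELAB value; gamut_lab: (N, 3) realizable samples."""
    d = np.linalg.norm(gamut_lab - target_lab, axis=1)  # ~ Delta E 1976
    return gamut_lab[np.argmin(d)]

# Example: a hypothetical printer gamut sampled at 1000 points.
gamut = np.random.uniform([0, -60, -60], [100, 60, 60], size=(1000, 3))
print(map_to_gamut(np.array([55.0, 70.0, 10.0]), gamut))
```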
2014 — 2018
Pfister, Hanspeter; Lichtman, Jeff (co-PI)
BIGDATA: IA: DKA: Collaborative Research: High-Throughput Connectomics
Connectomics is the science of mapping the connectivity between neuronal structures to help us understand how brains work. By analogy with astronomy, connectomics researchers wish to build 'telescopes' that allow scientists to accurately view the brain. However, as in astronomy, the raw data collected by microtomes and electron microscopes, the instruments of connectomics, are too large to store effectively and must be analyzed at very high computation rates. Our goal is to research, develop, and deploy a software architecture that enables high-throughput analysis of connectomics data at the speed at which it is acquired. We will develop the first computational infrastructure to support high-throughput connectomics without human intervention. If successful, this system will allow, for the first time, the mapping of a cortical column of a small mammalian brain (one cubic millimeter) and, it is hoped, the mapping of significant sections of a mammalian cortex within a few years.
The solution to the big data problem of connectomics is a new high-throughput connectomics software architecture that we call MapRecurse. MapRecurse, so named because it bears some resemblance to the widely used MapReduce framework, will provide a unified way of specifying sequences of computational steps and validation tests to be applied to the collected data. Key to MapRecurse will be the ability to lay out data and computation in a structured way that preserves locality. Using it, programmers will be able to apply fast, less accurate segmentation algorithms to low resolutions of the data in order to quickly compute a first version of the output neural network graph. Domain-specific graph-theoretical methods will then check the correctness of the graph and identify areas of inconsistency that need further refinement. MapRecurse will then apply bottom-up, local processing with slower, more accurate segmentation and reconstruction algorithms to higher resolutions of the data, verifying and correcting any errors. The iterations progress recursively and in parallel across multiple cores, giving the approach its name. We believe that MapRecurse and the data structures and algorithms developed here will find use in other high-throughput domains such as astronomy, biology, social media, or economics.
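The coarse-to-fine control flow described above can be illustrated with a deliberately tiny toy. The abstract does not publish MapRecurse's internals, so everything below is a stand-in: thresholding plays the role of the fast segmentation pass, a variance check plays the role of graph validation, and quadrant recursion plays the role of refining only inconsistent regions.

```python
# Hedged toy of the recursive refine loop: fast pass, validate, recurse into
# failing subregions. Thresholding and the variance test are placeholders for
# the real segmentation and graph-consistency checks, which are not specified.
import numpy as np

def map_recurse(img, depth=0, max_depth=3):
    seg = img > img.mean()                      # fast, inaccurate first pass
    if img.std() < 0.1 or depth == max_depth:   # "validation" passes: accept
        return seg
    h, w = img.shape
    out = np.zeros_like(seg)
    for r in (slice(0, h // 2), slice(h // 2, h)):   # refine each quadrant
        for c in (slice(0, w // 2), slice(w // 2, w)):
            out[r, c] = map_recurse(img[r, c], depth + 1, max_depth)
    return out

print(map_recurse(np.random.rand(64, 64)).sum())
```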
2015 — 2018
Pfister, Hanspeter; Brenner, Michael
REU Site: Team Research in Computational and Applied Mathematics (TRiCAM)
The Team Research in Computational and Applied Mathematics (TRiCAM) REU program aims to give students an experience in real-world collaborative problem solving, challenging them to apply mathematics and computation to tackle team projects posed by Harvard faculty and industrial partners. Projects will involve the application of computational and mathematical tools such as machine learning, data analysis, and numerical simulation to solve problems in fields such as geoscience, medicine, materials science, and the social sciences. Topics will be chosen to appeal to a wide population of students who are at early stages of their academic development, and who have limited awareness of the vast range of potential career paths in applied mathematics and computational science. The program's team-based approach to research will teach students an appreciation for scientific collaboration, prepare them for future employment, and provide opportunities to practice conflict resolution skills. Ultimately the program will help develop a new generation of collaborative scientists and engineers who are excited about applying computation and mathematics to solve interdisciplinary, real-world problems.
This REU project will support four teams of undergraduate students for ten weeks each summer as they gain both the mathematical, computational, and statistical skills necessary to tackle a research problem and real-world experience of working in a team-based collaborative environment. The summer will be divided into four phases: a two-week orientation, during which teams formulate a statement of work for the summer in response to the problem posed by the faculty or industrial sponsor; two three-week work periods, the first resulting in a midterm report submitted to the sponsor and the second focused on responding to feedback on the midterm report; and a one-week completion phase, during which teams prepare final reports and presentations for both the sponsors and the larger Harvard REU community. Students will be selected for teams based on their individual academic strengths and their potential fit with a team and a particular project. The program will focus on students for whom this will be an early, perhaps even a first, experience with research, through targeted recruiting at historically black colleges and non-research institutions.
2016 — 2019
Pfister, Hanspeter; Lichtman, Jeff (co-PI)
US-Israel Collaboration: Collaborative Research: New Tools for Extracting Neuronal Phenotypes from a Volumetric Set of Cerebral Cortex Images
A major limitation in connectomics is that there are few tools to transform connectomic images into a minable database. The research aim of this project is to develop a suite of tools that extract essential structural parameters from brain tissue imaged at very high (nanometer-scale) resolution. The PIs will determine, using automated methods, the sizes and shapes of neurons and synapses and their connectivity patterns. Using their tools, the PIs will analyze this detailed and varied dataset to find the key patterns within it. It is their belief that such automated methods are required to comprehend the regularities and rules that govern the formation of neural circuits in the cerebral cortex, which to date have only been studied on very small sample spaces. The cerebral cortex remains perhaps the least understood aspect of mammalian biology. No study of this magnitude of the neuronal phenotype space has ever been conducted: the dataset will contain hundreds of thousands of somata and a billion synapses, allowing the PIs to search for patterns that could only be guessed at with the tools used in prior research. Knowing what overarching organizational principles exist in a cerebral cortical network is crucial for understanding how brains work normally and how they may go awry in disease. Moreover, connectomic studies focused on a wide range of species and parts of the brain are beginning in a large number of laboratories throughout the world. These tools should have direct applicability to many of these endeavors.
The PIs are a consortium of four laboratories with complementary areas of expertise in computer science (Shavit), systems biology (Alon), image processing (Pfister), and neurobiology (Lichtman). Together they are building a stacked set of methods that extract important parameters from connectomic images. These methods include neuron geometry extraction, network structure, motif detection, and archetypical pattern analysis. These approaches are based on two software platforms: the MapRecurse platform for generating connectome graphs and the Pareto Inference Engine for mining patterns within such graphs. The PIs will test these techniques on a volume of mammalian cerebral cortex containing tens of thousands of cells and a billion synapses, with the aim of extracting properties of neural circuits that would be difficult or impossible to obtain any other way. The work in this proposal will have significant impact on neuroscience. It speaks directly to the central goals of the White House BRAIN Initiative. It will provide neuroscientists with a number of powerful and novel tools to understand the cells and circuits that underlie brain function. It should also be influential in developing approaches in machine learning and neuromorphic computing.
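The "archetypical pattern analysis" idea can be illustrated with a minimal Pareto-front computation: the set of cells that no other cell dominates in every measured trait. The two traits and the synthetic data below are invented for illustration; the abstract names the Pareto Inference Engine but does not specify its algorithms.

```python
# Hedged sketch: non-dominated (Pareto-optimal) cells in a two-trait space,
# assuming both traits are to be maximized. Traits and data are hypothetical.
import numpy as np

def pareto_front(x):
    """x: (N, 2) trait matrix; returns a boolean mask of non-dominated rows."""
    mask = np.ones(len(x), dtype=bool)
    for i, p in enumerate(x):
        if mask[i]:
            # A row is dominated if it is <= p everywhere and < p somewhere.
            dominated = np.all(x <= p, axis=1) & np.any(x < p, axis=1)
            mask[dominated] = False
    return mask

traits = np.random.rand(200, 2)  # e.g., normalized soma size vs. synapse count
print(traits[pareto_front(traits)])
```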
A companion project is being funded by the US-Israel Binational Science Foundation (BSF).
2018 — 2021
Pfister, Hanspeter; Lichtman, Jeff (co-PI)
NCS-FO: Analyzing Synapses, Motifs, and Neural Networks for Large-Scale Connectomics
High-resolution analysis of the brain's connectivity, which reveals the actual wiring diagram connecting the nerve cells of the brain, provides otherwise unattainable insights into how the healthy brain works and what goes awry in diseases and disorders of the nervous system. The primary challenge of this approach is that at present there are no reliable, robust, and powerful computer-based techniques to analyze the extraordinarily large and vastly complicated networks of brain cells and to detect connectional motifs in their highly branching and connected structure. Nor are there visualization tools that allow neuroscientists to explore brain network patterns effectively. This work will analyze large brain networks from electron microscopy datasets of young and old mammalian brain samples. These datasets each contain hundreds of thousands of nerve cells and billions of synapses that interconnect them. The proposal aims to develop new methods and tools to analyze these vast brain networks at the synapse, motif, and network levels. If successful, the project will provide data and analysis tools for the development of new theories of how the brain works.
Recent advances in image acquisition using multi-beam serial-section electron microscopy (ssEM) and automated segmentation methods have enabled data collection for large tissue samples in a variety of animals. These data will be used to curate large-scale datasets with one million labeled synapses, including synaptic cleft locations, pre- and postsynaptic polarity predictions, and excitatory/inhibitory type predictions. This has not been accomplished previously, given the enormous amount of data. The aim is to discover synaptic motifs by subdividing complex neural networks into quantifiable and meaningful subgraphs. Candidate motifs will be generated automatically by developing an efficient neurite-centric wiring-diagram reconstruction method and a subgraph detection algorithm that finds common patterns. These data will be used to quantify and compare reconstructed neural networks from different specimens at different spatial and temporal scales, and to build a visualization platform that assists neuroscientists in analyzing these networks as they ask and answer fundamental questions about neural circuits in the brain.
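A textbook version of the motif-discovery step is counting a specific subgraph, here the three-node feed-forward loop, and comparing its frequency against density-matched random graphs. This is a standard baseline assumed for illustration, not the proposal's subgraph detection algorithm; networkx and the synthetic graph are likewise stand-ins.

```python
# Hedged sketch: feed-forward-loop census plus a z-score against an
# Erdos-Renyi null of matched density. Standard baseline, not the project's
# actual motif-detection or null-model choices.
import networkx as nx
import numpy as np

def ffl_count(g):
    """Count feed-forward loops a->b, b->c, a->c in a directed graph."""
    return sum(1 for a, b in g.edges() for c in g.successors(b)
               if c != a and g.has_edge(a, c))

def motif_zscore(g, n_null=20, seed=0):
    """Compare g's motif count to density-matched directed random graphs."""
    n, m = g.number_of_nodes(), g.number_of_edges()
    p = m / (n * (n - 1))
    null = [ffl_count(nx.gnp_random_graph(n, p, directed=True, seed=seed + i))
            for i in range(n_null)]
    return (ffl_count(g) - np.mean(null)) / (np.std(null) + 1e-9)

g = nx.gnp_random_graph(60, 0.08, directed=True, seed=42)
print(motif_zscore(g))
```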
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
2021 — 2024
Pfister, Hanspeter |
III: Medium: Collaborative Research: Situated Visual Information Spaces
The aim of this project is to enable people to effectively visualize information about the world in augmented reality. Augmented reality is potentially the next big social benefit from computer technologies because it allows visual information to be embedded, or 'situated', into the real world. This allows people using smartphones and smartglasses to see data around them in the correct real-world context. However, unlike on a regular computer or smartphone display, where a designer has complete control over how the application looks and feels, augmented reality visualizations are inherently overlaid on the real world. As such, visualizations must be capable of reacting to different real-world environments, including dynamic scenes, and design recommendations are needed that specify how visualizations should react to different environments. This project will scientifically investigate visualization for augmented reality, study the efficacy of different approaches, create design recommendations, and then build a software system that can apply these recommendations to help design and run effective visualization applications. The proposed approach will be experimentally validated in the sports and healthcare domains.
Situated visual information spaces fuse the digital information world with the physical world of objects, people, locations, and environments using augmented reality. To realize this, three scientific and design challenges will be tackled: (1) Situated visualization, interaction, and collaboration, which requires intuitive in-situ data visualizations, physical and digital interfaces for natural user interactions, and schemes for collaboration in augmented reality. Novel situated visual embedding methods will be studied for spatial and non-spatial data in dynamic environmental and situational contexts. These visualizations will automatically adapt to the physical environment, digital entities, users, and tasks while using perceptually and cognitively effective methods that do not overwhelm the user. (2) Design via constraints, where software reduces the complexity of creating visualizations that adapt to real-world environments. This software is aimed at visualization designers; it evaluates guidelines as constraints, then balances these to recommend appropriate data and designs for the current environment (see the sketch below). (3) Situated applications, where two wellness applications in healthcare and sports will be developed and evaluated in partnership with domain experts. Together, these domains cover a spectrum of different techniques, tasks, and users. These applications will help define an achievable research scope, drive it with motivated stakeholders, and demonstrate best practices via use cases.
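A minimal sketch of the "design via constraints" idea in (2): score candidate placements for an augmented-reality label against soft constraints and recommend the best one. The two constraints (occlusion of scene points, distance to the data anchor) and their weights are invented stand-ins for the guidelines the project would encode.

```python
# Hedged sketch: constraint-based recommendation of a label placement.
# Constraints, weights, and the 2D abstraction are illustrative assumptions.
import numpy as np

def recommend_placement(candidates, anchor, occluders, w_occ=1.0, w_dist=0.2):
    """candidates, anchor, occluders are (x, y) points; lower score is better."""
    def score(c):
        # Penalize candidates near occluded scene points, and far from anchor.
        occlusion = sum(np.hypot(c[0] - o[0], c[1] - o[1]) < 1.0
                        for o in occluders)
        distance = np.hypot(c[0] - anchor[0], c[1] - anchor[1])
        return w_occ * occlusion + w_dist * distance
    return min(candidates, key=score)

spots = [(0, 1), (2, 2), (-1, 0.5)]
print(recommend_placement(spots, anchor=(0, 0), occluders=[(0, 1.2), (2, 2.1)]))
```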
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
2021 — 2024
Pfister, Hanspeter; Lichtman, Jeff (co-PI)
NCS-FO: Empowering Data-Driven Hypothesis Generation for Scalable Connectomics Analysis
The field of connectomics aims to reconstruct the wiring diagram of neurons and synapses at nanometer resolution to enable new insights into the workings of the brain. Recent advances in image acquisition and machine learning methods have yielded complete reconstructions of the neural connectivity of large tissue samples. The investigators have one such dataset from human brain tissue, consisting of two petabytes of raw image data from electron microscopy. In collaboration with Google, they have spent the past two years reconstructing the complete 3D shape of about 50,000 cells, including 18,000 neurons, and identifying about 133 million synapses. These data will enable them to examine the prototypes of various neuron shapes, the correlations between these neuron types and their internal structures, and how they are connected to each other, in a dataset that is orders of magnitude larger than previous brain samples. These dense brain reconstruction results come with complex spatial and network structures, posing new challenges for scientists who wish to explore and analyze such data. The proposed program will develop a scalable visual analytics system that allows researchers to generate novel data-driven hypotheses from petabyte-scale connectomics data.
This three-year project aims to build novel visual analytics tools and efficient deep learning methods to advance the field of connectomics. Project deliverables will empower neuroscientists to analyze large brain networks in a one cubic millimeter volume containing tens of thousands of neurons and hundreds of millions of synaptic connections. The project aims to analyze the brain at both the neuron level and the network level. It will investigate scalable visual analytics methods for comparing morphological features and for analyzing the spatial distributions and proximity of cell organelles. Network-level analysis will be supported, from local synaptic network motifs to larger-scale connectivity patterns across cortical layers. A tightly integrated, targeted proofreading and analysis loop will be developed, using machine learning techniques for automatic error suggestion and guidance of the proofreading process to obtain high-quality data with minimal user interaction. To support intuitive hypothesis generation based on this data-driven visual analysis, a domain-specific query framework will be designed, and methods for automatic user guidance and hypothesis suggestion will be investigated. Ultimately, this project will provide data and analysis tools to develop new theories of how the brain works.
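In the spirit of the domain-specific query framework described above, here is a toy example of the kind of question a neuroscientist might pose over a connectome table. The table schema, column names, and query are assumptions invented for illustration; the project's actual query language is not specified in the abstract.

```python
# Hedged sketch: a domain-level query over a tiny, synthetic synapse table.
# Columns and the question are hypothetical stand-ins for the real framework.
import pandas as pd

synapses = pd.DataFrame({
    "pre_id":  [1, 1, 2, 3, 3, 3],
    "post_id": [2, 3, 3, 1, 2, 4],
    "type":    ["exc", "inh", "exc", "exc", "exc", "inh"],
})

# "Which neurons receive more than one excitatory input?"
exc_in = synapses[synapses.type == "exc"].groupby("post_id").size()
print(exc_in[exc_in > 1])
```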
This project is funded by Integrative Strategies for Understanding Neural and Cognitive Systems (NCS), a multidisciplinary program jointly supported by the Directorates for Biology (BIO), Computer and Information Science and Engineering (CISE), Education and Human Resources (EHR), Engineering (ENG), and Social, Behavioral, and Economic Sciences (SBE).
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.