1971 — 1973 |
Joshi, Aravind |
N/A |
Theory and Computation of Linguistic Transformations @ University of Pennsylvania |
1 |
1972 — 1977 |
Joshi, Aravind |
N/A |
Mathematical Investigation of Transformational Grammars @ University of Pennsylvania |
1 |
1975 — 1976 |
Joshi, Aravind |
N/A |
Acquisition of Computer Graphics Equipment @ University of Pennsylvania |
1 |
1976 — 1978 |
Joshi, Aravind Smoliar, Stephen |
N/A |
Specialized Research Equipment: Sound and Speech Synthesis Facility @ University of Pennsylvania |
1 |
1976 — 1979 |
Joshi, Aravind |
N/A |
Natural Language Processing and Mathematical Linguistics @ University of Pennsylvania |
1 |
1977 — 1978 |
Joshi, Aravind |
N/A |
Computer Science Research Equipment @ University of Pennsylvania |
1 |
1978 — 1982 |
Bajcsy, Ruzena [⬀] Joshi, Aravind Badler, Norman (co-PI) [⬀] |
N/A |
Scene Understanding @ University of Pennsylvania |
1 |
1978 — 1985 |
Joshi, Aravind |
N/A |
Research in Natural Language Processing @ University of Pennsylvania |
1 |
1979 — 1981 |
Joshi, Aravind |
N/A |
Acquisition of Computer Science and Computer Engineering Research Equipment @ University of Pennsylvania |
1 |
1982 — 1983 |
Joshi, Aravind |
N/A |
USA-France Joint Seminar on Comprehension of Natural Language: June 1982, Paris, France @ University of Pennsylvania |
1 |
1983 — 1989 |
Joshi, Aravind Bajcsy, Ruzena (co-PI) [⬀] Buneman, O. Peter |
N/A |
Modelling Interactive Processes: Flexible Communication With Knowledge Bases @ University of Pennsylvania |
1 |
1984 — 1988 |
Joshi, Aravind |
N/A |
Natural Language Processing (Computer Research) @ University of Pennsylvania |
1 |
1987 — 1991 |
Garito, Anthony (co-PI) [⬀] Joshi, Aravind Farhat, Nabil [⬀] Mueller, Paul (co-PI) [⬀] Palmer, Larry (co-PI) [⬀] |
N/A |
Neuromorphic Cognitive Systems @ University of Pennsylvania
Significant progress in computational neuroscience and neuroengineering requires a multifaceted interdisciplinary research program in several interrelated areas: Mathematical modeling and analysis of neural nets to better understand their collective behavior and capabilities for practical applications. Neurophysiological studies to better understand how retinal information is processed by neuronal assemblies in the striate and extrastriate cortex. Neural vision systems and their VLSI implementation for scene analysis and primitive extraction. Architectures and opto-electronic implementations of self-organizing neural nets partitioned into input/output and internal neurons for supervised and unsupervised learning with stochastic and deterministic state update rules. Higher order processing in interconnected neural net modules utilizing sequential and cyclic storage and recall, and generalization, for multisensory data fusion and knowledge aggregation. Smart sensing and recognition from sketchy information, with emphasis on object recognition, including study of object representations that produce distortion-invariant recognition. Highly structured associative memory and processing of spoken language. Study of optical materials and devices suitable for realizing artificial plasticity and learning, especially in nets with unipolar binary neurons and ternary synaptic weights that facilitate opto-electronic implementations. The present proposal deals with studies to be carried out by a group of faculty with extensive expertise in the above areas, from the schools of Engineering and Medicine. Results of this research are expected to contribute to the development of a new generation of neuromorphic cognitive systems and to outperform more conventional approaches to signal processing.
|
1 |
1989 — 1995 |
Joshi, Aravind Badler, Norman (co-PI) [⬀] Bajcsy, Ruzena [⬀] Farber, David (co-PI) [⬀] Buneman, O. Peter |
N/A |
Keeping Up With the 90's in Computer Science Equipment @ University of Pennsylvania
This award will provide infrastructure for research that is organized around five laboratories: 1. LINC - for research on artificial intelligence and natural language processing; 2. GRASP - for research on machine perception and robotics; 3. GRAPHICS - for research on graphic interfaces, movement description, and animation; 4. DSL - for research in computer architecture and computer communication; 5. LOGIC & COMPUTATION - for research in logic and computation, including theory of computation, database systems, and programming languages. Two new facets of the research, integration and upward scaling, require an enhanced experimental environment involving machines with massively parallel architectures. The award will help to develop this environment by providing funds for a SIMD machine for work in natural language processing, and active perception and real time manipulation; a MIMD machine for simulation and research involving extensive scientific calculations; as well as high speed workstations with rich environments for work in theoretical computer science.
|
1 |
1991 — 2002 |
Gleitman, Lila (co-PI) [⬀] Joshi, Aravind Liberman, Mark (co-PI) [⬀] |
N/A |
Center For Research in Cognitive Science @ University of Pennsylvania
This proposal from the University of Pennsylvania requests funds to establish a Science and Technology Center for Research in Cognitive Science. The Director of the Center will be Professor Aravind K. Joshi. The Center for Research in Cognitive Science unites a diverse and richly interconnected group from many traditional disciplines (computer science, linguistics, mathematics, philosophy, and psychology). The goal of the research is to understand the processes and mechanisms by which human beings acquire knowledge about their environment, store and retrieve that knowledge, communicate it to others, and apply it to carry out actions and manipulate their environment. The research is organized into three separate but highly interrelated themes: perception and action, language learning, and language processing. Research in the area of perception and action spans the processes involved in the first stages of visual and auditory representation of spatial and spectral information, to higher order representations of more complex attributes, to the storage and retrieval of such representations by the organism as they are used in goal-oriented actions. The study of language learning focuses on how children develop the abstract representations of language on the basis of their visual and auditory perceptions. The research in language processing combines investigation of formal systems with investigation of computational models, all in the context of empirical study of a wide range of natural languages. Significant features of the perception and action research are its increasing fidelity to actual neural computation and its sophisticated computational modeling and related potential for contributing to artificial intelligence technology.
The language learning research has significant potential for technological spin-off in machine learning and automatic acquisition of lexical and grammatical information for language systems, crucial to the development of grammars sufficient for the robust analysis of unconstrained text. The language processing research will have significant impact on the technological base for human-computer interaction, in particular the design of natural language interfaces for database and expert systems and knowledge-rich systems in general. This Center will stimulate enhanced activity in precollege education and in the development of human resources.
|
1 |
1991 — 1994 |
Joshi, Aravind |
N/A |
Research in Natural Language Processing: Mathematical and Computational Investigations in Constrained Grammatical Formalisms @ University of Pennsylvania
This is a joint award with Dr. Vijayshanker at the University of Delaware. This team has proposed several major research tasks in natural language processing, with special emphasis on several mathematical and computational aspects. The work clearly has a formal and mathematical character. The various computational structures and strategies the team has developed and will develop need to be investigated mathematically, because such investigations shed light on the descriptive and processing powers of these formalisms. These two aspects, i.e., the development of the structures and strategies and their mathematical investigation, are very much interrelated. It is believed that natural language processing backed up by a formal framework, and mathematical investigations grounded in empirical studies, are two very productive areas of research. The researchers are interested in mathematical investigations only to the extent to which the results have important implications for natural language processing.
|
1 |
1992 — 1997 |
Freyd, Pamela Joshi, Aravind Massey, Christine (co-PI) [⬀] |
N/A |
Constructing Science: Materials and Activities For Kindergarten and First-Grade @ University of Pennsylvania
Constructing Science: Materials and Activities for Kindergarten and First Grade will use a collaboration of classroom teachers, science educators, university psychology and education researchers, and university scientists to develop, implement, evaluate and disseminate on a national basis model sets of instructional methods and materials for the active engagement of kindergarten and first-grade children in science learning and exploration. This process for curriculum development has been successfully piloted through a colloquium series initiated by the Citizens' Committee for Public Education in Philadelphia and facilitated by directors of PENNlincs with funding from PATHS/PRISM, ARCO and the Institute for Research in Cognitive Science, U of P. The proposed program will improve the quality of science education and at the same time ensure that more time is spent on science education in kindergarten. This project will bring developmentally appropriate science content and science processes to primary grade children in such a way (a) that scientists, educators and parents will be assured that children are truly learning important science concepts and processes, (b) that the methods are applicable to the realities of classrooms, and (c) that children's initial school experiences in science are rich and that children gain positive attitudes, motivation and vocabulary that will serve as a strong foundation for more formal science study. The methods will enhance any science curriculum available or under development.
|
1 |
1999 — 2002 |
Palmer, Martha (co-PI) [⬀] Joshi, Aravind Badler, Norman [⬀] |
N/A |
The Actionary: a Dictionary That Portrays Natural Language Expressions as Context-Sensitive Simulations of Human Actions @ University of Pennsylvania
Abstract
IIS-9900297 Badler, Norman I.; Joshi, Aravind K.; & Palmer, Martha University of Pennsylvania $178,712 - 12 mos.
The Actionary: A Dictionary that Portrays Natural Language Expressions as Context-Sensitive Simulations of Human Actions
The "Actionary" is an action database that associates natural language expressions with context-sensitive graphical simulations acted out by "smart" virtual human agents. It rests on a foundation of Parameterized Action Representations (PARs) that explicitly link Lexicalized Tree-Adjoining Grammar structures to the Parallel Transition Networks that drive virtual human motion generators. PARs provide a conceptual representation of different types of actions: changes of state, changes of location (kinematic), and exertion of force (dynamic). To date, only change of state actions have been addressed by natural language processing systems. It is known that there are no linguistic distinctions between "running," "jogging," or "loping," and that these can only be distinguished by making reference to visual models and context. The Actionary approach defines actions involving continuous changes or process execution with reference to situated concrete models suitable for a visualized performance. The Actionary will facilitate: (1) Translation of human action instructions into different languages for sample action execution for education (e.g., foreign language learning) and training (e.g., machinery operation and repair). (2) Low bandwidth communication of multi-person activities: by transmitting compact textual instructions locally interpreted via the Actionary, smart agents can execute instructions for potential applications in remote skill training, virtual video-conferencing, and 3D virtual communities.
|
1 |
2000 — 2001 |
Joshi, Aravind Trueswell, John [⬀] |
N/A |
Approaches to Studying World-Situated Language Use: CUNY Conference on Human Sentence Processing, March 15-17, 2001, Philadelphia, PA @ University of Pennsylvania
This grant will fund a special conference session on 'world-situated language use in natural dialog', held in conjunction with the 14th Annual CUNY Conference on Human Sentence Processing. The conference, held March 15-17, 2001, at the University of Pennsylvania in Philadelphia, is the most prominent U.S. conference for the interdisciplinary study of human language understanding. On an annual basis, it brings together roughly 250 linguists, psycholinguists and computational linguists interested in detailed processing accounts of language comprehension and production.
The special session, entitled "Approaches to Studying World-Situated Language Use: Bridging the Language-as-Product and Language-as-Action Traditions," is designed as a step toward linking conversational/discourse research with the formal linguistic and mechanistic approaches typically found at the CUNY conference. Five prominent researchers working in this bridging area have been invited to give talks and participate in a panel discussion. In addition, peer-reviewed submitted talks and posters on this topic will be presented in accompanying sessions. It is hoped that holding this symposium at the CUNY Conference will prompt timely cross-disciplinary discussions that inspire a new generation of psycholinguistic and computational research on questions such as how natural utterances with disfluencies are processed, how information from context, gesture and linguistic input is combined in real-time processing, how interlocutors coordinate attention, and how these coordination processes impact real-time language processing commitments.
|
1 |
2002 — 2003 |
Joshi, Aravind Marcus, Mitchell [⬀] |
N/A |
Human Language Technology 2002: Special Focus On Language Modeling of Biological Data @ University of Pennsylvania
This award will support a special focus workshop at the Human Language Technology Conference in the area of Language Processing of Biological Data. The purpose of this special focus within the HLT 2002 context is to bring the research opportunities and recent research breakthroughs in this newly emerging area to the attention of a wide audience of researchers across all aspects of human language technology. This support is also intended to further promote cross-disciplinary approaches to the new field of bioinformatics.
|
1 |
2002 — 2008 |
Palmer, Martha (co-PI) [⬀] Liberman, Mark (co-PI) [⬀] Joshi, Aravind Davidson, Susan (co-PI) [⬀] Pereira, Fernando |
N/A |
ITR: Mining the Bibliome -- Information Extraction From the Biomedical Literature @ University of Pennsylvania
EIA-0205448 Joshi, Aravind University of Pennsylvania
ITR: Mining the Bibliome -- Information Extraction from the Biomedical Literature
The major goal is the development of qualitatively better methods for automatically extracting information from the biomedical literature, relying on recent research in high-accuracy parsing and shallow semantic analysis. The special focus will be on information relevant to drug development, in collaboration with researchers in the Knowledge Integration and Discovery Systems group at GlaxoSmithKline.
This project will also address several database research problems, including methods for modeling complex, incomplete and changing information using semistructured data, and also ways to connect the text analysis process to an information integration environment that can deal with the wide variety of extant bioinformatic data models, formats, languages and interfaces.
The engine of recent progress in language processing research has been linguistic data: text corpora, treebanks, lexicons, test corpora for information retrieval and information extraction, and so on. Much of this data has been created by Penn researchers and published by Penn's Linguistic Data Consortium. Hence, one of our major goals is to develop and publish new linguistic resources in three categories: a large corpus of biomedical text annotated with syntactic structures ("Treebank") and shallow semantic structures (proposition bank or "Propbank"); several large sets of biomedical abstracts and full-text articles annotated with entities and relations of interest to drug developers, such as enzyme inhibition by various compounds or genotype/phenotype connections ("Factbanks"); and broad-coverage lexicons and tools for the analysis of biomedical texts.
|
1 |
2002 — 2006 |
Liberman, Mark (co-PI) [⬀] Joshi, Aravind |
N/A |
CISE Research Resources: Discourse Penn Treebank and Multimodal Form: Development of Two Richly Annotated Corpora @ University of Pennsylvania
EIA-0224417 Aravind K. Joshi Mark Liberman University of Pennsylvania
CISE RR: Discourse Penn Treebank and Multimodal FORM: Development of Two Richly Annotated Corpora
This project, providing critical resources for research on discourse modeling and conversational interaction, aims at developing new technologies and systems for information retrieval and human-computer interaction. Centering on the construction of annotated corpora, two large-scale resources, one in the discourse domain and one in the dialog domain, will be built:
1. Discourse Penn Treebank (DPTB) and 2. MultiFORM: Augmenting the FORM corpus with body movements, speech, and intonation.
The former project develops a large-scale, reliably annotated corpus that will encode coherence relations associated with discourse connectives, including their argument structure and anaphoric links, thus exposing a clearly defined level of discourse structure and supporting the extraction of a range of inferences associated with discourse connectives. This annotation will be "on top of" the Penn Treebank (PTB) annotations as well as the predicate-argument annotations of PTB (called the Proposition Bank or Prop Bank). The latter involves a corpus of gesture-annotated videos, FORM, that was designed to be extensible in order to eventually represent the entire multimodal experience of conversational interaction. This multimodal FORM, MultiFORM, will be created by adding body movement, speech and syntactic structure, and intonation. Large-scale annotated corpora have played a critical role in speech and natural language research by enabling large-scale integration of statistical knowledge (derived from the corpora) with linguistic knowledge (as represented in annotations), leading to scientific and technological advances. Representative examples include robust parsing and automatic extraction of relations and coreferences and their applications to information extraction, question answering, summarization, and machine translation. PTB, a resource developed a decade ago, is an example of such a resource that impacts natural language processing worldwide. PTB deals with corpora at the sentence level, which warrants new large-scale, reliable corpora annotated with discourse and dialog structure. Although intellectual and practical connections exist between studies of the structures of discourse and dialog, the initial requirements for resources to study these areas diverge while overlapping in conception. On the discourse side, we need corpora that deal with the kinds of structures found in composed text such as journalistic articles.
The dialog side needs to focus on interactions among people and on extemporized rather than pre-composed material.
|
1 |
2002 — 2008 |
Dill, Ken Lafferty, John (co-PI) [⬀] Liberman, Mark (co-PI) [⬀] Joshi, Aravind Pereira, Fernando |
N/A |
ITR: Language, Learning, and Modeling Biological Sequences @ University of Pennsylvania
EIA-0205456 Joshi, Aravind K. University of Pennsylvania
ITR: Language, Learning, and Modeling Biological Sequences
Recent significant advances in natural language processing such as the integration of grammatical and probabilistic machine-learning techniques have not been exploited for modeling biological sequences. These new techniques are highly relevant to the biological domain because they support the integration of sequence features at several scales, from dependencies between successive items through dependencies involving complex structures to overall sequence statistics. Hence, the major goals to be pursued are: (1) Development of new techniques for integrating grammatical and probabilistic information, in particular, integration and evaluation of grammatical, probabilistic, and approximate counting methods for fold prediction in secondary and tertiary structures of biomolecules. (2) Development and evaluation of probabilistic exponential models for gene finding, in particular genes for apicoplast-targeted proteins in eukaryotic human pathogens of the phylum `Apicomplexa'.
This research is highly interdisciplinary, involving the disciplines of computer science, biology and linguistics. It will have a significant impact on the modeling of biological sequences. It will also provide a wonderful opportunity to train new researchers to carry out this interdisciplinary research, thus contributing to science and mathematical education and human resource development.
The proposed research arose out of many discussions that took place at a landmark workshop on `Language Modeling of Biological Data' held at the University of Pennsylvania in February 2001.
|
1 |
2004 — 2008 |
Joshi, Aravind Rambow, Owen |
N/A |
Metagrammatical Knowledge For Grammars and Corpora @ University of Pennsylvania
There is today a broad consensus among theoretical linguists (of all frameworks) and researchers in Natural Language Processing (NLP) about what the syntactic phenomena are that we encounter in natural languages. However, there are many different frameworks in which analyses of these phenomena have been implemented, and there is even disagreement about specific analyses within one single framework. As a result, linguistic resources such as annotated corpora or grammars cannot be easily reused across frameworks. This project will investigate the common categorization of syntax that underlies work in linguistics and NLP. This underlying categorization is called a "metagrammar". Given a metagrammar, a tool can be produced to automatically generate grammars in different frameworks.
This research contains three main activities. The first involves comparative work in several languages (including English) that will lead to coordinated metagrammars for these languages. These framework-independent specifications will catalog syntactic properties and detail their possible interaction; categories shared between languages will lead to shared portions of the metagrammar. The second concerns the development of specific grammar statements that relate metagrammatical categories to constructs in particular frameworks and for particular languages. It is these statements that, in their interaction, determine word order. The third involves annotating the Penn Treebank (PTB) corpus with the syntactic properties from the metagrammar, thus making the information implicitly encoded in the phrase structure of the PTB explicit and usable by other frameworks.
This project will enable the NLP and linguistics communities to better share insights on syntactic phenomena. Additionally, the work will enable the development of new NLP tools that are less dependent on a particular representation. It will enable linguists to rapidly develop grammars and test-suites for different frameworks and languages, thus allowing for both cross- and inter-framework evaluation of linguistic grammars. Upon completion of the project, the PTB re-annotated with the high-level categories of the metagrammar will be made available to the research community.
|
1 |
2007 — 2012 |
Joshi, Aravind Prasad, Rashmi (co-PI) [⬀] |
N/A |
RI: Exploiting and Exploring Discourse Connectivity: Deriving New Technology and Knowledge From the Penn Discourse Treebank @ University of Pennsylvania
Large-scale corpora annotated at the sentence level have played a critical role in natural language research. They have enabled large-scale integration of statistical knowledge (derived from the corpora) with linguistic knowledge, leading to both technological and scientific applications, such as information extraction, question answering, summarization, and machine translation, among others. This approach is now being extended beyond the sentence level to the discourse level. Using a resource called the Penn Discourse Treebank (PDTB), a large-scale corpus annotated with discourse structure along with the associated semantics, major new experimental work on discourse processing is being carried out, leading to the generation of more coherent summaries and texts and the extraction of complex relations in texts, among others, as well as foundational research relevant to language technology. This work is also providing a deeper understanding of the relationship between sentence-level and discourse-level structures. While pursuing these goals, a variety of tools for making productive use of the PDTB resource are also being developed. This research program is also coupled with a strong educational program involving training researchers in the PDTB methodology so that similar resources can be developed in other languages substantially divergent from English. This part of the research program has international components, including collaboration with research groups in the Czech Republic, India, and Finland. The international collaboration is funded by the NSF Office of International Science and Engineering.
|
1 |
2011 — 2013 |
Joshi, Aravind |
N/A |
CI: ADDO-EN: Significant Enhancement of the Existing Penn Discourse Treebank @ University of Pennsylvania
Building large-scale annotated resources is crucial for basic and applied research in Natural Language Processing (NLP). The major long-term goal of this project is to make very substantial extensions to an existing unique resource, the Penn Discourse Treebank (PDTB), developed under prior NSF support, augmenting it with a variety of new annotations as well as refining earlier annotations. Our proposed work involves conducting some new annotations and some pilot experiments to confirm the strategies for augmentation. A further goal is to bring together a cross section of potential users of this resource, first to acquaint them with its potential and then to get their feedback for guiding further augmentations. Applications of PDTB for the task of summarization have already been made. Future applications are in the areas of information extraction, question-generation, and machine translation, among others. On the theoretical side, our resource will prove useful for increasing theoretical understanding of the discourse structure of language.
|
1 |
2014 — 2017 |
Joshi, Aravind |
N/A |
RI: Small: Collaborative Research: Research Leading to Comprehensive Guidelines For Discourse Relation Annotation @ University of Pennsylvania
Machine Translation, automated question answering, dialogue systems -- the many useful, emerging language technologies -- depend on recognizing patterns in text. Right now, the only patterns that can dependably be recognized are very local, no bigger than a sentence clause. Enabling patterns to be recognized across clauses in a text by identifying what links them and what the link conveys was the goal of the NSF-supported Penn Discourse TreeBank (PDTB), a nearly 1-million word text resource labelled with text-linking devices ("discourse connectives" and adjacency), the spans of text they link, and what the link conveys. In the five years since the release of the PDTB, computational linguistics researchers from around the world have used the format it pioneered to develop similar resources for other languages and to use these resources for recognizing larger patterns in text. The current PDTB, however, lacks the full range of explicit and implicit text-linking devices in English and what they convey, information that is badly needed by many forward-looking language technology applications. The goal of this project is to conduct research aimed at enriching the PDTB with these additional devices and at developing ways of authoritatively annotating other texts with similar information, but with less manual effort, as a basis for extending the range of texts whose larger, cross-clausal patterns can be recognized automatically.
This project is a response to calls (from both the language technology and computational psycholinguistics communities) for increased coverage and continuity of discourse relation annotation, both across and within the sentences of a text. To ensure a systematic annotation scheme grounded in evidence, the project starts by addressing some foundational questions about the properties of additional linguistic signals of discourse relations and how to capture these properties consistently and completely through manual annotation. From this follows systematic, evidence-grounded annotation of Entity Relations; constructions (other than discourse connectives) that reliably signal discourse relations; implicit intra-sentential discourse relations (building on PropBank annotation of the Penn TreeBank); and concurrent discourse relations (where implicit relations hold in addition to ones signaled explicitly). The project also explores the use of crowd-sourcing to support sub-tasks in discourse relation annotation that would lead to a reduction in the manual effort needed for expert annotation of other corpora, or enable large-scale experiments on aspects of human understanding of discourse relations. As with the Penn Discourse TreeBank 2.0, the enhanced corpus resulting from the project will be disseminated by the Linguistic Data Consortium (LDC), a well-established institution for world-wide distribution of language resources.
|
1 |