Computational linguistics investigates human language through computational techniques. Our language processing capacity is at the core of human intelligence; language provides the predominant channel of inter-human communication; and digitally encoded language is universally recognized as the ‘fabric’ of the World-Wide Web. Thus, computational linguistics is of increasing societal relevance, culturally as well as economically.
The computational investigation of human language is an inherently inter-disciplinary field, with a direct bearing on both the humanities (notably linguistics and philosophy) and the sciences (mathematics and computing).
However, a growing number of successful applications of computational linguistics and a related increased interest in practical, engineering approaches have led to a partitioning of research in computational linguistics into either predominantly theoretical or primarily applied perspectives. Adverse effects of this dichotomy are clearly visible today: methodological fragmentation, on the one hand, and plateau effects in engineering progress, on the other.
Our multi-disciplinary research group will help reduce fragmentation, re-unite historically related sub-disciplines and advance them in tandem, and prepare the field (and participating researchers) for emerging and future challenges in the formal, computational, and applied analysis of human language. The group involves leading international experts in formal syntax and semantics, computational grammars, annotation of language data, and automated grammatical analysis. Participants come from, among others, the University of Colorado, the University of Edinburgh, the University of Groningen, the University of Konstanz, Oxford University, the Polish Academy of Sciences, Charles University in Prague, Stanford University, Uppsala University, and the University of Washington.
- Were your results as expected, or did your research take other directions than outlined in your project description?
Abstractly, our results very much correspond to expectations, even though not all of the scientific questions raised in our original proposal received full attention during our year at CAS (and several of our actual working group topics have transcended the original proposal).
This is owing in part to the roughly two years that passed between proposal submission and the start of our residency at CAS, but probably even more to our ‘democratic’ approach to shaping group activities while at CAS. The formation of working groups (see below) was in no small part a ‘bottom-up’ process: at two preparatory face-to-face meetings before the group took up residency at CAS, fellows gave presentations and made suggestions, which were then collectively ‘distilled’ into sufficiently general research questions and working group formations.
The ability to pursue such an exploratory approach in tandem with leading international scholars over an extended period of time is arguably the most rewarding and exhilarating component of our CAS experience.
The key results (so far) of our time at CAS include both ‘hard’ scientific outcomes and ‘softer’ socio-scientific developments. One common theme of group activities that already has visible international impact is the work on extending and refining the design of so-called enhanced Universal Dependencies (UD); UD is widely viewed as the current de facto standard for broad-coverage morpho-syntactic annotation and parsing.
Group members from both the HPSG and LFG traditions used the opportunity at CAS to collectively identify strong and weak aspects in earlier proposals for enhanced UD and arrived at a shared vision for more linguistically informed enhancements of the UD framework. This vision has in part been ‘implemented’ already through the conversion of pre-existing HPSG and LFG treebanks (for Dutch and Polish, respectively) to enhanced UD, as well as through an ongoing project on conversion from the Norwegian LFG treebank.
These new resources have been published through the public release of UD version 2.2 (in July 2018) and, thus, immediately contribute to broader visibility of linguistically richer, ‘enhanced’ dependency representations in the applied NLP community. Theoretical underpinnings of this development have been published at international conferences and workshops, notably as a position paper by Przepiórkowski and Patejuk (2018) at the International Conference on Computational Linguistics, which received a best paper award.
- How did your group work together during the year at CAS?
Much of the research activity took place in specialised working groups, typically involving 3–8 researchers working on topics of joint interest.
An initial set of six working groups was launched at the start of the academic year, in August 2017, as the result of a two-day kick-off and planning meeting. Additional working groups were initiated with the arrival of new fellows at the beginning of 2018 and, again, in early April 2018. While some smaller working groups reflected specialised shared interests from the beginning (often extending prior collaboration among group members), other working groups involved larger subsets of fellows and more diverse backgrounds and interests.
These working groups often took the form of critical reading groups (including publications by participants), jointly seeking to establish contentful relations across different perspectives.
- Why was a year at CAS important for this project?
This kind of cross-disciplinary and cross-framework collaboration would not have been possible without CAS. The ability to bring together leading international experts representing a broad variety of perspectives for an extended period of intense scientific exchange is a unique opportunity, greatly treasured by all fellows who participated in the group.
- Are there any plans for continuing the collaborations initiated during your stay at CAS?
A ‘softer’, indirect scientific result of our time at CAS is the formation of new collaborative relations. Stephan Oepen and Jan Hajič, for example, are participating in a new US-based consortium on Uniform Meaning Representation; Mary Dalrymple and Joakim Nivre have joined Dag Haug and Stephan Oepen in two basic research proposals to the Norwegian Research Council; and the group at large is working toward a European doctoral training network in Meaning Representation and Processing under the Marie Skłodowska-Curie funding programme.
Bender, Emily M.; Professor; University of Washington; 2017/2018
Bouma, Gosse; Associate Professor; University of Groningen; 2017/2018
Condoravdi, Cleo; Professor; Stanford University; 2017/2018
Dalrymple, Mary; Professor; Oxford University; 2017/2018
Dyvik, Helge Julius Jakhelln
Flickinger, Dan; Senior Researcher; Stanford University; 2017/2018
Gotham, Matthew; Postdoctoral Fellow; University of Oslo; 2017/2018
Hajič, Jan; Professor; Charles University; 2017/2018
Nivre, Joakim; Professor; Uppsala University; 2017/2018
Palmer, Martha; Professor; University of Colorado at Boulder; 2017/2018
Patejuk, Agnieszka; Dr.; Polish Academy of Sciences; 2017/2018
Przepiórkowski, Adam; Associate Professor; Polish Academy of Sciences / University of Warsaw; 2017/2018
Solberg, Per Erik; Research Fellow; University of Oslo (UiO); 2017/2018
Øvrelid, Lilja; Associate Professor; University of Oslo (UiO); 2017/2018
19 Mar - 21 Mar 2018; Hotel Gabelshus
28 Nov 2017; 11:45 - 12:30; Turret Room, CAS Oslo
22 Nov - 24 Nov 2017; - 12:30; Turret Room, CAS Oslo
26 Sep 2017; 10:00 - 16:00; House of Business, Henrik Ibsensgate 90
My Time at CAS: Dan Flickinger (25.06.2018)
Research in Review: 'SynSem: From Form to Meaning - Integrating Linguistics and Computing' (25.06.2018)
CAS fellows win 'best paper award' at computational linguistics conference (19.06.2018)
Meaningful Work: Advancing Computational Semantics (06.06.2018)
New blog explores Norwegian grammatical phenomena with data (03.04.2018)
Gaps and Errors: A Linguistic Struggle (06.03.2018)
Project Update: 'SynSem: From Form to Meaning - Integrating Linguistics and Computing' (30.01.2018)
Meet the Group Leaders: Dag Trygve Truslew Haug and Stephan Oepen (04.09.2017)
- Bouma, G. et al. 2018. “Expletives in Universal Dependency Treebanks.”
- Dalrymple, M., D. Haug, & J. Lowe. 2018. “Integrating LFG's binding theory with PCDRT.”
- Hajič, J. et al. 2018. “Synonymy in Bilingual Context: The CzEngClass Lexicon.”
- Haug, D. & M. Dalrymple. 2018. “Reciprocal scope revisited.”
- Oepen, S. et al. 2018. “The 2018 Shared Task on Extrinsic Parser Evaluation: On the Downstream Utility of English Universal Dependency Parsers.”
- Przepiórkowski, A. & A. Patejuk. 2018. “Arguments and adjuncts in Universal Dependencies.”
- Przepiórkowski, A. & A. Patejuk. In press. “From Lexical Functional Grammar to Enhanced Universal Dependencies: The UD-LFG Treebank of Polish.”