Summer School on Corpus Linguistics


Awareness Seminar on Language Technology

Tusnad - Romania

13-26 July 1997

1. Theme of the EUROLAN'97 Summer School

The theme of Eurolan'97 was "Corpus Linguistics" - the use of language corpora to develop tools and resources for automated language processing and to test linguistic theories. As well as lectures by distinguished international faculty members, there were workshops that built upon the lectures and aimed specifically at stimulating discussion and further collaboration. Lectures and workshops were organized in three thematic tracks:

Lexicon and Corpora - The aim of this track was to acquaint Summer School participants with the state of the art in using machine-readable dictionaries and other on-line linguistic resources in Natural Language research, and with research on extracting these resources directly from corpora.

The titles of the lectures were:

Nicoletta Calzolari - University of Pisa
- Corpus-based lexicography for Language Engineering
Jean-Pierre Chanod - Xerox Grenoble
- NLP finite-state technology: two-level morphology, tokenisation, Part-of-Speech tagging
- Incremental Finite-State Parsing
Tomaz Erjavec - Jozef Stefan Institute, Ljubljana
- An Introduction to SGML
Nancy Ide - Vassar College
- Encoding linguistic corpora
- Word sense disambiguation
Martin Rajman - Artificial Intelligence Laboratory, Swiss Federal Institute of Technology, Lausanne
- Hidden Markov Models for corpus-based NLP
Dan Tufis - Romanian Academy and ICI-Bucharest
- Corpus-based morpho-syntactic processing in a multilingual environment

Discourse Corpus Linguistics - This track presented a new trend in Natural Language Processing research that seeks to base progress in discourse theories on intensive corpus analysis. The associated workshop followed up on a workshop held at the University of Pennsylvania in March 1996 on discourse coding and corpus-based discourse analysis.

The titles of the lectures were:

Massimo Poesio - Centre for Cognitive Science - University of Edinburgh
- Evaluating the reliability of dialogue annotation
- Robust processing of definite descriptions
- Coreference and topic tracing on spoken corpora: the 'MapTask project'
Laurent Romary - CRIN-CNRS, Nancy
- Annotating coreference for the study of discourse phenomena
John Sinclair - University of Birmingham
- A Corpus-driven Theory of Meaning

Grammar Engineering and Grammar Learning - This track focused both on work to formalize the Romanian language using HPSG, in connection with research on formalizing other Romance languages, and on methods for inferring grammars from corpora. Researchers who announced or began their work on Romanian HPSG at the ACM Summer School in Belis-Fintinele, Romania, in July 1996, as well as other researchers, reported their progress within this framework.

The titles of the lectures were:

Liviu Ciortuz - University "Alexandru Ioan Cuza" of Iasi
- Concurrent parsing with HPSG meta-rules. Application to Romanian.
Aravind Joshi - University of Pennsylvania
- Various aspects of lexicalized grammars, partial parsing, supertagging and applications
Paola Monachesi - University of Tuebingen
- Phenomena in Italian HPSG: clitics, reflexives, clitic climbing, complex predicates. Relations to other Romance languages (French, Romanian)
Martin Rajman - Artificial Intelligence Laboratory, Swiss Federal Institute of Technology, Lausanne
- Probabilistic methods in corpus-based NLP
- Stochastic grammars and their extension to data-oriented parsing
Hans Uszkoreit - DFKI and University of Saarbruecken
- Linguistically interpreted data for language competence and performance
Michael Zock - LIMSI-Orsay, University of Paris-Sud
- Lexical choice and computation of syntactic structure (pattern matching) from statistics on corpora

2. The Awareness Seminar on Language Technology

The one-day Seminar sought to bring together opinion formers, decision makers, and academic, research and industrial groups. It highlighted the global strategic importance of Language Technology for the competitiveness of business, industry and administration, and the tremendous potential that Language Engineering may have for the information-based society. Exploiting Information and Language Technologies (spoken and written) enhances and eases human communication, and thereby supports socio-economic development while maintaining the diversity of languages and cultures in Europe. Key subjects included approaching these groups, especially opinion formers, and making them realize the benefits of undertaking a major initiative in the Language Engineering field and the risks of not doing so.

3. Venue

Tusnad is a holiday resort in the Carpathian mountains, about 200 km north of Bucharest and 67 km from Brasov. More information about Tusnad can be found here.

4. Participants

The Summer School was designed for graduate students and researchers in Natural Language Processing, including computer scientists and linguists interested in corpus research.

The Seminar was perceived as an important European event in the field. It was attended by more than 100 participants, including government and public decision makers and people from industry, research and development, and universities.

The list of participants and their addresses can be found here.

5. Background

The Eurolan series of Summer Schools was established in 1993 to stimulate young researchers to progress towards the highest levels of Natural Language Processing and Language Technology research in their own countries. The first Eurolan summer school was held in 1993 in Iasi (Romania), with the theme "Natural Language and Logic Programming". It was jointly supported by the French government and the University of Iasi. Seven invited faculty members gave lectures to 45 students from Russia, Moldavia, Romania, Bulgaria and Albania. The second Eurolan summer school was held in 1995, again in Iasi, with the theme "Language and Perception: Representations and Processes". This time there were eight faculty members giving lectures to 55 students from six different countries. The second summer school was jointly funded by the European Union, the Romanian Ministry for Research and Development, and the University of Iasi.

6. General information

All lectures were given to all participants. Each of the three workshops lasted for one day. Each workshop included a student session in the morning and a round table and/or a working session in the afternoon. 

Workshop on Grammar Engineering and Grammar Learning

Fees for Eurolan'97 were as follows:

Tuition: USD 135
Accommodation and meals: USD 325
Accompanying adult (accommodation and meals): USD 325
Accompanying child (accommodation and meals): USD 160
The fees also included transport (Bucharest to Baile Tusnad on 13 July and Baile Tusnad to Bucharest on 26 July).

A number of full and partial scholarships were available for people who had difficulty attending the event. Except for TELRI members, the scholarships did not cover travel expenses outside Romania. A full scholarship amounted to USD 460, while a partial scholarship amounted to USD 410.

The participants were housed in the hotels "Raza Soarelui" and "Olt" in two- and three-bed rooms. Meals were taken in the "Veverita" restaurant.

7. Organization

The Eurolan'97/Awareness Seminar was jointly organized and co-sponsored by:

The Organizing Committee of the joint event expresses warm thanks to all our sponsors.

The Program Committee of the joint event:

Liviu Ciortuz (University "Alexandru Ioan Cuza", Iasi)
Dan Cristea (University "Alexandru Ioan Cuza", Iasi)
Paola Monachesi (University of Tuebingen)
Massimo Poesio (University of Edinburgh)
Gheorghe Popa (University "Alexandru Ioan Cuza", Iasi) - Chair
Dan Tufis (Romanian Academy, Bucharest)
Hans Uszkoreit (DFKI, Saarbruecken)
Grazyna Wojcieszko (European Commission - Directorate-General XIII)

The Organizing Committee included:

Non-local members
Cristina Peti (POLITEHNICA University, Bucharest)
Dan Tufis (Romanian Academy, Bucharest)
Claus Unger (ACM, University of Hagen)
Hans Uszkoreit (DFKI, Saarbruecken)
Local members (from the University "Alexandru Ioan Cuza" Iasi)
Dan Cristea
Amalia Todirascu

The accommodation, meals, travel and entertainment were organized by the NET Agency - Iasi, Romania.

