EUROLAN 2011 Lecturers

  • António Branco, University of Lisbon, Portugal
      Language technology and business models
  •     Language technology is perceived as offering the potential to raise existing business models to new levels of value creation and to foster innovative business models. This session aims to help the group assess these assumptions from a better-informed position. An overview of business models will be followed by practical work devoted to analyzing existing companies whose core business has a strong connection to language technology and to brainstorming on possible innovative ways of applying and exploring language technology.
  • Davide Cali, ULA, Italy
      i-Tour: an intelligent multi-modal transportation service with NLP interface
  •     i-Tour is a research project funded by the EU Seventh Framework Programme (FP7) with the direct involvement of important industrial partners such as FIAT, TFL, THALES etc. The aim of the project, which is now halfway to delivery, is to build an open-source software platform that helps urban travellers choose the best route around large metropolitan areas and assists them during the trip with a real-time assistant. The i-Tour user interface will be very advanced, based on a multiplatform distribution system, with advanced use of 3D and video camera streaming with an augmented reality overlay. At EUROLAN, ULA will present its contribution to the i-Tour project, which is to integrate, through a multimodal approach, the graphical interface with a fully free natural language dialogue module. Thanks to NLP technology, the i-Tour user will be able to interact with the travel data in a new and totally free way.

  • Lorenzo Cassulo, Semantic Valley, Italy
      NLP technology in the Semantic Valley
  •     The Semantic Valley Consortium is composed of IT companies based in Trentino that work with semantic technologies. It was founded to link institutions, companies and research institutes in order to create a center for semantic technologies in Trentino. The intention is to promote Trentino as an ecosystem that generates innovation in society and in business. All of the Semantic Valley companies share the vision of semantic technologies as an opportunity to increase the quality of life. We will show what the consortium can do for small high-tech companies and how, using NLP technology, they can cooperate with bigger companies and reach new markets and bigger clients.

  • Alfio Massimiliano Gliozzo, IBM T.J. Watson Research, USA
      Jeopardy grand challenge, Watson and the Deep QA architecture
  •     Open-domain Question Answering (QA) is a long-standing research problem. Recently, IBM took on this challenge in the context of the Jeopardy! game. Jeopardy! is a well-known TV quiz show that has been airing on television in the United States for more than 25 years. It pits three human contestants against one another in a competition that requires answering rich natural language questions over a very broad domain of topics. The development of a system able to compete against grand champions in the Jeopardy! challenge led to the design of the DeepQA architecture and the implementation of Watson.
        The DeepQA project shapes a grand challenge in Computer Science that aims to illustrate how the wide and growing accessibility of natural language content and the integration and advancement of Natural Language Processing, Information Retrieval, Machine Learning, Knowledge Representation and Reasoning, and massively parallel computation can drive open-domain automatic Question Answering technology to a point where it clearly and consistently rivals the best human performance.
        Natural Language Processing (NLP) plays a crucial role in the overall DeepQA architecture. It makes it possible to “make sense” of both the question and the unstructured knowledge contained in the large corpora where most of the answers are located. That’s why we decided to focus this tutorial on the NLP technology adopted by Watson and on how it fits into the general DeepQA architecture. The course is structured in two classes.
  • Nancy Ide, Vassar College, USA
      Issues and Strategies for Interoperability among NLP Software Systems and Resources
  •     The tutorial will begin by defining interoperability, a term that is often used in broad and/or vague terms but rarely defined in precise and usable terms, and consider syntactic interoperability and semantic interoperability. Syntactic interoperability relies on specified data formats, communication protocols, and the like to ensure communication and data exchange, whereas semantic interoperability exists when two systems have the ability to automatically interpret exchanged information meaningfully and accurately in order to produce useful results via deference to a common information exchange reference model. For language resources, we can define syntactic interoperability as the ability of different systems to process (read) exchanged data either directly or via trivial conversion. Semantic interoperability for language resources can be defined as the ability of systems to interpret exchanged linguistic information in meaningful and consistent ways.

  • Marius Pasca, Google Inc., USA
      Web Search Queries as a Resource for Information Extraction
  •     Web search queries are cursory reflections of knowledge encoded deeply within unstructured and structured content available in documents on the Web and elsewhere. As such, they constitute an intriguing input data resource in open-domain information extraction, as an alternative to using human-generated resources or resources derived automatically from other types of textual data. Specifically, queries are useful in the acquisition of classes of instances (e.g., palo alto, santa barbara, twentynine palms), where the classes are unlabeled or labeled (e.g., california cities); as well as relations, including class attributes (e.g., population density, mayor).

  • Yannis Korkontzelos, University of Manchester, UK
      U-Compare: an integrated text mining/natural language processing system
  •     U-Compare is an integrated text mining/natural language processing system based on the UIMA Framework, which provides access to a large collection of ready-to-use, interoperable natural language processing components. U-Compare allows users to build complex NLP workflows from these components via an easy drag-and-drop interface, and makes visualization and comparison of the outputs of these workflows simple. As the name implies, comparison of components and workflows is a central feature of the system. U-Compare allows sets of components to be run in parallel on the same inputs and then automatically generates statistics for all possible combinations of these components. Once a workflow has been created in U-Compare, it can be exported and shared with other users or used with other UIMA-compatible tools, so in addition to comparison, U-Compare also functions as a general-purpose workflow creation tool. The U-Compare application is the result of a joint project between the University of Tokyo, the Center for Computational Pharmacology (CCP) at the University of Colorado Health Science Center, and the National Centre for Text Mining (NaCTeM) at the University of Manchester. In this tutorial, an introduction to U-Compare and its functionality will be accompanied by a practical demonstration of the system. In particular, the presentation will focus on simple and parallel workflows, different processing components, different viewers, results analysis and the U-Compare type system.
  • Radu Șoricuț, Language Weaver, California, USA
      Accuracy of Machine Translation – a key issue of engaging business
  •     We present the current state-of-the-art in Machine Translation (also known as Automatic Translation for natural language), in terms of paradigms, approaches, and standard algorithms. We also describe how translation performance is measured in the academic community and the business community, and use this performance measure to bridge the gap between research in the lab and business value in the marketplace. We identify the ability to predict the level of automated translation accuracy as a key problem for providing business value, and present current approaches for solving it.
  • Tamás Váradi, Hungarian Academy of Sciences, Hungary
      Practical settings for NOOJ
  •     This tutorial will give an overview of the finite-state linguistic analysis tool NooJ. The system provides a comprehensive linguistic development environment: it has integrated corpus handling facilities, coupled with a morphological lexicon and parsing through a series of cascaded local grammars. The finite-state engine has recently been much enhanced with advanced features that enable the system to handle context-free grammars. NooJ language modules (minimally, inflecting lexicons and some sample grammars) have been developed for a wide variety of languages.
    For further details, see:
    www.nooj4nlp.net

  • Andrejs Vasiljevs, Tilde, Riga, Latvia
      META-NET, related projects and industry-research collaboration
  •     META-NET is a European network of excellence dedicated to fostering language technology development and the creation of multilingual applications. High fragmentation and a lack of unified access to language resources are among the key factors that hinder Europe's innovation potential in language technology development and research. META-NET addresses this problem through a broad range of activities, from assessing the current situation for every official EU language to defining strategic goals, creating and populating a language resource sharing infrastructure, and bridging LT with neighboring research fields. META-NET activities are implemented through tight cooperation among several projects – T4ME, META-NORD, CESAR, METANET4U – involving major national players in language technology R&D. Using the ACCURAT, LetsMT! and TTC projects as examples, we will also show how other FP7 and ICT-PSP projects contribute to META-NET by providing new language resources and tools. The online language resource sharing repository META-SHARE will be demonstrated, and its metadata schema and its use in research and industry development will be explained.