Research in NLP increasingly requires sophisticated software architectures. Because there is no agreed "integrated" model of language processing, researchers often produce ad hoc, application-specific solutions; NLP platforms help by bringing components together and making them interoperable. Given the complex nature of NLP applications, and of language itself, a single process often has to combine different models, resources and algorithms, which raises significant problems of interoperability and data exchange. Moreover, the increasing complexity of linguistic models calls for sophisticated formalisation tools, and the growth of experimental studies on large corpora places strong constraints on software environments.
These various characteristics and requirements raise many questions which will be the focus of the present special issue on platforms for language processing.
Two interrelated and complementary aspects are relevant: an architectural and a methodological one.
1) Architectural issues
Bringing together different modules raises many architectural and technical questions, centred on interoperability and data exchange, such as:
interoperability between representation formats of corpora, linguistic resources, documents and annotations; possible standardisation;
technical compatibility of heterogeneous algorithms and data: portability, availability, maintainability, etc.;
graphical user interfaces helping computer scientists and/or linguists to assemble sets of "components" and to visualise and debug the results of applying them to corpora;
multimodality and multilingualism;
software execution models: pipeline, agent-based architectures, distributed web services, etc.; techniques for preventing errors from propagating into subsequent modules; efficiency comparisons of the various kinds of architectures; scalability (massive data, simultaneous on-line users...).
2) Methodological issues
The scientific approach which consists in projecting a given linguistic model, or a set of models, onto the same data or corpora also raises interesting questions:
formalisation: which formalisms are appropriate for the various levels of linguistic analysis? How can they be made interoperable? What should be considered first: the expressivity of a formalism or the complexity of the associated algorithms?
descriptive power: how can we ensure the declarative nature of NLP processes, from linguistic rules to the specification of process chaining? Is it possible to have a convergence of descriptive and prescriptive models?
repeatability: how can we ensure that an experiment based on complex algorithms can be reproduced? How can operational models and resources be shared and built upon?
modularity: how can a complex process be made independent of the choice of a particular component for a particular task? Reuse and adaptation of resources and components; support for multiple annotations;
evaluation of composite processes;
theoretical productivity: by bringing together different "local" models, can we study new linguistic phenomena at a higher level of complexity?
Presentation of concrete experiments embedding NLP platforms into application-oriented software systems (human-machine interfaces, information retrieval and extraction, terminology/ontology constitution, automated translation...) is warmly encouraged. Description of specific NLP platforms is also welcome, and authors should then clearly explain the underlying principles and hypotheses, in order to contribute to the general discussion.
TAL (Traitement Automatique des Langues / Natural Language Processing) is a forty-year-old international journal published by ATALA (French Association for Natural Language Processing) with the support of CNRS (National Centre for Scientific Research). It has moved to an electronic mode of publication, with printing on demand. This in no way affects its reviewing and selection process.
The first electronic issues are online on the journal Web site:
The tables of contents for the 2004-2006 issues can be consulted on this Web site.
The tables of contents for the 1991-2004 issues can be found on the former Web site of the journal.
Manuscripts may be submitted in English or French. French-speaking authors are requested to submit in French.
As soon as possible: send an email including the title, authors and a ten-line abstract to firstname.lastname@example.org (preferred but optional)
11/02/2008 Deadline for submission
11/04/2008 Notification to authors
18/05/2008 Deadline for submission of revised version
09/06/2008 Final decision
September 2008 Publication
Contributions (25 pages maximum, PDF format) must be sent by e-mail to the following address: email@example.com
Style sheets are available for download on the Web site of the journal.
Patrice ENJALBERT (University of Caen, France)
Benoît HABERT (ENS-LSH & ICAR, France)
Kalina BONTCHEVA (University of Sheffield, United Kingdom)
Jason BALDRIDGE (University of Texas at Austin, USA)
Frédérik BILHAUT (University of Caen, France)
Jean CARLETTA (University of Edinburgh, United Kingdom)
Farid CERBAH (Dassault Aviation, France)
Javier COUTO (INCO, Uruguay)
Robert DALE (Macquarie University, Australia)
François DAOUST (UQAM, Québec, Canada)
Thierry DECLERCK (DFKI, Germany)
Serge HEIDEN (ENS-LSH & ICAR, France)
Nancy IDE (Vassar College, New York, USA & LORIA/CNRS, France)
Michel JACOBSON (LACITO, France)
Diana MAYNARD (University of Sheffield, United Kingdom)
Jean-Luc MINEL (MoDyCo, CNRS, France)
Sylvaine NUGIER (EDF, France)
Sébastien PAUMIER (University of Marne-la-Vallée, France)
Etienne PETITJEAN (ATILF, France)
Thierry POIBEAU (LIPN-CNRS, France)
Laurent ROMARY (INRIA, France & MPG, Germany)
Vera Lucia STRUBE de LIMA (Pontifícia Universidade Católica do Rio Grande do Sul, Brazil)
Valentin TABLAN (University of Sheffield, United Kingdom)
John TAIT (IRF, Austria)
and the Editorial board of the journal.