6-monthly report October 1999 to March 2000

This is the fourth and last 6-monthly report of the KTH part of the language engineering project Integrated language tools for writing and document handling.

Participants at KTH during this period


Fulfilment of milestones

Before the project started, we defined a set of milestones and deliverables. Almost all of the milestones have now been fulfilled, as described below.
Implementation of rule language
The complete rule language, with the exception of regular expression rules, has been implemented. An optimization scheme for rule matching has been created that automatically determines which part of a rule should trigger a matching attempt. This makes rule matching six times faster than before. Granska now processes 2800 words per second. The optimization is described in the report [Carlberger, Domeij, Kann, and Knutsson, 2000].
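The idea of letting one part of a rule trigger the matching can be illustrated with a minimal sketch. It does not reproduce Granska's rule language or the scheme in the cited report. It assumes that a rule is a fixed sequence of part-of-speech tags and that the rarest tag in the rule is chosen as the trigger. The tag frequencies, rule names, and function names are hypothetical and were invented for this illustration.

```python
from collections import defaultdict

# Hypothetical relative tag frequencies, invented for this illustration.
TAG_FREQ = {"dt": 0.08, "jj": 0.07, "nn": 0.25, "vb": 0.15, "ab": 0.02}

# Hypothetical rules: each rule is a fixed sequence of tags.
RULES = {
    "dt_jj_nn": ("dt", "jj", "nn"),
    "ab_vb": ("ab", "vb"),
}


def pick_trigger(pattern):
    """Return the index of the rarest tag in the pattern (assumed heuristic)."""
    return min(range(len(pattern)), key=lambda i: TAG_FREQ.get(pattern[i], 1.0))


def build_index(rules):
    """Map each trigger tag to the rules it triggers, with the trigger offset."""
    index = defaultdict(list)
    for name, pattern in rules.items():
        offset = pick_trigger(pattern)
        index[pattern[offset]].append((name, pattern, offset))
    return index


def match(tags, index):
    """Return (rule name, start position) for every full match in tags.

    A rule is only tried at positions where its trigger tag occurs,
    instead of trying every rule at every position.
    """
    hits = []
    for pos, tag in enumerate(tags):
        for name, pattern, offset in index.get(tag, ()):
            start = pos - offset
            if start >= 0 and tuple(tags[start:start + len(pattern)]) == pattern:
                hits.append((name, start))
    return hits


index = build_index(RULES)
hits = match(["dt", "jj", "nn", "vb", "ab", "vb"], index)
print(hits)
```

In this sketch the first rule is attempted only where the tag "jj" occurs and the second only where "ab" occurs, so most token positions are skipped without any rule being tested.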
Replacement proposals, user aspects
Some user aspects of Granska and its replacement proposals have been evaluated and are presented in [Hansén-Eriksson and Knutsson, 2000].
Grammar checking rules construction
A standard set of grammar checking rules has been created, covering the error types we want Granska to detect [Domeij and Knutsson, 1999]. In particular, we have studied, implemented and evaluated rules for split compounds and incongruence in nominal phrases; see [Domeij, Knutsson, and Öhrman, 1999] and [Domeij, Knutsson, Carlberger, and Kann, 1999]. The rules for NP detection have been evaluated in [Johansson, 2000].
The tokenization has been improved using the lexical analyzer Flex.
Guessing word tags for unknown words
The word tag guesser tags 93 % of the unknown words correctly. The tagger tags 97 % of all words correctly. A report describing the techniques used for tagging was published last year [Carlberger and Kann, 1999].
User interface implementation
There are now four user interfaces for Granska: a text-based interface, a web interface, a stand-alone grammar checking editor with a graphical user interface (Windows), and an add-in for Word (Windows).
Linguistic search and editing
A design specification for a linguistic editing function has not been written yet.

Conferences and presentations

October 22-23, 1999
Rickard Domeij, Ola Knutsson and Lena Öhrman presented a paper about error types at Svenskans beskrivning in Linköping.
December 3-5, 1999
The project, in cooperation with the Swedish Language Council, organized a conference at KTH.
December 9-10, 1999
Rickard Domeij and Ola Knutsson presented a paper about Granska at NoDaLiDa-99 in Trondheim.
August 24-26, 2000
Johan Carlberger and Viggo Kann will present a paper about applications of the tagger at Qualico-2000, the 4th Quantitative Linguistics Conference, in Prague.


A Swedish grammar checker
J. Carlberger, R. Domeij, V. Kann, O. Knutsson
submitted to Comp. Linguistics, April 2000.
Some applications of a statistical tagger for Swedish
J. Carlberger, V. Kann
Qualico-2000, to appear, August 2000.
Implementing an efficient part-of-speech tagger
J. Carlberger, V. Kann
Software Practice and Experience, 29, 815-832, 1999.
Granska - an efficient hybrid system for Swedish grammar checking
R. Domeij, O. Knutsson, J. Carlberger, V. Kann
NoDaLiDa, December 1999.
Inkongruens och felaktigt särskrivna sammansättningar - en beskrivning av två feltyper och möjligheten att detektera felen automatiskt
R. Domeij, O. Knutsson, L. Öhrman
Svenskans beskrivning, October 1999.
Explorativ studie av språkgranskningsverktyg
A. Hansén-Eriksson, O. Knutsson
March 2000.
NP-detektion - utvärdering och förslag till förbättringar av Granskas NP-regler
V. Johansson
C-level thesis in computational linguistics, Department of Linguistics, Stockholm University, February 2000.
Granskas regelspråk
O. Knutsson
Internal report, updated October 1999.

