Grammar checking and proof reading
6-monthly report April 1999 to September 1999
This is the third 6-monthly report of the KTH part of the language
engineering project
Integrated language tools for writing and document handling.
The 6-monthly report of the Göteborg part of the project is available
in RTF and
Word format.
Participants at KTH during this period
URL: http://www.nada.kth.se/theory/projects/granska/index.html
Fulfilment of milestones
Before the project started we defined a set of
milestones and deliverables. Below we show
that almost all milestones have been met.
- Lexicon work
- Morphological rules for inflection of any word in the lexicon have been
constructed and optimized. We had hoped to be able to use SAOL 12 for this,
but after a year Svenska Akademien has still not decided
whether we may use it.
- Guessing word tags for unknown words
- The word tag guesser has been improved in several ways and now tags 91 %
of the words correctly, well above our goal of 85 %.
Our report describing the techniques used for tagging has recently
been published [Carlberger and Kann, 1999].
- Replacement proposals
- Algorithms for generation of replacement proposals have been implemented.
This includes generation of spelling error replacement proposals and
grammar error replacement proposals, but the proposals are not yet ranked
when several are given
for the same spelling or grammar error.
Replacement proposal generation rules have been written for
almost all grammar-checking rules.
- Grammar checking rules construction
- A specification of error types that we want Granska to detect
has been written [Domeij and Knutsson, 1999].
We have compared the list to
similar lists for commercial Swedish grammar checkers
and found that our list covers most errors and contains error types that are
not detected by other grammar checkers, for example
split compounds.
In particular we have studied, implemented and evaluated rules for
two error types: split compounds and incongruence in nominal phrases.
A report will be presented [Domeij, Knutsson, and Öhrman, 1999].
- User interface design and implementation
- Connecting the graphical user interface and the grammar error detection module
has been unexpectedly hard, but progress is being made.
Design of interface components for POS lexicon editing and replacement
suggestions will not be done in this project.
A text-based interface to Granska has been implemented and connected
to the grammar error detection module. A web interface is under construction,
see
http://www.nada.kth.se/theory/projects/granska/scrutinizer-web-demo.html
- Linguistic search and editing
- An empirical study of revision patterns has been performed and a report
has been written [Tyndall, 1999].
- Swedish language rules and help system
- A draft of the new version of Svenska skrivregler has been completed.
The design of the help system in HTML has been specified.
Conferences and presentations
- March 22
- The project organized
Temadag om datorstödd språkgranskning
at KTH. The Granska project was presented in talks by
Kerstin Severinson-Eklundh, Rickard Domeij, Viggo Kann, Ola Knutsson and
Johan Carlberger.
About 90 people attended.
- April 26
- Kerstin Severinson-Eklundh, Rickard Domeij, Viggo Kann and Ola Knutsson
presented the Granska project in Lund at the HSFR language technology program
meeting.
- September 6
- Johan Carlberger presented the Granska tagger and
grammar checking system at a seminar at Stockholm University.
- September 21
- Kerstin Severinson-Eklundh and Ola Knutsson
presented the project at the Department of Linguistics, University of
Göteborg.
- October 22-23
- Rickard Domeij, Ola Knutsson and Lena Öhrman will present a paper about error types at Svenskans beskrivning in Linköping.
- December 3-5
- The project, in cooperation with the Swedish Language Council,
will organize a conference at KTH.
New publications
- Implementing an efficient part-of-speech tagger
- J. Carlberger, V. Kann
- Software Practice and Experience, 29, 815-832, 1999.
- Specifikation av grammatiska feltyper i Granska
- R. Domeij, O. Knutsson
- Internal working paper, Nada, September 1999.
- HTML
- Inkongruens och felaktigt särskrivna sammansättningar - en beskrivning av två feltyper och möjligheten att detektera felen automatiskt
- R. Domeij, O. Knutsson, L. Öhrman
- Svenskans beskrivning, October 1999.
- HTML.
- Granska - ett effektivt hybridsystem för kontroll av svensk grammatik
- R. Domeij, O. Knutsson, J. Carlberger, V. Kann
- Submitted to NoDaLiDa, December 1999.
- HTML.
- Datorstöd för lingvistisk redigering - en förstudie
- A. Tyndall
- Master's thesis, Department of Linguistics, Stockholm University, June 1999.
- Postscript, PDF.
Responsible for this page: Viggo Kann <viggo@nada.kth.se>
Latest change September 30, 1999