Grammar checking and proofreading
6-monthly report September 1998 to March 1999
This is the second 6-monthly report of the KTH part of the language
engineering project
Integrated language tools for writing and document handling.
Participants at KTH
URL: http://www.nada.kth.se/theory/projects/granska/index.html
Fulfilment of milestones
Before starting the project we wrote down a set of
milestones and deliverables. The list below shows
that almost all of these milestones have been fulfilled.
- Tagging
- The tagger has been improved in several ways and now tags 97%
of the words correctly. A report describing the techniques used has
been written and accepted for publication [Carlberger and Kann, 1999].
We have also taken the initiative to
announce a tagger competition, to be held later this year.
- Implementation of rule language
- Most of the rule language is now implemented. It has also been
further extended with, for instance, a more advanced goto function and
Stava integration. The interpreter of the rule language is now
running under Unix, and will soon be incorporated into
the new Windows version of Granska.
- Grammar checking rules construction
- The existing rules for the old version of Granska have been
evaluated and rewritten in the syntax of the new rule language.
Rules for split compounds have been constructed and evaluated.
- Tokenization
- The problem with split compounds has been addressed in a report [Öhrman, 1998].
The rules described in the report have been implemented and improved.
- Guessing word tags for unknown words
- The tagger tags about 95% of unknown compound words correctly,
well above our goal of 90%.
- User interface design and implementation
- The new user interface to Granska has been implemented. The
user interface and the grammar error detection module will soon be connected.
The interface components for POS lexicon editing and replacement
suggestions have not yet been designed.
- Stava/Granska integration
- Stava is integrated into Granska as a module that can currently be used
only inside grammar checking rules.
- Linguistic search and editing
- An empirical study of revision patterns is currently being done.
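The accuracy figures quoted in the milestones above (97% of words tagged correctly, and 95% of unknown compound words against a goal of 90%) are proportions of correctly tagged words. As a minimal sketch of how such a figure can be computed, assuming a list of predicted tags and a list of reference tags of equal length: the function name, the tag labels and the toy data below are hypothetical illustrations, not the project's evaluation code or corpus.

```python
def tagging_accuracy(predicted, gold):
    """Return the fraction of words whose predicted tag equals the reference tag."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty sequence")
    correct = sum(1 for p, g in zip(predicted, gold) if p == g)
    return correct / len(gold)


# Hypothetical toy data (assumption): five words, one of them mistagged.
gold_tags = ["dt", "nn", "vb", "pp", "nn"]
predicted_tags = ["dt", "nn", "vb", "pp", "jj"]

accuracy = tagging_accuracy(predicted_tags, gold_tags)

# Compare against a target such as the 90% goal for unknown compound words.
meets_goal = accuracy >= 0.90
```

With this toy data, four of the five tags match, so the measured accuracy falls below the 90% target.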
New publications
- Implementing an efficient part-of-speech tagger
- J. Carlberger, V. Kann
- Software Practice and Experience, to appear, 1999.
- Formats: Postscript, PDF.
- Felaktigt särskrivna sammansättningar
- L. Öhrman
- C-level thesis in computational linguistics, Department of Linguistics,
Stockholm University, October 1998.
- Formats: Postscript, PDF.
Responsible for this page: Viggo Kann <viggo@nada.kth.se>
Latest change September 24, 1999