Can computers be taught to read?

Abstract

An evaluation of different kinds of cluster analysis has been performed. The evaluation uses an interesting problem set: character recognition. It is interesting because cluster analysis is a method built on unsupervised learning. What distinguishes unsupervised learning from other methods is that it has no knowledge of which class an object belongs to when evaluating it. This is an important difference from other character recognition techniques, which often learn by looking at the features a specific character has.

The cluster algorithms evaluated are K-means, X-means and hierarchical clustering, and for these methods the following distance measures have been used: Euclidean distance, cosine similarity and Manhattan distance.
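
As an illustration only, the three measures and a bare-bones K-means loop can be sketched in Python as follows. This is not the implementation used in the evaluation: the function names, the deterministic choice of the first k points as initial centroids, and the fixed iteration count are all assumptions made for this sketch. Using the mean as the update step together with a non-Euclidean distance is also a simplification.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a, b):
    """Sum of the absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def kmeans(points, k, distance=euclidean_distance, iterations=10):
    """Hypothetical minimal K-means; class labels are never consulted."""
    # Assumption: the first k points serve as initial centroids so that
    # the sketch is deterministic. Real implementations usually randomize.
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            assignments[i] = min(
                range(k), key=lambda c: distance(p, centroids[c])
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids


# Two well-separated groups of toy 2-D "feature vectors".
points = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
assignments, centroids = kmeans(points, k=2)
```

Any known character labels would only be compared with `assignments` afterwards, to judge how well the clusters match the real classes; the clustering itself never sees them.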

The objective of this evaluation is to find out whether it is possible to use cluster analysis for character recognition, and to evaluate the conditions needed to do this successfully.

The main conclusion is that it is indeed possible to use cluster analysis to perform character recognition, provided that you can influence the parameters used. In this evaluation, parameters that map more directly to the picture representation of the characters performed better. Determining more exactly what makes a good parameter requires a more extensive analysis.

Author: Niklas Lundborg