Please note: This master’s thesis presentation will take place online.
Arun Cheriakara Joseph, Master’s candidate
David R. Cheriton School of Computer Science
Supervisor: Professor Stephen Watt
This thesis studies stroke grouping for online word-level handwriting recognition of Latin letters and digits using orthogonal polynomial representations of pen strokes. A word arrives as an ordered sequence of pen-down strokes, and the system has to decide which strokes belong to which character before it can decide what each character is. At the word level the problem is harder than for isolated characters because the two decisions are circular: the right grouping of strokes depends on what the characters turn out to be, and the right characters depend on how the strokes are grouped. Most existing systems commit to a single segmentation and classify whatever groups it produces, so an early segmentation error carries through to the recognition result. The difficulty is compounded by characters drawn with multiple strokes, by variation in stroke order between writers, and by several letter pairs and letter/digit pairs that share the same shape.
This thesis describes an online word-level recognition pipeline built on orthogonal polynomial representations of multistroke characters. Each pen stroke is re-parameterized by arc length, and its coordinate functions are projected onto a Legendre basis truncated at degree eleven, giving a fixed-length coefficient vector per stroke. For multistroke characters, the per-stroke vectors are concatenated into a single feature vector. Because all strokes in a character are normalized together against a shared bounding box, this block-concatenated representation captures the relative position and scale of the strokes within the character, but it does not directly encode every pairwise relationship between strokes. A probabilistic gap model generates up to six candidate groupings per word, and each candidate character group is normalized in this way before projection. The resulting vectors are matched against a reference database of 76,428 samples across 62 character labels, organized into 3,237 classes. Classification runs in two stages: a centroid-and-radius heuristic prunes the candidate pool to fifty classes, and a label-pooled K-nearest-neighbor stage then takes the seven closest samples for each label and ranks the labels by the distance from the query to the convex hull of those samples. The pipeline is evaluated on the UniPen word collection drawn from the 62-character Latin-plus-digits alphabet.
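The feature construction described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation. The function names, the resampling count of 64 points, the aspect-preserving scaling, and the use of a least-squares fit (NumPy's `legfit`) as a stand-in for the projection are all assumptions made for illustration; only the arc-length parameterization, the degree-eleven Legendre basis, the shared bounding box, and the concatenation of per-stroke vectors come from the abstract.

```python
import numpy as np
from numpy.polynomial import legendre


def resample_by_arc_length(points, n=64):
    """Resample a polyline at n positions equally spaced in arc length."""
    points = np.asarray(points, dtype=float)
    seg = np.linalg.norm(np.diff(points, axis=0), axis=1)
    s = np.concatenate([[0.0], np.cumsum(seg)])  # cumulative arc length
    if s[-1] == 0.0:
        # Degenerate stroke (a dot): repeat the single position.
        return np.repeat(points[:1], n, axis=0)
    t = np.linspace(0.0, s[-1], n)
    x = np.interp(t, s, points[:, 0])
    y = np.interp(t, s, points[:, 1])
    return np.column_stack([x, y])


def character_features(strokes, degree=11, n=64):
    """Concatenate per-stroke Legendre coefficients for one character.

    All strokes are normalized against one shared bounding box, so the
    coefficients keep the relative position and scale of the strokes.
    The scaling by the longer box side is an assumption of this sketch.
    """
    strokes = [np.asarray(s, dtype=float) for s in strokes]
    all_pts = np.vstack(strokes)
    lo = all_pts.min(axis=0)
    scale = max((all_pts.max(axis=0) - lo).max(), 1e-12)
    u = np.linspace(-1.0, 1.0, n)  # Legendre polynomials live on [-1, 1]
    blocks = []
    for s in strokes:
        pts = resample_by_arc_length((s - lo) / scale, n)
        # Least-squares fit of x(u) and y(u); result has shape (degree+1, 2).
        coeffs = legendre.legfit(u, pts, degree)
        blocks.append(coeffs.T.ravel())  # x coefficients, then y coefficients
    return np.concatenate(blocks)


# Hypothetical two-stroke character: two vertical bars.
features = character_features([[(0, 0), (0, 2)], [(1, 0), (1, 2)]])
print(features.shape)
```

With degree eleven, each stroke contributes 12 coefficients for x and 12 for y, so a two-stroke character yields a 48-dimensional vector in this sketch. Vectors of characters with different stroke counts have different lengths, which is one plausible reason for organizing the reference database into many more classes than labels.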
Attend this master’s thesis presentation virtually on MS Teams.