Michael Cormier
PhD student at the University of Waterloo
Research Interests
Broadly speaking, my research is in the area of computer vision,
usually including a machine learning component. I particularly enjoy
working with "unusual" image classes, outside of the traditional
"conventional camera viewing a scene composed of discrete, opaque
objects" scenario.
My current research focus is on the use of computer vision to
analyse the structure of web pages, with the aim of supporting assistive
technology solutions such as screen reader programs. The ultimate goal
of this research is to create a system capable of parsing the structure
of the content of a page at a high level using the same visual
information available to users (i.e., using an image of the
rendered page rather than the page source code). I take this approach
rather than the more typical analysis of the page source code for
several reasons:
- Web pages often include dynamic content, images, Flash objects,
and other objects that are difficult to analyse at a source code level
- The visual layout of a web page is designed to clearly
convey its semantic structure to the users, while the source code is
simply designed to cause the browser's rendering engine to produce a
page with the visual layout specified by the designer
- While web frameworks and associated technologies will
inevitably shift over time, it is reasonable to assume that visual
layout cues will remain broadly consistent to avoid confusing users
Additionally, web pages are an interesting class of image to analyse
from a computer vision perspective. A rendered web page is a designed
image, but one designed to convey information to human users rather than
to a computer vision system. Thus, rendered web pages form a restricted but
nontrivial domain, intermediate between toy problems and natural scenes,
in which to study computer vision methods.
I have also worked with images formed by projection (in the sense
of integration of density along a line of sight), such as telescopic
imagery of galaxies or X-ray images, and the reconstruction of density
functions from small numbers of projections under assumptions about
density function structure.
Reviewed Publications
- Cormier, M., Mann, R., Moffatt, K., and Cohen, R. (2017). "Towards an Improved Vision-based Web Page Segmentation Algorithm". Computer and Robot Vision (CRV) 2017. (conference; refereed; paper with poster presentation)
- Cormier, M., Mann, R., Cohen, R., and Moffatt, K. (2016). "Classification via Hidden Markov Trees for a Vision-Based Approach to Conveying Webpages to Users with Assistive Needs". 2016 IEEE/WIC/ACM International Conference on Web Intelligence. (conference; refereed; paper with oral presentation)
- Cormier, M. (2016). "Computer Vision-based Analysis of Web Page Structure for Assistive Interfaces". Proceedings of the 13th Web for All Conference, 24:1-24:2. (conference; selected by organizers as one of only six students for the Doctoral Consortium, and funded by Google; paper with oral presentation)
- Cormier, M., Moffatt, K., Cohen, R., and Mann, R. (2016). "Purely vision-based segmentation of web pages for assistive technology". Computer Vision and Image Understanding (Special Issue on Assistive Computer Vision and Robotics), 148, 46-66. (journal; refereed)
- Cormier, M., Lizotte, D. J., and Mann, R. (2015). "Reconstruction
of 3-D Density Functions from Few Projections: Structural Assumptions
for Graceful Degradation". In CRV 2015. (conference; refereed; oral
presentation)
- Cormier, M., Cohen, R., Mann, R., Rahim, K., and Wang, D.
(2014). "A Robust Vision-Based Framework for Screen Readers". Second
Workshop on Assistive Computer Vision and Robotics (ACVR 2014) at ECCV
2014. (Workshop; refereed; poster presentation)
- Cohen, R., Lam, D. Y., Agarwal, N., Cormier, M., Jagdev, J.,
Jin, T., Kukreti, M., Liu, J., Rahim, K., Rawat, R., Sun, W., Wang, D.,
and Wexler, M. (2014). "Using computer technology to address the
problem of cyberbullying". ACM Special Interest Group on Computers and
Society (SIGCAS) Newsletter, pp. 52-61. (newsletter; reviewed by
editors)
- Gondra, I., Xu, T., Chiu, D. K. Y., and Cormier, M. (2014).
"Object Segmentation Through Multiple Instance Learning". International
Conference on Image and Signal Processing (ICISP 2014). (Conference;
refereed)
- Cormier, M. and Gondra, I. (2011). "Supervised Object
Segmentation using Visual and Spatial Features". Proceedings of the
2011 International Conference on Image Processing, Computer Vision,
& Pattern Recognition. Vol. 2, pp. 557-563 (Conference; refereed;
oral presentation)
- Hatchard, T. D., George, A. E., Farrell, S. P., Steinitz, M.
O., Adams, C. P., Cormier, M., & Dunlap, R. A. (2010). "Production
and characterization of <100> textured magnetostrictive Fe-Ga
rods". Journal of Alloys and Compounds. 494(1-2) 420-425.
Other Publications
- Cormier, M., Lizotte, D. J., and Mann, R. (2014). "3-D
Reconstruction from Few Projections: Structural Assumptions for Graceful
Degradation". University of Waterloo Cheriton School of Computer
Science Technical Report CS-2014-07. (technical report; not refereed;
expanded and published in CRV 2015 as "Reconstruction of 3-D Density
Functions from Few Projections: Structural Assumptions for Graceful
Degradation")
- Cormier, M. and Gondra, I. (2010). "Strong image
segmentation using learned regions and spatial relationships". In APICS
Annual Computer Science Conference, St. Mary's University, Halifax, Nova
Scotia, Canada. (Student conference; non-refereed; oral presentation)
- Cormier, M. and Gondra, I. (2009). "Strong image
segmentation through non-homogeneous region merging". In APICS Annual
Computer Science Conference, Dalhousie University, Halifax, Nova Scotia,
Canada. (Student conference; non-refereed; oral presentation)
Education
In Progress: PhD (University of Waterloo)
Title: Visual Document Understanding for Assistive Technology (tentative title)
Supervisors: Prof. Robin Cohen and Prof. Richard Mann
Description:
I am presently studying the application of computer vision techniques
to the problem of understanding the structure of documents. This research
has led to a paper describing the use of computer vision to interpret
the high-level semantic structure of web pages; the primary objective is
to improve the screen reader programs that visually impaired users need
to use the Internet. Future possibilities for research include
empirical studies of our framework for understanding web page
organization, generalization to other types of document, and methods of
presenting complex structures to visually impaired users.
2013: MMath (University of Waterloo)
Title: 3-D Reconstruction from Single Projections, with Applications to Astronomical Images
Supervisors: Prof. Daniel J. Lizotte and Prof. Richard Mann
Description:
In my MMath thesis, I developed a framework for the reconstruction of
three-dimensional data from images formed by projection (i.e., by
integration along a line of sight of some value at each point in a
volume). This framework was designed for the reconstruction of the
distribution of light-emitting matter (stars, for the most part) in a
galaxy from a single projected image formed by integration of luminosity
along the line of sight from each pixel.
For simplicity, assume that the image is square, and the volume of
the galaxy is divided into voxels (“volume pixels”) in a cube with a
side length equal to the side length of the image. Since each pixel
provides a linear constraint, the voxel values are underdetermined. By
making certain physically reasonable assumptions about structural
properties such as symmetry, however, additional constraints can be
found that allow the system to be solved, thus reconstructing the
original distribution insofar as it is consistent with the structural
assumptions made. Furthermore, the projected image of the reconstructed
distribution shows which aspects of the original image can be explained
by the structure assumed. This allows the isolation of structures which
are not consistent with the assumptions made.
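The counting argument above can be illustrated with a minimal sketch in a lower-dimensional analogue: a 2-D density grid projected onto a 1-D image. This is not the thesis implementation. The circular symmetry, the zero density outside radius n/2, and all function names are assumptions chosen for illustration. With n = 6 there are 36 unknown cells but only 6 pixel constraints; under the symmetry assumption only 3 shell values remain unknown, and the system becomes triangular and solvable by back substitution.

```python
import math

def centres(n):
    """Cell-centre coordinates for n cells centred on the origin (n even)."""
    return [i - (n - 1) / 2 for i in range(n)]

def shell_index(x, y):
    """Radial shell of a cell centre: integer part of its distance from the origin."""
    return int(math.hypot(x, y))

def grid_from_shells(shells, n):
    """Expand per-shell values into an n x n grid (zero outside the last shell)."""
    grid = []
    for y in centres(n):
        row = []
        for x in centres(n):
            k = shell_index(x, y)
            row.append(shells[k] if k < len(shells) else 0.0)
        grid.append(row)
    return grid

def project(density):
    """Integrate along the line of sight (down each column): one value per pixel."""
    n = len(density)
    return [sum(density[row][col] for row in range(n)) for col in range(n)]

def reconstruct_shells(image):
    """Recover n/2 shell values from an n-pixel projection by back substitution."""
    n = len(image)
    half = n // 2
    shells = [0.0] * half
    # Work inwards: the outermost column only crosses the outermost shell.
    for j in reversed(range(half)):
        x = j + 0.5  # positive-side column; the mirrored column is redundant
        counts = [0] * half  # how many cells of each shell this column crosses
        for y in centres(n):
            k = shell_index(x, y)
            if k < half:
                counts[k] += 1
        # Subtract the contribution of shells already solved, then divide.
        known = sum(counts[k] * shells[k] for k in range(j + 1, half))
        shells[j] = (image[half + j] - known) / counts[j]
    return shells

# Usage: project a symmetric density, then recover it from the projection alone.
true_shells = [5.0, 3.0, 1.0]
image = project(grid_from_shells(true_shells, 6))
recovered = reconstruct_shells(image)
print(image)
print(recovered)
```

Projecting the recovered shells again and subtracting the result from an observed image would leave whatever the symmetry assumption cannot explain, which corresponds to the isolation of inconsistent structures described above.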
A wide range of structural assumptions can be expressed easily
in this framework using a combination of reparametrization of the
reconstruction problem and the addition of regularization terms. Since
the framework uses three-dimensional reconstruction, the structural
constraints are also three-dimensional. Three-dimensional constraints
better reflect physical reality than constraints on the two-dimensional
image, and are often both simpler and more flexible.
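One way to write down the combination of reparametrization and regularization is sketched below. The notation is an assumption made for illustration and is not taken from the thesis: $x$ is the vector of voxel values, $A$ the projection operator that sums voxels along each line of sight, $b$ the observed image, $B$ a matrix expressing the voxels in terms of a smaller parameter vector $\theta$ (for example, one value per symmetric shell), $R$ a penalty encoding a further structural assumption, and $\lambda \ge 0$ its weight.

```latex
\begin{align}
  A x &= b
    && \text{(one linear constraint per pixel; underdetermined)} \\
  x &= B \theta
    && \text{(reparametrization: structure built into the unknowns)} \\
  \hat{\theta} &= \arg\min_{\theta}\;
      \lVert A B \theta - b \rVert_2^2 + \lambda\, R(B \theta)
    && \text{(regularized reconstruction)} \\
  r &= b - A B \hat{\theta}
    && \text{(residual: image content the assumptions do not explain)}
\end{align}
```

In this form both $B$ and $R$ act on the three-dimensional volume rather than on the two-dimensional image, which matches the point made above about three-dimensional constraints.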
2011: BSc with First-Class Honours (St. Francis Xavier University)
Title: Strong Image Segmentation using Learned Regions and Spatial Relationships
Supervisor: Prof. Iker Gondra
Description:
In my B.Sc. Honours thesis, I developed an algorithm to segment an
image in such a way as to isolate an object of interest (OOI), which may
consist of many distinct regions (each internally homogeneous with
respect to low-level image features). The algorithm uses multiple instance
learning to find prototypical representations of each region, and a naive
Bayesian classifier to determine whether a given block of pixels in the
image is part of the OOI or part of the background. The features
considered include spatial relationships between regions and the color
and texture of the block. The results of this thesis showed that these
techniques can be used to learn characteristics of the OOI useful in
segmentation.