
Andrew Kane

Facebook - LinkedIn - arkane@cs[dot]uwaterloo.ca


About
I completed my PhD in 2014 at the University of Waterloo, Cheriton School of Computer Science, where I was a member of the Database Group and the Information Retrieval Group, supervised by Professor Frank Wm. Tompa. I am now a postdoc at the University of Waterloo.

I am currently looking at:

  • search engine space-time performance (efficiency)
  • math search (see Tangent v. 0.3)

In the past I worked on:

  • a search engine to locate Manipulus Florum quotation variants in digital documents
  • disk write latency
  • distributed system design and implementation

I previously worked at Open Text Corp. on the Livelink Search Engine.

Source Code

List Intersection Code - PhD Thesis and SIGIR 2014 paper - arkane-intersect-2014-v1.zip
The merge, skips, bitvectors, and semi-bitvectors code is released for academic purposes only. Some of the list compression code (from WestLab, Polytechnic Institute of NYU, in the files suel_*.h) carries separate copyright notices.

Publications

R. Zanibbi, K. Davila, A. Kane and F. W. Tompa, Multi-stage math formula search: Using appearance-based similarity metrics at scale. International Conference on Research and Development in Information Retrieval (SIGIR), 2016

R. Zanibbi, K. Davila, A. Kane and F. W. Tompa, Tangent-3 at the NTCIR-12 MathIR Task. NTCIR-12, 2016

A. Kane and A. López-Ortiz, Intersections of Inverted Lists. Encyclopedia of Algorithms, 2015

A. Kane, Integrating skips and bitvectors for list intersection. PhD Thesis, University of Waterloo, 2014

A. Kane and F. W. Tompa, Skewed partial bitvectors for list intersection. International Conference on Research and Development in Information Retrieval (SIGIR), 2014

A. Kane and F. W. Tompa, Distribution by document size. Workshop on Large Scale and Distributed Systems for Information Retrieval (LSDS-IR), 2014

A. Kane and F. W. Tompa, Janus: the intertextuality search engine for the electronic Manipulus florum project. Literary and Linguistic Computing, 2011; doi: 10.1093/llc/fqr009

A. Kane, Simulation of Distributed Search Engines: Comparing Term, Document and Hybrid Distribution. University of Waterloo Technical Report CS-2009-10.

A. Kane, Motivating a Distributed System of Commodity Machines. University of Waterloo Technical Report CS-2009-09.

Demonstrations

C. Nighman, A. Kane, and F. W. Tompa, The Intertextuality Search Engine for the Electronic Manipulus florum Project. Demonstration at the International Medieval Congress, University of Leeds, UK, July 2008, Session 1304.

Presentations

Skewed Partial Bitvectors for List Intersection - SIGIR 2014 paper presentation - July 8th 2014

Document Size Distribution - LSDS-IR 2014 workshop presentation - February 28th 2014

Document Size Distribution - DBTalk - February 12th 2014

Space-Time Optimization of Hybrid Bitvector Intersection - DBTalk - January 16th 2013

Contemporary misconceptions that limit distributed system design and implementation - DBTalk - July 20th 2011

Unusual Disk Optimization Techniques - DBTalk - October 28th 2009

Courses - Graduate Studies
    F09 - CS860 (Audit) - Search Engines, Design to Implementation.
    F09 - CS856 - Systems Software for Multicore Environments (SSME)
    • Project: "Transactional Address Spaces: A single interface for persistent, distributed, shared and memory based transactional systems".
    S09 - CS775 - Parallel Algorithms in Scientific Computing
    • Project: "Parallel 2D Particle Simulation (Billiards)".
    W09 - CS848 - Distributed Information Systems
    • Project: "Optimizing Small Log Writes".
    W09 - CS846 - Topics in Software Engineering and Design
    • Project: "Measuring Efficiency of Text Based Clone Detection".
    W08 - CS856 - Performance Modeling and Analysis
    • Project: "Simulation of Distributed Search Engines: Comparing Term, Document and Hybrid Distribution" published as University of Waterloo Technical Report CS-2009-10.
    W08 - CS798 - Non-photorealistic Rendering (NPR)
    • Project: "Human Comparison of Computer Generated Mosaic Algorithms".
    F07 - CS848 - Self-Managing Databases
    • Project: "Motivating Automatic Tuning of Physical Index Structures in Search Engines".
    F07 - CS798 - Information Retrieval (IR)
    • Literature Review: "Motivating a Distributed System of Commodity Machines" published as University of Waterloo Technical Report CS-2009-09.
Courses - Undergraduate Studies
    W01 - CS488 - Introduction to Computer Graphics (C, Tcl/Tk, OpenGL)
    W01 - CS448 - Introduction to Database Management (DB2, SQL)
    S00 - CS454 - Networking and Distributed Systems (C++)
    W00 - CS444 - Compiler Construction (C++, Ada/CS, SPARC Assembly)
    W00 - CS487 - Introduction to Symbolic Computation (Maple)
    W00 - CS442 - Principles of Programming Languages (Scheme, Prolog, ML, Simula67)
Art