Brzozowski 80 - Waterloo, Ontario, Canada - June 24, 2015

The Role of Theory in Computer Science

Abstracts

Carl-Johan Seger (Intel)

"Formal hardware verification: theory meets practice"

Modern microprocessors contain several billion transistors and implement very complex algorithms. Ensuring that such complex designs are correct is a challenging and expensive task. Traditional simulation is rapidly becoming inadequate for this task, and even multi-million-dollar emulation hardware is losing the race against complexity. Formal hardware verification marries concepts from discrete mathematics and theoretical computer science with effective heuristic-based satisfiability (SAT) solvers and graph-based function representations to mathematically prove that some part of a chip implements exactly the function provided as its specification. In this talk, I will briefly discuss the major building blocks used in formal hardware verification, discuss current industrial use of the technology, and provide a glimpse into the future of hardware design.
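
As a rough, hedged illustration of the core equivalence-checking idea (my own toy sketch with made-up circuit functions, not Intel's methodology): to prove two combinational circuits equal, one checks that their "miter", the XOR of their outputs, is never true. The sketch below does this by exhaustive simulation; industrial tools replace the enumeration with SAT solvers or graph-based representations such as BDDs in order to scale to billions of transistors.

```python
from itertools import product

def spec(a, b, c):
    # "Specification": the majority function of three inputs.
    return (a and b) or (a and c) or (b and c)

def impl(a, b, c):
    # "Implementation": a different gate-level arrangement of the same function.
    return (a and (b or c)) or (b and c)

def equivalent(f, g, n_inputs):
    """Exhaustively check the miter f XOR g over all input vectors.

    Real equivalence checkers replace this enumeration with SAT or
    BDD reasoning, which scales far beyond exhaustive simulation.
    """
    for bits in product([False, True], repeat=n_inputs):
        if f(*bits) != g(*bits):
            return False, bits   # counterexample found
    return True, None

ok, cex = equivalent(spec, impl, 3)
print("equivalent" if ok else "differ on input %s" % (cex,))
```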


Bruce Watson (Stellenbosch University)

"The coming decades of Brzozowski’s automata algorithms: incrementality and parallelism"

Brzozowski's minimization and automata construction algorithms are two of the original algorithms in the fields of automata theory and regular expressions, having been developed more than five decades ago. These algorithms are not only elegant but efficient in practice, and they have found widespread application in areas ranging from compiler construction to network security. Current applications of automata and regular expressions (especially network security and bioinformatics) require automata on the order of 10⁹ states, which stretch the limits of current algorithms and data structures and require incremental algorithms in which only the "active" portion of an automaton is constructed. Simultaneously, the development of multicore hardware means that several application areas can benefit from parallel algorithms for constructing, minimising and using automata.

In this talk, I will present both the parallel and the incremental variants of Brzozowski's algorithms, focusing on their derivation, correctness, and elegance. Ongoing experiments show that both the parallel and the incremental versions of Brzozowski's algorithms outstrip other algorithms in both performance and memory consumption. This is joint work with Tinus Strauss, Derrick Kourie, and Loek Cleophas.
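
As a concrete baseline for the variants discussed in the talk, here is a minimal Python sketch (my own; it covers only the classical sequential case, not the incremental or parallel versions) of Brzozowski's double-reversal minimization: reverse the automaton, determinize by the subset construction, and repeat; the result is the minimal DFA.

```python
from itertools import chain

def reverse(nfa, inits, finals):
    """Reverse an NFA given as {(state, symbol): set_of_states};
    initial and final states swap roles."""
    rev = {}
    for (p, a), qs in nfa.items():
        for q in qs:
            rev.setdefault((q, a), set()).add(p)
    return rev, set(finals), set(inits)

def determinize(nfa, inits, finals, alphabet):
    """Subset construction; the DFA's states are frozensets of NFA states."""
    start = frozenset(inits)
    dfa, todo, seen = {}, [start], {start}
    while todo:
        S = todo.pop()
        for a in alphabet:
            T = frozenset(chain.from_iterable(nfa.get((q, a), ()) for q in S))
            dfa[(S, a)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    return dfa, start, {S for S in seen if S & set(finals)}, seen

def brzozowski_minimize(nfa, inits, finals, alphabet):
    """determinize(reverse(determinize(reverse(A)))) is the minimal DFA."""
    r1, i1, f1 = reverse(nfa, inits, finals)
    d1, s1, df1, _ = determinize(r1, i1, f1, alphabet)
    n1 = {k: {v} for k, v in d1.items()}   # view the DFA as an NFA again
    r2, i2, f2 = reverse(n1, {s1}, df1)
    return determinize(r2, i2, f2, alphabet)

# the language (a|b)*a, given as a 2-state NFA:
nfa = {('p', 'a'): {'p', 'q'}, ('p', 'b'): {'p'}}
_, _, _, states = brzozowski_minimize(nfa, {'p'}, {'q'}, 'ab')
print(len(states))   # 2: the minimal DFA of (a|b)*a has two states
```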


Zoltán Ésik (University of Szeged)

"A fixed point theorem for non-monotonic functions"

We present a novel fixed point theorem, originating in logic programming, for non-monotonic functions over certain complete lattices.

Joint work with Panos Rondogiannis (University of Athens).
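
For context, the classical baseline that such a theorem generalizes is Kleene iteration for monotone functions: starting from the bottom element, the iterates form an increasing chain that converges to the least fixed point. A minimal Python sketch of that monotone baseline (my illustration, not the construction of the talk):

```python
def kleene_lfp(f, bottom, max_iters=1000):
    """Iterate bottom, f(bottom), f(f(bottom)), ...

    For a monotone f on a lattice with no infinite ascending chains this
    reaches the least fixed point (Knaster-Tarski / Kleene); for a
    non-monotonic f the iterates may oscillate forever, which is why
    fixed point theorems beyond the monotone case are needed.
    """
    x = bottom
    for _ in range(max_iters):
        y = f(x)
        if y == x:
            return x
        x = y
    raise RuntimeError("iteration did not converge")

# a monotone function on the powerset lattice of {1, 2}:
f = lambda s: s | {1} | ({2} if 1 in s else set())
print(kleene_lfp(f, set()))   # {1, 2}
```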


Galina Jirásková (Slovak Academy of Sciences)

"On the boundary of regular languages"

We prove that the tight bound on the state complexity of the boundary of regular languages, defined by bd(L) = L* ∩ (Lᶜ)*, is 3/4 · 4ⁿ + 2ⁿ⁻² − 2 · 3ⁿ⁻² − n + 2. Our witness languages are described over a five-letter alphabet. Next, we show that this bound cannot be met by any quaternary language if n ≥ 5. However, the state complexity of boundary in the quaternary case is smaller by just one. Finally, we prove that the state complexity of boundary in the binary and ternary cases is in Θ(4ⁿ).
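
To make the growth rate of this bound concrete, here is a small Python check (my own, not from the paper) that evaluates the formula for a few values of n; the 3/4 · 4ⁿ term dominates, consistent with the Θ(4ⁿ) behaviour in the binary and ternary cases.

```python
def boundary_bound(n):
    """The tight bound 3/4 * 4^n + 2^(n-2) - 2 * 3^(n-2) - n + 2,
    for n >= 2 so that all exponents are non-negative."""
    return (3 * 4**n) // 4 + 2**(n - 2) - 2 * 3**(n - 2) - n + 2

for n in range(2, 7):
    print(n, boundary_bound(n))   # 11, 43, 176, 719, 2922 for n = 2..6
```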


Ernst Leiss (University of Houston)

"Is complementation evil?"

Complementation has acquired a bad reputation over the years. This reputation is strongly manifested in Stockmeyer's result that adding this operator to the mix of operations of regular expressions defines regular languages whose complexity is non-elementary (it exceeds any fixed-size stack of exponentiations). It was further driven home by a deceptively simple language equation, over a single-letter alphabet, with complementation and concatenation as the only operators, whose solution is not regular (and therefore, since unary context-free languages are regular, not context-free). Further afield, there is a result by Schlörer on general trackers in the area of inference control in statistical databases, where the presence of complementation defeats seemingly strong security defenses. Reexamining these and other examples tempers the bad reputation of complementation somewhat: complementation cannot achieve its misdeeds alone -- it needs accomplices (in the form of other operators).


Jacques Sakarovitch (CNRS/Telecom ParisTech, France)

"The validity of weighted automata"

Joint work with Sylvain Lombardy (Institut Polytechnique de Bordeaux)

The role of theory (one of its roles, at least) is to set up the proper frameworks and define the adequate structures that allow one to build relevant computation models and to justify the algorithms that apply to them.

This talk addresses the problem of the validity of weighted automata in which the presence of epsilon-circuits results in infinite summations: a good example of a situation where appropriate definitions and justified algorithms are needed. Earlier works either rule out such automata or characterise the semirings in which these infinite sums are all well-defined.

By means of a topological approach, we take here a definition of validity that is strong enough to ensure that, over any semiring, any closure algorithm will succeed on every valid weighted automaton and turn it into an equivalent proper automaton. This definition is stable with respect to natural transformations of automata.

The classical closure algorithms, in particular algorithms based on the computation of the star of the matrix of epsilon-transitions, cannot be used to decide validity. This decision problem remains open for general topological semirings.
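
To make the classical step concrete: over the real numbers, the star of the epsilon-transition matrix E, when it exists, is E* = I + E + E² + ... = (I − E)⁻¹, and multiplying it into the proper transitions eliminates the epsilon moves. The following numerical Python sketch (mine; illustrative only, not the Vaucanson implementation, and, as noted above, not a validity test in general) shows this closure step for weights in ℝ.

```python
import numpy as np

def epsilon_closure(E, T):
    """Fold epsilon-transitions into proper transitions over the reals.

    E : n x n matrix of epsilon-transition weights
    T : n x k matrix of proper (letter-labelled) transition weights
    Returns E* @ T.  The series E* = I + E + E^2 + ... converges exactly
    when the spectral radius of E is < 1, in which case E* = (I - E)^-1.
    """
    n = E.shape[0]
    if max(abs(np.linalg.eigvals(E))) >= 1:
        raise ValueError("epsilon matrix has no star; automaton may be invalid")
    return np.linalg.inv(np.eye(n) - E) @ T

E = np.array([[0.0, 0.5],
              [0.0, 0.0]])      # a single epsilon-transition of weight 1/2
T = np.array([[1.0],
              [2.0]])           # weights of one proper letter
print(epsilon_closure(E, T))    # [[2.] [2.]]: state 0 absorbs 1 + 0.5 * 2
```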

We present a closure algorithm that yields a decision procedure for the validity of automata in the case where the weights are taken in ℚ or ℝ, two cases that had never been treated before. These algorithms and the decision procedure are implemented in the Vaucanson platform.


Hellis Tamm (Tallinn University of Technology)

"Atoms of regular languages"

Atoms of regular languages were introduced in 2011 by Brzozowski and me. The atoms of a regular language are the non-empty intersections of complemented or uncomplemented left quotients of the language, where each quotient appears exactly once in the intersection. Every regular language defines a unique nondeterministic finite automaton (NFA), the átomaton, whose states are the atoms of the language. The átomaton of a language is isomorphic to the reverse automaton of the minimal deterministic finite automaton (DFA) of the reverse language. An NFA is atomic if the right language of every state is a union of atoms. We generalized Brzozowski's double-reversal method for minimizing a DFA, showing that the result of applying the subset construction to an NFA is a minimal DFA if and only if the reverse of the NFA is atomic. We also studied the quotient complexity of atoms, providing upper-bound formulas for their complexities. Brzozowski and Davies defined a new class of regular languages whose members have the maximal number of atoms, all of maximal complexity. I will review these results and some of their consequences.
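
As a small illustration of these definitions (my own sketch, not an algorithm from the papers): each state q of the minimal DFA corresponds to a left quotient of the language, and a word's "signature", the set of quotients whose language contains it, identifies the atom the word belongs to. Enumerating the signatures of words therefore enumerates the atoms; a short length bound suffices here purely for demonstration.

```python
from itertools import product

def run(dfa, state, word):
    """Follow the DFA's transitions from `state` along `word`."""
    for a in word:
        state = dfa[(state, a)]
    return state

def atom_signatures(dfa, states, finals, alphabet, max_len):
    """Distinct signatures of words up to length max_len.

    Two words lie in the same atom iff they belong to exactly the same
    quotients, i.e. iff they are accepted from exactly the same states
    of the minimal DFA.
    """
    sigs = set()
    for k in range(max_len + 1):
        for w in product(alphabet, repeat=k):
            sigs.add(frozenset(q for q in states if run(dfa, q, w) in finals))
    return sigs

# minimal DFA of (a|b)*a: state 1 is reached exactly after reading an 'a'
dfa = {(0, 'a'): 1, (0, 'b'): 0, (1, 'a'): 1, (1, 'b'): 0}
for sig in atom_signatures(dfa, [0, 1], {1}, 'ab', 3):
    print(sorted(sig))   # three distinct signatures: (a|b)*a has three atoms
```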


Bruce Cockburn (University of Alberta)

"Data structures and algorithms underlying genome reconstruction from short reads"

The bioinformatics revolution has been driven by advances both in deoxyribonucleic acid (DNA) sequencers and in the data structures and algorithms that are required to process and extract useful information from the resulting flood of genetic data. It is now becoming possible to sequence the 3.2-billion-symbol-long human genome for US $1000. Hopes are high that the technology will provide significant benefits in the form of insight into the genetic causes of disease, more rapid development of drugs, and personalized medicine. Unfortunately, the output of so-called "next-generation" DNA sequencers comes in the form of many millions of fully sequenced short reads, typically from 50 to under 200 nucleotides long over the DNA alphabet {A, C, T, G}, cleaved at random positions from many instances of the genome. A massive DNA sequence reassembly problem must be solved to infer the full genome of a patient from the corresponding set of short reads. We review the major strategies that have been adopted in state-of-the-art short-read alignment algorithms for genome reconstruction, including algorithms based on the Burrows-Wheeler transform and its compressed forms (e.g., Bowtie, BWA and recent variants) and those based on hash tables of short k-mers (e.g., SHRiMP2, BFAST and recent variants). The emphasis is on techniques that are best suited for implementation on hardware accelerators.
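
As a hedged, minimal illustration of the first family of techniques: the Burrows-Wheeler transform supports "backward search", which counts the occurrences of a pattern in a number of steps proportional to the pattern length, independent of the genome length. The Python toy below (my own; Bowtie and BWA add compression, index sampling, and inexact matching on top of this core) builds the transform naively and answers counting queries.

```python
def bwt(text):
    """Burrows-Wheeler transform via sorted rotations; '$' is a
    terminator assumed smaller than every other symbol.  Real tools
    build this in linear time via a (compressed) suffix array."""
    text += '$'
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return ''.join(r[-1] for r in rotations)

def fm_index(b):
    """C[c]: number of text symbols smaller than c;
    occ[c][i]: occurrences of c in b[:i]."""
    C, total = {}, 0
    for c in sorted(set(b)):
        C[c] = total
        total += b.count(c)
    occ = {c: [0] for c in C}
    for ch in b:
        for c in C:
            occ[c].append(occ[c][-1] + (ch == c))
    return C, occ

def count_matches(pattern, b, C, occ):
    """Backward search: the number of occurrences of pattern in the text."""
    lo, hi = 0, len(b)
    for c in reversed(pattern):
        if c not in C:
            return 0
        lo, hi = C[c] + occ[c][lo], C[c] + occ[c][hi]
        if lo >= hi:
            return 0
    return hi - lo

b = bwt('ACTGACGA')
C, occ = fm_index(b)
print(count_matches('AC', b, C, occ))   # 2: 'AC' occurs twice in the text
```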


Jo Ebergen (Oracle)

"The design of a fast, energy-efficient stack"

Stacks, also called LIFOs, are one of the most frequently used data structures in software. At Oracle Labs we have investigated several novel hardware implementations of a stack. All designs are clockless, fast, and energy-efficient, while occupying modest area. We implemented a small stack on a chip that functions correctly at speeds of up to 2.7 GHz in a 180 nm TSMC process at 2 V. The energy consumption per stack operation depends on the number of data movements in the stack, which grows very slowly with the number of data items in the stack. The average dynamic energy per move in the stack varies between 6 pJ and 8 pJ, depending on the type of move. I will present the ideas behind the stack implementation and some test results from the actual chip.

This is joint work with Daniel Finchelstein (Nvidia), Russel Kao, Jon Lexau, and David Hopkins (Oracle Labs).


Andrew Szilard (Western University)

"When Canadian theoretical computer science was born: a personal perspective"

The 1960s are remembered for many important revolutionary events in the world. Computer science was born then, out of mathematics and electrical engineering. Expensive mainframe computers began to be installed at some giant financial institutions and at large universities. Machine and programming languages were created to communicate algorithms, and a large number of analysts and programmers were needed to design and code applications for the giant computers. Undergraduate and graduate programs were conceived to teach computer languages, file and data structures, and advanced computer concepts. Good teachable undergraduate texts were nonexistent, and the fundamentals of computer science were lacking. The first Canadian Theoretical Computer Science Summer School was held in Toronto, and the main speakers were two young researchers, Janusz Brzozowski and Arto Salomaa. This talk is an attempt to recreate some of Prof. Brzozowski's brilliant, inspirational, seminal lectures that became fundamental to the teaching of computer science theory in Canada and beyond.


Helmut Jürgensen (Western University)

"The arduous task of modelling"

Theoretical computer science, like theoretical physics, differs from mathematics in that the former two are guided by real-world issues into the investigation of mathematical problems, whereas the latter investigates such problems, regardless of their origin, purely for their fundamental insights. Sometimes, fortunately, these goals coincide.

From about 1984 until about 1999, Brzozowski and I worked on establishing a formal model for circuit testing. It was a long journey from understanding what practitioners in the field proposed as a reason for their success to proving that their conclusions were justified. We applied our modelling experience later to software verification.

I shall retrace our work to exhibit why finding really adequate mathematical models is such an intrinsically difficult task.


Jean-Éric Pin (University of Paris)

"Open problems about regular languages, 35 years later"

In 1980, Janusz A. Brzozowski presented a selection of six open problems about regular languages: star height, restricted star height, group complexity, star removal, regularity of non-counting classes, and optimality of prefix codes. He also mentioned two other problems in the conclusion of his article: the limitedness problem and the dot-depth hierarchy. These problems have been the source of some of the greatest breakthroughs in automata theory over the past 35 years. In this lecture, I will survey the research done so far on these questions and the hopes for the next 35 years.


C.-J. (Richard) Shi (University of Washington)

"Mixed-signal computing machine: abstraction and verification of your Blueberry/IPhone chip design inspired by Brzozowski's reasoning"

John Brzozowski has made profound contributions to several aspects of finite state machines and complexity theory. One small example is a formulation of finite-state machines (the foundation of modern digital VLSI circuits) as differential equations (the foundation of modern analog circuit design). In this talk, I will show how we extend the theory of finite state machines to mixed-signal computing machines. This allows us to develop a complete suite of tools that can automatically derive Verilog abstractions of transistor netlists (SPICE differential-equation descriptions). These tools are used by leading semiconductor companies to develop chip sets for the iPhone and BlackBerry.


Reminiscences


Financial Support


Financial support is provided by the Fields Institute and by the School of Computer Science, University of Waterloo.