Xu Chu successfully defended his PhD in August 2017. He was supervised by Professor Ihab Ilyas, a member of the David R. Cheriton School of Computer Science’s Data Systems Group.
Xu came to the University of Waterloo in 2010 as a fourth-year undergraduate exchange student from China’s Nanjing University. After completing his undergraduate degree in computer science at Nanjing University, he enrolled in the graduate program as a master’s student in 2011.
Xu showed much promise, so Professor Ilyas recommended he switch from the master’s to the PhD program. In 2015, Xu was invited to join the prestigious Microsoft Research PhD Fellowship Program, a two-year industry fellowship for outstanding PhD candidates.
Although he just graduated with his doctorate, Xu was recently offered a tenure-track position in the College of Computing at Georgia Institute of Technology in Atlanta. We sat down with Xu to learn more about his experience and his trajectory from undergraduate exchange student to PhD candidate to tenure-track faculty.
Waterloo has a formal exchange agreement through the UW 3+1 Program with several top universities in China. Undergraduate students at Nanjing University, my home university, can come to Waterloo to study for a year and students here can go to Nanjing.
Computer science at Waterloo is very strong, so Nanjing’s students are attracted to study here. Similarly, Nanjing has a number of strong programs, so people at Waterloo are attracted to those. It’s a beneficial exchange for students at both universities.
When I learned about this exchange program I decided to come to Waterloo to take advantage of this fantastic opportunity.
Was this your first trip to Canada?
It was my first trip abroad! And it was my first time travelling by plane, a 13-hour flight, almost the longest single flight you can take.
What were your early impressions of life in Canada?
Some senior students helped me and other exchange students rent a house in Waterloo, so we had a place to stay when we arrived in Canada. We took a taxi from the airport to the rental house in Waterloo, but we had nothing to eat when we arrived. I checked a map and the closest grocery store was a 20-minute walk away. This is totally different from life in China, where everything is available everywhere.
I had some training in English, but it was taught by Chinese teachers who didn’t speak much conversational English. So improving my English was perhaps the most important task I had to master. I improved it by taking classes, interacting with fellow students, doing academic presentations, and watching English TV shows. My all-time favourite show was Friends. [laughs] I guess you could say that I had six English instructors.
What did you do during your undergraduate exchange year at Waterloo?
I took computer science courses just like domestic undergraduates, completed assignments and wrote exams. I was already familiar with the subject matter, but learning it in another language certainly helped improve my English substantially.
I didn’t do a lot of research until I started my master’s degree in 2011, but I did take advantage of a URA, the Undergrad Research Assistantship program. Ihab took me on as a research assistant and I worked about five hours a week with one of his PhD students. It was my first introduction to research.
Tell us a bit about your graduate experience with Professor Ihab Ilyas.
Ihab is a rigorous supervisor, but he’s fair and systematic and guides students so they meet milestones. During my early years as a PhD student he had me concentrate on meeting research goals, but during my senior years he wanted me to focus on collaborating with researchers here, at other universities and in industry. He’s guided me strategically all along the way and it paid off.
What’s the overarching goal of your PhD research?
My research aims to make dirty data clean. We call ourselves data janitors. [laughs]
Data dirtiness can come from anywhere. For example, when you enter someone into a personnel record you might make a typo, you might enter the same person twice, or use the person’s initials instead of the full name. It’s one person, but the records are all different. And the problem becomes more complex when you integrate data sets — for example, personnel records at the School of Computer Science with those of the university. You want a unified database, but the integration process itself can introduce errors.
The aim of my research was to detect dirty data and repair it automatically by updating it to the correct values. Of course, you may not know what the correct values are, so that’s the challenge. This process is still done manually, but that’s slow and difficult to scale up. My research aimed to have a computer do it automatically.
Tell us about your career-searching experience.
I started job hunting in November 2016 and I applied to three categories of employers: universities, industry labs and industry. I was most interested in university research, so I focused on faculty positions, but I applied to all three employer categories.
I got interviews at a dozen or so universities, including Cornell, Georgia Tech and the University of British Columbia. On the industry lab side, I got an offer from Microsoft Research. From industry I had an applied scientist offer from Amazon.
My original thought process was that if I got an offer from a top university, I’d have strong students and be in a stimulating environment where I could pursue my research interests. Georgia Tech is a fantastic school, so I eventually accepted their offer.
What research will you be pursuing at Georgia Tech?
I’m still interested in data cleaning and will continue to collaborate with Ihab’s group, but I’m going to explore other research directions, too. I’m not sure at this point, but I want to figure that out before I start at Georgia Tech in January.
Do you have any advice to current and prospective graduate students?
Computer science at Waterloo is fantastic and is every bit the equal of top computer science schools around the world.
As graduate students, we have all the resources we need to succeed. The school has close to 100 faculty members, which makes it larger than any other computer science school in Canada and almost all in the United States. In the Data Systems Group alone we have more than a dozen faculty members. And that’s just one group. We have research breadth and depth that few other schools can offer.
The Cheriton School has fantastic collaborative opportunities, so you can do research in one area but tap into the expertise and resources of another. We’re encouraged to attend conferences and to build networks.
If I could offer just one bit of advice I’d say try to land an internship during your graduate degree. In 2015, I became a Microsoft Research PhD Fellow. Each year Microsoft picks 12 PhD students across North America for this two-year fellowship program. That year, two of the 12 students were from the Cheriton School of Computer Science, me and Laura Inozemtseva. That two of the fellows were from Waterloo that year speaks volumes about our standing and shows we compete favourably with the best computer science schools.
This internship was great for a variety of reasons. I got a strong recommendation letter from Microsoft’s principal researcher. Recommendation letters are extremely important. Publications obviously matter a lot as well, but so does what people say about you. It was a great opportunity and I’m glad I took advantage of it.