Please note: This PhD defence will take place in DC 1331.
Aida Sheshbolouki, PhD candidate
David R. Cheriton School of Computer Science
Supervisor: Professor M. Tamer Özsu
This thesis introduces two main-memory systems, sGrapp and sGradd, for performing the fundamental analytic tasks of biclique counting and concept drift detection over a streaming graph. The systems are architected using a heuristic function design approach informed by knowledge discovered from the data. To this end, the growth patterns of the streaming graph representing web-logged user-item streams are first mined to discover the emergence principles of streaming motifs. Next, the discovered principles are (a) explained by a graph generator called sGrow; and (b) utilised to establish the requirements for efficient, effective, explainable, and interpretable management and processing of streams. sGrow is also used to benchmark the stream analytics, particularly in the case of concept drift detection.
sGrow displays robust realization of streaming growth patterns independent of initial conditions, scale and temporal characteristics, and model configurations. Extensive evaluations confirm the simultaneous effectiveness and efficiency of sGrapp and sGradd, given native memory of only 15.6 GB. sGrapp achieves average window errors of up to 0.05 and 0.14 in streaming graphs with uniform and non-uniform temporal distributions, respectively, and a processing throughput of 1.5 million data records per second (160× higher throughput and 0.02× lower estimation error than baselines). sGradd's performance improves over time: it achieves zero false positives and zero duplicate detections, with average detection delays in the ranges of [9,373] and [5,310] seconds for sequential drifts occurring at close and far intervals, respectively.