Please note: This Systems and Networking seminar will be given online.
Sina Gholamian, PhD candidate
Department of Electrical and Computer Engineering, University of Waterloo
Log files are widely used to record runtime information of software systems, such as the timestamp of an event, the unique ID of the source of the log, and a part of the state of task execution. The rich information in logs enables system developers (and operators) to monitor the runtime behavior of their systems and track down system problems in production settings. With the ever-increasing scale and complexity of modern systems, the volume of logs is growing rapidly, e.g., at a rate of gigabytes of logs per minute. As a result, traditional log analysis that relies largely on manual inspection (e.g., searching for error/warning keywords with grep) has become inefficient, labor-intensive, and error-prone. To address this challenge, many recent efforts have tried to automate log analysis using data-mining techniques. However, the current logging process is mostly manual, and thus the proper placement and content of logging statements remain challenging. To overcome these challenges, methods that aim to automate log placement and content prediction, i.e., ‘where and what to log,’ are of high interest.
Thus, in this research, we focus on predicting log statements, and for this purpose, we perform an experimental study on open-source Java projects. We introduce a log-aware code-clone detection method to predict the location and description of logging statements. Additionally, we incorporate deep learning methods from natural language processing (NLP) to further improve the prediction of log statement descriptions. We also analyze execution logs and extract natural language characteristics of logs to enable the application of natural language models for automated log file analysis. Finally, we propose an automated tool for analyzing log files and measure the information gain from logs for different log analysis tasks such as anomaly detection.
Bio: Sina Gholamian is a final-year Ph.D. student at the University of Waterloo supervised by Prof. Paul Ward. He is interested in inventing and building automated, machine-learning-based approaches for the analysis of software systems. His research is supported by an NSERC doctoral scholarship.
To join this Systems and Networking seminar on Zoom, please go to https://zoom.us/j/92268050403?pwd=bVZyS2Nmc2QwRGZOQzNSbzBCM3ROUT09.