My PhD research focuses on the application of Knowledge Graphs for supporting Complex and Exploratory Search Tasks.
During the first two years of my PhD, I worked on extracting Knowledge Graphs automatically from document collections.
However, I came to realize that the performance of systems that use these graphs to support users during search depends not only on the accuracy of the graph generation algorithms, but also on how the graphs are visualized on the screen.
Therefore, I started working on the visualization of the generated graphs and exploring different features and different paradigms of visualization.
While these graphs provide a big picture of the domain of interest, visualizing large graphs on a display has proved to be very challenging.
Consequently, several alternative mechanisms for visualizing potentially large graphs have been proposed. Two major graph visualization paradigms stand out: overview-first (i.e., global) and start-from-known (i.e., local).
Is there a hybrid paradigm that combines global overviews with local views to support exploration and information finding?
Also, which features of these graphs, and which functions of the visualization technique, do searchers find easy to use?
How do they affect the performance of users during search? And ...
I started learning D3 a few months ago and I explore different features every day. I added descriptions to highlight the features I tested in each sample.
So far I have focused on force-layout graphs and tree layouts.
The features I have implemented so far include:
First try with some data. The primary goal was to learn how to bind CSV data to nodes and edges. I didn't work on the style and look of the graph (hence the ugly graph!). As a challenge, though, I made the nodes sticky, so the user can drag a node and it won't go back to its original position. This is useful for organizing and customizing a busy graph.
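To make the data-binding step concrete, here is a minimal, dependency-free sketch of turning an edge-list CSV into node and link arrays, and of pinning a node after a drag. It is an illustration, not the code behind the demo: the column names `source` and `target`, the sample rows, and the helpers `buildGraph`, `pinNode`, and `unpinNode` are all assumptions. The pinning relies on the `fx`/`fy` convention of recent d3-force versions, where a node with those fields set is held at that position.

```javascript
// Hypothetical edge list; the column names "source" and "target" are assumptions.
const csvText = `source,target
A,B
A,C
B,C`;

// Build unique node objects and links that reference them.
// Assumes a simple CSV with no quoted fields or embedded commas.
function buildGraph(text) {
  const [header, ...rows] = text.trim().split("\n");
  const cols = header.split(",").map((c) => c.trim());
  const si = cols.indexOf("source");
  const ti = cols.indexOf("target");
  const nodesById = new Map();
  const getNode = (id) => {
    if (!nodesById.has(id)) nodesById.set(id, { id });
    return nodesById.get(id);
  };
  const links = rows.map((row) => {
    const cells = row.split(",").map((c) => c.trim());
    return { source: getNode(cells[si]), target: getNode(cells[ti]) };
  });
  return { nodes: [...nodesById.values()], links };
}

// "Sticky" behaviour: fix the node where the user dropped it.
function pinNode(node, x, y) {
  node.fx = x;
  node.fy = y;
}

// Release the node so the simulation can move it again.
function unpinNode(node) {
  node.fx = null;
  node.fy = null;
}

const graph = buildGraph(csvText);
pinNode(graph.nodes[0], 100, 50);
```

In a real page, `pinNode` would be called from the drag handler with the pointer position, and `unpinNode` could be bound to a double-click to let the node float again.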
Next, I tried implementing a fisheye effect. It runs whenever the mouse moves. I also made the nodes collapsible and expandable: clicking on a node that has children collapses it. I could also start with all nodes collapsed as the initial state.
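One way to think about the fisheye effect is as a radial magnification around the mouse position. The sketch below uses the classic distortion function g(t) = (d + 1)t / (dt + 1) on the normalized distance t from the focus. It is only an assumption about how such an effect can be computed, not necessarily what the demo or any D3 plugin does; the function name `fisheye` and the default `radius` and `distortion` values are made up for illustration.

```javascript
// Return the distorted position of `point` for a fisheye lens centred on `focus`.
// Points outside `radius` are left untouched; points inside are pushed outward.
function fisheye(point, focus, radius = 100, distortion = 3) {
  const dx = point.x - focus.x;
  const dy = point.y - focus.y;
  const dist = Math.hypot(dx, dy);
  if (dist === 0 || dist >= radius) return { x: point.x, y: point.y };
  const t = dist / radius; // normalized distance in (0, 1)
  const scaled = ((distortion + 1) * t) / (distortion * t + 1);
  const k = (scaled * radius) / dist; // magnification factor for this point
  return { x: focus.x + dx * k, y: focus.y + dy * k };
}

const lensed = fisheye({ x: 50, y: 0 }, { x: 0, y: 0 });
const untouched = fisheye({ x: 150, y: 0 }, { x: 0, y: 0 });
```

On every mouse move, each node's drawing position would be replaced by the result of `fisheye(node, mouse)`, while the underlying layout coordinates stay unchanged.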
Finally, hovering over a node highlights its neighbors.
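Neighbor highlighting boils down to a quick "are these two nodes connected?" lookup. A possible sketch, assuming links are given as pairs of node ids (the helper names `buildNeighborIndex` and `highlightOpacity` and the dimming value are hypothetical):

```javascript
// Map each node id to the set of ids it is directly connected to.
function buildNeighborIndex(links) {
  const index = new Map();
  const add = (a, b) => {
    if (!index.has(a)) index.set(a, new Set());
    index.get(a).add(b);
  };
  for (const { source, target } of links) {
    add(source, target);
    add(target, source); // treat the graph as undirected for highlighting
  }
  return index;
}

// Opacity for `nodeId` while `hoveredId` is under the mouse:
// the hovered node and its neighbors stay fully visible, the rest is dimmed.
function highlightOpacity(index, hoveredId, nodeId, dim = 0.1) {
  if (nodeId === hoveredId) return 1;
  const neighbors = index.get(hoveredId);
  return neighbors && neighbors.has(nodeId) ? 1 : dim;
}

const neighborIndex = buildNeighborIndex([
  { source: "A", target: "B" },
  { source: "B", target: "C" },
]);
```

On mouseover, the opacity of every node would be set from `highlightOpacity`, and on mouseout it would be reset to 1.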
Collapsing graphs is not a trivial task. As opposed to trees, each node in a graph does not necessarily have one parent only.
Therefore, we need to decide what should happen when a user clicks on a node to collapse it. This will, of course, depend on the application. One option is to collapse only the neighbors of degree 1 and leave the other nodes intact. This makes sense if we start from a state in which all such nodes are collapsed. The user can then expand a node and see those degree-1 neighbors if they are interested in exploring it and learning more about the corresponding entity. In this version, I also added labels to the edges to see what they look like in the visualized graph. Not so pretty, I would say!
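The "collapse only degree-1 neighbors" option can be sketched as follows. This is one possible reading of the idea, with links given as pairs of node ids; the helpers `collapsibleNeighbors` and `visibleLinks` are hypothetical names.

```javascript
// Return the ids of the neighbors of `nodeId` whose only connection is to `nodeId`.
function collapsibleNeighbors(links, nodeId) {
  const degree = new Map();
  for (const { source, target } of links) {
    degree.set(source, (degree.get(source) || 0) + 1);
    degree.set(target, (degree.get(target) || 0) + 1);
  }
  const hidden = new Set();
  for (const { source, target } of links) {
    if (source === nodeId && degree.get(target) === 1) hidden.add(target);
    if (target === nodeId && degree.get(source) === 1) hidden.add(source);
  }
  return hidden;
}

// Keep only the links whose two endpoints are still visible.
function visibleLinks(links, hidden) {
  return links.filter((l) => !hidden.has(l.source) && !hidden.has(l.target));
}

const sampleLinks = [
  { source: "A", target: "B" }, // B hangs off A only
  { source: "A", target: "C" },
  { source: "A", target: "D" },
  { source: "C", target: "D" },
];
const hiddenByA = collapsibleNeighbors(sampleLinks, "A");
```

Here clicking `A` hides only `B`; `C` and `D` stay because they are also connected to each other. Expanding is the reverse: remove the ids from the hidden set and redraw.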
Static labels on edges made the graph look cluttered. Since some edge labels are long, they won't fit nicely on an edge.
I tried a different design in which the user "highlights" the connections of a node by hovering the mouse over it, freezes the highlight by clicking on the node, and then reads the edge labels by hovering over each neighbor of the clicked node. A tooltip appears and displays the label of the edge between the hovered neighbor and the clicked node.
The edge labels help the user understand how and why two nodes in the graph (i.e. entities) are connected. However, it's always beneficial to add more context (on demand) to clarify the connections and minimize misunderstandings.
Next, I tried showing the edge labels in a sidebar. The user can choose "read more" to get a longer snippet of text containing the original label.
In this version, I also constrained the graph to the borders of the SVG container, so the nodes won't go out of sight.
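Keeping nodes inside the container usually amounts to clamping their coordinates on every tick. A minimal sketch, assuming circular nodes of a known radius (the function name and the sample sizes are illustrative):

```javascript
// Clamp a node's position so that a circle of `radius` stays inside
// a width x height drawing area whose origin is the top-left corner.
function clampToBounds(node, width, height, radius) {
  node.x = Math.max(radius, Math.min(width - radius, node.x));
  node.y = Math.max(radius, Math.min(height - radius, node.y));
  return node;
}

const escaped = clampToBounds({ x: -30, y: 650 }, 800, 600, 10);
```

Calling this for each node at the start of the tick handler, before positions are written to the SVG elements, keeps links and nodes consistent.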
Finally, I worked on resolving overlapping labels. The approach works similarly to a hill-climbing algorithm and runs every time the graph 'ticks'. As a result, the graph needs more time to become stable.
It definitely needs more work. An easy fix (as done in other projects) is to add a progress bar to the page and show the graph only after it has become stable.
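To illustrate what a hill-climbing pass over labels can look like, here is a small sketch. It is not the algorithm used in the demo: it assumes each label may sit in one of four slots around its node (right, left, above, below), measures the total overlap area between label boxes, and greedily moves one label at a time to the slot that lowers that total. All names, the slot scheme, and the 4-pixel gap are assumptions.

```javascript
// Overlap area of two axis-aligned boxes {x, y, w, h}.
function overlapArea(a, b) {
  const w = Math.min(a.x + a.w, b.x + b.w) - Math.max(a.x, b.x);
  const h = Math.min(a.y + a.h, b.y + b.h) - Math.max(a.y, b.y);
  return w > 0 && h > 0 ? w * h : 0;
}

// Box of a label anchored at (ax, ay) for slot 0..3: right, left, above, below.
function labelBox(label, slot) {
  const gap = 4;
  const { ax, ay, w, h } = label;
  if (slot === 0) return { x: ax + gap, y: ay - h / 2, w, h };
  if (slot === 1) return { x: ax - gap - w, y: ay - h / 2, w, h };
  if (slot === 2) return { x: ax - w / 2, y: ay - gap - h, w, h };
  return { x: ax - w / 2, y: ay + gap, w, h };
}

// Sum of pairwise overlap areas for the current slot assignment.
function totalOverlap(labels, slots) {
  let total = 0;
  for (let i = 0; i < labels.length; i++) {
    for (let j = i + 1; j < labels.length; j++) {
      total += overlapArea(labelBox(labels[i], slots[i]), labelBox(labels[j], slots[j]));
    }
  }
  return total;
}

// Greedy hill climbing: accept a slot change only if it strictly reduces overlap.
function resolveOverlaps(labels, maxPasses = 20) {
  const slots = labels.map(() => 0);
  let best = totalOverlap(labels, slots);
  for (let pass = 0; pass < maxPasses && best > 0; pass++) {
    let improved = false;
    for (let i = 0; i < labels.length; i++) {
      let bestSlot = slots[i];
      for (let s = 0; s < 4; s++) {
        slots[i] = s;
        const cost = totalOverlap(labels, slots);
        if (cost < best) {
          best = cost;
          bestSlot = s;
          improved = true;
        }
      }
      slots[i] = bestSlot;
    }
    if (!improved) break; // local optimum reached
  }
  return { slots, overlap: best };
}

const placement = resolveOverlaps([
  { ax: 0, ay: 0, w: 40, h: 10 },
  { ax: 10, ay: 0, w: 40, h: 10 },
]);
```

Because every candidate move recomputes all pairwise overlaps, running this on each tick is expensive for large graphs, which is consistent with the slower stabilization noted above. Running it once after the layout has settled would be cheaper.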
Finally, I tried replacing graphs with trees for a different collection. The new data represents a hierarchy.
Purple nodes can be clicked on to get more information.
The non-leaf nodes can also be collapsed / expanded.
Collapsing or expanding a node also shifts the layout so that it fits more nicely on the screen.
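A common way to collapse and expand tree nodes in D3 examples is to move the child list between a visible `children` field and a hidden `_children` field. The sketch below follows that pattern and adds a hypothetical `layoutHeight` helper that resizes the drawing area from the number of visible leaves; the 20-pixel row height and the sample tree are assumptions, and the demo may size its layout differently.

```javascript
// Toggle a node between collapsed and expanded by stashing its children.
function toggle(node) {
  if (node.children) {
    node._children = node.children;
    node.children = null;
  } else if (node._children) {
    node.children = node._children;
    node._children = null;
  }
  return node;
}

// Count the nodes that are currently drawn as leaves (collapsed nodes count as one).
function countVisibleLeaves(node) {
  if (!node.children || node.children.length === 0) return 1;
  return node.children.reduce((sum, child) => sum + countVisibleLeaves(child), 0);
}

// Height to give the tree layout so that each visible leaf gets one row.
function layoutHeight(root, rowHeight = 20) {
  return countVisibleLeaves(root) * rowHeight;
}

const branch = { name: "a", children: [{ name: "a1" }, { name: "a2" }] };
const treeRoot = { name: "root", children: [branch, { name: "b" }] };
```

After each click, the layout would be recomputed with the new height and the nodes transitioned to their updated positions.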
I tried working with different samples to generate clusters of nodes based on random data. Start by moving the mouse around. Next, click anywhere to make the circles stable. Then you can play with the clusters: drag nodes around and watch them go back to the group they belong to.
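The "go back to the group" behaviour can be approximated by nudging each node toward the center of its cluster on every tick. A minimal sketch under that assumption (the `cluster` field, the `centers` array, and the `strength` value are illustrative, not taken from the demo):

```javascript
// Move every node a fraction `strength` of the way toward its cluster center.
function pullTowardClusters(nodes, centers, strength = 0.1) {
  for (const node of nodes) {
    const center = centers[node.cluster];
    node.x += (center.x - node.x) * strength;
    node.y += (center.y - node.y) * strength;
  }
  return nodes;
}

const clusterCenters = [{ x: 100, y: 0 }, { x: 0, y: 200 }];
const dragged = [{ x: 0, y: 0, cluster: 0 }];
pullTowardClusters(dragged, clusterCenters, 0.5);
```

Repeating the step moves a dragged node geometrically closer to its group; in a force layout this would sit alongside a collision force so that circles in the same cluster do not overlap.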