Diversified Stress Testing of RDF Data Management Systems
May 9, 2014
Last updated June 18, 2014
Paper (16 pages): [pdf]
Distributions w.r.t. query features: [pdf]
In our experiments, we used the Waterloo SPARQL Diversity Test Suite (WatDiv).
- Using the WatDiv data generator,
we generated two datasets at scale factors 100 (~10M triples)
and 1000 (~100M triples), respectively.
- Using the WatDiv query template generator,
we generated 125 structurally diverse query templates.
- Using the WatDiv query generator,
we instantiated each query template with 100 queries.
- We generated two sets of workloads,
one for the smaller dataset and one for the larger dataset.
- As described in the paper,
each workload consists of a warmup sequence and 5 test sequences.
- Files with the suffix ".sparql" constitute the warmup/test sequences.
- Files with the suffix ".desc" indicate the query sequence (i.e., a sequence of integers corresponding to query instance ids).
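- To obtain the query template id from a query instance id, subtract 1 from the instance id, divide by 100 using integer division (discard the remainder), and add 1. For example, instance ids 1 through 100 map to template 1, and instance id 101 maps to template 2.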
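The mapping from query instance ids to query template ids can be sketched as follows. This is a minimal illustration, assuming instance ids start at 1 and that each template has exactly 100 instances, as stated above; the function names `template_id` and `template_ids` are hypothetical and not part of WatDiv.

```python
# Each query template is instantiated with 100 queries (from the text above).
QUERIES_PER_TEMPLATE = 100


def template_id(instance_id: int) -> int:
    """Map a 1-based query instance id to its 1-based query template id."""
    if instance_id < 1:
        raise ValueError("query instance ids are assumed to start at 1")
    # Subtract 1, integer-divide by 100, then add 1.
    return (instance_id - 1) // QUERIES_PER_TEMPLATE + 1


def template_ids(instance_ids):
    """Apply template_id to a sequence of instance ids, e.g. one read from a .desc file."""
    return [template_id(i) for i in instance_ids]


if __name__ == "__main__":
    print(template_ids([1, 100, 101, 12500]))  # [1, 1, 2, 125]
```

With 125 templates and 100 instances each, the largest instance id under these assumptions is 12500, which maps to template 125.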
Benchmark results are available (in two csv files) for the two scale factors:
10M triples and 100M triples.
For each query template, we report the mean query execution time.
Error margins are computed at the 95% confidence level.
These results are plotted below:
- Results at 10M triples [pdf]
- Results at 100M triples [pdf]
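As an illustration of how a per-template mean and its 95% error margin might be computed from raw execution times, here is a minimal sketch. It assumes a normal approximation (critical value 1.96) applied to the sample standard error; the exact procedure used to produce the reported results may differ (for instance, it could use a t-distribution), and the function name `mean_with_margin` is hypothetical.

```python
import math
import statistics

# Two-sided 95% critical value of the standard normal distribution.
# Assumption: a normal approximation is acceptable for the sample size at hand.
Z_95 = 1.96


def mean_with_margin(times):
    """Return (mean, margin) so that the 95% interval is mean +/- margin."""
    n = len(times)
    if n < 2:
        raise ValueError("need at least two measurements to estimate a margin")
    mean = statistics.mean(times)
    # Standard error of the mean, based on the sample standard deviation.
    std_err = statistics.stdev(times) / math.sqrt(n)
    return mean, Z_95 * std_err
```

Calling `mean_with_margin` on the execution times measured for one query template yields the value to plot and the half-width of its error bar, in whatever time unit the measurements use.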