THIS PAGE IS OUT OF DATE

When I get some spare time I will fix it. For now, if you need to use Shadow on Ripple, ask me (Justin) for help.


How to Run Shadow-Tor on Ripple

As of this writing, Shadow/Shadow-Tor doesn't quite work on Ripple if one uses the default instructions. Just in case someone else wants to use Shadow for some experiments, here's a quick guide to save you some work in figuring these things out.

First, I recommend following the official guides for Shadow and Shadow-Tor on your lab machine, to get both working on your local setup. If they don't work there, they almost certainly won't work on Ripple either, so you should figure that out first. The documentation can fall out of date at times, so you may want to look through recent commits to see if the problem is addressed there (note that the docs tell you to use the release branch instead of the main branch, but the main branch sometimes has fixes for known bugs that release doesn't). If you can't figure out what's wrong, file a bug report. Rob will typically respond relatively quickly.

Once you have that working, or if you just feel like skipping it, we can move on to using it on Ripple. If you haven't already, set up an account with Ian Goldberg, book via https://ripple.uwaterloo.ca/, and ssh in.

Before we start installing, and so we don't forget, there are a few things to check: the maximum number of files you as a user can have open, the maximum number of files the system can have open, and the maximum number of memory maps a process can have.

Run

ulimit -n

to see how many files you can have open in a process. If the number is less than, say, 10000 or so, you almost certainly want to contact Lori to ask them to significantly increase that number (i.e., add the respective hard and soft limits in /etc/security/limits.conf for your username). I have yet to run into issues with a limit of 1000000 (though with a sufficiently large network, you could).
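
For reference, the limits.conf entries mentioned above could look like the output of the following sketch. The helper name nofile_limit_lines is made up for illustration, and the value you request is up to you; the output is meant to be sent to whoever administers the machine, not written to the file yourself.

```shell
# Print the two limits.conf lines (soft and hard "nofile" limits) for a user.
# limits.conf entries have the form: <domain> <type> <item> <value>
# nofile_limit_lines is a hypothetical helper name, not an existing tool.
nofile_limit_lines() {
    user=$1
    limit=$2
    printf '%s soft nofile %s\n' "$user" "$limit"
    printf '%s hard nofile %s\n' "$user" "$limit"
}

# Example: nofile_limit_lines "$USER" 1000000
```

A new limit only applies to sessions started after the change, so log out and back in before re-running ulimit -n.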

In addition, you should check the maximum number of files the system can have open by running

cat /proc/sys/fs/file-max

It is extremely important that this number be significantly larger (say, ~2x) than the per-process user limit, since if this limit is reached, most commands simply won't run (turns out almost everything on Unix requires a file descriptor, go figure). This means you (or anyone, even root) won't be able to ssh in, and if you are lucky enough to have an open session, you won't be able to run ls or ps. Thankfully, kill still works (it is a shell builtin in bash), so as a last-ditch effort, either kill the process if you have the pid, or kill all of your processes by running kill -9 -1 (a pid of -1 means every process you are allowed to signal). But really, just avoid the problem entirely by ensuring the limit is sufficiently large. If it's not, ask Lori to run

sysctl -w fs.file-max=number  # where number is way bigger than anything in limits.conf
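
The ~2x rule of thumb above can be checked mechanically. This is a minimal sketch: file_max_ok is a hypothetical helper name, the factor of two is just the rough guideline from this page, and a per-process limit of "unlimited" is not handled.

```shell
# Succeed if the system-wide limit is at least twice the per-process limit.
# Both arguments must be plain integers ("unlimited" is not handled here).
# file_max_ok is a hypothetical helper name, not an existing tool.
file_max_ok() {
    per_process=$1
    system_wide=$2
    [ "$system_wide" -ge $((per_process * 2)) ]
}

# On a Linux machine this could be called as:
#   file_max_ok "$(ulimit -n)" "$(cat /proc/sys/fs/file-max)" || echo "fs.file-max is too small"
```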

Finally, it's also likely with large experiments that you'll hit the max number of memory mappings per-process. Run

sysctl vm.max_map_count

to see what the current limit is. Keep in mind that there's going to be at least one map per open file descriptor for this process, probably more, so it should be fairly large. As usual, root can set it via

sysctl -w vm.max_map_count=number
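
To see how close a running experiment is to the mapping limit, you can count the lines in its /proc/<pid>/maps file, which lists one mapping per line on Linux. This is a rough sketch with hypothetical helper names (count_maps, maps_below_limit); the pid in the example is a placeholder.

```shell
# Count the memory mappings listed in a maps file (one mapping per line,
# as in /proc/<pid>/maps on Linux).
count_maps() {
    wc -l < "$1" | tr -d ' '
}

# Succeed if the mapping count is still below the given limit.
maps_below_limit() {
    [ "$(count_maps "$1")" -lt "$2" ]
}

# Example against a running process with pid 12345 (placeholder pid):
#   maps_below_limit /proc/12345/maps "$(sysctl -n vm.max_map_count)" || echo "at the map limit"
```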

Now that that's out of the way, let's install shadow.

Shadow

Everything is easier to set up if it's all in your home directory, so that's where I'll assume things are going.

Clone the release branch from the Shadow repo by running:

git clone https://github.com/shadow/shadow.git -b release

Or, use the master branch if you need that.

Before we build Shadow, we have to locally build glib, because of this bug. That bug is basically a "wontfix" for a broken glib, so unless "lsb_release -a" says something more recent than Ubuntu 14.04, run

mkdir ~/.shadow
cd ~/.shadow
wget http://ftp.gnome.org/pub/gnome/sources/glib/2.42/glib-2.42.1.tar.xz
tar xaf glib-2.42.1.tar.xz
cd glib-2.42.1
./configure --prefix=/home/${USER}/.shadow
make -j 8
make install

Next, we have to deal with a bug in some Ripple machines (notably, Tock), where their install of igraph-dev is some sort of Frankenstein monster of an install. Run:

grep "define IGRAPH_VERSION" /usr/local/include/igraph/igraph_version.h

If you get an error about that file not existing, then the machine probably has a standard igraph install and you can skip all of this. If it does find the file, then do the following:

If you see a line like

#define IGRAPH_VERSION "0.6.5+9.c5849a8"

(in particular, the version number is 0.6.5) then the bug is likely still there. If you want to extra-double check, then run

grep -Pzo "igraph_get_shortest_paths_dijkstra\([^\(]*\);" /usr/local/include/igraph/igraph_paths.h

If the function has 9 arguments but IGRAPH_VERSION is defined as 0.6.something, then the bug is still present. If exactly one of the following holds (the function has 7 arguments, or IGRAPH_VERSION is >= 0.7.0), then the header and the version agree, and someone probably fixed it.
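
The decision rule above can be written down as a small sketch. Both helper names (count_params, igraph_status) are hypothetical, counting commas only works for a plain prototype with no commas inside a parameter, and any version that does not start with 0.6. is assumed to be 0.7.0 or later.

```shell
# Count the parameters of a C prototype read from stdin by counting commas.
# Assumes no single parameter contains a comma of its own.
count_params() {
    commas=$(tr -cd ',' | wc -c | tr -d ' ')
    echo $((commas + 1))
}

# Classify a (version, parameter count) pair using the rule of thumb above:
#   bug     - 0.6.x version string but the 9-parameter prototype
#   ok      - version and prototype agree
#   unknown - any other combination
# Assumption: a version not matching 0.6.* is treated as >= 0.7.0.
igraph_status() {
    version=$1
    nparams=$2
    case $version in
        0.6.*) old=yes ;;
        *)     old=no ;;
    esac
    if [ "$old" = yes ] && [ "$nparams" -eq 9 ]; then
        echo bug
    elif [ "$old" = yes ] && [ "$nparams" -eq 7 ]; then
        echo ok
    elif [ "$old" = no ] && [ "$nparams" -eq 9 ]; then
        echo ok
    else
        echo unknown
    fi
}

# Example: pipe the output of the grep -Pzo command above into count_params,
# then call: igraph_status "0.6.5+9.c5849a8" 9
```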

If the Ripple igraph bug is still present, run

sed -i -e "s/\${IGRAPH_VFILE}/\"\#define IGRAPH_VERSION \\\\\"0.7.1\\\\\"\"/g" ~/shadow/cmake/FindIGRAPH.cmake

This will force Shadow to interpret igraph as being version 0.7.1 (which the development libraries appear to be, and though the dynamically linked libraries at runtime are 0.6.5, it all still apparently works).

If, on the other hand, the Ripple igraph bug is no longer present, but the version number found earlier includes a git revision after the patch number (e.g. it looks like "0.7.1+9.c5849a8" instead of "0.7.1"), then either run the above command with the applicable version number in place of 0.7.1 to hardcode it, or run

sed -i -e "s/\+\\\\\"/+\(\\\\\\\\+.*\)?\\\\\"/g" ~/shadow/cmake/FindIGRAPH.cmake

to fix the regex in the relevant cmake file.

Now we can actually build Shadow.

cd ~/shadow
./setup build -cg
./setup install

Then, either manually add the .shadow/bin directory to your PATH in .bashrc, or just run

echo 'export PATH="${PATH}:/home/${USER}/.shadow/bin"' >> ~/.bashrc && bash
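
If you end up repeating these steps, an idempotent variant avoids piling up duplicate lines in .bashrc. This is a minimal sketch; add_path_line is a hypothetical helper name, and it only detects a line that matches exactly.

```shell
# Append an "export PATH" line for a directory to an rc file, unless that
# exact line is already present. The single quotes keep ${PATH} unexpanded,
# so it is evaluated each time the rc file is read.
# add_path_line is a hypothetical helper name, not an existing tool.
add_path_line() {
    rc_file=$1
    dir=$2
    line='export PATH="${PATH}:'"$dir"'"'
    grep -qxF -- "$line" "$rc_file" 2>/dev/null || printf '%s\n' "$line" >> "$rc_file"
}

# Example: add_path_line ~/.bashrc "$HOME/.shadow/bin"
```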

At this point, you should run

shadow --version

to check that it successfully installed and the version is what you expect, then

shadow --help

to make sure it doesn't throw an error from the glib bug.

Shadow-Tor

Now you can install the Shadow-Tor plugin, which is much simpler. For this, you can just follow the directions on Shadow-Tor's wiki. In a nutshell, the commands are

cd
git clone https://github.com/shadow/shadow-plugin-tor.git -b release
cd ~/shadow-plugin-tor
./setup dependencies
./setup build
./setup install

Once it's installed, you need to set up a network to run it on. This is also covered in the official documentation, at the end, but in case it helps, I'll go over it here as well.

First, you need the shadow-tor tools in your PATH. Either run

export PATH=${PATH}:~/shadow-plugin-tor/build/tor/src/or:~/shadow-plugin-tor/build/tor/src/tools

to do it temporarily, or add it to your .bashrc file like the .shadow/bin directory.

Next, we grab some real-world data to base our network on. The following commands download the latest Alexa list of the top 1 million sites, the latest daily data on the Tor network from the beginning of the month to yesterday, and the latest numbers on running Tor clients. Finally, we copy the most recent consensus file available to the resource directory.

cd ~/shadow-plugin-tor/resource
DATE=$(date +%Y-%m)
wget -N http://s3.amazonaws.com/alexa-static/top-1m.csv.zip
unzip -u top-1m.csv.zip
wget -N https://collector.torproject.org/archive/relay-descriptors/server-descriptors/server-descriptors-$DATE.tar.xz
tar xaf server-descriptors-$DATE.tar.xz
wget -N https://collector.torproject.org/archive/relay-descriptors/extra-infos/extra-infos-$DATE.tar.xz
tar xaf extra-infos-$DATE.tar.xz
wget -N https://collector.torproject.org/archive/relay-descriptors/consensuses/consensuses-$DATE.tar.xz
tar xaf consensuses-$DATE.tar.xz
wget -N https://metrics.torproject.org/stats/clients.csv
cd consensuses-$DATE/ && cd $(ls | tail -1) && cp $(ls | tail -1) ../../consensus_file ; cd ../..
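
The last line above picks the newest consensus by taking the last entry of ls twice. The same idea can be sketched without changing directories, assuming (as that line does) that the archive unpacks into day-numbered subdirectories holding timestamp-named files, so the newest file sorts last. latest_file is a hypothetical helper name.

```shell
# Print the path of the last regular file under a directory in byte-wise
# sorted order. With day-numbered subdirectories and timestamp-named files,
# that is the most recent one.
# latest_file is a hypothetical helper name, not part of shadow-plugin-tor.
latest_file() {
    find "$1" -type f | LC_ALL=C sort | tail -n 1
}

# Example, run from the resource directory:
#   cp "$(latest_file "consensuses-$DATE")" consensus_file
```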

When you're running your experiments, you probably don't want to update these, so that you can be sure changes in performance are due to your changes and not changes in the real Tor network. Or rather, you probably do want to change it, to avoid over-fitting, but consistently. Similarly, if you're comparing results against another Shadow experiment, you should make some effort to use the data as close as possible to the month/day/hour the original experiment used.
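
One way to keep the data consistent across a series of runs is to pin the month instead of always taking the current one. A minimal sketch, where data_month is a hypothetical helper name and 2018-03 is only an example value:

```shell
# Print the month to use for data downloads: the pinned value if one is
# given, otherwise the current month in the same YYYY-MM form used above.
# data_month is a hypothetical helper name.
data_month() {
    if [ -n "${1:-}" ]; then
        echo "$1"
    else
        date +%Y-%m
    fi
}

# Examples:
#   DATE=$(data_month 2018-03)   # pinned for a whole series of experiments
#   DATE=$(data_month)           # current month
```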

Next, run the scripts to generate the network. Here, we're creating a network with 1 authority, 20 relays, 200 clients, and 50 servers of various types, all of which are put in a folder called mytor. The default experiment generated runs for 1 hour of simulation time, the first 30 minutes of which is used for Tor bootstrapping; therefore, only everything after the 30 minute mark (simulation time) should be considered valid data.

python ~/shadow-plugin-tor/tools/parsealexa.py
mkdir mytor
cd mytor
python ~/shadow-plugin-tor/tools/generate.py --nauths 1 --nrelays 20 --nclients 200 --nservers 20 --fweb 0.90 --fbulk 0.10 --nperf50k 10 --nperf1m 10 --nperf5m 10 ../alexa-top-1000-ips.csv ../consensus_file ../server-descriptors-$DATE/ ../extra-infos-$DATE/ ../clients.csv

Again, the $DATE variable we set before is used here (twice); change it as needed for consistency. For details on the options passed in, check the documentation and

python ~/shadow-plugin-tor/tools/generate.py --help

Running the experiment

To run the experiment, just go to the directory that it was set up in ("mytor" in the above example) and run

shadow-tor -w 16

where 16 is the thread count. As of this writing, Shadow doesn't scale linearly with thread count, and will actually perform slightly worse if the thread count is too high. The best performance is around 16-60 threads, depending on the workload. Obviously, never run more threads than the CPU supports (run "lscpu | grep ^CPU\(s\):" to check). I haven't determined whether Shadow works well with hyperthreading, since it hasn't mattered yet.
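
To avoid asking for more threads than the machine has, the thread count can be capped at the CPU count. A minimal sketch: pick_workers is a hypothetical helper name, and the cap of 16 is just the example value from this page, not a tuned recommendation.

```shell
# Print the worker count to use: the smaller of the available CPUs and a cap.
# pick_workers is a hypothetical helper name, not part of Shadow.
pick_workers() {
    cpus=$1
    cap=$2
    if [ "$cpus" -lt "$cap" ]; then
        echo "$cpus"
    else
        echo "$cap"
    fi
}

# Example (nproc is part of GNU coreutils):
#   shadow-tor -w "$(pick_workers "$(nproc)" 16)"
```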

For a quick reference, an experiment using the above setup takes around 1-2 hours to run, with the best performance observed at 16 threads (twice as fast as one thread). An experiment set up with --nauths 3 --nrelays 195 --nclients 600 --nservers 100 --fweb 0.90 --fbulk 0.10 --nperf50k 50 --nperf1m 50 --nperf5m 50 takes 4-7 hours to run, with the best performance observed at about 30 threads (four times as fast as one thread).

Once the experiment is over, use the tools to parse and plot the data:

python ~/shadow/tools/parse-shadow.py --prefix first-results shadow.log
python ~/shadow/tools/parse-tgen.py --prefix first-results shadow.data/hosts
python ~/shadow/tools/plot-shadow.py --prefix "experiment 1" --data first-results/ "run 1"

If everything went according to plan, there should be a PDF of some relevant charts in your current directory. Now, you can go back and recompile Shadow-Tor with the --tor-prefix option to use your custom Tor code.

Topic revision: r7 - 2018-04-10 - JustinTracey
 