Parallelization

Performance issues

You have tried everything to reduce the runtime of T2 on your exabyte of traffic or on your 10 Gbit interface, and you still need certain jobs done that require a considerable number of plugins. Hence, you need to parallelize T2 operation.

T2 does this differently from other tools, because we still want it to be easy and flexible for users to program plugins on the fly without watching out for race conditions or worrying about semaphores.

So you parallelize different jobs, and this is what you will learn in this tutorial. Note that we are currently developing and testing a fully parallelized version, but that one will carry a ‘1.0.x’ version number. If you want to help the Anteater, be his guest and contact him.

Preparation

First, restore T2 into a pristine state by removing all unnecessary or older plugins from the plugin folder ~/.tranalyzer/plugins:

t2build -e -y

Are you sure you want to empty the plugin folder '/home/wurst/.tranalyzer/plugins' (y/N)? yes
Plugin folder emptied

Then compile the core (tranalyzer2) and the following plugins:

t2build tranalyzer2 protoStats basicFlow basicStats tcpStates txtSink

...
BUILD SUCCESSFUL

If you have not created separate data and results directories yet, please do so now in another bash window; this facilitates your workflow:

mkdir ~/data ~/results

Now you are all set!

Parallelization the geeky way

As T2 has only two threads, one for the core and one for the monitoring, the parallelization concept is a bit different from what you may be used to.

If you have a multi-core machine, you can run several T2 instances with different tasks, i.e., different plugin sets, sniffing the same traffic, and bind each of them to a core. The -c cpu option tells T2 which CPU it should run on; with -c 0 the OS picks the CPU. To tell apart the flows from different T2 instances, each instance should have a unique sensor ID, which is set with the -x ID option. The default sensor ID is 666. For the sensor ID to show up in the flow output, BFO_SENSORID has to be enabled in basicFlow; it is disabled by default:
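Before starting anything, it can help to write the planned instances down and check that no two of them share a sensor ID. The following is a minimal sketch only: the helper name t2_plan and the name:cpu:id spec format are hypothetical, and eth0 is a placeholder interface. The st2 options -i, -w, -l, -c and -x are the ones used in this tutorial. The helper just prints the commands (dry run) and starts nothing.

```shell
#!/usr/bin/env bash
# Dry-run planner: prints one st2 command per instance, starts nothing.
# t2_plan and the "name:cpu:id" spec format are hypothetical helpers.
t2_plan() {
    local iface="$1"; shift
    local spec name cpu id seen=" " out=""
    for spec in "$@"; do
        IFS=: read -r name cpu id <<< "$spec"
        # Each instance needs its own sensor ID (-x) to keep flows apart.
        case "$seen" in
            *" $id "*) echo "duplicate sensor ID: $id" >&2; return 1 ;;
        esac
        seen+="$id "
        # Append -p FOLDER for an instance that uses its own plugin folder.
        out+="st2 -i $iface -w ~/results/$name -l -c $cpu -x $id"$'\n'
    done
    # Only print the plan once all sensor IDs were found to be unique.
    printf '%s' "$out"
}

t2_plan eth0 stat:0:1 L7:0:2 mining:0:3
```

Because the plan is only printed after all specs passed the check, a duplicate ID produces an error message and no commands at all.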

basicFlow

vi src/basicFlow.h

...
/* ========================================================================== */
/* ------------------------ USER CONFIGURATION FLAGS ------------------------ */
/* ========================================================================== */

#define BFO_SENSORID       0 // Output sensorID

#define BFO_HDRDESC_PKTCNT 0 // Include packet count for header description

...

Let’s start simple with three T2 instances performing different tasks while sharing the same core configuration. Replace INTERFACE with the name of your network interface, e.g., eth0; ifconfig helps there.

t2conf basicFlow -D BFO_SENSORID=1

t2conf tranalyzer2 -D IO_BUFFERING=1

t2build -R

t2build -p ~/.tranalyzer/L7 basicFlow tcpStates txtSink dnsDecode voipDetector smtpDecode httpSniffer smbDecode

t2build -p ~/.tranalyzer/mining basicFlow tcpStates txtSink nFrstPkts pktSIATHisto descriptiveStats wavelet

st2 -i INTERFACE -w ~/results/stat -l -c 0 -x 1

[sudo] password for wurst:
^Z
[4]+  Stopped                 sudo /home/wurst/tranalyzer2/tranalyzer2/build/tranalyzer -p "/home/wurst"/.tranalyzer/plugins/ -i INTERFACE -w ~/results/stat -l -c 0 -x 1

bg

[4]+ sudo /home/wurst/tranalyzer2/tranalyzer2/build/tranalyzer -p "/home/wurst"/.tranalyzer/plugins/ -i INTERFACE -w ~/results/stat -l -c 0 -x 1 &

st2 -i INTERFACE -w ~/results/L7 -p ~/.tranalyzer/L7 -l -c 0 -x 2 &

[2] 40161

st2 -i INTERFACE -w ~/results/mining -p ~/.tranalyzer/mining -l -c 0 -x 3 &

[3] 40192

ls ~/.tranalyzer

L7  mining  plugins

t2stat -l

40192	/home/wurst/tranalyzer2/tranalyzer2/build/tranalyzer -p /home/wurst/.tranalyzer/plugins/ -i INTERFACE -w /home/wurst/results/mining -p /home/wurst/.tranalyzer/mining -l -c 0 -x 3	00:03
40161	/home/wurst/tranalyzer2/tranalyzer2/build/tranalyzer -p /home/wurst/.tranalyzer/plugins/ -i INTERFACE -w /home/wurst/results/L7 -p /home/wurst/.tranalyzer/L7 -l -c 0 -x 2	00:06
40147	/home/wurst/tranalyzer2/tranalyzer2/build/tranalyzer -p /home/wurst/.tranalyzer/plugins/ -i INTERFACE -w /home/wurst/results/stat -l -c 0 -x 1	00:08
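With several instances running side by side, it is handy to see at a glance which PID belongs to which sensor ID. A minimal sketch, assuming the t2stat -l output keeps the layout shown above (PID, command line, elapsed time) and that the sensor ID follows -x on the command line. The function name t2_ids is hypothetical, and /path/tranalyzer in the demo line is a placeholder.

```shell
#!/usr/bin/env bash
# Print "PID sensorID" for every line of t2stat -l style input on stdin.
# t2_ids is a hypothetical helper; the input layout is assumed from above.
t2_ids() {
    awk '$1 ~ /^[0-9]+$/ {            # only lines that start with a PID
        id = "-"                      # placeholder when no -x is present
        for (i = 2; i < NF; i++)      # scan the words of the command line
            if ($i == "-x") id = $(i + 1)
        print $1, id
    }'
}

# Usage with a running setup: t2stat -l | t2_ids
printf '40161\t/path/tranalyzer -i eth0 -l -c 0 -x 2\t00:06\n' | t2_ids
```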

Note that with versions > 0.8.12 the T2 binary resides under tranalyzer2/tranalyzer2/build.

Open another bash shell and send a monitoring report signal (USR1) to selected instances; t2stat asks for confirmation for each one. Then terminate all.

t2stat -i -s

40192	/home/wurst/tranalyzer2/tranalyzer2/build/tranalyzer -p /home/wurst/.tranalyzer/plugins/ -i INTERFACE -w /home/wurst/results/mining -p /home/wurst/.tranalyzer/mining -l -c 0 -x 3	03:43
Send -USR1 signal to 42860 (y/N)? y
[sudo] password for wurst:
40161	/home/wurst/tranalyzer2/tranalyzer2/build/tranalyzer -p /home/wurst/.tranalyzer/plugins/ -i INTERFACE -w /home/wurst/results/L7 -p /home/wurst/.tranalyzer/L7 -l -c 0 -x 2	04:12
Send -USR1 signal to 42854 (y/N)?  y
40147	/home/wurst/tranalyzer2/tranalyzer2/build/tranalyzer -p /home/wurst/.tranalyzer/plugins/ -i INTERFACE -w /home/wurst/results/stat -l -c 0 -x 1	06:34
Send -USR1 signal to 42832 (y/N)? y

ls ~/results

L7_flows.txt  L7_headers.txt  L7_log.txt  mining_flows.txt  mining_headers.txt  mining_log.txt  stat_flows.txt  stat_headers.txt  stat_log.txt
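A quick way to confirm that every instance wrote its output is to test for the three files per prefix seen in the listing above. This is a minimal sketch: the function name t2_check_results is hypothetical, and the _flows.txt, _headers.txt and _log.txt suffixes are simply taken from that listing.

```shell
#!/usr/bin/env bash
# Report missing output files for each prefix; returns 1 if any is missing.
# t2_check_results is a hypothetical helper name.
t2_check_results() {
    local dir="$1"; shift
    local prefix suffix rc=0
    for prefix in "$@"; do
        for suffix in flows headers log; do
            if [ ! -e "$dir/${prefix}_${suffix}.txt" ]; then
                echo "missing: $dir/${prefix}_${suffix}.txt"
                rc=1
            fi
        done
    done
    return "$rc"
}

t2_check_results ~/results stat L7 mining || echo "some output is missing"
```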

If you want to use nohup to decouple T2 from the shell, the sudo password prompt will not work. Hence, you first need to invoke any shell command with sudo so that you are authenticated, and then use the following command sequence:

sudo echo -n

[sudo] password for wurst:

nohup bash -ci 'st2 -i INTERFACE -w ~/results/stat -l -c 0 -x 1 &'

nohup: ignoring input and appending output to '/home/wurst/nohup.out'

nohup bash -ci 'st2 -i INTERFACE -w ~/results/L7 -p ~/.tranalyzer/L7 -l -c 0 -x 2 &'

nohup: ignoring input and appending output to '/home/wurst/nohup.out'

nohup bash -ci 'st2 -i INTERFACE -w ~/results/mining -p ~/.tranalyzer/mining -l -c 0 -x 3'

nohup: ignoring input and appending output to '/home/wurst/nohup.out'

t2stat -TERM -s

t2stat -l

No running instance of Tranalyzer found

Now we would like one T2 that does the flow work and another that has a monitoring job, so the core configurations differ. Bind the flow T2 to CPU 1 and the monitoring one to CPU 3. The sensor IDs come from Boeing: 737 and 747 (the good old reliable 737, not the 737 MAX). Configure differential machine monitoring with verbose 0, and block all flow output code to save processing time. The monitoring interval should be 2 seconds.

First start netcat in another bash window to receive the output of the monitoring T2. The output shown below the netcat command only appears once the monitoring T2 is started.

netcat -l 127.0.0.1 -p 6666

%repTyp	time	dur	pktsRec	pktsDrp	ifDrp	memUsageKB	fillSzHashMap	numFlows	numAFlows	numBFlows	numPkts	numAPkts	numBPkts	numV4Pkts	numV6Pkts	numVxPkts	numBytes	numABytes	numBBytes	numFrgV4Pkts	numFrgV6Pkts	numAlarms	rawBandwidth	globalWarn	0x0042Pkts	0x0042Bytes	0x00fePkts	0x00feBytes	0x0806Pkts	0x0806Bytes	0x8035Pkts	0x8035Bytes	0x0800Pkts	0x0800Bytes	0x86ddPkts	0x86ddBytes	ICMPPkts	ICMPBytes	IGMPPkts	IGMPBytes	TCPPkts	TCPBytes	UDPPkts	UDPBytes	GREPkts	GREBytes	ICMPv6Pkts	ICMPv6Bytes	SCTPPkts	SCTPBytes	connSip	connDip	connSipDip	connFave
USR1MR_D	1568993695.581831	7.999967	10	0	0	30188	4	4	4	0	10	10	0	2	2	0	975	975	0	0	0	0	0.975	0x000000000000c044	0	0	0	0	1	60	0	0	2	140	2	180	0	0	0	0	0	0	0	0	0	0	0	0	0	0	2	2	1	11.000
USR1MR_D	1568993705.582146	10.000315	29	0	0	0	4	4	4	0	18	18	0	3	8	0	1839	1839	0	0	0	0	1.471	0x000010000000c044	0	0	0	0	2	120	0	0	3	300	8	824	0	0	0	0	0	0	1	160	0	0	6	644	0	0	2	3	0	1inf
USR1MR_D	1568993715.974287	10.392141	64	0	0	116	9	9	7	2	36	30	6	14	12	0	3760	2968	792	0	0	0	2.894	0x000010000000c064	0	0	0	0	4	222	0	0	14	1446	12	1176	0	0	0	0	10	985	2	321	0	0	10	996	0	0	1	3	1	11.000
USR1MR_D	1568993725.764515	9.790228	75	0	0	0	-2	0	0	0	10	10	0	2	2	0	975	975	0	0	0	0	0.797	0x000010000000c064	0	0	0	0	1	60	0	0	2	140	2	180	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0-nan
USR1MR_D	1568993736.782906	11.018391	103	0	0	0	0	0	0	0	29	29	0	2	16	0	2743	2743	0	0	0	0	1.992	0x000010000000c064	0	0	0	0	6	360	0	0	2	140	16	1648	0	0	0	0	0	0	0	0	0	0	14	1468	0	0	0	0	0	0-nan
USR1MR_D	1568993746.768866	9.985960	121	0	0	0	1	1	1	0	18	18	0	4	8	0	2298	2298	0	0	0	0	1.841	0x000010000200c064	0	0	0	0	0	0	0	0	4	558	8	824	0	0	0	0	0	0	2	418	0	0	6	644	0	0	0	1	0	0-nan
USR1MR_D	1568993755.581209	8.812343	133	0	0	0	0	0	0	0	12	12	0	4	2	0	1393	1393	0	0	0	0	1.265	0x000010000200c064	0	0	0	0	1	60	0	0	4	558	2	180	0	0	0	0	0	0	2	418	0	0	0	0	0	0	0	0	0	0-nan
USR1MR_D	1568993765.581119	9.999910	151	0	0	0	0	0	0	0	18	18	0	2	10	0	1799	1799	0	0	0	0	1.439	0x000010000200c064	0	0	0	0	1	60	0	0	2	140	10	1004	0	0	0	0	0	0	0	0	0	0	8	824	0	0	0	0	0	0-nan
...
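The report is tab-separated and begins with a header line starting with %repTyp, so individual columns can be pulled out by name. A minimal sketch, assuming exactly that layout; the function name t2_mon_cols is hypothetical, and the column names are the ones from the header above. Unknown column names are printed as "?".

```shell
#!/usr/bin/env bash
# Print selected columns (by header name) from a monitoring report on stdin.
# t2_mon_cols is a hypothetical helper; the layout is assumed from above.
t2_mon_cols() {
    awk -F'\t' -v want="$*" '
        BEGIN { n = split(want, names, " ") }
        /^%repTyp/ {                       # header: map names to positions
            for (i = 1; i <= NF; i++) pos[$i] = i
            next
        }
        {
            line = ""
            for (k = 1; k <= n; k++) {
                val = (names[k] in pos) ? $(pos[names[k]]) : "?"
                line = line (k > 1 ? "\t" : "") val
            }
            print line
        }'
}

# Usage: netcat -l 127.0.0.1 -p 6666 | t2_mon_cols time pktsRec numFlows
printf '%%repTyp\ttime\tdur\tpktsRec\nUSR1MR_D\t1.5\t2.0\t10\n' | t2_mon_cols time pktsRec
```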

First make a copy of T2, which becomes the monitoring T2. Then acquire sudo and start the statistics T2, configure the monitoring T2, compile it and invoke it.

cp -r tranalyzer2-0.9.2 montran

sudo echo -n

[sudo] password for wurst:

st2 -i INTERFACE -w ~/results/stat -l -c 0 -x 737 &

[1] 42529

t2conf -t ~/montran tranalyzer2 -D MONINTTMPCP_ON=1 -D MONINTV=2 -D BLOCK_BUF=1 -D VERBOSE=0 -D MACHINE_REPORT=1 -D DIFF_REPORT=1

cd ~/montran

./autogen.sh -p ~/.tranalyzer/monitoring tranalyzer2 basicStats tcpStates connStat

sudo ~/montran/tranalyzer2/tranalyzer2/build/tranalyzer -i INTERFACE -p ~/.tranalyzer/monitoring -c 3 -x 747 &

[2] 42588

Enable DPDK

To improve performance on the hardware side, you need a special Intel or NVIDIA Mellanox network card that optimizes the load balancing across the different queues to which the T2 processes are attached.

...
#define DPDK_MP   0 // Use DPDK multi-process mode instead of libpcap
...

You then need to recompile all loaded plugins because, e.g., if you loaded txtSink, the interface info changes. Use the following command:

t2conf tranalyzer2 -D DPDK_MP=1 && t2build -R

Conclusion

All the commands also work with pcaps and the -R or -D options.

In the future there will be a t2wizard, which simplifies the process of parallelization across different cores.

And don’t forget to stop all T2 instances when you are finished, and reset the configuration of your main Tranalyzer if you want to do another tutorial.

t2stat -TERM -s

t2stat -l

No running instance of Tranalyzer found

t2conf basicFlow -D BFO_SENSORID=0 && t2build -R

To reset the configuration of T2 and all plugins the reset option is now available:

t2conf --reset

t2build -R

Have fun!