Flow name labeling
Introduction
The fnameLabel plugin helps you tag flows which originate from different files or interfaces.
It is especially useful with the -R or -D options, discussed in the multifileIO tutorial.
Preparation
First, restore T2 into a pristine state by removing all unnecessary or older plugins from the plugin folder ~/.tranalyzer/plugins:
t2build -e -y
Are you sure you want to empty the plugin folder '/home/wurst/.tranalyzer/plugins' (y/N)? yes
Plugin folder emptied
Then compile the following plugins:
t2build tranalyzer2 basicFlow tcpStates fnameLabel txtSink
...
BUILD SUCCESSFUL
If you have not yet created separate data and results directories, please do so now in another bash window; this will simplify your workflow:
mkdir ~/data ~/results
The anonymized sample PCAP used in this tutorial can be downloaded here: faf-exercise.pcap.
Please save it in your ~/data folder.
Now you are all set!
PCAP fragmentation
As we are using the multifileIO options, the PCAP needs to be split into chunks (tcpdump -C 1 starts a new output file roughly every million bytes). Just invoke the commands below:
mkdir ~/data/F
tcpdump -r ~/data/faf-exercise.pcap -w ~/data/F/faf-exercise.pcap -C 1
ls ~/data/F
faf-exercise.pcap faf-exercise.pcap1 faf-exercise.pcap2 faf-exercise.pcap3 faf-exercise.pcap4 faf-exercise.pcap5
Now you are all set for the following chapter.
Plugin fnameLabel
The fnameLabel plugin not only tags flows with the name of their input file, but can also add a hash of the filename or a label derived from it: either the file number or a specific character of the filename. It is predominantly used to automatically separate flows created with the -R or -D option for the training of classifiers.
In order to see the configuration, move to the fnameLabel plugin and look into the fnameLabel.h file.
fnameLabel
vi src/fnameLabel.h
...
/* ========================================================================== */
/* ------------------------ USER CONFIGURATION FLAGS ------------------------ */
/* ========================================================================== */
#define FNL_LBL 1 // 1: Output label derived from input
// (Use fileNum for Tranalyzer -D option, otherwise refer to FNL_IDX)
#define FNL_IDX 0 // Use the 'FNL_IDX' letter of the filename as label
// (t2 -R/-i/-r options) [require FNL_LBL=1]
#define FNL_HASH 0 // 1: Output hash of filename
#define FNL_FLNM 1 // 1: Output filename
#define FNL_FREL 1 // Use absolute (0) or relative (1) filenames for fnLabel, fnHash and fnName
#define FNL_NAMELEN 1024 // Max length for filename
/* +++++++++++++++++++++ ENV / RUNTIME - conf Variables +++++++++++++++++++++ */
/* No env / runtime configuration flags available for fnameLabel */
/* ========================================================================== */
/* ------------------------- DO NOT EDIT BELOW HERE ------------------------- */
/* ========================================================================== */
...
If -D is used, the label denotes the file number in the file name regex.
In all other cases, the constant FNL_IDX defines the position of the character in the filename to be used as label.
Note that if FNL_FREL=1, the position refers to the relative filename (position 0 is the first character after the last slash).
If FNL_FREL=0, the position refers to the absolute path. For example:
Filename | FNL_IDX | FNL_FREL=1 | FNL_FREL=0 |
---|---|---|---|
/home/user/data/F/faf-exercise.pcap | 0 | f | / |
/home/user/data/F/faf-exercise.pcap | 1 | a | h |
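The selection rule in the table can be reproduced with a few lines of shell. This is only an illustrative sketch: fnl_label_char is a hypothetical helper, not part of Tranalyzer, and it merely mimics the behavior described above (relative vs. absolute filename, zero-based position).

```bash
# Hypothetical helper (not part of Tranalyzer): prints the character that
# the description above selects for a given path, FNL_IDX and FNL_FREL.
fnl_label_char() {
    path=$1 idx=$2 frel=$3
    if [ "$frel" -eq 1 ]; then
        name=${path##*/}   # relative: drop everything up to the last slash
    else
        name=$path         # absolute: keep the full path
    fi
    # cut counts characters from 1, FNL_IDX counts from 0
    printf '%s\n' "$name" | cut -c "$((idx + 1))"
}

f=/home/user/data/F/faf-exercise.pcap
fnl_label_char "$f" 0 1   # f
fnl_label_char "$f" 0 0   # /
fnl_label_char "$f" 1 1   # a
fnl_label_char "$f" 1 0   # h
```

The four calls correspond to the four cells of the table, in row order.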
If you like, you may switch on FNL_HASH as well; it produces a number representing the filename.
Here, we leave everything else as default.
fnameLabel using the -D option
t2 -D ~/data/F/faf-exercise.pcap1,5 -w ~/results/
================================================================================
Tranalyzer 0.8.14 (Anteater), Tarantula. PID: 56586
================================================================================
[INF] Creating flows for L2, IPv4, IPv6
Active plugins:
    01: basicFlow, 0.8.14
    02: tcpStates, 0.8.14
    03: fnameLabel, 0.8.14
    04: txtSink, 0.8.14
[INF] IPv4 Ver: 5, Rev: 16122020, Range Mode: 0, subnet ranges loaded: 406105 (406.11 K)
[INF] IPv6 Ver: 5, Rev: 17122020, Range Mode: 0, subnet ranges loaded: 51345 (51.34 K)
Processing file: /home/wurst/data/F/faf-exercise.pcap1
Link layer type: Ethernet [EN10MB/1]
Dump start: 1258594168.120912 sec (Thu 19 Nov 2009 01:29:28 GMT)
Processing file: /home/wurst/data/F/faf-exercise.pcap2
Processing file: /home/wurst/data/F/faf-exercise.pcap3
Processing file: /home/wurst/data/F/faf-exercise.pcap4
Processing file: /home/wurst/data/F/faf-exercise.pcap5
Dump stop : 1258594491.683288 sec (Thu 19 Nov 2009 01:34:51 GMT)
Total dump duration: 323.562376 sec (5m 23s)
...
Number of processed flows: 4
Number of processed A flows: 2 [50.00%]
Number of processed B flows: 2 [50.00%]
Number of request flows: 2 [50.00%]
Number of reply flows: 2 [50.00%]
...
Only 4 flows? Why is that? If you run t2 -r on faf-exercise.pcap, you get 72 flows.
This is because most of the flows are generated in the first chunk, F/faf-exercise.pcap.
We started with index 1, remember?! Gotcha.
If you wanted to process all the chunks, you could modify the -D option as follows: -D ~/data/F/faf-exercise.pcap,5
If you now look into the resulting flow file ~/results/faf-exercise_flows.txt, you will see flows with fnLabel 1 and 5, which match the number in fname (the filename). This means that each of those files caused a flow to be created.
tcol ~/results/faf-exercise_flows.txt
%dir flowInd flowStat timeFirst timeLast duration numHdrDesc numHdrs hdrDesc srcMac dstMac ethType ethVlanID srcIP srcIPCC srcIPOrg srcPort dstIP dstIPCC dstIPOrg dstPort l4Proto tcpStates fnLabel fname
A 1 0x0400000000004000 1258594168.120912 1258594185.427506 17.306594 1 3 eth:ipv4:tcp 00:19:e3:e7:5d:23 00:08:74:38:01:b4 0x0800 143.166.11.10 us "Dell" 64334 192.168.1.105 07 "Private network" 49330 6 0x03 1 "faf-exercise.pcap1"
B 1 0x0400000000004001 1258594168.121080 1258594191.015208 22.894128 1 3 eth:ipv4:tcp 00:08:74:38:01:b4 00:19:e3:e7:5d:23 0x0800 192.168.1.105 07 "Private network" 49330 143.166.11.10 us "Dell" 64334 6 0x43 1 "faf-exercise.pcap1"
A 2 0x0400000000004000 1258594185.618346 1258594185.618346 0.000000 1 3 eth:ipv4:tcp 00:08:74:38:01:b4 00:19:e3:e7:5d:23 0x0800 192.168.1.105 07 "Private network" 49329 143.166.11.10 us "Dell" 21 6 0x03 5 "faf-exercise.pcap5"
B 2 0x0400000000004001 1258594185.427515 1258594491.683288 306.255773 1 3 eth:ipv4:tcp 00:19:e3:e7:5d:23 00:08:74:38:01:b4 0x0800 143.166.11.10 us "Dell" 21 192.168.1.105 07 "Private network" 49329 6 0x43 5 "faf-exercise.pcap5"
Note that if you set FNL_FREL to 0, the absolute path, e.g., /home/user/data/F/faf-exercise.pcap1, is printed instead of the relative one, e.g., faf-exercise.pcap1.
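Since the label is meant to separate flows for classifier training, a quick per-label count is a useful sanity check. The sketch below assumes the flow file is tab-separated with a header line starting with '%', as in the listing above; count_by_label is a hypothetical helper, and the sample file is made up and heavily trimmed (only four columns) so that the snippet runs on its own.

```bash
# Hypothetical helper: counts flows per fnLabel value in a flow file.
# Assumes tab-separated columns and a header line starting with '%'.
count_by_label() {
    awk -F'\t' '
        /^%/ { for (i = 1; i <= NF; i++) if ($i == "fnLabel") col = i; next }
        col  { n[$col]++ }                  # count rows once the column is known
        END  { for (l in n) print l, n[l] }
    ' "$1" | sort -n
}

# Made-up, trimmed sample; a real flow file has many more columns
sample=$(mktemp)
printf '%%dir\tflowInd\tfnLabel\tfname\n'   >  "$sample"
printf 'A\t1\t1\t"faf-exercise.pcap1"\n'   >> "$sample"
printf 'B\t1\t1\t"faf-exercise.pcap1"\n'   >> "$sample"
printf 'A\t2\t5\t"faf-exercise.pcap5"\n'   >> "$sample"
printf 'B\t2\t5\t"faf-exercise.pcap5"\n'   >> "$sample"

count_by_label "$sample"   # one "label count" pair per line
rm -f "$sample"
```

On the real output, the same function could be pointed at ~/results/faf-exercise_flows.txt. The column is looked up by name rather than by position, so the helper keeps working when other plugins add or remove columns.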
fnameLabel using the -R option
With the -R option, we first have to create a PCAP file list:
t2caplist ~/data/F/*[0-9] > ~/data/F/faf-exercise.txt
cat ~/data/F/faf-exercise.txt
/home/user/data/F/faf-exercise.pcap1
/home/user/data/F/faf-exercise.pcap2
/home/user/data/F/faf-exercise.pcap3
/home/user/data/F/faf-exercise.pcap4
/home/user/data/F/faf-exercise.pcap5
Then FNL_IDX needs to be set to the character position where the number is expected. With FNL_FREL=1, this is position 17 (counting from 0) in faf-exercise.pcap1.
Finally, invoke T2 on this very list.
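The value passed to FNL_IDX in the t2conf command can be double-checked rather than counted by hand. The following is a minimal sketch, assuming FNL_FREL=1 so that only the part after the last slash matters; first_digit_idx is a hypothetical helper, not a Tranalyzer script.

```bash
# Hypothetical helper: zero-based position of the first digit in the
# relative filename, i.e. a candidate value for FNL_IDX when FNL_FREL=1.
first_digit_idx() {
    name=${1##*/}   # keep only the part after the last slash
    # awk's match() is 1-based and returns 0 when nothing matches,
    # so names without any digit yield -1.
    awk -v s="$name" 'BEGIN { print match(s, /[0-9]/) - 1 }'
}

first_digit_idx /home/user/data/F/faf-exercise.pcap1   # 17
```

For FNL_FREL=0, the same idea would apply to the full path instead of the stripped name, keeping in mind that directory names may themselves contain digits.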
t2conf fnameLabel -D FNL_IDX=17 && t2build fnameLabel
t2 -R ~/data/F/faf-exercise.txt -w ~/results/
================================================================================
Tranalyzer 0.8.14 (Anteater), Tarantula. PID: 27767
================================================================================
[INF] Creating flows for L2, IPv4, IPv6
Checking list file
    checking file '/home/user/data/F/faf-exercise.pcap1'
    checking file '/home/user/data/F/faf-exercise.pcap2'
    checking file '/home/user/data/F/faf-exercise.pcap3'
    checking file '/home/user/data/F/faf-exercise.pcap4'
    checking file '/home/user/data/F/faf-exercise.pcap5'
Active plugins:
    01: basicFlow, 0.8.14
    02: tcpStates, 0.8.14
    03: fnameLabel, 0.8.14
    04: txtSink, 0.8.14
[INF] IPv4 Ver: 5, Rev: 16122020, Range Mode: 0, subnet ranges loaded: 406105 (406.11 K)
[INF] IPv6 Ver: 5, Rev: 17122020, Range Mode: 0, subnet ranges loaded: 51345 (51.34 K)
Processing list file: /home/user/data/F/faf-exercise.txt
Processing file no. 1 of 5: /home/user/data/F/faf-exercise.pcap1
Link layer type: Ethernet [EN10MB/1]
Dump start: 1258594168.120912 sec (Thu 19 Nov 2009 01:29:28 GMT)
Processing file no. 2 of 5: /home/user/data/F/faf-exercise.pcap2
Link layer type: Ethernet [EN10MB/1]
Processing file no. 3 of 5: /home/user/data/F/faf-exercise.pcap3
Link layer type: Ethernet [EN10MB/1]
Processing file no. 4 of 5: /home/user/data/F/faf-exercise.pcap4
Link layer type: Ethernet [EN10MB/1]
Processing file no. 5 of 5: /home/user/data/F/faf-exercise.pcap5
Link layer type: Ethernet [EN10MB/1]
Dump stop : 1258594491.683288 sec (Thu 19 Nov 2009 01:34:51 GMT)
Total dump duration: 323.562376 sec (5m 23s)
Finished processing. Elapsed time: 0.002678 sec
Finished unloading flow memory. Time: 0.002699 sec
...
Number of processed flows: 4
Number of processed A flows: 2 [50.00%]
Number of processed B flows: 2 [50.00%]
Number of request flows: 2 [50.00%]
Number of reply flows: 2 [50.00%]
...
And you see the same result as before with the -D option.
tcol ~/results/faf-exercise_flows.txt
%dir flowInd flowStat timeFirst timeLast duration numHdrDesc numHdrs hdrDesc srcMac dstMac ethType ethVlanID srcIP srcIPCC srcIPOrg srcPort dstIP dstIPCC dstIPOrg dstPort l4Proto tcpStates fnLabel fname
A 1 0x0400000000004000 1258594168.120912 1258594185.427506 17.306594 1 3 eth:ipv4:tcp 00:19:e3:e7:5d:23 00:08:74:38:01:b4 0x0800 143.166.11.10 us "Dell" 64334 192.168.1.105 07 "Private network" 49330 6 0x03 1 "faf-exercise.pcap1"
B 1 0x0400000000004001 1258594168.121080 1258594191.015208 22.894128 1 3 eth:ipv4:tcp 00:08:74:38:01:b4 00:19:e3:e7:5d:23 0x0800 192.168.1.105 07 "Private network" 49330 143.166.11.10 us "Dell" 64334 6 0x43 1 "faf-exercise.pcap1"
A 2 0x0400000000004000 1258594185.618346 1258594185.618346 0.000000 1 3 eth:ipv4:tcp 00:08:74:38:01:b4 00:19:e3:e7:5d:23 0x0800 192.168.1.105 07 "Private network" 49329 143.166.11.10 us "Dell" 21 6 0x03 5 "faf-exercise.pcap5"
B 2 0x0400000000004001 1258594185.427515 1258594491.683288 306.255773 1 3 eth:ipv4:tcp 00:19:e3:e7:5d:23 00:08:74:38:01:b4 0x0800 143.166.11.10 us "Dell" 21 192.168.1.105 07 "Private network" 49329 6 0x43 5 "faf-exercise.pcap5"
Conclusion
Don’t forget to reset FNL_IDX to its default value:
t2conf fnameLabel -D FNL_IDX=0 && t2build fnameLabel
Or use the new command:
t2conf --reset fnameLabel
Have fun!