Creating a graph of the rulebase

Thursday, December 2nd, 2010

To get a better overview of a rulebase you can create a graph that shows you the chain of normalization.

First you have to install an additional package called graphviz. Graphviz is a tool that creates such a graph from a control file, which in turn is generated from the rulebase. You can find more information on the graphviz website.

To install it you can use your package manager, for example the yum command.

$ sudo yum install graphviz

The next step is to create the control file for graphviz. For this we use the normalizer command with the options -d “preferred filename for the control file” and -r “path to the rulebase (sampledb)”

$ ./normalizer -d control.dot -r /home/Test/messages.rb

Please note that there is no need for an input or output file.
If you look at the control file now, you will see that its content is a little confusing, but it includes all the information that graphviz needs to create the graph, such as the nodes, fields and parsers. Of course you can edit that file, but please note that this is a lot of work.

Now we can create the graph by typing

$ dot control.dot -Tpng >graph.png

dot + name of the control file + option -T followed by the file format + output file (written via the > redirection)
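If you want to create the graph from a script, the parts of the command can be assembled as in the following sketch. The helper names build_dot_command and render_graph are made up for this illustration. The sketch uses dot's -o option to name the output file instead of the shell redirection shown above, and it assumes that dot is installed and on your PATH.

```python
import subprocess

def build_dot_command(control_file, file_format, output_file):
    """Return the argument list for: dot <control file> -T<format> -o <output file>."""
    return ["dot", control_file, "-T" + file_format, "-o", output_file]

def render_graph(control_file, file_format, output_file):
    """Run dot; raises CalledProcessError if dot reports a failure."""
    command = build_dot_command(control_file, file_format, output_file)
    subprocess.run(command, check=True)

# Show the command that would be run for the example above.
print(" ".join(build_dot_command("control.dot", "png", "graph.png")))
```

Calling render_graph("control.dot", "svg", "graph.svg") would produce an SVG file instead, which is often easier to zoom into when the graph gets large.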

That is just one example of using graphviz; of course you can do many other great things with it. But I think this “simple” graph can be very helpful when working with the normalizer.

Please find below a sample of such a graph, but please note that it is not a very pretty one. We will update the graph as soon as we have an adequate one. Such a graph can grow very quickly as you edit your rulebase.

graph sample

Creating a rulebase

Tuesday, November 16th, 2010

You can download a first example of a rulebase at
http://blog.gerhards.net/2010/11/log-normalization-first-results.html

I will use an excerpt of that rulebase to show you the most common expressions.

rule=:%date:date-rfc3164% %host:word% %tag:char-to:\x3a%: no longer listening on %ip:ipv4%#%port:number%'

That excerpt is a typical rule. A rule consists of different “parts”/properties that describe the message you want to normalize (e.g. host, IP, source, syslog tag…).

All rules have to start with “rule=:”

The structure of a property is as follows:

%field name:field type:additional information%

field name -> the name can be chosen freely. It should reflect the content of the field, e.g. src-ip for the source IP. As a general rule, a field name should be the same in all samples if the content of the field means the same thing.

field type -> selects the corresponding parser

date-rfc3164: date in format of rfc3164

ipv4: IPv4 address

number: sequence of digits (example: %port:number%)

word: everything up to the next blank (example: %host:word%)

char-to: everything up to the character given in the additional information (example: %tag:char-to:\x3a%, where \x3a in the additional information stands for ":")

additional information -> depends on the field type; some field types need additional information
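To get a feeling for what the field types match, you can approximate them with regular expressions, as in the sketch below. Please note that these patterns are only rough stand-ins that I made up for illustration; they are not the parsers the normalizer actually uses. The sample message (host name, tag, IP and port) is invented as well.

```python
import re

# Rough regular-expression stand-ins for the field types described above.
# They are approximations for illustration, not the normalizer's parsers.
FIELD_PATTERNS = {
    "date-rfc3164": r"[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}",
    "ipv4": r"\d{1,3}(?:\.\d{1,3}){3}",
    "number": r"\d+",
    "word": r"[^ ]+",
}

def char_to(stop_char):
    """Pattern for 'everything up to stop_char' (the char-to field type)."""
    return "[^" + re.escape(stop_char) + "]+"

# Hand-built equivalent of the rule excerpt: properties plus literal text.
pattern = re.compile(
    f"(?P<date>{FIELD_PATTERNS['date-rfc3164']}) "
    f"(?P<host>{FIELD_PATTERNS['word']}) "
    f"(?P<tag>{char_to(':')}): no longer listening on "
    f"(?P<ip>{FIELD_PATTERNS['ipv4']})#"
    f"(?P<port>{FIELD_PATTERNS['number']})"
)

# Invented sample message in the shape the rule expects.
message = "Nov 16 10:15:00 myhost named[2418]: no longer listening on 192.0.2.1#53"
fields = pattern.match(message).groupdict()
print(fields)
```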

In our example we also have some parts that are used as “simple text”. These parts have to appear in the message exactly as written and are not captured by a property.
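The difference between properties and simple text becomes clearer when you take the rule apart. The following sketch splits the excerpt into its properties and the text in between. The function split_rule and the regular expression are my own illustration, based only on the property structure described above; the normalizer itself may read rules differently.

```python
import re

# Matches one property of the form %name:type% or %name:type:extra%.
PROPERTY = re.compile(r"%([^:%]+):([^:%]+)(?::([^%]*))?%")

def split_rule(rule):
    """Split a rule into ("text", literal) and ("property", name, type, extra) parts."""
    prefix = "rule=:"
    if not rule.startswith(prefix):
        raise ValueError("a rule has to start with 'rule=:'")
    body = rule[len(prefix):]
    parts = []
    pos = 0
    for match in PROPERTY.finditer(body):
        if match.start() > pos:
            # Everything between two properties is simple text.
            parts.append(("text", body[pos:match.start()]))
        parts.append(("property", match.group(1), match.group(2), match.group(3)))
        pos = match.end()
    if pos < len(body):
        parts.append(("text", body[pos:]))
    return parts

# The excerpt from above, copied as is.
rule = r"rule=:%date:date-rfc3164% %host:word% %tag:char-to:\x3a%: no longer listening on %ip:ipv4%#%port:number%'"
for part in split_rule(rule):
    print(part)
```

The blanks, the text ": no longer listening on " and the "#" all come out as simple text, while the additional information is only filled in for the char-to property.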

Very important:

In the field type “char-to” you can use any character that is on your keyboard. In the case shown above, the character “:” has to be escaped with its ASCII hex code (\x3a), because the colon already separates the parts of a property. Other characters do not have to be escaped.
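If you are unsure which hex code belongs to a character, a few lines of code can do the conversion for you. The two helper functions below are hypothetical names for this sketch and are not part of the normalizer.

```python
import re

def hex_escape(char):
    """Return the \\xNN form of a single character, e.g. for use in char-to."""
    return "\\x{:02x}".format(ord(char))

def decode_hex_escapes(text):
    """Replace each \\xNN sequence in text with the character it stands for."""
    return re.sub(
        r"\\x([0-9a-fA-F]{2})",
        lambda match: chr(int(match.group(1), 16)),
        text,
    )

print(hex_escape(":"))
print(decode_hex_escapes(r"%tag:char-to:\x3a%"))
```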