Wednesday, April 1, 2009

Simple Statistics

A small bash script (using awk) that computes simple statistics on one column of data in a file. The file can contain any number of columns; you tell the script which column to use.

simple_stat - download
simple_stat - view


Example usage: ./simple_stat [filename] [column number] [column separator]

The script outputs:

Column number,
Sum,
Average,
Standard Deviation,
Number of lines in the file,
Maximum value,
Minimum value.

The output looks something like this:

column= 1 Sum= -5.914206 Avg= -0.029571 SD= 0.760153 Num_of_lines= 200 max= 1.427456 min= -1.305308
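
For readers curious how such numbers can be computed, here is a minimal sketch in bash and awk. It is not the original simple_stat script, only an illustration that prints the same fields. The function name simple_stat_sketch is made up for this example, and several choices are assumptions: the column defaults to 1, the separator defaults to whitespace, the standard deviation is the population form (dividing by N), and lines that lack the requested column are skipped, so Num_of_lines here counts the values actually used.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; NOT the original simple_stat script.
# usage: simple_stat_sketch FILE [COLUMN] [SEPARATOR]
simple_stat_sketch() {
    local file=$1
    local col=${2:-1}     # assumption: default to the first column
    local sep=${3:- }     # assumption: a single space means "any whitespace" in awk

    awk -F"$sep" -v c="$col" '
        # skip lines (for example blank ones) that do not have column c
        NF < c { next }
        {
            x = $c + 0                        # force numeric interpretation
            if (n == 0) { max = x; min = x }  # initialise from the first value
            n++
            sum   += x
            sumsq += x * x
            if (x > max) max = x
            if (x < min) min = x
        }
        END {
            if (n == 0) { print "no data in column " c > "/dev/stderr"; exit 1 }
            avg = sum / n
            var = sumsq / n - avg * avg       # population variance (assumption)
            if (var < 0) var = 0              # guard against rounding error
            printf "column= %d Sum= %f Avg= %f SD= %f Num_of_lines= %d max= %f min= %f\n",
                   c, sum, avg, sqrt(var), n, max, min
        }' "$file"
}
```

After sourcing the function, a call such as simple_stat_sketch data.csv 2 , would report on the second column of a comma-separated file (data.csv is a placeholder name). If you prefer the sample standard deviation, divide the sum of squared deviations by n - 1 instead of n.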

Very often we end up with files that hold blocks of data arranged in columns. This script is a quick way to get some simple statistics on such data.
