Monday, November 7, 2011

Perl as a Pluggable Pipe Tool

The find command is a useful tool for finding files/folders on a disk.

It also has a very useful feature, the -exec switch, which runs a command on each file or folder it finds.

For example...
Let's say I want to find all the .log files in a directory that contain the word PASS.

Easy, right?
find "dir" -name '*.log' -exec grep -l PASS {} \;
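
To try this out without touching real data, here is a small sketch that builds a throwaway directory and runs the same command against it. The file names and their contents are made up for illustration; only the find command itself comes from above.

```shell
# Build a throwaway directory with two hypothetical log files.
demo_dir=$(mktemp -d)
printf 'testcase: login\nresult: PASS\n' > "$demo_dir/a.log"
printf 'testcase: logout\nresult: FAIL\n' > "$demo_dir/b.log"

# List only the .log files that contain the word PASS.
passing=$(find "$demo_dir" -name '*.log' -exec grep -l PASS {} \;)
echo "$passing"
```

Only a.log contains PASS, so that is the only path that should be printed.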

But now let's imagine that, for those same files containing the word PASS, we also want to pull out the string that matches /testcase: (.*)/ in each file.
That seems like a reasonable request...

I could try to extend my find command, but the code quickly gets complex and is not portable across different shells (csh, bash, etc.). You could spend plenty of time debugging something like this:
find "dir" -exec \[ -z \` grep -l INST {} \` \] \&\& grep LOC {} \;


I could write a one-off script that does the work of the find command, finding the files and then processing them.
This is fine if you only want to do the job once, but the effort involved in writing standalone scripts for this kind of simple processing is large. It gets you wishing there was some way find could handle this.
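
For a sense of what such a one-off script involves, here is a minimal sketch in shell. The function name report_passing_testcases is made up for this example, and it assumes the testcase line looks like "testcase: something". Notice that both checks are hard-wired into the loop, so any new requirement means editing the script again.

```shell
# One-off approach: walk the .log files and do both checks by hand.
# $1 is the directory to search.
report_passing_testcases() {
    find "$1" -name '*.log' | while IFS= read -r file; do
        # Only look at files that mention PASS...
        if grep -q PASS "$file"; then
            # ...and print their testcase line.
            grep 'testcase: ' "$file"
        fi
    done
}
```

You would call it as report_passing_testcases dir.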

Well, now there is!!

Let's discuss Perl as a useful language for building a pluggable pipe tool for processing files in Unix.

First, let's create a file called ppipe and put these contents in it:
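#!/usr/local/bin/perl
use strict;
use warnings;

# Build the command template, re-quoting any argument that contains whitespace.
my $template = join(' ', map { /\s/ ? "'$_'" : $_ } @ARGV);

# Run the command once per input line, substituting the line for {}.
# The line is inserted as-is, so names with spaces or shell metacharacters are not handled.
while (my $line = <STDIN>) {
    chomp $line;
    (my $cmd = $template) =~ s/\{\}/$line/g;   # work on a copy so {} survives for the next line
    system($cmd);
}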

Make ppipe executable (chmod +x ppipe) and put it somewhere on your PATH. Then we can easily run our find command like this:

find "dir" -name '*.log' | ppipe grep -l PASS {} | ppipe grep -P 'testcase: (.*)' {}

Hey presto... infinite scalability of the find command!