
[Omaha.pm] Unexpected internal vs shell command speed result...



My code currently calls the UNIX "du" command to get the size of a directory structure:
        $size = `/usr/bin/du -sk $DATA_DIR | cut -f1`;

Knowing that spawning a shell is expensive in CPU time and generally not portable across platforms, I am looking into replacing it with a pure Perl implementation:
        find( sub { -f and ( $size += -s _ ) }, $DATA_DIR );
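
For anyone who wants to try the pure Perl line in isolation, here is a minimal self-contained sketch. The wrapper name dir_size_bytes and the demo file names are made up for illustration. Note that it returns a byte count of regular files, whereas "du -sk" reports disk usage in kilobytes, so the two numbers are not expected to match exactly.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Sum the sizes (in bytes) of all regular files below $dir.
# (dir_size_bytes is a hypothetical helper name, not from the original script.)
sub dir_size_bytes {
    my ($dir) = @_;
    my $size = 0;
    # -f stats $_; "-s _" reuses that stat buffer instead of stat'ing again.
    find( sub { -f and ( $size += -s _ ) }, $dir );
    return $size;
}

# Demo on a throwaway directory with two files of known size.
my $tmp = tempdir( CLEANUP => 1 );
mkdir "$tmp/sub" or die "mkdir: $!";
for my $spec ( [ "$tmp/a.txt", 100 ], [ "$tmp/sub/b.txt", 250 ] ) {
    my ( $path, $bytes ) = @$spec;
    open my $fh, '>', $path or die "open $path: $!";
    binmode $fh;
    print {$fh} 'x' x $bytes;
    close $fh or die "close $path: $!";
}
print dir_size_bytes($tmp), "\n";    # prints 350
```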

Wanting to be able to brag about the speed increase, I timed them with the Benchmark routines, and got a shock when I tested against my /tmp directory:
            Rate Internal Shell_du
Internal  11.6/s       --     -99%
Shell_du  1538/s   13123%       --

WOW!  The shell call to du was roughly 130 TIMES faster than the internal find code.  (FYI, the /tmp directory holds 349MB across 6400 files.)
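
As a quick sanity check on the table, the ratio and the percentage can be recomputed from the two printed rates. This is only a sketch using the rounded values shown above; the small difference from the printed 13123% presumably comes from that rounding.

```perl
use strict;
use warnings;

# Rates copied from the /tmp table above (runs per second, as printed).
my %rate = ( Internal => 11.6, Shell_du => 1538 );

# How many times as fast the shell version is, and the same figure
# expressed the way cmpthese prints it (percent faster).
my $ratio = $rate{Shell_du} / $rate{Internal};
printf "Shell_du is about %.0f times as fast as Internal\n", $ratio;
printf "that is roughly %.0f%% faster\n", ( $ratio - 1 ) * 100;
```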

As a test, I created a very small directory structure (12 files, 2 sub-directories, 120KB) and the results for 10,000 timings are opposite:
           Rate Shell_du Internal
Shell_du 1664/s       --     -68%
Internal 5208/s     213%       --

This time the internal code was faster...

My test system runs 64-bit CentOS 5.5 (2GB RAM, with most of the free RAM used for caching) and Perl 5.8.8, and /tmp is on an ext3 filesystem.

This bit of code isn't time critical, and the actual data to be processed is closer to the 120KB test case, so I may go ahead and remove the shell/du line anyway. Still, I'd like to know how the internal version got so slow!

Dan

Just in case I made a blunder, here's the test code:
#!/usr/bin/perl -w
use strict;
use Benchmark qw(:all);
use File::Find;

my $foo      = 0;
my $count    = shift || 2000;
my $DATA_DIR = shift || "/tmp";

# Shell version: du -sk reports disk usage in kilobytes.
sub shell_du {
        my $size = 0;
        $size = `/usr/bin/du -sk $DATA_DIR | cut -f1`;
        chomp $size;
        return $size;
}

# Pure Perl version: sums the byte sizes of regular files only.
sub internal_du {
        my $size = 0;
        find( sub { -f and ( $size += -s _ ) }, $DATA_DIR );
        return $size;
}

cmpthese ($count, {
        'Shell_du' => sub { $foo = shell_du();    },
        'Internal' => sub { $foo = internal_du(); },
});
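
Since Benchmark derives its rates from CPU seconds rather than elapsed time, a wall-clock timing might be a useful cross-check on the numbers above. The following is only a sketch: wallclock_rate is a hypothetical helper, and the stand-in workload exists just so the block runs on its own.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Run $code $count times and return runs per wall-clock second,
# or undef if the elapsed time was too short to measure.
# (wallclock_rate is a hypothetical helper, not part of Benchmark.)
sub wallclock_rate {
    my ( $code, $count ) = @_;
    my $start = time;
    $code->() for 1 .. $count;
    my $elapsed = time - $start;
    return $elapsed > 0 ? $count / $elapsed : undef;
}

# Stand-in workload; in the test script one would pass
# \&shell_du and \&internal_du instead.
my $rate = wallclock_rate( sub { my $x = 0; $x += $_ for 1 .. 1000 }, 200 );
printf "%.1f runs/s (wall clock)\n", $rate if defined $rate;
```

If the wall-clock rates for the two subs differ sharply from the cmpthese table, that would suggest looking at how the CPU time of the child processes is being accounted for.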

-- 
***************** ************* *********** ******* ***** *** **
"Quis custodiet ipsos custodes?"
    (Who can watch the watchmen?)
    -- from the Satires of Juvenal
"I do not fear computers, I fear the lack of them."
    -- Isaac Asimov (Author)
** *** ***** ******* *********** ************* *****************