Reading a specific line from a large file in Perl

Is there a fast, memory-efficient way to read specific lines of a large file without loading it into memory?

I wrote a Perl script that forks many child processes, and I would like them to read specific lines from a file.

At the moment I'm using an external command:

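sub getFileLine {
    my ( $filePath, $lineWanted ) = @_;
    # Ignore SIGPIPE (head exits as soon as it has its line, closing the pipe tail writes to).
    $SIG{PIPE} = 'IGNORE';
    # tail skips ahead to the wanted line; head keeps only that one line.
    open( my $fh, '-|:utf8', "tail -q -n +$lineWanted \"$filePath\" | head -n 1" )
        or die "Cannot run tail/head: $!";
    my $line = <$fh>;
    close $fh;
    chomp( $line ) if defined $line;
    return $line;
}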

It's fast and it works, but maybe there's a more "Perl-ish" way that is as fast and as memory-efficient as this one?

As you know, forking a process in Perl duplicates the main process's memory, so if the main process is using 10MB, the fork will use at least that much.

My goal is to keep the memory use of the forked processes (and therefore of the main process, up to the point where it forks) as low as possible. That's why I don't want to load the whole file into memory.

Answers


Before you go further, it's important to understand how fork works. When you fork a process, the OS uses copy-on-write semantics to share the bulk of the parent's and child's memory; only the pages that differ between parent and child need to be separately allocated.

For reading a single line of a file in Perl, here's a simple way:

open my $fh, '<', $filePath or die "$filePath: $!";
my $line;
while( <$fh> ) {
    if( $. == $lineWanted ) { 
        $line = $_;
        last;
    }
}

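This uses the special $. variable, which holds the current line number of the most recently read filehandle. The loop reads one line at a time and stops at the wanted line, so the whole file is never held in memory.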
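
As a rough sketch, the loop above could be wrapped in a drop-in replacement for getFileLine. The name get_file_line, the UTF-8 layer, and the choice to return undef when the file is shorter than the requested line are assumptions for illustration, not part of the original answer:

```perl
use strict;
use warnings;

# Hypothetical helper: return line number $wanted (1-based) of $path
# without its trailing newline, or undef if the file has fewer lines.
sub get_file_line {
    my ( $path, $wanted ) = @_;

    # Assumes the file is UTF-8, as the '-|:utf8' in the question suggests.
    open my $fh, '<:encoding(UTF-8)', $path or die "$path: $!";

    my $line;
    while ( my $current = <$fh> ) {
        # $. is the line number of the line just read from $fh.
        next unless $. == $wanted;
        chomp( $line = $current );
        last;    # stop reading as soon as the line is found
    }
    close $fh;
    return $line;
}
```

A call such as get_file_line( $filePath, $lineWanted ) holds only one line in memory at a time, but still has to read every line before the wanted one.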


Take a look at the Tie::File core module. It presents the lines of a file as a Perl array without loading the whole file into memory.
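
A minimal sketch of how that might look, assuming a 1-based line number of at least 1 and read-only access. The helper name get_line_tied is made up for illustration:

```perl
use strict;
use warnings;
use Fcntl 'O_RDONLY';
use Tie::File;

# Hypothetical helper: fetch line $wanted (1-based, must be >= 1) via Tie::File.
sub get_line_tied {
    my ( $path, $wanted ) = @_;

    # Open read-only so the file can never be modified through the array.
    tie my @lines, 'Tie::File', $path, mode => O_RDONLY
        or die "Cannot tie $path: $!";

    # The tied array is 0-based; by default records come back
    # without their trailing newline.
    my $line = $lines[ $wanted - 1 ];

    untie @lines;
    return $line;
}
```

Tie::File still has to scan the file up to the requested line to find where it starts, so this is mainly a convenience; when several lines are needed, keeping the array tied between lookups avoids rescanning.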


You don't need to fork. As you can imagine, reading a specific line from a file is a common enough operation that one of the 20k modules on CPAN does it already.

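File::ReadBackwards is memory-efficient and fast. Note that it reads lines starting from the end of the file, so it fits best when the line you want is counted from the end.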

