|
perlfaq5 - perl常问问题集,第五篇
NAMEperlfaq5 - Files and Formats ($Revision: 1.22 $, $Date: 1997/04/24 22:44:02 $)
DESCRIPTIONThis section deals with I/O and the ``f'' issues: filehandles, flushing, formats, and footers.
How do I flush/unbuffer a filehandle? Why must I do this?The C standard
I/O library (stdio) normally buffers characters sent to
devices. This is done for efficiency reasons, so that there isn't a system call
for each byte. Any time you use In most stdio implementations, the type of buffering and the size of the buffer varies according to the type of device. Disk files are block buffered, often with a buffer size of more than 2k. Pipes and sockets are often buffered with a buffer size between 1/2 and 2k. Serial devices (e.g. modems, terminals) are normally line-buffered, and stdio sends the entire line when it gets the newline. Perl does not support truly unbuffered output (except insofar as you can
If you expect characters to get to your device when you print them there, you'll want to autoflush its handle, as in the older: use FileHandle; open(DEV, "<+/dev/tty"); # ceci n'est pas une pipe DEV->autoflush(1); or the newer IO::* modules: use IO::Handle; open(DEV, ">/dev/printer"); # but is this? DEV->autoflush(1); or even this: use IO::Socket; # this one is kinda a pipe? $sock = IO::Socket::INET->new(PeerAddr => 'www.perl.com', PeerPort => 'http(80)', Proto => 'tcp'); die "$!" unless $sock; $sock->autoflush(); $sock->print("GET /\015\012"); $document = join('', $sock->getlines()); print "DOC IS: $document\n"; Note the hardcoded carriage return and newline in their octal equivalents. This is the ONLY way (currently) to assure a proper flush on all platforms, including Macintosh. You can use $oldh = select(DEV); $| = 1; select($oldh); You'll also see code that does this without a temporary variable, as in select((select(DEV), $| = 1)[0]);
How do I change one line in a file/delete a line in a file/insert a line in the middle of a file/append to the beginning of a file?Although humans have an easy time thinking of a text file as being a sequence of lines that operates much like a stack of playing cards -- or punch cards -- computers usually see the text file as a sequence of bytes. In general, there's no direct way for Perl to seek to a particular line of a file, insert text into a file, or remove text from a file. (There are exceptions in special circumstances. Replacing a sequence of bytes
with another sequence of the same length is one. Another is using the
The general solution is to create a temporary copy of the text file with the changes you want, then copy that over the original. $old = $file; $new = "$file.tmp.$$"; $bak = "$file.bak"; open(OLD, "< $old") or die "can't open $old: $!"; open(NEW, "> $new") or die "can't open $new: $!"; # Correct typos, preserving case while (<OLD>) { s/\b(p)earl\b/${1}erl/i; (print NEW $_) or die "can't write to $new: $!"; } close(OLD) or die "can't close $old: $!"; close(NEW) or die "can't close $new: $!"; rename($old, $bak) or die "can't rename $old to $bak: $!"; rename($new, $old) or die "can't rename $new to $old: $!"; Perl can do this sort of thing for you automatically with the # Renumber a series of tests from the command line perl -pi -e 's/(^\s+test\s+)\d+/ $1 . ++$count /e' t/op/taint.t # form a script local($^I, @ARGV) = ('.bak', glob("*.c")); while (<>) { if ($. == 1) { print "This line should appear at the top of each file\n"; } s/\b(p)earl\b/${1}erl/i; # Correct typos, preserving case print; close ARGV if eof; # Reset $. } If you need to seek to an arbitrary line of a file that changes infrequently, you could build up an index of byte positions of where the line ends are in the file. If the file is large, an index of every tenth or hundredth line end would allow you to seek and read fairly efficiently. If the file is sorted, try the look.pl library (part of the standard perl distribution). In the unique case of deleting lines at the end of a file, you can use
open (FH, "+< $file"); while ( <FH> ) { $addr = tell(FH) unless eof(FH) } truncate(FH, $addr); Error checking is left as an exercise for the reader.
How do I count the number of lines in a file?One fairly efficient way is to count newlines in the file. The following program uses a feature of tr///, as documented in the perlop manpage. If your text file doesn't end with a newline, then it's not really a proper text file, so this may report one fewer line than you expect. $lines = 0; open(FILE, $filename) or die "Can't open `$filename': $!"; while (sysread FILE, $buffer, 4096) { $lines += ($buffer =~ tr/\n//); } close FILE;
How do I make a temporary file name?Use the process ID and/or the current time-value. If you need to have many temporary files in one process, use a counter: BEGIN { use IO::File; use Fcntl; my $temp_dir = -d '/tmp' ? '/tmp' : $ENV{TMP} || $ENV{TEMP}; my $base_name = sprintf("%s/%d-%d-0000", $temp_dir, $$, time()); sub temp_file { my $fh = undef; my $count = 0; until (defined($fh) || $count > 100) { $base_name =~ s/-(\d+)$/"-" . (1 + $1)/e; $fh = IO::File->new($base_name, O_WRONLY|O_EXCL|O_CREAT, 0644) } if (defined($fh)) { return ($fh, $base_name); } else { return (); } } } Or you could simply use IO::Handle::new_tmpfile.
How can I manipulate fixed-record-length files?The most efficient way is using
# sample input line: # 15158 p5 T 0:00 perl /home/tchrist/scripts/now-what $PS_T = 'A6 A4 A7 A5 A*'; open(PS, "ps|"); $_ = <PS>; print; while (<PS>) { ($pid, $tt, $stat, $time, $command) = unpack($PS_T, $_); for $var (qw!pid tt stat time command!) { print "$var: <$$var>\n"; } print 'line=', pack($PS_T, $pid, $tt, $stat, $time, $command), "\n"; }
How can I make a filehandle local to a subroutine? How do I pass filehandles between subroutines? How do I make an array of filehandles?You may have some success with typeglobs, as we always had to use in days of old: local(*FH); But while still supported, that isn't the best to go about getting local
filehandles. Typeglobs have their drawbacks. You may well want to use the
use FileHandle; sub findme { my $fh = FileHandle->new(); open($fh, "</etc/hosts") or die "no /etc/hosts: $!"; while (<$fh>) { print if /\b127\.(0\.0\.)?1\b/; } # $fh automatically closes/disappears here } Internally, Perl believes filehandles to be of class IO::Handle. You may use that module directly if you'd like (see Handle), or one of its more specific derived classes. Once you have IO::File or FileHandle objects, you can pass them between subroutines or store them in hashes as you would any other scalar values: use FileHandle; # Storing filehandles in a hash and array foreach $filename (@names) { my $fh = new FileHandle($filename) or die; $file{$filename} = $fh; push(@files, $fh); } # Using the filehandles in the array foreach $file (@files) { print $file "Testing\n"; } # You have to do the { } ugliness when you're specifying the # filehandle by anything other than a simple scalar variable. print { $files[2] } "Testing\n"; # Passing filehandles to subroutines sub debug { my $filehandle = shift; printf $filehandle "DEBUG: ", @_; } debug($fh, "Testing\n");
How can I set up a footer format to be used with write()?There's no builtin way to do this, but the perlform manpage has a couple of techniques to make it possible for the intrepid hacker.
How can I write() into a string?See the perlform
manpage for an
How can I output my numbers with commas added?This one will do it for you: sub commify { local $_ = shift; 1 while s/^(-?\d+)(\d{3})/$1,$2/; return $_; } $n = 23659019423.2331; print "GOT: ", commify($n), "\n"; GOT: 23,659,019,423.2331 You can't just: s/^(-?\d+)(\d{3})/$1,$2/g; because you have to put the comma in and then recalculate your position. Alternatively, this commifies all numbers in a line regardless of whether they have decimal portions, are preceded by + or -, or whatever: # from Andrew Johnson <ajohnson@gpu.srv.ualberta.ca> sub commify { my $input = shift; $input = reverse $input; $input =~ s<(\d\d\d)(?=\d)(?!\d*\.)><$1,>g; return reverse $input; }
How can I translate tildes (~) in a filename?Use the <> (glob()) operator, documented in the perlfunc manpage. This requires that you have a shell installed that groks tildes, meaning csh or tcsh or (some versions of) ksh, and thus may have portability problems. The Glob::KGlob module (available from CPAN) gives more portable glob functionality. Within Perl, you may use this directly: $filename =~ s{ ^ ~ # find a leading tilde ( # save this in $1 [^/] # a non-slash character * # repeated 0 or more times (0 means me) ) }{ $1 ? (getpwnam($1))[7] : ( $ENV{HOME} || $ENV{LOGDIR} ) }ex;
How come when I open the file read-write it wipes it out?Because you're using something like this, which truncates the file and then gives you read-write access: open(FH, "+> /path/name"); # WRONG Whoops. You should instead use this, which will fail if the file doesn't exist. open(FH, "+< /path/name"); # open for update If this is an issue, try: sysopen(FH, "/path/name", O_RDWR|O_CREAT, 0644); Error checking is left as an exercise for the reader.
Why do I sometimes get an "Argument list too long" when I use <*>?The To get around this, either do the glob yourself with
Is there a leak/bug in glob()?Due to the current implementation on some operating systems, when you
use the
How can I open a file with a leading ">" or trailing blanks?Normally perl ignores trailing blanks in filenames, and interprets certain leading characters (or a trailing ``|'') to mean something special. To avoid this, you might want to use a routine like this. It makes incomplete pathnames into explicit relative ones, and tacks a trailing null byte on the name to make perl leave it alone: sub safe_filename { local $_ = shift; return m#^/# ? "$_\0" : "./$_\0"; } $fn = safe_filename("<<<something really wicked "); open(FH, "> $fn") or "couldn't open $fn: $!"; You could also use the
How can I reliably rename a file?Well, usually you just use Perl's rename($old, $new) or system("mv", $old, $new); It may be more compelling to use the File::Copy module instead. You just copy
to the new file to the new name (checking return values), then delete the old
one. This isn't really the same semantics as a real
How can I lock a file?Perl's
builtin
flock() can't lock network files.
What can't I just open(FH, ">file.lock")?A common bit of code NOT TO USE is this: sleep(3) while -e "file.lock"; # PLEASE DO NOT USE open(LCK, "> file.lock"); # THIS BROKEN CODE This is a classic race condition: you take two steps to do something which must be done in one. That's why computer hardware provides an atomic test-and-set instruction. In theory, this ``ought'' to work: sysopen(FH, "file.lock", O_WRONLY|O_EXCL|O_CREAT, 0644) or die "can't open file.lock: $!": except that lamentably, file creation (and deletion) is not atomic over NFS, so this won't work (at least, not every time) over the net.
Various schemes involving involving
I still don't get locking. I just want to increment the number in the file. How can I do this?Didn't anyone ever tell you web-page hit counters were useless? Anyway, this is what to do: use Fcntl; sysopen(FH, "numfile", O_RDWR|O_CREAT, 0644) or die "can't open numfile: $!"; flock(FH, 2) or die "can't flock numfile: $!"; $num = <FH> || 0; seek(FH, 0, 0) or die "can't rewind numfile: $!"; truncate(FH, 0) or die "can't truncate numfile: $!"; (print FH $num+1, "\n") or die "can't write numfile: $!"; # DO NOT UNLOCK THIS UNTIL YOU CLOSE close FH or die "can't close numfile: $!"; Here's a much better web-page hit counter: $hits = int( (time() - 850_000_000) / rand(1_000) ); If the count doesn't impress your friends, then the code might. :-)
How do I randomly update a binary file?If you're just trying to patch a binary, in many cases something as simple as this works: perl -i -pe 's{window manager}{window mangler}g' /usr/bin/emacs However, if you have fixed sized records, then you might do something more like this: $RECSIZE = 220; # size of record, in bytes $recno = 37; # which record to update open(FH, "+<somewhere") || die "can't update somewhere: $!"; seek(FH, $recno * $RECSIZE, 0); read(FH, $record, $RECSIZE) == $RECSIZE || die "can't read record $recno: $!"; # munge the record seek(FH, $recno * $RECSIZE, 0); print FH $record; close FH; Locking and error checking are left as an exercise for the reader. Don't forget them, or you'll be quite sorry. Don't forget to set
How do I get a file's timestamp in perl?If you want to retrieve the time at which the file was last read,
written, or had its meta-data (owner, etc) changed, you use the
-M, -A, or -C filetest
operations as documented in the perlfunc
manpage. These retrieve the age of the file (measured against the start-time
of your program) in days as a floating point number. To retrieve the ``raw''
time in seconds since the epoch, you would call the stat function, then use
Here's an example: $write_secs = (stat($file))[9]; print "file $file updated at ", scalar(localtime($file)), "\n"; If you prefer something more legible, use the File::stat module (part of the standard distribution in version 5.004 and later): use File::stat; use Time::localtime; $date_string = ctime(stat($file)->mtime); print "file $file updated at $date_string\n"; Error checking is left as an exercise for the reader.
How do I set a file's timestamp in perl?You use the if (@ARGV < 2) { die "usage: cptimes timestamp_file other_files ...\n"; } $timestamp = shift; ($atime, $mtime) = (stat($timestamp))[8,9]; utime $atime, $mtime, @ARGV; Error checking is left as an exercise for the reader. Note that
How do I print to more than one file at once?If you only have to do this once, you can do this: for $fh (FH1, FH2, FH3) { print $fh "whatever\n" } To connect up to one filehandle to several output filehandles, it's easiest
to use the open (FH, "| tee file1 file2 file3"); Otherwise you'll have to write your own multiplexing print function -- or your own tee program -- or use Tom Christiansen's, at http://www.cnnb.net/tianyige/tppmsgs/msgs1.htm#112, which is written in Perl. In theory a IO::Tee class could be written, but to date we haven't seen such.
How can I read in a file by paragraphs?Use the
How can I read a single character from a file? From the keyboard?You can use the builtin
If your system supports POSIX, you can use the following code, which you'll note turns off echo processing as well. #!/usr/bin/perl -w use strict; $| = 1; for (1..4) { my $got; print "gimme: "; $got = getone(); print "--> $got\n"; } exit; BEGIN { use POSIX qw(:termios_h); my ($term, $oterm, $echo, $noecho, $fd_stdin); $fd_stdin = fileno(STDIN); $term = POSIX::Termios->new(); $term->getattr($fd_stdin); $oterm = $term->getlflag(); $echo = ECHO | ECHOK | ICANON; $noecho = $oterm & ~$echo; sub cbreak { $term->setlflag($noecho); $term->setcc(VTIME, 1); $term->setattr($fd_stdin, TCSANOW); } sub cooked { $term->setlflag($oterm); $term->setcc(VTIME, 0); $term->setattr($fd_stdin, TCSANOW); } sub getone { my $key = ''; cbreak(); sysread(STDIN, $key, 1); cooked(); return $key; } } END { cooked() } The Term::ReadKey module from CPAN may be easier to use: use Term::ReadKey; open(TTY, "</dev/tty"); print "Gimme a char: "; ReadMode "raw"; $key = ReadKey 0, *TTY; ReadMode "normal"; printf "\nYou said %s, char number %03d\n", $key, ord $key; For DOS systems, Dan Carson mailto:dbc@tc.fluke.COM reports the following: To put the PC in ``raw'' mode, use ioctl with some magic numbers gleaned from msdos.c (Perl source file) and Ralf Brown's interrupt list (comes across the net every so often): $old_ioctl = ioctl(STDIN,0,0); # Gets device info $old_ioctl &= 0xff; ioctl(STDIN,1,$old_ioctl | 32); # Writes it back, setting bit 5 Then to read a single character: sysread(STDIN,$c,1); # Read a single character And to put the PC back to ``cooked'' mode: ioctl(STDIN,1,$old_ioctl); # Sets it back to cooked mode. So now you have $c. If # PC 2-byte keycodes = ^@ + the following: # HEX KEYS # --- ---- # 0F SHF TAB # 10-19 ALT QWERTYUIOP # 1E-26 ALT ASDFGHJKL # 2C-32 ALT ZXCVBNM # 3B-44 F1-F10 # 47-49 HOME,UP,PgUp # 4B LEFT # 4D RIGHT # 4F-53 END,DOWN,PgDn,Ins,Del # 54-5D SHF F1-F10 # 5E-67 CTR F1-F10 # 68-71 ALT F1-F10 # 73-77 CTR LEFT,RIGHT,END,PgDn,HOME # 78-83 ALT 1234567890-= # 84 CTR PgUp This is all trial and error I did a long time ago, I hope I'm reading the file that worked.
How can I tell if there's a character waiting on a filehandle?You should check out the Frequently Asked Questions list in comp.unix.* for things like this: the answer is essentially the same. It's very system dependent. Here's one solution that works on BSD systems: sub key_ready { my($rin, $nfd); vec($rin, fileno(STDIN), 1) = 1; return $nfd = select($rin,undef,undef,0); } You should look into getting the Term::ReadKey extension from CPAN.
How do I open a file without blocking?You need to use the O_NDELAY or O_NONBLOCK flag from the Fcntl module in conjunction with
use Fcntl; sysopen(FH, "/tmp/somefile", O_WRONLY|O_NDELAY|O_CREAT, 0644) or die "can't open /tmp/somefile: $!":
How do I create a file only if it doesn't exist?You need to use the O_CREAT and
O_EXCL flags from the Fcntl module in conjunction with
use Fcntl; sysopen(FH, "/tmp/somefile", O_WRONLY|O_EXCL|O_CREAT, 0644) or die "can't open /tmp/somefile: $!": Be warned that neither creation nor deletion of files is guaranteed to be an atomic operation over NFS. That is, two processes might both successful create or unlink the same file!
How do I do a |