Perl -- Re: stripping headers from archives

Peter Gargano peter at ntserver.techedge.com.au
Fri Jun 11 21:29:26 GMT 1999


Okay, I know this is a perennial issue, but...

steve ravet wrote:

> They're on the ftp site in incoming, one for each list:  gmecm.awk,
> diy_efi.awk, efi332.awk

For those who don't have a Unix system, the Perl language, which is
free AND comes in both Unix/Linux and Win32 (i.e. command line, but
32 bit code) versions, is what you need to process the archive text
files to remove headers AND convert them to something a bit easier
to read with a browser - like (say) 

  1. Red coloured "subject:", "date:", and "from:" lines,
  2. italicised lines where ">" or "|" begins the line, and
  3. hyperlinks to the next message with the same subject. 

So, you'd download an archive, process it with a unix/win32 command line 
like "perl cvtarch archive.txt" and then you'd end up with archive.htm 
that you could browse with nutscrape/internet-exploder.
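
To make the idea concrete, here is a rough sketch of the conversion. It is
written in Python purely for illustration (the proposal above is for Perl),
and it is not the actual cvtarch script. It assumes the archive is mbox-style
text in which every message starts with a "From " separator line and the
headers end at the first blank line; that layout, and all the function names
(split_messages, parse, subject_key, convert, main), are assumptions.

```python
import html
import re

KEEP = ("subject:", "date:", "from:")  # the only headers kept (and coloured red)

def split_messages(text):
    """Split on mbox-style 'From ' separator lines (an assumed archive format)."""
    messages, current = [], None
    for line in text.splitlines():
        if line.startswith("From "):
            current = []
            messages.append(current)
        elif current is not None:
            current.append(line)
    return messages

def parse(lines):
    """Return (kept_headers, body_lines); headers end at the first blank line."""
    headers, i = [], 0
    while i < len(lines) and lines[i].strip():
        if lines[i].lower().startswith(KEEP):
            headers.append(lines[i])
        i += 1
    return headers, lines[i + 1:]

def subject_key(headers):
    """Subject with any leading 'Re:' removed, lower-cased, for thread matching."""
    for h in headers:
        if h.lower().startswith("subject:"):
            s = h[len("subject:"):].strip()
            return re.sub(r"^(re:\s*)+", "", s, flags=re.I).lower()
    return ""

def convert(text):
    """Turn the archive text into one HTML page."""
    msgs = [parse(m) for m in split_messages(text)]
    keys = [subject_key(h) for h, _ in msgs]
    out = ["<html><body>"]
    for n, (headers, body) in enumerate(msgs):
        out.append(f'<a name="m{n}"></a>')
        for h in headers:
            out.append(f'<font color="red">{html.escape(h)}</font><br>')
        for line in body:
            esc = html.escape(line)
            if line.startswith((">", "|")):   # quoted text goes in italics
                esc = f"<i>{esc}</i>"
            out.append(esc + "<br>")
        # link to the next later message that shares this subject, if any
        nxt = next((j for j in range(n + 1, len(msgs)) if keys[j] == keys[n]), None)
        if nxt is not None:
            out.append(f'<a href="#m{nxt}">Next message with this subject</a><br>')
        out.append("<hr>")
    out.append("</body></html>")
    return "\n".join(out)

def main(path):
    """Read e.g. archive.txt and write archive.htm next to it."""
    with open(path, encoding="latin-1") as f:
        page = convert(f.read())
    with open(path.rsplit(".", 1)[0] + ".htm", "w", encoding="latin-1") as f:
        f.write(page)
```

Calling main("archive.txt") would write archive.htm. A body line that happens
to begin with "From " would be mistaken for a message separator, so a real
version would need a stricter test for where messages start.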

I also believe that this should be done at the efi332 end of things
so that the oft repeated missive, "you'll find it in the archives", will
become not a chore, but a joy!

AND I think that the archives should be zipped (or at least available
in a zipped version), hopefully a "zip" format that WinZip AND Unix/Linux
understand in their native form.  Any discussion on this issue? My view
is that a 2 Mbyte monthly archive can be compressed (with pkzip) to about
30% of its original size (I just tried it: 1.8 Mb --> 590 kb @ 69% deflation).
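
For anyone who wants to repeat the size check without pkzip, the following is
a small sketch using Python's standard zipfile module. It writes a .zip with
the ordinary deflate method, which is the one PKZIP, WinZip and the Unix
unzip tool all read. The function name zip_archive and the file names are
made up for the example.

```python
import os
import zipfile

def zip_archive(src_path, zip_path):
    """Deflate src_path into zip_path; return compressed size / original size."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # store only the file name, not the full directory path
        zf.write(src_path, arcname=os.path.basename(src_path))
        info = zf.infolist()[0]
    if info.file_size == 0:
        return 1.0  # nothing to compress
    return info.compress_size / info.file_size
```

For example, zip_archive("archive.txt", "archive.zip") returns the fraction of
the original size that remains. With the figures quoted above, 590 / 1800 is
about 0.33, i.e. roughly a third of the original.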

Okay, I've asked for enough, so I'll offer to write the Perl code and
tell everyone where they can get the Perl interpreter. So, is anyone
interested, or have I missed the mark on what people are asking for here?

PS. It's a long weekend here so "I'll be back" to answer any ... on Tuesday.

I also have some ideas about the (dreaded) "you'll find it in the 
archives" situation, but that can wait.

regards
-- 
Peter Gargano


