Operating System - OpenVMS

HTML to ASCII Conversion

 
James Summerhill
New Member


Does anyone know of any conversion utilities that can convert simple HTML files into ASCII equivalents? The HTML files will include some tables. The system is a VAX4000-400 running OpenVMS 7.1.

Many Thanks

James.
6 REPLIES
Mobeen_1
Esteemed Contributor

Re: HTML to ASCII Conversion

James,
Take a look at the link below, which hosts a collection of freeware for the VMS platform, and see if you can find what you are looking for:

http://vms.process.com/fileserv-software.html

regards
Mobeen
Joseph Huber_1
Honored Contributor

Re: HTML to ASCII Conversion

Well, for me HTML files always look like ASCII text :-)

The only programs I know of that extract the essential text content of an HTML file (this is probably what is wanted, right?) are browsers like LYNX, which has a Print command to save the extracted text to a file.

The Netscape/Mozilla browser can do it as well,
using the "save page as" menu entry and choosing text as the output format.

Netscape/Mozilla can also work in "remote command" mode, so it may be possible to script the save_as command somehow ...
http://www.mpp.mpg.de/~huber
Joseph Huber_1
Honored Contributor

Re: HTML to ASCII Conversion

A quick search found this page

http://www.linuxdevcenter.com/pub/a/linux/2005/05/26/textonly.html

which describes the command

lynx -dump -force_html url_or_file

That seems to be the way to go.
On VMS, precede the command with DEFINE/USER SYS$OUTPUT textfile, and see what you get in textfile.
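
For anyone who would rather script this outside DCL, here is a minimal Python sketch of the same idea: run lynx with the two flags above and redirect its standard output to a file. It assumes a system where both Python and lynx are available and lynx is on the path; the function names and file names are made up for illustration.

```python
import subprocess


def lynx_dump_command(source):
    """Build the lynx argument list that dumps rendered text for a URL or file."""
    # -dump prints the rendered page to standard output;
    # -force_html treats the input as HTML regardless of its file name.
    return ["lynx", "-dump", "-force_html", source]


def lynx_to_text_file(source, out_path):
    """Run lynx and write its standard output to out_path (hypothetical helper)."""
    with open(out_path, "w") as out:
        # check=True raises CalledProcessError if lynx exits with an error.
        subprocess.run(lynx_dump_command(source), stdout=out, check=True)
```

Calling lynx_to_text_file("page.html", "page.txt") would play the same role as the DEFINE/USER redirection followed by the lynx command.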
comarow
Trusted Contributor

Re: HTML to ASCII Conversion

When they are displayed, they are displayed in text mode. Why not cut and paste them into another document?

Now, a tool that converts HTML to plain text would be useful.


Bob
Joseph Huber_1
Honored Contributor

Re: HTML to ASCII Conversion

In fact, such a tool has to parse the HTML syntax and omit any graphic items.
And the lynx browser does it the best way I can imagine. Why reinvent the wheel?
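
To illustrate what "parse the HTML syntax and omit any graphic items" involves, here is a rough Python sketch using the standard library's html.parser. It is not how lynx does it, only a toy under simple assumptions: block-level tags start a new line, table cells are separated by tabs, script and style content is skipped, and images vanish because only text data is kept. The class and function names are invented for this example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text content of an HTML document, ignoring markup."""

    BLOCK_TAGS = {"p", "br", "div", "tr", "li", "table",
                  "h1", "h2", "h3", "h4", "h5", "h6"}
    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self.skip_depth += 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")   # block-level tags start a new line
        elif tag in ("td", "th"):
            self.parts.append("\t")   # table cells become tab-separated
        # Any other tag (img, a, b, ...) contributes nothing by itself.

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Return plain text: one line per block, tabs between table cells."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    lines = []
    for raw_line in "".join(parser.parts).split("\n"):
        # Collapse whitespace inside each cell and drop empty cells.
        cells = [" ".join(cell.split()) for cell in raw_line.split("\t")]
        cells = [cell for cell in cells if cell]
        if cells:
            lines.append("\t".join(cells))
    return "\n".join(lines)
```

Dropping empty cells keeps the sketch short but loses column alignment when a table has blank cells, which is one of the details a real browser handles properly.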
Paul Beaudoin
Regular Advisor

Re: HTML to ASCII Conversion

James,

I acquired an AWK programme called dehtml some years ago. It requires 'GAWK' (freeware) to be installed. It strips the tags from the file, leaving plain ASCII, and, if I remember correctly, has some smarts in it for tables and other 'formatted' data. I used it extensively a long time ago and was impressed. If you want a copy (DEHTML.AWK), mail me and I'll zip it up. It is only 10 blocks.

Regards

Paul