Operating System - OpenVMS

Robert Atkinson
Respected Contributor

Looking For PDF Search Program

Does anyone know of a search engine/program that can find text in PDF files, and will run on VMS?
Antoniov.
Honored Contributor

Re: Looking For PDF Search Program

Search here
http://decwarch.free.fr/pspdf.html

Antonio Vigliotti
Antonio Maria Vigliotti
Robert Atkinson
Respected Contributor

Re: Looking For PDF Search Program

Antoniov - I hunted all through the Ghostscript sites, but couldn't find anything relating to a search facility.

I also hunted for an email address to pose the question, but still came up blank.

Could you point me in the right direction for either? I don't want to install such a large package if it doesn't suit my needs.

Cheers, Robert.
Martin Vorlaender
Honored Contributor

Re: Looking For PDF Search Program

Sorry for answering so late.

My port of ht://Dig to VMS can search PDFs. It uses pdftotext and pdfinfo from the xpdf package.

See http://www.pdv-systeme.de/users/martinv/htdig/
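
For illustration only, the extract-then-search approach described above can be sketched in a few lines of Python. This is not how ht://Dig itself is implemented; it is a minimal sketch that assumes a `pdftotext` executable (from xpdf) is on the PATH and that a Python interpreter is available. The function names `extract_text` and `search_pdfs` are hypothetical. The text is piped to standard output and searched in memory, so no intermediate text files are written.

```python
import subprocess


def extract_text(pdf_path):
    """Convert one PDF to plain text by calling xpdf's pdftotext.

    Assumption: a `pdftotext` executable is on the PATH. The trailing "-"
    asks it to write the text to standard output instead of a file.
    """
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        check=True,
    )
    # The output encoding depends on the pdftotext build and configuration,
    # so undecodable bytes are replaced rather than raising an error.
    return result.stdout.decode("utf-8", errors="replace")


def search_pdfs(paths, term, extract=extract_text):
    """Return (path, line_number, line) for every extracted line containing term.

    The match is a case-insensitive substring test. `extract` can be swapped
    for another text extractor (hypothetical parameter, useful for testing).
    """
    needle = term.lower()
    hits = []
    for path in paths:
        text = extract(str(path))
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line.lower():
                hits.append((str(path), number, line.strip()))
    return hits
```

Calling `search_pdfs(["manual.pdf"], "backup")` (with a hypothetical file name) would list each matching line with its position in the extracted text. Every search re-runs the extraction, so for a large set of files an index built once, as ht://Dig does, is likely to be faster.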
Robert Atkinson
Respected Contributor

Re: Looking For PDF Search Program

Martin - thanks for the links, all very useful.

One question, though: can ht://Dig search the PDF files without extracting the data?

If not, then it would be too cumbersome to use in my application.

Any other suggestions you can give to allow me to search a bunch of PDF files quickly would be very much appreciated.

Cheers, Robert.
Martin Vorlaender
Honored Contributor

Re: Looking For PDF Search Program

Robert,

Sorry, but no. I know of no utility that searches PDF files "natively" on VMS, i.e. without extracting the text.
labadie_1
Honored Contributor

Re: Looking For PDF Search Program

Swish-E does it.

But as you are a subscriber to the WASD mailing list, you already know that :-)



Martin Vorlaender
Honored Contributor

Re: Looking For PDF Search Program

No, it doesn't. Quoting from http://swish-e.org/current/docs/searchdoc.html :

"Swish-e can internally only parse HTML, XML and TXT (text) files by default"

Same philosophy as ht://Dig.
Martin Vorlaender
Honored Contributor

Re: Looking For PDF Search Program

labadie_1
Honored Contributor

Re: Looking For PDF Search Program

Sorry Martin, but I would tend to disagree: a post today on the WASD mailing list explains how to do it.

And when you search the VMS documentation at
http://pi-net.dyndns.org/cgiplus-bin/search

you sometimes find PDF documents.

So it works.

:-)