Operating System - OpenVMS

Re: Lucene, Solr and filepaths

 
SOLVED
Ben Armstrong
Regular Advisor

Lucene, Solr and filepaths

I tried to get some help from the lucene-java-user list at apache.org back in January getting Solr to work on VMS:

http://mail-archives.apache.org/mod_mbox/lucene-java-user/201001.mbox/<4B433F43.8040803@dymaxion.ca>

http://mail-archives.apache.org/mod_mbox/lucene-java-user/201001.mbox/<4B436182.6000207@dymaxion.ca>

I'm wondering if anyone else has tried to make Lucene or Solr work on VMS. Did you get it to work? If so, how?

Thanks,
Ben
11 REPLIES
Craig A Berry
Honored Contributor
Solution

Re: Lucene, Solr and filepaths

It looks like you're trying to get this working on ODS-2, which is likely going to be a double dose of self-punishment but I assume is necessary due to your local environment.

Folks with Unixy expectations tend to treat the filesystem as a database with the filenames as case-sensitive primary keys that may contain anything that's not a shell metacharacter. Programs with these expectations tend to get indigestion when interacting with the VMS filesystem and its highly structured sense of where the parts of a filespec begin and end and tendency to throw in extra punctuation to avoid ambiguity. An example of the latter is the reporting of a filename with no extension as "foo." in order to avoid the bareword "foo" that could easily be confused with a logical name.

There are various feature logicals in Java and within the CRTL to work around the whole set of expectations I've referred to. I believe what you're looking for is the feature logical DECC$READDIR_DROPDOTNOTYPE, which might do the trick if Java is using the CRTL's readdir():

$ help crtl feature DECC$READDIR_DROPDOTNOTYPE

CRTL

Feature_Logical_Names

DECC$READDIR_DROPDOTNOTYPE

With DECC$READDIR_DROPDOTNOTYPE enabled, readdir when reporting
files in UNIX style only reports the trailing period (.) for
files with no file type when the file name contains a period.

With this logical name disabled, all files without a file type
are reported with a trailing period.
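
The rule in the help text above can be modeled in a few lines of Java. This is only a sketch of the documented behavior, not the CRTL's implementation; the class and method names (ReaddirNameModel, reportedName) are hypothetical, and the model ignores version numbers and case handling.

```java
// Hypothetical model of the documented readdir() reporting rule.
// It mirrors the help text only; it is not how the CRTL is implemented.
class ReaddirNameModel {
    /**
     * @param name    the file name part (on ODS-5 it may itself contain a period)
     * @param type    the file type without its leading period; "" for a null type
     * @param dropDot true to model DECC$READDIR_DROPDOTNOTYPE enabled
     * @return the name as readdir() would report it in UNIX style, per the help text
     */
    static String reportedName(String name, String type, boolean dropDot) {
        if (!type.isEmpty()) {
            return name + "." + type;          // ordinary name.type
        }
        if (dropDot && !name.contains(".")) {
            return name;                       // feature enabled: bare name
        }
        return name + ".";                     // trailing period is kept
    }

    public static void main(String[] args) {
        System.out.println(reportedName("foo", "", false));   // foo.
        System.out.println(reportedName("foo", "", true));    // foo
        System.out.println(reportedName("a.b", "", true));    // a.b.
        System.out.println(reportedName("foo", "txt", true)); // foo.txt
    }
}
```

A program that compares listed names against the names it created would see "foo." instead of "foo" in the first case, which is the kind of mismatch the feature logical is meant to avoid.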


H.Becker
Honored Contributor

Re: Lucene, Solr and filepaths

DECC$READDIR_DROPDOTNOTYPE may help. According to the help text shown, a VMS file "A.;1" will then be reported as "A". In the rare case that the application also expects a file like "A.", it will not work.

If you only have an ODS2 disk available you may want to use the LD (or even the VD) driver to initialize and mount a container file as an ODS5 disk.

>>>
Folks with Unixy expectations ...
<<<
... expect any character in a filename except '/' and '\0'.

>>>
With DECC$READDIR_DROPDOTNOTYPE enabled, readdir when reporting files in UNIX style only reports the trailing period (.) for files with no file type when the file name contains a period.
<<<

There is no such thing as a VMS file "with no file type"; try
$ write sys$output f$parse("x",,,"type")
USO
Occasional Advisor

Re: Lucene, Solr and filepaths

There are 2 problems with all the Java applications using Lucene as a search engine:

1) Lucene creates a file like xxx123 (of course OpenVMS adds a dot) and then later tries to extract "123" from the filename but gets "123." instead. I solved this problem by modifying the Lucene source code and recompiling it.

2) Lucene fails to update the index. The error messages were not clear to me and I did not find a solution.
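
For problem 1, a tolerant parser might look like the sketch below. It is not Lucene's code and not the modification described above, which the post does not show; SuffixParser and numericSuffix are hypothetical names, and the decimal suffix is assumed from the "xxx123" example.

```java
// Hypothetical helper, not part of Lucene: reads the trailing decimal digits
// of a file name and tolerates one trailing '.' added by the directory listing.
class SuffixParser {
    static long numericSuffix(String fileName) {
        // Drop a single trailing period, e.g. "xxx123." becomes "xxx123".
        String s = fileName.endsWith(".")
                ? fileName.substring(0, fileName.length() - 1)
                : fileName;

        // Walk backwards over the trailing digits.
        int start = s.length();
        while (start > 0 && Character.isDigit(s.charAt(start - 1))) {
            start--;
        }
        if (start == s.length()) {
            throw new NumberFormatException("no numeric suffix in: " + fileName);
        }
        return Long.parseLong(s.substring(start));
    }

    public static void main(String[] args) {
        System.out.println(numericSuffix("xxx123"));  // 123
        System.out.println(numericSuffix("xxx123.")); // 123
    }
}
```

Where DECC$READDIR_DROPDOTNOTYPE is enabled and takes effect, the trailing period should not appear in the first place, so a source change like this may be unnecessary.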
Tim E. Sneddon
Occasional Advisor

Re: Lucene, Solr and filepaths

I've not tried using Solr or Lucene before. However, in my experience using Java software on OpenVMS, the first thing I do is define JAVA$FILENAME_CONTROLS to "0" and then rely entirely on the DECC$* feature logicals.

In my opinion, JAVA$FILENAME_CONTROLS is a pain. Once upon a time, when ODS-5 and the DECC$* logicals were not as well understood or supported, it had a place. However, now it just puts more moving parts and confusion into the mix.

Tim.
Rishi Singhal
Occasional Advisor

Re: Lucene, Solr and filepaths

Hi Ben,

>> I'm wondering if anyone else has tried to make Lucene or Solr work on VMS. Did you get it to work?
Yes, we were able to make Lucene work on OpenVMS. The Java setup we used is:
$ set proc/parse=extended
$ @SYS$COMMON:[JAVA$150.COM]JAVA$150_SETUP.COM
$ define DECC$ARGV_PARSE_STYLE ENABLE
$ define DECC$EFS_CASE_PRESERVE ENABLE
$ define DECC$POSIX_SEEK_STREAM_FILE ENABLE
$ define DECC$EFS_CHARSET ENABLE
$ define DECC$ENABLE_GETENV_CACHE ENABLE
$ define DECC$FILE_PERMISSION_UNIX ENABLE
$ define DECC$FIXED_LENGTH_SEEK_TO_EOF ENABLE
$ define DECC$RENAME_NO_INHERIT ENABLE
$ define DECC$ENABLE_TO_VMS_LOGNAME_CACHE ENABLE
$ FILE_MASK = %x00000008 + %x00040000
$ DEFINE JAVA$FILENAME_CONTROLS 'file_mask'

Regards,
Rishi

I am the person you referred to in
>> I see at least one other user has attempted to make Lucene work on OpenVMS before, but ran into problems which appear to remain unresolved:
>> http://www.lucidimagination.com/search/document/8f4a752f43f34c6a/indexer_crashes_with_hit_exception_during_merge#8e9ea1db106e9cea
Ben Armstrong
Regular Advisor

Re: Lucene, Solr and filepaths

Craig,

That's the golden answer! Yes, the example that ships with Solr works fine after I DEFINE DECC$READDIR_DROPDOTNOTYPE ENABLE.

If I can't get it to go with ODS-2, enough of our clients are now on ODS-5 that we'd just make upgrading a requirement. However, we'd like to avoid that, if we can.

H.Becker,

The container file as an ODS-5 disk is an option I'll keep in mind. Though just switching to ODS-5 seems preferable, if it comes to that.

USO,

Maybe give Solr 1.4.0 a try? Granted, all I've done is run the example, so there may be other problems lurking ahead. But so far it looks good.

Tim,

For my testing this time, I just left JAVA$FILENAME_CONTROLS alone (it is set to -1). If I have any trouble with this, I'll look at the other DECC$ controls instead. Thanks.

Rishi,

Do you still have problems with large numbers of documents as indicated in that thread? I wonder if the Lucene bundled in Solr 1.4.0 has the same issue ...

Thanks, everyone!
Ben

Rishi Singhal
Occasional Advisor

Re: Lucene, Solr and filepaths

I was able to resolve that issue with some changes in the code (not in the Lucene code, but in the wrapper around Lucene that we were using for testing).

If you face the same issue, let me know and I will dig out what we did to resolve it.

Regards,
Rishi
DougR
New Member

Re: Lucene, Solr and filepaths

I'm currently trying to port a wiki extension which uses a Lucene back end to OpenVMS and have hit an issue when Lucene's code tries to open a file. I have set up all the logicals described above, but it throws an exception stating:

no segments* file found in org.apache.lucene.store.FSDirectory@/rootdir/indexes/search/db.links: files:

Have any of you run into a similar issue while indexing? Is there a logical which could aid me in this or do I need to dive into the Lucene code?

Rishi Singhal
Occasional Advisor

Re: Lucene, Solr and filepaths

Hi Doug,

Here are my theories for
>>no segments* file found in org.apache.lucene.store.FSDirectory@/rootdir/indexes/search/db.links: files:

1. In the Lucene code, the exception is thrown as
throw new FileNotFoundException("no segments* file found in " + directory + ": files:" + s);
where "org.apache.lucene.store.FSDirectory" is the fully qualified class name and the directory being searched is
/rootdir/indexes/search/db.links

"db.links" looks wrong here and should be "db/links"

2. Check whether the wiki extension has a directory named db^.links.dir. If so, change it to two separate directories (db.dir and, inside it, links.dir) and see if indexing works.

3. Check whether your wiki extension appends the string "db.links" to the index directory path.
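
To test these theories, it may help to see what Java itself reports for that directory. The sketch below uses only java.io.File; the class name SegmentsCheck is hypothetical, and the directory to inspect is passed on the command line. An empty list after "files:" in the exception message would be consistent with the listing coming back empty or null.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical diagnostic, not part of Lucene: shows what the JVM sees
// in a directory and which entries start with "segments".
class SegmentsCheck {
    static List<String> segmentsFiles(File dir) {
        List<String> found = new ArrayList<>();
        // File.list() returns null if dir is not a directory or cannot be read.
        String[] names = dir.list();
        if (names == null) {
            return found;
        }
        for (String name : names) {
            if (name.startsWith("segments")) {
                found.add(name);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        String[] names = dir.list();
        System.out.println("exists=" + dir.exists()
                + " isDirectory=" + dir.isDirectory());
        // Print the raw names so trailing periods or case changes are visible.
        System.out.println("listing="
                + (names == null ? "null" : Arrays.toString(names)));
        System.out.println("segments files=" + segmentsFiles(dir));
    }
}
```

Running it against /rootdir/indexes/search/db.links and against /rootdir/indexes/search/db/links should show which of the two paths, if either, the JVM resolves to a real directory.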


Regards,
Rishi