NOCALLTRANS and SSRVEXCEPT process bugchecks

 
Richard Jordan
Regular Advisor

I think I've exhausted all my other options for analyzing this; the system is not under support any more. The problem occurred some time ago, and unfortunately not all the info needed to debug was saved; at this point all we really have is the error log output since the operator, audit, and accounting logs have long since been purged, and no process dumps remain, if they were ever created. Nor can we re-establish the configuration that had the problem, or install the hardware mentioned, until we can work out what likely caused the problem. I've never seen the particular errors before.

AS1200 5/533 dual, 1.5GB RAM. OVMS 7.2-2 with UPDATE V1.0, TCPIP V5.1 (no ECO). The system disk is on a KZPBA-CX UWSE bus; data storage is on an HBA RAID controller (it identifies itself as a DAC960; I believe it is a KZPAC). In this configuration the system runs fine. Note that the patches listed were current at the time of the test.

The site attempted to set up fiber storage using a redundant pair of HSG60 controllers and a KGPSA-CA host adapter, moving the disks from the raid cabinet to use in the fiber-attached shelves (good backups were taken, drives initialized and restored). The config had been tested with smaller drives on a backup server before trying to switch.

On the first day of production use they experienced numerous process terminations; from the user POV it seemed they just got disconnected and had to log back on; it only affected telneted users, though since the vast majority used telnet that may not be relevant.

The error log showed both NOCALLTRANS and SSRVEXCEPT bugchecks corresponding to the failures. There is no known customer-translated code on the system (for the NOCALLTRANS error). The attachment is a sample of one of each of the bugcheck entries. The PC is the same for each occurrence of NOCALLTRANS, and the same for each occurrence of SSRVEXCEPT; I don't have a 7.2-2 system with equivalent patches to see if they point at anything relevant.

After a few hours they were forced to move the storage back to the raid controller and restore. The problems stopped at that point, implying some connection to the fiber or controllers. The customer really needs the performance increase so we're trying to track this one down so we know we can prevent it.

The system has since been updated to current class 1 patches, but there was nothing in the release notes that matched this problem.

Any thoughts, or previous experience would be appreciated, even if it is just to say there's no way to tell with the minimal info available.
9 REPLIES
Lokesh_2
Esteemed Contributor

Re: NOCALLTRANS and SSRVEXCEPT process bugchecks

Hi,

This error most often occurs when a program created for the VAX architecture is VESTed (translated), linked and run on the Alpha. Recompile the native code using the /TIE qualifier and relink the application using the /NONATIVE_ONLY qualifier.
From the system help:
___________________________________
NOCALLTRANS, code at 'address' cannot call translated code

Facility: SYSTEM, System Services

Explanation: The autojacketing routine EXE$NATIVE_TO_TRANSLATED terminated
the user program. The native routine containing the specified
address does not have a procedure signature block associated
with it.

User Action: Compile the native routine using the /TIE qualifier. The
debugger can identify the native routine by the address cited
in the message.

To enable the native routine to call a translated routine, use
the /NONATIVE_ONLY default for the LINK command when linking
the image that contains the native routine.

____________________________________
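
As a rough sketch of what that User Action looks like in practice, assuming the native routine is a C module built with the DEC C compiler (the file name CALLER.C is made up for illustration, and a real link would also need whatever shareable images or options files the application uses):

```dcl
$! Hypothetical example: CALLER.C is a native Alpha routine that needs
$! to call into a translated (VESTed) image.
$!
$! Compile with /TIE, as the help text's User Action instructs.
$ CC/TIE CALLER.C
$!
$! Link with /NONATIVE_ONLY so the image is allowed to call
$! translated routines.
$ LINK/NONATIVE_ONLY CALLER.OBJ
```

This only helps when you have the source for the native code that makes the call; it does not apply to images you cannot rebuild.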


Also, since you are seeing this problem with telnet users only, updating TCPIP with the current ECOs could be a starting point...

Best wishes,
Lokesh Jain
What would you do with your life if you knew you could not fail?
Richard Jordan
Regular Advisor

Re: NOCALLTRANS and SSRVEXCEPT process bugchecks

Lokesh,
thank you for responding. Unfortunately there is no known user-built vested code on the system; a pretty thorough audit was performed to be certain. Plus there is the fact that the problem only occurred when the fiber drives were installed... That's why we still don't have a resolution; the problem doesn't make sense based on the reported error (at least that one).

Rich Jordan
Martin P.J. Zinser
Honored Contributor

Re: NOCALLTRANS and SSRVEXCEPT process bugchecks

Hello Rich,

there is an installation rating 2 patch kit out for 7.2-2 that addresses numerous problems with fiber attached storage (FIBER_SCSI V400). While it does not mention your problem explicitly I'd certainly install it for safety reasons before the next test (it is recommended for all installations that use fiber storage).

I would also recommend ECO5 for TCP/IP 5.1 to cover the bases.

Greetings, Martin
Richard Jordan
Regular Advisor

Re: NOCALLTRANS and SSRVEXCEPT process bugchecks

Martin,
thank you for responding. The most recent update and any newer class 1, FIBRE_SCSI, and TCPIP patches have in fact been installed over the last month or so. The site is not willing to reconnect the fiber storage (for now) due to the amount of work involved, since all the data disks have to be moved and rebuilt, until (and presumably _if_) they are able to locate the cause of the original problem. They don't want to set the fiber storage back up just to have to back it out again if the problem recurs. The patches are one of the reasons I mentioned that we could not duplicate the problem configuration any more.

If we come up empty here, though, that's what they're going to have to do...

Thanks again!

Rich
Martin P.J. Zinser
Honored Contributor

Re: NOCALLTRANS and SSRVEXCEPT process bugchecks

Hello Rich,

can you run the old and new in parallel for a while? This is what we generally do when phasing in a new storage array type. First we just connect it up and let it run empty for a while. Then we move one not so critical volume and then gradually the rest of the stuff (assuming we are not running into problems of course ;-)

Greetings, Martin
Richard Jordan
Regular Advisor

Re: NOCALLTRANS and SSRVEXCEPT process bugchecks

That is how we're going to recommend proceeding if the cause of the previous problems cannot be worked out. After all if the info available really is inadequate then there's not much choice other than trying to generate more complete and current info, or verify that the problem is no longer occurring even if we cannot guarantee why.

Thanks for responding again. I'm going to leave this open for another couple of days to see if anyone has experienced anything like this before... ahh for those darned audit/accounting logs...

Rich
Richard Jordan
Regular Advisor

Re: NOCALLTRANS and SSRVEXCEPT process bugchecks

Thanks to all who responded. Guess it's a real mystery problem. We're going to try to arrange reconnecting the fiber arrays and just go from there; hopefully the patches already installed will have a positive effect.
George Busccher
New Member

Re: NOCALLTRANS and SSRVEXCEPT process bugchecks

Richard,
Were you able to find the solution to this problem? I happen to be in almost the exact same situation. I have the same hardware and am running 7.2-2 with all ECOs.
Ian Miller.
Honored Contributor

Re: NOCALLTRANS and SSRVEXCEPT process bugchecks

I think TCPIP includes some VESTed code in the SMTP server (a reliance on VAXSCAN, perhaps). There may be other things with VESTed code that you don't know about. For the curious, I discovered the above reliance when I once installed VMS without support for translated code and found SMTP did not work. That was VMS V7.3-1 with TCPIP 5.3.
____________________
Purely Personal Opinion