Server Management - Systems Insight Manager

MXDTF Restarting & Can't Log into SIM (Sometimes)

 

SIM 5.3 SP1

I'm plagued with mxdtf restarting itself constantly. I'm also experiencing the exact same problem described here almost daily. I believe both problems might be related.
http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?lang=en&cc=us&taskId=110&prodSeriesId=428936&prodTypeId=15351&prodSeriesId=428936&objectID=c00726235

I can fix the problem when it occurs but then it comes back sporadically.

I need to get this fixed because I have tasks which can't run w/o mxdtf running in memory.

I've tried all the tricks with no luck.
1. Dumped config\sshtools, ran mxinitconfig to initialize the Agent and SSH keys, and imported the CMS server using HP's instructions. This works for a while, but then the problem comes right back.
2. Dumped and reloaded OpenSSH.
3. When I can't log into SIM, I have to run mxstop and mxconfigrepo -F. SIM starts fine, but then mxdtf gets stuck in the restart loop again.

I have an open case with HP and they're trying to call it database corruption and rebuild. They are still investigating, because a rebuild isn't an option for me given the months of time invested in customizing SIM.

Has anyone experienced this behavior?

Is there a link between the SSH user and OpenSSH that's not working? I'm at a loss with this one.

Re: MXDTF Restarting & Can't Log into SIM (Sometimes)

After talking with HP Escalation, they just suggested an OS reload and a rebuild of SIM, without getting their hands dirty and looking into the code to see what was actually happening.

We were able to find the root cause of MXDTF getting stuck in a restart loop each time our server was rebooted.

The problem lies within the KNOWN_HOSTS file under the folder:
\Program Files\HP\Systems Insight Manager\config\sshtools\

We found a few servers where the ILO DNS name contained a space. Seemed like a simple typo when someone originally built the server. Wasn't me.. :)

I removed the space from the name in the ILO network config for both servers, and from the DNS A record as well. I then edited the KNOWN_HOSTS file, removed the space from the ILO name on the public key entries, and restarted the SIM service.

Tadda!! Mxdtf is behaving now. Problem Solved!!

This issue occurs because the Java code used by SIM writes the name, space included, to the KNOWN_HOSTS file, but when MXDTF starts and reads through the file, it barks because it doesn't understand the space in the name.

This explains why the HP recommendation to delete all files under sshtools and run mxinitconfig to rebuild new keys fixes the issue only temporarily. As soon as those bad DNS names are written into KNOWN_HOSTS again, we are back in the same loop, but only after the SIM server has been rebooted.

Moral of the story: make sure none of your DNS records contain spaces. We have a script scrubbing our DNS now!!
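For anyone who wants to check their own file before the next reboot, here is a minimal Python sketch of that kind of check. It is not the script mentioned above and not anything from HP. It assumes KNOWN_HOSTS uses the usual OpenSSH-style layout (host names, then key type, then key), so a host name containing a space pushes something other than a key type into the second field. The function name and the list of key types are my own choices for illustration.

```python
# Key type names commonly seen in known_hosts files (assumed list, not exhaustive).
KEY_TYPES = {
    "ssh-rsa",
    "ssh-dss",
    "ssh-ed25519",
    "ecdsa-sha2-nistp256",
    "ecdsa-sha2-nistp384",
    "ecdsa-sha2-nistp521",
}


def find_suspect_lines(text):
    """Return (line_number, line) pairs whose second field is not a key type.

    Assumes each entry looks like "<hosts> <keytype> <key>". A host name
    with a space in it shifts the fields, so the second field is no longer
    a recognised key type and the line gets reported.
    """
    suspects = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines and comments.
        if not stripped or stripped.startswith("#"):
            continue
        fields = stripped.split()
        if len(fields) < 3 or fields[1] not in KEY_TYPES:
            suspects.append((number, stripped))
    return suspects
```

To use it, read the KNOWN_HOSTS file from config\sshtools into a string, pass it to `find_suspect_lines`, and review whatever comes back by hand. It only reports lines and never edits the file.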

Re: MXDTF Restarting & Can't Log into SIM (Sometimes)

We had the same issue with mxdtf restarting itself constantly. However, the fix above did not work for us. We are running HP SIM 5.3 SP2.

Instead, we found that the globalsettings.props file was corrupted: it was around 2 MB in size instead of the usual 8-15 KB. After cleaning out a lot of "//"s and duplicate entries and restarting the HP SIM service, the mxdtf error went away and HP SIM started normally. We then found that the RSP component did not send a test alert, and we had to reinstall and update the components. Since then HP SIM has been working reasonably well.
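A rough Python sketch of how the duplicate entries could be spotted before cleaning by hand. It assumes the file uses plain key=value lines and that lines starting with "#", "!" or "//" carry no settings. Both are assumptions on my part, not something confirmed by HP, and the function name is made up for illustration.

```python
from collections import Counter


def duplicate_keys(text):
    """Return {key: count} for every key that appears more than once.

    Assumes simple "key=value" lines. Lines that are blank or start with
    "#", "!" or "//" are treated as non-settings and skipped.
    """
    counts = Counter()
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith(("#", "!", "//")):
            continue
        # Everything before the first "=" is taken as the key.
        key = stripped.split("=", 1)[0].strip()
        counts[key] += 1
    return {key: n for key, n in counts.items() if n > 1}
```

Feed it the contents of globalsettings.props (from a copy, with the HP SIM service stopped) and it lists the keys worth looking at. Checking the file size first is still the quickest hint that something is wrong.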

Re: MXDTF Restarting & Can't Log into SIM (Sometimes)

Wow, thanks for that post. I just upgraded our SIM from version 7.0 to 7.1 and had the same problem. The file globalsettings.props was 13 MB in size and full of \u0000, which I removed. Unfortunately, lots of records went missing from that file. The good news is that the services came up OK. However, when logging on to the web page, the first-time wizard launched. Next time I'll take a copy of that file up front to be able to quickly recover the settings.

 

So the upgrade process has broken that file and it is a little bit scary to see that the software still has the same bug.
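In case it helps someone, a small Python sketch of the two steps I would do next time: take a timestamped copy of the file before the upgrade, and strip the \u0000 junk afterwards. I am not sure whether the junk is raw NUL bytes or the literal six-character text \u0000, so the sketch removes both. The function names are made up, and this is an untested-on-SIM illustration rather than an official procedure.

```python
import shutil
import time
from pathlib import Path


def backup_file(path):
    """Copy the file next to itself with a timestamped .bak suffix."""
    source = Path(path)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = source.with_name(f"{source.name}.{stamp}.bak")
    shutil.copy2(source, target)  # copy2 also keeps the modification time
    return target


def strip_nuls(data: bytes) -> bytes:
    """Remove raw NUL bytes and literal backslash-u-0000 escape sequences."""
    return data.replace(b"\x00", b"").replace(b"\\u0000", b"")
```

Run `backup_file` on globalsettings.props before the upgrade. If the file comes out bloated, stop the services, read it as bytes, pass it through `strip_nuls`, write the result back, and compare it against the backup to find the records that went missing.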

 

Thorsten