HPE StoreOnce 6500 object failed to start

Waseem_2 · 2 weeks ago

We have a 4-node HPE StoreOnce 6500 cluster. After a cache module replacement in node 4, StoreOnce worked fine. One day later, I observed an error on node 3, but I cannot find any hardware issues. The disks, volumes, and pool do not report any errors. However, an error is reported on the object 'FIHE6-CAT3A-3D2-ORA1' (failed to start). The error below was observed in the StoreOnce logs on node 3.

ad Stores for ssid3 :
--------------------------------------------------------
0[Id:13]  failed on  Oct 3 20:09 EEST with Exception Details: 
serialization::archive
‹
Exception [HousekeepingException]: thrown in (Reclaimer.cpp):149 processContainerAsync
Error (62) Timer expired

Exception [StorageMgrException]: thrown in (StorageMgrVFS.cpp):1692 sync_header
Error (62) Timer expired
Failed to (write) sync header



Caused by:

    /var/cache/jenkins/vagrant/dedupe/src/adapters/vt/rpmbuild/BUILD/d2d-dedupe-21.14.36173/src/corededupe/storagemgr/StorageMgrVFS.cpp(1692): Throw in function sync_header
    Dynamic exception type: boost::exception_detail::clone_impl<boost::exception_detail::current_exception_std_exception_wrapper >
    std::exception::what: Failed to (write) sync header
    [boost::errinfo_errno_*] = 62, "Timer expired"
    [boost::tag_original_exception_type*] = N6dedupe10storagemgr19StorageMgrExceptionE
    </boost::exception_detail::current_exception_std_exception_wrapper
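
For anyone reading the trace: errno 62 is ETIME ("Timer expired") on Linux, and the `tag_original_exception_type` value is a C++ type name in Itanium ABI mangled form, where a nested name is written as `N`, then length-prefixed components, then `E`. The sketch below decodes that one value. The helper name `demangle_nested` is hypothetical, and it covers only this simple nested-name form, not the full mangling grammar.

```python
def demangle_nested(mangled: str) -> str:
    """Decode a simple Itanium-ABI nested name such as N3foo3barE.

    Only plain length-prefixed components are handled; templates,
    substitutions, and qualifiers are out of scope for this sketch.
    """
    if not (mangled.startswith("N") and mangled.endswith("E")):
        raise ValueError("not a simple nested name")
    body = mangled[1:-1]
    parts = []
    i = 0
    while i < len(body):
        # Read the decimal length prefix of the next component.
        j = i
        while j < len(body) and body[j].isdigit():
            j += 1
        if j == i:
            raise ValueError(f"expected a length at offset {i}")
        length = int(body[i:j])
        name = body[j:j + length]
        if len(name) != length:
            raise ValueError("component is shorter than its length prefix")
        parts.append(name)
        i = j + length
    return "::".join(parts)


# Value copied from the trace above.
print(demangle_nested("N6dedupe10storagemgr19StorageMgrExceptionE"))
```

The decoded name is `dedupe::storagemgr::StorageMgrException`, which matches the StorageMgrException reported at StorageMgrVFS.cpp:1692.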

Azr_geek · a week ago

Hello @Waseem_2,

Log in via SSH or use the CLI:

storeonce-catalyst --show
storeonce-catalyst-store --name FIHE6-CAT3A-3D2-ORA1 --status

This might give more detail about whether the store is offline, corrupted, or in recovery.

Review Logs for Specific Error Lines
Search logs on Node 3 for errors around the time the issue appeared:
grep FIHE6-CAT3A-3D2-ORA1 /var/log/storeonce/*
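
If you are working from copies of the logs exported off the appliance, the same search can be sketched in Python, with a few preceding lines kept for context. This is only an illustration: the function name `lines_with_context`, the context size, and the sample lines are made up, and it assumes nothing about the log format beyond plain text.

```python
from collections import deque
from typing import Iterable, Iterator


def lines_with_context(lines: Iterable[str], needle: str, before: int = 2) -> Iterator[str]:
    """Yield each line containing `needle`, preceded by up to `before` earlier lines."""
    recent = deque(maxlen=before)  # rolling buffer of non-matching lines
    for line in lines:
        line = line.rstrip("\n")
        if needle in line:
            yield from recent      # emit buffered context first
            recent.clear()
            yield line
        else:
            recent.append(line)


# Made-up sample lines, not real StoreOnce log output.
sample = [
    "housekeeping started",
    "sync header timeout",
    "store FIHE6-CAT3A-3D2-ORA1 failed to start",
    "housekeeping finished",
]
for hit in lines_with_context(sample, "FIHE6-CAT3A-3D2-ORA1", before=1):
    print(hit)
```

To run it on a real file, pass an open file object instead of `sample`.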

Try Restarting StoreOnce Services on Node 3

If this is a software-level issue, restarting services may resolve transient errors:
systemctl restart catalystd
systemctl restart storeonce-services

Check Replication / StoreSync Status

If replication is involved (e.g., this object is part of a DR or StoreOnce Replication job), run:
storeonce-replication --status

Clean up Stale Jobs (if applicable)

Sometimes a failed or zombie job prevents a store from restarting.
storeonce-jobs --list
storeonce-jobs --abort <job_id>

If this object is critical for backup/restore operations (e.g., tied to an Oracle RMAN process), ensure that no jobs are actively trying to access it when restarting services.
If the object is corrupted, it may require manual intervention or recreation of the Catalyst store.
In a 4-node cluster, data protection through redundancy (RAID and replication) helps, but metadata corruption is still critical to address quickly.

Regards,
Azr_geek

Waseem_2 · a week ago

Hi @Azr_geek ,

In the StoreOnce events, the errors below were recorded when the issue started.

381368,Sep 30 13:36:34,ALERT,E0805001b,System,[ssid3] : The system has detected a potential filesystem issue
381369,Sep 30 13:36:35,ALERT,E08050000,System,[ssid3] : D2DManager process state has changed: OK to Critical (SMM)
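
The event lines above are comma-separated, so they can be split into fields to filter alerts for one node. A minimal Python sketch follows. The field names (`event_id`, `timestamp`, `severity`, `code`, `category`, `message`) are assumptions inferred from these two lines, not a documented StoreOnce schema.

```python
from typing import NamedTuple


class Event(NamedTuple):
    event_id: int
    timestamp: str
    severity: str
    code: str
    category: str
    message: str


def parse_event(line: str) -> Event:
    # Split on the first five commas only, so commas inside the message survive.
    event_id, timestamp, severity, code, category, message = line.split(",", 5)
    return Event(int(event_id), timestamp, severity, code, category, message.strip())


# The two event lines quoted above.
events = [parse_event(raw) for raw in (
    "381368,Sep 30 13:36:34,ALERT,E0805001b,System,[ssid3] : The system has detected a potential filesystem issue",
    "381369,Sep 30 13:36:35,ALERT,E08050000,System,[ssid3] : D2DManager process state has changed: OK to Critical (SMM)",
)]

# Keep only alerts that mention node 3 (ssid3).
alerts = [e for e in events if e.severity == "ALERT" and "[ssid3]" in e.message]
for e in alerts:
    print(e.timestamp, e.code, e.message)
```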

Waseem_2 · a week ago

Hi @Azr_geek ,

I think we need to log in as root with the HP Support password, but this unit is not covered under an SLA. Please advise if there is any workaround.

Azr_geek · Sunday

StoreOnce has detected a potential corruption or inconsistency in the internal filesystem used to store deduplicated objects, likely impacting object mount/load operations.

On Node 3 (ssid3), this is not necessarily a physical disk failure; it is more likely a problem with metadata or file structure in the deduplication store.

1. Run Filesystem Integrity Checks

On Node 3, run the built-in StoreOnce filesystem checker: fscheck --node 3 --all

2. Check D2DManager Logs on Node 3

Examine logs related to D2DManager and object handling:
less /var/log/storeonce/d2dmanager.log
less /var/log/storeonce/object.log

3. Restart D2DManager Services (if allowed)

If the filesystem is clean but D2DManager is still in a bad state, try restarting StoreOnce services (only if this won't disrupt live operations): systemctl restart storeonce

If this is a production system with active jobs, consult with HPE before restarting services.
