Spark job output is not available immediately on F...
09-09-2024 01:42 PM
There is a time delay between when the Hadoop job completes and when its output becomes available for other jobs to use. The first job completes successfully and its output feeds a second job, but sometimes I get a FileNotFoundException; about a minute later, the file is visible in the same location.
This only happens in roughly one in a hundred jobs, and it's not consistent. Can someone please let me know what's going wrong and why the output file is not available immediately after the job completes?
I've checked the server resource usage: it's under-utilized and everything looks good, but I don't know what is causing the issue. Can someone please help me with this?
Solved!
09-12-2024 12:54 AM
Solution
Good day!
This might occur due to a delay in how Hadoop synchronizes its files after the completion of a Spark job. Even though the first job completes successfully, the output might not be fully written to the filesystem, especially in a distributed environment.
The cause might be that the file exists in the system, but HDFS has not yet fully replicated the data across the nodes, or the metadata has not been updated, leading to a temporary visibility delay.
Spark jobs use the Hadoop Job Commit Protocol, and in some scenarios, speculative execution can cause multiple tasks to write output to the same location. If some tasks finish slightly before others, the job may appear complete before the files are fully visible on the filesystem.
You can try disabling speculative execution by setting the following configuration in your Spark job:
"spark.speculation = false" (this ensures that no extra tasks are run, reducing the chances of inconsistent outputs.)
- Introduce a small delay or retry mechanism in your second job to wait until the output becomes available, for example "Thread.sleep(60000)" to wait 60 seconds before retrying. Alternatively, you can implement a loop that checks for the file's existence before proceeding; see the sketch after this list.
- In some distributed environments, network issues can cause latency between nodes when replicating files. Even if your system resources are under-utilized, intermittent network hiccups could cause short-lived delays in file visibility.
- Some file systems might cache the file's metadata or contents, which can result in a delay in propagating file changes, especially across different nodes. Make sure that the job is not using any caching mechanisms that delay visibility.
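As a hedged sketch of the retry idea mentioned above (the output path, retry count, and interval below are placeholders, not values from this thread), the second job can poll for the _SUCCESS marker that the committed job writes, using the Hadoop FileSystem API, before it starts reading:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Sketch: poll for the _SUCCESS marker written when the first job commits,
// and only let the downstream job read the directory once it is visible.
def waitForOutput(conf: Configuration, outputDir: String,
                  retries: Int = 10, intervalMs: Long = 60000L): Boolean = {
  val fs = FileSystem.get(conf)
  val marker = new Path(outputDir, "_SUCCESS")
  var attempt = 0
  while (attempt < retries) {
    if (fs.exists(marker)) return true   // output is committed and visible
    Thread.sleep(intervalMs)             // wait before checking again
    attempt += 1
  }
  false
}

// Hypothetical usage from a running Spark job:
// waitForOutput(spark.sparkContext.hadoopConfiguration, "hdfs:///data/job1/output")
```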
I hope this gives you some insights to resolve your issue; let me know how it goes.