<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Re: Brainstorm on Memory issues for Spark in HPE Ezmeral Software platform</title>
    <link>https://community.hpe.com/t5/hpe-ezmeral-software-platform/brainstorm-on-memory-issues-for-spark/m-p/7160733#M215</link>
    <description>&lt;P&gt;I recently encountered Spark always creating 200 partitions after wide transformations. Sometimes I needed fewer and sometimes more. To resolve this I enabled Spark 3.0's adaptive query execution.&lt;/P&gt;&lt;P&gt;Spark 3.0 provides&amp;nbsp;&lt;A href="https://sparkbyexamples.com/spark/spark-3-0-adaptive-query-execution/?swcfpc=1" target="_blank" rel="noopener"&gt;Adaptive Query Execution&lt;/A&gt;, which improves query performance by re-optimizing the query plan at runtime. You can enable it by setting&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;spark.conf.set("spark.sql.adaptive.enabled", true)&lt;/SPAN&gt;&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;DIV&gt;Spark 3 dynamically determines the optimal number of partitions by looking at the metrics of the completed stage. To use this, you also need to enable the following configuration.&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P&gt;spark.conf.set("spark.sql.adaptive.coalescePartitions.enabled", true)&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Thanks&lt;/DIV&gt;</description>
    <pubDate>Thu, 17 Feb 2022 02:32:28 GMT</pubDate>
    <dc:creator>vathi106</dc:creator>
    <dc:date>2022-02-17T02:32:28Z</dc:date>
    <item>
      <title>Brainstorm on Memory issues for Spark</title>
      <link>https://community.hpe.com/t5/hpe-ezmeral-software-platform/brainstorm-on-memory-issues-for-spark/m-p/7118157#M20</link>
      <description>&lt;P&gt;Hi Team,&lt;/P&gt;&lt;P&gt;Have you encountered any kind of memory issue with Spark?&lt;/P&gt;&lt;P&gt;If so, would you like to share your troubleshooting tips?&lt;/P&gt;&lt;P&gt;Thanks~&lt;/P&gt;</description>
      <pubDate>Thu, 21 Jan 2021 18:49:21 GMT</pubDate>
      <guid>https://community.hpe.com/t5/hpe-ezmeral-software-platform/brainstorm-on-memory-issues-for-spark/m-p/7118157#M20</guid>
      <dc:creator>Hao_Zhu</dc:creator>
      <dc:date>2021-01-21T18:49:21Z</dc:date>
    </item>
    <item>
      <title>Re: Brainstorm on Memory issues for Spark</title>
      <link>https://community.hpe.com/t5/hpe-ezmeral-software-platform/brainstorm-on-memory-issues-for-spark/m-p/7118640#M26</link>
      <description>&lt;P&gt;I would say the memory issue is a subtopic of the more general optimization issue in Spark.&lt;/P&gt;&lt;P&gt;Since Spark was designed as an in-memory computation framework, it is naturally more demanding of RAM than legacy MapReduce. Therefore it is always a good idea to design your cluster specification with this in mind.&lt;/P&gt;&lt;P&gt;However, there is no recipe to make your cluster highly utilised and never hit OOM. It is always speculative and subject to change over time. I would argue this is about the balance between stability and cost. With time you gain an understanding of what a reasonable capacity for your workloads is. This is an iterative and dynamic process.&lt;/P&gt;&lt;P&gt;There are multiple layers of memory you should consider before taking action on an OOM issue.&lt;/P&gt;&lt;P class="lia-indent-padding-left-30px"&gt;1. Physical memory: this is what the OS sees when the job is launched. In Linux you check it with "top", "free", etc.&lt;/P&gt;&lt;P&gt;If you're submitting Spark jobs with the YARN ResourceManager, you can diagnose this type of OOM in the container logs:&lt;/P&gt;&lt;PRE&gt;Error: ExecutorLostFailure Reason: Container killed by YARN for exceeding limits.
12.4 GB of 12.3 GB physical memory used.
Consider boosting spark.yarn.executor.memoryOverhead.
Error: ExecutorLostFailure Reason: Container killed by YARN for exceeding limits.
4.5 GB of 3 GB physical memory used.
Consider boosting spark.yarn.executor.memoryOverhead.&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;As suggested, consider boosting "spark.yarn.executor.memoryOverhead". Typically, allocating 1/10 of spark.executor.memory gets rid of it.&lt;/P&gt;&lt;P class="lia-indent-padding-left-30px"&gt;2. Virtual memory: this is your app's physical memory + swap (paged files).&lt;/P&gt;&lt;P&gt;This is managed by the RM and diagnosed by the message below:&lt;/P&gt;&lt;PRE&gt;Container killed by YARN for exceeding memory limits.
1.1gb of 1.0gb virtual memory used. Killing container.&lt;/PRE&gt;&lt;P&gt;It can be solved by disabling the vmem check on the NodeManager:&lt;/P&gt;&lt;PRE&gt;"yarn.nodemanager.vmem-check-enabled":"false"&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P class="lia-indent-padding-left-30px"&gt;3. Java heap space: this is the memory available to the Spark JVM itself (driver/executor).&lt;/P&gt;&lt;P&gt;It can be detected in container logs as the message below:&lt;/P&gt;&lt;PRE&gt;WARN TaskSetManager: Loss was due to
java.lang.OutOfMemoryError
java.lang.OutOfMemoryError: Java heap space&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;You request memory slots for your Spark app from the RM by setting the intrinsic config spark.executor.memory.&lt;/P&gt;&lt;P&gt;In many cases, if this runs out, Spark will try to spill data to disk and no OOM occurs. As a Spark app developer, you can choose not to use disk at all for performance reasons. Then your app fails fast with OOM instead of occupying your cluster's resources.&lt;/P&gt;&lt;P&gt;There are numerous optimisation techniques, however, to lower the memory footprint.&lt;/P&gt;&lt;P&gt;Here are useful links that cover this subject:&lt;/P&gt;&lt;P&gt;&lt;A href="https://0x0fff.com/spark-memory-management/" target="_blank" rel="noopener"&gt;https://0x0fff.com/spark-memory-management/&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://aws.amazon.com/blogs/big-data/best-practices-for-successfully-managing-memory-for-apache-spark-applications-on-amazon-emr/" target="_blank" rel="noopener"&gt;https://aws.amazon.com/blogs/big-data/best-practices-for-successfully-managing-memory-for-apache-spark-applications-on-amazon-emr/&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://g1thubhub.github.io/spark-memory.html" target="_blank" rel="noopener"&gt;https://g1thubhub.github.io/spark-memory.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 26 Jan 2021 13:17:20 GMT</pubDate>
      <guid>https://community.hpe.com/t5/hpe-ezmeral-software-platform/brainstorm-on-memory-issues-for-spark/m-p/7118640#M26</guid>
      <dc:creator>idyptan</dc:creator>
      <dc:date>2021-01-26T13:17:20Z</dc:date>
    </item>
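The 1/10 rule of thumb from the reply above matches Spark's documented default for `spark.yarn.executor.memoryOverhead`: max(10% of executor memory, 384 MiB). A minimal Python sketch of the arithmetic (the helper names are mine, not from the thread or Spark's API):

```python
# Sketch of how YARN executor memory overhead is typically sized,
# mirroring Spark's documented default: max(10% of executor memory, 384 MiB).

def default_memory_overhead_mb(executor_memory_mb: int) -> int:
    """Overhead YARN adds on top of spark.executor.memory."""
    return max(int(executor_memory_mb * 0.10), 384)

def total_container_memory_mb(executor_memory_mb: int) -> int:
    """Executor heap plus overhead: what YARN actually reserves per container."""
    return executor_memory_mb + default_memory_overhead_mb(executor_memory_mb)

if __name__ == "__main__":
    # A 12 GiB executor gets ~1.2 GiB of overhead, so the container
    # needs ~13.2 GiB. Small executors hit the 384 MiB floor instead.
    print(default_memory_overhead_mb(12 * 1024))   # 1228
    print(total_container_memory_mb(12 * 1024))    # 13516
    print(default_memory_overhead_mb(1024))        # 384 (floor applies)
```

This is why a container can be killed for "12.4 GB of 12.3 GB physical memory used" even though the heap itself never overflowed: the overhead portion (off-heap buffers, thread stacks, native allocations) counts against the container limit too.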
    <item>
      <title>Re: Brainstorm on Memory issues for Spark</title>
      <link>https://community.hpe.com/t5/hpe-ezmeral-software-platform/brainstorm-on-memory-issues-for-spark/m-p/7119447#M33</link>
      <description>&lt;P&gt;By default, Spark uses on-heap memory only. The size of the on-heap memory is configured by the --executor-memory or spark.executor.memory parameter when the Spark application starts. The concurrent tasks running inside an Executor share the JVM's on-heap memory.&lt;/P&gt;&lt;P&gt;The on-heap memory area in the Executor can be roughly divided into the following four blocks:&lt;/P&gt;&lt;P&gt;Storage Memory: mainly used to store Spark cache data, such as the RDD cache, broadcast variables, unroll data, and so on.&lt;BR /&gt;Execution Memory: mainly used to store temporary data during the calculation of shuffle, join, sort, aggregation, etc.&lt;BR /&gt;User Memory: mainly used to store the data needed for RDD transformations, such as RDD dependency information.&lt;BR /&gt;Reserved Memory: memory reserved for the system, used to store Spark's internal objects.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;A href="https://support.datafabric.hpe.com/s/article/Spark-Troubleshooting-guide-Memory-Management-How-to-troubleshooting-out-of-memory-OOM-issues-on-Spark-Executor?language=en_US" target="_blank" rel="noopener"&gt;https://support.datafabric.hpe.com/s/article/Spark-Troubleshooting-guide-Memory-Management-How-to-troubleshooting-out-of-memory-OOM-issues-on-Spark-Executor?language=en_US&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 02 Feb 2021 07:23:48 GMT</pubDate>
      <guid>https://community.hpe.com/t5/hpe-ezmeral-software-platform/brainstorm-on-memory-issues-for-spark/m-p/7119447#M33</guid>
      <dc:creator>Vinayak_Meghraj</dc:creator>
      <dc:date>2021-02-02T07:23:48Z</dc:date>
    </item>
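The four regions described above follow Spark's unified memory model (Spark 1.6+). A rough Python sketch of the split, using the documented defaults (300 MiB reserved, `spark.memory.fraction` = 0.6, `spark.memory.storageFraction` = 0.5); the function and key names are mine, and Spark computes the real values from the actual JVM heap size:

```python
# Rough sketch of how Spark's unified memory manager splits the executor heap,
# using documented defaults: 300 MiB reserved, spark.memory.fraction = 0.6,
# spark.memory.storageFraction = 0.5. Figures are approximate.
RESERVED_MB = 300

def memory_regions_mb(heap_mb: int,
                      memory_fraction: float = 0.6,
                      storage_fraction: float = 0.5) -> dict:
    usable = heap_mb - RESERVED_MB        # heap left after the reserved slice
    unified = usable * memory_fraction    # shared storage + execution pool
    storage = unified * storage_fraction  # cached RDDs/broadcasts (evictable)
    execution = unified - storage         # shuffle/join/sort/aggregation buffers
    user = usable - unified               # user data structures, UDF objects
    return {"reserved": RESERVED_MB, "storage": storage,
            "execution": execution, "user": user}

if __name__ == "__main__":
    # A 4 GiB executor heap: ~1.1 GiB each for storage and execution,
    # ~1.5 GiB of user memory, 300 MiB reserved.
    for name, mb in memory_regions_mb(4096).items():
        print(f"{name:>9}: {mb:7.1f} MB")
```

Note that storage and execution borrow from each other at runtime: the storage fraction is only a soft boundary below which cached blocks are protected from eviction.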
    <item>
      <title>Re: Brainstorm on Memory issues for Spark</title>
      <link>https://community.hpe.com/t5/hpe-ezmeral-software-platform/brainstorm-on-memory-issues-for-spark/m-p/7126956#M49</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;Spark is not an "in-memory" solution.&lt;/P&gt;&lt;P&gt;Spark was created by AMPLab to reduce the latency between the map and reduce cycles found in MR1 and MR2.&amp;nbsp;&lt;/P&gt;&lt;P&gt;Spark does rely more on memory (both heap and non-heap), but it also caches to local disk.&lt;/P&gt;</description>
      <pubDate>Tue, 23 Mar 2021 19:26:23 GMT</pubDate>
      <guid>https://community.hpe.com/t5/hpe-ezmeral-software-platform/brainstorm-on-memory-issues-for-spark/m-p/7126956#M49</guid>
      <dc:creator>Michael_Segel</dc:creator>
      <dc:date>2021-03-23T19:26:23Z</dc:date>
    </item>
    <item>
      <title>Re: Brainstorm on Memory issues for Spark</title>
      <link>https://community.hpe.com/t5/hpe-ezmeral-software-platform/brainstorm-on-memory-issues-for-spark/m-p/7126957#M50</link>
      <description>&lt;P&gt;&lt;a href="https://community.hpe.com/t5/user/viewprofilepage/user-id/2031754"&gt;@Hao_Zhu&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;There are many reasons why you could have memory issues with your Spark applications.&lt;/P&gt;&lt;P&gt;You could have very inefficient code, along with issues in the sizing of your Spark job and even of the cluster container space if you're running Spark on your cluster.&lt;/P&gt;&lt;P&gt;There are a lot of factors and places to look.&lt;/P&gt;&lt;P&gt;Can you be more specific about where you are having problems?&lt;/P&gt;&lt;P&gt;Also, which version of Spark and which features are you using?&amp;nbsp; (e.g. Spark SQL, Spark Structured Streaming, etc...)&lt;/P&gt;</description>
      <pubDate>Tue, 23 Mar 2021 19:29:10 GMT</pubDate>
      <guid>https://community.hpe.com/t5/hpe-ezmeral-software-platform/brainstorm-on-memory-issues-for-spark/m-p/7126957#M50</guid>
      <dc:creator>Michael_Segel</dc:creator>
      <dc:date>2021-03-23T19:29:10Z</dc:date>
    </item>
    <item>
      <title>Re: Brainstorm on Memory issues for Spark</title>
      <link>https://community.hpe.com/t5/hpe-ezmeral-software-platform/brainstorm-on-memory-issues-for-spark/m-p/7160733#M215</link>
      <description>&lt;P&gt;I recently encountered Spark always creating 200 partitions after wide transformations. Sometimes I needed fewer and sometimes more. To resolve this I enabled Spark 3.0's adaptive query execution.&lt;/P&gt;&lt;P&gt;Spark 3.0 provides&amp;nbsp;&lt;A href="https://sparkbyexamples.com/spark/spark-3-0-adaptive-query-execution/?swcfpc=1" target="_blank" rel="noopener"&gt;Adaptive Query Execution&lt;/A&gt;, which improves query performance by re-optimizing the query plan at runtime. You can enable it by setting&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;spark.conf.set("spark.sql.adaptive.enabled", true)&lt;/SPAN&gt;&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;DIV&gt;Spark 3 dynamically determines the optimal number of partitions by looking at the metrics of the completed stage. To use this, you also need to enable the following configuration.&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P&gt;spark.conf.set("spark.sql.adaptive.coalescePartitions.enabled", true)&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Thanks&lt;/DIV&gt;</description>
      <pubDate>Thu, 17 Feb 2022 02:32:28 GMT</pubDate>
      <guid>https://community.hpe.com/t5/hpe-ezmeral-software-platform/brainstorm-on-memory-issues-for-spark/m-p/7160733#M215</guid>
      <dc:creator>vathi106</dc:creator>
      <dc:date>2022-02-17T02:32:28Z</dc:date>
    </item>
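The two `spark.conf.set` calls in the post above can equally be set once for every job in `spark-defaults.conf` (a config fragment, not from the thread; both keys are documented Spark 3.x settings):

```
spark.sql.adaptive.enabled                      true
spark.sql.adaptive.coalescePartitions.enabled   true
```

With both enabled, AQE coalesces the post-shuffle partitions based on the completed stage's runtime statistics, so the fixed `spark.sql.shuffle.partitions=200` default no longer dictates the partition count after wide transformations.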
  </channel>
</rss>

