
[BUG]: Executor goes stale #1222

@junaidiqbal0022

Hello,
I am reading data from Parquet files as DataFrames and writing them to Iceberg using Spark.
I have many small files, several of which may be written to the same table.
After some files have been written, the executor goes stale and hangs; the running query makes no further progress. The write call is:

  df.WriteTo(fullTableName)
    .Options(tableOptions)
    .Append();

The stalled query, which is still shown as running, looks like this:

append at <unknown>:0

org.apache.spark.sql.DataFrameWriterV2.append(DataFrameWriterV2.scala:153)
jdk.internal.reflect.GeneratedMethodAccessor104.invoke(Unknown Source)
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.base/java.lang.reflect.Method.invoke(Method.java:568)
org.apache.spark.api.dotnet.DotnetBackendHandler.handleMethodCall(DotnetBackendHandler.scala:167)
org.apache.spark.api.dotnet.DotnetBackendHandler.$anonfun$handleBackendRequest$2(DotnetBackendHandler.scala:105)
org.apache.spark.api.dotnet.ThreadPool$$anon$1.run(ThreadPool.scala:34)
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
java.base/java.lang.Thread.run(Thread.java:842)

NuGet version: 2.3.0

Spark version: 3.5.3

Hadoop version: 3.4.1

JDK version: 17

Labels: bug
