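Problem description

A Flink job submitted to the cluster was accepted successfully, but every run failed after a long wait with the same error: Could not allocate the required slot within slot request timeout. Please make sure that the cluster has enough resources.

Meanwhile, Spark jobs and Flink's official batch WordCount example both ran successfully on the same cluster.

Error details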

[root@BigData04 flink-1.11.3]# bin/flink run -m yarn-cluster -yjm 1024 -ytm 1024 -c com.yuan.java.stream.SocketWindowWordCountJava ./examples/test/db_flink-1.0-SNAPSHOT-jar-with-dependencies.jar
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data/software/flink-1.11.3/lib/log4j-slf4j-impl-2.12.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data/software/hadoop-3.3.0/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2021-07-14 12:24:03,968 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                [] - Found Yarn properties file under /tmp/.yarn-properties-root.
2021-07-14 12:24:03,968 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                [] - Found Yarn properties file under /tmp/.yarn-properties-root.
2021-07-14 12:24:05,534 WARN  org.apache.flink.yarn.configuration.YarnLogConfigUtil        [] - The configuration directory ('/data/software/flink-1.11.3/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
2021-07-14 12:24:05,741 INFO  org.apache.hadoop.yarn.client.RMProxy                        [] - Connecting to ResourceManager at BigData01/192.168.93.128:8032
2021-07-14 12:24:06,132 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  [] - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2021-07-14 12:24:06,335 WARN  org.apache.flink.yarn.YarnClusterDescriptor                  [] - Neither the HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink YARN Client needs one of these to be set to properly load the Hadoop configuration for accessing YARN.
2021-07-14 12:24:06,393 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  [] - Cluster specification: ClusterSpecification{masterMemoryMB=1024, taskManagerMemoryMB=1024, slotsPerTaskManager=1}
2021-07-14 12:24:25,940 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  [] - Submitting application master application_1625991246671_0012
2021-07-14 12:24:26,044 INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl        [] - Submitted application application_1625991246671_0012
2021-07-14 12:24:26,045 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  [] - Waiting for the cluster to be allocated
2021-07-14 12:24:26,050 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  [] - Deploying cluster, current state ACCEPTED
2021-07-14 12:24:39,208 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  [] - YARN application has been deployed successfully.
2021-07-14 12:24:39,209 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  [] - Found Web Interface bigdata02:32936 of application 'application_1625991246671_0012'.
Job has been submitted with JobID 86d8abb86082429a8bc55fb4b9fa8cee
------------------------------------------------------------
The program finished with the following exception:
org.apache.flink.client.program.ProgramInvocationException: The main method caused an error: org.apache.flink.client.program.ProgramInvocationException: Job failed (JobID: 86d8abb86082429a8bc55fb4b9fa8cee)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:302)
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:198)
    at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:149)
    at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:699)
    at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:232)
    at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:916)
    at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:992)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
    at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:992)
Caused by: java.util.concurrent.ExecutionException: org.apache.flink.client.program.ProgramInvocationException: Job failed (JobID: 86d8abb86082429a8bc55fb4b9fa8cee)
    at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
    at org.apache.flink.client.program.StreamContextEnvironment.getJobExecutionResult(StreamContextEnvironment.java:116)
    at org.apache.flink.client.program.StreamContextEnvironment.execute(StreamContextEnvironment.java:80)
    at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1700)
    at com.yuan.java.stream.SocketWindowWordCountJava.main(SocketWindowWordCountJava.java:54)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:288)
    ... 11 more
Caused by: org.apache.flink.client.program.ProgramInvocationException: Job failed (JobID: 86d8abb86082429a8bc55fb4b9fa8cee)
    at org.apache.flink.client.deployment.ClusterClientJobClientAdapter.lambda$null$6(ClusterClientJobClientAdapter.java:116)
    at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
    at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
    at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
    at org.apache.flink.client.program.rest.RestClusterClient.lambda$pollResourceAsync$22(RestClusterClient.java:602)
    at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
    at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
    at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
    at org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$8(FutureUtils.java:309)
    at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
    at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
    at java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:561)
    at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:929)
    at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
    at org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:147)
    at org.apache.flink.client.deployment.ClusterClientJobClientAdapter.lambda$null$6(ClusterClientJobClientAdapter.java:114)
    ... 19 more
Caused by: org.apache.flink.runtime.JobException: Recovery is suppressed by NoRestartBackoffTimeStrategy
    at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:116)
    at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:78)
    at org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskFailure(DefaultScheduler.java:192)
    at org.apache.flink.runtime.scheduler.DefaultScheduler.maybeHandleTaskFailure(DefaultScheduler.java:185)
    at org.apache.flink.runtime.scheduler.DefaultScheduler.updateTaskExecutionStateInternal(DefaultScheduler.java:179)
    at org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:503)
    at org.apache.flink.runtime.scheduler.UpdateSchedulerNgOnInternalFailuresListener.notifyTaskFailure(UpdateSchedulerNgOnInternalFailuresListener.java:49)
    at org.apache.flink.runtime.executiongraph.ExecutionGraph.notifySchedulerNgAboutInternalTaskFailure(ExecutionGraph.java:1710)
    at org.apache.flink.runtime.executiongraph.Execution.processFail(Execution.java:1287)
    at org.apache.flink.runtime.executiongraph.Execution.processFail(Execution.java:1255)
    at org.apache.flink.runtime.executiongraph.Execution.markFailed(Execution.java:1086)
    at org.apache.flink.runtime.executiongraph.ExecutionVertex.markFailed(ExecutionVertex.java:748)
    at org.apache.flink.runtime.scheduler.DefaultExecutionVertexOperations.markFailed(DefaultExecutionVertexOperations.java:41)
    at org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskDeploymentFailure(DefaultScheduler.java:435)
    at org.apache.flink.runtime.scheduler.DefaultScheduler.lambda$assignResourceOrHandleError$6(DefaultScheduler.java:422)
    at java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:822)
    at java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:797)
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
    at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
    at org.apache.flink.runtime.jobmaster.slotpool.SchedulerImpl.lambda$internalAllocateSlot$0(SchedulerImpl.java:168)
    at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
    at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
    at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
    at org.apache.flink.runtime.jobmaster.slotpool.SlotSharingManager$SingleTaskSlot.release(SlotSharingManager.java:726)
    at org.apache.flink.runtime.jobmaster.slotpool.SlotSharingManager$MultiTaskSlot.release(SlotSharingManager.java:537)
    at org.apache.flink.runtime.jobmaster.slotpool.SlotSharingManager$MultiTaskSlot.lambda$new$0(SlotSharingManager.java:432)
    at java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:822)
    at java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:797)
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
    at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
    at org.apache.flink.runtime.concurrent.FutureUtils.lambda$forwardTo$21(FutureUtils.java:1120)
    at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
    at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
    at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
    at org.apache.flink.runtime.concurrent.FutureUtils$Timeout.run(FutureUtils.java:1036)
    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:402)
    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:195)
    at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74)
    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
    at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
    at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
    at scala.PartialFunction.applyOrElse(PartialFunction.scala:123)
    at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122)
    at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
    at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
    at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
    at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
    at akka.actor.Actor.aroundReceive(Actor.scala:517)
    at akka.actor.Actor.aroundReceive$(Actor.scala:515)
    at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
    at akka.actor.ActorCell.invoke(ActorCell.scala:561)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
    at akka.dispatch.Mailbox.run(Mailbox.scala:225)
    at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
    at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Could not allocate the required slot within slot request timeout. Please make sure that the cluster has enough resources.
    at org.apache.flink.runtime.scheduler.DefaultScheduler.maybeWrapWithNoResourceAvailableException(DefaultScheduler.java:441)
    ... 47 more
Caused by: java.util.concurrent.CompletionException: java.util.concurrent.TimeoutException
    at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
    at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
    at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:593)
    at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
    ... 27 more
Caused by: java.util.concurrent.TimeoutException
    ... 25 more

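Cause

First check whether the cluster is actually short of resources, then check whether the job was packaged incorrectly.

In my case the cause was a misconfigured POM file.

Several <scope>provided</scope> declarations were missing from the POM, so some Flink dependencies were bundled into the job JAR.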
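
One way to confirm this kind of packaging problem before touching the POM is to look inside the built JAR for classes that normally ship with the Flink distribution. The sketch below is only an illustration: the class name `BundledFlinkClassCheck` and the choice of `org/apache/flink/runtime/` as the marker prefix are assumptions, picked because runtime classes are expected to come from the cluster rather than from the job JAR.

```java
import java.io.IOException;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

public class BundledFlinkClassCheck {

    // Assumed marker: Flink runtime classes should be supplied by the cluster,
    // so finding them inside a job JAR hints at a missing provided scope.
    static final String RUNTIME_PREFIX = "org/apache/flink/runtime/";

    // Returns the names of all .class entries in the JAR that start with the prefix.
    static List<String> classesUnder(String jarPath, String prefix) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.stream()
                    .map(JarEntry::getName)
                    .filter(name -> name.startsWith(prefix) && name.endsWith(".class"))
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java BundledFlinkClassCheck <path-to-job-jar>");
            return;
        }
        List<String> hits = classesUnder(args[0], RUNTIME_PREFIX);
        System.out.println(hits.size() + " classes found under " + RUNTIME_PREFIX);
        // Print a small sample so the output stays readable for large fat JARs.
        hits.stream().limit(10).forEach(System.out::println);
    }
}
```

Run against the JAR from the command above, a non-zero count would suggest that Flink itself was packaged into the job. A count of zero does not rule out other packaging problems.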

Solution

Add the missing <scope>provided</scope> entries, then rebuild the JAR and run the job again.
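
For illustration, a dependency entry with the scope filled in might look like the fragment below. The original POM is not shown in this post, so the artifact IDs and the Scala suffix `_2.12` are assumptions, and the version is inferred from the `flink-1.11.3` install path in the log. Adjust them to match the actual project.

```xml
<!-- Sketch only: artifactId, Scala suffix and version are assumptions, not the author's actual POM. -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-streaming-java_2.12</artifactId>
    <version>1.11.3</version>
    <!-- provided: on the compile classpath, supplied by the cluster at run time, left out of the fat JAR -->
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-clients_2.12</artifactId>
    <version>1.11.3</version>
    <scope>provided</scope>
</dependency>
```

Dependencies that are not part of the Flink distribution's lib directory, such as connectors, generally still need to be bundled, so the provided scope should go only on the core Flink artifacts.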
