Flink: keeping a savepoint and restarting from it (examples)

Published 2023-06-01 13:57:56, author: RICH-ATONE

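Examples for Flink on a YARN cluster (the original post says version 1.6; the -t yarn-per-job target and the taskmanager.memory.* options used below only exist from Flink 1.10 onward, so 1.16 is the more likely version):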

1. Launch examples:

../bin/flink run -t yarn-per-job -Dyarn.application.queue="default" -c org.apache.flink.base.basedoit._23_State_OperatorState_Demo ./test.jar
../bin/flink run -t yarn-per-job -d -Dyarn.application.queue="flink" -Dtaskmanager.memory.process.size=4096mb -Dtaskmanager.memory.task.heap.size=3200mb -Dtaskmanager.memory.managed.size=128mb -Dtaskmanager.memory.jvm-overhead.min=128mb -Dtaskmanager.memory.jvm-overhead.max=128mb -Dtaskmanager.memory.network.min=128mb -Dtaskmanager.memory.network.max=128mb -Dtaskmanager.memory.jvm-metaspace.size=256mb -Dyarn.application.name=dws-on-event-sdk-task -Dtaskmanager.numberOfTaskSlots=3 -p 3 -c com.babeltime.realtime.task.dws.DwsOnEventSdkTask ./real-time-table-store-1.0.0-RELEASE-jar-with-dependencies.jar
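The memory options in the second launch command have to add up to taskmanager.memory.process.size, otherwise the TaskManager refuses to start. The following is a small arithmetic check, not part of the original post. The three values marked as defaults are not set on the command line; they are assumed to be Flink's documented defaults (framework heap 128mb, framework off-heap 128mb, task off-heap 0) and should be verified against the Flink version in use.

```shell
#!/bin/sh
# Memory budget of the second launch command, all values in MB.
task_heap=3200          # taskmanager.memory.task.heap.size
managed=128             # taskmanager.memory.managed.size
network=128             # taskmanager.memory.network.min = max
jvm_overhead=128        # taskmanager.memory.jvm-overhead.min = max
metaspace=256           # taskmanager.memory.jvm-metaspace.size

# Not set in the command: assumed Flink defaults.
framework_heap=128      # taskmanager.memory.framework.heap.size
framework_off_heap=128  # taskmanager.memory.framework.off-heap.size
task_off_heap=0         # taskmanager.memory.task.off-heap.size

total=$((task_heap + managed + network + jvm_overhead + metaspace \
         + framework_heap + framework_off_heap + task_off_heap))

# Should match taskmanager.memory.process.size (4096mb in the command).
echo "sum of components: ${total} MB"
```

Under these assumptions the components sum to exactly 4096 MB, which is why a 3200mb task heap fits into a 4096mb process only when managed, network and overhead memory are pinned to small values.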

 

2. ① Take a savepoint (the job keeps running)

../bin/flink savepoint 4decf50ff6445f8eea372fc240a2978d hdfs://xxx:8020/user/pirate/checkpoint_save/ -yid application_1669628526849_1861287
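The restart step later needs the full savepoint path, which includes a generated savepoint-... subdirectory. A minimal sketch for capturing it from the CLI output follows. It assumes the CLI prints a line of the form "Savepoint completed. Path: <path>"; the helper name extract_savepoint_path is hypothetical and the exact wording of the output should be checked on your Flink version.

```shell
#!/bin/sh
# Reads the output of "flink savepoint" on stdin and prints only the path.
# Assumption: the CLI reports success as "Savepoint completed. Path: <path>".
extract_savepoint_path() {
  sed -n 's/^Savepoint completed\. Path: \(.*\)$/\1/p'
}

# Possible usage (not executed here):
#   SAVEPOINT=$(../bin/flink savepoint "$JOB_ID" "$TARGET_DIR" -yid "$APP_ID" \
#               | extract_savepoint_path)
```

If the variable ends up empty, the savepoint did not complete and the job should not be stopped on the strength of it.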

  ② Cancel the job and keep a savepoint

../bin/flink cancel -m yarn-cluster -yid <YARN Application ID> -s <savepoint path> <Job ID>

../bin/flink cancel -m yarn-cluster -yid application_1669628526849_1869686 -s hdfs://192.168.2.22:8020/user/pirate/checkpoint_save/ 334ca79291e26de1c0b65ed5bb8c66ff

Note: although this method cancels the job, the application is still listed as a running process on the YARN cluster.
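One way to deal with the leftover application is to kill it through the YARN CLI once the savepoint has been written. The sketch below is an assumption about how that cleanup could look, not something the original post describes. It uses the standard "yarn application -kill" command; the function names and the YARN_BIN variable are hypothetical.

```shell
#!/bin/sh
# Path to the yarn binary; overridable so the function can be dry-run.
YARN_BIN="${YARN_BIN:-yarn}"

# Succeeds only for ids of the form application_<timestamp>_<sequence>.
is_yarn_app_id() {
  echo "$1" | grep -Eq '^application_[0-9]+_[0-9]+$'
}

# Kills the YARN application left behind after "flink cancel -s".
kill_leftover_app() {
  app_id="$1"
  if ! is_yarn_app_id "$app_id"; then
    echo "not a YARN application id: $app_id" >&2
    return 1
  fi
  "$YARN_BIN" application -kill "$app_id"
}

# Possible usage (not executed here):
#   kill_leftover_app application_1669628526849_1869686
```

Only do this after confirming that the savepoint directory exists on HDFS, since killing the application discards any state that was not saved.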

 

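3. Restart from a specific savepoint
-m yarn-cluster (the author found that this parameter cannot be removed; without it the command failed with a NullPointerException)
Note that flink run passes everything after the jar path to the program's main method, so CLI options such as -t yarn-per-job and -Dyarn.application.queue must come before the jar. In the original post they trailed the jar and never reached the CLI, which is probably why -m yarn-cluster was required.

../bin/flink run -m yarn-cluster -s <savepoint path> xxx
../bin/flink run -m yarn-cluster -t yarn-per-job -Dyarn.application.queue="default" -s hdfs://ip:8020/user/pirate/checkpoint_save/savepoint-c74519-0b0399f011ab -c org.apache.flink.base.basedoit._23_State_OperatorState_Demo ./test.jar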
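The restart command can be wrapped in a small function so that the savepoint path, main class and jar are passed explicitly and all CLI options stay in front of the jar. This is a sketch only: the function name resume_from_savepoint and the FLINK_BIN variable are hypothetical, and it uses only the -m, -s and -c options already shown in this post.

```shell
#!/bin/sh
# Path to the flink binary; overridable so the function can be dry-run.
FLINK_BIN="${FLINK_BIN:-../bin/flink}"

# resume_from_savepoint <savepoint path> <main class> <jar>
resume_from_savepoint() {
  savepoint_path="$1"
  main_class="$2"
  jar="$3"
  if [ -z "$savepoint_path" ] || [ -z "$main_class" ] || [ -z "$jar" ]; then
    echo "usage: resume_from_savepoint <savepoint path> <main class> <jar>" >&2
    return 1
  fi
  # All options precede the jar; anything after it would go to the program.
  "$FLINK_BIN" run -m yarn-cluster -s "$savepoint_path" -c "$main_class" "$jar"
}

# Possible usage (not executed here):
#   resume_from_savepoint \
#     hdfs://ip:8020/user/pirate/checkpoint_save/savepoint-c74519-0b0399f011ab \
#     org.apache.flink.base.basedoit._23_State_OperatorState_Demo ./test.jar
```

Refusing to run with an empty savepoint path avoids silently starting the job with fresh state.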