resourcemanager.job.timeout |
"5 minutes" |
String |
Timeout for jobs that do not have a JobManager assigned as leader. |
resourcemanager.previous-worker.recovery.timeout |
0 ms |
Duration |
Timeout for the resource manager to recover all workers from the previous attempt. If exceeded, the resource manager handles new resource requests by requesting new workers. To reuse previous workers as much as possible, configure a longer timeout so that previous workers have time to register. |
resourcemanager.rpc.port |
0 |
Integer |
Defines the network port to connect to for communication with the resource manager. By default, the JobManager's port is used, because both share the same ActorSystem. It is not possible to use this configuration key to define port ranges. |
resourcemanager.standalone.start-up-time |
-1 |
Long |
Duration of the start-up period of a standalone cluster, in milliseconds. During this period, the resource manager of the standalone cluster expects new task executors to register and will not fail slot requests that cannot be satisfied by the currently registered slots. After this period, it immediately fails pending and incoming requests that cannot be satisfied by registered slots. If not set, 'slot.request.timeout' is used by default. |
resourcemanager.start-worker.max-failure-rate |
10.0 |
Double |
The maximum number of start-worker failures (Native Kubernetes / Yarn) per minute before the resource manager pauses requesting new workers. Once the threshold is reached, subsequent worker requests are postponed until a configured retry interval ('resourcemanager.start-worker.retry-interval') has elapsed. |
resourcemanager.start-worker.retry-interval |
3 s |
Duration |
The time to wait before requesting new workers (Native Kubernetes / Yarn) once the max failure rate of starting workers ('resourcemanager.start-worker.max-failure-rate') is reached. |
resourcemanager.taskmanager-registration.timeout |
5 min |
Duration |
Timeout for TaskManagers to register at the active resource managers. If exceeded, the active resource manager releases the resource and tries to re-request it for the worker. If not configured, falls back to 'taskmanager.registration.timeout'. |
resourcemanager.taskmanager-timeout |
30000 |
Long |
The timeout, in milliseconds, after which an idle task manager is released. |
slotmanager.max-total-resource.cpu |
(none) |
Double |
Maximum number of CPU cores the Flink cluster allocates for slots. Resources for JobManager and TaskManager framework are excluded. If not configured, it will be derived from 'slotmanager.number-of-slots.max'. |
slotmanager.max-total-resource.memory |
(none) |
MemorySize |
Maximum memory size the Flink cluster allocates for slots. Resources for JobManager and TaskManager framework are excluded. If not configured, it will be derived from 'slotmanager.number-of-slots.max'. |
slotmanager.number-of-slots.max |
infinite |
Integer |
Defines the maximum number of slots that the Flink cluster allocates. This configuration option is meant for limiting the resource consumption of batch workloads. It is not recommended for streaming workloads, which may fail if there are not enough slots. Note that this option does not take effect for standalone clusters, where the number of allocated slots is not controlled by Flink. |
slotmanager.redundant-taskmanager-num |
0 |
Integer |
The number of redundant task managers. Redundant task managers are extra task managers started by Flink to speed up job recovery after failures caused by lost task managers. Note that this feature is available only in active deployments (native K8s, Yarn). |
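
The way 'resourcemanager.start-worker.max-failure-rate' and 'resourcemanager.start-worker.retry-interval' interact can be illustrated with a small model. The sketch below is a simplified illustration, not Flink's actual implementation: the class `StartWorkerThrottle`, its methods, and the sliding 60-second window are assumptions made for this example only. It counts recent start failures and, once the per-minute threshold is reached, postpones further worker requests by the retry interval.

```python
from collections import deque


class StartWorkerThrottle:
    """Toy model of the start-worker failure-rate check (hypothetical, not Flink code)."""

    def __init__(self, max_failures_per_minute=10.0, retry_interval_s=3.0):
        # Defaults mirror the table: 10.0 failures per minute, 3 s retry interval.
        self.max_failures_per_minute = max_failures_per_minute
        self.retry_interval_s = retry_interval_s
        self._failures = deque()   # timestamps (seconds) of recent start failures
        self._paused_until = 0.0   # worker requests are postponed until this time

    def record_failure(self, now_s):
        """Register a failed worker start; pause requests if the threshold is reached."""
        self._failures.append(now_s)
        # Assumption: only failures from the last 60 seconds count towards the rate.
        while self._failures and now_s - self._failures[0] >= 60.0:
            self._failures.popleft()
        if len(self._failures) >= self.max_failures_per_minute:
            self._paused_until = now_s + self.retry_interval_s

    def next_request_time(self, now_s):
        """Return the earliest time (seconds) at which a new worker may be requested."""
        return max(now_s, self._paused_until)
```

With `StartWorkerThrottle(max_failures_per_minute=3, retry_interval_s=3.0)`, two failures leave requests unaffected, while a third failure within the same minute delays the next request by three seconds from the time of that failure.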