
What are the two shared variables provided by Spark?

2022-02-04 16:48:03 · Alibaba Cloud Q&A

What are the two shared variables provided by Spark?




Answer 1:

Most of the work in a Spark program consists of RDD operations: functions are passed to RDD operators and executed concurrently on different nodes. Each task receives its own copy of the variables captured in a function's closure, so tasks cannot see one another's updates, which is inconvenient in some cases.
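A minimal sketch of that pitfall (the object and variable names are made up for illustration): mutating a driver-side variable inside a closure does not update the driver's copy, because each task works on its own serialized copy.

    import org.apache.spark.{SparkConf, SparkContext}

    object ClosurePitfall {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("closure-pitfall").setMaster("local[*]"))

        var total = 0
        // Each task gets its own copy of `total` inside the serialized closure;
        // the additions happen on the executors' copies, not on the driver's.
        sc.parallelize(1 to 100).foreach(n => total += n)

        // On a cluster this prints 0; in local mode the result is not
        // guaranteed either way. This is the gap that accumulators fill.
        println(s"total = $total")

        sc.stop()
      }
    }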

Broadcast variables are read-only objects cached on every node; they are created with SparkContext.broadcast(). Updating the value after creation has no effect, so broadcast variables are usually declared with val.

Accumulators (counters) can only be added to, which makes them useful for counting and summing, and custom types are supported. They are created with SparkContext.accumulator(V, name). Only the driver program can read an accumulator's value; reading it inside an RDD operation is meaningless, although the tasks running on the nodes can add to it.

These two types are Spark's shared variables.
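A minimal sketch putting both together (countryCodes and unknownCodes are hypothetical names; sc.longAccumulator is the Spark 2.x replacement for the older SparkContext.accumulator(V, name) API mentioned above):

    import org.apache.spark.{SparkConf, SparkContext}

    object SharedVariablesExample {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("shared-variables").setMaster("local[*]"))

        // Broadcast variable: a read-only lookup table cached once per node.
        val countryCodes = sc.broadcast(Map("CN" -> "China", "US" -> "United States"))

        // Accumulator: tasks may only add to it; the driver reads the result.
        val unknownCodes = sc.longAccumulator("unknown country codes")

        val records = sc.parallelize(Seq("CN", "US", "XX", "CN"))
        val resolved = records.map { code =>
          countryCodes.value.getOrElse(code, {
            unknownCodes.add(1) // nodes add to the accumulator...
            "unknown"
          })
        }

        resolved.collect().foreach(println)
        // ...but only the driver reads its final value. Note that Spark
        // guarantees exactly-once accumulator updates only inside actions;
        // updates in transformations like map may be re-applied on task retry.
        println(s"Unknown codes seen: ${unknownCodes.value}")

        sc.stop()
      }
    }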

