What is the difference between spark.sql.shuffle.partitions and spark.default.parallelism?

According to the linked answer, spark.sql.shuffle.partitions configures the number of partitions used when shuffling data for joins or aggregations. spark.default.parallelism is the default number of partitions in RDDs returned by transformations like join, reduceByKey, and parallelize when the user does not set it explicitly. Note that spark.default.parallelism appears to apply only to raw RDDs and is ignored when working with DataFrames. If the task you are performing …

What does O(N) mean? [duplicate]

The comment was referring to Big-O notation. Briefly: O(1) means constant time, independent of the number of items. O(N) means time proportional to the number of items. O(log N) means time proportional to log(N). Basically, any 'O' notation means an operation will take time up to a maximum of k*f(N), where: k is a …
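As a rough illustration of the growth rates described above, the following sketch counts loop iterations instead of measuring wall-clock time. The function names (first_item_steps, linear_search_steps, binary_search_steps) are hypothetical and chosen only for this example; they are not from the original answer.

```python
def first_item_steps(items):
    """O(1): reading one element costs a single step, whatever len(items) is."""
    _ = items[0]
    return 1


def linear_search_steps(items, target):
    """O(N): in the worst case every element is inspected once."""
    steps = 0
    for value in items:
        steps += 1
        if value == target:
            break
    return steps


def binary_search_steps(sorted_items, target):
    """O(log N): each step halves the remaining search range."""
    lo, hi = 0, len(sorted_items) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            break
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps


if __name__ == "__main__":
    # Search for the last element, which is the worst case for the linear scan.
    for n in (16, 1024, 65536):
        data = list(range(n))
        print(n, first_item_steps(data),
              linear_search_steps(data, n - 1),
              binary_search_steps(data, n - 1))
```

As N grows, the first count stays fixed, the second grows in step with N, and the third grows by roughly one each time N doubles.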

Error!: SQLSTATE[HY000] [1045] Access denied for user 'divattrend_liink'@'localhost' (using password: YES)