#hive

from HackerNoon
3 months ago

Tired of Copy-Pasting Hive Output? This PySpark Hack Fixes It | HackerNoon

Automating CSV export from Hive or Impala output is essential for efficient data engineering tasks.
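As a rough illustration of the idea in this summary, the sketch below converts pasted table output into CSV. It is not the article's PySpark approach, which is not shown here; it is a plain-Python stand-in that assumes the output uses the bordered `+---+` / `| a | b |` layout that Hive's beeline and impala-shell typically print. The function name `hive_table_to_csv` and the sample data are hypothetical.

```python
import csv
import io


def hive_table_to_csv(text: str) -> str:
    """Convert bordered table output (assumed beeline-style) into CSV text."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Keep only data/header lines; skip blanks and +-----+ borders.
        if not line.startswith("|"):
            continue
        # Drop the outer pipes, then split the remaining cells.
        # Limitation: a cell that itself contains "|" will be split wrongly.
        inner = line[1:-1] if line.endswith("|") else line[1:]
        rows.append([cell.strip() for cell in inner.split("|")])
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


# Hypothetical sample of pasted query output.
sample = """
+-----+--------+
| id  | name   |
+-----+--------+
| 1   | alice  |
| 2   | bob    |
+-----+--------+
"""

csv_text = hive_table_to_csv(sample)
print(csv_text)
```

Using the `csv` module rather than joining on commas means cells that contain commas or quotes are escaped correctly.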
Data science
from medium.com
3 months ago

Data Engineering Interview Questions You Must Prepare For!

Data skew in Spark hurts performance because rows are distributed unevenly across partitions, so a few tasks end up doing most of the work.
Dynamic partitioning in Hive allows for on-the-fly partition creation during data insertion.
Coalesce reduces the partition count without a full shuffle; repartition can raise or lower the partition count but always shuffles.
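The points about skew, coalesce, and repartition can be pictured with a toy model. The sketch below is plain Python, not Spark code: partitions are simulated as lists, `zlib.crc32` stands in for a hash partitioner, and the helper names and the 900-row "hot" key are assumptions made up for illustration. It only aims to show why merging partitions without a shuffle keeps existing skew, while a full redistribution evens it out.

```python
import zlib


def hash_partition(keys, n):
    """Assign each key to one of n partitions by a deterministic hash."""
    parts = [[] for _ in range(n)]
    for key in keys:
        parts[zlib.crc32(key.encode("utf-8")) % n].append(key)
    return parts


def coalesce(parts, n):
    """Toy coalesce: merge whole partitions into n groups, moving no single row."""
    merged = [[] for _ in range(n)]
    for i, part in enumerate(parts):
        merged[i % n].extend(part)
    return merged


def repartition(parts, n):
    """Toy repartition: full shuffle that deals rows round-robin into n partitions."""
    rows = [row for part in parts for row in part]
    return [rows[i::n] for i in range(n)]


# Hypothetical skewed data set: one hot key accounts for 900 of 1000 rows.
keys = ["hot"] * 900 + [f"user{i}" for i in range(100)]

parts = hash_partition(keys, 8)
coalesced = coalesce(parts, 4)
repartitioned = repartition(parts, 4)

print("hashed:       ", [len(p) for p in parts])
print("coalesced:    ", [len(p) for p in coalesced])
print("repartitioned:", [len(p) for p in repartitioned])
```

In the model, the hot key's rows all hash to one partition, and coalescing leaves them together, whereas the round-robin shuffle spreads them evenly at the cost of moving every row.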