137 results
Azure Databricks Learning: Performance Optimization: Spark/Databricks Interview Question Series - II ...
13,637 views
2 years ago
Learn PySpark, an interface for Apache Spark in Python. PySpark is often used for large-scale data processing and machine ...
1,642,617 views
4 years ago
Introduction to Catalyst Optimizer Purpose and logical architecture of Catalyst Optimizer Logical and Physical plan selection and ...
1,470 views
3 years ago
Spark is now widely adopted in large enterprises for handling large volumes of data. At PayPal, more and more complex ...
533 views
5 years ago
You've seen the technical deep dives on Spark's Catalyst query optimizer. You understand how to fix joins, how to find common ...
1,417 views
The SQL tab in the Spark UI provides a lot of information for analysing your Spark queries, ranging from the query plan, to all ...
18,093 views
These are black boxes for Spark optimizer, blocking several helpful optimizations like WholeStageCodegen, Null optimization etc.
8,787 views
Over the last year, we have added a series of optimizations in Apache Spark to solve the above problems for Parquet.
1,603 views
Spark SQL provides a convenient layer of abstraction for users to express their query's intent while letting Spark handle the more ...
6,305 views
Over the last year, we've added a series of optimizations in Spark to improve Parquet pushdown performance. We developed a ...
3,325 views
Boosting Apache Spark Performance with Small JSON Files in Microsoft Fabric. Learn how to achieve a 10x performance ...
1,365 views
1 year ago
Examples of these cost-based optimizations include choosing the right join type (broadcast-hash-join vs. sort-merge-join), ...
9,495 views
Speed up slow pandas/python code by 2500x using this simple trick. Face it, your pandas code is slow. Learn how to speed it up!
200,107 views
This is a video on how to get started with TPCDS_PySpark ...
388 views
Learn about RDDs, DataFrames, optimization techniques, and more, with detailed explanations and practical examples tailored to ...
294 views
The Delta Architecture pattern has made the lives of data engineers much simpler, but what about improving query performance ...
8,881 views
One of the most significant benefits provided by Databricks Delta is the ability to use z-ordering and dynamic file pruning to ...
1,018 views
This talk will break down merge in Delta Lake (what is actually happening under the hood) and then explain how you can ...
16,076 views
In rapidly changing conditions, many companies build ETL pipelines using an ad-hoc strategy. Such an approach makes automated ...
6,703 views
Notebooks are a great tool for Big Data. They have drastically changed the way scientists and engineers develop and share ideas ...
4,178 views