Book a Demo!
CoCalc Logo Icon
StoreFeaturesDocsShareSupportNewsAboutPoliciesSign UpSign In
pola-rs
GitHub Repository: pola-rs/polars
Path: blob/main/docs/source/user-guide/lazy/optimizations.md
6940 views

Optimizations

If you use Polars' lazy API, Polars will run several optimizations on your query. Some of them are executed up front, others are determined just in time as the materialized data comes in.

Here is a non-complete overview of optimizations done by polars, what they do and how often they run.

OptimizationExplanationruns
Predicate pushdownApplies filters as early as possible/ at scan level.1 time
Projection pushdownSelect only the columns that are needed at the scan level.1 time
Slice pushdownOnly load the required slice from the scan level. Don't materialize sliced outputs (e.g. join.head(10)).1 time
Common subplan eliminationCache subtrees/file scans that are used by multiple subtrees in the query plan.1 time
Simplify expressionsVarious optimizations, such as constant folding and replacing expensive operations with faster alternatives.until fixed point
Join orderingEstimates the branches of joins that should be executed first in order to reduce memory pressure.1 time
Type coercionCoerce types such that operations succeed and run on minimal required memory.until fixed point
Cardinality estimationEstimates cardinality in order to determine optimal group by strategy.0/n times; dependent on query