Path: blob/main/docs/source/src/python/polars-cloud/quickstart.py
"""1# --8<-- [start:general]2import polars_cloud as pc3import polars as pl45# First, we need to define the hardware the cluster will run on.6# This can be done by specifying the minimum CPU and memory or by specifying the exact instance type in AWS.7ctx = pc.ComputeContext(memory=8, cpus=2, cluster_size=1)89# Then we write a regular lazy Polars query. In this example we compute the maximum of column.10lf = pl.LazyFrame(11{12"a": [1, 2, 3],13"b": [4, 4, 5],14}15).with_columns(16pl.col("a").max().over("b").alias("c"),17)1819# At this point, the query has not been executed yet.20# We need to call `.remote()` to signal that we want to run on Polars Cloud and then `.sink_parquet()` to send21# the query and execute it.2223(24lf.remote(context=ctx)25.sink_parquet(uri="s3://my-bucket/result.parquet")26)2728# We can then wait for the result with `result = lf.await_result()`.29# This will only include a few rows of the output as the result might be very large.30# The query and compute used will also show up in the portal https://cloud.pola.rs/portal/3132# --8<-- [end:general]33"""343536