
DataSet.to_df

This is a blocking call. It collects all the data from the cluster and returns it as a Polars DataFrame to the calling Python session. That session may be your local machine, so watch out for out-of-memory (OOM) errors when the dataset is large.

Return

Polars DataFrame
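
Because the call blocks until the whole result has arrived, one option is to run it on a worker thread so the calling session can keep working in the meantime. The following is a minimal sketch of that pattern, not part of pyquokka: `StubDataSet` and `collect_in_background` are hypothetical names introduced for illustration, and the stub returns a plain list where the real `to_df` returns a Polars DataFrame.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class StubDataSet:
    """Hypothetical stand-in for a Quokka DataSet (not part of pyquokka)."""

    def __init__(self, rows):
        self._rows = rows

    def to_df(self):
        # The real method blocks while data is gathered from the cluster.
        # Here a short sleep simulates that wait, and a list stands in
        # for the Polars DataFrame.
        time.sleep(0.1)
        return list(self._rows)


def collect_in_background(dataset, executor):
    """Submit the blocking to_df() call to a worker thread; return a Future."""
    return executor.submit(dataset.to_df)


with ThreadPoolExecutor(max_workers=1) as pool:
    future = collect_in_background(StubDataSet([1, 2, 3]), pool)
    # The calling thread is free to do other work here.
    result = future.result()  # blocks only now, until to_df() has returned
```

Running the call on a thread does not reduce memory use: the full result is still materialized in the calling process, so the OOM caveat above applies unchanged.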

Source code in pyquokka/quokka_dataset.py
def to_df(self):

    """
    This is a blocking call. It will collect all the data from the cluster and return a Polars DataFrame to the calling Python session (could be your local machine, be careful of OOM!).

    Return:
        Polars DataFrame
    """

    return ray.get(self.wrapped_dataset.to_df.remote(self.dataset_id))