DataStream.sum

Return the sums of the specified columns.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| columns | str or list | the column name or a list of column names to sum. | required |
| collect | bool | if True, return a Polars DataFrame. If False, return a Quokka DataStream. | True |
Source code in pyquokka/datastream.py
def sum(self, columns, collect = True):

    """
    Return the sums of the specified columns.

    Args:
        columns (str or list): the column name or a list of column names to sum.
        collect (bool): if True, return a Polars DataFrame. If False, return a Quokka DataStream.
    """

    assert type(columns) == str or type(columns) == list
    if type(columns) == str:
        columns = [columns]
    for col in columns:
        assert col in self.schema

    if collect:
        return self.agg({col: "sum" for col in columns}).collect()
    else:
        return self.agg({col: "sum" for col in columns})
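
The argument handling above can be tried without a Quokka cluster. The following is a minimal sketch, not pyquokka code: `ToyStream` and `ToyPlan` are hypothetical in-memory stand-ins invented for illustration, and a plain dict takes the place of the Polars DataFrame that the real `collect()` returns. It only mirrors the documented behavior: a single column name is wrapped into a list, every name is checked against the schema, and `collect` decides whether the result is computed immediately or handed back as a deferred object.

```python
class ToyPlan:
    """Hypothetical deferred aggregation; stands in for an uncollected DataStream."""

    def __init__(self, rows, aggregations):
        self.rows = rows
        self.aggregations = aggregations

    def collect(self):
        # Compute the sums only when explicitly asked to.
        return {col: sum(row[col] for row in self.rows) for col in self.aggregations}


class ToyStream:
    """Hypothetical in-memory stand-in for a DataStream; not part of pyquokka."""

    def __init__(self, rows):
        # rows: a list of dicts that all share the same keys
        self.rows = rows
        self.schema = list(rows[0].keys()) if rows else []

    def agg(self, aggregations):
        # Lazy step: record what to compute, compute nothing yet.
        return ToyPlan(self.rows, aggregations)

    def sum(self, columns, collect=True):
        # Same normalization and validation as the listing above.
        assert isinstance(columns, (str, list))
        if isinstance(columns, str):
            columns = [columns]
        for col in columns:
            assert col in self.schema

        plan = self.agg({col: "sum" for col in columns})
        return plan.collect() if collect else plan


stream = ToyStream([{"price": 10, "qty": 2}, {"price": 5, "qty": 3}])

single = stream.sum("price")                 # one column name as a str
multi = stream.sum(["price", "qty"])         # several column names as a list
deferred = stream.sum("qty", collect=False)  # nothing computed until .collect()
```

With `collect=False` the caller receives an object that can be composed further or collected later, which is the reason the real method exposes the flag.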