Skip to content

GroupedDataStream.agg_sql

The SQL version of agg. Look at the examples.

Parameters:

Name Type Description Default
aggregations str

a string that is a valid SQL aggregation expression. Look at the examples.

required
Return

A DataStream object that holds the aggregation result. It will only emit one batch, which is the result when it's done.

Examples:

>>> d = d.groupby(["l_orderkey","o_orderdate","o_shippriority"]).agg_sql("sum(l_extendedprice * (1 - l_discount)) as revenue")
>>> f = d.groupby("o_orderpriority").agg_sql("count(*) as count_order")
>>>  f = d.groupby("l_shipmode").agg_sql("
>>>        sum(case when o_orderpriority = '1-URGENT' or o_orderpriority = '2-HIGH' then 1 else 0 end) as high_line_count,
>>>        sum(case when o_orderpriority <> '1-URGENT' and o_orderpriority <> '2-HIGH' then 1 else 0 end) as low_line_count
>>>    ")
Source code in pyquokka/datastream.py
2163
2164
2165
2166
2167
2168
2169
2170
2171
2172
2173
2174
2175
2176
2177
2178
2179
2180
2181
2182
2183
2184
2185
2186
2187
def agg_sql(self, aggregations: str):

    """
    The SQL version of `agg`. Look at the examples.

    Args:
        aggregations (str): a string that is a valid SQL aggregation expression. Look at the examples.

    Return:
        A DataStream object that holds the aggregation result. It will only emit one batch, which is the result when it's done.

    Examples:

        >>> d = d.groupby(["l_orderkey","o_orderdate","o_shippriority"]).agg_sql("sum(l_extendedprice * (1 - l_discount)) as revenue")

        >>> f = d.groupby("o_orderpriority").agg_sql("count(*) as count_order")

        >>>  f = d.groupby("l_shipmode").agg_sql("
        >>>        sum(case when o_orderpriority = '1-URGENT' or o_orderpriority = '2-HIGH' then 1 else 0 end) as high_line_count,
        >>>        sum(case when o_orderpriority <> '1-URGENT' and o_orderpriority <> '2-HIGH' then 1 else 0 end) as low_line_count
        >>>    ")

    """

    return self.source_data_stream._grouped_aggregate_sql(self.groupby, aggregations, self.orderby)