GroupedDataStream.agg_sql
The SQL version of agg
. Look at the examples.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
aggregations |
str
|
a string that is a valid SQL aggregation expression. Look at the examples. |
required |
Return
A DataStream object that holds the aggregation result. It will only emit one batch, which is the result when it's done.
Examples:
>>> d = d.groupby(["l_orderkey","o_orderdate","o_shippriority"]).agg_sql("sum(l_extendedprice * (1 - l_discount)) as revenue")
>>> f = d.groupby("o_orderpriority").agg_sql("count(*) as count_order")
>>> f = d.groupby("l_shipmode").agg_sql("
>>> sum(case when o_orderpriority = '1-URGENT' or o_orderpriority = '2-HIGH' then 1 else 0 end) as high_line_count,
>>> sum(case when o_orderpriority <> '1-URGENT' and o_orderpriority <> '2-HIGH' then 1 else 0 end) as low_line_count
>>> ")
Source code in pyquokka/datastream.py
2163 2164 2165 2166 2167 2168 2169 2170 2171 2172 2173 2174 2175 2176 2177 2178 2179 2180 2181 2182 2183 2184 2185 2186 2187 |
|