GroupedDataStream.count_distinct
Count the number of distinct values of a column within each group. The count is exact, not approximate, and the operation may run out of memory.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
col | str | the column to count distinct values of | required |
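
To make the semantics concrete, here is a minimal pure-Python sketch of an exact per-group distinct count. It is not Quokka's implementation and does not use the Quokka API. The helper name `count_distinct_per_group` and the sample rows are hypothetical, chosen only for illustration. It keeps one set of observed values per group, which suggests why an exact count can use a lot of memory when groups contain many distinct values.

```python
from collections import defaultdict


def count_distinct_per_group(rows, group_key, col):
    """Exact distinct count of `col` per `group_key`.

    Hypothetical helper for illustration only; not part of pyquokka.
    """
    seen = defaultdict(set)  # one set of observed values per group
    for row in rows:
        seen[row[group_key]].add(row[col])
    # The size of each set is the exact number of distinct values.
    return {group: len(values) for group, values in seen.items()}


# Hypothetical sample data: customers per region, with one repeat.
rows = [
    {"region": "east", "customer": "a"},
    {"region": "east", "customer": "b"},
    {"region": "east", "customer": "a"},
    {"region": "west", "customer": "c"},
]

result = count_distinct_per_group(rows, "region", "customer")
print(result)  # {'east': 2, 'west': 1}
```

Every distinct value has to be held in memory until the stream is fully consumed, which is the cost of an exact answer compared with an approximate sketch-based estimate.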
Source code in pyquokka/datastream.py