Conversation
teunbrand
left a comment
There was a problem hiding this comment.
I'm excited for this! Don't let the nitpicky comments convince you otherwise
| The simple functions are: | ||
|
|
||
| * `'count'`: Non-null tally of the bound column. | ||
| * `'sum'` and `'prod'`: The sum or product | ||
| * `'min'`, `'max'`, and `'range'`: Extremes and max - min | ||
| * `'mean'`, and `'median'`: Central tendency | ||
| * `'geomean'`, `'harmean'`, and `'rms'`: Geometric, harmonic, and root-mean-square | ||
| * `'sdev'`, `'var'`, `'iqr'`, and `'se'`: Standard deviation, variance, interquartile range, and standard error | ||
| * `'p05'`, `'p10'`, `'p25'`, `'p50'`, `'p75'`, `'p90'`, and `'p95'`: Percentiles | ||
|
|
||
| For band functions you combine an offset with an expansion, potentially multiplied. An example could be `'mean-1.96sdev'` which does exactly what you'd expect it to be. The general form is `<offset>±<multiplier><expansion>` with `<multiplier>` being optional (defaults to `1`). | ||
|
|
||
| Allowed offsets are: `'mean'`, `'median'`, `'geomean'`, `'harmean'`, `'rms'`, `'sum'`, `'prod'`, `'min'`, `'max'`, and `'p05'`–`'p95'` | ||
|
|
||
| Allowed expansions are: `'sdev'`, `'se'`, `'var'`, `'iqr'`, and `'range'` |
There was a problem hiding this comment.
This should come before the explanation of how >1 aggregations interact.
|
|
||
| #[derive(Debug, Clone, PartialEq)] | ||
| pub struct Band { | ||
| pub sign: char, |
There was a problem hiding this comment.
Is there a reason why sign can't be folded into the mod_value?
mean-sdev would then just resolve to `mod_value = -1.
| "sdev" => format!("STDDEV_POP({})", qcol), | ||
| "se" => format!("(STDDEV_POP({c}) / SQRT(COUNT({c})))", c = qcol), | ||
| "var" => format!("VAR_POP({})", qcol), |
There was a problem hiding this comment.
I think these might be duckdb dialect specific
| "Warning: aggregate dropped numeric mapping for aesthetic '{}' (no applicable default and no targeted function)", | ||
| aesthetic_ctx.map_internal_to_user(d) | ||
| ); | ||
| } | ||
| return Ok(StatResult::Identity); | ||
| } | ||
|
|
||
| for d in &dropped { | ||
| eprintln!( | ||
| "Warning: aggregate dropped numeric mapping for aesthetic '{}' (no applicable default and no targeted function)", | ||
| aesthetic_ctx.map_internal_to_user(d) |
There was a problem hiding this comment.
This is me being pedantic and not really a priority.
These warnings messages should probably hint at what the user should be doing to mitigate the problem. I think the solution it is to include an aggregation for the aesthetic (e.g. 'color:mean'), but I'll let you preside over that.
Also we could probably collapse dropped into a comma-separated list for a single warning, but this is not really necessary.
| let mut group_cols: Vec<String> = Vec::new(); | ||
| for g in group_by { | ||
| if !group_cols.contains(g) { | ||
| group_cols.push(g.clone()); | ||
| } | ||
| } | ||
| for c in &kept_cols { | ||
| if !group_cols.contains(c) { | ||
| group_cols.push(c.clone()); | ||
| } | ||
| } |
There was a problem hiding this comment.
Doesn't the group_by variable already contain PARTITION BY + discrete columns? If so, we'd only need to add the domain aesthetics.
| format!("'{}'", s.replace('\'', "''")) | ||
| } | ||
|
|
||
| #[cfg(test)] |
There was a problem hiding this comment.
Should we add a test that build_group_by_query and build_group_by_query produce the right shape of query, or is that already included implicitly in other tests?
Co-authored-by: Thomas Lin Pedersen <thomasp85@gmail.com> Co-authored-by: Teun van den Brand <49372158+teunbrand@users.noreply.github.com>
Fix #160