Closed
72 commits
398a9f5
Percentile functions
sim1984 Nov 20, 2025
732db5a
Optimization. Record numbers are calculated only once.
sim1984 Nov 21, 2025
4241c38
Test ICU in Android
asfernandes Nov 24, 2025
314378c
Fixed PERCENTILE_DISC with non-numeric argument
sim1984 Nov 24, 2025
312ccbb
increment build number
actions-user Nov 24, 2025
a1f25a5
Getting the parameter type when preparing a query for percentile func…
sim1984 Nov 25, 2025
5179bfe
update doc
sim1984 Nov 25, 2025
be4359f
Changes according to asfernandes
sim1984 Nov 26, 2025
4af373f
Totally misc
dyemanov Nov 29, 2025
8048d14
A few more checks for a valid database file, see also #8450. This pre…
dyemanov Nov 29, 2025
8b3c3d5
increment build number
actions-user Nov 29, 2025
0c1f862
More accurate calculation of the average page fill factor (#8816)
dyemanov Dec 1, 2025
56dfbc2
increment build number
actions-user Dec 1, 2025
579ff5c
Restore the broken record layout optimization by gbak and extend it t…
dyemanov Dec 3, 2025
a2e406c
Merge pull request #8818 from FirebirdSQL/work/gh-8817
hvlad Dec 2, 2025
ead6d59
increment build number
actions-user Dec 3, 2025
ba0d7b2
Fix #8820 - Missing a column name for boolean expression
asfernandes Dec 4, 2025
9048844
increment build number
actions-user Dec 4, 2025
4ae039f
On Windows, return number of valid (full-sized) pages rather than rou…
dyemanov Dec 7, 2025
e878785
increment build number
actions-user Dec 7, 2025
62a2a15
Per pagespace I/O statistics and new trace API interfaces to allow ex…
dyemanov Dec 8, 2025
3b980c6
increment build number
actions-user Dec 8, 2025
ddc4402
Fixes a loop in the GENERATE_SERIES function on boundary values. (#8812)
sim1984 Dec 10, 2025
2a02c8c
increment build number
actions-user Dec 10, 2025
2bf3502
Fix trace docs still using parts of FB 2.5 syntax
mrotteveel Dec 12, 2025
67946bf
Add DeepWiki shield
pcisar Dec 12, 2025
1ea9ef5
increment build number
actions-user Dec 12, 2025
5791b57
Fix errors in LISTAGG implementation
asfernandes Dec 13, 2025
547c6c6
Change wrong and unused return
asfernandes Dec 13, 2025
64f3082
Check field source schema name change
asfernandes Dec 13, 2025
2734218
increment build number
actions-user Dec 13, 2025
68981fa
Fix double increment of variable `i`
asfernandes Dec 15, 2025
a73f5de
increment build number
actions-user Dec 15, 2025
16a7f49
Correction
asfernandes Dec 16, 2025
3b2bc84
Frontported pull request #8826: Fixed potential endless loop inside M…
hvlad Dec 16, 2025
8098b4c
increment build number
actions-user Dec 16, 2025
6826f9b
Fix warnings
asfernandes Dec 17, 2025
effbde2
Added check for percentile constancy within a group
sim1984 Dec 17, 2025
ac85f87
increment build number
actions-user Dec 17, 2025
a3355d5
Update tzdata to version 2025c. (#8832)
github-actions[bot] Dec 20, 2025
f3806fe
increment build number
actions-user Dec 20, 2025
7a87c1f
Fix file name in UTF-8 encoding for gstat (#8829)
paradox1307 Dec 23, 2025
0317f68
increment build number
actions-user Dec 23, 2025
1652fcc
Fix for #8836 (#8838)
aafemt Dec 24, 2025
cfe495d
increment build number
actions-user Dec 24, 2025
f115163
Comment according to Mark's request
AlexPeshkoff Dec 25, 2025
8ed234a
Fixed #8806: Missing privilege checks for the COMMENT ON PARAMETER co…
AlexPeshkoff Dec 25, 2025
5ee2b41
increment build number
actions-user Dec 25, 2025
08387c8
Fix cardinality estimation for the invariant filter
dyemanov Dec 26, 2025
0c1aa58
Fix the cardinality estimation (FIRST ROWS case) in HASH JOINs
dyemanov Dec 26, 2025
ce08ec0
increment build number
actions-user Dec 26, 2025
fc0b004
Fix debug build after b7f5b4b (Use cloop cmake build) (#8824)
aafemt Dec 29, 2025
ab9432c
increment build number
actions-user Dec 29, 2025
a7c8c64
Fix undefined message number error in DELETE WHERE CURRENT OF RETURNING
asfernandes Jan 1, 2026
3767287
increment build number
actions-user Jan 1, 2026
77c8c9d
Fix #8842 - GTT accept weird syntax and has unnecessary syntax conflicts
asfernandes Jan 2, 2026
b40984e
Misc correction
dyemanov Jan 3, 2026
abec538
increment build number
actions-user Jan 3, 2026
e88f729
Fix unused variable warnings
asfernandes Jan 3, 2026
9760157
Fix #8822: Some procedures containing LIST aggregate function are not…
dyemanov Jan 4, 2026
f8c49d4
Percentile functions
sim1984 Nov 20, 2025
5ea818f
Optimization. Record numbers are calculated only once.
sim1984 Nov 21, 2025
4b9e7d7
Fixed PERCENTILE_DISC with non-numeric argument
sim1984 Nov 24, 2025
80e3189
Getting the parameter type when preparing a query for percentile func…
sim1984 Nov 25, 2025
25cbbbf
update doc
sim1984 Nov 25, 2025
3e54bde
Changes according to asfernandes
sim1984 Nov 26, 2025
834feaf
Added check for percentile constancy within a group
sim1984 Dec 17, 2025
fe789bd
Use blr_within_group_order instead blr_sort
sim1984 Jan 4, 2026
03da30c
Merge branch 'percentile-functions' of github.com:sim1984/firebird in…
sim1984 Jan 4, 2026
98c3a50
use blr_within_group_order instead nlr_sort
sim1984 Jan 4, 2026
aaf3174
fixed build
sim1984 Jan 4, 2026
0d51cdf
Changes according to dyemanov
sim1984 Jan 4, 2026
187 changes: 187 additions & 0 deletions doc/sql.extensions/README.percentile_disc_cont.md
@@ -0,0 +1,187 @@
# PERCENTILE_DISC and PERCENTILE_CONT functions

The `PERCENTILE_CONT` and `PERCENTILE_DISC` functions are known as inverse distribution functions.
These functions operate on an ordered set. Both functions can be used as aggregate or window functions.

## PERCENTILE_DISC

`PERCENTILE_DISC` is an inverse distribution function that assumes a discrete distribution model.
It takes a percentile value and a sort specification and returns an element from the set.
Nulls are ignored in the calculation.

Syntax for the `PERCENTILE_DISC` function as an aggregate function.

```
PERCENTILE_DISC(<percent>) WITHIN GROUP (ORDER BY <expr> [ASC | DESC])
```

Syntax for the `PERCENTILE_DISC` function as a window function.

```
PERCENTILE_DISC(<percent>) WITHIN GROUP (ORDER BY <expr> [ASC | DESC])
  OVER (PARTITION BY <part_expr>)
```

The first argument `<percent>` must evaluate to a numeric value between 0 and 1, because it is a percentile value.
This expression must be constant within each aggregate group.
The `ORDER BY` clause takes a single expression that can be of any type that can be sorted.
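
The two constraints on `<percent>` can be pictured with a small Python sketch. This is not the engine's code: `group_percentile` is a hypothetical helper, and treating both bounds (0 and 1) as valid is an assumption, since the text above only says "between 0 and 1".

```python
from typing import Iterable


def group_percentile(percents: Iterable[float]) -> float:
    """Return the single percentile shared by every row of a group.

    Raises ValueError when the rows disagree or the value is out of range.
    Assumption: the bounds 0 and 1 are themselves allowed.
    """
    distinct = set(percents)
    if len(distinct) != 1:
        raise ValueError("percentile must be constant within the group")
    (p,) = distinct
    if not 0 <= p <= 1:
        raise ValueError("percentile must be between 0 and 1")
    return p
```

A group whose rows all carry `0.5` passes the check, while a group mixing `0.25` and `0.5` is rejected.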

The function `PERCENTILE_DISC` returns a value of the same type as the argument in `ORDER BY`.

For a given percentile value `P`, `PERCENTILE_DISC` sorts the values of the expression in the `ORDER BY` clause and
returns the value with the smallest `CUME_DIST` value (with respect to the same sort specification)
that is greater than or equal to `P`.
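
The rule above can be modelled outside the database. The following Python sketch is an illustration, not Firebird's implementation: `percentile_disc` is a hypothetical helper, NULLs are modelled as `None`, and returning `None` for an empty group is an assumption.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def percentile_disc(p: float, values: Sequence[Optional[T]],
                    descending: bool = False) -> Optional[T]:
    """Return the first ordered value whose CUME_DIST is >= p."""
    if not 0 <= p <= 1:
        raise ValueError("percentile must be between 0 and 1")
    # NULLs (None) are ignored, as the description states.
    ordered = sorted((v for v in values if v is not None), reverse=descending)
    n = len(ordered)
    if n == 0:
        return None  # assumption: an empty group yields NULL
    for i, v in enumerate(ordered):
        # CUME_DIST counts rows up to and including the last peer of v.
        last_peer = i
        while last_peer + 1 < n and ordered[last_peer + 1] == v:
            last_peer += 1
        if (last_peer + 1) / n >= p:
            return v
    return ordered[-1]


# Salaries of department 120 from the example in this document.
print(percentile_disc(0.5, [39224.06, 22935.00, 33620.63]))  # 33620.63
```

Because the result is always one of the input values, the sketch also works for sortable non-numeric inputs such as strings.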

### Analytic Example

```sql
SELECT
  DEPT_NO,
  SALARY,
  CUME_DIST() OVER(PARTITION BY DEPT_NO ORDER BY SALARY) AS "CUME_DIST",
  PERCENTILE_DISC(0.5) WITHIN GROUP(ORDER BY SALARY)
    OVER(PARTITION BY DEPT_NO) AS MEDIAN_DISC
FROM EMPLOYEE
WHERE DEPT_NO < 600
ORDER BY 1, 2;
```

```
DEPT_NO                SALARY               CUME_DIST           MEDIAN_DISC
======= ===================== ======================= =====================
000                  53793.00      0.5000000000000000              53793.00
000                 212850.00       1.000000000000000              53793.00
100                  44000.00      0.5000000000000000              44000.00
100                 111262.50       1.000000000000000              44000.00
110                  61637.81      0.5000000000000000              61637.81
110                  68805.00       1.000000000000000              61637.81
115                6000000.00      0.5000000000000000            6000000.00
115                7480000.00       1.000000000000000            6000000.00
120                  22935.00      0.3333333333333333              33620.63
120                  33620.63      0.6666666666666666              33620.63
120                  39224.06       1.000000000000000              33620.63
121                 110000.00       1.000000000000000             110000.00
123                  38500.00       1.000000000000000              38500.00
125                  33000.00       1.000000000000000              33000.00
130                  86292.94      0.5000000000000000              86292.94
130                 102750.00       1.000000000000000              86292.94
140                 100914.00       1.000000000000000             100914.00
180                  42742.50      0.5000000000000000              42742.50
180                  64635.00       1.000000000000000              42742.50
```

## PERCENTILE_CONT

`PERCENTILE_CONT` is an inverse distribution function that assumes a continuous distribution model.
It takes a percentile value and a sort specification and returns a value computed by linear interpolation between adjacent elements of the ordered set.
Nulls are ignored in the calculation.

Syntax for the `PERCENTILE_CONT` function as an aggregate function.

```
PERCENTILE_CONT(<percent>) WITHIN GROUP (ORDER BY <expr> [ASC | DESC])
```

Syntax for the `PERCENTILE_CONT` function as a window function.

```
PERCENTILE_CONT(<percent>) WITHIN GROUP (ORDER BY <expr> [ASC | DESC])
  OVER (PARTITION BY <part_expr>)
```

The first argument `<percent>` must evaluate to a numeric value between 0 and 1, because it is a percentile value.
This expression must be constant within each aggregate group.
The `ORDER BY` clause takes a single expression, which must be of numeric type to perform interpolation.

The `PERCENTILE_CONT` function returns a value of type `DOUBLE PRECISION` or `DECFLOAT(34)`, depending on the type
of the argument in the `ORDER BY` clause. `DECFLOAT(34)` is returned if the `ORDER BY` expression is of type
`INT128`, `NUMERIC(38, x)`, `DECFLOAT(16)` or `DECFLOAT(34)`; otherwise `DOUBLE PRECISION` is returned.
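
The result-type rule above amounts to a small lookup. The sketch below only restates the paragraph in Python: `percentile_cont_result_type` is a hypothetical helper that works on type names written as text, and it has nothing to do with how the engine derives types internally.

```python
import re


def percentile_cont_result_type(order_by_type: str) -> str:
    """Map the ORDER BY argument type name to the documented result type."""
    t = order_by_type.strip().upper()
    # INT128 and both DECFLOAT precisions produce DECFLOAT(34).
    if t == "INT128" or re.fullmatch(r"DECFLOAT\s*\(\s*(16|34)\s*\)", t):
        return "DECFLOAT(34)"
    # NUMERIC with precision 38 (any scale) also produces DECFLOAT(34).
    if re.fullmatch(r"NUMERIC\s*\(\s*38\s*(,\s*\d+\s*)?\)", t):
        return "DECFLOAT(34)"
    # Every other numeric type falls back to DOUBLE PRECISION.
    return "DOUBLE PRECISION"
```

For instance, `NUMERIC(38, 4)` maps to `DECFLOAT(34)`, while `NUMERIC(18, 2)` and `INTEGER` map to `DOUBLE PRECISION`.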

The result of `PERCENTILE_CONT` is computed by linear interpolation between values after ordering them.
Using the percentile value (`P`) and the number of rows (`N`) in the aggregation group, you can compute
the row number you are interested in after ordering the rows with respect to the sort specification.
This row number (`RN`) is computed according to the formula `RN = 1 + P * (N - 1)`.
The final result of the aggregate function is computed by linear interpolation between the values from rows
at row numbers `CRN = CEILING(RN)` and `FRN = FLOOR(RN)`.

```
function f(N) ::= value of expression from row at N

if (CRN = FRN = RN) then
    return f(RN)
else
    return (CRN - RN) * f(FRN) + (RN - FRN) * f(CRN)
```
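
The interpolation steps above can be followed in a short Python sketch. It is an illustration under stated assumptions, not Firebird's implementation: `percentile_cont` is a hypothetical helper, NULLs are modelled as `None`, an empty group is assumed to yield `None`, and plain floats stand in for `DOUBLE PRECISION`.

```python
import math
from typing import Optional, Sequence


def percentile_cont(p: float, values: Sequence[Optional[float]],
                    descending: bool = False) -> Optional[float]:
    """Interpolate linearly between the rows around RN = 1 + P * (N - 1)."""
    if not 0 <= p <= 1:
        raise ValueError("percentile must be between 0 and 1")
    # NULLs (None) are ignored, as the description states.
    ordered = sorted((v for v in values if v is not None), reverse=descending)
    n = len(ordered)
    if n == 0:
        return None  # assumption: an empty group yields NULL
    rn = 1 + p * (n - 1)  # 1-based row number, possibly fractional
    frn, crn = math.floor(rn), math.ceil(rn)
    if frn == crn:
        # RN points exactly at one row, so no interpolation is needed.
        return float(ordered[frn - 1])
    return (crn - rn) * ordered[frn - 1] + (rn - frn) * ordered[crn - 1]


# Salaries of department 000 from the example in this document.
print(percentile_cont(0.5, [53793.00, 212850.00]))  # 133321.5
```

With two rows and `P = 0.5`, `RN` is 1.5, so the result is the midpoint of the two salaries; with three rows `RN` is exactly 2 and the middle row is returned unchanged.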

### Analytic Example

```sql
SELECT
  DEPT_NO,
  SALARY,
  PERCENT_RANK() OVER(PARTITION BY DEPT_NO ORDER BY SALARY) AS "PERCENT_RANK",
  PERCENTILE_CONT(0.5) WITHIN GROUP(ORDER BY SALARY)
    OVER(PARTITION BY DEPT_NO) AS MEDIAN_CONT
FROM EMPLOYEE
WHERE DEPT_NO < 600
ORDER BY 1, 2;
```

```
DEPT_NO                SALARY            PERCENT_RANK             MEDIAN_CONT
======= ===================== ======================= =======================
000                  53793.00       0.000000000000000       133321.5000000000
000                 212850.00       1.000000000000000       133321.5000000000
100                  44000.00       0.000000000000000       77631.25000000000
100                 111262.50       1.000000000000000       77631.25000000000
110                  61637.81       0.000000000000000       65221.40500000000
110                  68805.00       1.000000000000000       65221.40500000000
115                6000000.00       0.000000000000000       6740000.000000000
115                7480000.00       1.000000000000000       6740000.000000000
120                  22935.00       0.000000000000000       33620.63000000000
120                  33620.63      0.5000000000000000       33620.63000000000
120                  39224.06       1.000000000000000       33620.63000000000
121                 110000.00       0.000000000000000       110000.0000000000
123                  38500.00       0.000000000000000       38500.00000000000
125                  33000.00       0.000000000000000       33000.00000000000
130                  86292.94       0.000000000000000       94521.47000000000
130                 102750.00       1.000000000000000       94521.47000000000
140                 100914.00       0.000000000000000       100914.0000000000
180                  42742.50       0.000000000000000       53688.75000000000
180                  64635.00       1.000000000000000       53688.75000000000

## An example of using both aggregate functions

```sql
SELECT
  DEPT_NO,
  PERCENTILE_CONT(0.5) WITHIN GROUP(ORDER BY SALARY) AS MEDIAN_CONT,
  PERCENTILE_DISC(0.5) WITHIN GROUP(ORDER BY SALARY) AS MEDIAN_DISC
FROM EMPLOYEE
GROUP BY DEPT_NO;
```

```
DEPT_NO             MEDIAN_CONT           MEDIAN_DISC
======= ======================= =====================
000           133321.5000000000              53793.00
100           77631.25000000000              44000.00
110           65221.40500000000              61637.81
115           6740000.000000000            6000000.00
120           33620.63000000000              33620.63
121           110000.0000000000             110000.00
123           38500.00000000000              38500.00
125           33000.00000000000              33000.00
130           94521.47000000000              86292.94
140           100914.0000000000             100914.00
180           53688.75000000000              42742.50
600           66450.00000000000              27000.00
621           71619.75000000000              62550.00
622           53167.50000000000              53167.50
623           60000.00000000000              60000.00
670           71268.75000000000              31275.00
671           81810.19000000000              81810.19
672           45647.50000000000              35000.00
900           92791.31500000000              69482.63
```
2 changes: 2 additions & 0 deletions src/common/ParserTokens.h
@@ -373,6 +373,8 @@ PARSER_TOKEN(TOK_PARAMETER, "PARAMETER", false)
PARSER_TOKEN(TOK_PARTITION, "PARTITION", true)
PARSER_TOKEN(TOK_PASSWORD, "PASSWORD", true)
PARSER_TOKEN(TOK_PERCENT_RANK, "PERCENT_RANK", true)
PARSER_TOKEN(TOK_PERCENTILE_CONT, "PERCENTILE_CONT", true)
PARSER_TOKEN(TOK_PERCENTILE_DISC, "PERCENTILE_DISC", true)
PARSER_TOKEN(TOK_PI, "PI", true)
PARSER_TOKEN(TOK_PKCS_1_5, "PKCS_1_5", true)
PARSER_TOKEN(TOK_PLACING, "PLACING", true)