Conversation

@Tpt
Contributor

@Tpt Tpt commented Dec 29, 2025

No description provided.

@Tpt Tpt requested review from afs and rubensworks December 29, 2025 16:21
@Tpt Tpt self-assigned this Dec 29, 2025
@Tpt Tpt mentioned this pull request Dec 29, 2025
40 tasks
@afs
Contributor

afs commented Dec 31, 2025

Testing this accurately is harder than usual.

Jena's usual comparison of result sets uses a test for same term and, if that fails, a test for same value. That comparison would pass this PR, but it's a weak check.

What do other systems do?

  1. These unary operators require more than "same value", but the test should still be value-sensitive, not "same term". After a calculation, the lexical form is not fixed. The tests need to be datatype sensitive.

  2. Negative zero. Value testing isn't enough.
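
The two points above can be illustrated outside SPARQL. This is a minimal sketch using Python floats (IEEE 754 binary64) as a stand-in for xsd:double values; the helper names `same_value` and `same_sign_bit` are hypothetical and not part of any SPARQL system.

```python
import math

def same_value(a: float, b: float) -> bool:
    """Value equality as IEEE 754 defines it (0.0 equals -0.0)."""
    return a == b

def same_sign_bit(a: float, b: float) -> bool:
    """Compare sign bits; copysign sees the sign of a zero."""
    return math.copysign(1.0, a) == math.copysign(1.0, b)

pos_zero = float("0")
neg_zero = float("-0")

# Point 1: different lexical forms map to the same value.
assert float("0") == float("0.0") == float("0.0E0")

# Point 2: value testing cannot tell the two zeros apart, the sign bit can.
assert same_value(pos_zero, neg_zero)
assert not same_sign_bit(pos_zero, neg_zero)
```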

XSD 1.1 differentiates between "canonical mapping" and "canonical representation".

There is a canonical representation for xsd:double in XSD 1.0, but there isn't one in XSD 1.1. The nearest in XSD 1.1 is:

The ·canonical mapping· ·doubleCanonicalMap· is provided as an example of a mapping that does not produce unnecessarily long ·canonical representations·. Other algorithms which do not yield identical results for mapping from float values to character strings are permitted by [IEEE 754-2008].

The canonical mapping is for the value. It does state the formats 0.0E0 for positiveZero and -0.0E0 for negativeZero.

SPARQL systems are at liberty to provide Turtle-friendly forms (Jena does) or use what the programming language provides.
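
As a rough illustration of "use what the programming language provides", the sketch below turns a Python float into one valid xsd:double lexical form. The function `xsd_double_lexical` is hypothetical; it is not doubleCanonicalMap and will not produce forms such as 0.0E0. It only renames the three special values, because Python spells them `inf`, `-inf` and `nan`.

```python
import math

def xsd_double_lexical(value: float) -> str:
    """Return one valid xsd:double lexical form for a Python float.

    Assumption of this sketch: any valid lexical form is acceptable,
    so repr() is used for finite values.
    """
    if math.isnan(value):
        return "NaN"
    if math.isinf(value):
        return "INF" if value > 0 else "-INF"
    return repr(value)  # keeps the sign of negative zero: "-0.0"

assert xsd_double_lexical(0.0) == "0.0"
assert xsd_double_lexical(-0.0) == "-0.0"
assert xsd_double_lexical(float("-inf")) == "-INF"
```

Another system could legitimately print the same values differently, which is why a test based on lexical forms alone is fragile.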

What to do about it?

  1. Be relaxed. Record the right answer, accept that it isn't perfect.
  2. Put the logic in the query (!).
  3. Make the comparison of result sets more complicated. I think this is excessive.

Example of 2 for ROUND below.

@afs
Contributor

afs commented Dec 31, 2025

Example of data that includes the expected outcome, and a query that does detailed checking. The results are a table of "Pass" and "Fail" with expected and actual terms.

Data, including outcomes
PREFIX :     <http://example/>
PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>

:n01_Zero_f :num "0"^^xsd:float ;
    :expected "0"^^xsd:float .

:n02_Zero_d :num "0"^^xsd:double ;
    :expected "0"^^xsd:double .

:n03_NegativeZero_f
    :num "-0"^^xsd:float ;
    :expected "-0"^^xsd:float .

:n04_NegativeZero_d
    :num "-0"^^xsd:double ;
    :expected "-0"^^xsd:double .

:n05_Positive_f
    :num "0.1"^^xsd:float ;
    :expected "0"^^xsd:float .

:n06_Positive_d
    :num "0.1"^^xsd:double ;
    :expected "0"^^xsd:double .

:n07_Negative_f
    :num "-0.1"^^xsd:float ;
    :expected "-0"^^xsd:float .
    
:n08_Negative_d
    :num "-0.1"^^xsd:double ;
    :expected "-0"^^xsd:double .

:n09_NaN_f
    :num "NaN"^^xsd:float ;
    :expected "NaN"^^xsd:float .

:n10_NaN_d
    :num "NaN"^^xsd:double ;
    :expected "NaN"^^xsd:double .

:n11_Inf_f
    :num "INF"^^xsd:float ;
    :expected "INF"^^xsd:float .

:n12_Inf_d
    :num "INF"^^xsd:double ;
    :expected "INF"^^xsd:double .

:n13_NegInf_f
    :num "-INF"^^xsd:float ;
    :expected "-INF"^^xsd:float .

:n14_NegInf_d
    :num "-INF"^^xsd:double ;
    :expected "-INF"^^xsd:double .
Query to process data with included outcomes
PREFIX :     <http://example/>
PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>

SELECT ?n ?outcome ?expected ?actual{
  ?n  :num ?value ;
      :expected ?expected .
  BIND(round(?value) AS ?actual)
  BIND(
      IF(
            ## Same datatype
            ( datatype(?actual) = datatype(?expected) ) &&
            ## Same value or same term (for NaN)
            ( (?expected = ?actual) || (sameTerm(?expected, ?actual)) ) &&
            ## Same sign (for negative zero). -0 = 0 by value, so compare the leading "-" of the lexical forms.
            ( STRSTARTS(STR(?expected), "-") = STRSTARTS(STR(?actual), "-") )
        , "Pass"
        , "Fail"
        ) AS ?outcome)
} ORDER BY ?n
Results
---------------------------------------------------------------------------
| n                   | outcome | expected           | actual             |
===========================================================================
| :n01_Zero_f         | "Pass"  | "0"^^xsd:float     | "0.0"^^xsd:float   |
| :n02_Zero_d         | "Pass"  | "0"^^xsd:double    | 0.0e0              |
| :n03_NegativeZero_f | "Pass"  | "-0"^^xsd:float    | "-0.0"^^xsd:float  |
| :n04_NegativeZero_d | "Pass"  | "-0"^^xsd:double   | -0.0e0             |
| :n05_Positive_f     | "Pass"  | "0"^^xsd:float     | "0.0"^^xsd:float   |
| :n06_Positive_d     | "Pass"  | "0"^^xsd:double    | 0.0e0              |
| :n07_Negative_f     | "Pass"  | "-0"^^xsd:float    | "-0.0"^^xsd:float  |
| :n08_Negative_d     | "Pass"  | "-0"^^xsd:double   | -0.0e0             |
| :n09_NaN_f          | "Pass"  | "NaN"^^xsd:float   | "NaN"^^xsd:float   |
| :n10_NaN_d          | "Pass"  | "NaN"^^xsd:double  | "NaN"^^xsd:double  |
| :n11_Inf_f          | "Pass"  | "INF"^^xsd:float   | "INF"^^xsd:float   |
| :n12_Inf_d          | "Pass"  | "INF"^^xsd:double  | "INF"^^xsd:double  |
| :n13_NegInf_f       | "Pass"  | "-INF"^^xsd:float  | "-INF"^^xsd:float  |
| :n14_NegInf_d       | "Pass"  | "-INF"^^xsd:double | "-INF"^^xsd:double |
---------------------------------------------------------------------------
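
The same three-part check (datatype, value or NaN, sign) can be sketched outside SPARQL. This is only an illustration under stated assumptions: the `Literal` class and `outcome` function are hypothetical, values are held as Python floats without rounding xsd:float to 32 bits, and the sign is read with `math.copysign` because IEEE 754 comparison treats -0.0 and 0.0 as equal.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for an RDF literal of type xsd:float or xsd:double."""
    lexical: str
    datatype: str

    @property
    def value(self) -> float:
        # Python's float() accepts "INF", "-INF" and "NaN" (case-insensitive).
        return float(self.lexical)

def outcome(expected: Literal, actual: Literal) -> str:
    e, a = expected.value, actual.value
    same_datatype = expected.datatype == actual.datatype
    # Same value, or both NaN (NaN never compares equal to itself).
    same_value = (e == a) or (math.isnan(e) and math.isnan(a))
    # Same sign bit; skipped for NaN, whose sign carries no meaning here.
    same_sign = math.isnan(e) or math.copysign(1.0, e) == math.copysign(1.0, a)
    return "Pass" if same_datatype and same_value and same_sign else "Fail"

# Lexical form may differ, sign of zero may not.
assert outcome(Literal("-0", "xsd:double"), Literal("-0.0e0", "xsd:double")) == "Pass"
assert outcome(Literal("-0", "xsd:double"), Literal("0.0e0", "xsd:double")) == "Fail"
```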

@Tpt
Contributor Author

Tpt commented Dec 31, 2025

This is a great point. Thanks! I agree with you that option 2 is likely the best. I am going to update the tests.

Fun aside: XSD has only a single lexical representation for INF, -INF and NaN so there is no issue for these cases.

@afs
Contributor

afs commented Dec 31, 2025

Fun aside:

And yet NaN != NaN.

@Tpt
Contributor Author

Tpt commented Jan 9, 2026

And yet NaN != NaN.

Is it true in SPARQL? The SPARQL 1.2 spec says:

"NaN"^^xsd:double and "NaN"^^xsd:float are considered to represent the same value. If term1 and term2 are both "NaN" for either xsd:double or xsd:float, then return TRUE.

But it's something specific to SPARQL 1.2.

@afs
Contributor

afs commented Jan 9, 2026

And yet NaN != NaN.

Is it true IN SPARQL? In SPARQL 1.2 spec

Yes. The "=" operator goes to the dispatch table first, which delegates to op:numeric-equal, which calls out NaN as special: "NaN does not equal itself."

@afs
Contributor

afs commented Jan 9, 2026

"NaN"^^xsd:double and "NaN"^^xsd:float are considered to represent the same value. If term1 and term2 are both "NaN" for either xsd:double or xsd:float, then return TRUE.

But it's something specific to SPARQL 1.2

Yes.

sameValue("NaN"^^xsd:double, "NaN"^^xsd:double) is a consequence of the two being the same term and of the lexical-to-value mapping being a function.

sameValue("NaN"^^xsd:double, "NaN"^^xsd:float) is a choice. I don't see text that definitively points one way or the other.

(For binary operations, float is promoted to double, and "The result is the xs:double value that is the same as the original value".)

This is treating INF/-INF/NaN as symbols used as point extensions of the numbers to create a value space. (For point extensions, one has to define the meaning in all operations.)

https://www.w3.org/TR/xpath-functions-31/#func-numeric-equal

For xs:float and xs:double values, positive zero and negative zero compare equal. INF equals INF, and -INF equals -INF. NaN does not equal itself.
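
The four statements in that quote match ordinary IEEE 754 comparison, which a short Python sketch can show. Python floats are binary64; the function name `numeric_equal` is only a label for this sketch and is not the XPath implementation.

```python
def numeric_equal(a: float, b: float) -> bool:
    """IEEE 754 equality, used here to mirror the quoted rules."""
    return a == b

nan = float("nan")
inf = float("inf")

assert numeric_equal(0.0, -0.0)      # positive and negative zero compare equal
assert numeric_equal(inf, inf)       # INF equals INF
assert numeric_equal(-inf, -inf)     # -INF equals -INF
assert not numeric_equal(nan, nan)   # NaN does not equal itself
```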

which I read as "INF"^^xsd:double = "INF"^^xsd:float and continuing that to "NaN"^^xsd:double sameValue "NaN"^^xsd:float seems natural but isn't a fixed design conclusion.
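
A sketch of that reading, assuming Python's `struct` module as a way to obtain binary32 values: a float is rounded to 32 bits and read back as a 64-bit value, which is the promotion step. The helper `promote_float32` is hypothetical. It shows that a promoted INF equals the double INF, while a promoted NaN is still a NaN and so tells us nothing through "=".

```python
import math
import struct

def promote_float32(x: float) -> float:
    """Round x to the nearest binary32 value, then return it as binary64."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

inf = float("inf")
nan = float("nan")

# Finite values: promotion keeps the float's value, which may differ
# from the double written with the same lexical form.
assert promote_float32(0.5) == 0.5
assert promote_float32(0.1) != 0.1

# Special values: INF survives promotion and compares equal.
assert promote_float32(inf) == inf
assert promote_float32(-inf) == -inf

# NaN survives as NaN, but equality cannot observe whether it is "the same" NaN.
assert math.isnan(promote_float32(nan))
assert promote_float32(nan) != nan
```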

The double and float "NaN"s could be different value points. I don't know of a way to observe the design choice here.

One hint is: https://www.w3.org/TR/xmlschema11-2/#double

The only significant differences between float and double are the three defining constants 53 (vs 24), −1074 (vs −149), and 971 (vs 104).
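
To make those constants concrete, here is a small sketch that derives the largest finite value and the smallest positive value from each triple and checks them against what Python and `struct` report. The dictionary keys and helper names are invented for this sketch; the formulas assume the constants are the precision in bits, the minimum exponent and the maximum exponent of an integer significand.

```python
import math
import struct
import sys

DOUBLE = {"precision": 53, "min_exp": -1074, "max_exp": 971}
FLOAT = {"precision": 24, "min_exp": -149, "max_exp": 104}

def largest_finite(c: dict) -> float:
    """(2^precision - 1) * 2^max_exp, computed exactly with integers."""
    return float((2 ** c["precision"] - 1) * 2 ** c["max_exp"])

def smallest_positive(c: dict) -> float:
    """2^min_exp, the smallest positive (subnormal) value."""
    return math.ldexp(1.0, c["min_exp"])

def roundtrip_float32(x: float) -> float:
    return struct.unpack("<f", struct.pack("<f", x))[0]

# double: matches Python's own float limits.
assert sys.float_info.mant_dig == DOUBLE["precision"]
assert largest_finite(DOUBLE) == sys.float_info.max
assert smallest_positive(DOUBLE) == 5e-324

# float: both extremes are exactly representable in 32 bits.
assert roundtrip_float32(largest_finite(FLOAT)) == largest_finite(FLOAT)
assert roundtrip_float32(smallest_positive(FLOAT)) == smallest_positive(FLOAT)
```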

https://www.w3.org/TR/xmlschema11-2/#dt-specialvalue

A few special values in different value spaces (e.g. positiveInfinity, negativeInfinity, and notANumber in float and double) share names. Thus, special values can be distinguished from each other in the general case by considering both the name and the primitive datatype of the value;

"distinguished from each other in the general case" would be "sameTerm" in RDF speak.

The float and double value spaces are different: the constants pick different points on the number line, and finite numeric operations work on the number line (see promotions). Are the "special values" in the different value spaces different (fundamental) values? (Fundamental values being the choice of point extensions.)

Bottom line: I don't see text that is clearly definitive one way or the other.
Which is a safer choice?

If you like, I can move this to PR w3c/sparql-query#343.

