Skip to content

Add total counterparts to partial functions, such as head. #479

@kindaro

Description

@kindaro

This is the overview of the programme to add total counterparts to partial functions, such as head.

I @kindaro am accountable for this overview and I shall keep this message up to date. If something is off, comment; if I am not answering, send me an electronic letter; if I am not answering again, take over.

We estimate that there is 50% likelihood to get this done in 70 hours of work or less. This number is computed by adding mean values of subjective probability distributions of time it would take to get each of the following tasks done:

  • 10 hours to clarify the design
  • 35 hours to write all missing definitions, 1 hour each
  • 20 hours to write automated checks
  • 5 hours to rebase, merge, release, and otherwise frame the outcome

I heartily welcome the kind reader to comment on the sections of this message marked with the code «TODO».

status: request for comments

Implementation will begin when consensus with the maintainers is reached and the design is clarified.

motivation

There are two aspects to this programme:

  1. Enabling the «parse, don't validate» programming method. In this aspect, we need to add more definitions.
  2. Enabling the total functional programming method. In this aspect, we need to classify definitions into functions and partial functions.

These two aspects translate into two goals that seem to be largely independent.

An optional goal is to offer a clear way of managing partial functions that can then be scaled to other core libraries.

mathematical ground

definitions

As an example of a naming issue called the «mathematical red herring principle», things that are called «partial functions» and «unsafe functions» in the Haskell community are not functions, and «total function» is needless tautology.

  • «Partial functions» are not functions — they are relations that reflect distinctions. Two distinct elements in the target of such a relation cannot be related to the same element in the source — they are only allowed to be related to distinct elements. However, there may be elements in the source that are not related to any element in the target.
  • «Total functions» need the adjective «total» no more than natural numbers need the adjective «non-negative». Does one say «non-negative natural numbers»?
  • Unsafe functions are not functions — they are not even relations with the specified source and target.

So, «partial function» is a convenient but misleading shortcut for «relation that reflects distinctions».

This is important to keep in mind because it means that any theory of functions is not going to be a true description of a definition that wields any partial functions. Many principles of reasoning about Haskell programs will be invalidated. See for example Fast and Loose Reasoning is Morally Correct. Unsafe functions cannot even be modelled as general relations — rather, a theory of automata will need to be wielded, — so they will not be looked at. It seems to be the consensus that the best solution for them is isolation, as in #361.

partial map classifier

From the classical perspective, any partial function f: A ⇸ B can be turned into a function f*: A → B + 1 with the target A + 1, which in Haskell can be modelled by Maybe A. This type Maybe A would be called «partial map classifier» in Mathematics.

However, constructively we cannot really say if an arbitrary computation will terminate. For example, we do not know if a hailstone sequence starting from a big enough number will terminate. Technically speaking, each sound termination analyzer is incomplete.

Nevertheless, we hope to be able to say for any computation found in the core libraries and specifically vector either that:

  • It will terminate on all inputs.
  • There are specific known inputs on which it will not terminate (or evaluate to an error at run time).

Then, we can turn those partial functions for which we are able to say so into functions by putting the inputs on which they will not terminate to Nothing. We shall call this process and its outcome «completion». Thanks to the classical theory, we know that every function has at most one completion, so this is not ambiguous. We can even say that completion is a partial function from partial functions to functions, however sadly not computable.

example

This partial function from Data.Vector.Generic:

-- | /O(1)/ First element.
head :: Vector v a => v a -> a
{-# INLINE_FUSED head #-}
head v = v ! 0

— Is completed to this function:

-- | /O(1)/ Just the first element — or nothing when there are none.
vectorToMaybe :: Vector v a => v a -> Maybe a
{-# INLINE_FUSED vectorToMaybe #-}
vectorToMaybe v = v !? 0

design

A list of partial functions in vector is already compiled for us in rio. There are 35 of them. None of them have a completion yet (TODO look again).

  • To achieve the goal № 1, the missing definitions need to be written and merged, in no specific order.
  • To achieve the goal № 2:
    • A way of marking definitions as functions or partial functions should be offered. (TODO find a way to mark definitions as functions or partial functions that is friendly to both humans and machines)
    • A way to easily relate a partial function and its completion pairwise should be offered. (TODO find a way to mark definitions as related by the completion relation)

For the sake of sustainability, automated checks should be added. Given a pair of a partial function and its completion:

  • A set of inputs should be given for which the partial function evaluates to an error (or a reasonable timeout if it loops rather than evaluating to an error) and its completion evaluates to Nothing.
  • A set of inputs should be given for which the partial function evaluates to a value and its completion evaluates to a value which outermost data constructor is Just.

In the ideal, these two sets should exhaust the source.

In the ideal, these checks will be defined in such a way that human friendly documentation can also be generated, then somehow such documentation should make its way to haddocks. This would ensure sustainability of documentation as well as code itself.

(TODO think of a way to get great automated checks and documentation)

naming

All new names shall be either:

  • a variation on the name of the definition being completed
  • a grammatically correct phrase of the English language that describes the computation

Arbitrary abbreviations and shorthands will not be allowed. More specific names will be picked rather than more general ones.

(TODO draft specific suggestions)

scope

Only definitions exported from Data.Vector are to be completed at this time. (TODO what other definitions should we work on?) Unsafe definitions that cannot be trivially modelled as relations will not be looked at — their completion is not grounded in Mathematics; they are out of scope.

future work

In the ideal, we should offer two modules: the one would export only functions and the other would export only partial functions. That the way we mark definitions as either lets us hope that in future work such modules can be generated automatically. This would complete the solution of the aspect № 2.

chunks of work

The work will be done in working, fully done chunks, split as follows:

  1. Draft the completion of head in Data.Vector.Generic with basic checks and documentation.
  2. Perfect and generalize checks and documentation and add one more completion that shows their strength at scale.
  3. Draft all other completions.
  4. Perfect the names.
  5. Replicate all new definitions through all of the variants of vectors: storable, primitive, unboxed and boxed
  6. Merge.

prior art

(TODO mention other conversations where similar programmes were talked about)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions