Merged
3 changes: 3 additions & 0 deletions core/docs/Changelog.md
@@ -4,6 +4,9 @@

* Breaking: In `FileSystem.Path` module the default for `eqPath` changed
on Windows to case-sensitive comparison.
* Breaking: A leading "." component (e.g. "." or "./x") is no longer
treated as a rooted path, making the behavior more in line with
intuitive expectation.
* Breaking: In `FileSystem.Path` module the default for `eqPath` changed
on both Posix and Windows so that `allowRelativeEquality` is `True` by
default. Literally identical relative paths (e.g. `./x` and `./x`, or
127 changes: 12 additions & 115 deletions core/src/Streamly/Internal/FileSystem/Path.hs
@@ -6,138 +6,35 @@
-- Maintainer : streamly@composewell.com
-- Portability : GHC
--
-- See docs/Developer/FileSystem.Path.md for design doc.
--
-- The API in this module is equivalent to or can emulate all or most of
-- the filepath package API. It has some differences from the filepath
-- package:
--
-- 1. Empty paths are not allowed. Paths are validated before construction.
-- 2. The default Path type itself affords considerable safety regarding the
-- 1. The append operation follows path construction semantics rather than
-- the path resolution and navigation based semantics used by the </> operation
-- in the filepath package. Run time failures are better than silent problems.
-- 2. Empty paths are not allowed. Paths are validated before construction.
-- 3. The default Path type itself affords considerable safety regarding the
-- distinction of rooted or non-rooted paths, it also allows distinguishing
-- directory and file paths.
-- 3. It is designed to provide flexible typing to provide compile time safety
-- 4. It is designed to provide flexible typing to provide compile time safety
-- for rooted/non-rooted paths and file/dir paths. The Path type is just part
-- of that typed path ecosystem. Though the default Path type itself should be
-- enough for most cases.
-- 4. It leverages the streamly array module for most of the heavy lifting,
-- 5. It leverages the streamly array module for most of the heavy lifting,
-- it is a thin wrapper on top of that, improving maintainability as well as
-- providing better performance. We can have pinned and unpinned paths, also
-- provide lower level operations for certain cases to interact more
-- efficiently with low level code.
-- 6. The share name is part of the root when we split the root. This allows us
-- to always treat the server and share name in a case insensitive manner, while
-- the remaining path can be normalized as case sensitive or insensitive.
--
-- It builds on arrays, has a richer and more consistent API, provides streaming
-- ops where it makes sense, and treats performance as a primary goal.
--
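The difference between construction-style and resolution-style append (item 1 in the list above) can be illustrated with a minimal sketch. In the filepath package, `"usr" </> "/bin"` evaluates to `"/bin"`, silently discarding the first operand. The sketch below fails instead. `isRooted` and `appendStrict` are hypothetical names for illustration only, not the streamly API, and the sketch assumes Posix-style paths where only a leading `/` marks a root.

```haskell
-- Hypothetical sketch; these names are illustrative, not the streamly API.
import Data.List (isPrefixOf)

-- | Assumption: Posix-only, a path is rooted only if it starts with '/'.
isRooted :: String -> Bool
isRooted = isPrefixOf "/"

-- | Construction-style append: refuse to append a rooted path instead of
-- silently dropping the first operand. Empty paths are rejected as well.
appendStrict :: String -> String -> Either String String
appendStrict a b
    | null a || null b = Left "empty path"
    | isRooted b = Left ("cannot append a rooted path: " ++ b)
    | last a == '/' = Right (a ++ b)          -- avoid a doubled separator
    | otherwise = Right (a ++ "/" ++ b)
```

With this sketch, `appendStrict "usr" "bin"` yields `Right "usr/bin"`, while `appendStrict "usr" "/bin"` yields a `Left` carrying an error message, so the mistake surfaces at the call site.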
-- == References
--
-- * https://en.wikipedia.org/wiki/Path_(computing)
-- * https://learn.microsoft.com/en-us/windows/win32/fileio/naming-a-file
-- * https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-dtyp/62e862f4-2a51-452e-8eeb-dc4ff5ee33cc
--
-- == Windows and Posix Paths
--
-- We should be able to manipulate windows paths on posix and posix paths on
-- windows as well. Therefore, we have WindowsPath and PosixPath types which
-- are supported on both platforms. However, the Path module aliases Path to
-- WindowsPath on Windows and PosixPath on Posix.
--
-- == File System as Tree vs Graph
--
-- A file system is a tree when there are no hard links or symbolic links. But
-- in the presence of symlinks it could be a DAG or a graph, because directory
-- symlinks can create cycles.
--
-- == Rooted and Branch paths
--
-- We make two distinctions for paths, a path may have a specific filesystem
-- root attached to it or it may be a free branch without a root attached.
--
-- A path that has a root attached to it is called a rooted path e.g. /usr is a
-- rooted path, . is a rooted path, ./bin is a rooted path. A rooted path could
-- be absolute e.g. /usr or it could be relative e.g. ./bin . A rooted path
-- always has two components, a specific "root" which could be explicit or
-- implicit, and a path segment relative to the root. A rooted path with a
-- fixed root is known as an absolute path whereas a rooted path with an
-- implicit root e.g. "./bin" is known as a relative path.
--
-- A path that does not have a root attached but defines steps to go from some
-- place to another is a path branch. For example, "local/bin" is a path branch
-- whereas "./local/bin" is a rooted path.
--
-- Rooted paths can never be appended to any other path whereas a branch can be
-- appended.
--
-- The rooted/unrooted path concept is especially useful on Windows. Windows is
-- different in that C:x is a curdir relative path and /x is a curdrive relative
-- path. Even though these paths are relative they cannot be appended to other
-- paths. The only relative path that can be appended is "./x". Ideally, we
-- should be able to append C:x and C:y to get C:x/y if we treat them as ./x and
-- ./y, but we can't; only "." has the treatment that it can be removed and the
-- rest made a path segment.
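The rooted/branch terminology of this section can be sketched as a small classifier. This is only an illustration under stated assumptions: it handles Posix-style paths, the names `PathKind` and `classify` are hypothetical and not part of the streamly API, and it follows the wording of this section, in which a leading "." counts as an implicit root. Note that the Changelog entry in this same change states that a leading "." component is no longer treated as a rooted path.

```haskell
-- Hypothetical sketch; these names are illustrative, not the streamly API.
import Data.List (isPrefixOf)

-- | The three kinds of path distinguished in the section above.
data PathKind
    = RootedAbsolute   -- explicit root, e.g. "/usr"
    | RootedRelative   -- implicit root ".", e.g. "./bin"
    | Branch           -- no root attached, e.g. "local/bin"
    deriving (Eq, Show)

-- | Assumption: Posix-style paths only; Windows forms such as "C:x" are
-- not handled by this sketch.
classify :: String -> PathKind
classify p
    | "/" `isPrefixOf` p = RootedAbsolute
    | p == "." || "./" `isPrefixOf` p = RootedRelative
    | otherwise = Branch
```

Under these assumptions `classify "local/bin"` is `Branch` and so could be appended to another path, whereas `classify "./local/bin"` is `RootedRelative` and could not.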
--
-- == Comparing Paths
--
-- We can compare two absolute rooted paths or path branches but we cannot
-- compare two relative rooted paths. If each component of the path is the same
-- then the paths are considered to be equal.
--
-- == Implicit Roots (.)
--
-- On Posix and Windows "." implicitly refers to the current directory. On
-- Windows a path like @/Users/@ has the drive reference implicit. Such
-- references are contextual and may have different meanings at different
-- times.
--
-- @./bin@ may refer to a different location depending on what "." is
-- referring to. Thus we should not allow @./bin@ to be appended to another
-- path, @bin@ can be appended though. Similarly, we cannot compare @./bin@
-- with @./bin@ and say that they are equal because they may be referring to
-- different locations depending on the context in which the paths were created.
--
-- The same arguments apply to paths with implicit drive on Windows.
--
-- We can treat @.\/bin\/ls@ as an absolute path with "." as an implicit root.
-- The relative path is "bin/ls" which represents steps from somewhere to
-- somewhere else rather than a particular location. We can also call @./bin@
-- a "rooted path" as it starts from a particular location rather than
-- defining "steps" to go from one place to another. If we want to append such
-- paths we need to first make them explicitly relative by dropping the
-- implicit root. Or we can use unsafeAppend to force it anyway or unsafeCast
-- to convert absolute to relative.
--
-- On these absolute (Rooted) paths if we use takeRoot, it should return
-- RootCurDir, RootCurDrive and @Root Path@ to distinguish @./@, @/@, @C:/@. We
-- could represent them by different types but that would make the types even
-- more complicated. So runtime checks are a good balance.
--
-- Path comparison should return EqTrue, EqFalse or EqUnknown. If we compare
-- these absolute/located paths having implicit roots then result should be
-- EqUnknown, or maybe we can just return False? @./bin@ and @./bin@ should be
-- treated as paths with different roots/drives but same relative path. The
-- programmer can explicitly drop the root and compare the relative paths if
-- they want to check literal equality.
--
-- Note that a trailing . or a . in the middle of a path is different as it
-- refers to a known name.
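The three-valued comparison described in this section can be sketched as follows. The constructor names `EqTrue`, `EqFalse` and `EqUnknown` come from the text above; the type name `PathEq` and the functions `hasImplicitRoot`, `dropImplicitRoot` and `comparePaths` are hypothetical and not the streamly API. The sketch assumes Posix-style paths and compares the remaining paths literally, so `EqFalse` only means "not literally equal".

```haskell
-- Hypothetical sketch; these names are illustrative, not the streamly API.
import Data.List (isPrefixOf)

data PathEq = EqTrue | EqFalse | EqUnknown
    deriving (Eq, Show)

-- | True if the path is "." or starts with "./", i.e. it has an implicit
-- current-directory root whose meaning depends on context.
hasImplicitRoot :: String -> Bool
hasImplicitRoot p = p == "." || "./" `isPrefixOf` p

-- | Drop a leading "./" (and any extra separators) so that only the
-- relative part remains. Other paths are returned unchanged.
dropImplicitRoot :: String -> String
dropImplicitRoot ('.':'/':rest) = dropWhile (== '/') rest
dropImplicitRoot p = p

-- | Literal comparison that refuses to decide when either path has an
-- implicit root.
comparePaths :: String -> String -> PathEq
comparePaths a b
    | hasImplicitRoot a || hasImplicitRoot b = EqUnknown
    | a == b = EqTrue
    | otherwise = EqFalse
```

Here `comparePaths "./bin" "./bin"` is `EqUnknown`, while a caller who wants literal equality can drop the implicit root first: `comparePaths (dropImplicitRoot "./bin") (dropImplicitRoot "./bin")` is `EqTrue`.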
--
-- == Ambiguous References (..)
--
-- ".." in a path refers to the parent directory relative to the current path.
-- For an absolute root directory ".." refers to the root itself because you
-- cannot go further up.
--
-- When resolving ".." it always resolves to the parent of a directory as
-- stored in the directory entry. So if we landed in a directory via a symlink,
-- ".." can take us back to a different directory and not to the symlink
-- itself. Thus @a\/b/..@ may not be the same as @a/@. Shells like bash keep
-- track of the old paths explicitly, so you may not see this behavior when
-- using a shell.
--
-- For this reason we cannot process ".." in the path statically. However, if
-- the components of two paths are exactly the same then they will always
-- resolve to the same target. But two paths with different components could
-- also point to the same target. So if there are ".." in the path we cannot
-- definitively say if they are the same without resolving them.
--
-- == Exception Handling
--
-- Path creation routines use MonadThrow which can be interpreted as an Either