Skip to content

Type annotations for base spec#2041

Open
slmnsh wants to merge 1 commit into
fsspec:masterfrom
slmnsh:master
Open

Type annotations for base spec#2041
slmnsh wants to merge 1 commit into
fsspec:masterfrom
slmnsh:master

Conversation

@slmnsh

@slmnsh slmnsh commented May 26, 2026

Copy link
Copy Markdown

Fixes #625

Comment thread fsspec/spec.py Outdated
maxdepth: int | None,
topdown: bool,
on_error: Literal["omit", "raise"] | Callable[[OSError], Any],
detail: Literal[True],

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it makes sense to make detail kwarg explicit (with default false) for all the functions that takes it?

Comment thread fsspec/spec.py Outdated
)
else:
ac = kwargs.pop("autocommit", not self._intrans)
ac: bool = kwargs.pop("autocommit", not self._intrans)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to make these kwargs explicit with default None. What do you think?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point was, not all subclasses support autocommit - so they would have to have their methods changed to also have the explicit kwarg and then ignore it.

Comment thread fsspec/spec.py Outdated
Comment thread fsspec/spec.py

@martindurant martindurant left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some thoughts.

My biggest gripe is the extended overrides, mostly because of detail=True|False.

Comment thread fsspec/spec.py Outdated
Comment on lines +134 to +135
async_impl: bool = False
mirror_sync_methods: bool = False

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are class attributes

Comment thread fsspec/spec.py
Comment thread fsspec/spec.py Outdated
self._invalidated_caches_in_transaction.append(path)

def mkdir(self, path, create_parents=True, **kwargs):
def mkdir(self, path: TPath, create_parents: bool = True, **kwargs: Any):

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Annotating Any doesn't really help...

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah but without it the type is Unknown and mypy raises warning for that

Comment thread fsspec/spec.py Outdated
Comment on lines +353 to +363
@overload
def ls(
self, path: TPath, detail: Literal[True], **kwargs: Any
) -> list[FileInfo]: ...

@overload
def ls(self, path: TPath, detail: Literal[False], **kwargs: Any) -> list[str]: ...

def ls(
self, path: TPath, detail: bool = True, **kwargs: Any
) -> list[str] | list[FileInfo]:

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realise that you are trying to specify: if detail is True, returns list[dict]; if False returns list[str]. Does it really take all this code to do so?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sadly yes. python doesn't have proper type manipulation support like typescript yet.

Comment thread fsspec/spec.py Outdated
Comment thread fsspec/spec.py Outdated
Comment on lines +2039 to +2046
fs: AbstractFileSystem,
path: str,
mode: str = "rb",
block_size: int | Literal["default"] | None = "default",
autocommit: bool = True,
cache_type: Literal["readahead", "none", "mmap", "bytes"] = "readahead",
cache_options: dict[str, Any] | None = None,
size: int | None =None,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General question: why is it useful to list the attributes above and also type them here, conveying the information twice?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these are two separate typehints one for class members and one for constructor function. python type system can't infer it from constructor function types

Comment thread fsspec/spec.py Outdated
mode: str = "rb",
block_size: int | Literal["default"] | None = "default",
autocommit: bool = True,
cache_type: Literal["readahead", "none", "mmap", "bytes"] = "readahead",

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This list is not exhaustive, more cache types may be registered at runtime (see the "prefetcher" cache in gcsfs and "known_parts" cache in fsspec.parquet; also "all" is missing).

Comment thread fsspec/spec.py Outdated
Comment on lines +1905 to +2081
self.DEFAULT_BLOCK_SIZE if block_size in ["default", None] else block_size
self.DEFAULT_BLOCK_SIZE if block_size is None or block_size == "default" else block_size

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because type checker can't narrow types with in operator.

Comment thread fsspec/spec.py Outdated
Comment on lines +2133 to +2135
@property
def name(self):
return _unstrip_protocol(self.path, self.fs)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does this new property come from?

Comment thread fsspec/spec.py Outdated
else:
raise ValueError("Info not available while writing")

@override

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this override?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tell method from io.IOBase

@slmnsh

slmnsh commented May 29, 2026

Copy link
Copy Markdown
Author

@martindurant Thanks for review! I think I should've started with type stubs then later maybe add inline types. type stubs will be cleaner to add than inline types.

I will push type stubs this weekend.

@slmnsh

slmnsh commented Jun 1, 2026

Copy link
Copy Markdown
Author

@martindurant I have added completely separate type stub file. I am planning to add more for known implementations once you review this

@martindurant

Copy link
Copy Markdown
Member

Much happier with the annotations separate :). Now the main code is not affected, nor even import times. Do you have any thoughts on how best to keep the code and types in sync in the future?

@slmnsh

slmnsh commented Jun 1, 2026

Copy link
Copy Markdown
Author

mypy has stubtest for testing stub maybe we can incorporate that

https://mypy.readthedocs.io/en/stable/stubtest.html#automatic-stub-testing-stubtest

@martindurant

Copy link
Copy Markdown
Member

Can you please draft a non-required CI run with that, and text instructions for how to go about updating types in the future as the API changes?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Type annotations?

2 participants