zarr 3.2.x breaks kerchunk #3958

@avalentino

Zarr version

v3.2.1

Numcodecs version

v0.16.5

Python Version

3.13, 3.14

Operating System

Linux

Installation

Debian packages

Description

The kerchunk test suite fails with zarr v3.2.x.
The detailed error output I get when I try to build kerchunk with zarr v3.2.1 can be found below.

I have found some tickets that could be related: #3922, #3924, #3926.
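For context, the failing check can be exercised without zarr installed at all: kerchunk's `single_zarr` looks up the v2 `.zgroup` key in an fsspec mapper, so a store that only contains v3-style `zarr.json` metadata (which is, as far as I understand, what zarr v3.x writes by default) makes that lookup raise `KeyError`. A minimal sketch of that check follows; the `demo.zarr` store name and the v3 metadata contents are made up for illustration and are not taken from the kerchunk tests:

```python
import json

import fsspec

# Simulate a store that contains only v3-style metadata ("zarr.json")
# and no v2 ".zgroup" key.  The store name "demo.zarr" is hypothetical.
fs = fsspec.filesystem("memory")
fs.pipe(
    "/demo.zarr/zarr.json",
    json.dumps({"zarr_format": 3, "node_type": "group"}).encode(),
)

mapper = fsspec.get_mapper("memory:///demo.zarr")

# Same shape of check as kerchunk/zarr.py: read ".zgroup", require v2.
try:
    check = json.loads(mapper[".zgroup"])
    assert check["zarr_format"] == 2
    print("loaded as V2 zarr")
except KeyError as e:
    # This branch is taken: the v3-style store has no ".zgroup".
    print(f"Failed to load dataset as V2 zarr: missing {e}")
```

This reproduces the same `KeyError: '.zgroup'` → `ValueError("Failed to load dataset as V2 zarr")` chain seen in the traceback below.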

Detailed traceback
==================================== ERRORS ====================================
________________________ ERROR at setup of test_fixture ________________________

self = <fsspec.implementations.memory.MemoryFileSystem object at 0x734758ea3e00>
path = '/cfnontime1.zarr/.zgroup', start = None, end = None, kwargs = {}

    def cat_file(self, path, start=None, end=None, **kwargs):
        logger.debug("cat: %s", path)
        path = self._strip_protocol(path)
        try:
>           return bytes(self.store[path].getbuffer()[start:end])
                         ^^^^^^^^^^^^^^^^
E           KeyError: '/cfnontime1.zarr/.zgroup'

/usr/lib/python3/dist-packages/fsspec/implementations/memory.py:230: KeyError

The above exception was the direct cause of the following exception:

self = <fsspec.mapping.FSMap object at 0x73474f432ba0>, key = '.zgroup'
default = None

    def __getitem__(self, key, default=None):
        """Retrieve data"""
        k = self._key_to_str(key)
        try:
>           result = self.fs.cat(k)
                     ^^^^^^^^^^^^^^

/usr/lib/python3/dist-packages/fsspec/mapping.py:155: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/usr/lib/python3/dist-packages/fsspec/spec.py:917: in cat
    return self.cat_file(paths[0], **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <fsspec.implementations.memory.MemoryFileSystem object at 0x734758ea3e00>
path = '/cfnontime1.zarr/.zgroup', start = None, end = None, kwargs = {}

    def cat_file(self, path, start=None, end=None, **kwargs):
        logger.debug("cat: %s", path)
        path = self._strip_protocol(path)
        try:
            return bytes(self.store[path].getbuffer()[start:end])
        except KeyError as e:
>           raise FileNotFoundError(path) from e
E           FileNotFoundError: /cfnontime1.zarr/.zgroup

/usr/lib/python3/dist-packages/fsspec/implementations/memory.py:232: FileNotFoundError

The above exception was the direct cause of the following exception:

uri_or_store = 'memory:///cfnontime1.zarr', storage_options = None
inline_threshold = 100, inline = None, out = None

    def single_zarr(
        uri_or_store,
        storage_options=None,
        inline_threshold=100,
        inline=None,
        out=None,
    ):
        """kerchunk-style view on zarr mapper
    
        This is a similar process to zarr's consolidate_metadata, but does not
        need to be held in the original file tree. You do not need zarr itself
        to do this.
    
        This is useful for testing, so that we can pass hand-made zarrs to combine.
    
        Parameters
        ----------
        uri_or_store: str or dict-like
        storage_options: dict or None
            given to fsspec
        out: dict-like or None
            This allows you to supply an fsspec.implementations.reference.LazyReferenceMapper
            to write out parquet as the references get filled, or some other dictionary-like class
            to customise how references get stored
    
        Returns
        -------
        reference dict like
        """
        if isinstance(uri_or_store, str):
            mapper = fsspec.get_mapper(uri_or_store, **(storage_options or {}))
            prot = mapper.fs.protocol
            protocol = prot[0] if isinstance(prot, tuple) else prot
        else:
            mapper = uri_or_store
            if isinstance(mapper, fsspec.FSMap) and storage_options is None:
                storage_options = mapper.fs.storage_options
                prot = mapper.fs.protocol
                protocol = prot[0] if isinstance(prot, tuple) else prot
            else:
                protocol = None
    
        try:
>           check = ujson.loads(mapper[".zgroup"])
                                ^^^^^^^^^^^^^^^^^

kerchunk/zarr.py:51: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <fsspec.mapping.FSMap object at 0x73474f432ba0>, key = '.zgroup'
default = None

    def __getitem__(self, key, default=None):
        """Retrieve data"""
        k = self._key_to_str(key)
        try:
            result = self.fs.cat(k)
        except self.missing_exceptions as exc:
            if default is not None:
                return default
>           raise KeyError(key) from exc
E           KeyError: '.zgroup'

/usr/lib/python3/dist-packages/fsspec/mapping.py:159: KeyError

The above exception was the direct cause of the following exception:

    @pytest.fixture(scope="module")
    def refs():
        return {
>           path.replace(".zarr", "").lstrip("/"): single_zarr(f"memory://{path}")
                                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
            for path in fs.ls("", detail=False)
        }

../../../tests/test_combine.py:242: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

uri_or_store = 'memory:///cfnontime1.zarr', storage_options = None
inline_threshold = 100, inline = None, out = None

    def single_zarr(
        uri_or_store,
        storage_options=None,
        inline_threshold=100,
        inline=None,
        out=None,
    ):
        """kerchunk-style view on zarr mapper
    
        This is a similar process to zarr's consolidate_metadata, but does not
        need to be held in the original file tree. You do not need zarr itself
        to do this.
    
        This is useful for testing, so that we can pass hand-made zarrs to combine.
    
        Parameters
        ----------
        uri_or_store: str or dict-like
        storage_options: dict or None
            given to fsspec
        out: dict-like or None
            This allows you to supply an fsspec.implementations.reference.LazyReferenceMapper
            to write out parquet as the references get filled, or some other dictionary-like class
            to customise how references get stored
    
        Returns
        -------
        reference dict like
        """
        if isinstance(uri_or_store, str):
            mapper = fsspec.get_mapper(uri_or_store, **(storage_options or {}))
            prot = mapper.fs.protocol
            protocol = prot[0] if isinstance(prot, tuple) else prot
        else:
            mapper = uri_or_store
            if isinstance(mapper, fsspec.FSMap) and storage_options is None:
                storage_options = mapper.fs.storage_options
                prot = mapper.fs.protocol
                protocol = prot[0] if isinstance(prot, tuple) else prot
            else:
                protocol = None
    
        try:
            check = ujson.loads(mapper[".zgroup"])
            assert check["zarr_format"] == 2
        except (KeyError, ValueError, TypeError) as e:
>           raise ValueError("Failed to load dataset as V2 zarr") from e
E           ValueError: Failed to load dataset as V2 zarr

kerchunk/zarr.py:54: ValueError

[CUT]

___________________ ERROR at setup of test_cftimes_to_normal ___________________

[CUT] (identical traceback as above)
________________________ ERROR at setup of test_inline _________________________

[CUT] (identical traceback as above)
____________________ ERROR at setup of test_bad_coo_warning ____________________

[CUT] (identical traceback as above)
______________________ ERROR at setup of test_chunk_error ______________________

[CUT] (identical traceback as above, truncated in the original log)
            mapper = fsspec.get_mapper(uri_or_store, **(storage_options or {}))
            prot = mapper.fs.protocol
            protocol = prot[0] if isinstance(prot, tuple) else prot
        else:
            mapper = uri_or_store
            if isinstance(mapper, fsspec.FSMap) and storage_options is None:
                storage_options = mapper.fs.storage_options
                prot = mapper.fs.protocol
                protocol = prot[0] if isinstance(prot, tuple) else prot
            else:
                protocol = None
    
        try:
            check = ujson.loads(mapper[".zgroup"])
            assert check["zarr_format"] == 2
        except (KeyError, ValueError, TypeError) as e:
>           raise ValueError("Failed to load dataset as V2 zarr") from e
E           ValueError: Failed to load dataset as V2 zarr

kerchunk/zarr.py:54: ValueError
=================================== FAILURES ===================================
________________________________ test_no_inline ________________________________

self = <AsyncGroup <FsspecStore(ReferenceFileSystem, /)>>, key = 'x'

    async def getitem(
        self,
        key: str,
    ) -> AnyAsyncArray | AsyncGroup:
        """
        Get a subarray or subgroup from the group.
    
        Parameters
        ----------
        key : str
            Array or group name
    
        Returns
        -------
        AsyncArray or AsyncGroup
        """
        store_path = self.store_path / key
        logger.debug("key=%s, store_path=%s", key, store_path)
    
        # Consolidated metadata lets us avoid some I/O operations so try that first.
        if self.metadata.consolidated_metadata is not None:
            return self._getitem_consolidated(store_path, key, prefix=self.name)
        try:
>           return await get_node(
                store=store_path.store, path=store_path.path, zarr_format=self.metadata.zarr_format
            )

/usr/lib/python3/dist-packages/zarr/core/group.py:737: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/usr/lib/python3/dist-packages/zarr/core/group.py:3561: in get_node
    return await _get_node_v2(store=store, path=path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/usr/lib/python3/dist-packages/zarr/core/group.py:3518: in _get_node_v2
    metadata = await _read_metadata_v2(store=store, path=path)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

store = <FsspecStore(ReferenceFileSystem, /)>, path = 'x'

    async def _read_metadata_v2(store: Store, path: str) -> ArrayV2Metadata | GroupMetadata:
        """
        Given a store_path, return ArrayV2Metadata or GroupMetadata defined by the metadata
        document stored at store_path.path / (.zgroup | .zarray). If no such document is found,
        raise a FileNotFoundError.
        """
        # TODO: consider first fetching array metadata, and only fetching group metadata when we don't
        # find an array
        zarray_bytes, zgroup_bytes, zattrs_bytes = await asyncio.gather(
            store.get(_join_paths([path, ZARRAY_JSON]), prototype=default_buffer_prototype()),
            store.get(_join_paths([path, ZGROUP_JSON]), prototype=default_buffer_prototype()),
            store.get(_join_paths([path, ZATTRS_JSON]), prototype=default_buffer_prototype()),
        )
    
        if zattrs_bytes is None:
            zattrs = {}
        else:
            zattrs = json.loads(zattrs_bytes.to_bytes())
    
        # TODO: decide how to handle finding both array and group metadata. The spec does not seem to
        # consider this situation. A practical approach would be to ignore that combination, and only
        # return the array metadata.
        if zarray_bytes is not None:
            zmeta = json.loads(zarray_bytes.to_bytes())
        else:
            if zgroup_bytes is None:
                # neither .zarray or .zgroup were found results in KeyError
>               raise FileNotFoundError(path)
E               FileNotFoundError: x

/usr/lib/python3/dist-packages/zarr/core/group.py:3409: FileNotFoundError

The above exception was the direct cause of the following exception:

    def test_no_inline():
        """Ensure that inline_threshold=0 disables MultiZarrToZarr checking file size."""
        ds = xr.Dataset(dict(x=[1, 2, 3]))
        ds["y"] = 3 + ds["x"]
        ds.to_zarr("memory://zarr_store", mode="w", zarr_format=2, consolidated=False)
        store = fsspec.get_mapper("memory://zarr_store")
        ref = kerchunk.utils.consolidate(store)
        # This type of reference with no offset or total size is produced by
        # kerchunk.zarr.single_zarr or equivalently ZarrToZarr.translate.
        ref["refs"]["y/0"] = ["file:///tmp/some/data-that-shouldnt-be-accessed"]
    
        mzz_no_inline = MultiZarrToZarr([ref], concat_dims=["x"], inline_threshold=0)
        # Should be okay because inline_threshold=None so we don't check the file size
        # in order to see if it should be inlined
>       mzz_no_inline.translate()

../../../tests/test_combine.py:807: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
kerchunk/combine.py:645: in translate
    self.first_pass()
kerchunk/combine.py:392: in first_pass
    value = self._get_value(i, z, var, fn=self._paths[i])
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
kerchunk/combine.py:344: in _get_value
    o = z[selector.split(":", 1)[1]][...]
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/usr/lib/python3/dist-packages/zarr/core/group.py:1843: in __getitem__
    obj = self._sync(self._async_group.getitem(path))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/usr/lib/python3/dist-packages/zarr/core/sync.py:203: in _sync
    return sync(
/usr/lib/python3/dist-packages/zarr/core/sync.py:158: in sync
    raise return_result
/usr/lib/python3/dist-packages/zarr/core/sync.py:118: in _runner
    return await coro
           ^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <AsyncGroup <FsspecStore(ReferenceFileSystem, /)>>, key = 'x'

    async def getitem(
        self,
        key: str,
    ) -> AnyAsyncArray | AsyncGroup:
        """
        Get a subarray or subgroup from the group.
    
        Parameters
        ----------
        key : str
            Array or group name
    
        Returns
        -------
        AsyncArray or AsyncGroup
        """
        store_path = self.store_path / key
        logger.debug("key=%s, store_path=%s", key, store_path)
    
        # Consolidated metadata lets us avoid some I/O operations so try that first.
        if self.metadata.consolidated_metadata is not None:
            return self._getitem_consolidated(store_path, key, prefix=self.name)
        try:
            return await get_node(
                store=store_path.store, path=store_path.path, zarr_format=self.metadata.zarr_format
            )
        except FileNotFoundError as e:
>           raise KeyError(key) from e
E           KeyError: 'x'

/usr/lib/python3/dist-packages/zarr/core/group.py:741: KeyError
_________________________ test_subchunk_exact[chunks0] _________________________

self = <fsspec.implementations.memory.MemoryFileSystem object at 0x734758ea3e00>
path = '/test.zarr/.zgroup', start = None, end = None, kwargs = {}

    def cat_file(self, path, start=None, end=None, **kwargs):
        logger.debug("cat: %s", path)
        path = self._strip_protocol(path)
        try:
>           return bytes(self.store[path].getbuffer()[start:end])
                         ^^^^^^^^^^^^^^^^
E           KeyError: '/test.zarr/.zgroup'

/usr/lib/python3/dist-packages/fsspec/implementations/memory.py:230: KeyError

The above exception was the direct cause of the following exception:

self = <fsspec.mapping.FSMap object at 0x73474c134750>, key = '.zgroup'
default = None

    def __getitem__(self, key, default=None):
        """Retrieve data"""
        k = self._key_to_str(key)
        try:
>           result = self.fs.cat(k)
                     ^^^^^^^^^^^^^^

/usr/lib/python3/dist-packages/fsspec/mapping.py:155: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/usr/lib/python3/dist-packages/fsspec/spec.py:917: in cat
    return self.cat_file(paths[0], **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <fsspec.implementations.memory.MemoryFileSystem object at 0x734758ea3e00>
path = '/test.zarr/.zgroup', start = None, end = None, kwargs = {}

    def cat_file(self, path, start=None, end=None, **kwargs):
        logger.debug("cat: %s", path)
        path = self._strip_protocol(path)
        try:
            return bytes(self.store[path].getbuffer()[start:end])
        except KeyError as e:
>           raise FileNotFoundError(path) from e
E           FileNotFoundError: /test.zarr/.zgroup

/usr/lib/python3/dist-packages/fsspec/implementations/memory.py:232: FileNotFoundError

The above exception was the direct cause of the following exception:

uri_or_store = 'memory://test.zarr', storage_options = None
inline_threshold = 100, inline = None, out = None

    def single_zarr(
        uri_or_store,
        storage_options=None,
        inline_threshold=100,
        inline=None,
        out=None,
    ):
        """kerchunk-style view on zarr mapper
    
        This is a similar process to zarr's consolidate_metadata, but does not
        need to be held in the original file tree. You do not need zarr itself
        to do this.
    
        This is useful for testing, so that we can pass hand-made zarrs to combine.
    
        Parameters
        ----------
        uri_or_store: str or dict-like
        storage_options: dict or None
            given to fsspec
        out: dict-like or None
            This allows you to supply an fsspec.implementations.reference.LazyReferenceMapper
            to write out parquet as the references get filled, or some other dictionary-like class
            to customise how references get stored
    
        Returns
        -------
        reference dict like
        """
        if isinstance(uri_or_store, str):
            mapper = fsspec.get_mapper(uri_or_store, **(storage_options or {}))
            prot = mapper.fs.protocol
            protocol = prot[0] if isinstance(prot, tuple) else prot
        else:
            mapper = uri_or_store
            if isinstance(mapper, fsspec.FSMap) and storage_options is None:
                storage_options = mapper.fs.storage_options
                prot = mapper.fs.protocol
                protocol = prot[0] if isinstance(prot, tuple) else prot
            else:
                protocol = None
    
        try:
>           check = ujson.loads(mapper[".zgroup"])
                                ^^^^^^^^^^^^^^^^^

kerchunk/zarr.py:51: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <fsspec.mapping.FSMap object at 0x73474c134750>, key = '.zgroup'
default = None

    def __getitem__(self, key, default=None):
        """Retrieve data"""
        k = self._key_to_str(key)
        try:
            result = self.fs.cat(k)
        except self.missing_exceptions as exc:
            if default is not None:
                return default
>           raise KeyError(key) from exc
E           KeyError: '.zgroup'

/usr/lib/python3/dist-packages/fsspec/mapping.py:159: KeyError

The above exception was the direct cause of the following exception:

m = <fsspec.implementations.memory.MemoryFileSystem object at 0x734758ea3e00>
chunks = [10, 10]

    @pytest.mark.parametrize("chunks", [[10, 10], [5, 10]])
    def test_subchunk_exact(m, chunks):
        g = zarr.open_group("memory://test.zarr", mode="w", zarr_format=2)
        data = np.arange(100).reshape(10, 10)
        arr = g.create_array(
            "data", dtype=data.dtype, shape=data.shape, chunks=chunks, compressor=None
        )
        arr[:] = data
>       ref = kerchunk.zarr.single_zarr("memory://test.zarr")["refs"]
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

../../../tests/test_utils.py:108: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

uri_or_store = 'memory://test.zarr', storage_options = None
inline_threshold = 100, inline = None, out = None

    def single_zarr(
        uri_or_store,
        storage_options=None,
        inline_threshold=100,
        inline=None,
        out=None,
    ):
        """kerchunk-style view on zarr mapper
    
        This is a similar process to zarr's consolidate_metadata, but does not
        need to be held in the original file tree. You do not need zarr itself
        to do this.
    
        This is useful for testing, so that we can pass hand-made zarrs to combine.
    
        Parameters
        ----------
        uri_or_store: str or dict-like
        storage_options: dict or None
            given to fsspec
        out: dict-like or None
            This allows you to supply an fsspec.implementations.reference.LazyReferenceMapper
            to write out parquet as the references get filled, or some other dictionary-like class
            to customise how references get stored
    
        Returns
        -------
        reference dict like
        """
        if isinstance(uri_or_store, str):
            mapper = fsspec.get_mapper(uri_or_store, **(storage_options or {}))
            prot = mapper.fs.protocol
            protocol = prot[0] if isinstance(prot, tuple) else prot
        else:
            mapper = uri_or_store
            if isinstance(mapper, fsspec.FSMap) and storage_options is None:
                storage_options = mapper.fs.storage_options
                prot = mapper.fs.protocol
                protocol = prot[0] if isinstance(prot, tuple) else prot
            else:
                protocol = None
    
        try:
            check = ujson.loads(mapper[".zgroup"])
            assert check["zarr_format"] == 2
        except (KeyError, ValueError, TypeError) as e:
>           raise ValueError("Failed to load dataset as V2 zarr") from e
E           ValueError: Failed to load dataset as V2 zarr

kerchunk/zarr.py:54: ValueError
_________________________ test_subchunk_exact[chunks1] _________________________

self = <fsspec.implementations.memory.MemoryFileSystem object at 0x734758ea3e00>
path = '/test.zarr/.zgroup', start = None, end = None, kwargs = {}

    def cat_file(self, path, start=None, end=None, **kwargs):
        logger.debug("cat: %s", path)
        path = self._strip_protocol(path)
        try:
>           return bytes(self.store[path].getbuffer()[start:end])
                         ^^^^^^^^^^^^^^^^
E           KeyError: '/test.zarr/.zgroup'

/usr/lib/python3/dist-packages/fsspec/implementations/memory.py:230: KeyError

The above exception was the direct cause of the following exception:

self = <fsspec.mapping.FSMap object at 0x734736f98850>, key = '.zgroup'
default = None

    def __getitem__(self, key, default=None):
        """Retrieve data"""
        k = self._key_to_str(key)
        try:
>           result = self.fs.cat(k)
                     ^^^^^^^^^^^^^^

/usr/lib/python3/dist-packages/fsspec/mapping.py:155: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/usr/lib/python3/dist-packages/fsspec/spec.py:917: in cat
    return self.cat_file(paths[0], **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <fsspec.implementations.memory.MemoryFileSystem object at 0x734758ea3e00>
path = '/test.zarr/.zgroup', start = None, end = None, kwargs = {}

    def cat_file(self, path, start=None, end=None, **kwargs):
        logger.debug("cat: %s", path)
        path = self._strip_protocol(path)
        try:
            return bytes(self.store[path].getbuffer()[start:end])
        except KeyError as e:
>           raise FileNotFoundError(path) from e
E           FileNotFoundError: /test.zarr/.zgroup

/usr/lib/python3/dist-packages/fsspec/implementations/memory.py:232: FileNotFoundError

The above exception was the direct cause of the following exception:

uri_or_store = 'memory://test.zarr', storage_options = None
inline_threshold = 100, inline = None, out = None

    def single_zarr(
        uri_or_store,
        storage_options=None,
        inline_threshold=100,
        inline=None,
        out=None,
    ):
        """kerchunk-style view on zarr mapper
    
        This is a similar process to zarr's consolidate_metadata, but does not
        need to be held in the original file tree. You do not need zarr itself
        to do this.
    
        This is useful for testing, so that we can pass hand-made zarrs to combine.
    
        Parameters
        ----------
        uri_or_store: str or dict-like
        storage_options: dict or None
            given to fsspec
        out: dict-like or None
            This allows you to supply an fsspec.implementations.reference.LazyReferenceMapper
            to write out parquet as the references get filled, or some other dictionary-like class
            to customise how references get stored
    
        Returns
        -------
        reference dict like
        """
        if isinstance(uri_or_store, str):
            mapper = fsspec.get_mapper(uri_or_store, **(storage_options or {}))
            prot = mapper.fs.protocol
            protocol = prot[0] if isinstance(prot, tuple) else prot
        else:
            mapper = uri_or_store
            if isinstance(mapper, fsspec.FSMap) and storage_options is None:
                storage_options = mapper.fs.storage_options
                prot = mapper.fs.protocol
                protocol = prot[0] if isinstance(prot, tuple) else prot
            else:
                protocol = None
    
        try:
>           check = ujson.loads(mapper[".zgroup"])
                                ^^^^^^^^^^^^^^^^^

kerchunk/zarr.py:51: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <fsspec.mapping.FSMap object at 0x734736f98850>, key = '.zgroup'
default = None

    def __getitem__(self, key, default=None):
        """Retrieve data"""
        k = self._key_to_str(key)
        try:
            result = self.fs.cat(k)
        except self.missing_exceptions as exc:
            if default is not None:
                return default
>           raise KeyError(key) from exc
E           KeyError: '.zgroup'

/usr/lib/python3/dist-packages/fsspec/mapping.py:159: KeyError

The above exception was the direct cause of the following exception:

m = <fsspec.implementations.memory.MemoryFileSystem object at 0x734758ea3e00>
chunks = [5, 10]

    @pytest.mark.parametrize("chunks", [[10, 10], [5, 10]])
    def test_subchunk_exact(m, chunks):
        g = zarr.open_group("memory://test.zarr", mode="w", zarr_format=2)
        data = np.arange(100).reshape(10, 10)
        arr = g.create_array(
            "data", dtype=data.dtype, shape=data.shape, chunks=chunks, compressor=None
        )
        arr[:] = data
>       ref = kerchunk.zarr.single_zarr("memory://test.zarr")["refs"]
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

../../../tests/test_utils.py:108: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

uri_or_store = 'memory://test.zarr', storage_options = None
inline_threshold = 100, inline = None, out = None

    def single_zarr(
        uri_or_store,
        storage_options=None,
        inline_threshold=100,
        inline=None,
        out=None,
    ):
        """kerchunk-style view on zarr mapper
    
        This is a similar process to zarr's consolidate_metadata, but does not
        need to be held in the original file tree. You do not need zarr itself
        to do this.
    
        This is useful for testing, so that we can pass hand-made zarrs to combine.
    
        Parameters
        ----------
        uri_or_store: str or dict-like
        storage_options: dict or None
            given to fsspec
        out: dict-like or None
            This allows you to supply an fsspec.implementations.reference.LazyReferenceMapper
            to write out parquet as the references get filled, or some other dictionary-like class
            to customise how references get stored
    
        Returns
        -------
        reference dict like
        """
        if isinstance(uri_or_store, str):
            mapper = fsspec.get_mapper(uri_or_store, **(storage_options or {}))
            prot = mapper.fs.protocol
            protocol = prot[0] if isinstance(prot, tuple) else prot
        else:
            mapper = uri_or_store
            if isinstance(mapper, fsspec.FSMap) and storage_options is None:
                storage_options = mapper.fs.storage_options
                prot = mapper.fs.protocol
                protocol = prot[0] if isinstance(prot, tuple) else prot
            else:
                protocol = None
    
        try:
            check = ujson.loads(mapper[".zgroup"])
            assert check["zarr_format"] == 2
        except (KeyError, ValueError, TypeError) as e:
>           raise ValueError("Failed to load dataset as V2 zarr") from e
E           ValueError: Failed to load dataset as V2 zarr

kerchunk/zarr.py:54: ValueError
=========================== short test summary info ============================
FAILED ../../../tests/test_combine.py::test_no_inline - KeyError: 'x'
FAILED ../../../tests/test_utils.py::test_subchunk_exact[chunks0] - ValueErro...
FAILED ../../../tests/test_utils.py::test_subchunk_exact[chunks1] - ValueErro...
ERROR ../../../tests/test_combine.py::test_fixture - ValueError: Failed to lo...
ERROR ../../../tests/test_combine.py::test_fixture_chunks[quad_nochunk1-chunks0]
ERROR ../../../tests/test_combine.py::test_fixture_chunks[quad_1chunk1-chunks1]
ERROR ../../../tests/test_combine.py::test_fixture_chunks[quad_2chunk1-chunks2]
ERROR ../../../tests/test_combine.py::test_get_coos[data:time-expected0] - Va...
ERROR ../../../tests/test_combine.py::test_get_coos[selector1-expected1] - Va...
ERROR ../../../tests/test_combine.py::test_get_coos[INDEX-expected2] - ValueE...
ERROR ../../../tests/test_combine.py::test_get_coos[attr:attr1-expected3] - V...
ERROR ../../../tests/test_combine.py::test_get_coos[vattr:data:attr0-expected4]
ERROR ../../../tests/test_combine.py::test_coo_vars - ValueError: Failed to l...
ERROR ../../../tests/test_combine.py::test_single - ValueError: Failed to loa...
ERROR ../../../tests/test_combine.py::test_single_append - ValueError: Failed...
ERROR ../../../tests/test_combine.py::test_single_append_cf[dtype0-mapper0]
ERROR ../../../tests/test_combine.py::test_single_append_cf[dtype0-mapper1]
ERROR ../../../tests/test_combine.py::test_single_append_cf[dtype1-mapper0]
ERROR ../../../tests/test_combine.py::test_single_append_cf[dtype1-mapper1]
ERROR ../../../tests/test_combine.py::test_lazy_filler - ValueError: Failed t...
ERROR ../../../tests/test_combine.py::test_run_twice - ValueError: Failed to ...
ERROR ../../../tests/test_combine.py::test_outfile_postprocess - ValueError: ...
ERROR ../../../tests/test_combine.py::test_chunked[inputs0-chunks0] - ValueEr...
ERROR ../../../tests/test_combine.py::test_chunked[inputs1-chunks1] - ValueEr...
ERROR ../../../tests/test_combine.py::test_chunked[inputs2-chunks2] - ValueEr...
ERROR ../../../tests/test_combine.py::test_chunked[inputs3-chunks3] - ValueEr...
ERROR ../../../tests/test_combine.py::test_times[time-expected0] - ValueError...
ERROR ../../../tests/test_combine.py::test_times[cfstdtime-expected1] - Value...
ERROR ../../../tests/test_combine.py::test_times[cfnontime-expected2] - Value...
ERROR ../../../tests/test_combine.py::test_cftimes_to_normal - ValueError: Fa...
ERROR ../../../tests/test_combine.py::test_inline - ValueError: Failed to loa...
ERROR ../../../tests/test_combine.py::test_bad_coo_warning - ValueError: Fail...
ERROR ../../../tests/test_combine.py::test_chunk_error - ValueError: Failed t...
= 3 failed, 50 passed, 6 skipped, 22 deselected, 45 warnings, 30 errors in 7.68s =

Steps to reproduce

On Debian Sid

  1. install the dependencies:
sudo apt install hdf5-filter-plugin pybuild-plugin-pyproject python3-aiohttp python3-all python3-astropy python3-dask python3-cfgrib python3-cftime python3-eccodes python3-fsspec python3-h5netcdf python3-h5py python3-hdf5plugin python3-netcdf4 python3-numcodecs python3-numpy python3-numpydoc python3-pytest python3-s3fs python3-scipy python3-setuptools python3-setuptools-scm python3-sphinx python3-sphinx-rtd-theme python3-tifffile python3-ujson python3-xarray python3-zarr
  2. clone the kerchunk repository
  3. from the kerchunk repository root, run:
$ python3.14 -m pytest -v -m "not remotedata" -k "not test_single_append_parquet and not test_zarr_combine and not test_string_null and not test_string_decode and not test_compound_string_null and not test_compound_string_encode and not test_var and not test_malicious_chunks" tests

Additional output

No response

Metadata

Labels: bug (Potential issues with the zarr-python library)