Fix for non-ascii characters in text annotations of open-ephys files by MarinManuel · Pull Request #1827 · NeuralEnsemble/python-neo

MarinManuel · 2026-03-23T22:08:33Z

Open Ephys files (binary format; I am not sure about the other formats) can contain text annotations with non-ASCII characters. Trying to open such a file fails with the error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 38: ordinal not in range(128)

This fix detects the UnicodeDecodeError and falls back to decoding byte by byte using UTF-8 encoding.
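A minimal sketch of the fallback described above, not the actual diff from this PR. It assumes the annotations arrive as a NumPy byte-string array holding UTF-8 bytes; the helper name `to_unicode_with_fallback` is hypothetical.

```python
import numpy as np


def to_unicode_with_fallback(arr):
    """Hypothetical helper: convert an array to a unicode (dtype 'U') array.

    Tries the fast astype("U") path first; if NumPy's ASCII decoding fails,
    decodes each element individually as UTF-8.
    """
    try:
        return arr.astype("U")
    except UnicodeDecodeError:
        # Each element of a byte-string array is a bytes-like np.bytes_ object.
        return np.array([item.decode("utf-8") for item in arr])


ascii_only = np.array([b"start", b"stop"])
non_ascii = np.array(["10 µV".encode("utf-8"), "Δt".encode("utf-8")])
numbers = np.array([1, 2, 3])

print(to_unicode_with_fallback(ascii_only))
print(to_unicode_with_fallback(non_ascii))
print(to_unicode_with_fallback(numbers))
```

Pure-ASCII strings and numeric arrays never reach the `except` branch, so their behavior is unchanged.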

As far as I can tell, this change does not affect any of the existing tests. Let me know if we want to add a test with a file with non-ascii characters.

alejoe91 · 2026-03-24T11:43:19Z

Thanks @MarinManuel !

Would it make sense to always decode to be sure? (also to avoid the try-except?)

What characters were causing issues?

MarinManuel · 2026-03-24T13:32:45Z

My initial thought was also to always decode with UTF-8, but I did not want to risk breaking anything, so I chose the safer approach. I also assume the .astype("U") is faster than looping, but I don't know if that really matters.

I had some text strings with the characters µ and Δ that were creating issues. I initially raised an issue with Open Ephys (see the thread on Discord if you are interested) before I pinpointed the issue here.
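For illustration, a small reproduction sketch. It assumes the text is stored as UTF-8 bytes in a NumPy byte-string array; the sample strings are made up. Both characters encode to multi-byte UTF-8 sequences, which the ASCII decoding used by astype("U") rejects.

```python
import numpy as np

# Both characters need two bytes in UTF-8, all of them outside the ASCII range.
micro_bytes = "µ".encode("utf-8")
delta_bytes = "Δ".encode("utf-8")
print(micro_bytes, delta_bytes)

# Made-up annotations, stored as a byte-string (dtype "S") array.
text = np.array(["10 µV".encode("utf-8"), "Δt".encode("utf-8")])

error = None
try:
    text.astype("U")  # NumPy decodes byte strings as ASCII in this cast
except UnicodeDecodeError as exc:
    error = exc

print(type(error).__name__, error)
```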

zm711 · 2026-03-31T22:43:24Z

I also prefer to avoid a bunch of try-except blocks if possible. I'm not sure how always decoding would break anything, but to be fair I don't do much with Unicode vs. UTF-8 stuff.


MarinManuel · 2026-04-01T03:40:32Z

That's fair, but unfortunately, removing the try blocks causes issues because in some instances the array contains numbers, which should not be passed through decode().

As an alternative to the try blocks, I propose using a conditional test: only use decode() if the array contains strings, and otherwise use astype("U") as before.
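One way such a conditional test could look, as a sketch under stated assumptions rather than the code in this PR: it branches on the array's dtype kind, and the helper name `labels_from_array` is hypothetical.

```python
import numpy as np


def labels_from_array(arr):
    """Hypothetical helper: build unicode labels without try/except.

    Byte-string arrays (dtype kind "S") are decoded as UTF-8; anything else
    (numbers, already-unicode strings) goes through astype("U") as before.
    """
    arr = np.asarray(arr)
    if arr.dtype.kind == "S":
        return np.char.decode(arr, "utf-8")
    return arr.astype("U")


byte_labels = labels_from_array(np.array(["Δt".encode("utf-8"), b"stop"]))
number_labels = labels_from_array(np.array([1, 2, 3]))
unicode_labels = labels_from_array(np.array(["µV", "ok"]))

print(byte_labels, number_labels, unicode_labels)
```

Checking the dtype up front keeps numeric arrays away from decode() entirely, which is the case that broke the always-decode variant.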

zm711 · 2026-04-03T17:52:59Z

+                            info["labels"] = info["text"].astype("U")
                    elif "metadata" in info:
                        # binary case
-                        info["labels"] = info["channels"].astype("U")


Is this a previous typo on our part? Why are we looking at channels here if we are checking for metadata? I'm not super familiar with this format, so I would love to know what is going on here. I'm just brainstorming whether there is a simpler way to make this work, and I noticed this pattern, which seems off.

MarinManuel added 3 commits March 23, 2026 21:57

added utf8 decoding for strings if first attemp fails

675cfa7

added utf8 decoding for strings if first attemp fails

cb1db66

Merge remote-tracking branch 'origin/master'

3f457af

alejoe91 added this to the 0.14.5 milestone Mar 25, 2026

MarinManuel added 3 commits April 1, 2026 03:26

replaced try/except with logic test

3f194f8

replaced try/except with logic test

e89119c

Merge remote-tracking branch 'origin/master'

89043a5

# Conflicts: # neo/rawio/openephysbinaryrawio.py

zm711 reviewed Apr 3, 2026

MarinManuel wants to merge 6 commits into NeuralEnsemble:master from MarinManuel:master
