
Method for locating MAS data files is not robust #92

@cooperdowns

The current glob syntax for locating MAS data files of a specific variable (e.g. t001.hdf, t002.hdf, etc.) also matches unrelated files, such as a stray file called test.txt or tmas. The subsequent call to read the variable (e.g. t = mas_output['t']) then crashes ungracefully, complaining about coordinates.

Also, if one variable name is a prefix of another (e.g. t and te), then reading t silently reads both the t and te files, with no warning.

Perhaps this issue showed up with the switch to 3 or 6 digit sequence numbers.

I believe this is the line at issue:

return sorted(glob.glob(str(directory / f"{var}*")))
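To illustrate the overmatching, here is a minimal sketch using fnmatch, which applies the same shell-style wildcard rules to bare file names. The list of names is made up for the example and is not taken from a real MAS run.

```python
import fnmatch

# Hypothetical directory listing: two real "t" files plus unwanted entries.
names = ["t001.hdf", "t002.hdf", "te001.hdf", "test.txt", "tmas"]

var = "t"
# Same pattern shape as the line above: the variable name followed by "*".
matched = fnmatch.filter(names, f"{var}*")

# Every name starts with "t", so every name matches, including the
# "te" variable and the unrelated test.txt and tmas entries.
print(matched)
```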

This one may also need to be checked/revised for 3 vs 6 digit issues:

files = glob.glob(str(path / "*[0-9][0-9][0-9].*"))

I suspect a fix would be to be more stringent about the format: look for 3 digit and 6 digit HDF files separately and choose one or the other (e.g. first look for f"{var}[0-9][0-9][0-9].h*" and, if there are no matches, then look for f"{var}[0-9][0-9][0-9][0-9][0-9][0-9].h*"). However, I'm not experienced enough with PsiPy IO to implement this knowing it won't cause side effects.
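A rough sketch of that idea, to make the proposal concrete. The function name find_var_files is hypothetical and is not part of PsiPy, and I have not checked it against the rest of the IO code. It tries the 3 digit pattern first and falls back to the 6 digit one.

```python
import glob
import os


def find_var_files(directory, var):
    """Hypothetical helper (not PsiPy API): sorted files for exactly one variable.

    Tries 3 digit sequence numbers first, then 6 digit ones, and returns
    the first non-empty set of matches (or an empty list).
    """
    # Escape glob metacharacters so the directory and variable name are
    # treated literally.
    base = glob.escape(str(directory))
    name = glob.escape(var)
    for n_digits in (3, 6):
        # e.g. "t[0-9][0-9][0-9].h*": the digits must directly follow the
        # variable name, so "te001.hdf" and "test.txt" no longer match "t".
        pattern = os.path.join(base, name + "[0-9]" * n_digits + ".h*")
        matches = sorted(glob.glob(pattern))
        if matches:
            return matches
    return []
```

Note that ".h*" still accepts any extension starting with "h", so it may be worth restricting it further to the extensions that are actually expected.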

Labels: Bug (Something isn't working)
