Skip to content

Fix the Windows platform implementation of the FileSystem API #470

Open
iseki0 wants to merge 16 commits intoKotlin:developfrom
iseki0:fix/windows-encoding
Open

Fix the Windows platform implementation of the FileSystem API #470
iseki0 wants to merge 16 commits intoKotlin:developfrom
iseki0:fix/windows-encoding

Conversation

@iseki0
Copy link
Copy Markdown
Contributor

@iseki0 iseki0 commented Sep 28, 2025

Problem

  • The FileSystem API is not working correctly on non-English Windows

Memo

The Windows path related string should be accessed by Win32 API which with W suffix, don't use A suffix API or you will get trouble on non-English content. By mingw is also unrecommended, they always get trouble on encoding handling. So, just use the plain Win32 API, the most eaist choice.

And...
I refactor your project layout, remove the nativeNonAndroid sourceset. (The dir list code is the only code in this sourceset.)
I don't think seperate the world be xx and nonXx is a good choice. Now we need the third part......
If I make something wrong, let me know.

@iseki0
Copy link
Copy Markdown
Contributor Author

iseki0 commented Sep 28, 2025

@fzhinkin

@iseki0
Copy link
Copy Markdown
Contributor Author

iseki0 commented Sep 28, 2025

Another bug exists, for the long path handling. I will fix it in another PR.

@fzhinkin fzhinkin self-requested a review September 29, 2025 07:39
@fzhinkin fzhinkin added the fs label Sep 29, 2025
@fzhinkin
Copy link
Copy Markdown
Collaborator

@iseki0 thanks for opening the PR!

Is it possible to add tests reproducing issues that might occur when using mingw's opendir/readdir (and SomethingSomethingA Win32 functions)?
Or these encoding-specific issues occur only on Windows systems with a specific setup? In this case, we need to figure out how to test them anyway.

Comment thread core/mingw/src/files/Error.kt Outdated
Comment thread core/mingw/src/files/FileSystemMingw.kt Outdated
Comment thread core/mingw/src/files/FileSystemMingw.kt Outdated
Comment thread core/mingw/src/files/FileSystemMingw.kt
Comment thread core/mingw/src/files/FileSystemMingw.kt Outdated
Comment thread core/mingw/src/files/FileSystemMingw.kt Outdated
@iseki0
Copy link
Copy Markdown
Contributor Author

iseki0 commented Sep 29, 2025

Is it possible to add tests reproducing issues that might occur when using mingw's opendir/readdir (and SomethingSomethingA Win32 functions)?

I will have a try, the simpliest way is add some file/folder with non-English filename. I will add it to mingw test directory.

@iseki0
Copy link
Copy Markdown
Contributor Author

iseki0 commented Sep 29, 2025

Please use star-imports instead.

@fzhinkin
Because my IDEA codestyle is not default, I suggest you add it NOW.(Sorry for my poorly English.)

@iseki0
Copy link
Copy Markdown
Contributor Author

iseki0 commented Sep 29, 2025

Or these encoding-specific issues occur only on Windows systems with a specific setup? In this case, we need to figure out how to test them anyway.

When setup Windows with the default configuration(non-English version and, <=Windows 10). The A suffix API handling string in the ANSI encoding, the ANSI means current code page encoding.
For Simplified Chinese locale it's 936(GBK), for Japanese locale it's 932(Shift-JIS). Some program (most part from UNIX-world, maybe include MinGW) require the code page is 65001(UTF-8), but it's not the default configuration. Of course, if the user change the OS codepage to 65001, other program will unable to running...

@iseki0
Copy link
Copy Markdown
Contributor Author

iseki0 commented Sep 29, 2025

Please do squash merging. And please add a co-author which2who@gmail.com, he also made a patch but he's late😂. He notice me that the basename/dirname... should be fixed too.

@iseki0 iseki0 requested a review from fzhinkin September 29, 2025 19:05
@iseki0
Copy link
Copy Markdown
Contributor Author

iseki0 commented Sep 29, 2025

Many testcases failed, because the Windows separator is '\' but the posix is '/'. The testcase compare the string directly...

@iseki0
Copy link
Copy Markdown
Contributor Author

iseki0 commented Sep 29, 2025

We shouldn't compare string directly. The path string shouldn't be comparing directly. It's platform dependent.

@iseki0 iseki0 changed the title Fix Windows target path encoding problem Fix the Windows platform implementation of the FileSystem API Sep 29, 2025
@iseki0
Copy link
Copy Markdown
Contributor Author

iseki0 commented Sep 29, 2025

OK, I fixed the SystemPathSeparator val... Now all test passed.

@iseki0 iseki0 force-pushed the fix/windows-encoding branch from 115c841 to 4d7d065 Compare October 9, 2025 12:36
@iseki0
Copy link
Copy Markdown
Contributor Author

iseki0 commented Oct 9, 2025

@fzhinkin Now, I rebased, please review it again. At least it fix the encoding problem, on Windows

@fzhinkin
Copy link
Copy Markdown
Collaborator

fzhinkin commented Oct 9, 2025

@iseki0,

And please add a co-author which2who@gmail.com, he also made a patch but he's late

I will need an explicit consent from that person for that.

OK, I fixed the SystemPathSeparator val... Now all test passed.

Unfortunately, there are a few test failures: https://teamcity.jetbrains.com/buildConfiguration/KotlinTools_KotlinxIo_BuildAggregated/5556685?buildTab=tests&status=failed

Comment thread core/mingw/src/files/PathsMingw.kt
Comment thread core/common/test/files/SmokeFileTestWindows.kt Outdated
Comment thread core/mingw/src/files/FileSystemMingw.kt Outdated
Comment thread core/mingw/src/files/FileSystemMingw.kt Outdated
Comment thread core/appleAndLinux/src/files/FileSystemAppleAndLinux.kt
Comment thread core/native/src/files/FileSystemNative.kt
Comment thread core/mingw/test/files/SmokeFileTestWindowsMinGW.kt Outdated
Comment thread core/mingw/test/files/SmokeFileTestWindowsMinGW.kt Outdated
Comment thread core/mingw/test/files/SmokeFileTestWindowsMinGW.kt Outdated
Comment thread core/mingw/test/files/SmokeFileTestWindowsMinGW.kt Outdated
@iseki0
Copy link
Copy Markdown
Contributor Author

iseki0 commented Oct 10, 2025

    @Test
    fun testA(){
        memScoped {
            println(dirname("C:\\foo\\bar".cstr)!!.toKString())
            println(dirname("C:\\foo\\bar".cstr)!!.toKStringFromUtf8())
            println(dirname("C:\\啊啊啊啊\\123".cstr)!!.toKString())
            println(dirname("C:\\啊啊啊啊\\123".cstr)!!.toKStringFromUtf8())
            println(dirname("C:\\irohauta\\いろはにほへと\\ちりぬるを".cstr)!!.toKString())
            println(dirname("C:\\irohauta\\いろはにほへと\\ちりぬるを".cstr)!!.toKStringFromUtf8())
        }
    }

output

C:\foo
C:\foo
C:\啊啊啊�
C:\啊啊啊�
.
.

My Windows:

OS Name:                   Microsoft Windows 10 专业版 (professional version)
OS Version:                10.0.19045 N/A Build 19045
System Locale:             zh-cn;Chinese (China)
Input Locale:              zh-cn;Chinese (China)

Default code page: 936, GBK

@iseki0 iseki0 requested a review from fzhinkin October 10, 2025 07:24
@iseki0
Copy link
Copy Markdown
Contributor Author

iseki0 commented Oct 10, 2025

I add a skip to the TeamCity test kotlinx.io.files.SmokeFileTestWindowsMinGW.mingwProblem[mingwX64].

@iseki0
Copy link
Copy Markdown
Contributor Author

iseki0 commented Oct 12, 2025

I think the Windows path handling still have many problems, after this PR. This only fix the encoding problem and make it basiclly available, on Windows platform. So we'd better merge it quickly. Because after that we'd better to begin to consider the following problems:

  • Path creating: Should we do a basic normalization during the Path object creation? Such as fix and remove redundent slash, but don't resolve the double and single dot. (Because we need it to represent relative path, in some cases. And, the Java do it like that.)
  • Roots: At least on Windows we have many roots.
  • Long path: On Windows for historical reasons, there's a MAX_PATH limitation of the path string, for long path we have to add a prefix \\?\ (or \\?\UNC\ for UNC path). Should we normalize it during the Path object creating?
  • Normalization: Of course we need a normalize method because in the common sense we shouldn't do any resolve about the double and single dot segment during the path creation.

Any, maybe many others...

@iseki0
Copy link
Copy Markdown
Contributor Author

iseki0 commented Oct 28, 2025

@fzhinkin

@fzhinkin
Copy link
Copy Markdown
Collaborator

@iseki0, sorry for the delay, and thank you for addressing all my questions/concerns.
There are a few things I wanted to check, but otherwise we're almost there.

@iseki0
Copy link
Copy Markdown
Contributor Author

iseki0 commented Nov 21, 2025

@fzhinkin OK, I wrote a basically available Windows path impl at https://github.com/iseki0/klibpath

@magic-cucumber
Copy link
Copy Markdown

oh, shall we see a new version containing this update before next February?

assertEquals(Path("\\\\server\\share"), Path("\\\\server\\share\\dir").parent)
assertEquals(Path("""\\server\share"""), Path("""\\server\share\dir""").parent)
// This is a root UNC path, so parent is
assertEquals(null, Path("""\\server\share""").parent)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iseki0 this assertion fails because Path("""\\server\share""").parent returns "\\server".

I suggest fixing parent's behavior separately and fixing the assertion for now.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jshell> Path.of("\\\\server\\share").getParent()
$2 ==> null

Copy link
Copy Markdown
Contributor Author

@iseki0 iseki0 Jan 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this API has different behaviours on Windows and Linux...

C:\Users\iseki>jshell -version
jshell 24.0.2

C:\Users\iseki>java -version
java version "24.0.2" 2025-07-15
Java(TM) SE Runtime Environment (build 24.0.2+12-54)
Java HotSpot(TM) 64-Bit Server VM (build 24.0.2+12-54, mixed mode, sharing)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's kotlinx.io.files.Path, not java.nio.file.Path. It has its own implementation which is having issues with UNC paths and incorrectly determining their roots.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a problem. Different platforms have different behaviors. What should we do? Just interpreting all paths like the UNC path is Windows UNC path? Even if we're running on Linux?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iseki0, let's keep the incorrect behavior for now and once Pathcch library will be available from Kotlin, dirnameImpl could be updated to use PathCchRemoveFileSpec instead of PathRemoveFileSpecW (that'll fix the behavior).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iseki0, let's keep the incorrect behavior for now and once Pathcch library will be available from Kotlin, dirnameImpl could be updated to use PathCchRemoveFileSpec instead of PathRemoveFileSpecW (that'll fix the behavior).

I don't know which decision is better. In the past month I wrote a pure Kotlin implementation of Windows path parsing and normalization... Of course for the real world io we should use Windows API directly, but for path parsing? If we depends on the Windows API we will have no chance to implement the same behaviors on non-Windows platform. I see the openjdk do it by self-made Java code. I just... In some confuse...

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rewriting path parsing completely is indeed an option, but it is outside of the scope of this PR.

If we depends on the Windows API we will have no chance to implement the same behaviors on non-Windows platform.

Currently, kotlinx.io.files.Path represents platform-specific paths, and it is not aimed for parsing Windows paths on, for example, Android devices.

We're considering providing such cross-platform path parsing functionality as part of #241, but scenarios where it is needed are quite niche (and there are only a few of them).

// Skipping test because console code page is UTF-8,
// use when clause because I'm not sure which codepage should be skipped
when (GetConsoleCP()) {
65001u -> return
Copy link
Copy Markdown
Collaborator

@fzhinkin fzhinkin Jan 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iseki0, GetConsoleCP returns 65001 even if I run tests on a Windows machine with default code page 950 (Big5). Is some specific setup required to run this test?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fzhinkin IIRC, Windows 11 enable the always UTF-8 option default... But it's not true on other Windows version. (My: Microsoft Windows [Version 10.0.19045.6466])
image
The sentence: Beta: Use Unicode UTF-8 provide international supports...

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iseki0, is there a reason to keep this test in the repository other than showcasing the issue with mingw's dirname? If no, let's remove it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, just remove it. No problems.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, just remove it. No problems.

@Wesley-Young
Copy link
Copy Markdown

Any updates on this PR? I'd be glad to see this fix in the next release.

@magic-cucumber
Copy link
Copy Markdown

nops

@Wesley-Young
Copy link
Copy Markdown

Wesley-Young commented Apr 21, 2026

https://github.com/SaltifyDev/ktfs

A temporary workaround which wraps kotlinx.io.files and fixes wide character problems on mingwX64. Will be deprecated once a new version of kotlinx-io with this PR merged is released.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants