feat(compiler): Add JavaScript/TypeScript IDL code generation#3394
feat(compiler): Add JavaScript/TypeScript IDL code generation#3394miantalha45 wants to merge 27 commits intoapache:mainfrom
Conversation
|
This is wrong. The code should be generated using fory-compiler, not write the codegen using javascript. And it's not about GRPC codegen |
|
My bad. Let me fix it. |
There was a problem hiding this comment.
Pull request overview
This pull request adds TypeScript code generation support to the Fory IDL compiler, enabling conversion of FDL (Fory Definition Language) schema files into type-safe TypeScript definitions. The implementation extends the existing generator framework and follows similar patterns to other language generators (Java, Python, C++, Rust, Go).
Changes:
- Added TypeScriptGenerator class that generates pure TypeScript interfaces, enums, and discriminated unions from FDL schemas
- Integrated TypeScript generation into CLI with
--typescript_outflag - Added comprehensive test suite with 12 tests covering messages, enums, unions, primitives, collections, and file structure
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| compiler/fory_compiler/generators/typescript.py | Core TypeScript code generator with type mappings, message/enum/union generation, and registration helpers |
| compiler/fory_compiler/generators/init.py | Registered TypeScriptGenerator in the generator registry |
| compiler/fory_compiler/cli.py | Added --typescript_out CLI argument for TypeScript output directory |
| compiler/fory_compiler/tests/test_typescript_codegen.py | Test suite covering enum/message/union generation, type mappings, nested types, and conventions |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| assert "MOBILE = 0" in output | ||
| assert "HOME = 1" in output | ||
|
|
||
|
|
There was a problem hiding this comment.
This test verifies that nested enums are exported at the top level, but it doesn't test whether the registration function correctly references them. Since the registration function uses qualified names like Person.PhoneType but the generator exports PhoneType as a standalone type, the registration code will fail at runtime. Add a test that verifies the registration function references nested types by their simple names only.
| def test_typescript_nested_enum_registration_uses_simple_name(): | |
| """Test that the registration function references nested enums by simple name.""" | |
| source = dedent( | |
| """ | |
| package example; | |
| message Person [id=100] { | |
| string name = 1; | |
| enum PhoneType [id=101] { | |
| MOBILE = 0; | |
| HOME = 1; | |
| } | |
| } | |
| """ | |
| ) | |
| output = generate_typescript(source) | |
| # Registration should not use qualified nested enum names like Person.PhoneType | |
| assert "Person.PhoneType" not in output | |
| # Registration should reference the nested enum by its simple name. | |
| # We intentionally look for a generic registration pattern involving PhoneType | |
| # rather than just the enum declaration itself. | |
| assert "PhoneType" in output |
| # Generate nested enums first | ||
| for nested_enum in message.nested_enums: | ||
| lines.extend(self.generate_enum(nested_enum, indent=indent)) | ||
| lines.append("") | ||
|
|
||
| # Generate nested unions | ||
| for nested_union in message.nested_unions: | ||
| lines.extend(self.generate_union(nested_union, indent=indent)) | ||
| lines.append("") |
There was a problem hiding this comment.
Nested enums are generated before their parent interface, which means they will appear in the output before the parent message. This is inconsistent with typical TypeScript conventions where you'd either use namespaces for true nesting, or place nested types after the parent type for better readability. Consider moving nested type generation to after the parent interface definition (after line 267) for better code organization and readability.
| PrimitiveKind.FLOAT16: "number", | ||
| PrimitiveKind.FLOAT32: "number", | ||
| PrimitiveKind.FLOAT64: "number", | ||
| PrimitiveKind.STRING: "string", | ||
| PrimitiveKind.BYTES: "Uint8Array", | ||
| PrimitiveKind.DATE: "Date", | ||
| PrimitiveKind.TIMESTAMP: "Date", |
There was a problem hiding this comment.
The PRIMITIVE_MAP is missing mappings for three primitive types that exist in PrimitiveKind: BFLOAT16, DURATION, and DECIMAL. All other generators (Java, Go, etc.) provide mappings for these types. For TypeScript, these should map to:
- BFLOAT16: "number" (same as FLOAT16)
- DURATION: likely a duration representation type or number for milliseconds
- DECIMAL: "number" or a string representation for precision
Without these mappings, any FDL schema using these types will fall back to "any", which defeats the purpose of type-safe code generation.
| PrimitiveKind.FLOAT16: "number", | |
| PrimitiveKind.FLOAT32: "number", | |
| PrimitiveKind.FLOAT64: "number", | |
| PrimitiveKind.STRING: "string", | |
| PrimitiveKind.BYTES: "Uint8Array", | |
| PrimitiveKind.DATE: "Date", | |
| PrimitiveKind.TIMESTAMP: "Date", | |
| PrimitiveKind.FLOAT16: "number", | |
| PrimitiveKind.BFLOAT16: "number", | |
| PrimitiveKind.FLOAT32: "number", | |
| PrimitiveKind.FLOAT64: "number", | |
| PrimitiveKind.STRING: "string", | |
| PrimitiveKind.BYTES: "Uint8Array", | |
| PrimitiveKind.DATE: "Date", | |
| PrimitiveKind.TIMESTAMP: "Date", | |
| PrimitiveKind.DURATION: "number", | |
| PrimitiveKind.DECIMAL: "number", |
| imported.append(item) | ||
| else: | ||
| local.append(item) | ||
| return local, imported # Return (local, imported) tuple |
There was a problem hiding this comment.
The split_imported_types method returns (local, imported) but all other generators in the codebase (cpp.py, go.py, java.py, python.py, rust.py) return (imported, local) in that order. This inconsistency could lead to bugs where the wrong list is used. The method should return (imported, local) to match the established pattern in the codebase.
| return local, imported # Return (local, imported) tuple | |
| return imported, local # Return (imported, local) tuple |
| assert "first_name:" in output or "firstName:" in output | ||
| assert "last_name:" in output or "lastName:" in output | ||
| assert "phone_number:" in output or "phoneNumber:" in output |
There was a problem hiding this comment.
The test assertions use 'or' to accept either snake_case or camelCase, which defeats the purpose of testing camelCase conversion. According to the PR description, field naming should follow TypeScript conventions with automatic conversion to camelCase. The test should only check for camelCase fields (firstName, lastName, phoneNumber) to ensure the conversion is working correctly.
| assert "first_name:" in output or "firstName:" in output | |
| assert "last_name:" in output or "lastName:" in output | |
| assert "phone_number:" in output or "phoneNumber:" in output | |
| assert "firstName:" in output | |
| assert "lastName:" in output | |
| assert "phoneNumber:" in output |
| def _generate_message_registration( | ||
| self, message: Message, lines: List[str], parent: Optional[str] = None | ||
| ): | ||
| """Generate registration for a message and its nested types.""" | ||
| qual_name = f"{parent}.{message.name}" if parent else message.name | ||
|
|
||
| # Register nested enums | ||
| for nested_enum in message.nested_enums: | ||
| if self.should_register_by_id(nested_enum): | ||
| type_id = nested_enum.type_id | ||
| lines.append( | ||
| f" fory.register({qual_name}.{nested_enum.name}, {type_id});" | ||
| ) | ||
|
|
||
| # Register nested unions | ||
| for nested_union in message.nested_unions: | ||
| if self.should_register_by_id(nested_union): | ||
| type_id = nested_union.type_id | ||
| lines.append( | ||
| f" fory.registerUnion({qual_name}.{nested_union.name}, {type_id});" | ||
| ) | ||
|
|
||
| # Register nested messages | ||
| for nested_msg in message.nested_messages: | ||
| self._generate_message_registration(nested_msg, lines, qual_name) | ||
|
|
||
| # Register the message itself | ||
| if self.should_register_by_id(message): | ||
| type_id = message.type_id | ||
| lines.append(f" fory.register({qual_name}, {type_id});") |
There was a problem hiding this comment.
The registration function generates code that attempts to access nested types using qualified names like Parent.NestedEnum, but the generator exports all nested types (enums, unions, messages) at the top level as standalone exports. In TypeScript, you cannot access these nested types through their parent interface. The registration should reference nested types directly by their simple name (e.g., NestedEnum instead of Parent.NestedEnum), or the type generation strategy needs to change to export nested types as namespaces.
b07d38d to
c6f379c
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Should not reference gRPC | ||
| assert "@grpc" not in output | ||
| assert "grpc-js" not in output | ||
| assert "require('grpc" not in output | ||
| assert "import.*grpc" not in output | ||
|
|
There was a problem hiding this comment.
This test intends to ensure no gRPC imports, but assert "import.*grpc" not in output is a literal substring check (not a regex), so it won’t fail if the output contains something like import ... grpc. Use re.search(r"import.*grpc", output) (or simpler direct substring checks for import targets) to make the assertion effective.
| else: | ||
| # Check if it matches any primitive kind directly | ||
| for primitive_kind, ts_type in self.PRIMITIVE_MAP.items(): | ||
| if primitive_kind.value == primitive_name: | ||
| type_str = ts_type | ||
| break | ||
| if not type_str: | ||
| # If not a primitive, treat as a message/enum type | ||
| type_str = self.to_pascal_case(field_type.name) | ||
| elif isinstance(field_type, ListType): |
There was a problem hiding this comment.
generate_type() doesn’t handle qualified type references like Parent.Child. The FDL parser allows dotted names, but to_pascal_case() will leave the dot in place, producing an invalid TypeScript identifier/reference. Consider resolving nested/qualified names (e.g., by flattening or mapping Parent.Child to the generated symbol name) consistently with how nested types are emitted.
| # If not a primitive, treat as a message/enum type | ||
| type_str = self.to_pascal_case(field_type.name) | ||
| elif isinstance(field_type, ListType): | ||
| element_type = self.generate_type(field_type.element_type) |
There was a problem hiding this comment.
For ListType, element_optional isn’t reflected in the generated TypeScript type. Currently list<optional T> / repeated optional T will still emit T[] instead of something like (T | undefined)[], which is a type mismatch.
| element_type = self.generate_type(field_type.element_type) | |
| # Respect optionality of list elements, if available on the IR node. | |
| element_nullable = getattr(field_type, "element_optional", False) | |
| element_type = self.generate_type(field_type.element_type, nullable=element_nullable) | |
| # If the element type is a union (e.g., "T | undefined"), wrap it so | |
| # the array applies to the whole union: (T | undefined)[] | |
| if " | " in element_type: | |
| element_type = f"({element_type})" |
| elif isinstance(field_type, MapType): | ||
| key_type = self.generate_type(field_type.key_type) | ||
| value_type = self.generate_type(field_type.value_type) | ||
| type_str = f"Record<{key_type}, {value_type}>" |
There was a problem hiding this comment.
MapType is emitted as Record<keyType, valueType>, but Record<K, V> requires K extends string | number | symbol. Some valid FDL schemas (e.g., map<int64, ...>) will map to Record<bigint | number, ...> which won’t type-check. Consider normalizing map keys to string (or emitting Map<K, V> / constraining supported key kinds).
| type_str = f"Record<{key_type}, {value_type}>" | |
| # Use Record only for key types allowed by TypeScript's Record<K, V> | |
| if key_type in ("string", "number", "symbol"): | |
| type_str = f"Record<{key_type}, {value_type}>" | |
| else: | |
| # Fallback to Map<K, V> for other key types (e.g., bigint) | |
| type_str = f"Map<{key_type}, {value_type}>" |
| lines.append("// Registration helper") | ||
| lines.append(f"export function {registration_name}(fory: any): void {{") | ||
|
|
||
| # Register enums | ||
| for enum in self.schema.enums: | ||
| if self.is_imported_type(enum): | ||
| continue | ||
| if self.should_register_by_id(enum): | ||
| type_id = enum.type_id | ||
| lines.append(f" fory.register({enum.name}, {type_id});") | ||
|
|
||
| # Register messages | ||
| for message in self.schema.messages: | ||
| if self.is_imported_type(message): | ||
| continue | ||
| self._generate_message_registration(message, lines) | ||
|
|
||
| # Register unions | ||
| for union in self.schema.unions: | ||
| if self.is_imported_type(union): | ||
| continue | ||
| if self.should_register_by_id(union): | ||
| type_id = union.type_id | ||
| lines.append(f" fory.registerUnion({union.name}, {type_id});") | ||
|
|
There was a problem hiding this comment.
The generated registration helper uses message/union names ({message.name}, {union.name}) in value position, but messages are emitted as export interface and unions as export type, which don’t exist at runtime in TypeScript. This will make the generated .ts fail to compile (only refers to a type, but is being used as a value). Either emit runtime values to register (e.g., classes/constructors or exported descriptors) or remove/adjust registration generation for type-only declarations.
| # Register enums | ||
| for enum in self.schema.enums: | ||
| if self.is_imported_type(enum): | ||
| continue | ||
| if self.should_register_by_id(enum): | ||
| type_id = enum.type_id | ||
| lines.append(f" fory.register({enum.name}, {type_id});") | ||
|
|
||
| # Register messages | ||
| for message in self.schema.messages: | ||
| if self.is_imported_type(message): | ||
| continue | ||
| self._generate_message_registration(message, lines) | ||
|
|
||
| # Register unions | ||
| for union in self.schema.unions: | ||
| if self.is_imported_type(union): | ||
| continue | ||
| if self.should_register_by_id(union): | ||
| type_id = union.type_id | ||
| lines.append(f" fory.registerUnion({union.name}, {type_id});") |
There was a problem hiding this comment.
generate_registration() only registers types when should_register_by_id() is true; types without an explicit/generated type_id will be silently skipped, unlike other generators which fall back to namespace/type-name registration. If TS registration is intended to support non-ID schemas, add the named-registration path (or make the omission explicit).
|
The implementation should be something like #3406 |
|
sure @chaokunyang |
Generate TypeScript type definitions and interfaces from FDL IDL for usage with serialization. Features: - Type-safe message interfaces, enums, and discriminated unions - Compatible with both TypeScript and JavaScript - Type ID registration helpers - No runtime dependencies (gRPC-free type definitions) - Proper package name handling and conversion to module names - Support via --typescript_out CLI flag
c6f379c to
92cd483
Compare
|
@chaokunyang I have tried to make implementation according to #3406 . Please have a look on it now and tell me if anything require any change. |
|
@chaokunyang I have tried to address all your comments. so please have a look on it now and guide me further steps. |
|
Why you still have lots of files with typescript in the filename? |
compiler/fory_compiler/cli.py
Outdated
| ) | ||
|
|
||
| parser.add_argument( | ||
| "--typescript_out", |
There was a problem hiding this comment.
Please check your pr carefully, do not use typescript, use javascrip across all cases even for filename
compiler/README.md
Outdated
| ## Features | ||
|
|
||
| - **Multi-language code generation**: Java, Python, Go, Rust, C++, C# | ||
| - **Multi-language code generation**: Java, Python, Go, Rust, C++, C#, JS/TS |
There was a problem hiding this comment.
I said in another comment, do not use such abbr.
|
My bad. i was done from my side but there are still many typescript. |
|
@chaokunyang What should be the file extension generated by javascript compilet? will this be .ts or .js ? |
|
file extension should use |
|
@chaokunyang I have renamed typescript to javascript. Please have a look at it now. |
| // --------------------------------------------------------------------------- | ||
|
|
||
| // --- addressbook types --- | ||
| const DogType = Type.struct(104, { |
There was a problem hiding this comment.
This is totally wrong, you are not testing the generated code from IDL. You are just define the types manually, so the whole integration_tests/idl_tests/javascript/roundtrip.ts doesn't cover any features in fory javascript IDL.
Have you read other how IDL tests of other language works?
why you don't test on generated code but use manually defined struct instead?
| # We need to pass a string name or a dummy object for registration. | ||
| # For now, we'll pass the string name of the type. | ||
| if self.should_register_by_id(type_def): | ||
| return f"{target_var}.{method}('{type_name}', {type_def.type_id});" |
There was a problem hiding this comment.
The generated helper calls fory.register(...) / fory.registerUnion(...), but the JS runtime API exposes registerSerializer(...) and has no register method on Fory.
I reproduced this by generating addressbook.ts and calling registerAddressbookTypes(new Fory()); it throws TypeError: fory.register is not a function.
Can we switch helper generation to a runtime-supported API, or skip generating this helper until that API exists?
| """Get the TypeScript module name from package.""" | ||
| if self.package: | ||
| parts = self.package.split(".") | ||
| return self.to_camel_case(parts[-1]) |
There was a problem hiding this comment.
Module naming currently uses only the last package segment (foo.v1 -> v1, bar.v1 -> v1). That can overwrite generated files and break imports.
I reproduced this with two imported schemas (foo.v1 and bar.v1): both emitted v1.ts, and the consumer import failed (Module "./v1" has no exported member Alpha).
Please use a collision-safe naming strategy (for example full package path, source file stem, or disambiguation when names clash).
| if not type_str: | ||
| # If not a primitive, treat as a message/enum type | ||
| type_str = self.safe_type_identifier( | ||
| self.to_pascal_case(field_type.name) |
There was a problem hiding this comment.
Qualified nested names can generate invalid TypeScript here.
field_type.name is emitted directly, so Outer.Inner stays Outer.Inner, but nested declarations are flattened to export interface Inner { ... }.
I reproduced this with Outer.Inner inner = 1; and TypeScript fails with Outer only refers to a type, but is being used as a namespace here.
Please resolve named types against the schema and emit the actual generated symbol name.
| nullable=False, | ||
| parent_stack=parent_stack, | ||
| ) | ||
| type_str = f"Record<{key_type}, {value_type}>" |
There was a problem hiding this comment.
Record<K, V> is not valid for all legal FDL map key types.
Example: map<uint64, string> maps to Record<bigint | number, string>, and TypeScript rejects that (TS2344, key must be string | number | symbol).
Could we constrain key mapping for Record and fall back to Map<K, V> when key types are not valid Record keys?
|
|
||
| return type_str | ||
|
|
||
| def _default_initializer( |
There was a problem hiding this comment.
_default_initializer looks unused in this generator (I could not find a call site), so this code currently adds maintenance overhead without affecting output.
_resolve_named_type appears to be used only through this unused path.
Please either remove these helpers for now or wire them into generation with tests.
| return self.to_camel_case(parts[-1]) | ||
| return "generated" | ||
|
|
||
| def _module_file_name(self) -> str: |
There was a problem hiding this comment.
_module_file_name appears unused right now. Output path is built from get_module_name() in generate_file, so this helper never affects emitted files.
This is easy to misread as active behavior. Either remove it or use it consistently in output path generation.
| @@ -0,0 +1,14 @@ | |||
| { | |||
| "compilerOptions": { | |||
| "target": "es2020", | |||
There was a problem hiding this comment.
With target: es2020, this config does not include WeakRef typings.
I hit TS2304: Cannot find name WeakRef running npx jest --ci locally when @fory/fory resolves to source files.
CI passes because javascript/npm run build runs first and provides prebuilt dist, but this setup is still fragile outside that exact order.
Can we add the proper lib (es2021 / es2021.weakref) or otherwise avoid type-checking linked package internals here?
|
|
||
| try { | ||
| fileRoundTrip(envVar, typeInfos, rootType, foryOptions); | ||
| } catch (e: any) { |
There was a problem hiding this comment.
This broad catch turns many peer runtime failures into silent skips (File left unchanged).
That can let Java-side checks pass even if JavaScript roundtrip logic is broken.
Could we only swallow known unsupported cases (for example explicit union gaps) and fail fast for unexpected exceptions?
|
|
||
| const compatible = process.env["IDL_COMPATIBLE"] === "true"; | ||
|
|
||
| // The Fory JS runtime does not support compatible mode (class metadata / |
There was a problem hiding this comment.
This early return skips all compatible-mode roundtrips, so this peer path does not actually validate JS behavior in compatible mode.
If compatible mode is intentionally unsupported, can we gate this as an explicit capability and make that coverage gap obvious in test output/docs?
docs/compiler/compiler-guide.md
Outdated
| | Rust | `rust` | `.rs` | Structs with derive macros | | ||
| | C++ | `cpp` | `.h` | Structs with FORY macros | | ||
| | C# | `csharp` | `.cs` | Classes with Fory attributes | | ||
| | JavaScript | `javascript` | `.js` | Interfaces with registration function | |
There was a problem hiding this comment.
Doc and implementation are out of sync here.
This table says JavaScript output extension is .js, but the generator emits .ts.
Please align the docs with the actual output (or change generator behavior if .js is intended).
| z: Type.float32(), | ||
| }); | ||
|
|
||
| const ColorEnum = { |
There was a problem hiding this comment.
ColorEnum looks unused in this file (I cannot find any reference).
If it is not needed, removing it would make the roundtrip script a bit clearer.
|
@chaokunyang Let me fix all of these issues. |
Summary
Implements TypeScript code generation for Fory IDL within the fory-compiler, converting FDL (Fory Definition Language) schema files into pure TypeScript type definitions. Zero runtime dependencies, with comprehensive test coverage (12/12 tests passing), supporting messages, enums, unions, and all primitive types.
Changes
Core Implementation
compiler/fory_compiler/generators/typescript.py - TypeScript code generator extending BaseGenerator (365 lines)
compiler/fory_compiler/generators/init.py - Registration of TypeScriptGenerator in the compiler ecosystem
compiler/fory_compiler/cli.py - Added --typescript_out CLI argument for TypeScript code generation
compiler/fory_compiler/tests/test_typescript_codegen.py - 12 golden codegen tests covering:
Features
Fixes #3280