Commit d82d97f
refactor(compiler): refactor fory compiler into hierarchical architecture (#3179)
## Why?
The current Fory compiler mixes FDL-native and protobuf-compatible
syntax handling in a single parser, making it difficult to add support
for new IDL formats like .proto and .fbs files. The validation logic is
scattered across parsing and Schema.validate(), and there's no clear
separation between parsing, semantic analysis, and code generation.
## What does this PR do?
This PR refactors the Fory compiler into a hierarchical, multi-frontend
architecture that establishes the Fory IDL AST as the **canonical
intermediate representation (IR)**, with separate frontend parsers for
different IDL formats.
**Key changes:**
1. **New directory structure** with clear separation of concerns:
   - `ir/` - Intermediate Representation (canonical Fory AST)
     - `ast.py` - Core AST node definitions with `SourceLocation` tracking
     - `types.py` - Extended type system (primitives including varint, tagged types, etc.)
     - `validator.py` - Centralized semantic validation
     - `emitter.py` - FDL text emitter for debugging translated schemas
   - `frontend/` - IDL Frontends
     - `base.py` - Base frontend interface
     - `fdl/` - FDL Frontend (lexer + parser)
     - `proto/` - Protobuf Frontend (lexer + parser + translator to Fory IR)
     - `fbs/` - FlatBuffers Frontend (placeholder)
2. **Proto3 frontend** - Full support for parsing `.proto` files and translating to Fory IR:
   - Proto3 syntax parsing (messages, enums, nested types, maps, repeated fields)
   - Type mapping (int32→var_uint32, sint32→varint32, fixed32→uint32, etc.)
   - Fory extension options (`(fory).id`, `(fory).ref`, `(fory).nullable`, etc.)
   - Well-known types support (google.protobuf.Timestamp, Duration)
3. **Simplified FDL syntax** - Removed protobuf-style `(fory)` prefix from options:
   - File options: `option use_record_for_java_message = true;`
   - Type options: `message Foo [id=100] { ... }`
   - Field options: `MyType data = 1 [ref=true, nullable=true];`
4. **Extended type system** with new primitive kinds:
   - Signed/unsigned variants: `int8`-`int64`, `uint8`-`uint64`
   - Variable-length encoding: `varint32`, `varint64`, `var_uint32`, `var_uint64`
   - Tagged types: `tagged_int64`, `tagged_uint64`
   - Additional types: `float16`, `duration`, `decimal`
5. **Improved code generators** for all target languages with better type mapping
6. **CLI enhancements**:
   - Auto-detect input format by file extension (`.fdl`, `.proto`)
   - New `--emit-fdl` flag to output translated FDL for debugging
7. **Cross-language integration tests** for proto-based schemas
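
To make the extension-based frontend selection and the proto type translation listed above more concrete, here is a minimal sketch. It is not the compiler's actual code: `FORMAT_BY_EXTENSION`, `PROTO_TO_FORY`, `detect_format`, and `translate_proto_type` are hypothetical names, the table holds only the three pairs quoted in the list, and rejecting other extensions (including `.fbs`) is an assumption.

```python
from pathlib import Path

# Hypothetical registry: the description names only `.fdl` and `.proto`
# as auto-detected input formats.
FORMAT_BY_EXTENSION = {".fdl": "fdl", ".proto": "proto"}

# Scalar pairs copied from the description; the real table is larger.
PROTO_TO_FORY = {
    "int32": "var_uint32",
    "sint32": "varint32",
    "fixed32": "uint32",
}


def detect_format(path: str) -> str:
    """Pick a frontend name from the file extension, case-insensitively."""
    suffix = Path(path).suffix.lower()
    if suffix not in FORMAT_BY_EXTENSION:
        raise ValueError(f"unsupported input format: {suffix!r}")
    return FORMAT_BY_EXTENSION[suffix]


def translate_proto_type(proto_type: str) -> str:
    """Map a proto scalar to a Fory IR type name.

    Names without an entry pass through unchanged. That is an assumption
    of this sketch (message and enum names would need no renaming).
    """
    return PROTO_TO_FORY.get(proto_type, proto_type)
```

Keeping both lookups as plain tables means a new frontend or scalar mapping is a one-line addition, which matches the multi-frontend goal of the refactor.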
## Related issues
Closes #3178
## Does this PR introduce any user-facing change?
- CLI now accepts `.proto` files directly (in addition to `.fdl`)
- FDL syntax simplified: `option (fory).xxx` → `option xxx`
- New primitive types available in FDL
- [x] Does this PR introduce any public API change?
- [ ] Does this PR introduce any binary protocol compatibility change?
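
Existing `.fdl` files that still use the old file-level form need the `option (fory).xxx` → `option xxx` rewrite. A rough migration sketch follows. The helper `simplify_file_options` is hypothetical and not part of the compiler. It covers file-level `option` statements only, since the old type and field option syntax is not shown here.

```python
import re

# Matches `option (fory).<name>` and captures <name>.
_FORY_OPTION = re.compile(r"\boption\s+\(fory\)\.(\w+)")


def simplify_file_options(fdl_text: str) -> str:
    """Drop the `(fory).` prefix from file-level option statements."""
    return _FORY_OPTION.sub(r"option \1", fdl_text)


old = "option (fory).use_record_for_java_message = true;"
print(simplify_file_options(old))
```

Lines that already use the new syntax are left unchanged, so the helper can be run repeatedly over the same file.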
## Benchmark
N/A - This is a compiler refactoring that doesn't affect runtime
performance.

1 parent 99585af commit d82d97f
51 files changed
Lines changed: 3475 additions & 1072 deletions
File tree
- .github
- benchmarks/cpp_benchmark
- compiler
- fory_compiler
- frontend
- fbs
- fdl
- proto
- generators
- ir
- tests
- cpp/fory
- meta
- serialization
- docs
- benchmarks/cpp
- specification
- integration_tests/idl_tests
- cpp
- go
- java/src/test/java/org/apache/fory/idl_tests
- proto
- python/src/idl_tests
- rust/tests
- java/fory-core/src/main/java/org/apache/fory/type
- python/pyfory
- rust/fory-core/src