English {#english}
Sprint 10 Review: Delphi Strings in C Codegen
Mintlify docs tour: Sprint 7 delivered generics. Sprint 10 is where the native build stops lying about string semantics.
CrabPascal runs Pascal two ways: run interprets the AST through complete_runtime.rs, and build-exe generates C and then invokes gcc or clang. Sprint 9 started measuring parity between the two paths. Sprint 10 (v2.18.0) made string behavior provably equivalent whenever a C toolchain is available.
Deliverables from Sprint 10 Review
The official review lists:
Shipped:
- Parser hardening: Trim/Copy/etc. are no longer confused with types (T + uppercase heuristic; builtin denylist).
- Codegen fixes: forward declarations for pascal_* helpers; WriteLn with %s for strings; return 0 in main.
- Tests: expanded run_build_parity; the native gate skips gracefully without gcc.
- Tag v2.18.0 on the compiler.
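The T + uppercase heuristic with a builtin denylist can be sketched in a few lines. The C below is only an illustration under stated assumptions: the function names and the denylist contents are hypothetical, and the real parser is presumably not written in C. An identifier is treated as a likely type name when it starts with T followed by an uppercase letter, unless it matches a known builtin. The denylist comparison ignores case because Pascal identifiers are case-insensitive, so TRIM is still recognized as the builtin.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical denylist; the review does not list the real entries. */
static const char *const BUILTIN_DENYLIST[] = { "Trim", "Copy", "Length" };

/* Case-insensitive ASCII comparison, because Pascal identifiers ignore case. */
static bool ident_equals(const char *a, const char *b) {
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) {
            return false;
        }
        a++;
        b++;
    }
    return *a == *b; /* both must end at the same position */
}

static bool is_denylisted_builtin(const char *ident) {
    size_t count = sizeof BUILTIN_DENYLIST / sizeof BUILTIN_DENYLIST[0];
    for (size_t i = 0; i < count; i++) {
        if (ident_equals(ident, BUILTIN_DENYLIST[i])) {
            return true;
        }
    }
    return false;
}

/* True when ident looks like a Delphi-style type name (T + uppercase letter)
 * and is not a denylisted builtin. */
bool looks_like_type_name(const char *ident) {
    if (ident[0] != 'T' || !isupper((unsigned char)ident[1])) {
        return false;
    }
    return !is_denylisted_builtin(ident);
}
```

With this sketch, TList and TObject pass the heuristic, Trim fails the uppercase check, and TRIM is rejected by the denylist.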
Deliberately not shipped:
- VS Code / Cursor marketplace extension: still at v2.17.0.
- Reason: the stability gate requires run vs native build parity with a real C toolchain; CI lacked gcc, so the gate was skipped.
- Tracked as TD-MARKETPLACE-001 in the technical debt backlog.
The string parity problem
Delphi strings are not C char*. CrabPascal models UTF-16 code units in the runtime and exposes helpers in src/stubs.c:
int pascal_Length(const char* s);
char* pascal_Copy(const char* s, int index, int count);
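To see why a byte-oriented length gives the wrong answer here, it helps to count UTF-16 code units and bytes side by side. The sketch below assumes the generated C passes well-formed UTF-8 strings (the review does not say which encoding stubs.c uses), and utf16_units_in_utf8 is a hypothetical name for illustration, not the body of pascal_Length.

```c
#include <stddef.h>
#include <string.h>

/* Counts how many UTF-16 code units a UTF-8 string would occupy.
 * Assumes well-formed UTF-8; malformed input is not detected. */
size_t utf16_units_in_utf8(const char *s) {
    size_t units = 0;
    for (const unsigned char *p = (const unsigned char *)s; *p != 0; p++) {
        if ((*p & 0xC0) == 0x80) {
            continue; /* continuation byte: belongs to the previous lead byte */
        }
        /* A 4-byte sequence (lead byte >= 0xF0) encodes a code point above
         * U+FFFF, which UTF-16 represents as a surrogate pair (2 units). */
        units += (*p >= 0xF0) ? 2 : 1;
    }
    return units;
}
```

For "año" encoded as UTF-8, strlen reports 4 bytes while this function reports 3 code units. A character outside the Basic Multilingual Plane counts as 2, because UTF-16 encodes it as a surrogate pair.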
Before Sprint 10, generated C called Length("año") directly, which left the symbol undefined and the semantics wrong. After Sprint 10, codegen maps builtins to their pascal_* helpers, and parity tests compare the stdout of both paths:
# With gcc/clang in PATH:
crab-pascal run tests/fixtures/string_conformance.pas
crab-pascal build-exe tests/fixtures/string_conformance.pas -o /tmp/out
Without a toolchain, build exits non-zero with a clear message instead of silently falling back to AST execution, which resolves TD-RUNTIME-001.
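The fail-loudly behavior can be illustrated with a small PATH lookup. This is a minimal sketch for POSIX systems, not CrabPascal's actual detection logic: find_in_path and require_c_toolchain are hypothetical names, the error text is invented for the example, and empty PATH entries are ignored for brevity.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Returns true when an executable entry named `program` exists in some
 * directory listed in PATH. */
bool find_in_path(const char *program) {
    const char *path = getenv("PATH");
    if (path == NULL) {
        return false;
    }
    /* strtok modifies its input, so work on a private copy. */
    char *copy = malloc(strlen(path) + 1);
    if (copy == NULL) {
        return false;
    }
    strcpy(copy, path);

    bool found = false;
    for (char *dir = strtok(copy, ":"); dir != NULL && !found;
         dir = strtok(NULL, ":")) {
        char candidate[4096];
        int n = snprintf(candidate, sizeof candidate, "%s/%s", dir, program);
        if (n > 0 && (size_t)n < sizeof candidate &&
            access(candidate, X_OK) == 0) {
            found = true;
        }
    }
    free(copy);
    return found;
}

/* Returns a process exit code: 0 when gcc or clang is available, 1 with a
 * message on stderr otherwise. There is no fallback path. */
int require_c_toolchain(void) {
    if (find_in_path("gcc") || find_in_path("clang")) {
        return 0;
    }
    fprintf(stderr,
            "error: no C compiler (gcc or clang) found in PATH; "
            "native build cannot proceed\n");
    return 1;
}
```

The design point is that the caller propagates the non-zero code rather than switching to the interpreter, so a missing compiler is visible in CI logs and exit status.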
Acceptance criteria (documented in backlog)
From the technical debt backlog:
- With gcc/clang: build string_conformance compiles, runs, and its stdout matches run.
- Without toolchain: build exit ≠ 0; the message mentions the C compiler or the generated .c file.
- Generated C uses only pascal_* for string builtins, including Trim.
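The last criterion lends itself to a mechanical check. As a rough sketch (the function name and the builtin list are assumptions, and this is not the project's test code), one could scan the generated C for a builtin name that starts an identifier and is immediately followed by an opening parenthesis. The scan is deliberately naive: it does not skip string literals or comments, and it does not allow whitespace before the parenthesis.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list of Pascal string builtins that must be mapped. */
static const char *const STRING_BUILTINS[] = { "Length", "Copy", "Trim" };

static int is_ident_char(char c) {
    return isalnum((unsigned char)c) || c == '_';
}

/* Returns the name of the first builtin called without the pascal_ prefix,
 * or NULL when every call site is mapped. */
const char *find_bare_builtin_call(const char *c_source) {
    size_t count = sizeof STRING_BUILTINS / sizeof STRING_BUILTINS[0];
    for (size_t i = 0; i < count; i++) {
        const char *name = STRING_BUILTINS[i];
        size_t len = strlen(name);
        for (const char *hit = strstr(c_source, name); hit != NULL;
             hit = strstr(hit + 1, name)) {
            /* In "pascal_Trim(" the match is preceded by '_', so it is part
             * of a longer identifier and is not a bare call. */
            int starts_ident = (hit == c_source) || !is_ident_char(hit[-1]);
            if (starts_ident && hit[len] == '(') {
                return name;
            }
        }
    }
    return NULL;
}
```

A line such as pascal_Trim(pascal_Copy(s, 1, 3)) passes, while pascal_Trim(Copy(s, 1, 3)) reports Copy as the offending builtin.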
What this means for developers
| Scenario | Recommendation |
|---|---|
| String-heavy logic, no gcc locally | Use run; trust the interpreter |
| CI with gcc | Add parity test to your pipeline |
| Shipping native binary | Verify string_conformance gate passes first |
Sprint 10 did not solve OO codegen (properties, methods, exceptions) — those land in Sprints 12–13. It did establish the pattern: measure parity, fail loudly, document skips.
Docs and next steps
Mintlify: Roadmap → Sprint 10 Review · Release v2.18.0. Next in series: Sprint 13 — honest build-exe when exceptions appear in source. The compiler stops pretending; developers gain trust.
Português {#portugus}
Sprint 10 Review: strings in C codegen
Mintlify tour: Sprint 7 delivered generics. Sprint 10 is where the native build stops lying about string semantics.
CrabPascal executes Pascal in two ways: run interprets the AST via complete_runtime.rs, and build-exe generates C and invokes gcc/clang. Sprint 9 began measuring parity. Sprint 10 (v2.18.0) made string behavior provably equivalent when a C toolchain exists.
Deliverables from the Sprint 10 Review
The official review lists:
Shipped:
- Parser: Trim/Copy/etc. are not confused with types (T + uppercase heuristic; builtin denylist).
- Codegen: pascal_* forward declarations; WriteLn with %s for strings; return 0 in main.
- Tests: expanded run_build_parity; the native gate skips when gcc is absent.
- Tag v2.18.0 on the compiler.
Deliberately not shipped:
- VS Code / Cursor marketplace extension: remains at v2.17.0.
- Reason: the stability gate requires run vs native build parity with a real C toolchain; CI had no gcc, so the gate was skipped.
- Tracked as TD-MARKETPLACE-001 in the technical debt backlog.
The string parity problem
Delphi strings are not C char*. CrabPascal models UTF-16 code units in the runtime and exposes helpers in src/stubs.c:
int pascal_Length(const char* s);
char* pascal_Copy(const char* s, int index, int count);
Before Sprint 10, generated C called Length("año") directly, which left symbols undefined and the semantics wrong. After Sprint 10, codegen maps builtins to pascal_* and parity tests compare stdout:
# With gcc/clang in PATH:
crab-pascal run tests/fixtures/string_conformance.pas
crab-pascal build-exe tests/fixtures/string_conformance.pas -o /tmp/out
Without a toolchain, build exits with a non-zero code and a clear message instead of silently falling back to AST execution, which fixes TD-RUNTIME-001.
Acceptance criteria (documented in the backlog)
- With gcc/clang: build string_conformance compiles, runs, and its stdout equals that of run.
- Without toolchain: build exit ≠ 0; the message mentions the C compiler or the generated .c file.
- Generated C uses only pascal_* for string builtins, including Trim.
What this means for developers
| Scenario | Recommendation |
|---|---|
| String logic, no local gcc | Use run; trust the interpreter |
| CI with gcc | Add a parity test to the pipeline |
| Native binary in production | Verify the string_conformance gate first |
Sprint 10 did not solve OO codegen (properties, methods, exceptions); that comes in Sprints 12–13. It established the pattern: measure parity, fail loudly, document skips.
Docs and next steps
Mintlify: Roadmap → Sprint 10 Review · Release v2.18.0. Next in the series: Sprint 13, an honest build-exe when exceptions appear in the source. The compiler stops pretending; developers gain confidence.
Published on dev.to/@crabpascal · Code at CrabPascal