-
-
Notifications
You must be signed in to change notification settings - Fork 307
Added PO handling for new lines + tests #3282
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughPo parsing was changed to split header meta lines by raw newline characters and to interpret common escape sequences inside quoted strings (e.g., Changes
Sequence Diagram(s)sequenceDiagram
participant FS as FileSystem
participant Parser as PoParser
participant Test as TestRunner
FS->>Parser: load example.po (header + entries)
note right of Parser #DDEEFF: Header processing
Parser->>Parser: processHeader(split by raw "\n")
note right of Parser #F6F2DD: Entry parsing
Parser->>Parser: parse msgid/msgstr chars
alt escaped char inside quotes
Parser->>Parser: translate '\n','\r','\t','"','\\' → actual char
else unknown escape
Parser->>Parser: keep backslash + char
end
Parser->>Test: produce translations (11)
Test->>FS: compare against updated example.po expectations
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (4)
backend/data/src/main/kotlin/io/tolgee/formats/po/in/PoParser.kt (2)
54-68: Header split on LF aligns with new decoding; add safety and locale fixes.Good change. Two follow-ups:
- Guard malformed header lines (no colon) to avoid IndexOutOfBounds.
- Use Locale.ROOT for case-insensitive keys; avoid default-locale surprises (e.g., Turkish i).
- Minor: use lineSequence()/forEach instead of map-for-side-effects.
Apply:
- it.msgstr.split("\n").map { metaLine -> - val trimmed = metaLine.trim() - if (trimmed.isBlank()) { - return@map - } - val colonPosition = trimmed.indexOf(":") - val key = trimmed.substring(0 until colonPosition) - val value = trimmed.substring(colonPosition + 1).trim() - when (key.lowercase(Locale.getDefault())) { + it.msgstr.lineSequence().forEach { metaLine -> + val trimmed = metaLine.trim() + if (trimmed.isEmpty()) return@forEach + val colonPosition = trimmed.indexOf(':') + if (colonPosition <= 0) return@forEach + val key = trimmed.substring(0, colonPosition).lowercase(Locale.ROOT) + val value = trimmed.substring(colonPosition + 1).trim() + when (key) { "project-id-version" -> result.projectIdVersion = value "language" -> result.language = value "plural-forms" -> result.pluralForms = value else -> result.other[key] = value } - } + }Please add a small test asserting header meta (e.g., language/plural-forms) still parses when a line lacks a colon or has trailing spaces.
129-137: Escape decoding inside quotes looks right; consider completing the table.Current set covers n/r/t/"/\. Add common C escapes b (backspace), f (form feed), v (vertical tab). Unknowns remain literal (good).
- val specialEscape: Char? = if (quoted) when (this) { + val specialEscape: Char? = if (quoted) when (this) { 'n' -> '\n' 'r' -> '\r' 't' -> '\t' + 'b' -> '\b' + 'f' -> '\u000C' + 'v' -> '\u000B' '"' -> '"' '\\' -> '\\' else -> null } else nullAdd one assertion that unknown escapes (e.g., "\q") are preserved as two characters and that "\b" maps to backspace when enabled.
Also applies to: 138-143
backend/data/src/test/kotlin/io/tolgee/unit/formats/po/in/PoParserTest.kt (1)
30-31: Avoid index fragility; assert by msgid.Order can shift; look up entries by msgid for robustness.
- assertThat(result.translations[10].msgstr.toString()).isEqualTo("This\nis\na\nmultiline\nstring") - assertThat(result.translations[11].msgstr.toString()).isEqualTo("This\r\nis\r\na\r\nmultiline\r\nstring") + val lf = result.translations.first { it.msgid.toString() == "Multiline message with \\n" } + assertThat(lf.msgstr.toString()).isEqualTo("This\nis\na\nmultiline\nstring") + val crlf = result.translations.first { it.msgid.toString() == "Multiline message with \\n\\r" } + assertThat(crlf.msgstr.toString()).isEqualTo("This\r\nis\r\na\r\nmultiline\r\nstring")backend/data/src/test/kotlin/io/tolgee/unit/formats/po/in/PoFileProcessorTest.kt (1)
32-32: Consider asserting presence, not only count.Exact size can fluctuate as the fixture evolves. Optionally assert keys exist to make intent explicit.
assertThat(mockUtil.fileProcessorContext.translations).hasSize(11) +assertThat(mockUtil.fileProcessorContext.translations.keys) + .contains("Multiline message with \\n", "Multiline message with \\n\\r")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
backend/data/src/main/kotlin/io/tolgee/formats/po/in/PoParser.kt(2 hunks)backend/data/src/test/kotlin/io/tolgee/unit/formats/po/in/PoFileProcessorTest.kt(1 hunks)backend/data/src/test/kotlin/io/tolgee/unit/formats/po/in/PoParserTest.kt(1 hunks)backend/data/src/test/resources/import/po/example.po(1 hunks)
🔇 Additional comments (1)
backend/data/src/test/resources/import/po/example.po (1)
53-57: Test data additions LGTM.The LF/CRLF cases are well chosen and align with parser changes.
Summary by CodeRabbit
Bug Fixes
Tests