-
-
Notifications
You must be signed in to change notification settings - Fork 112
Escape backslashes when building tab-separated row #317
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughA Boost header for string replacement was added to Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/utilities/utilities.cpp (1)
37-48: Add escaping for tabs, newlines, and carriage returns intab_separated().PostgreSQL's COPY text format requires that backslashes, newlines, carriage returns, and the delimiter character must be preceded by a backslash if they appear as part of a column value. The current implementation at
src/utilities/utilities.cpp:37-48only escapes backslashes but omits the three other required escape sequences. Sincetab_separated()uses tab as the delimiter, actual tab characters in the data would corrupt the format.Add the following escape operations before line 44 (after the backslash escape):
boost::replace_all(column, "\n", "\\n"); boost::replace_all(column, "\r", "\\r"); boost::replace_all(column, "\t", "\\t");This fix applies to all calls to
tab_separated()in the codebase (lines 466 insrc/database/Export2DB.cpp, line 79 insrc/configuration/tag_key.cpp, and line 97 ininclude/database/Export2DB.h), particularly for way names and other OSM-sourced data that may contain these characters.
🧹 Nitpick comments (2)
src/utilities/utilities.cpp (2)
40-40: Simplify the empty check.The condition
column.empty() || column == ""is redundant sinceempty()already checks for zero length.Apply this diff:
- if (column.empty() || column == "") { + if (column.empty()) {
39-39: Optional: Consider optimizing to avoid copying strings without backslashes.The loop uses pass-by-value (
auto column), which copies every string even when no escaping is needed. For better performance when backslashes are rare, consider usingconst auto&and only copying when replacement is necessary.Example optimization:
- for (auto column: columns) { + for (const auto& column: columns) { if (column.empty()) { result += "\\N\t"; + } else if (column.find('\\') != std::string::npos) { + auto escaped = column; + boost::replace_all(escaped, "\\", "\\\\"); + result += escaped + "\t"; } else { - boost::replace_all(column, "\\", "\\\\"); result += column + "\t"; }Note: This adds complexity, so only worthwhile if profiling shows string copies are a bottleneck.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/utilities/utilities.cpp(2 hunks)
🔇 Additional comments (1)
src/utilities/utilities.cpp (1)
23-23: LGTM: Header addition is correct.The Boost string replace header is necessary for the
boost::replace_allfunction used below.
|
All tests are failing, Create the #318 and merged your changes |
|
Thank you for the quick review and merge! |
Thanks for making and maintaining this package, it's really excellent! I have been running into the issue mentioned in #259 (comment) where a name of
"\"causes the process to fail with:I noticed that in the past there was escaping of backslashes (a120fc2), but I wasn't able to track down the removal, so I've taken a similar approach here. I've included a minimal test OSM file below that fails without the patch and succeeds with it. It's based on a much larger file that fails and succeeds in the same way.
Test File
Changes proposed in this pull request:
@pgRouting/admins
Summary by CodeRabbit