Avoiding awk injection

4 votes
1 answer
508 views
I have a script that reads a VCS log, converts it to LaTeX, and then uses awk to replace the keyword @COMMITS@ in a template with the resulting text:
untagged=$(get-commit-messages "$server" "$rev")
IFS=$'\n' untagged=( $untagged )  # Tokenize based on newlines
for commit in "${untagged[@]}"; do
  tex+="\\\nui{"                  # Wrap each commit in a custom command
  tex+=$(echo "$commit" | pandoc -t latex --wrap=none)
  tex+="}
"
done

awk -v r="$tex" '{gsub(/@COMMITS@/,r)}1' template
Since commit messages are really just text, I use pandoc -t latex to ensure everything is escaped properly for the LaTeX parser. My problem is that awk un-escapes the result. If a commit message contains a _, pandoc replaces it with \_, but awk then converts it back to a plain _ and prints a warning:
awk: warning: escape sequence `\_' treated as plain `_'
That causes the LaTeX parser to fail. Is there a way to prevent awk from un-escaping the text? If not, I'll look for a non-awk solution for the text replacement.
Asked by Stewart (15631 rep)
Jun 7, 2021, 02:06 PM
Last activity: Jun 7, 2021, 02:42 PM
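
One possible approach, sketched here under some assumptions rather than taken from the original thread: values assigned with -v go through awk's escape-sequence processing, whereas values read from ENVIRON do not. In addition, gsub treats & and backslashes in the replacement text specially, so a literal replacement built from index and substr may be safer. The function name replace_commits and the environment variable name tex_env are hypothetical, chosen only for this illustration.

```shell
# Hypothetical helper: print file $2 with every literal @COMMITS@ replaced by $1.
# The text is handed to awk through the environment, so awk does not apply
# escape-sequence processing to it (unlike an assignment made with -v).
replace_commits() {
  tex_env="$1" awk '
    BEGIN { key = "@COMMITS@"; r = ENVIRON["tex_env"] }
    {
      rest = $0
      out = ""
      # Literal search and replace: index/substr never interpret & or
      # backslashes, whereas the replacement argument of gsub would.
      while ((i = index(rest, key)) > 0) {
        out = out substr(rest, 1, i - 1) r
        rest = substr(rest, i + length(key))
      }
      print out rest
    }
  ' "$2"
}
```

With this sketch the call would look like replace_commits "$tex" template. Because nothing un-escapes the text any more, the wrapper command would presumably have to be written with a single backslash (for example tex+='\nui{') instead of the doubled form used in the question.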