This is what I want to do:
Convert a folder of HTML files into Markdown, also copying over the XML metadata of each HTML file by converting it into YAML.
I have done some research and came across the following commands:
1. find . -name \*.md -type f -exec pandoc -o {}.txt {} \;
* [This was found here.](https://stackoverflow.com/questions/10323317/batch-processing-pandoc-conversions) It is a command that works and uses pandoc; however, the file extensions come out as ".html.md", not ".md".
2. find / -name "*.md" -type f -exec sh -c 'markdown "${0}" > "${0%.md}.html"' {} \;
* [This was found here.](https://unix.stackexchange.com/questions/43669/use-find-command-to-convert-markdown-files-to-html) This apparently takes away the ".html.md" and turns it into ".md", but it does not use pandoc.
3. pandoc -f html -t markdown -s input.html -o output.md
* [This was found here.](https://stackoverflow.com/questions/22866498/use-pandoc-to-create-yaml-metadata-from-html-meta-tags?rq=1) This is the pandoc command that apparently copies over the metadata and turns it into YAML; however, it does not work on a folder of files, only on one.
What I need is a single command that uses pandoc, gives the converted files the ".md" extension rather than ".html.md", and converts the XML metadata into YAML. All of this can be achieved with these three commands; they just need to be merged into one.
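As a rough idea of what the merged command might look like, here is a minimal sketch. It borrows the `find ... -exec sh -c` pattern and the suffix-stripping expansion from command 2, and the pandoc options from command 3. The function name `html_to_md` is made up for illustration, and I am assuming the input files end in ".html" and that `-s` is what makes pandoc write the metadata as YAML, as the third link suggests.

```shell
# Hypothetical helper (the name is not part of pandoc): convert every
# .html file under a directory to Markdown, writing name.md next to name.html.
html_to_md() {
    dir="${1:-.}"   # default to the current directory

    # find passes the matching files to a small inline script.
    # "${f%.html}.md" strips the trailing ".html" before adding ".md",
    # so the result is "page.md" rather than "page.html.md".
    # The pandoc options are the ones from command 3 above.
    find "$dir" -name '*.html' -type f -exec sh -c '
        for f in "$@"; do
            pandoc -f html -t markdown -s "$f" -o "${f%.html}.md"
        done
    ' sh {} +
}
```

Calling `html_to_md .` would process the current folder and its subfolders. The original HTML files are left in place.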
Asked by st john smith
(61 rep)
Mar 14, 2015, 03:35 AM
Last activity: Oct 19, 2023, 11:42 AM