packet:xrouter:manpages:parsing
Differences
This shows you the differences between two versions of the page.
| Both sides previous revision / Previous revision / Next revision | Previous revision | ||
| packet:xrouter:manpages:parsing [2025/04/20 07:06] – m0mzf | packet:xrouter:manpages:parsing [2025/04/22 02:40] (current) – removed m0mzf | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| - | ====== Script to parse Xrouter' | ||
| - | <file awk parse-pzt-manhlp.sh> | ||
| - | #!/bin/bash | ||
| - | ################################## | ||
| - | # by Jason M0MZF (not a programmer!) | ||
| - | # bash / awk / hammer / nail etc. | ||
| - | # License - MIT. Crack on people. | ||
| - | # | ||
| - | # Script to parse Paula G8PZT' | ||
| - | # "some simple markup language" | ||
| - | # https:// | ||
| - | # | ||
| - | # The intention is to parse all MAN / HLP files within the folders and | ||
| - | # write them with appropriate formatting to files which can then be | ||
| - | # pasted directly into the wiki. | ||
| - | # | ||
| - | # This could also be done with groff > HTML > pandoc > ssml but pandoc' | ||
| - | # output format for SSML doesn' | ||
| - | # don't know Lua. Yet. Maybe something like this with a custom output formatter: | ||
| - | # cat ${manpage} | groff -Thtml -P -l -mmandoc 2>/ | ||
| - | # But when all you've got is awk, everything looks like a record / field... ;) | ||
| - | # | ||
| - | ################################## | ||
| - | # Instructions (destructions? | ||
| - | # | ||
| - | # - This script does not take any arguments | ||
| - | # - The only required configuration is to set the following path | ||
| - | # | ||
| - | BASEPATH=/ | ||
| - | # | ||
| - | # This folder should contain the two directories " | ||
| - | # and " | ||
| - | # called " | ||
| - | # concatenated and reformatted files. A manually-created | ||
| - | # index page exists in https:// | ||
| - | # top-level contents, and the pages linked therein have their contents | ||
| - | # copypasta' | ||
| - | # | ||
| - | ################################## | ||
| - | # Changelog | ||
| - | # 20250418 - Implemented MAN page parsing | ||
| - | # 20250419 - Implemented HLP page parsing | ||
| - | # 20250419 - Tidy up, more awk less bash, remove .MAN / .HLP from outputted headers | ||
| - | ################################## | ||
| - | # Globals | ||
| - | DATE=$(date +" | ||
| - | MANFILES=" | ||
| - | HLPFILES=" | ||
| - | OUTPUTDIR=" | ||
| - | |||
| - | # Colourise output | ||
| - | echoRed () { | ||
| - | echo -e " | ||
| - | } | ||
| - | echoGreen () { | ||
| - | echo -e " | ||
| - | } | ||
| - | |||
| - | checkRoot () { | ||
| - | if [[ $UID -eq 0 ]]; | ||
| - | then | ||
| - | echoRed "This script must NOT be run as root!" | ||
| - | exit 1 | ||
| - | fi | ||
| - | } | ||
| - | |||
| - | # Use awk to: | ||
| - | # strip out comment lines and remove any <CR> from < | ||
| - | # turn the MAN page title into a code block | ||
| - | # find every subsequent MAN page header and turn it into a DokuWiki header and | ||
| - | # | ||
| - | # | ||
| - | # (the final encapsulation is done using " | ||
| - | awkParseMan=' | ||
| - | { | ||
| - | if (NR==1 || NR==2) # For the first two lines | ||
| - | { | ||
| - | gsub(" | ||
| - | if (/^;/ || NF==0) {next} # skip the subsequent print function for comment or empty lines | ||
| - | print "< | ||
| - | } | ||
| - | |||
| - | if (NR> | ||
| - | { | ||
| - | if (/^[A-Z]/) # If the line begins with an uppercase letter | ||
| - | { | ||
| - | starthead="</ | ||
| - | endhead=" | ||
| - | gsub(" | ||
| - | print starthead $0 endhead # and output the line | ||
| - | } | ||
| - | else # else for all other lines | ||
| - | { | ||
| - | if (/^;/) {next} # skip comment lines | ||
| - | gsub(" | ||
| - | print $0 # and output the line | ||
| - | } | ||
| - | } | ||
| - | } | ||
| - | ' | ||
| - | # Use awk to: | ||
| - | # strip out comment lines (this is always line 1, sometimes 2 and 3) and remove any <CR> from < | ||
| - | # insert a start code block in place of the now-empty line 1 | ||
| - | # (the final encapsulation is done using " | ||
| - | awkParseHlp=' | ||
| - | { | ||
| - | endhead="< | ||
| - | gsub(" | ||
| - | if (NR==1) {print endhead} # | ||
| - | if (/^;/ || NF==0) {next} # skip comment / empty lines | ||
| - | print $0 # output the refined line | ||
| - | } | ||
| - | ' | ||
| - | |||
| - | # Use awk to extract a section name from the directory structure | ||
| - | awkSectionHeader=' | ||
| - | BEGIN { FS="/" | ||
| - | { # / | ||
| - | header=" | ||
| - | print header $(NF-1) header # the penultimate field is the section name | ||
| - | } | ||
| - | ' | ||
| - | |||
| - | # Use awk to extract a name from the filename.extension | ||
| - | awkFileHeader=' | ||
| - | BEGIN { FS=" | ||
| - | { # because we want the file name from FILENAME.MAN | ||
| - | header=" | ||
| - | print header $1 header # the first field is the file name | ||
| - | } | ||
| - | ' | ||
| - | |||
| - | parseFiles () { | ||
| - | mkdir -p " | ||
| - | # Traverse folders, skipping files in base directory | ||
| - | for folder in " | ||
| - | do | ||
| - | # Get the penultimate field in file path, i.e. the section (folder) name | ||
| - | section=$(echo $folder | awk -F/ ' | ||
| - | # Format the section name as a DokuWiki header | ||
| - | echo " | ||
| - | # Spit some stuff out to the shell | ||
| - | echoRed " | ||
| - | # Traverse through files | ||
| - | for file in " | ||
| - | do | ||
| - | # Get the last field in file path, i.e. file name | ||
| - | title=$(echo $file | awk -F/ ' | ||
| - | # Format the file name as a DokuWiki header | ||
| - | echo " | ||
| - | case " | ||
| - | # For MAN files, after awk has done its job we need to remove the last line; this last line breaks | ||
| - | # the following < | ||
| - | MANFILES) awk " | ||
| - | echo -e "</ | ||
| - | ;; | ||
| - | # For HLP files we don't want to remove the last line because that truly is real content | ||
| - | HLPFILES) awk " | ||
| - | echo -e "</ | ||
| - | ;; | ||
| - | esac | ||
| - | done | ||
| - | done | ||
| - | } | ||
| - | |||
| - | #Let's go! | ||
| - | checkRoot | ||
| - | echoGreen " | ||
| - | parseFiles MANFILES | ||
| - | echoGreen " | ||
| - | parseFiles HLPFILES | ||
| - | |||
| - | </ | ||
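The field-splitting and line-cleaning ideas the removed script relies on can be tried in isolation. The following is a minimal sketch only, not the original code: the function names `section_of`, `title_of` and `clean_lines`, and the example paths, are hypothetical and were chosen for illustration. It assumes folder paths end in a trailing slash, so that the folder name is the penultimate "/"-separated field, and it leaves out the wiki markup entirely because those strings are not recoverable from this diff.

```shell
#!/bin/bash
# Illustrative helpers only; names and paths are assumptions, not the original script.

# Print the penultimate "/"-separated field of a path that ends in "/",
# i.e. the folder (section) name.
section_of () {
    printf '%s\n' "$1" | awk -F/ '{ print $(NF-1) }'
}

# Print the file name without its extension: take the last "/"-separated
# field, then the first "."-separated field of that.
title_of () {
    printf '%s\n' "$1" | awk -F/ '{ print $NF }' | awk -F. '{ print $1 }'
}

# Read stdin, strip carriage returns, and drop ";" comment lines and
# lines that are empty once the carriage return is gone.
clean_lines () {
    awk '{ gsub(/\r/, ""); if (/^;/ || NF == 0) next; print }'
}

# Example usage with made-up paths and input:
section_of "/base/MAN/CMDS/"
title_of "/base/MAN/CMDS/ARP.MAN"
printf '; a comment\r\nSome text\r\n\r\nMore text\r\n' | clean_lines
```

Keeping each step as a small function that reads an argument or stdin makes it easy to check one transformation at a time before chaining them over a whole directory tree.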
packet/xrouter/manpages/parsing.1745132818.txt.gz · Last modified: by m0mzf
