Replicate doc build in CI #5302

Open
21 of 25 tasks
wenzeslaus opened this issue Mar 4, 2025 · 0 comments

At the time of writing, the CI Documentation workflow builds the MkDocs site from Markdown using the grass repo and integrates addons from the grass-addons repo. However, the current documentation is built from many different pieces, and its build is mixed with the creation of binary snapshots (make bindist). The purpose of this issue is to track replication of the doc-related parts of the cron_grass_preview_build_binaries.sh script in CI for the Markdown doc files and the MkDocs HTML site, with the goal of creating artifacts in CI equivalent to what is now created on the server for the custom HTML doc.

List of script tasks from the script doc

Will not be addressed here:

  • packages the binaries
  • generates the install scripts
  • generates the programmer's 8 HTML manual (doc, but goes to a separate dir/URL from the user doc, so to limit the scope, moving this for later)
  • copies over generated manual pages to grass-devel/manuals/ (not relevant for creation of the artifact)

Additional tasks spotted but not listed in the main list of tasks:

  • generate i18N stats for HTML page path (goes to snapshot, not doc, so out of scope)

Tasks based on code in the script

(ordered by appearance in the script)

  • extra dependencies (will be resolved case by case)
  • see if LaTeX is actually needed or not (it is needed for the disabled module_synopsis.sh and might also have been needed for Doxygen or Sphinx; it is not needed for Sphinx, and Doxygen is out of scope, so that was not investigated)
# Preparations, on server (neteler@grasslxd:$):
# - install further dependencies:
#     apt-get install texlive-latex-extra python3-sphinxcontrib.apidoc
  • generating "synopsis list of module names and descriptions" for core based on the GUI, disabled in the script, time to decide its fate: That script (utils/module_synopsis.sh) is not used for the current documentation. We have a full index which has all tools by category/family and their labels or descriptions. Getting anything out of the GUI would be a completely new implementation, so this is simply omitted.
#### create module overview (https://trac.osgeo.org/grass/ticket/1203)
#sh utils/module_synopsis.sh
#### generate developer stuff: pygrass docs + gunittest docs
# generate pyGRASS sphinx manual (in docs/html/libpython/)
# including source code
$MYMAKE sphinxdoclib
##
echo "Copy over the manual + pygrass HTML pages:"
mkdir -p $TARGETHTMLDIR
mkdir -p $TARGETHTMLDIR/addons # indeed only relevant the very first compile time
# don't destroy the addons during update
rm -rf /tmp/addons
\mv $TARGETHTMLDIR/addons /tmp
rm -f $TARGETHTMLDIR/*.*
(cd $TARGETHTMLDIR ; rm -rf barscales colortables icons northarrows)
\mv /tmp/addons $TARGETHTMLDIR

cp -rp dist.$ARCH/docs/html/* $TARGETHTMLDIR/
echo "Copied pygrass progman to https://grass.osgeo.org/grass${VERSION}/manuals/libpython/"
  • search
# search to be improved with mkdocs or similar; for now we use DuckDuckGo
echo "Injecting DuckDuckGo search field into manual main page..."
(cd $TARGETHTMLDIR/ ; sed -i -e "s+</table>+</table><\!\-\- injected in cron_grass8_relbranch_build_binaries.sh \-\-> <center><iframe src=\"https://duckduckgo.com/search.html?site=grass.osgeo.org%26prefill=Search%20manual%20pages%20at%20DuckDuckGo\" style=\"overflow:hidden;margin:0;padding:0;width:410px;height:40px;\" frameborder=\"0\"></iframe></center>+g" index.html)
  • doc is zipped: based on the order in the script, this happens before the addons are built, yet addons are included in the ZIP file, so they are probably from the previous build (example file: https://grass.osgeo.org/grass84/manuals/grass-8.4_html_manual.zip). So this is basically the artifact, except for the name and the fact that the artifact has the files at the top level while the current ZIP file has everything under a single directory, manuals. Considering this solved. It makes sense to adjust the details later, once we know and decide what is actually supposed to happen.
# generate manual ZIP package
(cd $TARGETHTMLDIR/.. ; rm -f $TARGETHTMLDIR/*html_manual.zip ; zip -r /tmp/grass-${DOTVERSION}_html_manual.zip manuals/)
mv /tmp/grass-${DOTVERSION}_html_manual.zip $TARGETHTMLDIR/
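If the CI artifact should later match the layout of the current ZIP file (everything under a single manuals directory), the repackaging could look roughly like the sketch below. The function name zip_with_top_dir and its arguments are hypothetical and not part of the script; it only assumes that the zip command is available on the runner.

```shell
# Hypothetical helper (not part of the server script): package a flat site
# directory into a ZIP whose entries all sit under one top-level directory.
zip_with_top_dir() {
    site_dir=$1             # directory with the built HTML files
    zip_path=$2             # ZIP file to create
    top_dir=${3:-manuals}   # top-level directory name inside the ZIP

    # Make the ZIP path absolute because zip runs from the staging directory.
    case $zip_path in
        /*) ;;
        *) zip_path=$PWD/$zip_path ;;
    esac

    # Remove a ZIP from a previous run first so it is not packaged again.
    rm -f "$zip_path"

    staging=$(mktemp -d) || return 1
    # Copy the site into the staging directory under the desired name.
    cp -R "$site_dir" "$staging/$top_dir" || { rm -rf "$staging"; return 1; }
    (cd "$staging" && zip -q -r "$zip_path" "$top_dir")
    status=$?
    rm -rf "$staging"
    return $status
}
```

A call such as zip_with_top_dir site /tmp/grass_html_manual.zip would then produce entries like manuals/index.html; both paths in this call are placeholders.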
  • compile addons (into one dir)
  • compile addons into multiple directories if needed for some other step (like the modules.xml file) - not needed, it works as is and modules.xml is out of scope because it goes to a different dir
# update addon repo (addon repo has been cloned twice on the server to
# have separate grass7 and grass8 addon compilation)
(cd $SOURCE/grass$GMAJOR-addons/; git checkout grass$GMAJOR; git pull origin grass$GMAJOR)
# compile addons
cd $GRASSBUILDDIR
sh $MAINDIR/cronjobs/compile_addons_git.sh $GMAJOR \
   $GMINOR \
   $SOURCE/grass$GMAJOR-addons/src/ \
   $SOURCE/$BRANCH/dist.$ARCH/ \
   $MAINDIR/.grass$GMAJOR/addons \
   $SOURCE/$BRANCH/bin.$ARCH/grass \
   1
  • move addons to one dir (this is now not needed as the option to compile to one dir is used) - not needed since we are not splitting it
mkdir -p $TARGETHTMLDIR/addons/
# copy individual addon html files into one target dir if compiled addon
# has own dir e.g. $MAINDIR/.grass8/addons/db.join/ with bin/ docs/ etc/ scripts/
# subdir
for dir in `find $MAINDIR/.grass$GMAJOR/addons -maxdepth 1 -type d`; do
    if [ -d $dir/docs/html ] ; then
        if [ "$(ls -A $dir/docs/html/)" ]; then
            for f in $dir/docs/html/*; do
                cp $f $TARGETHTMLDIR/addons/
            done
        fi
    fi
done
sh $MAINDIR/cronjobs/grass-addons-index.sh $GMAJOR $GMINOR $GPATCH $TARGETHTMLDIR/addons/
# copy over hamburger menu assets
cp $TARGETHTMLDIR/grass_logo.png \
   $TARGETHTMLDIR/hamburger_menu.svg \
   $TARGETHTMLDIR/hamburger_menu_close.svg \
   $TARGETHTMLDIR/grassdocs.css \
   $TARGETHTMLDIR/addons/
chmod -R a+r,g+w $TARGETHTMLDIR 2> /dev/null
  • generate keywords again with addons included
# regenerate keywords.html file with addons modules keywords
export ARCH
export ARCH_DISTDIR=$GRASSBUILDDIR/dist.$ARCH
export GISBASE=$ARCH_DISTDIR
export VERSION_NUMBER=$DOTVERSION
python3 $GRASSBUILDDIR/man/build_keywords.py $TARGETMAIN/grass$GMAJOR$GMINOR/manuals/ $TARGETMAIN/grass$GMAJOR$GMINOR/manuals/addons/
unset ARCH ARCH_DISTDIR GISBASE VERSION_NUMBER
############################################
# Cloning new manual pages into grass-devel/manuals/ (following the Python manual pages concept)
# - inject canonical URL therein to point to "stable" manual page (avoiding "duplicate content" SEO punishment)
#   see https://developers.google.com/search/docs/crawling-indexing/consolidate-duplicate-urls

Notes:

  • The C programming manual does not have a canonical URL because there is only one generated from main (or one per major version).
############################################
# SEO: inject canonical link into versioned manual pages (e.g, grass85/)
# - cd back into folder of versioned HTML manual pages
# - run sed to replace an existing HTML header string in the upper part of the HTML file
#   with itself + canonical link of devel version
# --> do this for core manual pages, addons, libpython, recursively

process_files() {
  local dir="$1"
  local prefix="$2"

  find "$dir" -type f -name '*.html' | while IFS= read -r myfile; do
    if ! grep -q 'link rel="canonical"' "$myfile"; then
      manpage="$prefix$(basename ${myfile})"
      sed -i -e "s:</head>:<link rel=\"canonical\" href=\"https\://grass.osgeo.org/grass-stable/manuals/$manpage\">\n</head>:g" ${myfile}
    fi
  done
}

cd "$TARGETHTMLDIR"
process_files "$TARGETHTMLDIR" ""
process_files "$TARGETHTMLDIR/addons" "addons/"
process_files "$TARGETHTMLDIR/libpython" "libpython/"

# SEO: inject canonical link into "devel" manual pages (grass-devel/)
# - cd back into folder of "devel" HTML manual pages
# - run sed to replace an existing HTML header string in the upper part of the HTML file
#   with itself + canonical link of grass-stable version
# --> do this for core manual pages, addons, libpython, recursively
cd "$TARGETHTMLDIRDEVEL"
process_files "$TARGETHTMLDIRDEVEL" ""
process_files "$TARGETHTMLDIRDEVEL/addons" "addons/"
process_files "$TARGETHTMLDIRDEVEL/libpython" "libpython/"
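
To verify in CI that the canonical links described above ended up in the generated pages, a check along these lines might be enough. This is only a sketch: list_missing_canonical is a hypothetical name, and it relies on grep -L, which GNU and BSD grep provide but POSIX does not require.

```shell
# Hypothetical check (not part of the server script): print the HTML files
# under the given directory that do not contain a canonical link.
list_missing_canonical() {
    # grep -L prints the names of files without any matching line.
    find "$1" -type f -name '*.html' \
        -exec grep -L 'link rel="canonical"' {} +
}
```

An empty output would mean every page under the directory already has a canonical link; a CI step could fail when the output is not empty.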

Notes:

  • MkDocs generates sitemap.xml (and sitemap.xml.gz) which contains both the main dir and addons subdir with https://grass.osgeo.org/grass-stable/manuals/ URLs and lastmod date.
  • Sphinx has an extension sphinx-sitemap.
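
To confirm that a generated sitemap.xml really covers both the main dir and the addons subdir, the URLs can be pulled out of the file with standard tools. The sketch below assumes the usual sitemap layout with one URL per loc element; the function names are hypothetical.

```shell
# Print every URL listed in <loc> elements of a sitemap file, one per line.
list_sitemap_urls() {
    grep -o '<loc>[^<]*</loc>' "$1" | sed -e 's/<loc>//' -e 's/<\/loc>//'
}

# Count the URLs in a sitemap that contain a given fixed string.
count_sitemap_urls() {
    list_sitemap_urls "$1" | grep -c -F -- "$2"
}
```

For example, count_sitemap_urls sitemap.xml /manuals/addons/ would report how many addon pages are listed.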
############################################
# create sitemaps to expand the hugo sitemap

# versioned manual:
python3 $HOME/src/grass$GMAJOR-addons/utils/create_manuals_sitemap.py --dir=/var/www/code_and_data/grass$GMAJOR$GMINOR/manuals/ --url=https://grass.osgeo.org/grass$GMAJOR$GMINOR/manuals/ -o
python3 $HOME/src/grass$GMAJOR-addons/utils/create_manuals_sitemap.py --dir=/var/www/code_and_data/grass$GMAJOR$GMINOR/manuals/addons/ --url=https://grass.osgeo.org/grass$GMAJOR$GMINOR/manuals/addons/ -o

# grass-devel manual:
python3 $HOME/src/grass$GMAJOR-addons/utils/create_manuals_sitemap.py --dir=/var/www/code_and_data/grass-devel/manuals/ --url=https://grass.osgeo.org/grass-devel/manuals/ -o
python3 $HOME/src/grass$GMAJOR-addons/utils/create_manuals_sitemap.py --dir=/var/www/code_and_data/grass-devel/manuals/addons/ --url=https://grass.osgeo.org/grass-devel/manuals/addons/ -o

Originally identified as relevant, but later deemed as out of scope:

# copy important files to web space
cp -p AUTHORS CITING CITATION.cff COPYING GPL.TXT INSTALL.md REQUIREMENTS.md $TARGETDIR/
  • not done by the script: fix colortables and other image links (I originally thought that the script does that by copying the dirs, but it does not; example: t.rast.import.netcdf. An alternative solution would be shifting the URLs during a file-combine stage in the build, like it is done for the keyword links.)
TARGETHTMLDIRDEVEL=$TARGETMAIN/grass-devel/manuals/
mkdir -p $TARGETHTMLDIRDEVEL $TARGETHTMLDIRDEVEL/addons
# cleanup from previous run
rm -rf /tmp/addons
\mv $TARGETHTMLDIRDEVEL/addons /tmp
rm -f $TARGETHTMLDIRDEVEL/*.*
(cd $TARGETHTMLDIRDEVEL ; rm -rf barscales colortables icons northarrows)
# clone manual pages
cp -rp $TARGETHTMLDIR/* $TARGETHTMLDIRDEVEL/
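
The alternative of shifting the image URLs could be sketched as below. It rests on assumptions not confirmed by the script: that addon pages end up one level below the main manual directory, and that they reference the barscales, colortables, icons and northarrows directories with relative src or href attributes. The function name shift_image_links is hypothetical.

```shell
# Hypothetical sketch: prefix relative links to the shared image
# directories with ../ so that they resolve from a page in addons/.
shift_image_links() {
    page=$1
    tmp=$(mktemp) || return 1
    # Only links that start directly with one of the directory names are
    # rewritten, so running the function twice does not add a second ../
    sed -E 's#(src|href)="(barscales|colortables|icons|northarrows)/#\1="../\2/#g' \
        "$page" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Write back through cat to keep the permissions of the original file.
    cat "$tmp" > "$page"
    rm -f "$tmp"
}
```

It could be applied to every page of the addons directory, for example with a loop over addons/*.html, before the artifact is created.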
  • addon build logs are part of the script but go to /addons/grass8/logs not to the doc (/addons/grass8/logs/)
# copy over logs from $MAINDIR/.grass$GMAJOR/addons/logs/
mkdir -p $TARGETMAIN/addons/grass$GMAJOR/logs/
cp -p $MAINDIR/.grass$GMAJOR/addons/logs/* $TARGETMAIN/addons/grass$GMAJOR/logs/
  • file modules.xml with files of compiled addons is generated (used by g.extension) and placed under /addons/grass8 not /grass-stable/manuals (addons/grass8/modules.xml).
# generate addons modules.xml file (required for g.extension module)
$SOURCE/$BRANCH/bin.$ARCH/grass --tmp-project EPSG:4326 --exec $MAINDIR/cronjobs/build-xml.py --build $MAINDIR/.grass$GMAJOR/addons
cp $MAINDIR/.grass$GMAJOR/addons/modules.xml $TARGETMAIN/addons/grass$GMAJOR/modules.xml
  • C programming doc
# generate doxygen programmers's G8 manual
cd $GRASSBUILDDIR/
#$MYMAKE htmldocs-single > /dev/null || (echo "$0 htmldocs-single: an error occurred" ; exit 1)
$MYMAKE htmldocs-single || (echo "$0 htmldocs-single: an error occurred" ; exit 1)

cd $GRASSBUILDDIR/

echo "Generating latest programmer's manual..."
# clean old TARGETPROGMAN stuff from last run
if  [ -z "$TARGETPROGMAN" ] ; then
 echo "\$TARGETPROGMAN undefined, error!"
 exit 1
fi
mkdir -p $TARGETPROGMAN
rm -f $TARGETPROGMAN/*.*

# copy over doxygen manual
cp -r html/*  $TARGETPROGMAN/

echo "Copied HTML progman to https://grass.osgeo.org/programming${GVERSION}"
# fix permissions
#chgrp -R grass $TARGETPROGMAN/*
chmod -R a+r,g+w $TARGETPROGMAN/
chmod -R a+r,g+w $TARGETPROGMAN/*
# bug in doxygen
(cd $TARGETPROGMAN/ ; ln -s index.html main.html)

Created sites or artifacts according to the script

(it lists only doc, not snapshot or other files)

The actual pages (upload is out of scope for this issue, listing URLs):

Just the artifacts:

Will not be addressed here:

  • HTML progman programming (doc, but goes to a separate dir/URL from the user doc, so to limit the scope, moving this for later)