Creating a Pelican coding blog

This is a walk-through of how to set up a static blog with Pelican, a Python package, so that you can showcase your Jupyter notebooks. Several websites cover parts of this process, but none of them comprehensively in my view.

This blog assumes that you have:

  • Installed the latest version of Python from the Anaconda Distribution.
  • An understanding of how to use environments in conda.
  • Basic knowledge of using a Linux terminal.

Creating a conda environment for Pelican and installing packages

  • In your Linux Terminal, create an environment called pelican. This Python environment is just for compiling the blog (not for data science or programming).
conda create -n pelican
  • Activate your environment.
source activate pelican
  • Add conda-forge as an additional channel for installing Python packages.
conda config --append channels conda-forge
  • Add anaconda as an additional channel for installing Python packages.
conda config --append channels anaconda
  • Install the Python packages listed in the table below.
conda install <pkg1> <pkg2> <pkg3>
conda install pelican jupyter ipython nbconvert markdown beautifulsoup4 ghp-import invoke pygments typogrify

Python packages for static webpage blog:

Package | Description
--- | ---
pelican | Creates static html webpages
jupyter | Interactive notebooks supporting multiple programming languages
ipython | Interactive Python
nbconvert | Converts Jupyter (IPython) notebooks to html
markdown | Processes markdown pages
beautifulsoup4 | Parses and scrapes webpages
ghp-import | Automates GitHub Pages publishing
invoke | Shell-script task automation
pygments | Enhances Python code highlighting in Markdown
typogrify | Makes typographical improvements

Reinstating the blog after a Crostini powerwash

We will need to create a folder for your blog, and install Pelican plugins and themes as submodules. Plugins allow your blog to have additional functionality that is not part of the core Pelican platform. Themes allow us to customize what our blog looks like.

  • Create a github directory in your home directory.
mkdir github
  • Go into your github directory.
cd github
  • Clone your GitHub blog contents.
git clone https://github.com/randlow/blog.git blog
  • Go into your blog directory.
cd blog
  • In your blog directory, we will be adding several git submodules. Submodules allow us to include packages from other GitHub repos that are required for our blog. We add the following submodules:
    • Pelican themes.
    • Pelican plugins.
    • The pelican-ipynb plugin (for inserting Jupyter notebooks).
    • The GitHub Pages repository that will hold the compiled site, mounted as the output folder.
git submodule add -f https://github.com/getpelican/pelican-themes.git theme
git submodule add -f https://github.com/getpelican/pelican-plugins.git plugins
git submodule add -f https://github.com/danielfrg/pelican-ipynb.git plugins/ipynb
git submodule add -f https://github.com/randlow/randlow.github.io.git output
  • We recursively update all submodules from GitHub.
git submodule update --init --recursive
  • Configure your git identity.
git config --global user.name 'USERNAME'
git config --global user.email 'YOUR E-MAIL'
  • Sometimes a git submodule ends up in a broken state. In such cases, perform the following:
    • Remove all the submodule directories using rm -rf <DIRNAME>.
    • Remove all references to the submodule in the .gitmodules file.
    • Remove all references to the submodule in the .git/config file.
    • Run git rm --cached path_to_submodule
    • Run rm -rf .git/modules/path_to_submodule
    • Stage, commit, and push all changes up to GitHub.
    • Add all submodules back and update.

Local site generation (Draft)

  • From the blog directory, run the pelican command to compile the content folder and generate all the html files in your output folder.
pelican content -s pelicanconf.py
  • Go into your output folder.
cd output
  • Start a Python webserver.
python -m http.server
  • View your blog page in your browser (i.e., Chrome) by typing localhost:8000 in the URL bar.

Committing to GitHub

There are two repositories that you are pushing to in GitHub:

  • <username>/blog.git - This is the repository for the written content of your Pelican articles (i.e., markdown files and Jupyter notebooks).
  • <username>/<username>.github.io - This is the repository with the static html files for the blog webpage.

Committing to GitHub using SSH

The default setup for connecting to GitHub is HTTPS, where you need to type in your username and password each time you perform a git push origin master. Using SSH allows you to git push origin master to your repository without keying in the username/password combination, but it requires a more involved setup.

To compare SSH and HTTPS, please see this comparison article. Although GitHub recommends HTTPS, SSH can be a more efficient workflow for publishing your blog in the long term.

Therefore, if you're short-term-hardworking-long-term-lazy like me and would still like to proceed with SSH, make sure you read the GitHub instructions for SSH in detail. The initial process of setting up SSH is onerous but can be worth the effort in the long term.

Using ssh-agent to cache your SSH passphrase

The GitHub instructions for SSH will require you to key in your SSH key passphrase each time you perform a git push origin master. To cache the passphrase, you need to use ssh-agent.

  • Type in ssh-agent.
$ ssh-agent
SSH_AUTH_SOCK=/tmp/ssh-fPiBTULh3K8i/agent.12276; export SSH_AUTH_SOCK;
SSH_AGENT_PID=12277; export SSH_AGENT_PID;
echo Agent pid 12277;
  • Copy/paste the first two lines of the output above into your shell.
$ SSH_AUTH_SOCK=/tmp/ssh-fPiBTULh3K8i/agent.12276; export SSH_AUTH_SOCK;
$ SSH_AGENT_PID=12277; export SSH_AGENT_PID;
  • Type in ssh-add -k ~/.ssh/id_rsa. This adds the key to the running ssh-agent, which caches the decrypted key so you are not asked for the passphrase again; ~/.ssh/id_rsa specifies which SSH key to add.
$ ssh-add -k ~/.ssh/id_rsa
Enter passphrase for /home/<USERNAME>/.ssh/id_rsa: 
Identity added: /home/<USERNAME>/.ssh/id_rsa (/home/<USERNAME>/.ssh/id_rsa)

Since ssh-agent has now cached the key, you won't be asked for the passphrase each time you perform a git push origin master to GitHub.

External site generation (Publishing)

  • Make sure you are in your blog directory.
  • Run pelican content -o output -s publishconf.py. This will compile the files in your content folder into the output folder using the additional publication settings in publishconf.py. publishconf.py contains production settings such as Google Analytics and feed generation.
  • Go into your output folder.

Using invoke for blog automation

The Python package invoke allows you to automate certain tasks. You will have a file called tasks.py in your blog directory. tasks.py contains Python functions that automate the following processes:

  • Internal single preview: Compiling your blog once and previewing it on a local webserver.
  • Internal re-generating preview: Automatically re-compiling your blog whenever changes are detected, so you can preview them immediately on a local webserver.
  • External publishing: Compiling the blog with publication settings and uploading it to GitHub Pages.

The task snippets shown below assume the small tasks.py header sketched next.
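This is a minimal, assumed sketch of that header, not the author's exact file: the CONFIG dictionary and its deploy_path key follow the convention of Pelican's standard tasks.py template, and the 'output' value is an assumption that should match your blog's output folder.

# Minimal tasks.py header (a sketch; values assumed)
import os

from invoke import task

CONFIG = {
    'deploy_path': 'output',  # folder that holds the compiled static html
}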

Internal single preview automation

  • In the blog folder, we run this command to generate the required files in the output folder using the pelican configuration (pelicanconf.py).
$ pelican -s pelicanconf.py
  • In the output folder, we run this command to start a Python webserver.
$ python -m http.server
  • View the webpage in your browser by typing in localhost:8000 in the URL bar.
  • We can automate this code with invoke by having the Python function preview in tasks.py. CONFIG['deploy_path'] should be set to the output folder.
@task
def preview(c):
    c.run('pelican -s pelicanconf.py') # compiling with pelicanconf.py settings
    os.chdir(CONFIG['deploy_path']) # changing to the output directory
    c.run('python -m http.server') # starting the Python webserver
  • We execute the preview task.
$ invoke preview

Internal re-generating preview automation

Open up two terminal windows and run the following commands, one in each window.

  • pelican -r -s pelicanconf.py. Terminal window 1 runs the pelican command with the regeneration flag (-r), so it keeps re-compiling the static html output pages each time it detects a change in any of the subfolders and files of your blog folder.
  • invoke preview. Terminal window 2 compiles the html output pages once and then starts the local Python webserver so you can see all changes to the blog in your browser at localhost:8000.
  • Make sure that pelicanconf.py has LOAD_CONTENT_CACHE = False so that the website is not loaded from cache. If the website loads from cache, you may not see the regenerated webpage created from changes made to the blog.

Any change to any of your blog pages will trigger a regeneration of the static html pages, as shown in Terminal window 1, while Terminal window 2 keeps a Python webserver running so you can view your blog in your browser at localhost:8000. Thus, after each change you make to your blog, just refresh the page in your browser and you will see the change. If you prefer a single command, the sketch below wraps this two-terminal workflow in an invoke task.
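This is a minimal sketch of a hypothetical regenerate task (not part of the standard tasks.py): it starts pelican in regeneration mode as a background process and then serves the output folder, so invoke regenerate replaces the two terminal windows. The CONFIG value is assumed to match the tasks.py header sketched earlier.

# Hypothetical `regenerate` task -- a sketch, values assumed
import os
import subprocess

from invoke import task

CONFIG = {'deploy_path': 'output'}  # as in the tasks.py header above

@task
def regenerate(c):
    # start pelican with the regeneration flag (-r) in the background
    pelican = subprocess.Popen(['pelican', '-r', '-s', 'pelicanconf.py'])
    try:
        os.chdir(CONFIG['deploy_path'])  # move into the output folder
        c.run('python -m http.server')   # serve the site; blocks until Ctrl+C
    finally:
        pelican.terminate()              # stop the background pelican watcher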

External Publishing

To automate external publication, you must first enable Committing to GitHub using SSH (see above).

  • Once we've reviewed and accepted the webpage contents, we re-generate the Pelican content with publishconf.py to apply the webpage publication settings (i.e., webpage analytics).
$ pelican -s publishconf.py
  • The next steps commit all the written files in blog.
  • Add all updated files in the working directory into the Git staging area.
$ git add .
  • Commit all files in the Git staging area to the Git repo.
$ git commit -m "auto commit from invoke"
  • Push all files in the Git repo to your GitHub repository (https://github.com/USERNAME/BLOG_REPONAME.git).
$ git push origin master
  • Commit the Pelican-generated static html files in output (run these commands from the output folder).
  • Add all updated files in the working directory into the Git staging area.
$ git add .
  • Commit all files in the Git staging area to the Git repo.
$ git commit -m "auto commit from invoke"
  • Push all files in the Git repo to your GitHub repository (https://github.com/USERNAME/OUTPUT_REPONAME.git).
$ git push origin master

Automating publication

We can automate the external publishing process using invoke by having the following publish function in tasks.py.

@task
def publish(c):
    # re-generate the site with the publication settings (publishconf.py)
    c.run('pelican -s publishconf.py')

    # Commit the written content to GitHub
    c.run('git add .')
    c.run('git commit -m "auto-commit from Invoke(blog)"')
    c.run('git push origin master')

    # Commit the static html pages to GitHub Pages
    os.chdir(CONFIG['deploy_path'])
    c.run('git add .')
    c.run('git commit -m "auto-commit from Invoke(blog)"')
    c.run('git push origin master')
  • We execute the publish task.
invoke publish

Editing pelicanconf.py and publishconf.py

There are many settings in pelicanconf.py, so I recommend reading through the Pelican Settings documentation directly. publishconf.py is the configuration file for the static html webpage, with additional options used when publishing to a website (i.e., GitHub Pages). I personally found the configuration settings quite unwieldy, and one of the reasons why I shifted to `using Nikola as a coding blog`_ was that the settings are better documented there.

.. _using Nikola as a coding blog: /posts/other/create-nikola-coding-blog
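As a starting point, here is a minimal sketch of the pelicanconf.py settings touched on in this post. The setting names (THEME, PLUGIN_PATHS, PLUGINS, MARKUP, IGNORE_FILES, LOAD_CONTENT_CACHE) are standard Pelican and pelican-ipynb options, but the specific values are assumptions that must match your own folders and theme choice.

# pelicanconf.py -- a minimal sketch; values are assumptions, not the author's exact file
AUTHOR = 'YOUR NAME'
SITENAME = 'My coding blog'
SITEURL = ''                        # left empty for local previews

PATH = 'content'                    # where your markdown files and notebooks live

THEME = 'theme/pelican-bootstrap3'  # any theme from the pelican-themes submodule

PLUGIN_PATHS = ['plugins']          # the pelican-plugins submodule
PLUGINS = ['ipynb.markup']          # pelican-ipynb: treat .ipynb files as content
MARKUP = ('md', 'ipynb')            # process markdown and Jupyter notebooks
IGNORE_FILES = ['.ipynb_checkpoints']

LOAD_CONTENT_CACHE = False          # avoid stale pages when regenerating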
