How to Port Your Medium Articles to Personal Blog with a Simple Bash Script

‘Quick and Dirty’ Blogging Automation

Medium is a great publishing platform. It has good exposure, quality content, readers who really appreciate good articles, and a neat, easy-to-use UI. It’s especially great for writers who are just starting their journey.

As good as it is, having your own blog outside of Medium is still a good idea. It gives you another channel, one you fully own, to communicate with your readers. And no company lasts forever: if Medium were acquired by another company, or something even worse happened, you could still sleep well at night knowing you won’t lose all your articles.

I built my own using Pelican, a Python-based static site generator. I wrote an article explaining the whole process. For every Medium article, I need to copy the URL, run a command to convert it into a Markdown file, then generate the blog site using Pelican. It is simple, but not as simple as I would like it to be. So this is a great opportunity for a quick and dirty Bash script to come to the rescue. Let’s see what we can do.

Structure the Script

Before writing the script, it helps to outline what we want to accomplish; that makes it easier to write quality code. Basically, we need to:

  1. Put all article URLs into one text file manually (I plan to automate this part too in the future, maybe with a scraping framework).

  2. Read every line of the file and, for each URL, convert the article into a Markdown file.

  3. Extract the title and subtitle from the Markdown file.

  4. Use the title and subtitle to create meta-data needed for Pelican to turn the Markdown file into a post.

  5. Run Pelican command to generate the static site.

  6. Push the site to GitHub and trigger Netlify’s auto-build

  7. Profit.

Let’s Write the Code

Photo by Shahadat Rahman on Unsplash

First of all, define our variables:

    #!/bin/bash 
    # Define variables
    filename='articles.txt'
    n=1

Then structure the loop to read every line of the text file:

    # Read the file and process each URL
    while IFS= read -r line; do
        n=$((n+1))   # line counter
        slug=$(echo "$line" | sed 's/https:\/\/towardsdatascience.com\///')  # get slug from URL
        FILE="$HOME/wayofnumbers.github.io/content/$slug.md"   # generate Markdown file name from slug
        mediumexporter "$line" > "$FILE"   # convert Medium article to Markdown file
        # some processing ...
    done < "$filename"

We use the sed command to remove the first part of the URL, https://towardsdatascience.com/, so the rest can be used as our slug. For example, https://towardsdatascience.com/9-things-i-learned-from-blogging-on-medium-for-the-first-month-2bace214b814 turns into 9-things-i-learned-from-blogging-on-medium-for-the-first-month-2bace214b814, perfect for a slug. Here we also use the slug to create the filename for the Markdown file. Then we use mediumexporter to convert the article at that URL into a Markdown file. You can find out more about mediumexporter here.
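
The same prefix removal can also be done without spawning sed, using Bash parameter expansion. Here is a minimal sketch, assuming every URL starts with the same https://towardsdatascience.com/ prefix. The helper name url_to_slug is made up for illustration and is not part of the original script; stripping a trailing query string is an extra assumption about what the URLs may contain:

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): derive a slug from an article URL.
url_to_slug() {
    local url=$1
    # "${var#pattern}" removes the shortest matching prefix.
    local slug=${url#https://towardsdatascience.com/}
    # Assumption: drop a query string such as "?source=..." if one is present.
    slug=${slug%%\?*}
    printf '%s\n' "$slug"
}

url_to_slug 'https://towardsdatascience.com/9-things-i-learned-from-blogging-on-medium-for-the-first-month-2bace214b814'
```

Inside the loop, this could replace the echo and sed pipeline with slug=$(url_to_slug "$line").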

Now that we have the Markdown file, let’s fill in the processing code we want:

        # Processing the Markdown file
        tail -n +2 "$FILE" > "$FILE.tmp" && mv "$FILE.tmp" "$FILE"  # remove the first line
        fl=$(head -n 1 "$FILE")  # put the new first line (title) into fl
        firstline=$(echo "$fl" | sed 's/# //')  # remove the leading '# '
        tail -n +3 "$FILE" > "$FILE.tmp" && mv "$FILE.tmp" "$FILE"  # remove the first two lines
        subtitle=$(head -n 1 "$FILE")  # put the new first line (subtitle) into subtitle
        tail -n +2 "$FILE" > "$FILE.tmp" && mv "$FILE.tmp" "$FILE"  # remove the subtitle line
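
To see what these steps do, here is a small sketch that runs the same extraction on a made-up sample file. The sample layout (a first line that gets discarded, then "# Title", a blank line, and the subtitle) is an assumption inferred from the tail and head offsets in the processing code, not documented mediumexporter output. It reads the lines by number with sed instead of rewriting the file three times:

```shell
#!/bin/bash
# Made-up sample mimicking the layout the processing steps assume:
# line 1 is discarded, line 2 is "# Title", line 3 is blank, line 4 is the subtitle.
sample=$(mktemp)
printf '%s\n' '' '# My Title' '' 'My subtitle' '' 'Body text.' > "$sample"

title=$(sed -n '2{s/^# //;p;}' "$sample")   # line 2, with the leading "# " removed
subtitle=$(sed -n '4p' "$sample")           # line 4
body=$(tail -n +5 "$sample")                # everything after the subtitle

echo "title:    $title"
echo "subtitle: $subtitle"
rm -f "$sample"
```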

The processing code is rather self-explanatory. Now that the firstline variable holds the title and the subtitle variable holds the subtitle, we are ready to construct the Markdown file meta-data for Pelican:

    # handle metadata for Pelican  
    meta="
    ---
    title: "$firstline"
    slug: $slug
    description: "$subtitle"
    date: $(date)
    categories: Machine Learning
    tags: Machine Learning, Artificial Intelligence
    
    
    ---
    "

You can refer to Pelican’s documentation here for more information about the meta-data format. Simply put, the body of the Markdown file doesn’t need to repeat the title and subtitle: as long as we specify the title and the subtitle (stored here in the description field) in the meta-data, Pelican will automatically render them in the post, styled according to the theme you choose.
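
One way to build the same block without nested double quotes is a here-document. This is only a sketch: the function name make_meta is hypothetical, and the fixed date format is an assumption chosen to look like the dates in Pelican’s documentation examples, whereas the original script uses the default output of date:

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): print the meta-data block
# for one post. Arguments: title, slug, subtitle.
make_meta() {
    local title=$1 slug=$2 subtitle=$3
    # An unquoted here-document expands variables and command substitutions,
    # and avoids the nested double quotes of a plain string assignment.
    cat <<EOF
---
title: $title
slug: $slug
description: $subtitle
date: $(date '+%Y-%m-%d %H:%M')
categories: Machine Learning
tags: Machine Learning, Artificial Intelligence
---

EOF
}

make_meta 'My Title' 'my-title-123' 'My subtitle'
```

In the script, meta=$(make_meta "$firstline" "$slug" "$subtitle") would capture the block. Command substitution strips trailing newlines, so use echo "$meta" rather than echo -n "$meta" when writing it out.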

With the correct meta-data, now we can finally update the Markdown and get it ready for site generation:

    { echo -n "$meta"; cat "$FILE"; } > "$FILE.new"  # stitch meta-data and article content together
    mv "$FILE"{.new,}
    head -n -8 "$FILE" > "$FILE.new"  # remove Medium's recommended articles (negative count needs GNU head)
    mv "$FILE"{.new,}
    done < "$filename"  # don't forget to close the loop
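
A negative line count for head is a GNU extension, so the trimming step may fail on systems that ship BSD tools, such as macOS. A portable sketch, with a hypothetical helper name drop_last_lines that is not part of the original script, counts the lines first and then keeps only the leading part:

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): print a file without its
# last N lines, using only options available in both GNU and BSD tools.
drop_last_lines() {
    local file=$1 count=$2
    local total keep
    total=$(wc -l < "$file")      # number of lines in the file
    keep=$(( total - count ))     # how many lines to keep from the top
    if (( keep > 0 )); then
        head -n "$keep" "$file"
    fi
}

# Demo on a made-up ten-line file: dropping the last 8 lines leaves the first 2.
demo=$(mktemp)
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 > "$demo"
drop_last_lines "$demo" 8
rm -f "$demo"
```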

All my Medium articles end with several recommendations for further reading. I removed those for my blog (the head -n -8 line above). Now that the Markdown file is ready, it’s time to generate the site and push it to the server:

    # push to server
    cd "$HOME/wayofnumbers.github.io" || exit 1  # stop if the repo directory is missing
    pelican content -s publishconf.py 
    git add .
    git commit -m "fix"
    git push origin dev
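
If the script is re-run when nothing has changed, git commit exits with an error because there is nothing to commit. A small guard can skip the commit in that case. This is a sketch only; the function name commit_if_changed is hypothetical and not part of the original script:

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): stage everything and
# commit only when something is actually staged.
# Returns 0 if a commit was made, 1 if there was nothing to commit.
commit_if_changed() {
    local message=$1
    git add .
    # "git diff --cached --quiet" exits 0 when the index matches HEAD.
    if git diff --cached --quiet; then
        echo "No changes to publish."
        return 1
    fi
    git commit -q -m "$message"
}
```

In the publishing step it could be used as commit_if_changed "fix" && git push origin dev, so the push only happens after a new commit.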

Conclusion

So there you go. This script is written for the Pelican static site generator, but the gist of it can be applied to other blogging platforms. I hope you learned a thing or two. Happy blogging and coding!
