
Using Obsidian Notes as CMS for a Website

Obsidian Notes is a relatively new player in the market, but it has gained a lot of traction due to its ease of use and powerful features. This article provides a comprehensive overview of the benefits and considerations of using Obsidian as a CMS, as well as detailed instructions on how to implement it, based on our use.

Last changed on 2024-03-30

Key Takeaways

  • Obsidian Notes is a great application for writing and managing website content.
  • It lets you maintain and store the content and metadata of large collections of items.
  • There are a variety of options for publishing Obsidian notes on a website.
  • Among them, custom Python scripts are an advantageous way to convert notes to HTML pages.

References

Video - How I use Obsidian as a CMS for my websites explained in a video.

Schematic - Presenting the different elements and workflow in a schematic.

Why Use Obsidian as a CMS?

Obsidian is a note-taking and knowledge management application and is not designed to be used as a CMS. Why would you consider maintaining website content in an Obsidian vault? Primarily because Obsidian offers a well-designed interface that lets content creators compose and organize their material efficiently.

Here are some advantages of using Obsidian:

How to Publish Web Pages?

Although Obsidian notes are Markdown files, and Markdown is a lightweight markup language related to HTML, the notes cannot be served directly on a website. The Markdown must first be converted to HTML before the content can be displayed as a web page.

These approaches won't work for us because we need full control of web server settings, which neither GitHub Pages nor the Obsidian publishing service provides.

How do We Publish Web Pages?

We use Obsidian as a CMS for these sites: kochtipps.ch, faunaflora.photography and this one, "Muuuh Nature & Wildlife".

We publish websites using Firebase Hosting which gives us control over web server settings. Instead of using a library like Jekyll to generate static files from Obsidian notes, we have written our own Python scripts. There are three steps, each one performed by a script:

collect.py and translate.py are not site specific. We use the exact same scripts for all websites that we manage using Obsidian as a CMS. build.py, on the other hand, is customized for a particular site, not only through a different set of templates but also within the script itself, which is adjusted to the needs of the website it builds.

Before diving into the details of each script, some ground rules when writing Obsidian notes:

The author should stay focused on the topic and semantic structure of the notes and not be distracted by decision making about how the content will be displayed on the final web page.

The Python Scripts

We developed Python scripts, which act as custom static page generators, to create web pages from Obsidian notes. The scripts read the content of all Markdown files within specified folders and convert it from Markdown into fully fledged, state-of-the-art static web pages.

collect.py - Reading Markdown Files

This is a simple script that uses just a few basic libraries and a few functions. The script is the same for all websites. The site-specific part is a dictionary that tells the script where on the drive to find the different collections of Markdown files to extract content from.


config = {
    'language': 'en',
    'translations': [],
    'domain': 'https://muuuh.com',
    'public': 'www',
    'images': '',
    'vault': '../Obsidian',
    'resources': [
        {'name': 'Pages', 'path': 'Websites/muuuh.com'},
        {'name': 'Photos', 'path': 'Resources/Photos'},
        {'name': 'Videos', 'path': 'Resources/Videos'}
    ]
}

Executing the script reads all Markdown files in the corresponding resource folders and stores their content temporarily in a JSON file. Even for larger sites like faunaflora.photography, the script takes no more than a few seconds.
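The collection step could be sketched roughly as follows. This is a minimal illustration and not the actual collect.py: the function names `collect_notes` and `save_collection`, the shape of the result and the file name `content.json` are assumptions. Only the `vault` and `resources` keys come from the configuration shown above.

```python
import json
from pathlib import Path


def collect_notes(config):
    """Read every Markdown file below each configured resource folder."""
    vault = Path(config['vault'])
    collected = {}
    for resource in config['resources']:
        folder = vault / resource['path']
        notes = {}
        # rglob also descends into sub-folders; sorted() keeps the output stable
        for md_file in sorted(folder.rglob('*.md')):
            notes[md_file.stem] = md_file.read_text(encoding='utf-8')
        collected[resource['name']] = notes
    return collected


def save_collection(collected, target='content.json'):
    """Store the collected content in a JSON file for the next script."""
    with open(target, 'w', encoding='utf-8') as handle:
        json.dump(collected, handle, ensure_ascii=False, indent=2)
```

Keying the notes by file name (without the extension) matches how Obsidian identifies notes, which is convenient later when links between notes have to be resolved.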

translate.py - Translate All Content (Optional)

This script is also the same for all websites, although only faunaflora.photography is published in more than one language. After all content has been collected, this script loops through all lines of content, looks up whether a line has already been translated and, if not, sends the line with instructions to ChatGPT 3.5 for translation.


import openai  # legacy interface of the openai package (versions before 1.0)


def translate_text(text, source_language, target_language):
    # The system message carries the instruction, the user message the text
    instruction = f"Translate the text from {source_language} to {target_language} but don't modify double brackets and markdown code."
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": instruction},
            {"role": "user", "content": text}
        ],
        temperature=0,
        # The character count of the input serves as the token limit
        max_tokens=len(text),
        top_p=0.5
    )
    return response['choices'][0]['message']['content']

The result of the translation is stored in a JSON file. Each original line of text is a key, and its value is a dictionary that holds the translations as language/text pairs:

{
    "**Does the Great Blue Heron have a long lifespan?**": {
        "de": "**Hat der Kanadareiher eine lange Lebensdauer?**",
        "en": "**Does the Great Blue Heron have a long lifespan?**",
        "es": "**¿La Garza Azul tiene una larga esperanza de vida?**",
        "fr": "**Le Grand Héron a-t-il une longue durée de vie?**",
        "source": ["Species.json Great Blue Heron (Ardea herodias)"],
        "date": "2024-02-11"
    },
    ...
}

Each object includes the date on which the line was last found and a list of sources indicating in which file and object the line was found.
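The look-up-before-translate logic might look like the sketch below. It is an illustration under assumptions, not the actual translate.py: the function name `translate_line` and its parameters are hypothetical, and `translator` stands for any callable taking a text and a target language, for instance a small wrapper around translate_text.

```python
from datetime import date


def translate_line(line, source, target_language, cache, translator, today=None):
    """Return the translation of a line, calling the translator only on a cache miss.

    `cache` follows the JSON structure shown above: the original line is
    the key, the value holds the translations plus 'source' and 'date'.
    """
    today = today or date.today().isoformat()
    entry = cache.setdefault(line, {})
    if target_language not in entry:
        # Cache miss: this is the only place where the translator is called
        entry[target_language] = translator(line, target_language)
    # Record where and when the line was last seen
    sources = entry.setdefault('source', [])
    if source not in sources:
        sources.append(source)
    entry['date'] = today
    return entry[target_language]
```

Because the date is refreshed on every run, entries whose date lags behind the latest run belong to lines that no longer appear in any note and could be reviewed or pruned.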

There are a few challenges regarding content translation. Consult our video "Using Obsidian as CMS - Update Translation" if you want to learn more about this topic.

build.py - Generating Static Web Pages

Finally, a customized script converts the content stored in the JSON files into HTML files. This script uses the Python markdown library with a few extensions (fenced_code, tables, attr_list) to convert Markdown to HTML, and Jinja templates to arrange the content into web pages.

The algorithm is quite simple, but there are a few challenges to be aware of. They are addressed below.

The main feature of Obsidian is the links between notes. The published web pages should be able to maintain this network of links. When a note is converted to a web page, the reference to a linked note may have changed. For example, the content of a note may be located at "/pages/gear/Using Obsidian as CMS.md," but on the website, the content of this note will be displayed at "/gear/using-obsidian-as-cms/."

We resolve this by first creating a large dictionary of all pages, called a "sitemap," where the note names, such as "Using Obsidian as CMS," serve as the keys. We then search and replace all instances of references to the notes with proper HTML links in the content.

Here is the Python code for this function:


import copy


def replace_links(sitemap_dict):
    """Replace [[Note Name]] references in every page with HTML links."""
    for values in sitemap_dict.values():
        output_content = values['output']
        for linked_key, linked in sitemap_dict.items():
            # Obsidian writes internal links as [[Note Name]]
            link_pattern = f'[[{linked_key}]]'
            link_text = linked['link']
            link_href = linked['path']
            output_content = output_content.replace(link_pattern, f'<a href="/{link_href}/">{link_text}</a>')
        # Store the result once all links of this page have been replaced
        values['output'] = output_content
    return copy.deepcopy(sitemap_dict)

References to notes that aren't published as web pages will be ignored.
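For illustration, the sitemap dictionary consumed by replace_links could be assembled along these lines. This is a sketch under assumptions: `build_sitemap`, `slugify` and the shape of the `notes` input are hypothetical, and falling back to a slug when no path is declared is our simplification for the example, not necessarily what build.py does. Only the keys 'link', 'path' and 'output' are taken from the function above.

```python
import re


def slugify(note_name):
    """Turn a note name into a lowercase, hyphen-separated ASCII slug."""
    slug = re.sub(r'[^a-z0-9]+', '-', note_name.lower())
    return slug.strip('-')


def build_sitemap(notes):
    """Map note names to the fields replace_links expects.

    `notes` is a hypothetical list of dicts with 'name', 'html' and an
    optional 'path' declared in the note's metadata.
    """
    sitemap = {}
    for note in notes:
        path = note.get('path') or slugify(note['name'])
        sitemap[note['name']] = {
            'link': note['name'],     # text shown inside the <a> tag
            'path': path.strip('/'),  # replace_links adds the slashes itself
            'output': note['html'],
        }
    return sitemap
```

With such a dictionary, the note named "Using Obsidian as CMS" and the path "gear/using-obsidian-as-cms" produce links pointing to "/gear/using-obsidian-as-cms/".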

How to Deal with Media Files Like Photos and Videos?

Although an Obsidian vault can contain media files, optimized publishing of high-resolution media requires a CDN. In Obsidian we keep ONLY notes about media files. The media files themselves are stored outside of the Obsidian vault. Editing content about a media file, or linking a media file within a web page, can then be done within Obsidian, and building the final web page converts the reference to the media file into the correct HTML snippet.

The script we use for this task has been adjusted to deal with high-resolution photos. The resulting HTML code is customized for the image rendering and CDN service (IMGIX) we are using.

How to Deal with Structured Data?

One set of structured data covers the attributes of a web page, such as the content of the head meta tags (language, title, description, canonical URL, etc.). Declaring these parameters in front-matter is the obvious choice, but we initially took another approach because front-matter came with some disadvantages.

After this initial reluctance, we now rely heavily on front-matter, partly because Obsidian has made improvements that mitigate some of the issues. We still use our initial approach as well; both are described below.

Web Page Meta Data: Front-Matter Approach

For notes which convert into web pages, the front-matter area can look like this:


---
path: /gear/using-obsidian-as-cms/
template: PageEN
title: Using Obsidian as CMS
description: How to implement Obsidian as content management system for Websites.
tags: obsidian,cms
photo: "khm-20240204-1309-0000197446"
---

None of the variables is mandatory. If a variable is declared multiple times in different sections of a note, the front-matter value overrides the others. If a variable is not declared in front-matter, its value is obtained by other means. For example:

etc.

This setup gave us a smooth transition from the Meta section described below. Not all notes had to migrate to front-matter at once, and if a variable is missing for some reason, there are alternative ways to obtain a reasonable value.

Web Page Meta Data: Meta Section Approach

Our implementation keeps this data in the published note itself, in a section that starts with an h3 heading named "Meta" followed by a list of items. In each item, the parameter name comes before the colon and its value after the colon.

For this page:


### Meta
- website: muuuh.com
- path: gear/using-obsidian-as-cms
- title: Using Obsidian as CMS
- description: How to implement Obsidian as content management system for Websites.
- author: Karl-Heinz Müller
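Reading such a Meta section and combining it with front-matter values could be sketched as follows. The names `parse_meta_section` and `merge_meta` are hypothetical, and the front-matter is assumed to be available already as a dictionary. The only rule taken from the text above is that front-matter values win over Meta section values.

```python
import re

# A list item of the form "- key: value"
META_ITEM = re.compile(r'^-\s*(\w+):\s*(.*)$')


def parse_meta_section(markdown_text):
    """Collect the 'key: value' list items that follow a heading named Meta."""
    meta = {}
    in_meta = False
    for line in markdown_text.splitlines():
        stripped = line.strip()
        if stripped.startswith('#'):
            # Any heading either opens or closes the Meta section
            in_meta = stripped.lstrip('#').strip().lower() == 'meta'
            continue
        match = META_ITEM.match(stripped) if in_meta else None
        if match:
            meta[match.group(1)] = match.group(2).strip()
    return meta


def merge_meta(front_matter, meta_section):
    """Front-matter values take precedence over Meta section values."""
    return {**meta_section, **front_matter}
```

Merging the two dictionaries in this order is what allows notes to migrate gradually: a note that still has only a Meta section keeps working, and any value added to front-matter later simply overrides it.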

Collections

Another set of structured data is the content itself or a section of a website. Our website faunaflora.photography, for example, publishes information about 150+ species, 500+ photos, 300+ videos and 1000+ log entries about in-field species identifications and observations. We store the data for each item in a separate note that follows a predetermined structure of headings, lists and text.

A Python script converts these notes into pandas data frames and then converts the data into web pages. The structure of this type of note is rigid but still leaves enough flexibility to extend the information about an item.

As an example, here is the Markdown of a photo note, a close-up of an American Bullfrog (Rana Catesbeiana):


## khm-20190624-1612-0000104643

![American Bull Frog](https://simaecnet.imgix.net/photos/khm-20190624-1612-0000104643.jpg?w=1200&h=400&fit=crop&auto=format,compress&crop=entropy "American Bull Frog")

![American Bull Frog](https://simaecnet.imgix.net/photos/khm-20190624-1612-0000104643.jpg?w=200&h=200&fit=crop&auto=format,compress&crop=entropy "American Bull Frog")


### Title

American Bull Frog (Rana Catesbeiana)

### Caption

American Bull Frog (Rana Catesbeiana) - Close Up

### Meta

- camera: NIKON D500
- lens: 90mm f/2.8
- exposure_time: 1/1000
- f_number: f/3.3
- focal_length: 90mm
- iso: 200
- topic: species
- species: American Bullfrog (Rana Catesbeiana)
- location: Parc Bernard-Landry
- city: LAVAL
- state: QC
- country: CA
- width: 4219
- height: 2808
- presentation: entropy
- tags: amphibians, frogs

The note contains two Markdown lines for the image. They use two different layouts, which lets us verify that the image rendering (the presentation parameter) is correct for the photo. They also let us view the photo while writing the title and caption.

The Meta section contains structured data about the photo. Most of the content in this section has been extracted from the EXIF data of the photo.
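A first step towards tabular data is splitting such a note into its sections. The sketch below is an assumption about how this could be done with the standard library only; `split_sections` is a hypothetical name and this is not the actual conversion script.

```python
def split_sections(markdown_text):
    """Return a dict mapping each h3 heading to the text below it."""
    sections = {}
    current = None
    for line in markdown_text.splitlines():
        if line.startswith('### '):
            # A new h3 heading starts a new section
            current = line[4:].strip()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    # Join the lines of each section and drop surrounding blank lines
    return {name: '\n'.join(lines).strip() for name, lines in sections.items()}
```

Applied to the photo note above, this yields the keys "Title", "Caption" and "Meta". One such dictionary per note gives a list of records, a shape that pandas.DataFrame accepts directly.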

How to Deal with User Interactive Features?

Using Obsidian as a CMS doesn't by itself provide interactive features such as forms, searches or content updates on page load via an API. We therefore enhance the websites whose content is managed in Obsidian with interactive UI elements built with Svelte. Over time we have built a small library of such components, like:

How to Deal with Special Styling Instructions?

As mentioned before, when writing notes, we aim to avoid distraction with custom styling features beyond those already provided by basic Markdown. However, we are addressing a few items that are handy to have available.

How to Deal with Specially Formatted Text Blocks?

We sometimes want to highlight text paragraphs. For the conversion of Markdown to HTML, you can associate a CSS class with a paragraph using native HTML syntax. For example:


<p class="warning">Text paragraph associated with a css class</p>

Oooh, I don't like this approach. Alternatively, you can attach a CSS class to a paragraph using the attribute list syntax:


This is the text of a paragraph which can span one or more lines.
{: .myclass}

This is how it may then look with some CSS statements applied to ".myclass", which is named ".enhance" in this example.

On a side note, you can use a similar approach if you want to include a link to an external resource opening in a new tab. In this case you will add the instruction {: target=_blank} immediately after the closing ")" of the link markdown.

How to Deal with Multi-Column Design?

If all content of a note is converted into a page as is, the content is shown as a single column. This is acceptable on a small device like a phone but not that appealing on a desktop. We use two different strategies to display the content in multiple columns.

The simple one uses the horizontal line Markdown "---" to split the content of a section into two or more columns.


### H3 markdown always initiates a section

This content will end up in the first column, a div with the class "column" within a div with the class "columns"

---

This content will end up in the second column, a div with the class "column"

To display the content in columns, we then apply CSS display:grid or display:flex.

The second approach involves the templates. Some sets of notes have a more rigid selection of section titles, like parks, which always have sections named "How to get there" and "What to see here". As the content of each section is stored in a dictionary where the key is the section name, the template can inject the corresponding content into a more complex HTML structure that allows a multi-column display with grid or flex. Here is a Jinja HTML snippet dealing with such an object:
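The splitting itself can be sketched with the standard library. The function names `split_columns` and `wrap_columns` are hypothetical; only the class names "columns" and "column" come from the example above. Each column is assumed to be converted from Markdown to HTML between the two steps.

```python
import re


def split_columns(section_markdown):
    """Split a section's Markdown into columns at lines consisting of '---'."""
    parts = re.split(r'^[ \t]*---[ \t]*$', section_markdown, flags=re.MULTILINE)
    return [part.strip() for part in parts if part.strip()]


def wrap_columns(column_html):
    """Wrap already converted column HTML in the 'columns'/'column' divs."""
    inner = '\n'.join(f'<div class="column">{html}</div>' for html in column_html)
    return f'<div class="columns">\n{inner}\n</div>'
```

A section without any "---" line simply yields a single column, so the same code path can serve all sections.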


<div class="wrapper">
    <div class="leftcolumn">
    {% if page.what_to_see_here_html|length > 0 %}
        <h3>What to See Here</h3>
        {% for item in page.what_to_see_here_html %}
            {{item}}
        {% endfor %}
    {%- endif %}
    </div>
    <div class="rightcolumn">
    ...
    </div>
</div>

Again, with CSS statements for grid or flex, the display on desktop can be arranged accordingly.

Future Additions and Remarks

How can I refer to images or videos within a note? - A long article without illustrating photos can look monotonous. This has been resolved. See Identifying Whales in Saguenay St. Lawrence Marine Park.

Images are referenced with the Markdown syntax for media files. When the note is converted to HTML, a function replaces the resulting HTML snippet for the image with a proper figure/picture tag for responsive design.

How can I refer to an SVG graph in a document? - Occasionally, we would like to illustrate a topic with a simple graph generated with an external tool and exported as an SVG file.

Work in progress.


Glossary

Front-matter is a section at the top of a Markdown file written in YAML ("YAML is a human-friendly data serialization language for all programming languages."), opened by three dashes and separated from the rest of the content by another three dashes.