How to convert encoded Markdown content back to HTML using Python?

Published: March 18, 2024

Updated: March 18, 2024

Tags: Python;


Introduction

For web security, Markdown renderers can escape potentially risky characters such as "<", ">" and "&" into HTML entities (&lt;, &gt;, &amp;). This ensures that these symbols are displayed literally on a web page instead of being interpreted as HTML markup.
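As a rough illustration of this escaping step, the standard library's html.escape() performs the same kind of substitution. This is only a sketch: the renderer that produced your content may escape a different set of characters, and the sample string below is made up for the example.

```python
import html

# Hypothetical sample input, not taken from a real Markdown renderer.
raw = "Here is some <strong>content</strong> & more"

# html.escape() replaces &, < and > with entities; with the default
# quote=True it also replaces double and single quotes.
escaped = html.escape(raw)
print(escaped)
# Here is some &lt;strong&gt;content&lt;/strong&gt; &amp; more
```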

Here is an example of such escaped content:

content = """&lt;!DOCTYPE html&gt;
&lt;html&gt;
&lt;body&gt;

Here is some &lt;strong&gt;content&lt;/strong&gt; displayed after applying markdown!

&lt;/body&gt;
&lt;/html&gt;
"""

Our objective is to convert this content back into HTML.

Using the html Python module

To achieve this, a simple solution is the html module from the Python standard library: its html.unescape() function converts HTML entities back to their corresponding characters. How it works:

  • takes a string containing HTML entities as input.
  • identifies entities within the string (e.g., &lt; for "<", &amp; for "&").
  • replaces them with their corresponding characters according to HTML standards.
  • returns the unescaped string containing the original characters.
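The steps above can be tried on a few small strings. This minimal sketch uses made-up inputs (not the article's content) to show that html.unescape() decodes named, decimal and hexadecimal character references alike, and leaves text without entities untouched.

```python
import html

# Named, decimal and hexadecimal references decode to the same characters.
named = html.unescape("&lt;p&gt;")
decimal = html.unescape("&#60;p&#62;")
hexadecimal = html.unescape("&#x3C;p&#x3E;")
print(named, decimal, hexadecimal)  # <p> <p> <p>

# A string without entities is returned unchanged.
plain = html.unescape("plain text")
print(plain)  # plain text
```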

Then we can run:

import html

print(html.unescape(content))

The output will be:

<!DOCTYPE html>
<html>
<body>

Here is some <strong>content</strong> displayed after applying markdown!

</body>
</html>
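One case worth checking is content that was escaped more than once (for example "&amp;lt;" instead of "&lt;"): a single html.unescape() call removes only one layer. The sketch below handles that with a hypothetical helper, unescape_fully, which is not part of the html module; the pass limit and the sample string are assumptions made for illustration.

```python
import html


def unescape_fully(text: str, max_passes: int = 5) -> str:
    """Apply html.unescape() until the text stops changing (hypothetical helper)."""
    for _ in range(max_passes):
        decoded = html.unescape(text)
        if decoded == text:
            break  # nothing left to decode
        text = decoded
    return text


# Made-up sample: the tags were escaped twice.
double_escaped = "&amp;lt;strong&amp;gt;content&amp;lt;/strong&amp;gt;"

once = html.unescape(double_escaped)
fully = unescape_fully(double_escaped)
print(once)   # &lt;strong&gt;content&lt;/strong&gt;
print(fully)  # <strong>content</strong>
```

Repeated unescaping also decodes entities that were meant to stay literal, so it is probably best used only when the content is known to be double-escaped.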
