Simple Markdown to HTML Parser
This challenge asks you to build a basic Markdown parser in JavaScript. Markdown is a lightweight markup language that allows you to format plain text using a simple syntax. Converting Markdown to HTML is a common task in web development, enabling rich text editing and display.
Problem Description
Your task is to create a JavaScript function, parseMarkdown, that takes a string containing Markdown syntax as input and returns a string representing the equivalent HTML. This parser should handle a limited set of Markdown features:
- Headings: Markdown uses `#` for headings.
  - `# Heading 1` should become `<h1>Heading 1</h1>`
  - `## Heading 2` should become `<h2>Heading 2</h2>`
  - And so on, up to `###### Heading 6`.
- Bold Text: Markdown uses `**` to denote bold text. `This is **bold** text.` should become `This is <strong>bold</strong> text.`
- Italic Text: Markdown uses `*` to denote italic text. `This is *italic* text.` should become `This is <em>italic</em> text.`
- Paragraphs: Plain text lines that are not recognized as other Markdown elements should be wrapped in `<p>` tags. Blank lines should separate paragraphs.
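The heading rule above can be sketched with a single regular expression. This is only one possible approach: `headingToHtml` is a hypothetical helper name, and treating a line without a space after the `#` characters (or without any heading text) as a non-heading is an assumption, not something the challenge mandates.

```javascript
// Hypothetical helper: returns heading HTML, or null if the line is not a heading.
function headingToHtml(line) {
  // One to six '#' characters, at least one whitespace character, then the heading text.
  const match = /^(#{1,6})\s+(.+)$/.exec(line);
  if (match === null) return null;
  const level = match[1].length; // the number of '#' characters gives the heading level
  return `<h${level}>${match[2].trim()}</h${level}>`;
}

console.log(headingToHtml('# Heading 1'));      // <h1>Heading 1</h1>
console.log(headingToHtml('###### Heading 6')); // <h6>Heading 6</h6>
console.log(headingToHtml('Plain text'));       // null
```

Returning `null` for non-headings lets the caller fall back to paragraph handling.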
Key Requirements:
- The function should accept a single string argument representing the Markdown input.
- The function should return a single string representing the generated HTML output.
- The parser should process the Markdown line by line.
- The order of operations matters: you should handle bold and italic text within lines before wrapping lines in paragraph tags.
Expected Behavior:
The parser should accurately translate the specified Markdown elements into their corresponding HTML tags.
Edge Cases to Consider:
- Empty input string.
- Input with only whitespace.
- Lines that start with Markdown syntax but don't conform perfectly (e.g., `#` followed by a space but no text, or `#` with no space after it).
- Nested bold/italic (for this basic parser, assume no deep nesting, but handle cases like `***bold and italic***`).
- Markdown syntax appearing within code blocks (a more advanced case; for this basic challenge, you can assume no code blocks).
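The empty and whitespace-only cases can be handled before any Markdown rule runs. The sketch below uses a hypothetical helper, `nonBlankLines`, and assumes that such input should produce no output and that leading and trailing whitespace on a line can be ignored; the challenge does not state either point explicitly.

```javascript
// Hypothetical helper: split the input into lines and drop the blank ones.
function nonBlankLines(markdown) {
  return markdown
    .split(/\r?\n/)                 // accept both LF and CRLF line endings
    .map((line) => line.trim())     // assumption: surrounding whitespace is not significant
    .filter((line) => line !== ''); // blank lines only separate blocks
}

console.log(JSON.stringify(nonBlankLines('')));          // []
console.log(JSON.stringify(nonBlankLines('   \n\t\n'))); // []
console.log(JSON.stringify(nonBlankLines('a\n\nb')));    // ["a","b"]
```

With this in place, a parser that maps each remaining line to an HTML element naturally returns an empty result for empty input.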
Examples
Example 1:
Input:
# This is a Heading
This is a **bold** paragraph.
This is an *italic* paragraph.
Output:
<h1>This is a Heading</h1>
<p>This is a <strong>bold</strong> paragraph.</p>
<p>This is an <em>italic</em> paragraph.</p>
Explanation: The first line is a level 1 heading. The subsequent lines are treated as paragraphs, with bold and italic syntax correctly translated.
Example 2:
Input:
## Another Heading
Some text here.
More text with **bold** and *italic*.
A final line.

A new paragraph.
Output:
<h2>Another Heading</h2>
<p>Some text here.</p>
<p>More text with <strong>bold</strong> and <em>italic</em>.</p>
<p>A final line.</p>
<p>A new paragraph.</p>
Explanation: Handles multiple paragraphs and different heading levels. Notice how the blank line correctly separates the last paragraph of the first block from the new paragraph.
Example 3:
Input:
***Bold and Italic***
# Heading with **bold**
Output:
<p><em><strong>Bold and Italic</strong></em></p>
<h1>Heading with <strong>bold</strong></h1>
Explanation: Demonstrates that `***` should be interpreted as both bold and italic. Depending on the order in which the rules are applied, the result is either `<em><strong>...</strong></em>` or `<strong><em>...</em></strong>`; the former is a reasonable outcome for this basic implementation. Also shows that heading processing takes precedence over general paragraph formatting, and that inline formatting still applies inside a heading.
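One way to get the `<em><strong>` nesting is to replace the longest marker first. The sketch below uses a hypothetical helper, `formatInline`, and assumes markers are balanced and not deeply nested. If the `**` rule ran before the `***` rule, it could consume part of a `***` run and leave mis-nested tags.

```javascript
// Hypothetical helper: convert inline emphasis markers to HTML tags.
// Order matters: the longest marker is handled first.
function formatInline(text) {
  return text
    .replace(/\*\*\*(.+?)\*\*\*/g, '<em><strong>$1</strong></em>') // ***bold and italic***
    .replace(/\*\*(.+?)\*\*/g, '<strong>$1</strong>')               // **bold**
    .replace(/\*(.+?)\*/g, '<em>$1</em>');                          // *italic*
}

console.log(formatInline('***Bold and Italic***'));
// <em><strong>Bold and Italic</strong></em>
console.log(formatInline('More text with **bold** and *italic*.'));
// More text with <strong>bold</strong> and <em>italic</em>.
```

The lazy quantifier `.+?` keeps each match as short as possible, so two bold spans on one line are converted separately.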
Constraints
- The input Markdown string will not exceed 10,000 characters.
- The input will be a valid JavaScript string.
- The output should be a valid HTML string.
- Performance is not a critical constraint for this challenge, but the solution should be reasonably efficient.
- Focus on correctly implementing the specified Markdown features. Do not implement other Markdown features like lists, links, images, or code blocks.
Notes
- Consider using regular expressions to identify and replace Markdown syntax.
- Think about how to process the input string efficiently, perhaps by splitting it into lines.
- The order of operations is important. For example, you should apply bold and italic transformations before wrapping lines in paragraph tags.
- For the `***` case, consider which tag should be on the outside. A common interpretation is to apply both, producing nested tags such as `<em><strong>...</strong></em>` or `<strong><em>...</em></strong>`.
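Putting these notes together, a minimal sketch of `parseMarkdown` might look like the following. It is one possible solution rather than a reference implementation. The inner helper `formatInline`, the trimming of each line, the skipping of blank lines, and joining the output with newlines are all assumptions chosen to match the examples.

```javascript
// Minimal sketch: headings, bold, italic, and one <p> per remaining non-blank line.
function parseMarkdown(markdown) {
  // Inline pass, longest marker first so that *** is not split by the ** rule.
  const formatInline = (text) =>
    text
      .replace(/\*\*\*(.+?)\*\*\*/g, '<em><strong>$1</strong></em>')
      .replace(/\*\*(.+?)\*\*/g, '<strong>$1</strong>')
      .replace(/\*(.+?)\*/g, '<em>$1</em>');

  const html = [];
  for (const rawLine of markdown.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === '') continue; // blank lines only separate blocks

    // Headings take precedence over paragraphs.
    const heading = /^(#{1,6})\s+(.+)$/.exec(line);
    if (heading !== null) {
      const level = heading[1].length;
      html.push(`<h${level}>${formatInline(heading[2])}</h${level}>`);
    } else {
      // Inline formatting is applied before the paragraph wrapper is added.
      html.push(`<p>${formatInline(line)}</p>`);
    }
  }
  return html.join('\n'); // assumption: one HTML element per output line
}

console.log(parseMarkdown('# This is a Heading\nThis is a **bold** paragraph.'));
// <h1>This is a Heading</h1>
// <p>This is a <strong>bold</strong> paragraph.</p>
```

Because every line is handled independently, non-conforming heading lines such as `#NoSpace` simply fall through to the paragraph branch.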