My Heading

from IPython.display import HTML,Markdown,display
from mistletoe import markdown

from fastcore.test import test_eq

Highlight


def Highlight(
    match
):

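Span-level token for highlighted text written as ==text==, rendered as <mark>text</mark>. The documentation inherited from the base class follows.

Base token class.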

Token has two subclasses:

  • block_token.BlockToken, for all block-level tokens. A block-level token is text which occupies the entire horizontal width of the “page” and is offset from the surrounding sibling blocks by line breaks.

  • span_token.SpanToken, for all span-level (or inline-level) tokens. A span-level token appears inside the flow of the text lines without any surrounding line break.

Custom __repr__ methods in subclasses: The default __repr__ implementation outputs the number of child tokens (from the attribute children) if applicable, and the content attribute if applicable. If any additional attributes should be included in the __repr__ output, this can be specified by setting the class attribute repr_attributes to a tuple containing the attribute names to be output.


Superscript


def Superscript(
    match
):

Span-level token for superscript text written as ^text^, rendered as <sup>text</sup>. A marker followed by whitespace, as in E=mc^ 2^, is left unchanged. It inherits the base token documentation shown under Highlight above.


Subscript


def Subscript(
    match
):

Span-level token for subscript text written as ~text~, rendered as <sub>text</sub>. A marker followed by whitespace, as in H~ 2~O, is left unchanged. It inherits the base token documentation shown under Highlight above.


Emoji


def Emoji(
    match
):

Span-level token for emoji shortcodes such as :smile:, rendered as the corresponding emoji character. Unknown names (:unknown:) and names containing spaces (:sm ile:) are left unchanged. It inherits the base token documentation shown under Highlight above.


FootnoteRef


def FootnoteRef(
    match
):

Span-level token for a footnote reference such as [^1], rendered as a superscript link to the matching footnote entry (href="#fn-1", id="fnref-1"). It inherits the base token documentation shown under Highlight above.


FootnoteEntry


def FootnoteEntry(
    match
):

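Block-level token for a footnote definition such as [^1]: First note. The documentation inherited from the base class follows.

Base class for block-level tokens. Recursively parse inner tokens.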

Naming conventions:

* lines denotes a list of (possibly unparsed) input lines, and is
  commonly used as the argument name for constructors.

* BlockToken.children is a list with all the inner tokens (thus if
  a token has children attribute, it is not a leaf node; if a token
  calls span_token.tokenize_inner, it is the boundary between
  span-level tokens and block-level tokens);

* BlockToken.start takes a line from the document as argument, and
  returns a boolean representing whether that line marks the start
  of the current token. Every subclass of BlockToken must define a
  start function (see block_tokenizer.tokenize).

* BlockToken.read takes the rest of the lines in the document as an
  iterator (including the start line), and consumes all the lines
  that should be read into this token.

  Defaults to stopping at an empty line.

  Note that BlockToken.read does not have to return a list of lines.
  Because the return value of this function will be directly
  passed into the token constructor, we can return any relevant
  parsing information, sometimes even ready-made tokens,
  into the constructor. See block_tokenizer.tokenize.

  If BlockToken.read returns None, the read result is ignored,
  but the token class is responsible for resetting the iterator
  to a previous state. See block_tokenizer.FileWrapper.get_pos,
  block_tokenizer.FileWrapper.set_pos.

Attributes:

  children (list): inner tokens.
  line_number (int): starting line (1-based).


Strikethrough


def Strikethrough(
    match
):

Span-level token for struck-through text written as ~~text~~, rendered as <del>text</del>. It can be nested inside other inline markup, for example **~~nested~~**. It inherits the base token documentation shown under Highlight above.


FencedDiv


def FencedDiv(
    result
):

Block-level token for a fenced div: an opening line of colons followed by an attribute block, such as ::: {.callout-note}, then Markdown content, then a closing line of colons. It renders as a <div> carrying the given id, classes, and attributes, and a longer fence (::::) can enclose shorter ones (:::) to nest divs. It inherits the BlockToken documentation shown under FootnoteEntry above.


opening_tag


def opening_tag(
    line
):

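Return the tag name and attribute dict of the opening HTML tag on a line, for example ('div', {'class': 'x'}) for <div class="x">. Self-closing tags and plain text give (None, {}).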


TagExtractor


def TagExtractor(
    
):

Find tags and other markup and call handler functions.

Usage:

    p = HTMLParser()
    p.feed(data)
    ...
    p.close()

Start tags are handled by calling self.handle_starttag() or self.handle_startendtag(); end tags by self.handle_endtag(). The data between tags is passed from the parser to the derived class by calling self.handle_data() with the data as argument (the data may be split up in arbitrary chunks). If convert_charrefs is True the character references are converted automatically to the corresponding Unicode character (and self.handle_data() is no longer split in chunks), otherwise they are passed by calling self.handle_entityref() or self.handle_charref() with the string containing respectively the named or numeric reference as the argument.

test_eq(opening_tag('<div>'), ('div', {}))
test_eq(opening_tag('<div class="x">'), ('div', {'class':'x'}))
test_eq(opening_tag('<br/>'), (None, {}))
test_eq(opening_tag('<img src="a.png"/>'), (None, {}))
test_eq(opening_tag('plain text'), (None, {}))
test_eq(opening_tag('<svg xmlns="http://www.w3.org/2000/svg">'), ('svg', {'xmlns':'http://www.w3.org/2000/svg'}))
test_eq(opening_tag('<div markdown="1">'), ('div', {'markdown':'1'}))
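The behaviour pinned down by these tests can be reproduced with a small html.parser.HTMLParser subclass from the standard library. This is only a sketch under that assumption: FirstTag and first_opening_tag are hypothetical names, and the module's own TagExtractor and opening_tag may be written differently.

```python
from html.parser import HTMLParser


class FirstTag(HTMLParser):
    """Record the first ordinary start tag; ignore self-closing tags."""

    def __init__(self):
        super().__init__()
        self.tag, self.attrs = None, {}

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        if self.tag is None:
            self.tag, self.attrs = tag, dict(attrs)

    def handle_startendtag(self, tag, attrs):
        # Tags such as <br/> arrive here; overriding with a no-op means
        # they are never recorded as an opening tag.
        pass


def first_opening_tag(line):
    parser = FirstTag()
    parser.feed(line)
    parser.close()
    return parser.tag, parser.attrs


print(first_opening_tag('<div class="x">'))
print(first_opening_tag('<br/>'))
```

Overriding handle_startendtag is the design choice that separates <div> from <br/>: by default HTMLParser forwards self-closing tags to handle_starttag as well.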

LenientHtmlBlock


def LenientHtmlBlock(
    result
):

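Block-level HTML token. This is a leaf block token with a single child of type span_token.RawText, which holds the raw HTML content.

Judging by the tests at the end of this page, the lenient variant keeps blank lines and indented lines inside the element as part of the HTML block instead of turning them into code blocks, and parses the element's content as Markdown when the opening tag carries markdown="1".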


ExtendedHtmlRenderer


def ExtendedHtmlRenderer(
    *args, **kw
):

HTML renderer class.

See mistletoe.base_renderer module for more info.

def render_md(md): return HTML(markdown(md, ExtendedHtmlRenderer))
render_md("Here's a sentence with a footnote[^1] and another[^2].\n\n[^1]: First note.\n[^2]: Second note.")

Here's a sentence with a footnote[1] and another[2].

1 First note.
2 Second note.

render_md("~~strikethrough~~ and **bold ~~strikethrough~~**")

strikethrough and bold strikethrough

render_md("Check out https://fast.ai and http://example.com for more info!")

Check out https://fast.ai and http://example.com for more info!

render_md("Here's a sentence with a link to <http://www.example.org>.")

Here's a sentence with a link to http://www.example.org.

render_md("""Here's some code:

```
https://fast.ai
~~strikethrough~~
:smile:
H~2~O
```

And outside the block: https://fast.ai ~~strikethrough~~ :smile: H~2~O""")

Here's some code:

https://fast.ai
~~strikethrough~~
:smile:
H~2~O

And outside the block: https://fast.ai strikethrough 😊 H2O

render_md("H~2~O and E=mc^2^")

H2O and E=mc2

test_md2 = """
- [x] Completed task
- [ ] Incomplete task
- Regular item"""
render_md(test_md2)

  • Completed task
  • Incomplete task
  • Regular item

render_md("==highlighted== and :smile: :rocket: :heart:")

highlighted and 😊 🚀 ❤️


parse_attrs


def parse_attrs(
    text
):

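Convert an attribute block in braces into an HTML attribute string with a leading space: #name becomes the id, .name entries are joined into a single class attribute, and key="value" pairs are passed through.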

parse_attrs('{#my-id .class1 .class2 width="50%" height="200 px"}')
' id="my-id" class="class1 class2" width="50%" height="200 px"'
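One way to get this output is a single regular expression with one alternative per token kind. The sketch below is an assumption about the approach, not the module's parse_attrs: attrs_to_html and _ATTR_TOKEN are hypothetical names, and values are copied without HTML escaping.

```python
import re

# One alternative per token kind: #id, .class, or key="value".
# Quoted values may contain spaces, as in height="200 px".
_ATTR_TOKEN = re.compile(
    r'#(?P<id>[\w-]+)|\.(?P<cls>[\w-]+)|(?P<key>[\w-]+)="(?P<val>[^"]*)"'
)


def attrs_to_html(text):
    """Turn '{#a .b k="v"}' into ' id="a" class="b" k="v"' (no escaping)."""
    ident, classes, pairs = None, [], []
    for m in _ATTR_TOKEN.finditer(text.strip().strip('{}')):
        if m['id']:
            ident = m['id']
        elif m['cls']:
            classes.append(m['cls'])
        else:
            pairs.append((m['key'], m['val']))

    parts = []
    if ident:
        parts.append(f'id="{ident}"')
    if classes:
        joined = ' '.join(classes)
        parts.append(f'class="{joined}"')
    parts.extend(f'{k}="{v}"' for k, v in pairs)
    # A leading space on each part lets the result be appended directly to a tag name.
    return ''.join(f' {p}' for p in parts)


print(attrs_to_html('{#my-id .class1 .class2 width="50%" height="200 px"}'))
# prints:  id="my-id" class="class1 class2" width="50%" height="200 px"
```

Emitting id first and merging all classes into one attribute keeps the output stable regardless of the order in which the tokens were written.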

ExtendedHtmlRenderer.render_heading


def render_heading(
    token
):

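Render a heading, moving a trailing attribute block such as {#intro .important} from the heading text onto the heading tag. An empty {} is left in the text.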

<style>
.important { background-color: yellow; font-weight: bold; }
</style>
render_md("# My Heading {#intro .important}")

My Heading

render_md('This [link](http://www.example.org){target="_blank"} opens in a new tab.')

This link opens in a new tab.

<style>
[class*="callout-"] { border-left: 4px solid var(--clr); background: var(--bg);
  padding: 0.8em 1em; border-radius: 4px; margin: 0.5em 0; }
.callout-note      { --clr: #4a9eff; --bg: #f0f7ff; }
.callout-warning   { --clr: #f0ad4e; --bg: #fff8f0; }
.callout-tip       { --clr: #5cb85c; --bg: #f0fff0; }
.callout-important { --clr: #d9534f; --bg: #fff0f0; }
.columns { display: flex; gap: 1em; }
.column { flex: 1; }
</style>
render_md("""::: {.callout-warning .prose}
This is a **note** with *formatting*.

- Item 1
- Item 2
:::""")

This is a note with formatting.

  • Item 1

  • Item 2

render_md(""":::: {.columns}
::: {.column}
**Left** column
:::
::: {.column}
**Right** column
:::
::::""")

Left column

Right column

Tests

from mistletoe import Document

def rend(c): return ExtendedHtmlRenderer().render(Document(c))
def test_render(a,b): return test_eq(rend(a), b)
def test_render_p(a,b): return test_render(a, f'<p>{b}</p>\n')
test_render_p(':sm ile:', ':sm ile:')
test_render_p('.', '.')
test_render_p(':unknown:', ':unknown:')
test_render_p(':smile:', '😊')
test_render_p('H~ 2~O', 'H~ 2~O')
test_render_p('H~2~O', 'H<sub>2</sub>O')
test_render_p('E=mc^ 2^', 'E=mc^ 2^')
test_render_p('E=mc^2^', 'E=mc<sup>2</sup>')
test_render_p('~~no space~~', '<del>no space</del>')
test_render_p('**~~nested~~**', '<strong><del>nested</del></strong>')
test_render_p('==~~double~~==', '<mark><del>double</del></mark>')
html = rend('```\nhttps://fast.ai\n```')
assert 'href' not in html and 'https://fast.ai' in html
html = rend('https://fast.ai')
assert '<a href="https://fast.ai">https://fast.ai</a>' in html
html = rend('[link](https://fast.ai)')
assert html.count('href') == 1
test_render_p('[^1]', '<sup><a href="#fn-1" id="fnref-1">[1]</a></sup>')
html = rend('- [x] done\n- [ ] todo\n- regular')
assert 'checked' in html and html.count('checkbox') == 2
test_render_p('[link](http://example.com){target="_blank"}', '<a href="http://example.com" target="_blank">link</a>')
test_render_p('[link](http://example.com){#my-id .cls}', '<a href="http://example.com" id="my-id" class="cls">link</a>')
test_render('# Heading {#intro}', '<h1 id="intro">Heading</h1>\n')
test_render('## Test {.important}', '<h2 class="important">Test</h2>\n')
test_render('### Multi {.c1 .c2}', '<h3 class="c1 c2">Multi</h3>\n')
html = rend('# Full {#id1 .cls data-level="1"}')
assert 'id="id1"' in html and 'class="cls"' in html and 'data-level="1"' in html
test_render('# Plain', '<h1>Plain</h1>\n')
test_render('## Empty {}', '<h2>Empty {}</h2>\n')
html = rend('# Order {.first #myid .second key="val"}')
assert 'id="myid"' in html and 'class="first second"' in html and 'key="val"' in html
test_render('# Spaces {data-value="hello world"}', '<h1 data-value="hello world">Spaces</h1>\n')

test_render('<details markdown="1">\n<summary>test</summary>\n\n```json\n{"a": 1}\n```\n</details>',
    '<details markdown="1">\n<summary>test</summary><pre><code class="language-json">{"a": 1}\n</code></pre></details>\n\n')
html = rend('::: {.callout-note}\nHello **world**\n:::')
assert '<div class="callout-note">' in html and '<strong>world</strong>' in html
html = rend('::: {#box .a .b}\nInner\n:::')
assert 'id="box"' in html and 'class="a b"' in html and 'Inner' in html
html = rend(':::: {.columns}\n::: {.column}\nLeft\n:::\n::: {.column}\nRight\n:::\n::::')
assert html.count('class="column"') == 2 and 'class="columns"' in html
html = rend('<div>\n\n    indented after blank\n</div>')
assert '<pre>' not in html and 'indented after blank' in html
html = rend('<div>\n\n<p>child</p>\n\n    <p>indented child</p>\n</div>')
assert '<pre>' not in html and 'indented child' in html
html = rend('<svg width="10" height="10">\n    <circle cx="5" cy="5" r="4"/>\n</svg>')
assert '<pre>' not in html and '<circle' in html
html = rend('Before\n\n<div>\n\n    indented\n</div>\n\nAfter')
assert '<pre>' not in html and 'Before' in html and 'After' in html
html = rend('<details markdown="1">\n<summary>test</summary>\n\n```json\n{"a": 1}\n```\n</details>')
assert '<pre><code class="language-json">' in html and '<details markdown="1">' in html and '</details>' in html
html = rend('<div markdown="1">\n\n**bold** and *italic*\n</div>')
assert '<strong>bold</strong>' in html and '<em>italic</em>' in html
html = rend('<div>\n\n**not bold**\n</div>')
assert '<strong>' not in html and '**not bold**' in html
test_eq(rend("`<details>`"), '<p><code>&lt;details&gt;</code></p>\n')