Commit Graph

9 Commits

Author SHA1 Message Date
Shawn Rutledge
51a348b2e2 Markdown writer: omit space after opening code block fence
The CommonMark spec shows that it's not necessary to have a space
between the code fence and the language string:
https://spec.commonmark.org/0.29/#example-112
This also avoids a needless trailing space after a code fence that
does not include a language string.

Change-Id: I2addd38a196045a7442150760b73269bfe4ffb22
Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
2020-04-20 21:08:32 +02:00
Shawn Rutledge
0fcd782bd3 Markdown writer: don't wrap code block, and detect its end
The end of a code block nested in a list item is now detected;
and if the text of the list item continues after the code block,
it continues to be indented.

Code blocks should never be word-wrapped.

Fixes: QTBUG-80603
Change-Id: I4427f8b1d4807d819616f5cb971e2d006170d9be
Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
2020-04-20 21:08:23 +02:00
Giuseppe D'Angelo
2d265dce58 Markdown importer: properly set hyperlinks
The "title" in markdown is the tooltip, not the name attribute of
a link. Also, tell the char format that it's an anchor.

Change-Id: I2978848ec6705fe16376d6fe17f31007cce4b801
Reviewed-by: Shawn Rutledge <shawn.rutledge@qt.io>
2020-02-03 14:58:06 +01:00
Shawn Rutledge
b3cc9403c4 QTextMarkdownWriter: write fenced code blocks with language declaration
MD4C now makes it possible to detect indented and fenced code blocks:
https://github.com/mity/md4c/issues/81
Fenced code blocks have the advantages of being easier to write by hand,
and having an "info string" following the opening fence, which is commonly
used to declare the language.

Also, the HTML parser now recognizes tags of the form
<pre class="language-foo">
which is one convention for declaring the programming language
(as opposed to human language, for which the lang attribute would be used):
https://stackoverflow.com/questions/5134242/semantics-standards-and-using-the-lang-attribute-for-source-code-in-markup
So it's possible to read HTML and write markdown without losing this information.

It's also possible to read markdown with any type of code block:
fenced with ``` or ~~~, or indented, and rewrite it the same way.

Change-Id: I33c2bf7d7b66c8f3ba5bdd41ab32572f09349c47
Reviewed-by: Gatis Paeglis <gatis.paeglis@qt.io>
2019-06-04 08:01:34 +02:00
Shawn Rutledge
280d679c55 QTextMarkdownWriter: fix some bad cases with word wrap
If any non-breakable content (such as a link) already went past
80 columns, or if a word ended on column 80, it didn't wrap the rest of
the paragraph following.

Change-Id: I27dc0474f18892c34ee2514ea6d5070dae29424f
Reviewed-by: Gatis Paeglis <gatis.paeglis@qt.io>
2019-05-24 15:37:05 +02:00
Shawn Rutledge
7224d0e427 QTextMarkdownImporter: don't keep heading level on following list item
When reading a document like

 # heading
 - list item

and then re-writing it, it turned into

 # heading
 - # list item

because QTextCursor::insertList() simply calls QTextCursor::insertBlock(), thus
inheriting block format from the previous block, without an opportunity to
explicitly define the block format.  So be more consistent: use
QTextMarkdownImporter::insertBlock() for blocks inside list items too.  Now it
fully defines blockFormat first, then inserts the block, and then adds it to
the current list only when the "paragraph" is actually the list item's text
(but not when it's a continuation paragraph).  Also, be prepared for applying
and removing block markers to arbitrary blocks, just in case (they might be
useful for block quotes, for example).

Change-Id: I391820af9b65e75abce12abab45d2477c49c86ac
Reviewed-by: Gatis Paeglis <gatis.paeglis@qt.io>
2019-05-24 15:37:05 +02:00
Shawn Rutledge
7dd71e8125 Markdown: blockquotes, code blocks, and generalized nesting
Can now detect nested quotes and code blocks inside quotes, and can
rewrite the markdown too.

QTextHtmlParser sets hard-coded left and right margins, so we need to do
the same to be able to read HTML and write markdown, or vice-versa,
and to ensure that all views (QTextEdit, QTextBrowser, QML Text etc.)
will render it with margins.  But now we add a semantic memory too:
BlockQuoteLevel is similar to HeadingLevel, which was added in
310daae539 to preserve H1..H6 heading
levels, because detecting it via font size didn't make sense in
QTextMarkdownWriter.  Likewise detecting quote level by its margins
didn't make sense; markdown supports nesting quotes; and indenting
nested quotes via 40 pixels may be a bit too much, so we should consider
it subject to change (and perhaps be able to change it via CSS later on).
Since we're adding BlockQuoteLevel and depending on it in QTextMarkdownWriter,
it's necessary to set it in QTextHtmlParser to enable HTML->markdown
conversion.  (But so far, nested blockquotes in HTML are not supported.)

Quotes (and nested quotes) can contain indented code blocks, but it seems
the reverse is not true (according to https://spec.commonmark.org/0.29/#example-201 )

Quotes can contain fenced code blocks.

Quotes can contain lists.  Nested lists can be interrupted with
nested code blocks and nested quotes.

So far the writer assumes all code blocks are the indented type.
It will be necessary to add another attribute to remember whether the
code block is indented or fenced (assuming that's necessary).
Fenced code blocks would work better for writing inside block quotes
and list items because the fence is less ambiguous than the indent.

Postponing cursor->insertBlock() as long as possible helps with nesting.
cursor->insertBlock() needs to be done "just in time" before inserting
text that will go in the block.  The block and char formats aren't
necessarily known until that time.  When a nested block (such as a
nested quote) ends, the context reverts to the previous block format,
which then needs to be re-determined and set before we insert text
into the outer block; but if no text will be inserted, no new block
is necessary.  But we can't use QTextBlockFormat itself as storage,
because for some reason bullets become very "sticky" and it becomes
impossible to have plain continuation paragraphs inside list items:
they all get bullets.  Somehow QTextBlockFormat remembers, if we copy it.
But we can create a new one each time and it's OK.

Change-Id: Icd0529eb90d2b6a3cb57f0104bf78a7be81ede52
Reviewed-by: Gatis Paeglis <gatis.paeglis@qt.io>
2019-05-08 20:28:53 +00:00
Shawn Rutledge
040dd7fe26 Markdown: fix several issues with lists and continuations
Importer fixes:
- the first list item after a heading doesn't keep the heading font
- the first text fragment after a bullet is the bullet text, not a
  separate paragraph
- detect continuation lines and append to the list item text
- detect continuation paragraphs and indent them properly
- indent nested list items properly
- add a test for QTextMarkdownImporter
Writer fixes:
- after bullet items, continuation lines and paragraphs are indented
- indentation of continuations isn't affected by checkboxes
- add extra newlines between list items in "loose" lists
- avoid writing triple newlines
- enhance the test for QTextMarkdownWriter

Change-Id: Ib1dda514832f6dc0cdad177aa9a423a7038ac8c6
Reviewed-by: Gatis Paeglis <gatis.paeglis@qt.io>
2019-05-08 20:28:28 +00:00
Shawn Rutledge
23c2da3cc2 Add QTextMarkdownWriter, QTextEdit::markdown property etc.
A QTextDocument can now be written out in Markdown format.

- Add the QTextMarkdownWriter as a private class for now
- Add QTextDocument::toMarkdown()
- QTextDocumentWriter uses QTextMarkdownWriter if setFormat("markdown")
  is called or if the file suffix is .md or .mkd
- Add QTextEdit::toMarkdown() and the markdown property

[ChangeLog][QtGui][Text] Markdown (CommonMark or GitHub dialect) is now
a supported format for reading into and writing from QTextDocument.

Change-Id: I663a77017fac7ae1b3f9a400f5cd357bb40750af
Reviewed-by: Gatis Paeglis <gatis.paeglis@qt.io>
2019-05-01 14:31:27 +00:00