Don't serialize an extra LF in <pre>, <textarea>, <listing>

zcorpan · domenic · commit 2aa0000433f8 · 2016-09-24T10:47:14.000+01:00
Serializing the extra LF was implemented in Presto in ~2012 and recently in Gecko, but it broke CKEditor (http://dev.ckeditor.com/ticket/14814#ticket), so it is being backed out again in Gecko. Fixes whatwg#944.
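
A rough sketch of the round-trip behavior this change settles on, for illustration only. The helpers `parsePreContent`, `serializePre`, and `roundTrip` are hypothetical simplified models, not DOM or parser APIs; they ignore escaping and assume the text contains no markup-significant characters.

```javascript
// Hypothetical model of the parser rule: a single LF immediately after
// the <pre> start tag is dropped.
function parsePreContent(innerMarkup) {
  return innerMarkup.startsWith("\n") ? innerMarkup.slice(1) : innerMarkup;
}

// Hypothetical model of serializing a <pre> element whose only child is text.
// extraLF = true models the step removed by this commit (emit an extra LF
// when the text starts with LF); extraLF = false models the new behavior.
function serializePre(text, extraLF) {
  const prefix = extraLF && text.startsWith("\n") ? "\n" : "";
  return "<pre>" + prefix + text + "</pre>";
}

// Serialize, then reparse, returning the resulting child text content.
function roundTrip(text, extraLF) {
  const markup = serializePre(text, extraLF);
  const inner = markup.slice("<pre>".length, -"</pre>".length);
  return parsePreContent(inner);
}

// Markup from the spec example: "<pre>", LF, LF, "Hello.", "</pre>".
const parsed = parsePreContent("\n\nHello.");
const afterNewRoundTrip = roundTrip(parsed, false);
const afterOldRoundTrip = roundTrip(parsed, true);

console.log(JSON.stringify(parsed));            // "\nHello."
console.log(JSON.stringify(afterNewRoundTrip)); // "Hello."
console.log(JSON.stringify(afterOldRoundTrip)); // "\nHello."
```

In this model the removed step would have preserved the leading newline across a serialize-reparse cycle; without it, the newline is lost, which is the historical behavior the new note documents.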
diff --git a/source b/source
@@ -57812,7 +57812,7 @@ o............A....e
    data-x="attr-script-src">src</code> content attribute, and the <span>Should element's inline
    behavior be blocked by Content Security Policy?</span> algorithm returns "<code
    data-x="">Blocked</code>" when executed upon the <code>script</code> element, "<code
-   data-x="">script</code>", and the <code>script</code> element's <code>child text content</code>,
+   data-x="">script</code>", and the <code>script</code> element's <span>child text content</span>,
    then abort these steps. The script is not executed. <ref spec="CSP"></p></li>
 
    <li id="script-processing-for">
@@ -108553,11 +108553,6 @@ document.body.appendChild(text);
         <!-- also, i guess: image, but we don't list it because we don't consider it an "element",
         more a "macro", and thus we should never serialize it -->
 
-        <p>If <var>current node</var> is a <code>pre</code>, <code>textarea</code>, or
-        <code>listing</code> element, and the first child node of the element, if any, is a
-        <code>Text</code> node whose character data has as its first character a U+000A LINE FEED
-        (LF) character, then append a U+000A LINE FEED (LF) character.</p>
-
         <p>Append the value of running the <span>HTML fragment serialization algorithm</span> on the
         <var>current node</var> element (thus recursing into this algorithm for that
         element), followed by a U+003C LESS-THAN SIGN character (&lt;), a U+002F SOLIDUS character
@@ -108638,7 +108633,7 @@ document.body.appendChild(text);
   <p class="warning">It is possible that the output of this algorithm, if parsed with an <span>HTML
   parser</span>, will not return the original tree structure. Tree structures that do not roundtrip
   a serialize and reparse step can also be produced by the <span>HTML parser</span> itself, although
-  such cases are non-conforming.</p>
+  such cases are typically non-conforming.</p>
 
   <div class="example">
 
@@ -108707,6 +108702,26 @@ document.body.appendChild(text);
 
   </div>
 
+  <p>For historical reasons, this algorithm does not round-trip an initial U+000A LINE FEED (LF)
+  character in <code>pre</code>, <code>textarea</code>, or <code>listing</code> elements, even
+  though (in the first two cases) the markup being round-tripped can be conforming. The <span>HTML
+  parser</span> will drop such a character during parsing, but this algorithm does <em>not</em>
+  serialize an extra U+000A LINE FEED (LF) character.</p>
+  <!-- https://github.com/whatwg/html/issues/944 -->
+
+  <div class="example">
+   <p>For example, consider the following markup:</p>
+
+   <pre>&lt;pre>
+
+Hello.&lt;/pre></pre>
+
+   <p>When this document is first parsed, the <code>pre</code> element's <span>child text
+   content</span> starts with a single newline character. After a serialize-reparse roundtrip, the
+   <code>pre</code> element's <span>child text content</span> is simply "<code
+   data-x="">Hello.</code>".</p>
+  </div>
+
   <p><dfn id="escapingString">Escaping a string</dfn> (for the purposes of the algorithm above)
   consists of running the following steps:</p>