  1. # Session Start: Wed May 30 00:00:00 2007
  2. # Session Ident: #html-wg
  3. # [00:02] * Joins: loic (loic@90.29.174.204)
  4. # [00:16] <karl> http://textplain.blogspot.com/2007/05/small-html-documents.html
  5. # [00:16] <karl> An exercise in reading HTML specifications
  6. # [00:17] <karl> "The HTML5 spec is easier to read because it is designed with audience in mind. But at the end of the day, it's still a spec. Spec aren't meant to be read by humans. They're meant to be read by pedantic assholes (who yell at people trying to interpret the spec) and angelic humanitarians (who interpret specs and translate for the rest of us)."
  7. # [00:17] * Joins: DanC_lap (connolly@128.30.52.30)
  8. # [00:17] <zcorpan_> karl: yeah, saw that earlier
  9. # [00:17] <karl> http://www.bluishcoder.co.nz/2007/05/support-for-html-video-element-in.html
  10. # [00:17] <karl> Support for HTML Video element in Firefox
  11. # [00:17] <karl> "I briefly mentioned in a previous post that I was working on implementing this tag natively in Firefox. The intent is to display Ogg Theora encoded video without needing any plugins, similar to the recent demonstration of Theora playback using a video element by Opera. Opera have a post about it on their labs page."
  12. # [00:17] <gavin_> those groups sound like they were taken from Mark Pilgrim's post from a while back
  13. # [00:18] <karl> http://www.broken-links.com/2007/05/29/mozilla-gets-native-video-support/
  14. # [00:18] <karl> Mozilla gets native video support
  15. # [00:18] <gavin_> http://diveintomark.org/archives/2004/08/16/specs
  16. # [00:18] <karl> "While I think this is great news and has a lot of potential, I foresee one major obstacle to this becoming standard: proprietary codecs. If they decide to implement it, Apple will want Quicktime in Safari, and Microsoft will want Windows Media Video in Internet Explorer."
  17. # [00:19] <zcorpan_> karl: have you become an aggregator? :)
  18. # [00:19] <karl> sort of ;)
  19. # [00:19] <karl> zcorpan_: just reading the links of the morning
  20. # [00:19] <zcorpan_> ok
  21. # [00:19] <karl> and I have to leave in a few minutes to take the train :)
  22. # [00:20] <karl> for my happy 1h30 commuting reading Rainer Maria Rilke.
  23. # [00:20] <mjs> Apple will want MPEG-4, not QuickTime, and MP4 is not proprietary (though it is subject to patents)
  24. # [00:21] * DanC_lap tries to remember how MP4 and H.623 are related... or is it H.263...
  25. # [00:22] <karl> http://www.apple.com/quicktime/technologies/h264/
  26. # [00:22] <Dashiva> mp4 is a container, h.263 is a codec, h.264 is a newer codec?
  27. # [00:22] <karl> http://en.wikipedia.org/wiki/H.264
  28. # [00:22] <karl> "video compression. Also known as MPEG-4 Part 10, or AVC (for Advanced Video Coding)."
  29. # [00:22] <mjs> MPEG-4 is a family of formats, including a generic container format and a number of codecs
  30. # [00:23] <karl> http://www.brucelawson.co.uk/index.php/2007/html5-microformats-accessibility-testing/
  31. # [00:23] <karl> HTML 5, microformats and testing accessibility
  32. # [00:24] <DanC_lap> H.264 is what I meant... when people say mp4, do they usually mean with h.264?
  33. # [00:24] * DanC_lap wanders off again...
  34. # [00:24] <Dashiva> I'd say usually yes
  35. # [00:24] <Dashiva> But h.264 by itself isn't a unique description either, it's a complex issue
  36. # [00:25] * karl is preparing his bag
  37. # [00:25] * Quits: karl (karlcow@128.30.52.30) (Quit: Where dwelt Ymir, or wherein did he find sustenance?)
  38. # [00:25] <mjs> Apple's specific main interest is to support the MPEG-4 container with H.264 video and AAC audio, but we'd likely also support MPEG-family video and audio codecs at the very least
  39. # [00:27] <hyatt> "Theora is obviously the most common-sense cross-browser, cross-platform solution;"
  40. # [00:27] <hyatt> rolls his eyes
  41. # [00:27] <hyatt> Flash is the most common sense, cross-browser, cross-platform solution.
  42. # [00:28] <hyatt> :)
  43. # [00:30] * Quits: DanC_lap (connolly@128.30.52.30) (Ping timeout)
  44. # [00:30] * Quits: gavin_ (gavin@74.103.208.221) (Ping timeout)
  45. # [00:36] * Joins: gavin_ (gavin@74.103.208.221)
  46. # [00:41] * Joins: anne (annevk@81.68.67.12)
  47. # [00:44] * Joins: heycam (cam@130.194.72.84)
  48. # [00:53] * Parts: hasather (hasather@81.235.209.174)
  49. # [00:55] * Quits: Sander (svl@71.57.109.108) (Quit: And back he spurred like a madman, shrieking a curse to the sky.)
  50. # [01:03] * Quits: hyatt (hyatt@17.255.99.41) (Quit: hyatt)
  51. # [01:15] <anne> hyatt is evil :p
  52. # [01:21] * Joins: sbuluf (nweqi@200.49.140.150)
  53. # [01:34] * Parts: anne (annevk@81.68.67.12)
  54. # [01:47] * Quits: loic (loic@90.29.174.204) (Quit: hoopa rules)
  55. # [01:48] * Joins: DanC_lap (connolly@128.30.52.30)
  56. # [01:48] * Quits: DanC_lap (connolly@128.30.52.30) (Client exited)
  57. # [01:48] * Joins: DanC_lap (connolly@128.30.52.30)
  58. # [02:10] * Quits: tH (Rob@87.102.91.218) (Quit: ChatZilla 0.9.78.1-rdmsoft [XULRunner 1.8.0.9/2006120508])
  59. # [02:12] * Joins: AGraf (Ashe@213.47.199.86)
  60. # [02:14] * Joins: karl (karlcow@128.30.52.30)
  61. # [02:19] * Quits: edas (edaspet@88.191.34.123) (Quit: http://eric.daspet.name/ et l'édition 2007 de http://www.paris-web.fr/ )
  62. # [02:38] * Quits: gavin_ (gavin@74.103.208.221) (Ping timeout)
  63. # [02:43] * Joins: gavin_ (gavin@74.103.208.221)
  64. # [02:47] * Joins: marcos (chatzilla@131.181.148.226)
  65. # [03:11] * Quits: DanC_lap (connolly@128.30.52.30) (Ping timeout)
  66. # [03:36] * Quits: kingryan (rking3@208.66.64.47) (Quit: kingryan)
  67. # [03:44] * Quits: mjs (mjs@17.255.104.223) (Quit: mjs)
  68. # [03:46] * Joins: DanC_lap (connolly@128.30.52.30)
  69. # [04:45] * Quits: gavin_ (gavin@74.103.208.221) (Ping timeout)
  70. # [04:50] * Joins: gavin_ (gavin@74.103.208.221)
  71. # [04:57] * Joins: Sander (svl@71.57.109.108)
  72. # [05:21] * Joins: mjs (mjs@66.245.248.74)
  73. # [05:25] * Quits: mjs (mjs@66.245.248.74) (Quit: mjs)
  74. # [05:34] * Joins: ddailey (david_dail@24.144.172.117)
  75. # [05:35] <ddailey> Was my perception correct that Hyatt just rolled his eyes, or was I imagining something?
  76. # [05:37] <ddailey> Oh I see. That was five hours ago. I must have been imagining. ho hum.... back to sleep.
  77. # [05:40] <ddailey> Was there not, however, and more recently, a seeming impasse at http://lists.w3.org/Archives/Public/public-html/2007May/1244.html?
  78. # [05:42] * Joins: hyatt (hyatt@24.6.91.161)
  79. # [05:50] * Quits: hyatt (hyatt@24.6.91.161) (Client exited)
  80. # [05:50] * Joins: hyatt (hyatt@24.6.91.161)
  81. # [05:51] * Quits: spillner (spillner@141.76.40.118) (Ping timeout)
  82. # [05:51] <Hixie> ddailey: http://lists.w3.org/Archives/Public/public-html/2007May/1242.html describes the solution i'd recommend to that
  83. # [05:53] <ddailey> Yes, I read it -- it seems like that was not too far from what Dan was saying -- you talked more about issue tracking -- he was talking more about discussion
  84. # [05:54] * Joins: spillner (spillner@141.76.40.118)
  85. # [05:54] <ddailey> The unresolved issue seems to concern whether you ever get to sleep or not?
  86. # [05:54] <Hixie> if the process is discussion -> summary -> editor's handle issues on their own timetable, i don't see a problem
  87. # [05:55] <Hixie> if the process is discussion -> editors, then it won't work, i simply don't have the bandwidth for that (which means hyatt certainly doesn't even remotely have the bandwidth either).
  88. # [05:55] <hyatt> my bandwidth is even more limited than usual right now because of wwdc
  89. # [05:55] <Hixie> i'm a little surprised dan said he'd "take the risk" with my time, though
  90. # [05:55] <karl> I don't think he said that
  91. # [05:55] <karl> The way I understood it
  92. # [05:56] <ddailey> I think he was worried more about the risk of wild discussion in email
  93. # [05:56] <karl> he's willing to try along the process he proposed.
  94. # [05:56] <karl> and knowing dan, fixing step by step the process
  95. # [05:57] <karl> I will have to read again carefully side by side your two emails
  96. # [05:57] <karl> I'm sure there is a way in between with common terms
  97. # [05:57] <ddailey> that is my sense as well
  98. # [05:58] <hyatt> Hixie: it is remarkable how many pages think <embed> and <meta> need end tags
  99. # [05:59] <Hixie> yeah
  100. # [05:59] <zcorpan_> hyatt: <embed> is understandable given that it hasn't been specced before
  101. # [05:59] <hyatt> zcorpan_: yeah true
  102. # [06:00] <hyatt> Hixie: i'm super impressed at the parsing section.
  103. # [06:00] <ddailey> embed is a total mess for SVG (in terms of cross browser stuff)
  104. # [06:00] <hyatt> Hixie: it must have been agonizing to write
  105. # [06:02] <Hixie> hyatt: :-D
  106. # [06:02] <Hixie> hyatt: it was quite... "fun" to write, yes
  107. # [06:02] <hyatt> Hixie: we need to do something about the whitespace vs. text section in table mode
  108. # [06:02] <Hixie> yes, i have a whole _pile_ of open issues on the parser part of the spec
  109. # [06:02] <hyatt> the current model would rip text out of the table and lose the whitespace between words
  110. # [06:02] <hyatt> which doesn't match anything
  111. # [06:03] <hyatt> i can't think of a way to specify it though that works incrementally
  112. # [06:03] <hyatt> browsers seem to mostly just yank the text out of the table if there happens to be whitespace in the text they happen to be processing
  113. # [06:03] <hyatt> errr non-whitespace
  114. # [06:04] <hyatt> and then the whitespace gets yanked out too
  115. # [06:05] <zcorpan_> hyatt: you sure about whitespace between words being lost?
  116. # [06:05] <zcorpan_> http://hasather.net/html5/parsetree/parsetree?source=%3Ctable%3Ex+y%3C%2Ftable%3E
  117. # [06:05] <ddailey> I have worries about the parsing section -- totally unsubstantiated gnawing sort of worries -- "<script><\\script>" and "<script>"+"</scri"+"t" It seems clear to me that you've experimented through 99% of it from the test suite, but it gets so bloody interactive with everything else that can go on
  118. # [06:05] <hyatt> zcorpan_: i'm saying the html5 spec as written states that the whitespace would be lost
  119. # [06:05] <zcorpan_> hyatt: ok, then html5lib isn't following the spec
  120. # [06:05] <hyatt> i would not expect it to
  121. # [06:06] <hyatt> :)
  122. # [06:06] <zcorpan_> :P
  123. # [06:06] <hyatt> you have to go character by character to follow the spec when processing text
  124. # [06:06] <hyatt> in order to follow the spec
  125. # [06:06] <hyatt> and nobody is going to do that
  126. # [06:06] <Hixie> hyatt: yeah it's a known bug in the spec
  127. # [06:06] <Hixie> hyatt: it'll be fixed in due course
  128. # [06:06] <hyatt> Hixie: i am not sure how to correct it though
  129. # [06:06] <hyatt> Hixie: i found white-space: pre to be a neat way to see where the whitespace went heh
  130. # [06:06] <Hixie> probably just set a flag when you hit the first non-whitespace
  131. # [06:07] <hyatt> and just rip it all out from there on?
  132. # [06:07] <hyatt> yeah
  133. # [06:07] <hyatt> that would work
  134. # [06:07] <Hixie> yeah
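Editor's sketch of the flag idea discussed above, in Python. It is a toy model only, not the spec's algorithm and not what any browser or html5lib actually does: the function names are made up for illustration, and "moved out" simply stands for text that ends up outside the table. The first function mimics the character-by-character behaviour hyatt describes (whitespace stays in the table, everything else is pulled out), and the second sets a flag at the first non-whitespace character and pulls out everything from there on.

```python
# Characters treated as whitespace in this toy model (an assumption for the sketch).
WHITESPACE = set(" \t\n\f\r")


def split_table_text_per_char(text):
    """Per-character model: whitespace stays in the table, the rest is moved out.

    Returns (moved_out, kept_in_table).
    """
    moved_out = "".join(c for c in text if c not in WHITESPACE)
    kept_in_table = "".join(c for c in text if c in WHITESPACE)
    return moved_out, kept_in_table


def split_table_text_with_flag(text):
    """Flag model: after the first non-whitespace character, move everything out.

    Returns (moved_out, kept_in_table).
    """
    seen_non_whitespace = False
    moved_out, kept_in_table = [], []
    for c in text:
        if c not in WHITESPACE:
            seen_non_whitespace = True  # from here on, whitespace is moved too
        (moved_out if seen_non_whitespace else kept_in_table).append(c)
    return "".join(moved_out), "".join(kept_in_table)


if __name__ == "__main__":
    print(split_table_text_per_char("x y"))   # the space between words is separated from them
    print(split_table_text_with_flag("x y"))  # the words keep their space
```

With the input "x y", the per-character model separates the two letters from the space between them, which is the lost-whitespace problem raised in the log; the flag model keeps "x y" together.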
  135. # [06:08] <ddailey> I should send ya'll some other nasty white space stuff I have been working with
  136. # [06:09] <ddailey> it's not done yet and anne says he's willing to look at it before I trouble a large audience, but ooh ick it looks ugly
  137. # [06:12] <ddailey> There must be a theorem of some sort that concludes that no reasonable human communication can be reognized by a push down automaton
  138. # [06:13] <ddailey> s/reognized/recognized/
  139. # [06:15] * Joins: mjs (mjs@64.81.48.145)
  140. # [06:22] * Quits: DanC_lap (connolly@128.30.52.30) (Ping timeout)
  141. # [06:26] * Parts: ddailey (david_dail@24.144.172.117)
  142. # [06:26] * Joins: olivier (ot@128.30.52.30)
  143. # [06:52] * Quits: zcorpan_ (zcorpan@84.216.40.128) (Ping timeout)
  144. # [06:52] * Quits: gavin_ (gavin@74.103.208.221) (Ping timeout)
  145. # [06:57] * Joins: gavin_ (gavin@74.103.208.221)
  146. # [07:25] * Parts: asbjornu (asbjorn@84.48.116.134)
  147. # [07:48] * Quits: Sander (svl@71.57.109.108) (Quit: And back he spurred like a madman, shrieking a curse to the sky.)
  148. # [07:55] <karl> hixie: what would you consider the more stable section of HTML5 right now?
  149. # [08:06] <Hixie> <canvas>, probably
  150. # [08:14] <karl> ok. thanks.
  151. # [08:15] <hyatt> yeah canvas is pretty stable
  152. # [08:15] <hyatt> shipped in three browsers so... :)
  153. # [08:15] <karl> I wonder what are the intents of IE team on canvas
  154. # [08:17] <hyatt> probably have no plans to implement it
  155. # [08:17] <hyatt> because of the patent stuff
  156. # [08:17] <karl> which puts it out of the specification?
  157. # [08:18] <mjs> I don't believe Apple has any plan to call for exclusion on the patent
  158. # [08:18] <hyatt> no, patented stuff can be put in a spec
  159. # [08:18] <hyatt> then it's up to apple to call for exclusion
  160. # [08:18] <hyatt> within a certain timeframe
  161. # [08:18] <mjs> so in the w3c, it should not be an issue
  162. # [08:18] <hyatt> but no way msft is going to touch it until it's clear it is unencumbered
  163. # [08:18] <mjs> I would love to hear Chris Wilson's comments on it now that it is in a w3c spec
  164. # [08:19] <hyatt> mjs: it's not in a spec til html wg publishes something i assume
  165. # [08:19] <karl> hyatt: I was thinking on a more or less the self imposed hixie requirement to have only in the spec things which are implemented by major browser vendors
  166. # [08:19] <mjs> since he made vague claims that it might be hard to implement in IE or otherwise bad
  167. # [08:19] <hyatt> karl: most of html5 is implemented by nobody, so i assume he means after a while :)
  168. # [08:20] * Quits: sbuluf (nweqi@200.49.140.150) (Ping timeout)
  169. # [08:20] * karl is trying to find a canvas example, reached the page http://developer.mozilla.org/en/docs/Canvas_tutorial:Basic_usage and see that the template at the bottom has no doctype. *sigh*
  170. # [08:23] <karl> canvas is not implemented in Adobe GoLive CS2. Just tested.
  171. # [08:25] * karl is doing another test canvas in an XHTML document served as application/xhtml+xml
  172. # [08:29] <karl> :((( xhtml 1.0 strict served as application/xhtml+xml, canvas is rendered in Camino.
  173. # [08:30] <karl> and the same in Safari :(((
  174. # [08:30] <karl> sigh
  175. # [08:30] * Joins: loic (loic@90.29.174.204)
  176. # [08:30] <hyatt> ?
  177. # [08:30] <hyatt> why would it not be rendered
  178. # [08:32] <karl> because it is not xhtml 1.0 strict
  179. # [08:32] <hyatt> not following
  180. # [08:32] <mjs> doctype declarations don't restrict the vocabulary the browser implements
  181. # [08:33] <hyatt> yeah doctype declarations are meaningless
  182. # [08:33] <karl> I have tolerance for browsers dealing with tag soup. But I'm less tolerant with browsers screwing the space of other specs.
  183. # [08:33] <hyatt> other than determining quirks vs. strict mode in html
  184. # [08:33] <mjs> you can put an <iframe> in your xhtml 1.0 strict document
  185. # [08:33] <mjs> and it may not validate but it will render fine
  186. # [08:33] <mjs> or <embed>
  187. # [08:33] <hyatt> we don't have any versioning regarding XHTML versions or HTML versions
  188. # [08:33] <hyatt> you get the latest stuff regardless
  189. # [08:34] <hyatt> neither does ffx or opera
  190. # [08:34] <karl> I think it should be at least for application/xhtml+xml
  191. # [08:34] <hyatt> why?
  192. # [08:34] <xover> That appears to be MSIE's position, I think, at least.
  193. # [08:35] <hyatt> until MSIE does application/xhtml+xml, nothing any browser really does is particularly relevant.
  194. # [08:35] <hyatt> for XHTML.
  195. # [08:35] <xover> But I think that presupposes a desire for cleaner markup in the wild.
  196. # [08:35] <mjs> I don't think they've said anything about whether they would offer nonstandard extensions in application/xhtml+xml
  197. # [08:35] <mjs> just that they would not process it with a tag soup parser
  198. # [08:37] * Joins: Lachy (Lachlan@210.84.36.41)
  199. # [08:37] <karl> I wish that browsers vendor ignore the elements which are not part of XHTML 1.0 strict served with application/xhtml+xml.
  200. # [08:37] <karl> s/vendor/vendors/
  201. # [08:38] <hyatt> those elements are still part of the DOM tree
  202. # [08:38] <hyatt> i'm not sure what you want to have happen
  203. # [08:38] <hyatt> reject the whole document?
  204. # [08:38] <mjs> rejecting the whole document would violate the spec I think (except maybe in a validating parser, not sure what should happen there)
  205. # [08:38] <karl> that could be a possibility, I might be the more helpful at least for the web developer.
  206. # [08:39] <mjs> the spec says what to do with unknown elements
  207. # [08:39] <karl> s/I might/it might/
  208. # [08:39] <mjs> but it's not clear if you should treat some elements in the XHTML namespace as sometimes known and other times not known
  209. # [08:39] <hyatt> yeah that seems strange to me
  210. # [08:39] <hyatt> and i don't really see any benefit to rejecting newer elements
  211. # [08:40] <hyatt> the author wouldn't have used them if he didn't want them
  212. # [08:40] <mjs> in particular, treating XHTML 1.0 Transitional elements that are not in Strict as unknown when you use the Strict doctype declaration would be kinda weird
  213. # [08:41] <karl> hyatt: an author might have forgotten a namespace
  214. # [08:41] <karl> for example an XHTML 1.0 document with other elements from a different namespace.
  215. # [08:41] <mjs> karl: it would be the job of a conformance checker to tell the author about such things
  216. # [08:42] <xover> Actually, in context of “Draconian Error Handling”, it'd be quite interesting to explore what that actually _means_ in terms of UA behavior.
  217. # [08:42] <hyatt> seems like a pointless expenditure of energy given how irrelevant XHTML is right now
  218. # [08:42] <karl> hyatt: it might be irrelevant to you ;) it is not to me :) at all. I use it every day.
  219. # [08:42] <mjs> XML requires draconian error handling at the parsing level, but it's up to the language to define it at the language level, i.e. what to do for unknown elements or attributes, or bad attribute values
  220. # [08:42] <xover> For a well-formed and otherwise XML Valid XHTML document instance that just happens to use an unknown element of some stripe.
  221. # [08:42] <hyatt> all using XHTML gets you is slower parsing, loss of key JS functionality, bugs, and (in some browsers) non-incremental rendering
  222. # [08:43] <mjs> SVG used to require a visible hard global failure in SVG 1.1 for any bad attribute value or unknown attribute that wasn't namespaced etc
  223. # [08:43] <mjs> in 1.2 they updated that to unknown things should be ignored
  224. # [08:43] <mjs> hyatt: apparently the parsing is faster in Mozilla, though that could be solely due to lack of incremental rendering
  225. # [08:44] <hyatt> mjs: faster than html parsing you mean?
  226. # [08:44] <hyatt> mjs: if so, wow, their html parsing must suck. :)
  227. # [08:44] <mjs> hyatt: someone made that claim - I didn't test myself
  228. # [08:44] * Joins: frippz (fredrikfro@193.11.209.47)
  229. # [08:45] <xover> What would happen if — purely hypothetical, of course — the parser didn't have to deal with any quirks?
  230. # [08:45] <hyatt> xover: html? not much
  231. # [08:45] <hyatt> the quirks aren't a big deal
  232. # [08:45] <hyatt> they don't hurt perf or anything
  233. # [08:45] <mjs> if you mean quirks in the sense of "things only done in quirks mode and not in standards mode", there are very few in the parser, at least in webkit
  234. # [08:45] <mjs> like one or two
  235. # [08:46] <mjs> and I think they could probably be done in standards mode too
  236. # [08:46] <hyatt> yeah, we have more style system parser quirks than html
  237. # [08:46] <xover> If you assume the input is “perfect” XHTML, no need to deal with possible author borkage.
  238. # [08:47] <mjs> well, XML parsers are already an example of what happens then
  239. # [08:47] <mjs> although they do have to detect errors so they can fail
  240. # [08:47] <mjs> sometimes detecting errors is more work than just handling them the same as the non-error case
  241. # [08:47] <xover> Hmm. And to the degree that either is measurable, detecting it is as expensive as.. ah, right.
  242. # [08:48] <mjs> I don't think there's an intrinsic simplicity advantage to either HTML or XML parsing; or performance advantage when dealing with conforming content
  243. # [08:49] <xover> Code Complexity then?
  244. # [08:49] <mjs> XML has simpler error handling rules (hard failure) but the internal subset and other such things make up for it in added complexity
  245. # [08:49] <mjs> in any case the parser is a fairly small part of the implementation, all things considered
  246. # [08:50] <mjs> most of the core code is DOM, CSS, JavaScript and layout
  247. # [08:50] <mjs> and the parts with the hardest algorithms are JS and rendering/layout
  248. # [08:50] * xover can well imagine...
  249. # [08:51] <mjs> DOM does not have too many fancy algorithms required, but it is a fair chunk of code and requires careful thinking to choose the right data structures
  250. # [08:51] <hyatt> the rendering/layout code will put hair on your chest.
  251. # [08:51] * hyatt flexes.
  252. # [08:52] <mjs> hyatt: I want to see you come into work with the top three buttons open and a huge gold chain
  253. # [08:52] * karl is heading to the html 5 spec to see what is happening when the doctype is not first in the document.
  254. # [08:52] <xover> Bling bling!
  255. # [08:53] * Quits: heycam (cam@130.194.72.84) (Quit: bye)
  256. # [08:53] <hyatt> mjs: only if the theme to shaft is playing
  257. # [08:53] <mjs> hyatt: I have the CD
  258. # [08:54] <hyatt> would you walk behind me with a box playing it?
  259. # [08:54] <xover> «Pimp my...» browser developer?
  260. # [08:54] <mjs> I don't see any reason to walk behind you
  261. # [08:55] <xover> You two have just promised to be the entertainment at the first HTML WG F2F!
  262. # [08:56] <hyatt> mjs: ok walk in front of me
  263. # [08:56] <hyatt> they should hear the music before i become visible anyway
  264. # [08:56] <karl> This specification defines the parsing rules for HTML documents, whether they are syntactically valid or not. Certain points in the parsing algorithm are said to be parse errors. The error handling for parse errors is well-defined: user agents must either act as described below when encountering such problems, or must abort processing at the first error that they encounter for which they do not wish to apply the rules described below.
  265. # [08:56] <karl> http://dev.w3.org/cvsweb/~checkout~/html5/spec/Overview.html?rev=1.47&content-type=text/html;%20charset=iso-8859-1#parse
  266. # [08:56] <hyatt> karl: yup
  267. # [08:57] <karl> does that mean "display an error message" when the document is starting with <p>bllabbba</p><!doctype html>
  268. # [08:58] <hyatt> no
  269. # [08:58] <hyatt> you either recover from the error as the spec describes
  270. # [08:58] <hyatt> or if you choose not to follow the spec's error recovery rules
  271. # [08:58] <mjs> karl: the spec lets you fail catastrophically at the first error or recover as required
  272. # [08:58] <hyatt> then you can abort processing
  273. # [08:58] <hyatt> mjs: or somewhere in between
  274. # [08:58] <mjs> the abort rule is for things like conformance checkers
  275. # [08:58] <mjs> right, at the first error you don't handle
  276. # [08:58] <hyatt> you can give up at the first rule at which you choose not to apply html5's recovery rule
  277. # [08:58] <hyatt> errr first error
  278. # [08:59] * karl is searching the section on how to recover
  279. # [08:59] <karl> The more I read the specification, the more I find it difficult to read
  280. # [08:59] <mjs> karl: it's tricky to follow because it is defined as a state machine
  281. # [08:59] <mjs> the parsing section is one of the hardest to just read through
  282. # [08:59] <hyatt> the parsing section is going to be one of the hardest to just read
  283. # [09:00] <hyatt> because it's a giant algorithm
  284. # [09:00] * Quits: gavin_ (gavin@74.103.208.221) (Ping timeout)
  285. # [09:00] <karl> so I wonder what it is supposed to happen in my case
  286. # [09:02] <mjs> karl: you would start in section 8.2.3. Tokenisation (charset detection would not find anything interesting) in "Data state"
  287. # [09:03] <mjs> karl: the initial '<' puts you in "tag open state", and eventually you'd correctly parse the <p>, implicitly opening <html> and <body> along the way (I leave the details as an exercise to the reader)
  288. # [09:04] <xover> Hmm. Does HTML5 still have the concept of a PI?
  289. # [09:04] <xover> Or a Markup Declaration?
  290. # [09:05] * Joins: gavin_ (gavin@74.103.208.221)
  291. # [09:05] * Quits: hyatt (hyatt@24.6.91.161) (Quit: hyatt)
  292. # [09:06] <mjs> then the <! would result in a doctype token, which would be a parse error in the main phase
  293. # [09:06] <xover> Hmm. Apparently the answer is “Not really”.
  294. # [09:08] <mjs> the defined recovery for the doctype token in the main phase is "Parse error. Ignore the token."
  295. # [09:08] <mjs> it's all kind of hairy
  296. # [09:08] <mjs> xover: no, I don't think so
  297. # [09:09] <karl> hmm I will have to try the parse exercise again. I will have no time in the 20 minutes remaining before leaving to know what is happening with "<p>bllabbba</p><!doctype html>"
  298. # [09:09] <mjs> xover: though I think parser error recovery would lead to such things being ignored at least
  299. # [09:10] <karl> it seems to parse it character by character. I wonder how browser would be able to do that without being very slow
  300. # [09:10] <mjs> karl: I just gave you a summary, and if you would like a sneak peek at the answer, the final DOM will look like this:
  301. # [09:10] <mjs> <html><body><p>bllabbba</p></body></html>
  302. # [09:10] <mjs> with a parse error for the stray doctype
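Editor's sketch of the outcome mjs summarises, in Python. This is not the HTML5 tokeniser or tree builder: it leans on the standard library's `html.parser.HTMLParser`, hard-codes the implied `<html><body>` wrapper, and only models one rule from the discussion (a doctype seen after content is a parse error whose recovery is to ignore the token). The names `StrayDoctypeDemo`, `ParseAborted` and `parse_with_stray_doctype` are invented for the example. The `recover` switch illustrates the two choices described earlier: apply the recovery, or abort at the first error you choose not to handle.

```python
from html.parser import HTMLParser


class ParseAborted(Exception):
    """Raised by this toy when it chooses not to recover from a parse error."""


class StrayDoctypeDemo(HTMLParser):
    """Toy consumer: records a parse error for a doctype that follows content."""

    def __init__(self, recover=True):
        super().__init__()
        self.recover = recover
        self.seen_content = False
        self.kept = []          # serialised markup that survives
        self.parse_errors = []  # human-readable error notes

    def handle_starttag(self, tag, attrs):
        # Attributes are ignored to keep the sketch short.
        self.seen_content = True
        self.kept.append(f"<{tag}>")

    def handle_endtag(self, tag):
        self.kept.append(f"</{tag}>")

    def handle_data(self, data):
        if data.strip():
            self.seen_content = True
        self.kept.append(data)

    def handle_decl(self, decl):
        if not self.seen_content:
            return  # doctype in the expected position: nothing to report
        self.parse_errors.append("stray doctype")
        if not self.recover:
            raise ParseAborted("stray doctype")
        # Recovery modelled here: ignore the token and carry on.


def parse_with_stray_doctype(source, recover=True):
    parser = StrayDoctypeDemo(recover=recover)
    parser.feed(source)
    parser.close()
    # The implied <html> and <body> elements are hard-coded for illustration.
    return "<html><body>" + "".join(parser.kept) + "</body></html>", parser.parse_errors


if __name__ == "__main__":
    print(parse_with_stray_doctype("<p>bllabbba</p><!doctype html>"))
```

Calling the function with `recover=False` on the same input raises `ParseAborted` instead of returning a result, which is roughly the behaviour a conformance checker might prefer.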
  303. # [09:11] <xover> I wonder if the general markup declaration — as the wrapper concept for DOCTYPE and comments — was considered overkill/unnecessary or just forgotten / not considered.
  304. # [09:11] <karl> yes but I was trying to follow the logic of the specification to understand it.
  305. # [09:11] <mjs> the lack of <title> is also a conformance error in that document, as is the lack of initial doctype
  306. # [09:11] <mjs> karl: it's hard to explain over IRC, in person I would point
  307. # [09:11] <karl> many thanks already for your time mjs
  308. # [09:11] <karl> it helps
  309. # [09:12] <karl> what the browser is supposed to do with identified parse errors once it has recreated a dom?
  310. # [09:12] <karl> http://dev.w3.org/cvsweb/~checkout~/html5/spec/Overview.html?rev=1.47&content-type=text/html;%20charset=iso-8859-1#parse
  311. # [09:12] <mjs> there's no requirement to report parse errors, but also no requirement not to
  312. # [09:13] <karl> hhmmmmm "must abort processing at the first error that they encounter for which they do not wish to apply the rules described below."
  313. # [09:13] <karl> free to implementers :/
  314. # [09:13] <mjs> there's no requirement to even identify parse errors, if you do the specified recovery
  315. # [09:13] * Joins: anne (annevk@81.68.67.12)
  316. # [09:14] <xover> Hixie: Is there a description of the spec “build process” anywhere? (to poke and prod at it, toy with generating PDF versions, etc.)
  317. # [09:16] <mjs> it's interesting that some things aren't flagged as parse errors in the parsing section, though they are presumably non-conforming
  318. # [09:16] <mjs> like <head><p>foo
  319. # [09:16] <mjs> (note lack of </head>)
  320. # [09:17] <mjs> or is that conforming?
  321. # [09:17] <karl> I asked all the question about code before doctype
  322. # [09:17] <xover> The general case is an implied end tag?
  323. # [09:17] <karl> because there are geocities like sites which inject code before serving the page
  324. # [09:18] <mjs> karl: the effect in most browsers and in the HTML5 spec is that the doctype is just ignored in that case
  325. # [09:18] <xover> Hmm. That's an interesting use case.
  326. # [09:18] <mjs> karl: so it would put the page in quirks mode, unless the site also injects a doctype
  327. # [09:20] <anne> mjs, that's conforming
  328. # [09:20] <anne> mjs, </head> and <body> are optional
  329. # [09:20] <anne> (except that it's not conforming because you didn't specify <title>
  330. # [09:20] <anne> but it wouldn't cause parse errors)
  331. # [09:21] <karl> anne: in html5lib python, do you parse character by character as the algorithm says in html 5 spec?
  332. # [09:21] <xover> End tags in general are optional, or only for a specific set of elements?
  333. # [09:21] <mjs> anne: got it
  334. # [09:22] <mjs> xover: same end tags are optional as are optional in HTML4.01, I think
  335. # [09:22] <anne> karl, we have "optimizations" in place
  336. # [09:22] <anne> I'm not sure actually the above bug is actually a bug by the way...
  337. # [09:23] <xover> Hmm. By the same rationale, or by default?
  338. # [09:24] <anne> Yeah, maybe it is...
  339. # [09:24] <anne> http://www.whatwg.org/specs/web-apps/current-work/#append doesn't say what I thought it would
  340. # [09:24] <karl> :)
  341. # [09:24] <mjs> xover: you have to act as if the same elements have implicit open and close tags to parse legacy documents correctly, and making use of this non-conforming would probably be unhelpful
  342. # [09:24] <mjs> at least that's what I imagine
  343. # [09:25] * karl wonders if it would be good to impose very strict rules on authoring tools.
  344. # [09:26] <xover> I took anne to mean that the spec makes some or all end tags optional in conforming documents?
  345. # [09:26] * Joins: edas (edaspet@88.191.34.123)
  346. # [09:26] * xover suspects authoring tools will behave much like browsers in that regard...
  347. # [09:26] * anne was replying to the bug in the table section with whitespace versus character tokens
  348. # [09:27] <xover> «If the browsers support it, the users will want it, so we have to be able to emit it.»
  349. # [09:27] <karl> re *sigh*
  350. # [09:27] <karl> that will be my sigh day ;)
  351. # [09:27] <mjs> karl: the spec requires authoring tools to generate only conforming documents
  352. # [09:27] <anne> xover, some end tags are optional even for conforming documents
  353. # [09:28] <mjs> xover: the </head> end tag is optional, as is the </p> end tag, I think those are both true in HTML 4.01 as well
  354. # [09:28] <mjs> but </div> is not optional
  355. # [09:29] <mjs> or </script>
  356. # [09:29] <xover> Ah, I just realised I misread mjs' earlier comment.
  357. # [09:30] <xover> “…make (use of)…” not “…(make use of)…”
  358. # [09:30] <mjs> ah, right
  359. # [09:30] <xover> heh heh, sorry
  360. # [09:30] <anne> karl, the conformance requirements on authoring tools are quite strict
  361. # [09:31] * Quits: karl (karlcow@128.30.52.30) (Quit: Where dwelt Ymir, or wherein did he find sustenance?)
  362. # [09:31] <mjs> I didn't double-check but I think the same set of tags have implicit close or open as in HTML 4.01
  363. # [09:31] <mjs> for compatibility w/ parsing in older browsers
  364. # [09:31] <anne> yeah
  365. # [09:31] <anne> although we might change the <p><table> case
  366. # [09:32] <mjs> though I think there might be new elements w/ empty content model which therefore don't need close tags
  367. # [09:34] <xover> Given the need to support self-closing tags for compat, wouldn't it make sense to generalize that usage for empty elements?
  368. # [09:34] * xover just thinking out loud...
  369. # [09:35] <anne> <img /> is allowed
  370. # [09:35] <mjs> self-closing tags are allowed, for empty elements only
  371. # [09:35] <anne> <html xmlns=http://www.w3.org/1999/xhtml> is allowed even (even without the quotes)
  372. # [09:36] <anne> although xmlns will not end up in the right namespace and such of course
  373. # [09:36] <mjs> (for other elements they'd behave different between HTML and XHTML and so are non-conforming)
  374. # [09:36] <xover> Hmm.
  375. # [09:37] <anne> nothing to hmm about
  376. # [09:37] <anne> this is all mostly a solved problem
  377. # [09:37] * Quits: Lachy (Lachlan@210.84.36.41) (Quit: Leaving)
  378. # [09:37] * Quits: MikeSmith (MikeSmith@mcclure.w3.org) (Ping timeout)
  379. # [09:37] <mjs> in HTML 4.01, if you take it to be an SGML application, <div /> would be allowed but means something different from what it means in XML
  380. # [09:38] <xover> “Hmm.” -> “/me must think on this to grok it more fully” :-)
  381. # [09:38] <mjs> (in fact it means three different things according to SGML, the way browsers actually parse HTML, and XML)
  382. # [09:39] <mjs> (so it's kind of bad that a fully per-spec HTML 4.01 validator would not flag it)
  383. # [09:39] <xover> ottomh, SGML and XML should be fairly equivalent for this case.
  384. # [09:40] <anne> <br/> in SGML is <br>\n&gt;
  385. # [09:40] <anne> that's quite different from XML
  386. # [09:40] <xover> Hmm. Well, actually, no, not ... right.
  387. # [09:42] * xover curses markup minimization...
  388. # [09:43] <xover> Well, /me is off to work. Thanks for the interesting discussions all!
  389. # [09:43] * xover wanders off...
  390. # [09:51] * anne wonders why karl argued for versioning in HTML earlier...
  391. # [09:57] <anne> btw, technically the spec doesn't define <p>foobar</p><!doctype html> yet
  392. # [09:57] <anne> it requires a doctype first
  393. # [10:05] <mjs> that's true
  394. # [10:06] <mjs> if UAs chose to recover from lack of doctype by just ignoring it then it would be processed as I said
  395. # [10:07] <anne> the spec should define both modes in due course imo
  396. # [10:07] <anne> including doctype sniffing
  397. # [10:07] <anne> and then hopefully we can make it a single mode
  398. # [10:07] <anne> and keep the doctype stuff for "minimal" rendering differences
  399. # [10:09] <mjs> I thought leaving the missing/older/unknown doctype situation undefined was on purpose, to for example allow IE to implement HTML5 parsing for HTML5 but still parse HTML4 the old way
  400. # [10:10] <anne> hmm, maybe that needs to be a separate spec then :(
  401. # [10:10] * anne would love to have all of the web defined
  402. # [10:12] <mjs> I think a spec defining how to detect quirks mode, the rendering and CSS parsing differences in quirks mode, and parsing in absence of proper HTML5 doctype (ideally just the same as normal HTML5 parsing) would be useful
  403. # [10:12] * Quits: olivier (ot@128.30.52.30) (Quit: Leaving)
  404. # [10:15] * Joins: ROBOd (robod@86.34.246.154)
  405. # [10:18] * hsivonen thinks the spec for doctype sniffing should be dbaron's implementation ported to English, since hyatt already ported it to WebKit and Opera almost matches (apparently by blackbox reverse engineering, since they don't match exactly in corner cases)
  406. # [10:18] * Joins: Lachy (Lachlan@210.84.36.41)
  407. # [10:18] * Quits: Lachy (Lachlan@210.84.36.41) (Client exited)
  408. # [10:22] <mjs> Opera 9 seems pretty close according to your table
  409. # [10:22] <hsivonen> mjs: yes (except corner cases)
  410. # [10:22] <mjs> that chart really makes me think that CSS should change to make Almost Standards mode the standard
  411. # [10:22] <anne> I think we devised our own algorithm based on real world needs
  412. # [10:22] <anne> mjs, yeah
  413. # [10:22] <anne> +1
  414. # [10:23] <mjs> Opera 9 seems to match Gecko and WebKit a lot more closely than earlier Operas
  415. # [10:23] <mjs> (formerly I guess it was closer to IE)
  416. # [10:24] <hsivonen> anne: really? why does it match Gecko in places where the Gecko has apple.com-in-2000-induded weirdness and doesn't match for ISO HTML which doesn't have real-world relevance
  417. # [10:24] <hsivonen> ?
  418. # [10:24] <hsivonen> s/indudud/induced/
  419. # [10:24] <anne> I don't know and can't share our algorithm
  420. # [10:27] <hsivonen> mjs: https://bugzilla.mozilla.org/show_bug.cgi?id=78208
  421. # [10:27] <mjs> you mean because you haven't asked permission yet or because the specific algorithm is considered an important trade secret?
  422. # [10:30] <mjs> interesting to read dbaron circa 2001
  423. # [10:31] <anne> Haven't asked
  424. # [10:32] <anne> mjs, I think UA conformance requirements probably require you to treat <style scoped> in the same way regardless of context
  425. # [10:34] <mjs> hsivonen: interesting how strident people were in those comments
  426. # [10:34] <mjs> we've never been that anal for Safari, I guess because we considered quirks mode and standards mode to actually be "copy IE bugs" mode and "copy mozilla bugs" mode respectively
  427. # [10:34] <mjs> (ok, I exaggerate a bit)
  428. # [10:35] <mjs> anne: right, I just mean that for conforming content, <style scoped> doesn't create style reapplication issues
  429. # [10:36] <mjs> so hyatt's change would only make a significant difference for nonconforming content, in which case you might see a non-scoped <style> in the body and be screwed anyway
  430. # [10:36] <anne> <i>&heart;5
  431. # [10:38] * hsivonen should blog the story of https://bugzilla.mozilla.org/show_bug.cgi?id=42525 sometime.
  432. # [10:38] <hsivonen> now it has been a while so there's no need to hide the bug from reopeners any longer
  433. # [10:47] * Quits: anne (annevk@81.68.67.12) (Client exited)
  434. # [10:47] <mjs> I couldn't make it through to the end of the comments there
  435. # [10:47] * Joins: anne (annevk@81.68.67.12)
  436. # [10:47] * Quits: anne (annevk@81.68.67.12) (Client exited)
  437. # [10:47] * Joins: anne (annevk@81.68.67.12)
  438. # [10:47] * Quits: anne (annevk@81.68.67.12) (Client exited)
  439. # [10:48] * Joins: anne (annevk@81.68.67.12)
  440. # [10:48] <hsivonen> mjs: long story short, what looks illogical in the doctype chart was done to keep apple.com in the quirks mode at the time
  441. # [10:48] * Quits: anne (annevk@81.68.67.12) (Client exited)
  442. # [10:48] * Joins: anne (annevk@81.68.67.12)
  443. # [10:51] <mjs> wow, apple was breaking the web before even having a browser
  444. # [10:53] <hsivonen> mjs: and it was the same issue that is the difference between the Standards Mode and the Almost Standards mode today
  445. # [10:56] <mjs> hsivonen: a colorful history there
  446. # [10:56] * Joins: MikeSmith (MikeSmith@mcclure.w3.org)
  447. # [11:08] * Quits: gavin_ (gavin@74.103.208.221) (Ping timeout)
  448. # [11:13] * Joins: gavin_ (gavin@74.103.208.221)
  449. # [11:23] <anne> good point mjs
  450. # [11:23] <anne> we should point out more often that syntax errors are hardly interesting
  451. # [11:23] <anne> and that checking all is impossible
  452. # [11:23] <hsivonen> anne: do you offer a zip file of the Web Forms 2.0 suite over HTTP? would you prefer me spidering it or would you prefer me the hit the server only for a zip file?
  453. # [11:23] * anne doesn't think many people realize how this stuff actually works
  454. # [11:23] <hsivonen> s/the hit/to hit/
  455. # [11:24] <anne> hsivonen, at this point spidering as I don't have to do anything :)
  456. # [11:24] <hsivonen> anne: ok
  457. # [11:24] <anne> i think we can take the bandwidth usage too :)
  458. # [11:31] * Joins: heycam (cam@203.214.6.6)
  459. # [11:49] <MikeSmith> heycam - I see that you're going to be in Tokyo at the end of August?
  460. # [12:16] * anne wonders where zcorpan is
  461. # [12:16] * anne spotted a mistake in his presentation
  462. # [12:16] <anne> while copying the google suggest example 8-)
  463. # [12:17] * Joins: tH_ (Rob@87.102.91.218)
  464. # [12:18] * tH_ is now known as tH
  465. # [12:37] * anne likes the <noscript><link rel=stylesheet></noscript> usecase
  466. # [12:37] <anne> <noscript><base href=evil.com></noscript>
  467. # [12:37] <anne> hah
  468. # [13:09] * Parts: anne (annevk@81.68.67.12)
  469. # [13:24] * Joins: anne (annevk@81.68.67.12)
  470. # [13:28] * Parts: anne (annevk@81.68.67.12)
  471. # [13:33] <heycam> MikeSmith, yep i will be
  472. # [13:33] <heycam> for svg f2f + svg open
  473. # [14:16] * Quits: marcos (chatzilla@131.181.148.226) (Ping timeout)
  474. # [14:27] * Quits: beowulf (carisenda@91.84.50.132) (Ping timeout)
  475. # [14:38] * Joins: beowulf (carisenda@91.84.50.132)
  476. # [14:59] * Joins: AGraf|mb (Ashe@138.232.65.129)
  477. # [15:10] * Quits: MikeSmith (MikeSmith@mcclure.w3.org) (Ping timeout)
  478. # [15:12] * Joins: zcorpan_ (zcorpan@84.216.41.246)
  479. # [15:36] * Quits: AGraf|mb (Ashe@138.232.65.129) (Quit: Quit)
  480. # [15:38] * Quits: gavin_ (gavin@74.103.208.221) (Ping timeout)
  481. # [15:44] * Joins: gavin_ (gavin@74.103.208.221)
  482. # [15:51] * Joins: AGraf|mb (Ashe@138.232.245.27)
  483. # [15:56] * Joins: DanC_lap (connolly@128.30.52.30)
  484. # [16:09] * Quits: AGraf|mb (Ashe@138.232.245.27) (Client exited)
  485. # [16:31] * Quits: DanC_lap (connolly@128.30.52.30) (Ping timeout)
  486. # [16:35] * Joins: billmason (billmason@69.30.57.156)
  487. # [17:16] * Joins: kazuhito (kazuhito@222.151.153.231)
  488. # [17:19] * Quits: zcorpan_ (zcorpan@84.216.41.246) (Ping timeout)
  489. # [17:22] * Joins: zcorpan_ (zcorpan@84.216.41.246)
  490. # [17:26] * Joins: hasather (hasather@81.235.209.174)
  491. # [17:42] * Joins: MikeSmith (MikeSmith@mcclure.w3.org)
  492. # [17:44] * Quits: kazuhito (kazuhito@222.151.153.231) (Quit: Quitting!)
  493. # [17:46] * Quits: gavin_ (gavin@74.103.208.221) (Ping timeout)
  494. # [17:52] * Joins: gavin_ (gavin@74.103.208.221)
  495. # [17:56] * Joins: DanC_lap (connolly@128.30.52.30)
  496. # [18:06] * Quits: edas (edaspet@88.191.34.123) (Ping timeout)
  497. # [18:15] * Joins: Sander (svl@71.57.109.108)
  498. # [18:42] * Joins: h3h (bfults@66.162.32.234)
  499. # [18:45] * Joins: edas (edaspet@88.191.34.123)
  500. # [18:50] * Quits: loic (loic@90.29.174.204) (Ping timeout)
  501. # [19:05] * Joins: loic (loic@90.41.7.67)
  502. # [19:30] * Joins: kingryan (rking3@208.66.64.47)
  503. # [19:54] * Quits: gavin_ (gavin@74.103.208.221) (Ping timeout)
  504. # [19:58] * Joins: hyatt (hyatt@24.6.91.161)
  505. # [19:59] * Joins: gavin_ (gavin@74.103.208.221)
  506. # [21:13] * Quits: xover (xover@193.157.66.5) (Ping timeout)
  507. # [21:16] * Joins: xover (xover@193.157.66.5)
  508. # [21:41] * Quits: DanC_lap (connolly@128.30.52.30) (Ping timeout)
  509. # [21:41] * Quits: mjs (mjs@64.81.48.145) (Quit: mjs)
  510. # [21:50] * Quits: loic (loic@90.41.7.67) (Quit: hoopa rules)
  511. # [21:57] * Quits: hyatt (hyatt@24.6.91.161) (Quit: hyatt)
  512. # [22:01] * Quits: gavin_ (gavin@74.103.208.221) (Ping timeout)
  513. # [22:02] * Quits: gsnedders (gsnedders@86.139.123.225) (Quit: Don't touch /dev/null…)
  514. # [22:02] * Joins: hyatt (hyatt@24.6.91.161)
  515. # [22:06] * Joins: gavin_ (gavin@74.103.208.221)
  516. # [22:07] * Quits: mw22 (chatzilla@84.41.169.151) (Ping timeout)
  517. # [22:10] * Joins: mw22 (chatzilla@84.41.169.151)
  518. # [22:13] * Quits: ROBOd (robod@86.34.246.154) (Quit: http://www.robodesign.ro )
  519. # [22:30] * Joins: mjs (mjs@66.245.248.74)
  520. # [22:37] * Quits: Hixie (ianh@129.241.93.37) (Client exited)
  521. # [22:41] * Quits: mjs (mjs@66.245.248.74) (Ping timeout)
  522. # [22:44] * Joins: DanC_lap (connolly@128.30.52.30)
  523. # [22:48] * Joins: gsnedders (gsnedders@86.139.123.225)
  524. # [22:49] * Joins: Zeros (Zeros-Elip@67.154.87.254)
  525. # [22:56] * Quits: Zeros (Zeros-Elip@67.154.87.254) (Quit: Leaving)
  526. # [23:10] * Joins: Hixie (ianh@129.241.93.37)
  527. # [23:16] * Quits: Hixie (ianh@129.241.93.37) (Client exited)
  528. # [23:16] * Joins: asbjornu (asbjorn@84.48.116.134)
  529. # [23:26] * Joins: Hixie (ianh@129.241.93.37)
  530. # [23:59] * Joins: nickshanks (nicholas@195.137.85.17)
  531. # Session Close: Thu May 31 00:00:00 2007
