- # Session Start: Wed May 30 00:00:00 2007
- # Session Ident: #html-wg
- # [00:02] * Joins: loic (loic@90.29.174.204)
- # [00:16] <karl> http://textplain.blogspot.com/2007/05/small-html-documents.html
- # [00:16] <karl> An exercise in reading HTML specifications
- # [00:17] <karl> "The HTML5 spec is easier to read because it is designed with audience in mind. But at the end of the day, it's still a spec. Spec aren't meant to be read by humans. They're meant to be read by pedantic assholes (who yell at people trying to interpret the spec) and angelic humanitarians (who interpret specs and translate for the rest of us)."
- # [00:17] * Joins: DanC_lap (connolly@128.30.52.30)
- # [00:17] <zcorpan_> karl: yeah, saw that earlier
- # [00:17] <karl> http://www.bluishcoder.co.nz/2007/05/support-for-html-video-element-in.html
- # [00:17] <karl> Support for HTML Video element in Firefox
- # [00:17] <karl> "I briefly mentioned in a previous post that I was working on implementing this tag natively in Firefox. The intent is to display Ogg Theora encoded video without needing any plugins, similar to the recent demonstration of Theora playback using a video element by Opera. Opera have a post about it on their labs page."
- # [00:17] <gavin_> those groups sound like they were taken from Mark Pilgrim's post from a while back
- # [00:18] <karl> http://www.broken-links.com/2007/05/29/mozilla-gets-native-video-support/
- # [00:18] <karl> Mozilla gets native video support
- # [00:18] <gavin_> http://diveintomark.org/archives/2004/08/16/specs
- # [00:18] <karl> "While I think this is great news and has a lot of potential, I foresee one major obstacle to this becoming standard: proprietary codecs. If they decide to implement it, Apple will want Quicktime in Safari, and Microsoft will want Windows Media Video in Internet Explorer."
- # [00:19] <zcorpan_> karl: have you become an aggregator? :)
- # [00:19] <karl> sort of ;)
- # [00:19] <karl> zcorpan_: just reading the links of the morning
- # [00:19] <zcorpan_> ok
- # [00:19] <karl> and I have to leave in a few minutes to take the train :)
- # [00:20] <karl> for my happy 1h30 commuting reading Rainer Maria Rilke.
- # [00:20] <mjs> Apple will want MPEG-4, not QuickTime, and MP4 is not proprietary (though it is subject to patents)
- # [00:21] * DanC_lap tries to remember how MP4 and H.623 are related... or is it H.263...
- # [00:22] <karl> http://www.apple.com/quicktime/technologies/h264/
- # [00:22] <Dashiva> mp4 is a container, h.263 is a codec, h.264 is a newer codec?
- # [00:22] <karl> http://en.wikipedia.org/wiki/H.264
- # [00:22] <karl> "video compression. Also known as MPEG-4 Part 10, or AVC (for Advanced Video Coding)."
- # [00:22] <mjs> MPEG-4 is a family of formats, including a generic container format and a number of codecs
- # [00:23] <karl> http://www.brucelawson.co.uk/index.php/2007/html5-microformats-accessibility-testing/
- # [00:23] <karl> HTML 5, microformats and testing accessibility
- # [00:24] <DanC_lap> H.264 is what I meant... when people say mp4, do they usually mean with h.264?
- # [00:24] * DanC_lap wanders off again...
- # [00:24] <Dashiva> I'd say usually yes
- # [00:24] <Dashiva> But h.264 by itself isn't a unique description either, it's a complex issue
- # [00:25] * karl is preparing his bag
- # [00:25] * Quits: karl (karlcow@128.30.52.30) (Quit: Where dwelt Ymir, or wherein did he find sustenance?)
- # [00:25] <mjs> Apple's specific main interest is to support the MPEG-4 container with H.264 video and AAC audio, but we'd likely also support MPEG-family video and audio codecs at the very least
- # [00:27] <hyatt> "Theora is obviously the most common-sense cross-browser, cross-platform solution;"
- # [00:27] <hyatt> rolls his eyes
- # [00:27] <hyatt> Flash is the most common sense, cross-browser, cross-platform solution.
- # [00:28] <hyatt> :)
- # [00:30] * Quits: DanC_lap (connolly@128.30.52.30) (Ping timeout)
- # [00:30] * Quits: gavin_ (gavin@74.103.208.221) (Ping timeout)
- # [00:36] * Joins: gavin_ (gavin@74.103.208.221)
- # [00:41] * Joins: anne (annevk@81.68.67.12)
- # [00:44] * Joins: heycam (cam@130.194.72.84)
- # [00:53] * Parts: hasather (hasather@81.235.209.174)
- # [00:55] * Quits: Sander (svl@71.57.109.108) (Quit: And back he spurred like a madman, shrieking a curse to the sky.)
- # [01:03] * Quits: hyatt (hyatt@17.255.99.41) (Quit: hyatt)
- # [01:15] <anne> hyatt is evil :p
- # [01:21] * Joins: sbuluf (nweqi@200.49.140.150)
- # [01:34] * Parts: anne (annevk@81.68.67.12)
- # [01:47] * Quits: loic (loic@90.29.174.204) (Quit: hoopa rules)
- # [01:48] * Joins: DanC_lap (connolly@128.30.52.30)
- # [01:48] * Quits: DanC_lap (connolly@128.30.52.30) (Client exited)
- # [01:48] * Joins: DanC_lap (connolly@128.30.52.30)
- # [02:10] * Quits: tH (Rob@87.102.91.218) (Quit: ChatZilla 0.9.78.1-rdmsoft [XULRunner 1.8.0.9/2006120508])
- # [02:12] * Joins: AGraf (Ashe@213.47.199.86)
- # [02:14] * Joins: karl (karlcow@128.30.52.30)
- # [02:19] * Quits: edas (edaspet@88.191.34.123) (Quit: http://eric.daspet.name/ et l'édition 2007 de http://www.paris-web.fr/ )
- # [02:38] * Quits: gavin_ (gavin@74.103.208.221) (Ping timeout)
- # [02:43] * Joins: gavin_ (gavin@74.103.208.221)
- # [02:47] * Joins: marcos (chatzilla@131.181.148.226)
- # [03:11] * Quits: DanC_lap (connolly@128.30.52.30) (Ping timeout)
- # [03:36] * Quits: kingryan (rking3@208.66.64.47) (Quit: kingryan)
- # [03:44] * Quits: mjs (mjs@17.255.104.223) (Quit: mjs)
- # [03:46] * Joins: DanC_lap (connolly@128.30.52.30)
- # [04:45] * Quits: gavin_ (gavin@74.103.208.221) (Ping timeout)
- # [04:50] * Joins: gavin_ (gavin@74.103.208.221)
- # [04:57] * Joins: Sander (svl@71.57.109.108)
- # [05:21] * Joins: mjs (mjs@66.245.248.74)
- # [05:25] * Quits: mjs (mjs@66.245.248.74) (Quit: mjs)
- # [05:34] * Joins: ddailey (david_dail@24.144.172.117)
- # [05:35] <ddailey> Was my perception correct that Hyatt just rolled his eyes, or was I imagining something?
- # [05:37] <ddailey> Oh I see. That was five hours ago. I must have been imagining. ho hum.... back to sleep.
- # [05:40] <ddailey> Was there not, however, and more recently, a seeming impasse at http://lists.w3.org/Archives/Public/public-html/2007May/1244.html?
- # [05:42] * Joins: hyatt (hyatt@24.6.91.161)
- # [05:50] * Quits: hyatt (hyatt@24.6.91.161) (Client exited)
- # [05:50] * Joins: hyatt (hyatt@24.6.91.161)
- # [05:51] * Quits: spillner (spillner@141.76.40.118) (Ping timeout)
- # [05:51] <Hixie> ddailey: http://lists.w3.org/Archives/Public/public-html/2007May/1242.html describes the solution i'd recommend to that
- # [05:53] <ddailey> Yes, I read it -- it seems like that was not too far from what Dan was saying -- you talked more about issue tracking -- he was talking more about discussion
- # [05:54] * Joins: spillner (spillner@141.76.40.118)
- # [05:54] <ddailey> The unresolved issue seems to concern whether you ever get to sleep or not?
- # [05:54] <Hixie> if the process is discussion -> summary -> editors handle issues on their own timetable, i don't see a problem
- # [05:55] <Hixie> if the process is discussion -> editors, then it won't work, i simply don't have the bandwidth for that (which means hyatt certainly doesn't even remotely have the bandwidth either).
- # [05:55] <hyatt> my bandwidth is even more limited than usual right now because of wwdc
- # [05:55] <Hixie> i'm a little surprised dan said he'd "take the risk" with my time, though
- # [05:55] <karl> I don't think he said that
- # [05:55] <karl> The way I understood it
- # [05:56] <ddailey> I think he was worried more about the risk of wild discussion in email
- # [05:56] <karl> he's willing to try along the process he proposed.
- # [05:56] <karl> and knowing dan, fixing step by step the process
- # [05:57] <karl> I will have to read again carefully side by side your two emails
- # [05:57] <karl> I'm sure there is a way in between with common terms
- # [05:57] <ddailey> that is my sense as well
- # [05:58] <hyatt> Hixie: it is remarkable how many pages think <embed> and <meta> need end tags
- # [05:59] <Hixie> yeah
- # [05:59] <zcorpan_> hyatt: <embed> is understandable given that it hasn't been specced before
- # [05:59] <hyatt> zcorpan_: yeah true
- # [06:00] <hyatt> Hixie: i'm super impressed at the parsing section.
- # [06:00] <ddailey> embed is a total mess for SVG (in terms of cross browser stuff)
- # [06:00] <hyatt> Hixie: it must have been agonizing to write
- # [06:02] <Hixie> hyatt: :-D
- # [06:02] <Hixie> hyatt: it was quite... "fun" to write, yes
- # [06:02] <hyatt> Hixie: we need to do something about the whitespace vs. text section in table mode
- # [06:02] <Hixie> yes, i have a whole _pile_ of open issues on the parser part of the spec
- # [06:02] <hyatt> the current model would rip text out of the table and lose the whitespace between words
- # [06:02] <hyatt> which doesn't match anything
- # [06:03] <hyatt> i can't think of a way to specify it though that works incrementally
- # [06:03] <hyatt> browsers seem to mostly just yank the text out of the table if there happens to be whitespace in the text they happen to be processing
- # [06:03] <hyatt> errr non-whitespace
- # [06:04] <hyatt> and then the whitespace gets yanked out too
- # [06:05] <zcorpan_> hyatt: you sure about whitespace between words being lost?
- # [06:05] <zcorpan_> http://hasather.net/html5/parsetree/parsetree?source=%3Ctable%3Ex+y%3C%2Ftable%3E
- # [06:05] <ddailey> I have worries about the parsing section -- totally unsubstantiated gnawing sort of worries -- "<script><\\script>" and "<script>"+"</scri"+"t" It seems clear to me that you've experimented through 99% of it from the test suite, but it gets so bloody interactive with everything else that can go on
- # [06:05] <hyatt> zcorpan_: i'm saying the html5 spec as written states that the whitespace would be lost
- # [06:05] <zcorpan_> hyatt: ok, then html5lib isn't following the spec
- # [06:05] <hyatt> i would not expect it to
- # [06:06] <hyatt> :)
- # [06:06] <zcorpan_> :P
- # [06:06] <hyatt> you have to go character by character to follow the spec when processing text
- # [06:06] <hyatt> in order to follow the spec
- # [06:06] <hyatt> and nobody is going to do that
- # [06:06] <Hixie> hyatt: yeah it's a known bug in the spec
- # [06:06] <Hixie> hyatt: it'll be fixed in due course
- # [06:06] <hyatt> Hixie: i am not sure how to correct it though
- # [06:06] <hyatt> Hixie: i found white-space: pre to be a neat way to see where the whitespace went heh
- # [06:06] <Hixie> probably just set a flag when you hit the first non-whitespace
- # [06:07] <hyatt> and just rip it all out from there on?
- # [06:07] <hyatt> yeah
- # [06:07] <hyatt> that would work
- # [06:07] <Hixie> yeah
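The flag idea discussed above can be sketched roughly as follows. This is only one reading of the suggestion, not text from the spec: the function name `split_table_text` and the choice to return two strings are assumptions made for illustration. Whitespace seen before any other text stays in the table; from the first non-whitespace character on, everything (including later whitespace) is moved out, so the spaces between words survive.

```python
# Hypothetical sketch of the "set a flag at the first non-whitespace" idea.
# Names and return shape are illustrative assumptions, not spec text.

HTML_WHITESPACE = " \t\n\f\r"


def split_table_text(chars):
    """Split a run of character tokens found directly inside <table>.

    Returns (kept_in_table, moved_out_of_table).
    """
    kept, moved = [], []
    seen_non_whitespace = False  # the flag from the discussion
    for ch in chars:
        if not seen_non_whitespace and ch in HTML_WHITESPACE:
            # Leading whitespace is left where it is.
            kept.append(ch)
        else:
            # Once real text shows up, move it and all following
            # whitespace out of the table together.
            seen_non_whitespace = True
            moved.append(ch)
    return "".join(kept), "".join(moved)
```

For the `<table>x y</table>` case linked earlier, this keeps nothing in the table and moves "x y" out as a unit, with the space between the words intact.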
- # [06:08] <ddailey> I should send ya'll some other nasty white space stuff I have been working with
- # [06:09] <ddailey> it's not done yet and anne says he's willing to look at it before I trouble a large audience, but ooh ick it looks ugly
- # [06:12] <ddailey> There must be a theorem of some sort that concludes that no reasonable human communication can be reognized by a push down automaton
- # [06:13] <ddailey> s/reognized/recognized/
- # [06:15] * Joins: mjs (mjs@64.81.48.145)
- # [06:22] * Quits: DanC_lap (connolly@128.30.52.30) (Ping timeout)
- # [06:26] * Parts: ddailey (david_dail@24.144.172.117)
- # [06:26] * Joins: olivier (ot@128.30.52.30)
- # [06:52] * Quits: zcorpan_ (zcorpan@84.216.40.128) (Ping timeout)
- # [06:52] * Quits: gavin_ (gavin@74.103.208.221) (Ping timeout)
- # [06:57] * Joins: gavin_ (gavin@74.103.208.221)
- # [07:25] * Parts: asbjornu (asbjorn@84.48.116.134)
- # [07:48] * Quits: Sander (svl@71.57.109.108) (Quit: And back he spurred like a madman, shrieking a curse to the sky.)
- # [07:55] <karl> hixie: what would you consider the more stable section of HTML5 right now?
- # [08:06] <Hixie> <canvas>, probably
- # [08:14] <karl> ok. thanks.
- # [08:15] <hyatt> yeah canvas is pretty stable
- # [08:15] <hyatt> shipped in three browsers so... :)
- # [08:15] <karl> I wonder what are the intents of IE team on canvas
- # [08:17] <hyatt> probably have no plans to implement it
- # [08:17] <hyatt> because of the patent stuff
- # [08:17] <karl> which puts it out of the specification?
- # [08:18] <mjs> I don't believe Apple has any plan to call for exclusion on the patent
- # [08:18] <hyatt> no, patented stuff can be put in a spec
- # [08:18] <hyatt> then it's up to apple to call for exclusion
- # [08:18] <hyatt> within a certain timeframe
- # [08:18] <mjs> so in the w3c, it should not be an issue
- # [08:18] <hyatt> but no way msft is going to touch it until it's clear it is unencumbered
- # [08:18] <mjs> I would love to hear Chris Wilson's comments on it now that it is in a w3c spec
- # [08:19] <hyatt> mjs: it's not in a spec til html wg publishes something i assume
- # [08:19] <karl> hyatt: I was thinking on a more or less the self imposed hixie requirement to have only in the spec things which are implemented by major browser vendors
- # [08:19] <mjs> since he made vague claims that it might be hard to implement in IE or otherwise bad
- # [08:19] <hyatt> karl: most of html5 is implemented by nobody, so i assume he means after a while :)
- # [08:20] * Quits: sbuluf (nweqi@200.49.140.150) (Ping timeout)
- # [08:20] * karl is trying to find a canvas example, reached the page http://developer.mozilla.org/en/docs/Canvas_tutorial:Basic_usage and see that the template at the bottom has no doctype. *sigh*
- # [08:23] <karl> canvas is not implemented in Adobe GoLive CS2. Just tested.
- # [08:25] * karl is doing another test canvas in an XHTML document served as application/xhtml+xml
- # [08:29] <karl> :((( xhtml 1.0 strict served as application/xhtml+xml, canvas is rendered in Camino.
- # [08:30] <karl> and the same in Safari :(((
- # [08:30] <karl> sigh
- # [08:30] * Joins: loic (loic@90.29.174.204)
- # [08:30] <hyatt> ?
- # [08:30] <hyatt> why would it not be rendered
- # [08:32] <karl> because it is not xhtml 1.0 strict
- # [08:32] <hyatt> not following
- # [08:32] <mjs> doctype declarations don't restrict the vocabulary the browser implements
- # [08:33] <hyatt> yeah doctype declarations are meaningless
- # [08:33] <karl> I have tolerance for browsers dealing with tag soup. But I'm less tolerant with browsers screwing the space of other specs.
- # [08:33] <hyatt> other than determining quirks vs. strict mode in html
- # [08:33] <mjs> you can put an <iframe> in your xhtml 1.0 strict document
- # [08:33] <mjs> and it may not validate but it will render fine
- # [08:33] <mjs> or <embed>
- # [08:33] <hyatt> we don't have any versioning regarding XHTML versions or HTML versions
- # [08:33] <hyatt> you get the latest stuff regardless
- # [08:34] <hyatt> neither does ffx or opera
- # [08:34] <karl> I think it should be at least for application/xhtml+xml
- # [08:34] <hyatt> why?
- # [08:34] <xover> That appears to be MSIE's postion, I think, at least.
- # [08:35] <hyatt> until MSIE does application/xhtml+xml, nothing any browser really does is particularly relevant.
- # [08:35] <hyatt> for XHTML.
- # [08:35] <xover> But I think that presupposes a desire for cleaner markup in the wild.
- # [08:35] <mjs> I don't think they've said anything about whether they would offer nonstandard extensions in application/xhtml+xml
- # [08:35] <mjs> just that they would not process it with a tag soup parser
- # [08:37] * Joins: Lachy (Lachlan@210.84.36.41)
- # [08:37] <karl> I wish that browsers vendor ignore the elements which are not part of XHTML 1.0 strict served with application/xhtml+xml.
- # [08:37] <karl> s/vendor/vendors/
- # [08:38] <hyatt> those elements are still part of the DOM tree
- # [08:38] <hyatt> i'm not sure what you want to have happen
- # [08:38] <hyatt> reject the whole document?
- # [08:38] <mjs> rejecting the whole document would violate the spec I think (except maybe in a validating parser, not sure what should happen there)
- # [08:38] <karl> that could be a possibility, I might be the more helpful at least for the web developer.
- # [08:39] <mjs> the spec says what to do with unknown elements
- # [08:39] <karl> s/I might/it might/
- # [08:39] <mjs> but it's not clear if you should treat some elements in the XHTML namespace as sometimes known and other times not known
- # [08:39] <hyatt> yeah that seems strange to me
- # [08:39] <hyatt> and i don't really see any benefit to rejecting newer elements
- # [08:40] <hyatt> the author wouldn't have used them if he didn't want them
- # [08:40] <mjs> in particular, treating XHTML 1.0 Transitional elements that are not in Strict as unknown when you use the Strict doctype declaration would be kinda weird
- # [08:41] <karl> hyatt: an author might have forgotten a namespace
- # [08:41] <karl> for example an XHTML 1.0 document with other elements from a different namespace.
- # [08:41] <mjs> karl: it would be the job of a conformance checker to tell the author about such things
- # [08:42] <xover> Actually, in context of “Draconian Error Handling”, it'd be quite interesting to explore what that actually _means_ in terms of UA behavior.
- # [08:42] <hyatt> seems like a pointless expenditure of energy given how irrelevant XHTML is right now
- # [08:42] <karl> hyatt: it might be irrelevant to you ;) it is not to me :) at all. I use it every day.
- # [08:42] <mjs> XML requires draconian error handling at the parsing level, but it's up to the language to define it at the language level, i.e. what to do for unknown elements or attributes, or bad attribute values
- # [08:42] <xover> For a well-formed and otherwise XML Valid XHTML document instance that just happens to use an unknown element of some stripe.
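The split between parser-level and language-level error handling can be seen with any non-validating XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the helper name `try_parse` is an assumption made for this example. A document that is not well-formed fails hard, while a well-formed document containing an element the vocabulary does not know parses without complaint, leaving the decision about that element to the layer above.

```python
import xml.etree.ElementTree as ET


def try_parse(source):
    """Return ("ok", [tag names in document order]) or ("fatal", message)."""
    try:
        root = ET.fromstring(source)
    except ET.ParseError as err:
        # Well-formedness errors are fatal at the XML level.
        return ("fatal", str(err))
    # Unknown elements are not an XML-level error; they simply appear
    # in the tree and the language decides what they mean.
    return ("ok", [el.tag for el in root.iter()])


# Well-formed, with an element (<canvas>) outside XHTML 1.0 Strict.
print(try_parse("<html><body><canvas/></body></html>"))

# Not well-formed: <p> is never closed.
print(try_parse("<html><p>unclosed</html>")[0])
```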
- # [08:42] <hyatt> all using XHTML gets you is slower parsing, loss of key JS functionality, bugs, and (in some browsers) non-incremental rendering
- # [08:43] <mjs> SVG used to require a visible hard global failure in SVG 1.1 for any bad attribute value or unknown attribute that wasn't namespaced etc
- # [08:43] <mjs> in 1.2 they updated that to unknown things should be ignored
- # [08:43] <mjs> hyatt: apparently the parsing is faster in Mozilla, though that could be solely due to lack of incremental rendering
- # [08:44] <hyatt> mjs: faster than html parsing you mean?
- # [08:44] <hyatt> mjs: if so, wow, their html parsing must suck. :)
- # [08:44] <mjs> hyatt: someone made that claim - I didn't test myself
- # [08:44] * Joins: frippz (fredrikfro@193.11.209.47)
- # [08:45] <xover> What would happen if — purely hypothetical, of course — the parser didn't have to deal with any quirks?
- # [08:45] <hyatt> xover: html? not much
- # [08:45] <hyatt> the quirks aren't a big deal
- # [08:45] <hyatt> they don't hurt perf or anything
- # [08:45] <mjs> if you mean quirks in the sense of "things only done in quirks mode and not in standards mode", there are very few in the parser, at least in webkit
- # [08:45] <mjs> like one or two
- # [08:46] <mjs> and I think they could probably be done in standards mode too
- # [08:46] <hyatt> yeah, we have more style system parser quirks than html
- # [08:46] <xover> If you assume the input is “perfect” XHTML, no need to deal with possible author borkage.
- # [08:47] <mjs> well, XML parsers are already an example of what happens then
- # [08:47] <mjs> although they do have to detect errors so they can fail
- # [08:47] <mjs> sometimes detecting errors is more work than just handling them the same as the non-error case
- # [08:47] <xover> Hmm. And to the degree that either is measurable, detecting it is as expensive as.. ah, right.
- # [08:48] <mjs> I don't think there's an intrinsic simplicity advantage to either HTML or XML parsing; or performance advantage when dealing with conforming content
- # [08:49] <xover> Code Complexity then?
- # [08:49] <mjs> XML has simpler error handling rules (hard failure) but the internal subset and other such things make up for it in added complexity
- # [08:49] <mjs> in any case the parser is a fairly small part of the implementation, all things considered
- # [08:50] <mjs> most of the core code is DOM, CSS, JavaScript and layout
- # [08:50] <mjs> and the parts with the hardest algorithms are JS and rendering/layout
- # [08:50] * xover can well imagine...
- # [08:51] <mjs> DOM does not have too many fancy algorithms required, but it is a fair chunk of code and requires careful thinking to choose the right data structures
- # [08:51] <hyatt> the rendering/layout code will put hair on your chest.
- # [08:51] * hyatt flexes.
- # [08:52] <mjs> hyatt: I want to see you come into work with the top three buttons open and a huge gold chain
- # [08:52] * karl is heading to the html 5 spec to see what is happening when the doctype is not first in the document.
- # [08:52] <xover> Bling bling!
- # [08:53] * Quits: heycam (cam@130.194.72.84) (Quit: bye)
- # [08:53] <hyatt> mjs: only if the theme to shaft is playing
- # [08:53] <mjs> hyatt: I have the CD
- # [08:54] <hyatt> would you walk behind me with a box playing it?
- # [08:54] <xover> «Pimp my...» browser developer?
- # [08:54] <mjs> I don't see any reason to walk behind you
- # [08:55] <xover> You two have just promised to be the entertainment at the first HTML WG F2F!
- # [08:56] <hyatt> mjs: ok walk in front of me
- # [08:56] <hyatt> they should hear the music before i become visible anyway
- # [08:56] <karl> This specification defines the parsing rules for HTML documents, whether they are syntactically valid or not. Certain points in the parsing algorithm are said to be parse errors. The error handling for parse errors is well-defined: user agents must either act as described below when encountering such problems, or must abort processing at the first error that they encounter for which they do not wish to apply the rules described below.
- # [08:56] <karl> http://dev.w3.org/cvsweb/~checkout~/html5/spec/Overview.html?rev=1.47&content-type=text/html;%20charset=iso-8859-1#parse
- # [08:56] <hyatt> karl: yup
- # [08:57] <karl> does that mean "display an error message" when the document is starting with <p>bllabbba</p><!doctype html>
- # [08:58] <hyatt> no
- # [08:58] <hyatt> you either recover from the error as the spec describes
- # [08:58] <hyatt> or if you choose not to follow the spec's error recovery rules
- # [08:58] <mjs> karl: the spec lets you fail catastrophically at the first error or recover as required
- # [08:58] <hyatt> then you can abort processing
- # [08:58] <hyatt> mjs: or somewhere in between
- # [08:58] <mjs> the abort rule is for things like conformance checkers
- # [08:58] <mjs> right, at the first error you don't handle
- # [08:58] <hyatt> you can give up at the first rule at which you choose not to apply html5's recovery rule
- # [08:58] <hyatt> errr first error
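The "recover, or abort at the first error you choose not to recover from" rule could be modeled along these lines. This is a toy sketch under stated assumptions: the event tuples, the error names, and the `AbortParsing` exception are all invented for illustration, and the actual recovery steps are not modeled.

```python
class AbortParsing(Exception):
    """Raised at the first parse error the consumer declines to recover from."""


def process(events, recoverable):
    """Walk (kind, value) events; kind "error" marks a parse error.

    `recoverable` is the set of error names for which this consumer
    applies the spec's recovery rule. A browser would put every error
    in it; a conformance checker might leave it empty.
    """
    recovered = []
    for kind, value in events:
        if kind != "error":
            continue
        if value not in recoverable:
            # Abort at the first error we do not wish to handle.
            raise AbortParsing(value)
        recovered.append(value)  # spec-defined recovery would happen here
    return recovered
```

With an empty `recoverable` set the function stops at the first error, which is the conformance-checker case mentioned above; with every error listed it never aborts, which is the browser case.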
- # [08:59] * karl is searching the section on how to recover
- # [08:59] <karl> The more I read the specification, the more I find it difficult to read
- # [08:59] <mjs> karl: it's tricky to follow because it is defined as a state machine
- # [08:59] <mjs> the parsing section is one of the hardest to just read through
- # [08:59] <hyatt> the parsing section is going to be one of the hardest to just read
- # [09:00] <hyatt> because it's a giant algorithm
- # [09:00] * Quits: gavin_ (gavin@74.103.208.221) (Ping timeout)
- # [09:00] <karl> so I wonder what it is supposed to happen in my case
- # [09:02] <mjs> karl: you would start in section 8.2.3. Tokenisation (charset detection would not find anything interesting) in "Data state"
- # [09:03] <mjs> karl: the initial '<' puts you in "tag open state", and eventually you'd correctly parse the <p>, implicitly opening <html> and <body> along the way (I leave the details as an exercise to the reader)
- # [09:04] <xover> Hmm. Does HTML5 still have the concept of a PI?
- # [09:04] <xover> Or a Markup Declaration?
- # [09:05] * Joins: gavin_ (gavin@74.103.208.221)
- # [09:05] * Quits: hyatt (hyatt@24.6.91.161) (Quit: hyatt)
- # [09:06] <mjs> then the <! would result in a doctype token, which would be a parse error in the main phase
- # [09:06] <xover> Hmm. Apparently the answer is “Not really”.
- # [09:08] <mjs> the defined recovery for the doctype token in the main phase is "Parse error. Ignore the token."
- # [09:08] <mjs> it's all kind of hairy
- # [09:08] <mjs> xover: no, I don't think so
- # [09:09] <karl> hmm I will have to try the parse exercise again. I will have no time in the 20 minutes remaining before leaving to know what is happening with "<p>bllabbba</p><!doctype html>"
- # [09:09] <mjs> xover: though I think parser error recovery would lead to such things being ignored at least
- # [09:10] <karl> it seems to parse it character by character. I wonder how browser would be able to do that without being very slow
- # [09:10] <mjs> karl: I just gave you a summary, and if you would like a sneak peek at the answer, the final DOM will look like this:
- # [09:10] <mjs> <html><body><p>bllabbba</p></body></html>
- # [09:10] <mjs> with a parse error for the stray doctype
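The walkthrough above can be approximated with Python's standard `html.parser` module. This is a rough sketch, not the HTML5 algorithm: `html.parser` is not a spec-conforming tokenizer, the implied `<html>` and `<body>` elements are not modeled, and the class name `StrayDoctypeDemo` and the error wording are assumptions made for the example. It only shows the one decision discussed here: a doctype that arrives after content is recorded as a parse error and otherwise ignored.

```python
from html.parser import HTMLParser


class StrayDoctypeDemo(HTMLParser):
    """Collects tokens; a doctype seen after content is ignored with an error."""

    def __init__(self):
        super().__init__()
        self.seen_content = False
        self.tokens = []
        self.parse_errors = []

    def handle_decl(self, decl):
        if self.seen_content:
            # "Parse error. Ignore the token."
            self.parse_errors.append("stray doctype ignored: " + decl)
        else:
            self.tokens.append(("doctype", decl))

    def handle_starttag(self, tag, attrs):
        self.seen_content = True
        self.tokens.append(("start", tag))

    def handle_endtag(self, tag):
        self.tokens.append(("end", tag))

    def handle_data(self, data):
        if data.strip():
            self.seen_content = True
        self.tokens.append(("text", data))


def parse(source):
    parser = StrayDoctypeDemo()
    parser.feed(source)
    parser.close()
    return parser.tokens, parser.parse_errors


tokens, errors = parse("<p>bllabbba</p><!doctype html>")
print(tokens)
print(errors)
```

The token list contains only the `<p>` element and its text, matching the DOM given above once the implied elements are added; the doctype shows up solely in the error list.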
- # [09:11] <xover> I wonder if the general markup declaration — as the wrapper concept for DOCTYPE and comments — was considered overkill/unnecessary or just forgotten / not considered.
- # [09:11] <karl> yes but I was trying to follow the logic of the specification to understand it.
- # [09:11] <mjs> the lack of <title> is also a conformance error in that document, as is the lack of initial doctype
- # [09:11] <mjs> karl: it's hard to explain over IRC, in person I would point
- # [09:11] <karl> many thanks already for your time mjs
- # [09:11] <karl> it helps
- # [09:12] <karl> what the browser is supposed to do with identified parse errors once it has recreated a dom?
- # [09:12] <karl> http://dev.w3.org/cvsweb/~checkout~/html5/spec/Overview.html?rev=1.47&content-type=text/html;%20charset=iso-8859-1#parse
- # [09:12] <mjs> there's no requirement to report parse errors, but also no requirement not to
- # [09:13] <karl> hhmmmmm "must abort processing at the first error that they encounter for which they do not wish to apply the rules described below."
- # [09:13] <karl> free to implementers :/
- # [09:13] <mjs> there's no requirement to even identify parse errors, if you do the specified recovery
- # [09:13] * Joins: anne (annevk@81.68.67.12)
- # [09:13] <xover> Hixie: Is there a description of the spec “build process” anywhere? (to poke and prod at it, toy with generating PDF versions, etc.)
- # [09:16] <mjs> it's interesting that some things aren't flagged as parse errors in the parsing section, though they are presumably non-conforming
- # [09:16] <mjs> like <head><p>foo
- # [09:16] <mjs> (note lack of </head>)
- # [09:17] <mjs> or is that conforming?
- # [09:17] <karl> I asked all the question about code before doctype
- # [09:17] <xover> The general case is an implied end tag?
- # [09:17] <karl> because there are geocities like sites which inject code before serving the page
- # [09:18] <mjs> karl: the effect in most browsers and in the HTML5 spec is that the doctype is just ignored in that case
- # [09:18] <xover> Hmm. That's an interesting use case.
- # [09:18] <mjs> karl: so it would put the page in quirks mode, unless the site also injects a doctype
- # [09:20] <anne> mjs, that's conforming
- # [09:20] <anne> mjs, </head> and <body> are optional
- # [09:20] <anne> (except that it's not conforming because you didn't specify <title>
- # [09:20] <anne> but it wouldn't cause parse errors)
- # [09:21] <karl> anne: in html5lib python, do you parse character by character as the algorithm says in html 5 spec?
- # [09:21] <xover> End tags in general are optional, or only for a specific set of elements?
- # [09:21] <mjs> anne: got it
- # [09:22] <mjs> xover: same end tags are optional as are optional in HTML4.01, I think
- # [09:22] <anne> karl, we have "optimizations" in place
- # [09:22] <anne> I'm not sure actually the above bug is actually a bug by the way...
- # [09:23] <xover> Hmm. By the same rationale, or by default?
- # [09:24] <anne> Yeah, maybe it is...
- # [09:24] <anne> http://www.whatwg.org/specs/web-apps/current-work/#append doesn't say what I thought it would
- # [09:24] <karl> :)
- # [09:24] <mjs> xover: you have to act as if the same elements have implicit open and close tags to parse legacy documents correctly, and making use of this non-conforming would probably be unhelpful
- # [09:24] <mjs> at least that's what I imagine
- # [09:25] * karl wonders if it would be good to impose very strict rules on authoring tools.
- # [09:26] <xover> I took anne to mean that the spec makes some or all end tags optional in conforming documents?
- # [09:26] * Joins: edas (edaspet@88.191.34.123)
- # [09:26] * xover suspects authoring tools will behave much like browsers in that regard...
- # [09:26] * anne was replying to the bug in the table section with whitespace versus character tokens
- # [09:27] <xover> «If the browsers support it, the users will want it, so we have to be able to emit it.»
- # [09:27] <karl> re *sigh*
- # [09:27] <karl> that will be my sigh day ;)
- # [09:27] <mjs> karl: the spec requires authoring tools to generate only conforming documents
- # [09:27] <anne> xover, some end tags are optional even for conforming documents
- # [09:28] <mjs> xover: the </head> end tag is optional, as is the </p> end tag, I think those are both true in HTML 4.01 as well
- # [09:28] <mjs> but </div> is not optional
- # [09:29] <mjs> or </script>
- # [09:29] <xover> Ah, I just realised I misread mjs' earlier comment.
- # [09:30] <xover> “…make (use of)…” not “…(make use of)…”
- # [09:30] <mjs> ah, right
- # [09:30] <xover> heh heh, sorry
- # [09:30] <anne> karl, the conformance requirements on authoring tools are quite strict
- # [09:31] * Quits: karl (karlcow@128.30.52.30) (Quit: Where dwelt Ymir, or wherein did he find sustenance?)
- # [09:31] <mjs> I didn't double-check but I think the same set of tags have implicit close or open as in HTML 4.01
- # [09:31] <mjs> for compatibility w/ parsing in older browsers
- # [09:31] <anne> yeah
- # [09:31] <anne> although we might change the <p><table> case
- # [09:32] <mjs> though I think there might be new elements w/ empty content model which therefore don't need close tags
- # [09:34] <xover> Given the need to support self-closing tags for compat, wouldn't it make sense to generalize that usage for empty elements?
- # [09:34] * xover just thinking out loud...
- # [09:35] <anne> <img /> is allowed
- # [09:35] <mjs> self-closing tags are allowed, for empty elements only
- # [09:35] <anne> <html xmlns=http://www.w3.org/1999/xhtml> is allowed even (even without the quotes)
- # [09:36] <anne> although xmlns will not end up in the right namespace and such of course
- # [09:36] <mjs> (for other elements they'd behave different between HTML and XHTML and so are non-conforming)
- # [09:36] <xover> Hmm.
- # [09:37] <anne> nothing to hmm about
- # [09:37] <anne> this is all mostly a solved problem
- # [09:37] * Quits: Lachy (Lachlan@210.84.36.41) (Quit: Leaving)
- # [09:37] * Quits: MikeSmith (MikeSmith@mcclure.w3.org) (Ping timeout)
- # [09:37] <mjs> in HTML 4.01, if you take it to be an SGML application, <div /> would be allowed but means something different from what it means in XML
- # [09:38] <xover> “Hmm.” -> “/me must think on this to grok it more fully” :-)
- # [09:38] <mjs> (in fact it means three different things according to SGML, the way browsers actually parse HTML, and XML)
- # [09:39] <mjs> (so it's kind of bad that a fully per-spec HTML 4.01 validator would not flag it)
- # [09:39] <xover> ottomh, SGML and XML should be fairly equivalent for this case.
- # [09:40] <anne> <br/> in SGML is <br>\n>
- # [09:40] <anne> that's quite different from XML
- # [09:40] <xover> Hmm. Well, actually, no, not ... right.
- # [09:42] * xover curses markup minimization...
- # [09:43] <xover> Well, /me is off to `ork. Thanks for the interesting discussions all!
- # [09:43] * xover wanders off...
- # [09:51] * anne wonders why karl argued for versioning in HTML earlier...
- # [09:57] <anne> btw, technically the spec doesn't define <p>foobar</p><!doctype html> yet
- # [09:57] <anne> it requires a doctype first
- # [10:05] <mjs> that's true
- # [10:06] <mjs> if UAs chose to recover from lack of doctype by just ignoring it then it would be processed as I said
- # [10:07] <anne> the spec should define both modes in due course imo
- # [10:07] <anne> including doctype sniffing
- # [10:07] <anne> and then hopefully we can make it a single mode
- # [10:07] <anne> and keep the doctype stuff for "minimal" rendering differences
- # [10:09] <mjs> I thought leaving the missing/older/unknown doctype situation undefined was on purpose, for example to allow IE to implement HTML5 parsing for HTML5 but still parse HTML4 the old way
- # [10:10] <anne> hmm, maybe that needs to be a separate spec then :(
- # [10:10] * anne would love to have all of the web defined
- # [10:12] <mjs> I think a spec defining how to detect quirks mode, the rendering and CSS parsing differences in quirks mode, and parsing in absence of proper HTML5 doctype (ideally just the same as normal HTML5 parsing) would be useful
- # [10:12] * Quits: olivier (ot@128.30.52.30) (Quit: Leaving)
- # [10:15] * Joins: ROBOd (robod@86.34.246.154)
- # [10:18] * hsivonen thinks the spec for doctype sniffing should be dbaron's implementation ported to English, since hyatt already ported it to WebKit and Opera almost matches (apparently by blackbox reverse engineering, since they don't match exactly in corner cases)
- # [10:18] * Joins: Lachy (Lachlan@210.84.36.41)
- # [10:18] * Quits: Lachy (Lachlan@210.84.36.41) (Client exited)
- # [10:22] <mjs> Opera 9 seems pretty close according to your table
- # [10:22] <hsivonen> mjs: yes (except corner cases)
- # [10:22] <mjs> that chart really makes me think that CSS should change to make Almost Standards mode the standard
- # [10:22] <anne> I think we devised our own algorithm based on real world needs
- # [10:22] <anne> mjs, yeah
- # [10:22] <anne> +1
- # [10:23] <mjs> Opera 9 seems to match Gecko and WebKit a lot more closely than earlier Operas
- # [10:23] <mjs> (formerly I guess it was closer to IE)
- # [10:24] <hsivonen> anne: really? why does it match Gecko in places where the Gecko has apple.com-in-2000-induded weirdness and doesn't match for ISO HTML which doesn't have real-world relevance
- # [10:24] <hsivonen> ?
- # [10:24] <hsivonen> s/indudud/induced/
- # [10:24] <anne> I don't know and can't share our algorithm
- # [10:27] <hsivonen> mjs: https://bugzilla.mozilla.org/show_bug.cgi?id=78208
- # [10:27] <mjs> you mean because you haven't asked permission yet or because the specific algorithm is considered an important trade secret?
- # [10:30] <mjs> interesting to read dbaron circa 2001
- # [10:31] <anne> Haven't asked
- # [10:32] <anne> mjs, I think UA conformance requirements probably require you to treat <style scoped> the same regardless of context
- # [10:34] <mjs> hsivonen: interesting how strident people were in those comments
- # [10:34] <mjs> we've never been that anal for Safari, I guess because we considered quirks mode and standards mode to actually be "copy IE bugs" mode and "copy mozilla bugs" mode respectively
- # [10:34] <mjs> (ok, I exaggerate a bit)
- # [10:35] <mjs> anne: right, I just mean that for conforming content, <style scoped> doesn't create style reapplication issues
- # [10:36] <mjs> so hyatt's change would only make a significant difference for nonconforming content, in which case you might see a non-scoped <style> in the body and be screwed anyway
- # [10:36] <anne> <i>&heart;5
- # [10:38] * hsivonen should blog the story of https://bugzilla.mozilla.org/show_bug.cgi?id=42525 sometime.
- # [10:38] <hsivonen> now it has been a while so there's no need to hide the bug from reopeners any longer
- # [10:47] * Quits: anne (annevk@81.68.67.12) (Client exited)
- # [10:47] <mjs> I couldn't make it through to the end of the comments there
- # [10:47] * Joins: anne (annevk@81.68.67.12)
- # [10:47] * Quits: anne (annevk@81.68.67.12) (Client exited)
- # [10:47] * Joins: anne (annevk@81.68.67.12)
- # [10:47] * Quits: anne (annevk@81.68.67.12) (Client exited)
- # [10:48] * Joins: anne (annevk@81.68.67.12)
- # [10:48] <hsivonen> mjs: long story short, what looks illogical in the doctype chart was done to keep apple.com in the quirks mode at the time
- # [10:48] * Quits: anne (annevk@81.68.67.12) (Client exited)
- # [10:48] * Joins: anne (annevk@81.68.67.12)
- # [10:51] <mjs> wow, apple was breaking the web before even having a browser
- # [10:53] <hsivonen> mjs: and it was the same issue that is the difference between the Standards Mode and the Almost Standards mode today
- # [10:56] <mjs> hsivonen: a colorful history there
- # [10:56] * Joins: MikeSmith (MikeSmith@mcclure.w3.org)
- # [11:08] * Quits: gavin_ (gavin@74.103.208.221) (Ping timeout)
- # [11:13] * Joins: gavin_ (gavin@74.103.208.221)
- # [11:23] <anne> good point mjs
- # [11:23] <anne> we should point out more often that syntax errors are hardly interesting
- # [11:23] <anne> and that checking all is impossible
- # [11:23] <hsivonen> anne: do you offer a zip file of the Web Forms 2.0 suite over HTTP? would you prefer me spidering it or would you prefer me the hit the server only for a zip file?
- # [11:23] * anne doesn't think many people realize how this stuff actually works
- # [11:23] <hsivonen> s/the hit/to hit/
- # [11:24] <anne> hsivonen, at this point spidering as I don't have to do anything :)
- # [11:24] <hsivonen> anne: ok
- # [11:24] <anne> i think we can take the bandwidth usage too :)
- # [11:31] * Joins: heycam (cam@203.214.6.6)
- # [11:49] <MikeSmith> heycam - I see that you're going to be in Tokyo at the end of August?
- # [12:16] * anne wonders where zcorpan is
- # [12:16] * anne spotted a mistake in his presentation
- # [12:16] <anne> while copying the google suggest example 8-)
- # [12:17] * Joins: tH_ (Rob@87.102.91.218)
- # [12:18] * tH_ is now known as tH
- # [12:37] * anne likes the <noscript><link rel=stylesheet></noscript> usecase
- # [12:37] <anne> <noscript><base href=evil.com></noscript>
- # [12:37] <anne> hah
- # [13:09] * Parts: anne (annevk@81.68.67.12)
- # [13:24] * Joins: anne (annevk@81.68.67.12)
- # [13:28] * Parts: anne (annevk@81.68.67.12)
- # [13:33] <heycam> MikeSmith, yep i will be
- # [13:33] <heycam> for svg f2f + svg open
- # [14:16] * Quits: marcos (chatzilla@131.181.148.226) (Ping timeout)
- # [14:27] * Quits: beowulf (carisenda@91.84.50.132) (Ping timeout)
- # [14:38] * Joins: beowulf (carisenda@91.84.50.132)
- # [14:59] * Joins: AGraf|mb (Ashe@138.232.65.129)
- # [15:10] * Quits: MikeSmith (MikeSmith@mcclure.w3.org) (Ping timeout)
- # [15:12] * Joins: zcorpan_ (zcorpan@84.216.41.246)
- # [15:36] * Quits: AGraf|mb (Ashe@138.232.65.129) (Quit: Quit)
- # [15:38] * Quits: gavin_ (gavin@74.103.208.221) (Ping timeout)
- # [15:44] * Joins: gavin_ (gavin@74.103.208.221)
- # [15:51] * Joins: AGraf|mb (Ashe@138.232.245.27)
- # [15:56] * Joins: DanC_lap (connolly@128.30.52.30)
- # [16:09] * Quits: AGraf|mb (Ashe@138.232.245.27) (Client exited)
- # [16:31] * Quits: DanC_lap (connolly@128.30.52.30) (Ping timeout)
- # [16:35] * Joins: billmason (billmason@69.30.57.156)
- # [17:16] * Joins: kazuhito (kazuhito@222.151.153.231)
- # [17:19] * Quits: zcorpan_ (zcorpan@84.216.41.246) (Ping timeout)
- # [17:22] * Joins: zcorpan_ (zcorpan@84.216.41.246)
- # [17:26] * Joins: hasather (hasather@81.235.209.174)
- # [17:42] * Joins: MikeSmith (MikeSmith@mcclure.w3.org)
- # [17:44] * Quits: kazuhito (kazuhito@222.151.153.231) (Quit: Quitting!)
- # [17:46] * Quits: gavin_ (gavin@74.103.208.221) (Ping timeout)
- # [17:52] * Joins: gavin_ (gavin@74.103.208.221)
- # [17:56] * Joins: DanC_lap (connolly@128.30.52.30)
- # [18:06] * Quits: edas (edaspet@88.191.34.123) (Ping timeout)
- # [18:15] * Joins: Sander (svl@71.57.109.108)
- # [18:42] * Joins: h3h (bfults@66.162.32.234)
- # [18:45] * Joins: edas (edaspet@88.191.34.123)
- # [18:50] * Quits: loic (loic@90.29.174.204) (Ping timeout)
- # [19:05] * Joins: loic (loic@90.41.7.67)
- # [19:30] * Joins: kingryan (rking3@208.66.64.47)
- # [19:54] * Quits: gavin_ (gavin@74.103.208.221) (Ping timeout)
- # [19:58] * Joins: hyatt (hyatt@24.6.91.161)
- # [19:59] * Joins: gavin_ (gavin@74.103.208.221)
- # [21:13] * Quits: xover (xover@193.157.66.5) (Ping timeout)
- # [21:16] * Joins: xover (xover@193.157.66.5)
- # [21:41] * Quits: DanC_lap (connolly@128.30.52.30) (Ping timeout)
- # [21:41] * Quits: mjs (mjs@64.81.48.145) (Quit: mjs)
- # [21:50] * Quits: loic (loic@90.41.7.67) (Quit: hoopa rules)
- # [21:57] * Quits: hyatt (hyatt@24.6.91.161) (Quit: hyatt)
- # [22:01] * Quits: gavin_ (gavin@74.103.208.221) (Ping timeout)
- # [22:02] * Quits: gsnedders (gsnedders@86.139.123.225) (Quit: Don't touch /dev/null…)
- # [22:02] * Joins: hyatt (hyatt@24.6.91.161)
- # [22:06] * Joins: gavin_ (gavin@74.103.208.221)
- # [22:07] * Quits: mw22 (chatzilla@84.41.169.151) (Ping timeout)
- # [22:10] * Joins: mw22 (chatzilla@84.41.169.151)
- # [22:13] * Quits: ROBOd (robod@86.34.246.154) (Quit: http://www.robodesign.ro )
- # [22:30] * Joins: mjs (mjs@66.245.248.74)
- # [22:37] * Quits: Hixie (ianh@129.241.93.37) (Client exited)
- # [22:41] * Quits: mjs (mjs@66.245.248.74) (Ping timeout)
- # [22:44] * Joins: DanC_lap (connolly@128.30.52.30)
- # [22:48] * Joins: gsnedders (gsnedders@86.139.123.225)
- # [22:49] * Joins: Zeros (Zeros-Elip@67.154.87.254)
- # [22:56] * Quits: Zeros (Zeros-Elip@67.154.87.254) (Quit: Leaving)
- # [23:10] * Joins: Hixie (ianh@129.241.93.37)
- # [23:16] * Quits: Hixie (ianh@129.241.93.37) (Client exited)
- # [23:16] * Joins: asbjornu (asbjorn@84.48.116.134)
- # [23:26] * Joins: Hixie (ianh@129.241.93.37)
- # [23:59] * Joins: nickshanks (nicholas@195.137.85.17)
- # Session Close: Thu May 31 00:00:00 2007