/irc-logs / freenode / #whatwg / 2007-06-18 / end

  1. # Session Start: Mon Jun 18 00:00:00 2007
  2. # Session Ident: #whatwg
  3. # [00:01] * Joins: tantek (n=tantek@kci-host51.cust.wifi.sprintpcs.com)
  4. # [00:02] * Quits: tantek (n=tantek@kci-host51.cust.wifi.sprintpcs.com) (Client Quit)
  5. # [00:24] <weinigLap> Hixie: you around?
  6. # [00:25] <weinigLap> Hixie: I am a little confused about the Location object in the HTML5 spec
  7. # [00:25] <weinigLap> Hixie: are the attributes supposed to be readonly?
  8. # [00:35] <Dashiva> weinigLap: They aren't actually readonly, as setting them maps to the assign() method
  9. # [00:36] * weinigLap nods
  10. # [00:36] <weinigLap> hence my confusion
  11. # [00:37] <weinigLap> Dashiva: do you know why they are labeled readonly in the interface definition?
  12. # [00:37] <Dashiva> The attributes are read-only
  13. # [00:37] <Dashiva> You can't change them. However, setting them is treated as an implicit call to change the current location
  14. # [00:38] <weinigLap> ah
  15. # [00:38] <weinigLap> weird
  16. # [00:38] <weinigLap> thanks though
  17. # [00:59] <Hixie> weinigLap: i don't think they should be readonly... send mail?
  18. # [01:00] <weinigLap> Hixie: pardon?
  19. # [01:00] <weinigLap> oh, send you an email, my bad
  20. # [01:00] <Hixie> sorry, yeah, i meant send mail to whatwg@whatwg.org to report the error :-)
  21. # [01:00] <Hixie> hsivonen: yt?
  22. # [01:01] <weinigLap> Hixie: doing it now, thanks
  23. # [01:02] <Hixie> np
  24. # [01:02] <Hixie> thank you!
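
The behaviour Dashiva describes above (assigning to a Location attribute is treated as an implicit call to assign()) can be sketched roughly as follows. This is only a toy model: the `Location` class, its `history` list and the example URLs are assumptions made for illustration, not the DOM interface or any browser's implementation.

```python
class Location:
    """Toy stand-in for a Location-like object (hypothetical, not the DOM API)."""

    def __init__(self, href):
        self._href = href
        self.history = [href]  # illustrative record of every navigation

    def assign(self, url):
        # Navigating replaces the current address and records it.
        self._href = url
        self.history.append(url)

    @property
    def href(self):
        return self._href

    @href.setter
    def href(self, url):
        # Writing to the attribute is routed through assign().
        self.assign(url)


loc = Location("http://a.example/")
loc.href = "http://b.example/"  # behaves like loc.assign("http://b.example/")
```

In this model the attribute is writable from the caller's point of view, which is why labelling it readonly in the interface definition reads as confusing.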
  25. # [01:20] * Quits: mpt (n=mpt@121-72-128-43.dsl.telstraclear.net) ("This computer has gone to sleep")
  26. # [01:22] * Joins: mpt (n=mpt@121-72-128-43.dsl.telstraclear.net)
  27. # [02:04] * Joins: karlUshi (n=karl@dhcp-247-173.mag.keio.ac.jp)
  28. # [02:27] * om_out is now known as othermaciej
  29. # [02:54] * mpt wonders if <image/svg>, <application/mathml+xml>, etc would work
  30. # [02:59] <othermaciej> would work in what?
  31. # [03:12] <mpt> HTML
  32. # [03:13] <mpt> instead of having the HTML specification containing yet another registry for document types
  33. # [03:15] * Quits: weinigLap (i=weinig@nat/apple/x-1a70b2ec233063fb)
  34. # [03:22] <othermaciej> you intend those to be tag names?
  35. # [03:23] * Quits: Lfe (n=lfe@bergstroem.nu) ("brb")
  36. # [03:23] <othermaciej> I don't think that generalizes, some interesting XML languages don't have a specific MIME type assigned
  37. # [03:25] * othermaciej is now known as om_coffee
  38. # [04:02] * om_coffee is now known as othermaciej
  39. # [04:51] * Quits: deltab (n=deltab@82-36-30-34.cable.ubr02.smal.blueyonder.co.uk) (Read error: 104 (Connection reset by peer))
  40. # [04:52] * Joins: deltab (n=deltab@82-36-30-34.cable.ubr02.smal.blueyonder.co.uk)
  41. # [04:54] * Quits: markp (n=mark@adsl-150-130-153.rmo.bellsouth.net) (Remote closed the connection)
  42. # [05:09] * othermaciej is now known as om_food
  43. # [05:09] * Joins: weinigLap (i=weinig@nat/apple/x-57cb914575e6620b)
  44. # [05:26] * moeffju is now known as moeffju[ZzZz]
  45. # [05:27] * Joins: Wolfman2000 (n=Wolfman2@wvh5348rn.rh.ncsu.edu)
  46. # [05:28] <Wolfman2000> Evening. Is there a link that shows what is planned to be deprecated in HTML5?
  47. # [05:36] * Joins: Lfe (n=lfe@bergstroem.nu)
  48. # [06:00] <mpt> Wolfman2000, http://dev.w3.org/cvsweb/~checkout~/html5/html4-differences/Overview.html#dropped-elements
  49. # [06:01] <Wolfman2000> this helps: thanks
  50. # [06:04] * Quits: weinigLap (i=weinig@nat/apple/x-57cb914575e6620b) (Read error: 104 (Connection reset by peer))
  51. # [06:11] * Joins: weinigLap (i=weinig@nat/apple/x-3dac8f5bc5513792)
  52. # [06:12] * Joins: MikeSmith (n=MikeSmit@58.157.21.205)
  53. # [06:14] * Joins: aroben (n=adamrobe@c-69-142-103-232.hsd1.pa.comcast.net)
  54. # [07:19] * Quits: csarven (n=nevrasc@modemcable081.152-201-24.mc.videotron.ca)
  55. # [07:46] <annevk> yeah, XBL doesn't have a MIME type
  56. # [07:50] * Quits: MikeSmith (n=MikeSmit@58.157.21.205) (Excess Flood)
  57. # [07:50] * Joins: MikeSmith (n=MikeSmit@58.157.21.205)
  58. # [07:57] * Quits: weinigLap (i=weinig@nat/apple/x-3dac8f5bc5513792)
  59. # [07:57] * om_food is now known as othermaciej
  60. # [08:13] <hsivonen> Hixie: I'm awake now.
  61. # [08:19] * Quits: aroben (n=adamrobe@c-69-142-103-232.hsd1.pa.comcast.net)
  62. # [08:21] * Joins: aroben (n=adamrobe@c-69-142-103-232.hsd1.pa.comcast.net)
  63. # [08:21] * Quits: aroben (n=adamrobe@c-69-142-103-232.hsd1.pa.comcast.net) (Read error: 54 (Connection reset by peer))
  64. # [08:21] * Joins: aroben (n=adamrobe@c-69-142-103-232.hsd1.pa.comcast.net)
  65. # [08:22] * Quits: aroben (n=adamrobe@c-69-142-103-232.hsd1.pa.comcast.net) (Client Quit)
  66. # [08:22] * Joins: aroben (n=adamrobe@c-69-142-103-232.hsd1.pa.comcast.net)
  67. # [08:22] * Joins: tantek (n=tantek@adsl-63-195-114-133.dsl.snfc21.pacbell.net)
  68. # [08:26] * Quits: aroben (n=adamrobe@c-69-142-103-232.hsd1.pa.comcast.net) (Remote closed the connection)
  69. # [08:27] * Joins: aroben (n=adamrobe@c-69-142-103-232.hsd1.pa.comcast.net)
  70. # [08:36] <annevk> hsivonen, why is < more special than & or " or '?
  71. # [08:37] <annevk> they're all non-conforming in the end
  72. # [08:38] <annevk> jgraham, you around?
  73. # [08:43] * Joins: weinigLap (n=weinig@c-67-188-78-122.hsd1.ca.comcast.net)
  74. # [08:47] <hsivonen> annevk: for unquoted attributes, no, they aren't all non-conforming in the end
  75. # [08:49] <annevk> attribute values?
  76. # [08:49] <annevk> oh, yeah
  77. # [08:49] <hsivonen> annevk: < is special, because it makes it look like a new tag is starting. but one isn't
  78. # [08:49] <hsivonen> annevk: so conformance checkers should be able to flag it for authors who are going WTF
  79. # [08:49] <annevk> title=2<5
  80. # [08:49] <annevk> there's a use case :)
  81. # [08:49] <hsivonen> annevk: also, it should be non-conforming to keep conforming docs reasonably safe for shipped Gecko and WebKit
  82. # [08:49] <annevk> <a title=2<5> already works in Firefox
  83. # [08:49] <hsivonen> oh
  84. # [08:49] <annevk> I think "<" is only special cased in some other states in at least Firefox
  85. # [08:50] <hsivonen> <a <title=2> doesn't
  86. # [08:50] <hsivonen> annevk: good point
  87. # [08:51] <hsivonen> anyway, I added warnings while I was at it in case Hixie disagrees about making it an error
  88. # [08:52] <annevk> I don't think it should be more of an error than a Unicode character that looks like "a" or something
  89. # [08:53] <hsivonen> I wonder how often Russians enter lookalikes by accident
  90. # [08:55] <hsivonen> but yeah, I wasn't properly thinking about some of those cases getting caught on a higher layer
  91. # [08:56] <annevk> I'm going to squash that bug in html5lib now I think
  92. # [08:57] <annevk> And fix all the tests...
  93. # [08:57] <karlUshi> <a title=q<p>math proposal</a>
  94. # [08:57] <hsivonen> annevk: please consider applying my patches for the tests before making other changes that prevent the patches from applying
  95. # [08:59] <annevk> at some point we should sort out math and ruby
  96. # [08:59] <annevk> hsivonen, can't you commit them yourself?
  97. # [08:59] <hsivonen> annevk: AFAIK, no
  98. # [09:00] <karlUshi> <a title=/b</a>math proposal</a>
  99. # [09:00] * Quits: Lachy (n=Lachlan@203-158-59-119.dyn.iinet.net.au) (Read error: 104 (Connection reset by peer))
  100. # [09:00] * Joins: Lachy (n=Lachlan@203-158-59-119.dyn.iinet.net.au)
  101. # [09:03] <annevk> indeed you can't
  102. # [09:03] <Hixie> hsivonen: you sent a mail about EOF having been dropped recently at some point from some sections in the tokeniser
  103. # [09:03] <Hixie> hsivonen: did you see if they actually got dropped? i think i may just never have had them!
  104. # [09:05] <hsivonen> Hixie: diff tells me you had them and dropped them
  105. # [09:07] <Hixie> huh
  106. # [09:07] <Hixie> any idea when?
  107. # [09:08] <Hixie> i'd love to bring them back exactly as they were
  108. # [09:08] <Hixie> totally wasn't my intention to drop them
  109. # [09:08] <hsivonen> Hixie: had them on June 12. not anymore on June 17
  110. # [09:08] <Hixie> ok, cool, thanks
  111. # [09:08] <Hixie> that'll help
  112. # [09:08] <hsivonen> Hixie: probably part of rev 886
  113. # [09:08] <Hixie> the reason i was looking for you earlier was to ask you what the use case for embedded svg was
  114. # [09:08] <annevk> maybe EOF and < were handled in the same way...
  115. # [09:08] <hsivonen> (rev # from off the top of my head)
  116. # [09:09] <hsivonen> annevk: they were
  117. # [09:09] <Hixie> EOF isn't in the diff for 866
  118. # [09:09] <Hixie> er, 886
  119. # [09:11] <hsivonen> Hixie: the use case is including diagrams or graphs
  120. # [09:11] <Hixie> aha
  121. # [09:11] <hsivonen> Hixie: with application/xhtml+xml you can include them inline
  122. # [09:11] <Hixie> 899
  123. # [09:12] <hsivonen> Hixie: oh. sorry. I was thinking of another recent rev
  124. # [09:12] <Wolfman2000> Hixie: sounds like you know a lot about this HTML 5. Do you have any idea when it will eventually take over HTML 4?
  125. # [09:12] <hsivonen> Hixie: so not being able to include them inline in text/html is a feature parity bug between the serializations
  126. # [09:12] <Hixie> Wolfman2000: i wrote html5 :-) how do you mean, take over?
  127. # [09:13] <Wolfman2000> There are some people in other channels that are worried of the progress of HTML 5. They wonder when/if HTML 5 will become recommended over HTML 4.01 Strict.
  128. # [09:13] <hsivonen> Hixie: the fact that Jacques Distler, Sam Ruby and I intuitively want to include them inline when possible suggests that it is something we think there's a point to
  129. # [09:13] <Hixie> Wolfman2000: it won't be finished for many years
  130. # [09:14] <othermaciej> including XBL inline in HTML is surely useful
  131. # [09:14] <Hixie> Wolfman2000: then again, html4 isn't really finished yet either
  132. # [09:14] <othermaciej> and including SVG in inline XBL bindings for HTML is surely useful
  133. # [09:14] <Wolfman2000> Also, I have a demonstration page about a potentially valid use for target="_top". Let me get it up on the server.
  134. # [09:14] <Hixie> Wolfman2000: so...
  135. # [09:14] <Wolfman2000> Hixie: ...html4 isn't?
  136. # [09:14] <hsivonen> Hixie: if the diagram or graph is not shared between multiple docs and isn't binary, there's really no good reason not to if putting it inline is possible
  137. # [09:14] <Hixie> Wolfman2000: it's full of bugs and errors (e.g. it says that media=screen is the default, not media=all)
  138. # [09:14] <hsivonen> s/is possible/if possible/
  139. # [09:15] <hsivonen> s/if putting/put/
  140. # [09:15] * hsivonen hasn't properly woken up yet
  141. # [09:16] <annevk> Wolfman2000, contrary to other implementations, HTML5 is driven by implementation
  142. # [09:16] <Wolfman2000> Anyway, the page I mentioned: http://courses.ncsu.edu/csc234/lec/651/jaf_index.html This page won't stay up forever: it's a design that has become retired due to lack of proper disabled support.
  143. # [09:16] <Hixie> hsivonen: should we also allow XBM inline? (just asking to find out where you think the boundary lies)
  144. # [09:16] <annevk> Wolfman2000, so when the spec is done, it will be properly implemented
  145. # [09:16] <annevk> Hixie, what's XBM?
  146. # [09:16] <hsivonen> Hixie: I don't know what XBM is
  147. # [09:16] <Lachy> did you mean XBL?
  148. # [09:17] <Hixie> Wolfman2000: can you send your suggestion / help to me by e-mail? ian@hixie.ch (or whatwg@whatwg.org if you are subscribed)
  149. # [09:17] <Wolfman2000> I'm not subscribed yet.
  150. # [09:17] <Hixie> hsivonen: a text bitmap format
  151. # [09:17] <Wolfman2000> What's the best way to subscribe?
  152. # [09:17] <Hixie> hsivonen: i would have said PNG but that's not a text format
  153. # [09:17] <Hixie> Wolfman2000: http://whatwg.org/mailing-list
  154. # [09:18] <hsivonen> Hixie: but in general I'd be ok with being able to include any namespaced stuff with prefixes and optimize SVG and MathML to work without prefixes
  155. # [09:18] <karlUshi> monochrome bitmap
  156. # [09:18] <Hixie> hsivonen: hm interesting
  157. # [09:18] <othermaciej> XBM is not an XML language
  158. # [09:18] <hsivonen> Hixie: I guess we shouldn't do XBM because it wouldn't work by just hacking the parser
  159. # [09:18] <othermaciej> or indeed a markup language
  160. # [09:18] <othermaciej> you could put it in a data: URL I guess
  161. # [09:19] <Wolfman2000> awaiting confirmation email
  162. # [09:19] <karlUshi> http://en.wikipedia.org/wiki/XBM
  163. # [09:19] <hsivonen> Hixie: but putting SVG or MathML in the DOM already works, so fixing the parser is relatively low-hanging fruit compared to generalizing to XBM
  164. # [09:19] <Hixie> hsivonen: well fwiw i personally think it'd be great to have a math format and a vector graphics format in html. i think it would be a huge amount of work, though, and i think it would be highly controversial (so i don't plan on doing it anytime soon)
  165. # [09:19] <Wolfman2000> ...on confirming the subscription request, can I use my preferred internet name instead of my real name?
  166. # [09:19] <Hixie> Wolfman2000: yes
  167. # [09:20] <Wolfman2000> Then consider me subscribed.
  168. # [09:20] <hsivonen> Hixie: well, pushing text/html and *not* being able to include arbitrary namespaces is also controversial to some
  169. # [09:21] <hsivonen> Hixie: and I'd expect SVG in text/html to have all the same warts as SVG in application/xhtml+xml
  170. # [09:22] <Wolfman2000> ...so now that I'm in, all I do is send email to whatwg@whatwg.org and everyone sees it, right?
  171. # [09:22] <Hixie> hsivonen: i think general svg-like or mathml-like syntax in html has reasonably straightforward ways of being done (far from easy, but at least not technically difficult)
  172. # [09:22] * Wolfman2000 hasn't done this in awhile.
  173. # [09:22] <hsivonen> Hixie: so fixing SVG to WHATWG quality is a more general problem than enabling it in parsing
  174. # [09:22] <Hixie> hsivonen: i think a general-purpose namespaces system would be practically infeasible though in text/html
  175. # [09:22] <Hixie> Wolfman2000: yup
  176. # [09:23] <hsivonen> Hixie: would it be infeasible to hard-wire prefixes known to date and allowing the declaration of unknown prefixes?
  177. # [09:23] <Hixie> hsivonen: i'm not including svg 1.1 in html5 unchanged (e.g. requiring xlink namespace prefixes), that would just be missing a massive opporunity
  178. # [09:23] <Hixie> opportunity
  179. # [09:23] <hsivonen> (I am aware that the list of known prefixes to date is long)
  180. # [09:24] <Hixie> hsivonen: it couldn't be done just by using prefixes, that would have all kinds of issues (e.g. prefixes already do weird things in IE)
  181. # [09:24] <hsivonen> Hixie: more weird than what the obvious prefix bindings would do?
  182. # [09:24] <Hixie> hsivonen: i don't really have any interest in a simplistic solution that just shoehorns XML syntax into text/html to be honest
  183. # [09:25] <Hixie> hsivonen: i don't see the advantage and the costs can be great
  184. # [09:25] <hsivonen> Hixie: yeah, a general-purpose system would be more about fulfilling a bullet point
  185. # [09:25] <Hixie> hsivonen: but anyway, this is something that's on the cards already
  186. # [09:26] <hsivonen> Hixie: but special-casing SVG and MathML still has a point, I think
  187. # [09:26] <hsivonen> Hixie: ok
  188. # [09:28] * Quits: tantek (n=tantek@adsl-63-195-114-133.dsl.snfc21.pacbell.net)
  189. # [09:28] <Wolfman2000> Email has been sent, webpage link included.
  190. # [09:32] * Quits: karlUshi (n=karl@dhcp-247-173.mag.keio.ac.jp) ("Where dwelt Ymir, or wherein did he find sustenance?")
  191. # [09:43] <Wolfman2000> ...great. I just received an email saying my email to the group got...well, bounced. It's awaiting moderator approval.
  192. # [09:43] * Wolfman2000 thought he just signed up.
  193. # [09:43] <annevk> did you e-mail to the list you signed up for?
  194. # [09:43] <annevk> there are four lists
  195. # [09:45] <Wolfman2000> I thought I signed up to the right list. I then emailed whatwg@whatwg.org
  196. # [09:45] <Wolfman2000> ...oh crap. I signed up for Implementors.
  197. # [09:46] <Wolfman2000> so I emailed it to the wrong spot?
  198. # [09:46] <Hixie> Wolfman2000: if it got stuck in the moderator queue you'll have to resubscribe, sorry :-|
  199. # [09:46] <Wolfman2000> ...wha?!?
  200. # [09:47] <Wolfman2000> one shot and that's it?
  201. # [09:48] <Wolfman2000> ...strange. it still looks like I'm subscribed. At least...in Implementors.
  202. # [09:48] <annevk> yeah, you have to subscribe to the other list
  203. # [09:48] <Hixie> Wolfman2000: no i mean you'll have to subscribe to the other list
  204. # [09:48] <annevk> and then e-mail your message again
  205. # [09:48] <Hixie> Wolfman2000: the moderator queue is just a black hole
  206. # [09:48] <Hixie> (we were getting too much spam for me to keep up)
  207. # [09:49] <Wolfman2000> From this page: http://www.whatwg.org/mailing-list I ended up subscribing to Implementors
  208. # [09:49] <Wolfman2000> I assume there is a different page I'm supposed to go to then?
  209. # [09:49] <annevk> use http://lists.whatwg.org/listinfo.cgi/whatwg-whatwg.org
  210. # [09:49] <annevk> (it's linked from that same page)
  211. # [09:50] <Wolfman2000> ...oy. four of them.
  212. # [09:50] <Wolfman2000> I'm assuming I should subscribe to all of them then?
  213. # [09:50] <annevk> no
  214. # [09:50] <annevk> just the ones you're interested in
  215. # [09:51] <annevk> I suggest you read the page briefly first
  216. # [09:51] <Wolfman2000> Probably a good idea.
  217. # [09:54] <Wolfman2000> alright, covered. In the end...I think what I wanted the most was help-whatwg.org instead of whatwg.org
  218. # [09:54] <Wolfman2000> about to re-send the email
  219. # [09:55] <Hixie> hsivonen: fixed the EOF isue
  220. # [09:55] <jgraham> annevk: I'm here now...
  221. # [09:55] <Hixie> issue
  222. # [09:56] <Wolfman2000> ...alright, chose to resend to the same place. Let's hope it doesn't go to the black hole this time
  223. # [09:57] <hsivonen> Hixie: thanks
  224. # [09:58] * Joins: virtuelv (n=virtuelv@pat-tdc.opera.com)
  225. # [09:59] * Quits: Lachy (n=Lachlan@203-158-59-119.dyn.iinet.net.au) (Read error: 104 (Connection reset by peer))
  226. # [09:59] <annevk> jgraham, it's already working
  227. # [09:59] <annevk> jgraham, I didn't have chardet but that seems to be optional now
  228. # [09:59] <jgraham> Great :)
  229. # [10:00] * Joins: Lachy (n=Lachlan@203-158-59-119.dyn.iinet.net.au)
  230. # [10:00] <jgraham> Less difficult questions early in the morning == good
  231. # [10:00] <annevk> Currently fixing the new entity stuff by making a small dirty hack that scrapes the HTML5 spec
  232. # [10:00] <jgraham> s/Less/Fewer
  233. # [10:00] <annevk> well, the multipage version
  234. # [10:03] <hsivonen> Safari appears not to have a chardet equivalent. does this mean that chardet is no longer needed on the real Web?
  235. # [10:04] <hsivonen> Opera seems to have autodetection available but it is scoped to Cyrillic, Chinese, Japanese or Korean at a time
  236. # [10:04] <hsivonen> what does IE7 do?
  237. # [10:05] <Wolfman2000> hsivonen: charset detection? Hmm...hang on a second, while I test a certain webpage.
  238. # [10:05] <hsivonen> Wolfman2000: yes
  239. # [10:05] <Wolfman2000> ...I think it's still needed.
  240. # [10:05] <Wolfman2000> http://foonmix.nothing.sh/ Use Shift_JIS
  241. # [10:06] <Wolfman2000> I believe my options are set to use utf-8 by default
  242. # [10:06] <Wolfman2000> does that help a bit hsivonen, or did I misunderstand?
  243. # [10:08] <hsivonen> Wolfman2000: does IE7 have an autodetector?
  244. # [10:08] <Wolfman2000> hsivonen: I'm unsure: I'm on a Mac.
  245. # [10:08] <Wolfman2000> I was testing Safari
  246. # [10:08] <hsivonen> what do Japanese Safari users do? do they use another browser or switch encodings manually?
  247. # [10:08] <Wolfman2000> I needed to switch my encoding manually.
  248. # [10:08] <Wolfman2000> But I'm an American Safari user, so...I don't know.
  249. # [10:08] <Wolfman2000> Most Japanese people use Windows and IE. :(
  250. # [10:09] <hsivonen> well, hooking up jchardet to my tokenizer is on my todo list
  251. # [10:09] <hsivonen> I'd like to know, though, if passing only the first 512 bytes to chardet is enough
  252. # [10:10] <Wolfman2000> I don't know how to test that. I've only just signed up, and I'm merely a simple TA/web designer
  253. # [10:10] * Quits: aroben (n=adamrobe@c-69-142-103-232.hsd1.pa.comcast.net)
  254. # [10:18] * Joins: Jero (n=Jero@d207230.upc-d.chello.nl)
  255. # [10:30] * Joins: zcorpan (n=zcorpan@static-88.131.66.111.addr.tdcsong.se)
  256. # [10:30] * zcorpan is at the opera office in linköping
  257. # [10:32] <annevk> simonp
  258. # [10:32] <annevk> :)
  259. # [10:33] <zcorpan> @opera.com?
  260. # [10:33] <annevk> yeah
  261. # [10:33] <zcorpan> oh yep. didn't know i had an opera email already
  262. # [10:35] * Joins: webben (i=benh@nat/yahoo/x-5eb6c4c6d1dbd099)
  263. # [10:36] <hsivonen> Hixie: why did you remove "Otherwise, if the next character is a U+003B SEMICOLON, consume that too. If it isn't, there is a parse error.
  264. # [10:36] <hsivonen> "
  265. # [10:36] <hsivonen> Hixie: in entity tokenization
  266. # [10:37] <annevk> because it is part of the entity name
  267. # [10:37] <hsivonen> whoa
  268. # [10:37] * hsivonen is only diffing the tokenization section
  269. # [10:38] <annevk> I thought this entity stuff would be trivial to implement but it's not
  270. # [10:38] <annevk> I think <object>, <video> etc. should allow block-level fallback...
  271. # [10:38] <hsivonen> annevk: It took me a while to get the previous entity parsing right with minimal string object creation
  272. # [10:38] <annevk> The new entity stuff isn't stable either
  273. # [10:39] <annevk> Apparently IE does something different from this for attributes
  274. # [10:39] <annevk> So maybe you should not fix that for now
  275. # [10:39] <hsivonen> yeah
  276. # [10:39] <hsivonen> actually, I did it without string object creation at all
  277. # [10:41] <annevk> Hixie, it would be useful for other standards if HTML5 defined Almost Standards Mode for them
  278. # [10:41] <annevk> Hixie, if we're going to keep it, that is
  279. # [10:44] * Joins: ROBOd (n=robod@86.34.246.154)
  280. # [10:45] * Joins: ddfreyne (n=ddfreyne@d54C57894.access.telenet.be)
  281. # [10:46] * Quits: webben (i=benh@nat/yahoo/x-5eb6c4c6d1dbd099) (Client Quit)
  282. # [10:51] * Quits: weinigLap (n=weinig@c-67-188-78-122.hsd1.ca.comcast.net)
  283. # [10:53] <annevk> hsivonen, fixed the tests
  284. # [10:55] <hsivonen> annevk: thanks
  285. # [10:56] <hsivonen> Hixie: I'd like to make entity names without the terminating semicolon parse errors to help conformance checkers alleviate author confusion in the face of typos
  286. # [10:56] * zcorpan agrees with hsivonen
  287. # [10:57] <annevk> blah &amp blah
  288. # [10:58] <annevk> zcorpan, have you changed your entity script to work for attribute values already?
  289. # [10:58] <zcorpan> annevk: no
  290. # [10:58] <zcorpan> i can do it though
  291. # [10:58] <annevk> might be useful to see what IE does there (and how it compares to normal entity parsing)
  292. # [10:58] <annevk> cool
  293. # [10:58] <hsivonen> annevk: if IE is even remotely sane, the entity handling differences for attributes can be handled by loading a different entity table
  294. # [10:58] * hsivonen hasn't tested
  295. # [10:59] <zcorpan> hopefully it just is the same table without the entries that don't end with ;
  296. # [11:00] <annevk> hsivonen, wouldn't that always be possible?
  297. # [11:01] <hsivonen> Hixie: btw, for easy scraping it would be nice to have the entity table lexicographically sorted
  298. # [11:01] <hsivonen> Hixie: just mentioning this in case you edit it anyway
  299. # [11:01] <annevk> hsivonen, there's a table scraping script
  300. # [11:01] <hsivonen> annevk: I haven't considered insane options :-)
  301. # [11:01] <annevk> it's really easy
  302. # [11:01] <hsivonen> annevk: pointer?
  303. # [11:02] <annevk> http://html5lib.googlecode.com/svn/trunk/python/utils/extract-entities.py
  304. # [11:02] <hsivonen> annevk: thanks
  305. # [11:02] <hsivonen> annevk: I guess I'll add lex sort to the script
  306. # [11:02] <annevk> just wrote that for my own usage, but integrating the new entity handling didn't work out too well
  307. # [11:02] <annevk> hsivonen, can I add you to the html5lib project?
  308. # [11:03] <annevk> so you can simply commit those changes yourself
  309. # [11:03] <hsivonen> annevk: sure
  310. # [11:03] * Joins: mw22 (n=chatzill@h8441169151.dsl.speedlinq.nl)
  311. # [11:03] <annevk> you have a google account?
  312. # [11:03] <hsivonen> hsivonen@gmail.com
  313. # [11:04] <annevk> done
  314. # [11:04] <hsivonen> thanks
  315. # [11:04] <annevk> http://code.google.com/u/hsivonen/
  316. # [11:05] * Joins: webben (i=benh@nat/yahoo/x-f394eae3100023ab)
  317. # [11:07] * Joins: zcorpan_ (n=zcorpan@pat.se.opera.com)
  318. # [11:14] * Joins: hendry (i=hendry@conference/debconf/x-54ea805c11f878f1)
  319. # [11:16] * Joins: Charl (n=charlvn@net-153-111.mweb.co.za)
  320. # [11:18] * Joins: hasather (n=david_ha@pat-tdc.opera.com)
  321. # [11:21] <zcorpan_> http://simon.html5.org/test/html/parsing/entities/trailing-semicolon/002.htm -- that is <img alt>... dunno if ie has different rules for <img src> or <a href>
  322. # [11:23] * Quits: zcorpan (n=zcorpan@static-88.131.66.111.addr.tdcsong.se) (Read error: 110 (Connection timed out))
  323. # [11:23] <annevk> could you add another three columns for the non attribute case?
  324. # [11:25] <annevk> IE actually differs for &entity; &entity and &entityX
  325. # [11:25] <zcorpan_> sure
  326. # [11:25] <annevk> I suppose &entity means &entity< or something?
  327. # [11:26] <annevk> or maybe &entity followed by a space
  328. # [11:29] * Joins: maikmerten (n=maikmert@T66d6.t.pppool.de)
  329. # [11:30] <annevk> what would be more useful, I suppose, is if you checked the results using DOM methods and then just printed how they are supported... :)
  330. # [11:32] <zcorpan_> yeah
  331. # [11:37] * Quits: hasather (n=david_ha@pat-tdc.opera.com) (Remote closed the connection)
  332. # [11:40] <hsivonen> where might I find a list of characters that are allowed (per spec) in unquoted attribute values in HTML 4.01?
  333. # [11:40] <annevk> Should </br> also cause the active formatting elements to be reconstructed?
  334. # [11:41] <hsivonen> (without spending hours trying to grok SGML myself after borrowing the Handbook from a library)
  335. # [11:41] <annevk> I think it's [a-Z0-9]
  336. # [11:41] <hsivonen> annevk: what about hyphens, underscores and the like?
  337. # [11:43] <annevk> I'd think details would be in http://www.w3.org/TR/html4/appendix/notes.html
  338. # [11:43] <annevk> but it doesn't seem like it
  339. # [11:44] <annevk> "In certain cases, authors may specify the value of an attribute without any quotation marks. The attribute value may only contain letters (a-z and A-Z), digits (0-9), hyphens (ASCII decimal 45), periods (ASCII decimal 46), underscores (ASCII decimal 95), and colons (ASCII decimal 58). We recommend using quotation marks even when it is possible to eliminate them."
  340. # [11:44] <annevk> http://www.w3.org/TR/html4/intro/sgmltut.html#h-3.2.2
  341. # [11:45] <hsivonen> LCNMCHAR ".-_:"
  342. # [11:45] <hsivonen> yeah
  343. # [11:45] <hsivonen> thanks
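
As a rough illustration of the HTML 4.01 rule annevk quotes above (letters, digits, hyphens, periods, underscores and colons only), a checker could test whether a value may be written without quotes. The function name `may_be_unquoted` is made up for this sketch, and treating the empty string as not allowed is an assumption rather than something stated in the quoted text.

```python
import re

# Characters HTML 4.01 section 3.2.2 permits in an unquoted attribute value:
# a-z, A-Z, 0-9, hyphen, period, underscore, colon.
_UNQUOTED_OK = re.compile(r"[A-Za-z0-9._:-]+")


def may_be_unquoted(value: str) -> bool:
    """Return True if value consists only of the permitted characters."""
    # Assumption: an empty value cannot be written without quotes.
    return _UNQUOTED_OK.fullmatch(value) is not None


print(may_be_unquoted("nav-item_1.a:b"))  # only permitted characters
print(may_be_unquoted("2<5"))             # contains "<"
```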
  344. # [11:49] * Quits: Wolfman2000 (n=Wolfman2@wvh5348rn.rh.ncsu.edu) ("Leaving")
  345. # [11:52] * Quits: kfish (n=conrad@61.194.21.25) (Remote closed the connection)
  346. # [11:52] * Joins: kfish (n=conrad@61.194.21.25)
  347. # [11:57] * othermaciej is now known as om_sleep
  348. # [12:04] <annevk> bah
  349. # [12:04] <annevk> </br> is harder than it looks
  350. # [12:05] <zcorpan_> ok, the good news is that ie does the same thing with entities in attributes for both <img alt> and <a href>
  351. # [12:05] <zcorpan_> the bad news is that <img alt="&AElig"> works but <img alt="&AEligX"> doesn't
  352. # [12:06] <zcorpan_> need to figure out which characters end entities
  353. # [12:06] <zcorpan_> in attribute values
  354. # [12:09] <annevk> ah, that's only for attribute values?
  355. # [12:09] <zcorpan_> yeah
  356. # [12:09] <annevk> quoted versus double quoted versus unquoted too maybe?
  357. # [12:09] <zcorpan_> oh, that better work the same...
  358. # [12:09] <zcorpan_> but i'll test it too
  359. # [12:11] <hsivonen> when the generic facet of my validation service sees an XHTML 1.0 doctype in text/html, I will (in a future release) tokenize as HTML5 but validate as XHTML 1.0 and I'm going to say that this is bogus but I am doing it for the users' convenience
  360. # [12:11] <hsivonen> should I make the message a warning or an error?
  361. # [12:11] <hsivonen> "bogus" means error but "convenience" means warning
  362. # [12:11] <zcorpan_> error
  363. # [12:12] <zcorpan_> say that it isn't processed as xhtml by browsers unless the document is served with an xml mime type
  364. # [12:12] <zcorpan_> or something
  365. # [12:12] <virtuelv> annevk: don't most browsers interpret </br> as <br>?
  366. # [12:13] <annevk> yes
  367. # [12:13] <hsivonen> zcorpan_: makes sense.
  368. # [12:13] <hsivonen> annevk: do you have an opinion?
  369. # [12:14] <annevk> hsivonen, warning seems fine
  370. # [12:14] <hsivonen> zcorpan_: or would it make sense to turn it into a warning if the user checked the lax content type checkbox?
  371. # [12:14] <annevk> it's not actively harmful
  372. # [12:14] * hsivonen is inclined to bind this to the lax type option
  373. # [12:16] <hsivonen> doh. I'm already doing something else for the lax type option, so that doesn't work
  374. # [12:16] * Joins: Ducki (n=Alex@dialin-212-144-055-174.pools.arcor-ip.net)
  375. # [12:18] <zcorpan_> hsivonen: with the lax option set, wouldn't you process it as xml?
  376. # [12:18] <hsivonen> zcorpan_: yes. I can't even remember anymore what the lax option does
  377. # [12:18] * Joins: BenWard (i=BenWard@nat/yahoo/x-a897a56513068751)
  378. # [12:18] <hsivonen> the code for it is rather hairy, too
  379. # [12:19] <zcorpan_> in any case, when you parse xhtml with the html parser, emit an error imho
  380. # [12:19] <hsivonen> zcorpan_: ok
  381. # [12:19] <zcorpan_> if, with the lax option, you parse text/html as xml, a warning is fine
  382. # [12:20] <zcorpan_> back to entities: it seems any character except [a-zA-Z0-9] ends an entity in attribute values
  383. # [12:25] <annevk> so you basically consume chars until you hit something outside that range
  384. # [12:25] <annevk> hmm
  385. # [12:27] * moeffju[ZzZz] is now known as moeffju
  386. # [12:29] * Quits: webben (i=benh@nat/yahoo/x-f394eae3100023ab) (Read error: 60 (Operation timed out))
  387. # [12:29] <zcorpan_> or you consume as many as possible that match the entity table, and for the longest match, check if the next character is in that range. if yes, emit the consumed characters, otherwise emit the entity
  388. # [12:32] <annevk> ok, rearchitected my </br> fix
  389. # [12:32] <annevk> should be easy to add </p> later
  390. # [12:33] <annevk> and _tons_ of other elements that act like that...
  391. # [12:33] <annevk> I love </plaintext>
  392. # [12:35] <annevk> zcorpan_, assuming the entity table doesn't have ; that should work I suppose
  393. # [12:36] <zcorpan_> yeah, the ; is not part of the entity name. we need to revert to the old table and instead have a third column that says which entities always require a ;
  394. # [12:37] <annevk> and a fourth that says which entities require that for attribute values...
  395. # [12:38] <zcorpan_> that is the same
  396. # [12:39] <zcorpan_> unless the next character is [a-zA-Z0-9], in which case all entities require a ;
  397. # [12:40] <annevk> how does that cover <a href="&region">&region</a>
  398. # [12:40] <annevk> oh right
  399. # [12:40] <annevk> interesting
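
The matching rule zcorpan_ describes above (take the longest name found in the entity table, then decide based on the next character) can be sketched roughly as follows. This is only an illustration under stated assumptions: the three-entry table, the per-entity "always requires ;" flag values, and the function name `consume_entity` are hypothetical and are not taken from html5lib or any spec text.

```python
import string

# Characters that, per the discussion, do NOT terminate an entity
# inside an attribute value.
ALNUM = set(string.ascii_letters + string.digits)

# Hypothetical mini table: name -> (replacement, always_requires_semicolon).
# The flag values are made up for illustration only.
ENTITIES = {
    "reg": ("\u00ae", False),
    "amp": ("&", False),
    "apos": ("'", True),
}


def consume_entity(text, pos, entities=ENTITIES):
    """Resolve an entity in an attribute value; text[pos] must be '&'.

    Returns (emitted_text, next_pos).
    """
    start = pos + 1
    longest = max(map(len, entities))
    # Try the longest candidate name first.
    for length in range(min(longest, len(text) - start), 0, -1):
        name = text[start:start + length]
        if name not in entities:
            continue
        replacement, needs_semicolon = entities[name]
        end = start + length
        following = text[end:end + 1]
        if following == ";":
            return replacement, end + 1
        if needs_semicolon or following in ALNUM:
            # Only the longest match is considered; give up and
            # emit the characters literally.
            break
        return replacement, end
    return "&", start


print(consume_entity("&region", 0))
print(consume_entity("&reg;ion", 0))
```

With this sketch, `&region` inside an attribute value stays literal text because the character after `reg` is a letter, which is how the rule covers annevk's `<a href="&region">` case.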
  400. # [12:40] <annevk> what about & as terminating character and ?
  401. # [12:40] <annevk> or did you already try it for a big range?
  402. # [12:42] <zcorpan_> http://simon.html5.org/test/html/parsing/entities/trailing-semicolon/004.htm
  403. # [12:43] <annevk> good stuff :)
  404. # [12:44] <annevk> maybe you should use <span> instead of <a> for 003
  405. # [12:45] <zcorpan_> span doesn't have a href attribute :)
  406. # [12:45] <annevk> use title :)
  407. # [12:45] <zcorpan_> the point was to test a URI attribute
  408. # [12:45] <annevk> ok
  409. # [12:45] <zcorpan_> though i could use # if you don't want a lot of 404s :)
  410. # [12:46] <annevk> I suppose that could help
  411. # [12:47] <zcorpan_> done
  412. # [12:48] <zcorpan_> sent results to the list
  413. # [12:50] <annevk> heh, fun that you replied to my message :p
  414. # [12:50] * annevk goes to fetch some food before it's gone
  415. # [12:52] <zcorpan_> i thought it was appropriate as a reply :)
  416. # [12:52] * Joins: yod (n=ot@bas11-montreal02-1128535778.dsl.bell.ca)
  417. # [12:54] * Joins: webben (i=benh@nat/yahoo/x-6bb203663b30336a)
  418. # [12:55] <hsivonen> does IE7 support &apos;?
  419. # [12:56] <zcorpan_> no
  420. # [12:56] <hsivonen> that's weird
  421. # [12:56] <zcorpan_> yes
  422. # [12:56] <zcorpan_> iirc i filed a bug on that during their "beta" stage
  423. # [12:57] <hsivonen> gotta remember to make it a warning
  424. # [12:57] <zcorpan_> http://simon.html5.org/test/ie7b2-bugs/014.html
  425. # [12:57] <zcorpan_> opera doesn't support &TRADE;
  426. # [12:58] * hsivonen adds a note in the source
  427. # [12:58] <zcorpan_> annevk: is there a bug on that? (can i check that? :P )
  428. # [13:17] * hsivonen wonders what's the best practice regarding memory allocation for growable buffers in a reusable library class
  429. # [13:18] <hsivonen> that is, should I optimize speed and risk memory leaks?
  430. # [13:18] <hsivonen> never leak memory and risk speed?
  431. # [13:18] <hsivonen> or let the user of the library decide?
  432. # [13:19] <hsivonen> annevk: does html5lib ever shrink buffers that grow depending on input? or does Python make these decisions for you?
  433. # [13:24] * Quits: Charl (n=charlvn@net-153-111.mweb.co.za) ("Leaving")
  434. # [13:27] <annevk> zcorpan_, Opera does
  435. # [13:27] <annevk> zcorpan_, fetch a newer build now you can ;)
  436. # [13:28] <annevk> hsivonen, I'm not competent enough to answer that question
  437. # [13:28] <annevk> hsivonen, I can say as much as that we don't have weird constraints anywhere to my knowledge
  438. # [13:30] <hsivonen> annevk: ok
  439. # [13:33] * Quits: webben (i=benh@nat/yahoo/x-6bb203663b30336a) (Client Quit)
  440. # [13:53] * Joins: Ducki_ (i=Ducki@dialin-145-254-187-023.pools.arcor-ip.net)
  441. # [13:54] * Quits: Ducki (n=Alex@dialin-212-144-055-174.pools.arcor-ip.net) (Read error: 113 (No route to host))
  442. # [14:00] * Quits: ddfreyne (n=ddfreyne@unaffiliated/ddfreyne) ("kthxbai")
  443. # [14:02] * Quits: Ducki_ (i=Ducki@dialin-145-254-187-023.pools.arcor-ip.net) (Read error: 104 (Connection reset by peer))
  444. # [14:07] * Joins: webben (i=benh@nat/yahoo/x-41646368fedce086)
  445. # [14:14] <annevk> onload is broken in Safari: http://www.howtocreate.co.uk/safaribenchmarks.html ?
  446. # [14:14] <annevk> you'd think that if onload is broken pages would be broken as well...
  447. # [14:46] <Fuzzy76> I guess "broken" is a subjective term
  448. # [14:47] <annevk> it certainly explains the statistics on the safari download page...
  449. # [14:47] <Fuzzy76> yes... I've seen several other benchmarks, and none of them showed anything NEAR the numbers from Apple.
  450. # [14:52] * Quits: hsivonen (n=hsivonen@kekkonen.cs.hut.fi) (Remote closed the connection)
  451. # [14:54] * Joins: Cerbera (i=cerbera@cpc1-flee1-0-0-cust285.glfd.cable.ntl.com)
  452. # [14:55] * Parts: Cerbera (i=cerbera@cpc1-flee1-0-0-cust285.glfd.cable.ntl.com)
  453. # [15:20] <annevk> can someone explain to me how "After DOCTYPE public identifier state" and "Before DOCTYPE system identifier state" are different?
  454. # [15:20] <annevk> seems like they could be merged
  455. # [15:21] <annevk> i'll keep them separate for now...
  456. # [15:24] * Philip` wonders if there's a reliable way to get multiple asynchronous XMLHttpRequests in flight at once (so the frequency of response arrival can be faster than the round-trip time)
  457. # [15:28] * Joins: karlUshi (n=karl@124-144-94-188.rev.home.ne.jp)
  458. # [15:37] <annevk> wtf
  459. # [15:37] <annevk> doctype name is no longer uppercase?!
  460. # [15:37] <annevk> uppercased*
  461. # [15:37] <annevk> this is problematic
  462. # [15:39] <annevk> seems to be what Firefox does
  463. # [15:39] <annevk> but the amount of testcases that relies on this quirk...
  464. # [15:43] * annevk fixes tests
  465. # [15:53] * Joins: icaaq_ (n=icaaaq@226.228.13.217.in-addr.dgcsystems.net)
  466. # [15:55] <annevk> Soonish people should be able to use html5lib to determine whether a page will render in quirks or standards mode
  467. # [15:57] <mpt> zcorpan_, what if someone really does want to style the <head> (e.g. head, title {display: block} title {font-size: 2em;})?
  468. # [16:01] <annevk> you don't need a scoped style sheet for that
  469. # [16:03] <Philip`> (Hmm, I can't fix my problem with XMLHttpRequest, but I can dynamically add <script> elements to the page while cycling through server port numbers so it has one outstanding request per port, since the scripts appear to get loaded asynchronously)
  470. # [16:04] <Philip`> (Oh, but they're only asynchronous in Firefox, not Opera, so that won't simply work. But XMLHttpRequest appears to do pipelining in Opera, so I just need to switch between the two methods. And work out what to do for Safari...)
  471. # [16:08] * Philip` can't quite find what HTML5 says should happen in terms of synchrony when adding a (non-async) <script> to the DOM
  472. # [16:10] <Philip`> Oh, looks like it ought to be asynchronous, since the pausing is only done inside the tree construction algorithm
  473. # [16:13] * Quits: hendry (i=hendry@conference/debconf/x-54ea805c11f878f1) ("savepower")
  474. # [16:15] <zcorpan_> mpt: if scoped stylesheets are changed to not affect their parent, then you couldn't use a scoped stylesheet for it anyway. and as anne says, you can already do that without scoped stylesheets
  475. # [16:16] * Quits: Lfe (n=lfe@bergstroem.nu) (Remote closed the connection)
  476. # [16:16] * Joins: Lfe (n=lfe@bergstroem.nu)
  477. # [16:16] <zcorpan_> btw, a girl here (at opera) will be implementing an html5 parser in c++
  478. # [16:16] <Lachy> IE's cryptic javascript error messages are really annoying :-(
  479. # [16:16] <Lfe> zcorpan_: i would like her even more if she somehow left out those pluses ^_^
  480. # [16:17] <zcorpan_> Lfe: heh
  481. # [16:17] <Lachy> I'm writing a test case to test the toUpperCase and toLowerCase functions in JavaScript against the unicode data file
  482. # [16:18] <Philip`> Could provide a C API around the C++ implementation, so it's easily embeddable in other languages (like C, or Python ctypes, or whatever)
  483. # [16:18] <Lachy> so far, I've identified 3 bugs in Firefox within the first 500 chars (cause it takes far too long to process all 17000)
  484. # [16:21] * Joins: met_ (n=Hassman@b14-4.vscht.cz)
  485. # [16:21] <met_> looks like people are confused by all those storages http://ajaxian.com/archives/firefox-3-sqlite-and-more
  486. # [16:25] * Joins: SavageX (n=maikmert@T634f.t.pppool.de)
  487. # [16:30] * Joins: billmason (n=billmaso@ip156.unival.com)
  488. # [16:41] * Quits: maikmerten (n=maikmert@T66d6.t.pppool.de) (Read error: 110 (Connection timed out))
  489. # [16:45] * Joins: jcgregorio (i=chatzill@nat/ibm/x-e7238e70f545cabc)
  490. # [16:48] * Quits: karlUshi (n=karl@124-144-94-188.rev.home.ne.jp) ("Where dwelt Ymir, or wherein did he find sustenance?")
  491. # [16:49] <zcorpan_> http://simon.html5.org/temp/html5-opera.txt are things that i might write tests for this summer (though probably less than that, that's just a first filtering)
  492. # [16:50] <zcorpan_> anyone want me to look at something in particular?
  493. # [16:57] * Joins: hsivonen (n=hsivonen@vipunen.hut.fi)
  494. # [16:58] <hsivonen> annevk: I haven't implemented anything that is in the tree construction part (yet)
  495. # [17:07] <annevk> ah
  496. # [17:07] <annevk> i just landed all that's needed to enable quirks mode checking
  497. # [17:07] <annevk> someone just has to hook in some flag
  498. # [17:08] * SavageX is now known as maikmerten
  499. # [17:08] * annevk hopes jgraham can make it look prettier
  500. # [17:09] <annevk> and we need to update the DOCTYPE token to handle systemId and publicId in case they are not None
  501. # [17:10] <zcorpan_> comments before the doctype don't trigger quirks mode per html5? even bogus comments? iirc this triggers quirks mode in firefox: </ foo ><!doctype html>
  502. # [17:10] <zcorpan_> but <? foo ><!doctype html> is standards mode
  503. # [17:11] <annevk> </ foo><!doctype html> doesn't in Opera
  504. # [17:11] <annevk> doesn't give you a comment token either
  505. # [17:12] * Quits: KevinMarks (n=KevinMar@pdpc/supporter/active/kevinmarks) ("The computer fell asleep")
  506. # [17:13] <zcorpan_> </ foo><!doctype html> is quirks mode in ie7
  507. # [17:15] <Philip`> XXX<!doctype html> is standards mode in FF too
  508. # [17:15] <Philip`> (unless that > is pushed beyond the first 1024 bytes)
  509. # [17:16] <annevk> So Firefox is sniffing before actual parsing?
  510. # [17:17] <annevk> Guess that's why it's called "doctype sniffing" here and there
  511. # [17:17] <zcorpan_> Philip`: :-O wow, i don't think that was the case before
  512. # [17:17] <zcorpan_> annevk: yeah
  513. # [17:19] <zcorpan_> Philip`: or perhaps i just didn't test that case
  514. # [17:19] <hsivonen> annevk: or you could put JSON nulls in the array for public and system id when not present
  515. # [17:20] <hsivonen> annevk: since that handles nicely the cases when only one is absent
  516. # [17:20] <hsivonen> annevk: and you need to know which on
  517. # [17:20] <hsivonen> e
  518. # [17:20] <hsivonen> Philip`: that's weird. IIRC, around Mozilla 1.1 it wasn't like that.
  519. # [17:20] <Philip`> It looks like FF must be doing some look-ahead before parsing - compare <!--><!doctype html> vs <!--><!doctype xhtml>
  520. # [17:20] <annevk> hsivonen, that's for the tokenizer tests
  521. # [17:21] <annevk> hsivonen, I was thinking about the tree construction stage
  522. # [17:21] <hsivonen> oh
  523. # [17:21] <annevk> maybe I should handle the tokenizer tests first, prolly easier to make testcases too
  524. # [17:21] * Joins: weinigLap (i=weinig@nat/apple/x-b2452f3377f5f19d)
  525. # [17:25] * Joins: aroben (n=adamrobe@c-69-142-103-232.hsd1.pa.comcast.net)
  526. # [17:26] * Quits: aroben (n=adamrobe@c-69-142-103-232.hsd1.pa.comcast.net) (Client Quit)
  527. # [17:30] <annevk> hsivonen, should I use None in the tests?
  528. # [17:31] <hsivonen> annevk: yes
  529. # [17:31] <hsivonen> annevk: I'm assuming that your JSON impl. maps None to JSON null
  530. # [17:32] <annevk> I'm talking about the test format
  531. # [17:32] * Quits: met_ (n=Hassman@b14-4.vscht.cz) ("Chemists never die, they just stop reacting.")
  532. # [17:32] <hsivonen> annevk: tree tests?
  533. # [17:33] * Joins: aroben (n=adamrobe@c-69-142-103-232.hsd1.pa.comcast.net)
  534. # [17:33] <annevk> tokenizer tests
  535. # [17:34] <annevk> http://html5lib.googlecode.com/svn/trunk/testdata/tokenizer/
  536. # [17:35] <hsivonen> annevk: JSON null please
  537. # [17:35] <annevk> that throws an error somewhere else
  538. # [17:36] <hsivonen> annevk: in your JSON to Python mapper?
  539. # [17:37] <hsivonen> I'd expect a correct doctype to look like this: ["DOCTYPE", "HTML", null, null, false]
  540. # [17:37] <annevk> the last one should be true I think
  541. # [17:37] <annevk> as the flag is now "correct"
  542. # [17:37] <hsivonen> argh.
  543. # [17:37] <hsivonen> annevk: you are right, of course
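
As a rough illustration of the expected-token shape hsivonen and annevk settle on here, the snippet below decodes one such JSON array with Python's standard `json` module, which maps JSON `null` to `None`. The helper name `parse_expected_doctype`, the dictionary keys, and the assumption that the public identifier comes before the system identifier are hypothetical; the actual html5lib test harness may be structured differently.

```python
import json


def parse_expected_doctype(raw):
    """Decode one expected DOCTYPE token written as a JSON array.

    Assumed layout: [kind, name, public id, system id, correct flag].
    """
    kind, name, public_id, system_id, correct = json.loads(raw)
    if kind != "DOCTYPE":
        raise ValueError("not a DOCTYPE token: %r" % (kind,))
    return {
        "name": name,
        "publicId": public_id,   # None when the JSON value is null
        "systemId": system_id,   # None when the JSON value is null
        "correct": correct,
    }


token = parse_expected_doctype('["DOCTYPE", "HTML", null, null, true]')
print(token)
```

Keeping both identifier slots in the array, even when empty, means a test can tell which of the two is absent, which is the point hsivonen raises above.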
  544. # [17:38] <Philip`> More specifically: FF seems to do standards mode if the first 1024 characters from the first non-whitespace character onwards, parsed using quirks mode rules, contains at least one doctype, and the first doctype is a valid HTML one and is not preceded by any non-comment non-text nodes
  545. # [17:38] <annevk> it seems all tests were a bit bogus with respect to that
  546. # [17:38] <Philip`> (or something roughly like that)
  547. # [17:39] <zcorpan_> Philip`: you sure parsing is in quirks mode initially?
  548. # [17:39] * zcorpan_ should be heading home now so he doesn't miss the train
  549. # [17:40] <Philip`> I think so - <!--><!doctype html><!--> results in two empty comments, instead of one comment with the text "><!doctype html><!" in it
  550. # [17:40] * Quits: aroben (n=adamrobe@c-69-142-103-232.hsd1.pa.comcast.net)
  551. # [17:40] <annevk> ah ok
  552. # [17:40] <annevk> I didn't have simplejson and hence I got some simplified json parser that didn't get null
  553. # [17:40] <zcorpan_> but is the document standards mode or quirks mode?
  554. # [17:40] <annevk> implemented null in it now
  555. # [17:41] <zcorpan_> if quirks mode then the parser is initially in standards mode -- otherwise you would have seen the doctype in the pre-parse and switched to standards mode
  556. # [17:41] <annevk> Philip`, <!--> should always be a single comment
  557. # [17:41] <zcorpan_> i might have written something about this at sitepoint forums at some point
  558. # [17:42] <zcorpan_> anyway, i'm leaving now
  559. # [17:42] * zcorpan_ waves
  560. # [17:42] <Philip`> If I do <!doctype html><!--><!doctype html><!--> then it is CSS1Compat and it says "#comment: ><!doctype html><!"
  561. # [17:43] <Philip`> If I do <!--><!doctype html><!--> then it is BackCompat and it says "#comment","#comment"
  562. # [17:44] <Philip`> so... it's parsing in standards mode, not finding the doctype, then re-parsing in quirks mode (and finding the doctype but not changing mode)?
  563. # [17:45] <annevk> I have updated some of the tests
  564. # [17:48] * Joins: hendry (i=hendry@conference/debconf/x-609d3a2cbe1a9a2e)
  565. # [17:50] * Quits: jruderman (n=jruderma@c-67-169-24-116.hsd1.ca.comcast.net) (Read error: 110 (Connection timed out))
  566. # [18:00] * Joins: hasather (n=hasather@22.80-203-71.nextgentel.com)
  567. # [18:02] * Quits: zcorpan_ (n=zcorpan@pat.se.opera.com) (Read error: 110 (Connection timed out))
  568. # [18:06] * Parts: icaaq_ (n=icaaaq@226.228.13.217.in-addr.dgcsystems.net)
  569. # [18:16] * Quits: BenWard (i=BenWard@nat/yahoo/x-a897a56513068751) (Read error: 54 (Connection reset by peer))
  570. # [18:16] * Joins: BenWard (i=BenWard@nat/yahoo/x-c770bf629bc86efc)
  571. # [18:34] * Joins: weinigLap_ (i=weinig@nat/apple/x-e9330fb687110f8f)
  572. # [18:34] * Quits: weinigLap (i=weinig@nat/apple/x-b2452f3377f5f19d) (Read error: 104 (Connection reset by peer))
  573. # [18:36] * weinigLap_ is now known as weinigLap
  574. # [18:46] <annevk> I fixed all the DOCTYPE tests and the tokenizer part of the implementation. I also added some more tokenizer tests to cover PUBLIC and SYSTEM ids.
  575. # [18:47] * Quits: weinigLap (i=weinig@nat/apple/x-e9330fb687110f8f)
  576. # [18:50] * Joins: rubys (n=rubys@cpe-075-182-064-252.nc.res.rr.com)
  577. # [18:53] <rubys> annevk: you've been busy! :-)
  578. # [18:54] <annevk> yeah, I feel a bit sorry for the ruby project
  579. # [18:54] * Joins: KevinMarks (i=KevinMar@nat/google/x-a0afc6007b347218)
  580. # [18:54] <rubys> nah, won't be hard to catch up, the diffs on the python code point the way.
  581. # [18:54] <annevk> cool
  582. # [18:55] <annevk> There are still some things to implement such as proper DOCTYPE tokens
  583. # [18:55] <rubys> i'd like to wait to resync until you slow down...
  584. # [18:56] <annevk> I'm about to go home
  585. # [18:56] <annevk> so go ahead :)
  586. # [18:56] <rubys> cool, and I see the python tests are passing, which is a good sign.
  587. # [18:59] * Joins: weinigLap (i=weinig@nat/apple/x-71ac892db7432bfb)
  588. # [19:00] <annevk> yeah, I fixed the tests along with the implementation although 3 are still failing
  589. # [19:00] <annevk> I hope Thomas can fix that
  590. # [19:00] <rubys> i don't see any failing... which ones fail for you?
  591. # [19:00] <annevk> some sanitize and serializer tests
  592. # [19:01] <rubys> I just tried again... no failures.
  593. # [19:01] <annevk> hmm ok
  594. # [19:02] <annevk> maybe I'm missing something
  595. # [19:05] * Quits: ROBOd (n=robod@86.34.246.154) ("http://www.robodesign.ro")
  596. # [19:13] * Joins: ROBOd (n=robod@86.34.246.154)
  597. # [19:14] * Quits: Dashiva (i=Dashiva@v035b.studby.ntnu.no) (Read error: 104 (Connection reset by peer))
  598. # [19:15] * Joins: Dashiva (i=Dashiva@v035b.studby.ntnu.no)
  599. # [19:21] * Quits: Hixie (i=ianh@trivini.no) (Read error: 104 (Connection reset by peer))
  600. # [19:21] * Joins: Hixie (i=ianh@trivini.no)
  601. # [19:22] * Joins: jruderman (n=jruderma@c-67-169-24-116.hsd1.ca.comcast.net)
  602. # [19:33] * Quits: BenWard (i=BenWard@nat/yahoo/x-c770bf629bc86efc) (Read error: 104 (Connection reset by peer))
  603. # [19:34] * Joins: BenWard (i=BenWard@nat/yahoo/x-5b7d4fb73384f8e8)
  604. # [19:44] <annevk> jgraham, have you looked at handling "comments" within RCDATA and CDATA blocks?
  605. # [19:44] <annevk> jgraham, seems like we need some character buffer
  606. # [19:46] * Joins: h3h (n=w3rd@66-162-32-234.static.twtelecom.net)
  607. # [19:47] * Joins: Ducki (n=Alex@dialin-145-254-186-123.pools.arcor-ip.net)
  608. # [19:48] * Joins: Ducki_ (n=Alex@dialin-212-144-064-154.pools.arcor-ip.net)
  609. # [19:51] * Quits: Ducki (n=Alex@dialin-145-254-186-123.pools.arcor-ip.net) (Read error: 104 (Connection reset by peer))
  610. # [19:51] * Quits: BenWard (i=BenWard@nat/yahoo/x-5b7d4fb73384f8e8) ("Fades out again…")
  611. # [19:55] * Joins: zcorpan_ (n=zcorpan@84-216-40-33.sprayadsl.telenor.se)
  612. # [20:00] * Quits: KevinMarks (i=KevinMar@pdpc/supporter/active/kevinmarks) ("The computer fell asleep")
  613. # [20:01] * Joins: aroben (n=adamrobe@c-69-142-103-232.hsd1.pa.comcast.net)
  614. # [20:01] * Quits: aroben (n=adamrobe@c-69-142-103-232.hsd1.pa.comcast.net) (Remote closed the connection)
  615. # [20:02] * Joins: aroben (n=adamrobe@c-69-142-103-232.hsd1.pa.comcast.net)
  616. # [20:06] * Quits: aroben (n=adamrobe@c-69-142-103-232.hsd1.pa.comcast.net) (Client Quit)
  617. # [20:12] * Parts: webben (i=benh@nat/yahoo/x-41646368fedce086)
  618. # [20:16] * Joins: ddfreyne (n=ddfreyne@d54C57894.access.telenet.be)
  619. # [20:16] * Joins: KevinMarks (i=KevinMar@nat/google/x-10afead0a8b1854b)
  620. # [20:16] * Quits: ddfreyne (n=ddfreyne@unaffiliated/ddfreyne) (Remote closed the connection)
  621. # [20:16] * Joins: ddfreyne (n=ddfreyne@d54C57894.access.telenet.be)
  622. # [20:26] <jgraham> annevk: I was going to ask you the same thing :)
  623. # [20:26] <jgraham> I haven't, yet
  624. # [20:26] <jgraham> As I've been a bit busy
  625. # [20:26] <jgraham> I was happy to see all your checkins today though
  626. # [20:28] * Quits: hendry (i=hendry@conference/debconf/x-609d3a2cbe1a9a2e) (Read error: 110 (Connection timed out))
  627. # [20:55] * Quits: jruderman (n=jruderma@c-67-169-24-116.hsd1.ca.comcast.net)
  628. # [21:01] * om_sleep is now known as othermaciej
  629. # [21:02] <zcorpan_> Philip`: yes, exactly
  630. # [21:07] <Jero> off topic: what do you guys think of this design? http://jero.net/lab/redesign2/
  631. # [21:17] <zcorpan_> Jero: looks a bit like a standard template for a blog
  632. # [21:19] <Jero> well, it is a blog ;), but i see what you mean
  633. # [21:20] <Jero> I'll probably have to dust off my Photoshop skills and try to come up with something original
  634. # [21:21] * Joins: mw22_ (n=chatzill@h8441169151.dsl.speedlinq.nl)
  635. # [21:21] * Quits: mw22 (n=chatzill@h8441169151.dsl.speedlinq.nl) (Read error: 104 (Connection reset by peer))
  636. # [21:22] * mw22_ is now known as mw22
  637. # [21:25] * Quits: maikmerten (n=maikmert@T634f.t.pppool.de) (niven.freenode.net irc.freenode.net)
  638. # [21:25] * Quits: rubys (n=rubys@cpe-075-182-064-252.nc.res.rr.com) (niven.freenode.net irc.freenode.net)
  639. # [21:26] * Quits: ROBOd (n=robod@86.34.246.154) (niven.freenode.net irc.freenode.net)
  640. # [21:26] * Quits: weinigLap (i=weinig@nat/apple/x-71ac892db7432bfb) (niven.freenode.net irc.freenode.net)
  641. # [21:26] * Quits: Lachy (n=Lachlan@203-158-59-119.dyn.iinet.net.au) (niven.freenode.net irc.freenode.net)
  642. # [21:26] * Quits: othermaciej (n=mjs@dsl081-048-145.sfo1.dsl.speakeasy.net) (niven.freenode.net irc.freenode.net)
  643. # [21:26] * Quits: dolphinling (n=chatzill@rbpool5-2.shoreham.net) (niven.freenode.net irc.freenode.net)
  644. # [21:26] * Quits: Yudai (n=Yudai@p931010.tokyte00.ap.so-net.ne.jp) (niven.freenode.net irc.freenode.net)
  645. # [21:26] * Quits: didymos (i=jho@rapwap.razor.dk) (niven.freenode.net irc.freenode.net)
  646. # [21:26] * Joins: holst_ (i=jho@rapwap.razor.dk)
  647. # [21:27] * Joins: ROBOd (n=robod@86.34.246.154)
  648. # [21:27] * Joins: weinigLap (i=weinig@nat/apple/x-71ac892db7432bfb)
  649. # [21:27] * Joins: Lachy (n=Lachlan@203-158-59-119.dyn.iinet.net.au)
  650. # [21:27] * Joins: othermaciej (n=mjs@dsl081-048-145.sfo1.dsl.speakeasy.net)
  651. # [21:27] * Joins: dolphinling (n=chatzill@rbpool5-2.shoreham.net)
  652. # [21:27] * Joins: Yudai (n=Yudai@p931010.tokyte00.ap.so-net.ne.jp)
  653. # [21:27] * Joins: didymos (i=jho@rapwap.razor.dk)
  654. # [21:27] * Quits: didymos (i=jho@rapwap.razor.dk) (Success)
  655. # [21:27] * Quits: KevinMarks (i=KevinMar@pdpc/supporter/active/kevinmarks) (niven.freenode.net irc.freenode.net)
  656. # [21:27] * Quits: zcorpan_ (n=zcorpan@84-216-40-33.sprayadsl.telenor.se) (niven.freenode.net irc.freenode.net)
  657. # [21:27] * Quits: Ducki_ (n=Alex@dialin-212-144-064-154.pools.arcor-ip.net) (niven.freenode.net irc.freenode.net)
  658. # [21:27] * Quits: Hixie (i=ianh@trivini.no) (niven.freenode.net irc.freenode.net)
  659. # [21:27] * Quits: hasather (n=hasather@22.80-203-71.nextgentel.com) (niven.freenode.net irc.freenode.net)
  660. # [21:27] * Quits: jcgregorio (i=chatzill@nat/ibm/x-e7238e70f545cabc) (niven.freenode.net irc.freenode.net)
  661. # [21:27] * Quits: yod (n=ot@bas11-montreal02-1128535778.dsl.bell.ca) (niven.freenode.net irc.freenode.net)
  662. # [21:27] * Quits: MikeSmith (n=MikeSmit@58.157.21.205) (niven.freenode.net irc.freenode.net)
  663. # [21:27] * Quits: bewest (n=ben@httpcraft/bewest) (niven.freenode.net irc.freenode.net)
  664. # [21:27] * Quits: YaaL (i=yaal@hell.pl) (niven.freenode.net irc.freenode.net)
  665. # [21:27] * Quits: annevk (n=annevk@pat-tdc.opera.com) (niven.freenode.net irc.freenode.net)
  666. # [21:27] * Quits: citoyen (i=eira@synth.no) (niven.freenode.net irc.freenode.net)
  667. # [21:28] * Joins: KevinMarks (i=KevinMar@pdpc/supporter/active/kevinmarks)
  668. # [21:28] * Joins: zcorpan_ (n=zcorpan@84-216-40-33.sprayadsl.telenor.se)
  669. # [21:28] * Joins: Ducki_ (n=Alex@dialin-212-144-064-154.pools.arcor-ip.net)
  670. # [21:28] * Joins: Hixie (i=ianh@trivini.no)
  671. # [21:28] * Joins: hasather (n=hasather@22.80-203-71.nextgentel.com)
  672. # [21:28] * Joins: jcgregorio (i=chatzill@nat/ibm/x-e7238e70f545cabc)
  673. # [21:28] * Joins: yod (n=ot@bas11-montreal02-1128535778.dsl.bell.ca)
  674. # [21:28] * Joins: MikeSmith (n=MikeSmit@58.157.21.205)
  675. # [21:28] * Joins: bewest (n=ben@httpcraft/bewest)
  676. # [21:28] * Joins: annevk (n=annevk@pat-tdc.opera.com)
  677. # [21:28] * Joins: citoyen (i=eira@synth.no)
  678. # [21:28] * Quits: MikeSmith (n=MikeSmit@58.157.21.205) (Excess Flood)
  679. # [21:29] * Joins: MikeSmith (n=MikeSmit@58.157.21.205)
  680. # [21:32] <Hixie> annevk: we could define almost standards mode, but there'd be absolutely no detectable conformance criteria in the spec for it :-/
  681. # [21:32] <Hixie> hsivonen: i thought the table _was_ sorted
  682. # [21:33] * Joins: Toolskyn (i=toolskyn@amy.bdick.de)
  683. # [21:35] * Joins: YaaL (i=yaal@hell.pl)
  684. # [21:36] * Joins: maikmerten (n=maikmert@T634f.t.pppool.de)
  685. # [21:38] * Joins: rubys (n=rubys@cpe-075-182-064-252.nc.res.rr.com)
  686. # [21:38] * Joins: jruderman (n=jruderma@guest-230.mountainview.mozilla.com)
  687. # [21:39] * Quits: zcorpan_ (n=zcorpan@84-216-40-33.sprayadsl.telenor.se) (Read error: 110 (Connection timed out))
  688. # [21:42] <Jero> for all those who care: I now implemented the entire tokenization and tree construction algorithms in my HTML5 parser in PHP (http://jero.net/lab/ph5p/)
  689. # [21:42] <Jero> now to get rid of those bugs (http://jero.net/lab/ph5p/tests.html)
  690. # [21:43] <Jero> (and optimizing might not be such a bad idea)
  691. # [21:47] <rubys> jero: have you taken a look at html5lib's testdata directory?
  692. # [21:48] <Jero> not recently, but the tests in my tests.html file are from the first test file
  693. # [21:48] * Joins: Ducki (i=Alex@dialin-212-144-064-209.pools.arcor-ip.net)
  694. # [21:49] * Quits: gsnedders (n=gsnedder@host86-140-190-99.range86-140.btcentralplus.com) (Remote closed the connection)
  695. # [21:50] * Joins: gsnedders (n=gsnedder@host86-140-190-99.range86-140.btcentralplus.com)
  696. # [21:59] <Hixie> annevk: defined almost standards mode
  697. # [22:03] * Joins: dbaron (n=dbaron@corp-242.mountainview.mozilla.com)
  698. # [22:10] * Quits: Ducki_ (n=Alex@dialin-212-144-064-154.pools.arcor-ip.net) (Read error: 113 (No route to host))
  699. # [22:32] * Quits: rubys (n=rubys@cpe-075-182-064-252.nc.res.rr.com) (Read error: 60 (Operation timed out))
  700. # [22:36] <Hixie> jesus, <nobr> is wacked in html parsers
  701. # [22:36] <Hixie> how are we gonna handle _that_
  702. # [22:37] <Dashiva> Compared to what parsers?
  703. # [22:40] <Hixie> how do you mean?
  704. # [22:41] <Dashiva> Just wondered if there was somewhere it wasn't wacked, since you qualified the statement like that
  705. # [22:45] * Quits: maikmerten (n=maikmert@T634f.t.pppool.de) ("Leaving")
  706. # [22:45] * Quits: yod (n=ot@bas11-montreal02-1128535778.dsl.bell.ca) ("Leaving")
  707. # [22:46] <Hixie> Dashiva: oh well like xml parsing
  708. # [22:46] * Quits: ROBOd (n=robod@86.34.246.154) ("http://www.robodesign.ro")
  709. # [22:48] * Joins: kingryan (n=kingryan@corp.technorati.com)
  710. # [22:57] * Quits: jcgregorio (i=chatzill@nat/ibm/x-e7238e70f545cabc) ("ChatZilla 0.9.78.1 [Firefox 2.0.0.4/2007060115]")
  711. # [22:58] * Quits: MikeSmith (n=MikeSmit@58.157.21.205) (Read error: 110 (Connection timed out))
  712. # [22:58] * Quits: ddfreyne (n=ddfreyne@unaffiliated/ddfreyne) ("kthxbai")
  713. # [23:04] * Quits: dbaron (n=dbaron@corp-242.mountainview.mozilla.com) ("8403864 bytes have been tenured, next gc will be global.")
  714. # [23:08] * Quits: othermaciej (n=mjs@dsl081-048-145.sfo1.dsl.speakeasy.net) (Read error: 110 (Connection timed out))
  715. # [23:14] * Quits: Jero (n=Jero@d207230.upc-d.chello.nl) ("ChatZilla 0.9.78.1 [Firefox 2.0.0.4/2007051502]")
  716. # [23:17] <Hixie> hsivonen: yt?
  717. # [23:18] * Joins: dbaron (n=dbaron@corp-242.mountainview.mozilla.com)
  718. # [23:24] * Parts: hasather (n=hasather@22.80-203-71.nextgentel.com)
  719. # [23:27] * Quits: jruderman (n=jruderma@guest-230.mountainview.mozilla.com)
  720. # [23:28] * Joins: Wolfman2000 (n=Wolfman2@wvh5348rn.rh.ncsu.edu)
  721. # [23:34] * Joins: jruderman (n=jruderma@corp-242.mountainview.mozilla.com)
  722. # [23:34] * Quits: KevinMarks (i=KevinMar@pdpc/supporter/active/kevinmarks) ("brb")
  723. # [23:35] * Joins: KevinMarks (i=KevinMar@nat/google/x-241954cbdc7d6ec0)
  724. # [23:47] * Quits: weinigLap (i=weinig@nat/apple/x-71ac892db7432bfb)
  725. # [23:49] * Joins: Ducki_ (i=Alex@dialin-212-144-064-209.pools.arcor-ip.net)
  726. # [23:49] * Quits: Ducki (i=Alex@dialin-212-144-064-209.pools.arcor-ip.net) (Read error: 104 (Connection reset by peer))
  727. # [23:52] * Joins: csarven (n=nevrasc@modemcable081.152-201-24.mc.videotron.ca)
  728. # [23:54] * Quits: Ducki_ (i=Alex@dialin-212-144-064-209.pools.arcor-ip.net) (Read error: 113 (No route to host))
  729. # [23:54] <Philip`> Hmph, now I need to do canvas text rendering :-(
  730. # [23:55] * Joins: othermaciej (n=mjs@17.255.99.40)
  731. # Session Close: Tue Jun 19 00:00:00 2007
