  1. # Session Start: Sun May 13 00:00:00 2007
  2. # Session Ident: #whatwg
  3. # [00:03] * Joins: jruderman (n=jruderma@c-67-169-183-228.hsd1.ca.comcast.net)
  4. # [00:04] * om_out is now known as othermaciej
  5. # [00:50] * Quits: hasather (n=hasather@81-235-209-174-no62.tbcn.telia.com) (Remote closed the connection)
  6. # [00:50] * Joins: zcorpan_ (n=zcorpan@217-211-77-236-no13.tbcn.telia.com)
  7. # [00:50] * Joins: hasather (n=hasather@81-235-209-174-no62.tbcn.telia.com)
  8. # [00:51] * Quits: dbaron (n=dbaron@c-71-198-189-81.hsd1.ca.comcast.net) (Read error: 110 (Connection timed out))
  9. # [00:52] * Joins: MikeSmith (n=MikeSmit@202.33.78.114)
  10. # [01:26] * Quits: zcorpan_ (n=zcorpan@217-211-77-236-no13.tbcn.telia.com) (Read error: 110 (Connection timed out))
  11. # [02:04] * Quits: gavin (n=gavin@firefox/developer/gavin) (Remote closed the connection)
  12. # [02:04] * Joins: gavin (n=gavin@people.mozilla.com)
  13. # [02:06] * Parts: hasather (n=hasather@81-235-209-174-no62.tbcn.telia.com)
  14. # [02:58] * Quits: MikeSmith (n=MikeSmit@202.33.78.114) ("Get thee behind me, satan.")
  15. # [03:18] * Joins: h3h (n=w3rd@cpe-66-75-149-197.san.res.rr.com)
  16. # [03:22] * Quits: bzed (n=bzed@dslb-084-059-126-057.pools.arcor-ip.net) (Remote closed the connection)
  17. # [03:34] * Joins: weinig (n=weinig@adsl-71-134-96-142.dsl.sntc01.pacbell.net)
  18. # [03:34] * Quits: h3h (n=w3rd@cpe-66-75-149-197.san.res.rr.com) ("|")
  19. # [03:42] * Joins: tantek (n=tantek@dsl001-150-252.sfo1.dsl.speakeasy.net)
  20. # [03:56] * Joins: MikeSmith (n=MikeSmit@202.33.78.114)
  21. # [04:07] * Quits: MikeSmith (n=MikeSmit@202.33.78.114) ("Get thee behind me, satan.")
  22. # [04:43] * Quits: weinig (n=weinig@adsl-71-134-96-142.dsl.sntc01.pacbell.net)
  23. # [04:57] * Joins: jcgregorio (n=chatzill@adsl-072-148-043-048.sip.rmo.bellsouth.net)
  24. # [05:04] * Quits: jdandrea (n=jdandrea@ool-44c0a1fe.dyn.optonline.net)
  25. # [05:10] * Joins: mikeday (n=mikeday@CPE-60-224-50-129.vic.bigpond.net.au)
  26. # [05:10] * Quits: virtuelv (n=virtuelv@pat-tdc.opera.com) (heinlein.freenode.net irc.freenode.net)
  27. # [05:10] * Quits: mw22 (n=chatzill@h8441169151.dsl.speedlinq.nl) (heinlein.freenode.net irc.freenode.net)
  28. # [05:10] * Quits: moeffju (i=moeffju@ubermutant.net) (heinlein.freenode.net irc.freenode.net)
  29. # [05:10] * Quits: ianloic (n=ian@71.5.56.162.ptr.us.xo.net) (heinlein.freenode.net irc.freenode.net)
  30. # [05:10] * Quits: jruderman (n=jruderma@c-67-169-183-228.hsd1.ca.comcast.net) (heinlein.freenode.net irc.freenode.net)
  31. # [05:10] * Quits: Lachy (n=Lachlan@203-214-143-196.perm.iinet.net.au) (heinlein.freenode.net irc.freenode.net)
  32. # [05:10] * Quits: tantek (n=tantek@dsl001-150-252.sfo1.dsl.speakeasy.net) (heinlein.freenode.net irc.freenode.net)
  33. # [05:10] * Quits: csarven (n=nevrasc@modemcable081.152-201-24.mc.videotron.ca) (heinlein.freenode.net irc.freenode.net)
  34. # [05:10] * Quits: annevk (n=annevk@pat-tdc.opera.com) (heinlein.freenode.net irc.freenode.net)
  35. # [05:10] * Quits: hays (n=hays@pool-138-88-199-16.res.east.verizon.net) (heinlein.freenode.net irc.freenode.net)
  36. # [05:10] * Joins: tantek (n=tantek@dsl001-150-252.sfo1.dsl.speakeasy.net)
  37. # [05:10] * Joins: jruderman (n=jruderma@c-67-169-183-228.hsd1.ca.comcast.net)
  38. # [05:10] * Joins: csarven (n=nevrasc@modemcable081.152-201-24.mc.videotron.ca)
  39. # [05:10] * Joins: annevk (n=annevk@pat-tdc.opera.com)
  40. # [05:10] * Joins: hays (n=hays@pool-138-88-199-16.res.east.verizon.net)
  41. # [05:10] * Joins: Lachy (n=Lachlan@203-214-143-196.perm.iinet.net.au)
  42. # [05:10] * Joins: virtuelv (n=virtuelv@pat-tdc.opera.com)
  43. # [05:10] * Joins: mw22 (n=chatzill@h8441169151.dsl.speedlinq.nl)
  44. # [05:10] * Joins: moeffju (i=moeffju@ubermutant.net)
  45. # [05:10] * Joins: ianloic (n=ian@71.5.56.162.ptr.us.xo.net)
  46. # [05:15] * Quits: tantek (n=tantek@dsl001-150-252.sfo1.dsl.speakeasy.net)
  47. # [05:17] * Quits: wakaba_ (n=w@118.166.210.220.dy.bbexcite.jp) (heinlein.freenode.net irc.freenode.net)
  48. # [05:17] * Quits: gsnedders (n=gsnedder@host86-139-123-225.range86-139.btcentralplus.com) (heinlein.freenode.net irc.freenode.net)
  49. # [05:17] * Quits: syp (n=syp@photpc17.epfl.ch) (heinlein.freenode.net irc.freenode.net)
  50. # [05:17] * Quits: clotman (n=louis@shell.icgroup.com) (heinlein.freenode.net irc.freenode.net)
  51. # [05:17] * Quits: Philip` (n=philip@zaynar.demon.co.uk) (heinlein.freenode.net irc.freenode.net)
  52. # [05:17] * Quits: wilhelm (n=wilhelm@trivini.no) (heinlein.freenode.net irc.freenode.net)
  53. # [05:17] * Quits: theoros (n=theoros@ACC8D244.ipt.aol.com) (heinlein.freenode.net irc.freenode.net)
  54. # [05:17] * Quits: laug (n=laug@poy.chewa.net) (heinlein.freenode.net irc.freenode.net)
  55. # [05:17] * Quits: didymos (i=jho@rapwap.razor.dk) (heinlein.freenode.net irc.freenode.net)
  56. # [05:17] * Quits: bewest (n=ben@httpcraft/bewest) (heinlein.freenode.net irc.freenode.net)
  57. # [05:17] * Quits: deltab (n=deltab@82-46-154-93.cable.ubr02.smal.blueyonder.co.uk) (heinlein.freenode.net irc.freenode.net)
  58. # [05:17] * Quits: madmoose (i=madmoose@gateway/web/cgi-irc/beitsahour.net/x-a6a69e0cd54b3b1a) (heinlein.freenode.net irc.freenode.net)
  59. # [05:17] * Quits: hsivonen (n=hsivonen@kekkonen.cs.hut.fi) (heinlein.freenode.net irc.freenode.net)
  60. # [05:17] * Quits: Hixie (n=ianh@trivini.no) (heinlein.freenode.net irc.freenode.net)
  61. # [05:17] * Quits: othermaciej (n=mjs@dsl081-048-145.sfo1.dsl.speakeasy.net) (heinlein.freenode.net irc.freenode.net)
  62. # [05:17] * Quits: Dashiva (i=Dashiva@v035b.studby.ntnu.no) (heinlein.freenode.net irc.freenode.net)
  63. # [05:17] * Quits: Yudai (n=Yudai@p931d95.tokyte00.ap.so-net.ne.jp) (heinlein.freenode.net irc.freenode.net)
  64. # [05:17] * Joins: wakaba_ (n=w@118.166.210.220.dy.bbexcite.jp)
  65. # [05:17] * Joins: gsnedders (n=gsnedder@host86-139-123-225.range86-139.btcentralplus.com)
  66. # [05:17] * Joins: syp (n=syp@photpc17.epfl.ch)
  67. # [05:17] * Joins: clotman (n=louis@shell.icgroup.com)
  68. # [05:17] * Joins: Philip` (n=philip@zaynar.demon.co.uk)
  69. # [05:17] * Joins: wilhelm (n=wilhelm@trivini.no)
  70. # [05:17] * Joins: othermaciej (n=mjs@dsl081-048-145.sfo1.dsl.speakeasy.net)
  71. # [05:17] * Joins: Dashiva (i=Dashiva@v035b.studby.ntnu.no)
  72. # [05:17] * Joins: Yudai (n=Yudai@p931d95.tokyte00.ap.so-net.ne.jp)
  73. # [05:17] * Joins: madmoose (i=madmoose@gateway/web/cgi-irc/beitsahour.net/x-a6a69e0cd54b3b1a)
  74. # [05:17] * Joins: hsivonen (n=hsivonen@kekkonen.cs.hut.fi)
  75. # [05:17] * Joins: Hixie (n=ianh@trivini.no)
  76. # [05:18] * Joins: theoros (n=theoros@ACC8D244.ipt.aol.com)
  77. # [05:18] * Joins: bewest (n=ben@httpcraft/bewest)
  78. # [05:18] * Joins: didymos (i=jho@rapwap.razor.dk)
  79. # [05:18] * Joins: laug (n=laug@poy.chewa.net)
  80. # [05:18] * Joins: deltab (n=deltab@82-46-154-93.cable.ubr02.smal.blueyonder.co.uk)
  81. # [05:24] * Joins: theoros` (n=theoros@ACC8D244.ipt.aol.com)
  82. # [05:25] * Quits: theoros` (n=theoros@ACC8D244.ipt.aol.com) (Read error: 104 (Connection reset by peer))
  83. # [05:26] * Quits: mikeday (n=mikeday@CPE-60-224-50-129.vic.bigpond.net.au) ("-")
  84. # [05:31] * Joins: h3h (n=w3rd@cpe-66-75-149-197.san.res.rr.com)
  85. # [05:31] * Quits: h3h (n=w3rd@cpe-66-75-149-197.san.res.rr.com) (Client Quit)
  86. # [06:12] * theoros is now known as theoros|asleep
  87. # [06:18] * Quits: theoros|asleep (n=theoros@ACC8D244.ipt.aol.com) (Excess Flood)
  88. # [06:19] * Joins: theoros|asleep (n=theoros@ACC8D244.ipt.aol.com)
  89. # [06:35] * Joins: h3h (n=w3rd@cpe-66-75-149-197.san.res.rr.com)
  90. # [06:44] * Joins: tantek (n=tantek@66.201.57.7)
  91. # [06:52] * Quits: tantek (n=tantek@66.201.57.7)
  92. # [07:13] * Quits: jcgregorio (n=chatzill@adsl-072-148-043-048.sip.rmo.bellsouth.net) ("ChatZilla 0.9.78.1 [Firefox 2.0.0.3/0000000000]")
  93. # [07:32] * Joins: weinig (n=weinig@c-24-7-121-96.hsd1.ca.comcast.net)
  94. # [07:34] * Joins: tantek (n=tantek@66.201.57.7)
  95. # [07:46] * Quits: csarven (n=nevrasc@modemcable081.152-201-24.mc.videotron.ca)
  96. # [08:02] * Quits: tantek (n=tantek@66.201.57.7)
  97. # [08:32] * Joins: dbaron (n=dbaron@c-71-198-189-81.hsd1.ca.comcast.net)
  98. # [09:06] * Quits: dbaron (n=dbaron@c-71-198-189-81.hsd1.ca.comcast.net) ("8403864 bytes have been tenured, next gc will be global.")
  99. # [09:07] * Quits: Lachy (n=Lachlan@203-214-143-196.perm.iinet.net.au) (Read error: 104 (Connection reset by peer))
  100. # [09:08] * Joins: Lachy (n=Lachlan@203-214-143-196.perm.iinet.net.au)
  101. # [09:12] * Quits: h3h (n=w3rd@cpe-66-75-149-197.san.res.rr.com)
  102. # [09:26] * weinig is now known as weinig|zZz
  103. # [09:27] * Joins: mikeday (n=mikeday@CPE-60-224-50-129.vic.bigpond.net.au)
  104. # [09:28] <mikeday> is whatwg.org down or is it just me?
  105. # [09:34] <Lachy> it appears to be down
  106. # [09:35] <mikeday> is the HTML5 spec anywhere else, like w3.org?
  107. # [09:35] <Lachy> yes, in CVS
  108. # [09:35] <Lachy> dev.w3.org
  109. # [09:35] <Lachy> http://dev.w3.org/cvsweb/html5/
  110. # [09:35] * Quits: weinig|zZz (n=weinig@c-24-7-121-96.hsd1.ca.comcast.net) (Read error: 60 (Operation timed out))
  111. # [09:36] <mikeday> awesome :)
  112. # [09:36] * Quits: Lachy (n=Lachlan@203-214-143-196.perm.iinet.net.au) ("Leaving")
  113. # [09:37] * Joins: Lachy (n=Lachlan@203-217-95-91.dyn.iinet.net.au)
  114. # [09:42] * Joins: tantek (n=tantek@adsl-63-195-114-133.dsl.snfc21.pacbell.net)
  115. # [10:10] * Quits: othermaciej (n=mjs@dsl081-048-145.sfo1.dsl.speakeasy.net)
  116. # [10:20] * Quits: tantek (n=tantek@adsl-63-195-114-133.dsl.snfc21.pacbell.net)
  117. # [10:43] * Joins: ROBOd (n=robod@86.34.246.154)
  118. # [10:52] * Joins: zcorpan_ (n=zcorpan@217-211-77-236-no13.tbcn.telia.com)
  119. # [10:53] <mikeday> Hmm, the HTML5 spec seems to say that comments cannot occur before the root element
  120. # [10:54] <zcorpan_> mikeday: where do you read that?
  121. # [10:54] <mikeday> tree construction, 8.2.4.1. The initial phase
  122. # [10:55] <zcorpan_> that's before the doctype, no?
  123. # [10:55] <hsivonen> hmm. looks like the entire dreamhost is down
  124. # [10:55] <hsivonen> can't get to damowmow portal or the DOM viewer to check this
  125. # [10:56] <mikeday> ah, so only before the doctype
  126. # [10:56] <hsivonen> dreamhost has been down a bit too often lately
  127. # [10:56] <zcorpan_> mikeday: yeah... but then the #writing section goes ahead and says that comments are allowed before the doctype
  128. # [10:56] <mikeday> hrmph, that's helpful :)
  129. # [10:57] * zcorpan_ pointed that out before
  130. # [10:58] <mikeday> U+00 is converted to U+FFFD, but what about other weird characters like U+07?
  131. # [11:00] <hsivonen> mikeday: other weird stuff is preserved
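
The behaviour described in this exchange can be sketched in a few lines. This is only an illustration of what mikeday and hsivonen say about the draft at the time (U+0000 becomes U+FFFD, other control characters such as U+0007 pass through); the function name `replace_nul` is hypothetical and not taken from html5lib or the spec.

```python
def replace_nul(text: str) -> str:
    """Replace U+0000 with U+FFFD and leave every other character untouched.

    Assumption: this mirrors only the behaviour described in the log,
    not the full input preprocessing of any particular parser.
    """
    return text.replace("\x00", "\ufffd")
```

For instance, a string containing both U+0000 and U+0007 would come back with only the U+0000 replaced.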
  132. # [11:00] * hsivonen has complained about that before
  133. # [11:00] * mikeday is noticing a pattern here
  134. # [11:01] <mikeday> okay, one more thing: what does RCDATA stand for?
  135. # [11:02] <zcorpan_> replaced character data
  136. # [11:03] <mikeday> what exactly is replaced about it?
  137. # [11:03] <zcorpan_> entities
  138. # [11:03] <mikeday> can have entities... ah.
  139. # [11:32] <annevk> doesn't really matter what it stands for...
  140. # [11:32] <annevk> just implement the steps
  141. # [11:32] * Joins: met_ (n=Hassman@r5bx220.net.upc.cz)
  142. # [11:33] * Joins: aroben (n=adamrobe@c-67-160-250-192.hsd1.ca.comcast.net)
  143. # [11:37] * Joins: tantek (n=tantek@adsl-63-195-114-133.dsl.snfc21.pacbell.net)
  144. # [11:39] * Joins: peepo (n=Jay@host81-132-186-246.range81-132.btcentralplus.com)
  145. # [11:39] * Quits: peepo (n=Jay@host81-132-186-246.range81-132.btcentralplus.com) (Remote closed the connection)
  146. # [11:40] * Joins: peepo (n=Jay@host81-132-186-246.range81-132.btcentralplus.com)
  147. # [11:48] * Joins: maikmerten (n=maikmert@Lba02.l.pppool.de)
  148. # [11:49] * Joins: hasather (n=hasather@81-235-209-174-no62.tbcn.telia.com)
  149. # [11:51] * Quits: zcorpan_ (n=zcorpan@217-211-77-236-no13.tbcn.telia.com) (Read error: 110 (Connection timed out))
  150. # [11:57] <annevk> oh, whatwg is down?
  151. # [11:58] <annevk> is the mail server down too?
  152. # [11:58] * annevk wonders how that works
  153. # [12:01] <annevk> it seems that lists.whatwg.org is not down
  154. # [12:01] <annevk> on the other hand, my e-mail hasn't made it through to the archives yet...
  155. # [12:07] * Joins: jdandrea (n=jdandrea@ool-44c0a1fe.dyn.optonline.net)
  156. # [12:26] <mikeday> hi annevk
  157. # [12:27] <mikeday> took a look at the html5lib code, looks rather clean
  158. # [12:28] <mikeday> just toying with some C code
  159. # [12:28] <mikeday> it's a shame that you've got to do so much irrelevant stuff in C, though.
  160. # [12:31] <annevk> python is nice
  161. # [12:31] <annevk> especially to "quickly" prototype stuff like this
  162. # [12:31] <annevk> the problem is that it doesn't scale well for very large pages, such as the HTML5 spec
  163. # [12:32] <mikeday> you could probably speed it up, at the risk of making the code much uglier...
  164. # [12:34] <annevk> yeah... rather have a fast C implementation with Python wrappers I think
  165. # [12:35] * Quits: peepo (n=Jay@host81-132-186-246.range81-132.btcentralplus.com) (Read error: 104 (Connection reset by peer))
  166. # [12:36] <mikeday> that's the spirit, outsource the ugliness somewhere else :)
  167. # [12:38] * mikeday ponders
  168. # [12:38] <mikeday> the data state can have a very tight inner loop, just scanning for the next & or <
  169. # [12:39] <annevk> or EOF
  170. # [12:39] <annevk> charsUntil() handles EOF automatically
  171. # [12:39] <annevk> so you know
  172. # [12:40] <mikeday> I'm assuming you're working on a chunk of data, so you know there is no EOF in the middle of the chunk
  173. # [12:41] <annevk> if you do script execution document.close() might do that
  174. # [12:41] * annevk isn't sure
  175. # [12:41] <annevk> but it depends on how you implement stuff, I guess
  176. # [12:41] <mikeday> right
  177. # [12:42] <mikeday> I wonder which is faster: if '&' else if '<' else ..., or a table lookup
  178. # [12:42] <mikeday> eg. if charTable[currChar] == MARKUP_CHAR
  179. # [12:43] <annevk> from the little I know I believe table lookup is faster
  180. # [12:43] <annevk> however, how would you handle "any other character" in that case?
  181. # [12:43] <annevk> (I don't think I'm the right person to discuss this with though.)
  182. # [12:44] <mikeday> any other character would be the else case
  183. # [12:45] <annevk> that would work nicely then I suppose
  184. # [12:45] <mikeday> if (... == MARKUP_CHAR) { change state } else { keep accumulating character data }
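
A minimal sketch of the two inner loops being compared here, written in Python for readability rather than in the C mikeday is experimenting with. All names (`CHAR_TABLE`, `MARKUP_CHAR`, `scan_if_else`, `scan_table`) are hypothetical, and the chunk is assumed to be a byte string with no EOF in the middle, as mikeday assumes above. Timing these in Python would say nothing about the relative speed of the equivalent C code.

```python
# Hypothetical names throughout; not taken from html5lib or any real tokenizer.
MARKUP_CHAR = 1

# 256-entry classification table: only '&' and '<' are flagged.
CHAR_TABLE = bytearray(256)
CHAR_TABLE[ord("&")] = MARKUP_CHAR
CHAR_TABLE[ord("<")] = MARKUP_CHAR


def scan_if_else(chunk: bytes, start: int = 0) -> int:
    """Return the index of the next '&' or '<' at or after start, else len(chunk)."""
    i = start
    while i < len(chunk):
        c = chunk[i]
        if c == 0x26:  # '&'
            break
        elif c == 0x3C:  # '<'
            break
        i += 1  # "any other character": keep accumulating character data
    return i


def scan_table(chunk: bytes, start: int = 0) -> int:
    """Same result as scan_if_else, using one table lookup per byte."""
    i = start
    while i < len(chunk) and CHAR_TABLE[chunk[i]] != MARKUP_CHAR:
        i += 1
    return i
```

Both functions return the same index, so either could sit behind the same tokenizer interface and be swapped when measuring on real data.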
  185. # [12:45] <mikeday> always frustrates me that efficient code looks less and less like the specification, though
  186. # [12:45] <mikeday> we still don't have a magical compiler that converts spec -> code
  187. # [12:46] <annevk> just use the tests from html5lib
  188. # [12:46] <annevk> and maybe contribute some more
  189. # [12:46] <annevk> and pay some attention to the spec too :)
  190. # [12:47] <mikeday> right :)
  191. # [12:50] <mikeday> hmm, using the HTML5 spec as a test document is rather meta
  192. # [12:50] <mikeday> especially considering it's not very well-formed :/
  193. # [12:51] <annevk> the multipage version of HTML5 is generated using html5lib
  194. # [12:51] <annevk> that's meta
  195. # [12:52] <mikeday> neat :)
  196. # [12:55] <hsivonen> mikeday: do you use a DFA for XML?
  197. # [12:56] <mikeday> hsivonen, not yet, but I'd like to
  198. # [12:56] <mikeday> I've generated one, but haven't got around to building a parser around it yet.
  199. # [12:56] <hsivonen> mikeday: surely a function call per tokenizer state is good enough considering that it is the de facto way to write XML parsers
  200. # [12:57] * mikeday shrugs
  201. # [12:57] <mikeday> for HTML5 you mean?
  202. # [12:57] <hsivonen> I intend to optimize away the explicit state variable but I hesitate going all the way to a hand-rolled DFA
  203. # [12:58] <hsivonen> mikeday: I meant a function call (possibly inlined by compiler) per state in the HTML5 tokenizer spec
  204. # [12:58] <hsivonen> mikeday: the XML parsers that I've looked at work roughly that way
  205. # [12:59] <mikeday> after looking at the spec, I've seen that the state machine is rather more complicated than the average DFA
  206. # [12:59] <mikeday> with XML it's easier, as you're going from grammar to DFA
  207. # [13:00] <annevk> there are some additional switches indeed based on tree construction feedback
  208. # [13:00] <annevk> although I think you should be able to integrate those too
  209. # [13:00] <mikeday> right, it would take a bit of messing around though
  210. # [13:00] <annevk> (it leads you further away from the spec though)
  211. # [13:00] <mikeday> that too.
  212. # [13:00] <annevk> shouldn't be much of an issue I think...
  213. # [13:01] <mikeday> by the way, a tiny test seems to show that the if/else is slightly faster than table
  214. # [13:01] <mikeday> if only two characters are being checked for
  215. # [13:01] <annevk> see, don't trust me :)
  216. # [13:01] <mikeday> but if three or more characters are being checked for, table wins by far
  217. # [13:01] <annevk> oh, ok :)
  218. # [13:01] <mikeday> eg. for whitespace characters it would be a win
  219. # [13:02] <mikeday> for the data state inner loop, not so much
  220. # [13:02] <hsivonen> I wonder if it is possible to construct a hash function that hashes all UTF-16 code units to a small range of integers so that markup-significant characters get unique scalars and neutral characters overlap
  221. # [13:03] * mikeday grins
  222. # [13:03] <hsivonen> (and effient one, that is)
  223. # [13:03] <hsivonen> efficient even
  224. # [13:03] <mikeday> let's see, markup significant characters are all < U+007F
  225. # [13:04] <mikeday> just make sure that everything above 127 is mapped to 127..255 range
  226. # [13:05] <mikeday> and ASCII stays as it is
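
One way to read mikeday's suggestion is as a folding function that keeps ASCII code units in their own slots and squeezes everything else into the upper half of a 256-entry table. The sketch below is an assumption-laden illustration: the name `fold_code_unit` is hypothetical, and it folds non-ASCII units into 128..255 rather than the 127..255 mentioned above, so that U+007F keeps its own slot.

```python
def fold_code_unit(unit: int) -> int:
    """Map a UTF-16 code unit (0..0xFFFF) to a table index in 0..255.

    ASCII (0..127) maps to itself, so markup-significant characters keep
    unique indices. Every other code unit lands somewhere in 128..255,
    where neutral characters are allowed to overlap.
    """
    if unit < 0x80:
        return unit
    return 0x80 | (unit & 0x7F)
```

The result can be used directly as an index into a 256-entry classification table whose upper half is filled with a single "neutral character" class.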
  227. # [13:05] <mikeday> or do you want & and < to map to the same small integer?
  228. # [13:05] <hsivonen> didn't think that far
  229. # [13:05] <hsivonen> gotta go. later
  230. # [13:05] * mikeday waves
  231. # [13:06] <mikeday> hrm, jumping into the micro-optimisation, I forgot that no one uses UTF-16 anyway
  232. # [13:06] <mikeday> (for given values of no one)
  233. # [13:08] * Quits: aroben (n=adamrobe@c-67-160-250-192.hsd1.ca.comcast.net) (Read error: 110 (Connection timed out))
  234. # [13:10] <annevk> in some states unicode chars are important
  235. # [13:10] <mikeday> ?
  236. # [13:10] <annevk> tag name state
  237. # [13:10] <annevk> but I suppose that doesn't matter much
  238. # [13:11] <annevk> that's actually in the anything else case so...
  239. # [13:11] <annevk> nm me
  240. # [13:11] <mikeday> I noticed that the tag names all get lowercased
  241. # [13:11] <mikeday> that would mean that <camelCase> XML tags can't be embedded in HTML5, right?
  242. # [13:14] <annevk> ASCII lowercase, yes
  243. # [13:14] <annevk> XML can't be embedded in HTML5
  244. # [13:14] <mikeday> true, you could have camelCase tags as long as they use accented letters :)
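
The point about accented letters follows from lowercasing only A to Z. A small sketch, assuming a hypothetical helper `ascii_lowercase` (this is not html5lib's actual code):

```python
import string

# Translation table that lowercases A-Z only and leaves all other characters alone.
ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_lowercase(tag_name: str) -> str:
    """Lowercase ASCII letters only, unlike str.lower(), which is Unicode-aware."""
    return tag_name.translate(ASCII_LOWER)
```

With this helper, `camelCase` becomes `camelcase`, while an uppercase accented letter such as `É` is left as it is; Python's built-in `str.lower()` would lowercase both.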
  245. # [13:15] <mikeday> are unknown tags still added to the DOM?
  246. # [13:16] <annevk> of course
  247. # [13:16] <annevk> there's in fact no difference between "unknown tags" and <span> for instance
  248. # [13:16] <annevk> (iirc)
  249. # [13:17] <mikeday> so arbitrary vocabularies can be included,
  250. # [13:17] <mikeday> as long as they don't require <camelCase>
  251. # [13:17] <mikeday> or plain uppercase, for that matter
  252. # [13:17] <mikeday> seems like MathML would work fine
  253. # [13:18] <annevk> there's no namespace support either
  254. # [13:19] <annevk> but in due course we would add limited support for that I suppose
  255. # [13:20] <Philip`> annevk: Doesn't http://dev.w3.org/cvsweb/~checkout~/html5/spec/Overview.html?rev=1.12&content-type=text/html;%20charset=iso-8859-1#pixel cover the points about how an arbitrary object is treated as ImageData?
  256. # [13:21] <annevk> oh, I think I've been looking at an old version of the spec
  257. # [13:22] <Philip`> mikeday: I'd expect table lookups to usually be much slower than if/elses in real programs because you won't be able to keep the lookup table in the cache for very long (if you're processing lots of other data at the same time) and it'll have to do really expensive memory reads
  258. # [13:23] <Philip`> People used to use lookup tables for fast sin/cos calculations, but now it's much quicker just to get the CPU to recalculate it every time because memory is slow
  259. # [13:23] <mikeday> Philip`, the table is pretty small, 256 bytes, but the processing other data at the same time constraint could be a problem
  260. # [13:24] <Philip`> Caches are pretty small too :-)
  261. # [13:24] <annevk> thanks Philip`
  262. # [13:24] <Philip`> (like, uh, 16KB or something?)
  263. # [13:24] <Philip`> (depending on what processor you have)
  264. # [13:25] <mikeday> the whitespace test requires five else if branches, though
  265. # [13:26] <mikeday> at least it wouldn't be hard to try both methods on real world data
  266. # [13:26] <mikeday> as it's not really fundamental to the structure of the code
  267. # [13:33] <met_> annevk why is on http://annevankesteren.nl/2006/08-paintr21 It works in Firefox (given a few hacks), with the notable exception of the "Save it!" button.? Save works for me in FF 2.0.0.3
  268. # [13:34] <met_> the only difference is nice Paintr logo in Opera vs. text logo in FF
  269. # [13:35] * Quits: mikeday (n=mikeday@CPE-60-224-50-129.vic.bigpond.net.au) ("-")
  270. # [13:36] <met_> ah see the logo is made by css content:url
  271. # [13:38] <annevk> that thing was made before FF2
  272. # [13:38] <met_> can you update the text? 8-)))
  273. # [13:43] * Joins: bzed (n=bzed@dslb-084-059-108-031.pools.arcor-ip.net)
  274. # [13:44] * Joins: dk (i=dk@gouax1-151.dialup.optusnet.com.au)
  275. # [14:19] * Philip` wonders if <div irrelevant><img ...><img ...></div> would be a sensible way of pre-loading images to be used in a canvas, so you can just wait for window.onload and then be sure all the images are loaded
  276. # [14:20] <annevk> I think if you do img.src in a script the load event is delayed as well
  277. # [14:22] <Philip`> Oh, that sounds better
  278. # [14:29] <Dashiva> What's the deal with r\^ole?
  279. # [14:30] <Philip`> It's the (La)TeX spelling, I believe
  280. # [14:31] <Dashiva> of rôle?
  281. # [14:32] <Philip`> Maybe, but my IRC client mangles that
  282. # [14:32] * Philip` looks in the log
  283. # [14:32] <Philip`> Ah, yes, that
  284. # [14:33] <Philip`> Same as r&ocirc;le too, but not quite so ugly
  285. # [14:34] <Dashiva> But what's wrong with just role, was more my question
  286. # [14:34] <Lachy> aargh! I've asked 3 times for Patrick (or anyone else) to provide examples of tables that would benefit from the headers attribute, and each time he's bypassed the question entirely
  287. # [14:34] <annevk> lol, people are wasting their time on www-html? :)
  288. # [14:34] <Lachy> it's so annoying that they won't contribute when asked, and then bitch about being ignored
  289. # [14:35] <annevk> they are indeed
  290. # [14:35] <Philip`> Oh - just spelling it "role" seems far more sensible :-)
  291. # [14:35] <annevk> fun
  292. # [14:35] <Dashiva> Isn't that what the semantic web is all about?
  293. # [14:35] <Dashiva> Getting other people to do all the work, and then complaining about nothing happening
  294. # [14:40] <Philip`> That sounds like the approach of getting authors to mark up all their data correctly in a machine-processable form, so you can build advanced search engines on the semantic web that correctly understand the relationships between pieces of data
  295. # [14:41] <Philip`> compared to e.g. Google, which just puts up with whatever rubbish authors create
  296. # [14:41] <Philip`> but it's kind of obvious which one is doing better at the moment
  297. # [14:48] <maikmerten> wow, seems Opera's layout engine is 1345% more green than other competing engines... impressive http://en.wikipedia.org/wiki/Comparison_of_layout_engines_(WHATWG)
  298. # [14:49] <maikmerten> one keeps wondering why such things make it into Wikipedia
  299. # [14:49] <Dashiva> Probably because all browsers have their share of fanatical fanboys
  300. # [14:50] <annevk> prolly also because it doesn't list all the WHATWG features
  301. # [14:53] <Philip`> You could replace the whole first table with "Web Forms 2.0: No ? Yes" and then Opera wouldn't be seen as having such an unfair lead
  302. # [14:54] <Dashiva> Thinking of it as a lead is a problem to begin with, IMO
  303. # [14:55] <annevk> Safari for instance does support type=range iirc
  304. # [14:55] <annevk> Firefox supports persistent storage
  305. # [14:55] <Philip`> Also one could change <video> to no in Opera, because it's not fair to count very experimental builds that don't even match the WA1 spec
  306. # [14:55] <annevk> Internet Explorer supports parts of drag & drop, draggable, contenteditable, etc.
  307. # [14:57] * Philip` wonders if anyone has made a <canvas> paint program that can save and load from globalStorage
  308. # [14:57] <Philip`> Oh, actually, that wouldn't work because you can't draw data: images then call toDataURL again :-(
  309. # [15:01] <annevk> Maybe the new definition of origin helps with that?
  310. # [15:02] <annevk> Cause in theory that would be a safe image, unless you got it after a redirect
  311. # [15:02] <Philip`> "The origin of a Document or image that was generated from a data: URI found in another Document or in a script is the origin of the that Document or script." - oh, sounds like that covers it
  312. # [15:03] * theoros|asleep is now known as theoros
  313. # [15:03] <annevk> Although if you store it in globalStorage and then retrieve it later...
  314. # [15:03] * annevk ponders
  315. # [15:04] <Philip`> You'd just get a string out of globalStorage, and I assume strings don't have complex security arrangements
  316. # [15:04] <Philip`> and then you'd create an image from that string, but that image would be created in your own document
  317. # [15:05] <annevk> sounds tricky
  318. # [15:06] <Philip`> (If you've got the data: string, you could rewrite libpng in JS and get the image data anyway, so the only problem is in whether you're allowed to get the string in the first place)
  319. # [15:06] <Philip`> (and you should be allowed to get strings from globalStorage, because otherwise it'd be a bit pointless...)
  320. # [15:06] <Philip`> but I don't know if that agrees with what the spec says
  321. # [15:07] <annevk> I suppose data: URLs not retrieved from <img> objects or non-same origin <canvas> objects are to be considered safe
  322. # [15:07] <annevk> and that therefore invoking toDataURL() should not fail and drawImage() should not mark the <canvas> object non-same origin
  323. # [15:11] <annevk> I suppose the problem is that painting a data URL might not always be safe
  324. # [15:53] * Quits: gsnedders (n=gsnedder@host86-139-123-225.range86-139.btcentralplus.com)
  325. # [16:00] * Joins: gsnedders (n=gsnedder@host86-139-123-225.range86-139.btcentralplus.com)
  326. # [16:34] * Quits: Lachy (n=Lachlan@203-217-95-91.dyn.iinet.net.au) (Read error: 104 (Connection reset by peer))
  327. # [16:54] * Joins: zcorpan_ (n=zcorpan@217-211-77-236-no13.tbcn.telia.com)
  328. # [17:10] * Joins: Lachy (n=Lachlan@203-217-95-91.dyn.iinet.net.au)
  329. # [17:45] * Joins: csarven (n=nevrasc@modemcable081.152-201-24.mc.videotron.ca)
  330. # [17:48] <annevk> http://weblog.200ok.com.au/2007/05/what-i-want-from-new-markup-spec.html
  331. # [17:52] <Lachy> hmm. Looks like we need some kind of tutorial to explain how the heading structure works
  332. # [17:53] <annevk> http://www.kavoir.com/2007/05/html5-adopted-by-w3c.html is someone who thinks Chris Wilson will be editor
  333. # [17:56] <Philip`> Also thinks Microsoft is one of the key contributing groups in the WHAT-WG
  334. # [17:56] <annevk> http://ma.gnolia.com/people/apartness/bookmarks/prejesh
  335. # [17:58] <annevk> http://www.designerstalk.com/forums/web-standards/26075-web-standards-danger.html
  336. # [17:59] <annevk> http://www.elementary-group-standards.com/web-standards/web-standards-html5-support-existing-content.html
  337. # [18:07] * met_ is glad he is wringting in Czech only, so all his mistakes cannot by discussed here 8-)
  338. # [18:08] <annevk> I wonder why people on www-html think there was some arbitrary decision process going on... The sole reason <samp> and such are still here is because dropping them would cost more.
  339. # [18:09] <annevk> I think there have hardly been any arbitrary decisions with regards to HTML5
  340. # [18:14] <wilhelm> Why would one want to drop such elements?
  341. # [18:17] <csarven> annevk tsk tsk <m>
  342. # [18:19] <Lachy> annevk, I think he's just using code, samp, etc. to make a point about dropping things like headers="" and summary=""
  343. # [18:20] <Lachy> personally, I somewhat agree with keeping headers (I'm just trying to get them to help find evidence for it), though I'm undecided about summary
  344. # [18:22] <Philip`> http://canvex.lazyilluminati.com/misc/summary.html is how people seem to be using summary now
  345. # [18:23] <Philip`> ((Can't remember if I pointed that out here before))
  346. # [18:25] <Lachy> Philip`, what was the total sample size surveyed?
  347. # [18:27] <Lachy> wow, so many of them are used for presentational purposes
  348. # [18:30] <Philip`> That was 2523 pages, of which 105 had a summary attribute anywhere
  349. # [18:31] * Joins: h3h (n=w3rd@cpe-66-75-149-197.san.res.rr.com)
  350. # [18:31] <Lachy> I think we need a larger sample size
  351. # [18:31] <Philip`> The results are probably misleading because a few sites have a lot of distinct summaries
  352. # [18:32] <Lachy> the results should be grouped by domain name to deal with that
  353. # [18:33] <Philip`> It also seems quite hard to analyse the results automatically since pretty much everyone uses totally different strings (except for those that use "")
  354. # [18:33] <Philip`> But it would be useful to get much better data than this
  355. # [18:34] <Lachy> yeah, you could probably try to filter on things like the word "layout" and maybe the length (e.g. < 4 words is relatively useless)
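
The two ideas floated here (grouping results by domain, and filtering on the word "layout" or on very short summaries) could be sketched roughly as follows. This is not Philip`'s tool, whose code is not shown in the log; the function names and the `(page_url, summary)` record shape are assumptions made for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def looks_unhelpful(summary: str) -> bool:
    """Heuristic from the discussion: fewer than 4 words, or mentions 'layout'."""
    words = summary.split()
    return len(words) < 4 or "layout" in summary.lower()


def group_by_domain(records):
    """Group (page_url, summary) pairs into {hostname: set of distinct summaries}.

    Grouping by host keeps one site with many distinct summaries
    from dominating the overall counts.
    """
    grouped = defaultdict(set)
    for url, summary in records:
        grouped[urlsplit(url).hostname].add(summary)
    return dict(grouped)
```

Counting domains rather than pages on the grouped result would give each site one vote, which is one possible reading of Lachy's suggestion.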
  356. # [18:37] * Philip` would try to do something better if he didn't have far too much urgent work to do now instead :-)
  357. # [18:38] * Quits: dk (i=dk@gouax1-151.dialup.optusnet.com.au) (Read error: 60 (Operation timed out))
  358. # [18:38] <Lachy> are you going to release the code of the tool soon, so others can work with it?
  359. # [18:39] <Philip`> I'll attempt to do that once I have time
  360. # [18:39] <Philip`> It's not like it's particularly interesting or difficult code, though - it just downloads a load of pages into a database, then parses them all and walks through the tree trying to find things that match some condition, then sticks the results in a table
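
The "parse, walk, collect matches" step described here might look something like the sketch below. It is an assumption-heavy stand-in: it uses Python's standard-library `html.parser` instead of html5lib (which the log says the real tool used), skips the download and database stages entirely, and the names `SummaryCollector` and `collect_summaries` are hypothetical.

```python
from html.parser import HTMLParser


class SummaryCollector(HTMLParser):
    """Collect the value of every summary attribute found on any element."""

    def __init__(self):
        super().__init__()
        self.summaries = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling this.
        for name, value in attrs:
            if name == "summary":
                # A valueless attribute is reported as None; store it as "".
                self.summaries.append(value or "")


def collect_summaries(html_text: str) -> list:
    """Parse one page and return its summary attribute values in document order."""
    collector = SummaryCollector()
    collector.feed(html_text)
    collector.close()
    return collector.summaries
```

A real survey would call something like `collect_summaries` once per stored page and write the results to a table, as Philip` describes.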
  361. # [18:40] <Philip`> (Can you get something like an XML database that does really fast queries on tree-structured data? That'd be quite handy for this kind of thing, after working around the problem that lots of sites can't be serialised into well-formed XML)
  362. # [18:41] <zcorpan_> TagSoup?
  363. # [18:41] <met_> Philip` have you some experience with xml databases?
  364. # [18:42] <Philip`> met_: None at all
  365. # [18:42] <met_> my colleagues recommended http://exist.sourceforge.net/ to me but i never tried it
  366. # [18:43] <met_> also you can use xml in postgresql (with xpath etc.), not to mention Oracle and MS SQL
  367. # [18:45] <Philip`> Ah, looks like it could be useful
  368. # [18:46] <met_> and here is a link about postgresql and xpath http://www.throwingbeans.org/postgresql_and_xml.html
  369. # [18:47] <met_> ms sql2005 and oracle (not sure which version) have it natively as xml datatypes
  370. # [18:48] <Philip`> Hopefully the databases do some kind of indexing, because running unindexed queries over 100MB of XML doesn't sound like the absolute fastest thing ever
  371. # [18:49] <Philip`> or maybe I'm thinking from the wrong perspective for this kind of thing
  372. # [18:49] <met_> ms and oracle yes
  373. # [18:52] <Philip`> (For added fun, some of my downloaded documents are actually PDF files, parsed by html5lib into something that I expect is quite hideous. Maybe I should check the content-type on these things...)
  374. # [18:53] <met_> wow
  375. # [18:53] <met_> and what about other types like *.doc etc?
  376. # [18:54] <Philip`> I don't see any of those
  377. # [18:54] <Philip`> I just got the URLs from Yahoo search results (since they're nicer than Google and still provide search APIs), so it's limited to what files they think are worth putting in the results
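Editorial sketch: the pipeline Philip` describes above (pages stored in a database, parsed, scanned for things matching a condition, results written to a table) can be illustrated as follows, including the content-type check he mentions for skipping PDF files. Everything here is an assumption for illustration: the `pages` and `results` table layouts, the function and class names, and the use of the standard library's `html.parser` event stream as a stand-in for html5lib's tree.

```python
import sqlite3
from html.parser import HTMLParser


class SummaryFinder(HTMLParser):
    """Collects the summary attribute of every <table> start tag."""

    def __init__(self):
        super().__init__()
        self.summaries = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            for name, value in attrs:
                if name == "summary":
                    # A valueless attribute is reported as None.
                    self.summaries.append(value or "")


def analyse(conn):
    """Scan stored HTML pages and record each table summary found."""
    conn.execute("CREATE TABLE IF NOT EXISTS results (url TEXT, summary TEXT)")
    rows = conn.execute("SELECT url, content_type, body FROM pages").fetchall()
    for url, content_type, body in rows:
        # Skip anything not served as HTML (for example PDF files).
        if not (content_type or "").lower().startswith("text/html"):
            continue
        finder = SummaryFinder()
        finder.feed(body)
        finder.close()
        conn.executemany(
            "INSERT INTO results VALUES (?, ?)",
            [(url, s) for s in finder.summaries],
        )
    conn.commit()


# Hypothetical in-memory data standing in for the downloaded pages.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (url TEXT, content_type TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO pages VALUES (?, ?, ?)",
    [
        ("http://a.example/", "text/html; charset=utf-8",
         '<table summary="layout"><tr><td>x</td></tr></table>'),
        ("http://b.example/doc.pdf", "application/pdf", "%PDF-1.4 ..."),
        ("http://c.example/", "text/html", "<table><tr><td>y</td></tr></table>"),
    ],
)
analyse(conn)
found = conn.execute("SELECT url, summary FROM results").fetchall()
```

Storing the content type alongside each page at download time means the check costs nothing at analysis time.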
  378. # [19:06] * Joins: weinig|zZz (n=weinig@m810f36d0.tmodns.net)
  379. # [19:07] * weinig|zZz is now known as weinig
  380. # [19:19] * Quits: weinig (n=weinig@m810f36d0.tmodns.net)
  381. # [19:50] * Joins: dbaron (n=dbaron@c-71-198-189-81.hsd1.ca.comcast.net)
  382. # [20:14] * Joins: zcorpan (n=zcorpan@217-211-77-236-no13.tbcn.telia.com)
  383. # [20:30] * Joins: kingryan (n=kingryan@dsl081-240-149.sfo1.dsl.speakeasy.net)
  384. # [20:32] <annevk> csarven, what about it?
  385. # [20:32] <csarven> i find <m> arbitrary but im sure <samp> has its own story
  386. # [20:33] * Quits: zcorpan_ (n=zcorpan@217-211-77-236-no13.tbcn.telia.com) (Read error: 110 (Connection timed out))
  387. # [20:33] <annevk> <samp> is just there because dropping it would have little value
  388. # [20:34] <annevk> <m> is there because lots of pages use it
  389. # [20:34] <annevk> aiui
  390. # [20:34] <Philip`> I thought HTML5 was starting from a clean slate and only adding features when there's good enough reasons to justify adding them...
  391. # [20:35] <csarven> lots of pages use lots of things =)
  392. # [20:35] <Philip`> (or at least I'm fairly sure I remember people using that as an argument)
  393. # [20:35] <csarven> Philip` that would be the ideal approach but it is not always the case
  394. # [20:36] <annevk> Philip`, in general, ye
  395. # [20:36] <annevk> s
  396. # [20:56] * Quits: h3h (n=w3rd@cpe-66-75-149-197.san.res.rr.com)
  397. # [21:03] * annevk tends to agree with David Baron that for implementors every HTML feature needs to be specified
  398. # [21:03] <annevk> (this includes <frameset>)
  399. # [21:10] * Joins: h3h (n=w3rd@cpe-66-75-149-197.san.res.rr.com)
  400. # [21:19] <hsivonen> annevk: yeah. If you build navigation systems, you need to know that the earth is round even if a flat earth would be nicer
  401. # [21:21] <Lachy> annevk, are you referring to David's latest on www-html? I didn't get the relevance, since the discussion was related to document conformance only.
  402. # [21:22] * Parts: zcorpan (n=zcorpan@217-211-77-236-no13.tbcn.telia.com)
  403. # [21:22] <annevk> the contents of his e-mail are relevant imo
  404. # [21:22] <annevk> although I agree it didn't make much sense in context
  405. # [21:23] <Lachy> sure, it's relevant to the spec in general
  406. # [21:23] * Parts: hasather (n=hasather@81-235-209-174-no62.tbcn.telia.com)
  407. # [21:23] <hsivonen> is there now relevant discussion on www-html? I unsubscribed to respect the HTML WG email recess.
  408. # [21:24] * Joins: hasather (n=hasather@81-235-209-174-no62.tbcn.telia.com)
  409. # [21:24] <Lachy> hsivonen, not really
  410. # [21:24] <Lachy> I'll let you know when something important is posted
  411. # [21:25] <hsivonen> Lachy: thanks
  412. # [21:26] <Lachy> nice! I can refer to this next time someone tries to shift the burden of proof on to me to disprove their claim http://en.wikipedia.org/wiki/Burden_of_proof#Science_and_other_uses
  413. # [21:37] * Joins: BenWard (n=BenWard@cpc3-cmbg2-0-0-cust58.cmbg.cable.ntl.com)
  414. # [21:50] * Joins: othermaciej (n=mjs@dsl081-048-145.sfo1.dsl.speakeasy.net)
  415. # [22:12] * Quits: maikmerten (n=maikmert@Lba02.l.pppool.de) ("Leaving")
  416. # [22:27] <tantek> Lachy, nice reference, I hadn't seen that before and ended up writing up our own for microformats.org: http://microformats.org/wiki/brainstorming#Burden_of_Proof
  417. # [22:28] * Joins: zcorpan (n=zcorpan@84-216-40-20.sprayadsl.telenor.se)
  418. # [22:46] * Quits: ROBOd (n=robod@86.34.246.154) ("http://www.robodesign.ro")
  419. # [23:00] * Quits: tantek (n=tantek@adsl-63-195-114-133.dsl.snfc21.pacbell.net)
  420. # [23:03] * Joins: jdandrea_ (n=jdandrea@ool-44c0a58f.dyn.optonline.net)
  421. # [23:07] * Joins: JonT (n=opera@ti221110a080-11581.bb.online.no)
  422. # [23:11] * Parts: JonT (n=opera@ti221110a080-11581.bb.online.no)
  423. # [23:12] * Joins: JonT (n=opera@ti221110a080-11581.bb.online.no)
  424. # [23:12] * Parts: JonT (n=opera@ti221110a080-11581.bb.online.no)
  425. # [23:20] * Quits: jdandrea (n=jdandrea@ool-44c0a1fe.dyn.optonline.net) (Read error: 110 (Connection timed out))
  426. # [23:22] * Quits: hasather (n=hasather@81-235-209-174-no62.tbcn.telia.com) (Read error: 110 (Connection timed out))
  427. # [23:34] * Quits: met_ (n=Hassman@r5bx220.net.upc.cz) ("Chemists never die, they just stop reacting.")
  428. # [23:37] * Joins: Philip`_ (n=philip@zaynar.demon.co.uk)
  429. # [23:37] * Parts: BenWard (n=BenWard@cpc3-cmbg2-0-0-cust58.cmbg.cable.ntl.com)
  430. # [23:47] * Quits: Philip` (n=philip@zaynar.demon.co.uk) (Read error: 110 (Connection timed out))
  431. # [23:51] * Joins: mpt (n=mpt@canonical/launchpad/mpt)
  432. # Session Close: Mon May 14 00:00:00 2007
