  1. # Session Start: Mon Jul 23 00:00:00 2007
  2. # Session Ident: #html-wg
  3. # [00:03] * Joins: mjs (mjs@67.41.194.248)
  4. # [00:38] * Quits: mjs (mjs@67.41.194.248) (Ping timeout)
  5. # [00:46] * Joins: mjs (mjs@67.41.148.190)
  6. # [00:50] * Quits: tH (Rob@87.102.85.210) (Quit: ChatZilla 0.9.78.1-rdmsoft [XULRunner 1.8.0.9/2006120508])
  7. # [01:07] * Quits: gavin_ (gavin@63.245.208.169) (Ping timeout)
  8. # [01:09] * Quits: mjs (mjs@67.41.148.190) (Ping timeout)
  9. # [01:09] * Joins: gavin_ (gavin@63.245.208.169)
  10. # [01:12] * Quits: heycam (cam@203.214.127.179) (Ping timeout)
  11. # [01:17] * Joins: mjs (mjs@67.41.193.116)
  12. # [01:19] * Quits: zcorpan (zcorpan@84.216.41.90) (Ping timeout)
  13. # [01:43] * Joins: Zeros (Zeros-Elip@69.140.48.129)
  14. # [01:44] * Joins: heycam (cam@130.194.72.84)
  15. # [01:44] * Quits: mjs (mjs@67.41.193.116) (Ping timeout)
  16. # [01:50] * Joins: xover (xover@193.157.66.5)
  17. # [01:52] * Joins: mjs (mjs@70.56.48.154)
  18. # [01:56] * Quits: heycam (cam@130.194.72.84) (Quit: bye)
  19. # [01:56] * Joins: heycam (cam@130.194.72.84)
  20. # [02:08] * Joins: karl (karlcow@128.30.52.30)
  21. # [02:12] * Quits: Sander (svl@86.87.68.167) (Quit: And back he spurred like a madman, shrieking a curse to the sky.)
  22. # [02:16] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
  23. # [02:21] * Joins: gavin (gavin@74.103.208.221)
  24. # [02:57] * Quits: mjs (mjs@70.56.48.154) (Ping timeout)
  25. # [03:05] * Joins: mjs (mjs@67.40.155.111)
  26. # [03:12] * Joins: olivier (ot@128.30.52.30)
  27. # [03:18] <karl> http://www.gizmosforgeeks.com/2007/07/20/new-html-spec-v5/
  28. # [03:40] * Quits: mjs (mjs@67.40.155.111) (Ping timeout)
  29. # [03:46] * Joins: mjs (mjs@67.41.192.213)
  30. # [04:23] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
  31. # [04:28] * Joins: gavin (gavin@74.103.208.221)
  32. # [04:53] * Quits: mjs (mjs@67.41.192.213) (Quit: mjs)
  33. # [04:58] <karl> http://www.la-grange.net/2007/07/23-japanese-typography
  34. # [04:58] <karl> some example of Japanese conventions.
  35. # [04:58] <karl> I have tried equivalents of strong and em (or even bold) in a few texts around me, without success at all.
  36. # [04:59] <karl> italics were nonexistent and bold fonts were all used for titles.
  37. # [05:06] * Joins: MikeSmith (MikeSmith@mcclure.w3.org)
  38. # [05:10] * Joins: schepers (schepers@128.30.52.30)
  39. # [06:30] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
  40. # [06:33] <karl> just discovered that Robert Burns was the author of http://en.wikipedia.org/wiki/Auld_Lang_Syne
  41. # [06:34] <schepers> that may be a different Robbie Burns ;P
  42. # [06:35] * Joins: gavin (gavin@74.103.208.221)
  43. # [06:50] * Quits: schepers (schepers@128.30.52.30) (Client exited)
  44. # [06:57] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
  45. # [06:57] * Joins: gavin (gavin@74.103.208.221)
  46. # [07:11] * Joins: schepers (schepers@128.30.52.30)
  47. # [07:24] * Joins: mjs (mjs@67.41.153.80)
  48. # [07:57] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
  49. # [07:57] * Joins: gavin (gavin@74.103.208.221)
  50. # [08:01] * olivier is now known as dan
  51. # [08:10] * dan is now known as olivier
  52. # [08:10] * Quits: heycam (cam@130.194.72.84) (Quit: bye)
  53. # [08:22] * Quits: olivier (ot@128.30.52.30) (Quit: Leaving)
  54. # [08:22] * Joins: olivier (ot@128.30.52.30)
  55. # [08:28] * Joins: ROBOd (robod@86.34.246.154)
  56. # [08:47] <hsivonen> it is sad how many people think that HTML5 is wrong because they have an illusion of XML when they write XHTML as text/html
  57. # [08:54] <karl> hsivonen: then you must be crying when watching "you've got mail" - http://www.imdb.com/title/tt0128853/
  58. # [08:54] <olivier> :)
  59. # [09:05] <hsivonen> karl: I don't know what you mean, because I have not seen the movie.
  60. # [09:06] * Joins: heycam (cam@203.214.127.179)
  61. # [09:16] * Joins: mjs_ (mjs@67.41.147.12)
  62. # [09:17] * Quits: mjs (mjs@67.41.153.80) (Ping timeout)
  63. # [09:25] * Quits: olivier (ot@128.30.52.30) (Quit: Leaving)
  64. # [09:28] * Quits: karl (karlcow@128.30.52.30) (Quit: Where dwelt Ymir, or wherein did he find sustenance?)
  65. # [09:57] * Quits: mjs_ (mjs@67.41.147.12) (Ping timeout)
  66. # [09:59] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
  67. # [10:00] * Joins: billyjack (MikeSmith@mcclure.w3.org)
  68. # [10:00] * Quits: MikeSmith (MikeSmith@mcclure.w3.org) (Ping timeout)
  69. # [10:01] * billyjack is now known as MikeSmith
  70. # [10:06] * Joins: gavin (gavin@74.103.208.221)
  71. # [10:20] <beowulf> hsivonen: that is sad, but also an example of being careful what is taught to people
  72. # [10:25] * Joins: zcorpan (zcorpan@84.216.41.25)
  73. # [10:32] * Joins: mjs (mjs@67.41.136.143)
  74. # [10:33] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
  75. # [10:34] * Joins: gavin (gavin@74.103.208.221)
  76. # [10:34] * Quits: Zeros (Zeros-Elip@69.140.48.129) (Quit: Leaving)
  77. # [10:39] * Quits: mjs (mjs@67.41.136.143) (Ping timeout)
  78. # [10:39] * Joins: mjs (mjs@67.41.136.143)
  79. # [10:41] <hsivonen> beowulf: which way do you mean? do you mean that the XHTML propaganda effort was a mistake? or that it should be continued to be taught?
  80. # [10:44] <beowulf> at this point I feel the xhtml propaganda effort was in a sense misleading
  81. # [10:44] <beowulf> i wouldn't say mistake though, it has plenty of positive outcomes
  82. # [10:44] <beowulf> s/outcomes/consequences
  83. # [10:45] <mjs> what were the positive consequences?
  84. # [10:45] <beowulf> as a movement it led people to think more about what they write
  85. # [10:46] <mjs> the Inquisition led people to think deeply about their religious convictions
  86. # [10:46] <beowulf> i can only speak for myself
  87. # [10:47] <beowulf> i wouldn't call it a mistake or compare it to the Inquisition
  88. # [10:48] <beowulf> given a choice between well written html and well written appendix c xhtml i wouldn't much care
  89. # [10:48] <beowulf> but i rarely see well written html
  90. # [10:49] <hsivonen> beowulf: is Anne's blog well-written HTML according to your definition of well-written?
  91. # [10:49] * beowulf looks
  92. # [10:50] <beowulf> at first glance, yes
  93. # [10:50] <hsivonen> beowulf: ok
  94. # [10:50] <mjs> I'm just saying, leading people to think about something by saying wrong things isn't what I would consider a positive consequence on the whole
  95. # [10:51] <beowulf> fair enough
  96. # [10:55] <beowulf> what you need then is a Zeldman for html
  97. # [10:57] <zcorpan> POSH
  98. # [10:58] <hsivonen> is Tantek's POSH HTML or XHTML-as-text/html?
  99. # [10:58] <beowulf> POSH has so far been a whisper in some quaint old corner of the web
  100. # [11:00] * Quits: mjs (mjs@67.41.136.143) (Ping timeout)
  101. # [11:02] <beowulf> plus POSH is hard to sell i'd imagine
  102. # [11:02] <beowulf> "you recall we rewrote the corporate site from html to xhtml to future proof and make all things wonderful? Well..."
  103. # [11:04] <hsivonen> the question we should be asking is why was XHTML-as-text/html easier to sell than HTML 4.01 Strict? The people going to XHTML Transitional as text/html felt it was forward-looking while HTML 4.01 Strict wasn't appealing
  104. # [11:05] <hsivonen> XHTML was all about the /> which has no technical effect in text/html. focusing on that felt like doing something, but no matter if you did it carefully or sloppily, it didn't really matter
  105. # [11:05] <hsivonen> Strict, OTOH, becomes an inconvenience that actually matters on terms of what works in browsers
  106. # [11:06] <hsivonen> s/on/in/
  107. # [11:08] * Joins: mjs (mjs@67.41.152.68)
  108. #
  109. # Session Start: Mon Jul 23 11:11:35 2007
  110. # Session Ident: #html-wg
  111. # [11:11] * Now talking in #html-wg
  112. # [11:11] * Topic is 'HTML WG http://www.w3.org/html/wg/ logged: http://krijnhoetmer.nl/irc-logs/'
  113. # [11:11] * Set by Zeros on Mon Apr 30 23:38:28
  114. # [11:13] <mjs> when I first heard about XML (this was before really knowing anything about technology) my first thought was, "but this doesn't actually *do* anything"
  115. # [11:14] <hsivonen> mjs: semantics, not behavior :-)
  116. # [11:15] <mjs> s/anything about technology/anything about web technology/
  117. # [11:15] <mjs> I also remember around this same time having an exchange about the <object> tag with an HTML4 enthusiast
  118. # [11:15] <mjs> him: there's this great new tag, it's called <object>
  119. # [11:15] <mjs> me: great! what does it do?
  120. # [11:16] <mjs> him: it can do anything
  121. # [11:16] <mjs> me: how do I tell it what to actually do in a specific case?
  122. # [11:16] <mjs> him: that's undefined
  123. # [11:16] <mjs> me: I thought you said it was great
  124. # [11:17] <beowulf> :)
  125. # [11:18] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
  126. # [11:19] * Joins: gavin (gavin@74.103.208.221)
  127. # [11:26] * Quits: MikeSmith (MikeSmith@mcclure.w3.org) (Ping timeout)
  128. # [11:31] * Joins: MikeSmith (MikeSmith@mcclure.w3.org)
  129. # [11:47] * Quits: mjs (mjs@67.41.152.68) (Ping timeout)
  130. # [11:52] * Joins: tH (Rob@87.102.85.210)
  131. # [11:59] * Joins: Lachy (chatzilla@203.214.140.60)
  132. # [12:07] * Joins: myakura (myakura@58.88.37.26)
  133. # [12:14] * Joins: mjs (mjs@67.41.152.68)
  134. # [13:21] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
  135. # [13:23] * Quits: mjs (mjs@67.41.152.68) (Ping timeout)
  136. # [13:23] * Joins: olivier (ot@128.30.52.30)
  137. # [13:26] * Quits: zcorpan (zcorpan@84.216.41.25) (Ping timeout)
  138. # [13:26] * Joins: gavin (gavin@74.103.208.221)
  139. # [13:56] * Parts: Lionheart (robin@66.57.69.65)
  140. # [13:58] * Quits: MikeSmith (MikeSmith@mcclure.w3.org) (Quit: Less talk, more pimp walk.)
  141. # [14:01] * Quits: Lachy (chatzilla@203.214.140.60) (Connection reset by peer)
  142. # [14:14] * Joins: zcorpan (zcorpan@84.216.41.25)
  143. # [14:18] <zcorpan> hsivonen: in validator.nu, choosing the HTML5 (prerelease schema) and the HTML parser, it says "Schema Error: The chosen preset schema is not appropriate for HTML."
  144. # [14:18] <hsivonen> whoa
  145. # [14:18] <anne> I also can't validate annevankesteren.nl/contact using html5.validator.nu...
  146. # [14:19] <anne> It aborts on IO and mumbles something about XHTML mode
  147. # [14:19] <anne> "Validator.nu is validation 2.0" :)
  148. # [14:20] <hsivonen> these are the kind of reasons why I mentioned it on IRC before making other announcements
  149. # [14:21] <hsivonen> validation 2.0, like Web 2.0, is in perpetual beta
  150. # [14:28] <hsivonen> anne: apparently, the way ifs fall, it mumbles about the XHTML mode if it dies before it had a chance to choose the mode...
  151. # [14:30] <hsivonen> zcorpan: fixed. I think.
  152. # [14:32] <hsivonen> anne: I get a non-200 HTTP status
  153. # [14:33] <zcorpan> hsivonen: the javascript needs fixing too
  154. # [14:33] <zcorpan> hsivonen: or nm
  155. # [14:33] <hsivonen> anne: I suspect your server has the same problem as krijn's had a few days ago.
  156. # [14:33] <anne> hsivonen, oh?
  157. # [14:33] <hsivonen> zcorpan: did you reload?
  158. # [14:33] <zcorpan> hsivonen: had a cached version of the js file
  159. # [14:34] <hsivonen> anne: probably something to do with content negotiation as the generic facet doesn't fail
  160. # [14:34] <hsivonen> I'll improve diagnostics
  161. # [14:34] <anne> hmm, now it does work
  162. # [14:38] <hsivonen> anne: your server says 406
  163. # [14:41] <anne> oh
  164. # [14:41] <anne> I guess that has something to do with conneg, yes
  165. # [14:41] <hsivonen> anne: chances are you are relying on */*
  166. # [14:42] <hsivonen> anne: the html5 facet does not Accept */*
  167. # [14:42] <anne> prolly
  168. # [14:42] <hsivonen> anne: in krijn's case, it was about Apache 1.3 PHP mapping and negotiation not working together
  169. # [14:46] <krijnh> \o
  170. # [14:55] <zcorpan> o/
  171. # [14:56] <hsivonen> anne: now with a slightly better error message: http://html5.validator.nu/?doc=http%3A%2F%2Fannevankesteren.nl%2Fcontact
  172. # [14:59] <hsivonen> anne: you have the exact same problem that krijnh had: Available variants: application/x-httpd-php
  173. # [14:59] <anne> yeah, makes sense
  174. # [14:59] <krijnh> anne: also running Apache 1.3?
  175. # [14:59] <anne> could be
  176. # [15:00] <anne> anyway, got to go
  177. # [15:00] <hsivonen> I'm mildly amused about how conneg is supposed to be great and then something as common as Apache+PHP is b0rked
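
The 406 discussed above can be reproduced with a small sketch of server-side content negotiation. This is a simplified illustration, not Apache's actual algorithm: it ignores specificity precedence and media type parameters other than q. The Accept header values in the usage note are assumptions, since the log only says that the html5 facet does not send */*; the function names are hypothetical.

```python
def parse_accept(header):
    """Split an Accept header into (media_range, q) pairs."""
    ranges = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        fields = [f.strip() for f in part.split(";")]
        q = 1.0  # q defaults to 1 when absent
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        ranges.append((fields[0].lower(), q))
    return ranges


def matches(media_range, media_type):
    """True if a media range such as text/* or */* covers media_type."""
    if media_range == "*/*":
        return True
    range_type, _, range_sub = media_range.partition("/")
    type_, _, sub = media_type.partition("/")
    return range_type == type_ and range_sub in ("*", sub)


def negotiate(accept_header, variants):
    """Return the best acceptable variant, or None (a 406 response)."""
    best, best_q = None, 0.0
    for variant in variants:
        for media_range, q in parse_accept(accept_header):
            if q > best_q and matches(media_range, variant):
                best, best_q = variant, q
    return best
```

With the only available variant being application/x-httpd-php, a hypothetical header such as "text/html, application/xhtml+xml;q=0.9" yields None (406), while the same header with "*/*;q=0.1" appended selects the PHP variant. That matches the observation that the generic facet did not fail.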
  178. # [15:06] * Joins: Sander (svl@86.87.68.167)
  179. # [15:07] <hsivonen> hmm. looks like I had fallen for the classic way of making a page invalid
  180. # [15:07] <hsivonen> I had copied the CVSDude badge HTML boilerplate
  181. # [15:07] <krijnh> Even you? Wow :)
  182. # [15:08] <krijnh> Hey Sander
  183. # [15:08] <Sander> oi
  184. # [15:08] <krijnh> Are you coming to Delft Thursday?
  185. # [15:09] <Sander> I am
  186. # [15:09] <krijnh> (You're the zoid guy right?)
  187. # [15:09] <Sander> probably not
  188. # [15:09] <krijnh> Hmm
  189. # [15:09] * Sander tries to think what zoid would be
  190. # [15:09] <krijnh> Never mind then :)
  191. # [15:09] <krijnh> Ah, you're from haveskill
  192. # [15:10] <Sander> there's too many sander's in this country doing web standards stuff. :D
  193. # [15:10] <krijnh> Yeah ;)
  194. # [15:10] <Sander> I am indeed. :)
  195. # [15:10] <krijnh> Were you at Info.nl too last month?
  196. # [15:10] <Sander> No, I was in the USA at that point, alas
  197. # [15:26] * Joins: karl (karlcow@128.30.52.30)
  198. # [15:28] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
  199. # [15:34] * Joins: gavin (gavin@74.103.208.221)
  200. # [15:37] <hsivonen> now that I made XHTML-1.0-as-text/html non-fatal, I don't know how to communicate to users the Jing-level errors about xml:lang.
  201. # [15:37] <hsivonen> that is, that the attribute that the schema does not allow is not lang in the XML namespace but xml:lang in no namespace
  202. # [15:50] * Quits: myakura (myakura@58.88.37.26) (Quit: Leaving...)
  203. # [15:54] * Quits: karl (karlcow@128.30.52.30) (Quit: Where dwelt Ymir, or wherein did he find sustenance?)
  204. # [15:55] * Joins: Lionheart (robin@198.86.248.1)
  205. # [16:26] * Joins: billmason (billmason@69.30.57.156)
  206. # [16:32] * Quits: Lionheart (robin@198.86.248.1) (Ping timeout)
  207. # [16:59] * Quits: olivier (ot@128.30.52.30) (Quit: Leaving)
  208. # [17:31] <zcorpan> hsivonen: "Namespaces do not work in text/html, and hence, xml:* attributes cannot be used in text/html." or some such
  209. # [17:37] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
  210. # [17:39] <zcorpan> hsivonen: is /> the only HTML4-specific tokenization error?
  211. # [17:42] * Joins: gavin (gavin@74.103.208.221)
  212. # [18:41] <hsivonen> zcorpan: thanks. a new version of the XHTML-as-text/html info message should go live soonish as the service rebuilds itself
  213. # [18:42] <hsivonen> zcorpan: no, there are other HTML 4-specific errors
  214. # [18:42] <hsivonen> a list follows:
  215. # [18:42] <hsivonen> valueless boolean attributes
  216. # [18:43] <hsivonen> unquoted attributes with non-Name values
  217. # [18:43] <hsivonen> </ in CDATA or RCDATA
  218. # [18:43] <hsivonen> plus />
  219. # [18:43] <hsivonen> that's it for now
  220. # [18:44] <zcorpan> </ is allowed in cdata and rcdata unless it is followed by a name start character
  221. # [18:44] <zcorpan> iirc
  222. # [18:44] <hsivonen> moreover, it appears that some HTML case-insensitivity stuff regressed with the parser rewrite
  223. # [18:45] <hsivonen> zcorpan: interesting. I wasn't aware of that
  224. # [18:45] <hsivonen> hmm. thinking again, I was but not with the right terms
  225. # [18:53] <hsivonen> zcorpan: did HTML 4 have separate name characters and name start characters?
  226. # [18:54] <hsivonen> apparently yes
  227. # [18:57] <hsivonen> zcorpan: fix checked in. the service will take a while to rebuild
  228. # [18:58] <hsivonen> (I'm starting to think I might actually need two JVM instances to avoid these Service Temporarily Unavailable periods)
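
zcorpan's point about "</" in CDATA and RCDATA can be sketched as a check over element content. This is a rough illustration, not validator.nu's code. It assumes that name start characters in HTML 4 are the ASCII letters, and the function name is hypothetical.

```python
import re

# Assumption: "</" only acts as an end-tag open delimiter when it is
# immediately followed by a name start character (an ASCII letter here).
ETAGO = re.compile(r"</[A-Za-z]")


def find_early_terminators(content):
    """Return the offsets of every "</" + letter sequence in CDATA content."""
    return [match.start() for match in ETAGO.finditer(content)]
```

Under that assumption, script content such as document.write("</p>") is flagged, while an expression like a </ 2 is not, because the character after "</" is not a letter.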
  229. # [19:01] <Philip`> "The measurable study would be the number of pages with XHTML DOCTYPEs, served as HTML, and containing markup that would have unintended consequences if served as XML"
  230. # [19:01] * Philip` wonders if anyone has data on how many XHTML-as-text/html pages are not even well-formed, and what are the most common causes of ill-formedness
  231. # [19:02] <Philip`> (All I can tell from my collected data is that 50% of XHTML-doctyped pages cause parse errors in the HTML5 tokeniser, which is marginally worse than the 45% of not-just-XHTML pages)
  232. # [19:03] <hsivonen> Philip`: do you keep a local copy of the files that you got when dereferencing dmoz URLs?
  233. # [19:04] <Philip`> I don't
  234. # [19:05] <Philip`> (since it'd probably be around half a gigabyte for 8K pages, which is not entirely negligible)
  235. # [19:07] <hsivonen> depends on your free disk space vs. your network downstream, I guess
  236. # [19:07] <hsivonen> I have a puny 1 Mbps downstream
  237. # [19:09] <zcorpan> ATTSPLEN 65536 -- These are the largest values --
  238. # [19:09] <zcorpan> LITLEN 65536 -- permitted in the declaration --
  239. # [19:09] <zcorpan> NAMELEN 65536 -- Avoid fixed limits in actual --
  240. # [19:09] <zcorpan> PILEN 65536 -- implementations of HTML UA's --
  241. # [19:09] <zcorpan> is in the sgml declaration for html4
  242. # [19:10] <zcorpan> even the sgml declaration for html4 admits that HTML UAs are not based on sgml
  243. # [19:13] <Philip`> hsivonen: I was using my computer, which has approximately no disk space except during the brief periods in which I've deleted some junk and not filled it up again, and a university one where I was just borrowing its /tmp and can't do permanent storage
  244. # [19:16] <Philip`> (I think the bottleneck ended up being in the way that I spawned two processes (curl and the tokeniser) for each downloaded page, which didn't work too badly but could probably be done much better)
  245. # [19:17] <zcorpan> hsivonen: you may want to warn about minimized href and src attributes since they get dropped in internet explorer
  246. # [19:18] <hsivonen> I have been hoping that someone on public-html would be curious enough about verifying Hixie's results to write a test harness in Java. Hasn't happened yet...
  247. # [19:18] <hsivonen> zcorpan: are those two the only ones?
  248. # [19:19] <Philip`> ((Still got about six pages per second (downloaded + tokenised), which seems much better than the ~0.2/sec from http://triin.net/2006/06/12/Running_the_program (though not collecting as much information about each page)))
  249. # [19:20] <zcorpan> hsivonen: yes
  250. # [19:23] <hsivonen> zcorpan: added. should be live in a few moments
  251. # [19:25] <hsivonen> Philip`: did the triin.net guy measure the total byte size of the stuff that was downloaded?
  252. # [19:26] <hsivonen> I wonder what would be an efficient way of storing the original content-type and URI along with the payload on disk...
  253. # [19:26] <Philip`> You could try to guess the numbers from http://triin.net/archive/kool/webstat/figure-8.png
  254. # [19:27] * Joins: kingryan (rking3@208.66.64.47)
  255. # [19:38] <Philip`> Of the 543 pages with XHTML doctypes, 329 had unrecognised entity names in attributes (which seems to always be <a href="a?b&c">), 159 had '?' in the tag open state (I guess <?xml...>), 94 had non-permitted slashes (<something/>), 38 had duplicate attributes
  256. # [19:39] <hsivonen> Philip`: based on the figure, there's only about 8 gigabytes to download
  257. # [19:42] <Philip`> Looks more like 20GB to me, since the mean is around 30KB and there's about 0.8 million in total
  258. # [19:42] <hsivonen> ok
  259. # [19:43] <Philip`> 20GB / 1Mbps = 1.9 days so it's not that bad :-)
  260. # [19:43] <hsivonen> I'm going away for a couple of days. too bad I don't have slurper software standing by
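
The "20GB / 1Mbps = 1.9 days" estimate above can be checked with a few lines. The sketch assumes decimal units (1 GB = 10^9 bytes, 1 Mbps = 10^6 bits per second) and a link that stays fully saturated with no protocol overhead. The function name is hypothetical.

```python
SECONDS_PER_DAY = 86400


def download_days(total_bytes, link_bits_per_second):
    """Days needed to transfer total_bytes over a saturated link."""
    seconds = total_bytes * 8 / link_bits_per_second
    return seconds / SECONDS_PER_DAY


# Philip`'s reading of the figure: about 0.8 million pages at ~30 KB each.
estimated_total = 800_000 * 30_000   # bytes, before rounding down to "20GB"
rounded_days = download_days(20e9, 1e6)
unrounded_days = download_days(estimated_total, 1e6)
```

The 20 GB figure gives roughly 1.85 days, which rounds to the 1.9 quoted. The unrounded product of the two estimates (24 GB) would take a little over two days.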
  261. # [19:44] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
  262. # [19:45] <hsivonen> I wonder how many files per zip file java.util.zip can handle
  263. # [19:45] <hsivonen> or how many files per directory HFS+ can handle before melting down
  264. # [19:49] * Joins: gavin (gavin@74.103.208.221)
  265. # [19:49] <Philip`> I don't think the zip format is especially perfect for adding lots of files one at a time, given how it has a file table at the end that it'd have to keep rewriting
  266. # [19:50] <Philip`> but maybe that's a negligible problem if it's only thousands per zip file
  267. # [20:05] * Joins: hasather (hasather@80.203.71.22)
  268. # [20:19] * Quits: jgraham (jgraham@81.86.209.151) (Quit: Ex-Chat)
  269. # [20:32] <zcorpan> does anyone understand what robert burns is on to with consistency and createElement()?
  270. # [20:32] * Parts: hasather (hasather@80.203.71.22)
  271. # [20:32] * Joins: hasather (hasather@80.203.71.22)
  272. # [20:34] <hsivonen> zcorpan: he seems to believe that createElement() magically does the right thing if you only use elements from one namespace
  273. # [20:39] <Philip`> Is that true only if his understanding of "the right thing" doesn't include elements being treated like HTML elements (e.g. if you did createElement('b') it wouldn't be rendered as bold)?
  274. # [20:40] <hsivonen> I'm not going to guess his intent further.
  275. # [20:48] * Joins: jgraham (jgraham@81.86.208.107)
  276. # [21:08] <hsivonen> I'm getting more curious about what it is that Rob Burns is implementing
  277. # [21:35] * Parts: hasather (hasather@80.203.71.22)
  278. # [21:36] * Joins: edas (edaspet@88.191.34.123)
  279. # [21:36] * Joins: hasather (hasather@80.203.71.22)
  280. # [21:51] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
  281. # [21:56] * Joins: gavin (gavin@74.103.208.221)
  282. # [22:01] <jgraham> I'm getting more despondent about the #kB unread email I have on public-html
  283. # [22:02] <jgraham> and my lack of desire to read it
  284. # [22:18] * Joins: edaspet (edaspet@88.191.34.123)
  285. # [22:18] <jgraham> Only 8 emails to go, of which are from Rob Burns, and which weigh in at a total of 111kB
  286. # [22:19] <anne> nice
  287. # [22:19] <jgraham> s of which/7 of which/
  288. # [22:20] * Quits: edas (edaspet@88.191.34.123) (Ping timeout)
  289. # [22:25] * Quits: edaspet (edaspet@88.191.34.123) (Client exited)
  290. # [22:25] <anne> would be nice if RB provided some use cases and real problems all his new elements are solving
  291. # [22:25] * Joins: edas (edaspet@88.191.34.123)
  292. # [22:26] <hsivonen> anne: they are solving the problem of expressing precise semantics
  293. # [22:27] <anne> yeah, RDF does so too I'm told
  294. # [22:29] <jgraham> Expressing precise semantics is not, in itself, a use case
  295. # [22:29] <jgraham> Maybe I should post that
  296. # [22:29] <jgraham> But I feel bad sending people more email
  297. # [22:30] <anne> me too
  298. # [22:30] <anne> every time I open the reply window and type something I close it a few seconds later because it seems rather pointless
  299. # [22:31] <anne> (I now stopped doing that; just reading)
  300. # [22:43] * Quits: schepers (schepers@128.30.52.30) (Quit: Trillian (http://www.ceruleanstudios.com)
  301. # [22:45] * Joins: Zeros (Zeros-Elip@67.154.87.254)
  302. # [22:53] * Quits: ROBOd (robod@86.34.246.154) (Quit: http://www.robodesign.ro )
  303. # [23:00] * Joins: mjs (mjs@67.41.204.169)
  304. # [23:13] <anne> jgraham, when are we going to do another release of html5lib?
  305. # [23:14] <anne> actually, what I'm more interested in is the plans there were at some point for a C version... have those progressed?
  306. # [23:15] <Philip`> I've been working on a partial C++ (and JS and Perl) version, which could be useful for that
  307. # [23:15] <Philip`> (I guess it'd be reasonably easy to port to straight C if necessary)
  308. # [23:16] <anne> 3 parsers at once? fancy
  309. # [23:17] <Philip`> I've just written one in OCaml, and a compiler with C++/JS/Perl code-generation backends
  310. # [23:17] <Philip`> (Only done the tokeniser, though)
  311. # [23:18] <anne> doing that for tree construction might get tricky
  312. # [23:18] <anne> although I suppose there's some logic there as well :)
  313. # [23:21] <Philip`> http://canvex.lazyilluminati.com/svn/tokeniser/tokeniser_spec.ml is the meta-implementation of the algorithm - most of the words in there still have to be implemented manually in each language, but that can be fairly straightforward
  314. # [23:25] * Quits: mjs (mjs@67.41.204.169) (Ping timeout)
  315. # [23:33] * Joins: mjs (mjs@67.41.152.66)
  316. # [23:38] <anne> I suppose in theory you can map the tree construction stuff to something similar
  317. # [23:38] <jgraham> anne: We should do one soon. I think we should make a few improvements to charsUntill in the tokenizer and then put the release out
  318. # [23:39] <jgraham> We can do the new charset detection stuff for 0.11
  319. # [23:39] <anne> k
  320. # [23:39] <anne> I wonder how much further changes charset detection will get
  321. # [23:40] <jgraham> I'm also interested in taking Philip`'s C++ tokenizer and hooking it up to python via SWIG or similar. But I need to motivate myself to actually learn a little more C++ than I know to do that.
  322. # [23:46] <Philip`> What would be involved in the C++/Python interface? I guess it's just transferring characters and tokens, but I don't know which side should be pushing/pulling or what kind of data structures they should pass around
  323. # [23:47] * Philip` finishes creating the Perl port of his JS port of his C++ tokeniser, and tries to work out how to run tests and see how many it fails...
  324. # [23:47] <zcorpan> mjs: html5 already requires all Document objects to implement HTMLDocument and other supported interfaces (like SVGDocument)
  325. # [23:48] <anne> "(This is the case whether or not the document in question is an HTML document or indeed whether it contains any HTML elements at all.)"
  326. # [23:52] <jgraham> Philip`: I envision python pulling tokens from C++ (basically html5lib views the tokenizer as an iterator which produces a sequence of tokens)
  327. # [23:53] <jgraham> So I think you need something like a emitToken method on the C++ side which returns a pointer to the next token
  328. # [23:53] <jgraham> Then the interface code turns that into a python object
  329. # [23:54] <jgraham> Or something
  330. # [23:54] <Philip`> Where would the C++ side get characters from?
  331. # [23:56] <zcorpan> mjs: i have tests on that, btw: http://simon.html5.org/test/html/dom/interfaces/Document/
  332. # [23:57] <jgraham> I guess you have to pass it something it can interpret as file-like
  333. # [23:57] * Joins: myakura (myakura@58.88.37.26)
  334. # [23:57] <jgraham> and then the C++ side would read the file
  335. # [23:57] <mjs> zcorpan: what I meant was requiring createElement to create elements in the HTML namespace for all documents
  336. # [23:57] <anne> oooh
  337. # [23:57] <mjs> zcorpan: HTML5 only requires that for HTML documents
  338. # [23:58] <mjs> sorry for being unclear
  339. # [23:58] <anne> i suppose it makes some sense
  340. # [23:59] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
  341. # Session Close: Tue Jul 24 00:00:00 2007