/irc-logs / freenode / #whatwg / 2008-08-30 / end

  1. # Session Start: Sat Aug 30 00:00:00 2008
  2. # Session Ident: #whatwg
  3. # [00:03] <Hixie> oh great, anne's told everyone that the real reason for websocket is my model railway :-P
  4. # [00:05] <annevk> anecdotical evidence to feed the flames :p
  5. # [00:06] <Lachy> BenMillard, ok. reading it now
  6. # [00:06] <Hixie> annevk: good interview
  7. # [00:07] <annevk> ta
  8. # [00:07] <annevk> gsnedders, so are you putting up a Web service tonight?
  9. # [00:07] <BenMillard> smedero has sent me a review of the message. it's enlightening but I think it doesn't change what my message says
  10. # [00:07] <smedero> no, it shouldn't
  11. # [00:08] <smedero> it is just backstory, since you probably haven't been following the telecons....
  12. # [00:08] <BenMillard> yes, thanks for providing me with that
  13. # [00:14] * Quits: Amorphous (i=jan@unaffiliated/amorphous) (Connection timed out)
  14. # [00:15] <jacobolus> are unescaped ampersands allowed in html?
  15. # [00:16] <annevk> sometimes
  16. # [00:17] * Joins: Amorphous (i=jan@unaffiliated/amorphous)
  17. # [00:18] <jacobolus> the validator seems happy to accept “<!DOCTYPE html><title></title>&”
  18. # [00:19] <annevk> that's correct (not sure if the syntax section accurately reflects it currently)
  19. # [00:20] <jacobolus> I didn't look super carefully, but I only noticed a mention w.r.t. xml
  20. # [00:20] <annevk> jacobolus, it does not need to be escaped when followed by a space, EOF, another ampersand, start tag, end tag, comment
  21. # [00:20] <jacobolus> ah, okay
  22. # [00:21] <jacobolus> “<!DOCTYPE html><title></title>AT&T” properly fails then
  23. # [00:21] <annevk> yup
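
  Note: the rule annevk gives above can be sketched as a small checker. This is only an illustration of the rule as stated in the chat, not the validator's implementation or the spec's definition. The function name, the character-reference pattern, and the use of "<" as a stand-in for "start tag, end tag, comment" are all assumptions of the sketch.

```python
import re

# Characters after which a bare "&" is fine according to the rule in the chat.
# "<" stands in for "start tag, end tag, comment"; EOF is handled separately.
SAFE_FOLLOWERS = set(" \t\n\r\f&<")

# Rough pattern for something that already looks like a complete character
# reference (assumption of this sketch, not the spec's definition).
CHAR_REF = re.compile(r"&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")


def find_suspect_ampersands(text):
    """Return offsets of "&" characters that the rule would want escaped."""
    suspects = []
    for i, ch in enumerate(text):
        if ch != "&":
            continue
        if i + 1 == len(text):          # followed by EOF
            continue
        if text[i + 1] in SAFE_FOLLOWERS:
            continue
        if CHAR_REF.match(text, i):     # already a complete reference
            continue
        suspects.append(i)
    return suspects
```

  With this sketch, jacobolus's first test document (ending in a bare "&") yields no suspects, while the "AT&T" document yields one.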
  24. # [00:22] <annevk> hsivonen, "Probable cause: & should have been escaped as &amp;D." what is the "D" doing there?
  25. # [00:23] <BenMillard> annevk, nothing much. :P
  26. # [00:24] <jacobolus> ah, it is described in detail, but should maybe be more explicit what authors should do?
  27. # [00:24] <jacobolus> (i.e. what implementors should do is described in detail)
  28. # [00:25] <annevk> section 8.1 describes in detail what authors should do
  29. # [00:25] <jacobolus> oh, nevermind
  30. # [00:25] <annevk> though it seems it's not entirely clear on things
  31. # [00:25] <jacobolus> the text must not contain the character U+003C LESS-THAN SIGN (<) or an ambiguous ampersand.
  32. # [00:26] <jacobolus> okay, that's reasonable :)
  33. # [00:26] * Joins: othermaciej (n=mjs@17.203.15.200)
  34. # [00:26] <jacobolus> http://www.whatwg.org/specs/web-apps/current-work/#ambiguous doesn't mention EOF :)
  35. # [00:27] <BenMillard> Lachy, are you double-checking my numbers? if so, thanks!
  36. # [00:28] <annevk> jacobolus, yeah, whatwg@whatwg.org ;)
  37. # [00:29] <Lachy> BenMillard, I suppose I can go through it again and do that. But let me finish it at least once first... I'm in the middle of doing some other things too
  38. # [00:29] <BenMillard> Lachy, sure thing...I don't wish to monopolise your time
  39. # [00:30] * Joins: svl (n=me@ip565744a7.direct-adsl.nl)
  40. # [00:31] * Joins: tndH (i=Rob@adsl-77-86-6-71.karoo.KCOM.COM)
  41. # [00:33] <Lachy> BenMillard, btw, I found a rather interesting table that might be interesting for you to study. But it requires membership to see it on this site http://www.newzbin.com/
  42. # [00:34] <BenMillard> Lachy, would it be permitted for you to extract the table and send it to me for analysis and publishing in my collection? my work is non-commercial research so I imagine that would qualify as "fair use"
  43. # [00:34] * annevk summons zcorpan
  44. # [00:35] * BenMillard hopes zcorpan will arrive with a witty /me line...
  45. # [00:35] <Lachy> BenMillard, sure. I'll do it after I'm done with your email
  46. # [00:36] <BenMillard> Lachy, cool. I'll go start some dinner :)
  47. # [00:39] <gsnedders> annevk: As I said, if anyone writes it :P
  48. # [00:40] <gsnedders> BenMillard: Peh! Silly logn emails you want me to read!
  49. # [00:40] <gsnedders> *long
  50. # [00:41] * Quits: smedero (n=smedero@mdp-nat251.mdp.com)
  51. # [00:42] <Lachy> gsnedders, you could just pretend to read it, skim it and post random comments about things you see to make him think you're really reading it. ;-)
  52. # [00:42] <gsnedders> Lachy: True.
  53. # [00:42] * gsnedders hopes all this crazy insane ECMAScript optimization work shows through in other interpreted languages
  54. # [00:45] <annevk> gsnedders, oh I see
  55. # [00:45] <annevk> well, now it's too late
  56. # [00:45] <annevk> I guess tomorrow I can download the relevant packages and try making it work, shouldn't be too hard
  57. # [00:45] <gsnedders> annevk: It's probably worth working from what I'll put in hg tomorrow
  58. # [00:46] <gsnedders> Actually, if you want to work on it in the morning, I could do that now
  59. # [00:48] * jgraham wonders if there is any point in arguing process with CW
  60. # [00:48] <BenMillard> gsnedders, there's no obligation and no salesperson will call. :D
  61. # [00:49] <jgraham> I think I'll fix html5lib first
  62. # [00:49] <gsnedders> Hixie: PostScript is a turing-complete language
  63. # [00:49] <annevk> gsnedders, that doesn't mean it will cause a risk
  64. # [00:49] * Parts: michaeln (n=michaeln@nat/google/x-f4ccb7dae09b7c33)
  65. # [00:49] * gsnedders doesn't know enough to know that
  66. # [00:49] * jgraham saw a ~3 line postscript file that did a ray tracing image of a ball on a chessboard or something once
  67. # [00:49] <gsnedders> I'm just pointing out what I know :P
  68. # [00:49] <Philip`> Turing machines with no IO devices are a bit rubbish in practice
  69. # [00:50] * gsnedders passes Philip` an IO device
  70. # [00:50] <Philip`> I suppose you could try to make them jump backwards and forwards really fast so their tape catches fire
  71. # [00:52] <gsnedders> Philip`: Problem with that is next to no-one actually implements a turing-machine using tape
  72. # [00:53] <Philip`> Partly because it'd be physically impossible to implement one at all
  73. # [00:53] <Lachy> BenMillard, your byte counts for the sizes of the tables seem to be a little off
  74. # [00:53] <BenMillard> Lachy, perhaps there was a bug during the copying...
  75. # [00:54] <Lachy> I saved both files using wget, stripped out all markup before and after the <table>...</table> and resaved the files
  76. # [00:54] <BenMillard> Lachy, what numbers do you get and by which method? I expect the error is on my part
  77. # [00:54] <Lachy> For noscope.html, I get 1659, and for complexdatatable.html, I get 2591
  78. # [00:56] <BenMillard> hmm, what types of line endings do you see? I see 2 characters per newline, which might be my error
  79. # [00:56] <Lachy> what method did you use to get the sizes?
  80. # [00:57] <BenMillard> I viewed source in Firefox 2, copied and pasted into a plain text editor, then selected the text from the start of "<table" to the end of "</table>" and read off what the statusbar said
  81. # [00:57] <BenMillard> your method sounds better to me :)
  82. # [00:57] * gsnedders reads the start of BenMillard's email and turns off
  83. # [00:57] <Lachy> line endings are CRLF
  84. # [00:58] <BenMillard> Lachy, same here
  85. # [00:58] <Lachy> the problem with copying and pasting from firefox's view source is that it doesn't copy properly
  86. # [00:59] <Lachy> it frequently adds in extra lines in random places
  87. # [00:59] <BenMillard> Lachy, I now measure 2,625 for the complexdatatable.html
  88. # [00:59] <BenMillard> Lachy, yes I removed the empty lines
  89. # [00:59] <BenMillard> I'm happy to steal your numbers if that's OK by you :)
  90. # [01:00] <Lachy> no, those numbers are copyrighted to me! :-)
  91. # [01:01] <BenMillard> Lachy, perhaps you could e-mail me the files you downloaded and the cropped versions, then I can see where the difference is
  92. # [01:01] <BenMillard> thanks for going to the trouble to do this, btw
  93. # [01:03] <Lachy> http://lachy.id.au/temp/tables.zip
  94. # [01:05] <BenMillard> Lachy, both the text editor's selection and Windows Explorer agree with your numbers
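
  Note: a rough sketch of the measurement Lachy describes (keep only the span from "<table" to "</table>" and count its bytes), plus a helper showing why line endings matter to the total. The function names are hypothetical, and the sketch assumes the page contains a single table and has already been fetched as bytes.

```python
def table_byte_count(page: bytes) -> int:
    """Size in bytes of the span from "<table" through "</table>"."""
    lowered = page.lower()              # tolerate <TABLE> in older markup
    start = lowered.index(b"<table")
    end = lowered.index(b"</table>", start) + len(b"</table>")
    return end - start


def with_crlf(page: bytes) -> bytes:
    """Normalise line endings to CRLF, the form both people saw in the chat."""
    return page.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")
```

  Counting on the raw downloaded bytes avoids the extra lines that copying from a view-source window can introduce. Each newline inside the table costs one byte as LF and two as CRLF, so the two people must agree on line endings before comparing numbers.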
  95. # [01:06] <Lachy> did you find the diffs with your files?
  96. # [01:06] * Quits: billmason (n=billmaso@ip75.unival.com) (Read error: 104 (Connection reset by peer))
  97. # [01:07] <BenMillard> Lachy, in noscope.html I see class="header" on the <td> for each of the 3 dates
  98. # [01:07] <jgraham> argh
  99. # [01:08] <gsnedders> I probably ought to do the work I'm meant to do for Computing
  100. # [01:08] <gsnedders> I'm really getting quite far behind
  101. # [01:08] <gsnedders> Though de-facto as long as I've done everything I'm meant to by December it doesn't really matter when I do it
  102. # [01:08] <BenMillard> gsnedders, do prioritise things above this e-mail. I might not send for a day or two yet
  103. # [01:09] <gsnedders> BenMillard: I've just been dealing with other low priority email
  104. # [01:09] <Lachy> why? I took my files directly from juicystudio, and didn't modify anything else
  105. # [01:09] <gsnedders> BenMillard: From several months ago :)
  106. # [01:09] <BenMillard> Lachy, you see the class="headers" here? http://juicystudio.com/wcag/tables/noscope.html
  107. # [01:09] <jgraham> Why did someone think it was a good idea for lxml to add a random doctype when parsing html documents?
  108. # [01:09] <BenMillard> sorry, class="header": "<td class="header">12/12/2005</td"
  109. # [01:09] <gsnedders> If you send me something low priority and don't get a reply within an hour or two, it'll probably take a few weeks or months :)
  110. # [01:10] <gsnedders> jgraham: Because libxml2's HTML support is just a big hack
  111. # [01:10] <jgraham> gsnedders: The problem is that their XML support tries to enforce XML rules
  112. # [01:10] <BenMillard> Lachy, the 3 instances of class="header" are also missing from your complexdatatable.html
  113. # [01:10] <jgraham> Like no : in tag names
  114. # [01:11] <jgraham> s/tag/attribute/
  115. # [01:11] <Lachy> Looks like they've just been removed from those files
  116. # [01:11] <Lachy> press reload
  117. # [01:11] * Quits: Maurice (i=copyman@cc90688-a.emmen1.dr.home.nl) ("Disconnected...")
  118. # [01:11] <BenMillard> Lachy: yes, you're right
  119. # [01:11] <BenMillard> that's annoying
  120. # [01:11] <Lachy> I did at first, but that must have been cached
  121. # [01:11] <BenMillard> they're changing the record whilst I'm replying to it :(
  122. # [01:11] <Lachy> yeah, he's destroying the evidence
  123. # [01:12] * Joins: tantek (n=tantek@adsl-68-123-180-62.dsl.pltn13.pacbell.net)
  124. # [01:12] <BenMillard> Lachy, OK, so if you saw those attributes were present then those are the values I'll give since they are historically accurate
  125. # [01:13] <jgraham> Those attributes were present for sure. I think I mentioned it in an email
  126. # [01:13] <gsnedders> Why hasn't anyone created a decent Flickr downloader yet?
  127. # [01:13] <gsnedders> That is like, easy to use.
  128. # [01:13] <jgraham> gsnedders: What do you mean decent?
  129. # [01:13] <Philip`> gsnedders: Like, a web browser?
  130. # [01:13] <gsnedders> Philip`: But to download an entire set?
  131. # [01:13] <Philip`> gsnedders: Oh
  132. # [01:13] <BenMillard> Lachy, the absence of class="header" makes our numbers match, so at least I haven't forgotten how to count. :)
  133. # [01:13] <gsnedders> jgraham: Not having a crazily complex UI. Copy and pasting a URL from a browser would work fine.
  134. # [01:14] <Philip`> gsnedders: Write a few (dozen) lines of script to use their API?
  135. # [01:14] <Lachy> ok, so my values are wrong cause they're the new values without the classes
  136. # [01:14] <BenMillard> Lachy, correct
  137. # [01:14] <gsnedders> Philip`: Because I need something that works on my uncle's computer, so I could just write a web API
  138. # [01:14] <gsnedders> *web interface
  139. # [01:14] * gsnedders yawns
  140. # [01:15] <Lachy> BenMillard, there are only 18 header cells by my count, not 20
  141. # [01:15] <BenMillard> Lachy, 12 in the <thead>, agreed?
  142. # [01:15] <Lachy> yes
  143. # [01:15] <BenMillard> Lachy, 2 in the first column of the <tbody>?
  144. # [01:16] <Lachy> plus the 6 for budgeted, actual and forecasted in the column
  145. # [01:16] <BenMillard> Lachy, 6 in column 7 as you say
  146. # [01:16] <BenMillard> Lachy, yep, I've forgotten how to add them up :)
  147. # [01:16] <Lachy> ah, I didn't count those first 2 as headers
  148. # [01:16] <BenMillard> oh wait, 12 + 6 + 2 = 20
  149. # [01:16] <Lachy> the "Partner Portal" ones?
  150. # [01:16] <BenMillard> Lachy, yeah
  151. # [01:17] <BenMillard> they are associated as being row headers
  152. # [01:18] <Lachy> in complexdatatable.html, they are. But in noscope.html, there's nothing that indicates they are headers
  153. # [01:18] <BenMillard> "<td scope="row" id="row1" rowspan="3">Partner Portal</td>" in http://juicystudio.com/wcag/tables/complexdatatable.html
  154. # [01:18] <BenMillard> Lachy, yeah, later in my e-mail I mention that test 1 is unfair
  155. # [01:18] <Lachy> ok
  156. # [01:18] <BenMillard> and that scope="row" was used in test 3 instead of using headers+id for all the associations
  157. # [01:18] <Lachy> you mean test 2
  158. # [01:19] <BenMillard> Lachy, oh sorry you're right
  159. # [01:19] <BenMillard> scope="row" is used in addition to headers+id in test 3
  160. # [01:19] <Lachy> oh, what are the final byte counts you used? I should check the percentage given too
  161. # [01:20] <BenMillard> Lachy, 1,704 and 2,625. yes, I have made percentage errors before now :)
  162. # [01:22] <Lachy> I get 54.05%
  163. # [01:22] <BenMillard> Lachy, when talking about test file 3, I say "5 cells use <td scope> and participate in headers+id, duplicating the association." so my e-mail about that aspect is correct, I just got muddled during this review
  164. # [01:23] <BenMillard> Lachy, what is your calculation for that? Maybe I've forgotten percentage increase math...
  165. # [01:24] <Lachy> (2625 - 1704) / 1704 * 100 = 921 / 1704 * 100 = 54.05%
  166. # [01:24] <BenMillard> hmm, that's more complicated than what I did :)
  167. # [01:24] <Lachy> what did you do?
  168. # [01:25] <BenMillard> just now I tried 2,625 / 1,704 = 1.5404 so yours looks right
  169. # [01:25] <BenMillard> maybe I typod my sum first time round
  170. # [01:25] <Lachy> I assumed I needed to find the difference, and then find out what percentage that difference was with the lower value
  171. # [01:26] <BenMillard> Lachy, so me saying 36% more markup was understating the code bloat by quite a bit! thanks for spotting that
  172. # [01:26] <Lachy> our numbers are consistent. Yours (1.5404) means that 2625 is 154% the size of 1704
  173. # [01:27] <Lachy> whereas mine says it's 54% larger
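
  Note: the two calculations being compared above can be written out as follows. This is a minimal sketch using the byte counts quoted in the chat (1,704 and 2,625); the function names are made up for illustration.

```python
def size_ratio(old: int, new: int) -> float:
    """How many times the size of `old` the value `new` is (BenMillard's sum)."""
    return new / old


def percent_increase(old: int, new: int) -> float:
    """How much larger `new` is than `old`, in percent (Lachy's sum)."""
    return (new - old) / old * 100


old_bytes, new_bytes = 1704, 2625
print(f"ratio:    {size_ratio(old_bytes, new_bytes):.4f}")
print(f"increase: {percent_increase(old_bytes, new_bytes):.2f}%")
```

  The two results describe the same change: a ratio of roughly 1.54 means the larger table is about 154% the size of the smaller one, which is the same as being about 54% larger.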
  174. # [01:27] <BenMillard> Lachy, yeah that's how I interpret it
  175. # [01:27] <Philip`> (Markup-size doesn't seem a very interesting measure when these tables are probably generated by programs from databases, and no human ever needs to look at the markup, and simplicity of implementing the table-generating code seems much more relevant)
  176. # [01:28] <BenMillard> Philip`, I've seen auto-generated headers+id, for sure
  177. # [01:28] <Lachy> Philip`, throwing lots of data at people, regardless of how relevant it is, is a useful technique for winning an argument :-)
  178. # [01:28] <BenMillard> I've also seen typoed headers+id
  179. # [01:29] <Lachy> fyorfty percent of all people know that, Kent
  180. # [01:29] <Philip`> Lachy: Winning an argument is not the aim; the aim is to design the best possible system :-p
  181. # [01:29] <BenMillard> Philip`, it's also worth considering that if the generating code can be radically simpler (such as just using <th> for all headers) that reduces the likelihood of bugs in the table
  182. # [01:30] <Lachy> s/fyorfty/forfty/ (I messed the simpsons quote :-))
  183. # [01:31] <Philip`> BenMillard: It's good to encourage people to do the simplest thing, but sometimes they just have complex tables, so I thought the issue was how to support the most complex tables (e.g. whether to force them to use <th> instead of <td>)
  184. # [01:31] <BenMillard> Philip`, that's right. So if a table can be supported by plain <th> using a sane association algorithm, that's preferable over the complexity and bloat of headers+id, in my judgement.
  185. # [01:32] * Quits: tantek (n=tantek@adsl-68-123-180-62.dsl.pltn13.pacbell.net) (Connection reset by peer)
  186. # [01:32] <BenMillard> but I can well imagine irregular tables will sometimes be necessary and need headers+id, although even then all the headers could be done as <th>
  187. # [01:32] <jgraham> Hmm BenMillard keeps saying sensible things so I don't have to
  188. # [01:33] <Philip`> BenMillard: Would the headers attribute be supported only on <td>, not <th>?
  189. # [01:33] <BenMillard> Philip`, I haven't studied that in detail yet. would you like me to forward the message to you?
  190. # [01:33] * Joins: tantek (n=tantek@adsl-68-123-180-62.dsl.pltn13.pacbell.net)
  191. # [01:34] <Philip`> How would it handle something like http://factfinder.census.gov/servlet/QTTable?_bm=n&_lang=en&qr_name=DEC_2000_SF1_U_DP1&ds_name=DEC_2000_SF1_U&geo_id=05000US48487 where the numbers need to be associated with the label in the first column, but the labels in the first column also need to be associated with some random set of other label cells?
  192. # [01:34] <jgraham> Philip`: I think there is likely a use case for @headers on th although no one has actually brought forward a table that needs it (at least recently)
  193. # [01:35] <BenMillard> Philip & jgraham, I call those "hierarchical row headers" although nobody else does :)
  194. # [01:35] <Lachy> "Test file 1 erroneously uses <td> for 10 of the 20 header cells" Which headers make up the 10? I only count 9
  195. # [01:35] <BenMillard> Lachy, I'll recount
  196. # [01:35] <Lachy> actually, 11
  197. # [01:36] <Lachy> 3 dates, 2 x Partner Portal, 6 Budgeted/actual/forecast
  198. # [01:36] <Philip`> BenMillard: It might be best to not forward the email, since I have too many other things I ought to be working on instead :-)
  199. # [01:36] <BenMillard> Philip`, sure thing
  200. # [01:36] <BenMillard> Lachy, so we're talking about? http://juicystudio.com/wcag/tables/noscope.html
  201. # [01:36] <Lachy> yes
  202. # [01:36] <jgraham> Philip`: That table looks like it should actually be several smaller tables
  203. # [01:37] * Quits: dglazkov (n=dglazkov@nat/google/x-7446b87021ae62b6)
  204. # [01:37] <BenMillard> Lachy, I agree with 11. thanks!
  205. # [01:37] * Joins: shepazu (n=schepers@88.128.85.131)
  206. # [01:37] <Philip`> jgraham: I don't think splitting it into smaller tables would help with the "One race -> Asian -> Asian Indian" label hierarchy, which is the main problem
  207. # [01:37] <BenMillard> (so this is another case where I understated the error)
  208. # [01:38] <jgraham> Philip`: I think that layout would need @headers on <th>
  209. # [01:38] <Hixie> iirc you can actually do Philip`'s table with some careful use of rowspans, but i forget if i ended up making that work or not (and it's dubious whether that's desirable anyway)
  210. # [01:38] <jgraham> Philip`: Sure but it would have confused me a hell of a lot less
  211. # [01:38] <Philip`> (Also splitting it into smaller tables would make the layout go all ugly, because you want them to all be exactly the same column sizes, and there's no way to enforce that when they're multiple tables)
  212. # [01:38] <jgraham> and I can see it
  213. # [01:38] <BenMillard> Hixie, yes, rowspan works for "hierarchical row header" case...if you've got enough width to present it
  214. # [01:40] <Hixie> Philip`: i'm not sure what the best way to render that table is, but i'm pretty sure that "0. Subject, Race, One Race, Native Hawaiian and Other Pacific Islander, Other Pacific Islander 2; Number" is not the best way to read out that cell
  215. # [01:40] <BenMillard> Philip`, the table uses fixed-width, such as width="385", so you could split it and keep the fixed widths
  216. # [01:40] <Philip`> BenMillard: Then you're making assumptions about how many pixels the user's font uses
  217. # [01:40] <Hixie> Philip`: which is presumably what one would get if we encouraged people to chain headers
  218. # [01:41] <Hixie> Philip`: it should definitely be possible to link columns into having the same widths even in different tables, though css can't do that (and likely won't for some time) so i agree that in this case we shouldn't assume that it is possible
  219. # [01:41] <BenMillard> Hixie, when moving from cell to cell the more sophisticated ATs only announce the headers which have changed
  220. # [01:41] <Lachy> BenMillard, "3 of the cells using <td scope="row"> also use rowspan." - I only see 2 scope="row" in test 3
  221. # [01:42] <Hixie> BenMillard: well then it would sound exactly like if there weren't chained headers, assuming you're navigating the table linearly
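
  Note: the behaviour BenMillard attributes to the more sophisticated ATs (announce only the headers that changed since the previous cell) could look something like the sketch below. It is a guess at the general idea, not a description of any real screen reader. The function name is hypothetical, and a cell's headers are assumed to be an ordered chain from outermost to innermost.

```python
def headers_to_announce(previous, current):
    """Return the part of `current` that differs from `previous`.

    Both arguments are header chains ordered from outermost to innermost.
    Everything from the first point of difference onward is announced, so a
    change high in the hierarchy re-announces the headers beneath it.
    """
    common = 0
    for old, new in zip(previous, current):
        if old != new:
            break
        common += 1
    return list(current[common:])


# Chain taken from Hixie's example readout; the previous cell's innermost
# header is a placeholder invented for this sketch.
shared = ["Subject", "Race", "One Race",
          "Native Hawaiian and Other Pacific Islander"]
previous_cell = shared + ["(previous row label)"]
current_cell = shared + ["Other Pacific Islander 2"]

print(headers_to_announce(previous_cell, current_cell))
print(headers_to_announce([], current_cell))   # first cell: whole chain
```

  When moving linearly between sibling rows only the innermost header differs, which is Hixie's point: the readout then sounds the same as it would without chained headers.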
  222. # [01:42] <Lachy> and this assertion of yours is debatable "For scope to work here under HTML4, scope="rowgroup" must be used with the appropriate use of <tbody> around the rows which are being spanned: "
  223. # [01:42] <BenMillard> Lachy, yep, well spotted
  224. # [01:42] <Lachy> the spec is ambiguous though
  225. # [01:42] <jgraham> Hixie: FWIW Al suggested that the common AT setup is to have headers red out on demand
  226. # [01:42] <Lachy> it says row, but technically it's still in 3 rows
  227. # [01:42] <BenMillard> Lachy, does scope="row" apply to multiple rows in HTML4?
  228. # [01:42] <jgraham> s/red/read/
  229. # [01:43] <Lachy> in fact, it doesn't say one way or the other
  230. # [01:43] <Lachy> it just says "row: The current cell provides header information for the rest of the row that contains it"
  231. # [01:43] <Hixie> jgraham: that would suggest it would render as: "Zero." zero what? crap, what are the headers? "Subject, Race, One Race, Native Hawaiian and Other Pacific Islander, Other Pacific Islander 2; Number" say what now?
  232. # [01:43] <Lachy> so does that mean the rest of the <tr> that contains it, or the rest of the row(s) that it's actually in?
  233. # [01:44] <jgraham> Hixie: I agree in this case it's pretty hard to understand. But I find that table pretty hard to understand so maybe it's just a badly designed table
  234. # [01:44] <BenMillard> Lachy, it seems to think a "row" is different from a "row group" so my reading is that scope="row" applies to exactly one line of cells across the table
  235. # [01:44] <Hixie> jgraham: quite possible
  236. # [01:45] <Lachy> You say "This further exemplifies how difficult the headers+id system is to get right", after you mention errors with scope=""
  237. # [01:45] <Hixie> jgraham: but i think "Zero." zero what? crap, what are the headers? "Other Pacific Islander 2; Number" would be easier to understand.
  238. # [01:45] <BenMillard> Lachy, can we nail down the scope="row" thing first? :)
  239. # [01:45] <jgraham> Lachy: Trying to understand the HTML4 headers spec algorithm is a lost cause
  240. # [01:45] <Hixie> there's an algorithm?
  241. # [01:45] <Lachy> BenMillard, HTML4 is not clear enough to be certain one way or another
  242. # [01:45] <Hixie> i thought there was just some vague handwaving
  243. # [01:46] <jgraham> Hixie: Algorithm is a bit of a strong term
  244. # [01:46] <jgraham> vague handwaving is indeed much closer
  245. # [01:46] <BenMillard> Lachy, it seems to make as much difference between a row and a row group as it does between a column and a column group, though...
  246. # [01:47] <BenMillard> Lachy, indeed, why have a "rowgroup" value if "row" was intended to cover that case?
  247. # [01:47] <Lachy> hmm, perhaps.
  248. # [01:47] <jgraham> Hixie: re: what AT should read out; as I've said before this seems like exactly the sort of question that user testing could help answer
  249. # [01:47] <jgraham> BenMillard: If you care the Table Inspector has a HTML4 mode
  250. # [01:47] <BenMillard> Lachy, I agree that it's debatable, so I guess either interpretation is right. :)
  251. # [01:47] <Lachy> but I don't think it's a particularly strong argument
  252. # [01:48] <jgraham> BenMillard: I wouldn't expect miracles from it though
  253. # [01:48] <Lachy> anyway, with regards to that assertion I quoted above, the evidence you presented immediately before it doesn't support it
  254. # [01:50] <BenMillard> Lachy, I see what you mean
  255. # [01:51] * Quits: tantek (n=tantek@adsl-68-123-180-62.dsl.pltn13.pacbell.net) (Read error: 110 (Connection timed out))
  256. # [01:51] <BenMillard> Lachy, my thinking was that headers+id "missed out" 8 associations in favour of using scope, while headers+id also duplicates the 6 associations which are made by scope
  257. # [01:52] <BenMillard> Lachy, I interpret the gaps and overlapping as authoring mistakes...
  258. # [01:52] <Lachy> if they're consistent, it's not really a mistake. Just redundant
  259. # [01:52] <BenMillard> Lachy, they are consistent, that's true
  260. # [01:53] * Quits: weinig (n=weinig@nat/apple/x-416c389b1b319027)
  261. # [01:53] <BenMillard> Lachy, what sentence would you suggest in place of that one?
  262. # [01:55] <Lachy> I don't know
  263. # [01:57] * Joins: aboodman3 (n=aboodman@nat/google/x-c9301e41a04d7076)
  264. # [01:57] * aboodman3 is now known as aboodman
  265. # [01:58] <BenMillard> Lachy, how about I strike that sentence and change the 1st one in that paragraph to "So, test file 3 uses a weird patchwork of techniques, with mistakes in the use of scope and colspan."
  266. # [01:59] <Lachy> yeah
  267. # [02:00] <BenMillard> Lachy, done. did you find anything else?
  268. # [02:01] <BenMillard> jgraham, thanks for your review, btw. Short but sweet. :)
  269. # [02:02] <Hixie> Lachy: yeah, but to do that we'd have to make a number of variants of that table, and then give each variant to three or four different users, and ask each user to answer questions about the table
  270. # [02:02] <Hixie> Lachy: so if we tried, say, three variants, and had three users, that's nine users to get under a usability study video camera
  271. # [02:03] * Joins: KevinMarks (n=KevinMar@nat/google/x-234621707c1d44fc)
  272. # [02:03] <BenMillard> Hixie, is that towards jgraham?
  273. # [02:03] <Hixie> um
  274. # [02:03] <Hixie> yes
  275. # [02:03] <Hixie> my bad
  276. # [02:05] <BenMillard> I'll leave sending the mail about tables until tomorrow. I got a snapshot of all 3 tests.
  277. # [02:05] <BenMillard> Philip`, that table is going into my collection under "To Do".
  278. # [02:05] <jgraham> Hixie: Well I'm not sure how many people 9 is compared to the number that, say, Josh works with in a day. Plus given those 9 people they could each look at several different tables so once you had enough people to get data on one type of table, you'd have enough to get data on several
  279. # [02:06] <Hixie> certainly would be great if we could do it
  280. # [02:07] <jgraham> Even without a full test like that one could try a single user with several similar tables and different amounts of verbosity, for example
  281. # [02:07] * Joins: weinig (n=weinig@nat/apple/x-a30f8313d04811a7)
  282. # [02:07] <jgraham> (one user obviously isn't a very good sample)
  283. # [02:10] * Joins: tantek (n=tantek@adsl-99-137-128-33.dsl.snfc21.sbcglobal.net)
  284. # [02:13] * Joins: othermaciej_ (n=mjs@17.244.17.18)
  285. # [02:18] <BenMillard> Philip`, I've actually put some notes with it, so it ended up as "USA FactFinder: Demographic Characteristics, 2000" here: http://projectcerbera.com/web/study/2008/collection#tables-government
  286. # [02:19] * Quits: aroben (n=aroben@unaffiliated/aroben) (Read error: 104 (Connection reset by peer))
  287. # [02:21] * Quits: tantek (n=tantek@adsl-99-137-128-33.dsl.snfc21.sbcglobal.net)
  288. # [02:21] * Joins: othermaciej__ (n=mjs@17.244.17.18)
  289. # [02:21] * Quits: othermaciej_ (n=mjs@17.244.17.18) (Read error: 104 (Connection reset by peer))
  290. # [02:25] * Quits: shepazu (n=schepers@88.128.85.131) (Read error: 110 (Connection timed out))
  291. # [02:29] * Quits: othermaciej (n=mjs@17.203.15.200) (Read error: 110 (Connection timed out))
  292. # [02:42] * Dashiva equips vast-browser-wing-conspiracy hat
  293. # [02:43] * Joins: tantek (n=tantek@66-117-137-125.dsl.lmi.net)
  294. # [02:52] * othermaciej__ is now known as othermaciej
  295. # [02:55] * Parts: BenMillard (i=cerbera@cpc1-flee1-0-0-cust285.glfd.cable.ntl.com)
  296. # [03:00] <takkaria> ah, it's nice when you can mark 81 messages as read safely
  297. # [03:01] * Quits: svl (n=me@ip565744a7.direct-adsl.nl) ("And back he spurred like a madman, shrieking a curse to the sky.")
  298. # [03:08] * Quits: syp_ (n=syp@lasigpc9.epfl.ch) (simmons.freenode.net irc.freenode.net)
  299. # [03:08] * Quits: jacobolus (n=jacobolu@pool-71-119-188-52.lsanca.dsl-w.verizon.net) (simmons.freenode.net irc.freenode.net)
  300. # [03:08] * Quits: Philip` (n=philip@zaynar.demon.co.uk) (simmons.freenode.net irc.freenode.net)
  301. # [03:08] * Quits: hendry (n=hendry@nox.vm.bytemark.co.uk) (simmons.freenode.net irc.freenode.net)
  302. # [03:08] * Quits: bzed (n=bzed@devel.recluse.de) (simmons.freenode.net irc.freenode.net)
  303. # [03:08] * Quits: bdash (n=bdash@fire/developer/bdash) (simmons.freenode.net irc.freenode.net)
  304. # [03:08] * Quits: didymos (i=jho@rapwap.razor.dk) (simmons.freenode.net irc.freenode.net)
  305. # [03:08] * Quits: [YaaL] (i=yaal@hell.pl) (simmons.freenode.net irc.freenode.net)
  306. # [03:08] * Quits: uriel (n=uriel@h677044.serverkompetenz.net) (simmons.freenode.net irc.freenode.net)
  307. # [03:08] * Quits: deltab (n=deltab@82-36-30-34.cable.ubr02.smal.blueyonder.co.uk) (simmons.freenode.net irc.freenode.net)
  308. # [03:08] * Joins: jacobolus (n=jacobolu@pool-71-119-188-52.lsanca.dsl-w.verizon.net)
  309. # [03:08] * Joins: syp_ (n=syp@lasigpc9.epfl.ch)
  310. # [03:08] * Joins: Philip` (n=philip@zaynar.demon.co.uk)
  311. # [03:08] * Joins: hendry (n=hendry@nox.vm.bytemark.co.uk)
  312. # [03:08] * Joins: bzed (n=bzed@devel.recluse.de)
  313. # [03:08] * Joins: bdash (n=bdash@fire/developer/bdash)
  314. # [03:08] * Joins: [YaaL] (i=yaal@hell.pl)
  315. # [03:08] * Joins: uriel (n=uriel@h677044.serverkompetenz.net)
  316. # [03:08] * Joins: didymos (i=jho@rapwap.razor.dk)
  317. # [03:08] * Joins: deltab (n=deltab@82-36-30-34.cable.ubr02.smal.blueyonder.co.uk)
  318. # [03:09] * Joins: weinig_ (n=weinig@nat/apple/x-9aa5e2e8f5ca56f3)
  319. # [03:09] * Quits: othermaciej (n=mjs@17.244.17.18) (Read error: 104 (Connection reset by peer))
  320. # [03:09] * Joins: othermaciej (n=mjs@17.244.17.18)
  321. # [03:13] * Joins: tantek_ (n=tantek@66-117-137-125.dsl.lmi.net)
  322. # [03:13] * Quits: tantek (n=tantek@66-117-137-125.dsl.lmi.net) (Read error: 104 (Connection reset by peer))
  323. # [03:14] * Quits: KevinMarks (n=KevinMar@nat/google/x-234621707c1d44fc) (Connection timed out)
  324. # [03:17] * Joins: othermaciej_ (n=mjs@17.244.17.18)
  325. # [03:17] * Quits: othermaciej (n=mjs@17.244.17.18) (Read error: 104 (Connection reset by peer))
  326. # [03:21] * Joins: tantek (n=tantek@66-117-137-125.dsl.lmi.net)
  327. # [03:21] * Quits: tantek_ (n=tantek@66-117-137-125.dsl.lmi.net) (Read error: 104 (Connection reset by peer))
  328. # [03:24] * Quits: weinig (n=weinig@nat/apple/x-a30f8313d04811a7) (Read error: 110 (Connection timed out))
  329. # [03:27] * Quits: tantek (n=tantek@66-117-137-125.dsl.lmi.net)
  330. # [03:52] * Quits: bdash (n=bdash@fire/developer/bdash) (Read error: 110 (Connection timed out))
  331. # [03:56] <takkaria> http://www.squarefree.com/burningedge/2008/08/29/2008-08-29-trunk-builds/ -- looks like yesterday was a pretty productive day for gecko
  332. # [03:59] * Joins: alyosha (n=anime4ch@74.93.182.234)
  333. # [03:59] <jruderman> that covers changes in the last two weeks, not just yesterday
  334. # [03:59] <jruderman> we only land that much in one day on crazy code freeze days
  335. # [04:00] <takkaria> ah, I thought it had rather a lot on it for a day
  336. # [04:00] <takkaria> still, pretty good going. :)
  337. # [04:01] <alyosha> hi ppl
  338. # [04:01] <alyosha> what do u guys think of IE 8 beta 2's HTML 5 support?
  339. # [04:02] <alyosha> I noticed (and I'm 100% sure I'm not the only one) a regression with unrecognized elements (eg. html 5 sectioning elements and inline elements such as mark)
  340. # [04:03] <alyosha> hopefully they'll fix it b4 final release
  341. # [04:10] * Quits: eseidel (n=eseidel@nat/google/x-e99074dd86d18be5)
  342. # [04:13] <takkaria> have they removed the document.createElement() hack?
  343. # [04:13] <alyosha> yeah, pretty much
  344. # [04:14] <alyosha> but the elements to seem to show up correctly in the DOM tree in IE 8's developer tools
  345. # [04:15] <alyosha> *do
  346. # [04:16] * Joins: tantek (n=tantek@66-117-137-125.dsl.lmi.net)
  347. # [04:20] * Joins: eseidel (n=eseidel@nat/google/x-9aff75974286f61f)
  348. # [04:21] <alyosha> I think it's probably an unintentional bug and they should fix it before final release, but I don't know for sure
  349. # [04:24] * Quits: eseidel (n=eseidel@nat/google/x-9aff75974286f61f) (Client Quit)
  350. # [04:24] <alyosha> and the interesting thing is that the IE7 mode button is disabled on the html 5 doctype
  351. # [04:25] * Joins: hdh (n=hdh@118.71.121.76)
  352. # [04:25] <alyosha> even though IE 7 rendering mode can be hacked to display new elements
  353. # [04:26] * Quits: tndH (i=Rob@adsl-77-86-6-71.karoo.KCOM.COM) ("ChatZilla 0.9.83-rdmsoft [XULRunner 1.9/2008061013]")
  354. # [04:26] <alyosha> hmmm, what do u get with html 5 doctype and <meta http-equiv="X-UA-Compatible" content="IE=7">?
  355. # [04:28] * Joins: franksalim (n=frank@user-64-9-234-71.googlewifi.com)
  356. # [04:30] <alyosha> html 5 doctype overrides the meta thingy
  357. # [04:31] <alyosha> IE 7 mode not available for html 5
  358. # [04:32] * Quits: othermaciej_ (n=mjs@17.244.17.18)
  359. # [04:38] <alyosha> actually, they didn't remove the document.createElement() hack. It's just in the CSS, it makes unrecognized elements "UNKNOWN"
  360. # [04:38] <alyosha> just tried disabling script with the hack, and it still works
  361. # [04:38] <alyosha> but the styles just aren't applied
  362. # [04:42] <alyosha> style attributes are applied after the hack, but external stylesheets are not applied
  363. # [04:49] * Quits: weinig_ (n=weinig@nat/apple/x-9aa5e2e8f5ca56f3)
  364. # [04:59] * Quits: franksalim (n=frank@user-64-9-234-71.googlewifi.com) (Read error: 110 (Connection timed out))
  365. # [05:18] * Quits: tantek (n=tantek@66-117-137-125.dsl.lmi.net)
  366. # [05:25] <Hixie> you gotta wonder what a mess their codebase is to get this kind of behaviour
  367. # [05:26] <alyosha> yeah, guess so
  368. # [05:27] <alyosha> IE 7 mode renders fine and shows the stylesheets fine too, but it can only be activated through developer tools or by adding the website to compatibility mode
  369. # [05:27] <alyosha> the meta thing is overridden by the doctype and the button is gone too
  370. # [05:28] <alyosha> gotta love M$, they make sure web designers won't lose their jobs (constantly gotta fix all their problems)
  371. # [05:28] <alyosha> lol
  372. # [05:30] <alyosha> nvm, adding it to compatibility view doesn't work either
  373. # [05:31] <alyosha> does MS have a bug tracker somewhere?
  374. # [05:32] <Hixie> https://connect.microsoft.com/feedback/AdvancedSearch.aspx?SiteID=136&Status=1&FeedbackType=1 i think?
  375. # [05:33] <alyosha> ooh, cool, Microsoft isn't completely submerged in the last decade after all.
  376. # [05:34] <Hixie> if you can get it to work, let me know
  377. # [05:35] <alyosha> sure. but I think most likely we'll have to wait for a fix from MS or do something like <header id="header"> ... #header { /*style here*/ } if they don't fix it
  378. # [05:39] <alyosha> according to this report IE8b1 didn't have this problem, so it's most likely a regression in IE8b2: https://connect.microsoft.com/IE/feedback/ViewFeedback.aspx?FeedbackID=364356
  379. # [05:42] <alyosha> well, g2g, l8rz
  380. # [05:42] * Parts: alyosha (n=anime4ch@74.93.182.234)
  381. # [05:55] * Quits: jruderman (n=jruderma@c-67-180-39-55.hsd1.ca.comcast.net)
  382. # [05:57] * Joins: jruderman (n=jruderma@c-67-180-39-55.hsd1.ca.comcast.net)
  383. # [06:17] * Joins: aboodman2 (n=aboodman@nat/google/x-c029fadc328fea49)
  384. # [06:18] * Joins: eseidel (n=eseidel@c-24-130-13-197.hsd1.ca.comcast.net)
  385. # [06:20] * Joins: eseidel_ (n=eseidel@72.14.224.1)
  386. # [06:27] * Quits: aboodman (n=aboodman@nat/google/x-c9301e41a04d7076) (Read error: 110 (Connection timed out))
  387. # [06:29] * Joins: aboodman (n=aboodman@216.239.45.19)
  388. # [06:32] * Joins: aboodman3 (n=aboodman@69.36.227.135)
  389. # [06:36] * Quits: eseidel (n=eseidel@c-24-130-13-197.hsd1.ca.comcast.net) (Read error: 110 (Connection timed out))
  390. # [06:42] * Quits: aboodman2 (n=aboodman@nat/google/x-c029fadc328fea49) (Read error: 110 (Connection timed out))
  391. # [06:44] * Joins: weinig (n=weinig@c-71-198-176-23.hsd1.ca.comcast.net)
  392. # [06:46] * Quits: aboodman (n=aboodman@216.239.45.19) (Read error: 110 (Connection timed out))
  393. # [06:54] * Joins: Kuruma (n=Kuruman@h123-176-107-050.catv01.catv-yokohama.ne.jp)
  394. # [07:09] * eseidel_ is now known as eseidel
  395. # [07:11] * Joins: eseidel_ (n=eseidel@c-24-130-13-197.hsd1.ca.comcast.net)
  396. # [07:28] <Hixie> "Ian's approach completely removes HTML conformance checking as a
  397. # [07:28] <Hixie> mechanism to introduce authors to accessibility issues."
  398. # [07:28] <Hixie> -- http://html4all.org/pipermail/list_html4all.org/2008-August/000977.html
  399. # [07:28] <Hixie> well at least they admit that they are trying to use conformance checking for their own purposes
  400. # [07:28] * Quits: eseidel (n=eseidel@72.14.224.1) (Read error: 110 (Connection timed out))
  401. # [07:29] <Hixie> and good to see others on that thread disagreeing with it :-)
  402. # [07:33] * Quits: aboodman3 (n=aboodman@69.36.227.135) (Read error: 110 (Connection timed out))
  403. # [07:36] * Quits: csarven (n=csarven@modemcable144.140-202-24.mc.videotron.ca) ("http://www.csarven.ca/")
  404. # [07:36] * Quits: weinig (n=weinig@c-71-198-176-23.hsd1.ca.comcast.net)
  405. # [08:05] * Quits: eseidel_ (n=eseidel@c-24-130-13-197.hsd1.ca.comcast.net) (Read error: 110 (Connection timed out))
  406. # [08:18] * Joins: shepazu (n=schepers@88.128.85.131)
  407. # [08:21] * Quits: hdh (n=hdh@118.71.121.76) ("Konversation terminated!")
  408. # [08:21] <hsivonen> wow. when I fixed bugs in my validation harness, it ran in 4 hours and the output was only 83.4 MB.
  409. # [08:22] * Joins: hdh (n=hdh@118.71.121.171)
  410. # [08:24] <hsivonen> annevk: typo. thanks
  411. # [08:37] <Hixie> hsivonen: heh
  412. # [08:38] * Joins: aboodman3 (n=aboodman@dsl081-073-212.sfo1.dsl.speakeasy.net)
  413. # [08:39] * Joins: othermaciej (n=mjs@c-69-181-42-194.hsd1.ca.comcast.net)
  414. # [08:40] * Joins: aboodman4 (n=aboodman@dsl081-073-212.sfo1.dsl.speakeasy.net)
  415. # [08:54] <hsivonen> whoa! there are many more 0-error docs than I would have thought
  416. # [08:54] * Joins: KevinMarks (n=KevinMar@c-98-207-134-151.hsd1.ca.comcast.net)
  417. # [08:56] * Joins: tantek (n=tantek@adsl-63-195-114-133.dsl.snfc21.pacbell.net)
  418. # [08:57] <Hixie> hsivonen: 2?
  419. # [08:58] * Quits: aboodman3 (n=aboodman@dsl081-073-212.sfo1.dsl.speakeasy.net) (Read error: 110 (Connection timed out))
  420. # [08:58] <hsivonen> Hixie: 4514
  421. # [08:58] <Hixie> out of a million?
  422. # [08:58] <hsivonen> out of 516875
  423. # [08:58] <hsivonen> and manual verification shows that it's really so
  424. # [08:58] <hsivonen> however, this ignores the document mode
  425. # [08:58] <Hixie> 0.87%
  426. # [08:59] <hsivonen> so doctypeless files count
  427. # [08:59] <Hixie> does bgcolor in a transitional doc count as pass or fail?
  428. # [08:59] <hsivonen> fail
  429. # [08:59] <hsivonen> this is HTML5 rules
  430. # [08:59] <hsivonen> except for doctype
  431. # [08:59] <Hixie> wow that's not bad then
  432. # [09:00] <hsivonen> note that omitted alt doesn't count as an error
  433. # [09:00] <Hixie> what are we saying, that's horrific. but still. higher than i expected.
  434. # [09:00] <hsivonen> and IRIs on non-UTF-8 pages pass
  435. # [09:00] <hsivonen> no parse errors (doctype errors ignored) is 29%
  436. # [09:01] <hsivonen> which is rather high compared to your old numbers
  437. # [09:02] <hsivonen> but now the results look pretty consistent with what I've seen before in terms of the relative frequencies
  438. # [09:03] <Hixie> i had two numbers, one that counted /> and doctypes as errors and one that didn't
  439. # [09:03] <hsivonen> ah
  440. # [09:04] <Hixie> i forget what my exact numbers were
  441. # [09:04] <Hixie> but one was about 70% and one was about 90%
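A quick check of the proportion quoted above (4514 zero-error documents out of 516875), as a minimal sketch. The helper name `percentage` is made up for illustration; only the two counts come from the log.

```python
def percentage(part, whole):
    """Return part/whole expressed as a percentage."""
    return part / whole * 100


# Counts quoted by hsivonen: documents with zero errors, and total documents checked.
zero_error_docs = 4514
total_docs = 516875

print(f"{percentage(zero_error_docs, total_docs):.2f}%")
```

This agrees with the 0.87% figure Hixie gives.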
  442. # [09:04] * Quits: shepazu (n=schepers@88.128.85.131) (Read error: 110 (Connection timed out))
  443. # [09:19] * Joins: bdash (n=bdash@fire/developer/bdash)
  444. # [09:33] <annevk> hsivonen, MB or GB?
  445. # [09:34] <annevk> gsnedders, hmm, you didn't do your checkin
  446. # [09:37] * Joins: aboodman5 (n=aboodman@dsl081-073-212.sfo1.dsl.speakeasy.net)
  447. # [09:41] * Joins: Maurice (i=copyman@cc356098-a.emmen1.dr.home.nl)
  448. # [09:42] <hsivonen> annevk: MB
  449. # [09:43] <hsivonen> annevk: the harness used to have a simple but serious bug
  450. # [09:43] <annevk> but you expected 80GB initially?!
  451. # [09:43] <hsivonen> annevk: 60 GB actually, but that expectation was based on the bug, too
  452. # [09:43] <annevk> ok
  453. # [09:44] * Joins: GregHouston (n=ghouston@adsl-75-6-6-153.dsl.spfdmo.sbcglobal.net)
  454. # [09:50] * Joins: myakura (n=myakura@p3216-ipbf5106marunouchi.tokyo.ocn.ne.jp)
  455. # [09:54] * Quits: aboodman4 (n=aboodman@dsl081-073-212.sfo1.dsl.speakeasy.net) (Read error: 110 (Connection timed out))
  456. # [09:54] <Hixie> hsivonen: very interesting results
  457. # [09:55] <Hixie> hsivonen: these results really argue for consolidating all "attribute [known presentational attribute] not allowed" messages into a single message "This page contains presentational markup. More details... Help on removing presentational markup..."
  458. # [09:56] <Hixie> wow, 7% of pages had an </embed> ?
  459. # [09:57] <hsivonen> so it seems
  460. # [09:57] <hsivonen> crazy
  461. # [09:57] <annevk> lots of people think <embed> needs a closing tag, I once did so too
  462. # [09:57] <annevk> it's not like there was good documentation out there on how it works...
  463. # [09:59] <Hixie> wow, malformed byte sequences aren't that common either
  464. # [09:59] <annevk> "No “p” element in scope but a “p” end tag seen." 9%!
  465. # [10:00] <annevk> madness
  466. # [10:00] <Hixie> that's probably a lot of <p><table></table></p>-type stuff
  467. # [10:01] <annevk> and 5% had "Element “frameset” not allowed in this context. (The parent was element “html”.) Suppressing further errors from this subtree." so many frames still around?
  468. # [10:01] <Hixie> this sample didn't bias for date of creation
  469. # [10:01] <Hixie> so it includes stuff going back many years
  470. # [10:01] <Hixie> there's a lot of old content out there still
  471. # [10:03] <Hixie> sigh i really don't want to reintroduce <script language="">, people typo it so much
  472. # [10:03] <Hixie> and the & issue is a sad one
  473. # [10:03] <annevk> >2% uses <head profile>
  474. # [10:04] <Hixie> iirc there's a lot of pages that have <head profile=""> (blank)
  475. # [10:04] <hsivonen> annevk: wordpress.com gives distinct host names to users
  476. # [10:04] <hsivonen> annevk: livejournal, too
  477. # [10:04] <Hixie> like there are a lot of <a> elements with shape="rect"
  478. # [10:04] <hsivonen> annevk: I was too lazy to deal with those
  479. # [10:04] <hsivonen> annevk: although I did collapse MySpace profiles
  480. # [10:05] <Hixie> hsivonen: i'll give you a domain-separated set of urls next time instead of site-separated
  481. # [10:06] <Hixie> maybe we should make & followed by alphanumerics, followed by =, a non-ambiguous ampersand
  482. # [10:06] <Hixie> that might deal with a bunch of these & errors
  483. # [10:06] <annevk> "Bad value (consolidated) for attribute “lang” from namespace “http://www.w3.org/XML/1998/namespace” on element “html”: Bad language tag: Bad variant subtag." XML sites were included?
  484. # [10:07] <hsivonen> annevk: no
  485. # [10:07] <hsivonen> annevk: the validator sees HTML lang as XML lang internally
  486. # [10:07] <takkaria> Hixie: I think that could be a big win for authoring
  487. # [10:07] <hsivonen> annevk: and these messages weren't fully sanitized for UI consumption
  488. # [10:07] <annevk> Hixie, maybe also allow anything but [a-Z#]
  489. # [10:08] <annevk> to follow it
  490. # [10:08] <Hixie> annevk: ?
  491. # [10:08] <annevk> &" would be conforming
  492. # [10:08] <annevk> and so would 2&2
  493. # [10:09] <Hixie> the character encoding thing -- we could make <meta charset> allowed if not preceded by any non-ASCII
  494. # [10:09] <annevk> or (&)
  495. # [10:09] <Hixie> annevk: i posit that the problem is just urls in attributes
  496. # [10:10] * Quits: hdh (n=hdh@118.71.121.171) (Read error: 104 (Connection reset by peer))
  497. # [10:11] <takkaria> fwiw I'd prefer the "get a character reference" algorithm not to depend on whether you're in an attribute value state or not
  498. # [10:11] <annevk> I don't see what's wrong loosening them up both, given that you keep several extension points
  499. # [10:11] <annevk> takkaria, it already does
  500. # [10:12] <annevk> takkaria, and if we are to keep compat with IE, it has to be that way
  501. # [10:13] <takkaria> I mean in this particular case. i.e. if you can paste an unescaped URL into an attribute value you should also be able to conformingly paste it outside an attribute value
  502. # [10:15] <annevk> that wouldn't work well
  503. # [10:16] <annevk> eg, it would go wrong with &AMP= which does different things
  504. # [10:17] <takkaria> mm, that's a point
  505. # [10:18] <takkaria> ah well. it would be nice, though
  506. # [10:21] <annevk> at this point chaals would ask for a pony
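The ampersand cases discussed above can be sketched roughly as follows. This is an illustration only, not the HTML5 tokenizer algorithm and not the validator's code: the function name and the three labels are invented here. The first rule follows annevk's earlier description (an ampersand followed by whitespace, end of input, another ampersand, or the start of a tag or comment need not be escaped); the second is Hixie's suggestion of treating "&" followed by alphanumerics and then "=" as non-ambiguous.

```python
import re

# Hixie's suggested extra case: "&", then alphanumerics, then "=" (typical of URL query strings).
PROPOSED_URL_CASE = re.compile(r"&[A-Za-z0-9]+=")


def classify_ampersands(text):
    """Return (index, label) for every '&' in text (labels are illustrative only)."""
    results = []
    for match in re.finditer("&", text):
        i = match.start()
        rest = text[i + 1:]
        if rest == "" or rest[0] in " \t\n\r\f&<":
            # Followed by EOF, whitespace, another ampersand, or a tag/comment start.
            label = "unambiguous"
        elif PROPOSED_URL_CASE.match(text, i):
            label = "proposed-unambiguous"
        else:
            # Either a real character reference or an ampersand that should be escaped.
            label = "other"
        results.append((i, label))
    return results


print(classify_ampersands("AT&T"))
print(classify_ampersands("page?a=1&b=2"))
```

The sketch deliberately ignores annevk's objection about cases such as &AMP=, where existing parsers already decode a reference; a real rule would have to exclude known entity names.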
  507. # [10:59] * Joins: primal1 (n=primal1@pool-72-87-132-196.lsanca.dsl-w.verizon.net)
  508. # [10:59] <annevk> grmbl, how do you properly configure lxml?
  509. # [11:00] <annevk> unzipped it's 25MB
  510. # [11:05] * Joins: ROBOd (n=robod@89.122.216.38)
  511. # [11:42] <hsivonen> Unsupported character encoding name: “iso-utf-8”. Will continue sniffing.
  512. # [11:42] <hsivonen> Unsupported character encoding name: “44-iso-8859-1”. Will continue sniffing.
  513. # [11:43] <hsivonen> crazy ebcdic charset in HTTP: http://web-sniffer.net/?url=http%3A%2F%2Fwww.antalis.fr%2Fsitesweb%2FFO%2Fpages%2Finterne-2-66-2122-rich_text-73228.html&submit=Submit&http=1.1&type=GET&uak=0
  514. # [11:44] <hsivonen> Unsupported character encoding name: “gb2312,big5,euc-kr”. Will sniff.
  515. # [11:45] <hsivonen> Unsupported character encoding name: “zh-tw”. Will sniff.
  516. # [11:45] <hsivonen> you can't make this stuff up
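For comparison, a small sketch of how such labels could be screened. It uses Python's own codec registry via `codecs.lookup`, which is only a stand-in: the validator's list of supported encodings is not shown in this log and may differ. The helper name `is_known_encoding` is hypothetical.

```python
import codecs


def is_known_encoding(label):
    """Return True if Python's codec registry recognises the label."""
    try:
        codecs.lookup(label)
    except LookupError:
        return False
    return True


# Labels quoted in the log, plus one well-formed label for contrast.
for label in ["iso-utf-8", "44-iso-8859-1", "gb2312,big5,euc-kr", "zh-tw", "utf-8"]:
    print(label, "->", is_known_encoding(label))
```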
  517. # [12:02] <jgraham> hsivonen: btw, I'm not sure that such a thing as an unbiased sample of webpages exists
  518. # [12:02] * Quits: tantek (n=tantek@adsl-63-195-114-133.dsl.snfc21.pacbell.net) (Read error: 104 (Connection reset by peer))
  519. # [12:03] * Joins: tantek (n=tantek@adsl-63-195-114-133.dsl.snfc21.pacbell.net)
  520. # [12:03] <hsivonen> jgraham: sure. I said it was biased. :-)
  521. # [12:03] <Philip`> You can't even know how it's biased, because you can't know what the population is
  522. # [12:04] <jgraham> hsivonen: I know. I just think it's a tautology
  523. # [12:04] <hsivonen> yeah
  524. # [12:04] <hsivonen> and, yet, with different page sets, the same common errors come to the top
  525. # [12:07] <jgraham> In some sense approximately all the pages on the web are autogenerated pages which use the url to determine the content e.g. calendar.example.com/year/month/day with only implementation limits on the value of year
  526. # [12:08] <jgraham> So an unbiased sample of the whole population of http URLs that return 200 would be very misleading
  527. # [12:09] <Philip`> "approximately all" is not a concept that makes sense, where there's an infinite number of pages
  528. # [12:11] <hsivonen> more to the point, the number of pages is countably infinite, which should make counting proportions a bit more tractable
  529. # [12:12] <hsivonen> Unsupported character encoding name: “big6”. Will sniff.
  530. # [12:12] <gsnedders> annevk: I asked if you wanted me to do it last night so you could work on it this morning. I got no answer :P
  531. # [12:12] <Philip`> Positive integers are countably infinite too, but it doesn't make sense to ask for an unbiased random sampling of positive integers
  532. # [12:13] <gsnedders> annevk: I took the default-lazy solution
  533. # [12:13] <hsivonen> Philip`: true, but you can say that half of the integers are positive
  534. # [12:13] <Philip`> hsivonen: No you can't :-p
  535. # [12:14] <annevk> gsnedders, I thought the default was yes!
  536. # [12:14] <Philip`> For every positive integer you give me, I'll give you back two negative integers, so there's twice as many :-)
  537. # [12:14] <hsivonen> Philip`: hmm. right.
  538. # [12:14] <hsivonen> now I appear silly and badly educated
  539. # [12:14] * gsnedders attempts to cd Documents/Stuff\ I\'m\ Working\ On/spec-gen
  540. # [12:14] <annevk> gsnedders, I would appreciate a bundle of lxml+anolis+html5lib so I can just write the frontend script and don't have to worry about the bundling as I'm really bad at that
  541. # [12:15] * annevk tried it this morning and couldn't get the lxml dependency to work
  542. # [12:15] <gsnedders> annevk: I've never tried bundling :)
  543. # [12:15] <gsnedders> annevk: lxml is written in C, which may make it harder
  544. # [12:15] * Quits: Amorphous (i=jan@unaffiliated/amorphous) ("shutdown")
  545. # [12:15] <Philip`> (It does make sense to ask for an unbiased random real number between 0 and 1, even though that's an uncountable set)
  546. # [12:15] <annevk> gsnedders, I think that's the problem, yes
  547. # [12:15] <Philip`> (or at least I think it makes sense)
  548. # [12:16] <gsnedders> But it really does need to be for the sake of being reasonably quick
  549. # [12:17] * Quits: primal1 (n=primal1@pool-72-87-132-196.lsanca.dsl-w.verizon.net)
  550. # [12:17] <annevk> what's a difference between a pleonasm and tautology?
  551. # [12:17] <Hixie> hsivonen, Philip`: in this particular case the population was itself a (biased, non-random) subset of google's index
  552. # [12:18] <annevk> ah I see, tautology is also used in logic
  553. # [12:19] <Hixie> a tautology is specifically being overly specific in a redundant manner. a pleonasm is just using too many words. as i understand it.
  554. # [12:20] <Philip`> I think the logical meaning of tautology is a statement that's true regardless of the values of any variables in it
  555. # [12:20] <annevk> maybe the Dutch and English pleonasm are different then (in Dutch "round circle" is considered a "pleonasme")
  556. # [12:20] * Joins: virtuelv (n=virtuelv@163.80-202-65.nextgentel.com)
  557. # [12:20] <annevk> Philip`, yeah
  558. # [12:21] * Philip` guesses that must include all true statements that don't have any variables
  559. # [12:22] <annevk> "2. Logic. An empty or vacuous statement composed of simpler statements in a fashion that makes it logically true whether the simpler statements are factually true or false; for example, the statement Either it will rain tomorrow or it will not rain tomorrow."
  560. # [12:24] <GregHouston> Logical "proofs" of the existence of God generally fall into the category of a tautology.
  561. # [12:26] <annevk> gsnedders, anyway, for you the stuff is running right? can't you just zip that dir? :)
  562. # [12:27] <gsnedders> annevk: Only if you're running OS X/x86 :)
  563. # [12:27] <gsnedders> As of course the compiled C stuff…
  564. # [12:29] <annevk> grmbl
  565. # [12:29] * Joins: tndH (n=Rob@adsl-77-86-6-71.karoo.KCOM.COM)
  566. # [12:32] <annevk> so how do I install lxml?
  567. # [12:32] <annevk> running setup.py install fails
  568. # [12:33] <gsnedders> annevk: http://codespeak.net/lxml/installation.html :P
  569. # [12:34] <virtuelv> annevk: sudo apt-get install python-lxml :P
  570. # [12:35] <annevk> hmm
  571. # [12:35] * annevk wonders if dreamhost supports that
  572. # [12:35] <virtuelv> they don't
  573. # [12:36] <gsnedders> You need to install it in a custom path
  574. # [12:36] <virtuelv> on slicehost, that stuff is a bit easier, given that you have root
  575. # [12:36] <annevk> "annevk is not in the sudoers file. This incident will be reported."
  576. # [12:37] <annevk> gsnedders, DreamHost doesn't have easy_install
  577. # [12:37] * Joins: Amorphous (i=jan@unaffiliated/amorphous)
  578. # [12:40] <Philip`> Do they have hard_install?
  579. # [12:41] <Hixie> there appears to be an inverse correlation between how much actual useful research someone has done, and how much they ask people who are doing research to do more
  580. # [12:41] <annevk> -_-
  581. # [12:42] * gsnedders is gonna have to install it on (mt)
  582. # [12:43] <Philip`> Hixie: That would be because the people who can do research themselves do it themselves instead of having to ask others :-)
  583. # [12:43] <annevk> grmbl, even if I do apt-get on my local machine it complains about lxml.html not being there :/
  584. # [12:43] <Hixie> that and they know how much work it is, i imagine
  585. # [12:43] <Philip`> It would be nicer if they said *why* they wanted that research, and what useful information it would be likely to reveal
  586. # [12:44] * Joins: svl (n=me@ip565744a7.direct-adsl.nl)
  587. # [12:50] <annevk> gsnedders, I guess the lxml dependency is pretty big?
  588. # [12:50] <gsnedders> annevk: Yeah.
  589. # [12:50] <annevk> sigh
  590. # [12:51] <gsnedders> annevk: It's the structure used for the tree everywhere
  591. # [12:51] <jgraham> Philip`: It's not clear to me that there are an infinite number of web pages given likely limits on URL length supported by servers
  592. # [12:52] <jgraham> annevk: If you want python to work sensibly on Dreamhost you have to install it yourself under your home directory
  593. # [12:52] <jgraham> Then you install easy_install
  594. # [12:52] <jgraham> Then you do easy_install lxml
  595. # [12:53] <jgraham> Then you just have to remember to change anything like #!/usr/bin/env python to #!/home/annevk/bin/python
  596. # [12:54] <jgraham> Otherwise using any external dependencies seems to be really hard
  597. # [12:55] <gsnedders> Not really
  598. # [12:56] <jgraham> gsnedders: It's getting the paths right so you can import stuff that seemed to be hard
  599. # [12:56] <gsnedders> export PYTHONPATH=${HOME}/packages/lib/python
  600. # [12:56] <gsnedders> export PATH=${HOME}/packages/bin:$PATH
  601. # [12:56] <gsnedders> in .bash_profile
  602. # [12:56] <gsnedders> That's what's used on sp.org
  603. # [12:56] <jgraham> Hmm, I thought I tried that and it didn't work
  604. # [12:57] <jgraham> Anyway setting PYTHONPATH is a bad idea in general
  605. # [12:57] <gsnedders> That's true, but it works ;P
  606. # [12:58] <gsnedders> annevk: See what I just pushed
  607. # [12:58] <gsnedders> i.e., http://hg.gsnedders.com/hgwebdir.cgi/anolis/rev/cf4770338aa0
  608. # [13:01] <virtuelv> annevk: there is some tutorial for rolling your own python on DH
  609. # [13:01] <virtuelv> http://wiki.dreamhost.com/Python#Building_a_custom_version_of_Python
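After setting PYTHONPATH or building a private interpreter as described above, one way to confirm which copy of a package is actually picked up is a small check like the one below. It is a sketch that assumes a modern Python 3 (`importlib.util.find_spec` postdates this log); the helper name `locate` is invented, and the module names are simply the ones mentioned in the conversation.

```python
import importlib.util
import sys


def locate(module_name):
    """Return where a module would be loaded from, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(module_name)
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. "lxml" for "lxml.html") is missing.
        return None
    return None if spec is None else spec.origin


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    for name in ("lxml", "lxml.html", "html5lib"):
        print(name, "->", locate(name))
```

A result of None for lxml.html while lxml itself resolves would match the symptom annevk describes, which usually points at an old or partial lxml install.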
  610. # [13:23] * Joins: maikmerten (n=maikmert@Lbaac.l.pppool.de)
  611. # [13:25] * Quits: virtuelv (n=virtuelv@163.80-202-65.nextgentel.com) ("Leaving")
  612. # [13:28] * Joins: virtuelv (n=virtuelv@163.80-202-65.nextgentel.com)
  613. # [13:47] * Joins: jacobolus1 (n=jacobolu@pool-71-119-188-52.lsanca.dsl-w.verizon.net)
  614. # [13:48] * Quits: jacobolus (n=jacobolu@pool-71-119-188-52.lsanca.dsl-w.verizon.net) (Read error: 104 (Connection reset by peer))
  615. # [14:49] * Quits: othermaciej (n=mjs@c-69-181-42-194.hsd1.ca.comcast.net)
  616. # [15:49] * Quits: GregHouston (n=ghouston@adsl-75-6-6-153.dsl.spfdmo.sbcglobal.net) (Read error: 110 (Connection timed out))
  617. # [15:49] * Joins: GregHouston (n=ghouston@ppp-66-143-220-108.dsl.spfdmo.swbell.net)
  618. # [16:10] * Joins: csarven (n=csarven@modemcable144.140-202-24.mc.videotron.ca)
  619. # [16:14] * Joins: BenMillard (i=cerbera@cpc1-flee1-0-0-cust285.glfd.cable.ntl.com)
  620. # [16:17] * Quits: svl (n=me@ip565744a7.direct-adsl.nl) ("And back he spurred like a madman, shrieking a curse to the sky.")
  621. # [16:29] * Quits: jacobolus1 (n=jacobolu@pool-71-119-188-52.lsanca.dsl-w.verizon.net) (Read error: 110 (Connection timed out))
  622. # [16:35] <gsnedders> Time to go out into town to do something about the /topic
  623. # [16:36] <jcranmer> gsnedders: you're leaving your sense of logic behind?
  624. # [16:37] * Joins: hdh (n=hdh@58.187.60.134)
  625. # [16:40] * Joins: jacobolus (n=jacobolu@pool-71-119-188-52.lsanca.dsl-w.verizon.net)
  626. # [17:06] * Joins: jacobolus1 (n=jacobolu@pool-71-119-188-52.lsanca.dsl-w.verizon.net)
  627. # [17:09] * Quits: jacobolus (n=jacobolu@pool-71-119-188-52.lsanca.dsl-w.verizon.net) (Read error: 104 (Connection reset by peer))
  628. # [17:12] * Joins: svl (n=me@ip565744a7.direct-adsl.nl)
  629. # [17:15] <virtuelv> gsnedders: I presume you'll put URL in /topic
  630. # [17:27] * gsnedders is too impatient to wait in a queue of the length there was
  631. # [17:28] <gsnedders> (i.e., my hair is still the same old colour)
  632. # [17:29] * Joins: weinig (n=weinig@c-71-198-176-23.hsd1.ca.comcast.net)
  633. # [17:33] * Joins: sverrej (n=sverrej@cBF13BF51.dhcp.bluecom.no)
  634. # [18:21] * Parts: BenMillard (i=cerbera@cpc1-flee1-0-0-cust285.glfd.cable.ntl.com)
  635. # [18:26] * Quits: svl (n=me@ip565744a7.direct-adsl.nl) ("And back he spurred like a madman, shrieking a curse to the sky.")
  636. # [18:31] * Quits: aboodman5 (n=aboodman@dsl081-073-212.sfo1.dsl.speakeasy.net)
  637. # [18:50] <hsivonen> weird. my Mac had bluescreened (literally) while unattended
  638. # [18:53] * gsnedders still has never got a pinkscreen
  639. # [18:53] <Lachy> hsivonen, do you mean a kernel panic?
  640. # [18:54] <Lachy> AFAIK, macs can't get BSODs
  641. # [18:54] <gsnedders> Lachy: They can however get stuck on a blank blue screen
  642. # [18:54] <gsnedders> Lachy: For no apparent reason
  643. # [18:55] <Lachy> I've never seen that
  644. # [18:56] * gsnedders tries to decide in what order to post his blog posts
  645. # [18:56] <Lachy> I've had my machines have kernel panics a couple of times, and just freeze with the spinning beachball cursor.
  646. # [18:56] <Lachy> gsnedders, I'd recommend starting with number 1 followed by number 2
  647. # [18:57] <gsnedders> Lachy: It would make more sense to do them in chronological order, but the earlier one is far more time-consuming to write
  648. # [18:57] <Lachy> ok
  649. # [18:57] <Lachy> I have a number of blog posts I have to finish writing
  650. # [18:58] <Lachy> I suppose I should just post something about IE8 tonight, and then post my other, significantly longer, potentially 3-part series later
  651. # [19:00] <gsnedders> I have eight drafts currently
  652. # [19:01] <gsnedders> One gives a useful answer to <http://krijnhoetmer.nl/irc-logs/whatwg/20080605#l-450>
  653. # [19:02] <gsnedders> The other follows on from that
  654. # [19:02] * Quits: tantek (n=tantek@adsl-63-195-114-133.dsl.snfc21.pacbell.net)
  655. # [19:03] <Lachy> I'd forgotten I'd even asked that question. I suppose it'll be good to get a better answer than "Stuff"
  656. # [19:03] <gsnedders> It was the first place I could think of that has a public record of me avoiding that question.
  657. # [19:07] <gsnedders> Writing about May last year is rather time-consuming.
  658. # [19:11] * gsnedders smacks his old writing
  659. # [19:12] <gsnedders> It uses -ise :(
  660. # [19:12] <Lachy> VMWare ThinApp is absolutely brilliant! Now I can seamlessly run IE6, IE7, and IE8b1 and IE8b2 all within the same copy of Windows XP, which is itself running in VMWare Fusion on OS X.
  661. # [19:13] <Lachy> it basically runs each version of IE, or any other application I like, within its own sandbox
  662. # [19:14] <gsnedders> As long as no sand falls over the edge, I guess that's all right
  663. # [19:15] <Lachy> gsnedders, what is wrong with using -ise?
  664. # [19:15] <gsnedders> Lachy: en-gb-oed prefers -ize :P
  665. # [19:15] <Lachy> what?!
  666. # [19:15] <Lachy> nooO!
  667. # [19:16] <Lachy> -ize is wrong. Stupid American misspelling
  668. # [19:16] <gsnedders> No, it isn't.
  669. # [19:16] <Lachy> yes, it is
  670. # [19:16] <Lachy> I thought en-GB used -ise, just like en-AU
  671. # [19:16] <gsnedders> -ize comes from Greek, and should be used on Greek-derived words
  672. # [19:17] <GregHouston> Am I looking at the right thing. It looks like Thin App starts around $6000. I have Workstation and it was a little under $200.
  673. # [19:18] <gsnedders> en-gb only uses -ise, en-gb-oed uses -ize for words of Greek origin and -ise for those of French, en-us uses -ize
  674. # [19:18] <gsnedders> "[T]he suffix…, whatever the element to which it is added, is in its origin the Gr[eek] -ιζειν, L[atin] -izāre; and, as the pronunciation is also with z, there is no reason why in English the special French spelling in -iser should be followed, in opposition to that which is at once etymological and phonetic." — the OED
  675. # [19:19] <gsnedders> en-us also overdoes the entire z thing. Analyze is wrong.
  676. # [19:19] <Lachy> hmm, interesting
  677. # [19:20] <gsnedders> en-gb uses -ise too much, en-us uses -ize too much
  678. # [19:20] <Lachy> I still think -ise should be used for *everything*
  679. # [19:21] <Lachy> except for words like prize which are supposed to end in -ize
  680. # [19:22] <Lachy> wiktionary says that it's supposed to be -ise for french-origin words and -ize for greek-origin words. But to do that, I would have to know the origin of each word before I tried to spell it
  681. # [19:22] <gsnedders> me wonders whether he really should add a certain girl on Facebook…
  682. # [19:25] <gsnedders> (She is already convinced that I'm secretly in love with her, which is totally untrue)
  683. # [19:32] <GregHouston> It appears Thin App really is $6k. Application virtualization must be pretty tricky to cost 20 times that of a virtual machine.
  684. # [19:32] <GregHouston> I can't multipy. Make that 30 times.
  685. # [19:33] <GregHouston> Or spell. * multiply
  686. # [19:34] * Joins: svl (n=me@ip565744a7.direct-adsl.nl)
  687. # [19:36] * Quits: weinig (n=weinig@c-71-198-176-23.hsd1.ca.comcast.net)
  688. # [19:54] * Joins: eseidel (n=eseidel@c-24-130-13-197.hsd1.ca.comcast.net)
  689. # [20:16] <Philip`> jgraham: You only need a single custom HTTP server that supports arbitrary-length URLs, and then the web can have an infinite number of pages, and I would have thought at least one person would have made such a server
  690. # [20:16] <Philip`> If nobody has, I'll make one, just to prove my point :-p
  691. # [20:28] * Joins: weinig (n=weinig@nat/apple/x-ade8156ca560b392)
  692. # [20:45] * Quits: myakura (n=myakura@p3216-ipbf5106marunouchi.tokyo.ocn.ne.jp) ("Leaving...")
  693. # [21:18] <gsnedders> Philip`: You have calendars that can be navigated endlessly. There's no need for custom HTTP servers.
  694. # [21:25] <Philip`> gsnedders: But those calendars might have finite URL limitations
  695. # [21:31] <Philip`> (even if it's only limited by the amount of RAM available)
  696. # [21:41] * Quits: maikmerten (n=maikmert@Lbaac.l.pppool.de) ("Leaving")
  697. # [22:00] * Quits: KevinMarks (n=KevinMar@c-98-207-134-151.hsd1.ca.comcast.net) ("The computer fell asleep")
  698. # [22:47] * Joins: MacDome (n=eric@c-24-130-13-197.hsd1.ca.comcast.net)
  699. # [22:54] * Joins: othermaciej (n=mjs@c-69-181-42-194.hsd1.ca.comcast.net)
  700. # [23:06] <gsnedders> Philip`: Your webserver that supports arbitrary-length URLs will have the same RAM limitations
  701. # [23:08] <Philip`> gsnedders: No it won't - it won't store the URL in memory
  702. # [23:08] <gsnedders> Philip`: It just returns something for any request?
  703. # [23:11] <Philip`> gsnedders: It could ignore the URL entirely, or it could do some streaming processing of it to calculate a finite output
  704. # [23:11] <Philip`> (I assume HTTP doesn't particularly like you sending the response before you've received the request, so you can't do anything like echo the URL back to the client)
  705. # [23:12] * Quits: ROBOd (n=robod@89.122.216.38) ("http://www.robodesign.ro")
  706. # [23:12] <gsnedders> I don't think RFC2616 actually forbids you from doing so…
  707. # [23:36] * Quits: hdh (n=hdh@58.187.60.134) (Read error: 104 (Connection reset by peer))
  708. # [23:52] * Quits: svl (n=me@ip565744a7.direct-adsl.nl) ("And back he spurred like a madman, shrieking a curse to the sky.")
  709. # [23:56] <Philip`> gsnedders: Does it never require you receive the whole header so you can detect invalid requests and send an appropriate response?
  710. # [23:59] * Quits: sverrej (n=sverrej@cBF13BF51.dhcp.bluecom.no) (Connection timed out)
  711. # Session Close: Sun Aug 31 00:00:00 2008
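
Philip`'s idea of doing "some streaming processing" of an arbitrary-length URL "to calculate a finite output" can be sketched roughly as follows. This is only an illustration, not anything posted in the channel: the function name `summarize_request_target`, the choice of SHA-256 as the finite output, and the chunk size are all assumptions, and it reads from any binary file-like object rather than a real socket.

```python
import hashlib
import io


def summarize_request_target(stream, chunk_size=4096):
    """Hash the request target of 'METHOD SP target SP ...' in constant memory.

    Hypothetical helper: the target is consumed chunk by chunk and never
    stored whole, so its length is not bounded by available RAM.
    """
    # The method is short, so reading it byte by byte is fine.
    method = bytearray()
    while True:
        byte = stream.read(1)
        if not byte:
            raise ValueError("stream ended before the request target")
        if byte == b" ":
            break
        method += byte
        if len(method) > 16:
            raise ValueError("method token too long")

    digest = hashlib.sha256()
    length = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            raise ValueError("stream ended inside the request target")
        end = chunk.find(b" ")
        if end == -1:
            # Whole chunk belongs to the target: fold it in, then drop it.
            digest.update(chunk)
            length += len(chunk)
        else:
            # The space ends the target; bytes after it are ignored here.
            digest.update(chunk[:end])
            length += end
            break
    return method.decode("ascii"), length, digest.hexdigest()


# Usage: a 100,001-byte target is processed 1 KiB at a time.
request = io.BytesIO(b"GET /" + b"a" * 100000 + b" HTTP/1.1\r\n\r\n")
print(summarize_request_target(request, chunk_size=1024)[:2])
```

The sketch only starts producing its result after the whole target has arrived, which sidesteps the question of whether a response may begin before the request is complete.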

The end :)