# Session Start: Tue Jul 03 00:00:00 2007
# Session Ident: #whatwg
# [00:08] * Quits: KevinMarks (i=KevinMar@nat/google/x-c08733bf8ae21a83) ("The computer fell asleep")
# [00:09] * Parts: hasather (n=hasather@22.80-203-71.nextgentel.com)
# [00:15] * Quits: Jero_ (n=Jero@d207230.upc-d.chello.nl) ("ChatZilla 0.9.78.1 [Firefox 2.0.0.4/2007051502]")
# [00:16] * weinig is now known as weinig|coffee
# [00:17] * Quits: othermaciej (n=mjs@17.255.97.47) (Connection reset by peer)
# [00:21] * Joins: KevinMarks (i=KevinMar@nat/google/x-3b295f2af1b52fbf)
# [00:22] * Quits: tndH (i=Rob@adsl-87-102-93-12.karoo.KCOM.COM) ("ChatZilla 0.9.78.1-rdmsoft [XULRunner 1.8.0.9/2006120508]")
# [00:26] * Joins: othermaciej (n=mjs@17.255.97.47)
# [00:30] * Quits: othermaciej (n=mjs@17.255.97.47) (Read error: 104 (Connection reset by peer))
# [00:31] * Joins: othermaciej (n=mjs@17.255.97.47)
# [00:34] * Joins: othermaciej_ (i=mjs@nat/apple/x-c56df03dc088c0a8)
# [00:39] * weinig|coffee is now known as weinig
# [00:50] * Quits: othermaciej (n=mjs@17.255.97.47) (Read error: 110 (Connection timed out))
# [00:54] * Joins: othermaciej (n=mjs@17.255.97.47)
# [00:56] * Quits: othermaciej_ (i=mjs@nat/apple/x-c56df03dc088c0a8) (Connection timed out)
# [00:56] * Parts: webben (n=benh@91.84.193.157)
# [01:01] * Hixie narrowly misses flinging his ipod across his cube
# [01:01] <Hixie> oops
# [01:05] <Hixie> right, done scope=""
# [01:05] <Hixie> now headers=""
# [01:13] * Joins: weinig_ (n=weinig@17.255.106.153)
# [01:15] * Quits: weinig (i=weinig@nat/apple/x-c6347e52ac937d33) (Read error: 104 (Connection reset by peer))
# [01:15] * Joins: webben (n=benh@91.84.193.157)
# [01:15] * Joins: weinig (i=weinig@nat/apple/x-daf5d1c93e69adc0)
# [01:20] * Quits: KevinMarks (i=KevinMar@nat/google/x-3b295f2af1b52fbf) ("The computer fell asleep")
# [01:30] * Joins: aroben_ (n=adamrobe@17.255.104.120)
# [01:31] * Quits: weinig_ (n=weinig@17.255.106.153) (Read error: 110 (Connection timed out))
# [01:32] * Joins: rubys (n=rubys@cpe-075-182-064-252.nc.res.rr.com)
# [01:32] <rubys> jgraham: ping?
# [01:35] <Hixie> if you have what you think is a tree, in the form of a list A of mappings from one node to a list of nodes all of which are in list A
# [01:35] <Hixie> is there a way short of walking the entire tree to verify that the list is indeed a tree and that there are thus no loops?
# [01:36] * Joins: othermaciej_ (i=mjs@nat/apple/x-592f43713f0159c0)
# [01:37] <othermaciej_> there probably is, based on what graph properties make the graph a tree
# [01:37] <othermaciej_> to be a tree you need to be not just cycle-free but also have exactly one directed edge pointing to each node (except the root)
# [01:37] <Hixie> i guess i don't mean a tree, i mean a directed graph
# [01:38] <othermaciej_> directed acyclic graph?
# [01:38] <Hixie> right
# [01:38] <Hixie> basically i have a list of table cells, each of which can be the header (through headers="") for zero or more other cells, and each of which can have zero or more header cells for itself
# [01:38] <Hixie> but there mustn't be any loops
# [01:39] <othermaciej_> let me look it up in my CLR
# [01:39] <Hixie> i mean i'll do the full walk if there's no quicker way
# [01:39] <Hixie> (memory is no object)
# [01:39] * kingryan thinks that's the only way
# [01:40] <kingryan> you might be able to cache some of it, though
# [01:40] <othermaciej_> I don't even know what you mean by "full walk"
# [01:40] <othermaciej_> you'd have to walk every possible path, not just visit every node once
# [01:40] <othermaciej_> if you are really brute forcing it
# [01:40] <Hixie> yeah
# [01:40] <othermaciej_> you'd have to show all paths through the graph terminate
# [01:42] * Quits: weinig (i=weinig@nat/apple/x-daf5d1c93e69adc0) (Read error: 104 (Connection reset by peer))
# [01:42] * Joins: weinig (i=weinig@nat/apple/x-b84cd2cc1b593708)
# [01:42] <othermaciej_> Hixie: iteratively removing nodes with no outgoing edges is one way
# [01:43] <Hixie> ok screw this. i don't HAVE to check that headers="" don't form loops
# [01:43] <othermaciej_> Hixie: you'd want a hashtable from node to nodes it points to, and one the other way
# [01:43] <Hixie> at least not in the first pass
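othermaciej_'s suggestion (repeatedly remove nodes that have no outgoing edges, keeping one hashtable from each node to its targets and one the other way) can be sketched roughly as below. This is only an illustration: the function name `has_cycle_by_peeling` and the dict-of-lists input shape are assumptions, not anything from the log. If some nodes can never be removed, they must sit on or lead into a cycle.

```python
from collections import defaultdict


def has_cycle_by_peeling(edges):
    """Return True if the directed graph contains a cycle.

    `edges` maps each node to an iterable of the nodes it points to
    (a hypothetical shape chosen for this sketch).
    """
    # Forward table: node -> set of nodes it still points to.
    out_edges = {u: set(vs) for u, vs in edges.items()}
    # Reverse table: node -> set of nodes that point to it.
    in_edges = defaultdict(set)
    for u, vs in list(out_edges.items()):
        for v in vs:
            out_edges.setdefault(v, set())  # targets are nodes too
            in_edges[v].add(u)

    # Start with every node that has no outgoing edges.
    ready = [n for n, vs in out_edges.items() if not vs]
    removed = 0
    while ready:
        n = ready.pop()
        removed += 1
        # Removing n deletes the edge p -> n for each predecessor p.
        for p in in_edges[n]:
            out_edges[p].discard(n)
            if not out_edges[p]:
                ready.append(p)

    # Anything left over could never lose all of its outgoing edges.
    return removed != len(out_edges)
```

Each edge is deleted at most once, so the work is proportional to the number of nodes plus edges, at the cost of holding both tables in memory.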
# [01:44] <kingryan> Hixie: you only need to check them if you're going to be walking them (check to avoid inf. loops)
# [01:44] <Hixie> yeah
# [01:44] * Quits: aroben (n=adamrobe@17.203.15.248) (Read error: 110 (Connection timed out))
# [01:44] <Philip`> I think you could do a topological sort
# [01:44] <Hixie> which i don't
# [01:44] <Philip`> which'll tell you if it's got any cycles
# [01:44] <Hixie> but i was hoping to be able to see how many pages had that problem
# [01:44] <othermaciej_> Philip`: I'm not sure the obvious topological sort algorithms will terminate in finite time
# [01:45] <othermaciej_> on a graph with cycles
# [01:45] <othermaciej_> since topological sorts are designed to work on a DAG
# [01:45] <Philip`> You can just do a depth-first search - start with each node being white, mark each one as grey when you recurse into it, mark it as grey when you recurse back out, and if you ever follow an edge into a grey node then there's a cycle
# [01:46] <Philip`> Uh
# [01:46] <Philip`> *mark it as black when you recurse back out
# [01:46] <othermaciej_> that works
# [01:46] <othermaciej_> hmm wait
# [01:46] <Philip`> (You can do some thingy with numbering nodes as you turn them black, to get a topological sort, I think)
# [01:46] <othermaciej_> I'm not sure it works
# [01:47] <othermaciej_> not obvious to me that a cycle couldn't be observable only by visiting a black node
# [01:50] <othermaciej_> DFS can detect cycles by identifying back-edges
# [01:50] * Quits: webben (n=benh@91.84.193.157)
# [01:51] * Joins: nikola_tesla (i=nagarjun@d60-65-150-197.col.wideopenwest.com)
# [01:51] <othermaciej_> your algorithm is right
# [01:51] * Quits: othermaciej (n=mjs@17.255.97.47) (Read error: 110 (Connection timed out))
# [01:52] <othermaciej_> I guess that would run in O(E) where E is the number of edges
# [01:52] <othermaciej_> which seems like the best you could do
# [01:52] * Joins: KevinMarks (i=KevinMar@nat/google/x-cc1a40534dd9b194)
# [01:55] <Hixie> and it'll work whatever order i do the nodes in, as far as i can tell
# [01:55] <Hixie> which is useful
# [01:55] <Hixie> in my case
# [01:55] * Parts: rubys (n=rubys@cpe-075-182-064-252.nc.res.rr.com)
# [01:58] <Philip`> I think I can convince myself it's right by saying that if there is a cycle, then when the DFS reaches some node N in that cycle, it will not mark the node as black until either it has reached another grey node (and found a cycle) or has searched the whole cycle and got back to N (which is grey, so it finds the cycle) or has reached a black node in the cycle; and there can never be a black node in the cycle, because the cycle will be detected before an
# [01:58] <Philip`> ...before any node in the cycle is marked as black
# [02:00] <Philip`> I guess you have to do something to make sure the DFS covers all the nodes (by repeatedly DFSing from some arbitrary remaining white node, until there are none)
# [02:01] <Hixie> yeah i'm just going to go through every node with at least one outgoing edge (since i have to visit them anyway for unrelated reasons) and if it's white, i do the search
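The white/grey/black depth-first search Philip` describes (with his correction that a node turns black on the way back out) can be sketched as follows. The name `has_cycle_dfs` and the dict input are assumptions for illustration; this is not Hixie's actual program. As in the plan above, the outer loop only starts searches from nodes that have at least one outgoing edge and are still white.

```python
WHITE, GREY, BLACK = 0, 1, 2


def has_cycle_dfs(edges):
    """Return True if following `edges` can ever loop back on itself.

    `edges` maps a node to the list of nodes it points to; nodes that
    only appear as targets are treated as having no outgoing edges.
    """
    colour = {}  # missing entries count as WHITE

    def visit(node):
        colour[node] = GREY  # on the current search path
        for nxt in edges.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                return True  # back-edge into the current path: a cycle
            if state == WHITE and visit(nxt):
                return True
            # BLACK nodes were fully explored earlier; nothing to do.
        colour[node] = BLACK  # finished, recursing back out
        return False

    # Start order does not matter; restart from every remaining white node.
    for node in edges:
        if colour.get(node, WHITE) == WHITE and visit(node):
            return True
    return False
```

The recursion is the simplest way to show the colouring, but in Python a very long chain of headers would hit the recursion limit, so an explicit stack would be the safer choice for large inputs.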
# [02:01] <Philip`> It should be O(V) rather than O(E) because it'll never visit one node more than once
# [02:02] <Philip`> except I'm probably confused and it's O(E) too, so it's more like O(min(V, E)), not that anybody actually cares, since V =~ E anyway for non-crazy graphs
# [02:03] <Hixie> this is where i find out there's only 5 tables on the whole web with a headers="" attribute and therefore it could be O(N^4) and still complete in finite time
# [02:04] * Joins: karlUshi (n=karl@dhcp-247-173.mag.keio.ac.jp)
# [02:06] * aroben_ is now known as aroben
# [02:07] <othermaciej_> Philip`: it has to traverse every edge at least once to see the color of the node at the other end
# [02:07] * othermaciej_ is now known as othermaciej
# [02:08] <othermaciej> Philip`: but I guess it's O(V+E) since you need to visit disconnected nodes too
# [02:08] <Philip`> Got to be careful in case you stumble across some gigantic table with hundreds of rows and columns that's been made accessible with (buggy) headers, since that might cause an O(N^4) algorithm to take a second or two
# [02:09] <othermaciej> actually I guess you don't since Hixie's data structure only represents edges
# [02:09] <othermaciej> hundreds could be worse than a second or two with an O(N^4) algorithm
# [02:09] <othermaciej> N^4 gets bad pretty quickly
# [02:09] <Philip`> Oh, whoops, I forgot it'd still have to look along all the edges to already-black nodes
# [02:10] <Hixie> yeah N^4 is insanely bad if you've got anything of any kind of size
# [02:10] <Philip`> 100^4 = 10^8 which isn't all that bad if you're just following a few pointers :-)
# [02:11] <Hixie> sadly i have to do a string lookup on every single one of these edges :-)
# [02:11] <Hixie> (of course if it's bad, i'll optimise it more. we'll see)
# [02:12] <Philip`> You could do an O(E) preprocessing step to do all the string lookups per edge, before doing the horribly inefficient but highly optimised O(N^4) cycle-finding algorithm on it :-)
# [02:12] * Quits: bzed (n=bzed@dslb-084-059-102-210.pools.arcor-ip.net) ("Leaving")
# [02:12] <Hixie> indeed
# [02:16] <othermaciej> DFS isn't that hard to code, doesn't seem like a big deal
# [02:16] <Hixie> indeed
# [02:16] <Hixie> and you'll be glad to know it works
# [02:16] <Hixie> sweet
# [02:16] <othermaciej> nice
# [02:17] <Hixie> it tested my three test tables in 0.244s including compiling the program and parsing the html
# [02:18] <Hixie> and given that it took 0.245s to do the same program with only one empty test file...
# [02:18] <othermaciej> it runs in negative time!
# [02:19] <Hixie> and y'all were worried about it being slow!
# [02:19] <kingryan> O(-N^4) ?
# [02:23] <Philip`> Give it a really big table to test, and see if it returns the answer before you've even started the program
# [02:27] * Joins: aroben_ (n=adamrobe@17.203.15.248)
# [02:36] * Quits: kingryan (n=kingryan@corp.technorati.com)
# [02:43] * Quits: aroben (n=adamrobe@17.255.104.120) (Read error: 110 (Connection timed out))
# [02:44] * weinig is now known as weinigFood
# [02:44] * Quits: billmason (n=billmaso@c-24-20-186-228.hsd1.mn.comcast.net)
# [02:45] <Philip`> Hmm, just remembered a slower but simpler way to find cycles: use a kind of negated variant of Bellman-Ford, by initialising every node's 'distance' value to 0, then setting v.distance=max(v.distance, 1+u.distance) for each edge (u,v), then repeating num_nodes+1 times, and if any has distance=num_nodes+1 then there's a cycle
# [02:48] * Joins: aroben (n=adamrobe@17.255.104.120)
# [02:48] <Philip`> ...or is that totally rubbish and wrong? I'm not quite sure now
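One way to read the Bellman-Ford-style idea is sketched below, with two deliberate departures that should be treated as this sketch's assumptions rather than Philip`'s exact wording: it runs one pass per node rather than num_nodes+1 passes, and it tests for a distance of at least num_nodes instead of exact equality, because a single pass can push a distance up by more than one. In an acyclic graph the longest path has at most num_nodes-1 edges, so no distance can reach num_nodes and the values stop changing; on a cycle every pass raises every cycle node by at least one.

```python
def has_cycle_by_relaxation(edges):
    """Detect a cycle by repeatedly relaxing 'longest path' distances.

    `edges` maps a node to the nodes it points to (hypothetical shape).
    Runs in O(V * E), slower than a depth-first search.
    """
    nodes = set(edges)
    for targets in edges.values():
        nodes.update(targets)
    edge_list = [(u, v) for u, vs in edges.items() for v in vs]

    distance = dict.fromkeys(nodes, 0)
    limit = len(nodes)
    for _ in range(limit):
        changed = False
        for u, v in edge_list:
            # v.distance = max(v.distance, 1 + u.distance)
            if distance[u] + 1 > distance[v]:
                distance[v] = distance[u] + 1
                changed = True
        if not changed:
            return False  # distances settled, so the graph is acyclic
    # Every pass changed something: only possible with a cycle.
    return any(d >= limit for d in distance.values())
```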
# [03:00] * Quits: nikola_tesla (i=nagarjun@d60-65-150-197.col.wideopenwest.com) ("I came here with a simple dream. A dream of killing all humans. And this is how it must end? Who's the real seven billion t")
# [03:05] * Quits: aroben_ (n=adamrobe@17.203.15.248) (Read error: 110 (Connection timed out))
# [03:10] <Hixie> hsivonen: please confirm that since the last time i checked about your parsing e-mails, you have sent only one further message (about <select>)
# [03:13] * Joins: yod (n=ot@dhcp-247-181.mag.keio.ac.jp)
# [03:26] * Quits: aroben (n=adamrobe@17.255.104.120)
# [03:28] * Joins: aroben (n=adamrobe@17.203.15.248)
# [03:29] <Hixie> holy crap, according to this nearly half of all tables with headers="" have a cycle
# [03:29] <Hixie> that seems unlikely
# [03:30] <Hixie> in fact of 60,000 tables with headers="" that i just parsed, only 194 came out without some sort of error
# [03:30] <Hixie> and of those, 177 didn't need headers="" at all because scope="" got the same effect
# [03:31] <Hixie> leaving 17 tables out of 60,000 with headers="" (in just over 100,000,000 documents total) that used headers="" in a non-trivial yet correct way
# [03:31] * Hixie looks at those 17 tables
# [03:32] <Hixie> one of them was the table on http://cgi.ebay.ie/Nokia-6210-unlocked-battery-charger-WARRANTY_W0QQitemZ200124682259QQihZ010QQcategoryZ3312QQcmdZViewItem
# [03:32] <Hixie> and it only uses headers with the empty string as its value
# [03:32] <Hixie> maybe i should exclude those, huh
# [03:32] <Hixie> in fact 9 of these were variants on that ebay page
# [03:33] <othermaciej> would that require assuming no header is a header for that call?
# [03:33] <Hixie> my headers="" algorithm used nothing but headers="" to assign headers to cells
# [03:33] <Hixie> so <th> elements have no effect when headers="" is specified
# [03:34] <othermaciej> what I'm wondering is, whether that is the specified behavior for headers=""
# [03:34] <Hixie> in html4?
# [03:35] <othermaciej> yeah
# [03:35] * Quits: KevinMarks (i=KevinMar@nat/google/x-cc1a40534dd9b194) ("The computer fell asleep")
# [03:36] <Hixie> ok i clearly need to look for tables with only blank headers="", since all but one of these uses of headers="" that differ from scope="" are blank headers="" only.
# [03:36] <othermaciej> I guess HTML4 is not very clear on it
# [03:36] <Hixie> (http://www.bls.gov/oco/cg/cgs041.htm being that page)
# [03:38] <Hixie> and that page only uses headers="" to associate <th>s with parent <th>s
# [03:38] <Hixie> it doesn't actually do anything to make the table accessible as far as i can tell
# [03:40] <othermaciej> that's a pretty poor record
# [03:40] <Hixie> i'm skeptical of the large number of loops
# [03:40] <Hixie> that seems unlikely
# [03:40] <othermaciej> .3% of usage being error-free seems pretty damn low, even by the already low standards of most HTML features
# [03:41] <othermaciej> that does sound suspicious (the number of loops)
# [03:41] <Hixie> i also scanned longdesc="" in the same survey. i had my script throw out obviously invalid uses of longdesc="", like pointing to a file that the parent <a href=""> points to.
# [03:42] <Hixie> doing a spot check of the pages that came up as "good" uses, one was pointing to the same file, and another was pointing to a file that was the destination of a 301 redirect of a parent <a href="">
# [04:02] <Hixie> wow, longdesc is a disaster zone far worse than i had imagined
# [04:05] <Hixie> many of these are just pointing to the root of the site!
# [04:05] * Hixie adds another heuristic to look for that
# [04:05] <Hixie> lol, the longdesc="" on http://www.felicieditore.it/ points to http://www.felicieditore.com/, which doesn't exist
# [04:08] <Hixie> http://7mobile.de/shop/select?id=101787&v=010000 is a longdesc disaster in so many ways
# [04:14] <Lachy> Hixie: is it looking so bad for headers and longdesc that you're going to consider leaving them out?
# [04:16] <Hixie> i'm going to _consider_ leaving them out just like i'm going to consider leaving them in
# [04:17] <othermaciej> right now it's looking kind of bad for headers even on just a "degrade gracefully in current versions of the #2 screen reader" basis
# [04:17] <Lachy> ok. Maybe you could put them in, and include some algorithm to determine when it should be ignored due to it containing an illogical value
# [04:17] <othermaciej> which I think was the best argument in its favor
# [04:18] <othermaciej> if Hixie's data about how many uses are invalid holds up, anyway
# [04:18] <Hixie> yeah i'm getting a sample of those with cycles to check that
# [04:23] <Hixie> i think it's fair to say that no valid longdesc will ever point to the root of a domain, right?
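The two "likely to suck" checks mentioned so far (a longdesc="" that resolves to the same URL as an enclosing <a href="">, and one that resolves to the root of a site) might look something like the sketch below. Hixie's real script is not shown in the log, so the function name, parameters, and reason strings here are all hypothetical, and the 301-redirect case he mentions is not covered because it would need a network fetch.

```python
from urllib.parse import urljoin, urlsplit


def longdesc_suspicions(page_url, longdesc, ancestor_href=None):
    """Return a list of reasons a longdesc value looks unhelpful.

    All arguments are plain strings; `ancestor_href` is the href of an
    enclosing <a> element, if any. Names are illustrative assumptions.
    """
    reasons = []
    # Resolve the attribute value against the page it appears on.
    target = urljoin(page_url, longdesc)
    parts = urlsplit(target)

    # A description that is just the front page of some site.
    if parts.path in ("", "/") and not parts.query:
        reasons.append("points to the root of a site")

    # A description that is the same document the image already links to.
    if ancestor_href is not None and target == urljoin(page_url, ancestor_href):
        reasons.append("same URL as ancestor link")

    return reasons
```

For example, `longdesc_suspicions("http://example.org/shop/item.html", "/")` reports the site-root reason, while a longdesc that resolves to a distinct description page yields an empty list.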
# [04:23] * Quits: aroben (n=adamrobe@17.203.15.248)
# [04:25] <Hixie> oh crap, missed dinner. bbl.
# [04:38] * Quits: dbaron (n=dbaron@corp-242.mountainview.mozilla.com) ("8403864 bytes have been tenured, next gc will be global.")
# [04:56] * Quits: duryodhan (n=chatzill@221.128.138.129) (Read error: 110 (Connection timed out))
# [05:11] <Hixie> ok there's definitely something wrong with the cycle detection
# [05:22] <othermaciej> I think I found a mistake in CSS 2.1 (at least in the November 2006 WD)
# [05:23] <othermaciej> is there any way to see a newer editor's draft so I can check if it is fixed before I report it?
# [05:23] <Hixie> http://www.w3.org/Style/Group/css2-src/cover.html
# [05:23] * Hixie fixes the bug
# [05:24] <Hixie> i was indexing using the wrong variable. duh.
# [05:24] <othermaciej> can you check for me if this is really a mistake before I make an ass of myself
# [05:24] <othermaciej> http://www.w3.org/Style/Group/css2-src/visufx.html says, about overflow, "It affects the clipping of all of the element's content except any descendant elements (and their respective content and descendants) whose containing block is the viewport or an ancestor of the element."
# [05:24] <othermaciej> but obviously that is not supposed to apply to overflow on the viewport itself
# [05:24] <Hixie> what's the error?
# [05:25] <othermaciej> right?
# [05:25] <Hixie> right, the viewport is not an element
# [05:26] <othermaciej> ok, maybe just a lack of clarity, not an error
# [05:26] <othermaciej> since if you interpret it that way, it doesn't say anything about how to clip for overflow on the viewport
# [05:26] <Hixie> that sentence doesn't really say anything about anything
# [05:28] <othermaciej> later examples seem to assume it is saying something
# [05:28] <Hixie> yeah, css2.1 is only marginally better than html4 in terms of spec quality
# [05:33] * Joins: aroben (n=adamrobe@c-67-160-250-192.hsd1.ca.comcast.net)
# [05:33] * Quits: aroben (n=adamrobe@c-67-160-250-192.hsd1.ca.comcast.net) (Read error: 104 (Connection reset by peer))
# [05:33] * weinigFood is now known as weinig
# [05:34] * Joins: aroben (n=adamrobe@c-67-160-250-192.hsd1.ca.comcast.net)
# [05:35] <othermaciej> ok maybe I won't bother with this, even though it was confusing to me, the actual behavior seems to be interoperable
# [05:37] * Quits: MikeSmith (n=MikeSmit@eM60-254-213-111.pool.emobile.ad.jp) (Read error: 110 (Connection timed out))
# [06:06] * weinig is now known as intern
# [06:10] <Hixie> Lachy: yt?
# [06:16] * intern is now known as weinig
# [06:17] * Quits: othermaciej (i=mjs@nat/apple/x-592f43713f0159c0) (Read error: 104 (Connection reset by peer))
# [06:18] * Joins: othermaciej (i=mjs@nat/apple/x-fb4f9c6993d005e4)
# [06:26] * Quits: weinig (i=weinig@nat/apple/x-b84cd2cc1b593708)
# [06:35] * Quits: jruderman (n=jruderma@ip68-225-10-93.pv.oc.cox.net)
# [06:50] * Joins: othermaciej_ (n=mjs@17.255.97.47)
# [07:00] * Joins: weinig (n=weinig@c-67-188-89-242.hsd1.ca.comcast.net)
# [07:03] * Quits: othermaciej (i=mjs@nat/apple/x-fb4f9c6993d005e4) (Read error: 104 (Connection reset by peer))
# [07:04] * Joins: othermaciej (i=mjs@nat/apple/x-9a5ebd469e883efe)
# [07:06] * Quits: othermaciej_ (n=mjs@17.255.97.47) (Read error: 104 (Connection reset by peer))
# [07:06] * Joins: othermaciej_ (n=mjs@17.255.97.47)
# [07:09] * Quits: weinig (n=weinig@c-67-188-89-242.hsd1.ca.comcast.net) (Remote closed the connection)
# [07:09] * Quits: othermaciej (i=mjs@nat/apple/x-9a5ebd469e883efe) (Read error: 104 (Connection reset by peer))
# [07:09] * Joins: MikeSmith (n=MikeSmit@eM60-254-223-239.pool.emobile.ad.jp)
# [07:09] * Joins: othermaciej (i=mjs@nat/apple/x-e41fdc7728c2c6d6)
# [07:09] * Joins: weinig (n=weinig@c-67-188-89-242.hsd1.ca.comcast.net)
# [07:20] * Quits: weinig (n=weinig@c-67-188-89-242.hsd1.ca.comcast.net) (Read error: 104 (Connection reset by peer))
# [07:24] * Quits: othermaciej (i=mjs@nat/apple/x-e41fdc7728c2c6d6)
# [07:26] * Quits: othermaciej_ (n=mjs@17.255.97.47) (Read error: 110 (Connection timed out))
# [07:26] * Quits: aroben (n=adamrobe@c-67-160-250-192.hsd1.ca.comcast.net)
# [07:27] * Joins: aroben (n=adamrobe@c-67-160-250-192.hsd1.ca.comcast.net)
# [07:27] * Quits: aroben (n=adamrobe@c-67-160-250-192.hsd1.ca.comcast.net) (Read error: 104 (Connection reset by peer))
# [07:27] * Joins: aroben (n=adamrobe@c-67-160-250-192.hsd1.ca.comcast.net)
# [07:34] <Hixie> every page i've checked so far that has non-redundant headers="" actually uses them incorrectly.
# [07:35] <Hixie> although maybe we need a heuristic for the top-left cell
# [07:53] <Hixie> ok i finally found a page with a real longdesc=""
# [07:53] <Hixie> http://www.britanniarescue.com/about/strategy/
# [07:53] <Hixie> http://www.britanniarescue.com/online/longdesc/index.php#BRlogo
# [07:54] <Hixie> the longdesc is inaccurate, and it would be more useful for the information in that file to be in alt="" text anyway
# [07:56] * Joins: weinig (n=weinig@c-67-188-89-242.hsd1.ca.comcast.net)
# [08:03] * Quits: weinig (n=weinig@c-67-188-89-242.hsd1.ca.comcast.net)
# [08:07] <Hixie> longdesc="mailto:trustee@nbbankrutpcy.com"
# [08:07] <Hixie> wtf
# [08:09] * Joins: Charl (n=charlvn@c1-116-5.wblv.isadsl.co.za)
# [08:33] <hsivonen> Hixie: confirmed only one additional email
# [08:36] <Hixie> thanks
# [08:36] <Hixie> just making sure none of your mails fall through the cracks when i speed-read the html list...
# [09:03] <hsivonen> Hixie: should I CC you next time?
# [09:04] <Hixie> no, it's ok
# [09:04] <Hixie> just making sure
# [09:04] <hsivonen> ok
# [09:04] * Quits: MikeSmith (n=MikeSmit@eM60-254-223-239.pool.emobile.ad.jp) (Read error: 110 (Connection timed out))
# [09:05] <hsivonen> on the face of it, http://www.britanniarescue.com/about/strategy/ seems to have decorative images. why do they bother with longdesc?
# [09:05] <Hixie> i just select all mail to html and read it, then select all mail to the next list and read it, etc
# [09:05] * Joins: duryodhan_away (n=chatzill@221-128-139-99.static.exatt.net)
# [09:05] <Hixie> i have no idea why they use it
# [09:05] <Hixie> probably because It's The Law
# [09:06] <Hixie> after looking at all this in more detail, i'm starting to suspect that the accessibility advocacy has maybe done more damage than help, sadly
# [09:07] <hsivonen> yeah. in some twisted way it seems to me that by speccing accessibility features we might actually create lawyerbombs :-(
# [09:17] * Joins: MikeSmith (n=MikeSmit@eM60-254-220-75.pool.emobile.ad.jp)
# [09:26] * Joins: KevinMarks (n=KevinMar@c-76-102-254-252.hsd1.ca.comcast.net)
# [09:28] <Lachy> Hey Hixie, I'm here now
# [09:29] <Hixie> hey
# [09:29] * Quits: karlUshi (n=karl@dhcp-247-173.mag.keio.ac.jp) ("Where dwelt Ymir, or wherein did he find sustenance?")
# [09:30] <Hixie> i found a workaround around whatever it was i was going to ask you
# [09:30] <Hixie> which i've forgotten now
# [09:30] <Lachy> ok, no worries
# [09:31] * Quits: yod (n=ot@dhcp-247-181.mag.keio.ac.jp) ("This computer has gone to sleep")
# [09:31] * Lachy is off to see the Transforms movie now
# [09:31] <Hixie> aha, the next wave of data is in
# [09:31] <Lachy> *Transformers
# [09:31] * Hixie examines
# [09:33] <Hixie> lol
# [09:33] <Hixie> one of the longdesc=""s points to a file called spacer.txt
# [09:33] <Hixie> i have my doubts about the usefulness of THAT longdesc
# [09:34] * Joins: tndH (i=Rob@adsl-87-102-93-12.karoo.KCOM.COM)
# [09:37] <Dashiva> How excellent, an accessible spacer gif
# [09:37] <Hixie> there are 8 times more longdesc=""s that point to the same page as an ancestor <a href=""> than there are longdesc=""s that didn't get caught on any of my "likely to suck" heuristics
# [09:38] <Hixie> and out of 8 million <table>s with a cell with a headers="" attribute, twenty thousand had a cycle in the headers=""
# [09:38] <Hixie> jesus
# [09:38] <Hixie> and over a million had IDs that pointed to elements that weren't cells!
# [09:39] <Hixie> ten thousand had overlapping cells
# [09:39] * Quits: aroben (n=adamrobe@c-67-160-250-192.hsd1.ca.comcast.net)
# [09:40] <Hixie> in about four million cases, the headers="" attribute was redundant given the algorithm in the spec for mapping <th>s to <td>s
# [09:40] <Hixie> in about 80,000 cases the headers="" attribute _would_ have been redundant if all the headers used <th> elements instead of <td>
# [09:40] <Hixie> leaving about 2 million cases that might be valid which i'll have to look at
# [09:43] <Hixie> 2 for 2 on broken uses so far
# [10:24] * Joins: zcorpan (n=zcorpan@84-216-41-27.sprayadsl.telenor.se)
# [10:25] * hendry_ is now known as hendry
# [10:27] <hsivonen> http://tools.ietf.org/html/draft-walsh-tobin-hrri-00
# [10:27] * Quits: tndH (i=Rob@adsl-87-102-93-12.karoo.KCOM.COM) (Read error: 110 (Connection timed out))
# [10:28] <annevk> that's been up for a while now, not?
# [10:29] <annevk> although I don't think they are actually fixing anything
# [10:29] <annevk> they are just widening the range of allowed characters
# [10:33] <hsivonen> annevk: may have been. I dunno. found out today
# [10:33] <zcorpan> a superset of IRI?
# [10:34] <hsivonen> zcorpan: so it seems
# [10:34] <hsivonen> URL5
# [10:34] <zcorpan> yeah
# [10:35] <annevk> that's what we need, yes
# [10:35] <annevk> that's not what it is :(
# [10:36] <hsivonen> URL, URI, IRI, HRRI, URL5
# [10:38] <zcorpan> were there not more names somewhere in between?
# [10:38] * annevk learns about ephemeral
# [10:38] <annevk> there's XRI -> HRRI
# [10:38] <annevk> iirc
# [10:39] <annevk> IRIs are not done yet fwiw
# [10:45] * Joins: hendry_ (n=hendry@91.84.62.62)
# [10:46] <annevk> dropped / not included / omitted / ...?
# [10:46] <annevk> suggestions?
# [10:48] <annevk> excluded?
# [10:49] <zcorpan> 2007-07-01 17:35 Ben 'Cerbera' Millard "absent" might be even better?
# [10:49] <zcorpan> 2007-07-01 17:35 Ben 'Cerbera' Millard "not included" can still imply "we decided not to include these"
# [10:49] <zcorpan> 2007-07-01 17:35 Ben 'Cerbera' Millard "absent" just means "not present"
# [10:49] * Joins: BenWard (i=BenWard@nat/yahoo/x-5be70fe38b7a2d67)
# [10:49] * Joins: maikmerten (n=maikmert@T74a5.t.pppool.de)
# [10:50] <annevk> cool
# [10:54] * Joins: othermaciej (n=mjs@dsl081-048-145.sfo1.dsl.speakeasy.net)
# [10:58] * Quits: hendry (n=hendry@91.84.62.62) (Read error: 113 (No route to host))
# [11:00] * Joins: Ducki (n=Alex@dialin-145-254-189-214.pools.arcor-ip.net)
# [11:05] * Quits: KevinMarks (n=KevinMar@c-76-102-254-252.hsd1.ca.comcast.net) ("rebooting thanks to iTunes")
# [11:09] * hendry_ is now known as hendry
# [11:12] <zcorpan> people really think that new features will suffer less from interop problems than existing features
# [11:13] <annevk> it's mostly an academic exercise it seems
# [11:13] <annevk> although not a real interesting one at that
# [11:18] * Quits: MikeSmith (n=MikeSmit@eM60-254-220-75.pool.emobile.ad.jp) (Read error: 104 (Connection reset by peer))
# [11:29] * Joins: ROBOd (n=robod@86.34.246.154)
# [11:31] * Joins: KevinMarks (n=KevinMar@c-76-102-254-252.hsd1.ca.comcast.net)
# [11:44] * Joins: MikeSmith (n=MikeSmit@eM60-254-240-13.pool.emobile.ad.jp)
# [11:50] <Hixie> "Is XHTML 5 the successor of XHTML 2? Of course not." seems to beg the question with tr/52/21/
# [11:50] <Hixie> didn't someone already ask him that?
# [11:52] <Hixie> oh i see henri basically said that already
# [11:52] <annevk> maybe we should have "HTML 5" (language) and HTML and XHTML (syntax)
# [11:52] <annevk> the XHTML syntax for HTML 5 shorthand would be XHTML5 but that would be unofficial
# [11:52] <othermaciej> s/beg the question/invite the question/
# [11:53] * othermaciej hopes that here at least he can still be gently pedantic
# [11:53] * zcorpan hasn't seen the tr/// constructor before
# [11:53] <othermaciej> it's sed syntax
# [11:53] <othermaciej> (also perl I think)
# [11:54] <othermaciej> same source as s/foo/bar/
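For readers who, like zcorpan, have not met these operators: tr/52/21/ maps characters one-for-one (each 5 becomes 2 and each 2 becomes 1, simultaneously), whereas s/foo/bar/ replaces a matched pattern with a string. A rough Python equivalent, using `str.translate` and `re.sub` as stand-ins for the Perl operators, is sketched below; the helper name `tr` is just a label for this example.

```python
import re


def tr(text, source, target):
    """Character-for-character mapping, in the spirit of tr/source/target/."""
    return text.translate(str.maketrans(source, target))


question = "Is XHTML 5 the successor of XHTML 2?"

# tr/52/21/ : every "5" becomes "2" and every "2" becomes "1" in one pass,
# so the freshly written "2" is not turned into "1" again.
shifted = tr(question, "52", "21")
print(shifted)  # Is XHTML 2 the successor of XHTML 1?

# s/beg the question/invite the question/ : pattern substitution.
fixed = re.sub(r"beg the question", "invite the question",
               "seems to beg the question")
print(fixed)  # seems to invite the question
```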
# [11:58] <zcorpan> seems useful :)
# [11:59] * Quits: duryodhan_away (n=chatzill@221-128-139-99.static.exatt.net) (Read error: 110 (Connection timed out))
# [12:00] * zcorpan also learns that other punctuation and parentheses can be used instead of slashes
# [12:02] * Joins: duryodhan_away (n=chatzill@221.128.139.41)
# [12:04] <annevk> the WHATWG sniffing algorithm doesn't seem to deal with .ico formats, bitmaps, etc.
# [12:07] <zcorpan> http://del.icio.us/url/99931bd7993088a7dc60da0a031732e1 -- "(X)HTML4"
# [12:07] <Hixie> annevk: seems easiest to just ignore the whole issue, frankly. it's not like the spec is called "xhtml5"
# [12:07] <Hixie> annevk: does the spec allow for extra rows to sniff such types?
# [12:08] <krijnh> zcorpan: vpieters? :|
# [12:08] <annevk> Hixie, no it says "User agents must ignore any rows for image types that they do not support."
# [12:08] <annevk> which seems to conflict with the warning earlier on
# [12:08] <annevk> I might have mentioned that on the mailing list already
# [12:08] <zcorpan> krijnh: and condor87
# [12:09] <Hixie> annevk: ah well we'll have to add rows then
# [12:17] * annevk ponders about <picture>
# [12:18] <annevk> it seems such an obvious failure, how can they not see it?
# [12:21] <hsivonen> annevk: indeed
# [12:22] <hsivonen> annevk: Sander Tekelenburg's attempt at making it backwards compatible should show that the nice idea gets out of control quickly when you scratch the surface
# [12:22] <annevk> neither proposal even works in IE7
# [12:23] <hsivonen> I try to focus on tree building instead of spending the whole day replying to the list
# [12:24] <annevk> I think I'll work on some tests for getBoundingClientRect and getClientRects or something
# [12:24] <annevk> lunch first!
# [12:24] <hsivonen> I'm getting more and more convinced that grouping by insertion mode first and by element second makes sense
# [12:24] <annevk> you're keeping insertion modes?
# [12:25] <hsivonen> with fall through for IN_TABLE etc. to IN_BODY and from IN_BODY to IN_HEAD_NOSCRIPT to IN_HEAD
# [12:25] <hsivonen> annevk: no. I have just phases
# [12:25] <annevk> oh ok
# [12:25] <annevk> i like your code for the tokenizer quite a bit
# [12:26] <annevk> although the comments are quite verbose
# [12:26] * Joins: tantek (n=tantek@adsl-63-195-114-133.dsl.snfc21.pacbell.net)
# [12:26] <hsivonen> annevk: it's the spec :-)
# [12:26] <annevk> yeah :)
# [12:26] <hsivonen> too bad that doing the same for tree building is too much work
# [12:27] <annevk> we just need lots of testcases
# [12:27] <annevk> if zcorpan gets a proper browser framework to work for html5lib tests I assume we'll get even more testcases there
# [12:28] <hsivonen> I intend to print my tree builder and the spec and go over them with a highlighter pen to check that everything is there
# [12:28] <annevk> especially since the testformat is quite easy and the output can be generated using tools (assuming html5lib is compliant)
# [12:29] <annevk> not sure yet how to test the formpointer stuff
# [12:29] <annevk> that may require some extension
# [12:30] <hsivonen> annevk: I have been thinking of a sanitizer tree that puts an UUID ID on <form> and form='' on out-of-subtree associated inputs
# [12:37] <Hixie> so has anyone actually defined the problem that <picture> is intended to solve?
# [12:39] <hsivonen> Hixie: implicitly, the problem is that <img> doesn't allow structured fallback--only a plain string
# [12:39] <Hixie> aah
# [12:40] <Hixie> does he elaborate on why <object> and longdesc="" don't handle this well enough?
# [12:40] <Hixie> http://www.grupodignidade.org.br/projetos.php - <img src="img/logo.gif" alt="logo" width="160" height="80" longdesc="http://www.grupodignidade.org.br/img/logo.gif" />
# [12:40] <Hixie> sigh
# [12:40] <hsivonen> Hixie: for <object>, yes. for longdesc, I no longer remember
# [12:40] <Hixie> k
# [12:40] * Joins: zcorpan_ (n=zcorpan@84-216-41-27.sprayadsl.telenor.se)
# [12:41] <Hixie> bed time
# [12:41] <Hixie> nn
# [12:41] <hsivonen> nn
# [12:45] * Joins: yod (n=ot@softbank221018155222.bbtec.net)
# [12:46] * Quits: yod (n=ot@softbank221018155222.bbtec.net) (Client Quit)
# [12:47] <annevk> the table and longdesc study is interesting
# [12:50] * Joins: zcorpan__ (n=zcorpan@84-216-41-27.sprayadsl.telenor.se)
# [12:52] * Joins: peepo (n=Jay@86.157.113.34)
# [12:57] * Joins: Ducki_ (n=Alex@dialin-145-254-187-022.pools.arcor-ip.net)
# [13:01] * Quits: zcorpan (n=zcorpan@84-216-41-27.sprayadsl.telenor.se) (Read error: 110 (Connection timed out))
# [13:04] * Quits: zcorpan_ (n=zcorpan@84-216-41-27.sprayadsl.telenor.se) (Read error: 110 (Connection timed out))
# [13:06] * zcorpan__ is now known as zcorpan
# [13:07] <zcorpan> hmm, it's not possible to check what case elements are in the dom in html, is it? except perhaps trying getElementsByTagNameNS or something
# [13:12] <annevk> don't think so
# [13:12] <annevk> unless localName is somehow secured
# [13:13] <zcorpan> given webkit's implementation experience with my suggestion about localName, even that seems to be a dead end
# [13:15] <zcorpan> i'll just have to use toLowerCase()
# [13:19] <zcorpan> http://simon.html5.org/temp/html5lib-tests/wrapper.html -- got something working at least. now i just need to figure out how to parse and test the real files. or perhaps i'll just use another wrapper with some php. that may be simpler, dunno
# [13:21] * Quits: Ducki (n=Alex@dialin-145-254-189-214.pools.arcor-ip.net) (Read error: 110 (Connection timed out))
# [13:22] <zcorpan> the function fails in ie if there's a short bogus comment like <!foo>
# [13:38] * Quits: maikmerten (n=maikmert@T74a5.t.pppool.de) ("Leaving")
# [13:39] <zcorpan> </> results in a "/" element in ie
# [13:44] * Joins: the_mart (n=Martin@host86-135-9-158.range86-135.btcentralplus.com)
# [13:46] <zcorpan> same as </foo> really
# [13:47] <zcorpan> stray </x:y> gets dropped
# [14:00] <annevk> dropping </> works just as well
# [14:05] <zcorpan> oh sure. i was surprised that ie didn't drop it
# [14:23] * Quits: tantek (n=tantek@adsl-63-195-114-133.dsl.snfc21.pacbell.net) (Remote closed the connection)
# [14:29] * Joins: maikmerten (n=maikmert@T74a5.t.pppool.de)
# [14:33] * Joins: SavageX (n=maikmert@T6eaf.t.pppool.de)
# [14:48] * Quits: peepo (n=Jay@86.157.113.34) ("later")
# [14:51] * Quits: maikmerten (n=maikmert@T74a5.t.pppool.de) (Read error: 110 (Connection timed out))
# [14:54] <annevk> lol
# [14:54] <annevk> tr > tbody > td
# [14:54] <annevk> tbody is not implied!
# [14:57] * Joins: Ducki (i=Ducki@dialin-212-144-055-229.pools.arcor-ip.net)
# [14:57] * Joins: Lachy_ (n=Lachy@124-168-24-114.dyn.iinet.net.au)
# [14:57] * Quits: Lachy_ (n=Lachy@124-168-24-114.dyn.iinet.net.au) (Client Quit)
# [15:07] <Philip`> Shouldn't that be "tbody > tr > td"?
# [15:07] <annevk> yeah
# [15:09] <Philip`> Ah
# [15:13] * Quits: Lachy (n=Lachy@124-168-24-114.dyn.iinet.net.au) (Read error: 110 (Connection timed out))
# [15:14] * Joins: rubys (n=rubys@cpe-075-182-064-252.nc.res.rr.com)
# [15:14] * Parts: rubys (n=rubys@cpe-075-182-064-252.nc.res.rr.com)
# [15:14] * Joins: rubys (n=rubys@cpe-075-182-064-252.nc.res.rr.com)
# [15:14] * Parts: rubys (n=rubys@cpe-075-182-064-252.nc.res.rr.com)
# [15:19] * Quits: Ducki_ (n=Alex@dialin-145-254-187-022.pools.arcor-ip.net) (Read error: 110 (Connection timed out))
# [15:34] * duryodhan_away is now known as duryodhan
# [15:41] * Quits: zcorpan (n=zcorpan@84-216-41-27.sprayadsl.telenor.se) (Read error: 110 (Connection timed out))
# [15:48] * Joins: zcorpan (n=zcorpan@84-216-41-27.sprayadsl.telenor.se)
# [15:51] <zcorpan> making progress...: http://simon.html5.org/temp/html5lib-tests/wrapper.html
# [15:52] <zcorpan> now i just need to make the text file into two arrays
# [15:53] * annevk wonders in what kind of fantasyland some people live
# [15:53] <annevk> "I was thinking exactly the opposite, and wondering whether Microsoft might be persuaded to migrate their horrific “Active-X” strings from the opening <object> tag to an nested <param>."
# [15:54] * Joins: tndH (i=Rob@adsl-87-102-93-12.karoo.KCOM.COM)
# [15:54] <Philip`> zcorpan: "Security error: attempted to read protected variable" - why doesn't Opera like that?
# [15:55] <zcorpan> Philip`: dunno, works in Kestrel
# [15:56] <Philip`> Oh, okay, maybe it's only a problem with 9.2
# [15:57] <annevk> evil data: URIs
# [15:57] <hsivonen> annevk: in a world where the value of π is a legislative decision
# [15:58] * Joins: jcgregorio (i=chatzill@nat/ibm/x-b085c6389fc2a600)
# [16:03] <zcorpan> any suggestions on how to read the text file with js?
# [16:04] <hsivonen> zcorpan: XHR?
# [16:04] <zcorpan> hsivonen: yeah. although in firefox i got a "syntax error" when trying to read .responseText
# [16:05] * Quits: duryodhan (n=chatzill@221.128.139.41) ("Born to be WilD !! rofl")
# [16:06] * Joins: duryodhan (n=chatzill@221.128.139.41)
# [16:07] <zcorpan> but let's assume that doesn't happen in firefox and i can read the file... how do i then parse it into two arrays?
# [16:07] <zcorpan> my previous attempt with split() was too naïve and didn't really work
# [16:08] <Philip`> Regular expressions?
# [16:08] <Philip`> Whatever the problem, they are always the solution
# [16:08] <annevk> :p
# [16:08] <hsivonen> "now you have two problems" :-)
# [16:08] <annevk> why doesn't split("\n\n") work?
# [16:10] <zcorpan> does that work with multiple lines?
# [16:10] <zcorpan> also, what if a test has e.g. \n\n as data
# [16:10] <zcorpan> or doesn't the syntax allow for that?
# [16:10] <annevk> oh right, yes
# [16:10] <zcorpan> i think it does, so long as no test has \n\n as data
# [16:11] <annevk> no \n\n can occur
# [16:11] <zcorpan> ok
# [16:11] <annevk> just split on \n\n#data or something and remove #data from the first line too
# [16:11] <zcorpan> splitting removes automatically
# [16:14] <Philip`> http://wiki.whatwg.org/wiki/Parser_tests#Tree_Construction_Tests doesn't seem to say it has to have blank lines between tests - the only delimiter is "\n#data\n"
# [16:14] <annevk> sure, but the first test doesn't start with \n\n
# [16:14] <annevk> Philip`, except for the first test...
# [16:14] <annevk> also, two newlines is sort of accepted
# [16:14] <Philip`> /^#data$/
# [16:14] <Philip`> /^#data$/
# [16:16] <Philip`> Uh
# [16:16] <Philip`> /^#data$/m
# [16:18] <Philip`> (or something like /\n*^#data\n/m if you want to strip newlines, assuming the last test doesn't end with a newline)
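The splitting approach suggested above can be sketched as follows. This is only an illustration, not the code behind wrapper.html: `splitTests` and the sample text are hypothetical, and the sample assumes the tree-construction layout of `#data`, `#errors` and `#document` sections. The leading empty chunk is filtered out explicitly because engines have disagreed on whether `split()` keeps it.

```javascript
// Hypothetical sample in the tree-construction test format.
const sample = [
  "#data",
  "<p>one",
  "#errors",
  "#document",
  "| <html>",
  "",
  "#data",
  "<p>two",
  "#errors",
  "#document",
  "| <html>"
].join("\n");

// Hypothetical helper: split a test file into one chunk per test.
// The ^ (with the m flag) only lets "#data" match at the start of a line,
// and \n* swallows the blank lines that precede it.
function splitTests(text) {
  return text
    .split(/\n*^#data\n/m)
    .filter(function (chunk) { return chunk.length > 0; });
}

const tests = splitTests(sample);
console.log(tests.length + " tests found");
```

Each chunk then starts with the test input, followed by the `#errors` and `#document` sections.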
# [16:19] * Philip` wonders if anyone has written test cases for test case parsers
# [16:20] <Philip`> though I'm not entirely sure how you'd parse the tests for the test parser
# [16:20] * Quits: BenWard (i=BenWard@nat/yahoo/x-5be70fe38b7a2d67) ("Fades out again…")
# [16:20] * Joins: BenWard (i=BenWard@nat/yahoo/x-4d35e60d0ce62784)
# [16:20] <zcorpan> we need a parsing spec for the test case format
# [16:20] <zcorpan> -_-
# [16:23] * Quits: BenWard (i=BenWard@nat/yahoo/x-4d35e60d0ce62784) (Client Quit)
# [16:24] * Joins: BenWard (i=BenWard@nat/yahoo/x-586e60901d97a22b)
# [16:25] * Quits: BenWard (i=BenWard@nat/yahoo/x-586e60901d97a22b) (Client Quit)
# [16:25] * Joins: BenWard (i=BenWard@nat/yahoo/x-be55ed277dc3cc02)
# [16:29] * Joins: billmason (n=billmaso@ip156.unival.com)
# [16:39] * Quits: MikeSmith (n=MikeSmit@eM60-254-240-13.pool.emobile.ad.jp) ("Less talk, more pimp walk.")
# [16:48] <annevk> I tweaked http://wiki.whatwg.org/wiki/Parser_tests#Tree_Construction_Tests a bit to make it more clear what the actual format is
# [16:49] <Philip`> The link at the bottom to the tests should probably be updated
# [16:50] <Philip`> 'a line that says "#errors:"' - probably shouldn't have the colon
# [16:51] * Quits: duryodhan (n=chatzill@221.128.139.41) (Remote closed the connection)
# [16:51] <annevk> at some point the format used by http://html5lib.googlecode.com/svn/trunk/testdata/tree-construction/tests4.dat should be added too and the description could use some more whitespace...
# [16:57] * Joins: Ducki_ (i=Alex@dialin-212-144-065-213.pools.arcor-ip.net)
# [17:04] <zcorpan> yay
# [17:05] <zcorpan> works in Kestrel now
# [17:06] <annevk> zcorpan, sweet
# [17:06] <zcorpan> firefox boils at...: Error: unexpected end of XML source
# [17:06] <zcorpan> Source File: data:text/html,<script><div></script></div><title><p></title><p><p>
# [17:06] <zcorpan> Line: 1, Column: 4
# [17:06] <zcorpan> Source Code:
# [17:06] <zcorpan> <div>
# [17:06] <annevk> ah
# [17:07] <zcorpan> is that e4x or something?
# [17:07] <Philip`> It works in precisely none of the five browsers I have access to :-(
# [17:07] <annevk> put encodeURIComponent around it
# [17:07] <annevk> maybe that will make it work better (it's also theoretically more correct)
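A minimal sketch of that suggestion, assuming the test markup is loaded through a `data:` URI; `makeDataUri` is a hypothetical name, not something from the wrapper script.

```javascript
// Hypothetical helper: build a data: URI for a chunk of test markup.
// encodeURIComponent escapes characters such as "<", ">", "#" and spaces
// that would otherwise be misread as part of the URI syntax.
function makeDataUri(markup) {
  return "data:text/html," + encodeURIComponent(markup);
}

console.log(makeDataUri("<p>a b"));
```

Escaping `#` matters in particular, since an unescaped `#` would start a fragment identifier and cut the markup short.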
# [17:07] <zcorpan> don't think that's the problem
# [17:07] <zcorpan> it's <script><div></script> in the actual test
# [17:08] <annevk> maybe catch all error events and silence them?
# [17:09] <annevk> iframe.onerror = function ...
# [17:09] <Philip`> That would be parsed as E4X, I believe - it's only in the cases of <!--...--> and <![CDATA[...]]> where you have to use type="text/javascript;e4x=1"
# [17:10] <annevk> iframe.onerror = null
# [17:10] <annevk> or something
# [17:10] <Philip`> (http://developer.mozilla.org/en/docs/E4X)
# [17:10] <zcorpan> annevk: doesn't help
# [17:10] <zcorpan> annevk: don't think JS errors bubble up to the parent document
# [17:11] <annevk> zcorpan, iframe.contentWindow.onerror = null
# [17:11] <zcorpan> annevk: nope
# [17:12] * Quits: KevinMarks (n=KevinMar@c-76-102-254-252.hsd1.ca.comcast.net) ("The computer fell asleep")
# [17:12] <annevk> does it actually work if you remove that test?
# [17:13] <zcorpan> hmm. no.
# [17:13] <annevk> btw, it would be nice if you showed the input data in the result tree as well
# [17:14] <annevk> makes it easier to analyze potential errors
# [17:14] <Philip`> Could change the tests to do <script type="unsupported"> so browsers won't try running them
# [17:15] <annevk> that may work
# [17:16] <zcorpan> or use //<div> instead of <div>
# [17:16] <zcorpan> annevk: done
# [17:16] <annevk> done what?
# [17:17] <zcorpan> showed the input data
# [17:17] <annevk> ah
# [17:17] <annevk> does it matter though that browsers run them?
# [17:18] <zcorpan> no, don't think so
# [17:18] * Quits: Ducki (i=Ducki@dialin-212-144-055-229.pools.arcor-ip.net) (No route to host)
# [17:18] <annevk> zcorpan, btw iframe.contentWindow.onerror = function(foo,bar,baz) { return false }
# [17:18] <annevk> might prevent the error from appearing
# [17:18] <zcorpan> it's some other reason why it doesn't work in firefox
# [17:18] <zcorpan> ok
# [17:20] <zcorpan> xhr only works on the same domain, right
# [17:20] <zcorpan> might need a server side script to include external tests
# [17:20] <annevk> yeah, same-origin
# [17:23] <Philip`> If the external tests were in a format that was valid JS, you could include them with <script src>
# [17:24] <zcorpan> well, they're not. :)
# [17:24] <Philip`> Or if you could change the external tests to be in a format that was valid JS :-)
# [17:25] <zcorpan> seems simpler to write a server-side wrapper for this
# [17:25] <Philip`> but I guess the point of it being external is that it's external and out of your control
# [17:25] <annevk> zcorpan, how about a document.write() version?
# [17:26] <zcorpan> annevk: ?
# [17:26] <annevk> zcorpan, instead of iframe.src = do iframe.contentDocument.open(); iframe.contentDocument.write(testdata); etc.
# [17:26] <annevk> that's how the live-dom-viewer works
# [17:27] <zcorpan> ah
# [17:27] <zcorpan> ok
# [17:29] <zcorpan> it doesn't fire a load event then. but i guess i could make it work. what's the benefit?
# [17:29] <annevk> works in IE
# [17:29] <annevk> just copy some of the live-dom-viewer logic
# [17:29] <annevk> should be doable
# [17:32] <zcorpan> works in firefox with that change
# [17:33] <zcorpan> and opera 9.2
# [17:35] <zcorpan> ie only wants to load the first test
# [17:37] * Quits: BenWard (i=BenWard@nat/yahoo/x-be55ed277dc3cc02) ("Fades out again…")
# [17:38] <annevk> that's an improvement
# [17:40] <zcorpan> "childNodes is null or not an object"
# [17:40] <zcorpan> for (var i = 0; i < node.childNodes.length; i += 1) {
# [17:42] <annevk> hmm
# [17:42] <zcorpan> ah
# [17:42] <zcorpan> contentDocument -> contentWindow.document
# [17:42] <annevk> whoa
# [17:43] <annevk> that's supposed to be equivalent
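The document.write() approach, together with the contentDocument fallback that made it work in IE, might look roughly like this. It is a sketch under assumptions: `frameDocument` and `writeTest` are hypothetical names, and the real wrapper and the live DOM viewer may be structured differently.

```javascript
// Hypothetical helper: find the document inside an iframe element.
// Falls back to contentWindow.document for browsers that do not
// expose contentDocument.
function frameDocument(iframe) {
  return iframe.contentDocument ||
    (iframe.contentWindow ? iframe.contentWindow.document : null);
}

// Hypothetical helper: replace the iframe's content with the test markup
// instead of pointing iframe.src at a data: URI.
function writeTest(iframe, markup) {
  const doc = frameDocument(iframe);
  doc.open();
  doc.write(markup);
  doc.close();
  return doc;
}
```

In a page this would be called as `writeTest(document.getElementsByTagName("iframe")[0], testdata)`, after which the resulting DOM can be read from the returned document.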
# [17:43] <Philip`> It's kind of irritating when you're trying to write tests to help interoperability between browsers, but then you can't even write a script to run the tests without hitting non-interoperability issues between every browser...
# [17:43] <zcorpan> now it works in ie
# [17:43] <zcorpan> Philip`: yeah
# [17:43] <zcorpan> but it outputs everything on one line
# [17:44] <zcorpan> \n -> \r\n ?
# [17:44] * Quits: jcgregorio (i=chatzill@nat/ibm/x-b085c6389fc2a600) ("ChatZilla 0.9.78.1 [Firefox 2.0.0.4/2007060115]")
# [17:44] <annevk> yeah
# [17:45] <zcorpan> YAY!
# [17:45] <zcorpan> :D
# [17:45] <zcorpan> doesn't work in safari though
# [17:46] <annevk> hmm
# [17:46] <annevk> blame mjs :p
# [17:46] <zcorpan> othermaciej: yt? :)
# [17:47] <annevk> IE fails everything because of its fixed <title>
# [17:49] <annevk> zcorpan, the test output numbers don't match the test input numbers
# [17:49] <annevk> zcorpan, it seems that way
# [17:49] <zcorpan> the output numbers is 1 greater right?
# [17:50] <annevk> hmm, IE and Opera seem to be one off
# [17:50] <zcorpan> yeah
# [17:50] <zcorpan> it's correct
# [17:50] <zcorpan> the first test is empty
# [17:50] <zcorpan> .split(/\n*#data\n/m)
# [17:50] <annevk> so why are they one off?
# [17:51] <annevk> IE saying it's 24 and Opera claiming it's 25...
# [17:51] <zcorpan> "foobar".split("foo") // ["", "bar"]
# [17:52] <zcorpan> i guess i could remove the first entry from the array but it seemed simpler to ignore it
# [17:53] <zcorpan> they might do different things with split()
# [17:55] <zcorpan> yep
# [17:55] <zcorpan> javascript:(function(){var arr = "#data\nfoo".split(/\n*#data\n/m); alert(arr.length); })()
# [17:57] <Philip`> (Is it intentional that that will match strings like "foo#data\n"?)
# [17:57] <zcorpan> not really
# [17:58] <Philip`> (That was what the ^ in /\n*^#data\n/m was for :-) )
# [17:59] <zcorpan> (fixed)
# [18:01] <zcorpan> ok, fixed the number of tests issue
# [18:04] <zcorpan> ie passes test 101
# [18:05] <annevk> <html><head><title></title><body></body></html> ...
# [18:06] <zcorpan> amazing that i got the format right on the first try. i didn't even look at the documentation
# [18:06] <annevk> hixie designed it
# [18:07] <zcorpan> Hixie: if you could get people use html right on the first try... ;)
# [18:07] <annevk> I'm quite disappointed by the large number of fails
# [18:07] <annevk> Hopefully that will improve in due course by either updating the tests or the spec
# [18:08] <zcorpan> annevk: in which browser?
# [18:08] <annevk> all?
# [18:08] <Philip`> Could you make a table of the results for all browsers, to see which tests don't match any browser's reality?
# [18:09] <zcorpan> i guess
# [18:09] <zcorpan> but there are more tests
# [18:09] <zcorpan> i want to figure out how to run those
# [18:09] <zcorpan> first food
# [18:09] <annevk> another for loop around the xhr
# [18:09] <annevk> or just merge everything on the server
# [18:09] <zcorpan> yeah
# [18:10] <annevk> it would be good if you at some point committed this back to html5lib
# [18:11] <annevk> then we can make the acid-parser test
# [18:11] <zcorpan> perhaps i don't need to do server side magic
# [18:11] <annevk> other things that might be nice: 1) some colors on the result page to make it easier to scan 2) collapsible items on the result page
# [18:12] <annevk> especially the second is useful given the large number of tests that fail :)
# [18:12] * Quits: hendry (n=hendry@91.84.62.62) ("leaving")
# [18:12] * zcorpan makes notes
# [18:13] <annevk> zcorpan, did you "fix" the difference in counting with IE?
# [18:15] <annevk> I'm thinking that it might be useful to include a bunch of <title></title> in a lot of testcases to make the IE results more usable
# [18:16] <Philip`> Could you post-process the results to ignore ones where the only difference is the "| <title>" line?
# [18:17] <Philip`> (or mark as uninteresting, rather than entirely ignore them)
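One way to do that post-processing, sketched under the assumption that expected and actual trees are compared as serialized strings with one `| `-prefixed line per node; `stripTitleLines` and `classify` are hypothetical names.

```javascript
// Hypothetical helper: drop every "| <title>" line, at any indent level.
function stripTitleLines(tree) {
  return tree
    .split("\n")
    .filter(function (line) { return !/^\|\s*<title>$/.test(line); })
    .join("\n");
}

// Hypothetical helper: classify a result as "pass", "title-only"
// (differs only by <title> lines) or "fail".
function classify(expected, actual) {
  if (expected === actual) return "pass";
  if (stripTitleLines(expected) === stripTitleLines(actual)) return "title-only";
  return "fail";
}
```

Results classified as "title-only" could then be marked as uninteresting on the result page rather than hidden.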
# [18:18] <annevk> that'd be another option
# [18:18] <annevk> prolly better
# [18:25] * Joins: h3h (n=w3rd@66-162-32-234.static.twtelecom.net)
# [18:39] * Joins: rubys (n=rubys@cpe-075-182-064-252.nc.res.rr.com)
# [18:40] <rubys> any html5lib developers awake here? :-)
# [18:44] * annevk is
# [18:45] <annevk> zcorpan ported html5lib tests to browsers
# [18:45] <annevk> see http://simon.html5.org/temp/html5lib-tests/wrapper.html for tree-construction/tests1
# [18:46] <rubys> Anne, can you do me a favor and svn update and then run:
# [18:46] <rubys> python parse.py --tree "<p><b><i><u></p><p>X"
# [18:49] <annevk> get two <p> siblings the second containing the same as the first plus "X" as deepest child
# [18:51] <rubys> nevermind, I found my problem (the actual test2 #45 actually has a new line in the middle)
# [18:51] <rubys> sorry to bother you
# [18:51] <annevk> no worries
# [18:57] * Joins: Ducki__ (i=Alex@dialin-145-254-186-117.pools.arcor-ip.net)
# [19:04] * Joins: KevinMarks (i=KevinMar@nat/google/x-0ec231ed8cb32832)
# [19:04] * Quits: gsnedders (n=gsnedder@host81-132-88-104.range81-132.btcentralplus.com) ("Don't touch /dev/null…")
# [19:09] <annevk> hsivonen, how would this UUID stuff work?
# [19:10] <annevk> hsivonen, what I'm interested in is annotating the test results for tree construction with that information
# [19:15] * Joins: aroben (n=adamrobe@17.255.104.120)
# [19:17] * Quits: aroben (n=adamrobe@17.255.104.120) (Remote closed the connection)
# [19:17] * Joins: aroben (n=adamrobe@17.255.104.120)
# [19:19] * Quits: Ducki_ (i=Alex@dialin-212-144-065-213.pools.arcor-ip.net) (Read error: 113 (No route to host))
# [19:25] * Parts: rubys (n=rubys@cpe-075-182-064-252.nc.res.rr.com)
# [19:26] * Joins: hasather (n=hasather@22.80-203-71.nextgentel.com)
# [19:35] * Parts: hasather (n=hasather@22.80-203-71.nextgentel.com)
# [19:35] * Joins: met_ (n=Hassman@r5bx220.net.upc.cz)
# [19:36] <met_> http://ydnar.vox.com/library/post/webkit-team-adds-audio-video-support.html
# [19:38] * Joins: Lachy (n=Lachy@124-168-24-114.dyn.iinet.net.au)
# [19:42] * Joins: bzed (n=bzed@dslb-084-059-121-172.pools.arcor-ip.net)
# [19:43] <zcorpan> annevk: i did
# [19:47] * Quits: Lachy (n=Lachy@124-168-24-114.dyn.iinet.net.au) ("ChatZilla 0.9.78.1 [Firefox 2.0.0.4/2007051502]")
# [19:47] * Joins: Lachy (n=Lachy@124-168-24-114.dyn.iinet.net.au)
# [19:47] * Joins: weinig (n=weinig@17.255.97.129)
# [19:48] <othermaciej> zcorpan: what's the problem?
# [19:50] * Joins: hasather (n=hasather@22.80-203-71.nextgentel.com)
# [19:52] * Quits: hasather (n=hasather@22.80-203-71.nextgentel.com) (Remote closed the connection)
# [19:53] * Joins: hasather (n=hasather@22.80-203-71.nextgentel.com)
# [20:27] * Joins: kingryan (n=kingryan@corp.technorati.com)
# [20:36] * Joins: gsnedders (n=gsnedder@host81-132-88-104.range81-132.btcentralplus.com)
# [20:37] * Joins: psa (n=yomode@posom.com)
# [20:37] * Joins: jcgregorio (n=chatzill@adsl-072-148-043-048.sip.rmo.bellsouth.net)
# [20:43] * Joins: tantek (n=tantek@corp.technorati.com)
# [20:57] * Quits: Ducki__ (i=Alex@dialin-145-254-186-117.pools.arcor-ip.net) (Read error: 104 (Connection reset by peer))
# [20:57] * Joins: Ducki__ (n=Alex@dialin-145-254-186-117.pools.arcor-ip.net)
# [20:59] <zcorpan> othermaciej: http://simon.html5.org/temp/html5lib-tests/wrapper.html doesn't work in safari (for windows). don't know why
# [21:00] <othermaciej> I was hoping it would be obvious but there's a whole lot of script there
# [21:00] * Quits: tantek (n=tantek@corp.technorati.com)
# [21:01] <zcorpan> would the web inspector help me debug? how do i activate it on windows?
# [21:01] <othermaciej> zcorpan: it's got a "parse error" and a "maximum call stack size exceeded"
# [21:01] <othermaciej> the JavaScript error console (in the debug menu) would tell you that
# [21:01] <zcorpan> don't see a debug menu
# [21:02] <othermaciej> yeah, you have to turn it on with a command-line switch
# [21:02] <othermaciej> google for "safari windows debug menu"
# [21:02] <othermaciej> I don't remember the details at the moment
# [21:02] <billmason> http://rakaz.nl/item/enabling_the_debug_menu_on_safari_for_windows
# [21:02] <zcorpan> ok, will do
# [21:02] <othermaciej> is dom2string going to recurse to a depth of more than 99?
# [21:02] <zcorpan> billmason: cheers
# [21:02] <othermaciej> if so, that's probably the problem
# [21:03] <othermaciej> we should probably relax that stack limit
# [21:03] <zcorpan> it might
# [21:05] <zcorpan> but i don't think that's the problem, it didn't work with one test with the input "Test" either
# [21:11] <zcorpan> is "run" a reserved word?
# [21:13] <hasather> zcorpan: no
# [21:13] <zcorpan> what is the SyntaxError: Parse Error on line 1 in http://simon.html5.org/temp/html5lib-tests/wrapper.html ?
# [21:16] * Joins: weinig_ (i=weinig@nat/apple/x-a7dd02655b3290fa)
# [21:18] * Quits: met_ (n=Hassman@r5bx220.net.upc.cz) ("Chemists never die, they just stop reacting.")
# [21:22] * Quits: aroben (n=adamrobe@17.255.104.120) (Read error: 104 (Connection reset by peer))
# [21:24] * Joins: zcorpan_ (n=zcorpan@84-216-41-27.sprayadsl.telenor.se)
# [21:24] <zcorpan_> works when i have only 1 test in the file
# [21:24] <zcorpan_> 2 tests as well
# [21:25] <hasather> seems to be a problem with the test that looks like this: "<script><div></script></div><title><p></title><p><p>"
# [21:27] * Parts: kingryan (n=kingryan@corp.technorati.com)
# [21:27] * Joins: kingryan (n=kingryan@corp.technorati.com)
# [21:28] <hasather> zcorpan: that seems to be the only test that has unallowed content in a script element
# [21:29] * Quits: weinig (n=weinig@17.255.97.129) (Read error: 110 (Connection timed out))
# [21:30] <jgraham> zcorpan_: TestData in http://html5lib.googlecode.com/svn/trunk/python/tests/support.py contains the testcase parser that html5lib uses (you have to pass it a list of the section headings e.g. ("data", "errors", "document"))
# [21:30] <jgraham> (that was a FYI if you have any more issues with the test format)
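For comparison, a section-based parser along those lines could be sketched in JavaScript as below. This is not a port of html5lib's TestData; `parseTest` is a hypothetical function that only borrows the idea of passing in the list of section headings.

```javascript
// Hypothetical helper: split one test into named sections.
// A line consisting of "#" plus a known heading starts a new section;
// every other line is appended to the current section.
function parseTest(text, headings) {
  const sections = {};
  let current = null;
  text.split("\n").forEach(function (line) {
    const name = line.charAt(0) === "#" ? line.slice(1) : null;
    if (name !== null && headings.indexOf(name) !== -1) {
      current = name;
      sections[current] = [];
    } else if (current !== null) {
      sections[current].push(line);
    }
  });
  const result = {};
  Object.keys(sections).forEach(function (key) {
    result[key] = sections[key].join("\n");
  });
  return result;
}

const parsed = parseTest(
  "#data\n<p>one\n#errors\n#document\n| <html>\n|   <head>",
  ["data", "errors", "document"]
);
console.log(parsed.data);
```

Lines that start with `#` but do not name a known heading are kept as data, so test input such as `#foo` is not mistaken for a section marker.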
# [21:36] * Quits: zcorpan (n=zcorpan@84-216-41-27.sprayadsl.telenor.se) (Read error: 110 (Connection timed out))
# [21:36] <zcorpan_> hasather: ah. yes of course
# [21:37] <zcorpan_> jgraham: thanks
# [21:39] <zcorpan_> othermaciej: seems like the problem is the number of recursions indeed. not sure if i can/will work around that
# [21:40] * Quits: weinig_ (i=weinig@nat/apple/x-a7dd02655b3290fa) (Read error: 110 (Connection timed out))
# [21:42] <othermaciej> zcorpan_: I'm sure your function could easily be rewritten not to be recursive
# [21:42] <zcorpan_> othermaciej: can you do it for me? :)
# [21:44] <othermaciej> zcorpan_: don't have time to actually test, but I can tell you roughly how to do it
# [21:45] <othermaciej> you're effectively doing a preorder tree traversal
# [21:45] <othermaciej> you can do that with a stack, or since you have parent pointers just with a simple loop
# [21:46] <othermaciej> when entering a node, you do the entry processing (print node itself, increment indent)
# [21:47] <othermaciej> then you check if it has children - if so, enter the first child
# [21:47] <zcorpan_> (the live dom viewer has the same problem btw)
# [21:47] <othermaciej> if no children, check for a next sibling - if present, do exit processing for current node and enter the next sibling
# [21:48] <othermaciej> if no next sibling, do exit processing for this node, then continue from the parent as if it had no children (i.e. exit to the parent's next sibling or parent's parent and so forth)
# [21:48] <zcorpan_> ok. thanks
# [21:49] <othermaciej> we use this style of tree traversal internal to webcore all the time
# [21:49] <othermaciej> in fact, we have an internal traverseNextNode function that does it
# [21:49] <othermaciej> (although that doesn't visit a node again when exiting, which I think you want)
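The loop described above can be sketched as follows. It is an illustration only: `makeNode` builds plain objects standing in for DOM nodes (only `parentNode`, `firstChild` and `nextSibling` are used), `treeToString` is a hypothetical replacement for a recursive serializer, and the `| ` output format is an assumption modelled on the test files.

```javascript
// Hypothetical stand-in for DOM nodes, so the sketch runs outside a browser.
function makeNode(name, children) {
  const node = { name: name, parentNode: null, firstChild: null, nextSibling: null };
  (children || []).forEach(function (child, i, all) {
    child.parentNode = node;
    if (i === 0) node.firstChild = child;
    if (i + 1 < all.length) child.nextSibling = all[i + 1];
  });
  return node;
}

// Preorder traversal without recursion, using parent pointers.
function treeToString(root) {
  const lines = [];
  let depth = 0;
  let node = root;
  while (node) {
    // Entry processing: print the node at the current indent.
    lines.push("| " + "  ".repeat(depth) + "<" + node.name + ">");
    if (node.firstChild) {
      node = node.firstChild;
      depth += 1;
      continue;
    }
    // Exit processing: climb until a node has a next sibling or we reach the root.
    while (node !== root && !node.nextSibling) {
      node = node.parentNode;
      depth -= 1;
    }
    if (node === root) break;
    node = node.nextSibling;
  }
  return lines.join("\n");
}

const tree = makeNode("html", [
  makeNode("head"),
  makeNode("body", [makeNode("p")])
]);
console.log(treeToString(tree));
```

Because the only state is the current node and the depth counter, the call stack stays flat however deeply the tree is nested.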
# [21:50] <zcorpan_> yeah, i want to catch misnested nodes in ie
# [21:51] <zcorpan_> or perhaps that's just a check before you process the children
# [21:52] * moeffju[Away] is now known as moeffju
# [21:52] * Joins: peepo (n=Jay@86.157.113.34)
# [21:56] * Quits: peepo (n=Jay@86.157.113.34) (Client Quit)
# [22:04] * Joins: weinig (i=weinig@nat/apple/x-81d0d4e457982b68)
# [22:05] * Joins: aroben (n=adamrobe@17.203.15.248)
# [22:08] * Quits: KevinMarks (i=KevinMar@nat/google/x-0ec231ed8cb32832) (Read error: 104 (Connection reset by peer))
# [22:16] * Quits: SavageX (n=maikmert@T6eaf.t.pppool.de) (Remote closed the connection)
# [22:17] * Joins: MikeSmith (n=MikeSmit@eM60-254-215-75.pool.emobile.ad.jp)
# [22:21] * Quits: Charl (n=charlvn@c1-116-5.wblv.isadsl.co.za) ("Leaving")
# [22:27] * Joins: h3h_ (n=w3rd@66-162-32-234.static.twtelecom.net)
# [22:32] * Joins: webben (n=benh@91.84.193.157)
# [22:38] * Quits: h3h (n=w3rd@66-162-32-234.static.twtelecom.net) (Read error: 110 (Connection timed out))
# [22:48] * Joins: KevinMarks (i=KevinMar@nat/google/x-c911ea9fa8809661)
# [22:50] * Joins: weinig_ (i=weinig@nat/apple/x-042b02b45e5f5046)
# [22:50] * Quits: weinig (i=weinig@nat/apple/x-81d0d4e457982b68) (Read error: 104 (Connection reset by peer))
# [22:57] * Quits: Ducki__ (n=Alex@dialin-145-254-186-117.pools.arcor-ip.net) (Read error: 104 (Connection reset by peer))
# [22:58] * Joins: Ducki__ (n=Alex@dialin-212-144-055-244.pools.arcor-ip.net)
# [23:09] * Joins: weinig (i=weinig@nat/apple/x-7d076cc85573a423)
# [23:10] * Quits: weinig_ (i=weinig@nat/apple/x-042b02b45e5f5046) (Read error: 104 (Connection reset by peer))
# [23:12] * Quits: KevinMarks (i=KevinMar@nat/google/x-c911ea9fa8809661) (Read error: 110 (Connection timed out))
# [23:14] <zcorpan_> hmm. the question is how to handle misnested nodes.
# [23:15] * Joins: hendry (n=hendry@91.84.62.62)
# [23:15] * Quits: h3h_ (n=w3rd@66-162-32-234.static.twtelecom.net)
# [23:15] * Quits: Ducki__ (n=Alex@dialin-212-144-055-244.pools.arcor-ip.net) (Read error: 113 (No route to host))
# [23:17] * Quits: ROBOd (n=robod@86.34.246.154) ("http://www.robodesign.ro")
# [23:19] * Quits: the_mart (n=Martin@host86-135-9-158.range86-135.btcentralplus.com) (kubrick.freenode.net irc.freenode.net)
# [23:19] * Quits: Yudai (n=Yudai@p931010.tokyte00.ap.so-net.ne.jp) (kubrick.freenode.net irc.freenode.net)
# [23:20] * Joins: the_mart (n=Martin@host86-135-9-158.range86-135.btcentralplus.com)
# [23:20] * Joins: Yudai (n=Yudai@p931010.tokyte00.ap.so-net.ne.jp)
# [23:25] <Philip`> zcorpan_: Output "FAIL" and then stop?
# [23:36] * Joins: KevinMarks (i=KevinMar@nat/google/x-2987d34f5000d2a1)
# [23:43] * Joins: weinig_ (i=weinig@nat/apple/x-a4970a9ef18c9aca)
# [23:44] * Quits: weinig (i=weinig@nat/apple/x-7d076cc85573a423) (Read error: 104 (Connection reset by peer))
# [23:44] * othermaciej facepalms at continuing mail from Rob Burns
# [23:46] <zcorpan_> Philip`: yeah... but the recursive algorithm could output the entire tree anyway, which is nicer for debugging
# [23:46] <Philip`> I don't quite see how trying to publish one document after four months counts as "rushing"
# [23:47] <Hixie> <td id="m1" axis="mainMenu" headers="m1" valign="top">
# [23:47] <Hixie> sigh
# [23:47] <zcorpan_> Hixie: hah
# [23:48] <othermaciej> now that's some compact information
# [23:48] <othermaciej> Hixie: is that the sort of thing causing all the cycles?
# [23:52] <Hixie> it's at least one cause
# [23:52] <Hixie> i'm going to rerun the survey with a special hack to count those separately
# [23:54] * Parts: hasather (n=hasather@22.80-203-71.nextgentel.com)
# [23:55] <Hixie> i really have to stop e-mailing public-html
# [23:59] * Joins: tantek (n=tantek@corp.technorati.com)
# Session Close: Wed Jul 04 00:00:00 2007
