- #
- # Session Start: Thu Jul 12 00:00:00 2007
- # Session Ident: #whatwg
- # [00:00] * Quits: weinig (n=weinig@17.255.105.179) (Remote closed the connection)
- # [00:01] * Joins: weinig (n=weinig@17.255.105.179)
- # [00:08] * Quits: weinig (n=weinig@17.255.105.179)
- # [00:12] * Quits: Charl (n=charlvn@net-153-078.mweb.co.za) ("Leaving")
- # [00:15] * Quits: tndH (i=Rob@adsl-87-102-67-108.karoo.KCOM.COM) ("ChatZilla 0.9.78.1-rdmsoft [XULRunner 1.8.0.9/2006120508]")
- # [00:19] * Joins: weinig (n=weinig@17.255.105.179)
- # [00:20] * Joins: jruderman (n=jruderma@corp-242.mountainview.mozilla.com)
- # [00:24] * Quits: Ducki_ (n=Ducki@dialin-145-254-186-150.pools.arcor-ip.net) (Read error: 110 (Connection timed out))
- # [00:29] * Joins: othermaciej (n=mjs@17.255.96.56)
- # [00:33] * Joins: zcorpan_ (n=zcorpan@84-216-42-141.sprayadsl.telenor.se)
- # [00:40] * Joins: othermaciej_ (n=mjs@17.203.15.242)
- # [00:44] * Quits: zcorpan (n=zcorpan@84-216-42-141.sprayadsl.telenor.se) (Read error: 110 (Connection timed out))
- # [00:49] * Quits: weinig (n=weinig@17.255.105.179) (Read error: 110 (Connection timed out))
- # [00:54] * Quits: othermaciej (n=mjs@17.255.96.56) (Connection timed out)
- # [00:55] * Joins: MikeSmith (n=MikeSmit@eM60-254-199-121.pool.emobile.ad.jp)
- # [01:02] * othermaciej_ is now known as othermaciej
- # [01:04] * Joins: csarven (n=nevrasc@modemcable081.152-201-24.mc.videotron.ca)
- # [01:06] * Joins: webben (n=benh@82.152.177.224)
- # [01:06] * Joins: weinig (i=weinig@nat/apple/x-4286c6a5d69799ff)
- # [01:07] * Quits: webben (n=benh@82.152.177.224) (Client Quit)
- # [01:09] * Joins: othermaciej_ (n=mjs@17.255.96.56)
- # [01:12] * Parts: billmason (n=billmaso@ip156.unival.com)
- # [01:13] * Quits: zcorpan_ (n=zcorpan@84-216-42-141.sprayadsl.telenor.se) (Read error: 110 (Connection timed out))
- # [01:17] * Joins: webben (n=benh@82.152.177.224)
- # [01:24] * Quits: othermaciej (n=mjs@17.203.15.242) (Read error: 110 (Connection timed out))
- # [01:27] * Quits: weinig (i=weinig@nat/apple/x-4286c6a5d69799ff) (Read error: 104 (Connection reset by peer))
- # [01:28] * Joins: weinig (i=weinig@nat/apple/x-56380e5f3c86924a)
- # [01:34] * Quits: weinig (i=weinig@nat/apple/x-56380e5f3c86924a) (Read error: 104 (Connection reset by peer))
- # [01:34] * Joins: weinig (i=weinig@nat/apple/x-9442b53c58f4ceb5)
- # [01:53] * Quits: weinig (i=weinig@nat/apple/x-9442b53c58f4ceb5) (Read error: 104 (Connection reset by peer))
- # [01:53] * Joins: weinig (i=weinig@nat/apple/x-c5d2539c439c091f)
- # [01:59] * Joins: karlUshi (n=karl@dhcp-247-173.mag.keio.ac.jp)
- # [02:00] * Quits: mpt (n=mpt@121-72-128-43.dsl.telstraclear.net) ("Leaving")
- # [02:10] * Quits: h3h (n=w3rd@66-162-32-234.static.twtelecom.net) ("|")
- # [02:11] * Joins: mpt (n=mpt@121-72-128-43.dsl.telstraclear.net)
- # [02:35] * Quits: bzed (n=bzed@dslb-084-059-120-182.pools.arcor-ip.net) ("Leaving")
- # [02:56] * Joins: yod (n=ot@dhcp-247-181.mag.keio.ac.jp)
- # [02:56] * Quits: weinig (i=weinig@nat/apple/x-c5d2539c439c091f)
- # [03:11] * Joins: kfish (n=conrad@61.194.21.25)
- # [03:17] * Quits: MikeSmith (n=MikeSmit@eM60-254-199-121.pool.emobile.ad.jp) ("Less talk, more pimp walk.")
- # [03:37] * Joins: weinig (i=weinig@nat/apple/x-a4349c72d0efa10d)
- # [03:54] * Quits: weinig (i=weinig@nat/apple/x-a4349c72d0efa10d)
- # [04:05] * Quits: kfish (n=conrad@61.194.21.25) ("oss!")
- # [04:09] * Quits: yod (n=ot@dhcp-247-181.mag.keio.ac.jp) ("Leaving")
- # [04:13] * Quits: karlUshi (n=karl@dhcp-247-173.mag.keio.ac.jp) ("Where dwelt Ymir, or wherein did he find sustenance?")
- # [04:13] * othermaciej_ is now known as othermaciej
- # [04:16] * Quits: psa (n=yomode@posom.com) (anthony.freenode.net irc.freenode.net)
- # [04:16] * Quits: didymos (i=jho@rapwap.razor.dk) (anthony.freenode.net irc.freenode.net)
- # [04:21] * Joins: didymos (i=jho@rapwap.razor.dk)
- # [04:25] * Joins: othermaciej_ (n=mjs@17.203.15.242)
- # [04:27] * Quits: jruderman (n=jruderma@corp-242.mountainview.mozilla.com)
- # [04:34] * Quits: laug (n=laug@poy.chewa.net) (anthony.freenode.net irc.freenode.net)
- # [04:36] * Joins: laug (n=laug@poy.chewa.net)
- # [04:38] * Quits: othermaciej (n=mjs@17.255.96.56) (Read error: 110 (Connection timed out))
- # [04:42] * Quits: dbaron (n=dbaron@corp-242.mountainview.mozilla.com) ("8403864 bytes have been tenured, next gc will be global.")
- # [04:52] * Joins: MikeSmith (n=MikeSmit@eM60-254-213-172.pool.emobile.ad.jp)
- # [04:58] * Joins: jruderman (n=jruderma@c-67-169-24-116.hsd1.ca.comcast.net)
- # [05:08] * Quits: othermaciej_ (n=mjs@17.203.15.242)
- # [05:12] * Quits: MikeSmith (n=MikeSmit@eM60-254-213-172.pool.emobile.ad.jp) ("Less talk, more pimp walk.")
- # [05:18] * Joins: h3h (n=w3rd@cpe-76-88-44-219.san.res.rr.com)
- # [05:31] * Quits: KevinMarks (i=KevinMar@nat/google/x-5e755f57d4deef4e) (Read error: 110 (Connection timed out))
- # [05:41] * Joins: smfr (n=smfr@netblock-72-25-91-9.dslextreme.com)
- # [05:41] * Quits: smfr (n=smfr@netblock-72-25-91-9.dslextreme.com) (Remote closed the connection)
- # [05:42] * Joins: KevinMarks (i=KevinMar@nat/google/x-91fa22b7585e1628)
- # [05:44] * Quits: csarven (n=nevrasc@modemcable081.152-201-24.mc.videotron.ca) ("http:/www.csarven.ca")
- # [06:01] * Joins: MikeSmith (n=MikeSmit@tea12.w3.mag.keio.ac.jp)
- # [06:06] * Joins: othermaciej (n=mjs@17.203.15.242)
- # [06:16] * Quits: KevinMarks (i=KevinMar@nat/google/x-91fa22b7585e1628) ("The computer fell asleep")
- # [06:21] * Quits: MikeSmith (n=MikeSmit@tea12.w3.mag.keio.ac.jp) ("Less talk, more pimp walk.")
- # [06:26] * Joins: MikeSmith (n=MikeSmit@eM60-254-218-79.pool.emobile.ad.jp)
- # [06:33] * Quits: othermaciej (n=mjs@17.203.15.242) (Read error: 104 (Connection reset by peer))
- # [06:35] * Joins: yod (n=ot@dhcp-247-181.mag.keio.ac.jp)
- # [06:36] * Joins: karlUshi (n=karl@dhcp-247-173.mag.keio.ac.jp)
- # [06:38] * Joins: othermaciej (n=mjs@17.203.15.242)
- # [06:38] * Joins: mksm (n=mksm@201-68-26-43.dsl.telesp.net.br)
- # [07:02] * Joins: weinig (i=weinig@nat/apple/x-fcd2d99cf375dff2)
- # [07:06] * Joins: othermaciej_ (n=mjs@17.203.15.242)
- # [07:06] * Quits: othermaciej (n=mjs@17.203.15.242) (Read error: 104 (Connection reset by peer))
- # [07:20] * Joins: tantek (n=tantek@c-24-6-138-86.hsd1.ca.comcast.net)
- # [07:42] * Quits: karlUshi (n=karl@dhcp-247-173.mag.keio.ac.jp) ("This computer has gone to sleep")
- # [07:44] * Joins: karlUshi (n=karl@133.27.247.173)
- # [07:52] * Quits: mksm (n=mksm@201-68-26-43.dsl.telesp.net.br) (Read error: 110 (Connection timed out))
- # [07:53] * othermaciej_ is now known as othermaciej
- # [08:01] * Joins: KevinMarks (n=KevinMar@c-76-102-254-252.hsd1.ca.comcast.net)
- # [08:15] * Quits: weinig (i=weinig@nat/apple/x-fcd2d99cf375dff2)
- # [08:17] * Quits: MikeSmith (n=MikeSmit@eM60-254-218-79.pool.emobile.ad.jp) (Read error: 110 (Connection timed out))
- # [08:19] * Quits: tantek (n=tantek@c-24-6-138-86.hsd1.ca.comcast.net)
- # [08:21] * Joins: tantek (n=tantek@c-24-6-138-86.hsd1.ca.comcast.net)
- # [08:29] * Quits: h3h (n=w3rd@cpe-76-88-44-219.san.res.rr.com)
- # [08:32] * Joins: Charl (n=charlvn@net-153-078.mweb.co.za)
- # [09:11] * Quits: othermaciej (n=mjs@17.203.15.242)
- # [09:16] * Joins: bzed (n=bzed@dslb-084-059-106-081.pools.arcor-ip.net)
- # [09:32] * Joins: MikeSmith (n=MikeSmit@tea12.w3.mag.keio.ac.jp)
- # [09:49] * Quits: yod (n=ot@dhcp-247-181.mag.keio.ac.jp) ("Leaving")
- # [09:50] * Quits: karlUshi (n=karl@133.27.247.173) ("Where dwelt Ymir, or wherein did he find sustenance?")
- # [09:56] * Quits: webben (n=benh@82.152.177.224)
- # [10:02] * Joins: othermaciej (n=mjs@dsl081-048-145.sfo1.dsl.speakeasy.net)
- # [10:38] * Joins: webben (i=benh@nat/yahoo/x-b00bc501049c9ad7)
- # [10:43] * Quits: webben (i=benh@nat/yahoo/x-b00bc501049c9ad7) (Client Quit)
- # [10:50] * Joins: ROBOd (n=robod@86.34.246.154)
- # [10:51] * Joins: rabies (n=Miranda@p54889723.dip0.t-ipconnect.de)
- # [11:19] <Hixie> hm
- # [11:19] <Hixie> are there really two parse errors for "<!DOCTYPE" but only one for "<!DOCTYPE "?
- # [11:20] * Quits: MikeSmith (n=MikeSmit@tea12.w3.mag.keio.ac.jp) ("Less talk, more pimp walk.")
- # [11:25] * Quits: othermaciej (n=mjs@dsl081-048-145.sfo1.dsl.speakeasy.net)
- # [11:28] <hsivonen> Hixie: yes
- # [11:28] * Joins: Ducki (n=Ducki@dialin-145-254-180-142.pools.arcor-ip.net)
- # [11:29] <hsivonen> Hixie: probably not worth tweaking
- # [11:42] <hsivonen> <b><table><td></b><i></table>X
- # [11:42] <hsivonen> Why isn't <b> supposed to reopen before X?
- # [12:10] <Hixie> isn't it?
- # [12:11] <Hixie> oh because the table is in the <b>
- # [12:11] <Hixie> and so the X is still in the <b>
- # [12:11] <Hixie> the </b> in the above has no effect
- # [12:13] <virtuelv> Hixie: How's Bergen?
- # [12:22] <gsnedders> http://geoffers.no-ip.com/svn/php-html-5-direct/tests/numbersTest
- # [12:24] * Joins: zcorpan_ (n=zcorpan@84-216-41-80.sprayadsl.telenor.se)
- # [12:24] <Hixie> virtuelv: rainy
- # [12:24] * Joins: webben (i=benh@nat/yahoo/x-671e0e02b2b7f7f2)
- # [12:25] <Hixie> gsnedders: does that match the spec or the spec with your proposed changes?
- # [12:25] <gsnedders> Hixie: the spec
- # [12:26] <virtuelv> Hixie: Norway's been pretty much like that for a couple of weeks now
- # [12:26] <gsnedders> Hixie: even when the spec does very odd things (like a list of integers with input "10" outputting [1])
- # [12:27] <Hixie> gsnedders: k
- # [12:27] * Joins: MikeSmith (n=MikeSmit@eM60-254-217-151.pool.emobile.ad.jp)
- # [12:27] <Hixie> gsnedders: can you include that link in one of your e-mails? (or just mail it directly to me ian@hixie.ch) I'll try to look at what browsers do with those tests when I update the spec
- # [12:28] <gsnedders> Hixie: I'm going to email it shortly
- # [12:28] <gsnedders> Hixie: just a few more general issues with the number section, then my review of that is done, and I'll send it off with the final email
- # [12:28] * Quits: MikeSmith (n=MikeSmit@eM60-254-217-151.pool.emobile.ad.jp) (Client Quit)
- # [12:28] * Joins: MikeSmith (n=MikeSmit@eM60-254-217-151.pool.emobile.ad.jp)
- # [12:28] <hsivonen> Hixie: ah, I didn't realize the table was in b. I've got a bug then.
- # [12:30] <virtuelv> Hixie: re DOMContentLoaded - it'd be useful to have some event when the DOM is loaded and styles are available/applied
- # [12:43] * Quits: webben (i=benh@nat/yahoo/x-671e0e02b2b7f7f2)
- # [12:45] * Quits: MikeSmith (n=MikeSmit@eM60-254-217-151.pool.emobile.ad.jp) ("Less talk, more pimp walk.")
- # [12:49] <hsivonen> translating the spec to code would be less error-prone if the spec didn't have gotos that create unnatural loops
- # [12:50] <gsnedders> hsivonen: heh. I ended up with a do {} while (true); in my implementation of the lists of integers.
- # [12:50] <gsnedders> then relying on break and continue statements
- # [12:50] <hsivonen> gsnedders: I'm pretty sure do-while is always natural
- # [12:50] <hsivonen> (natural in the compiler sense)
- # [12:51] <gsnedders> ah. in that sense.
- # [12:51] <gsnedders> (of natural)
- # [12:51] <gsnedders> PHP likely does something odd with it, though, knowing PHP.
- # [12:52] <gsnedders> has anyone apart from zcorpan_ and myself started the spec review, anyway?
- # [12:52] <hsivonen> if I had to guess, my guess would be that even PHP created only natural loops for the purpose of compiler optimization
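
The loop shape gsnedders mentions at [12:50] (an unconditional loop driven by break and continue) is a common way to translate spec prose that says "go back to step N". Below is a minimal Python sketch of that translation. The function `parse_integer_list` and its numbered steps are hypothetical and are not the spec's list-of-integers algorithm, which the log reports as behaving differently on some inputs.

```python
DIGITS = "0123456789"


def parse_integer_list(text):
    """Hypothetical step-style parser: collect runs of ASCII digits as integers."""
    numbers = []
    position = 0
    while True:  # Python's stand-in for do { ... } while (true)
        # Step 1: skip every character that is not an ASCII digit.
        while position < len(text) and text[position] not in DIGITS:
            position += 1
        # Step 2: at end of input, stop ("return the list" in spec-style prose).
        if position >= len(text):
            break
        # Step 3: collect one run of digits and convert it.
        start = position
        while position < len(text) and text[position] in DIGITS:
            position += 1
        numbers.append(int(text[start:position]))
        # Step 4: "go back to step 1" is simply the next loop iteration.
    return numbers


print(parse_integer_list("1, 2;3"))
```

The single exit point at step 2 keeps the loop natural in the compiler sense hsivonen refers to: there is one entry at the top and no jump into the middle of the body.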
- # [12:52] <hsivonen> gsnedders: I'm reviewing the parsing spec as I go
- # [12:53] <hsivonen> gsnedders: I don't have much to say about tokenization, but I have posted remarks about tree building
- # [12:55] <gsnedders> hsivonen: ah. I just haven't seen that much.
- # [12:56] <hsivonen> lost in the flood I guess :-(
- # [12:56] <gsnedders> ah, now I see
- # [12:57] * Joins: MikeSmith (n=MikeSmit@eM60-254-196-101.pool.emobile.ad.jp)
- # [13:03] * Joins: BenWard (i=BenWard@nat/yahoo/x-0eab581352088fa4)
- # [13:03] <Hixie> hsivonen: believe me, the spec doesn't look like what i'd want it to look like if i was doing this from scratch
- # [13:03] <Hixie> anyway, time to be a tourist
- # [13:04] <gsnedders> Hixie: rarely anything ends up as you'd like it to if you started from scratch :P
- # [13:13] * Quits: dolphinling (n=chatzill@rbpool5-46.shoreham.net) (Read error: 110 (Connection timed out))
- # [13:18] * Joins: webben (i=benh@nat/yahoo/x-33e0476f5e45002e)
- # [13:18] * Quits: webben (i=benh@nat/yahoo/x-33e0476f5e45002e) (Client Quit)
- # [13:19] * Joins: webben (i=benh@nat/yahoo/x-48c26d3078c18148)
- # [13:23] * Quits: syp| (n=syp@lasigpc9.epfl.ch) (anthony.freenode.net irc.freenode.net)
- # [13:24] * Joins: billyjack (n=MikeSmit@eM60-254-213-222.pool.emobile.ad.jp)
- # [13:24] * Joins: syp| (n=syp@lasigpc9.epfl.ch)
- # [13:25] * Quits: MikeSmith (n=MikeSmit@eM60-254-196-101.pool.emobile.ad.jp) (Read error: 110 (Connection timed out))
- # [13:26] * Joins: Ducki_ (n=Ducki@dialin-145-254-188-142.pools.arcor-ip.net)
- # [13:27] * Parts: zcorpan_ (n=zcorpan@84-216-41-80.sprayadsl.telenor.se)
- # [13:27] <hsivonen> <a><p>X<a>Y</a>Z</p></a>
- # [13:28] <hsivonen> Why does the first <a> come off the stack before <p> goes in?
- # [13:28] <hsivonen> ooh. does the p get reparented?
- # [13:30] <hsivonen> now I'm confused
- # [13:45] * Joins: met_ (n=Hassman@r5bx220.net.upc.cz)
- # [13:45] <met_> http://ajaxian.com/archives/google-gears-roadmap-and-features
- # [13:45] <hsivonen> ooh! my code lacks step #10 of the AAA!
- # [13:45] * Quits: Ducki (n=Ducki@dialin-145-254-180-142.pools.arcor-ip.net) (Read error: 110 (Connection timed out))
- # [13:46] * Quits: webben (i=benh@nat/yahoo/x-48c26d3078c18148)
- # [13:48] * Joins: yod (n=ot@softbank221018155222.bbtec.net)
- # [13:51] * billyjack is now known as MikeSmith
- # [13:52] <Philip`> gsnedders: In numbersTest: s/dimentions/dimensions/
- # [13:55] <gsnedders> Philip`: fixed
- # [14:01] * Joins: karlUshi (n=karl@124-144-94-188.rev.home.ne.jp)
- # [14:17] * Joins: webben (i=benh@nat/yahoo/x-d086cdddbcd385cf)
- # [14:36] * Quits: Charl (n=charlvn@net-153-078.mweb.co.za) ("Leaving")
- # [14:55] * Joins: Codler (n=Codler@84-218-6-177.eurobelladsl.telenor.se)
- # [14:57] * Quits: karlUshi (n=karl@124-144-94-188.rev.home.ne.jp) ("Where dwelt Ymir, or wherein did he find sustenance?")
- # [15:04] * Joins: webben_ (i=benh@nat/yahoo/x-81b62412ed8678e6)
- # [15:13] * Quits: MikeSmith (n=MikeSmit@eM60-254-213-222.pool.emobile.ad.jp) (Read error: 110 (Connection timed out))
- # [15:17] * Quits: webben (i=benh@nat/yahoo/x-d086cdddbcd385cf) (Read error: 110 (Connection timed out))
- # [15:25] * Quits: webben_ (i=benh@nat/yahoo/x-81b62412ed8678e6) (Connection timed out)
- # [15:26] * Joins: Ducki__ (i=Ducki@dialin-145-254-188-068.pools.arcor-ip.net)
- # [15:27] * Joins: Jero (n=Jero@d207230.upc-d.chello.nl)
- # [15:27] * Joins: webben (i=benh@nat/yahoo/x-12e6faa3dc813625)
- # [15:29] * Quits: Ducki_ (n=Ducki@dialin-145-254-188-142.pools.arcor-ip.net) (Read error: 104 (Connection reset by peer))
- # [15:36] * Joins: MikeSmith (n=MikeSmit@eM60-254-222-202.pool.emobile.ad.jp)
- # [15:43] * Quits: Codler (n=Codler@84-218-6-177.eurobelladsl.telenor.se) ("- nbs-irc 2.21 - www.nbs-irc.net -")
- # [15:52] * Joins: webben_ (i=benh@nat/yahoo/x-27b006c78a1968a7)
- # [15:54] * Joins: tndH (i=Rob@adsl-87-102-67-108.karoo.KCOM.COM)
- # [15:58] * Quits: webben (i=benh@nat/yahoo/x-12e6faa3dc813625) (Connection timed out)
- # [16:13] * Joins: Codler (n=Codler@84-218-6-193.eurobelladsl.telenor.se)
- # [16:18] * Joins: billmason (n=billmaso@ip156.unival.com)
- # [16:20] * Quits: billmason (n=billmaso@ip156.unival.com) (Read error: 104 (Connection reset by peer))
- # [16:22] * Joins: billmason (n=billmaso@ip156.unival.com)
- # [16:23] <gsnedders> Jero: you around?
- # [16:23] <Jero> yup
- # [16:25] <gsnedders> did you start your PHP5 implementation from scratch not knowing that there was a semi-started one before, or some other reason?
- # [16:32] <gsnedders> Jero: and I've started on a 1:1 implementation in PHP, which isn't really so relevant in the real world
- # [16:33] <Jero> gsnedders: correct, I found out later that there was already an HTML5 parser in PHP
- # [16:33] <Jero> gsnedders: but I couldn't access the site (some issues with Trac I believe)
- # [16:34] <gsnedders> Jero: it's not so interesting now. a lot of the code written for it is obsolete
- # [16:34] <gsnedders> http://php-html5lib.dashslot.net/svn/trunk works, though
- # [16:35] <Jero> gsnedders: interesting
- # [16:35] * Quits: webben_ (i=benh@nat/yahoo/x-27b006c78a1968a7)
- # [16:35] <Jero> also, what do you think of my implementation so far?
- # [16:35] <gsnedders> I've never had time to really look into it
- # [16:35] <gsnedders> (due to school, and now trying to get as much of the spec review done as possible before going away in a week)
- # [16:36] <gsnedders> http://geoffers.no-ip.com/svn/php-html-5-direct contains the direct implementation
- # [16:37] <Jero> thanks
- # [16:37] <gsnedders> it's all very slow, though
- # [16:37] <Jero> so is my implementation at the moment :p
- # [16:37] <gsnedders> the direct one will be far slower, though
- # [16:38] <Jero> yeah, i'm sure
- # [16:38] <gsnedders> as the aim is to make absolutely no compromises from the spec
- # [16:38] <gsnedders> which in the case of the tokeniser means one character at a time
- # [16:38] <gsnedders> *means emitting
- # [16:38] <Jero> yeah, that's not a very optimal solution :p
- # [16:39] <Jero> but I guess I've only made three or four changes to the entire parsing algorithm compared to the spec
- # [16:41] <Philip`> If you want to write a new tokeniser in some language, it could perhaps be helpful to build on my work - that has a direct representation of the spec algorithm, and generates C++ or JS code to execute it, and it ought to be fairly quick to do other languages in the same way
- # [16:42] * Quits: moeffju (i=moeffju@ubermutant.net) (Read error: 131 (Connection reset by peer))
- # [16:43] <Philip`> (I need to add some kind of abstraction in the code-generating part - JS was only easy because it's almost entirely identical to C++ except for replacing 'bool' with 'var', and it takes a little bit more effort if you need $s in front of variables)
- # [16:43] <Philip`> (but I'll at least try to create a Perl implementation too, to make sure it's sufficiently portable between languages)
- # [16:50] * Joins: moeffju (i=moeffju@ubermutant.net)
- # [16:52] <gsnedders> Jero: I may, however, try forking off the direct impl and work on optimising it (as that's far nicer than starting from scratch, as I can just rewrite one method at a time)
- # [16:54] * Parts: Codler (n=Codler@84-218-6-193.eurobelladsl.telenor.se)
- # [16:56] <Jero> well, I followed the spec in everything (with three or four exceptions), so that's basically the same as forking off the direct implementation, don't you think?
- # [16:58] <gsnedders> Jero: yes
- # [16:59] <gsnedders> Jero: it would be interesting to compare the two, though (and optimising it won't take overly long to do)
- # [17:00] <Jero> my impl still has a couple of bugs (though most of them are related I think)
- # [17:01] <Jero> and I'm a bit behind when it comes to the last 60 or so revisions
- # [17:01] <gsnedders> heh. any bugs in the direct impl are either PHP bugs or spec bugs
- # [17:02] <gsnedders> and I wouldn't allow any regressions when optimising it
- # [17:03] <Jero> gsnedders: you can contribute to the code if you want to in the future
- # [17:03] <gsnedders> Jero: I'll probably optimise the tokeniser and then see how the two compare, then decide what to do from there
- # [17:04] <Jero> the tokeniser of my implementation you mean?
- # [17:05] <gsnedders> the tokeniser of the direct implementation, then compare it to your tokeniser
- # [17:05] <Jero> that sounds like a good idea
- # [17:06] <Jero> I'll upload the code I have on my PC to the online version of my parser, so you can compare it to the latest and greatest
- # [17:06] <gsnedders> heh. it won't be for a while, though
- # [17:06] <gsnedders> the tokeniser isn't written in the direct impl yet
- # [17:07] <Jero> oh i see :p
- # [17:07] <gsnedders> (which I had actually implied earlier)
- # [17:09] <Jero> also, don't you think it'd be great to have the HTML5's parsing algorithm being used by the built-in DOMDocument->loadHTML() function in PHP?
- # [17:10] <Jero> ATM that function uses the libxml2 HTML parser
- # [17:10] <gsnedders> Jero: as if you're ever gonna persuade the PHP devs to implement a draft standard…
- # [17:10] <Jero> don't worry, it was just an idea..
- # [17:11] <gsnedders> Jero: it took me many, many, many years to persuade them of a bug in strip_tags(), which they kept writing off as being invalid HTML (as the aim there is to use a basic parser that'll work with valid HTML) despite me citing specific parts of the specification that clearly said otherwise
- # [17:13] <Jero> heh
- # [17:14] <gsnedders> I bet they didn't have a copy of the SGML spec, and were simply saying what they thought was right.
- # [17:14] <gsnedders> (it's actually something that despite being part of the SGML spec is relevant)
- # [17:17] <Jero> what was the bug?
- # [17:18] <gsnedders> U+003E within quoted attribute values
- # [17:18] <gsnedders> it probably breaks if you mix single and double quotes, actually
- # [17:18] <gsnedders> e.g., <foo bar="this'> is parsed as a single |foo| element where @bar=this
- # [17:20] <Jero> so it closes the value of bar upon seeing the ' character?
- # [17:20] <gsnedders> yes
- # [17:20] <Jero> that is indeed very weird
- # [17:21] <Jero> and what was their argument?
- # [17:22] <gsnedders> actually, that does work correctly
- # [17:22] <gsnedders> var_dump(strip_tags('<foo bar="this\'>">')); indeed produces string(0) ""
- # [17:22] <gsnedders> Jero: for the > bug? that it was invalid HTML.
- # [17:22] <gsnedders> Jero: for the latter? I only just thought of it
- # [17:23] <Jero> i see
- # [17:23] <gsnedders> the former is untrue, as it is completely valid
- # [17:24] <gsnedders> [^<&] off the top of my head
- # [17:25] <Jero> heh
- # [17:25] <Jero> and they still haven't fixed it?
- # [17:25] <gsnedders> the former is fixed in 5.2.2, IIRC
- # [17:26] <gsnedders> only 5, though
- # [17:26] * Joins: Ducki_ (n=Ducki@dialin-212-144-065-008.pools.arcor-ip.net)
- # [17:26] <gsnedders> the same patch would apply against 4.4 fine, but it's unfixed
- # [17:27] <Jero> that's stupid
- # [17:28] <gsnedders> typical of PHP development, though
- # [17:29] * Joins: hasather (n=hasather@22.80-203-71.nextgentel.com)
- # [17:29] <Jero> that's too bad
- # [17:29] <Philip`> <foo <bar=<bar> is syntactically valid in HTML5 now - only ["&] (or ['&] or (\s|&)) does anything
- # [17:30] * Philip` wonders how that will mess up strip_tags
- # [17:30] <gsnedders> Jero: http://cvs.php.net/viewvc.cgi/php-src/ext/standard/tests/strings/bug40432.phpt?revision=1.2&view=markup&pathrev=MAIN
- # [17:30] <Jero> thanks
- # [17:31] <gsnedders> I think I saw it fail in 5.2.3, actually
- # [17:32] <gsnedders> Philip`: http://cvs.php.net/viewvc.cgi/php-src/ext/standard/string.c?view=markup — search for php_u_strip_tags
- # [17:32] <gsnedders> Philip`: string(0) "" is PHP 5.2.3's output, though
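
The strip_tags() issue discussed from [17:18] onwards comes down to whether a U+003E (>) inside a quoted attribute value ends the tag. A minimal Python sketch of a quote-aware stripper follows. The function `strip_tags_sketch` is hypothetical and is not PHP's implementation; it only illustrates the rule that nothing but the matching quote character closes an attribute value.

```python
def strip_tags_sketch(markup):
    """Remove <...> tags, treating '>' inside a quoted attribute value as data."""
    output = []
    in_tag = False
    quote = None  # quote character that opened the current attribute value
    for char in markup:
        if not in_tag:
            if char == "<":
                in_tag = True
            else:
                output.append(char)
        elif quote is not None:
            # Inside a quoted value: only the matching quote ends it.
            if char == quote:
                quote = None
        elif char in "\"'":
            quote = char
        elif char == ">":
            in_tag = False
    return "".join(output)


print(repr(strip_tags_sketch('<foo bar="this\'>">')))
```

With the input from [17:22], the single quote and the first > are both treated as part of the double-quoted value, so the whole string is one tag and the result is empty, matching the string(0) "" output quoted in the log.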
- # [17:37] * Joins: h3h (n=w3rd@cpe-76-88-44-219.san.res.rr.com)
- # [17:37] <Jero> gsnedders, i'm off, if you ever need me regarding my HTML5 parser, email me at [censored :)]
- # [17:38] <gsnedders> Jero: I'll be around here if you ever want me
- # [17:38] <Jero> alrighty, bye
- # [17:38] * Quits: Jero (n=Jero@d207230.upc-d.chello.nl) ("ChatZilla 0.9.78.1 [Firefox 2.0.0.4/2007051502]")
- # [17:44] * Quits: Ducki__ (i=Ducki@dialin-145-254-188-068.pools.arcor-ip.net) (Read error: 110 (Connection timed out))
- # [17:46] * Joins: maikmerten (n=maikmert@T714f.t.pppool.de)
- # [17:54] * Quits: yod (n=ot@softbank221018155222.bbtec.net) ("Leaving")
- # [17:57] * Quits: rabies (n=Miranda@p54889723.dip0.t-ipconnect.de)
- # [18:00] * Quits: h3h (n=w3rd@cpe-76-88-44-219.san.res.rr.com)
- # [18:05] * Joins: weinig (i=weinig@nat/apple/x-61a6746577b718b4)
- # [18:10] * Quits: KevinMarks (n=KevinMar@c-76-102-254-252.hsd1.ca.comcast.net) ("The computer fell asleep")
- # [18:11] * Joins: webben (i=benh@nat/yahoo/x-0f6da64dc24fa572)
- # [18:15] * Joins: mksm (n=mksm@201-68-26-43.dsl.telesp.net.br)
- # [18:32] * Joins: othermaciej (n=mjs@dsl081-048-145.sfo1.dsl.speakeasy.net)
- # [18:34] <gsnedders> jgraham: do you really think that those tests would be that hard to get working in another language? the script I use to parse it is in the repos
- # [18:34] * Quits: webben (i=benh@nat/yahoo/x-0f6da64dc24fa572) (Read error: 104 (Connection reset by peer))
- # [18:35] * Joins: h3h (n=w3rd@66-162-32-234.static.twtelecom.net)
- # [18:36] <gsnedders> jgraham: I didn't want to copy the html5lib test cases format as it would mean I'd need the input data repeated multiple times for each algorithm
- # [18:37] * Joins: Darkluna (n=Codler@84-218-6-217.eurobelladsl.telenor.se)
- # [18:37] * Quits: BenWard (i=BenWard@nat/yahoo/x-0eab581352088fa4) ("Fades out again…")
- # [18:42] <Philip`> gsnedders: It would probably be useful to give more detail on the test format, like how it represents arrays and strings
- # [18:42] <Philip`> or just use JSON since that already defines those things and everyone has JSON parsers already :-)
- # [18:43] <gsnedders> and have each test as an object with an array of results?
- # [18:47] <gsnedders> Philip`: but yeah, the documentation was thrown together very quickly
- # [18:50] * Joins: Codler (n=Codler@84-218-6-220.eurobelladsl.telenor.se)
- # [18:56] * Quits: weinig (i=weinig@nat/apple/x-61a6746577b718b4) (Read error: 110 (Connection timed out))
- # [19:00] <Philip`> gsnedders: I was thinking of something like [["Empty string", "", false, false, false, null, "", []], ...], since that's about the same as what you have already but more JSONic, but maybe ["Empty string", "", { "unsigned":false, "signed":false, "real":false, ... }] would be more easily extensible
- # [19:01] <gsnedders> Philip`: I was thinking {"":[false,false,false,null,null,[]]}
- # [19:01] * Quits: Darkluna (n=Codler@84-218-6-217.eurobelladsl.telenor.se) (Connection timed out)
- # [19:01] <Philip`> It'd be nice if JSON allowed you to keep comments
- # [19:02] <gsnedders> Philip`: there are only headers for large groups of tests, so I don't feel that much about keeping them
- # [19:04] <Philip`> What about XML? <numbertest><!-- Empty string --><input></input><outputs><output algorithm="unsigned"><false/></output><output algorithm="integerlist"><items/></output>...
- # [19:04] <gsnedders> that means defining data types and the like
- # [19:04] <Philip`> Hmm, maybe the [false,false,...] one is easiest
- # [19:06] <Philip`> In any case, it does seem probably easier to use JSON rather than a custom data format when you have arrays and non-ASCII strings, to avoid making every implementor implement another test parser
- # [19:07] <gsnedders> that's true
- # [19:07] <gsnedders> just lack of comments in JSON is annoying
- # [19:08] <gsnedders> around 15 minutes to be completely happy with a JSON version of the test suite… not overly slow…
- # [19:08] <Philip`> (JSON is also quite handy when you're running tests in web browsers)
- # [19:09] <gsnedders> (It would've been easier if it were possible to get pretty printing of JSON in PHP)
- # [19:09] <gsnedders> (as I just hacked my existing parser)
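
The compact layout gsnedders proposes at [19:01], one JSON object mapping each input string to an array of expected results, can be read with any stock JSON parser. The Python sketch below assumes a fixed column order. The names in `names` are placeholders: the log only mentions "unsigned", "signed", "real" and "integerlist", and does not say which column is which.

```python
import json

# Sample taken from the log at [19:01]: the empty string and its six expected results.
SAMPLE = '{"": [false, false, false, null, null, []]}'


def load_tests(document, algorithm_names):
    """Map each input string to a dict of {algorithm name: expected result}."""
    tests = {}
    for input_text, expected in json.loads(document).items():
        if len(expected) != len(algorithm_names):
            raise ValueError("wrong number of results for %r" % input_text)
        tests[input_text] = dict(zip(algorithm_names, expected))
    return tests


# Placeholder column names (an assumption, not taken from the test suite).
names = ["unsigned", "signed", "real", "fourth", "fifth", "integerlist"]
tests = load_tests(SAMPLE, names)
print(tests[""]["integerlist"])
```

Naming the columns at load time gives some of the extensibility of the keyed form Philip` suggests at [19:00] while keeping the file itself as terse as the array form.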
- # [19:20] * Joins: dbaron (n=dbaron@corp-241.mountainview.mozilla.com)
- # [19:26] * Joins: Ducki__ (n=Ducki@dialin-145-254-188-044.pools.arcor-ip.net)
- # [19:32] * Quits: othermaciej (n=mjs@dsl081-048-145.sfo1.dsl.speakeasy.net) (Read error: 110 (Connection timed out))
- # [19:34] * Quits: Ducki_ (n=Ducki@dialin-212-144-065-008.pools.arcor-ip.net) (Read error: 113 (No route to host))
- # [19:41] * Joins: weinig (i=weinig@nat/apple/x-62a4bc8b3855df4e)
- # [19:56] * Quits: weinig (i=weinig@nat/apple/x-62a4bc8b3855df4e)
- # [20:20] * Joins: kingryan (n=kingryan@corp.technorati.com)
- # [20:28] * Quits: tantek (n=tantek@c-24-6-138-86.hsd1.ca.comcast.net)
- # [20:34] * Joins: KevinMarks (i=KevinMar@nat/google/x-d948542ea06c46c5)
- # [20:49] * Joins: hendry (n=hendry@91.84.62.62)
- # [21:00] * Parts: hasather (n=hasather@22.80-203-71.nextgentel.com)
- # [21:01] * Joins: hasather (n=hasather@22.80-203-71.nextgentel.com)
- # [21:01] * Quits: tndH (i=Rob@adsl-87-102-67-108.karoo.KCOM.COM) (Read error: 110 (Connection timed out))
- # [21:05] * Quits: Codler (n=Codler@84-218-6-220.eurobelladsl.telenor.se) ("- nbs-irc 2.21 - www.nbs-irc.net -")
- # [21:06] * moeffju is now known as moeffju[afk]
- # [21:11] * Quits: KevinMarks (i=KevinMar@nat/google/x-d948542ea06c46c5) (Read error: 110 (Connection timed out))
- # [21:22] <gsnedders> jgraham: just looking at the PHPUnit compiled version of the tests?
- # [21:26] * Joins: Ducki_ (n=Ducki@dialin-145-254-180-237.pools.arcor-ip.net)
- # [21:36] * Joins: tndH (i=Rob@adsl-87-102-67-108.karoo.KCOM.COM)
- # [21:41] * moeffju[afk] is now known as moeffju
- # [21:45] * Quits: MikeSmith (n=MikeSmit@eM60-254-222-202.pool.emobile.ad.jp) (Read error: 110 (Connection timed out))
- # [21:47] * Joins: MikeSmith (n=MikeSmit@eM60-254-198-69.pool.emobile.ad.jp)
- # [21:48] * Quits: Ducki__ (n=Ducki@dialin-145-254-188-044.pools.arcor-ip.net) (Read error: 113 (No route to host))
- # [22:04] * Quits: maikmerten (n=maikmert@T714f.t.pppool.de) ("Leaving")
- # [22:12] * Joins: virtuelv_ (n=virtuelv@47.80-202-66.nextgentel.com)
- # [22:21] * Joins: KevinMarks (i=KevinMar@nat/google/x-5e37e3882e0a78d8)
- # [22:26] <virtuelv_> is it defined anywhere what the implied DOM should be like when using createHTMLDocument()?
- # [22:27] <virtuelv_> (iow: what should the DOM be like given var doc = document.implementation.createHTMLDocument("");
- # [22:27] <virtuelv_> doc.documentElement.innerHTML = "<h1>What</h1>";
- # [22:27] <virtuelv_> alert(doc.documentElement.outerHTML);
- # [22:28] <virtuelv_> what should be alerted?
- # [22:29] * Quits: met_ (n=Hassman@r5bx220.net.upc.cz) ("Chemists never die, they just stop reacting.")
- # [22:29] <jgraham> gsnedders: Yeah, for some reason I looked at the PHP version
- # [22:30] <gsnedders> jgraham: yeah. that'd be nigh impossible to parse. there's now a JSON version of the tests in the repo as well, though
- # [22:30] <gsnedders> (but that loses some data, like not distinguishing between ints and floats)
- # [22:32] <Philip`> Could you store floats as strings instead of numbers?
- # [22:33] <gsnedders> then parse the string?
- # [22:33] <gsnedders> hmmm…
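
Philip`'s suggestion at [22:32] can be sketched as a pair of helpers that write floats as JSON strings and turn them back into floats on load. This is a minimal Python illustration. The helper names are hypothetical, and it assumes no expected result is itself a string, so any string in the data can safely be read back as a float.

```python
import json


def encode_result(value):
    """Encode floats as strings so a reader cannot confuse 1.0 with 1."""
    if isinstance(value, float):
        return repr(value)
    if isinstance(value, list):
        return [encode_result(item) for item in value]
    return value  # ints, booleans and None pass through unchanged


def decode_result(value):
    """Reverse encode_result: every string is taken to be a float."""
    if isinstance(value, str):
        return float(value)
    if isinstance(value, list):
        return [decode_result(item) for item in value]
    return value


encoded = json.dumps(encode_result([1, 1.0, False, None]))
decoded = decode_result(json.loads(encoded))
print(encoded)
```

The cost is the extra parsing step gsnedders hesitates over at [22:33]: every consumer of the tests has to know that strings stand for floats.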
- # [22:43] * Joins: zcorpan_ (n=zcorpan@90-229-146-10-no117.tbcn.telia.com)
- # [23:04] * Quits: ROBOd (n=robod@86.34.246.154) ("http://www.robodesign.ro")
- # [23:26] * Quits: virtuelv_ (n=virtuelv@47.80-202-66.nextgentel.com) ("Leaving")
- # [23:26] * Joins: Ducki__ (n=Ducki@dialin-145-254-187-229.pools.arcor-ip.net)
- # [23:26] * Quits: jruderman (n=jruderma@c-67-169-24-116.hsd1.ca.comcast.net)
- # [23:33] * Quits: zcorpan_ (n=zcorpan@90-229-146-10-no117.tbcn.telia.com) (Read error: 110 (Connection timed out))
- # [23:45] * Quits: Ducki__ (n=Ducki@dialin-145-254-187-229.pools.arcor-ip.net) (Read error: 104 (Connection reset by peer))
- # [23:45] * Quits: Ducki_ (n=Ducki@dialin-145-254-180-237.pools.arcor-ip.net) (Read error: 113 (No route to host))
- # [23:47] * Parts: hasather (n=hasather@22.80-203-71.nextgentel.com)
- # [23:50] * Joins: jruderman (n=jruderma@corp-241.mountainview.mozilla.com)
- # Session Close: Fri Jul 13 00:00:00 2007