# Session Start: Sat Mar 31 00:00:01 2007
# Session Ident: #html-wg
# [00:04] * Joins: gavin (gavin@74.103.208.221)
# [00:29] * Quits: kingryan (kingryan@66.92.187.33) (Quit: kingryan)
# [00:30] <hsivonen> regarding respecting MIME: in my experience, if one implements XML content types properly, it is essential to provide a checkbox for turning it off
# [00:30] <hsivonen> e.g. the W3C serves XML DTDs as text/plain...
# [00:53] <mjs> and of course anything you need a checkbox to turn off is something you probably shouldn't do
# [00:58] <hsivonen> mjs: but I wanted to be eligible for the t-shirt: http://www.cafepress.com/feedparser
# [00:59] <hsivonen> mjs: seriously though, my app is all about finding spec violations
# [00:59] <hsivonen> mjs: so I guess by default I have to flag even RFC 3023 stuff
# [00:59] <mjs> hsivonen: well, a conformance checker is obviously a different beast than an app to present content to end users
# [01:09] <Lachy> hsivonen, if you were to remove the option and rely on content sniffing, then the other reasonable alternative is just to output an error for the MIME type problems and then continue as if the correct types had been sent
# [01:19] * Quits: tylerr (tylerr@66.195.32.2) (Quit: Leaving)
# [01:27] <hsivonen> Lachy: defaulting to US-ASCII doesn't fit that solution, but I guess I could make text/plain as an XML type a non-fatal error
# [01:28] <hsivonen> I want to keep image/jpeg and the like fatal, though
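A minimal sketch of the severity policy hsivonen describes above for a conformance checker that expects XML. The helper name `classify_xml_content_type`, the severity labels, and the exact set of accepted types are assumptions made for illustration; this is not his checker's actual code.

```python
# Hypothetical helper: names and severity labels are illustrative assumptions.
XML_TYPES = {"text/xml", "application/xml"}


def classify_xml_content_type(content_type: str) -> str:
    """Return 'ok', 'error' (non-fatal) or 'fatal' for a resource expected to be XML."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime in XML_TYPES or mime.endswith("+xml"):
        return "ok"
    if mime == "text/plain":
        # Commonly mislabelled XML-related resources: report the problem, keep going.
        return "error"
    # Anything else, such as image/jpeg, stops processing.
    return "fatal"
```

With this split, `text/plain` is reported but processing continues, while `image/jpeg` stops the check outright.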
# [01:49] * Quits: Zeros (Zeros-Elip@67.154.87.254) (Ping timeout)
# [01:50] * Parts: hasather (hasather@81.235.209.174)
# [02:06] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
# [02:10] * Quits: AnPol (anpol@85.118.224.254) (Quit: Bye)
# [02:11] * Joins: gavin (gavin@74.103.208.221)
# [02:22] * Quits: h3h (bfults@66.162.32.234) (Quit: |)
# [02:31] * Joins: Zeros (Zeros-Elip@69.140.48.129)
# [02:31] * Quits: Zeros (Zeros-Elip@69.140.48.129) (Client exited)
# [03:11] * Joins: DougJ (djones4@74.76.23.86)
# [03:14] <Grauw> [05:31] <anne> Does anyone actually implement new versions of XSLT?
# [03:15] <Grauw> I've used XSLT2 extensively. In fact, I couldn't bear to use XSLT1 anymore, it just misses too much that I've gotten used to :)
# [03:16] <Grauw> [04:21] <DanC> Hixie, do you take special classes on how to be rude, or do you just do it naturally?
# [03:16] <Grauw> ehehe
# [03:17] <Grauw> there's a fad with making sarcastic remarks about the W3C efforts nowadays, it's honestly not that bad, if perhaps not directly applying to the goals of HTML5
# [03:18] <Grauw> but anyway
# [03:21] * dbaron RRSAgent, pointer?
# [03:21] * RRSAgent See http://www.w3.org/2007/03/31-html-wg-irc#T01-15-01
# [03:22] * Quits: Ashe (Ashe@213.47.199.86) (Ping timeout)
# [03:24] <mjs> Grauw: I don't think Hixie is the only one to think the TAG's findings are of questionable value
# [03:24] <Grauw> I'm not saying he is
# [03:25] <Grauw> saying that there's a 'fad' implies that there is more than one person doing it :)
# [03:25] <dbaron> I completely disagree with a whole bunch of the findings.
# [03:25] <mjs> 'fad' implies that (a) it's invalid and (b) it will pass
# [03:27] <Grauw> let's rephrase it then, I think there's a lot of criticism on W3C specs and effort on small details, and specs as a whole are named worthless for that
# [03:27] <Grauw> whereas the specs as a whole are pretty nice
# [03:27] <Grauw> e.g. XHTML2 introduces a lot of nice things, so within its own space of existence I would say that you could call it a nice specification
# [03:28] <mjs> XHTML2 is a bunch of bad ideas poorly specified
# [03:28] <mjs> to call it a "nice specification" you have to water down the meaning of that phrase a lot
# [03:29] <Grauw> however some people are criticising that it doesn't specify every implementation detail, and discarding the entire spec or the ideas in it as bad
# [03:29] <Grauw> or at least make it sound like that, and cause other people to do so
# [03:29] * Quits: dbaron (dbaron@63.245.220.242) (Quit: 8403864 bytes have been tenured, next gc will be global.)
# [03:30] <mjs> I think many of the ideas in it are also bad
# [03:30] <mjs> though some are good
# [03:30] <Grauw> well a lot of it is a matter of perspective
# [03:30] <mjs> its basic idea of breaking compatibility is bad
# [03:31] <Grauw> if I hear Anne going on about XML error handling it also makes me cringe :)
# [03:31] <Grauw> mjs, maybe, that's the key difference that HTML5 does different
# [03:32] <Grauw> however you should consider the ideas within the space of thought of breaking compatibility (and they didn’t entirely, btw)
# [03:32] <Grauw> anyway
# [03:32] <Grauw> I need to go :)
# [03:32] <Grauw> I'm being called ^_^
# [03:32] <Grauw> later
# [03:34] <Hixie> if you look at them in a vacuum, xforms is a nice spec. xhtml2 is not.
# [03:34] <Hixie> imho.
# [03:34] <Hixie> xhtml2 leaves far too much very vague and has basically no ua conformance criteria to speak of
# [03:34] <Grauw> (not gone yet) I suppose you could indeed say that XForms is a much better spec than XHTML2
# [03:36] <mjs> xhtml2 is a bad spec qua spec as well as an unsuitable design for the web
# [03:36] <Grauw> maybe as for XHTML2 what I mean is people should look at the ideas that it has (and people have, you said you have too, and a fair amount ended up in HTML5)
# [03:36] <mjs> in the case of xforms only the latter is true
# [03:36] <Grauw> I disagree with that :) I don't see how it's unsuitable for the web
# [03:37] <Grauw> but anyway, I need to go, so don't remark on that :)
# [03:38] <Hixie> ideas don't make a spec
# [03:39] * Joins: Ashe (Ashe@213.47.199.86)
# [04:04] * Parts: DougJ (djones4@74.76.23.86)
# [04:14] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
# [04:19] * Joins: gavin (gavin@74.103.208.221)
# [05:06] * Quits: mjs (mjs@17.255.99.50) (Quit: mjs)
# [05:07] * Joins: mjs (mjs@17.255.99.50)
# [05:13] <Grauw> w.r.t. that web architecture document quote, "A data format specification SHOULD provide for version information."
# [05:13] <Grauw> I don't think you can consider CSS a data format. HTML, maybe, but I think whether or not to include versioning information is not a general rule but rather based on a design decision in the language
# [05:13] * Quits: mjs (mjs@17.255.99.50) (Quit: mjs)
# [05:14] <Grauw> finally, in most XML-based formats, the XML namespace provides sufficient means for versioning
# [05:14] <Grauw> and no additional version information should be provided. backwards-compatibility breaking changes should change the namespace
# [05:19] * Joins: sbuluf (hm@200.49.140.156)
# [05:33] * Quits: primal1 (primal1@72.87.242.30) (Quit: primal1)
# [05:50] * Joins: MikeSmith (MikeSmith@mcclure.w3.org)
# [06:11] * Joins: Zeros (Zeros-Elip@69.140.48.129)
# [06:11] * Quits: Zeros (Zeros-Elip@69.140.48.129) (Quit: Leaving)
# [06:11] * Joins: Zeros (Zeros-Elip@69.140.48.129)
# [06:21] * Joins: DougJ (djones4@24.213.244.253)
# [06:22] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
# [06:27] * Joins: gavin (gavin@74.103.208.221)
# [06:27] * Joins: dbaron (dbaron@71.198.189.81)
# [06:33] * Joins: mjs (mjs@64.81.48.145)
# [06:44] * Quits: DanC (connolly@128.30.52.30) (Quit: Client exiting)
# [06:45] <Grauw> I wish people would stop the debate over <abbr> and <acronym>. There is no way in hell that a screenreader can know <acronym>SMIL</acronym> is to be pronounced like ‘smile’, nor can it know that <acronym>SQL</acronym> has to be pronounced ‘sequel’. Unless it has a list of pronunciations built-in, in which case the author doesn’t need to indicate any distinction in the first place.
# [06:46] <Hixie> css isn't a data format?
# [06:46] <Hixie> what is it then?
# [06:46] <Grauw> the only real solution would be to provide an attribute with a pronunciation key in some phonetic alphabet
# [06:47] <Grauw> Hixie: what I mean is that it doesn’t convey information
# [06:47] <Grauw> it merely conveys how to present the information
# [06:47] <jjb> Grauw: that's why we need to spec in a "phonetic" attribute for <acronym>, to be specified in IPA symbols
# [06:47] <jjb> (just kidding)
# [06:47] <Grauw> jjb, yay for unicode ;p
# [06:48] <jjb> ha, i see you made the joke before i had time to put mine together
# [06:48] <Grauw> I'm all for that, with the proper IPA symbols ;p
# [06:48] <jjb> in theory, i wouldn't mind that either, but somehow i think it wouldn't pan out so great in the market...
# [06:48] <Hixie> you don't think "i want h1s to be rendered in green" is information?
# [06:48] <Grauw> somehow, ne :)
# [06:49] <Grauw> Hixie, ... I think you’re missing my point :)
# [06:49] <Hixie> yes
# [06:49] <Hixie> i don't understand your point :-)
# [06:50] <Grauw> to me information is primarily stuff like address information, etc, something you would express with RDF
# [06:50] <Grauw> CSS in itself as a format of course conveys information on the presentation
# [06:50] <Grauw> but it doesn’t carry semantics
# [06:50] <Hixie> you could express CSS in RDF
# [06:50] <Grauw> of course you can
# [06:50] <Hixie> so...?
# [06:50] <Grauw> but you wouldn't :)
# [06:51] <Hixie> I wouldn't express _anything_ in RDF
# [06:51] <Lachy> LOL!
# [06:51] <Hixie> i don't understand why css is different from html in terms of deserving versioning information
# [06:52] <Grauw> I don't think it is
# [06:52] <Hixie> ok then
# [06:52] <Grauw> I'm not trying to make an argument for versioning information, I said it depends on design decisions of the language
# [06:53] <Grauw> and that maybe the reference to ‘data format’ could be interpreted as so
# [06:53] <Lachy> Grauw, when would it ever be a good design decision to include versioning?
# [06:53] <Grauw> that it would only cover ‘semantic’ information
# [06:54] <Grauw> for implementors versioning sucks, but I think there’s been a fair number of arguments presented where it’s useful as well
# [06:54] <Grauw> even if just so that if you see the document, you know ‘oh it was documented according to this version of the standard. I’ll stick to that then, so that the risk of adding stuff that doesn’t work in the systems it’s intended to be used for is minimal’
# [06:55] <Hixie> why would a specification's version be useful for that? surely you'd want to specify an authoring subset or at most an implementation version
# [06:56] <Hixie> language versions have very little bearing on what set of features authors use
# [06:56] <Grauw> Backbase software has been through three major versions by now, all incompatible
# [06:56] <Grauw> even among minor versions there have been some incompatibilities (unintentionally, I suppose)
# [06:57] <Grauw> the version information is not indicated however, because it simply depends on which version of the library is included, and which namespace is used
# [06:57] <Hixie> right but with HTML you have more than one implementation
# [06:58] <Hixie> so even with one _language_ version you'll have incompatibilities
# [06:58] <Hixie> e.g. as now with CSS and HTML and DOM and JS and so forth
# [06:58] <Grauw> yeah, so that’s why it’s a per-language consideration
# [06:58] <Hixie> no
# [06:58] <Hixie> it's a per IMPLEMENTATION consideration at most
# [06:58] <Grauw> ...depends on how well the implementations are done, I think.
# [06:59] <Grauw> with the browsers, we got unlucky in that it’s kind of a mess
# [06:59] <Zeros> versioning works if it's enforced in the implementation
# [06:59] <Grauw> but if you look at internet standards like IP, etc
# [06:59] <Grauw> you can’t really implement it ‘incompletely’
# [07:00] <Zeros> If Javascript 2 only works when you specify that version there's no room for erroneous mixing in features, and you can fix things later with explicit reference to what set of features you're using
# [07:01] <Grauw> a problem with a lot of ideas that some people have (like anne's xml-should-have-error-recovery) is that it’s a view that really doesn’t apply to all areas
# [07:01] <Zeros> Some XML implementations have some level of error recovery :)
# [07:01] <Grauw> e.g. programming languages are usually not forgiving in their syntax error handling, yet they have been used widely and it forms no obstacle to programmers (rather, a help)
# [07:01] <Zeros> Webkit doesn't show a yellow screen when it sees a malformed document
# [07:02] <Zeros> Grauw, the counter argument is that all media decoders (mp3, video etc.) implement error recovery when the file is malformed
# [07:02] <Grauw> XML is very often used to contain a programming language as well, and for data I also think it would be bad to allow the risk of data being incorrectly error-corrected to say something else than that it actually was supposed to mean
# [07:03] <Grauw> so it’s very specific to the domain
# [07:03] <Grauw> same is true for versioning information I think
# [07:03] <Grauw> or feature indication (pretty much the same thing, but more flexible)
# [07:04] <Lachy> Zeros, WebKit does show an error for malformed content (although it's not yellow like Mozilla)
# [07:04] * Quits: DougJ (djones4@24.213.244.253) (Quit: DougJ)
# [07:04] <Zeros> Lachy, it shows the page rendered up until that point, mozilla shows nothing but the error
# [07:04] <Grauw> Zeros, yeah, I hate malformed MP3s with their stupid pops
# [07:04] <Grauw> if they just didn’t work at all, people wouldn’t share them
# [07:04] <Grauw> and I wouldn’t be bothered by them
# [07:04] <Zeros> Lachy, One is better for programmers, the other for users
# [07:04] <Lachy> oh, I see what you mean
# [07:05] <Grauw> I think both approaches are ok, as long as there’s something that forces the programmers of the backend to implement something that never generates invalid XML
# [07:06] * Joins: DougJ (djones4@24.213.244.253)
# [07:06] <Lachy> unfortunately, that doesn't always happen in reality
# [07:06] <Grauw> if all browsers support XHTML, it will, for sites that serve XHTML
# [07:06] <Lachy> There's plenty of systems out there that claim to support XML, but only have tag soup parsers
# [07:07] <Zeros> Lachy, The major issue I think is that XHTML is trivial to hand author and generate invalid results
# [07:07] <Grauw> because they’re forced to, in the same manner as programmers are forced to make sure their code compiles
# [07:07] <Grauw> anyway, have to go (again), I’ll read the backlog when I get back :)
# [07:07] <Zeros> It's more complicated for your word document to become invalid, so most word documents are valid
# [07:07] <Lachy> e.g. feed readers, mobile phones, etc. all use tag soup parsers for XML
# [07:07] <Zeros> With XHTML it's pretty easy to introduce an anomaly which would invalidate the entire document.
# [07:07] <Grauw> my feed reader (Sage) uses a true XML parser, and it works just fine (most of the time ;p)
# [07:08] <Lachy> there are many that don't, though
# [07:08] <Zeros> libxml2 works pretty well :)
# [07:08] <Grauw> sure, and I don’t think that’s a good thing
# [07:08] <Grauw> anyway
# [07:09] * Quits: MikeSmith (MikeSmith@mcclure.w3.org) (Quit: Get thee behind me, satan.)
# [07:10] <Zeros> How long does it usually take for sysreq to respond?
# [07:11] * Parts: DougJ (djones4@24.213.244.253)
# [07:12] <Lachy> Zeros, what's your real name, or the name you use on the mailing list?
# [07:13] <Zeros> I'm not on the list yet, I filled out the form and realized I entered my last name twice.
# [07:14] <Zeros> Was going to join
# [07:14] <Zeros> Can you fix it?
# [07:14] <Lachy> no, ask Dan or Karl
# [07:14] <Zeros> alright, thanks
# [07:15] <Lachy> typing /whois says your name is Elliot. What's your last name?
# [07:15] <Zeros> Elliott Sprehn
# [07:15] <Zeros> and you?
# [07:15] <Lachy> Lachlan Hunt
# [07:16] * Lachy thought everyone knew that
# [07:16] <Zeros> I guess I'm not anyone
# [07:56] <Lachy> the logs from in here last night are funny :-)
# [07:57] <Lachy> especially the stuff about the TAG
# [08:20] * Joins: marcos (chatzilla@203.206.31.102)
# [08:28] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
# [08:33] * Joins: gavin (gavin@74.103.208.221)
# [08:50] * Quits: st (st@62.234.155.214) (Quit: st)
# [08:54] * Joins: DougJ (djones4@24.213.244.253)
# [08:55] * Quits: DougJ (djones4@24.213.244.253) (Quit: DougJ)
# [10:00] * Joins: chaals (chaals@194.182.142.5)
# [10:14] <anne> That removing namespace-well-formedness in XML somehow makes documents no longer trustable is a myth. If a document needs to be trustable you verify whether or not it meets your criteria. You can't trust a document solely based on the syntax. That'd be silly.
# [10:14] * Parts: anne (annevk@81.68.67.12)
# [10:15] * Quits: Zeros (Zeros-Elip@69.140.48.129) (Quit: Leaving)
# [10:16] * Joins: anne (annevk@81.68.67.12)
# [10:32] * Joins: ROBOd (robod@86.34.246.154)
# [10:34] * Quits: chaals (chaals@194.182.142.5) (Ping timeout)
# [10:36] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
# [10:41] * Joins: gavin (gavin@74.103.208.221)
# [11:07] <Grauw> anne, I’m not saying that you should trust a document based on the syntax. I am saying however that if there’s something wrong with the document, it’s better if it fails with an error than that it silently tries to recover it (very possibly incorrectly), and the source is never notified of the problem
# [11:09] * Quits: Dave (dsr@128.30.52.30) (Quit: be seeing you ...)
# [11:09] <Grauw> when it’s meant to be read only by a human (e.g. HTML), it’s less of a problem, because if a human sees, say, that a whole paragraph is linked instead of a certain phrase (because an </a> tag was forgotten), he’ll think it quirky but will understand that only the first part was supposed to be linked.
# [11:10] <Grauw> a computer however would consider the whole rest of the paragraph to be part of the link description
# [11:11] <Grauw> HTML recovers from errors, XML does not. Let the people who want error recovery just use HTML and let’s keep XML simple the way it is :).
# [11:12] <Grauw> it’s not as if XML doesn’t define error handling; it just has very strict (and simple) rules for error handling, similar to say, every programming language out there, or any file system, or any database format
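The contrast Grauw draws above can be sketched with Python's standard library, under stated assumptions: `html.parser.HTMLParser` stands in here for a lenient HTML consumer (it is a tolerant tokenizer, not a browser's recovery algorithm), and `xml.etree.ElementTree` stands in for a strict XML processor. The helper names and the sample markup are illustrative only.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Sample markup with a forgotten </a>, as in the example from the discussion.
SNIPPET = '<p><a href="x">link text, and the rest of the paragraph</p>'


class TagLogger(HTMLParser):
    """Record start/end tags; the HTML tokenizer does not raise on the missing </a>."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def html_events(markup):
    parser = TagLogger()
    parser.feed(markup)
    parser.close()
    return parser.events


def xml_is_well_formed(markup):
    # A strict XML parser stops at the first well-formedness error.
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True
```

Here `html_events(SNIPPET)` reports the `p` and `a` start tags and the `p` end tag without complaint, while `xml_is_well_formed(SNIPPET)` returns `False` because the `a` element is never closed.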
# [11:13] <Grauw> try writing a bunch of corrupted bytes into a MORK file and see if it still works :)
# [11:13] <Grauw> it’s not uncommon practice at all, and it’s not a bad practice either
# [11:14] <anne> I'm not really convinced by your argument.
# [11:14] <anne> If there's consistent error handling what's the problem?
# [11:14] <Grauw> because the error is a condition that should not be there in the first place
# [11:14] <anne> Why should be strict on syntax but loose on conformance?
# [11:14] <Grauw> it signals that there’s something wrong, and ignoring it and just going on as if nothing happened doesn’t indicate the error
# [11:15] <anne> s/should//
# [11:15] <Grauw> loose on conformance? that’s up to the language that is implemented
# [11:16] <Grauw> for a data file format (what XML was originally envisioned for, I think), I don’t think you would want anything but strict error handling
# [11:17] <Grauw> and for XHTML, well, it’s in my experience a really nice way to find out some obvious errors that you made, which could otherwise bite you later on
# [11:17] <anne> Seeing the feeds deployed on the web I think you do want "graceful" error handling
# [11:17] <anne> Also seeing the XHTML deployed by the way... Event the application/xhtml+xml sites very often have character encoding issues.
# [11:17] <anne> s/Event/Even/
# [11:17] <Grauw> the problem is that if the user agent doesn’t implement the file format correctly, it doesn’t
# [11:18] <Grauw> define error handling conditions as much as you want, a user agent that uses a regular expression to filter out the bits of an XML file still won’t get it right
# [11:18] <anne> Regular expression?
# [11:18] * anne doesn't follow
# [11:18] <Grauw> there’s plenty of feed readers which do that
# [11:19] * Quits: dbaron (dbaron@71.198.189.81) (Quit: 8403864 bytes have been tenured, next gc will be global.)
# [11:19] <Grauw> if they would use a real parser (e.g. in Java or .NET or PHP), all of them break on error conditions
# [11:19] <anne> Oh, I'm talking about the survey done on feeds some time ago and experience people involved in feeds have shared so far.
# [11:20] <Grauw> if validators use a real XML processor, then feeds must be valid XML
# [11:20] <Grauw> if they are not, then they are not an XML file format, but something like HTML
# [11:20] <Grauw> the parsing rules would depend on the user agent
# [11:20] <Grauw> not on any specification
# [11:21] * Quits: ROBOd (robod@86.34.246.154) (Quit: http://www.robodesign.ro )
# [11:21] <Grauw> if you define error handling cases for XML, the parsers that don’t get it right now, will still not get it right if you specify error handling cases (because they just do their own thing and don’t implement any specification)
# [11:21] <Grauw> if they had done so in the first place, there would never be so many invalid feeds around on the net
# [11:22] <Grauw> and anyway, sites who serve feeds can be notified of the problem, it’s a matter of evangelisation I’d say
# [11:22] <Grauw> anyway, my point is: the XML error handling rules are VERY simple. Making them more complex isn’t going to get more UAs to implement them correctly. If anything, less.
# [11:23] <anne> Graceful error handling isn't necessarily harder.
# [11:23] <hsivonen> Grauw: having done standards evangelization for the Mozilla project long ago, my faith in the power of standards evangelism is not very high as a solution to problems like this
# [11:23] <anne> I also share that concern.
# [11:23] <Grauw> it doesn’t serve a purpose, if there is strict error handling then authors are forced to serve correct XML
# [11:24] <anne> Especially with all the people advocating valid HTML code having an incorrect site themselves...
# [11:24] <Grauw> if the XML is invalid then you cannot be sure the error handling corrects it properly
# [11:24] <Grauw> I don’t :)
# [11:24] <anne> You can never be sure.
# [11:24] <Grauw> because I use XHTML at least I don’t make obvious mistakes like forgetting a </a> :)
# [11:25] <anne> You also can't be sure the parser correctly rejected your content.
# [11:25] <Grauw> it’s like a programming language: there being syntax checking and type checking doesn’t guarantee that your program is good or well-written
# [11:25] <Grauw> it does help you catch a lot of obvious errors though
# [11:25] <anne> Or did not reject your content when in fact it should. (A common problem with today's browsers.)
# [11:26] <Grauw> on my website there’s in fact two layers of XML parsers that check it, the backend and the browser itself
# [11:26] <anne> A conformance checker helps you with that and more.
# [11:26] <anne> (I wasn't talking about your website specifically.)
# [11:26] <Grauw> a conformance checker exists on an entirely different layer
# [11:26] <Grauw> it isn’t immediate
# [11:26] <anne> It could be
# [11:26] <Grauw> people run it after developing the site as an afterthought
# [11:26] <Grauw> and don’t run it again everytime new content is added
# [11:27] <anne> That entirely depends on how you do things.
# [11:27] <anne> If they don't, doing just syntax checking on new content doesn't help much.
# [11:27] <anne> imo
# [11:27] <Grauw> XML does it inherently
# [11:27] <anne> No, XML only checks some syntax of which some isn't even enforced in all browsers.
# [11:28] <anne> You can't really buy anything with that.
# [11:28] <Grauw> besides, if the parser doesn’t reject the XML, it’s a parser bug. Defining graceful error handling rules isn’t going to improve that situation because the parser doesn’t follow the spec!
# [11:28] <Grauw> browsers isn’t the only thing out there
# [11:28] <anne> I'm not sure I ever argued otherwise.
# [11:29] <Grauw> the fact that some browsers may implement some parts of XML badly doesn’t make it useless, or a failure
# [11:29] <hsivonen> Grauw: I wouldn't be too surprised if even Xerces-J didn't properly enforce well-formedness for non-UTF-* documents
# [11:29] <anne> It makes it a failure on lots of mobile content and lots of feed content.
# [11:29] <anne> Sadly.
# [11:30] <hsivonen> the thing is that XML works great as long as you don't expose authors to it or put it on the Web
# [11:30] <Grauw> so pointing to bad implementations, how exactly does that make ‘graceful’ error handling better? an error is an error.
# [11:30] <hsivonen> XHTML and RSS are the cases where people start to complain about this
# [11:30] <Grauw> that’s a good thing
# [11:30] <hsivonen> the enterprise integration use cases are fine
# [11:30] <Grauw> it finally forces people to pay attention
# [11:31] <anne> Also, again, there's not much point in enforcing strict syntax if you don't enforce all the other things a document needs to conform to as strictly as well. (And that's impossible for XHTML for instance and any other language that allows embedding script.)
# [11:31] <Grauw> I wouldn’t say there’s no point
# [11:31] <Grauw> for one thing, XML lives on a different layer than the document language on top of it
# [11:31] <Grauw> breaking XML well-formedness breaks any type of processor
# [11:32] <Grauw> I mean, you don’t go blaming IPv4 for errors in HTML documents either
# [11:32] <anne> Not for syntactically correct content, but yes.
# [11:33] <anne> I would say that HTML syntax and XML syntax are on the same layer more or less...d
# [11:33] <anne> s/...d/.../
# [11:33] <Grauw> or rather, I do not expect IPv4 to be ‘lenient’ in some unpredictable way (and instead of rejecting bad packets routing them anywhere based on a guess) just because there are bad HTML documents out there
# [11:33] <Grauw> syntax, yes
# [11:33] <Grauw> SGML too
# [11:33] <anne> AFAIK authors don't have to deal with IPv4 so that seems fine.
# [11:34] <anne> SGML is pretty much dead.
# [11:34] <Grauw> beside the point :)
# [11:34] <Grauw> it’s on the same layer, HTML syntax is a variant of SGML and XML is a variant of SGML too
# [11:34] <Grauw> however the language that uses XML (tons) or the language that uses HTML syntax (HTML itself) are on a higher layer
# [11:35] <anne> XML is actually a subset iirc
# [11:35] <Grauw> last I read it wasn’t entirely SGML-compatible
# [11:35] <Grauw> but I might be mistaken, anyway, subset falls under the variant nomer I think :)
# [11:36] <Grauw> higher layer, just like the Java API is on a higher layer of the Java syntax, yet it’s also part of the same language
# [11:36] <anne> I'm not sure why "higher layer" matters here.
# [11:36] <Grauw> s/of/than
# [11:36] <anne> It's the syntax you use as author.
# [11:36] <Grauw> XML is used in a lot more ways than HTML alone
# [11:36] <anne> The language itself would actually be the "higher layer" imo...
# [11:36] <Grauw> for a lot of applications of XML, strict error checking is simply appropriate
# [11:36] <Grauw> for XHTML, strict error checking is a good thing as well I think
# [11:37] <Grauw> it avoids a lot of problems
# [11:37] <Grauw> not all, but a lot
# [11:37] <anne> Which problems?
# [11:37] <Grauw> forgetting an </a> tag
# [11:37] <anne> It only avoids some syntax problems...
# [11:37] <anne> Which are obvious when you make them anyway...
# [11:37] <Grauw> I’ve seen plenty of sites / blog posts where an entire paragraph was a link because people forgot an </a>
# [11:37] <hsivonen> Grauw: do you mean that there are cases when a DTD-valid XML document is not an SGML + Annex K document?
# [11:38] <Grauw> hsivonen: you tell me, I don’t know, I just think I read it somewhere :)
# [11:38] <anne> Grauw, prolly with an XHTML DOCTYPE
# [11:38] <Grauw> or posts where the </ul> in a list was forgotten, causing the indenting to be wrong in my feed reader but not in browser X
# [11:38] <Grauw> anne, what has doctype to do with it :)
# [11:39] <anne> nm that
# [11:39] <Grauw> except for determining ‘standards’ or quirks mode in IE
# [11:39] <anne> I rather have the content as user than some error in my face because the author just pressed publish and walked away.
# [11:39] <Grauw> I understand that
# [11:39] <anne> Or didn't check in all user agents...
# [11:39] <Grauw> one sec
# [11:39] <anne> (Which happens, I posted samples of that to my site.)
# [11:40] <anne> Not just in Internet Explorer...
# [11:42] <Grauw> IE doesn’t understand XHTML, so there’s no user agents on the web which don’t at least have a decent understanding of XML
# [11:42] <anne> ?
# [11:43] <Grauw> if you author XML, processed by an XML parser, then you should never be presented with a page that’s unreadable because the author didn’t verify the XML-well-formedness
# [11:43] <Grauw> because the author’ll notice that his page is broken immediately
# [11:44] <Grauw> or of course, really because his CMS ran it through a server-side XML parser to test the well-formedness and complains if it isn’t
# [11:46] <Grauw> anyway, if you really want graceful error degradation, use HTML instead of XHTML. If you care about the same in feeds, well honestly, just use an XML parser and let the users complain to the author’s website to fix it :).
# [11:46] <Grauw> I’ve done so to a site that newly had an XML feed implemented, and that worked just fine.
# [11:46] <anne> I also care about it on mobile phones.
# [11:46] <anne> I also think the character encoding issue would have had to be fixed by now in order for that to be feasible. But it isn't...
# [11:47] <Grauw> I don’t think there’ll be documents that are authored on a mobile phone
# [11:47] * Joins: hasather (hasather@81.235.209.174)
# [11:47] <anne> Also, I think ease of authoring and users first demands graceful error handling.
# [11:47] <anne> Grauw, you obviously don't work for the same company as I do
# [11:47] <Grauw> If the desktop browsers are all strictly validating, and they don’t use some different ‘profile’ or something but just the same content, then they should be authored correctly even if some phones don’t have a proper XML parser
# [11:48] <Grauw> I have worked with a lot of XML technologies though, and also on the support side of that.
# [11:48] <anne> We are encountering those problems daily and are currently using heuristics (last I heard) to determine whether or not application/xhtml+xml can be processed as XML...
# [11:48] <Grauw> Backbase processes all documents as XML, and the worst we get as support questions is like
# [11:48] <Grauw> ‘why doesn’t &nbsp; work’ and we tell them to use ‘&#160;’ and they’re happy.
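The support question Grauw mentions concerns the non-breaking space: in XML without a DTD, a named HTML entity for it is undefined, whereas a numeric character reference always works. A small sketch with Python's `xml.etree.ElementTree`, where `parse_text` is a hypothetical helper introduced only for this illustration:

```python
import xml.etree.ElementTree as ET


def parse_text(markup):
    """Return the root element's text, or None if the parser rejects the markup."""
    try:
        return ET.fromstring(markup).text
    except ET.ParseError:
        return None


# Named HTML entity, not declared anywhere in the document: rejected.
named = parse_text("<p>a&nbsp;b</p>")
# Numeric character reference for U+00A0: accepted.
numeric = parse_text("<p>a&#160;b</p>")
```

Here `named` is `None` because the parser raises on the undefined entity, and `numeric` is the string `a`, a U+00A0 character, then `b`.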
# [11:49] * anne was already planning to extend the default set of entities
# [11:49] <Grauw> I don’t see the need for those heuristics
# [11:49] <Grauw> you are planning, huh :)
# [11:50] <anne> Yeah, next semester I'll be working on XML2 as part of my university research project for my Bachelor degree.
# [11:50] <anne> or XML5, haven't decided on a name yet
# [11:50] <Lachy> oh, not another naming debate ;-)
# [11:50] <Grauw> I think I could say I probably have more experience with XML technologies than you do, given that Backbase is based on XML, and your work mainly concerns HTML :)
# [11:51] <anne> Lachy, just between me and my supervisor at the uni this time...
# [11:51] <anne> Grauw, how do you know what my work concerns?
# [11:52] <anne> But "XML technologies" sure...
# [11:52] <Grauw> I don’t know, you started the ‘you obviously don’t work for the same company as I do’ path :)
# [11:52] <hsivonen> anne: my supervisor first favored "Web Applications 1.0" but later agreed to putting "HTML5" in the title
# [11:52] <anne> Grauw, I was just indicating that you can't know the problems we face with mobile content.
# [11:53] <hsivonen> anne: do you mean the success story of XML for mobiles is breaking XML? :-)
# [11:53] <Grauw> well I know of a lot of XML-related problems because Backbase is actually one of the few web technologies that uses XML
# [11:53] <anne> hsivonen, sort of :)
# [11:53] * anne likes that sentence
# [11:54] <Grauw> and we don’t get many of them
# [11:54] <Grauw> writing correct XML is actually extremely simple compared to writing correct HTML :)
# [11:55] <Grauw> most people who don’t get it at first do get it soon thereafter
# [11:56] <anne> That's good. I don't plan to change how you have to write XML.
# [11:56] <Grauw> as for Opera, given that they have a fairly wide deployment, they are likely a tested platform for many mobile web sites (as they really test against platforms anyway), and it would be nice if they stood ground with regard to XML validity
# [11:56] <anne> XML validity?
# [11:56] <anne> hah
# [11:56] <Grauw> well-formedness
# [11:56] <Grauw> you get what I mean
# [11:57] <anne> You mean namespace-well-formedness?
# [11:57] <anne> As I said, we tried and failed.
# [11:57] <Grauw> I don’t think I have ever heard of namespace-well-formedness so far
# [11:57] <Grauw> Maybe Opera gave up too soon :)
# [11:58] <anne> <test:test/> is well-formed but not namespace-well-formed
# [11:58] <anne> You can speculate all you wish, but reality is what matters here.
# [11:58] <anne> Oh, it's "namespace well-formedness"
# [11:58] <anne> See http://www.w3.org/TR/xml-names/#ProcessorConformance
# [11:58] <Grauw> Namespace well-formedness then, yes
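The distinction anne draws above can be sketched as follows. This is a minimal illustration, assuming Python's standard `xml.parsers.expat` module; the helper name `parses` is hypothetical. The same document is parsed once without and once with namespace processing enabled.

```python
from xml.parsers import expat

DOC = "<test:test/>"  # the prefix "test" is never declared


def parses(doc, namespace_aware):
    """Return True if expat accepts doc, False if it reports an error."""
    if namespace_aware:
        # Passing a separator switches expat into namespace processing.
        parser = expat.ParserCreate(namespace_separator=" ")
    else:
        parser = expat.ParserCreate()
    try:
        parser.Parse(doc, True)
        return True
    except expat.ExpatError:
        return False


print(parses(DOC, namespace_aware=False))  # plain XML 1.0 well-formedness
print(parses(DOC, namespace_aware=True))   # namespace well-formedness
```

Without namespace processing the colon is just a name character, so the document is accepted; with it, the undeclared prefix is reported as an error.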
# [12:00] <Grauw> the reason there is invalid XML in the first place is that UAs give in to pressure, like Opera does
# [12:00] <Grauw> and many, many feed readers
# [12:00] <anne> The problem is that people are not perfect.
# [12:00] <anne> And make imperfect products.
# [12:01] <anne> And we have to deal with that situation and not just pretend it doesn't exist.
# [12:01] <Grauw> they seem to have no problem with well-formedness when creating a Java program
# [12:01] <Grauw> So I don’t buy into the argument that people can’t help doing it wrong
# [12:01] <anne> People have lots of problems with different C compilers every day...
# [12:01] <Grauw> or that doing it right is so hard that it’s impossible
# [12:01] <Grauw> that’s a very different issue
# [12:01] <hsivonen> if there's some pressure point where I wish Opera had stood firm, it's that I think all browser vendors should have refused to implement XML 1.1
# [12:01] <anne> I don't know enough about Java.
# [12:01] <Grauw> but it would be great if C compilers would agree on one language
# [12:02] <anne> hsivonen, I think I'll fold in all the good features from XML 1.1 into XML n
# [12:02] <Grauw> XML 1.1 is XML 1.0 plus a wider unicode range, right?
# [12:02] <anne> Plus at least one incompatible change...
# [12:03] <anne> wider unicode range for element and attributes names btw
# [12:03] <Grauw> yes
# [12:03] <Grauw> the incompatible change being (if it’s easy to describe)?
# [12:03] <hsivonen> Grauw: XML 1.1 allows a different arbitrary range of characters (yes, wider) in element and attribute names
# [12:03] <anne> Grauw, the spec points them out
# [12:03] <anne> some characters are no longer allowed in some productions
# [12:03] <anne> and some others are
# [12:03] <Grauw> aha
# [12:04] <Grauw> well that’s weird, that they exclude characters that used to be allowed before
# [12:04] <hsivonen> the path to crazy implementation cost and interop breakage is paved with i18n political correctness and IBM mainframe compat
# [12:04] <anne> XML 1.1 is a classic example of versioning done wrong
# [12:04] <Grauw> unless they were characters that could not actually be used in a regular document
# [12:05] <Grauw> anyway, I need to go :)
# [12:05] <Grauw> (again!)
# [12:05] <Grauw> see you later
# [12:05] <anne> heh
# [12:05] <anne> bye
# [12:05] <anne> http://www.w3.org/TR/xml11/#sec-xml11
# [12:08] <anne> The main problem is the control character nonsense.
# [12:08] <hsivonen> anne: what exactly is the nonsense in your opinion?
# [12:09] <anne> That you need to use character references and such...
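For reference, the control character change under discussion can be sketched as follows. This is one reading of the `Char` production in XML 1.0 and the `Char` and `RestrictedChar` productions in XML 1.1, not anything stated in the log; the function names are hypothetical.

```python
def literal_ok_xml10(cp):
    """True if code point cp may appear literally per the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def restricted_xml11(cp):
    """True if cp is an XML 1.1 RestrictedChar (usable only as a character reference)."""
    return (
        0x1 <= cp <= 0x8
        or 0xB <= cp <= 0xC
        or 0xE <= cp <= 0x1F
        or 0x7F <= cp <= 0x84
        or 0x86 <= cp <= 0x9F
    )


def literal_ok_xml11(cp):
    """True if cp may appear literally in an XML 1.1 document."""
    in_char = (
        0x1 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )
    return in_char and not restricted_xml11(cp)


# Compare a few code points under both versions.
for cp in (0x0, 0x1, 0x80, 0x85, 0xFFFF):
    print(hex(cp), literal_ok_xml10(cp), literal_ok_xml11(cp))
```

Under this reading, U+0080 is accepted literally by XML 1.0 but only as a character reference by XML 1.1, while U+0001 is rejected by XML 1.0 and allowed by XML 1.1 only as a character reference.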
# [12:09] <anne> I don't think those changes make much of a problem though... Just allow everything and require #0 to be translated to #FFFD
# [12:09] <anne> #x0 to #xFFFD
# [12:10] <anne> So new parsers can just ignore the whole <?xml version=x?> thing and continue...
# [12:11] <hsivonen> anne: would you allow #xFFFF?
# [12:12] <anne> everything that should not be allowed for some good reason will be translated into #xFFFD during the input stream phase I suppose
# [12:12] <hsivonen> anne: would #x0 be the only code point that you would not allow in the infoset?
# [12:12] <anne> I'm obviously no expert in that
# [12:16] <anne> hsivonen, it seems that's what HTML5 does
# [12:16] <hsivonen> yes.
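The input stream translation anne describes above can be sketched like this. It is a minimal illustration only: the function name `preprocess` and the `disallowed` parameter are hypothetical, and the default assumes U+0000 is the only code point being replaced, as suggested in the exchange.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def preprocess(text, disallowed=("\x00",)):
    """Replace every disallowed character in text with U+FFFD."""
    table = {ord(ch): REPLACEMENT for ch in disallowed}
    return text.translate(table)


print(ascii(preprocess("a\x00b")))
```

Widening the set, for example to include U+FFFF, only means passing a longer `disallowed` tuple; the parser that runs afterwards never sees the original characters.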
# [12:17] <hsivonen> in retrospect, I think it was a very bad idea for XML to arbitrarily limit the lexical space of identifiers
# [12:17] <anne> Although HTML5 should probably deal with some characters such as those from windows-1252
# [12:17] <hsivonen> but gotta run to an iki.fi meeting to lobby for a Jabber server
# [12:17] <anne> bye
# [12:27] * Quits: Grauw (ask@202.71.92.74) (Ping timeout)
# [12:38] * Joins: ROBOd (robod@86.34.246.154)
# [12:44] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
# [12:46] * Joins: edas (edaspet@88.191.34.123)
# [12:56] * Joins: gavin (gavin@74.103.208.221)
# [13:07] * Quits: sbuluf (hm@200.49.140.156) (Ping timeout)
# [14:59] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
# [15:04] * Joins: gavin (gavin@74.103.208.221)
# [15:16] * Quits: edas (edaspet@88.191.34.123) (Quit: Leaving)
# [15:17] * Quits: marcos (chatzilla@203.206.31.102) (Ping timeout)
# [15:37] * Joins: erik (erik@84.29.164.153)
# [15:37] * Quits: erik (erik@84.29.164.153) (Quit: Bye bye)
# [15:38] * Joins: erik (erik@84.29.164.153)
# [15:39] * Joins: marcos___ (chatzilla@203.206.31.102)
# [15:39] * marcos___ is now known as marcos
# [15:44] * Quits: erik (erik@84.29.164.153) (Quit: Bye bye)
# [15:53] * Joins: h3h (bfults@70.95.237.98)
# [16:13] * Quits: h3h (bfults@70.95.237.98) (Quit: h3h)
# [16:32] * Joins: Shunsuke (kuruma@219.110.80.235)
# [17:07] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
# [17:12] * Joins: gavin (gavin@74.103.208.221)
# [17:16] * Joins: Grauw (ask@202.71.92.74)
# [17:55] * Quits: Grauw (ask@202.71.92.74) (Ping timeout)
# [17:57] * Joins: edas (edaspet@88.191.34.123)
# [18:02] * Joins: h3h (bfults@70.95.237.98)
# [18:08] * Quits: anne (annevk@81.68.67.12) (Ping timeout)
# [18:16] * Joins: sbuluf (logp@200.49.140.131)
# [18:18] * Parts: hasather (hasather@81.235.209.174)
# [18:19] * Joins: cwahlers (Miranda@201.27.182.230)
# [18:55] * Joins: st (st@62.234.155.214)
# [19:04] * Joins: dbaron (dbaron@71.198.189.81)
# [19:14] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
# [19:19] * Joins: gavin (gavin@74.103.208.221)
# [20:00] * Quits: edas (edaspet@88.191.34.123) (Ping timeout)
# [20:12] * Joins: edas (edaspet@88.191.34.123)
# [21:01] * Quits: dbaron (dbaron@71.198.189.81) (Quit: 8403864 bytes have been tenured, next gc will be global.)
# [21:10] * Quits: Shunsuke (kuruma@219.110.80.235) (Quit: See you...)
# [21:15] * Quits: edas (edaspet@88.191.34.123) (Ping timeout)
# [21:21] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
# [21:26] * Joins: gavin (gavin@74.103.208.221)
# [21:58] * Joins: hasather (hasather@81.235.209.174)
# [22:04] * Joins: edas (edaspet@88.191.34.123)
# [22:09] * Quits: cwahlers (Miranda@201.27.182.230) (Connection reset by peer)
# [22:28] * Quits: ROBOd (robod@86.34.246.154) (Quit: http://www.robodesign.ro )
# [22:31] * Quits: h3h (bfults@70.95.237.98) (Quit: h3h)
# [22:32] * Joins: h3h_ (bfults@70.95.237.98)
# [22:32] * Quits: h3h_ (bfults@70.95.237.98) (Quit: h3h_)
# [22:39] * Quits: edas (edaspet@88.191.34.123) (Quit: http://eric.daspet.name/ and the 2007 edition of http://www.paris-web.fr/ )
# [22:42] * Quits: hasather (hasather@81.235.209.174) (Client exited)
# [22:43] * Joins: hasather (hasather@81.235.209.174)
# [23:28] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
# [23:33] * Joins: gavin (gavin@74.103.208.221)
# Session Close: Sun Apr 01 00:00:00 2007
