  1. # Session Start: Sat Mar 31 00:00:01 2007
  2. # Session Ident: #html-wg
  3. # [00:04] * Joins: gavin (gavin@74.103.208.221)
  4. # [00:29] * Quits: kingryan (kingryan@66.92.187.33) (Quit: kingryan)
  5. # [00:30] <hsivonen> regarding respecting MIME: in my experience, if one implements XML content types properly, it is essential to provide a checkbox for turning it off
  6. # [00:30] <hsivonen> e.g. the W3C serves XML DTDs as text/plain...
  7. # [00:53] <mjs> and of course anything you need a checkbox to turn off is something you probably shouldn't do
  8. # [00:58] <hsivonen> mjs: but I wanted to be eligible for the t-shirt: http://www.cafepress.com/feedparser
  9. # [00:59] <hsivonen> mjs: seriously though, my app is all about finding spec violations
  10. # [00:59] <hsivonen> mjs: so I guess by default I have to flag even RFC 3023 stuff
  11. # [00:59] <mjs> hsivonen: well, a conformance checker is obviously a different beast than an app to present content to end users
  12. # [01:09] <Lachy> hsivonen, if you were to remove the option and rely on content sniffing, then the other reasonable alternative is just to output an error for the MIME type problems and then continue as if the correct types had been sent
  13. # [01:19] * Quits: tylerr (tylerr@66.195.32.2) (Quit: Leaving)
  14. # [01:27] <hsivonen> Lachy: defaulting to US-ASCII doesn't fit that solution, but I guess I could make text/plain as an XML type a non-fatal error
  15. # [01:28] <hsivonen> I want to keep image/jpeg and the like fatal, though
  16. # [01:49] * Quits: Zeros (Zeros-Elip@67.154.87.254) (Ping timeout)
  17. # [01:50] * Parts: hasather (hasather@81.235.209.174)
  18. # [02:06] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
  19. # [02:10] * Quits: AnPol (anpol@85.118.224.254) (Quit: Bye)
  20. # [02:11] * Joins: gavin (gavin@74.103.208.221)
  21. # [02:22] * Quits: h3h (bfults@66.162.32.234) (Quit: |)
  22. # [02:31] * Joins: Zeros (Zeros-Elip@69.140.48.129)
  23. # [02:31] * Quits: Zeros (Zeros-Elip@69.140.48.129) (Client exited)
  24. # [03:11] * Joins: DougJ (djones4@74.76.23.86)
  25. # [03:14] <Grauw> [05:31] <anne> Does anyone actually implement new versions of XSLT?
  26. # [03:15] <Grauw> I've used XSLT2 extensively. In fact, I couldn't bear to use XSLT1 anymore, it just misses too much that I've gotten used to :)
  27. # [03:16] <Grauw> [04:21] <DanC> Hixie, do you take special classes on how to be rude, or do you just do it naturally?
  28. # [03:16] <Grauw> ehehe
  29. # [03:17] <Grauw> there's a fad with making sarcastic remarks about the W3C efforts nowadays, it's honestly not that bad, if perhaps not directly applying to the goals of HTML5
  30. # [03:18] <Grauw> but anyway
  31. # [03:21] * dbaron RRSAgent, pointer?
  32. # [03:21] * RRSAgent See http://www.w3.org/2007/03/31-html-wg-irc#T01-15-01
  33. # [03:22] * Quits: Ashe (Ashe@213.47.199.86) (Ping timeout)
  34. # [03:24] <mjs> Grauw: I don't think Hixie is the only one to think the TAG's findings are of questionable value
  35. # [03:24] <Grauw> I'm not saying he is
  36. # [03:25] <Grauw> saying that there's a 'fad' implies that there is more than one person doing it :)
  37. # [03:25] <dbaron> I completely disagree with a whole bunch of the findings.
  38. # [03:25] <mjs> 'fad' implies that (a) it's invalid and (b) it will pass
  39. # [03:27] <Grauw> let's rephrase it then, I think there's a lot of criticism on W3C specs and effort on small details, and specs as a whole are named worthless for that
  40. # [03:27] <Grauw> whereas the specs as a whole are pretty nice
  41. # [03:27] <Grauw> e.g. XHTML2 introduces a lot of nice things, so within its own space of existence I would say that you could call it a nice specification
  42. # [03:28] <mjs> XHTML2 is a bunch of bad ideas poorly specified
  43. # [03:28] <mjs> to call it a "nice specification" you have to water down the meaning of that phrase a lot
  44. # [03:29] <Grauw> however some people are criticising that it doesn't specify every implementation detail, and discarding the entire spec or the ideas in it as bad
  45. # [03:29] <Grauw> or at least make it sound like that, and cause other people to do so
  46. # [03:29] * Quits: dbaron (dbaron@63.245.220.242) (Quit: 8403864 bytes have been tenured, next gc will be global.)
  47. # [03:30] <mjs> I think many of the ideas in it are also bad
  48. # [03:30] <mjs> though some are good
  49. # [03:30] <Grauw> well a lot of it is a matter of perspective
  50. # [03:30] <mjs> its basic idea of breaking compatibility is bad
  51. # [03:31] <Grauw> if I hear Anne going on about XML error handling it also makes me cringe :)
  52. # [03:31] <Grauw> mjs, maybe, that's the key difference that HTML5 does different
  53. # [03:32] <Grauw> however you should consider the ideas within the space of thought of breaking compatibility (and they didn’t entirely, btw)
  54. # [03:32] <Grauw> anyway
  55. # [03:32] <Grauw> I need to go :)
  56. # [03:32] <Grauw> I'm being called ^_^
  57. # [03:32] <Grauw> later
  58. # [03:34] <Hixie> if you look at them in a vacuum, xforms is a nice spec. xhtml2 is not.
  59. # [03:34] <Hixie> imho.
  60. # [03:34] <Hixie> xhtml2 leaves far too much very vague and has basically no ua conformance criteria to speak of
  61. # [03:34] <Grauw> (not gone yet) I suppose you could indeed say that XForms is a much better spec than XHTML2
  62. # [03:36] <mjs> xhtml2 is a bad spec qua spec as well as an unsuitable design for the web
  63. # [03:36] <Grauw> maybe as for XHTML2 what I mean is people should look at the ideas that it has (and people have, you said you have too, and a fair amount ended up in HTML5)
  64. # [03:36] <mjs> in the case of xforms only the latter is true
  65. # [03:36] <Grauw> I disagree with that :) I don't see how it's unsuitable for the web
  66. # [03:37] <Grauw> but anyway, I need to go, so don't remark on that :)
  67. # [03:38] <Hixie> ideas don't make a spec
  68. # [03:39] * Joins: Ashe (Ashe@213.47.199.86)
  69. # [04:04] * Parts: DougJ (djones4@74.76.23.86)
  70. # [04:14] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
  71. # [04:19] * Joins: gavin (gavin@74.103.208.221)
  72. # [05:06] * Quits: mjs (mjs@17.255.99.50) (Quit: mjs)
  73. # [05:07] * Joins: mjs (mjs@17.255.99.50)
  74. # [05:13] <Grauw> w.r.t. that web architecture document quote, "A data format specification SHOULD provide for version information."
  75. # [05:13] <Grauw> I don't think you can consider CSS a data format. HTML, maybe, but I think whether or not to include versioning information is not a general rule but rather based on a design decision in the language
  76. # [05:13] * Quits: mjs (mjs@17.255.99.50) (Quit: mjs)
  77. # [05:14] <Grauw> finally, in most XML-based formats, the XML namespace provides sufficient means for versioning
  78. # [05:14] <Grauw> and no additional version information should be provided. backwards-compatibility breaking changes should change the namespace
  79. # [05:19] * Joins: sbuluf (hm@200.49.140.156)
  80. # [05:33] * Quits: primal1 (primal1@72.87.242.30) (Quit: primal1)
  81. # [05:50] * Joins: MikeSmith (MikeSmith@mcclure.w3.org)
  82. # [06:11] * Joins: Zeros (Zeros-Elip@69.140.48.129)
  83. # [06:11] * Quits: Zeros (Zeros-Elip@69.140.48.129) (Quit: Leaving)
  84. # [06:11] * Joins: Zeros (Zeros-Elip@69.140.48.129)
  85. # [06:21] * Joins: DougJ (djones4@24.213.244.253)
  86. # [06:22] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
  87. # [06:27] * Joins: gavin (gavin@74.103.208.221)
  88. # [06:27] * Joins: dbaron (dbaron@71.198.189.81)
  89. # [06:33] * Joins: mjs (mjs@64.81.48.145)
  90. # [06:44] * Quits: DanC (connolly@128.30.52.30) (Quit: Client exiting)
  91. # [06:45] <Grauw> I wish people would stop the debate over <abbr> and <acronym>. There is no way in hell that a screenreader can know <acronym>SMIL</acronym> is to be pronounced like ‘smile’, nor can it know that <acronym>SQL</acronym> has to be pronounced ‘sequel’. Unless it has a list of pronunciations built-in, in which case the author doesn’t need to indicate any distinction in the first place.
  92. # [06:46] <Hixie> css isn't a data format?
  93. # [06:46] <Hixie> what is it then?
  94. # [06:46] <Grauw> the only real solution would be to provide an attribute with a pronunciation key in some phonetic alphabet
  95. # [06:47] <Grauw> Hixie: what I mean is that it doesn’t convey information
  96. # [06:47] <Grauw> it merely conveys how to present the information
  97. # [06:47] <jjb> Grauw: that's why we need to spec in a "phonetic" attribute for <acronym>, to be specified in IPA symbols
  98. # [06:47] <jjb> (just kidding)
  99. # [06:47] <Grauw> jjb, yay for unicode ;p
  100. # [06:48] <jjb> ha, i see you made the joke before i had time to put mine together
  101. # [06:48] <Grauw> I'm all for that, with the proper IPA symbols ;p
  102. # [06:48] <jjb> in theory, i wouldn't mind that either, but somehow i think it wouldn't pan out so great in the market...
  103. # [06:48] <Hixie> you don't think "i want h1s to be rendered in green" is information?
  104. # [06:48] <Grauw> somehow, ne :)
  105. # [06:49] <Grauw> Hixie, ... I think you’re missing my point :)
  106. # [06:49] <Hixie> yes
  107. # [06:49] <Hixie> i don't understand your point :-)
  108. # [06:50] <Grauw> to me information is primarily stuff like address information, etc, something you would express with RDF
  109. # [06:50] <Grauw> CSS in itself as a format of course conveys information on the presentation
  110. # [06:50] <Grauw> but it doesn’t carry semantics
  111. # [06:50] <Hixie> you could express CSS in RDF
  112. # [06:50] <Grauw> of course you can
  113. # [06:50] <Hixie> so...?
  114. # [06:50] <Grauw> but you wouldn't :)
  115. # [06:51] <Hixie> I wouldn't express _anything_ in RDF
  116. # [06:51] <Lachy> LOL!
  117. # [06:51] <Hixie> i don't understand why css is different from html in terms of deserving versioning information
  118. # [06:52] <Grauw> I don't think it is
  119. # [06:52] <Hixie> ok then
  120. # [06:52] <Grauw> I'm not trying to make an argument for versioning information, I said it depends on design decisions of the language
  121. # [06:53] <Grauw> and that maybe the reference to ‘data format’ could be interpreted as so
  122. # [06:53] <Lachy> Grauw, when would it ever be a good design decision to include versioning?
  123. # [06:53] <Grauw> that it would only cover ‘semantic’ information
  124. # [06:54] <Grauw> for implementors versioning sucks, but I think there’s been a fair number of arguments presented where it’s useful as well
  125. # [06:54] <Grauw> even if just so that if you see the document, you know ‘oh it was documented according to this version of the standard. I’ll stick to that then, so that the risk of adding stuff that doesn’t work in the systems it’s intended to be used for is minimal’
  126. # [06:55] <Hixie> why would a specification's version be useful for that? surely you'd want to specify an authoring subset or at most an implementation version
  127. # [06:56] <Hixie> language versions have very little bearing on what set of features authors use
  128. # [06:56] <Grauw> Backbase software has been through three major versions by now, all incompatible
  129. # [06:56] <Grauw> even among minor versions there have been some incompatibilities (unintentionally, I suppose)
  130. # [06:57] <Grauw> the version information is not indicated however, because it simply depends on which version of the library is included, and which namespace is used
  131. # [06:57] <Hixie> right but with HTML you have more than one implementation
  132. # [06:58] <Hixie> so even with one _language_ version you'll have incompatibilities
  133. # [06:58] <Hixie> e.g. as now with CSS and HTML and DOM and JS and so forth
  134. # [06:58] <Grauw> yeah, so that’s why it’s a per-language consideration
  135. # [06:58] <Hixie> no
  136. # [06:58] <Hixie> it's a per IMPLEMENTATION consideration at most
  137. # [06:58] <Grauw> ...depends on how well the implementations are done, I think.
  138. # [06:59] <Grauw> with the browsers, we got unlucky in that it’s kind of a mess
  139. # [06:59] <Zeros> versioning works if it's enforced in the implementation
  140. # [06:59] <Grauw> but if you look at internet standards like IP, etc
  141. # [06:59] <Grauw> you can’t really implement it ‘incompletely’
  142. # [07:00] <Zeros> If Javascript 2 only works when you specify that version there's no room for erroneous mixing in features, and you can fix things later with explicit reference to what set of features you're using
  143. # [07:01] <Grauw> a problem with a lot of ideas that some people have (like anne's xml-should-have-error-recovery) is that it’s a view that really doesn’t apply to all areas
  144. # [07:01] <Zeros> Some XML implementations have some level of error recovery :)
  145. # [07:01] <Grauw> e.g. programming languages are usually not forgiving in their syntax error handling, yet they have been used widely and it forms no obstacle to programmers (rather, a help)
  146. # [07:01] <Zeros> Webkit doesn't show a yellow screen when it sees a malformed document
  147. # [07:02] <Zeros> Grauw, the counter argument is that all media decoders (mp3, video etc.) implement error recovery when the file is malformed
  148. # [07:02] <Grauw> XML is very often used to contain a programming language as well, and for data I also think it would be bad to allow the risk of data being incorrectly error-corrected to say something other than what it was actually supposed to mean
  149. # [07:03] <Grauw> so it’s very specific to the domain
  150. # [07:03] <Grauw> same is true for versioning information I think
  151. # [07:03] <Grauw> or feature indication (pretty much the same thing, but more flexible)
  152. # [07:04] <Lachy> Zeros, WebKit does show an error for malformed content (although it's not yellow like Mozilla)
  153. # [07:04] * Quits: DougJ (djones4@24.213.244.253) (Quit: DougJ)
  154. # [07:04] <Zeros> Lachy, it shows the page rendered up until that point, mozilla shows nothing but the error
  155. # [07:04] <Grauw> Zeros, yeah, I hate malformed MP3s with their stupid pops
  156. # [07:04] <Grauw> if they just didn’t work at all, people wouldn’t share them
  157. # [07:04] <Grauw> and I wouldn’t be bothered by them
  158. # [07:04] <Zeros> Lachy, One is better for programmers, the other for users
  159. # [07:04] <Lachy> oh, I see what you mean
  160. # [07:05] <Grauw> I think both approaches are ok, as long as there’s something that forces the programmers of the backend to implement something that never generates invalid XML
  161. # [07:06] * Joins: DougJ (djones4@24.213.244.253)
  162. # [07:06] <Lachy> unfortunately, that doesn't always happen in reality
  163. # [07:06] <Grauw> if all browsers support XHTML, it will, for sites that serve XHTML
  164. # [07:06] <Lachy> There's plenty of systems out there that claim to support XML, but only have tag soup parsers
  165. # [07:07] <Zeros> Lachy, The major issue I think is that XHTML is trivial to hand author and generate invalid results
  166. # [07:07] <Grauw> because they’re forced to, in the same manner as programmers are forced to make sure their code compiles
  167. # [07:07] <Grauw> anyway, have to go (again), I’ll read the backlog when I get back :)
  168. # [07:07] <Zeros> It's more complicated for your word document to become invalid, so most word documents are valid
  169. # [07:07] <Lachy> e.g. feed readers, mobile phones, etc. all use tag soup parsers for XML
  170. # [07:07] <Zeros> With XHTML it's pretty easy to introduce an anomaly which would invalidate the entire document.
  171. # [07:07] <Grauw> my feed reader (Sage) uses a true XML parser, and it works just fine (most of the time ;p)
  172. # [07:08] <Lachy> there are many that don't, though
  173. # [07:08] <Zeros> libxml2 works pretty well :)
  174. # [07:08] <Grauw> sure, and I don’t think that’s a good thing
  175. # [07:08] <Grauw> anyway
  176. # [07:09] * Quits: MikeSmith (MikeSmith@mcclure.w3.org) (Quit: Get thee behind me, satan.)
  177. # [07:10] <Zeros> How long does it usually take for sysreq to respond?
  178. # [07:11] * Parts: DougJ (djones4@24.213.244.253)
  179. # [07:12] <Lachy> Zeros, what's your real name, or the name you use on the mailing list?
  180. # [07:13] <Zeros> I'm not on the list yet, I filled out the form and realized I entered my last name twice.
  181. # [07:14] <Zeros> Was going to join
  182. # [07:14] <Zeros> Can you fix it?
  183. # [07:14] <Lachy> no, ask Dan or Karl
  184. # [07:14] <Zeros> alright, thanks
  185. # [07:15] <Lachy> typing /whois says your name is Elliot. What's your last name?
  186. # [07:15] <Zeros> Elliott Sprehn
  187. # [07:15] <Zeros> and you?
  188. # [07:15] <Lachy> Lachlan Hunt
  189. # [07:16] * Lachy thought everyone knew that
  190. # [07:16] <Zeros> I guess I'm not anyone
  191. # [07:56] <Lachy> the logs from in here last night are funny :-)
  192. # [07:57] <Lachy> especially the stuff about the TAG
  193. # [08:20] * Joins: marcos (chatzilla@203.206.31.102)
  194. # [08:28] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
  195. # [08:33] * Joins: gavin (gavin@74.103.208.221)
  196. # [08:50] * Quits: st (st@62.234.155.214) (Quit: st)
  197. # [08:54] * Joins: DougJ (djones4@24.213.244.253)
  198. # [08:55] * Quits: DougJ (djones4@24.213.244.253) (Quit: DougJ)
  199. # [10:00] * Joins: chaals (chaals@194.182.142.5)
  200. # [10:14] <anne> That removing namespace-well-formedness in XML somehow makes documents no longer trustable is a myth. If a document needs to be trustable you verify whether or not it meets your criteria. You can't trust a document solely based on the syntax. That'd be silly.
  201. # [10:14] * Parts: anne (annevk@81.68.67.12)
  202. # [10:15] * Quits: Zeros (Zeros-Elip@69.140.48.129) (Quit: Leaving)
  203. # [10:16] * Joins: anne (annevk@81.68.67.12)
  204. # [10:32] * Joins: ROBOd (robod@86.34.246.154)
  205. # [10:34] * Quits: chaals (chaals@194.182.142.5) (Ping timeout)
  206. # [10:36] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
  207. # [10:41] * Joins: gavin (gavin@74.103.208.221)
  208. # [11:07] <Grauw> anne, I’m not saying that you should trust a document based on the syntax. I am saying however that if there’s something wrong with the document, it’s better if it fails with an error than that it silently tries to recover it (very possibly incorrectly), and the source is never notified of the problem
  209. # [11:09] * Quits: Dave (dsr@128.30.52.30) (Quit: be seeing you ...)
  210. # [11:09] <Grauw> when it’s meant to be read only by a human (e.g. HTML), it’s less of a problem, because if a human sees, say, that a whole paragraph is linked instead of a certain phrase (because an </a> tag was forgotten), he’ll think it quirky but will understand that only the first part was supposed to be linked.
  211. # [11:10] <Grauw> a computer however would consider the whole rest of the paragraph to be part of the link description
  212. # [11:11] <Grauw> HTML recovers from errors, XML does not. Let the people who want error recovery just use HTML and let’s keep XML simple the way it is :).
  213. # [11:12] <Grauw> it’s not as if XML doesn’t define error handling; it just has very strict (and simple) rules for error handling, similar to say, every programming language out there, or any file system, or any database format
  214. # [11:13] <Grauw> try writing a bunch of corrupted bytes into a MORK file and see if it still works :)
  215. # [11:13] <Grauw> it’s not uncommon practice at all, and it’s not a bad practice either
  216. # [11:14] <anne> I'm not really convinced by your argument.
  217. # [11:14] <anne> If there's consistent error handling what's the problem?
  218. # [11:14] <Grauw> because the error is a condition that should not be there in the first place
  219. # [11:14] <anne> Why should be strict on syntax but loose on conformance?
  220. # [11:14] <Grauw> it signals that there’s something wrong, and ignoring it and just going on as if nothing happened doesn’t indicate the error
  221. # [11:15] <anne> s/should//
  222. # [11:15] <Grauw> loose on conformance? that’s up to the language that is implemented
  223. # [11:16] <Grauw> for a data file format (what XML was originally envisioned for, I think), I don’t think you would want anything but strict error handling
  224. # [11:17] <Grauw> and for XHTML, well, it’s in my experience a really nice way to find out some obvious errors that you made, which could otherwise bite you later on
  225. # [11:17] <anne> Seeing the feeds deployed on the web I think you do want "graceful" error handling
  226. # [11:17] <anne> Also seeing the XHTML deployed by the way... Event the application/xhtml+xml sites very often have character encoding issues.
  227. # [11:17] <anne> s/Event/Even/
  228. # [11:17] <Grauw> the problem is that if the user agent doesn’t implement the file format correctly, it doesn’t
  229. # [11:18] <Grauw> define error handling conditions as much as you want, a user agent that uses a regular expression to filter out the bits of an XML file still won’t get it right
  230. # [11:18] <anne> Regular expression?
  231. # [11:18] * anne doesn't follow
  232. # [11:18] <Grauw> there’s plenty of feed readers which do that
  233. # [11:19] * Quits: dbaron (dbaron@71.198.189.81) (Quit: 8403864 bytes have been tenured, next gc will be global.)
  234. # [11:19] <Grauw> if they would use a real parser (e.g. in Java or .NET or PHP), all of them break on error conditions
  235. # [11:19] <anne> Oh, I'm talking about the survey done on feeds some time ago and experience people involved in feeds have shared so far.
  236. # [11:20] <Grauw> if validators use a real XML processor, then feeds must be valid XML
  237. # [11:20] <Grauw> if they are not, then they are not an XML file format, but something like HTML
  238. # [11:20] <Grauw> the parsing rules would depend on the user agent
  239. # [11:20] <Grauw> not on any specification
  240. # [11:21] * Quits: ROBOd (robod@86.34.246.154) (Quit: http://www.robodesign.ro )
  241. # [11:21] <Grauw> if you define error handling cases for XML, the parsers that don’t get it right now, will still not get it right if you specify error handling cases (because they just do their own thing and don’t implement any specification)
  242. # [11:21] <Grauw> if they had done so in the first place, there would never be so many invalid feeds around on the net
  243. # [11:22] <Grauw> and anyway, sites who serve feeds can be notified of the problem, it’s a matter of evangelisation I’d say
  244. # [11:22] <Grauw> anyway, my point is: the XML error handling rules are VERY simple. Making them more complex isn’t going to get more UAs to implement them correctly. If anything, less.
  245. # [11:23] <anne> Graceful error handling isn't necessarily harder.
  246. # [11:23] <hsivonen> Grauw: having done standards evangelization for the Mozilla project long ago, my faith in the power of standards evangelism is not very high as a solution to problems like this
  247. # [11:23] <anne> I also share that concern.
  248. # [11:23] <Grauw> it doesn’t serve a purpose, if there is strict error handling then authors are forced to serve correct XML
  249. # [11:24] <anne> Especially with all the people advocating valid HTML code having an incorrect site themselves...
  250. # [11:24] <Grauw> if the XML is invalid then you cannot be sure the error handling corrects it properly
  251. # [11:24] <Grauw> I don’t :)
  252. # [11:24] <anne> You can never be sure.
  253. # [11:24] <Grauw> because I use XHTML at least I don’t make obvious mistakes like forgetting a </a> :)
  254. # [11:25] <anne> You also can't be sure the parser correctly rejected your content.
  255. # [11:25] <Grauw> it’s like a programming language: there being syntax checking and type checking doesn’t guarantee that your program is good or well-written
  256. # [11:25] <Grauw> it does help you catch a lot of obvious errors though
  257. # [11:25] <anne> Or did not reject your content when in fact it should. (A common problem with today's browsers.)
  258. # [11:26] <Grauw> on my website there’s in fact two layers of XML parsers that check it, the backend and the browser itself
  259. # [11:26] <anne> A conformance checker helps you with that and more.
  260. # [11:26] <anne> (I wasn't talking about your website specifically.)
  261. # [11:26] <Grauw> a conformance checker exists on an entirely different layer
  262. # [11:26] <Grauw> it isn’t immediate
  263. # [11:26] <anne> It could be
  264. # [11:26] <Grauw> people run it after developing the site as an afterthought
  265. # [11:26] <Grauw> and don’t run it again everytime new content is added
  266. # [11:27] <anne> That entirely depends on how you do things.
  267. # [11:27] <anne> If they don't, doing just syntax checking on new content doesn't help much.
  268. # [11:27] <anne> imo
  269. # [11:27] <Grauw> XML does it inherently
  270. # [11:27] <anne> No, XML only checks some syntax of which some isn't even enforced in all browsers.
  271. # [11:28] <anne> You can't really buy anything with that.
  272. # [11:28] <Grauw> besides, if the parser doesn’t reject the XML, it’s a parser bug. Defining graceful error handling rules isn’t going to improve that situation because the parser doesn’t follow the spec!
  273. # [11:28] <Grauw> browsers isn’t the only thing out there
  274. # [11:28] <anne> I'm not sure I ever argued otherwise.
  275. # [11:29] <Grauw> the fact that some browsers may implement some parts of XML badly doesn’t make it useless, or a failure
  276. # [11:29] <hsivonen> Grauw: I wouldn't be too surprised if even Xerces-J didn't properly enforce well-formedness for non-UTF-* documents
  277. # [11:29] <anne> It makes it a failure on lots of mobile content and lots of feed content.
  278. # [11:29] <anne> Sadly.
  279. # [11:30] <hsivonen> the thing is that XML works great as long as you don't expose authors to it or put it on the Web
  280. # [11:30] <Grauw> so pointing to bad implementations, how exactly does that make ‘graceful’ error handling better? an error is an error.
  281. # [11:30] <hsivonen> XHTML and RSS are the cases where people start to complain about this
  282. # [11:30] <Grauw> that’s a good thing
  283. # [11:30] <hsivonen> the enterprise integration use cases are fine
  284. # [11:30] <Grauw> it finally forces people to pay attention
  285. # [11:31] <anne> Also, again, there's not much point in enforcing strict syntax if you don't enforce all the other things a document needs to conform to just as strictly. (And that's impossible for XHTML for instance and any other language that allows embedding script.)
  286. # [11:31] <Grauw> I wouldn’t say there’s no point
  287. # [11:31] <Grauw> for one thing, XML lives on a different layer than the document language on top of it
  288. # [11:31] <Grauw> breaking XML well-formedness breaks any type of processor
  289. # [11:32] <Grauw> I mean, you don’t go blaming IPv4 for errors in HTML documents either
  290. # [11:32] <anne> Not for syntactically correct content, but yes.
  291. # [11:33] <anne> I would say that HTML syntax and XML syntax are on the same layer more or less...d
  292. # [11:33] <anne> s/...d/.../
  293. # [11:33] <Grauw> or rather, I do not expect IPv4 to be ‘lenient’ in some unpredictable way (and instead of rejecting bad packets routing them anywhere based on a guess) just because there are bad HTML documents out there
  294. # [11:33] <Grauw> syntax, yes
  295. # [11:33] <Grauw> SGML too
  296. # [11:33] <anne> AFAIK authors don't have to deal with IPv4 so that seems fine.
  297. # [11:34] <anne> SGML is pretty much dead.
  298. # [11:34] <Grauw> beside the point :)
  299. # [11:34] <Grauw> it’s on the same layer, HTML syntax is a variant of SGML and XML is a variant of SGML too
  300. # [11:34] <Grauw> however the language that uses XML (tons) or the language that uses HTML syntax (HTML itself) are on a higher layer
  301. # [11:35] <anne> XML is actually a subset iirc
  302. # [11:35] <Grauw> last I read it wasn’t entirely SGML-compatible
  303. # [11:35] <Grauw> but I might be mistaken, anyway, subset falls under the variant nomer I think :)
  304. # [11:36] <Grauw> higher layer, just like the Java API is on a higher layer of the Java syntax, yet it’s also part of the same language
  305. # [11:36] <anne> I'm not sure why "higher layer" matters here.
  306. # [11:36] <Grauw> s/of/than
  307. # [11:36] <anne> It's the syntax you use as author.
  308. # [11:36] <Grauw> XML is used in a lot more ways than HTML alone
  309. # [11:36] <anne> The language itself would actually be the "higher layer" imo...
  310. # [11:36] <Grauw> for a lot of applications of XML, strict error checking is simply appropriate
  311. # [11:36] <Grauw> for XHTML, strict error checking is a good thing as well I think
  312. # [11:37] <Grauw> it avoids a lot of problems
  313. # [11:37] <Grauw> not all, but a lot
  314. # [11:37] <anne> Which problems?
  315. # [11:37] <Grauw> forgetting an </a> tag
  316. # [11:37] <anne> It only avoids some syntax problems...
  317. # [11:37] <anne> Which are obvious when you make them anyway...
  318. # [11:37] <Grauw> I’ve seen plenty of sites / blog posts where an entire paragraph was a link because people forgot an </a>
  319. # [11:37] <hsivonen> Grauw: do you mean that there are cases when a DTD-valid XML document is not an SGML + Annex K document?
  320. # [11:38] <Grauw> hsivonen: you tell me, I don’t know, I just think I read it somewhere :)
  321. # [11:38] <anne> Grauw, prolly with an XHTML DOCTYPE
  322. # [11:38] <Grauw> or posts where the </ul> in a list was forgotten, causing the indenting to be wrong in my feed reader but not in browser X
  323. # [11:38] <Grauw> anne, what has doctype to do with it :)
  324. # [11:39] <anne> nm that
  325. # [11:39] <Grauw> except for determining ‘standards’ or quirks mode in IE
  326. # [11:39] <anne> I'd rather have the content as a user than some error in my face because the author just pressed publish and walked away.
  327. # [11:39] <Grauw> I understand that
  328. # [11:39] <anne> Or didn't check in all user agents...
  329. # [11:39] <Grauw> one sec
  330. # [11:39] <anne> (Which happens, I posted samples of that to my site.)
  331. # [11:40] <anne> Not just in Internet Explorer...
  332. # [11:42] <Grauw> IE doesn’t understand XHTML, so there’s no user agents on the web which don’t at least have a decent understanding of XML
  333. # [11:42] <anne> ?
  334. # [11:43] <Grauw> if you author XML, processed by an XML parser, then you should never be presented with a page that’s unreadable because the author didn’t verify the XML-well-formedness
  335. # [11:43] <Grauw> because the author’ll notice that his page is broken immediately
  336. # [11:44] <Grauw> or of course, really because his CMS ran it through a server-side XML parser to test the well-formedness and complains if it isn’t
  337. # [11:46] <Grauw> anyway, if you really want graceful error degradation, use HTML instead of XHTML. If you care about the same in feeds, well honestly, just use an XML parser and let the users complain to the author’s website to fix it :).
  338. # [11:46] <Grauw> I’ve done so to a site that newly had an XML feed implemented, and that worked just fine.
  339. # [11:46] <anne> I also care about it on mobile phones.
  340. # [11:46] <anne> I also think the character encoding issue would have had to be fixed by now in order for that to be feasible. But it isn't...
  341. # [11:47] <Grauw> I don’t think there’ll be documents that are authored on a mobile phone
  342. # [11:47] * Joins: hasather (hasather@81.235.209.174)
  343. # [11:47] <anne> Also, I think ease of authoring and users first demands graceful error handling.
  344. # [11:47] <anne> Grauw, you obviously don't work for the same company as I do
  345. # [11:47] <Grauw> If the desktop browsers are all strictly validating, and they don’t use some different ‘profile’ or something but just the same content, then they should be authored correctly even if some phones don’t have a proper XML parser
  346. # [11:48] <Grauw> I have worked with a lot of XML technologies though, and also on the support side of that.
  347. # [11:48] <anne> We are encountering those problems daily and are currently using heuristics (last I heard) to determine whether or not application/xhtml+xml can be processed as XML...
  348. # [11:48] <Grauw> Backbase processes all documents as XML, and the worst we get as support questions is like
  349. # [11:48] <Grauw> ‘why doesn’t &nbsp; work’ and we tell them to use ‘&#160;’ and they’re happy.
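
The `&nbsp;` versus `&#160;` point can be reproduced with any XML parser that has no DTD to consult. The sketch below uses Python's ElementTree; the helper name `parse_text` is made up for the example.

```python
import xml.etree.ElementTree as ET


def parse_text(markup: str) -> str:
    """Return the root element's text, or an 'error: ...' string on failure."""
    try:
        return ET.fromstring(markup).text
    except ET.ParseError as err:
        return f"error: {err}"


# &nbsp; is an HTML named entity; XML itself predefines only
# &lt; &gt; &amp; &apos; and &quot;, so this is an undefined entity.
named = parse_text("<p>a&nbsp;b</p>")

# A numeric character reference needs no declaration at all.
numeric = parse_text("<p>a&#160;b</p>")

print(named)
print(repr(numeric))
```
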
  350. # [11:49] * anne was already planning to extend the default set of entities
  351. # [11:49] <Grauw> I don’t see the need for those heuristics
  352. # [11:49] <Grauw> you are planning, huh :)
  353. # [11:50] <anne> Yeah, next semester I'll be working on XML2 as part of my university research project for my Bachelor degree.
  354. # [11:50] <anne> or XML5, haven't decided on a name yet
  355. # [11:50] <Lachy> oh, not another naming debate ;-)
  356. # [11:50] <Grauw> I think I could say I probably have more experience with XML technologies than you do, given that Backbase is based on XML, and your work mainly concerns HTML :)
  357. # [11:51] <anne> Lachy, just between me and my supervisor at the uni this time...
  358. # [11:51] <anne> Grauw, how do you know what my work concerns?
  359. # [11:52] <anne> But "XML technologies" sure...
  360. # [11:52] <Grauw> I don’t know, you started the ‘you obviously don’t work for the same company as I do’ path :)
  361. # [11:52] <hsivonen> anne: my supervisor first favored "Web Applications 1.0" but later agreed to putting "HTML5" in the title
  362. # [11:52] <anne> Grauw, I was just indicating that you can't know the problems we face with mobile content.
  363. # [11:53] <hsivonen> anne: do you mean the success story of XML for mobiles is breaking XML? :-)
  364. # [11:53] <Grauw> well I know of a lot of XML-related problems because Backbase is actually one of the few web technologies that uses XML
  365. # [11:53] <anne> hsivonen, sort of :)
  366. # [11:53] * anne likes that sentence
  367. # [11:54] <Grauw> and we don’t get many of them
  368. # [11:54] <Grauw> writing correct XML is actually extremely simple compared to writing correct HTML :)
  369. # [11:55] <Grauw> most people who don’t get it at first do get it soon thereafter
  370. # [11:56] <anne> That's good. I don't plan to change how you have to write XML.
  371. # [11:56] <Grauw> as for Opera, given that they have a fairly wide deployment, they are likely a tested platform for many mobile web sites (as they really test against platforms anyway), and it would be nice if they stood ground with regard to XML validity
  372. # [11:56] <anne> XML validity?
  373. # [11:56] <anne> hah
  374. # [11:56] <Grauw> well-formedness
  375. # [11:56] <Grauw> you get what I mean
  376. # [11:57] <anne> You mean namespace-well-formedness?
  377. # [11:57] <anne> As I said, we tried and failed.
  378. # [11:57] <Grauw> I don’t think I have ever heard of namespace-well-formedness so far
  379. # [11:57] <Grauw> Maybe Opera gave up too soon :)
  380. # [11:58] <anne> <test:test/> is well-formed but not namespace-well-formed
  381. # [11:58] <anne> You can speculate all you wish, but reality is what matters here.
  382. # [11:58] <anne> Oh, it's "namespace well-formedness"
  383. # [11:58] <anne> See http://www.w3.org/TR/xml-names/#ProcessorConformance
  384. # [11:58] <Grauw> Namespace well-formedness then, yes
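
The distinction anne draws with `<test:test/>` can be observed by running the same input through a parser with and without namespace processing. This is a minimal sketch using Python's `xml.parsers.expat`; the helper name `parses` and the namespace URI `http://example.org/ns` are placeholders chosen for the example.

```python
import xml.parsers.expat as expat


def parses(markup: str, namespaces: bool) -> bool:
    """Return True if expat accepts markup, with or without namespace processing."""
    # Passing a separator character switches expat's namespace processing on.
    parser = expat.ParserCreate(namespace_separator=" " if namespaces else None)
    try:
        parser.Parse(markup, True)
    except expat.ExpatError:
        return False
    return True


DOC = "<test:test/>"

plain = parses(DOC, namespaces=False)    # colon is just a name character here
ns_aware = parses(DOC, namespaces=True)  # prefix "test" is not bound to anything
bound = parses('<test:test xmlns:test="http://example.org/ns"/>', namespaces=True)

print(plain, ns_aware, bound)
```
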
  385. # [12:00] <Grauw> the reason there is invalid XML in the first place is that UAs give in to pressure like Opera does
  386. # [12:00] <Grauw> and many, many feed readers
  387. # [12:00] <anne> The problem is that people are not perfect.
  388. # [12:00] <anne> And make imperfect products.
  389. # [12:01] <anne> And we have to deal with that situation and not just pretend it doesn't exist.
  390. # [12:01] <Grauw> they seem to have no problem with well-formedness when creating a Java program
  391. # [12:01] <Grauw> So I don’t buy into the argument that people can’t help doing it wrong
  392. # [12:01] <anne> People have lots of problems with different C compilers every day...
  393. # [12:01] <Grauw> or that doing it right is so hard that it’s impossible
  394. # [12:01] <Grauw> that’s a very different issue
  395. # [12:01] <hsivonen> if there's some pressure point where I wish Opera had stood firm, it's that I think all browser vendors should have refused to implement XML 1.1
  396. # [12:01] <anne> I don't know enough about Java.
  397. # [12:01] <Grauw> but it would be great if C compilers would agree on one language
  398. # [12:02] <anne> hsivonen, I think I'll fold in all the good features from XML 1.1 into XML n
  399. # [12:02] <Grauw> XML 1.1 is XML 1.0 plus a wider unicode range, right?
  400. # [12:02] <anne> Plus at least one incompatible change...
  401. # [12:03] <anne> wider unicode range for element and attributes names btw
  402. # [12:03] <Grauw> yes
  403. # [12:03] <Grauw> the incompatible change being (if it’s easy to describe)?
  404. # [12:03] <hsivonen> Grauw: XML 1.1 allows a different arbitrary range of characters (yes, wider) in element and attribute names
  405. # [12:03] <anne> Grauw, the spec points them out
  406. # [12:03] <anne> some characters are no longer allowed in some productions
  407. # [12:03] <anne> and some others are
  408. # [12:03] <Grauw> aha
  409. # [12:04] <Grauw> well that’s weird, that they exclude characters that used to be allowed before
  410. # [12:04] <hsivonen> the path to crazy implementation cost and interop breakage is paved with i18n political correctness and IBM mainframe compat
  411. # [12:04] <anne> XML 1.1 is a classic example of versioning done wrong
  412. # [12:04] <Grauw> unless they were characters that could not actually be used in a regular document
  413. # [12:05] <Grauw> anyway, I need to go :)
  414. # [12:05] <Grauw> (again!)
  415. # [12:05] <Grauw> see you later
  416. # [12:05] <anne> heh
  417. # [12:05] <anne> bye
  418. # [12:05] <anne> http://www.w3.org/TR/xml11/#sec-xml11
  419. # [12:08] <anne> The main problem is the control character nonsense.
  420. # [12:08] <hsivonen> anne: what exactly is the nonsense in your opinion_
  421. # [12:08] <hsivonen> ?
  422. # [12:09] <anne> That you need to use character references and such...
  423. # [12:09] <anne> I don't think those changes make much of a problem though... Just allow everything and require #0 to be translated to #FFFD
  424. # [12:09] <anne> #x0 to #xFFFD
  425. # [12:10] <anne> So new parsers can just ignore the whole <?xml version=x?> thing and continue...
  426. # [12:11] <hsivonen> anne: would you allow #xFFFF?
  427. # [12:12] <anne> everything that should not be allowed for some good reason will be translated into #xFFFD during the input stream phase I suppose
  428. # [12:12] <hsivonen> anne: would #x0 be the only code point that you would not allow in the infoset?
  429. # [12:12] <anne> I'm obviously no expert in that
  430. # [12:16] <anne> hsivonen, it seems that's what HTML5 does
  431. # [12:16] <hsivonen> yes.
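
The idea of translating disallowed code points to #xFFFD during the input stream phase might be sketched as follows. This is an assumption-laden illustration: the function name `preprocess` is invented, and the default disallowed set contains only #x0 because that is the one code point the discussion settles on. Which other characters, if any, would be replaced is left open in the conversation.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def preprocess(stream: str, disallowed=frozenset({"\x00"})) -> str:
    """Replace every disallowed code point with U+FFFD before parsing proper."""
    return "".join(REPLACEMENT if ch in disallowed else ch for ch in stream)


# A NUL in the middle of the input is swapped out; everything else is untouched.
cleaned = preprocess("a\x00b")
print(repr(cleaned))
```

Because the substitution happens before tokenization, the later parsing stages never see the disallowed character and need no special error path for it.
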
  432. # [12:17] <hsivonen> in retrospect, I think it was a very bad idea for XML to arbitrarily limit the lexical space of identifiers
  433. # [12:17] <anne> Although HTML5 should probably deal with some characters such as those from windows-1252
  434. # [12:17] <hsivonen> but gotta run to an iki.fi meeting to lobby for a Jabber server
  435. # [12:17] <anne> bye
  436. # [12:27] * Quits: Grauw (ask@202.71.92.74) (Ping timeout)
  437. # [12:38] * Joins: ROBOd (robod@86.34.246.154)
  438. # [12:44] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
  439. # [12:46] * Joins: edas (edaspet@88.191.34.123)
  440. # [12:56] * Joins: gavin (gavin@74.103.208.221)
  441. # [13:07] * Quits: sbuluf (hm@200.49.140.156) (Ping timeout)
  442. # [14:59] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
  443. # [15:04] * Joins: gavin (gavin@74.103.208.221)
  444. # [15:16] * Quits: edas (edaspet@88.191.34.123) (Quit: Quitte)
  445. # [15:17] * Quits: marcos (chatzilla@203.206.31.102) (Ping timeout)
  446. # [15:37] * Joins: erik (erik@84.29.164.153)
  447. # [15:37] * Quits: erik (erik@84.29.164.153) (Quit: Bye bye)
  448. # [15:38] * Joins: erik (erik@84.29.164.153)
  449. # [15:39] * Joins: marcos___ (chatzilla@203.206.31.102)
  450. # [15:39] * marcos___ is now known as marcos
  451. # [15:44] * Quits: erik (erik@84.29.164.153) (Quit: Bye bye)
  452. # [15:53] * Joins: h3h (bfults@70.95.237.98)
  453. # [16:13] * Quits: h3h (bfults@70.95.237.98) (Quit: h3h)
  454. # [16:32] * Joins: Shunsuke (kuruma@219.110.80.235)
  455. # [17:07] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
  456. # [17:12] * Joins: gavin (gavin@74.103.208.221)
  457. # [17:16] * Joins: Grauw (ask@202.71.92.74)
  458. # [17:55] * Quits: Grauw (ask@202.71.92.74) (Ping timeout)
  459. # [17:57] * Joins: edas (edaspet@88.191.34.123)
  460. # [18:02] * Joins: h3h (bfults@70.95.237.98)
  461. # [18:08] * Quits: anne (annevk@81.68.67.12) (Ping timeout)
  462. # [18:16] * Joins: sbuluf (logp@200.49.140.131)
  463. # [18:18] * Parts: hasather (hasather@81.235.209.174)
  464. # [18:19] * Joins: cwahlers (Miranda@201.27.182.230)
  465. # [18:55] * Joins: st (st@62.234.155.214)
  466. # [19:04] * Joins: dbaron (dbaron@71.198.189.81)
  467. # [19:14] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
  468. # [19:19] * Joins: gavin (gavin@74.103.208.221)
  469. # [20:00] * Quits: edas (edaspet@88.191.34.123) (Ping timeout)
  470. # [20:12] * Joins: edas (edaspet@88.191.34.123)
  471. # [21:01] * Quits: dbaron (dbaron@71.198.189.81) (Quit: 8403864 bytes have been tenured, next gc will be global.)
  472. # [21:10] * Quits: Shunsuke (kuruma@219.110.80.235) (Quit: See you...)
  473. # [21:15] * Quits: edas (edaspet@88.191.34.123) (Ping timeout)
  474. # [21:21] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
  475. # [21:26] * Joins: gavin (gavin@74.103.208.221)
  476. # [21:58] * Joins: hasather (hasather@81.235.209.174)
  477. # [22:04] * Joins: edas (edaspet@88.191.34.123)
  478. # [22:09] * Quits: cwahlers (Miranda@201.27.182.230) (Connection reset by peer)
  479. # [22:28] * Quits: ROBOd (robod@86.34.246.154) (Quit: http://www.robodesign.ro )
  480. # [22:31] * Quits: h3h (bfults@70.95.237.98) (Quit: h3h)
  481. # [22:32] * Joins: h3h_ (bfults@70.95.237.98)
  482. # [22:32] * Quits: h3h_ (bfults@70.95.237.98) (Quit: h3h_)
  483. # [22:39] * Quits: edas (edaspet@88.191.34.123) (Quit: http://eric.daspet.name/ et l'édition 2007 de http://www.paris-web.fr/ )
  484. # [22:42] * Quits: hasather (hasather@81.235.209.174) (Client exited)
  485. # [22:43] * Joins: hasather (hasather@81.235.209.174)
  486. # [23:28] * Quits: gavin (gavin@74.103.208.221) (Ping timeout)
  487. # [23:33] * Joins: gavin (gavin@74.103.208.221)
  488. # Session Close: Sun Apr 01 00:00:00 2007
