From: aray@nyc.pipeline.com (Arjun Ray)
Newsgroups: comp.text.sgml
Subject: Re: DTD for *incorrect* HTML
Date: 15 Feb 1996 21:58:34 -0500
Organization: FUDGE Dispersal Systems
Message-ID: <4g0rsq$ncf@alpha.nyc.pipeline.com>
References: \ <311E2EAB.14CC@passage.com> \ <3123B24A.2BA0@passage.com>

In article <3123B24A.2BA0@passage.com>, W. Eliot Kimber \ writes:
>>
>> In article <311E2EAB.14CC@passage.com>
>>"W. Eliot Kimber" \ writes:
>> > Joe Armstrong wrote:
>> > >
>> > > Is there a DTD for flawed HTML.
>> >
>> > \> > \
>> > ]>

Close, but it still doesn't capture the idea of "tag salad" which I believe underlies Joe Armstrong's question.

>I want to apologize for my post, which was a bit snotty and made in
>haste under the influence of insufficient sleep and excess stress (no
>excuse).

Snotty? If you say so. I found it refreshingly direct. People who validate against the "Mozilla DTD" at Webtechs are deluding themselves, IMHO, if they think they've achieved anything beyond a perfunctory observance of "fashionable" formalities. Their documents are no more valid with the exercise than they remain invalid without.

>My proposal was not intended to be that serious, although given that
>most elements in HTML are not containers, it almost works as is. You'd
>also have to make the content models of all the containers ANY as well
>for it to actually allow you to parse anything.

True, but isn't this taking the view that we need to explicate the structure (if any) of HTML documents that the current crop of popular browsers appear willing to accept? i.e., without necessarily describing browser/user-agent behavior per se, but assuming that their parsing algorithms are directed at *some* abstract data structure, how should *that* a.d.s be represented in SGML (or could it be)?

Viewing Joe Armstrong's question in this light -- "What DTD, if any, could the current crop of popular browsers be viewed as agents for?" -- the real answer hinges on whether *any* HTML element "should" be a container at all. Certainly the Mosaic-spawn pay no attention. They even subscribe to a different terminology (tags are "marks" and attributes are "tags".) We are even afforded a "reference implementation":

    \

In particular, the file HTMLparse.c and the get_mark() subroutine. It warrants careful study. For instance, when someone composes

    ...\abcd\efgh\ijkl\...

he really means something like this ("first bold, then also italic, and then only italic...")

    ...\abcd\efgh\ijkl\...

and that's what this implementation *wants* him to mean. The extent to which people are willing (or habituated) to "learn from the implementation" shouldn't be underestimated, lest wishful thinking obscure facts.

Thus, consider this exercise. Take a document with allegedly HTML markup. Preprocess it via any regular-expression capable processor to effect the following substitutions:

    STAGO (`<' in context) ===> `\ `\

with the URL or suitable transformation thereof to point to a single line:

    \

And you're ready for validation of "flawed HTML".

Travesty? No, just the true gauge of "support for open standards" and sundry other feel-goodisms. And please don't ask how Netscape proposed content providers should "hide" Javascript code from other browsers...

Cheers,
ar

From: tore@lis.pitt.edu (Tore Joergensen)
Newsgroups: alt.etext,alt.hypertext,comp.publish.prepress,comp.multimedia,comp.publish.electronic.misc,comp.text.sgml,comp.text
Subject: Re: --> Press Release from OmniMedia on CDA of Telecom Reform Act
Followup-To: alt.etext,alt.hypertext,comp.publish.prepress,comp.multimedia,comp.publish.electronic.misc,comp.text.sgml,comp.text
Date: 16 Feb 1996 16:26:07 GMT
Organization: University of Pittsburgh
Message-ID: <4g2b6v$dsq@toads.pgh.pa.us>
References: \

Jon Noring (noring@netcom.com) wrote [lots of stuff deleted]:
: And all the while the U.S. publishing industry is restricted to marketing
: books the "old-fashioned way", our European and Japanese friends will not be
: so restricted. They will enjoy the ability and advantages of marketing
: directly over the Internet in their own countries and so will clearly dominate
: there as well as developing the infrastructure to do so, getting a large
: technological jump over U.S. publishers. How this will affect the current
: supremacy the U.S. has in publishing is debatable, since it can be argued that
: it is still a level playing field -- the U.S. publishers can market their
: e-books over the Internet outside the U.S., and the rest of the world has to
: market their e-books the "old-fashioned way" within U.S. borders.

If a person makes a Web page in a European country which markets a book in an "indecent" way, it will be possible for a US citizen to read that page. The US may choose to prosecute that person if (s)he enters the US, but no country in Europe will send a person to the US to be prosecuted for publishing something from a European site. The Internet does not belong to the US, and other countries rule by their own laws :-).

I'm not sure what will happen if a person sends "indecent" material by e-mail to a US citizen, but "passive" marketing (which I think will include posting on newsgroups) will not give a European citizen any problem (as long as (s)he doesn't enter the US).

The only way the US can "protect" US citizens against "indecent" material is to filter all communication from the rest of the world, or make it illegal for US citizens to read/view "indecent" material (which really will be an impressive level of "protection").

I hope you will be able to get rid of that law. It s**ks! (protection stars added for innocent people :-)
--
+-------------------------+-------------------------------------------+
| Tore B. Joergensen      | e-mail : tore@lis.pitt.edu                |
| Centre Court Villa      | web    : http://www.pitt.edu/~tojst1      |
| 5535 Centre Avenue # 6  |                                           |
| Pgh, PA 15232, USA      | Norwegian MSIS-student at Univ. of Pgh.   |
+-------------------------+-------------------------------------------+

Newsgroups: comp.text.sgml
Date: Fri, 16 Feb 1996 17:08:12 -0600
Message-ID: <9602162308.AA08809@fly.HiWAAY.net>
From: Len Bullard \
References: \ <311E2EAB.14CC@passage.com> \ <3123B24A.2BA0@passage.com> <4g0rsq$ncf@alpha.nyc.pipeline.com>
Subject: Re: DTD for *incorrect* HTML

[Arjun Ray]
>Travesty? No, just the true gauge of "support for open standards" and
>sundry other feel-goodisms. And please don't ask how Netscape proposed
>content providers should "hide" Javascript code from other browsers...

We are actually puzzling over that one. Why was it considered a good idea to use an empty \