Newsgroups: comp.text.sgml
Date: 01 Sep 1994 00:53:51 UT
From: "W. Eliot Kimber" \
Organization: Passage Systems, Inc.
Message-ID: \
References: <33v8g7$ksc@urmel.informatik.rwth-aachen.de>
Subject: Re: HyTime problems

[Jens Meggers]
| Working on a HyTime conforming DTD to represent Hypermedia-Mail, I
| stumbled across some problems:
|
| 1. Using FCSLOCK with a !percent view!:
| I want to use FCSLOCK to pick up a part of some graphic objects
| (like JPEG). To address those parts in percentage terms I have to define an
| FCS with the dimension 100 x 100.
| In some HyTime examples the FCS is only defined in the DTD and there is
| no FCS element within the document instance. To force no error during the
| SGML parsing process, the attribute IMPFCS IDREF #REQUIRED is changed to
| IMPFCS NAME #REQUIRED. This seems to be a very incorrect solution.

In fact, the declared value prescription of "IDREF" for IMPFCS in the standard is a typo, corrected to "NAME" in the Catalog of Architectural Forms. This is because you are specifying the *element type name* of the FCS-form element used to define the FCS, not the ID of an instance of the FCS-form element. (Just as the axisdef attribute of the FCS element form refers to the element types of AXIS-form elements.)

| The problem is, that if you would put the FCS in the document instance,
| you have to generate a minimum of one event (\, but there is no use to define an instance of
| the percent-fcs, if it is only constructed for fcslock.

Ah, but the evsched element form has a content model of (evgrp|event)*, so you can have an empty event schedule.

| 2. Using FCSLOCK to pick up a range of video-frames:
| is it correct, that there is no way to pick up a range of video-frames
| from different videos with the same FCS(lock) ?
| I think, that axis-dimension of the fcs, used in fcslock, have to be
| changed for each video with different length.
I'm not sure what you mean, but it would depend on how the location source for the FCSLoc element was defined. Normally, a single video object would be the location source, e.g.: \ ]> \ \\Video1\\ \ \\1 5\\\ \ ... \

\Click here to see some video\ ... \ However, if you are expecting support for the multloc option, the location source for the fcsloc could be a multiple location consisting of several video objects, which would be treated as a single object from which you could select multiple ranges by specifying multiple dimension specifications within each dimension list. You could also locate multiple ranges of frames by using multiple FCSLoc elements and then locating them via a nameloc, e.g.: \ \ ]> \ \\Video1\\ \\Video2\\ \ \\1 5\\\ \ \ \\10 4\\\ \ \ \frame-range-1 frame-range-2\ \ ... \

\Click here to see some video\ ... \ | 3. The dimspec order of the extlist in the FCSLOCK content: | does the order of the fcs axis-definition determine the order of the | dimspecs in the fcslock-content ? Yes. Note that in a real application, you could define different element types for the dimlist or dimspec elements to make their association to particular axes clear and consistent: \ \ \ \ | 4. How can I describe this Situation: | \ | Can I use a hyperlink to start an event ? You can use a hyperlink to make an object the content of an event using the accessed anchor link feature. Accessed anchor link lets you say "when an object is traversed to as the anchor of a particular hyperlink, consider it to be contained in this event." This allows you to model dynamic behavior by using hyperlinks to describe "state transitions" (e.g., traversing a hyperlink is the act of moving from one state to the next). For the accessed anchor feature, you can specify either a link type or a link instance as the thing that does the controlling. In the case of starting a particular event, you would probably key off of a specific hyperlink instance, for example: \ \ \ \ \ ]> \ \\Video1\\ \ \ \ \\ \ \ \ \\ \ \1 200\ \1 200\ \ \ \ \ \ ... \

\ \This is a graphic\ \ In a real application, the event schedule that defines the video display might be defined centrally and re-used. You might also use the accessed link element type feature instead of the accessed link feature so that any ShowObjectLink that was traversed would result in the object traversed to being shown. Also, this example looks more verbose than it needs to be because I've shown all the elements and not taken advantage of minimization. Note that you don't necessarily need an event schedule for this sort of function, as your application could define the meaning of traversing to a particular anchor of a particular link type to mean "present the anchor in a window". For example, DynaText provides a way to specify this behavior in its own style language. However, using the FCS is a more generic way to do it, providing a more interchangeable specification of the author's intent. -- \

W. Eliot Kimber (kimber@passage.com)
Systems Analyst and HyTime Consultant
Passage Systems, Inc., 9971 Quail Blvd., Suite 903, Austin TX 78758 +1 512 339 1400
465 Fairchild Dr., Suite 201, Mountain View, CA 94043, +1 415 390 0911
\

Newsgroups: comp.text.sgml
Date: 01 Sep 1994 05:04:41 UT
From: Novell Epub \
Organization: Novell Electronic Publishing
Message-ID: \
Subject: Job opening at Novell

[This notice was posted last week with replies directed to Wayne Taylor at Novell. We have reason to suspect that Wayne's mail connection may be acting up, so I am reposting this. Please direct replies to \ this time. If you replied to the earlier post, please resend your reply to the new address. -- Jon Bosak, Novell Corporate Publishing Services]

Novell, a leading worldwide computer networking company, has the following opportunity:

Sr. Software Engineer/Hypertext Specialist

Duties include:

- Research, specify, develop and maintain our online document preparation tools.
- Assist business units in the specification of hardware to support authoring, SGML conversion and preparation tools.
- Support authoring groups in troubleshooting problems that arise in the preparation of documents for online delivery.
- Communicate with the development and support staffs of our tools vendors.
- Assist business units in negotiating and managing conversion outsourcing and other contract work related to online document preparation.
- Monitor and participate in the development of industry standards for electronic document delivery.
- Assist the department head in researching and developing our corporate information delivery strategy.

Skills Needed:

- Significant C or C++ programming abilities in UNIX, Windows, DOS, or OS/2
- Significant experience working with SGML

To be considered for this position, please send your resume in ASCII or uuencoded PostScript to novlepub@netcom.com.

Jon Bosak
Novell Corporate Publishing Services

Newsgroups: comp.text.sgml
Date: 01 Sep 1994 08:26:25 UT
From: Ernie Quah Cheng Hai \
Organization: ncb
Message-ID: \
Subject: SGML Conference

SGML Asia Pacific

For interested readers: the above conference will be held in Singapore from 10-12 Oct 94.
Dr Charles Goldfarb will be one of the keynote speakers. It's not free !!! 1st participant US$800, 2nd US$600, 3rd US$400. Call +1 703 519 8160 for registration.

Newsgroups: comp.text.sgml
Date: 01 Sep 1994 08:52:37 UT
From: Michael Schwantner \
Organization: Fachinformationszentrum Karlsruhe
Message-ID: <85905A7945@pc>
References: <1994Aug28.062554.1@east.pima.edu>
Subject: Re: TEI info, please

[Gloria McMillan]
| I was just wondering if there is a second usenet group on TEI. I am in
| writing and literature and have been reading about TEI. I have lots of
| questions, but don't want to post them to the wrong group.
|
| Does a separate TEI news group exist?

There is a TEI mailing list. Send a mail to listserv@uicvm.uic.edu. Leave the subject blank and, in the message body, write:

subscribe tei-l \ \

Here is an excerpt from the welcome message:

"I would like to welcome you as a subscriber to the electronic discussion group, TEI-L, at the University of Illinois at Chicago. The Text Encoding Initiative (TEI) is an international project to establish guidelines for the encoding of machine-readable textual material for research; this discussion group has been set up in order to disseminate information about the TEI and to enable discussion of the TEI guidelines while they are under development. We hope that the TEI-L list will prove useful in providing information and encouraging discussion as the TEI progresses toward completion and formal publication of the TEI Guidelines For Text Encoding and Interchange. ..."

Newsgroups: comp.text.sgml
Date: 01 Sep 1994 09:36:31 UT
From: Steinar Bang \
Organization: Falch Hurtigtrykk, Oslo, Norway
Message-ID: \
References: <33v8g7$ksc@urmel.informatik.rwth-aachen.de>
Subject: "Was: HyTime problems" \

[Jens Meggers]
| Working on a HyTime conforming DTD to represent Hypermedia-Mail, ...

What is "hypermedia-mail"? Just curious.

- Steinar
--
\ \Support your local \ \DoD\\ chapter.
\

Newsgroups: comp.text.sgml
Date: 01 Sep 1994 12:22:16 UT
From: Dave Peterson \
Organization: ACM Network Services
Message-ID: <344h1o$55h@hopper.acm.org>
References: <342me6INNn9i@moon.cis.ohio-state.edu>
Subject: Re: What is the relationship between entities and attributes?

[James Webster Saunders]
| I came across the following reference in "Practical SGML":
|
| \
|
| where FILE1 and FILE2 are entities. In this case, the entities are
| external files containing the tag data. If a tagged document were
| being parsed without a DTD and it contained such a reference, how would
| one distinguish that this value is indeed a list of entities?

"Parsing without a DTD" sounds akin to compiling without a programming language specified -- how would you compile a FORTRAN program without specifying that it's FORTRAN?

I think you need to explain in more detail what you want to accomplish.

Dave Peterson
SGMLWorks!
davep@acm.org

Newsgroups: comp.text.sgml
Date: 01 Sep 1994 12:27:46 UT
From: Arjan Loeffen \
Message-ID: <52070.loeffen@ruulet.let.ruu.nl>
Subject: Search

Dear reader,

I'd be interested to hear how available SGML editing environments perform a search/replace. Do you know of an editor that allows you to search across element boundaries, as in the following example: \

Therefore we \must find such a person. (search for "we must find" is successful). and \

Therefore we\I mean our department. must find such a person. (search for "we must find" is successful).

I'd like to hear how these programs deal with this, especially if contexts or types of elements are checked to decide if the elements should be ignored/included there.

Thanks in advance.

Arjan.
--
Arjan Loeffen          Achter de Dom 22-24  ++31+30536417 voice work
Faculty of Arts        3512JP Utrecht       ++31+206623817 voice home
University of Utrecht  The Netherlands      ++31+309221 fax work

Newsgroups: comp.lang.smalltalk,comp.text.sgml
Date: 01 Sep 1994 12:47:15 UT
From: Hasko Heinecke \
Organization: Georg Heeg Objektorientierte Systeme, Dortmund, FRG
Message-ID: \
References: \
Subject: Re: Smalltalk tools for SGML

[Joe Berkovitz]
| Does anyone know of any Smalltalk tools out there for parsing or
| otherwise processing SGML input? Any pointers would be greatly
| appreciated!

Georg Heeg - Objektorientierte Systeme, the company I work for, is currently developing an SGML parser for ParcPlace's Smalltalk. For more information, please contact me via phone, fax, or email.

Hasko Heinecke
Georg Heeg - Objektorientierte Systeme
Baroper Str. 337
D-44227 Dortmund
Germany
Tel: +49-231-97599-0
Fax: +49-231-97599-20
Email: hasko@heeg.de, info@heeg.de
--
+-------------------------------------------------------+
| Hasko Heinecke speaking for myself only               |
| I _never_ mean what I say - and nobody else does...   |
+-------------------------------------------------------+

Newsgroups: comp.text.sgml
Date: 01 Sep 1994 15:45:14 UT
From: James Saunders \
Organization: the ohio state university
Message-ID: \
References: <342me6INNn9i@moon.cis.ohio-state.edu> <344h1o$55h@hopper.acm.org>
Subject: Re: What is the relationship between entities and attributes?

[Dave Peterson]
| "parsing without a DTD" sounds akin to compiling without a programming
| language specified--how would you compile a FORTRAN program without
| specifying that it's FORTRAN?
|
| I think you need to explain in more detail what you want to accomplish.

We are involved in a project that allows construction of a DTD for a collection of tagged documents that do not have a DTD. As such, this is not an easily solved problem. My question was: what information contained in a tag's attribute, or in other tags in the same document, allows you, without knowing the DTD, to determine that the attribute's declared value is CDATA, ENTITY, ENTITIES, etc.?

i.e.
\ pic12 is an entity
\ "100m 200m" is not an entity

--
James W. Saunders
Research Assistant | Graduate Student
Office of Research | Dept. of Geography
OCLC               | OSU
saunders@oclc.org  | jsaunder@magnus.acs.ohio-state.edu

Newsgroups: comp.text.sgml
Date: 01 Sep 1994 16:21:08 UT
From: David Megginson \
Organization: Department of English, University of Ottawa
Message-ID: \
References: <342me6INNn9i@moon.cis.ohio-state.edu>
Subject: Re: What is the relationship between entities and attributes?

[James Webster Saunders]
| I came across the following reference in "Practical SGML":
|
| \
|
| where FILE1 and FILE2 are entities. In this case, the entities are
| external files containing the tag data. If a tagged document were
| being parsed without a DTD and it contained such a reference, how would
| one distinguish that this value is indeed a list of entities?

Exactly. That is why SGML files are incomplete without a DTD. By the same token, how could you tell that \&FILE2; was an external data entity without a declaration in the DTD to guide you?

| There seems to be a tremendous amount of flexibility in SGML to specify
| things as attributes. In this example, however, this seems to lead to
| ambiguity in the tagging as to what is contained in the attributes
| declared value. Any enlightenment on this issue would be appreciated.

That's why we declare notations, entities, and attribute lists in the DTD, so that the processing software will know _exactly_ what to do in a case like this.
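To make the ambiguity concrete, here is a minimal sketch, assuming the reference concrete syntax (names built from letters, digits, "." and "-") and ignoring NAMELEN limits, case folding, NOTATION and name token groups. It lists which declared values an attribute value could lexically satisfy. The function name `candidate_declared_values` is hypothetical, and Python is used only for illustration; this is not how any particular DTD-inference tool works.

```python
import re

# Lexical forms, assuming the reference concrete syntax.
TOKEN_FORMS = {
    "NAME": re.compile(r"[A-Za-z][A-Za-z0-9.-]*"),
    "NUMBER": re.compile(r"[0-9]+"),
    "NMTOKEN": re.compile(r"[A-Za-z0-9.-]+"),
    "NUTOKEN": re.compile(r"[0-9][A-Za-z0-9.-]*"),
}

# Declared values sharing each lexical form: (single-token, list) variants.
DECLARED = {
    "NAME": (["NAME", "ID", "IDREF", "ENTITY"], ["NAMES", "IDREFS", "ENTITIES"]),
    "NUMBER": (["NUMBER"], ["NUMBERS"]),
    "NMTOKEN": (["NMTOKEN"], ["NMTOKENS"]),
    "NUTOKEN": (["NUTOKEN"], ["NUTOKENS"]),
}


def candidate_declared_values(value):
    """Return every declared value the attribute value could lexically be."""
    tokens = value.split()
    candidates = ["CDATA"]  # CDATA accepts any string
    if not tokens:
        return candidates
    for form, pattern in TOKEN_FORMS.items():
        if all(pattern.fullmatch(token) for token in tokens):
            singles, lists = DECLARED[form]
            if len(tokens) == 1:
                candidates += singles
            candidates += lists
    return candidates


if __name__ == "__main__":
    print(candidate_declared_values("pic12"))
    print(candidate_declared_values("100m 200m"))
```

Under these assumptions "100m 200m" can be ruled out as ENTITIES, because its tokens do not start with a letter, but "pic12" remains equally plausible as NAME, ID, IDREF, ENTITY, NMTOKEN or CDATA. The lexical form can exclude a declared value; it cannot confirm one.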
Just out of curiosity, why _do_ you need to parse the document without a DTD?

David
--
David Megginson
Department of English, University of Ottawa,  dmeggins@aix1.uottawa.ca
Ottawa, Ontario, CANADA K1N 6N5               dmeggins@acadvm1.uottawa.ca
Phone: +1 613 564 6850 (Office)               ak117@freenet.carleton.ca
       +1 613 564 9175 (FAX)

Newsgroups: comp.text.sgml
Date: 01 Sep 1994 18:09:13 UT
From: Bob Agnew \
Organization: Science Applications International Corp.
Message-ID: <1994Sep1.180913.5161@ast.saic.com>
References: \
Subject: Re: Is #CURRENT a good thing? / TEI gripe

[David Megginson]
| Exactly -- it would be slightly more useful that way, but not much. I
| imagine that it was included in the (SGML) standard, along with a few
| other poorly-thought-out features (RANK, DATATAG, etc), because of
| pressure from an older generation of computer hacks, trained in the
| days when saving ten bytes from a file (at the cost of transparency and
| easy maintenance) could actually matter. I was disappointed to find
| TEI using #CURRENT, but I would like to emphasise that this is a very
| small disappointment with an otherwise outstanding piece of
| collaborative work.

Your use of the word "imagine" seems to be accurate here, but that's about all. The first sentence of paragraph 4.6.5 (RANK Feature) in the handbook begins: "RANK is a concession to application design practices in the early days of generic coding......" Not much imagination required there.

Your use of the phrase "poorly-thought-out" is, I feel, unkind and inappropriate. The purpose of #CURRENT is clearly stated in the standard: to provide a means for inheriting a default from a parent element.

With respect to the misuse of #CURRENT to determine the current level of an element, I should like to point out that the level of an element is defined as the RANK of the element in the standard and should be readily available to an application from any well designed parser.
I have found from experience that the handling of "RANK" or level in the outspec dtd by means of FOSIs is woefully inadequate. There, the context must be explicitly stated for nested elements, e.g., context="item seqlist item seqlist". The "RANK" or level must be maintained by the FOSI programmer, and the only operations that may be done on the counter are to reset it or increment it. No counter arithmetic or logic is provided. Counters may be used to construct strings for labeling or TOCs, and non-numeric schemes such as UC Roman or LC alpha are supported.

My point is that the parser can easily determine the level or rank of each element and should make it easily available to the application.

--
"One man's syntax is another man's semantics."

Newsgroups: comp.text.sgml
Date: 01 Sep 1994 18:17:53 UT
From: Joel Finkle \
Organization: Searle R&D
Message-ID: <1994Sep1.181753.19063@tin.monsanto.com>
References: <342gak$bnt@finnegan.iol.ie>
Subject: Re: SGML to postscript/PDF

[Steve Pepper]
| The answer depends a lot on your particular requirements. Any of the
| "quasi-WYSIWYG" SGML-editors with support for PostScript printers will
| allow you to produce PostScript output directly from SGML - once you
| have set up your style sheets, of course. But they only offer fairly
| basic formatting capabilities - e.g., no hyphenation. Examples:
| Author/Editor, Adept Editor.

One added value of a "direct" SGML to PDF output is retention of the structure within the PDF file. This can be done by implementing PDF bookmarks, links, and annotations. Through the use of the Acrobat Distiller, "pdfmark" PostScript commands may already be embedded in a PostScript document to implement these features. All it takes is a PostScript output that will embed these codes in the file.

When I spoke with someone at Adobe some months ago, he said that they would like to be able to retain SGML tagging within a PDF document.
I expect that the pdfmark is the first step toward that, but it needs a tremendous enhancement.

Joel
--
Joel Finkle
Searle R&D
jjfink@skcla.monsanto.com
"And when I die don't bury me / in a box in a cemetery. Out in the garden would be much better, I could be pushin' up home grown tomatoes" -- Guy Clark, "Home Grown Tomatoes"

Newsgroups: comp.text.sgml
Date: 01 Sep 1994 19:15:19 UT
From: Sean Mc Grath \
Organization: Ireland On-Line
Message-ID: <345987$5g9@finnegan.iol.ie>
Subject: Re: SGML to Postscript/PDF

[Jeffrey McArthur]
| How direct is *direct*. With TeX, it is possible to feed your SGML
| file directly into TeX (with a proper set of macro loaded) and output a
| DVI file.

[Steve Pepper]
| Hope this helps. If not, perhaps you need to bore us with the details
| of your reasons for needing to go direct from SGML to PostScript.

Thanks for both replies...

My situation is that a large collection of SGML documents exists on an unknown platform, on which my transformation from SGML to .PS or .PDF format must take place. Given that I do not know *anything* about the platform as yet, I was hoping to avoid concerns about whether or not third-party programs (such as TeX) will run on the target platform by seeing if I can go direct.

My needs in terms of formatting, it must be said, are pretty basic. The odd moveto/lineto sprinkled amongst text of at most three fonts in a handful of point sizes - that is it. If I could coerce the client into living with mono-spaced fonts my problems would be solved, but I doubt if he will go for that :-)

I cannot help feeling that someone, somewhere has written a nice C function library with function calls like :-

newpage() - Start new postscript page
setfont(x) - Select font
boldon() - Turn on bold face
centretext("Hello World") - Display text centered on the current line
Para ("I am a paragraph") - Output text as a paragraph, wrapping text as required

I have that awful "I'm missing something here" feeling.
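A minimal sketch of the kind of interface described above, written in Python rather than C purely for brevity. It is an illustration under stated assumptions, not an existing library: the class name `PSWriter`, the US Letter page size, the one-inch margins and the crude line-wrapping estimate (average glyph width of half the point size) are all hypothetical choices. Centring is left to the PostScript interpreter via the standard `stringwidth` operator, so no font metrics are needed for it.

```python
import textwrap

PAGE_WIDTH, PAGE_HEIGHT, MARGIN = 612, 792, 72  # US Letter in points (assumed)


def ps_escape(text):
    """Escape the characters that are special inside a PostScript string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


class PSWriter:
    """Hypothetical helper that accumulates PostScript text commands."""

    def __init__(self, font="Times-Roman", size=12):
        self.out = ["%!PS"]
        self.font, self.size = font, size
        self.y = PAGE_HEIGHT - MARGIN
        self.page_open = False

    def _select_font(self):
        self.out.append(f"/{self.font} findfont {self.size} scalefont setfont")

    def newpage(self):
        if self.page_open:
            self.out.append("showpage")  # finish the previous page
        self.page_open = True
        self.y = PAGE_HEIGHT - MARGIN
        self._select_font()  # re-select so each page stands alone

    def setfont(self, font, size):
        self.font, self.size = font, size
        self._select_font()

    def _next_line(self):
        leading = round(self.size * 1.2)
        if not self.page_open or self.y - leading < MARGIN:
            self.newpage()
        self.y -= leading

    def centretext(self, text):
        self._next_line()
        # x = page centre - half the string width, computed by the interpreter
        self.out.append(
            f"({ps_escape(text)}) dup stringwidth pop 2 div "
            f"{PAGE_WIDTH // 2} exch sub {self.y} moveto show"
        )

    def para(self, text):
        # Crude estimate: assume an average glyph is half the point size wide.
        chars = max(1, int((PAGE_WIDTH - 2 * MARGIN) / (0.5 * self.size)))
        for line in textwrap.wrap(text, chars):
            self._next_line()
            self.out.append(f"{MARGIN} {self.y} moveto ({ps_escape(line)}) show")

    def finish(self):
        if self.page_open:
            self.out.append("showpage")
            self.page_open = False
        return "\n".join(self.out) + "\n"


if __name__ == "__main__":
    doc = PSWriter()
    doc.newpage()
    doc.centretext("Hello World")
    doc.para("I am a paragraph")
    print(doc.finish(), end="")
```

In this sketch bold face is just another font name passed to `setfont`, for example `Times-Bold`, so a separate `boldon()` call is not modelled. Proper paragraph justification with proportional fonts would need real font metrics, which is where a formatter such as TeX earns its keep.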
Can anyone put me out of my misery?

Regards,
--
Sean Mc Grath
digitome@iol.ie
Digitome Ltd., Ballina, Co. Mayo, Ireland
Tel: +353 96 72092

Newsgroups: alt.culture.usenet,alt.culture.internet,news.misc,comp.infosystems.www.misc,comp.text,comp.text.sgml
Date: 01 Sep 1994 19:32:13 UT
From: Cameron Laird \
Organization: NeoSoft Internet Services +1 713 684 5969
Message-ID: <345a7t$480@Starbase.NeoSoft.COM>
References: <344tf2$ks7@Starbase.NeoSoft.COM> \
Subject: Re: The construction of FAQs

[Cameron Laird]
| Here's the question: is there any good reason not to construct all FAQs
| from now on in HTML, rather than plaintext? More generally,

[Peter N. M. Hansteen]
| Try this: a significant number of people (I myself am one) use rather
| traditional text-based tools to browse newsgroups, and either do not
| have easy access to html or (like myself) read news and associated
| documents off-line. This in turn means that the usefulness of live
| links to other documents etc which html is famous for is greatly
| reduced. [other apt observations] . . .

One of my realizations for the week is that HTML is not just about live links; its formatting or stylistic standards define an advance over the character-based plaintext with which we're all familiar. HTML is not the best language for quasi-static distribution of documents (PostScript is one example of a better one), but I'm thinking of these advantages, in the context of FAQs, which are written by few, and read by many:

1. HTML browsers are easily available;
2. knowledge of HTML will diffuse explosively on the back of the WWW; and
3. the references in an HTMLized document are more likely to be accurate.

This last point is just a commonplace of software engineering; if a URL in a plaintext document is mistyped, people using it will correct the spelling on the fly while tapping out their own ftp requests. On the other hand, a live URL that's wrong will quickly reveal itself, and invite correction.
Thus, an HTMLized FAQ will have fewer faults of this sort, which is a benefit even to the people who don't have a live connection.

--
Cameron Laird ftp://ftp.neosoft.com/pub/users/claird/home.html
claird@Neosoft.com (claird%Neosoft.com@uunet.uu.net) +1 713 267 7966
claird@litwin.com (claird%litwin.com@uunet.uu.net) +1 713 996 8546

Newsgroups: comp.text.sgml
Date: 01 Sep 1994 19:48:18 UT
From: Bob Agnew \
Organization: Science Applications International Corp.
Message-ID: <1994Sep1.194818.21381@ast.saic.com>
References: \
Subject: Re: What is the relationship between entiti

[James Saunders]
| We are involved in a project that allows construction of a DTD for a
| collection of tagged documents that do not have a DTD. As such, this
| is not an easily solved problem. My question was, what information
| that is contained in a tags attribute, or in other tags in the same
| document, allows you, without knowing the DTD, to determine that the
| tag attributes declared value is CDATA, ENTITY, ENTITIES etc.
|
| ie \ pic12 is an entity
| \ "100m 200m" is not an entity

In general, nothing. All that information is in the DTD. Whoever tagged the document in the first place did it according to some DTD, because that is what told him what tags he had available, what attributes they could have, and in what context they could occur.

In this case, you are somewhat fortunate in that your document designer used names which are mildly indicative of their nature. E.g., I might guess, with some confidence, that "pic2" is an artwork file since it contains the word FILE and files are file entities.

This is why the standard requires the DTD to be part of the document. So if you have a document without a DTD, it's not really an SGML document.

--
"One man's syntax is another man's semantics."

Newsgroups: comp.text.sgml
Date: 01 Sep 1994 21:14:14 UT
From: "Claude L.
Bullard" \
Message-ID: <9409012114.AA19892@source.asset.com>
Subject: SGML New User Requesting General Information

After my posting to "newbies" on the basics of SGML, I received a number of kind and complimentary postings suggesting that I'd written something of benefit to the community. Thank y'all very much. Nothing works better on me than applause.

Ed Brachman at Interleaf made some very good suggestions about improving the post.

| (1) Where you compare a DTD to an 'ADT', I'm not sure you're doing the
| newbie any favors. I assume that 'ADT' stands for 'abstract data type'
| -- but I'm not sure that calling a DTD an ADT helps even for those of
| us who can make that assumption, and I'll bet that it goes over the
| heads of at least some of the newbies you otherwise do a good job of
| talking to.

Quite right, Ed. But every newbie is not new to every thing. Some are well-educated comp-sci types who just want to know what SGML IS-OR-Ain't. A few well-placed "buzzWords" give them the clues they are looking for. Unless they are navigating the nets with Mosaic, they are hopefully not true innocents. Even in that case, it behooves an inquiring mind to get dictionaries for subjects they want to study.

In this and future posts on the subject, I am consciously trying to propagate a meme:

MyMEME: SGML can be used for more than books.

To do it, one must get Beyond The Book Metaphor... or finally realize that a lot of things we think of as software are really automated books without binders.

SERMON FOLLOWS (caveat emptor):

It is our imaginations that place the harshest limits on our creativity. Since only divine beings create ex nihilo, the rest of us get our sparks from what we ingest through the nearest convenient port. In this case, the newbie would do well to look up the subjects of Abstract Data Type and automata theory and compare them to what SGML has to offer.
From however much of this they can absorb, they may begin, by contrast and comparison, to form their own unique and even heretical ideas for SGML applications. Great Jumping Horny Toads, they may start to think for themselves rather than blandly accepting the latest brain-dead application. The strength of SGML is its capacity for rigorously defining applications. Sure, the language has gaps and can't *do everything*. But it can be used by imaginative people to do far more than has already been done, and if enough people believe that and try, maybe we can break the miasma of "control" through registering DTDs. ... we may even find a cure for boredom. Chapters are BORRING! To achieve this, we should learn to treat SGML like the dog that it is: a standard for marking up data. Nothing more. How data should be marked up so that it becomes information, what generic identifiers should be called, how they should be grouped, this is where we can be endlessly creative rather than spending our days endlessly debating if this or that feature of the language or a certain DTD are *holy* (unless that's how you have fun. In that case, party on.) Freedom to seek their own path, to direct their own evolution, to think "funny" thoughts and discover by experience if these thoughts are *holy*, that is what I advocate for newbies. To be free one must choose to think. To think well, one must become knowledgeable. To stay free, one should freely choose what one should know. ... But first, one must eat *good food*. Enough sermon.. | (2) In noting points about the badbook DTD, you claim that the use of | the PUBLIC keyword implies that the DTD has been formally registered | with some body empowered to do formal registrations... Hmm, I was trying to say the opposite. That while the identifier appears to make that claim, there is no way to check it. In other words, no *magical hand* flies around the universe to ensure that the type is registered. 
It points to the concept of *agreement among trusted parties* as the basis for almost everything SGML can do for parties of more than one.

| The only problem I've ever seen is where a system references such
| identifiers, but offers you *no* way to get at the relevant "publicly"
| identified material.

Yep. Such systems are impolite and shouldn't be invited to *parties where the band is too loud and one has to resort to sign language*. That is one use of SGML: sign language for systems that party together then go to their own *domain* afterwards.

| (3) In discussing the ISBN attribute, it might be worthwhile to use it
| as an illustration of the lack of semantics in SGML. Naming the
| attribute 'ISBN' does *not* mean that there's anything connected with
| SGML that checks its value to see if it's a valid ISBN -- as the value
| in your example in fact is not.

Quite true, and part of the subject of the next post, where I will complete the example. However, SGML is extended by HyTime, and HyTime does give one a way to set a check for that via lextypes. Whether your system can DO that checking is another subject: features matching.

| (4) Parochially, it bugs me that there's no mention of Interleaf in
| your list of SGML products. I know that it's just your personal list.

It is. Actually, I woke up that night worried about the number of vendors I didn't mention (no lie). Interleaf and its products are fine stuff from what I gather, and I have nothing but respect for the company and its never-ending struggle to get WYSIWYG and DTD-centered applications to interoperate. But in the fifteen-odd years I've done this for a living, I've never had the pleasure of using an Interleaf product, so it's like talking about the attributes of a lady I've never courted.... presumptuous and a good way to eliminate a potentially thrilling life experience.

Cheers to one and all!
Len Bullard

Newsgroups: comp.text.sgml
Date: 01 Sep 1994 21:15:52 UT
From: Jeffrey McArthur \
Organization: ATLIS Publishing
Message-ID: <346397$ar9@news.delphi.com>
Subject: Re: SGML to Postscript/PDF

TeX is one of the most widely ported programs in existence. It is actually quite hard to find a platform that it does not exist on. The output of TeX is a .dvi file. The .dvi file format was created with portability in mind. You can copy a .dvi file from an IBM Mainframe to a Cray, to a CDC Cyber, to a PC, to a Mac, to an Amiga, to an Atari, and so on. David Fuchs (sp?) and Donald Knuth went to a lot of trouble to make sure the output of TeX was transparent from one machine to another. So unless you are running on something REALLY odd, there will be a version of TeX available.

Converting to PostScript is another matter. The source code to DVIPS is available and written in moderately portable C, but it has not been ported to as many platforms as TeX. Still, it is available for most platforms.

On the other hand, installing TeX and DVIPS is not very easy...

--
Jeffrey M\\kern-.05em\\raise.5ex\\hbox{\\b c}\\kern-.05emArthur a.k.a. Jeffrey McArthur
email: j_mcarthur@bix.com
phone: +1 301 210 6655
fax: +1 301 210 4999
home: +1 410 290 6935
The opinions expressed are mine. They do not reflect the opinions of my employer. My access to the Internet is not paid for by my employer.

Newsgroups: alt.culture.usenet,alt.culture.internet,news.misc,comp.infosystems.www.misc,comp.text.sgml,comp.text
Followup-To: alt.culture.internet
Date: 01 Sep 1994 23:38:00 UT
From: Jim Jewett \
Organization: University of Michigan EECS Dept.
Message-ID: <345oko$aa5@zip.eecs.umich.edu>
References: <344tf2$ks7@Starbase.NeoSoft.COM> \ <345a7t$480@starbase.neosoft.com>
Subject: Re: The construction of FAQs

[Followups slashed]

In article <345a7t$480@starbase.neosoft.com>, Cameron Laird \ wrote:
>In article \,
>Peter N. M.
Hansteen \ wrote:
>>In article <344tf2$ks7@Starbase.NeoSoft.COM>, Cameron Laird wrote:
>>> Is there any good reason not to construct all FAQs from now on in HTML,
>>> rather than plaintext?
>> [Many people use traditional text tools, often offline, and don't have
>> access to the live links.]
>One of my realizations for the week is that HTML is not just
>about live links; its formatting or stylistic standards define
>an advance over the character-based plaintext with which we're
>all familiar.

There is, however, a lot to be said for familiarity. LaTeX is far better than ASCII, but there are people who don't have LaTeX around. If it is done lightly (eg, no figures, no really absurd escapes, etc) it isn't even very intrusive -- but I remember when I would see LaTeX and wonder if I knew how to read it. I imagine many people would simply have decided that they didn't. If FAQs start using entities and lots of anchors and more tags than line breaks, this will be a step backwards, because it will lose the portability.

>HTML is not the best language for quasi-static distribution of
>documents (PostScript is one example of a better one),

This is what made me respond... I will often grab and read something in ASCII. I will often decide not to in postscript. This would be true even if they were the same size. Why?

At home, I can't read postscript at all (except as raw text, which is sometimes doable, and often painful). At work, I sit in front of a workstation designed to display postscript well -- and a large number of the documents I FTP are basically unreadable. The nearest I can figure, the person writing them made assumptions about fonts that I don't have, or expected them to be printed rather than viewed. For truly interesting documents, I literally scroll through the blasted raw postscript in another window to see what was supposed to be in place of black blotches, or areas too big to view, or areas overwritten, or areas in an unreadable font...
I would much rather read the ascii without the formatting info to obscure it. ASCII almost always wins on the portability issue. _________ Have a favorite group or mailing list? Describe it to | grouprev+@pitt.edu jJ | Take only memories. jimj@eecs.umich.edu \\__/ Leave not even footprints. jewett+@pitt.edu Newsgroups: comp.text.sgml Date: 01 Sep 1994 23:59:44 UT From: Dave Shema \ Organization: Boeing Computer Services Message-ID: \ Subject: Underscores and the Sgmls validating parser? My SGML data contains entities and attributes with underscores ("_"). I am trying to use Sgmls, derived from ARCSGML, as the validating parser. This parser does not seem to like underscores. In the declarations file, I have tried modifying the list of name characters that can occur at positions other than the first position: NAMING LCNMCHAR "-._" UCNMCHAR "-._" The message: sgmls: Unsupported feature at sgml_decl, line 32 in declaration parameter 92: Character number 95 is not supported as an additional name character is displayed by the parser. --------------------- partial sgml declaration file --------------------------- . . . SYNTAX SHUNCHAR CONTROLS 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 127 255 BASESET "ISO 646-1983//CHARSET International Reference Version (IRV)//ESC 2/5 4/0" DESCSET 0 128 0 FUNCTION RE 13 RS 10 SPACE 32 TAB SEPCHAR 9 NAMING LCNMSTRT "" UCNMSTRT "" LCNMCHAR "-._" UCNMCHAR "-._" NAMECASE GENERAL YES ENTITY NO DELIM GENERAL SGMLREF SHORTREF SGMLREF NAMES SGMLREF QUANTITY SGMLREF LITLEN 650 NAMELEN 32 . . . Short of going into the C code and changing values in the tables (what we ended up doing with ARCSGML), can this parser be encouraged to accept underscores? Thanks !! 
Dave -- dave shema (dshema@grace.rt.cs.boeing.com) Newsgroups: alt.culture.usenet,alt.culture.internet,news.misc,comp.infosystems.www.misc,comp.text.sgml,comp.text Followup-To: news.misc Date: 02 Sep 1994 01:53:34 UT From: Tom O Breton \ Organization: BREnterprises Message-ID: \ References: <345oko$aa5@zip.eecs.umich.edu> Subject: Re: The construction of FAQs [ Follow-ups to news.misc (Since I can't think of a place that really fits) ] jimj@quip.eecs.umich.edu (Jim Jewett) writes: > This is what made me respond... I will often grab and read something > in ASCII. I will often decide not to in postscript. This would be > true even if they were the same size. Why? Absolutely. I've often had the experience of ftping to look for a file, only to find that it's in Postscript and not getting it. Felt bad about it too, 'cause I presume the author did more work than straight ASCII would entail, but in my eyes rendered it unusable (unless I filter it into straight ASCII). Tom -- finger me for how Tehomega is coming along (at tob@world.std.com) Author of The Burning Tower (from TomBreton@delphi.com) (weekly in rec.games.frp.archives) Newsgroups: alt.culture.usenet,alt.culture.internet,news.misc,comp.infosystems.www.misc,comp.text.sgml Date: 02 Sep 1994 03:47:31 UT From: Ram Samudrala \ Organization: The Centre for Advanced Research in Biotechnology Message-ID: <34678j$l9o@umd5.umd.edu> References: <344tf2$ks7@Starbase.NeoSoft.COM> \ <345a7t$480@Starbase.NeoSoft.COM> Subject: Re: The construction of FAQs [Cameron Laird] | One of my realizations for the week is that HTML is not just about live | links; its formatting or stylistic standards define an advance over the | character-based plaintext with which we're all familiar. Right. It offers a logical framework for your writing, much the way TeX does, though I am still a lot more comfortable with LaTeX than I am with HTML. HTML does offer a certain amount of abstraction, but it's not flexible enough for me.
In general, I'd like more control over sizes (?) and the fonts within the document. It is however a lot easier for the uninitiated to learn, than TeX is. I think your writing ability improves if you write using HTML or TeX, personally. --Ram Ram Samudrala ram@elan1.carb.nist.gov Reagan became senile. Bush already was. Clinton acts like he is. Newsgroups: comp.text.sgml Date: 02 Sep 1994 06:01:04 UT From: Tom Worthington \ Organization: Australian Defence Force Academy, Canberra, Australia Message-ID: <1994Sep2.060104.2194@sserve.cc.adfa.oz.au> Subject: Electronic Document Management Pocket Guide available A "Pocket Guide to the Management of Electronic Documents in the Australian Public Service" is now available by FTP: URL: ftp://archie.au/ACS/edocgd.html Those without a HTML document browser can send a message to "listproc@www0.cern.ch" with "www ftp://archie.au/ACS/edocgd.html" in the body of the message. A text version of the document will be sent back. Please note that this is an experimental service provided by CERN. This leaflet has been produced as a ready reference for APS managers responsible for, or concerned about, the control of electronically stored corporate records. It provides a condensed guide to the principles for management of electronic documents, as set out in the report "Management of Electronic Documents in the Australian Public Service", published in 1993 by the Commonwealth Government's Information Exchange Steering Committee (IESC). The IESC is an advisory body, responsible for providing guidance to Commonwealth agencies on policies and strategic directions relating to Information Technology and related issues, including telecommunications. For further details of the IESC contact Max McGregor (e-mail: max.mcgregor@finance.ausgovfinance.telememo.au, ph: +61 6 263 3553, fax: +61 6 263 2276). 
PS: Don't miss (because I am talking at it): PLAYING FOR KEEPS: An electronic Records Management Conference Hosted by Australian Archives Canberra Australia 8-10 November 1994 For details e-mail: acts@ozemail.edu.au Phone: +61 6 2573299 or Fax: +61 6 2573256 -- Posted by Tom Worthington \ Chair of the IESC Electronic Document Management Subcommittee & Senior Policy Advisor, Data Administration Standards Department of Defence Room B-3-25, Russell Offices, Canberra ACT 2600, Australia Ph: +61 6 2651258, Fax: +61 6 2653601, Pager: +61 6 2856209 X.400: G=Tom;S=Worthington;OU=CM-DIMP;O=HQADF;P=ausgovdefencenet;A=telememo;C=au 2 September, 1994 File no: HQ 93-33989 Newsgroups: comp.text.sgml Date: 02 Sep 1994 07:10:13 UT From: "William D. Lindsey" \ Message-ID: <1994Sep2.091013.413@ittpub> References: <9409012114.AA19892@source.asset.com> Subject: Re: SGML New User Requesting General Information [Len Bullard] | After my posting to "newbies" on the basics of SGML, I received a | number of kind and complementary postings suggesting that I'd written | something of benefit to the community. Thank y'all very much. Nothing | works better on me than applause. Add my name to the membership list of the appreciative audience of that article. I found it to be a very readable introduction to SGML and have saved a copy to share with computer-aware people who have expressed curiosity about the subject. I'd like to take this opportunity to nominate Mr. Bullard for the un-official post of "Maintainer of the comp.text.sgml FAQ". He's made a fine start. cheers, Bill Lindsey bill@ittpub.nl Newsgroups: comp.text.sgml Date: 02 Sep 1994 08:12:09 UT From: Steve Pepper \ Organization: Falch Hurtigtrykk as, Oslo, Norway Message-ID: <1994Sep2.081209.24886@falch.no> References: <342me6INNn9i@moon.cis.ohio-state.edu> <344h1o$55h@hopper.acm.org> \ Subject: Re: What is the relationship between entities and attributes? 
[James Saunders] | We are involved in a project that allows construction of a DTD for a | collection of tagged documents that do not have a DTD. As such, this | is not an easily solved problem. My question was, what information | that is contained in a tags attribute, or in other tags in the same | document, allows you, without knowing the DTD, to determine that the | tag attributes declared value is CDATA, ENTITY, ENTITIES etc. | | ie \ pic12 is an entity | \ "100m 200m" is not an entity If you know the SGML declaration (or can safely assume that the reference concrete syntax is being used), the actual characters used in the attribute values will give you some clues. For example, "pic12" *could* be an entity, because it starts with a valid name start character and only contains valid name characters. (Unfortunately, it could be anything at all, except a number token or number token list.) "size = 100mm 200mm" - because it lacks literal delimiters - cannot in any case be a single attribute specification. If it is valid SGML it can only be equivalent to "size = '100mm' otheratt = '200mm'". In this case your DTD construction algorithm would be able to infer that the attribute 'size' could only have the declared value NMTOKEN, NMTOKENS, NUTOKEN, NUTOKENS or CDATA (because it doesn't start with a valid name start character). It could also infer that the declared value of attribute 'otheratt' (whose name would be unknown, of course) is a name token group, of which '200mm' is a member (because otherwise the omission of the attribute name would have been illegal). Neither of these attributes could in any case be an entity. If your second example had read '\', size could still not have a declared value ENTITY, because its specified value contains a space, which is not a valid name character. Nor could it be of type ENTITIES, because both tokens start with invalid name start characters. For the same reasons, size could not be ID, IDS, IDREF, IDREFS, NAME or NAMES, etc. 
The only possible interpretations would be CDATA, NMTOKENS or NUTOKENS. Note that even if you are unable to make any assumptions about the SGML declaration, an attribute value that begins with a digit can only ever be CDATA, a name token (not a name), a number or a number token. It can *never* be an entity, because (as far as I understand) it is not possible to give digits the role of name start character. Best regards, Steve -- pepper@falch.no ------------------------------------------------------------------ falch hurtigtrykk a.s, postboks 130 kalbakken, n-0902 oslo, norway tel +47 2216 3040 fax +47 2216 2350 Newsgroups: comp.text.sgml Date: 02 Sep 1994 08:27:10 UT From: Bruce Hunter \ Organization: The Direct Connection Ltd Message-ID: <345s2r$85m@felix.dircon.co.uk> Subject: ActiveSystems Anyone got an email address for ActiveSystems, or any experience with their advertised products ActiveSearch and ActiveServer? regards, Bruce Hunter SGML Systems Engineering bruce@sgml.dircon.co.uk Newsgroups: comp.text.sgml Date: 02 Sep 1994 09:05:21 UT From: "William D. Lindsey" \ Message-ID: <1994Sep2.110522.414@ittpub> References: \ Subject: Re: Underscores and the Sgmls validating parser? [Dave Shema] | My SGML data contains entities and attributes with underscores ("_"). | I am trying to use sgmls, derived from ARCSGML, as the validating | parser. This parser does not seem to like underscores. : | Short of going into the C code and changing values in the tables (what | we ended up doing with ARCSGML), can this parser be encouraged to | accept underscores? No. The only way I could get underscores to be accepted by sgmls was by hacking (gently) the source. Around line 957 of sgmldecl.c (sgmls-1.1) change from: else if ((char_flags[c] & (CHAR_SIGNIFICANT | CHAR_MAGIC)) && c != '.' && c != '-') { to: else if ((char_flags[c] & (CHAR_SIGNIFICANT | CHAR_MAGIC)) && c != '.' 
&& c != '-' && c != '_' ) { This change had the unintended side effect of altering the results for test019 in the test suite. I haven't studied the test carefully, but it may be that the new results are correct. I hope this helps. Bill -- Bill Lindsey bill@ittpub.nl Newsgroups: comp.text.sgml Date: 02 Sep 1994 09:43:15 UT From: "William D. Lindsey" \ Message-ID: <1994Sep2.114316.415@ittpub> References: <1994Sep2.110522.414@ittpub> Subject: Re: Underscores and the Sgmls validating parser? [William D. Lindsey] | This change had the unintended side effect of altering the results for | test019 in the test suite. I haven't studied the test carefully, but | it may be that the new results are correct. I apologize for this misinformation. Upon closer inspection, I found that I had edited the test019.sgm source file at some point. The patch to sgmldecl.c causes NO differences in any of the test results. Bill -- Bill Lindsey bill@ittpub.nl Newsgroups: comp.text.sgml Date: 02 Sep 1994 10:51:02 UT From: Christoph Altenhofen \ Organization: IBM Germany, European Networking Center, Heidelberg Message-ID: \ References: <345s2r$85m@felix.dircon.co.uk> Subject: Re: ActiveSystems [Bruce Hunter] | Anyone got an email address for ActiveSystems, or any experience with | their advertised products ActiveSearch and ActiveServer? Their address: Active Systems, Inc. 11 Holland Avenue, Suite 700 Ottawa, Ontario K1Y 4S1 Tel: +1 613 729 2043 Fax: +1 613 729 2874 E-Mail: sales@ctmg.isis.org I sent them a fax to get some information about ActiveServer and ActiveSearch and they answered very quickly. Up to now, I have only studied the information I got in response to my fax, but it sounds quite interesting. So is there anybody out there with experience of these products or any other commercial products for databasing SGML documents (such as DynaBase, BASIS SGMLserver etc.)? Any hints are welcome.
Christoph -- * Christoph Altenhofen /IBM Deutschland Informationssysteme GmbH * | European Networking Center * Tel.: +6221 / 59 - 4503 | Dept. Open Document Communication * FAX : +6221 / 59 - 3400 \\ Vangerowstr. 18 * D-69115 Heidelberg * IBMMAIL: DEIBMSW6 \\__________ Germany * e-mail : CALTENHOFEN at VNET.IBM.COM \\__________________________/ * christo%limmat.heidelbg.ibm.com@ibmpa.awdpa.ibm.com * X-400 : C=DE;A=IBMX400;P=IBMMAIL;S=ALTENHOFEN;G=ALTENHC Newsgroups: comp.text.sgml Date: 02 Sep 1994 15:43:36 UT From: "W. Eliot Kimber" \ Organization: Passage Systems, Inc. Message-ID: \ References: <342me6INNn9i@moon.cis.ohio-state.edu> Subject: Re: What is the relationship between entities and attributes? [James Webster Saunders] | I came across the following reference in "Practical SGML": | | \ | | where FILE1 and FILE2 are entities. In this case, the entities are | external files containing the tag data. If a tagged document were | being parsed without a DTD and it contained such a reference, how would | one distinguish that this value is indeed a list of entities? It's not SGML if there's not a DTD, so you can't meaningfully process a document instance without a DTD. With the DTD there is no ambiguity because the declared value prescription indicates what the attribute takes as a value, in this case, ENTITIES. -- \
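
As a rough illustration of the character-level clues Steve Pepper describes earlier in this thread, here is a minimal Python sketch. It is not part of any parser: the names token_kinds and possible_declared_values are hypothetical, and it assumes the reference concrete syntax (letters as name start characters; letters, digits, "." and "-" as name characters). It ignores NAMELEN and the other quantity limits, NOTATION attributes and name token groups, and, as the reply above points out, it can only narrow down the candidates; only the DTD settles the declared value.

```python
import string

# Assumption: reference concrete syntax. Name start characters are the
# letters; name characters add the digits plus "." and "-".
NAME_START = set(string.ascii_letters)
NAME_CHARS = NAME_START | set(string.digits) | {".", "-"}

# Singular declared values and their list forms (ID has no list form).
LIST_FORM = {
    "NAME": "NAMES",
    "IDREF": "IDREFS",
    "ENTITY": "ENTITIES",
    "NMTOKEN": "NMTOKENS",
    "NUTOKEN": "NUTOKENS",
    "NUMBER": "NUMBERS",
}


def token_kinds(token):
    """Return the singular declared values that one token could satisfy."""
    if not token or any(ch not in NAME_CHARS for ch in token):
        return set()  # contains a non-name character: only CDATA remains
    kinds = {"NMTOKEN"}
    if token[0] in NAME_START:
        # A name can be a NAME, an ID, an IDREF or an entity name.
        kinds |= {"NAME", "ID", "IDREF", "ENTITY"}
    if token[0].isdigit():
        kinds.add("NUTOKEN")
        if token.isdigit():
            kinds.add("NUMBER")
    return kinds


def possible_declared_values(value):
    """Return the declared values an attribute value literal could have."""
    candidates = {"CDATA"}  # CDATA accepts any literal
    tokens = value.split()
    if not tokens:
        return candidates
    # Every token must fit the same kind for a list form to be possible.
    common = set.intersection(*(token_kinds(t) for t in tokens))
    candidates |= {LIST_FORM[k] for k in common if k in LIST_FORM}
    if len(tokens) == 1:
        candidates |= common  # singular forms need exactly one token
    return candidates


print(sorted(possible_declared_values("100mm 200mm")))
# prints ['CDATA', 'NMTOKENS', 'NUTOKENS']
```

For "pic12" the same function keeps ENTITY among the candidates but rules out NUMBER and NUTOKEN, and a value containing an underscore falls back to CDATA alone, which matches the default naming rules discussed in the sgmls thread.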

Newsgroups: comp.text.sgml Date: 02 Sep 1994 16:09:21 UT From: Dirk Kutscher \ Organization: PC-Labor der Universitaet Bremen Message-ID: \ Subject: data format of mapping files Hi, Does someone know if information about the data format of the replacement files that allow translating a sgml document to a desired output format can be obtained somewhere? Or maybe someone could tell me the meaning of the "+" characters in the mapping files... (I have the qwertz dtd as an example.) Thanks. -- Bye, Dirk Newsgroups: comp.text.sgml Date: 02 Sep 1994 18:54:41 UT From: Tommie Usdin \ Message-ID: <347sdh$5is@news1.digex.net> References: <33fi64$grh@sernews.raleigh.ibm.com> Subject: Re: Request for SGML 94 info Excerpts from the Program for SGML '94 For more information or a full program contact: Graphic Communications Association 100 Daingerfield Road Alexandria, VA 22314-2888 Phone: 703/519-8160 Fax: 703/548-2867 E-Mail: blake@access.digex.net SGML '94 November 7-10, 1994 Sheraton Premiere, Tysons Corner Vienna, Virginia Sunday, November 6 The Just Enough Tutorial Series Marcy Thompson, Manager of Education and Training, SoftQuad Inc., Tutorial Coordinator 9:00 am-12:00 noon Just Enough Concepts Introduction to SGML with no prerequisites. What is SGML? Who uses it? How do they use it? How does it work? Just Enough Syntax Introduction to SGML with no prerequisites. Basic overview of SGML followed by a survey of SGML markup. 1:00 pm-4:00 pm Just Enough Syntax and Just Enough Concepts (continued) Just Enough Databases How does SGML mesh with document databases? Discussion of full text, relational and object-oriented approaches. Just Enough Electronic Delivery An overview of methods of delivering SGML documents electronically. Just Enough Paper Publishing What must you do to an SGML document to turn it into a printed document? 
9:00 am-5:00 pm OmniMark User Group 6:30 pm-10:30 pm Opening Reception and Dinner Aboard a Potomac River Cruise Monday, November 7: General Session 9:00 am Opening Remarks Yuri Rubinsky, President, SoftQuad Inc., Conference Chairman 9:15 am The Year in Review Yuri Rubinsky and B. Tommie Usdin, Vice President, ATLIS Consulting Group, Conference Co-chair 10:00 am Conference Keynote: State of the Web - The NCSA Mosaic View of the World's Largest SGML Application Joseph Hardin, Associate Director, Software Development Group, National Center for Supercomputing Applications, University of Illinois, Champaign-Urbana 11:00 am Poster Session Application Track 1:45 pm: Document Engineering at the Canadian Department of National Defence CALS Office. Ken Holman, Vice President, R&D, Microstar Software Ltd. 2:30 pm: An SGML-based News Agency Tibor Tscheke, Managing Director, STEP GmbH 3:30 pm: Towards an SGML-based Architecture for Operations and Maintenance Documentation in the Telecommunications Industry Wolfgang Weber, Systems Analyst, Siemens AG 4:15 pm: SGML Environment for Developers of Product Data Exchange Standards Lisa Phillips, Computer Scientist, and Joshua Lubell, Computer Scientist, National Institute of Standards and Technology 7:30 pm-10:00 pm: Evening Workshop: Table Handling, Session Chair: Eric Severson, Executive Vice President, Avalanche Development Theory Track 1:45 pm: Project YAO and Other News Dr. Charles F. Goldfarb, Principal Consultant, Information Management Consulting 2:30 pm: Document Conversion - How Does SGML Markup Acquire Behavior? Kevin Allen, Senior Systems Engineer, InfoAccess 3:30 pm: SGML, DTD Design and Coding, In-House Publishing, Corporate Image, and Synergy Robert Erfle and Gunter von Zadow, IBM European Networking Center 4:15 pm: SGML Model for Statistical Tables Dianne Kennedy, Vice President Strategic Systems, ActiveSystems 7:30 pm-10:00 pm: Evening Workshop: Graphic Representation of Structure Session Chair: B.
Tommie Usdin, Vice President, ATLIS Consulting Group Tuesday, November 8 Management Track 8:30 am: SGML Is Not a Solution Marcy Thompson, Manager of Education & Training, SoftQuad Inc. 9:00 am: Reconciling Internal and Interchange Requirements, or How to Survive the Industry Initiative Lani Hajagos, FrameBuilder Marketing Manager, Frame Technology Corporation 10:00am: The Human Aspects of Using SGML Astrid E. Jenssen and Tone Irene Sandahl, University Center for Information Technology Services, University of Oslo 10:45 am: Practical Approaches to SGML Page Composition Francois Chahuneau, Director, AIS Berger Levrault 11:30 am: Implementation Issues and Project Management John W. Oster II, Principal Consultant, McAfee & McAdam, Ltd. 1:00 pm: The Impact of SGML on Training in an Organization Jeanne El Andaloussi, Manager, Document Engineering Group, Bull S. A. 1:45 pm: Reusing Information through SGML Building Blocks John J. Shockro, CEA, Incorporated 2:30 pm: Management of SGML Documents Eric Severson, Executive Vice President, Avalanche and Ludo van Vooren, Director Customer Solutions, Interleaf 3:30 pm: SGML: It's Not Just for Documents Anymore Kurt Conrad, Internal Consultant, Boeing Computer Services System Overview Track Tools and Technologies for SGML-Based Information Systems 8:30 am: Introduction Mary Laplante, Executive Director, SGML Open 9:00 am: DTD, Application, and System Utilities Debbie Lapeyre, Consultant, ATLIS Consulting Group 10:00 am: Parsers, Transformers, and Conversion Tools Pamela Gennusa, Director, Database Publishing Systems, Inc. 
11:00 am: Editors and Authoring Systems Paul Grosso, Vice President, ArborText 1:00 pm: Databases, Document and Workflow Management Michael Sperberg-McQueen, Academic Computing Center, University of Illinois at Chicago 2:00 pm: Electronic Delivery Tim Bray, Senior Vice President of Technology, OpenText Corporation 3:15 pm: Layout and Composition Mark Walters, Editor, Seybold Publications General Session 4:15 pm Poster Session 6:00 pm Author Signing Party 7:30 pm-9:30 pm Evening Workshop SGML Open Panel: SGML and the Internet Session Chair: Larry Bohn, Interleaf Wednesday, November 9 General Session 8:30 am: The Golden DTD: Using Data-Centered DTD's to Meet Business Goals Gregory S. Vaughan, Senior Technical Consultant, Database Publishing Systems, Inc. 9:15 am: RealSGML: Digital Service Bulletins for Commercial Aviation Harry Summerfield, President, Zandar Corporation and Freelon Hunter, Project Manager, Boeing Commercial Airplane Group 10:00 am: Alchemy for the Masses: Automating the Construction of SGML Conversion Applications David Sklar, Director of Applications, Electronic Book Technologies 11:00 am Poster Session 1:15 pm-3:00 pm Product News Flashes 7:00 pm-10:00 pm Product Demonstration Table Tops Thursday, November 10: Theory Track 8:30 am The Whys, Whats and Hows of Partial Documents in SGML Eric Freese, Principal Software Developer, Information Dimensions, Inc. 9:00 am: SUBDOC, A Useful Construct for Publishing Mike Maziarka, Frame/Datalogics 9:30 am: Is SHORTREF Still Meaningful? John McFadden, President, Exoterica 10:15 am: Encoding SDIF in the Multipurpose Internet Mail Extensions (MIME) Edward Levinson, Technical Director, Accurate Information Systems, Inc. 10:45 am: File Format for Documents Containing both Logical Structures and Layout Structures Makoto Murata, Fuji Xerox 11:15 am: Simplified Authoring for a Complex DTD Keith Fabling, Lead Publications Engineer (CTAS), Boeing Commercial Airplanes and Michael A.
Murray, Senior Principal Scientist, Boeing Computer Services 11:45 am: Creating SGML Objects for End-Users Jean Paoli, Technical Director, Grif S.A. Application Track 8:30 am: Converting More Than 1 Million Pages to SGML Richard Barth, Director of Operations, Data Conversion Laboratory 9:00 am: Conversion to SGML Bill Preacher, Managing Director, Pindar Infotek 9:30 am: Creation of Electronic Technical Manuals Using SGML James Frizzel, Software Development Engineer, Docucon 10:15 am: The Conversion of Legacy Technical Documents into IETMs; A NAVAIR Phase II SBIR Progress Report Timothy E. Billington, Senior Information Engineer, Information Engineering Group, Aquidneck Management Associates, Ltd. 10:45 am: Document Type Definitions: A Case Study for A Common Set of High Level Tags for an IETM Michael Graser, Martin Marietta Corporation 11:15 am: Building an SGML-based IETM Alan Porter, Technology Development Executive, OMI Logistics Ltd. 11:45 am: The Reality of Military Document Analysis Lawrence A. Beck, Northrop Grumman Data Systems/InfoConversion, and Lewis M. McCormack, Northrop Grumman Aerospace & Electronics General Session 1:00 pm Lunch and Closing Keynote: Pushing the SGML Paradigm Jean-Pierre Gaspart, Managing Director, Associated Consultants and Software Engineers (A.C.S.E) Registration Form I am attending (check all boxes that apply): Just Enough Tutorials, Sunday, November 6 _ Just Enough SGML Syntax _ 9:00 am-12:00 noon* _ 9:00 am-4:00 pm Just Enough SGML Concepts _ 9:00 am-12:00 noon* _ 9:00 am-4:00 pm * You may choose to only attend for the first half of the day. This gives you the opportunity to attend one of the other courses in the afternoon. 1:00 pm-4:00 pm (only) _ Just Enough Databases _ Just Enough Electronic Delivery _ Just Enough Paper Publishing Tutorial Fees: Full Day Rates (9:00am-4:00pm) $145/GCA Member discount $110 Half Day Rates (9:00am-noon or 1:00-4:00 pm) $85/GCA Member discount $50 Tutorial fees are additional. 
If you plan to attend the conference, you must pay a conference registration fee as well. SGML '94 Conference November 7-10, 1994 Registration fee $845 GCA member discount $640 Educational Institution* discount $502 *Accredited University or College I would like to participate as a vendor in the product demonstration.* Nonmember rate $650/two 6-foot tables Member rate $425/two 6-foot tables Cruise Ship Dandy Reception and Dinner Sunday, November 6, 1994 $10 Conference Registrant $30 Guest Fee Organization Type (check one): _ Corporate Graphic Services _ Government _ Graphic Educational Institute _ Industry Media _ Manufacturer/Equipment _ Manufacturer/Material _ Manufacturer/Systems & Services _ Publisher _ Printer _ Reference Publisher _ Services Provider _ Software Developer _ Other_______________________________ Registrant Information: Indicate Name as it will appear on badge: Name: Title: Company/Institution: Address: City: State or Province: Postal Code: Country: Area Code/Phone: FAX: Date: Billing Information: Check enclosed (Make checks payable to Graphic Communications Association) Credit Card: _ Visa _ MasterCard _ American Express Card number: Expiration date: Signature: Newsgroups: comp.text.sgml Date: 02 Sep 1994 22:08:48 UT From: Oliver Lefevre \ Organization: New York University, New York, NY Message-ID: <3487pg$a9r@cmcl2.NYU.EDU> Subject: SCRIPT replacement IBM used to have a thing called SCRIPT which worked on mainframes and DOS PCs and, if I understand correctly, implements some sort of SGML. SCRIPT source code looks very much like TeX, albeit with different commands. I don't have access to the whole suite of SCRIPT software but I would like to be able to process SCRIPT files nonetheless and print the output. Does anybody know how this might be done? (NB: 1/ I am fairly conversant with TeX 2/ I am no multi-millionaire, so commercial SGML tools might not be of much help...)
Thank you very much in advance Olivier Lefevre NYU Medical School, NY Newsgroups: comp.text.sgml Date: 03 Sep 1994 14:16:38 UT From: Paul Robinson \ Organization: Tansin A. Darcos & Company, Silver Spring, MD USA Message-ID: <34a0g6$esh@news1.digex.net> Subject: On Line Specifications for SGML? Is there a set of specifications which are available via FTP or otherwise that describe the specifications for SGML? Or a quick list? Or should I look at one of the source programs that translates SGML, such as a browser, if one is available for MSDOS or Unix? -- Reports on Security Problems: To Subscribe write PROBLEMS-REQUEST@TDR.COM Paul Robinson - paul@tdr.com / tdarcos@MCIMail.com / tdarcos@access.digex.net Voted "Largest Polluter of the (IETF) list" by Randy Bush \ Voted "Largest Polluter of digex.general" by Mike \ Newsgroups: gnu.emacs.sources,comp.text.sgml Date: 03 Sep 1994 21:12:41 UT From: Lennart Staflin \ Organization: Lysator Computer Society, Linköping University, Sweden Message-ID: \ Subject: ANNOUNCE: PSGML 0.4b2 -- SGML-mode for Emacs PSGML version 0.4b2 is now available at ftp.lysator.liu.se: /pub/sgml/psgml-0.4b2.tar.gz This is a bug-fix release. PSGML is a major mode for editing SGML documents. It works with GNU Emacs 19.19 and later or with Lucid Emacs 19.9 and later. PSGML contains a simple SGML parser and can work with any DTD. Functions provided include menus and commands for inserting tags with only the contextually valid tags, identification of structural errors, editing of attribute values in a separate window with information about types and defaults, and structure based editing. This is still a beta version, but it is documented and reasonably stable. More information is available on the World Wide Web \ -- Lennart Staflin \ You are in a twisty little maze of URLs, all alluring. Newsgroups: comp.text.sgml,comp.infosystems.www.users Date: 04 Sep 1994 03:57:23 UT From: "Liam R. E.
Quin" \ Organization: SoftQuad Inc., Toronto, Canada Message-ID: <1994Sep4.035723.20957@sq.sq.com> References: \ \ Subject: Re: Is there a HoTMetaL HTML temple for FAQs? [James D. Murray] | I'm putting together a FAQ and I'd like to make it available as a Web | HTML file in addition to the traditional ASCII version. I'm looking | into using HoTMetaL as the editing application (MS Windows) [Frank McNeil] | Great? Once created it will be really easy to update in HoTMetaL, due | to the structured way HoTMetaL presents tags and text. Thanks for the plug :-) [James D. Murray] | and I'd like to know if anyone has bothered to put together a template | for FAQs? Not that we have seen, but if you make one and want to send it to us, we'll certainly consider including it in a future release of HoTMetaL. HoTMetaL templates are simply ASCII HTML files; there's nothing special about them -- so you can simply put files in your templates directory. Also, you can have multiple documents open, and use copy/paste between them. Lee (HoTMetaL is available for ftp at the sites listed in my .signature, and probably lots of other sites too...) -- Liam Quin, Manager of Contracting, SoftQuad Inc +1 416 239 4801 lee@sq.com HexSweeper NeWS game;OPEN LOOK+XView+mf-fonts FAQs;lq-text unix text retrieval SoftQuad HoTMetaL: ftp.ncsa.uiuc.edu:Web/html/hotmetal, and also doc.ic.ac.uk: packages/WWW/ncsa/..., gatekeeper.dec.com:net/infosys/Mosaic/contrib/SoftQuad/ Newsgroups: comp.text.sgml Date: 04 Sep 1994 13:31:19 UT From: Martin Hamilton \ Organization: c/o Loughborough University, UK Message-ID: \ References: <33v8g7$ksc@urmel.informatik.rwth-aachen.de> \ Subject: Re: Hypermedia-Mail (Was: HyTime problems) [Steinar Bang] | What is "hypermedia-mail"?
text/html :-) Newsgroups: comp.infosystems.www.misc,comp.text.sgml Date: 04 Sep 1994 15:56:55 UT From: Una Smith \ Organization: Yale University, Department of Biology Message-ID: <34cqo7$lj1@news.ycc.yale.edu> References: <344tf2$ks7@Starbase.NeoSoft.COM> \ <345a7t$480@starbase.neosoft.com> Subject: Re: The construction of FAQs [Cameron Laird] | Here's the question: is there any good reason not to construct all FAQs | from now on in HTML, rather than plaintext? I think I've got the attribution correct. My apologies if not. The reason I've stuck to plain text is that none of the document description packages out there are sufficiently easy to set up that it would be worth my time right now to convert my FAQ into one of them. Given my readership, it is imperative that I be able to produce attractive plain text versions of the FAQ. HTML does not do this nicely enough, especially when URLs are used; I've seen really dreadful versions of my FAQ that were HTMLized by others and then re-extracted to plain text. I would prefer to use something like LaTeX, that would keep the internal references correct without manual editing. And I know there are LaTeX-to-HTML converters out there, so I am tempted to switch over. The most commonly requested form is PostScript, for printed handouts during workshops. Ideally, I could keep a single source document, and it could be filtered through any document format driver on demand: PostScript, Acrobat, RTF, HTML, you name it. We're moving in that direction, but aren't nearly there yet. For now, I'll stick to the low road: plain text. -- Una Smith smith-una@yale.edu Department of Biology, Yale University, New Haven, CT 06520-8104 USA Newsgroups: comp.text.sgml Date: 04 Sep 1994 17:20:45 UT From: Edgar Gilchrist \ Message-ID: \ Subject: Storing SGML in a database I am confronted with a classic documentation problem: producing multiple, port specific versions of a generic document. I would like to consider an SGML/database approach. 
I envision that the document could be stored in a database as distinct SGML-tagged modules. In the case of a port-specific variation in a module, I imagine that there would be a distinct record, encoded in such a way that it would be correctly included when the port-specific document was assembled. Now, I know that this is not a new approach. Can anyone point me to technical articles, etc., that may exist on this subject? Thanks, Ted Gilchrist Newsgroups: comp.text.sgml Date: 05 Sep 1994 05:52:15 UT From: Tom Worthington \ Organization: Australian Defence Force Academy, Canberra, Australia Message-ID: <1994Sep5.055215.27280@sserve.cc.adfa.oz.au> References: <1994Sep2.060104.2194@sserve.cc.adfa.oz.au> Subject: Re: Electronic Document Management Pocket Guide available [Tom Worthington] | PLAYING FOR KEEPS: An electronic Records Management Conference | Hosted by Australian Archives | Canberra Australia 8-10 November 1994 | For details e-mail: acts@ozemail.edu.au | Phone: +61 6 2573299 or Fax: +61 6 2573256 The e-mail address for ACTS was incorrect. The correct address is: acts@ozemail.com.au -- Posted by Tom Worthington \ Chair of the IESC Electronic Document Management Subcommittee & Senior Policy Advisor, Data Administration Standards Department of Defence Room B-3-25, Russell Offices, Canberra ACT 2600, Australia Ph: +61 6 2651258, Fax: +61 6 2653601, Pager: +61 6 2856209 X.400: G=Tom;S=Worthington;OU=CM-DIMP;O=HQADF;P=ausgovdefencenet;A=telememo;C=au 5 September, 1994 File no: HQ 93-33989 Newsgroups: comp.text.sgml Date: 05 Sep 1994 12:14:45 UT From: Bruce Hunter \ Organization: SGML Systems Engineering Message-ID: <34f23l$17u@felix.dircon.co.uk> References: \ <1994Sep2.110522.414@ittpub> Subject: Re: Underscores and the Sgmls validating parser? [William D. Lindsey] | No. The only way I could get underscores to be accepted by sgmls was | by hacking (gently) the source. 
| | Around line 957 of sgmldecl.c (sgmls-1.1) change | from: | else if ((char_flags[c] & (CHAR_SIGNIFICANT | CHAR_MAGIC)) | && c != '.' && c != '-') { | to: | else if ((char_flags[c] & (CHAR_SIGNIFICANT | CHAR_MAGIC)) | && c != '.' && c != '-' && c != '_' ) { A much simpler (and recommended) method is just to provide SGMLS with an SGML Declaration in which the underscore character is made a valid NAME character. in the NAMING section just add the underscore character to the UCNMCHAR and LCNMCHAR declarations, as in NAMING UCNMCHAR "-._" LCNMCHAR "-._" regards, -- Bruce Hunter SGML Systems Engineering bruce@sgml.dircon.co.uk Newsgroups: comp.text.sgml Date: 05 Sep 1994 15:23:54 UT From: "Eric R. Skinner" \ Organization: Exoterica Corporation Message-ID: <1994Sep5.152354.5468@exoterica.com> References: \ Subject: Re: SGML Declarations, why ? In article \ north@knoware.nl (Simon North) asks a few questions about the SGML Declaration and mentions the lack of detailed explanations of this area of SGML. To risk boring the lot of you who have seen this mentioned before, Exoterica has a 40-page explanation entitled "Understanding the SGML Declaration" which goes through the whole thing in detail. It's free - just write to info@exoterica.com and request a copy. Provide your air mail address as we will mail you the document. Regards, -- Eric R. Skinner ers@exoterica.com Exoterica Corporation Tel +1 613 722 1700 Ottawa, Canada Fax +1 613 722 5706 Product information: info@exoterica.com Newsgroups: comp.text.sgml Date: 05 Sep 1994 19:29:36 UT From: Erik Naggum \ Organization: Naggum Software; +47 2295 0313 Message-ID: <19940905.4851@naggum.no> References: \ <19940820.4480@naggum.no> \ Subject: HyTime critique (was CONCUR usefulness existence proof) this was becoming too long, so I cut the places where I agree with Eliot. that's not to imply that if I don't comment, I agree, but that should be obvious if you read further. [W. 
Eliot Kimber] | I also do not consider marked sections to be generally useful or | robust, both because there is no dynamism of marked sections controls | and because marked sections are not bound to element boundaries, which | can make them difficult to manage and work with. I don't understand what "dynamism of marked section controls" refers to, so I can't address that. however, if I replace "marked sections" with "entities" in the rest of this sentence, I end up with a strong argument against entities! maybe this is unfair, since we all agree that entities are very useful, but it shows that arguments that are used negatively about one thing and positively about another aren't very good arguments and probably hide the _real_ arguments. that marked sections are constrained to entity boundaries in my opinion should make them far easier to deal with than entities that aren't bound to element boundaries, and thus not so difficult to manage or work with. they are also not declared in a distant prolog, which should make them easier to deal with. I'd like to know what "dynamism of marked section controls" means, because the other argument is now dismissed. | I was referring to the robustness of the functionality HyTime represents. you said: "My take on CONCUR is that what CONCUR can do weakly HyTime allows you to do robustly." since there isn't much description of "the functionality [that] HyTime represents" in the standard, I interpreted this to refer to the syntax of HyTime, just as CONCUR refers to the syntax of _its_ mechanism. CONCUR is relatively solid in the way it combines two documents into one. the syntax is clear, and all the information required to split them apart again is localized to the seams between the two documents. with some fairly straightforward constraints, it is possible to change either document without ripping the seams. this is not dependent on any concept of location or ID's or anything.
it just happens to be a fairly easily identified aspect of character strings. now, can HyTime do _this_ more robustly than CONCUR does? from my limited understanding of HyTime, I can see several contradictions to this claim. depending on what you consider CONCUR to do, I'm sure it's possible to reduce CONCUR to a basically useless mechanism and then show HyTime to be superior. however, we should compare by means of the highest possible level of expressibility for both solutions. now, mind you, I don't think CONCUR should be in the standard any more than DATATAG and RANK should be (all can be outboarded without much trouble), so I'm a priori interested in anything that can replace it, but that doesn't mean I buy any old argument about which is weaker and stronger. I want truth in advertising, but I find hypermedia to be more hype than media, and dangerously low on testable facts and hypotheses; HyTime fits this bill. | These functions do not necessarily lend themselves to the sort of | comfortably computable processing that computer scientists seem to | prefer. I'll gladly take your word for it, but I'm not sure you understand just how damaging to your own cause this statement is. | HyTime reflects both the reality of data and the need for generality. the "reality of the data" is that if you can't engage them in "comfortably computable processing", they are and will remain just so many bits. | This is a silly argument and I'm surprised you are making it, as it | completely misses the point of HyTime and the problems it tries to | solve. sorry to say so, Eliot, but despite valiant efforts both on your part and by several other people, the "point" of HyTime is still a widely dispersed mist consisting mainly of hype and vaporware. I do know which problems HyTime tries to solve. you should have discovered by now that I don't think HyTime succeeds in solving those problems in a way which will make computer scientists (or, heck, Microsoft-influenced coders) adopt it. 
that is, the solution is inelegant, clumsy and invites serious criticism that you and others brush off with statements such as that quoted above. | HyTime doesn't do validation of document type declarations *because | it's incomputable in some cases*. although I don't think you know the force of this statement, either, I'll accept it at face value and quote you on it. I think, however, that you should be a little more careful with what you say. there _are_ computer scientists reading this, and they _may_ be wondering whether to adopt HyTime or leave it in paper. | Also, it is essential that HyTime allow document types that allow | invalid constructs -- that's the only way HyTime can hope to peacefully | integrate with existing applications that predate HyTime or that have | other requirements that do not always allow HyTime conformance. that's the "only way"? geez. as far as I could understand, the whole idea with HyTime was that it should not require modification of the documents into which you point. Integrated Open Hypermedia, right? this I always interpreted to mean that you would have to build in some HyTime support in your applications such that they would not _need_ to mess with extant document types or instances, but could reference them from outside. now all of a sudden we _have_ to mess with extant document types? what next? I do understand that you need to emphasize that document types cannot always be validated, but that's not my question. my question is: if and when I want to validate it, can I? you don't answer this question. | But of course, there's no reason you can't define document types that | are *guaranteed* to produce structurally valid HyTime documents. If | you want that level of comfort, define your document types that way. I | certainly recommend it if at all possible, but there are many cases for | which it would be impossible or inappropriate (the TEI, Docbook, and | HTML come to mind immediately). 
I was asking whether it was possible to know, a priori, whether a document type conforms to HyTime, that is, whether a document type would always produce conforming HyTime documents without having to parse all the possible instances of that document type. I have received no answer to this question, just a bunch of slippery sales-talk that says nothing one way or the other. "there's no reason you can't" does not tell me how I can, or even _whether_ I can. and you turn the whole issue on its head with your "if you want that level of comfort, define your document types that way". don't you _understand_ what I'm asking for? the question is: HOW DO I KNOW THAT WHAT I _THINK_ IS "GUARANTEED" TO PRODUCE STRUCTURALLY VALID HYTIME DOCUMENTS ACTUALLY WILL? "define your document types that way" is just evading the question. if I were only a wee bit nervous about HyTime before this, I would close my eyes and hope it would pass away afterwards. | Obviously this is a specious argument given the requirement stated | above. Also, I'm not sure what you mean by "loose". If by loose you | mean that you can leave the choice of what form an element conforms to | to authors, again, HyTime has to allow that level of flexibility, but | you don't need to put in your document types. I have understood the concept and the execution of the architectural forms, Eliot. this is not HyTime 101, anymore. I do not see any value of this approach, however, because it dislocates the information that it tries to convey, and so is inherently unstable. to be precise, not any value above competing proposals. one of the "winning arguments" of HyTime was that people should be able to look to this solution and adopt it rather than reinvent the wheel. now, HyTime only tells those who come to look that a "wheel" is "sort of round-like" (only expressed in seriously convoluted language), which could have been valuable if you didn't already _know_ that wheels are circular. 
HyTime is a catalog of hypermedia concepts and ideas, with a hell of a lot of syntax that you end up not using. some of those ideas are brilliant, but the syntax only provides a major haystack in which to look for them. SGML is rigid, which is one of its strengths (but also a liability when some argue that loosening up the rigidity would break the whole thing), and it provides very specific, known freedoms and disallows others. what does HyTime do? there are so many options I get lost in them; I can't validate a DTD against what I think could be expressed as element classes; there are redundant ways to do some things, and only one way to do other things that really need several ways because they differ in use or purpose. what is HyTime's model of object rendition and modification? what is the semantics associated with all these concepts that would form the foundation for its application? I can look at all this, and I can understand about 25% of it. sifting through the uselessly intricate indirections of the syntax to see where things might apply, I only wind up with more than a brainful of loose threads. I don't have a pressing need to waste months of my life to grok something that I am steadily losing confidence in to boot. the more I hear about how great HyTime is from others, the more I get a sinking feeling that they're trying to have _me_ confirm to _them_ how great it is. sorry, I don't work that way. if it weren't for this, I maybe wouldn't think, or say I think, that it's all a carefully plotted hoax. | The association could not be tighter than fixed attribute values naming | the element type forms. oh, it could. you have yourself stated the major argument for getting HyTime out as a standard, and I can confirm from firsthand information that this is it: it needed to be out to beat the other contenders. you argue strongly that it's here, that it uses SGML as it exists today, and that we must make do with what we have.
now hear this: there is much evidence that the time that was spent on HyTime was consciously and deliberately _not_ spent on "reviewing" SGML. ISO committees are free to propose amendments to their standards at _any_ time, and getting some of the ideas in HyTime into SGML would be a piece of cake. there was, in other words, a conscious decision not to revise SGML, but to go with the existing standard, despite mounting evidence that SGML needs to be amended and extended. now HyTime is used as an excuse not to extend SGML with simpler, cleaner, and better ways to do the things that HyTime does because HyTime already "does the job". at this point, we're looking at a fact of history that says that HyTime is here, and a better SGML is not. swell! it didn't have to be this way, and there is no critical mass of HyTime documents out there, and none that uses the standard, anyway, since the HyTime Catalog "version 2.0" fixes several seriously broken things in it. so we can fix it. we can go back and say that we made a mistake in pushing it so hard while it still couldn't stand on its own, and that we're going back to the drawing board. only the worst of cowards are afraid to acknowledge failure when it stares them in the eye, and no amount of political defeat can outdo a broken standard that remains unfixed in terms of loss of credibility. but to counter your actual argument: an element does not become an instance of an element type form just because it says so. a number of factors are involved, such as congruency of attribute lists and content models. I want to know, and I want to know that I have made a valid HyTime-conforming DTD, or I won't _use_ HyTime: the risk of running into a serious problem down the road because of an easily fixable but undetectable flaw in the design makes me all queasy and nervous. and, as you have yourself so eloquently demonstrated above, I _can't_ know. 
to be clear: the only way I can know that an element type conforms to an element type form is to do an exhaustive search on all the possible actual contents according to the content model of the element type form as well as that of the element type. if there is a "content set" in the latter set that is not in the former, I have disproved the assertion. that is, in the dreaded computer scientist terminology, we're establishing a less-than-or-equal relation for a set of regular expressions. this is known to be a hard problem. not in the programming department, but in the "tell me the result before the sun goes nova" department. HyTime could have given us computer scientists significantly improved sleep if a description of what it means to conform to an element type form had been provided that would restrict this problem to bring it down into the "comfortably computable" (real) world. the fact that such restrictions are _not_ included I attribute to ignorance of computability theory, indeed of computer science, and this makes those arguments to the effect that it's the computer scientists' fault that things are more complicated than they can "comfortably" handle, more than just specious. it makes those arguments obtain a tinge of arrogance towards the people who are set to do the "dirty work" of implementing their "clean design". may I suggest that they just refuse? but it's not just the use of attributes that makes HyTime loose. it's the use of the one space of unique ID's as the _universal_ naming and reference mechanism. this comes from the fact that to set up all the thingies that need ID's on them, you need a whole "prolog" just for that, sometimes in one place, sometimes scattered all over the place. some like to label them "meta-objects" and parade them far and wide as the solution to everything. suffice to say there are more schools of thought than this one.
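The "less-than-or-equal relation for a set of regular expressions" mentioned above can be made concrete with a small sketch. It is an illustration only, not anything HyTime or SGML prescribes: content models are reduced to sequence, alternation, optional and star over element names (SGML's "&" connector, "+" and exceptions are left out), and every function name here is hypothetical. The check walks both models in lockstep using partial derivatives and reports a counterexample state if the first model accepts something the second does not. The state space can grow exponentially, which is the cost the paragraph above complains about.

```python
from functools import reduce
from itertools import chain

EPS = ("eps",)  # the empty content


def sym(name):
    return ("sym", name)


def seq(*parts):
    return reduce(lambda r, s: ("seq", r, s), parts)


def alt(*parts):
    return reduce(lambda r, s: ("alt", r, s), parts)


def star(r):
    return ("star", r)


def opt(r):
    return ("alt", r, EPS)


def nullable(r):
    """True if the model r accepts empty content."""
    tag = r[0]
    if tag == "sym":
        return False
    if tag == "alt":
        return nullable(r[1]) or nullable(r[2])
    if tag == "seq":
        return nullable(r[1]) and nullable(r[2])
    return True  # "eps" and "star"


def _cat(r, s):
    return s if r == EPS else ("seq", r, s)


def pderiv(r, x):
    """Models describing what may follow after element x has been read."""
    tag = r[0]
    if tag == "eps":
        return frozenset()
    if tag == "sym":
        return frozenset([EPS]) if r[1] == x else frozenset()
    if tag == "alt":
        return pderiv(r[1], x) | pderiv(r[2], x)
    if tag == "seq":
        out = frozenset(_cat(d, r[2]) for d in pderiv(r[1], x))
        if nullable(r[1]):
            out = out | pderiv(r[2], x)
        return out
    return frozenset(_cat(d, r) for d in pderiv(r[1], x))  # "star"


def symbols(r):
    if r[0] == "sym":
        return {r[1]}
    return set().union(*(symbols(c) for c in r[1:])) if len(r) > 1 else set()


def included(a, b):
    """True if every content sequence allowed by a is also allowed by b."""
    alphabet = symbols(a) | symbols(b)
    start = (frozenset([a]), frozenset([b]))
    seen, todo = {start}, [start]
    while todo:
        sa, sb = todo.pop()
        if any(nullable(r) for r in sa) and not any(nullable(r) for r in sb):
            return False  # a accepts here, b does not
        for x in alphabet:
            na = frozenset(chain.from_iterable(pderiv(r, x) for r in sa))
            if not na:
                continue  # a cannot continue with x at all
            nb = frozenset(chain.from_iterable(pderiv(r, x) for r in sb))
            if (na, nb) not in seen:
                seen.add((na, nb))
                todo.append((na, nb))
    return True


# Hypothetical form (title, para*) and element type (title, para, para?).
form = seq(sym("title"), star(sym("para")))
elem = seq(sym("title"), sym("para"), opt(sym("para")))
assert included(elem, form)
assert not included(form, elem)
```

The example at the end uses made-up element names; the element type's model fits inside the form's model, but not the other way around, because the form allows a title with no paragraphs.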
[Erik Naggum] | further, there is one obvious drawback with using more than one | document to store a given piece of information: synchronization. we | have discussed this in the context of static documents, where this | problem is not so pronounced as in the general case. using a variety | of addressing mechanisms, HyTime is able to use indirection and | relative addressing in many very useful ways. however, the association | is inherently loose. (my proposal to use relative addresses is also | loose, but I never claimed it was robust.) [W. Eliot Kimber] | I would like to better understand why you think HyTime location | addressing is not robust. If robustness refers only to validation, | then these are two separate issues, and your earlier definition of | robustness does not apply to addressing in general. HyTime location | addressing is as robust as the data you're addressing allows. If | everything has an ID, addressing couldn't be more robust. If, on the | other hand, you're trying to locate characters in the content of | unidentified elements, then things are a bit shakier. But at least | HyTime gives you sufficient indirection so that you can choose the best | binding of locations to make your addresses as persitent as possible. | This is what I think of as robustness. I think we're talking a bit past each other. let me take another spin on this one. by asking for robustness I want to address the testability of something under stress. in this case, I want to know whether a particular change will affect the links that depend on (reference) this document. put another way, I want to be able to compute a dependency graph all the way down to the character if I have to, in order to show whether changing _this_ will require an update _there_ or not, and usefully, _where_ it would require an update. now, since HyTime provides mechanisms to point into documents that don't know that they are the target of links, this is a hard nut to crack. 
still, for a bounded object set (one of HyTime's better ideas), it is computable, provided you can wait, or tolerate an answer like "try, and see what happens". the HTML linking mechanism is not conceptually much weaker than HyTime's (although in practice it is much weaker), yet reference (link) maintenance with only a few changing documents is becoming a significant problem. the structure built by HTML is extremely unstable. the indirection in HyTime does help, but the fundamental problem is that while HyTime includes the right concept, activity tracking, there isn't even a hint at how this would or should be implemented in practice. it is a distinctly non-trivial exercise. my assessment of HyTime is not unlike my assessment of SGML: both rate A+ for concepts, but HyTime rates a D- for specification, whereas SGML rates a C. (while I'm rating the SGML-related standards, let me add that so far (I got DSSSL at the WG8 meeting that ended last week), DSSSL looks like a straight A for both concepts and specification.) I could immediately see that SGML was a Good Thing, and assumed that it shouldn't be all that much of a problem to implement it cleanly. with a lot of experience in disproving this assumption, I was a bit more cautious about HyTime. I have come to conclude that it would have been an excellent idea to spend a few more years on HyTime before it was released to the world in the guise of an International Standard. in retrospect, it was a very bad thing for me to (help) vote this thing through ISO without understanding it. I deeply regret that I vouched for it, but life doesn't have an "undo" function. the lesson to learn is to study those DIS'es carefully and not to confuse technical merit with political pressure. as far as I have heard over the past couple years, more national bodies have gotten the message. this should provide the necessary impetus to stop the heedless rushing I expect when SGML comes up for "review" in 1996. hope you enjoyed your labor day. 
#\ Newsgroups: comp.text.sgml Date: 06 Sep 1994 06:15:04 UT From: "W.D. Lindsey" \ Organization: NLnet Message-ID: \ References: \ <1994Sep2.110522.414@ittpub> <34f23l$17u@felix.dircon.co.uk> Subject: Re: Underscores and the Sgmls validating parser? [Bruce Hunter] | A much simpler (and recommended) method is just to provide SGMLS with | an SGML Declaration in which the underscore character is made a valid | NAME character. | | in the NAMING section just add the underscore character to the UCNMCHAR | and LCNMCHAR declarations, as in | | NAMING | UCNMCHAR "-._" | LCNMCHAR "-._" Have you tried this with sgmls version 1.1? I could not get it to work until I made the patch. That was the point. The NAMING declaration won't work without patching the source, the patch has no effect without the NAMING declaration you suggest. Cheers, -Bill Bill Lindsey william@ittpub.nl -- William D. Lindsey Prinses Irenelaan 13 2341 TP Oegstgeest lindsey@inter.NL.NET the Netherlands Newsgroups: comp.text.sgml Date: 06 Sep 1994 14:00:29 UT From: R A Milowski \ Organization: University of Minnesota Message-ID: \ References: <345987$5g9@finnegan.iol.ie> Subject: Re: SGML to Postscript/PDF [Sean Mc Grath] | I cannot help feeling that someone, somewhere has written a nice C | function library with function calls like :- | | newpage() - Start new postscript page In PostScript: showpage Note: Do not do a "showpage" for the first page! | setfont(x) - Select font In PostScript: /name findfont size scalefont setfont Note: I may be wrong about this part, I'd have to look it up. | boldon() - Turn on bold face To change to bold in PostScript, you must change the font to a boldface font. Thus, this is just like the setfont() above. | centretext("Hello World") - Display text centered on the | current line You'd have to write a nasty little PostScript procedure to do this but the final syntax would be: ("Hello World") PCenterText Where "PCenterText" is the PostScript procedure.
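As a rough illustration of the calls discussed above, here is a minimal Python sketch that writes out PostScript text for font selection and centered output. The function names (set_font, centre_text, page) are hypothetical and are not from any existing library; the 612-point page width assumes US Letter paper. Only standard PostScript operators are emitted (findfont, scalefont, setfont, stringwidth, moveto, show, showpage).

```python
def ps_escape(text):
    """Escape the characters that are special inside a PostScript (...) string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def set_font(name, size):
    """PostScript to select a font; bold is just a different font name."""
    return f"/{name} findfont {size} scalefont setfont"


def centre_text(text, y, page_width=612):
    """PostScript that shows `text` centred horizontally at height y.

    page_width is in points; 612 is an assumption (US Letter).
    """
    s = ps_escape(text)
    # stringwidth pushes (wx, wy); pop drops wy, leaving the string width.
    # x = (page_width - wx) / 2
    return (f"({s}) dup stringwidth pop "
            f"{page_width} exch sub 2 div {y} moveto show")


def page(lines):
    """Wrap a list of PostScript lines into a one-page program."""
    return "\n".join(["%!PS", *lines, "showpage"]) + "\n"


if __name__ == "__main__":
    print(page([
        set_font("Helvetica-Bold", 14),
        centre_text("Hello World", 700),
    ]))
```

Doing the centering arithmetic inside PostScript (with stringwidth) means the generating program does not need font metrics of its own, at the price of pushing layout logic into the interpreter.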
| Para ("I am a paragraph") - Output text as a paragraph, | wrapping text as required Again, you (or someone) could write a small procedure to set and wrap the string into a paragraph. ("I am a paragraph") PSetPara | I have that awful "I'm missing something here" feeling. Can anyone put | me out of my misery? Not really, what you want is a simple solution. BTW, I have done this before. Thus, it is possible. Unfortunately, I don't have my hands on either the PostScript or the SGML->PS interface. -- R. Alexander Milowski SGML Operations Manager milor001@maroon.tc.umn.edu Microcom Inc. +1 612 825 4132 SGML Consulting -- "The SGML Solutions Experts" Newsgroups: comp.text.sgml Date: 06 Sep 1994 15:03:23 UT From: "William G. Lederer" \ Organization: Another MCSNet Subscriber, Chicago's First Public-Access Internet! Message-ID: <34i0br$eng@Mercury.mcs.com> References: <9408311946.AA22157@helium.biomol.uci.edu> Subject: Re: Search for SGML parser generator [Louise Falevsky] | I am working on a project for Protein Science Journal to electronically | publish the journal on the WWW. I will be using the ISO 12083 DTD as a | basis for the SGML document markup. I will be using WAIS to index the | SGML marked-up documents. I want to create a parser from the 12083 DTD | so that I can parse the SGML documents for WAIS indexing. I have tried | to use SGML-ASP on another DTD and have had no luck in creating a | parser. The generator creates a grammar, but the doc_parser and output | *.asp are not created. I have now tried the HTML.dtd and other dtd's | and am still unsuccessful. I have not been able to find authors Sylvia | von Egmond and Jos Warmer E-mail addresses for direct help. I can locate his e-mail address for you if you really want it, but my suggestion is to get a copy of sgmls, which is publicly available, including source code. It is known to work on many platforms. Check with Archie for location of sgmls.
I would give you the exact location, but it is in my other office. Newsgroups: comp.text.sgml Date: 06 Sep 1994 16:05:07 UT From: Bob Agnew \ Organization: Science Applications International Corp. Message-ID: <1994Sep6.160507.4201@ast.saic.com> References: <9409012114.AA19892@source.asset.com> Subject: Re: SGML New User Requesting General Informatio [Claude L. Bullard] | After my posting to "newbies" on the basics of SGML, I received a | number of kind and complementary postings suggesting that I'd written | something of benefit to the community. Thank y'all very much. Nothing | works better on me than applause. | | Ed Brachman at Interleaf made some very good suggestions about | improving the post. | | | (1) Where you compare a DTD to an "ADT", I'm not sure you're doing | | the newbie any favors. I assume that "ADT" stands for "abstract | | data type" -- but I'm not sure that calling a DTD an ADT helps even | | for those of us who can make that assumption, and I'll bet that it | | goes over the heads of at least some of the newbies you otherwise | | do a good job of talking to. | | Quite right, Ed. But every newbie is not new to every thing. Some are | well-educated comp-sci types who just want to know what SGML IS-OR-Ain't. | A few well-placed "buzzWords" gives them the clues they are looking | for. Unless they are navigating the nets with Mosaic, they are | hopefully not true innocents. Even in that case, it behooves an | inquiring mind to get dictionaries for subjects they want to study. In | this and future posts on the subject, I am consciously trying to | propagate a meme: | | MyMEME: SGML can be used for more than books. To do it, one must get | Beyond The Book Metaphor... or finally realize that a lot of things we | think of as software are really automated books without binders. | | SERMON FOLLOWS (caveat emptor): I have to agree with Len here. 
I have found that for folks with a computer science background, all I have to do is say something like " A DTD is sort of a Meta-Grammar or EBNF specification which determines what all the allowable sentences are in the language of a particular document type" and they get it instantly. However, to get the same concept over to those who never took a formal language theory course, I have to say something like "A tagged SGML document consists mostly of tagged items. The DTD specifies what tags can appear in a particular document type and in what context they can appear, e.g., You can have a subparagraph tag inside a paragraph tag, but not the other way around." and most of them usually get it. The rest need to be walked through numerous examples and then most get it. Some will never get it ;-). In summary, I think it's best to include all three approaches in order of increasing complexity with perhaps a caveat as to whom the paragraph is intended for. Too bad we don't have Hyperlinks here; then we could use a "SkillTrack" attribute :-). -- "One man's syntax is another man's semantics." Newsgroups: comp.text.sgml Date: 06 Sep 1994 16:16:52 UT From: Bob Agnew \ Organization: Science Applications International Corp. Message-ID: <1994Sep6.161652.5304@ast.saic.com> References: <34f23l$17u@felix.dircon.co.uk> Subject: Re: Underscores and the Sgmls validating pa [Bruce Hunter] | A much simpler (and recommended) method is just to provide SGMLS with | an SGML Declaration in which the underscore character is made a valid | NAME character. | | in the NAMING section just add the underscore character to the UCNMCHAR | and LCNMCHAR declarations, as in | | NAMING | UCNMCHAR "-._" | LCNMCHAR "-._" Well that's just the problem. When you try this SGMLS complains about "Unsupported Feature" and about character 95. I assume the original poster did just what you suggested. I know I did and it failed. -- "One man's syntax is another man's semantics."
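For readers following the underscore thread, here is a small Python sketch of what the NAMING change is meant to achieve, independent of whether sgmls 1.1 accepts it. It is a model of the rule, not of any parser's code: the function name is hypothetical, only ASCII letters are treated as name start characters, and the defaults mirror the reference concrete syntax (extra name characters "-." and NAMELEN 8).

```python
import string


def is_sgml_name(token, extra_name_chars="-.", extra_start_chars="", namelen=8):
    """Check a token against a simplified model of SGML name rules.

    extra_name_chars stands in for the LCNMCHAR/UCNMCHAR strings and
    extra_start_chars for LCNMSTRT/UCNMSTRT; case folding is ignored.
    """
    if not token or len(token) > namelen:
        return False
    start = set(string.ascii_letters) | set(extra_start_chars)
    rest = start | set(string.digits) | set(extra_name_chars)
    return token[0] in start and all(c in rest for c in token[1:])


# With the default naming rules the underscore is rejected.
assert not is_sgml_name("sub_para")
# With "_" added to the name characters it is accepted.
assert is_sgml_name("sub_para", extra_name_chars="-._")
# Adding "_" as a name character does not make it a name *start* character.
assert not is_sgml_name("_para", extra_name_chars="-._")
```

Under this model, adding "_" to the name characters still does not allow a name to begin with an underscore; that would need the name start character strings to change as well.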
Newsgroups: comp.text.sgml Date: 06 Sep 1994 17:47:09 UT From: R A Milowski \ Organization: University of Minnesota Message-ID: \ References: \ <19940820.4480@naggum.no> \ <19940905.4851@naggum.no> Subject: Re: HyTime critique (was CONCUR usefulness existence proof) Not to stray too much, but, could either Erik or Eliot (both preferably) clarify an issue for me on HyTime: Might one call HyTime an "introspective" standard in the sense that it operates assuming that the SGML document is a resource to be queried? Whereas, DSSSL operates in an outward fashion using the SGML document as a starting point. If so, doesn't an introspective standard have a weak link to its resources since it needs to "query" or "link" into the document to perform its tasks? Thus, SGML provides a rigid framework and HyTime "softens" that rigidity. I don't claim to be an expert on HyTime, so please *no flaming*! -- R. Alexander Milowski SGML Operations Manager milor001@maroon.tc.umn.edu Microcom Inc. +1 612 825 4132 SGML Consulting -- "The SGML Solutions Experts" Newsgroups: comp.text.sgml Date: 07 Sep 1994 01:06:46 UT From: Edward Vielmetti \ Organization: Msen, Inc. -- Ann Arbor, MI (account info: +1 313 998-4562) Message-ID: <34j3n6$it4$1@heifetz.msen.com> References: \ <19940820.4480@naggum.no> \ <19940905.4851@naggum.no> Subject: Re: HyTime critique (was CONCUR usefulness existence proof) [W. Eliot Kimber] | These functions do not necessarily lend themselves to the sort of | comfortably computable processing that computer scientists seem to | prefer. [Erik Naggum] | I'll gladly take your word for it, but I'm not sure you understand just | how damaging to your own cause this statement is. If you can't compute things, then code can't be written, then work can't be done, and people move on to something else they can implement. From this I conclude that HyTime can be safely ignored since it's unlikely that anyone will be able to specify that an implementation meets any testing standards.
| the HTML linking mechanism is not conceptually much weaker than | HyTime's (although in practice it is much weaker), yet reference (link) | maintenance with only a few changing documents is becoming a | significant problem. the structure built by HTML is extremely | unstable. the indirection in HyTime does help, but the fundamental | problem is that while HyTime includes the right concept, activity | tracking, there isn't even a hint at how this would or should be | implemented in practice. it is a distinctly non-trivial exercise. Right. For that matter, the Internet is "extremely unstable", in the sense that failure happens all over the place and sometimes gets in the way of things working. It is possible to make reference (link) maintenance work better, if you have sufficiently motivated people doing the publishing; it is also possible to have reference (link) creation work better if you have sufficiently clued in people writing authoring tools. I am intrigued by the DSSSL comments, Erik, if you can give a good overview for those of us who can only stomach standards discussions once every few months it would be welcome. -- Edward Vielmetti, vice president for research, Msen Inc. emv@Msen.com Msen Inc., 320 Miller, Ann Arbor MI 48103 +1 313 998 4562 (fax: 998 4563) Newsgroups: comp.text.sgml Date: 07 Sep 1994 02:47:31 UT From: Tim Bray \ Organization: MIND LINK! Communications Corp., Langley, BC, Canada Message-ID: <34j9k3$a55@deep.rsoft.bc.ca> References: \ <19940820.4480@naggum.no> \ <19940905.4851@naggum.no> Subject: Re: HyTime critique (was CONCUR usefulness existence proof) This is an important discussion, since there are a lot of people out there worrying about whether they have to worry about Hytime. Summary: HyTime has problems, but we should use it anyhow. 
I think the specification of HyTime is ungodly bad and it worries me profoundly that people like Eliot, who've been sort of living this stuff for a long time now, admit to not understanding significant parts of it. And I basically just don't buy the sweeping evangelism coming from Eliot, Dr. G, and others, about how HyTime is the answer to everything. BUT! Everyone I know who's doing anything serious with SGML is also doing one or both of hypertext and multimedia. Right now, they're building their own hypertext/multimedia machinery at the application level. HyTime apparently provides a flexible, portable way to encode all of the linkage & traversal facilities you need to do these things. And, like SGML, while the standard is impenetrable, HyTime markup in text is actually pretty easy to read and figure out what it's doing. Given the above, from now on, when I have a customer who's going to be doing any of this stuff, I'm going to advise them to encode it in HyTime. There may well be solid commercial products on the market that portably and robustly do the right thing, and if not, you were going to have to build it yourself anyhow. So the decision seems like a no-brainer to me. Cheers, Tim Bray, Open Text Corporation Newsgroups: comp.text.sgml Date: 07 Sep 1994 02:55:00 UT From: Tim Bray \ Organization: MIND LINK! Communications Corp., Langley, BC, Canada Message-ID: <34ja24$ac1@deep.rsoft.bc.ca> References: <345987$5g9@finnegan.iol.ie> \ Subject: Re: SGML to Postscript/PDF I once typeset a book directly from SGML into PS, even did the paragraph filling in PS code right in the interpreter, the book came out looking great. (1-page modules, so no page breaking logic required). Conclusions: 1. It can be done. 2. It's a lousy idea; for the same reason that writing an SGML parser in assembly language would be wrong. 
Cheers, Tim Bray, Open Text Corporation Newsgroups: comp.text.sgml Date: 07 Sep 1994 04:19:35 UT From: Syun Tutiya \ Organization: Stanford University Message-ID: <34jf0n$a1q@Russell.Stanford.EDU> References: \ Subject: Re: SGML in Asian languages? [Alan Williams] | In a recent discussion, some of the people with whom I work have | brought up the question of how to do SGML tagging (specifically | sentential tagging) in documents written in ideographic languages, like | Chinese. Does anyone out there have any insight/experience on this | particular matter? Although I do not understand what Alan means by sentential tagging, we do not have any particular problem with the partially ideographic languages like Japanese. Characters are no problem as long as your parser understands SGML Declaration and SGMLS parses documents quite nicely. ISO 8879 is now translated into Japanese as JIS X 4151. Syun Tutiya Chiba University Newsgroups: comp.text.sgml Date: 07 Sep 1994 15:30:20 UT From: Terry Allen \ Organization: O'Reilly & Associates, Inc. Message-ID: <34kmac$9e2@ruby.ora.com> Subject: US standards publishers? I'm looking for US providers of ISO specs (I'm interested in the forthcoming DSSSL spec). The only source I've used was OMNICOM, and I'm looking for alternatives. If you know a good one, please send me email. Thanks. -- Terry Allen terry@ora.com Newsgroups: comp.text.sgml Date: 07 Sep 1994 15:33:07 UT From: Joachim Schrod \ Organization: TH Darmstadt, FG Systemprogrammierung Message-ID: <34kmfj$1217@rs18.hrz.th-darmstadt.de> References: <9409012114.AA19892@source.asset.com> <1994Sep6.160507.4201@ast.saic.com> Subject: Re: SGML New User Requesting General Informatio [Claude L. Bullard] | After my posting to "newbies" on the basics of SGML, I received a | number of kind and complementary postings suggesting that I'd written | something of benefit to the community. Thank y'all very much. Nothing | works better on me than applause. 
I'd like to add more applause -- and some critical comments, below... :-) A few well-placed "buzzWords" give them the clues they are looking for. [Bob Agnew] | I have to agree with Len here. I have found that for folks with a | computer science background, all I have to do is say something like " A | DTD is sort of a Meta-Grammar or EBNF specification which determines | what all the allowable sentences are in the language of a particular | document type." Actually, I think this "buzzWord" is better than the ADT. A DTD is not an ADT, it does not define any semantics. And defining abstract semantics is all an ADT is about... (cf. Barbara H. Liskov and S. N. Zilles: "Programming with Abstract Data Types", SIGPLAN Symposium on VHLL, 1974. That's the seminal article that introduced the term ADT, btw.) Joachim -- Joachim Schrod Email: schrod@iti.informatik.th-darmstadt.de Computer Science Department Technical University of Darmstadt, Germany Newsgroups: comp.text.sgml,comp.text Date: 07 Sep 1994 15:55:31 UT From: Tim Bray \ Organization: Open Text Corporation Message-ID: <34knpj$1c7@deep.rsoft.bc.ca> Subject: Electronic Document Viewers - Your Chance to Plug Yours At SGML '94, I'm going to be doing a survey talk on document viewing technology. (There's a whole series of these survey talks on various aspects of the technology). The idea is to introduce the design issues and provide a framework for classifying the products that are out there. The purpose of this note is to solicit input. I know of the existence of the following viewing systems, and have at least some information on hand about them. If your product isn't on the list, or it is but you think I should get an up-to-date demo or technical literature, please send it to me, either on email or Tim Bray, Open Text, 101-1965 W. 4th Ave., Vancouver, B.C. Canada V6J 1M8. Since this is a technology survey, I don't mind talking about projects-in-progress.
I AM PERFECTLY AWARE that the following list includes some products with little or no connection to SGML, and products that vary wildly in their features and capabilities.

Vendor        Product        Comment
------        -------        -------
Adobe         Acrobat        Have heard SGML story, is there a pos'n paper?
Bellcore      Superbook
EBT           Dynatext
Folio         Views          I could use new literature
HaL           Olias
IBM           Bookmaster     Is that the actual product name?
Interleaf     WorldView      Any recent SGML news?
NCSA          Mosaic         There's a PC version called Chaos? other WWW viewers?
NoHands       Common Ground
Northern Tel  Helmsman       Still being marketed? No recent sightings
Open Text     Lector
SoftQuad      Explorer       I know, SQ didn't write it
WAIS          WAIS           ...and there are loads of Z39.50 clients
Westinghouse  Pathways
WordPerfect   (lost name)    Need more info

Thanks in advance for any info. Come to SGML '94! Cheers, Tim Bray, Open Text Corporation Newsgroups: comp.text.sgml Date: 07 Sep 1994 16:36:44 UT From: Bob Agnew \ Organization: Science Applications International Corp. Message-ID: <1994Sep7.163644.9188@ast.saic.com> References: <34kmfj$1217@rs18.hrz.th-darmstadt.de> Subject: Re: SGML New User Requesting General Inform [Joachim Schrod] | Actually, I think this "buzzWord" is better than the ADT. A DTD is not | an ADT, it does not define any semantics. And defining abstract | semantics is all an ADT is about... (cf. Barbara H. Liskov and S. N. | Zilles: "Programming with Abstract Data Types", SIGPLAN Symposium on | VHLL, 1974. That's the seminal article that introduced the term ADT, | btw.) OK -- Back to defending Len. The paradigm that I used here, that of a meta-grammar for a regular expression syntax, is only one of many possible. It is probably the most fundamental in that it borrows directly from the standard and its formulation and is most useful for explaining the mechanics of DTDs and how they relate to document instances. However, an SGML document instance can at once be all of the following things: 1) A stream of characters. 2) A string.
3) An entity. 4) An instance of a Concrete Data Type. 5) A specification for an Abstract Data Type. 6) A tree structure. 7) Input data to an application. 8) A content labeled document. 9) An ostensibly innocuous document with secret information embedded in unknown ways. 10) The secret to life, the universe, and all that. 11) .... your entries here ......... Consider the following discussion: I use the word CDT or concrete data type in that purely abstract classes are not supported directly by SGML itself; however most people use the term ADT when they really mean a CDT. The confusion derives from how the term "abstract" is used since a "type abstraction" can be "concrete". A DTD might be regarded as the specification part of a concrete class; it does not include the methods. They belong to the semantics of the application and usage. One certainly could write a DTD which describes a meta grammar for defining pure virtual C++ classes; however, a document of this type is still a string of characters. What the DTD does tell us however, is what tags can occur in a document and in what context they can occur. In short it specifies how the semantic elements of an application may be embedded in a stream of characters and what characters may be used, etc. Clearly, it can be useful to have more than one paradigm for a document and a document type. -- "One man's syntax is another man's semantics." Newsgroups: comp.text.sgml Date: 07 Sep 1994 16:41:36 UT From: Chris Hector \ Organization: Cray Research Inc. Message-ID: \ Subject: What DTD to use for Man-Pages? I want to put some man pages into SGML and I am looking for a DTD. I would like something tailored toward traditional UNIX man pages rather than an all-encompassing DTD. Does anyone have a favorite - or a place that I can look for DTD's? Thanks in advance for your help. Chris -- Chris Hector | cjh@cray.com Cray Research, Inc | Opinions expressed are my own 655-F Lone Oak Dr. | and not necessarily those of my employer. 
Eagan, MN 55121 Newsgroups: comp.text.sgml Date: 07 Sep 1994 18:54:42 UT From: "W. Eliot Kimber" \ Organization: Passage Systems, Inc. Message-ID: \ References: \ <19940820.4480@naggum.no> \ <19940905.4851@naggum.no> Subject: Re: HyTime critique The last part of the subject post is to the effect "I find HyTime to be inelegant and insufficiently precise" or, in the words of the horse in Ren and Stimpy cartoons: "No sir, I don't like it." To be honest, I'm quite surprised by this response and somewhat puzzled by it. I'm not going to try to argue about it because Erik has clearly made up his mind and I doubt there's anything I could say that would change it. However, I think Erik's complaint largely misses the key point of HyTime, which is that it attempts to standardize what people were either doing or were going to do as they moved their SGML applications to include more hypermedia aspects. Even if HyTime is only a stopgap waiting for a more elegant solution to these problems, I think it is still necessary. Of the various solutions to these problems defined within the context of existing SGML functionality, HyTime appears to be the most complete. That, coupled with its status as an international standard was enough for me to want to invest time in understanding it. Having gained a fairly deep understanding of HyTime, I am now satisfied that it solves the problems I and my customers are faced with in a way that is practical to implement and of lasting benefit. Of course, I come from the point of view of a data owner whose primary concern is protecting his investment in his data and protecting it from those who would attempt to usurp control to make their own lives easier. That means some of the concerns Erik puts uppermost are of less importance to me. [W. 
Eliot Kimber] | I also do not consider marked sections to be generally useful or | robust, both because there is no dynamism of marked sections controls | and because marked sections are not bound to element boundaries, which | can make them difficult to manage and work with. [Erik Naggum] | I don't understand what "dynamism of marked section controls" refers | to, so I can't address that. Marked sections as means of doing configuration management lack the sort of dynamic control needed to do sophisticated version control. Specifically, there is no way to control the inclusion or exclusion of marked sections based on the local properties of the data (e.g., attribute values of the elements themselves). It is also difficult to express conditions based on multiple variables and boolean combinations. In designing the InfoMaster architecture, we decided that marked sections were insufficient for providing the level of configuration management needed for IBM documents (and documents of similar complexity). We chose instead to define an application-specific mechanism that uses the properties of elements to determine what data to retrieve. This solution required only that our document types provide sufficient flexibility in grouping data and elements so that authors had a reasonable level of control. By codifying this mechanism in the InfoMaster Architecture we hoped to promote it as a general solution shared by a number of applications (and as it happens, the retrieval semantic can be expressed as a set of HyQ queries used to build an "inclusion list" of objects to be processed, so that our application-specific mechanism can be expressed in HyTime terms, making the specification of the mechanism interchangeable and directly implementable or processable by HyTime-based processors with the necessary functionality). | however, if I replace "marked sections" with "entities" in the rest of | this sentence, I end up with a strong argument against entities! I suppose.
The fact that marked sections are not necessarily bound to element boundaries is not the real problem with marked sections, although it can lead to complications, which are essentially the same complications you can have with entities that don't align to element boundaries, so perhaps it is not really a valid objection to marked sections in general. | maybe this is unfair, since we all agree that entities are very useful, | but it shows that arguments that are used negatively about one thing | and positively about another aren't very good arguments and probably | hide the _real_ arguments. that marked sections are constrained to | entity boundaries in my opinion should make them far easier to deal | with than entities that aren't bound to element boundaries, and thus | not so difficult to manage or work with. they are also not declared in | a distant prolog, which should make them easier to deal with. I think there must be a typo here ("are constrained" should be "are not constrained"). The main difficulty with marked sections presented by their not being bound to element boundaries is ensuring that when you move data relative to marked section boundaries you have moved it appropriately. You would have the same difficulty with entities in an editor that presented a view of an entire document, including any general text entities, as a single flow. [W. Eliot Kimber] | I was referring to the robustness of the functionality HyTime | represents. [Erik Naggum] | you said: "My take on CONCUR is that what CONCUR can do weakly HyTime | allows you to do robustly." since there isn't much description of "the | functionality [that] HyTime represents" in the standard, I interpreted | this to refer to the syntax of HyTime, just as CONCUR refers to the | syntax of _its_ mechanism. I was probably not clear about the use of CONCUR to which I was referring. 
The TEI has suggested that CONCUR can be used to represent the combination of a base text plus changes to it, commentary, and other layered information that scholars often work with when analyzing texts and other artifacts. While it is probably the case that CONCUR can in fact do this, it has been my contention that the relationships at work can be better represented through the use of HyTime hyperlinks. Here's why I think so:

1. Hyperlinks in HyTime must represent defined relationships, allowing you to be arbitrarily precise about what a given relationship means. CONCUR does not provide this.

2. As the number of layers increases for a given document, the markup weight will become quite large, forcing the processing system to deal with all layers (even if that only means reading the character stream representing the document). However, with a hyperlink-based approach, each layer can be independent of the other layers as well as the base document, letting authors or applications choose only those layers they want or need to process. This provides a great deal of flexibility in how data is organized and managed.

3. Because the layers can be independent of the base document, they can be processed and interchanged independent of the document itself.

4. The location methods of HyTime allow much greater choice of association than CONCUR can because you are not constrained to what you can identify with elements in a given document type. Aggregate anchors, span locations, and locations in multiple documents all become possible.

In short, by using a hyperlink-based approach, your freedom, flexibility, and potential for interchange and re-use are vastly improved. It may also be that I have an instinctive distrust of any data representation method that appears to bind too much information directly in the base information.
| sorry to say so, Eliot, but despite valiant efforts both on your part | and by several other people, the "point" of HyTime is still a widely | dispersed mist consisting mainly of hype and vaporware. I do know | which problems HyTime tries to solve. you should have discovered by | now that I don't think HyTime succeeds in solving those problems in a | way which will make computer scientists (or, heck, Microsoft-influenced | coders) adopt it. But the question is not whether computer scientists will adopt HyTime, but whether or not information owners will adopt it. Remember that SGML and HyTime, and by extension computers and software vendors, exist to serve the needs of information owners. I think the existence proofs of tools like SoftQuad Explorer, MarkMinder/HyMinder, and my own Perl hacks should be sufficient to demonstrate that HyTime is more than hype and vaporware. HyTime has already solved for me problems that make its use compelling, all other concerns notwithstanding. Do computer scientists in general accept SGML? Does it matter? I don't think so. APL2 is one of the most elegant programming systems in existence. Computer scientists love it. So what? I guess that having spent my entire professional life in industry solving immediate problems with the tools at hand has taught me that all the computer science I need I learned in data structures 101. This is a bad attitude and I try to moderate it. Nevertheless, focus on the computer science aspects has, for me, been much more of a barrier to solving problems than an aid. Thus my bad attitude and my "what me worry" approach. The question I typically ask is "can I think of a way to solve this problem?" If the answer is yes, that's all I need. If the answer is no, then I ask "Is this problem provably unsolvable?" If it is, I don't try to solve it. If it isn't, I try harder to think of a solution.
Part of the problem for me may be that many of the problems posed by text processing and hypermedia are relatively mundane and easy to solve using basic data processing techniques. Optimal solutions may need more sophistication (for example, optimizing the parsing and indexing of SGML data for later retrieval), but that optimization occurs within the context of the larger problem space for which the parameters have already been defined, like the use of SGML and HyTime. Given the problem "how do I index SGML data so that I can resolve HyTime locations quickly" I will certainly look to computer scientists to find a solution, but the right answer cannot be "don't use HyTime". It's not a question of can or can't but a question of optimization. I guess I know that once I've demonstrated *a* solution to a problem, I know I can interest people in finding an *optimized* solution to a problem, assuming that the solution to the problem provides some compelling benefit (and of course, I only get paid to solve problems of compelling benefit, so that's all I try to solve). In many cases, the benefit is so compelling that any reasonable solution is worth the effort. And you can be reasonably assured that once a solution is in place, you can justify developing or acquiring an optimal solution. HyTime reflects the reality of the data we have and the things we want to do to it -- it doesn't impose any processing you wouldn't have one way or another. Therefore, I know that solutions to problems exposed by the use of HyTime have the same degree of solvability and optimization potential whether HyTime were used or not. In other words, if the HyTime solution can't be implemented or made to perform, no other equivalent solution will either. I also know that HyTime is flexible enough that you can impose the constraints necessary to make a system practical even if the general solution possible with HyTime is not practical today. [W. 
Eliot Kimber] | HyTime doesn't do validation of document type declarations *because | it's incomputable in some cases*. [Erik Naggum] | although I don't think you know the force of this statement, either, | I'll accept it at face value and quote you on it. I think, however, | that you should be a little more careful with what you say. there | _are_ computer scientists reading this, and they _may_ be wondering | whether to adopt HyTime or leave it in paper. I think I was wrong on this point. I think you *can* tell computationally whether or not a given DTD allows *only* valid HyTime documents. See the discussion below. [W. Eliot Kimber] | Also, it is essential that HyTime allow document types that allow | invalid constructs -- that's the only way HyTime can hope to peacefully | integrate with existing applications that predate HyTime or that have | other requirements that do not always allow HyTime conformance. [Erik Naggum] | that's the "only way"? geez. as far as I could understand, the whole | idea with HyTime was that it should not require modification of the | documents into which you point. Integrated Open Hypermedia, right? | this I always interpreted to mean that you would have to build in some | HyTime support in your applications such that they would not _need_ to | mess with extant document types or instances, but could reference them | from outside. now all of a sudden we _have_ to mess with extant | document types? what next? I'm not sure I follow your argument at all. The point is merely that if you want to add HyTime support to an existing document type without at the same time making all of your existing documents invalid, you have to allow both HyTime and non-HyTime constructs in the same document type. How could it be otherwise? It doesn't have anything to do with documents to which you point, it only has to do with documents you are putting the pointers in.
| I do understand that you need to emphasize that document types cannot | always be validated, but that's not my question. my question is: if | and when I want to validate it, can I? you don't answer this question. Without working out the algorithm, I would say that the answer is yes. I can definitely tell whether or not a document type *only* allows the creation of valid types. This is a simple transform by which you change the GIs in the DTD into the architectural form names of those elements and then compare the resulting content models to the content models of the architectural forms as defined in the standard. There may, however, be some complicating factor I haven't thought of. [W. Eliot Kimber] | But of course, there's no reason you can't define document types that | are *guaranteed* to produce structurally valid HyTime documents. If | you want that level of comfort, define your document types that way. I | certainly recommend it if at all possible, but there are many cases for | which it would be impossible or inappropriate (the TEI, Docbook, and | HTML come to mind immediately). [Erik Naggum] | I was asking whether it was possible to know, a priori, whether a | document type conforms to HyTime, that is, whether a document type | would always produce conforming HyTime documents without having to | parse all the possible instances of that document type. I think the answer is definitely yes, as explained above. On the other hand, the HyTime-defined content models are simple enough that you can validate a DTD by inspection for the most part. It is only the subelements of architectural forms other than HyBrid and HyDoc that pose any real constraints, and those content models are all fairly simple and easy to check. Ditto for attribute lists. The only danger I can see is the use of inclusions, by which you might inadvertently allow invalid subelements within HyTime-unique elements (e.g., allowing HyBrid-form elements within nameloc-form elements or something).
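As a rough illustration of the transform described above (rename each GI to its architectural form name, then compare content models), here is a minimal Python sketch. It rests on loud assumptions: content models are reduced to plain sets of permitted child names rather than full SGML model groups, the `FORMS` table and the `check_dtd` function are hypothetical names invented for this example, and the only form content model used, evsched's (evgrp|event)*, is the one quoted earlier in this thread. It is not a conformance checker.

```python
# Architectural forms: form name -> child form names it permits.
# Only evsched is listed, taken from the (evgrp|event)* model quoted
# earlier in the thread; forms missing from the table (HyBrid, HyDoc,
# and so on) are treated as unconstrained. This table is an assumption.
FORMS = {
    "evsched": {"evgrp", "event"},
}


def check_dtd(elements, forms=FORMS):
    """Report children that a constrained architectural form does not allow.

    elements maps each GI to (form name or None, set of child GIs).
    Inclusions can be modelled by adding them to the child set.
    Returns a list of (parent GI, child GI, child's form name).
    """
    problems = []
    for gi, (form, children) in elements.items():
        if form is None or form not in forms:
            continue  # not a HyTime element, or a form with no constraint
        allowed = forms[form]
        for child in sorted(children):
            # Transform step: replace the child's GI by its form name.
            child_form = elements.get(child, (None, set()))[0]
            if child_form not in allowed:
                problems.append((gi, child, child_form))
    return problems


# Hypothetical document type: "schedule" is an evsched-form element that
# also admits a non-HyTime "note" element.
example_dtd = {
    "schedule": ("evsched", {"scene", "note"}),
    "scene": ("event", set()),
    "note": (None, set()),
}

print(check_dtd(example_dtd))
```

A real check would have to compare full content models (sequence, occurrence indicators, exceptions) and attribute lists, which this set-based comparison ignores.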
The other aspects of validation that HyTime documents are subject to are not determined by the DTD, but are a function of how various location methods are used in combination or the association between anchor roles and link ends, things that can only be validated in instances. | but to counter your actual argument: an element does not become an | instance of an element type form just because it says so. a number of | factors are involved, such as congruency of attribute lists and content | models. I think it's clear now that that can be checked easily enough. In fact, the more I think about it, the more I think that my original assumption, that the reason DTDs are not validated is because they *can't* be, is in fact wrong. The reason DTDs are not validated is because you don't always want to have a DTD that only allows HyTime documents, not because you can't tell if a DTD does only allow valid HyTime documents, because you can check that. | but it's not just the use of attributes that makes HyTime loose. it's | the use of the one space of unique ID's as the _universal_ naming and | reference mechanism. this comes from the fact that to set up all the | thingies that need ID's on them, you need a whole "prolog" just for | that, sometimes in one place, sometimes scattered all over the place. | some like to label them "meta-objects" and parade them far and wide as | the solution to everything. suffice to say there are more schools of | thought than this one. I'm not sure what you're getting at here, either. What would you use other than unique IDs? How are SGML unique identifiers any different from key fields in relational databases or object IDs in object-oriented databases? There has to be some sort of defined name space with guaranteed unique names. I'm not getting your point. And I'm not sure what you mean by a "prolog". | I think we're talking a bit past each other. let me take another spin | on this one. 
by asking for robustness I want to address the | testability of something under stress. in this case, I want to know | whether a particular change will affect the links that depend on | (reference) this document. put another way, I want to be able to | compute a dependency graph all the way down to the character if I have | to, in order to show whether changing _this_ will require an update | _there_ or not, and usefully, _where_ it would require an update. now, | since HyTime provides mechanisms to point into documents that don't | know that they are the target of links, this is a hard nut to crack. | still, for a bounded object set (one of HyTime's better ideas), it is | computable, provided you can wait, or tolerate an answer like "try, and | see what happens". I agree this is a difficult problem, but what's your point? Its difficulty is inherent in the problem itself, not in the use of HyTime to represent documents that expose it. Dependency tracking of references is solved by resolving a set of addresses and seeing what you hit. In the abstract the problem is not difficult to solve at all. The problem is optimizing the system so that it performs acceptably and being able to detect that for some forms of address, the answer cannot be found in a reasonable period of time. But I think these problems are independent of the use of HyTime and would be present in any system of equivalent flexibility and utility. Certainly existing indexing and retrieval systems suggest that practical solutions can be found and implemented. At the same time, users have to understand the implications of asking certain questions. This is no different for HyTime-based hypermedia than it is for SQL databases. This is the price of function and power. -- \
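The "resolve a set of addresses and see what you hit" idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: every link's addresses are taken to be already resolved into character spans within a bounded set of documents (the expensive part, which the sketch does not attempt), and the names `Span`, `overlaps`, and `affected_links`, along with the file names, are hypothetical and not part of HyTime.

```python
from typing import NamedTuple


class Span(NamedTuple):
    doc: str    # document identifier (hypothetical file name)
    start: int  # first character offset, inclusive
    end: int    # last character offset, exclusive


def overlaps(a, b):
    """True when two spans share at least one character of one document."""
    return a.doc == b.doc and a.start < b.end and b.start < a.end


def affected_links(resolved, change):
    """List the links whose resolved targets are touched by a change.

    resolved maps a link id to the spans its addresses resolve to.
    """
    return sorted(
        link
        for link, spans in resolved.items()
        if any(overlaps(span, change) for span in spans)
    )


# Hypothetical resolved link table for a bounded object set of two documents.
resolved_links = {
    "link1": [Span("a.sgm", 10, 20)],
    "link2": [Span("a.sgm", 30, 40), Span("b.sgm", 0, 5)],
    "link3": [Span("b.sgm", 100, 120)],
}

# An edit touching characters 15..34 of a.sgm.
print(affected_links(resolved_links, Span("a.sgm", 15, 35)))
```

The linear scan is the unoptimized form of the answer; an indexed structure such as an interval tree per document would be the kind of optimization the post leaves to implementors.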

Newsgroups: soc.culture.scientists,comp.text.sgml,comp.infosystems.www.misc,soc.libraries.talk Followup-To: comp.text.sgml,comp.infosystems.www.misc Date: 07 Sep 1994 19:28:37 UT From: Cameron Laird \ Organization: NeoSoft Internet Services +1 713 684 5969 Message-ID: <34l495$8qi@Starbase.NeoSoft.COM> Subject: HTML style: bibliographic elements I'm looking for guidance on good HTML style in bibliographic citation. For now, when I'm writing a WWWable text, and I want to refer to a book, I italicize \Book Title\ the title; I simply plaintext-quote article titles: "Explanation of things", \Journal of everything\, volume ... Is there yet a sense of "common practice" in this domain? What is it? I've narrowed follow-ups. -- Cameron Laird ftp://ftp.neosoft.com/pub/users/claird/home.html claird@Neosoft.com (claird%Neosoft.com@uunet.uu.net) +1 713 267 7966 claird@litwin.com (claird%litwin.com@uunet.uu.net) +1 713 996 8546 Newsgroups: comp.text.sgml Date: 07 Sep 1994 20:55:06 UT From: Ed Blachman \ Organization: Interleaf, Inc. Message-ID: \ References: <9409012114.AA19892@source.asset.com> Subject: Re: SGML New User Requesting General Information [Claude L. Bullard] | After my posting to "newbies" on the basics of SGML, I received a | number of kind and complementary postings suggesting that I'd written | something of benefit to the community. Thank y'all very much. Nothing | works better on me than applause. As I've told Len in email, and just in case it's not clear, I mean to join in the applause for your posting. It's just that as an (amateur) editor, I applaud by picking nits.... | Ed B[l]achman at Interleaf made some very good suggestions about | improving the post. | || (1) Where you compare a DTD to an "ADT", I'm not sure you're doing the || newbie any favors. 
I assume that "ADT" stands for "abstract data type" || -- but I'm not sure that calling a DTD an ADT helps even for those of || us who can make that assumption, and I'll bet that it goes over the || heads of at least some of the newbies you otherwise do a good job of || talking to. | | Quite right, Ed. But every newbie is not new to every thing. True; you have to pick an audience, and write for it. If I were writing for an audience of folks who could do the mapping of ADT to Abstract Data Type, I'd use Bob Agnew's EBNF/metagrammar lingo -- it happens to work better for me. But you're writing this, not me. On the other hand, in my limited experience, most of the people to whom I've tried to explain SGML are (a) literate, (b) intelligent and (c) not familiar with ADTs, EBNF, automata theory or much else of computer science. This is so much the case that I have problems connecting the term "newbie" with folks who are literate in CS; so I assumed you were writing for the folks I've talked with... hence my comment. But I tend to make my SGML- explanation attempts in social rather than professional contexts.... Maybe you could add a footnote-type paragraph that would expand ADT to Abstract Data Type, briefly explain the concept, and point to a reference on the topic? That might satisfy your goal (of broadening readers' horizons) while giving non-CS types a chance to understand DTDs enough to follow the rest of the excellent stuff you wrote. || (2) In noting points about the badbook DTD, you claim that the use of || the PUBLIC keyword implies that the DTD has been formally registered || with some body empowered to do formal registrations... | | Hmm, I was trying to say the opposite. That while the identifier | appears to make that claim, there is no way to check it. My apologies for a lack of clarity. I understood what you were saying; I meant to disagree that "the identifier appears to make that claim", and that this made the use of an unregistered public id "bad". 
I think that unregistered public ids are a useful tool when understood, and (in conjunction with stuff like SGML Open's Entity Management Resolution) a worthwhile piece in the kind of agreement among trusted parties you rightly note as being necessary for successful interoperability. -- ed Newsgroups: comp.text.frame,comp.unix.advocacy,comp.text.sgml,comp.infosystems.www.misc,comp.society.futures,news.groups.questions,misc.writing,alt.culture.internet Followup-To: comp.text.frame Date: 07 Sep 1994 20:55:50 UT From: Cameron Laird \ Organization: NeoSoft Internet Services +1 713 684 5969 Message-ID: <34l9cm$f4d@Starbase.NeoSoft.COM> Subject: Electronic publication: miscellaneous questions Should I attend Frame Technology Corporation's seminar (sales presentation) on FrameViewer 4? Background: I'm strategizing about organizational info-centers. My current employer, for example, ought to be able to say to its engineering staff, "fire up Mosaic http://www.ourserver/infocenter.html when you come in Monday, and find everything -- telephone lists, calendars, procedure manuals, ..." What's a good way to implement this? Here are some things I know: 1. I like Frame Technology. I'm comfortable with FM, and I've seen winning applications built with help engines relying on FM. On the other hand, this involves cash outlays, and licensing, and I have to make a *very* good case to justify those. 2. Microsoft's Help model is successful. On the other hand, it doesn't know about networks, and I don't expect it to for some time. 3. Somebody's going to come out on top in the next couple of years. We who read these newsgroups *know* that a distributed hypertext system is what we need for lots of situations; the only question is whether to implement it with technology from a WWW model, or Adobe, or Microsoft, or WordPerfect, or ... 
This is a decision that matters, too, because, although one can anticipate filters that will unify the different models, they'll probably take as long to build as it took to make Word-to-WordPerfect transformations as painless as they now are. My current approach: distribute hypertext in HTML, with appropriate Mosaic clients on existing hardware. Generate the HTML files by writing in hyper-linked FM, and then transforming. None of the fm2html-like filters yet satisfy me, but that's the best I can see. Should I care about FrameViewer? Who does? Related question: anyone have interesting stories about distributed publishing of software? There'll come a day when serious applications include embedded references to remote hypertexts. Has it happened yet? I'm open to suggestions on which newsgroups like to mull over such topics. For now, I've narrowed follow-ups to comp.text.frame alone. -- Cameron Laird ftp://ftp.neosoft.com/pub/users/claird/home.html claird@Neosoft.com (claird%Neosoft.com@uunet.uu.net) +1 713 267 7966 claird@litwin.com (claird%litwin.com@uunet.uu.net) +1 713 996 8546 Newsgroups: comp.text.sgml Date: 07 Sep 1994 21:11:45 UT From: Erik Naggum \ Organization: Naggum Software; +47 2295 0313 Message-ID: <19940907.4888@naggum.no> References: \ <19940820.4480@naggum.no> \ <19940905.4851@naggum.no> \ Subject: Re: HyTime critique (was CONCUR usefulness existence proof) [W. Eliot Kimber] | The last part of the subject post is to the effect "I find HyTime to be | inelegant and insufficiently precise" or, in the words of the horse in | Ren and Stimpy cartoons: "No sir, I don't like it." I'm sorry I can't find a witty response to that, so I have to admit that you win this debate. congratulations, Eliot! and good luck. #\ -- Microsoft is not the answer. Microsoft is the question. NO is the answer. 
Newsgroups: comp.text.sgml,comp.infosystems.www.misc Date: 08 Sep 1994 03:59:20 UT From: Marc VanHeyningen \ Organization: Computer Science Dept, Indiana University Message-ID: <16834.778996760@moose.cs.indiana.edu> References: <34l495$8qi@starbase.neosoft.com> Subject: Re: HTML style: bibliographic elements [Cameron Laird] | I'm looking for guidance on good HTML style in bibliographic citation. | For now, when I'm writing a WWWable text, and I want to refer to a | book, I italicize | | \Book Title\ | | the title; I simply plaintext-quote article titles: | | "Explanation of things", \Journal | of everything\, volume ... | | Is there yet a sense of "common practice" in this domain? What is it? \. -- Marc VanHeyningen \ Newsgroups: comp.text.sgml Date: 08 Sep 1994 08:58:24 UT From: "William D. Lindsey" \ Message-ID: <1994Sep8.105825.421@ittpub> References: <34kmac$9e2@ruby.ora.com> Subject: Re: US standards publishers? [Terry Allen] | I'm looking for US providers of ISO specs (I'm interested in the | forthcoming DSSSL spec). The only source I've used was OMNICOM, and | I'm looking for alternatives. If you know a good one, please send me | email. Thanks. How about: American National Standards Institute 11 West 42nd Street, 13th floor New York, NY 10036 USA (212) 642-4900, fax: (212) 302-1286 Now, who can direct me to the source in The Netherlands? Regards, Bill -- william@ittpub.nl (BTW if you've sent mail sent to "bill@ittpub.nl", please resend to the above address. I never recieved it.) Newsgroups: comp.text.sgml Date: 08 Sep 1994 09:04:50 UT From: "S. A. van Merrienboer" \ Organization: TNO Physics and Electronics Laboratory Message-ID: <1994Sep8.090450.9514@fel.tno.nl> Summary: Can anybody mail me a SGML, HyTime, or HTML FAQ? Subject: SGML, HyTime, HTML FAQ's? I'm desperate for a SGML, HyTime or HTML FAQ, because I'm new on these subjects. Can anybody mail me these FAQ if they do exist? 
Thanks Siem van Merrienboer svmerrienboer@fel3.fel.tno.nl Newsgroups: comp.text.sgml Date: 08 Sep 1994 11:23:52 UT From: Roger Reading \ Organization: SYNTEGRA - The systems integration business of BT Message-ID: <34ms88$eqg@pheidippides.axion.bt.co.uk> References: <34kmfj$1217@rs18.hrz.th-darmstadt.de> <1994Sep7.163644.9188@ast.saic.com> Subject: Re: SGML New User Requesting General Inform [Claude L. Bullard] | After my posting to "newbies" on the basics of SGML. Is it possible for someone to forward a copy of the document, or post the location, mentioned above. I have search the conference but cannot find the location of said document. Regards Roger Reading -- Roger V. Reading Applications Manager readingr@fleet.syntegra.bt.co.uk Syntegra +44 1252 777779 Newsgroups: comp.text.sgml,comp.text Date: 08 Sep 1994 15:40:43 UT From: "Rita E. Knox" \ Message-ID: \ Keywords: document modelling Subject: Looking for speakers for Documation '95 I am chairing a session at Documation '95 -- "the international forum for document management applications, document system technology and interoperability solutions" -- which will be held from March 7-9 at the Long Beach Convention Center, Long Beach, CA. A description of the session follows: ------------------------------------------------------------- Session for Documation '95 Session Chair: Rita E. Knox Title: Data-Driven Documentation: Modelling Issues Summary: There are many advantages to identifying content in document data bases. Among other things it supports cross-referencing, hypertext navigation, automated data verification and update, and auto-generation of document components. Such content must be identified in a meaningful way -- there must be a correspondence between the document content definition and the natural structure of the information being documented. 
However, at the same time that document automation experts are developing content models to support documentation uses, there are domain experts who are developing content models to support many applications other than documentation. Where does the line between these potentially redundant efforts fall? What work should each "side" of the industry perform and how might the efforts be coordinated? This session explores these issues by providing examples from different industries where such concurrent modelling efforts are in progress. Suggested topic areas: -- Law/Legal publishing -- Pharmaceutical/New Drug Applications -- General Information/Newspaper Publishing -- Product Data Exchange (STEP)/Technical Documentation ------------------------------------------------------------- I am looking for 2-3 speakers to participate in this session. Potential speakers may be working in one of the suggested topic areas or in some other area where basic domain modelling and documentation modelling may be occurring simultaneously. Interested individuals should send me a brief abstract (500 words) describing their proposed presentation that would address this topic. (Please send to either \ or \) I will respond to all submissions no later than 1 November when I have reviewed all abstracts and made a selection. Thanks. -- Rita Knox -- Rita E. Knox, Ph.D. v: 908.576.8678 Knox\&Assocs/Martin Hensel Corp. f: 908.576.8679 167 Winding Way knox@kanda.com Little Silver, NJ 07739 OR rknox@cnj.digex.net Newsgroups: comp.text.sgml Date: 08 Sep 1994 17:48:43 UT From: "W. Eliot Kimber" \ Organization: Passage Systems, Inc. 
Message-ID: \ References: \ <19940820.4480@naggum.no> \ <19940905.4851@naggum.no> \ Subject: Re: HyTime critique (was CONCUR usefulness existence proof) [R A Milowski] | Not to stray too much, but, could either Erik or Eliot (both | preferrably) clarify an issue for me on HyTime: | | Might one call HyTime an "introspective" standard in the sense that it | operates assuming that the SGML document is a resource to be queried? | Whereas, DSSSL operates in a outward fashion using the SGML document as | a starting point. I think this is a reasonable approach, although I'm not sure I precisely understand the implications of this classification. HyTime is certainly "loose" in that it is essentially a framework that provides opportunities to hook things together. The framework has to be flexible enough to allow a wide variety of applications and implementation methods to work together. This means HyTime can't be as rigid or constrained as any given application might be. For example, HyTime allows an application to define what lexical type specification language it wants to use, what query language it wants to use, and what property sets it wants to define, even though the standard also defines its own versions of these. I consider this a strength of HyTime, since it makes it more likely that existing applications will adopt it if they don't have to re-implement things they already provide (for example, if you were writing a Perl application to do HyTime processing, it would be easiest to define Perl regular expressions as your lexical type language, since Perl already provides the functions you need). HyTime is an application architecture, and while its goal is to enable interchange among disparate applications, each different application will have different requirements as to what degree of interchange is needed or is practical. Another aspect of HyTime is that it provides an architecture within which other standardization efforts could be done.
For example, HyTime property sets provide a way to create an interchangeable definition of a set of application-specific properties. This provides an opportunity for application vendors or specific industries to define standard property sets. For example, full-text retrieval vendors could define property sets that enable access to the various full-text properties they provide (e.g., proximity, lexical variation, relevance ranking, etc.), either on their own or as an industry. I could see applications providing property sets with products along with other APIs and drivers they might provide. No single standard can hope to address all these areas directly. Rather, HyTime defines a set of simple but powerful abstraction and indirection mechanisms that can be used to help data and applications interoperate. -- \
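
The point above about reusing a host language's regular expressions as the lexical type language can be illustrated with a minimal sketch. It uses Python's `re` module in place of Perl, and the type names and patterns (`percent`, `frame-range`) are hypothetical; they are not taken from the HyTime standard.

```python
import re

# Hypothetical lexical types: the names and patterns are illustrative
# assumptions, not definitions from the HyTime standard.
LEXICAL_TYPES = {
    "percent": re.compile(r"(?:100|[1-9]?[0-9])"),   # 0 through 100, no leading zero
    "frame-range": re.compile(r"[0-9]+ [0-9]+"),     # two numbers separated by a space
}

def conforms(lextype: str, value: str) -> bool:
    """Return True if the whole value matches the named lexical type."""
    pattern = LEXICAL_TYPES[lextype]
    return pattern.fullmatch(value) is not None
```

An application built this way only has to publish its table of patterns; the matching itself comes from the language runtime, which is the saving the post describes.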

Newsgroups: comp.text.sgml,comp.infosystems.www.misc Date: 08 Sep 1994 19:55:46 UT From: Michael Johnson \ Organization: University of Maine System Message-ID: <94251.155547MICHAEL@MAINE.MAINE.EDU> References: <34l495$8qi@Starbase.NeoSoft.COM> Subject: Re: HTML style: bibliographic elements [Cameron Laird] | I'm looking for guidance on good HTML style in bibliographic citation. | For now, when I'm writing a WWWable text, and I want to refer to a | book, I italicize | | \Book Title\ | | the title; I simply plaintext-quote article titles: | | "Explanation of things", \Journal | of everything\, volume ... | | Is there yet a sense of "common practice" in this domain? What is it? To put in my $0.02 (2\¢) you should at least use the \ and \ tags where you are currently using \ and \, since they are intended to be used for a literary citation. In general, I'd assume that accepted practice in bibliographic format in general (c.g., _The Elements of Style_) is appropriate for HTML docs also. Anywhere you would underline in a typed bibliography (i.e., titles) is an appropriate place to use the \\ tags. -- Michael Johnson, Relay Technology, Inc. michael@maine.maine.edu, michaelj@relay.relay.com "I will choose a path that's clear. I will choose Free Will." -- Neil Peart Newsgroups: comp.text.sgml Date: 08 Sep 1994 22:38:47 UT From: Jeffrey McArthur \ Organization: ATLIS Publishing Message-ID: <34olar$cms@news.delphi.com> Subject: SGML FAQ The Un-Official, Non-Sanctioned Frequently Asked Questions List Version 0.0.0 Date: September 8, 1994 Compiled by Jeffrey McArthur (j_mcarthur@bix.com) Subject: Table of Contents 0. About This Release 0.1 Why is it so late? 0.2 Why is it not in SGML? 0.3 When is the next release? 0.4 Why is the file broken into pieces? 1. General Information 1.1 Notes about the FAQ 1.2 What is Markup? 1.3 What is Tagging? 1.4 What is SGML? 1.5 Why go to all the trouble of using SGML? 1.6 History of SGML 1.7 What is ISO 8879? 1.8 What is a DTD? 
    1.9 What is parsing?
    1.10 What is legacy data?
2. SGML Language Features
    2.1 Elements
    2.2 Attributes
    2.3 Entities
    2.4 Comments
    2.5 Notation
    2.6 Processing Instructions
3. Parsers
    3.1 ARC SGML
    3.2 SGMLS
    3.3 Exoterica SGML Kernel
4. Converting Data To SGML
    4.1 By hand
    4.2 Lex/Flex
    4.3 OmniMark
    4.4 Tagwrite
    4.5 Programming Language of your choice
5. Printing SGML
    5.1 With LaTeX
    5.2 With Plain TeX
    5.3 With Troff
    5.4 Interleaf
    5.5 QuarkXPress
    5.6 Ventura Publisher
    5.7 PageMaker
14. Utilities
    14.1 Author/Editor
15. Publications
    15.1 TAG
15. Vendor Information
16. Consultants
17. SGML Archive.

Subject: Chapter 0. About This Release

Subject: 0.1 Why is it so late and incomplete?

I am no longer working 12-16 hour days. I am only working 11-13 hours a day. I even took two days off over the Labor Day weekend. So the FAQ is late because I have just not had enough strength to work on it. Someday, in the near future, I will have some free time to work on the FAQ. Until then, this is all you get.

Subject: 0.2 Why is it not in SGML?

I am just too lazy to convert it at this time.

Subject: Chapter 1. General Information

Subject: 1.1 Notes about the FAQ

There is no officially maintained FAQ for comp.text.sgml. This is an attempt to answer the most frequently asked question on this newsgroup, "where is the FAQ?". Rather than start a war about who is right or wrong or if there should be a FAQ at all, I decided that it would be in my best interest to provide a skeleton structure for a non-official FAQ. This is only the rough outline of what is to follow, hopefully.

Ideally, the FAQ should be organized as an SGML document. But to start with, this is just an ASCII text file. But looking to the future, what DTD should the FAQ use?

If you want to help with this FAQ, please send any updates or comments to j_mcarthur@bix.com. The only way this FAQ will be developed is with help from others. One word of warning: since I am starting this FAQ, it will reflect my opinions.

Subject: 1.2 What is Markup?

Using a highlighter pen to emphasize passages in a book is "marking up" the book. The highlights show passages that are important to the reading. Underlining is another form of markup. It is not possible to use a highlighter on an electronic document. To implement electronic markup a variety of ideas have been developed.

Subject: 1.3 What is Tagging?

ASCII has become the most commonly used form of information exchange. Almost every word processor has the ability to import and export an ASCII text file. The problem with ASCII text files is that the 95 printing characters of the 7-bit ASCII definition do not provide any information about the structure or the format of the document. Several methods have been developed to specify additional information in an ASCII text file.

FORTRAN used a "C" in the first column of a punched card to specify a comment. This was one of the simplest forms of tagging. All the comment lines were tagged with a "C" in the first column. Pascal allowed comments to be placed almost anywhere. This was done by introducing a start comment sequence, "(*", and an end comment sequence, "*)".

The basic idea is to use a recognizable sequence of characters to define parts of a document. Each special sequence of characters is called a "tag". Below is a list of tags used in some computer languages:

    Language   Start Tag             End Tag
    FORTRAN    C in first column     column 72
    Pascal     (*                    *)
    C          /*                    */
    Basic      REM in first column   end of line

The process of adding special sequences of characters to an electronic document is called tagging. Tagging is a method of "marking up" an electronic document.

Over time, comments in computer languages have changed. One of the more interesting changes is the ability to nest comments. Most of the newer Algol family of languages allow comments to nest. This includes Modula-2, Oberon, and Oberon-2. The ability to nest is important in tagging. This allows using the same notation over and over again.
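
As an illustration of why nesting makes a tag's meaning depend on context, here is a minimal Python sketch that strips Modula-2 style comments by counting how many start tags are currently open. The function name `strip_comments` is hypothetical, and the sketch deliberately ignores string literals and other details a real compiler would handle.

```python
def strip_comments(source: str, start: str = "(*", end: str = "*)") -> str:
    """Remove comments from source, allowing them to nest."""
    depth = 0   # number of start tags still waiting for their end tag
    kept = []   # characters that lie outside every comment
    i = 0
    while i < len(source):
        if source.startswith(start, i):
            depth += 1                # every start tag opens one more level
            i += len(start)
        elif depth > 0 and source.startswith(end, i):
            depth -= 1                # every end tag closes exactly one level
            i += len(end)
        else:
            if depth == 0:
                kept.append(source[i])
            i += 1
    if depth != 0:
        raise ValueError(f"{depth} comment(s) left open")
    return "".join(kept)
```

The same "*)" either ends the comment or merely closes an inner level, depending on the value of `depth` when it is read.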
The meaning of a tag becomes context dependent. In Modula-2 the first "(*" starts the comment. Each following "(*" not only continues the comment but adds the requirement of an additional "*)" to end the comment. For each start comment tag, there is an end comment tag.

Subject: 1.4 What is SGML?

One of the problems with tagging is determining what tags to use. SGML takes the concept of tagging one step further. It creates a method for defining a set of tags. This is why some people refer to SGML as a meta-language. SGML does not define a set of tags. It is a tool to define a set of tags.

But SGML does more than just define the tags. There are tools that will take a tagged electronic document, compare it to the set of defined tags, and see if the document follows the definition. The process of doing this validation is called "parsing". With SGML you can define a "tag grammar", and you can check whether a text conforms to that grammar.

Subject: 1.5 History of SGML

Erik Naggum \ wrote a short history in the officially sanctioned FAQ. If I get his permission I will include it here. Or if anyone would care to write one I will post it here.

Subject: 1.6 Why go to all the trouble of using SGML?

SGML is not as easy to use as WordPerfect or Microsoft Word. Most word processor programs are very easy to use. You just type. Little thought is given to the structure of what is written. Style sheets provide some outline capabilities, but they do not force the document to match the style. SGML can be tyrannical in its enforcement of structure.

The major advantage of SGML is the enforced consistency. Documents must follow the defined structure, or they will not parse. One major advantage of enforcement of structure is consistency. The data must follow a predefined set of rules regarding its structure.

Another possible advantage is separating the "form" of the document from the "content" of the document. With a word processor, you are always aware of the form.
Style sheets do help, but the layout of the document is bound to the data. SGML may help separate the content.

One common misconception is that SGML tags can only define the structure of a document. It is possible to create an SGML document where the tags only describe the form. An example of this is in tables. Many table models only describe the formatting of the table. There is no attempt to represent any structure on the data other than the format.

If SGML is used to separate the form from the content of a document, then it is much easier to create new "forms" from the same data. For example, if a document is written using a word processor it may be very difficult to change all the bold italic listings in the document to bold sans-serif. If the form is completely separate from the content, then the actual format of the document is specified outside of the document itself.

This is the answer to the question of why to use SGML. If the document is to be used only one time, for example a letter to a friend, there is no reason to use SGML. On the other hand, if the letter is to be placed into a system that is searched and/or printed many times in many different ways, then SGML is a major advantage.

Subject: 1.7 What is ISO 8879?

SGML is an ISO standard. ISO 8879 is the definition of SGML. The definitive document is: The SGML Handbook; Oxford University Press, 1990; ISBN 0-19-853737-9; by Charles F. Goldfarb. If you are serious about SGML, this book is a must.

It is very hard to read. Also, for a book that wants to show off the power of SGML, the typesetting is awful. The indexes are almost useless because there is no distinction between a simple reference and a full description (to see a much better computer generated index look at the index to The TeXbook, ISBN 0-201-13447-0 (hard) and ISBN 0-201-13448-9 (soft)).

Subject: 1.8 What is a DTD?

A Document Type Definition (DTD) is an electronic document that defines a tagging structure. A DTD specifies where each tag is allowed.
For example, a novel is made up of a set of chapters. Each chapter is made up of one or more sections. Each section is made up of one or more paragraphs. A DTD contains statements that define this relationship. DTD is the name for a tag grammar.

Subject: 1.9 What is parsing?

Webster's defines parsing as: to break (a sentence) down into parts, explaining the grammatical form, function, and interrelation of each part. This is not exactly what we mean by parsing in SGML.

Parsing in SGML is done via a parser. A parser is a computer program that breaks down an electronic document into its parts and compares the form of the document, based on the SGML tags, to the form described in the DTD. Parsing is a check of conformance of a text to the grammar described in the DTD.

Parsing is what separates SGML from word processing formats. For example, in the case of a novel, this would mean that paragraphs only occur inside of sections, and sections only occur inside chapters. A word processor does not enforce those requirements.

Subject: 1.10 What is legacy data?

Legacy data is a term used by some to refer to data that has not been converted to SGML. The choice of terms is rather unfortunate. It gives the impression that nothing good could have been done prior to SGML.

There are two options in converting legacy data: change the existing data to match the DTD, or change the DTD to allow the structures in the existing data. The question is simple: what should define the DTD, the idealized model for new data or the real-world existing data? As anyone who has done any work in physics realizes, working with real-world data can be a very difficult task.

SGML enforces the structure defined by the DTD. But it is relatively easy to create a DTD that is totally unsuitable for a set of data. It is also possible to create a DTD that is so loose that no structure at all is enforced. Converting existing data generally requires a lot of compromise.
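
The trade-off between a strict and a loose definition can be made concrete with a small Python sketch built on the novel example from 1.9. It is not an SGML parser: the element names, the two rule tables, and the `conforms` helper are all assumptions made for illustration, and the check covers only which element may appear inside which, not how many times.

```python
# Hypothetical element names. The key None stands for the document root.
STRICT = {
    None: {"novel"},
    "novel": {"chapter"},
    "chapter": {"section"},
    "section": {"para"},
    "para": set(),
}
ALL = {"novel", "chapter", "section", "para"}
LOOSE = {parent: ALL for parent in STRICT}   # anything may appear anywhere

def conforms(tree, rules, parent=None):
    """tree is (name, [children]); check each element against its parent."""
    name, children = tree
    if name not in rules[parent]:
        return False
    return all(conforms(child, rules, name) for child in children)

# Existing data with a paragraph placed directly inside a chapter.
legacy = ("novel", [("chapter", [("para", [])])])
clean = ("novel", [("chapter", [("section", [("para", [])])])])

print(conforms(clean, STRICT))    # the idealized structure passes
print(conforms(legacy, STRICT))   # the strict rules reject the legacy data
print(conforms(legacy, LOOSE))    # the loose rules accept it, but enforce nothing
```

Either the legacy tree gains a section element, or the rules for chapter are relaxed to allow paragraphs; those are the two options described above.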
If you have more than a couple of meg of unstructured data and want to convert it to SGML you will end up making massive changes to both the data and the DTD; unless you are very, very lucky.

Subject: 2. SGML Language Features

This section describes the syntax used to define a document type definition. It is a quick overview of SGML and is not a complete description. Also, the following is not exactly correct. There are predefined names for all the parts of each SGML statement. Although needed, the names build a wall to understanding for the novice. One aim of this FAQ is to make SGML easy to understand. So the following discussion will not use the proper names. This section has a few endnotes. They will be represented by parens around a roman numeral.

This section was very hard to write. Is anyone willing to take this section over? It is hard to explain in simple terms the intricacies of SGML declarations.

Subject: 2.1 Elements

Elements are the basic building blocks of an SGML document. Each element defines at least one tag. One of the most common tags is one to define a paragraph. Below is a simple paragraph definition:

\

There are 5 pieces to the tag(i). The first piece is "\. The end tag looks like this: \. The third piece controls when the start and end tags are required. There are four values this piece can have. Below is a table showing what the values are and what they mean:

    - -   Both the start and end tag are required.
    - O   The start tag is required, the end tag is optional.
    O -   The end tag is required, the start tag is optional.
    O O   Both the start and end tag are optional.

The "O" can be either upper or lowercase.

The next piece defines the content of the tag. In the case of the paragraph tag only PCDATA is allowed. PCDATA means parsable character data. The meaning is somewhat complex. But in general this is used to specify that a paragraph can have text (and a few other things) inside it. But no other tag can occur inside a paragraph.
(iv)

The content of a tag is actually a regular expression. Below is a table showing the regular expression operators supported in SGML:

    ?   Zero or one occurrence
    *   Zero or more occurrences
    +   One or more occurrences
    |   or
    &   and. (a & b) means that both <a> and <b> must occur but in any order
    ,   and. (a, b) means that both <a> and <b> must occur but the order must be <a><b>

The definition of section can be defined as:

\

The final piece of an element declaration is the end or the ">".

Now that all the parts of the element declaration have been defined the paragraph tag can be used. Below is a set of paragraphs showing how the tag is used:

\Alex felt the melancholy stealing over him again. Nostalgia? For that germ-infested ball of mud? Not possible. He could barely remember it. Snapshots from childhood; a chaotic montage of memories. He had fallen down the cellar steps once in a childhood home he scarcely recalled. Tumbling, arms flailing, head thumping hard against the concrete floor. He hadn't been hurt; not really. He'd been too small to mass up enough kinetic energy. But he recalled the terror vividly. Now he was a lot bigger, and he would fall a lot farther.\

----------------------------------------------------------------
(i) This is actually a lie, there are more than 5 pieces. As a question to the astute reader, how many pieces are there?
(ii) Another lie.
(iii) Although usually true, this is a lie.
(iv) It is impossible to tell from the paragraph tag what tags are allowed inside. The statement says that nothing at this point is defined inside the paragraph. The content of the paragraph can be changed by exceptions on tags that include a paragraph as part of their content.
----------------------------------------------------------------

Subject: 2.2 Attributes

Each tag can have a set of attributes. Attributes allow additional information to be attached to the tag. The paragraph example above works fine for simple paragraphs. But what about lists?
Lists are little more than a sequence of special paragraphs. Defining a simple list is relatively easy:

\ \

This works fine for simple lists. But there is no way to specify if the list is to be numbered, or bulleted, or whatever. Attributes provide the way to specify the type of list easily.

\

There are six parts to the attribute list (i). The first part is the "\" ends the attribute list.

Attributes are specified as part of the tag. \ would define a bullet list because the type of the list is not specified and the default is bullet. \ would specify a numbered list. \ would specify a dashed list.

There is a wide variety in the types of attributes. The example above is only to give an idea of some of the uses of attributes. A full description would be longer than the entire FAQ.

----------------------------------------------------------------
(i) This is actually a lie, there are more than six pieces. As a question to the astute reader, how many pieces are there?
(ii) Another lie.
(iii) Implied and required are also possible options. Implied means the parser should be able to determine the value from context. Required means the value must be specified.
----------------------------------------------------------------

Subject: 2.3 Entities

Entities are one of the most complex topics in SGML. This is only a very brief overview of what they are. There are two general categories of entities: external and parameter. (i) External entities refer to something outside the current document. Parameter entities are macros used inside of a DTD.

SGML uses parameter entities to define macro replacements in a DTD. For example, in the list example above, the list of types is far from complete. The list of types can get quite long. Also the list may be different from document to document. A parameter entity would make it easier to change the list of types. Below is the example of the same list using a parameter entity.
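
The macro substitution that a parameter entity performs can be mimicked in a few lines of Python. This is only a sketch under stated assumptions: the helper names `declare` and `expand` are hypothetical, the type names `bullet`, `number` and `dash` are guesses based on the list kinds mentioned in 2.2, references must end with ";", and expansion is a single pass. It also shows the SGML rule that only the first definition of an entity is used.

```python
import re

def declare(entities: dict, name: str, text: str) -> None:
    """Record a parameter entity; a later redefinition is ignored."""
    entities.setdefault(name, text)

def expand(entities: dict, declaration: str) -> str:
    """Replace each %name; reference with the entity's replacement text."""
    return re.sub(r"%([A-Za-z][A-Za-z0-9.-]*);",
                  lambda m: entities[m.group(1)],
                  declaration)

entities = {}
declare(entities, "listtypes", "bullet | number | dash")   # assumed type names
declare(entities, "listtypes", "bullet")                   # ignored: first definition wins

print(expand(entities, "type (%listtypes;) bullet"))
# prints: type (bullet | number | dash) bullet
```

Changing the set of list types then means editing one declaration rather than every attribute list that uses it.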

\ \

Parameter entities are similar to tags in that they have a start character and an end character (ii). "%" is used as the start character. ";" is used as the end character. (iii) In the attribute list for the list tag the list of possible types is a macro. In the entity declaration notice the space between the "%" and "listtypes". This space is mandatory for parameter entities.

Entity definitions are somewhat unusual. It is not an error to define an entity several different ways. But only the first definition is used. This is counter to most macro processing computer languages. It is important to remember that the first definition is what counts.

External entities are more complex than the simple macro replacements of parameter entities. The idea is similar. What external entities do is allow a document to refer to an external file or definition.

----------------------------------------------------------------
(i) There is a third category, default entities. These are rare.
(ii) In many instances the end character is optional.
(iii) The start and end can be defined to be other characters.

-- Jeffrey M\\kern-.05em\\raise.5ex\\hbox{\\b c}\\kern-.05emArthur a.k.a. Jeffrey McArthur email: j_mcarthur@bix.com phone: +1 301 210 6655 ATLIS Publishing fax: +1 301 210 4999 12001 Indian Creek Court home: +1 410 290 6935 Beltsville, MD 20705 The opinions express are mine. They do not reflect the opinions of my employer. My access to the Internet is not paid for by my employer. Newsgroups: comp.text.sgml Date: 09 Sep 1994 00:16:06 UT From: Jan Grootenhuis \ Organization: XS4ALL, networking for the masses Message-ID: <34o9g6$ebd@news.xs4all.nl> References: <34kmac$9e2@ruby.ora.com> <1994Sep8.105825.421@ittpub> Subject: Re: US standards publishers? [William D. Lindsey] | Now, who can direct me to the source in The Netherlands? Nederlands Normalisatie-Instituut Kalfjeslaan 2, Postbus 5090 2600 GB DELFT Tel.
(+31) 15 690.188 Newsgroups: comp.text.desktop,soc.culture.china,alt.chinese.text,alt.hypertext,comp.text,comp.text.frame,comp.text.interleaf,comp.text.sgml Date: 09 Sep 1994 02:03:23 UT From: Ron Du \ Organization: Loyola University of Chicago Message-ID: <34ofpb$e6b@apollo.it.luc.edu> Subject: IBM mainframe \