________________________________________________________________ Linguist List, Vol. 2, No. 0251. Monday, 27 May 1991 Subj: 2.0251 Tongue Twisters Total: 103 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Sat, 18 May 1991 17:13:29 EDT From: KIMS@SERVAX.FIU.EDU Subject: RE: Tongue Twisters (2) Date: Sun, 19 May 91 09:36:09 +1000 From: bert peeters Subject: Dutch tongue twisters (3) Date: Thu, 23 May 91 14:45:58 GMT From: hoski@rhi.hi.is (Hoskuldur Thrainsson) Subject: Re: Icelandic Tongue Twisters (4) Date: Wed, 22 May 91 16:01:23 +0100 From: Richard Coates Subject: Tongue twister in Catalan (1) -------------------------------------------------------------------- Date: Sat, 18 May 1991 17:13:29 EDT From: KIMS@SERVAX.FIU.EDU Subject: RE: Tongue Twisters Korean tongue twisters: khongjang kongjang kongjangjang-en kanjang kongjang kongjangjang-ita. seasoned-beans factory factory-manager-TOP soysauce factory manager-DCL As for (the) seasoned-bean factory manager, he is the soysauce factory manager. Jo cholchang-e soe changsal-en sae soe changsal-ita. that steelwindow-POSS iron steelbar-TOP new iron steelbar-DCL The steel bars of that prison (room) are new (steel bars). (2) -------------------------------------------------------------------- Date: Sun, 19 May 91 09:36:09 +1000 From: bert peeters Subject: Dutch tongue twisters The pot calls the kettle black. I'm the pot, so to speak, and Dominique Estival is the kettle. However, the kettle was black before the pot, who turned black just the other day... 
Here is my last Dutch tongue twister again - with the one word that went missing in the last line (and even - horror of horrors or mother of all horrors - in the English translation) Wie niets weet en weet dat hij niets weet The one who doesn't know anything and knows that he doesn't know anything weet veel meer dan iemand knows a lot more than the one die niets weet en niet weet dat hij niets weet who doesn't know anything and doesn't know that he doesn't know anything Bert Peeters (3) -------------------------------------------------------------------- Date: Thu, 23 May 91 14:45:58 GMT From: hoski@rhi.hi.is (Hoskuldur Thrainsson) Subject: Re: Icelandic Tongue Twisters Just a short Icelandic tongue twister with an international relevance: Frank Zappa i svampfrakka Frank Zappa in a sponge-coat Try saying this one ten times in a row. If you succeed you have mastered the famous Icelandic phonological component. Hoski (4) -------------------------------------------------------------------- Date: Wed, 22 May 91 16:01:23 +0100 From: Richard Coates Subject: Tongue twister in Catalan Re your Catalan tongue twister forwarded by Milton Azevedo: I am informed by a Catalanist, Max Wheeler, that this is not so much a tongue twister as a shibboleth designed to identify non-native speakers (specifically Castilian speakers) as it contains phonemes lacking in Castilian (/dz/, /dzh/, /zh/) in close proximity. A paradigm Catalan tongue twister is Plou poc, pero' per lo poc que plou, plou prou "It doesn't rain much, but considering how little it rains, it rains enough" Richard Coates U. of Sussex [End Linguist List, Vol. 2, No. 0251] ________________________________________________________________ Linguist List, Vol. 2, No. 0252. 
Monday, 27 May 1991 Subj: 2.0252 Hyouston Total: 47 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Tue, 21 May 1991 16:12 EDT From: Robert D Hoberman Subject: Houston (2) Date: Wed, 22 May 1991 09:47 CDT From: BERN@ducvax.auburn.edu Subject: Hyouston (1) -------------------------------------------------------------------- Date: Tue, 21 May 1991 16:12 EDT From: Robert D Hoberman Subject: Houston In my native New Jersey speech (metropolitan NY but this was before the r-dropping isogloss crossed the Hudson), there was no /hy-/ whatsoever. I first learned to pronounce it at the age of 22, when I lived together with several other /hy/-less speakers and a poor fellow named Hugh. We got lots of laughs ringing the changes on "Where is /yuw/?", "There /hyuw/ are", etc. It became a matter of practical necessity to acquire the distinction. Nowadays I think I use /hy-/ rarely and sporadically in some words, but I agree with Ellen Prince that good hyumor could still never be the ice cream. Bob Hoberman (2) -------------------------------------------------------------------- Date: Wed, 22 May 1991 09:47 CDT From: BERN@ducvax.auburn.edu Subject: Hyouston The distinction between Hyouston and Youston is one of the features Guy Bailey (Oklahoma State Univ.) and I (Cynthia Bernstein, Auburn Univ.) analyze in our study of Texas phonology based on a Texas Poll survey. Peter Gingiss' estimate is right on the mark: of 910 responding, 78.24% said Hyouston, 10.66% said Youston, 10.77% couldn't be determined, and the rest said something else (Huston). This survey was limited to Texas residents. [End Linguist List, Vol. 2, No. 0252] ________________________________________________________________ Linguist List, Vol. 2, No. 0253. 
Monday, 27 May 1991 Subj: 2.0253 Albanian Dictionary; Taiwanese Names Total: 110 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Mon, 27 MAY 91 19:36 N From: MURZAKU%VAXSNS.INFN.IT%ICINECA.CINECA.IT@munnari.oz Subject: Inverse Dictionary of Albanian (2) Date: Mon, 20 May 91 16:47 U From: HSCHUREN%TWNAS886.BITNET@CUNYVM.CUNY.EDU Subject: resend: Name and Law--Taiwan (1) -------------------------------------------------------------------- Date: Mon, 27 MAY 91 19:36 N From: MURZAKU%VAXSNS.INFN.IT%ICINECA.CINECA.IT@munnari.oz Subject: Inverse Dictionary of Albanian Some people have replied to my last message, which asked whether anybody was interested in the Albanian language. For them, and for anyone else who is interested in Albanian, I am sending this second message: While I was working at the Institute of Linguistics of the Academy of Sciences in Albania, I compiled an Inverse Dictionary of Albanian, and here at the Scuola Normale of Pisa I have revised some of its formal aspects and reprinted it. This dictionary has not been published, so if anybody is interested in it, I will make a copy and send it. The words are taken from "Fjalor i Shqipes se Sotme", Tirana, 1984, which is the latest version of the dictionary of the Albanian language, and the grammatical category or categories of each word are also marked. The cost of reproducing it (180 pp.) and posting it is around 35,000 Italian lire, or $30 USD. 
+-------------------------------------------------------------+
| Aleksander Murzaku        internet: murzaku@vaxsns.infn.it  |
| Scuola Normale Superiore  bitnet: murzaku@ipisnsib          |
| Piazza dei Cavalieri, 7   tel: +39/50/597111                |
| I-56126 PISA              fax: +39/50/563513                |
+-------------------------------------------------------------+
(2) -------------------------------------------------------------------- Date: Mon, 20 May 91 16:47 U From: HSCHUREN%TWNAS886.BITNET@CUNYVM.CUNY.EDU Subject: resend: Name and Law--Taiwan The current situation in Taiwan makes it rather difficult for me to remark on Bill Baxter's note on name-approval procedures in Taiwan. The government has just announced the end of the period of mobilization against communist insurgency (i.e. it has ceased to treat the PRC as 'seditious bandits'). This is another big step towards liberalization and democracy. On the other hand, four independence advocates have just been prosecuted for sedition, based on a law most deemed out of date after the end of the mobilization period. The fact that one of the four prosecuted is a graduate student at Tsing Hua University and that he was arrested on campus stirred widespread protest in Taiwan, especially from students and intellectuals. Now, one can argue from many points of view, such as that it is a bad law but it is a law, or that this is a case where the ruling party is intentionally establishing 'white terror' after having abandoned the mobilization period. Personally, I find neither of the above positions acceptable. The ruling party does have a history of practicing 'white terror' (roughly corresponding to the period of McCarthyism). And its record is nowhere near comparable to that of a true Western democracy. But it is also true that the recent leaders have clearly shown their sincerity about democratizing and liberalizing the old system and that the situation has improved substantially in the past few years. 
There is no doubt that the government is moving too slowly for most democracy-oriented people (who, fortunately, are in the clear majority). But whether it could have moved faster, and what price would have had to be paid for a faster pace, are two hypothetical questions that both sides of the debate can argue over but will most likely fail to answer convincingly. My ambivalence about the 'current' background serves to explain why I will try to stay as close to the facts as possible, without subjective comments. There is a very loosely defined clause in the law allowing registry officials to refuse to register 'indecent' names, such as the 'pig shit' reported by Baxter. But I think this clause is there to protect the young child and to avoid future troubles (since it is quite difficult to change your name, as one might expect; 'indecency' is actually a reason one can use to justify a change-of-name application). Local functionaries might enforce this rule or not, and they often invoke this clause as benevolent intervention. A case in point is that many older Taiwanese women still have the name Wang3shi4, which is difficult to translate but means roughly 'I (the naming parent) don't want a girl, but since this girl is born, I will just raise her, willing or not.' Many have changed their names later, but many others chose to retain their original name for convenience (because deeds are in this name, etc.). The fact is that very few people are likely to choose this name for their daughter again. Although any registry official is likely to invoke the law and disallow this name, I think public censure is the stronger force on this point. As for Japanese-like names, the fact is that Chinese in Taiwan were FORCED to change to Japanese names during the Huan2minghua4 (becoming the emperor's subjects). 
My mother, an honor student in junior high school, believes that she was denied admission to senior high school for several reasons, one of them being her family's failure to convert to Japanese names (she got in a couple of years later, after the surrender of the Japanese). To this day, any naturalized Japanese citizen is required to convert to a Japanese name. A not unlikely reason for the disallowing of a Japanese name in the case Baxter reports may be an overenthusiastic effort to convert Japanese names back to Chinese names immediately after the end of World War II. The fact is that there are still quite a few people with a Japanese-style given name, such as tai4lang2 (Jp. Taro), etc. But all these people have changed back to their original Chinese family name. (Of course, the case of the aborigines is more difficult to describe; their conversion from Japanese names to Chinese names was not automatic. I am glad to hear that many aboriginal activists are now using their names in their own languages to proclaim their ethnic identity.) Chu-Ren Huang, Institute of History and Philology [End Linguist List, Vol. 2, No. 0253] ________________________________________________________________ Linguist List, Vol. 2, No. 0254. 
Tuesday, 28 May 1991 Subj: 2.0254 Job and ASL Literature Conference Total: 109 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Tue, 28 May 91 18:23:41 EDT From: martha ratliff Subject: syntax job at wayne state university (2) Date: Tue, 28 May 1991 09:30 EST From: Karen Christie Subject: NATIONAL AMERICAN SIGN LANGUAGE LITERATURE CONFERENCE (1) -------------------------------------------------------------------- Date: Tue, 28 May 91 18:23:41 EDT From: martha ratliff Subject: syntax job at wayne state university The Wayne State University English Department and Linguistics Program anticipate summer recruitment for a one-year replacement position at the rank of Assistant or Associate Professor beginning September, 1991. We expect to seek applicants whose research specialty is syntactic theory or description, with related interests in one or more of the following areas: semantics, morphology, discourse/pragmatics, or language acquisition. Teaching responsibilities would include directing M.A. student essays and teaching courses at the undergraduate and graduate levels. Salary range is expected to be $24,000 to $40,000, depending on experience. Applicants should send cv, samples of work, and three letters of reference to: Dr. Lesley Brill, Chair, Department of English, 51 W. Warren Ave., Wayne State University, Detroit MI 48202. When the position is authorized, applications will be reviewed as received until position is filled. 
Applicants who wish to check on the status of the position authorization may contact Lesley Brill at one of the following addresses - lbrill@waynest1.bitnet or lbrill@cms.cc.wayne.edu (2) -------------------------------------------------------------------- Date: Tue, 28 May 1991 09:30 EST From: Karen Christie Subject: NATIONAL AMERICAN SIGN LANGUAGE LITERATURE CONFERENCE CALL FOR PRESENTATIONS NATIONAL AMERICAN SIGN LANGUAGE LITERATURE CONFERENCE OCTOBER 10-13, 1991 National Technical Institute for the Deaf At Rochester Institute of Technology Rochester, New York This conference brings together artists, scholars, and educators with the goal of deepening our understanding and appreciation of American Sign Language Literature. In addition, the conference will explore how ASL literature can be made an integral part of the curricula in programs serving Deaf students and students of ASL as a foreign language. Evening performances by nationally known artists will be complemented the following day by artist-led workshops in which a wide variety of techniques used in ASL literature will be discussed. The conference also serves as a catalyst for dialogue among educators on how ASL literature may be analyzed and taught. This conference provides a forum for artists, scholars, and educators to exchange ideas and to encourage growth and respect for ASL literature. Abstracts for presentations that address issues related to ASL literature and education are welcome. Original ASL literature is the focus of this conference. Presentations will be limited to 20 minutes. 
Possible Topics: Topics include, but are not limited to: *Literary analysis of different genres of ASL literature *Process of creating ASL literature *Sociolinguistic and political issues in promoting ASL literature in schools and colleges *Comparison of cultures with oral (unwritten) traditions *Analysis of literary traditions in cultures that have strong oral (unwritten) histories *Folklore and its place in understanding the literary traditions of a culture *Process for including ASL literature in school and college curricula *Course descriptions for ASL literature courses *Instructional materials for ASL literature courses. Submission Procedures: Please submit a letter of interest along with: A. Two copies of a 300-500 word abstract of your presentation OR B. One 10-minute VHS cassette videotape labeled with the presenter's name and title of the presentation Deadline: June 15, 1991 Notification: July 15, 1991 Send to: Rochester Institute of Technology National Technical Institute for the Deaf Dr. Laurie Brewer Department of Liberal Arts Hugh L. Carey Building PO Box 9887 Rochester, NY 14623-0887 (716) 475-6287 (Voice/TDD) This conference is sponsored by: Flying Words Project: New York State Council of the Arts National Technical Institute for the Deaf [End Linguist List, Vol. 2, No. 0254] ________________________________________________________________ Linguist List, Vol. 2, No. 0255. 
Wednesday, 29 May 1991 Subj: 2.0255 The Survival of Immigrant Languages Total: 170 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Tue, 28 May 91 09:59 MET From: "Norval Smith (UVAALF::NSMITH)" Subject: RE: The Survival of African Languages among Slaves (2) Date: Tue, 28 May 91 09:13:36 -0500 From: louden@ix1.cc.utexas.edu (mark l louden) Subject: Re: The Survival of African Languages among Slaves (3) Date: Tue, 28 May 1991 09:56 MST From: KAMPRATH@CC.UTAH.EDU Subject: Re: The Survival of African Languages among Slaves (4) Date: Tue, 28 May 1991 14:49 EDT From: PEARSON2@umiami.IR.Miami.EDU Subject: Re: The Survival of African Languages among Slaves (1) -------------------------------------------------------------------- Date: Tue, 28 May 91 09:59 MET From: "Norval Smith (UVAALF::NSMITH)" Subject: RE: The Survival of African Languages among Slaves I'm not now clear whether Bill Eldridge's question relates to the US or has wider scope. In any case as far as the Atlantic creole languages are concerned the only form in which African languages have survived is as ritual languages. African religions, or rather African-based religions have survived in various places - think of Voodoo in Haiti. These ritual languages do not really qualify for the term language as such, being at least vastly reduced. Probably none of them allows for much more than the repetition of religious formulas. Some are strictly comparable to technical vocabularies. The problem is that it is difficult to study them although there is at least some literature. Examples are: Lucumi (Cuba/Yoruba) Kromanti (Jamaica/Twi) Kromanti/Koomanti (various groups in Surinam/Twi) Papa (various groups in Surinam/Gbe) Pumbu (Saramaccan - Surinam/Kikongo) Efi (Cuba (reported)/Efik) There is a good supply of thesis topics in this area! 
Another sense in which African languages have survived is found in those cases in which significant numbers of African loanwords have survived in Creole languages, indicating that these languages must have been spoken in their new homelands by at least several generations of slaves. Examples are Kikongo (Saramaccan), Kimbundu (Angolar) and the most curious case of all - Eastern Ijo (Berbice Dutch) which could well be described as a mixed language, and therefore half a survival. Some bibliographic references: Daeleman, J. (1972) Kongo elements in Saramacca Tongo. JAL 11, 1-44. Huttar, GL (1985) Sources of Ndjuka African vocabulary. NWIG 59, 45-71. Price, R. (1975) Kikoongo and Saramaccan: a reappraisal. BTLV 131, 401-478. Smith, NSH, IE Robertson, K Williamson. (1987) The Ijo element in Berbice Dutch. Language in Society. To Bill Eldridge In one sense talking about survival of African languages brings up the interminable debate among creolists between substratists (or substratomaniacs) and universalists. If you're a strict substratist (to use the kinder term) then you will of course believe that some African language - or common denominator of (a set of) African languages - has survived. It's just the vocabulary that has been replaced with lexical items from the relevant colonial language. However I refer you to the extensive literature reflecting this debate - this is easily accessible! Norval Smith (2) -------------------------------------------------------------------- Date: Tue, 28 May 91 09:13:36 -0500 From: louden@ix1.cc.utexas.edu (mark l louden) Subject: Re: The Survival of African Languages among Slaves In reference to Margaret Fleck's response to Elise E. Morse-Gagne's remarks about the maintenance of immigrant languages in the US, Margaret is absolutely correct in pointing out the Old Order Amish as a good example of lg. maint. beyond the third generation. The Penn. 
German speaking community, of which the Amish and other conservative Anabaptist sectarian groups were originally a small minority, has maintained PG, a colonial dialect of German, since its formation (sometime around 1775-1800). The dialect is dying out among non-sectarian speakers (youngest @ 60 yrs. old), but the outlook for maintenance among the Amish and related groups is excellent. Also, there is a small sub-sect within the OOAmish living mainly in So. Indiana whose ancestors came directly over from Switzerland in the mid-19th century. They are generally recognized as the last Amish to exist as a distinctly Amish group in Europe. To this day, these "Schweizer" (so referred to by other Amish) speak Swiss German, with some being able to converse in both PG and Swiss G. Other immigrant groups with lg. maintenance beyond the third generation would be Cajun French speakers in La., as well as the Isleno Spanish speakers in the same state; also Spanish in No. New Mexico, German here in central Texas, German among the Hutterites in the Midwest and central provinces in Canada, and Dutch in New York and New Jersey from the colonial era to the early part of this century. Best wishes, Mark Louden (3) -------------------------------------------------------------------- Date: Tue, 28 May 1991 09:56 MST From: KAMPRATH@CC.UTAH.EDU Subject: Re: The Survival of African Languages among Slaves WRT "immigrant groups which have maintained their original languages for as much as three or four generations" (Elise Emerson Morse-Gagne), there are plenty of them, and not limited to Amish communities (Margaret Fleck's response, 2.0252). Even without having made a search for them, much less a study of them, I have run into them in several places. All those that I know of are in farming communities. 
My parents are both fourth generation German and learned German (Plattdeutsch for my mom in northern Missouri and Schwabisch for my dad in southeastern Nebraska--and the Kamprath Ancestor settled in Ida, Michigan) as their first language at home on the farm. I think it's significant that their major social community was the local Lutheran congregation (Martin Luther, of course, was pretty German, and so is Lutheranism), in which German (some sort of "Hochdeutsch") was the language of the church service; the hymns, sermons, Bible, liturgy, catechism, everything was in German. They learned English when they went to grade school (and I guess they brought it home; I don't know where they picked it up, but my grandparents spoke English around me when we came to visit, tho' Grampa Martens said Komm Herr Jesu... for Come Lord Jesus... as a pre-meal prayer). My parents didn't speak German at home when I was growing up because they spoke different varieties, but our church did have German church services (German was at 8am and we went to the English service 10:30, conducted by the same Pastor Ostermann), not in a farm community, but in small-town central Minnesota--this is 35 years ago. There are also well-known communities in Texas where "Texas-Deutsch" is spoken (in Schulenburg, written "Kirche" is not recognized as the same thing as their spoken [kex@], but [kirx@] is understood in context; the T-D word for "fence" is [fEns], that sort of thing). Schulenburg is also half Czech Catholic; friends of mine of both these persuasions, now about 40 years old, learned these languages at home on the farm from their third-generation parents and heard them in church and on the street. It's not just Schulenburg, of course. Surely everyone's heard of Fredericksburg and the New Braunfels Wurstfest. :-) Amish and Pennsylvania "Dutch" communities are better known, but they're not unique. 
Christine Kamprath (4) -------------------------------------------------------------------- Date: Tue, 28 May 1991 14:49 EDT From: PEARSON2@umiami.IR.Miami.EDU Subject: Re: The Survival of African Languages among Slaves Is John Singler a subscriber on the Linguist Net? To my knowledge, he is the expert on this subject, having devoted most of his career to creole genesis questions. I sat in on the first week or so of his class on West African Languages at the 1986 Institute (New York) and remember his reporting on original documents of slaving vessels, of large groups of slaves from single language areas being sold in blocks, etc. I will continue to hunt for my notes, but can someone nudge John to write a posting? In the meantime, those interested in the question should definitely search for his publications on creole genesis and the history of slaving practices. Rebecca Burns Hoffman [End Linguist List, Vol. 2, No. 0255] ________________________________________________________________ Linguist List, Vol. 2, No. 0256. Wednesday, 29 May 1991 Subj: 2.0256 Notice to Subscribers: LINGUIST moves on June 1 Total: 80 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Wednesday, May 29 1991 From: The LINGUIST Moderators Subject: Notice to all Subscribers (1) -------------------------------------------------------------------- Date: Wednesday, May 29 1991 From: The LINGUIST Moderators Subject: LINGUIST Moves on June 1 Some of you have been receiving test messages from LINGUIST's new home, which is the Texas A&M Listserv. You may safely ignore these messages: most of them were designed solely to verify pathnames, and no acknowledgement is necessary. We apologize for sending them: we should not need to do so in the future. This weekend, LINGUIST will finally move from the University of Western Australia to Texas A&M. 
From Saturday, June 1, 1991 all LINGUIST messages will cease to originate from LINGUIST@UNIWA.UWA.OZ.AU, and will instead come from LINGUIST@TAMVM1.TAMU.EDU This change in host will entail some corresponding changes in procedure. Although LINGUIST will not move until Saturday, we suggest that these changes be put into operation IMMEDIATELY. The following are the most important: 1. Direct all messages which are for GENERAL distribution to the list to: LINGUIST@TAMVM1.TAMU.EDU (Internet) LINGUIST@TAMVM1 (Bitnet) 2. We ask that all members verify their pathnames and personal names by sending the message: SUBSCRIBE LINGUIST e.g. "subscribe linguist Jane Smith" to the address: LISTSERV@TAMVM1.TAMU.EDU (Internet) or: LISTSERV@TAMVM1 (Bitnet) Though this verification is not absolutely necessary--most of you will continue to receive mail even if you do nothing--we urge all of you to do so nevertheless. The list is moving from Australia to the United States, and it is inevitable that some of the path-names valid in Australia will cease to be functional. In addition, mail directed to you will NOT HAVE YOUR CORRECT PERSONAL NAME on it until you re-register. If you cease to receive any mail after June 1, re-registering will probably solve your problem. Please direct all requests for removal from the list to LISTSERV@TAMVM1 as well. 3. Note that for the present files are being kept at the University of Western Australia. Therefore, continue to direct all requests for FILES to: LISTSERV@UNIWA.UWA.OZ.AU 4. Also continue to direct ALL OTHER enquiries, comments, problems, etc to: LINGUIST-EDITORS@UNIWA.UWA.OZ.AU We hope that this move can be made without over-great upheaval. But we ask a degree of patience with the problems which, in a world of incompatible software, will inevitably occur. [End Linguist List, Vol. 2, No. 0256] ________________________________________________________________ Linguist List, Vol. 2, No. 0257. 
Thursday, 30 May 1991 Subj: 2.0257 Phonology and Orthography Total: 94 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Tue, 28 May 91 11:11:26 BST From: Margaret Fleck Subject: Phonology and Orthography (2) Date: Tue, 28 May 91 13:42:55 PDT From: rwojcik@atc.boeing.com Subject: Re: Phonology and Orthography (3) Date: Wed, 29 May 91 07:58 CDT From: Harriet ottenheimer Subject: Re: Phonology and Orthography (Part 2) (1) -------------------------------------------------------------------- Date: Tue, 28 May 91 11:11:26 BST From: Margaret Fleck Subject: Phonology and Orthography The number of symbolic distinctions (bits of information) needed to encode a set of CV moras using mora-symbols may be less than if alphabetic symbols are used. --John Coleman Could you provide some concrete examples of languages in which this is the case? Perhaps with brief descriptions of their sets of phonemes and possible syllables? Margaret (2) -------------------------------------------------------------------- Date: Tue, 28 May 91 13:42:55 PDT From: rwojcik@atc.boeing.com Subject: Re: Phonology and Orthography I agree with Margaret Fleck's comments on the alphabetic status of orthographies and would add that a more appropriate term for what some call "syllabaries" would be "prosodically-organized alphabet". If a writing system requires that its users call on segmental information in order to create symbols--i.e. that the symbols are decomposable into segmental symbols--then it is an alphabet. Undecomposable symbols, e.g. Japanese kana, are necessary for a system to qualify as a true "syllabary". Alphabetic writing did not start with the Greeks, but it achieved its greatest refinement from them. Finally, I want to clarify my comment that alphabet construction was a "practical matter". This was in connection with the question of just what it means to say that alphabetic symbols correspond to phonemes. 
I believe that it is quite legitimate to view linguistic systems as either psychological or social in nature. Baudouin happened to emphasize the former view, and Saussure the latter. Alphabetic systems are social conventions, and it is probably best to characterize them in terms of an "ideal" phonemic inventory for the language, always bearing in mind that the ideal may not reflect everyone's (or even anyone's) psychological system. And they may carry baggage that speaks to other issues besides phonology. A case in point would be the two major conventions for Breton writing-- so-called "university orthography" and "BZH". The BZH system is preferred by nationalists because, among other things, it places a 'zh' symbol where three dialects have /z/ (kerne, leon, tregor or "KLT" dialects) and the other major dialect has /h/ (vannes dialect). The university orthography has just the letter "z", hence causing, in the minds of many nationalists, an unfair advantage for the KLT speakers. So the name of the country is either "Breizh" or "Breiz", and you speak 'brezhoneg' or 'brezoneg', depending on your politics. (BTW, Breton is a final-devoicing language, and the adjective is often spelled "brezhonek", because its final consonant does not alternate with [g] in internal sandhi. This corresponds with Baudouin's original view of the speaker-based level of phonemic abstraction.) -Rick Wojcik (rwojcik@atc.boeing.com) (3) -------------------------------------------------------------------- Date: Wed, 29 May 91 07:58 CDT From: Harriet ottenheimer Subject: Re: Phonology and Orthography (Part 2) In response to Margaret Fleck's inquiry: > I would be surprized to find (does anyone know?) the same > wholesale omission of vowel indications in non-Semitic languages that > have borrowed e.g. the Arabic alphabet. ShiNzwani (Comoro Islands) is a Bantu language which has borrowed the Arabic alphabet and regularly omits vowel indications. Harriet Ottenheimer [End Linguist List, Vol. 2, No. 
0257] ________________________________________________________________ Linguist List, Vol. 2, No. 0258. Thursday, 30 May 1991 Subj: 2.0258 Tongue Twisters Total: 122 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) From: Dr M Sebba Date: Tue, 28 May 91 13:24:40 +0100 Subject: Re: Czech and Zulu Tongue Twisters (2) Date: Tue, 28 May 91 12:58:57 EDT From: cblee@unagi.cis.upenn.edu (ChangBong Lee) Subject: Re: Korean Tongue Twisters (3) Date: Tue, 28 May 91 21:27:33 -0500 From: raskin@j.cc.purdue.edu (Victor Raskin) Subject: Russian Tongue Twister (4) Date: 29 May 91 19:43:00 EST From: "ELISE EMERSON MORSE-GAGNE" Subject: long tongue twister (1) -------------------------------------------------------------------- From: Dr M Sebba Date: Tue, 28 May 91 13:24:40 +0100 Subject: Re: Czech and Zulu Tongue Twisters A version of a Czech tongue-twister which has already appeared on the list (taught to me by a Czech; I don't know Czech myself, so I have to guess where the diacritics are. The r's should all have a wedge (hacek) on top. I may have misspelt some words.): Trista tri a' tridsat stribrnych krepelek Preletelo pres trista tri a' tridsat stribrnych strech (333 silver swallows flew over 333 silver roofs). A Zulu tongue-twister: I don't know whether this is a folk one, or was made up by linguists at the University of the Witwatersrand, Johannesburg (it comes from a handout dated 1972): Ingqeqebulane yaqaqela uqhoqhoqho, uqhoqhoqho waqaqela iqaqa, iqaqa laqalaza. (The expert talker loosened up for the trachea, the trachea loosened up for the polecat, and the polecat looked around in amazement.) Another: Amaxoxo ayaxokozela exoxa ngoxamu exhibeni. The frogs are talking loudly about the monitor lizard. Q is a palatal click, x is a lateral click. 
(2) -------------------------------------------------------------------- Date: Tue, 28 May 91 12:58:57 EDT From: cblee@unagi.cis.upenn.edu (ChangBong Lee) Subject: Re: Korean Tongue Twisters Here is one more interesting example in Korean. choki kirin kurim-i amkirin kurin kurimi-nya sutkirin kurin kurim-nya? there giraffe picture-NOM female giraffe drawing picture-QR male giraffe drawing picture-QR "Is that giraffe picture there a picture drawing a female giraffe or a male giraffe?" NOTE: NOM: nominative, QR: Question Particle Also, fellow Korean linguists, don't you have some problems in pronouncing the following phrase? "Choongangchong cholchangsal" meaning "steel bar of the Central (government) building" (3) -------------------------------------------------------------------- Date: Tue, 28 May 91 21:27:33 -0500 From: raskin@j.cc.purdue.edu (Victor Raskin) Subject: Russian Tongue Twister Is it time for a Russian one: Shla Sasha po shosse i sosala sushku. Walked Alexandra along highway and sucked dried bagel. Alexandra was walking along the highway and sucking on a dried bagel. Victor Raskin raskin@j.cc.purdue.edu (4) -------------------------------------------------------------------- Date: 29 May 91 19:43:00 EST From: "ELISE EMERSON MORSE-GAGNE" Subject: long tongue twister I'm no whiz in modern German dialects, so someone might want to fix this one up after I've massacred it. And identify the dialect! Heut' kommt der Hans zu mir; aber ob er ueber Ueber-Ammergau, oder ob er unter Ueber-Ammergau, oder ob er ueberhaupt nicht kommt, ist nicht gewisst. Today Hans is coming to (see) me; but if he over Over-Ammergau, or if he under Over-Ammergau, or if he absolutely not comes, isn't known. I think I left out a (not very tongue-twisty) line after the first one, and I'm a little surprised at the way "kommt" in the 4th line of my version seems to have to do duty for the previous two lines...anyone out there know the real thing? --Elise Morse-Gagne' [End Linguist List, Vol. 
2, No. 0258] ________________________________________________________________ Linguist List, Vol. 2, No. 0259. Thursday, 30 May 1991 Subj: 2.0259 Query; Responses on Pseudo-Obliques, Himself, Acronyms Total: 101 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Tue, 28 May 91 19:20 GMT From: David E Newton Subject: Hindi Loanwords (2) Date: Tue, 28 May 91 10:07:29 EDT From: Lesli LaRocco Subject: PSEUDO-OBLIQUE SUBJECTS (3) Date: Tue, 28 May 91 16:14 GMT From: FEHN23%UJVAX.ULSTER.AC.UK@CUNYVM.CUNY.EDU Subject: Herself/himself in Hiberno-English (4) Date: Tue, 28 May 91 09:15 EST From: Herb Stahlke <00HFSTAHLKE%BSUVAX1.BITNET@UICVM.uic.edu> Subject: RE: Acronyms (1) -------------------------------------------------------------------- Date: Tue, 28 May 91 19:20 GMT From: David E Newton Subject: Hindi Loanwords I am intending to do some work on the borrowing of Hindi words into the English language. This would be done in a historical perspective, and taken specifically from the viewpoint of the history of the English language, though some work on the borrowing of English words into Hindi might also be relevant. If anyone can give me any information on this topic, or knows of any useful reading matter on the subject, I would be very grateful. Thanks! David E Newton Department of Language and Linguistic Science University of York Heslington York YO1 5DD den1@uk.ac.york.vaxa den1@uk.ac.york.vaxb den1@uk.ac.york.worda den1@uk.ac.york.wordb (2) -------------------------------------------------------------------- Date: Tue, 28 May 91 10:07:29 EDT From: Lesli LaRocco Subject: PSEUDO-OBLIQUE SUBJECTS For Rick Wojcik: You might want to have a look at an article called "Prepositional Quantifiers and the Direct Case Condition in Russian," by Leonard Babby in a volume called _Issues in Russian Morphosyntax_ MS Flier and D Brecht, eds, Slavica. It seems to deal with a similar question in Russian. 
Lesli LaRocco (3) -------------------------------------------------------------------- Date: Tue, 28 May 91 16:14 GMT From: FEHN23%UJVAX.ULSTER.AC.UK@CUNYVM.CUNY.EDU Subject: Herself/himself in Hiberno-English Hiberno-English, at least in my (Northern) variety and the others I am aware of, uses non-reflexive 'herself' and 'himself', not as substitutes for 'she herself' and 'he himself', but meaning 'the woman/man of the house' or 'the boss'. Thus a sentence like: John said that himself wrote it cannot mean John said that he himself wrote it but only John said that the boss/the man of the house wrote it Thus 'herself/himself' are not reduced forms of pronoun plus emphatic. As regards the history of the development of the use of -self forms in this way, I'm afraid I can't help. Alison Henry (4) -------------------------------------------------------------------- Date: Tue, 28 May 91 09:15 EST From: Herb Stahlke <00HFSTAHLKE%BSUVAX1.BITNET@UICVM.uic.edu> Subject: RE: Acronyms RE: Bob Hoberman's note on Jewish acronymic names: The acronymic name "Schub" also looks like a German transliteration of the Hebrew root meaning "return." Is this significant or accidental, and do other acronymic names also have meaning in Hebrew? Is this a factor in the selection of acronyms as names? Herb Stahlke Ball State University [End Linguist List, Vol. 2, No. 0259] ________________________________________________________________ Linguist List, Vol. 2, No. 0260. 
Thursday, 30 May 1991 Subj: 2.0260 Pronoun Doubling and Hyouston Total: 93 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Tue, 28 May 91 18:20:26 +1000 From: bert peeters Subject: Pronoun doubling in Flemish dialects (2) Date: Tue, 28 May 91 11:10:40 +0200 From: Guido Vanden Wyngaerd Subject: Pronoun Doubling (3) Date: Tue, 28 May 91 11:46 EST From: NMILLER@vax1.trincoll.edu Subject: Re: Hyouston (4) Date: Tue, 28 May 91 10:20 MET From: "Norval Smith (UVAALF::NSMITH)" Subject: RE: Hyouston (1) -------------------------------------------------------------------- Date: Tue, 28 May 91 18:20:26 +1000 From: bert peeters Subject: Pronoun doubling in Flemish dialects Dirk Geeraerts questions my hypothesis that clitic/unaccented 'de' in 'hedde gij' is a reduction of a pronoun formally identical to the one that follows. He quotes diachronical evidence to support his argument. Now, this is of course a legitimate approach. But what is it that we are interested in? Do we want to propose analyses that are historically correct or do we want to emit hypotheses that make sense to the speaker of a Flemish dialect anno 1991? I was trying to do the latter, and I do believe we should not mix synchrony and diachrony in this kind of matters. Bert Peeters (2) -------------------------------------------------------------------- Date: Tue, 28 May 91 11:10:40 +0200 From: Guido Vanden Wyngaerd Subject: Pronoun Doubling Some more references for those interested in pronoun doubling in Flemish dialects: 1. Bennis, H. and L. Haegeman (1984) "On the status of agreement and relative clauses in West-Flemish", in W. de Geest and Y. Putseys, eds., Sentential Complementation, Foris, Dordrecht. 2. Haegeman, L. (1990) "Subject Pronouns and Subject Clitics in West- Flemish," The Linguistic Review, 7, 333-363. 3. W. de Geest (1990) "Universele Grammatica op de Gentse toer," Taal en Tongval XLII, 108-124. 
(1) and (2) also contain some discussion of complementizer agreement phenomena. -Guido Vanden Wyngaerd (3) -------------------------------------------------------------------- Date: Tue, 28 May 91 11:46 EST From: NMILLER@vax1.trincoll.edu Subject: Re: Hyouston Dare a non-linguist make a suggestion to a linguist on the _reporting_ of data? He dares. I assure Cynthia Bernstein, word of honor, that for a sample of 910 (or 91000 as far as NON-statistical significance goes) rounding the percentages to a whole number (78%, 11%, 11%) would only add to the value of what's being communicated. Norman Miller (4) -------------------------------------------------------------------- Date: Tue, 28 May 91 10:20 MET From: "Norval Smith (UVAALF::NSMITH)" Subject: RE: Hyouston Houston The funny thing is that the place that is ultimately the source of the name Houston - whether the place in Texas gets its name from Sam Houston or not - Houston in Renfrewshire, Scotland, is pronounced [hust@n]. It is a Scots name, hoose (i.e. "house") + toon > t@n (i.e. "town" (actually rather "settlement")). Norval Smith [End Linguist List, Vol. 2, No. 0260] ________________________________________________________________ Linguist List, Vol. 2, No. 0261. 
Thursday, 30 May 1991 Subj: 2.0261 The Survival of Immigrant Languages Total: 118 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Wed, 29 May 91 15:47 EST From: Herb Stahlke <00HFSTAHLKE%BSUVAX1.BITNET@UICVM.uic.edu> Subject: RE: The Survival of Immigrant Languages (2) Date: 29 May 91 19:23:00 EST From: "ELISE EMERSON MORSE-GAGNE" Subject: survival of immigrant languages (3) Date: 29 May 91 21:11:00 EST From: "ELISE EMERSON MORSE-GAGNE" Subject: survival of immigrant lgs again (1) -------------------------------------------------------------------- Date: Wed, 29 May 91 15:47 EST From: Herb Stahlke <00HFSTAHLKE%BSUVAX1.BITNET@UICVM.uic.edu> Subject: RE: The Survival of Immigrant Languages I can confirm some of what Christine Kamprath said about the survival of German in Lutheran communities and also shed some personal insight into how these communities lose their first language. I grew up in a Lutheran parsonage in a small town south of Detroit called Waltz--five miles west of Flat Rock. My father regularly preached in German, having grown up in a small town in central Minnesota. Our table prayers and some of our bedtime prayers were in German. My elder siblings all have a near-native command of German, but since I was born in 1942 I have school German. Regularly during World War II, FBI agents would come down from Detroit to listen to my father's German sermons, and about the time I was born my parents decided to stop speaking German at home, hence the difference in proficiency between my siblings and me. There are also some Stahlke's in Ontario, along the route that Ur-Stahlke took in migrating from Danzig to central Minnesota, but one Ontario branch of the family changed the spelling to Stalkie during World War I on the advice of provincial officials. A good bit of German is still spoken north of Detroit in the Frankenmuth and Frankentrost areas. 
Incidentally, we considered ourselves really rather enlightened in Waltz. There was one Catholic family in the town of 230 people, of German-Russian descent like most of the town, and we were allowed to play with their children. Herb Stahlke Ball State University (2) -------------------------------------------------------------------- Date: 29 May 91 19:23:00 EST From: "ELISE EMERSON MORSE-GAGNE" Subject: survival of immigrant languages Thank you, Margaret. Yes, the Amish communities are a good counter-example to my generalization. It might be very interesting to compare the (language-survival) results of the Amish/Mennonite _communitas_ to the enforced segregation Bill Eldridge is talking about for the slaves. But if I remember correctly, the slaves were not all equally segregated from native English-speakers. Field workers were far more so than house servants, particularly-- I should imagine--the small children of house servants, who may have played with the small children of the owners. This difference between levels of exposure to English is discussed in Edgar Schneider, _American Earlier Black English_, p. 262-67. It seems to me that this issue verges on the one raised in his query by Mark Louden, who points out that many people assume that all (black/Amish/...) individuals speak (BVE/"Amish-style" English/...). It's not true now, and it apparently wasn't entirely true even in the slavery era. I'm not saying that Bill Eldridge is falling into that trap, I'm just trying to say that to my mind, given the different languages they started with, the diversity of their cultural backgrounds, and the fact that a certain number of slaves were in close contact with (all sorts of!) white English speakers, it is not SURPRISING IF THEY LEARNED ENGLISH AND DID NOT RETAIN A DISTINCT LANGUAGE OF THEIR OWN. SORRY ABOUT ALL THIS EMPHASIS, BUT MY KEYBOARD SUDDENLY REFUSED TO DO LOWERCASE LETTERS. JUST GIVES ME gobbledygook GOBBLEDYGOOK. 
--ELISE MORSE-GAGNE (3) -------------------------------------------------------------------- Date: 29 May 91 21:11:00 EST From: "ELISE EMERSON MORSE-GAGNE" Subject: survival of immigrant lgs again I should have finished reading my backlog of mail before sending my last message--cardinal sin! Thanks to everybody who has made it impossible for me to say again that "I don't know of immigrant groups who have maintained their lgs past 3 or 4 generations". What I meant to convey--and obviously I put it much too strongly--was that surely it is not extraordinary that African languages have not survived in this country (that is, in the sense that Pennsylvania German, etc., has), considering the many immigrant groups which no longer retain their languages. There seem to be three major groups being discussed here. (1) is the African slaves, and it would be wonderful if John Singler or John Rickford or other specialists in the area of creoles and Black English origins would contribute to this. (2) are such communities as the German-speaking or Swiss-German-speaking Amish and non-Amish farming communities. I seem to recall a mention of Spanish-speakers in California? who retain old European Spanish traits; I wonder if they also fall into this category. Are they rural, agricultural communities with a strong shared religious tradition? (3) are waves of immigrants who have been more or less dispersed and assimilated, with no speakers of the original language after the 3rd generation or so in this country. (1) and (3) apparently share language loss (and influence on at least some varieties of American English). Whether for the same reasons, or not, I certainly can't say. As I remember, this was originally a branch of the Banned Languages discussion. 
My understanding of Bill Eldridge's first message on the subject (I should say that I think my computer scrambled a recent message from Bill, or someone with the uppercase initials B E, so forgive me if I am missing something) is that he wondered whether a ban on African lgs was in part responsible for their non-survival, but found no records of such a ban. Or something like that. It would certainly be interesting to compare local attitudes towards the lgs. of groups (1), (2), and (3)--and the degree to which the communities in question maintain some imperviousness to local attitudes. Can someone like Mark Louden say whether there has ever been restrictive legislation aimed at Pennsylvania German, for instance? If not, why not? My apologies for the length of this message. --Elise Morse-Gagne [End Linguist List, Vol. 2, No. 0261] ________________________________________________________________ Linguist List, Vol. 2, No. 0262. Thursday, 30 May 1991 Subj: 2.0262 FYI Total: 167 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Wed, 29 May 1991 08:41 EDT From: PEARSON2@umiami.IR.Miami.EDU Subject: Re: Bilingual Jurors (2) Date: 29 May 1991 09:35:40 CDT From: Subject: Association for the history of language sciences (3) Date: Wed, 29 May 91 11:04:28 EDT From: Derek Gross Subject: Abstracts from sci.lang (1) -------------------------------------------------------------------- Date: Wed, 29 May 1991 08:41 EDT From: PEARSON2@umiami.IR.Miami.EDU Subject: Re: Bilingual Jurors HIGH COURT CITES BASIS TO REJECT BILINGUAL JURORS is on the front page of the Miami Herald today. I imagine everyone has gotten some review of the Supreme Court decision in today's paper, but I thought I'd send the following comment by David Waksman, an assistant state attorney in Miami: He is cited as reporting that in state court, translation problems are dealt with on the spot. 
"Our judges for years have told Spanish-speaking jurors that 'if you disagree with the translation, raise your hand, call it to my attention and we'll deal with it then.'" How reasonable. (2) -------------------------------------------------------------------- Date: 29 May 1991 09:35:40 CDT From: Subject: Association for the history of language sciences NORTH AMERICAN ASSOCIATION FOR THE HISTORY OF THE LANGUAGE SCIENCES The North American Association for the History of the Language Sciences was founded in December 1987 to promote, encourage and support the history of the sciences concerned with language, such as linguistics, anthropology, philosophy, psychology, sociology, history of ideas, history of science and other disciplines, both theoretical and applied, from the earliest beginnings to the present, including non-European traditions. In addition to the critical presentation of the origin and development of particular ideas, theoretical concepts, terms, schools of thought or particular trends, the Association is interested in the discussion of the methodological, philosophical, and epistemological foundations of a historiography of the language sciences. The Association promotes these aims by issuing a newsletter (twice annually) to keep members in touch and informed about upcoming meetings, including those of other societies, ongoing research projects and recent publications in the field. The Association also sponsors a meeting held jointly with the Linguistic Society of America's annual meeting. Thus the next meeting will take place in January 1992 in Philadelphia. Dues for the organization are $10 (American) or $12 (Canadian). Annual membership runs from June 1 to May 31. 
Dues should be sent to the Treasurer, with checks drawn on US banks made out to NAAHOLS, and checks drawn on Canadian banks made out to Talbot Taylor: Professor Talbot Taylor Treasurer, NAAHOLS Department of English College of William and Mary Williamsburg VA 23185 For more information and copies of the latest newsletter, contact: Professor Douglas Kibbee Department of French University of Illinois 2090 Foreign Language Building 707 South Mathews Avenue Urbana IL 61801 (3) -------------------------------------------------------------------- Date: Wed, 29 May 91 11:04:28 EDT From: Derek Gross Subject: Abstracts from sci.lang [Originally posted by T. Geller (geller@ucunix.san.uc.edu) to sci.lang] As a member of The Esperanto League for North America, I receive (along with their regular newsletter) ELNA Update, a small, general-interest language newsletter. Along with some Esperanto news, it contains quite a bit of interesting other stuff. Here are some headlines and excerpts from the 1991/2 issue: ========================== TURKISH PLAN TO PERMIT KURDISH IS IN TROUBLE A government bill to ease restrictions on the use of Kurdish has encountered resistance in the Turkish parliament and may die there. The Bill would lift a ban that made it a crime for Kurds to speak their own language in public or listen to their traditional songs. ...[the bill] would not affect restrictions on teaching Kurdish in schools or publishing newspapers and books in Kurdish. (San Francisco Chronicle, Mar. 14, 1991) --------------------------- FRENCH LANGUAGE FORBIDDEN IN ALGERIA The Algerian parliament has approved a law, similar to one introduced by Col. Khaddafi in Libya, which makes Arabic the sole official language of the country. Its use is now obligatory for all government documents, as well as in trade and schools. It is now a crime in Algeria to use a foreign language (French is the one most often used) for any of these purposes. 
Punishments range from invalidation of documents to fines from $100 to $500. Businesspeople who use French words on their products risk having their factories and shops closed. (Heroldo de Esperanto, #6, 1991) --------------------------- FRENCH SPELLING REFORM FIZZLES (CONTINUED) Opposition to the proposed French spelling reform is increasing. The goal of the reform is to make spelling more phonetic. Strong opposition is growing not only in France but in Canada and Switzerland as well. Among the groups which have formed to protest the proposed changes are The Association to Save the French Language (l'ASLAF) and The Committee Robespierre which proposed "a moral guillotining to everyone who dares to profane the French language." (Heroldo de Esperanto, #6, 1991) --------------------------- DUTCH SPELLING CHANGES Spelling changes are being introduced in the Netherlands, although more successfully than in France. French loan words such as _bureau_, _cha^teau_, and _cadeau_ are now being written as _buro_, _sjato_, and _kado_ by some newspapers, while English imports _showroom_, _session_, and _social unit_ are now being written as _sjoowroem_, _sesjen_, and _soosjel joenit_. (Heroldo de Esperanto, #6, 1991) --------------------------- COMPUTER DICTIONARY IN IRISH A constant problem for speakers of minority languages is the lack of up-to-date technical dictionaries. A new computer dictionary has just appeared in Irish, which will now allow Irish speakers to use their language in the computer field instead of English, which until now has enjoyed a monopoly there. In recent years the Irish government and publishers in and outside Ireland have published a large number of technical dictionaries and lists, including ones for medicine, science, the military, music, trade, etc. (Monato, Feb. 
1991) ============================ There are a few other interesting articles, having to do with Puerto Rican statehood being opposed on language grounds, Esperanto, English problems in other countries, and so on. The newsletter (4 pp., quarterly) is free to all members of "The Friends of Esperanto," at $7.50/year. Available from: ELNA P.O. Box 1129 El Cerrito, CA 94530 Tel: (415) 653-0998 Hope you enjoyed! [End Linguist List, Vol. 2, No. 0262] ________________________________________________________________ Linguist List, Vol. 2, No. 0263. Sunday, 2 June 1991 Subj: 2.0263 Farewell from UNIWA Total: 41 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Sunday, 2 June 1991 From: The LINGUIST Moderators Subject: Farewell from UNIWA (1) -------------------------------------------------------------------- Date: Sunday, 2 June 1991 From: The LINGUIST Moderators Subject: Farewell from UNIWA Unless some unforeseen event occurs, this will be the last message any of you receive from LINGUIST@UNIWA.UWA.OZ.AU, and thus from the University of Western Australia in Perth, Western Australia. All following messages will be from LINGUIST@TAMVM1 at Texas A&M, and LISTSERV@TAMVM1 (Bitnet) and LISTSERV@TAMVM1.TAMU.EDU (Internet) are now the correct addresses for all subscriptions and signoffs. For a little while the main address of the moderators of LINGUIST will remain at the University of Western Australia, so if any of you have any problems with the list, please address them, as in the past, to LINGUIST-EDITORS@UNIWA.UWA.OZ.AU, or directly to one of the moderators. Messages will be sent from the Listerv at TAMVM1 today. If you fail to receive them in a reasonable time, your address has probably failed to work from our new site. Re-subscribing will solve the problem. 
Our thanks must go, in this last message, to the department of Anthropology at the University of Western Australia, and to Shelly Harrison in particular, without whom LINGUIST would never have come into existence. [End Linguist List, Vol. 2, No. 0263] ________________________________________________________________ Linguist List, Vol. 2, No. 0264. Sunday, 2 June 1991 Subj: 2.0264 Queries: Register, Turkish, Diacritics Total: 162 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Fri, 31 May 91 16:36:00 EDT From: GL250007@Venus.YorkU.CA Subject: Query: "Register" as a technical term (2) Date: Fri, 31 May 91 15:04:06 PDT From: marks%neuro.usc.edu@usc.edu (Mark Seidenberg) Subject: Turkish query (3) Date: Sat, 25 May 91 18:08:35 CDT From: john@utafll.uta.edu (John Baima) Subject: ISO10646 and diacritical marks (1) -------------------------------------------------------------------- Date: Fri, 31 May 91 16:36:00 EDT From: GL250007@Venus.YorkU.CA Subject: Query: "Register" as a technical term I would appreciate hearing what "register" means as a technical term. I have a sense of the term in Halliday's use, and wonder how this meshes with other uses. You may send replies to me directly or to the list. My email address is GL250007@VENUS.YORKU.CA Bill Greaves (2) -------------------------------------------------------------------- Date: Fri, 31 May 91 15:04:06 PDT From: marks%neuro.usc.edu@usc.edu (Mark Seidenberg) Subject: Turkish query Could someone please point me to (a) a computer-readable Turkish lexicon (a list of words and their pronunciations), and/or (b) work on morphological parsing in Turkish? Thanks in advance. Mark Seidenberg (3) -------------------------------------------------------------------- Date: Sat, 25 May 91 18:08:35 CDT From: john@utafll.uta.edu (John Baima) Subject: ISO10646 and diacritical marks ISO10646 is an attempt by ISO to devise a global character coding standard. 
It seems to me that the fate of ISO10646 will have a significant impact on linguists in the years to come. I have been one of the people arguing for the inclusion of floating diacritical marks. However, several in the ISO world strongly object to them. One person who has frequently spoken against floating diacritical marks is Mr. Johan van Wingen. He is a representative to ISO from the Netherlands and is on several of the ISO committees. In the following message, Mr. van Wingen is responding to one of my messages to the ISO10646 list (quoted with ">"). I can respond, but I would like to get the responses of the people of LINGUIST and then post a summary (if anyone wants to respond directly, the ISO10646 list is ISO10646@JHUVM.BITNET). Thanks for any responses, John Baima john@utafll.uta.edu ------------------------------------------------------------------- Date: Fri, 24 May 91 17:51:00 CET From: "J. W. van Wingen" Subject: RE: Re: ECMA enhancement proposal for ISO/IEC DIS 10646 Sender: Multi-byte Code Issues Dear Colleagues As I told you before, I have got a tremendous backlog in replying to messages in this list, which deserve one. Here is the first. > While a couple of people have responded to this message, I would like > to add one more point. Minority languages need floating diacritical > marks. There are about 6,000 languages in use today. Within the > lifetime of ISO10646, many minority languages will enter the computer > age. Most of these languages will require diacritical marks because a > majority of them are tone languages. Other languages will need > diacritical marks for other reasons. Most minority languages will be > written in the national language script with as few additional > characters as possible, with the addition of diacritical marks. Here some issues are confused. 1. Minority languages are so special that they must be written with Diacritics. 2. 
Because they are so small, no fixed letter/diacritic combination is available in the general repertoire. 3. Thus for these languages an ad hoc solution is necessary: the Floating Diacritic that combines at will a letter with a diacritic (this has the advantage that when a minority language is repressed by the regime, no modification of the ISO repertoire needs to be proposed to remove the letter). 4. Because minority languages are written with FD, ISO 10646 should contain this concept. The essence of the argument is that it pertains to a spelling problem, not a coding problem, and thus is strictly speaking outside the scope of this discussion list. The first question is: why should languages be written using special characters or diacritics. Finnish and Hungarian are (distantly) related languages, but with the first the difference between long and short vowels is indicated by doubling the letter (aa/a), with the second by putting an acute over the long one. Which solution is the better? The only answer can be given by experimental research. If for a language that never has been written before, a writing system is developed, one can discover a certain pattern. First, several missionaries produce a system, each different from the other. Then a linguist publishes a grammar or a lexicon, unifying the spelling, and using an Extension of the International Phonetic Alphabet, totalling to at least 60 small and capital letters. This system is then taught at schools; a few books, and some monthly journals are printed. Then, a New Government raises the language to Official status, and appoints a committee to simplify the spelling, because the cost of ordering special typewriters and printers is becoming too high. > I don't pretend to tell people what they should or should not do, but I > think that it needs to be made clear that a vote against diacritical > marks is a vote against thousands of minority groups. This is putting the fault at the wrong door. 
It assumes that the way of spelling must be in that way and cannot be in another. In fact, it is the fault of the linguists who designed the spelling while ignoring problems of technical reality, requiring corrections at a later stage from the government. It occurred even with the spelling for Dutch, by de Vries and te Winkel, highly respected scholars. To understand the rules, knowledge of Ancient Gothic was needed. Only after a painful process, a simplified system was established by law, in 1954. > But again, when > ECMA, SHARE, European Governments, and others vote (officially or > unofficially) against diacritical marks, they are voting against > minorities. It isn't just us Americans (including the "California > Dictate") who are concerned about minorities, right? I have the greatest respect for the Soviet linguists who designed the spelling for Caucasian languages in the 1930s by avoiding diacritics almost completely. They realized that writing all the 75 phonemes of Abkhazian each by a separate letter would create an unmanageable system. We should be concerned about minorities, especially where they have been the victim of a linguist. One of the worst jokes these people produced was to designate the dot over the i as a diacritic. The people of Turkey have to pay dearly for it, up to this day. I have a suggestion. Let us introduce a law, that every linguist who wants to introduce a new character, not yet existing, has to pay an amount of $ 100 000 as a contribution to the cost caused to the information industry as a result of his invention. Best regards, Johan van Wingen [End Linguist List, Vol. 2, No. 0264] ________________________________________________________________ Linguist List, Vol. 2, No. 0265. 
Sunday, 2 June 1991 Subj: 2.0265 Responses Total: 141 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Thu, 30 May 91 16:10:55 PDT From: rwojcik%atc.boeing.com@RICEVM1.RICE.EDU Subject: Pseudo-oblique objects (2) Date: Thu, 30 May 91 23:51:18 EDT From: Alexis_Manaster_Ramer@MTS.cc.Wayne.edu Subject: Word stability in historical linguistics (3) Date: 31 May 91 10:49:00 EST From: "ELISE EMERSON MORSE-GAGNE" Subject: hyouston (1) -------------------------------------------------------------------- Date: Thu, 30 May 91 16:10:55 PDT From: rwojcik%atc.boeing.com@RICEVM1.RICE.EDU Subject: Pseudo-oblique objects My thanks to those who have sent me comments on the syntax of range-specifying NPs such as "between 45 minutes to an hour". The grammatical problem that these things pose is that they resemble PPs but behave like NPs. Semantically, the prepositions name beginning and end points on a scale, rather than a relation between an NP and a verb or situation. Right now, I am inclined to think of them as headless post-modifying PPs. So (1) behaves as if it had the syntax of (2): (1) Between 45 minutes and an hour elapsed. (2) A time between 45 minutes and an hour elapsed. One could still analyze the subject in (3) differently, i.e. more like the subject of (4): (3) Between 5 and 10 minutes elapsed. (4) Approximately 5 minutes elapsed. In other words, one could still claim that 'between 5 and 10' fits into a kind of 'measure slot' in the NP. But if you have to live with (1) anyway, then maybe (3) could be treated as a headless postmodifier, too. The headless postmodifier idea might also help to illuminate the nature of double-preposition constructions: (5) Set the timer to between 45 minutes and an hour. (6) Remove debris from around the pipe. I.e. "...to a time between 45 minutes and an hour" and "...from the space around the pipe". Any comments on this line of thought would be appreciated. 
-Rick Wojcik (rwojcik@atc.boeing.com) (2) -------------------------------------------------------------------- Date: Thu, 30 May 91 23:51:18 EDT From: Alexis_Manaster_Ramer@MTS.cc.Wayne.edu Subject: Word stability in historical linguistics Some weeks ago, while I was away, Bob Poser posted a query regarding claims (esp. those of Aharon Dolgopolsky) about the stability of words with certain meanings. Poser correctly points out that Dolgopolsky's study is marred by the fact that he did not publish his data, that his sample of languages, while large, was unrepresentative (only Old World languages), and that for the languages in the sample we have to take his word for the fact that he correctly identified the words which have remained unchanged since the oldest reconstructed stage and those which have been replaced. I personally think the last objection is not very important, precisely because the sample was skewed in favor of languages Dolgopolsky knows. More generally, I do not know of a better study of word stability, and I think that the purpose for which this study was designed was such that it was not an unreasonable thing to do. The purpose was to devise a way of identifying groups of languages which it is reasonable to assume are related. It was NOT intended to supplant the conventional methods of comparative linguistics which are used to (a) prove beyond a reasonable doubt the existence of a relationship and (b) reconstruct a proto-language (or fragments of one). As Poser points out, Dolgopolsky's partially ordered list of the "stablest words" (including such items as the first and second person as well as interrogative pronouns, negation, the numerals 2 and 3, body parts such as heart and tongue, etc.) has been referred to in recent accounts of the work on remote relations. (To be sure, in the (in)famous Starostin quotation about the word for 'hand', we are dealing with a word which is NOT on the Dolgopolsky list.) 
However, it should be noted (and I think has been emphasized in recent press accounts) that neither the Nostratic nor the Sino-Caucasian hypothesis (nor even the less believable Dene-Caucasian and some other theories) are based on the stability argument. Whether correct or not, these theories are based on claims of massive and nontrivial sound correspondences and morphological relationships involving large chunks of the morpheme stocks of the languages involved. In fact, it is possible to poke holes in parts of the stability theory by pointing out that, for example, the numerals 2 and 3 are NOT shared by the different Nostratic branches, as Dolgopolsky himself often points out in other contexts. Having said this, it would be very important to get more work on the whole stability question. My own feeling (and it is supported by various kinds of data, not least the fact about 2 and 3 in Nostratic) is that stability of words is dependent on culture, and hence not universal in the standard sense of the word. Thus, if a given people did not count at all, then it would not have words (much less stable ones) for any numerals, and so on. Hence, I would be skeptical of lists such as Dolgopolsky's as having a completely universal status, but used judiciously such studies could prove useful. It certainly seems to be a common assumption of ALL comparative linguists that I know of that you base your work on basic alias core vocabulary and hence that there is such a thing as basic or core vocabulary and this tends to be stable enough to allow meaningful comparison. It would be nice to have both a more precise formulation of this and to know the limits of what we are allowed to assume. (I should add that Dolgopolsky himself in a later article noted that first and second person pronouns become quite UNstable in languages spoken by certain kinds of highly stratified societies (such as those of Europe, S., SE., and E. 
Asia), but he concludes that such exceptions to his claims are easy to contain. I think that the problem is much more difficult, and indeed in principle insuperable.) (3) -------------------------------------------------------------------- Date: 31 May 91 10:49:00 EST From: "ELISE EMERSON MORSE-GAGNE" Subject: hyouston Norval Smith's comment that the original, Scottish place-name was pronounced --I mean, IS pronounced--[hust@n] suggests the fate of the word "coupon", which has initial [ky] in many people's speech despite coming from French with a [ku]. According to the OED that very word in fact was an early borrowing (whoops) which has survived to the present in Scots English but was re-borrowed more recently into the rest of the language. How do Scots say it? [End Linguist List, Vol. 2, No. 0265] ________________________________________________________________ Linguist List, Vol. 2, No. 0266. Sunday, 2 June 1991 Subj: 2.0266 Phonology and Orthography Total: 120 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: 31 May 91 10:01:45 U From: "Mark Durie" Subject: Re: Phonology and Orthograph (2) Date: Thu, 30 May 91 23:21:54 EDT From: Alexis_Manaster_Ramer@MTS.cc.Wayne.edu Subject: Phonology and Orthography (3) Date: Fri, 31 May 91 16:07 GMT From: John Coleman Subject: RE: Phonology and Orthography (4) Date: Fri, 31 May 91 11:24 MET From: RICHARD@CELEX.KUN.NL Subject: Dutch spelling changes (1) -------------------------------------------------------------------- Date: 31 May 91 10:01:45 U From: "Mark Durie" Subject: Re: Phonology and Orthograph Reply to: RE>Phonology and Orthography In response to Margaret Fleck's inquiry: > I would be surprised to find (does anyone know?) the same > wholesale omission of vowel indications in non-Semitic languages that > have borrowed e.g. the Arabic alphabet. Acehnese (Sumatra) has 27 or so vowels, and uses the Arabic script. 
Some vowels are regularly omitted in the script, and the rest are just represented by a few letters, building on a combination of Malay and Arabic conventions. Needless to say, this traditional Acehnese script is not something you can read unless you are already very fluent in the language. (2) -------------------------------------------------------------------- Date: Thu, 30 May 91 23:21:54 EDT From: Alexis_Manaster_Ramer@MTS.cc.Wayne.edu Subject: Phonology and Orthography In response to Margaret Fleck's inquiry: > I would be surprised to find (does anyone know?) the same > wholesale omission of vowel indications in non-Semitic languages that > have borrowed e.g. the Arabic alphabet. Ottoman Turkish, Persian, Urdu, Sindhi, and presumably many other Turkic, Indo-Iranian, etc., languages of Central and South Asia that use an Arabic-based writing system, all follow the Arabic principle of writing some vowels (those perceived as somehow corresponding to the Arabic long vowels, usually), but not others (typically those perceived as resembling Arabic short vowels). (3) -------------------------------------------------------------------- Date: Fri, 31 May 91 16:07 GMT From: John Coleman Subject: RE: Phonology and Orthography In response to Margaret Fleck's request for > some concrete examples of languages in which my statement that > The number of symbolic distinctions (bits of information) needed to > encode a set of CV moras using mora-symbols may be less than if > alphabetic symbols are used ... Any mora-based writing system in which the number of legal CV combinations is less than the number of Cs times the number of Vs will do. In Japanese, for example, combinations yi, ye, wu are not possible, and not encoded by the use of mora symbols. 
As a consequence, ryi, rye, myi, mye, hyi, hye, nyi, nye, byi, bye, pyi, pye, dyi, dye, zyi, zye, gyi, gye are also not possible. The importance of attending to the number of bits of information in the coding system, rather than the number of distinct symbols, is discussed in a different context in Halle's 1969 paper in Journal of Linguistics 5. --- John Coleman (4) -------------------------------------------------------------------- Date: Fri, 31 May 91 11:24 MET From: RICHARD@CELEX.KUN.NL Subject: Dutch spelling changes In reply to Derek Gross' abstracts from sci.lang concerning Dutch spelling changes: >DUTCH SPELLING CHANGES >Spelling changes are being introduced in the Netherlands, although more >successfully than in France. > >French loan words such as _bureau_, _cha^teau_, and _cadeau_ are now >being written as _buro_, _sjato_, and _kado_ by some newspapers, while >English imports _showroom_, _session_, and _social unit_ are now being >written as _sjoowroem_, _sesjen_, and _soosjel joenit_. >(Heroldo de Esperanto, #6, 1991) For those who are a bit nonplussed by the supposed full-blown phonetic spelling of words like 'sjato' for 'cha^teau' and 'soosjel joenit' for 'social unit' in Dutch, I can add reassuringly that none of the above spellings are officially acknowledged, although 'buro' and 'kado' are widely used informally. The phoneticized spellings of the other words are most likely remnants of the Sixties' social subculture and definitely frowned upon by the average user of Dutch. In fact, more and more words are adopted in their original English spelling under the influence of films, commercials, TV and music, as must be the case in most other languages nowadays. Still, there is an ongoing and so far fruitless debate in the Netherlands about more phonetic spelling, e.g. for cases of Auslautverhaertung (German for final consonant devoicing) and replacing 'c' with either 'k' or 's' when pronounced that way. Richard Piepenbrock [End Linguist List, Vol. 
2, No. 0266] ________________________________________________________________ Linguist List, Vol. 2, No. 0267. Sunday, 2 June 1991 Subj: 2.0267 Sociolinguistic Issues: Survival and Bilingualism Total: 116 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Thu, 30 May 91 12:27:52 -0500 From: louden@ix1.cc.utexas.edu (mark l louden) Subject: Re: The Survival of Immigrant Languages (2) Date: Thu, 30 May 91 23:55:28 EDT From: Alexis_Manaster_Ramer@MTS.cc.Wayne.edu Subject: The Survival of Immigrant Languages (3) Date: 31 May 91 11:05:00 EST From: "ELISE EMERSON MORSE-GAGNE" Subject: non-bilingual police and judiciary (4) Date: Thu, 30 May 1991 18:05 EST From: Karen Christie Subject: Re: FYI (1) -------------------------------------------------------------------- Date: Thu, 30 May 91 12:27:52 -0500 From: louden@ix1.cc.utexas.edu (mark l louden) Subject: Re: The Survival of Immigrant Languages In response to Elise M-G's question about restrictive legislation and Penn. German, the story is kind of interesting and somewhat complex. The first relevant point is that the Penn. Germans, both sectarians (Old Order Amish, Mennonites) and non-sectarians (everybody else) alike, a) have not considered themselves German-Americans as other groups have, e.g. Ger-Ams such as the groups Herb Stahlke refers to in the Midwest, in particular urban Ger-Americans; b) have not been regarded by other Ger-Americans as really Ger-American. This is due in large part to the fact that the Penn. Germans came over so much earlier (1683-1775) than other sizable groups (mainly 19th c), among whom there was a greater number of middle class and educated people. 
Penn Germans of both types (sectarian and non-sectarian) who have maintained the dialect are almost without exception rurally based and with limited formal education; those who historically moved to cities and towns and pursued higher levels of education and became middle class generally have given up the dialect in favor of English. The PG dialect has always had the stigma of country-bumpkin speech, even among its own speakers (mainly non-sectarian) who will readily agree with their outside critics that it is 'not a real language' bec. it is effectively (for most speakers) a non-literate lg. and bec. it differs from Std. German. It is no accident that the term Pennsylvania Dutch is almost universally preferred by speakers over Penn. German since they'll say they're 'not really German, but a mixture of German and American, Dutch'. In the early 1830's, a school law was passed in Pa. allowing lgs. other than English to be used as media of instruction, but ironically it was widely interpreted as mandating English only; in any case, the low prestige of PG was reinforced by the fact that English remained the exclusive medium of instruction for dialect-speaking children. Many of these children suffered serious discrimination at the hands of teachers, who were by and large not dialect speakers, and although there was never any specific legislation targeting use of PG, kids felt the impact of their mother tongue being stigmatized. Some observers have included Penn Germans in the list of Ger-Amer groups persecuted during WWI (and WWII), but as Kloss has pointed out, this is incorrect. Bec. of the view of Penn 'Dutch' as not being really German, they were largely unaffected by anti-German sentiment felt in other places like here in Texas (during WWI it was illegal to speak/teach German in the state and our dept. was closed down). 
The accelerated decline of PG among non-sectarians in this century is thus not due to anti-German feelings, but rather the decreasing social and geographic isolation of the speakers (characterized of course by increasing educational achievement and major population shifts); the decline of French and Spanish in Louisiana can be attributed to similar circumstances. The maintenance situation among PG-speaking sectarians is of course just the opposite since the dialect, along with distinctive dress and transportation, is an important marker of in-group status. But when people leave the Old Order groups and become more socially assimilated, PG is generally given up quickly. In some families who leave, they literally go from speaking PG at home to English from one day to the next. Best, Mark Louden (2) -------------------------------------------------------------------- Date: Thu, 30 May 91 23:55:28 EDT From: Alexis_Manaster_Ramer@MTS.cc.Wayne.edu Subject: The Survival of Immigrant Languages (1) African languages survived quite well in parts of Brazil. (2) It is my impression that in India various communities maintain their native languages for generations, and that the whole idea that we should expect immigrant communities to assimilate linguistically to their environment is a feature of American (and European?) culture, not a cultural universal. (3) -------------------------------------------------------------------- Date: 31 May 91 11:05:00 EST From: "ELISE EMERSON MORSE-GAGNE" Subject: non-bilingual police and judiciary With reference to the Miami instance, readers may be interested in a series of articles in the Philadelphia Inquirer last week describing the problem encountered by Spanish-speaking mushroom harvesters in Chester Cty near Philadelphia. It appears that virtually no one on the police force speaks Spanish, and interpreters are amateur and sometimes entirely lacking for every stage of a court case, starting with the arrest and going through sentencing. 
This results in situations like someone named Angel Jose Lopez getting arrested for something Antonio Juan Lopez actually did--and going to jail for it. The articles are informative and shocking. Elise Morse-Gagne (4) -------------------------------------------------------------------- Date: Thu, 30 May 1991 18:05 EST From: Karen Christie Subject: Re: FYI The comment about bilingual jurors and 'translation problems' reminds me of the situation for Deaf jurors also. A Deaf friend of mine was called to jury duty and at that time the judge told the interpreter he must "sign exactly what I say WORD for WORD." [End Linguist List, Vol. 2, No. 0267] ________________________________________________________________ Linguist List, Vol. 2, No. 0268. Sunday, 2 June 1991 Subj: 2.0268 Job Total: 78 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Fri, 31 May 91 11:40:40 METDST From: Eric Reuland Subject: Job at Groningen (1) -------------------------------------------------------------------- Date: Fri, 31 May 91 11:40:40 METDST From: Eric Reuland Subject: Job at Groningen Preliminary Announcement Please bring this to the attention of any potential applicants ---------------------------------------------------------------- UNIVERSITY OF GRONINGEN CENTER FOR BEHAVIOURAL, COGNITIVE AND NEURO-SCIENCES (BCN) Department of Linguistics/Department of Psychology Pending final approval, applications are invited for a Research position (Postdoc) in a psycho-linguistic project investigating the modular nature of the language system, using advanced techniques determining the nature and localization of cerebral activities. The project, entitled "Conceptual Structure and Computation" will commence on September 1, 1991, and has a duration of 4 years. Brief description: Within the human language system conceptual and formal (computational) subsystems can be distinguished. 
The nature of and differences between these subsystems will be investigated in connection with differences in the nature and localization of the associated cerebral activities. Cerebral activities will be measured by registration of event related potentials, determination of neuro-magnetic activity, and by positron emission tomography, during experimental tasks differentiating as to the type of subsystem involved. The project will fall under the Faculty of Letters, Department of Linguistics. The experiments will be carried out at the Departments of Experimental Psychology and the Groningen PET Scanning Center. The project will require expertise both in formal linguistics and in experimental psychology. We are looking for applicants with excellent research qualifications, who have a PhD in formal linguistics, research experience in psycholinguistics, and are able to further develop the necessary skills in the latter field. The person to be appointed should consider it a challenge to bridge the gap between these disciplines, and be willing to play an active role in the further development of experimental linguistics at Groningen University. It may be expected that the appointment will be made in scale 10 of the BBRA (the Rules and Regulations for University staff in the Netherlands, approx. Dfl. 54000-63000, depending on experience). Please send applications, including curriculum vitae and the names and addresses of three referees, to Dr. Eric Reuland Department of Linguistics Faculteit der Letteren Postbus 716 9700 AS Groningen The Netherlands It is recommended to send applications before June 22nd 1991. Applicants will be notified as soon as authorization is obtained for the official procedure to start (the official procedure may require reconfirmation of the application). For inquiries, please contact Dr. Eric Reuland, telephone 31-50-635813 (office), 31-50-635974 (admin.) or 31-5908-16029 (home), or via e-mail reuland@let.rug.nl. [End Linguist List, Vol. 2, No. 
0268] ________________________________________________________________ Linguist List, Vol. 2, No. 0269. Sunday, 2 June 1991 Subj: 2.0269 Tongue Twisters Total: 106 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Thu, 30 May 91 10:03:14 -0500 From: louden@ix1.cc.utexas.edu (mark l louden) Subject: German Tongue Twisters (2) Date: 31 May 91 2:55 +0800 From: Hurch Subject: tongue twisters, re: morse-gagne (3) Date: Fri, 31 May 91 09:45:28 +0200 From: "Stephan Busemann" Subject: German Tongue Twisters (1) -------------------------------------------------------------------- Date: Thu, 30 May 91 10:03:14 -0500 From: louden@ix1.cc.utexas.edu (mark l louden) Subject: German Tongue Twisters A few versions of the German Oberammergau tongue twister have been posted, so I guess I'll include the one I remember. Heut' kommt der Hans zu ihr, Freut sich die Lies; Ob er ueber oder unter Oberammergau, Oder ob er aber ueber oder unter Unterammergau, Oder aber ob er ueberhaupt nicht kommt, Is' net gewiss In the third line, the 'ob er' and 'aber' may be switched, I can't remember. I also remember an alternate version sung by a drunk at the English Garden in Munich which included lines such as 'ob er aber mit dem ober'n Kiefer kaut' ('whether he chews with the top teeth'). Anybody familiar with this less twisted rendition? Best wishes--Mark Louden (2) -------------------------------------------------------------------- Date: 31 May 91 2:55 +0800 From: Hurch Subject: tongue twisters, re: morse-gagne Let me specify the note from Elise Emerson Morse-Gagne: I know an Upper Bavarian version by the critical "folk" singers Biermoeslblasn which goes: Der Russ der kimmt, der Russ der kimmt des is ganz gwiss Ob er aber ueber Oberammergau oder ob er aber ueber Unterammergau oder ob er aber ueberhaupt net kimmt des is net gwiss. 
The Russians will come for sure, But whether they will come via Oberammergau or via Unterammergau or not at all, that's not sure. This is an ironical version, and currently the most popular one about the (non-existing) danger of a Russian invasion in Bavaria. The basic lines are the traditional ones. And (as a speaker of Bavarian) I would doubt whether this is really a tongue twister. Bernhard Hurch (3) -------------------------------------------------------------------- Date: Fri, 31 May 91 09:45:28 +0200 From: "Stephan Busemann" Subject: German Tongue Twisters In reaction to Elise Morse-Gagne's mail (Vol. 2 No. 258): The text you quoted is a German folk song. I remember it as follows: Heut' kommt der Hans zu mir, freut sich die Lies. Ob er aber ueber Oberammergau oder aber ueber Unterammergau oder aber ueberhaupt nicht kommt, ist nicht gewiss. today Hans is coming to (see) me (which) Lies is glad about But if he comes by O. or by U. or if he doesn't come at all isn't certain. It is not a dialect at all though the villages in question are Bavarian... I don't take it as a tongue twister, since everything is easy to pronounce (at least for German native speakers). In lines 3 to 5, stressed o, a, ue, and au alternate with unstressed e. However the text contains a garden path, namely "ueber" in "ueberhaupt nicht" (not at all), which seems to introduce a third route Hans might use. For me, the most difficult German tongue-twister has always been this (short and silly) one: Brautkleid bleibt Brautkleid und Blaukraut bleibt Blaukraut. wedding-dress remains w. and red cabbage remains r. c. Stephan Busemann [End Linguist List, Vol. 2, No. 0269] ________________________________________________________________ Linguist List, Vol. 2, No. 0270. 
Thursday, 6 June 1991 Subj: 2.0270 Queries Total: 72 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Mon, 3 Jun 1991 11:56 EST From: "Hi, 'lo" Subject: Dutch Verb-second phenomena (2) Date: Mon, 3 Jun 91 20:01 EST From: 00Z0ZHAO%BSUVAX1.BITNET@UICVM.uic.edu Subject: Eve Clark (3) Date: Tue, 4 Jun 91 13:01:22 EDT From: Donna Erickson Subject: Re voice quality for tone language (4) Date: Wed, 5 Jun 91 09:57:42 EDT From: jmyers@pilot.njin.net (Jim Myers) Subject: Language Learning Software (1) -------------------------------------------------------------------- Date: Mon, 3 Jun 1991 11:56 EST From: "Hi, 'lo" Subject: Dutch Verb-second phenomena A Belgian graduate student who is here for a short while is interested in the phenomenon of verb-second in Dutch, especially as applied to deaf people's use of Dutch. Is there any recent literature (post-1985) in formal linguistics that deals with verb-second in Dutch? Susan Fischer (2) -------------------------------------------------------------------- Date: Mon, 3 Jun 91 20:01 EST From: 00Z0ZHAO%BSUVAX1.BITNET@UICVM.uic.edu Subject: Eve Clark Does anyone know Dr. Eve Clark's (Department of Linguistics, Stanford University) bitnet account? I would appreciate it very much if you could give me the information. (3) -------------------------------------------------------------------- Date: Tue, 4 Jun 91 13:01:22 EDT From: Donna Erickson Subject: Re voice quality for tone language We are a group of linguists trying to study voice quality in tone languages. Is anybody doing similar research, or does anyone know of any literature about it? 
Ho-hsien Pan (4) -------------------------------------------------------------------- Date: Wed, 5 Jun 91 09:57:42 EDT From: jmyers@pilot.njin.net (Jim Myers) Subject: Language Learning Software A professor at Montclair State College is interested in developing software for students enrolled in elementary foreign language courses for whom memorizing word pairs is difficult. The software should allow students to enter their own word pairs. Students would then be able to access the word list for self-testing. The desired program should record responses and retest the student on incorrect responses. Does anyone know of an existing program that would fulfill these requirements? Please respond to me directly and I will summarize results to the list. [End Linguist List, Vol. 2, No. 0270] ________________________________________________________________ Linguist List, Vol. 2, No. 0271. Thursday, 6 June 1991 Subj: 2.0271 For Your Information Total: 95 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Mon, 03 Jun 91 15:21:36 EDT From: Johnd Moyne Subject: CUNY Sentence Processing Conference (2) Date: Wed, 5 Jun 91 22:39:07 AES From: aiatsis@peg.pegasus.oz.au Subject: Material on Aboriginal Studies (1) -------------------------------------------------------------------- Date: Mon, 03 Jun 91 15:21:36 EDT From: Johnd Moyne Subject: CUNY Sentence Processing Conference The tentative dates for the next meeting of the annual CUNY Sentence Processing Conference are March 19-21 or March 26-28, 1992 in New York City. If any of you know of any other meeting which may be in conflict, will you please let me know. 
Thanks, John Moyne: moygc@cunyvm.bitnet (2) -------------------------------------------------------------------- Date: Wed, 5 Jun 91 22:39:07 AES From: aiatsis@peg.pegasus.oz.au Subject: Material on Aboriginal Studies The Australian Institute of Aboriginal and Torres Strait Islander Studies now holds computerised material relating to Aboriginal Studies. The archive is available to researchers, subject to deposit and access conditions and currently holds material of the following types: % Dictionaries of Aboriginal languages. % Texts in Aboriginal languages. % Graphics for use in literature production. % General texts relating to Aboriginal Australia. In addition the archive has software and can provide information and advice about use of Macintosh computers. What is the function of the archive? The ASEDA provides a service to researchers in the field of Aboriginal Studies. By accessing information in electronic form researchers can engage in comparative linguistic work, can locate references that are not available by keyword searching of catalogues, and can 'add value' to existing work (by producing various forms of output from existing data files). The archive offers long term storage and maintenance of electronic versions of texts. It also arranges for the production of information in electronic form from paper texts, using optical scanning technology. How is the data stored? The archive is currently stored on a Canon Magneto-Optical Disk (MOD) with a capacity of 250 MB per side of each removable disk. Data is imported from non-Macintosh formats, and stored in its original form, where possible. Structured data is also converted into a 'text-only' file. This facilitates exporting to other media, and reading the data using various software. It is not possible to mark-up all of the data so that it conforms to a standard format. Documentation of the coding conventions used in particular texts is available from the archive. What restrictions are there? 
Normal copyright restrictions apply, and there are additional restrictions placed on items in the archive by the depositors. Deposit and access forms accompany each item, specifying what access is allowed. Many items are freely available for the use of researchers. How can I use the archive? In two ways. Firstly, you can deposit information with the archive. Any information that you produce or have produced on disk can be deposited. Any information that would normally be deposited with the AIATSIS in a hard copy can now also be deposited in electronic form. Secondly, you can request information from the archive. You will then be sent a copy of the data, subject to the access restrictions placed on it by the depositor. **Note that the ten Raa/Woenne Green Research Dictionary of the Western Desert is now formatted and accessible (~1200kb on Mac disk) [End Linguist List, Vol. 2, No. 0271] ________________________________________________________________ Linguist List, Vol. 2, No. 0272. Thursday, 6 June 1991 Subj: 2.0272 Register and Net-Discourse Total: 61 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Mon, 3 Jun 1991 21:09 CDT From: Peter Gingiss Subject: Register (2) Date: Wed, 5 Jun 91 15:45:32 EDT From: macrakis@osf.org Subject: Flaming and E-mail discourse (1) -------------------------------------------------------------------- Date: Mon, 3 Jun 1991 21:09 CDT From: Peter Gingiss Subject: Register My understanding of the term register is that it refers primarily to the lexicon, and is used mostly in terms of occupations or situations. Thus one can talk of a sports announcer's register or even a sports register. On the other hand, one recent text book uses register as the equivalent of style. I have heard a debate between a British and an American linguist over whether the term "register" was even necessary if one has dialect and style. Register seems to be more a British term. 
Because I am writing from where my library is not, I would have to investigate further at a later time, but I would be happy to do so. (2) -------------------------------------------------------------------- Date: Wed, 5 Jun 91 15:45:32 EDT From: macrakis@osf.org Subject: Flaming and E-mail discourse A while ago there was a discussion about why dialogue on newsgroups fails. There is a recently published book on E-mail behavior which may interest some of you: Connections, by L. Sproull and S. Kiesler, MIT Press. I only leafed through it at the bookstore, but it does look interesting. On the same subject, it was interesting to see the discussion about Quebec's language policy on the Linguist list degenerate into flaming a few weeks ago. (I admit it: I was a participant.) Since the Linguist list is not very prone to flaming (compared with, say, sci.lang or soc.culture.french), this demands explanation. Could it be that professional linguists' common discourse rules are limited to technical linguistics, and when other subjects are discussed, linguists don't share discourse rules any more than anyone else? Linguist posters seem to manage Dept. of Linguistics discourse behavior, but not Senior Common room discourse behavior...! (I think it's fair to call it flaming and not just strong disagreement because of the number of participants with no specific knowledge of the subject, yet expressing strong opinions.) -s [End Linguist List, Vol. 2, No. 0272] ________________________________________________________________ Linguist List, Vol. 2, No. 0273. 
Thursday, 6 June 1991 Subj: 2.0273 Responses: Diacritics and Acronyms Total: 94 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Mon, 3 Jun 91 10:04:59 +1000 From: bert peeters Subject: ISO10646 and diacritical marks (2) Date: Thu, 6 Jun 91 01:36:55 -0400 From: daniel%drew.cog.brown.edu@RICEVM1.RICE.EDU (Daniel Radzinski) Subject: Hebrew based acronyms (1) -------------------------------------------------------------------- Date: Mon, 3 Jun 91 10:04:59 +1000 From: bert peeters Subject: ISO10646 and diacritical marks Until I read John Baima's message to LINGUIST, I had been unaware of the fact that moves are being made (and opposed) to introduce floating diacritical marks in those areas of the information industry (such as electronic mail, I presume) where they are not available (yet). If I get Baima's message right, he's saying that there are people out there who'd rather have no diacritics, and are prepared to blame linguists for inventing writing systems that are too complicated. This is outright appalling. It is the very first time that I hear someone argue for the fact that technology should remain stagnant and that all those writing systems that do use diacritics (such as French, for instance, which is not exactly a minority language) are in need of revision. Let me point out that French diacritics were not proposed first by modern linguists nor by missionaries or anything of the sort. They have been for a long time part of the French writing system (and of many other writing systems for that matter) and it is outrageous that someone should make the suggestion that therefore French needs a spelling reform which rids it of its diacritics. There are probably spelling matters that are in need of more urgent revision. I'd like to know how many native speakers of French would be happy to replace their acute accents with a double vowel, for instance. It would render the language unrecognizable. 
Right now, people corresponding by electronic mail in a language such as French have two possibilities: either they ignore the diacritics, hoping that their message will be read and understood properly - or they use double characters where the diacritic follows or precedes the character which in normal spelling takes it. As I often send out e-mail messages in French, I feel that neither system is completely satisfactory, and I am therefore sympathetic to Baima's proposal to do something about it. I'm not sure whether I am understanding all the implications of the discussion that is going on, but I felt so weird after knowing there are people pointing fingers at linguists who do the right thing - that is develop writing systems that are probably not ideal but close enough so as to be given the green light - that I simply had to make a statement. If I've made myself ridiculous, so be it. Bert Peeters (2) -------------------------------------------------------------------- Date: Thu, 6 Jun 91 01:36:55 -0400 From: daniel%drew.cog.brown.edu@RICEVM1.RICE.EDU (Daniel Radzinski) Subject: Hebrew based acronyms Hebrew acronym based anthroponyms. I'm not sure we are looking at the phenomenon the way it ought to be seriously dealt with from the point of view of the scientific study of natural language. That "Katz" is an acronym for _kohen tzedek_ `priest of justice' and "Segal" for _sgan levi_ `deputy Levite', for example, is fairly well-known. The question is whether this is true in the sense that it is the actual way such names were formed. I strongly suspect that lesser-known examples such as "amen" being an acronym for _emet muvan (ve-)naxon_ `true, understood (and) correct', or the Aramaic based "bar" (as in Bar-Mitzvah) for _ben rav_ `son of rabbi' (rabbi here in the sense of `respected gentleman') and "mar" `Mr.' for _morenu (ve-)rabenu_ `our teacher (and) rabbi' are nothing more than folk (or rabbinical) etymologies. 
If so, why should we take the Katz and Segal cases for granted? Well, an explanation might go along the lines that it is no accident that the vast majority of Katzes are kohanim (descendants of Aaron along paternal lineage) and Segals, leviyim (descendants of Levi [but not Aaron] along paternal lineage). But then, why don't we find non-Ashkenazi (i.e., from parts other than central or east Europe) Katzes and Segals? It seems, to me at least, far more reasonable to assume that the names Katz and Segal derive from Germanic terms related to felines and sails, respectively (or perhaps some other Indo-European source). It simply happened to be the case that the original (or originals) Katz was a kohen, and Segal, a levi. Once the families grew substantially, folk etymologies were sought. Any corroboration on this? Is there anyone out there who can enlighten us some more on this matter? -- Daniel Radzinski [End Linguist List, Vol. 2, No. 0273] ________________________________________________________________ Subj: 2.0274 Jobs Total: 110 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Mon, 3 Jun 1991 12:33:40 -0400 From: rogers@epas.utoronto.ca (Henry Rogers) Subject: Phonology Position -- Toronto (2) Date: Thu, 6 Jun 91 16:10:57 +0200 From: vannoord@coli.uni-sb.de (Gertjan van Noord) Subject: job opening in Computational Linguistics (1) -------------------------------------------------------------------- Date: Mon, 3 Jun 1991 12:33:40 -0400 From: rogers@epas.utoronto.ca (Henry Rogers) Subject: Phonology Position -- Toronto The Department of Linguistics at the University of Toronto invites applications for a one-year leave replacement position beginning August 15, 1991. Applicants should have completed the Ph.D. in linguistics with a specialization in phonology. The position involves undergraduate and graduate teaching, with the possibility of supervision at the MA level. 
Applicants should submit a cover letter and CV to: Chair Department of Linguistics University of Toronto Toronto, Ontario, Canada M5S 1A1 e-mail: rice@epas.utoronto.ca telephone: (416) 978-4029 fax: (416) 978-8821 Applicants should arrange for two letters of reference to be sent as well. In accordance with Canadian Immigration Regulations, priority will be given to Canadian citizens and permanent residents. Deadline for applications: June 28, 1991. (2) -------------------------------------------------------------------- Date: Thu, 6 Jun 91 16:10:57 +0200 From: vannoord@coli.uni-sb.de (Gertjan van Noord) Subject: job opening in Computational Linguistics 5 June 1991 Job Opening in Computational Linguistics at the University of Saarbrucken: Research Associate (Wissenschaftlicher Mitarbeiter) in the Project BiLD We are looking for an additional researcher to join the project BiLD, located at the Department of Computational Linguistics of the University of Saarbruecken. BiLD (Bidirectional Linguistic Deduction) is a three year project (started January '91) funded by the DFG (German Science Foundation). The project is part of the Special Research Division 314: Artificial Intelligence and Knowledge-Based Systems with locations at the Universities of Karlsruhe, Kaiserslautern and Saarbruecken. The main objective of the project BiLD is the development of uniform methods for parsing and generation based on the paradigm of "NLP as deduction" in the area of constraint-based approaches to linguistics. Current project members are Guenter Neumann, Gertjan van Noord and Hans Uszkoreit (PI). Ideally, we are looking for a computer scientist or computational linguist with theoretical and practical experience in automated deduction techniques and interest in the application of AI deduction methods to natural language processing. 
However, we are also interested in hearing from interested individuals offering a good background in computational linguistics with experience in the area of the processing of feature-logic-based grammars. The University of Saarbruecken offers an excellent research environment for anyone interested in computational linguistics, artificial intelligence, and computer science. Several research projects are conducted at the Computational Linguistics department. The University has one of the best Computer Science departments in Germany. NLP is one of the main strengths of the department's AI lab. The university also hosts the DFKI (German Research Center for Artificial Intelligence), where several project groups work in the area of NLP. Furthermore a newly founded Max-Planck Institute for Computer Science with a focus in Parallel Processing has been set up on the Saarbruecken campus. Appointment initially until December 1993, with prospects for renewal until December 1996, Salary on the German BAT IIa scale. Exact income depends on age and marital status. CV's and enquiries electronically or by post or during the conference to Prof.Dr. Hans Uszkoreit Universitaet des Saarlandes Computerlinguistik W 6600 Saarbruecken 11 Germany uszkoreit@coli.uni-sb.de [End Linguist List, Vol. 2, No. 
0274] ________________________________________________________________ Subj: 2.0275 Things Phonological and Orthographical Total: 110 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Tue, 4 Jun 91 05:34:18 -0400 From: PLUNKETT@LINGUIST.umass.edu Subject: Reply to Responses (2) Date: Tue, 4 Jun 91 12:26:36 BST From: fleck@robots.oxford.ac.uk(Margaret@RICEVM1.RICE.EDU Fleck) Subject: Phonology and Orthography (3) Date: Thu, 6 Jun 1991 12:42:20 PDT From: Diane_L._Olsen.osbu_north%xerox.com@RICEVM1.RICE.EDU Subject: Re: ISO10646 and diacritical marks (1) -------------------------------------------------------------------- Date: Tue, 4 Jun 91 05:34:18 -0400 From: PLUNKETT@LINGUIST.umass.edu Subject: Reply to Responses In reply to Morse-Gagne's question about how the Scots pronounce "coupon" the answer is /ku../ definitely not /kyu.../. In fact I've never heard anyone in Britain pronounce it otherwise, the first time I heard it as /kyu was when I came to the States and I always thought it was a hypercorrection. (2) -------------------------------------------------------------------- Date: Tue, 4 Jun 91 12:26:36 BST From: fleck@robots.oxford.ac.uk(Margaret@RICEVM1.RICE.EDU Fleck) Subject: Phonology and Orthography (1) So there seem to be a fair handful of languages that have borrowed not only the Arabic script, but also its habit of omitting vowels. Thanks to everyone for the examples. Ds nn ndrstnd hw ppl mng t rd sch scrpts? Is it just masochism, or are there particular features of the phonology of these languages that make it more plausible than it sounds? (2) Does anyone understand why John Coleman thinks that the length of some (unspecified) bit-encoding of the 2D character patterns is a suitable measure of how difficult it is for people to learn and use a writing system? 
It appears that he proposes to use a fixed number of bits per character, and then form the output bit string by concatenating the representations of successive characters. Under these assumptions, a syllable-based writing system can, as he claims, encode a fixed text using fewer bits. However, it follows by the same line of reasoning that an ideographic system (e.g. Chinese) uses storage space even more efficiently and should thus be an even more popular (stable, adopted by other languages, easy for children to learn) method of writing. Margaret (3) -------------------------------------------------------------------- Date: Thu, 6 Jun 1991 12:42:20 PDT From: Diane_L._Olsen.osbu_north%xerox.com@RICEVM1.RICE.EDU Subject: Re: ISO10646 and diacritical marks Could someone please send me a copy of John Baima's message? I must have accidentally deleted my copy without reading it. I receive a great deal of e-mail, and sometimes I'm a little too quick to purge items without reading them. As for the subject of Bert Peeters' message (people proposing spelling reform -- possibly including ASCII-ification, it sounds like -- instead of technological innovation as a means of solving the "diacritic problem"), well, there are a good many idiots out there in the world. Or rather, there are a lot of people in the world who are very naive about language. I find it hard to suppress a giggle at the notion that a conspiracy of linguists (something like the "Gnomes of Zurich," for those who play the conspiracy-theory game "Illuminati") is to blame for the fact that the writing systems of the world are not all easily encodable in ASCII. 
This is the first I've heard of anyone proposing the elimination of diacritics as a solution to the problem; but as I have heard many people (mostly nonlinguist engineers) over the years express the sincere wish that human language take on more of the properties of computer languages (e.g., context insensitivity and simple, unambiguous syntax and semantics), I am not surprised. But I hardly think that diacritics are in any danger of extinction! (Speaking of linguistic naivete, I have even heard someone -- this time a young American computer scientist just back from his first trip to Europe -- say that he had heard -- and believed! -- that the official language of the new, unified Europe would be English.) In my department here at Xerox, we develop and maintain software for multilingual WYSIWYG word processing in nearly twenty of the world's most commonly used writing systems, including Arabic and Chinese. I bring this up not so much as an advertisement as an existence proof. Making computers multilingual requires not one iota of "spelling reform." And once we have replaced ASCII with a more universal character encoding standard (OK, my turn to be naive :-) ), all computers will be multilingual. On that glorious day, we will be able to conduct electronic flame wars in Tibetan if we want to! As a loyal coworker of Joe Becker, I am honor bound not to leave any discussion about diacritics or character encoding standards without mentioning the Unicode Consortium. Well, there, I've said it! :-) I bring up Unicode not to discuss it (no anti-Unicode flames, please! That's not my battle!) but simply to point out that there is at least one other group besides ISO that is attempting a universal character encoding standard. Diane L. Olsen (dolsen.osbu_north@xerox.com) Multilingual Software Development Xerox Corporation [End Linguist List, Vol. 2, No. 
0275] ________________________________________________________________ Subj: 2.0276 Responses Total: 135 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Mon, 3 Jun 91 12:31:24 EDT Subject: Turkish query From: Sean Boisen (2) Date: Mon, 3 Jun 91 05:58:07 EST From: nuyts%ccu.UIA.AC.BE%CUNYVM.CUNY.EDU@RICEVM1.RICE.EDU Subject: Flemish (3) Date: Mon, 3 Jun 91 12:04:03 +1000 From: bert peeters Subject: Pseudo-oblique objects (1) -------------------------------------------------------------------- Date: Mon, 3 Jun 91 12:31:24 EDT Subject: Turkish query From: Sean Boisen Mark Seidenberg writes: > Could someone please point me to (a) a computer-readable Turkish lexicon > (a list of words and their pronunciations), and/or (b) work on morphological > parsing in Turkish? > Several years ago I worked with a morphological parser for Turkish devised by Jorge Hankamer at UC Santa Cruz, which (reasonably enough) included a lexicon of about 1300 base entries (without detailed phonetic pronunciations, only part of speech and simple definitions). You should contact him directly to obtain it: I assume he's enlarged and enhanced it by now. I doubt my old email address for him is still valid, but back in 87-ish it was ucscc!hank@ucb-vax.berkeley.edu. ........................................ Sean Boisen -- sboisen@bbn.com BBN Systems and Technologies, Cambridge MA (2) -------------------------------------------------------------------- Date: Mon, 3 Jun 91 05:58:07 EST From: nuyts%ccu.UIA.AC.BE%CUNYVM.CUNY.EDU@RICEVM1.RICE.EDU Subject: Flemish A (somewhat delayed) response to Willem de Reuse's question about the pragmatics of double-pronoun constructions in Flemish-Dutch dialects. I do not know the dialects referred to in the discussion so far, but my own dialect, from Antwerp city, does have the pronoun doubling system too, as do most Belgian Dutch dialects. 
Since I happen to be working (together with my colleague Georges De Schutter) on a grammar of the Antwerp dialect, I have been looking into the issue of the pragmatic differences between alternatives quite recently. The question is: what is the difference between simple pronoun forms such as Ik em da gedaan I have that done I have done it/this and complex forms such as Ik/k em ekik da gedaan I-full/clitic have I-expanded-form that done One problem is that intuitions are not sufficient to settle the matter, so we are planning to work with informants to find out more about this. Anyway, as far as I am concerned, there are several factors involved. One is focus on the subject of the sentence. In the complex form the pronoun is much more prominent. Yet this is not sufficient, since in the simple form one can perfectly stress the initial pronoun to make it focus, too. I have the feeling that there is a subtle difference in the type of focus this produces, but I would not know how to formulate this at present. Another matter which seems involved is empathy: in many cases the second form is used to express some kind of emotional commitment (this is a very unnuanced rendering of what is involved) - a rather positive or satisfactory feeling about what is uttered in the sentence, or sometimes a rather negative attitude. The first form is completely neutral in this respect. But there is much more going on. Also consider the second-person cases, which are much more complex than the first-person cases: Gij/g e da goe gedaan You-full/clitic have that well done Gij/g e gij/ga da goe gedaan You-full/clitic have you-full/semi-full that well done G e dega/degij da goe gedaan You-clitic have you-expanded-semi-full/expanded-full that well done One complicating factor is that there are even more pronoun forms involved here. 
There is a full, semi-full, and clitic simple pronoun, and a full and semi-full expanded pronoun (and I am not even mentioning one further form, 'gijse', which is used in reactive speech acts only). There are also some syntactic differences with the first person singular. The second sentence has no equivalent in the first person singular: it is not possible to have pronoun doubling with the non-expanded form there. *Ik em ik da goe gedaan In the third sentence it is impossible to use the full form of the normal pronoun in initial position. In the second sentence, however, one can use both a full and a clitic form (the semi-full form probably does not occur for purely phonetic reasons) initially. It will be obvious that this is a quite complicated matter, and I do not feel up to predictions about what precisely is going on here, in terms of precise pragmatic conditions selecting one form or another. I hope I'll know more in a couple of months. Jan Nuyts University of Antwerp (3) -------------------------------------------------------------------- Date: Mon, 3 Jun 91 12:04:03 +1000 From: bert peeters Subject: Pseudo-oblique objects (1) Between 45 minutes and an hour elapsed. (2) A time between 45 minutes and an hour elapsed. The analysis of (1) in terms of (2) makes a lot of sense to me. However, how does Rick Wojcik suggest we should look at the greeting at the end of this message? Greetings from down under down under (i.e. from Tasmania) Bert Peeters [End Linguist List, Vol. 2, No. 
0276] ________________________________________________________________ Subj: 2.0277 More Tongue Twisters Total: 113 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Mon, 03 Jun 91 10:24:36 EDT From: Projekt Matrace Subject: Czech tongue twisters (2) Date: 3 Jun 91 08:14:00 EST From: "ELISE EMERSON MORSE-GAGNE" Subject: tongue-twisters (3) Date: Tue, 4 Jun 91 17:02:01 BST From: Effie Ananiadou Subject: Greek Tongue Twisters (4) Date: Tue, 04 Jun 91 10:33:54 EST From: Ralf Thiede Subject: German tongue twister (1) -------------------------------------------------------------------- Date: Mon, 03 Jun 91 10:24:36 EDT From: Projekt Matrace Subject: Czech tongue twisters Here is another Czech tongue twister: Nenaolejuje-li te Julie, naolejuju te ja. (If Julia won't oil you, I'll do.) But the all-time favourite in this language, notorious for tongue-twisting, should be this one: Strc prst skrz krk. (Push your finger through the throat.) Alexandr Rosen, Charles University, Praha (2) -------------------------------------------------------------------- Date: 3 Jun 91 08:14:00 EST From: "ELISE EMERSON MORSE-GAGNE" Subject: tongue-twisters Thank-you to everyone who provided correct versions of the Oberammergau verse--my favorite is the "the Russians are coming" variant! And I am glad to find out from speakers of the language that it doesn't count as a tongue-twister but presumably just as language play. Two short ones in English--one fairly widely known, one my own, but I find it impossible to say: Unique New York blue-black beetle --Elise Morse-Gagne (3) -------------------------------------------------------------------- Date: Tue, 4 Jun 91 17:02:01 BST From: Effie Ananiadou Subject: Greek Tongue Twisters A good Greek one is: 'aspri petra kseksaspri ki'ap'ton ilio kseksasproteri' 'white stone all white and from the sun even whiter' The prefix kse- is used as a means to express degree. 
Other ones involve very complicated one-word compounds, e.g. 'skoulikomirmigotripa' 'a hole for ants and worms' This latter is a favourite among Greek schoolchildren, often preceded with 'ftou' (literally "I spit"), but here taken not negatively but positively as in 'ftou sou matakia mou' which is often said for babies. Effie Ananiadou Centre for Computational Linguistics, UMIST (4) -------------------------------------------------------------------- Date: Tue, 04 Jun 91 10:33:54 EST From: Ralf Thiede Subject: German tongue twister In response to Shelly Harrison and to continue the ongoing silliness about Hans' itinerary and exploits, here is the other stanza: Hans isst den Schweizerkaes mit dem Gebiss. Ob er'n aber ueber'n Oberkiefer kaut, Oder aber ueber'n Unterkiefer kaut, Oder aber ueberhaupt nicht kaut, Ist nicht gewiss. [Hans eats Swiss cheese using his dentures. Whether he chews it over the upper jaw Or else over his lower jaw Or even not at all Is uncertain.] Neither the first nor the second stanza can be called a tongue twister, I would say, but rather fall into the category of joyously context-free word play. I don't know where the song first came up, though as a Westphalian I am willing to suspect Bavaria as the country of origin. ;-) Ralf Thiede UNC Charlotte [End Linguist List, Vol. 2, No. 0277] ________________________________________________________________ Subj: 2.0278 Flaming Total: 98 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: 6 Jun 91 14:57:00 EST From: "ELISE EMERSON MORSE-GAGNE" Subject: "flaming" (2) Date: Thu, 6 Jun 91 22:16:18 PDT From: sp299-ad@violet.berkeley.edu (Celso Alvarez) Subject: Re: Register and Net-Discourse (1) -------------------------------------------------------------------- Date: 6 Jun 91 14:57:00 EST From: "ELISE EMERSON MORSE-GAGNE" Subject: "flaming" I can't stand it any longer. WHAT DOES "FLAMING" MEAN??? 
I had never heard or seen the word in this usage until I read the introductory material for the Linguist list (sth. like "comments, suggestions and flames should be sent to..."). From such phrases as "the dialogue degenerates into flaming" I have deduced, regretfully, that we are not talking about "ardent, passionate, brilliant" discourse. Instead, I gather, the connotations of combustion, scorching, and incitation are invoked. Beyond that, it seems to refer to an exchange in which heat outstrips knowledge and leaves courtesy in cinders. I would welcome correction, amplification, historical excurses, and finding out whether I am the only person left in the world to whom this term is still new and strange. --Elise Morse-Gagne' (I would also welcome the option of diacritics in electronic mail, incidentally.) (2) -------------------------------------------------------------------- Date: Thu, 6 Jun 91 22:16:18 PDT From: sp299-ad@violet.berkeley.edu (Celso Alvarez) Subject: Re: Register and Net-Discourse macrakis@osf.org (sorry, I missed your name in your message) argues that the discussion on Quebec's language policy degenerated into flaming, and asks Could it be that professional linguists' common discourse rules are limited to technical linguistics, and when other subjects are discussed, linguists don't share discourse rules any more than anyone else? Linguist posters seem to manage Dept. of Linguistics discourse behavior, but not Senior Common room discourse behavior...! I disagree that the discussion was essentially different from one on DS's and SS's that one can find, for example, in the sci.lang newsgroup. I read quite a few informed and informative postings on Quebec and banned languages (the two parallel threads fed on each other). It is true that more personal positions were aired out. 
But I wouldn't establish a sharp distinction between "technical" linguistics and "other stuff," (a euphemism, for example, for sociolinguistics), about which more people feel entitled to speak. Language planning issues are as technical as any other. The discussion simply revealed that a healthy dose of subjectivity and ideological positioning underlies research in fields like sociolinguistics, sociology of language, or glottopolitics. In the above disciplines (and even, for example, in variation theory), the dual role of the analyst as producer of technical knowledge and as social actor is unmasked. It would be interesting to dig a little into the ideological foundations behind other linguistic research. Linguists and sociolinguists may share more than is evident in terms of their structural position as producers of specific "truth". Returning to your question, then, I think that discourse rules are managed in the sort of discussions described in very much the same way as in any other discussion. The tendency toward a cautious "cooperation" (and I take this notion with a spoonful of salt), based on the "collective" unraveling of the issue at hand, holds until a given statement or message either challenges the legitimacy of the opinions put forth (and their proponents), and/or simply resituates the discourse by shifting it toward another domain of expertise or interest. A given participant, for example, may choose to invoke a different identity with a simple question (like "Don't you consider absurd that...?") that reframes the exchange as more "personal" than "professional". Of course, the less technical knowledge a participant can display (and this is very frequent in sociolinguistic discussions), the more prone he or she is to attempt to redirect the exchange toward this "personal" domain. We tend to ignore, however, that this "personal" position is actually a socially and ideologically constructed one. 
Suddenly the social actor pops out of the cocoon of the "professional" researcher, and we are quick to dismiss this metamorphosis as not conforming to the rules of academic exchange. Celso Alvarez U.C. Berkeley sp299-ad@violet.berkeley.edu [End Linguist List, Vol. 2, No. 0278] ________________________________________________________________ Linguist List, Vol. 2, No. 0279. Monday, 10 June 1991 Subj: 2.0279 Queries Total: 88 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Thu, 6 Jun 91 21:17:26 +0200 From: Jan Olsen Subject: Queries: agreement, spelling (2) Date: Fri, 7 Jun 91 10:28 EDT From: DJBPITT%PITTVMS.bitnet@RICEVM1.RICE.EDU Subject: Terminological question (3) Date: 9 Jun 91 16:21:00 EST From: "ELISE EMERSON MORSE-GAGNE" Subject: query--borrowed pronouns (1) -------------------------------------------------------------------- Date: Thu, 6 Jun 91 21:17:26 +0200 From: Jan Olsen Subject: Queries: agreement, spelling I'm wondering if there are any spoken languages in which the verb has to agree with its indirect object, but must not agree with the direct object. (If I'm not mistaken, this is true for ASL-clauses with verbs such as GIVE). Furthermore, I'd like to know what different types of strategies languages use when there are not enough formal means to express the agreement relations which are obligatory from a syntax point of view. What I have in mind is something like the situation we find in Georgian (cf. Stephen Anderson's paper in NLLT 1984): V has to agree with SU, DO and IO, but there are only two slots for agreement suffixes. For 3rd person DOs, this does not create a problem since the pertinent suffix is phonologically empty. If a clause contains a SU and an IO, 1st and 2nd person DOs must be replaced by a possessive pronoun + tavi (_head_), a construction which is 3rd person from a formal point of view and therefore helps to solve the agreement problem. 
And now something completely different: Many writing systems use double letters to represent e.g. vowel length. Are there any writing systems which use triple letters? I'm not thinking of Dutch cases such as zeeen, where the third e belongs to a different morpheme. Are there writing systems which systematically reduplicate letters to express plurality - as in Spanish abbreviations such as EEUU (estados unidos)? Thanx Gisbert fanselow@unipas.fmi.uni-passau.de (2) -------------------------------------------------------------------- Date: Fri, 7 Jun 91 10:28 EDT From: DJBPITT%PITTVMS.bitnet@RICEVM1.RICE.EDU Subject: Terminological question Russian-language descriptions of Cyrillic orthography, particularly in medieval manuscripts, have a special term (neprikrytyj; literally "uncovered") to refer to vowel letters that are not preceded by consonant letters. Does anyone know whether there is a suitable English-language equivalent for this term? I should emphasize that this is an orthographic, rather than linguistic, question, since vowel letters do not necessarily correspond to vowel sounds. But a weak linguistic analogy would be a term that identifies syllables with no onset. Thanks, David (3) -------------------------------------------------------------------- Date: 9 Jun 91 16:21:00 EST From: "ELISE EMERSON MORSE-GAGNE" Subject: query--borrowed pronouns I am working on the adoption of the Scandinavian 3pl pronouns into English in the Middle Ages. I would be very interested to learn of any other instances of pronoun borrowing (transfer, etc.), whether between closely related or unrelated dialects/languages. I have Parker's dissertation discussing Westfoehring, and have heard that there may be a S American language which has borrowed Spanish pronouns, but can't track down the reference. Any leads? Elise Morse-Gagne [End Linguist List, Vol. 2, No. 0279] ________________________________________________________________ Linguist List, Vol. 2, No. 0280. 
Monday, 10 June 1991 Subj: 2.0280 FYI and Acronyms Total: 101 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Thu, 6 Jun 91 18:00:42 EDT From: mostow@cs.rutgers.edu Subject: CHEAP IJCAI91 AIRFARES AVAILABLE BUT NOT FOR LONG (2) Date: Mon, 10 Jun 91 09:54:56 EDT From: ling-ed@uniwa.uwa.oz.au (The LINGUIST Editors) Subject: Lakoff bibliography in LaTeX (3) Date: Thu, 06 Jun 91 11:58:39 -0400 Subject: Re: Responses: Diacritics and Acronyms From: Ellen Prince (4) Date: Fri, 7 Jun 91 10:42:54 EDT From: Alexis_Manaster_Ramer%MTS.cc.Wayne.edu@RICEVM1.RICE.EDU Subject: Hebrew Acronyms (1) -------------------------------------------------------------------- Date: Thu, 6 Jun 91 18:00:42 EDT From: mostow@cs.rutgers.edu Subject: CHEAP IJCAI91 AIRFARES AVAILABLE BUT NOT FOR LONG For those of you planning to attend IJCAI-91 in Sydney, Australia, it may be important to know that there are extremely discounted round-trip airfares to/from the US and Sydney available right now. Both Continental and Northwest are offering $685 (+ tax = $703) round-trip fares from JFK and Newark. Both these fares *expire soon* (Continental on June 7th and Northwest the following week) so, if you're interested in taking advantage of the savings, call your travel agent immediately! Your local travel agent should have further details and be able to make the arrangements for you. I DON'T HAVE ANY OTHER INFORMATION, SO PLEASE DO ***NOT*** CONTACT ME. (2) -------------------------------------------------------------------- Date: Mon, 10 Jun 91 09:54:56 EDT From: ling-ed@uniwa.uwa.oz.au (The LINGUIST Editors) Subject: Lakoff bibliography in LaTeX William Rapaport has kindly posted a bibliography of articles citing Lakoff's _Women, Fire and Dangerous Things_. 
This is available by sending listserv@uniwa.uwa.oz.au (NOT Listserv@tamvm1) the message: get lakoff-bibliography (3) -------------------------------------------------------------------- Date: Thu, 06 Jun 91 11:58:39 -0400 Subject: Re: Responses: Diacritics and Acronyms From: Ellen Prince daniel radzinski writes that an acronym source for katz and segal might be a folk etymology, the evidence for that being that these names do not exist outside of eastern european jewry. while he might of course be right, i'd like to point out that it is typical for jews to take names that are phonologically and even apparently morphologically consistent with the languages of the countries in which they reside. thus it is conceivable that katz/segal are indeed acronyms but were invented in eastern europe exactly BECAUSE they fit in so well with a germanic system. also remember that family names are a very late phenomenon, nearly a millennium later than the dispersion of the jews. except for those who eventually took their LABEL 'cohen', 'levi', 'israel'... as their family name, we would be very surprised indeed if we found the same family names among jews of very different regions, acronym or not. for example, my impeccably germanic maiden name, friedman(n), has the same origin--solomon/shlomo/shalom '[man of] peace'--as the oriental jewish family name suleiman--but each fits into the language of the land in which it was used. likewise, the oriental/sephardic family name 'azoulay' is (according to an egyptian jewish friend with that name) a hebrew acronym, but one that fits perfectly into the system of the lands in which IT was invented and used. 
(4) -------------------------------------------------------------------- Date: Fri, 7 Jun 91 10:42:54 EDT From: Alexis_Manaster_Ramer%MTS.cc.Wayne.edu@RICEVM1.RICE.EDU Subject: Hebrew Acronyms I fully agree with Daniel Radzinski's suggestion that many if not most Jewish names which are etymologized as Hebrew acronyms must also be considered in light of their apparent non-Hebrew etymologies. E.g., Katz is not just the acronym for kohen tzedek, but also the Germanic word for 'cat'. A piece of evidence for this is that acronymic folk etymologies are often offered in that culture for words whose etymology is obscure, e.g. yeke 'a pejorative term for a German Jew' is often explained as standing for yehudi kshe-havana 'a Jew hard of understanding', which is probably fanciful. Daniel's argument that otherwise we would expect names like Katz to also occur among Jews of other parts of the world is perfectly reasonable. However, it also seems to me that, rather than assuming that these names originally were just what they seem to be in German and were then folk-etymologized, I would think that at least some of them were deliberate puns from the beginning. [End Linguist List, Vol. 2, No. 0280] ________________________________________________________________ Linguist List, Vol. 2, No. 0281. Monday, 10 June 1991 Subj: 2.0281 Register Total: 96 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Thu, 6 Jun 1991 15:03-0400 From: Allan C. Wechsler Subject: Register (2) Date: Thu, 6 Jun 91 22:23:37 PDT From: sp299-ad@violet.berkeley.edu (Celso Alvarez) Subject: Re: Register and Net-Discourse (3) Date: Fri, 07 Jun 91 05:20:34 EST From: Woody Starkweather Subject: register (1) -------------------------------------------------------------------- Date: Thu, 6 Jun 1991 15:03-0400 From: Allan C. 
Wechsler Subject: Register A cautionary note: we're going to have a tough time defining "register" given the traditional difficulty of defining "language". Having said this, I'll try my hand. A register is a part of a language. You can think of a language as owning a lexicon of registers the same way it owns a lexicon of morphemes. As with the morpheme lexicon, not all speakers can use all the registers of a language. Registers are different from dialects in that a single speaker chooses from among a (usually small) set of registers situationally. Many languages have a "baby talk" register used by adults in talking to very young children. In one American dialect of English, the baby-talk register substitutes "ums" [Umz] for "you". Register change can involve a coordinated set of linguistically significant changes on all levels: discourse, syntax, lexicon, morphology, phonology, and phonetics. When I was in high-school we had a register that I can only describe as the "Joe Cool" register. One of its features was that all segments were voiced. Id was rilly gool. Japanese has about four registers conditioned by social status. It also has a baby-talk register in which (among other things) /boku/ "I" is used to mean "the baby", regardless of who is speaking to whom. (In a Japanese restaurant, we asked for a spoon for our 3-year-old. The waitress called into the kitchen in Japanese. Someone answered from the kitchen, also in Japanese, "For whom?" The waitress answered "/Boku-ni/", literally, "For me," but in context, "For the baby.") Australian languages have lots of peculiar registers for social avoidance and ritual purposes. I'd be interested in hearing other examples. (2) -------------------------------------------------------------------- Date: Thu, 6 Jun 91 22:23:37 PDT From: sp299-ad@violet.berkeley.edu (Celso Alvarez) Subject: Re: Register and Net-Discourse Peter Gingiss asks about "register" and "style". 
My opinion is that the notion of style is not sociologically informed. It individualizes talk, and deemphasizes the social construction of communicative conventions. Register (e.g. "formal" vs. "informal"), on the other hand, alludes to situational constraints, which are social in nature. Celso Alvarez U.C. Berkeley sp299-ad@violet.berkeley.edu (3) -------------------------------------------------------------------- Date: Fri, 07 Jun 91 05:20:34 EST From: Woody Starkweather Subject: register I may have missed some of the earlier discussion, so forgive me if I repeat something that someone else said. Register is a term used more by public speakers, announcers, and other speech production types to refer to the level of formality. The lowest register would be casual conversation between two friends, a notch higher would be talking to someone higher in status, a notch higher than that might be an informal public speech, such as a classroom, a notch higher than that a large audience to which one speaks on a specific topic. Higher than that is the broadcasting situation, where the audience can't be seen or heard. Register influences more than just lexicon. It influences also the clarity with which one articulates, and it influences pronunciation rules as well (you wouldn't say gonna or doin' on the air). Fluency level rules are also different. Um is quite acceptable at low levels but forbidden in broadcasting. Syntax is also influenced; consider the language of instruction. I suppose register is one dimension of pragmatic variation. [End Linguist List, Vol. 2, No. 0281] ________________________________________________________________ Linguist List, Vol. 2, No. 0282. 
Monday, 10 June 1991 Subj: 2.0282 Responses: Voice, V2, Address, Tongue Twisters Total: 81 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Thu, 6 Jun 91 18:02:40 PDT From: John_Gilbert@mtsg.ubc.ca Subject: Voice Quality (2) Date: Fri, 07 Jun 91 11:40:31 +0100 From: Hans van de Koot Subject: Hi, 'lo (Dutch V2 phenomena) (3) Date: Mon, 10 Jun 91 13:48 PDT From: Vicki Fromkin Subject: Re: Address of Eve Clark (4) Date: Fri, 7 Jun 1991 19:17 EST From: Fanmail from some flounder Subject: Re: More Tongue Twisters (5) Date: Sun, 9 Jun 91 08:21 EST From: Herb Stahlke <00HFSTAHLKE%BSUVAX1.BITNET@UICVM.uic.edu> Subject: RE: More Tongue Twisters (1) -------------------------------------------------------------------- Date: Thu, 6 Jun 91 18:02:40 PDT From: John_Gilbert@mtsg.ubc.ca Subject: Voice Quality To Donna Erickson: Voice Quality in tone languages - John Laver in Edinburgh knows everything there is to know about voice quality: john@cstr.edinburgh.ac.uk Peter Ladefoged or Ian Maddieson at UCLA? (2) -------------------------------------------------------------------- Date: Fri, 07 Jun 91 11:40:31 +0100 From: Hans van de Koot Subject: Hi, 'lo (Dutch V2 phenomena) A good place to start is Weerman's (1989) THE V2 CONSPIRACY. (3) -------------------------------------------------------------------- Date: Mon, 10 Jun 91 13:48 PDT From: Vicki Fromkin Subject: Re: Queries According to my list Eve Clark's e-mail address is: eclark@psych.stanford.edu (4) -------------------------------------------------------------------- Date: Fri, 7 Jun 1991 19:17 EST From: Fanmail from some flounder Subject: Re: More Tongue Twisters The following is a tongue-twister in English as well as a finger-fumbler in ASL: Good blood, bad blood. 
Susan Fischer (5) -------------------------------------------------------------------- Date: Sun, 9 Jun 91 08:21 EST From: Herb Stahlke <00HFSTAHLKE%BSUVAX1.BITNET@UICVM.uic.edu> Subject: RE: More Tongue Twisters My favorite perversion of an English tongue-twister is How many figs could a fig plucker pluck if a fig plucker could pluck figs? Does anyone know of other tongue-twisters whose likely mispronunciations also have meaning, perhaps not with the same bestial charm of this one? Herb Stahlke Ball State University [End Linguist List, Vol. 2, No. 0282] ________________________________________________________________ Linguist List, Vol. 2, No. 0283. Monday, 10 June 1991 Subj: 2.0283 Diacritics Total: 151 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Thu, 6 Jun 91 18:51:07 EDT From: macrakis@osf.org Subject: Responses: Diacritics and Acronyms (2) Date: Fri, 7 Jun 91 15:17:14 +0200 From: Lars Henrik Mathiesen Subject: ISO10646 and diacritical marks (3) Date: Mon, 10 Jun 91 10:15:04 +0200 From: mark@adler.philosophie.uni-stuttgart.de (Mark Johnson) Subject: Re: ISO10646 and diacritical marks (1) -------------------------------------------------------------------- Date: Thu, 6 Jun 91 18:51:07 EDT From: macrakis@osf.org Subject: Responses: Diacritics and Acronyms Let me reassure Peeters and others. No one is trying to get rid of diacritics in general. The argument is much narrower than that: should character encodings be closed (i.e. contain a fixed repertoire of character+diacritic combinations) or open (i.e. permit arbitrary combinations of character and diacritic)? The draft international standard DIS10646 has a closed repertoire, Unicode an open one (but also contains many pre-composed characters). Both proposals cover the main languages of the world fully (yes, `main' is vague but certainly includes a wide range: certainly all the European languages, but also Vietnamese, Orissa, Persian, ...). 
Both can encode e-acute as a single code. Both can encode Ancient Greek eta-rough-grave-subscript, but 10646 encodes it as a single code (yes, there is a list of all the possible combinations in the standard!), while Unicode encodes it as a sequence of four codes. The difference comes with characters which are NOT widely used, either because the language is not among those covered by ISO (Intl. Standards Organization), or because the usage is narrow (e.g. scholarly). For instance, Cyrillic_R-macron-grave (useful for Slavic philology, perhaps?) does not exist as a precomposed character in Unicode or in 10646. But in Unicode it can be represented as a combination of three codes, even if it's never been used before (and even if it is a typo!). 10646 could of course add it in a future revision, but this has to be done on a case-by-case basis. In closed repertoire systems, adding a new character is expensive. In open repertoire systems, using characters composed of existing elements costs nothing. Proponents of closed repertoire systems argue that inventors of NEW orthographies should limit themselves to standard characters. Proponents of open repertoire systems argue that this is an unnatural limitation which restricts designers of orthographies artificially. This is what the argument is about, not about suppressing e-acute. -s (2) -------------------------------------------------------------------- Date: Fri, 7 Jun 91 15:17:14 +0200 From: Lars Henrik Mathiesen Subject: ISO10646 and diacritical marks In response to Bert Peeters: The draft of ISO 10646 does contain accented characters --- lots of them. As I understand it, the purpose is that not only English and French, but most or all languages with a standard orthography, should find all the characters they need in there. The argument is about how letter/diacritic combinations should be represented. Currently, all the combinations that are thought necessary are enumerated in a (very large) set of lists. 
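[The precomposed-versus-combining distinction in this thread can be tried out with a small sketch. It is an illustration only: it assumes present-day Unicode as exposed by Python's standard unicodedata module, which postdates these postings and is neither the 1991 Unicode draft nor DIS 10646. The helper names code_points, compose and decompose are made up for the example.]

```python
import unicodedata


def code_points(text):
    """Return the code points of text as a list of 'U+XXXX' strings."""
    return [f"U+{ord(ch):04X}" for ch in text]


def compose(text):
    """Prefer precomposed characters where the repertoire has them (NFC)."""
    return unicodedata.normalize("NFC", text)


def decompose(text):
    """Split characters into base letter plus combining marks (NFD)."""
    return unicodedata.normalize("NFD", text)


# e-acute: exists as a single precomposed code.
E_ACUTE = "\u00e9"
# Greek eta with rough breathing, grave and iota subscript: also precomposed.
ETA_ROUGH_GRAVE_SUBSCRIPT = "\u1f93"
# Cyrillic R + combining macron + combining grave: no precomposed form
# (assumption: true of the Unicode data shipped with current Python).
CYRILLIC_R_MACRON_GRAVE = "\u0420\u0304\u0300"

if __name__ == "__main__":
    samples = [
        ("e-acute", E_ACUTE),
        ("eta-rough-grave-subscript", ETA_ROUGH_GRAVE_SUBSCRIPT),
        ("Cyrillic R-macron-grave", CYRILLIC_R_MACRON_GRAVE),
    ]
    for label, text in samples:
        print(label)
        print("  composed:  ", code_points(compose(text)))
        print("  decomposed:", code_points(decompose(text)))
```

[Run as a script, this prints the composed and decomposed code points for each sample. The Cyrillic example stays a sequence of three codes even after composition, which is the open-repertoire case: the combination is representable although nobody enumerated it in advance.]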
There is no way, within the draft standard, to create new combinations. (A given display system may have ways of superimposing two symbols from the standard, but the exact method will vary between systems.) The suggestion is that the draft standard should be changed to include "floating diacritics," which are defined _by_the_standard_ to be placed with the next letter. The message that was copied here (by a proponent of this) was from the main opponent of the suggestion, it seems. By his argument, the only way new combinations could become necessary would be for ``irresponsible'' linguists to invent ``unreasonable'' accented letters when creating an orthography for a language. Therefore, it is reasonable for ISO to create a standard that cannot accommodate such alphabets, and therefore it's the linguists (and not the standard committee) who will be guilty of forcing the users of that language to use non-standard equipment, with the attendant costs. (To me, this argument seems a little circular. However, there are technical reasons why a standard without floating diacritics is easier to implement.) I think the message was copied here to elicit arguments in favor of floating diacritics, i.e., good reasons why a fixed repertoire cannot be sufficient. -- Lars Mathiesen, DIKU, U of Copenhagen, Denmark [uunet!]mcsun!diku!thorinn Institute of Datalogy -- we're scientists, not engineers. thorinn@diku.dk (3) -------------------------------------------------------------------- Date: Mon, 10 Jun 91 10:15:04 +0200 From: mark@adler.philosophie.uni-stuttgart.de (Mark Johnson) Subject: Re: ISO10646 and diacritical marks Speaking from just about complete ignorance except for what I've seen on this group, I take it that ISO10646 proposes a fixed-length character encoding system. Some of these characters would have diacritics "built in" (the way the Postscript and Mac character sets have accented vowels as single characters, for example), I would assume. 
(Please, someone correct me if I am wrong here!). I think what's at issue here is whether there should be a productive way of combining pre-existing characters to form new characters (think of overstriking as such a way). For example, one might agree to interpret the backslash as an 'escape character' that means 'build a new character by overstriking the next two characters', as in \a` , for example. Since the new characters so constructed could themselves be subjected to further overstriking, we can get a very large number of characters. As I understand it, the technical problems associated with such a system are largely typographical, having to do with obtaining an acceptable typeface for the new combined character. It's difficult to imagine how one could automatically figure out exactly where some diacritic should be placed over some other character that is already a compound of many basic characters. Recall that in general we will be dealing with variable-width fonts with multiple faces or type-styles; I think the currently accepted position (again, someone please correct me if I am wrong!) is that the only way to get acceptable results is to have each diacritic-character combination individually designed by a type-face designer. However, I think that the sensible thing for the ISO standard to do is to build in a general escape method, which (among other things) would allow overstriking of arbitrary characters to build new characters. At the same time, the standards writers should note that with current technology, the results are likely to be typographically unacceptable --- but who knows what technology in 10 or 20 years from now will be like? No point in shutting the door unless you have to! Mark Johnson [End Linguist List, Vol. 2, No. 0283] ________________________________________________________________ Linguist List, Vol. 2, No. 0284. 
Monday, 10 June 1991 Subj: 2.0284 Phonology and Orthography Total: 68 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Fri, 7 Jun 91 15:39:54 PDT From: rwojcik@atc.boeing.com Subject: Re: Things Phonological and Orthographical (2) Date: Mon, 10 Jun 91 12:33 +0100 From: "Hartmut Haberland, Roskilde University" Subject: RE: Things Phonological and Orthographical (1) -------------------------------------------------------------------- Date: Fri, 7 Jun 91 15:39:54 PDT From: rwojcik@atc.boeing.com Subject: Re: Things Phonological and Orthographical I got the same reading of John Coleman's argument as Margaret Fleck did--that syllabaries and logographic systems would be more efficient than alphabetic systems if you compared combinations of characters to symbols. Under that reading, the cumbersome Japanese and Chinese typewriter keyboards would be considered superior to the English keyboard on the grounds that it would take fewer strokes to type words than with the alphabetic keyboards. So why are typists so few and far between for those hummers? The keyboards contain too many keys. It was this sense of efficiency that I originally meant when I called alphabets more efficient than syllabaries. -Rick Wojcik (rwojcik@atc.boeing.com) (2) -------------------------------------------------------------------- Date: Mon, 10 Jun 91 12:33 +0100 From: "Hartmut Haberland, Roskilde University" Subject: RE: Things Phonological and Orthographical Two remarks on Margaret's posting: (1) I think people have agreed upon considering scripts like Chinese or Japanese kanji as logographic, not ideographic. This is an important distinction. 
(There was an article a couple of months ago in Monumenta Nipponica about this and the historical aspects of the question - I can supply the reference if somebody is interested -, but among linguists this should be old hat - Coulmas (1982) in his 'Ueber Schrift' refers to Japanese kanji as a logographic script, and he's not the first one.) (2) I don't find English spelling so different from the practice used in Arabic script, viz. omitting vowels. If you exaggerate a bit, you might say that in English spelling, vowel letters only mark the _place_ of the vowel in the phonetic chain, plus give some rough indication of the quality of the associated vowel phoneme (think of words like read [2 readings!], beach, break, ...). With languages like Finnish, Czech, and, in spite of a complicated letter <-> match, French and Modern Greek (not to talk about Japanese kana etc.), it is possible to read a text aloud without actually knowing what it says, you just follow the letters. This is impossible with Arabic, Ivrit and other scripts that don't supply vowels: you have to understand the text first before you can read it aloud. English is closer to the second group than to the first. (This was pointed out to me by somebody when I complained about Irish spelling on bilingual road signs in Ireland, like when you read 'Dun Laoghaire' and don't know which of the vowels are to be pronounced and which of them just palatalize/velarize consonants. 'You have to know what the place is called (and what its name is pronounced like) before you can read it aloud,' I said. 'Well,' the answer was, 'this is even more the case with English ...'. And true it is!) Hartmut [End Linguist List, Vol. 2, No. 0284] ________________________________________________________________ Linguist List, Vol. 2, No. 0285. 
Wednesday, 12 June 1991 Subj: 2.0285 Conferences Total: 198 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Thu, 06 Jun 91 14:43:47 BST From: Robert Dale Subject: Call for Papers: International NLG Workshop 1992 (2) From: Henry Thompson Date: Fri, 7 Jun 91 10:32:04 BST Subject: Symposium and Panel on Natural Language and Speech, ESPRIT week Symposium on Natural Language and Speech (3) Date: Sat, 8 Jun 91 14:23:10 BST From: morgan%bach.cogsci.uiuc.edu@RICEVM1.RICE.EDU (Jerry Morgan) Subject: Workshop program (1) -------------------------------------------------------------------- Date: Thu, 06 Jun 91 14:43:47 BST From: Robert Dale Subject: Call for Papers: International NLG Workshop 1992 Call for Papers The Sixth International Workshop on Natural Language Generation Castel Ivano, Trento, Italy, 5th--7th April 1992 PURPOSE AND SCOPE: Following on from the five previous International Workshops on Natural Language Generation, this workshop aims to bring together researchers in a rapidly consolidating field. We intend to structure the workshop around a number of emerging topic areas: Multi-modality: the practical and theoretical issues underlying the development of systems that integrate language generation with other media (such as graphics, maps, and forms). The representation and use of syntactic knowledge: we particularly welcome papers which attempt to bridge the gap between earlier phrase structure grammar based approaches, systemic approaches, and newer constraint-based approaches, and discussions of how these approaches address the motivation of syntactic choice. Approaches to text planning: a number of approaches to discourse structure (such as RST, DRT and schemas) have relevance to text planning. What are their respective strengths and, especially, weaknesses? In what areas do we need additional theories? 
Applications of NLG: the use of language generation techniques in, for example, expert system explanation, machine translation, dialogue systems, and report generation; their implications for more theoretical issues. Multi-linguality: the effects upon system architecture and underlying representation of building systems which generate text in more than one language. To what extent is it possible to build plug-and-play realization components for different languages for use with generic text planners? SUBMISSIONS: It is our intention to publish a book consisting of the workshop papers in time for the workshop itself; contributors interested in participating in this workshop are initially requested to submit A PAPER OF BETWEEN 10 AND 20 PAGES in length. Accepted papers will be returned for final polishing and revision into full length papers before inclusion into the workshop proceedings. The cover page of the draft paper should include the title, the name(s) of the author(s), complete addresses (including email address and fax number if available), a short (10 line) summary, and a specification of the topic area. Send to: Mail: Robert Dale Centre for Cognitive Science, University of Edinburgh 2 Buccleuch Place, Edinburgh EH8 9LW Scotland Tel: (+44) 31 650 4416 Fax: (+44) 31 662 4912 Email: R.Dale@uk.ac.edinburgh SCHEDULE: Submissions are due at the above address NO LATER THAN 15TH SEPTEMBER 1991, either by paper mail, email (in LaTeX form), or fax; notifications of acceptance should be received by authors BY 1ST DECEMBER 1991; camera ready versions of the final papers are due 15TH JANUARY 1992. Approximately 15 papers will be accepted for presentation at the workshop and inclusion in the book. WORKSHOP INFORMATION: Attendance at the workshop will be limited to around 50 people. The workshop has been timed to follow the Third Conference on Applied Natural Language Processing, being held in Trento, Italy from 1st--3rd April 1992. 
Details of this conference can be obtained from Oliviero Stock, IRST, 38050 Povo (Trento), Italy; Tel: (+39) 461 81444, email: stock@irst.it. The cost of the workshop, including accommodation and meals, is expected to be in the region of $300 per person. Financial support for the workshop is being sought. The workshop is co-sponsored by the Esprit Basic Research Actions and the Special Interest Group on Generation of the Association for Computational Linguistics. Organising Committee: Robert Dale, Eduard Hovy, Dietmar R\"osner and Oliviero Stock. (2) -------------------------------------------------------------------- From: Henry Thompson Date: Fri, 7 Jun 91 10:32:04 BST Subject: Symposium and Panel on Natural Language and Speech, ESPRIT week Symposium on Natural Language and Speech Brussels, PALAIS DES CONGRES, ROOM BENELUX November 26-27, 1991 As a special event organised by ESPRIT Basic Research within the ESPRIT Conference 1991 (November 25-29), a Symposium on Natural Language and Speech will take place in Brussels on November 26-27, 1991. The Symposium will be held from 14.00h to 18.00h on Tuesday 26 November and from 09.00h to 18.00h on Wednesday 27 November. The Symposium will consist of 9 lectures following an introduction by Ewan Klein (University of Edinburgh) and a 90 minute panel discussion. The lectures will be given by the following invited speakers : * Edward Briscoe (University of Cambridge) * Elisabet Engdahl (University of Edinburgh) * Hans Kamp (University of Stuttgart) * Mark Liberman (University of Pennsylvania, Philadelphia) * Chris Mellish (University of Edinburgh) * Fernando Pereira (AT&T Bell Laboratories, New Jersey) * Ivan Sag (Stanford University, California) * Mark Steedman (University of Pennsylvania, Philadelphia) * Johan van Benthem (University of Amsterdam) The lectures will cover a wide range of current research topics in Natural Language and Speech. 
The Symposium will end with a panel session on the topic "Spoken Language Systems: Technological Goals and Integration Issues", chaired and introduced by Henry Thompson (University of Edinburgh). The invited panelists will be: * Jaime Carbonell (Carnegie Mellon University, Pittsburgh) * Sadaoki Furui (NTT, Tokyo) * Jan Lansbergen (Philips, Eindhoven) * Mark Liberman (University of Pennsylvania, Philadelphia) * Christel Sorin (CNET, Lannion) * Walther von Hahn (University of Hamburg) Scientific Coordination: Ewan Klein (University of Edinburgh) Frank Veltman (University of Amsterdam) Introduction, lectures and panelists' statements will be published as a volume of the ESPRIT Basic Research Series (Springer Verlag), to be distributed at the Symposium. Registration Fees: Symposium (only) Before 18 October BFr 8,000 After 18 October BFr 10,000 (Includes 2 lunches and the Proceedings of the Symposium) Symposium and ESPRIT Conference '91 Before 18 October BFr 11,500 After 18 October BFr 13,500 (Includes 3 lunches and the Proceedings of the Symposium and Conference) For registration and further information send name, address and fax number to: E.C.C.O. (European Congress Consultants and Organizers) Rue Vilain XIIII, 17a, B - 1050 Brussels, Belgium Tel : +32 2 647 87 80 Fax :+32 2 640 66 97 (3) -------------------------------------------------------------------- Date: Sat, 8 Jun 91 14:23:10 BST From: morgan%bach.cogsci.uiuc.edu@RICEVM1.RICE.EDU (Jerry Morgan) Subject: Workshop program PROGRAM: WORKSHOP OF COMPUTATIONAL LINGUISTICS AND LINGUISTIC THEORY June 13 - 15, 1991 UNIVERSITY OF ILLINOIS URBANA, IL 61801 For further information: call 217-244-1983 or email to may@kant.cogsci.uiuc.edu [The complete program of this ongoing conference is available on the Listserv. To get it, send the command: get illinois-conf as the first and only line of your message. --Eds.] [End Linguist List, Vol. 2, No. 
0285] ________________________________________________________________ Linguist List, Vol. 2, No. 0286. Thursday, 13 June 1991 Subj: 2.0286 Queries Total: 162 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Mon, 10 Jun 1991 19:02:13 PDT From: Diane_L._Olsen.osbu_north%xerox.com@RICEVM1.RICE.EDU Subject: Frequency list for Arabic/Persian/Urdu/Pashto? (2) Date: Tue, 11 Jun 91 17:56:03 +0200 From: Jan Olsen Subject: Long distance scrambling (3) Date: Wed, 12 Jun 91 17:16:46 CST From: SCHIERH%DHDIBM1.bitnet@RICEVM1.RICE.EDU Subject: machine readable dictionaries (4) Date: Wed, 12 Jun 91 11:18:30 +0100 From: Dr M Sebba Subject: query / request for a favour (5) Date: Wed, 12 Jun 91 19:12:53 +0100 From: "And Rosta (071-387 7050 x3120)" Subject: Munsell Colour chips (6) Date: Thu, 13 Jun 91 14:18:36 EDT From: "Bruce Fraser" Subject: Marker Doubling (7) Date: Thu, 13 Jun 91 16:44:11 CDT From: hi Subject: phonetics (8) Date: Thu, 13 Jun 91 13:44:37 -0900 From: "ACAD3A::FFJAL1" Subject: Moods (1) -------------------------------------------------------------------- Date: Mon, 10 Jun 1991 19:02:13 PDT From: Diane_L._Olsen.osbu_north%xerox.com@RICEVM1.RICE.EDU Subject: Frequency list for Arabic/Persian/Urdu/Pashto? One of my coworkers here at Xerox is making enhancements to our Arabic/Persian/Urdu/Pashto word-processing software and has need of a list (for any of those four languages) of the relative frequency of occurrence of the letters in the alphabet -- something like the "ETAONRISHRDLU..." list for English. Does anyone know where she might find such a list? Please send replies to iwoo.osbu_north@xerox.com, not to me. Thanks in advance! Diane L. 
Olsen Multilingual Development Xerox Corporation (2) -------------------------------------------------------------------- Date: Tue, 11 Jun 91 17:56:03 +0200 From: Jan Olsen Subject: Long distance scrambling I'm trying to find out which syntactic properties (if any) are correlated with the phenomenon of long distance scrambling. Examples of languages with the latter property I know of are: Hindi, Japanese, Korean, Russian, Makua, Turkish and Hungarian. They seem to have one thing in common, but before I'm going to make any claims about universals, I'd like to have a look at more languages - so if you know other languages in which constituents of a complement clause may be scrambled into the matrix clause, please tell me. Gisbert Fanselow (fanselow@unipas.fmi.uni-passau.de) (3) -------------------------------------------------------------------- Date: Wed, 12 Jun 91 17:16:46 CST From: SCHIERH%DHDIBM1.bitnet@RICEVM1.RICE.EDU Subject: machine readable dictionaries The access to machine readable dictionaries seems to be quite different in every language and/or country. Especially for German good machine readable ones are rarities. I'm searching for commercially available mono/bilingual machine readable dictionaries for the main languages (i.e. English, French, Spanish, Italian, Japanese, Dutch, German, ...) and perhaps a short description about the size, linguistic categories (phonetics, morphology, syntax, semantics, pragmatics, ...), medium (CD-ROM, disk, ...), price, and - if possible - an advice of an "expert" about quality. Thanks for any response. MfG Stefan Schierholz (4) -------------------------------------------------------------------- Date: Wed, 12 Jun 91 11:18:30 +0100 From: Dr M Sebba Subject: query / request for a favour Is there any kind person who could supply me with 2000 words (or less) of modern colloquial HAWAIIAN, the subject immaterial, in one of the following forms: 1. Machine readable, e.g. via email 2. 
As a clear, clean photocopy or printout, suitable for feeding into an optical character reader? Mark Sebba Dept. of Linguistics University of Lancaster, Lancaster LA1 4YT, England Telephone (0524) 65201 ext. 2241 (W) (0524) 69223 (H) Fax: (0524) 843085 e-mail: eia023@uk.ac.lancaster.central1 (5) -------------------------------------------------------------------- Date: Wed, 12 Jun 91 19:12:53 +0100 From: "And Rosta (071-387 7050 x3120)" Subject: Munsell Colour chips How can one obtain Munsell colour chips? (6) -------------------------------------------------------------------- Date: Thu, 13 Jun 91 14:18:36 EDT From: "Bruce Fraser" Subject: Marker Doubling I am looking at doubled discourse markers such as English "come come" "now now" "there there" in which the phrase meaning does not reflect the meaning of the doubled constituent. These are different from doublets such as "yes yes" in which the "yes" retains its customary meaning, and different from adjective doublets such as "piano piano" in Italian. Any idea/help from other languages or references would be much appreciated. Thanks in advance. Bruce Fraser SED91LN@BUACCA (7) ------------------------------------------------------------------ Date: Thu, 13 Jun 91 16:44:11 CDT From: hi Subject: phonetics Hi y'all, I need to analyse the frequency, intensity, and time of vowel sounds in French pronounced by American subjects. I am looking for a program on Mac or IBM that could analyse sounds pronounced into the computer. For example, in the sentence "Il n'a pas pu/ parce qu'il avait bu", I want to analyse the segment "pu" in terms of frequency, intensity, and time. (8) -------------------------------------------------------------------- Date: Thu, 13 Jun 91 13:44:37 -0900 From: "ACAD3A::FFJAL1" Subject: Moods Is there an agreed-upon term for moods that express a desire for something to happen: Imperative, Hortative, Prohibitive, etc.? 
The usual term in traditional Western grammars is "Subjunctive" which seems to me rather misleading, as it implies dependent status. Isn't there anything better? And for that matter, is there an agreed-upon taxonomic terminology for moods in general? [End Linguist List, Vol. 2, No. 0286] ________________________________________________________________ Linguist List, Vol. 2, No. 0287. Thursday, 13 June 1991 Subj: 2.0287 Indirect Object agreement Total: 113 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: From: koontz@alpha (John E. Koontz) Date: Tuesday, 11 June Subject: Agreement with IO in preference to DO (2) Date: Tue, 11 Jun 1991 12:09 EST From: Karen Christie Subject: Re: Queries (3) Date: Tue, 11 Jun 91 10:07 PDT From: IBENAJY%MVS.OAC.UCLA.EDU@CORNELLC.cit.cornell.edu Subject: Re: Queries (1) -------------------------------------------------------------------- From: koontz@alpha (John E. Koontz) Date: Tuesday, 11 June Subject: Agreement with IO in preference to DO Jan Olsen inquires: > I'm wondering if there are any spoken languages in which the verb has to > agree with its indirect object, but must not agree with the direct object. > What I have in mind is something like the situation we find in Georgian > ... V has to agree with SU, DO and IO, but there are only two slots for > agreement suffixes. This is the situation in the Mississippi Valley Siouan languages, e.g., Dakotan, Omaha-Ponca, or Winnebago. If a transitive clause includes IO, then verb object person concord is with IO, not DO. In addition, there is a specific pattern of derivation that marks verbs as S + IO verbs rather than S + DO verbs. There is no marking on the object. Most such verbs also have subject person concord, too, though there are some exceptions. There are also some oblique object constructions in which the OO seems not to preempt concord, e.g., in Omaha-Ponca, the applicative.
I am not sure about the concord situation in Mandan and Crow-Hidatsa, where the dative relation is indicated with a serial verb construction based on `to give'. > For 3rd person DOs, this does not create a problem since the pertinent > suffix is phonologically empty. If a clause contains a SU and an IO, 1st > and 2nd person DOs must be replaced by a possessive pronoun + tavi > (_head_), a construction which is 3rd person from a formal point of view > and therefore helps to solve the agreement problem. In Siouan languages one simply can't have a non 3rd DO when there is a DO, as far as I know the data. Rephrasings with two predicates, especially a causative construction, are offered. > And now something completely different: Many writing systems use double > letters to represent e.g. vowel length. Are there any writing systems > which use triple letters? Well, Atsina (ALGONQUIAN) has short, long, and overlong vowels, which Allan Taylor writes with one, two, and three vowel sequences. I seem to recall that the overlong sequences result from loss of intervocalic consonants, and have a restricted range of the potential pitch contours. (2) -------------------------------------------------------------------- Date: Tue, 11 Jun 1991 12:09 EST From: Karen Christie Subject: Re: Queries An indirect response to Jan Olsen's query: I believe the situation for ASL is that IF the NPs are set in space that certain verbs (such as GIVE) MUST agree with the Object..true... It CAN show agreement with the subject but that is not required....It seems that certain other types of verbs such as SEE, TELL, INFORM, TATTLE only have object agreement (cannot have subject agreement incorporated into the verb). To echo Jan Olsen's query, I wonder if any highly inflectional languages have this type of object agreement (Serbo-Croatian...Navajo..Hebrew??)
(3) -------------------------------------------------------------------- Date: Tue, 11 Jun 91 10:07 PDT From: IBENAJY%MVS.OAC.UCLA.EDU@CORNELLC.cit.cornell.edu Subject: Re: Queries RE: Jan Olsen's questions on agreement There are a number of American Indian languages I know of which you may find interesting (and I think someone has already noted the parallel with ASL, though I forget where). In the Yuman languages, for instance, there is no formal distinction between "direct" and "indirect" objects, but in verbs which may take two objects it is always the semantic dative (recipient, benefactive, whatever) which agrees on the verb. Some of these languages have an independent plural object prefix which may cooccur with person agreement for the indirect object to mark a plural direct object, but speakers generally are simply unwilling to translate sentences with non-third-person direct objects for such verbs (they resort to paraphrase, which generally works fine). Muskogean languages have a formal distinction between "direct" and "indirect" objects for many verbs, expressing these with agreement prefixes from what are often called the II and III agreement series. In some languages these may cooccur, but in Chickasaw, the language I know best, a verb may not have an overt prefix from both of these series. The third persons are zero, so it's fine to say 'I sent you' or 'I sent him to you', but you can't say 'He sent me to you'. Again, speakers resort to paraphrase; again, if you want to express both an agreeing indirect object and an agreeing direct object, the indirect object wins. Pam Munro [End Linguist List, Vol. 2, No. 0287] ________________________________________________________________ Linguist List, Vol. 2, No. 0288.
Thursday, 13 June 1991 Subj: 2.0288 Borrowed Pronouns Total: 125 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Tue, 11 Jun 91 22:35:08 PDT From: suzanne@garnet.berkeley.edu Subject: Re: Queries (2) Date: Mon, 10 Jun 1991 21:58 EST From: Fan mail from some flounder? Subject: Re: Queries (3) Date: Mon, 10 Jun 91 23:08:01 PDT From: poser@csli.Stanford.EDU (Bill Poser) Subject: borrowed pronouns (4) Date: Tue, 11 Jun 1991 16:50 MST From: Susanna Cumming Subject: pronouns (5) Date: Wed, 12 Jun 91 11:15:28 +0100 From: Dr M Sebba Message-Id: <3423.9106121015@central1.lancaster.ac.uk> (1) -------------------------------------------------------------------- Date: Tue, 11 Jun 91 22:35:08 PDT From: suzanne@garnet.berkeley.edu Subject: Re: Queries Reply to Elise Morse Gagne re borrowing of pronouns The French indefinite pronoun ON, though not strictly borrowed from Germanic, is generally regarded to have been calqued on Germanic MAN. --Suzanne Fleischman (2) -------------------------------------------------------------------- Date: Mon, 10 Jun 1991 21:58 EST From: Fan mail from some flounder? Subject: Re: Queries With regard to Vol. 2, # 279, ASL has several interesting situations where there aren't enough morphological slots on the verb to accommodate the number of necessary morphosyntactic spaces. You can end up with a serial verb (described by Ted Supalla, ref provided on request), or what Wynne Janis and I term a "verb sandwich," where the verb splits in two to accommodate more stuff. Morris Halle has also discussed this sort of phenomenon in a presentation I heard last year, and specifically has a different analysis of Anderson's Georgian data.
Re borrowing of pronouns, I was told about 15 years ago (and I speak from virtual ignorance here, but when did that ever stop me) that Thai borrowed the English word "you", largely in order not to have to make a statement about relative social status every time there was a conversation. Susan Fischer (3) -------------------------------------------------------------------- Date: Mon, 10 Jun 91 23:08:01 PDT From: poser@csli.Stanford.EDU (Bill Poser) Subject: borrowed pronouns The Japanese pronoun /boku/ (casual 1st person masculine singular) is a loan from Chinese. However, it wasn't a pronoun in Chinese. Rather, it meant "slave". It's as if English had borrowed "your servant" from another language and eventually turned it into a pronoun. Bill Poser (4) -------------------------------------------------------------------- Date: Tue, 11 Jun 1991 16:50 MST From: Susanna Cumming Subject: pronouns In relation to Elise Morse-gagne's query about borrowed pronouns: widespread use of borrowed pronouns can be found in Southeast Asian languages which have elaborate honorific systems. For instance, although Indonesian has reflexes of the Austronesian pronouns, in Java at least these are of fairly limited use; they are only appropriate for the most intimate situations. Speakers commonly replace them with honorific forms including indigenous kinship terms (bapak 'father', ibu 'mother'), borrowed kin terms (Javanese mbak 'elder sister', mas 'elder brother'; Dutch Om 'uncle', Tante 'aunt'), borrowed pronouns (yu < English you, gua/lu < Hokkien I/you), and various other items (saya 'I' from Sanskrit sahaya 'follower, slave', Tuan 'you' from Arabic tuhan 'lord', etc.) The inventory differs in different parts of the Malay-speaking world, but at least in the areas where there is a court tradition a large number of options (expressing a very sensitive response to differences in social status) is typical. Most of the forms can be used for second, third, and even first person reference. 
And from a syntactic point of view they must be treated as pronouns: not only do they have the characteristic discourse functions of pronouns, but they can procliticize to the verb in the "passive" construction -- a position which is not possible for ordinary lexical nouns. (5) -------------------------------------------------------------------- Date: Wed, 12 Jun 91 11:15:28 +0100 From: Dr M Sebba Message-Id: <3423.9106121015@central1.lancaster.ac.uk> Borrowed pronouns Elise Emerson Morse-Gagne asks about borrowed pronouns. The English pronouns I and YOU (in these forms, invariant) appear to have been borrowed into the Malay of many speakers. Whether this is true mainly or exclusively for bilinguals who know at least some English as well, I can't say, though I think it must be pretty widespread in colloquial Malay (Bahasa Malaysia) among educated people. This serves the pragmatic function of not requiring the speaker to choose between the large number of first and second person pronouns available in Malay, which require the speaker to make delicate judgments of status, solidarity, etc. Mark Sebba Dept. of Linguistics University of Lancaster, Lancaster LA1 4YT, England Telephone (0524) 65201 ext. 2241 (W) (0524) 69223 (H) Fax: (0524) 843085 e-mail: eia023@uk.ac.lancaster.central1 [End Linguist List, Vol. 2, No. 0288] ________________________________________________________________ Linguist List, Vol. 2, No. 0289. 
Thursday, 13 June 1991 Subj: 2.0289 Jobs Total: 174 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Thu, 13 Jun 91 15:18:27 +0200 From: busemann%dfki.uni-sb.de@RICEVM1.RICE.EDU (Stephan Busemann) Subject: Job Openings (2) Date: Thu, 13 Jun 91 15:16:29 +0200 From: busemann%dfki.uni-sb.de@RICEVM1.RICE.EDU (Stephan Busemann) Subject: Job Openings in Computational Semantics at DFKI (1) -------------------------------------------------------------------- Date: Thu, 13 Jun 91 15:18:27 +0200 From: busemann%dfki.uni-sb.de@RICEVM1.RICE.EDU (Stephan Busemann) Subject: Job Openings Re: Job Openings in Computational Linguistics at DFKI, Saarbruecken Research Associates (Wissenschaftlicher Mitarbeiter) in the Project DISCO We are looking for additional researchers to join the project DISCO located at the German Research Center for Artificial Intelligence (Deutsches Forschungszentrum fuer Kuenstliche Intelligenz, DFKI) at Saarbruecken. The posts should be filled by January 1992. DISCO (DIalogue Systems for autonomous COoperating agents) is a four year project (started January '90) funded by the German Minister for Research and Technology (BMFT). The main objective is to develop a natural language dialogue system for multiple cooperating agents. Besides strategies for parsing and generation, a grammar and a lexicon are presently being developed using a formalism close to that of HPSG. In the second phase of the project, the constraint-based approach will be extended to include the treatment of dialog phenomena and non-linguistic knowledge. Current project members are Rolf Backofen, Stephan Busemann, Hans-Ulrich Krieger, John Nerbonne, Klaus Netter, Harald Trost and Hans Uszkoreit (PI). Ideally, we are looking for computer scientists or computational linguists with a good theoretical and practical background in AI and natural language processing. 
Applicants should be able to implement in Common Lisp and have experience in one or more of the following areas: -- treatment of dialogue phenomena (including dialogues between multiple agents) -- relating conceptual knowledge (a domain model) to linguistic knowledge for interpretation and generation of dialogue steps -- parsing and interpretation of natural language input -- development of large natural language systems -- description of German in HPSG (or similar framework). DFKI is a non-profit organization which was founded in 1988 by its shareholder companies ADV/Orga, AEG, IBM, Insiders, Fraunhofer Gesellschaft, GMD, Krupp-Atlas, Mannesmann-Kienzle, Nixdorf, Philips and Siemens. Research projects conducted at the DFKI are funded by the BMFT, by the shareholder companies, or by other industrial contracts. The DFKI conducts application-oriented basic research in the field of AI and other related subfields of computer science. The overall goal is to construct systems with technical knowledge and common sense which--by using AI methods--implement a problem solution for a selected application area. From its beginning, the DFKI has provided an attractive working environment for AI researchers from Germany and from all over the world. Several project groups work in the area of natural language processing. The Saarbruecken site offers an excellent setting for research in computational linguistics, AI, and computer science. Several research projects are conducted at the university's Computational Linguistics department. The university has one of the best Computer Science departments in Germany. Natural language processing is one of the main strengths of the department's AI lab. Furthermore a newly founded Max-Planck Institute for Computer Science with a focus in Parallel Processing has been set up on the Saarbruecken campus. The employment is not restricted to the duration of the project DISCO. The salary will mainly depend on the applicant's scientific experience.
Applications with the usual documents should be sent electronically or by post to Prof. Dr. Hans Uszkoreit DFKI GmbH Stuhlsatzenhausweg 3 W 6600 Saarbruecken 11 Germany uszkoreit@coli.uni-sb.de (2) -------------------------------------------------------------------- Date: Thu, 13 Jun 91 15:16:29 +0200 From: busemann%dfki.uni-sb.de@RICEVM1.RICE.EDU (Stephan Busemann) Subject: Job Openings in Computational Semantics at DFKI Researchers (Wissenschaftlicher Mitarbeiter) in ASL We are looking for researchers to join the project ASL (Architectures for Speech and Language) located at the German Research Center for Artificial Intelligence (Deutsches Forschungszentrum fuer Kuenstliche Intelligenz, DFKI) in Saarbruecken. The posts should be filled by January 1992. ASL is a four-year project (started January '91) funded by the German Ministry for Research and Technology (BMFT). The main objective is to combine speech and natural language technologies with an eye to improving the accuracy of the former and the utility of the latter. The project involves teams throughout Germany and is led by a group at Hamburg under Prof. Walther von Hahn. The leading scientific idea of the project is to encode information about natural language in a declarative fashion so that its use in processing is subject to the needs of recognition. The task of the Saarbruecken project will be to define, design, implement and maintain a semantics and (contextual) pragmatics component in ASL. Because constraint-based methods are called for, and because there is an opportunity to work on semantics and (contextual) pragmatics in a single component, we are interested in exploring a situation semantics approach. Ideally, we are looking for computer scientists or computational linguists with good theoretical and practical backgrounds in AI and natural language processing. 
Applicants should normally be able to implement in Common Lisp, and should have experience in some of the following areas: --- semantics of suprasegmentals (stress, intonation) --- linguistic description in constraint-based models --- discourse context modeling, especially in situation semantics --- resolution of anaphora and disambiguation --- implementation of logics for meaning representation DFKI is a non-profit organization which was founded in 1988 by the shareholder companies ADV/Orga, AEG, IBM, Insiders, Fraunhofer Gesellschaft, GMD, Krupp-Atlas, Mannesmann-Kienzle, Nixdorf, Philips and Siemens. Research projects conducted at the DFKI are funded by the BMFT, by the shareholder companies, or by other industrial contracts. The DFKI conducts application-oriented basic research in the field of AI and other related subfields of computer science. The overall goal is to construct systems with technical knowledge and common sense which--using AI methods--implement a problem solution for a selected application area. From its beginning, the DFKI has provided an attractive working environment for AI researchers from Germany and from all over the world. Several project groups work in the area of natural language processing. The Saarbruecken site offers an excellent research environment for anyone interested in computational linguistics, AI, and computer science. Several research projects are conducted at the university's Computational Linguistics department. The university has one of the best Computer Science departments in Germany. Natural language processing is one of the main strengths of the department's AI lab. Furthermore a newly founded Max-Planck Institute for Computer Science with a focus in Parallel Processing has been set up on the Saarbruecken campus. The employment is not restricted to the duration of the project ASL. The salary mainly depends on the applicant's scientific experience.
Applications with the usual documents should be sent electronically or by post to John Nerbonne DFKI GmbH Stuhlsatzenhausweg 3 W 6600 Saarbruecken 11 Germany nerbonne@dfki.uni-sb.de [End Linguist List, Vol. 2, No. 0289] ________________________________________________________________ Linguist List, Vol. 2, No. 0290. Thursday, 13 June 1991 Subj: 2.0290 For Your Information Total: 191 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Wed, 12 Jun 91 11:29:34 EDT From: "Edwin S. Segal" Subject: Volunteer Opportunity (2) Date: Fri, 14 Jun 91 12:03:12 est From: sr_willing%vaxa.mqcc.mq.oz.au@RICEVM1.RICE.EDU Subject: Interest Groups (3) From: Arnold D J Date: Mon, 10 Jun 91 09:39:55 BST Subject: Special Issue on Evaluation of NLP Systems (1) -------------------------------------------------------------------- Date: Wed, 12 Jun 91 11:29:34 EDT From: "Edwin S. Segal" Subject: Volunteer Opportunity The Center for Applied Research in African Languages, a nonprofit organization dedicated to African development, seeks VOLUNTEERS to help develop electronic materials in orthography, text analysis, database compilation and linguistic geography. Contact: Stanley Lewis Cushingham, Director, 162 West Rock Avenue, New Haven, CT 06515-2223; (203) 389-8650. (2) -------------------------------------------------------------------- Date: Fri, 14 Jun 91 12:03:12 est From: sr_willing%vaxa.mqcc.mq.oz.au@RICEVM1.RICE.EDU Subject: Interest Groups from: Ken Willing Macquarie University Sydney, Australia June 15, 1991 The following excellent Interest Groups now exist, in the area of Applied Linguistics, Sociolinguistics, Linguistics, and Communication: TESL-L (Teaching English as a Second Language) SLART-L (Second Language Acquisition Research and Teaching) MULTI-L (Language and Education in Multicultural Settings) LTEST-L (Language Testing Research and Practice) LINGUIST (The Linguist Discussion List) COMSERVE (Communication. 
Includes: "Intercultural" and "Ethnomethodology")
===============================================================================
To find out more about any one of these, from Internet or Bitnet send an e-mail letter to:

               Internet                         Bitnet
               ---------------------            ----------------
for TESL-L     listserv@cunyvm.bitnet           listserv@cunyvm
for SLART-L    listserv@psuvm.bitnet            listserv@psuvm
for MULTI-L    listserv@vm.biu.ac.il            listserv@barilvm
for LTEST-L    listserv@uclacn1.oac.ucla.edu    listserv@uclacn1
for LINGUIST   listserv@tamvm1.tamu.edu         listserv@tamvm1
for COMSERVE   support@vm.ecs.rpi.edu           support@rpiecs

++Note: The text of your letter should consist only of the words:

    subscribe XXXXXX John Doe

where XXXXXX is the list-name (e.g. TESL-L), and John Doe is your name.
===============================================================================
You'll be sent an introductory help-file. If the Interest Group ("List") turns out not to be your cup of tea, then simply go through the above procedure again, except say "unsubscribe XXXXXX" instead of "subscribe" (no need to give your name this time).
===============================================================================
Names of other participants in a particular "List" are publicly available by saying only

    review XXXXXX

in a letter to the relevant Listserv. [This doesn't work for the "Comserve" group.]
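The letter texts described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the only thing taken from the post is the wording of the three commands (subscribe, unsubscribe, review). Sending the letter to the right address is left to your mail program.

```python
# Illustrative sketch only: these helper names are hypothetical.
# The command wording ("subscribe", "unsubscribe", "review") follows
# the instructions given in the post above.

def subscribe_body(list_name: str, full_name: str) -> str:
    """Text of a letter asking to join a list."""
    return f"subscribe {list_name} {full_name}"


def unsubscribe_body(list_name: str) -> str:
    """Text of a letter asking to leave a list (no name needed)."""
    return f"unsubscribe {list_name}"


def review_body(list_name: str) -> str:
    """Text of a letter asking for the list of participants."""
    return f"review {list_name}"


if __name__ == "__main__":
    # Example: the letter text for joining TESL-L as John Doe.
    print(subscribe_body("TESL-L", "John Doe"))
```

Each function returns the complete text of the letter, since the post says the letter should contain nothing else.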
=============================================================================== P.S.: I have made up, for my own reference, an "addressbook" file of participants in all the above Interest Groups [except Comserve], put together into one integrated alphabetical listing, by surname. It is current to June 1991, and shows only: last_name first_name e-mail_address The file in its present state contains some 1500 Applied-Linguistics-related names, world-wide. I am finding it very useful, for locating the e-mail address of someone whom I know, or know of, or have read something by, and whom I would like to contact. I'm sure others would find it useful too, for that purpose. So I would in principle be happy to make the file available as an e-mail message (60K) by request to me (or by anonymous ftp..). ... The problem is, I'm a little bit concerned -- naturally I wouldn't want the file to be mis-used... say, for quasi-commercial purposes or any other nuisance. On the other hand, all the names-and-addresses are in fact already a matter of public record (it's just that my list is integrated and alphabetical). Do you think I could safely offer it on the net? Any comments? Cheers, Ken Willing Macquarie University Sydney, Australia (3) -------------------------------------------------------------------- From: Arnold D J Date: Mon, 10 Jun 91 09:39:55 BST Subject: Special Issue on Evaluation of NLP Systems SPECIAL ISSUE ON EVALUATION The journal APPLIED COMPUTER TRANSLATION is dedicating a special issue to the topic of Evaluation of Natural Language Processing (NLP) systems, under our editorship. Contributions which deal with any facet of the topic are invited. (It is intended that the issue should focus on, but NOT BE RESTRICTED TO, Evaluation of Machine Translation Systems). The evaluation of systems is an essential, but still relatively undeveloped area of NLP. 
Its importance to potential end-users of systems is obvious, but it is just as important to those who develop systems, and to the field as a whole, since without sensible measures of evaluation the whole idea of `progress' is problematic. Some important issues for evaluation include: - the purposes of evaluation - identification of capabilities to be evaluated - function and design of test suites - metrics for (translation) quality and their computation - the role and principles of error analysis in development - design and standardisation of evaluation metrics interpretable by potential system users - identification of text types - studies in corpus linguistics and construction frequency Contributions dealing with other aspects of evaluation are also welcome. Contributions should be in English, and will be reviewed in the normal way (by two independent referees; the Journal aims to give a decision about publication within six weeks). The timescale for this issue is somewhat open, but we would hope for publication by the end of this year. Potential contributors should contact one of the editors of the special issue, at the address below. More information about the journal itself can be obtained from the same address, or from the general editor: Tony McEnery (mcenery@comp.lancs.ac.uk). Doug Arnold (doug@essex.ac.uk; +44 206 872084) Lee Humphreys (lee@essex.ac.uk; +44 206 872086) Louisa Sadler (louisa@essex.ac.uk; +44 206 872082) Department of Language and Linguistics, UNIVERSITY OF ESSEX Wivenhoe Park, Colchester, UK CO4 3SQ Telex 98440 (UNILIB G) Fax: +44 206 873598 Tel +44 206 872083 Email from different networks: The following addresses are for Arnold. Changes to the individual name (doug/lee/louisa) will give the addresses of the other editors: doug%essex.ac.uk@ean-relay.ac.uk (ean); doug%essex.ac.uk@cunyvm.cuny.edu (arpa); doug%essex.ac.uk@ac.uk (earn); ...!ukc!essex.ac.uk!doug (uucp). [End Linguist List, Vol. 2, No. 
0290] ________________________________________________________________ Linguist List, Vol. 2, No. 0291. Thursday, 13 June 1991 Subj: 2.0291 Conferences Total: 294 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Tue, 11 Jun 91 14:36:38 -0400 From: walker%flash.bellcore.com@RICEVM1.RICE.EDU (Don Walker) Subject: TRANSLATION AND THE EUROPEAN COMMUNITIES (2) Date: Tue, 11 Jun 91 16:04:40 BST From: LNP6PJR%CMS1.LEEDS.AC.UK@RICEVM1.RICE.EDU Subject: Bulgarian Conference (1) -------------------------------------------------------------------- Date: Tue, 11 Jun 91 14:36:38 -0400 From: walker%flash.bellcore.com@RICEVM1.RICE.EDU (Don Walker) Subject: TRANSLATION AND THE EUROPEAN COMMUNITIES Conference 22-23 June 1992 near Stockholm on TRANSLATION AND THE EUROPEAN COMMUNITIES New scale, New problems, New challenges, New openings - and first and last: Changes! The deeper integration and the wider geographical scope of the European Communities are rapidly creating a very different Europe. Not least, the language situation is being radically remodeled. The first to be affected are the professional translators, along with those who buy or organize translation or provide tools and services for the purpose. Also, the translators are among the few who can immediately affect the galloping development - and give early warnings. There are some worries. Thus, "old" members of the communities report that the convergence has given rise to new linguistic barriers: bureaucrats and politicians have in many domains developed a "Eurospeak", with different versions claimed to be English, French, German etc, which are just barely intelligible to unspoilt native readers and writers of the national languages of Europe. What are the effects of these converging and diverging tendencies on the political, economic and social life in Europe? And on the national languages?
No European country will escape the consequences of this bureaucracy-based revolution. Thus, Sweden, not yet a member country, is translating a set of community documents tantamount to ten years' yield of statutory law in the country. Can the Swedish language with impunity assimilate this influx of new texts, concepts and words? Without - as is the case now in Sweden and in most "new" member countries - a co-ordinated planned terminological effort? The peaceful amalgamation of autonomous countries into a unified albeit pluralingual entity without Herrenvolk and lingua franca seems to be unique in history. Will it remain so? Is it an experiment worth observing for other regions which contemplate becoming more of one region than a multitude of neighbours? When is multilingualism a recommendable proposal? To address questions like these, the Swedish Association of Authorized Translators, FAT, in conjunction with the Committee for Linguistics (FID/LD) within the International Federation for Information and Documentation, FID, is organizing a conference, 22-23 June 1992, at Biskops Arnoe, just outside Stockholm, Sweden. Papers focussing on some specific aspect of this theme are invited. In particular, we welcome comments on the following issues: * A pluralingual community and its impact on the national languages * Terminological support and language control * The translation market after 1992 * Translation aids in a multilingual environment The spoken language at the conference will be English: we regret that we shall not have the resources to provide interpretation. Accepted papers will be discussed at the conference and included in revised form in a book summarizing the findings and results of the conference meetings. Papers may be submitted in any official language of the Communities and a summary in some other language of the Communities should be appended. 
Whether presenting a paper or not, participants with experience of translating and translation-related problems are welcome. To warrant an atmosphere promoting interaction rather than soliloquies attendance is restricted to about 70 persons from all countries, so please indicate your interest at your earliest convenience. TRANSLATION AND THE EUROPEAN COMMUNITIES CONFERENCE 22-23 June 1992 Biskops Arnoe Manor near Stockholm, Sweden DATES: * Submission of draft of papers: before January 15, 1992. * Decision by Programme Committee: before March 1, 1992. * Delivery of camera-ready text of paper: before May 15, 1992. * Payment of subscription dues: before April 15, 1992. * Arrival at conference venue: Sunday evening, June 21, 1992. * Working sessions: Monday and Tuesday, June 22 and 23, 1992. * Departure: Wednesday, June 24, 1992. PRELIMINARY CONFERENCE SCHEDULE Friday & Saturday June 19-20: Pre-conference Social Programme Sunday June 21: From 1 p.m. Arrival & Registration From 7 p.m. Informal Gathering Monday June 22: Presentation of papers; discussions Evening: Panel discussion Tuesday June 23: Presentation of papers; discussions Evening: Banquet. Wednesday June 24: Breakfast & Departure Thursday June 25: Study visits to translation companies and documentation departments in Sweden THE CONFERENCE PROGRAMME Papers to be considered for inclusion in the conference programme should be sent to the Programme Committee. Please send a draft of the full text - not more than 12,000 characters - as a plain ASCII text to e-mail COLING@COM.QZ.SE or on paper in quintuplicate. The Programme Committee, after reviewal, will make its decision not later than 1 March and notify the author(s) by e-mail, fax or telex at the address indicated for that paper in the heading of the paper. Each paper should focus on some specific issue of translation in the new Europe. 
It should report about a recent or expected change in the organization, conditions and market for translation, describe, suggest or criticize tools and methodware for translation or debate crucial problems of language policy and planning influencing or influenced by translation activities. We shall not have space on this occasion for papers, whatever their merits, on translation theory or practice in general. Accepted papers will be reproduced and distributed to the participants on arrival and discussed at the conference. The papers, after revision, together with the results emerging from the conference will be published in the form of a book which is expected to become a work of reference for everybody interested in translation and language problems in Europe. Resources will be available for software demonstrations during the meeting. Proposals for demonstrations should be submitted to the programme committee in the manner described for papers. Please indicate what computational environment your demonstration will require. All participants as well as non-participating persons or organizations are invited to exhibit relevant literature and reports at a book show during the conference. If possible, bring copies for participants to pick up. One copy of each item presented will be retained by the organizers for future reference. For possible commercial exhibition and demonstration of products or services, please contact the organizers about terms and conditions. CONFERENCE TIMES AND VENUE The conference will be held at an old manor, Biskops Arnoe, built on the site of an ancient castle - the Bishop's Eagle Isle, to translate the name - some 14th century vaults of which are still, as we shall see, extant.
It belongs today to "Norden", an Intra-Scandinavian cultural association which organizes training courses and meetings on topics of mutual interest to the whole Scandinavian area (and which finds the topic of this conference on the linguistic situation in Europe, including Scandinavia, highly pertinent). For the purpose of such activities, a building for meetings and a number of bungalows for accommodation have been added. The facilities are modest but modern, with a bathroom in each room or in every 4-roomed bungalow. Biskops Arnoe is situated in a rural environment, on a small island of its own in the large lake Maelaren, about an hour's drive from Stockholm City and about the same distance from Arlanda International Airport. The dates for the conference were chosen because Scandinavia is very bright and attractive at that time, so that the business visit can be combined with a tourist trip in Sweden. Accompanying persons can be accommodated on the conference site at a moderate extra charge, and they will be given ample opportunity to explore the surroundings, fraught with historical memories. Within a few miles of friendly land- and seascape visitors will find runic stones, medieval churches, castles, and Sweden's oldest city; within less than 70 km they will also find Sweden's oldest university as well as its present capital. The conference begins on the Monday following Midsummer, which is celebrated during two intense days and nights in Scandinavia. The participants are expected to drop in during the Sunday, and enjoy an informal gathering in the evening. Pre-Conference Registration To warrant an intimate atmosphere for open-minded constructive discussion, attendance is restricted to about 70 persons. Place will be reserved on a first-order-first-served basis. To be valid, registration must be followed by payment for the full conference documentation not later than April 15. One set of the documentation must be purchased and paid for by each participant. 
No additional participation fee is required. If for any reason a subscriber cannot attend the meeting, the documentation will be mailed to his address. The payment will not be refunded. Active participants will be provided with accommodation, meals and transportation on a complimentary basis. They are expected to pay for their own transportation to Biskops Arnoe, for beverages and for telephone and similar personal expenses. For accompanying persons a minor charge, 250 ecu, for meals will be made. The price for all relevant documentation, including the printed after-conference report on "Translation in the New Europe", is 500 ecu for subscribers paying before April 15. Otherwise the price is 750 ecu. The final report will be distributed through a commercial publisher. All payments should be credited to Eurofat AB, Account number 333 14 31, Skandinaviska Enskilda Banken, Stockholm, clearing number 5244. For subscribers in Sweden, VAT must be added. If you wish to be billed for the amount, please instruct us on the appropriate receiver and address of such a bill. ADDRESSES All correspondence concerning the conference prior to the meeting should be addressed to Eurofat AB, which is a company formed by FAT for this particular purpose. 
Its addresses are:

EUROFAT AB:
e-mail: COLING@COM.QZ.SE,
telex: 15 440 KVAL S
fax: +46 8 796 96 39
voice: +46 8789 66 83
paper mail: Skeppsbron 26, S-111 30 Stockholm, Sweden

ORGANIZING COMMITTEE:
President of FAT: Leif Oestling
Coordinator: Hans Karlgren
Liaison Officers: Joachim Wesseloh, David Knight, Jean Heyum
Press Officer: Kerstin Ingmansson
Conference Treasurer: Bo Widegren
Conference Secretary: Katrin Sundius-Nordin
Post-Conference Study Visit Manager: Matti Jaernare
Travel and Pre- & Post-conference Tours Advisor: Heidemarie Nyrn
Cultural Programme: Adolf Dahl
Registration: Gerd Mller-Nordin

During the conference, participants can be reached using the following address:
Folkhogskolan Biskops Arnoe
S-198 00 Baalsta, Sweden
Phone: +46-171-522 60

(2) --------------------------------------------------------------------
Date: Tue, 11 Jun 91 16:04:40 BST
From: LNP6PJR%CMS1.LEEDS.AC.UK@RICEVM1.RICE.EDU
Subject: Bulgarian Conference

I have been asked by a Bulgarian colleague to publicise the following announcement.
Peter Roach, Leeds Univ.

The Third International School in Sociolinguistics is to be held on 9-12 Sept. 1991 in Veliko Tirnovo, Bulgaria. The topic is: Language Situation in Micro- and Macro-Social Communities. Participants are expected to contribute a 10-minute paper, and will have to cover their own travel and accommodation costs, plus living expenses estimated at $70 for the whole period. All materials will be sent to the participants free of charge. Sociolinguists from all over the world have been invited. Applications to the following address as soon as possible:
Naouchno-metodicheski suvet po Bulgaristika,
(Ground Floor, room 4)
49, Moskovska Street,
1000 SOFIA
Bulgaria

[End Linguist List, Vol. 2, No. 0291]
________________________________________________________________
Linguist List, Vol. 2, No. 0292.
Friday, 14 June 1991
Subj: 2.0292 Last Posting on Tongue Twisters
Total: 175 lines
Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au)
            Helen Dry (1echad@utsa86.utsa.edu)

(1) Date: Mon, 10 Jun 91 20:27:51 PDT
    From: tshannon@garnet.berkeley.edu
    Subject: Re: More Tongue Twisters
(2) Date: 11 Jun 91 13:04:00 EST
    From: "ELISE EMERSON MORSE-GAGNE"
    Subject: Stahlke's pig-- er, fig-pluckers
(3) Date: Wed, 12 Jun 1991 14:39 EDT
    From: KSRIDHAR%SBCCMAIL.bitnet@RICEVM1.RICE.EDU
    Subject: Kannada tongue twisters
(4) Date: Wed, 12 Jun 91 17:41 EST
    From: MANGO%PINE.CIRCA.UFL.EDU@RICEVM1.RICE.EDU
    Subject: Tongue twister

(1) --------------------------------------------------------------------
Date: Mon, 10 Jun 91 20:27:51 PDT
From: tshannon@garnet.berkeley.edu
Subject: Re: More Tongue Twisters

At the risk of multiplying trivia beyond necessity, let me contribute my list of German tongue-twisters. I do not guarantee that the versions given here are the most usual ones, or even that all of them are equally known in German-speaking countries. Some of them in fact might not truly be considered real tongue-twisters. However, perhaps they'll give a bit of amusement to a few colleagues. My favorite is the one with "der dicke Diener" (that one I can do); my wife's is "Brautkleid" (that one she can do and I simply can't!). With that, I end my contribution to tongue-twister mania.

In Ulm und um Ulm und um Ulm herum.
'In Ulm and around Ulm and all around Ulm.'

Neun Naehnadeln naehen neun Nachtmuetzen.
'Nine sewing needles sew nine night caps.'

Tausend tapfere Tuerken trotzten tapfer den Templern.
'(A) thousand brave Turks bravely defied the templars.'

Thomas trank tausend Tropfen; tausend Tropfen trank Thomas.
'Thomas drank (a) thousand drops; a thousand drops drank Thomas.'

Der Potsdamer Postkutscher putzt den Potsdamer Postkutschwagen.
'The Potsdam post coachman washes the Potsdam post coach.'

Der Kottbusser Postkutscher putzt den Kottbusser Postkutschwagen.
'The Kottbus post coachman washes the Kottbus post coach.'

Roland der Riese vorm Rathaus zu Bremen.
'Roland the Giant before the city hall in Bremen.'

Dreihundertdreiunddreissig Rotten reitender Ritter rasten rufend dreihundertdreiunddreissig mal ums riesige rote Rathaus.
'333 troops of riding knights hurried calling 333 times around the giant red city hall.'

Hans hoert hinterm Holzhaufen hundert heisere Hasen husten.
'Hans hears behind the wood pile (one) hundred hoarse hares coughing.'

Ein Student mit Stulpenstiefeln stolperte am Stein und starb.
'A student with top-boots tripped on a stone and died.'

Messwechsel und Wachsmaske; Wachsmaske und Messwechsel.
'Mass change and wax mask; wax mask and mass change.'

Brautkleid bleibt Brautkleid, und Blaukraut bleibt Blaukraut.
'Bride's dress remains bride's dress, and blue (= red!) cabbage remains blue cabbage.'

Gleich bei Blaubeuren liegt ein Kloetzchen Blei, ein Kloetzchen Blei liegt gleich bei Blaubeuren.
'Right by Blaubeuren lies a clump of lead, a clump of lead lies right by Blaubeuren.'

Die Katze tritt die Treppe krumm.
'The cat kicks the stairs crooked.'

In der Fruehe faengt Fischers Fritze frische Fische; Fischers Fritze faengt in der Fruehe frische Fische.
'In the morning Fischer's Fritz catches fresh fish; Fischer's Fritz catches fresh fish in the morning.'

Zwischen zweiundzwanzig schwankenden Zwetschgenzweigen schweben zweiundzwanzig schwarze, zwitschernde Schwalben.
'Between 22 waving plum branches float 22 black, chirping swallows.'

Bernhard Brunos, buergerlichen Brauhausbesitzers bei Braunau, beruehmte bayrische Bierhymne beginnt: Biedere brave Bierbrauerburschen bereiten bestaendig bitteres, braunes bayrisches Bier.
'Bernhard Bruno, bourgeois brewery owner in Braunau's famous Bavarian beer hymn begins: honest good brewery fellows continually make bitter, brown Bavarian beer.'

Der duenne Diener traegt die dicke Dame durch den dicken Dreck.
Da dankt die dicke Dame dem duennen Diener, dass der duenne Diener die dicke Dame durch den dicken Dreck getragen hat. 'The thin servant carries the fat lady through the thick mud. Then the fat lady thanks the thin servant that the thin servant carried the fat lady through the thick mud.' Wir Wiener Waschweiber wuerden wohl weisse Waesche waschen, wenn wir wuessten, wo weiches, warmes Wasser waer. 'We Viennese washing women would wash white wash, if we knew where soft warm water was.' Drei Teertonnen, drei Trantonnen; drei Trantonnen, drei Teertonnen. 3 tar barrels (tons), 3 blubber barrels (tons); 3 tar barrels (tons), 3 blubber barrels (tons). Mariechen sagte zu Mariechen: "Mariechen, lass Mariechen mal riechen!" Da liess Mariechen Mariechen mal riechen. 'Little Mary said to Little Mary: "Little Mary, let Little Mary take a smell." Then Little Mary let Little Mary take a smell.' tom shannon, uc berkeley (2) -------------------------------------------------------------------- Date: 11 Jun 91 13:04:00 EST From: "ELISE EMERSON MORSE-GAGNE" Subject: Stahlke's pig-- er, fig-pluckers I used to play a game with my cousins which went as follows. There are two players; the first instructs the second to say "shistl" (the t is pronounced) whenever the first person says "pit", and "pit" whenever she says "shistl", responding as fast as possible each time. The the first player (sorry, that's "then the") sneaks the word "pistol" into the sequence, and no matter how on guard the second player is, they can't help coming out with "shit." This goes over very well among 8-10 year olds. I have never ceased to be amazed at the inevitability of the swear-word once the stimulus is introduced. 
Elise Morse-Gagne (3) -------------------------------------------------------------------- Date: Wed, 12 Jun 1991 14:39 EDT From: KSRIDHAR%SBCCMAIL.bitnet@RICEVM1.RICE.EDU Subject: Kannada tongue twisters Here is a Kannada tongue twister: ondu tarikere kere e:ri mele ondu kari kurimari me:yuttittu. On the bank of the Tarikere tank, a black ewe was grazing! (4) -------------------------------------------------------------------- Date: Wed, 12 Jun 91 17:41 EST From: MANGO%PINE.CIRCA.UFL.EDU@RICEVM1.RICE.EDU Subject: Tongue twister Herb Stahlke asked for tongue twisters similar to "Fig pluckers." I remember playing a word game as an undergraduate called "Ducky Fuzz ." The object of the game was to say "Ducky Fuzz Fuzzy Duck Ducky Fuzz" without making a mistake. Not really a tongue twister, but the mispronounced results were quite meaningful. JoEllen M. Simpson University of Florida [End Linguist List, Vol. 2, No. 0292] ________________________________________________________________ Linguist List, Vol. 2, No. 0293. Friday, 14 June 1991 Subj: 2.0293 Responses Total: 137 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Tue, 11 Jun 91 14:37:09 BST From: John Phillips Subject: Re: Responses - Turkish morphological parser (2) Date: Wed,12 Jun 91 14:32:24 BST From: C.S.Butler%vme.nott.ac.uk@RICEVM1.RICE.EDU Subject: Register (3) Date: Mon, 10 Jun 91 23:58:51 -0400 From: daniel%drew.cog.brown.edu@RICEVM1.RICE.EDU (Daniel Radzinski) Subject: Jewish surnames. (4) Date: Tue, 11 Jun 91 11:34 GMT From: John Coleman Subject: Phonology and Orthography (1) -------------------------------------------------------------------- Date: Tue, 11 Jun 91 14:37:09 BST From: John Phillips Subject: Re: Responses - Turkish morphological parser A colleague of mine here at Umist has a Turkish morphological parser, part of a prototype machine translation system. 
He is Jeremy Carroll, jeremy@uk.ac.umist.ccl John Phillips (2) -------------------------------------------------------------------- Date: Wed,12 Jun 91 14:32:24 BST From: C.S.Butler%vme.nott.ac.uk@RICEVM1.RICE.EDU Subject: Register Anyone who is really interested in register should read Halliday on it if they haven't already. Studies inspired by Halliday's early work didn't win many friends, but his later approach is more interesting, in my view. He defines register as "the configuration of semantic resources that the member of a culture typically associates with a situation type" (Language as Social Semiotic, Edward Arnold 1985, p111), the situation type being defined in terms of values of field, tenor and mode. As I've said elsewhere, I think there's a lot of work still to do on firming up these ideas, but they're worth looking at. For a critical discussion, though a bit out of date, see my Systemic Linguistics: Theory and Applications, Batsford 1985, especially Ch 5. Chris Butler Dept of Linguistics University of Nottingham Nottingham NG7 2RD UK (3) -------------------------------------------------------------------- Date: Mon, 10 Jun 91 23:58:51 -0400 From: daniel%drew.cog.brown.edu@RICEVM1.RICE.EDU (Daniel Radzinski) Subject: Jewish surnames. Ellen Prince indicates that "it is typical for jews to take names that are phonologically and even apparently morphologically consistent with the languages of the countries in which they reside." This is certainly true. Consider the following variations on (Ha)Levi: Levin, Levine, Levitus, Leefsma, Horowicz, Hurvitz, Gurevich, Levitz(?) I guess one definitely must consider this factor in the context of acronym etymologies. 
-- Daniel Radzinski

(4) --------------------------------------------------------------------
Date: Tue, 11 Jun 91 11:34 GMT
From: John Coleman
Subject: Phonology and Orthography

> (2) Does anyone understand why John Coleman thinks that the length of
> some (unspecified) bit-encoding of the 2D character patterns is a
> suitable measure of how difficult it is for people to learn and use a
> writing system?

Allow me to make myself clearer. A number of people have argued that alphabetic writing is "superior" to, say, kana-like writing systems because fewer alphabetic symbols than kana symbols are required for the orthography of a language. This property has been called "efficiency". However, a number of important considerations are being overlooked in this argument.

Firstly, learning a writing system is not just a question of learning a set of symbols. It is also necessary to learn the way in which combinations of symbols are interpreted. Alphabetic writing costs more than kana-type systems on this score.

Secondly, a simple comparison of the number of symbols is no use. Some alphabets have more symbols than they "need" from a phonemic point of view, e.g. positional variants or historical distinctions no longer preserved (such as in Thai orthography).

Since the original discussion was not just about orthography, but phonology, I proposed that the efficiency of a writing system could be assessed less prejudicially to any particular system in terms of how efficiently the phonological distinctions [that's where the bits come in] are encoded, and how well redundant information is left UNrepresented. The assumption I am making is that the orthography which is most phonologically efficient is one which encodes all the phonologically distinctive oppositions of a language, and no redundant information. By this measure it can be seen that kana, and yes, even syllable-based systems may be more efficient encodings than an alphabet.
I make no claims as to how this relates to learnability. Margaret's comment that > However, it follows by the same line of reasoning > that an ideographic system (e.g. Chinese) uses storage space even more > efficiently and should thus be an even more popular (stable, adopted > by other languages, easy for children to learn) method of writing. is a non-sequitur. Chinese has a great many different characters for each syllable, and is thus an inefficient means for encoding the phonology of Chinese. A true syllabary of a couple of hundred characters might be most efficient, given the pervasive order-redundancies of Chinese. In general, because there are two places where consonantal oppositions occur in syllables, demisyllable systems are the most efficient. As well as this technical defence of my claim, I would like to close by adding that the tenacious defence of the "superior efficiency" of alphabetic writing that this discussion has engendered has at times been accompanied by a Eurocentric tone. The comment about not being able to read Arabic aloud until you know what it means is equally true of alphabetic scripts, yes even Finnish. --- John Coleman [End Linguist List, Vol. 2, No. 0293] ________________________________________________________________ Linguist List, Vol. 2, No. 0294. Friday, 14 June 1991 Subj: 2.0294 Flaming Total: 85 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Tue, 11 Jun 91 11:36:38 BST From: fleck%robots.oxford.ac.uk@RICEVM1.RICE.EDU (Margaret Fleck) Subject: flaming (2) Date: Wed, 12 Jun 91 18:06:30 EDT From: macrakis@osf.org Subject: Flaming and common rooms (1) -------------------------------------------------------------------- Date: Tue, 11 Jun 91 11:36:38 BST From: fleck%robots.oxford.ac.uk@RICEVM1.RICE.EDU (Margaret Fleck) Subject: flaming In reply to (Elise Morse-Gagne')'s [well, how ELSE am I supposed to spell it?] 
query on "flaming": you seem to have gotten the basic idea of the term correct. It is originally computer science jargon and has been very commonly used in that community for some time. Like certain other useful words (e.g. "kludge"), it seems to be spreading to a wider community. The MIT/Stanford/WPI jargon dictionary (version of ca. 1985) defines it as follows: FLAME v. To speak incessantly and/or rabidly on some relatively uninteresting subject or with a patently ridiculous attitude. FLAME ON: v. To continue to flame. See RAVE. The metaphor of fire is still very much alive. I remember one incident at MIT where someone brought a fire extinguisher to a facilities meeting in case there was excessive flaming. Apparently there are a series of more recent coinages of terms "asbestos X" ("asbestos longjohns") used to protect oneself against flames while reading e.g. network newsgroups, but I've only heard about these secondhand. It is well-known among the computer science community that discussions on electronic mail/newsgroups tend to degenerate into flaming. The basic problem seems to be that feedback is very slow: if you ask another participant what he intended, you may not find out for hours or days. However, people tend to treat the medium as if it were a conversation, rather than like an exchange of letters or journal articles. In particular, it is easy to reply to postings immediately, and people often do so in order to clear their mailboxes. The result is often a sequence of misunderstandings, frustration, and consequent flaming. The result is rather like sitting in the back of a talk in which the speaker is saying lots of things you disagree with. As there's a limit to what can be brought up in questions at the end of the talk, we've all had the experience of sitting and stewing, writing notes to the person in the next seat, and flaming to our friends in the corridor afterwards. 
Now, suppose that we recorded your pent-up thoughts and played them back to the speaker after the talk. And then he got to reply ... Another analogy would be to suppose that the main linguistics journals had an instant reply service. That is, as you finish reading the latest spectacularly irritating article by [name your favorite bete noir], you could scribble down your thoughts and have them appear instantly in print, without editorial review. In real conversations, these problems are usually avoided by asking frequent questions. In journals, they are avoided by honing the prose into something that is difficult to misunderstand. Margaret Fleck (fleck@robots.oxford.ac.uk) (2) -------------------------------------------------------------------- Date: Wed, 12 Jun 91 18:06:30 EDT From: macrakis@osf.org Subject: Flaming and common rooms Several people asked about the words `flaming' and `(senior) common room'. To avoid cluttering this list, if you want (my) definitions, drop me a line. -s [End Linguist List, Vol. 2, No. 0294] ________________________________________________________________ Linguist List, Vol. 2, No. 0295. Friday, 14 June 1991 Subj: 2.0295 Character Coding (Part 1) Total: 154 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Mon, 10 Jun 91 22:16:36 PDT From: "Charles A. Bigelow" Subject: Diacritics, orthography, type design, font format (2) Date: Tue, 11 Jun 91 10:40:47 +0100 From: Paul Hackney Subject: Character encodings (1) -------------------------------------------------------------------- Date: Mon, 10 Jun 91 22:16:36 PDT From: "Charles A. 
Bigelow"
Subject: Diacritics, orthography, type design, font format

The question of whether to have a "closed repertoire" character set (like the ISO proposal, which fixes single character codes for a limited but large set of letter+diacritic combinations) or an "open repertoire" set (like Unicode, which allows arbitrary combinations, but as multiple codes) does not depend much on modern type font technology or the current art of type design. Most major font formats in use today, including PostScript Type1, Apple/Microsoft TrueType, and Sun F3 (perhaps also Hewlett Packard / Agfa CG Intellifont, though I am not certain), actually store most letter+diacritic combinations as subroutine calls to the separate elements - letter, diacritic - rather than as a fully formed character comprising letter and diacritic. That is to say, when a program like a word-processor calls out the character code for, say, a-acute, the font looks up the a, and then looks up the acute, and then looks up some information about where to position the acute over the a, puts the pieces together, rasterizes the new composite, and hands it over for display and/or printing. This method of forming composites has two advantages: economy - it reduces the memory requirements of the font; power - it allows the potential for arbitrary production of all possible letter + accent combinations. The creation of new diacritic combinations doesn't require the skills of professional type designers. Some brave and ingenious souls may write PostScript programs to implement the desired combinations and to assign those combinations to character codes. Or, if a font already has most floating diacritics (like the Macintosh, or the Microsoft UGL character set), a "kerning" table can be devised to properly position the accents over the selected letters. This requires some planning, arithmetic, etc.
Another and simpler way is to use a font editing program, such as Altsys' Fontographer, LetraSet's FontStudio, or URW's Ikarus M (available on the Macintosh; there are also related programs for the PC) to get into the font and mix n' match letters and accents for the desired effect, and assign the results to arbitrary character codes/positions. This requires some time to learn the rudiments of the editing program, but no training in type design. In most fonts, the designer has provided most of the common letters and diacritics. All the user needs is the desire to combine them. In fact, some users will actually do a better job of it than the designers, since the designers are not likely to be literate in all the languages for which they have designed accents, and simply follow some basic, simple rules, or various precedents, whereas literate users often have a better feeling for what constitutes discriminability among the graphemes of their own language. In the 1950's, French typographers persuaded the English Monotype Corporation, originators of Times Roman, to reposition several of the accented characters and to redesign various other characters, to make a Gallicized version of Times that would be acceptable to the French literate palate. It is reasonable to suppose that literates of other languages and orthographies might want something similar. Moreover, some users may want to design new forms that are not included in a standard font. Such things don't always look as sleek and polished as professional work, but they also might have merits that professional designers would have failed to include. If a new form achieves acceptance, sooner or later some designer will come along to spiff it up. So, the technology of fonts and the art of type design provide the means for either closed or open character sets. The decision of which to use is based on other factors, including politics. 
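
[Editorial illustration] The composite-forming procedure described earlier in this posting (look up the letter, look up the diacritic, look up where to position it, put the pieces together) can be sketched roughly as below. This is only a toy model in Python, not the internals of PostScript Type1, TrueType or any other real font format: the `Glyph` class, the three tables, and every coordinate are hypothetical values invented for the illustration.

```python
# Toy model of composite-glyph assembly. All names, tables and numbers
# here are hypothetical; no real font format is being reproduced.
from dataclasses import dataclass


@dataclass(frozen=True)
class Glyph:
    name: str
    outline: tuple  # sequence of (x, y) points


# Each separate element is stored only once (the "economy" advantage).
GLYPHS = {
    "a": Glyph("a", ((0, 0), (500, 0), (500, 450), (0, 450))),
    "e": Glyph("e", ((0, 0), (480, 0), (480, 450), (0, 450))),
    "acute": Glyph("acute", ((0, 0), (120, 0), (200, 150))),
}

# Positioning information: where the accent sits over a given letter.
ACCENT_OFFSETS = {("a", "acute"): (180, 500), ("e", "acute"): (170, 500)}

# A character code is just a recipe naming a letter and a diacritic.
COMPOSITES = {0xE1: ("a", "acute"), 0xE9: ("e", "acute")}


def compose(code: int) -> Glyph:
    """Build the composite glyph for a character code from its parts."""
    base_name, accent_name = COMPOSITES[code]
    base = GLYPHS[base_name]
    accent = GLYPHS[accent_name]
    dx, dy = ACCENT_OFFSETS[(base_name, accent_name)]
    # Shift the accent outline into place, then join it to the base outline.
    shifted = tuple((x + dx, y + dy) for x, y in accent.outline)
    return Glyph(base_name + accent_name, base.outline + shifted)


a_acute = compose(0xE1)
print(a_acute.name, len(a_acute.outline))
```

Under these assumptions, a new letter + accent combination needs only one more entry in each of the two small tables, which is the "power" advantage mentioned above; rasterization is left out of the sketch entirely.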
-- Chuck Bigelow

(2) --------------------------------------------------------------------
Date: Tue, 11 Jun 91 10:40:47 +0100
From: Paul Hackney
Subject: Character encodings

The most commonly used coding for text is the ASCII (American Standard Code for Information Interchange) character set, which does not provide for characters containing diacritical marks. As it stands only 7 bits of a possible 8 bits are used, giving 128 encodings (the reason for this is historical - the 8th bit was used for parity checking). Various extensions are in use (ISO multinational, DEC multinational et al) which use the 8th bit to provide another 128 encodings containing the commonly used European characters. However, in my experience, not all terminals, printers and personal computers support even this limited character set. There are (at least) two proposals for extending the character set into something that addresses the rich variety of symbols found in the many languages of the world.

In response to John Baima, I must confess to an ignorance of what a floating diacritic is. I will therefore limit my comment to an inferred explanation: a floating diacritic is a character that can be combined with a normal character (such as ~ [tilde] and n) to provide a composite. This method is satisfactory but limited in that it enhances an existent impoverished system of coding. A more general solution is to extend the coding to cover more alphabets.

I recently came across an article in New Scientist (a popular and serious scientific magazine) which described a new coding system ["Computer code speaks many tongues", New Scientist, 9 March 1991, p. 28]. Apparently a consortium of American companies called "Unicode" (including IBM, Apple, Sun, ...) have chosen to represent their character set using a 16 bit code, which will give a possible 65,536 characters. They suggest that 6,000 codes suffice for all the alphabets of Europe, the Middle East and the Indian subcontinent.
Chinese, Japanese and Korean require about another 18,000 codes. I expect it is arguable whether these figures are really representative of the characters used and preferred by the respective nationals. However one thing is certain: it is not compatible with our present system. Furthermore you will need twice as much space to store text using the current character set, and transmission times will be doubled.

I have come across another system that does not suffer from these limitations, and is in my opinion a winner by lengths. Instead of using a fixed length encoding, the answer is to arrange for the encoding to expand to two or three bytes when required [Becker, J.D. (1984?) Multilingual Word Processing, from: Language, Writing, and the Computer: Readings from Scientific American, pp 86-96 ISBN 0-7167-1772-7]. This is simply done by setting aside a few byte values as signals to the computer (or printer etc) and embedding these in the text. The principal signal is one byte that indicates that the next byte is a code representing the alphabet to be used for the subsequent text. This gives 255 different alphabets, each containing 255 codes. Compatibility between the system and the current ASCII encoding is easily achieved by assuming the start of the text is in alphabet "Roman" (i.e. ASCII).

Although most (all?) of the European languages, Cyrillic, Arabic etc can be fully represented with 255 characters, other "alphabets", such as Chinese, require considerably more encodings. A simple extension to the above scheme provides an elegant solution. Two 'shift-alphabet' characters in sequence indicate that the next byte signals which 'super alphabet' is to be used. These alphabets use a two byte encoding scheme giving 65,536 possible letters (this is similar to the "Unicode" proposed system). (A 3 byte 'super-super-alphabet' would allow well over 16 million codes.) I am really convinced that this system should be adopted in preference to the fixed length encoding.
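
[Editorial illustration] The shift-alphabet scheme just described can be sketched as a small decoder. This is a minimal sketch in Python under stated assumptions, not Becker's actual design: the posting does not say which byte value serves as the shift signal or how alphabets are numbered, so `SHIFT = 0xFF` and `ROMAN = 0` are hypothetical choices, as is the decision to return (width, alphabet, code) triples.

```python
SHIFT = 0xFF  # hypothetical choice of reserved "shift-alphabet" byte value
ROMAN = 0     # hypothetical alphabet number standing for plain ASCII


def decode(data: bytes) -> list:
    """Decode a shift-alphabet byte stream into (width, alphabet, code) triples."""
    out = []
    alphabet, width = ROMAN, 1  # text is assumed to start in "Roman"
    i = 0
    while i < len(data):
        if data[i] == SHIFT:
            # One shift byte selects a one-byte alphabet; two in sequence
            # select a two-byte "super alphabet".
            double = data[i + 1:i + 2] == bytes([SHIFT])
            selector = i + 2 if double else i + 1
            if selector >= len(data):
                raise ValueError("shift signal without an alphabet byte")
            alphabet, width = data[selector], (2 if double else 1)
            i = selector + 1
            continue
        chunk = data[i:i + width]
        if len(chunk) < width:
            raise ValueError("truncated two-byte code")
        out.append((width, alphabet, int.from_bytes(chunk, "big")))
        i += width
    return out


sample = (b"Hi"
          + bytes([SHIFT, 2, 0x41])               # switch to alphabet 2
          + bytes([SHIFT, SHIFT, 1, 0x4E, 0x2D])  # super alphabet 1, one 2-byte code
          + bytes([SHIFT, ROMAN]) + b"!")         # back to Roman
print(decode(sample))
```

Plain ASCII text passes through unchanged at one byte per character, which is the compatibility property claimed above. One detail the posting leaves open is how a two-byte code beginning with the shift value would be told apart from a signal; this sketch simply treats such a leading byte as a shift, so slightly fewer than 65,536 codes are usable per super alphabet here.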
Unfortunately "Unicode" appear to be well established and their proposed system may well become the de facto standard (as they hope). Paul Hackney [End Linguist List, Vol. 2, No. 0295] ________________________________________________________________ Linguist List, Vol. 2, No. 0296. Friday, 14 June 1991 Subj: 2.0296 Character Coding (Part 2) Total: 264 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Tue, 11 Jun 91 09:05 EDT From: DJBPITT%PITTVMS.BITNET@CUNYVM.CUNY.EDU Subject: Re: Diacritics (2) Date: Tue, 11 Jun 91 11:26:09 EDT From: macrakis@osf.org Subject: Technical problems with diacritics (1) -------------------------------------------------------------------- Date: Tue, 11 Jun 91 09:05 EDT From: DJBPITT%PITTVMS.BITNET@CUNYVM.CUNY.EDU Subject: Re: Diacritics As one of the more active participants in the ISO10646 and Unicode ListServ discussions of "floating diacritics," I would like to comment on some of the postings on the subject in Linguist List, Vol. 2, No. 0283 (Monday, 10 June 1991, Subj: 2.0283 Diacritics). I apologize for the length of this contribution, but it's a large topic with a long history and many readers of this ListServ may not know the background. First, thousands of lines have already been written about this subject on those two ListServs (hundreds of them by me). I and a number of other linguists have been arguing from the beginning that character sets are too important to be left entirely to specialists in computer languages (who have their own priorities) and that natural language orthography is serious business. It is encouraging that someone has taken steps to draw more linguists into the discussion by some judicious crossposting to the Linguist ListServ. But I would like to suggest that interested linguists subscribe to the specialized ListServs mentioned above and that they read the archives available there. 
This will avoid unnecessary repetition and cross-posting, will ensure that all participants in the discussion know the background, and -- perhaps most important -- will ensure that linguists' informed opinions are shared with colleagues in other disciplines who often play decisive roles in developing international character set standards.

I would also like to urge linguists to become involved in character set issues in an effective way. The ISO is composed of national representative bodies and is not required to listen to individuals. It is possible to join your country's national delegation and help formulate official positions on ISO proposals. ISO character set development is ultimately politics, not science. If you want to influence the outcome, you can't just post intelligent observations to a non-binding ListServ; you have to participate at the national delegation level.

Why waste your time? Those of us who work with unusual writing systems are forced to develop our own character coding. Since we are not organized, we wind up with files that cannot be shared. Hardware and software manufacturers support recognized standards (both official ISO standards and de facto industry standards, which Unicode is likely to be); making a standard suit your needs means 1) you don't have to do your own character set development any more, and 2) you can share files with colleagues.

macrakis@osf.org writes:

>The argument is much narrower than that: should
>character encodings be closed (i.e. contain a fixed repertoire of
>character+diacritic combinations) or open (i.e. permit arbitrary
>combinations of character and diacritic)?

From a different perspective, both repertoires are closed and open simultaneously. I see the crucial difference in their different definitions of character. First, an "open" repertoire, as defined above, is also fixed, in that there is a finite number of machine characters.
In neither case can a new base, diacritic, or precomposed base+diacritic be added arbitrarily by a user as a machine character. In the sense of allowing new machine characters, all character sets are closed. (Both ISO DIS 10646 and Unicode have provisions for private use zones.) Second, no character set ever limits the arbitrary combination of alphabetic characters, and character sets always permit combinations that would not be meaningful in any writing system. If we define base and diacritic elements as our characters, allowing their arbitrary juxtaposition is no different from allowing the arbitrary juxtaposition of any two base characters. In the sense of allowing combinations of elements, all character sets are open. The "precomposed" camp essentially views characters as things that occupy linear space. The "separable" camp does not; from the latter perspective, base+diacritic is simply two characters, one of which is traditionally displayed above, rather than next to, the other. The issue isn't so much that one repertoire is closed and the other open as much as that the two repertoires have different constituencies. Creating a new base+diacritic combination in a separable diacritic system isn't creating a new machine character because the combination is no more a character than a sequence of two bases. >The difference comes with characters which are NOT widely used ... [a random >accented character] does not exist as a precomposed character in Unicode >or in 10646. But in Unicode it can be represented as a combination of three >codes, even if it's never been used before (and even if it is a typo!). >10646 could of course add it in a future revision, but this has to be done >on a case-by-case basis. There are two sets of issues in choosing between precomposed characters and separable diacritics. 
(Note: "diacritic" is not necessarily the best term for reasons discussed by Lloyd Anderson in the ISO10646 and Unicode ListServ archives, but I'll continue using it here.) These issues are adequacy and appropriateness. Adequacy is the easy one: the price of prohibiting separable diacritics is making room for all precomposed combinations. For some poorly-codified writing systems, it is impossible to determine which precomposed combinations actually occur. Separable diacritics ensure that unforeseen combinations can be represented. Opponents of separable diacritics, largely computer scientists with no experience in uncommon or poorly codified writing systems, do not care whether the repertoire is adequate for scholars, as long as it is adequate for businessmen using modern languages. One suggested criterion for representing characters is to represent only those characters used in newspapers, a criterion that I hope all linguists will find appalling. (Other opponents of separable diacritics are more reasonable, but, through lack of experience with poorly-codified writing systems, may not understand that even with the best of intentions it may be impossible to define a complete precomposed repertoire in advance.) Appropriateness is tougher. One set of arguments holds that character inventory should reflect grapheme inventory; diacritics would be encoded separably if they function as separable orthographic entities, while precomposed combinations would be used otherwise. (Graphemic analysis is not necessarily unique and "separable orthographic entity" is tricky to define, but constructive suggestions can be found in the archives.) Another holds that the proper criterion for appropriateness is processing efficiency; the programming languages people prefer precomposed combinations because they are better suited to certain machine operations. 
(In some cases, this reflects a limited understanding of the type of operations that people perform on texts, since separable diacritics may be better suited for other operations. Any operation _can_ be performed with either coding, but with differences in efficiency that programmers may consider significant). Another holds that the decision is arbitrary because the most appropriate or efficient encoding is unknowable. I will not rehash these arguments here; please consult the archives. It seems self-evident to me that the adequacy issue, which is entirely on the side of separable diacritics, must be paramount. >Proponents of closed repertoire systems argue that inventors of >NEW orthographies should limit themselves to standard characters. >Proponents of open repertoire systems argue that this is an unnatural >limitation which restricts designers of orthographies artificially. > ><> is what the argument is about, <> about suppressing e-acute. That may be what the argument _should_ be about. Nobody wants to suppress e-acute because it is used in French, but nobody is clamoring to make room for the early Cyrillic lower_case_neutral_jer+longa that I need. And nobody in the "precomposed" camp can tell me how they plan to provide for early Cyrillic when it is impossible to determine a precomposed inventory. My colleague Kyongsok Kim has raised exactly the same argument concerning ancient Hangul. Let me close with a telling anecdote from the ISO10646 ListServ. One of the Unicode developers had occasion to work with a Russian-language teach-yourself-Japanese book. The Japanese is transcribed phonetically in Cyrillic and uses a macron to indicate long vowels. This includes macron over e+diaeresis, which occurs in no Slavic writing of any period as far as I know. It was pointed out that a separable diacritic approach can handle this, while ISO DIS 10646, which is a precomposed approach, cannot. 
It was also suggested that if we petitioned the ISO to include this precomposed combination in ISO DIS 10646 because it was needed for Russian phonetic transcriptions of Japanese, we would not be warmly received (how many newspapers are published in Russian transcriptions of Japanese?). Someone responded to this: why can't Russians represent vowel length in some other way, such as using doubled vowels? Aside from the linguistic ignorance this betrays (every vowel letter in Russian is syllabic and a doubled vowel letter is two syllables), it demonstrates an attitude that making life easier for programmers is more important than the data. If I need e+diaeresis+macron, it is the responsibility of character set designers to provide for it. It is not their business to tell me to bang in a screw with a hammer because their toolkit doesn't include a screwdriver and they don't think I need one. And <> is what the argument is <> about. Concerning Mark Johnson's summary of technical problems, he raises important and genuine issues, but confuses certain basic points. "Character" and "glyph" are technical terms in the character set business and what gets rendered is glyphs, not characters. A character set is not concerned with centering accent marks over vowels any more than it is concerned with forming Arabic ligatures; character sets encode characters and rendering software is responsible for putting out the proper glyphs. Inventories of characters and glyphs are not identical. Once again, please consult the archives of the appropriate ListServs for background on this issue. --David (2) -------------------------------------------------------------------- Date: Tue, 11 Jun 91 11:26:09 EDT From: macrakis@osf.org Subject: Technical problems with diacritics The technical issues around diacritics have been extensively discussed in the ISO10646 and Unicode mailing lists.
Let me try to summarize them (with a bias towards Unicode, I'm afraid): Encoding issues In a closed repertoire system, characters (with or without diacritics) can presumably have a fixed encoding. ISO10646 prescribes exactly one way of representing e-acute. Unicode allows both the precomposed e-acute and also the composed e+acute. (Although precomposed characters appear inconsistent with Unicode's approach, they were included for compatibility with existing standards (here, Latin-1).) Unicode also does not prescribe a canonical form for multiple diacritics in cases where their relationship is unambiguous (although it suggests one). However, 10646 does not in fact adhere strictly to biunique mapping of characters and codes. It includes numerous ligatures (especially for Arabic). It also includes many contextual forms (for Arabic, Mongolian, etc.). For Chinese-character languages, 10646 may represent one character by several different codes, depending on language and variant rendering. Neither 10646 nor Unicode questions the distinctness of Greek, Roman, and Cyrillic `A', even though they have a common history and shape. In general, biunique encoding seems unattainable, since there are many borderline cases in the world's writing systems. Processing issues Proponents of 10646 argue that fixed-length encodings simplify processing. Proponents of Unicode argue that this is only true for the simplest cases. For instance, in Spanish, the digraph `ch' must be treated as a single letter for alphabetic sorting, but no one proposes to encode it as a single code. It is also argued that it would be easier to process text in a canonical form--otherwise, you must be prepared to handle both e-acute and e+acute. 
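The point about having to handle both e-acute and e+acute can be made concrete with a minimal sketch in present-day Python. It assumes the standard-library unicodedata module and the normalization forms (NFC, NFD) that Unicode defined after this discussion took place; the helper name canonically_equal is invented here for illustration and is not part of any standard.

```python
import unicodedata

def canonically_equal(a: str, b: str) -> bool:
    """Compare two strings after bringing both to one canonical form.

    NFD decomposes precomposed letters into baseform + non-spacing marks,
    so the two spellings of e-acute end up as the same code sequence.
    """
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

precomposed = "caf\u00e9"   # e-acute as a single code
composed = "cafe\u0301"     # e followed by a floating acute

if __name__ == "__main__":
    # A naive code-by-code comparison treats the two spellings as different.
    print("raw equality:      ", precomposed == composed)
    # Comparison in a canonical form treats them as the same text.
    print("canonical equality:", canonically_equal(precomposed, composed))
```

The sketch only illustrates the cost being argued over: without an agreed canonical form, every comparison has to perform (or assume) a normalization step like this one.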
But after all, you must already be prepared to treat as equivalent: `b' and `B'; u-umlaut, U-umlaut, u+e, and U+e; eta-subscript and Eta+iota; the single character esszed and the two characters SS (and maybe ss or sz in some cases); the single character stigma and the two characters sigma+tau (in Greek numerals). The main upshot of this discussion has been to make clear that processing multilingual text is non-trivial. The methods that work more or less well for English do not work for many other languages -- but even in English, improved internationalization will mean better handling of such things as capitalization, which are handled poorly by all too many programs. Rendering issues As Johnson says, good rendering of most composed characters requires individual graphic design. This is technically compatible with both open and closed repertoire systems. Open repertoire systems should be able to render combinations which weren't considered by the designer; but even with today's technology, you can do better than overstriking (e.g. place accents above the character rather than overlapping with it). Note that Unicode does not prohibit meaningless combinations, such as using Hebrew vowel points on Japanese kana! But you can expect that the rendering will be just as absurd as the spelling.... -s [End Linguist List, Vol. 2, No. 0296] ________________________________________________________________ Linguist List, Vol. 2, No. 0297. Tuesday, 18 June 1991 Subject: Vol-2-297. 
Responses Total: 72 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Fri, 14 Jun 91 14:16:49 +1000 From: bert peeters Subject: Machine readable dictionaries (2) Date: Fri, 14 Jun 91 14:22:08 +1000 From: bert peeters Subject: Moods (3) Date: FRI, 14 JUN 1991 19:00 JST From: Janet Higgins Subject: RE: Queries (1) -------------------------------------------------------------------- Date: Fri, 14 Jun 91 14:16:49 +1000 From: bert peeters Subject: Machine readable dictionaries In reply to Stefan Schierholz's query about machine readable dictionaries: I myself have been working for a while now with the "Robert electronique", a CD-Rom which contains the complete contents of the 1985 edition of Le Grand Robert, a French monolingual dictionary in 9 volumes. Orders and inquiries: Chadwyck-Healey Ltd Cambridge Place Cambridge CB2 1NR UK or Chadwyck-Healey Inc. 1101 King Street Alexandria VA 22314 When I purchased the product, it was 690 English pounds. Bert Peeters (2) -------------------------------------------------------------------- Date: Fri, 14 Jun 91 14:22:08 +1000 From: bert peeters Subject: Moods is looking for a term which unambiguously refers to moods that express a desire for something to happen. Well, prohibitive, hortative, imperative (the ones this poster mentions) are not exactly agreed on world-wide, it seems to me. Subjunctive does not seem to work, for presumably it implies dependence. It goes together with dependence quite often, but not necessarily. In French one can have: Puissiez-vous reussir! = May you succeed! Soit un triangle ABC = Let's assume the existence of a triangle ABC Vive le roi (I'm Belgian...) = Long live the king etc What about iussive? It is, I believe, a term used by Rodney Huddleston (see e.g. Huddleston 1984). I'm not sure this will do, but it is worth exploring the issue. 
Bert Peeters (3) ------------------------------------------------------------------- Date: FRI, 14 JUN 1991 19:00 JST From: Janet Higgins Subject: RE: Queries Reply to the phonetics query. Do you know the MSL-pack, IBM (Canada)? There is also a MAC spectrograph programme. Do you want details? JMDH [Linguist Vol-2-297.] ________________________________________________________________ Linguist List, Vol. 2, No. 0298. Tuesday, 18 June 1991 Subject: Vol-2-298. Queries Total: 76 lines Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) (1) Date: Fri, 14 Jun 1991 12:49 +0800 From: "Tze-wan KWAN, Philosophy Dept., CUHK, Hongkong" Subject: Double Articulation (2) Date: Fri, 14 Jun 91 11:09:12 CDT From: Mdeneire Subject: phonetics (3) Date: Fri, 14 Jun 91 19:27:03 EST From: Boyd Davis Subject: Re: Flaming (1) _________________________________________________________ Date: Fri, 14 Jun 1991 12:49 +0800 From: "Tze-wan KWAN, Philosophy Dept., CUHK, Hongkong" Subject: Double Articulation I am currently working on a paper that handles the relation between speech sounds and meaning. In the course of formulating my thesis, I came across the concept of "double articulation", which is in fact a semiotic distinction between the meaning-determining function on the one hand (1st articulation: morphemes, words, sentences...rituals, cultural tradition) and the meaning-discriminative function on the other (2nd articulation: distinctive features, phonemes, syllables, maybe sound clusters). This differentiation is supposedly important as it allows a symbolic system to function with the greatest efficiency and abstraction. Some cognitive scientists take this feature of differentiation in symbolic activity as peculiar to and characteristic of human intelligence at large. As this issue is quite new to me I would like to hear some more opinions from the outside world.
The only relevant sources I have found so far are works by Roman Jakobson and Andre Martinet. Some works of Elmar Holenstein (Bochum, Germany) also touch upon this issue. Can anyone out there give me some enlightenment regarding the history of this notion? Further bibliographical information will also be appreciated! Tze-wan Kwan Dept. of Philosophy, The Chinese University of Hong Kong, Shatin, Hongkong E-Mail: B071767@CUCSC (Bitnet) or B071767@CUCSC.CUHK.HK (2) _____________________________________________________________ Date: Fri, 14 Jun 91 11:09:12 CDT From: Mdeneire Subject: phonetics I am working on a dissertation on French phonetics (acquisition of French vowels by Americans). I am presently looking for a program (software) that could analyze the 3 formants of the vowels, intensity, and time. For example, in the sentence "Il a bu tant qu'il a pu", I want to analyze the segment "bu", i.e. the vowel "u" in this specific environment. If you can help me with software or any suggestions, please tell me. Thank you. (3) _____________________________________________________________ Date: Fri, 14 Jun 91 19:27:03 EST From: Boyd Davis Subject: Re: Flaming Is the MIT/Stanford/WPI jargon dictionary available online? [Linguist: Vol-2-298.] ________________________________________________________________ Linguist List: Vol-2-299. Tuesday, 18 June 1991. 256 lines.
Subject: Diacritics Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) -------------------------Directory------------------------------------- (1) Date: Thu, 13 Jun 91 22:22:40 PDT From: whistler@zarasun.Metaphor.COM (Ken Whistler) Subject: Diacritics, Unicode, and ISO10646 (2) Date: Fri, 14 Jun 91 16:43:15 EDT From: macrakis@osf.org Subject: Character encodings -------------------------Messages-------------------------------------- (1) Date: Thu, 13 Jun 91 22:22:40 PDT From: whistler@zarasun.Metaphor.COM (Ken Whistler) Subject: Diacritics, Unicode, and ISO10646 In response to issues raised in Linguist List, Vol. 2, No. 0283. First of all, the terminology of "open repertoire" and "closed repertoire" tends to cause endless confusion when applied to character encodings, because the proponents of different character encoding architectures often mean different things when they say "character". This leads to different senses of "closed repertoire of characters". In Unicode terminology, "character" refers to the thing which gets a 16-bit number attached to it in the encoding. In this sense, Unicode clearly has a closed repertoire of characters. There are about 27,000 of them, mostly Han characters, and each one of them is unambiguously identified in the standard. However, the classes of things which are encoded as "characters" include both baseform letters (U+0065 LATIN SMALL LETTER E) and floating diacritics (U+0301 NON-SPACING ACUTE), as well as accented letters (U+00E9 LATIN SMALL LETTER E ACUTE). This creates the multiple spelling problem for accented letters that we all know about-- but it is also the basis for the open-ended, productive part of Unicode, since U+0301 NON-SPACING ACUTE can be used with other characters to create compositions which are NOT preencoded in the standard. (e.g. x-acute, beta-acute, Georgian-an-acute, who can guess...?)
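The two spellings of e-acute, and the open-ended case of x-acute, can be sketched in present-day Python. This is only an illustration under stated assumptions: it relies on the standard-library unicodedata module, the helper names spell and has_precomposed_form are hypothetical, and current Unicode character names differ slightly from the 1.0 names quoted above (U+0301 is now called COMBINING ACUTE ACCENT).

```python
import unicodedata

E = "\u0065"        # LATIN SMALL LETTER E (baseform)
ACUTE = "\u0301"    # floating acute; today named COMBINING ACUTE ACCENT
E_ACUTE = "\u00e9"  # precomposed e-acute

def spell(base: str, *marks: str) -> str:
    """Hypothetical helper: a baseform followed by non-spacing marks."""
    return base + "".join(marks)

def has_precomposed_form(sequence: str) -> bool:
    """True if NFC composition folds the whole sequence into one code point."""
    return len(unicodedata.normalize("NFC", sequence)) == 1

if __name__ == "__main__":
    for seq in (spell(E, ACUTE), spell("x", ACUTE)):
        codes = " ".join(f"U+{ord(c):04X}" for c in seq)
        print(codes, "has precomposed form:", has_precomposed_form(seq))
```

Here e + acute folds into the single code U+00E9, while x + acute has no precomposed counterpart and stays a two-code sequence, which is the productive case described above.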
In this sense, the encoding of non-spacing characters in Unicode (of various classes--the Latin/Greek/Cyrillic floating diacritics are only one of several major classes of non-spacing marks used in various scripts) creates a vast potential universe of coded "things" resulting from non-spacing marks applied one or more at a time to baseform characters. To avoid confusing these "things" with characters, let's for now call them "charactoids". While Unicode has a well-defined closed repertoire of characters (each exactly 16 bits in size and well-defined), at the same time it has an open repertoire of charactoids. The class of charactoids is defined by a well-defined set of composition rules, rather than by enumeration. We know the numerosity of the class is huge, but no one is going to try to count it--indeed the whole point is that charactoids are freely generable by the encoding scheme, without having to go through a committee to get a single 16-bit number assigned to it. (As an analogy, think of characters as representing the morphemes of a morphologically complex language like Cree, for example. Charactoids are then analogous to the words of Cree. Who knows what they all are? And is anyone going to try to define them all ahead of time?) While the class of charactoids encodable by composition in Unicode is vast and open, the structure of the code, together with the facts about several widely used orthographies in the Latin/Greek family of scripts results in several well-defined subsets of charactoids which have the following properties: A. charactoids which are functionally equivalent to accented letters encoded as single characters in Unicode. This is the e + acute = e-acute case. The list of such cases is well-defined, not too large, and will be published in the Unicode 1.0 standard as one of the auxiliary tables. The important principle is that Unicode does not specify a functional distinction between the two equivalent "spellings" in Unicode.
This is important because to do otherwise would prevent Unicode applications and systems programmers from normalizing freely from one to the other depending on their internal requirements for representation. Unicode does not require normalization; nor does it prevent it. Unicode also does not prevent an application from maintaining a private distinction between, say, e-acute as a fundamental vowel unit in an orthography, and e + acute as a vowel plus applied tone mark. What it does say is that such private distinctions cannot be reliably conveyed in plain Unicode text, because another Unicode text interpreter may normalize them all to e-acute. B. charactoids which are functionally equivalent to accented letters which are NOT encoded as single characters in Unicode, but which are used in important orthographies. The ones which cause all the controversy are Vietnamese and Polytonic Greek. Both make wide use of doubly accented letters. Both are freely encodable in Unicode by baseform plus non-spacing diacritic combinations. The set of charactoids required for Vietnamese or for Polytonic Greek is well-defined and is being published as part of the Unicode 1.0 standard. C. charactoids which are useful, but whose users LIKE to have them be an open class, not enumerated. All of IPA falls in this class, together with the productive application of vector notation diacritics to mathematical symbols, for instance. D. The remainder are the charactoids which... well, who the hell knows what they might be. The Unicode standard has no intention of prescribing them all, nor of proscribing any of them. (Unicoders simply want to build software that lets users do what they want to do.) The design goals of Unicode were to keep subset A as small as possible, because keeping track of such "required" equivalences tends to impose efficiency and resource penalties on software whose combinatorial properties grow as n-squared.
Subset A cannot, however, be reduced to zero, because of other code compatibility requirements and offsetting inefficiencies which set in when having to deal with charactoids instead of characters in the software. Others believe that all of subset B should be encoded as characters (i.e., move them into subset A). Vietnamese is on the hairy edge of this argument, and a strong case can be made either way. There is no absolutely right answer as to how to encode it--just a lot of contradicting tradeoffs in a multiple sum game, only some of whose sums are purely technical. Lars Henrik Mathiesen noted that "there are technical reasons why a standard without floating diacritics is easier to implement." While that is true in a world of limited implementations of European languages on glass terminals with character ROM's, the Unicode designers are firmly of the opinion that in building an international character set for multilingual, multiscript applications, the arguments all come down on the other side. (In the following I am not ascribing to Lars a particular position in this--I think his contribution was intended primarily as an exegesis of another note written by an anti-Unicoder posted by a pro-Unicoder.) 1. Open-ended productivity of diacritic application is a fundamental principle of the Latin/Greek family of scripts. To attempt to code all "useful" combinations and proscribe all others is both obtuse and unworkable. (Any devotee of IPA who wants to be able to encode and exchange it on a computer should see this is self-evidently obvious. But the ISO 10646 approach to the IPA encoding problem was to remove the problem by removing IPA from the encoding! Now THAT's a great solution for linguists!) 2. Any putatively universal character encoding has to be able to convert and interwork with existing standards (e.g. in the bibliographic community) which ALREADY have non-spacing diacritics. So it's a done deal. They MUST be included--unless you just ignore them.
And THAT's a great solution for bibliographers! 3. Finally, non-spacing diacritics aren't even very hard to implement. Compared to the problems which need to be addressed and solved to support Arabic and Indic scripts, the whole issue of non-spacing diacritics is revealed for what it really is in the larger picture: well-understood "easy stuff". Next, to address some issues raised by Mark Johnson's comments: ISO/IEC DIS 10646 did not propose a "fixed-length character encoding system." That was but one of its many drawbacks. It proposed a character encoding system whose canonical form was four "octets" (ISO standardese for "bytes") for one "character", but which also allowed for "compaction forms" which would result in characters encoded as one, two, three, or a variable number of bytes. And in any case, even when a fixed multiple number of bytes (say 2) would be used to represent "graphical" characters such as , any control characters would have to be interpreted one byte at a time. That aside, it is true that 10646 attempted to enumerate all the "useful" accented letter forms for Latin, Greek, and Cyrillic, and encoded them as distinct characters. Now, with respect to the "general escape method, which...would allow overstriking of arbitrary characters to build new characters," such methods already are standardized! The first-order hack implemented in PC's (and some earlier computers) was to use the BACKSPACE control code as a direct technical calque from everyman's solution for creation of composite characters on a manual typewriter. ISO, in amongst the various standards for control characters (ISO 6429 to be exact), has defined a tonier, less lowbrow control character function, the GCC (Graphic Character Composition) to serve exactly as the "escape method" for combining two characters. The problem is that such approaches are in the stone age of computer typography. Characters (and charactoids) are not the same as glyphs, and glyphs are not the same as images. 
The glyphs are abstractions of the TYPE of elements of textual graphic representation. (This is pretty close to what linguists mean by "grapheme", but abstracted away from issues relevant to its use as a structural unit; think of "glyph" is to "grapheme" as "phone" is to "phoneme" and you'll be close.) Images are instantiations or TOKENS of actual textual graphic representations taken from particular fonts (of defined face, style, weight, size, etc.). Modern rendering software provides layers of mapping between a) character encoding, which is designed to encode textual CONTENT (appropriate for manipulation by textual processes) and b) glyphic representation (appropriate for rendering in visible form on screen, printer, or other device). Such mappings between character and glyph can be one-to-one in the simple (ASCII) case, but typically are not simple in computer typography even of English. Several characters may map to a single ligature glyph; a sequence of baseform + non-spacing diacritic may map to a single composite glyph. The typeface designer builds the glyphs; the rendering software maps to the correct choice. Even in those cases (as for the open-ended set of charactoids encodable in Unicode which could not all be anticipated by a typeface designer) in which a baseform + non-spacing diacritic is mapped to a pair of glyphs which must be (glyphically) composed, modern font technology builds the composition rules into the fonts. Effectively the diacritics "know" where to place themselves with respect to baseforms and each other (within limits). Once again, the solutions for handling these issues for the Latin script are quite well-understood in the industry. Font technology is an entire computer sub-industry single-mindedly driven towards making computer typography even better than the "real thing".
But what is very well understood for Latin (and rapidly being extended to Greek and Cyrillic) is still very skimpily implemented for Arabic, or Devanagari, or Tibetan (!) or Burmese (!), for example, where the problems are very much harder and where getting it right is going to take a lot more work yet. Hoping this helps some, --Ken Whistler Secretary, Unicode Consortium (and a practicing linguist!) __________________________________________________________________________ (2) Date: Fri, 14 Jun 91 16:43:15 EDT From: macrakis@osf.org Subject: Character encodings Mr. Hackney: May I suggest you read the (extensive) discussions on multilingual character coding before speculating, and before asserting that ``the answer'' is a variable-length coding? -s (Stavros Macrakis) PS Your `inferred explanation' of `floating diacritics' is incorrect. [Linguist List: Vol-2-299.] ________________________________________________________________ Linguist List: Vol-2-300. Tuesday, 18 June 1991. 68 lines. Subject: Jobs Moderators: Anthony Aristar (a_aristar@fennel.cc.uwa.oz.au) Helen Dry (1echad@utsa86.utsa.edu) -------------------------Directory------------------------------------- (1) Date: Fri, 14 Jun 91 12:34+0100 Subject: job From: beate firzlaff -------------------------Messages-------------------------------------- (1) Date: Fri, 14 Jun 91 12:34+0100 Subject: job From: beate firzlaff Institute for Integrated Publication and Information Systems Gesellschaft fuer Mathematik und Datenverarbeitung mbH GMD Darmstadt - Germany JOB OFFER For our text analysis project KONTEXT we are looking for an additional researcher. The KONTEXT project develops a text analysis system with a better ability to process texts which communicate new knowledge. The task of the system is to construct a conceptual representation and textual access structures to full texts. The system is developed on the basis of a text model. It is part of a prototype of an integrated publication and information system. 
Your field of activity will be the design and modelling of a lexicalized text grammar. The text grammar describes the contributions linguistic means make towards text constitution (content and structure). For modelling, a Feature Structure System is used. Ideally, we are looking for a computer scientist/computational linguist with theoretical and practical experience in - unification based grammar - grammar of German - formal languages - natural language analysis/generation - LISP. We are offering an interdisciplinary scientific working environment including up-to-date technical equipment (a network consisting of SUN and SYMBOLICS workstations). Opportunities for working on a PhD are given. Salary on the German BAT scale. Applications should be submitted to Professor Dr. E. J. Neuhold GMD - Institut fuer Integrierte Publikations- und Informationssysteme (IPSI) Dolivostrasse 15, 6100 Darmstadt, Germany Information concerning the job can be obtained from Dr. Karin Haenelt phone: ++49/6151/875-828 e-mail: [Linguist List. Vol-2-300]