WORDS Version 1.97FC
LATIN-ENGLISH DICTIONARY PROGRAM



SUMMARY

INSTALLATION
Is There a Problem?

INTRODUCTION

OPERATIONAL DESCRIPTION
Program Operation
Modes of Operation
Command Line Operation
Latin-to-English Examples
English-to-Latin Examples
Design of the Meaning Line
Signs and Abbreviations in Meaning

PROGRAM DESCRIPTION
Codes in Inflection Line
Help for Parameters
Special Cases
Uniques
Tricks
Trimming of uncommon results

GUIDING PHILOSOPHY
Purpose
Method
Word Meanings
Proper Names
Letter Conventions (u/v, i/j, w)

DICTIONARY
Dictionary Codes
AGE
AREA
GEO
FREQ
SOURCE
Current Distribution of DICTLINE Flags
Dictionary Conventions
Evolution of the Dictionary
Text Dictionary - DICTPAGE.TXT
Latin Spellchecking - Text Processor List - LISTALL.ZIP

INFLECTIONS

ENGLISH to LATIN
English Parsing of Meanings
Ordering English-to-Latin Output

TESTS AND STATUS
Testing
Current Status and Future Plans

USER MODIFICATIONS
Writing DICT.LOC and UNIQUES.LAT
DICT.LOC
UNIQUES.LAT

DEVELOPERS AND REHOSTING
Program source code and data
License
Rehosting WORDS
Feedback


SUMMARY



This program, WORDS, takes keyboard input or a file of Latin text lines and provides an analysis of each word individually. It uses an INFLECT.SEC, UNIQUES.LAT, ADDONS.LAT, STEMFILE.GEN, INDXFILE.GEN, and DICTFILE.GEN, and possibly .SPE and DICT.LOC.

The dictionary contains over 39000 entries, as would be counted in an ordinary dictionary. This expands to almost twice that number of individual stems (the count that the program may display at startup), and, through additional word construction with hundreds of prefixes and suffixes, may generate more, leading to many hundreds of thousands of 'words' that can be formed by declension and conjugation. This version of WORDS provides a tool to help in translations for the Latin student. It is now a large dictionary by any measure and can be helpful to advanced users. The dictionary will continue to grow - slowly.

INSTALLATION


The WORDS program, with its accompanying data files, should run on any machine for which it is adapted, with any monitor. Simply download the self-extracting EXE files or the compressed file for the appropriate system and execute/decompress it in your chosen subdirectory on the hard disk, creating the necessary files. Then call/run WORDS, or do as instructed in any README.

The load includes SPQR.ICO, a possible icon for WORDS, but just that, only an icon. You have to install the program as per the directions (put the downloaded files in a folder, run them to expand to the WORDS system, then run from that folder). However, if you are Windows-wise, you can use Explorer to make a shortcut and put it on the desktop. Windows will make a generic icon, but you can change it (using Properties) to whatever other icon you can find, for instance, the one included with the package. Or not. Make sure that the Properties on the icon has as Target the WORDS.EXE in the folder in which the system is loaded.

See the particular page for each specific system.
DOS
Windows 95/NT/98/2000/XP
Linux and FreeBSD
OS/2
MAC OS X


Is There a Problem?

Did you download the two appropriate file(s) to your hard disk, as listed in the download page for your system?

Can you verify that they are there and full size (megabytes as indicated)?

Did you execute/run/unzip these programs?

If self-extracted, were you asked where to put the generated files? (Maybe a default C:\WORDS)? If not, did you put them in the folder/subdirectory from which you wish to operate?

Can you verify that the full set of files (about 10 MB) was generated in that folder/subdirectory, or wherever you chose? At least WORDS.EXE, INFLECT.SEC, UNIQUES.LAT, ADDONS.LAT, STEMFILE.GEN, INDXFILE.GEN, and DICTFILE.GEN, plus documentation.

Did you run/execute WORDS in that folder/subdirectory? e.g.
C:\WORDS

If when you try to run there is no WORDS.EXE (or equivalent), the system should let you know.
If there is no INFLECT.SEC, the program will say so and abort immediately.
If there are no dictionary files, the program will tell you, but will start (you can get Roman numerals!).
If there is no ADDONS.LAT or UNIQUES.LAT, the program will tell you, and if they are there it will tell you how many.
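
Part of the checklist above can be automated. What follows is only a minimal sketch and is not part of the WORDS distribution: the function name missing_files is hypothetical, the file list is copied from the checklist, and the case-insensitive comparison is an assumption made because some systems may store the names in lower case.

```python
# Required files, as listed in the checklist above. On systems other than
# DOS/Windows the executable may have a different name (an assumption).
REQUIRED = [
    "WORDS.EXE", "INFLECT.SEC", "UNIQUES.LAT", "ADDONS.LAT",
    "STEMFILE.GEN", "INDXFILE.GEN", "DICTFILE.GEN",
]

def missing_files(names_in_folder, required=REQUIRED):
    """Return the required file names that do not appear in the folder.

    `names_in_folder` is any iterable of file names, for example the
    result of os.listdir() on the WORDS folder. Case is ignored.
    """
    present = {name.upper() for name in names_in_folder}
    return [name for name in required if name.upper() not in present]
```

Calling missing_files(os.listdir(".")) from inside the WORDS folder (after importing os) should return an empty list if the full set was generated.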


INTRODUCTION



I am no expert in Latin; indeed my training is limited to a couple of years in high school more than 50 years ago. But I always felt that Latin, as presented after two millennia, was a scientific language. It had the interesting property of inflection: words were constructed in a logical manner. I admired this feature, but could never remember the vocabulary well enough when it came time to exercise it on tests.

I decided to automate an elementary-level Latin vocabulary list. As a first stage, I produced a computer program that will analyze a Latin word and give the various possible interpretations (case, person, gender, tense, mood, etc.), within the limitations of its dictionary. This might be the first step to a full parsing system, but, although just a development tool, it is useful by itself.

Please remember that this is only a computer exercise in automating a Latin dictionary. I am not a Latin scholar and anything in the program or documentation is filtered by me from reading the cited Latin dictionaries. Please let no one go to his teacher and cite my interpretation as an authority.

While developing this initial implementation, based on different sources, I learned (or re-learned) something that I had overlooked at the beginning. Latin courses, and even very large Latin dictionaries, are put together under very strict ground rules. Some dictionary might be based exclusively on 'Classical' (200 BC - 200 AD) texts; it might have every word that appears in every surviving writing of Cicero, but nothing much before or since. Such a dictionary will be inadequate for translating medieval theological or scientific texts. In another example, one textbook might use Caesar as its main source of readings (my high school texts did), while another might avoid Caesar and all military writings (either for pacifist reasons, or just because the author had taught Caesar for 30 years and had grown bored with going over the same material, year after year). One can imagine that the selection of words in such different texts would differ considerably; moreover, even with the same words, the meanings attached would be different. This presents a problem in the development of a dictionary for general use.

One could produce a separate dictionary for each era and application or a universal dictionary with tags to indicate the appropriate application and meaning for each word. With such a tag arrangement one would not be offered inappropriate or improbable interpretations. The present system has such a mechanism, but it is not fully exploited.

The Version 1.97E dictionary may be found to be of fairly general use for the student; it has all the easy words that every text uses. It also has the adverbs, prepositions, and conjunctions, which are not as sensitive to application as are the nouns and verbs. The system also tests a few hundred prefixes and suffixes, if the raw word cannot be found. Beyond that, there are a large number of TRICKS which may be applied. These may be thought of as correcting for variations in spelling. This allows an interpretation of many words which would otherwise be marked unknown. The result of this analysis is fairly straightforward in most cases, accurate but esoteric in some others. Some constructions are recognized Latin words, and some are perfectly reasonable words which may never have been used by Cicero or Caesar but might have been used by Augustine or a monk of Jarrow. For about 1 in 10 constructed words the result has no relation to the normal dictionary meaning.

BE WARNED! The program will go to great lengths if all tricks are invoked. If you get a word formed with an enclitic, prefix, suffix, and syncope, be very suspicious! It may well be right, but look carefully. (Try siquempiamque!)

The final try is to look at the input as two words run together. In most cases this works out, and is especially useful for late Latin number usage. However, this algorithm may go very wrong. If it is not obviously right, it is probably incorrect.
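
The idea of this final try can be sketched as follows. This is an illustration only, not the program's actual algorithm: the function name split_two_words, the is_known lookup and the sample word are all hypothetical. It tries every split point and keeps those where both halves are recognized.

```python
def split_two_words(text, is_known):
    """Return every (left, right) split of `text` in which both halves
    are accepted by `is_known`, a caller-supplied lookup function."""
    splits = []
    for i in range(1, len(text)):
        left, right = text[:i], text[i:]
        if is_known(left) and is_known(right):
            splits.append((left, right))
    return splits

# Toy word list, invented for this example only.
known = {"viginti", "quinque"}
print(split_two_words("vigintiquinque", known.__contains__))
```

With short words in the lookup, many inputs will split in more than one way, which is one reason to treat such results with suspicion.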

With this facility, and a 39000 word dictionary, trials on some tested classical texts and the Vulgate Bible give hit rates of far better than 99%, excluding proper names (there are very few proper names in this dictionary). (I am an old soldier so the dictionary may have every possible word for attack or destroy. The system is near perfect for Caesar.) The question arises, what hit rate can be expected for a general dictionary. Classical Latin dictionaries have no references to the terminology of Christian theology. The legal documents and deeds of the Middle Ages are a challenge of jargon and abbreviations. These areas require special knowledge and vocabulary, but even there the ability to handle the non-specialized words is a large part of the effort.

The development system allows the inclusion of specialized vocabulary (for instance a SPEcial dictionary for specialized words not wanted in most dictionaries), and the opportunity for the user to add additional words to a DICT.LOC.

It was initially expected that there would be special dictionaries for special applications. That is why there is the possibility of a SPECIAL dictionary. Now the general dictionary is coded by AGE and application AREA. Thus special words used initially/only by St Thomas Aquinas would be Medieval (AGE code F) and Ecclesiastical (AREA code E). Eventually there needs to be a filter that allows one, upon setting parameters for Medieval and Ecclesiastical, to push those words over others. Right now there is not enough non-classical vocabulary to rely on such a scheme. The problem is that one needs a very complete classical dictionary before one can assure that new entries are uniquely Medieval, that they are not just classical words that appear in a Medieval text. And the update is only into the D's. So the situation is that the mechanism is there, but not sufficient data. Nevertheless that is exactly the application I had in mind when I set out to do the program.

One can set a parameter to exclude medieval words if there is a classical word answering the same parse. Likewise, the program can ignore rare meanings if there is a common meaning for the parse.

The program may be larger than is necessary for the present application. It is still in development but some effort has now been put into optimization. Nevertheless there is lots of room for speeding it up. Specifically, the program is disk-oriented in order to run on small machines, such as DOS with the 640KB limitation. Rejecting this limitation and assuming that the user has tens of megabytes of memory (clearly realistic today) would allow faster processing. The next version may go that way.

This is a free program, which means it is proper to copy it and pass it on to your friends. Consider it a developmental item for which there is no charge. However, just for form, it is Copyrighted (c). Permission is hereby freely given for any and all use of program and data. You can sell it as your own, but at least tell me.

This version is distributed without obligation, but the developer would appreciate comments and suggestions.


William A Whitaker
PO Box 3036
McLean VA 22103-3036
USA
whitaker@erols.com

OPERATIONAL DESCRIPTION


This write up is rudimentary and assumes that the user is experienced with computers, and as an example assumes a PC with a Windows OS. Other systems operate essentially the same.

The WORDS program, Version 1.97E, with its accompanying data files, should run on a PC in Windows 95/98/NT, with any monitor. Simply download the self-extracting EXE file and execute it in your chosen subdirectory/folder to UNZIP the files into a subdirectory of a hard disk. Then call WORDS.

There are a number of files associated with the program. These must be in the subdirectory/folder of the program, and the program must be run from that subdirectory. WORDS.EXE is the executable program. INFLECT.SEC holds the encoded inflection records. STEMFILE.GEN contains the stems of the GENERAL dictionary in a searchable form. DICTFILE.GEN is an indexed form of the GENERAL dictionary entries with form information and meanings. INDXFILE.GEN contains a set of indexes into the DICTFILE. In some versions, there may be a set of files for a SPECIAL (.SPE) dictionary of the same structure as the GENERAL dictionary, but there is no SPECIAL dictionary in the present distribution. A LOCAL dictionary may also be used. This is a limited dictionary of a different form, human readable and writeable. The knowledgeable user can augment and modify it on-line. It would consist of the file DICT.LOC. UNIQUES.LAT contains certain words which regular processing does not get. ADDONS.LAT contains the set of prefixes, suffixes and enclitics (-que, -ve) and the like. Other files may be generated by the program, so run it in a configuration that allows the creation of files.

All these files are necessary to run the program (except the optional dictionaries SPE and LOC). This excess of files is a consequence of the present developmental nature of the program. The files are very simple, almost human-readable. Presumably, a later version could condense and encode them. Nevertheless, beyond the original COPY, the user need not worry about them.

Additionally, there are files that the program may produce on request. All of these share the name WORD, with various extensions, and they are all ASCII/DOS text files which can be viewed and processed with an ordinary editor. The casual user may not want to get involved with these. WORD.OUT will record the whole output; WORD.UNK will list only words the program is unable to interpret. These outputs are turned on through the PARAMETERS mechanism.

PARAMETERS may be changed while running the program by inputting a line containing a '#' mark as the only (or first) character. Alternatively, WORD.MOD contains the MODES that can be set by CHANGE_PARAMETERS. If this file does not exist, default modes will be used. The file may be produced or changed when changing parameters. It can also be modified, if the user is sufficiently confident, with an editor, or deleted, thereby reverting to defaults.

There is another set of developer parameters which may be set with the input of '!'. These MODES may be changed and saved in a file WORD.MDV. These are not normal user facilities; probably no one but the developer would be interested. In any specific release these facilities may, or may not, work. They are just mentioned here in case they ever come up accidentally, and to point out that there are other capabilities, actual and possible, which may be invoked if there is a special need. The user is invited to review these parameters to see if any address an unusual need.

WORD.OUT is the file produced if the user requests output to a file. This output can be used for later manipulation with a text editor, especially when the input was a text file of some length. If the parameter UNKNOWNS_ONLY is set, the output serves as a sort of a Latin spell checker. Those words it cannot match may just not be in the dictionary, but alternatively they may be typos. A WORD.UNK file of unknowns can be generated.

Program Operation

To start the program, in the subdirectory that contains all the files, type WORDS. A setup procedure will execute, processing files. Then the program will ask for a word to be keyed in. Input the word and give a return (ENTER). Information about the word will be displayed.

One can input a whole line at a time, however long, but only one line since the return at the end of line will start the processing. If the results would fill more than a computer screen, the output is halted until the user responds to the 'MORE' message with a return. A file containing a text, a series of lines, can be input by keying in the character '@', followed (with no spaces) by the DOS name of the file of text. This input file need not be in the program subdirectory, just use the full path and name of the file. This is usually accompanied with the setting of the parameter switches to create and write to an output file, WORD.OUT.

One can have a comment in the file, a terminal portion of a line that is not parsed. This could be an English meaning, a source where the word was found, an indication that it may have been miscopied, etc. A comment begins with a double dash [--] and continues to the end of the line. The '--' and everything after on that line is ignored by the program.
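
For anyone preparing or post-processing input files outside WORDS, the comment rule is easy to mimic. This is a minimal sketch, not the program's own code; the function name strip_comment is hypothetical, and trimming the surrounding blanks is an added convenience.

```python
def strip_comment(line):
    """Drop a trailing '--' comment, then trim surrounding whitespace."""
    # partition() splits at the first '--'; with no '--' present the
    # whole line comes back as `text`.
    text, _dashes, _comment = line.partition("--")
    return text.strip()

print(strip_comment("agricolarum  -- of the farmers; check the copy"))
```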

A simple # character input at the start of a line (that is, a line containing only #) will permit the user to set modes to prevent the process from trying prefixes and suffixes to get a match on an item unknown to the dictionary, put output to a file, etc. Going into the CHANGE_PARAMETERS, the '?' character calls help for each entry.

Another set of parameters is invoked by the character !. These developer parameters are fairly specialized and are probably not required by the average user, nevertheless they are available for special applications.

Two successive returns with no text will terminate the program (except in text being read from an @ disk file.)
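
The input conventions described in this section can be summarized in a short sketch. It is not the program's actual code: the names classify_input and until_double_blank are hypothetical, and the handling of stray blanks is an assumption.

```python
def classify_input(line):
    """Classify one line of keyboard input by its first character."""
    text = line.strip()            # whitespace handling is an assumption
    if text == "":
        return ("blank", "")
    if text[0] == "#":
        return ("change_parameters", "")
    if text[0] == "!":
        return ("developer_parameters", "")
    if text[0] == "@":
        return ("input_file", text[1:])   # file name follows with no space
    return ("words", text)

def until_double_blank(lines):
    """Yield classified lines until two successive blank lines are seen."""
    blanks = 0
    for line in lines:
        kind, payload = classify_input(line)
        if kind == "blank":
            blanks += 1
            if blanks == 2:
                return
            continue
        blanks = 0
        yield kind, payload
```

Note that this models keyboard input only; as stated above, blank lines inside a file read with @ do not terminate the program.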

Modes of Operation

The mode of operation of WORDS can be specialized by setting some combination of available parameters. Here are a couple of example situations.

If you want only meanings to show up, set the # parameter
DO_ONLY_MEANINGS => Yes

If you do not even want to see the dictionary form (principal parts) set # parameter
DO_DICTIONARY_FORM => No

If you want to accept only the dictionary entry (amo, but not amas), set the ! parameter (this is the tricky one, requiring two parameters set)
DO_ONLY_INITIAL_WORD => Yes

This will then require you to input one entry per line, which is not unreasonable for a dictionary look-up process. Then you will be offered another, otherwise unavailable, option
FOR_WORD_LIST_CHECK => Yes

There are a large number of other options. The user is invited to consider all the options if needing anything more than the basic parse.

Of course, for both sets of parameters, you will want to go to the end of the parameter setting menu and save this set so you can restart with the same situation.

Command Line Operation

The main mode of usage for WORDS is a simple call, followed by screen interaction.

But there are other, command line, options. WORDS may be called with arguments on the same line, in a number of different modes. The program will execute with these arguments as input. Remember that the saved parameter settings (in WORD.MOD and WORD.MDV) are controlling, even for command line input.

Single argument, either a simple Latin word or an input file.

WORDS amo
which will cause it to execute for that input and then terminate. This is for a quick word.

WORDS infile
causes WORDS to execute with the contents of the infile. The infile may be from any folder if the full path name is given.

With two arguments the options are: inputfile and outputfile, two Latin words, or a language shift to English (Latin being the startup default) and an English word (with no part of speech).

WORDS infile outfile
The program will read as input the INFILE and write the output to the OUTFILE (as though it were WORD.OUT). It will then await further input from the user. It terminates with a return. If the parameters are not legal file names, the program will assume they are Latin words to be processed as command line input.

WORDS amo amas

WORDS ^e love
switches to English input from the default Latin and searches for love.

With three arguments there could be three Latin words or a language shift and an English word and part of speech.

WORDS amo amas amat

WORDS ^e love v

More than three arguments must all be Latin words.

WORDS amo amas amat amamus amatis amant

There cannot be more than one English word in the argument list, since there can only be one English word per line for WORDS input.
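
The argument rules above can be gathered into one sketch. This is a hedged illustration, not the program's own logic: the function name interpret_args is hypothetical, and testing whether the first argument is an existing file is an assumption standing in for the 'legal file name' check described above.

```python
import os

def interpret_args(args, is_file=os.path.isfile):
    """Decide how a list of command line arguments would be read."""
    n = len(args)
    if n == 0:
        return ("interactive",)
    if args[0].lower() == "^e":
        # Language shift: exactly one English word, optional part of speech.
        if n == 2:
            return ("english", args[1], None)
        if n == 3:
            return ("english", args[1], args[2])
        raise ValueError("expected ^e WORD [PART_OF_SPEECH]")
    if n == 1 and is_file(args[0]):
        return ("file", args[0], None)        # WORDS infile
    if n == 2 and is_file(args[0]):
        return ("file", args[0], args[1])     # WORDS infile outfile
    return ("latin", list(args))              # anything else: Latin words

print(interpret_args(["^e", "love", "v"]))
```

The is_file argument is there only so the decision can be tried out without touching the disk.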

An input file (either from interactive with @ or from command line) can have changes of language, but the ^E or ^L must be on a separate line. Note that this capability can create confusing situations. An input file that starts off Latin then switches to English will be correctly processed. But if it is followed by a similar input file, the second file will start off English (from the setting in the earlier file) and fail on the Latin input. Thus even submitting the same file twice in a run will give different results. This problem can be alleviated by starting each input file with an explicit language instruction, but this will not normally be the situation.
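
The carry-over of the language setting from one input file to the next can be shown with a small model. This is a sketch of the behavior described above, not the program's code; the function name tag_languages is hypothetical.

```python
def tag_languages(lines, language="latin"):
    """Pair each input line with the language in force when it is read.

    Returns the tagged lines and the language still in force at the end,
    which is what the next input file in the same run would inherit.
    """
    tagged = []
    for line in lines:
        text = line.strip()
        if text.upper() == "^E":
            language = "english"
        elif text.upper() == "^L":
            language = "latin"
        elif text:
            tagged.append((language, text))
    return tagged, language

infile = ["amo", "^E", "love"]
first, state = tag_languages(infile)          # starts in the Latin default
second, _ = tag_languages(infile, state)      # same file, submitted again
print(first[0], second[0])
```

On the second pass the first line, amo, is read as English. Putting ^L on the first line of each file avoids this.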

Latin-to-English Examples

Following are annotated examples of output. Examination of these will give a good idea of the system. The present version may not match these examples exactly - things are changing - but the principle is there. A recent modification is the output of dictionary forms or 'principal parts' (shown below for some examples).

=>agricolarum
agricol.arum         N      1 1 GEN P M                 
agricola, agricolae  N    M   [XAXBO]  
farmer, cultivator, gardener, agriculturist; plowman, countryman, peasant;

This is a simple first declension noun, and a unique interpretation. The '1 1' means it is first declension, with variant 1. This is an internal coding of the program, and may not correspond exactly with the grammatical numbering. The 'N' means it is a noun. It is the form for genitive (GEN), plural ('P'). The stem is masculine (M). The stem is given as 'agricol' and the ending is 'arum'. The stem is normal in this case, but is a product of the program, and may not always correspond to conventional usage.
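
For those who process WORD.OUT with their own tools, a noun line like the one above can be taken apart by splitting on blanks. This is a minimal sketch that assumes the seven-field layout shown in this example; other parts of speech have different fields, and the function name parse_noun_line is hypothetical.

```python
def parse_noun_line(line):
    """Split a noun inflection line into named fields."""
    fields = line.split()
    if len(fields) != 7 or fields[1] != "N":
        raise ValueError("not a noun inflection line: " + line.strip())
    form, _pos, declension, variant, case, number, gender = fields
    # The program marks the boundary between stem and ending with a dot.
    stem, _dot, ending = form.partition(".")
    return {
        "stem": stem,
        "ending": ending,
        "declension": int(declension),
        "variant": int(variant),
        "case": case,
        "number": number,
        "gender": gender,
    }

print(parse_noun_line("agricol.arum         N      1 1 GEN P M"))
```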

On the next line is given the expansion of the form that one might find in a paper dictionary, the nominative and genitive (agricola, agricolae). The [XAXBO] is an internal code of the program and is documented below as Dictionary Codes. Several codes are associated with each dictionary entry (presently AGE, AREA, GEO, FREQ, SOURCE). These provide some information to enhance the interpretation of the dictionary entry. In this case, the interesting piece is the B, which signifies that this word is found frequently in texts, in the top 10 percent. The O says it has been verified in the Oxford Latin Dictionary. The A says it is an agricultural word.
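
The five letters in the brackets can be matched to their code names by position. This is a minimal sketch, assuming the letters appear in the order listed above (AGE, AREA, GEO, FREQ, SOURCE), which agrees with this example; the function name decode_flags is hypothetical, and the meaning of each letter is left to the Dictionary Codes section.

```python
FLAG_NAMES = ("AGE", "AREA", "GEO", "FREQ", "SOURCE")

def decode_flags(code):
    """Map a bracketed code such as '[XAXBO]' to its five named letters."""
    letters = code.strip().strip("[]")
    if len(letters) != len(FLAG_NAMES):
        raise ValueError("expected five code letters, got: " + code)
    return dict(zip(FLAG_NAMES, letters))

print(decode_flags("[XAXBO]"))
```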

The declension/conjugation numbers for nouns and verbs are essentially arbitrary (but will be familiar to Latin students). The variants are complete inventions. They have no real meaning, just codes for the program.

(In the case of adjectives, they are even more arbitrary, although a Latin student might see how I came by them. Again they are only codes for the program. The initial release of the program did not put these out, but there is some interest on the part of students, so they are now included. The user may ignore them altogether. There is no relation between the declension/variant codes of a noun and the accompanying adjective. They only agree in case, number, and gender (NOM S N), which are listed in the output.)

=>feminae
femin.ae             N      1 1 GEN S F                 
femin.ae             N      1 1 DAT S F                 
femin.ae             N      1 1 NOM P F                 
femin.ae             N      1 1 VOC P F                 
femina, feminae  N    F   [XXXAX]  
woman; female;

This word has several possible interpretations in case and number (Singular and Plural). The gender is Feminine. Presumably, the user can examine the adjoining words and reduce the set of possibilities.

=>cornu
corn.u               N      4 1 ABL S F                 
cornus, cornus  N    F   [XXXCO]  
cornel-cherry-tree (Cornus mas); cornel wood; javelin (of cornel wood);
corn.u               N      4 2 NOM S N                 
corn.u               N      4 2 DAT S N                 
corn.u               N      4 2 ABL S N                 
corn.u               N      4 2 ACC S N                 
cornu, cornus  N    N   [XXXAO]  
horn; hoof; beak/tusk/claw; bow; horn/trumpet; end, wing of army; mountain top;
*

Here is an example of another declension and two variants. The Masculine (and few Feminine) (-us) nouns of the declension are '4 1' and the Neuter (-u) nouns are coded as '4 2'. This word has both. The horn parse is very frequent (A), while the cornel option (C) is less so but still common.

=>ego
ego                  PRON   5 1 NOM S C                 
 [XXXAX]  
I, me; myself;

A pronoun is much like a noun. The gender is common (C), that is, it may be masculine or feminine. For some odd words, especially including pronouns, there is no dictionary form given.

=>illud
ill.ud               PRON   6 1 NOM S N                 
ill.ud               PRON   6 1 ACC S N                 
ille, illa, illud  PRON   [XXXAX]  
that; those (pl.); also DEMONST; that person/thing; the well known; the former;
*

The asterisk means that there are other, less probable forms which have been trimmed, but which may be recovered by running with the TRIM parameter reset.

=>hic
h.ic                 PRON   3 1 NOM S M                 
hic, haec, hoc  PRON   [XXXAX]  
this; these (pl.); also DEMONST;
hic                  ADV    POS                         
hic                 ADV   [XXXCX]  
here, in this place; in the present circumstances;

In this case there is an adjectival/demonstrative pronoun, or it may be an adverb. The POS means that the comparison of the adverb is positive.

=>bonum
bon.um               N      2 1 ACC S M                 
bonus, boni  N    M   [XXXCO]  
good/moral/honest/brave man; man of honor, gentleman; better/rich people (pl.);
bon.um               N      2 2 NOM S N                 
bon.um               N      2 2 ACC S N                 
bonum, boni  N    N   [XXXAO]  
good, good thing, profit, advantage; goods (pl.), possessions, wealth, estate;
bon.um               ADJ    1 1 NOM S N POS             
bon.um               ADJ    1 1 ACC S M POS             
bon.um               ADJ    1 1 ACC S N POS             
bonus, bona -um, melior -or -us, optimus -a -um  ADJ   [XXXAO]  
good, honest, brave, noble, kind, pleasant, right, useful; valid; healthy;
*

Here we have an adjective, but it might also be a noun. The interpretation of the adjective says that it is POSitive, and that is the meaning listed, as is the convention for all dictionaries. The user must generate from this the meanings for other comparisons. Check the comparison value before deciding on the real meaning. Again, there is an asterisk, indicating further inflected forms were trimmed out.

=>facile
facil.e              ADJ    3 2 NOM S N POS             
facil.e              ADJ    3 2 ABL S X POS             
facil.e              ADJ    3 2 ACC S N POS             
facilis, facile, facilior -or -us, facillimus -a -um  ADJ   [XXXAX]  
easy, easy to do, without difficulty, ready, quick, good natured, courteous;
facile               ADV    POS                         
facile, facilius, facillime  ADV   [XXXBO]  
easily, readily, without difficulty; generally, often; willingly; heedlessly;
*

Here is an adjective or an adverb. Although they are related in meaning, they are different words.

=>acerrimus
acerri.mus           ADJ    3 3 NOM S M SUPER           
acer, acris -e, acrior -or -us, acerrimus -a -um  ADJ   [XXXAO]  
sharp, bitter, pointed, piercing, shrill; sagacious, keen; severe, vigorous;

Here we have an adjective in the SUPERlative. The meanings are all POSitive and the user must add the -est by himself.

=>optime
optime               ADV    SUPER                       
bene, melius, optime  ADV   [XXXAO]  
well, very, quite, rightly, agreeably, cheaply, in good style; better; best;
opti.me              ADJ    1 1 VOC S M SUPER           
bonus, bona -um, melior -or -us, optimus -a -um  ADJ   [XXXAO]  
good, honest, brave, noble, kind, pleasant, right, useful; valid; healthy;

Here is an adjective or an adverb; both are SUPERlative.

=>monuissemus
monu.issemus         V      2 1 PLUP ACTIVE  SUB 1 P    
moneo, monere, monui, monitus  V   [XXXAX]  
remind, advise, warn; teach; admonish; foretell, presage;

Here is a verb for which the form is PLUPerfect, ACTIVE, SUBjunctive, 1st person, Plural. It is 2nd conjugation, variant 1.

=>amat
am.at                V      1 1 PRES ACTIVE  IND 3 S    
amo, amare, amavi, amatus  V   [XXXAO]  
love, like; fall in love with; be fond of; have a tendency to;

Another regular verb, PRESent, ACTIVE, INDicative.

=>amatus
amat.us              VPAR   1 1 NOM S M PERF PASSIVE PPL
amo, amare, amavi, amatus  V   [XXXAO]  
love, like; fall in love with; be fond of; have a tendency to;
amat.us              ADJ    1 1 NOM S M POS             
amatus, amata, amatum  ADJ   [XXXEO]    uncommon
loved, beloved;

Here we have the PERFect, PASSIVE ParticiPLe, in the NOMinative, Singular, Masculine. In addition, there is the ADJective that is formed from this participle. If the ADJective is common, it will likely have its own dictionary entry. Sometimes there may be a special or idiomatic meaning not obvious from the verb, or the meaning may stray from the original. In this case, the verb is very frequent, but the use as an adjective is uncommon.

=>amatu
amat.u               SUPINE 1 1 ABL S N                 
amo, amare, amavi, amatus  V   [XXXAO]  
love, like; fall in love with; be fond of; have a tendency to;

Here is the SUPINE of the verb in the ABLative Singular.

=>orietur
ori.etur             V      4 1 FUT          IND 3 S    
orior, oriri, oritus sum  V    DEP   [XXXAO]  
rise (sun/river); arise/emerge, crop up; get up (wake); begin; originate from;
be born/created; be born of, decend/spring from; proceed/be derived (from);
ori.etur             V      3 1 FUT          IND 3 S    
orior, ori, ortus sum  V    DEP   [XXXBO]  
rise (sun/river); arise/emerge, crop up; get up (wake); begin; originate from;
be born/created; be born of, decend/spring from; proceed/be derived (from);

For DEPonent verbs the passive form is to be translated as if it were active voice, so there is no VOICE given in the output.

=>ab
ab                   PREP   ABL                         
ab  PREP  ABL   [XXXAO]  
by (agent), from (departure, cause, remote origin/time); after (reference);

Here is a PREPosition that takes an ABLative for an object.

=>sine
sin.e                N      2 2 NOM P N                 
sin.e                N      2 2 ACC P N                 
sinum, sini  N    N   [XXXCX]  
bowl for serving wine, etc;
sin.e                V      3 1 PRES ACTIVE  IMP 2 S    
sino, sinere, sivi, situs  V   [XXXAX]  
allow, permit;
sine                 PREP   ABL                         
sine  PREP  ABL   [XXXAX]  
without;
*

Here is a PREPosition that might also be a Verb or a Noun. While as a preposition it is so common that it is unlikely that any other use would occur, there is no way to indicate that. Just be reminded that the frequency given for a verb is for the sum of all the couple of hundred forms of the verb, not just the one form that is parsed.

=>contra
contra               ADV    POS                         
contra              ADV   [XXXAO]  
facing, face-to-face, in the eyes; towards/up to; across; in opposite direction;
against, opposite, opposed/hostile/contrary/in reply to; directly over/level;
otherwise, differently; conversely; on the contrary; vice versa;
contra               PREP   ACC                         
contra  PREP  ACC   [XXXAO]  
against, facing, opposite; weighed against; as against; in resistance/reply to;
contrary to, not in conformance with; the reverse of; otherwise than;
towards/up to, in direction of;  directly over/level with; to detriment of;

Here is a PREPosition that might also be an ADVerb. This is a very common situation, with the meanings being much the same.

=>et
et                   CONJ                               
et                  CONJ   [XXXAX]  
and, and even; also, even;  (et ... et = both ... and);

Here is a straight CONJunction.

=>vae
vae                  INTERJ                             
vae                 INTERJ   [XXXBX]  
alas, woe, ah; oh dear;  (Vae, puto deus fio - Vespasian); Bah!, Curses!;

Here is a straight INTERJection.

=>septem
septem               NUM    2 0 X   X X CARD            
septem, septimus -a -um, septeni -ae -a, septie(n)s  NUM   [XXXAX]  
 7 - (CARD answers 'how many');

Numbers are recognized as such and given a value. An additional provision is the attempt to recognize and display the value of Roman numerals, even combinations of appropriate letters that do not parse conventionally to a value but may be ill-formed Roman numerals.

=>VII
VII                  NUM    2 0 X   X X CARD            
7  as a ROMAN NUMERAL;
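
The Roman numeral handling described above can be sketched roughly as follows. This is an illustrative Python sketch, not the Ada routine used by WORDS; the function name roman_value and the lenient treatment of ill-formed numerals are assumptions made for the example.

```python
# Illustrative sketch only; not the Ada routine used by WORDS.
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

def roman_value(text):
    """Return the value of a string of Roman numeral letters, or None.

    Uses the subtractive rule (a smaller letter before a larger one is
    subtracted) and does not insist on canonical form, so an ill-formed
    numeral such as 'IIII' still yields a value.
    """
    letters = text.upper()
    if not letters or any(ch not in ROMAN_VALUES for ch in letters):
        return None
    total = 0
    for i, ch in enumerate(letters):
        value = ROMAN_VALUES[ch]
        nxt = ROMAN_VALUES[letters[i + 1]] if i + 1 < len(letters) else 0
        total += -value if value < nxt else value
    return total
```

Under these assumptions roman_value("VII") gives 7. Any string made up only of numeral letters gets a value, which is why an ordinary word can also be reported as a number.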

Beyond simple dictionary entry words, the program can construct additional words with prefixes, suffixes and other ADDONS.

=>populusque
que                  TACKON                             
-que = and (enclitic, translated before attached word); completes plerus/uter;
popul.us             N      2 1 NOM S M                 
populus, populi  N    M   [XXXAO]  
people, nation, State; public/populace/multitude/crowd; a following;
members of a society/sex; region/district (L+S); army (Bee);

Here the input word is recognized as a combination of a base word and an enclitic (-que) tacked on. This particular enclitic is extremely common and its omission, or the omission of the process that handles it, would result in a very large number of UNKNOWNs in the output.
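
A minimal sketch of the tackon idea, in Python rather than the program's Ada. The enclitic list and the tiny set of known forms are hypothetical stand-ins for the real ADDONS and dictionary data.

```python
# Hypothetical enclitic list and toy lexicon, for illustration only.
TACKONS = ("que", "ne", "ve")
KNOWN_FORMS = {"populus", "senatus", "arma"}

def split_tackon(word):
    """Return (base, tackon) pairs whose base is a known form."""
    results = []
    for tackon in TACKONS:
        base = word[: -len(tackon)]
        if word.endswith(tackon) and base in KNOWN_FORMS:
            results.append((base, tackon))
    return results
```

With this toy data, split_tackon("populusque") yields the pair ("populus", "que"), while a word with no recognizable enclitic yields an empty list.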

=>pseudochristus
pseudo               PREFIX                             
false, fallacious, deceitful; sperious; imitation of;
christ.us            N      2 1 NOM S M                 
Christus, Christi  N    M   [XEXAO]  
Christ;

Here there is a prefix and a base. The user must make the combination into a word or phrase.

Generally, the meaning is given for the base word, as is usual for dictionaries. For the verb, it will be a present meaning, even when the tense given is perfect. For a noun, it will be the singular, and the user must interpret when the form is plural.

For an adjective, the positive meaning is given, even if a comparative or superlative form is output. The user is invited to expand to comparative (-er) and superlative (-est). For a few adjectives, the only stem in the dictionary is COMP or SUPER. When there is just one comparison, the WORDS dictionary gives that expanded meaning. This might be considered inconsistent, in that it expects the user to observe the FORM to interpret the meaning, but it is consistent with ordinary dictionary practice.

Initially there were more defective adjective entries. I had accepted assertions in OLD or L+S and others like 'comparative does not exist'. Later on I went over to the position that even if Cicero did not use it, someone might. I started generating COMP and SUPER where it seemed reasonable. One can also count on a suffix to correct most omissions, and it will.

Sometimes a word is constructed from a suffix and a stem of a different part of speech. Thus an adverb may be constructed from its adjective. It will show the base adjective meaning and an indication of how to make the adverb in English. The user must make the proper interpretation.

In some cases an adjective will be found that is a participle of a verb that is also found. The participle meaning, as inferred by the user from the verb meaning, is not superseded by the explicit adjective entry, but supplemented by it with possible specialized meanings.

English-to-Latin Examples

~E (tilde E/e plus Enter/CR) changes mode from Latin-to-English to English-to-Latin. ~L changes back.

A single input English word is followed by the desired part of speech. Omitting the part of speech defaults to all, which is not recommended for any word which can be ambiguous. Since the program is looking for a part of speech, it would be inconvenient to support the input of several English words on a line. While a (@) file of words can be processed in the English mode, it must be one word per line.
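
As a rough illustration of the input convention just described, the following Python sketch splits a line into the English word and an optional part of speech. The set of accepted abbreviations is an assumption for the example (only v, n and prep appear in the samples here), and parse_english_query is a hypothetical name.

```python
# Assumed set of part-of-speech abbreviations; the real program may accept others.
PARTS = {"n", "pron", "adj", "adv", "v", "prep", "conj", "interj", "num"}

def parse_english_query(line):
    """Split 'word [part]' into (word, part); part is None when omitted."""
    fields = line.split()
    if not fields:
        return None
    word = fields[0].lower()
    part = fields[1].lower() if len(fields) > 1 and fields[1].lower() in PARTS else None
    return word, part
```

A part of None stands for the default of searching all parts of speech.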

Output looks much like a paper dictionary entry, with form, part of speech, gender, etc. Also included are the WORDS coded declension/conjugation and the TRANS flags, which give age, frequency and source, information for the user in selecting the best translation. The output may also contain a vertical bar leading the meaning. This is a continuation symbol which states that there are other meanings for the Latin word. The user might want to run the Latin phase of WORDS to get the full set of meanings so that no unintended conflicts appear.


love v

amo, amare, amavi, amatus  V     1 1 [XXXAO]  
love, like; fall in love with; be fond of; have a tendency to;

diligo, diligere, dilexi, dilectus  V     3 1 [XXXAX]  
select, pick, single out; love, value, esteem; approve, aspire to, appreciate;

amo, amare, additional, forms  V     9 1 [BXXEO]  
love, like; fall in love with; be fond of; have a tendency to;

ardeo, ardere, arsi, arsus  V     2 1 [XXXAO]  
be on fire; burn, blaze; flash; glow, sparkle; rage; be in a turmoil/love;

adamo, adamare, adamavi, adamatus  V     1 1  TRANS   [XXXBO]  
fall in love/lust with; love passionately/adulterously; admire greatly; covet;

deamo, deamare, deamavi, deamatus  V     1 1  TRANS   [XXXCO]  
love dearly; be passionately/desperately in love with; be delighted with/obliged
*

in prep

in  PREP  ABL    [XXXAX]  
in, on, at (space); in accordance with/regard to/the case of; within (time);

ante  PREP  ACC    [XXXAO]  
in front/presence of, in view; before (space/time/degree); over against, facing;

super  PREP  ABL    [XXXAX]  
over (space), above, upon, in addition to; during (time); concerning; beyond;

in  PREP  ACC    [XXXAX]  
into; about, in the mist of; according to, after (manner); for; to, among;

prae  PREP  ABL    [XXXAX]  
before, in front; in view of, because of;

praeter  PREP  ACC    [XXXAX]  
besides, except, contrary to; beyond (rank), in front of, before; more than;
*

in

intro               ADV    [XXXAX]  
within, in; to the inside, indoors;

in  PREP  ABL    [XXXAX]  
in, on, at (space); in accordance with/regard to/the case of; within (time);

gener, generi  N     2 3  M   [XXXBX]  
son-in-law;

baro, baronis  N     3 1  M   [XXXBL]  
baron; magnate; tenant-in-chief (of crown/earl); burgess; official; husband;

sororius, sorori(i)  N     2 4  M   [XXXCX]  
sister's husband, brother-in-law;

socrus, socrus  N     4 1  M   [XXXCX]  
father-in-law; spouse's grandfather/great grandfather;
*

kill v

occido, occidere, occidi, occisus  V     3 1 [XXXAX]  
kill, murder, slaughter, slay; cut/knock down; weary, be the death/ruin of;

interficio, interficere, interfeci, interfectus  V     3 1 [XWXAX]  
kill; destroy;

consumo, consumere, consumpsi, consumptus  V     3 1  TRANS   [XXXAO]  
burn up, destroy/kill; put end to; reduce/wear away; annul; extinguish (right);

perago, peragere, peregi, peractus  V     3 1 [XXXAX]  
disturb; finish; kill; carry through to the end, complete;

dejicio, dejicere, dejeci, dejectus  V     3 1  TRANS   [XXXAS]  
|overthrow, bring down, depose; kill, destroy; shoot/strike down; fell (victim);

deicio, deicere, dejeci, dejectus  V     3 1  TRANS   [XXXAO]  
|overthrow, bring down, depose; kill, destroy; shoot/strike down; fell (victim);
*

death n

mors, mortis  N     3 3  F   [XXXAX]  
death; corpse; annihilation;

fatum, fati  N     2 2  N   [XPXAX]  
utterance, oracle; fate, destiny; natural term of life; doom, death, calamity;

funus, funeris  N     3 2  N   [XXXAX]  
burial, funeral; funeral rites; ruin; corpse; death;

nex, necis  N     3 1  F   [XXXBX]  
death; murder;

letum, leti  N     2 2  N   [XXXBX]  
death, ruin, annihilation; death and destruction;

Orcus, Orci  N     2 1  M   [XXXBX]  
god of the underworld, Dis; death; the underworld;
*

destruction n

cinis, cineris  N     3 1  C   [XXXAO]  
ashes; embers, spent love/hate; ruin, destruction; the grave/dead, cremation;

pestis, pestis  N     3 3  F   [XXXBX]  
plague, pestilence, curse, destruction;

exitium, exiti(i)  N     2 4  N   [XXXBX]  
destruction, ruin; death; mischief;

ruina, ruinae  N     1 1  F   [XXXBX]  
fall; catastrophe; collapse, destruction;

interitus, interitus  N     4 1  M   [XXXBX]  
ruin; violent/untimely death, extinction; destruction, dissolution;

excidium, excidi(i)  N     2 4  N   [XXXCX]  
ruin, destruction, military destruction; overthrow;
*

While six prioritized translations may seem like enough, and they will likely cover the needs of a student, the full set (setting # parameter to not TRIM) contains much valuable information for the advanced translator. For instance, for the verb 'live', vivo usually works, but there are other options associated with specific situations: cohabito means live together, ruror means live in the country, adjaceo means live near, judaizo means live in the Jewish manner keeping the law. These sorts of meanings are often conveyed in Latin by a single word, while in English one might just use live and a modifying word or phrase.

Design of the Meaning Line

The role and complexity of the WORDS meaning line have evolved over time. Initially it reflected an elementary, back-of-the-book, textbook dictionary with a single word or two for each entry. Nevertheless, the size of the MEAN element was set at 80 characters (as God, Hollerith and IBM decreed), as appropriate for a standard computer screen in text mode. (Depending on the system and mode of display, the output may be limited to 78 or 79 characters, but the traditional 80 characters of the century-old IBM card were chosen. They will likely appear on printed output.)

With expansion of the dictionary beyond a few thousand elementary entries and the extensive inclusion of the Oxford dictionaries, a much larger set of possible interpretations surfaced for many words, filling and exceeding the 80 character limit. A certain discipline was introduced to structure the line.

Through the many phases of development of the dictionary, standards were developed and modified and rigor was not always maintained, therefore the rules below are generally, but not universally, observed. Evolution of the dictionary is bringing it more closely in line with these rules.

A decision was made to include as many meanings and synonyms as convenient. The OLD will sometimes list a dozen or more meaning groups with notably different senses, each with several similar meanings. Presumably these different meanings were the product of different translations of the Latin word, different translators, different context, and different eras. The WORDS dictionary includes many of these synonyms, and specifically adds some more modern ones, in order to give the user inspiration for his translation. Further, it is important to give the user the full flavor of the word that various translations employ. A word with a nominal meaning of respect may be found to also mean fear (which may be the basis of all respect for the Romans), and that will certainly color the interpretation of a passage. Going the other way, one might not want to apply it to a description of Mother Teresa. Also one should be warned if an otherwise simple word also is used as a rude reference to female anatomy.

There are a couple of other factors that may influence the user in determining the appropriate meaning from the list. Some words have different meanings depending on the age. If one is reading a text written recently in modern Latin, one must consider hints about the meaning. While the classical meaning, the WORDS default, may be appropriate, if there is a line with a late AGE code or an indication of a modern dictionary source (e.g., Cal), the user should take this into consideration.

Signs and Abbreviations in Meaning
, [comma] is used to separate meanings that are similar. The philosophy has been to list a number of synonyms just to key the reader in making his translation.

; [semicolon] is used to separate sets of meanings that differ in intent. This is just a general tendency and is not always rigorously enforced.

: [colon] is used with an AREA code to specify a single special meaning appropriate for that AREA in a series of general meanings. For example, L: has the same impact as (legal) before or after a definition in meaning. This supplements the use of the AREA code in the set of flags, which implies that all or most of the meanings are associated with that area.

/ [solidus] means 'or' or gives an alternative word. It sometimes replaces the comma and is often used to compress the meaning into a short line.

(...) [parentheses] set off an optional word or modifier, e.g., '(nearly) white' means 'white' or 'nearly white', and '(matter in) dispute' means either the matter in dispute or the dispute itself. They are also used to set off an explanation, further information about the word or meaning, or an example of a translation or a word combination.

? [question mark] in a meaning implies a doubt about the interpretation, or even about the existence of the word at all. For the purposes of this program, it does not matter much. If the dubious word does not exist, no one will ask for it. If it appears in his text, the reader is warned that the interpretation may be questionable to some degree, but is what is available. It may indicate somewhat more doubt than (perh.).

~ [tilde] stands for the stem or word in question. Usually it does not have an ending affixed, as is the convention in other dictionaries, but represents the word with whatever ending is proper. It is just a space saving shorthand or abbreviation.

(~ [tilde] also is the flag for changing the language base. ~E (plus Enter/CR) changes from Latin-to-English to English-to-Latin. ~L changes back.)

=> in meaning this indicates a translation example.

abb. abbreviation.

(Def) - [Deferrari] is used to indicate an additional meaning taken from A Latin-English Dictionary of St. Thomas Aquinas by Roy J. Deferrari. This is singled out because of the importance of Aquinas. The reference is to be applied from the last semicolon before the mark. It is likely that the meaning diverges from the base by being medieval and ecclesiastical, but not so overwhelmingly as to deserve a separate entry.

(Douay) is used to designate those words for which the meaning has been derived or modified by examination of the Douay translation of the Latin Vulgate Bible of St Jerome.

(eccl.) ecclesiastical - designating a special church meaning in a list of conventional meanings, an additional meaning not sufficient to justify a separate entry with an ecclesiastical code.

esp. [especially] - indicates a significant association, but is only advisory.

(King James) or (KJames) is used to designate those words for which the meaning has been derived or modified by examination of the King James Bible in connection with the Latin Vulgate Bible of St Jerome.

(KLUDGE) This indicates that the particular form is distorted in order to make it come out correctly. This usually takes the form of a special conjugational form applied to a few words, not applicable to other words of the same conjugation or declension. The user can expect the form and meaning to be correct, but the numerical coding will be odd.

(L+S) [Lewis and Short] is used to indicate that the meaning starting from the previous semicolon is information from Lewis and Short 'A Latin Dictionary' that differs from, or significantly expands on, the meaning in the 'Oxford Latin Dictionary' (OLD) which is the baseline for this program. This is not to imply that the meaning listed is otherwise taken directly from the OLD, just that it is not inconsistent with OLD, while the L+S information is either inconsistent (likely OLD knows better) or includes meanings appropriate for late Latin writers beyond the scope of OLD. The program is just warning the reader that there may be some difference. There are cases in which this indication occurs in entries that have Lewis and Short as the source. In those cases, the basic word is in OLD but the entry is a variant form or spelling not cited there. There are cases where OLD and L+S give somewhat different spellings and meanings for the 'same' word (same in the sense that both dictionaries point to the same citation). In these cases a combination of meanings is given for both entries with the (L+S) code distinction, and the entries of different spelling or declension have the SOURCE coded.

NT [New Testament] is a reference in the Bible.
(OLD) [Oxford Latin Dictionary] is used to indicate an additional meaning taken from the Oxford Latin Dictionary in an entry that is otherwise attributed. While it is usually true that if a classical word has other than OLD as the listed source then it does not appear in that form in OLD, this is not always the case. On occasion some other dictionary gives a much better or more complete and understandable definition and the honor of source is thereto given.

OT [Old Testament] is a reference in the Bible.

Other source indicators are occasionally used and are indicated in the general description of SOURCE below.

(PASS) [passive] - indicates a special, unexpected meaning for the passive form of the verb, not easily associated with the active meaning. In addition this is often used to remind the user that compounds of facio form the passive by using the active of fio. Ex: calefio (calefacio PASS). There may be more translation information in the base word cited and the user is encouraged to refer to it.

perh. [perhaps] - denotes an additional uncertainty, but not as strong as (?).

(pl.) [plural] means that the Latin word is believed by scholars to be used (almost) always in the plural form, with the meaning stated, even though that meaning in English may be singular. If it appears in the beginning of the meaning, before the first comma, it applies to all the meanings. If it appears later, it applies only to that and later meanings. For the purpose of this program, this is only advisory. While it is used by some tools to find the expected dictionary entry, the program does not necessarily exclude a singular form in the output. While it may be true that in good, classical Latin it is never used in the singular, this does not mean that some text somewhere might not use the singular, nor that it is uncommon in later Latin. The TRIM_OUTPUT option may cause only plural forms to appear; without TRIM_OUTPUT the singular will be shown.

prob. [probably] - denotes some uncertainty, but not as much as (perh.).

pure Latin ... indicates a pure Latin term for a word which is derived from another language (almost certainly Greek).

(rude) - indicates that this meaning was used in a rude, vulgar, coarse, or obscene manner, not what one should hear in polite company. Such use is likely from graffiti or epigrams, or in plays in which the dialogue is to indicate that the characters are low or crude. Meanings given by the program for these words are more polite, and the user is invited to substitute the current street language or obscenity of his choice to get the flavor of text.

(sg.) [singular] means that the Latin word is believed by scholars to be used always in the singular. If it appears in the beginning of the meaning, before the first comma, it applies to all the meanings. If it appears later, it applies only to that and later meanings. For the purpose of this program, this is only advisory.

usu. [usually] is weakly advisory. (usu. pl.) is even weaker than (pl.) and may imply that the plural tendency occurred only during certain periods.

w/ means 'with'.
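
The comma and semicolon conventions listed at the start of this section lend themselves to mechanical splitting. The following Python sketch is only an illustration of that idea (split_meaning is a hypothetical helper, not part of WORDS); it assumes that punctuation inside parentheses belongs to an explanation and should not split the line.

```python
def split_meaning(meaning):
    """Split a meaning line into groups (by ';') of similar meanings (by ',').

    Separators inside parentheses are left alone, since parentheses set
    off explanations that may contain their own punctuation.
    """
    groups, current_group, current, depth = [], [], "", 0
    for ch in meaning:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)
        if depth == 0 and ch in ",;":
            if current.strip():
                current_group.append(current.strip())
            current = ""
            if ch == ";" and current_group:
                groups.append(current_group)
                current_group = []
        else:
            current += ch
    if current.strip():
        current_group.append(current.strip())
    if current_group:
        groups.append(current_group)
    return groups
```

Applied to the meaning line of amo, this gives one group holding 'love' and 'like' and three further single-meaning groups.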

PROGRAM DESCRIPTION


One effect of the program is to derive the structure and meaning of individual Latin words. A procedure was devised to:
examine the ending of a word;
compare it with the standard endings;
derive the possible stems that could be consistent;
compare those stems with a dictionary of stems;
eliminate those for which the ending is inconsistent with the dictionary stem (e.g., a verb ending with a noun dictionary item);
if unsuccessful, try a large set of prefixes and suffixes, and various tackons (e.g., -que);
finally, try various 'tricks' (e.g., 'ae' may be replaced by 'e', 'inp...' by 'imp...', syncope, etc.);
report any resulting matches as possible interpretations.

With the input of a word, or several words in a line, the program returns information about the possible accidence, if it can find an agreeable stem in its dictionary.

=>amo
am.o               V       1  1 PRES ACTIVE  IND  1 S       
love, like; fall in love with; be fond of; have a tendency to

To support this method, an INFLECT.SEC data file was constructed containing possible Latin endings encoded by a structure that identifies the part of speech, declension, conjugation, gender, person, number, etc. This is a pure computer encoding for a 'brute force' search. No sophisticated knowledge of Latin is used at this point. Rules of thumb (e.g., the fact, always noted early in any Latin course, that a neuter noun has the same ending in the nominative and accusative, with a final -a in the plural) are not used in the search. However, it is convenient to combine several identical endings with a general encoding (e.g., the endings of the perfect tenses are the same for all verbs, and are so encoded, not replicated for every conjugation and variant).
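
The brute-force search can be pictured with a small Python sketch. The two tables below are toy data invented for the illustration; the real INFLECT.SEC and stem files are far larger and use a different encoding, and analyze is a hypothetical name.

```python
# Toy data for illustration; not the real INFLECT.SEC or dictionary encoding.
ENDINGS = [
    # (ending, part, which, var, description)
    ("o",  "V", 1, 1, "PRES ACTIVE IND 1 S"),
    ("as", "V", 1, 1, "PRES ACTIVE IND 2 S"),
    ("as", "N", 1, 1, "ACC P"),
    ("um", "N", 4, 1, "ACC S"),
]
STEMS = [
    # (stem, part, which, var, meaning)
    ("am",   "V", 1, 1, "love, like"),
    ("port", "V", 1, 1, "carry, bring"),
    ("port", "N", 1, 1, "gate, entrance"),
    ("port", "N", 4, 1, "port, harbor"),
]

def analyze(word):
    """Try every ending; keep those whose stem and codes match the dictionary."""
    hits = []
    for ending, part, which, var, desc in ENDINGS:
        if not word.endswith(ending):
            continue
        stem = word[: len(word) - len(ending)]
        for d_stem, d_part, d_which, d_var, meaning in STEMS:
            if (d_stem, d_part, d_which, d_var) == (stem, part, which, var):
                hits.append((f"{stem}.{ending}", part, desc, meaning))
    return hits
```

No knowledge of Latin is involved: every ending that fits is tried, and an ending survives only if a dictionary stem of the same part of speech, declension or conjugation, and variant exists. With this toy data 'portas' comes back both as a verb and as a noun.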

Many of the distinguishing differences identifying conjugations come from the voiced length of stem vowels (e.g., between the present, imperfect and future tenses of a third conjugation I-stem verb and a fourth conjugation verb). These aural differences, the features that make Latin 'sound right' to one who speaks it, are not relevant in the analysis of written endings.

The endings for the verb conjugations are the result of trying to minimize the number of individual endings records, while yet keeping the structure of the inflections data file fairly readable. There is no claim that the resulting arrangement is consonant with any grammarian's view of Latin, nor should it be examined from that viewpoint. While it started from the conjugations in text books, it can only be viewed as some fuzzy intermediate step along a path to a mathematically minimal number of encoded verb endings. Later versions of the program might improve the system.

There are some egregious liberties taken in the encoding. With the inclusion of two present stems, the third conjugation I-stem verbs may share the endings of the regular third conjugation. The fourth conjugation has disappeared altogether, and is represented internally as a variant of the third conjugation (3, 4), but this is replaced for the user in output by 4 1. There is an artificial fifth conjugation for esse and others, a sixth for eo, and a seventh for other irregularities.

As an example, a verb ending record has the structure:
PART -- the part code for a verb = V;
CONjugation -- consisting of two parts:
WHICH -- a conjugation identifier - range 0..9 and
VAR -- a variant identifier on WHICH - range 0..9;
TENSE -- an enumeration type - range PRES..FUTP + X;
VOICE -- an enumeration type - range ACTIVE..PASSIVE + X;
MOOD -- an enumeration type - range IND..PPL + X;
PERSON -- person, first to third - range 1..3 + 0;
NUMBER -- an enumeration type - range S..P + X;
KEY -- which stem to be used - range 1..4;
SIZE -- number of characters - range 0..9;
ENDING -- the ending as a string of SIZE characters;
AGE and FREQ flags which are not usually visible to the user.

Thus, the entry for the ending appropriate to 'amo' (with STEM = am) is:

V 1 1 PRES IND ACTIVE 1 S X 1 o

The elements are straightforward and generally use the abbreviations that are common in any Latin text. An X or 0 represents the 'don't know' or 'don't care' for enumeration or numeric types. Details are documented below in the CODES section.

A verb dictionary record has the structure:
STEMS -- for a verb there are 4 stems;
PART -- part code for a verb = V
WHICH -- a conjugation identifier - range 0..9
VAR -- a variant identifier - range 0..9;
KIND -- enumeration type of verb - range TO_BE..PERFDEF + X;
AGE, AREA, GEO, FREQ, and SOURCE flags
MEANING -- text for English translations (up to 80 characters).

Thus, an entry corresponding to 'amo amare amavi amatus' is:

am am amav amat 
V 1 1 X            X X X X X 
love, like; fall in love with; be fond of; have a tendency to
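
To make the record layout concrete, here is a small Python sketch that reads the three-line entry shown above into named fields. It is an illustration only: the VerbEntry class and parse_verb_entry function are hypothetical, and the order of the five flags is assumed to follow the field list above (AGE, AREA, GEO, FREQ, SOURCE).

```python
from dataclasses import dataclass

# Assumed flag order, taken from the field list in the text.
FLAG_NAMES = ("AGE", "AREA", "GEO", "FREQ", "SOURCE")

@dataclass
class VerbEntry:
    stems: list
    part: str
    which: int
    var: int
    kind: str
    flags: dict
    meaning: str

def parse_verb_entry(text):
    """Parse a three-line verb entry: stems, codes, meaning."""
    stem_line, code_line, meaning = [ln.strip() for ln in text.strip().splitlines()]
    codes = code_line.split()
    return VerbEntry(
        stems=stem_line.split(),
        part=codes[0],
        which=int(codes[1]),
        var=int(codes[2]),
        kind=codes[3],
        flags=dict(zip(FLAG_NAMES, codes[4:9])),
        meaning=meaning,
    )

EXAMPLE = """am am amav amat
V 1 1 X            X X X X X
love, like; fall in love with; be fond of; have a tendency to"""
```

Calling parse_verb_entry(EXAMPLE) yields four stems, conjugation 1 variant 1, and all five flags set to X.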

Endings may not uniquely determine the stem, and therefore the right meaning. 'portas' could be the accusative plural of 'gate', or the second person, singular, present indicative active of 'carry'. In both cases the stem is 'port'. All possibilities are reported.

portas 
port.as V 1 1 PRES IND ACTIVE 2 S X 
carry, bring 

port.as N 1 1 ACC P F T 
gate, entrance; city gates; door; avenue;

And note that the same stem (port) has other uses (portus = harbor).

portum
port.um N 4 1 ACC S M T 
port, harbor; refuge, haven, place of refuge

PLEASE NOTE: It is certainly possible for the program to find a valid Latin construction that fits the input word and to have that interpretation be entirely wrong in the context. It is even possible to interpret a number, in Roman numerals, as a word! (But the number would be reported also.)

For the case of defective verbs, the process does not necessarily have to be precise. Since the purpose is only to translate from Latin, even if there are unused forms included in the algorithm these will not come up in any real Latin text. The endings for the verb conjugations are the result of trying to minimize the number of individual endings records, while keeping the structure of the base INFLECTIONS data file fairly readable.

In general the program will try to construct a match with the inflections and the dictionaries. There are some specific checks to reject certain mathematically correct combinations that do not appear in the language, but these checks are relatively few. The philosophy has been to allow a generous interpretation. A remark in a text or dictionary that a particular form does not exist must be tempered with the realization that the author probably means that it has not been observed in the surviving classical literature. This body of reference is minuscule compared to the total use of Latin, even limited to the classical period. Who is to say that further texts would not turn up such an example, even if it might not have been approved of by Cicero? It is also possible that such reasonable, if 'improper', constructs might occur in later writings by less educated, or just different, authors. Certainly English shows this sort of variation over time.

If the exact stem is not found in the dictionary, there are rules for the construction of words which any student would try. The simplest situation is a known stem to which a prefix or suffix has been attached. The method used by the program (if DO_FIXES is on, default is Yes) is to try any fixes that fit, to see if their removal results in an identifiable remainder. Then the meaning is mechanically implied from the meaning of the fix and the stem. The user may need to interpret with a more conventional English usage. This technique improves the hit performance significantly. However, in about 40% of the instances in which there is a hit, the derivation is correct but the interpretation takes some imagination. In something less than 10% of the cases, the inferred fix is just wrong, so the user must take some care to see if the interpretation makes any sense.
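
The fix-removal idea can be sketched in a few lines of Python. The prefix table and lexicon are hypothetical toy data, and the sketch works on whole words, whereas the program works on stems and inflections.

```python
# Hypothetical prefix table and toy lexicon, for illustration only.
PREFIXES = {"pseudo": "false, fallacious", "re": "back, again"}
LEXICON = {"christus": "Christ", "porto": "carry, bring"}

def try_prefixes(word):
    """Return (prefix, base, implied meaning) for each prefix whose removal leaves a known word."""
    hits = []
    for prefix, sense in PREFIXES.items():
        base = word[len(prefix):]
        if word.startswith(prefix) and base in LEXICON:
            hits.append((prefix, base, f"{sense} + {LEXICON[base]}"))
    return hits
```

The implied meaning is just the two meanings placed side by side; as noted above, turning that into conventional English is left to the user.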

This method is complicated by the tendency for prefixes to be modified upon attachment (ab+fero = aufero, sub+fero = suffero). The program's 'tricks' take many such instances into account. Ideally, one should look inside the stem for identifiable fragments. One would like to start with the smallest possible stem, and that is most frequently the correct one. While it is mathematically possible that the stem of 'actorum' is 'actor' with the common inflection 'um', no intuitive first semester Latin student would fail to opt for the genitive plural 'orum', and probably be right. To first order, the procedure ignores such hints and may report this word in both forms, as well as a verb participle. However, it can use certain generally applicable rules, like the superlative characteristic 'issim', to further guess.

In addition, there is the capability to examine the word for such common techniques as syncope, the omission of the 've' or 'vi' in certain verb perfect forms (audivissem = audissem).
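
One way to picture the syncope check is as a generator of fuller spellings that are then looked up in the usual way. The Python sketch below is an assumption about how such a step could work, not the program's actual rule set: it restores 'vi' before 's' and 've' before 'r' at every position and leaves the filtering to the dictionary search.

```python
def syncope_candidates(word):
    """Generate fuller spellings by restoring a dropped 'vi' or 've'.

    Assumed rule for this sketch: 'vi' may be restored before 's', and
    've' before 'r'. Every position is tried; a dictionary lookup would
    then discard the candidates that do not parse.
    """
    candidates = set()
    for i, ch in enumerate(word):
        if i == 0:
            continue
        if ch == "s":
            candidates.add(word[:i] + "vi" + word[i:])
        elif ch == "r":
            candidates.add(word[:i] + "ve" + word[i:])
    return sorted(candidates)
```

For 'audissem' the candidates include 'audivissem', along with nonsense forms that would find no match in the dictionary.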

If the dictionary cannot identify a matching stem, it may be possible to derive a stem from 'nearby' stems (an adverb from an adjective is the most common example) and infer a meaning. If all else fails, a portion of the possible dictionary stems can be listed, from which the user can draw in making his own guess.

Codes in Inflection Line

For completeness, the enumeration codes used in the output are listed here from the Ada statements. Simple numbers are used for person, declension, conjugations, and their variants. Not all the facilities implied by these values are developed or used in the program or the dictionary. This list is only for Version 1.97E. Other versions may be somewhat different. This may make their dictionaries incompatible with the present program.

NOTE: in print dictionaries certain information is conveyed by font encoding, e.g., the use of bold face or italics. There is no system independent method of displaying such on computers (although individual programs can handle these, each in its own unique way). WORDS uses capital letters to express some such differences, which method is system independent in present usage.


 type PART_OF_SPEECH_TYPE
          X,         --  all, none, or unknown
          N,         --  Noun
          PRON,      --  PRONoun
          PACK,      --  PACKON -- artificial for code
          ADJ,       --  ADJective
          NUM,       --  NUMeral
          ADV,       --  ADVerb
          V,         --  Verb
          VPAR,      --  Verb PARticiple
          SUPINE,    --  SUPINE
          PREP,      --  PREPosition
          CONJ,      --  CONJunction
          INTERJ,    --  INTERJection
          TACKON,    --  TACKON --  artificial for code
          PREFIX,    --  PREFIX --  here artificial for code
          SUFFIX     --  SUFFIX --  here artificial for code

  type GENDER_TYPE
          X,         --  all, none, or unknown
          M,         --  Masculine
          F,         --  Feminine
          N,         --  Neuter
          C          --  Common (masculine and/or feminine)
                       
  type CASE_TYPE
          X,         --  all, none, or unknown
          NOM,       --  NOMinative
          VOC,       --  VOCative
          GEN,       --  GENitive
          LOC,       --  LOCative
          DAT,       --  DATive
          ABL,       --  ABLative
          ACC        --  ACCusative
                     
  type NUMBER_TYPE
          X,         --  all, none, or unknown
          S,         --  Singular
          P          --  Plural

  type PERSON_TYPE is range 0..3;
  
  type COMPARISON_TYPE
          X,         --  all, none, or unknown
          POS,       --  POSitive
          COMP,      --  COMParative
          SUPER      --  SUPERlative
    
  type NUMERAL_SORT_TYPE
         X,          --  all, none, or unknown
         CARD,       --  CARDinal
         ORD,        --  ORDinal
         DIST,       --  DISTributive
         ADVERB      --  numeral ADVERB

  type TENSE_TYPE
          X,         --  all, none, or unknown
          PRES,      --  PRESent
          IMPF,      --  IMPerFect
          FUT,       --  FUTure
          PERF,      --  PERFect
          PLUP,      --  PLUPerfect
          FUTP       --  FUTure Perfect
                                              
  type VOICE_TYPE
          X,         --  all, none, or unknown
          ACTIVE,    --  ACTIVE
          PASSIVE    --  PASSIVE
  
  type MOOD_TYPE
          X,         --  all, none, or unknown
          IND,       --  INDicative
          SUB,       --  SUBjunctive
          IMP,       --  IMPerative
          INF,       --  INFinitive
          PPL        --  ParticiPLe
                                         
  type NOUN_KIND_TYPE
          X,            --  unknown, nondescript
          S,            --  Singular "only"           --  not really used
          M,            --  plural or Multiple "only" --  not really used
          A,            --  Abstract idea
          G,            --  Group/collective Name -- Roman(s)
          N,            --  proper Name
          P,            --  a Person
          T,            --  a Thing
          L,            --  Locale, name of country/city
          W             --  a place Where
                            
  type PRONOUN_KIND_TYPE
          X,            --  unknown, nondescript
          PERS,         --  PERSonal
          REL,          --  RELative
          REFLEX,       --  REFLEXive
          DEMONS,       --  DEMONStrative
          INTERR,       --  INTERRogative
          INDEF,        --  INDEFinite
          ADJECT        --  ADJECTival

   type VERB_KIND_TYPE
          X,         --  all, none, or unknown
          TO_BE,     --  only the verb TO BE (esse)
          TO_BEING,  --  compounds of the verb to be (esse)
          GEN,       --  verb taking the GENitive
          DAT,       --  verb taking the DATive  
          ABL,       --  verb taking the ABLative
          TRANS,     --  TRANSitive verb
          INTRANS,   --  INTRANSitive verb
          IMPERS,    --  IMPERSonal verb (implied subject 'it', 'they', 'God')
                     --  agent implied in action, subject in predicate
          DEP,       --  DEPonent verb
                     --  only passive form but with active meaning 
          SEMIDEP,   --  SEMIDEPonent verb (forms perfect as deponent) 
                     --  (perfect passive has active force)
          PERFDEF    --  PERFect DEFinite verb  
                     --  having only perfect stem, but with present force
                                       

The KIND_TYPEs represent various aspects of a word which may be useful to some program, not necessarily the present one. They were put in for various reasons, and later versions may change the selection and use. Some of the KIND flags are never used. In some cases more than one KIND flag might be appropriate, but only one is selected. Some seemed to be a good idea at one time, but have not since proved out. The lists above are just for completeness.

NOUN KIND is used (when set) in trimming the output, removing possibly spurious cases (for example, the locative for a person, while preserving the vocative).
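The NOUN KIND trimming just described might be sketched as follows. This is a minimal Python illustration, not the program's Ada code: the function name, the (case, number) pair layout, and the rule of never trimming away every reading are assumptions made for the example.

```python
# Hypothetical sketch: trim locative readings from a noun flagged as a person.
def trim_by_noun_kind(parses, noun_kind):
    """parses is a list of (case, number) pairs; noun_kind is a KIND code."""
    if noun_kind != "P":              # only 'a Person' is treated in this sketch
        return list(parses)
    kept = [p for p in parses if p[0] != "LOC"]
    return kept or list(parses)       # assumption: never trim away every reading

readings = [("NOM", "S"), ("VOC", "S"), ("LOC", "S")]
print(trim_by_noun_kind(readings, "P"))   # the LOC reading is dropped, VOC stays
print(trim_by_noun_kind(readings, "L"))   # a locale keeps its locative
```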

VERB KIND allows examples (when set) to give a more reasonable meaning. A DEP flag allows the example to reflect active meaning for passive form. It also allows the dictionary form to be constructed properly from stems. TRANS/INTRANS were included to give a further program a hint as to what kind of object it should expect. This flag is only now being fixed during the update. There are some verbs which, although mostly used in one way, might be either. These are assigned X rather than being broken into two entries. Splitting them would be of no particular use at this point, since it would not allow the object to be determined. The GEN/DAT/ABL flags have a related function, but are almost absent. TO_BE is used to indicate that a form of esse may be part of a compound verb tense with a participle. TO_BEING indicates a verb related to esse (e.g., abesse) which has no object and is not used to form compounds. IMPERS is used to weed out persons and forms inappropriate to an impersonal verb, and to insert a special meaning distinct from a general form associated with the same verb stem.

There is a problem in that the values of this parameter are not all orthogonal. DEP is a different sort of thing from INTRANS. There ought to be a KIND_1 and KIND_2 to separate the different classes. However, this would be overkill considering the use made of this parameter so far.

There is a more difficult DEP problem. 'Good Latin' requires that the DEP be recognized and processed to eliminate active forms. In some cases there are dictionary examples, mostly medieval, of the deponency being violated. Some of those cases have been recognized with a separate entry. This is not something that a suffix can handle appropriately, even if mechanically it can function. A better way might be to include the perfect form but still have the DEP flag, thereby allowing the trimming in most cases. This has not been done yet. But an active form would be recognized if input, especially if the text is medieval.

NUMERAL KIND and VALUE are used by the program in constructing the meaning line.

Help for Parameters

One can CHANGE_PARAMETERS by inputting a '#' [number sign] character (ASCII 35) as the input word, followed by a return. (Note that this has changed from early versions in which '?' was used.) Each parameter is listed and the user is offered the opportunity to change it from the current value by answering Y or N (any case). For each parameter there is some explanation or help. This is displayed by inputting a '?' [question mark], followed by a return. HINT: If, while going down the list, one has made all the changes desired, one need not continue to the end. Just enter a space and then give a return. The program will interpret this as an illegal entry (not Y or N) and will cancel the rest of the list, while retaining any changes made to that point.
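The dialog described above can be modelled as a small loop over the parameter list. The Python sketch below is only an illustration under stated assumptions: the function name and the dictionary of settings are hypothetical, replies are passed in as a list rather than read from the keyboard, and a bare return is assumed to keep the current value.

```python
def change_parameters(params, replies, help_text):
    """Apply one reply per parameter and return the updated settings."""
    updated = dict(params)
    replies = iter(replies)
    for name in params:
        reply = next(replies, "")
        while reply == "?":                 # show help, then ask again
            print(help_text.get(name, "(no help available)"))
            reply = next(replies, "")
        if reply == "":                     # assumption: bare return keeps the value
            continue
        if reply.upper() == "Y":
            updated[name] = True
        elif reply.upper() == "N":
            updated[name] = False
        else:                               # illegal entry, e.g. a single space:
            break                           # cancel the rest, keep changes so far
    return updated

defaults = {"TRIM_OUTPUT": True, "HAVE_OUTPUT_FILE": False, "DO_FIXES": True}
print(change_parameters(defaults, ["n", "?", "Y", " "], {}))
```

In the sample call the first parameter is turned off, help is requested for the second before it is turned on, and the space entered for the third cancels the rest of the list while keeping the two changes already made.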

Some parameters may not function in the English mode, nor is the documentation necessarily complete.

The various help displays are listed here:



TRIM_OUTPUT_HELP 
   This option instructs the program to remove from the output list of   
   possible constructs those which are least likely.  There is now a fair
   amount of trimming, killing LOC and VOC plus removing Uncommon and    
   non-classical (Archaic/Medieval) when more common results are found   
   and this action is requested (turn it off in MDV (!) parameters).     
   When a TRIM has been done, the output is followed by an asterisk (*).
   There certainly is no absolute assurance that the items removed are
   not correct, just that they are statistically less likely.
   Note that poets are likely to employ unusual words and inflections for
   various reasons.  These may be trimmed out if this parameter is on.
   When in English mode, trim just reduces the output to the top six
   results, if there are that many.  Asterisk means there are more.
                                                   The default is Y(es)  
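The English-mode behaviour in the help text above (keep the top six results, mark with an asterisk when more exist) might be sketched like this in Python. The function name is hypothetical, and the sketch assumes the results are already ordered best first.

```python
def trim_english(results, limit=6):
    """Keep the first `limit` results; return '*' as marker if any were cut."""
    kept = list(results[:limit])
    marker = "*" if len(results) > limit else ""
    return kept, marker

kept, marker = trim_english(["r1", "r2", "r3", "r4", "r5", "r6", "r7", "r8"])
print(kept, marker)
```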

HAVE_OUTPUT_FILE_HELP 
   This option instructs the program to create a file which can hold the 
   output for later study, otherwise the results are just displayed on   
   the screen.  The output file is named  WORD.OUT
   This means that one run will necessarily overwrite a previous run,    
   unless the previous results are renamed or copied to a file of another
   name.  This is available if the METHOD is INTERACTIVE, no parameters. 
   The default is N(o), since this prevents the program from overwriting 
   previous work unintentionally.  Y(es) creates the output file.        

WRITE_OUTPUT_TO_FILE_HELP 
   This option instructs the program, when HAVE_OUTPUT_FILE is on, to    
   write results to the WORD.OUT file.
   This option may be turned on and off during running of the program,   
   thereby capturing only certain desired results.  If the option        
   HAVE_OUTPUT_FILE is off, the user will not be given a chance to turn  
   this one on.  Only for INTERACTIVE running.         Default is N(o).  
   This works in English mode, but output is somewhat different so far.

DO_UNKNOWNS_ONLY_HELP 
   This option instructs the program to only output those words that it  
   cannot resolve.  Of course, it has to do processing on all words, but 
   those that are found (with prefix/suffix, if that option is on) will
   be ignored.  The purpose of this option is to allow a quick look to
   determine if the dictionary and process is going to do an acceptable  
   job on the current text.  It also allows the user to assemble a list  
   of unknown words to look up manually, and perhaps augment the system  
   dictionary.  For those purposes, the system is usually run with the   
   MINIMIZE_OUTPUT option, just producing a list.  Another use is to run 
   without MINIMIZE to an output file.  This gives a list of the input   
   text with the unknown words, by line.  This functions as a spelling   
   checker for Latin texts.  The default is N(o).                        
   This does not work in English mode, but may in the future.            
   
WRITE_UNKNOWNS_TO_FILE_HELP 
   This option instructs the program to write all unresolved words to a  
   UNKNOWNS file named  WORD.UNK
   With this option on, the file of unknowns is written, even though
   the main output contains both known and unknown (unresolved) words.   
   One may wish to save the unknowns for later analysis, testing, or to  
   form the basis for dictionary additions.  When this option is turned  
   on, the UNKNOWNS file is written, destroying any file from a previous 
   run.  However, the write may be turned on and off during a single run 
   without destroying the information written in that run.               
   This option is for specialized use, so its default is N(o).           
   This does not work in English mode, but may in the future.            

IGNORE_UNKNOWN_NAMES_HELP 
   This option instructs the program to assume that any capitalized word 
   longer than three letters is a proper name.  As no dictionary can be  
   expected to account for many proper names, many such occur that would 
   be called UNKNOWN.  This contaminates the output in most cases, and   
   it is often convenient to ignore these spurious UNKNOWN hits.  This
   option implements that mode, and calls such words proper names.       
   Any proper names that are in the dictionary are handled in the normal 
   manner.                                The default is Y(es).          

IGNORE_UNKNOWN_CAPS_HELP 
   This option instructs the program to assume that any all caps word    
   is a proper name or similar designation.  This convention is often    
   used to designate speakers in a discussion or play.  No dictionary can
   claim to be exhaustive on proper names, so many such occur that would
   be called UNKNOWN.  This contaminates the output in most cases, and   
   it is often convenient to ignore these spurious UNKNOWN hits.  This
   option implements that mode, and calls such words names.  Any similar 
   designations that are in the dictionary are handled in the normal     
   manner, as are normal words in all caps.    The default is Y(es).     
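The two name-ignoring rules above (IGNORE_UNKNOWN_NAMES and IGNORE_UNKNOWN_CAPS) can be illustrated with a short Python sketch. It applies only to a word already known to be missing from the dictionary; the function name and the returned labels are hypothetical.

```python
def classify_unknown(word, ignore_names=True, ignore_caps=True):
    """Label a word that was NOT found in the dictionary."""
    if ignore_caps and word.isupper():          # all caps: speaker label etc.
        return "name"
    if ignore_names and word[:1].isupper() and len(word) > 3:
        return "proper name"                    # capitalized, more than 3 letters
    return "UNKNOWN"

for w in ["CAESAR", "Vercingetorix", "Rex", "blorp"]:
    print(w, "->", classify_unknown(w))
```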

DO_COMPOUNDS_HELP 
   This option instructs the program to look ahead for the verb TO_BE (or
   iri) when it finds a verb participle, with the expectation of finding 
   a compound perfect tense or periphrastic.  This option can also trim
   the output, in that VPAR that do not fit (not NOM) will be excluded,
   and possible interpretations are lost.  Default choice is Y(es).
   This processing is turned off with the choice of N(o).                

DO_FIXES_HELP 
   This option instructs the program, when it is unable to find a proper 
   match in the dictionary, to attach various prefixes and suffixes and  
   try again.  This effort is successful in about a quarter of the cases 
   which would otherwise give UNKNOWN results, or so it seems in limited 
   tests.  For those cases in which a result is produced, about half give
   easily interpreted output; many of the rest are etymologically true,  
   but not necessarily obvious; about a tenth give entirely spurious     
   derivations.  The user must proceed with caution.                     
   The default choice is Y(es), since the results are generally useful.  
   This processing can be turned off with the choice of N(o).            

DO_TRICKS_HELP 
   This option instructs the program, when it is unable to find a proper 
   match in the dictionary, and after various prefixes and suffixes, to  
   try every dirty Latin trick it can think of, mainly common letter     
   replacements like cl -> cul, vul -> vol, ads -> ass, inp -> imp, etc. 
   Together these tricks are useful, but may give false positives (>10%).
   They provide for recognized variants in classical spelling.  Most of
   the texts with which this program will be used have been well edited  
   and standardized in spelling.  Now, moreover,  the dictionary is being
   populated to such a state that the hit rate on tricks has fallen to a 
   low level.  It is very seldom productive, and it is always expensive. 
   The only excuse for keeping it as default is that now the dictionary  
   is quite extensive and misses are rare.         Default is now Y(es).
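As an illustration of the letter replacements listed in the DO_TRICKS help, the following Python sketch generates alternative spellings that could then be looked up again. Only the four replacements named in the help text are included, and the direction of each arrow is taken literally. The function name and the one-replacement-at-a-time strategy are assumptions, not a description of the program's actual code.

```python
# The four replacements named in the help text, read literally as old -> new.
TRICKS = [("cl", "cul"), ("vul", "vol"), ("ads", "ass"), ("inp", "imp")]

def trick_candidates(word):
    """Return spellings with one occurrence of one pattern replaced."""
    candidates = []
    for old, new in TRICKS:
        start = word.find(old)
        while start != -1:
            candidates.append(word[:start] + new + word[start + len(old):])
            start = word.find(old, start + 1)
    return candidates

print(trick_candidates("saeclum"))   # ['saeculum']
print(trick_candidates("inpono"))    # ['impono']
```

Each candidate would still have to pass the normal dictionary search, which is why the help text describes tricks as expensive.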

DO_DICTIONARY_FORMS_HELP 
   This option instructs the program to output a line with the forms     
   normally associated with a dictionary entry (NOM and GEN of a noun,   
   the four principal parts of a verb, M-F-N NOM of an adjective, ...).  
   This occurs when there is other output (i.e., not with UNKNOWNS_ONLY).
   The default choice is N(o), but it can be turned on with a Y(es).     

SHOW_AGE_HELP 
   This option causes a flag, like '<Late>' to appear for inflection or
   form in the output.  The AGE indicates when this word/inflection was  
   in use, at least from indications in dictionary citations.  It is
   just an indication, not controlling, useful when there are choices.   
   No indication means that it is common throughout all periods.         
   The default choice is Y(es), but it can be turned off with a N(o).    

SHOW_FREQUENCY_HELP 
   This option causes a flag, like '<rare>' to appear for inflection or
   form in the output.  The FREQ indicates the relative usage of the
   word or inflection, from indications in dictionary citations.  It is
   just an indication, not controlling, useful when there are choices.   
   No indication means that it is common throughout all periods.         
   The default choice is Y(es), but it can be turned off with a N(o).    

DO_EXAMPLES_HELP 
   This option instructs the program to provide examples of usage of the 
   cases/tenses/etc. that were constructed.  The default choice is N(o). 
   This produces lengthy output and is turned on with the choice Y(es).

DO_ONLY_MEANINGS_HELP 
   This option instructs the program to only output the MEANING for a    
   word, and omit the inflection details.  This is primarily used in     
   analyzing new dictionary material, comparing with the existing.       
   However it may be of use for the translator who knows most all of     
   the words and just needs a little reminder for a few.                 
   The default choice is N(o), but it can be turned on with a Y(es).     

DO_STEMS_FOR_UNKNOWN_HELP 
   This option instructs the program, when it is unable to find a proper 
   match in the dictionary, and after various prefixes and suffixes, to  
   list the dictionary entries around the unknown.  This will likely     
   catch a substantive for which only the ADJ stem appears in dictionary,
   an ADJ for which there is only a N stem, etc.  This option should     
   probably only be used with individual UNKNOWN words, and off-line     
   from full translations, therefore the default choice is N(o).         
   This processing can be turned on with the choice of Y(es).            

SAVE_PARAMETERS_HELP 
   This option instructs the program, to save the current parameters, as 
   just established by the user, in a file WORD.MOD.  If such a file     
   exists, the program will load those parameters at the start.  If no   
   such file can be found in the current subdirectory, the program will  
   start with a default set of parameters.  Since this parameter file is 
   human-readable ASCII, it may also be created with a text editor.  If  
   the file found has been improperly created, is in the wrong format, or
   otherwise uninterpretable by the program, it will be ignored and the  
   default parameters used, until a proper parameter file is written by
   the program.  Since one may want to make temporary changes during a   
   run, but revert to the usual set, the default is N(o).                           

There is also a set of DEVELOPER_PARAMETERS that are unlikely to be of interest to the normal user. Some of these facilities may be disconnected or not work for other reasons. Additional parameters may be included without notice or documentation. The HELP may be the most reliable source of information. These parameters are mostly for use in the development process. They may be changed or examined by a similar change procedure, by inputting a '!' [exclamation mark] character, followed by a return.

                     
HAVE_STATISTICS_FILE_HELP 
   This option instructs the program to create a file which can hold     
   certain statistical information about the process.  The file is       
   overwritten for each new invocation of the program, so old data must be
   explicitly saved if it is to be retained.  The statistics are in TEXT 
   format.     The statistics file is named  WORD.STA
   This information is only of development use, so the default is N(o).  

WRITE_STATISTICS_FILE_HELP 
   This option instructs the program, with HAVE_STATISTICS_FILE, to put  
   derived statistics in a file named  WORD.STA
   This option may be turned on and off while running the program,
   thereby capturing only certain desired results.  The file is reset at 
   each invocation of the program, if the HAVE_STATISTICS_FILE is set.   
   If the option HAVE_STATISTICS_FILE is off, the user will not be given 
   a chance to turn this one on.                Default is N(o).         

SHOW_DICTIONARY_HELP 
   This option causes a flag, like 'GEN>' to be put before the meaning   
   in the output.  While this is useful for certain development purposes,
   it forces off a few characters from the meaning, and is really of no  
   interest to most users.                                               
   The default choice is N(o), but it can be turned on with a Y(es).     

SHOW_DICTIONARY_LINE_HELP 
   This option causes the number of the dictionary line for the current  
   meaning to be output.  This is of use to no one but the dictionary    
   maintainer.  The default choice is N(o).  It is activated by Y(es).   

SHOW_DICTIONARY_CODES_HELP 
   This option causes the codes for the dictionary entry for the current 
   meaning to be output.  This may not be useful to any but the most     
   involved user.  The default choice is N(o).  It is activated by Y(es).

DO_PEARSE_CODES_HELP 
   This option causes special codes to be output flagging the different  
   kinds of output lines.  01 for forms, 02 for dictionary forms, and    
   03 for meaning. The default choice is N(o).  It is activated by Y(es).
   There are no Pearse codes in English mode.
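A program consuming WORDS output could use the Pearse codes to tell the line kinds apart. The Python sketch below shows the tagging side only. The three code values come from the help text, while the function name, the dictionary keys, the single-space separator, and the sample lines are illustrative assumptions.

```python
# Code values from the help text; key names are hypothetical.
PEARSE_CODES = {"forms": "01", "dictionary forms": "02", "meaning": "03"}

def tag_line(kind, text):
    """Prefix an output line with the Pearse code for its kind."""
    return PEARSE_CODES[kind] + " " + text   # separator is an assumption

print(tag_line("forms", "am.o  V  1 1 PRES ACTIVE IND 1 S"))
print(tag_line("meaning", "love, like"))
```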

DO_ONLY_INITIAL_WORD_HELP 
   This option instructs the program to only analyze the initial word on 
   each line submitted.  This is a tool for checking and integrating new 
   dictionary input, and will be of no interest to the general user.     
   The default choice is N(o), but it can be turned on with a Y(es).     

FOR_WORD_LIST_CHECK_HELP 
   This option works in conjunction with DO_ONLY_INITIAL_WORD to allow   
   the processing of scanned dictionaries or text word lists.  It accepts
   only the forms common in dictionary entries, like NOM S for N or ADJ,
   or PRES ACTIVE IND 1 S for V.  Use only with DO_ONLY_INITIAL_WORD.
   The default choice is N(o), but it can be turned on with a Y(es).     

DO_ONLY_FIXES_HELP 
   This option instructs the program to ignore the normal dictionary     
   search and to go direct to attach various prefixes and suffixes before
   processing. This is a pure research tool.  It allows one to examine   
   the coverage of pure stems and dictionary primary compositions.       
   This option is only available if DO_FIXES is turned on.               
   This is entirely a development and research tool, not to be used in   
   conventional translation situations, so the default choice is N(o).   
   This processing can be turned on with the choice of Y(es).            

DO_FIXES_ANYWAY_HELP 
   This option instructs the program to do both the normal dictionary    
   search and then process for the various prefixes and suffixes too.    
   This is a pure research tool allowing one to consider the possibility 
   of strange constructions, even in the presence of conventional        
   results, e.g., alte => deeply (ADV), but al+t+e => wing+ed (ADJ VOC)  
   (If multiple suffixes were supported this could also be wing+ed+ly.)  
   This option is only available if DO_FIXES is turned on.               
   This is entirely a development and research tool, not to be used in   
   conventional translation situations, so the default choice is N(o).   
   This processing can be turned on with the choice of Y(es).            
         ------    PRESENTLY NOT IMPLEMENTED    ------                   

USE_PREFIXES_HELP 
   This option instructs the program to implement prefixes from ADDONS   
   whenever and wherever FIXES are called for.  The purpose of this      
   option is to allow some flexibility while the program is running to
   select various combinations of fixes, to turn them on and off,        
   individually as well as collectively.  This is an option usually      
   employed by the developer while experimenting with the ADDONS file.   
   This option is only effective in connection with DO_FIXES.            
   This is primarily a development tool, so the conventional user should 
   probably maintain the default  choice of Y(es).                       

USE_SUFFIXES_HELP 
   This option instructs the program to implement suffixes from ADDONS   
   whenever and wherever FIXES are called for.  The purpose of this      
   option is to allow some flexibility while the program is running to
   select various combinations of fixes, to turn them on and off,        
   individually as well as collectively.  This is an option usually      
   employed by the developer while experimenting with the ADDONS file.   
   This option is only effective in connection with DO_FIXES.            
   This is primarily a development tool, so the conventional user should 
   probably maintain the default  choice of Y(es).                       

USE_TACKONS_HELP 
   This option instructs the program to implement TACKONS from ADDONS    
   whenever and wherever FIXES are called for.  The purpose of this      
   option is to allow some flexibility while the program is running to
   select various combinations of fixes, to turn them on and off,        
   individually as well as collectively.  This is an option usually      
   employed by the developer while experimenting with the ADDONS file.   
   This option is only effective in connection with DO_FIXES.            
   This is primarily a development tool, so the conventional user should 
   probably maintain the default  choice of Y(es).                       

DO_MEDIEVAL_TRICKS_HELP 
   This option instructs the program, when it is unable to find a proper 
   match in the dictionary, and after various prefixes and suffixes, and 
   trying every Classical Latin trick it can think of, to go to a few that
   are usually only found in medieval Latin, replacements of caul -> col,
   st -> est, z -> di, ix -> is, nct -> nt.  It also tries some things   
   like replacing doubled consonants in classical with a single one.     
   Together these tricks are useful, but may give false positives (>20%).
   This option is only available if the general DO_TRICKS is chosen.     
   If the text is late or medieval, this option is much more useful than 
   tricks for classical.  The dictionary can never contain all spelling  
   variations found in medieval Latin, but some constructs are common.   
   The default choice is N(o), since the results are iffy, medieval only,
   and expensive.  This processing is turned on with the choice of Y(es).
   
DO_SYNCOPE_HELP 
   This option instructs the program to postulate that syncope of        
   perfect stem verbs may have occurred (e.g., aver -> ar in the perfect),
   and to try various possibilities for the insertion of a removed 'v'.
   To do this it has to fully process the modified candidates, which can
   have a considerable impact on the speed of processing a large file.
   However, this trick seldom produces a false positive, and syncope is
   very common in Latin (first year texts excepted).  Default is Y(es).  
   This processing is turned off with the choice of N(o).                
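The syncope handling described above can be pictured as expanding a contracted spelling back to a full perfect form and looking that up. The Python sketch below uses only the one pattern named in the help text (aver -> ar). The function name is hypothetical, and the real program presumably tries more patterns than this.

```python
def syncope_candidates(word, contracted="ar", full="aver"):
    """Return spellings with one 'ar' expanded back to 'aver'."""
    candidates = []
    start = word.find(contracted)
    while start != -1:
        candidates.append(word[:start] + full + word[start + len(contracted):])
        start = word.find(contracted, start + 1)
    return candidates

print(syncope_candidates("amarunt"))    # ['amaverunt']
print(syncope_candidates("laudarat"))   # ['laudaverat']
```

Every candidate must then be fully processed against the dictionary, which is the speed cost the help text mentions.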

DO_TWO_WORDS_HELP 
   There are some few common Latin expressions that combine two inflected
   words (e.g. respublica, paterfamilias).  There are numerous examples  
   of numbers composed of two words combined together.                   
   Sometimes a text or inscription will have words run together.         
   When WORDS is unable to reach a satisfactory solution with all other  
   tricks, as a last stab it will try to break the input into two words. 
   This most often fails.  Even if mechanically successful, the result is
   usually false and must be examined by the user.  If the result is     
   correct, it is probably clear to the user.  Otherwise,  beware.     
   This problem will not occur for a well edited text, such as one will
   find on your Latin exam, but sometimes with raw text.
   Since this is a last chance and infrequent, the default is Y(es).
   This processing is turned off with the choice of N(o).                
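The last-chance two-word split might be sketched in Python as below: try every split point and keep the splits where both halves are recognized. The function name, the is_known callback, and the toy lexicon are assumptions for illustration. The real program would analyze each half as an inflected word rather than test membership in a set.

```python
def split_two_words(word, is_known):
    """Return (left, right) pairs where both halves are recognized."""
    pairs = []
    for i in range(1, len(word)):
        left, right = word[:i], word[i:]
        if is_known(left) and is_known(right):
            pairs.append((left, right))
    return pairs

toy_lexicon = {"res", "publica", "pater", "familias"}   # illustrative only
print(split_two_words("respublica", toy_lexicon.__contains__))
```

With a real dictionary, short halves would match often, which is consistent with the warning that a mechanically successful split is usually false.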

INCLUDE_UNKNOWN_CONTEXT_HELP 
   This option instructs the program, when writing to an UNKNOWNS file,  
   to put out the whole context of the UNKNOWN (the whole input line on  
   which the UNKNOWN was found).  This is appropriate for processing     
   large text files in which it is expected that there will be relatively
   few UNKNOWNS.    The main use at the moment is to provide display     
   of the input line on the output file in the case of UNKNOWNS_ONLY.    

NO_MEANINGS_HELP   
   This option instructs the program to omit putting out meanings.        
   This is only useful for certain dictionary maintenance procedures.    
   The combination not DO_DICTIONARY_FORMS, MEANINGS_ONLY, NO_MEANINGS   
   results in no visible output, except spacing lines.    Default is N(o).

OMIT_ARCHAIC_HELP 
   THIS OPTION CAN ONLY BE ACTIVE IF WORDS_MODE(TRIM_OUTPUT) IS SET!
   This option instructs the program to omit inflections and dictionary  
   entries with an AGE code of A (Archaic).  Archaic results are rarely  
   of interest in general use.  If there is no other possible form, then 
   the Archaic (roughly defined) will be reported.  The default is Y(es).

OMIT_MEDIEVAL_HELP 
   THIS OPTION CAN ONLY BE ACTIVE IF WORDS_MODE(TRIM_OUTPUT) IS SET!
   This option instructs the program to omit inflections and dictionary  
   entries with AGE codes of E or later, those not in use in Roman times.
   While later forms and words are a significant application, most users 
   will not want them.  If there is no other possible form, then the     
   Medieval (roughly defined) will be reported.   The default is Y(es).  

OMIT_UNCOMMON_HELP 
   THIS OPTION CAN ONLY BE ACTIVE IF WORDS_MODE(TRIM_OUTPUT) IS SET!
   This option instructs the program to omit inflections and dictionary  
   entries with FREQ codes indicating that the selection is uncommon.    
   While these forms are a significant feature of the program, many users
   will not want them.  If there is no other possible form, then the     
   uncommon (roughly defined) will be reported.   The default is Y(es).  
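The three OMIT options above share one rule: drop the unwanted results, but report them anyway if nothing else is left. A minimal Python sketch of that rule follows. The function name, the dictionary layout of a result, and the predicate are hypothetical. Only the AGE code A for Archaic is taken from the help text.

```python
def omit_unless_alone(results, unwanted):
    """Drop results matching `unwanted`, unless that would leave nothing."""
    kept = [r for r in results if not unwanted(r)]
    return kept if kept else list(results)

def is_archaic(result):
    return result["age"] == "A"     # AGE code A = Archaic

mixed = [{"form": "first", "age": "A"}, {"form": "second", "age": "X"}]
only_archaic = [{"form": "first", "age": "A"}]
print(omit_unless_alone(mixed, is_archaic))         # archaic entry omitted
print(omit_unless_alone(only_archaic, is_archaic))  # reported: no other form
```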

DO_I_FOR_J_HELP 
   This option instructs the program to modify the output so that the j/J
   is represented as i/I.  The consonant i was written as j in cursive in
   Imperial times and called i longa, and often rendered as j in medieval
   times.  The capital is usually rendered as I, as in inscriptions.     
   If this is NO/FALSE, the output will have the same character as input.
   The program default, and the dictionary convention is to retain the j.
   Reset if this is unsuitable for your application. The default is N(o).

DO_U_FOR_V_HELP 
   This option instructs the program to modify the output so that the v
   is represented as u.  The consonant u was written sometimes as uu.
   The pronunciation was as current w, and important for poetic meter.
   With the printing press came the practice of distinguishing consonant
   u with the character v, which was common for centuries.  The practice
   of using only u has been adopted in some 20th century publications
   (OLD), but it is confusing to many modern readers.  The capital is
   commonly V in any case, as it was and is in inscriptions (easier to
   chisel).
   If this is NO/FALSE, the output will have the same character as input.
   The program default, and the dictionary convention is to retain the v.
   Reset if this is unsuitable for your application. The default is N(o).
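The two output conventions can be illustrated with a few lines of Python. This is a sketch under assumptions: the function name is hypothetical, and DO_U_FOR_V is read, by analogy with DO_I_FOR_J, as writing u in place of lower-case v while leaving capital V alone, as the remark about capitals suggests.

```python
def apply_letter_conventions(text, i_for_j=False, u_for_v=False):
    """Rewrite output letters according to the two options."""
    if i_for_j:
        text = text.replace("j", "i").replace("J", "I")
    if u_for_v:
        text = text.replace("v", "u")   # assumption: capital V is kept
    return text

print(apply_letter_conventions("Julius juvenis", i_for_j=True))
print(apply_letter_conventions("Venus vivit", u_for_v=True))
```

With both options off the text is returned unchanged, matching the statement that the output then has the same characters as the input.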

PAUSE_IN_SCREEN_OUTPUT_HELP 
   This option instructs the program to pause in output on the screen    
   after about 16 lines so that the user can read the output, otherwise  
   it would just scroll off the top.  A RETURN/ENTER gives another page. 
   If the program is waiting for a return, it cannot take other input.   
   This option is active only for keyboard entry or command line input,  
   and only when there is no output file.  It is moot if only single word
   input or brief output.                 The default is Y(es).          
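The paging behaviour described above might look like the following Python sketch. The function name and its parameters are hypothetical. The `wait` and `out` arguments are there so the sketch can be run without a keyboard, and the page size of 16 follows the "about 16 lines" in the help text.

```python
def show_paged(lines, page_size=16, wait=input, out=print):
    """Print lines, pausing for a RETURN after each full page."""
    for count, line in enumerate(lines, start=1):
        out(line)
        if count % page_size == 0 and count < len(lines):
            wait()                      # RETURN/ENTER gives another page

# Non-interactive demonstration: count the pauses instead of waiting.
pauses = []
show_paged(["line %d" % n for n in range(40)],
           wait=lambda: pauses.append(1), out=lambda s: None)
print(len(pauses))                      # 2 pauses: after lines 16 and 32
```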

NO_SCREEN_ACTIVITY_HELP 
   This option instructs the program not to keep a running screen of the 
   input.  This is probably only to be used by the developer to calibrate
   run times for large text file input, removing the time necessary to   
   write to screen.                       The default is N(o).           
 
UPDATE_LOCAL_DICTIONARY_HELP 
   This option instructs the program to invite the user to input a new   
   word to the local dictionary on the fly.  This is only active if the  
   program is not using an (@) input file!  If an UNKNOWN is discovered, 
   the program asks for STEM, PART, and MEAN, the basic elements of a    
   dictionary entry.  These are put into the local dictionary right then,
   and are available for the rest of the session, and all later sessions.
   The use of this option requires a detailed knowledge of the structure 
   of dictionary entries, and is not for the average user.  If the entry 
   is not valid, reloading the dictionary will raise an exception, and
   the invalid entry will be rejected, but the program will continue     
   without that word.  Any invalid entries can be corrected or deleted   
   off-line with a text editor on the local dictionary file.  If one does
   not want to enter a word when this option is on, a simple RETURN at   
   the STEM=> prompt will ignore and continue the program.  This option  
   is only for very experienced users and should normally be off.        
                                             The default is N(o).        
         ------    NOT AVAILABLE IN THIS VERSION   -------               

UPDATE_MEANINGS_HELP 
   This option instructs the program to invite the user to modify the    
   meaning displayed on a word translation.  This is only active if the  
   program is not using an (@) input file!  These changes are put into   
   the dictionary right then and permanently, and are available from     
   then on, in this session, and all later sessions.   Unfortunately,    
   these changes will not survive the replacement of the dictionary by a 
   new version from the developer.  Changes can only be recovered by     
   considerable processing by the developer, and should be left there.   
   This option is only for experienced users and should remain off.      
                                             The default is N(o).        
         ------    NOT AVAILABLE IN THIS VERSION   -------               
     
MINIMIZE_OUTPUT_HELP 
   This option instructs the program to minimize the output.  This is a  
   somewhat flexible term, but the use of this option will probably lead 
   to less output.                        The default is Y(es).          

SAVE_PARAMETERS_HELP 
   This option instructs the program to save the current parameters, as  
   just established by the user, in a file WORD.MDV.  If such a file     
   exists, the program will load those parameters at the start.  If no   
   such file can be found in the current subdirectory, the program will  
   start with a default set of parameters.  Since this parameter file is 
   human-readable ASCII, it may also be created with a text editor.  If  
   the file found has been improperly created, is in the wrong format, or
   otherwise uninterpretable by the program, it will be ignored and the  
   default parameters used, until a proper parameter file is written by  
   the program.  Since one may want to make temporary changes during a   
   run, but revert to the usual set, the default is N(o).                

Special Cases

Some adjectives have no conventional positive forms (either missing or undeclined), or the POS forms have more than one COMP/SUPER. In these few cases, the individual COMP or SUPER form is entered separately. Since it is not directly connected with a POS form, and only the POS forms have different numbered declensions, the special form is given a declension of (0, 0). An additional consequence is that the dictionary form in output is only for the COMP/SUPER, and does not reflect all comparisons.

Uniques

There are some irregular situations which are not convenient to handle through the general algorithms. For these a UNIQUES file and procedure were established. The number of these special cases is less than one hundred, but may increase as new situations arise, and decrease as algorithms provide better coverage. The user will not see much difference, except that no dictionary forms are available for these unique words.

Tricks

There are a number of situations in Latin writing where certain modifications or conventions are regularly found. While common, these are not the normal classical forms. If a conventional match is not found, the program may be instructed to TRY_TRICKS. Below is a partial list of current tricks.

The syncopated form of the perfect often drops the 'v' and loses the vowel.
An initial 'a' followed by a double letter often is used for an 'ad' prefix, likewise an initial 'ad' prefix is often replaced by an 'a' followed by a double letter.
An initial 'i' followed by a double letter often is used for an 'in' prefix, likewise an initial 'in' prefix is often replaced by an 'i' followed by a double letter.
A leading 'inp' could be an 'imp'.
A leading 'obt' could be an 'opt'.
An initial 'har...' or 'hal...' may be rendered by an 'ar' or 'al', likewise the dictionary entry may have 'ar'/'al' and the trial word begin with 'ha...'.
An initial 'c' could be a 'k', or the dictionary entry uses 'c' for 'k'.
A nonterminal 'ae' is often rendered by an 'e'.
An initial 'E' can replace an 'Ae'.
An 'iis...' beginning some forms of 'eo' may be contracted to 'is...'.
A nonterminal 'ii' is often replaced by just 'i'; including 'ji', since in this program and dictionary all 'j' are made 'i'.
A 'cl' could be a 'cul'.
A 'vul' could be a 'vol'.

There are many others, including a procedure to try to break the input word into two.
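
A few of the tricks listed above can be pictured as simple string rewrites that each propose an alternative spelling to look up. The following Python sketch is only an illustration and is not taken from the program's source; the function name candidate_spellings and the choice of which rules to include are assumptions made for the example.

```python
def candidate_spellings(word):
    """Return alternative spellings of `word` suggested by a few simple tricks.

    Hypothetical helper: the rules paraphrase some items of the list above
    and are not the WORDS program's own implementation.
    """
    w = word.lower()
    candidates = []

    # A leading 'inp' could be an 'imp'.
    if w.startswith("inp"):
        candidates.append("imp" + w[3:])
    # A leading 'obt' could be an 'opt'.
    if w.startswith("obt"):
        candidates.append("opt" + w[3:])
    # An initial 'a' followed by a double consonant may stand for an 'ad' prefix.
    if len(w) > 3 and w[0] == "a" and w[1] == w[2] and w[1] not in "aeiou":
        candidates.append("ad" + w[2:])
    # A nonterminal 'e' may stand for 'ae' (skip an 'e' that already follows 'a').
    for i in range(len(w) - 1):
        if w[i] == "e" and (i == 0 or w[i - 1] != "a"):
            candidates.append(w[:i] + "ae" + w[i + 1:])
    # A 'cl' could be a 'cul'.
    if "cl" in w:
        candidates.append(w.replace("cl", "cul", 1))

    # Drop duplicates while keeping the order in which the tricks were tried.
    return list(dict.fromkeys(candidates))


print(candidate_spellings("inpono"))    # ['impono']
print(candidate_spellings("seclorum"))  # ['saeclorum', 'seculorum']
```

Each candidate would still have to go through the normal recognition process; a rewrite only proposes a spelling, it does not establish that the word exists.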

Various manipulations of 'u' and 'v' are possible: 'v' could be replaced by 'u', as in the new Oxford Latin Dictionary; a leading 'U' could be replaced by 'V', checking capitalization; all 'U's could have been replaced by 'V', as in stone cutting. Previous versions had various kludges attempting to calculate the correct interpretation. They were surprisingly good, but philosophically baseless and certainly failed in a number of cases. The present version simply considers 'u' and 'v' as the same letter in parsing the word. However, the dictionary entries make the distinction and this is reflected in the output.
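
One way to picture "the same letter in parsing" is to compare words through a folded key while keeping the dictionary spelling for display. This is a minimal Python sketch under that assumption; the names uv_key and find_entries are hypothetical and say nothing about how the program itself stores or compares letters.

```python
def uv_key(word):
    """Fold case and treat 'v' as 'u', giving a key used only for comparison."""
    return word.lower().replace("v", "u")


def find_entries(word, dictionary_forms):
    """Return dictionary forms that match `word` when 'u' and 'v' are not distinguished.

    The forms are returned as spelled in the dictionary, so the u/v
    distinction made there is preserved in what the user sees.
    """
    key = uv_key(word)
    return [form for form in dictionary_forms if uv_key(form) == key]


print(find_entries("uinum", ["vinum", "unum"]))  # ['vinum']
print(find_entries("SERVVS", ["servus"]))        # ['servus']
```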

Various combinations of these tricks are attempted, and each try that results in a possible hit is run against the full dictionary, which can make these efforts time consuming. That is a good reason to make the dictionary as large as possible, rather than counting on a smaller number of roots and doing the maximum word formation.

Finally, while the program could succeed on a word that requires two or three of these tricks to work in combination, there are limits. Some words for which all the modifications are supported will fail, if there are just too many. In fact, it is probably better that that be the case, otherwise one will generate too many false positives. Testing so far does not seem to show excessive zeal on the part of the program, but the user should examine the results, especially when several tricks are involved.

There is a basic conflict here. At the state of the 1.97E dictionary there are so few words that both fail the main program and are caught by tricks that this option could be defaulted to No. However, one could argue that there will be very few occasions for trying TRICKS, so that the cost is minimal. Unfortunately the degree of completeness of the dictionary for classical Latin does not carry over to medieval Latin. With the hope that the program will become more useful in that area, the default has been set to Yes, reflecting the philosophy early in the development for classical Latin.

Trimming of uncommon results

Trimming has an impact on output. If the TRIM_OUTPUT parameter is set, and specific parameters are set in the MDEV, the program will deprecate those possible forms which come from archaic or medieval (non-classical) stems or inflections, and also stems or inflections which are relatively uncommon. It will report such forms only if no classical/common solutions are found. The default is set for this, expecting that most users are students and unlikely to encounter rare forms. Other users can set the parameters appropriately for their situation.
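
The trimming rule described above (set aside non-classical or uncommon results, but fall back to them when nothing else is left) can be sketched as a filter with a fallback. The Python below is an illustration only: the record layout, the label strings "archaic", "medieval" and "uncommon", and the function name trim are all assumptions, not the program's actual AGE and FREQ codes.

```python
def trim(parses, deprecated_ages=("archaic", "medieval"), rare_freqs=("uncommon",)):
    """Keep classical/common parses; if none remain, report everything.

    `parses` is a list of dicts with hypothetical 'age' and 'freq' labels.
    """
    kept = [
        p for p in parses
        if p["age"] not in deprecated_ages and p["freq"] not in rare_freqs
    ]
    # Fall back to the full list so the user still gets an answer.
    return kept if kept else list(parses)


parses = [
    {"form": "A", "age": "classical", "freq": "common"},
    {"form": "B", "age": "medieval", "freq": "common"},
]
print([p["form"] for p in trim(parses)])      # ['A']
print([p["form"] for p in trim(parses[1:])])  # ['B']
```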

This capability is preliminary. It is just becoming useful in that the factors are set for about half the dictionary entries. There are still a large number of entries and inflections that are not set and will continue to be reported until determination of rarity is made.

GUIDING PHILOSOPHY

Purpose

The dictionary is intended as a help to someone who knows roughly enough Latin for the document under study. It gives the accidence and meanings possible for an input Latin word. It is for someone reading Latin text.

This is a translation dictionary. Mostly it provides individual words in English that correspond to, and might be used in a translation of, words in Latin text. The program assumes a fair command of English. This is in contrast to a conventional same-language desktop dictionary which would explain the meanings of words in the same language. The distinction may be obvious but it is important. A Latin dictionary in medieval times would have explanations in Latin of Latin words.

There are various approaches to the preparation of a dictionary. The most scholarly might be to select only proper and correct entries, only correct derivations, grammar, and spelling. This would be a dictionary for one who wished to write 'correct' Latin. (Correct being defined as the way Cicero, or your favorite writer or grammarian, used it.) The current project has a different goal. This program is successful if a word found in text is given an appropriate meaning, whether or not that word is spelled in the generally approved way, or is 'good Latin'. Thus the program includes various words and forms that may have been rejected by recent scholars, but still appear in some texts. Philosophically, this program deals with Latin as it was, not as it should have been. I make no corrections to Cicero, which some might have been tempted to do if producing an academic dictionary instead of a program. Moreover I make no corrections of St Jerome. If your copy of the Vulgate has a particular spelling, that may be recognized by the program, either through a TRICK or as a dictionary entry that I have generated.

A philosophical difference from many dictionary projects is that this one has no firm model of the user or application. It is not limited to classical Latin, or to 'good practice', or to common words, or to words appearing in certain texts. As a result there will be a lot of chaff in the output. Some of this may be trimmed out automatically if desired, but it is there and available.

However inadequately, I hope to document decisions that went into the arrangement of the program and dictionary. I am surprised that there is little or no such information to the user of published dictionaries. If others generate similar products, or use the data from this one, they can do so in knowledge of how and why processes and forms were constructed.

I make few value judgments and those are mechanical, not scholarly, and are documented herein. Nevertheless some may be inappropriate, in spite of good intentions.

Method

The program subtracts possible endings from an input word and searches a list of stems, trying to make a match. If no exact match is possible, it tries various modifications, beginning with prefixes and suffixes, and eventually involving various regular spelling variations (or 'tricks') common in classical and medieval Latin.
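
The first step, subtracting endings and looking up what remains, can be sketched in a few lines of Python. The ending list and stem list below are toy data invented for the illustration, and split_candidates is a hypothetical name; the real program works from its inflection and stem files and checks far more than this.

```python
# Toy data, invented for illustration only.
ENDINGS = ["", "a", "ae", "am", "arum", "as", "is"]
STEMS = {"aqu": "water", "puell": "girl", "ros": "rose"}


def split_candidates(word):
    """Return (stem, ending, meaning) for every ending whose removal leaves a known stem."""
    word = word.lower()
    hits = []
    for ending in ENDINGS:
        if ending and not word.endswith(ending):
            continue
        # Remove the ending (the empty ending leaves the word unchanged).
        stem = word[: len(word) - len(ending)]
        if stem in STEMS:
            hits.append((stem, ending, STEMS[stem]))
    return hits


print(split_candidates("puellarum"))  # [('puell', 'arum', 'girl')]
print(split_candidates("rosas"))      # [('ros', 'as', 'rose')]
```

A real parse would also have to confirm that the ending belongs to the declension or conjugation of the stem it was matched with; that check is left out here.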

A choice was made that the base was classical Latin as defined by the Oxford Latin Dictionary (OLD). Their primary time period is, somewhat arbitrarily, roughly 100 BC to 100 AD.

The classical form of words is taken as the base. Modifications are made in such a way as to correct to this base. Further additions to local dictionaries should keep this in mind. Modifications are made to the input words, not to the dictionary stems. It could be done the other way, but the present situation was initially much easier. There are some consequences of this approach. For instance, it is easy to remove an 'h' from an input word to match with a stem. It is much more difficult (but not impossible) to add 'h' in all possible positions to check against stems.
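
The asymmetry in the 'h' example is easy to see by counting candidates. In this small Python sketch (the function names are hypothetical), removing an 'h' yields one candidate per 'h' actually present, while inserting one yields a candidate for every position in the word.

```python
def drop_one_h(word):
    """Candidates obtained by deleting a single 'h' from the word."""
    return [word[:i] + word[i + 1:] for i, ch in enumerate(word) if ch == "h"]


def insert_one_h(word):
    """Candidates obtained by inserting an 'h' at every possible position."""
    return [word[:i] + "h" + word[i:] for i in range(len(word) + 1)]


print(drop_one_h("harena"))        # ['arena']
print(len(insert_one_h("arena")))  # 6
```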

It would be possible to match most words with a relatively smaller list of stems (or roots) and generous application of word construction. This approach is not followed. One difficulty is that while words may be constructed correctly, and the underlying meaning to be found from this construction, the common usage may be obscured by a formal interpretation of the parts. In practice this occurs in 20-40% of the cases. This method is still very useful in approaching a word for which there has been no dictionary interpretation, but it puts a considerable burden on the normal user. Further, in about 10% of constructions, the result is just wrong.

In normal usage, if the program finds a simple match, it does not go further and consider what constructed words might also be valid. (One can override and force prefix/suffix construction with a switch, but one might not want to force all possible tricks.)

For instance, if there is an adjective that matches, a corresponding identically spelled, logically valid noun will not be reported unless it is explicitly found in the dictionary, even though it could be constructed or inferred from the adjective or constructed with a suffix from a verb in the dictionary.

An exception to this is that enclitics (e.g., -que) are always considered. Coloque can be a verb or collo-que. The latter is in Virgil and should not be omitted. Verb syncope is also favored. In the vast majority of cases, if there is a possible syncope it is the correct parse. This is given preference over word construction with suffix. Audii is syncope of audivi, but it could also be aud-i-i. The latter is considered very unlikely.
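
That enclitics are "always considered" means the whole-word reading and the word-plus-enclitic reading are both reported, rather than stopping at the first match. A minimal Python sketch of that idea follows; the function name readings and the toy word set are assumptions, and the enclitic list (-que from the text, plus -ne and -ve) is chosen for the example rather than taken from the program.

```python
ENCLITICS = ("que", "ne", "ve")


def readings(word, known_words):
    """Return (base, enclitic) pairs; enclitic is None for the whole-word reading."""
    word = word.lower()
    found = []
    if word in known_words:
        found.append((word, None))
    # Always try the enclitic split as well, even if the whole word matched.
    for enclitic in ENCLITICS:
        if word.endswith(enclitic) and len(word) > len(enclitic):
            base = word[: -len(enclitic)]
            if base in known_words:
                found.append((base, enclitic))
    return found


known = {"neque", "ne", "populus"}
print(readings("neque", known))       # [('neque', None), ('ne', 'que')]
print(readings("populusque", known))  # [('populus', 'que')]
```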

There are a large number of paths and possibilities. Choices have been made in the code that result in the exclusion of some. It is hoped that they were the best choices. The method was constructed by taking a number of primary procedures and combining/assembling them in such a way as to give reasonable parses for a number of test cases. Basically, this is hacking, but it might be considered an empirical starting point from which one could construct a logical rationale.

Therefore, the philosophy is to populate the stem list as densely as possible. Even easily resolved differences are included redundantly (adligo as well as alligo; ad- accounts for most of the duplicates). The advantage is that while regular single-letter modifications are fairly easy, and two letter differences are possible (but more expensive), further deviations are problematical. The better populated the stem list, the better the chance of a result.

Even in easy cases the overpopulation is helpful. Antebasis is easily parsed as ante-basis ('pedestal before', which is reasonable), but inclusion as a separate word allows the additional information that it is the hindmost pillar of the pedestal of a ballista.

The stem list is also populated with variants suggested by different sources. The problem is that the remains of classical Latin have gone through many monks along the way. These copyists may have made simple mistakes (typos!), or have made what they thought were proper corrections (spell checkers!). And twenty centuries later scholars work hard to reassemble the best Latin to present in the dictionary. But a particular document in the form presented to the reader may have a variety of spellings for exactly the same word in the same referenced passage (Pliny's Natural History is often subject to this problem). (It may even be that modern texts and dictionaries have misprints!) All forms found in various dictionaries can be included, with the exception of those explicitly labeled 'misread' (and the argument probably could mandate their inclusion also). However, a single example of a variant in one case will not be included as a dictionary entry. If such a word is sufficiently important, if it is used frequently or by several authors, it will be entered as a UNIQUE.

Lewis and Short seem to be more willing than the more recent Oxford Latin Dictionary to raise a few examples of variation to an entry (at least an alternate). Generally, I make an entry if some dictionary does so. But within an entry I generate additional possible stems not noted elsewhere, e.g., I expand first declension verbs with '-av' perfect stems, even though no example exists in classical Latin. This is often the practice in other dictionaries also.

Verb parts omitted from source dictionaries are mechanically added where it is clear (e.g., where the base verb is documented, but parts are omitted in compounds). Whether Cicero used them or not, some later text might.

In some cases I also have expanded adjectives and adverbs to include comparative and superlative stems where they seem reasonable or have corresponding English instances, even when there is no specific dictionary citation. This effort was motivated primarily by finding examples of such comparisons in processing of large amounts of text beyond the classical works upon which authoritative dictionaries are based, but even classical works yielded examples. The point is that, while these forms would usually be caught by the word formation (prefix/suffix) process in the program, the process is limited to how many operations can be done serially. Having more/expanded stems allows another level of word modification to be implemented.

Adjectives are extrapolated to COMP and SUPER where it makes sense (when those meanings are reasonable, and in many cases they are not) even if the source dictionary only lists POS. They are expanded fully, especially when the source lists a COMP but no SUPER.

Perhaps a bit out of context, consider the common question of SECLORUM in the Great Seal of the USA. This pure word is not in any dictionary I know of, not the OLD or L+S. A simple trick gives seculorum (seculum = world), but the favored translation is from the twice modified saeculorum (saeculum = age), which would not be found by a minimalistic system.
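
The SECLORUM case can be traced step by step as two rewrites applied in series. The Python below is a sketch for illustration: the two helper functions are hypothetical stand-ins for the 'cl' to 'cul' and 'e' to 'ae' tricks, and the FORMS table holds only the two forms and glosses mentioned above.

```python
# Only the two forms discussed in the text; glosses as given there.
FORMS = {"seculorum": "seculum = world", "saeculorum": "saeculum = age"}


def cl_to_cul(word):
    """Replace the first 'cl' by 'cul'."""
    return word.replace("cl", "cul", 1)


def e_to_ae(word):
    """Replace the first nonterminal 'e' by 'ae'."""
    i = word.find("e", 0, len(word) - 1)
    return word if i < 0 else word[:i] + "ae" + word[i + 1:]


once = cl_to_cul("seclorum")
twice = e_to_ae(once)
print(once, "->", FORMS.get(once))    # seculorum -> seculum = world
print(twice, "->", FORMS.get(twice))  # saeculorum -> saeculum = age
```

The input word itself is in neither table entry, one rewrite reaches the first reading, and only the second rewrite reaches the favored one.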

It is often the practice in paper dictionaries to double up on an entry that may be either adjective or noun, usually by leading with the adjective and mentioning its use as a noun. A much larger set of adjective/noun pairs is favored with separate entries. It is the philosophy of this program to make separate entries whenever there is an example in any reference dictionary. This might facilitate the task of a larger translation program which would handle phrases or sentences. However there has been no effort to explicitly generate such pair expansion if there is no precedent, and the user must still recognize the possibility of unexpanded multiple possibilities for substantives.

An argument against a large stem list is that it increases the storage required (but this is extremely modest by current standards) and increases processing time for search of the stems (this is far offset by the processing which would be required to construct or analyze words working from a smaller stem list).

A significant objection is that artificially generated stems may conflict with real/common ones and produce false output confusing to the user. A certain amount of this is eliminated by trimming the output to emphasize the most probable results, but it is still a problem.

Perhaps a counterexample would be an inferred fourth stem to no/nare (swim). Natus conflicts with the fourth stem of nascor (be produced/born) and the nouns and adjectives stemming from it. The nare natus does not appear in dictionaries, nor does it occur in compounds of nare, so it has been omitted from the WORDS dictionary.

Additional parts of verbs are included (first conjugation is easily filled out, even eccentric verbs if they are compounds of known parts), although they may not have been found in any well known texts. Cases can be logically constructed that are 'missing' in classical Latin. Verbs with prefix can be expanded when the base is known. That a form has not been found in surviving copies of classical texts does not mean that it was not on the lips of every centurion and his girl friend, or that it might not find its way into medieval texts.

It may be argued in some cases that forms are missing because their pronunciation would be awkward. This may well be true when Cicero is the arbiter, but others may not be so elegant. Moreover, much of the texts are represented by medieval documents, Latin that was written but may not have been spoken, so the problem did not arise. However, I might be willing to accept this argument for considering carefully some perfect stems of first conjugation verbs which otherwise would end in -avav. In the end, the only one I found that I could not support was lavo (wash), and its compounds, for which the perfect is lavi.

In some cases there are good reasons not to do the mathematical expansion, and these are pointedly avoided. There is no mechanical generation of, for instance, conl- words for every coll- word, unless there is some citation or reasonable rationale. They may be paired in almost every case, but, for instance, collis and collyra are not. However, forms that are mentioned in dictionaries explicitly, or implicitly by being derived from words having variant forms, are included in order to reduce the dependence on 'tricks'. OLD has a conp- for almost every comp- (except derivatives from como). Rare exceptions seem to be rare words for which few examples (or only one) exist. Even in some of these cases, OLD (mechanically?) gives two forms. L+S follows the same pattern, except for words of late Latin (which would not be found in OLD). It is presumed that the general practice in later times was always to use comp-, and the program dictionary follows that. There are many acc-/adc- pairs, but OLD has a fair number of acc- words without mention of a corresponding adc-, and so the possible generation of these words has been resisted. If an example turns up in text, the appropriate trick procedure should suffice.

One suspects that some amount of analytical expansion is present even in the best dictionaries. Otherwise how can one explain four alternate spellings for a word which apparently only appears in citation as a single inscription?

In some few cases I have inferred a declension for certain very obscure Greek words which other dictionaries have treated as indeclinable (having only a single classical example of their use). My argument is that some later writer, using such a word, might attempt to decline it in a conventional manner, no matter what Vitruvius thought. I have indicated the indecl. option in the meaning.

Adjectives from participles are included if an entry is found in some reference dictionary. In some cases the adjective has a special meaning not obvious from the verb. The program will return both the adjective and the participle with its verb meaning. The user should give some additional consideration to the adjective meaning in this case. If the adjective is marked rare while the verb is common, it is likely there is reference to a special meaning.

Tricks are expensive in processing time. Each possible modification is made, then the resulting word goes through the full recognition process. If it passes, that is reported as the answer. If it fails, another trick is tried. This is effective if very few words get this far. It is expected that application of single tricks will solve most of the resolvable difficulties. It would be impractical to mechanically apply several tricks in series to a word. A large stem population reduces the likelihood of multiple tricks being required. If the dictionary is heavily and redundantly populated, tricks are rarely necessary (and therefore not an overall processing burden) and largely successful (if the input word is a valid, but unusual, variant/construction).
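
The control flow described here (normal recognition first, then one trick at a time, each followed by a full recognition pass, stopping at the first success) can be sketched as follows. This is an illustrative Python outline, not the program's code; recognize_with_tricks, toy_recognize, the KNOWN set and the two sample tricks are all hypothetical.

```python
def recognize_with_tricks(word, recognize, tricks):
    """Try normal recognition, then single tricks in order.

    `recognize` returns a result or None; `tricks` is a list of
    (name, function) pairs. Returns (result, name of trick used or None).
    """
    result = recognize(word)
    if result is not None:
        return result, None
    for name, trick in tricks:
        candidate = trick(word)
        if candidate == word:
            continue  # this trick does not apply to the word
        # Every candidate costs one full recognition pass.
        result = recognize(candidate)
        if result is not None:
            return result, name
    return None, None


# Toy stand-ins for the dictionary and for two of the tricks.
KNOWN = {"impono", "optimus"}


def toy_recognize(word):
    return word if word in KNOWN else None


TRICKS = [
    ("obt->opt", lambda w: "opt" + w[3:] if w.startswith("obt") else w),
    ("inp->imp", lambda w: "imp" + w[3:] if w.startswith("inp") else w),
]

print(recognize_with_tricks("inpono", toy_recognize, TRICKS))  # ('impono', 'inp->imp')
print(recognize_with_tricks("impono", toy_recognize, TRICKS))  # ('impono', None)
```

Since each applicable trick adds a recognition pass, the cost stays low only if most words are resolved before the loop is reached, which is the argument for a densely populated stem list.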

Further, a conventional dictionary, especially one that wishes to set a standard for proper language, excludes words that may not meet criteria of propriety, slang, misspellings, etc. This may place the onus on the reader to convert words. A computer dictionary ought to relieve the reader as much as possible. The present program may be a far way from complete, but its goal is to strive for that.

Word Meanings

The meanings listed are generally those in the literature/dictionaries. In the case of common words, there is general agreement among authors. Some uncommon words display convoluted interpretations.

Generally, the meaning is given for the base word, as is usual for dictionaries. For the verb, it will be a present meaning, even when the tense input is perfect. For an adjective, the positive meaning is given, even if a comparative or superlative form is shown. This is also so when a word is constructed with a suffix, thus an adverb constructed from its adjective will show the base adjective meaning and an indication of how to make the adverb in English.

For the level of usage for this program, and for convenience in coding, the meaning field has been fixed at 80 characters. It is possible to have multiple 80 character lines for an entry, but this is only necessary for the most common words. In order to conserve space, extraneous helpers like 'a', 'the', 'to', which sometimes appear in dictionary definitions, are generally omitted. The solidus ('/') is used both to separate equivalent English meanings and to conserve space.

I have taken it upon myself to add some interpretations and synonyms, and propose common usage for otherwise complex descriptive definitions. The idea is to prompt the reader, expecting that the text may not be that from which some dictionary copied the meaning (from some 18th century translator!).

In the meanings I only use words of which I know the meaning. I find that in some cases the Oxford Latin Dictionary uses English that is not in the Oxford English Dictionary.

Where available, the Linnean or 'scientific Latin' name is given in parentheses, mostly for plants. This is not a classical Latin name, but a modern designation. Similarity of this designation to some Latin word may not be historically significant.

The spelling of the English meanings is US (plow not plough, c