SAL Noun Code Taxonomy
General Explanation of the SAL Noun Code Taxonomy
Nouns are organized in a taxonomy of Supersets, sets, and subsets.
Nouns have twelve Supersets:
concrete
mass
animate
place
information
abstract
process (intr)
process (tr)
measure
time
aspective
unknown
All noun Supersets have sets, but only some sets have subsets.
In the following, maroon denotes noun Superset, red denotes noun set, and blue denotes noun subset.
Mnemonics for each SAL element are provided for coders and rulewriters. Internal to the system, however, SAL codes are represented numerically. For Nouns, the numeric range signifies place on the taxonomy, as follows:
1-12 = Supersets
17-99 = sets
100-998 = subsets
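The numeric ranges above can be expressed as a small lookup. This is a minimal sketch, not code from the Logos System: the function name `taxonomy_level` is hypothetical, and returning `None` for values outside the three documented ranges (for example 13-16) is an assumption, since the text does not say how such values are treated.

```python
from typing import Optional


def taxonomy_level(code: int) -> Optional[str]:
    """Map a numeric SAL noun code to its level in the taxonomy.

    The ranges come from the list above; None is an assumed
    placeholder for values that fall outside all three ranges.
    """
    if 1 <= code <= 12:
        return "superset"
    if 17 <= code <= 99:
        return "set"
    if 100 <= code <= 998:
        return "subset"
    return None
```

For instance, `taxonomy_level(41)` falls in the 17-99 range and is therefore reported as a set.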
Guidelines to SAL Coders
Nouns with Multiple Meanings
Many nouns fall into more than one SAL category. For example, passage can be both a conduit under Concrete, and a path under Place. It can also be a piece of writing or musical composition under Information.
Selection among the multiple meanings of a given noun can often be effected by the use of Subject Matter Codes (SMC) when entering the term in TermBuilder.
In some cases, however, Subject Matter Codes are not helpful. In such cases, the user must make an arbitrary choice among SAL codes at the time the word is entered. (Later development plans include giving the system the ability to resolve among the multiple meanings of a common noun on the basis of extra-sentential context. This capability does not presently exist in the Logos System.)
When making coding decisions, users should observe the coding priorities listed below.
Noun Coding Priorities
A critical set of priorities governs coding choices for nouns and should be observed if translation degradation is to be avoided. The following represents the coding hierarchy in order of importance:
Verb-biased Nouns (See verbal abstracts set under Abstract Superset). Nouns coded for verb bias tell the system to expect a verb complement.
Verb-biased codes are critical for parsing. For example:
(1) ways of cooking lentils
(2) types of cooking utensils
The verbal abstracts code given to ways in (1) biases the parser to expect a verb and therefore allows the parser to resolve cooking correctly to a verb. In (2) cooking is an adjective.
Nouns taking prepositional complementation. (See strong verbals under Abstract Superset.) For example:
attitude towards
interest in
anxiety about
phone connection to
attention to
Prep governance codes are critical for parsing decisions regarding prepositional attachment.
Mass Nouns. Unlike count nouns, mass nouns can occur in the singular without an article or quantifier; e.g., Gold is expensive.
Mass codes are critical to parsing. For example:
(1) Test gold for …
(2) … test tube for …
In (1), gold as a Mass noun helps the parser to see test as a verb. (Unlike count nouns, singular mass nouns without an article can be the object of a verb.) In (2), test must be a noun because tube is a singular count noun.
Mass-like codes occur in various places in the SAL noun taxonomy. These include:
Mass Superset, which is mass by definition
trees/wood subset (e.g. oak) under Concrete Superset
edibles/color subset (e.g. orange) under Concrete Superset
mammals/food/fur subset (e.g. fox) under Animate Superset
fowl/food subset (e.g. duck) under Animate Superset
remote mass subset
Nouns denoting agents. Agentive type nouns occur in various places in the SAL noun taxonomy. These include:
Animate Superset, which is agentive by definition
agentive set under Concrete Superset
functional location (agentive) subset under Place Superset
geographical entities (agentive) subset under Place Superset
remote agentive subset (an optional subset code under any set or superset)
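The two parsing examples in the priorities above (verb-biased nouns and mass nouns) can be illustrated with a toy lookup. This is only a sketch under stated assumptions: the mini-lexicon, its flag names (`verb_biased`, `mass`), and both function names are hypothetical, and the real Logos parser is certainly more involved than a single flag test.

```python
# Hypothetical mini-lexicon; the flags mirror the coding priorities above.
LEXICON = {
    "ways": {"verb_biased": True},    # coded as a verbal abstract
    "types": {"verb_biased": False},
    "gold": {"mass": True},           # mass noun
    "tube": {"mass": False},          # singular count noun
}


def resolve_ing_form(head_noun: str) -> str:
    """Resolve 'cooking' in '<head_noun> of cooking ...'.

    A verb-biased head noun leads the parser to expect a verb;
    otherwise the -ing form is read as an adjective.
    """
    return "verb" if LEXICON[head_noun]["verb_biased"] else "adjective"


def resolve_test(next_noun: str) -> str:
    """Resolve 'test' in 'test <next_noun> for', where next_noun is singular and has no article.

    Only a mass noun can be the bare singular object of a verb,
    so 'test' is a verb before a mass noun and a noun otherwise.
    """
    return "verb" if LEXICON[next_noun]["mass"] else "noun"
```

With this lexicon, `resolve_ing_form("ways")` yields a verb reading and `resolve_test("tube")` yields a noun reading, matching the examples in the text.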
SAL Noun Code Hierarchy
For nouns and noun phrases that can take more than one code, assign the code that ranks highest in the following hierarchy.
Note that Process Nouns (WC 4 and 7) are not included here. Process Noun codes are derived automatically from their verbs. (Process Noun codes are preemptive.)
| Characteristic | Applicable SAL Type | Mnemonic | Numeric (SS Set Subset) |
|---|---|---|---|
| Takes Verbal Complementation | purpose subset of ABSTRACT | ABpur | 6 41 748 |
| | method/process/procedure subset of ABSTRACT | ABmeth | 6 41 733 |
| | cause/potential/disposition subset of ABSTRACT | ABcause | 6 41 602 |
| Mass (non-count) Noun | entire MASS noun Superset | MASS | 11 |
| | trees/wood subset of CONCRETE Superset | COtrwd | 3 32 855 |
| | edibles/color subset of CONCRETE Superset | COedcol | 3 18 855 |
| | remote MASS (floating subset) | (variable) | 855 |
| Takes Prepositional Complementation | strong verbals subset of ABSTRACT (code is specific for each prep governance) | ABxxx | 6 nn 749 |
| | recorded data subset of INFORMATION | INdata | 12 76 |
| Denotes Agent | entire ANIMATE Superset | AN | 5 |
| | entire agentive set of CONCRETE Superset | COagen | 3 35 |
| | agentive geographical entity set of PLACE Superset | PLaggeo | 9 94 |
| | instructional data set of INFORMATION | | 12 74 |
| | agentive functional location of PLACE Superset | PLagfunc | 9 26 228 |
| | remote agentive (floating subset) | (variable) | 228 |
All other SAL noun codes are more or less of equal weight.
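Choosing the highest-ranked code from the hierarchy table can be sketched as a pattern lookup. This is an illustrative sketch, not the Logos implementation. It assumes that a code is held as a `(superset, set, subset)` tuple with `None` for missing parts, that `None` in a pattern matches any value, and that priority follows the row order of the table. The names `HIERARCHY`, `rank`, and `choose_code` are hypothetical, and Process Nouns are not handled.

```python
from typing import Optional, Sequence, Tuple

# (superset, set, subset); None marks a part the code does not have.
Code = Tuple[int, Optional[int], Optional[int]]
# Same shape, but None here means "matches any value".
Pattern = Tuple[Optional[int], Optional[int], Optional[int]]

# Rows of the hierarchy table, highest priority first.
HIERARCHY: Sequence[Pattern] = (
    (6, 41, 748),        # ABpur
    (6, 41, 733),        # ABmeth
    (6, 41, 602),        # ABcause
    (11, None, None),    # entire MASS Superset
    (3, 32, 855),        # COtrwd
    (3, 18, 855),        # COedcol
    (None, None, 855),   # remote MASS (floating subset)
    (6, None, 749),      # ABxxx, set varies with the preposition
    (12, 76, None),      # INdata
    (5, None, None),     # entire ANIMATE Superset
    (3, 35, None),       # COagen
    (9, 94, None),       # PLaggeo
    (12, 74, None),      # instructional data set of INFORMATION
    (9, 26, 228),        # PLagfunc
    (None, None, 228),   # remote agentive (floating subset)
)


def rank(code: Code) -> int:
    """Return the index of the first matching row; unlisted codes share the lowest rank."""
    for index, pattern in enumerate(HIERARCHY):
        if all(p is None or p == c for p, c in zip(pattern, code)):
            return index
    return len(HIERARCHY)


def choose_code(candidates: Sequence[Code]) -> Code:
    """Pick the candidate highest in the hierarchy; ties keep the first candidate given."""
    return min(candidates, key=rank)
```

Given the candidates `(9, 94, None)` and `(3, 18, 855)`, `choose_code` selects `(3, 18, 855)`, because the mass-like rows sit above the agentive rows. Codes that match no row all receive the same lowest rank, which reflects the statement that the remaining codes are of roughly equal weight.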
A Caveat to SAL Coders
The organization of nouns into a small number of sub-classifications is inevitably going to be arbitrary and even seem unprincipled at times.
For example, the Logos System codes table as a supporting surface under Concrete Superset and platform as a Place noun, on the grounds that the latter has human scale. Yet words like wall and fence are coded Concrete rather than Place despite their human scale.
There is no real defense of this except to repeat that any taxonomy that reduces 100,000 nouns to 100 categories is bound to incur these inconsistencies.
As one becomes familiar with SAL, idiosyncrasies such as these become less troublesome. It is only fair to say that natural language itself is riddled with unprincipled inconsistencies.