SAL Noun Code Taxonomy

General Explanation of the SAL Noun Code Taxonomy

Nouns are organized in a taxonomy of Supersets, sets, and subsets.

Nouns have twelve Supersets:

concrete
mass
animate
place
information
abstract
process (intr)
process (tr)
measure
time
aspective
unknown

All noun Supersets have sets, but only some sets have subsets.

In the following, maroon denotes noun Superset, red denotes noun set, and blue denotes noun subset.

Mnemonics for each SAL element are provided for coders and rulewriters. Internally, however, the system represents SAL codes numerically. For nouns, the numeric range of a code indicates its level in the taxonomy, as follows:

1-12 = Supersets

17-99 = sets

100-998 = subsets
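The ranges above can be checked mechanically. The following is a minimal sketch, not part of the Logos System; the function name `sal_noun_level` is hypothetical, and values that fall outside the three stated ranges are simply rejected.

```python
def sal_noun_level(code: int) -> str:
    """Return the taxonomy level implied by a numeric SAL noun code.

    Uses only the ranges stated in the text:
    1-12 Supersets, 17-99 sets, 100-998 subsets.
    """
    if 1 <= code <= 12:
        return "superset"
    if 17 <= code <= 99:
        return "set"
    if 100 <= code <= 998:
        return "subset"
    # The text assigns no level to values outside these ranges.
    raise ValueError(f"{code} is outside the stated SAL noun code ranges")


# Example with the ABpur code (6 41 748) from the hierarchy table:
levels = [sal_noun_level(n) for n in (6, 41, 748)]
```

Here `levels` holds one level name per component of the three-part code.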

Guidelines to SAL Coders

Nouns with Multiple Meanings

Many nouns fall into more than one SAL category. For example, passage can be both a conduit under Concrete, and a path under Place. It can also be a piece of writing or musical composition under Information.

Selection among the multiple meanings of a given noun can often be effected by the use of Subject Matter Codes (SMC) when entering the term in TermBuilder.

In some cases, however, Subject Matter Codes are not helpful. In such cases, the user must make an arbitrary choice among SAL codes at the time the word is entered. (Later development plans include giving the system the ability to resolve among the multiple meanings of a common noun on the basis of extra-sentential context. This capability does not presently exist in the Logos System.)
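The selection logic described above might be pictured as follows. This is an illustrative sketch only: it does not show the TermBuilder interface, the names `Sense`, `LEXICON` and `select_sense` are hypothetical, and the subject-matter labels ("engineering", "travel", "arts") are invented placeholders rather than real SMC values.

```python
from typing import Dict, List, NamedTuple, Optional


class Sense(NamedTuple):
    superset: str  # SAL Superset named in the text
    gloss: str     # meaning of the noun under that Superset
    smc: str       # hypothetical subject-matter label, not a real SMC


# The three senses of "passage" given in the text. The first entry
# stands in for the coder's arbitrary default choice at entry time.
LEXICON: Dict[str, List[Sense]] = {
    "passage": [
        Sense("Concrete", "conduit", "engineering"),
        Sense("Place", "path", "travel"),
        Sense("Information", "piece of writing or musical composition", "arts"),
    ],
}


def select_sense(noun: str, smc: Optional[str] = None) -> Sense:
    """Pick the sense matching the subject matter, else the default."""
    senses = LEXICON[noun]
    for sense in senses:
        if smc is not None and sense.smc == smc:
            return sense
    # No helpful subject-matter code: fall back to the coder's choice.
    return senses[0]
```

When the subject matter does not single out a sense, the sketch returns the first entry, which mirrors the arbitrary choice the coder has to make.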

When making coding decisions, users should observe the coding priorities listed below.

Noun Coding Priorities

A critical set of priorities governs coding choices for nouns. These priorities should be observed if translation degradation is to be avoided. The coding hierarchy, in order of importance, is as follows:

Verb-biased Nouns (See verbal abstracts set under Abstract Superset). Nouns coded for verb bias tell the system to expect a verb complement.
Verb-biased codes are critical for parsing. For example:
1. ways of cooking lentils
2. types of cooking utensils.
The verbal abstracts code given to ways in (1) biases the parser to expect a verb and therefore allows the parser to resolve cooking correctly to a verb. In (2) cooking is an adjective.
Nouns taking prepositional complementation. (See strong verbals under Abstract Superset.) For example:
- attitude towards
- interest in
- anxiety about
- phone connection to
- attention to
Prep governance codes are critical for parsing decisions regarding prepositional attachment.
Mass Nouns. Unlike count nouns, mass nouns can occur in the singular without an article or quantifier; e.g., Gold is expensive.
Mass codes are critical to parsing. For example:
1. Test gold for …
2. … test tube for… .
In (1), gold as a Mass noun helps the parser to see test as a verb. (Unlike count nouns, singular mass nouns without an article can be the object of a verb.) In (2), test must be a noun because tube is a singular count noun.

Mass-like codes occur in various places in the SAL noun taxonomy. These include:
- Mass Superset, which is mass by definition
- trees/wood subset (e.g. oak) under Concrete Superset
- edibles/color subset (e.g. orange) under Concrete Superset
- mammals/food/fur subset (e.g. fox) under Animate Superset
- fowl/food subset (e.g. duck) under Animate Superset
- remote mass subset
Nouns denoting agents. Agentive type nouns occur in various places in the SAL noun taxonomy. These include:
- Animate Superset, which is agentive by definition
- agentive set under Concrete Superset
- functional location (agentive) subset under Place Superset
- geographical entities (agentive) subset under Place Superset
- remote agentive subset (an optional subset code under any set or superset)
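
The mass-noun parsing example in the priorities above (Test gold for … versus … test tube for …) can be illustrated with a toy decision function. This is a sketch of the stated reasoning only, not the Logos parser; the two-word lexicon and the name `resolve_test` are assumptions made for illustration.

```python
# Hypothetical mini-lexicon covering only the two examples in the text.
MASS_NOUNS = {"gold"}
COUNT_NOUNS = {"tube"}


def resolve_test(next_noun: str) -> str:
    """Resolve the word 'test' from the bare singular noun that follows it.

    A singular mass noun without an article can be the object of a verb,
    so 'test' is read as a verb. A singular count noun cannot, so 'test'
    is read as a noun modifying it.
    """
    if next_noun in MASS_NOUNS:
        return "verb"
    if next_noun in COUNT_NOUNS:
        return "noun"
    return "undecided"  # outside this toy lexicon
```

With this lexicon, `resolve_test("gold")` yields the verb reading and `resolve_test("tube")` the noun reading, which is the distinction a Mass code makes available to the parser.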

SAL Noun Code Hierarchy

For nouns and noun phrases that can take more than one code, assign the code that ranks highest in the following hierarchy.

Note that Process Nouns (WC 4 and 7) are not included here. Process Noun codes are derived automatically from their verbs. (Process Noun codes are preemptive.)

Characteristic	Applicable SAL Type	Mnemonic	Numeric (Superset Set Subset)
Takes Verbal Complementation	purpose subset of ABSTRACT	ABpur	6 41 748
	method/process/procedure subset of ABSTRACT	ABmeth	6 41 733
	cause/potential/disposition subset of ABSTRACT	ABcause	6 41 602
Mass (non-count) Noun	entire MASS noun Superset	MASS	11
	trees/wood subset of CONCRETE Superset	COtrwd	3 32 855
	edibles/color subset of CONCRETE Superset	COedcol	3 18 855
	remote MASS (floating subset)	(variable)	855
Takes Prepositional Complementation	strong verbals subset of ABSTRACT (code is specific for each prep governance)	ABxxx	6 nn 749
	recorded data subset of INFORMATION	INdata	12 76
Denotes Agent	entire ANIMATE Superset	AN	5
	entire agentive set of CONCRETE Superset	COagen	3 35
	agentive geographical entity set of PLACE Superset	PLaggeo	9 94
	instructional data set of INFORMATION		12 74
	agentive functional location of PLACE Superset	PLagfunc	9 26 228
	remote agentive (floating subset)	(variable)	228

All other SAL noun codes are more or less of equal weight.
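
As a rough sketch of how a coder's choice could be checked against the table above, the following ranks candidate codes by their row position. It assumes that row order within each characteristic is also significant, that a missing number acts as a wildcard, and that "nn" in the ABxxx row means any set. The names `HIERARCHY`, `rank` and `pick_code` are hypothetical.

```python
from typing import List, Optional, Tuple

# A code is (superset, set, subset); None means "not specified".
Code = Tuple[Optional[int], Optional[int], Optional[int]]

# Numeric patterns copied from the table, top row first.
HIERARCHY: List[Code] = [
    (6, 41, 748),       # ABpur
    (6, 41, 733),       # ABmeth
    (6, 41, 602),       # ABcause
    (11, None, None),   # entire MASS Superset
    (3, 32, 855),       # COtrwd
    (3, 18, 855),       # COedcol
    (None, None, 855),  # remote MASS (floating subset)
    (6, None, 749),     # ABxxx, any set
    (12, 76, None),     # INdata
    (5, None, None),    # entire ANIMATE Superset
    (3, 35, None),      # COagen
    (9, 94, None),      # PLaggeo
    (12, 74, None),     # instructional data
    (9, 26, 228),       # PLagfunc
    (None, None, 228),  # remote agentive (floating subset)
]


def matches(code: Code, pattern: Code) -> bool:
    """True if every specified part of the pattern equals the code's part."""
    return all(p is None or p == c for p, c in zip(pattern, code))


def rank(code: Code) -> int:
    """Row index of the first matching pattern; unlisted codes rank last."""
    for index, pattern in enumerate(HIERARCHY):
        if matches(code, pattern):
            return index
    return len(HIERARCHY)


def pick_code(candidates: List[Code]) -> Code:
    """Return the candidate highest in the hierarchy (first one on ties)."""
    return min(candidates, key=rank)
```

For a noun that could be coded either COagen (3 35) or under the MASS Superset (11), `pick_code` returns the MASS code, since Mass rows sit above the agent rows. Codes that appear nowhere in the table all receive the same lowest rank, reflecting their roughly equal weight.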

A Caveat to SAL Coders

The organization of nouns into a small number of sub-classifications is inevitably somewhat arbitrary and may even seem unprincipled at times.

For example, the Logos System codes table as a supporting surface under the Concrete Superset but codes platform as a Place noun, on the grounds that a platform has human scale. Yet words like wall and fence are coded Concrete rather than Place despite their human scale.

There is no real defense of this except to repeat that any taxonomy that reduces 100,000 nouns to 100 categories is bound to incur these inconsistencies.

As one becomes familiar with SAL, idiosyncrasies such as these become less troublesome. It is only fair to say that natural language itself is riddled with unprincipled inconsistencies.