Modeling Cognitive-Linguistic Competence for MT

© 2015 Boris Gorbis. NAFI, Los Angeles, CA, USA

All rights reserved. Quoting and reproduction permitted with proper attribution.

Abstract:

Inter-language textual equivalency is established through a mediator model of generic

cognitive-linguistic competence. Parametric analysis organizes word-forms into MWCs -

multiword-complexes (V+N, N+V, N+V+O, etc.) by assigning metalingual codes to

components and yielding unique MWC coordinates identical in any language. A rigorous

Point-of-View (POV) mapping of 'shared reality' proceeds horizontally, identifying a

(sub)set of all referents of one POV, and vertically, by all possible POVs of one referent

(Array). Each POV comes with a class of possible relation-actions (operant spectrum).

Elements of operant spectrums and referent subsets carry proprietary markers of their

POVs, forming proprietary code sequences (PCSs). An MWC's coordinates are the identical PCSs

of any referent-operant pair. Universal Grammar operations frame MWC textual

extensions and provide additional means of disambiguation.

Key words: machine translation, computer assisted translation, cognitive modeling,

universal grammar, shared reality, cognitive lexicography, aspect semantics, linguistic

competence, metalanguage, universals.

1. Translation as Cognitive Behavior Replication

1.1. Any translation is a replication of cognitive-linguistic behavior distilled from

a 'linguistic mass' in the 'from-code' and correlated with linguistic means of the 'to-

code'. How can this be accomplished in MT (machine translation) or CAT (computer

assisted translation) while parsing natural languages for the elusive 'equivalency

of meaning'? Polysemy of word-forms (pictographs) hampers word-form to word-

form equivalency, while text-forming ideas are embedded at different levels of

grammar and syntax in different languages. How do we model shared

'competence' that makes not just translation but any communicative act

recognizable? Human participants easily rely on its existence, but we are still

trying to arrange an engagement between cognitive and linguistic modeling.

1.2. The move to multi-word grouping analysis is a step in the right direction. It

recognizes that language is a highly economical structure and that verbal behavior

reflects this economy through ready-made linguistic units (stable word-

combinations) and prefabricated blocks (clichés). Enabling their machine

translation is a relatively easy task. But it would be a mistake to mine natural

languages for solutions without having a well-defined cognitive model to process

the ore. This writing is yet another effort to promote a specific model of

cognitive-linguistic competence (CLC) formalized for MT/CAT use but

potentially capable of expansion into machine text generation (MTG) (Gorbis,

1977).

1.3. May we agree that inter-language equivalency is achievable through a

cognitively shared core universally accessible from any language? If so, this core

should be represented (mapped) metalingually, irrespective of any natural

language or the multitude of individual, cultural or historical variations. This, in

turn, requires a uniform method of modeling cognitive processes of

categorization, comparison and differentiation. After all, any language speaker

can recognize and differentiate such categories as: "things that fall", "things that

present danger", "things I can put in my mouth", "heavy objects" or "many

objects." Capacity to think in these "abstract categories" is universal and supports

our ability to operate with their content while effortlessly zipping back and forth

between abstract and concrete planes. The task is to formalize these processes.

1.4. We previously suggested that formalization of multilingual translation (by

human or machine) can be accomplished through a metalingual mediator where

equivalency of texts is established (Gorbis, 2006a, 2006c). Consistently

adequate translation thus requires uniform mapping of cognitive-linguistic

competence serviced by two-way metalinguistic algorithms that connect non-

linguistic reality with language databases.

2. In Search of Universals

2.1. The metalinguistic approach is a model of objects, phenomena and relations

independent of the language in which this model is expressed. This means that

any referent, i.e. object (living or not), state (produced or observed) and action

(physical or imaginary), etc., can be accessed by anyone for the dual purposes of

text (speech) production and recognition in any language. We focus on two

intersecting strategies - human capacity of universal categorization and relational

analysis.

2.2. The universal approach positions a man-speaker [↔] man-perceiver complex

at the center of our model. Language is an instrumental (teleological) toolbox

available to manage all relations within this complex. There are three sides to the

relational diagram: Me ↔ Me; Me ↔ Other(s); and Other(s) ↔ Other(s), where

'Other' is understood as any referent, i.e. human or inorganic, visible or invisible

object-subject of any and all relations.

2.3. The symbol '↔' represents any relation: any activity, process, or action.

Its arrows point to relational dynamics: humans act, destroy, inform, instruct, ask,

share, direct, change, control and otherwise affect each other regarding any object,

state, condition or process, and are likewise affected. This symbolically

corresponds to all relations represented as "Me ↔ Action" and "Other(s) ↔ Action".

These can further branch into "Me ↔ Action"; "Me ↔ Action ↔ Other(s)";

"Other(s) ↔ Action ↔ Other(s)"; and even develop further, e.g.

"Me ↔ Action ↔ Other(s) ↔ Me". This is our starting point in abstracting a web of

relations that are cognitively represented as the earlier proposed rules of Universal

Grammar (Gorbis, 2005).

2.4. The grey shaded area represents the phenomenon of our 'shared reality'. It is

a cumulative abstract map of all relations between all referents; e.g. man and self,

men and men, men and objects, etc. As a source of both cognitive processing and

linguistic behavior, this multi-dimensional map is understood as a complex of

Relational Spheres that an abstract Human can (potentially) encounter and enter into

irrespective of cultural, linguistic or historical contexts.

2.5. There are several primary Relational Spheres that are further differentiated

and hierarchically organized into groups representing nested relational spheres. For

example, a relational sphere "Me ↔ Object(s)" includes "Me ↔ Physical Object(s)"

that includes "Me ↔ Natural Object(s)" and "Me ↔ Man-Made Object(s)", etc.

The primary Sphere "Me ↔ Other(s)" includes classes of relations of "Me ↔

Animals", "Me ↔ Parent(s)", "Me ↔ Property", "Me ↔ Nourishment" and

"Me ↔ God" and many other relational Spheres. The existence of the umbrella-like

"Me ↔ Object(s)" and "Me ↔ Other(s)" means that there is a cognitive level

where we view all referents the same way and can enter into the same types of

relations ("Me ↔ Action") with any element from this referent set. The same

relations are available within any sphere nested under the big umbrella. But in a

smaller sphere we enter into different relations that do not belong to the primary

one. This is how our cognitive mapping works.
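
The nesting described in 2.5 can be sketched as a small tree in which a sphere inherits every relation of the spheres above it and adds its own. This is only an illustration: the class name `Sphere` and the relation labels are assumptions made for the example, not inventories taken from the model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Sphere:
    """A Relational Sphere with relations of its own and an optional parent."""
    name: str
    own_relations: set = field(default_factory=set)
    parent: Optional["Sphere"] = None

    def relations(self) -> set:
        """Relations available here: inherited ones plus the sphere's own."""
        inherited = self.parent.relations() if self.parent else set()
        return inherited | self.own_relations


# Illustrative relation labels only (assumed for this sketch).
objects = Sphere("Me <-> Object(s)", {"observe", "discuss", "measure"})
physical = Sphere("Me <-> Physical Object(s)", {"touch", "lift"}, parent=objects)
man_made = Sphere("Me <-> Man-Made Object(s)", {"repair"}, parent=physical)

print(sorted(man_made.relations()))
```

The umbrella relations remain available in the nested sphere, while 'repair' is available only in the smaller one.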

2.6. The nomenclature of Relational Spheres does not change: they may fill up

with new countries, gadgets, languages and people, but the full inventory of

relations in each Sphere, and the cognitive-behavioral tools available to any of us

to affect and manage any relations within any Sphere, have likely been the same

for the last 20,000 years. Language, as the primary mechanism of actualizing relations,

offers numerous fine-tuned choices to its users, but the results of cognitive

categorization of 'shared reality' remain the same.

2.7. This stability permits moving from one level of generalization to another

through the process of relational differentiation. It begins with a cognitive level

where 'shared reality' mapping does not require us to differentiate between

referents. We can observe anything or fail to do so, discuss, approach, measure,

and evaluate 'it' regardless of what 'it' is. We just view any referent as an Object

and any object as the same referent. We know that we can 'appropriate' only certain

objects as there is a logical or 'physical' limit, yet we understand the meaning of

'stealing stars', 'gobbling-up planets', 'buying air' or 'stealing a soul'. Note that 'I

appropriate', i.e. establish a relationship of possession of an Object, is not the same

as 'We appropriate', which means that plurality often results in different

relations between us and referents. The choice of relation defines the product of

categorization, and the choice of a referent set defines a class of relations.

3. Taking Baby Steps

3.1. The first step in modeling MT is to take stock of sets of referents of the

Relational Spheres. At first glance, these are nothing more than concepts of a

descending degree of abstraction. Thus, the primary set 'Object' would include

any Physical Object, Imaginary Object, Non-living Object, Living Object,

Human Object, Animal Object, but here we run into idiosyncratic issues of

encyclopedic (not abstracted) taxonomy.

3.2. We can try to isolate types of relations that a collective Human can enter into

with each set of referents embraced by each Sphere. For example, the Relational

Sphere of "Me ↔ Object(s)" contains 'Other' as a 'physical (non-living) object'. It

specifies types of relations that we can enter into with any referent in that

category, e.g. any 'rock' or 'bone', etc., including relations that any 'rock' or 'bone'

can 'enter into' with us, such as a relation of perception, contact, use, value,

rejection, impact, concealment, depiction, description, etc. We 'know' that these

can be further differentiated, e.g. the contact relationship into penetration, surface,

dent, etc., and the use relationship into weapon, adoration, trade, decoration,

and even manufacturing, and so on. How can we formalize this non-encyclopedic

'knowledge'?

3.3. The task is complicated since any relation can surface in more than one

Relational Sphere. Relation types of perception, contact, use, value, rejection,

impact, concealment, depiction, description are available whether the referent set

is an 'animal' or a 'human', a 'tree' or a 'vehicle', a 'sibling' or a 'stranger'. In other

words, we can enter into or be subject to the same type of relations in many

different Spheres and their different sets of referents. But what appears to be a

problem is, in fact, a practical solution.

3.4. Each instance of a relationship with an element from a set of referents is a

relonic pattern that can be parametrically and graphically represented (Kvitash &

Gorbis, 2006d). It is called an aspect application (Gorbis, 2006b). This means

that we cognitively attribute (and mark) the potentiality of any relation (action) as

a cognitive aspect of more than one referent. Semantically, any aspect becomes a

component of meaning of these referents. This may become normative and enter

into dictionaries, but typically it does not. One of the MT hurdles lies in

the machine's inability to recognize the nature of an application (e.g. 'tank flew' or

'wide open heart'). The other hurdle is that the same text may reflect more than

one aspect application (e.g. 'embrace a tree' vs. 'embrace an idea' or 'free kittens'

vs. 'free prisoners').

3.5. Contrary to common perception, the difficulty is not in the polysemy of noun-

referents but in the versatility of linguistic operant forms (e.g., verbs) identifying

relations of/with/upon referents. In many instances an operant serves so many

different aspect applications as to have no lexicographic meaning, standing as a

substitute for other operants. Take ('consider'), for example, the operant 'take' as

in "I had to take (buy tickets, board, enter) a train to...", "this will take (require)

an effort", "don't take (remove) this without asking me first", "take (bring) this to

my suite", "take (move) the next exit to...", "we had to take (prevent from

movement) her into custody", "take (hold) her bag" and so on.

3.6. Humans have little problem deciphering such word-combinations and many

jokes are based on this capacity. Each application is disambiguated by a universal

cognitive algorithm of categorization that we call a Point-of-View (POV). Each

point-of-view identifies a common cognitive aspect that exists in shared reality as

a potentiality of meaning in each referent. This cognitive algorithm that makes us

laugh is the core of our modeling approach.

3.7. The operant-referent versatility reflects two cognitive dimensions. The

horizontal universal allows the same relationship to identify many objects (set of

referents) from the same POV position, while the universal vertical dimension

allows any referent to be seen from different POV angles, most normatively

established and some freshly minted. Either way, text generation and text

processing all rest on human ability to use multi-form compounds (misleadingly

termed multi-word units (MWUs)) in order to express (or suppress) a dual

connection between any object and any relation-action it can undertake or

undergo. Children's books exploit this potentiality to the fullest. The trick is that

POVs actualize a pre-verbal 'unit of thought', a cognitive unity between a certain

relation-action and a certain referent-set.

3.8. MWUs are not units with equivalent components in more than one language.

Any 'unit' is a two-dimensional cognitive potentiality actualized by a POV choice.

It remains the same but may be represented by divergent word-form groupings in

the same and in different languages. Accounting for the presence and operation of

this cognitive algorithm is the task of cognitive lexicography. The main (but not

the only) connection between Relational Spheres and cognitive-linguistic

behavior is a universal (shared) categorization menu, serving cognitive Points-of-

View (POV) as options. Correctly choosing or identifying a POV that actualizes

MWUs is a necessary condition for communication. It is the backbone of human

cognitive-linguistic competence (CLC) and our machine translation model.

4. Modeling Cognitive-Linguistic Competence

4.1. To model CLC, we need to differentiate competence machinery from its

operational algorithms. In humans all operate seamlessly. In machine translation

we must separately develop relational databases (inventories) of: (1) Relational

Spheres, (2) referent sets of each Sphere, (3) individual referents of each set as

word-forms in different codes, (4) relation-type word-forms isolated for each

Sphere, and (5) POVs from which anyone can view any referent in our 'shared

reality' as a subject or object of any relation-action. Some of these inventories are

open-ended and some are finite. The key is to connect them through specific

codes serving as metalinguistic coordinates in recognizing strings of text as

equivalent multi-word complexes (MWCs) in different languages.

4.2. Cognitive-Linguistic competence necessarily includes all 'linguistic' data such

as vocabularies of referent sets (see inventories 2 and 3) and means of expansion

into speech that we associate with grammar. Contrary to what we know as

'normative grammar', or 'generative grammar', the rules of this expansion do not

proceed from form to meaning, but in reverse, from a cognitive-operational

structure straight to a linguistic format.

4.3. Sentences are not generated; they are framed by cognitive processing.

Universal Grammar, as a metalanguage of thought structure, is another component

of the CLC model. We have earlier offered a mechanism yielding universal

elements of action-relation such as 'agent', 'object', 'duration', 'permission', 'time',

etc., and a simple methodology of their multi-dimensional expansion into sentence

structures by 'operations' of the 1st, 2d and 3d degrees (Gorbis, 2005).

4.4. We should not ignore that POVs are a product of Relational Spheres that also

provide structural blueprints of Universal Grammar for framing thought into its

linguistic expression. Take another look at the text strings in 3.5 as speech acts of

command and explanation. These are not linguistic categories but expressions of

an underlying structure of human interactions arising from our 'shared reality'.

The choice/identification of a POV and the actualization of a corresponding

MWC is determined by that structure. POV analysis is one of the key elements of

our design.

4.5. The object of our modeling is the metalanguage of thought operating with

abstractions of primary Relational Spheres. Their sets of referents are

"supernotions" such as Action, Person, Object, Process, each of which includes its main

and optional attributes. Thus, any 'Action' has attributes of agent, duration,

object, effect, etc. Any 'Object' has mass, dimension, shape, volume, composition,

capacity, function, use, etc. Main attributes of 'Person' include presence,

movement, action, inaction, interaction, choice, expression, etc. Secondary

(optional) attributes for 'Person' are opinion, reproduction, nourishment, etc. The

lists of primary and optional attributes are fairly short but each attribute can be

further differentiated into a new cognitive and verbal structure.

4.6. Universal Grammar frames abstractions into sentences through a hierarchy of

cognitive operations upon supernotions and their attributes. For example, in any

language there will be these 1st-level structures: 'to perform [any] action', 'agent

of action', and 2d-level structures such as 'to order to perform an action', 'to ask

to terminate an action', 'to interrupt performance of an action', and so on. At the

deeper (3d) level we find: 'to request to allow to terminate performance of an

action', 'to order to resume performance of an action', etc. Where focus is on the

agency, all languages provide structures needed 'to request to identify an agent of

action', 'to volunteer to become an agent of an action', 'to suggest an agent of an

action', or even go deeper, as in operation 'to prevent an agent of an action to

perform an action'.
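
One way to picture the hierarchy of operations in 4.6 is as nested frames, where each operation wraps a supernotion or another operation. The sketch below is a simplification under stated assumptions: the `Op` type, the `level` function (which merely counts nesting depth) and the `gloss` rendering are hypothetical, and the model's own assignment of levels may count operations differently.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Op:
    """One cognitive operation applied to a supernotion or to another operation."""
    verb: str
    arg: Union["Op", str]


def level(frame: Union[Op, str]) -> int:
    """Nesting depth: a bare supernotion is 0, each wrapping operation adds 1."""
    return 0 if isinstance(frame, str) else 1 + level(frame.arg)


def gloss(frame: Union[Op, str]) -> str:
    """Rough metalanguage gloss; the article 'an' only suits 'action' here."""
    if isinstance(frame, str):
        return f"an {frame}"
    return f"to {frame.verb} {gloss(frame.arg)}"


perform = Op("perform", "action")
order = Op("order", perform)
order_resume = Op("order", Op("resume", perform))

for frame in (perform, order, order_resume):
    print(level(frame), gloss(frame))
```

Because frames are plain nested values, deeper operations are built by wrapping existing ones rather than by adding new rules.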

4.7. The unfolding of cognitive into language behavior is a process in which

POVs (our ever shifting focus in the direction of thought) play a key role.

Languages might differ in their repertoire of linguistic means through which these

universal frames are expressed. They may also differ in levels where cognitive

operations become overly complex or may still find a corresponding lexico-

grammatical format. Thus, we can find 4th level cognitive operations to be

linguistically appropriate, e.g. 'to request to permit to volunteer to perform an

action' or 'to announce to intend to be an agent to destroy an object' or not as in:

'to ask an agent of an action1 to order an agent of an action to terminate

performance of an action2' when we say: "Please tell him to shut up!" To repeat, it

is the choice of POVs that allows such texts to be recognized as they connect a

referent (person) with an action-relation and an object of action in different ways.

5. POV as the Parametric Key

5.1. We recognize that each POV relationally separates objects, states, and

processes from our shared reality into a (sub)set of referents. Change a POV and

the (sub)set composition changes. Change the text and the POV analysis changes.

Each POV has a specific volume: the size of a referent (sub)set it creates. POVs

corresponding to primary Relational Spheres and their supernotions may cover

the entire nominative dictionary. The 'smaller' POV volume is an open class with

one shared characteristic: a common cognitive POV to which we assign a unique

marker. Thus, all word-forms corresponding to referents from each POV (sub)set

will carry the same POV marker.

5.2. Each POV provides its assembled referents with many types of 'shared

reality' relations. Under each POV any referent can create the same relations and

be subject to the same relations. Consequently, each (sub)set element can engage

in the same action-relations as others. Conversely, when we need to enter into a specific

relationship, it makes no difference what objects we use. We have seen people

write on their skin, on business cards, or on 100-dollar bills, as well as on phones and

yellow pads. These cognitive subroutines form Musketeer Classes where one

element can invoke all relations and one relation-action will in turn bring out any

element of a POV (sub)set. The class of relations-actions that belong to a given

POV is called its Spectrum, or spectrum of operants (Gorbis, 2014). Each

operant from a POV spectrum is assigned its marker. In other words, subset

referents and spectrum operants of the same POV carry the same marker.

5.3. This cognitive economy of a single POV organizing linguistic behavior by

connecting its (sub)set of referents with its Spectrum of operants is fully utilized

in our design. A subset of referents created by POV 'Moving Object' has an

obviously huge volume where any moving referent has potential access to pre-

established cognitive aspects of being: 'observed', 'measured', 'ignored', 'noticed',

'avoided', 'deflected', 'destroyed', etc. sharing the same marker (say, for simplicity,

'A') with any other object that moves. In addition to marker 'A', any referent of a

related POV 'Objects in Motion' (marked 'B') engages in 'moving', 'stopping',

'speeding', 'resuming', 'slowing', 'rotating', 'covering distance', etc. Thus, both

referent subset and operant elements of this POV carry markers A&B. The

descending POV of 'Man-made Moving Objects' (marked 'C') would bring an action-

relation Spectrum that any vehicle from a 'C'-marked referent subset can

undertake as well as undergo.
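
The marker scheme of 5.3 can be sketched as follows, assuming that a descending POV accumulates the markers of the POVs above it, so that elements of 'Objects in Motion' carry A&B. The `POV` class, the `markers` helper and the sample word-forms ('comet', 'repair') are assumptions made for illustration; only the POV names and the letters A, B, C come from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class POV:
    marker: str                    # single-letter marker, as in the text
    name: str
    referents: frozenset           # word-forms of the referent (sub)set
    spectrum: frozenset            # operant word-forms of the Spectrum
    parent: Optional["POV"] = None

    def code(self) -> str:
        """Markers accumulated from the top-level POV down to this one."""
        return (self.parent.code() if self.parent else "") + self.marker


A = POV("A", "Moving Object", frozenset({"taxi", "volleyball", "comet"}),
        frozenset({"observe", "measure", "avoid"}))
B = POV("B", "Objects in Motion", frozenset({"taxi", "volleyball", "comet"}),
        frozenset({"move", "stop", "rotate"}), parent=A)
C = POV("C", "Man-made Moving Objects", frozenset({"taxi", "volleyball"}),
        frozenset({"repair"}), parent=B)
POVS = [A, B, C]


def markers(word: str, kind: str) -> str:
    """Markers of a word-form; kind is 'referents' or 'spectrum'."""
    found = set()
    for pov in POVS:
        if word in getattr(pov, kind):
            found.update(pov.code())   # adds each single-letter marker
    return "".join(sorted(found))


print(markers("taxi", "referents"), markers("comet", "referents"),
      markers("stop", "spectrum"))
```

In this toy data 'taxi' and 'volleyball' both end up with the same A, B, C string, so these three markers alone cannot tell them apart.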

5.4. We said earlier that each operant from a POV spectrum carries the same

marker as elements of the corresponding referent (sub)set. But how do we

differentiate between a taxi and a volleyball in MT: both are man-made and both

are moving objects and carry A, B, C markers? Moreover, the same cognitive

economy would allow one relation-action operant to belong to Spectrums of many

different POVs. See the versatility discussion in 3.5 above, where the word-form

'take' expresses different relations as in 'to take a cab' and 'to take a ball'.

5.5. Because anything can be perceived from more than one angle, a volleyball is

not just a 'Man-Made Object', it is also a 'D'-marked 'Man-held Object', an 'E'-

marked 'Sports Object', an 'F'-marked 'Man-hit object', a 'G'-marked 'Man-thrown

object', an 'H'-marked 'Value Object' (Dad's prized possession), an 'I'-marked

'Commercial Object' and so on. Unlike a 'ball', a 'taxi' is an element of many

different subsets created by such POVs as 'Q'-marked 'Man-operated object', 'P'-

marked 'Hollow Object' and even a 'Z'-marked 'Object with Wheels'. The set of all

POVs potentially applicable to a referent is called an Array (Gorbis, 2006c).

5.6. A POV Array of any referent can thus be represented as a string of markers

that identify this referent. The longer the string, the more likely it is to be a

unique proprietary code sequence of the referent (abbreviated as PCS in what

follows). Each marker in the PCS corresponds to a specific POV that carries an

identically marked Spectrum of operants. Thus, any referent PCS also identifies

all operant Spectrums carried by each POV in that referent's Array. This is the

main parametric feature of our model. Through its PCS, any referent of the word-

form 'ball' would: (1) differentiate identical word-forms as corresponding to

different referents, (2) actualize any POV from each referent's PCS-marked

Array, and (3) itself be actualized by any operant from any of the same PCS-

marked Spectrums. A word-form 'take' that corresponds to several operant

Spectrums will (1) actualize all POVs having this operant in their spectrum, (2)

identify all referents of each of these POVs, and (3) differentiate referents by their

PCSs. In simpler terms, a 'ball' would differ from a 'taxi' because their PCSs would

not coincide in at least one instance, e.g. I cannot 'throw a taxi' but I can 'throw a

ball' even if I can 'kick' both. Parametric codes that do not coincide in at least one

instance lead to complete differentiation.
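
A minimal sketch of this differentiation treats each Array as a set of markers and each operant as the set of markers of the POVs whose Spectrums contain it. The letters A-G, P, Q and Z follow the examples in 5.5; the marker 'K' ('Man-kicked object') and the operant assignments are hypothetical additions for the example.

```python
# Referent Arrays as sets of POV markers (a subset of those named in 5.5,
# plus the hypothetical 'K' for "Man-kicked object").
ARRAY = {
    "ball": set("ABCDEFGK"),
    "taxi": set("ABCKPQZ"),
}

# Operant word-forms mapped to the markers of the POVs whose Spectrums
# contain them (illustrative assignments).
SPECTRUM = {
    "throw": {"G"},   # 'Man-thrown object'
    "kick": {"K"},    # hypothetical 'Man-kicked object'
    "drive": {"Q"},   # 'Man-operated object'
}


def pcs(referent: str) -> str:
    """Proprietary code sequence: the referent's markers in a fixed order."""
    return "".join(sorted(ARRAY[referent]))


def compatible(operant: str, referent: str) -> bool:
    """An operant-referent pair is admissible if they share a POV marker."""
    return bool(SPECTRUM[operant] & ARRAY[referent])


def differentiating(a: str, b: str) -> set:
    """Markers present in one Array but not in the other."""
    return ARRAY[a] ^ ARRAY[b]


print(pcs("ball"), pcs("taxi"))
print(compatible("throw", "ball"), compatible("throw", "taxi"))
print(compatible("kick", "ball"), compatible("kick", "taxi"))
```

A single non-shared marker is enough to separate the two referents, which is the sense in which codes that fail to coincide once lead to complete differentiation.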

5.7. To generalize, any word-form (referent or operant) can be represented as a

PCS, a unique string of individual POV markers of its referent(s) and

corresponding operant(s). Each referent's Array is different from the Array of

another referent (even if word-forms coincide, as in 'glasses'(pl) vs. 'glasses',

'train' vs. 'train' as in 'train of thought' and a 'dress train') and so coding for

corresponding word-forms would be different. Because the PCS for a referent must

be the same as for an operant, any multi-word complex acquires unique

coordinates.
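
To illustrate how shared coordinates could link multi-word complexes across languages, the sketch below uses two toy lexicons. Everything in it is assumed for the example: the markers 'R' (ride-for-hire) and 'H' (hand-held), the function names, and the tiny English and Spanish word lists. It shows the lookup idea only, not a real coding of either language.

```python
# Hypothetical mini-lexicons: word-form -> list of marker sets, one per sense.
OPERANTS = {
    "en": {"take": [{"R"}, {"H"}]},
    "es": {"tomar": [{"R"}], "agarrar": [{"H"}]},
}
REFERENTS = {
    "en": {"taxi": [{"R"}], "ball": [{"H"}]},
    "es": {"taxi": [{"R"}], "pelota": [{"H"}]},
}


def coordinates(lang, operant, referent):
    """Markers shared by some sense of the operant and some sense of the referent."""
    for o in OPERANTS[lang][operant]:
        for r in REFERENTS[lang][referent]:
            if o & r:
                return frozenset(o & r)
    return None  # no shared POV: the pair is not an admissible complex


def equivalents(coords, lang):
    """All operant-referent pairs in `lang` that yield the same coordinates."""
    return [(o, r) for o in OPERANTS[lang] for r in REFERENTS[lang]
            if coordinates(lang, o, r) == coords]


print(equivalents(coordinates("en", "take", "taxi"), "es"))
print(equivalents(coordinates("en", "take", "ball"), "es"))
```

The English operant 'take' is resolved differently in the two complexes because the referent selects which of its senses shares a marker.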

6. Text Indexing and Disambiguation

6.1. A typical translation begins with word-by-word or sentence-by-sentence

processing where punctuation provides a recognizable break. Machine text

processing begins with identification (recognition) of all POV appearances and

mapping of all MWC coordinates. A parametric MT model notes but ignores

processing breaks and treats the text as a whole. This means that it 'recognizes'

continuations of linguistically expressed (initial) thought (subject), its

interruptions, deviations and appearances of new subjects through comparison of

parameters.

6.2. This is accomplished by establishing equivalence of a unit-of-thought

expressed in code A to a unit-of-thought to be found in code B by correlating their

metalingual coordinates. The primary unit-of-thought complex is linguistically

represented by complexes, such as V+N, N+V, N+N or A+V and A+N word-

forms (where V is a verb, N is a noun (or gerund) and A is either a qualifier, or a

quantifier). POVs of plural word-forms (e.g. 'parents', 'violins', 'fears') identify

different spectrums than those of a singular form and their PCS coordinates will

differ. This allows disambiguation of forms since the POV of Objects is not the same

as the POV of an Object and the PCS of a singular form ('parent', 'violin', 'fear')

would invariably differ from the plural.

6.3. Any text reflects connected ideas and subjects, and parametric design can

'predict' the appearance of subsequent ones and 'anticipate' the development of new ones in

two ways. In treating text as a 'flow', a probability-based model can identify

recurring word-complexes with identical coordinates and 'anticipate' those with

substantially similar ones. The second format of probability-based analysis

compares coordinates of the first unit-of-thought (MWC) expressing a

relationship between a certain subject (referent) and predicate (operant).

Following rules of Universal Grammar (see 4.5 above) this unitary complex

expands by identification of the object of relation-action or a qualifier, specifying

type, quality, quantity etc., of any of the primary or optional attributes. Text often

reflects developments in which these referents become elements of the next unit.

Furthermore, the MWC as unit-of-thought is a metaphorical seed that expands

into new sentence structures either through a choice of a new POV from the

Array or by a new choice from the operant Spectrum (Gorbis, 2006b).

6.4. The predictive algorithm mirrors progression of text generation framed by

Universal Grammar operations. Thus, any attribute of a supernotion appearing as

an object of a preceding text sequence has significant probability to become the

subject in a subsequent textual sequence. Because many operant-referent PCS

identified MWCs are stable language structures that reflect cognitive economy,

we recognize these complexes as 'ready-to-use' linguistic units. They enter text

production as a whole and are recognized as such by recipients allowing

disambiguation of a psycholinguistic volume of meaning of a non-contextual

word-form available even to beginning language students (Gorbis, 1972). But this

is not true for a machine (until it learns otherwise). We simulate this shared reality

"knowledge" by expanding on our parametric design.

6.5. Every 'unit of thought' expands through 2d- and 3d-level Universal Grammar

operations over supernotions and upon attributes of the primary referent classes.

(See 4.5 and 4.6 above.) For example, when we focus on an object of action or

its duration, all languages provide means to expand their cognitive and linguistic

V+N units on a pre-formatted basis. Simple examples would suffice. A unit 'to

attend school' is cognitively represented as a complex that under Universal

Grammar rules expands as: 'to attend (which) school', 'to attend school (in which

place)', 'to attend school (with whom)', 'to fail to attend school', etc. By identifying

PCS coordinates for word-forms in the text, a machine can 'develop' its own

'cognitive map' of the document, a potent source of disambiguation.
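
The pre-formatted expansion of a V+N unit can be sketched as a fixed set of templates applied to any verb-noun pair. The function name `expand` is hypothetical, and the four templates are simply the examples given in 6.5, not a complete inventory of Universal Grammar operations.

```python
def expand(verb: str, noun: str) -> list:
    """List the example expansions from 6.5 for an arbitrary V+N unit."""
    return [
        f"to {verb} (which) {noun}",
        f"to {verb} {noun} (in which place)",
        f"to {verb} {noun} (with whom)",
        f"to fail to {verb} {noun}",
    ]


for frame in expand("attend", "school"):
    print(frame)
```

Each parenthesized slot marks a place where the following text may supply a new referent or qualifier.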

6.6. Instances of simple competition for word-form correlation are resolved with

the unique set of PCS coordinates. More complex cases can be resolved through

comparing POV arrays and spectrums of competitors for the best correlation with

a preceding multi-word complex or structures. Based on the assumption that the

subject matter evolves throughout the text, we can program whole text parametric

analyses by all identifiable POVs and thus greatly limit competing choices of

referent subsets and operant spectrums.
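
As a rough sketch of resolving competition by correlation with preceding complexes, the example below scores each candidate reading by how often its markers have already appeared in the text. The scoring rule, the function name `best_reading` and the markers 'V', 'T', 'W' are assumptions made for illustration; the text does not prescribe a particular measure.

```python
from collections import Counter


def best_reading(candidates: dict, context: list) -> str:
    """Pick the candidate whose markers overlap most with preceding MWC markers.

    candidates: reading name -> set of POV markers (its Array).
    context: marker sets of the multi-word complexes seen so far.
    Ties go to the candidate listed first.
    """
    seen = Counter(m for mwc in context for m in mwc)
    return max(candidates, key=lambda name: sum(seen[m] for m in candidates[name]))


# Invented markers: 'V' vehicle-like, 'T' travel-related, 'W' worn object.
train = {"rail vehicle": {"V", "T"}, "dress train": {"W"}}
travel_context = [{"T"}, {"V", "T"}]
wedding_context = [{"W"}, {"W"}]

print(best_reading(train, travel_context))
print(best_reading(train, wedding_context))
```

Limiting the candidates to readings whose markers have already surfaced is one simple way to narrow competing referent subsets as the subject matter evolves.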

7. Cognitive Lexicography

We recognize that text production is not stringing of words according to language

rules we are still unable to formalize, but identification of equivalent multiple

word complexes (MWCs) evoked through cognitive algorithms of Universal

Grammar operations and POVs. This, and other writings on the subject, suggest

that we can replicate linguistic behavior because we can formalize and build a

parametric model of cognitive-linguistic competence (CLC). Texts no longer

appear as ad-hoc produced speech with constantly shifting meanings. Translation

acquires stability by reliance on universality of language mediated representation

of 'shared reality'.

Unlike any other dictionary, where equivalency of meaning is established through

reference to other concepts, our model is a cognitive dictionary simulating

cognitive-linguistic competence (CLC). Its format is a coded hierarchy of all

instances of viewing an object-referent from every conceivable point-of-view

being code-referenced with all relation-action operants that actualize and service

each point-of-view. This design is language-independent and enables cognitive

mapping by machine. This CLC model is a multi-level relational database of

'units-of-thought' as expressed in natural languages. Identified by their unique

coordinates they are subject to further expansion with a number of Universal

Grammar operations. In creating these processing compendiums for natural

languages, we do not translate, but enable machines to establish (full or

significant) equivalency through correlation of code-based coordinates and

identification of equivalent multi-word expressions in each language. Full

realization of this design requires preliminary work with massive language

material to be conducted by many groups in many different languages following a

uniform protocol.

References:

Gorbis, B. (2014) "Cognitive Machine Translation Model: POV and Musketeer Classes of Shared Reality". In Proceedings of the 2014 International Conference on Intelligent Linguistic Technologies (ILINTEC'14), pp. 74-119. CSREA Press.

Gorbis, B. (2006a) "The COG: Making Sentences From Concepts". In Proceedings of the 2006 International Conference on Machine Learning; Models, Technologies & Applications (MLMTA 2006), pp. 43-49. CSREA Press.

Gorbis, B. (2006b) "Borrowing With Interest: Aspect Semantics View of Language Extension and Expansion". In Proceedings of the 2006 International Conference on Machine Learning; Models, Technologies & Applications (MLMTA 2006), pp. 50-56. CSREA Press.

Gorbis, B. (2006c) "Cognitive Dictionary: a Representation of Shared Reality". In Proceedings of the 50th Annual Meeting of the ISSS, Sonoma State University, Rohnert Park, California, USA, July 9-14, 2006. ISSN: 1999-6918.

Kvitash, V. & Gorbis, B. (2006d) "Relonic Properties of Living Systems". In Proceedings of the EMCSR 2006, 18th European Meeting on Cybernetics and Systems Research, Vienna, Austria, April 18-21, 2006, pp. 167-172. Austrian Society for Cybernetic Research.

Gorbis, B. (2005) "A Primitive Model of Metalanguage for Universal Grammar". In Proceedings of the 2005 International Conference on Intelligent Linguistics (MLMTA 2005), pp. 39-45. CSREA Press.

Gorbis, B. (1977) "Psycholinguistics and Generative Lexicography: A Preliminary Description". In "Translator's Journal" ("Tetradi Perevodchika"), issue 14. Foreign Relations Publishing House, Moscow (in Russian).

Gorbis, B. (1972) "The Psycholinguistic Volume of Meaning". In "Methods of Foreign Language Acquisition: Issues and Developments", Kiev (original in Ukrainian).

Boris Gorbis, Esq.

NAFI, 1741 Sunset Plaza Drive, Los Angeles, CA 90069-1311, USA

