Modeling Cognitive-Linguistic Competence for MT
© 2015 Boris Gorbis. NAFI, Los Angeles, CA, USA
All rights reserved. Quoting and reproduction permitted with proper attribution.
Abstract:
Inter-language textual equivalency is established through a mediator model of generic
cognitive-linguistic competence. Parametric analysis organizes word-forms into MWCs -
multiword-complexes (V+N, N+V, N+V+O, etc.) by assigning metalingual codes to
components and yielding unique MWC coordinates identical in any language. A rigorous
Point-of-View (POV) mapping of 'shared reality' proceeds horizontally identifying a
(sub)set of all referents of one POV, and vertically, by all possible POVs of one referent
(Array). Each POV comes with a class of possible relation-actions (operant spectrum).
Elements of operant spectrums and referent subsets carry proprietary markers of their
POVs forming proprietary code sequences (PCSs). MWC coordinates are the identical PCSs
of any referent-operant pair. Universal Grammar operations frame MWC textual
extensions and provide additional means of disambiguation.
Key words: machine translation, computer assisted translation, cognitive modeling,
universal grammar, shared reality, cognitive lexicography, aspect semantics, linguistic
competence, metalanguage, universals.
1. Translation as Cognitive Behavior Replication
1.1. Any translation is a replication of cognitive-linguistic behavior distilled from
a 'linguistic mass' in the 'from-code' and correlated with linguistic means of the
'to-code'. How can this be accomplished in MT (machine translation) or CAT (computer
assisted translation) while parsing natural languages for the elusive 'equivalency
of meaning'? Polysemy of word-forms (pictographs) hampers word-form to word-form
equivalency, while text-forming ideas are embedded at different levels of
grammar and syntax in different languages. How do we model shared
'competence' that makes not just translation but any communicative act
recognizable? Human participants easily rely on its existence, but we are still
trying to arrange an engagement between cognitive and linguistic modeling.
1.2. The move to multi-word grouping analysis is a step in the right direction. It
recognizes that language is a highly economical structure and that verbal behavior
reflects this economy through ready-made linguistic units (stable word-
combinations) and prefabricated blocks (clichés). Enabling their machine
translation is a relatively easy task. But it would be a mistake to mine natural
languages for solutions without having a well defined cognitive model to process
the ore. This writing is yet another effort to promote a specific model of
cognitive-linguistic competence (CLC) formalized for MT/CAT use but
potentially capable of expansion into machine text generation (MTG) (Gorbis,
1977).
1.3. May we agree that inter-language equivalency is achievable through a
cognitively shared core universally accessible from any language? If so, this core
should be represented (mapped) metalingually, irrespective of any natural
language or the multitude of individual, cultural or historical variations. This, in
turn, requires a uniform method of modeling cognitive processes of
categorization, comparison and differentiation. After all, any language speaker
can recognize and differentiate such categories as: "things that fall", "things that
present danger", "things I can put in my mouth", "heavy objects" or "many
objects." Capacity to think in these "abstract categories" is universal and supports
our ability to operate with their content while effortlessly zipping back and forth
between abstract and concrete planes. The task is to formalize these processes.
1.4. We previously suggested that formalization of multilingual translation (by
human or machine) can be accomplished through a metalingual mediator where
equivalency of texts is established (Gorbis, 2006a, 2006c). Consistently
adequate translation thus requires uniform mapping of cognitive-linguistic
competence serviced by two-way metalinguistic algorithms that connect non-
linguistic reality with language databases.
2. In Search of Universals
2.1. Metalinguistic approach is a model of objects, phenomena and relations
independent of the language in which this model is expressed. This means that
any referent, i.e. object (living or not), state (produced or observed) and action
(physical or imaginary), etc., can be accessed by anyone for the dual purposes of
text (speech) production and recognition in any language. We focus on two
intersecting strategies - human capacity of universal categorization and relational
analysis.
2.2. The universal approach positions a man-speaker [↔] man-perceiver complex
at the center of our model. Language is an instrumental (teleological) toolbox
available to manage all relations within this complex. There are three sides to the
relational diagram: Me ↔ Me; Me ↔ Other(s); and Other(s) ↔ Other(s), where
'Other' is understood as any referent, i.e. a human or inorganic, visible or invisible
object-subject of any and all relations.
2.3. The symbol '↔' represents any relation: any activity, process, or action.
Its arrows point to relational dynamics: humans act, destroy, inform, instruct, ask,
share, direct, change, control and otherwise affect each other regarding any object,
state, condition or process, and are likewise affected. Symbolically, this
corresponds to all relations represented as "Me ↔ Action" and "Other(s) ↔ Action".
These can further branch into "Me ↔ Action ↔ Other(s)" and
"Other(s) ↔ Action ↔ Other(s)", and even develop further, e.g.
"Me ↔ Action ↔ Other(s) ↔ Me". This is our starting point in abstracting a web of
relations that are cognitively represented as the earlier proposed rules of Universal
Grammar (Gorbis, 2005).
2.4. The grey shaded area represents the phenomenon of our 'shared reality'. It is
a cumulative abstract map of all relations between all referents; e.g. man and self,
men and men, men and objects, etc. As a source of both cognitive processing and
linguistic behavior, this multi-dimensional map is understood as a complex of
Relational Spheres that an abstract Human can (potentially) encounter and enter into
irrespective of cultural, linguistic or historical contexts.
2.5. There are several primary Relational Spheres that are further differentiated
and hierarchically organized into groups representing nested relational spheres. For
example, a relational sphere "Me ↔ Object(s)" includes "Me ↔ Physical Object(s)"
that includes "Me ↔ Natural Object(s)" and "Me ↔ Man-Made Object(s)", etc.
The primary Sphere "Me ↔ Other(s)" includes classes of relations of "Me ↔
Animals", "Me ↔ Parent(s)", "Me ↔ Property", "Me ↔ Nourishment" and
"Me ↔ God" and many other relational Spheres. The existence of the umbrella-like
"Me ↔ Object(s)" and "Me ↔ Other(s)" means that there is a cognitive level
where we view all referents the same way and can enter into the same types of
relations ("Me ↔ Action") with any element from this referent set. The same
relations are available within any sphere nested under the big umbrella. But in a
smaller sphere we enter into different relations that do not belong to the primary
one. This is how our cognitive mapping works.
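The nesting described above can be sketched as a small tree in which a nested sphere keeps every relation of its parent and adds relations of its own. This is only an illustration under stated assumptions: the class name `Sphere`, the method `all_relations` and the sample relation names are hypothetical and are not taken from the model's inventories.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Sphere:
    """A Relational Sphere; a nested sphere points to its parent."""
    name: str
    own_relations: Set[str] = field(default_factory=set)
    parent: Optional["Sphere"] = None

    def all_relations(self) -> Set[str]:
        # Relations of the umbrella sphere stay available in every nested sphere.
        inherited = self.parent.all_relations() if self.parent else set()
        return inherited | self.own_relations


# Hypothetical sample data: the relation names are illustrative only.
me_object = Sphere("Me <-> Object(s)", {"observe", "measure", "evaluate"})
me_physical = Sphere("Me <-> Physical Object(s)", {"touch", "lift"}, me_object)
me_man_made = Sphere("Me <-> Man-Made Object(s)", {"manufacture", "repair"}, me_physical)

print(sorted(me_man_made.all_relations()))
```

In this sketch the smaller sphere offers relations ('manufacture', 'repair') that the umbrella sphere does not, while everything available under the umbrella remains available below it.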
2.6. The nomenclature of Relational Spheres does not change: they may fill up
with new countries, gadgets, languages and people, but the full inventory of
relations in each Sphere, and the cognitive-behavioral tools available to any of us
to affect and manage any relations within any Sphere, have likely been the same
for the last 20,000 years. Language, as the primary mechanism of actualizing relations,
offers numerous fine-tuned choices to its users, but the results of cognitive
categorization of 'shared reality' remain the same.
2.7. This stability permits moving from one level of generalization to another
through the process of relational differentiation. It begins with a cognitive level
where 'shared reality' mapping does not require us to differentiate between
referents. We can observe anything or fail to do so, discuss, approach, measure,
and evaluate 'it' regardless of what 'it' is. We just view any referent as an Object
and any object as the same referent. We know that we can 'appropriate' only certain
objects, as there is a logical or 'physical' limit, yet we understand the meaning of
'stealing stars', 'gobbling-up planets', 'buying air' or 'stealing a soul'. Note that 'I
appropriate', i.e. establish a relationship of possession of an Object, is not the same
as 'We appropriate', which means that plurality often results in different
relations between us and referents. The choice of relation defines the product of
categorization, and the choice of a referent set defines a class of relations.
3. Taking Baby Steps
3.1. The first step in modeling MT is to take stock of sets of referents of the
Relational Spheres. At first glance, these are nothing more than concepts of a
descending degree of abstraction. Thus, the primary set 'Object' would include
any Physical Object, Imaginary Object, Non-living Object, Living Objects,
Human Object, Animal Object, but here we run into idiosyncratic issues of
encyclopedic (not abstracted) taxonomy.
3.2. We can try to isolate types of relations that a collective Human can enter into
with each set of referents embraced by each Sphere. For example, the Relational
Sphere of "Me ↔ Object(s)" contains 'Other' as a 'physical (non-living) object'. It
specifies types of relations that we can enter into with any referent in that
category, e.g. any 'rock' or 'bone', etc., including relations that any 'rock' or 'bone'
can 'enter into' with us, such as relations of perception, contact, use, value,
rejection, impact, concealment, depiction, description, etc. We 'know' that these
can be further differentiated: e.g. the contact relationship into penetration, surface,
dent, etc., and the use relationship into weapon, adoration, trade, decoration,
and even manufacturing, and so on. How can we formalize this non-encyclopedic
'knowledge'?
3.3. The task is complicated since any relation can surface in more than one
Relational Sphere. Relation types of perception, contact, use, value, rejection,
impact, concealment, depiction, description are available whether the referent set
is an 'animal' or a 'human', a 'tree' or a 'vehicle', a 'sibling' or a 'stranger'. In other
words, we can enter into or be subject to the same type of relations in many
different Spheres and their different sets of referents. But what appears to be a
problem is, in fact, a practical solution.
3.4. Each instance of a relationship with an element from a set of referents is a
relonic pattern that can be parametrically and graphically represented (Kvitash &
Gorbis, 2006d). It is called an aspect application (Gorbis, 2006b). This means
that we cognitively attribute (and mark) the potentiality of any relation (action) as
a cognitive aspect of more than one referent. Semantically, any aspect becomes a
component of meaning of these referents. This may become normative and enter
into dictionaries, but typically, it does not. One of the MT hurdles lies in
a machine's inability to recognize the nature of an application (e.g. 'tank flew' or
'wide open heart'). The other hurdle is that the same text may reflect more than
one aspect application (e.g. 'embrace a tree' vs. 'embrace an idea', or 'free kittens'
vs. 'free prisoners').
3.5. Contrary to perception, the difficulty is not in the polysemy of nouns-
referents but in the versatility of linguistic operant forms (e.g., verbs) identifying
relations of/with/upon referents. In many instances an operant serves so many
different aspect applications as to have no lexicographic meaning, standing as a
substitute for other operants. Take ('consider'), for example, the operant 'take' as
in "I had to take (buy tickets, board, enter) a train to...", "this will take (require)
an effort", "don't take (remove) this without asking me first", "take (bring) this to
my suite", "take (move) the next exit to...", "we had to take (prevent from
movement) her into custody", "take (hold) her bag" and so on.
3.6. Humans have little problem deciphering such word-combinations and many
jokes are based on this capacity. Each application is disambiguated by a universal
cognitive algorithm of categorization that we call a Point-of-View (POV). Each
point-of-view identifies a common cognitive aspect that exists in shared reality as
a potentiality of meaning in each referent. This cognitive algorithm that makes us
laugh is the core of our modeling approach.
3.7. The operant-referent versatility reflects two cognitive dimensions. The
horizontal universal allows the same relationship to identify many objects (set of
referents) from the same POV position, while the universal vertical dimension
allows any referent to be seen from different POV angles, most normatively
established and some freshly minted. Either way, text generation and text
processing all rest on human ability to use multi-form compounds (misleadingly
termed multi-word units (MWUs)) in order to express (or suppress) a dual
connection between any object and any relation-action it can undertake or
undergo. Children's books exploit this potentiality to the fullest. The trick is that
POVs actualize a pre-verbal 'unit of thought', a cognitive unity between a certain
relation-action and a certain referent-set.
3.8. MWUs are not units with equivalent components in more than one language.
Any 'unit' is a two-dimensional cognitive potentiality actualized by a POV choice.
It remains the same but may be represented by divergent word-form groupings in
the same and in different languages. Accounting for the presence and operation of
this cognitive algorithm is the task of cognitive lexicography. The main (but not
the only) connection between Relational Spheres and cognitive-linguistic
behavior is a universal (shared) categorization menu, serving cognitive Points-of-View
(POVs) as options. Correctly choosing or identifying the POV that actualizes an
MWU is a necessary condition for communication. It is the backbone of human
cognitive-linguistic competence (CLC) and of our machine translation model.
4. Modeling Cognitive-Linguistic Competence
4.1. To model CLC, we need to differentiate competence machinery from its
operational algorithms. In humans all operate seamlessly. In machine translation
we must separately develop relational databases (inventories) of: (1) Relational
Spheres, (2) referent sets of each Sphere, (3) individual referents of each set as
word-forms in different codes, (4) relation-type word-forms isolated for each
Sphere and (5) POVs from which anyone can view any referent in our 'shared
reality' as a subject or object of any relation-action. Some of these inventories are
open-ended and some are finite. The key is to connect them through specific
codes serving as metalinguistic coordinates in recognizing strings of text as
equivalent multi-word complexes (MWCs) in different languages.
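One way to picture how the five inventories might be linked by codes is a pair of small record types. The sketch below is an assumption-laden illustration, not the author's schema: the names `POV`, `Lexicon`, `concept_id`, the marker 'C' and the sample entries are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class POV:
    """Inventory (5): a point of view, tied to the other inventories by its marker."""
    marker: str                                            # metalinguistic code
    label: str
    sphere: str                                            # inventory (1)
    referent_set: Set[str] = field(default_factory=set)    # inventory (2), language-neutral ids
    spectrum: Set[str] = field(default_factory=set)        # inventory (4), language-neutral ids


@dataclass
class Lexicon:
    """Inventories (3) and (4): word-forms per language code, keyed by neutral ids."""
    forms: Dict[str, Dict[str, List[str]]] = field(default_factory=dict)

    def add(self, concept_id: str, lang: str, word_form: str) -> None:
        self.forms.setdefault(concept_id, {}).setdefault(lang, []).append(word_form)

    def lookup(self, concept_id: str, lang: str) -> List[str]:
        return self.forms.get(concept_id, {}).get(lang, [])


# Hypothetical sample entries.
pov_c = POV("C", "Man-made Moving Object", "Me <-> Object(s)", {"TAXI"}, {"REPAIR"})
lexicon = Lexicon()
lexicon.add("TAXI", "en", "taxi")
lexicon.add("TAXI", "de", "Taxi")

print(lexicon.lookup("TAXI", "de"))
```

Keeping word-forms in a separate table from the POV records reflects the distinction drawn above between open-ended inventories (word-forms) and finite ones.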
4.2. Cognitive-Linguistic competence necessarily includes all 'linguistic' data such
as vocabularies of referent sets (see, inventories 2 and 3) and means of expansion
into speech that we associate with grammar. Contrary to what we know as
'normative grammar', or 'generative grammar', the rules of this expansion do not
proceed from form to meaning, but in reverse, from a cognitive-operational
structure straight to a linguistic format.
4.3. Sentences are not generated; they are framed by cognitive processing.
Universal Grammar, as a metalanguage of thought structure, is another component
of the CLC model. We have earlier offered a mechanism yielding universal
elements of action-relation such as 'agent', 'object', 'duration', 'permission', 'time',
etc., and a simple methodology of their multi-dimensional expansion into sentence
structures by 'operations' of the 1st, 2nd and 3rd degrees (Gorbis, 2005).
4.4. We should not ignore that POVs are a product of Relational Spheres that also
provide structural blueprints of Universal Grammar for framing thought into its
linguistic expression. Take another look at text strings in 3.5. as speech acts of
command and explanation. These are not linguistic categories but expressions of
an underlying structure of human interactions arising from our 'shared reality'.
The choice/identification of a POV and the actualization of a corresponding
MWC is determined by that structure. POV analysis is one of the key elements of
our design.
4.5. The object of our modeling is the metalanguage of thought operating with
abstractions of primary Relational Spheres. Their sets of referents are
"supernotions" such as Action, Person, Object and Process, which include their main
and optional attributes. Thus, any 'Action' has attributes of agent, duration,
object, effect, etc. Any 'Object' has mass, dimension, shape, volume, composition,
capacity, function, use, etc. Main attributes of 'Person' include presence,
movement, action, inaction, interaction, choice, expression, etc. Secondary
(optional) attributes of 'Person' are opinion, reproduction, nourishment, etc. The
lists of primary and optional attributes are fairly short, but each attribute can be
further differentiated into a new cognitive and verbal structure.
4.6. Universal Grammar frames abstractions into sentences through a hierarchy of
cognitive operations upon supernotions and their attributes. For example, in any
language there will be these 1st-level structures: 'to perform [any] action', 'agent
of action'; and 2nd-level structures such as 'to order to perform an action', 'to ask
to terminate an action', 'to interrupt performance of an action', and so on. At the
deeper (3rd) level we find: 'to request to allow to terminate performance of an
action', 'to order to resume performance of an action', etc. Where focus is on the
agency, all languages provide structures needed 'to request to identify an agent of
action', 'to volunteer to become an agent of an action', 'to suggest an agent of an
action', or even go deeper, as in the operation 'to prevent an agent of an action from
performing an action'.
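The layering of operations can be illustrated with a recursive structure in which each operation wraps the one it applies to. This is a minimal sketch: the class `Frame`, its fields, and the rule that the level equals the number of nested operations are assumptions made for illustration, not the formalism of Gorbis (2005).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Frame:
    """One cognitive operation; `inner` is the operation it is applied to."""
    operator: str
    inner: Optional["Frame"] = None

    def level(self) -> int:
        # Assumed counting rule: one level per nested operation.
        return 1 if self.inner is None else 1 + self.inner.level()

    def gloss(self) -> str:
        # Render the nested operations as an English metalanguage string.
        if self.inner is None:
            return f"to {self.operator}"
        return f"to {self.operator} {self.inner.gloss()}"


perform = Frame("perform an action")
order = Frame("order", perform)
request = Frame("request", Frame("allow", Frame("terminate performance of an action")))

for frame in (perform, order, request):
    print(frame.level(), frame.gloss())
```

The glosses produced here reproduce three of the structures quoted above; a language-specific component would still be needed to map each frame to an actual sentence format.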
4.7. The unfolding of cognitive into language behavior is a process in which
POVs (our ever shifting focus in the direction of thought) play a key role.
Languages might differ in their repertoire of linguistic means through which these
universal frames are expressed. They may also differ in levels where cognitive
operations become overly complex or may still find a corresponding lexico-
grammatical format. Thus, we can find 4th level cognitive operations to be
linguistically appropriate, e.g. 'to request to permit to volunteer to perform an
action' or 'to announce to intend to be an agent to destroy an object' or not as in:
'to ask an agent of an action1 to order an agent of an action to terminate
performance of an action2' when we say: "Please tell him to shut up!" To repeat, it
is the choice of POVs that allows such texts to be recognized as they connect a
referent (person) with an action-relation and an object of action in different ways.
5. POV as the Parametric Key
5.1. We recognize that each POV relationally separates objects, states, and
processes from our shared reality into a (sub)set of referents. Change a POV and
the (sub)set composition changes. Change the text and the POV analysis changes.
Each POV has a specific volume: the size of the referent (sub)set it creates. POVs
corresponding to primary Relational Spheres and their supernotions may cover
the entire nominative dictionary. A 'smaller' POV volume is an open class with
one shared characteristic: a common cognitive POV to which we assign a unique
marker. Thus, all word-forms corresponding to referents from each POV (sub)set
will carry the same POV marker.
5.2. Each POV provides its assembled referents with many types of 'shared
reality' relations. Under each POV any referent can create the same relations and
be subject to the same relations. Consequently, each (sub)set element can engage
in the same action-relations as the others. Conversely, when we need to enter into a
specific relationship, it makes no difference which objects we use. We have seen people
write on their skin, on business cards, or on 100-dollar bills, as well as on phones and
yellow pads. These cognitive subroutines form Musketeer Classes, where one
element can invoke all relations and one relation-action will in turn bring out any
element of a POV (sub)set. The class of relations-actions that belong to a given
POV is called its Spectrum, or a spectrum of operants (Gorbis, 2014). Each
operant from a POV spectrum is assigned its marker. In other words, subset
referents and spectrum operants of the same POV carry the same marker.
5.3. This cognitive economy of a single POV organizing linguistic behavior by
connecting its (sub)set of referents with its Spectrum of operants is fully utilized
in our design. A subset of referents created by POV 'Moving Object' has an
obviously huge volume where any moving referent has potential access to pre-
established cognitive aspects of being: 'observed', 'measured', 'ignored', 'noticed',
'avoided', 'deflected', 'destroyed', etc. sharing the same marker (say, for simplicity,
'A') with any other object that moves. In addition to marker 'A', any referent of a
related POV 'Objects in Motion' (marked 'B') engages in 'moving', 'stopping',
'speeding', 'resuming', 'slowing', 'rotating', 'covering distance', etc. Thus, both the
referent subset and the operant elements of this POV carry markers A and B. The
descending POV of 'Man-made Moving Objects' (marked 'C') would bring a Spectrum of
action-relations that any vehicle from a 'C'-marked referent subset can
undertake as well as undergo.
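The marker assignment for POVs 'A', 'B' and 'C' can be sketched as a lookup in which a word-form collects the marker of every POV it belongs to, together with the markers of the POVs above it. The table below is hypothetical: the parent links, the sample referents and operants, and the function names `marker_chain` and `markers_for` are assumptions chosen to mirror the example, not data from the model. Each sample word is listed only under its most specific POV.

```python
# Hypothetical POV table. Labels follow the text; parent links, referents and
# operants are illustrative assumptions.
POVS = {
    "A": {"label": "Moving Object", "parent": None,
          "referents": {"comet"}, "operants": {"observe", "avoid"}},
    "B": {"label": "Object in Motion", "parent": "A",
          "referents": {"planet"}, "operants": {"stop", "rotate"}},
    "C": {"label": "Man-made Moving Object", "parent": "B",
          "referents": {"taxi", "volleyball"}, "operants": {"manufacture"}},
}


def marker_chain(marker):
    """Return the marker of a POV followed by the markers of the POVs above it."""
    chain = []
    while marker is not None:
        chain.append(marker)
        marker = POVS[marker]["parent"]
    return chain


def markers_for(word):
    """Collect markers from every POV whose referent subset or Spectrum holds the word."""
    found = set()
    for marker, pov in POVS.items():
        if word in pov["referents"] or word in pov["operants"]:
            found.update(marker_chain(marker))
    return "".join(sorted(found))


for word in ("taxi", "volleyball", "stop", "observe"):
    print(word, markers_for(word))
```

Referents and operants are looked up in exactly the same way, which is the point of the shared marker: a referent and an operant of the same POV end up with the same code.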
5.4. We said earlier that each operant from a POV spectrum carries the same
marker as elements of the corresponding referent (sub)set. But how do we
differentiate between a taxi and a volleyball in MT? Both are man-made, both
are moving objects, and both carry the A, B and C markers. Moreover, the same cognitive
economy would allow one relation-action operant to belong to the Spectrums of many
different POVs. See the versatility discussion in 3.5 above, where the word-form
'take' expresses different relations, as in 'to take a cab' and 'to take a ball'.
5.5. Because anything can be perceived from more than one angle, a volleyball is
not just a 'Man-Made Object', it is also a 'D'-marked 'Man-held Object', an 'E'-
marked 'Sports Object', an 'F'-marked 'Man-hit object', a 'G'-marked 'Man-thrown
object', an 'H'-marked 'Value Object' (Dad's prized possession), an 'I'-marked
'Commercial Object' and so on. Unlike a 'ball', a 'taxi' is an element of many
different subsets created by such POVs as 'Q'-marked 'Man-operated object',
'P'-marked 'Hollow Object' and even a 'Z'-marked 'Object with Wheels'. The set of all
POVs potentially applicable to a referent is called an Array (Gorbis, 2006c).
5.6. A POV Array of any referent can thus be represented as a string of markers
that identify this referent. The longer the string, the more likely it is that it is a
unique proprietary code sequence of the referent abbreviated as PCS in what
follows. Each marker in the PCS corresponds to a specific POV that carries an
identically marked Spectrum of operants. Thus, any referent PCS also identifies
all operant Spectrums carried by each POV in that referent's Array. This is the
main parametric feature of our model. Through its PCS, any referent of the word-form
'ball' would: (1) differentiate identical word-forms as corresponding to
different referents, (2) actualize any POV from each referent's PCS-marked
Array, and (3) itself be actualized by any operant from any of the same PCS-marked
Spectrums. A word-form 'take' that corresponds to several operant
Spectrums will (1) actualize all POVs having this operant in their Spectrums, (2)
identify all referents of each of these POVs, and (3) differentiate referents by their
PCSs. In simpler terms, a 'ball' would differ from a 'taxi' because their PCSs would
not coincide in at least one instance, e.g. I cannot 'throw a taxi' but I can 'throw a
ball', even if I can 'kick' both. Parametric codes that do not coincide in at least one
instance lead to complete differentiation.
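The 'ball' versus 'taxi' comparison can be reproduced with plain set operations over marker strings. The following is a sketch under assumptions: the Arrays reuse the letters from the examples above, while the assignment of operants to markers (in particular placing 'kick' under 'C' and 'take' under 'D' and 'Q') and the function names are hypothetical.

```python
# Hypothetical Arrays written as marker sets (letters follow the examples in the text).
ARRAY = {
    "ball": set("ABCDEFGHI"),
    "taxi": set("ABCQPZ"),
}

# Hypothetical Spectrum membership: operant -> markers of the POVs that include it.
SPECTRUM = {
    "throw": {"G"},       # 'Man-thrown object'
    "kick": {"C"},        # assumed to sit in a Spectrum shared by both referents
    "take": {"D", "Q"},   # 'Man-held Object' and 'Man-operated object'
}


def shared_markers(operant, referent):
    """Markers under which this operant and this referent can meet."""
    return SPECTRUM[operant] & ARRAY[referent]


def can_combine(operant, referent):
    """An operant-referent pair is available if at least one marker is shared."""
    return bool(shared_markers(operant, referent))


def differ(ref_a, ref_b):
    """Referents are differentiated when their PCSs fail to coincide somewhere."""
    return ARRAY[ref_a] != ARRAY[ref_b]


print(can_combine("throw", "ball"), can_combine("throw", "taxi"))
print(shared_markers("take", "ball"), shared_markers("take", "taxi"))
print(differ("ball", "taxi"))
```

Under these toy assignments 'take' meets 'ball' under one marker and 'taxi' under another, which is how a single word-form can be tied to two different relations.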
5.7. To generalize, any word-form (referent or operant) can be represented as a
PCS, a unique string of individual POV markers of its referent(s) and
corresponding operant(s). Each referent's Array is different from the Array of
another referent (even if word-forms coincide, as in 'glasses' (pl.) vs. 'glasses', or
'train' vs. 'train', as in 'train of thought' and a 'dress train'), and so the coding for
the corresponding word-forms would be different. Because the PCS for a referent must
be the same as for an operant, any multi-word complex acquires unique
coordinates.
6. Text Indexing and Disambiguation
6.1. A typical translation begins with word-by-word or sentence-by-sentence
processing where punctuation provides a recognizable break. Machine text
processing begins with identification (recognition) of all POV appearances and
mapping of all MWC coordinates. A parametric MT model notes but ignores
processing breaks and treats the text as a whole. This means that it 'recognizes'
continuations of linguistically expressed (initial) thought (subject), its
interruptions, deviations and appearances of new subjects through comparison of
parameters.
6.2. This is accomplished by establishing equivalence of a unit-of-thought
expressed in code A to a unit-of-thought to be found in code B by correlating their
metalingual coordinates. The primary unit-of-thought complex is linguistically
represented by complexes of word-forms such as V+N, N+V, N+N, A+V and A+N
(where V is a verb, N is a noun (or gerund) and A is either a qualifier or a
quantifier). POVs of plural word-forms (e.g. 'parents', 'violins', 'fears') identify
different Spectrums than those of singular forms, and their PCS coordinates will
differ. This allows disambiguation of forms, since the POV of Objects is not the same
as the POV of an Object, and the PCS of a singular form ('parent', 'violin', 'fear')
would invariably differ from that of the plural.
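Correlating coordinates between code A and code B might look like the lookup below. It is a toy illustration only: the two lexicons, every marker assignment, and the helper names `coordinates` and `equivalents` are assumptions, and real inventories would be far larger and would come from the databases described in 4.1.

```python
# Toy lexicons for two codes; every entry and marker is an illustrative assumption.
LEXICON = {
    "en": {
        "operants": {"take": {"D", "Q"}},
        "referents": {"taxi": {"A", "B", "C", "Q"}, "bag": {"C", "D"}},
    },
    "de": {
        "operants": {"nehmen": {"Q"}, "halten": {"D"}},
        "referents": {"Taxi": {"A", "B", "C", "Q"}, "Tasche": {"C", "D"}},
    },
}


def coordinates(lang, operant, referent):
    """MWC coordinates: the markers shared by the operant and the referent."""
    lex = LEXICON[lang]
    return frozenset(lex["operants"][operant] & lex["referents"][referent])


def equivalents(src, operant, referent, dst):
    """Operant-referent pairs in `dst` with the same referent PCS and coordinates."""
    target = coordinates(src, operant, referent)
    ref_pcs = LEXICON[src]["referents"][referent]
    matches = []
    for dst_ref, dst_pcs in LEXICON[dst]["referents"].items():
        if dst_pcs != ref_pcs:
            continue  # a different referent: its PCS does not coincide
        for dst_op in LEXICON[dst]["operants"]:
            if coordinates(dst, dst_op, dst_ref) == target:
                matches.append((dst_op, dst_ref))
    return matches


print(equivalents("en", "take", "taxi", "de"))
print(equivalents("en", "take", "bag", "de"))
```

The same English word-form 'take' resolves to different target operants because the coordinates of the whole complex, not the word-form, drive the match.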
6.3. Any text reflects connected ideas and subjects, and a parametric design can
'predict' the appearance of subsequent subjects and 'anticipate' the development of new
ones in two ways. In treating text as a 'flow', a probability-based model can identify
recurring word-complexes with identical coordinates and 'anticipate' those with
substantially similar ones. The second format of probability-based analysis
compares coordinates of the first unit-of-thought (MWC) expressing a
relationship between a certain subject (referent) and predicate (operant).
Following rules of Universal Grammar (see 4.5 above), this unitary complex
expands by identification of the object of relation-action or a qualifier, specifying
type, quality, quantity, etc., of any of the primary or optional attributes. Text often
reflects developments in which these referents become elements of the next unit.
Furthermore, the MWC as unit-of-thought is a metaphorical seed that expands
into new sentence structures either through a choice of a new POV from the
Array or by a new choice from the operant Spectrum (Gorbis, 2006b).
6.4. The predictive algorithm mirrors progression of text generation framed by
Universal Grammar operations. Thus, any attribute of a supernotion appearing as
an object of a preceding text sequence has significant probability to become the
subject in a subsequent textual sequence. Because many operant-referent PCS
identified MWCs are stable language structures that reflect cognitive economy,
we recognize these complexes as 'ready-to-use' linguistic units. They enter text
production as a whole and are recognized as such by recipients allowing
disambiguation of a psycholinguistic volume of meaning of a non-contextual
word-form available even to beginning language students (Gorbis, 1972). But this
is not true for a machine (until it learns otherwise). We simulate this shared reality
"knowledge" by expanding on our parametric design.
6.5. Every 'unit of thought' expands through 2nd- and 3rd-level Universal Grammar
operations over supernotions and upon attributes of the primary referent classes
(see 4.5 and 4.6 above). For example, when we focus on an object of action or
its duration, all languages provide means to expand their cognitive and linguistic
V+N units on a pre-formatted basis. Simple examples will suffice. A unit 'to
attend school' is cognitively represented as a complex that under Universal
Grammar rules expands as: 'to attend (which) school', 'to attend school (in which
place)', 'to attend school (with whom)', 'to fail to attend school', etc. By identifying
PCS coordinates for word-forms in the text, a machine can 'develop' its own
'cognitive map' of the document, a potent source of disambiguation.
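The pre-formatted expansion of a V+N unit can be sketched as a fixed set of templates applied to the unit. The templates below simply restate the four examples given above; the names `TEMPLATES` and `expand` are hypothetical, and an actual implementation would derive the slots from the attribute lists of the supernotions rather than hard-code them.

```python
# Expansion templates restating the examples in the text; {v} is the operant
# word-form and {n} the referent word-form of the V+N unit.
TEMPLATES = [
    "to {v} (which) {n}",
    "to {v} {n} (in which place)",
    "to {v} {n} (with whom)",
    "to fail to {v} {n}",
]


def expand(verb, noun):
    """Return every templated expansion of a V+N unit."""
    return [template.format(v=verb, n=noun) for template in TEMPLATES]


for line in expand("attend", "school"):
    print(line)
```

Because the templates are independent of the particular unit, the same call works for any other V+N pair, which is the sense in which the expansion is 'pre-formatted'.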
6.6. Instances of simple competition for word-form correlation are resolved with
the unique set of PCS coordinates. More complex cases can be resolved through
comparing POV arrays and spectrums of competitors for the best correlation with
a preceding multi-word complex or structures. Based on the assumption that the
subject matter evolves throughout the text, we can program whole text parametric
analyses by all identifiable POVs and thus greatly limit competing choices of
referent subsets and operant spectrums.
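Choosing among competing readings by their correlation with a preceding complex can be sketched with a set-overlap score. The use of Jaccard similarity here is an assumption made for illustration (the text does not name a measure), and the marker sets, including the hypothetical marker 'W', and the function names are invented for the example.

```python
def jaccard(a, b):
    """Share of markers two PCSs have in common (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_candidate(preceding_pcs, candidates):
    """Pick the reading whose PCS correlates best with the preceding MWC."""
    return max(candidates, key=lambda name: jaccard(preceding_pcs, candidates[name]))


# Hypothetical data: PCS of a preceding complex about boarding a vehicle, and
# two competing referents for the word-form 'train'.
preceding = {"A", "B", "C", "Q"}
readings = {
    "train (vehicle)": {"A", "B", "C", "Q", "P", "Z"},
    "train (of a dress)": {"C", "D", "W"},
}

print(best_candidate(preceding, readings))
```

Restricting the candidate list to POVs already identified in the whole text, as suggested above, would shrink `readings` before any scoring takes place.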
7. Cognitive Lexicography
We recognize that text production is not stringing of words according to language
rules we are still unable to formalize, but identification of equivalent multiple
word complexes (MWCs) evoked through cognitive algorithms of Universal
Grammar operations and POVs. This, and other writings on the subject, suggest
that we can replicate linguistic behavior because we can formalize and build a
parametric model of cognitive-linguistic competence (CLC). Texts no longer
appear as ad-hoc produced speech with constantly shifting meanings. Translation
acquires stability by relying on the universality of the language-mediated representation
of 'shared reality'.
Unlike any other dictionary, where equivalency of meaning is established through
reference to other concepts, our model is a cognitive dictionary simulating
cognitive-linguistic competence (CLC). Its format is a coded hierarchy of all
instances of viewing an object-referent from every conceivable point-of-view
being code-referenced with all relation-action operants that actualize and service
each point-of-view. This design is language-independent and enables cognitive
mapping by machine. This CLC model is a multi-level relational database of
'units-of-thought' as expressed in natural languages. Identified by their unique
coordinates they are subject to further expansion with a number of Universal
Grammar operations. In creating these processing compendiums for natural
languages, we do not translate, but enable machines to establish (full or
significant) equivalency through correlation of code-based coordinates and
identification of equivalent multi-word expressions in each language. Full
realization of this design requires preliminary work with massive language
material to be conducted by many groups in many different languages following a
uniform protocol.
References:
Gorbis, B. (2014). "Cognitive Machine Translation Model: POV and Musketeer Classes of Shared Reality". In Proceedings of the 2014 International Conference on Intelligent Linguistic Technologies (ILINTEC'14), pp. 74-119. CSREA Press.
Gorbis, B. (2006a). "The COG: Making Sentences From Concepts". In Proceedings of the 2006 International Conference on Machine Learning; Models, Technologies & Applications (MLMTA 2006), pp. 43-49. CSREA Press.
Gorbis, B. (2006b). "Borrowing With Interest: Aspect Semantics View of Language Extension and Expansion". In Proceedings of the 2006 International Conference on Machine Learning; Models, Technologies & Applications (MLMTA 2006), pp. 50-56. CSREA Press.
Gorbis, B. (2006c). "Cognitive Dictionary: a Representation of Shared Reality". In Proceedings of the 50th Annual Meeting of the ISSS, Sonoma State University, Rohnert Park, California, USA, July 9-14, 2006. ISSN: 1999-6918.
Kvitash, V. & Gorbis, B. (2006d). "Relonic Properties of Living Systems". In Proceedings of the EMCSR 2006, 18th European Meeting on Cybernetics and Systems Research, pp. 167-172. Vienna, Austria, Austrian Society for Cybernetic Research, April 18-21, 2006.
Gorbis, B. (2005). "A Primitive Model of Metalanguage for Universal Grammar". In Proceedings of the 2005 International Conference on Intelligent Linguistics (MLMTA 2005), pp. 39-45. CSREA Press.
Gorbis, B. (1977). "Psycholinguistics and Generative Lexicography: A Preliminary Description". In "Translator's Journal" ("Tetradi Perevodchika"), issue 14. Foreign Relations Publishing House, Moscow (in Russian).
Gorbis, B. (1972). "The Psycholinguistic Volume of Meaning". In "Methods of Foreign Language Acquisition: Issues and Developments", Kiev (original in Ukrainian).
Boris Gorbis, Esq.
NAFI, 1741 Sunset Plaza Drive, Los Angeles, CA 90069-1311, USA