About
The annual meeting of the Cognitive Science Society is aimed at basic and applied cognitive science research. The conference hosts the latest theories and data from the world's best cognitive science researchers. Each year, in addition to submitted papers, researchers are invited to highlight some aspect of cognitive science.
Volume 9, 1987
Problem Solving I
Modifying Previously-Used Plans to Fit New Situations
Re-using plans that were created for one situation to solve a new problem is often more efficient than creating a new plan from scratch (e.g., [Fikes et al., 1972] and [Carbonell, 1986]). However, a plan that was created for one problem may not exactly fit a new situation; in that case, it will have to be modified. There are two major problems with re-using plans: (1) deciding whether to modify a plan, use it as is, or discard it; and (2) modifying the plan efficiently. Our solution to these problems is to store information with plan preconditions to guide the planner during plan application. Our approach is novel in two ways. First, we have identified a type of precondition, called a flexible precondition, that has information associated with it that helps the planner decide whether or not to modify the plan should the precondition be violated. Second, our preconditions contain information (derived from past experience using the plan) that provides heuristics for changing the plan so that the offending precondition is either no longer violated or no longer necessary. By using this approach, our planner can quickly determine whether or not to modify a plan, then efficiently perform the modification. Our work is implemented in the Consumer-Advisor System (CAS) [Kolodner and Cullingford, 1986; Turner, 1986; Turner, in press], a common-sense advice-giving program.
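To make the mechanism concrete, here is a minimal sketch, in Python, of how decision information and repair heuristics might be stored with a flexible precondition; all names and structures are hypothetical illustrations, not the actual CAS data structures:

    from dataclasses import dataclass, field
    from typing import Callable, List

    @dataclass
    class FlexiblePrecondition:
        test: Callable[[dict], bool]               # does the situation satisfy the precondition?
        worth_modifying: Callable[[dict], bool]    # stored advice: modify the plan or discard it?
        repair_heuristics: List[Callable[[list, dict], list]] = field(default_factory=list)

    def apply_plan(steps, preconds, situation):
        for p in preconds:
            if p.test(situation):
                continue                           # precondition holds; nothing to do
            if not p.worth_modifying(situation):
                return None                        # stored advice says: discard this plan
            for repair in p.repair_heuristics:     # repairs derived from past experience
                candidate = repair(steps, situation)
                if candidate is not None:
                    steps = candidate              # precondition no longer blocks the plan
                    break
            else:
                return None                        # no repair applied; give up on this plan
        return steps

The point of the sketch is only that both decisions the abstract mentions, whether to modify and how to modify, are answered by information attached to the precondition itself rather than by general-purpose replanning.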
Analogical Learning: Mapping and Integrating Partial Mental Models
Descriptions of scientific and technical systems take a number of different forms. Depending upon the purpose of a description, it may focus on a system's behavior, causality, physical or functional topology, or structural composition. An analogical explanation used to teach someone about such a system is also typically geared to one or another of these purposes. In this paper we describe some research leading to the development of a theory of the role of explanatory model types in the generation of analogical mappings. The work is motivated by the larger question of how explanations presented as analogies are applied by students learning about new domains. Our long term goals are (1) the development of a theory of purpose-guided analogical learning, based on a coherent taxonomy of mental model types, and (2) the development of a theory of the integration of partial mental models during learning, using principles for relating different explanatory model types.
Analogy and Similarity: Determinants of Accessibility and Inferential Soundness
Analogy and similarity are widely agreed to be important in learning and reasoning. Yet people are often unable to recall an analogy which would be inferentially useful. This finding suggests that a closer examination of the similarity factors that promote retrieval is necessary. We approached this problem by investigating the role of relational commonalities (higher-order relations and first-order relations) and common object-descriptions in the accessibility and inferential soundness of an analogy. Subjects first read a large number of stories. One week later they were given a new set of stories to read. These new stories were designed to form matches which shared different combinations of object descriptions, first-order relations and higher-order relations with the original stories. Subjects were asked to recall any stories from the original set that came to mind. Afterwards they rated the matches for subjective soundness and similarity. The results of two experiments showed that subjects recalled the original stories that shared common object descriptions and first-order relations with the new stories. These results support the idea that similarity-based access is enhanced by a combination of surface similarity and first-order relations. They also suggest that common higher-order relations play a smaller part in recall. In contrast, in both the soundness-rating and similarity-rating tasks subjects rated the pairs that shared higher-order relations higher than the pairs which shared surface similarity. This suggests that those aspects of similarity that govern recall are different than those aspects that govern similarity ratings and soundness ratings.
Transfer in Problem Solving as a Function of the Procedural Variety of Training Examples
Students often have difficulty solving homework assignments in quantitative courses such as physics, algebra, programming, and statistics. We hypothesize that typical example problems done in class teach students a series of mathematical operations for solving certain types of problems but fail to teach the underlying subgoals and methods which remain implicit in the examples. In the studies reported here, students in probability classes studied example problems that dealt with the Poisson distribution. In Experiment 1, the four examples all used the same solution method, although for one group the examples were superficially more dissimilar than for the other group. All subjects did well on the Near Transfer target problem that used the same subgoals and methods as the training examples. However, most did poorly on two Far Transfer target problems that had different subgoal orders and different methods. These results suggest that subjects typically learn solutions as a series of non-meaningful mathematical operations rather than conceptual methods in a subgoal hierarchy. In Experiment 2, one group studied problems that demonstrated two different subgoal orders using different methods while the other group received superficially different problems which had identical subgoal orders and methods. Both groups still had difficulty with the Far Transfer problems. Subjects who received examples with varied subgoal orders and methods did, however, seem to isolate the subgoals, but not the methods. This result suggests that goals and methods may be useful ways of characterizing training problems. However, students may require explicit instruction on subgoals and methods in order to successfully solve novel problems.
Schema Acquisition from One Example: Psychological Evidence for Explanation-Based Learning
Recent explanation-based learning (EBL) models in AI allow a computer program to learn a schema by analyzing a single example. For example, GENESIS is an EBL system which learns a plan schema from a single specific instance presented in a narrative. Previous learning models in both AI and psychology have required multiple examples. This paper presents experimental evidence that people can learn a plan schema from a single narrative and that the learned schema agrees with that predicted by EBL. This evidence suggests that GENESIS, originally constructed as a machine learning system, can be interpreted as a psychological model of learning a complex schema from a single example.
Facilitation from Clustered Features: Using Correlations in Observational Learning
In learning categories and rules from observation without external feedback, people must make use of the structure intrinsic to the instances observed. Success in learning complex categories from observation, as in language acquisition, suggests that learners must be equipped with procedures for efficiently using structure in input to guide learning. We propose one way that learners make use of the correlational structure available in input to facilitate observational learning: increase reliance on those features discovered to make good predictions about the values of other features. Such a mechanism predicts two types of facilitation from multiple correlations among input features. This contrasts with the effects of correlated features which have been suggested by models addressing learning with explicit feedback. Three experiments investigated learning the syntactic categories of artificial grammars, without external feedback, and tested for the predicted pattern of facilitation. Subjects did show the predicted facilitation. In addition, a simulation of the learning mechanism investigated the conditions under which it would provide most benefit to learning. This research program begins the investigation of procedures which might underlie efficient learning of complex, natural categories and rules from observation of examples.
Linguistics I
A Connectionist Context-Free Parser Which is not Context-Free, But Then It is not Really Connectionist Either
We present a distributed connectionist architecture for parsing context-free grammars. It improves earlier attempts in that it is not limited to parse trees of fixed width and height (i.e., fixed-length sentences). The memory limitations inherent in connectionist architectures come out in an inability to parse center-embedded sentences.
Key words: connectionist parsing, distributed connectionism, context-free grammars.
A Principle-Based Approach To Parsing for Machine Translation
Many parsing strategies for machine translation systems are based entirely on context-free grammars; to try to capture all natural language phenomena, these systems require an overwhelming number of rules; thus a translation system either has limited linguistic coverage, or poor performance (due to formidable grammar size). This paper shows how a principle-based "co-routine design" implementation eases the parsing problem for translation. The parser consists of a skeletal structure-building mechanism that operates in conjunction with a linguistically based constraint module, passing control back and forth until an underspecified skeletal phrase structure is converted into a fully instantiated parse tree. The modularity of the parsing design accommodates linguistic generalization, reduces the grammar size, enables extendibility, and is compatible with studies of human language processing.
Parsing and Generating the Pragmatics of Natural Language Utterances Using Metacommunication
This paper reports a new theory of natural language processing and its implementation in a computer program, DIALS (for DIALogue Structures). This represents a radical departure from the paradigmatic approach to natural language processing currently dominating the fields of artificial intelligence, linguistics, and language philosophy, among others. We use the theory of metacommunication to develop a "pragmatic grammar" for the structural analysis of dialogue. We are currently able to parse and generate over 5000 surface forms of a single underlying request content. We propose using this pragmatic information to manage the communication context, including inferring some of a speaker's goals and controlling status and politeness.
Techistic Natural Language Processes
AI approaches to natural language specify computational processes, yet they are based on the structural concepts of language and grammar which posit necessary conditions at least for the "correct" interpretations of utterances and often also for the syntactic and/or logical representations of utterances. We argue that any such view will fail to account for a variety of important features of language behavior, which we describe. Systems based on the language/grammar model are usually accompanied by heuristic or algorithmic search/selection methods of computation. We contrast these methods with constructive (techistic) computation models based on redundant, inconsistent sets of constraints, and show that this view offers natural accounts of the phenomena described.
PARSNIP: A Connectionist Network that Learns Natural Language Grammar from Exposure to Natural Language Sentences
Linguists have pointed out that exposure to language is probably not sufficient for a general, domain-independent learning mechanism to acquire natural language grammar. This "poverty of the stimulus" argument has prompted linguists to invoke a large innate component in language acquisition as well as to discourage views of a general learning device (GLD) for language acquisition. We describe a connectionist non-supervised learning model (PARSNIP) that "learns" on the basis of exposure to natural language sentences from a million-word machine-readable text corpus (Brown corpus). PARSNIP, an auto-associator, was shown three separate samples consisting of 10, 100 or 1000 syntactically tagged sentences, each 15 words or less. The network learned to produce correct syntactic category labels corresponding to each position of the sentence originally presented to it, and it was able to generalize to another 1000 sentences which were distinct from all three training samples. PARSNIP does sentence completion on sentence fragments, prefers syntactically correct sentences, and also recognizes novel sentence patterns absent from the presented corpus. One interesting parallel between PARSNIP and human language users is the fact that PARSNIP correctly reproduces test sentences reflecting one-level-deep center-embedded patterns which it has never seen before while failing to reproduce multiply center-embedded patterns.
"Word Pronunciation as a Problem in case-Based Reasoning"
English word pronunciation is a challenging knowledge acquisition problem in which general rules are subject to frequent exceptions of an arbitrary nature. We have developed a supervised learning system, PRO, which learns about English pronunciation by training with words and their dictionary pronunciations. PRO organizes its knowledge in a case-based memory which preserves fragments of training items but does not remember specific training items in their entirety. After PRO has created a Case Base in response to a training set, it can pronounce novel test words with substantial degrees of success. Test items are processed by generating a search space in the form of a lateral inhibition network and embedding this search space in a larger network that reflects PRO's previous training experience with relevant fragments. Spreading activation and network relaxation are then used to arrive at a preferred pronunciation for the given test item. In this paper we report preliminary test results based on a training corpus of 750 words and a test set of 300 words.
Connectionism I
A Connectionist Architecture for Representing and Reasoning about Structured Knowledge
µKLONE is the first sub-symbolic connectionist system for reasoning about high-level knowledge to approach the representational power of current symbolic AI systems. The algorithm for building a network takes as input a knowledge base definition in a language very similar to that of KL2, which has previously been implemented only in Lisp. In µKLONE, a concept is more than a set of features: it is a complex of required and optional subparts filling well-defined roles, each of which may have its own type restrictions. In addition to being able to use complex structured descriptions in its reasoning, µKLONE exhibits a facility for plausible inference, due to its inherently parallel constraint satisfaction algorithm, that is not shared by symbolic systems. This paper describes how the system answers a query that requires both of these characteristics. It is hoped that this is the beginning of a response to McDermott's (1986) challenge that connectionists should pay more attention to architectural issues and rely less on learning.
A Connectionist Encoding of Semantic Networks
Although the connectionist approach has led to elegant solutions to a number of problems in cognitive science and artificial intelligence, its suitability for dealing with problems in knowledge representation and inference has often been questioned. This paper partially answers this criticism by demonstrating that effective solutions to certain problems in knowledge representation and limited inference can be found by adopting a connectionist approach. The paper presents a connectionist realization of semantic networks, i.e. it describes how knowledge about concepts, their properties, and the hierarchical relationship between them may be encoded as an interpreter-free massively parallel network of simple processing elements that can solve an interesting class of inheritance and recognition problems extremely fast, in time proportional to the depth of the conceptual hierarchy. The connectionist realization is based on an evidential formulation that leads to principled solutions to the problems of exceptions, multiple inheritance, and conflicting information during inheritance, and the best-match or partial-match computation during recognition.
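The depth-proportional timing claim is easy to picture. In the toy sketch below (ours, not the paper's evidential encoding), a property query propagates up the is-a hierarchy one level per parallel step, so the answer arrives in time proportional to the depth of the hierarchy:

    # Toy is-a hierarchy; each while-iteration stands for one parallel propagation step.
    parents = {"canary": "bird", "bird": "animal", "animal": None}
    local_props = {"canary": {"color": "yellow"},
                   "bird": {"travel": "flies"},
                   "animal": {"alive": True}}

    def inherit(concept, prop):
        steps = 0
        while concept is not None:
            steps += 1
            if prop in local_props[concept]:
                return local_props[concept][prop], steps
            concept = parents[concept]
        return None, steps

    print(inherit("canary", "travel"))   # ('flies', 2): found one level up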
A Distributed Connectionist Representation for Concept Structures
We describe a representation for frame-like concept structures in a neural network called DUCS. Slot names and slot fillers are diffuse patterns of activation spread over a collection of units. Our choice of a distributed representation gives rise to certain useful properties not shared by conventional frame systems. One of these is the ability to encode fine semantic distinctions as subtle variations on the canonical pattern for a slot. DUCS typically maintains several concepts simultaneously in its concept memory; it can retrieve a concept given one or more slots as cues. We show how Hinton's notion of a "reduced description" can be used to make one concept fill a slot in another.
Using Fast Weights to Deblur Old Memories
Connectionist models usually have a single weight on each connection. Some interesting new properties emerge if each connection has two weights: a slowly changing, plastic weight which stores long-term knowledge and a fast-changing, elastic weight which stores temporary knowledge and spontaneously decays towards zero. If a network learns a set of associations and then these associations are "blurred" by subsequent learning, all the original associations can be "deblurred" by rehearsing on just a few of them. The rehearsal allows the fast weights to take on values that temporarily cancel out the changes in the slow weights caused by the subsequent learning.
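The two-weight scheme can be summarized in a few update equations. The sketch below is a minimal illustration with constants and an error-correction rule of our own choosing, not the paper's exact procedure:

    import numpy as np

    rng = np.random.default_rng(0)
    w_slow = rng.normal(size=4)   # plastic weights: long-term knowledge, changed slowly
    w_fast = np.zeros(4)          # elastic weights: temporary knowledge, decay toward zero

    def step(x, target, lr_slow=0.01, lr_fast=0.5, decay=0.95):
        global w_slow, w_fast
        w_fast *= decay                          # spontaneous decay of the fast weights
        err = target - x @ (w_slow + w_fast)     # the effective weight is the sum
        w_slow += lr_slow * err * x              # small permanent change
        w_fast += lr_fast * err * x              # large temporary change
        return err

On this reading, rehearsing a few old associations lets the fast weights take on values that temporarily cancel the interference in the slow weights, which is the "deblurring" the abstract describes.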
On the Connectionist Reduction of Conscious Rule Interpretation
Connectionist models have traditionally ignored conscious rule application in learning and performance. Conceptual problems arise in treating rule application in a connectionist framework because the level of analysis of connectionist models is lower than that which is natural for describing conscious rules. An analysis is offered of the relation between these two levels of description, and of the kind of reduction involved in connectionist modeling. From this vantage point an approach is formulated to the treatment of conscious rule application within a connectionist framework. The approach crucially involves connectionist language processing, and leads to a distinction between two types of knowledge that can be stored in connectionist systems.
Philosophy and Theory
Process and Connectionist Models of Pattern Recognition
The present paper explores the relationship between a process/mathematical model and a connectionist model of pattern recognition. In both models, pattern recognition is viewed as having available multiple sources of information supporting the identification and interpretation of the input. The results from a wide variety of experiments have been described within the framework of a fuzzy logical model of perception. The assumptions central to this process model are (1) each source of information is evaluated to give the degree to which that source specifies various alternatives, (2) the sources of information are evaluated independently of one another, (3) the sources are integrated to provide an overall degree of support for each alternative, and (4) perceptual identification and interpretation follows the relative degree of support among the alternatives. Connectionist models have been successful at describing the same phenomena. These models assume interactions among input, hidden, and output units that activate and inhibit one another. Similarities between the frameworks are described, and the relationship between them explored. A specific connectionist model with input and output layers is shown to be mathematically equivalent to the fuzzy logical model. It remains to be seen which framework serves as the better heuristic for psychological inquiry.
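Assumptions (1) through (4) can be collected into one worked equation. In the usual statement of the fuzzy logical model (paraphrased here with symbols of our own choosing), if s_{kj} in [0, 1] is the degree to which source j supports alternative A_k, then independent evaluation and multiplicative integration give

    t_k = \prod_j s_{kj}, \qquad  P(A_k) = \frac{t_k}{\sum_m t_m}

where the second expression, a relative-goodness rule, implements assumption (4): identification follows the relative degree of support among the alternatives.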
Properties of Connectionist Variable Representations
A theoretical classification of the types of representations possible for variables in connectionist networks has been developed [1]. This paper discusses the properties of some classes of connectionist representations. In particular, the representation of variables in value-unit, variable-unit and intermediate-unit representations is analyzed, and a coarse-fine concept of representation is developed. In addition, the relation between the measurement of a feature and its representation is discussed.
Individualism and Theories of Action
In a recent series of articles Tyler Burge has presented arguments which cut against individualist theories of intentional states. In this paper I shall try to show what consequences Burge's arguments have for individualist theories of behavior. I shall take Jerry Fodor, who is one of the leading exponents of individualism in psychology, as representative of this view. First, I shall lay out one of Burge's arguments against individualist theories of intentional states; second, I shall describe the leading principles of Fodor's individualist metatheory for psychology; and lastly, I shall draw some of the consequences that Burge's arguments have for Fodor's theory of behavior.
The Content of Event Knowledge Structures
Autobiographical retrieval has been modeled as a predictive retrieval process, in which strategies elaborate the original retrieval cue relying on information accessed in knowledge structures to direct the search. Previous studies have demonstrated that event concepts differ in their utility in this process. The present study examines the type of information made available by accessing two such event concepts, activities and general actions. Activity structures are shown to enable more concrete predictions about included objects, people, and setting information, while general actions tend to be associated with internal mental states. These differences in available features are consistent with previously observed retrieval time differences between these types of concepts and support a general underlying mechanism of predictive inferencing in retrieval. The results suggest the types of information that computer models of memory organization should utilize in their representations of event structures and the reasoning mechanisms that depend on those structures.
A clausal form of logic and belief
In this paper, we present a clausal logic of belief which formalizes beliefs in an extended clausal form of logic. Our aims are to solve the representational problem of quantified beliefs and to allow an efficient resolution-like proof procedure with controlled granularity to be developed. A levelled intensional scheme that enables the clausalization of belief is proposed. An inferential-power-bounded resolution rule of beliefs for the formalism is introduced. The formal semantics of the formalism is defined. A general circumscriptive non-monotonic reasoning system for belief revision is described. Finally, a scheme for handling the consistency of beliefs under Tarski's truth definition theorem is developed.
Education and Learning
Question Asking During Procedural Learning: Strategies for Acquiring Knowledge in Several Domains
Questions asked during acquisition of a complex skill reflect the types of knowledge that learners require at different stages. Questions that learners ask themselves may serve to generate incomplete conceptual frames that can be used to guide explanation of future events. Question asking data collected from students learning to use a spreadsheet program suggest that learners initially require knowledge about plans and the structure of the skill domain. Next they require knowledge about the structure of tasks that they will be performing. Finally they concentrate on plan refinement. Models of skill acquisition and explanation-based learning should incorporate mechanisms for monitoring levels of knowledge in several distinct domains and dynamically altering strategies for knowledge acquisition within these domains.
ThinkerTools: Enabling Children to Understand Physical Laws
This project is developing an approach to science education that enables sixth graders to learn principles underlying Newtonian mechanics, and to apply them in unfamiliar problem solving contexts. The students' learning is centered around problem solving and experimentation within a set of computer microworlds (i.e., interactive simulations). The objective is for students to gradually acquire an increasingly sophisticated causal model for reasoning about how forces affect the motion of objects. To facilitate the evolution of such a mental model, the microworlds incorporate a variety of linked alternative representations for force and motion, and a set of gamelike problem solving activities designed to focus the students' inductive learning processes. As part of the pedagogical approach, students formalize what they learn into a set of laws, and critically examine these laws, using criteria such as correctness, generality, and parsimony. They then go on to apply their laws to a variety of real world problems. The idea is to synthesize the learning of the subject matter with learning about the nature of scientific knowledge: its form, its evolution, and its application. Instructional trials found that the curriculum is equally effective for males and females, and for students of different ability levels. Further, sixth graders taught with this approach do better on classic force and motion problems than high school students taught using traditional methods.
From Children's Arithmetic To Medical Problem Solving: An Extension of the Kintsch-Greeno Model
It has been found that expert physicians use forward reasoning in diagnostic explanations of clinical cases. This paper shows that the Kintsch-Greeno model for solving arithmetic word problems, which assumes a forward chaining process, can be extended to explain this phenomenon. The basic approach is to modify the lexicon and the schema structure of the existing simulation program while retaining the basic control structure. The principal modifications are in the structure of the schemata, which make use of three slots: indicator, abnormality and consequence. As with the Kintsch-Greeno theory, the model proceeds by using these schemata to build super-schemata from the propositional representation of the problem text.
A Temporal-Difference Model of Classical Conditioning
Rescorla and Wagner's model of classical conditioning has been one of the most influential and successful theories of this fundamental learning process. The learning rule of their theory was first described as a learning procedure for connectionist networks by Widrow and Hoff. In this paper we propose a similar confluence of psychological and engineering constraints. Sutton has recently argued that adaptive prediction methods c
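For reference, a generic temporal-difference formulation of such a prediction rule (our notation, not necessarily the paper's exact equations) predicts upcoming reinforcement from weighted stimulus traces and corrects the weights by the temporal-difference error:

    V_t = \sum_i w_i x_{i,t}, \qquad
    \Delta w_i = \alpha \left( \lambda_t + \gamma V_t - V_{t-1} \right) \bar{x}_{i,t}

Here \lambda_t is the current reinforcement (the US), \gamma is a discount factor, and \bar{x}_{i,t} is an eligibility trace of recent activity of conditioned stimulus i. Setting \gamma = 0 and treating a whole trial as one step reduces this, roughly, to the Rescorla-Wagner rule mentioned above.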
Connectionism II
A Representation for Natural Category Systems
Most AI systems model and represent natural concepts and categories using uniform taxonomies, in which no level in the taxonomy is distinguished. We present a representation of natural taxonomies based on the theory that human category systems are non-uniform. That is, not all levels of abstraction are equally important or useful; there is a basic level which forms the core of a taxonomy. Empirical evidence for this theory is discussed, as are the linguistic and processing implications of this theory for an artificial intelligence/natural language processing system. We present our implementation of this theory in SNePS, a semantic network processing system which includes an ATN parser generator, demonstrating how this design allows our system to model human performance in the natural language generation of the most appropriate category name for an object. The internal structure of categories is also discussed, and a representation for natural concepts using a prototype model is presented and discussed.
Cascaded Back-Propagation on Dynamic Connectionist Networks
The Back-Propagation algorithm of Rumelhart, Hinton, and Williams (1986) is a powerful learning technique which can adjust weights in connectionist networks composed of multiple layers of perceptron-like units. This paper describes a variation of this technique which is applied to networks with constrained multiplicative connections. Instead of learning the weights to compute a single function, it learns the weights for a network whose outputs are the weights for a network which can then compute multiple functions. The technique is elucidated by example, and then extended into the realm of sequence learning, as a prelude to work on connectionist induction of grammars. Finally, a host of issues regarding this form of computation are raised.
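The cascaded arrangement can be sketched in a few lines. In this illustration (hypothetical sizes and data, not the paper's experiments), a context vector multiplicatively determines the weights of a second network, so one learned parameter block M yields a different input-output function for every context:

    import numpy as np

    rng = np.random.default_rng(0)
    n_ctx, n_in, n_out = 3, 4, 2
    M = rng.normal(scale=0.1, size=(n_ctx, n_in * n_out))  # learned by back-propagation

    def cascaded_forward(c, x):
        W = (c @ M).reshape(n_in, n_out)        # context sets the weights multiplicatively
        return 1.0 / (1.0 + np.exp(-(x @ W)))   # ordinary sigmoid layer with those weights

    c = np.array([1.0, 0.0, 0.0])               # choosing a context chooses a function
    x = rng.normal(size=n_in)
    print(cascaded_forward(c, x))

Because the weights W are themselves outputs of a differentiable product, error gradients flow back into M, which is the sense in which the connections are constrained and multiplicative.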
Implementing Stages of Motion Analysis In Neural Networks
A neural model is proposed for human motion perception. The goal of the model is to calculate the two-dimensional velocity of elements in an image. Unlike most earlier approaches, the present model is structured in accord with known neurophysiological data. Three distinct stages are proposed. At the first level, units are sensitive to the components of motion that are perpendicular to the orientation of a moving contour. The second level integrates these initial motion measurements to obtain translational motion. The third level uses translational motion measurements to compute general three-dimensional motion such as rotation and expansion. The model shows a high level of performance in solving the measurement of two-dimensional translational motion from local motion information. Most importantly, the present model uses nervous system structure as a natural way to formulate constraints. The psychological implications of staged motion processing are discussed.
Consistency and Variation in Spatial Reference
Modeling the meaning and use of linguistic expressions describing spatial relationships holding between a target object and a landmark object requires an understanding of both the consistency and variation in human performance in this area. Previous research [Herskovits 1985] attempts to account for some of this variation in terms of the angular deviation holding among objects in the visual display. This approach is shown to fail to account for the full range of human variation in performance, and a specific alternative algorithm is offered which is grounded in task variability and the notions of corridor and centroid. The significance to this algorithm of task variation, of the separation of semantic from pragmatic issues, and of the role of function and structure is discussed.
Linguistic Descriptions of Visual Event Perceptions
In this paper we address the problem of constructing a computational device that is able to describe in natural language its own conceptualization of visual input. This addresses the basic issues of event perception from raw data, as well as what connection a language with a limited vocabulary has to this event construction. We outline a model of how the perceptual primitives in a system act to both constrain the possible conceptualizations and naturally limit the language used to describe events.
RHO-Space: A Neural Network for the Detection and Representation of Oriented Edges
This paper describes a neural network for the detection and representation of oriented edges. It was motivated both by the inherent ambiguity of convolution-style edge operators, and the processing of oriented edge information in biological vision systems. The input to the network is the output of oriented edge operators. The computations within the network are based on orientation-dependent, three-dimensional, excitatory and inhibitory neighborhoods in which computations such as lateral inhibition and linear excitation can occur. Rho-space has a variety of interesting properties, which have been investigated. These include:
1) Both coarse and fine representation of the orientation information is possible.
2) No global thresholding is required, and the local adaptive thresholding is localized in orientation, as well as in spatial position.
3) The filling-in of dotted and dashed lines readily occurs.
4) There is a natural representation of connectivity, which agrees with human perception.
5) Illusory contours of one type produced by the human visual system are produced.
6) All processing is completely data-driven, and no domain-dependent knowledge or model-based processing is used.
Learning Internal Representation From Gray-Scale Images: An Example of Extensional Programming
The recent development of powerful learning algorithms for parallel distributed networks has made it possible to program computation in a new way. These new techniques allow us to program massively parallel networks by example rather than by algorithm. This kind of extensional programming is especially useful when there are no known techniques for solving a problem. This is often the case with the computations associated with basic cognitive processes such as vision and audition. In this paper we apply the technique to the problem of learning an efficient internal representation of image information directly from a gray-scale image. We compare the results of this to the engineering version of this problem, i.e., image compression. Our results demonstrate that a very simple learning method learns internal representations that are nearly as efficient as those developed by the best known techniques in image compression. Thus we have a technique whereby neuron-like networks can self-organize to form a compact representation of a visual environment.
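A minimal sketch of the idea, with toy data and sizes of our own choosing: train an auto-associative network to reproduce gray-scale patches through a narrow hidden layer, so that the hidden activities become a compact code, much as an image-compression scheme would produce:

    import numpy as np

    rng = np.random.default_rng(0)
    patches = rng.random((500, 64))        # stand-in for 8x8 gray-scale image patches
    patches -= patches.mean(axis=0)        # work with zero-mean pixel values
    W_enc = rng.normal(scale=0.1, size=(64, 16))
    W_dec = rng.normal(scale=0.1, size=(16, 64))

    def mse():
        return np.mean((patches @ W_enc @ W_dec - patches) ** 2)

    before = mse()
    for _ in range(1000):                  # plain gradient descent on squared error
        h = patches @ W_enc                # the 16-unit code is the learned representation
        err = h @ W_dec - patches
        g_dec = h.T @ err / len(patches)
        g_enc = patches.T @ (err @ W_dec.T) / len(patches)
        W_dec -= 0.1 * g_dec
        W_enc -= 0.1 * g_enc
    print(before, mse())                   # reconstruction error falls with training

The 16 hidden activities per 64-pixel patch play the role of the compressed image code.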
Connectionism III
Order Information and Distributed Memory Models
Current versions of distributed models have difficulty in accounting for the representation of order information in matching tasks. In this article, experiments are presented that allow discrimination between physical and ordinal representations of order information, discrimination between position-dependent codes and context-sensitive codes, and generalization of the results of matching tasks from strings of letters to long-term memory for triples of words. Data from these experiments constrain the kinds of models that can be developed to account for matching and order, and present problems for several current memory models, including connectionist models. Suggestions are made for modifications of these models to account for the results from matching tasks.
A Parallel Natural Language Processing Architecture with Distributed Control
This paper describes work on the autonomous semantic network (ASN) knowledge representation and natural language processing architecture and its implementation, the NO HANS simulator. An ASN is an enhanced spreading activation semantic network, but one without a centralized controller. Rather, in addition to the semantic network there are types of nodes which have the ability to change links or add nodes in the network. These nodes are activated by the energy spreading through the underlying semantic network. Thus, the same spreading activation which infuses the knowledge representation also drives the control mechanism. Because of this, ASNs offer a compromise between the distributed but restrictive connectionist model and the powerful but heretofore essentially serial conceptual natural language processing models. Spreading activation is also the basis for the search capability, which is loosely based on the connectionist winner-take-all idea. We construct a simple conceptual analyzer in this model and indicate how it works.
Organization of Action Sequences in Motor Learning: A Connectionist Approach
This paper presents a connectionist model of motor learning in which performance becomes more and more efficient by "chunking" output sequences, organizing small action components into increasingly large structures. The model consists of two sequential networks: one that maps a stationary representation of an intention to a sequence of action specifications or action plans, and one that maps an action plan to a sequence of action components. As the network is trained to produce output sequences faster and faster, the units that represent the action plans gradually discover representational formats that can encode larger and larger chunks of subsequences. The model also shows digraph frequency effects similar to those observed in typewriting, and it generates capture errors similar to those observed in human actions.
Learning Acoustic Features From Speech Data Using Connectionist Networks
A method for learning phonetic features from speech data using connectionist networks is described. A temporal flow model is introduced in which sampled speech data flows through a parallel network from input to output units. The network uses hidden units with recurrent links to capture spectral/temporal characteristics of phonetic features. A supervised learning algorithm is presented which performs gradient descent in weight space using a coarse approximation of the desired output as a target function. A simple connectionist network with recurrent links was trained on a single instance of the word pair "no" and "go" represented as fine-timescale filterbank channel energies, and successfully learned to discriminate the word pair. The trained network also correctly separated 98% of 25 other tokens of each word by the same speaker. The same experiment for a second speaker resulted in 100% correct discrimination. The discrimination task was performed without segmentation of the input, and without a direct comparison of the two items. A second experiment designed to extend the use of this model to discrimination of voiced stop consonants in various vowel contexts is described. Preliminary results are described in which the network was optimized using a second-order method and learned to correctly classify the voiced stops. The results of these experiments show that connectionist networks can be designed and trained to learn phonetic features from minimal word pairs.
Teaching a Minimally Structured Back-Propagation Network to Recognise Speech Sounds
An associative network was trained on a speech recognition task using continuous speech. The input speech was processed to produce a spectral representation incorporating some of the transformations introduced by the peripheral auditory system before the signal reaches the brain. Input nodes to the network represented a 150-millisecond time window through which the transformed speech passed in 2-millisecond steps. Output nodes represented elemental speech sounds (demisyllables) whose target values were specified based on a human listener's ability to identify the sounds in the same input segment. The work reported here focuses on the experience and training conditions needed to produce natural generalizations between training and test utterances.
Revealing the Structure of NETtalk's Internal Representations
NETtalk is a connectionist network model that learns to convert English text into phonemes. While the network performs the task with considerable accuracy and can generalize to novel texts, little has been known about what regularities the network discovers about English pronunciation. In this paper, the structure of the internal representation learned by NETtalk is analyzed using two varieties of multivariate analysis, hierarchical clustering and factor analysis. These procedures reveal a great deal of internal structure in the pattern of hidden unit activations. The major distinction revealed by this analysis of hidden units is vowel/consonant. A great deal of substructure is also apparent. For vowels, the network appears to construct an articulatory model of vowel height and place of articulation even though no articulatory features were used in the encoding of the phonemes. This interpretation is corroborated by an analysis of the errors or confusions produced by the network: the network makes substitution errors that reflect these posited vowel articulatory features. These observations subsequently led to the discovery that articulatory features of place of articulation and, to some extent, vowel height, are largely present in first-order correspondences between vowel phonemes and their spellings. This work demonstrates how the study of language may be profitably augmented by models provided by connectionist networks.
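The analysis method is straightforward to reproduce in outline. The sketch below (with random stand-in data; the real analysis would use the activation vector evoked by each letter-to-phoneme correspondence) hierarchically clusters hidden-unit activation vectors so that groupings such as the vowel/consonant split become visible in the merge structure:

    import numpy as np
    from scipy.cluster.hierarchy import linkage, dendrogram

    rng = np.random.default_rng(0)
    pairs = ["a->/a/", "e->/e/", "b->/b/", "d->/d/"]   # hypothetical letter-phoneme pairs
    activations = rng.random((len(pairs), 80))         # e.g., 80 hidden-unit activities each

    Z = linkage(activations, method="average")         # agglomerative clustering
    tree = dendrogram(Z, labels=pairs, no_plot=True)   # inspect the merge structure
    print(Z)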
Linguistics II
Generation of Simple Sentences in English Using the Connectionist Model of Computation
This paper discusses the design and implementation of a connectionist system for generation of well-formed English sentences of limited length and syntactic variability. The design employs several levels of interacting units for making appropriate decisions. It uses a simple technique for specifying assignment of input concepts to roles in a sentence and also has a reusable subnetwork for the expansion of noun phrases. The same NP-subnetwork is used for the expansion of noun phrases corresponding to the subject as well as the object phrases of the generated sentences. The input to the system consists of parallel activation of a cluster of nodes representing conceptual specification of the sentence whereas the output is in the form of sequential activation of nodes corresponding to the words constituting the sentence. The system can produce simple sentences in both active and passive voices, and in several tenses. Results of a simulation experiment are also included.
Inferences in Sentence Processing: The Role of Constructed Representations
Recent studies have revealed interesting differences between lexical decision and naming tasks. Naming responses seem to be primarily sensitive to lexical processes and lexical decisions to both lexical and message-level processes. This differential sensitivity to level of representation was used to investigate the following questions: (1) Are probable instruments for an action routinely inferred during sentence comprehension? Previous work may have failed to show that instruments are inferred, in part, because processing measures were used that were relatively insensitive to the level of representation involved in the inference. (2) If instruments are inferred, does this process require accessing elements of the linguistic or the constructed representation? Four experiments were performed that used cross-modal lexical decision and naming tasks as measures of instrument priming in sentences that implied the use of an instrument. No priming was found for sentences with no context, replicating Dosher and Corbett (1982). When sentences were preceded by a context that explicitly mentioned the instrument, however, priming was found with the lexical decision task. In combination with the results of the first two experiments, this suggests that instruments are inferred when the instrument implied by a sentence is available from the context but not when sentences are presented without contexts. Priming was not found with the naming task, however. The lexical decision/naming data together suggest that making an instrument inference involves accessing elements of a constructed representation of the discourse. In addition, in sentences that contained pronouns that referred to the instruments, priming was found for appropriate referents with the lexical decision task but not with naming. This suggests that locating antecedents for pronouns also involves a constructed representation.
Semantic Relations, Metonymy, and Lexical Ambiguity Resolution: A Coherence-Based Account
An account of coherence is proposed which tries to clarify the relationship between semantic relations, metonymy, and the resolution of lexical ambiguity. Coherence is the synergism of knowledge (synergism is the interaction of two or more discrete agencies to achieve an effect of which none is individually capable) and plays a substantial role in cognition. In this account of coherence, semantic relations and metonymy are instances of coherence, and coherence is used for lexical ambiguity resolution. This account of coherence, semantic relations, metonymy and lexical ambiguity resolution is embodied in Collative Semantics (CS), which is a domain-independent semantics for natural language processing. A natural language program called meta5 uses CS; an example of how it discriminates a metaphorical relation is given.
Thematic Roles in Language Processing
We present some ideas about how thematic roles (case roles) associated with verbs are used during on-line language comprehension along with some supporting experimental evidence. The basic idea, following Cottrell (1985), is that all of the thematic roles associated with a verb are activated in parallel when the verb is encountered. In addition, we propose that thematic roles are provisionally assigned to arguments of the verbs as soon as possible, with any thematic roles incompatible with such an assignment becoming inactive. Active thematic roles that are not assigned arguments within the sentence are entered into the discourse model as unspecified entities or addresses. In our first experiment we show that temporary garden-paths arise when subjects initially assign the wrong sense to a verb as in Bill passed the test to his friend, but not when subjects initially assign the wrong role to the noun phrase, as in Bill loaded the car onto the platform. This prediction follows directly from our assumptions. In our second experiment we show that definite noun phrases without explicit antecedents in the preceding discourse can be more readily integrated into a preceding discourse when they can be indexed to an address created by an open thematic role.
Can Synchronization Deficits Explain Aphasic Comprehension Errors?
The context-dependent nature of language processing requires the synchronization of several subprocesses over time. One claim which follows is that de-synchronization is likely to disturb language processing. Aphasia is a language disorder which arises following certain types of brain lesion. Many theories of aphasia are competence-based theories and do not address aspects of performance which can be affected under conditions of processing degradation. The discussion in this paper will focus on de-synchronization as a possible explanation for aphasic language comprehension problems. HOPE, a computer model for single-sentence comprehension, provides a tool to systematically study the effects of various hypothesized de-synchronization problems on different language processing levels and on overall comprehension performance. HOPE includes a neural-like architecture that incorporates a grammar which functions in a predict/feedback manner. It illustrates one way in which serial-order input can map into synchronous, parallel subprocesses that can effectively produce normal sentence comprehension performance. Using HOPE, the study of explicit de-synchronization effects on a cover set of stimulus sentences suggests error patterns to be sought in neurolinguistic evidence. Within a subset of a cover set of stimuli, simulation results from a slowed-propagation lesion experiment demonstrate how timing problems can result in observed aphasic comprehension performance.
Artificial Intelligence and Simulation I
A Model of Purpose-driven Analogy and Skill Acquisition In Programming
X is a production system model of the acquisition of programming skill. Skilled programming is modelled by the goal-driven application of production rules (productions). Knowledge compilation mechanisms produce new productions that summarize successful problem solving experiences. Analogical problem solving mechanisms use representations of example solutions to overcome problem solving impasses. The interaction of these two mechanisms yields productions that generalize over example and target problem solutions. Simulations of subjects learning to program recursive functions are presented to illustrate the operation of X.
The Operational Level of a Commonsense Planner
This paper characterises the operational level of a commonsense planner. In particular it will compare two approaches to the problem of level of operation. A 'plan instantiater' planner has access to a sort of summarisation of the activities involved in each of the subkinds of a given category of plans, and by a process of instantiation it selectively adds details to match the current planning situation. Such planners abandon the details of the lower plan and dynamically recreate them under the exigencies of a given situation. A 'reference point' planner selects a subordinate plan, from a given category of plans, to represent the category as a whole. The reference point planner assumes a level of operation which can directly access a greater number of functional details than its plan instantiation counterpart, but perhaps at the loss of flexibility. The thrust of this paper is that reference point planners are the appropriate model of planning for the commonsense domain.
Seas: A Dual Memory Architecture For Computational Cognitive Mapping
We introduce a dual memory architecture that, by way of computing conditioned-conditioned stimulus (CS-CS) associations and conditioned-unconditioned stimulus (CS-US) associations, is capable of computational cognitive mapping. The network is able to describe complex classical conditioning paradigms in which cognitive mapping is presumably involved, such as blocking, overshadowing, sensory preconditioning, second-order conditioning, compound conditioning, and serial compound conditioning. By assuming that limbic-cortical regions of the brain are involved in CS-CS associations, the network is able to describe several cognitive impairments that have been reported after limbic-cortical lesions.
JANUS: An Architecture for Integrating Automatic and Controlled Problem Solving
This paper attempts to unify two problems in cognitive science: the relationship between "controlled" and "automatic" processing and the competing computational models of intelligence proposed by symbolic Artificial Intelligence and the connectionist school. An architecture is proposed in which symbolic and connectionist problem solving systems interact and take advantage of their different strengths. It is argued that the resulting system can account for much of the problem solving behavior associated with automatic and controlled processing as well as their complex interplay. Thus, the architecture can account for how expertise can be transformed from "explicit" to "compiled" forms via automatization, and how the opacity of the resulting automatic behavior can be counterbalanced in a cognitively plausible manner by explanations generated ex post facto.
Problem Solving II
Domain Specificity and Knowledge Utilization In Diagnostic Explanation
This paper examines the performance of cardiologists, psychiatrists, and surgeons in diagnostic explanations of cases within and outside their domain. The protocols were analyzed by techniques of transforming a propositional representation into a semantic network. Some graph-theoretic criteria for analyzing semantic networks are used for precision of analysis. The results show that the subjects interpret cases in terms of the familiar component of the problem, using specific domain knowledge. This is related to forward-directed reasoning. Unfamiliar or uncertain components of the disorder are either ignored or explained using backward reasoning strategies. A tendency to move from a forward-driven strategy to a backward-driven strategy and vice versa is also seen in some protocols. This sequence is repeated a number of times to form a chain consisting of forward/backward reasoning sequences. This has implications for how subsequent patient information is processed in order to make decisions for treatment and management.
Measuring Change and Coherence in Evaluating Potential Change in View
In changing your view, you must balance the amount of change involved against the improvement in explanatory coherence resulting from the change. Even if change and improvement in coherence are measured by simply counting, there can be no general requirement that the number of modified items (added or subtracted) be no greater than the number of new explanatory links and implications. The relation between conservatism and coherence is more complex than that.
Predictive Versus Diagnostic Reasoning in the Application of Biomedical Knowledge
Clinical problem solving involves both diagnostic and predictive reasoning. Diagnostic reasoning is characterized by inference from observations to hypotheses; predictive reasoning, by inference from hypotheses to observations. We investigate the use of such strategies by medical students at three levels of training in explaining the underlying pathophysiology of a clinical case. Our results show that without a sound, pre-existing disease classification, the use of basic biomedical knowledge interferes with diagnostic reasoning; however, with sound classification, biomedical knowledge facilitates both diagnostic and predictive reasoning.
Planning Stories
Story generation can best be viewed as a planning task. We describe here UNIVERSE, a program that generates melodrama plot outlines using hierarchical planning methods. Examples are given of the program creating story outlines using a set of characters that it also created. We indicate that story telling is open-ended, does not have to be perfect, and has unclear evaluation criteria, and we contrast the sort of planning needed for story telling with other planning tasks. We suggest that certain elements of the methods used by UNIVERSE could be usefully applied to other tasks.
Problem Representation and Hypothesis Generation in Diagnostic Reasoning
In this paper we examine the role of domain knowledge in the process of hypothesis generation and problem representation during diagnostic reasoning. An on-line task environment and the combination of discourse and protocol analysis techniques were used to test the differences between two groups of experts solving a clinical problem. The groups consisted of high domain-knowledge (HDK) subjects, endocrinologists, and low domain-knowledge (LDK) subjects, cardiologists. The results show that HDK subjects used a more efficient process of diagnostic reasoning and generated a more coherent representation of the problem. A two-stage model describing the process of hypothesis generation was proposed to explain these differences.
Explanation-Based Decision Making
In complex decision tasks the decision maker frequently constructs a summary representation of the relevant evidence in the form of a causal explanation and relies on that representation, rather than the "raw" evidence base, to select a course of action from a choice set of decision alternatives. We introduce a general model for this form of decision making, called explanation-based decision making because of the central role played by the intervening evidence summary. Several original empirical studies of judicial decision making, a prototype of the class of explanation-based decision tasks, are reviewed and the findings are adduced in support of the explanation-based decision model. In legal decision making tasks subjects spontaneously construct evidence summaries in the form of stories comprising the perceived underlying causal relationships among decision-relevant events. These explanations are primary mediators (i.e., causes) of the subjects' decisions.
Diagnosing Errors in Statistical Problem-Solving: Associative Problem Recognition and Plan-Based Error Detection
This paper describes our model for diagnosis of student errors in statistical problem solving. A simulation of that diagnosis, GIDE, is presented together with empirical validation on student solutions. The model consists of two components. An "intention-based" diagnostic component analyzes solutions and locates errors by trying to synthesize student solutions from knowledge about the goal structure of the problem and related knowledge about planning errors. This approach can account for about 82% of the lines and over 95% of the goals in a set of 60 student t-tests. When solutions contain errors in procedural implementation, such plan-based analysis is quite effective. In many cases, however, students do not pursue an "appropriate" solution path. The diagnostic model, therefore, includes a second component which is used to determine which type of problem the student is solving; it is modeled by a spreading activation network of statistical knowledge. On a sample of 38 student solutions, the simulation correctly identified 86% of the problem types. The model appears to account for a wide range of problem-solving behavior within the domain studied. The preliminary performance data suggest that our model may serve as a useful part of an intelligent tutoring system.
A Time-Dependent Distributed Processing Model of Strategy-Driven Inference Behavior
Experimental evidence suggests that some readers make inference decisions early on in text understanding and mold the inferences from later text to fit with the earlier inferences, while other readers postpone inference decisions until later in the text and then base their final interpretation of the text on those postponed inferences. This behavior has been called strategy-driven inference behavior because it was originally ascribed to different strategies used by readers to guide the course of their inference decisions. This paper presents a new theory of how this behavior comes about, attributing the observed differences in behavior not to different strategies but to very small differences in the underlying cognitive architecture. This theory is illustrated by a simple model of inference processing during text understanding. The inference processing model employs a hybrid connectionist network whose behavior is extremely sensitive to the order of activation of nodes in the network, which in turn corresponds to the order of presentation of events in the story.
Capitalizing on Failure Through Case Based Inference
In case-based reasoning, previous reasoning experiences are used directly to solve new problems and make inferences, rather than doing those tasks from scratch using generalized methods. One major advantage of a case-based approach is that it can help a reasoner avoid repeating previously-made mistakes. When the case-based reasoner is reminded of a case in which a mistake was made, it provides a warning of the potential for a mistake. If the previous case was finally solved successfully, it can provide a suggestion of what to do instead. In this paper, we describe the process by which a case-based reasoner can take advantage of previous failures. We illustrate with cases from the domains of common-sense mediation and menu planning and show a program called JULIA reasoning in the domain of menu planning.
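A toy sketch of failure-based reminding, using the menu-planning domain; the case structure and matching rule are invented for illustration and are not JULIA's representation.

    # Each stored case records its index features, the mistake made, and
    # the repair that finally worked.
    CASES = [
        {"features": {"guest-is-vegetarian", "main-dish-is-meat"},
         "failure": "guest could not eat the main dish",
         "repair": "plan a separate vegetarian entree"},
    ]

    def warn_and_suggest(new_features):
        """Retrieve a failure case whose index features all recur in the
        new problem; return its warning and its suggested repair."""
        for case in CASES:
            if case["features"] <= new_features:   # all index features present
                return ("Warning: " + case["failure"],
                        "Suggestion: " + case["repair"])
        return None

    print(warn_and_suggest({"guest-is-vegetarian", "main-dish-is-meat",
                            "formal-dinner"}))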
Problem Solving in a Natural Task as a Function of Experience
Problem solving is known to vary in some predictable ways as a function of experience. In this study, we have investigated the effects of experience on the problem solving behavior and knowledge base of workers in an applied setting: automobile mechanics. The automobile itself is a highly complex system with many interconnected subsystems. Problem descriptions (i.e., symptoms) presented to a mechanic who needs to diagnose a car, however, are usually quite sketchy, requiring the collection of more information before solution. Novices are less able than experts to diagnose any but the obvious problems, and we are interested in identifying the qualitative differences between mechanics at different levels of expertise. In the study reported, we observed three student mechanics in a postsecondary technical school, each at a different level of expertise, diagnose six problems introduced into cars in the school. We then analyzed the protocols we collected to find the knowledge and strategies used in solving each problem. We also analyzed the series of protocols for each student to find the changes in knowledge and strategies used in solving later problems as compared to earlier problems. Differences were seen in both the knowledge used by the subjects and in their general approach to diagnosis. As a result of experience, the student mechanics seemed to improve in three areas: (1) their knowledge of the relationships between symptoms and possible failures was augmented, (2) their causal models of the car's systems were augmented, and (3) their general troubleshooting procedures and decision rules were much improved.
Distinguishing - A Reasoner's Wedge
In this paper we focus on the Distinguisher's Wedge, an intellectual tool for responding to an argument that two cases are alike by asserting reasons why they are different and why the differences matter. We characterize the wedge as involving a search for distinctions, factual differences between the cases that tie into justifications for treating them differently. We show how the wedge can be modelled computationally in a Case-Based Reasoning ("CBR") system using precedential justifications and describe how the model is realized in our HYPO program, which performs legal reasoning in the domain of trade secret law. Legal argument, with its emphasis on citing and distinguishing precedents and lack of a strong domain model, is an excellent domain for studying the wedge. We show how HYPO uses "dimensions", "case-analysis-record" and "claim lattice" mechanisms to cite and distinguish real cases and suggest how the model may be extended to cover more sophisticated kinds of distinguishing.
Belief Revision and Induction
This paper describes how inductively produced generalizations can influence the process of belief revision, drawing examples from a computational model of scientific discovery called REVOLVER. This system constructs componential models in chemistry, using techniques from truth maintenance systems to resolve inconsistencies that arise in the course of model formulation. The latter process involves reinterpreting observations (premises) given to the system and selecting the best of several plausible revisions to make. We will see how generalizations aid in such decisions. The choice is made by considering three main factors: the number of models each premise supports, the number of premises supporting the generalized reaction, and whether a proposed revision to that premise matches any predictions made by any generalizations. Based on these factors, a cost is assigned to each premise being considered for revision; the hypothesis (set of revisions) having the lowest cost is chosen as best, and its revisions are carried out. By viewing generalized premise reactions as a paradigm, we will argue that the revision process of REVOLVER models how scientific paradigms shift over time.
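The abstract's three factors suggest a simple additive cost; the concrete weights and sign conventions below are our guesses for illustration, not REVOLVER's actual scoring.

    def premise_cost(models_supported, support_for_generalization,
                     revision_matches_prediction):
        """Premises that support few models, whose generalization is weakly
        supported, and whose revision agrees with a prediction are cheaper
        to revise (illustrative weights, not REVOLVER's)."""
        cost = 2.0 * models_supported + 1.0 * support_for_generalization
        if revision_matches_prediction:
            cost -= 3.0  # revisions predicted by a generalization are favored
        return cost

    # The hypothesis (set of revisions) with the lowest total cost is chosen.
    hypotheses = {"revise-premise-A": premise_cost(1, 2, True),
                  "revise-premise-B": premise_cost(3, 5, False)}
    print(min(hypotheses, key=hypotheses.get))  # revise-premise-A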
Hot Cognition Mechanisms For Motivated Inference
We present an implemented computational theory of motivated inference intended to account for a variety of experimental results. People make motivated inferences when their conclusions are biased by their general motives or goals. Our theory postulates four elements to account for such biasing: (1) a representation of the self, including attributes and motives; (2) a mechanism for evaluating the relevance of a potential conclusion to the motives of the self; (3) mechanisms for motivated memory search to retrieve desired conceptions of the self and evidence supporting desired conclusions; and (4) inference rules with parameters that can be adjusted to encourage desired inferences and impede undesired ones.
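Element (4) can be caricatured as a decision criterion shifted by motive relevance; the numbers below are invented, and this is only a cartoon of one of the four mechanisms.

    def accept_conclusion(evidence_strength, desirability,
                          base_threshold=0.7, bias=0.2):
        """Motivated inference as a biased criterion: desired conclusions
        (desirability near +1) need less evidence, undesired ones
        (near -1) need more."""
        threshold = base_threshold - bias * desirability
        return evidence_strength >= threshold

    print(accept_conclusion(0.6, desirability=+1.0))  # True: 0.6 >= 0.5
    print(accept_conclusion(0.6, desirability=-1.0))  # False: 0.6 < 0.9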
Linguistics III
A Tale of Two Brains or The Sinistral Quasimodality of Language
Four experiments show that people differ strongly in the extent to which they depend on linguistic structure during language comprehension. Structure-dependent people are immediately affected by grammatical variables, while structure-independent people are less affected by such variables. A surprising population difference between the two types of people suggests a genetic and neurological basis for the behavioral difference. All subjects were right-handed. However, structure-dependent people report no left-handers in their family, while structure-independent people do report left-handers in their family. This suggests that the neurological organization for linguistic ability in right-handers with familial left-handedness is more diffuse than for right-handers with no familial left-handedness. Other facts connect this to a current hormonal theory of the ontogenesis of hemispheric asymmetries.
Computational Demand and Resources in Aphasia
It is sometimes claimed that interactive-activation models are too powerful, and that it is difficult to constrain them adequately. I illustrate this problem by showing that the basic interactive-activation architecture has several different possible sources for effects of spelling-to-sound regularity on word naming. I then show how data can constrain the architecture. New data lead to a rather different and more constrained version of the interactive-activation model to account for spelling-to-sound conversion. Analysis of the errors made by patients suffering from acquired surface dyslexia confirms the predictions of the constrained model. It is concluded that the traditional interactive activation framework must be considerably constrained to account for normal and disturbed word naming.
On The Role of Time In Reader-Based Text Comprehension
Information-processing models for comprehension typically regard text as the depository of a single determinate meaning, placed in it by the writer. Conversely, a reader-based approach views meaning as constituted by the interactions between an individual and a text. From a computational standpoint, reader-based understanding suggests abandoning models which depend on a priori rules of interpretation and limiting the design of an algorithm to the quantitative aspects of text comprehension. I propose that the perception of subject matter be viewed as a race process where the generation of bridging inferences and expectations is partly controlled by quantitative factors (such as the delay for memory retrieval) which emphasize the invisible but omnipresent role of time during reading.
Causal Reasoning in the Construction of a Propositional Textbase
The goal of this research is to unify two different approaches to the study of text comprehension and recall. The first of these approaches, exemplified by the work of Trabasso and his colleagues (Trabasso & Sperry, 1985; Trabasso & van den Broek, 1985), views comprehension as a problem solving task in which the reader must discover a series of causal links that connect a text's opening to its final outcome. The second approach, typified by Kintsch and van Dijk (1978; van Dijk & Kintsch, 1983), emphasizes the importance of short-term memory as a bottleneck in the comprehension process. We combine these approaches by assuming that the most likely causal antecedent to the next sentence is always held in short-term memory. Free recall data from three texts are presented in support of this assumption.
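The combining assumption is compact enough to state as a sketch; the antecedent-scoring function below is a stand-in we invented, not the authors' model.

    def comprehend(sentences, antecedent_score):
        """Read sentences in order. Short-term memory holds exactly one
        proposition: whichever is scored most likely to be the causal
        antecedent of the next sentence. Each new sentence is bridged
        to the proposition currently in short-term memory."""
        stm = None
        causal_links = []
        for s in sentences:
            if stm is not None:
                causal_links.append((stm, s))   # bridging inference
            candidates = [s] if stm is None else [stm, s]
            stm = max(candidates, key=antecedent_score)
        return causal_links

    story = ["drought struck", "crops failed", "prices rose"]
    # Toy score: recency stands in for causal likelihood.
    print(comprehend(story, story.index))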
Syntax and the accessibility of antecedents in relation to neurophysiological variation
Results of a word-by-word reading experiment argue for a specifically syntactic mechanism (N.B., not a discourse mechanism) that assigns antecedents to pronouns such as he and they, even though such assignments are grammatically optional and likely to be revised in many instances by subsequent discourse processes. These results argue for a modular view of mental architecture along the lines of Fodor (1983). However, this study also draws on certain new proposals concerning possible behaviorally significant variation in the neurophysiological substrates of language processing. Partitioning subjects on certain biological criteria reveals that, while the pattern described above seems to apply to the majority of subjects, there is a large minority that seems to show an importantly different pattern.
Inferring Appropriate Responses in Discourse
This paper discusses how Scenes, declarative representations of the intentional and attentional structure of discourse, facilitate the inference of appropriate responses.
Using Intentional and Attentional Structure for Anaphor Resolution
This paper describes the Scenes knowledge representation that captures the intentional and attentional structure of discourse. Using this information a natural language interface can isolate context and resolve anaphors with focusing heuristics. Further, anaphor resolution can be coordinated with interruptions so that completed digressions are ignored.
Artificial Intelligence and Simulation II
Acquiring Special Case Schemata In Explanation-Based Learning
Much of expertise in problem-solving situations involves rapidly choosing a tightly-constrained schema that is appropriate to the current problem. The paradigm of explanation-based learning is being applied to investigate how an intelligent system can acquire these "appropriately general" schemata. While the motivations for producing these specialized schemata are computational, results reported in the psychological literature are corroborated by a fully-implemented computer model. Acquiring these special case schemata involves combining schemata from two different classes. One class contains domain-independent problem solving schemata, while the other class consists of domain-specific knowledge. By analyzing solutions to sample problems, new domain knowledge is produced that often is not easily usable by the problem-solving schemata. Special case schemata result from constraining these general schemata so that a known problem solving technique is guaranteed to work. This significantly reduces the amount of planning that the problem solver would otherwise need to perform when elaborating the general schema in a new problem-solving situation. The model and an application of it in the domain of classical physics are presented.
Generating Scripts From Memory
A variation of the Raaijmakers and Shiffrin (1981) retrieval model is proposed to account for typical script generation data. In our model, knowledge is represented as an associative network with propositions as the nodes. A control process which utilizes temporal information in these propositions supplements the probabilistic memory retrieval process to produce ordered retrieval of scriptal events. Simulations are reported which provide a good qualitative fit to data collected from subjects in both script generation and free association tasks. These results support a view of memory as an unorganized knowledge base rather than a stable, organized structure.
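One way to realize such a temporally controlled retrieval process in miniature (invented events and strengths; the authors' simulation follows the Raaijmakers and Shiffrin sampling scheme rather than this one-shot approximation):

    import random

    # Restaurant-script propositions: (event, temporal position, strength).
    EVENTS = [("enter", 1, 0.9), ("order", 2, 0.7), ("eat", 3, 0.8),
              ("pay", 4, 0.6), ("leave", 5, 0.9)]

    def generate_script():
        """Sample events with probability given by associative strength,
        then let the temporal control process order the output."""
        recalled = [e for e in EVENTS if random.random() < e[2]]
        recalled.sort(key=lambda e: e[1])  # temporal control process
        return [event for event, _, _ in recalled]

    random.seed(0)
    print(generate_script())

On this view the orderliness of script generation comes from the control process, not from any stable organization in the underlying memory.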
Spontaneous Retrieval in a Conceptual Information System
A traditional paradigm for retrieval from a conceptual knowledge base is to gather up indices or features used to discriminate among or locate items in memory, and then perform a retrieval operation to obtain matching items. These items may then be evaluated for their degree of match against the input. This type of approach to retrieval has some problems. It requires one to look explicitly for items in memory whenever the possibility exists that there might be something of interest there. Also, this approach does not easily tolerate discrepancies or omissions in the input features or indices. In a question-answering system, a user may make incorrect assumptions about the contents of the knowledge base. This makes a tolerant retrieval method even more necessary. An alternative, two-stage model of conceptual information retrieval is proposed. The first stage is a spontaneous retrieval that operates by a simple marker-passing scheme. It is spontaneous because items are retrieved as a by-product of the input understanding process. The second stage is a graph matching process that filters or evaluates items retrieved by the first stage. This scheme has been implemented in the SCISOR information retrieval system. It is successful in overcoming problems of retrieval failure due to omitted indices, and also facilitates the construction of appropriate responses to a broader range of inputs.
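In miniature, the two stages might look as follows; marker passing is reduced here to shared-concept indexing and the graph matcher to an overlap score, which are our simplifications rather than SCISOR's mechanisms.

    # Stage 1: spontaneous retrieval. Items touched by markers emanating
    # from input concepts surface as a by-product of understanding.
    MEMORY = {
        "takeover-1": {"acme", "buyout", "stock"},
        "merger-2":   {"acme", "merger"},
        "strike-3":   {"union", "strike"},
    }

    def spontaneous_retrieval(input_concepts):
        return [item for item, cs in MEMORY.items() if cs & input_concepts]

    # Stage 2: evaluation. A degree-of-match score tolerates omitted or
    # discrepant input features instead of demanding an exact index match.
    def evaluate(item, input_concepts):
        cs = MEMORY[item]
        return len(cs & input_concepts) / len(cs | input_concepts)

    query = {"acme", "stock"}
    hits = spontaneous_retrieval(query)
    print(sorted(hits, key=lambda item: -evaluate(item, query)))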
A Model of Schema Selection Using Marker Passing and Connectionist Spreading Activation
Schema selection involves determining which pre-stored schema best matches the current input. Traditional serial approaches utilize a match/predict cycle which is heavily dependent upon backtracking. This paper presents a parallel interactive model of schema selection called SAMPAN which is more flexible and adaptive. SAMPAN is a hybrid system that combines marker passing with connectionist spreading activation to provide a highly malleable and general representation for schema selection. This work is motivated by recent success in connectionist schema representations and in natural language marker passing systems. A connectionist schema representation provides many attractive features over traditional schema representations. However, a pure connectionist representation lacks generality; new propositions cannot easily be represented. SAMPAN gets around this problem by using marker passing to perform variable binding on generalized concepts. The SAMPAN system is a constraint satisfaction network with nodes that perform simple pattern matching and input summation. This approach is directly applicable to current schema-based systems.
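The constraint-satisfaction side of such a hybrid can be sketched in a few lines (two rival schemata with invented weights; the marker-passing variable binding is omitted, so this is not SAMPAN itself).

    # Rival schema nodes inhibit each other; evidence feeds each node.
    WEIGHTS = {("restaurant", "theater"): -0.5}
    EVIDENCE = {"restaurant": 0.6, "theater": 0.2}

    def weight(a, b):
        return WEIGHTS.get((a, b), WEIGHTS.get((b, a), 0.0))

    def settle(nodes, steps=50, rate=0.3):
        """Iteratively sum inputs at each node and nudge activations until
        the network settles; the best-matching schema ends up dominant."""
        act = {n: 0.0 for n in nodes}
        for _ in range(steps):
            act = {n: max(0.0, min(1.0, act[n] + rate * (
                       EVIDENCE.get(n, 0.0)
                       + sum(weight(n, m) * act[m] for m in nodes if m != n)
                       - act[n])))
                   for n in nodes}
        return act

    print(settle(["restaurant", "theater"]))  # restaurant dominates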
Individual Differences In Mechanical Ability
People who understand mechanical systems can infer the principles of operation of an unfamiliar device from their knowledge of the device's components and their mechanical interactions. Individuals vary in their ability to make this type of inference. This paper describes studies of performance in psychometric tests of mechanical ability. Based on subjects' retrospective protocols and response patterns, it was possible to identify rules of mechanical reasoning which accounted for the performance of subjects who differ in mechanical ability. The rules are explicitly stated in a simulation model which demonstrates the sufficiency of the rules by producing the kinds of responses observed in the subjects. Three factors are proposed as the sources of individual differences in mechanical ability: (1) ability to correctly identify which attributes of a system are relevant to its mechanical function, (2) ability to use rules consistently, and (3) ability to quantitatively combine information about two or more relevant attributes.
Dimensionality-Reduction and Constraint in Later Vision
A computational tool is presented for maintaining and accessing knowledge of certain types of constraint in data: when data samples in an n-dimensional feature space are all constrained to lie on an m-dimensional surface, m < n, they can be encoded more concisely and economically in terms of location on the m-dimensional surface than in terms of the n feature coordinates. The recoding of data in this way is called dimensionality-reduction. Dimensionality-reduction may prove a useful computational tool relevant to later visual processing. Examples are presented from shape analysis.
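For example, 3-D samples confined to a plane (n = 3, m = 2) can be re-encoded with two surface coordinates; this generic principal-components illustration conveys the idea but is not the paper's algorithm.

    import numpy as np

    rng = np.random.default_rng(0)
    xy = rng.normal(size=(100, 2))
    data = np.column_stack([xy, xy.sum(axis=1)])  # plane z = x + y in 3-D

    centered = data - data.mean(axis=0)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    print(np.round(s, 6))         # third singular value ~ 0: two dims suffice
    codes = centered @ vt[:2].T   # concise 2-coordinate encoding per sample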
Poster Presentations
Some Causal Models are Deeper than Others
The effort within AI to improve the robustness of expert systems has led to increasing interest in "deep" reasoning, which is representing and reasoning about the knowledge that underlies the compiled knowledge of expert systems. One view is that deep reasoning is the same as causal reasoning. Our aim in this paper is to show that this view is naive, specifically that certain kinds of causal models omit information that is crucial to understanding the causality within a physical situation. Our conclusion is that "deepness" is relative to the phenomena of interest, i.e., whether the representation describes the properties and relationships that mediate interactions among the phenomena and whether the reasoning processes take this information into account.
Applying General Principles to Novel Problems as a Function of Learning History: Learning from Examples vs. Studying General Statements
This research concerns the effect of learning history for a general principle on the ability to apply the principle to novel situations. Adult subjects learned general problem solving principles under three alternative conditions: (a) abstraction of principles from diverse examples, (b) study of explicit general statements of principles, and (c) practice in mapping given statements onto examples. The specific aim of this research was to explore how examples given during learning a general principle affect its application to novel problems which do not share "surface" features with the examples. Results showed that examples did not significantly facilitate application of principles over learning only a given general statement. Moreover, subjects who abstracted principles from examples, although they had abstracted the relevant information, were significantly worse at application than subjects who learned only the general statement or who learned the given statement and examples. These subjects had particular difficulty accessing and selecting the appropriate principle for a problem. Results suggest that the representation of specific information from examples may interfere with efficiency at matching a principle to a novel problem. Whether such interference occurs may depend on the relationship between the principle and its examples in the memory representation. This relationship may be influenced by the way examples are initially encoded.
The Role of Categories in the Generation of Counterfactuals: A Connectionist Interpretation
This paper proposes that a fairly standard connectionist category model can provide a mechanism for the generation of counterfactuals, non-veridical versions of perceived events or objects. A distinction is made between evolved counterfactuals, which generate mental spaces (as proposed by Fauconnier), and fleeting counterfactuals, which do not. This paper explores only the latter in detail. A connection is made with the recently proposed counterfactual theory of Kahneman and Miller; specifically, our model shares with theirs a fundamental rule of counterfactual production based on normality. The relationship between counterfactuals and the psychological constructs of "schema with correction" and "goodness" is examined. A computer simulation in support of our model is included.
Answering Why-Questions: Test of a Psychological Model of Question Answering
We conducted an experimental test of the Graesser and Clark (1985) model of question-answering for why-questions. This model specifies how individuals answer different types of questions by searching through various sources of information after comprehending a text. The sources of information include the passage structure and the generic knowledge structures which are associated with the content words in the query. After these knowledge structures are activated in working memory, search components narrow down a set of relevant answers. A subset of these components were tested in this experiment: (1) an arc search procedure specifying which nodes and arcs within an information source are sampled for answers to a why-question; (2) arc distance, the number of arcs in the representational network that connect the queried node to the answer node; and (3) the intersection of information between the passage structure and the generic knowledge structures associated with the query. After reading short stories, subjects were presented with questions and a number of theoretical answers to each question. Subjects were timed as they judged whether each answer was "Good" (appropriate and relevant) or "Bad" (inappropriate or irrelevant). Results supported the validity of the arc search procedure in that subjects robustly distinguished theoretically good answers from theoretically bad answers to specific questions.
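Arc distance, the second component, is just path length in the representational network; a generic breadth-first computation over a toy graph (ours, not Graesser and Clark's representation) makes it concrete.

    from collections import deque

    # Toy network fragment: node -> directly connected nodes.
    ARCS = {"queried-event": ["goal"], "goal": ["plan"],
            "plan": ["action"], "action": []}

    def arc_distance(start, target):
        """Number of arcs on the shortest path from queried node to answer node."""
        frontier, seen = deque([(start, 0)]), {start}
        while frontier:
            node, d = frontier.popleft()
            if node == target:
                return d
            for nxt in ARCS.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, d + 1))
        return None

    print(arc_distance("queried-event", "action"))  # 3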
What Makes Language Formal?
This paper addresses part of the question "how do we say the same thing in different ways in order to communicate non-literal, pragmatic information?". Since the style of the text can communicate much information (it may be stuffy, slangy, prissy), generators that seek to satisfy pragmatic, hearer-related goals in addition to simple informative ones must have rules that control how and when different styles are used. But what is "style"? In this paper, formal and informal language is analysed to provide stylistic rules that enable a program to produce texts of various levels of formality.
Grammatical priming of nouns in connected speech
On-line processing of inflected spoken words was examined using phoneme-monitoring RT to following targets. Plural and singular nouns followed contexts that required plurals (e.g., A dozen bagels/bagel tumbled...) or were neutral (e.g., The frozen bagels/bagel tumbled...). Relative to the neutral contexts, recognition of congruent plural nouns was facilitated, and recognition of incongruent singular nouns was disrupted.
Integrated Learning of Words and their Underlying Concepts
Models of learning word meanings have generally assumed prior knowledge of the concepts to which the words refer. However, novel natural language text or discourse can often present both unknown concepts and words which refer to these concepts. Also, developmental data suggests that the learning of words and their concepts frequently occurs concurrently instead of concept learning preceding word learning. This paper presents an integrated computational model for acquiring both word meanings and their underlying concepts concurrently. This model is implemented as a word learning component added to the GENESIS explanation-based schema acquisition system for narrative understanding. A detailed example is described in which GENESIS learns provisional definitions for the words "kidnap", "kidnapper", and "ransom" as well as a kidnapping schema from a single narrative.
A Neural Model of Deep Dyslexia
This paper presents a simulation of the selective deficits and the partial breakdown patterns characteristic of the oral reading performance of deep dyslexics. The most striking symptom of deep dyslexia, usually considered its defining characteristic, is the occurrence in oral reading tasks of semantic paralexias: the vocalization of a word semantically related to an isolated, printed target word. The pattern of simulated paralexic errors by the neural model is strongly controlled by the similarity structure of the training set stimuli and, to a lesser extent, the frequency of presentation of stimuli during learning by the model. This result fits well with effects of stimulus type on patterns of paralexic error among deep dyslexics. Further, the model very naturally reproduces the patterns of partial breakdown observed in deep dyslexics, including slow response times (RT) and within-subject variation of response to a particular target word in successive test sessions.
Planning Principles Specific to Mutual Goals
A theory of planning should provide a model of how planning knowledge might be learned and stored in memory so as to be available and utilized in appropriate situations. This paper presents a content theory of the planning strategies and constraints on planning specific to joint planning situations. The categorization helps to explain what information is relevant to this general class of planning problems, namely goal pursuit situations where goals cannot be satisfied without the participation of another planner. The taxonomy of planning principles presented outlines the common problems in mutual goal pursuit situations and provides strategies for resolving the problematic interactions. The principles apply to a variety of types of mutual goal pursuit arrangements, such as business partnerships, political coalitions, or social relationships.
Finding Creative Solutions in Adversarial Impasses
This paper presents a method for generating creative solutions to resolve adversarial impasses that makes use of memory structures based on goal interactions and blame attribution. These knowledge structures are called Situational Assessment Packets (SAPs). SAPs contain general strategies for satisfying multiple conflicting goals either totally or partially. These resolution strategies are evaluated for applicability to a situation by considering interactions of the goals of the problem solver and the goals of the interacting agents. This work is part of the PERSUADER, a computer program that functions as a third-party problem solver (mediator) in hypothetical labor negotiations.