About
The annual meeting of the Cognitive Science Society is aimed at basic and applied cognitive science research. The conference hosts the latest theories and data from the world's best cognitive science researchers. Each year, in addition to submitted papers, researchers are invited to highlight some aspect of cognitive science.
Volume 11, 1989
Paper Presentations
A PDP model of sequence learning that exhibits the power law
This paper examines some characteristics of the learning process in a model of skill learning (Miyata, 1987) in which the execution of sequential actions becomes increasingly efficient as a skill is practiced. The model is a hierarchy of sequential PDP networks designed to model the shift from the slow, serial performance of a novice to the fast, parallel performance of an expert in tasks such as typing. The network develops representations of a set of sequences as it tries to produce the sequences faster. The model was found to yield the power law of learning (Newell and Rosenbloom, 1981). In addition, it exhibited a frequency effect on substitution errors similar to that found in typing (Grudin, 1983).
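The power law of practice cited in this abstract has a simple closed form; the following is a minimal sketch with illustrative coefficients (the paper's fitted values are not given here):

    def practice_time(trial, b=2.0, alpha=0.4):
        # Power law of practice (Newell & Rosenbloom, 1981): the time to
        # perform a task falls as a power function of the trial number.
        # b (time on trial 1) and alpha (learning rate) are placeholders.
        return b * trial ** (-alpha)

    # On log-log axes, log T = log b - alpha * log N is a straight line,
    # which is the signature such learning curves are checked against.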
Structured Representations and Connectionist Models
Recent descriptions of connectionist models have argued that connectionist representations are unstructured, atomic, and bounded (e.g., Fodor & Pylyshyn, 1988). This paper describes results with recurrent networks and distributed representations which contest these claims. Simulation results are described which demonstrate that connectionist networks are able to learn representations which are richly structured and open-ended. These representations make use both of the high-dimensional space described by hidden unit patterns and of trajectories through this space in time, and possess a rich structure which reflects regularities in the input. Specific proposals are advanced which address the type/token distinction, the representation of hierarchical categories in language, and the representation of grammatical structure.
Compositionality and the Explanation of Cognitive Processes
Connectionist approaches to the modeling of cognitive processes have often been attacked on the grounds that they do not employ compositionally structured representations (e.g., Fodor & Pylyshyn, 1988). But what exactly is compositional structure, and how does such structure contribute to cognitive processing? This paper clarifies these questions by distinguishing two different styles of compositionality, one characteristic of connectionist modeling and the other essential to mainstream symbolic or "Classical" approaches. Given this distinction, it is clear that connectionist models can employ compositionally structured representations while remaining, both conceptually and in practice, quite distinct from the Classical approach; moreover, it can be shown that certain central aspects of cognition, such as its systematicity, are at least in principle amenable to connectionist explanation.
Learning from Error
Distributed systems of cognition are receiving increasing attention in a variety of research traditions. A central question is how the specific features of cognitive functions will be affected by their occurrence within a system of cooperative agents. In this paper, we examine the less often considered aspects of the organization of cooperative work settings that can become important in terms of error within a system. Specifically, we examine how response to error in a cooperative task can in some ways benefit future task performance. The goal is to facilitate learning from error so that future errors become less likely. The study involved an analysis of observations of several cooperative teams engaged in coordinated activity for the navigation of a large ship. The analysis of the team members' activities revealed a surprisingly high rate of errors; yet, the final product of the group work showed that the errors had been removed somewhere within the system. Features of the distributed system that facilitated this error removal included the monitoring of others' performance, as constrained by a horizon of observation limiting exposure to particular subtasks; the distribution of knowledge within the team, such that more knowledgeable members were also in a position to detect others' errors; and methods of providing feedback. In particular, specific design tradeoffs were found to underlie the functioning of the system. For example, evaluation depends on utilizing objective knowledge of how the product reconciles with the real world; however, separating evaluation from the system means "wasting" a knowledgeable potential participant. Thus, the distributed system was found to contain certain properties that can be exploited for their utility in error detection, diagnosis, and correction. The results may be applied to the design of such cooperative tasks, including a role for technology, with the goal of designing cooperative systems that can more easily learn from their errors.
A State-Space Model For Prototype Learning
A general state-space model of prototype learning was formulated in terms of a set of internal states and nonlinear input-output mappings. The general model includes several previous models as special cases, such as Hintzman's (1986) multiple trace model, Metcalfe's (1982) holographic model, and two parallel distributed memory models (Knapp & Anderson, 1984; McClelland & Rumelhart, 1985). Two basic properties common to these models were defined in terms of this general model: additivity and time invariance. An experiment was conducted to test the basic properties using random spectral patterns as stimuli, allowing possible nonlinear input and output distortions. In particular, ordinal tests of additivity were performed with few assumptions about internal features that subjects may use to encode the stimulus information. The results support additivity, but time invariance was clearly violated. Implications of these findings for models of the human memory system are discussed.
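The two properties under test can be stated operationally; here is a minimal sketch assuming a generic storage operator M that maps a study list to a memory state (the names and the order-based test are our illustration, not the paper's formulation):

    import numpy as np

    def is_additive(M, x1, x2):
        # Additivity: the trace of a composite study list equals the
        # sum of the traces of its parts.
        return np.allclose(M([x1, x2]), M([x1]) + M([x2]))

    def is_time_invariant(M, x1, x2):
        # Time invariance: an item's contribution does not depend on
        # when it was studied, so presentation order is irrelevant.
        return np.allclose(M([x1, x2]), M([x2, x1]))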
Learning Simple Arithmetic Procedures
Two types of simple recurrent networks (Jordan, 1986; Elman, 1988) were trained and compared on the task of adding two multi-digit numbers. Results showed that: (1) a manipulation of the training environment, called Combined Subset Training (CST), was necessary to learn the large set of patterns used; and (2) if the networks are viewed as learning simple programming constructs such as conditional branches, while-loops, and sequences, then there is a clear way to demonstrate a capacity difference between the two types of networks studied. In particular, we found that there are programs that one type of network can perform that the other cannot. Finally, an analysis of the dynamics of one of the networks is described.
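The architectural difference between the two network types is small but consequential; a minimal sketch of one time step of each (the shapes, nonlinearity, and function names are assumptions for illustration):

    import numpy as np

    def elman_step(x, h_prev, Wx, Wh, Wo):
        # Elman (1988): the previous *hidden* state feeds back.
        h = np.tanh(Wx @ x + Wh @ h_prev)
        return h, Wo @ h

    def jordan_step(x, o_prev, Wx, Ws, Wo):
        # Jordan (1986): the previous *output* feeds back instead.
        h = np.tanh(Wx @ x + Ws @ o_prev)
        return h, Wo @ h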
THIYOS: A Classifier-System Model of Implicit Knowledge of Artificial Grammars
This study develops a computational model based on Holland et al.'s (1986) induction theory to simulate the tacit knowledge of artificial grammars acquired from experience with exemplars of the grammar (e.g., Reber, 1969, 1976). The initial application of this model tests the proposition that the knowledge acquired about an artificial grammar consists of sets of partially valid rules that compete against one another to control response selection. Choices are made and the strength of rules is adjusted based on current levels of strength, specificity, and support among rules having their conditions matched on a particular trial. Verbal instructions generated by two human subjects who developed expertise in discriminating valid from invalid strings through extensive practice on a multiple-choice string discrimination task served as inputs to the simulation model. Results show that the sets of rules verbalized by subjects can be represented as sets of condition-action rules. Further, these rules can compete against each other to select valid choices on the string discrimination task as described in the Holland et al. model, resulting in a level of performance very similar to that of human yoked subjects who attempted to use the rules provided by the original subjects. Finally, when the rules are automatically tuned by an optimization algorithm using feedback about correctness of choices, performance of the simulation approaches the level of the original subject. It is concluded that a considerable portion of implicit knowledge that is not verbalized to yoked partners consists of the relative strengths of competing rules.
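A minimal sketch of the rule competition just described, using a generic Holland-style bid and strength update (the abstract does not give THIYOS's exact equations, so the bid function and learning rate are assumptions):

    def bid(rule):
        # Matched rules compete on strength, specificity, and support.
        return rule["strength"] * rule["specificity"] * rule["support"]

    def choose_and_adjust(matched_rules, correct_action, lr=0.1):
        winner = max(matched_rules, key=bid)
        payoff = 1.0 if winner["action"] == correct_action else -1.0
        winner["strength"] += lr * payoff   # feedback tunes strength
        return winner["action"]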
Lexical Conceptual Structure and Generation in Machine Translation
This paper introduces an implemented scheme for generating target-language sentences using a compositional representation of meaning called lexical conceptual structure. Lexical conceptual structure facilitates two crucial operations associated with generation: lexical selection and syntactic realization. The compositional nature of the representation is particularly valuable for these two operations when semantically equivalent source- and target-language words and phrases are structurally or thematically divergent. For example, the English verb to stab may be translated as the composite Spanish form dar cuchilladas a (literally, to knife or to give knife-wounds to). To determine the correct lexical items and syntactic realization associated with the surface form in such cases, the underlying lexical-semantic forms are systematically mapped to the target-language syntactic structures. The model described constitutes a lexical-semantic extension to UNITRAN, a syntactic-based translation system that is bidirectional between Spanish and English.
Robust Lexical Selection in Parsing and Generation
A well-known difference between human language understanding and typical computational theories of language understanding is in the degree to which they handle partial or errorful input: computational models are comparatively brittle in the face of input which deviates from the norm. In language generation there is an analogous problem, that of selecting an appropriate lexical entry when there is none in memory which matches the pragmatic/semantic input to generation. This paper presents a localized connectionist model of robust lexical selection for both language understanding and generation. Processing takes the form of pattern completion, where patterns consist of complexes of semantic, morphosyntactic, and pragmatic features. The system is presented with portions of such patterns and retrieves others. In generation the given information is pragmatic/semantic, and in understanding mainly morphosyntactic. This approach is not only a natural way of accommodating both understanding and generation, but it also fosters the robustness that is characteristic of human language processors.
Causal/Temporal Connectives: Syntax and Lexicon
This paper elucidates the linguistic representation of temporal relations among events. It does so by examining sentences that contain two clauses connected by temporal/causal connectives: words like once, by the time, when, and before. Specifically, the data involve the effect of the tenses of the connected clauses on the acceptability of sentences. For example, Rachel disappeared once Jon had fallen asleep is fine, but * Rachel had disappeared once Jon fell asleep is unacceptable. First a theory of acceptability is developed, then its implications for interpretation are discussed. The strategy employed is to factor linguistic knowledge into a general, syntactic component and a lexical component dependent on the properties of individual connectives. Once the syntactic and lexical components have been teased apart, the problem of interpretation becomes clearer. Finally, a computer model of the theory, which serves as a workbench and confirms the theory's behavior, is demonstrated.
The Frame of Reference Problem in Cognitive Modeling
Since at least the mid-70's there has been widespread agreement among cognitive science researchers that models of a problem-solving agent should incorporate its knowledge about the world and an inference procedure for interpreting this knowledge to construct plans and take actions. Research questions have focused on how knowledge is represented in computer programs and how such cognitive models can be verified in psychological experiments. We are now experiencing increasing confusion and misunderstanding as different critiques are leveled against this methodology and new jargon is introduced (e.g., "not rules," "ready-to-hand," "background," "situated," "subsymbolic"). Such divergent approaches put a premium on improving our understanding of past modeling methods, allowing us to more sharply contrast proposed alternatives. This paper compares and synthesizes new robotic research that is founded on the idea that knowledge does not consist of objective representations of the world. This research develops a new view of planning that distinguishes between a robot designer's ontological preconceptions, the dynamics of a robot's interaction with an environment, and an observer's descriptive theories of patterns in the robot's behavior. These frame-of-reference problems are illustrated here and unified by a new framework for describing cognitive models.
The Many Uses of 'Belief' in AI
Within AI and the cognitively related disciplines, there exist a multiplicity of uses of 'belief'. On the face of it, these differing uses reflect differing views about the nature of an objective phenomenon called belief. In this paper I distinguish six distinct ways in which 'belief' is used in AI. I shall argue that not all these uses reflect a difference of opinion about an objective feature of reality. Rather, in some cases, the differing uses reflect differing concerns with special AI applications. In other cases, however, genuine differences exist about the nature of what we pre-theoretically call belief. To an extent, the multiplicity of opinions about, and uses of, 'belief' echoes the discrepant motivations of AI researchers. The relevance of this discussion for cognitive science arises from the fact that (a) many regard theoretical research within AI as a branch of cognitive science, and (b) even if theoretical AI is not cognitive science, trends within AI influence theories developed within cognitive science. It should be beneficial, therefore, to unravel the distinct uses and motivations surrounding 'belief', in order to discover which usages merely reflect differing pragmatic concerns, and which usages genuinely reflect divergent views about reality.
Using View Types to Generate Explanations in Intelligent Tutoring Systems
Providing coherent explanations of domain knowledge is essential for a fully functioning Intelligent Tutoring System (ITS). Current ITSs that generate explanations directly from domain knowledge offer limited applicability because they place restrictions on the form and extent of the domain knowledge. Moreover, generating explanations in tutors that are designed to teach the breadth of foundational knowledge conveyed in most introductory college courses poses special problems. These problems arise because this knowledge is complex and contains multiple, highly integrated viewpoints. To overcome these problems, we propose a method for selecting only the knowledge that is relevant for generating a coherent explanation from a desired viewpoint. This method uses domain-independent knowledge in the form of view types to select the appropriate knowledge.
Relations Relating Relations
The aim of the current work is to incorporate structural information in judgments of similarity. According to the assumption of feature independence, how one feature affects similarity is independent of the values of the other features present. We present three violations of this assumption, all arising from the influence of relations between features and of relations between relations. A shared relation is more important for similarity judgments if it co-occurs with (A) relations that augment the first relation by "pointing in the same direction" as the first relation, (B) relations which are themselves salient, and (C) salient relations that involve the same objects as the first relation. We interpret these results as suggesting that relations do not have separately determined weights or saliences; the weight of a relation depends on the relational structure in which it exists. Relations influence each other by creating higher-order relational structures, and also by affecting processing.
Integrating Generalizations with Exemplar-Based Reasoning
Knowledge represented as generalizations is insufficient for problem solving in many domains, such as legal reasoning, because of a gap between the language of case descriptions and the language in which generalizations are expressed, and because of the graded structure of domain categories. Exemplar-based representation addresses these problems, but accurate assessment of similarity between an exemplar of a category and a new case requires reasoning both with general domain theory and with the explanation of the exemplar's membership in the category. GREBE is a system that integrates generalizations and exemplars in a cooperative manner. Exemplar-based explanations are used to bridge the gap between case descriptions and generalizations, and domain theory in the form of general rules and specific explanations is used to explain the equivalence of new cases to exemplars.
Combining Explanation Types for Learning by Understanding Instructional Examples
Learning from instruction is a powerful technique for improving problem solving. It is most effective when there is cooperation between the instructor and the student. In one cooperative scenario, the instructor presents examples and partial explanations of them, based on the perceived needs of the student. An active student will predict the instructor's actions and then try to explain the differences from the predictions. This focuses the learning, making it more efficient. We expand the concept of explanation beyond the provably correct explanations of explanation-based learning to include other methods of explanation used by human students. The explanations can use deductions from causal domain knowledge, plausible inferences from the instructor's actions, previous cases of problem solving, and induction. They involve the goal being pursued and the action taken in support of the goal. The explanations result in improved diagnosis and improved future explanation. This combination of explanation techniques leads to more opportunities to learn. We present examples of these ideas from the system we have implemented in the domain of automobile diagnosis.
Selecting The Best Case For A Case-Based Reasoner
The most important support process a case-based reasoner needs is a memory for cases. Among its functions, the memory for cases must be able to select out the most appropriate cases for the case-based reasoner to use at any time. In this paper, we present the selection processes implemented in PARADYME, a case memory designed to work alongside a case-based reasoner. PARADYME has a two-step retrieval process. In the first step, it retrieves the set of partial matches from the memory. In the second, it selects out a small set of "best" matches. PARADYME chooses "best" cases using a set of six preference heuristics: goal-directed preference, salience, specificity, frequency, recency, and ease of adaptation. PARADYME is novel in two ways. Its use of preferences for choosing a best case means that its principles act as selectors rather than restrictors. And its emphasis in choosing best cases is on usefulness rather than similarity.
The Function of Examples in Learning a Second Language From an Instructional Text
This paper addresses the role that examples play in instructional learning. We discuss several roles examples can serve when they complement an instruction. We provide functional evidence for some of these roles, arguing why instructions and examples are both necessary for efficient learning. We present a system that learns from instructions which are enhanced by examples. The system, ANT (Acquisition using Native-language Transfer), learns a second language by reading instructions about grammatical rules of the second language as well as examples which use these rules. Finally, we argue for the functional utility of examples in instructional learning on more general grounds, showing how such a strategy can be applicable to other domains besides second language learning.
Token Frequency and Phonological Predictability in a Pattern Association Network: Implications for Child Language Acquisition
The degree to which the behavior of PDP models of pattern association (Rumelhart & McClelland, 1986, 1987) approximates children's acquisition of inflectional morphology has recently been highlighted in discussions of the applicability of PDP to the study of human cognition and language (Pinker & Mehler, 1988). In this paper, we attempt to eliminate many of the limitations of the R&M model, adopting an empirical approach to the analysis of learning (hit rate and error type) in two sets of simulations in which vocabulary structure (token frequency) and the presence of phonological subregularities are manipulated. A three-layer backpropagation network is used to implement a pattern association task with strings that are analogous to four types of present and past tense English verbs. We overview the resulting "competitions" when strings are randomly assigned to verb classes, in particular the conditions under which different overgeneralization errors (both "pure" and "blended") are produced. In a second set of simulations, identical type and token frequencies are used, but strings are assigned to the identity and vowel-change classes on the basis of the phonological shape of the stem. Phonological cues are exploited by the system, leading to overall improved performance. However, overgeneralizations continue to be observed in similar conditions. Token frequency works together with phonological subregularities to determine patterns of learning, including the conditions under which "rule-like" behavior will and will not emerge. The results are discussed with reference to behavioral data on children's acquisition of the English past tense.
Towards a Connectionist Phonology: The "Many Maps" Approach to Sequence Manipulation
Lakoff's new theory of cognitive phonology appears to be free of the rule ordering constraints that make generative rules computationally awkward. It uses a multilevel representation for utterances, to which multiple rules may apply in parallel. This paper presents the first implementation of Lakoff's proposal, based on a novel "many maps" architecture. The architecture may also explain certain constraints on phonological rules that are not adequately accounted for by more abstract models.
A Connectionist Model of Form-Related Priming Effects
In contrast to the results of many previous studies, Colombo (1986) has demonstrated that form-related priming is sometimes inhibitory. Colombo proposed that inhibition reflects the suppression of lexical items orthographically related to the prime. We suggest, however, that form-related inhibition arises as a result of competition between discrepant prime-target phonemes. During the phonological encoding of the target word, active phonemes from the prime might be mistakenly selected, causing a delay in responding. We present a connectionist model that implements this account and simulates the empirical data. The model is supported by the results of an experiment that distinguishes between the lexical suppression and phonological competition views.
Figurative Adjective-Noun Interpretation in a Structured Connectionist Network
Non-literal use of an adjective, whether signalled by a category error or by a value expectation violation, invokes the connotations or immediate inferences associated with that adjective in various noun contexts. Immediate inferences reflect the structure of stored knowledge, as they are available too quickly and effortlessly to involve any complex form of information retrieval. Specifically, they suggest the use of the spreading activation model of semantic memory. The relations between the inferences invoked by the adjective and salient features of the noun employed in the figurative usage are exploited by the DIFICIL connectionist inferencing system to interpret the meaning of an unfamiliar adjective-noun phrase.
Competition for Evidential Support
In order to accept a hypothesis on the grounds that it is the best explanation of the evidence, one must know what other hypotheses compete for evidential support with the first hypothesis. But the principles for determining when hypotheses compete are obscure and represent a currently unsolved problem for this form of inference. Competing hypotheses need not contradict each other. A defender of inference to the best explanation as a distinctive form of inference will not want to identify competing hypotheses with hypotheses that are jointly highly improbable. Relying on probabilities to solve this problem would be to put the cart before the horse, since the idea behind taking inference to the best explanation to be a distinctive form of inference is that we use inference to the best explanation to determine probabilities, not the reverse. Furthermore, it does not work to rely on a failure by the "hypothesis generator" to generate anything but competing hypotheses, because that just pushes the problem over to the hypothesis generator. Anyway, noncompeting hypotheses often have to be considered and therefore have to be generated. Whether there is competition between two hypotheses seems to depend at least in part on whether one might be used to "fill out" the other without leading to a major change in the explanation. But it remains unclear how to distinguish "filling out" an explanation from changing it.
Managing Uncertainty in Rule-based Reasoning
There are two major problems associated with the propagation of uncertainty in the rule-based modeling of human reasoning. One concerns how the possibly uncertain evidence in a rule's antecedents affects the rule's conclusion. The other concerns the issue of combining evidence across rules having the same conclusion. Two experiments were conducted in which psychological data were compared with a variety of mathematical models for managing uncertainty. Results of an experiment on the first problem suggested that the certainty of the antecedents in a production rule can be summarized by the maximum of disjunctively connected antecedents and the minimum of conjunctively connected antecedents (maximin summarizing), and that the maximum certainty of the rule's conclusion can be scaled down by multiplication with the results of that summary (multiplication scaling). A second experiment suggested that the second problem can be solved with Heckerman's modified certainty factor model, which sums the certainties contributed by each of two rules and divides by 1 plus their product.
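A minimal sketch of the three operations exactly as the abstract states them (the function names are ours):

    def summarize_antecedents(disjuncts):
        # Maximin summarizing: minimum over conjunctively connected
        # antecedents, maximum over disjunctively connected groups.
        # `disjuncts` is a list of groups of conjoined certainties.
        return max(min(group) for group in disjuncts)

    def scale_conclusion(rule_max_certainty, summary):
        # Multiplication scaling: the rule's maximum conclusion
        # certainty is scaled down by the antecedent summary.
        return rule_max_certainty * summary

    def combine_rules(cf1, cf2):
        # Heckerman's modified certainty-factor combination, as
        # described above: the sum divided by 1 plus the product.
        return (cf1 + cf2) / (1 + cf1 * cf2)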
Explorations in the Contributors to Plausibility
In previous work, we identified a method for automatically deriving possible rules of plausible inference from a set of relations, and determined that the transitivity of underlying characteristics of the relations was a significant factor in predicting the plausibility of inferences generated from these rules. Recent work by other researchers has also focused on identifying these kinds of characteristics and examining their role in the ability to predict plausibility. We examine these sets of characteristics and conclude that those factors that preserve transitivity provide most of the power of these systems. We then show how inferences can be used to determine the intended semantics, and thus the appropriate set of representational features, of a relation.
A Theory of the Aspectual Progressive
The progressive construction in English has an unusually wide range of uses. In this paper, I propose a new theory of what is probably the most important use of the progressive: the aspectual progressive. It is the aspectual progressive which is being contrasted with the simple, i.e., nonprogressive, construction in the nonhabitual interpretation of such pairs of sentences as John was running at three o'clock versus John ran at three o'clock. The proposed theory is based on a particular analysis of the conceptualizations of events and situations commonly called the aspectual classes, and is able to account for the temporal properties of the progressive, for the "imperfective paradox" problem, and for the range of applicability of the aspectual progressive.
Default Values in Verb Frames: Cognitive Biases for Learning Verb Meanings
Two experiments investigated children's and adults' initial mapping of verb meanings. In Experiment 1, subjects were asked to use a newly learned verb to label events in which the instrument, action, or result was different from the events used to teach the verbs. All subjects showed a bias to interpret the result as the most important component of the novel verbs' meanings, and this bias increased with age. In Experiment 2, either the instrument, action, or result of events was varied during training to test subjects' ability to override their default biases. When results were varied in training, 5-year-olds and adults, but not 3-year-olds, were more likely to use the novel verb to label an event in which the result was changed again. When results were varied in training, all subjects were less likely to use the novel verb to label an event in which the action was changed. These findings suggest that there is a default rule hierarchy for learning novel verbs, and that both default rules and the ability to override these rules when presented with conflicting information about the meaning of a verb are still developing during preschool.
Generating Temporal Expressions in Natural Language
We explore the problem of generating natural language temporal expressions and show that it is amenable to solution by the application of hierarchical constraint propagation. Constraints derived either directly or indirectly (via transformations) from client data are propagated over the hierarchical structure provided by syntactic templates and are required to be consistent at any given node. Multiple sources of constraints must be used to achieve lexical selection of a single item.
The Role of Abstraction in Place Vocabularies
Understanding mechanical behavior is an important part of both commonsense and expert reasoning and involves extensive spatial knowledge. A key problem in qualitative spatial reasoning is finding the right level of detail to support the differing needs of reasoning methods. For example, analysis of failures may involve describing every surface imperfection; but, to gain an initial understanding of device behavior, one needs to eliminate extraneous information. People seem very good at varying their level of resolution to meet the needs of the activity being described, but in machine understanding this has proven to be a dilemma. In order to understand an artifact one needs to impose some level of abstraction, yet to obtain a sufficient level of abstraction, without omitting critical details, one needs to understand the artifact. Our solution simplifies descriptions of mechanical devices using quantitative information about qualitatively significant regions. A configuration space representation of the kinematic pairs of a mechanism serves as the underlying metric diagram to answer questions concerning contact between the components and provides the foundation for construction of a purely symbolic device description, the place vocabulary. We explore the effect of abstracting the configuration space on condensation of the place vocabulary, showing how it makes qualitative reasoning about complex mechanisms, such as the mechanical clock, tractable. Examples shown are based on an implementation.
Cognitive Efficiency Considerations for Good Graphic Design
Larkin and Simon's (1987) analysis of how graphical representations support task performance is applied to designing graphical displays that streamline information-processing tasks. Theoretically, this streamlining is done by designing external data structures that (a) allow users to substitute less effortful visual operators for more effortful logical operators, and (b) reduce search for needed information. A design program called BOZ is used to produce four alternative displays of airline schedule information to support a task of making airline reservations. We postulate several procedures that use visual operators to perform the task using the different graphics. The number of times each operator is executed provides one measure of task difficulty (for a procedure and graphic). A second measure is the difficulty of executing each operator. Seven subjects performed the airline reservation task using each of the four graphics. Response times for the different graphics differ by a factor of two, which is statistically highly significant. Detailed data analyses suggest that these differences arise through substitution of visual operators for logical ones and through the use of visual cues that help reduce search. These analyses provide quantitative estimates of the time saved through operator substitutions.
A Process Model of Experience-Based Design
Human designers use previous designs extensively in the process of producing a new design. When they come up with a partial or complete design, designers perform a mental simulation to verify the design. We present a model for engineering design that integrates case-based reasoning and qualitative simulation. The model involves: (1) setting up the functional requirements, (2) accessing memory to retrieve cases relevant to the requirements, (3) synthesizing pieces of cases into designs, (4) verifying and testing the design, and finally, (5) debugging. This process is applied recursively until the design is complete and bug-free. The model integrates different levels of representation and reasoning mechanisms in order to effectively support the design tasks.
Cognition in Design Process
The purpose of this research is to study the cognitive process in architectural design problem solving. It also explores a cognitive structure (model) capable of representing the problem solver's cognitive behavior. The goal plan, schemata, perceptual-test, and generate-and-test are regarded as cognitive mechanisms that evolved in the problem solving process. They were observed in an experiment in which an experienced architectural designer was asked to do a residential design. Results from protocol analysis showed that an invariant cognitive structure could be built upon these cognitive mechanisms to explain the problem solving behavior. This cognitive structure (model) also provides a framework for future simulation.
Evaluation of Suggestions during Automated Negotiations
An automated agent that has to act in a multi-agent environment needs the capability to negotiate. In this paper we concentrate on problems that arise while evaluating suggestions during negotiations. We distinguish between different kinds of suggestions and present methods and techniques for evaluating them. The suggestions are written using a formal Negotiation Language that we have developed. We show how our approach was successfully implemented in a specific environment: the Diplomacy game. As in other board games, playing Diplomacy involves a certain amount of technical skill, but the capacity to negotiate, explain, convince, promise, keep promises or choose not to keep them, is an essential ingredient of good play. Diplomat was evaluated and consistently played better than experienced players, and in the games that were held, many players did not guess which player Diplomat was playing.
Composite Holographic Associative Recall Model (CHARM) and Blended Memories in Eyewitness Testimony
The idea that compositing or blending may occur in human episodic memory stems from two sources: (1) distributed models of human memory, and (2) studies that have focused on the distortions and mistakes that occur in eyewitness testimony. In this paper, data that have been uncovered within the eyewitness testimony paradigm are simulated by a distributed memory model, CHARM (composite holographic associative recall memory). Studies done by Loftus have been interpreted as indicating that blending does occur; modifications of these experiments conducted by McCloskey and Zaragoza have been claimed to refute Loftus' interpretation. It is shown that both of these results are predicted by the composite-trace model.
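A minimal sketch of the storage scheme used by holographic models of this kind, in which item pairs are associated by convolution, summed into one composite trace, and retrieved by correlation; the blending follows from the summation (the vector size and items are illustrative):

    import numpy as np

    def convolve(a, b):
        # Association by circular convolution.
        n = len(a)
        return np.array([sum(a[j] * b[(i - j) % n] for j in range(n))
                         for i in range(n)])

    def correlate(cue, trace):
        # Retrieval by circular correlation, the approximate inverse.
        n = len(cue)
        return np.array([sum(cue[j] * trace[(i + j) % n] for j in range(n))
                         for i in range(n)])

    rng = np.random.default_rng(0)
    a, b, c, d = (rng.normal(0, 1 / np.sqrt(64), 64) for _ in range(4))
    memory = convolve(a, b) + convolve(c, d)  # one composite trace
    retrieved = correlate(a, memory)          # a noisy b: the noise is
                                              # where blending comes from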
A Two-stage Categorization Model of Family Resemblance Sorting
A two-stage model is applied to category construction. The first stage of the model involves looking for a defining feature among exemplars and creating initial categories based on the defining features. In the second stage, overall similarity is calculated to categorize the remaining exemplars that were not classified by the defining feature. For some types of exemplar structures, family resemblance sorting emerges as a product of the two-stage model. A series of experiments was carried out to contrast the two-stage model with Anderson's induction model (Anderson, 1988) and CLUSTER/2 (Michalski & Stepp, 1983). The results showed that the two-stage model is a better predictor of when family resemblance sorting will or will not occur.
A Configural-Cue Network Model of Animal and Human Associative Learning
We test a configural-cue network model of human classification and recognition learning based on Rescorla and Wagner's (1972) model of classical conditioning. The model extends the stimulus representation assumptions from our earlier one-layer network model (Gluck & Bower, 1988b) to include pair-wise conjunctions of features as unique cues. Like the exemplar context model of Medin & Schaffer (1978), the representational assumptions of the configural-cue network model embody an implicit exponential decay relationship between stimulus similarity and psychological (Hamming) distance, a relationship which has received substantial independent empirical and theoretical support (Shepard, 1957, 1987). In addition to results from animal learning, the model accounts for several aspects of complex human category learning, including the relationship between category similarity and linear separability in determining classification difficulty (Medin & Schwanenflugel, 1981), the relationship between classification and recognition memory for instances (Hayes-Roth & Hayes-Roth, 1977), and the impact of correlated attributes on classification (Medin, Altom, Edelson, & Freko, 1982).
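A minimal sketch of the stimulus recoding and learning rule described above (the pairwise-conjunction coding is as stated in the abstract; the dictionary representation and learning rate are our assumptions):

    from itertools import combinations

    def configural_cues(features):
        # e.g. ('red', 'square', 'large') -> the three single features
        # plus the three pairwise conjunctions, each a unique cue.
        singles = [(f,) for f in features]
        pairs = list(combinations(features, 2))
        return singles + pairs

    def rw_update(weights, cues, outcome, lr=0.05):
        # Rescorla-Wagner / delta rule over the configural-cue code.
        prediction = sum(weights.get(c, 0.0) for c in cues)
        error = outcome - prediction
        for c in cues:
            weights[c] = weights.get(c, 0.0) + lr * error
        return weights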
Induction of Continuous Stimulus-Response Relations
The present research investigates the mental processes involved in inducing continuous stimulus-response relations. A simple perceptual-motor learning task was used in which subjects learned to produce a continuous variable (response duration) accurately for values chosen from another continuous dimension (stimulus length). Subjects were trained on several "practice" pairs, for which they received feedback about the correct responses. Trials involving practice pairs were intermixed with trials involving "transfer" pairs, for which no feedback was given. The correct responses and stimuli were related by simple mathematical functions: a power function (Experiment 1), a logarithmic function (Experiment 2), and a linear function with a positive intercept (Experiment 3). Experiment 1 demonstrated that people can learn a power function rapidly and use it to perform as well for transfer pairs as for practice pairs. Experiments 2 and 3 revealed a systematic pattern of bias during early learning, consistent with the hypothesis that people have a predisposition toward inducing a power function. However, the biases decreased in magnitude with practice. We propose an account of the induction of continuous stimulus-response relations called the "adaptive-regression" model. According to it, people are initially biased to induce a power function, but the bias is gradually weakened through experience, so that other stimulus-response relations can be learned with sufficient practice. The present results support the adaptive-regression model.
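For concreteness, the three function families used across the experiments have these forms (the coefficients are placeholders, not the paper's values):

    import math

    def power_fn(length, a=1.0, b=0.7):
        return a * length ** b            # Experiment 1

    def log_fn(length, a=1.0, b=1.0):
        return a + b * math.log(length)   # Experiment 2

    def linear_fn(length, a=0.5, b=1.0):
        return a + b * length             # Experiment 3: positive intercept a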
Structural Evaluation of Analogies: What Counts?
Judgments of similarity and soundness are important aspects of human analogical processing. This paper explores how these judgments can be modeled using SME, a simulation of Gentner's structure-mapping theory. We focus on structural evaluation, explicating several principles which psychologically plausible algorithms should follow. We introduce the Specificity Conjecture, which claims that naturalistic representations include a preponderance of appearance and low-order information. We demonstrate via computational experiments that this conjecture affects how structural evaluation should be performed, including the choice of normalization technique and how the systematicity preference is implemented.
Structural Representations of Music Performance
A primary goal of music cognition is to understand the mental representations for musical knowledge that allow communication of thoughts and emotions. Theories of musical competence generally model mental representations in terms of structure given in the musical text, and do not model performers' preferential choices of structural content for emphasis. Such choices are an important component of musical interpretation. Two sources of converging evidence are described that support the role of phrases as structures in mental representations for music performance: evidence from expressive timing in skilled performance and from performance breakdowns (errors). The location and amount of expressive timing, and the likelihoods of different error types, coincided with musicians' notated interpretations. Evidence from both ideal and non-ideal musical behavior implicates the same structures in the representation of musical knowledge, and suggests that individual preferences can explain much variation in music performance.
A Logic for Emotions: a basis for reasoning about commonsense psychological knowledge
There is a body of commonsense knowledge about human psychology that we all draw upon in everyday life to interpret our own actions and those of the people around us. In this paper, we define a logic in which this knowledge can be expressed. We focus on a cluster of emotions, including approval, disapproval, guilt, and anger, most of which involve some sort of ethical evaluation of the action that triggers them. As a result, we are able to draw on well-studied concepts from deontic logic, such as obligation, prohibition, and permission. We formalize a portion of commonsense psychology and show how a simple problem can be solved using our logic.
Extracting Visual Information from Text: Using Captions to Label Human Faces in Newspaper Photographs
There are many situations where linguistic and pictorial data are jointly presented to communicate information. A computer model for synthesising information from the two sources requires an initial interpretation of both the text and the picture, followed by consolidation of information. The problem of performing general-purpose vision (without a priori knowledge) would make this a nearly impossible task. However, in some situations, the text describes salient aspects of the picture. In such situations, it is possible to extract visual information from the text, resulting in a relational graph describing the structure of the accompanying picture. This graph can then be used by a computer vision system to guide the interpretation of the picture. This paper discusses an application whereby information obtained from parsing the caption of a newspaper photograph is used to identify human faces in the photograph. Heuristics are described for extracting information from the caption which contributes to the hypothesised structure of the picture. The top-down processing of the image using this information is discussed.
Head-Driven Massively-Parallel Constraint Propagation
We describe a model of natural language understanding based on Head-driven Massively-parallel Constraint Propagation (HMCP). This model contains a massively parallel memory network in which syntactic head features are propagated along with other information concerning the nodes that triggered the propagation. The propagated head features eventually collide with subcategorization lists which contain constraints on subcategorized arguments. These mechanisms handle linguistic phenomena such as case, agreement, complement order, and control, which are fundamental to linguistic analysis but have not been captured in previous marker-passing models.
Virtual Memories and Massive Generalization in Connectionist Combinatorial Learning
We report a series of experiments on connectionist learning that addresses a particularly pressing set of objections to the plausibility of connectionist learning as a model of human learning. Connectionist models have typically suffered from rather severe problems of inadequate generalization (where generalizations are significantly fewer than training inputs) and interference of newly learned items with previously learned items. Taking a cue from the domains in which human learning dramatically overcomes such problems, we see that indeed connectionist learning can escape these problems in combinatorially structured domains. In the simple combinatorial domain of letter sequences, we find that a basic connectionist learning model trained on 50 six-letter sequences can correctly generalize to about 10,000 novel sequences. We also discover that the model exhibits over 1,000,000 virtual memories: new items which, although not correctly generalized, can be learned in a few presentations while leaving performance on the previously learned items intact. We conclude that connectionist learning is not as harmful to the empiricist position as previously reported experiments might suggest.
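To put those counts in perspective, a little arithmetic (assuming the full 26-letter alphabet, which the abstract does not state explicitly):

    sequences = 26 ** 6    # 308,915,776 possible six-letter strings
    trained = 50
    generalized = 10_000   # about 200 times the training set
    virtual = 1_000_000    # about 20,000 times the training set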
Connectionist Variable-Binding by Optimization
Symbolic AI systems based on logical or frame languages can easily perform inferences that are still beyond the capability of most connectionist networks. This paper presents a strategy for implementing in connectionist networks the basic mechanisms of variable binding, dynamic frame allocation, and equality that underlie many of the types of inferences commonly handled by frame systems, including inheritance, subsumption, and abductive inference. The paper describes a scheme for translating frame definitions in a simple frame language into objective functions whose minima correspond to partial deductive closures of the legal inferences. The resulting constrained optimization problem can be viewed as a specification for a connectionist network.
Efficient Inference with Multi-Place Predicates and Variables in a Connectionist System
The ability to represent structured knowledge and use that knowledge in a systematic way is a very important ingredient of cognition. An often-heard criticism of connectionism is that connectionist systems cannot possess that ability. The work reported in this paper demonstrates that a connectionist system can not only represent structured knowledge and display systematic behavior, but can also do so with extreme efficiency. The paper describes a connectionist system that can represent knowledge expressed as rules and facts involving multi-place predicates, and draw limited, but sound, inferences based on this knowledge. The system is extremely efficient - in fact, optimal - as it draws conclusions in time proportional to the length of the proof. Central to this ability of the system is a solution to the variable binding problem. The solution makes use of the notion of a phased clock and exploits the time dimension to create and propagate variable bindings.
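A minimal sketch of the phase-based binding idea: a variable and its filler are bound by firing in the same phase of a repeating clock cycle, so bindings propagate through the rule network without dedicated binder units (the data structures below are our illustration, not the paper's implementation):

    phase_of_var = {"giver": 0, "object": 1}   # variables occupy phases
    filler_in_phase = {0: "John", 1: "Book1"}  # fillers fire in phases

    def read_bindings(rule_vars):
        # A rule recovers its arguments from the phases its variables
        # occupy; activity propagated in phase carries the binding.
        return {v: filler_in_phase[phase_of_var[v]] for v in rule_vars}

    # read_bindings(["giver", "object"])
    # -> {'giver': 'John', 'object': 'Book1'}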
On The Nature of Children's Naïve Knowledge
We argue that children construct a naive understanding of the world which gradually becomes modified to conform to adult-scientific views. This naive understanding consists of a number of discrete ontological beliefs, such as that the ground is flat, that things fall down, and that stars are small objects. Children are capable of synthesizing their ontological beliefs to form relatively consistent conceptual structures. However, they also seem to be operating under an epistemological constraint according to which these ontological beliefs represent the true state of affairs about the world. In the process of conceptual change, children replace their ontological beliefs with a different explanatory framework.
Comparing Historical and Intuitive Explanations of Motion: Does "Naïve Physics" Have a Structure?
Are students' explanations of motion generated by an underlying structure? We address this question by exploring striking parallels between intuitive explanations and those offered by medieval scholastics. Using the historical record, it is possible to reconstruct an inferential structure that generates medieval explanations. We posit a parallel structure for intuitive explanations.
Qualitative Geometric Reasoning
Scientific Reasoning Strategies in a Simulated Molecular Genetics Environment
Two studies are reported investigating the strategies that subjects use to revise hypotheses following disconfirmation. Subjects attempted to discover how genes are controlled by conducting experiments in a simulated molecular genetics laboratory. In Study 1, subjects set a goal of finding an experimental result; when this goal was not achieved, they adopted one of the three following strategies: (1) distort the logic of evidence interpretation to fit the current goal; (2) conduct a parametric analysis of the Experiment space to achieve the goal; or (3) set a new experimental goal of trying to discover the cause of unexpected findings. Only the third group discovered how the genes are controlled. In Study 2, the hypothesis that the subject's experimental goal blocks consideration of alternative hypotheses was investigated. When subjects were allowed to reach their initial goal, they then set a new goal of accounting for unusual findings and discovered the mechanism of control. These results suggest that the goal of the subjects constrains search of both a Hypothesis space and an Experiment space. This strategy can produce distortions in reasoning and a failure to generate new hypotheses.
Learning Events in the Acquisition of Three Skills
According to current theories of cognitive skill acquisition, new problem solving rules are constructed by proceduralization, production compounding, chunking, syntactic generalization, and a variety of other mechanisms. All these mechanisms are assumed to run rather quickly, so a rule's acquisition should be a matter of a few seconds at most. Such "learning events" might be visible in protocol data. This paper discusses a method for locating the initial use of a rule in protocol data. The method is applied to protocols of subjects learning three tasks: a river crossing puzzle, the Tower of Hanoi, and a topic in college physics. Rules were discovered at the rate of about one every half hour. Most rules required several learning events before they were used consistently, which is not consistent with the one-trial learning predicted by explanation-based learning methods. Some observed patterns of learning events were consistent with a learning mechanism based on syntactic generalization of rules. Although most rules seem to have been acquired at impasses (occasions when the subject does not know what to do next), there were clear cases of rules being learned without visible signs of an impasse, which does not support the popular hypothesis that all learning occurs at impasses.
Perceptual Chunks in Geometry Problem Solving: A Challenge to Theories of Skill Acquisition
In current theories of skill acquisition it is quite common to assume that the input to learning mechanisms is a problem representation based on direct translations of problem instructions or simple inductions from problem solving examples. We call such a problem representation an execution space because it is made up of operators corresponding to the external actions agents perform while executing problem solutions. Learning proceeds by modifications and combinations of these execution space operators. We have built a model of geometry expertise, based on verbal report evidence, which contains operators that can be described as modifications (e.g., abstractions) and combinations (e.g., compositions) of execution operators. However, a number of points of evidence lead us to conclude that these operators were not derived from execution space operators. In contrast, it appears these operators derive from discoveries about the structure and properties of domain objects, particularly perceptual properties. We have yet to develop a detailed and integrated theory of this "perceptual chunking", but we present the expert model as a challenge to current theories of skill acquisition.
Empirical Analyses of Self-Explanation and Transfer in Learning to Program
Building upon recent work on production system models of transfer and analysis-based generalization techniques, we present analyses of three studies of learning to program recursion. In Experiment 1, a production system model was used to identify problem solving that involved previously acquired skills or required novel solutions. A mathematical model based on this analysis accounts for inter-problem transfer. Programming performance was also affected by particular examples presented in instruction. Experiment 2 examined these example effects in finer detail. Using a production system analysis, examples were found to affect the initial error rates, but not the learning rates on cognitive skills. Experiment 3 examined relations between the ways in which people explain examples to themselves and subsequent learning. Results suggest that good learners engage in more metacognition, generate more domain-specific elaborations of examples, make connections between examples and abstract text, and focus on the semantics of programs rather than syntax.
Pragmatic Interpretation and Ambiguity
An approach to pragmatic interpretation in natural language understanding is described. The approach trades off a full generative natural language capacity for the ability to recognize the flow of familiar (and often complex) arguments. The theory requires considerable domain-dependent knowledge and specific domain-dependent goals for the understanding system. The process model described, however, is domain-independent, with fairly relaxed representational constraints. All processing takes place within a hierarchical episodic memory, allowing expectations to be posted to quite general concepts from multiple sources in parallel.
Expertise and Constraints in Interactive Sentence Processing
We examined individual variation in the integration of conceptual and linguistic knowledge during discourse processing. Skilled and average processors received sentences that were strongly or weakly supported by context. To reduce the contribution of special processing strategies, the syntactic constructions and topics were highly familiar. The interactions of context with linguistic processing were more constrained by sentential connectives for skilled processors, but less constrained by imposed reading units, which varied from words to incomplete sentences to complete sentences. These results suggest that a characteristic of expertise in discourse processing is an almost continual focus on organizing the results of linguistic processing into a conceptual framework. The results are discussed in terms of an interactive model with autonomous processors, but with shared resources for attending to the products of these processors.
Anomaly Detection Strategies for Schema-Based Story Understanding
Schema-based story understanding allows systems to process routine stories efficiently. However, a system that blindly applies active schemas may fail to recognize and understand novel events. To deal effectively with novelty, a story understander needs to be able to recognize when new information conflicts with its model of a situation. Thus it needs to be able to do anomaly detection. Anomaly detection is the process that identifies when new information is inconsistent with current beliefs and expectations. Checking for all possible inconsistencies would be an explosive inference problem: it would require comparing all the ramifications of a new fact to all the ramifications of the facts in memory. We argue that this inference problem can be controlled by selective consistency checking: an initial set of inexpensive tests can be applied to detect potential problems, and more thorough tests used only when a likely problem is found. We describe a set of stereotype-based basic believability checks, designed to identify potential problems with minimal inference, and fine-grained tests that can be used to diagnose the problems that the basic believability checks detect. These tests are implemented in the story understanding program ACCEPTER.
Expectation Verification: A Mechanism for the Generation of Meta Comments
Meta Comments, such as "however," "as I have stated before" and "if you pardon the expression," are pervasive in human discourse. In this paper, we present a predictive mechanism for the generation of Meta Comments based on the tenet that they signal a change in the status of beliefs and expectations a listener is conjectured to possess. In particular, our mechanism anticipates a listener's expectations by activating prescribed inferences on the state of the discourse. It is being implemented in a system called FIGMENT II which generates commentaries on the solution of algebraic equations.
Toward a Unified Theory of Immediate Reasoning in Soar
Soar is an architecture for general intelligence that has been proposed as a unified theory of human cognition (UTC) (Newell, 1989) and has been shown to be capable of supporting a wide range of intelligent behavior (Laird, Newell & Rosenbloom, 1987; Steier et al., 1987). Polk & Newell (1988) showed that a Soar theory could account for human data in syllogistic reasoning. In this paper, we begin to generalize this theory into a unified theory of immediate reasoning based on Soar and some assumptions about subjects' representation and knowledge. The theory, embodied in a Soar system (IR-Soar), posits three basic problem spaces (comprehend, test-proposition, and build-proposition) that construct annotated models and extract knowledge from them, learn (via chunking) from experience, and use an attention mechanism to guide search. Acquiring task-specific knowledge is modeled with the comprehend space, thus reducing the degrees of freedom available to fit data. The theory explains the qualitative phenomena in four immediate reasoning tasks and accounts for an individual's responses in syllogistic reasoning. It represents a first step toward a unified theory of immediate reasoning and moves Soar another step closer to being a unified theory of all of cognition.
Toward a Soar Theory of Taking Instructions for Immediate Reasoning Tasks
Soar is a theory of the human cognitive architecture. We present here the Soar theory of taking instructions for immediate reasoning tasks, which involve extracting implicit information from simple situations in a few tens of seconds. This theory is realized in a computer system that comprehends simple English instructions and organizes itself to perform a required task. Comprehending instructions produces a model of future behavior that is interpretively executed to yield task behavior. Soar thereby acquires task-specific problem spaces that, together with basic reasoning capabilities, model human performance in multiple immediate reasoning tasks. By providing an account of taking instructions, we reduce the degrees of freedom available to our theory of immediate reasoning, and also give more support for Soar as a unified theory of cognition.
Poster Presentations
Learning Relative Attribute Weights For Instance-Based Concept Descriptions
Nosofsky recently described an elegant instance-based model (GCM) for concept learning that defined similarity (partly) in terms of a set of attribute weights. He showed that, when given the proper parameter settings, the GCM model closely fit his human subject data on classification performance. However, no algorithm was described for learning the attribute weights. The central thesis of the GCM model is that subjects distribute their attention among attributes to optimize their classification and learning performance. In this paper, we introduce two comprehensive process models based on the GCM. Our first model is simply an extension of the GCM that learns relative attribute weights. The GCM's learning and representational capabilities are limited: concept descriptions are assumed to be disjoint and exhaustive. Therefore, our second model is a further extension that learns a unique set of attribute weights for each concept description. Our empirical evidence indicates that this extension outperforms the simple GCM process model when the domain includes overlapping concept descriptions with conflicting attribute relevancies.
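To make the similarity machinery concrete, the following is a minimal Python sketch of GCM-style classification with attention (attribute) weights. The exponential-decay similarity and summed-similarity choice rule follow Nosofsky's published formulation; the function names and parameter values are illustrative, and the two weight-learning process models the paper introduces are not reproduced here.

```python
import numpy as np

def gcm_similarity(x, exemplar, weights, c=1.0):
    """Exponential decay of the attention-weighted city-block distance."""
    return np.exp(-c * np.sum(weights * np.abs(x - exemplar)))

def gcm_classify(x, exemplars, labels, weights, c=1.0):
    """P(category | x): summed similarity of x to each category's stored exemplars."""
    sims = np.array([gcm_similarity(x, e, weights, c) for e in exemplars])
    labels = np.array(labels)
    cats = sorted(set(labels))
    totals = np.array([sims[labels == k].sum() for k in cats])
    return dict(zip(cats, totals / totals.sum()))
```

A weight-learning extension of the kind the paper describes would then adjust `weights` between trials, for example by gradient or hill-climbing steps on classification accuracy.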
Selective Associations in Causality Judgements II: A Strong Causal Relationship May Facilitate Judgements of a Weaker One
Previous research had shown that a strong relationship between a causal factor and an outcome reduces estimates of the relationship between a second causal factor and the same outcome (two causal factors, one outcome). In the present experiment subjects judged the effect of one response (pressing a spacebar) on two outcomes (a ball and/or a box might change color). We used an operant-like procedure in which subjects did problems on the video screen of a computer. The response was involved in various contingencies with the ball and box. In the critical condition one outcome (changes in the color of the box) was highly correlated with the cause (pressing the spacebar) and the other, target, outcome (changing ball color) was only modestly related to the cause. In contrast to earlier work the concurrent strong causal relationship increased the perceived causal relationship between the target outcome and the cause. The present experiment was derived from and its results are partially accounted for by the Rescorla-Wagner model (1972), which is a simple connectionist model.
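For reference, a minimal sketch of the Rescorla-Wagner update rule the authors invoke; the trial encoding and parameter values here are assumptions for illustration. Because all cues present on a trial share a single prediction error, the model predicts selective association effects, which is why the abstract reports only a partial account of its transposed one-cause, two-outcome design.

```python
def rescorla_wagner(trials, n_cues, alpha=0.3, beta=1.0):
    """trials: list of (active_cue_indices, outcome) pairs, outcome in {0, 1}.
    V[i] is the associative strength of cue i; active cues share one error term."""
    V = [0.0] * n_cues
    for active, outcome in trials:
        error = outcome - sum(V[i] for i in active)  # lambda minus summed prediction
        for i in active:
            V[i] += alpha * beta * error             # cues compete for the same error
    return V
```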
Representation and Acquisition of Knowledge of Functional Systems
Many experimental studies have shown that learning and memorization of complex information are strongly influenced by the learners' prior knowledge. Thus, detailed analyses of the structures and the processes involved in learning and memorization require precise assessment of the learner's prior knowledge in relation to the characteristics of the domain to be acquired. We have developed a formalization in terms of systems: relational, transformational, teleological (functional and intentional), which permits us to simultaneously describe the domain being acquired, the representation of the acquiring organism, and our representation of that representation. Here, we report a study in which this formalization was employed in assessing the representation that students with different levels of knowledge about automobile mechanics have of a functional system: the starter system of an automobile. The predictions made by this formalization were compared with the performances of three groups of students with different levels of knowledge on a series of four tasks: free interview, causal questioning, completing lacunary event triples, and a multiple-choice questionnaire on the existence of events and causal relations. The criterion used to choose these four tasks was that they differ according to the demands they make on the retrieval of stored information in memory. The results show that: (i) subjects with a good level of knowledge have a representation organized as an autonomous functional system organized in sub-systems, while (ii) subjects with lower levels of knowledge do not have a representation organized as a functional system, and (iii) subjects from the intermediate group built a representation organized as an autonomous functional system but containing less information and more poorly organized in sub-systems.
Connectionism and Intentionality
Connectionism offers greater promise than symbolic approaches to cognitive science for explaining the intentionality of mental states, that is, their ability to be about other phenomena. In symbolic cognitive science symbols are essentially arbitrary, so that there is nothing that intrinsically relates them to their referents. The causal process of transduction is inadequate to explain how mental states acquire intentionality, in part because it is incapable of taking into account the contextual character of mental states. In contrast, representations employed in connectionist models can be much more closely connected to the things they represent. The ability to produce these representations in response to external stimuli is controlled by weights which the system acquires through a learning process. In multi-layer systems the particular representations that are formed are also determined by processes internal to the system as it learns to produce the overall desired output. Finally, the representations produced are sensitive to contextual variations both in the objects being represented and in the system doing the representing. These features suggest that connectionism offers significant resources for explaining how representations are about other phenomena and so possess intentionality.
A Connectionist Model of Category Size Effects During Learning
This paper reports the results of category learning experiments in which the number of exemplars defining a category during learning was varied. These results reveal that category exemplars from larger sized categories are classified more accurately than those from smaller-sized categories. This was true both early and late in learning. In addition, subjects exhibited a response bias toward classifying exemplars into larger-sized categories throughout learning. A connectionist model is developed which exhibits these same tendencies.
A Connectionist Model of Phonological Short-Term Memory
A connectionist model of phonological short-term memory is described. The model makes use of existing connectionist techniques, developed to account for the production and perception of speech and other sequential data, to implement a model of the articulatory rehearsal involved in short-term retention of verbal information. The model is shown to be consistent with a wide range of experimental data, and can be interfaced with existing connectionist models of word recognition. The model illustrates, within a connectionist framework, how the mechanisms of speech perception and production can be recruited for the temporary storage of information. Advantages of this strategy are discussed.
Coherence Relation Assignment
Three empirical studies of coherence in large corpora of commentary text are sketched, showing that cue phrases are infrequent, and that substantive coherence relations must be assigned in order to infer discourse structure. The notion of coherence is carefully defined in relation to the world, cognitive models of the world, and formal semantic representations of discourse. An efficient algorithm for assigning discourse coherence relations is described, which employs information from syntax, cue phrases, lexical items, formal semantics and naive semantics. The algorithm correctly assigns the coherence relations evident in an 8000 word corpus.
A Model for Contextualizing Natural Language Discourse
This paper describes a computational model of semantic processing in natural language discourse understanding based on the distribution of knowledge over multiple spaces as proposed by Fauconnier (1985), Dinsmore (1987a), Kamp (1980), Johnson-Laird (1985) and others. Among the claims made about such a partitioned representation of knowledge are the following: First, it promotes a more direct, more natural mapping from surface discourse sentence to internal representation. Second, it supports more efficient reasoning and retrieval processes over that internal representation. Finally, it provides an accurate account of many of the most recalcitrant problems in natural language discourse understanding. Among these are implicit information, presupposition, referential opacity, tense and aspect, and common-sense reasoning in complex domains. The model identifies two fundamental levels of semantic processing: contextualization, in which an appropriate space for assimilating the information conveyed in a discourse sentence is located, and construction, in which the information is actually assimilated into that space. Contextualization allows the full semantics of the discourse to be realized implicitly in the internal representation. It also accounts for the use of moods, tenses, and various adverbials in discourse. The interaction of the contextualization processes with the semantics of aspectual operators provides an account of the discourse use of aspect.
An Intelligent Tutoring System Approach to Teaching People How to Learn
Sherlock is an intelligent tutoring system designed to teach people to build simplified knowledge representations (graphic maps) to facilitate learning of a text. Previous attempts to automate instruction in graphic mapping have had problems because they attempted to diagnose a learner's misunderstandings by looking at a finished graphic map. Sherlock uses a knowledge-based approach to diagnose a learner's misunderstandings by looking at the knowledge and processes that lead to a learner's graphic map, rather than the completed map. In Sherlock's model a semantic network is used to represent the knowledge in the text. A production system models the strategy for constructing a graphic map by initiating spreading activation on the semantic network, and interpreting the resulting activation patterns. In a limited evaluation Sherlock was able to correctly determine if a construction was appropriate 96% of the time.
True and Pseudo Framing Effects
The term "framing effect" describes the finding that people often respond differently to different descriptions or "frames" of a single situation. Framing effects violate the principle of "invariance" which states that one's decision should not be affected by how a situation is described. An important question about framing effects is whether subjects agree that two versions are equivalent. The term "framing effect" assumes that subjects would agree that the two situations were equivalent. The study reported here tests this assumption. In this study, subjects were first asked to answer framing effect problems and then were asked to compare two versions of a problem and state whether the two versions should be treated the same. In some cases such as Kahneman and Tversky's (1984) lives lost/lives saved problem, subjects treated two versions differently but reported that they should be treated the same. This is called a "true framing effect." In other cases such as Thaler's (1980) reference point problem, subjects treated the two versions differently and stated that they should be treated differently. This is described as a "pseudo framing effect." The distinction between true and pseudo framing effects has implications for both normative and descriptive theories of decision making.
Question Answering in the Context of Causal Mechanisms
A model of human question answering (called QUEST) accounts for the answers that adults produce when they answer different categories of open-class questions (such as why, how, when, what-enabled, and what-are-the-consequences). This project investigated the answers that adults generate when events are queried in the context of biological, technological, and physical mechanisms. According to QUEST, an event sequence in a scientific mechanism is represented as a causal network of events and states; a teleological goal hierarchy may also be superimposed on the causal network in biological and technological domains, but not in physical systems (e.g., rainfall, earthquake). When questions are answered, QUEST systematically operates on the causal networks and goal hierarchies that underlie a causal mechanism. Answers to how and enablement questions sample causal antecedents of the queried event in the causal network; consequence questions sample causal consequents. Answers to when questions sample antecedents to a greater extent than consequents even though events from both directions furnish sensible answers. Answers to why questions sample both causal antecedents in the causal network and superordinate goals from goal hierarchies that exist in technological and biological knowledge structures.
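As a toy illustration of the causal-network half of this scheme (the goal hierarchies are omitted), the sketch below walks an invented network backward for how/enablement questions and forward for consequence questions; QUEST's actual weighting of answer quality is not modeled.

```python
# event -> its direct causal antecedents (invented toy network)
causes = {
    "flooding": ["heavy rain"],
    "heavy rain": ["storm front arrives"],
    "storm front arrives": [],
}

def antecedents(event):
    """Candidate answers to how/enablement questions: trace causes backward."""
    found = []
    for c in causes.get(event, []):
        found.append(c)
        found.extend(antecedents(c))
    return found

def consequents(event):
    """Candidate answers to consequence questions: trace effects forward."""
    found = []
    for e, cs in causes.items():
        if event in cs:
            found.append(e)
            found.extend(consequents(e))
    return found

print(antecedents("flooding"))             # ['heavy rain', 'storm front arrives']
print(consequents("storm front arrives"))  # ['heavy rain', 'flooding']
```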
Learning a Troubleshooting Strategy: The Roles of Domain Specific Knowledge and General Problem-Solving Strategies
This research investigated how college students learned an efficient troubleshooting strategy, elimination. The subjects' task was to find the broken components in networks that were similar to digital circuits. With only minimal training in this task, subjects usually used a strategy of backtracking from the incorrect network outputs, instead of the more efficient elimination strategy, which involves backtracking but also eliminating (ignoring) components that lead into good network outputs. Computer simulation modeling suggested that in order for subjects to learn the elimination strategy on their own, they needed to apply (1) certain key domain-specific knowledge about how the components worked, and (2) the general reductio-ad-absurdum problem-solving strategy. An experiment showed that these two kinds of knowledge do enable students to increase their use of elimination, thus supporting the model.
Representing Variable Information with Simple Recurrent Networks
How might simple recurrent networks represent co-occurrence relationships such as those holding between a script setting (e.g., "clothing store") and a script item ("shirt") or those that specify the feature match between the gender of a pronoun and its antecedent? These issues were investigated by training a simple recurrent network to predict the successive items in various instantiations of a script. The network readily learned the script in that it performed flawlessly on the non-variable items and only activated the correct type of role filler in the variable slots. However, its ability to activate the target filler depended on the recency of the last script variable. The network's representation of the script can be viewed as a trajectory through multidimensional state space. Different versions of the script are represented as variations of the trajectory. This perspective suggests a new conception of how networks might represent a long-distance binding between two items. The binding must be seen as not existing between an antecedent and a target, but between a target item and the current global state.
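A minimal forward pass for this kind of network (an Elman-style simple recurrent network) is sketched below; dimensions, initialization, and the omission of training are all simplifications for illustration.

```python
import numpy as np

class SRN:
    """Elman-style simple recurrent network: the hidden state is copied into a
    context layer and fed back as input on the next step, so each version of a
    script traces a trajectory through hidden-state space."""
    def __init__(self, n_in, n_hid, n_out, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in = rng.normal(0.0, 0.1, (n_hid, n_in))
        self.W_ctx = rng.normal(0.0, 0.1, (n_hid, n_hid))
        self.W_out = rng.normal(0.0, 0.1, (n_out, n_hid))
        self.context = np.zeros(n_hid)

    def step(self, x):
        """Consume one script item; return a distribution over the next item."""
        h = np.tanh(self.W_in @ x + self.W_ctx @ self.context)
        self.context = h
        z = self.W_out @ h
        e = np.exp(z - z.max())
        return e / e.sum()
```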
Device Representation for Modeling Improvisation in Mechanical Use Situations
Improvisation requires an understanding and application of mechanical objects in broad contexts. The capacity to interpret a situation in terms of an object's capabilities requires the integration of functional and behavioral object representations. A model is presented which describes the integration of causal interactions between these levels of abstraction. The model maintains both intentional and behavioral representations to allow inferencing at each level, but integrates them by applying an inferencing mapping between the two. This model is used to reason about simple mechanical objects in the domain of improvisational mechanics.
'Confirmation bias' in rule discovery and the principle of maximum entropy
In scientific research as well as in everyday reasoning, people are prone to a 'confirmation bias', i.e. they tend to select tests that fit the theories or beliefs they already entertain. This tendency has been criticized by philosophers of science as not optimal. The behavior has been studied in a variety of psychological experiments on controlled, small-scale simulations of scientific research. Applying elementary information theory to sequential testing during rule discovery, this paper shows that the biased strategy is not necessarily a bad one; moreover, it reflects a healthy propensity of the subject (or researcher) to optimize the expected information on each trial.
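The information-theoretic point can be made concrete: here is a sketch, under assumed toy priors, of the expected entropy reduction from a single yes/no test, which is the quantity a 'biased' tester can in fact be maximizing.

```python
import math

def entropy(dist):
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def expected_gain(prior, p_yes_given_h):
    """Expected entropy reduction over hypotheses from one yes/no test.
    prior: {hypothesis: probability}; p_yes_given_h: {hypothesis: P(positive | h)}."""
    p_yes = sum(prior[h] * p_yes_given_h[h] for h in prior)
    gain = entropy(prior)
    for positive, p_o in ((True, p_yes), (False, 1.0 - p_yes)):
        if p_o == 0:
            continue
        post = {h: prior[h] * (p_yes_given_h[h] if positive else 1.0 - p_yes_given_h[h]) / p_o
                for h in prior}
        gain -= p_o * entropy(post)
    return gain
```

Comparing `expected_gain` for a confirming versus a disconfirming test under different priors shows when the 'biased' choice is in fact the more informative one.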
Modeling of User Performance with Computer Access and Alternative Communication Systems for Handicapped People
Disabled individuals who cannot use a standard keyboard require a special interface in order to use a computer. The GOMS model is used here to quantitatively evaluate three interfaces currently used in computer access systems for handicapped people. Each interface uses a row/column scanning technique for letter selection, and two of the interfaces employ word prediction in an attempt to improve text input rate. Techniques for modeling these interfaces are presented, and the resulting predictions for performance time, learning time, and working memory requirements are discussed. The models predict that the systems with word prediction actually have lower performance than one that allows only single letter selections. Factors contributing to this result include additional mental operators required for use of the word predictive interfaces and an insufficient probability of successful word prediction.
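To give the flavor of such a model, here is a toy GOMS-style time prediction for row/column scanning with and without word prediction. All constants and the cost structure are assumptions for illustration, not the paper's fitted values; the 1.35 s mental operator is the conventional GOMS estimate.

```python
SCAN_PERIOD = 0.65   # seconds per scan step (assumed)
SWITCH_TIME = 0.35   # seconds to activate the switch (assumed)
MENTAL_OP   = 1.35   # seconds, conventional GOMS "M" operator

def letter_time(row, col):
    """Wait for the target row to be highlighted, select it, then wait for the column."""
    return (row + col) * SCAN_PERIOD + 2 * SWITCH_TIME

def predicted_letter_time(row, col, p_hit, list_scan=1.0):
    """With word prediction, every letter costs an extra mental operator to scan
    the candidate list, which pays off only with probability p_hit. If p_hit is
    too low, the overhead dominates, matching the abstract's conclusion."""
    overhead = MENTAL_OP + list_scan
    return overhead + (1.0 - p_hit) * letter_time(row, col)
```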
Focusing Your RST: A Step Toward Generating Coherent Multisentential Text
In multisentence texts, the order and interrelationships of sentence topics are of crucial importance if the reader is to understand easily. What makes paragraphs coherent? What strategies do people employ to control the presentation order and linking of material? Without a theory of coherence, text generation systems have little hope of producing acceptable texts. While various theories of text coherence have been developed, no single theory captures all the phenomena of human-generated paragraphs. In this paper we argue that the coherence of a paragraph does not result from the application of a single theory, but instead results from the cooperation of a number of different coherence strategies. We illustrate this claim by showing how two very different theories about the planning of coherent text — 1) Rhetorical Structure Theory, based on structural and semantic relationships that hold between pieces of the text, and 2) Focus Trees, based on how the focus of attention shifts during discourse — can be used within a single system to complement each other to best advantage.
Individual Differences in the Revision of an Abstract Knowledge Structure
Following the recent suggestion (Hockey, in press) that cognitive science has much to gain from consideration of variability in cognitive functioning, this paper addresses the question of what aspects of memory performance underlie differences in 'cognitive style' such as 'ambiguity tolerance'. Subjects allocated to 'tolerant' and 'intolerant' groups on the basis of a traditional pencil-and-paper measure of ambiguity tolerance took part in a conceptual editing task which required them to disregard information learnt on a previous occasion. The results of the study show significant differences between groups, both in terms of recall and discrimination, and are interpreted as supporting the view that ambiguity tolerance effects result from differences in the organisation and availability of the underlying conceptual representation.
EBL and SBL: A Neural Network Synthesis
Previous efforts to integrate Explanation-Based Learning (EBL) and Similarity-Based Learning (SBL) have treated these two methods as distinct interactive processes. In contrast, the synthesis presented here views these techniques as emergent properties of a local associative learning rule operating within a neural network architecture. This architecture consists of an input layer, a layer buffering this input, but subject to descending influence from higher order units in the network, one or more hidden units encoding the previous knowledge of the network, and an output decision layer. SBL is accomplished in the normal manner by training the network with positive and negative examples. A single positive example only is required for EBL. Irrelevant features in the input are eliminated by the lack of top-down confirmation, and/or by descending inhibition. Associative learning then causes the strengthening of connections between relevant input features and activated hidden units, and the formation of "bypass" connections. On future presentations of the same (or a similar) example, the network will then reach a decision more quickly, emulating the chunking of knowledge that takes place in symbolic EBL systems. Unlike these programs, this integrated system can learn in the presence of an incomplete knowledge domain. A simulation program, ILn, provides partial verification of these claims.
Competition and Learning in a Connectionist Deterministic Parser
Deterministic parsing promises to (almost) never backtrack. Neural network technology promises competition and learning capabilities. The marriage of these two ideas is being investigated in an experimental natural language parsing system that combines some of the best features of each. The result is a deterministic parser that learns, generalizes, and supports competition among structures and lexical interpretations. The performance of the parser is being evaluated on predicted as well as unpredicted sentence forms. Several mildly ungrammatical sentences have been successfully processed into structures judged reasonable when compared to their grammatical counterparts. Lexical ambiguities can create problems for traditional parsers, or at least require additional backtracking. With the use of neural networks, ambiguities can be resolved through the wider syntactic context. The results have shown the potential for parsing using this approach.
DESCARTES: Development Environment for Simulating Hybrid Connectionist Architectures
The symbolic and subsymbolic paradigms each offer advantages and disadvantages in constructing models for understanding the processes of cognition. A number of research programs at UCLA utilize connectionist modeling strategies, ranging from distributed and localist spreading-activation networks to semantic networks with symbolic marker passing. As a way of combining and optimizing the advantages offered by different paradigms, we have started to explore hybrid networks, i.e. multiple processing mechanisms operating on a single network, or multiple networks operating in parallel under different paradigms. Unfortunately, existing tools do not allow the simulation of these types of hybrid connectionist architectures. To address this problem, we have developed a tool which enables us to create and operate these types of networks in a flexible and general way. We present and describe the architecture and use of DESCARTES, a simulation environment developed to accomplish this type of integration.
Frame Selection in a Connectionist Model of High-Level Inferencing
Frame selection is a fundamental problem in high-level reasoning. Connectionist models have been unable to approach this problem because of their inability to represent multiple dynamic variable bindings and use them by applying general knowledge rules. These deficits have barred them from performing the high-level inferencing necessary for planning, reasoning, and natural language understanding. This paper describes a localist spreading-activation model, ROBIN, which solves a significant subset of these problems. ROBIN incorporates the normal semantic network structure of previous localist networks, but has additional structure to handle variables and dynamic role-binding. Each concept in the network has a uniquely-identifying activation value, called its signature. A dynamic binding is created when a binding node receives the activation of a concept's signature. Signatures propagate across paths of binding nodes to dynamically instantiate candidate inference paths, which are selected by the evidential activation on the network's semantic structure. ROBIN is thus able to approach many of the high-level inferencing and frame selection tasks not handled by previous connectionist models.
A Symbolic/Connectionist Script Applier Mechanism
We constructed a Modular Connectionist Architecture which consists of many different types of three-layer feed-forward PDP network modules (auto-associative recurrent, hetero-associative recurrent, and hetero-associative) in order to do script-based story understanding. Our system, called DYNASTY (DYNAmic script-based STory understanding sYstem), has three major functions: (1) DYNASTY can learn distributed representations of concepts and events in everyday scriptal experiences, (2) DYNASTY can do script-based causal chain completion inferences according to the acquired sequential knowledge, and (3) DYNASTY performs script role association and retrieval while performing script application. Our purpose in constructing this system is to show that the learned internal representations, using simple encoder-type networks, can be used in higher-level modules to develop connectionist architectures for fairly complex cognitive tasks, such as script processing. Unlike other neurally inspired script processing models, DYNASTY can learn its own similarity-based distributed representations from input script data using ARPDP (Auto-associative Recurrent PDP) architectures. Moreover, DYNASTY's role association network handles both script roles and fillers as full-fledged concepts, so that it can learn the generalized associative knowledge between several script roles and fillers.
Distributed Problem Solving: The Social Contexts of Learning and Transfer
The problem of transfer remains one of the most difficult challenges for schooling: knowledge and skills that students learn in a classroom are often not used in out-of-school contexts. To address this problem, this paper analyzes educational interaction conducted via long-distance electronic networks. We present a new methodology, called Semantic Trace Analysis. From our analyses, we present two possible solutions to the transfer problem. First, we describe an organizing framework for network interactions, which we call "receiver site transfer", which provides a functional environment for students' problem solving. In addition, we describe some initial explorations of "teleapprenticeships", instructional interactions through which students learn knowledge and skills by interacting with adults outside the school system. To the extent that adults increasingly use electronic networks for their work, we will be able to avoid the transfer problem by instructing students within the same context that they will use that instruction.
A Framework for Psychological Causal Induction: Integrating the Power and Covariation Views
We propose a theoretical framework for interpreting the roles of covariation and the idea of power in psychological causal induction. According to this framework, the computation of inference is purely covariation-based, but covariation is computed only on a set of selected dimensions in a set of selected events. Whether or not a dimension has power or efficacy exerts an influence on whether or not that dimension is selected. We present an experiment testing two predictions based on this framework. Our experiment showed a strong bias towards inferring a movement by a human agent (compared to a state) to be the cause of an event. In support of our hypothesis, this bias was found only when the state was not salient and the inference was made within a relatively short time, suggesting that the bias occurred at the selection stage.
Planning in an Open World: A Pluralistic Approach
Recent work in planning has rejected the assumption of a closed, stable world, and the associated paradigm of exhaustive preplanning, which encounters serious problems trying to plan in a world where that assumption does not hold. Several alternative strategies have been proposed, responding to these new problems in a variety of ways. We review this spectrum, finding the various approaches in part incompatible but not bereft of some common themes and complementary strengths. We suggest factors in the application domain which should influence the appropriate mix, and describe the TRUCKER project to illustrate some of the problems and benefits in implementing such a mix.
Lexical Ambiguity Resolution in a Constraint Satisfaction Network
Behavioral evidence supports the claim that in the absence of a strongly biasing context multiple meanings of an ambiguous word are activated, particularly when the two meanings occur with equal frequency. A simple constraint satisfaction system, based on a Hopfield network and incorporating a distributed memory scheme, is shown to account for results from a cross modal priming paradigm typically interpreted as evidence for multiple access. The model demonstrates that the power of an ambiguous word to facilitate identification of targets related to either of its two meanings may be produced by selective activation of just one meaning. Selective activation is driven by simultaneous processing of the ambiguous prime and the associated target word, with the unambiguous target determining the appropriate interpretation of the prime. The model also provides the basis for a reinterpretation of a number of other empirical results concerning lexical ambiguity resolution.
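A minimal sketch of the constraint-satisfaction core is shown below (the paper's distributed memory scheme and priming setup are richer): patterns are stored with a Hebbian rule, and asynchronous updates settle an ambiguous initial state into exactly one stored interpretation.

```python
import numpy as np

def store(patterns):
    """Hebbian outer-product storage of +/-1 patterns; no self-connections."""
    n = patterns.shape[1]
    W = sum(np.outer(p, p) for p in patterns) / n
    np.fill_diagonal(W, 0.0)
    return W

def settle(W, state, steps=200, seed=0):
    """Asynchronous updates never raise the network's energy, so the state
    falls into the basin of one stored pattern (one word meaning)."""
    rng = np.random.default_rng(seed)
    s = state.astype(float).copy()
    for _ in range(steps):
        i = rng.integers(len(s))
        s[i] = 1.0 if W[i] @ s >= 0 else -1.0
    return s
```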
The Role of Computational Temperature in a Computer Model of Concepts and Analogy-Making
We discuss the role of computational temperature in Copycat, a computer model of the mental mechanisms underlying human concepts and analogy-making. In Copycat, computational temperature is used both to measure the amount and quality of perceptual organization created by the program as processing proceeds, and, reciprocally, to continuously control the degree of randomness in the system. We discuss these roles in two aspects of perception central to Copycat's behavior: (1) the emergence of a parallel terraced scan, in which many possible courses of action are explored simultaneously, each at a speed and to a depth proportional to moment-to-moment estimates of its promise, and (2) the ability to restructure initial perceptions — sometimes radically — in order to arrive at a deeper understanding of a situation. We compare our notion of temperature to similar notions in other computational frameworks. Finally, we give an example of how temperature is used in Copycat's creation of a subtle and insightful analogy.
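The reciprocal role of temperature can be sketched with a softmax-style choice rule: the same quantity that summarizes how well organized the current perception is also scales how random the next choice is. Copycat's actual formulas differ; this conveys only the shape of the mechanism.

```python
import math, random

def choose(candidates, temperature):
    """candidates: {action: estimated promise}. High temperature flattens the
    distribution (exploration); low temperature sharpens it toward the most
    promising structure (exploitation)."""
    t = max(temperature, 1e-6)
    weighted = [(a, math.exp(p / t)) for a, p in candidates.items()]
    r = random.uniform(0, sum(w for _, w in weighted))
    for action, w in weighted:
        r -= w
        if r <= 0:
            return action
    return weighted[-1][0]
```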
An Interactive Activation Model for Priming of Geographical Information
Clustering effects in observed performance on spatial recognition tasks give evidence that the judgment of spatial relationships is not based solely on Euclidean proximity, but can depend on other similarity relationships to an equal, or even to a greater, extent. Thus, the representation of spatial information must be coded as one of many features of an object, and these features are expected to interact with one another. A recurrent network using the interactive activation architecture of McClelland & Rumelhart (1981) is presented to illustrate the interaction of these featural representations, including a coarse coding representation of a Euclidean metric. The experiments of McNamara (1986) and McNamara, Ratcliff, and McKoon (1984) are simulated; the model results are in qualitative agreement with the data.
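For reference, the unit-update rule of the interactive activation architecture the model borrows is sketched below; the parameter values are the conventional ones, not necessarily those used in these simulations.

```python
def ia_update(a, net, rest=-0.1, floor=-0.2, ceiling=1.0, decay=0.1):
    """One interactive-activation step (McClelland & Rumelhart, 1981):
    excitatory net input drives activation toward the ceiling, inhibitory
    input toward the floor, and decay pulls it back to the resting level."""
    if net > 0:
        effect = net * (ceiling - a)
    else:
        effect = net * (a - floor)
    return a + effect - decay * (a - rest)
```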
Apprenticeship or Tutorial: Models for Interaction with an Intelligent Instructional System
Conventional intelligent tutoring systems are based on the individual tutorial as a model of instructor-student interaction and use a model of the student's understanding as a principal component guiding instruction. Apprenticeship provides quite a different model of interaction in which a model of the student is not essential. Instead, the instructor, interested in making use of the student's work, provides demonstrations and feedback in terms of the product toward which they are both working. Recent advances in the cognitive science of instruction provide insights into the interactive processes by which instructors appropriate the work of apprentices. An intelligent instructional system that instantiates apprenticeship interaction illustrates an alternative to tutorial-based systems that make use of a student model.
Abduction and World Model Revision
Abduction is the process of constructing explanations. This paper suggests that abduction is central to "world model revisions" — dramatic changes in systems of beliefs such as occur in children's cognitive development and in scientific revolutions. The paper describes a model of belief revision based upon hypothesis formation by abduction. When a contradiction between an observation and an existing model or theory about the physical world is encountered, the best course is often simply to suppress parts of the original theory thrown into question by the contradiction and to derive an explanation of the anomalous observation based on relatively solid, basic principles. This process of looking for explanations of unexpected new phenomena can lead by abductive inference to new hypotheses that can form crucial parts of a revised theory. As an illustration, the paper shows how one of Lavoisier's key insights during the Chemical Revolution can be viewed as an example of hypothesis formation by abduction.
A Linguistic Approach to the Problem of Slot Semantics
Most frame-based knowledge representation (KR) systems have two strange features. First, the concepts represented by the nodes are nouns rather than verbs. Verbal ideas tend to appear mostly in describing roles or slots. Thus the systems are asymmetric. Second, and more seriously, the slot names on frames are arbitrary and not defined in the system. Usually no metasystem is given to account for them. Thus the systems are not closed. Both these features can be avoided by structures inspired by case-based linguistic theories. The basic ideas are that an ontology consists of separate, parallel lattices of verbal and nominal concepts, and that the slots of concepts in each lattice are defined by reference to the concepts in the other lattice. Slots of verbal concepts are derived from cases, and restricted by nominal concepts. Slots of nominal concepts include conducts (verbal concepts) and derivatives of the slots of verbal concepts. Our objective in this paper is not to define a new KR language, but to use input from the study of natural cognition (case grammar) to refine technology for artificial cognition.
Parsing and Representing Container Metaphors
We report the successful construction of a pattern-based parser to recognize the class of container metaphors. Recognition of a metaphor in this class triggers a transformation that substitutes a correct, literal meaning form in the final representation of the utterance or sentence. The final meaning form reflects a theory of metaphors suggesting bodily experiences as the source of metaphor. A large set of primitives serves as the basic representation language. We conclude that pattern parsers with attached transformations work well for many normally difficult constructions such as metaphors, cliches and idioms.
Recognition of Melody Fragments in Continuously Performed Music
The processing of continuous acoustic signals is a challenging problem for perceptual and cognitive models. Sound patterns are usually handled by dividing the signal into long fixed-length windows—long enough for the longest patterns to be recognized. We demonstrate a technique for recognizing acoustic patterns with a network that runs continuously in time and is fed a single spectral frame of input per network cycle. Behavior of the network in time is controlled by temporal regularities in the input patterns that allow the network to predict future events.
Computing Value Judgements During Story Understanding
During story understanding readers make value judgments: judgments of the "goodness" or "badness" of characters' actions. This paper presents the representational structures and processes used to make value judgments by the computer program THUNDER. THUNDER creates evaluative beliefs about characters' plans based on a set of universal pragmatic and ethical judgment rules. To account for subjective differences in evaluative belief, THUNDER has a specific ideology to represent the idiosyncratic aspects of evaluation. There are two components in the representation of ideology: (1) a set of important, long-term goals called values, and (2) a collection of planning strategies for each value. This representation for ideology allows THUNDER to reason about what is "good," and what it believes to be "good ways to get what is good." The representation and rules for value judgments are used to (1) make inferences about character belief and ideology, (2) represent expectation knowledge based on personality traits, and (3) reason about the obligations that characters acquire.
Dynamic Reinforcement Driven Error Propagation Networks with Application to Game Playing
This paper discusses the problem of the reinforcement driven learning of a response to a time varying sequence. The problem has three parts: the adaptation of internal parameters to model complex mappings; the ability of the architecture to represent time varying input; and the problem of credit assignment with unknown delays between the input, output and reinforcement signals. The method developed in this paper is based on a connectionist network trained using the error propagation algorithm with internal feedback. The network is viewed both as a context dependent predictor of the reinforcement signal and as a means of temporal credit assignment. Several architectures for these networks are discussed and insight into the implementation problems is gained by an application to the game of noughts and crosses.
A Case for Symbolic/Sub-symbolic Hybrids
This paper considers the question of what qualities are necessary for an AI system to be a hybrid of symbolic and sub-symbolic approaches. Definitions of symbolic and sub-symbolic systems are given. SCALIR, a hybrid system for information retrieval, is presented, and then used to show how both symbolic and sub-symbolic processing can be combined. Arguments against SCALIR's hybrid nature are presented and rejected.
Neural Network Models of Memory Span
A model is presented in which short term memory is maintained by movement of vectors from one layer to another. This architecture is ideal for representing item order. Two mechanisms for accounting for serial position curves are considered: lateral inhibition, and noise from neighboring items. These also account for effects of grouping by inserting pauses during presentation. Two other effects, a reverse word-length effect and the effect of phonological similarity, are attributed to the reconstruction of items from partially decayed traces. If all the phonemes in an item are intact at recall, the item is recalled correctly. Otherwise, the subject guesses according to a model developed by Paul Luce for identification of words presented in noise.
A Cooperative Model of Intuition and Reasoning for Natural Language Processing - Microfeatures and Logic
This paper discusses problems of correct memory retrieval, preferential ordering, and script selection/withdrawal in natural language processing (NLP). Atmosphere is introduced to solve these problems. It works as a contextual indicator which roughly grasps what is being talked about. An implementation mechanism for atmosphere is presented, inspired by research on artificial neural networks. It is characterized by microfeature representation, a chronological FIFO (First-In First-Out memory), and threshold-based selection. The mechanism constructs an intuition module and works for NLP while cooperating with a logic module which uses TMS to check the justifications of preferential decisions made by the intuition module.
Reinterpretation and the Perceptual Microstructure of Conceptual Knowledge
In this paper I argue that conceptual knowledge has significant perceptual content, based upon evidence from studies of theory formation and from recent experimental work on learning in complex physical domains. I outline a theory of "perceptually grounded" conceptual knowledge, and briefly describe a computational model of learning about lasers, in which a student's "qualitative" understanding of lasers rests primarily upon his perceptual experience in the domain.
A Model of Natural Category Structure and its Behavioral Implications
Fisher (1988) uses the COBWEB concept formation system to illustrate a computational unification of basic level and typicality effects. The model relies on probabilistic, distributed concept representations, and appropriate interaction between cue and category validity. We review this work and report a new account of the fan effect. This extension requires an additional assumption of parallel processing, but otherwise is explained by precisely the same mechanisms as basic level and typicality phenomena.
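COBWEB's evaluation function, category utility, is the mechanism doing the work here: it rewards partitions whose classes make attribute values predictable. A minimal sketch for nominal attributes follows; the data encoding is an assumption for illustration.

```python
from collections import Counter

def category_utility(clusters):
    """clusters: list of classes, each a list of {attribute: value} dicts.
    Measures the per-class gain in expected correct guesses of attribute
    values from knowing an item's class (as in COBWEB)."""
    items = [x for c in clusters for x in c]
    n, k = len(items), len(clusters)
    attrs = items[0].keys()

    def guess_score(group):
        # sum over attributes of sum_j P(attribute = v_j)^2 within the group
        return sum(sum((cnt / len(group)) ** 2
                       for cnt in Counter(x[a] for x in group).values())
                   for a in attrs)

    base = guess_score(items)
    return sum(len(c) / n * (guess_score(c) - base) for c in clusters) / k
```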
Qualitative and Quantitative Reasoning
One goal of qualitative physics is to capture the mental models of engineers and scientists. This paper shows how Qualitative Process theory can be used to express concepts of engineering thermodynamics. This encoding provides the means to integrate qualitative and quantitative knowledge for solving textbook thermodynamics problems. These ideas have been implemented in a program called SCHISM, which analyzes thermodynamic cycles, such as gas turbine plants and steam power plants. We describe its analysis of a sample textbook problem and discuss our plans for future work.
An Approach to Constructing Student Models: Status Report for the Programming Domain
Student models are important for guiding the development of instructional systems. An approach to constructing student models is reviewed. The approach advocates constructing student models in two steps: (1) develop a descriptive theory of correct and buggy student responses, then (2) develop a process theory of the way students actually generate those responses. The approach has been used in the domain of introductory programming. A status report is provided: (1) Goal-And-Plan (GAP) trees have been developed to describe student program variations, and (2) a Generate-Test-and-Debug (GTD) impasse/repair architecture has been developed to model the process of student program generation.
Processing Unification-based Grammars in a Connectionist Framework
We present an approach to the processing of unification-based grammars in the connectionist paradigm. The method involves two basic steps: (1) Translation of a grammar's rules into a set of structure fragments, and (2) encoding these fragments in a connectionist network such that unification and rule application can take place by spreading activation. Feature structures are used to constrain sentence generation by semantic and/or grammatical properties. The method incorporates a general model of unification in connectionist networks.
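For step (2)'s target operation, here is a minimal symbolic sketch of feature-structure unification over nested dicts; the paper's contribution, carrying this out by spreading activation over encoded structure fragments, is not reproduced here.

```python
def unify(f, g):
    """Recursively unify two feature structures (nested dicts of atomic values).
    Returns the merged structure, or None if any feature clashes."""
    if not isinstance(f, dict) or not isinstance(g, dict):
        return f if f == g else None
    merged = dict(f)
    for feature, value in g.items():
        if feature in merged:
            sub = unify(merged[feature], value)
            if sub is None:
                return None          # clash: unification fails
            merged[feature] = sub
        else:
            merged[feature] = value
    return merged

# e.g. unify({"cat": "np", "agr": {"num": "sg"}}, {"agr": {"per": "3"}})
#   -> {"cat": "np", "agr": {"num": "sg", "per": "3"}}
```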
A Discrete Neural Network Model for Conceptual Representation and Reasoning
Current connectionist models are oversimplified in terms of the internal mechanisms of individual neurons and the communication between them. Although connectionist models offer significant advantages in certain aspects, this oversimplification leads to the inefficiency of these models in addressing issues in explicit symbolic processing, which is proven to be essential to human intelligence. What we are aiming at is a connectionist architecture which is capable of simple, flexible representations of high level knowledge structures and efficient performance of reasoning based on the data. We first propose a discrete neural network model which contains state variables for each neuron in which a set of discrete states is explicitly specified instead of a continuous activation function. A technique is developed for representing concepts in this network, which utilizes the connections to define the concepts and represents the concepts in both verbal and compiled forms. The main advantage is that this scheme can handle variable bindings efficiently. A reasoning scheme is developed in the discrete neural network model, which utilizes the inherent parallelism in a neural network model, performing all possible inference steps in parallel, implementable in a fine-grained massively parallel computer.
Making Conversation Flexible
The goals of the speakers are the motivating force behind a conversation. The differences in these goals, and their relative priorities, account for many of the differences between conversations. In order to be easily understood, however, the resulting conversation must be constrained by the language conventions shared by speaker and hearer. In this paper we describe how the use of schemas for conversational control can be made flexible by integrating the priorities of a system's goals into the process of selecting the next utterance. Our ideas are implemented in a system called JUDIS (Turner & Cullingford, 1989), a natural language interface for an advice-giving system.
When Reactive Planning is Not Enough: Using Contextual Schemas to React Appropriately to Environmental Change
A problem solver operating in the real world must adapt its behavior to an unpredictable and changing problem-solving environment. It must react appropriately to changes in the situation, where what is appropriate depends to a large extent on the overall problem-solving context. In order to do this, the reasoner needs to have explicit knowledge about the context it is in. In our approach, the problem-solving context is represented explicitly as a contextual schema. When presented with a problem, the reasoner finds an appropriate contextual schema, then uses it to influence its problem-solving behavior. The reasoner uses the contextual schema to recognize important changes in the problem-solving situation and to respond appropriately to those changes. Our approach is implemented in the MEDIC program (Turner, 1988b; Turner, in preparation); MEDIC is a medical diagnostic consultant whose domain is pulmonology.
Search in Analogical Reasoning
Analogical reasoning is the process of finding and adapting old solutions to solve new problems. Unlike most analogy work, which has emphasized mapping the analogy to the target problem, we focussed on search for the analogy. Experiments with humans doing analogical reasoning uncovered a search strategy which we call Lambda Search, because of its up and down shape through long-term memory. Lambda Search begins by generalizing on the properties of the target problem and then eventually specializing on the examples of some higher level concept. These ideas were implemented in a computer program named Lambda. Simulations demonstrated that search lessened, and in some cases solved, the problems of mapping and tweaking.
Capturing Intuitions About Human Language Production
Human speech is creative. More specifically, it is an effortful process that starts from a rich input and creates new meaning. Speech is also incremental, as evidenced by pauses and false starts. Existing models of language generation have not adequately addressed these phenomena. This paper presents six principles which specify a design for a cognitively plausible generator: be able to handle non-trivial inputs; be able to access relevant words; consider many words as candidates for potential inclusion; produce an utterance incrementally; use feedback from words; and monitor the output. These principles can be implemented using spreading activation in a semantic network which includes lexical and syntactic knowledge. The prototype generation system FIG is such an implementation.
Learning Semantic Relationships in Compound Nouns with Connectionist Networks
This paper describes a new approach for understanding compound nouns. Since several approaches have demonstrated the difficulties in finding detailed and suitable semantic relationships within compound nouns, we use only a few basic semantic relationships and provide the system with the additional ability to learn the details of these basic semantic relationships from training examples. Our system is based on a back propagation architecture and has been trained to understand compound nouns from a scientific technical domain. The test results demonstrated that a connectionist network is able to learn semantic relationships within compound nouns.
The Role of Intermediate Abstractions in Understanding Science and Mathematics
Acquiring powerful abstractions, i.e., representations that enable one to reason about key aspects of a domain in an economical and generic form, should be a primary goal of learning. The most effective means for achieving this goal is not, we argue, the "top-down" approach of traditional curricula, where students are first presented with an abstraction, such as F=ma, and then with examples of how it applies in a variety of contexts. Nor do we advocate the "bottom-up" approach proposed by situated cognition theorists. Instead, we argue for a "middle-out" approach where students are introduced to new domains via intermediate abstractions in the form of mechanistic causal models. These models serve as "conceptual eyeglasses" that unpack causal mechanisms implicit in abstractions such as F=ma. They are readily mappable to a variety of real-world contexts since their objects and operators are generic and causal. Intermediate abstractions thus give meaning to higher-order abstractions as well as to real-world situations, provide a link between the two, and offer a route to understanding.
Active Acquisition for User Modeling in Dialog Systems
A user model in a natural language dialog system contains knowledge about particular users' beliefs, goals, attitudes, or other characteristics. User modeling facilitates cooperative adaptation to a user's conversational behavior and goals. This paper proposes active strategies for acquiring knowledge about users. Current systems employ passive acquisition strategies, which build a model of the user by making inferences based on passive observation of the dialog. Passive acquisition is generally preferable to active querying, to minimize unnecessary dialog. However, in some cases the system should actively initiate subdialogs with the purpose of acquiring information about the user. We propose a method for identifying these conditions based upon maximizing expected utility.