Semantic Memory
Types of Memory
Memory has two major divisions:
Procedural memory
Procedural memory concerns our memories of how to do things. Procedural memory guides the processes we perform and normally resides below the level of conscious awareness. When needed, procedural memories are automatically retrieved and used for both cognitive and motor skills (e.g. tying laces, driving a car, writing). Procedural memory is created through "procedural learning": repeating a complex activity over and over again until all of the relevant neural systems work together automatically. Implicit procedural learning (learning without conscious awareness) is essential to the development of any motor skill or cognitive activity.
Declarative memory
Declarative memory refers to memory which can be consciously recalled, such as facts and knowledge. Declarative memory is further divided into semantic memory and episodic memory. However, the two types of memory may interact, and it may be the case that they are different types of information stored in the same system rather than separate systems themselves.
Semantic memory refers to the memory of meanings, understandings, and other concept-based knowledge unrelated to personal experiences. Basically, semantic memory is the knowledge of facts, such as knowing what 'x' is. Semantic memory is best described as "what we already understand" while episodic memory is best described as "what has happened to us". Semantic memory consists of networks of associations between concepts. Links between these concepts (nodes) are labelled semantically, and there are at least two types of link in the semantic network: class membership and attributes.
Unlike episodic memory, semantic memory is generally considered to be unaffected by amnesia (a condition in which memory is lost). However, semantic memory is affected by agnosia (a loss of knowledge, which can take the form of losing the ability to recognise objects, persons, sounds, shapes or smells). Furthermore, semantic memory is unrelated to context and personal relevance.
Hierarchical Network Model (HNM)
The Hierarchical Network Model (HNM) by Collins & Quillian (1969) was the first systematic model of semantic memory. The model suggests that semantic memory is organised into a series of hierarchical networks, consisting of nodes and properties. A node is a major concept, such as 'animal', 'bird' or 'canary'. A property (also called an attribute or feature) is, as expected, a property of that concept, for example 'has wings' or 'is yellow'. The model is arranged as a hierarchy, with the more widely encompassing nodes stored on the higher levels. This means that 'fish' is on a higher level than 'fresh water fish', which in turn is on a higher level than 'salmon'. The underlying principle of the model is that of cognitive economy: property information is stored as high up as possible to minimise the amount of information stored in semantic memory. We therefore make inferences in semantic memory. For example, the information that Picasso had knees is not stored in semantic memory. The knowledge that he was a human is stored, and we know that humans have knees, so we can infer that Picasso must have had knees.
When a concept is 'activated' in semantic memory, linked nodes are also 'activated' and relevant data is inferred. For example, people were faster to verify "a canary is yellow" than "a canary has wings". This illustrates that the closer together a concept and a property are in the hierarchy, the faster someone can verify the link between them. The concept (canary) and the property (is yellow) are stored at the same level and are thus activated quickly, but canary and "has wings" are separated by one level (the property is stored with bird), so the reaction time is longer.
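The idea of cognitive economy and inheritance can be illustrated with a small sketch. This is only a toy illustration, not Collins and Quillian's implementation: the dictionaries `ISA` and `PROPERTIES` and the function `levels_to_property` are hypothetical names, and the properties listed are assumed for the canary example. The number of 'is a' links climbed stands in for semantic distance.

```python
# Toy hierarchical network (assumed contents, for illustration only).
# Each concept points to its superordinate through an 'is a' link.
ISA = {"canary": "bird", "bird": "animal", "animal": None}

# Cognitive economy: each property is stored once, at the highest node it applies to.
PROPERTIES = {
    "canary": {"is yellow", "can sing"},
    "bird": {"has wings", "can fly"},
    "animal": {"has skin", "can move"},
}


def levels_to_property(concept, prop):
    """Return how many 'is a' links must be climbed to find prop, or None if absent."""
    levels = 0
    node = concept
    while node is not None:
        if prop in PROPERTIES.get(node, set()):
            return levels
        node = ISA.get(node)  # climb one level and inherit from the parent
        levels += 1
    return None


print(levels_to_property("canary", "is yellow"))  # 0
print(levels_to_property("canary", "has wings"))  # 1
print(levels_to_property("canary", "has skin"))   # 2
```

On the model's account, a larger number of levels corresponds to a longer verification time.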
Collins implemented the hierarchical model of semantic memory in a computer programme that could understand basic text - the Teachable Language Comprehender (TLC).
Evidence in favour
Collins and Quillian (1969, 1972) found that when participants verified statements such as 'a canary is an animal' and 'a canary is a canary', the results were as predicted by the model, in that the greater the semantic distance, the slower the reaction time.
Criticism
Whilst the HNM had the right idea regarding semantic memory using a process of inference, there are several issues with the model in general.
A problem emerges when reaction times for false statements are considered: the greater the semantic distance between the concepts (nodes), the faster the statement is rejected. Collins and Quillian themselves found that 'a canary is a tulip' was rejected faster than 'a canary is a robin'. The model would predict that for the tulip statement the participant would have to search through the whole network before rejecting it, and thus would be slower.
Conrad (1972) investigated whether hierarchical distance, or familiarity, was more influential in determining whether sentences were true or false. This was based on the notion that a slower reaction time for verification questions such as "a canary has skin", compared to "a canary sings", could be more down to the familiarity of the sentences. Conrad controlled the familiarity, and found that hierarchical distance between the subject and the property had little effect on verification time.
Another issue is that the model fails to explain typicality effects: the finding that people are faster to verify category inclusion for typical category members. For example, "a canary is a bird" is faster to verify than "a penguin is a bird", as a penguin is less typical/representative of the category of bird. This is supported by Rosch and Mervis (1975), who investigated the typicality ratings of fruits and found that oranges, apples, bananas and pears were rated as much more typical fruits than olives, tomatoes, coconuts and dates. Rips, Shoben and Smith (1973) found that verification times were faster for more typical or representative members than for more atypical members of their category. This is called the typicality gradient.
Rosch (1973) showed that more typical members share more of the characteristics associated with a category than atypical members; for example, 'a robin is a bird' was verified faster than 'a chicken is a bird'. This suggests that the concepts we use are much more loosely fitted to categories than the HNM proposes. Strong support for this came from McCloskey and Glucksberg (1978). They asked participants 30 'tricky' questions, such as "is a stroke a disease?". Answers were not unanimous among participants, and 11 months later many people had changed their answers to some of the questions. This shows how fuzzy memory can be.
A further issue is that the model only accounts for remembering sentences of a specific form (e.g. 'a bird has wings').
Whilst the HNM is clearly flawed, it is worth remembering that it was the first systematic model of semantic memory, and its influence over later models should not be dismissed.
Teachable Language Comprehender (TLC) Collins and Quillian (1969)
Goal
To comprehend text input by relating it to a large pre-existing semantic network (SN) representing rules already known about the world. Comprehension is equated with successfully relating the input to the network, and learning is accomplished by incorporating any successfully comprehended rules into the SN.
Knowledge representation
Similar concepts are stored closer together than unrelated concepts.
- Concepts are stored as local representations: each concept is stored as a single node and the nodes of related concepts are linked together in a hierarchical fashion.
- Links represent relationships between nodes: ‘is a’ or ‘has’.
- Cognitive economy: minimises the number of representations of a piece of information.
- Links are all or none: no associative strength is attached to them.
- The principle of cognitive economy is used in the storage of properties for a concept. A property is stored at the highest possible node in the hierarchy so that it can be deduced via inheritance for lower nodes, e.g. 'has wings'.
Process of semantic analysis
- Intersection search: as soon as a word is parsed (broken down and analysed), activation spreads from it like a 'plague', with each node it reaches recording the original word and the previous 'victim' so that the path can be retraced. When the spread reaches a node that has already been 'touched', the two words are linked semantically (length of path = semantic distance).
- Semantic Interpretation of input corresponds to the set of linked words. E.g. for the phrase ‘the canary the shark bit had wings’, the semantic network is used to infer that the canary is the owner of the wings, and the shark is the one that bit, because ‘canary’ has ‘wings’ as a semantically linked property, and ‘shark’ has ‘biting’ as a semantically linked property.
- Syntax is only used to check the validity of interpretation. Any input that is not syntactically correct is rejected.
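The intersection search described above can be sketched as a breadth-first spread from two words at once. This is a minimal sketch under stated assumptions, not TLC's actual code: the network `LINKS`, the function `intersection_search` and the helper `_trace` are hypothetical, and links are treated as undirected. Each tag stores the originating word and the previous node, which is what allows the path to be retraced.

```python
from collections import deque

# Hypothetical toy network with undirected links (assumed for illustration).
LINKS = {
    "canary": ["bird", "yellow"],
    "bird": ["canary", "animal", "wings"],
    "shark": ["fish", "bite"],
    "fish": ["shark", "animal"],
    "animal": ["bird", "fish"],
    "yellow": ["canary"],
    "wings": ["bird"],
    "bite": ["shark"],
}


def _trace(tags, node):
    """Follow the 'previous victim' tags from node back to its originating word."""
    path = []
    while node is not None:
        path.append(node)
        node = tags[node][1]
    return path


def intersection_search(start_a, start_b):
    """Spread from both words in turn; return the path where the two spreads meet."""
    if start_a == start_b:
        return [start_a]
    # Each tag records (originating word, previous node).
    tags = {start_a: (start_a, None), start_b: (start_b, None)}
    queue = deque([start_a, start_b])
    while queue:
        node = queue.popleft()
        origin = tags[node][0]
        for neighbour in LINKS.get(node, []):
            if neighbour not in tags:
                tags[neighbour] = (origin, node)
                queue.append(neighbour)
            elif tags[neighbour][0] != origin:
                # The neighbour was already 'touched' by the other word.
                path = _trace(tags, node)[::-1] + _trace(tags, neighbour)
                return path if path[0] == start_a else path[::-1]
    return None


path = intersection_search("canary", "shark")
print(path)           # ['canary', 'bird', 'animal', 'fish', 'shark']
print(len(path) - 1)  # semantic distance: 4
```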
Classes of failure
- Because of its generalisation hierarchy structure, there was nowhere to put any abstract information that didn’t fit into the hierarchy.
- Often made false connections between subjects if they were too general and the syntax was too vague. E.g. ‘he hated the landlord so much that he moved into the house on Brunswick Street’ – TLC would incorrectly associate the landlord and the house.
- Although it is possible to comprehend episodic input using the semantic network, TLC doesn’t incorporate these episodes into the network itself, i.e. it can only learn semantic relationships.
Evidence for
Studies on sentence verification times (Collins and Quillian 1969/72) show good support for the notion that reaction time increases as the semantic distance increases.
Problems for the TLC:
- Collins and Quillian's own data are inconsistent with the model: for false statements, RTs are faster the greater the semantic distance.
- Typicality effects: no associative strength is attached to links, yet Rosch (1973) showed that 'a robin is a bird' is verified faster than 'a chicken is a bird' because the two differ in typicality. Ratings of typicality were robin-bird (1.1) and chicken-bird (3.8) on a 1-7 rating scale.
- There is an alternative explanation for the sentence verification results, namely typicality/representativeness, and there is no evidence for cognitive economy (Conrad 1972).
Conclusion
It is unlikely that the precise representation chosen bears much resemblance to human semantic memory. However, TLC was hugely influential, first in demonstrating that it is possible to model SM and second in influencing the development of subsequent, better models.
The Spreading Activation Model
Collins and Loftus (1975) developed the spreading activation model of semantic memory as a more complex answer to the criticisms of the HNM. It suggests that concepts (nodes) are linked together with different levels of conductivity. The more often two concepts are linked, the greater the conductivity between them. Conductivity may thus be thought of as a measure of the strength of the relation.
Collins and Loftus assumed that semantic memory is organised on the basis of semantic relatedness or semantic distance. In this model elements are linked in a conductive manner: the more often two elements are activated together, the greater their conductivity. This is modelled through shorter links between nodes. The shorter the link, the closer the semantic relation, and so the faster the brain will be at making the connection between the nodes. Whenever a person thinks of, hears or sees a concept, the appropriate node is activated. When a concept is accessed, activation spreads out from that node in all directions, and the higher the conductivity of a link, the faster activation spreads down it. Furthermore, the longer a concept is accessed, the larger the spread of activation. Like neurons, each intersection has a threshold, and activation summates linearly from the different inputs to the node.
The principle of weak cognitive economy is a revised version of Collins and Quillian's cognitive economy principle. It allows information to be stored at a lower node in the hierarchy if the link has been encountered explicitly, even if it is already stored at a higher level. If relations are not stored explicitly, it is still possible to infer them using hierarchical information.
Collins and Loftus claim that there are different types of links, including class membership (a cat is a mammal), subordinate (a cat has fur), prediction (game-play-people) and exclusion (a whale is not a fish). Collins and Loftus suggest that the connections made are not necessarily logical, but are rather based on personal experience.
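The mechanism can be sketched in a few lines. This is a simplified illustration, not Collins and Loftus's own formulation: the `CONDUCTIVITY` values, the `THRESHOLD` and the function `spread` are all assumed for the example. It shows three of the ideas above: stronger links pass on more activation, inputs to a node add up linearly, and a node only passes activation on once it reaches its threshold.

```python
# Hypothetical link conductivities between 0 and 1 (assumed values).
CONDUCTIVITY = {
    "bread": {"butter": 0.8, "food": 0.5},
    "butter": {"bread": 0.8, "food": 0.4},
    "food": {"bread": 0.5, "butter": 0.4},
    "coat": {"clothing": 0.7},
    "clothing": {"coat": 0.7},
}
THRESHOLD = 0.3  # assumed level a node must reach before it passes activation on


def spread(sources, steps=1):
    """Return the activation of every node after spreading from the source nodes."""
    activation = {node: 0.0 for node in CONDUCTIVITY}
    for source in sources:
        activation[source] = 1.0
    fired = set()  # each node passes its activation on only once
    for _ in range(steps):
        incoming = {node: 0.0 for node in CONDUCTIVITY}
        for node, level in activation.items():
            if level >= THRESHOLD and node not in fired:
                fired.add(node)
                for neighbour, weight in CONDUCTIVITY[node].items():
                    # Higher conductivity means more activation travels down the link.
                    incoming[neighbour] += level * weight
        for node in activation:
            activation[node] += incoming[node]  # linear summation of inputs
    return activation


after_bread = spread(["bread"])
print(after_bread["butter"] > after_bread["coat"])  # True
```

In this sketch, accessing 'bread' leaves 'butter' partly activated while 'coat' stays at zero, which is one way to picture why related word pairs are recognised faster than unrelated ones.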
Support
The model can explain the familiarity effect, the typicality effect, and direct concept-property associations.
SAM is supported by studies of priming, in which the speed or accuracy of a response to a stimulus improves when the stimulus follows a semantically related concept. Mackay (1973) demonstrated, for example, how prior context can remove any ambiguity from a phrase (e.g. 'he walked towards the bank') because of these interconnected units of information. Meyer and Schvaneveldt (1971) found that when words are related, reaction times are quicker. They asked participants whether both items in a pair were words or non-words. Participants answered "yes" much more quickly when the words were related (e.g. bread and butter) than when they were not related (e.g. bread and coat). If the words are related, activation from the first word spreads to the second word, making the association much faster than if they are not related.
Criticism
However, the disadvantage is that the theory is unable to predict much, as it is based on the individual. It handles everything and makes very few predictions which are open to empirical testing, making it very difficult to falsify. The model also fails to consider how episodic knowledge or non-propositional knowledge could be stored. There are so many possible parameters in the system that it is possible to fit almost any empirical data. Despite its neurological plausibility, it is not sufficiently constrained to allow it to be implemented reliably.
Conclusion
SAM, however, offers many strengths as a model, as it has clear face validity. It seems a plausible model and gives a good foundation for how the semantic network is built up initially.
For information to be used in a task like recognition, it must first be activated and then inspected. When information is in the LTM, but not currently in the WM, activation must spread to it.
The Fan Effect
The Fan effect causes interference in semantic memory. The more facts that are associated with a 'node', the slower the activation spreads from it, as a node has a fixed capacity for emitting activation. Therefore if there are more links to that node then more time must be taken in order to activate all the links, although this can be sped up if the links are often used and so there is more immediate association.
Anderson (1974) asked participants to learn sentences comprising a subject and a location with a relation between them. For example:
1. The doctor is in the bank
2. The fireman is in the park
3. The lawyer is in the church
4. The lawyer is in the park
Participants were then given a speeded recognition task. They were asked to indicate when they recognised a learnt sentence (the target) amongst other sentences of a similar nature (the distractors). An example of a distractor might be "the doctor is in the park". Anderson found that participants' reaction times were faster when there were fewer shared facts. Reaction time for unique sentences (e.g. "the doctor is in the bank") was 1.11 seconds, compared with 1.22 seconds when the person and the location each appeared in two sentences (e.g. "the lawyer is in the park"). Thus the more facts are associated with a node, the slower the activation spreads from it: this is the fan effect. This has a surprising implication: the more you know, the slower you get. This may be the case, but we can speed retrieval up consciously using procedural memory.
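The logic of the fan effect can be sketched by counting how many studied facts each concept appears in and dividing a fixed amount of activation between them. This is a rough illustration only: the `capacity` value and the function `activation_share` are assumptions, and the sketch does not attempt to reproduce Anderson's reaction times.

```python
from collections import Counter

# The four studied sentences from the example, as (person, location) pairs.
STUDIED = [
    ("doctor", "bank"),
    ("fireman", "park"),
    ("lawyer", "church"),
    ("lawyer", "park"),
]

# The 'fan' of a concept is the number of studied facts it takes part in.
person_fan = Counter(person for person, _ in STUDIED)
location_fan = Counter(location for _, location in STUDIED)


def activation_share(person, location, capacity=1.0):
    """Activation each concept can send to one studied fact (capacity is assumed).

    Only meaningful for concepts that appear in STUDIED.
    """
    return capacity / person_fan[person], capacity / location_fan[location]


print(activation_share("doctor", "bank"))   # (1.0, 1.0)
print(activation_share("lawyer", "park"))   # (0.5, 0.5)
```

Because 'lawyer' and 'park' each appear in two facts, each can send only half as much activation to "the lawyer is in the park", which is the sentence type that was recognised more slowly.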
Limitations
Semantic nodes are very subjective, as people's schemas differ significantly. For example, the word 'apple' might elicit the colour 'green' for one person, yet 'red' for another. Because different people attach different associations to the same node, the number of links, and hence how much activation is slowed, may differ from person to person.
Adaptive Control of Thought (ACT*): Declarative Memory
(Anderson, 1983)
ACT* was built upon the TLC and SAM models of semantic memory. It maintained the idea of “semantic networks” but suggested that “activation” was key to semantic knowledge and memory. The ACT* model was the first complete model of human cognition (a challenging task), and it therefore has a highly complex architecture which allows it to learn. ACT* distinguishes two types of knowledge: declarative knowledge, the knowledge of facts and information, and procedural knowledge, the knowledge we hold in order to perform automatic actions such as driving.
ACT* therefore suggests that memories must be activated from source nodes in working memory. Studies have highlighted that activation takes place automatically, that is, it requires no conscious awareness. The ACT* theory of fact recognition proposes that items remain in LTM permanently but cannot be accessed unless they are 'activated'. The model views working memory differently from earlier accounts (Baddeley and Hitch): 'source nodes' can be located anywhere throughout the brain, and all those that are activated at any one time make up working memory. Being a complex system, ACT* is most easily represented by the “lightbulb analogy”. Think of a floor of interconnected lightbulbs: most are off, some are dim (partially activated) and some are lit brightly (fully activated). At different times, different sections of lightbulbs will be turned off and on. This represents the idea that activation is a continuous function rather than an all-or-none action.
Conclusions on Semantic Memory
There is no doubt that we have some sort of SN in our brains, developing through experience. Semantic memory plays a crucial role in almost any cognitive activity and it is very likely that some sort of spreading activation is involved in accessing this system. However, the system is not constrained enough to allow us to decide which complex model is closest to the truth. We don't have a good understanding of the way other parts of the system work. Semantic memories are learned through experience, so they are idiosyncratic, that is, particular to the individual. Finally, cognitive neuroscience may provide further insight into regional brain activity during SM tasks.
Schemas
A schema is a set of rules on how to behave which can be applied to a situation. We use schemas every day, for instance when we go to a lecture. Our lecture schema tells us that we have to find out where our lecture is, work out how to get there, get there, enter the lecture theatre, find a seat and so on. Bartlett introduced the concept. Schemas are automatic and unconscious processes that occur on a daily basis. Our schemata tend to be long lasting and are not easily changed. If we are introduced to information that contradicts our schemata, we tend to assume that the new information is unique or different rather than believing our schemata are faulty. This links to Piaget's concepts of accommodation and assimilation: accommodation is the process of adapting existing schemata to fit with new information, whereas assimilation involves modifying new information to fit with our pre-existing schemas.
Language, Scripts and Frames
The given-new contract (Clark 1977) describes how, in a conversation, a speaker should provide "not too much or too little" in terms of establishing the context.
1. The information given by the speaker should provide the appropriate context from which new information can be given. To illustrate this, in the example "Jamie went to the shops" - it is "given" that we are talking about "Jamie", as this has been established at the forefront of the utterance. This establishment of context provides a platform for "new" information concerning the fact of where Jamie went - "to the shops".
2. In establishing this context, the speaker should not give more information than needed, only the amount necessary to establish the context of the conversation. This forms the basis of cost-effective communication. For example, when asked for directions by someone foreign, you accommodate their lack of knowledge and give simple directions in detail. In contrast, when giving directions to someone local you may assume they know certain landmarks/places and so will give an appropriate level of information.
3. Enough appropriate information must be given for the listener to be able to make "bridging inferences". This is where the listener is able to refer back to previous elements of the conversation in order to infer what is being discussed in later parts of the conversation. To illustrate this, if we extend the previous example to "Jamie went to the shops with Luke, he bought a loaf of bread", the listener can infer that "he" refers to Jamie by recognising that Jamie is the main subject/actor in the utterance.
4. Lastly, a speaker may include more or less specific information depending upon the knowledge base of the listener. For example, if I bring up the subject of the components of working memory with someone who takes Art, I will most likely provide a generic description of the matter in hand, as I am aware that my listener will only be able to make bridging inferences based on their general knowledge.
The necessity of the given-new contract was highlighted in a study by Bransford and Johnson (1972). They asked participants to read this paragraph:
'The procedure is actually quite simple. First you arrange things into different groups. Of course, one pile may be sufficient depending on how much there is to do. If you have to go somewhere else due to lack of facilities, that is the next step, otherwise you are pretty well set. It is important not to overdo things. That is, it is better to do a few things at once rather than too many. In the short run this may not seem too important, but complications can easily arise. A mistake can be expensive as well.'
When participants were told beforehand that the paragraph was about laundry, they were able to understand it and recall what was said. If participants were not told anything about the passage, they found it difficult to understand. This suggests that having the memory schemata activated led to comprehension and retention of the information.
Frames (Minsky 1976)
A frame is a schema for organising information about a single concept. It has 'slots' that can be filled with 'variables', which can be compulsory or optional. If no information is given explicitly, optional variables may be filled with 'default' values. Once all of these slots are filled, the frame is instantiated. Problems: 'bare' frames don't allow the use of context, so a more global structure is needed that can account for meaningful 'chunks' of life in which frames can be placed. This corresponds to Bartlett's idea of schemas, which have more recently been described as 'scripts'.
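The slot-and-default idea can be sketched as a small data structure. This is a hypothetical illustration rather than Minsky's formalism: the class `Frame`, its method names and the 'doctor visit' slots are all assumed for the example.

```python
class Frame:
    """A frame with compulsory slots and optional slots that carry default values."""

    def __init__(self, name, compulsory, defaults):
        self.name = name
        self.compulsory = set(compulsory)  # slots that must be filled explicitly
        self.defaults = dict(defaults)     # optional slots and their default values
        self.values = {}                   # information given explicitly

    def fill(self, slot, value):
        """Fill a slot with an explicitly given value."""
        if slot not in self.compulsory and slot not in self.defaults:
            raise KeyError(f"{self.name} frame has no slot {slot!r}")
        self.values[slot] = value

    def instantiate(self):
        """Return all slot values, using defaults where nothing was given."""
        missing = self.compulsory - set(self.values)
        if missing:
            raise ValueError(f"unfilled compulsory slots: {sorted(missing)}")
        return {**self.defaults, **self.values}


# Hypothetical frame: the slot names and default are assumptions for the example.
visit = Frame("doctor visit", compulsory=["patient"], defaults={"location": "surgery"})
visit.fill("patient", "Jamie")
print(visit.instantiate())  # {'location': 'surgery', 'patient': 'Jamie'}
```

Nothing was said about where the visit took place, so the optional 'location' slot falls back on its default, much as a reader fills in unstated details from general knowledge.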
Frames don't need to store extra information. Bower, Black and Turner (1979) asked subjects to list the actions normally taken in certain situations, such as going to the doctor's. They found a lot of commonality between scripts, and they also found that the more scripts a participant read, the larger the intrusion of unstated generic script material in recall.
Schank and Abelson (1970-82) Natural Language Understanding
SAM - Script Applier Mechanism
A computer program developed to create a script of the default knowledge which is normally taken for granted in a conversation. The computer creates these scripts so that it can 'understand' the conversation, as it requires the scripts that human beings would normally supply themselves. When given a statement, SAM paraphrases it and can then answer questions on the original statement. The script provides the default information that allows the computer to make the inferences necessary to understand the statement. It is not possible to create a script for every eventuality, so sometimes we make inferences by understanding what is a likely outcome/reaction.
PAM - Plan Applier Mechanism
PAM was created by Schank and Abelson (1970-82). Whenever an inference needs to be made, PAM tries to find a link using a repertoire of possible plans. It is not perfect, as it generates endless possibilities, but it is clear that we use our knowledge of what we might do to infer what others might do.
It has been argued that SAM is a much stronger concept, as it is clear that we constantly face situations where not only our own knowledge is relevant but also the knowledge of others: the ability to put ourselves in 'someone else's shoes'.
Does knowledge help?
De Groot (1965) showed subjects a position on a chessboard before sweeping the pieces off and asking for the position to be replicated. Ten-year-old experts did better than 20-year-old beginners, and there was no difference when the positions were scrambled, suggesting that knowledge was instrumental in reconstruction. Similarly, Voss et al. (1978) found that participants with high baseball knowledge did significantly better in a recall test after reading a baseball passage than a low-knowledge group.
Summary on expertise and links to language and memory
Experts use knowledge to develop abstract, highly specialised mechanisms for systematically encoding and retrieving meaningful patterns from LTM. It allows experts to anticipate information needed for familiar tasks and stores new information in a format that facilitates retrieval.
We are experts in our own social world: we have developed mechanisms for systematically encoding meaningful language information. Spreading activation uses existing declarative memory structures to automatically disambiguate utterances, providing a mechanism by which schemas/scripts may be instantiated in the human brain.
