A. Thorndike
The graph below is called a ______________ and summarizes how quickly cats learned to escape from a puzzle box. For Thorndike's cats, the behaviors that opened the door were followed by certain consequences: __________ and ____________. As a result of these consequences, the cat became _____ likely to repeat the effective escape behaviors, which ultimately decreased the animal's escape latency.

"Of several responses made to the same situation, those which are accompanied or closely followed by satisfaction to the animal will, other things being equal, be more firmly connected to the situation, so that, when it (the situation) recurs, they will be more likely to recur." Thorndike called this the Law ___ ________.

Trial and error or insight? According to Thorndike, learning involves establishing a connection between the discriminative stimulus and the instrumental response (___-___ connection).

instrumental learning: learning the connection between a behavior and its consequence

B. Skinner
Responsible for an enormous increase of interest in instrumental/operant conditioning in the 1940s and 1950s.
How did Skinner's methods differ from Thorndike's?
Discrete-trial procedure vs. free operant procedures
Response latency vs. response rate

II. Behavior, Consequences, & Contingencies
Instrumental/operant contingency: Behavior -> Consequence (further, this relationship is contingent)

A. Reinforcement & Punishment
reinforcement: the contingency that results when the consequence of a behavior causes the future probability of that behavior to INCREASE. Then what is a reinforcer?
punishment: the contingency that results when the consequence of a behavior causes the future rate of that behavior to DECREASE. Then what is a punisher?
positive reinforcement: when reinforcement involves the presentation of a stimulus
positive punishment: when punishment involves the presentation of a stimulus
In this instance, positive does NOT mean __________.
negative reinforcement: when the consequence in a reinforcement contingency is the removal of a stimulus
negative punishment: when the consequence in a punishment situation is the removal of a stimulus
In this instance, negative does NOT mean ___________.
4 key questions to help you determine the type of contingency involved.
B. How might reinforcers and punishers serve an adaptive role in behavior?

C. Application
Operant conditioning in the treatment of alcohol abuse: One effective treatment for alcohol abuse is the community reinforcement approach (CRA) (Azrin, Sisson, Meyers, & Godley, 1982; Hunt & Azrin, 1973).
The Lovaas treatment program (established in the 1960s) uses operant conditioning principles to decrease problem behaviors of autistic children and to increase the frequency of appropriate behaviors (e.g., language and social skills).

Explain the meaning and importance of the following statements:
(1) We define consequences and contingencies by their effects on behavior, not by what we EXPECT their effects to be.
(2) If behavior is followed by a reinforcer, it is the behavior that is reinforced, not the organism.
(3) The apparent contingent relationship between behavior and outcomes can lead to behaviors that do not appear to make sense.
Skinner's classic article

II. Instrumental Conditioning Paradigms
Instrumental vs. Operant Distinction
A. Runways/Mazes
B. Escape & Avoidance Paradigms
C. Operant Procedures: Bar-Press, Key-Peck, Human Operant Responses
Measuring Operant Responses: the rate at which organisms emit the response is usually the dependent variable in most operant conditioning studies.
A cumulative recorder creates a record of __________________ as a function of time.
Shaping procedure: reinforcement of behaviors that are closer and closer approximations of the target response.
Shaping in the Skinner box

III. Positive Reinforcement Situations
Presentation of an event that increases the probability of future behavior. Responding is influenced mainly by characteristics of the reinforcer.

A. Amount/Magnitude
Generally, responding occurs faster and becomes more accurate as we increase the amount of the reinforcer delivered after each response.
E.g., Crespi (1942): trained 5 groups of rats to run down a straight runway for food. IV (independent variable): # of food pellets in the goal box (1, 4, 16, 64, or 256).
**Results: After 20 trials, the running speeds of the groups corresponded directly to the number of pellets received.
Complicating factor: some researchers have disagreed about the definition of reinforcer magnitude. Several studies have shown that if an experimenter gives two groups the same amount of food for each response, but for one group the food is simply broken into more pieces, the group receiving the most pieces will respond faster.
**Quality of reinforcement also matters. We can define quality by assessing how vigorously an organism consumes a reward when it is presented. High-quality reinforcers are consumed quickly by organisms, whereas low-quality reinforcers are consumed less quickly.

B. Delay of Reinforcement
Generally, the longer reinforcement is delayed after a response, the poorer the performance of that response.
***One reason for this difference might be ____________________? (Why would a 10-second delay matter? What could possibly go on during the 10 s delay?)
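One quantitative way to see why delay weakens a reinforcer is delay discounting: Mazur's hyperbolic equation V = A / (1 + kD) describes how the present value V of a reward of amount A shrinks as its delay D grows. The sketch below is only an illustration; the function name and the value of k are my own assumptions, not fitted estimates.

```python
# Hyperbolic delay discounting (Mazur, 1987): V = A / (1 + k*D),
# where V is the present value of a reward of amount A delivered
# after delay D, and k is a discounting-rate parameter.
# The k value here is an arbitrary illustration, not a fitted estimate.

def discounted_value(amount, delay_s, k=0.2):
    """Present value of a reward delayed by `delay_s` seconds."""
    return amount / (1.0 + k * delay_s)

# An immediate reward keeps its full value:
print(discounted_value(amount=10, delay_s=0))    # 10.0
# The same reward after a 10-second delay is worth far less:
print(discounted_value(amount=10, delay_s=10))   # ~3.33
```

Even a modest delay sharply reduces a reward's effective value, which is one reason delayed reinforcement supports weaker responding.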
C. Contrast Effects
The effects of the quantity, quality, and delay of reinforcement on instrumental responding can vary depending on an organism's past experiences with a particular reinforcer.
Crespi (1942): Group 1 was trained to traverse a runway for a large amount of food, Group 2 for a small amount. After several trials, the rats receiving the large reinforcer were running consistently faster than the small-reinforcement group. Then half of the large-reinforcer animals were switched to the small reinforcer being given to the other rats (large-small), and half of the small-reinforcer rats were switched to the large reinforcer magnitude (small-large).
D. Drive
Hull (1949): Instrumental performance is determined by the level of conditioning AND motivational factors. Instrumental conditioning was represented by habit strength (H), which was influenced by the number of reinforced training trials and the delay of reinforcement. Motivation was represented as drive (D) (e.g., number of hours of food deprivation) and incentive (K) (e.g., reinforcer amount and quality).

E. Schedules of Reinforcement
What is the effect of such reinforcement inconsistency on behavior? It is clear that instrumental learning is not dependent on continuous reinforcement. In most species, instrumental responding develops quite efficiently even when reinforcement occurs only intermittently.
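Hull's claim in section D, that performance multiplies habit strength by motivational factors, can be made concrete with a small sketch. The growth function for H below is an illustrative assumption (a simple negatively accelerated learning curve), not Hull's exact equation; only the multiplicative combination E = H x D x K reflects the theory.

```python
# Sketch of Hull's multiplicative model of instrumental performance:
# reaction potential E = H (habit strength) x D (drive) x K (incentive).
# The habit-strength growth curve below is an illustrative assumption.

def habit_strength(reinforced_trials, rate=0.05):
    """Habit strength grows toward 1.0 with reinforced training trials."""
    return 1.0 - (1.0 - rate) ** reinforced_trials

def reaction_potential(trials, drive, incentive):
    """E = H x D x K: performance is zero if ANY factor is zero."""
    return habit_strength(trials) * drive * incentive

# A well-trained but fully sated rat (drive = 0) does not perform:
print(reaction_potential(trials=100, drive=0.0, incentive=1.0))  # 0.0
# Training, deprivation, and a good reward jointly raise performance:
print(round(reaction_potential(trials=50, drive=0.8, incentive=0.9), 3))  # 0.665
```

The multiplicative form captures Hull's key point: learning (H) and motivation (D, K) are separate determinants, and strong habit alone produces no responding without drive or incentive.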
Ratio schedules: deliver the reinforcer only after a certain number of responses.
Fixed ratio (FR)
Variable ratio (VR)
Interval schedules: deliver the reinforcer only for the first response that occurs after a period of time has elapsed.
Fixed interval (FI)
Variable interval (VI)
Self-control: the capacity to inhibit immediate gratification in exchange for a larger reward in the long run.

Types of Reinforcement
Primary reinforcer: events that are capable of producing behavior changes naturally, without the benefit of any prior learning.
Secondary reinforcer: events that function as reinforcers because of their consistent association with one or more primary reinforcers.

Theories of Reinforcement
Generally, theories explaining why reinforcers work fall into one of the following two categories:
1) Stimulus-based theories: something about the particular stimulus makes it reinforcing.
--Hull's drive-reduction hypothesis: stimuli that reduce drives are capable of reinforcing emitted behavior. Learning is adaptive and biological in nature.
One problem: organisms tend to SEEK stimulation (Kish, 1966).
Conclusion: events that reduce drives will function as reinforcers, but an event need not reduce a drive in order to function as a reinforcer.
--Incentive motivation: reinforcers increase drive; reinforcers pull the organism toward them and elicit certain behaviors.
2) Response-based theories of reinforcement: reinforcement depends on the response made possible by some reinforcer.
Premack principle (1965, 1971): any kind of emitted response can serve to reinforce operant behavior. Specifically, any response that is preferred by an organism can serve to reinforce the performance of a less preferred response. Reinforcement is relative.

F. Constraints on Response Learning

IV. Punishment

V. Negative Reinforcement Situations
A.
Escape Learning
Amount of reinforcement: in escape learning, the amount of reinforcement corresponds to the degree to which the aversive stimulation is reduced after a successful response.
Campbell & Kraeling (1953): exposed all rats to the same shock intensity in the runway and reduced this shock by varying degrees in the safe box. That is, all animals received shock reduction after responding, but most animals still received SOME shock in the safe box. The speed of the escape response was a direct function of the degree to which shock was reduced.
**Escape learning depends more on the amount of negative reinforcement (the degree to which aversive stimulation is reduced) than on the intensity of the aversive stimulus per se.
Delay of reinforcement: the time between the escape response and the reduction of the aversive stimulation.
Fowler & Trapold (1962): exposed rats to shock in an alleyway and required the animals to run to a safe box to escape the shock. The delay in shock offset varied between 0 and 16 seconds.
Results: escape speeds decreased as the delay of shock offset increased.

B. Avoidance Learning
A stimulus signals the presentation of the aversive event. The organism must respond in the presence of the signal in order to avoid the aversive event.
Two stimuli: the signal and the aversive stimulus. The characteristics of both stimuli are important determinants of avoidance learning.
Intensity of the aversive stimulus: if the intensity of the aversive stimulus (e.g., shock) is too strong, avoidance conditioning is actually slowed.
Signal-aversive stimulus interval: avoidance learning appears to occur most efficiently when the signal and the aversive stimulus overlap, as is the case in a delayed-conditioning procedure.
Duration of the signal before onset of the aversive event: in general, signals of longer duration tend to facilitate learning of the avoidance response (Low & Low, 1962).
Termination of the signal: even when the organism's response results in avoidance of the aversive stimulus, the rate of learning depends on whether the response also leads to the termination of the signal.
Kamin (1957) conducted a two-way avoidance experiment using rats. All rats were able to avoid shock by moving to the safe chamber during the signal-shock interval. For one group, the signal terminated as soon as the avoidance response occurred (0-second delay). In the other three groups, the signal ended either 2.5, 5, or 10 seconds after the avoidance response.
Classical Conditioning and Instrumental Conditioning in Avoidance Learning
A. Watson-Mowrer Two-Process Theory
Mowrer (1947) proposed that avoidance learning involves two processes: (1) classical conditioning and (2) instrumental conditioning.
(Part 1) Dangerous, painful, aversive stimuli (USs) cause an innate fear response (UR). Other stimuli present at the time become associated with fear through classical conditioning. When these other stimuli (CSs) are encountered again, they evoke a fear response (CR).
(Part 2) The presence of fear, and all of its visceral effects, is aversive. Any response that removes these fear-evoking stimuli will be negatively reinforced. The avoidance response, therefore, is reinforced through instrumental conditioning.
The avoidance paradigm: Solomon and Wynne (1953)
What was a typical trial like? What constituted a session? Describe the behavior of the dogs. What reinforces avoidance behavior?
It was easy to understand how the escape behavior persisted: the termination of the aversive stimulus would be rewarding, so escape behavior was maintained through negative reinforcement. But how was the avoidance response maintained? Why does the animal continue to respond even though it no longer receives aversive stimulation?
Because the electric shock (US) was painful and produced an innate fear response (UR) (increased heart rate, sweating, etc.), the dark side became a feared stimulus (CS) that elicited fear (CR) (increased heart rate, sweating, etc.). These visceral responses are unpleasant, so any response that caused their reduction or elimination (i.e., escape or avoidance) would be reinforced. Mowrer argued that the termination or reduction of fear stimuli negatively reinforced the avoidance response.
Two-process theory predicts that avoidance responding will be learned only to the extent that the warning signal terminates when a response is made. Kamin (1957): trained four groups of rats in a two-chamber avoidance apparatus.
The figure shows that a significant amount of avoidance responding occurred only in the first group (whose response both terminated the signal and enabled the animal to avoid shock). As predicted by two-factor theory, avoidance responding was poor in the group that was able to avoid shock but could not terminate the signal.
We know that delaying the onset of reinforcement reduces the effectiveness of a reward, so it should be possible to reduce the level of reinforcement by introducing a delay between the avoidance response and the termination of the feared stimulus.
Four conditions: after the avoidance response, the CS was terminated (1) immediately, (2) 2.5 seconds after the response, (3) 5 seconds after the response, or (4) 10 seconds after the response. See the Kamin (1957) results graphed above.
As predicted, animals in the zero-delay condition successfully avoided shock on over 80% of the trials; animals in the 10-second delay condition avoided shock on fewer than 10% of the trials.

C. Learned Helplessness
Observational learning: a type of learning that occurs when an organism's responding is influenced by the observation of others, who are called models.
Stimulus generalization: occurs when a stimulus that is similar to an already-conditioned stimulus begins to produce the same response as the original stimulus does. Stimulus discrimination: occurs when the organism learns to differentiate between the CS and other, similar stimuli.
Avoidance learning: occurs when an organism acquires a response that prevents some aversive stimulation from occurring.