Classical Conditioning: Why were Pavlov's dogs so significant?

I don't think it was so much significant as era-defining: it was an experiment that showcased the way people thought about things in the 1950s, and how different this thinking was from earlier centuries, when mystical unknowables like the soul always interposed themselves between the stimulus input and the behavioral output.

The idea here is scientific behaviorism: that we should describe things in terms of observable inputs and observable outputs, without postulating too many things in the middle, like mental states, unless we have to in order to account for those inputs and outputs. This idea is associated with Pavlov and Skinner, and it is a materialistic, no-nonsense, minimalist scientific description, in line with Marxist ideology and with logical positivist ideas about how to describe nature.

Pavlov's dogs showed that if you present a particular stimulus, the dogs will reliably respond in a certain way. This gave a description of the dog's behavior without making postulates about the interior experience of the dog, or any intermediate states the dog passes through. It didn't say "now the dog expects food", or "now the dog is fantasizing in its head"; it said to look for a signal in response to an input, and to treat this as a complete description of what is going on.

This philosophy of behaviorism and logical positivism defined the scientific outlook of the 1950s and 1960s, but it went out of fashion in the 1970s and 1980s. It disappeared, along with Marxism, materialism, logical positivism, and the no-nonsense scientific outlook, in a puff of marijuana smoke.

There were several legitimate reasons for this, other than the marijuana smoke. Some people associated the ideas of behaviorism with the much more radical (and obviously false) idea that the dog just didn't have an internal experience, that there was nothing sophisticated going on in the dog's head at all, that it was all a relatively simple association between input and output that could be modelled with a small stimulus/response table. This idea is clearly false, because the dog has a big computer inside its head, and people have an even bigger and more sophisticated computer, and the computations involved are enormous and can lead to unpredictable results. But people were trying to squeeze as much juice as they could out of the simplest models, so they made the input-output relation as simple as it could possibly be, which was usually too simple to model anything at all.
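To make the caricature concrete, here is a minimal sketch, in Python, of what that radical picture amounts to; the stimulus and response names are invented for illustration, and the whole "theory" is a small lookup table with nothing in between.

```
# A caricature of the radical behaviorist model: behavior as a small
# lookup table from stimulus to response, with no internal state at all.
# The stimulus/response names here are invented for illustration.
STIMULUS_RESPONSE = {
    "bell": "salivate",
    "food": "salivate",
    "loud noise": "startle",
}

def respond(stimulus):
    # No memory, no expectation, no internal computation: just a table lookup.
    return STIMULUS_RESPONSE.get(stimulus, "no response")

print(respond("bell"))    # salivate
print(respond("shadow"))  # no response
```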

For example, behaviorist-inspired models of language were too primitive: the language models of behaviorism were the regular languages, the languages that can be processed by finite-state automata. This is a state-transition model of human language processing, where you have a finite, relatively small number of possible internal states, like "I got a noun, now I am waiting for an adjective phrase" or "I completed the adjective phrase, now I am waiting for a verb", and as you process a sentence, you make transitions between these states and assemble the words into sentences.
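A minimal sketch of this finite-state picture, with a made-up toy vocabulary and made-up state names, might look like the following: the entire model of sentence processing is a finite transition table, with no memory beyond the current state.

```
# A toy finite-state recognizer for sentences of the form
# "the <noun> <verb> the <noun>". Vocabulary and state names are
# invented for illustration; the point is that the whole "grammar"
# is a finite transition table with no memory beyond the current state.
NOUNS = {"dog", "bell", "food"}
VERBS = {"sees", "wants", "hears"}

# transitions: (state, word category) -> next state
TRANSITIONS = {
    ("start", "the"): "expect_noun",
    ("expect_noun", "noun"): "expect_verb",
    ("expect_verb", "verb"): "expect_object",
    ("expect_object", "the"): "expect_object_noun",
    ("expect_object_noun", "noun"): "done",
}

def categorize(word):
    if word == "the":
        return "the"
    if word in NOUNS:
        return "noun"
    if word in VERBS:
        return "verb"
    return "unknown"

def accepts(sentence):
    state = "start"
    for word in sentence.split():
        state = TRANSITIONS.get((state, categorize(word)))
        if state is None:
            return False
    return state == "done"

print(accepts("the dog wants the food"))  # True
print(accepts("the dog the wants food"))  # False
```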

One source of opposition to behaviorism came from Chomsky, who noticed, along with Schutzenberger and the computer scientists, that human sentences can in principle have arbitrarily deep embedding, so you can make the sentence:

"I walked to the place that was behind the place that was in front of the place that was behind the place that was in front of te place that was behind of the place that was in front of my father's house."

and there is no obstacle to going arbitrarily deep. So the processing must be potentially infinite. Chomsky and Schutzenberger identified the proper model for these sorts of things, which is the context-free grammars: the languages that can be recognized by a finite-state machine equipped with an unbounded stack. Human stacks are not really infinite, but the model was better at describing what is going on in recursive language processing, and so it was considered a refutation of behaviorism: it gave an infinite model of the transitions which was sort of "idealistic", rather than "materialistic".
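Here is a rough sketch of the context-free picture applied to the embedded-clause sentence above; the grammar and the recognizer are toy illustrations, but they show how a fixed, finite program plus a stack (here, the call stack) handles embedding of any depth without modification.

```
# A sketch of the context-free picture, using the embedded-clause example
# from the sentence above. The grammar rule, informally, is:
#   PLACE -> "the place that was behind" PLACE
#          | "the place that was in front of" PLACE
#          | "my father's house"
# A recursive-descent recognizer like this is a finite-state machine plus
# a stack (the call stack), and it handles embedding of arbitrary depth
# with no change to the code.
def parse_place(words, i=0):
    """Return the index just past a PLACE phrase starting at words[i], or None."""
    if words[i:i + 3] == ["my", "father's", "house"]:
        return i + 3
    if words[i:i + 5] == ["the", "place", "that", "was", "behind"]:
        return parse_place(words, i + 5)
    if words[i:i + 7] == ["the", "place", "that", "was", "in", "front", "of"]:
        return parse_place(words, i + 7)
    return None

def accepts(sentence):
    words = sentence.split()
    return parse_place(words) == len(words)

# Any depth of embedding is accepted by the same small grammar:
deep = "the place that was behind " * 10 + "my father's house"
print(accepts(deep))                  # True
print(accepts("my father's house"))  # True
```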

These disputes are kind of dated. Sure, the stack grammars are more correct for modeling human languages, but you can obviously turn a stack grammar with a finite stack into a regular grammar, just because there are only finitely many things you can store on a finite stack. A stack of not-too-great depth can model the sentences that human beings will realistically encounter, and you can always add a little more depth when you need it.
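A sketch of why this works, using a toy pushdown recognizer (here for the classic context-free pattern a^n b^n, with an arbitrary depth bound chosen for illustration): once the stack is capped, there are only finitely many (state, stack) configurations, and enumerating them yields an ordinary finite transition table, which is to say a regular grammar.

```
# A sketch of the observation that a finite stack gives back a finite-state
# machine. We take a tiny pushdown recognizer for the context-free language
# a^n b^n (n >= 1), cap the stack at MAX_DEPTH, and enumerate every reachable
# (state, stack) configuration into an ordinary finite transition table.
# MAX_DEPTH is an arbitrary bound chosen for illustration.
MAX_DEPTH = 3

def step(config, symbol):
    """One move of the bounded pushdown recognizer; None means reject."""
    state, stack = config
    if state == "push" and symbol == "a" and len(stack) < MAX_DEPTH:
        return ("push", stack + ("A",))
    if state in ("push", "pop") and symbol == "b" and stack:
        return ("pop", stack[:-1])
    return None

def build_finite_automaton():
    """Enumerate all reachable configurations: the result is a finite set of
    states with a finite transition table, i.e. a regular grammar."""
    start = ("push", ())
    states, transitions, frontier = {start}, {}, [start]
    while frontier:
        config = frontier.pop()
        for symbol in "ab":
            nxt = step(config, symbol)
            if nxt is not None:
                transitions[(config, symbol)] = nxt
                if nxt not in states:
                    states.add(nxt)
                    frontier.append(nxt)
    return states, transitions

states, transitions = build_finite_automaton()
print(len(states), "states,", len(transitions), "transitions")  # a finite machine

def accepts(word):
    config = ("push", ())
    for symbol in word:
        config = transitions.get((config, symbol))
        if config is None:
            return False
    return config == ("pop", ())  # accept when every "a" has been matched by a "b"

print(accepts("aaabbb"))    # True
print(accepts("aaaabbbb"))  # False: deeper than the bounded stack allows
print(accepts("aabbb"))     # False
```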

The dispute here was really over the materialist conception that the world can be understood from relations between inputs and outputs, so that all you need are the observations to deduce what is happening; you don't need to postulate metaphysical entities like God, or the invisible hand, or the notion of human rights, or some other abstraction. Just focus on the practical stuff. This is what Marxism kept on trying to do.

The resolution to this dispute is to simply point out that the inputs and outputs might be small, but the internal computations required to produce these outputs are still enormous. Hamlet may be only a few tens of kilobytes when encoded efficiently, but producing it required unimaginably large computations in the author's head: visualizing the scenes, understanding the nuances of the phrases, making the language sonorous by hearing it in the head, and so on. These things are not directly in the output; they are in the interaction of this output with sophisticated computations in the reader and the author.

So while the only evidence of these computations is the small input and the small output, the computations required to relate the one to the other are immense, mind-bogglingly large.

Pavlov's dogs were significant because they reduced the behavior of dogs to the behavior of switches, to an insignificantly tiny computation, and the implication was that the behavior of people is not much more sophisticated. In this sense the program was false, and it was already clear that these models were far too primitive: not because computation is the wrong language, but because the size of the computations involved in the behaviorist descriptions is too small by many, many orders of magnitude.