Saturday, August 29, 2015

♫ Summer Running: Not Very Fast! / Summer Running: Pain in my Ass! ♫


Every couple of days I force myself to go outside and run about 2-4 miles. I do not enjoy it. It makes me feel like I am dying every time; I gasp and wheeze and, even after showering, I stay uncomfortably sweaty for a few hours. Worse still, I do not feel "energized" or whatever other vital sensations people claim to derive from exercise; if anything, I feel especially fatigued afterwards, and this only gets more pronounced as the day progresses. However, I have convinced myself that the benefits of cardiovascular exercise outweigh its many miseries; I will go into my whys and wherefores later and try to convince you too (it could just be that I am an insane person); but to start, I want to discuss the 'how' and the 'what'.

I started keeping track of this gruelling ordeal using an Android app (RunKeeper); it uses GPS and links up with Google Fit and is a terrible invasion of my privacy that has probably somehow already sent my info to every extant insurance company and devastated my future premiums. Indeed, those premiums probably also get incremented with each new symptom-related Google search ("knees hurt a lot", "gasp and wheeze", "how much sweating is normal", etc.). Fact is, information pertaining to my health now exists in the ether, and with supply, demand, and end-user licensing agreements being what they are, someone savvy can get it if they want it badly enough; still though, like Blogger, the app is awfully convenient. I tried a couple of others (that didn't look as eager to sell your soul) and found them to be complete shit, functionally. RunKeeper is good at what it does; it currently operates with a freemium model, and evidently if you pay a little you get better stats. I used the basic free version and just manually entered everything into a spreadsheet--it took less than half an hour.

I started running in late March, but I didn't really seriously commit until June (see histogram). Since then, across 44 different running events, I have travelled 103.72 miles and wasted 13 hours and 22 minutes doing so. There are two basic routes I would run: a short route (~1.7 miles) and a long route (~3.4 miles).
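As a quick check on the arithmetic, here is a minimal R sketch that turns the totals above into an overall pace and formats seconds per mile as minutes and seconds. The helper name fmt_pace is hypothetical (it is not part of the code used for this post), and the 440-second example is just an illustrative input.

```r
# Hypothetical helper: format a pace given in seconds per mile as "m:ss".
fmt_pace <- function(sec_per_mile) {
  s <- as.integer(round(sec_per_mile))  # round first so we never print ":60"
  sprintf("%d:%02d", s %/% 60L, s %% 60L)
}

# Totals reported above: 103.72 miles in 13 hours and 22 minutes.
total_seconds <- 13 * 3600 + 22 * 60
overall_pace  <- total_seconds / 103.72   # seconds per mile over all runs

fmt_pace(overall_pace)  # pace across every logged mile
fmt_pace(440)           # 440 seconds per mile expressed as m:ss
```

Rounding to whole seconds before splitting into minutes and seconds is a deliberate choice; rounding the remainder afterwards can produce a nonsensical ":60".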

So far, my average pace on the short route is 7:20/mile (440 seconds) with a standard deviation of 26 seconds, while my average pace on the long route is 7:45/mile (465 seconds) with a standard deviation of 20 seconds. On my fastest, I averaged 6:55/mile for 1.7 miles (update 8/30: new best time of 6:48/mile for 1.7 miles). On my slowest, I averaged 8:25/mile for 3.4 miles. Here's a graph showing my improvement over time.


Significant improvement over time, which was expected. A more interesting question is whether my improvement was greater for short runs or long runs.

Separate regression equations were fit for both long and short runs:

AveragePace(Short) = 471.647 - 1.69*(RunOrder)
AveragePace(Long) = 506.288 - 1.47*(RunOrder)

A quick test of differences between slopes would be this:
Z = (b1-b2)/Sqrt((SEb1)^2 + (SEb2)^2)

This gives:
> (-1.6882- -1.4735)/sqrt(.2348^2+.2595^2)
[1] -0.6135005
> pnorm(.Last.value)
[1] 0.2697727

So nope, the slopes don't differ significantly (one-tailed p = .27).
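The slope comparison above can be wrapped in a small function. This is only a sketch: the function name slope_diff_test is made up for illustration, it assumes the two fits come from independent sets of runs, and it uses the normal approximation (with 25 and 14 residual degrees of freedom, a t reference distribution would be a bit more conservative). It also reports the two-tailed p-value, which is the more usual choice when no direction was predicted in advance.

```r
# Hypothetical helper: z-test for the difference between two regression slopes,
# given each slope estimate and its standard error.
slope_diff_test <- function(b1, se1, b2, se2) {
  z <- (b1 - b2) / sqrt(se1^2 + se2^2)
  list(z = z,
       p_one_tailed = pnorm(-abs(z)),       # area in one tail
       p_two_tailed = 2 * pnorm(-abs(z)))   # area in both tails
}

# Slopes and standard errors taken from the two lm() summaries in this post.
res <- slope_diff_test(b1 = -1.6882, se1 = 0.2595,
                       b2 = -1.4735, se2 = 0.2348)
round(unlist(res), 4)
```

Either way the conclusion is the same: the two-tailed p-value is simply double the one-tailed value and is nowhere near .05.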

R-squares were large (.63 and .74, respectively), indicating that a significant amount of the variance in pace measurements is attributable to practice or the passage of time.  Looking at the graph, there appear to have been three pretty precipitous drops in average pace: initially for the short runs, and then again for the short runs after about 30 running events, while the drop in average pace for my long runs occurred just after my 20th running event and didn't seem to affect the short runs.  I am happy enough with this; without getting into time series or forecasting (though here's a great tutorial), I checked my residual autocorrelations (ACF) and everything looked OK.




I'm sure I look ridiculous when I run--I wear cut-off jeans, tattered old t-shirts, and my $15 Costco-brand running sneakers. But this is intentional! First, I like the feeling of getting extra use out of my holey old clothes by using them as a running costume. Second, they are positively indecent and wholly unwearable, even to sleep in--out on the block this is another incentive NOT to stop running, indeed not even to slow down!

Why do I do this? Because I am by nature quite sedentary, and evidently this means I am going to get several diseases, my brain is going to atrophy, and I will die quite prematurely. Because I am scared of these things happening, I have been following this self-imposed routine of aerobic hell rather sedulously for the past few months. During the school year I can tell myself convincing stories about how my daily 5-minute bike rides to and from the bus stop really add up: "surely this is a sufficient amount of exercise". But during the summer, when I can easily remain seated in the same place for the entire day, even these weak rationalizations break down. Running is the easiest means of cardio-ing; you can do it anywhere there's a sidewalk.

Running appears to enhance cognitive performance in healthy individuals.
This Wikipedia article provides an excellent summary, but I'll talk about a few specific studies below. Smith et al. (2010) analyzed 29 studies that tested the association between neurocognitive performance and aerobic exercise; they found that individuals who had been randomly assigned to aerobic exercise conditions improved in attention, processing speed, executive function, and memory. Here's a Psychology Today page about another study on the relationship between cardiovascular fitness and intelligence in young adulthood (spoiler: it's very positive).

Not only that, but cardio also appears to be optimal for longevity. VO2 max, the gold-standard measure for cardiovascular fitness, is a good predictor of life expectancy; the higher it is, the lower your risk of "all cause mortality" and cardiovascular disease. The good news is, VO2 max is trainable, especially if interval training is used!

This isn't just me showcasing studies that confirm my beliefs--here's an excerpt  about the relationship between exercise and cognitive function from the recent textbook "Memory" by three leaders in the field (Baddeley, Eysenck, and Anderson, 2014):
"The evidence is much stronger for a positive effect of exercise on maintaining cognitive function. In a typical study, Kramer, Hahn, Cohen, Banich, McAuley, Harrison, et al. (1999) studied 124 sedentary but healthy older adults, randomizing them into two groups. One group received aerobic walking-based exercise, while the control group received toning and stretching exercises. The groups trained for about an hour a day for 3 days a week over a 6-month period. Cognition was measured by a number of tests including task switching, attentional selection, and capacity to inhibit irrelevant information. They found a modest increase in aerobic fitness, together with a clear improvement in cognitive performance. A subsequent meta-analysis of a range of available studies by Colcombe and Kramer (2003) found convincing evidence for a positive impact of aerobic exercise on a range of cognitive tasks, most notably those involving executive processing."


Honestly though, I feel like the amount of car exhaust I have to breathe on my runs probably greatly offsets any potential gains of cardiovascular exercise. Especially when I read horrifying things about how even sitting in traffic can cause brain damage and how living near a busy road increases the risk of birth defects. I sure hope I'm not running right into the very outcomes I intended to run away from!



Here's some R-code I used for this post:
> sd(data1$AvgPace)
[1] 26.28296
> sd(data2$AvgPace)
[1] 20.17638
> mean(data1$AvgPace)
[1] 440.1333
> mean(data2$AvgPace)
[1] 464.6625
> summary(fit1)

Call:
lm(formula = data1$AvgPace ~ data1$Order)

Residuals:
    Min      1Q  Median      3Q     Max
-25.918 -10.718  -2.435   9.947  31.418

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 471.6471     5.7739  81.687  < 2e-16 ***
data1$Order  -1.6882     0.2595  -6.507 8.16e-07 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 16.33 on 25 degrees of freedom
Multiple R-squared: 0.6287, Adjusted R-squared: 0.6139
F-statistic: 42.34 on 1 and 25 DF, p-value: 8.157e-07

> summary(fit2)

Call:
lm(formula = data2$AvgPace ~ data2$Order)

Residuals:
     Min       1Q   Median       3Q      Max
-15.9043  -7.8003  -0.2513   6.0742  19.6548

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 506.2880     7.1511  70.799  < 2e-16 ***
data2$Order  -1.4735     0.2348  -6.276 2.04e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 10.69 on 14 degrees of freedom
Multiple R-squared: 0.7378, Adjusted R-squared: 0.719
F-statistic: 39.39 on 1 and 14 DF, p-value: 2.036e-05

plot(data$Order,data$AvgPace,col=data$LongShort, main="Average Pace over Time (ordinal)", ylab="Average Pace (seconds)", xlab="Order (1 = first run, 2 = second run, ... , 44 = most recent run)")
legend('bottomleft', legend = levels(data$LongShort), col = seq_along(levels(data$LongShort)), pch = 1)
# the x-axis is run order, so the fitted lines and traces use Order (not Date)
abline(lm(data1$AvgPace~data1$Order),col="red")
abline(lm(data2$AvgPace~data2$Order),col="blue")
lines(data1$Order,data1$AvgPace,type="l")
lines(data2$Order,data2$AvgPace,type="l")

plot(Date, AvgPace, t="l", xaxt="n", xlab="")
axis(1, at=Date, labels=FALSE)
text(x=seq(1,44,by=1), par("usr")[3]-6.5, labels=labs, adj=1, srt=45, xpd=TRUE)

hist(Month1, xlab="Month (March=3, April=4,...)",main="Logged Running Events per Month",breaks=seq(3,9,1),right=F,labels=T)

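library(forecast)  # provides Acf(); base R's acf() is a close substitute
par(mfrow=c(2,2))
# res3 and res4 hold the residuals from the short-run and long-run fits
plot.ts(res3,ylab="res (AvgPace - SHORT)",main="residual autocorrelation (short runs)")
abline(0,0)
Acf(res3)
plot.ts(res4,ylab="res (AvgPace - LONG)",main="residual autocorrelation (long runs)")
abline(0,0)
Acf(res4)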

Friday, August 28, 2015

Summary/Review of "How Can The Mind Occur in The Physical Universe?"

"...There is this collection of ultimate scientific questions, and if you are lucky to get grabbed by one of these, that will just do you for the rest of your life. Why does the universe exist? When did it start? What’s the nature of life?...
The question for me is how can the human mind occur in the physical universe. We now know that the world is governed by physics. We now understand the way biology nestles comfortably within that. The issue is how will the mind do that as well." 
-Allen Newell, December 4, 1991
I found out about John R. Anderson almost immediately upon discovering intelligent tutoring systems a few years ago; he and his research group at Carnegie Mellon have blazed the way forward with these technologies. Their Cognitive Tutor, for example, is currently #5 out of 39 interventions in mathematics education, as evaluated by the US Department of Education's "What Works Clearinghouse". I learned that, notwithstanding these educational pursuits, his life's work had been more about developing a "cognitive architecture" -- a model of how the structure of the mind and its components work together to achieve human cognition. I learned that he called it ACT-R (for "adaptive control of thought - rational") and that it has been steadily undergoing refinements since it debuted in the early 70s. Anyway, given how amazed I was with his tutoring-systems research, I was naturally drawn to Anderson's 2007 book that surveys his life's work in attempting to answer the titular question via ACT-R.

I'm moved to blog this because I was extremely impressed by (1) the synthesis of seemingly disparate phenomena (ACT-R is very consistent with a wide range of findings in cognitive psychology), and (2) how well his theories map onto findings from neuroscience. This book contains the most convincing model of human cognition I know of, but it is spread out across several chapters and compartmentalized in such a way that I feel I can unbox everything and tie it all together here in a more readily intelligible, coarse-grained fashion. It really is amazing, but I understand if you don't want to sit here and read a whole long synopsis. For this reason, I will now post verbatim a summary given by Anderson at the end of the book (though before he talks about consciousness), so that you can make an informed decision about whether to read further.
1. The answer [to the title question] takes the form of a cognitive architecture—that is, the specification of the structure of the brain at a level of abstraction that explains how it achieves the function of the mind.
2. For reasons of efficiency of neural computation, the human cognitive architecture takes the form of a set of largely independent modules (e.g., figure 2.2) associated with different brain regions.
3. Human identity is achieved through a declarative memory module that, moment by moment, attempts to give each person the most appropriate possible window into his or her past.
4. The various modules are coordinated by a central production system that strives to develop a set of productions that will give the most adaptive response to any state of the modules.
5. The human mind evolved out of the primate mind by achieving the ability to exercise abstract control over cognition and the ability to process complex relational patterns.

The Modular Nature of Mind and Brain

The function of a cognitive architecture, according to Anderson, is "to find a specification of the structure of the brain that explains how it achieves the function of the mind." He argues that connectionist models of cognition will never be able to completely account for human cognition as a whole:
"This is because the human mind is not just the sum of core competences such as memory, or categorization, or reasoning. It is about how all these pieces and other pieces work together to produce cognition. All the pieces might be adapted to the regularities in the world, but understanding their individual adaptations does not address how they are put together."
Though many cognitive phenomena are certainly connectionist in nature, there is also no question that the brain is more than a uniform network of individual neurons. Much in the way that a cell is functionally partitioned into organelles, or that an organism comprises interconnected organ systems that each carry out characteristic tasks, the brain too has modularized certain functions, as evidenced by unique regions of neural anatomy associated with the performance of different tasks. The brain isn't just one huge undifferentiated mass! Neurons that perform related computations occur close together by reason of parsimony: the further apart they are, the longer it would take for them to communicate. Thus, computation in the brain is local and parallel; different regions perform different functions in the service of cognition, though at a lower level the functionality of any given brain region is connectionist in nature. Indeed, almost all systems whose design is meant to achieve a function show this kind of hierarchical organization (Simon, 1962).

If the brain devotes local regions to certain functions, this implies that we should be able to use brain-scanning procedures to find regions that reflect specific activities. The ACT-R cognitive architecture proposes 8 basic modules, and has mapped them onto specific brain regions through a series of fMRI experiments.
The eight modules (four peripheral and four central), plus their associated brain regions, are as follows: (1) Visual - processing of attended information in the fusiform gyrus; (2) Aural - secondary auditory cortex; (3) Manual - hand motor/sensory region of central sulcus; (4) Vocal - face/tongue motor/sensory region of central sulcus; (5) Imaginal - mental/spatial representation area in posterior parietal cortex; (6) Declarative - memory storage/retrieval operations in prefrontal cortical areas; (7) Goal - cognition directed by anterior cingulate cortex; and (8) Procedural - integration and selection of cognitive actions through the basal ganglia. A single fMRI study (Anderson et al., 2007) demonstrated the exercise of all of these modules and their associated brain regions. For our purposes, two of these modules are worth considering in more detail.

While the many regions of the brain do their own separate processing, they must act in a coordinated manner to achieve cognition. Thus, many regions of localized functionality are interconnected by tracts of neural fibers; particularly important are the connections between the cortex (the outermost region of the brain) and subcortical structures. One subcortical area in particular, the basal ganglia, is innervated by most of the cortex and plays a major role in controlling behavior through its actions on the thalamus. It marks a point of convergence across brain regions, compressing widely distributed information into what is effectively a single decision point. Thus, the basal ganglia is believed to be the main brain structure involved in action selection, or choosing which of many possible behaviors to perform in a given instance. Like their associated brain regions, the ACT-R modules must be able to communicate among each other, and they do so by placing information in small-capacity buffers associated with each of them. The procedural module plays the role of the basal ganglia by responding to patterns of information in these buffers and producing action. Though all modules are capable of independent parallel processing, they have to communicate via the procedural module, which can only execute a single rule/action at a time, thus forming a serial "central bottleneck" in overall processing.
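To make the "serial bottleneck" idea concrete, here is a toy R sketch of buffers plus a rule-matching loop. Everything in it is an assumption made for illustration: the buffer names, the two rules, and the "first matching rule wins" policy are mine, not ACT-R's actual software or conflict-resolution scheme. The only point is that, however much the modules could compute in parallel, exactly one rule fires per cycle.

```r
# Toy illustration only: module buffers are a named list and productions are
# condition/action pairs. All names here are hypothetical.
buffers <- list(goal = "add", visual = "3 + 4", retrieval = NULL, manual = NULL)

productions <- list(
  list(name = "request-sum",
       condition = function(b) identical(b$goal, "add") && is.null(b$retrieval),
       action    = function(b) { b$retrieval <- "7"; b }),
  list(name = "type-answer",
       condition = function(b) !is.null(b$retrieval) && is.null(b$manual),
       action    = function(b) { b$manual <- paste("press", b$retrieval); b })
)

# One cycle: scan the rules and fire at most ONE whose condition matches.
run_cycle <- function(b, productions) {
  for (p in productions) {
    if (p$condition(b)) {
      return(list(buffers = p$action(b), fired = p$name))
    }
  }
  list(buffers = b, fired = NA_character_)  # nothing matched
}

# Keep cycling until no rule matches, recording the order of firing.
fired <- character(0)
repeat {
  step <- run_cycle(buffers, productions)
  if (is.na(step$fired)) break
  buffers <- step$buffers
  fired <- c(fired, step$fired)
}
fired
```

In this sketch the rules fire strictly one after another, with each rule reading from and writing to the buffers, which is the sense in which the procedural module serializes otherwise parallel processing.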

So the basal ganglia plays the role of a "coordinating module". Appropriately, this region is evolutionarily older than the cortex and it occurs to some extent in all vertebrates. The other module I wanted to consider is the Goal module, which enables means-ends analysis. This requires that one be able to disengage from what one wants (the goal, or "end") in order to focus on something else (the "means"). Some researchers (Papineau, 2001) assert that this is a uniquely human capability.

So, where are we at? The human mind is thought to be partitioned into specific information-processing functions, and thankfully neuroanatomy appears to be cut along similar joints, with specific brain regions devoted to different functions and interconnections that provide for coordination among these functions. Having posited a cognitive architecture based on interacting modules, Anderson turns next to the nitty-gritty of learning and memory.

Learning and Memory in ACT-R

Above, I mentioned a "Declarative" module as being among the central modules posited by ACT-R. Anderson's fundamental claim is that "declarative memory tries to give us, moment by moment, the most appropriate possible window into our past," and "this window into our past gives us our identities."

He assumes the well-documented distinction between declarative learning, or learning of "facts", and procedural learning (skill acquisition). He doesn't, however, make Tulving's (1972) episodic/semantic distinction; instead he considers both to be explicitly learned in a given context, with the difference being that the "semantic" memory (such as "Lincoln was a U.S. president") has been encountered in so many subsequent contexts that we no longer have access to the context in which it was originally learned. Declarative memories can be strengthened, or made more available, by mere exposure.

In addition to the formation and strengthening of declarative memories, there is also procedural learning and subsequent conditioning of these actions. An example he gives is typing: we all know how to type, but we would have a difficult time if asked to give the location of a certain key on a keyboard (without using our fingers as an aid or relying on a common mnemonic like "the home row" or "qwerty"). Conditioning is how all animals learn that certain actions are more effective in certain situations through experience; these can be procedural actions or innate tendencies. Procedural knowledge is associated with the basal ganglia and will be discussed in greater detail below; for now, we will stay with declarative learning.

Interestingly, there are two ways of acquiring declarative memories. This can be illustrated by anterograde amnesiacs like H.M., who, despite the loss of the hippocampus (and the ability thereby to form new memories), was able to learn about famous people such as John F. Kennedy and others who became famous after his surgery. Recent researchers have postulated two different learning systems: while the hippocampus is known to subserve most declarative learning, other brain structures can slowly acquire such memories through repetition (presumably how H.M. came to know about famous people). Furthermore, through rehearsal, memories can be slowly transferred from the hippocampus to neocortical regions, explaining why those with a damaged or missing hippocampus can still access older memories (which are presumed to have undergone such transfer). So, while the hippocampus limits the capacity of declarative memory, it does not limit all learning.

I've long been confused about the relative finitude of memory, but Anderson makes a strong case for there being definite limits on the size of declarative memory. Beyond physical limits of sheer size and metabolic costs, he makes the interesting claim that the very flexibility of our memory-search ability derives from it being strategically limited, "throwing out" memories that are unlikely to be needed: "declarative memory, faced with limited capacity, is in effect constantly discarding memories that have outlived their usefulness".

Alongside Lael Schooler, Anderson (1991) researched the fundamental mechanisms of declarative memory. They found that if a memory has not been retrieved in a while, it becomes increasingly unlikely that it will be needed in the future. Indeed, there is a simple relationship between how likely a memory would be needed on a given day and how long it had been (t) since the memory was last used:
Odds needed = A t^{-d}

Where A is just a constant and d is the decay rate. Each time a memory was accessed, it added an increment to the odds that it would be needed again, with these increments all decaying according to a power function. Thus, if an item occurred n times, the odds of it appearing again are

Odds = ∑_{k=1}^{n} A t_k^{-d}

Where t_k is the time since the kth practice of an item. Thus, the past history of memory use predicts the odds that the memory will be needed. But the context of the current situation is involved as well. It turns out that memory availability is adjusted as a function of context; e.g., you will have an easier time remembering, say, your locker combination in the locker room than you would if someone were to randomly ask you for it elsewhere (Schooler and Anderson, 1997). Thus, human memory reflects the statistics of the environment and performs a triage on memories, devoting its limited resources to those that are most likely to be needed. How is this fact realized in ACT-R?
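The summed power-law decay described above is easy to play with in R. This is a minimal sketch: the function name need_odds is hypothetical, and the values A = 1 and d = 0.5 are placeholders I chose for illustration, not estimates reported in the text.

```r
# Hypothetical helper: odds that a memory is needed, given the times elapsed
# since each of its past uses (any consistent unit), a scale constant A,
# and a decay rate d. Each past use contributes A * t^(-d).
need_odds <- function(times_since_use, A = 1, d = 0.5) {
  sum(A * times_since_use^(-d))
}

need_odds(4)             # used once, 4 time units ago
need_odds(c(1, 4))       # used twice: 1 and 4 time units ago
need_odds(c(1, 4) + 20)  # the same history, 20 time units later
```

Two things fall out of the formula: every additional use adds to the odds (a practice effect), and all contributions shrink as time passes without a retrieval (forgetting).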

In ACT-R, the "past" that is available in the form of memories consists of the information that existed in the buffers of various modules. At any given moment, countless things are impinging on the human sensorium, of which we only remember a very small fraction. For instance, ambient sounds or things in the visual periphery certainly undergo processing in various brain regions, but they are seldom attended to and thus often never make it into buffers. The system is "aware" only of the chunks of information in the various buffers, and these chunks get stored in declarative memory. These chunks have activation values that govern the speed and success of their retrieval. Specifically, a given memory has an inherent, base-level activation, plus its strength of association to elements in the present context.

Since the odds of needing a memory can be considered the sum of a quantity that reflects the past history of that memory and the present context, we can represent this in Bayesian terms as

 log[prior(i)] + ∑(j∈C)log[likelihood(j|i)] = log[posterior(i|C)]

Where prior(i) is the base-level activation, or the prior odds that memory i would be needed based on factors such as recency/frequency of use, likelihood(j|i) is the likelihood ratio that element j would be part of the context given that memory i is needed (reflecting strength of association to the current context), and posterior(i|C) is the updated odds that memory i will be needed in context C.
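A small numeric example may help here. The sketch below just adds logs as in the equation above; the function name posterior_log_odds and the numbers fed to it (prior odds of 0.5, likelihood ratios of 2 and 3) are made up for illustration.

```r
# Hypothetical helper: log posterior odds that memory i is needed, given its
# prior odds and one likelihood ratio per element j of the current context C.
posterior_log_odds <- function(prior_odds, likelihood_ratios) {
  log(prior_odds) + sum(log(likelihood_ratios))
}

# A memory with prior odds 0.5, in a context with two elements that are
# 2x and 3x more likely to be present when the memory is needed.
log_post <- posterior_log_odds(prior_odds = 0.5, likelihood_ratios = c(2, 3))
exp(log_post)  # back on the odds scale: 0.5 * 2 * 3
```

Working on the log scale is what turns the product of odds and likelihood ratios into the simple sum that ACT-R treats as activation.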



I'll give the basic ACT-R memory equations without going into them much further. The main point is that memory is responding to two statistical effects in the environment: (1) the more often a memory is retrieved, the more likely it is to be retrieved in the future. This produces a practice effect and is reflected in ACT-R's base-level activation. Secondly, (2) the more memories associated with a particular element, the worse a predictor the element is of any particular memory. This is reflected in the strengths of association in ACT-R, and produces the "fan" effect. The "fan" refers to the number of connections to a given element; increasing the sheer number of connections will decrease the strength of association between the element and any one of its connections. This is because when an element is associated with more memories, its appearance becomes a poorer predictor of any specific fact.
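For readers who want something to run, here is a sketch of the two effects in R. It follows the commonly published form of the ACT-R activation equations (base-level activation as the log of summed power-law decays, and strength of association as a constant S minus the log of the fan); whether that matches the book's table symbol for symbol is an assumption on my part, and the parameter values (d = 0.5, S = 2, equal attention weights) are placeholders.

```r
# Base-level activation: log of the summed, decayed traces of past uses.
base_level <- function(times_since_use, d = 0.5) {
  log(sum(times_since_use^(-d)))
}

# Strength of association from a context element to a memory: it drops as the
# element's fan (number of memories it is connected to) grows.
assoc_strength <- function(fan, S = 2) {
  S - log(fan)
}

# Total activation: base level plus weighted associations from the context.
# `fans` holds the fan of each context element; weights W default to equal.
activation <- function(times_since_use, fans, W = 1 / length(fans),
                       d = 0.5, S = 2) {
  base_level(times_since_use, d) + sum(W * assoc_strength(fans, S))
}

activation(c(1, 4), fans = c(1, 1))     # low fan
activation(c(1, 4), fans = c(4, 4))     # high fan: lower activation
activation(c(1, 2, 4), fans = c(1, 1))  # more practice: higher activation
```

Raising the fan lowers activation while adding past uses raises it, which is the qualitative pattern behind the fan effect and the practice effect.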

These results have been shown to affect all of our memories. In an experimental illustration of this, Peterson and Potts (1982) had participants study 1 or 4 true facts about famous historical figures that they did not previously know, such as that Beethoven never married. Two weeks later, participants were tested on memory for three kinds of facts: (1) new facts they had learned about historical figures as part of the experiment, (2) known facts that they knew about the historical figures before the experiment (e.g., Beethoven was a musician), and (3) false facts that they had not learned for the experiment and that should be recognizable as very unlikely (Beethoven was a famous athlete). Participants were shown these types of statements and had to rate them as true or false, and their speed in doing so was recorded. First, it was found that the facts they knew before the experiment were recognized much more quickly than those they learned for the experiment, reflecting the greater practice and base-level activation of the prior facts. More importantly, the number of facts they had learned for the experiment (1 vs. 4) affected BOTH new and prior facts: participants who learned 4 new facts made slower judgements for both well-known and newly-learned facts, while those who learned just 1 new fact were faster on both new and prior facts. Anderson writes:
From the perspective of the task facing declarative memory--making most available those facts that are most likely to be useful--these results make perfect sense. The already known facts have been used many times in the past, and at delay of two weeks they are likely the ones needed, so the base-level activation works to make them most active. On the other hand, the more things one knows about an individual, the less likely any one fact will be, so they cannot be all made as active. The activation equations in table 3.2 capture these relationships.
This relationship is also borne out in fMRI research. The greater the activation of a memory, the less time/effort it will take to retrieve it; thus, higher activation should map onto weaker fMRI response. Using a fan-effect paradigm, it was found that greater fan (more connections to a single memory) resulted in decreased activation and therefore stronger fMRI responses (Sohn 2003, 2005).

Anderson goes on in this chapter to discuss how we often choose actions and make decisions based on our memories of similar past actions/decisions and the outcomes that they produced. Here, we rely on memories rather than reasoning on the basis of general principles. Sometimes we have general principles to reason from, while other times it's far easier to recall and act. This kind of instance-based reasoning may be far more common than has been traditionally thought.


The Adaptive Control of Thought

Given all of the above, we know how important a flexible declarative memory is to our ability to adapt to a changing environment; but once the relevant information has been retrieved, we have to act on it, using it to make inferences or predictions. This often requires intensive, deliberative processing which is not appropriate when we have to act rapidly in stressful situations. Indeed, to the extent that one can anticipate how knowledge will be used, it makes sense to prepackage the application of that knowledge in a way that can be executed without planning. It turns out that there is a process by which frequently useful computations are identified and cached as cognitive reactions that can be elicited directly by the situation, bypassing laborious deliberation. Thus, a balance must be struck between immediate reaction and deliberative reflection, a sort of dual processing reminiscent of Kahneman's "Thinking, Fast and Slow." This is the way Anderson conceptualizes learning: a process of moving from intentional thinking and remembering (hippocampal/cortical) to more automatic reactions (basal ganglia).

But such an equal embrace of thought and action has not always characterized cognitive science; in fact, this very distinction marked the transition in psychology from the "behaviorist" to the "cognitive" era. This shift is very visible in the debate between Tolman and Hull about the relative roles of mental reflection and mechanistic action in producing behavior. To illustrate the struggle between thought and action in the mind, Anderson has us consider the Stroop task, where you are instructed to quickly report the font color while reading a list like red yellow orange green blue black etc. (each word printed in a color that doesn't match the word itself). This task always takes slightly longer than simply reporting the color of non-words; Anderson points out that "this conflict basically involves the battle between Hull’s stimulus-response associations (the urge to say the word) and Tolman’s goal-directed processing (the requirement to comply with instructions)."

Anderson argues that 3 brain systems are especially relevant in achieving a balance between thought and action: the basal ganglia are responsible for the acquisition and application of "procedures", or Hull's automatic reactions; the hippocampal and prefrontal regions are responsible for storage and retrieval of declarative information, or Tolman's expectancies; and the anterior cingulate cortex (ACC) for exercising control in the selection of context-appropriate behavior. Note that these respectively correspond to the procedural module, the declarative module, and the goal module.

Declarative retrieval of information during decision-making is very time and resource intensive; it would be sensible if our brains had a way of "hard-coding" frequently-used behaviors/actions so that we could respond more automatically to familiar situations. Fortunately, it appears they do just that! For example, Hikosaka et al. (1999) showed monkeys a sequence of 4x4 grids in which two cells were lit up, and the monkeys had to select them in the correct order. The monkeys practiced such sets over the course of several months, and telling differences emerged between performance during the early months and later months. Early on, the monkeys performed the same regardless of what order the grids were shown in, or of which hand they used; however, after months of practice, they had become much faster at completing the task but could not go out of order and could only use their favored hand to input the answer. Thus, it seemed that the monkeys had switched from a flexible declarative representation of the task to a classic stimulus-response representation. Hikosaka et al. examined the brains of monkeys performing the task in order to compare activity in the early vs. later months. As expected, the task activated prefrontal regions early on, but after much practice the task primarily produced activity in basal ganglia structures, which are thought to display a variant of reinforcement learning. Furthermore, temporarily inactivating basal ganglia structures disrupted only the highly practiced sequences (not newly learned sequences).

The basal ganglia, then, is involved in producing automatic responses to stimuli. Indeed, it seems to display a variant of reinforcement learning, where a behavior followed by a "satisfying state of affairs" will increase in frequency (Thorndike's law of effect). The hippocampus is associated with Hebbian learning, where repeated occurrences of stimulus and response together serve to strengthen the connection (Thorndike's law of exercise); this is merely a function of temporal contiguity and does not depend on the consequences of the behavior. The basal ganglia is involved in a dopamine-mediated process that learns to recognize favorable patterns of activity in the cortex (Houk and Wise, 1995). That is, dopamine neurons provide information to the basal ganglia about how rewarding a behavior was, if it was more rewarding than expected, etc. Importantly, an element of time-travel is involved, because the rewards strengthen the salience of reward-producing contextual patterns. In humans, the basal ganglia (specifically the striatum) has been found to respond differentially to reward and punishment, the magnitude of the reward/punishment, and the difference between expected and received reward/punishment (Delgado et al., 2003). This was all very refreshing to me. Classical and operant conditioning are often presented in psychology classrooms as museum curiosities or animal training procedures, when in fact they apply equally well to human learning.
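To make the contrast between the two kinds of learning concrete, here is a minimal sketch in Python. It is a toy illustration only, not anything from Anderson's book: the function names, the learning rate, and the simple "move toward the reward" update are all my own assumptions. The Hebbian update only cares that stimulus and response happened together; the reward update only moves when the received reward differs from what was expected.

```python
def hebbian_update(weight, stimulus, response, rate=0.1):
    """Law of exercise: strengthen a link whenever stimulus and response co-occur.

    The consequences of the behavior play no role here (hypothetical toy rule).
    """
    return weight + rate * stimulus * response


def reward_update(expected, received, rate=0.1):
    """Law of effect: move an expectation toward the reward actually received.

    Returns the new expectation and the prediction error that drove the change
    (received minus expected), which is zero when the reward was fully expected.
    """
    error = received - expected
    return expected + rate * error, error


# Five identical trials: stimulus and response co-occur, and a reward of 1.0 follows.
weight, expected = 0.0, 0.0
for trial in range(5):
    weight = hebbian_update(weight, stimulus=1.0, response=1.0)
    expected, error = reward_update(expected, received=1.0)
    print(f"trial {trial}: link strength={weight:.2f}, "
          f"expected reward={expected:.3f}, prediction error={error:.3f}")
```

The link strength climbs by the same amount every trial, while the prediction error shrinks as the reward becomes expected, which is the "more rewarding than expected" signal described above.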

I wanted to share one final experimental demonstration of the difference between learning in the hippocampus versus the basal ganglia. This one involves a rat maze-learning paradigm; imagine a maze shaped like a plus sign (+); rats always enter on the same side, say the west side. Rats are trained to go to food housed in the south arm. What will rats do if they are put in the maze on the east side? Have they learned the spatial location of the food, or have they merely learned a right-turning behavior? If the former is true, they should turn down the correct arm of the maze to find the food; if the latter is true, their response will lead them down the wrong arm. Early results yielded no clear choice pattern (Restle, 1957). However, Packard and McGaugh (1996) trained all rats on the maze and then gave them injections that temporarily impaired either their hippocampus or their basal ganglia (specifically, the caudate). As you might expect, the rats with selective hippocampal impairment performed the right-turning response and ended up in the wrong arm of the maze, while rats with impediments to the basal ganglia chose the correct arm, presumably because their intact hippocampus contained the correct spatial "place-learning" representation. A convincing follow-up study by Packard (1999) produced the same pattern of results, but this time by using memory-enhancing agents applied selectively to the hippocampus or the caudate. This time, rats with hippocampal enhancements displayed behavior consistent with place-learning (they chose the correct arm), while rats with enhanced caudates relied on a right-turn response and chose the incorrect arm.


But where do these stimulus-response associations come from? In ACT-R, they are called "productions" or "production rules" -- when a situation arises for which the system does not already have rules, information must be retrieved from declarative memory and must be processed using more basic production rules. This could entail retrieving a similar prior experience upon which to base present actions or retrieving general principles and reasoning from them. In such a situation,
"the first production makes a retrieval request for some declarative information, that information is retrieved, and the next production harvests that retrieval and acts upon it. The compiled production eliminates that retrieval step and builds a production specific to the information retrieved. This is the process by which the system moves from deliberation to action. Each time a new production of this kind is created, another little piece of deliberation is dropped out in the interest of efficient execution."
However, this newly formed production requires multiple repetitions for it to acquire enough strength to be applicable in new situations. Such rules are learned slowly, consistent with the view that procedural memories are acquired gradually. This measure of strength is often called a rule's "utility" since it is a measure of the value of the rule; when a situation arises where multiple rules apply, the rule with the highest utility is chosen; further, rewarding consequences following the use of a rule serve to increase that rule's utility. When a new rule is first created, its utility is zero and thus it is extremely unlikely that it will "fire". However, each time this rule is recreated its utility is increased. Anderson gives an excellent example using children's learning of subtraction rules. In the interest of time I won't go into it here, other than to say that it accounts for the most common bug in learning to subtract two multi-digit numbers: instead of always subtracting the bottom digit from the top digit, the buggy rule children often use is to subtract the smaller digit from the bigger one, regardless of which is on top. This rule is so persistent because half of the time, it produces the correct outcome and thus the same reward as the more limiting bottom-from-top rule. ACT-R is used to model the acquisition of the correct rule, and I found it very compelling.
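The competition between the buggy and the correct subtraction rule can be sketched in a few lines of Python. This is my own toy version, not Anderson's model: the rule names, the starting utilities, the learning rate, the noise level, and the all-or-nothing reward are assumptions chosen for illustration. The update simply nudges a rule's utility toward the reward it just earned, and the rule with the highest (noisy) utility gets to fire.

```python
import random


def update_utility(utility, reward, rate=0.2):
    """Nudge a rule's utility toward the reward it just earned (assumed delta rule)."""
    return utility + rate * (reward - utility)


def choose_rule(utilities, noise=0.0, rng=random):
    """Pick the rule with the highest utility, after adding optional Gaussian noise."""
    return max(utilities, key=lambda name: utilities[name] + rng.gauss(0.0, noise))


rng = random.Random(0)  # fixed seed so repeated runs behave the same way
# Hypothetical starting point: the buggy rule is established, the new rule starts at zero.
utilities = {"smaller-from-larger": 0.5, "bottom-from-top": 0.0}

for _ in range(500):
    top, bottom = rng.randint(0, 9), rng.randint(0, 9)  # one column of a subtraction problem
    rule = choose_rule(utilities, noise=0.5, rng=rng)
    # The buggy rule only gives the right digit when no borrowing is needed.
    correct = True if rule == "bottom-from-top" else top >= bottom
    utilities[rule] = update_utility(utilities[rule], 1.0 if correct else 0.0)

print(utilities)
```

Because the buggy rule is rewarded roughly half of the time, its utility hovers around the middle instead of collapsing, which is one way to see why it is so hard to dislodge; the correct rule only pulls ahead once it has been tried often enough to accumulate its own reward history.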

This general learning process is seen clearly in skill learning: as one becomes more skillful (say, in riding a bike), there will be a decrease in the involvement of the more "cognitive" cortical regions and an increase in the involvement of the more "stimulus-response" posterior regions. Here's Anderson's summary:
"Learning can be conceptualized as a process of moving from thoughtful reflection (hippocampus, prefrontal cortex) to automatic reaction (basal ganglia). The module responsible for learning of this kind is the procedural module (or production system). I offer the procedural module as an explanation for behavior that embraces both Hull’s reactions and Tolman’s reflections and provides a mechanism for the postulated learning link between them. Through production compilation, thoughtful behaviors become automatized; through utility learning, behavior is modified to become adaptive. When combined with the declarative memory module discussed in chapter 3, the production system provides a mechanism by which knowledge is used to make behavior more flexible and efficient."
Thus, an important part of cognition is the accumulation of production rules in long-term memory. These rules are activated by the contents of working memory and can be composed into more complex production-rule chains when a particular problem is solved; the result can be cached and, if used above some frequency threshold, becomes a production rule in its own right.

Uniquely Human Learning

Anderson points out that his (and my) discussion up to this point has actually concerned primate learning; nothing so far has been unique to humans. In chapter 5, he discusses learning from verbal directions and worked-out examples. He also recognizes the role of individual discovery in the learning process, but criticizes the recent trend towards pure "discovery" learning in education:
" ...a third way to learn is by discovery and invention. Cultural artifacts such as algebra came into being because of such a process. Some constructivist mathematics educators advocate having children learn in the same manner (e.g., Cobb et al., 1992). In the extreme, it is a very inefficient way to learn algebra or any other cultural artifact... However, when one looks in detail at what happens in the process of learning from instruction and example, one frequently finds many minidiscoveries being made as students try to make sense of the instruction they are receiving and their experience in applying that instruction. Learning by discovery probably plays a more important role as a normal part of learning through social transmission (i.e., directions and examples) than it does as a solo means of learning."
Anderson goes on to discuss how human cognition can support a uniquely human skill: learning algebra from verbal directions and examples. He uses ACT-R to model algebra learning and to help point the way toward what is special about human cognition. He ends up describing three such features in detail: the potential for abstract control of cognition, the capacity for advanced pattern matching, and the metacognitive ability to reason about cognitive states.

The first is likely mediated by the anterior cingulate cortex (ACC), a structure involved in controlling behavior, which is especially active when people have to direct their behavior in ways that violate typical response tendencies. Interestingly, the ACC has undergone recent evolutionary changes found only in humans. Recall that this structure was the one associated with the goal buffer, which holds control elements. The idea is that the ACC allows us to maintain abstract control states which let us choose different actions when all the other buffers are in identical states. The second feature requires dynamic pattern matching, which allows for processing complex relational structures, as seen in analogical processing. It all gets pretty detailed and I won't go into it here. Instead I'll just quote the end of the chapter:
Dynamic pattern matching and recursive representations are connected. Dynamic pattern matching is only useful in a system that has powerful, interlinked representations. Processing recursive representations can be much easier with dynamic pattern matching. The human brain is expanded over that of other primates, and it is not just a matter of more brain. There are new prefrontal and parietal regions, and in the case of some regions such as the ACC, there are new kinds of cells. While brain lateralization is also a common feature of many species, its connection with language seems unique (Halpern et al., 2005), and Marcus’s second feature is strongly motivated by considerations of language processing. So, it seems pretty clear that there have been some changes to the structure of the human brain that enable the unique functions of human cognition.

The Question of Consciousness

It isn't really fair to talk about this here, because I have only given you a flavor of the main arguments presented in the book, and it is upon these arguments that his discussion of consciousness is built. It requires an intimate understanding of ACT-R, and I don't think I've done a good enough job conveying that understanding in the present post. Still, I'll leave you with his thoughts on the subject, which he gives only grudgingly (preferring to "leave the philosopher's domain to the philosopher"):

In 2003, we noted that in ACT-R consciousness has an obvious mapping to the buffers that are associated with the modules. The contents of consciousness are the contents of these buffers, and conscious activity corresponds to the manipulation of the contents of these buffers by production rules. The information in the buffers is the information that is made available for general processing and is stored in declarative memory. ACT-R models can generate introspective reports by describing the contents of these buffers. In 2003 we did not think this was much of an answer and gave ACT-R low marks on this dimension. I have subsequently come to the conclusion that this is indeed what consciousness is and that running ACT-R models are conscious. They may not be conscious in the same sense as humans, but this is probably because ACT-R gives a rather incomplete picture of the buffers that are available in the human system.

He immediately notes that this is "not a particularly novel interpretation of consciousness" and that it is essentially "the ACT-R realization of the global workspace theory of consciousness (Baars, 1988; Dehaene & Naccache, 2001)."
These authors, Dehaene and Changeux (2004), summarize the view as follows:
We postulate the existence of a distinct set of cortical “workspace” neurons characterized by their ability to send and receive projections to many distant areas through long-range excitatory axons. These neurons therefore no longer obey a principle of local, encapsulated connectivity, but rather break the modularity of the cortex by allowing many different processors to exchange information in a global and flexible manner. Information, which is encoded in workspace neurons, can be quickly made available to many brain systems, in particular the motor and speech-production processors for overt behavioral report. We hypothesize that the entry of inputs into this global workspace constitutes the neural basis of access to consciousness. (p. 1147)
He is totally on board with rejecting all "Cartesian theater" interpretations (the idea that there has to be something more to consciousness, some inner homunculus that watches our thoughts flit by), and he seems to agree pretty completely with Dennett (1993). He finishes with the following:
 If we resist the temptation to believe in a hard problem of consciousness, we can appreciate how consciousness is the solution to the fundamental problem of achieving the mind in the brain. As noted in chapter 2, efficiency considerations drive the brain to try to achieve as much of its computation as possible locally in nearly encapsulated modules. However, the functionality of the mind demands communication among these modules, and to do this, some information must be made globally available. The purpose of the buffers in ACT-R is to create this global access. The contents of these buffers will create an information trail that can be reported and reflected upon. As in the last example in chapter 5, adaptive cognition sometimes requires reflection on this information trail. Thus, consciousness is the manifestation of the solution to the need for global coordination among modules. It is a trademark consequence of the architecture in figure 2.2. That being said, chapters 1–5 develop this architecture with only oblique references to consciousness. This is because the information processing associated with consciousness is already described by other terms of the theory. It still is not clear to me how invoking the concept of consciousness adds to the understanding of the human mind, but taking a coherent reading of the term consciousness, I am willing to declare ACT-R conscious.

Wednesday, July 1, 2015

I Finally Read "A New Kind of Science"

This book made me think new thoughts; this is rare, so I am posting about it.
If you read nothing else in the post, read the end.

"A New Kind of Science" is a 13-year-old book preceded, and regrettably often prejudged, by its reputation. Many of the criticisms that have come to define the work are valid, so let's get that part out in the open. The book can be read in its entirety here; it is enormous, both physically (~1,200 pages) and in scope, which has led to a limited and specialized readership lodging many legitimate, though mostly technical, complaints. To make matters worse, Wolfram comes across as rather smug and boastful, taking for granted the revolutionary impact of his work, staking claims to originality that are often incorrect, and failing to adequately cite his ideological predecessors in the main body of the text. These are legitimate concerns and egregious omissions to be sure.

However, the book itself was written to be accessible to anyone with basic knowledge of math/science, and I feel it gains so much for being simplistic and frankly written in this way. Furthermore, the biggest ideas in the book, even if not completely original or 100% convincing as-is, are still as beautiful and important as they are currently underappreciated. Even if he cannot claim unique ownership of them all, Wolfram has done a heroic job explicating these ideas for the lay reader and his book has vastly increased their popular visibility. But I fear that many may be missing out on a thoroughly enjoyable, philosophically insightful book simply on the basis of some overreaching claims and some rather technical flaws; I get the sinking feeling it's going the way of Atlas Shrugged: a big book that's cool to dismiss without ever having read.

I'm not going to get into the specific criticisms; suffice it to say that there are issues with the book, though Wolfram has gone to some trouble to defend his positions. But this notorious reception, coupled with the fact that the book weighs in at almost 6 pounds, had kept me and surely many others from ever giving it a proper chance. This post is not meant to be an apology, and neither is it intended to be a formal book review. Rather, I am going to show you several things that I took away from my reading of it that I feel very grateful for, regardless of the extent to which they represent any sort of paradigm shift, or even anything new to human inquiry. Lots of these ideas were very new to me, and they were presented so well that I feel I have ultimately gained a new perspective on many issues, including life itself, which I have been trying sedulously for years to better understand.

To attempt to write a general review of this book would be quite difficult, for its arguments depend so much upon pictures (of which there are more than 1000!), careful explanations, and repeated examples to build up intuition for how very simple rules can produce complex behavior, and how this fact plausibly accounts for many phenomena in the natural and physical world (indeed, perhaps the universe itself). Wolfram uses this intuition to convey compelling explanations of space and time, experience and causation, thinking, randomness, free will, evolution, the Second Law of thermodynamics, incompleteness and inconsistency in axiom systems, and much more. I will be talking mostly about the non-math/physicsy stuff in this post, because most of the math and physics is beyond my ken.

This will make a little more sense later on

To begin with, the first 5 to 8 chapters are nothing if not eye-opening; they could and should be read by all high-schoolers who are interested in science, and unless you are a scientist yourself I guarantee you will gain many new insights into some fundamental issues. Some of the physics (ch. 9) got a little heavy for me, but this may well be the most interesting part of the book for many. The final chapter (12) was what did it for me personally.

In this long last chapter, the main thesis of the book is driven home. It is as follows: the best (and indeed, perhaps the only) way to understand many systems in nature (and indeed, perhaps nature itself) is to think in terms of simple programs instead of mathematical equations; that is, to view processes in nature as performing rule-based computations. Simple computer programs can explain, or at least mimic, natural phenomena that have so far eluded mathematical models such as differential equations; Wolfram argues that nature is ultimately inexplicable by these traditional methods.

He demonstrates how very simple programs can produce complexity and randomness; he argues that because simple programs must be ubiquitous in the natural world, they are responsible for the complexity and randomness we observe in natural systems. He shows how idealized model programs like cellular automata can mimic in shockingly exact detail the behavior of phenomena which science has only tenuously been able to describe: crystal growth (e.g., snowflakes), fluid turbulence, the path of a fracture when materials break, biological development, plant structures, pigmentation patterns...thus indicating that such simple processes likely underlie much of what we observe.


Indeed, he makes a case (originally postulated by Konrad Zuse) that the universe itself is fundamentally rule based, and essentially one big ongoing computation of which everything is a part. It gets a little hairy, but in chapter 9 Wolfram discusses how the concept of causal networks can be used to explain how space, time, elementary particles, motion, gravity, relativity, and quantum phenomena all arise. Indeed, he argues that causal networks can represent everything that can be observed, and that all is defined in terms of their connections. This is predicated on the belief that there are no continuous values in nature; that is to say, that nature is fundamentally discrete. There were a lot of intriguing ideas here, but I cannot go into them all right now. There does seem to be a reasonable case to be made for some kind of digital physics. I am way out of my league here though, so I'll stop. Check out that wikipedia article!


Cellular automata and other related easy-to-follow rule-based systems are used to demonstrate, or at least to hint at, most of these claims. If you haven't seen these before, check out that link: it takes you to Wolfram's own one-page summary of how these things work. In fact, I'm going to cut-and-paste most of it below. But here's a brief description: imagine a row of cells that can be either black or white. You start with some initial combination of black and/or white cells in this row; to get the next row, you apply a set of rules to the original cells which tells you what color cells in the next row should be based on the colors of the original cells above. The rules that determine the color of a new cell are based on the colors of three cells: the cell immediately above it and the cells to the immediate right and left of the one above it. Thus, the color of any given cell is affected only by itself and its immediate neighbors. Simple enough, but those neighbors are in turn governed by their neighbors, and those neighbors by their neighbors, etc, so that the whole thing ends up being highly interconnected. When you repeatedly apply a given rule and step back to observe the collective behavior of all the cells, large-scale patterns can emerge. You often get simple repetitive behavior (like the top picture below) or nested patterns (second picture below). However, sometimes you find behavior that is random, or some mixture of random noise with moving structures (last picture below).
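If you would rather see the procedure as code, here is a minimal Python sketch of the row-by-row update just described. The function names, the grid width, the wrap-around at the edges, and the choice of rule 30 as the example are my own assumptions, not anything taken from the book; the lookup table spells out the new color for each of the eight possible three-cell neighborhoods.

```python
# Rule 30 lookup table: (left, center, right) colors above -> new color.
# 1 means black, 0 means white.
RULE_30 = {
    (1, 1, 1): 0, (1, 1, 0): 0, (1, 0, 1): 0, (1, 0, 0): 1,
    (0, 1, 1): 1, (0, 1, 0): 1, (0, 0, 1): 1, (0, 0, 0): 0,
}


def step(cells, rule):
    """Return the next row: each new cell depends only on the three cells above it.

    The row wraps around at the edges (an assumption made to keep the sketch short).
    """
    n = len(cells)
    return [
        rule[(cells[(i - 1) % n], cells[i], cells[(i + 1) % n])]
        for i in range(n)
    ]


def run(rule, width=63, steps=30):
    """Start from a single black cell in the middle and apply the rule repeatedly."""
    row = [0] * width
    row[width // 2] = 1
    rows = [row]
    for _ in range(steps):
        row = step(row, rule)
        rows.append(row)
    return rows


# Print each row as text, with '#' for black cells and '.' for white cells.
for row in run(RULE_30):
    print("".join("#" if cell else "." for cell in row))
```

With the default width of 63 and 30 steps the pattern never reaches the edges, so the wrap-around does not affect what gets printed. Swapping in a different lookup table is all it takes to try another rule.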

Look at the picture just above; the left side shows certain regularities, but the right side exhibits random behavior (and has indeed been used for practical random number generators and encryption purposes). How might one predict the state of this system after it has evolved for a given number of time-steps? (This is an important "exercise left to the reader" so think about it before reading on).

The 'take-home' here is that sometimes simple rules lead to behavior that is complex and random, and the lack of regularities in these systems defies any short description using mathematical formulas. The only way to know how that sucker right there is going to behave in 1,000,000,000 steps is to run it and find out.

If you like looking at pictures like this one, you should definitely check out the book. I read it digitally but I ordered a physical copy as soon as I finished because man, what a terrific coffee-table book this thing makes!

Computational universality

Now here's where things got really interesting for me. Unless you have studied computer science (which I honestly really haven't), you might be surprised to find out that certain combinations of rules, like those shown above, can result in systems that are capable of performing any possible computation and emulating any other system or computer program (which I honestly kind of was). Indeed, the computer you are reading this on right now has this capability. Hell, your microwave probably has this capability; given enough memory, it could run any program or calculate any function provided the function is able to be computed at all.

This idea, called universal computation, was developed by Alan Turing in the 1930s: a system is said to be "universal" or "Turing complete" if it is able to perform any computation. If a system is universal, it must be able to emulate any other system, and it must be able to produce behavior that is as complex as that of any other system; knowing that a system is universal implies that the system can produce behavior that is arbitrarily complex.

When studying the 256 rule-sets that generate the elementary cellular automata, Wolfram and his assistant Matthew Cook showed that a couple of them (rule 110 and relatives) could be made to perform any computation; that is, a couple of these extremely simple systems, among the most basic conceivable types of programs, were shown to be universal.

Rule 110 from a single black cell (16 steps; see 250 steps below)
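In case the numbering is mysterious: there are 256 elementary rules because each rule has to say what happens for 8 possible three-cell neighborhoods, and each answer is black or white, giving 2^8 = 256 combinations. A rule's number is just those eight answers read as a binary number. Here is a small Python sketch of that decoding; the function name `rule_table` is my own invention for illustration.

```python
def rule_table(number):
    """Decode a rule number (0-255) into its eight neighborhood -> color cases.

    Neighborhoods are (left, center, right) with 1 = black and 0 = white.
    Reading the neighborhood as a 3-bit number picks out one bit of the rule number.
    """
    if not 0 <= number <= 255:
        raise ValueError("elementary rules are numbered 0 to 255")
    table = {}
    for value in range(8):
        neighborhood = ((value >> 2) & 1, (value >> 1) & 1, value & 1)
        table[neighborhood] = (number >> value) & 1
    return table


# Every number from 0 to 255 decodes to a different table.
distinct = {tuple(sorted(rule_table(n).items())) for n in range(256)}
print(len(distinct), "distinct elementary rules")

# The eight cases that make up rule 110, from (1, 1, 1) down to (0, 0, 0).
for neighborhood, color in sorted(rule_table(110).items(), reverse=True):
    print(neighborhood, "->", color)
```

Reading the eight outputs for rule 110 from top to bottom gives 01101110, which is 110 written in binary.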

In general, this itself is not new knowledge; von Neumann was the first to show that a cellular automaton could be a universal computer, and it was known that other simple devices could support universal computation. However, this was the simplest instantiation of universality yet discovered, and Wolfram uses this to argue that the phenomenon must be considerably more widespread than originally thought, and indeed very common in nature. While most basic sets of rules generate very simple behavior (like the first and second rules pictured above), past a certain threshold you get universality, where a system can emulate any other system by setting up the appropriate initial conditions, like rule 110:

How universality can actually be achieved with cellular automata in practice is described with great clarity in the book, but it would be too complicated to get into here. Pretty neat though! Wolfram goes on to show how universality is instantiated in Turing machines, cellular automata, register machines, and substitution systems, by showing how each one can be made to emulate the others by setting up appropriate initial conditions, despite great differences in their underlying structure.
"It implies that from a computational point of view a very wide variety of systems, with very different underlying structures, are at some level fundamentally equivalent...every single one of these systems is ultimately capable of exactly the same kinds of computations."
Any kind of system that is universal can perform the same computations; as soon as one gets past the threshold for universality, that's all. Things can't get more complex. It doesn't matter how complex the underlying rules are; one universal system is equivalent to any other, and adding more to its rules cannot have any fundamental effect. He goes on to say,
"...my general expectation is that more or less any system whose behavior is not somehow fundamentally repetitive or nested will in the end turn out to be universal."
This and related research led Wolfram to postulate his "new law of nature", the Principle of Computational Equivalence: that since universal computation means that one system can emulate any other system, all computing processes are equivalent in sophistication, and this universality is the upper limit on computational sophistication.
"No system could be constructed in our universe that is capable of more complex computations than any other universal system; no system can carry out computations that are more sophisticated than those carried out by a Turing machine or cellular automaton."
Another way of stating this is that there is a fundamental equivalence between many different kinds of processes, and that all processes which are not obviously simple can be viewed as computations of equivalent sophistication, whether they are man-made or spontaneously occurring. When we think of computations, we typically think of carrying out a series of rule-based steps to achieve a purpose, but computation is in fact much broader, and as Wolfram would argue, all-encompassing. Thus, as in cellular automata, the process of any system evolving is itself a computation, even if its only function is to generate the behavior of the system. Thus, all processes in nature can be thought of as computations; the only difference is that "the rules such processes follow are defined not by some computer program that we as humans construct but rather by the basic laws of nature."

Wolfram goes on to suggest that any instance of complex behavior we observe is produced by a universal system.
"I suspect that in almost any case where we have seen complex behavior... it will eventually be possible to show that there is universality. And indeed... I believe that in general there is a close connection between universality and the appearance of complex behavior."
"Essentially any piece of complex behavior that we see corresponds to a kind of lump of computation that is at some level equivalent."
He argues that this is why some things appear complex to us, while other things yield patterns or regularities that we can perceive, or which can be described by some formal mathematical analysis:
"If one studies systems in nature it is inevitable that both the evolution of the systems themselves and the methods of perception and analysis used to study them must be processes based on natural laws. But at least in the recent history of science it has normally been assumed that the evolution of typical systems in nature is somehow much less sophisticated a process than perception and analysis.

Yet what the Principle of Computational Equivalence now asserts is that this is not the case, and that once a rather low threshold has been reached, any real system must exhibit essentially the same level of computational sophistication. So this means that observers will tend to be computationally equivalent to the systems they observe— with the inevitable consequence that they will consider the behavior of such systems complex."
Thus, the reason things like turbulence in fluids or any other random-seeming phenomena appear complex to us is that we are computationally equivalent to these things. To really understand the implications of this idea, we need to bring in the closely related idea of irreducibility.

Computational irreducibility

Wolfram claims that the main concern of science has been to find ways of predicting natural phenomena, so as to have some control/understanding of them. Instead of having to specify at each step how, say, a planet orbits a star, it is far better to derive a mathematical formula or model that allows you to determine the outcome of such systems with a minimum of computational effort. Sometimes, you can even find definite underlying rules for such systems which make prediction just a matter of applying these rules.

However, there are many common systems for which no traditional mathematical formulas have been found which can easily describe their behavior. And even when you know the underlying rules, there is often no way to know for sure how the system will ultimately behave, and it can take an irreducible amount of computation to find out. Imagine how you would try to predict the row of black and white cells after the rule-110 cellular automaton had run for, say, a trillion steps. There is simply no way to do this besides carrying out the full computation; no way to reduce the amount of computational effort that this would require. Thus,
"Whenever computational irreducibility exists in a system it means that in effect there can be no way to predict how the system will behave except by going through almost as many steps of computation as the evolution of the system itself.

...what leads to the phenomenon of computational irreducibility is that there is in fact always a fundamental competition between systems used to make predictions and systems whose behavior one tries to predict.

For if meaningful general predictions are to be possible, it must at some level be the case that the system making the predictions be able to outrun the system it is trying to predict. But for this to happen the system making the predictions must be able to perform more sophisticated computations than the system it is trying to predict."
This is because the system you are trying to predict and the methods you are using to make predictions are computationally equivalent; thus for many systems there is no general way to shortcut their process of evolution, and their behavior is therefore computationally irreducible. Unfortunately, there are many common systems whose behavior cannot ultimately be determined at all except through direct simulation, and which thus don't appear to yield to any short mathematical description. Wolfram argues that almost any universal system is irreducible, because nothing can systematically outrun a universal system. He gives the following thought experiment:
"For consider trying to outrun the evolution of a universal system. Since such a system can emulate any system, it can in particular emulate any system that is trying to outrun it. And from this it follows that nothing can systematically outrun the universal system. For any system that could would in effect also have to be able to outrun itself."
Since universality should be relatively common in natural systems, so too should computational irreducibility, making it impossible to predict the behavior of these systems. He argues that traditional science has always relied on computational reducibility, and that "its whole idea of using mathematical formulas to describe behavior makes sense only when the behavior is computationally reducible." This seems to impose stark limits on traditional scientific inquiry, for it implies that it is impossible to find theories that will perfectly describe a complex system's behavior without arbitrarily much computational effort.
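
To make the rule-110 example above a little more concrete, here is a minimal sketch in Python. The names `step` and `run` are my own, and I am assuming a fixed-width row whose out-of-range neighbors count as white (0), which is a simplification of the unbounded automaton. The thing to notice is that `run` has no shortcut: to get the row after n steps, it applies the rule n times.

```python
RULE = 110  # the rule number's binary digits give the new cell color for each neighborhood


def step(cells, rule=RULE):
    """Apply one update of an elementary cellular automaton to a row of 0/1 cells.

    Assumption for this sketch: cells beyond either edge are treated as 0.
    """
    padded = [0] + list(cells) + [0]
    return [
        # Read (left, center, right) as a 3-bit number and look up that bit of the rule.
        (rule >> (padded[i - 1] * 4 + padded[i] * 2 + padded[i + 1])) & 1
        for i in range(1, len(padded) - 1)
    ]


def run(cells, steps, rule=RULE):
    """Evolve the row by brute force, one step at a time."""
    for _ in range(steps):
        cells = step(cells, rule)
    return cells


if __name__ == "__main__":
    width = 31
    row = [0] * (width - 1) + [1]  # start from a single black cell at the right edge
    for _ in range(15):
        print("".join("#" if c else "." for c in row))
        row = step(row)
```

Asking for a trillion steps just means a trillion passes through that loop, which is the irreducibility claim in miniature.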

Free Will and Determinism

This section of the book uses the idea of computational irreducibility to demystify the age-old problem of free will in a way I find quite satisfying, even beautiful. Humans, and indeed most other animals, seem to behave in ways that are free from obvious laws. We make minute-to-minute decisions about how to act that do not seem fundamentally predictable.

Wolfram argues that this is because our behavior is computationally irreducible; the only way to work out how such a system will behave, or to predict its behavior, is to perform the computation. This lets us have our materialist/mechanistic cake and eat it too: we can admit that our behavior essentially follows a set of underlying rules while keeping our autonomy intact, because our rules produce complexities that are irreducible and hence unpredictable.

We know that animals as living systems follow many basic underlying rules (genes are expressed, enzymes catalyze biochemical pathways, cells divide), but we have also seen how even very basic rule-sets result in universality, complexity, and computational irreducibility of the system.
"[It is] this, I believe, that is the ultimate origin of the apparent freedom of human will. For even though all the components of our brains presumably follow definite laws, I strongly suspect that their overall behavior corresponds to an irreducible computation whose outcome can never in effect be found by reasonable laws."
The main criterion for freedom in a system seems to be that we cannot predict its behavior. For if we could, then the behavior of the system would thus be predetermined. Wolfram muses,
"For as we have seen many times in this book even systems with quite simple and definite underlying rules can produce behavior so complex that it seems free of obvious rules. And the crucial point is that this happens just through the intrinsic evolution of the system—without the need for any additional input from outside or from any sort of explicit source of randomness.

And I believe that it is this kind of intrinsic process—that we now know occurs in a vast range of systems—that is primarily responsible for the apparent freedom in the operation of our brains.

But this is not to say that everything that goes on in our brains has an intrinsic origin. Indeed, as a practical matter what usually seems to happen is that we receive external input that leads to some train of thought which continues for a while, but then dies out until we get more input. And often the actual form of this train of thought is influenced by memory we have developed from inputs in the past—making it not necessarily repeatable even with exactly the same input.

But it seems likely that the individual steps in each train of thought follow quite definite underlying rules. And the crucial point is then that I suspect that the computation performed by applying these rules is often sophisticated enough to be computationally irreducible—with the result that it must intrinsically produce behavior that seems to us free of obvious laws."

Intelligence in the Universe

Wolfram has a wonderful section about intelligence in the universe, but this post is quickly becoming quite long so I will stick to my highlights. Definitely check it out if what I say here interests you.

Here he poignantly discusses how "intelligence" and "life" are difficult to define, and how many features of commonly given definitions of intelligence (learning and memory, communication, adaptation to complex situations, handling abstraction) and life (spontaneous movement/response to stimuli, self-organization from disorganized material, reproduction) are in fact present in much simpler systems that we would not describe as intelligent or alive.
"And in fact I expect that in the end the only way we would unquestionably view a system as being an example of life is if we found that it shared many specific details with life on Earth."
Discussing extraterrestrial intelligence, he introduced me to an idea that is probably a well-known science fiction trope, but one that genuinely surprised me. He talks about how Earth is bombarded with radio signals from around our galaxy and beyond, but that these signals seem to be completely random noise, and thus they are assumed to be just side effects of some physical process. But, he notes, this very lack of regularities in the signal could actually be a sign of some kind of extraterrestrial intelligence: "For any such regularity represents in a sense a redundancy or inefficiency that can be removed by the sender and receiver both using appropriate data compression." If this doesn't make sense to you, then you will probably also enjoy his section on data compression and reducibility. The whole book is really worth taking the time to read!
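
For a rough feel of the compression point, here is a small Python sketch using the standard library's `zlib`. The helper name `compression_ratio` and the sample inputs are my own choices for illustration. A highly regular message shrinks to almost nothing, while random bytes (standing in here for a signal with no regularities left to remove) barely shrink at all.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (smaller means more redundancy)."""
    return len(zlib.compress(data, 9)) / len(data)


if __name__ == "__main__":
    regular = b"hello " * 10_000   # 60,000 bytes of obvious repetition
    noise = os.urandom(60_000)     # 60,000 bytes with no pattern for zlib to exploit

    print(f"regular message: {compression_ratio(regular):.4f}")
    print(f"random bytes:    {compression_ratio(noise):.4f}")
```

This only shows one direction of the argument: a signal that is already efficiently encoded looks like noise to a general-purpose compressor. It says nothing about whether any particular noisy signal actually carries a message.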

An Incredible Ending

I'm going to quote the last few paragraphs of the book in full, because they are extremely beautiful to me and there is no way I could do them justice. If you read nothing else in this blog post, read this. Feeling the full intensity of its impact/import really depends on having read the previous 800 or so pages and understood their main arguments, so if you are planning to read the book in its entirety you might save this part until then for greatest effect. Still, if this is the only thing you ever read by Stephen Wolfram, I think it should be this. It is a good stylistic representation of the book (the short sentences, the lucid writing, the hubris) and it is the ultimate statement of the work's conclusions. Fair warning: much of it is going to sound absolutely outrageous if you haven't read the book, and especially if you haven't read the parts of this post about universality and computational irreducibility. In fact, even still it sounds kind of preposterous!

But having been preoccupied with these questions about life for many years now, this passage resonated with me deeply and immediately and I am still reeling from it. Though I am not completely convinced (though are we ever, of anything?), the ideas summarized herein constitute, at least for me personally, a singularly compelling theory of existence, of nature, of life... of everything. Granted, I am taking a lot on faith for now, but I know I will have occasion to return to these thoughts time and time again as they percolate across my lifetime; indeed, it is largely for this reason that I took the time to write this post. Well, here it is; as elsewhere in the quoted material, any emphasis is mine:

*********************************************************************
"It would be most satisfying if science were to prove that we as humans are in some fundamental way special, and above everything else in the universe. But if one looks at the history of science many of its greatest advances have come precisely from identifying ways in which we are not special—for this is what allows science to make ever more general statements about the universe and the things in it.

Four centuries ago we learned for example that our planet does not lie at a special position in the universe. A century and a half ago we learned that there was nothing very special about the origin of our species. And over the past century we have learned that there is nothing special about our various physical, chemical and other constituents.

Yet in Western thought there is still a strong belief that there must be something fundamentally special about us. And nowadays the most common assumption is that it must have to do with the level of intelligence or complexity that we exhibit. But building on what I have discovered in this book, the Principle of Computational Equivalence now makes the fairly dramatic statement that even in these ways there is nothing fundamentally special about us.

For if one thinks in computational terms the issue is essentially whether we somehow show a specially high level of computational sophistication. Yet the Principle of Computational Equivalence asserts that almost any system whose behavior is not obviously simple will tend to be exactly equivalent in its computational sophistication.

So this means that there is in the end no difference between the level of computational sophistication that is achieved by humans and by all sorts of other systems in nature and elsewhere. For my discoveries imply that whether the underlying system is a human brain, a turbulent fluid, or a cellular automaton, the behavior it exhibits will correspond to a computation of equivalent sophistication.

And while from the point of view of modern intellectual thinking this may come as quite a shock, it is perhaps not so surprising at the level of everyday experience. For there are certainly many systems in nature whose behavior is complex enough that we often describe it in human terms. And indeed in early human thinking it is very common to encounter the idea of animism: that systems with complex behavior in nature must be driven by the same kind of essential spirit as humans.

But for thousands of years this has been seen as naive and counter to progress in science. Yet now essentially this idea—viewed in computational terms through the discoveries in this book—emerges as crucial. For as I discussed earlier in this chapter, it is the computational equivalence of us as observers to the systems in nature that we observe that makes these systems seem to us so complex and unpredictable.

And while in the past it was often assumed that such complexity must somehow be special to systems in nature, what my discoveries and the Principle of Computational Equivalence now show is that in fact it is vastly more general. For what we have seen in this book is that even when their underlying rules are almost as simple as possible, abstract systems like cellular automata can achieve exactly the same level of computational sophistication as anything else.

It is perhaps a little humbling to discover that we as humans are in effect computationally no more capable than cellular automata with very simple rules. But the Principle of Computational Equivalence also implies that the same is ultimately true of our whole universe.

So while science has often made it seem that we as humans are somehow insignificant compared to the universe, the Principle of Computational Equivalence now shows that in a certain sense we are at the same level as it is. For the principle implies that what goes on inside us can ultimately achieve just the same level of computational sophistication as our whole universe.

But while science has in the past shown that in many ways there is nothing special about us as humans, the very success of science has tended to give us the idea that with our intelligence we are in some way above the universe. Yet now the Principle of Computational Equivalence implies that the computational sophistication of our intelligence should in a sense be shared by many parts of our universe—an idea that perhaps seems more familiar from religion than science.

Particularly with all the successes of science, there has been a great desire to capture the essence of the human condition in abstract scientific terms. And this has become all the more relevant as its replication with technology begins to seem realistic. But what the Principle of Computational Equivalence suggests is that abstract descriptions will never ultimately distinguish us from all sorts of other systems in nature and elsewhere. And what this means is that in a sense there can be no abstract basic science of the human condition—only something that involves all sorts of specific details of humans and their history.

So while we might have imagined that science would eventually show us how to rise above all our human details what we now see is that in fact these details are in effect the only important thing about us.

And indeed at some level it is the Principle of Computational Equivalence that allows these details to be significant. For this is what leads to the phenomenon of computational irreducibility. And this in turn is in effect what allows history to be significant—and what implies that something irreducible can be achieved by the evolution of a system.

Looking at the progress of science over the course of history one might assume that it would only be a matter of time before everything would somehow be predicted by science. But the Principle of Computational Equivalence—and the phenomenon of computational irreducibility—now shows that this will never happen.

There will always be details that can be reduced further—and that will allow science to continue to show progress. But we now know that there are some fundamental boundaries to science and knowledge.

And indeed in the end the Principle of Computational Equivalence encapsulates both the ultimate power and the ultimate weakness of science. For it implies that all the wonders of our universe can in effect be captured by simple rules, yet it shows that there can be no way to know all the consequences of these rules, except in effect just to watch and see how they unfold."

********************************************************************

Sunday, June 21, 2015

Laudato Si Pope-pourri

It's not every day that I want to read something a Pope wrote. In fact, today, June 19, 2015, was the only day I have ever wanted that, if memory serves. Today I read Papa Francesco's second encyclical, entitled "Laudato Si -- On Care For Our Common Home". It was a unique experience to navigate to w2.vatican.va, download an essay the Pope just wrote about environmentalism, and then read it on my computer.

I read it more as a curiosity than anything else, and though I found much to disagree with, I was in strong accord with two of the Pope's main points as I have understood them: (1) that every living thing is valuable and all of nature is interdependent; and (2) that rampant consumerism/individualism is a soul-sucking trap that perverts nature, prizes inequity, ruins the environment, and leads millions of people to lead ultimately meaningless lives. These ideas are neither new nor original, and so I don't want to spend too much time on this post. However, I do want to share with you some of the passages I found most interesting in case you never get to read it. The Pope seems like a pretty decent human being!

Adumbrating the issues to be addressed, the Pope tells us:
"I will point to the intimate relationship between the poor and the fragility of the planet, the conviction that everything in the world is connected, the critique of new paradigms and forms of power derived from technology, the call to seek other ways of understanding the economy and progress, the value proper to each creature, the human meaning of ecology, the need for forthright and honest debate, the serious responsibility of international and local policy, the throwaway culture and the proposal of a new lifestyle" (13).
Quoting Patriarch Bartholomew, the Pope urges us "to look for solutions not only in technology but in a change of humanity; otherwise we would be dealing merely with symptoms. [Bartholomew] asks us to replace consumption with sacrifice, greed with generosity, wastefulness with a spirit of sharing..." (2).

The Pope looks at technology with a far more jaundiced eye than I do, but he may have a couple points here worthy of our consideration:
"Furthermore, when media and the digital world become omnipresent, their influence can stop people from learning how to live wisely, to think deeply and to love generously. In this context, the great sages of the past run the risk of going unheard amid the noise and distractions of an information overload. Efforts need to be made to help these media become sources of new cultural progress for humanity and not a threat to our deepest riches. 
True wisdom, as the fruit of self-examination, dialogue and generous encounter between persons, is not acquired by a mere accumulation of data which eventually leads to overload and confusion, a sort of mental pollution. Real relationships with others, with all the challenges they entail, now tend to be replaced by a type of internet communication which enables us to choose or eliminate relationships at whim, thus giving rise to a new type of contrived emotion which has more to do with devices and displays than with other people and with nature. Today’s media do enable us to communicate and to share our knowledge and affections. Yet at times they also shield us from direct contact with the pain, the fears and the joys of others and the complexity of their personal experiences. For this reason, we should be concerned that, alongside the exciting possibilities offered by these media, a deep and melancholic dissatisfaction with interpersonal relations, or a harmful sense of isolation, can also arise." (31-32)

Throughout, he constantly questions the modern conception of "progress" held by many in the developed world:
"There is a tendency to believe that every increase in power means an increase of 'progress' itself, an advance in security, usefulness, welfare and vigour, an assimilation of new values into the stream of culture, as if reality, goodness and truth automatically flow from technological and economic power as such. The fact is that 'contemporary man has not been trained to use power well', because our immense technological development has not been accompanied by a development in human responsibility, values and conscience." (76)

Later in the encyclical the Pope expands on this theme more forcefully: "we need to grow in the conviction that a decrease in the pace of production and consumption can at times give rise to another form of progress and development. Put simply, it is a matter of redefining our notion of progress" (139). I'm about to quote the Pope at length, but it is well worth reading:
"Environmental protection cannot be assured solely on the basis of financial calculations of costs and benefits. The environment is one of those goods that cannot be adequately safeguarded or promoted by market forces. Once more, we need to reject a magical conception of the market, which would suggest that problems can be solved simply by an increase in the profits of companies or individuals. Is it realistic to hope that those who are obsessed with maximizing profits will stop to reflect on the environmental damage which they will leave behind for future generations? Where profits alone count, there can be no thinking about the rhythms of nature, its phases of decay and regeneration, or the complexity of ecosystems which may be gravely upset by human intervention. Moreover, biodiversity is considered at most a deposit of economic resources available for exploitation, with no serious thought for the real value of things, their significance for persons and cultures, or the concerns and needs of the poor" (138).
"A technological and economic development which does not leave in its wake a better world and an integrally higher quality of life cannot be considered progress. Frequently, in fact, people’s quality of life actually diminishes – by the deterioration of the environment, the low quality of food or the depletion of resources – in the midst of economic growth. In this context, talk of sustainable growth usually becomes a way of distracting attention and offering excuses. It absorbs the language and values of ecology into the categories of finance and technocracy, and the social and environmental responsibility of businesses often gets reduced to a series of marketing and image-enhancing measures" (141).

"The principle of the maximization of profits, frequently isolated from other considerations, reflects a misunderstanding of the very concept of the economy. As long as production is increased, little concern is given to whether it is at the cost of future resources or the health of the environment; as long as the clearing of a forest increases production, no one calculates the losses entailed in the desertification of the land, the harm done to biodiversity or the increased pollution. In a word, businesses profit by calculating and paying only a fraction of the costs involved. Yet only when “the economic and social costs of using up shared environmental resources are recognized with  transparency and fully borne by those who incur them, not by other peoples or future generations”, can those actions be considered ethical" (143).

"Many things have to change course, but it is we human beings above all who need to change. We lack an awareness of our common origin, of our mutual belonging, and of a future to be shared with everyone. This basic awareness would enable the development of new convictions, attitudes and forms of life. A great cultural, spiritual and educational challenge stands before us, and it will demand that we set out on the long path of renewal" (144).

I have expressed similar concerns about our ultimately destructive vision of progress myself, though I tend to be far more critical of consumerism and far more hopeful about technological advance. Still, "the economy accepts every advance in technology with a view to profit, without concern for its potentially negative impact on human beings" (76). Further, the Pope argues that
"We have to accept that technological products are not neutral, for they create a framework which ends up conditioning lifestyles and shaping social possibilities along the lines dictated by the interests of certain powerful groups. Decisions which may seem purely instrumental are in reality decisions about the kind of society we want to build. The idea of promoting a different cultural paradigm and employing technology as a mere instrument is nowadays inconceivable. The technological paradigm has become so dominant that it would be difficult to do without its resources and even more difficult to utilize them without being dominated by their internal logic. It has become countercultural to choose a lifestyle whose goals are even partly independent of technology, of its costs and its power to globalize and make us all the same." (pg 76)
Along these same lines, he continues:
"A science which would offer solutions to the great issues would necessarily have to take into account the data generated by other fields of knowledge, including philosophy and social ethics; but this is a difficult habit to acquire today. Nor are there genuine ethical horizons to which one can appeal. Life gradually becomes a surrender to situations conditioned by technology, itself viewed as the principal key to the meaning of existence. In the concrete situation confronting us, there are a number of symptoms which point to what is wrong, such as environmental degradation, anxiety, a loss of the purpose of life and of community living" (pg 82)
Talking politics, the Pope criticizes the short-sightedness that is built into our systems of governance:
"A politics concerned with immediate results, supported by consumerist sectors of the population, is driven to produce short-term growth. In response to electoral interests, governments are reluctant to upset the public with measures which could affect the level of consumption or create risks for foreign investment. The myopia of power politics delays the inclusion of a far-sighted environmental agenda within the overall agenda of governments" (130).
...
"A healthy politics is sorely needed, capable of reforming and coordinating institutions, promoting best practices and overcoming undue pressure and bureaucratic inertia. It should be added, though, that even the best mechanisms can break down when there are no worthy goals and values, or a genuine and profound humanism to serve as the basis of a noble and generous society" (132).

Though I am picking and choosing the passages which stood out to me, it should be clear that one common thread running through the Pope's message is that consumerism is at the heart of all that is wrong with society today:
"This paradigm leads people to believe that they are free as long as they have the supposed freedom to consume. But those really free are the minority who wield economic and financial power. Amid this confusion, postmodern humanity has not yet achieved a new self-awareness capable of offering guidance and direction, and this lack of identity is a source of anxiety. We have too many means and only a few insubstantial ends.
...
Obsession with a consumerist lifestyle, above all when few people are capable of maintaining it, can only lead to violence and mutual destruction." (149-150)
He urges us to be mindful of to whom we give our money, and to not give our money to those whose practices we cannot support in good conscience: "[there is a] great need for a sense of social responsibility on the part of consumers. Purchasing is always a moral–and not simply economic–act" (150). He reminds us of "the moral imperative of assessing the impact of our every action and personal decision on the world around us. If we can overcome individualism, we will truly be able to develop a different lifestyle and bring about significant changes in society" (151).
  
The Pope, as you might expect, also does not support measures to control population growth. I disagree with this, but he makes a defensible point:
"To blame population growth instead of extreme and selective consumerism on the part of some, is one way of refusing to face the issues. It is an attempt to legitimize the present model of distribution where a minority believes that it has the right to consume in a way which can never be universalized, since the planet could not even contain the waste products of such consumption" (29).
But he doesn't just criticize greed, consumption, and individualism without offering ways to address them. He discusses how "ecological education" needs to start at a very young age. The family environment plays a crucial role because
"In the family we first learn how to show love and respect for life; we are taught the proper use of things, order and cleanliness, respect for the local ecosystem and care for all creatures. In the family we receive an integral education, which enables us to grow harmoniously in personal maturity. In the family we learn to ask without demanding, to say “thank you” as an expression of genuine gratitude for what we have been given, to control our aggression and greed, and to ask forgiveness when we have caused harm. These simple gestures of heartfelt courtesy help to create a culture of shared life and respect for our surroundings" (155).

His vision of an alternative lifestyle is as powerful as it is simple, and it still holds together just as well if you cut out all the religious stuff:
"Christian spirituality proposes an alternative understanding of the quality of life, and encourages a prophetic and contemplative lifestyle, one capable of deep enjoyment free of the obsession with consumption. We need to take up an ancient lesson, found in different religious traditions and also in the Bible. It is the conviction that “less is more”. A constant flood of new consumer goods can baffle the heart and prevent us from cherishing each thing and each moment. To be serenely present to each reality, however small it may be, opens us to much greater horizons of understanding and personal fulfilment. Christian spirituality proposes a growth marked by moderation and the capacity to be happy with little. It is a return to that simplicity which allows us to stop and appreciate the small things, to be grateful for the opportunities which life affords us, to be spiritually detached from what we possess, and not to succumb to sadness for what we lack. This implies avoiding the dynamic of dominion and the mere accumulation of pleasures.

Such sobriety, when lived freely and consciously, is liberating. It is not a lesser life or one lived with less intensity. On the contrary, it is a way of living life to the full. In reality, those who enjoy more and live better each moment are those who have given up dipping here and there, always on the look-out for what they do not have. They experience what it means to appreciate each person and each thing, learning familiarity with the simplest things and how to enjoy them. So they are able to shed unsatisfied needs, reducing their obsessiveness and weariness. Even living on little, they can live a lot, above all when they cultivate other pleasures and find satisfaction in fraternal encounters, in service, in developing their gifts, in music and art, in contact with nature, in prayer. Happiness means knowing how to limit some needs which only diminish us, and being open to the many different possibilities which life can offer.

Sobriety and humility were not favourably regarded in the last century. And yet, when there is a general breakdown in the exercise of a certain virtue in personal and social life, it ends up causing a number of imbalances, including environmental ones. That is why it is no longer enough to speak only of the integrity of ecosystems. We have to dare to speak of the integrity of human life, of the need to promote and unify all the great values. Once we lose our humility, and become enthralled with the possibility of limitless mastery over everything, we inevitably end up harming society and the environment" (159-163).

Another theme that cut through the encyclical was his message of empathy, of how important it is to "see each human being as a subject who can never be reduced to the status of an object" (59), and how "all creatures are connected, each must be cherished with love and respect, for all of us as living creatures are dependent on one another" (29). "All of us are linked by unseen bonds and together form a kind of universal family, a sublime communion which fills us with a sacred, affectionate and humble respect" (64). "Every act of cruelty towards any creature is contrary to human dignity. We can hardly consider ourselves to be fully loving if we disregard any aspect of reality" (67). "The earth is essentially a shared inheritance, whose fruits are meant to benefit everyone" (67).
"We fail to see that some [people] are mired in desperate and degrading poverty, with no way out, while others have not the faintest idea of what to do with their possessions, vainly showing off their supposed superiority and leaving behind them so much waste which, if it were the case everywhere, would destroy the planet. In practice, we continue to tolerate that some consider themselves more human than others, as if they had been born with greater rights" (65).

Also resonant with me is his repeated emphasis on ecology and how everything coexists in a delicate web of interrelationships:
"Ecology studies the relationship between living organisms and the environment in which they develop. This necessarily entails reflection and debate about the conditions required for the life and survival of society, and the honesty needed to question certain models of development, production and consumption. It cannot be emphasized enough how everything is interconnected. Time and space are not independent of one another, and not even atoms or subatomic particles can be considered in isolation. Just as the different aspects of the planet – physical, chemical and biological – are interrelated, so too living species are part of a network which we will never fully explore and understand. A good part of our genetic code is shared by many living beings. It follows that the fragmentation of knowledge and the isolation of bits of information can actually become a form of ignorance, unless they are integrated into a broader vision of reality" (95).
The Pope said much, much more in his message; I've merely highlighted the stuff I found most compelling. If any of it sounds interesting, you might check it out! He ends with two prayers, one of which I've reproduced below. I'm not religious in the slightest, but if I were, this is a prayer I could bow my head to:

A prayer for our earth:

All-powerful God,
you are present in the whole universe
and in the smallest of your creatures.
You embrace with your tenderness all that exists.
Pour out upon us the power of your love,
that we may protect life and beauty.
Fill us with peace, that we may live
as brothers and sisters, harming no one.
O God of the poor,
help us to rescue the abandoned
and forgotten of this earth,
so precious in your eyes.
Bring healing to our lives,
that we may protect the world and not prey on it,
that we may sow beauty,
not pollution and destruction.
Touch the hearts
of those who look only for gain
at the expense of the poor and the earth.
Teach us to discover the worth of each thing,
to be filled with awe and contemplation,
to recognize that we are profoundly united
with every creature
as we journey towards your infinite light.
We thank you for being with us each day.
Encourage us, we pray, in our struggle
for justice, love and peace.