Gary N. Smith

Senior Fellow, Walter Bradley Center for Natural and Artificial Intelligence

Archives

The Death of Peer Review?

Science is built on useful research and thoroughly vetted peer review
Two years ago, I wrote about how peer review has become an example of Goodhart’s law: “When a measure becomes a target, it ceases to be a good measure.” Once scientific accomplishments came to be gauged by the publication of peer-reviewed research papers, peer review ceased to be a good measure of scientific accomplishments. The situation has not improved. One consequence of the pressure to publish is the temptation researchers have to p-hack or HARK. P-hacking occurs when a researcher tortures the data in order to support a desired conclusion. For example, a researcher might look at subsets of the data, discard inconvenient data, or try different model specifications until the desired results are obtained and deemed statistically significant—and therefore publishable. HARKing
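The subgroup-searching version of p-hacking described above is easy to demonstrate with simulated data. The sketch below is purely illustrative — the subgroup names are invented and all of the data are random noise, so any "significant" result is spurious by construction:

```python
import math
import random

def two_sample_p(a, b):
    """Approximate two-sided p-value for a difference in means (normal/z approximation)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    z = (ma - mb) / math.sqrt(va / na + vb / nb)
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

random.seed(1)
# Pure noise: by construction, no subgroup has a real treatment effect.
subgroups = ["men", "women", "young", "old", "smokers", "nonsmokers",
             "urban", "rural", "east", "west"]
data = {g: ([random.gauss(0, 1) for _ in range(30)],
            [random.gauss(0, 1) for _ in range(30)]) for g in subgroups}

# The p-hack: test subgroup after subgroup and report only the best result.
pvals = {g: two_sample_p(*data[g]) for g in subgroups}
best = min(pvals, key=pvals.get)
print(f"Smallest p-value: subgroup '{best}', p = {pvals[best]:.3f}")
```

With a 5% false-positive rate per test, ten independent tests give roughly a 40% chance that at least one subgroup clears the p < 0.05 bar even though nothing real is going on — which is exactly why reporting only the winning subgroup is so misleading.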

A World Without Work? Here We Go Again

Large language models still can't replace critical thinking
On March 22, nearly 2,000 people signed an open letter drafted by the Future of Life Institute (FLI) calling for a pause of at least 6 months in the development of large language models (LLMs): Contemporary AI systems are now becoming human-competitive at general tasks, and we must ask ourselves: Should we let machines flood our information channels with propaganda and untruth? Should we automate away all the jobs, including the fulfilling ones? Should we develop nonhuman minds that might eventually outnumber, outsmart, obsolete and replace us? Should we risk loss of control of our civilization? FLI is a nonprofit organization concerned with the existential risks posed by artificial intelligence. Its president is Max Tegmark, an MIT professor who is no

An Illusion of Emergence, Part 2

A figure can tell a story but, intentionally or unintentionally, the story that is told may be fiction
I recently wrote about how graphs that use logarithms on the horizontal axis can create a misleading impression of the relationship between two variables. The specific example I used was the claim made in a recent paper (with 16 coauthors from Google, Stanford, UNC Chapel Hill, and DeepMind) that scaling up the number of parameters in large language models (LLMs) like ChatGPT can cause “emergence,” which they define as qualitative changes in abilities that are not present in smaller-scale models but are present in large-scale models; thus they cannot be predicted by simply extrapolating the performance improvements on smaller-scale models. They present several graphs similar to this one that seem to show emergence: However, their graphs have the logarithms of the number
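The arithmetic behind the illusion can be sketched with a toy metric that is, by construction, a smooth linear function of the parameter count (the function and model sizes below are hypothetical, not taken from the paper):

```python
# A toy "benchmark score" that is exactly linear in the number of parameters.
def toy_metric(params):
    return params / 1e9  # hypothetical smooth metric, no emergence anywhere

# Equally spaced ticks on a logarithmic axis represent tenfold jumps in scale.
sizes = [10 ** k for k in range(6, 12)]  # 1e6 ... 1e11 parameters
scores = [toy_metric(n) for n in sizes]
for n, s in zip(sizes, scores):
    print(f"{n:>15,d} params -> score {s:g}")

# Plotted against log(params), the curve looks flat and then shoots upward,
# because each successive gap between plotted points is ten times the last.
gaps = [b - a for a, b in zip(scores, scores[1:])]
```

Even though the underlying relationship is perfectly linear, the log-spaced points trace out a hockey stick — flat, flat, flat, then an apparent explosion — which is the "emergence" the graphs seem to show.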

A Graph Can Tell a Story—Sometimes It’s an Illusion

Mistakes, chicanery, and "chartjunk" can undermine the usefulness of graphs
A picture is said to be worth a thousand words. A graph can be worth a thousand numbers. Graphs are, as Edward Tufte titled his wonderful book, the “visual display of quantitative information.” Graphs should assist our understanding of the data we are using. Graphs can help us identify tendencies, patterns, trends, and relationships. They should display data accurately and encourage viewers to think about the data rather than admire the artwork. Unfortunately, graphs are sometimes marred (intentionally or unintentionally) by a variety of misleading techniques or by what Tufte calls “chartjunk” that obscures rather than illuminates. I have described elsewhere many ways in which mistakes, chicanery, and chartjunk can undermine the usefulness of graphs. I recently saw a novel

Learning to Communicate

Why writing skills are so important, especially in today's artificial world
Educators have been shaken by fears that students will use ChatGPT and other large language models (LLMs) to answer questions and write essays. LLMs are indeed astonishingly good at finding facts and generating coherent essays — although the alleged facts are sometimes false and the essays are sometimes tedious BS supported by fake references. I am more optimistic than most. I am hopeful that LLMs will be a catalyst for a widespread discussion of our educational goals. What might students learn in schools that will be useful long after they graduate? There are many worthy goals, but critical thinking and communication skills should be high on any list. I’ve written elsewhere about how critical thinking abilities are important for students and cannot be reliably faked by

Text Generators, Education, and Critical Thinking: an Update

The fundamental problem remains that, not knowing what words mean, AI has no critical thinking abilities
This past October, I wrote that educational testing was being shaken by the astonishing ability of GPT-3 and other large language models (LLMs) to answer test questions and write articulate essays. I argued that, while LLMs might mimic human conversation, they do not know what words mean. They consequently excel at rote memorization and BS conversation but struggle mightily with assignments that are intended to help students develop their critical thinking abilities, such as:
● Develop and defend a reasonable position
● Judge well the quality of an argument
● Identify conclusions, reasons, and assumptions
● Judge well the credibility of sources
● Ask appropriate clarifying questions
Lacking any understanding of semantics, LLMs can do none of this. To illustrate, I asked

Let’s Take the “I” Out of AI

Large language models, though impressive, are not the solution. They may well be the catalyst for calamity.
When OpenAI’s text generator, ChatGPT, was released to the public this past November, the initial reaction was widespread astonishment. Marc Andreessen described it as, “Pure, absolute, indescribable magic.” Bill Gates said that the creation of ChatGPT was as important as the creation of the internet. Nvidia’s CEO, Jensen Huang, said that, “ChatGPT is one of the greatest things ever created in the computing industry.” Conversations with ChatGPT are, indeed, very much like conversations with a super-intelligent human. For many, it seems that the 70-year search for a computer program that could rival or surpass human intelligence has finally paid off. Perhaps we are close to the long-anticipated singularity where computers improve rapidly and autonomously,

Does New A.I. Live Up to the Hype?

Experts are finding ChatGPT and other LLMs unimpressive, but investors aren't getting the memo
The original article was featured at Salon on February 21, 2023. On November 30, 2022, OpenAI announced the public release of ChatGPT, a large language model (LLM) that can engage in astonishingly human-like conversations and answer an incredible variety of questions. Three weeks later, Google’s management — wary that they had been publicly eclipsed by a competitor in the artificial intelligence technology space — issued a “Code Red” to staff. Google’s core business is its search engine, which currently accounts for 84% of the global search market. Their search engine is so dominant that searching the internet is generically called “googling.” When a user poses a search request, Google’s search engine returns dozens of helpful

Goodhart’s Law and Scientific Innovation in Academia

Many university researchers are leaving academia so they can actually get things done
British economist Charles Goodhart was a financial advisor to the Bank of England from 1968 to 1985, a period during which many economists (“monetarists”) believed that central banks should ignore unemployment and interest rates. Instead, they believed that central banks should focus on maintaining a steady rate of growth of the money supply. The core idea was that central banks could ignore economic booms and busts because they are short-lived and self-correcting (Ha! Ha!) and should, instead, keep some measure of the money supply growing at a constant rate in order to keep the rate of inflation low and constant. The choice of which money supply to target was based on how closely it was statistically correlated with GDP. The British monetary authorities adopted this policy in

Large Language Models Can Entertain but Are They Useful?

Humans who value correct responses will need to fact-check everything LLMs generate
In 1987 economics Nobel Laureate Robert Solow said that the computer age was everywhere—except in productivity data. A similar thing could be said about AI today: It dominates tech news but does not seem to have boosted productivity a whit. In fact, productivity growth has been declining since Solow’s observation. Productivity increased by an average of 2.7% a year from 1948 to 1986, but by less than 2% a year from 1987 to 2022. Labor productivity is the amount of goods and services we produce in a given amount of time—output per hour. More productive workers can build more cars, construct more houses, and educate more children. More productive workers can also enjoy more free time. If workers can do in four days what used to take five days, they can produce 25 percent more—or
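The back-of-the-envelope arithmetic here is worth making explicit; the 36-year horizon below is my approximation of the 1987–2022 span:

```python
# Doing five days' work in four raises output per day by a quarter.
old_days, new_days = 5, 4
gain = old_days / new_days - 1  # 5/4 = 1.25, i.e., 25 percent more per day

# Small differences in annual productivity growth compound substantially.
years = 36  # roughly 1987-2022
high_growth = 1.027 ** years  # at the 1948-1986 average rate of 2.7%
low_growth = 1.02 ** years    # at a post-1987 rate of 2.0%
print(f"Gain from the four-day example: {gain:.0%}")
print(f"Cumulative growth over {years} years: "
      f"{high_growth:.2f}x at 2.7% vs {low_growth:.2f}x at 2.0%")
```

The compounding comparison shows why a seemingly small slowdown in the annual growth rate matters so much over a working lifetime.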

Chatbots: Still Dumb After All These Years

Intelligence is more than statistically appropriate responses
This story, by Pomona College business and investment prof Gary Smith was #6 in 2022 at Mind Matters News in terms of reader numbers. As we approach the New Year, we are rerunning the top ten Mind Matters News stories of 2022, based on reader interest. At any rate: “Chatbots: Still dumb after all these years.” (January 3, 2022) In 1970, Marvin Minsky, recipient of the Turing Award (“the Nobel Prize of Computing”), predicted that within “three to eight years we will have a machine with the general intelligence of an average human being.”  Fifty-two years later, we’re still waiting. The fundamental roadblock is that, although computer algorithms are really, really good at identifying statistical patterns, they have no way of knowing what these

Large Language Models Are An Unfortunate Detour in AI

Gary Smith: Even though LLMs have no way of assessing the truth or falsity of the text they generate, the responses sound convincing
For decades, computer scientists have struggled to construct systems possessing artificial general intelligence (AGI) that rivals the human brain—including the ability to use analogies, take into account context, and understand cause-and-effect. Marvin Minsky (1927–2016) was hardly alone in his overly optimistic 1970 prediction that, “In from three to eight years we will have a machine with the general intelligence of an average human being.” AGI turned out to be immensely more difficult than imagined and researchers turned their attention to bite-size projects that were doable (and profitable). Recently, large language models (LLMs) — most notably OpenAI’s GPT-3 — have fueled a resurgence of hope that AGI is almost here. GPT-3 was trained by breaking 450 gigabytes of

Has the Bitcoin Supply of Greater Fools Finally Been Exhausted?

It’s a bubble fueled by babble
The Declaration of Bitcoin’s Independence, endorsed by numerous celebrities, states that, We hold these truths to be self-evident. We have been cyclically betrayed, lied to, stolen from, extorted from, taxed, monopolized, spied on, inspected, assessed, authorized, registered, deceived, and reformed. We have been economically disarmed, disabled, held hostage, impoverished, enervated, exhausted, and enslaved. And then there was bitcoin. Bitcoin and other cryptocurrencies are, in reality, not useful alternatives to cash, checking accounts, debit cards, credit cards, and the other components of the financial system that powers the economies of developed countries. Blockchain technology is slow, expensive, and environmentally unfriendly. In 2021, Cambridge University researchers

This Time, Houston Was Blessed More by Luck Than by Stolen Signs

The victory parade over, let’s look at whether luck had more to do with the Astros’ success than Astro fans want to admit
The Houston Astros are the 2022 Major League Baseball (MLB) World Champion — this time, as far as we know, without relying on electronically stolen pitching signs sent to batters by banging trashcan lids or using buzzers hidden under uniforms. Now that the champagne has popped and the victory parade has been held, let’s consider the fact that maybe, just maybe, luck had more to do with the Astros’ success than Astro fans want to admit. Athletes and fans want to believe that the team that wins the World Series, Super Bowl, or any other championship is the best team that year. The reality is that in every sport — some more than others — outcomes are influenced by good fortune or bad. In football, fumbles bounce erratically; officials make inexplicable calls and non-calls;

What Does AI in Education Mean for Critical Thinking Skills?

Students, as reported at Motherboard, are increasingly using GPT-3 and other text-generator programs to write essays for them
The COVID pandemic pushed a lot of school coursework to the internet, with an increased reliance on true/false and multiple-choice tests that can be taken online and graded quickly and conveniently. Not surprisingly, once questions went online, so did answers, with several companies posting (for a fee) solutions for students who would rather Google answers than watch Zoomed lectures. To fit into a true/false or multiple-choice format, the questions are generally little more than a recitation of definitions, facts, and calculations. Here, for example, are three statistics questions I found at a question/answer site:
Question: True or false: A group of subjects selected from the group of all subjects under study is called a sample.
Answer: True
Question: You are interested

More Hard Math Does Not Necessarily Mean More Useful Solutions

It is sometimes tempting to overemphasize the math and underemphasize the relevance
Math is said to be the language of science in that most (but definitely not all) scientific models of the world involve mathematical equations. The Pythagorean theorem, the normal distribution, Einstein’s energy-mass equivalence, Newton’s second law of motion, Newton’s universal law of gravitation, Planck’s equation. How could any of these remarkable models be expressed without math? Unfortunately, it is sometimes tempting to overemphasize the math and underemphasize the relevance. The brilliance of the models listed above lies not in mathematical pyrotechnics but, if anything, in their breathtaking simplicity. Useful models help us understand the world and make reliable predictions. Math for the sake of math does neither. Examples of mindless math are legion. I will give
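For reference, the models named above really are strikingly compact when written in their standard textbook forms:

```latex
a^2 + b^2 = c^2
\qquad
f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2 / 2\sigma^2}
\qquad
E = mc^2
\qquad
F = ma
\qquad
F = G\,\frac{m_1 m_2}{r^2}
\qquad
E = h\nu
```

Each fits on a single line, yet each describes an enormous range of real-world phenomena — simplicity, not pyrotechnics, is what makes them powerful.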

The Hyper-Specialization of University Researchers

So many papers are published today in increasingly narrow specialties that, if there is still a big picture, hardly anyone can see it
The Bible warns that, “Of making many books there is no end; and much study is a weariness of the flesh.” Nowadays, the endless making of books is dwarfed by the relentless firehose of academic research papers. A 2010 study published in the British Medical Journal reported that the U.S. National Library of Medicine includes 113,976 papers on echocardiography — which would weary the flesh of any newly credentialed doctor specializing in echocardiography: We assumed that he or she could read five papers an hour (one every 10 minutes, followed by a break of 10 minutes) for eight hours a day, five days a week, and 50 weeks a year; this gives a capacity of 10000 papers in one year. Reading all papers referring to echocardiography… would take 11 years and 124 days, by which time at

Step Away From Stepwise Regression (and Other Data Mining)

Stepwise regression, which is making a comeback, is just another form of HARKing — Hypothesizing After the Results are Known
There is a strong correlation between the number of lawyers in Nevada and the number of people who died after tripping over their own two feet. There are similarly impressive correlations between U.S. crude oil imports and the per capita consumption of chicken — and the number of letters in the winning word in the Scripps National Spelling Bee and the number of people killed by venomous spiders. If you find these amusing (as I do), there are many more at the website Spurious Correlations. These silly statistical relationships are intended to demonstrate that correlation is not causation. But no matter how often or how loudly statisticians shout that warning, many people do not hear it. When there is a correlation between variables A and B, it could be that: ● A causes B —
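Stepwise regression's first step — keep whichever candidate predictor is most correlated with the outcome — finds impressive-looking noise with ease. In the sketch below every variable is pure random noise, and the variable names are invented:

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

random.seed(0)
n_obs, n_candidates = 20, 50
y = [random.gauss(0, 1) for _ in range(n_obs)]           # the "outcome"
candidates = {f"x{i}": [random.gauss(0, 1) for _ in range(n_obs)]
              for i in range(n_candidates)}               # 50 noise predictors

# Stepwise selection's first move: keep whichever predictor fits best.
best_name, best_r = max(((name, pearson(xs, y)) for name, xs in candidates.items()),
                        key=lambda t: abs(t[1]))
print(f"Best of {n_candidates} random predictors: {best_name}, r = {best_r:.2f}")
```

Searching 50 candidates practically guarantees at least one sizable sample correlation; reporting only the winner, as if it had been the hypothesis all along, is exactly the HARKing the subtitle describes.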

Don’t Worship Math: Numbers Don’t Equal Insight

The unwarranted assumption that investing in stocks is like rolling dice has led to some erroneous conclusions and extraordinarily conservative advice
My mentor, James Tobin, considered studying mathematics or law as a Harvard undergraduate but later explained that “I studied economics and made it my career for two reasons. The subject was and is intellectually fascinating and challenging, particularly to someone with taste and talent for theoretical reasoning and quantitative analysis. At the same time it offered the hope, as it still does, that improved understanding could better the lot of mankind.” I was an undergraduate math major (at Harvey Mudd, not Harvard) and chose economics for much the same reasons. Mathematical theories and empirical data can be used to help us understand and improve the world. For example, during the Great Depression in the 1930s, governments everywhere had so

Big Brother Is Watching You (And Trying to Read Your Mind)

Chinese researchers now claim to have developed technology that can read our minds
One of the most popular story lines in the widely acclaimed television show The Good Wife (2009–2016) is when National Security Agency (NSA) techies entertain themselves by eavesdropping on the heroine’s personal life. It clearly resonated with viewers and reinforced the fears of many that the NSA might be listening to their conversations. Indeed, they might be. In 2013 James Clapper, Director of National Intelligence, was asked by U.S. Senator Ron Wyden about whether NSA collects “any type of data at all on millions or hundreds of millions of Americans.” Clapper answered, under oath, “No sir, not wittingly.” Clapper had been informed the day before that he would be asked this question and he was offered an opportunity the day after to amend his answer — Clapper