Craig Mundie of Microsoft thinks that Tiger, his video-on-demand operating system, signals a fundamental shift in the computer industry. Ruling the new era will be bandwidth measured in billions of bits per second rather than in the millions of instructions per second of current computers.
“We’ll have infinite bandwidth in a decade’s time.” Bill Gates, PC Magazine, Oct. 11, 1994.
Andrew Grove, Titan of Intel, is widely known for his belief, born in the vortex of the Hungarian Revolution and honed in the trenches of Silicon Valley, that “only the paranoid survive.” If so, the Intel chief may soon need to resharpen the edges of fear that have driven his company to the top. Looming on the horizons of the global computer industry that Grove now shapes and spearheads is a gathering crest of change that threatens to reduce the microprocessor’s supremacy and reestablish the information economy on new foundations. Imparting a personal edge to the challenge are the restless energies of Microsoft’s Bill Gates and Tele-Communications Inc.’s John Malone, providing catalytic capital and leadership for the new tides of the telecosm.
Grove’s response is seemingly persuasive. “We have state-of-the-art silicon technology, state-of-the-art microprocessor design skills and we have mass production volumes.” These huge assets endow Intel as a global engine of growth with 55% margins and more than 80% market share in the single most important product in the world economy. Why indeed should Grove worry?
One word only may challenge him and with him much of the existing computer establishment. Let us paraphrase a 1988 speech by John Moussouris, chairman and chief executive of the amazing Silicon Valley startup MicroUnity, which gains a portentous heft from being financed heavily by Gates and Malone: If the leading sage of computer design, in his last deathbed gasp, wanted to impart in one word all of his accumulated wisdom about the coming era to a prodigal son rushing home to inherit the business, that one word would be “bandwidth.” Andy Grove knows it well. Early this year he memorably declaimed: “If you are amazed by the fast drop in the cost of computing power over the last decade, just wait till you see what is happening to the cost of bandwidth.”
Eric Schmidt, chief technical officer of Sun Microsystems, is one of the few men who have measured this coming tide and mastered some of its crucial implications. His key insight is that the onrush of bandwidth abundance overthrows Moore’s Law as the driving force of computer progress. Until now progress in the computer industry has ridden the revelation in 1979 by Intel co-founder Gordon Moore that the density of transistors on chips, and thus the price-performance of computers, doubles every 18 months. Soon, however, Schmidt ordains, bandwidth will be king.
Bandwidth is communications power — the capacity of an information channel to transmit bits without error in the presence of noise. In fiber optics, in wireless communications, in new dumb switches, in digital signal processors, bandwidth will expand from five to 100 times as fast as the rise of microprocessor speeds. With the rapid spread of national networks of fiber and cable, the dribble of kilobits (thousands of bits) from twisted-pair telephone lines is about to become a firehose of gigabits (billions of bits). But the PC is not ready. Attach the firehose to the parallel port of your personal computer and the stream of bits becomes a blast of data smithereens.
Tsunami of Gigabits
The bandwidth bottleneck of telephone wires has long allowed the computer world to live in a strange and artificial isolation. In the computer world, Moore’s Law has reigned. At its awesome exponential pace, computer price-performance would increase some one hundredfold every 10 years. This means that for the price of a current 100 mips (millions of instructions per second) Pentium machine, you could buy a computer in 2004 running 10 billion instructions per second. Since today the fastest bit streams routinely linked to computers run 100 times slower, at 10 megabits per second on an Ethernet, 10 bips seems adequate as a 10-year target. All seems fine in computer land, where users rarely wonder what happens after the wire reaches the wall.
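The hundredfold-per-decade figure follows from compounding. As a minimal sketch, assuming performance doubles exactly every 18 months (the function name and the constant-rate model are illustrative assumptions, not anything Intel publishes), the arithmetic runs like this:

```python
def growth_factor(years: float, doubling_months: float = 18.0) -> float:
    """Multiple gained after `years` if performance doubles every `doubling_months`."""
    return 2.0 ** (years * 12.0 / doubling_months)


decade_factor = growth_factor(10)           # roughly a hundredfold
mips_1994 = 100                             # the article's Pentium figure
mips_2004 = mips_1994 * decade_factor       # roughly 10,000 mips, i.e. about 10 bips

print(f"{decade_factor:.1f}x over ten years, {mips_2004 / 1000:.1f} bips")
```

Ten years holds six and two-thirds doubling periods, so the multiple comes out just over 100, which is where the 10 bips target comes from.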
In the face of the 10 times faster increase in bandwidth, however, Moore’s Law seems almost paltry. The rise in bandwidth does not follow the smooth incremental ascent that the heroic exertions, inventions and investments of Andy Grove and his followers have maintained in microchips. Bandwidth bumps and grinds and then volcanically erupts. The communications equivalent of those 10 bips, which the existing trend would take 10 years to reach, is a 10-gigabit-per-second connection, and carriers expect to offer such connections to their corporate customers next year.
In the very period of apparent bandwidth doldrums during the 1980s, phone companies installed some 10 million kilometers of optical fiber. So far only an infinitesimal portion of its potential bandwidth has been delivered to customers. Moussouris estimates that the bandwidth of fiber has been exploited one million times less fully than the bandwidth of coax or twisted pair copper.
Nonetheless, the tide is now gathering toward a crest. This year, MCI offers its corporate customers access to a fiber connection at 2.4 gigabits per second. Next year that link will run at 10 gigabits per second for the same price. Two years after that it is scheduled to rise to 40 gigabits per second. Meanwhile, at Martlesham Heath in the United Kingdom, home of British Telecom’s research laboratories, Peter Cochrane announced in early September that he could send some 700 separate wavelength streams in parallel down a single fiber-optic thread the width of a human hair. Peter Scovell of Northern Telecom’s Bell Northern Research facility declares that by using “solitons” (an exotic method of keeping the bits intact at high speeds through a kind of surface tension counterbalancing dispersion in the fiber), it will be possible to carry 2.4 gigahertz (billions of cycles per second) on each wavelength stream. That would add up to roughly 1,700 gigahertz on every fiber thread.
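The per-fiber total is simply streams multiplied by the rate on each stream. A small sketch of that product, using only the figures quoted above (the function names are illustrative, and the stream count is the article's approximate "some 700"):

```python
import math


def aggregate_ghz(streams: int, ghz_per_stream: float) -> float:
    """Total capacity of one fiber carrying `streams` parallel wavelengths."""
    return streams * ghz_per_stream


def streams_needed(target_ghz: float, ghz_per_stream: float) -> int:
    """Smallest whole number of wavelength streams that reaches `target_ghz`."""
    return math.ceil(target_ghz / ghz_per_stream)


per_fiber = aggregate_ghz(700, 2.4)
print(f"{per_fiber:,.0f} GHz per fiber")
print(streams_needed(1700, 2.4), "streams to pass 1,700 GHz")
```

With exactly 700 streams the product is 1,680 gigahertz, so a count only slightly above 700 is enough to pass the 1,700 mark.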
Blocking such bandwidths until recently was what is called in the optics trade the “electronic bottleneck.” The light signals had to be converted to electronic pulses every 35 kilometers in order to be amplified and regenerated. Thus fiber optics could not function any faster than these electronic amplifiers did, or between two and 10 gigahertz. In the late 1980s, however, a team led by David Payne of the University of Southampton pioneered the concept of doping a fiber with the rare earth element erbium, to create an all-optical broadband amplifier. Perfected at Bell Labs, NTT and elsewhere, this device overcomes the electronic bottleneck and allows communications entirely at the speed of light.
IBM’s optical guru Paul Green prophesies that within the next decade or so it will be possible to send some 10,000 wavelength streams down a single fiber thread. Long prophesied by fiber optics pioneer Will Hicks, these developments remain mostly in the esoteric domains of optical laboratories. But IBM recently installed its first all-optical product — its MuxMaster — for a customer running 20 wavelengths on a fiber connecting offices in New York to a backup tape drive in New Jersey. Telephone companies from Italy to Canada are now deploying erbium-doped amplifiers. Long the frenzied pursuit of telecom laboratories from Japan to Dallas and government bodies from ARPA to NTT (now turning private), all-optical networks have become the object of entrepreneurial startups, such as Ciena and Erbium Networks.
Returning from the ethers of innovation to existing broadband technology connecting to people’s homes, Craig Tanner of CableLabs in Louisville, Colo., maintains that a typical cable coax line can accommodate two-way streams of data totaling eight gigabits per second. In Cambridge and other eastern Massachusetts cities, Continental Cablevision is now taking the first steps toward delivering some of this bandwidth for Andy Grove’s PC users. Today, using Digital Equipment’s LANCity broadband two-way cable modems, David Fellows, Continental’s chief technical officer, can offer 10 megabits per second Ethernet capability 70 miles from your office. That increases the current 9.6 kilobits per second speeds of most telephone modems by a factor of 1,000.
The most important short-term contributor to the tides of bandwidth is a new communications technology called asynchronous transfer mode. ATM is to telecommunications what containerization is to transport. It puts everything into same-sized boxes that can be readily handled by automated equipment. Just as containerization revolutionized the transport business, ATM is revolutionizing communications. In the case of ATM, the boxes are called cells and each one is 53 bytes long, including a five-byte address. The telephone industry chose 53 bytes as the largest possible container that could deliver real-time voice communications. But the computer industry embraced it because it allows fully silicon switching and routing. Free of complex software, small packets of a uniform 53 bytes can be switched at enormous speeds through an ATM network and dispatched to the end users on a fixed schedule that can accommodate voice, video and data, all at once.
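The container idea can be made concrete with a short sketch. The 53-byte cell and five-byte header come from the description above; the header layout used here (a four-byte channel number plus a one-byte last-cell flag) is a simplification invented for illustration and is not the real ATM header format, and `to_cells` is a hypothetical helper.

```python
from typing import List

CELL_SIZE = 53                              # bytes per cell, per the article
HEADER_SIZE = 5                             # bytes of addressing per cell
PAYLOAD_SIZE = CELL_SIZE - HEADER_SIZE      # 48 bytes left for data


def to_cells(data: bytes, channel: int) -> List[bytes]:
    """Chop `data` into fixed-size cells, zero-padding the final payload.

    Header here is an ASSUMED toy layout: 4-byte channel id + 1-byte last-cell flag.
    """
    cells = []
    for offset in range(0, len(data), PAYLOAD_SIZE):
        chunk = data[offset:offset + PAYLOAD_SIZE]
        is_last = offset + PAYLOAD_SIZE >= len(data)
        header = channel.to_bytes(4, "big") + bytes([1 if is_last else 0])
        cells.append(header + chunk.ljust(PAYLOAD_SIZE, b"\x00"))
    return cells


cells = to_cells(b"x" * 100, channel=7)
print(len(cells), "cells, each", len(cells[0]), "bytes")
```

Because every cell is the same length, a switch never has to parse a length field or buffer a variable-size packet before forwarding it, which is what makes an all-hardware switching path practical.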
Available at rates of 155 megabits per second and moving this year to 622 megabits and 2.4 gigabits, ATM switches from Fujitsu, IBM, AT&T, Fore Systems, Cisco Systems, SynOptics Communications and every other major manufacturer of hubs and routers will swamp the ports of personal computers over the next five years.
Why should all this bandwidth arouse the competitive fire of Andy Grove? The new explosions of bandwidth enable interactive multimedia and video, riding on radio frequencies, into every household: through the air from satellites and terrestrial wireless systems, through fiber-optic threads and cable TV and even phone-company coax.
If the personal computer cannot handle these streams, John Malone’s set-top boxes, Sega or Nintendo game machines or Bill Gates’s new communications technology will. A communications technology that can manage multimedia in full flood can also in time relegate one of Grove’s CPUs to service as a minor peripheral. The huge promise of the PC industry, with its richness of productivity tools and cultural benefits, could give way to an incoherent babel of toys, videophones and 3D games.
Redeeming the new era for the general-purpose PC entails overcoming the technical culture and mindset of bandwidth scarcity. In today’s world of bandwidth scarcity, arrays of special-purpose microprocessors constantly use their hard-wired computer cycles to compensate for the narrow bandwidth of existing channels and to make up for the small capacity of the fast, expensive memories where the data must be buffered or stored on the way. This is the world that Intel dominates today: a world of CPUs incapable of handling full multimedia and radio frequency demands, a world of narrowband four-kilohertz pipes to the home accessed by modems at 9.6 kilobits per second and a world of what Moussouris calls arrays of “twisty little processors,” such as MPEG (Motion Picture Experts Group) decoders from C-Cube and IIT, graphics accelerators from Texas Instruments and an array of chips from Intel.
By fixing the necessary algorithms in hardware, these devices bypass the time-consuming tasks of retrieving software instructions and data from memory. Thus these chips can perform their functions at least 100 times faster than more general-purpose devices, such as Intel’s Pentium, that use software. But all this speed comes at the cost of rigid specialization. An MPEG-1 processor cannot even decode MPEG-2. When the technology changes, you have to replace the chip. Such special-purpose devices now handle the broadband heavy lifting for video compression and decompression, digital radio processing, voice and sound synthesis, speech recognition, echo cancellation, graphics acceleration and other functions too demanding for the central processor.
By contrast, contemplate a world of bandwidth abundance. In a world of bandwidth abundance, specialized, hard-wired processing will be mostly unnecessary. In the extreme case, images can flow uncompressed through the network and onto the display. Bandwidth will have obviated thousands of mips of processing. The microprocessor instead can focus on managing documents on the screen, popping up needed information from databases, performing simulations or visualizations and otherwise enriching the conference. The arrival of bandwidth abundance transforms the computing environment.
Led by Grove’s and Intel’s bold investments in chip-making capability (some $2.4 billion in 1994 alone), the entire information industry has waxed fat and happy on the bonanzas of Moore’s Law. Now, however, some industry leaders are gasping for breath. Eckhard Pfeiffer of Compaq has denounced Intel’s avid campaign to shift customers toward the leading-edge processors such as Pentium, embodying the latest Moore’s Law advances. Gordon Moore himself has recently questioned whether the pace of microchip progress can continue in the face of wafer factory costs rising toward $2 billion for a typical “fab.” He has pronounced a new Moore’s Law: The costs of a wafer fab double for each new generation of microprocessor.
Sorry, but the new world of the telecosm offers no rest for weary microchip magnates or future-shocked PC producers. Driven by the new demands of video and multimedia, the pace of advance will now accelerate sharply rather than slow down.
Feeding the Tiger
Contemplate the advance of the Tiger, Microsoft’s all-software scheme for video-on-demand based entirely on PCs. Although Tiger has been presented as merely another way to build a “movie central” for cable headends or telco central offices, its real promise is not to redeem the existing centralized structure of video but to allow any PC owner to create a headend in the kitchen for video-on-demand. Today, such capability would mean buying a supercomputer plus an array of expensive boards containing special-purpose processors. Tiger’s consummation as a popular product therefore will require a new regime of semiconductor progress.
Driven by this imperative, a pioneering combine of Gates, Malone and Moussouris is making an audacious grab for supremacy in the telecosm. Just three miles from Intel and fueled by ideas from a 1984 defector from an Intel fabrication team, Moussouris’s MicroUnity is a flagrantly ambitious Sunnyvale, Calif., startup launched in 1988. Fueled by some $15 million from Microsoft and $15 million from TCI, among several other rumored backers, it plans a transformation of chip-making for the age of the telecosm, optimized for communications rather than computations.
MicroUnity’s goal is a general-purpose mediaprocessor, software programmable, that can run at no less than 400 billion bits per second — some hundreds of times faster than a Pentium — and perform all the functions currently done in special-purpose multimedia devices. Escaping the tyranny of fixed hardware standards, the mediaprocessor could receive decompression codes and other protocols, algorithms and services over the network with the video to be displayed in real time.
The Great Bandwidth Switch
In launching Tiger and MicroUnity, Gates and Malone are signaling a fundamental shift in the industry. Ruling the new era will be bandwidth or communications power, measured in billions of bits per second rather than in the millions of instructions per second of current computers. The telecosmic shift from mips to bandwidth, from storage-oriented computing to communications processing, will change the entire structure of information technology.
In the past, the industry has been driven by increases in computer power embodied in new generations of microprocessors, from the 8086 to the Pentium and on to the P-6 and new Reduced Instruction Set screamers such as the Power PC, Digital Equipment’s Alpha and Silicon Graphics’ new R-1000 (the latest in the family from Moussouris’s previous company Mips Computer, now owned by Silicon Graphics). External computer networks typically run much more slowly than internal networks, the backplane buses connecting microprocessors, memories, keyboards and screens. These buses race along at some 40 megabits per second, up to Intel’s new gigabit-per-second PCI bus. Even when computers are linked in local area networks in particular buildings at 10 megabit-per-second Ethernet speeds, they face a communications cliff at LAN’s end: the four-kilohertz wires of the telephone company. Under this regime, the processor is king and Moore’s Law dictates the pace of change.
In the age of the telecosm, however, all these rules collapse. When the network increasingly runs faster than the processors and buses in the PC, the computer “hollows out,” in the words of Eric Schmidt. The network becomes the bus and any set of interconnected processors and memories can become a computer regardless of their location. In this bandwidth-driven world, the key chips are communications processors, such as digital signal processors (DSPs) and MicroUnity’s mediaprocessors, which must function at the pace of the network firehose rather than at the pace of the Pentium.
For the last five years, communications processors have indeed been improving their price/performance tenfold every two years, more than three times as fast as microprocessors. This kind of difference adds up. Soaring DSP capabilities have already made possible the achievement of many new digital technologies previously unattainable. Among them are digital video compression, video teleconferencing, broadband digital radios pioneered by Steinbrecher (see Forbes ASAP, April 11, 1994), digital echo cancellation and spread-spectrum cellular systems that allow 100% frequency reuse in every cell. All these schemes require processing speeds far in excess of the bit rate of the information.
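One way to check the "more than three times" comparison is to put both trends on the same two-year clock. This sketch assumes the 18-month doubling period cited earlier for microprocessors and reads "times as fast" as the ratio of the two-year gains; that reading is an assumption, and other readings give other numbers.

```python
MOORE_DOUBLING_MONTHS = 18      # microprocessor doubling period, from the article
DSP_TENFOLD_MONTHS = 24         # communications processors: tenfold every two years

# Gain a microprocessor achieves in the same two years under 18-month doubling.
moore_two_year_gain = 2.0 ** (DSP_TENFOLD_MONTHS / MOORE_DOUBLING_MONTHS)
dsp_two_year_gain = 10.0

advantage = dsp_two_year_gain / moore_two_year_gain
print(f"microprocessors: {moore_two_year_gain:.2f}x, DSPs: {dsp_two_year_gain:.0f}x, "
      f"ratio {advantage:.2f}")
```

On this reading a microprocessor gains about two and a half times in two years, so a tenfold gain is close to four times larger.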
For example, in accord with the prevailing MPEG standards, digital video compression produces a bit stream running at between 1.5 and six megabits per second. But in order to produce a signal manageable by a 100 mips Pentium, a supercomputer or special-purpose machine must process raw video bits flowing 100 times as fast as the compressed format: uncompressed video at a pace of 150 to 600 megabits per second. The complex and exacting process of compressing this onrush of bits (compensating for motion, comparing blocks of pixels for redundancy, smoothing out the flow of data) entails computer operations running 1,000 times as fast as the raw video bits. That is, the video compression algorithm requires a processing speed of between 150 and 600 gigabits per second, hundreds of times faster than the Pentium.
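The chain of multipliers in this example is easy to lose track of, so here it is as a short calculation. The multipliers (100 for raw versus compressed, 1,000 for operations versus raw bits) are the article's round figures; the function and variable names are illustrative.

```python
RAW_MULTIPLE = 100       # raw video runs 100x the compressed bit rate
OPS_MULTIPLE = 1000      # compression work runs 1,000x the raw bit rate


def compression_load(compressed_mbps: float) -> dict:
    """Return the raw video rate (Mbps) and processing rate (Gbps) for one stream."""
    raw_mbps = compressed_mbps * RAW_MULTIPLE
    processing_gbps = raw_mbps * OPS_MULTIPLE / 1000.0   # convert Mbps to Gbps
    return {"raw_mbps": raw_mbps, "processing_gbps": processing_gbps}


low = compression_load(1.5)
high = compression_load(6.0)
print(low)
print(high)
```

The two multipliers compound to a factor of 100,000 between the compressed stream and the work needed to create it, which is why the encoder, not the decoder, is the supercomputer-class problem.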
Similarly, just to digitize radio signals requires a sampling rate twice as fast as the radio frequency — at a time when new wireless personal communications systems are moving to the two gigahertz bands and wireless cable is moving to 28 gigahertz. A broadband digital radio must handle some large multiple of the highest frequency it will process. Code division multiple access (CDMA) cellular systems depend on a spreading code at least 100 times faster than the bit rate of the message.
In order to feed the Tiger and other such bandwidth-hungry systems, communications processors will have to continue this breathtaking binge of progress beyond the bounds of the microcosm. Grove does not believe this possible. He contends that the surge in DSP will dwindle and converge with Moore’s Law, allowing the central processor to suck in functions currently performed in digital signal processors and other communications chips. DSP is nice, Grove observes, “but it is not free” (unless, that is, it is performed in the Intel CPU, obviating the need to buy a DSP chip at all).
But in an era when the network advances faster than the CPU, it is more likely that communications processors will gradually “suck in” and “hollow out” the functions of the CPU, rather than the other way around. Echoing Sun’s perennial slogan, Schmidt predicts that the network will become the computer. In this era, Moore’s Law and the law of the microcosm are no longer the driving force of progress in information technology. Bandwidth is king.
As the great pioneer of communications theory Claude Shannon wrote in 1948, bandwidth is a replacement for switching. Since ultimately a microprocessor is a set of millions of transistor switches inscribed on a chip, bandwidth can even serve as a substitute for mips. With sufficient communication, engineers can duplicate any computer network topology they want. As the network becomes the computer, they thus redefine the optimal architectures of computing. As an example, take the problem of video-on-demand now being confronted by every major company in the industry from IBM to Microsoft.
In 1992, Microsoft assigned this problem to Craig Mundie, a veteran of Data General in Massachusetts, who had gone on to found Alliant Computer, one of the more successful of the massively parallel computer firms. As a supercomputer man, Mundie initially explored a hardware solution, hiring a team of computer designers from Supercomputer Systems Inc. SSI was Steve Chen’s effort to follow up on his successes at Cray Research with a machine for IBM. Although IBM ultimately closed SSI down, Chen commanded some of the best talent in supercomputers. Mundie hired George Spix and a team from SSI.
Looking to Software
On the surface, video-on-demand seems a supercomputer task. It entails taking tens of thousands of streams of digital images, smoothing them into real-time flows, and switching them to the customers requesting them. Essentially huge hierarchies of storage devices, including fast silicon memories, connected through a specialized switching fabric to arrays of fast processors, supercomputers seem perfectly adapted to video-on-demand, which, as Bill Gates explains, is “essentially a switching problem.” This is the solution chosen by Oracle Systems, using its nCube supercomputer, and by Silicon Graphics, employing its PowerChallenge server.
According to Mundie, the SSI team developed an impressive video server design. But they soon discovered they were in the wrong company. As Gates told Forbes ASAP, “Microsoft looks for a software solution to all problems. IBM looks for a mainframe hardware solution. Larry Ellison owned a supercomputer company so he looked for that solution. Fortunately for us, software solutions are the most scaleable, flexible, fault-tolerant and low cost.”
Enter Rick Rashid, a professor from Carnegie Mellon and designer of the Mach kernel adopted by Next, IBM and the Open Software Foundation and incorporated in part in Microsoft’s Windows NT operating system. Rashid joined Microsoft in September 1991 and began to focus on video-on-demand in 1992. Like most other people confronting this challenge, he first assumed that the huge bit streams involved would require specialized hardware: RAID (redundant arrays of inexpensive disks) storage, fast buffer memories and supercomputer-style switches. Soon, however, he came to the conclusion that progress in the personal computer industry would enable an entirely software solution.
For example, the memory problem illustrates a tradeoff between bandwidth and processing speed. Expensive hierarchies of RAID drives and semiconductor buffer memories managed by complex controller logic can speed the bit streams to the switch at the necessary pace. But Rashid and Mundie saw that bandwidth offered a cheaper solution. Through clever software, you could “stripe” the film bits across large arrays of conventional disk drives and gain speed through bandwidth. Rather than using one fast memory, plus fast processors, and hard-wired fault tolerance to send the movie reliably to a customer, you spread the images across arrays of cheap, slow disk drives (Seagate Barracudas), which, working in parallel, offer bandwidth and redundancy limited only by the number of devices. Having dispensed with the idea of contriving expensive hardware solutions for the memory problem, Rashid recognized that with Windows NT he commanded an operating system with real-time scheduling guarantees that laid the foundation for a software solution. On it, he could proceed to build Tiger as a continuous digital stream operating system.
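Striping itself is a simple idea, and a toy version shows why throughput scales with the number of drives. What follows is a minimal round-robin sketch, not Tiger's actual layout: the block size, the in-memory lists standing in for disks and the per-disk transfer rate are all hypothetical, and the redundancy Tiger adds for fault tolerance is left out.

```python
from typing import List


def stripe(data: bytes, num_disks: int, block_size: int) -> List[List[bytes]]:
    """Deal fixed-size blocks of `data` across `num_disks` in round-robin order."""
    disks: List[List[bytes]] = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), block_size):
        block_index = offset // block_size
        disks[block_index % num_disks].append(data[offset:offset + block_size])
    return disks


def read_back(disks: List[List[bytes]]) -> bytes:
    """Reassemble the stream by visiting each disk in turn, one block per visit."""
    blocks = []
    depth = max((len(d) for d in disks), default=0)
    for row in range(depth):
        for disk in disks:
            if row < len(disk):
                blocks.append(disk[row])
    return b"".join(blocks)


def aggregate_rate(num_disks: int, per_disk_rate: float) -> float:
    """Idealized throughput when every disk streams in parallel."""
    return num_disks * per_disk_rate


movie = bytes(range(256)) * 4                 # 1,024 bytes standing in for film bits
disks = stripe(movie, num_disks=5, block_size=64)
assert read_back(disks) == movie              # striping loses nothing
print(aggregate_rate(5, 4.0), "units/s from five drives at an assumed 4.0 each")
```

Since consecutive blocks live on different drives, a reader pulling the movie in order touches every disk in rotation, so adding drives adds bandwidth without making any single drive faster.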
Liberated from special-purpose hardware, the team could revel in all the advantages of using off-the-shelf personal computer components. Mundie explains: “The personal computer industry commands intrinsic volume and a multi-supplier structure that takes anything in its path and drives its costs to ground.” A burly entrepreneur of massively parallel supercomputers, Mundie became a fervent convert to the manifest destiny of the PC to dominate all other technologies in the race to multimedia services, grinding all costs and functions into the ground of microprocessor silicon.
Video-on-demand has been heralded as the salvation of the television industry, the supercomputer industry, the game industry, the high-end server industry. It has been seen as Microsoft’s move into hardware. Yet nowhere in the Tiger Laboratory in Building Nine is there any device made by any TV company, supercomputer firm, workstation company, or Microsoft itself. On one side of the room are 12 monitors. On the other side are 12 Compaq computers piled on top of each other, said to be simulating set-top boxes. Next to these is a pile of Seagate Barracuda disk drives, each capable of holding the nine gigabytes of video in three high-resolution compressed movies. Next to them is another pile of Compaq computers functioning as video servers.
All this gear works together to extend Microsoft’s long mastery of the science of leverage, getting most of the world to drive costs to ground — or grind cost into silicon — while the grim reapers of Redmond collect tolls on the software. Exploiting another of Sun Microsystems co-founder Bill Joy’s famous laws — “The smartest people in every field are never in your own company” — Gates has contrived to induce most of the personal computer industry, from Bangalore to Taiwan, to work for Microsoft without joining the payroll.
In the new world of bandwidth abundance, however, it is no longer sufficient to leverage the PC industry alone. Gates is now reaching out to leverage the telephone and network equipment manufacturing industries as well. Transforming all this PC hardware into a “Tiger” that can consume the TV industry is an ATM switch. In the Tiger application, once one ATM switch has correctly sequenced the movie bits streaming from the tower of Seagate disks, another ATM switch in a metropolitan public network will dispatch the now ordered code to the appropriate display. Microsoft’s Tiger and its client “cubs” all march in asynchronous transfer mode.
The Masters of Leverage
Why is this a brilliant coup? It positions Microsoft to harvest the fruits of the single most massive and far-reaching project in all electronics today. Some 600 companies are now active in the ATM Forum, with collective investments approaching $10 billion and rising every year. Not only are ATM switches produced by a competitive swarm of companies resembling the PC industry, ATM also turns networks of small computers into scaleable supercomputers. It combines with fiber-optic links to provide a far simpler, more modular and more scaleable solution than the complex copper backplane buses that perform the same functions in large computers. ATM and fiber prevail by using bandwidth as a substitute for complex protocols and computations.
Microsoft Technical Director Nathan Myhrvold points to the Silicon Graphics PowerChallenge superserver as a contrast. “They have a bus that can handle 2.4 gigabytes per second and which is electrically balanced to take a bunch of add-in cards (for processor and memory).” The complexities of this solution yield an expensive machine, costing more than $100,000, with specialized DRAM boards, for example, that cost 10 times as much per megabyte as DRAM in a PC.
This problem is not specific to Silicon Graphics. All supercomputers with multiple microprocessors linked with fast buses face the same remorseless economics and complexities. By contrast, the $30,000 Fore Systems ATM switch being used in Tiger prototypes, together with the PCI buses in the PCs on the network, supplies the same 2.4 gigabytes per second of bandwidth that the PowerChallenge does. And, as Myhrvold points out, “ATM prices are dropping like a stone.”
The Microsoft sage explains: ATM switches linked by fiber optic lines are far more efficient at high bandwidth than copper buses on a backplane. ATM allows “fault tolerance and other issues to be handled in software by treating machines (or disks, or even the ATM switch itself) as being replaceable and redundant, with hot spares standing by.”
As Gates told ASAP, video-on-demand is essentially a switching problem. You can create an expensive, proprietary, and unscaleable switch using copper lines and complex protocols on the backplane of a supercomputer, or you can use the bandwidth of fiber optics and ATM as a substitute for these complexities. You can put the ATM switches wherever you need them to create a system optimized for any application, allowing any group of PCs using Windows NT and PCI buses to function as video clients or servers as desired. As Microsoft leverages the world, it won’t object if the world chooses to lift NT into the forefront of operating systems in unit sales.
Mundie and his assistant Redd Becker earnestly explain the virtues of this scheme and demonstrate its robustness and fault tolerance by disabling several of the disk drives, cubs and servers without perceptibly affecting the 12 images on the screen. They offer it as a system to function as a movie central server resembling the Oracle nCube system adopted by Bell Atlantic, or the Silicon Graphics system used by Time Warner in its heralded Orlando project. But the Tiger is fundamentally different from these systems in that it is completely scaleable and reconfigurable, functioning with full VCR interactivity for a single citizen or for a city. It epitomizes the future of computing in the age of ATM, a system that will soon operate at up to 2.4 gigabits per second. Two point four gigabits per second is more than twice as fast as the Intel PCI bus that links the internal components of a Pentium-based personal computer.
Thus, ATM technology can largely eclipse the difference between an internal hard drive and an external Barracuda, between a video client and a video server. To the CPU, a local area network or even a wide area network running ATM can function as a motherboard backplane. With NT and Tiger software, PCs will be able to tap databases and libraries across the world as readily as they can reach their own hard disks or CD-ROM drives. Presented as an application-specific system for multimedia or movie distribution in real time, it is in fact a new operating system for client-server computing in the new age of image processing.
Gordon Bell, now on Microsoft’s technical advisory board, sums up the future of computing in an ATM world: “We can imagine a network with a range of PC-sized nodes costing between $500 and $5,000 that provide person-to-person communication, television and when used together (including in parallel), an arbitrarily large computer. Clearly, because of standards, ubiquity of service and software market size, this architecture will drive out most other computer structures such as massively parallel computers, low-priced workstations and all but a few special-purpose processors. This doomsday for hardware manufacturers will arrive before the next two generations of computer hardware play out at the end of the decade. But it will be ideal for users.” And for Microsoft.
For manufacturers of equipment that feeds the Tiger, however, what Bell calls “doomsday for hardware manufacturers” may well be as profitable as the current rage of “Doom,” the new computer game infectiously spreading from the Internet into computer stores. The new Tiger model provides huge opportunities for manufacturers of new ATM switches on every scale, for PCs equipped with fast video buses such as PCI, for vendors of network hardware and software, and perhaps most of all for the producers of the new communications processors.
For all the elegance of the Tiger system, however, Gates understands that it cannot achieve its goals within the constraints of Moore’s Law in the semiconductor industry. The vision of “any high school dropout buying PCs and entering the interactive TV business” cannot prevail if it takes a supercomputer to compress the images and an array of special-purpose processors to decode, decrypt and decompress them. Facing an ATM stream of 622 megabits per second — perhaps uncompressed video, 3-D or multimedia images — Eric Schmidt points out, a 100 mips Pentium machine would have to process 1.47 million 53-byte cells a second. That means well under 100 instruction cycles to read, store, display and analyze a packet. Since most computers use many cycles for hidden background tasks, the Pentium could not begin to do the job. Gates’s adoption of Tiger, his alliance with TCI, his investments in Teledesic, Metricom, and MicroUnity, all bring him face-to-face with the limits of current computer technology in confronting the telecosm. With MicroUnity, however, he may have arrived at a solution just in time.
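Schmidt's arithmetic can be checked with a short back-of-envelope sketch. The three input figures come from the paragraph above; the function names are illustrative only, and the calculation assumes, for simplicity, that every bit on the link belongs to a cell and that each mips is one usable instruction.

```python
# Back-of-envelope check of the ATM cell budget quoted in the text.
# Inputs are the article's figures; function names are hypothetical.
LINK_BITS_PER_SECOND = 622e6          # ATM stream rate
CELL_BYTES = 53                       # size of one ATM cell
CPU_INSTRUCTIONS_PER_SECOND = 100e6   # a "100 mips" Pentium


def cells_per_second(link_bps: float, cell_bytes: int) -> float:
    """Cells arriving each second if the link is fully loaded."""
    return link_bps / (cell_bytes * 8)


def instructions_per_cell(cpu_ips: float, link_bps: float, cell_bytes: int) -> float:
    """Instructions the CPU can spend on each cell before the next one arrives."""
    return cpu_ips / cells_per_second(link_bps, cell_bytes)


cells = cells_per_second(LINK_BITS_PER_SECOND, CELL_BYTES)
budget = instructions_per_cell(
    CPU_INSTRUCTIONS_PER_SECOND, LINK_BITS_PER_SECOND, CELL_BYTES
)
print(f"{cells / 1e6:.2f} million cells per second")
print(f"{budget:.0f} instructions available per cell")
```

Under these assumptions the budget comes to about 68 instructions per cell, consistent with the "well under 100" in the text, and that is before any cycles are lost to background tasks.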
MicroUnity seems like a throwback to the early years of Silicon Valley, when all things seemed possible — when Robert Widlar could invent a new product for National Semiconductor on the beach in Puerto Vallarta, and develop a new process to build it with David Talbert and his wife Dolores over beers on a bench at the Wagon Wheel. It was an era when scores of semiconductor companies were racing down the learning curve to enhance the speed and functions of electronic devices. Most of all, the MicroUnity project is a climactic episode in the long saga of the industry’s struggle between two strategies for accelerating the switching speeds in computers.
A New Moore’s Law?
Intel Chairman Gordon Moore recently promulgated a new Moore’s Law, supposedly deflecting the course of the old Moore’s Law, which ordains that chip densities double every 18 months. The new law is that the costs of a chip factory double with each generation of microprocessor. Moore speculated that these capital burdens might deter or suppress the necessary investment to continue the pace of advance in the industry.
Gerhard (“Gerry”) Parker, Intel’s chief technical officer, however, presents contrary evidence. The cost for each new structure may be approximately doubling as Moore says. But the cost per transistor — and thus the cost per computer function — continues to drop by a factor of between three and four every three years. Not only does the number of transistors on a chip rise by a factor of four, but the number of chips sold doubles with every generation of microprocessor, as the personal computer market doubles every three years. Thus there will be some eight times more transistors sold by Intel from a Pentium fab than from a 486 fab. At merely twice the cost, the new fab seems a bargain.
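Parker's argument reduces to three multipliers per generation, which the following sketch combines. The default values are the figures in the paragraph above; the function and parameter names are illustrative, and the model deliberately ignores operating costs, yields and pricing, which may be why the article quotes "between three and four" rather than exactly four.

```python
# Sketch of the per-generation fab economics described in the text.
# Multipliers are the article's; names are hypothetical.
def fab_generation(
    transistors_per_chip_growth: float = 4.0,  # transistors per chip rise fourfold
    chips_sold_growth: float = 2.0,            # unit sales double
    fab_cost_growth: float = 2.0,              # fab cost doubles (the "new Moore's Law")
) -> tuple[float, float]:
    """Return (growth in transistors sold, factor by which fab cost per transistor falls)."""
    transistors_sold_growth = transistors_per_chip_growth * chips_sold_growth
    cost_per_transistor_drop = transistors_sold_growth / fab_cost_growth
    return transistors_sold_growth, cost_per_transistor_drop


sold, drop = fab_generation()
print(f"Transistors sold per fab grow {sold:.0f}x")
print(f"Fab cost per transistor falls by a factor of {drop:.0f}")
```

With the article's numbers, eight times as many transistors are spread over twice the capital cost, so the capital cost carried by each transistor falls to a quarter.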
Of course, Intel gets paid not for transistors but for computer functions. To realize the benefits of the new fabs, therefore, Intel must deliver new computer functions that successfully adapt to the era of bandwidth abundance.
Return to Low and Slow
Since as a general rule, the more the power, the faster the switch, you can get speed by using high-powered or exotic individual components. It is an approach that worked well for years at Cray, IBM, NEC and other supercomputer vendors. Wire together superfast switches and you will get a superfast machine.
The other choice for speed is to use low-powered, slow switches. You make them so small and jam them so close together, the signals get to their destinations nearly as fast as the high-powered signals. This approach works well in the microprocessor industry and in the human brain.
Despite occasional deviations at Cray and IBM, low and slow has been the secret of all success in semiconductors from the outset. Inventor William Shockley substituted slow, low-powered transistors for faster, high-powered vacuum tubes. Gordon Teal at Texas Instruments replaced fast germanium with slower silicon. Jean Hoerni at Fairchild spurned the fast track of mountainous Mesa transistors to adopt a flat “planar” technology in which devices were implanted below the surface of the chip. Jack Kilby and Robert Noyce then substituted slow resistors and capacitors as well as slow transistors on integrated circuits for faster, high-powered devices on modules and printed circuit boards. Federico Faggin made possible the microprocessor by replacing fast metal gates on transistors with slow gates made of polysilicon. Frank Wanlass and others replaced faster NMOS and PMOS technologies with the 1,000 times slower and 10 times lower-power Complementary Metal Oxide Semiconductors (CMOS) that now rule the industry.
Low and slow finds its roots in the very physics of solid state, separating the microcosm from the macrocosm. Chips consist of complex patterns of wires and switches. In the macrocosm of electromechanics, wires were simple, fast, cool, reliable and virtually free; switches were vacuum tubes, complex, fragile, hot and expensive. In the macrocosm, the rule was economize on switches, squander on wires. But in the microcosm, all these rules of electromechanics collapsed.
In the microcosm, switches are almost free — a few millionths of a cent. Wires are the problem. However fast they may be, longer wires laid down on the chip and more wires connected to it translate directly into greater resistance and capacitance and more needed power and resulting heat. These problems become exponentially more acute as wire diameters drop. On the other hand, the shorter the wires the purer the signal and the smaller the resistance, capacitance and heat.
This fact of physics is the heart of microelectronics. As electron movements approach their mean free path — the distance they can travel “ballistically” without bouncing off the internal atomic structure of the silicon — they get faster, cheaper and cooler.
At the quantum level, noise plummets and bandwidth explodes. Tunneling electrons, the fastest of all, emit virtually no heat at all. It was a new quantum paradox; the smaller the space the more the room, the narrower the switches the broader the bandwidth, the faster the transport the lower the noise. As transistors are jammed more closely together, the power delay product — the crucial index of semiconductor performance combining switching delays with heat emission — improves as the square of the number of transistors on a single chip.
Since the breakthrough to CMOS in the early 1980s, however, the industry has been slipping away from the low and slow regime. Falling for the electromechanical temptation, they are substituting fast metals for slow polysilicon. For better performance, companies are increasingly turning to gallium arsenide and silicon germanium technologies. Semiconductor engineers are increasingly crowding the surface of CMOS with as many as four layers of fast aluminum wires, with tungsten now in fashion among the speed freaks of the industry. The planar chips that built Silicon Valley have given way to high sierras of metal, interlarded with uneven spreads of silicon dioxide and other insulators. Meanwhile, the power used on each chip is rising rapidly, since the increasing number of transistors and layers of metal nullify a belated move to three-volt operation from the five volts adopted with Transistor Transistor Logic in 1971. And as the industry loses touch with its early inspiration of low and slow, the costs of wafer fabrication continue to rise — to an extent that even demoralizes Gordon Moore.
In radically transforming the methods of semiconductor fabrication, John Moussouris and James (“Al”) Matthews, MicroUnity’s director of technology, seem to many observers to be embarking on a reckless and self-defeating course. But MicroUnity is betting on the redemptive paradoxes of the microcosm. Returning to low and slow, Moussouris and Matthews promise to increase peak clock speeds by a factor of five in the next two years and chip performance by factors of several hundred, launching communications chips in 1995 that function at 1.2 gigahertz and process as many as 400 gigabits per second.
Matthews and Mead
In pursuing this renewal of wafer fabrication at MicroUnity, Matthews has applied for some 70 patents and won about 20 to date. A veteran of Hewlett-Packard’s bipolar process labs who moved to Intel in the early 1980s and spearheaded Intel’s switch to CMOS for the 386 microprocessor, Matthews has also worked as an engineer at HP-Avantek’s gallium arsenide fabs for microwave chips. Commanding experience in diverse fab cultures, Matthews thus escapes the cognitive trap of seeing the established regime as a given, rather than a choice.
At Avantek, Matthews plunged toward the microcosm and prepared the way for his MicroUnity process after reading an early paper by Carver Mead, the inventor of the gallium arsenide MESFET transistor. Mead had prophesied that the behavior of these transistors would deteriorate drastically if the feature sizes were pushed below two-tenths of a micron at particular doping levels (technically impossible at the time). In the mid-1980s, though, Matthews noticed that these feature sizes were then feasible. Testing the Mead thesis, he was startled to discover that far from deteriorating below the Mead threshold, these transistors instead showed “startlingly anomalous levels of good behavior,” marked by high gain and plummeting noise.
Based on this discovery, he created a low-noise, gigahertz-frequency amplifier for satellite dishes being sold in the European market. Matthews’s process reduced the cost so drastically that Sony officials were said to be contemplating claims of dumping. Avantek was charging a few dollars for microwave frequency chips that cost Sony perhaps some hundreds of dollars to make.
Having discovered the “anomalous good behavior” of gallium arsenide devices pushed beyond the theoretical limits, Matthews at MicroUnity decided to experiment with bipolar devices. Bipolar devices are usually used at high power levels with so-called emitter coupled logic to achieve high speeds in supercomputers and other advanced machines. Inspired by his breakthrough with gallium arsenide, Matthews believed that bipolar performance also might be radically different at extremely low power — under half a volt and at gate lengths approaching the so-called Debye limit, near one-tenth of a micron.
Once again, Matthews was startled by “anomalous good behavior” as processes approached the quantum mechanical threshold. It turned out that at high frequencies bipolar transistors use far less power than even the CMOS transistors, famous for their low-power characteristics. At these radio-frequency speeds, however, he discovered that the transistors could not operate with aluminum wires insulated by oxide. Therefore, he introduced a technique he had used with fast bipolar and gallium arsenide devices: gold wires insulated by air. Replacing oxide insulators with “air bridges” drastically reduces the capacitance of the wires and allows the transistor to operate at speeds impossible with conventional device structures.
With these adventures in the microcosm behind him, Matthews was ready to develop a new process and technology for MicroUnity. Based on combining the best features of bipolar and CMOS at radically small geometries, the new technology uses bipolar logic functioning at gigahertz clock speeds, with CMOS retained chiefly for memory cells and with gold air bridges for the metalization layers. Perhaps it is a portent that the gold wires across the top of the chip repeat the most controversial feature of Jack Kilby’s original integrated circuit. (Matthews is also seeking patents for methods of using optical communications on the top of a silicon chip.)
In essence, Matthews is returning to low and slow. He is shearing off the sierras of metal and oxides and restoring the planar surfaces of Jean Hoerni. Because the surface is flat to a tolerance of one-tenth of a micron, photolithography gear can function at higher resolution despite a narrow depth of field. Elimination of the aluminum sierras also removes a major source of parasitic currents and transistors and allows smaller polysilicon devices to be implanted closer together. A major gain from these innovations is a drastic move to lower power transistors. Rather than using the usual three volts or five volts, the MicroUnity devices operate at 0.3 volts to 0.5 volts (300 to 500 millivolts). In the microcosm, smaller devices closer together at lower power is the secret of speed.
Although MicroUnity will not divulge the details of future products, ASAP calculates on the basis of information from other sources that the MicroUnity chip can hold more than 10 million transistors in a space half the size of a Pentium with three million transistors. With lower power transistors set closer together, the MicroUnity chip can operate with a clock rate as much as 10 times faster than most current microprocessors and at an overall data rate more than 100 times faster. Low and slow results in blazing speed.
For ordinary microprocessor applications, an ultrafast clock is superfluous. Since ordinary memory technology is falling ever farther behind processor speeds, fast clocks mean complex arrangements of cache on cache of fast static RAM and specialized video memory chips. By using the MicroUnity technology at the relatively slow clock rates of a Pentium, MicroUnity might be able to produce Pentiums that use from five to 10 times less power — enabling new generations of portable equipment.
MicroUnity, however, is not building a CPU but a communications processor. In the communications world, the fast clock rate gives the “mediaprocessor” the ability to couple to broadband pipes using high radio frequencies. Most crucially, the mediaprocessor can connect to the radio frequency transmissions over cable coax.
Along with Bill Gates, one of the leading enthusiasts of MicroUnity is John Malone, who for the last year has been celebrating its potential to create a “Cray on a tray” for his set-top boxes and cable modems. For the rest of this decade, most Americans will be able to connect to broadband networks only over cable coax. Thus the link of TCI to MicroUnity and to Tiger offers the best promise of an information infrastructure over the next five years, affording a potential increase in bandwidth of 250,000-fold over the current four-kilohertz telephone wires.
The Regional Bell Operating Companies and the cable companies agree that cable coax is the optimal broadband conduit to homes and that fiber optics is the best technology for connecting central switches or headends to neighborhoods. Looping through communities, with a short drop at each home — rather than running a separate wire from the central office to every household — hybrid fiber-coax networks, according to a Pacific Bell study, can reduce the cost of setup and maintenance of connections by some 75% and cut back the need for wire by a factor of 600.
In order to bring broadband video to homes, companies must collaborate with the cable TV industry. Collaborating with TCI, Microsoft once again has chosen the correct technology to leverage. With Digital Equipment, Zenith and Intel all engaged in alliances for the creation of cable modems — and several other companies announcing cable modem projects — Gates may well be leading the pack in transforming his company from a computer company into a communications concern, from the microcosm into the telecosm.
Fiber Miles Deployed in U.S. as of 1993 (Millions)
Local Exchange Carriers: 7.28
Inter-Exchange Carriers: 2.50
Competitive Access Providers: 0.24
Driving Force of Progress
All the bandwidth in the world, however, will get you nowhere if your transceiver cannot process it. By returning to the inspiration of the original Silicon Valley, MicroUnity offers a promising route to the communications infrastructure of the next century, overthrowing Moore’s Law and issuing the first fundamental challenge to Moore’s company. As Al Matthews puts it: “Bob Noyce [the late Intel founder with Gordon Moore] is my hero. But there is a new generation at hand in Silicon Valley today, and this generation is doing things that Bob Noyce never dreamed of.”
Moussouris promises to deliver 10,000 mediaprocessors for set-top boxes in 1995. As everyone agrees, this is a high-risk project (although Bill Gates favorably compares MicroUnity’s risk to his other gamble, Teledesic). Even if it takes years for MicroUnity to reach its telecosmic millennium, the advance of communications processors continues to accelerate. Already available today, for example, is Texas Instruments’ MVP system — the first full-fledged mediaprocessor on one chip. It functions at a mere 30 to 50 megahertz but performs between two and three billion signal processing steps per second or roughly between 1,000 and 1,500 DSP mips. Rather than revving up the clock to gigahertz frenzies, TI gained its performance through a Multiple Instruction, Multiple Data approach associated with the massively parallel supercomputer industry. The MVP combines four 64-bit digital signal processors with a 32-bit RISC CPU, a floating point unit, two video controllers, 64 kilobytes of static RAM cache and a 64-bit direct memory access controller — all on one sliver of silicon, costing some $232 per thousand mips in 1995, when Pentiums will give you a hundred mips for perhaps twice as much.
This does not favor the notion that microprocessors will soon “suck in” DSPs. DSP mips and computer mips are different animals. As DSP guru Will Strauss points out, “As a rule of thumb, a microprocessor mips rating must be divided by about five to get a DSP mips rating.” To equal an MVP for DSP operations, a microprocessor would have to achieve some 5,000 mips.
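The comparison can be made concrete with a small sketch built on Strauss's divide-by-five rule. The MVP figures are the article's; the Pentium price is an assumption, reading "perhaps twice as much" as roughly twice $232 for a 100-mips part. All names are illustrative.

```python
# Sketch of the DSP-mips comparison using the rule of thumb quoted in the text.
DSP_DIVISOR = 5  # microprocessor mips / 5 ~= DSP mips (Strauss's rule of thumb)


def cpu_mips_to_dsp_mips(cpu_mips: float) -> float:
    """Approximate DSP mips delivered by a general-purpose microprocessor."""
    return cpu_mips / DSP_DIVISOR


def dsp_mips_to_cpu_mips(dsp_mips: float) -> float:
    """Microprocessor mips needed to match a given DSP mips rating."""
    return dsp_mips * DSP_DIVISOR


def dollars_per_dsp_mips(dollars: float, dsp_mips: float) -> float:
    return dollars / dsp_mips


MVP_DOLLARS_PER_1000_DSP_MIPS = 232   # from the text
PENTIUM_MIPS = 100                    # from the text
PENTIUM_DOLLARS = 2 * 232             # ASSUMPTION: "perhaps twice as much"

mvp_cost = dollars_per_dsp_mips(MVP_DOLLARS_PER_1000_DSP_MIPS, 1000)
pentium_cost = dollars_per_dsp_mips(
    PENTIUM_DOLLARS, cpu_mips_to_dsp_mips(PENTIUM_MIPS)
)

print(f"CPU mips needed to match 1,000 DSP mips: {dsp_mips_to_cpu_mips(1000):.0f}")
print(f"Pentium DSP-equivalent rating: {cpu_mips_to_dsp_mips(PENTIUM_MIPS):.0f} DSP mips")
print(f"Cost ratio per DSP mips (Pentium / MVP): {pentium_cost / mvp_cost:.0f}")
```

On this reading, a 100-mips Pentium delivers about 20 DSP mips, and each DSP mips costs on the order of a hundred times more than on the MVP. The ratio depends entirely on the assumed Pentium price and should be treated as a rough order of magnitude.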
Designed with the aid of teleconferencing company VTEL and Sony, the MVP chip can simultaneously encode or decode video using any favored compression scheme, process audio, faxes or input from a scanner and perform speech recognition or other pattern-matching algorithms. While Intel and Hewlett-Packard have been winning most of the headlines for their new RISC processing alliance, the key development in the microprocessor domain is the emergence of this new class of one-chip multimedia communications systems.
One thing is certain. Over the next decade, computer speeds will rise about a hundredfold, while bandwidth increases a thousandfold or more. Under these circumstances, the winners will be the companies that learn to use bandwidth as a substitute for computer processing and switching. The winners will be the companies that most truly embrace the Sun slogan: “The network is the computer.” As Schmidt predicts, over the next few years “the value-added of the network will so exceed the value-added of the CPU that your future computer will be rated not in mips but in gigabits per second. Bragging rights will go not to the person with the fastest CPU but to the person with the fastest network — and associated database lookup, browsing and information retrieval engines.”
The law of the telecosm will eclipse the law of the microcosm as the driving force of progress. Springing from the exponential improvement in the power delay product as transistors are made smaller, the law of the microcosm holds that if you take any number (N) of transistors and put them on a single sliver of silicon you will get N squared performance and value. Conceived by Robert Metcalfe, inventor of the Ethernet, the law of the telecosm holds that if you take any number (n) of computers and link them in networks, you get n squared performance and value. Thus the telecosm builds on and compounds the microcosmic law. The power of Tiger, MicroUnity and TCI comes from fusing the two laws into a gathering tide of bandwidth.
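Both laws share the same square-law shape, which a few lines can illustrate. The sketch below treats "value" as a unitless n squared, exactly as the paragraph states it. The pairwise-link count is included only as the usual intuition offered for Metcalfe's law and is not taken from the article; the function names are illustrative.

```python
# Illustration of the square-law scaling described in the text.
def squared_value(count: int) -> int:
    """Unitless 'performance and value' under the n-squared law."""
    return count ** 2


def pairwise_links(nodes: int) -> int:
    """Distinct node-to-node connections possible among `nodes` machines."""
    return nodes * (nodes - 1) // 2


for n in (10, 100, 1000):
    print(f"n={n:>4}  value={squared_value(n):>9,}  possible links={pairwise_links(n):>7,}")
```

Each tenfold increase in linked machines raises the square-law value a hundredfold, which is the sense in which growth in the network outruns growth in any single node.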
With network technology advancing 10 times as fast as central processors, the network and its nodes will become increasingly central while CPUs become increasingly peripheral. Faced with a CPU bottleneck, multimedia systems will simply bypass the CPU on broadband pipes. Circumventing Amdahl’s Law, system designers will adapt their architectures to exploit the high bandwidth components, such as mediaprocessors, ATM switches and fiber links. In time the microprocessor will become a vestigial link to the legacy systems such as word processing and spreadsheets that once defined the machine. All of this means that while the last two decades have been the epoch of the computer industry, the next two decades will belong to the suppliers of digital networks.
The chief beneficiaries of all this invention, however, will be the people of the world, ascending to new pinnacles of prosperity in an Information Age. Although many observers fear that these new tools will chiefly aid the existing rich — or the educated and smart — these technologies have already brought prosperity to a billion Asians, from India and Malaysia to Indonesia and China, previously mired in penury.
Communications bandwidth is not only the secret of electronic progress. It is also the heart of economic growth, stretching the webs of interconnection that extend the reach of markets and the realms of opportunity. Lavishing the exponential gains of networks, endowing old jobs with newly productive tools and unleashing creativity with increasingly fertile and targeted capital, the advance of the telecosm offers unprecedented hope to the masses of people whom the industrial revolution passed by.