
Seymour Cray in front of his Cray-1 [1]

Cray-1, a supercomputer designed, manufactured and marketed by Cray Research Inc. since 1972. The first Cray-1 system was installed at Los Alamos National Laboratory in 1976, and it went on to become one of the best known and most successful supercomputers in history, reigning as the world’s fastest from 1976 to 1982 [2]. Cray Research was founded by former Control Data Corporation chief designer Seymour Cray [3], after CDC neglected to invest in his CDC 8600 design.


The Cray-1 is a large-scale, general-purpose digital computer featuring scalar as well as vector processing, a 12.5 nanosecond clock period, and a 50 nanosecond memory cycle time. The basic configuration of the Cray-1 consists of the central processor unit (CPU), one or more minicomputer consoles, and a mass storage (disk) subsystem.


The CPU holds the ALU, memory, and I/O sections of the computer. It is constructed from LSI chips of high-speed ECL bipolar junction transistors. Memory is built from 1024-bit LSI chips and holds up to one mebi (2^20) 72-bit words, arranged in 16 banks. A word consists of 64 data bits and 8 check bits, which allow single-error correction and double-error detection (SECDED).
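The 64 + 8 split of a memory word matches what a Hamming-style SECDED code needs. The sketch below checks that arithmetic; it assumes a standard extended Hamming construction (the text above does not say which code the hardware uses), and the function name `secded_check_bits` is made up for illustration.

```python
def secded_check_bits(data_bits: int) -> int:
    """Check bits needed by an extended Hamming (SECDED) code.

    Assumption: a textbook Hamming code plus one overall parity bit,
    not necessarily the exact code used by the Cray-1 hardware.
    """
    r = 0
    # Single-error correction: r check bits must encode "no error"
    # plus an error in any one of the data_bits + r bit positions.
    while (1 << r) < data_bits + r + 1:
        r += 1
    # One extra overall-parity bit adds double-error detection.
    return r + 1


WORD_DATA_BITS = 64
MAX_WORDS = 1 << 20  # one mebi words

check_bits = secded_check_bits(WORD_DATA_BITS)
word_bits = WORD_DATA_BITS + check_bits
data_mib = MAX_WORDS * WORD_DATA_BITS // 8 // (1 << 20)

print(check_bits, word_bits, data_mib)
```

Under these assumptions the result is 8 check bits, a 72-bit stored word, and 8 MiB of user data at the maximum memory size.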


The three primary register sets consist of eight 24-bit address registers (also used as loop counters and shift counts), eight 64-bit scalar registers, and eight vector registers, where one vector register is actually a set of 64 64-bit registers, called elements. Associated with the vector registers are a 7-bit vector length register and a 64-bit vector mask register to allow operations to be performed on individual vector elements.
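The register sets described above can be modelled in a few lines. This is a minimal sketch, not an emulator: the class and function names are hypothetical, the vector length is treated as a plain element count, and hardware corner cases are ignored.

```python
from dataclasses import dataclass, field

MASK64 = (1 << 64) - 1  # 64-bit scalar registers and vector elements


@dataclass
class Registers:
    # Eight 24-bit address registers (A), eight 64-bit scalar registers (S)
    a: list = field(default_factory=lambda: [0] * 8)
    s: list = field(default_factory=lambda: [0] * 8)
    # Eight vector registers (V), each holding 64 elements of 64 bits
    v: list = field(default_factory=lambda: [[0] * 64 for _ in range(8)])
    vl: int = 64  # vector length: values 0..64, hence 7 bits
    vm: int = 0   # vector mask: one bit per vector element


def vector_add(regs: Registers, dst: int, src1: int, src2: int) -> None:
    """Element-wise 64-bit add over the first regs.vl elements (simplified)."""
    for i in range(regs.vl):
        regs.v[dst][i] = (regs.v[src1][i] + regs.v[src2][i]) & MASK64


regs = Registers()
regs.v[1] = list(range(64))
regs.v[2] = [10] * 64
regs.vl = 4  # only the first four elements take part
vector_add(regs, 0, 1, 2)
print(regs.v[0][:6])
```

The vector length register needs 7 bits because it must represent 64 itself, not just 0 to 63: `(64).bit_length()` is 7.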


Register and ALU Block Diagram [4]


The Cray-1 executes 128 operation codes as either 16-bit (register reference) or 32-bit (memory reference) scalar or SIMD instructions. An integer multiply operation produces a 24-bit result, additions and subtractions either 24-bit or 64-bit results. Integer divide is not provided. The instruction set includes boolean operations for OR, AND, and exclusive OR and for a mask-controlled merge operation. Shift operations allow the manipulation of 64- or 128-bit operands to produce a 64-bit result. Instructions for scalar population and leading zero counts return bit counts based on scalar register contents to an address register.
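Several of these operations map directly onto bitboard techniques. The following sketch emulates them on 64-bit values; the function names and the operand order of the merge and double shift are illustrative assumptions, not the Cray-1 instruction encoding.

```python
MASK64 = (1 << 64) - 1


def popcount(x: int) -> int:
    """Population count: number of one bits in a 64-bit word."""
    return bin(x & MASK64).count("1")


def leading_zeros(x: int) -> int:
    """Leading zero count of a 64-bit word (64 for x == 0)."""
    return 64 - (x & MASK64).bit_length()


def merge(a: int, b: int, mask: int) -> int:
    """Mask-controlled merge: bits of a where mask is 1, bits of b elsewhere."""
    return ((a & mask) | (b & ~mask)) & MASK64


def double_shift_left(hi: int, lo: int, n: int) -> int:
    """Shift a 128-bit operand (hi:lo) left by n and keep the upper 64 bits."""
    wide = ((hi & MASK64) << 64) | (lo & MASK64)
    return ((wide << n) >> 64) & MASK64


print(popcount(0xFF), leading_zeros(1), hex(merge(0xFFFF, 0, 0x00FF)))
```

On a bitboard, `popcount` counts the pieces of a set and `leading_zeros` locates its most significant occupied square.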

The Cray design used pipeline parallelism to implement vector instructions rather than multiple ALUs. In addition, the design had completely separate pipelines for different instructions; for example, addition/subtraction was implemented in different hardware than multiplication. This allowed a batch of vector instructions themselves to be pipelined, a technique called vector chaining. The Cray-1 normally had a performance of about 80 MFLOPS, but with up to three chains running it could peak at 240 MFLOPS [5].
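The performance figures follow from the 12.5 nanosecond clock period if one assumes that a full pipeline delivers one result per clock. The sketch below works through that arithmetic and adds a simple cycle-count model of chaining; the pipeline latencies of 6 and 7 cycles and both function names are illustrative assumptions, not figures from this article.

```python
CLOCK_PERIOD_PS = 12_500  # 12.5 ns, expressed in picoseconds to stay in integers

clock_hz = 10**12 // CLOCK_PERIOD_PS
# Assumption: one floating-point result per clock from a full pipeline.
mflops_per_pipeline = clock_hz // 10**6
peak_mflops = 3 * mflops_per_pipeline  # three chained pipelines


def unchained_cycles(n: int, latencies: list) -> int:
    """Each vector operation waits for the previous one to finish all n elements."""
    return sum(lat + n for lat in latencies)


def chained_cycles(n: int, latencies: list) -> int:
    """Each result is forwarded to the next pipeline as soon as it is produced."""
    return sum(latencies) + n


LATENCIES = [6, 7]  # hypothetical pipeline depths for two dependent operations
print(clock_hz, mflops_per_pipeline, peak_mflops)
print(unchained_cycles(64, LATENCIES), chained_cycles(64, LATENCIES))
```

In this model, two dependent operations on 64-element vectors take 141 cycles without chaining and 77 cycles with it, because the second pipeline starts as soon as the first element arrives.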

Chess Programs

See also



Forum Posts

External Links

Company History | Cray

