Multiplication instructions
| Opcode | P/U | Category | Description |
| DSL | user | ALU: multiply | double shift left |
| MH | user | ALU: multiply | multiply high |
| MHL | user | ALU: multiply | multiply high and low |
| MHL0 | user | ALU: multiply | multiply high and low, tribble 0 |
| MHL1 | user | ALU: multiply | multiply high and low, tribble 1 |
| MHL2 | user | ALU: multiply | multiply high and low, tribble 2 |
| MHL3 | user | ALU: multiply | multiply high and low, tribble 3 |
| MHL4 | user | ALU: multiply | multiply high and low, tribble 4 |
| MHL5 | user | ALU: multiply | multiply high and low, tribble 5 |
| MHNS | user | ALU: multiply | multiply high no shift |
| ML | user | ALU: multiply | multiply low |
DSL Double shift left
| Syntax |
| c = a dsl b |
| Register | Signedness |
| All | ignored |
| 1 opcode only |
| Flag | Set if and only if |
| N | bit 35 of the result is set |
| Z | all result bits are zero |
| T | flag does not change |
| R | flag does not change |
DSL (double shift left) is a critical instruction for long multiplication, providing in one CPU instruction what would otherwise take four instructions. Sample code is available under 36-bit multiplication.
DSL adds the T flag with wrapping to b, and then shifts the sum left six bits. The six vacated bits are filled using the six leftmost bits of a. The result is written to c.
N and Z are set as if the destination is a signed register. The N and Z flags have no purpose in the long multiplication application for which DSL was designed, but they are updated in case someone invents a use for this information at a later date. T and R do not change.
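The behavior described above can be sketched as a small Python model. This is an illustration only, not the hardware implementation: the function name `dsl`, the `t_flag` parameter, and the returned tuple layout are assumptions made for this sketch.

```python
MASK36 = (1 << 36) - 1  # registers are 36 bits wide


def dsl(a, b, t_flag):
    """Hypothetical model of c = a dsl b. Returns (c, N, Z); T and R are untouched."""
    total = (b + int(t_flag)) & MASK36           # add the T flag to b, with wrapping
    fill = (a & MASK36) >> 30                    # six leftmost bits of a
    c = ((total << 6) & MASK36) | fill           # shift left six bits, fill vacated bits
    n = bool((c >> 35) & 1)                      # N follows bit 35 of the result
    z = (c == 0)                                 # Z is set when every result bit is zero
    return c, n, z
```

For example, with this model `dsl(0o770000000000, 1, False)` yields a result of 127: the value 1 moves up six bits to 64, and the top tribble of `a` (63) fills the low six bits.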
This documentation does not match what the dissertation says about DSL, in that the left and right operands have since been interchanged. This switch was made so that DSL can directly obtain the correct register copy after an MHL5 instruction in long multiplication.
MH Multiply high
| Syntax |
| c = a mh b |
| Register | Signedness |
| All | ignored |
| 1 opcode only |
| Flag | Set if and only if |
| N | never; flag is cleared |
| Z | c = 0 |
| T | c mod 64 ≠ 0 |
| R | T is set or R is already set |
This is a key instruction for unsigned “short” multiplication where one of the factors fits into six bits, and the product fits into 36 bits. The smaller of the factors must be copied into all of the tribbles via CX or an assembler constant. MH multiplies the tribbles of a and b pairwise, but the six 12-bit results cannot fit the 6-bit spaces afforded by the tribbles of c. Instead, MH retains only the six most significant bits of each 12-bit result. ML is the complementary instruction that retains the six least significant bits of each.
To meaningfully add the output of MH and ML, their place values must be aligned consistently. This means that MH needs a 6-position left shift, and that the result can spill to as many as 42 bits (which will not fit in a 36-bit register) as a result of that shift. The solution is that instead of shifting, MH rotates its result six bits left. If the six bits rotated into the rightmost places are not all zeros, the T and R flags are set because the eventual product will not fit in 36 bits. Otherwise T is cleared, R is left unchanged, and the output of MH can be directly added to ML to obtain the 36-bit product. Z will be set if the output of MH is all zeros. N is always cleared.
Here is an unsigned short multiplication example with full range checking, and an always-accurate Z flag at the end whether or not overflow occurs. Four instructions are needed. When multiplying by a small constant, the CX can be optimized out by hand.
unsigned big small t result   ; will multiply big * small
t = cx small                  ; copy small into all tribbles
result = big mh t             ; high bits of product
t = big ml t                  ; low bits of product
result = result + t           ; result is now big * small
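The arithmetic behind this four-instruction sequence can be checked with a Python model. This is a sketch under stated assumptions, not the actual hardware or assembler behavior: the helper names (`tribbles`, `pack`, `ml`, `mh`, `short_mul`) are invented here, the CX step is replaced by directly replicating `small`, and the final addition is assumed to report its own carry out of bit 35.

```python
MASK36 = (1 << 36) - 1  # registers are 36 bits wide


def tribbles(x):
    """Split a 36-bit word into six 6-bit tribbles, least significant first."""
    return [(x >> (6 * i)) & 0o77 for i in range(6)]


def pack(tribs):
    """Reassemble six tribbles (least significant first) into one word."""
    return sum(t << (6 * i) for i, t in enumerate(tribs))


def ml(a, b):
    """Model of ML: low six bits of each pairwise tribble product."""
    return pack([(x * y) & 0o77 for x, y in zip(tribbles(a), tribbles(b))])


def mh(a, b):
    """Model of MH: high six bits of each product, rotated six bits left.

    Returns (c, T), where T is set when c mod 64 is nonzero.
    """
    high = pack([(x * y) >> 6 for x, y in zip(tribbles(a), tribbles(b))])
    c = ((high << 6) & MASK36) | (high >> 30)
    return c, (c % 64 != 0)


def short_mul(big, small):
    """Model of the short multiplication sequence. Returns (result, overflow)."""
    t = pack([small] * 6)                 # stands in for CX; assumes 0 <= small <= 63
    result, t_flag = mh(big, t)           # high bits of product
    total = result + ml(big, t)           # add the low bits of product
    overflow = t_flag or total > MASK36   # assumption: the add reports its own overflow
    return total & MASK36, overflow
```

In this model, `short_mul(123456789, 37)` returns the exact product with no overflow, while `short_mul(1 << 35, 2)` reports overflow because the top tribble's high bits rotate into the rightmost six places.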
Warning
Like other macros, CX is not yet available. Although it may be tempting to use SWIZ in place of CX like this:
t = small swiz 0
SWIZ does not verify that small is between 0 and 63. CX will perform this verification and set T and R if small is out of range.
Note about replacing MH with MHL
The MHL family of instructions cannot improve over the performance of MH and ML for short multiplication, because MH is able to include a 6-bit shift that MHL and its derivatives cannot. (The issue is that only the beta RAMs can shift six bits, and only the gamma RAMs can split registers.) The MHL family is for long multiplication.
MHL Multiply high and low
| Syntax |
| c = a mhl b |
| Register | Signedness |
| All | ignored |
| 1 opcode only |
| No flags changed |
MHL is the flagship of the MHL family of instructions for long multiplication and is the most flexible, although ordinarily the MHL0 through MHL5 instructions are used instead. MHL is a simultaneous execution of MHNS and ML, where the left and right copies of register c are allowed to desynchronize. Specifically, the MHNS result is stored in the left copy of register c, and the ML result is stored in the right copy.
For a drawing that shows the two copies of the register file in relation to the architecture, see page 200 of the dissertation. A short discussion of register splitting can be found on page 187; however, that discussion assumes the presence of a fast hardware multiplier that stores an entire multiplication result in a split register. MHL stores a partial result.
MHL multiplies the tribbles of a and b pairwise. In order to fit the six 12-bit results into c, the six most significant bits of each product are written to the left copy of c, and the six least significant bits of each product are written to the right copy of c. These writes are done simultaneously. To preclude any semantic confusion as to whether flags follow the left or right copy of a result, none of the MHL instructions change any CPU flags at all.
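The split write described above can be modeled in Python as a function that returns two values, one per register copy. This is a sketch only: the name `mhl` and the `(left, right)` return convention are assumptions for illustration, and the model says nothing about how later instructions select a copy.

```python
def tribbles(x):
    """Split a 36-bit word into six 6-bit tribbles, least significant first."""
    return [(x >> (6 * i)) & 0o77 for i in range(6)]


def pack(tribs):
    """Reassemble six tribbles (least significant first) into one word."""
    return sum(t << (6 * i) for i, t in enumerate(tribs))


def mhl(a, b):
    """Hypothetical model of c = a mhl b.

    Returns (left, right): the left copy of c receives the six most
    significant bits of each 12-bit product (as MHNS would produce), and
    the right copy receives the six least significant bits (as ML would).
    No flags are modeled because MHL changes none.
    """
    products = [x * y for x, y in zip(tribbles(a), tribbles(b))]
    left = pack([p >> 6 for p in products])
    right = pack([p & 0o77 for p in products])
    return left, right
```

With both operands set to all ones, each tribble product is 63 × 63 = 3969, which splits into a high half of 62 and a low half of 1 in every position.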
The differing values in the left and right copies of c can be selected in subsequent instructions by using the left and right operand positions of subsequent ALU instructions. Certain ALU instructions such as shifts are not symmetric in their left and right operands, so very unusual code may require an intervening instruction to transfer a value from one copy of the register file to the other copy. There are also two instructions that you probably don’t need to worry about after MHL, namely BOUND and WCM, where the syntactic left operand is actually the electrically right operand and vice versa. Also, most assignment instructions place the electrically left operand on the right side of the = sign.
MHL0 Multiply high and low, tribble 0
| Syntax |
| c = a mhl0 b |
| Register | Signedness |
| All | ignored |
| 1 opcode only |
| No flags changed |
MHL0 replicates tribble 0 (bits 0–5) of a across all six subwords, and then multiplies them pairwise with the tribbles of b. In order to fit the six 12-bit results into c, the six most significant bits of each product are written to the left copy of c, and the six least significant bits of each product are written to the right copy of c. No flags are changed. See also MHL.
Except that only one instruction is required for MHL0, it is equivalent to:
t = a swiz 000000000000`o
c = t mhl b
The MHL0–MHL5 instructions can be seen in action under 36-bit multiplication.
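To show why one instruction per tribble of a is enough for long multiplication, here is a Python sketch of the underlying arithmetic. It is not the actual long multiplication sequence, which works in 36-bit registers and uses DSL and additions to combine partial results. The names `mhl_n` and `long_product` are hypothetical, and Python's unbounded integers stand in for the 72-bit result.

```python
def mhl_n(a, b, n):
    """Hypothetical model of MHL0 through MHL5, selected by n.

    Tribble n of a is multiplied by each tribble of b. Returns (left, right):
    the high halves of the six products and the low halves, respectively.
    """
    an = (a >> (6 * n)) & 0o77
    left = right = 0
    for i in range(6):
        product = an * ((b >> (6 * i)) & 0o77)   # at most 12 bits
        left |= (product >> 6) << (6 * i)        # six most significant bits
        right |= (product & 0o77) << (6 * i)     # six least significant bits
    return left, right


def long_product(a, b):
    """Recombine the six partial results into the full product of a and b."""
    total = 0
    for n in range(6):
        left, right = mhl_n(a, b, n)
        # (left << 6) + right equals (tribble n of a) * b
        total += ((left << 6) + right) << (6 * n)
    return total
```

The identity relied on here is that the high halves shifted up by six bits, plus the low halves, equal tribble n of a times all of b. Summing those six values at their place values gives the full product.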
MHL1 Multiply high and low, tribble 1
| Syntax |
| c = a mhl1 b |
| Register | Signedness |
| All | ignored |
| 1 opcode only |
| No flags changed |
MHL1 replicates tribble 1 (bits 6–11) of a across all six subwords, and then multiplies them pairwise with the tribbles of b. In order to fit the six 12-bit results into c, the six most significant bits of each product are written to the left copy of c, and the six least significant bits of each product are written to the right copy of c. No flags are changed. See also MHL.
Except that only one instruction is required for MHL1, it is equivalent to:
t = a swiz 010101010101`o
c = t mhl b
The MHL0–MHL5 instructions can be seen in action under 36-bit multiplication.
MHL2 Multiply high and low, tribble 2
| Syntax |
| c = a mhl2 b |
| Register | Signedness |
| All | ignored |
| 1 opcode only |
| No flags changed |
MHL2 replicates tribble 2 (bits 12–17) of a across all six subwords, and then multiplies them pairwise with the tribbles of b. In order to fit the six 12-bit results into c, the six most significant bits of each product are written to the left copy of c, and the six least significant bits of each product are written to the right copy of c. No flags are changed. See also MHL.
Except that only one instruction is required for MHL2, it is equivalent to:
t = a swiz 020202020202`o
c = t mhl b
The MHL0–MHL5 instructions can be seen in action under 36-bit multiplication.
MHL3 Multiply high and low, tribble 3
| Syntax |
| c = a mhl3 b |
| Register | Signedness |
| All | ignored |
| 1 opcode only |
| No flags changed |
MHL3 replicates tribble 3 (bits 18–23) of a across all six subwords, and then multiplies them pairwise with the tribbles of b. In order to fit the six 12-bit results into c, the six most significant bits of each product are written to the left copy of c, and the six least significant bits of each product are written to the right copy of c. No flags are changed. See also MHL.
Except that only one instruction is required for MHL3, it is equivalent to:
t = a swiz 030303030303`o
c = t mhl b
The MHL0–MHL5 instructions can be seen in action under 36-bit multiplication.
MHL4 Multiply high and low, tribble 4
| Syntax |
| c = a mhl4 b |
| Register | Signedness |
| All | ignored |
| 1 opcode only |
| No flags changed |
MHL4 replicates tribble 4 (bits 24–29) of a across all six subwords, and then multiplies them pairwise with the tribbles of b. In order to fit the six 12-bit results into c, the six most significant bits of each product are written to the left copy of c, and the six least significant bits of each product are written to the right copy of c. No flags are changed. See also MHL.
Except that only one instruction is required for MHL4, it is equivalent to:
t = a swiz 040404040404`o
c = t mhl b
The MHL0–MHL5 instructions can be seen in action under 36-bit multiplication.
MHL5 Multiply high and low, tribble 5
| Syntax |
| c = a mhl5 b |
| Register | Signedness |
| All | ignored |
| 1 opcode only |
| No flags changed |
MHL5 replicates tribble 5 (bits 30–35) of a across all six subwords, and then multiplies them pairwise with the tribbles of b. In order to fit the six 12-bit results into c, the six most significant bits of each product are written to the left copy of c, and the six least significant bits of each product are written to the right copy of c. No flags are changed. See also MHL.
Except that only one instruction is required for MHL5, it is equivalent to:
t = a swiz 050505050505`o
c = t mhl b
The MHL0–MHL5 instructions can be seen in action under 36-bit multiplication.
MHNS Multiply high no shift
| Syntax |
| c = a mhns b |
| Register | Signedness |
| All | ignored |
| 1 opcode only |
| Flag | Set if and only if |
| N | never; flag is cleared |
| Z | all result bits are zero |
| T | flag does not change |
| R | flag does not change |
MHNS is a former key instruction for unsigned long multiplication, where two 36-bit factors are multiplied as 6-bit tribbles and eventually sum to produce a 72-bit result. MHNS multiplies the tribbles of a and b pairwise, but the six 12-bit results cannot fit the 6-bit spaces afforded by the tribbles of c. Instead, MHNS retains only the six most significant bits of each result. The tribbles are output in their original positions, instead of being rotated left as with MH. The Z flag is set if the outcome of MHNS is all zeros, and cleared otherwise. N is always cleared, and T and R do not change.
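The difference between MHNS and MH comes down to the final rotation, which a short Python model can make concrete. As before, this is only a sketch: `mhns` and `rotl6` are names invented for illustration, and the returned tuple `(c, N, Z)` is an assumed convention.

```python
MASK36 = (1 << 36) - 1  # registers are 36 bits wide


def mhns(a, b):
    """Hypothetical model of c = a mhns b. Returns (c, N, Z); T and R are untouched."""
    c = 0
    for i in range(6):
        x = (a >> (6 * i)) & 0o77
        y = (b >> (6 * i)) & 0o77
        c |= ((x * y) >> 6) << (6 * i)   # high six bits stay in their original position
    return c, False, c == 0              # N is always cleared; Z reflects an all-zero result


def rotl6(x):
    """Rotate a 36-bit word six bits left, as MH does to its result."""
    return ((x << 6) & MASK36) | (x >> 30)
```

Applying `rotl6` to the value returned by `mhns` gives the word that MH would produce for the same operands, according to the descriptions of the two instructions above.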
MHNS has been supplanted by the MHL family of instructions, allowing the number of instructions required for long multiplication to be reduced from 47 to 35. But the MHL opcodes require a little more hardware and firmware loader support, due to their register splitting. Architectures derived from Dauug|36 which either do not split registers or have reduced ALU memory may benefit from using MHNS to multiply. For sample code showing how this used to be done, see page 113 of the dissertation.
ML Multiply low
| Syntax |
| c = a ml b |
| Register | Signedness |
| All | ignored |
| 1 opcode only |
| Flag | Set if and only if |
| N | never; flag is cleared |
| Z | all result bits are zero |
| T | flag does not change |
| R | flag does not change |
ML is a key instruction for unsigned “short” multiplication where one of the factors fits into six bits, and the product fits into 36 bits. It was also a key instruction for unsigned long multiplication until being supplanted by the MHL family.
The smaller of the factors must be copied into all of the tribbles via CX or an assembler constant. ML multiplies the tribbles of a and b pairwise, but the six 12-bit results cannot fit the 6-bit spaces afforded by the tribbles of c. Instead, ML retains only the six least significant bits of each 12-bit result. The Z flag is set if the outcome of ML is all zeros, and cleared otherwise. N is always cleared, and T and R do not change. See MH and MHNS for more information and sample code.