除法器

发布时间 2023-07-17 14:28:53作者: luckylan

除法器

与乘法相比,除法的实现较为复杂,运算过程如下:

过程:

  • 被除数和余数:将余数和被除数视为一个,共享一个寄存器,初始值为被除数
  • 除数:可视为不断右移,并和被除数相减
  • 商:每个bit依次生成,可视为不断左移

 

除法器的工作流程

 

要注意的是,与手算相比,电路实现总是将余数减除数,所以如果出现差小于0,要执行回退操作。

怎么会退呢?其实没有真正回退地方法,由于前面执行的是减法,回退只需加回来就可以了

除法器的电路实现

工作过程:

  1. 初始化:将8-bit被除数放到“余数寄存器”,4-bit除数放到“除数寄存器”的高4位,将4-bit商寄存器置零
  2. 执行减法运算:余数-除数,将结果放到“余数寄存器”
  3. 检查余数寄存器的最高位(判断正负)。如果值小于0,回退,商左移,新的最右位设为0;如果值大于或等于0,商左移,新的最右位设为1.
  4. 除数寄存器右移一位
  5. 检查是否为最后一轮(本例为第5轮)

32-bit除法器同理可得

除法器的面积优化

对上面32-bit除法器从面积分析:

  • “除数寄存器”实际只使用了一半
  • “商寄存器”初始是空的,从左到右依次填满
  • “余数寄存器”初始是满的,有实际意义的位每过一个周期从左到右依次减少

对应的,我们可以得到针对上述问题的方案:

  • “除数寄存器”缩减为32-bit的,无需支持移位
  • 取消“商寄存器”,商从右端逐位移入“余数寄存器”
  • 64-bit ALU缩小为32-bit
  • “余数寄存器”只有高32位参与加减法运算
  • “余数寄存器”需要支持左移和右移
  • 运算结束时,商占据“余数寄存器”的低32位

 

其中有两点需要注意:

  1. 对优化的除法器右移,优化后的除法器的运算步骤是先相减再左移,比如0000,0111 / 0010, 每次做左移、减法,倒数第二步得到00110001,最后一次循环,减法再左移得到:00100011,因为真正的商是:0011, 而余数是0001。最终结果应该是先取低四位商,再右移取高四位余数。
  2. 和乘法器一样,余数寄存器实际应该是65位而保证加法器的进位不会丢失。

除法运算过程如下:

初始化时:

被除数和余数:将余数和被除数视为一个,共享一个寄存器div_rem_d,初始值为被除数扩大2倍后的值,位宽为2倍的被除数位宽2*width。所以在最后一次结果中商直接取余数寄存器的低半部分位宽的数值即可,余数即为余数寄存器的[2*width_A-1:width_A+1]位的值。

除数:将除数左移被除数位宽大小,作为除数寄存器的值,除数寄存器位宽为被除数位宽加上除数位宽; 商寄存器初始值为0,位宽为width

运算过程中,每次不断比较余数寄存器div_rem_d的值是否大于除数寄存器的值,如果大于或者等于,则余数寄存器div_rem_d更新方式为余数寄存器div_rem_d的值减去除数寄存器的值得到的结果左移一位再加1,如果小于,则余数寄存器div_rem_d更新方式为余数寄存器div_rem_d的值直接左移一位,上述处理过程共需要被除数位宽次比较过程,对于有符号数除法,可以先求绝对值,看作无符号除法,再根据有符号除法和无符号除法商与余数的关系,计算得到正确的余数和商,保证有符号除法的结果中余数一定是正数。

商:每个bit依次生成,可视为不断左移,电路实现总是将余数减除数,所以如果出现差小于0,要执行回退操作。回退只需加回来就可以了。

matlab代码

% % divide
clear;
close all;
clc;
WIDTH=16;
div_A=[20855,21855,15,1166,11665,1115,1180,32777];%WIDTH=16
div_B=[2,2,5,15,123,112,134,129];
for i=1:length(div_A)
    for j=1:length(div_B)
        div_rem_d=abs(div_A(i))*2;%高位补div_A位宽-1=(WIDTH-1)个0,低位补1个0;余数初值
        div_quo_d=0;%商位宽为WIDTH,
        div_C(i,j)=0;
        div_B_d=div_B(j)*2^(WIDTH);
        for k=1:WIDTH
            if div_rem_d>=div_B_d
                div_rem_d=(div_rem_d-div_B_d)*2+1;
                div_C(i,j)=div_C(i,j)*2+1;
            else
                div_rem_d=div_rem_d*2;
                div_C(i,j)=div_C(i,j)*2;
            end
        end
        div_rem_tmp2=floor(div_rem_d/2^(WIDTH+1));
        div_rem(i,j)= div_rem_tmp2;
    end
end
复制代码
matlab统一按无符号数处理了,修改很简单,就没改

RTL代码

`timescale 1ns / 1ps
module divide#(parameter width_A=16,
               parameter width_B=8
)(
    input                         clk,
    input                         rst_n,
    input[width_A-1:0]            div_A,    //被除数
    input[width_B-1:0]            div_B,    //除数
    input                        valid_in,
    output reg                    valid_out,
    output reg [width_A-1:0]    div_C,    //
    output reg[width_A-1:0]        div_rem //余数,恒为正
);

  parameter DIV_IDLE = 2'd0;
  parameter DIV_BUSY = 2'd1;
  parameter DIV_DONE = 2'd2;
  parameter DIV_ACK = 2'd3;
  reg [1:0] next_state,state;

  // reg [7:0]    result;
  reg[width_A+width_B-1:0]    div_B_d;  
  reg[width_A+width_B-1:0]    div_B_d_tmp;
  reg[2*width_A-1:0]         div_rem_d;   
  reg [2*width_A-1:0]        div_rem_d_tmp; 
  
  reg[width_A-1:0]            div_quo; 
  reg [width_A-1:0]            div_quo_d; 
  reg[3:0]        cnt;
  wire [width_A-1:0]        abs_A;
  wire [width_B-1:0]        abs_B;

  wire [2*width_A-1:0]    div_rem_tmp;
  
  assign abs_A = div_A[width_A-1] ? ~div_A+1 : div_A;
  assign abs_B=div_B[width_B-1] ? ~div_B+1 : div_B;
  
  assign is_neg = div_A[width_A-1] ^ div_B[width_B-1];
  assign all_neg = div_A[width_A-1] & div_B[width_B-1];
  always@(*) begin
    case(state)
        DIV_IDLE:begin
            if(valid_in)
                next_state=DIV_BUSY;
            else     
                next_state=DIV_IDLE;
        end
        DIV_BUSY:begin
            if(cnt==width_A-1)
                next_state=DIV_DONE;
            else     
                next_state=DIV_BUSY;
        end
        DIV_DONE:begin
                next_state=DIV_ACK;
        end
        DIV_ACK    :begin
            next_state=DIV_IDLE;
        end    
        default: next_state=DIV_IDLE;
    endcase
  end    
  always@(posedge clk)
    if(~rst_n)
        state<=DIV_IDLE;
    else
        state<=next_state;
        

  always@(posedge clk)
    if(~rst_n)begin
        div_B_d<=0;//除数寄存器
        div_rem_d<=0;//余数寄存器
        // div_quo_d<=0;//商寄存器
        div_rem<=0;
        div_C<=0;
        cnt<=0;
        valid_out<=1'b0;
    end
    else begin
        case(state)
            DIV_IDLE :begin
                valid_out<=1'b0;
                if(valid_in)begin    
                    div_B_d<={abs_B,{(width_A)1'b0}};
                    div_rem_d<={{(width_A-1)1'b0},abs_A,1'b0};
                    // div_quo_d<=0;
                    cnt<=0;
                end
        end
            DIV_BUSY:begin    
                    div_rem_d<=div_rem_d_tmp;
                    // div_quo_d<=div_quo;
                    // div_B_d<=div_B_d;
                    cnt<=cnt+1;                            
            end
            DIV_DONE:begin
                valid_out<=1'b1;
                if(all_neg) begin    //A<0,B<0;
                    div_C<=div_rem_d[width_A-1:0]+1;
                    div_rem<=div_B_d[width_A+width_B-1:width_A]-{0,div_rem_d[2*width_A-1:width_A+1]};
                end
                else if(is_neg) begin    //A<0,B>0
                    // div_C<=~div_quo_d+1;
                    /* div_C<=~div_rem_d[15:0]+1; */
                    div_C<=~div_rem_d[width_A-1:0];
                    div_rem<=div_B_d[width_A+width_B-1:width_A]-{0,div_rem_d[2*width_A-1:width_A+1]}; //31:17
                end
                else begin //A>0,B>0
                    // div_C<=div_quo_d;
                    div_C<=div_rem_d[width_A-1:0];
                    div_rem<={0,div_rem_d[2*width_A-1:width_A+1]};
                end
            end
            DIV_ACK:
                valid_out<=1'b0;
        endcase
    end
  reg data_bit;    
  assign  div_rem_tmp=div_rem_d-div_B_d;
  always@(*)begin
    if(div_rem_d<div_B_d) begin
        div_rem_d_tmp={div_rem_d[2*width_A-2:0],1'b0};
        data_bit=0;
        // div_quo={div_quo_d,1'b0};
    end
    else begin
        div_rem_d_tmp={div_rem_tmp[2*width_A-2:0],1'b1};
        // div_quo={div_quo_d,1'b1};
        data_bit=1;
    end
    // div_B_d_tmp={1'b0,div_B_d[23:1]};
  end
endmodule

testbench

## testbench


```bash
`timescale 1ns / 1ps
//////////////////////////////////////////////////////////////////////////////////
module tb_divide;

    reg clk;
    reg rst_n;
     
    reg start;
    reg [15:0]div_A;
    reg [7:0]div_B;

    wire done;
    wire [15:0]div_C;
     
   always #10 clk = ~clk;
    initial begin      
        rst_n = 0;         
        clk = 1;
        #10; 
        rst_n = 1;
     end
    
  divide u_divide(
    .clk(clk),
    .rst_n(rst_n),
    .div_A(div_A),
    .div_B(div_B),
    .valid_in(start),
    .valid_out(done),
    .div_C(div_C)
);    

     reg [3:0]i;
    always @ ( posedge clk or negedge rst_n )
        if( !rst_n ) begin
                i <= 4'd0;
                start <= 1'b0;
                div_A<= 16'd1;
                div_B<= 8'd1;             
        end                
        else begin
            case( i )
                0: 
                if( done ) begin 
                    start <= 1'b0; 
                    i <= i + 1'b1; 
                end
                else begin 
                    div_A<= 16'b1101_0101_0101_0101; 
                    div_B<= 8'd5; 
                    start <= 1'b1; 
                end
                
                1: // 
                if( done ) begin 
                    start <= 1'b0; 
                    i <= i + 1'b1; 
                end
                else begin 
                    div_A<= 16'd15; 
                    div_B<= 8'd5; 
                    start <= 1'b1; 
                end
                
                2: // 
                if( done ) begin 
                    start <= 1'b0; 
                    i <= i + 1'b1; 
                end
                else begin 
                    div_A<= 16'd1166; 
                    div_B<= 8'b11111011; //为了实验随手改成了负数
                    start <= 1'b1; 
                end
                
                3: // 
                if( done ) begin 
                    start <= 1'b0; 
                    i <= i + 1'b1; 
                end
                else begin 
                    div_A<= 16'd11665; 
                    div_B<= 8'd123; 
                    start <= 1'b1; 
                end
                4: //
                if( done ) begin 
                    start <= 1'b0; 
                    i <= i + 1'b1; 
                end
                else begin 
                    div_A<= 16'd11665; 
                    div_B<= 8'd1; 
                    start <= 1'b1; 
                end
                5: // 
                if( done ) begin 
                    start <= 1'b0; 
                    i <= i + 1'b1; 
                end
                else begin 
                    div_A<= 16'd15; 
                    div_B<= 8'd3; 
                    start <= 1'b1; 
                end
                6: // 
                if( done ) begin 
                    start <= 1'b0; 
                    i <= i + 1'b1; 
                end
                else begin 
                    div_A<= 16'b1111_1111_1111_0001;  //为了实验随手改成了二进制负数
                    div_B<= 8'b1111_1110; 
                    start <= 1'b1; 
                end
                7: begin 
                    i <= 4'd7; 
                    
                end
            endcase 
        end
 endmodule