Saturday 24 February 2018


The Easier UVM Code Generator Tutorial Part 5: Split Transactors



This tutorial explains the split_transactors feature of the Code Generator.

The Idea

When running a UVM simulation on an accelerator or emulator box, the UVM test bench running on the host computer can easily become a bottleneck because it is running much slower than the DUT. The idea is to move as much of the UVM test bench code as possible onto the accelerator or emulator box in order to speed up simulation. This means splitting each transactor (UVM driver or monitor) into two parts, a lightweight proxy or wrapper that continues to run on the host, and a synthesizable part (BFM or Bus-Functional Model) that runs on the box and wiggles the pins of the DUT. In order to achieve this, the code has to be structured into two separate domains:

  • An untimed "HVL domain" that contains the UVM environment and which must not contain any delays or clocks
  • An "HDL domain" that instantiates the DUT and the synthesizable parts of the transactors and can be run entirely on the box

The solutions provided by most EDA tool vendors make use of the SCE-MI standard for communication between the host computer and the accelerator or emulator box. SCE-MI allows the communication between the host computer and the box to occur at the transaction level. The tool vendor may be able to synthesize selected parts of the UVM test bench straight onto the box (including generation of the necessary SCE-MI interfaces), or you may need to do some manual work to transform your UVM code into a form suitable for SCE-MI communication and for synthesis. This step will depend on the capabilities of your tool vendor and is beyond the scope of the Easier UVM Code Generator and this tutorial.

The files for this example are in the directory ./minimal_split_txor of the Easier UVM Code Generator release. We will describe the necessary user-defined code fragments and the generated code below.

Dual-Top Modules

You tell the Code Generator to create split transactors by including the setting split_transactors = yes in the common template file:

Filename common.tpl
dut_top = mydut
split_transactors = yes
top_default_seq_count = 5

This setting implies the setting dual_top = yes, which means that the Code Generator will create two separate top-level modules, one untimed and one synthesizable, although whether the test harness actually is synthesizable will depend on the user-defined code fragments you include.

The synthesizable test harness contains code to generate the clock and the reset and code to instantiate the BFM, the pin-level interface, and the DUT. This code is not necessarily synthesizable in its entirety using regular RTL synthesis tools, but there do exist dedicated synthesis tools that target accelerator and emulator boxes that could handle it. The only difference between the test harness below and a test harness without split transactors is that the test harness below instantiates the bus_bfm interface, which contains the synthesizable parts of the transactors for the bus agent:

module top_hdl_th;
  ...
  logic clock = 0;
  logic reset;

  always #10 clock = ~clock;

  initial
  begin
    reset = 0;
    #75 reset = 1;
  end

  assign bus_if_0.clk = clock;

  // Pin-level interfaces connected to DUT
  bus_if   bus_if_0 ();         

  // BFM interfaces that communicate with proxy transactors in UVM environment
  bus_bfm  bus_bfm_0 (bus_if_0);

  mydut uut (
    .bus_clk (bus_if_0.clk),
    .bus_cmd (bus_if_0.cmd),
    .bus_addr(bus_if_0.addr),
    .bus_data(bus_if_0.data)
  );

endmodule

The untimed test bench instantiates the UVM environment:

module top_untimed_tb;
  ...
  top_config env_config;

  initial
  begin
    env_config = new("env_config");
    if ( !env_config.randomize() )
      `uvm_error("top_untimed_tb", "Failed to randomize configuration object" )

    env_config.bus_vif             = top_hdl_th.bus_bfm_0;
    env_config.is_active_bus       = UVM_ACTIVE;          
    env_config.checks_enable_bus   = 1;                   
    env_config.coverage_enable_bus = 1;                   

    uvm_config_db #(top_config)::set(null, "uvm_test_top.m_env", "config", env_config);

    run_test();
  end

endmodule

The important point to note about the untimed test bench (apart from the fact that it must be untimed) is that the virtual interface (within the top-level configuration object) is set to point to the BFM interface within the test harness top_hdl_th.bus_bfm_0. This is a full hierarchical reference to the interface instance bus_bfm_0 within the top-level module top_hdl_th. When the UVM driver and monitor proxies communicate with the code outside the class-based verification environment using their local virtual interface variables, they will actually be communicating with the BFM in the synthesizable test harness.
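The hierarchical-reference mechanism described above can be sketched without UVM. This is only an illustration under stated assumptions: the names demo_bfm, demo_pkg, demo_proxy, demo_hdl_th and demo_untimed_tb are hypothetical stand-ins and are not produced by the Code Generator.

```systemverilog
// Minimal sketch, no UVM: every name here is a hypothetical stand-in.
interface demo_bfm (input logic clk);
  logic [7:0] data;

  // HDL-side task: the only place where time advances
  task drive(input byte value);
    @(posedge clk);
    data <= value;
  endtask
endinterface : demo_bfm

package demo_pkg;
  // Stand-in for a UVM driver proxy that holds a virtual interface
  class demo_proxy;
    virtual demo_bfm vif;

    task send(byte value);
      vif.drive(value);  // executes inside the BFM instance in the HDL top
    endtask
  endclass
endpackage : demo_pkg

// HDL top: clock generation and the BFM instance
module demo_hdl_th;
  logic clock = 0;
  always #10 clock = ~clock;

  demo_bfm bfm_0 (clock);
endmodule

// Untimed top: contains no delays and no clock references of its own
module demo_untimed_tb;
  import demo_pkg::*;

  demo_proxy proxy;

  initial
  begin
    proxy = new();
    proxy.vif = demo_hdl_th.bfm_0;  // full hierarchical reference
    proxy.send(8'h5A);
    $finish;
  end
endmodule
```

Both demo_hdl_th and demo_untimed_tb have to be elaborated as top-level modules for the hierarchical reference to resolve, which mirrors the dual-top arrangement.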

The Driver Proxy

The Easier UVM Code Generator interface template file contains settings to include the various user-defined code fragments necessary to define the UVM driver, which is just a proxy or wrapper for the full driver (transactor), most of which is implemented on the HDL side in the synthesizable BFM:

Filename bus.tpl
...
trans_inc_before_class       = bus_trans_inc_before_class.sv    inline
driver_inc_inside_class      = bus_driver_inc_inside_class.sv   inline
driver_inc_after_class       = bus_driver_inc_after_class.sv    inline
...

Filename bus_trans_inc_before_class.sv
typedef struct packed {
  bit cmd;
  byte addr;
  byte data;
} bus_tx_s;

Filename bus_driver_inc_inside_class.sv
extern task run_phase(uvm_phase phase);

Filename bus_driver_inc_after_class.sv
task bus_driver::run_phase(uvm_phase phase);
  forever
  begin
    bus_tx_s req_s;
    seq_item_port.get_next_item(req);

    // Copy fields to packed struct
    req_s.cmd  = req.cmd;
    req_s.addr = req.addr;
    req_s.data = req.data;
    
    // Call HDL-side transactor
    vif.drive(req_s);

    seq_item_port.item_done();
  end
endtask : run_phase 

The run_phase method of the UVM driver gets one transaction at a time from the sequencer and calls the drive task of the BFM through its virtual interface. Because the "interface" between the HVL and HDL domains has to map onto the SCE-MI interface of the accelerator/emulator, it cannot pass arbitrary data types, such as handles to objects, but is restricted to synthesizable data types. One reasonable approach, the one shown here, is to copy the fields of the transaction object into a single packed struct, which is then passed as an argument to the drive task (and ultimately passed across the SCE-MI interface from the host to the emulator). The typedef for the packed struct can be included before the uvm_sequence_item class (setting trans_inc_before_class), thus making it available to all the classes associated with the particular agent (see FAQ).
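One way to keep the field copying in a single place is to give the transaction class a pair of conversion helpers. The sketch below is an assumption, not generated code: demo_tx is a hypothetical stand-in for the bus_tx class without its UVM base class, and to_struct and from_struct are hypothetical helper names.

```systemverilog
package demo_conv_pkg;

  // Same packed struct as in bus_trans_inc_before_class.sv
  typedef struct packed {
    bit  cmd;
    byte addr;
    byte data;
  } bus_tx_s;

  // Hypothetical stand-in for the transaction class (no UVM base class)
  class demo_tx;
    rand bit  cmd;
    rand byte addr;
    rand byte data;

    // Hypothetical helper: object fields -> packed struct
    function bus_tx_s to_struct();
      bus_tx_s s;
      s.cmd  = cmd;
      s.addr = addr;
      s.data = data;
      return s;
    endfunction

    // Hypothetical helper: packed struct -> object fields
    function void from_struct(bus_tx_s s);
      cmd  = s.cmd;
      addr = s.addr;
      data = s.data;
    endfunction
  endclass

endpackage : demo_conv_pkg

module demo_conv_test;
  import demo_conv_pkg::*;

  demo_tx  a, b;
  bus_tx_s s;

  initial
  begin
    a = new();
    b = new();
    a.cmd = 1; a.addr = 8'h12; a.data = 8'h34;

    s = a.to_struct();   // what a driver proxy would pass to the drive task
    b.from_struct(s);    // what a monitor proxy would do in the other direction

    if (b.cmd !== a.cmd || b.addr !== a.addr || b.data !== a.data)
      $error("round trip through bus_tx_s lost data");
    else
      $display("round trip OK, struct is %0d bits wide", $bits(s));
  end
endmodule
```

With helpers like these, the body of the driver's loop could shrink to a single call that passes req.to_struct() to the drive task, at the cost of adding methods to the transaction class.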

Again, the significant point about the code above is that it is just a proxy or wrapper for the real work of the driver, which is actually done by code (within the drive task) that is synthesized onto the emulator box.

The Synthesizable BFM

The synthesizable parts of the drivers and monitors must be provided as user-defined code fragments and included within the BFM:

Filename bus.tpl
...
agent_inc_inside_bfm = bus_inc_inside_bfm.sv  inline
...

After code generation, the resultant BFM will be as follows:

interface bus_bfm(bus_if if_port); 

  import bus_pkg::*;

  // Start of inlined include file generated_tb/tb/include/bus_inc_inside_bfm.sv
  // HDL-side synthesizable transactors
  
  task drive(bus_tx_s req_s);
    @(posedge if_port.clk);
    if_port.cmd  <= req_s.cmd;
    if_port.addr <= req_s.addr;
    if_port.data <= req_s.data;
  endtask
  ...
  // End of inlined include file

endinterface : bus_bfm

This particular driver is trivial; an actual driver would have to do whatever is needed to implement the particular protocol being used. The code must comply with whatever rules are imposed by the tool vendor to make it synthesizable for the purposes of acceleration/emulation. These rules are usually less restrictive than those of RTL synthesis. For example, many vendors support an implicit FSM style with multiple @(posedge clock) statements within a single procedure.
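The implicit FSM style mentioned above might look like the following sketch. The three-phase protocol and the valid signal are hypothetical and are not part of the original example. Whether a given accelerator or emulator flow accepts this exact code depends on the vendor's rules.

```systemverilog
// Hypothetical protocol used only to illustrate the coding style
interface demo_fsm_bfm (input logic clk);
  logic       valid = 0;
  logic [7:0] addr;
  logic [7:0] data;

  // Implicit FSM: each @(posedge clk) marks a state boundary
  task drive(input logic [7:0] a, input logic [7:0] d);
    @(posedge clk);      // state 1: address phase
    valid <= 1;
    addr  <= a;
    @(posedge clk);      // state 2: data phase
    data  <= d;
    @(posedge clk);      // state 3: return to idle
    valid <= 0;
  endtask
endinterface : demo_fsm_bfm

module demo_fsm_top;
  logic clock = 0;
  always #10 clock = ~clock;

  demo_fsm_bfm bfm (clock);

  initial
  begin
    bfm.drive(8'h10, 8'hA5);
    bfm.drive(8'h11, 8'h5A);
    @(posedge clock);
    $display("done at time %0t", $time);
    $finish;
  end
endmodule
```

The state of the protocol is held implicitly by the position of the task within its sequence of clock events, so no explicit state register or case statement is written.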

Note that the BFM is an interface that has an interface port (if_port) which will refer to the pin-level interface after both interfaces have been instantiated. Any user-defined code fragments will need to refer to the pin-level interface using this interface port.

Passing Information Back to the Monitor Proxy

The code above shows the downstream route by which a driver proxy on the host passes information across to a transactor implemented on the box. It is possible to implement the monitor in exactly the same way, that is, a monitor proxy on the host can make a time-consuming task call to get information from a transactor on the box, for example:

task bus_monitor::run_phase(uvm_phase phase);
  forever
  begin
    bus_tx_s rsp_s;
    bus_tx tx;

    vif.get_next_transaction_from_emulator(rsp_s);

    tx = bus_tx::type_id::create("tx");
    tx.cmd  = rsp_s.cmd;
    tx.addr = rsp_s.addr;
    tx.data = rsp_s.data;
    analysis_port.write(tx);
  end
endtask : run_phase 

The downside of this approach is that it is an inefficient way to pass information over the communication link between the host and the emulator. It is more efficient to have the emulator push information to the host only when it is ready to do so. In this way both simulator and emulator can continue to execute in parallel: the emulator does not need to stall while waiting for the test bench to catch up. This approach is enabled by first setting up a back pointer from the BFM to the UVM monitor proxy object, as follows:

interface bus_bfm(bus_if if_port); 

  import bus_pkg::*;

  // Start of inlined include file generated_tb/tb/include/bus_inc_inside_bfm.sv
  ...
  import bus_pkg::bus_monitor;
  bus_monitor proxy_back_ptr;
  
  task run;
    forever
    begin
      bus_tx_s req_s;
      @(posedge if_port.clk);
      req_s.cmd  = if_port.cmd;
      req_s.addr = if_port.addr;
      req_s.data = if_port.data;
      proxy_back_ptr.write(req_s);
    end
  endtask
  // End of inlined include file

endinterface : bus_bfm

The synthesized BFM on the box now initiates monitor transactions when it is ready to do so. The write method is implemented within the UVM monitor proxy in the usual way:

// Start of inlined include file generated_tb/tb/include/bus_monitor_inc_after_class.sv
task bus_monitor::run_phase(uvm_phase phase);
  vif.proxy_back_ptr = this;
  vif.run();
endtask

function void bus_monitor::write(bus_tx_s req_s);
  bus_tx tx;
  tx = bus_tx::type_id::create("tx");
  tx.cmd  = req_s.cmd;
  tx.addr = req_s.addr;
  tx.data = req_s.data;
  analysis_port.write(tx);
endfunction
// End of inlined include file

Note that the monitor proxy assigns the back-pointer and starts a process running on the box at the start of the UVM run phase. It then sits passively until the transactor on the box calls its write method, at which point it creates a new UVM transaction and sends it out through its analysis port.
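The back-pointer pattern can be tried out without UVM using a sketch such as the one below. All names prefixed demo_ are hypothetical. Unlike the tutorial code, the back pointer is assigned by the top-level module instead of by the proxy through a virtual interface, purely to keep the example short.

```systemverilog
package demo_mon_pkg;
  typedef struct packed {
    bit  cmd;
    byte addr;
    byte data;
  } bus_tx_s;

  // Hypothetical stand-in for the monitor proxy (no UVM base class)
  class demo_monitor;
    int count = 0;

    // Called from the HDL side; a function, so it consumes no time
    function void write(bus_tx_s s);
      count++;
      $display("monitor proxy got cmd=%0b addr=%0h data=%0h",
               s.cmd, s.addr, s.data);
    endfunction
  endclass
endpackage : demo_mon_pkg

interface demo_mon_bfm (input logic clk, input logic cmd,
                        input logic [7:0] addr, input logic [7:0] data);
  import demo_mon_pkg::*;

  demo_monitor proxy_back_ptr;

  // Samples the pins on every clock and pushes the result to the proxy
  task run();
    forever
    begin
      bus_tx_s s;
      @(posedge clk);
      s.cmd  = cmd;
      s.addr = addr;
      s.data = data;
      proxy_back_ptr.write(s);
    end
  endtask
endinterface : demo_mon_bfm

module demo_mon_top;
  import demo_mon_pkg::*;

  logic       clock = 0;
  logic       cmd   = 1;
  logic [7:0] addr  = 8'h20;
  logic [7:0] data  = 8'h7F;
  always #10 clock = ~clock;

  demo_mon_bfm bfm (clock, cmd, addr, data);
  demo_monitor mon;

  initial
  begin
    mon = new();
    bfm.proxy_back_ptr = mon;   // set the back pointer first
    fork
      bfm.run();                // then start the HDL-side process
    join_none
    repeat (3) @(posedge clock);
    #1 $display("transactions seen: %0d", mon.count);
    $finish;
  end
endmodule
```

The order matters: the back pointer is assigned before run is started, so the first call to write never goes through a null handle.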



