Getting Started with TLM-2.0-draft-2


A Series of Tutorials based on a set of Simple, Complete Examples


John Aynsley, Doulos, January 2008


Tutorial 6



Introduction

In this tutorial we show a bus with multiple initiators and multiple targets, and also show temporally decoupled initiators using the quantum keeper to keep track of time. This tutorial builds on the previous tutorials, and the example gets a little more complicated. Because we now have multiple initiators and multiple targets using the non-blocking transport interface, the interconnect component modeling the bus has to route transactions correctly on both the forward and backward paths.

Temporal Decoupling and the Quantum Keeper

We will start by looking at temporal decoupling in the initiators. As has been mentioned briefly in previous tutorials, simulation speed can be improved dramatically by reducing the amount of context switching between SystemC thread processes, which boils down to making fewer calls to wait. So instead of an initiator calling wait every time some simulated time is consumed, it keeps a tally of the time used in a local variable, and only calls wait from time to time to keep the initiators roughly synchronized. Provided that every initiator calls wait periodically, they can advance at approximately the same rate even in the absence of any explicit synchronization mechanisms in the application.

TLM2 provides a class, tlm_quantumkeeper, to help manage temporal decoupling. In principle it is straightforward to implement temporal decoupling using native SystemC constructs, but it is recommended practice to use the quantum keeper to give a consistent coding style.

In this example there are two initiators, one generating a series of write transactions and the second a series of read transactions. Both initiators are temporally decoupled and keep track of local time using the quantum keeper. At the start of simulation the first initiator sets the size of the global quantum:

SC_CTOR(Initiator1) ...
{
  ...
  m_qk.set_global_quantum( sc_time(1, SC_US) );
  m_qk.reset();
}

tlm::tlm_quantumkeeper m_qk;

You can create as many instances of tlm_quantumkeeper as you like and they will all share the same global quantum. set_global_quantum is a static method, so after calling the method it is important to call the reset method to have the local quantum keeper object pick up the new global quantum value. The quantum keeper will now keep track of the local time offset on behalf of the initiator, that is, how far the initiator is allowed to warp time ahead of the current simulation time. When the initiator needs to consume some time, it should do so using the quantum keeper:

m_qk.inc( sc_time(100, SC_NS) );

The key insight is that this call does not advance simulation time (as returned by sc_time_stamp()), but only advances the local time offset. The initiator has become decoupled from the SystemC simulation time, and is pushing out into the future on its own. The author of the code is responsible for ensuring that things are well-behaved. The initiator needs to check explicitly whether local time has reached or passed the next quantum boundary, and if it has, to synchronize the initiator with simulation time:

if (m_qk.need_sync()) m_qk.sync();

The sync method simply calculates the period from the current simulation time to the next quantum boundary and calls wait to suspend the process for that time. If every initiator follows the same procedure, then every initiator will progress forward in time together with a time granularity given by the global quantum.
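The inc / need_sync / sync pattern can be sketched in plain C++ without SystemC. This is only an illustrative model, not the tlm_quantumkeeper implementation: the class ToyQuantumKeeper, the helper run_initiator, and the use of integer nanoseconds are all assumptions made for this sketch. It also assumes that sync advances simulation time by the accumulated local offset, which coincides with the next quantum boundary when the increments divide the quantum evenly.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for tlm_quantumkeeper. All times are in nanoseconds,
// and sim_time_ns_ plays the role of sc_time_stamp().
class ToyQuantumKeeper {
public:
    explicit ToyQuantumKeeper(std::uint64_t global_quantum_ns)
        : quantum_ns_(global_quantum_ns) {
        assert(global_quantum_ns > 0);
        reset();
    }

    // Start a new quantum: clear the offset and locate the next boundary.
    void reset() {
        local_ns_ = 0;
        next_sync_ns_ = sim_time_ns_ + quantum_ns_ - (sim_time_ns_ % quantum_ns_);
    }

    // Consume time locally; simulation time is not touched.
    void inc(std::uint64_t t_ns) { local_ns_ += t_ns; }

    // True once local time has reached or passed the quantum boundary.
    bool need_sync() const { return sim_time_ns_ + local_ns_ >= next_sync_ns_; }

    // Stand-in for wait(): simulation time catches up with local time.
    void sync() {
        sim_time_ns_ += local_ns_;
        ++wait_calls_;
        reset();
    }

    std::uint64_t sim_time_ns() const { return sim_time_ns_; }
    std::uint64_t local_time_ns() const { return local_ns_; }
    unsigned wait_calls() const { return wait_calls_; }

private:
    std::uint64_t quantum_ns_;
    std::uint64_t sim_time_ns_ = 0;
    std::uint64_t local_ns_ = 0;
    std::uint64_t next_sync_ns_ = 0;
    unsigned wait_calls_ = 0;
};

// Run `steps` iterations of the inc / need_sync / sync pattern from the text.
inline ToyQuantumKeeper run_initiator(unsigned steps, std::uint64_t step_ns,
                                      std::uint64_t quantum_ns) {
    ToyQuantumKeeper qk(quantum_ns);
    for (unsigned i = 0; i < steps; ++i) {
        qk.inc(step_ns);
        if (qk.need_sync()) qk.sync();
    }
    return qk;
}
```

With a 1 us quantum and 100 ns steps, as in the tutorial fragments, run_initiator(100, 100, 1000) makes 10 stand-in wait calls rather than 100, which is the saving temporal decoupling is after.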

Whenever a temporally decoupled initiator calls nb_transport, it should pass the local time offset as the delay argument, again using the quantum keeper:

status = socket->nb_transport( *trans, phase, m_qk.get_local_time() );

As we have seen in previous tutorials, the target can decide whether it is able to run in temporally decoupled mode, or whether it needs to synchronize with simulation time before executing the transaction. The target makes its intentions known to the initiator through the return value from nb_transport:

switch (status) {
  case tlm::TLM_REJECTED:
  case tlm::TLM_ACCEPTED:
    // The target needs to run, so synchronize on demand
    m_qk.sync();
    break;
  ...
}

Return values of TLM_REJECTED and TLM_ACCEPTED both imply that the initiator should yield control at some point in order to allow the target to make progress. If the target has rejected the transaction now, perhaps it will be able to accept it if it is allowed to execute a little further. If the target has accepted the transaction, it is announcing its intention to send a response using the backward path, but in order to do so it must be allowed to execute. This mechanism is known as synchronization on demand. The target is asking the initiator to synchronize and thus drop out of its time warp. Synchronization on demand always has an adverse effect on simulation speed, because it means executing an extra wait.

Temporal Decoupling in the Target

In this example, for the sake of illustration, we have contrived the target memory so that it supports temporal decoupling for the write command but not for the read command. This is not meant to be particularly realistic, but it gives an easy way of demonstrating this feature of the non-blocking transport interface. The memory model is similar to the previous tutorial:

virtual tlm::tlm_sync_enum nb_transport(tlm::tlm_generic_payload& trans,
                                        tlm::tlm_phase& phase, sc_time& delay)
{
  tlm::tlm_command cmd = trans.get_command();

  if (cmd == tlm::TLM_READ_COMMAND && delay > SC_ZERO_TIME)
  {
    m_peq.notify( trans, phase, delay );
    return tlm::TLM_ACCEPTED;
  }

  // Read from or write to memory
  ...

  trans.set_response_status( tlm::TLM_OK_RESPONSE );
  delay += LATENCY;
  return tlm::TLM_COMPLETED;
}

For a delayed read command, the target pushes the transaction into a queue and asks the initiator to synchronize by returning a value of TLM_ACCEPTED. Otherwise, the target executes the transaction immediately and accumulates the memory latency into the delay argument before returning with TLM_COMPLETED. Increasing the value of the delay argument effectively advances the local time offset from the current simulation time.
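The decision made by the target, and the matching reaction in the initiator, can be sketched in plain C++. The types ToyCommand, ToySync and ToyTarget, the helper initiator_must_sync, and the latency value kToyLatencyNs are hypothetical and stand in for the TLM types; the tutorial does not state the value of LATENCY, so 10 ns is just a placeholder.

```cpp
#include <cassert>
#include <cstdint>

enum class ToyCommand { Read, Write };
enum class ToySync { Rejected, Accepted, Completed };  // mirrors the names in the text

// Placeholder latency in nanoseconds (assumed value, not from the tutorial).
constexpr std::uint64_t kToyLatencyNs = 10;

struct ToyTarget {
    unsigned queued = 0;  // stand-in for m_peq.notify: counts deferred reads

    // delay_ns is the initiator's local time offset, passed by reference.
    ToySync nb_transport(ToyCommand cmd, std::uint64_t& delay_ns) {
        if (cmd == ToyCommand::Read && delay_ns > 0) {
            // Delayed read: defer it and ask the initiator to synchronize.
            ++queued;
            return ToySync::Accepted;
        }
        // Execute immediately and accumulate the latency into the delay.
        delay_ns += kToyLatencyNs;
        return ToySync::Completed;
    }
};

// Initiator side: both Rejected and Accepted ask for synchronization on demand.
inline bool initiator_must_sync(ToySync status) {
    return status == ToySync::Rejected || status == ToySync::Accepted;
}
```

In this model a write with a non-zero delay completes at once and comes back with a larger delay, while a read with a non-zero delay leaves the delay unchanged and forces the initiator to yield.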

Pooling Transactions

For loosely-timed models, delaying the execution of the command in the target can have the effect that a new transaction may be generated before the target has completed the previous transaction. In order to avoid overwriting the previous transaction before it has been completed, the initiator generating the read transactions uses a pool of transaction objects. Let us look at the relevant fragments of the thread process in the initiator:

void thread_process()
{
  m_qk.reset();

  tlm::tlm_generic_payload* trans_pool[POOL_SIZE];
  for (int i = 0; i < POOL_SIZE; i++)
    trans_pool[i] = new tlm::tlm_generic_payload;
  int count = 0;

  tlm::tlm_generic_payload* trans;
  tlm::tlm_phase phase;

  for (int i = 0; i < RUN_LENGTH; i += 4)
  {
    trans = trans_pool[count = (count + 1) % POOL_SIZE];

    ...
    trans->set_data_length( 4 );
    trans->set_response_status( tlm::TLM_INCOMPLETE_RESPONSE );

    ...
    status = socket->nb_transport( *trans, phase, m_qk.get_local_time() );
    ...

    m_qk.inc( sc_time(100, SC_NS) );
    if (m_qk.need_sync()) m_qk.sync();
  }
}

The process starts by allocating a pool of transactions, then cycles through the transactions in the pool as it calls nb_transport. Since the transaction objects in the pool are being reused, it is important to set every transaction attribute that might have been changed, including the response status. The response status should have been set to TLM_OK_RESPONSE the last time the transaction object was used, so it needs to be set back to TLM_INCOMPLETE_RESPONSE, which is the default value.
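The round-robin reuse and the reset of the response status can be sketched in plain C++. ToyPayload, ToyResponse and ToyPool are hypothetical stand-ins for the generic payload and the pool array; the sketch keeps the objects in a std::array rather than allocating them with new, which is a choice made here for brevity and not the tutorial's own code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

enum class ToyResponse { Incomplete, Ok };

// Hypothetical stand-in for tlm_generic_payload with two attributes.
struct ToyPayload {
    unsigned data_length = 0;
    ToyResponse response = ToyResponse::Incomplete;
};

template <std::size_t PoolSize>
class ToyPool {
    static_assert(PoolSize > 0, "pool must not be empty");

public:
    // Advance round-robin, as count = (count + 1) % POOL_SIZE does in the text,
    // and reset every attribute a previous use might have changed.
    ToyPayload& next() {
        index_ = (index_ + 1) % PoolSize;
        ToyPayload& p = pool_[index_];
        p.data_length = 4;
        p.response = ToyResponse::Incomplete;
        return p;
    }

    std::size_t index() const { return index_; }

private:
    std::array<ToyPayload, PoolSize> pool_{};
    std::size_t index_ = 0;
};
```

A pool of size 3 hands out the same object again on every third call, so a target that still holds a reference must have finished with it by then; choosing the pool size is therefore part of the model's design.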

The code fragment above also shows the use of the quantum keeper to maintain the local time.

Multiple Initiators and Multiple Targets

With multiple initiators and multiple targets, we require a bus model that is able to route transactions in both the forward and backward directions. This complicates the implementation of the interconnect component, because it has to remember where the transaction came from in order to send the response back through the correct socket.

[Figure: Diagram of bus]

Another complication is that the interconnect component must implement the interface methods for multiple socket instances and must be able to identify the socket through which each interface method call arrived. The TLM 2.0-draft-2 kit shows an example using specialized socket classes together with some machinery to register methods with sockets. In this example we take a different approach by having the interconnect model implement interface methods that are tagged with a socket id.

The top-level module is straightforward, with two distinct initiators, and four identical targets:

SC_MODULE(Top)
{
  Initiator1* init1;
  Initiator2* init2;
  Bus<2,4>*   bus;
  Memory*     memory[4];

  SC_CTOR(Top)
  {
    init1 = new Initiator1("init1");
    init2 = new Initiator2("init2");
    bus   = new Bus<2,4>  ("bus");

    for (int i = 0; i < 4; i++)
    {
      char txt[20];
      sprintf(txt, "memory_%d", i);
      memory[i] = new Memory(txt);

      ( *(bus->init_socket[i]) ).bind( memory[i]->socket );
    }

    init1->socket.bind( *(bus->targ_socket[0]) );
    init2->socket.bind( *(bus->targ_socket[1]) );
  }
};

The bus model is a class template with template arguments giving the number of initiators and targets:

template<unsigned int N_INITIATORS, unsigned int N_TARGETS>
struct Bus: sc_module, non_std::tlm_tagged_fw_nb_transport_if<>,
                       non_std::tlm_tagged_bw_nb_transport_if<>
{
  non_std::tlm_tagged_nb_target_socket<>*    targ_socket[N_INITIATORS];
  non_std::tlm_tagged_nb_initiator_socket<>* init_socket[N_TARGETS];

  ...
};

The tagged sockets are not part of the TLM 2.0-draft-2 kit, but are part of this example, written by Doulos. They are specialized classes derived from the standard sockets to support the socket tagging mechanism. For example, the tagged target socket is derived from the standard target socket:

template<unsigned int BUSWIDTH = 32,
          typename TYPES = tlm::tlm_generic_payload_types>
class tlm_tagged_nb_target_socket :
  public tlm::tlm_nb_target_socket <BUSWIDTH, TYPES>,
  public tlm::tlm_fw_nb_transport_if<TYPES>
{
public:
  tlm_tagged_nb_target_socket(const char* name, unsigned int id)
  : tlm::tlm_nb_target_socket<BUSWIDTH, TYPES>(name), m_id(id)
  { ... }
  ...
};

The constructors for the tagged sockets take an unsigned int socket id as an argument. Each instance of a tagged socket within a module should be given a unique id. The tagged target socket implements the methods of the forward nb_transport interface by annotating the socket id as an extra argument, for example:

virtual tlm::tlm_sync_enum nb_transport(tlm::tlm_generic_payload& trans,
                                        tlm::tlm_phase& phase, sc_time& delay)
{
  return m_parent->nb_transport(m_id, trans, phase, delay);
}

The implementation of the tagged sockets is very straightforward. The only trick is that the constructor of the tagged socket searches the SystemC object hierarchy to find its immediate parent module, which is required to implement the appropriate tagged interfaces. The tagged interfaces are also not part of the TLM 2.0-draft-2 kit. They consist of the core TLM 2.0 interfaces with a change of name and the addition of an id argument to each method. That’s all. The source code is included with this example.

So, we have two sets of interfaces, standard and tagged, and two sets of sockets, standard and tagged. The relationship between them is simple. The TLM2 standard interfaces and sockets (together with the generic payload) provide the interoperability interface between transaction level models. The tagged sockets in this example are derived from the standard TLM2 sockets, so can be bound to standard TLM2 sockets belonging to other models. The tagged interfaces and sockets are not part of the interoperability standard but are just for internal use within each model. They facilitate model writing by providing a mechanism for implementing the same interface method for multiple sockets and determining through which socket an incoming call was received. The bus model uses the tagged sockets and implements the tagged interfaces internally, but the sockets just look like regular standard TLM2 sockets from an external viewpoint. The tagged socket implementations intercept incoming interface method calls and add the tag before calling the methods of the bus model.
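The relationship between the standard and tagged interfaces can be sketched in plain C++. All names here (ToyTransportIf, ToyTaggedTransportIf, ToyTaggedSocket, ToyBus) are hypothetical, and the payload is reduced to a string. The sketch also passes the parent to the socket constructor explicitly, whereas the tagged sockets in the example find their parent by searching the SystemC object hierarchy.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical "standard" interface: what other models see. No socket id.
struct ToyTransportIf {
    virtual ~ToyTransportIf() = default;
    virtual void transport(const std::string& msg) = 0;
};

// Hypothetical tagged interface: the same method plus an id argument.
struct ToyTaggedTransportIf {
    virtual ~ToyTaggedTransportIf() = default;
    virtual void transport(unsigned id, const std::string& msg) = 0;
};

// Tagged socket: looks standard from outside, adds the tag on the way in.
class ToyTaggedSocket : public ToyTransportIf {
public:
    ToyTaggedSocket(ToyTaggedTransportIf& parent, unsigned id)
        : parent_(parent), id_(id) {}

    void transport(const std::string& msg) override {
        parent_.transport(id_, msg);  // intercept the call and annotate the id
    }

private:
    ToyTaggedTransportIf& parent_;
    unsigned id_;
};

// A model with two sockets sharing one implementation of the method.
struct ToyBus : ToyTaggedTransportIf {
    ToyTaggedSocket socket0{*this, 0};
    ToyTaggedSocket socket1{*this, 1};
    std::vector<unsigned> seen_ids;  // records which socket each call used

    void transport(unsigned id, const std::string&) override {
        seen_ids.push_back(id);
    }
};
```

A caller holding only a ToyTransportIf reference to socket0 or socket1 never sees the id, yet ToyBus receives it on every call, which mirrors how the bus sockets appear as regular sockets from an external viewpoint.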

Next we will illustrate the use of the tagged interfaces within the bus model, starting with one of the simpler cases, the backward DMI interface:

virtual void invalidate_direct_mem_ptr(unsigned int id,
                                       sc_dt::uint64 start_range,
                                       sc_dt::uint64 end_range)
{
  for (unsigned int i = 0; i < N_INITIATORS; i++)
    (*targ_socket[i])->invalidate_direct_mem_ptr(
                                  compose_address(id, start_range),
                                  compose_address(id, end_range));
}

The above method is a refinement of a similar method for the router in a previous tutorial. Now, using the id argument, the bus knows which target memory is invalidating its DMI pointer, so it is able to perform the reverse transformation on the address back into the address space used by the initiators. Also, since there are now multiple initiators, the same call gets propagated back to all initiators.
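The forward and reverse address transformations can be illustrated with a short sketch. The tutorial text does not show the body of compose_address, so the address layout below is an assumption made purely for illustration: bits [9:8] select one of four targets and bits [7:0] are the address local to that target. The name decode_address and the constants are likewise hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout: bits [9:8] = target number, bits [7:0] = local address.
constexpr unsigned kLocalBits = 8;
constexpr std::uint64_t kLocalMask = (std::uint64_t{1} << kLocalBits) - 1;

// Forward transformation: split an initiator-side address into a target
// number (returned) and the address within that target (masked_address).
inline unsigned decode_address(std::uint64_t address,
                               std::uint64_t& masked_address) {
    masked_address = address & kLocalMask;
    return static_cast<unsigned>((address >> kLocalBits) & 0x3);
}

// Reverse transformation: rebuild the initiator-side address from the
// target number and the address within the target.
inline std::uint64_t compose_address(unsigned target_nr,
                                     std::uint64_t address) {
    assert(target_nr < 4);
    return (static_cast<std::uint64_t>(target_nr) << kLocalBits) |
           (address & kLocalMask);
}
```

Under this assumed layout, address 0x2A4 decodes to target 2 with local address 0xA4, and composing those two values gives 0x2A4 again. Applying the same composition to both ends of a DMI range maps the range back into the initiators' address space.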

The implementation of the tagged non-blocking transport method in the bus has a little more work to do because it could be called on either the forward or the backward paths. The overall outline of the implementation is as follows:

virtual tlm::tlm_sync_enum nb_transport(unsigned int id,
    tlm::tlm_generic_payload& trans, tlm::tlm_phase& phase, sc_time& delay)
{
  if (id < N_INITIATORS)
  {
    // Forward path
    m_id_map[ &trans ] = id;
    ...
  }
  else if (id < N_INITIATORS + N_TARGETS)
  {
    // Backward path
    sc_dt::uint64 address = trans.get_address();
    trans.set_address( compose_address( id - N_INITIATORS, address ) );

    return ( *(targ_socket[ m_id_map[ &trans ] ]) )->nb_transport(
                                                     trans, phase, delay);
  }
  else
  {
    SC_REPORT_FATAL("TLM2", "Invalid tagged socket id in bus");
    return tlm::TLM_COMPLETED;
  }
}

std::map <tlm::tlm_generic_payload*, unsigned int> m_id_map;

The method uses the id to determine whether it is being called on the forward or the backward path. On the forward path, it stores the association between the transaction and the id in a map. On the backward path, it uses this association to route the call back through the same socket to the original initiator, having first restored the original address in the transaction (the same reverse address transformation as used by the backward DMI method).
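The routing logic can be sketched in plain C++ as follows. ToyTrans, ToyRoute and ToyRouter are hypothetical, the sockets are reduced to index numbers, and the forward-path decode (target number in bits [9:8] of the address) is an assumption, since the tutorial elides that part of the method.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

struct ToyTrans { std::uint64_t address = 0; };

// Which way the call goes and which socket index it leaves through.
struct ToyRoute { bool forward; unsigned socket; };

template <unsigned N_INITIATORS, unsigned N_TARGETS>
class ToyRouter {
public:
    ToyRoute nb_transport(unsigned id, ToyTrans& trans) {
        if (id < N_INITIATORS) {
            // Forward path: remember which initiator sent this transaction.
            id_map_[&trans] = id;
            unsigned target =
                static_cast<unsigned>((trans.address >> 8) % N_TARGETS);
            trans.address &= 0xFF;  // assumed decode: keep the local address
            return {true, target};
        }
        // Backward path: ids N_INITIATORS.. belong to the target-side sockets.
        assert(id < N_INITIATORS + N_TARGETS);
        unsigned target = id - N_INITIATORS;
        trans.address =
            (static_cast<std::uint64_t>(target) << 8) | (trans.address & 0xFF);
        auto it = id_map_.find(&trans);
        assert(it != id_map_.end());  // response without a matching request
        return {false, it->second};
    }

private:
    std::map<const ToyTrans*, unsigned> id_map_;
};
```

Because the map is keyed on the address of the transaction object, this only works while each in-flight transaction is a distinct object, which is one more reason the initiator uses a pool. A fuller model would probably also erase the map entry once a transaction completes; the tutorial fragment does not show that step.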

When you run this example, you should see the following in the output log from the simulation:

  • The target memory contents being dumped out at the start and again at the end of simulation using the debug transaction interface.
  • Write transactions being executed using both the transport and direct memory interfaces.
  • Read transactions being completed using both the forward and backward paths.
  • The DMI pointers being invalidated once in the middle of the run.

You will find the source code for this example in file tlm2_getting_started_6.cpp.

