Getting Started with TLM-2.0-draft-2


A Series of Tutorials based on a set of Simple, Complete Examples


John Aynsley, Doulos, January 2008


Tutorial 7



Introduction

In this tutorial we concentrate on socket widths and endianness using the generic payload. We show a bridge component with an incoming 64-bit socket converted down to an outgoing 8-bit socket, and in particular focus on the endianness of the width conversion, that is, on the order in which outgoing bytes are sent. The example also illustrates how to use byte enables, and how to handle burst transactions where the number of bytes transferred is greater than the socket width.

Socket Width

The code for this example is quite lengthy, but the architecture is very simple: there is one initiator, one bridge, and one target memory. We will start by considering the sockets. The initiator has a 64-bit socket, the memory an 8-bit socket, and the bridge converts down from 64 to 8 bits. The bridge uses the tagged interfaces and sockets introduced in the previous tutorial:

struct Initiator: sc_module, tlm::tlm_bw_nb_transport_if<>
{
  tlm::tlm_nb_initiator_socket<64> socket;
  ...
};

template<tlm::tlm_endianness ENDIANNESS>
struct Bridge: sc_module, non_std::tlm_tagged_fw_nb_transport_if<>,
                          non_std::tlm_tagged_bw_nb_transport_if<>
{
  non_std::tlm_tagged_nb_target_socket<64>   targ_socket;
  non_std::tlm_tagged_nb_initiator_socket<8> init_socket;
  ...
};

struct Memory: sc_module, tlm::tlm_fw_nb_transport_if<>
{
  tlm::tlm_nb_target_socket<8> socket;
  ...
};

The top-level module connecting the components together is straightforward:

SC_MODULE(Top)
{
  typedef Bridge<tlm::TLM_BIG_ENDIAN> bridge_t;

  Initiator* init;
  bridge_t*  bridge;
  Memory*    memory;

  SC_CTOR(Top)
  {
    init   = new Initiator("init");
    bridge = new bridge_t("bridge");
    memory = new Memory("memory");

    init->socket.bind( bridge->targ_socket );
    bridge->init_socket.bind( memory->socket );
  }
};

Notice that in this instance the bridge is configured to perform a big-endian width conversion, that is, it takes the bytes from the incoming 64-bit socket and sends them through the outgoing 8-bit socket in the order MSB..LSB. All the socket classes have a method get_bus_width that returns the socket width in bits. Every component calculates the socket width in bytes, because the attributes of the generic payload are given in bytes. For example, the initiator:

const unsigned int W; // Bus width in bytes

SC_CTOR(Initiator) : socket("socket"), W(socket.get_bus_width() / 8)

The bridge imposes some restrictions on the relationship between the bus width, the number of bytes in a burst transfer, and the number of byte enables. The number of bytes must be equal to or be a multiple of the bus width, and if byte enables are used, the bus width must be equal to or be a multiple of the number of byte enables. These restrictions simplify the calculations in this particular example; they are not imposed by the generic payload or the TLM 2.0 standard. The following code fragment is taken from the body of the nb_transport method in the bridge:

unsigned int len = trans.get_data_length();
bool*        byt = trans.get_byte_enable_ptr();
unsigned int bel = trans.get_byte_enable_length();

if (len % W)
{
  trans.set_response_status( tlm::TLM_BURST_ERROR_RESPONSE );
  return tlm::TLM_COMPLETED;
}
if (byt && (W % bel))
{
  trans.set_response_status( tlm::TLM_BYTE_ENABLE_ERROR_RESPONSE );
  return tlm::TLM_COMPLETED;
}
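
The two checks above can be tried in isolation, without SystemC or the TLM kit. The following is only a sketch: the Check enum and the check_restrictions function are hypothetical names standing in for the TLM response status values, and the guard against a zero-length byte enable array is an addition that is not in the bridge code.

```cpp
#include <cassert>

// Hypothetical stand-in for the TLM response status values used above.
enum class Check { Ok, BurstError, ByteEnableError };

// W   : socket width in bytes
// len : data length in bytes
// bel : byte enable length, only examined when has_byte_enables is true
inline Check check_restrictions(unsigned W, unsigned len,
                                bool has_byte_enables, unsigned bel)
{
  assert(W > 0);
  if (len % W != 0)
    return Check::BurstError;       // length is not a whole number of words
  // Assumption: a zero-length enable array is treated as an error here,
  // which also avoids a division by zero in the modulo below.
  if (has_byte_enables && (bel == 0 || W % bel != 0))
    return Check::ByteEnableError;  // width is not a multiple of the enable count
  return Check::Ok;
}
```

With an 8-byte socket, for example, a 16-byte burst passes, a 12-byte burst is rejected, and a 3-element byte enable array is rejected.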

Byte Enables and Bursts

The generic payload has a pointer to an array of byte enables, each a boolean value. The pointer may be null, in which case byte enables are unused in that particular transaction. The target memory implements byte enables by masking off bytes when executing write transactions:

sc_dt::uint64  adr = trans.get_address();
unsigned char* ptr = trans.get_data_ptr();
unsigned int   len = trans.get_data_length();
bool*          byt = trans.get_byte_enable_ptr();
unsigned int   bel = trans.get_byte_enable_length();

if (cmd == tlm::TLM_WRITE_COMMAND) {
  if (byt) {
    for (unsigned int i = 0; i < len; i++)
      if ( byt[i % bel] )
        mem[adr+i] = ptr[i];
  }
  else
    memcpy(&mem[adr], ptr, len);
}

Element 0 of the byte enable array always corresponds to byte 0 of the data array, whatever the endianness. If the length of the byte enable array is less than that of the data array, the byte enable array is scanned repeatedly, as shown by the calculation byt[i % bel] above. If byte enables are unused, the target calls memcpy directly to optimize the copy operation.
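
The wrap-around behaviour of the byte enable array can be seen in a small standalone sketch. The masked_write function and its use of std::vector are illustrative assumptions, not part of the example source. An empty enable vector plays the role of a null byte enable pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Write 'data' into 'mem' starting at 'adr', honouring an optional byte
// enable array that is scanned repeatedly when shorter than the data.
// An empty 'enables' vector means that byte enables are unused.
inline void masked_write(std::vector<unsigned char>& mem, std::size_t adr,
                         const std::vector<unsigned char>& data,
                         const std::vector<bool>& enables)
{
  assert(adr + data.size() <= mem.size());
  for (std::size_t i = 0; i < data.size(); ++i)
    if (enables.empty() || enables[i % enables.size()])
      mem[adr + i] = data[i];
}
```

With a two-element enable array {true, false} and eight bytes of data, only the bytes at even offsets are written. The bytes at odd offsets keep their previous contents.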

A target may or may not support byte enables, but the general principle is that the more features of the generic payload a target supports, the greater the level of interoperability. The target memory in this example supports byte enables for write transactions, but not for read transactions. If a target does not support a given feature of the generic payload, it is obliged either to return an error response through the generic payload response status, or to call the SystemC report handler.

else if (cmd == tlm::TLM_READ_COMMAND) {
  if (byt) {
    trans.set_response_status( tlm::TLM_BYTE_ENABLE_ERROR_RESPONSE );
    return tlm::TLM_COMPLETED;
  }
  else
    memcpy(ptr, &mem[adr], len);
}

The target memory in this example supports burst transfers of any length. Moreover, it uses the burst length to estimate the latency of the transaction in loosely-timed mode:

virtual tlm::tlm_sync_enum nb_transport(tlm::tlm_generic_payload& trans,
                                        tlm::tlm_phase& phase, sc_time& delay)
{
  ...
  unsigned int len = trans.get_data_length();
  ...
  delay += LATENCY * len;
  return tlm::TLM_COMPLETED;
}

When an initiator creates a generic payload transaction with byte enables, it must create storage for the byte enable array and set the byte enable pointer and length attributes of the transaction object:

if (cmd == tlm::TLM_WRITE_COMMAND)
{
  static word_t byte_enable_mask = 0x0000010101010101ull;
  trans.set_byte_enable_ptr( reinterpret_cast<bool*>( &byte_enable_mask ) );
  trans.set_byte_enable_length( 8 );
}
else
  trans.set_byte_enable_ptr( 0 );

The generic payload always uses the endianness of the host computer and a word width calculated from the socket width, and since the order of the byte enables must match the order of the bytes, the byte enable array also uses the host endianness. Independent of whether the host computer is little endian or big endian, the code fragment above will enable the least significant 6 bytes and disable the most significant 2 bytes passed through a 64-bit socket. Whether the least-significant byte is actually byte 0 or byte 7 in the generic payload data array depends on host endianness.
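
The dependence on host endianness can be made visible with a short sketch that inspects the bytes of the mask in memory order. The mask_to_enables function is a hypothetical helper. It copies the mask into an array of unsigned char rather than casting to bool*, which is an assumption made to keep the sketch portable.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// View a 64-bit mask as eight byte enables in host memory order.
// Element 0 is the byte at the lowest address, whatever the host endianness.
inline std::array<bool, 8> mask_to_enables(std::uint64_t mask)
{
  unsigned char raw[8];
  std::memcpy(raw, &mask, sizeof raw);
  std::array<bool, 8> enables{};
  for (int i = 0; i < 8; ++i)
    enables[i] = (raw[i] != 0);
  return enables;
}
```

For the mask 0x0000010101010101, six elements are always true. On a little-endian host the two false elements are at indexes 6 and 7, and on a big-endian host they are at indexes 0 and 1.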

Width Conversion in the Bridge

Now we come to the tricky part. The bridge must pass transactions between the incoming 64-bit and outgoing 8-bit sockets along both the forward and backward paths. For the sake of completeness, here is the overall structure of the bridge module:

template<tlm::tlm_endianness ENDIANNESS>
struct Bridge: sc_module, non_std::tlm_tagged_fw_nb_transport_if<>,
                          non_std::tlm_tagged_bw_nb_transport_if<>
{
  non_std::tlm_tagged_nb_target_socket<64>   targ_socket;
  non_std::tlm_tagged_nb_initiator_socket<8> init_socket;

  const unsigned int W; // Incoming bus width in bytes

  SC_CTOR(Bridge)
  : targ_socket("targ_socket", 0), init_socket("init_socket", 1),
    W(targ_socket.get_bus_width() / 8) {}

  // Tagged non-blocking transport method
  virtual tlm::tlm_sync_enum nb_transport(unsigned int id,
      tlm::tlm_generic_payload& trans, tlm::tlm_phase& phase, sc_time& delay)
  {
    ...
  }

  // Tagged forward DMI method
  virtual bool get_direct_mem_ptr(unsigned int id,
                                  const sc_dt::uint64& address,
                                  tlm::tlm_dmi_mode& dmi_mode,
                                  tlm::tlm_dmi&  dmi_data)
  {
    return init_socket->get_direct_mem_ptr( address, dmi_mode, dmi_data );
  }

  // Tagged debug transaction method
  virtual unsigned int transport_dbg(unsigned int id,
                                     tlm::tlm_debug_payload& dbg)
  {
    return init_socket->transport_dbg( dbg );
  }

  // Tagged backward DMI method
  virtual void invalidate_direct_mem_ptr(unsigned int id,
                                         sc_dt::uint64 start_range,
                                         sc_dt::uint64 end_range)
  {
    targ_socket->invalidate_direct_mem_ptr(start_range, end_range);
  }
  ...
};

The bridge module implements all of the methods of the tagged versions of the forward and backward nb_transport interfaces. The unsigned int id tag permits a single implementation of nb_transport to serve both sockets. The DMI and debug transaction interfaces do not care about socket width, so they simply pass the method calls on. But as mentioned previously, the transport interface has to consider the endianness of the initiator as it forwards transactions between a 64-bit and an 8-bit socket.

Well, it turns out that if the endianness of the initiator (and hence the endianness of the width conversion) is the same as the endianness of the host, then the bytes in the generic payload data array will already be in the correct order to be sent out as a transaction through the 8-bit socket, so the bridge can act as a TLM2 interconnect component and simply forward the incoming transaction untouched. On the other hand, if the endianness of the width conversion and host endianness are different, then the bytes in the transaction have to be re-ordered. The only way to re-order the bytes is to have the bridge act as a target for the incoming transaction and as an initiator for a new outgoing transaction.
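
The decision between forwarding and re-ordering rests on comparing the conversion endianness with the host endianness. The sketch below shows one way this comparison might be made in plain C++. The Endianness enum, host_endianness and can_forward_untouched are hypothetical names, and the sketch is not the implementation of the hasHostEndianness helper in the TLM kit.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the tlm::tlm_endianness values used in the text.
enum class Endianness { Little, Big };

// Detect host byte order by inspecting the first byte of a known value.
inline Endianness host_endianness()
{
  const std::uint32_t probe = 0x01020304u;
  unsigned char first;
  std::memcpy(&first, &probe, 1);
  return first == 0x04 ? Endianness::Little : Endianness::Big;
}

// The bridge can forward a transaction untouched only when the endianness
// of the width conversion matches that of the host.
inline bool can_forward_untouched(Endianness conversion)
{
  return conversion == host_endianness();
}
```

On any given host, exactly one of the two conversion settings allows the transaction to be forwarded untouched.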

virtual tlm::tlm_sync_enum nb_transport(unsigned int id,
    tlm::tlm_generic_payload& trans, tlm::tlm_phase& phase, sc_time& delay)
{
  if ( hasHostEndianness( ENDIANNESS ) )
  {
    if (id == 0)
      return init_socket->nb_transport(trans, phase, delay); // Forward path
    else if (id == 1)
      return targ_socket->nb_transport(trans, phase, delay); // Backward path
    else
      ...
  }
  else
    if (id == 0) // Forward path
    {
      ...
      outgoing_trans = new tlm::tlm_generic_payload( trans ); // Shallow copy
      ...
    }
}

The first branch of the if statement above catches the case where the endianness of the conversion matches host endianness, so the conversion is effectively zero-cost. The transaction is simply passed along the forward or backward path without modification. The second branch is executed when the width conversion requires the bytes in the generic payload to be shuffled around, so a new transaction must be initiated. The mechanics of copying information from the incoming to the outgoing transaction are as follows. First, create a shallow copy of the incoming transaction:

tlm::tlm_command cmd = trans.get_command();
unsigned int     len = trans.get_data_length();
bool*            byt = trans.get_byte_enable_ptr();
unsigned int     bel = trans.get_byte_enable_length();
...
outgoing_trans = new tlm::tlm_generic_payload( trans ); // Shallow copy

Next, create the data buffer for the outgoing transaction and copy the data across, swapping the byte order:

unsigned char* incoming_data = trans.get_data_ptr();
unsigned char* outgoing_data = new unsigned char[len];

if (cmd == tlm::TLM_WRITE_COMMAND)
  for (unsigned int i = 0; i < len; i += W)
    for (unsigned int j = 0; j < W; j++)
      outgoing_data[ i + W - 1 - j ] = incoming_data[ i + j ];

outgoing_trans->set_data_ptr( outgoing_data );

Then do the same for the byte enable array, if any. The byte enable order must also be swapped:

if (byt)
{
  bool* outgoing_byte_enable = new bool[bel];
  for (unsigned int i = 0; i < bel; i++)
    outgoing_byte_enable[i] = byt[ bel - i - 1 ];
  outgoing_trans->set_byte_enable_ptr( outgoing_byte_enable );
}
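
The two swaps above can be checked together in a standalone sketch. The swap_words and swap_enables functions are hypothetical and use std::vector in place of the raw arrays in the bridge. The sketch assumes the bridge's own restrictions: the data length is a multiple of W, and W is a multiple of the byte enable length.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reverse the byte order inside each W-byte word of 'in'.
inline std::vector<unsigned char>
swap_words(const std::vector<unsigned char>& in, std::size_t W)
{
  assert(W > 0 && in.size() % W == 0);
  std::vector<unsigned char> out(in.size());
  for (std::size_t i = 0; i < in.size(); i += W)
    for (std::size_t j = 0; j < W; ++j)
      out[i + W - 1 - j] = in[i + j];
  return out;
}

// Reverse the byte enable array so that it still lines up with the
// swapped data when it is scanned repeatedly.
inline std::vector<bool> swap_enables(const std::vector<bool>& in)
{
  return std::vector<bool>(in.rbegin(), in.rend());
}
```

Because W is a multiple of the byte enable length, reversing the whole enable array has the same effect as reversing the enables within each word, so each outgoing byte keeps the enable of the incoming byte it was copied from.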

The new transaction can then be sent on its way:

tlm::tlm_sync_enum status;
status = init_socket->nb_transport( *outgoing_trans, phase, delay);

The code also copies data across in the reverse direction for a read transaction, and is able to handle the response being returned using either the forward path or the backward path. This adds more size and complexity to the code, but in principle is just a combination of the ideas discussed above with those from previous tutorials.

The code in this example exposes the details of calculations that depend on host endianness. The TLM 2.0-draft-2 kit includes some prototype helper functions to aid with endianness calculations, but these are still work-in-progress. In the final TLM 2.0 release, we hope that the helper functions will have been improved to the point where they can hide most of the dependencies on host endianness from the application.

When you run the example, you should see the following:

  • The initiator uses the debug transaction interface to dump out the memory contents at the start.
  • The initiator generates 64-bit (8-byte) read and write transactions at random.
  • The write transactions are executed using the forward path only, and use byte enables to mask the data being written to the target memory.
  • The read transactions use both forward and backward paths, the backward path being used to return the response.
  • The bridge performs a big endian width conversion such that the most-significant byte of each 8-byte word is written to the lowest address in memory, regardless of the endianness of the host computer.
  • The initiator then generates a burst write transaction to overwrite the entire memory contents (without using byte enables this time), followed by a burst read transaction to read back the contents. The burst write fills consecutive bytes with ascending integer values, starting with a value of 0 for the least significant byte of the first 64-bit word, and the bridge swaps the byte order.
  • The initiator uses the debug transaction interface to dump out the memory contents again before and after executing the two bursts.
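
The effect of the big-endian width conversion on a single 64-bit word can be illustrated independently of host endianness by extracting the bytes with shifts. This is a sketch only: to_big_endian_bytes is a hypothetical function and is not how the bridge in the example is written.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Return the bytes of 'word' in the order a big-endian width conversion
// sends them: most significant byte first, so it lands at the lowest address.
inline std::array<unsigned char, 8> to_big_endian_bytes(std::uint64_t word)
{
  std::array<unsigned char, 8> bytes{};
  for (int i = 0; i < 8; ++i)
    bytes[i] = static_cast<unsigned char>(word >> (8 * (7 - i)));
  return bytes;
}
```

For the word 0x0102030405060708, element 0 of the result is 0x01 and element 7 is 0x08 on any host.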

You will find the source code for this example in file tlm2_getting_started_7.cpp.

