TAPA Library (libtapa)

Task Library

struct task

Defines a parent task that instantiates child task instances.

Canonical usage:

tapa::task()
  .invoke(...)
  .invoke(...)
  ...
  ;

A parent task itself does not do any computation. By default, a parent task does not finish until all of its child task instances finish; such children are joined to their parent. Alternatively, a child can be detached from the parent: if a child task instance is instantiated and detached, the parent no longer waits for it to finish. Detached tasks are useful for children that run infinite loops and therefore never finish on their own.

Subclassed by tapa::hls_compat::task

Public Functions

explicit task()

Constructs a tapa::task.

template<typename Func, typename ...Args>
inline task &invoke(Func &&func, Args&&... args)

Invokes a task and instantiates a child task instance.

Parameters:
  • func – Task function definition of the instantiated child.

  • args – Arguments passed to func.

Returns:

Reference to the caller tapa::task.

template<int mode, typename Func, typename ...Args>
inline task &invoke(Func &&func, Args&&... args)

Invokes a task and instantiates a child task instance with the given instantiation mode.

Template Parameters:

mode – Instantiation mode (join or detach).

Parameters:
  • func – Task function definition of the instantiated child.

  • args – Arguments passed to func.

Returns:

Reference to the caller tapa::task.

template<typename Func, typename ...Args>
inline task &invoke(Func &&func, executable exe, Args&&... args)

Host-only invoke that takes an executable as an argument.

NOTE: This invoke must be called before any direct tapa::stream reader / writer; otherwise tapa::stream will not bind correctly.

Parameters:
  • func – Task function definition of the instantiated child.

  • exe – Optionally overrides the execution target.

  • args – Arguments passed to func.

Returns:

Reference to the caller tapa::task.

template<int mode, int n, typename Func, typename ...Args>
inline task &invoke(Func &&func, Args&&... args)

Invokes a task n times and instantiates n child task instances with the given instantiation mode.

Template Parameters:
  • mode – Instantiation mode (join or detach).

  • n – Instantiation count.

Parameters:
  • func – Task function definition of the instantiated child.

  • args – Arguments passed to func.

Returns:

Reference to the caller tapa::task.

struct seq

Class that generates a sequence of integers as task arguments.

Canonical usage:

void TaskFoo(int i, ...) {
  ...
}
tapa::task()
  .invoke<3>(TaskFoo, tapa::seq(), ...)
  ...
  ;

TaskFoo will be invoked three times, receiving 0, 1, and 2 as the first argument, respectively.
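
The expansion described above can be modeled in plain C++ without TAPA. The sketch below is a hypothetical stand-in rather than TAPA's implementation: `invoke_n_model` and `collect_indices` are illustrative names, and the loop only mimics how each of the n instances receives the next integer in the sequence.

```cpp
#include <vector>

// Hypothetical model of invoking a task n times with tapa::seq() as the first
// argument: `func` is called once per instance and receives 0, 1, ..., n-1.
template <int n, typename Func>
void invoke_n_model(Func&& func) {
  for (int i = 0; i < n; ++i) {
    func(i);  // The i-th instance receives i.
  }
}

// Returns the first arguments observed across the n modeled instances.
template <int n>
std::vector<int> collect_indices() {
  std::vector<int> seen;
  invoke_n_model<n>([&seen](int i) { seen.push_back(i); });
  return seen;
}
```

In this model, `collect_indices<3>()` yields the values 0, 1, and 2 in order. Real TAPA instances run concurrently, which this sequential loop does not attempt to capture.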

Public Functions

seq() = default

Constructs a tapa::seq. This is the only public API.

Stream Library

  • A blocking operation waits while the stream is unavailable (empty for a read, full for a write) and completes once the stream becomes available.

  • A non-blocking operation always returns immediately.

  • A destructive operation changes the state of the stream.

  • A non-destructive operation does not change the state of the stream.
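
The four terms above can be illustrated with a small single-threaded model of a bounded FIFO. This is a sketch under stated assumptions, not TAPA's stream: `fifo_model` is a hypothetical class, and only its method names are borrowed from the stream interfaces documented below.

```cpp
#include <cstddef>
#include <deque>

// Hypothetical single-threaded model of a bounded FIFO (not TAPA code).
template <typename T>
class fifo_model {
 public:
  explicit fifo_model(std::size_t depth) : depth_(depth) {}

  // Non-blocking and non-destructive: inspect the state without changing it.
  bool empty() const { return q_.empty(); }
  bool full() const { return q_.size() >= depth_; }
  bool try_peek(T& value) const {
    if (q_.empty()) return false;
    value = q_.front();  // The token stays in the queue.
    return true;
  }

  // Non-blocking and destructive: return immediately, changing the state
  // only when the operation succeeds.
  bool try_write(const T& value) {
    if (full()) return false;
    q_.push_back(value);
    return true;
  }
  bool try_read(T& value) {
    if (q_.empty()) return false;
    value = q_.front();
    q_.pop_front();  // The token is removed from the queue.
    return true;
  }

 private:
  std::size_t depth_;
  std::deque<T> q_;
};
```

In this model, a blocking read or write corresponds to retrying the matching `try_` call until it succeeds, which only makes sense when another task is running concurrently on the other end of the stream.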

template<typename T>
class istream : public virtual tapa::internal::basic_stream<T>

Provides consumer-side operations to a tapa::stream where it is used as an input.

This class should only be used in task function parameters and should never be instantiated directly.

Subclassed by tapa::internal::unbound_stream< T >

Public Functions

inline bool empty() const

Tests whether the stream is empty.

This is a non-blocking and non-destructive operation.

Returns:

Whether the stream is empty.

inline bool try_eot(bool &is_eot) const

Tests whether the next token is EoT.

This is a non-blocking and non-destructive operation.

Parameters:

is_eot[out] Uninitialized if the stream is empty. Otherwise, updated to indicate whether the next token is EoT.

Returns:

Whether is_eot is updated.

inline bool eot(bool &is_success) const

Tests whether the next token is EoT.

This is a non-blocking and non-destructive operation.

Parameters:

is_success[out] Whether the next token is available.

Returns:

Whether the next token is available and is EoT.

inline bool eot(std::nullptr_t) const

Tests whether the next token is EoT.

This is a non-blocking and non-destructive operation.

Returns:

Whether the next token is available and is EoT.

inline bool try_peek(T &value) const

Peeks the stream.

This is a non-blocking and non-destructive operation.

The next token must not be EoT.

Parameters:

value[out] Uninitialized if the stream is empty. Otherwise, updated to be the value of the next token.

Returns:

Whether value is updated.

inline T peek(bool &is_success) const

Peeks the stream.

This is a non-blocking and non-destructive operation.

The next token must not be EoT.

Parameters:

is_success[out] Whether the next token is available.

Returns:

The value of the next token is returned if it is available. Otherwise, default-constructed T() is returned.

inline T peek(std::nullptr_t) const

Peeks the stream.

This is a non-blocking and non-destructive operation.

The next token must not be EoT.

Returns:

The value of the next token is returned if it is available. Otherwise, default-constructed T() is returned.

inline T peek(bool &is_success, bool &is_eot) const

Peeks the stream.

This is a non-blocking and non-destructive operation.

Parameters:
  • is_success[out] Whether the next token is available.

  • is_eot[out] Set to false if the stream is empty. Otherwise, updated to indicate whether the next token is EoT.

Returns:

The value of the next token is returned if it is available. Otherwise, default-constructed T() is returned.

inline bool try_read(T &value)

Reads the stream.

This is a non-blocking and destructive operation.

The next token must not be EoT.

Parameters:

value[out] Uninitialized if the stream is empty. Otherwise, updated to be the value of the next token.

Returns:

Whether value is updated.

inline T read()

Reads the stream.

This is a blocking and destructive operation.

The next token must not be EoT.

Returns:

The value of the next token.

inline istream &operator>>(T &value)

Reads the stream.

This is a blocking and destructive operation.

The next token must not be EoT.

Parameters:

value[out] The value of the next token.

Returns:

*this.

inline T read(bool &is_success)

Reads the stream.

This is a non-blocking and destructive operation.

The next token must not be EoT.

Parameters:

is_success[out] Whether the next token is available.

Returns:

The value of the next token is returned if it is available. Otherwise, default-constructed T() is returned.

inline T read(std::nullptr_t)

Reads the stream.

This is a non-blocking and destructive operation.

The next token must not be EoT.

Returns:

The value of the next token is returned if it is available. Otherwise, default-constructed T() is returned.

inline T read(const T &default_value, bool *is_success = nullptr)

Reads the stream.

This is a non-blocking and destructive operation.

Parameters:
  • default_value[in] Value to return if the stream is empty.

  • is_success[out] Updated to indicate whether the next token is available if is_success is not nullptr.

Returns:

The value of the next token is returned if it is available. Otherwise, default_value is returned.

inline bool try_open()

Consumes an EoT token.

This is a non-blocking and destructive operation.

The next token must be EoT.

Returns:

Whether an EoT token is consumed.

inline void open()

Consumes an EoT token.

This is a blocking and destructive operation.

The next token must be EoT.

template<typename T, uint64_t S>
class istreams : public virtual tapa::internal::basic_streams<T>

Provides consumer-side operations to an array of tapa::stream where they are used as inputs.

This class should only be used in task function parameters and should never be instantiated directly.

Subclassed by tapa::internal::unbound_streams< T, S >

Public Functions

inline istream<T> operator[](int pos) const

References a tapa::stream in the array.

Parameters:

pos – Position of the array reference.

Returns:

tapa::istream referenced in the array.

Public Static Attributes

static constexpr int length = S

Length of the tapa::stream array.

template<typename T>
class ostream : public virtual tapa::internal::basic_stream<T>

Provides producer-side operations to a tapa::stream where it is used as an output.

This class should only be used in task function parameters and should never be instantiated directly.

Subclassed by tapa::internal::unbound_stream< T >

Public Functions

inline bool full() const

Tests whether the stream is full.

This is a non-blocking and non-destructive operation.

Returns:

Whether the stream is full.

inline bool try_write(const T &value)

Writes value to the stream.

This is a non-blocking and destructive operation.

Parameters:

value[in] The value to write.

Returns:

Whether value has been written successfully.

inline void write(const T &value)

Writes value to the stream.

This is a blocking and destructive operation.

Parameters:

value[in] The value to write.

inline ostream &operator<<(const T &value)

Writes value to the stream.

This is a blocking and destructive operation.

Parameters:

value[in] The value to write.

Returns:

*this.

inline bool try_close()

Produces an EoT token to the stream.

This is a non-blocking and destructive operation.

Returns:

Whether the EoT token has been written successfully.

inline void close()

Produces an EoT token to the stream.

This is a blocking and destructive operation.
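
The EoT operations on both sides (close on the producer, try_eot, read, and open on the consumer) fit together as a transaction protocol. The following is a minimal sketch of that protocol, assuming a hypothetical queue of tagged tokens; `token_model`, `eot_stream_model`, and `drain_transaction` are illustrative names and do not exist in TAPA. Where the reference states "the next token must not be EoT" as a precondition, this model simply returns false.

```cpp
#include <deque>
#include <vector>

// Hypothetical token: either a data value or an end-of-transaction marker.
template <typename T>
struct token_model {
  T value;
  bool eot;
};

// Hypothetical unbounded stream model with EoT support (not TAPA code).
template <typename T>
class eot_stream_model {
 public:
  void write(const T& value) { q_.push_back({value, false}); }
  void close() { q_.push_back({T(), true}); }  // Produces an EoT token.

  // Returns false if empty; otherwise reports whether the next token is EoT.
  bool try_eot(bool& is_eot) const {
    if (q_.empty()) return false;
    is_eot = q_.front().eot;
    return true;
  }
  // Consumes a data token; fails if the stream is empty or the next is EoT.
  bool try_read(T& value) {
    if (q_.empty() || q_.front().eot) return false;
    value = q_.front().value;
    q_.pop_front();
    return true;
  }
  // Consumes an EoT token; fails if the stream is empty or the next is data.
  bool try_open() {
    if (q_.empty() || !q_.front().eot) return false;
    q_.pop_front();
    return true;
  }

 private:
  std::deque<token_model<T>> q_;
};

// Reads data tokens until EoT is seen, then consumes the EoT token.
inline std::vector<int> drain_transaction(eot_stream_model<int>& s) {
  std::vector<int> out;
  bool is_eot = false;
  while (s.try_eot(is_eot) && !is_eot) {
    int v = 0;
    s.try_read(v);
    out.push_back(v);
  }
  if (is_eot) s.try_open();
  return out;
}
```

The consumer never needs to know the transaction length in advance: the producer marks the end with close, and the consumer re-arms the stream for the next transaction with open.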

template<typename T, uint64_t S>
class ostreams : public virtual tapa::internal::basic_streams<T>

Provides producer-side operations to an array of tapa::stream where they are used as outputs.

This class should only be used in task function parameters and should never be instantiated directly.

Subclassed by tapa::internal::unbound_streams< T, S >

Public Functions

inline ostream<T> operator[](int pos) const

References a tapa::stream in the array.

Parameters:

pos – Position of the array reference.

Returns:

tapa::ostream referenced in the array.

Public Static Attributes

static constexpr int length = S

Length of the tapa::stream array.

template<typename T, uint64_t N = kStreamDefaultDepth>
class stream : public tapa::internal::unbound_stream<T>

Defines a communication channel between two task instances.

Public Functions

inline stream()

Constructs a tapa::stream.

template<size_t S>
inline stream(const char (&name)[S])

Constructs a tapa::stream with the given name for debugging.

Parameters:

name[in] Name of the communication channel (for debugging only).

Public Static Attributes

static constexpr int depth = N

Depth of the communication channel.

template<typename T, uint64_t S, uint64_t N = kStreamDefaultDepth>
class streams : public tapa::internal::unbound_streams<T, S>

Defines an array of tapa::stream.

Public Functions

inline streams()

Constructs a tapa::streams array.

template<size_t name_length>
inline streams(const char (&name)[name_length])

Constructs a tapa::streams array with the given base name for debugging.

The actual name of each tapa::stream would be name[i].

Parameters:

name[in] Base name of the streams (for debugging only).

inline stream<T, N> operator[](int pos) const

References a tapa::stream in the array.

Parameters:

pos – Position of the array reference.

Returns:

tapa::stream referenced in the array.

Public Static Attributes

static constexpr int length = S

Count of tapa::stream in the array.

static constexpr int depth = N

Depth of each tapa::stream in the array.

MMAP Library

template<typename T>
class async_mmap : public tapa::mmap<T>

Defines a view of a piece of consecutive memory with asynchronous random accesses.

Public Types

using addr_t = int64_t

Type of the addresses.

using resp_t = uint8_t

Type of the write responses.

Public Members

ostream<addr_t> read_addr

Provides access to the read address channel.

Each value written to this channel triggers an asynchronous memory read request. Consecutive requests may be coalesced into a long burst request.

istream<T> read_data

Provides access to the read data channel.

Each value read from this channel represents the data retrieved from the underlying memory system.

ostream<addr_t> write_addr

Provides access to the write address channel.

Each value written to this channel triggers an asynchronous memory write request. Consecutive requests may be coalesced into a long burst request.

ostream<T> write_data

Provides access to the write data channel.

Each value written to this channel supplies data to the memory write request.

istream<resp_t> write_resp

Provides access to the write response channel.

Each value read from this channel represents the data count acknowledged by the underlying memory system.
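
The read half of this interface decouples issuing addresses from receiving data. The sketch below models that pattern in plain C++ under loud assumptions: `async_read_model` and `gather` are hypothetical, the modeled memory answers requests strictly in order, and no burst coalescing or write path is modeled.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical software model of decoupled read channels (not TAPA code).
struct async_read_model {
  std::vector<int> memory;             // Backing storage.
  std::deque<std::int64_t> read_addr;  // Requests issued by the task.
  std::deque<int> read_data;           // Responses produced by the memory.

  // One step of the modeled memory system: serve the oldest pending request.
  void service_one() {
    if (read_addr.empty()) return;
    const std::int64_t addr = read_addr.front();
    read_addr.pop_front();
    read_data.push_back(memory[static_cast<std::size_t>(addr)]);
  }
};

// Issues every address up front, then collects responses as they arrive.
inline std::vector<int> gather(async_read_model& m,
                               const std::vector<std::int64_t>& addrs) {
  for (std::int64_t a : addrs) m.read_addr.push_back(a);
  std::vector<int> out;
  while (!m.read_addr.empty() || !m.read_data.empty()) {
    m.service_one();
    if (!m.read_data.empty()) {
      out.push_back(m.read_data.front());
      m.read_data.pop_front();
    }
  }
  return out;
}
```

The point of the model is that several requests can be outstanding before the first response is consumed, which is what allows the memory system to overlap accesses.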

template<typename T>
class mmap

Defines a view of a piece of consecutive memory with synchronous random accesses.

Subclassed by tapa::async_mmap< T >, tapa::hmap< T, chan_count, chan_size >

Public Functions

inline explicit mmap(T *ptr)

Constructs a tapa::mmap with unknown size.

Parameters:

ptr – Pointer to the start of the mapped memory.

inline mmap(T *ptr, uint64_t size)

Constructs a tapa::mmap with the given size.

Parameters:
  • ptr – Pointer to the start of the mapped memory.

  • size – Size of the mapped memory (in number of elements).

template<typename Container>
inline explicit mmap(Container &container)

Constructs a tapa::mmap from the given container.

Parameters:

container – Container holding the memory to be mapped. Must implement data() and size().

inline operator T*()

Implicitly casts to a regular pointer.

tapa::mmap should be used just like a pointer in the kernel.

inline mmap &operator++()

Increments the start of the mapped memory.

Returns:

The incremented tapa::mmap.

inline mmap &operator--()

Decrements the start of the mapped memory.

Returns:

The decremented tapa::mmap.

inline mmap operator++(int)

Increments the start of the mapped memory.

Returns:

The tapa::mmap before the increment.

inline mmap operator--(int)

Decrements the start of the mapped memory.

Returns:

The tapa::mmap before the decrement.

inline T *data() const

Retrieves the start of the mapped memory.

This should be used on the host only.

Returns:

The start of the mapped memory.

inline T *get() const

Retrieves the start of the mapped memory.

This should be used on the host only.

Returns:

The start of the mapped memory.

inline uint64_t size() const

Retrieves the size of the mapped memory.

This should be used on the host only.

Returns:

The size of the mapped memory (in unit of element count).

template<uint64_t N>
inline mmap<vec_t<T, N>> vectorized() const

Reinterprets the element type of the mapped memory as tapa::vec_t<T, N>.

This should be used on the host only. The size of the mapped memory must be a multiple of N.

Template Parameters:

N – Vector length of the new element type.

Returns:

tapa::mmap of the same piece of memory but of type tapa::vec_t<T, N>.

template<typename U>
inline mmap<U> reinterpret() const

Reinterprets the element type of the mapped memory as U.

This should be used on the host only. Both T and U must have standard layout. The host memory pointer must be properly aligned. If sizeof(U) > sizeof(T), the size of the mapped memory must be a multiple of sizeof(U)/sizeof(T) (which must itself be an integer). If sizeof(U) < sizeof(T), sizeof(T) must be a multiple of sizeof(U).

Template Parameters:

U – The new element type.

Returns:

tapa::mmap<U> of the same piece of memory.
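
The size constraints above amount to a small piece of arithmetic. The helper below is a sketch of that arithmetic only: `reinterpreted_size` is a hypothetical function, not part of TAPA, and it assumes the reinterpreted view covers exactly the same bytes as the original.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical helper: number of U elements covering the same bytes as
// `size` elements of T, following the documented divisibility rules.
template <typename T, typename U>
std::uint64_t reinterpreted_size(std::uint64_t size) {
  static_assert(sizeof(U) % sizeof(T) == 0 || sizeof(T) % sizeof(U) == 0,
                "one element size must be a multiple of the other");
  if (sizeof(U) > sizeof(T)) {
    // Several T elements merge into one U element.
    const std::uint64_t ratio = sizeof(U) / sizeof(T);
    if (size % ratio != 0) {
      throw std::invalid_argument(
          "size must be a multiple of sizeof(U)/sizeof(T)");
    }
    return size / ratio;
  }
  // Each T element splits into sizeof(T)/sizeof(U) elements of U.
  return size * (sizeof(T) / sizeof(U));
}
```

For example, in this model 16 one-byte elements become 4 four-byte elements, while 6 one-byte elements cannot be viewed as four-byte elements and are rejected.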

template<typename T, uint64_t S>
class mmaps

Defines an array of tapa::mmap.

Public Functions

template<typename PtrContainer, typename SizeContainer>
inline mmaps(const PtrContainer &pointers, const SizeContainer &sizes)

Constructs a tapa::mmap array from the given pointers and sizes.

Parameters:
  • pointers – Pointers to the start of each mapped memory in the array.

  • sizes – Sizes of each mapped memory (in number of elements).

template<typename Container>
inline explicit mmaps(Container &container)

Constructs a tapa::mmap array from the given container.

Parameters:

container – Container holding the memories to be mapped. container must implement operator[] that returns a container suitable for constructing a tapa::mmap.

inline mmap<T> &operator[](int idx)

References a tapa::mmap in the array.

template<uint64_t N>
inline mmaps<vec_t<T, N>, S> vectorized() const

Reinterprets the element type of each mapped memory as tapa::vec_t<T, N>.

This should be used on the host only. The size of each mapped memory must be a multiple of N.

Template Parameters:

N – Vector length of the new element type.

Returns:

tapa::mmaps of the same pieces of memory but of type tapa::vec_t<T, N>.

template<typename U>
inline mmaps<U, S> reinterpret() const

Reinterprets the element type of each mapped memory as U.

This should be used on the host only. Both T and U must have standard layout. The host memory pointers must be properly aligned. If sizeof(U) > sizeof(T), the size of each mapped memory must be a multiple of sizeof(U)/sizeof(T) (which must itself be an integer). If sizeof(U) < sizeof(T), sizeof(T) must be a multiple of sizeof(U).

Template Parameters:

U – The new element type.

Returns:

tapa::mmaps<U, S> of the same pieces of memory.

Utility Library

template<typename T>
inline constexpr int tapa::widthof()

Queries width (in bits) of the type.

Template Parameters:

T – Type to be queried.

Returns:

T::width if it exists, sizeof(T) * CHAR_BIT otherwise.

template<typename T>
inline constexpr int tapa::widthof(T object)

Queries width (in bits) of the object.

Note

Unlike sizeof, the argument expression is evaluated (though unused).

Template Parameters:

T – Type of object.

Parameters:

object – Object to be queried.

Returns:

T::width if it exists, sizeof(T) * CHAR_BIT otherwise.
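
The rule "T::width if it exists, sizeof(T) * CHAR_BIT otherwise" can be expressed with a standard member-detection idiom. The sketch below is a hypothetical reimplementation for illustration, not TAPA's source: `width_model`, `widthof_model`, and `int12_model` are invented names.

```cpp
#include <climits>

// Fallback: no T::width member, so use the storage size in bits.
template <typename T, typename = void>
struct width_model {
  static constexpr int value = static_cast<int>(sizeof(T) * CHAR_BIT);
};

// Preferred when the expression T::width is well-formed.
template <typename T>
struct width_model<T, decltype(void(T::width))> {
  static constexpr int value = T::width;
};

// Type-based query, modeled after the first overload.
template <typename T>
constexpr int widthof_model() {
  return width_model<T>::value;
}

// Object-based query, modeled after the second overload; the argument is
// evaluated by the caller but its value is unused.
template <typename T>
constexpr int widthof_model(T) {
  return widthof_model<T>();
}

// Illustrative type that advertises a width narrower than its storage.
struct int12_model {
  static constexpr int width = 12;
  short storage;
};

static_assert(widthof_model<int12_model>() == 12, "uses T::width");
static_assert(widthof_model<char>() == CHAR_BIT, "falls back to sizeof");
```

The distinction matters for arbitrary-precision types, whose logical bit width is usually smaller than the storage that sizeof reports.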

HLS-Compat Library

The HLS-compat library provides a set of APIs compatible with Vitis HLS stream behavior to ease migration from Vitis HLS.

Warning

tapa::hls_compat APIs are software simulation only and are NOT synthesizable. Before synthesis, remove #include <tapa/host/compat.h> and replace each tapa::hls_compat API with its synthesizable equivalent.

template<typename T>
using tapa::hls_compat::stream = ::tapa::stream<T, ::tapa::internal::kInfiniteDepth>

An infinite-depth stream that has the same behavior as hls::stream.

Intended for defining streams without knowing their depth for synthesis:

...
#include <tapa.h>
#include <tapa/host/compat.h>
...
void Top() {
  tapa::hls_compat::stream<int> data_q("data");
  ...
  tapa::task()
    .invoke(...)
    .invoke(...)
    ...
    ;
}

Software simulation only; NOT synthesizable. Replace with tapa::stream for synthesis.

template<typename T>
using tapa::hls_compat::stream_interface = ::tapa::internal::unbound_stream<T>

I/O direction agnostic interface that accepts both tapa::stream and tapa::hls_compat::stream.

Intended for declaring parameters without knowing the I/O direction:

...
#include <tapa.h>
#include <tapa/host/compat.h>
...
void Compute(tapa::hls_compat::stream_interface<int>& data_in_q) {
  int data = data_in_q.read();
  ...
}

Software simulation only; NOT synthesizable. Replace with tapa::istream / tapa::ostream for synthesis.

struct task : public tapa::task

Same as tapa::task, except that tasks are scheduled sequentially.

Intended for debugging code migrated from HLS:

...
#include <tapa.h>
#include <tapa/host/compat.h>
...
void Top() {
  ...
  tapa::hls_compat::task()
    .invoke(...)
    .invoke(...)
    ...
    ;
}

Software simulation only; NOT synthesizable. Replace with tapa::task for synthesis.

TAPA Compiler (tapa)

tapa

The TAPA compiler.

tapa [OPTIONS] COMMAND1 [ARGS]... [COMMAND2 [ARGS]...]...

Options

-v, --verbose

Increase logging verbosity.

-q, --quiet

Decrease logging verbosity.

-w, --work-dir <DIR>

Specify working directory.

--recursion-limit <limit>

Override Python recursion limit.

analyze

Analyze TAPA program and store the program description.

tapa analyze [OPTIONS]

Options

-f, --input <input_files>

Required. Input file, usually TAPA C++ source code.

-t, --top <TASK>

Required. Name of the top-level task.

-c, --cflags <cflags>

Compiler flags for the kernel; may be specified multiple times.

--flatten-hierarchy, --keep-hierarchy

--keep-hierarchy (default) generates RTL with the same hierarchy as the TAPA C++ source code; --flatten-hierarchy flattens the hierarchy so that all leaf-level tasks are instantiated in the top module.

--no-vitis-mode, --vitis-mode

--vitis-mode (default) generates .xo files for the Vitis v++ command, with AXI4-Lite interfaces for task arguments and return values and AXI4-Stream interfaces for task FIFOs; --no-vitis-mode generates only RTL code with Vitis HLS compatible interfaces.

--gen-template <gen_template>

Generate templates for the specified task for RTL integration.

compile

Compile a TAPA program to a hardware design.

tapa compile [OPTIONS]

Options

-f, --input <input_files>

Required. Input file, usually TAPA C++ source code.

-t, --top <TASK>

Required. Name of the top-level task.

-c, --cflags <cflags>

Compiler flags for the kernel; may be specified multiple times.

--flatten-hierarchy, --keep-hierarchy

--keep-hierarchy (default) generates RTL with the same hierarchy as the TAPA C++ source code; --flatten-hierarchy flattens the hierarchy so that all leaf-level tasks are instantiated in the top module.

--no-vitis-mode, --vitis-mode

--vitis-mode (default) generates .xo files for the Vitis v++ command, with AXI4-Lite interfaces for task arguments and return values and AXI4-Stream interfaces for task FIFOs; --no-vitis-mode generates only RTL code with Vitis HLS compatible interfaces.

--gen-template <gen_template>

Generate templates for the specified task for RTL integration.

--part-num <part_num>

Target FPGA part number. Must be specified if --platform is not provided.

-p, --platform <platform>

Target Vitis platform. Must be specified if --part-num is not provided.

--clock-period <clock_period>

Target clock period in nanoseconds.

-j, --jobs <jobs>

Number of parallel jobs for HLS.

--keep-hls-work-dir, --remove-hls-work-dir

Keep HLS working directory in the TAPA work directory.

--skip-hls-based-on-mtime, --no-skip-hls-based-on-mtime

Skip HLS if an output tarball exists and is newer than the source C++ file. This can lead to incorrect results; use at your own risk.

--other-hls-configs <other_hls_configs>

Additional compile options for Vitis HLS, e.g., --other-hls-configs "config_compile -unsafe_math_optimizations".

--print-fifo-ops, --no-print-fifo-ops

Print all FIFO operations in cosim.

--flow-type <flow_type>

Flow Option: ‘hls’ for FPGA Fabric steps, ‘aie’ for Versal AIE steps.

Options:

hls | aie

-o, --output <output>

Output packed .xo Xilinx object file.

-s, --bitstream-script <bitstream_script>

Script file to generate the bitstream.

g++

Invoke g++ with TAPA include and library paths.

This is intended only for use with a pre-built binary installation. Developers building TAPA from source should compile binaries using bazel.

tapa g++ [OPTIONS] [ARGV]...

Options

--executable <executable>

Run the specified executable instead of g++.

Arguments

ARGV

Optional argument(s)

pack

Pack the generated RTL into a Xilinx object file.

tapa pack [OPTIONS]

Options

-o, --output <output>

Output packed .xo Xilinx object file.

-s, --bitstream-script <bitstream_script>

Script file to generate the bitstream.

--flow-type <flow_type>

Flow Option: ‘hls’ for FPGA Fabric steps, ‘aie’ for Versal AIE steps.

Options:

hls | aie

synth

Synthesize the TAPA program into RTL code.

tapa synth [OPTIONS]

Options

--part-num <part_num>

Target FPGA part number. Must be specified if --platform is not provided.

-p, --platform <platform>

Target Vitis platform. Must be specified if --part-num is not provided.

--clock-period <clock_period>

Target clock period in nanoseconds.

-j, --jobs <jobs>

Number of parallel jobs for HLS.

--keep-hls-work-dir, --remove-hls-work-dir

Keep HLS working directory in the TAPA work directory.

--skip-hls-based-on-mtime, --no-skip-hls-based-on-mtime

Skip HLS if an output tarball exists and is newer than the source C++ file. This can lead to incorrect results; use at your own risk.

--other-hls-configs <other_hls_configs>

Additional compile options for Vitis HLS, e.g., --other-hls-configs "config_compile -unsafe_math_optimizations".

--print-fifo-ops, --no-print-fifo-ops

Print all FIFO operations in cosim.

--flow-type <flow_type>

Flow Option: ‘hls’ for FPGA Fabric steps, ‘aie’ for Versal AIE steps.

Options:

hls | aie

version

Print TAPA version to standard output.

tapa version [OPTIONS]