5 Challenges of High-Level Synthesis for FPGA Design

Jordon Inkeles, Silexica’s VP of Product, looks at five challenges faced when using HLS for FPGA design. In the following blog, he explores how they can be addressed with new tools to increase productivity in your FPGA project and open up the use of FPGAs to a broader skillset.

High-level synthesis (HLS) tools, which transform C/C++ source code to Verilog/VHDL, have been commercially available for over 15 years. HLS tools from FPGA vendors and EDA companies promise improved productivity through a higher level of abstraction, faster verification, and quicker design iterations. For example, simulating your design in C/C++ can be 10 to 100x faster than simulating it in RTL (register-transfer level). In addition, many applications in image processing and computer vision require visual verification. Such validation is difficult to set up when performing RTL simulations but can be easily implemented at the C level. With an HLS-based methodology, these gains in simulation speed lead to faster design iterations, significantly improving productivity.

Using HLS potentially opens the use of FPGAs to a much broader base of engineers who specialize in embedded software programming. Traditional methods for designing FPGAs require a very specialized skillset and proficiency in describing the design in a hardware description language (HDL) such as Verilog or VHDL. This skillset is quite rare compared with embedded software development skills, limiting the use of FPGAs to those experienced in coding in an HDL. HLS allows embedded software developers (and hardware engineers) to implement their algorithms in hardware using a higher-level language, such as C/C++.

If the benefits of using HLS for FPGA design are so substantial, why hasn’t designing FPGAs in C/C++ become the standard design entry method?

The simple answer is that adopting an HLS design methodology in the real world presents unique challenges that must be considered and overcome during the design process. These challenges can mean more work for the designer and more development time, which begins to negate the productivity gains of HLS. Let’s look at five of these challenges:

1. C/C++ code which is non-synthesizable by the HLS compiler

The C/C++ coding guidelines for HLS compilers are extensive: the documentation that must be understood when writing or refactoring C code for HLS synthesis can exceed 1,000 pages. As an example, HLS does not support accessing elements of a dynamically sized array. The amount of memory inside a given FPGA device is fixed, which means that code that dynamically allocates objects of variable sizes with calls to functions such as malloc, calloc and new is not supported. The HLS tool must know the memory resources required by an algorithm at compilation time in order to produce an efficient hardware implementation.
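To make the dynamic-allocation restriction concrete, here is a minimal sketch of the kind of refactoring involved. The function names and the `MAX_SAMPLES` bound are hypothetical and chosen only for illustration; the exact rules vary by HLS tool, so treat this as an assumption-laden example rather than a vendor-approved pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical upper bound on the input size, fixed at compile time.
constexpr std::size_t MAX_SAMPLES = 1024;

// Software-style version: the scratch buffer is sized at run time with
// malloc, so its memory requirement is unknown at compilation time.
int sum_of_squares_dynamic(const int *in, std::size_t n) {
    int *tmp = static_cast<int *>(std::malloc(n * sizeof(int)));
    if (tmp == nullptr) return 0;  // allocation failed or n == 0
    for (std::size_t i = 0; i < n; ++i) tmp[i] = in[i] * in[i];
    int sum = 0;
    for (std::size_t i = 0; i < n; ++i) sum += tmp[i];
    std::free(tmp);
    return sum;
}

// Refactored version: the scratch buffer has a fixed, compile-time size,
// so the memory it needs is known up front. The caller must guarantee
// n <= MAX_SAMPLES.
int sum_of_squares_static(const int in[MAX_SAMPLES], std::size_t n) {
    assert(n <= MAX_SAMPLES);
    int tmp[MAX_SAMPLES];
    for (std::size_t i = 0; i < n; ++i) tmp[i] = in[i] * in[i];
    int sum = 0;
    for (std::size_t i = 0; i < n; ++i) sum += tmp[i];
    return sum;
}
```

Both functions return the same result for inputs that fit within the bound; the difference is that the second one states its worst-case memory need in the source code itself.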

2. “Non-Hardware Aware” C/C++ code

Creating C/C++ code with memory constructs and data types that do not factor in the hardware implementation can have unintended consequences, including bloated device resource usage and slow performance. Care must be taken to avoid using data types that are larger than needed. For example, using a 32-bit integer in software when only a 10-bit integer is required is inconsequential when mapping to a standard processor because the registers or memory locations already have a fixed size; when implemented in hardware, however, the unused bits become costly as they consume valuable FPGA resources.
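The 10-bit example can be sketched in portable C++. Vendor HLS libraries typically ship arbitrary-precision integer types for this purpose (for example, `ap_uint<N>` in the Xilinx tools); the `UIntN` template and `bits_needed` helper below are hypothetical stand-ins written for this illustration only, so that the code compiles without any vendor headers.

```cpp
#include <cassert>
#include <cstdint>

// Smallest number of bits able to hold every value in 0..max_value.
constexpr unsigned bits_needed(std::uint32_t max_value) {
    return max_value <= 1 ? 1u : 1u + bits_needed(max_value >> 1);
}

// Hypothetical stand-in for a vendor arbitrary-precision unsigned type:
// keeps only the low BITS bits of every value, as BITS-wide hardware would.
template <unsigned BITS>
struct UIntN {
    static_assert(BITS >= 1 && BITS <= 31, "sketch supports 1..31 bits");

    static constexpr std::uint32_t mask() {
        return (std::uint32_t{1} << BITS) - 1u;
    }

    std::uint32_t value;

    explicit UIntN(std::uint32_t v = 0) : value(v & mask()) {}

    // Addition wraps modulo 2^BITS, like a BITS-wide adder without carry-out.
    UIntN operator+(UIntN other) const { return UIntN(value + other.value); }
};
```

A value known to stay in the range 0..1023 needs `bits_needed(1023)` bits, which is 10, so `UIntN<10>` models the narrower datapath; adding 1000 and 100 in that type wraps to 76, which is the behavior a designer has to account for when trimming widths.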

3. Identifying parallelism

C/C++ code is typically executed sequentially on standard processors, but implementing functions in logic gates allows operations to be executed in parallel, accelerating the execution of the code in hardware. Determining where potential parallelism exists in the design can be quite daunting and time-consuming, especially as the complexity of the algorithm, the function, or the codebase increases.
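As a small illustration of what "identifying parallelism" means in practice, compare the two loops below. The function names and the array length `N` are assumptions made for this sketch. In the first loop every iteration is independent; in the second, each iteration depends on the result of the previous one.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed array length for the example.
constexpr std::size_t N = 8;

// Iteration i reads in[i] and writes out[i] only. There is no dependency
// between iterations, so hardware could compute all N products at once.
void scale(const int in[N], int out[N], int gain) {
    for (std::size_t i = 0; i < N; ++i) {
        out[i] = in[i] * gain;
    }
}

// Iteration i needs the running total from iteration i - 1. This
// loop-carried dependency on acc forces the additions to happen in order
// unless the algorithm is restructured.
void prefix_sum(const int in[N], int out[N]) {
    int acc = 0;
    for (std::size_t i = 0; i < N; ++i) {
        acc += in[i];
        out[i] = acc;
    }
}
```

Spotting this difference is easy in a few lines of code; the difficulty described above comes from finding the same patterns across nested loops, pointer accesses, and function calls in a large codebase.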

4. Software and hardware partitioning

For heterogeneous designs (FPGAs with embedded processors in this case), identifying what to run on the processor and what to move to hardware to exploit the parallel nature of the FPGA fabric via HLS can take significant time and many iterations, even while conducting pre-synthesis simulations.

5. Inserting HLS compiler pragmas or directives into the C/C++ code

For the HLS compiler to implement the software effectively in hardware, the user must guide it with pragmas or directives. Determining when to use pragmas, how to set their parameters, and where to insert them into the code, while also optimizing them at the system level across an application, is challenging and time-consuming.
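The sketch below shows what such guidance can look like. It assumes pragmas in the style of the Xilinx Vivado/Vitis HLS tools (`PIPELINE` and `UNROLL`); other HLS compilers use different spellings, and the filter itself, its name, and its sizes are hypothetical. A standard C++ compiler ignores the unknown pragmas, so the function still runs as ordinary software.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical filter dimensions for the example.
constexpr std::size_t TAPS = 4;
constexpr std::size_t LEN = 16;

// Simple FIR-style filter: out[n] = sum over k of coeff[k] * in[n - k],
// treating samples before the start of the input as zero.
void fir(const int in[LEN], const int coeff[TAPS], int out[LEN]) {
    for (std::size_t n = 0; n < LEN; ++n) {
// Ask the tool to start a new outer iteration every clock cycle.
#pragma HLS PIPELINE II=1
        int acc = 0;
        for (std::size_t k = 0; k < TAPS; ++k) {
// Ask the tool to replicate the loop body so all taps are computed in parallel.
#pragma HLS UNROLL
            if (n >= k) {
                acc += coeff[k] * in[n - k];
            }
        }
        out[n] = acc;
    }
}
```

Even in this small case there are choices to make: which loop to pipeline, whether to unroll fully or partially, and whether the arrays can supply enough data per cycle to keep the unrolled logic busy. Those choices multiply quickly across a full application.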

Collectively, these challenges present a significant barrier for those who want to take advantage of HLS design benefits. HLS vendors provide thorough documentation and training to educate customers on how to address these challenges, but it remains a manual process that takes time to master. Until now…

Silexica’s SLX FPGA tool, based on over 10 years of compiler technology research, provides practical solutions to the challenges discussed in this article at each step of the HLS design process.

First, SLX FPGA analyzes the C/C++ source code for synthesizability and provides automatic and guided code refactoring of the non-synthesizable code. SLX walks the user through each section of code that is non-synthesizable and automatically converts the code or provides guidance on how to refactor the code to be synthesizable.

The next challenge that SLX FPGA addresses is analyzing the algorithm or application for parallelism that can be converted from sequential execution to parallel or pipelined execution. By identifying parallelism, SLX then provides the most efficient hardware implementation of the C/C++ code. If using an FPGA with an embedded processor system, SLX FPGA can also provide guidance on the most efficient distribution of the code between the SW and HW domains.

[Figure: Parallelism detection in SLX FPGA]

Finally, after the SW and HW partitioning has been defined, SLX FPGA inserts the pragmas into the code so that the HLS compiler can implement the optimizations in HW when compiling the C/C++.

[Figure: Pragma insertion in SLX FPGA]

SLX FPGA is the first tool in the industry that directly addresses the challenges of using HLS design flows, reducing the learning curve by providing actionable insights into converting C/C++ code into an optimized HW implementation.

Please contact us if you are interested in learning more or to request a live demonstration of SLX FPGA with our experts.

Alternatively, you can get in touch with us at FPGA Conference Europe 2020 (virtual event) on September 29-30. Our team will be giving presentations where you can learn about how SLX can help you. Get your ticket for the event here.


JORDON INKELES – VP PRODUCT

Jordon’s role is to drive Silexica’s product strategy including product planning and management, product marketing and corporate marketing functions. Jordon has 20+ years of experience in marketing and product management at world-leading semiconductor companies including Intel and Altera. His role at Intel PSG (Programmable Solutions Group) included leading product and corporate strategy, driving a $1 billion product line, building high-performance marketing teams and driving go-to-market strategies. Most recently, he was Director of Marketing for Intel’s Flagship Stratix Series FPGA family. For 16 years as part of Altera, before its acquisition by Intel in 2015, Jordon held senior marketing and product management positions for FPGA product families, EDA software including OpenCL, IP (Intellectual Property) and corporate marketing/communication teams.
