

Institute for Computer Science

## **RISC-V**

State of the union

Ilia Kurin

September 26, 2022

Newest Trends in High-Performance Data Analytics


## Table of contents

1. Overview
2. Technical details
3. Hardware
4. Software
5. Conclusions


## RISC-V

- Instruction Set Architecture (ISA)
- Based on Reduced Instruction Set Computer (RISC) principles


#### RISC vs. CISC

| RISC                    | CISC                       |
|-------------------------|----------------------------|
| Simple instructions     | Complex instructions       |
| Fixed-size instructions | Variable-size instructions |
| More registers          | Fewer registers            |
| Software oriented\*     | Hardware oriented          |

\*The compiler divides code into smaller commands


## ISA

#### Abstract model of computer

Pseudocode for computer architectures

#### Defines:

- Instructions
- Data types
- Registers
- I/O model
- Other fundamental features



## Advantages

- No need to pay royalties
- Faster time to market due to open ISA and cores
- Extensibility and customization
  - No need to support outdated instructions
- Wide variety of applications
  - From embedded devices
  - To supercomputers
- Avoids the mistakes of earlier architectures
  - No delay slots, no register windows, and others


## History

Berkeley RISC

- The first version was released in 1981 at UC Berkeley by David Patterson
- The second version was released in 1983 and introduced expansion of 16-bit instructions to 32 bits
- The third version was released in 1984 by Patterson's students and was designed to run Smalltalk
- The fourth version was released in 1988 by Patterson's students and was designed to run Lisp


## History

**RISC-V**

- Work began in 2010 at the Parallel Computing Laboratory at UC Berkeley
- Developed by Prof. Krste Asanović, Prof. David Patterson, Yunsup Lee and Andrew Waterman
- The first ISA specification was published in 2011 and contained the base instructions with floating-point and compressed extensions


## History

**RISC-V** International

- Non-profit corporation curating ISA development
  - Openly accepting individual members
- Founded in 2015
- Known as the RISC-V Foundation before 2019
- Moved to Switzerland in 2020 to avoid US trade regulations




## Members



Source: [13] RISC-V members



## Architectural decisions

- Smaller instruction set
  - Easier to write and execute code
- Load/store architecture
  - Every operation is either a memory read/write or works between registers
  - Simpler instructions and better pipeline utilization
- No branch delay slot
  - A delay slot executes the instruction after a branch while the branch target address is being computed
  - Omitting it makes multicycle CPUs, superscalar CPUs, and long pipelines easier to build
- No register windows
  - Register windows divide the registers into windows of which only one is visible at a time
  - With register windows, context switching becomes expensive
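
The load/store rule can be illustrated with a small C sketch. The function name `add_to_memory` is hypothetical, and the instructions named in the comments are only one plausible translation; a real compiler may pick different registers or instructions.

```c
#include <stdint.h>

/* Adding to a value in memory takes three separate steps on a
 * load/store machine, because arithmetic works on registers only. */
void add_to_memory(int32_t *counter, int32_t delta)
{
    int32_t tmp = *counter;  /* load:  memory -> register (e.g. lw)  */
    tmp = tmp + delta;       /* ALU:   register + register (e.g. add) */
    *counter = tmp;          /* store: register -> memory (e.g. sw)  */
}
```

A CISC architecture could express the same update as a single instruction with a memory operand; here each step is a simple instruction that fits a regular pipeline stage.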


## Privilege levels

- Machine
  - Controls all physical resources and interrupts
  - The only mandatory privilege level
- Supervisor
  - Intended for kernels and drivers
  - Can only be implemented together with both the User and Machine privilege levels
- User/Application
  - Provides a boundary between applications and hardware access

A processor can be in only one privilege level at a time
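
The rules above can be written down as a small check. This is a minimal sketch: the numeric encodings (0, 1, 3) follow the privileged specification, while the bitmask representation and the names `level_bit` and `valid_level_set` are assumptions made for illustration.

```c
#include <stdbool.h>

/* Privilege level encodings from the RISC-V privileged specification. */
enum priv_level { PRIV_USER = 0, PRIV_SUPERVISOR = 1, PRIV_MACHINE = 3 };

/* Hypothetical helper: one bit per implemented privilege level. */
static unsigned level_bit(enum priv_level p)
{
    return 1u << p;
}

/* Checks a set of implemented levels against the rules on the slide:
 * Machine is mandatory, Supervisor needs both User and Machine. */
bool valid_level_set(unsigned mask)
{
    unsigned known = level_bit(PRIV_USER) | level_bit(PRIV_SUPERVISOR) |
                     level_bit(PRIV_MACHINE);
    bool m = (mask & level_bit(PRIV_MACHINE)) != 0;
    bool s = (mask & level_bit(PRIV_SUPERVISOR)) != 0;
    bool u = (mask & level_bit(PRIV_USER)) != 0;

    if (mask & ~known)
        return false;   /* encoding 2 is reserved */
    if (!m)
        return false;   /* Machine is the only mandatory level */
    if (s && !u)
        return false;   /* Supervisor requires User as well */
    return true;
}
```

Under these rules only three combinations pass: Machine alone (simple embedded systems), Machine plus User, and all three levels together.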


## Specifications

- Unprivileged ISA
- Privileged ISA
- Debug
- Trace

Link to current specifications: https://riscv.org/technical/specifications/


## Terminology

- Core
  - Component that contains an independent instruction fetch unit
- Hart
  - Hardware thread
  - Similar to Intel's Hyper-Threading
- Accelerator
  - Non-programmable fixed-function unit, or a specialized core that operates autonomously
- Field-Programmable Gate Array (FPGA)
  - Reconfigurable integrated circuit that can implement a wide range of custom digital circuits


## Naming conventions

The base pattern is **RV[xxx][abc...xyz]**

- xxx is the address space size in bits (32, 64 or 128)
- abc...xyz are the extension letters
  - Underscores between extensions are supported
  - Case insensitive
  - Supports custom extensions
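
A minimal sketch of how such a name could be split into its parts. The struct `isa_name` and the function `parse_isa_name` are hypothetical names; as a simplifying assumption, multi-letter extension names (those starting with Z, S or X) are skipped rather than recorded, and version numbers are not handled.

```c
#include <ctype.h>
#include <stdlib.h>

/* Parsed form of a name such as "RV64IMAC". */
struct isa_name {
    int  xlen;         /* 32, 64 or 128 */
    char letters[27];  /* single-letter extensions, upper-cased */
};

/* Returns 0 on success, -1 if the name does not follow the pattern. */
int parse_isa_name(const char *s, struct isa_name *out)
{
    if (toupper((unsigned char)s[0]) != 'R' ||
        toupper((unsigned char)s[1]) != 'V' ||
        !isdigit((unsigned char)s[2]))
        return -1;

    char *end;
    long xlen = strtol(s + 2, &end, 10);
    if (xlen != 32 && xlen != 64 && xlen != 128)
        return -1;
    out->xlen = (int)xlen;

    size_t n = 0;
    const char *p = end;
    while (*p != '\0') {
        char c = (char)toupper((unsigned char)*p);
        if (c == '_') {                 /* optional separator */
            p++;
            continue;
        }
        if (!isalpha((unsigned char)c))
            return -1;
        if (c == 'Z' || c == 'S' || c == 'X') {
            /* multi-letter name: skip to the next separator */
            while (*p != '\0' && *p != '_')
                p++;
            continue;
        }
        if (n + 1 >= sizeof out->letters)
            return -1;
        out->letters[n++] = c;
        p++;
    }
    out->letters[n] = '\0';
    return 0;
}
```

With this sketch, `"rv64imac"` yields a width of 64 and the letters `IMAC`, and `"RV64GC_Zba_Zbb_Sscofpmf"` yields 64 and `GC`.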


## General-purpose registers

| Register | Name     | Usage                              | Preserved |
|----------|----------|------------------------------------|-----------|
| x0       | zero     | Hardwired to 0, ignores writes     | n/a       |
| x1       | ra       | Return address for jumps           | no        |
| x2       | sp       | Stack pointer                      | yes       |
| x3       | gp       | Global pointer                     | n/a       |
| x4       | tp       | Thread pointer                     | n/a       |
| x5-x7    | t0-t2    | Temporary registers                | no        |
| x8-x9    | s0/fp-s1 | Saved registers                    | yes       |
| x10-x11  | a0-a1    | Function arguments / Return values | no        |
| x12-x17  | a2-a7    | Function arguments                 | no        |
| x18-x27  | s2-s11   | Saved registers                    | yes       |
| x28-x31  | t3-t6    | Temporary registers                | no        |
| pc       | (none)   | Program counter                    | n/a       |
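
The table translates directly into a lookup. A minimal sketch, with the hypothetical function names `abi_name` and `is_preserved`; the data is taken from the table above.

```c
#include <stddef.h>

/* ABI names for x0..x31, in the order of the register table. */
static const char *const abi_names[32] = {
    "zero", "ra", "sp",  "gp",  "tp", "t0", "t1", "t2",
    "s0",   "s1", "a0",  "a1",  "a2", "a3", "a4", "a5",
    "a6",   "a7", "s2",  "s3",  "s4", "s5", "s6", "s7",
    "s8",   "s9", "s10", "s11", "t3", "t4", "t5", "t6",
};

/* Returns the ABI name of register x<n>, or NULL if n is out of range. */
const char *abi_name(unsigned n)
{
    return n < 32 ? abi_names[n] : NULL;
}

/* Returns 1 for registers marked "yes" in the Preserved column:
 * sp (x2), s0-s1 (x8-x9) and s2-s11 (x18-x27). */
int is_preserved(unsigned n)
{
    return n == 2 || n == 8 || n == 9 || (n >= 18 && n <= 27);
}
```

A function that uses a preserved register has to save and restore it, while temporaries and argument registers may be overwritten freely across a call.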


## Extensions

| Name                                | Letter   | Part of G |
|-------------------------------------|----------|-----------|
| Integer                             | I        | yes       |
| Integer multiplication and division | M        | yes       |
| Atomics                             | A        | yes       |
| Single-Precision floating-point     | F        | yes       |
| Double-Precision floating-point     | D        | yes       |
| Quad-Precision floating-point       | Q        | no        |
| 16-bit Compressed instructions      | C        | no        |
| Control and status register access  | Zicsr    | yes       |
| Instruction-Fetch fence             | Zifencei | yes       |

G is shorthand for the general-purpose combination of the extensions marked above.

Other extensions are in the draft stage at the moment


## **Encoding variants**

| Format | Bits 31-25         | Bits 24-20        | Bits 19-15 | Bits 14-12 | Bits 11-7         | Bits 6-0 | Used for                   |
|--------|--------------------|-------------------|------------|------------|-------------------|----------|----------------------------|
| R      | funct7             | rs2               | rs1        | funct3     | rd                | opcode   | Register-Register          |
| I      | imm[11:5]          | imm[4:0]          | rs1        | funct3     | rd                | opcode   | Short immediates and loads |
| S      | imm[11:5]          | rs2               | rs1        | funct3     | imm[4:0]          | opcode   | Stores                     |
| B      | imm[12], imm[10:5] | rs2               | rs1        | funct3     | imm[4:1], imm[11] | opcode   | Conditional branches       |
| U      | imm[31:25]         | imm[24:20]        | imm[19:15] | imm[14:12] | rd                | opcode   | Long immediates            |
| J      | imm[20], imm[10:5] | imm[4:1], imm[11] | imm[19:15] | imm[14:12] | rd                | opcode   | Unconditional jumps        |

Source: [1] RISC-V assembly guide
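
The field layout can be made concrete with a few lines of C. This is a minimal sketch under the bit positions listed above; the names `bits`, `rtype_fields`, `decode_rtype`, `itype_imm` and `btype_imm` are hypothetical.

```c
#include <stdint.h>

/* Fields of a register-register (R-type) instruction. */
struct rtype_fields {
    unsigned funct7, rs2, rs1, funct3, rd, opcode;
};

/* Extracts bits [hi:lo] of a 32-bit instruction word (hi - lo < 31). */
static uint32_t bits(uint32_t insn, unsigned hi, unsigned lo)
{
    return (insn >> lo) & ((1u << (hi - lo + 1)) - 1u);
}

struct rtype_fields decode_rtype(uint32_t insn)
{
    struct rtype_fields f = {
        .funct7 = bits(insn, 31, 25), .rs2    = bits(insn, 24, 20),
        .rs1    = bits(insn, 19, 15), .funct3 = bits(insn, 14, 12),
        .rd     = bits(insn, 11, 7),  .opcode = bits(insn, 6, 0),
    };
    return f;
}

/* I-type: a contiguous 12-bit immediate in bits 31..20, sign-extended. */
int32_t itype_imm(uint32_t insn)
{
    int32_t imm = (int32_t)bits(insn, 31, 20);
    return (imm & 0x800) ? imm - 0x1000 : imm;
}

/* B-type: the 13-bit branch offset is scattered over the word and
 * bit 0 is implicitly zero. */
int32_t btype_imm(uint32_t insn)
{
    int32_t imm = (int32_t)((bits(insn, 31, 31) << 12) |
                            (bits(insn, 7, 7)   << 11) |
                            (bits(insn, 30, 25) << 5)  |
                            (bits(insn, 11, 8)  << 1));
    return (imm & 0x1000) ? imm - 0x2000 : imm;
}
```

For example, the word `0x00B50533` decodes as opcode `0x33` with rd = 10, rs1 = 10 and rs2 = 11, which is `add a0,a0,a1`. Note that rs1, rs2 and rd sit at the same bit positions in every format that has them, so a decoder can read the register file before it knows the exact instruction type.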


## **Pseudo-instructions**

- A pseudo-instruction is a shorthand for one or more machine instructions
- Most of them rely on the zero register, so common operations need no dedicated instructions
- Example:
  - There is no real instruction to compute the two's complement
  - `neg rd, rs` is converted to `sub rd, zero, rs`
  - It just subtracts the value (rs) from zero and writes the result to the destination register (rd)
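
The expansion of `neg` can be modeled with a toy register file. This is only an illustrative sketch: `hart_regs`, `read_reg`, `write_reg`, `exec_sub` and `exec_neg` are hypothetical names, and register indices are assumed to be below 32.

```c
#include <stdint.h>

/* Toy model of the 32 integer registers of a 64-bit hart. */
struct hart_regs {
    uint64_t x[32];
};

/* x0 always reads as zero. */
static uint64_t read_reg(const struct hart_regs *h, unsigned r)
{
    return r == 0 ? 0 : h->x[r];
}

/* Writes to x0 are ignored. */
static void write_reg(struct hart_regs *h, unsigned r, uint64_t v)
{
    if (r != 0)
        h->x[r] = v;
}

/* Real instruction: sub rd, rs1, rs2 */
void exec_sub(struct hart_regs *h, unsigned rd, unsigned rs1, unsigned rs2)
{
    write_reg(h, rd, read_reg(h, rs1) - read_reg(h, rs2));
}

/* Pseudo-instruction: neg rd, rs  is just  sub rd, zero, rs */
void exec_neg(struct hart_regs *h, unsigned rd, unsigned rs)
{
    exec_sub(h, rd, 0, rs);
}
```

The model needs no extra code for `neg` beyond a call to `exec_sub`, which mirrors how the assembler maps the pseudo-instruction onto an existing encoding.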


## **Custom Function Units**

- A CFU is a hardware core that follows a specific interface and provides custom instructions
- The goal is to allow reusable extensions for any hardware without waiting multiple years for ratification
- Combines the advantages of standard and custom extensions
- Work started in 2019 and is still in the draft stage


## **CFU Playground**



- Framework for building and testing CFUs for machine learning
- Requires an FPGA board supported by LiteX Boards, or Renode and Verilator to simulate one
- Can only be run on Linux




## Companies


#### SiFive

- Leading RISC-V hardware company
- Founded in 2015
- First company to produce a RISC-V chip
- Received \$175 million in funding in March 2022, valuing the company at over \$2.5 billion

Source: [16] SiFive




Source: [17] Andes technology

#### Andes Technology

- Biggest supplier of embedded RISC-V cores
- Joined RISC-V International in 2016
- Major toolchain contributor
- Has engaged with more than 150 partners over the past 15 years

## Cores


Figure: block diagram of the SweRV EH1 core complex (RV32IMC core with IFU, DEC, EXU and LSU, I-cache, ICCM, DCCM, PIC, debug and DMA, attached through 64-bit AXI4 or AHB-Lite bus ports).

Source: [8] Western Digital SweRV EH1 documentation

#### Western Digital SweRV EH1

- RV32IMC core with branch predictor
- 1 GHz target frequency
- 4-way set-associative instruction cache
- Bus interfaces for instruction fetch, data access, debug access and external Direct Memory Access


Figure: block diagram of the SiFive U74 core complex (CLINT, PLIC, debug, Sv39 MMU, PMP, 32 KB instruction and data caches with ECC, bus matrix, 128 KB L2 cache with ECC, and peripheral, front, system and memory ports).

Source: [9] SiFive U74 documentation

#### SiFive U74

- RV64GC\_Zba\_Zbb\_Sscofpmf core
- Physical memory protection with 8 regions
- Virtual Memory support with up to 47 Physical Address bits
- The L2 Cache can be configured into high speed deterministic SRAMs


## Boards



Source: [14] WaveShare ESP-C3-32S-Kit

#### Waveshare ESP-C3-32S-Kit

SoC: ESP32-C3

- Single RV32IMC core
- Frequency: 160 MHz
- Memory: 400KB of SRAM
- Wi-Fi and Bluetooth 5 support

Usage: IoT, wearable electronics

Price: 12.99 Euro





Source: [15] Sipeed Nezha on Amazon

#### Sipeed Nezha

SoC: D1

- Single RV64GCV core (XuanTie C906)
- Memory: 1 GByte of DDR3 and 256 MByte of Nand Flash
- HiFi4 Digital Signal Processor
- G2D 2D graphics accelerators

Usage: Linux, IoT

Prices: 128.35 - 250.97 Euro



# Software ecosystem

RISC-V already has a broad software ecosystem:

- Supports Linux, macOS and Windows
- Includes compilers and C libraries such as GCC, Clang, glibc and Newlib
- Includes Fedora, Debian, Ubuntu and other Linux distributions
- Includes ports of Go, Java, Rust and even Node.js
- Includes multiple simulators, IDEs and much more

Link to the software list: https://github.com/riscvarchive/riscv-software-list


# Simulation options



- Good for software developers
- SPIKE
  - Good for hardware developers
- RARS
  - Complete tool-set to start development
- Jupiter
  - Good for educational purposes


## Code example

Multiplication without the required extension

Build pipeline, where `<architecture>` stands for the target architecture (for example `rv64im`):

- `riscv64-unknown-elf-gcc -march=<architecture> -g3 -o test test.c`
- `riscv64-unknown-elf-objdump -S test`

Compiled with GCC 11.1.0 on macOS


# Multiplication example

Architecture: RV64IM

C source:

    int multiply(int a, int b) {
        return a * b;
    }

Generated instructions for the multiplication:

    lw     a5,-20(s0)
    mv     a4,a5
    lw     a5,-24(s0)
    mulw   a5,a4,a5
    sext.w a5,a5


# Multiplication example

Architecture: RV64I

Without the M extension the compiler calls a library routine instead:

    00000000000101f8 <__muldi3>:
        mv      a2,a0
        li      a0,0
        andi    a3,a1,1
        beqz    a3,10204 <__muldi3+0xc>
        add     a0,a0,a2
        srli    a1,a1,0x1
        slli    a2,a2,0x1
        bnez    a1,101fc <__muldi3+0x4>
        ret

The routine is equivalent to the following C code:

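```
/* Shift-and-add multiplication, one bit of b per loop iteration.
   Assumes b >= 0; for a negative b the loop body never runs. */
int multiply(int a, int b) {
    int result = 0;
    while (b > 0) {
        if (b & 1) {      /* lowest bit of b is set: add the shifted a */
            result += a;
        }
        b = b >> 1;       /* move on to the next bit of b */
        a = a << 1;       /* a doubles with every bit position */
    }
    return result;
}
```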


# Is it really that free?

#### Not really

The ISA is free and open-source, but hardware implementations can be proprietary.


# Is it really extensible?

#### Yes

Could be a great feature in the long run. It can make adding new instruction sets easier.




## Does it have future in embedded devices?

#### Probably yes

Highly customizable and memory efficient. It can find its way into the IoT and microcontroller sectors.


# Is it ready for High-Performance market?

#### No

Almost no hardware at the moment


# **Final conclusion**

- RISC-V is an interesting technology that is still in the development stage
- Today RISC-V is far behind Intel and ARM in terms of performance
- The market is not yet stable, so there is room for hardware startups
- If you are enthusiastic, RISC-V can be a great opportunity to start your own project


### Sources

- [1] RISC-V assembly guide, 20.05.22
- [2] RISC-V history, 25.05.22
- [3] RISC-V specifications, 20.05.22
- [4] Design and implementation of RISC I, 20.05.22
- [5] RISC II, 20.05.22
- [6] Vector intrinsics, 26.05.22
- [7] Delay slots, 26.05.22
- [8] Western Digital SweRV EH1 documentation, 01.06.22
- [9] SiFive U74 documentation, 01.06.22
- [10] First RISC-V ISA, 01.06.22
- [11] CFU Playground, 01.06.22
- [12] RISC-V Cores and SoC Overview, 31.06.22
- [13] RISC-V members, 08.06.22
- [14] WaveShare ESP-C3-32S-Kit, 13.09.22
- [15] Sipeed Nezha on Amazon, 13.09.22
- [16] SiFive, 13.09.22
- [17] Andes technology, 13.09.22