

# Brief Intro to CGRAs

### Prof. Dejan Marković

ee216a@gmail.com

# **Coarse-Grained Reconfigurable Array (CGRA)**

### **Classification of CGRAs**

| Architecture         | Year | Programming model* | Computation model | Execution model** | Specifications                        |
|----------------------|------|--------------------|-------------------|-------------------|---------------------------------------|
| Xputer [8]           | 1991 | D                  | SCSD              | SSE               |                                       |
| PADDI [9]            | 1992 | I                  | SCSD              | SSE               |                                       |
| PADDI-2 [67]         | 1993 | D                  | SCMD              | DSD               |                                       |
| RAW [68]             | 1997 | I                  | MCMD              | DSD               | More like a multicore processor       |
| PipeRench [69]       | 1998 | D                  | SCMD              | SSE               |                                       |
| Morphosys [11]       | 2000 | I                  | SCMD              | SSE               |                                       |
| Wavescalar [62, 70]  | 2003 | I                  | MCMD              | DDD               | Dataflow-driven ISA                   |
| PACT-XPP [13]        | 2003 | I/C                | SCSD              | DSD               |                                       |
| DRP [26]             | 2004 | I                  | SCSD              | SSE               | Programmable FSM controller           |
| ADRES [10]           | 2004 | I                  | SCSD              | SSE               | VLIW controller                       |
| ASH [57]             | 2004 | D                  | SCSD              | SSD               |                                       |
| TRIPS [12]           | 2004 | I                  | MCMD              | DSD               | Dataflow-driven ISA                   |
| CCA [52]             | 2004 | transparent        | SCSD              | SSE               | Runtime-generated configurations      |
| Tartan [60]          | 2006 | I                  | MCMD              | DSD               | Asynchronous circuit                  |
| TFlex [71]           | 2007 | I                  | MCMD              | DSD               | Dataflow-driven ISA                   |
| RICA [72]            | 2008 | I                  | SCSD              | SSE               |                                       |
| PPA [54]             | 2009 | I                  | SCSD              | SSE               | Polymorphic configurations            |
| TCPA [50]            | 2009 | D                  | SCSD              | SSE               |                                       |
| C-Cores [73]         | 2010 | I                  | SCSD              | SSE               | Targeted reconfigurability, ASIC-like |
| DySER [47]           | 2012 | I                  | SCSD              | SSD               |                                       |
| REMUS [30]           | 2013 | I                  | SCSD              | SSE               |                                       |
| Triggered Inst. [61] | 2013 | D                  | MCMD              | DSD               |                                       |
| T3 [74]              | 2013 | I/C                | MCMD              | DSD               | Dataflow-driven ISA                   |
| SGMF [75]            | 2014 | I                  | MCMD              | DDD               |                                       |
| FPCA [76]            | 2014 | I                  | SCSD              | SSE               |                                       |
| DynaSPAM [53]        | 2015 | transparent        | SCMD              | SSD               | Based on PipeRench                    |
| NDA [77]             | 2015 | -                  | SCSD              | SSE               | Process-in-memory                     |
| HARTMP [78]          | 2016 | I                  | SCMD              | DSD               |                                       |
| HRA [16]     2017     I     SCSD     SSD     General-purpose       Plasticine [19]     2017     D     SCMD / MCMD     SSD     Parallel-pattern-based programming       Stream-dataflow [20]     2017     I     SCSD     DSD     Vector memory interface       CGRA-ME [80]     2017     I     SCSD     SSE     ADRES-like       Wave DPU [18]     2017     I/C     SCSD     SSD     Commercial product for DNN       PX-CGRA [81]     2018     -     SCSD     SSE     Approximate PIs       i-DPs CGRA [82]     2018     -     SCMD     SSE     Double-ALU/Reg. in each PE       Parallel-XL [83]     2018     I/C     SCMD/MCMD     DDD     Intel Cilk & work stealing                                                                                                                                                                                                                                                                                                                                                                                                  | DORA [51]            | 2016 | transparent           | SCSD / SCMD          | SSD                  | Based on DySER                        |
| Plasticine [19]     2017     D     SCMD / MCMD     SSD     Parallel-pattern-based programming       Stream-dataflow [20]     2017     I     SCSD     DSD     Vector memory interface       CGRA-ME [80]     2017     I     SCSD     SSE     ADRES-like       Wave DPU [18]     2017     I/C     SCSD     SSD     Commercial product for DNN       PX-CGRA [81]     2018     -     SCSD     SSE     Approximate PEs       i-DPs CGRA [82]     2018     -     SCMD     SSE     Double-ALU/Reg. in each PE       Parallel-XL [83]     2018     I/C     SCMD/MCMD     DDD     Intel Cilk & work stealing                                                                                                                                                                                                                                                                                                                                                                                                                                                                     | HRL [79]             | 2016 | D/I                   | SCSD                 | SSE                  | Process-in-memory, mix-grained        |
| Stream-dataflow [20]     2017     I     SCSD     DSD     Vector memory interface       CGRA-ME [80]     2017     I     SCSD     SSE     ADRES-like       Wave DPU [18]     2017     I/C     SCSD     SSD     Commercial product for DNN       PX-CGRA [81]     2018     -     SCSD     SSE     Approximate PEs       i-DPs CGRA [82]     2018     -     SCMD     SSE     Double-ALU/Reg. in each PE       Parallel-XL [83]     2018     I/C     SCMD/MCMD     DDD     Intel Cilk & work stealing                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | HReA [16]            | 2017 | I                     | SCSD                 | SSD                  | General-purpose                       |
| CGRA-ME [80]     2017     I     SCSD     SSE     ADRES-like       Wave DPU [18]     2017     I/C     SCSD     SSD     Commercial product for DNN       PX-CGRA [81]     2018     -     SCSD     SSE     Approximate PEs       i-DPs CGRA [82]     2018     -     SCMD     SSE     Double-ALU/Reg. in each PE       Parallel-XL [83]     2018     I/C     SCMD/MCMD     DDD     Intel Cilk & work stealing                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                | Plasticine [19]      | 2017 | D                     | SCMD / MCMD          | SSD                  | Parallel-pattern-based programming    |
| Wave DPU [18]     2017     I/C     SCSD     SSD     Commercial product for DNN       PX-CGRA [81]     2018     -     SCSD     SSE     Approximate PEs       i-DPs CGRA [82]     2018     -     SCMD     SSE     Double-ALU/Reg. in each PE       Parallel-XL [83]     2018     I/C     SCMD/MCMD     DDD     Intel Cilk & work stealing                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  | Stream-dataflow [20] | 2017 | I                     | SCSD                 | DSD                  | Vector memory interface               |
| PX-CGRA [81]     2018     -     SCSD     SSE     Approximate PEs       i-DPs CGRA [82]     2018     -     SCMD     SSE     Double-ALU/Reg. in each PE       Parallel-XL [83]     2018     I/C     SCMD/MCMD     DDD     Intel Cilk & work stealing                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       | CGRA-ME [80]         | 2017 | I                     | SCSD                 | SSE                  | ADRES-like                            |
| i-DPs CGRA [82] 2018 - SCMD SSE Double-ALU/Reg. in each PE<br>Parallel-XL [83] 2018 I/C SCMD/MCMD DDD Intel Cilk & work stealing                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | Wave DPU [18]        | 2017 | I/C                   | SCSD                 | SSD                  | Commercial product for DNN            |
| Parallel-XL [83]     2018     I/C     SCMD/MCMD     DDD     Intel Cilk & work stealing                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   | PX-CGRA [81]         | 2018 | -                     | SCSD                 | SSE                  | Approximate PEs                       |
|                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | i-DPs CGRA [82]      | 2018 | -                     | SCMD                 | SSE                  | Double-ALU/Reg. in each PE            |
| dMT_CCRA [84] 2018 UC MCMD DDD Based on SCME                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             | Parallel-XL [83]     | 2018 | I/C                   | SCMD/MCMD            | DDD                  | Intel Cilk & work stealing            |
| unif-contrion 2010 inc memb DDD based on Some                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            | dMT-CGRA [84]        | 2018 | I/C                   | MCMD                 | DDD                  | Based on SGMF                         |

<sup>\*</sup>I: imperative programming model; D: declarative programming model; C: parallel/concurrent (imperative) programming model; "*transparent*" means that no CGRA-related programming is required; "-" means that the work does not mention programming.

#### **Required model features**

- Programming: I/C (Imperative, concurrent)
- Computation: MCMD (multi-config, multi-data)
- Execution: DDD (dynamic-scheduling, dynamic-dataflow)

#### Only recent designs have the desired features

- [83] is based on an FPGA prototype and simulations
- [84] focuses on the narrow aspect of point-to-point inter-thread communication, as extensions to CUDA
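
The claim that only recent designs meet all three requirements can be checked against the table. The sketch below transcribes the 2014-2018 rows shown above into a list and filters them; the names `ROWS` and `has_required_features` are illustrative and not part of the survey.

```python
# Rows transcribed from the 2014-2018 part of the classification table:
# (architecture, programming model, computation model, execution model)
ROWS = [
    ("FPCA", "I", "SCSD", "SSE"),
    ("DynaSPAM", "transparent", "SCMD", "SSD"),
    ("NDA", "-", "SCSD", "SSE"),
    ("HARTMP", "I", "SCMD", "DSD"),
    ("DORA", "transparent", "SCSD/SCMD", "SSD"),
    ("HRL", "D/I", "SCSD", "SSE"),
    ("HReA", "I", "SCSD", "SSD"),
    ("Plasticine", "D", "SCMD/MCMD", "SSD"),
    ("Stream-dataflow", "I", "SCSD", "DSD"),
    ("CGRA-ME", "I", "SCSD", "SSE"),
    ("Wave DPU", "I/C", "SCSD", "SSD"),
    ("PX-CGRA", "-", "SCSD", "SSE"),
    ("i-DPs CGRA", "-", "SCMD", "SSE"),
    ("Parallel-XL", "I/C", "SCMD/MCMD", "DDD"),
    ("dMT-CGRA", "I/C", "MCMD", "DDD"),
]


def has_required_features(prog, comp, exe):
    """True if a row offers I/C programming, MCMD computation and DDD execution."""
    return (
        {"I", "C"} <= set(prog.split("/"))
        and "MCMD" in comp.split("/")
        and exe == "DDD"
    )


matches = [name for name, prog, comp, exe in ROWS
           if has_required_features(prog, comp, exe)]
print(matches)  # ['Parallel-XL', 'dMT-CGRA']
```

Only the two 2018 entries, [83] and [84], pass the filter among these rows.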

#### Great need, many open challenges

- No efficient programming paradigm for CGRAs
- More complicated Hw than CPU due to 2D scheduling
- High-level abstraction provides coarse-grain parallelism, which is insufficient to fulfill the hardware potential
- Performance depends on the application, hence the need for application-oriented extensions to the programming model
- Reconfig. speed down to pipeline level (tens of cycles)

L. Liu, et al., "A Survey of Coarse-Grained Reconfigurable Architecture and Design: Taxonomy, Challenges, and Applications," ACM Computing Surveys, Oct. 2019.

<sup>\*\*</sup>SSE: static-scheduling sequential-execution; SSD: static-scheduling static-dataflow-execution; DSD: dynamic-scheduling static-dataflow-execution; DDD: dynamic-scheduling dynamic-dataflow-execution.

## Partial FPGA Reconfig.: Small-Size & Very Slow

### FPGA dynamic partial reconfiguration time depends on [1]:

- The size of the config. bit-stream (BitStr<sub>size</sub>), usually in KB
- The reconfig. path throughput (RP<sub>throughput</sub>), usually in MB/s

$$T_{dyn-rec} = \frac{BitStr_{size}}{RP_{throughput}}$$

- Dynamic partial reconfiguration controllers reach up to 400 MB/s [2]
- Usually, a large number of Clk cycles is required for a small amount of logic
  - 130k Clk cycles to reconfigure 1.5k slices of logic [3], i.e. about 0.4 ms at a 300 MHz Clk
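
As a minimal sketch of the arithmetic above, the two helpers below evaluate the reconfiguration-time formula and convert a cycle count into time. The function names and the 100 KB bit-stream are hypothetical; only the 400 MB/s, 130k-cycle and 300 MHz figures come from the text, and decimal units (1 KB = 10^3 B) are assumed.

```python
def reconfig_time_s(bitstream_bytes, throughput_bytes_per_s):
    """T_dyn-rec = BitStr_size / RP_throughput, in seconds."""
    return bitstream_bytes / throughput_bytes_per_s


def cycles_to_time_s(cycles, clk_hz):
    """Wall-clock time of a given number of clock cycles."""
    return cycles / clk_hz


# 130k cycles at 300 MHz, the figure quoted from [3]
t_cycles = cycles_to_time_s(130_000, 300e6)
print(f"{t_cycles * 1e3:.2f} ms")  # 0.43 ms

# Hypothetical 100 KB partial bit-stream at the 400 MB/s peak from [2]
t_bits = reconfig_time_s(100e3, 400e6)
print(f"{t_bits * 1e3:.2f} ms")  # 0.25 ms
```

Even at the peak controller throughput, a modest bit-stream costs a fraction of a millisecond, which is far from the tens-of-cycles target listed among the open challenges.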

#### An SDR pipeline on a Zynq FPGA uses 3.2k slices of logic in a 4-region partition

- The largest partial bit-stream for a region is 324 KB [4]
- The worst-case time for dynamic partial reconfig. of this region is 1.08 ms
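
Rearranging the reconfiguration-time formula gives the throughput implied by these two numbers. This is an inference from the quoted values, assuming decimal units (1 KB = 10^3 B), and not a figure stated in [4]:

$$RP_{throughput} = \frac{BitStr_{size}}{T_{dyn-rec}} = \frac{324\ \text{KB}}{1.08\ \text{ms}} = 300\ \text{MB/s}$$

This is consistent with the 400 MB/s upper bound quoted for reconfiguration controllers.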

<sup>[1]</sup> G. Valente et al., "Dynamic partial reconfiguration profitability for realtime systems," IEEE Embedded Systems Letters, pp. 1–1, 2020.

<sup>[2]</sup> S. D. Carlo, P. Prinetto, P. Trotta, and J. Andersson, "A portable open-source controller for safe dynamic partial reconfiguration on Xilinx FPGAs," in Proc. of the 25th International Conference on Field Programmable Logic and Applications (FPL), 2015, pp. 1–4.

<sup>[3]</sup> L. Pezzarossa, A. T. Kristensen, M. Schoeberl and J. Sparso, "Can real-time systems benefit from dynamic partial reconfiguration?," 2017 IEEE Nordic Circuits and Systems Conference (NORCAS): NORCHIP and International Symposium of System-on-Chip (SoC), Linkoping, 2017, pp. 1-6, doi: 10.1109/NORCHIP.2017.8124984.

<sup>[4]</sup> A. Kamaleldin et al., "A reconfigurable hardware platform implementation for software defined radio using dynamic partial reconfiguration on Xilinx Zynq FPGA," 2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS), Boston, MA, 2017, pp. 1540-1543, doi: 10.1109/MWSCAS.2017.8053229.

# **Runtime Reconfig. for Data-Driven Processing**



- Opportunistically repurpose unutilized processor arrays
  - Multi-step compilation (Sw + Hw) avoids complete program recompile
  - Requires array's network symmetry (for polygon translation/rotation/flip)
  - Support for Sw compilation from Python/C++ base

## An Alternative to Accelerators?

- ~7% of *entire* SoC area is active, TSMC 45nm node [\*]
- Depending on domain specialization, RTRA can be within 2x-10x in area and power vs accelerator
  - **Note:** system/platform power will not be 2-10x higher; much less (~30-50%)
    - e.g. iPad battery life with H.264 on CPU (3 hours) vs accelerator (10 hours), a 3x system impact with a 1,000x accelerator gain
- Feasible for new and/or evolving architectures, SDR, etc.
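
The gap between component-level and platform-level power can be sketched with an Amdahl-style model. Everything below is an assumption made for illustration: the model normalizes platform power to 1, and the accelerator shares passed to `platform_power_ratio` are example values, not data from the lecture. Only the 2x-10x range and the iPad numbers (3 hours, 10 hours, 1,000x) come from the text.

```python
def platform_power_ratio(acc_share, k):
    """Platform power with an RTRA relative to the same platform with a fixed accelerator.

    acc_share: fraction of platform power drawn by the accelerator (assumed value)
    k: RTRA power relative to the accelerator (2-10 in the text)
    """
    return (1.0 - acc_share) + acc_share * k


def cpu_share_from_battery(life_cpu_h, life_acc_h, gain):
    """Share of platform power the CPU spends on the task, inferred from battery life.

    Model: life_cpu / life_acc = (1 - f) + f / gain, solved for f.
    """
    return (1.0 - life_cpu_h / life_acc_h) / (1.0 - 1.0 / gain)


# Illustrative accelerator shares (assumptions, not measured data)
print(f"{platform_power_ratio(0.05, 10):.2f}")  # 1.45
print(f"{platform_power_ratio(0.30, 2):.2f}")   # 1.30

# iPad example from the text: 3 h on CPU, 10 h with accelerator, 1,000x gain
f = cpu_share_from_battery(3, 10, 1000)
print(f"{f:.3f}")  # 0.701
```

Under this model, a 2x-10x component penalty is diluted by the rest of the platform, and the iPad numbers imply that H.264 on the CPU accounts for roughly 70% of platform power.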







[Figure annotations: potentially saved area; ~2 active programs, can accommodate more; algorithm updates w/o chip respin]

[\*] G. Venkatesh, et al., ACM SIGARCH Computer Architecture News, Volume 38, Issue 1, March 2010, pp. 205–218, https://doi.org/10.1145/1735970.1736044