soc.experiment package

Submodules

soc.experiment.alu_fsm module

Simple example of an FSM-based ALU

This demonstrates a design that follows the valid/ready protocol of the ALU, but with an FSM implementation instead of a pipeline. It is also intended to comply with both the CompALU API and the nmutil Pipeline API (Liskov Substitution Principle).

The basic rules are:

  1. p.ready_o is asserted in the initial (“Idle”) state, otherwise it stays low.
  2. n.valid_o is asserted in the final (“Done”) state, otherwise it stays low.
  3. The FSM stays in the Idle state while p.valid_i is low, otherwise it accepts the input data and moves on.
  4. The FSM stays in the Done state while n.ready_i is low, otherwise it releases the output data and goes back to the Idle state.
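
The four rules above can be sketched as a cycle-level behavioural model in plain Python. This is only an illustration, not the nmigen implementation: the class name, the intermediate BUSY state and the fixed busy_cycles latency are assumptions introduced for the example.

```python
from enum import Enum


class State(Enum):
    IDLE = 0
    BUSY = 1   # hypothetical intermediate state; the text only names Idle and Done
    DONE = 2


class HandshakeFSM:
    """Behavioural model of rules 1-4 (assumed names, not the real module)."""

    def __init__(self, busy_cycles=3):
        self.state = State.IDLE
        self.busy_cycles = busy_cycles  # assumed fixed latency, must be >= 1
        self.count = 0
        self.data = None

    @property
    def p_ready_o(self):
        return self.state is State.IDLE      # rule 1

    @property
    def n_valid_o(self):
        return self.state is State.DONE      # rule 2

    def tick(self, p_valid_i, p_data_i, n_ready_i):
        """Advance one clock edge using the inputs sampled this cycle."""
        if self.state is State.IDLE:
            if p_valid_i:                    # rule 3: accept input, move on
                self.data = p_data_i
                self.count = self.busy_cycles
                self.state = State.BUSY
        elif self.state is State.BUSY:
            self.count -= 1
            if self.count == 0:
                self.state = State.DONE
        elif self.state is State.DONE:
            if n_ready_i:                    # rule 4: release output, go idle
                self.state = State.IDLE
```

Because p_ready_o and n_valid_o are derived purely from the current state, a caller cannot observe them asserted at the same time, which is what distinguishes this FSM style from a pipeline.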
class soc.experiment.alu_fsm.CompFSMOpSubset(name=None)

Bases: soc.fu.base_input_record.CompOpSubsetBase

class soc.experiment.alu_fsm.Dummy

Bases: object

class soc.experiment.alu_fsm.Shifter(width)

Bases: nmigen.hdl.ir.Elaboratable

Simple sequential shifter

Prev port data:

  • p.data_i.data: value to be shifted
  • p.data_i.shift: shift amount
    • When zero, no shift occurs.
    • On POWER, range is 0 to 63 for 32-bit, and 0 to 127 for 64-bit.
    • Other values wrap around.

Operation type:

  • op.sdir: shift direction (0 = left, 1 = right)

Next port data:

  • n.data_o.data: shifted value
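
As a rough reference for the port semantics above, the expected result can be modelled in plain Python. The function name is hypothetical, and two points are assumptions rather than facts from this page: the right shift is taken to be logical (zero-filling), and "wrap around" is taken to mean the shift amount is reduced modulo twice the data width.

```python
def shift_model(data, shift, sdir, width=64):
    """Reference model of the shifter result (assumed semantics, see lead-in).

    data:  value to be shifted (p.data_i.data)
    shift: shift amount (p.data_i.shift)
    sdir:  0 = left, 1 = right (op.sdir)
    """
    mask = (1 << width) - 1
    data &= mask
    shift %= 2 * width          # assumed: values outside 0..2*width-1 wrap
    if sdir == 0:
        return (data << shift) & mask
    return data >> shift        # assumed: logical right shift
```

With this model, any shift amount from width up to 2*width-1 shifts every bit out and yields zero.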

class NextData(width)

Bases: object

class PrevData(width)

Bases: object

elaborate(platform)
ports()
soc.experiment.alu_fsm.test_shifter()

soc.experiment.alu_hier module

Experimental ALU: based on nmigen alu_hier.py, includes branch-compare ALU

This ALU deliberately adds (unnecessary) delays to different operations, so that the 6600-style matrices and the CompUnits can be tested. Countdown timers wait for (defined) periods before indicating that the output is valid.

A “real” integer ALU would place the answers onto the output bus after only one cycle (sync).
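
The countdown idea can be sketched in plain Python as follows. This is a behavioural illustration only: the class name, the operation names and the per-operation delay values are all hypothetical and are not taken from alu_hier.py.

```python
class DelayedALU:
    """Result is computed at once but flagged valid only after a countdown."""

    # hypothetical per-operation delays, in clock cycles
    DELAYS = {"add": 1, "sub": 1, "mul": 4}

    def __init__(self, width=16):
        self.mask = (1 << width) - 1
        self.counter = 0
        self.result = 0
        self.valid_o = False

    def start(self, op, a, b):
        """Latch the operands, compute the answer and arm the countdown."""
        if op == "add":
            value = a + b
        elif op == "sub":
            value = a - b
        elif op == "mul":
            value = a * b
        else:
            raise ValueError("unknown op: " + op)
        self.result = value & self.mask
        self.counter = self.DELAYS[op]
        self.valid_o = False

    def tick(self):
        """One clock cycle: count down, then indicate the output is valid."""
        if self.counter > 0:
            self.counter -= 1
            if self.counter == 0:
                self.valid_o = True
```

Giving each operation a different latency is what exercises out-of-order completion in whatever logic is tracking the units.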

class soc.experiment.alu_hier.ALU(width)

Bases: nmigen.hdl.ir.Elaboratable

elaborate(platform)
ports()
class soc.experiment.alu_hier.Adder(width)

Bases: nmigen.hdl.ir.Elaboratable

elaborate(platform)
class soc.experiment.alu_hier.BranchALU(width)

Bases: nmigen.hdl.ir.Elaboratable

elaborate(platform)
ports()
class soc.experiment.alu_hier.BranchOp(width, op)

Bases: nmigen.hdl.ir.Elaboratable

elaborate(platform)
class soc.experiment.alu_hier.Dummy

Bases: object

class soc.experiment.alu_hier.DummyALU(width)

Bases: nmigen.hdl.ir.Elaboratable

elaborate(platform)
ports()
class soc.experiment.alu_hier.Multiplier(width)

Bases: nmigen.hdl.ir.Elaboratable

elaborate(platform)
class soc.experiment.alu_hier.Shifter(width)

Bases: nmigen.hdl.ir.Elaboratable

elaborate(platform)
class soc.experiment.alu_hier.SignExtend(width)

Bases: nmigen.hdl.ir.Elaboratable

elaborate(platform)
class soc.experiment.alu_hier.Subtractor(width)

Bases: nmigen.hdl.ir.Elaboratable

elaborate(platform)
soc.experiment.alu_hier.alu_sim(dut)
soc.experiment.alu_hier.run_op(dut, a, b, op, inv_a=0)
soc.experiment.alu_hier.test_alu()
soc.experiment.alu_hier.test_alu_parallel()
soc.experiment.alu_hier.write_alu_gtkw(gtkw_name, clk_period=1e-06, sub_module=None, pysim=True)

Common function to write the GTKWave documents for this module

soc.experiment.cache_ram module

class soc.experiment.cache_ram.CacheRam(ROW_BITS=16, WIDTH=64, TRACE=True, ADD_BUF=False)

Bases: nmigen.hdl.ir.Elaboratable

elaborate(platform)

soc.experiment.compalu module

class soc.experiment.compalu.ComputationUnitNoDelay(rwid, alu)

Bases: nmigen.hdl.ir.Elaboratable

elaborate(platform)
ports()
soc.experiment.compalu.op_sim(dut, a, b, op, inv_a=0, imm=0, imm_ok=0)
soc.experiment.compalu.scoreboard_sim(dut)
soc.experiment.compalu.test_scoreboard()

soc.experiment.compalu_multi module

Computation Unit (aka “ALU Manager”).

Manages a Pipeline or FSM, ensuring that the start and end time are 100% monitored. At no time may the ALU proceed without this module notifying the Dependency Matrices. At no time is a result production “abandoned”. This module blocks (indicates busy) from when it first receives an opcode until it receives notification that its result(s) have been successfully stored in the regfile(s).
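
The busy rule in the paragraph above can be modelled very loosely in plain Python. The class and method names here are hypothetical and do not correspond to the real signal names; the sketch only shows that busy is held from issue until every destination write has been acknowledged.

```python
class BusyTracker:
    """Toy model: busy from opcode issue until all results are stored."""

    def __init__(self, n_dst=1):
        self.n_dst = n_dst
        self.pending = set()   # destination indices not yet written back
        self.busy = False

    def issue(self):
        """An opcode arrives: the unit blocks from this point on."""
        if self.busy:
            raise RuntimeError("unit is busy; a new opcode must not be issued")
        self.busy = True
        self.pending = set(range(self.n_dst))

    def write_done(self, i):
        """Notification that result i has been stored in its regfile."""
        self.pending.discard(i)
        if not self.pending:
            self.busy = False
```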

Documented at http://libre-soc.org/3d_gpu/architecture/compunit

class soc.experiment.compalu_multi.CompUnitRecord(subkls, rwid, n_src=None, n_dst=None, name=None)

Bases: soc.fu.regspec.RegSpec, nmutil.iocontrol.RecordObject

base class for Computation Units, to provide a uniform API and allow “record.connect” etc. to be used, particularly when it comes to connecting multiple Computation Units up as a block (very laborious)

LDSTCompUnitRecord should derive from this class and add the additional signals it requires

subkls: the class (not an instance) needed to construct the opcode
rwid: either an integer (specifies width of all regs) or a “regspec”

see https://libre-soc.org/3d_gpu/architecture/regfile/ section on regspecs

class soc.experiment.compalu_multi.MultiCompUnit(rwid, alu, opsubsetkls, n_src=2, n_dst=1, name=None)

Bases: soc.fu.regspec.RegSpecALUAPI, nmigen.hdl.ir.Elaboratable

elaborate(platform)
get_fu_out(i)
ports()
soc.experiment.compalu_multi.find_ok(fields)

find_ok helper function - finds field ending in “_ok”
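
Going only by the one-line description above, the helper's behaviour can be sketched as below. The function name, the return of None when nothing matches and the choice of the first match are assumptions of this sketch, not confirmed details of the real function.

```python
def find_ok_sketch(fields):
    """Return the first field name ending in "_ok", or None if there is none."""
    for name in fields:
        if name.endswith("_ok"):
            return name
    return None
```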

soc.experiment.compalu_multi.go_record(n, name)

soc.experiment.compldst_multi module

LOAD / STORE Computation Unit.

This module covers POWER9-compliant Load and Store operations, with selection on each between immediate and indexed mode as options for the calculation of the Effective Address (EA), and also “update” mode which optionally stores that EA into an additional register.
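
The Effective Address options described above can be sketched in plain Python. The function name and argument layout are hypothetical, and the sketch deliberately ignores the Power ISA special case in which a base register number of 0 stands for the value zero.

```python
def compute_ea(ra, rb, imm, indexed, update, width=64):
    """Return (ea, ra_writeback) for one load or store.

    ra, rb:  register values (not register numbers)
    imm:     signed immediate offset, used when indexed is False
    indexed: True selects ra + rb, False selects ra + imm
    update:  True means the EA is also written to an additional register
    """
    mask = (1 << width) - 1
    offset = rb if indexed else imm
    ea = (ra + offset) & mask
    ra_writeback = ea if update else None
    return ea, ra_writeback
```

Returning None for the writeback value when update is off keeps the optional extra register write explicit for the caller.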

soc.experiment.cscore module

soc.experiment.cxxsim module

soc.experiment.dcache module

DCache

based on Anton Blanchard microwatt dcache.vhdl

soc.experiment.dcache.CacheRamOut()
soc.experiment.dcache.CacheTagArray()
soc.experiment.dcache.CacheValidBitsArray()
class soc.experiment.dcache.DCache

Bases: nmigen.hdl.ir.Elaboratable

Set associative dcache write-through

TODO (in no specific order):

  • See list in icache.vhdl
  • Complete load misses on the cycle when WB data comes instead of at the end of line (this requires dealing with requests coming in while not idle…)
cache_tag_read(m, r0_stall, req_index, cache_tag_set, cache_tags)

Cache tag RAM read port

dcache_fast_hit(m, req_op, r0_valid, r0, r1, req_hit_way, req_index, req_tag, access_ok, tlb_hit, tlb_hit_way, tlb_req_index)
dcache_log(m, r1, valid_ra, tlb_hit_way, stall_out)
dcache_request(m, r0, ra, req_index, req_row, req_tag, r0_valid, r1, cache_valids, replace_way, use_forward1_next, use_forward2_next, req_hit_way, plru_victim, rc_ok, perm_attr, valid_ra, perm_ok, access_ok, req_op, req_go, tlb_pte_way, tlb_hit, tlb_hit_way, tlb_valid_way, cache_tag_set, cancel_store, req_same_tag, r0_stall, early_req_row)

Cache request parsing and hit detection

dcache_slow(m, r1, use_forward1_next, use_forward2_next, cache_valids, r0, replace_way, req_hit_way, req_same_tag, r0_valid, req_op, cache_tags, req_go, ra)
elaborate(platform)
maybe_plrus(m, r1, plru_victim)

Generate PLRUs

maybe_tlb_plrus(m, r1, tlb_plru_victim)

Generate TLB PLRUs

rams(m, r1, early_req_row, cache_out_row, replace_way)

Generate a cache RAM for each way. This handles the normal reads, writes from reloads and the special store-hit update path as well.

Note: the BRAMs have an extra read buffer, meaning the output is pipelined an extra cycle. This differs from the icache. The writeback logic needs to take that into account by using 1-cycle delayed signals for load hits.

reservation_comb(m, cancel_store, set_rsrv, clear_rsrv, r0_valid, r0, reservation)

Handle load-with-reservation and store-conditional instructions

reservation_reg(m, r0_valid, access_ok, set_rsrv, clear_rsrv, reservation, r0)
stage_0(m, r0, r1, r0_full)

Latch the request in r0.req as long as we’re not stalling

tlb_read(m, r0_stall, tlb_valid_way, tlb_tag_way, tlb_pte_way, dtlb_valid_bits, dtlb_tags, dtlb_ptes)

The TLB operates in the second cycle on the request latched in r0.req. TLB updates write the entry at the end of the second cycle.

tlb_update(m, r0_valid, r0, dtlb_valid_bits, tlb_req_index, tlb_hit_way, tlb_hit, tlb_plru_victim, tlb_tag_way, dtlb_tags, tlb_pte_way, dtlb_ptes)
writeback_control(m, r1, cache_out_row)

Return data for loads & completion control logic

class soc.experiment.dcache.DCachePendingHit(tlb_pte_way, tlb_valid_way, tlb_hit_way, cache_valid_idx, cache_tag_set, req_addr, hit_set)

Bases: nmigen.hdl.ir.Elaboratable

elaborate(platform)
class soc.experiment.dcache.DTLBUpdate

Bases: nmigen.hdl.ir.Elaboratable

elaborate(platform)
soc.experiment.dcache.HitWaySet()
class soc.experiment.dcache.MemAccessRequest(name=None)

Bases: nmutil.iocontrol.RecordObject

class soc.experiment.dcache.Op

Bases: enum.Enum

An enumeration.

OP_BAD = 1
OP_LOAD_HIT = 3
OP_LOAD_MISS = 4
OP_LOAD_NC = 5
OP_NONE = 0
OP_STCX_FAIL = 2
OP_STORE_HIT = 6
OP_STORE_MISS = 7
soc.experiment.dcache.PLRUOut()
class soc.experiment.dcache.PermAttr(name=None)

Bases: nmutil.iocontrol.RecordObject

class soc.experiment.dcache.RegStage0(name=None)

Bases: nmutil.iocontrol.RecordObject

class soc.experiment.dcache.RegStage1(name=None)

Bases: nmutil.iocontrol.RecordObject

class soc.experiment.dcache.Reservation

Bases: nmutil.iocontrol.RecordObject

soc.experiment.dcache.RowPerLineValidArray()
class soc.experiment.dcache.State

Bases: enum.Enum

An enumeration.

IDLE = 0
NC_LOAD_WAIT_ACK = 3
RELOAD_WAIT_ACK = 1
STORE_WAIT_ACK = 2
soc.experiment.dcache.TLBPLRUOut()
soc.experiment.dcache.TLBPtesArray()
soc.experiment.dcache.TLBTagEAArray()
soc.experiment.dcache.TLBTagsArray()
soc.experiment.dcache.TLBValidBitsArray()
soc.experiment.dcache.dcache_load(dut, addr, nc=0)
soc.experiment.dcache.dcache_random_sim(dut)
soc.experiment.dcache.dcache_sim(dut)
soc.experiment.dcache.dcache_store(dut, addr, data, nc=0)
soc.experiment.dcache.extract_perm_attr(pte)
soc.experiment.dcache.get_index(addr)
soc.experiment.dcache.get_row(addr)
soc.experiment.dcache.get_row_of_line(row)
soc.experiment.dcache.get_tag(addr)
soc.experiment.dcache.is_last_row(row, last)
soc.experiment.dcache.is_last_row_addr(addr, last)
soc.experiment.dcache.ispow2(x)
soc.experiment.dcache.next_row(row)
soc.experiment.dcache.read_tag(way, tagset)
soc.experiment.dcache.read_tlb_pte(way, ptes)
soc.experiment.dcache.read_tlb_tag(way, tags)
soc.experiment.dcache.test_dcache(mem, test_fn, test_name)
soc.experiment.dcache.write_tlb_pte(way, ptes, newpte)
soc.experiment.dcache.write_tlb_tag(way, tags, tag)

soc.experiment.icache module

ICache

based on Anton Blanchard microwatt icache.vhdl

Set associative icache

TODO (in no specific order):

  • Add debug interface to inspect cache content
  • Add snoop/invalidate path
  • Add multi-hit error detection
  • Pipelined bus interface (wb or axi)
  • Maybe add parity? There are a few bits free in each BRAM row on Xilinx
  • Add optimization: service hits on partially loaded lines
  • Add optimization: (maybe) interrupt reload on flush/redirect
  • Check if playing with the geometry of the cache tags allows for more efficient use of distributed RAM and less logic/muxes. Currently we write TAG_BITS width which may not match full ram blocks and might cause muxes to be inferred for “partial writes”.
  • Check if making the read size of PLRU a ROM helps utilization
soc.experiment.icache.CacheRamOut()
soc.experiment.icache.CacheTagArray()
soc.experiment.icache.CacheValidBitsArray()
class soc.experiment.icache.ICache

Bases: nmigen.hdl.ir.Elaboratable

64 bit direct mapped icache. All instructions are 4B aligned.

elaborate(platform)
icache_comb(m, use_previous, r, req_index, req_row, req_hit_way, req_tag, real_addr, req_laddr, cache_valid_bits, cache_tags, access_ok, req_is_hit, req_is_miss, replace_way, plru_victim, cache_out_row)
icache_hit(m, use_previous, r, req_is_hit, req_hit_way, req_index, req_tag, real_addr)
icache_log(m, req_hit_way, ra_valid, access_ok, req_is_miss, req_is_hit, lway, wstate, r)
icache_miss(m, cache_valid_bits, r, req_is_miss, req_index, req_laddr, req_tag, replace_way, cache_tags, access_ok, real_addr)
icache_miss_clr_tag(m, r, replace_way, cache_valid_bits, req_index, tagset, cache_tags)
icache_miss_idle(m, r, req_is_miss, req_laddr, req_index, req_tag, replace_way, real_addr)
icache_miss_wait_ack(m, r, replace_way, inval_in, stbs_done, cache_valid_bits)
itlb_lookup(m, tlb_req_index, itlb_ptes, itlb_tags, real_addr, itlb_valid_bits, ra_valid, eaa_priv, priv_fault, access_ok)
itlb_update(m, itlb_valid_bits, itlb_tags, itlb_ptes)
maybe_plrus(m, r, plru_victim)
rams(m, r, cache_out_row, use_previous, replace_way, req_row)
soc.experiment.icache.PLRUOut()
class soc.experiment.icache.RegInternal

Bases: nmutil.iocontrol.RecordObject

soc.experiment.icache.RowPerLineValidArray()
class soc.experiment.icache.State

Bases: enum.Enum

An enumeration.

CLR_TAG = 1
IDLE = 0
WAIT_ACK = 2
soc.experiment.icache.TLBPtesArray()
soc.experiment.icache.TLBTagArray()
soc.experiment.icache.TLBValidBitsArray()
soc.experiment.icache.get_index(addr)
soc.experiment.icache.get_row(addr)
soc.experiment.icache.get_row_of_line(row)
soc.experiment.icache.get_tag(addr)
soc.experiment.icache.hash_ea(addr)
soc.experiment.icache.icache_sim(dut)
soc.experiment.icache.is_last_row(row, last)
soc.experiment.icache.is_last_row_addr(addr, last)
soc.experiment.icache.ispow2(n)
soc.experiment.icache.next_row(row)
soc.experiment.icache.read_insn_word(addr, data)
soc.experiment.icache.read_tag(way, tagset)
soc.experiment.icache.test_icache(mem)
soc.experiment.icache.write_tag(way, tagset, tag)

soc.experiment.imem module

class soc.experiment.imem.TestMemFetchUnit(pspec)

Bases: soc.minerva.units.fetch.FetchUnitInterface, nmigen.hdl.ir.Elaboratable

elaborate(platform)
ports()

soc.experiment.l0_cache module

L0 Cache/Buffer

This first version is intended for prototyping and test purposes: it has “direct” access to Memory.

The intention is that this version remains an integral part of the test infrastructure, and, just as with minerva’s memory arrangement, a dynamic runtime config selects alternative memory arrangements rather than replaces and discards this code.

Links:

class soc.experiment.l0_cache.CacheRecord(name=None)

Bases: nmigen.hdl.rec.Record

class soc.experiment.l0_cache.DataMerger(array_size)

Bases: nmigen.hdl.ir.Elaboratable

Merges data based on an address-match matrix. Identifies (picks) one (any) row, then uses that row, based on matching address bits, to merge (OR) all data rows into the output.

Basically, by the time DataMerger is used, all of its incoming data is determined not to conflict. The last step before actually submitting the request to the Memory Subsystem is to work out which requests, on the same 128-bit cache line, can be “merged” due to them being: (A) on the same address (bits 4 and above) (B) having byte-enable lines that (as previously mentioned) do not conflict.

Therefore, put simply, this module will:

  1. pick a row (any row) and identify it by an index labelled “idx”
  2. merge all byte-enable lines which are on that same address, as indicated by addr_match_i[idx], onto the output
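
The two steps can be sketched in plain Python as below. This is a behavioural illustration under stated assumptions: the function name and the data layout are hypothetical, and "any row" is resolved here by taking the lowest-numbered row that has at least one match bit set.

```python
def merge_rows(addr_match, rows):
    """OR together all rows that share an address with the picked row.

    addr_match: list of N ints; bit j of addr_match[i] is set when rows
                i and j are on the same address
    rows:       list of N (data, byte_enable) tuples
    Returns the merged (data, byte_enable), or (0, 0) if no row is active.
    """
    # step 1: pick a row (assumed policy: lowest index with any match bit)
    idx = next((i for i, m in enumerate(addr_match) if m), None)
    if idx is None:
        return 0, 0
    # step 2: merge every row flagged in addr_match[idx]
    data = 0
    byte_enable = 0
    for j, (d, en) in enumerate(rows):
        if (addr_match[idx] >> j) & 1:
            data |= d
            byte_enable |= en
    return data, byte_enable
```

A plain OR is sufficient only because, as stated above, the incoming rows have already been determined not to conflict.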
elaborate(platform)
class soc.experiment.l0_cache.DataMergerRecord(name=None)

Bases: nmigen.hdl.rec.Record

{data: 128 bit, byte_enable: 16 bit}

class soc.experiment.l0_cache.L0CacheBuffer(n_units, pimem, regwid=64, addrwid=48)

Bases: nmigen.hdl.ir.Elaboratable

L0 Cache / Buffer

Note that the final version will have two interfaces per LDSTCompUnit, to cover misaligned requests, as well as two 128-bit L1 Cache interfaces: one for odd (addr[4] == 1) and one for even (addr[4] == 0).

This version is to be used for test purposes (and actively maintained for such, rather than “replaced”)

There are much better ways to implement this. However, it is only a “demo” / “test” class, with one important aspect: it responds combinatorially, whereas an nmigen FSM’s state changes only take effect on clock-sync boundaries.

Note: the data byte-order is not expected to be normalised (LE/BE) by this class. That task is taken care of by LDSTCompUnit.

elaborate(platform)
ports()
class soc.experiment.l0_cache.L0CacheBuffer2(n_units=8, regwid=64, addrwid=48)

Bases: nmigen.hdl.ir.Elaboratable

elaborate(platform)
class soc.experiment.l0_cache.TestDataMerger(methodName='runTest')

Bases: unittest.case.TestCase

test_data_merger()
class soc.experiment.l0_cache.TestDualPortSplitter(methodName='runTest')

Bases: unittest.case.TestCase

test_dual_port_splitter()
class soc.experiment.l0_cache.TestL0Cache(methodName='runTest')

Bases: unittest.case.TestCase

test_l0_cache_test_bare_wb()
test_l0_cache_testpi()
class soc.experiment.l0_cache.TstDataMerger2

Bases: nmigen.hdl.ir.Elaboratable

addr_match(j, addr)
elaborate(platform)
class soc.experiment.l0_cache.TstL0CacheBuffer(pspec, n_units=3)

Bases: nmigen.hdl.ir.Elaboratable

elaborate(platform)
ports()
soc.experiment.l0_cache.data_merger_merge(dut)
soc.experiment.l0_cache.data_merger_test2(dut)
soc.experiment.l0_cache.l0_cache_ld(dut, addr, datalen, expected)
soc.experiment.l0_cache.l0_cache_ldst(arg, dut)
soc.experiment.l0_cache.l0_cache_st(dut, addr, data, datalen)
soc.experiment.l0_cache.wait_addr(port)
soc.experiment.l0_cache.wait_busy(port, no=False)
soc.experiment.l0_cache.wait_ldok(port)

soc.experiment.lsmem module

class soc.experiment.lsmem.TestMemLoadStoreUnit(pspec)

Bases: soc.minerva.units.loadstore.LoadStoreUnitInterface, nmigen.hdl.ir.Elaboratable

elaborate(platform)

soc.experiment.mem_types module

mem_types

based on Anton Blanchard microwatt common.vhdl

class soc.experiment.mem_types.DCacheToLoadStore1Type(name=None)

Bases: nmutil.iocontrol.RecordObject

class soc.experiment.mem_types.DCacheToMMUType(name=None)

Bases: nmutil.iocontrol.RecordObject

class soc.experiment.mem_types.Fetch1ToICacheType(name=None)

Bases: nmutil.iocontrol.RecordObject

class soc.experiment.mem_types.ICacheToDecode1Type(name=None)

Bases: nmutil.iocontrol.RecordObject

class soc.experiment.mem_types.LoadStore1ToDCacheType(name=None)

Bases: nmutil.iocontrol.RecordObject

class soc.experiment.mem_types.LoadStore1ToMMUType(name=None)

Bases: nmutil.iocontrol.RecordObject

class soc.experiment.mem_types.MMUToDCacheType(name=None)

Bases: nmutil.iocontrol.RecordObject

class soc.experiment.mem_types.MMUToICacheType(name=None)

Bases: nmutil.iocontrol.RecordObject

class soc.experiment.mem_types.MMUToLoadStore1Type(name=None)

Bases: nmutil.iocontrol.RecordObject

soc.experiment.mmu module

MMU

based on Anton Blanchard microwatt mmu.vhdl

class soc.experiment.mmu.MMU

Bases: nmigen.hdl.ir.Elaboratable

Radix MMU

Supports 4-level trees as in arch 3.0B, but not the two-step translation for guests under a hypervisor (i.e. there is no gRA -> hRA translation).

elaborate(platform)
mmu_0(m, r, rin, l_in, l_out, d_out, addrsh, mask)
proc_tbl_wait(m, v, r, data)
radix_read_wait(m, v, r, d_in, data)
radix_tree_idle(m, l_in, r, v)
segment_check(m, v, r, data, finalmask)
class soc.experiment.mmu.RegStage(name=None)

Bases: nmutil.iocontrol.RecordObject

class soc.experiment.mmu.State

Bases: enum.Enum

An enumeration.

DO_TLBIE = 1
IDLE = 0
PROC_TBL_READ = 3
PROC_TBL_WAIT = 4
RADIX_FINISH = 9
RADIX_LOAD_TLB = 8
RADIX_LOOKUP = 6
RADIX_READ_WAIT = 7
SEGMENT_CHECK = 5
TLB_WAIT = 2
soc.experiment.mmu.dcache_get(dut)

simulator process for getting memory load requests

soc.experiment.mmu.mmu_sim(dut)
soc.experiment.mmu.mmu_wait(dut)
soc.experiment.mmu.test_mmu()

soc.experiment.pi2ls module

PortInterface to LoadStoreUnitInterface adapter

PortInterface      LoadStoreUnitInterface
-------------      ----------------------
is_ld_i/1          x_ld_i
is_st_i/1          x_st_i

data_len/4         x_mask/16 (translate using LenExpand)

busy_o/1           most likely to be x_busy_o
go_die_i/1         rst?
addr.data/48       x_addr_i (x_addr_i[:4] goes into LenExpand)
addr.ok/1          probably x_valid_i & ~x_stall_i

addr_ok_o/1        no equivalent. might work using x_stall_i
exception_o/2(?)   m_load_err_o and m_store_err_o

ld.data/64         m_ld_data_o
ld.ok/1            probably implicit, when x_busy drops low
st.data/64         x_st_data_i
st.ok/1            probably kinda redundant, set to x_st_i
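
The data_len to x_mask translation mentioned in the mapping can be sketched in plain Python. This assumes that the expansion turns a byte count plus the low address bits into a byte-enable bitmap; the function name is hypothetical and the sketch is not the LenExpand class itself.

```python
def len_expand(data_len, addr_low, mask_wid=16):
    """Expand a byte count and low address bits into a byte-enable mask.

    data_len: number of bytes accessed (e.g. 1, 2, 4 or 8)
    addr_low: bottom address bits, i.e. x_addr_i[:4] in the mapping above
    mask_wid: width of the resulting mask (x_mask is 16 bits wide)
    """
    byte_mask = (1 << data_len) - 1          # data_len ones
    return (byte_mask << addr_low) & ((1 << mask_wid) - 1)
```

Bits that would fall beyond mask_wid are simply dropped in this sketch, so an access that crosses the 16-byte boundary is not represented in full.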

class soc.experiment.pi2ls.Pi2LSUI(name, lsui=None, data_wid=64, mask_wid=8, addr_wid=48)

Bases: soc.experiment.pimem.PortInterfaceBase

elaborate(platform)
get_rd_data(m)
set_rd_addr(m, addr, mask)
set_wr_addr(m, addr, mask)
set_wr_data(m, data, wen)
class soc.experiment.pi2ls.Pi2LSUI1(name, pi=None, lsui=None, data_wid=64, mask_wid=8, addr_wid=48)

Bases: nmigen.hdl.ir.Elaboratable

connect_port(inport)
elaborate(platform)
splitaddr(addr)

split the address into top and bottom bits of the memory granularity

soc.experiment.pimem module

L0 Cache/Buffer

This first version is intended for prototyping and test purposes: it has “direct” access to Memory.

The intention is that this version remains an integral part of the test infrastructure, and, just as with minerva’s memory arrangement, a dynamic runtime config selects alternative memory arrangements rather than replaces and discards this code.

Links:

class soc.experiment.pimem.PortInterface(name=None, regwid=64, addrwid=48)

Bases: nmutil.iocontrol.RecordObject

Defines the interface - the API - that the LDSTCompUnit connects to. Note that this is NOT a “fire-and-forget” interface: the LDSTCompUnit must be kept apprised that the request is in progress, and only when it has a 100% successful completion can the notification be given (busy dropped).

The interface FSM rules are as follows:

  • if busy_o is asserted, a LD/ST is in progress. further requests may not be made until busy_o is deasserted.

  • only one of is_ld_i or is_st_i may be asserted. busy_o will immediately be asserted and remain asserted.

  • addr.ok is to be asserted when the LD/ST address is known. addr.data is to be valid on the same cycle.

    addr.ok and addr.data must REMAIN asserted until busy_o is de-asserted. this ensures that there is no need for the L0 Cache/Buffer to have an additional address latch (because the LDSTCompUnit already has it)

  • addr_ok_o (or exception.happened) must be waited for. these will be asserted for one cycle only.

  • exception.happened will be asserted if there is no chance that the memory request may be fulfilled.

    busy_o is deasserted on the same cycle as exception.happened is asserted.

  • conversely: addr_ok_o must ONLY be asserted if there is a HUNDRED PERCENT guarantee that the memory request will be fulfilled.

  • for a LD, ld.ok will be asserted - for only one clock cycle - at any point in the future that is acceptable to the underlying Memory subsystem. the recipient MUST latch ld.data on that cycle.

    busy_o is deasserted on the same cycle as ld.ok is asserted.

  • for a ST, st.ok may be asserted only after addr_ok_o has been asserted, alongside valid st.data at the same time. st.ok must only be asserted for one cycle.

    the underlying Memory is REQUIRED to pick up that data and guarantee its delivery. no back-acknowledgement is required.

    busy_o is deasserted on the cycle AFTER st.ok is asserted.
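
A few of the rules above lend themselves to a simple trace check. The sketch below, in plain Python, inspects a recorded per-cycle trace for two kinds of violation: is_ld_i and is_st_i asserted together, and a pulse signal held for more than one cycle. The function name, the trace format and the flat key names (for example "ld_ok" for ld.ok) are assumptions of this example, and the remaining rules are not checked.

```python
def check_pi_trace(trace):
    """Check a recorded trace against some of the PortInterface rules.

    trace: list of dicts, one per clock cycle, mapping signal name to 0/1;
           signals missing from a dict are treated as 0
    Returns a list of (cycle, message) tuples, empty if nothing is violated.
    """
    errors = []
    prev = {}
    for cycle, sig in enumerate(trace):
        # only one of is_ld_i or is_st_i may be asserted
        if sig.get("is_ld_i") and sig.get("is_st_i"):
            errors.append((cycle, "is_ld_i and is_st_i both asserted"))
        # addr_ok_o, ld.ok and st.ok are single-cycle pulses
        for name in ("addr_ok_o", "ld_ok", "st_ok"):
            if sig.get(name) and prev.get(name):
                errors.append((cycle, name + " held for more than one cycle"))
        prev = sig
    return errors
```

A checker of this kind would typically be run over values sampled once per clock in a simulation.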

connect_port(inport)
class soc.experiment.pimem.PortInterfaceBase(regwid=64, addrwid=4)

Bases: nmigen.hdl.ir.Elaboratable

Base class for PortInterface-compliant Memory read/writers

addrbits
connect_port(inport)
elaborate(platform)
get_rd_data(m)
ports()
set_rd_addr(m, addr, mask)
set_wr_addr(m, addr, mask)
set_wr_data(m, data, wen)
splitaddr(addr)

split the address into top and bottom bits of the memory granularity
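
On plain integers, the split can be sketched as below. The function name is hypothetical, and both the meaning of addrbits (the number of low address bits that select a byte within one memory word) and the order of the returned pair are assumptions of this example.

```python
def splitaddr_model(addr, addrbits):
    """Split addr into (low, high) at the memory granularity.

    low:  the bottom addrbits bits (byte offset within a memory word)
    high: the remaining top bits (word address)
    """
    low = addr & ((1 << addrbits) - 1)
    high = addr >> addrbits
    return low, high
```

For a 64-bit wide memory with byte granularity, addrbits would be 3 under these assumptions.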

class soc.experiment.pimem.TestMemoryPortInterface(regwid=64, addrwid=4)

Bases: soc.experiment.pimem.PortInterfaceBase

This is a test class for simple verification of the LDSTCompUnit and for the simple core, to be able to run unit tests rapidly and with less other code in the way.

Versions of this which are compatible (conform with PortInterface) will include augmented-Wishbone Bus versions, including ones that connect to L1, L2, MMU etc. However, this is the “base lowest possible version that complies with PortInterface”.

elaborate(platform)
get_rd_data(m)
ports()
set_rd_addr(m, addr, mask)
set_wr_addr(m, addr, mask)
set_wr_data(m, data, wen)

soc.experiment.plru module

class soc.experiment.plru.PLRU(BITS=2)

Bases: nmigen.hdl.ir.Elaboratable

elaborate(platform)
ports()

soc.experiment.score6600 module

class soc.experiment.score6600.CompUnitALUs(rwid, opwid, n_alus)

Bases: soc.experiment.score6600.CompUnitsBase

elaborate(platform)
class soc.experiment.score6600.CompUnitBR(rwid, opwid)

Bases: soc.experiment.score6600.CompUnitsBase

elaborate(platform)
class soc.experiment.score6600.CompUnitLDSTs(rwid, opwid, n_ldsts, mem)

Bases: soc.experiment.score6600.CompUnitsBase

elaborate(platform)
class soc.experiment.score6600.CompUnitsBase(rwid, units, ldstmode=False)

Bases: nmigen.hdl.ir.Elaboratable

Computation Unit Base class.

Amazingly, this class works recursively. It’s supposed to just look after some ALUs (that can handle the same operations), grouping them together. However, it turns out that the same code can also group groups of Computation Units together as well.

Basically it was intended just to concatenate the ALU’s issue, go_rd etc. signals together, which start out as bits and become sequences. Turns out that the same trick works just as well on Computation Units!

So this class may be used recursively to present a top-level sequential concatenation of all the signals in and out of ALUs, whilst at the same time making it convenient to group ALUs together.

At the lower level, the intent is that groups of (identical) ALUs may be passed the same operation. Even beyond that, the intent is that that group of (identical) ALUs actually share the same pipeline and as such become a “Concurrent Computation Unit” as defined by Mitch Alsup (see section 11.4.9.3)
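
The recursive grouping described above can be illustrated in plain Python, with lists of bits standing in for hardware signals. The class names are hypothetical and only two signals (issue and go_rd) are modelled; the point is that a group exposes the same attributes as a single unit, so groups can be nested.

```python
class Unit:
    """Leaf: one ALU-like unit whose signals are length-1 bit lists."""

    def __init__(self, name):
        self.name = name
        self.n_units = 1
        self.issue = [0]
        self.go_rd = [0]


class Group:
    """Concatenates member signals; members may be Units or other Groups."""

    def __init__(self, units):
        self.units = list(units)
        self.n_units = sum(u.n_units for u in self.units)

    @property
    def issue(self):
        return [bit for u in self.units for bit in u.issue]

    @property
    def go_rd(self):
        return [bit for u in self.units for bit in u.go_rd]
```

Because Group reads only n_units, issue and go_rd from its members, it cannot tell a leaf from another group, which is why the same code works at every level.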

elaborate(platform)
class soc.experiment.score6600.FunctionUnits(n_regs, n_int_alus)

Bases: nmigen.hdl.ir.Elaboratable

elaborate(platform)
class soc.experiment.score6600.IssueToScoreboard(qlen, n_in, n_out, rwid, opwid, n_regs)

Bases: nmigen.hdl.ir.Elaboratable

elaborate(platform)
ports()
class soc.experiment.score6600.Scoreboard(rwid, n_regs)

Bases: nmigen.hdl.ir.Elaboratable

elaborate(platform)
ports()
soc.experiment.score6600.create_random_ops(dut, n_ops, shadowing=False, max_opnums=3)
soc.experiment.score6600.disable_issue(dut)
soc.experiment.score6600.instr_q(dut, op, funit, op_imm, imm, src1, src2, dest, branch_success, branch_fail)
soc.experiment.score6600.int_instr(dut, op, imm, src1, src2, dest, branch_success, branch_fail)
soc.experiment.score6600.power_instr_q(dut, pdecode2, ins, code)
soc.experiment.score6600.power_sim(m, dut, pdecode2, instruction, alusim)
soc.experiment.score6600.print_reg(dut, rnums)
soc.experiment.score6600.scoreboard_branch_sim(dut, alusim)
soc.experiment.score6600.scoreboard_sim(dut, alusim)
soc.experiment.score6600.test_scoreboard()
soc.experiment.score6600.wait_for_busy_clear(dut)
soc.experiment.score6600.wait_for_issue(dut, dut_issue)

soc.experiment.score6600_multi module

class soc.experiment.score6600_multi.CompUnitALUs(rwid, opwid, n_alus)

Bases: soc.experiment.score6600_multi.CompUnitsBase

elaborate(platform)
class soc.experiment.score6600_multi.CompUnitBR(rwid, opwid)

Bases: soc.experiment.score6600_multi.CompUnitsBase

elaborate(platform)
class soc.experiment.score6600_multi.CompUnitLDSTs(rwid, opwid, n_ldsts, l0)

Bases: soc.experiment.score6600_multi.CompUnitsBase

elaborate(platform)
class soc.experiment.score6600_multi.CompUnitsBase(rwid, units, ldstmode=False)

Bases: nmigen.hdl.ir.Elaboratable

Computation Unit Base class.

Amazingly, this class works recursively. It’s supposed to just look after some ALUs (that can handle the same operations), grouping them together. However, it turns out that the same code can also group groups of Computation Units together as well.

Basically it was intended just to concatenate the ALU’s issue, go_rd etc. signals together, which start out as bits and become sequences. Turns out that the same trick works just as well on Computation Units!

So this class may be used recursively to present a top-level sequential concatenation of all the signals in and out of ALUs, whilst at the same time making it convenient to group ALUs together.

At the lower level, the intent is that groups of (identical) ALUs may be passed the same operation. Even beyond that, the intent is that that group of (identical) ALUs actually share the same pipeline and as such become a “Concurrent Computation Unit” as defined by Mitch Alsup (see section 11.4.9.3)

elaborate(platform)
class soc.experiment.score6600_multi.FunctionUnits(n_reg, n_int_alus, n_src, n_dst)

Bases: nmigen.hdl.ir.Elaboratable

elaborate(platform)
class soc.experiment.score6600_multi.IssueToScoreboard(qlen, n_in, n_out, rwid, opwid, n_regs)

Bases: nmigen.hdl.ir.Elaboratable

elaborate(platform)
ports()
class soc.experiment.score6600_multi.Scoreboard(rwid, n_regs)

Bases: nmigen.hdl.ir.Elaboratable

elaborate(platform)
ports()
soc.experiment.score6600_multi.create_random_ops(dut, n_ops, shadowing=False, max_opnums=3)
soc.experiment.score6600_multi.disable_issue(dut)
soc.experiment.score6600_multi.instr_q(dut, op, funit, op_imm, imm, src1, src2, dest, branch_success, branch_fail)
soc.experiment.score6600_multi.int_instr(dut, op, imm, src1, src2, dest, branch_success, branch_fail)
soc.experiment.score6600_multi.power_instr_q(dut, pdecode2, ins, code)
soc.experiment.score6600_multi.power_sim(m, dut, pdecode2, instruction, alusim)
soc.experiment.score6600_multi.print_reg(dut, rnums)
soc.experiment.score6600_multi.scoreboard_branch_sim(dut, alusim)
soc.experiment.score6600_multi.scoreboard_sim(dut, alusim)
soc.experiment.score6600_multi.test_scoreboard()
soc.experiment.score6600_multi.wait_for_busy_clear(dut)
soc.experiment.score6600_multi.wait_for_issue(dut, dut_issue)

soc.experiment.sim module

class soc.experiment.sim.MemSim(regwid, addrw)

Bases: object

ld(addr)
st(addr, data)
class soc.experiment.sim.RegSim(rwidth, nregs)

Bases: object

check(dut)
dump(dut)
op(op, op_imm, imm, src1, src2, dest)
setval(dest, val)

soc.experiment.testmem module

class soc.experiment.testmem.TestMemory(regwid, addrw, granularity=None, init=True, readonly=False)

Bases: nmigen.hdl.ir.Elaboratable

elaborate(platform)
ports()

soc.experiment.wb_types module

wb_types

based on Anton Blanchard microwatt wishbone_types.vhdl

soc.experiment.wb_types.WBAddrType()
soc.experiment.wb_types.WBDataType()
class soc.experiment.wb_types.WBIOMasterOut(name=None)

Bases: nmutil.iocontrol.RecordObject

class soc.experiment.wb_types.WBIOSlaveOut(name=None)

Bases: nmutil.iocontrol.RecordObject

soc.experiment.wb_types.WBIOSlaveOutInit()
class soc.experiment.wb_types.WBMasterOut(name=None)

Bases: nmutil.iocontrol.RecordObject

soc.experiment.wb_types.WBMasterOutInit()
soc.experiment.wb_types.WBMasterOutVector()
soc.experiment.wb_types.WBSelType()
class soc.experiment.wb_types.WBSlaveOut(name=None)

Bases: nmutil.iocontrol.RecordObject

soc.experiment.wb_types.WBSlaveOutInit()
soc.experiment.wb_types.WBSlaveOutVector()

Module contents