Original Post

Following a discussion on Discord with GuyPerfect and dasi, I wrote a little tool to determine the cache behavior of the VB (Guy guessed right…).
For completeness, I've attached a screenshot from real hardware and the test binary.
What it does:
It enables and clears the cache and then executes a series of 2048 instructions, ‘MOVEA $0000,r1,r2’ … ‘MOVEA $07ff,r1,r2’.
Then it dumps the cache to memory and plots the result on screen using the hexfont https://github.com/enthusi/hexfont
What we can see are the 128 cache entries (8 bytes each).
I strongly suggest digesting this beautiful source of wisdom:
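As a rough illustration of the test stream, here is a Python sketch that encodes the 2048 MOVEA instructions as halfword pairs. It assumes the V810 Format V layout (6-bit opcode, 5-bit reg2, 5-bit reg1, then a 16-bit immediate) and the MOVEA opcode value 0b101000; both are assumptions of this sketch, and `encode_movea` is a made-up helper, not part of the actual tool. The point is that every instruction carries a unique immediate, so a dumped cache entry presumably tells you which instruction it came from.

```python
# Sketch only: encode 'MOVEA imm16, r1, r2' for imm16 = 0x0000 .. 0x07ff.
# Assumed Format V layout: opcode(6) | reg2(5) | reg1(5), followed by imm16.
MOVEA_OPCODE = 0b101000  # assumed opcode value for MOVEA


def encode_movea(imm16, reg1, reg2):
    """Return the two 16-bit halfwords of one MOVEA instruction."""
    first = (MOVEA_OPCODE << 10) | (reg2 << 5) | reg1
    return first, imm16 & 0xFFFF


# One entry per instruction; only the immediate differs between entries.
stream = [encode_movea(imm, 1, 2) for imm in range(0x800)]

for first, imm in (stream[0], stream[-1]):
    print(f"{first:04x} {imm:04x}")
```

Since the first halfword is identical for all 2048 instructions, only the immediates need to be read off the dump.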
http://perfectkiosk.net/stsvb.html#cpu_instruction_cache
Since the cache holds 1 KB, executing 2 KB of deterministic code reveals whether cache entries are actively replaced on cache misses or whether the cache stays ‘saturated’ until it is cleared.
The good news: the cache IS being replaced. This means that, while it is good practice to disable the cache when it is not needed (cache misses are expensive), keeping it enabled will not saturate it.
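To make the two hypotheses concrete, here is a toy Python model of the experiment. It assumes a direct-mapped cache with 128 entries of 8 bytes, indexed by address; the direct mapping and the `run` helper are assumptions of this sketch, not something measured here. It fetches 2 KB of code linearly and reports which 1 KB block ends up in each entry under each policy.

```python
# Toy model, not the real hardware: direct-mapped, 128 entries x 8 bytes (assumed).
ENTRIES = 128
ENTRY_BYTES = 8
CACHE_BYTES = ENTRIES * ENTRY_BYTES  # 1 KB


def run(code_bytes, replace_on_miss):
    """Fetch code linearly from address 0 and return the tag left in each entry."""
    tags = [None] * ENTRIES
    for addr in range(0, code_bytes, ENTRY_BYTES):
        index = (addr // ENTRY_BYTES) % ENTRIES
        tag = addr // CACHE_BYTES
        empty = tags[index] is None
        miss = tags[index] != tag
        # A 'saturating' cache only fills empty entries; a replacing one refills on every miss.
        if empty or (replace_on_miss and miss):
            tags[index] = tag
    return tags


replacing = run(2048, replace_on_miss=True)
saturating = run(2048, replace_on_miss=False)

print("replacing :", set(replacing))   # tag of the second 1 KB block
print("saturating:", set(saturating))  # tag of the first 1 KB block
```

In this model a replacing cache ends up holding only the second kilobyte of code, while a saturating one still holds the first, which is the difference the dump makes visible.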
The digits on the far right are part of some timing tests performed along with the cache replacement query.
It confirms (well, that part was documented well before 🙂) the wait states for RAM/ROM access as set via the WCR register.
The time difference for those instructions with WCR set and unset was
(0x2a2 - 0x1f2) × 1e-6 s / 2048 ~ 1e-7 seconds per instruction, which at 20 MHz is 2 cycles (1 cycle per 16-bit read).
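The arithmetic can be checked with a few lines of Python. This sketch reads the two figures as 0x2a2 and 0x1f2 microseconds, which is an assumption about the notation used above; the variable names are made up for illustration.

```python
# Sketch of the timing arithmetic; the two readings are assumed to be in microseconds.
CPU_HZ = 20_000_000   # 20 MHz clock
INSTRUCTIONS = 2048   # number of MOVEA instructions timed

t_slow = 0x2A2 * 1e-6  # longer run, in seconds
t_fast = 0x1F2 * 1e-6  # shorter run, in seconds

per_instruction = (t_slow - t_fast) / INSTRUCTIONS  # extra seconds per instruction
cycles = per_instruction * CPU_HZ                   # extra clock cycles per instruction

print(f"{per_instruction:.2e} s per instruction, about {cycles:.2f} cycles")
```

The result is a little under 1e-7 s, which rounds to 2 cycles per 32-bit instruction, consistent with one extra cycle per 16-bit read.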

  • This topic was modified 4 weeks, 1 day ago by enthusi.
2 Replies

Well done! This really helps to de-mystify the operations of the cache. I’ll update the Sacred Tech Scroll to clarify the replacement policy.

How far in advance are the cache entries being fetched relative to the actual code being executed?

That’s an interesting question (of course).
This is the code that dumps the cache:
movea $12, r0, r22
movhi $500, r22, r22 ; (1<<4 | 2 | $50000<<8) = dump to $5000000
ldsr r22, 24 ; CHCW; this is a 16-bit wide Format II instruction
; followed by a call
add -4, sp ; this is also a 16-bit Format II instruction
(the last 2 opcodes are 4 bytes together = $72d8, $447c)
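For anyone who wants to double-check the numbers, here is a Python sketch that reproduces the value written to CHCW and the two 16-bit encodings. The helper functions, the Format II field layout (6-bit opcode, 5-bit reg2, 5-bit immediate or register ID), the opcode values and the name given to bit 1 are assumptions of this sketch rather than something taken from the test binary.

```python
# Sketch only: emulate the two instructions that build the CHCW value.
def movea(imm16, src):
    """MOVEA: add the sign-extended 16-bit immediate to the source value."""
    if imm16 & 0x8000:
        imm16 -= 0x10000
    return (src + imm16) & 0xFFFFFFFF


def movhi(imm16, src):
    """MOVHI: add the immediate shifted left by 16 bits to the source value."""
    return (src + (imm16 << 16)) & 0xFFFFFFFF


r22 = movhi(0x500, movea(0x12, 0))

# Field positions follow the comment in the listing; the names are assumed.
enable_bit = (r22 >> 1) & 1        # the '2' in the comment
dump_bit = (r22 >> 4) & 1          # the '1<<4' in the comment
dump_address = (r22 >> 8) << 8     # the '$50000 << 8' in the comment


def encode_format2(opcode, reg2, low5):
    """Assumed Format II layout: opcode(6) | reg2(5) | imm5 or regID(5)."""
    return (opcode << 10) | (reg2 << 5) | (low5 & 0x1F)


ldsr_word = encode_format2(0b011100, 22, 24)  # ldsr r22, 24 (assumed opcode)
add_word = encode_format2(0b010001, 3, -4)    # add -4, sp with sp = r3 (assumed opcode)

print(f"CHCW = {r22:08x}, dump to {dump_address:08x}")
print(f"ldsr = {ldsr_word:04x}, add = {add_word:04x}")
```

Under these assumptions the two encodings come out as $72d8 and $447c, matching the pair quoted above.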

The latest Mednafen stops there. Real hardware additionally shows:
0000dfe3, which is st.w lp, 0[sp]. So I would think it fetches the next FULL 32 bits
(which is odd, since cache entries are 8 bytes).
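The extra word can be decoded by hand with a short Python sketch. It assumes the dump prints each 32-bit word with the first halfword in the low 16 bits, and that the first halfword uses the layout opcode(6) | reg2(5) | reg1(5) followed by a 16-bit displacement; the layout and the register aliases (lp = r31, sp = r3) are assumptions of this sketch.

```python
# Sketch only: split the dumped word into the fields of a load/store instruction.
word = 0x0000DFE3

first_halfword = word & 0xFFFF   # assumed to come first in memory
displacement = word >> 16        # second halfword, the 16-bit displacement

opcode = first_halfword >> 10
reg2 = (first_halfword >> 5) & 0x1F  # register being stored
reg1 = first_halfword & 0x1F         # base address register

print(f"opcode={opcode:06b} reg2=r{reg2} reg1=r{reg1} disp={displacement}")
```

With these assumptions the fields come out as r31 stored at displacement 0 from r3, which agrees with reading the word as st.w lp, 0[sp].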

 
