Original Post

Hi, I’m new to the site … until yesterday I didn’t even know that anyone had updated the GCC V810 patches past the original GCC 2.95 patches that were (AFAIK) done by a bunch of Japanese guys in 2000 (for the PC-FX, I believe).

Anyway … I’m trying to “open-up” the PC-FX for development and have done my own update of the old 2.95 patches to binutils 2.23.2 and GCC 4.7.4, in order to get a “modern” C compiler with C99 capability, and with nearly-all of C11.

It occurs to me that you guys over here with a love for the VirtualBoy may be interested in the work that I’ve done, and that you might be a larger group to provide a test-bed, rather than the PC-FX community, where I’m pretty-much the only assembler-capable developer.

I’ve had a quick “chat” with KR155E, and with his help, I’ve found the following threads …

“experimental gcc4 patches”
http://www.planetvb.com/modules/newbb/viewtopic.php?topic_id=3883

“gccVB optimization options and assembly code”
http://www.planetvb.com/modules/newbb/viewtopic.php?topic_id=5055

“Compiling gccvb 4.4.2 under Cygwin”
http://www.planetvb.com/modules/newbb/viewtopic.php?topic_id=5328

From what-I-can-see, I don’t think that my patches are experiencing any of the problems that have been reported in those threads, except for the “movhi optimization” issue … which isn’t really an “issue” as such, it’s because the compiler doesn’t know where the labels are going to resolve to, so it has to generate full 32-bit loads.
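To illustrate why an unresolved label costs a full 32-bit load, here is a minimal host-side sketch in C. It assumes the usual V810 pairing of MOVHI (adds a 16-bit immediate shifted left by 16) and MOVEA (adds a sign-extended 16-bit immediate). The names split_address, split_for_movhi_movea and rebuild_address are hypothetical and not part of any toolchain.

```c
#include <assert.h>
#include <stdint.h>

/* The two 16-bit immediates needed to build one 32-bit address. */
typedef struct {
    uint16_t hi;  /* immediate for MOVHI (the CPU shifts it left by 16) */
    int16_t  lo;  /* immediate for MOVEA (the CPU sign-extends it)      */
} split_address;

/* Recombine the halves the way the two instructions would. */
static uint32_t rebuild_address(split_address parts)
{
    return ((uint32_t)parts.hi << 16) + (uint32_t)(int32_t)parts.lo;
}

/* Split an address into MOVHI/MOVEA immediates. */
static split_address split_for_movhi_movea(uint32_t address)
{
    split_address parts;
    int32_t low = (int32_t)(address & 0xFFFFu);

    /* The low half is sign-extended, so values >= 0x8000 act as negative. */
    if (low >= 0x8000) {
        low -= 0x10000;
    }
    parts.lo = (int16_t)low;

    /* Compensate for a negative low half by rounding the high half up. */
    parts.hi = (uint16_t)((address + 0x8000u) >> 16);

    assert(rebuild_address(parts) == address);
    return parts;
}
```

When the high half comes out as zero, the address is reachable from R0 with the low half alone, which is the case the compiler cannot detect before the labels are resolved.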

That may be something that the linker can resolve with “whole-program-optimization”, but I’ve not been brave-enough to even try to compile the toolchain with that feature enabled.

The new patches are built with mingw64/msys2, and not cygwin, so they’re Windows-native programs.

In trying to clean-up the code so that I could understand (and debug) what was going on, I removed a bunch of pointless options that don’t make any sense to the PC-FX (or VirtualBoy), such as the long-call, long-jump, GHS, and app-regs. Hopefully nobody here cares about those.

I have a version that uses the old GCC 2.95 ABI (with the 16-bytes of stack reserved for r6-r9), and I just completed the transition to the new GCC ABI from 2010 that removes that redundant stack space.

My next task is to change the ABI even more so that I can get useful stack-frames and actually implement a working backtrace function for debugging.

So … I have a couple of technical questions for the assembly-capable developers here.

I’ve not seen the VirtualBoy SDK (and don’t particularly want to wade through it) … but are the V810’s registers R2 and R5 actually used in whatever VirtualBoy libraries you guys use?

Does the VirtualBoy have single-cycle RAM, or does it have wait-states that slow down RAM access?

Are you using any Nintendo binary-only libraries, or can you re-assemble/re-compile whatever libraries/engines that you’re using?

54 Replies

ElmerPCFX wrote:
Then you’d just have to fix the linker script to make sure that those sections really are put within the top 32KB of the cartridge.

Top 32KB? Is that because the displacement is signed? What if gp was set to the middle of WRAM? Then all variable accesses (minus constants of course) could be in the ZDA section. 🙂

blitter wrote:
Top 32KB? Is that because the displacement is signed?

Yes, the memory displacement is signed +/- 32KB.

So any negative references to R0 (which is 0) will wrap around to $FFFF8000-$FFFFFFFF (i.e. ROM space).
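The wrap-around can be checked with a few lines of C on a host machine. This is only a sketch of the address arithmetic; effective_address is a hypothetical helper, not something from the toolchain.

```c
#include <assert.h>
#include <stdint.h>

/* Model of register-relative addressing: base register plus a signed
 * 16-bit displacement, computed modulo 2^32. */
static uint32_t effective_address(uint32_t base, int32_t displacement)
{
    /* The displacement must fit in a signed 16-bit field. */
    assert(displacement >= -32768 && displacement <= 32767);
    return base + (uint32_t)displacement;
}
```

With a base of 0 (R0), a displacement of -32768 gives $FFFF8000 and a displacement of -1 gives $FFFFFFFF, so the negative half of the range lands in the top 32KB of the address space.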

You’ve just got to tell the linker that you’re actually running in that location, and then make sure that the compiler/assembler is generating R0-relative instructions, which means putting things in the ZDA section.

You’ll need to patch the linker/compiler (I’m not sure which one off-the-top-of-my-head) because it’s been set up for the PC-FX which has RAM at $00000000 and so starts allocating ZDA memory from 0 and not -32768.

What if gp was set to the middle of WRAM? Then all variable accesses (minus constants of course) could be in the ZDA section. 🙂

That’s exactly where it is supposed to be set.

If you look at any VirtualBoy games that shipped, you’ll almost-certainly see that they do this in their startup code.

If you look at the original V810 GCC patches that we’re all basing our changes on, which are at …

http://hp.vector.co.jp/authors/VA007898/pcfxga/develop/gcc.html

Or just look at the V850 directory in whatever version you’re using, and you’ll see this in the linker script …

binutils-2.24/ld/scripttempl/v850.sc

  .sdata ${SDATA_START_ADDR} :
  {
	${RELOCATING+PROVIDE (__gp = . + 0x8000);}
	*(.sdata)
  }

That’s so that 64KB of variables/data can be used in the SDA section, which is exactly where you VirtualBoy guys should be putting all your WRAM variables.
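The effect of the 0x8000 bias can be sketched in C. The helpers gp_for_section and reachable_from_gp are hypothetical, and the WRAM base of $05000000 used in the note below is an assumption for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Mirror of "__gp = . + 0x8000": point gp at the middle of the section. */
static uint32_t gp_for_section(uint32_t sdata_start)
{
    assert(sdata_start <= 0xFFFFFFFFu - 0x8000u);
    return sdata_start + 0x8000u;
}

/* Non-zero if a signed 16-bit displacement from gp can reach the address. */
static int reachable_from_gp(uint32_t gp, uint32_t address)
{
    /* Distance from the lowest reachable address (gp - 32768). */
    uint32_t offset = address - (gp - 0x8000u);
    return offset < 0x10000u;
}
```

With the section starting at $05000000, gp becomes $05008000 and every address from $05000000 to $0500FFFF is reachable, which is the full 64KB rather than only the 32KB above the start.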

You can get the compiler to do this automatically with the “-msda=32768” switch, although you might want to use a smaller value, and/or just force specific things in there with “__attribute__ ((sda))”.

This is supported with the GCC 2.95 patches, and so I have no idea why the folks that converted the original PC-FX patches for the VirtualBoy didn’t use that capability.

***********

If you look elsewhere in the linker script you’ll see …

  .tdata ${TDATA_START_ADDR} :
  {
	${RELOCATING+PROVIDE (__ep = .);}
	*(.tbyte)
	*(.tcommon_byte)
	*(.tdata)
	*(.tbss)
	*(.tcommon)
  }

That’s for the R30-relative addressing on the V850 in the TDA section.

I’ve changed that in the new V810 patches so that you can get fast access to another 64KB of variables/data using the R5 (aka TP) register that was otherwise unused.

That could be another useful area for fast access to stuff in ROM on the VirtualBoy.

ElmerPCFX wrote:

You’ll need to patch the linker/compiler (I’m not sure which one off-the-top-of-my-head) because it’s been set up for the PC-FX which has RAM at $00000000 and so starts allocating ZDA memory from 0 and not -32768.

I checked on this, and my previous message was wrong.

Everything seems to be working fine without any further mods to the compiler source … it’s all up to the linker script putting things in the right location, and to the user marking variables/data as belonging in the SDA/ZDA/TDA segments.

An example of doing this in both assembly and C is shown in the test interrupt code earlier in this thread.

dasi wrote:

My GCC 4.7 port hasn’t received enough widespread testing yet to see what bugs I’ve introduced … but Alex Marshall’s “liberis” examples all work properly, and another user has ported a simple shoot-em-up to the PC-FX with no apparent problems.

. . .

If someone is interested in being the “goto guy” for a VirtualBoy version of my patches, then I’d love to hear it.

I’d be happy to help with that and put a build together for testing. There are a few reasonably large Virtual Boy projects around which should give your patches a good workout. 🙂

dasi

Well, dasi has had copies of my GCC patches for 2 months now with no feedback, and he hasn’t logged in here since early May.

If anyone else is interested in helping by testing them out on the VB, now’s the time to step forward!

Hi ElmerPCFX. I’d gladly try your patches. I was trying some of the things that you’ve talked about in this thread on gcc 4.4 without much success. Anyway, I’m curious about whether or not I will be able to squeeze some performance out of VBJaEngine by using your patches.

BTW: some features that we use and may or may not be affected are:

1) Recursive calls
2) Variadic functions

(Excuse my lack of ASM/Compiler knowledge if these are not related at all to your patches.)

I’ll send you a PM soon, hopefully tomorrow.

Just got to get something working in a different project and I don’t want to lose track of what I’m doing.

Recursion and variadic functions should be working fine in the new compiler.

Ensuring that “printf” worked properly was one of my big requirements.
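For anyone who wants to check those two features on their own build, a smoke test could look something like this. It is only a sketch in standard C; sum_ints and factorial are hypothetical names, not part of the patches or of any engine.

```c
#include <assert.h>
#include <stdarg.h>

/* Variadic function: adds up 'count' int arguments. */
static int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    assert(count >= 0);
    va_start(args, count);
    for (int i = 0; i < count; ++i) {
        total += va_arg(args, int);
    }
    va_end(args);
    return total;
}

/* Recursive function: each call needs its own stack frame. */
static unsigned factorial(unsigned n)
{
    return (n <= 1u) ? 1u : n * factorial(n - 1u);
}
```

Both depend on the calling convention and stack layout being right, so they are a cheap first check after any ABI change.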

Best wishes,

John

jorgeche wrote:
I’d gladly try your patches.

Thanks to jorgeche’s incredibly generous help, we’ve found and fixed 2 bugs in the compiler toolchain, 1 serious misunderstanding on my part, and 1 stupid typo.

VBJaEngine and the Platformer Demo are now compiling and running with GCC 4.7.4. 😀

And thanks to ElmerPCFX, a couple of BIG issues with the VBJaEngine’s architecture were found and fixed.

As he said, and besides a couple of small fixes that are pending on my side, the engine and the demo run properly with GCC’s latest version, and they compile way faster too!

It was a great collaboration that I hope will help both communities. 🙂

jorgeche

As folks here may know, VBDE and VUEngine have been using my GCC 4.7 toolchain for a couple of years now, and so I consider it stable-enough for a public release.

I’ve created a project on github for the patches and build script, and have confirmed that you can build the toolchain cleanly on both Windows and Linux.

You can find the project here … v810-gcc

For the adventurous, you can now compile C++ code.

Limitations are that there is no STL, and exceptions are not supported … but that isn’t much of a practical problem on something like the VirtualBoy (where you would be well-advised to limit yourself to the Embedded C++ language subset).

You need to do a couple of things in order to get compiled C++ source linking correctly, and there is an example of that in the “examples” directory of the “liberis” project for the NEC PC-FX console, also on my github.

Have fun!

Awesome, ElmerPCFX! I am glad these patches are finally out! I guess we now have a foot in the door for Linux (and possibly OSX) versions of VBDE. 🙂

KR155E wrote:
I guess we now have a foot in the door for Linux (and possibly OSX) versions of VBDE. 🙂

Yep, I had to change/fix a bunch of things to get the Linux compilation working again, and those should probably help the compilation on OSX.

I also fixed (or at-least tried to fix) a couple of bugs in GCC 4.7 that compiling on the latest GCC brought to light … and generally cleaned-up the code to fix the warnings (or silence them if they couldn’t be fixed).

I’ve found a site talking about how to compile GCC on OSX, and I may take a look at it myself if somebody doesn’t beat me to it!

I really hope that you guys take a look at using the C++ support in VUEngine since you’re already using an object-oriented design.

Using basic Embedded C++ language features should allow you to make your code a lot more readable/debuggable than using all of those cunning-but-fragile macros.

Yep, I had to change/fix a bunch of things to get the Linux compilation working again, and those should probably help the compilation on OSX.

I’ve found a site talking about how to compile GCC on OSX, and I may take a look at it myself if somebody doesn’t beat me to it!

Hey Elmer!

I already compiled it on macOS Mojave, but I didn’t really document the process and it was kind of messy and hackish since I even had to modify one of the config.guess files, among other things. Anyway, I’m still porting our preprocessor before being able to successfully build any of our demos, so I’m not 100% sure that the compiler is working fine.

I really hope that you guys take a look at using the C++ support in VUEngine since you’re already using an object-oriented design.

Using basic Embedded C++ language features should allow you to make your code a lot more readable/debuggable than using all of those cunning-but-fragile macros.

The plan is to evaluate, in the medium term, whether it is feasible given the complexity and memory requirements of the engine. But right now we want to concentrate on implementing a complete game instead of continuing to push developer features. Anyway, thanks to the preprocessor, the code is around 85% based on C++’s syntax (the only missing syntax feature is proper method calling), so that part of the port should be really easy. The hard part will be, first, customizing the memory handling. I don’t have enough experience with the internals of the C++ compiler and linker, so I don’t know whether it is possible to instruct them to allocate the virtual tables in specific memory regions, or even whether they can just be computed at compile time and placed in the ROM area (we load them in DRAM instead of WRAM because they amount to more than 10 KB and they are populated at runtime). Second, we have to handle the allocation of dynamic objects, although I guess we just have to override the new operator and use our own MemoryPool to avoid using the heap.
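The allocation idea can be sketched as a fixed-block pool. This is written in C, it is not the engine’s actual MemoryPool, and every name and size here (pool_init, pool_alloc, pool_free, POOL_BLOCK_SIZE, POOL_BLOCK_COUNT) is hypothetical. A C++ port would presumably have an overridden operator new hand out blocks from something similar.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_BLOCK_SIZE  32  /* bytes per block (assumed value)  */
#define POOL_BLOCK_COUNT 8   /* blocks in the pool (assumed value) */

/* A free block stores the link to the next free block in its own bytes. */
typedef union pool_block {
    union pool_block *next_free;
    unsigned char payload[POOL_BLOCK_SIZE];
} pool_block;

static pool_block pool_storage[POOL_BLOCK_COUNT];  /* statically reserved */
static pool_block *pool_free_list;

/* Thread every block onto the free list. */
static void pool_init(void)
{
    pool_free_list = NULL;
    for (size_t i = 0; i < POOL_BLOCK_COUNT; ++i) {
        pool_storage[i].next_free = pool_free_list;
        pool_free_list = &pool_storage[i];
    }
}

/* Pop one block, or return NULL when the pool is exhausted. */
static void *pool_alloc(void)
{
    pool_block *block = pool_free_list;
    if (block != NULL) {
        pool_free_list = block->next_free;
    }
    return block;
}

/* Push a block back onto the free list; NULL is ignored. */
static void pool_free(void *pointer)
{
    pool_block *block = pointer;
    if (block == NULL) {
        return;
    }
    assert(block >= pool_storage && block < pool_storage + POOL_BLOCK_COUNT);
    block->next_free = pool_free_list;
    pool_free_list = block;
}
```

Allocation and release are both constant-time and never touch the heap, which is the property wanted here.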

jorgeche


jorgeche wrote:

I already compiled it on macOS Mojave, but I didn’t really document the process and it was kind of messy and hackish since I even had to modify one of the config.guess files, among other things.

Congrats! I hope that the changes that I made to get it compiling on Linux will make it easier for you to build it on OSX in the future.

The hard part will be, first, customizing the memory handling. I don’t have enough experience with the internals of the C++ compiler and linker, so I don’t know whether it is possible to instruct them to allocate the virtual tables in specific memory regions, or even whether they can just be computed at compile time and placed in the ROM area (we load them in DRAM instead of WRAM because they amount to more than 10 KB and they are populated at runtime). Second, we have to handle the allocation of dynamic objects, although I guess we just have to override the new operator and use our own MemoryPool to avoid using the heap.

VTables are always constructed at compile-time. For speed on the VirtualBoy, you’re going to want to have the linker put them in either the ZDA section, or the TDA section so that they can be addressed in a single instruction relative to the R0 or R5 registers.

Putting them in the R0 section would be best (IMHO), at the top of your mirrored ROM space so that they show up in the $FFFF8000-$FFFFFFFF region. But, IIRC, that would mean that you’d have to do some fiddling about with your linker script.

Even then, I’m not sure that VTables are marked as register-relative by the compiler … that may need another hack into the GCC source code.
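To show what “constructed at compile-time” means in practice, here is a hand-written equivalent in C: a const table of function pointers with a constant initializer, which a compiler can normally place in read-only data instead of filling in at runtime. The types and names (shape, shape_vtable, rectangle, shape_area) are hypothetical, and the sketch says nothing about which section the V810 port would actually choose.

```c
#include <assert.h>
#include <stddef.h>

typedef struct shape shape;

/* The "vtable": one function pointer per virtual method. */
typedef struct shape_vtable {
    int (*area)(const shape *self);
} shape_vtable;

/* Every object starts with a pointer to its class's table. */
struct shape {
    const shape_vtable *vtable;
};

typedef struct rectangle {
    shape base;  /* first member, so a rectangle* can be used as a shape* */
    int width;
    int height;
} rectangle;

static int rectangle_area(const shape *self)
{
    const rectangle *r = (const rectangle *)self;
    return r->width * r->height;
}

/* Constant initializer: nothing needs to be populated at startup. */
static const shape_vtable rectangle_vtable = { rectangle_area };

/* Virtual dispatch: one load of the table pointer, one indirect call. */
static int shape_area(const shape *self)
{
    assert(self != NULL && self->vtable != NULL);
    return self->vtable->area(self);
}
```

A rectangle initialised as { { &rectangle_vtable }, 3, 4 } and passed to shape_area through its base member dispatches to rectangle_area without any runtime table setup.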

The recent version of the patches has already started the process of making it possible to distinguish between the VirtualBoy and the PC-FX, so that I can take advantage of your system’s ability to load unsigned bytes-and-words without masking off the high bits afterwards.
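The masking that can be skipped looks like this when modelled in C on a host machine. It is a sketch of the arithmetic only; the three helper names are hypothetical and no particular V810 instruction is implied.

```c
#include <stdint.h>

/* Model of a byte load that sign-extends into a 32-bit register. */
static uint32_t load_byte_sign_extended(const uint8_t *address)
{
    int32_t value = *address;
    if (value >= 0x80) {
        value -= 0x100;  /* bit 7 set: the upper 24 bits become ones */
    }
    return (uint32_t)value;
}

/* Unsigned read built from the sign-extending load: needs an extra AND. */
static uint32_t load_unsigned_with_mask(const uint8_t *address)
{
    return load_byte_sign_extended(address) & 0xFFu;
}

/* Unsigned read when the hardware can zero-extend directly: no AND. */
static uint32_t load_unsigned_directly(const uint8_t *address)
{
    return *address;
}
```

The last two functions return the same value for every input, so a target that can zero-extend on load saves one instruction per unsigned byte or halfword read.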

FWIW, I have just updated the V810-GCC project patches and build scripts on github to work with the current GCC v10 compiler that is used on recent distributions of Msys2 and Linux.

Enjoy! 🙂

 
