Original Post

Is there a way to use the optimization options with gccVB? Both versions I have (1.0 and the one distributed with the latest VBDE) give me linker errors when I use them:

Version 1.0:

C:\DOCUME~1\USER\LOCALS~1\Temp\ccQZaaaa.o: In function `main':
C:\DOCUME~1\USER\LOCALS~1\Temp\ccQZaaaa.o(.text+0x81a): undefined reference to `_save_r28_r31'
C:\DOCUME~1\USER\LOCALS~1\Temp\ccQZaaaa.o(.text+0x81a): relocation truncated to fit: R_V810_26_PCREL _save_r28_r31

VBDE version:

C:\DOCUME~1\USER\LOCALS~1\Temp\ccrWXelX.o: In function `main':
GAME.c:(.text+0x84a): undefined reference to `_save_r27_r31'
GAME.c:(.text+0x84a): relocation truncated to fit: R_V810_26_PCREL against undefined symbol `_save_r27_r31'
collect2: ld returned 1 exit status
make.exe: *** [GAME.o] Error 1
Could Not Find W:\GAME\GAME.o

Also, how would I link assembly code (or code produced by another compiler) with C code and what tools do I have to use?

19 Replies

Are you using the linker script dasi published?

I’m just using whatever came with the releases.

Make sure you have the linker script from this thread http://www.planetvb.com/modules/newbb/viewtopic.php?topic_id=4483&post_id=20963#forumpost20963 referenced in your makefile.

Still getting the same error. Yes, I did run crt0_make.bat.

I’m not going to pretend to be an expert on compiler options, but I reproduced the error you were getting and seem to have worked around it by setting the optimization flags manually. Below is my gcc command with all the flags that -O would normally enable (minus the ones this version doesn’t support). This is what I now have in my makefile for the VBDE, and with it I was able to compile the program I’ve been working on without getting those errors.

All the flag information I got from

http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

v810-gcc.exe -Wall -fomit-frame-pointer -funit-at-a-time -ftree-ter -ftree-sra -ftree-fre -ftree-dse -ftree-copyrename -ftree-dominator-opts -ftree-ch -ftree-builtin-call-dce -fsplit-wide-types -fmerge-constants -fmerge-constants -fipa-reference -fipa-pure-const -fif-conversion -fif-conversion2 -fguess-branch-probability -fdse -fdefer-pop -fdce -fcprop-registers -fauto-inc-dec -Xassembler -a=game.lst -nodefaultlibs -mv810 -Tc:/vbde/gccvb/v810/lib/vb.ld -xc -o game.o game.c

This works great. Thank you very much.

I measured a difference of about 3K between optimized and unoptimized ROM images, probably because of unused libgccvb functions.

HorvatM wrote:

Also, how would I link assembly code (or code produced by another compiler) with C code and what tools do I have to use?

If you’re using GCC assembly, you can just specify the .s file as one of the input files and it should be assembled and linked automatically, depending on the options specified; e.g. it won’t work if you use -x to turn off automatic language detection.

If it’s a binary produced by another compiler (or by hand) it has to be in a format gcc understands (couldn’t find a list, but ELF, COFF and possibly a.out should work). If it’s just a raw binary, you could turn it into an array and use an asm block to jump into it, but it would have to be position-independent code (i.e. using relative jumps). What non-gcc tools do you have that produce VB (NVC/v810) machine code?

RunnerPack wrote:
If you’re using GCC assembly, you can just specify the .s file as one of the input files and it should be assembled and linked automatically, depending on the options specified; e.g. it won’t work if you use -x to turn off automatic language detection.

That’s cool, but I’m planning to write my own assembler and I don’t like GCC syntax much (that’s not the only reason for writing it though). But it shouldn’t be too hard to write a converter from my syntax to GCC syntax.

What non-gcc tools do you have that produce VB (NVC/v810) machine code?

None [yet], but reading the veCC readme made me think about writing a compiler for BASIC or another language (even though I’m not an expert on compilers). But there isn’t any demand for one anyway.

For all of us using VBDE, I think we need to revisit the optimization errors received when adding -O to the gcc compiler options. This last coding competition taught me something extremely important. The old compiler (which at least DanB is still using) works with the -O compiler option. IT MAKES A HUGE DIFFERENCE in code performance. The options I found and added in this post don’t even begin to touch the performance of the old compiler with -O. I compiled my wireframe library using the old compiler (v2.95) and my code instantly ran 4-5 TIMES faster than the same code using the assembly versions of my functions on the compiler packaged with VBDE.

Unfortunately I’m a Linux geek and compiling on my system isn’t going to benefit too many people other than myself.

I tried once to compile gcc using patches that I found on this site from other posts but have never been successful. There’s always something that comes up that I have no idea how to fix. (Of course I’m a total amateur at C and C++ so that’s not really saying much).

Thanks to Thunderstruck for compiling DanB’s zpace racers and pointing out on the forum that it ran much slower than DanB’s demo. That little nugget directed me to try my code on the old compiler which I had never tried before.

For now, I’ll be switching back to the old compiler version because I can basically eliminate all of my assembly routines and code directly in C, since there is no noticeable performance difference between C compiled with -O and the assembly versions of my functions.

Greg Stevens wrote:
Thanks to Thunderstruck for compiling DanB’s zpace racers and pointing out on the forum that it ran much slower than DanB’s demo. That little nugget directed me to try my code on the old compiler which I had never tried before.

For now, I’ll be switching back to the old compiler version because I can basically eliminate all of my assembly routines and code directly in C, since there is no noticeable performance difference between C compiled with -O and the assembly versions of my functions.

So it was just the newer compiler slowing things down? I guess I need to check that out.

Do you have vbde running with the old compiler? Could you maybe post it?


I just moved my project folder and manually compiled it. I didn’t try to add it to VBDE. I’ll be messing with it more this week at some point.

Greg,

After I read your post, I took another look at gcc 2.95. I found out the latest 2.x version is actually 2.95.3, and that there were a few changes that seem to be worth having. There is also a binutils 2.10.1, but I couldn’t find a list of what was changed. I got the code for those versions (actually, I got 2.95.2/2.10 and their respective patches) and applied the v810 patches (it only took a bit of manual tweaking).

I haven’t compiled them yet, because I wanted to look into what it would take to add the NVC-specific instructions from M.K. and/or blitter’s 4.x patches, as well as the register alias stuff blitter mentions here (hopefully without causing the same slow-down and “jal” problems).

I also need to gather every linker script and crt0 version, and find out which of each to use (or whether to make a new version of either) to make sure things like interrupts work correctly.

KR155E just sent me an updated version of VBDE with all patches so I’m going to give it a try and see if it works.

I’ll post after I’ve tested a bit.

Tried the version KR155E sent and it didn’t work. The errors show up as soon as you try to add -O to the compiler options in the makefile. The functions it can’t find are in lib1funcs.asm, so I’m guessing this file is missing from a patch somewhere, since the definitions can’t be found.

Here are two test ROMs to show the difference in performance. It’s 4 rotating spheres and I use an interrupt timer. I can only run them in mednafen, and I know that mednafen timings aren’t accurate to real hardware, but they should be accurate to the number of instructions being executed. VBDE.vb is built with the current compiler and has a timer count around 10,000. V295.vb is the same code built with the old compiler and it runs around 2,000-3,000. Also, Ben tested my Star Fox demo compiled on the old compiler and said it ran much better on the actual hardware.

I can post this test code as well if anybody thinks it’ll help.


Thinking out loud, here: would it be possible to “post-process” the ROMs made by gccVB 4.x to fix the badly generated code?

Some of it yes, some of it no. I’ve thought of this– in fact I wrote a tool that post-processes my ROM to poke addresses directly into the assembly, saving runtime lookups– but any processing that adds or removes instructions would be non-trivial since that would throw any other instructions that deal with addresses completely out of whack.
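
For illustration only, here is a minimal C sketch of that kind of in-place patching. It is not the poster's actual tool. It assumes the address is loaded by an adjacent movhi/movea pair (the pattern quoted later in this thread), that each is a 32-bit instruction whose 16-bit immediate sits in the second little-endian halfword, and that movea sign-extends its immediate, which is why the high half is rounded up. The function names and the offset convention are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* High half for movhi, adjusted because movea adds a sign-extended low half. */
uint16_t hi16(uint32_t addr)
{
    return (uint16_t)((addr + 0x8000u) >> 16);
}

/* Low half for movea. */
uint16_t lo16(uint32_t addr)
{
    return (uint16_t)(addr & 0xFFFFu);
}

/* Store a 16-bit value in little-endian byte order. */
void put_u16_le(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFFu);
    p[1] = (uint8_t)(v >> 8);
}

/*
 * Hypothetical helper: overwrite the immediates of a movhi/movea pair that
 * starts at byte offset insn_off in the ROM image (assumed layout: two
 * 4-byte instructions, immediate in bytes 2..3 of each).
 * Returns 0 on success, -1 if the pair would not fit inside the image.
 */
int patch_address_load(uint8_t *rom, size_t rom_size,
                       size_t insn_off, uint32_t addr)
{
    if (rom_size < 8 || insn_off > rom_size - 8)
        return -1;
    put_u16_le(rom + insn_off + 2, hi16(addr)); /* movhi immediate */
    put_u16_le(rom + insn_off + 6, lo16(addr)); /* movea immediate */
    return 0;
}
```

Because only immediates are overwritten and no instruction is added or removed, every other address in the image stays valid, which is the easy case described above.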

I HATE to be the one to bring this up, but perhaps it’s time that some of us take a look at GCC internals to see what’s going wrong? I’m taking a bit of a break from VB coding (call it “guilt that I’m letting my other code rot”) anyway, and I probably could take a look if I had some code that is known to generate bad jumps.

cr1901 wrote:
I HATE to be the one to bring this up, but perhaps it’s time that some of us take a look at GCC internals to see what’s going wrong? I’m taking a bit of a break from VB coding (call it “guilt that I’m letting my other code rot”) anyway, and I probably could take a look if I had some code that is known to generate bad jumps.

If somebody wanted to take a look at the GCC 4 patches, I know of a few spots that look suspicious:

– The “return "movhi hi(%1),%.,%0\n\tmovea lo(%1),%0,%0";” in output_move_single, line 2255 or thereabouts in gcc-4.4.2-vb.patch. 32-bit loads are always encoded this way, even if the high word doesn’t change between consecutive loads. This line also shows up elsewhere in that function for handling other such loads.

– “sprintf (buff, "mov r31,r10\n\tmovhi hi(%s), r0, r11\n\tmovea lo(%s), r11, r11\n\tjal .+4\n\tadd 4, r31\n\tjmp r11", name, name);” in construct_save_jarl, line 3797 or so also in gcc-4.4.2-vb.patch. While this isn’t bogus code (it works), I don’t think we need to be doing long jumps in this way, since jal takes a signed 26-bit displacement, which if my math is correct means a jump of up to almost 32 MB in either direction (a 64 MB window), well more than we need on the VB.

– Prologue and epilogue function generation. Building with -O3 or -Os in gccVB 4.8 (part of dasi’s devkitV810 WIP) is totally broken here, generating unnecessary epilogue functions that clobber lp, leading to subroutines that in my testing return to address 0 (the first framebuffer), causing a crash. This can also happen occasionally in gccVB 4.4.2, though I haven’t been able to create a minimal example yet.

– Lines 837-849 in binutils-2.20-vb.patch, beginning with “HOWTO (R_V810_9_PCREL,”: This might be what’s causing the bad jump logic, creating relative jump addresses that are multiples of 0x400000 from what they should be, eventually wrapping around to the beginning of the address space and crashing. Not sure why the entry for 9-bit branches uses 26 for bitsize and type ‘long.’ I posted some code that exhibits this bug a while ago– http://www.planetvb.com/modules/newbb/viewtopic.php?post_id=26069#forumpost26069 — would be happy if somebody took a good hard look at what’s going on. (EDIT: It might just be a linker order problem, but would be nice to know for sure.)
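
To make the displacement sizes in the list above concrete, here is a small C sketch of the range arithmetic for signed PC-relative fields. It assumes plain two's-complement fields of 1 to 31 bits and ignores halfword alignment. The function names are made up for illustration and do not come from the gcc or binutils patches.

```c
#include <stdint.h>

/* Smallest value a signed field of `bits` bits can hold (1 <= bits <= 31). */
int32_t disp_min(unsigned bits)
{
    return -(int32_t)(1u << (bits - 1));
}

/* Largest value a signed field of `bits` bits can hold. */
int32_t disp_max(unsigned bits)
{
    return (int32_t)(1u << (bits - 1)) - 1;
}

/* Nonzero if `disp` can be encoded in a signed field of `bits` bits. */
int disp_fits(int32_t disp, unsigned bits)
{
    return disp >= disp_min(bits) && disp <= disp_max(bits);
}

/* Sign-extend the low `bits` bits of `raw`, as a decoder would. */
int32_t sign_extend(uint32_t raw, unsigned bits)
{
    uint32_t mask = (1u << bits) - 1u;
    uint32_t sign = 1u << (bits - 1);
    raw &= mask;
    return (int32_t)(raw ^ sign) - (int32_t)sign;
}
```

With these definitions a 26-bit field covers -33554432 to 33554431, roughly 32 MiB in each direction, while a 9-bit field covers only -256 to 255. A relocation entry that declares the wrong field width would therefore accept or write displacements the instruction cannot actually encode.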

🙂

 
