Select Software or Hardware Floating Point depending on your ARM Microcontroller
Software or Hardware Floating Point [depending on your ARM Microcontroller]? That is the question!
The LPC2106 version of the SRV1 robot is no longer supported, because the chip has limited resources for networking and video capture on two UARTs. I think the newly released LPC24XX is a good alternative to the LPC2106. The LPC24XX has more memory expansion options and two separate APB buses with general-purpose DMA, which increases data throughput for WiFi and video capture. The ARM version has been abandoned anyway; the Blackfin won out for the new platform. However, I still consider the LPC version of the SRV1 robot a good open source project for educational purposes.
I downloaded all the design documentation for the SRV1 ARM7 project. The project suggests that developers use GNUARM to compile it. However, I found that the GNUARM toolchain has a version compatibility issue with my existing Cygwin installation, which I need for another MIPS project. I hate this, because it means I have to make a choice. I already had WinARM installed for an LPC2000 project. WinARM uses MinGW, so it works like a native Windows compiler. So I tried to use WinARM to build the SRV1 project.

The latest firmware is srv-053007.rar. Running make fails at the link step with this error:
arm-elf-gcc main.o uart0.o uart1.o jpeg.o camera.o iap.o utils.o startup/libea_startup_thumb.a -I./startup -mcpu=arm7tdmi -mthumb-interwork -nostartfiles -T build_files/link_64k_128k_rom.ld -o srv1.elf -Wl,-Map=srv1.map,--cref
c:/winarm/bin/../lib/gcc/arm-elf/4.1.2/../../../../arm-elf/bin/ld.exe: ERROR: startup/libea_startup_thumb.a(framework.o) uses software FP, whereas srv1.elf uses hardware FP
I inspected the makefiles, which are spread across several folders. I modified \build_files\general.mk, startup\makefile and the makefile in the unzipped root. Since the ARM7 has no hardware FPU, I added -msoft-float to the arm-elf-gcc options in EFLAGS. After that, the same software versus hardware FP errors were reported between libc.a on one side and libea_startup_thumb.a and srv1.elf on the other. Now I was totally confused, because the GCC ld reported that libea_startup_thumb.a and srv1.elf were using hardware FP, while libc.a was using software FP! Should I change GCC and libc?
I don't think I am the only person who has had to handle this issue, so I googled it. It is obviously a common issue when building projects for ARM7, and even for other embedded platforms. I was lucky enough to find another person with the same issue on the same project with the same toolchain. Unfortunately the author of WinARM seemed to have no clue about it. In the end, the problem was solved and answered by an experienced engineer (Mr. Clifford Slocombe). I quote his answer below:
Soft Floating Point or Hard Floating Point?
The default is to use -mhard-float, but this *does not* mean that you must have a hardware FPU. An FPU instruction without an FPU present causes an invalid instruction exception, and the exception handler emulates the hardware floating point operation in software (see gnu.org for details). The idea is that the same code will work with or without an FPU. With -msoft-float, you would never see any performance increase when an FPU is present because it would not be used.
That said, the only off-the-shelf ARM micro-controller with floating point hardware currently available is the NXP LPC3180, but it uses the VFP co-processor unit rather than the older FPA co-processor. The default Newlib builds are for the FPA only; you would have to rebuild it yourself to get VFP support. The consequence of this is that, without a rebuild, library FP functions (libm.a) will be software emulated even when VFP hardware is available.
The solution is simple: don't use -msoft-float. You don't need it, and it is not always supported. There may be a marginal performance improvement from not having to do the mode switch on exception, but it is not significant - it is dog slow either way! Consider avoiding floating point altogether. And as a general rule, do not use any non-default compiler options unless you are sure exactly what they do!
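As an illustration of the advice to avoid floating point altogether, here is a minimal fixed-point sketch. It is not taken from the SRV1 firmware or from the quoted answer; the Q16.16 layout and the names q16_t, q16_from_int, q16_mul and q16_div are assumptions chosen only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical Q16.16 fixed-point type: 16 integer bits, 16 fractional bits. */
typedef int32_t q16_t;

/* The value 1.0 in Q16.16. */
#define Q16_ONE ((q16_t)65536)

/* Convert a small integer to Q16.16 (the caller keeps |n| below 32768). */
static q16_t q16_from_int(int32_t n)
{
    return (q16_t)(n * Q16_ONE);
}

/* Multiply two Q16.16 values; the 64-bit intermediate avoids overflow.
 * The result is truncated toward zero. */
static q16_t q16_mul(q16_t a, q16_t b)
{
    return (q16_t)(((int64_t)a * (int64_t)b) / Q16_ONE);
}

/* Divide two Q16.16 values; the divisor must not be zero. */
static q16_t q16_div(q16_t a, q16_t b)
{
    assert(b != 0);
    return (q16_t)(((int64_t)a * Q16_ONE) / b);
}
```

With helpers like these, 3 * 0.5 becomes q16_mul(q16_from_int(3), Q16_ONE / 2), and only integer instructions are involved. The price is a fixed range and resolution, which has to be checked against the application.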
So I cleaned the user library libea_startup_thumb.a and srv1.elf, and rebuilt them without any extra options. That solved the problem. I also found the root cause: libea_startup_thumb.a had been prebuilt with GNUARM, and I was trying to link it with a newly built srv1.elf. When I then added the extra options and rebuilt everything, the new library and ELF file were reported as errors because libc.a is prebuilt in WinARM. Once I removed the options, everything was consistent and the error disappeared.
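One way to spot this kind of mismatch earlier is to have each library report which floating-point model it was compiled for. The sketch below is only an assumption about how that could be done: the function name fp_build_model is made up for this example, and it relies on GCC's predefined macros __SOFTFP__ and __VFP_FP__, whose exact behaviour differs between GCC versions. Running arm-elf-gcc -dM -E on an empty file with your own options shows what your toolchain really defines.

```c
#include <stddef.h>

/* Hypothetical helper: describe the floating-point model that this
 * translation unit was compiled for, based on GCC's predefined macros.
 * The macro checks are an assumption about the toolchain in use. */
const char *fp_build_model(void)
{
#if defined(__arm__) && defined(__SOFTFP__)
    /* Built with -msoft-float (or -mfloat-abi=soft). */
    return "ARM, software floating point";
#elif defined(__arm__) && defined(__VFP_FP__)
    /* Doubles use the VFP format, e.g. -mfpu=vfp -mfloat-abi=softfp. */
    return "ARM, VFP floating point format";
#elif defined(__arm__)
    /* Neither macro is defined: assumed to be the old FPA default. */
    return "ARM, FPA floating point (assumed default)";
#else
    return "not an ARM build";
#endif
}
```

Printing this string from the startup library and from the application over a UART would show whether both were built the same way before the linker complains.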
Yes, I was lucky to solve it. I believe many people out there are still wondering about it.
LPC3250 with hardware VFP
Up to now, I have found two ARM9 silicon families with a hardware FPU. One is the EP9315 from Cirrus Logic, which has the MaverickCrunch FP co-processor. The other is the LPC3180 from NXP Semiconductors, with a vector floating point (VFP) unit. The LPC3180 has been revised as the LPC3250, with more peripherals. Phytec offers an SDK for the LPC3180 on its site, as well as a new cross toolchain with VFP support and Newlib. I cannot find any application note for this toolchain. Here is another note from Clifford on evaluating the LPC3180 using GCC 4.0.2 (WinARM).
To build VFP code use the options -mfpu=vfp -mfloat-abi=softfp
Philips' claim of 5x performance improvement for floating point operations over software floating point seems about fair. For algorithms heavily dependent on multiply-accumulate operations, or square-root, performance may be even more drastically improved as the VFP has instructions for these operations. The compiler will correctly translate statements such as fx += fy * fz into a single FP MAC instruction.
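To make the multiply-accumulate remark concrete, here is a small sketch of the kind of loop it refers to. The function name dot_product is an assumption for this example and does not come from the quoted note. Whether the compiler really emits a VFP multiply-accumulate instruction for it depends on the GCC version and options, so check the generated assembly (for example with -S) rather than taking it for granted.

```c
#include <stddef.h>

/* Dot product written as a plain multiply-accumulate loop.
 * Each iteration has the form acc += x * y, the same pattern as
 * fx += fy * fz in the text above. */
float dot_product(const float *a, const float *b, size_t n)
{
    float acc = 0.0f;
    size_t i;

    for (i = 0; i < n; i++) {
        acc += a[i] * b[i];   /* candidate for one FP MAC instruction */
    }
    return acc;
}
```

For the vectors {1, 2, 3} and {4, 5, 6} the function returns 32.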
I have seen no evidence that the code generation employs auto-vectorization, even when the appropriate options are set. It is still not clear to me what platforms apart from PowerPC G5 this has been implemented on.
On the Nohau board, for single precision code, the performance with code and data running from external SDR SDRAM was about half of that when running from internal RAM, and worse for double precision code. This is because the external bus runs at half the CPU clock rate. The performance using DDR SDRAM would presumably be comparable to the internal RAM performance. Enabling the instruction and data caches has a dramatic effect on performance, especially for short algorithms working on small data sets.
None of the default Newlib builds support the VFP; you will need to rebuild the library specifically. Newlib does not include VFP specific code for sqrt(), so it will still use a software algorithm rather than a single instruction, although the individual operations in the algorithm would still use hardware floating point of course. To realize the advantage of the VFP square-root instruction you'd need to code your own in assembler or modify Newlib.
The GDB and Insight debuggers are not VFP aware. VFP has a 'natural' byte order for double precision values rather than the rather unconventional 'cross-endian' format used by the older ARM FPA unit and by GCC for softfp. Consequently the debugger displays doubles incorrectly; single precision values work correctly, however.
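The difference between the two double formats comes down to the order of the two 32-bit words. The sketch below shows how a raw 64-bit pattern read in the wrong word order could be reinterpreted. It assumes a little-endian target and IEEE 754 doubles, and the function names are made up for this example; it is not part of GDB, Insight or Newlib.

```c
#include <stdint.h>
#include <string.h>

/* Swap the two 32-bit words of a 64-bit pattern. */
uint64_t swap_double_words(uint64_t bits)
{
    return (bits << 32) | (bits >> 32);
}

/* Hypothetical helper: take the raw bits of a double whose two words
 * are stored in the opposite order (FPA style versus VFP style) and
 * return the value it represents in the host's natural order.
 * Assumes the host uses IEEE 754 doubles. */
double double_from_swapped_bits(uint64_t swapped_bits)
{
    uint64_t bits = swap_double_words(swapped_bits);
    double value;

    /* memcpy avoids pointer-cast aliasing problems. */
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

For example, 1.0 has the natural bit pattern 0x3FF0000000000000. Read with the words exchanged it appears as 0x000000003FF00000, and double_from_swapped_bits() turns that back into 1.0.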
When -mfpu=vfp is used in conjunction with optimization, the compiler will seg. fault when certain code constructs are encountered. This will for example prevent a Newlib build from completing. I have reproduced the error with the following test code:
double test( double x ) // Line 1
{ // Line 2
return (x >= 0 ? 0 : -1.0); // Line 3
} // Line 4 - error reported here.
when compiled with:
arm-elf-gcc -c -O2 -mfpu=vfp -mfloat-abi=softfp test.c
Rewriting the code thus:
double test( double x )
{
double y ;
if( x >= 0 )
y = 0 ;
else
y = -1.0 ;
return ( y );
}
seems to work around the problem. I have encountered other constructs that exhibit this error, but in all cases I have been able to recode an equivalent that solves the problem. I raised a bug report at http://gcc.gnu.org/bugzilla; the response was simply that the bug could not be reproduced in the mainline development stream 4.2.0. I have not yet tried it on 4.1.1, and it may be an issue specific to the WinARM build, or even only occurring on my PC.
It seems NXP should help to release a GCC for its LPC3180/3250 ARM9. If the VFP cannot be used seamlessly in floating point intensive applications like video processing, people would rather use a DSP or a Blackfin than a crippled 200MHz ARM9 chip.
Conclusion
WinARM is a good GCC collection to have as one of your open source GCC toolchains for ARM. However, I highly recommend installing Keil CC for ARM and CSL G++ (another GCC for ARM) on your PC as well. Keil CC is quite useful for developing small ARM projects, and CSL G++ supports the ARM Cortex-M3 core, which the latest WinARM does not. Yes, you have to keep all of these compilers around. There are too many different processors to manage.
- allankliu's blog



A quick look at the ARM VFP manual explains a few things:
- VFP offers a subset of the FP standard set. This explains why certain operations such as sqrt are implemented in software.
- Exceptions are handled in software. This allows a smaller die area, and hence an acceptable cost.
- Vector operations let one instruction operate on arrays of data. This is ideal for ARM based applications such as PDAs, game consoles, set top boxes and automotive applications.