GCC & Distcc - Distributed Build System
Story
Here are some projects I have worked on with a 1.4GHz laptop:
- 8051 based CRT TV Project, 100KB source, Keil, 5 minutes build.
- 8051 based LCD TV Project, 20MB source, Keil and Cygwin, 45 minutes build.
- MIPS based LCD DTV Project, 600MB~5GB source, GCC and Cygwin, 1~4 hours build.
A complete build of an 8051 project is actually quite fast. However, our company encrypts the source files to protect its software IP (intellectual property, i.e. the source code), so the make script has to decrypt the source files in memory before it compiles and links the object files into the ABS/HEX file. This increases the final build time dramatically. The last MIPS based DTV project was the worst case I have ever experienced. I hesitated to change any code in that project, because touching even a tiny part of a header file meant a complete build of 1~4 hours. The build time put great pressure on me when I visited customers for on-site support. Each time the customer requested a change in the project, it took me a long time to build and test. The project was delayed, and I could not catch up.
I looked for a solution myself and found that some computers in my company's office (even the latest models) were often idle.
So I set up VPN connections between my laptop and these desktop PCs. I put the source code on these PCs, so I could log in to them remotely to change the source code and build, while my laptop stayed available for regular tasks. When change requests arrived in parallel, I logged in to a different PC for each build. This was quite helpful for handling change requests in parallel. However, the complete build time was still high.
I believe time-consuming build processes also occur in many other fields, including VHDL simulation and medical study.
distcc, hosted on samba.org, seems to be a good distributed build system. There is preliminary support for some other compilers, but the main focus is on GCC. Fortunately, GCC has been ported to many microcontrollers, including the popular AVR, PIC18, ARM, MIPS, 68K, H8 and C166. If you use a compiler for another target, such as the 8051, you will have to look for a commercial solution. (To be frank, I have not found one yet.)
distcc
distcc is a program that distributes builds of C, C++, Objective C or Objective C++ code across the computers in a local network. Assembler is supported if necessary. Because distcc always sends preprocessed code to the build servers, it generates the same results as a local build, provided the build servers have a compatible compiler installed (ideally the same version as on the local machine). distcc works on different platforms such as Linux, UNIX, FreeBSD and Cygwin on Windows.
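The split between preprocessing and compiling can be imitated by hand on one machine. The sketch below is only an illustration of the idea, not what distcc runs internally: it assumes a C compiler is reachable as cc (or through $CC), and the file name hello.c and its contents are made up for the example.

```shell
#!/bin/sh
# Illustration only: the three stages that distcc divides between client and server.
CC=${CC:-cc}    # assumption: a GCC-compatible compiler is installed
if ! command -v "$CC" >/dev/null 2>&1; then
    echo "no C compiler found, skipping the demonstration" >&2
    exit 0
fi

work=$(mktemp -d)    # scratch directory, left in place for inspection
cat > "$work/hello.c" <<'EOF'
#include <stdio.h>
#define GREETING "hello from a preprocessed file"
int main(void) { puts(GREETING); return 0; }
EOF

# Stage 1 (client side): expand headers and macros into one self-contained file.
"$CC" -E "$work/hello.c" -o "$work/hello.i"

# Stage 2 (server side): compile the preprocessed file; no headers are needed.
"$CC" -c "$work/hello.i" -o "$work/hello.o"

# Stage 3 (client side): link the returned object file locally and run it.
"$CC" "$work/hello.o" -o "$work/hello"
"$work/hello"
```

Since stage 2 only needs the .i file, the machine that runs it does not need the project's headers or libraries, only a suitable compiler.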
Let us illustrate the usage of distcc on Debian/Ubuntu.
- First, install distcc on the client and the servers. You can install distcc from a tarball, an RPM (Red Hat, Fedora) or emerge (Gentoo). On Ubuntu, use apt-get as follows:
- sudo apt-get install distcc
- Once installation is complete, you can find distcc (client), distccd (server daemon) and distccmon-text (build progress monitor) in /usr/bin/, and the configuration folder in ~/.distcc.
- Then, run the distcc daemon on each server:
- distccd --daemon
You can use the --allow option to restrict access to specific client addresses.
- If you want distccd to start at boot, you can add it as a service to Linux sysvinit.
- To build a project, set up the environment on the client PC:
- export DISTCC_HOSTS="localhost fastest faster slow"
Here "fastest", "faster" and "slow" are the server names, listed in order of their processing power. The client PC's own localhost can optionally act as a server as well. distcc looks for the host list, in order, in $DISTCC_HOSTS, the user's $DISTCC_DIR/hosts file and the system-wide hosts file. If no host list can be found, distcc emits a warning and compiles locally.
- Finally, build the project by:
- cd yourproject
- make -j8 CC=distcc
The -j option belongs to GNU make, not to CC. It enables parallel builds, and 8 means that make may run up to 8 jobs at the same time, which distcc then spreads across the servers. Although we set CC=distcc, distcc is a front end to GCC rather than a real compiler.
- Last but not least, distcc offers monitoring utilities called distccmon-text and distccmon-gnome, which let the project owner check the build status on the servers.
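As a rough sketch of the client-side steps above, the script below builds a host list and derives a -j value from it. The host names are the placeholders used earlier. The /N suffixes are per-host job limits in distcc's host syntax; the particular numbers are assumptions chosen for illustration, as is the fallback of 4 for a host without a limit. The script only prints the make command instead of running it.

```shell
#!/bin/sh
# Placeholder host names from the example above; the /N job limits are assumed values.
DISTCC_HOSTS="localhost/2 fastest/8 faster/4 slow/2"
export DISTCC_HOSTS

# Add up the per-host limits to choose how many jobs make may run at once.
jobs=0
for host in $DISTCC_HOSTS; do
    case "$host" in
        */*) limit=${host##*/} ;;   # text after the last slash is the limit
        *)   limit=4 ;;             # assumed fallback when no limit is given
    esac
    jobs=$((jobs + limit))
done

# Print the command rather than running it, so the sketch is safe to try anywhere.
echo "make -j$jobs CC=distcc"
```

With the limits shown, the script prints make -j16 CC=distcc. Whether a higher or lower value works better depends on the network and on how much preprocessing and linking the client has to do itself, so treat the number as a starting point.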
Conclusion
distcc has many ports, including Cygwin and Sharp Zaurus, so it can be used for distributed builds across different OS platforms and target platforms. Although this tool is not new at all, it is quite helpful for setting up a build pool to speed up the building of big projects. Because distcc only transfers preprocessed files to the servers and gets the object files back to link locally, no header files or libraries need to be transferred over the network. Theoretically, it is possible to use distcc in an 8051 project with Keil, because Keil also generates preprocessed files. However, it would take extra effort to hack distcc to support Keil C51 on Cygwin.
Since distcc has a port for the Sharp Zaurus, it is possible to speed up software development with the native ARM GCC across the network. Of course, we do not need many Sharp Zaurus devices for this. We can run the Zaurus OS under QEMU on an x86 PC and then run the native compiler with distcc. The native ARM GCC is quite helpful for building software packages that were not designed for cross-compilation. I may cover this topic in another blog post later.
Other DBS
distcc
http://distcc.samba.org/
Cabie is "Continuous Automated Build and Integration Environment". Cabie is a build system with distributed build servers to perform software builds on various hardware platforms. Cabie supports "continuous integration" (a build on each CM check-in, e.g. Subversion) or "nightly builds", and automated regression testing.
http://www.yolinux.com/TUTORIALS/CabieBuildSystem.html
Electric Cloud: its first product, ElectricAccelerator, speeds up make, Microsoft Visual Studio and Apache Ant based builds by parallelizing them and running them on a computer cluster.
http://www.electric-cloud.com/
XGE (Xoreax Grid Engine) is a Grid Computing engine for the Microsoft Windows operating system developed by Xoreax Software. It is available in a commercial product called IncrediBuild.
http://www.xoreax.com/main.htm
Rant (for Java)
Rant stands for Remote Ant. It is a distributed build system that allows an Ant build file to launch builds on other systems and receive exceptions should they occur.
http://sourceforge.net/projects/remoteant/
SN-DBS slashes compile times by distributing your source code builds between cooperating PCs on your local area network. SN-DBS is available free of charge to registered PS2, PSP and PS3 developers.
http://www.snsys.com/products/sn-dbs.asp
- allankliu's blog





Distributed Build System - Performance Improvement
Hi Allan,
I do have a few questions :
I mean, I expect that using a remote PC two times faster than your laptop delivers a twofold increase in build performance. Am I far from the real figures?
Thanks.
About the performance
The improvement is almost linear. That means you can get double the performance if you have two PCs (one local and one remote). However, please do not connect these PCs to each other over a VPN, unless you are connecting your laptop to the remote local network via VPN. In the latter case, you are logging in as a client of that "local" network.
I have not recorded the final figures. However, you can feel the performance improvement. Also, the situation in my project is a little complicated. The project contains a Linux kernel, UI, DSP and microcontroller code. Only the Linux kernel can benefit from distcc.
Allan K Liu
WOW!! Thank you for this
WOW!! Thank you for this bit. Don't know how well I can do, but I'll try.