Tuesday, August 31, 2010

Tundra, my build system

I've been working on a build system, Tundra, to scratch my own itches and it's come along far enough that I've released the code in beta form over at github (http://github.com/deplinenoise/tundra).

Tundra has taken on many different architectures over the years I've hacked on it, but I've finally arrived at a Lua frontend and a C backend. Here are some of the goals I've had with Tundra:

  • Optimize for iterative builds
  • Flexible configuration frontend (easy to add configurations and variants)
  • Keep the build system core small and simple
  • Utilize multi-core hardware even for scanning and up-to-date checks
  • Reliably support code generation (passes, DAG scheduling)
  • Separate configuration and building to make it easier to optimize & debug

Of course now that Tundra is nearing completion it's fun to show some performance numbers. Noel Llopis did a great writeup on the various build systems available a few years back (http://gamesfromwithin.com/the-quest-for-the-perfect-build-system). Using his script to generate 10 libraries with 50 files each (500 files total) I ran a small benchmark. Here's the statistics output from a no-act run of Tundra on that dataset. This was on a Core i7 box running Win7/64 with everything in the disk cache, although the disk itself is a slow 7200 RPM drive:

J:\Scratch\benchmark>tundra --debug-stats
*** build success, 0 jobs run
post-build stats:
  files tracked: 1511 (1511 directly from DAG), file table load 1.64%
  relations tracked: 1034, table load 1.12%
  relation cache load: 0.003s save: 0.000s
  nodes with ancestry: 513 of 513 possible
  time spent in Lua doing setup: 0.057s
    - time spent iterating directories (glob): 0.004s over 10 calls
  total time spent in build loop: 0.032s
    - implicit dependency scanning: 0.152s
    - output directory creation/mgmt: 0.000s
    - command execution: 0.000s
    - (parallel) stat() time: 0.039s (2010 calls out of 0 queries)
    - (parallel) file signing time: 0.031s (md5: 0, timestamp: 1500)
    - up2date checks time: 0.016s
  efficiency: 0.00%
total time spent in tundra: 0.0926s

I'd say 93 ms for a no-act build is pretty nice considering that this includes full #include scanning! We can see that 57 ms was spent in Lua code creating the input DAG and 32 ms was spent in the native build loop. This is nice because if the build master is complaining about slow builds we can easily see if it's due to his bad scripting or due to actual build performance ;)

If you want to give Tundra a spin (it's beta quality software) you can get a binary drop and the documentation here: http://github.com/deplinenoise/tundra/downloads.

Friday, July 23, 2010

Visual Studio and your AppData\Temp directory

Visual Studio has a tendency to leave a lot of junk files behind when you cancel builds. A colleague of mine at DICE had 22,000 temp files weighing in at over 14.5 GB in his AppData\Temp directory! Having more than a few hundred files in a single directory on NTFS is suicide, so you can imagine how much his compile times improved when he emptied that directory.

The real question is how to get Microsoft's CL to stop creating these temporary files, and failing that, how to make it clean them up when you abort it?

In the meantime, I'm relying on a little tool I have running on a schedule that wipes files that are older than a few days from my AppData\Temp directory. Lo-fi solution to a yucky problem.
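
A tool like that doesn't need to be fancy. Here's a minimal sketch in C of what one could look like, assuming a POSIX-style environment (opendir, readdir, stat); on Windows the directory walk would go through FindFirstFile and friends instead. The names is_stale and purge_old_files and the age threshold are made up for illustration, and this isn't necessarily how the actual tool works.

```c
#define _POSIX_C_SOURCE 200809L /* assumption: POSIX directory and stat APIs */
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>

/* Returns 1 if a file last modified at `mtime` is more than
   `max_age_days` old relative to `now`, otherwise 0. */
int is_stale(time_t mtime, time_t now, double max_age_days)
{
    return difftime(now, mtime) > max_age_days * 24.0 * 60.0 * 60.0;
}

/* Deletes regular files directly inside `dir` that are older than
   `max_age_days`. Returns the number of files removed, or -1 if the
   directory could not be opened. Subdirectories are left alone. */
int purge_old_files(const char *dir, double max_age_days)
{
    assert(dir != NULL);

    DIR *d = opendir(dir);
    if (d == NULL)
        return -1;

    time_t now = time(NULL);
    int removed = 0;
    struct dirent *entry;

    while ((entry = readdir(d)) != NULL) {
        char path[4096];
        int n = snprintf(path, sizeof path, "%s/%s", dir, entry->d_name);
        if (n < 0 || (size_t)n >= sizeof path)
            continue; /* path does not fit in the buffer, skip it */

        struct stat st;
        if (stat(path, &st) != 0 || !S_ISREG(st.st_mode))
            continue; /* skips ".", "..", and anything not a plain file */

        if (is_stale(st.st_mtime, now, max_age_days) && remove(path) == 0)
            removed++;
    }

    closedir(d);
    return removed;
}
```

Calling purge_old_files on the temp directory with a max_age_days of 3.0 from a scheduled task would wipe everything older than three days.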

Tuesday, June 08, 2010

How to stop Vista from trashing your disk

Ever notice how Vista churns your HDD? I had the misfortune of having to install Vista on a machine again and finally took the time to figure out what to do to make it stop. I knew about the first four, but the last two were new to me and made all the difference:
  1. Disable ReadyBoost
  2. Disable SuperFetch
  3. Disable the Windows Search service
  4. Disable the scheduled Disk Defragmentation task
  5. Disable the scheduled System Restore task (shows up as super-high System Volume Information traffic)
  6. Go into the registry and disable prefetching entirely.
Together these steps remove 99% of the extra disk activity and now the disk doesn't rattle all the time :)

Wednesday, February 03, 2010

c-amplify prototype available

I've uploaded the c-amplify prototype to gitorious so people can look at the code. It's pretty rough around the edges right now (it's a rant-fueled hack after all), but here's a basic guide to getting started:
  • Clone the repository here: http://github.com/deplinenoise/c-amplify
  • Get the excellent cl-match library and add it to your central ASDF registry
  • Load the c-amplify ASDF system into your lisp (Clozure and SBCL on Win32 should work well, it's what I'm using)
  • In the :se.defmacro.c-amplify package, evaluate this from the REPL:
    • (load-csys-file #p"test-input/test.csys")
    • (update-system (find-system :core))
  • You should now have an amplified core.c file that you can examine (and eventually compile, once a lot of quirks have been worked out in c-amplify itself) :)
As always, feel free to leave your comments.

Friday, January 22, 2010

Amplifying C

On C++

Having programmed in C++ professionally for well over 10 years I have learned all of it. I have all the books, I know all the tricks. And I don’t like it anymore.

Update: This intro apparently made some people see red, because "no man could possibly know all of C++". If that includes you, you can read it as: "I've shipped 4 AAA games on MLOC code bases, and here's my take on the C++ abstractions you can reasonably use in projects that big".

Basically game teams using C++ fall into the same trap every time: they try to create abstractions with whatever is in the C++ toolbox and they fail miserably. On the next project they’re a little bit smarter from the experience so they set out to fix their abstractions and create new ones. And fail.

Quickly going over the major abstraction mechanisms C++ introduced over C I’m arguing that:

  • Templates suck as they cause link-time spam and make compile times skyrocket. They severely bloat the size of the debug symbol file, which on large projects can easily reach several hundred megabytes of data. Abstractions built with templates perform differently depending on whether compiler optimizations are enabled or not (ever tried debugging Boost code?). They’re essentially unusable on large code bases beyond container-of-T and simple functions.

  • RTTI sucks because it doesn’t do what anyone wants, and you can’t even rely on it returning a type name formatted in a certain way.

  • Classes suck because their guts have to be in headers for all to see.

All these high-level concepts are flawed and you can’t alter their semantics because they’re set in ISO stone. What all teams do then is reinvent all the language components that don’t work for them, and set up rules forbidding the use of the other features. Every C++ shop has its own "accepted" subset.

Basically the C++ game developer community is slowly navigating away from C++'s abstraction patterns. We left operator overloading mostly in the 90’s (some vector libraries still use it). We ditched RTTI back in 2001. Exceptions are firmly off as they don’t even work on all platforms we develop for. A lot of people are advocating that we stop using member functions to reduce coupling.

This may sound harsh, but to me these are clear signs that C++ isn’t providing any real cost benefit for us, and that we should be writing code in other ways.

Coding C

Many C++ game programmers have started to turn towards C (or C-like C++) to get away from the flaws of C++. Targeting C manually with larger systems can be a lot of work, because it offers very basic abstraction facilities. There are functions, enums, tagged types (structs and unions) and a rudimentary type system, but that’s about it.

But let’s look at C as a platform for a minute. C is lean, compiles super fast, it’s supported everywhere and all the tools we need to ship games such as compilers and debuggers (including the obscure and proprietary) work well with it. If you need platform-specific intrinsics to get on with your job, you can rely on the target platform’s C compiler to provide them.

As C is such a simple, predictable language that works everywhere it makes a lot of sense to generate C code. Indeed many projects have done so, but typically the meat of the application has still been written in plain C as code generation is typically used for language interfaces or parser generators.

The Lisp Way

Having also programmed a lot of Common Lisp over the years, I’ve seen how the Lisp family of languages deals with extensibility. In Lisp, you write your own abstractions that become a part of the project’s language. This remarkable feature is enabled basically through two simple things:

  • Programs can be treated as data (because they can be thought of as parse trees)

  • There are macros which transform such data (that is, your programs) into other programs (implementing the abstractions).

I’m going to suggest something mildly radical: we should prefer C over C++. But not straight-up C. We should create our own C with the abstractions we need, built right into the language, customized for the problems we’re working on.

Amplification

We can apply many of the ideas that make Lisp powerful to C if we drop its Algol-like syntax. What is the difference between the following two program fragments?

int my_function(int a, int b) {
    return a + b;
}

(defun my-function ((a int) (b int) (return int))
  (return (+ a b)))

Answer: none, they are equivalent as far as semantics go. The latter can trivially be parsed (using very simple rules) and transformed into the former, and so it is still the same C program. This is good news, because Lisp has shown us that if we represent programs as data, we can transform that data arbitrarily before evaluating or compiling it.
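
To make the "trivially transformed" claim concrete, here is a rough sketch of the tree-to-text step, written in C for illustration (c-amplify itself is written in Common Lisp, and this is not its code). The expr and param structs and the functions emit_expr, emit_defun and mangle are all hypothetical names, and the reader that would build the tree from the s-expression is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* A tiny expression tree: a leaf symbol or a binary operator node. */
struct expr {
    const char *op;          /* NULL for a leaf, otherwise "+", "*", ... */
    const char *name;        /* symbol name when this is a leaf */
    const struct expr *lhs;
    const struct expr *rhs;
};

struct param { const char *type; const char *name; };

/* Copies a Lisp-style name into `out`, turning '-' into '_'. */
void mangle(const char *lisp_name, char *out, size_t size)
{
    assert(size > 0);
    size_t i = 0;
    for (; lisp_name[i] != '\0' && i + 1 < size; i++)
        out[i] = (lisp_name[i] == '-') ? '_' : lisp_name[i];
    out[i] = '\0';
}

/* Writes `e` as parenthesized infix C into `buf` (snprintf semantics). */
int emit_expr(const struct expr *e, char *buf, size_t size)
{
    assert(e != NULL);
    if (e->op == NULL)
        return snprintf(buf, size, "%s", e->name);

    char l[128], r[128];
    emit_expr(e->lhs, l, sizeof l);
    emit_expr(e->rhs, r, sizeof r);
    return snprintf(buf, size, "(%s %s %s)", l, e->op, r);
}

/* Writes a C function whose body returns `body`. Returns -1 if the
   parameter list does not fit, otherwise the snprintf result. */
int emit_defun(const char *name, const char *ret,
               const struct param *params, size_t nparams,
               const struct expr *body, char *buf, size_t size)
{
    char cname[64], args[256] = "", body_text[256];
    size_t used = 0;

    mangle(name, cname, sizeof cname);
    for (size_t i = 0; i < nparams; i++) {
        int n = snprintf(args + used, sizeof args - used, "%s%s %s",
                         i ? ", " : "", params[i].type, params[i].name);
        if (n < 0 || (size_t)n >= sizeof args - used)
            return -1;
        used += (size_t)n;
    }
    if (nparams == 0)
        snprintf(args, sizeof args, "void");

    emit_expr(body, body_text, sizeof body_text);
    return snprintf(buf, size, "%s %s(%s) {\n    return %s;\n}\n",
                    ret, cname, args, body_text);
}
```

Feeding it the tree for (+ a b), the parameters a and b, and the name my-function yields the text of the first fragment, with the expression defensively parenthesized as (a + b).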

In my prototype system, c-amplify, I’m doing exactly this. The system introduces an "amplification" phase where s-expressions are transformed to C code before a traditional build system runs.

The c-amplify system has the following major parts:

  • A system definition facility (specifying input files and dependencies)

  • A reader, parsing ca source files

  • A persistent function and type database which is updated as source files are amplified.

  • A pretty-printing C code generator — important as we’re going to be debugging the generated code.

The system is intended to be run incrementally as source files are changed, re-reading changed files, updating the database and writing generated output. A traditional build system can then be used to compile the resulting files.

The persistent database is an interesting component that isn’t strictly needed for the system but enables a lot of neat features:

  • Type inference. Because all functions and types are known, c-amplify can easily supply a type-of operator for arbitrary expressions. This can be used as the basis for type inferring macros similar to auto in C++0x or var in C#.

  • Hook functions could be installed that run over the database and do additional work. For example, instrumenting all writes to a particular struct field, generating reflection info or performing project-specific checks on how types or functions are used. The possibilities are pretty much endless. Remember all those times you’ve thought: if we could only access this thing in the compiler we could give an error message if the code does this thing? Well, with hooks you could.

RAII, uncluttered

Let’s look at one such macro that solves a real problem: making sure a file handle is closed in an orderly manner, even when there are multiple exit points from the block.

In situations like this, C++ fans can’t wait to tell you about RAII. RAII means creating a stack object of some utility type that performs resource cleanup in its destructor. If we look at the AutoFile type we have to write to implement RAII, we find that exactly one line provides the abstraction we need (the destructor); the rest is boilerplate:

class AutoFile // auxiliary type
{
  private:
    FILE* f;

  public:
    AutoFile(const char* fn, const char* mode) {
      f = fopen(fn, mode);
    }
    ~AutoFile() { if (f) fclose(f); }
    operator FILE*() { return f; }

  private:
    AutoFile(const AutoFile&);
    AutoFile& operator=(const AutoFile&);
};

// later, that same day..
{
   AutoFile file("c:/temp/foo.txt", "w");
   fprintf(file, "Hello, world");
}

What Lisp programmers do to manage resources is to create block-wrapping macros (usually starting with the word with-). The macro sits around a body of code, indicating visually that the code wrapped by the macro will have access to some resource. The macro expansion is guaranteed to clean up the resource regardless of how the block terminates. Here’s an example of using such a macro with c-amplify:

(with-open-file (f "c:/temp/foo.txt" "w")
  (fprintf f "Hello, world"))

If we ask c-amplify to macro-expand this we see that the details of calling fopen and fclose are being handled as if we had written out everything by hand:

{
  FILE* f = (FILE *) 0;
  {
    f = fopen("c:/temp/foo.txt", "w");
    fprintf(f, "Hello, world");
  }
cleanup_8_:
  if (f) {
    fclose(f);
  }
}

Even if we add more complex code with multiple return paths, c-amplify doesn’t let us down:

(defun foo ((return int))
  (with-open-file (f "c:/temp/foo.txt" "w")
    (if (> (rand 10) 5)
       (return 20))
    (fprintf f "Hello, world")
    (return 10)))

This amplifies to the following C code. Note how the with-open-file block locally redefines what it means to return a value. This C code is a close representation of what a C++ compiler has to emit when RAII is used but as before there are no residual types left.

int foo(void)
{
  FILE* f = (FILE *) 0;
  {
    int result_13_;
    {
      f = fopen("c:/temp/foo.txt", "w");
      if (rand(10) > 5) {
        result_13_ = 20;
        goto cleanup_12_;
      }
      fprintf(f, "Hello, world");
      {
        result_13_ = 10;
        goto cleanup_12_;
      }
    }
    cleanup_12_:
    if (f) {
      fclose(f);
    }
    return result_13_;
  }
}

One possible c-amplify implementation of the with-open-file macro looks like this (on a real game project it would of course not use fopen, but some custom file manager):

(def-c-macro with-open-file ((var file-name mode) &body body)
  `(progn
     (declare (,var = (cast (ptr #$FILE) 0)))
     (unwind-protect
          (progn
            (= ,var (#$fopen ,file-name ,mode))
            ,@body)
       (when ,var
         (#$fclose ,var)))))

The funny #$foo syntax is just a reader macro to facilitate reading case sensitive symbols in a special package which corresponds to the C namespace. The implementation piggy-backs on unwind-protect, which makes sure that the body code always goes through the cleanup clauses:

(def-c-macro unwind-protect (form &body cleanup-forms)
  (with-c-gensyms (cleanup result)
    `(progn
       (ast-stmt-if (not (current-defun-void-p))
                    (declare (,result *current-return-type*)))
       (macrolet (return (&optional expr)
                         `(progn
                            (ast-stmt
                             (if (not (current-defun-void-p))
                                 `(= ,',',result ,,expr)
                                 `(cast void ,,expr)))
                            (goto ,',cleanup)))
         ,form)
       (label ,cleanup)
       ,@cleanup-forms
       (return ,result))))

If we decided to add exception handling (through e.g. setjmp or SEH exceptions), we only have to touch unwind-protect to enable exception cleanup in all our RAII-like resource macros. Layering pure compile-time abstractions like this to create programs is incredibly powerful.
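
To give a feel for what that could mean, here is a hand-written sketch of the kind of C an exception-aware unwind-protect might expand to, using setjmp and longjmp. This is an assumption about one possible design, not c-amplify output: the handler stack, throw_error, may_fail and guarded_work are all hypothetical, and a counter stands in for the fclose call.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical runtime support: a stack of active cleanup handlers. */
struct handler {
    jmp_buf env;
    struct handler *prev;
};

static struct handler *handler_top = NULL;
static int pending_error = 0;   /* error code currently being propagated */
int cleanup_count = 0;          /* counts cleanups, stands in for fclose */

/* Hypothetical "throw": record the code, jump to the innermost handler. */
void throw_error(int code)
{
    assert(handler_top != NULL);
    pending_error = code;
    longjmp(handler_top->env, 1);
}

void may_fail(int fail)
{
    if (fail)
        throw_error(42);
}

/* Roughly what a protected block could expand to. Returns 0 on normal
   completion, or the error code if this was the outermost handler. */
int guarded_work(int fail)
{
    struct handler h;
    volatile int resource_open = 0; /* volatile: modified after setjmp */
    int failed = 0;

    h.prev = handler_top;           /* push our handler */
    handler_top = &h;

    if (setjmp(h.env) != 0) {
        failed = 1;                 /* we got here through throw_error */
    } else {
        resource_open = 1;          /* stand-in for fopen */
        may_fail(fail);             /* the protected body */
    }

    handler_top = h.prev;           /* pop our handler */
    if (resource_open)
        cleanup_count++;            /* cleanup runs on both paths */

    if (failed && handler_top != NULL)
        throw_error(pending_error); /* keep unwinding to the outer handler */
    return failed ? pending_error : 0;
}
```

The resource flag is volatile because it changes between the setjmp call and the longjmp; without that, its value after the jump would be indeterminate. Every with- style macro built on top of unwind-protect would pick this behavior up without being touched.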

Exploiting the database

As I mentioned, there are many advantages to having all your code parsed with type information sitting around in an in-memory database. Let’s highlight one such thing, automatic type inference for local variables. Consider the following c-amplify input:

(defstruct bar
  (x (const restrict volatile ptr int)))

(defstruct foo
  (barp (ptr struct bar)))

(defun my-function ((foop (ptr struct foo)) (return int))
  (let ((xp (-> foop barp x)))
     (return (* xp))))

This amplifies to:

struct bar { int * const volatile restrict x; };

struct foo { struct bar * barp; };

int my_function(struct foo * foop) {
  int * const volatile restrict xp = foop->barp->x;
  return *xp;
}

We can see that the lexical variable xp has the expected type. The c-amplify system knows how to compute the type of any C expression (including arithmetic promotion) so this feature can be used extensively if desired.
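
As a rough illustration of the arithmetic promotion part, here is a heavily simplified C sketch of how the result type of a binary arithmetic expression can be computed. It assumes only signed integer types plus float and double (the rules for unsigned types are more involved), and the ctype enum and the function names are hypothetical rather than taken from c-amplify.

```c
#include <assert.h>

/* Simplified set of C arithmetic types, ordered so that a later
   enumerator always wins over an earlier one in a mixed expression.
   T_CHAR here means signed char. */
enum ctype {
    T_CHAR, T_SHORT, T_INT, T_LONG, T_LONG_LONG, T_FLOAT, T_DOUBLE,
    T_COUNT
};

/* Integer promotion: anything narrower than int becomes int. */
enum ctype promote(enum ctype t)
{
    assert(t >= T_CHAR && t < T_COUNT);
    return t < T_INT ? T_INT : t;
}

/* Type of `a op b` for a binary arithmetic operator such as + or *. */
enum ctype arith_result_type(enum ctype a, enum ctype b)
{
    a = promote(a);
    b = promote(b);
    return a > b ? a : b;
}
```

With a table like this, a let form can ask for the type of its initializer and declare the variable accordingly, which is all that type inference for locals really needs.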

Including what’s needed

Another great feature of having a complete function/type database is that generated C files do not need include statements. If a source file needs a bunch of declarations the generator will just emit them right there. There’s no need to maintain header files. If code in a generated file starts using a structure all of a sudden, a copy of its declaration will automatically pop into the generated file.

For a full-out implementation of this idea to work, 3rd party declarations from the OS and C libraries must be imported into the amplification database. A separate tool must be devised for this but it would certainly be possible.

Compiling files generated like this would mean that the preprocessor wouldn’t touch disk except to read the input C file. Makefiles for generated files such as these will also be trivial to write as there are no implicit dependencies.
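
As an illustration, a generated file might look something like the sketch below. The vec2 struct and its functions are made-up examples, and the printf line stands in for a libc declaration imported into the database; none of this is actual c-amplify output.

```c
/* --- declarations copied in from the database, no #include needed --- */

struct vec2 { float x; float y; };

/* Imported third-party declaration (here: from the C library). */
int printf(const char *format, ...);

/* --- the amplified functions themselves --- */

float vec2_dot(const struct vec2 *a, const struct vec2 *b)
{
    return a->x * b->x + a->y * b->y;
}

void vec2_print(const struct vec2 *v)
{
    printf("(%f, %f)\n", v->x, v->y);
}
```

The only file the compiler has to open is this one, so its dependency list is just the generated source itself.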

Further ideas

Here are additional ideas that could be explored within the c-amplify system:

  • Improved compile-time checking for traditionally dangerous functions (scanf, printf) (make macros that evaluate the format strings and types of arguments)

  • Add exception handling to C on top of setjmp, SEH or some other basic mechanism

  • Annotate structures for real-time tweaking.

  • Generate script language bindings at compile time via macros and hooks.

  • Inlining/code simplification at amplification time (trig function simplification, maths)

  • Add a proper sublanguage for vector math. Finally you can write that vector math library that combines plus and multiply to madd on AltiVec by analyzing the code at compile time, and you don’t need 3000 lines of C++ "expression templates" to do it.

  • If you absolutely must have C++-like classes and templates, you could implement those too. Classes with single dispatch would be pretty easy (generate a couple of structs per class), and templates could be "done right" in the sense that you’d only generate a single expansion for each instantiated type and dump them all to a single source file, rather than compiling thousands of instantiations of std::vector and letting the linker sort through the carnage.
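
To make the first idea a bit more concrete, here is a minimal sketch of a format string check, written in C for illustration; in c-amplify the equivalent logic would live in a macro that runs at amplification time. The format_matches function and its one-letter type codes are hypothetical, and only %d, %i, %f, %s and %% are handled (no flags, widths or length modifiers).

```c
#include <assert.h>
#include <stddef.h>

/* Checks a printf-style format string against a string of argument
   type codes: 'i' = int, 'd' = double, 's' = char pointer.
   Returns 1 if every conversion has a matching argument and no
   arguments are left over, otherwise 0. */
int format_matches(const char *fmt, const char *arg_types)
{
    assert(fmt != NULL && arg_types != NULL);

    for (; *fmt != '\0'; fmt++) {
        if (*fmt != '%')
            continue;

        fmt++; /* look at the conversion character */
        char want;
        switch (*fmt) {
        case '%': continue;           /* literal percent, uses no argument */
        case 'd':
        case 'i': want = 'i'; break;
        case 'f': want = 'd'; break;
        case 's': want = 's'; break;
        default:  return 0;           /* unsupported or truncated conversion */
        }

        if (*arg_types != want)
            return 0;                 /* missing argument or wrong type */
        arg_types++;
    }
    return *arg_types == '\0';        /* reject leftover arguments */
}
```

A printf macro could run a check like this over the literal format string and the inferred argument types, and refuse to amplify the call when they disagree.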

Conclusion

If there is a way (no matter how much work it would be) to express the semantics of an abstraction in C, chances are you can implement it as a set of macros and hooks in c-amplify.

However, c-amplify is still a prototype and a lot of work remains before it might be suitable for production use. I hope this rant has given you some new ideas on how we design programs. Send your feedback and flames my way.