If you’re writing code in C++, odds are good that you are maintaining
state in two places, and you probably get it out of sync more often than
not. What are the two places? One of them is the top of your .h and
.cc files:

#include "base/logging.h"
#include "net/http_client.h"

If you’re using make, then the other place is in a Makefile somewhere:

my_project: my_project.o base/logging.o net/http_client.o

This isn’t great. When it gets out of sync, you end up with broken
builds when stuff is missing, and bloated binaries when too much stuff
has been left in. If you’re using cmake or anything like that, you’re
in the same situation. You’re just throwing a slightly different
syntax at the problem.

Also, don’t forget about external libraries. Are you using protobuf?
How about libcurl? mysqlclient? libpq from Postgres? GNU Radio? All
of those have requirements – extra compiler flags and extra linker
flags. You need to carry those all the way up any time they are
referenced in a project. Do you want to edit a Makefile (or whatever)
every time you hook in a library that hooks in some external dependency?
No way!

Your code already has all of the dependencies expressed right
there in it. See those #include directives? Those unambiguously state
“I need this header and/or library target in order to work”. It’s been
sitting there all this time. You just have to start using it for your
own benefit.

Rigor

If you want something like this to work, you have to commit to a certain
amount of consistency in your code base. You might have to throw out a
few really nasty hacks that you’ve done in the past. It’s entirely
likely that most people are fully unwilling or unable to do this, and so
they will continue to suffer. That’s on them.

Here’s what you need to do in order to have success with this sort of
build approach.

One common source directory

  • src/
    • base/
      • logging.h
      • logging.cc
    • net/
      • http_client.cc
      • http_client.h
    • my_project/
      • my_project.cc

That’s not so bad, right? You are probably doing this already.

#include “…” consistency

Any time you #include something inside the tree, do it with “”, and
always spell out the full relative path to it inside the tree. It’s
always #include "net/http_client.h". It’s never just
#include "http_client.h", even if you’re in the same directory.

#include <...> consistency

This one is simpler. Any time you have a dependency outside the
project, it’s a system-level #include with <angle brackets>.
Then make sure you always use the same path for the same targets.
If it’s a particular #include <...> path in one place, then it
should be that same path anywhere else which also uses that external library.

One .cc/.h pair, one object

base/foo.cc and base/foo.h compile into base/foo.o. lib/bar.cc and
lib/bar.h compile into lib/bar.o. You will never compile more than one
.cc file into a single .o file. That’s what linking is for.

Acyclic graph

base/foo.cc and/or base/foo.h can #include lib/bar.h, but in that case,
lib/bar.cc and/or lib/bar.h can never #include base/foo.h, because that
would cause a cycle. In other words, no loops are permitted.

This also has the nice side-effect of making you do the right thing when
you design your code. If bar can’t refer back to foo, then you can’t
make awful implementations which are hard to figure out six months later
when nobody remembers how this thing worked.
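Checking for loops is cheap once you have the dependency graph in memory. Here is a minimal sketch of one way to do it with a depth-first search. The DepGraph type and the function names are made up for illustration; they are not taken from any particular build tool.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical graph type: target name -> targets it #includes.
using DepGraph = std::map<std::string, std::vector<std::string>>;

enum class Mark { kUnvisited, kInProgress, kDone };

// Depth-first walk. Reaching a target that is still "in progress"
// means we followed #includes back into our own ancestry: a cycle.
bool HasCycleFrom(const DepGraph& graph, const std::string& target,
                  std::map<std::string, Mark>& marks) {
  Mark& mark = marks[target];  // new entries start as kUnvisited
  if (mark == Mark::kInProgress) return true;
  if (mark == Mark::kDone) return false;
  mark = Mark::kInProgress;
  auto it = graph.find(target);
  if (it != graph.end()) {
    for (const std::string& dep : it->second) {
      if (HasCycleFrom(graph, dep, marks)) return true;
    }
  }
  mark = Mark::kDone;
  return false;
}

// True if any target in the graph can reach itself.
bool HasCycle(const DepGraph& graph) {
  std::map<std::string, Mark> marks;
  for (const auto& entry : graph) {
    if (HasCycleFrom(graph, entry.first, marks)) return true;
  }
  return false;
}
```

With base/foo pointing at lib/bar and nothing pointing back, HasCycle returns false. Add an edge from lib/bar to base/foo and it returns true, which is the point where a build tool would refuse to continue.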

Internal targets

An internal target is what you’re referencing when you do an #include
with a relative path “in quotes like this”. If you #include
"lib/bar.h", you have expressed a dependency on a target called
lib/bar.
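To make the idea concrete, here is a rough sketch of pulling #include lines out of a source file and turning a quoted path into a target name. It is deliberately naive (it does not understand comments, #if blocks, or macros), and the Includes struct and both function names are assumptions for the sake of the example.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result type for one scanned file.
struct Includes {
  std::vector<std::string> internal;  // from #include "path/in/tree.h"
  std::vector<std::string> system;    // from #include <path>
};

// Scans source text line by line and collects #include paths.
Includes ScanIncludes(const std::string& source) {
  Includes out;
  std::istringstream in(source);
  std::string line;
  while (std::getline(in, line)) {
    std::size_t pos = line.find_first_not_of(" \t");
    if (pos == std::string::npos || line[pos] != '#') continue;
    pos = line.find_first_not_of(" \t", pos + 1);
    if (pos == std::string::npos) continue;
    if (line.compare(pos, 7, "include") != 0) continue;
    pos = line.find_first_not_of(" \t", pos + 7);
    if (pos == std::string::npos) continue;
    const char open = line[pos];
    if (open != '"' && open != '<') continue;
    const char close = (open == '"') ? '"' : '>';
    const std::size_t end = line.find(close, pos + 1);
    if (end == std::string::npos) continue;
    const std::string path = line.substr(pos + 1, end - pos - 1);
    (open == '"' ? out.internal : out.system).push_back(path);
  }
  return out;
}

// Strips a trailing ".h": "lib/bar.h" -> "lib/bar".
std::string TargetName(const std::string& header_path) {
  const std::string suffix = ".h";
  if (header_path.size() > suffix.size() &&
      header_path.compare(header_path.size() - suffix.size(),
                          suffix.size(), suffix) == 0) {
    return header_path.substr(0, header_path.size() - suffix.size());
  }
  return header_path;
}
```

Everything that lands in the internal list is a target somewhere under the source tree. Everything in the system list has to be resolved some other way.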

Target type: header

A header target is one that only has a .h file present. It is not
compiled, but it is scanned to see if it has additional dependencies
(#includes), same as any other internal target.

Target type: library

A library target has both a .cc and a .h file present. It is compiled
into a .o file. Both the .cc and .h files are scanned for additional
dependencies (#includes).

Target type: binary

A binary target has a .cc file present and also has “int main(”
somewhere in it. It is first compiled into a .o file just like a
library target, and it is also scanned for dependencies (#includes).
Then once all of its dependencies are satisfied, that .o file is linked
(along with any other library .o files) into a binary. The link stage
also uses any ldflags picked up along the way from any dependencies.

System targets

A system target is what you’re referencing when you do an #include with
a path <in angle brackets like this>. You might do
#include <jansson.h>, for example, to express this need.

Most system libraries need a little help for you to use them. This
typically means augmenting the include directories searched by the
compiler, and the library directories searched by the linker. The
easiest way to handle this is to have your build tool notice when
someone adds a system dependency, then look up the flags it needs.

One approach is to just literally specify the flags for a given target,
like this:

system_header {
  name: "atomic"
  ldflag: "-latomic"
}

You can do this with more complicated targets, but it can be annoying to
keep straight. Instead, you should use something like pkg-config to get
whatever values were bundled in by whoever built the library in the
first place, like this:

system_header {
  name: "jansson.h"
  pkg_config_name: "jansson"
}

When the build tool encounters #include <jansson.h>,
it’ll run pkg-config --cflags jansson to figure out how to
compile something which depends on it. Later, it’ll also run
pkg-config --libs jansson to get the -L and/or -l flags
required to link it into a binary.
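In code, the lookup might look something like the sketch below. The SystemHeader struct mirrors the fields in the stanzas above, but the struct itself, the table type, and the function names are assumptions for illustration. The functions only build the command lines; actually running them and capturing the output is left out.

```cpp
#include <map>
#include <string>
#include <vector>

// Mirrors one system_header { ... } stanza.
struct SystemHeader {
  std::string name;                  // what appears inside <...>
  std::string pkg_config_name;       // empty if flags are literal
  std::vector<std::string> ldflags;  // literal flags, e.g. "-latomic"
};

// Hypothetical table keyed by the #include path.
using SystemHeaderTable = std::map<std::string, SystemHeader>;

// Returns the stanza for an #include <path>, or nullptr if unknown.
const SystemHeader* Lookup(const SystemHeaderTable& table,
                           const std::string& include_path) {
  auto it = table.find(include_path);
  return it == table.end() ? nullptr : &it->second;
}

// Command to run at compile time; empty if pkg-config is not used.
std::string CflagsCommand(const SystemHeader& header) {
  if (header.pkg_config_name.empty()) return std::string();
  return "pkg-config --cflags " + header.pkg_config_name;
}

// Command to run at link time; empty if pkg-config is not used.
std::string LibsCommand(const SystemHeader& header) {
  if (header.pkg_config_name.empty()) return std::string();
  return "pkg-config --libs " + header.pkg_config_name;
}
```

An #include that is not in the table is the interesting case: that is where the tool either needs a new stanza or can assume the header needs no extra flags at all.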

If you’re wondering “why do we need the config stanza if we have
pkg-config”, that’s because pkg-config won’t tell us which #include
targets map onto which pkg-config library names. If there was a way for
pkg-config to say “when you see #include <jansson.h>, ask
me about jansson”, then you wouldn’t need this!

If any pkg-config (or equivalent tool) maintainer types see this and
decide to put in the mapping from (what-to-#include) to
(what-to-ask-about), that would be amazing and would solve SO many
problems.

A future version of the build tool may come “pre-loaded” with mappings
from #include <...> paths to pkg-config names for popular
libraries and other common targets. Ideally, very little would be
needed in any sort of config file.

Dependency propagation

Any compiler or linker flags picked up while processing a system target
“attach” to whatever local target referenced it. Then, if that local
target is referenced somewhere else, the compile and/or linker flags
also travel upward in the tree. This way, if target A depends on target
B, and B depends on jansson and picks up some special flags, those flags
will be there when A is compiled. They will also be there when A is
pulled into my_project and linked into a binary.
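Here is a small sketch of that upward travel for linker flags. It assumes each target already knows its direct dependencies and whatever flags its own system #includes attached to it. The Target struct and function names are hypothetical.

```cpp
#include <algorithm>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical per-target record.
struct Target {
  std::vector<std::string> deps;     // internal targets it #includes
  std::vector<std::string> ldflags;  // attached by its system targets
};

using TargetMap = std::map<std::string, Target>;

// Walks the dependency graph below `name`, appending each flag once.
void Collect(const TargetMap& targets, const std::string& name,
             std::set<std::string>& visited,
             std::vector<std::string>& flags) {
  if (!visited.insert(name).second) return;  // already handled
  auto it = targets.find(name);
  if (it == targets.end()) return;
  for (const std::string& flag : it->second.ldflags) {
    if (std::find(flags.begin(), flags.end(), flag) == flags.end()) {
      flags.push_back(flag);
    }
  }
  for (const std::string& dep : it->second.deps) {
    Collect(targets, dep, visited, flags);
  }
}

// All linker flags needed to link `name`, direct or inherited.
std::vector<std::string> LinkFlagsFor(const TargetMap& targets,
                                      const std::string& name) {
  std::set<std::string> visited;
  std::vector<std::string> flags;
  Collect(targets, name, visited, flags);
  return flags;
}
```

If B carries -ljansson, A depends on B, and my_project depends on A, then asking for the link flags of my_project yields -ljansson even though my_project never mentions jansson itself. Compile flags could be carried the same way.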

Any given set of system_header { … } directives should hold up across
that same version of the Linux distribution or OS. You might need to
tweak it slightly if the pkg-config stuff is wrong on a given system.
pkg-config hints for some libraries frequently leave out things that
they actually need. (GNU Radio, I’m looking at you.)

This approach has been shown to work across multiple Linux
distributions, BSD flavors, and architectures with minimal changes to
the system_header directives.

Actual portability issues come down to what you use in your code. If
you use epoll(), don’t expect it to work on non-Linux systems. If you
try to access more than about 3 GB of memory, don’t expect it to work on
a 32-bit system. This is all about what you do in your code and has
nothing to do with the build system itself.

Not necessarily. I built something to do this a long time ago and have
been using it ever since. Other people heard about it and did the same
thing. It’s not that hard to do.

Want to see what it looks like? Check out some
recordings from 2013 in which I use the build tool
without calling attention to it. Notice there are no build scripts,
and there’s no build language. You just write code and tell it to
build. If it’s a binary target, you get a binary out and can run it
right away. Also, once you teach it how to handle an external system
dependency, it can deal with it anywhere else.

Yeah, so, people who get into code-building projects because they have
some nerd need to create yet another specialized language for
expressing how to build their code are never going to “get” this
approach. Let them be. They’ll be creating yet another half-assed
incomplete implementation of half of Common Lisp and will be very happy
with that. (Meanwhile, they’ll still be using some terrible build
system with a different DSL to actually get some amount of “real work”
done when their bosses yell at them for screwing around on the job.)

Those of us who just want to write C++ code and turn it into usable
programs with a minimum of fuss will be off doing that instead of
spending cycles screwing with a build system and a build language.

Yup. Way back at the end of 2012, I released both a Linux x86_64 ELF
binary and a Mac … uh, whatever… binary to let people play with a
version of my build tool. Nobody actually downloaded or used it, and
it’s been almost 10 years, so I removed it. (Download some random
binary off the Internet and run it? Sounds legit, right?)

It seems better to just rig up a page like this to guilt people into
trying to do something about it on their own. This is that page, and
now you know.

Well then, since you put it that way, we should talk. Hit the contact
link at the bottom of the page and let’s figure something out.

Contact

You can send comments, questions, or whatever via my
contact form.
