Programming/Tools/Make

From Thalesians Wiki
Revision as of 23:20, 23 December 2020 by Admin

Make

Make is a build automation tool that automatically builds executable programs and libraries from source code by reading files called Makefiles which specify how to derive the target program. Though integrated development environments and language-specific compiler features can also be used to manage a build process, Make remains widely used, especially on Unix and Unix-like (e.g. Linux) operating systems.

Besides building programs, Make can be used to manage any project where some files must be updated automatically from others whenever the others change.

Make was originally created by Stuart Feldman in April 1976 at Bell Labs. Feldman received the 2003 ACM Software System Award for the authoring of this tool.

Feldman was inspired to write Make by a coworker's experience of futilely debugging a program whose executable had accidentally not been rebuilt after the source was changed:

Make originated with a visit from Steve Johnson (author of yacc, etc.), storming into my office, cursing the Fates that had caused him to waste a morning debugging a correct program (bug had been fixed, file hadn't been compiled, cc *.o was therefore unaffected). As I had spent a part of the previous evening coping with the same disaster on a project I was working on, the idea of a tool to solve it came up. It began with an elaborate idea of a dependency analyzer, boiled down to something much simpler, and turned into Make that weekend. Use of tools that were still wet was part of the culture. Makefiles were text files, not magically encoded binaries, because that was the Unix ethos: printable, debuggable, understandable stuff.

—Stuart Feldman, The Art of Unix Programming, Eric S. Raymond, 2003.

Preliminaries

If Make is not yet present on your system, it can be installed (on Debian-based distributions such as Ubuntu, which provide apt) using

$ sudo apt update
...
$ sudo apt install make

Once Make is installed, to find out its version, you can use

$ make --version
GNU Make 4.2.1
Built for x86_64-pc-linux-gnu
Copyright (C) 1988-2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Building without Make

Let's make a directory

$ mkdir my_project

Within that directory, let's create main.c and edit it:

$ cd my_project
$ touch main.c
$ nano main.c

Let's set the contents of main.c to

#include <stdio.h>

int main() {
    printf("Hello, World!\n");
    return 0;
}

We could build our program simply using

$ gcc -Wall -c main.c
$ gcc -Wall -o hello main.o

The first line will result in the production of main.o, an object file—a file containing object code, that is, machine code output of an assembler or compiler.

The object code is usually relocatable and is not usually directly executable. Relocation is the process of assigning load addresses to the position-dependent code and data of a program and adjusting the code and data to reflect the assigned addresses. An executable file causes a computer to perform indicated tasks according to encoded instructions, as opposed to a data file, which must be interpreted (parsed) by a program to be meaningful.

One or more object files can be linked by a linker into a single executable program or library, pulling in precompiled system libraries as needed. In contrast, scripts (Python or JavaScript, for example) are interpreted, and Java programs are compiled into bytecode class files.

The second line, namely

$ gcc -Wall -o hello main.o

invokes the linker and links a single object file main.o into an executable file hello.

We can now run the executable using

$ ./hello
Hello, World!

We have built our simple project.

Introducing Make

Using Make for such a simple project is overkill, but it is still a good setting in which to introduce it. Later, when our project becomes more complex, we'll be able to appreciate Make's usefulness.

Let us first remove the previously built hello and main.o:

$ rm hello main.o

Let us then create a Makefile and edit it:

$ touch Makefile
$ nano Makefile

Let's set its contents to:

hello : main.o
        gcc -Wall -o hello main.o

main.o : main.c
        gcc -Wall -c main.c

Here we have specified two rules. A rule tells Make how to remake certain files, called the rule's targets (most often only one per rule). It lists the other files that are the dependencies of the target and the commands to use to create or update the target.

The order of rules is not significant, except for determining the default goal: the target for Make to consider, if you do not otherwise specify one. The default goal is the target of the first rule in the first Makefile.
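
As a minimal sketch of how the default goal is chosen, the following shell session writes a throwaway Makefile with two rules. The targets first and second, the scratch directory, and the use of -s (which silences Make's own echo of each command) are choices made for this illustration only and are not part of the project above.

```shell
# Work in a scratch directory so nothing in my_project is touched.
demo=$(mktemp -d)
cd "$demo"

# Two rules without dependencies; \t is the tab that starts each command line.
printf 'first :\n\t@echo building first\n\nsecond :\n\t@echo building second\n' > Makefile

# No goal given: Make picks the target of the first rule.
out_default=$(make -s)
echo "$out_default"

# A goal named on the command line is used instead of the default.
out_second=$(make -s second)
echo "$out_second"
```

Swapping the two rules in the Makefile would change what a plain make builds, which is the only way in which rule order matters here.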

In general a rule looks like this:

targets : dependencies
        command
        ...

or like this:

targets : dependencies ; command
        command
        ...

The targets are file names, separated by spaces. Wildcard characters may be used. Usually there is only one target per rule. The command lines start with a tab character. The first command may appear on the line after the dependencies, with a tab character, or may appear on the same line, with a semicolon. Either way, the effect is the same.
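
The two layouts can be compared side by side. This is a sketch under stated assumptions: the files src.txt, copy1.txt and copy2.txt are hypothetical, and cp stands in for a real build command.

```shell
demo=$(mktemp -d)
cd "$demo"
printf 'some text\n' > src.txt

# copy1.txt uses the tab form; copy2.txt uses the semicolon form.
printf 'copy1.txt : src.txt\n\tcp src.txt copy1.txt\n\ncopy2.txt : src.txt ; cp src.txt copy2.txt\n' > Makefile

# Ask for both targets explicitly; each rule runs its single command.
make copy1.txt copy2.txt

# Both targets now hold the same contents.
cmp copy1.txt copy2.txt && echo "identical"
```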

A rule tells Make two things: when the targets are out of date, and how to update them when necessary.

The criterion for being out of date is specified in terms of the dependencies, which consist of file names separated by spaces. (Wildcards are allowed here too.) A target is out of date if it does not exist or if it is older than any of the dependencies (by comparison of last-modification times). The idea is that the contents of the target file are computed based on information in the dependencies, so if any of the dependencies change, the contents of the existing target file are no longer necessarily valid.
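
The criterion can be observed directly with plain text files, which also illustrates that Make is not limited to building programs. The sketch below assumes GNU Make's -q option (question mode: run nothing and report through the exit status whether the goal is up to date). The file names are hypothetical, and the target is backdated with touch -t only so that the result does not hinge on the filesystem's timestamp resolution.

```shell
demo=$(mktemp -d)
cd "$demo"
printf 'one\n' > a.txt
printf 'two\n' > b.txt

# all.txt is computed from a.txt and b.txt.
printf 'all.txt : a.txt b.txt\n\tcat a.txt b.txt > all.txt\n' > Makefile

# all.txt does not exist yet, so it is out of date and gets built.
make -s
if make -q; then state_after_build="up to date"; else state_after_build="out of date"; fi
echo "after build: $state_after_build"

# Change a dependency and make the target clearly older than it.
printf 'uno\n' > a.txt
touch -t 202001010000 all.txt
if make -q; then state_after_edit="up to date"; else state_after_edit="out of date"; fi
echo "after edit: $state_after_edit"

# Rebuilding brings the target back in line with its dependencies.
make -s
cat all.txt
```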

How to update is specified by commands. These are lines to be executed by the shell.
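
One consequence worth seeing once: by default GNU Make hands each command line to its own shell, so a cd on one line does not carry over to the next. The sketch below uses two hypothetical targets, split and joined, that exist only for this demonstration.

```shell
demo=$(mktemp -d)
cd "$demo"

# "split" changes directory on one line and prints it on the next;
# "joined" does both on a single line.
printf 'split :\n\t@cd /\n\t@pwd\n\njoined :\n\t@cd / && pwd\n' > Makefile

out_split=$(make -s split)     # the scratch directory, not /
out_joined=$(make -s joined)   # /
echo "split:  $out_split"
echo "joined: $out_joined"
```

Commands that must share state, such as a directory change, are therefore usually joined on one line with && or ;.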

Let's now run

$ make

We'll see the following output:

gcc -Wall -c main.c
gcc -Wall -o hello main.o

Make notices that the target hello does not exist; it is therefore out of date and must be built. hello depends on main.o, which is itself another target. The target main.o does not exist either; therefore it is out of date and must be built. Make then executes the command needed to build it, namely

gcc -Wall -c main.c

Now that the dependencies of the target hello exist, it can also be built. Make executes the requisite command:

gcc -Wall -o hello main.o

Let's run

$ make

again. This time the output is

make: 'hello' is up to date.

The default goal, the target hello, is up to date because its modification time is greater than or equal to the modification times of its dependencies.

We could

$ rm hello

and then run

$ make

In this case the output will be

gcc -Wall -o hello main.o

Only hello will be rebuilt; since the target main.o is up to date (its modification time is greater than or equal to the modification time of its dependency, main.c) it doesn't need rebuilding.
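
The same reasoning can be previewed without compiling anything. The sketch below assumes GNU Make's -n option (dry run), which prints the commands that would be executed without running them. The empty main.c and the main.o created with touch are stand-ins for this illustration, not real source or object files.

```shell
demo=$(mktemp -d)
cd "$demo"

# The same two rules as in the Makefile above.
printf 'hello : main.o\n\tgcc -Wall -o hello main.o\n\nmain.o : main.c\n\tgcc -Wall -c main.c\n' > Makefile

# Stand-in source file, backdated so that anything created later is newer.
touch -t 202001010000 main.c

# Neither main.o nor hello exists: both commands would run.
plan_fresh=$(make -n)
echo "$plan_fresh"

# Pretend main.o has already been built (it is newer than main.c).
touch main.o

# Only hello is missing now: just the link command would run.
plan_relink=$(make -n)
echo "$plan_relink"
```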

Header-only libraries

Let us modify main.c as follows:

#include "sphere.h"

int main(void)
{
    float radius, vol;

    printf("Input the radius of the sphere: ");
    radius = get_value();
    printf("Surface area = ");
    put_value(surface_area(radius));
    vol = volume(radius);
    printf("Volume of sphere = ");
    put_value(vol);

    return EXIT_SUCCESS;
}

Let us now run Make:

$ make
gcc -Wall -c main.c
main.c:1:10: fatal error: sphere.h: No such file or directory
    1 | #include "sphere.h"
      |          ^~~~~~~~~~
compilation terminated.
make: *** [Makefile:5: main.o] Error 1

We haven't yet created sphere.h. Let us

$ touch sphere.h
$ nano sphere.h

and set its contents to the following:

#ifndef SPHERE_H_INCLUDED
#define SPHERE_H_INCLUDED

#include <stdio.h>
#include <stdlib.h>

#define PI 3.14159265

float get_value(void)
{
    float x;

    scanf("%f", &x);
    return x;
}

void put_value(float x)
{
    printf("%f\n", x);
}

static float my_pow(float x, int n);

float surface_area(float r)
{
    return 4. * PI * my_pow(r, 2);
}

float volume(float r)
{
    return 4. * PI * my_pow(r, 3) / 3.;
}

static float my_pow(float x, int n)
{
    if (n < 0)
        return 1. / my_pow(x, -n);
    else if (n == 0)
        return 1;
    else
        return x * my_pow(x, n - 1);
}

#endif

If we run Make now, everything will work, but we mustn't forget to add sphere.h to main.o's dependencies. Currently, if sphere.h changes, main.o won't be rebuilt. Thus we

$ nano Makefile

and update it accordingly:

hello : main.o
        gcc -Wall -o hello main.o

main.o : main.c sphere.h
        gcc -Wall -c main.c

If we now run Make, the project will be rebuilt:

$ make
gcc -Wall -c main.c
gcc -Wall -o hello main.o

Let's try running the resulting executable:

$ ./hello
Input the radius of the sphere: 3.5
Surface area = 153.938034
Volume of sphere = 179.594376

sphere.h constitutes a "header-only" library. A library is called header-only if the full definitions of all macros, functions, and classes comprising the library are visible to the compiler in a header file form. Header-only libraries do not need to be separately compiled, packaged, and installed in order to be used. All that is required is to point the compiler at the location of the headers, and then #include the header files into the application source. Another advantage is that the compiler's optimizer can do a much better job when all the library's source code is available.

The disadvantages include:

  • brittleness—most changes to the library will require recompilation of all compilation units using that library;
  • longer compilation times—the compilation unit must see the implementation of all components in the included files rather than just their interfaces;
  • code-bloat (this may be disputed)—the necessary use of inline statements in non-class functions can lead to code bloat by over-inlining.

Nonetheless, the header-only form is popular because it avoids the (often much more serious) problem of packaging.

For C++ templates, including the definitions in the header is the only way to compile, since the compiler needs to know the full definition of the templates in order to instantiate.

More object files

We have mentioned some reasons why header-only libraries may not be a good idea in some situations.

Let us modify sphere.h so it contains declarations rather than definitions:

#ifndef SPHERE_H_INCLUDED
#define SPHERE_H_INCLUDED

#include <stdio.h>
#include <stdlib.h>

#define PI 3.14159265

float get_value(void);
void put_value(float x);
float surface_area(float r);
float volume(float r);

#endif

Let us add geometry.c with the following contents:

#include "sphere.h"

static float my_pow(float x, int n);

float surface_area(float r)
{
    return 4. * PI * my_pow(r, 2);
}

float volume(float r)
{
    return 4. * PI * my_pow(r, 3) / 3.;
}

static float my_pow(float x, int n)
{
    if (n < 0)
        return 1. / my_pow(x, -n);
    else if (n == 0)
        return 1;
    else
        return x * my_pow(x, n - 1);
}

And let us add simple_io.c:

#include "sphere.h"

float get_value(void)
{
    float x;

    scanf("%f", &x);
    return x;
}

void put_value(float x)
{
    printf("%f\n", x);
}

If we now try to build the project, we'll get some errors:

$ make
gcc -Wall -c main.c
gcc -Wall -o hello main.o
/usr/bin/ld: main.o: in function `main':
main.c:(.text+0x1e): undefined reference to `get_value'
/usr/bin/ld: main.c:(.text+0x42): undefined reference to `surface_area'
/usr/bin/ld: main.c:(.text+0x47): undefined reference to `put_value'
/usr/bin/ld: main.c:(.text+0x53): undefined reference to `volume'
/usr/bin/ld: main.c:(.text+0x77): undefined reference to `put_value'
collect2: error: ld returned 1 exit status
make: *** [Makefile:2: hello] Error 1

That's because Make doesn't know about the source files geometry.c and simple_io.c, so their object files are neither built nor passed to the linker.

Let us edit Makefile:

hello : main.o geometry.o simple_io.o
        gcc -Wall -o hello main.o geometry.o simple_io.o

main.o : main.c sphere.h
        gcc -Wall -c main.c

geometry.o : geometry.c sphere.h
        gcc -Wall -c geometry.c

simple_io.o : simple_io.c sphere.h
        gcc -Wall -c simple_io.c

Now everything will work:

$ make
gcc -Wall -c geometry.c
gcc -Wall -c simple_io.c
gcc -Wall -o hello main.o geometry.o simple_io.o

Let's try running the resulting executable:

$ ./hello
Input the radius of the sphere: 3.5
Surface area = 153.938034
Volume of sphere = 179.594376

Refactoring the Makefile

Let's take a look at our Makefile in its present form:

hello : main.o geometry.o simple_io.o
        gcc -Wall -o hello main.o geometry.o simple_io.o

main.o : main.c sphere.h
        gcc -Wall -c main.c

geometry.o : geometry.c sphere.h
        gcc -Wall -c geometry.c

simple_io.o : simple_io.c sphere.h
        gcc -Wall -c simple_io.c

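You will notice there is quite a lot of repetition in it. Let's refactor it using a variable, TARGET, for the name of the executable, and Make's automatic variables: $@ stands for the target of the rule, $< for its first dependency, and $^ for all of its dependencies. We also add a clean rule that removes the generated files.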

TARGET = hello

$(TARGET) : main.o geometry.o simple_io.o
        gcc -Wall -o $@ $^

main.o : main.c sphere.h
        gcc -Wall -c $< -o $@

geometry.o : geometry.c sphere.h
        gcc -Wall -c $< -o $@

simple_io.o : simple_io.c sphere.h
        gcc -Wall -c $< -o $@

# clean names an action, not a file, so declare it phony
.PHONY : clean
clean:
        rm -f *.o $(TARGET)
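
To see what the automatic variables used here expand to, a rule can simply print them. This is a sketch with hypothetical file names (out.txt, first.txt, second.txt); the rule never creates its target and exists only to show the expansions.

```shell
demo=$(mktemp -d)
cd "$demo"
touch first.txt second.txt

# Single quotes keep the shell from touching $@, $< and $^; Make expands them.
printf 'out.txt : first.txt second.txt\n\t@echo "target: $@"\n\t@echo "first dependency: $<"\n\t@echo "all dependencies: $^"\n' > Makefile

vars=$(make -s)
echo "$vars"
```

In the compile rules above, $< is therefore the .c file and $@ the .o file, while in the link rule $^ is the full list of object files.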

Building a static library

A library is a collection of items that you can call from your program. Reusing work that someone else has already done can save much time, and you can be more confident that it has fewer bugs: since other people may be using the same library, you benefit from their finding and fixing bugs.

The most straightforward way of using a library function is to have the object files from the library linked directly into your executable, just as with those you have compiled yourself. When linked like this the library is called a static library, because the library will remain unchanged unless the program is recompiled. The final result is a simple executable with no dependencies.

A static library under Linux is nothing more than an archive of object files.

Let us create a static library