Difference between revisions of "Programming/Tools/Make"

From Thalesians Wiki
 
Latest revision as of 21:02, 24 December 2020

Make

Make is a build system (build automation tool) that automatically builds executable programs and libraries from source code by reading files called Makefiles which specify how to derive the target program. Though integrated development environments and language-specific compiler features can also be used to manage a build process, Make remains widely used, especially on Unix and Unix-like (e.g. Linux) operating systems.

Besides building programs, Make can be used to manage any project where some files must be updated automatically from others whenever the others change.

Make was originally created by Stuart Feldman in April 1976 at Bell Labs. Feldman received the 2003 Association for Computing Machinery (ACM) Software System Award for the authoring of this tool.

Feldman was inspired to write Make by the experience of a coworker in futilely debugging a program of his where the executable was accidentally not being updated with changes:

Make originated with a visit from Steve Johnson (author of yacc, etc.), storming into my office, cursing the Fates that had caused him to waste a morning debugging a correct program (bug had been fixed, file hadn't been compiled, cc *.o was therefore unaffected). As I had spent a part of the previous evening coping with the same disaster on a project I was working on, the idea of a tool to solve it came up. It began with an elaborate idea of a dependency analyzer, boiled down to something much simpler, and turned into Make that weekend. Use of tools that were still wet was part of the culture. Makefiles were text files, not magically encoded binaries, because that was the Unix ethos: printable, debuggable, understandable stuff.

—Stuart Feldman, The Art of Unix Programming, Eric S. Raymond, 2003.

Alternatives and complementary utilities

Nowadays there are alternatives to Make, as well as build utilities that work alongside Make. See An overview of build systems (mostly for C++ projects) by Julien Jorge.

We will list some of them here:

Whereas Make is a build system, which drives the compiler and other build tools to build your code, some of the above utilities, such as CMake, are generators of build systems. For example, CMake can produce Makefiles, Ninja build files, KDevelop or Xcode projects, and Visual Studio solutions from the same starting point: the same CMakeLists.txt file. So if you have a platform-independent project, CMake is a way to make it build system-independent as well.

We won't consider CMake in this tutorial, but you will find information on it in the dedicated CMake tutorial.

Preliminaries

If Make is not yet present on your system, it can be installed (on Debian-based distributions such as Ubuntu) using

$ sudo apt update
...
$ sudo apt install make

Once Make is installed, to find out its version, you can use

$ make --version
GNU Make 4.2.1
Built for x86_64-pc-linux-gnu
Copyright (C) 1988-2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Building without Make

Let's make a directory

$ mkdir my_project

Within that directory, let's create main.c and edit it:

$ cd my_project
$ touch main.c
$ nano main.c

Let's set the contents of main.c to

#include <stdio.h>

int main() {
    printf("Hello, World!\n");
    return 0;
}

We could build our program simply using

$ gcc -Wall -c main.c
$ gcc -Wall -o hello main.o

The first line will result in the production of main.o, an object file—a file containing object code, that is, machine code output of an assembler or compiler.

The object code is usually relocatable (relocation is the process of assigning load addresses for position-dependent code and data of a program and adjusting the code and data to reflect the assigned addresses), and is not usually directly executable (an executable file causes a computer to perform indicated tasks according to encoded instructions, as opposed to a data file that must be interpreted (parsed) by a program to be meaningful).

One or more object files can be linked by a linker into a single executable program or library, pulling in precompiled system libraries as needed. In contrast, scripts (Python or JavaScript, for example) are interpreted, and Java programs are compiled into byte-code class files.

The second line, namely

$ gcc -Wall -o hello main.o

invokes the linker and links a single object file main.o into an executable file hello.

We can now run the executable using

$ ./hello
Hello, World!

We have built our simple project.

Introducing Make

Introducing Make in our simple project is overkill. However, it's still a good setting in which to introduce Make. Later, when our project becomes more complex, we'll be able to appreciate Make's usefulness.

Let us first remove the previously built hello and main.o:

$ rm hello main.o

Let us then create a Makefile and edit it:

$ touch Makefile
$ nano Makefile

Let's set its contents to:

hello : main.o
        gcc -Wall -o hello main.o

main.o : main.c
        gcc -Wall -c main.c

Here we have specified two rules. A rule tells Make how to remake certain files, called the rule's targets (most often only one per rule). It lists the other files that are the dependencies of the target and the commands to use to create or update the target.

The order of rules is not significant, except for determining the default goal: the target for Make to consider, if you do not otherwise specify one. The default goal is the target of the first rule in the first Makefile.

In general a rule looks like this:

targets : dependencies
        command
        ...

or like this:

targets : dependencies ; command
        command
        ...

The targets are file names, separated by spaces. Wildcard characters may be used. Usually there is only one target per rule. The command lines start with a tab character (not spaces). The first command may appear on the line after the dependencies, with a tab character, or may appear on the same line, with a semicolon. Either way, the effect is the same.

A rule tells Make two things: when the targets are out of date, and how to update them when necessary.

The criterion for being out of date is specified in terms of the dependencies, which consist of file names separated by spaces. (Wildcards are allowed here too.) A target is out of date if it does not exist or if it is older than any of the dependencies (by comparison of last-modification times). The idea is that the contents of the target file are computed based on information in the dependencies, so if any of the dependencies change, the contents of the existing target file are no longer necessarily valid.
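
The out-of-date test described above can be sketched in C. This is only an illustration of the criterion, not how Make is implemented: the function name is_out_of_date is hypothetical, timestamps are compared with whole-second resolution via stat(), and a missing dependency is simply skipped (Make would instead try to build it or report an error).

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/*
 * Return 1 if `target` should be rebuilt, 0 if it is up to date.
 * A target is out of date if it does not exist or if it is older
 * than any of its dependencies.
 * (Hypothetical helper for illustration; this is not Make's source.)
 */
int is_out_of_date(const char *target, const char *const *deps, size_t ndeps)
{
    struct stat target_info;
    size_t i;

    assert(target != NULL);

    /* A missing target is always out of date. */
    if (stat(target, &target_info) != 0)
        return 1;

    for (i = 0; i < ndeps; i++) {
        struct stat dep_info;

        /* This sketch skips dependencies that do not exist. */
        if (stat(deps[i], &dep_info) != 0)
            continue;

        /* Compare last-modification times (whole seconds). */
        if (target_info.st_mtime < dep_info.st_mtime)
            return 1;
    }
    return 0;
}
```

For the rule "main.o : main.c", the call would pass "main.o" as the target and a one-element array containing "main.c" as the dependencies. A target whose timestamp equals that of a dependency counts as up to date.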

How to update is specified by commands. These are lines to be executed by the shell.

Let's now run

$ make

We'll see the following output:

gcc -Wall -c main.c
gcc -Wall -o hello main.o

Make notices that the target hello does not exist; it is therefore out of date and must be built. hello depends on main.o, which is itself another target. The target main.o does not exist either; therefore it is out of date and must be built. Make then executes the command needed to build it, namely

gcc -Wall -c main.c

Now that the dependencies of the target hello exist, it can also be built. Make executes the requisite command:

gcc -Wall -o hello main.o

Let's run

$ make

again. This time the output is

make: 'hello' is up to date.

The default goal, the target hello, is up to date because its modification time is greater than or equal to the modification times of its dependencies.

We could

$ rm hello

and then run

$ make

In this case the output will be

gcc -Wall -o hello main.o

Only hello will be rebuilt; since the target main.o is up to date (its modification time is greater than or equal to the modification time of its dependency, main.c) it doesn't need rebuilding.
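
The behaviour seen in the three runs above can be reproduced with a small in-memory model. The sketch below is an assumption-laden toy, not Make's algorithm as implemented: struct node, update, and the fake clock are hypothetical names, a modification time of 0 stands for "file does not exist", and "running the commands" is reduced to stamping the target with a new time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEPS 4

/* One file in the dependency graph. mtime == 0 means "does not exist". */
struct node {
    const char *name;
    long mtime;
    int has_rule;                 /* 0 for plain source files such as main.c */
    struct node *deps[MAX_DEPS];
    size_t ndeps;
};

/*
 * Bring `n` up to date, depth-first. `now` is a fake time source and
 * `rebuilt` collects the names of the targets that were rebuilt,
 * separated by spaces (it must start out as a valid C string).
 */
void update(struct node *n, long *now, char *rebuilt, size_t rebuilt_size)
{
    int out_of_date = (n->mtime == 0);
    size_t i;

    assert(n->ndeps <= MAX_DEPS);

    for (i = 0; i < n->ndeps; i++) {
        /* Dependencies are brought up to date before the target. */
        update(n->deps[i], now, rebuilt, rebuilt_size);
        if (n->mtime < n->deps[i]->mtime)
            out_of_date = 1;
    }

    if (n->has_rule && out_of_date) {
        n->mtime = ++*now;        /* stands in for running the commands */
        /* Record the name if it fits (separator and NUL included). */
        if (strlen(rebuilt) + strlen(n->name) + 2 <= rebuilt_size) {
            if (rebuilt[0] != '\0')
                strcat(rebuilt, " ");
            strcat(rebuilt, n->name);
        }
    }
}
```

With three nodes wired as hello -> main.o -> main.c, the first call to update on hello rebuilds main.o and then hello, a second call rebuilds nothing, and resetting hello's time to 0 (the equivalent of rm hello) causes only hello to be rebuilt.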

Header-only libraries

Let us modify main.c as follows:

#include "sphere.h"

int main(void)
{
    float radius, vol;

    printf("Input the radius of the sphere: ");
    radius = get_value();
    printf("Surface area = ");
    put_value(surface_area(radius));
    vol = volume(radius);
    printf("Volume of sphere = ");
    put_value(vol);

    return EXIT_SUCCESS;
}

Let us now run Make:

$ make
gcc -Wall -c main.c
main.c:1:10: fatal error: sphere.h: No such file or directory
    1 | #include "sphere.h"
      |          ^~~~~~~~~~
compilation terminated.
make: *** [Makefile:5: main.o] Error 1

We haven't yet created sphere.h. Let us

$ touch sphere.h
$ nano sphere.h

and set its contents to the following:

#ifndef SPHERE_H_INCLUDED
#define SPHERE_H_INCLUDED

#include <stdio.h>
#include <stdlib.h>

#define PI 3.14159265

float get_value(void)
{
    float x;

    scanf("%f", &x);
    return x;
}

void put_value(float x)
{
    printf("%f\n", x);
}

static float my_pow(float x, int n);

float surface_area(float r)
{
    return 4. * PI * my_pow(r, 2);
}

float volume(float r)
{
    return 4. * PI * my_pow(r, 3) / 3.;
}

static float my_pow(float x, int n)
{
    if (n < 0)
        return 1. / my_pow(x, -n);
    else if (n == 0)
        return 1;
    else
        return x * my_pow(x, n - 1);
}

#endif /* SPHERE_H_INCLUDED */

If we run Make now, everything will work, but we mustn't forget to add sphere.h to main.o's dependencies. Currently, if sphere.h changes, main.o won't be rebuilt. Thus we

$ nano Makefile

and update it accordingly:

hello : main.o
        gcc -Wall -o hello main.o

main.o : main.c sphere.h
        gcc -Wall -c main.c

If we now run Make, the project will be rebuilt:

$ make
gcc -Wall -c main.c
gcc -Wall -o hello main.o

Let's try running the resulting executable:

$ ./hello
Input the radius of the sphere: 3.5
Surface area = 153.938034
Volume of sphere = 179.594376

sphere.h constitutes a "header-only" library. A library is called header-only if the full definitions of all macros, functions, and classes comprising the library are visible to the compiler in a header file form. Header-only libraries do not need to be separately compiled, packaged, and installed in order to be used. All that is required is to point the compiler at the location of the headers, and then #include the header files into the application source. Another advantage is that the compiler's optimizer can do a much better job when all the library's source code is available.

The disadvantages include:

  • brittleness—most changes to the library will require recompilation of all compilation units using that library;
  • longer compilation times—the compilation unit must see the implementation of all components in the included files rather than just their interfaces;
  • code-bloat (this may be disputed)—the necessary use of inline statements in non-class functions can lead to code bloat by over-inlining.

Nonetheless, the header-only form is popular because it avoids the (often much more serious) problem of packaging.

For C++ templates, including the definitions in the header is the only way to compile, since the compiler needs to know the full definition of the templates in order to instantiate.
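
One practical point about header-only libraries in C: sphere.h as written above defines ordinary external functions, which is fine while only main.c includes it, but would lead to multiple-definition errors at link time if two source files of the same program included it. A common remedy is to mark the functions static inline. The sketch below shows the idea; the file name sphere_inline.h and the sphere_ prefixes are hypothetical, and the power function is simplified to non-negative exponents.

```c
#include <assert.h>

/*
 * sphere_inline.h (hypothetical name): a header-only variant in which
 * every function is `static inline`, so each translation unit that
 * includes the header gets its own private copy and the linker never
 * sees two external definitions of the same function.
 */
#ifndef SPHERE_INLINE_H_INCLUDED
#define SPHERE_INLINE_H_INCLUDED

#define SPHERE_PI 3.14159265f

static inline float sphere_pow(float x, int n)
{
    float result = 1.0f;

    assert(n >= 0);               /* this sketch handles only n >= 0 */
    while (n-- > 0)
        result *= x;
    return result;
}

static inline float sphere_surface_area(float r)
{
    return 4.0f * SPHERE_PI * sphere_pow(r, 2);
}

static inline float sphere_volume(float r)
{
    return 4.0f * SPHERE_PI * sphere_pow(r, 3) / 3.0f;
}

#endif /* SPHERE_INLINE_H_INCLUDED */
```

The Makefile rule is unaffected: every object file whose source includes the header still has to list it as a dependency.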

More object files

We have mentioned some reasons why header-only libraries may not be a good idea in some situations.

Let us modify sphere.h so it contains declarations rather than definitions:

#ifndef SPHERE_H_INCLUDED
#define SPHERE_H_INCLUDED

#include <stdio.h>
#include <stdlib.h>

#define PI 3.14159265

float get_value(void);
void put_value(float x);
float surface_area(float r);
float volume(float r);

#endif /* SPHERE_H_INCLUDED */

Let us add geometry.c with the following contents:

#include "sphere.h"

static float my_pow(float x, int n);

float surface_area(float r)
{
    return 4. * PI * my_pow(r, 2);
}

float volume(float r)
{
    return 4. * PI * my_pow(r, 3) / 3.;
}

static float my_pow(float x, int n)
{
    if (n < 0)
        return 1. / my_pow(x, -n);
    else if (n == 0)
        return 1;
    else
        return x * my_pow(x, n - 1);
}

And let us add simple_io.c:

#include "sphere.h"

float get_value(void)
{
    float x;

    scanf("%f", &x);
    return x;
}

void put_value(float x)
{
    printf("%f\n", x);
}

If we now try to build the project, we'll get some errors:

$ make
gcc -Wall -c main.c
gcc -Wall -o hello main.o
/usr/bin/ld: main.o: in function `main':
main.c:(.text+0x1e): undefined reference to `get_value'
/usr/bin/ld: main.c:(.text+0x42): undefined reference to `surface_area'
/usr/bin/ld: main.c:(.text+0x47): undefined reference to `put_value'
/usr/bin/ld: main.c:(.text+0x53): undefined reference to `volume'
/usr/bin/ld: main.c:(.text+0x77): undefined reference to `put_value'
collect2: error: ld returned 1 exit status
make: *** [Makefile:2: hello] Error 1

That's because Make doesn't know about the source files geometry.c and simple_io.c.

Let us edit Makefile:

hello : main.o geometry.o simple_io.o
        gcc -Wall -o hello main.o geometry.o simple_io.o

main.o : main.c sphere.h
        gcc -Wall -c main.c

geometry.o : geometry.c sphere.h
        gcc -Wall -c geometry.c

simple_io.o : simple_io.c sphere.h
        gcc -Wall -c simple_io.c

Now everything will work:

$ make
gcc -Wall -c geometry.c
gcc -Wall -c simple_io.c
gcc -Wall -o hello main.o geometry.o simple_io.o

Let's try running the resulting executable:

$ ./hello
Input the radius of the sphere: 3.5
Surface area = 153.938034
Volume of sphere = 179.594376

Refactoring the Makefile

Let's take a look at our Makefile in its present form:

hello : main.o geometry.o simple_io.o
        gcc -Wall -o hello main.o geometry.o simple_io.o

main.o : main.c sphere.h
        gcc -Wall -c main.c

geometry.o : geometry.c sphere.h
        gcc -Wall -c geometry.c

simple_io.o : simple_io.c sphere.h
        gcc -Wall -c simple_io.c

You will notice there is quite a lot of repetition in it. Let's refactor it.

# A slightly more advanced Makefile (this is a comment).
TARGET = hello
OFILES = main.o geometry.o simple_io.o
CC = gcc
CFLAGS = -Wall

$(TARGET) : $(OFILES)
        $(CC) $(CFLAGS) -o $@ $^

main.o : main.c
        $(CC) $(CFLAGS) -c $< -o $@

geometry.o : geometry.c
        $(CC) $(CFLAGS) -c $< -o $@

simple_io.o : simple_io.c
        $(CC) $(CFLAGS) -c $< -o $@

$(OFILES) : sphere.h

clean:
        rm -f *.o $(TARGET)

Here $@ stands for the target, $^ for all the prerequisites, and $< for just the first prerequisite.

A macro definition assigns a symbol to some text. A macro definition is a line in the Makefile of the form

NAME = sequence_of_tokens

In the Makefile, expressions of the form $(NAME) or ${NAME} will be replaced by sequence_of_tokens.
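
To make the substitution concrete, here is a rough C sketch of textual macro expansion. It is a simplification under stated assumptions, not Make's implementation: struct macro, lookup, and expand are hypothetical names, nested references and automatic variables such as $@ are not handled, and an undefined name expands to nothing (which is also what Make does with undefined variables).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct macro {
    const char *name;
    const char *value;
};

/* Look up `name` (length `len`) in the table; NULL if undefined. */
static const char *lookup(const struct macro *table, size_t count,
                          const char *name, size_t len)
{
    size_t i;

    for (i = 0; i < count; i++)
        if (strlen(table[i].name) == len &&
            strncmp(table[i].name, name, len) == 0)
            return table[i].value;
    return NULL;
}

/*
 * Copy `in` to `out`, replacing each $(NAME) or ${NAME} with its value.
 * Returns 0 on success, -1 if `out` is too small.
 */
int expand(const struct macro *table, size_t count,
           const char *in, char *out, size_t out_size)
{
    size_t used = 0;

    assert(out_size > 0);
    while (*in != '\0') {
        const char *piece = in;   /* default: copy one character as-is */
        size_t piece_len = 1;

        if (in[0] == '$' && (in[1] == '(' || in[1] == '{')) {
            char closer = (in[1] == '(') ? ')' : '}';
            const char *end = strchr(in + 2, closer);

            if (end != NULL) {
                piece = lookup(table, count, in + 2, (size_t)(end - (in + 2)));
                piece_len = (piece != NULL) ? strlen(piece) : 0;
                in = end;         /* in++ below steps past the closer */
            }
        }
        if (used + piece_len >= out_size)
            return -1;            /* no room left for text plus NUL */
        if (piece_len > 0)
            memcpy(out + used, piece, piece_len);
        used += piece_len;
        in++;
    }
    out[used] = '\0';
    return 0;
}
```

Given a table that maps CC to gcc and CFLAGS to -Wall, expanding "$(CC) $(CFLAGS) -c main.c" yields "gcc -Wall -c main.c", which matches the command lines echoed by Make.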

We can now

$ make clean
rm -f *.o hello

and

$ make
gcc -Wall -c main.c -o main.o
gcc -Wall -c geometry.c -o geometry.o
gcc -Wall -c simple_io.c -o simple_io.o
gcc -Wall -o hello main.o geometry.o simple_io.o

Building a static library

A library is a collection of items that you can call from your program. You can save much time by reusing work that someone else has already done and be more confident that it has fewer bugs: since other people may be using the same library, you will benefit from their usage, bug detection, and bug fixes.

The most straightforward way of using a library function is to have the object files from the library linked directly into your executable, just as with those you have compiled yourself. When linked like this the library is called a static library, because the library will remain unchanged unless the program is recompiled. The final result is a simple executable with no dependencies.

The static library under Linux is nothing more than an archive of object files.

Let us collect some of the object files in our example into a static library:

# Building a static library and an executable.
TARGET = hello
MAINOFILE = main.o
LIBOFILES = geometry.o simple_io.o
CC = gcc
CFLAGS = -Wall

$(TARGET) : main.o libsphere.a
        $(CC) $(CFLAGS) -o $@ $^

main.o : main.c
        $(CC) $(CFLAGS) -c $< -o $@

libsphere.a : $(LIBOFILES)
        ar rcs $@ $^

geometry.o : geometry.c
        $(CC) $(CFLAGS) -c $< -o $@

simple_io.o : simple_io.c
        $(CC) $(CFLAGS) -c $< -o $@

$(MAINOFILE) $(LIBOFILES) : sphere.h

clean:
        rm -f *.o *.a $(TARGET)

ar is a Unix utility that creates, modifies, and extracts from archives. The options used here are r (insert the files into the archive, replacing any existing members of the same name), c (create the archive if it does not already exist), and s (write an object-file index into the archive).

Let us now

$ make clean
rm -f *.o *.a hello

and

$ make
gcc -Wall -c main.c -o main.o
gcc -Wall -c geometry.c -o geometry.o
gcc -Wall -c simple_io.c -o simple_io.o
ar rcs libsphere.a geometry.o simple_io.o
gcc -Wall -o hello main.o libsphere.a

We notice that this time libsphere.a has been created alongside the executable. The object files inside it have been linked into the executable and there is no need to distribute libsphere.a alongside hello—everything that we need is inside hello.

Building a shared library

Static libraries, while reusable across multiple programs, are locked into a program at link time. Dynamic or shared libraries exist as separate files outside of the executable file.

The code of a static library is copied into the executable file and cannot be changed without relinking the program. In contrast, a dynamic library can be replaced without rebuilding the programs that use it, provided its interface stays compatible. The price is that if a dynamic library is missing or becomes corrupt, the executable may no longer work, whereas statically linked code cannot go missing because it lives inside the executable file. The upside of using a dynamic library is that multiple running applications can share the same library without each needing its own copy.

Let us see how we can build a shared library using Make.

# Building a shared library and an executable.
TARGET = hello
MAINOFILE = main.o
LIBOFILES = geometry.o simple_io.o
CC = gcc
CFLAGS = -Wall

$(TARGET) : main.o libsphere.so
        $(CC) $(CFLAGS) -o $@ $< -lsphere

main.o : main.c
        $(CC) $(CFLAGS) -c $< -o $@

libsphere.so : $(LIBOFILES)
        $(CC) -shared -o $@ $^

geometry.o : geometry.c
        $(CC) $(CFLAGS) -fpic -c $< -o $@

simple_io.o : simple_io.c
        $(CC) $(CFLAGS) -fpic -c $< -o $@

$(MAINOFILE) $(LIBOFILES) : sphere.h

clean:
        rm -f *.o *.a *.so $(TARGET)

Here -fpic tells the compiler to produce position-independent code (PIC). Such code, being placed somewhere in the primary memory, executes properly regardless of its absolute address. PIC is commonly used for shared libraries, so that the same library code can be loaded in a location in each program's address space where it doesn't overlap with other memory in use (e.g. other shared libraries).

Let us

$ make clean
rm -f *.o *.a *.so hello

But when we try to make, we get an error message:

$ make
gcc -Wall -c main.c -o main.o
gcc -Wall -fpic -c geometry.c -o geometry.o
gcc -Wall -fpic -c simple_io.c -o simple_io.o
gcc -shared -o libsphere.so geometry.o simple_io.o
gcc -Wall -o hello main.o -lsphere
/usr/bin/ld: cannot find -lsphere
collect2: error: ld returned 1 exit status
make: *** [Makefile:9: hello] Error 1

The linker does not know where to find libsphere.so. GCC has a list of places it examines by default, but our directory is not on that list. We need to tell GCC where to find libsphere.so. We will do that with the -L option:

# Building a shared library and an executable.
TARGET = hello
MAINOFILE = main.o
LIBOFILES = geometry.o simple_io.o
CC = gcc
CFLAGS = -Wall

$(TARGET) : main.o libsphere.so
        $(CC) $(CFLAGS) -L/home/ubuntu/my_project -o $@ $< -lsphere

main.o : main.c
        $(CC) $(CFLAGS) -c $< -o $@

libsphere.so : $(LIBOFILES)
        $(CC) -shared -o $@ $^

geometry.o : geometry.c
        $(CC) $(CFLAGS) -fpic -c $< -o $@

simple_io.o : simple_io.c
        $(CC) $(CFLAGS) -fpic -c $< -o $@

$(MAINOFILE) $(LIBOFILES) : sphere.h

clean:
        rm -f *.o *.a *.so $(TARGET)

This time the linking has succeeded:

$ make
gcc -Wall -L/home/ubuntu/my_project -o hello main.o -lsphere

Note that the -lsphere option is not looking for sphere.so but libsphere.so. GCC assumes that all libraries start with lib and end with .a or .so (.a for archive or statically linked libraries, .so for shared object or shared libraries).
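
The naming convention can be written down as a tiny C helper. This is purely illustrative: library_file_name is a hypothetical function, not part of GCC or the linker, and it covers only the lib prefix and the .a and .so suffixes mentioned above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Turn a linker option such as "-lsphere" into the file name the linker
 * looks for: "libsphere.so" when `shared` is non-zero, "libsphere.a"
 * otherwise. Returns 0 on success, -1 if `option` does not start with
 * "-l" followed by a name, or if `out` is too small.
 * (Hypothetical helper, for illustration only.)
 */
int library_file_name(const char *option, int shared,
                      char *out, size_t out_size)
{
    int written;

    assert(option != NULL && out != NULL);
    if (strncmp(option, "-l", 2) != 0 || option[2] == '\0')
        return -1;

    written = snprintf(out, out_size, "lib%s%s",
                       option + 2, shared ? ".so" : ".a");
    return (written < 0 || (size_t)written >= out_size) ? -1 : 0;
}
```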

Now let's try running hello:

$ ./hello
./hello: error while loading shared libraries: libsphere.so: cannot open shared object file: No such file or directory

The loader cannot find the shared library. We didn't install it in a standard location, so we need to give the loader a little help. We can use the environment variable LD_LIBRARY_PATH:
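
LD_LIBRARY_PATH holds a colon-separated list of directories. The following C sketch shows what searching such a list for a file might look like. It is a simplified assumption of the idea only: find_in_path_list is a hypothetical name, empty entries are skipped, and the real dynamic loader consults other locations as well.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Search a colon-separated directory list (the format of
 * LD_LIBRARY_PATH) for `file_name`. On success, write the full path
 * to `out` and return 1; otherwise return 0. Empty entries are
 * skipped in this sketch.
 */
int find_in_path_list(const char *path_list, const char *file_name,
                      char *out, size_t out_size)
{
    const char *start = path_list;

    assert(path_list != NULL && file_name != NULL && out != NULL);
    while (*start != '\0') {
        size_t dir_len = strcspn(start, ":");   /* length up to next ':' */

        if (dir_len > 0) {
            int written = snprintf(out, out_size, "%.*s/%s",
                                   (int)dir_len, start, file_name);

            if (written > 0 && (size_t)written < out_size &&
                access(out, F_OK) == 0)
                return 1;                       /* first match wins */
        }
        start += dir_len;
        if (*start == ':')
            start++;                            /* skip the separator */
    }
    return 0;
}
```

Directories are tried from left to right, which is why the export command in this section puts the project directory in front of any value the variable already had.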

$ export LD_LIBRARY_PATH=/home/ubuntu/my_project:$LD_LIBRARY_PATH

Now everything is working fine:

$ ./hello
Input the radius of the sphere: 3.5
Surface area = 153.938034
Volume of sphere = 179.594376