Difference between revisions of "Programming/Linux"

From Thalesians Wiki
 
(13 intermediate revisions by the same user not shown)
Line 1: Line 1:
= What is Linux? =
= What is Linux? =


[https://www.linuxfoundation.org/ Linux] is a family of open-source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on 1991.09.17 by [https://en.wikipedia.org/wiki/Linus_Torvalds Linux Torvalds]. Linux is typically packaged in a Linux distribution.
'''[https://www.linuxfoundation.org/ Linux]''' is a family of open-source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on 1991.09.17 by [https://en.wikipedia.org/wiki/Linus_Torvalds Linus Torvalds].


Distributions include the Linux kernel and supporting system software and libraries, many of which are provided by the [https://www.gnu.org/ GNU Project].
Linux is typically packaged in a Linux '''distribution'''. Distributions include the Linux kernel and supporting system software and libraries, many of which are provided by the [https://www.gnu.org/ GNU Project].


Popular Linux distributions include [https://www.debian.org/ Debian], [https://getfedora.org/ Fedora], and [https://ubuntu.com/ Ubuntu]. Commercial distributions include [https://www.redhat.com/ Red Hat Enterprise Linux] and [https://www.suse.com/ SUSE Linux Enterprise Server].
Popular Linux distributions include [https://www.debian.org/ Debian], [https://getfedora.org/ Fedora], and [https://ubuntu.com/ Ubuntu]. Commercial distributions include [https://www.redhat.com/ Red Hat Enterprise Linux] and [https://www.suse.com/ SUSE Linux Enterprise Server].
As a Unix-like system, Linux subscribes to what has become known as the [https://en.wikipedia.org/wiki/Unix_philosophy Unix philosophy]. In their preface to the 1984 book, [https://amzn.to/3aQeAq7 The UNIX Programming Environment], [https://en.wikipedia.org/wiki/Brian_Kernighan Brian Kernighan] and [https://en.wikipedia.org/wiki/Rob_Pike Rob Pike], both from [https://en.wikipedia.org/wiki/Bell_Labs Bell Labs], give a brief description of the Unix design and the Unix philosophy:
<blockquote>
Even though the UNIX system introduces a number of innovative programs and techniques, no single program or idea makes it work well. Instead, what makes it effective is the approach to programming, a philosophy of using the computer. Although that philosophy can't be written down in a single sentence, at its heart is the idea that the power of a system comes more from the relationships among programs than from the programs themselves. Many UNIX programs do quite trivial things in isolation, but, combined with other programs, become general and useful tools.
</blockquote>


= Learning about the system =
= Learning about the system =
Line 31: Line 37:
</pre>
</pre>


= Working with packages =
= User management =
 
== The root user ==
 
'''<tt>root</tt>''' is the conventional name of the user who has all rights or permissions (to all files or programs) in all modes (single- or multi-user). The root user (also known as the '''superuser''') can do many things an ordinary user cannot, such as changing the ownership of files and binding to network ports numbered below 1024.
 
== <tt>sudo</tt> ==
 
<tt>sudo</tt> is a program that allows users to run programs with the security privileges of another user, by default, <tt>root</tt>. It originally stood for "superuser do" as the older versions of <tt>sudo</tt> were designed to run commands only as the superuser. However, the later versions added support for running commands not only as the superuser but also as other (restricted) users, and thus it is commonly expanded as "substitute user do". Although the latter case reflects its current functionality more accurately, <tt>sudo</tt> is still often called "superuser do" since it is so often used for administrative tasks.
 
Unlike the similar command <tt>su</tt>, users must, by default, supply their own password for authentication, rather than the password of the target user. After authentication, and if the configuration file, which is typically located at <tt>/etc/sudoers</tt>, permits the user access, the system invokes the requested command. The configuration file offers detailed access permissions, including enabling commands only from the invoking terminal; requiring a password per user or group; requiring re-entry of a password every time or never requiring a password at all for a particular command line. It can also be configured to permit passing arguments or multiple commands.
 
= Processes =
 
== What is a process? ==
 
A '''process''' is an instance of a computer program that is being executed by one or many threads. It contains the program code and its activity.
 
Each process is uniquely identified by its '''process identifier''' (also known as '''process ID''' or '''PID''').
 
== Listing processes ==
 
A list of processes running on a Linux machine can be obtained by running
<pre>
$ top
</pre>
 
There is also a more advanced, interactive process viewer called
<pre>
$ htop
</pre>
 
== What is a thread? ==
 
A '''thread''' of execution is the smallest sequence of programmed instructions that can be managed independently by a scheduler, which is typically part of the operating system. Multiple threads can exist within one process, executing concurrently and sharing resources such as memory, while different processes do not share these resources. In particular, the threads of a process share its executable code and the values of its dynamically allocated variables and non-thread-local global variables at any given time.
 
= Environment variables =
 
== What are environment variables? ==
 
An '''environment variable''' is a dynamic named value that can affect the way running processes behave on a computer. Environment variables are part of the environment in which a process runs. For example, a running process can query the value of the <tt>TMPDIR</tt> environment variable (<tt>TEMP</tt> on Windows) to discover a suitable location to store temporary files or the <tt>HOME</tt> environment variable to find the directory structure owned by the user running the process.
 
== Displaying the value of an environment variable ==
 
To display the value of an environment variable, you can use
<pre>
$ echo $HOME
/home/ubuntu
</pre>
 
== Listing all environment variables ==
 
To obtain a list of all environment variables along with their values, you can use
<pre>
$ env
SHELL=/bin/bash
PWD=/home/ubuntu
LOGNAME=ubuntu
XDG_SESSION_TYPE=tty
MOTD_SHOWN=pam
HOME=/home/ubuntu
LANG=C.UTF-8
...
</pre>
 
== Creating or modifying an environment variable ==
 
To create a new environment variable or change the value of an existing one, you can use the syntax
<pre>
$ MY_VAR="this is a new value"
</pre>
 
You can then check that the value has been set:
<pre>
$ echo $MY_VAR
this is a new value
</pre>
 
However, <tt>MY_VAR</tt> is not yet a "proper" environment variable: it hasn't been '''exported''', so it is merely a shell variable. If we start a child process from the current shell, the variable won't be visible there:
<pre>
$ bash
$ echo $MY_VAR
 
$ exit
exit
$ echo $MY_VAR
this is a new value
</pre>
 
For an environment variable to be accessible to child processes, it must be '''exported''':
<pre>
ubuntu@ip-172-31-24-17:~$ export MY_VAR
ubuntu@ip-172-31-24-17:~$ bash
ubuntu@ip-172-31-24-17:~$ echo $MY_VAR
this is a new value
ubuntu@ip-172-31-24-17:~$ exit
exit
</pre>
 
It is also possible to define and export environment variables in one go:
<pre>
$ export MY_VAR="this is a new value"
</pre>
 
= Packages =
 
== What is a package? ==
 
A '''package''' is a distribution of software and data in archive files. Packages contain metadata, such as the software's name, description of its purpose, version number, vendor, checksum (preferably computed with a cryptographic hash function), and a list of dependencies necessary for the software to run properly. Upon installation, metadata is stored in a local package database.
 
== What is a package manager? ==
 
'''Package managers''' typically maintain a database of software dependencies and version information to prevent software mismatches and missing prerequisites. They work closely with software repositories, binary repository managers, and app stores.
 
Package managers are designed to eliminate the need for manual installs and updates. This can be particularly useful for large enterprises whose operating systems typically consist of hundreds or even tens of thousands of distinct software packages.
 
== What is APT? ==
 
'''Advanced Package Tool''' ('''APT''') is a free-software user interface that works with core libraries to handle the installation and removal of software on Debian, Ubuntu, and related Linux distributions. APT simplifies the process of managing software on Unix-like computer systems by automating the retrieval, configuration, and installation of software packages either from precompiled files or by compiling source code.


== How to install a package? ==
== How to install a package? ==
Line 72: Line 196:
</pre>
</pre>


= Working with files and directories =
= Files and directories =


== Basics ==
== Basics ==
Line 89: Line 213:


For example, <tt>/var/log/apache2/error.log</tt> is an absolute path to Apache's error log file. Assuming that the current directory is <tt>/var/log</tt> (another absolute path), the relative path to the same log file is given by <tt>apache2/error.log</tt>.
For example, <tt>/var/log/apache2/error.log</tt> is an absolute path to Apache's error log file. Assuming that the current directory is <tt>/var/log</tt> (another absolute path), the relative path to the same log file is given by <tt>apache2/error.log</tt>.
<tt>.</tt> and <tt>..</tt> have a special meaning in paths. <tt>.</tt> refers to the current directory; <tt>..</tt> refers to its parent.


== What is the current directory? ==
== What is the current directory? ==
Line 162: Line 288:
$ sudo apt install emacs
$ sudo apt install emacs
</pre>
</pre>
The file can then be edited using
<pre>
$ vi README.txt
</pre>
for Vim or
<pre>
$ emacs README.txt
</pre>
for Emacs.


The learning curve for both these editors is quite steep.
The learning curve for both these editors is quite steep.
== Deleting a file or directory ==
Once a file has been created, e.g. with
<pre>
$ touch test.txt
</pre>
it can be deleted using <tt>rm</tt>:
<pre>
$ rm test.txt
</pre>
Useful options for <tt>rm</tt> are <tt>-f</tt> (ignore nonexistent files and arguments, never prompt) and <tt>-R</tt> (remove directories and their contents recursively).
For example, we can create a directory,
<pre>
$ mkdir test
</pre>
possibly some files and/or directories underneath it,
<pre>
$ mkdir test/subdir
$ touch test/foo.txt
</pre>
then delete the entire directory subtree starting with and including <tt>test</tt> using
<pre>
$ rm -R test
</pre>


= DevOps =
= DevOps =

Latest revision as of 13:55, 28 December 2020

What is Linux?

Linux is a family of open-source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on 1991.09.17 by Linus Torvalds.

Linux is typically packaged in a Linux distribution. Distributions include the Linux kernel and supporting system software and libraries, many of which are provided by the GNU Project.

Popular Linux distributions include Debian, Fedora, and Ubuntu. Commercial distributions include Red Hat Enterprise Linux and SUSE Linux Enterprise Server.

As a Unix-like system, Linux subscribes to what has become known as the Unix philosophy. In their preface to the 1984 book, The UNIX Programming Environment, Brian Kernighan and Rob Pike, both from Bell Labs, give a brief description of the Unix design and the Unix philosophy:

Even though the UNIX system introduces a number of innovative programs and techniques, no single program or idea makes it work well. Instead, what makes it effective is the approach to programming, a philosophy of using the computer. Although that philosophy can't be written down in a single sentence, at its heart is the idea that the power of a system comes more from the relationships among programs than from the programs themselves. Many UNIX programs do quite trivial things in isolation, but, combined with other programs, become general and useful tools.

Learning about the system

Which Linux?

To find out which Linux distribution is running on your machine, you can use

$ hostnamectl
   Static hostname: ip-172-31-24-17
         Icon name: computer-vm
           Chassis: vm
        Machine ID: b54d0220fe634fa4a96fa3d0641ab3ea
           Boot ID: 5208456664c54b09b34be6b541fa7588
    Virtualization: xen
  Operating System: Ubuntu 20.04.1 LTS
            Kernel: Linux 5.4.0-1029-aws
      Architecture: x86-64

More specifically, to find out the kernel version, you can use

$ uname -r
5.4.0-1029-aws
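
As a complementary sketch, the same information can usually be gathered without hostnamectl, which is only available on systemd-based systems. The example assumes that /etc/os-release exists, as it does on most modern distributions; the variable names kernel and distro are illustrative.

```shell
# Kernel name and release, e.g. "Linux 5.4.0-1029-aws"
kernel="$(uname -sr)"

# /etc/os-release is a file of KEY=value lines. Source it inside a command
# substitution (a subshell) so its variables do not leak into this shell.
if [ -r /etc/os-release ]; then
    distro="$(. /etc/os-release && echo "${PRETTY_NAME:-unknown}")"
else
    distro="unknown (no /etc/os-release)"
fi

echo "Kernel:       $kernel"
echo "Distribution: $distro"
```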

User management

The root user

root is the conventional name of the user who has all rights or permissions (to all files or programs) in all modes (single- or multi-user). The root user (also known as the superuser) can do many things an ordinary user cannot, such as changing the ownership of files and binding to network ports numbered below 1024.
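
A script can check whether it is running as root by inspecting the numeric user ID, which is 0 for the superuser. The following is a minimal sketch; the variable name uid and the messages are illustrative.

```shell
# id -u prints the numeric ID of the current user; root always has ID 0
uid="$(id -u)"

if [ "$uid" -eq 0 ]; then
    echo "Running as root: full privileges."
else
    echo "Running as $(id -un) (UID $uid): administrative commands need sudo."
fi
```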

sudo

sudo is a program that allows users to run programs with the security privileges of another user, by default, root. It originally stood for "superuser do" as the older versions of sudo were designed to run commands only as the superuser. However, the later versions added support for running commands not only as the superuser but also as other (restricted) users, and thus it is commonly expanded as "substitute user do". Although the latter case reflects its current functionality more accurately, sudo is still often called "superuser do" since it is so often used for administrative tasks.

Unlike the similar command su, users must, by default, supply their own password for authentication, rather than the password of the target user. After authentication, and if the configuration file, which is typically located at /etc/sudoers, permits the user access, the system invokes the requested command. The configuration file offers detailed access permissions, including enabling commands only from the invoking terminal; requiring a password per user or group; requiring re-entry of a password every time or never requiring a password at all for a particular command line. It can also be configured to permit passing arguments or multiple commands.

Processes

What is a process?

A process is an instance of a computer program that is being executed by one or many threads. It contains the program code and its activity.

Each process is uniquely identified by its process identifier (also known as process ID or PID).
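
PIDs can be observed directly from the shell. The sketch below uses only shell built-ins and sleep: $$ expands to the PID of the current shell, $PPID to that of its parent, and $! to the PID of the most recently started background process. The variable name child_pid is illustrative.

```shell
echo "PID of this shell:  $$"
echo "PID of its parent:  $PPID"

# Start a background process and remember its PID
sleep 30 &
child_pid=$!
echo "Started sleep with PID $child_pid"

# A PID is what tools such as kill use to address a process
kill "$child_pid"

# Collect the exit status of the terminated process; the status is
# non-zero because the process was killed, so ignore it explicitly
wait "$child_pid" 2>/dev/null || true
```

After the wait, the PID no longer refers to a running process and may eventually be reused by the system for a new one.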

Listing processes

A list of processes running on a Linux machine can be obtained by running

$ top

There is also a more advanced, interactive process viewer called

$ htop

What is a thread?

A thread of execution is the smallest sequence of programmed instructions that can be managed independently by a scheduler, which is typically part of the operating system. Multiple threads can exist within one process, executing concurrently and sharing resources such as memory, while different processes do not share these resources. In particular, the threads of a process share its executable code and the values of its dynamically allocated variables and non-thread-local global variables at any given time.

Environment variables

What are environment variables?

An environment variable is a dynamic named value that can affect the way running processes behave on a computer. Environment variables are part of the environment in which a process runs. For example, a running process can query the value of the TMPDIR environment variable (TEMP on Windows) to discover a suitable location to store temporary files or the HOME environment variable to find the directory structure owned by the user running the process.

Displaying the value of an environment variable

To display the value of an environment variable, you can use

$ echo $HOME
/home/ubuntu
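
Scripts often need a fallback for when a variable is unset. The following is a minimal sketch using the shell's ${VAR:-default} expansion; TMPDIR is the conventional variable for a temporary directory on Unix-like systems, and /tmp is assumed here as the fallback.

```shell
# Use $TMPDIR if it is set and non-empty, otherwise fall back to /tmp
tmp_dir="${TMPDIR:-/tmp}"
echo "Temporary files go to: $tmp_dir"

# printenv prints a variable only if it is part of the environment
# and returns a non-zero status otherwise
printenv HOME || echo "HOME is not set"
```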

Listing all environment variables

To obtain a list of all environment variables along with their values, you can use

$ env
SHELL=/bin/bash
PWD=/home/ubuntu
LOGNAME=ubuntu
XDG_SESSION_TYPE=tty
MOTD_SHOWN=pam
HOME=/home/ubuntu
LANG=C.UTF-8
...

Creating or modifying an environment variable

To create a new environment variable or change the value of an existing one, you can use the syntax

$ MY_VAR="this is a new value"

You can then check that the value has been set:

$ echo $MY_VAR
this is a new value

However, MY_VAR is not yet a "proper" environment variable: it hasn't been exported, so it is merely a shell variable. If we start a child process from the current shell, the variable won't be visible there:

$ bash
$ echo $MY_VAR

$ exit
exit
$ echo $MY_VAR
this is a new value

For an environment variable to be accessible to child processes, it must be exported:

$ export MY_VAR
$ bash
$ echo $MY_VAR
this is a new value
$ exit
exit

It is also possible to define and export environment variables in one go:

$ export MY_VAR="this is a new value"
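
The difference between shell variables and exported variables can be checked non-interactively. In this sketch each child process is started with sh -c, and the names not_exported, exported, one_off and OTHER_VAR are illustrative. The last case shows a third form, VAR=value command, which places a variable in the environment of a single command only.

```shell
unset MY_VAR                      # start from a clean state

# 1. A plain shell variable is invisible to child processes
MY_VAR="this is a new value"
not_exported="$(sh -c 'echo "${MY_VAR:-<unset>}"')"
echo "before export: $not_exported"     # prints: before export: <unset>

# 2. After export, child processes inherit the variable
export MY_VAR
exported="$(sh -c 'echo "$MY_VAR"')"
echo "after export:  $exported"         # prints: after export:  this is a new value

# 3. A one-off assignment affects only the command it prefixes
one_off="$(OTHER_VAR=temporary sh -c 'echo "$OTHER_VAR"')"
echo "one-off:       $one_off"          # prints: one-off:       temporary
```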

Packages

What is a package?

A package is a distribution of software and data in archive files. Packages contain metadata, such as the software's name, description of its purpose, version number, vendor, checksum (preferably computed with a cryptographic hash function), and a list of dependencies necessary for the software to run properly. Upon installation, metadata is stored in a local package database.

What is a package manager?

Package managers typically maintain a database of software dependencies and version information to prevent software mismatches and missing prerequisites. They work closely with software repositories, binary repository managers, and app stores.

Package managers are designed to eliminate the need for manual installs and updates. This can be particularly useful for large enterprises whose operating systems typically consist of hundreds or even tens of thousands of distinct software packages.

What is APT?

Advanced Package Tool (APT) is a free-software user interface that works with core libraries to handle the installation and removal of software on Debian, Ubuntu, and related Linux distributions. APT simplifies the process of managing software on Unix-like computer systems by automating the retrieval, configuration, and installation of software packages either from precompiled files or by compiling source code.

How to install a package?

Use apt, the command line interface for the package management system.

Before proceeding, run

$ sudo apt update

update is used to download package information from all configured sources. Other commands operate on this data to e.g. perform package upgrades or search in and display details about all packages available for installation.

Once this is done, run

$ sudo apt install emacs

to install the GNU project Emacs editor,

$ sudo apt install mc

to install the GNU Midnight Commander. Other packages are installed in a similar manner.
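
Before installing, a script may want to check whether a program is already available. The sketch below uses command -v, which looks a command up on the PATH. The helper name is_installed is hypothetical, and the check assumes that the command name matches the package name, which holds for emacs and mc but not for every package.

```shell
# Succeeds if the given command can be found on the PATH
is_installed() {
    command -v "$1" >/dev/null 2>&1
}

for tool in emacs mc; do
    if is_installed "$tool"; then
        echo "$tool is already installed at $(command -v "$tool")"
    else
        echo "$tool not found; install it with: sudo apt install $tool"
    fi
done
```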

How to find out which packages can be upgraded?

$ sudo apt list --upgradable

will produce a list of all packages that can be upgraded.

How to upgrade a package?

To upgrade a specific package, say emacs, without upgrading anything else, you can use

$ sudo apt install --only-upgrade emacs

To upgrade all upgradable packages, use

$ sudo apt upgrade

Files and directories

Basics

A file system controls how data is stored and retrieved. Without a file system, data placed in a storage medium would be one large body of data with no way to tell where one piece of data stops and the next begins. By separating the data into pieces and giving each piece a name, the data is easily isolated and identified.

Taking its name from the way paper-based data management systems are organized, each group of data is called a "file". A file is a computer resource for recording data discretely in a computer storage device. Just as words can be written to paper, so can information be written to a file. Files can be edited on a particular computer system and transferred to other systems, for example over the internet.

A directory is a file system cataloging structure which contains references to other computer files and possibly other directories. Files are organized by storing related files in the same directory. In a hierarchical file system (that is, one in which files and directories are organized in a manner that resembles a tree), a directory contained inside another directory is called a subdirectory. The terms parent and child are often used to describe the relationship between a subdirectory and the directory in which it is cataloged, the latter being the parent.

The top-most directory in such a file system, which does not have a parent of its own, is called the root directory.

A path specifies a unique location in a file system. A path points to a file system location (file or directory) by following the directory tree hierarchy expressed in a string of characters in which path components, separated by a delimiting character (on Linux, "/"), represent each directory.

An absolute or full path is defined as specifying the location of a file or directory from the root directory (/). It points to the same location in a file system regardless of the current working directory. By contrast, a relative path starts from some given working directory, avoiding the need to provide the full absolute path.

For example, /var/log/apache2/error.log is an absolute path to Apache's error log file. Assuming that the current directory is /var/log (another absolute path), the relative path to the same log file is given by apache2/error.log.

. and .. have a special meaning in paths. . refers to the current directory; .. refers to its parent.
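
The behaviour of relative paths, . and .. can be tried out safely in a scratch directory. In this sketch, base is a temporary directory standing in for the real root, so the example does not depend on Apache being installed; the variable names are illustrative.

```shell
# Build a scratch tree that mirrors the example paths
base="$(cd "$(mktemp -d)" && pwd -P)"
mkdir -p "$base/var/log/apache2"

# Each command starts in .../var/log and follows a relative path;
# pwd -P then prints the absolute path that was reached
log_dir="$(cd "$base/var/log" && cd apache2 && pwd -P)"
parent="$(cd "$base/var/log" && cd .. && pwd -P)"
same="$(cd "$base/var/log" && cd ./apache2/.. && pwd -P)"

echo "apache2        -> $log_dir"
echo "..             -> $parent"
echo "./apache2/..   -> $same"

rm -R "$base"
```

The third path steps into apache2 and straight back out, so it resolves to the starting directory.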

What is the current directory?

To find out the current directory on a Linux system, use

$ pwd
/home/ubuntu

How can I change the current directory?

The current directory can be changed using cd, for example:

~$ cd /var/www
/var/www$

To change the current directory to the user's home directory, one can use simply

$ cd

What are the contents of the current directory?

To list the contents of the current directory, you can use ls. In its most basic form, it's simply

$ ls

A useful variant is

$ ls -alt

where -a tells ls not to ignore entries starting with ., -l means that ls should use a long listing format, and -t tells it to sort by modification time, newest first. (ls -a -l -t can be compressed into ls -alt.)
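
The effect of -a can be seen in a scratch directory containing one ordinary and one hidden file. This is a small sketch; the file names and variable names are illustrative.

```shell
dir="$(mktemp -d)"
touch "$dir/visible.txt" "$dir/.hidden"

ls "$dir"       # lists only visible.txt
ls -a "$dir"    # also lists ., .. and .hidden

# When its output is piped, ls prints one entry per line
visible_count="$(ls "$dir" | wc -l)"
all_count="$(ls -a "$dir" | wc -l)"
echo "entries without -a: $visible_count, with -a: $all_count"

rm -R "$dir"
```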

Creating an empty file

To create an empty file, you can use touch:

$ touch README.txt

If the file already exists, touch will update its access and modification times to the current time.
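
Both behaviours of touch can be verified in a scratch directory. In this sketch the variable names are illustrative; the point is that touching an existing file leaves its contents alone.

```shell
dir="$(mktemp -d)"
file="$dir/README.txt"

# First touch: the file does not exist yet, so an empty file is created
touch "$file"
[ -s "$file" ] || echo "README.txt exists and is empty"

# Second touch: the file has contents, and only its timestamps change
printf 'hello\n' > "$file"
touch "$file"
content="$(cat "$file")"
echo "contents after second touch: $content"

rm -R "$dir"
```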

Editing a file

Suppose we want to edit the file README.txt. We are spoilt for choice when it comes to choosing an editor.

Many systems come with Nano ("Nano's ANOther editor"), "a small and friendly editor", pre-installed:

$ nano README.txt

If Nano is not installed, you could install it using

$ sudo apt update
...
$ sudo apt install nano

There are two feature-rich, customizable editors popular among software developers: Vim and Emacs.

They can be installed using, respectively,

$ sudo apt update
...
$ sudo apt install vim

and

$ sudo apt update
...
$ sudo apt install emacs

The file can then be edited using

$ vi README.txt

for Vim or

$ emacs README.txt

for Emacs.

The learning curve for both these editors is quite steep.

Deleting a file or directory

Once a file has been created, e.g. with

$ touch test.txt

it can be deleted using rm:

$ rm test.txt

Useful options for rm are -f (ignore nonexistent files and arguments, never prompt) and -R (remove directories and their contents recursively).

For example, we can create a directory,

$ mkdir test

possibly some files and/or directories underneath it,

$ mkdir test/subdir
$ touch test/foo.txt

then delete the entire directory subtree starting with and including test using

$ rm -R test
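
Because rm -R deletes without any way of undoing the operation, it is worth rehearsing in a scratch directory. The sketch below also shows that plain rm refuses to remove a directory; the names work and plain_rm are illustrative.

```shell
work="$(mktemp -d)"
mkdir -p "$work/test/subdir"
touch "$work/test/foo.txt"

# Without -R, rm refuses to remove a directory
if rm "$work/test" 2>/dev/null; then
    plain_rm="removed"
else
    plain_rm="refused"
fi
echo "rm without -R: $plain_rm"

# With -R, the directory and everything below it is removed
rm -R "$work/test"
[ -e "$work/test" ] || echo "test and its contents are gone"

rm -R "$work"
```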

DevOps

What is my system currently doing?

To find out what the system is currently doing, including things such as CPU and memory utilization, you can use glances.

To install it, use

$ sudo apt update
...
$ sudo apt install glances

Then run it using

$ glances
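
When glances cannot be installed, a rough snapshot can be assembled from files and tools that are normally present. This sketch assumes a Linux system where /proc/loadavg and /proc/meminfo are readable; the variable names are illustrative.

```shell
# Load averages over the last 1, 5 and 15 minutes
load1="unknown"
if [ -r /proc/loadavg ]; then
    read -r load1 load5 load15 rest < /proc/loadavg
    echo "Load average: $load1 (1 min), $load5 (5 min), $load15 (15 min)"
fi

# Total and available memory as reported by the kernel
if [ -r /proc/meminfo ]; then
    grep -E '^(MemTotal|MemAvailable):' /proc/meminfo || true
fi

# Disk usage of the root file system in human-readable units
df -h /
```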