Configuring your shell

What goes where, and why, in your shell’s dotfiles

2015-12-20

This is Yet Another Article about Shells. The hook: this article will not explain what a shell is, or demonstrate any cool tab-completion setups, or review any dotfile management frameworks, or show how to pack 100 pieces of information into a five-character prompt using colors and font weights.

Instead, this article will explain how to debug the shell’s environment by showing which configuration files the shell reads when it starts. This apparently simple topic turns out to be complicated.

The reason it’s complicated has to do with the historical origin of shells. We’ll talk briefly about this history, and how it resulted in different “modes” for shells. A shell always starts in one of four different modes (interactive vs non-interactive, and login vs non-login), and it reads different configuration files depending on the mode. (Completely different files, in the case of Bash.) The key to arranging your shell configuration, and to debugging your shell environment, is understanding when and why the different modes occur. Once you understand this, you’ll understand which files a shell reads for configuration when it starts, and figuring out why your environment looks a particular way is basically always a matter of carefully reading the right configuration files.

The assumed audience of this article is technical professionals doing scientific computing, data science, machine learning, or other engineering: people who probably work in a shell semi-regularly, using software and probably writing software, but who aren’t necessarily software engineers and don’t necessarily have a formal background with Unix-style systems. (In other words, this article is written for me, ten years ago.) Learning shells better, honestly, might or might not be a net time saver for you, but it is at least interesting and satisfying to understand their intended design. And in my own experience, the need to debug or refactor some ridiculous piece of shell script does come up now and then, and having a very clear grasp of shell behavior can be gratifying in discussions with your more experienced colleagues.

Origin of complexity in the design of shells

Explaining shell complexity with a little history; and a couple of suggestions

For me (and so I assume for literally every other human), shells are fairly confusing and annoying when you’re first exposed to them. Doing simple things is fine, but once you start trying to debug or understand anything related to shells or CLIs, abyssal rabbit holes open beneath you. This section explains why shells are complicated by pointing out that shells pull double duty: a shell is both an interactive interface to the computer’s operating system, and also a general-purpose programming language. This is a result of the historical context of shells. Regardless of our modern judgements of this design, recognizing it can be a big help to understanding shell behavior. And although it’s a good idea to avoid writing new programs with shells, they’re still ubiquitous and useful interactive environments.

Shells are designed to solve two big problems

A shell is an interactive interface to a computer’s operating system that focuses on text-based, single-line commands. Shells are different interfaces than the more familiar graphical interfaces; GUIs do not (necessarily) “run on top” of a shell, and the major operating systems all provide both a GUI and CLI for different kinds of tasks. Nowadays, most people interact with a shell through a graphical program that runs a shell; the closest we get to a “real” command-line experience is SSHing to some server from the comfort of our windowed desktop.

But in the past, there were no GUIs yet, and shells were the most advanced and convenient computer interface that anyone had available. I’m not an expert on the history of shells, but the ones we use today are descended directly from Version 7 Unix, and their design is probably a direct result of the Unix approach to software. The shell was a user’s primary method for doing things on a computer, so when a user first landed in a shell, it was important for the shell to be set up so that it was connected to the keyboard and the monitor, and could find all the common programs people used, and so on. But because many of the things people did on computers were repetitive and automatable, shells also included a general-purpose programming language: not as powerful as C, but much handier for wiring existing programs together and shuffling files around. In this world, someone working on a Unix system would use the shell to launch individual programs, look at the contents of files, and also interactively figure out the correct way to compose files and programs to solve larger problems. They’d do this using shell syntax, and once they’d figured out a solution, they would enshrine it in a script, using the same shell syntax. So shells are early examples of integrated development environments: they address several use cases in the process of creating software. (And in fact, compared to a modern IDE like IntelliJ, shells are models of simplicity.)

So a shell needed to support two important use cases: interactive, with a person sitting at a keyboard and typing things, and automatic, running a program (a shell script). And for both of these cases, the shell also needed two different methods for configuration: a way to specify a default configuration so that a person would get a useful set-up when they logged in to a computer, and another way that allowed the normal Unix approach to configuring a program through environment inheritance. For example, a person might start a shell interactively with a non-default value for PATH in order to test out some new piece of software. Or a person might want to non-interactively execute a shell script using a default environment, perhaps to report on the environment itself.

The key point, though, is that today shells seem somewhat complicated and awkward because they address two major problems: interaction, and general-purpose programming through program composition. Today, we tend to avoid using shells for the second purpose, leaving us to wonder why using them for the first purpose is so complex.

which $SHELL? (Use Zsh)

There are a lot of different shells. A stock installation of macOS 10.15 Catalina, for example, includes programs named sh, bash, csh, dash, ksh, tcsh, and zsh. (In case you’re counting, that means that 7 of the 36 programs in /usr/bin are shells.) And if you like spending time on this kind of stuff, it’s easy to install elvish, fish, osh, xonsh, or more, probably even from your package manager. The history of these shells is intricate and undoubtedly fascinating, but for this article, a quick summarization (more likely caricature) will do:

In the olden days, many different shells were in widespread use, and the Bourne shell (sh, released in 1977) was particularly popular. The GNU Project needed an open-source shell for their operating system and wrote the Bourne-again shell (bash, released 1989), which took ideas from the Bourne shell and all its competitors as well. Bash is, probably, the most widely used shell, and something of a de-facto standard — it’s been the default user-facing shell on most Linux distributions for decades. The Z Shell (zsh, released 1990) further extended and improved Bash, and is also quite popular. (In fact, in 2019, Zsh became the default user-facing shell on macOS with the release of Catalina.)

This article focuses on Bash and Zsh because these seem the most relevant for the target audience of technical professionals. Bash is ubiquitous across pretty much every non-Windows computer. Zsh is a bit nicer and a bit less constrained by historical baggage than Bash; it runs anywhere Bash runs, and it’s often already installed by default; and it’s backwards compatible with Bash, so if you’re comfortable in Zsh you won’t have trouble with Bash anyhow. Picking one of these shells and learning it well seems like a good investment, and although I recommend Zsh, you can’t go wrong with either.

Learning one of these shells is not exciting, or even particularly interesting, especially compared to fish (colors! an obviously better design philosophy!) or xonsh (just use Python!). But learning Bash or Zsh is much more valuable in the long term, because it puts you in a position to more deeply understand the entire operating system and software stack that you interact with all day, every day.

Takeaways

The shell startup behavior depends on the mode

A shell can be invoked in four different modes; the modes serve different purposes related to the double-duty design of shells; the shell reads different configuration files depending on the mode

There are four different modes

Bash and Zsh always start in one of four different modes. Logically, the mode is controlled by two boolean flags: Is this a login shell? Is this an interactive shell?

Two yes-or-no questions give four different modes. Why the complexity? To me it seems to boil down to the double-duty design explained in the last section: the interactive flag decides between the person-at-the-keyboard model and the execute-a-program model, and the login flag lets the shell have a useful environment by default while still allowing that environment to be overridden to configure programs (or other interactive sessions).

The mode only affects what the shell does when it starts up — this includes which configuration files the shell reads (the main topic of this article) and some other things like whether the shell tries to connect to a TTY. Since the mode doesn’t matter after the shell starts up, there isn’t really a concept of changing the mode for the currently running shell — it wouldn’t do anything even if you could change it.

Testing and setting the mode

In most cases, you don’t need to explicitly set the shell mode — it happens automatically. And it turns out that if your shell configuration is organized well, there’s rarely a reason to query the shell mode either. But let’s go ahead and take a look at those problems for the sake of completeness.

Zsh first because it’s a little easier to explain. To figure out which mode Zsh is in, check the interactive and login shell options. As a running example, assume the file tester.zsh contains:

[[ -o interactive ]] && echo 'interactive' || echo 'non-interactive'
[[ -o login ]] && echo 'login' || echo 'non-login'

From within Zsh, running source tester.zsh will report the shell’s mode (probably interactive login on macOS, interactive non-login on Linux).

Zsh provides command-line options to start a new instance in a particular mode. Running a plain zsh will start in interactive non-login mode by default. (Try running zsh, and then source tester.zsh.) Running a script or command (zsh tester.zsh or zsh -c "$(cat tester.zsh)") will start in non-interactive non-login mode and exit once the script is done. The command line flags --login and --interactive override these defaults — so zsh --interactive tester.zsh reports interactive non-login. (Although the shell will still exit once the script finishes.)

Bash is basically the same as Zsh, although to force interactive mode it only recognizes the short -i command-line option. (Zsh recognizes --interactive and -i.) For login mode, Bash accepts both --login and -l. See “Invoking Bash” in the manual for more details, but to summarize:

To test what mode Bash is in, test for the presence of i in $- and test the login_shell option using shopt (note that [[ -o login_shell ]] does not work):

[[ $- == *i* ]] && echo 'interactive' || echo 'non-interactive'
shopt -q login_shell && echo 'login' || echo 'non-login'
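Since the two shells test the mode differently, it can be handy to have one tester that works in both. Here is a minimal sketch of that idea. The function name shell_mode is made up for this example, and it assumes that ZSH_VERSION is only set when the code runs under Zsh.

```shell
# shell_mode: print the mode of the current shell, e.g. "interactive non-login".
# (shell_mode is a hypothetical helper name, not part of Bash or Zsh.)
shell_mode() {
  local interactive=non-interactive login=non-login
  if [ -n "${ZSH_VERSION:-}" ]; then
    # Zsh: both flags are ordinary shell options.
    if [[ -o interactive ]]; then interactive=interactive; fi
    if [[ -o login ]]; then login=login; fi
  else
    # Bash: "i" appears in $- for interactive shells,
    # and login_shell is a shopt option.
    case $- in *i*) interactive=interactive ;; esac
    if shopt -q login_shell; then login=login; fi
  fi
  printf '%s %s\n' "$interactive" "$login"
}

shell_mode
```

Sourcing a file with these contents reports the mode of the shell you are sitting in; running it as a script reports the mode of the new shell that executes the script.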

When the different modes show up

Now we know how to force a particular mode, and how to check what mode we’re in. But I mentioned above that under normal circumstances, you won’t need to do either of those things because the correct shell mode will be set naturally based on what you’re doing. So what situations lead to these different modes?

Interactive login
A clean-slate shell when a user first logs in to a computer. When you SSH to a remote machine, the remote starts one of these for you. Perhaps less expected, on macOS, every new tab in Terminal.app is one of these, so there are operating systems that will give you clean-slate login shells, even though you’re already logged in through the GUI. This contrasts with most Linux systems, where opening a shell in the graphical terminal will give you an interactive non-login shell. The reason for this is that on Linux (but not macOS), your user-specific login configuration has already been executed in the graphical shell, so any graphical programs you launch (including the graphical terminal emulator) automatically inherit those settings.
Interactive non-login
A shell that you’ve launched for some other purpose after logging in to a system. The new shell will inherit the environment of the shell you’ve launched it from. This mode is quite common — Python’s virtualenvs launch this type of shell; Emacs’ term mode launches this type of shell; anytime you execute bash or zsh without arguments it launches in this mode.
Non-interactive login
Although this is technically a shell mode, as far as I’m aware it rarely occurs “naturally.” (You can force it, for example zsh --login tester.zsh.) I guess that historically, there was little demand for this — a user pretty much always wanted a login shell to be interactive; if they didn’t actually need an interactive shell it was easy enough to launch a non-interactive non-login subshell after login.
Non-interactive non-login
A shell running a script or command. So zsh my-script.sh, or patterns like /bin/bash -c "$(curl -fsSL https://<not-suspicious-install-script.sh>)", use this mode.
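To see the modes without waiting for them to occur naturally, you can ask a child Bash to report on itself. This is a rough sketch: the helper name mode_of and the probe variable are invented for the example, and --noprofile and --norc are passed so that the child skips its startup files and prints nothing else.

```shell
# Commands that report the mode of the Bash instance that runs them.
probe='case $- in *i*) printf "interactive " ;; *) printf "non-interactive " ;; esac
shopt -q login_shell && echo login || echo non-login'

# mode_of [BASH_OPTIONS...]: start a child Bash with the given options and
# print its mode. (mode_of is a hypothetical helper name.)
mode_of() {
  bash --noprofile --norc "$@" -c "$probe"
}

mode_of           # non-interactive non-login
mode_of --login   # non-interactive login
```

Adding -i (mode_of -i, or mode_of --login -i) forces the interactive variants. Bash wants its multi-character options before the single-character ones, which is why -i goes last, and when no terminal is attached it may print a job-control warning on stderr first.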

The startup sequence for Zsh

So far, we’ve seen that a shell always runs in one of four different modes, and we’ve talked about the situations where those modes are used. The last point is that the mode determines which files the shell looks at when it starts.

Zsh divides its configuration into four different groups: environment, profile, rc, and login. There are two files for each group: one for system-wide configuration, and one for user-specific stuff. Every time Zsh starts, it goes through each group in order, and decides whether to read the corresponding files or not based on the setting of the interactive and login options. So summarized as a table, Zsh’s startup sequence in each of the four modes looks like this:

The Zsh startup sequence
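| File               | Interactive login | Interactive non-login | Non-interactive login | Non-interactive non-login |
|--------------------|-------------------|-----------------------|-----------------------|---------------------------|
| /etc/zshenv        | X                 | X                     | X                     | X                         |
| $ZDOTDIR/.zshenv   | X                 | X                     | X                     | X                         |
| /etc/zprofile      | X                 | -                     | X                     | -                         |
| $ZDOTDIR/.zprofile | X                 | -                     | X                     | -                         |
| /etc/zshrc         | X                 | X                     | -                     | -                         |
| $ZDOTDIR/.zshrc    | X                 | X                     | -                     | -                         |
| /etc/zlogin        | X                 | -                     | X                     | -                         |
| $ZDOTDIR/.zlogin   | X                 | -                     | X                     | -                         |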
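The table can also be written down as a few lines of shell, which I find easier to keep in my head. This is only a sketch of the documented order, not something Zsh provides: the function name zsh_startup_files is invented, it assumes the system-wide files live directly in /etc (some distributions build Zsh with a different directory), and it ignores options that disable startup files.

```shell
# zsh_startup_files LOGIN INTERACTIVE
# Print, in order, the files Zsh would try to read at startup.
# LOGIN and INTERACTIVE are each "yes" or "no".
# (Hypothetical helper; it only encodes the table, it does not run Zsh.)
zsh_startup_files() {
  local login=$1 interactive=$2
  local zdotdir=${ZDOTDIR:-$HOME}   # Zsh falls back to HOME when ZDOTDIR is unset

  # Environment group: read by every Zsh.
  echo /etc/zshenv
  echo "$zdotdir/.zshenv"
  # Profile group: login shells only.
  if [ "$login" = yes ]; then
    echo /etc/zprofile
    echo "$zdotdir/.zprofile"
  fi
  # Rc group: interactive shells only.
  if [ "$interactive" = yes ]; then
    echo /etc/zshrc
    echo "$zdotdir/.zshrc"
  fi
  # Login group: login shells only, after the rc files.
  if [ "$login" = yes ]; then
    echo /etc/zlogin
    echo "$zdotdir/.zlogin"
  fi
}

zsh_startup_files yes yes   # interactive login: all eight files
```

Calling zsh_startup_files no no prints only the two zshenv files, which is what a script sees.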

A couple of comments about Zsh’s arrangement:

The startup sequence for Bash

Bash is, again, a little more historic and complex than Zsh. A good way to visualize Bash’s startup process is with a flowchart, and maybe someday I’ll make a nice-looking flowchart to post here. In the meantime, we’ll just use this outline:

So to parse that a little bit: a Bash login shell will first read system-wide configuration from /etc/profile, and then it will try to read user-specific configuration from the first existing file it can find in three standard locations. A Bash login shell never tries to read ~/.bashrc — this is why people usually source this file manually in their ~/.bash_profile.
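The “first existing file” rule is easy to get wrong, so here is a small sketch of it. The three locations and their order (~/.bash_profile, ~/.bash_login, ~/.profile) come from the Bash manual; the function name bash_login_user_file is invented for this example, and the optional argument exists only so that it can be pointed at a directory other than your real home.

```shell
# bash_login_user_file [HOME_DIR]: print the user-specific file that a Bash
# login shell would read, following the search order in the Bash manual.
# Returns non-zero if none of the three files is readable.
# (Hypothetical helper; it only looks for the files, it does not source them.)
bash_login_user_file() {
  local home=${1:-$HOME} f
  for f in "$home/.bash_profile" "$home/.bash_login" "$home/.profile"; do
    if [ -r "$f" ]; then
      printf '%s\n' "$f"
      return 0
    fi
  done
  return 1
}

bash_login_user_file || echo 'no user-specific login file found'
```

One consequence: if you create a ~/.bash_profile on a system where you already had a ~/.profile, Bash login shells stop reading the latter.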

Here’s a couple of other comments to keep in mind:

Advice and examples

Recommendations for how to organize your configuration files; examples and case studies

So far: a list of the four invocation modes for a shell, some comments about why there are four modes, and an explanation of which files a shell reads in which mode.

Finally: some suggestions for what to put in which configuration files, which will (I hope) make obvious sense after the explanations above. And some examples of using this knowledge to figure out what’s causing the weird configuration in your environment.

What goes where

The guiding principle is to set up your config files so that launching a sub-shell behaves well. Generally, this means:

We’ll use a couple of simple but common examples to illustrate why: setting your PATH (everyone’s favorite topic) and setting aliases.

Setting your PATH (or any environment variable)

Let’s say that you’ve installed Homebrew on Linux, or TeX Live, or Cargo, or any of a thousand common pieces of software that explain that you need to adjust PATH in order to use them. The instructions tell you, for example, to put a line similar to this in your ~/.profile:

PATH=$HOME/.linuxbrew/bin:$PATH

What’s wrong with putting this in ~/.bashrc? Well, suppose we did put that line in ~/.bashrc — that means that every sub-shell we launched would prepend the Linuxbrew bin directory onto PATH again. So for example, if we logged in to a remote system (getting an interactive login shell) with that line in ~/.bashrc, then our initial environment would have a setting like

PATH=/home/jsmith/.linuxbrew/bin:/usr/bin:/bin

which looks correct. Now start Emacs, and start term mode (an interactive non-login shell), and check the environment again: we’ll now see

PATH=/home/jsmith/.linuxbrew/bin:/home/jsmith/.linuxbrew/bin:/usr/bin:/bin

That’s not correct. But although it’s inelegant, it’s not necessarily causing a problem. Suppose instead we wanted to activate a Python virtualenv. In principle, “activating” a virtualenv should be as simple as prepending the virtualenv path to PATH and launching an interactive non-login subshell, which looks like this:

PATH=~/project-env/bin:$PATH $SHELL

But if we did that with our hypothetically misconfigured Bash and checked the environment within the sub-shell, we would see

PATH=/home/jsmith/.linuxbrew/bin:/home/jsmith/project-env/bin:/home/jsmith/.linuxbrew/bin:/usr/bin:/bin

That’s wrong, and it’s a problem — the Linuxbrew version of Python would shadow the virtualenv version of Python.
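If you want to replay this scenario without touching your real dotfiles, the following sketch simulates it. It is a simulation under stated assumptions, not a real login: the function fake_bashrc stands in for a ~/.bashrc containing the problematic line, and everything runs inside a throwaway subshell with the made-up home directory from the example.

```shell
# fake_bashrc stands in for a ~/.bashrc containing the problematic line.
fake_bashrc() {
  PATH=$HOME/.linuxbrew/bin:$PATH
}

# simulate_virtualenv_path: replay the login shell and the virtualenv
# sub-shell inside a subshell, so the real PATH and HOME are left untouched.
# (Both function names are hypothetical.)
simulate_virtualenv_path() (
  HOME=/home/jsmith
  PATH=/usr/bin:/bin
  fake_bashrc                        # the login shell ends up running ~/.bashrc
  echo "login shell: $PATH"
  PATH=$HOME/project-env/bin:$PATH   # "activate" the virtualenv...
  fake_bashrc                        # ...then the sub-shell runs ~/.bashrc again
  echo "sub-shell:   $PATH"
)

simulate_virtualenv_path
```

The second line of output has the Linuxbrew directory ahead of project-env/bin, matching the broken PATH shown above.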

Of course, in real life, you use a somewhat complicated script to “activate” a Python virtualenv, and this script provides some nice features beyond just setting PATH. But (I bet) a big reason for having the activate script in the first place is because this type of shell misconfiguration is very widespread.

In general, the login configuration file (~/.zprofile) is the only place that should set environment variables. I think this is good advice even for environment variables that only affect interactive shell use, like EDITOR, LS_COLORS, and so on. For example, suppose you normally use EDITOR=vim, but you’ve decided its keybindings are insufficiently hermetic and you want to try out a different editor for a little while. It should be as simple as running EDITOR=kak zsh.

The key point is that it should be possible to override environment variables in a child process. Setting an environment variable in ~/.bashrc or ~/.zshrc breaks this property for shells.
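As a concrete sketch, a login file along these lines keeps that property intact. The contents are an assumption for illustration: the Linuxbrew path and the editor are just the running examples from this section, and your own file will differ.

```shell
# ~/.zprofile (a hypothetical example; read by login shells only)
# Environment variables are set here once per login, and every child
# process inherits them from then on.
export PATH="$HOME/.linuxbrew/bin:$PATH"
export EDITOR=vim
```

With nothing in ~/.zshrc touching these variables, a sub-shell started with EDITOR=kak zsh keeps EDITOR set to kak, and PATH gains the Linuxbrew directory exactly once.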

On the flip side of this recommendation, everything else should go in ~/.zshrc: aliases, functions, completion setup, a fancy prompt, and so on and on. These settings are only relevant for interactive shell use; if you put them in ~/.zprofile they wouldn’t be available in interactive non-login shells (like Python virtualenvs), and they might interfere with shell scripts (if you’ve aliased common commands like ls or grep). And since none of these settings can participate in environment inheritance anyhow, there’s no chance that you’ll overwrite environment configuration by putting them in ~/.zshrc.
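For comparison, here is a sketch of the interactive side. Again the contents are illustrative assumptions: the alias, the mkcd function, and the prompt are placeholders for whatever you actually use. The Zsh-only lines are guarded so the same file does no harm if Bash sources it.

```shell
# ~/.zshrc (a hypothetical example; read by interactive shells only)

# Aliases and functions are not inherited by child processes, so every
# interactive shell has to define them for itself.
alias ll='ls -l'
mkcd() { mkdir -p "$1" && cd "$1"; }

# Zsh-only setup, skipped when another shell sources this file.
if [ -n "${ZSH_VERSION:-}" ]; then
  autoload -Uz compinit && compinit   # initialize the completion system
  PROMPT='%~ %# '                     # current directory, then % (or # for root)
fi
```

Nothing here is exported, so there is nothing for a child process to inherit and nothing that an override like EDITOR=kak zsh could collide with.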

Debug your environment by inspecting system-wide configuration

Finally, here are some examples that show how a knowledge of shell modes and startup sequences can help you debug environment problems you might encounter in the wild.

Once you’ve got your user-specific shell configuration in order, understanding why PATH, umask, or whatever, is set in a particular way is basically always a matter of inspecting the contents and sequence of the system-wide shell configuration files. We might hope that operating system distributions would avoid causing problems with these, but humans are only human.

For quite a long time on Mac OS X, if you used Zsh as your login shell, you would notice strange behavior with your PATH when starting interactive non-login subshells (shells started from within Emacs, for example). The system paths like /usr/bin would always show up ahead of any paths specified in ~/.zprofile. It turns out this was because the operating system included an /etc/zshenv (which Zsh sources for every invocation) that did exactly the same thing as /etc/zprofile (the login configuration). This configuration called the path_helper program, a Mac OS X utility that’s supposed to set the default PATH for login shells. So when you opened a new Terminal tab (an interactive login shell), Zsh would read /etc/zshenv (resetting the PATH with system directories at the front), then ~/.zprofile (adding your customizations to PATH), and then ~/.zshrc, and everything looked as expected. But for any program that started another instance of Zsh from within this shell (an interactive non-login shell), Zsh would again read /etc/zshenv (resetting the PATH), but not ~/.zprofile (because it’s a non-login shell), undoing your customizations to PATH.

This arrangement (which existed from at least Mac OS X 10.5 through 10.10) was an error on the part of Apple — this configuration for PATH should have been in /etc/zprofile, and in Mac OS X 10.11 that’s where they moved it. Unfortunately in situations like this, there’s not a very clean solution as a user. If it’s your personal computer and it won’t interfere with anyone else’s work, then editing the system configuration by hand is one way to go. Another way is to completely override PATH in your ~/.zprofile: get the correct system default and copy it, and then add any customizations.

As another less problematic but confusing example, on CentOS 6 (and so Red Hat as well, I assume), there is an /etc/profile that does quite a bit of login setup for the user. There’s also an /etc/bashrc, which repeats some of the setup done in /etc/profile (for example, setting the umask), presumably to make sure that the user environment is consistent between interactive login and interactive non-login shells. But /etc/profile does not source /etc/bashrc, which means that Bash will never actually read /etc/bashrc on its own; it only takes effect if some other file sources it. I’m not clear on the intended behavior here, but understanding the shell startup sequence can help you eliminate /etc/bashrc as a potential explanation for some behavior.

In general, carefully inspecting your user-specific and the system-wide shell config files can answer most questions about why your environment is set up a certain way. Why is umask set to this strange value? Why is my locale not using UTF-8? Where is the default prompt set up? Check the configuration files.
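A small helper makes that kind of inspection quicker. This is a sketch under assumptions: the name grep_startup_files is invented, and the default list of files is just a set of common locations, so the files that actually matter on your system may differ.

```shell
# grep_startup_files PATTERN [FILE...]: show every line matching PATTERN in
# the given startup files, prefixed with file name and line number. With no
# FILE arguments it checks a list of common locations (an assumption; the
# exact set of files varies between systems).
grep_startup_files() {
  local pattern=$1 f
  shift
  if [ "$#" -eq 0 ]; then
    set -- /etc/zshenv /etc/zprofile /etc/zshrc /etc/zlogin \
           /etc/profile /etc/bashrc /etc/bash.bashrc \
           "$HOME/.zshenv" "$HOME/.zprofile" "$HOME/.zshrc" "$HOME/.zlogin" \
           "$HOME/.bash_profile" "$HOME/.bash_login" "$HOME/.profile" "$HOME/.bashrc"
  fi
  for f in "$@"; do
    if [ -r "$f" ]; then
      # Passing /dev/null as a second file makes grep print the file name.
      # A file without matches is not an error here.
      grep -n -e "$pattern" "$f" /dev/null || true
    fi
  done
}

grep_startup_files 'umask'
```

The output only tells you where a setting is mentioned; which of those files your shell actually reads still depends on the mode, as described in the startup sequences above.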

Finally

I hope that there have been some useful details in this long article, but the basic point isn’t complicated: To understand why your shell’s environment is set up in a particular way, and why it seems to change depending on where and how you start the shell, you should understand that the shell can start in one of four modes and that it reads different configuration files depending on the mode.