Chapter 2: How does zsh differ from...?

As has already been mentioned, zsh is most similar to ksh, while many of the additions are to please csh users. Here are some more detailed notes.

2.1: Differences from sh and ksh

Most features of ksh (and hence also of sh) are implemented in zsh; problems can arise because the implementation is slightly different. Note also that not all ksh's are the same either. I have based this on the 11/16/88f version of ksh; differences from ksh93 will be more substantial.

As a summary of the status:

  1. because of all the options it is not safe to assume a general zsh run by a user will behave as if sh or ksh compatible;
  2. invoking zsh as sh or ksh (or if either is a symbolic link to zsh) sets appropriate options and improves compatibility (from within zsh itself, calling ARGV0=sh zsh will also work);
  3. from version 3.0 onward the degree of compatibility with sh under these circumstances is very high: zsh can now be used with GNU configure or perl's Configure, for example;
  4. the degree of compatibility with ksh is also high, but a few things are missing: for example the more sophisticated pattern-matching expressions are different for versions before 3.1.3 --- see the detailed list below;
  5. also from 3.0, the command `emulate' is available: `emulate ksh' and `emulate sh' set various options as well as changing the effect of single-letter option flags as if the shell had been invoked with the appropriate name. Including the command `emulate sh; setopt localoptions' in a shell function will turn on sh emulation for that function only. In version 4 (and in 3.0.6 through 8), this can be abbreviated as `emulate -L sh';
  6. in versions after 5.9, the namespace syntax and named references (ksh nameref) are available, but differ in some details from the ksh93+ semantics;
  7. also after 5.9, non-forking command substitutions are available. These are described by ksh as a brace group preceded by a dollar sign (${ list;}), but zsh has both some added features adopted from mksh and some limitations; see 2.11.

The classic difference is word splitting, discussed in question 3.1; this catches out very many beginning zsh users. As explained there, this is actually a bug in every other shell. The answer is to set SH_WORD_SPLIT for backward compatibility. The next most classic difference is that unmatched glob patterns cause the command to abort; set NO_NOMATCH for those.
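The two differences just described can be seen side by side in a small sketch. The function name show_args and the directory /no/such/dir are made up for the illustration, and the test on ZSH_VERSION is there only so that the same text can also be read by another Bourne-like shell; in zsh the setopt line is the part that matters.

```shell
# show_args is a hypothetical helper, not part of zsh.
show_args() {
  if [ -n "${ZSH_VERSION-}" ]; then
    # Without these two options zsh would pass $1 on as a single word
    # and would abort the command because the pattern matches nothing.
    # LOCAL_OPTIONS restores the caller's settings on return.
    setopt localoptions shwordsplit nonomatch
  fi
  for arg in $1 /no/such/dir/*.txt; do
    echo "<$arg>"
  done
}

show_args "a b"
# <a>
# <b>
# </no/such/dir/*.txt>
```

Replacing the setopt line with `emulate -L sh' should have the same effect here, at the cost of switching on every sh-compatibility option for the duration of the function.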

Here is a list of various options which will increase ksh compatibility, though maybe decrease zsh's abilities: see the manual entries for GLOB_SUBST, IGNORE_BRACES (though brace expansion occurs in some versions of ksh), KSH_ARRAYS, KSH_GLOB, KSH_OPTION_PRINT, LOCAL_OPTIONS, NO_BAD_PATTERN, NO_BANG_HIST, NO_EQUALS, NO_HUP, NO_NOMATCH, NO_RCS, NO_SHORT_LOOPS, PROMPT_SUBST, RM_STAR_SILENT, POSIX_ALIASES, POSIX_BUILTINS, POSIX_IDENTIFIERS, SH_FILE_EXPANSION, SH_GLOB, SH_OPTION_LETTERS, SH_WORD_SPLIT (see question 3.1) and SINGLE_LINE_ZLE. Note that you can also disable any built-in commands which get in your way. If invoked as `ksh', the shell will try to set suitable options.

Here are some differences from ksh which might prove significant for ksh programmers, some of which may be interpreted as bugs; there must be more. Note that this list is deliberately rather full and that most of the items are fairly minor. Those marked `*' perform in a ksh-like manner if the shell is invoked with the name `ksh', or if `emulate ksh' is in effect. Capitalised words with underlines refer to shell options.

2.2: Similarities with csh

Although certain features aim to ease the withdrawal symptoms of csh (ab)users, the syntax is in general rather different and you should certainly not try to run scripts without modification. The c2z script is provided with the source (in Misc/c2z) to help convert .cshrc and .login files; see also the next question concerning aliases, particularly those with arguments.

Csh-compatibility additions include:

2.3: Why do my csh aliases not work? (Plus other alias pitfalls.)

First of all, check you are using the syntax


    alias newcmd='list of commands'
  
and not

    alias newcmd 'list of commands'
  
which won't work. (It tells you if `newcmd' and `list of commands' are already defined as aliases.)

Otherwise, your aliases probably contain references to the command line of the form \!*, etc. Zsh does not support this syntax, because its shell functions solve the same problem in a way that is more consistent with other forms of argument handling. For example, the csh alias


    alias cd 'cd \!*; echo $cwd'
  
can be replaced by the zsh function,

    cd() { builtin cd "$@"; echo $PWD; }
  
(the `builtin' tells zsh to use its own `cd', avoiding an infinite loop) or, perhaps better,

    cd() { builtin cd "$@"; print -D $PWD; }
  
(which converts your home directory to a ~). In fact, this problem is better solved by defining the special function chpwd() (see the manual). Note also that the ; at the end of the function is optional in zsh, but not in ksh or sh (for sh's where it exists).
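A minimal sketch of the chpwd approach: zsh calls a function of that name, if one is defined, every time the current directory changes, so cd itself needs no wrapper. The message text is arbitrary, and plain echo is used here rather than print -D only to keep the sketch short.

```shell
# Run automatically by zsh after each change of directory.
chpwd() {
  echo "now in $PWD"
}
```

With this in place, any cd, pushd or popd in zsh is followed by the message, and the builtin cd is left untouched.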

Here is Bart Schaefer's guide to converting csh aliases for zsh.

  1. If the csh alias references "parameters" (\!:1, \!* etc.), then in zsh you need a function (referencing $1, $* etc.). In recent versions of zsh this can be done by defining an anonymous function within the alias. Otherwise, a simple zsh alias suffices.

  2. If you use a zsh function, you need to refer _at_least_ to $* in the body (inside the { }). Parameters don't magically appear inside the { } the way they get appended to an alias.

  3. If the csh alias references its own name (alias rm "rm -i"), then in a zsh function you need the "command" or "builtin" keyword (function rm() { command rm -i "$@" }), but in a zsh alias you don't (alias rm="rm -i").

  4. If you have aliases that refer to each other (alias ls "ls -C"; alias lf "ls -F" ==> lf == ls -C -F) then you must either:

    Those first four are all you really need, but here are four more for heavy csh alias junkies:

  5. Mapping from csh alias "parameter referencing" into zsh function (assuming SH_WORD_SPLIT and KSH_ARRAYS are NOT set in zsh):
    
          csh             zsh
         =====         ==========
         \!*           $*              (or $argv)
         \!^           $1              (or $argv[1])
         \!:1          $1
         \!:2          $2              (or $argv[2], etc.)
         \!$           $*[$#]          (or $argv[$#], or $*[-1])
         \!:1-4        $*[1,4]
         \!:1-         $*[1,$#-1]      (or $*[1,-2])
         \!^-          $*[1,$#-1]
         \!*:q         "$@"
         \!*:x         $=*             ($*:x doesn't work (yet))
            
    

  6. Remember that it is NOT a syntax error in a zsh function to refer to a position ($1, $2, etc.) greater than the number of parameters. (E.g., in a csh alias, a reference to \!:5 will cause an error if 4 or fewer arguments are given; in a zsh function, $5 is the empty string if there are 4 or fewer parameters. Force an error in this example by using ${5?}.)

  7. To begin a zsh alias with a - (dash, hyphen) character, use alias --:
    
                 csh                            zsh
            ===============             ==================
            alias - "fg %-"             alias -- -="fg %-"
          
    

  8. Stay away from alias -g in zsh until you REALLY know what you're doing.
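As an illustration of rules 1, 2 and 6, here is how a csh alias might be carried over. The alias `pair' and its messages are invented for this sketch; they come from no real configuration.

```shell
# csh original (hypothetical):  alias pair 'echo \!:1 goes with \!:2'
pair() {
  # ${1?...} and ${2?...} restore the csh behaviour of failing when an
  # argument is missing; plain $1 and $2 would silently expand to
  # empty strings instead.
  echo "${1?first argument missing} goes with ${2?second argument missing}"
}

pair salt pepper    # prints: salt goes with pepper
```

Calling `pair salt' on its own reports the missing second argument rather than printing a half-empty line.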

There is one other serious problem with aliases: consider


    alias l='/bin/ls -F'
    l() { /bin/ls -la "$@" | more }
  
l in the function definition is in command position and is expanded as an alias, defining /bin/ls and -F as functions which call /bin/ls, which gets a bit recursive. Recent versions of zsh treat this as an error, but older versions silently create the functions.

One workaround for this is to use the "function" keyword instead:


    alias l='/bin/ls -F'
    function l { /bin/ls -la "$@" | more }
  
The l after function is not expanded. Note you don't need the () in this case, although it's harmless.

You need to be careful if you are defining a function with multiple names; most people don't need to do this, so it's an unusual problem, but in case you do you should be aware that in versions of the shell before 5.1 names after the first were expanded:


    function a b c { ... }
  
Here, b and c, but not a, have aliases expanded. This oddity was fixed in version 5.1.

The rest of this item assumes you use the (more common, but equivalent) () definitions.

Bart Schaefer's rule is: Define first those aliases you expect to use in the body of a function, but define the function first if the alias has the same name as the function.

If you are aware of the problem, you can always escape part or all of the name of the function:


     'l'() { /bin/ls -la "$@" | more }
  
Adding the quotes has no effect on the function definition, but suppresses alias expansion for the function name. Hence this is guaranteed to be safe---unless you are in the habit of defining aliases for expressions such as 'l', which is valid, but probably confusing.

2.4: Similarities with tcsh

(The sections on csh apply too, of course.) Certain features have been borrowed from tcsh, including $watch, run-help, $savehist, periodic commands etc., extended prompts, sched and which built-ins. Programmable completion was inspired by, but is entirely different to, tcsh's complete. (There is a perl script called lete2ctl in the Misc directory of the source distribution to convert complete to compctl statements.) This list is not definitive: some features have gone in the other direction.

If you're missing the editor function run-fg-editor, try something with bindkey -s (which binds a string to a keystroke), e.g.


    bindkey -s '^z' '\eqfg %$EDITOR:t\n'
  
which pushes the current line onto the stack and tries to bring a job with the basename of your editor into the foreground. bindkey -s allows limitless possibilities along these lines. You can execute any command in the middle of editing a line in the same way, corresponding to tcsh's -c option:

    bindkey -s '^p' '\eqpwd\n'
  
In both these examples, the \eq saves the current input line to be restored after the command runs; a better effect with multiline buffers is achieved if you also have

    bindkey '\eq' push-input
  
to save the entire buffer. In version 4 and recent versions of zsh 3.1, you have the following more sophisticated option,

    run-fg-editor() {
      zle push-input
      BUFFER="fg %$EDITOR:t"
      zle accept-line
    }
    zle -N run-fg-editor
  
and can now bind run-fg-editor just like any other editor function.

2.5: Similarities with bash

The Bourne-Again Shell, bash, is another enhanced Bourne-like shell; the most obvious difference from zsh is that it does not attempt to emulate the Korn shell. Since both shells are under active development it is probably not sensible to be too specific here. Broadly, bash has paid more attention to standards compliance (i.e. POSIX) for longer, and has so far avoided the more abstruse interactive features (programmable completion, etc.) that zsh has.

In recent years there has been a certain amount of crossover in the extensions, however. Zsh (as of 3.1.6) has bash's `${var/old/new}' feature for replacing the text old with the text new in the parameter $var. Note one difference here: while both shells implement the syntax `${var/#old/new}' and `${var/%old/new}' for anchoring the match of old to the start or end of the parameter text, respectively, in zsh you can't put the `#' or `%' inside a parameter: in other words `${var/$old/new}' where old begins with a `#' treats that as an ordinary character in zsh, unlike bash. To do this sort of thing in zsh you can use (from 3.1.7) the new syntax for anchors in any pattern, `(#s)' to match the start of a string, and `(#e)' to match the end. These require the option EXTENDED_GLOB to be set.
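For reference, the forms that behave the same way in both shells look like this. The parameter name and its value are arbitrary, and the anchors are written literally in the pattern, which is the case where zsh and bash agree.

```shell
file="report.txt.txt"

echo "${file/txt/md}"            # first match anywhere: report.md.txt
echo "${file/%txt/md}"           # anchored at the end:  report.txt.md
echo "${file/#report/summary}"   # anchored at start:    summary.txt.txt
```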

2.6: Shouldn't zsh be more/less like ksh/(t)csh?

People often ask why zsh has all these `unnecessary' csh-like features, or alternatively why zsh doesn't understand more csh syntax. This is far from a definitive answer and the debate will no doubt continue.

Paul's object in writing zsh was to produce a ksh-like shell which would have features familiar to csh users. For a long time, csh was the preferred interactive shell and there is a strong resistance to changing to something unfamiliar, hence the additional syntax and CSH_JUNKIE options. This argument still holds. On the other hand, the arguments for having what is close to a plug-in replacement for ksh are, if anything, even more powerful: the deficiencies of csh as a programming language are well known (search for csh-whynot if you are in any doubt) and zsh is able to run many standard scripts such as /etc/rc.

Of course, this makes zsh rather large and feature-ridden so that it seems to appeal mainly to hackers. The only answer, perhaps not entirely satisfactory, is that you have to ignore the bits you don't want. The introduction of loadable modules in version 3.1 should help.

2.7: What is zsh's support for Unicode/UTF-8?

`Unicode', or UCS for Universal Character Set, is the modern way of specifying character sets. It replaces a large number of ad hoc ways of supporting character sets beyond ASCII. `UTF-8' is an encoding of Unicode that is particularly natural on Unix-like systems.

The production branch of zsh, 4.2, has very limited support: the built-in printf command supports "\u" and "\U" escapes to output arbitrary Unicode characters; ZLE (the Zsh Line Editor) has no concept of character encodings, and is confused by multi-octet encodings.

However, the 4.3 branch has much better support, and furthermore this is now fairly stable. (Only a few minor areas need fixing before this becomes a production release.) This is discussed more fully below, see `Multibyte input and output'.

2.8: Why does my bash script report an error when I run it under zsh?

tl;dr: bash is not the reference implementation of zsh, and zsh is not a bug-for-bug compatible reimplementation of bash.

bash and zsh are different programming languages. They are not interchangeable; programs written for either of these languages will, in general, not run under the other. (The situation is similar with many other pairs of closely-related languages, such as Python 2 and Python 3; C and C++; and even C89 and C11.)

When bash and zsh behave differently on the same input, whether zsh's behaviour is a bug does not depend on what bash does on the same input; rather, it depends on what zsh's user manual specifies. (By way of comparison, it's not a bug in Emacs that :q! doesn't cause it to exit.)

That being said, the bash and zsh languages do have a common subset, and it is feasible to write non-trivial pieces of code that would run under either of them, if one is sufficiently familiar with both of them. However, a difference between bash's behaviour and zsh's does not imply that zsh has a bug. The difference might be a bug in zsh, a bug in bash, or a bug in neither shell (see 3.1 for an example).

The recommended way to deal with these differences depends on what kind of piece of code is in question: a script or a plugin.

For scripts — external commands that are located in $PATH, or located elsewhere and are executed by giving their path explicitly (as in ls, /etc/rc.d/sshd, and ./configure) — the answer is simple:

Don't run bash scripts under zsh. If the scripts were written for bash, run them in bash. There's absolutely no problem with having #!/usr/bin/env bash scripts even if zsh is your shell for interactive sessions.

In fact, if you've recently changed to zsh, we recommend that you keep your scripts as #!/usr/bin/env bash, at least for a while: this would make the change more gradual and flatten your learning curve. Once you're used to zsh, you can decide for each script whether to port it to zsh or keep it as-is.

For plugins — pieces of code executed within the shell itself, loaded via the ., source, or autoload builtins, added to .zshrc, or pasted interactively at the shell prompt — one may consider it worthwhile to invest the effort to make them runnable under either shell. However, as mentioned above, doing so requires one to be familiar with both shells, and either steer clear of their differences or handle them explicitly with conditional code (such as if test -n "$ZSH_VERSION").
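A sketch of what such conditional code can look like, using array indexing as the difference to be handled. The array and function names are invented for the example, and it assumes KSH_ARRAYS is not set in zsh.

```shell
# Hypothetical plugin fragment, meant to be sourced from either shell.
colors=(red green blue)

first_color() {
  if test -n "${ZSH_VERSION-}"; then
    # zsh arrays start at 1 (unless KSH_ARRAYS is set).
    echo "${colors[1]}"
  else
    # bash arrays start at 0.
    echo "${colors[0]}"
  fi
}

first_color    # prints: red
```

Every difference of this kind needs its own branch or a construct that both shells treat identically, which is why porting has to be done line by line.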

In summary, if you'd like to run a bash script or plugin under zsh, you must port the script or plugin properly, reviewing it line by line for differences between the two languages and adjusting it accordingly, just like you would when translating a book from American English to British English.

2.9: What is a namespace anyway?

As of this writing, namespaces in zsh are little more than syntactic sugar for grouping related parameters. For example, as of the update to PCRE2, the parameters ${.pcre.match} and ${.pcre.subject} are used for regular expression substring capture. The .pcre. part is the namespace, and when you refer to a parameter that has one, you must use the ${...} braces around the name. Assignments are not special; they have the form .nspace.var=value as usual.

Parameters using a namespace have the additional property that, like file names beginning with a dot for globbing, they're hidden from typeset output unless explicitly asked for.

Namespaces appear in releases after but not including zsh 5.9.

2.10: What about named references?

Named references are a bit like aliases, but for parameters. A named reference would typically be usable in the same cases as ${(P)name} (see 3.22). The value of a named reference is the name of another parameter, and when you expand or assign to the named reference, that other parameter is expanded or assigned instead. Thus a trivial example is


    % target=RING
    % typeset -n ref=target
    % print $ref
    RING
    % ref=BULLSEYE
    % print $target
    BULLSEYE
  

One exception to this behavior is when a named reference is used as the loop variable in a for loop. In that case the reference is unset and reset on each iteration of the loop.


    % target=RING bullseye=SPOT other=MISS
    % typeset -n ref=other
    % for ref in target bullseye; do
    > print $ref
    > ref=HIT:$ref
    > done
    RING
    SPOT
    % print $other
    MISS
    % print $ref
    HIT:SPOT
  

Dynamic scoping applies to named references, so for example a named reference declared in global scope may be used in function scopes. In ksh, local parameters have static scope, so named references in zsh may have side-effects that do not occur in ksh. To limit those effects, zmodload zsh/param/private and declare all named references private.

Named references may be used in zsh versions later than 5.9.

2.11: What is zsh's support for non-forking command substitution?

This is for cases where you'd write $(command) but you don't want the overhead or other issues associated with forking a subshell. There are 3 variations:

In all three forms code behaves similarly to an anonymous function invoked like:


    () { code } "$@"
  
Thus, all parameters declared inside the substitution are local by default, and positional parameters $1, $2, etc. are those of the calling context.

The most significant limitation is that braces ({ and }) within the substitutions must either be in balanced pairs, or must be quoted, that is, included in a quoted string or prefixed by backslash. These substitutions first become usable after zsh 5.9.

2.12: Comparisons of forking and non-forking command substitution

${ command } and variants may change the caller's options by using setopt and may modify the caller's local parameters, including the positional parameters $1, $2, etc., via both assignments and set -- pos1 pos2 etc. Nothing that happens within $(command) affects the caller.
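The isolation of the forking form is easy to check. This sketch covers only $(command), which behaves the same in any Bourne-like shell; the variable names are arbitrary.

```shell
counter=1
# The assignment happens in a subshell and is lost when it exits.
inner=$(counter=2; echo "$counter")

echo "inner=$inner outer=$counter"    # prints: inner=2 outer=1
```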

When not enclosed in double quotes, the expansion of $(command) is split on IFS into an array of words. In contrast, and unlike both bash and ksh, unquoted non-forking substitutions behave like parameter expansions with respect to the SH_WORD_SPLIT option.
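The splitting of the forking form can be seen by counting arguments. The helper nargs is made up for this sketch, and only $(command) is shown because the non-forking forms need a zsh later than 5.9.

```shell
# nargs is a hypothetical helper that reports how many words it was given.
nargs() { echo $#; }

nargs $(echo one two three)      # unquoted: split on IFS, prints 3
nargs "$(echo one two three)"    # quoted: a single word, prints 1
```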

Both ${|...} and ${{var} ...} retain any trailing newlines, except as handled by the SH_WORD_SPLIT option, consistent with ${|...} from mksh. ${ command } removes a single final newline, but "${ command }" retains it. This differs from bash and ksh, so in emulation modes, newlines are stripped even from quoted command output. In all cases, $(command) removes all trailing newlines from the output of command.
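The stripping done by $(command) shows up in the length of the result. The second half of this sketch uses a common idiom that is not specific to zsh: append a sentinel character so that the newlines are no longer trailing, then remove it afterwards.

```shell
out=$(printf 'text\n\n\n')
echo "${#out}"     # prints 4: all three trailing newlines are gone

# Sentinel idiom: the final "x" protects the newlines from stripping.
kept=$(printf 'text\n\n\n'; echo x)
kept=${kept%x}
echo "${#kept}"    # prints 7: "text" plus three newlines
```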

When command is not a builtin, ${ command } does fork, and typically forks the same number of times as $(command), because in the latter case zsh usually optimizes the final fork into an exec.

Redirecting input from files has subtle differences:

${|IFS= read -rd '' <file} is therefore the best solution for files that do not contain nul bytes, because it copies the file directly into the local REPLY and then substitutes that. For very large files, refer to Functions/Misc/zslurp.