Quoting In CMake

Let’s be honest, CMake’s syntax is not one of its most popular characteristics. This article isn’t going to try to convince you otherwise. Rather, it focuses on how quoting in CMake works. Its goal is to help you avoid common problems and write clearer, more robust CMake code. You’ll find no long-winded history lessons here, just the essential info you need.

Quoting Simple Command Arguments

Unlike many scripting languages, CMake doesn’t always require strings to be quoted. In fact, much of the time, you can pass strings to commands without quoting. Keyword arguments are a common example. We might think of those as being special, but from CMake’s point of view, they too are just strings. It is up to the command’s implementation to give those strings meaning as keywords. To demonstrate, the following are all equivalent:

add_library(MyThings STATIC someFile.cpp)
add_library(MyThings STATIC "someFile.cpp")
add_library("MyThings" "STATIC" "someFile.cpp")
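Because keywords are just strings, they can also come from a variable. The following is a minimal sketch of that idea. The variable name libType is hypothetical and used only for illustration:

```cmake
# libType is a hypothetical variable, not something CMake defines
set(libType STATIC)

# After variable evaluation, add_library() receives the plain string
# STATIC and interprets it as the library type keyword
add_library(MyThings ${libType} someFile.cpp)
```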

Developers often feel the need to treat file names and paths specially, adding quotes when none are required. However, you only need quotes if the file name contains whitespace or semicolons (and file names should never contain semicolons, since they would almost certainly lead to other problems). Directory separators do not require quoting.

add_library(MyThings STATIC someFile.cpp "I need quotes.cpp")

add_subdirectory(path/to/somewhere)

Some languages support quoting that starts part way through a value, but CMake’s handling of such arguments is different to most. If a value does not start with double-quotes, any double-quotes after the first character become part of the value. The quotes do still prevent spaces from acting as argument separators. CMake’s documentation warns that this is legacy behavior. You should avoid using unescaped quotes anywhere other than the start and end of a value.

# Relies on legacy behavior, don't do this
set(args something="unexpected perhaps" anotherArg)

foreach(arg IN LISTS args)
    message("arg: ${arg}")
endforeach()

The output from the above would be:

arg: something="unexpected perhaps"
arg: anotherArg

You can express a value with embedded quotes more clearly by quoting the whole string and escaping the embedded quotes with a backslash:

set(args "something=\"I contain spaces\"" anotherArg)

Quoting With Bracket Syntax

With CMake 3.0 or later, you can use Lua-style bracket syntax instead of surrounding the value with double-quotes. CMake interprets values quoted with bracket syntax literally, so you do not need to escape quotes or backslashes. This makes them very handy for defining regular expressions. CMake will also not substitute variables inside bracket-quoted values. This can be useful when defining strings with CMake code that you don’t want evaluated immediately.

  • Bracket syntax uses a pair of square brackets with zero or more equal signs between them.
  • You must match the same number of equal signs at the opening and closing brackets.
  • As a special case, if the opening brackets are immediately followed by a newline, CMake discards that newline.

set(args [[something="I contain spaces"]] anotherArg)
foreach(arg IN LISTS args)
    message("arg: ${arg}")
endforeach()

set(progress "Processing (5 of 9)")
string(REGEX REPLACE
    [[Processing \(([0-9]+) of ([0-9]+)\)]]
    [[\1 / \2]]
    result
    "${progress}"
)
message("result = ${result}")

set(oneLiner [[
I am a single line]])
message("With evaluation of ${oneLiner}")

message([[No evaluation of ${oneLiner}]])

file(WRITE ${CMAKE_CURRENT_BINARY_DIR}/shellScript [=[
#!/bin/bash
[[ -n "${USER}" ]] && echo "Have USER"
]=])

The output from the above would be:

arg: something="I contain spaces"
arg: anotherArg
result = 5 / 9
With evaluation of I am a single line
No evaluation of ${oneLiner}

Command Arguments From Substituted Content

A very common scenario is generating a file or some CMake code and substituting variable values while doing so. The configure_file(), file(GENERATE), string(CONFIGURE) and cmake_language(EVAL) commands are used heavily for this. Consider the following contrived example:

adder.cmake.in

add_subdirectory(@subdir@)

CMakeLists.txt

# In practice, this value may be set elsewhere, far from
# this part of the code (e.g. as a command argument)
set(subdir "path with spaces")

# Write a file with the subdir variable substituted
configure_file(adder.cmake.in includeMe.cmake @ONLY)

# Execute the generated file
include(${CMAKE_CURRENT_BINARY_DIR}/includeMe.cmake)

The call to configure_file() will create a file called includeMe.cmake with the following contents:

add_subdirectory(path with spaces)

When written out like that, it is clear that the path should have been quoted. But the developer often isn’t thinking about paths potentially having spaces. You can write the input file more robustly like this:

# More robust, assuming quotes are not allowed in the value being substituted
add_subdirectory("@subdir@")

Be on the lookout for command arguments provided by substituted strings. It isn’t just paths that may need quoting. The same warning applies to any command argument that could potentially have spaces in the substituted value. if() expressions involving string comparisons are especially relevant.
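As a rough sketch of the if() case, the following uses string(CONFIGURE) and cmake_language(EVAL) on a small template. The variable names buildKind, template, code and isDebug are hypothetical, and cmake_language() assumes CMake 3.18 or later:

```cmake
cmake_minimum_required(VERSION 3.18)

# Hypothetical value that happens to contain spaces
set(buildKind "Release With Symbols")

# Bracket syntax keeps the template literal. The quotes around
# @buildKind@ ensure the substituted value stays a single argument.
set(template [[
if("@buildKind@" STREQUAL "Debug")
    set(isDebug YES)
else()
    set(isDebug NO)
endif()
]])

# Substitute @buildKind@, then execute the resulting code
string(CONFIGURE "${template}" code @ONLY)
cmake_language(EVAL CODE "${code}")

message("isDebug = ${isDebug}")   # isDebug = NO
```

Without the quotes around @buildKind@, the generated line would read if(Release With Symbols STREQUAL "Debug"), which is not a valid if() expression.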

Passing Lists As Command Arguments

The way CMake parses command arguments is different to most scripting languages. In CMake, arguments are separated by whitespace (like many scripting languages), but they can also be separated by semicolons. The following demonstrates this behavior:

function(count_args)
    message("Number of arguments: ${ARGC}")
endfunction()

count_args(one two three)
count_args(one;two;three)

This will lead to the following output:

Number of arguments: 3
Number of arguments: 3

Things get more interesting when arguments require variable evaluation. In the following, note that a list in CMake is just a string with list items separated by semicolons.

set(args_as_string "one two three")
set(args_as_list    one two three)   # CMake stores this as one;two;three

count_args(${args_as_string})
count_args(${args_as_list})

Now the output will be:

Number of arguments: 1
Number of arguments: 3

The space-separated string appears as a single argument when passed as an evaluated variable. But when specified directly, it appears as three separate arguments. In contrast, the semicolon-separated string appears as three separate arguments, even when specified as an evaluated variable. This highlights a characteristic of CMake’s parsing behavior:

Whitespace argument separators are found before variable evaluation.
Semicolon argument separators are found after variable evaluation.

If you need to pass a list as a single argument, you must surround the list with quotes. This is the case whether passing the list directly or as an evaluated variable.

set(args_as_list one two three)   # Same as one;two;three

count_args("one;two;three")
count_args("${args_as_list}")

The output from the above would be:

Number of arguments: 1
Number of arguments: 1
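
The same effect is easy to observe with message(), which concatenates all the arguments it receives. This is only an illustrative sketch, and the variable name items is hypothetical:

```cmake
set(items a b c)   # CMake stores this as a;b;c

# Unquoted: split into three arguments, which message() joins together
message(${items})     # abc

# Quoted: a single argument, so the semicolons are preserved
message("${items}")   # a;b;c
```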

For a more detailed discussion of passing arguments using variables, see Forwarding Command Arguments In CMake.

Generator Expressions

In the CMake forum, you will sometimes see posts asking why CMake doesn’t seem to recognize a particular generator expression. These typically arise from a misunderstanding of CMake’s quoting rules and argument processing. A generator expression has the form $<...> and can be used in a variety of places. Common examples include custom commands or as values for some target properties. The following example defines a build target PrintHash which prints the MD5 hash of an executable’s binary file:

add_executable(MyApp ...)

add_custom_target(PrintHash
    COMMAND ${CMAKE_COMMAND} -E md5sum $<TARGET_FILE:MyApp>
)

The above uses a very simple generator expression which contains no spaces. When the generator expression does contain a space, newline or other whitespace, you must exercise more care. Consider the scenario where a custom target should log a configuration-dependent message:

# WRONG: Spaces not handled correctly
add_custom_target(PrintMessage
    COMMAND ${CMAKE_COMMAND} -E echo
                $<IF:$<CONFIG:Debug>,This is debug,Must be something else>
)

CMake parses arguments by whitespace before it tries to identify generator expressions. The above code therefore doesn’t do what you might expect. The generator expression contains unescaped spaces, which split the expression across multiple arguments. As a result, CMake doesn’t see the generator expression. Instead, it sees the following ordinary arguments (each argument on its own line for clarity):

# WRONG: Spaces not handled correctly
add_custom_target(
    PrintMessage
    COMMAND
    ${CMAKE_COMMAND}
    -E
    echo
    $<IF:$<CONFIG:Debug>,This
    is
    debug,Must
    be
    something
    else>
)

To prevent CMake from treating the spaces in the expression as argument separators, either escape the spaces or surround the whole expression with quotes. The latter is usually simpler:

add_custom_target(PrintMessage
    COMMAND ${CMAKE_COMMAND} -E echo
                "$<IF:$<CONFIG:Debug>,This is debug,Must be something else>"
)
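
Escaping the spaces also works, although it tends to be harder to read. As a small sketch, the following repeats the count_args() helper from earlier in this article to show that a backslash-escaped space no longer acts as an argument separator:

```cmake
function(count_args)
    message("Number of arguments: ${ARGC}")
endfunction()

# Unescaped spaces separate arguments
count_args(This is debug other)       # Number of arguments: 4

# Escaped spaces become part of the argument
count_args(This\ is\ debug other)     # Number of arguments: 2
```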

For more complex generator expressions, building the expression up in variables is a good strategy. This makes the code easier to read and debug.

set(is_debug $<CONFIG:Debug>)
set(msg_debug "This is debug")
set(msg_other "Must be something else")

add_custom_target(PrintMessage
    COMMAND ${CMAKE_COMMAND} -E echo
                $<IF:${is_debug},${msg_debug},${msg_other}>
)

No quotes are needed around the final $<IF:...> generator expression in the above add_custom_target() call. This is because CMake looks for whitespace to separate arguments before it evaluates any variables. In general though, you should quote such expressions anyway. If the variables contain semicolons and the expression is not quoted, argument splitting would occur.
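
The following sketch illustrates the semicolon case, again using the count_args() helper from earlier in this article. The value given to msg_debug is contrived purely to show the splitting:

```cmake
function(count_args)
    message("Number of arguments: ${ARGC}")
endfunction()

set(is_debug $<CONFIG:Debug>)
set(msg_debug "Debug;with;semicolons")   # contrived value for illustration
set(msg_other "Must be something else")

# Unquoted: semicolons in the expanded value split the expression
count_args($<IF:${is_debug},${msg_debug},${msg_other}>)     # Number of arguments: 3

# Quoted: the whole expression remains one argument
count_args("$<IF:${is_debug},${msg_debug},${msg_other}>")   # Number of arguments: 1
```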

Special Cases For The if() Command

Sadly, one of CMake’s most used commands is also the one with the most complex rules. The if() command supports a variety of forms. Of particular interest for this article are those that compare two things, such as numbers, strings or versions:

  • if(thing1 EQUAL thing2)
  • if(thing1 STREQUAL thing2)
  • if(thing1 VERSION_EQUAL thing2)

In each of the above cases, thing1 and thing2 could be the names of variables, or they could be plain strings to be used directly. If CMake sees that a variable called thing1 exists, then it will use the value of that variable, as though you’d written ${thing1} instead. But if there is no such variable by that name, then CMake treats it as a plain string and uses thing1 literally in the comparison. The same goes for thing2.

Other forms of the if() command perform similar variable-or-string analysis as part of evaluating the expression (thing could be a variable name or a string in each of the following cases):

  • if(thing)
  • if(thing IN_LIST variableName)
  • if(thing MATCHES string)

You can immediately see that this variable-or-string behavior could lead to problems. If you are expecting a variable to exist but no such variable is defined, the variable name is used instead. On the other hand, you might use a string but not realize that a variable with the same name exists. Such variables can easily come from somewhere higher up in the project. In this situation, you get the variable’s value instead of the string you expected.

In most cases, you can prevent CMake from treating an argument as a variable name by quoting it. The following example demonstrates the principle:

set(hello WORLD)

# Compare a variable and a string. Only safe when we are certain
# a variable called "hello" exists.
if(hello STREQUAL "WORLD")
    message("Variable was evaluated, as expected")
endif()

# Compare two strings. Always safe, with one exception discussed below.
if("hello" STREQUAL "WORLD")
    message("Should not get here")
endif()

In the first if() expression, the unquoted hello is seen as the name of a variable, so its value is used (which is "WORLD"). In the second if() expression, hello is quoted, so CMake can’t treat it as a variable name and instead uses "hello" in the comparison.

The above describes the behavior of CMake 3.1 or later. More accurately, it describes the behavior when policy CMP0054 is set to NEW. The OLD behavior with CMake 3.0 and earlier is much more insidious. With the OLD behavior, adding quotes doesn’t prevent CMake from looking for a variable name matching the string. The OLD behavior therefore has much greater potential to give unexpected results. The policy documentation includes a step-by-step example for this OLD behavior.
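
If a project cannot rely on its cmake_minimum_required() call to enable the NEW behavior, one option is to set the policy explicitly. This is a minimal sketch of that approach, not the only way to manage policies:

```cmake
# Only set the policy if the running CMake version knows about it
if(POLICY CMP0054)
    cmake_policy(SET CMP0054 NEW)
endif()
```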

One particular pattern deserves special mention, because it occurs so frequently. Consider the following example:

cmake_minimum_required(VERSION 3.1)

# ... other code here

if(${MY_FEATURE})
  message("Yes")
else()
  message("No")
endif()

CMake parses the if() line by first expanding ${MY_FEATURE} before it does any processing of the expression itself. CMake’s expression evaluation then continues from there, but that includes checking if the expression is a variable name. If you are unlucky enough that the value in MY_FEATURE is itself the name of another variable, you get a second variable evaluation you probably weren’t expecting. The following demonstrates the effects (you can change the value of oops to confirm it is being used in the expression).

cmake_minimum_required(VERSION 3.1)

set(oops 1)
set(MY_FEATURE oops)

if(${MY_FEATURE})
  message("Yes")
else()
  message("No")
endif()

Many developers do not understand the above rules all that well. They are easy to get wrong. The following if() command guidelines may help you steer clear of problems:

  1. Always ensure you have policy CMP0054 set to NEW. For most projects, having cmake_minimum_required(VERSION 3.1) or later in the top level CMakeLists.txt file will be enough to accomplish this.
  2. If you are not absolutely certain that a variable exists with the name you’re going to use, but you want to take the value of that variable (or an empty string if it is undefined), explicitly evaluate the variable and use quotes: if("${thing}" STREQUAL "whatever")
  3. If you are not absolutely certain that a variable does not exist with the name you’re going to use as a plain string, always quote that value (like the "whatever" in the previous point).
  4. Don’t manually evaluate a variable reference without quotes. In other words, don’t do this: if(${MY_FEATURE})
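
The guidelines above can be sketched as follows. The WORLD variable is hypothetical and stands in for a variable defined somewhere else in the project. The minimum version shown is only an example, since any version from 3.1 onward enables the NEW behavior of CMP0054:

```cmake
cmake_minimum_required(VERSION 3.10)

set(WORLD "something else")   # hypothetical variable defined elsewhere
set(hello WORLD)

# Fragile: both hello and WORLD are treated as variable names, so this
# compares "WORLD" with "something else"
if(hello STREQUAL WORLD)
    message("Unquoted: match")
else()
    message("Unquoted: no match")
endif()

# Guidelines 2 and 3: evaluate explicitly and quote both sides, so this
# compares "WORLD" with "WORLD"
if("${hello}" STREQUAL "WORLD")
    message("Quoted: match")
else()
    message("Quoted: no match")
endif()
```

This prints "Unquoted: no match" followed by "Quoted: match".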

The while() command accepts the same expressions as if(). Therefore, all the same behaviors and caveats regarding quoting apply to while() too.
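
As a brief sketch, a while() loop condition can follow the same quoting guidelines. The variable name count is hypothetical:

```cmake
cmake_minimum_required(VERSION 3.10)

set(count 3)

# The quoted, explicitly evaluated value is never treated as a variable name
while("${count}" GREATER 0)
    message("count: ${count}")
    math(EXPR count "${count} - 1")
endwhile()
```

This prints count: 3, count: 2 and count: 1 before the loop ends.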

Closing Remarks And References

“If in doubt, use quotes” is a conservative mindset that will steer you away from trouble in most cases. On the other hand, overuse of quotes may reduce readability, especially if it interferes with syntax highlighting. As with many things, the best result frequently involves judgement. Favor robustness and clarity over enforcing an inflexible rule regarding quoting in CMake.

The cmake-language(7) manual covers CMake’s quoting rules in a formal manner. The Variable Expansion section of the if() command’s documentation and the CMP0054 policy documentation also cover important details.
