Recently, a coworker pointed me towards a C++17 library to convert enumeration values to strings and vice versa. The library is called [magic_enum](https://github.com/Neargye/magic_enum) and it indeed feels like magic. I was immediately curious: how did they pull this off?
The code I am presenting here can be seen as a massively down-scaled version of magic_enum: the approach taken is exactly the same. I’ve mainly tried to simplify for purposes of readability and understandability. Thus, consider the code presented here licensed under MIT, the same license that governs magic_enum, just with fewer features and more comments. You can find my re-implementation at https://gist.github.com/zhmu/9ac375706ffbafa5d24693f8475abd79.
I would like to thank Daniil Goncharov for giving me something to study and blowing my mind!
Goal
We want the following code to yield the value shown in the comment:
```cpp
enum class Colour {
    Red = 27,
    Green,
    Blue = 40,
};

const std::string_view s = enum_to_string(Colour::Green);
puts(s.data()); // Green
```
This code already has some requirements subtly embedded:
- enum_to_string() returns a std::string_view, which does not own its data, so the characters it refers to must outlive the call
- The returned string_view must be zero-terminated for puts() to work
This means we need to provide static storage for the string values. We could do without, but it turns out the code optimizes better and is fairly easy to implement so I’ve decided to roll with it.
First steps: how would you convert a given enum value to a string?
In classic C, you have __FILE__ and __LINE__: the former expands to a string literal containing the name of the file being compiled, the latter to an integer constant holding the current line number. Later, C99 introduced __func__, which holds the name of the function you are in. As this is C, function overloading does not exist, so the name alone identifies the function and no parameters are necessary.
C++ does not provide a standard way to determine the current function name with all parameters involved, but GCC/Clang provides __PRETTY_FUNCTION__ and Microsoft Visual Studio provides __FUNCSIG__. Let’s show some examples:
```cpp
void fun1() { puts(__PRETTY_FUNCTION__); }
int fun2(int v) { puts(__PRETTY_FUNCTION__); return 0; }

fun1();    // void fun1()
fun2(123); // int fun2(int)
```
The clever insight here is that you can create a template that takes an enumeration and a compile-time known value of that enumeration:
```cpp
template<typename E, E v> void fun4() { puts(__PRETTY_FUNCTION__); }

fun4<Colour, Colour::Red>();
// void fun4() [with E = Colour; E v = Colour::Red]
```
Hence, if we instantiate this template at compile time for every possible enumeration value, we have their corresponding string representations!
How do you know which enumeration values are possible?
First, let’s see what happens if you try to use fun4() on a value that isn’t in the enumeration:
```cpp
fun4<Colour, static_cast<Colour>(1)>();
// void fun4() [with E = Colour; E v = (Colour)1]
```
This output is distinct in the sense that it doesn’t match the Colour::... form we saw previously. This means we can determine whether any integer corresponds to a given enumeration value or not!
So how do we obtain a list of all valid enumeration values? We try them all, one by one. If you look at the magic_enum documentation, it stands out that the enum values must reside within a certain range, which by default is MAGIC_ENUM_RANGE_MIN .. MAGIC_ENUM_RANGE_MAX. Only this range is evaluated.
In magic_enum, the function n() is responsible for the conversion. It uses pretty_name() to reduce the compiler-specific output of __PRETTY_FUNCTION__ / __FUNCSIG__ to the enumeration value name, or to an empty string_view in case the value does not exist within the enumeration.
An incomplete implementation of the pretty_name() / n() functions, which works well enough for enumeration values that do not contain digits, is as follows:
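A minimal sketch of such a pair of functions is shown here, assuming GCC or Clang (it only handles `__PRETTY_FUNCTION__`, not `__FUNCSIG__`). The exact character test is an assumption chosen to match the description: only letters and underscores count as part of a name, so the trailing digit of something like `(Colour)1` leaves an empty result. `n()` deliberately returns `auto`, because spelling out `std::string_view` would make GCC append the expansion of that alias to the signature.

```cpp
#include <cstddef>
#include <string_view>

// Example enumeration from the Goal section
enum class Colour { Red = 27, Green, Blue = 40 };

// Reduce a GCC/Clang signature such as
//   "constexpr auto n() [with E = Colour; E V = Colour::Green]"
// to the trailing identifier ("Green"), or to an empty view if there is none.
constexpr std::string_view pretty_name(std::string_view sig) noexcept
{
    // Both compilers close the template argument list with ']'
    if (!sig.empty() && sig.back() == ']')
        sig.remove_suffix(1);

    // Walk backwards while we see letters or '_' (digits are NOT accepted)
    std::size_t len = 0;
    while (len < sig.size()) {
        const char c = sig[sig.size() - 1 - len];
        const bool is_name_char =
            (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
        if (!is_name_char) break;
        ++len;
    }
    return sig.substr(sig.size() - len);
}

template<typename E, E V>
constexpr auto n() noexcept
{
    // Assumption: GCC or Clang; MSVC would need __FUNCSIG__ and another suffix
    return pretty_name(__PRETTY_FUNCTION__);
}
```

With this sketch, `n<Colour, Colour::Green>()` is a view of `Green`, while `n<Colour, static_cast<Colour>(1)>()` is empty.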
The implementation in magic_enum supports more cases and will reject invalid names. For our purposes, the implementation above suffices.
Collecting the possible enumeration values
Now that we have a way to query enumeration values one by one, we need a way to assemble them into an array. We want the following code to compile and yield the output in comments:
```cpp
auto f = values<Colour>();
for (const auto& i : f)
    std::cout << static_cast<int>(i) << '\n';
// 27
// 28
// 40
```
First things first, we need a function to determine whether a given enumeration value is valid. Remember the magic n() function above? All we need to check is whether the value returned is non-empty:
```cpp
template<typename E, E V>
constexpr auto is_valid()
{
    constexpr E v = static_cast<E>(V);
    return !n<E, V>().empty();
}
```
We also need a way to convert an index v into the v-th candidate value of the range we are scanning. Within magic_enum, this function is called ualue(), so I’ll stick with that. Thankfully, this is pretty straightforward:
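One way to write it, as a sketch: the index is simply shifted by the lower bound of the range and cast to the enum type. The constant names `ENUM_MIN_VALUE` and `ENUM_MAX_VALUE` follow the text below; their values here (-128 and 128) are an assumption made for illustration.

```cpp
#include <cstddef>

// Assumed search range for this sketch; the values are illustrative
constexpr int ENUM_MIN_VALUE = -128;
constexpr int ENUM_MAX_VALUE = 128;

// Example enumeration from the Goal section
enum class Colour { Red = 27, Green, Blue = 40 };

// Map the index v (0 .. ENUM_MAX_VALUE - ENUM_MIN_VALUE) to the
// v-th candidate value, which may or may not be a named enumerator.
template<typename E>
constexpr E ualue(std::size_t v) noexcept
{
    return static_cast<E>(static_cast<int>(v) + ENUM_MIN_VALUE);
}
```

Note that the cast is only well-defined for every candidate because a scoped enum such as Colour has a fixed underlying type; an unscoped enum without one can only hold a limited range of values.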
Now things get more tricky. We want to create a compile-time sequence of all integers to try (simply the integers from ENUM_MIN_VALUE to ENUM_MAX_VALUE, our counterparts of MAGIC_ENUM_RANGE_MIN and MAGIC_ENUM_RANGE_MAX). We know there are ENUM_MAX_VALUE - ENUM_MIN_VALUE + 1 such values. Using std::make_index_sequence<>, we can generate a compile-time list containing all std::size_t values from 0 up to and including ENUM_MAX_VALUE - ENUM_MIN_VALUE.
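To see the index sequence in isolation, here is a small sketch that turns it into the list of candidate integers. The function name `candidates` and the range bounds are hypothetical and exist only for this illustration; the point is the pattern of an entry function that creates the sequence and a helper that receives it as a parameter pack.

```cpp
#include <array>
#include <cstddef>
#include <utility>

// Assumed search range for this sketch
constexpr int ENUM_MIN_VALUE = -128;
constexpr int ENUM_MAX_VALUE = 128;

// Helper: receives the indices 0, 1, 2, ... as the parameter pack I
template<std::size_t... I>
constexpr auto candidates(std::index_sequence<I...>) noexcept
{
    // Expand the pack: every index is shifted into the candidate range
    return std::array<int, sizeof...(I)>{{ (ENUM_MIN_VALUE + static_cast<int>(I))... }};
}

// Entry point: generate the sequence and hand it to the helper
constexpr auto candidates() noexcept
{
    constexpr std::size_t count = ENUM_MAX_VALUE - ENUM_MIN_VALUE + 1;
    return candidates(std::make_index_sequence<count>{});
}
```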
This allows us to write the values<E>() function, which generates the appropriate list and feeds it into a helper function:
Let’s give the function prototype of the values() helper function:
```cpp
template<typename E, std::size_t... I>
constexpr auto values(std::index_sequence<I...>) noexcept;
```
Here, I is a sequence of std::size_t’s. There can be zero up to a lot of them, and each one corresponds with an enumeration value we want to try to see if it is valid. C++17 gives us fold expressions, which allow us to conveniently express this using the is_valid() and ualue() functions:
Our Colour enumeration has only three values, which means most entries of valid will be false. We want to condense it to a std::array<E, N> which contains only the values present in the enumeration. The first step is to introduce a helper function to count the number of items in valid that are true:
```cpp
template<std::size_t N>
constexpr auto count_values(const bool (&valid)[N])
{
    // Cannot use std::count_if(), it is not constexpr pre C++20
    std::size_t count = 0;
    for (std::size_t n = 0; n < N; ++n)
        if (valid[n]) ++count;
    return count;
}
```
Which makes the remainder of the values() function pretty straight-forward:
```cpp
    constexpr auto num_valid = count_values(valid);
    static_assert(num_valid > 0, "no support for empty enums");

    std::array<E, num_valid> values = {};
    for (std::size_t offset = 0, n = 0; n < num_valid; ++offset) {
        if (valid[offset]) {
            values[n] = ualue<E>(offset);
            ++n;
        }
    }
    return values;
```
We’ll be needing values<E>() quite a bit. We’ll introduce values_v as a shorthand:
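Putting the pieces of this section together, a complete sketch of values<E>() and values_v could look as follows. It assumes GCC or Clang, a search range of -128 to 128, and the simplified letters-and-underscores name extraction; none of these details are taken from magic_enum itself. The `valid` array is filled with a plain pack expansion inside a braced initializer, one element per index in `I`.

```cpp
#include <array>
#include <cstddef>
#include <string_view>
#include <utility>

constexpr int ENUM_MIN_VALUE = -128; // assumed search range
constexpr int ENUM_MAX_VALUE = 128;

enum class Colour { Red = 27, Green, Blue = 40 };

// Trailing run of letters/'_' in a GCC/Clang signature, or an empty view
constexpr std::string_view pretty_name(std::string_view sig) noexcept
{
    if (!sig.empty() && sig.back() == ']') sig.remove_suffix(1);
    std::size_t len = 0;
    for (; len < sig.size(); ++len) {
        const char c = sig[sig.size() - 1 - len];
        if (!((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_')) break;
    }
    return sig.substr(sig.size() - len);
}

template<typename E, E V>
constexpr auto n() noexcept { return pretty_name(__PRETTY_FUNCTION__); }

template<typename E, E V>
constexpr bool is_valid() noexcept { return !n<E, V>().empty(); }

// Index -> candidate value within the assumed range
template<typename E>
constexpr E ualue(std::size_t v) noexcept
{
    return static_cast<E>(static_cast<int>(v) + ENUM_MIN_VALUE);
}

template<std::size_t N>
constexpr auto count_values(const bool (&valid)[N])
{
    std::size_t count = 0;
    for (std::size_t n = 0; n < N; ++n)
        if (valid[n]) ++count;
    return count;
}

// Helper: I holds one index per candidate value
template<typename E, std::size_t... I>
constexpr auto values(std::index_sequence<I...>) noexcept
{
    // One flag per candidate: true if it names an enumerator
    constexpr bool valid[sizeof...(I)] = { is_valid<E, ualue<E>(I)>()... };
    constexpr auto num_valid = count_values(valid);
    static_assert(num_valid > 0, "no support for empty enums");

    std::array<E, num_valid> values = {};
    for (std::size_t offset = 0, n = 0; n < num_valid; ++offset) {
        if (valid[offset]) {
            values[n] = ualue<E>(offset);
            ++n;
        }
    }
    return values;
}

// Entry point: build the index sequence and feed it to the helper
template<typename E>
constexpr auto values() noexcept
{
    constexpr std::size_t range_size = ENUM_MAX_VALUE - ENUM_MIN_VALUE + 1;
    return values<E>(std::make_index_sequence<range_size>{});
}

// Shorthand: the array is computed once per enum type
template<typename E>
inline constexpr auto values_v = values<E>();
```

For Colour, `values_v<Colour>` is then a `std::array<Colour, 3>` holding Red, Green and Blue in ascending order.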
We can test our implementation by iterating over values_v<>. It indeed yields all valid enumeration values of Colour, exactly as intended:
```cpp
auto f = values_v<Colour>;
for (const auto& i : f)
    std::cout << static_cast<int>(i) << '\n';
// 27
// 28
// 40
```
From values to entries
Our intention is to implement an entries_v<E> variable, which yields a std::array<std::pair<E, string_view>, ...>: that is, for a given enum E, it yields an array of pairs, each holding a valid value of that enum and its corresponding string representation.
First, we’ll introduce another helper, enum_name_v<E, V>, to obtain the string representation of an enumeration value V within enum E. For now, this will simply be a call to the n() function:
```cpp
template<typename E, E V>
constexpr auto enum_name()
{
    constexpr auto name = n<E, V>();
    return name;
}

template<typename E, E V>
inline constexpr auto enum_name_v = enum_name<E, V>();
```
We can then introduce the entries() function as follows: given a sequence of indices I into values_v<E>, we yield a std::array<> pairing each enumeration value with its corresponding name string. Expanding the parameter pack makes this convenient:
```cpp
template<typename E, std::size_t... I>
constexpr auto entries(std::index_sequence<I...>) noexcept
{
    return std::array<std::pair<E, std::string_view>, sizeof...(I)>{
        {{ values_v<E>[I], enum_name_v<E, values_v<E>[I]> }...}};
}
```
With entries_v<E> holding that array for a given enum E, enum_to_string() is a linear search:
```cpp
template<typename E>
constexpr std::string_view enum_to_string(E value)
{
    for (const auto& [key, name] : entries_v<E>) {
        if (value == key) return name;
    }
    return {};
}
```
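For reference, here is a self-contained sketch that wires everything up to enum_to_string(). The prerequisites are condensed copies of the earlier pieces, again assuming GCC or Clang and a search range of -128 to 128. The definition of `entries_v` is an assumption: it passes one index per element of values_v<E> to entries().

```cpp
#include <array>
#include <cstddef>
#include <string_view>
#include <utility>

constexpr int ENUM_MIN_VALUE = -128; // assumed search range
constexpr int ENUM_MAX_VALUE = 128;

enum class Colour { Red = 27, Green, Blue = 40 };

// --- condensed prerequisites (simplified, GCC/Clang only) ---
constexpr std::string_view pretty_name(std::string_view sig) noexcept
{
    if (!sig.empty() && sig.back() == ']') sig.remove_suffix(1);
    std::size_t len = 0;
    for (; len < sig.size(); ++len) {
        const char c = sig[sig.size() - 1 - len];
        if (!((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_')) break;
    }
    return sig.substr(sig.size() - len);
}

template<typename E, E V>
constexpr auto n() noexcept { return pretty_name(__PRETTY_FUNCTION__); }

template<typename E, E V>
constexpr bool is_valid() noexcept { return !n<E, V>().empty(); }

template<typename E>
constexpr E ualue(std::size_t v) noexcept
{
    return static_cast<E>(static_cast<int>(v) + ENUM_MIN_VALUE);
}

template<std::size_t N>
constexpr std::size_t count_values(const bool (&valid)[N]) noexcept
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < N; ++i)
        if (valid[i]) ++count;
    return count;
}

template<typename E, std::size_t... I>
constexpr auto values(std::index_sequence<I...>) noexcept
{
    constexpr bool valid[sizeof...(I)] = { is_valid<E, ualue<E>(I)>()... };
    constexpr auto num_valid = count_values(valid);
    std::array<E, num_valid> result = {};
    for (std::size_t offset = 0, i = 0; i < num_valid; ++offset)
        if (valid[offset]) result[i++] = ualue<E>(offset);
    return result;
}

template<typename E>
constexpr auto values() noexcept
{
    return values<E>(std::make_index_sequence<ENUM_MAX_VALUE - ENUM_MIN_VALUE + 1>{});
}

template<typename E>
inline constexpr auto values_v = values<E>();

// --- the pieces discussed in this section ---
template<typename E, E V>
inline constexpr auto enum_name_v = n<E, V>();

template<typename E, std::size_t... I>
constexpr auto entries(std::index_sequence<I...>) noexcept
{
    return std::array<std::pair<E, std::string_view>, sizeof...(I)>{
        {{ values_v<E>[I], enum_name_v<E, values_v<E>[I]> }...}};
}

// Assumed definition: one index per valid enumeration value
template<typename E>
inline constexpr auto entries_v =
    entries<E>(std::make_index_sequence<values_v<E>.size()>{});

template<typename E>
constexpr std::string_view enum_to_string(E value) noexcept
{
    for (const auto& [key, name] : entries_v<E>) {
        if (value == key) return name;
    }
    return {}; // not a named enumerator
}
```

In this sketch the lookup also works at compile time, so `enum_to_string(Colour::Green) == "Green"` can be checked with a static_assert. The returned views still point into the full signature strings, so they are not zero-terminated at the end of the name.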
Of course, C++20’s constexpr algorithms would make this a lot nicer.
Static string storage and zero-termination
This implementation has a major drawback and an annoying bug.
As for the drawback, the complete strings as output by __PRETTY_FUNCTION__ will be stored in the executable (this can be seen using tools like Compiler Explorer and examining the assembly output). We have:
```asm
.LC0:
        .string "constexpr auto n() [with E = Colour; E V = Colour::Green]"
main:
        sub     rsp, 8
        mov     edi, OFFSET FLAT:.LC0+51
        call    puts
        xor     eax, eax
        add     rsp, 8
        ret
```
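This prints Green], which illustrates the bug. If we used std::cout with the string_view instead, we’d get the correct string, because the stream honours the view’s length. puts() only receives a pointer and keeps reading until it finds a \0-character. We never inserted one after the name, so it prints whatever follows in memory.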
We can introduce a helper class to store a compile-time zero-terminated string. magic_enum calls it static_string, which I’ll also do. The idea is to provide N + 1 bytes of storage, which are initially zero. In the constructor, we’ll copy the bytes of the string_view:
```cpp
template<std::size_t N>
struct static_string {
    constexpr static_string(std::string_view sv) noexcept
    {
        // std::copy() is not constexpr in C++17, hence...
        for (std::size_t n = 0; n < N; ++n)
            content[n] = sv[n];
    }

    constexpr operator std::string_view() const noexcept { return { content.data(), N }; }

private:
    std::array<char, N + 1> content{};
};
```
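A small sketch shows the effect in isolation. The struct is repeated so the snippet compiles on its own; the variables `tail`, `name` and `green` are hypothetical and only mimic a name that sits in the middle of a longer string.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

template<std::size_t N>
struct static_string {
    constexpr static_string(std::string_view sv) noexcept
    {
        for (std::size_t n = 0; n < N; ++n)
            content[n] = sv[n];
    }

    constexpr operator std::string_view() const noexcept { return { content.data(), N }; }

private:
    // Value-initialized, so content[N] stays '\0' after the copy
    std::array<char, N + 1> content{};
};

// Hypothetical input: a view of the first five characters of a longer literal
inline constexpr std::string_view tail = "Green]";
inline constexpr std::string_view name = tail.substr(0, 5);

// The copy lives in its own static storage and carries its own terminator
inline constexpr static_string<name.size()> green{ name };
```

Here `name.data()` is followed by `]`, whereas the view obtained from `green` is followed by the terminating zero, which is what puts() needs.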
All that remains is to use static_string<> in enum_name(), as follows:
```cpp
template<typename E, E V>
constexpr auto enum_name()
{
    constexpr auto name = n<E, V>();
    return static_string<name.size()>(name);
}
```
Which yields the desired assembly output:
```asm
main:
        sub     rsp, 8
        mov     edi, OFFSET FLAT:enum_name_v<Colour, (Colour)28>
        call    puts
        xor     eax, eax
        add     rsp, 8
        ret
enum_name_v<Colour, (Colour)28>:
        .byte   71      // G
        .byte   114     // r
        .byte   101     // e
        .byte   101     // e
        .byte   110     // n
        .zero   1
```
Closing words
I’m extremely grateful for Daniil Goncharov’s work. Before this, I would have used the tried and proven C macro-style approach, which feels clumsy given the state C++ is in these days. Studying his approach has taught me some wonderful things, and I hope I’ve shared some of them by writing this post.