C/C++ Preprocessor Metaprogramming - Lists and FOR_EACH

One of the most powerful things we can have in the preprocessor is the power of FOR_EACH (or MAP in functional terms). With this functionality, we can do stuff like loop unrolling, generating enum tables, generate typedefs, dynamic template parameters (these examples in a later post).

FOR_EACH allows us to evaluate a macro function to each of the arguments we pass our FOR_EACH. Lets look at we want to implement.

FOR_EACH(f, a, b, c, d) -> returns f(a) f(b) f(c) f(d)
FOR_EACH(f, (a, b), (x, y)) -> returns f(a, b) f(x, y)

Lists

First we need to define what we mean by list and a couple of macros for lists.

/*
 A simple list can be represented by a, b, c
 A tuple is represented as (a, b, c, d)

 We will need three functions for list:
 1) Head: Gets first value of a list
    -> HEAD(a, b, c) = a
 2) Tail: Returns everything after the first element
    -> TAIL(a, b, c) = b, c
 We can make use of the fact that if list are not enclosed,
 they act as parameters to a macro
 Therefore HEAD is simply the first parameter
 TAIL simply returns the remaining parameters
*/
#define HEAD(x, ...) x
#define TAIL(x, ...) __VA_ARGS__

HEAD(a, b, c) // Expands to a
TAIL(a, b, c) // Expands to b, c
HEAD((a, b), (x, y)) // Expands to (a, b)
TAIL((a, b), (x, y)) // Expands to (x, y)

/*
 3) Test if a list is empty or not.
 A list maybe a list of values or a list of tuples.
 We need to account for both cases
   -> IS_LIST_EMPTY()       = 1
   -> IS_LIST_EMPTY(a, b)   = 0
   -> IS_LIST_EMPTY((a, b)) = 0
   -> IS_LIST_EMPTY(,,)     = 0
 We will achieve this through pattern matching.
 For our list we'll add a value LAST but without a comma operator
 As a result, our lists will now look like
   -> a, b           = a, b LAST
   -> (a, b), (c, d) = (a, b), (c, d) LAST
   -> <nothing>      = LAST
  We will simply take the first value of the new list and MATCH for LAST
*/
#define TEST_LAST EXISTS(1)
#define IS_LIST_EMPTY(...) \
  TRY_EXTRACT_EXISTS( \
    DEFER(HEAD) (__VA_ARGS__ EXISTS(1))\
  , 0)
#define IS_LIST_NOT_EMPTY(...) NOT(IS_LIST_EMPTY(__VA_ARGS__))

IS_LIST_EMPTY()       // 1
IS_LIST_EMPTY(a, b)   // 0
IS_LIST_EMPTY((a, b)) // 0
IS_LIST_EMPTY(,,)     // 0

Tuples are essentially enclosed lists, lets look at some functions we need for tuples

/*
 We also want a function to convert tuples into lists (remove brackets)
 As for why, recall our example:
 FOR_EACH(f, (a, b), (x, y)) -> returns f(a, b) f(x, y)
 We need to remove the the brackets from (a, b) so we can
 pass it into f as two parameters.
 Now this just uses our pattern matching stuff before
 1) First test and see if it is enclosed (discussed in previous section)
 2) If it is enclosed remove enclosure otherwise return as normal
*/
#define ENCLOSE(...) ( __VA_ARGS__ )
#define REM_ENCLOSE_(...) __VA_ARGS__
#define REM_ENCLOSE(...) REM_ENCLOSE_ __VA_ARGS__

#define IF_ENCLOSED_1(true, ...) true
#define IF_ENCLOSED_0(true, ...) __VA_ARGS__
#define IF_ENCLOSED(...) CAT(IF_ENCLOSED_, IS_ENCLOSED(__VA_ARGS__))
// This function will optionally remove brackets around its arguments
// if there are any. Otherwise it will return normally
#define OPT_REM_ENCLOSE(...) \
  IF_ENCLOSED (__VA_ARGS__) ( REM_ENCLOSE(__VA_ARGS__), __VA_ARGS__ )

ENCLOSE(a, b) // (a, b)
REM_ENCLOSE((a, b)) // a, b
f ( OPT_REM_ENCLOSE((a, b)) ) // Expands to f(a, b)
f ( OPT_REM_ENCLOSE(a, b) )   // Expands to f(a, b)

FOR_EACH

So with the boiler plate done, lets look at how FOR_EACH works. FOR_EACH works by going through the last passed and evaluating a function for each one. To do this, we need to go through the list recursively. We can make use of the HEAD and TAIL functions. First lets look at how we would implement in in function pseudo form

FOR_EACH(fVisitor, list)
{
  if ( !is_empty( list ) )
  {
    fVisitor( tuple_to_list_opt(head(list)));
    FOR_EACH(fVisitor, tail(list));
  }
}

Converting this to using our MACROs:

// Remember list is simply varargs to our macro
// As a result list = ...
#define FOR_EACH(fVisitor, ...) \
  IF ( IS_LIST_NOT_EMPTY( list )  ) \
  ( \
    fVisitor( OPT_REMOVE_ENCLOSE(HEAD(__VA_ARGS__)) ) \
    FOR_EACH(fVisitor, TAIL(__VA_ARGS__)) \
  )

Then, lets apply indirection to get rid of the direct recursion and get our final macro

#define FOR_EACH_INDIRECT() FOR_EACH_NO_EVAL
#define FOR_EACH_NO_EVAL(fVisitor, ...) \
  IF ( IS_LIST_NOT_EMPTY( __VA_ARGS__ )  ) \
  ( \
    fVisitor( OPT_REM_ENCLOSE(HEAD(__VA_ARGS__)) ) \
    DEFER2 ( FOR_EACH_INDIRECT ) () (fVisitor, TAIL(__VA_ARGS__)) \
  )
#define FOR_EACH(fVisitor, ...) \
  EVAL(FOR_EACH_NO_EVAL(fVisitor, __VA_ARGS__))

FOR_EACH(f, a, b, c, d)     // Expands to f(a) f(b) f(c) f(d)
FOR_EACH(f, (a, b), (x, y)) // Expands to f(a, b) f(x, y)

And there have it, we've implemented FOR_EACH! FOR_EACH doesnt have to be limited to 1 dimension.

Extending FOR_EACH to higher dimensions

Lets consider a 2D version of FOR_EACH. We want it to take two lists and apply a function to all permutations of the function. Lets look at an example

FOR_EACH_2D(f, (a, b), (x, y)) -> returns f(a, x) f(a, y) f(b, x) f(b, y)

Ideally we'd use the FOR_EACH function however using our current macros, its not possible (we would need to implement bind which we will discuss later). For now we'll do it iteratively like we did for FOR_EACH. Lets look at the functional form:

FOR_EACH_2D(fVisitor, listA, origListA, listB, origListB):
{
  // Check to see if we've iterated thourgh all our values in listB
  // If we havent then we can iterate
  if ( !empty(listB) )
  {
    // Call visitor in the first elements of each
    fVisitor ( head(listA), head(listB) );
    // Then move onto the next list
    FOR_EACH_2D(fVisitor, listA, origListA, tail(listB), origListB)
  }
  // If we've processed all the values in listB with our current A value
  // Then move onto the next value of A if we havent run out yet
  else if ( !empty(tail(listA)) )
  {
    FOR_EACH_2D(fVisitor, tail(listA), origListA, origListB, origListB)
  }
}
FOR_EACH_2D(fVisitor, listA, listB):
  FOR_EACH_2D(fVisitor, listA, listA, listB, listB)

Using our macros we get this form:

#define FOR_EACH_2D_(fVisitor, listA, origListA, listB, origListB) \
  /* Note we dont use brackets because listB is already bracketed */ \
  IF ( IS_LIST_NOT_EMPTY listB )\
  ( \
    fVisitor ( HEAD listA, HEAD listB  ) \
    FOR_EACH_2D_(fVisitor, listA, origListA, (TAIL listB), origListB) \
  , /* else */ \
    IF (IS_LIST_NOT_EMPTY ( TAIL listA ) ) \
    ( \
     FOR_EACH_2D_(fVisitor, TAIL(listA), origListA, origListB, origListB) \
    ) \
  )
#define FOR_EACH_2D(fVisitor, listA, listB) \
  FOR_EACH_2D_ ( fVisitor, listA, listA, listB, listB )

And finally applying indirection, we get the final form

#define FOR_EACH_2D_INDIRECT() FOR_EACH_2D_
#define FOR_EACH_2D_( visitorExp, listA, origListA, listB, origListB )\
  IF ( IS_LIST_NOT_EMPTY listB )\
  (\
    visitorExp( HEAD listA, HEAD listB )\
    DEFER2(FOR_EACH_2D_INDIRECT) () ( visitorExp, listA, listA, (TAIL listB), origListB )\
    ,\
    IF ( IS_LIST_NOT_EMPTY( TAIL listA ) )\
    ( \
      DEFER3(FOR_EACH_2D_INDIRECT) () ( visitorExp, (TAIL origListA), (TAIL origListA), origListB, origListB )\
    )\
  )
#define FOR_EACH_2D_NO_EVAL(visitorExp, listA, listB) \
  FOR_EACH_2D_(visitorExp, listA, listA, listB, listB)
#define FOR_EACH_2D(visitorExp, listA, listB) \
  EVAL(FOR_EACH_2D_NO_EVAL(visitorExp, listA, listB))

FOR_EACH_2D(f, (a, b), (x, y))
// Expands to f( a, x ) f( a, y ) f( b, x ) f( b, y )

We can extend this a dimension further too. Im not going to go into the derivation but its the same as the second but doing another loop. Heres what FOR_EACH_3D looks like

#define FOR_EACH_3D_INDIRECT() FOR_EACH_3D_
#define FOR_EACH_3D_( visitorExp, listA, origListA, listB, origListB, listC, origListC )\
  IF ( IS_LIST_NOT_EMPTY listC )\
  (\
    visitorExp( HEAD listA, HEAD listB, HEAD listC )\
    DEFER2(FOR_EACH_3D_INDIRECT) () ( visitorExp, listA, listA, listB, listB, (TAIL listC), origListC )\
    ,\
    IF ( IS_LIST_NOT_EMPTY( TAIL listB ) )\
    ( \
      DEFER3(FOR_EACH_3D_INDIRECT) () ( visitorExp, origListA, origListA, (TAIL origListB), (TAIL origListB), origListC, origListC )\
      IF ( IS_LIST_NOT_EMPTY( TAIL listA ) )\
      ( \
        DEFER4(FOR_EACH_3D_INDIRECT) () ( visitorExp, (TAIL origListA), (TAIL origListA), origListB, origListB, origListC, origListC )\
      )\
    )\
  )
#define FOR_EACH_3D_NO_EVAL(visitorExp, listA, listB, listC) \
  FOR_EACH_3D_(visitorExp, listA, listA, listB, listB, listC, listC)
#define FOR_EACH_3D(visitorExp, listA, listB, listC) \
  EVAL(FOR_EACH_3D_NO_EVAL(visitorExp, listA, listB, listC))

FOR_EACH_3D(f, (a, b), (x, y), (1, 2) )
/*  Expands to
    f( a, x, 1 ) f( a, x, 2 )
    f( a, y, 1 ) f( a, y, 2 )
    f( b, x, 1 ) f( b, x, 2 )
    f( b, y, 1 ) f( b, y, 2 )
*/

C/C++ Preprocessor Metaprogramming - Evaluating and Defering Macro Calls

Before we go into our more in depth examples, I want to provide some basic building block macros.

Deferring/Evaluating Expressions

Defering and evaluating macros are going to be used a lot and are really important to be discussed. The goal of defer is to delay the evaluation of a macro. Evaluation macros are going to evaluate macros. We will make use of combinations them heavily to achieve many things. Lets first look at what the use case of the macro will be

#define FOO(x, y) x y
#define BAR() x, y

/*
  The following expression results in an error
  This is because FOO is expanded first
  As a result, Since Foo only has 1 parameter Bar
  It complains we didnt provide the correct arguments
*/
FOO(BAR()) 

/*
  Using defer allows BAR() to be evaluated first.
  However, since we used the defer, FOO is not evaluated
*/
DEFER(FOO) (BAR()) // Expands to FOO(x, y)

/*
  To get around this, we will need an evaluation macro.
  Calling EVAL will allow us to evalated the DEFER expression
*/
EVAL(DEFER(FOO) (BAR())) // Expands to x y

/*
  We might also want to defer twice
  if we have two levels of evaluation needed
*/
#define QUX() BAR()

// Error: The following
// expands to FOO(BAR()) and FOO is evaluated
EVAL(DEFER(FOO) (QUX())) 

EVAL(DEFER2(FOO) (QUX())) // Expands to x y

The definition of DEFER and EVAL is actually relatively simple, we can make use of our rules discussed previously to make it work. The idea of DEFER is to require n evaluations for the deferred expression/macro to actually get called. We can use our EMPTY macro that we first saw in the rules section. By adding EMPTY() calls in between a macro call and its parameters we can defer the execution of FOO by however many EMPTY's we have.

// EMPTY() expands to nothing
#define EMPTY() 

FOO(x, y) // x y
// We require an extra evaluation to evaluate FOO
FOO EMPTY() (x, y) // FOO(x, y)
// We require two extra evaluations to evaluate FOO
FOO EMPTY EMPTY() () (x,y ) // FOO EMPTY() (x, y)

This pattern can be implemented in the DEFER macros.

// __VA_ARGS__ is the expression to defer
#define DEFER(...) __VA_ARGS__ EMPTY()
// To defer it twice we simply add in another defered EMPTY() call
#define DEFER2(...) __VA_ARGS__ DEFER(EMPTY) ()
#define DEFER3(...) __VA_ARGS__ DEFER2(EMPTY) ()
#define DEFER4(...) __VA_ARGS__ DEFER3(EMPTY) ()
#define DEFER5(...) __VA_ARGS__ DEFER4(EMPTY) ()
// .
// .
// .
// Can have this go to however many defer stages you need
// This can easily be autogen'd

Next, we want to implement an EVAL macro which evaluates the macro. This will come later when we explore recursion but the main goal is we want to have a sub-expression evaluated many times. The way this works is by relying on Rule 3. By having an empty macro that simply returns our arguments we can have an expression evaluated once. Stack the calls to these macros and we can evaluate many more times. If we have an EVAL call another EVAL twice and have that EVAL call another EVAL, we can have 2^n evaluations of an expression where n is how many times we stack. This is easier to show in code

#define EVAL_1(...) __VA_ARGS__
// Note how we call EVAL of the lower level twice
// This allows us to double the number of EVAL calls per level
#define EVAL_2(...) EVAL_1(EVAL_1(__VA_ARGS__))
#define EVAL_3(...) EVAL_2(EVAL_2(__VA_ARGS__))
#define EVAL_4(...) EVAL_3(EVAL_3(__VA_ARGS__))
#define EVAL_5(...) EVAL_4(EVAL_4(__VA_ARGS__))
#define EVAL_6(...) EVAL_5(EVAL_5(__VA_ARGS__))
#define EVAL_7(...) EVAL_6(EVAL_6(__VA_ARGS__))
#define EVAL_8(...) EVAL_7(EVAL_7(__VA_ARGS__))
// Finally our EVAL calls the top level EVAL call
#define EVAL(...) EVAL_8(__VA_ARGS__)
// ...
// This can easily  autogen'd

Note, if have a macro that is evaluated using EVAL which calls another macro that uses an EVAL then that EVAL wont be expanded (Rule 4). This can be seen in the following example.

#define BAR(x) EVAL(x)
EVAL( DEFER(BAR)(x) ) // Expands to EVAL(x)

We would need to create an appropriate EVAL chain for macro if they are going to be chained together

// Assuming EVALA and EVALB are defined appropriately
#define BAR(x) EVALA(x)
EVALB( DEFER(BAR)(x) ) // Expands to x

C/C++ Preprocessor Metaprogramming - Applications: Typedef Unrolling

Another use for metaprogramming is when you want to have multiple typedef for variable parameters. An example of this is say you have a Vector class and we want typdef it for various dimensions and various data types.

template<int NUM_DIMENSIONS, class DataType>
struct Vector {
  DataType m_data[NUM_DIMENSIONS;
};

typedef Vector<2, int> Vector2i;
typedef Vector<2, double> Vector2d;
typedef Vector<2, float> Vector2f;

typedef Vector<3, int> Vector3i;
typedef Vector<3, double> Vector3d;
typedef Vector<3, float> Vector3f;

Suppose we wanted to expand this for more dimensions and data types as well. We can make use of pattern matching and FOR_EACH_2D to make our life easier.

// Pattern match PREFIX so that we can get a prefix if appropriate
// Otherwise return the type itself
#define PREFIX_int EXISTS(i)
#define PREFIX_float EXISTS(f)
#define PREFIX_double EXISTS(d)
#define GET_PREFIX(type) TRY_EXTRACT_EXISTS(CAT(PREFIX_, type), type)

#define DEFINE_VECTOR_TYPE(n, type) \
  typedef Vector<n, type> CAT(CAT(Vector, n), GET_PREFIX(type));

FOR_EACH_2D(DEFINE_VECTOR_TYPE, (2, 3, 4), (int, double, float))
/* Expands to
  typedef Vector<2, int> Vector2i;
  typedef Vector<2, double> Vector2d;
  typedef Vector<2, float> Vector2f;

  typedef Vector<3, int> Vector3i;
  typedef Vector<3, double> Vector3d;
  typedef Vector<3, float> Vector3f;

  typedef Vector<4, int> Vector4i;
  typedef Vector<4, double> Vector4d;
  typedef Vector<4, float> Vector4f;
*/
#undef DEFINE_VECTOR_TYPE