namespace Eigen {

/** \eigenManualPage TopicStructHavingEigenMembers Structures Having Eigen Members

\eigenAutoToc

\section StructHavingEigenMembers_summary Executive Summary


If you define a structure having members of \ref TopicFixedSizeVectorizable "fixed-size vectorizable Eigen types", you must ensure that calling operator new on it allocates properly aligned buffers.
If you compile exclusively in \cpp17 mode with a sufficiently recent compiler (e.g., GCC>=7, clang>=5, MSVC>=19.12), then everything is taken care of by the compiler and you can stop reading.

Otherwise, you have to overload its `operator new` so that it generates properly aligned pointers (e.g., 32-byte-aligned for Vector4d and AVX).
Fortunately, %Eigen provides you with a macro `EIGEN_MAKE_ALIGNED_OPERATOR_NEW` that does that for you.

\section StructHavingEigenMembers_what What kind of code needs to be changed?

The kind of code that needs to be changed is this:

\code
class Foo
{
  ...
  Eigen::Vector2d v;
  ...
};

...

Foo *foo = new Foo;
\endcode

In other words: you have a class that has as a member a \ref TopicFixedSizeVectorizable "fixed-size vectorizable Eigen object", and then you dynamically create an object of that class.

\section StructHavingEigenMembers_how How should such code be modified?

This is easy: just put the `EIGEN_MAKE_ALIGNED_OPERATOR_NEW` macro in a public part of your class, like this:

\code
class Foo
{
  ...
  Eigen::Vector4d v;
  ...
public:
  EIGEN_MAKE_ALIGNED_OPERATOR_NEW
};

...

Foo *foo = new Foo;
\endcode

This macro makes `new Foo` always return an aligned pointer.

In \cpp17, this macro is empty.

If this approach is too intrusive, see also the \ref StructHavingEigenMembers_othersolutions "other solutions".

\section StructHavingEigenMembers_why Why is this needed?

Let's say that your code looks like this:

\code
class Foo
{
  ...
  Eigen::Vector4d v;
  ...
};

...

Foo *foo = new Foo;
\endcode

An Eigen::Vector4d consists of 4 doubles, which is 256 bits.
This is exactly the size of an AVX register, which makes it possible to use AVX for all sorts of operations on this vector.
But AVX instructions (at least the ones that %Eigen uses, which are the fast ones) require 256-bit alignment.
Otherwise you get a segmentation fault.
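
The arithmetic above can be checked with a small standalone sketch. It does not use %Eigen: `Vec4dLike`, `misalignment` and `check_stack_object` are hypothetical names introduced only for illustration, and the `alignas(32)` stand-in merely mimics the requirement described here for an AVX build. It also assumes that `double` is 8 bytes, as on common platforms.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a fixed-size vectorizable type such as
// Eigen::Vector4d under AVX. This is NOT Eigen's actual definition.
struct alignas(32) Vec4dLike {
  double data[4];
};

// 4 doubles are 256 bits, and 256-bit alignment means 32-byte alignment
// (assumes sizeof(double) == 8).
static_assert(sizeof(Vec4dLike) * 8 == 256, "4 doubles occupy 256 bits");
static_assert(alignof(Vec4dLike) == 32, "256-bit alignment is 32 bytes");

// Number of bytes by which p lies past the previous `alignment`-byte
// boundary; 0 means that p is aligned.
inline std::size_t misalignment(const void* p, std::size_t alignment) {
  return static_cast<std::size_t>(reinterpret_cast<std::uintptr_t>(p) % alignment);
}

// An object with automatic storage honors alignas, so it sits on a
// 32-byte boundary; its second coefficient is 8 bytes past that boundary.
inline void check_stack_object() {
  Vec4dLike v{};
  assert(misalignment(&v, 32) == 0);
  assert(misalignment(&v.data[1], 32) == 8);
}
```

An address for which `misalignment(p, 32)` is nonzero is the kind of pointer that aligned 256-bit loads and stores cannot accept.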

For this reason, %Eigen itself takes care of requiring 256-bit alignment for Eigen::Vector4d, by doing two things:
\li %Eigen requires 256-bit alignment for the Eigen::Vector4d's array (of 4 doubles). With \cpp11 this is done with the <a href="https://en.cppreference.com/w/cpp/keyword/alignas">alignas</a> keyword, or with compiler extensions for C++98/03.
\li %Eigen overloads the `operator new` of Eigen::Vector4d so it will always return 256-bit aligned pointers. (removed in \cpp17)

Thus, normally, you don't have to worry about anything, %Eigen handles alignment of operator new for you...

... except in one case. When you have a `class Foo` like above, and you dynamically allocate a new `Foo` as above, then, since `Foo` doesn't have an aligned `operator new`, the returned pointer `foo` is not necessarily 256-bit aligned.

The alignment attribute of the member `v` is then relative to the start of the class `Foo`. If the `foo` pointer wasn't aligned, then `foo->v` won't be aligned either!

The solution is to let `class Foo` have an aligned `operator new`, as we showed in the previous section.

This explanation also holds for SSE/NEON/MSA/Altivec/VSX targets, which require 16-byte alignment, and for AVX512, which requires 64-byte alignment for fixed-size objects whose size is a multiple of 64 bytes (e.g., Eigen::Matrix4d).

\section StructHavingEigenMembers_movetotop Should I then put all the members of Eigen types at the beginning of my class?

That's not required. Since %Eigen takes care of declaring adequate alignment, all members that need it are automatically aligned relative to the class.
So code like this works fine:

\code
class Foo
{
  double x;
  Eigen::Vector4d v;
public:
  EIGEN_MAKE_ALIGNED_OPERATOR_NEW
};
\endcode

That said, as usual, it is recommended to sort the members so that alignment does not waste memory.
In the above example, with AVX, the compiler will have to reserve 24 empty bytes between `x` and `v`.


\section StructHavingEigenMembers_dynamicsize What about dynamic-size matrices and vectors?

Dynamic-size matrices and vectors, such as Eigen::VectorXd, dynamically allocate their own array of coefficients, so they take care of requiring absolute alignment automatically and don't cause this issue. The issue discussed here only concerns \ref TopicFixedSizeVectorizable "fixed-size vectorizable matrices and vectors".


\section StructHavingEigenMembers_bugineigen So is this a bug in Eigen?

No, it's not our bug. It's more like an inherent problem of the C++ language specification that has been solved in C++17 through the feature known as <a href="http://wg21.link/p0035r4">dynamic memory allocation for over-aligned data</a>.


\section StructHavingEigenMembers_conditional What if I want to do this conditionally (depending on template parameters)?

For this situation, we offer the macro `EIGEN_MAKE_ALIGNED_OPERATOR_NEW_IF(NeedsToAlign)`.
It will generate aligned operators like `EIGEN_MAKE_ALIGNED_OPERATOR_NEW` if `NeedsToAlign` is true.
It will generate operators with the default alignment if `NeedsToAlign` is false.
In \cpp17, this macro is empty.

Example:

\code
template<int n> class Foo
{
  typedef Eigen::Matrix<float,n,1> Vector;
  enum { NeedsToAlign = (sizeof(Vector)%16)==0 };
  ...
  Vector v;
  ...
public:
  EIGEN_MAKE_ALIGNED_OPERATOR_NEW_IF(NeedsToAlign)
};

...

Foo<4> *foo4 = new Foo<4>; // foo4 is guaranteed to be 128-bit aligned
Foo<3> *foo3 = new Foo<3>; // foo3 has only the system default alignment guarantee
\endcode


\section StructHavingEigenMembers_othersolutions Other solutions

In case putting the `EIGEN_MAKE_ALIGNED_OPERATOR_NEW` macro everywhere is too intrusive, there exist at least two other solutions.

\subsection othersolutions1 Disabling alignment

The first is to disable the alignment requirement for the fixed-size members:
\code
class Foo
{
  ...
  Eigen::Matrix<double,4,1,Eigen::DontAlign> v;
  ...
};
\endcode
This `v` is fully compatible with an aligned Eigen::Vector4d.
The only effect is to make loads and stores to `v` more expensive (usually slightly, but that's hardware dependent).
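
To make the trade-off concrete, here is a sketch that uses stand-in types rather than real %Eigen classes. `AlignedVec4`, `UnalignedVec4`, `FooAligned`, `FooUnaligned` and `padding_before_aligned_v` are hypothetical names; the 32-byte figure assumes an AVX-style requirement and 8-byte doubles. Under those assumptions, dropping the alignment attribute leaves the storage size unchanged but lowers the alignment of the enclosing class to that of `double`, which a plain `operator new` already guarantees.

```cpp
#include <cstddef>

// Hypothetical stand-ins, NOT Eigen's real definitions.
struct alignas(32) AlignedVec4 { double data[4]; };  // mimics Vector4d with AVX
struct UnalignedVec4 { double data[4]; };            // mimics Matrix<double,4,1,DontAlign>

struct FooAligned   { double x; AlignedVec4 v; };
struct FooUnaligned { double x; UnalignedVec4 v; };

// Both variants store the same 4 doubles.
static_assert(sizeof(AlignedVec4) == sizeof(UnalignedVec4), "same storage size");

// The over-aligned member raises the alignment of the whole class ...
static_assert(alignof(FooAligned) == 32, "class inherits the 32-byte requirement");
// ... while the unaligned variant only needs what any allocation provides.
static_assert(alignof(FooUnaligned) == alignof(double), "only scalar alignment");
static_assert(alignof(FooUnaligned) <= alignof(std::max_align_t),
              "a plain operator new is sufficient");

// Padding inserted between x and v in the over-aligned class
// (32 - 8 = 24 bytes when sizeof(double) == 8).
constexpr std::size_t padding_before_aligned_v() {
  return offsetof(FooAligned, v) - sizeof(double);
}

static_assert(offsetof(FooUnaligned, v) == sizeof(double), "no padding needed");
```

The sketch only inspects layout; it says nothing about the run-time cost of unaligned loads and stores, which remains hardware dependent.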


\subsection othersolutions2 Private structure

The second consists of storing the fixed-size objects in a private struct that is dynamically allocated when the main object is constructed:

\code
struct Foo_d
{
  EIGEN_MAKE_ALIGNED_OPERATOR_NEW
  Eigen::Vector4d v;
  ...
};


struct Foo {
  Foo() { init_d(); }
  ~Foo() { delete d; }
  void bar()
  {
    // use d->v instead of v
    ...
  }
private:
  void init_d() { d = new Foo_d; }
  Foo_d* d;
};
\endcode

The clear advantage here is that the class `Foo` remains unchanged regarding alignment issues.
The drawback is that an additional heap allocation is always required.

*/

}