Why can templates only be implemented in the header file?


Quote from The C++ standard library: a tutorial and handbook:



The only portable way of using templates at the moment is to implement them in header files by using inline functions.



Why is this?



(Clarification: header files are not the only portable solution. But they are the most convenient portable solution.)



It is not necessary to put the implementation in the header file, see the alternative solution at the end of this answer.



Anyway, the reason your code is failing is that, when instantiating a template, the compiler creates a new class with the given template argument. For example:



When reading this line, the compiler will create a new class (let's call it FooInt), which is equivalent to the following:



Consequently, the compiler needs to have access to the implementation of the methods, to instantiate them with the template argument (in this case int). If these implementations were not in the header, they wouldn't be accessible, and therefore the compiler wouldn't be able to instantiate the template.



A common solution to this is to write the template declaration in a header file, then implement the class in an implementation file (for example .tpp), and include this implementation file at the end of the header.



This way, implementation is still separated from declaration, but is accessible to the compiler.



Another solution is to keep the implementation separated, and explicitly instantiate all the template instances you'll need:



If my explanation isn't clear enough, you can have a look at the C++ Super-FAQ on this subject.



Plenty correct answers here, but I wanted to add this (for completeness):



If you, at the bottom of the implementation cpp file, do explicit instantiation of all the types the template will be used with, the linker will be able to find them as usual.



Edit: Adding example of explicit template instantiation. Used after the template has been defined, and all member functions has been defined.



This will instantiate (and thus make available to the linker) the class and all its member functions (only). Similar syntax works for template functions, so if you have non-member operator overloads you may need to do the same for those.



The above example is fairly useless since vector is fully defined in headers, except when a common include file (precompiled header?) uses extern template class vector<int> so as to keep it from instantiating it in all the other (1000?) files that use vector.



It's because of the requirement for separate compilation and because templates are instantiation-style polymorphism.



Lets get a little closer to concrete for an explanation. Say I've got the following files:



Separate compilation means I should be able to compile foo.cpp independently from bar.cpp. The compiler does all the hard work of analysis, optimization, and code generation on each compilation unit completely independently; we don't need to do whole-program analysis. It's only the linker that needs to handle the entire program at once, and the linker's job is substantially easier.



bar.cpp doesn't even need to exist when I compile foo.cpp, but I should still be able to link the foo.o I already had together with the bar.o I've only just produced, without needing to recompile foo.cpp. foo.cpp could even be compiled into a dynamic library, distributed somewhere else without foo.cpp, and linked with code they write years after I wrote foo.cpp.



"Instantiation-style polymorphism" means that the template MyClass<T> isn't really a generic class that can be compiled to code that can work for any value of T. That would add overhead such as boxing, needing to pass function pointers to allocators and constructors, etc. The intention of C++ templates is to avoid having to write nearly identical class MyClass_int, class MyClass_float, etc, but to still be able to end up with compiled code that is mostly as if we had written each version separately. So a template is literally a template; a class template is not a class, it's a recipe for creating a new class for each T we encounter. A template cannot be compiled into code, only the result of instantiating the template can be compiled.



So when foo.cpp is compiled, the compiler can't see bar.cpp to know that MyClass<int> is needed. It can see the template MyClass<T>, but it can't emit code for that (it's a template, not a class). And when bar.cpp is compiled, the compiler can see that it needs to create a MyClass<int>, but it can't see the template MyClass<T> (only its interface in foo.h) so it can't create it.



If foo.cpp itself uses MyClass<int>, then code for that will be generated while compiling foo.cpp, so when bar.o is linked to foo.o they can be hooked up and will work. We can use that fact to allow a finite set of template instantiations to be implemented in a .cpp file by writing a single template. But there's no way for bar.cpp to use the template as a template and instantiate it on whatever types it likes; it can only use pre-existing versions of the templated class that the author of foo.cpp thought to provide.



You might think that when compiling a template the compiler should "generate all versions", with the ones that are never used being filtered out during linking. Aside from the huge overhead and the extreme difficulties such an approach would face because "type modifier" features like pointers and arrays allow even just the built-in types to give rise to an infinite number of types, what happens when I now extend my program by adding:



There is no possible way that this could work unless we either



Nobody likes (1), because whole-program-analysis compilation systems take forever to compile , and because it makes it impossible to distribute compiled libraries without the source code. So we have (2) instead.



Templates need to be instantiated by the compiler before actually compiling them into object code. This instantiation can only be achieved if the template arguments are known. Now imagine a scenario where a template function is declared in a.h, defined in a.cpp and used in b.cpp. When a.cpp is compiled, it is not necessarily known that the upcoming compilation b.cpp will require an instance of the template, let alone which specific instance would that be. For more header and source files, the situation can quickly get more complicated.



One can argue that compilers can be made smarter to "look ahead" for all uses of the template, but I'm sure that it wouldn't be difficult to create recursive or otherwise complicated scenarios. AFAIK, compilers don't do such look aheads. As Anton pointed out, some compilers support explicit export declarations of template instantiations, but not all compilers support it (yet?).



Actually, versions of the C++ standard before C++11 defined the 'export' keyword, that would make it possible to simply declare templates in a header file and implement them elsewhere.



Unfortunately, none of the popular compilers implemented this keyword. The only one I know about is the frontend written by the Edison Design Group, which is used by the Comeau C++ compiler. All others insisted you write templates in header files, needing the definition of the code for proper instantiation (as others pointed out already).



As a result, the ISO C++ standard committee decided to remove the export feature of templates beginning with C++11.



Although standard C++ has no such requirement, some compilers require that all function and class templates need to be made available in every translation unit they are used. In effect, for those compilers, the bodies of template functions must be made available in a header file. To repeat: that means those compilers won't allow them to be defined in non-header files such as .cpp files



There is an export keyword which is supposed to mitigate this problem, but it's nowhere close to being portable.



Templates must be used in headers because the compiler needs to instantiate different versions of the code, depending on the parameters given/deduced for template parameters. Remember that a template doesn't represent code directly, but a template for several versions of that code.
When you compile a non-template function in a .cpp file, you are compiling a concrete function/class. This is not the case for templates, which can be instantiated with different types, namely, concrete code must be emitted when replacing template parameters with concrete types.



There was a feature with the export keyword that was meant to be used for separate compilation.
The export feature is deprecated in C++11 and, AFAIK, only one compiler implemented it. You shouldn't make use of export. Separate compilation is not possible in C++ or C++11 but maybe in C++17, if concepts make it in, we could have some way of separate compilation.



For separate compilation to be achieved, separate template body checking must be possible. It seems that a solution is possible with concepts. Take a look at this paper recently presented at the
standards commitee meeting. I think this is not the only requirement, since you still need to instantiate code for the template code in user code.



The separate compilation problem for templates I guess it's also a problem that is arising with the migration to modules, which is currently being worked.



It means that the most portable way to define method implementations of template classes is to define them inside the template class definition.



Even though there are plenty of good explanations above, I'm missing a practical way to separate templates into header and body.
My main concern is avoiding recompilation of all template users, when I change its definition.
Having all template instantiations in the template body is not a viable solution for me, since the template author may not know all if its usage and the template user may not have the right to modify it.
I took the following approach, which works also for older compilers (gcc 4.3.4, aCC A.03.13).



For each template usage there's a typedef in its own header file (generated from the UML model). Its body contains the instantiation (which ends up in a library which is linked in at the end).
Each user of the template includes that header file and uses the typedef.



A schematic example:



MyTemplate.h:



MyTemplate.cpp:



MyInstantiatedTemplate.h:



MyInstantiatedTemplate.cpp:



main.cpp:



This way only the template instantiations will need to be recompiled, not all template users (and dependencies).



That is exactly correct because the compiler has to know what type it is for allocation. So template classes, functions, enums,etc.. must be implemented as well in the header file if it is to be made public or part of a library (static or dynamic) because header files are NOT compiled unlike the c/cpp files which are. If the compiler doesn't know the type is can't compile it. In .Net it can because all objects derive from the Object class. This is not .Net.



If the concern is the extra compilation time and binary size bloat produced by compiling the .h as part of all the .cpp modules using it, in many cases what you can do is make the template class descend from a non-templatized base class for non type-dependent parts of the interface, and that base class can have its implementation in the .cpp file.



The compiler will generate code for each template instantiation when you use a template during the compilation step.
In the compilation and linking process .cpp files are converted to pure object or machine code which in them contains references or undefined symbols because the .h files that are included in your main.cpp have no implementation YET. These are ready to be linked with another object file that defines an implementation for your template and thus you have a full a.out executable.
However since templates need to be processed in the compilation step in order to generate code for each template instantiation that you do in your main program, linking won't help because compiling the main.cpp into main.o and then compiling your template .cpp into template.o and then linking won't achieve the templates purpose because I'm linking different template instantiation to the same template implementation! And templates are supposed to do the opposite i.e to have ONE implementation but allow many available instantiations via the use of one class.



Meaning typename T get's replaced during the compilation step not the linking step so if I try to compile a template without T being replaced as a concrete value type so it won't work because that's the definition of templates it's a compile time process, and btw meta-programming is all about using this definition.



A way to have separate implementation is as follows.



inner_foo has the forward declarations. foo.tpp has the implementation and include inner_foo.h; and foo.h will have just one line, to include foo.tpp.



On compile time, contents of foo.h are copied to foo.tpp and then the whole file is copied to foo.h after which it compiles. This way, there is no limitations, and the naming is consistent, in exchange for one extra file.



I do this because static analyzers for the code break when it does not see the forward declarations of class in *.tpp. This is annoying when writing code in any IDE or using YouCompleteMe or others.




Thank you for your interest in this question.
Because it has attracted low-quality or spam answers that had to be removed, posting an answer now requires 10 reputation on this site (the association bonus does not count).


Would you like to answer one of these unanswered questions instead?

Popular posts from this blog

The Dalles, Oregon

眉山市

清晰法令