Incomplete types and shared_ptr / unique_ptr

Published 2023-06-13 18:35:37, author: huorexiaji

If the object being deleted has incomplete class type at the point of deletion and the complete class has a non-trivial destructor or a deallocation function, the behavior is undefined.

UB=undefined behavior 

 

https://en.cppreference.com/w/cpp/memory/default_delete

std::default_delete

Defined in header <memory>

template< class T > struct default_delete;        (1) (since C++11)
template< class T > struct default_delete<T[]>;   (2) (since C++11)


std::default_delete is the default destruction policy used by std::unique_ptr when no deleter is specified. Specializations of default_delete are empty classes on typical implementations, and are used in the empty base class optimization.

1) The non-specialized default_delete uses delete to deallocate memory for a single object.
2) A partial specialization for array types that uses delete[] is also provided.
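
The two forms can be illustrated with a small sketch. Widget and sum_and_release are hypothetical names used only for this illustration, and the emptiness of default_delete holds on typical implementations rather than being guaranteed by the standard.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical type used only for this illustration.
struct Widget { int value; };

// On typical implementations default_delete has no data members, which is
// what lets unique_ptr store it without extra space.
constexpr bool deleter_is_empty =
    std::is_empty<std::default_delete<Widget>>::value;

int sum_and_release() {
    // Single object: released through default_delete<Widget>, i.e. delete.
    std::unique_ptr<Widget> one(new Widget{20});
    // Array: released through default_delete<int[]>, i.e. delete[].
    std::unique_ptr<int[]> many(new int[2]{10, 12});
    assert(one && many);
    return one->value + many[0] + many[1];
}
```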

 

https://howardhinnant.github.io/incomplete.html

Most templates in the C++ standard library require that they be instantiated with complete types. However shared_ptr and unique_ptr are partial exceptions. Some, but not all of their members can be instantiated with incomplete types. The motivation for this is to support idioms such as pimpl using smart pointers, and without risking undefined behavior.

Undefined behavior can occur when you have an incomplete type and you call delete on it:

class A;
A* a = ...;
delete a;

The above is legal code. It will compile. Your compiler may or may not emit a warning for code like the above. When it executes, bad things will probably happen. If you're very lucky your program will crash. However a more probable outcome is that your program will silently leak memory as ~A() won't be called.

Using auto_ptr<A> in the above example doesn't help. You still get the same undefined behavior as if you had used a raw pointer.

Nevertheless, using incomplete classes in certain places is very useful! This is where shared_ptr and unique_ptr help. Use of one of these smart pointers will let you get away with an incomplete type, except where it is necessary to have a complete type. And most importantly, when it is necessary to have a complete type, you get a compile-time error if you try to use the smart pointer with an incomplete type at that point.

No more undefined behavior:

If your code compiles, then you've used a complete type everywhere you need to.

class A
{
    class impl;
    std::unique_ptr<impl> ptr_;  // ok!

public:
    A();
    ~A();
    // ...
};

shared_ptr and unique_ptr require a complete type in different places. The reasons are obscure, having to do with a dynamic deleter vs a static deleter. The precise reasons aren't important. In fact, in most code it isn't really important for you to know exactly where a complete type is required. Just code, and if you get it wrong, the compiler will tell you.
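
A minimal sketch of the shared_ptr side of this difference, assuming a single file that stands in for a header plus a .cpp file. SharedHolder, drop, make_holder and live_impl_count are hypothetical names. The shared_ptr is destroyed and reset while Impl is still incomplete, which is fine because it captured its deleter when it was constructed, at a point where Impl was complete.

```cpp
#include <cassert>
#include <memory>

// ---- what a header would contain: Impl is only forward declared ----
class Impl;

struct SharedHolder {
    std::shared_ptr<Impl> ptr;  // implicit destructor is fine for shared_ptr
};

// reset() on a shared_ptr is allowed while Impl is still incomplete.
inline void drop(SharedHolder& h) {
    h.ptr.reset();
    assert(!h.ptr);
}

SharedHolder make_holder();
int live_impl_count();

// ---- what the .cpp file would contain: Impl is complete from here on ----
namespace { int g_live = 0; }

class Impl {
public:
    Impl() { ++g_live; }
    ~Impl() { --g_live; }
};

// Construction from a new object needs the complete type.
SharedHolder make_holder() { return SharedHolder{std::make_shared<Impl>()}; }
int live_impl_count() { return g_live; }
```

Replacing std::shared_ptr with std::unique_ptr in SharedHolder would turn the reset() call in drop into a compile-time error, because unique_ptr invokes its deleter at that point and Impl is not yet complete there.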

However, in case it is helpful to you, here is a table which documents several members of shared_ptr and unique_ptr with respect to completeness requirements. If the member requires a complete type, then the entry has a "C", otherwise the table entry is filled with "I". I've also added a column for noexcept (True or False) for each operation.

Complete type requirements for unique_ptr and shared_ptr
                                        unique_ptr        shared_ptr
Member                                  I/C   noexcept    I/C   noexcept
P() (default constructor)               I     T           I     T
P(const P&) (copy constructor)          N/A   N/A         I     T
P(P&&) (move constructor)               I     T           I     T
~P() (destructor)                       C     T           I     T
P(A*)                                   I     T           C     F
operator=(const P&) (copy assignment)   N/A   N/A         I     T
operator=(P&&) (move assignment)        C     T           I     T
reset()                                 C     T           I     T
reset(A*)                               C     T           C     F

Any operations requiring pointer conversions require complete types for both unique_ptr and shared_ptr.
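
Some of the noexcept entries in the table can be checked with standard type traits and the noexcept operator. The following is a sketch with int as the element type; the aliases U and S and the constant names are chosen only for this illustration. The completeness column cannot be checked this way, since violating it is a compile-time error.

```cpp
#include <memory>
#include <type_traits>
#include <utility>

using U = std::unique_ptr<int>;
using S = std::shared_ptr<int>;

// Copy construction: N/A for unique_ptr, noexcept for shared_ptr.
constexpr bool unique_is_copyable = std::is_copy_constructible<U>::value;
constexpr bool shared_copy_is_noexcept =
    std::is_nothrow_copy_constructible<S>::value;

// Move construction and move assignment are noexcept for both.
constexpr bool moves_are_noexcept =
    std::is_nothrow_move_constructible<U>::value &&
    std::is_nothrow_move_assignable<U>::value &&
    std::is_nothrow_move_constructible<S>::value &&
    std::is_nothrow_move_assignable<S>::value;

// P(A*): noexcept for unique_ptr, potentially throwing for shared_ptr,
// which has to allocate a control block.
constexpr bool unique_from_raw_is_noexcept = noexcept(U(std::declval<int*>()));
constexpr bool shared_from_raw_is_noexcept = noexcept(S(std::declval<int*>()));

// reset(A*): same pattern as P(A*).
constexpr bool unique_reset_raw_is_noexcept =
    noexcept(std::declval<U&>().reset(std::declval<int*>()));
constexpr bool shared_reset_raw_is_noexcept =
    noexcept(std::declval<S&>().reset(std::declval<int*>()));
```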

 

https://chrizog.com/cpp-pimpl-unique-ptr-incomplete-types-default-constructor

C++ PImpl pattern with std::unique_ptr, incomplete types and default constructors

This article is about a common compilation error I came across multiple times when using the PImpl ("Pointer to Implementation") idiom. It arises when you use a std::unique_ptr as the pointer to the implementation (= "impl") class and are not aware of the compiler-generated default constructor and destructor. In this post I'll try to give a better understanding of the PImpl idiom with a smart pointer, and of generated default constructors, by explaining the solution to the error.

PImpl Pattern in C++ - Why and how?

Which problem does the PImpl pattern solve?

To understand the problem that the PImpl idiom in C++ solves, consider a simple example. There is a class Car which uses the class SteeringWheel: Car has a private member of type SteeringWheel. The Car class is used in the main function, i.e. the main function is a client of Car.

The file structure is as follows: steering_wheel.h, steering_wheel.cpp, car.h, car.cpp and main.cpp.

The corresponding code of the different files:

// File steering_wheel.h
#pragma once 

class SteeringWheel {
public:
  void steer(const int angle);

private:
  int angle_{0};
};

// File steering_wheel.cpp
#include "steering_wheel.h"

void SteeringWheel::steer(const int angle) { angle_ = angle; }

// File car.h
#pragma once

#include "steering_wheel.h"

class Car {
public:
  void drive();

private:
  SteeringWheel wheel_;
};

// File car.cpp
#include <iostream>
#include "car.h"

void Car::drive() { std::cout << "I am driving!" << std::endl; }

// File main.cpp
#include "car.h"

int main() {
  Car c;
  c.drive();
  return 0;
}

In this example main.cpp has a compilation dependency on the SteeringWheel class, although main should only be a client of Car and not of SteeringWheel. Ideally the SteeringWheel is hidden from the client of Car. When the SteeringWheel class changes, e.g. there is a new private member in SteeringWheel (let's say the radius of the steering wheel), then the size and memory layout of the SteeringWheel class change. From this it follows that car.cpp has to be recompiled because car.h includes steering_wheel.h, which has changed (new private member). Hence main.cpp also has to be recompiled because it includes car.h.

Now imagine that Car and SteeringWheel are part of a separate "Car" library and the class Car has many clients. All clients of Car need to be recompiled now although only the internal SteeringWheel implementation changed.
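
The layout dependency described above can be made visible with sizeof. This is a sketch only: SteeringWheelV1 and SteeringWheelV2 are hypothetical stand-ins for steering_wheel.h before and after the new member is added, and CarWithMember is a hypothetical template standing in for a Car that stores the wheel by value.

```cpp
// Hypothetical stand-ins for two revisions of steering_wheel.h.
struct SteeringWheelV1 { int angle_; };
struct SteeringWheelV2 { int angle_; int radius_; };  // revision with the new member

// A Car that stores the wheel by value, as in the car.h shown above.
template <class Wheel>
struct CarWithMember { Wheel wheel_; };

// The size of Car follows the size of the wheel, so every client that
// creates a Car has to be recompiled when the wheel changes.
constexpr bool car_layout_changed =
    sizeof(CarWithMember<SteeringWheelV2>) != sizeof(CarWithMember<SteeringWheelV1>);
```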

Here is the output of make to show the needed recompilation which is described above. In the 3rd step you can see that in case steering_wheel.h was modified all .cpp files need to be recompiled.

# Initial make: All .cpp files are compiled
❯ make        
[ 25%] Building CXX object CMakeFiles/main.dir/car.cpp.o
[ 50%] Building CXX object CMakeFiles/main.dir/steering_wheel.cpp.o
[ 75%] Building CXX object CMakeFiles/main.dir/main.cpp.o
[100%] Linking CXX executable main
[100%] Built target main

# Change nothing and call make again -> Nothing is recompiled since there was no change
❯ make
Consolidate compiler generated dependencies of target main
[100%] Built target main

# Change steering_wheel.h and add new private member "int radius" -> all .cpp files need to be recompiled
❯ make
[ 25%] Building CXX object CMakeFiles/main.dir/car.cpp.o
[ 50%] Building CXX object CMakeFiles/main.dir/steering_wheel.cpp.o
[ 75%] Building CXX object CMakeFiles/main.dir/main.cpp.o
[100%] Linking CXX executable main
[100%] Built target main

Reducing compilation dependencies with PImpl

PImpl is a typical technique to reduce the shown compilation dependency. The PImpl idiom aims to have a stable ABI and to reduce the compilation time.

Here are the steps that are taken when implementing the PImpl pattern:

  1. With PImpl the implementation details (the private section) of the Car class are placed in a separate class called "CarImpl". So the private section of Car is cleared.
  2. The #include of steering_wheel.h can be removed since the implementation details are now placed in a separate class.
  3. The new class CarImpl is forward declared inside the Car class and used via a pointer. Since we don't want to use raw pointers std::unique_ptr is used.
  4. The CarImpl class declaration and implementation are put into car.cpp.

After implementing the PImpl idiom there is an additional CarImpl class, which now contains the SteeringWheel instance. Since we do not want to use plain pointers but a smart unique pointer, the Car class has a private std::unique_ptr to the CarImpl class.

The file structure stays the same as before. However, the includes now change. The CarImpl class is put into the car.cpp implementation. Only car.cpp now includes steering_wheel.h, which is the key to reducing the compilation dependency.
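
Why clients no longer depend on the implementation can be sketched in a single file. Engine and its members are hypothetical names used only here. The size of the outer class is fixed before its Impl class is even defined, because it holds nothing but a pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// ---- what clients see (hypothetical engine.h) ----
class Engine {
public:
    Engine();
    ~Engine();
    int cylinders() const;

private:
    class Impl;                   // incomplete for clients
    std::unique_ptr<Impl> impl_;  // only a pointer is stored
};

// The size is already known here, although Impl has not been defined.
constexpr std::size_t size_seen_by_clients = sizeof(Engine);

// ---- what only the implementation file sees (hypothetical engine.cpp) ----
class Engine::Impl {
public:
    int cylinders = 4;
    double displacement = 2.0;    // adding members here does not change sizeof(Engine)
};

Engine::Engine() : impl_(new Impl) {}
Engine::~Engine() = default;
int Engine::cylinders() const {
    assert(impl_);
    return impl_->cylinders;
}
```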

The code of car.cpp and car.h looks different now. car.cpp contains the class CarImpl and inside Car::drive() the CarImpl's drive function is called.

// File car.h
#pragma once

#include <memory>

class Car {
public:
  void drive();

private:
  class CarImpl;
  std::unique_ptr<CarImpl> car_impl_;
};

// File car.cpp
#include "car.h"
#include "steering_wheel.h"
#include <iostream>

class Car::CarImpl {
public:
  void drive() { std::cout << "I am driving!" << std::endl; }

private:
  SteeringWheel wheel_;
};

void Car::drive() { car_impl_->drive(); }

// File main.cpp
#include "car.h"

int main() {
  Car c;
  c.drive();
  return 0;
}

Common compilation error

When you compile this you will get an error like this:

...
In file included from /opt/compiler-explorer/gcc-12.2.0/include/c++/12.2.0/memory:76,
                 from /app/car.h:3,
                 from /app/main.cpp:1:
/opt/compiler-explorer/gcc-12.2.0/include/c++/12.2.0/bits/unique_ptr.h: In instantiation of 'void std::default_delete<_Tp>::operator()(_Tp*) const [with _Tp = Car::CarImpl]':
/opt/compiler-explorer/gcc-12.2.0/include/c++/12.2.0/bits/unique_ptr.h:396:17:   required from 'std::unique_ptr<_Tp, _Dp>::~unique_ptr() [with _Tp = Car::CarImpl; _Dp = std::default_delete<Car::CarImpl>]'
car.h:5:7:   required from here
/opt/compiler-explorer/gcc-12.2.0/include/c++/12.2.0/bits/unique_ptr.h:93:23: error: invalid application of 'sizeof' to incomplete type 'Car::CarImpl'
   93 |         static_assert(sizeof(_Tp)>0,
      |                       ^~~~~~~~~~~

Incomplete types

What can be found out when analyzing the error:

First Car::CarImpl seems to be an incomplete type in car.h. After reading the cppreference.com page about types it's clear that an incomplete type is for example a "class type that has been declared (e.g. by forward declaration) but not defined" which is the case here since CarImpl is forward declared in car.h.

After that on the cppreference.com page for std::unique_ptr we find something about incomplete types as well:

"If the default deleter is used, T must be complete at the point in code where the deleter is invoked, which happens in the destructor, move assignment operator, and reset member function of std::unique_ptr."

-- cppreference.com

We are using the default deleter but we are not calling the destructor explicitly so apparently there has to be some code generated by the compiler where it is called.

The second point is that when you look up the reported line of the static_assert in unique_ptr.h you will find that the check for incomplete types sits in the call operator (operator()) of std::default_delete, which the error message shows being instantiated from the destructor of std::unique_ptr. Again we are not calling any destructor explicitly, so some code seems to be generated by the compiler.
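
The mechanism behind this check can be sketched with a custom deleter. CheckedDelete, Tracked and live_after_scope are hypothetical names; this mirrors the static_assert visible in the error message but is not the library's actual source.

```cpp
#include <cassert>
#include <memory>

// Hypothetical deleter: sizeof(T) is ill-formed for an incomplete type, so
// instantiating operator() for one fails to compile instead of invoking
// undefined behavior at run time.
template <class T>
struct CheckedDelete {
    void operator()(T* p) const noexcept {
        static_assert(sizeof(T) > 0, "cannot delete an incomplete type");
        delete p;
    }
};

// Small helper type that counts how many instances are alive.
struct Tracked {
    explicit Tracked(int& counter) : counter_(counter) { ++counter_; }
    ~Tracked() { --counter_; }
    int& counter_;
};

int live_after_scope() {
    int live = 0;
    {
        std::unique_ptr<Tracked, CheckedDelete<Tracked>> p(new Tracked(live));
        assert(live == 1);
    }  // the deleter runs here; Tracked is complete, so this compiles
    return live;
}
```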

Implicitly-defined default constructors and destructors

Again on cppreference.com we can study the compiler-generated constructors and destructors. The following interesting information is found: if you define a class in C++ and do not declare any constructor yourself, the compiler implicitly declares a default constructor; likewise, if you do not declare a destructor, the compiler implicitly declares one. This can be very convenient for a programmer, but it can also cause issues. For a class whose members are all trivially constructible and destructible, these generated functions are trivial and do nothing. An example is a POD (plain old data) class like the following, which is not what we have in the Car example because Car has a std::unique_ptr member:

class X {
  public:
    int a, b, c;
};

That means that for Car the compiler will implicitly declare a default constructor and a destructor in car.h. Each of them will always be declared "as an inline public member of its class". Since they are inline members, their definitions are generated in every translation unit that uses them, here main.cpp, which only sees car.h. The generated constructor will then call the constructor of the base class and the default constructors of all non-static members, and the generated destructor will call the destructors of all non-static members and base classes.
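
The difference between a class like X and a class holding a std::unique_ptr can be checked with type traits. In this sketch PlainData and OwnsPointer are hypothetical names; neither declares a constructor or destructor, yet only the first gets trivial ones.

```cpp
#include <memory>
#include <type_traits>

struct PlainData { int a, b, c; };                  // like class X above
struct OwnsPointer { std::unique_ptr<int> ptr; };   // like Car with its unique_ptr member

// Both get an implicitly declared default constructor.
constexpr bool both_default_constructible =
    std::is_default_constructible<PlainData>::value &&
    std::is_default_constructible<OwnsPointer>::value;

// Only PlainData's generated functions are trivial, i.e. require no code.
constexpr bool plain_is_trivial =
    std::is_trivially_default_constructible<PlainData>::value &&
    std::is_trivially_destructible<PlainData>::value;

// OwnsPointer's generated functions must call into unique_ptr.
constexpr bool owner_ctor_is_trivial =
    std::is_trivially_default_constructible<OwnsPointer>::value;
constexpr bool owner_dtor_is_trivial =
    std::is_trivially_destructible<OwnsPointer>::value;
```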

If you want to investigate compiler-generated constructors and destructors yourself, I highly recommend using the compiler explorer at godbolt.org: implement some simple classes and have a look at the generated assembly.

I tweaked the example to compile and the following snippet of assembler shows that the Car constructor and Car destructor call the std::unique_ptr<CarImpl> constructor and destructor.

Car::Car():
..
 mov    rax,QWORD PTR [rbp-0x8]
 mov    rdi,rax
 call   4012b8 <std::unique_ptr<Car::CarImpl, std::default_delete<Car::CarImpl> >::unique_ptr<std::default_delete<Car::CarImpl>, void>()>
...
Car::~Car():
...
 mov    rax,QWORD PTR [rbp-0x8]
 mov    rdi,rax
 call   4012de <std::unique_ptr<Car::CarImpl, std::default_delete<Car::CarImpl> >::~unique_ptr()>
...

Working PImpl with std::unique_ptr

We figured out the issue is that the compiler generates the constructor and destructor as public inline members and therefore they are also defined in car.h. However, in car.h CarImpl is an incomplete type.

The solution is to

  1. declare the constructor and destructor by yourself so it's not generated as "public inline" by the compiler
  2. and put the definition into the implementation file of Car, i.e. into car.cpp so the constructor and destructor of the private members are not called in the header file.

Calling the constructor and destructor of std::unique_ptr<CarImpl> is then done in the implementation in car.cpp where CarImpl is a complete type since CarImpl is defined in car.cpp.

// File car.h
#pragma once

#include <memory>

class Car {
public:
  Car();
  ~Car();
  void drive();

private:
  class CarImpl;
  std::unique_ptr<CarImpl> car_impl_;
};


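// File car.cpp
...
// Create the implementation object. A defaulted constructor would leave
// car_impl_ null, and Car::drive() would then dereference a null pointer.
Car::Car() : car_impl_(std::make_unique<CarImpl>()) {}
Car::~Car() = default;

void Car::drive() { car_impl_->drive(); }

The constructor allocates the CarImpl object (std::make_unique requires C++14). For the definition of the destructor in car.cpp we can either use the convenient "default" keyword or define an empty function body. What matters is that both definitions appear after the definition of CarImpl.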
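
The same rule applies to move operations: declaring a destructor suppresses the implicit move constructor and move assignment, and move assignment of a std::unique_ptr needs the complete type. A sketch of a movable PImpl class follows, with Gadget as a hypothetical name and both "files" shown in one listing.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// ---- hypothetical gadget.h ----
class Gadget {
public:
    Gadget();
    ~Gadget();
    Gadget(Gadget&&) noexcept;             // declared only
    Gadget& operator=(Gadget&&) noexcept;  // declared only
    int id() const;

private:
    class Impl;
    std::unique_ptr<Impl> impl_;
};

// ---- hypothetical gadget.cpp ----
class Gadget::Impl {
public:
    int id = 7;
};

Gadget::Gadget() : impl_(new Impl) {}
// Defaulted here, where Impl is complete.
Gadget::~Gadget() = default;
Gadget::Gadget(Gadget&&) noexcept = default;
Gadget& Gadget::operator=(Gadget&&) noexcept = default;

int Gadget::id() const {
    assert(impl_ && "id() called on a moved-from Gadget");
    return impl_->id;
}
```

A moved-from Gadget holds a null pointer, so in this sketch it may only be destroyed or assigned to.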

Summary and testing the benefit of the PImpl idiom

As a last step we can now check whether the PImpl idiom works. When the SteeringWheel class is changed, it should not be necessary to recompile the client of Car (main.cpp). First everything is compiled initially, and then steering_wheel.h is changed by adding a new private member:

# Initial make: All .cpp files are compiled
❯ make
Consolidate compiler generated dependencies of target main
[ 25%] Building CXX object CMakeFiles/main.dir/car.cpp.o
[ 50%] Building CXX object CMakeFiles/main.dir/main.cpp.o
[ 75%] Linking CXX executable main
[100%] Built target main

# Change nothing and call make again -> Nothing is recompiled since there was no change
❯ make
Consolidate compiler generated dependencies of target main
[100%] Built target main

# Change steering_wheel.h and add new private member -> main.cpp is not recompiled
❯ make
[ 25%] Building CXX object CMakeFiles/main.dir/car.cpp.o
[ 50%] Building CXX object CMakeFiles/main.dir/steering_wheel.cpp.o
[ 75%] Linking CXX executable main
[100%] Built target main

The output of make shows that after the change main.cpp, which contains the main function (the client of Car), is not recompiled. This might not be a big improvement for this small example, but it can have a very big impact if you use the PImpl idiom for libraries with many transitive includes, or when the client of the library itself takes a long time to compile and would otherwise always need to be recompiled.

Although the actual bug fix was a small one, namely adding the definition of the constructor and destructor to the implementation file, this blog post aims to explain the background of the error and to raise awareness of compiler-generated methods. This can greatly help in figuring out the causes of other compiler errors, too.

Source Code

The full source code to this example is also available here in the compiler explorer on godbolt.org.
