Python Magic Methods Guide: Basics and Comparison

A Complete Guide to Magic Methods

Magic methods are officially called special methods. They are how Python lets you customize a class: as the name suggests, they are special methods defined inside the class.

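What sets special methods apart is their names, which begin and end with two underscores. For that reason they are also called dunder methods (short for "double underscore").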

Basics
Comparison
Attributes
Class construction
Operators
Emulation

===============================================================================================================================

1. Basics

__new__ and __init__
__del__
__repr__ and __str__
__format__
__bytes__

The __new__ and __init__ methods

class A:
    def __new__(cls):
        print("__new__")
        return super().__new__(cls)

    def __init__(self):
        print("__init__")


o = A()
# obj = __new__(A)
# __init__(obj)

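Difference: __new__ creates an object from a class, while __init__ initializes that object once it exists.
How to think about it: roughly speaking, when we write o = A(), Python first passes the class A as an argument to __new__, which returns an object, and then passes that object to __init__.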

If we pass arguments when creating the object, those arguments are passed to both __new__ and __init__.

class A:
    def __new__(cls, x):
        print("__new__")
        return super().__new__(cls)

    def __init__(self, x):
        self.x = x
        print("__init__")


o = A(1)
# obj = __new__(A, 1)
# __init__(obj, 1)

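When would you use __new__? One example is a singleton class: before creating an object of the class, you check whether one has already been created, and if so you do not create a new one. Here we are customizing how the object is created, which is exactly what __new__ is for. Some metaclass-related techniques also use __new__. In short, if you do not know when you would need __new__, you do not need it.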
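The singleton idea described above can be sketched as follows. This is a minimal illustration, not the only way to write a singleton; the class name Singleton and the _instance attribute are hypothetical names chosen for this example.

```python
class Singleton:
    _instance = None  # hypothetical cache for the one shared object

    def __new__(cls):
        # Create the object only on the first call; reuse it afterwards.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance


a = Singleton()
b = Singleton()
print(a is b)  # True
```

Note that if the class also defined __init__, it would still run on every Singleton() call, because __new__ keeps returning an instance of the class.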

Because __new__ creates the object, it must return it, so __new__ has a return value. __init__ does not: it must return None. The self passed to __init__ is the object being initialized, and __init__ essentially just operates on self.
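A small sketch of what happens if __init__ does return a value. The class name Bad is hypothetical, and the exact wording of the error message may differ between Python versions.

```python
class Bad:
    def __init__(self):
        return 1  # not allowed: __init__ must return None


try:
    Bad()
except TypeError as exc:
    message = str(exc)
    print(message)
```

Python raises TypeError as soon as the object is created, so this kind of mistake is caught early.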

The __del__ method

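You can loosely think of __del__ as a destructor, although strictly speaking it is not one. __del__ lets you do something when the object is released. It is not as useful as it sounds, because releasing an object in Python is a fairly complicated process. An object may be released when its reference count drops to zero, or during garbage collection, and this can happen almost anywhere: while a bytecode instruction is running, or when the Python interpreter finally shuts down. Because Python's reference system is fairly complex, this method is hard to control.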

class A:
    def __del__(self):
        print("__del__")


o = A()
# The magic method __del__ is unrelated to the del keyword
del o

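One thing to stress: the special method __del__ has nothing to do with Python's del keyword. Running del o does not necessarily trigger __del__. del o only removes one reference to the object; it says that the local variable o is no longer needed. Here is a simple example.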

We create an object o, bind x = o, run del o, and then print "finish". The output shows "finish" first and "__del__" second, so del o did not trigger __del__. That is because when del o runs, the object still has another reference, x.
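The scenario just described can be sketched like this. It is a variant of the example rather than the original listing: it records events in a list (a hypothetical name, events) and drops the last reference explicitly with del x, and it relies on CPython's reference counting to run __del__ at that point.

```python
events = []


class A:
    def __del__(self):
        events.append("__del__")


o = A()
x = o                    # second reference to the same object
del o                    # removes the name o; the object is still reachable via x
events.append("finish")
del x                    # last reference gone; CPython calls __del__ here
print(events)            # ['finish', '__del__']
```

On other Python implementations the call to __del__ may be delayed until a later garbage collection.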

The __repr__ and __str__ methods

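repr stands for representation and str for string. The two methods do similar jobs: both return a string representation of the object. The difference is mainly semantic. __str__ generally returns a string that is easier for people to read and emphasizes readability, while __repr__ generally returns more detailed information. When both are defined, print uses __str__.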

We can call the representation method with the builtin repr function.

class A:
    def __repr__(self):
        return "<A repr>"

    def __str__(self):
        return "<A str>"


print(A())  # prints <A str>

print(repr(A()))    # prints <A repr>
print(str(A()))     # prints <A str>

If you do not need the short version, you can define only __repr__. When __str__ is not defined, printing the object or converting it to a string automatically calls __repr__.

class A:
    def __repr__(self):
        return "<A>"

    # def __str__(self):
    #    return "<A str>"

print(repr(A()))    # prints <A>
print(str(A()))     # prints <A>
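As a side note to the examples above, here is a short sketch of two places where __repr__ is used even though __str__ is defined: inside containers, and with the !r conversion in an f-string. The variable names are chosen for this illustration only.

```python
class A:
    def __repr__(self):
        return "<A repr>"

    def __str__(self):
        return "<A str>"


single = str(A())        # str() on the object itself uses __str__
in_list = str([A()])     # a list formats its elements with __repr__
forced = f"{A()!r}"      # the !r conversion asks for __repr__ explicitly
print(single, in_list, forced)  # <A str> [<A repr>] <A repr>
```

This is one reason a readable __repr__ is worth having: it is what you see when objects are printed as part of a list or dict.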

The __format__ method

When you print the object using some format specification, Python may call __format__.

class A:
    def __format__(self, format_spec):
        if format_spec == "x":
            return "0xA"
        return "<A>"


print(f"{A()}")    # prints <A>

print("{}".format(15))      # decimal
print("{:b}".format(15))    # binary
print("{:x}".format(15))    # hexadecimal
# f-strings give the same result
print(f"{15}")
print(f"{15:b}")
print(f"{15:x}")

__format__ gives you a hook for handling the format specification. For example, here we check whether format_spec is x, so when we print an A object with the x format, it prints a different string.

class A:
    def __format__(self, format_spec):
        if format_spec == "x":
            return "0xA"
        return "<A>"


print(f"{A()}")     # prints <A>
print(f"{A():x}")   # prints 0xA
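Beyond checking for one fixed spec, a common pattern is to hand format_spec on to the builtin format function. The sketch below assumes a hypothetical Celsius class that wraps a number; it is an illustration of the hook, not a recommended design.

```python
class Celsius:  # hypothetical class for this example
    def __init__(self, degrees):
        self.degrees = degrees

    def __format__(self, format_spec):
        # Delegate the spec (for example ".1f") to the wrapped number,
        # then append a unit suffix.
        return format(self.degrees, format_spec) + " C"


text = f"{Celsius(21.456):.1f}"
print(text)  # 21.5 C
```

Delegating this way means the class supports every spec that the wrapped value supports, without listing them one by one.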

The __bytes__ method

This method defines what is returned when you try to build a bytes object from your own class, that is, when you call bytes() on an A object as shown below.

Unless you need to customize the bytes representation of the object, you will not need this method.

class A:
    def __bytes__(self):
        print("__bytes__ called")
        return bytes([0, 1])


print(bytes(A()))
# Output
# __bytes__ called
# b'\x00\x01'

===============================================================================================================================

2. Comparison

__eq__ and __ne__
__gt__ and __lt__
__ge__ and __le__
__hash__
__bool__

The __eq__ method

In Python we often compare two data structures to see whether they are equal, or which one is larger.

print([1, 2] == [1, 2])    # True   compares lists
print(2 > 6)               # False  compares integers
print("a" <= "b")          # True   compares strings

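Python calls these equality and ordering comparisons rich comparisons. There are six rich comparison operators: equal, not equal, greater than, less than, greater than or equal, and less than or equal.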

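For Python's built-in data structures such as dict and list, and especially int, str and float, rich comparison is well defined. Sometimes we want to use these comparison operators with our own data structures too, and magic methods are how we implement the comparison logic.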

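For example, suppose we define a Date class holding a year, month and day. If we create two identical dates and compare them with ==, the output is False. When a class does not define its own comparison logic, Python's default equality check uses is, so printing x == y here is essentially printing x is y.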

class Date:
    def __init__(self, year, month, date):
        self.year = year
        self.month = month
        self.date = date


x = Date(2022, 2, 22)
y = Date(2022, 2, 22)

# print(x is y)
print(x == y)  # False

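We can change the comparison logic by defining an __eq__ method (eq stands for equal). Here __eq__ returns whether the two objects have the same year, month and date. With this magic method in place, x == y prints True, and if we change y it prints False.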

class Date:
    def __init__(self, year, month, date):
        self.year = year
        self.month = month
        self.date = date

    def __eq__(self, other):
        return (self.year == other.year and
                self.month == other.month and
                self.date == other.date)


x = Date(2022, 2, 22)
y = Date(2022, 2, 22)

print(x == y)  # True

Note that although in theory these comparison operators should return a boolean, True or False, in practice you can return anything. Here we return the string 'abc', and that is what gets printed.

class Date:
    def __init__(self, year, month, date):
        self.year = year
        self.month = month
        self.date = date

    def __eq__(self, other):
        return "abc"


x = Date(2022, 2, 22)
y = Date(2022, 2, 23)

print(x == y)  # abc

Sometimes you can use this for tricks. For example, when comparing two vectors, the return value can be a vector of booleans.
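The vector trick mentioned above might look like the following sketch. The Vector class is hypothetical and uses a plain list of booleans as the result; libraries that offer element-wise comparison have their own result types.

```python
class Vector:  # hypothetical class for this example
    def __init__(self, *values):
        self.values = list(values)

    def __eq__(self, other):
        # Return one boolean per position instead of a single bool.
        return [a == b for a, b in zip(self.values, other.values)]


result = Vector(1, 2, 3) == Vector(1, 5, 3)
print(result)  # [True, False, True]
```

A caveat with this design: the result is a non-empty list, which is always truthy, so writing if v1 == v2 no longer means what it usually does.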

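By default, if Python does not find a separately defined not-equal method, it negates the result of the equal method. Below, __eq__ still runs when we evaluate !=, and the result is correct. So in many common cases defining __eq__ alone is enough.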

class Date:
    def __init__(self, year, month, date):
        self.year = year
        self.month = month
        self.date = date

    def __eq__(self, other):
        print("__eq__")
        return (self.year == other.year and
                self.month == other.month and
                self.date == other.date)


x = Date(2022, 2, 22)
y = Date(2022, 2, 23)

print(x != y)  
# __eq__
# True

The __ne__ method

We can also define the magic method for not-equal, which is called __ne__ (not equal). Once __ne__ is defined, evaluating != calls only __ne__; Python no longer calls __eq__ and negates it.

class Date:
    def __init__(self, year, month, date):
        self.year = year
        self.month = month
        self.date = date

    def __eq__(self, other):
        print("__eq__")
        return (self.year == other.year and
                self.month == other.month and
                self.date == other.date)

    def __ne__(self, other):
        print("__ne__")
        return (self.year != other.year or
                self.month != other.month or
                self.date != other.date)


x = Date(2022, 2, 22)
y = Date(2022, 2, 23)

# x.__ne__(y)
print(x != y)
# __ne__
# True

Notice that both __eq__ and __ne__ take two arguments, self and other. Both are binary operations, meaning they need two objects. Writing x != y is essentially x.__ne__(y): inside __ne__, x is passed as self and y as other.

For equal and not equal, most data structures we meet are symmetric: intuitively, if a equals b then b equals a, and if a does not equal b then b does not equal a.

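Next, consider operators that are not symmetric, such as greater than and less than. As mentioned, equal and not equal have default implementations that fall back to is if you do not write your own. If we try to compare the order of two objects, however, Python raises an error saying that the operator is not supported for this data structure. We have to define the logic for greater than and less than ourselves.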

print(x > y)

TypeError: '>' not supported between instances of 'Date' and 'Date'

The __gt__ method

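The method that implements the greater-than operator is __gt__ (greater than). Here is a simple implementation that compares year, then month, then date. After implementing __gt__ we can use the > operator. Surprisingly, we can also use < even though we only implemented __gt__. Internally, Python treats > and < as a pair: unless told otherwise, x < y means y > x. So although Python finds no less-than implementation on x, it does find a greater-than implementation on y, since x and y are the same kind of object. When we evaluate x < y, we are really calling y.__gt__(x).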

class Date:
    def __init__(self, year, month, date):
        self.year = year
        self.month = month
        self.date = date

    def __eq__(self, other):
        print("__eq__")
        return (self.year == other.year and
                self.month == other.month and
                self.date == other.date)

    def __ne__(self, other):
        print("__ne__")
        return (self.year != other.year or
                self.month != other.month or
                self.date != other.date)

    def __gt__(self, other):
        if self.year > other.year:
            return True
        if self.year == other.year:
            if self.month > other.month:
                return True
            if self.month == other.month:
                return self.date > other.date
        return False


x = Date(2022, 2, 22)
y = Date(2022, 2, 23)


# y.__gt__(x)
print(x < y)  # True

The __lt__ method

The magic method for the less-than operator is __lt__ (less than). If we implement both __gt__ and __lt__, evaluating x < y calls __lt__ first.

class Date:
    def __init__(self, year, month, date):
        self.year = year
        self.month = month
        self.date = date

    def __eq__(self, other):
        print("__eq__")
        return (self.year == other.year and
                self.month == other.month and
                self.date == other.date)

    def __ne__(self, other):
        print("__ne__")
        return (self.year != other.year or
                self.month != other.month or
                self.date != other.date)

    def __gt__(self, other):
        print("__gt__")
        if self.year > other.year:
            return True
        if self.year == other.year:
            if self.month > other.month:
                return True
            if self.month == other.month:
                return self.date > other.date
        return False

    def __lt__(self, other):
        print("__lt__")
        if self.year < other.year:
            return True
        if self.year == other.year:
            if self.month < other.month:
                return True
            if self.month == other.month:
                return self.date < other.date
        return False


x = Date(2022, 2, 22)
y = Date(2022, 2, 23)


print(x < y)
# __lt__
# True

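When I write x < y, semantically it means both that x is less than y and that y is greater than x. How does Python decide whether to use __gt__ or __lt__? You might naively assume that since the operator is <, it must use __lt__, but that is not always true. Let us create a class NewDate that inherits from Date, make x a Date object and y a NewDate object. Now x < y calls __gt__: Python checks whether y is greater than x, not whether x is less than y.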

class Date:
    def __init__(self, year, month, date):
        self.year = year
        self.month = month
        self.date = date

    def __eq__(self, other):
        print("__eq__")
        return (self.year == other.year and
                self.month == other.month and
                self.date == other.date)

    def __ne__(self, other):
        print("__ne__")
        return (self.year != other.year or
                self.month != other.month or
                self.date != other.date)

    def __gt__(self, other):
        print("__gt__")
        if self.year > other.year:
            return True
        if self.year == other.year:
            if self.month > other.month:
                return True
            if self.month == other.month:
                return self.date > other.date
        return False

    def __lt__(self, other):
        print("__lt__")
        if self.year < other.year:
            return True
        if self.year == other.year:
            if self.month < other.month:
                return True
            if self.month == other.month:
                return self.date < other.date
        return False

class NewDate(Date):
    pass
    
x = Date(2022, 2, 22)
y = NewDate(2022, 2, 23)

print(x < y)
# __gt__
# True

In fact, if we print self and other in order inside __eq__, then for x == y it prints y first and x second. What runs here is y.__eq__(x), not x.__eq__(y).

class Date:
    def __init__(self, year, month, date):
        self.year = year
        self.month = month
        self.date = date

    def __str__(self):
        return f"{self.year}/{self.month}/{self.date}"

    def __eq__(self, other):
        print("__eq__")
        print(self, other)
        return (self.year == other.year and
                self.month == other.month and
                self.date == other.date)

    def __ne__(self, other):
        print("__ne__")
        return (self.year != other.year or
                self.month != other.month or
                self.date != other.date)

    def __gt__(self, other):
        print("__gt__")
        if self.year > other.year:
            return True
        if self.year == other.year:
            if self.month > other.month:
                return True
            if self.month == other.month:
                return self.date > other.date
        return False

    def __lt__(self, other):
        print("__lt__")
        if self.year < other.year:
            return True
        if self.year == other.year:
            if self.month < other.month:
                return True
            if self.month == other.month:
                return self.date < other.date
        return False


class NewDate(Date):
    pass


x = Date(2022, 2, 22)
y = NewDate(2022, 2, 23)


# y.__eq__(x)
print(x == y)

__eq__
2022/2/23 2022/2/22
False

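If y is a Date object like x, the order is reversed and x.__eq__(y) is used. In other words, when doing a rich comparison where x and y are not objects of the same class, the rule is: if y's class is a subclass of x's class, y's rich comparison method takes priority; otherwise x's does. In most cases Python prefers the class on the left of the operator, and only looks for the corresponding method on the right-hand class if the left one does not provide it. The exception is when the right-hand class is a subclass of the left-hand class, in which case the right-hand class's method is tried first.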

x = Date(2022, 2, 22)
y = Date(2022, 2, 23)


# x.__eq__(y)
print(x == y)

__eq__
2022/2/22 2022/2/23
False
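The fallback from the left operand to the right operand described above can be observed with a small sketch. The classes Left and Right and the calls list are hypothetical names for this illustration. Here the left class signals that it cannot handle the comparison by returning the builtin NotImplemented, after which Python tries the reflected method on the right operand.

```python
calls = []


class Left:
    def __lt__(self, other):
        calls.append("Left.__lt__")
        return NotImplemented  # signals: I cannot compare with this operand


class Right:  # unrelated to Left, so the left operand is tried first
    def __gt__(self, other):
        calls.append("Right.__gt__")
        return True


result = Left() < Right()
print(calls, result)  # ['Left.__lt__', 'Right.__gt__'] True
```

Returning NotImplemented for operand types a method does not understand is generally safer than raising an exception, because it gives the other operand a chance to answer.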

The __ge__ and __le__ methods

Finally, the magic methods for greater than or equal and less than or equal are __ge__ and __le__ (greater than or equal to, less than or equal to). Here we evaluate <= and see that __le__ is what runs.

***
    def __ge__(self, other):
        print("__ge__")
        if self.year > other.year:
            return True
        if self.year == other.year:
            if self.month > other.month:
                return True
            if self.month == other.month:
                return self.date >= other.date
        return False

    def __le__(self, other):
        print("__le__")
        if self.year < other.year:
            return True
        if self.year == other.year:
            if self.month < other.month:
                return True
            if self.month == other.month:
                return self.date <= other.date
        return False


class NewDate(Date):
    pass


x = Date(2022, 2, 22)
y = Date(2022, 2, 23)

print(x <= y)
# __le__
# True

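Be careful here: Python makes no inference about the <= operator. It does not assume that less-than-or-equal means less than or equal. If we comment out both __ge__ and __le__, the earlier < and == still work, but <= raises an error. This also means that in many custom data structures, <= is not equivalent to < or ==.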
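The behavior above can be checked with a short sketch. The classes Plain and Ordered are hypothetical. The second half uses functools.total_ordering from the standard library, which is not covered in this article; it is shown only as one optional way to have the missing operators filled in from __eq__ and one ordering method.

```python
from functools import total_ordering


class Plain:
    def __init__(self, n):
        self.n = n

    def __eq__(self, other):
        return self.n == other.n

    def __lt__(self, other):
        return self.n < other.n


try:
    Plain(1) <= Plain(2)
    plain_supports_le = True
except TypeError:
    # Python does not combine __lt__ and __eq__ into <= on its own.
    plain_supports_le = False


@total_ordering
class Ordered:
    def __init__(self, n):
        self.n = n

    def __eq__(self, other):
        return self.n == other.n

    def __lt__(self, other):
        return self.n < other.n


print(plain_supports_le)          # False
print(Ordered(1) <= Ordered(2))   # True
```

The decorator assumes that <= really does mean "less than or equal" for your class, which, as noted above, is not true for every data structure.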

The __hash__ method

We may want the hash value of a data structure. A custom data structure has a default hash algorithm, so hash(x) gives us its hash value directly.

class Date:
    def __init__(self, year, month, date):
        self.year = year
        self.month = month
        self.date = date

x = Date(2022, 2, 22)
print(hash(x))
# 8771107107837

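The most common use of hash is to put an object of the data structure into a hash table, such as a dict or a set, as a key. In the example below we build a dict called income that records how much was earned each day. The printed result looks fine.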

class Date:
    def __init__(self, year, month, date):
        self.year = year
        self.month = month
        self.date = date
    
    def __repr__(self):
        return f"{self.year}/{self.month}/{self.date}"

x = Date(2022, 2, 22)  
income = {}
income[x] = 1000
print(income)
# {2022/2/22: 1000}

But if we create two identical Dates, they become two separate keys in the dict, because they are not equal.

class Date:
    def __init__(self, year, month, date):
        self.year = year
        self.month = month
        self.date = date
    
    def __repr__(self):
        return f"{self.year}/{self.month}/{self.date}"

x = Date(2022, 2, 22)
y = Date(2022, 2, 22)  
income = {}
income[x] = 1000
income[y] = 1000
print(income)
# {2022/2/22: 1000, 2022/2/22: 1000}

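At this point you might think of bringing back the __eq__ method we defined earlier. If you add it, though, Python raises an error saying the type is unhashable, and you can no longer hash the object at all. Python gives every custom class a default __eq__ and a default hash function, but once you define your own __eq__, the default hash function is removed. If you still want hashing, you must define your own hash function. The reason is that the basic contract of hashing is that two equal objects must have equal hashes, and once you change the default equality, the default hash function is clearly no longer correct.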

class Date:
    def __init__(self, year, month, date):
        self.year = year
        self.month = month
        self.date = date
    
    def __repr__(self):
        return f"{self.year}/{self.month}/{self.date}"
    
    def __eq__(self, other):
        return (self.year == other.year and
                self.month == other.month and
                self.date == other.date)

x = Date(2022, 2, 22)
y = Date(2022, 2, 22)
# print(hash(x)) # hashing the object fails as well
income = {}
income[x] = 1000
income[y] = 1000
print(income)

TypeError: unhashable type: 'Date'

So if we still want to use objects of this class as dict keys, we must define our own hash function in addition to __eq__.

The magic method for hash is __hash__. After defining __hash__, our program works properly.

class Date:
    def __init__(self, year, month, date):
        self.year = year
        self.month = month
        self.date = date
    
    def __repr__(self):
        return f"{self.year}/{self.month}/{self.date}"
    
    def __eq__(self, other):
        return (self.year == other.year and
                self.month == other.month and
                self.date == other.date)
    
    def __hash__(self):
        return 2

x = Date(2022, 2, 22)
y = Date(2022, 2, 22)
income = {}
income[x] = 1000
income[y] = 1000
print(income)
# {2022/2/22: 1000}

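A hash function has two requirements: it must return an integer, and two equal objects must have the same hash value. Returning 2 here is a legal but silly hash function: it meets the requirements but causes massive hash collisions. If we returned a string instead, it would be an illegal hash function and raise TypeError: __hash__ method should return an integer.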

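The official Python recommendation is to use the builtin hash function inside your __hash__: pack the object's core attributes into a tuple and return the hash of that tuple. You must still make sure yourself that two equal objects have equal hash values.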

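One more point worth noting: if your object is mutable, it should not be used as a dict key. Mutable here does not mean mutable at the language level, since essentially all custom Python objects are mutable. It means whether you treat the object as mutable. If you are going to change the object after creating it, it should not be a dict key.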

class Date:
    def __init__(self, year, month, date):
        self.year = year
        self.month = month
        self.date = date
    
    def __repr__(self):
        return f"{self.year}/{self.month}/{self.date}"
    
    def __eq__(self, other):
        return (self.year == other.year and
                self.month == other.month and
                self.date == other.date)
    
    def __hash__(self):
        return hash((self.year,self.month,self.date))

x = Date(2022, 2, 22)
y = Date(2022, 2, 22)
income = {}
income[x] = 1000
income[y] = 1000
print(income)

The __bool__ method

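Finally, let us look at what happens when a custom object appears as the condition in a conditional statement. Here we create a Date object x and write if x: print("Hello!"). We can see that x is treated as true. By default, any custom object placed directly in an if statement is considered true.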

class Date:
    def __init__(self, year, month, date):
        self.year = year
        self.month = month
        self.date = date
    
    def __repr__(self):
        return f"{self.year}/{self.month}/{self.date}"

x = Date(2022, 2, 22)

if x:
    print("Hello!")
# Hello!

If you want to change the result of boolean evaluation, use the __bool__ magic method. Once it is defined, converting x to True or False calls __bool__. It is called once when we convert x to a bool with bool(x), and once more when we use x as a condition in if x. Hello! is no longer printed.

class Date:
    def __init__(self, year, month, date):
        self.year = year
        self.month = month
        self.date = date
    
    def __repr__(self):
        return f"{self.year}/{self.month}/{self.date}"

    def __bool__(self):
        print("__bool__")
        return False

x = Date(2022, 2, 22)
print(bool(x))

if x:
    print("Hello!")
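A related detail not shown above: when a class defines no __bool__ but does define __len__, Python uses the length to decide truthiness, treating a length of zero as false. The Basket class below is a hypothetical example of this fallback.

```python
class Basket:  # hypothetical class for this example
    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


empty = bool(Basket([]))        # __len__ returns 0, so the object is falsy
filled = bool(Basket([1, 2]))   # non-zero length, so the object is truthy
print(empty, filled)  # False True
```

If both methods are defined, __bool__ takes precedence over __len__.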