PHP 使用 nikic/php-parser 处理 AST

发布时间 2023-09-08 01:40:43作者: 菜皮日记

先来熟悉 php-parser 的 API

nikic/PHP-Parser 可以解析 PHP 代码并生成 AST,还支持修改 AST 再还原成PHP源码,从而实现元编程,可用来做 AOP 和静态代码检查等。Swoft 框架中 AOP 也是基于 PHP-parser 开发的。

https://github.com/nikic/PHP-Parser

首先使用 composer 安装 php-parser

composer require nikic/php-parser

在代码中引入 autoload.php,开始测试代码

<?php
require __DIR__ . '/vendor/autoload.php';

use PhpParser\Error;
use PhpParser\NodeDumper;
use PhpParser\ParserFactory;

// 定义一段PHP代码
$code = <<<'CODE'
<?php
function printLine($msg) {
    echo $msg, "\n";
}
printLine('Hello World!!!');
CODE;

// 创建一个解析器parser,需要指定优先版本
$parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
try {
		// 解析上面定义的PHP代码
    $ast = $parser->parse($code);
} catch (Error $error) {
    echo "Parse error: {$error->getMessage()}\n";
    return;
}

$dumper = new NodeDumper;
// 打印出生成的 AST
echo $dumper->dump($ast) . "\n========\n";

打印出结果:


array(
    0: Stmt_Function(
        attrGroups: array(
        )
        byRef: false
        name: Identifier(
            name: printLine
        )
        params: array(
            0: Param(
                attrGroups: array(
                )
                flags: 0
                type: null
                byRef: false
                variadic: false
                var: Expr_Variable(
                    name: msg
                )
                default: null
            )
        )
        returnType: null
        stmts: array(
            0: Stmt_Echo(
                exprs: array(
                    0: Expr_Variable(
                        name: msg
                    )
                    1: Scalar_String(
                        value:

                    )
                )
            )
        )
    )
    1: Stmt_Expression(
        expr: Expr_FuncCall(
            name: Name(
                parts: array(
                    0: printLine
                )
            )
            args: array(
                0: Arg(
                    name: null
                    value: Scalar_String(
                        value: Hello World!!!
                    )
                    byRef: false
                    unpack: false
                )
            )
        )
    )
)

AST 中各个结构说明可参见文档:https://github.com/nikic/PHP-Parser/blob/master/doc/2_Usage_of_basic_components.markdown#node-tree-structure

上面打印的数组中分别是:

  • Stmt_Function -> PhpParser\Node\Stmt\Function_
  • Stmt_Expression -> PhpParser\Node\Stmt\Expression

Function_ 有个 _ 后缀是因为 Function 本身是保留字,包中还有很多命名带有_ 也都是这个原因。

Node 的类型说明:

  • PhpParser\Node\Stmts are statement nodes, i.e. language constructs that do not return a value and can not occur in an expression. For example a class definition is a statement. It doesn't return a value and you can't write something like func(class A {});.
  • PhpParser\Node\Exprs are expression nodes, i.e. language constructs that return a value and thus can occur in other expressions. Examples of expressions are $var (PhpParser\Node\Expr\Variable) and func() (PhpParser\Node\Expr\FuncCall).
  • PhpParser\Node\Scalars are nodes representing scalar values, like 'string' (PhpParser\Node\Scalar\String_), 0 (PhpParser\Node\Scalar\LNumber) or magic constants like __FILE__ (PhpParser\Node\Scalar\MagicConst\File). All PhpParser\Node\Scalars extend PhpParser\Node\Expr, as scalars are expressions, too.
  • There are some nodes not in either of these groups, for example names (PhpParser\Node\Name) and call arguments (PhpParser\Node\Arg).

访问并修改 Node:


// 访问第0个元素 即Stmt_Function,一级一级向下访问,最后赋值
$ast[0]->stmts[0]->exprs[1]->value = '换行被替换了';
// 访问第1个元素 即Stmt_Expression
$ast[1]->expr->args[0]->value->value = 'Hello World被替换了';

echo $dumper->dump($ast) . "\n========\n";

打印结果:

array(
    0: Stmt_Function(
        attrGroups: array(
        )
        byRef: false
        name: Identifier(
            name: printLine
        )
        params: array(
            0: Param(
                attrGroups: array(
                )
                flags: 0
                type: null
                byRef: false
                variadic: false
                var: Expr_Variable(
                    name: msg
                )
                default: null
            )
        )
        returnType: null
        stmts: array(
            0: Stmt_Echo(
                exprs: array(
                    0: Expr_Variable(
                        name: msg
                    )
                    1: Scalar_String(
                        value: 换行被替换了  // 这里value被改变了
                    )
                )
            )
        )
    )
    1: Stmt_Expression(
        expr: Expr_FuncCall(
            name: Name(
                parts: array(
                    0: printLine
                )
            )
            args: array(
                0: Arg(
                    name: null
                    value: Scalar_String(
                        value: Hello World被替换了  // 这里value也被改了
                    )
                    byRef: false
                    unpack: false
                )
            )
        )
    )
)

遍历 AST 中的 Node:

遍历 AST 需要指定一个访问器,需实现几个方法,beforeTraverse 和 afterTraverse 是在开始遍历前和结束遍历后执行一次,enterNode 和 leaveNode 是每遍历到一个 Node 时执行一次。

interface NodeVisitor {
    public function beforeTraverse(array $nodes);
    public function enterNode(Node $node);
    public function leaveNode(Node $node);
    public function afterTraverse(array $nodes);
}

// NodeVisitorAbstract 是其抽象类
class NodeVisitorAbstract implements NodeVisitor
{
    public function beforeTraverse(array $nodes) {
        return null;
    }

    public function enterNode(Node $node) {
        return null;
    }

    public function leaveNode(Node $node) {
        return null;
    }

    public function afterTraverse(array $nodes) {
        return null;
    }
}
use PhpParser\Node;
use PhpParser\Node\Stmt\Function_;
use PhpParser\NodeTraverser;
use PhpParser\NodeVisitorAbstract;

$traverser = new NodeTraverser();
$traverser->addVisitor(new class extends NodeVisitorAbstract {
    public function enterNode(Node $node) {
				// 如果node是Function_类型时
        if ($node instanceof Function_) {
            // Clean out the function body
						// 情况node的语句,即清空了函数体
            $node->stmts = [];

            // 或者返回一个新的node
            // return new Function_("new_func");
        }
    }
});

$ast = $traverser->traverse($ast);
echo $dumper->dump($ast) . "\n========\n";

输出:

array(
    0: Stmt_Function(
        attrGroups: array(
        )
        byRef: false
        name: Identifier(
            name: new_func
        )
        params: array(
        )
        returnType: null
        stmts: array(
        )  //  stmts 被清空了
    )
    1: Stmt_Expression(
        expr: Expr_FuncCall(
            name: Name(
                parts: array(
                    0: printLine
                )
            )
            args: array(
                0: Arg(
                    name: null
                    value: Scalar_String(
                        value: Hello World被替换了
                    )
                    byRef: false
                    unpack: false
                )
            )
        )
    )
)

输出修改后的 PHP 代码,即 Pretty Print

use PhpParser\PrettyPrinter;

$prettyPrinter = new PrettyPrinter\Standard;
echo $prettyPrinter->prettyPrintFile($ast);

输出:

<?php

function printLine($msg)
{
}
printLine('Hello World被替换了');%

函数体被清空了,并且第二个语句 printLine 中的参数被替换了。

有了这种能力,结合一些注释标注等,就可以在 PHP 代码在执行之前动态修改带有指定特征的 PHP代码的行为。

使用 PHP-parser 重写 PHP 类代码实现AOP:

参考文章:https://learnku.com/articles/14387/aop-design-rewrite-the-php-class-using-php-parser

该 AOP 增强的效果是在字符串后面增加一个叹号 !

入口 aop.php:

<?php
require __DIR__ . '/vendor/autoload.php';
require __DIR__ . '/ProxyVisitor.php';
require __DIR__ . '/Test.php';

use PhpParser\ParserFactory;
use PhpParser\NodeTraverser;
use PhpParser\NodeDumper;
use PhpParser\PrettyPrinter\Standard;

$file = './Test.php';
$code = file_get_contents($file);

$parser = (new ParserFactory())->create(ParserFactory::PREFER_PHP7);
$ast = $parser->parse($code);

$dumper = new NodeDumper;
echo $dumper->dump($ast) . "\n========\n";

$traverser = new NodeTraverser();
$className = 'Test';
$proxyId = uniqid();
$visitor = new ProxyVisitor($className, $proxyId);

$traverser->addVisitor($visitor);
$proxyAst = $traverser->traverse($ast);
if (!$proxyAst) {
    throw new \Exception(sprintf('Class %s AST optimize failure', $className));
}
$printer = new Standard();
$proxyCode = $printer->prettyPrint($proxyAst);

echo $proxyCode;

eval($proxyCode);

$class = $visitor->getClassName();
$bean = new $class();

echo $bean->show();

PHP-Parser 的访问器 ProxyVisitor.php

<?php
require __DIR__ . '/vendor/autoload.php';

use PhpParser\NodeVisitorAbstract;
use PhpParser\Node;
use PhpParser\Node\Expr\Closure;
use PhpParser\Node\Expr\FuncCall;
use PhpParser\Node\Expr\MethodCall;
use PhpParser\Node\Expr\Variable;
use PhpParser\Node\Name;
use PhpParser\Node\Param;
use PhpParser\Node\Scalar\String_;
use PhpParser\Node\Stmt\Class_;
use PhpParser\Node\Stmt\ClassMethod;
use PhpParser\Node\Stmt\Return_;
use PhpParser\Node\Stmt\TraitUse;
use PhpParser\NodeFinder;
use PhpParser\NodeDumper;

class ProxyVisitor extends NodeVisitorAbstract
{
    protected $className;

    protected $proxyId;

    public function __construct($className, $proxyId)
    {
        $this->className = $className;
        $this->proxyId = $proxyId;
    }

    public function getProxyClassName(): string
    {
        return \basename(str_replace('\\', '/', $this->className)) . '_' . $this->proxyId;
    }

    public function getClassName()
    {
        return '\\' . $this->className . '_' . $this->proxyId;
    }

    /**
     * @return \PhpParser\Node\Stmt\TraitUse
     */
    private function getAopTraitUseNode(): TraitUse
    {
        // Use AopTrait trait use node
        return new TraitUse([new Name('AopTrait')]);
    }

    public function leaveNode(Node $node)
    {
        echo "=====leaveNode=====\n";

        // Proxy Class
        if ($node instanceof Class_) {
            // Create proxy class base on parent class
            echo "Class_ instance";
            return new Class_($this->getProxyClassName(), [
                'flags' => $node->flags,
                'stmts' => $node->stmts,
                'extends' => new Name($this->className),
            ]);
        }
        // Rewrite public and protected methods, without static methods
        if ($node instanceof ClassMethod && !$node->isStatic() && ($node->isPublic() || $node->isProtected())) {
            $methodName = $node->name->toString();
            echo "classmethod name ", $methodName , "\n";
            // Rebuild closure uses, only variable
            $uses = [];
            foreach ($node->params as $key => $param) {
                if ($param instanceof Param) {
                    $uses[$key] = new Param($param->var, null, null, true);
                }
            }
            $params = [
                // Add method to an closure
                new Closure([
                    'static' => $node->isStatic(),
                    'uses' => $uses,
                    'stmts' => $node->stmts,
                ]),
                new String_($methodName),
                new FuncCall(new Name('func_get_args')),
            ];
            $stmts = [
                new Return_(new MethodCall(new Variable('this'), '__proxyCall', $params))
            ];
            $returnType = $node->getReturnType();
            if ($returnType instanceof Name && $returnType->toString() === 'self') {
                $returnType = new Name('\\' . $this->className);
            }
            return new ClassMethod($methodName, [
                'flags' => $node->flags,
                'byRef' => $node->byRef,
                'params' => $node->params,
                'returnType' => $returnType,
                'stmts' => $stmts,
            ]);
        }
    }

    public function afterTraverse(array $nodes)
    {
        echo "=====afterTraverse=====\n";

        $addEnhancementMethods = true;
        $nodeFinder = new NodeFinder();
        $nodeFinder->find($nodes, function (Node $node) use (
            &$addEnhancementMethods
        ) {
            if ($node instanceof TraitUse) {
                foreach ($node->traits as $trait) {
                    // Did AopTrait trait use ?
                    if ($trait instanceof Name && $trait->toString() === 'AopTrait') {
                        $addEnhancementMethods = false;
                        break;
                    }
                }
            }
        });
        // Find Class Node and then Add Aop Enhancement Methods nodes and getOriginalClassName() method
        $classNode = $nodeFinder->findFirstInstanceOf($nodes, Class_::class);
        $addEnhancementMethods && array_unshift($classNode->stmts, $this->getAopTraitUseNode());
        return $nodes;
    }
}

trait AopTrait
{
    /**
     * AOP proxy call method
     * 这个AOP加强就是往字符串后面加一个 !
     * @param \Closure $closure
     * @param string   $method
     * @param array    $params
     * @return mixed|null
     * @throws \Throwable
     */
    public function __proxyCall(\Closure $closure, string $method, array $params)
    {
        $res = $closure(...$params);
        if (is_string($res)) {
            $res .= '!';
        }
        return $res;
    }

}

被代理的类 Test.php

<?php

class Test
{
    public function show()
    {
        return 'hello world';
    }
}

执行后,被增强的结果类为:

class Test_60b7bffeb7672 extends Test
{
    use AopTrait;
    public function show()
    {
        return $this->__proxyCall(function () {
            return 'hello world';
        }, 'show', func_get_args());
    }
}

执行结果:

hello world!

Swoft 框架中的 AOP 实现原理

swoft 的 aop 也是基于 php-parser 来实现的,由于懒的搞 phpunit,在本是 testcase 的类上直接改代码手动调试了:

<?php declare(strict_types=1);

namespace SwoftTest\Aop\Unit;

require_once './vendor/autoload.php';

use Swoft\Aop\Ast\Visitor\ProxyVisitor;
use Swoft\Aop\BeiAopClass;
use Swoft\Aop\Proxy;
use Swoft\Proxy\Ast\Parser;
use Swoft\Proxy\Exception\ProxyException;
use Swoft\Proxy\Proxy as BaseProxy;
use function class_exists;
use function sprintf;
use const PHP_EOL;

class AopTest
{
    public function testProxyClass(): void
    {
        /* 
				源码在 https://github.com/swoft-cloud/swoft-aop/blob/master/src/Ast/Visitor/ProxyVisitor.php
				实现了PhpParser的NodeVisitor接口,即定义了遍历ast的nodes时,处理每个node的具体方式。
				*/
        $visitor   = new ProxyVisitor();
        $className = BaseProxy::newClassName(BeiAopClass::class, $visitor);
        
        $o = new $className;
        var_dump($o->MethodNull(1,'2', 3.0, 'xxxx'));
       
    }
}

$a = new AopTest();
$a->testProxyClass();

newClassName 方法如下:

<?php

class Proxy{

public static function newClassName(string $className, Visitor $visitor, string $suffix = ''): string
    {
        echo "被aop的类名:$className \n";
        $cacheKey = $className . $suffix;
        if (isset(self::$caches[$cacheKey])) {
            return self::$caches[$cacheKey];
        }
        $parser = new Parser();
        // 给parser添加访问器,即默认的ProxyVisitor
        $parser->addNodeVisitor(get_class($visitor), $visitor);

        $proxyCode = $parser->parse($className);
        echo "代理后的代码:\n $proxyCode \n";
        $proxyName = $visitor->getProxyName();
        $newClassName = $visitor->getProxyClassName();

	      // 要想让代理后的代码生效,可以写入文件,之后require文件生效
        eval($proxyCode);

				// 或者直接eval这段代理后的代码
        $proxyFile = sprintf('%s/%s.php', Sys::getTempDir(), $proxyName);
        $proxyCode = sprintf('<?php %s %s', PHP_EOL, $proxyCode);

        // Generate proxy class
        $result = file_put_contents($proxyFile, $proxyCode);
        if ($result === false) {
            throw new ProxyException(sprintf('Proxy file(%s) generate fail', $proxyFile));
        }

        // Load new proxy class file.
        self::loadProxyClass($proxyFile);

        // Ensure proxy class is loaded
        if (!class_exists($newClassName)) {
            throw new ProxyException(sprintf('Proxy class(%s) is not exist!', $newClassName));
        }

        // Add cache, mark has been required.
        self::$caches[$cacheKey] = $newClassName;
        return $newClassName;
    }

得到类名,即可 new 之,按照原类的方法签名方式调用,即可得到代理后的效果。

本文由mdnice多平台发布