Redisson concurrent lock error: IllegalMonitorStateException: attempt to unlock lock, not locked by current thread by node id

Published: 2023-10-18 14:51:06  Author: 梦里有时身化鹤

The following error suddenly appeared in production:

    j.l.IllegalMonitorStateException: attempt to unlock lock, not locked by current thread by node id: 1411e030-3c44-48d7-9eb6-6030022ce681 thread-id: 111
    at o.r.RedissonBaseLock.lambda$unlockAsync$2(RedissonBaseLock.java:323)
    at j.u.c.CompletableFuture.uniHandle(CompletableFuture.java:930)
    at j.u.c.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:907)
    at j.u.c.CompletableFuture.postComplete(CompletableFuture.java:506)
    at j.u.c.CompletableFuture.complete(CompletableFuture.java:2073)
    at o.r.c.CommandBatchService.lambda$executeAsync$7(CommandBatchService.java:322)

First, here is the code at the place where the error was raised:

    RLock lock = redissonClient.getLock(key);
    if (lock.tryLock()) {
        try {
            // business logic
        } finally {
            lock.unlock();
        }
    } else {
        // alternative handling
    }

This is exactly the usage recommended in the method's Javadoc:

    /**
     * Acquires the lock only if it is free at the time of invocation.
     *
     * <p>Acquires the lock if it is available and returns immediately
     * with the value {@code true}.
     * If the lock is not available then this method will return
     * immediately with the value {@code false}.
     *
     * <p>A typical usage idiom for this method would be:
     * <pre> {@code
     * Lock lock = ...;
     * if (lock.tryLock()) {
     *   try {
     *     // manipulate protected state
     *   } finally {
     *     lock.unlock();
     *   }
     * } else {
     *   // perform alternative actions
     * }}</pre>
     *
     * This usage ensures that the lock is unlocked if it was acquired, and
     * doesn't try to unlock if the lock was not acquired.
     *
     * @return {@code true} if the lock was acquired and
     *         {@code false} otherwise
     */
    boolean tryLock();
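To make the idiom and the exception concrete, here is a minimal local sketch. It uses the JDK's `java.util.concurrent.locks.ReentrantLock` as a stand-in, not Redisson's `RLock`, so it only illustrates the ownership rule behind the message: `unlock()` from a thread that does not own the lock throws `IllegalMonitorStateException`. The class and helper names (`TryLockIdiomDemo`, `runGuarded`, `onOtherThread`, `unlockFromNonOwner`) are made up for this example.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

public class TryLockIdiomDemo {

    /** Runs the tryLock idiom once and reports which branch was taken. */
    static String runGuarded(ReentrantLock lock) {
        if (lock.tryLock()) {
            try {
                return "acquired";      // business logic would go here
            } finally {
                lock.unlock();          // only reached by the thread that acquired the lock
            }
        } else {
            return "busy";              // alternative handling
        }
    }

    /** Runs a task on a fresh thread and returns its result. */
    static String onOtherThread(Supplier<String> task) {
        String[] result = new String[1];
        Thread t = new Thread(() -> result[0] = task.get());
        t.start();
        try {
            t.join();                   // join makes result[0] visible to the caller
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    /** Calls unlock() from a thread that does not own the lock and names what happened. */
    static String unlockFromNonOwner(ReentrantLock lock) {
        return onOtherThread(() -> {
            try {
                lock.unlock();
                return "no exception";
            } catch (IllegalMonitorStateException e) {
                return "IllegalMonitorStateException";
            }
        });
    }

    public static void main(String[] args) {
        ReentrantLock lock = new ReentrantLock();
        System.out.println(runGuarded(lock));                           // lock is free
        lock.lock();                                                    // main thread now holds it
        try {
            System.out.println(onOtherThread(() -> runGuarded(lock)));  // other thread cannot acquire
            System.out.println(unlockFromNonOwner(lock));               // non-owner unlock fails
        } finally {
            lock.unlock();
        }
    }
}
```

With the idiom followed correctly, `unlock()` is only ever called by the thread that got `true` from `tryLock()`, which is why the production error was so surprising.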

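The tracing logs for the failing request show that the work done before unlock was very short, only about 200 ms, so the lock could not have expired and been released automatically. I was baffled.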

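I read through many reports of this error online. Some offer advice such as "check whether the current thread holds the lock before calling unlock" to avoid the exception, but none of them describe the root cause or a real fix.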
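For reference, the commonly suggested guard looks roughly like the sketch below. It is again written against the JDK's `ReentrantLock` so that it runs without a Redis server; Redisson's `RLock` also exposes `isHeldByCurrentThread()`. The helper name `unlockIfHeld` is hypothetical.

```java
import java.util.concurrent.locks.ReentrantLock;

public class GuardedUnlockDemo {

    /** Unlocks only if the calling thread owns the lock; returns whether it unlocked. */
    static boolean unlockIfHeld(ReentrantLock lock) {
        if (lock.isHeldByCurrentThread()) {
            lock.unlock();
            return true;
        }
        return false;   // skipped silently: nothing explains why the lock was not held
    }

    public static void main(String[] args) {
        ReentrantLock lock = new ReentrantLock();
        System.out.println(unlockIfHeld(lock));   // lock not held: no exception is thrown
        lock.lock();
        System.out.println(unlockIfHeld(lock));   // lock held: it is released
        System.out.println(lock.isLocked());      // confirms the release
    }
}
```

The guard suppresses the exception but says nothing about why a thread that just acquired the lock would no longer hold it, which is the question this post is trying to answer.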

 

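The investigation went as follows.

First I went to the Redisson documentation (https://github.com/redisson/redisson/wiki/8.-distributed-locks-and-synchronizers/#81-lock) and read the section on locks, but did not find what I needed.

Then I searched the issues by keyword and finally found someone who had hit the same problem, along with a note that it had recently been fixed (at the time of writing the fix had not yet been released; it is planned for 3.23.6):

1. Issue description: https://github.com/redisson/redisson/issues/4871

2. Fix commit: https://github.com/redisson/redisson/commit/22c239eaf8b46c8d0e94af14c98ced21bdee0b52#diff-589deed8cc0d31690d8a3b1d16b81c366fa8b9346969295df2061d610fa166b3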

    protected final RFuture<Boolean> unlockInnerAsync(long threadId) {
        String id = getServiceManager().generateId();
        MasterSlaveServersConfig config = getServiceManager().getConfig();
        int timeout = (config.getTimeout() + config.getRetryInterval()) * config.getRetryAttempts();
        RFuture<Boolean> r = unlockInnerAsync(threadId, id, timeout);
        CompletionStage<Boolean> ff = r.thenApply(v -> {
            // Removed by the fix:
            // commandExecutor.writeAsync(getRawName(), LongCodec.INSTANCE, RedisCommands.DEL, getUnlockLatchName(id));
            // Added by the fix (start)
            CommandAsyncExecutor ce = commandExecutor;
            if (ce instanceof CommandBatchService) {
                ce = new CommandBatchService(commandExecutor);
            }
            ce.writeAsync(getRawName(), LongCodec.INSTANCE, RedisCommands.DEL, getUnlockLatchName(id));
            if (ce instanceof CommandBatchService) {
                ((CommandBatchService) ce).executeAsync();
            }
            // Added by the fix (end)
            return v;
        });
        return new CompletableFutureWrapper<>(ff);
    }

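It looks like the fix explicitly uses CommandBatchService, the service that batches Redis commands. Since I am not yet familiar with the overall source code, I cannot yet tell what the root cause of the error actually is.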
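To read the patch more easily, the shape of the added lines can be modelled with a toy deferred batch. This is only a sketch under assumptions: `CommandSink`, `DirectSink`, `BatchSink` and `deleteLatch` are hypothetical names, not Redisson types, and the model says nothing about the root cause. It only shows the general property of a batch executor that a written command is queued until the batch is executed, which is why the patched code calls `executeAsync()` on the fresh batch it creates.

```java
import java.util.ArrayList;
import java.util.List;

public class DeferredBatchDemo {

    /** Hypothetical stand-in for a command executor; not a Redisson type. */
    interface CommandSink {
        void write(String command);
    }

    /** Sends every command to the wire immediately. */
    static class DirectSink implements CommandSink {
        private final List<String> wire;
        DirectSink(List<String> wire) { this.wire = wire; }
        @Override public void write(String command) { wire.add(command); }
    }

    /** Queues commands and sends them only when execute() is called. */
    static class BatchSink implements CommandSink {
        private final List<String> wire;
        private final List<String> queued = new ArrayList<>();
        BatchSink(List<String> wire) { this.wire = wire; }
        @Override public void write(String command) { queued.add(command); }
        void execute() { wire.addAll(queued); queued.clear(); }
    }

    /** Mirrors the shape of the patched callback: fresh batch, queue the DEL, then flush. */
    static void deleteLatch(CommandSink current, List<String> wire, String latchName) {
        CommandSink ce = current;
        if (ce instanceof BatchSink) {
            ce = new BatchSink(wire);           // separate batch just for the cleanup command
        }
        ce.write("DEL " + latchName);
        if (ce instanceof BatchSink) {
            ((BatchSink) ce).execute();         // without this the DEL would stay queued
        }
    }

    public static void main(String[] args) {
        List<String> wire = new ArrayList<>();
        BatchSink outer = new BatchSink(wire);
        outer.write("DEL latch-1");             // queued in the outer batch, not sent
        System.out.println(wire);               // nothing on the wire yet
        deleteLatch(outer, wire, "latch-2");    // fresh batch is created and executed
        System.out.println(wire);               // only the command from the fresh batch was sent
    }
}
```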

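I will leave this as an open question for now. Here is what I plan to look into:

1. In what format is the distributed lock stored in Redis?

2. What do the lock and unlock flows look like?

3. How is data consistency guaranteed?

4. What needs special attention under high concurrency?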