Garbled WeChat nicknames: a solution

Published: 2023-07-06 19:09:41 Author: 进击的小蔡鸟

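Background

When fetching user info through WeChat web-page authorization, the nickname comes back garbled.

Cause:

No charset was set when calling the API, so the response was decoded with the default charset, ISO-8859-1. That charset cannot represent Chinese characters or most special symbols, so the UTF-8 bytes of the nickname are turned into the wrong characters.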
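The effect is easy to reproduce without calling WeChat at all. The following is a minimal sketch (the class `MojibakeDemo` and the method `garble` are made up for illustration): it encodes a string as UTF-8, which is presumably what the API sends, and then decodes those bytes as ISO-8859-1.

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {

    // Encode as UTF-8 (assumed to be what the server sends),
    // then decode as ISO-8859-1 (the wrong default on the client side).
    static String garble(String original) {
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Three Chinese characters, written as escapes to avoid source-encoding issues
        String nickname = "\u5C0F\u8521\u9E1F";
        String garbled = garble(nickname);

        System.out.println(nickname.length()); // 3
        System.out.println(garbled.length());  // 9: each character became three Latin-1 characters
        System.out.println(garbled);           // unreadable accented letters and control characters
    }
}
```

Pure ASCII text survives this round trip unchanged, which is why the problem only shows up for nicknames with Chinese characters, emoji, or decorative symbols.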

Original code

    /**
     * Web-page authorization: fetch user info.
     *
     * @param accessToken web-page authorization token (note: not the same as the official account token)
     * @param openId      the user's openId
     * @return the user info as JSON, or null if the request fails
     */
    public @Nullable JSONObject getSnsUserInfo(String accessToken, String openId) {
        String requestUrl = StrUtil.format("https://api.weixin.qq.com/sns/userinfo?access_token={}&openid={}&lang=zh_CN", accessToken, openId);
        log.info("getSnsUserInfo request url: {}", requestUrl);
        try {
            // No charset is configured here, so the String converter falls back to ISO-8859-1
            String responseStr = restTemplate.getForObject(requestUrl, String.class);
            JSONObject response = JSON.parseObject(responseStr);
            log.info("getSnsUserInfo response: {}", response);
            boolean isSuccess = checkResponseIsSuccess(response, "getSnsUserInfo");
            if (isSuccess) {
                return response;
            }
        } catch (Exception e) {
            log.info("getSnsUserInfo (web-page authorization user info) failed: {}", e.getMessage());
        }
        return null;
    }

Solution:

New data

Specify UTF-8 explicitly when making the request, so that the response body is decoded as UTF-8.

Improved code

    /**
     * Web-page authorization: fetch user info.
     *
     * @param accessToken web-page authorization token (note: not the same as the official account token)
     * @param openId      the user's openId
     * @return the user info as JSON, or null if the request fails
     */
    public @Nullable JSONObject getSnsUserInfo(String accessToken, String openId) {
        String requestUrl = StrUtil.format(SNS_USER_INFO_URL, accessToken, openId);
        log.info("getSnsUserInfo request url: {}", requestUrl);
        try {
            // Register a StringHttpMessageConverter whose default charset is UTF-8.
            // Do it only once: adding a new converter on every call would grow the list without bound.
            boolean hasUtf8Converter = restTemplate.getMessageConverters().stream()
                    .anyMatch(c -> c instanceof StringHttpMessageConverter
                            && StandardCharsets.UTF_8.equals(((StringHttpMessageConverter) c).getDefaultCharset()));
            if (!hasUtf8Converter) {
                StringHttpMessageConverter stringConverter = new StringHttpMessageConverter(StandardCharsets.UTF_8);
                stringConverter.setSupportedMediaTypes(Collections.singletonList(MediaType.TEXT_PLAIN));
                // Put it first so it takes precedence over the default ISO-8859-1 String converter
                restTemplate.getMessageConverters().add(0, stringConverter);
            }

            String responseStr = restTemplate.getForObject(requestUrl, String.class);
            JSONObject response = JSON.parseObject(responseStr);
            log.info("getSnsUserInfo response: {}", response);
            boolean isSuccess = checkResponseIsSuccess(response, "getSnsUserInfo");
            if (isSuccess) {
                return response;
            }
        } catch (Exception e) {
            log.info("getSnsUserInfo (web-page authorization user info) failed: {}", e.getMessage());
        }
        return null;
    }
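Why is changing the converter's default charset enough? Roughly speaking, a String converter uses the charset declared in the response's Content-Type header and only falls back to its default when none is declared. The sketch below imitates that decision with a hypothetical helper, `resolveCharset`; it is an illustration under that assumption, not Spring's actual code, and the Content-Type values are examples rather than captured WeChat responses.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

class CharsetFallbackDemo {

    // Hypothetical helper: pick the charset from a Content-Type value,
    // or return the fallback when no usable charset parameter is present.
    static Charset resolveCharset(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String param = part.trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = param.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Unknown or malformed charset name: use the fallback
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        // No charset declared: the fallback decides how the body is decoded
        System.out.println(resolveCharset("text/plain", StandardCharsets.ISO_8859_1)); // ISO-8859-1
        System.out.println(resolveCharset("text/plain", StandardCharsets.UTF_8));      // UTF-8
        // Charset declared: the fallback is ignored
        System.out.println(resolveCharset("text/plain; charset=UTF-8", StandardCharsets.ISO_8859_1)); // UTF-8
    }
}
```

Under this model, a response that does not declare a charset is decoded with whatever the fallback is, so switching the fallback from ISO-8859-1 to UTF-8 fixes the nickname.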

Historical data

Nicknames that were already stored after being decoded as ISO-8859-1 can be converted back to UTF-8:

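    @Test
    public void test() {
        // A UTF-8 nickname that was wrongly decoded as ISO-8859-1
        String wrongEncodedString = "Má´\u0087á´\u0087á´\u009B ê¦¿á\u00AD\u0084 .";

        if (isISO88591(wrongEncodedString)) {
            String newStr = convertStrCharset(wrongEncodedString);
            System.out.println(newStr);
            // Result: Mᴇᴇᴛ ꦿ᭄ .
        }
    }

    // Returns true if every character fits in ISO-8859-1 (U+0000 to U+00FF),
    // i.e. the string could be the result of decoding bytes as ISO-8859-1.
    // Note: plain ASCII and genuine Latin-1 text also pass this check.
    private boolean isISO88591(String str) {
        byte[] byteArr = str.getBytes(StandardCharsets.ISO_8859_1);
        String convertedStr = new String(byteArr, StandardCharsets.ISO_8859_1);
        // The round trip reproduces the string only if no character was unmappable
        return str.equals(convertedStr);
    }

    private String convertStrCharset(String str) {
        try {
            // Recover the original bytes, assuming the string was decoded as ISO-8859-1
            byte[] bytes = str.getBytes(StandardCharsets.ISO_8859_1);
            // Decode those bytes as UTF-8 to get the intended string
            return new String(bytes, StandardCharsets.UTF_8);
        } catch (Exception e) {
            log.warn("convertStrCharset failed,errorMsg:{}", e.getMessage());
        }
        return str;
    }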
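One caveat with the conversion above: a genuine Latin-1 nickname such as "café" also passes the ISO-8859-1 check, and decoding its bytes as UTF-8 silently replaces the invalid sequence with a replacement character. A stricter variant, sketched below under the assumption that such nicknames should be left untouched, uses a `CharsetDecoder` that reports malformed input. The class `NicknameRepair` and the method `repair` are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class NicknameRepair {

    // Returns the repaired string, or the input unchanged if it does not
    // look like UTF-8 bytes that were decoded as ISO-8859-1.
    static String repair(String stored) {
        // Characters above U+00FF cannot come from an ISO-8859-1 decode
        if (!StandardCharsets.ISO_8859_1.newEncoder().canEncode(stored)) {
            return stored;
        }
        byte[] bytes = stored.getBytes(StandardCharsets.ISO_8859_1);
        try {
            // Strict decoding: throw instead of inserting U+FFFD on invalid UTF-8
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            // Not valid UTF-8, so the string was probably fine as stored
            return stored;
        }
    }

    public static void main(String[] args) {
        System.out.println(repair("\u00E5\u00B0\u008F")); // garbled form of U+5C0F is repaired
        System.out.println(repair("caf\u00E9"));          // genuine Latin-1 text is left unchanged
    }
}
```

A Latin-1 string whose bytes happen to form valid UTF-8 would still be converted, so it is worth spot-checking the results before updating stored data in bulk.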

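P.S.

ISO-8859-1 covers only 256 characters and cannot represent the characters of every language, in particular Asian languages such as Chinese, Japanese, and Korean. For those, use a Unicode encoding such as UTF-8 or UTF-16; UTF-8 is generally the recommended choice.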