分布式服务雪崩

发布时间 2023-11-30 13:33:43作者: hongdada

分布式服务雪崩

简介

公司分布式服务链路,底层强依赖数美第三方风控,校验用户是否黑产,随着并发量上升,数美接口返回时间逐渐增加,导致风控服务很多线程等待中,其他调用风控服务的上游服务线程也等待,然后上游的上游。。。,一层层全阻塞。

模拟:

新建springboot服务

tomcat配置:

server:
  tomcat:
    uri-encoding: UTF-8
    #最大链接数
    max-connections: 1000
    #最大等待队列长度
    accept-count: 200
    #请求头最大长度kb
    max-http-header-size: 1048576
    #请请求体最大长度kb
    #max-http-post-size: 2097152
    threads:
      #最大线程数
      max: 150
      #最小线程数
      min-spare: 10
  #服务http端口
  port: 8080

同一个服务,分端口部署,模拟上游调用下游

下游接口:

	@RequestMapping(value = "/test", method = {RequestMethod.GET})
	public ResultBase test() throws InterruptedException {
		Thread.sleep(30000);
		return ResultBase.getSuccessResult(ResultDTO.of().setResult(null));
	}

使用RestTemplate调用下游接口

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.client.ClientHttpRequestFactory;
import org.springframework.http.client.SimpleClientHttpRequestFactory;
import org.springframework.web.client.RestTemplate;


@Configuration
public class RestTemplateConfig {
	@Bean
	public RestTemplate restTemplate(ClientHttpRequestFactory factory) {
		return new RestTemplate(factory);

	}

	@Bean
	public ClientHttpRequestFactory simpleClientHttpRequestFactory() {
		SimpleClientHttpRequestFactory factory = new SimpleClientHttpRequestFactory();
		factory.setConnectTimeout(15000);
		factory.setReadTimeout(5000);
		return factory;
	}
}

使用abtest进行压测

ab -n 60000 -c 300 http://127.0.0.1:8080/rest

使用同步线程调用下游

	@RequestMapping(value = "/rest", method = {RequestMethod.GET})
	public ResultBase rest() {
		restTemplate.getForObject("http://127.0.0.1:8090/test", ResultBase.class);
		return ResultBase.getSuccessResult(ResultDTO.of().setResult(null));
	}

调用前:

image-20231129181738198

掉用开始:

image-20231129181615344

部分进入WAITING

image-20231129181640413

使用异步线程调用下游

	@RequestMapping(value = "/rest", method = {RequestMethod.GET})
	public ResultBase rest() {
		CompletableFuture.supplyAsync(() -> {
			//长时间的计算任务
			restTemplate.getForObject("http://127.0.0.1:8090/test", ResultBase.class);
			return null;
		});
		return ResultBase.getSuccessResult(ResultDTO.of().setResult(null));
	}

开始调用前:

image-20231129174340278

进行中:

image-20231129174552153

可以看到,15个forkjoinpool

image-20231129174649024

差不多150个请求在WAITING

[arthas@20636]$ thread 228
"http-nio-8080-exec-170" Id=228 WAITING on java.util.concurrent.locks.ReentrantLock$NonfairSync@bd6cb20 owned by "http-nio-8080-exec-268" Id=326
    at sun.misc.Unsafe.park(Native Method)
    -  waiting on java.util.concurrent.locks.ReentrantLock$NonfairSync@bd6cb20
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
    at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
    at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
    at ch.qos.logback.core.OutputStreamAppender.writeBytes(OutputStreamAppender.java:197)
    at ch.qos.logback.core.OutputStreamAppender.subAppend(OutputStreamAppender.java:231)
    at ch.qos.logback.core.OutputStreamAppender.append(OutputStreamAppender.java:102)
    at ch.qos.logback.core.UnsynchronizedAppenderBase.doAppend(UnsynchronizedAppenderBase.java:84)
    at ch.qos.logback.core.spi.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:51)
    at ch.qos.logback.classic.Logger.appendLoopOnAppenders(Logger.java:270)
    at ch.qos.logback.classic.Logger.callAppenders(Logger.java:257)
    at ch.qos.logback.classic.Logger.buildLoggingEventAndAppend(Logger.java:421)
    at ch.qos.logback.classic.Logger.filterAndLog_1(Logger.java:398)
    at ch.qos.logback.classic.Logger.info(Logger.java:583)
    at com.shanjige.chatgpt.chatgptweb.aop.ControllerAspect.aroundAdvice(ControllerAspect.java:80)
    at sun.reflect.GeneratedMethodAccessor47.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:644)
    at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:633)
    at org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:70)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
    at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:750)
    at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:95)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
    at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:750)
    at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:692)
    at com.shanjige.chatgpt.chatgptweb.web.IndexController$$EnhancerBySpringCGLIB$$a0081f82.rest(<generated>)
    at sun.reflect.GeneratedMethodAccessor46.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:190)
    at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:138)
    at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:105)
    at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:878)
    at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:792)
    at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87)
    at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1040)
    at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:943)
    at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1006)
    at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:898)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:626)
    at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:883)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:733)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:227)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
    at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
    at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:100)
    at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
    at org.springframework.web.filter.FormContentFilter.doFilterInternal(FormContentFilter.java:93)
    at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
    at org.springframework.boot.actuate.metrics.web.servlet.WebMvcMetricsFilter.doFilterInternal(WebMvcMetricsFilter.java:97)
    at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
    at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:201)
    at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:202)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:97)
    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:542)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:143)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:78)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:357)
    at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:374)
    at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65)
    at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:893)
    at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1707)
    at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    at java.lang.Thread.run(Thread.java:748)

image-20231129173949265

差不多150个请求在TIMED_WAITING

[arthas@20636]$ thread 319
"http-nio-8080-exec-261" Id=319 TIMED_WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@1e4f12a6
    at sun.misc.Unsafe.park(Native Method)
    -  waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@1e4f12a6
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
    at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:90)
    at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:33)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    at java.lang.Thread.run(Thread.java:748)

结束后:

image-20231129174003462

比开始时,多了15个forkjoinpool

使用线程池调用下游

	private ThreadPoolExecutor threadPoolExecutor = new ThreadPoolExecutor(20, 20, 200, TimeUnit.SECONDS,
			new LinkedBlockingDeque<>(300));

	@RequestMapping(value = "/rest", method = {RequestMethod.GET})
	public ResultBase rest() {
		threadPoolExecutor.execute(()-> restTemplate.getForObject("http://127.0.0.1:8090/test", ResultBase.class));
		return ResultBase.getSuccessResult(ResultDTO.of().setResult(null));
	}

image-20231129170832340

可以看到,线程池设置的为20,线程池满了阻塞以后,差不多150个请求线程全都WAITING

参考

曹工说Tomcat:200个http-nio-8080-exec线程全都被第三方服务拖住了,这可如何是好(上:线程模型解析)