

A service hang: traffic suddenly dropped to zero, looking for a troubleshooting approach

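I wrote a service that calls Elasticsearch for create, read, update, and delete operations. A few days ago it hung: the service received no traffic at all, requests could not get in, and many requests piled up in the RPC service queue and timed out. After ten-odd minutes the problem went away on its own.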

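GC frequency at the time was not high, and all of it was young-generation GC. QPS was within the normal range. The logs recorded some slow queries and updates, but when I reran them they were not actually slow. Perhaps they were only recorded as slow because the requests had timed out, so I don't think this clue is reliable.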

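Another possibility: the connections between the service layer and the ES cluster were exhausted, so requests to the ES cluster were blocked for that period? My client is written like this: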

public class ClientManager {

private static Logger logger = LogManager.getLogger(ClientManager.class.getName());

private static final String CLUSTER_NAME = "cluster.name";
private static final String ES_SERVICES = "es.services";

private Client client;

private static class ClientManagerHolder {
    private ClientManagerHolder() {
    }

    private static final ClientManager INSTANCE = new ClientManager();
}

public static Client getClient() {
    return ClientManagerHolder.INSTANCE.client;
}

private ClientManager() {
    if (client == null) {
        createClient();
    }
}

private void createClient() {
    // init

    String configPath = Path.getCurrentPath() + "/../config/app.properties";
    logger.info("######## appConfig config file path: " + configPath);
    AppConfig.init(configPath);

    try {
        String clusterName = AppConfig.getProperty(CLUSTER_NAME);
        String services = AppConfig.getProperty(ES_SERVICES);
        logger.debug("es.services:" + services);
        logger.debug("clusterName:" + clusterName);
        Settings settings = Settings.settingsBuilder().put("cluster.name", clusterName)
                .put("client.transport.sniff", true).put("client.transport.ignore_cluster_name", true)
                .put("client.transport.ping_timeout", "1s").put("client.transport.nodes_sampler_interval", "1s")
                .build();
        // add delete-by-query plugin
        TransportClient c = TransportClient.builder().settings(settings).addPlugin(DeleteByQueryPlugin.class)
                .build();
        String[] servicesArray;
        if (StringUtils.isNotBlank(services)) {
            servicesArray = services.split(",");
            for (String service : servicesArray) {
                String[] serviceInfo = service.split(":");
                if (serviceInfo.length > 1) {
                    c = c.addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName(serviceInfo[0]),
                            Integer.valueOf(serviceInfo[1])));
                }
            }
            client = c;
            logger.info("connect to es cluster success.");
        } else {
            logger.error(" has no services info.");
        }

    } catch (Exception e) {
        logger.error("create es client failed.", e);
    }
}

}
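Roughly, it is a singleton, but I'm not sure whether the ES client does connection pooling or closes requests on its own. In any case, I have never called the close method manually, and I wonder whether that is what kept connection pool resources from being released.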
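On the close question, here is a minimal sketch of releasing a process-wide client once at JVM shutdown instead of per request. It assumes the singleton is meant to live for the whole process and that the real client can be treated as an `AutoCloseable`. `FakeClient` and `closeHook` are hypothetical names used only for illustration, not part of the Elasticsearch API.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class ClientShutdownSketch {

    /** Hypothetical stand-in for the real ES client; only close() matters here. */
    public static final class FakeClient implements AutoCloseable {
        private final AtomicBoolean closed = new AtomicBoolean(false);

        @Override
        public void close() {
            closed.set(true);
        }

        public boolean isClosed() {
            return closed.get();
        }
    }

    /** Builds (but does not start) a thread that closes the resource; intended for use as a shutdown hook. */
    public static Thread closeHook(AutoCloseable resource) {
        return new Thread(() -> {
            try {
                resource.close();
            } catch (Exception e) {
                // At shutdown there is little else to do besides reporting the failure.
                System.err.println("failed to close client: " + e);
            }
        }, "es-client-shutdown");
    }

    public static void main(String[] args) {
        FakeClient client = new FakeClient();
        // Assumption: the shared client is closed exactly once, when the JVM exits.
        Runtime.getRuntime().addShutdownHook(closeHook(client));
        System.out.println("closed before exit: " + client.isClosed());
    }
}
```

In the ClientManager above, the hook would be registered right after `client = c;`. Whether the missing close call has anything to do with the hang is a separate question that this sketch does not answer.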

Here is the pool-related part of my ES configuration as well (thread pool settings):
threadpool:
    index:
        type: fixed
        size: 24
        queue_size: 500
    bulk:
        type: fixed
        size: 24
        queue_size: 500

action.write_consistency: one
index.store.type: mmapfs
indices.memory.index_buffer_size: 10%
index.translog.flush_threshold_ops: 50000
index.translog.flush_threshold_size: 500mb
index.translog.flush_threshold_period: 10m
indices.memory.min_translog_buffer_size: 512m
indices.memory.max_translog_buffer_size: 512m
indices.queries.cache.size: 512m
indices.queries.cache.count: 5000

Could someone experienced help me analyze this?

Answer
猫馆

What you need to do is use jstack to inspect the thread states, and then analyze further in combination with the I/O and CPU conditions at the time of the incident.
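The usual way to get the dump is `jstack <pid>` against the running process while the hang is happening. When attaching from outside is not convenient, a rough in-process equivalent is sketched below, using the standard `java.lang.management` API to tally threads by state and check for deadlocks. The class and method names (`ThreadStateSketch`, `countByState`, `deadlockedThreadCount`) are made up for this example, and it assumes a JVM that supports deadlock detection for ownable synchronizers (HotSpot does).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

public class ThreadStateSketch {

    /** Counts live threads by state, similar to tallying the states in a jstack dump. */
    public static Map<Thread.State, Integer> countByState() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        // false, false: skip locked-monitor and synchronizer details to keep the snapshot cheap.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info == null) {
                continue; // defensive: skip any missing entry
            }
            counts.merge(info.getThreadState(), 1, Integer::sum);
        }
        return counts;
    }

    /** Returns how many threads are deadlocked on monitors or ownable synchronizers (0 if none). */
    public static int deadlockedThreadCount() {
        long[] ids = ManagementFactory.getThreadMXBean().findDeadlockedThreads();
        return ids == null ? 0 : ids.length;
    }

    public static void main(String[] args) {
        System.out.println("threads by state: " + countByState());
        System.out.println("deadlocked threads: " + deadlockedThreadCount());
    }
}
```

If most RPC worker threads show up as BLOCKED or WAITING during a hang, the full stack traces from jstack should show what they are waiting on. That is the point to compare against the I/O and CPU figures from the same time window.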

May 19, 2017, 01:38