excel导入数据 引发OutOfMemoryError:Javaheap space 持续优化过程 load data infile,easyExcel,executeBatch()批量插入的应用

发布时间 2023-09-14 16:23:13作者: 时倏珍慧

本文不长读完需要 3分20秒,不包含看这些代码;

项目中处理一个导入需求:

      客户会不定期整理一份excel 文档,通过项目系统中的导入功能导入到数据库中;excel 为固定格式,共145列;落地到数据库对应的不同表中,主要存储信息表 table_A 存储69列,这69个对应的字段 有double ,string,text,long 四种类型;其他是一些子表,通过查询或转换插入到其他的5张表中;项目功能迭代上经历了一段过程,

一,项目开始时采用的是:

      - org.apache.poi:poi4.1.2  

      - HSSFworkbook,XSSFworkbook,SXSSFworkbook(具体三者的区别不过多陈述)

 1 public ImportExcel(String fileName, InputStream is, int headerNum, int sheetIndex) throws InvalidFormatException, IOException {
 2         if (StringUtils.isBlank(fileName)) {
 3             throw new RuntimeException("导入文档为空!");
 4         } else if (fileName.toLowerCase().endsWith("xls")) {
 5             this.wb = new HSSFWorkbook(is);
 6         } else if (fileName.toLowerCase().endsWith("xlsx")) {
 7             this.wb = new XSSFWorkbook(is);
 8         } else {
 9             throw new RuntimeException("文档格式不正确!");
10         }
11         if (this.wb.getNumberOfSheets() < sheetIndex) {
12             throw new RuntimeException("文档中没有工作表!");
13         }
14         this.sheet = this.wb.getSheetAt(sheetIndex);
15         this.headerNum = headerNum;
16         log.debug("Initialize success.");
17     }

具体的导入业务逻辑也不详细罗列了

 @Transactional(readOnly = false)
    public void importChemical(MultipartFile multipartFile, int headerNum, int sheetIndex)
            throws InvalidFormatException, IOException {

        long start = System.currentTimeMillis();

        /**
         * 校验
         */
        ImportTask userTask = importDataService.getUserTask();// 当前用户的任务
        // 如果没有创建任务就新建
        if (ConverterUtil.isEmpty(userTask)) {
            userTask = new ImportTask();
            userTask.setStatus(ImportTask.ST_RUNNING);
        } else {
            // 如果任务未正在进行中则提示
            if (ImportTask.ST_RUNNING == userTask.getStatus()) {
                userTask.setStatus(0);
                throw new RuntimeException("You have another task, please waiting for the end.");
            } else {
                userTask.setStatus(ImportTask.ST_RUNNING);
            }
        }
        ImportExcel ei = new ImportExcel(multipartFile, headerNum, sheetIndex);
        int dataRowNum = headerNum;
        int lastDataRowNum = ei.getSheet().getLastRowNum() + headerNum;

        int lastCellNum = ei.getLastCellNum();
        userTask.addMessage("正在对模板进行检查");

        if (lastCellNum != 118) {
            userTask.setStatus(ImportTask.ST_WAIT);
            userTask.addMessage(ImportMessage.ST_ERROR, "模板格式不正确,请下载正确的模板!");
        }

读取excel 文件保存来完成的,项目刚上线用户用不用还不清楚,测试相关的深度性能测试没有进行,简单功能有了上线以后再说吧;

       项目顺利交付进入运行阶段,偶尔会有几个人上来点一点,看一看,导入个几条数据,简单功能可以用,没大问题;我们也没有在意,继续做着其他项目(忘记这个项目的坑,继续去挖坑....); 可是出问题的这一天还是来了,客户要在系统中进行一项什么实验报告,5万多条数据要快速的导入进去,这时候客户想到这个导入功能,就将数据整理到excel 中 试着导入了,导一次不行,接着试还不行,就这样不知道多少次以后,电话打过来了,我们看了看服务中日志,都讨厌的错来了

 

OutOfMemoryError:Javaheap space;

 

事情到眼前了,没跑了修复吧;首先将客户给的文件拆分成多份,让一位测试人员(后来这兄弟就离职了)先导入到系统,客户那边先用着吧。自己一点一点的排查原因,归根到底 初始化 XSSFWorkbook直接报错了

 

ImportExcel ei = new ImportExcel(multipartFile, headerNum, sheetIndex);

 

这句话就直接走不动了,后面的逻辑压根没跑;这时候心理还是有侥幸心理调整了生产环境运行内存大小,继续拖着,其实问题并没有根本解决;

二,用户隔段时间就会找你去导入,因为他自己导不了,真的很烦,开始着手修改,根据学习和了解Easyexcel  是对POI 导入导出的一种重写,尽可能的优化了一些内存益出,让使用者更加简单方便,使用起来也很方便,阿里开源项目,看起来挺不错用起来吧,第二版功能应用而生

 

// 有个很重要的点 DemoDataListener 不能被spring管理,要每次读取excel都要new,然后里面用到spring可以构造方法传进去
public class UploadDataListener implements ReadListener<ExcelSaveChemicalData> {

    private static MsdbChemicalSearchService msdbChemicalSearchService = SpringBootContext.getBean(MsdbChemicalSearchService.class);
    private static ImportDataService importDataService = SpringBootContext.getBean(ImportDataService.class);
    private static AsyncService asyncService = SpringBootContext.getBean(AsyncService.class);
    private static AsyncEsCrudService asyncEsCrudService = SpringBootContext.getBean(AsyncEsCrudService.class);
    private Logger log = LoggerFactory.getLogger(UploadDataListener.class);
    private HashSet<String> strings = new HashSet<>();
    /**
     * 每隔5条存储数据库,实际使用中可以100条,然后清理list ,方便内存回收
     */
    private static final int BATCH_COUNT = 1000;
    long start = System.currentTimeMillis();
    private ImportTask userTask = importDataService.getUserTask();
    private List<ExcelSaveChemicalData> resourceList = ListUtils.newArrayListWithExpectedSize(BATCH_COUNT);
    private List<ChemicalCompoundInfoVo> selQuselIdKey = ListUtils.newArrayList();
    private Map<String, Long> twoTypesMaps = ChemicalRedisUntil.selTwoType();
    private Map<String, Long> oneTypeMaps = ChemicalRedisUntil.selOneType();


    @Override
    public void invoke(ExcelSaveChemicalData data, AnalysisContext context) {
        data.setIndexNum(context.getCurrentRowNum().intValue() + 1);
        resourceList.add(data);
    }

    @Override
    public void invokeHead(Map<Integer, ReadCellData<?>> headMap, AnalysisContext context) {

    }

    /**
     * 所有数据解析完成了 都会来调用
     *
     * @param context
     */
    @Override
    public void doAfterAllAnalysed(AnalysisContext context) {

        // 验证数据处理
        int size = resourceList.size();
        //适配线程池数与分组长度
        //Math.ceil()对小数向下“”取整”
        int batches = (int) Math.ceil(size * 1.0 / BATCH_COUNT);
        int startIndex, toIndex;
        CountDownLatch countDownLatch = new CountDownLatch(batches);
        for (int i = 0; i < batches; i++) {
            //开始索引位置
            startIndex = i * BATCH_COUNT;
            //截止索引位置
            toIndex = startIndex + BATCH_COUNT;
            //如果截止索引大于最大索引,截止索引等于最大索引
            if (toIndex > size) {
                toIndex = size;
            }
            //截取数组
            List<ExcelSaveChemicalData> listSub = resourceList.subList(startIndex, toIndex);
            if (listSub == null || listSub.size() == 0) {
                continue;
            }
            ChemicalCompoundPo po = new ChemicalCompoundPo();
            listSub.forEach(e -> {
                strings.add(e.getQuselId());
            });
            po.setQuselIds(strings);
            selQuselIdKey = msdbChemicalSearchService.selectList(po, Collections.singletonList(SortBuilders.fieldSort("quselId")));
            log.info("es sqlQuselIdKey" + selQuselIdKey.size());
            Map<String, String> quselIdKey = ListToMapUntil.toMapQuselIdKey(selQuselIdKey);
            Map<String, String> chemicalNameKey = ListToMapUntil.toMapCommNameKey(selQuselIdKey);
            Map<String, Long> chemicalIdIsValue = ListToMapUntil.toMapIdIsValue(selQuselIdKey);
            Map<String, ChemicalCompoundInfoVo> quselIdKeyChemicalIsValue = ListToMapUntil.toMapQuselIdKeyChemicalIsValue(selQuselIdKey);
            log.info("es quselIdKeyChemicalIsValue = " + quselIdKeyChemicalIsValue.size());
            // JdbcKeysUtils.mysqlKeysDisAble();
            asyncService.executeAsync(listSub, selQuselIdKey, quselIdKey, chemicalNameKey, chemicalIdIsValue, quselIdKeyChemicalIsValue,
                    twoTypesMaps, oneTypeMaps, userTask, countDownLatch);
            // JdbcKeysUtils.mysqlKeysEnable();
        }
        try {
            countDownLatch.await(); //保证之前的所有的线程都执行完成,才会走下面的;
            // 这样就可以在下面拿到所有线程执行完的集合结果
        } catch (Exception e) {
            log.error("阻塞异常:" + e.getMessage());
        }
        long end = System.currentTimeMillis();
        log.info("数据组装完成" + (end - start));
        if (ConverterUtil.isEmpty(userTask)) {
            userTask = new ImportTask();
            userTask.setStatus(ImportTask.ST_RUNNING);
            log.info("创建任务的状态++++++++++++++++++" + userTask.getStatus());
        }
        if (userTask.getMsgList().size() == 0) {
            userTask = new ImportTask();
            userTask.setStatus(ImportTask.ST_WAIT);
            userTask.addMessage("数据加载完成");
            userTask.addMessage("导入成功");
            ChemicalRedisUntil.setTaskMsgList(userTask);
            log.info("获取任务的状态++++++++++++++++++" + userTask.getStatus());
        } else if (userTask.getStatus().equals(ImportTask.ST_RUNNING) && userTask.getMsgList().size() == 0) {
            userTask.setStatus(ImportTask.ST_WAIT);
            userTask.addMessage("数据加载完成");
            userTask.addMessage("导入成功");
            ChemicalRedisUntil.setTaskMsgList(userTask);
        } else if (userTask.getStatus().equals(ImportTask.ST_WAIT)) {
            userTask.setStatus(ImportTask.ST_WAIT);
            userTask.addMessage("数据加载完成");
            ChemicalRedisUntil.setTaskMsgList(userTask);
        } else {
            userTask.setStatus(ImportTask.ST_WAIT);
            userTask.addMessage("数据加载完成");
            ChemicalRedisUntil.setTaskMsgList(userTask);
        }
        asyncEsCrudService.executeAsync(countDownLatch);
        // 清空缓存
        ChemicalRedisUntil.clearChemicalRedis();
        log.info("所有数据解析完成!");
    }
}
主要逻辑就是在这个UploadDataListener 中;

 

其实这一版本中逻辑代码也挺繁琐,中间有很多对象的创建,也没有使用线程池的概念,经过一段时间的调整和优化

线程池和数据库的分批量插入 整合进来由原来的单条新增变为批量插入,在逻辑代码中尽量减少对象的创建;

一些需要查询数据库的基础数据坚决走缓存;

一些列表的查询引入了elasticsearch,同时导入的一些验证也开始走elasticsearch了;

 

public class JdbcUtils {

    @Autowired
    private static DataSource dataSource = SpringBootContext.getBean(DataSource.class);

    public static void saveDataTest(List<ExcelSaveChemicalData> dataList) {

        Logger log = LoggerFactory.getLogger(JdbcUtils.class);
        // 查看默认的数据源
        // 获得数据库连接
        Connection conn = null;
        PreparedStatement ps = null;
        try {
            conn = dataSource.getConnection();
            try {
                long start = System.currentTimeMillis();
                log.info(dataList.size() + "条,开始导入到数据库时间:" + (System.currentTimeMillis() - start)/1000f + "秒");
                //控制事务:默认不提交
                conn.setAutoCommit(false);
                String sql = "insert into chemical_compound (id, common_name, common_name_cn, other_name, other_name_cn, chemical_name, " +
                        "chemical_name_cn,molecular_formula, molecular_weight, chemical_structure,chemical_structure2,chemical_structure3," +
                        "chemical_structure_mol,chemical_structure2_mol,chemical_structure3_mol," +
                        "isotopic_peak_1,isotopic_peak_2,isotopic_peak_3,isotopic_peak_4,isotopic_peak_5, cas_reg_no," +
                        "solvent,qusel_id,iupac_name_en,iupac_name_cn,monoisotopic_peak,exact_mass," +
                        "adduct_ion,add_result,adduct_ion1,add_result1,adduct_ion2,add_result2,adduct_ion3,add_result3,adduct_ion4,add_result4," +
                        "chemspider_id,pubchem_id,note,smiles,inchl_key,himd_bi,tox_cast,iecsc," +
                        "note_a,note_b,note_c,note_d,note_e,note_f,note_g,note_h," +
                        "chiral_compound,chiral_center,stereoisomeric_number,stereoisometric_structure,stereoisometric_structure_mol,stere_activ_behavi," +
                        "create_by, create_date, update_by, update_date,remarks, del_flag) values";
                sql += "(?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?," +
                        "?,?,?,?,?,?,?,?,?,?)";
                ps = conn.prepareStatement(sql);
                log.info(sql);
                //循环结果集:这里循环不支持"java8"表达式
                for (int i = 0; i < dataList.size(); i++) {
                    final ExcelSaveChemicalData chemical = dataList.get(i);
                    ps.setLong(1, chemical.getId());
                    ps.setString(2, chemical.getCommonName());
                    ps.setString(3, chemical.getCommonNameCn());
                    ps.setString(4, chemical.getOtherName());
                    ps.setString(5, chemical.getOtherNameCn());
                    ps.setString(6, chemical.getChemicalName());
                    ps.setString(7, chemical.getChemicalNameCn());
                    ps.setString(8, chemical.getMolecularFormula());
                    ps.setString(9, chemical.getMolecularWeight());
                    ......
                    ps.setString(64, chemical.getRemarks());
                    ps.setString(65, chemical.getDelFlag());

                    ps.addBatch();
                }
                //执行批处理
                ps.executeBatch();
                //手动提交事务
                conn.commit();
                log.info(dataList.size() + "条,导入用时:" + (System.currentTimeMillis() - start)/1000f + "秒");
            } catch (Exception e) {
                e.printStackTrace();
            }
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }
}

 

批量的这部分周末跟同事一起讨论还留下了很深的印象,这一版本上线后,也坚持了一段时间,客户从原来只能导入少量数据,到可以多次导入几个大一些的文件,也算是进步了,随着时间的推移,使用系统的用户也越来越多了,整理的导入文件也越来越多,导入的功能程序还是成功拉垮了服务中其它功能。使用Easyexcel网上都说解决问题了,或许我没用好,我的问题还是没解决

 

OutOfMemoryError:Javaheap space;

 

这个内存溢出一直困扰着我,它从未离开.....

记得大概那是快过年的那会,或许要年终总结一部分客户在不断的导入数据,数据库不断的做插入操作,一直占用连接不释放,严重的影响项目中其他功能使用了;哈哈哈哈;

你人还怪好的,还过什么年,苦逼了我这个打工人啊;

三,这次功能不好用,收到了更多的需要导入的文件压缩包,有的整理的对,有的整理的不对,这些都不重要了,总之那种每天让你去做重复动作的事情还是找过来了,这次又协调了一位项目管理的女生来做这个拆分导入工作,DB 也承担了一部分导入工作,别提多尴尬了,功能有问题,这个问题始终没解决;自己还是继续优化导入功能吧,又一段时间的学习和了解到 mysql 数据库的一些特性,插入慢可能是索引创建的多,在插入数据时,先将这张表的索引删除,在插入会不会提高;也看到了Load data local infile 命令是专门导入大批量数据使用的高效命令,最后决定在这个方向上试试,因为删除索引完以后还需要重新创建索引;创建索引也听消耗时间,先试试吧,有了方向,就动手继续优化吧

public class DataLoadScriptGenerator {

    @Autowired
    private static DataSource dataSource = SpringBootContext.getBean(DataSource.class);

    public static synchronized ResultPoJo<Object> generateLoadDataScript(List<ExcelSaveChemicalData> dataList) {
        ResultPoJo<Object> result = new ResultPoJo<Object>(ConstantCodeMsg.NOMAL);
        Logger log = LoggerFactory.getLogger(DataLoadScriptGenerator.class);
        try {
            long start = System.currentTimeMillis();
            log.info(dataList.size() + "LOAD DATA LOCAL START" + (System.currentTimeMillis() - start) / 1000f + "秒");
            // 创建临时文本文件
            FileWriter fileWriter = new FileWriter("file.csv");
            BufferedWriter bufferedWriter = new BufferedWriter(fileWriter);

            // 关闭文件写入流
            for (ExcelSaveChemicalData chemical : dataList) {
                // 构建一行数据,字段之间使用制表符分隔
                String line = Stream.of(
                                Optional.ofNullable(chemical.getId()).map(Object::toString).orElse(""),
                                Optional.ofNullable(chemical.getCommonName()).orElse(""),
                                Optional.ofNullable(chemical.getCommonNameCn()).orElse(""),
                                Optional.ofNullable(chemical.getOtherName()).orElse(""),
                                Optional.ofNullable(chemical.getOtherNameCn()).orElse(""),
                                Optional.ofNullable(chemical.getChemicalName()).orElse(""),
                                Optional.ofNullable(chemical.getChemicalNameCn()).orElse(""),
                                Optional.ofNullable(chemical.getMolecularFormula()).orElse(""),
                                Optional.ofNullable(chemical.getMolecularWeight()).orElse(""),
                                Optional.ofNullable(chemical.getChemicalStructure()).orElse(""),
                                Optional.ofNullable(chemical.getChemicalStructure2()).orElse(""),
                                Optional.ofNullable(chemical.getChemicalStructure3()).orElse(""),
                                Optional.ofNullable(chemical.getCasRegNo()).orElse(""),
                                Optional.ofNullable(chemical.getCompoundType()).map(Object::toString).orElse(""),
                                Optional.ofNullable(chemical.getChemicalCategory()).map(Object::toString).orElse(""),
                                Optional.ofNullable(chemical.getSolvent()).orElse(""),
                                Optional.ofNullable(chemical.getSolventCn()).orElse(""),
                                Optional.ofNullable(chemical.getCreateBy()).orElse(""),
                                Optional.ofNullable(chemical.getCreateDate()).map(Object::toString).orElse(""),
                                Optional.ofNullable(chemical.getUpdateBy()).orElse(""),
                                Optional.ofNullable(chemical.getUpdateDate()).map(Object::toString).orElse(""),
                                Optional.ofNullable(chemical.getRemarks()).orElse(""),
                                Optional.ofNullable(chemical.getDelFlag()).orElse(""),
                                Optional.ofNullable(chemical.getQuselId()).orElse(""),
                                Optional.ofNullable(chemical.getIupacName()).orElse(""),
                                Optional.ofNullable(chemical.getIupacNameCn()).orElse(""),
                                Optional.ofNullable(chemical.getExactMass()).map(value -> value != null ? String.format("%.2f", value) : "NULL").orElse("NULL"),
                                Optional.ofNullable(chemical.getMonoisotopicPeak()).map(value -> value != null ? String.format("%.2f", value) : "NULL").orElse("NULL"),
                                Optional.ofNullable(chemical.getAdductIon()).map(value -> value != null ? String.format("%.2f", value) : "NULL").orElse("NULL"),
                                Optional.ofNullable(chemical.getAddResult()).map(value -> value != null ? String.format("%.2f", value) : "NULL").orElse("NULL"),
//                                Optional.ofNullable(chemical.getAdductIon()).map(value -> String.format("%.2f", value)).orElse("NULL"),
//                                Optional.ofNullable(chemical.getAddResult()).map(value -> String.format("%.2f", value)).orElse("NULL"),
                                Optional.ofNullable(chemical.getChemspiderId()).orElse(""),
                                Optional.ofNullable(chemical.getPubchemId()).orElse(""),
                                Optional.ofNullable(chemical.getIsotopicPeak1()).map(value -> String.format("%.2f", value)).orElse(""),
                                Optional.ofNullable(chemical.getIsotopicPeak2()).map(value -> String.format("%.2f", value)).orElse(""),
                                Optional.ofNullable(chemical.getIsotopicPeak3()).map(value -> String.format("%.2f", value)).orElse(""),
                                Optional.ofNullable(chemical.getIsotopicPeak4()).map(value -> String.format("%.2f", value)).orElse(""),
                                Optional.ofNullable(chemical.getIsotopicPeak5()).map(value -> String.format("%.2f", value)).orElse(""),
                                Optional.ofNullable(chemical.getNote()).orElse(""),
                                Optional.ofNullable(chemical.getChiralCompound()).orElse(""),
                                Optional.ofNullable(chemical.getChiralCenter()).orElse(""),
                                Optional.ofNullable(chemical.getStereoisomericNumber()).map(value -> String.format("%.2f", value)).orElse(""),
                                Optional.ofNullable(chemical.getStereoisometricStructure()).orElse(""),
                                Optional.ofNullable(chemical.getStereActivBehavi()).orElse(""),
                                Optional.ofNullable(chemical.getSmiles()).orElse(""),
                                Optional.ofNullable(chemical.getInchlKey()).orElse(""),
                                Optional.ofNullable(chemical.getHimdBi()).orElse(""),
                                Optional.ofNullable(chemical.getToxCast()).orElse(""),
                                Optional.ofNullable(chemical.getIecsc()).orElse(""),
                                Optional.ofNullable(chemical.getNoteA()).orElse(""),
                                Optional.ofNullable(chemical.getNoteB()).orElse(""),
                                Optional.ofNullable(chemical.getNoteC()).orElse(""),
                                Optional.ofNullable(chemical.getNoteD()).orElse(""),
                                Optional.ofNullable(chemical.getNoteE()).orElse(""),
                                Optional.ofNullable(chemical.getNoteF()).orElse(""),
                                Optional.ofNullable(chemical.getNoteG()).orElse(""),
                                Optional.ofNullable(chemical.getNoteH()).orElse(""),
                                Optional.ofNullable(chemical.getAdductIonMass()).orElse(""),
                                Optional.ofNullable(chemical.getAdductIon1()).map(value -> String.format("%.2f", value)).orElse(""),
                                Optional.ofNullable(chemical.getAddResult1()).map(value -> String.format("%.2f", value)).orElse(""),
                                Optional.ofNullable(chemical.getAdductIon2()).map(value -> String.format("%.2f", value)).orElse(""),
                                Optional.ofNullable(chemical.getAddResult2()).map(value -> String.format("%.2f", value)).orElse(""),
                                Optional.ofNullable(chemical.getAdductIon3()).map(value -> String.format("%.2f", value)).orElse(""),
                                Optional.ofNullable(chemical.getAddResult3()).map(value -> String.format("%.2f", value)).orElse(""),
                                Optional.ofNullable(chemical.getAdductIon4()).map(value -> String.format("%.2f", value)).orElse(""),
                                Optional.ofNullable(chemical.getAddResult4()).map(value -> String.format("%.2f", value)).orElse(""),
                                Optional.ofNullable(chemical.getChemicalStructureMol()).orElse(""),
                                Optional.ofNullable(chemical.getChemicalStructure2Mol()).orElse(""),
                                Optional.ofNullable(chemical.getChemicalStructure3Mol()).orElse(""),
                                Optional.ofNullable(chemical.getStereoisometricStructureMol()).orElse("")
                        )
                        .collect(Collectors.joining("\t"));

                // 将数据行写入文本文件
                bufferedWriter.write(line);
                bufferedWriter.newLine();
            }
            bufferedWriter.close();
            fileWriter.close();
            // 构建LOAD DATA INFILE命令
            String loadDataCommand = String.format(
                    "LOAD DATA LOCAL INFILE '%s' INTO TABLE chemical_compound FIELDS TERMINATED BY '\\t' LINES TERMINATED BY '\\n'" +
                            "set compound_type = NULLIF(compound_type,''),chemical_category = NULLIF(chemical_category,'')," +
                            "exact_mass = NULLIF(exact_mass,''),monoisotopic_peak = NULLIF(exact_mass,''),adduct_ion = NULLIF(adduct_ion,'')," +
                            "add_result = NULLIF(add_result,''),isotopic_peak_1 = NULLIF(isotopic_peak_1,''),isotopic_peak_2 = NULLIF(isotopic_peak_2,'')," +
                            "isotopic_peak_3 = NULLIF(isotopic_peak_3,''),isotopic_peak_4 = NULLIF(isotopic_peak_4,''),isotopic_peak_5 = NULLIF(isotopic_peak_5,'')," +
                            "adduct_ion1 = NULLIF(adduct_ion1,''),add_result1 = NULLIF(add_result1,''),adduct_ion2 = NULLIF(adduct_ion2,'')," +
                            "add_result2 = NULLIF(add_result2,''),adduct_ion3 = NULLIF(adduct_ion3,''),add_result3 = NULLIF(add_result3,'')," +
                            "adduct_ion4 = NULLIF(adduct_ion4,''),add_result4 = NULLIF(add_result4,''),stereoisomeric_number = NULLIF(stereoisomeric_number,'');",
                    "file.csv"
            );

            // 打印LOAD DATA INFILE命令
            System.out.println(loadDataCommand);
            WallConfig config = new WallConfig();
            config.setMultiStatementAllow(true);
//            config.setLoadFileAllow(true);
            config.setNoneBaseStatementAllow(true);
            final Statement statement = dataSource.getConnection().createStatement();
            statement.execute(loadDataCommand);
            statement.close();
            log.info(dataList.size() + "LOAD DATA LOCAL START END" + (System.currentTimeMillis() - start) / 1000f + "秒");
            // 执行完删除文件
            File file = new File("file.csv");
            if (file.exists()) {
                file.delete();
            }

        } catch (SQLException | IOException e) {
            result.setCode(ConstantCodeMsg.ERR_814);
            result.setMsg(ConstantCodeMsg.MSG_814);
            e.printStackTrace();
        }
        return result;
    }
}  load data local infile 命令的使用代码

这次优化将客户整理的excel 文档数据首先用 easyExcel 全部读取到程序中,接着去进行校验,数据封装,在异步的分批将数据 写进临时文件中,在使用load data local infile命令 来落地数据;同时业务中也引入了对象池的使用来减少对象的创建来减少对内存的开销

 

@Component
public class ChemicalCategoryRelaBuffer {

    private Queue<ChemicalCategoryRela> buffer;
    private Integer minSize;
    private Integer maxSize;

    public void setBufferSize(Integer minSize, Integer maxSize) {
        this.minSize = minSize;
        this.maxSize = maxSize;
        this.buffer = new LinkedList<>();
        initializeBuffer(minSize);
    }

    private void initializeBuffer(Integer minSize) {
        for (int i = 0; i < minSize; i++) {
            buffer.offer(new ChemicalCategoryRela());
        }
    }

    public ChemicalCategoryRela borrow() {
        // 根据需要动态增加缓冲池的大小,不超过最大值
        if (buffer.size() < maxSize) {
            buffer.offer(new ChemicalCategoryRela());
        } else {
            // 如果达到最大值,可以根据需求进行处理,例如等待一段时间再尝试借用
            // 或者抛出异常,具体处理方式取决于应用程序的需求
            throw new IllegalStateException("缓冲池已满,无法借用对象");
        }
        return buffer.poll();
    }

    public void release(ChemicalCategoryRela chemicalCategoryRela) {
        buffer.offer(chemicalCategoryRela); // 将对象放回缓冲区
    }

 

对象池的使用

 private void chemicalRelaUseExcelData(ExcelSaveChemicalData data, Integer i, Integer j, ImportTask userTask, List<ChemicalCategoryRela> chemicalCatgMageList,
                                          Map<String, Long> oneTypesMaps, Map<String, Long> twoTypesMaps, List<ExcelSaveChemicalData> resourceList) {
        for (int k = 1; k <= 6; k++) {
            String compoundType = data.getCompoundType(k);
            if (ConverterUtil.isNotEmpty(compoundType)) {
                Long ontTypeId = oneTypesMaps.get(compoundType);
                if (ConverterUtil.isEmpty(ontTypeId)) {
                    userTask.setStatus(ImportTask.ST_WAIT);
                    userTask.addMessage(ImportMessage.ST_ERROR, "第" + i + "行【" + ImportUtil.indexToColumn(j + k).concat("列的单元格内容,在分类列表中不存在"));
                } else {
                    ChemicalCategoryRelaBuffer chemicalCategoryRelaBuffer = new ChemicalCategoryRelaBuffer();
                    chemicalCategoryRelaBuffer.setBufferSize(1, resourceList.size() * 6);
                    ChemicalCategoryRela chemicalCate = chemicalCategoryRelaBuffer.borrow();
                    try {
                        chemicalCate.setOnetypeId(ontTypeId);
                        String[] categoryFields = {data.getEffectClassification(k), data.getChemicalCategory(k),
                                data.getFourthCategory(k), data.getFifthCategory(k)};
                        for (int l = 0; l < categoryFields.length; l++) {
                            String categoryField = categoryFields[l];
                            if (ConverterUtil.isNotEmpty(categoryField)) {
                                String ontTypeIdTypeNameCnKey = ontTypeId.toString().concat(String.valueOf(l + 1)).concat(categoryField);
                                Long cateGoryId = twoTypesMaps.get(ontTypeIdTypeNameCnKey);
                                if (ConverterUtil.isEmpty(cateGoryId)) {
                                    userTask.setStatus(ImportTask.ST_WAIT);
                                    userTask.addMessage(ImportMessage.ST_ERROR, "第" + i + "行【" + ImportUtil.indexToColumn(j + k + l).concat("列的单元格内容,在分类列表中不存在"));
                                } else {
                                    switch (l) {
                                        case 0:
                                            chemicalCate.setUseId(cateGoryId);
                                            break;
                                        case 1:
                                            chemicalCate.setStructureId(cateGoryId);
                                            break;
                                        case 2:
                                            chemicalCate.setFourhtId(cateGoryId);
                                            break;
                                        case 3:
                                            chemicalCate.setFifthId(cateGoryId);
                                            break;
                                    }
                                }
                            }
                        }

                        if (!ConverterUtil.isEmpty(chemicalCate.getOnetypeId(), chemicalCate.getUseId(),
                                chemicalCate.getStructureId(), chemicalCate.getFourhtId(), chemicalCate.getFifthId())) {
                            chemicalCate.setCompoundId(data.getId());
                            chemicalCate.preInsert();
                            chemicalCatgMageList.add(chemicalCate);
                        }
                    } catch (Exception e) {
                        e.printStackTrace();
                    } finally {
                        chemicalCategoryRelaBuffer.release(chemicalCate);
                    }
                }
            }
        }
    }

这一版目前看还可以使用着,没有出现过严重问题。。。但是下一次优化又怎么去做呢?

项目中的数据从原来的几十条到现在的500多万,根据从客户那里了解到未来肯定会有上千万的数据。

看着数据的不断新增,想想自己缓慢的积累,我不知道下一个版本会出现在什么时候......

项目中不对的地方,希望各位大神不吝赐教,大神的想法也都畅所欲言,真诚的希望您提出的新想法;