Notes on Setting Up a Private Knowledge Base

Published: 2023-08-27 14:56:00 | Author: NoobSir

I. Choosing a private knowledge base:

II. Installation notes

  • Download resources:

    # Hugging Face model repositories store their weights with Git LFS; run `git lfs install` first
    git clone https://huggingface.co/THUDM/chatglm2-6b-32k
    git clone https://huggingface.co/moka-ai/m3e-base
    git clone https://github.com/chatchat-space/Langchain-Chatchat.git
    cd Langchain-Chatchat
    
  • conda environment

    conda create -n chatchat python=3.10
    conda activate chatchat
    pip install --upgrade pip
    pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
    # pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu121
    conda install spacy
    pip install cchardet
    pip install accelerate
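
Before moving on, it can help to confirm that the environment created above actually contains the packages and that PyTorch can see the GPU. The following is a minimal sketch, not part of the original notes; the helper names `check_environment` and `cuda_available` are hypothetical.

```python
import importlib.util
import sys


def check_environment(packages=("torch", "spacy", "cchardet", "accelerate")):
    """Return the Python version and whether each package can be found."""
    report = {"python": f"{sys.version_info.major}.{sys.version_info.minor}"}
    for name in packages:
        # find_spec returns None when the package is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    return report


def cuda_available():
    """Return True only if torch is installed and reports a usable CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch  # imported lazily so the script still runs without torch

    return torch.cuda.is_available()


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
    print(f"CUDA available: {cuda_available()}")
```

Run it inside the activated `chatchat` environment. If CUDA is reported as unavailable, the cu118 wheel may not match the installed driver.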
    
  • Setting up chatchat

    pip install -r requirements.txt
    cd configs
    cp ./model_config.py.example ./model_config.py
    # In embedding_model_dict:
    # "m3e-base": r"D:\Files\projects\chatchat\models\m3e-base"
    # In llm_model_dict:
    # "local_model_path": r"D:\Files\projects\chatchat\models\chatglm2-6b-32k"
    # (raw strings keep the backslashes in Windows paths from being read as escape sequences)
    cp ./server_config.py.example ./server_config.py
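
The two path edits described in the comments above could look roughly like the sketch below. Only the `m3e-base` key, the `local_model_path` key and the two paths come from these notes. The `chatglm2-6b-32k` entry name, the `MODEL_ROOT` constant and the `missing_model_dirs` helper are assumptions for illustration, and the real `model_config.py` has more fields than shown here.

```python
from pathlib import Path, PureWindowsPath

# Root folder from the notes above; adjust to wherever the models were cloned.
MODEL_ROOT = PureWindowsPath(r"D:\Files\projects\chatchat\models")

# Entry for embedding_model_dict in configs/model_config.py.
embedding_model_dict = {
    "m3e-base": str(MODEL_ROOT / "m3e-base"),
}

# Entry for llm_model_dict; the outer key name is an assumption.
llm_model_dict = {
    "chatglm2-6b-32k": {
        "local_model_path": str(MODEL_ROOT / "chatglm2-6b-32k"),
    },
}


def missing_model_dirs(paths):
    """Return the subset of paths that are not existing directories."""
    return [p for p in paths if not Path(p).is_dir()]


if __name__ == "__main__":
    configured = [
        embedding_model_dict["m3e-base"],
        llm_model_dict["chatglm2-6b-32k"]["local_model_path"],
    ]
    for path in missing_model_dirs(configured):
        print(f"Model directory not found: {path}")
```

Checking the directories before starting the service surfaces a mistyped path earlier than a model loading error would.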
    
  • Vector database configuration

    git clone --branch v0.4.4 https://github.com/pgvector/pgvector.git
    cd pgvector
    
    # PostgreSQL + pgvector
    # https://www.enterprisedb.com/downloads/postgres-postgresql-downloads
    # Download and install PostgreSQL 15
    # Run the following commands in cmd
    set PGROOT=C:\Program Files\PostgreSQL\15
    call "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\Build\vcvars64.bat"
    nmake /F Makefile.win
    nmake /F Makefile.win install
    
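    -- Log in as the superuser: .\psql.exe --username=postgres
    CREATE DATABASE TEST;
    -- Connect to the new database first; otherwise the extension is created in the current database.
    -- (The unquoted name TEST is folded to lowercase, so the database is named test.)
    \c test
    CREATE EXTENSION IF NOT EXISTS vector;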
    
    python -m spacy download en_core_web_sm
    python -m spacy download zh_core_web_sm
    pip install psycopg2 pgvector flask-mysqldb protobuf==3.20 filemagic
    pip install -r requirements.txt
    # Handling a pgvector error:
    # Error: KeyError: 'answer'
    # Langchain-Chatchat/server/knowledge_base/kb_service/base.py
    # Line 119: docs = self.do_search(query, top_k, embeddings)
    python init_database.py
    # python init_database.py --recreate-vs
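
For chatchat to use the PostgreSQL database created above, it needs a connection URI that points at it. The sketch below only builds such a URI; the `build_pg_uri` helper is hypothetical. Where the URI goes is an assumption about the project's configuration at that version, possibly a `connection_uri` entry under a `pg` section in `model_config.py` together with a default vector store type of `pg`, so check the example config file before relying on it.

```python
from urllib.parse import quote


def build_pg_uri(user, password, host="127.0.0.1", port=5432, database="test"):
    """Build a PostgreSQL connection URI with percent-encoded credentials."""
    # safe="" makes quote() escape characters such as '@', ':' and '/',
    # which would otherwise break the URI structure.
    user_enc = quote(user, safe="")
    password_enc = quote(password, safe="")
    return f"postgresql://{user_enc}:{password_enc}@{host}:{port}/{database}"


if __name__ == "__main__":
    # Placeholder credentials; replace with the real postgres password.
    print(build_pg_uri("postgres", "p@ss:word"))
    # postgresql://postgres:p%40ss%3Aword@127.0.0.1:5432/test
```

The default database name is `test` because `CREATE DATABASE TEST` without quotes creates a lowercase name.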
    
  • Start and run:

    python startup.py --all-webui