Installing Transformers/SpaCy in a Python development environment on an Android phone (Termux)

Published: 2023-06-20 13:34:11  Author: abaelhe
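  1. Install Rust (the Python library safetensors needs a Rust toolchain to build):
    $ rm -rf ~/.cargo # remove any leftover older Rust installation
    $ pkg install rust # afterwards, ideally exit all Termux sessions
    $ mkdir -p ~/.cargo # recreate Rust's per-user configuration directory
    # Point Rust at a local mirror to speed up downloads.
    # Note: RUSTUP_DIST_SERVER is the mirror used by rustup; a crates.io mirror is configured
    # separately in ~/.cargo/config.toml. ~/.cargo/env only takes effect if your shell profile sources it.
    $ echo 'export RUSTUP_DIST_SERVER="https://mirrors.tuna.tsinghua.edu.cn/rustup"' >> ~/.cargo/env
    $ rustc --version # run this after restarting Termux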
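
The `rustc --version` check in step 1 can also be scripted, which is handy before starting the long pip builds later on. The following is a minimal sketch, not part of the original procedure; the helper name `tool_version` is hypothetical, and it only assumes that the tools accept `--version`.

```python
import shutil
import subprocess


def tool_version(command):
    """Return the first line of `<command> --version`, or None if unavailable."""
    path = shutil.which(command)  # None when the command is not on PATH
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    if result.returncode != 0:
        return None
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    # rustc and cargo are the two tools the later pip builds rely on.
    for name in ("rustc", "cargo"):
        print(name, "->", tool_version(name) or "not found on PATH")
```

If either tool is reported as missing, restarting Termux (as step 1 suggests) is the first thing to try.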

  2. Install opencv, numpy, scipy, pandas, … from the Termux package repository:
    $ pkg install python opencv-python vim-python
    $ pkg install python-{numpy,scipy,pandas,pillow}
    $ pkg install python-torch{,audio,vision}
    $ pkg install protobuf{,-dev} google{test,-glog}
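
After step 2 it can be useful to confirm that the Termux-packaged libraries are visible to Python before moving on. This is a small sketch under the assumption that the packages expose their usual import names (`cv2`, `numpy`, `scipy`, `pandas`, `PIL`, `torch`, `torchaudio`, `torchvision`); the helper `missing_modules` is a hypothetical name.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of top-level module names that Python cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Usual import names for the packages installed in step 2 (an assumption).
    expected = ["cv2", "numpy", "scipy", "pandas", "PIL",
                "torch", "torchaudio", "torchvision"]
    missing = missing_modules(expected)
    print("missing:", ", ".join(missing) if missing else "none")
```

`find_spec` only locates the modules without importing them, so the check stays fast even for heavy packages such as torch.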

  3. Install SpaCy:
    # Edit the dependency files (pyproject.toml, setup.cfg, setup.py) of the thinc/spacy/spacy-transformers
    # packages so that they accept the latest numpy (1.25.0 at the time of writing);

    $ pip install thinc spacy spacy-pkuseg
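
The dependency edit described in step 3 amounts to loosening the numpy requirement. As a rough sketch, assuming the pin is written in the form `numpy>=X,<Y`, the upper bound could be stripped with a regular expression. The sample line in the usage below is hypothetical and not copied from those packages, so check the real files before applying anything like this; `relax_numpy_pin` is an illustrative name.

```python
import re

# Matches a numpy lower bound followed by an upper bound, e.g. "numpy>=1.15.0,<1.25.0".
_NUMPY_PIN = re.compile(r"\b(numpy\s*>=\s*[\d.]+)\s*,\s*<\s*[\d.]+")


def relax_numpy_pin(text):
    """Drop the upper bound from every `numpy>=X,<Y` requirement in text."""
    return _NUMPY_PIN.sub(r"\1", text)


if __name__ == "__main__":
    sample = "install_requires =\n    numpy>=1.15.0,<1.25.0\n"  # hypothetical content
    print(relax_numpy_pin(sample))
```

Requirements that do not match the assumed pattern are left untouched, so the function is safe to run over all three file types.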

  4. Install SentencePiece (unsupervised tokenizer and detokenizer):
    $ git clone https://github.com/google/sentencepiece.git
    $ cd sentencepiece
    $ mkdir build && cd build
    # When building on Android, the -llog linker flag must be specified
    $ LDFLAGS="-llog" cmake .. -DSPM_ENABLE_SHARED=ON -DCMAKE_INSTALL_PREFIX=/data/data/com.termux/files/usr
    $ make install
    $ cd ../python && python setup.py bdist_wheel
    $ pip install dist/sentencepiece*.whl
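
Before building the Python wheel, it may help to confirm that `make install` actually placed the shared libraries under the install prefix given to cmake. The sketch below assumes the libraries land in `<prefix>/lib` with names starting with `libsentencepiece`, which is the usual cmake layout but is not stated in the original; `installed_libs` is a hypothetical helper.

```python
from pathlib import Path

# Install prefix taken from the cmake command in step 4.
TERMUX_PREFIX = "/data/data/com.termux/files/usr"


def installed_libs(prefix=TERMUX_PREFIX, stem="libsentencepiece"):
    """List file names in <prefix>/lib whose names start with stem."""
    lib_dir = Path(prefix) / "lib"
    if not lib_dir.is_dir():
        return []
    return sorted(p.name for p in lib_dir.glob(stem + "*"))


if __name__ == "__main__":
    found = installed_libs()
    print("\n".join(found) if found else "no libsentencepiece files found")
```

An empty result suggests the install step failed or used a different prefix, in which case the wheel build in `../python` is likely to fail at link time.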

  5. Install the Python libraries that transformers depends on:
    $ pip install safetensors # Rust downloads crates over the network during the build
    $ pip install protobuf tokenizers sentencepiece

  6. Install the transformers library (this goes smoothly if the preparation above was done properly):
    $ pip install 'transformers[sentencepiece]' # Hugging Face series
    $ pip install spacy-{alignments,transformers}
    $ python -m spacy download zh_core_web_trf
    $ python -m spacy download en_core_web_trf
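    # When downloading the spaCy models, it is best to use a browser that supports resumable downloads (for example Microsoft Edge).
    # Before downloading with the spacy library, print the URL of the file to be downloaded and paste it into Edge to download it there.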
    # https://github.com/explosion/spacy-models/releases/download/en_core_web_trf-3.5.0/en_core_web_trf-3.5.0-py3-none-any.whl
    # https://github.com/explosion/spacy-models/releases/download/zh_core_web_trf-3.5.0/zh_core_web_trf-3.5.0-py3-none-any.whl
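
The two URLs above follow the same pattern, so the link to paste into the browser can be generated instead of typed. This is a minimal sketch that only reproduces the pattern visible in those two URLs; the helper name `model_wheel_url` is hypothetical, and other model names or versions are assumed, not verified, to follow the same layout.

```python
RELEASE_BASE = "https://github.com/explosion/spacy-models/releases/download"


def model_wheel_url(name, version):
    """Build the release URL of a spaCy model wheel from its name and version."""
    tag = f"{name}-{version}"
    return f"{RELEASE_BASE}/{tag}/{tag}-py3-none-any.whl"


if __name__ == "__main__":
    for model in ("en_core_web_trf", "zh_core_web_trf"):
        print(model_wheel_url(model, "3.5.0"))
```

Once the browser has finished the download, the wheel file can be installed by passing its local path to `pip install`.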