Multipart and Resumable Uploads to OSS with Python

Published: 2023-09-14 14:40:20 | Author: 蒲公英PGY

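Multipart upload and resumable upload are common operations when sending files to a cloud storage service. OSS (Object Storage Service) is the cloud storage service provided by Alibaba Cloud, and both operations can be implemented against it in Python. Below is a simple example showing how to use Python and the Alibaba Cloud OSS SDK to perform multipart and resumable uploads.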

First, make sure the Alibaba Cloud OSS SDK is installed. You can install it with the following command:

pip install oss2

Next, you can use the following code example to perform multipart and resumable uploads:

import os
import oss2

# Alibaba Cloud OSS access information
access_key_id = 'your_access_key_id'
access_key_secret = 'your_access_key_secret'
endpoint = 'your_endpoint'
bucket_name = 'your_bucket_name'

# Create the OSS client
auth = oss2.Auth(access_key_id, access_key_secret)
bucket = oss2.Bucket(auth, endpoint, bucket_name)

# Local path of the file to upload
local_file_path = 'path_to_local_file'
object_key = 'destination_object_key'  # Name of the object as stored on OSS

# Multipart upload
def multipart_upload(bucket, object_key, local_file_path, part_size=1024 * 1024):
    total_size = os.path.getsize(local_file_path)
    upload_id = None

    try:
        # Initiate the multipart upload and keep its upload ID
        upload_id = bucket.init_multipart_upload(object_key).upload_id

        # Number of parts, rounded up
        part_count = (total_size + part_size - 1) // part_size

        # Upload the parts one by one
        parts = []
        with open(local_file_path, 'rb') as fileobj:
            for i in range(part_count):
                offset = i * part_size
                size = min(part_size, total_size - offset)
                fileobj.seek(offset)
                # SizedFileAdapter limits the read to `size` bytes from the current position
                result = bucket.upload_part(object_key, upload_id, i + 1, oss2.SizedFileAdapter(fileobj, size))
                parts.append(oss2.models.PartInfo(i + 1, result.etag))

        # Complete the multipart upload
        bucket.complete_multipart_upload(object_key, upload_id, parts)
        print(f'Successfully uploaded {object_key}')
    except Exception as e:
        print(f'Error uploading {object_key}: {e}')
        if upload_id:
            # Abort on error; this discards the uploaded parts, so the upload cannot be resumed afterwards
            bucket.abort_multipart_upload(object_key, upload_id)
            print(f'Upload of {object_key} aborted')

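# Resumable upload
def resume_upload(bucket, object_key, local_file_path, part_size=1024 * 1024):
    # An unfinished multipart upload is not a visible object, so object_exists()
    # cannot detect it; look for a pending upload ID for this key instead.
    upload_id = None
    for upload in oss2.MultipartUploadIterator(bucket, prefix=object_key):
        if upload.key == object_key:
            upload_id = upload.upload_id
            break

    if upload_id is None:
        # Nothing to resume: run a fresh multipart upload
        multipart_upload(bucket, object_key, local_file_path, part_size)
        return

    print(f'Resuming upload of {object_key}')
    # Parts already stored on OSS (part_size must match the interrupted attempt)
    uploaded = {p.part_number: p.etag for p in oss2.PartIterator(bucket, object_key, upload_id)}

    total_size = os.path.getsize(local_file_path)
    part_count = (total_size + part_size - 1) // part_size
    parts = []
    with open(local_file_path, 'rb') as fileobj:
        for part_number in range(1, part_count + 1):
            if part_number not in uploaded:
                # Seek to this part's offset and upload only the missing part
                offset = (part_number - 1) * part_size
                size = min(part_size, total_size - offset)
                fileobj.seek(offset)
                result = bucket.upload_part(object_key, upload_id, part_number, oss2.SizedFileAdapter(fileobj, size))
                uploaded[part_number] = result.etag
                print(f'Uploaded part {part_number}: {result.etag}')
            parts.append(oss2.models.PartInfo(part_number, uploaded[part_number]))

    # All parts are present: complete the upload
    bucket.complete_multipart_upload(object_key, upload_id, parts)
    print(f'Successfully uploaded {object_key}')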

# Entry point: resume_upload continues an earlier upload when it finds one
# and falls back to multipart_upload otherwise
resume_upload(bucket, object_key, local_file_path)

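In the code above, we first import the required libraries and set the Alibaba Cloud OSS access information. We then define a multipart_upload function, which performs a multipart upload, and a resume_upload function, which performs a resumable upload. Finally, the script decides whether there is an earlier upload to continue: if so it resumes it, otherwise it starts a new multipart upload. Keep in mind that OSS identifies an unfinished multipart upload by its upload ID, and the object only becomes visible once complete_multipart_upload succeeds.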
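The part arithmetic used by both functions can be looked at in isolation. The following is a minimal sketch, not part of the OSS SDK: plan_parts and remaining_parts are hypothetical helper names introduced here for illustration. It computes which byte ranges make up each part and which parts are still missing when some part numbers are already on the server.

```python
def plan_parts(total_size, part_size):
    """Return a list of (part_number, offset, size) covering the whole file."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    part_count = (total_size + part_size - 1) // part_size  # ceiling division
    plan = []
    for index in range(part_count):
        offset = index * part_size
        size = min(part_size, total_size - offset)  # the last part may be shorter
        plan.append((index + 1, offset, size))  # part numbers start at 1
    return plan


def remaining_parts(plan, uploaded_numbers):
    """Keep only the parts whose numbers are not already uploaded."""
    done = set(uploaded_numbers)
    return [part for part in plan if part[0] not in done]


if __name__ == "__main__":
    plan = plan_parts(total_size=2_500_000, part_size=1024 * 1024)
    # Pretend part 1 survived an interrupted attempt
    for part_number, offset, size in remaining_parts(plan, uploaded_numbers=[1]):
        print(part_number, offset, size)
```

Skipping by part number rather than by "highest part uploaded so far" means the sketch still works when parts arrived out of order, for example after a parallel upload was interrupted.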

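Be sure to replace your_access_key_id, your_access_key_secret, your_endpoint, and your_bucket_name in the code with your actual Alibaba Cloud OSS access information and configuration. Also replace path_to_local_file with the path of the local file you want to upload, and destination_object_key with the target object key for the file on OSS.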
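Instead of editing the placeholders in the source file, the values could be read from environment variables so that credentials stay out of the code. This is only a sketch under assumptions: load_oss_settings is a hypothetical helper, and the variable names below are illustrative choices rather than anything the original script requires.

```python
import os

# Illustrative variable names (an assumption); adapt them to your deployment
REQUIRED_VARS = (
    "OSS_ACCESS_KEY_ID",
    "OSS_ACCESS_KEY_SECRET",
    "OSS_ENDPOINT",
    "OSS_BUCKET_NAME",
)


def load_oss_settings(environ=None):
    """Read the four OSS settings, failing early if any is missing or empty."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "access_key_id": environ["OSS_ACCESS_KEY_ID"],
        "access_key_secret": environ["OSS_ACCESS_KEY_SECRET"],
        "endpoint": environ["OSS_ENDPOINT"],
        "bucket_name": environ["OSS_BUCKET_NAME"],
    }
```

The returned values can then be passed to oss2.Auth and oss2.Bucket in place of the hard-coded strings.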