K3cloud数据字典更新库表及字段注释(一)

发布时间 2023-05-06 13:36:24作者: zaggu

目标

由于K3CLOUD并未将表及字段注释填写在数据库表中,元数据采集过程中无法获取到注释信息,手工补充耗时耗力,打算通过爬取K3CLOUD数据字典的表及字段注释,然后批量填充到数据库。

一、获取K3CLOUD数据库表及字段数据

import PyClass.ConnMsSQL as msDB    --自建类,用于数据查询及获取
from sqlalchemy import create_engine
import pandas as pd

读取K3CLOUD数据库结构,导入tmp_k3dict 表

if __name__=='__main__':

    #初始化数据库连接
    ms=msDB.DBHelper(
        host="192.168.10.96",
        user="XXXX",
        pwd="XXXXX",
        dbname="K3CLOUD数据库"
    )
    #写入目标数据库
    wms=msDB.DBHelper(
        host="192.168.10.44",
        user="XXX",
        pwd="XXXXX",
        dbname="目标库"
    )
    #创建SQLSERVER引擎
    engine=create_engine('mssql+pymssql://BI:BI*.0@192.168.10.44/目标库')
    #清理旧数据
    # sql="delete from tmp_k3dict"
    # res=wms.ExecNonQuery(sql)
    #读取数据表名及字段名
    sql="select system='k3cloud',d.name N'表名','表注释',a.name N'字段名',isnull(g.[value],'') AS N'字段注释',b.name N'类型'," \
        " COLUMNPROPERTY(a.id,a.name,'PRECISION') as N'长度',(case when (select  count(*) from sysobjects where (name in  " \
        " (select name from sysindexes where (id = a.id) AND (indid in  (select indid  from sysindexkeys where (id = a.id) " \
        " AND (colid in (select colid  from syscolumns where (id = a.id) AND (name = a.name)))))))   AND  (xtype = 'PK'))>0 then 'PK' else '' end) N'主键'," \
        " (case when COLUMNPROPERTY( a.id,a.name,'IsIdentity')=1 then 'identity'else '' end) N'标识',(case when a.isnullable=1 then 'true' else 'false' end) N'允许空' " \
        " from syscolumns a left join systypes b on a.xtype=b.xusertype inner join sysobjects d on a.id=d.id and d.xtype='U' and d.name<>'dtproperties'" \
        " left join syscomments e on a.cdefault=e.id left join sys.extended_properties g on a.id=g.major_id AND a.colid = g.minor_id where d.name not like '%TMP%' order by d.name  "
    rTable = ms.Df_ExceQuery(sql)
    # df=pd.DataFrame(ms.search_one(sql))b
    rTable.to_sql('tmp_k3dict',engine,if_exists='append',index=False)

二、确定通过SQL代码更新表及字段注释是否可行:

--修改表注释
EXEC sys.sp_addextendedproperty
@name='MS_Description',@value='客户价值',
@level0type='SCHEMA',@level0name='dbo',
@level1type='TABLE',@level1name='T_DM_CustRFM';
--修改字段注释
EXEC sys.sp_addextendedproperty
@name=N'MS_Description',@value=N'注释内容',
@level0type=N'SCHEMA',@level0name=N'dbo',
@level1type=N'TABLE',@level1name=N'book'
@level2type='COLUMN',@level2name='AUTHOR';
三、确认K3cloud数据字典是否可以进行爬取,并容易获取表及字段注释信息
数据字典地址金蝶云 协同开发云 (kingdee.com)

查看第一个,基础信息

只能说厉害,一番折腾后发现数据字典地址有变动
https://open.kingdee.com/K3Cloud/PDM/BD.htm (无法打开)
https://open.kingdee.com/K3Cloud/PDM/BD基础_files/BD基础_toc.html (正常打开)

继续,查看基础信息网页源码发现表注释内容

没有表字段数据,点开表找到字段信息

通过1级网址获取每张表的地址,循环采集每张表的字段数据
本机环境 python 3.10、 pandas 1.4.3 (高版本中apend函数取消了,改为concat)、 pymssql、sqlalchemy
爬取数据,更新到SQLSERVER数据库 tmp_datadict 表