Differences Between FlinkSQL and SparkSQL

Published: 2023-05-16 16:58:42  Author: -见

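Differences:

  • FlinkSQL's INSERT statement can write to only a subset of a table's columns, whereas SparkSQL requires every column to be supplied. In the spark-sql session below, a Hudi table with eight columns is created, and an insert that names only four of them is rejected: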
spark-sql> create table t11 (
         >     ds BIGINT,
         >     ts BIGINT,
         >     pk BIGINT,
         >     f0 BIGINT,
         >     f1 BIGINT,
         >     f2 BIGINT,
         >     f3 BIGINT,
         >     f4 BIGINT
         > ) using hudi
         > partitioned by (ds)
         > tblproperties ( -- options can also be used here (https://hudi.apache.org/docs/table_management)
         >   type = 'mor',
         >   primaryKey = 'pk',
         >   preCombineField = 'ts',
         >   hoodie.bucket.index.num.buckets = '2',
         >   hoodie.index.type = 'BUCKET',
         >   hoodie.compaction.payload.class = 'org.apache.hudi.common.model.OverwriteNonDefaultsWithLatestAvroPayload',
         >   hoodie.datasource.write.payload.class = 'org.apache.hudi.common.model.OverwriteNonDefaultsWithLatestAvroPayload'
         > );
Time taken: 1.382 seconds

spark-sql> insert into t11 (ds,ts,pk,f0) values (20230101,CAST(CURRENT_TIMESTAMP AS BIGINT),1006,1);
Error in query: Cannot write to 'default.t11', not enough data columns:
Table columns: 'ts', 'pk', 'f0', 'f1', 'f2', 'f3', 'f4', 'ds'
Data columns: 'col1', 'col2', 'col3', 'col4'
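
One possible workaround on the Spark side, sketched here rather than taken from the session above, is to name all eight columns and pad the ones you do not want to set with typed NULLs. This assumes that the Spark and Hudi versions in use accept a full column list for t11. Whether the NULL-padded fields leave previously stored values untouched depends on the payload class; the table above configures OverwriteNonDefaultsWithLatestAvroPayload, which is meant to skip incoming default (null) values, but this behavior should be verified on your own setup.

```sql
-- Sketch only: supply every column of t11, using NULL for the fields
-- that the original four-column insert left out (f1 through f4).
insert into t11 (ds, ts, pk, f0, f1, f2, f3, f4)
values (
  20230101,                               -- ds: partition column
  CAST(CURRENT_TIMESTAMP AS BIGINT),      -- ts: preCombineField
  1006,                                   -- pk: primaryKey
  1,                                      -- f0: the only data field being set
  CAST(NULL AS BIGINT),                   -- f1: padded
  CAST(NULL AS BIGINT),                   -- f2: padded
  CAST(NULL AS BIGINT),                   -- f3: padded
  CAST(NULL AS BIGINT)                    -- f4: padded
);
```

The explicit CAST on each NULL keeps the value types aligned with the BIGINT columns, so the statement does not rely on implicit type coercion.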