HBase: retrieving the names of students with scores between 80 and 90

Published 2023-04-10 02:48:44, author: 小能日记

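Given a student table whose column families hold the student ID, name, and score, retrieve the names of the students whose scores are between 80 and 90 (inclusive).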

create 'student','S_NO','S_NAME','S_SCORE'

put 'student','s001','S_NO','2022001'
put 'student','s001','S_NAME','小王'
put 'student','s001','S_SCORE','45'

put 'student','s002','S_NO','2022002'
put 'student','s002','S_NAME','小林'
put 'student','s002','S_SCORE','67'

put 'student','s003','S_NO','2022003'
put 'student','s003','S_NAME','小张'
put 'student','s003','S_SCORE','85'

put 'student','s004','S_NO','2022004'
put 'student','s004','S_NAME','小刘'
put 'student','s004','S_SCORE','95'

put 'student','s005','S_NO','2022005'
put 'student','s005','S_NAME','小李'
put 'student','s005','S_SCORE','61'

put 'student','s006','S_NO','2022006'
put 'student','s006','S_NAME','小冰'
put 'student','s006','S_SCORE','83'

put 'student','s007','S_NO','2022007'
put 'student','s007','S_NAME','小明'
put 'student','s007','S_SCORE','71'

put 'student','s008','S_NO','2022008'
put 'student','s008','S_NAME','小帅'
put 'student','s008','S_SCORE','88'

put 'student','s009','S_NO','2022009'
put 'student','s009','S_NAME','小东'
put 'student','s009','S_SCORE','09'

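In HBase, values are generally converted to strings before they are stored.

Numbers such as prices or ages are left-padded with 0 to a fixed width before they are stored, which makes them easy to filter on.

The reason is that the four HBase comparators listed below all compare bytes or strings; none of them compares values as numbers: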

  • BinaryComparator - lexicographically compares against the specified byte array using the Bytes.compareTo(byte[], byte[]) method.

  • BinaryPrefixComparator - lexicographically compares against a specified byte array. It only compares up to the length of this byte array.

  • RegexStringComparator - compares against the specified byte array using the given regular expression. Only EQUAL and NOT_EQUAL comparisons are valid with this comparator.

  • SubstringComparator - tests whether or not the given substring appears in a specified byte array. The comparison is case insensitive. Only EQUAL and NOT_EQUAL comparisons are valid with this comparator.
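
To see why the zero-padding matters, here is a minimal Python sketch, not HBase code. It assumes that Python's ordering of `bytes` (unsigned, byte by byte, with a shorter prefix sorting first) behaves like the lexicographic comparison described for BinaryComparator. The helper names `pad_score` and `in_range` and the width of 2 are made up for illustration.

```python
def pad_score(score: int, width: int = 2) -> bytes:
    """Encode a non-negative integer as a zero-padded, fixed-width ASCII string."""
    return str(score).zfill(width).encode("ascii")


def in_range(value: bytes, low: bytes, high: bytes) -> bool:
    """Lexicographic range check, analogous to two binary comparisons (>= and <=)."""
    return low <= value <= high


low, high = b"80", b"90"

# Unpadded "9" is wrongly accepted: "9" sorts after "80" on the first byte,
# and "9" is a prefix of "90", so it sorts before "90".
print(in_range(b"9", low, high))           # True

# Padded "09" is correctly rejected, because "0" sorts before "8".
print(in_range(pad_score(9), low, high))   # False

# A padded score that really is in range is accepted.
print(in_range(pad_score(85), low, high))  # True
```

If scores can reach 100, the width would have to be 3 for every stored value, and the filter bounds would have to be padded the same way (for example '080' and '090').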


scan 'student',FILTER => "SingleColumnValueFilter('S_SCORE', '', >=, 'binary:80') AND SingleColumnValueFilter('S_SCORE','',<=,'binary:90')",FORMATTER=>'toString'

FORMATTER => 'toString' makes the shell display the UTF-8 encoded names in the result as readable strings instead of escaped bytes.
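
As a rough illustration of what the formatter changes, the Python sketch below renders the UTF-8 bytes of one name in an escaped `\xHH` style and then decodes them back. The function `to_string_binary` is a hypothetical helper written for this example; it only approximates how the shell escapes non-printable bytes, and the exact rules may differ between HBase versions.

```python
def to_string_binary(data: bytes) -> str:
    """Keep printable ASCII (except backslash) and render other bytes as \\xHH."""
    out = []
    for b in data:
        if 0x20 <= b <= 0x7E and b != 0x5C:
            out.append(chr(b))
        else:
            out.append("\\x%02X" % b)
    return "".join(out)


name = "小张"
raw = name.encode("utf-8")    # the bytes that end up stored in the cell

print(to_string_binary(raw))  # \xE5\xB0\x8F\xE5\xBC\xA0
print(raw.decode("utf-8"))    # 小张
```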

hbase(main):038:0> scan 'student',FILTER => "SingleColumnValueFilter('S_SCORE', '', >=, 'binary:80') AND SingleColumnValueFilter('S_SCORE','',<=,'binary:90')",FORMATTER=>'toString'
ROW                         COLUMN+CELL
 s003                       column=S_NAME:, timestamp=1681065615165, value=小张
 s003                       column=S_NO:, timestamp=1681065615130, value=2022003
 s003                       column=S_SCORE:, timestamp=1681065615198, value=85
 s006                       column=S_NAME:, timestamp=1681065615575, value=小冰
 s006                       column=S_NO:, timestamp=1681065615542, value=2022006
 s006                       column=S_SCORE:, timestamp=1681065615611, value=83
 s008                       column=S_NAME:, timestamp=1681065615804, value=小帅
 s008                       column=S_NO:, timestamp=1681065615773, value=2022008
 s008                       column=S_SCORE:, timestamp=1681065615847, value=88
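
The scan returns every column of each matching row; the names asked for are the values of the S_NAME cells. The Python sketch below reproduces the same selection on the sample data in memory, to check the result by hand. It does not talk to HBase: the dictionary `rows` and the function `scan_names` are hypothetical stand-ins for the table and the scan, and the byte comparison is assumed to behave like the binary comparator.

```python
# Sample data copied from the put commands above (row key -> column family -> value).
rows = {
    "s001": {"S_NO": "2022001", "S_NAME": "小王", "S_SCORE": "45"},
    "s002": {"S_NO": "2022002", "S_NAME": "小林", "S_SCORE": "67"},
    "s003": {"S_NO": "2022003", "S_NAME": "小张", "S_SCORE": "85"},
    "s004": {"S_NO": "2022004", "S_NAME": "小刘", "S_SCORE": "95"},
    "s005": {"S_NO": "2022005", "S_NAME": "小李", "S_SCORE": "61"},
    "s006": {"S_NO": "2022006", "S_NAME": "小冰", "S_SCORE": "83"},
    "s007": {"S_NO": "2022007", "S_NAME": "小明", "S_SCORE": "71"},
    "s008": {"S_NO": "2022008", "S_NAME": "小帅", "S_SCORE": "88"},
    "s009": {"S_NO": "2022009", "S_NAME": "小东", "S_SCORE": "09"},
}


def scan_names(table: dict, low: str, high: str) -> list:
    """Return S_NAME for rows whose S_SCORE bytes fall in [low, high], in row-key order."""
    low_b, high_b = low.encode("utf-8"), high.encode("utf-8")
    names = []
    for row_key in sorted(table):
        score = table[row_key]["S_SCORE"].encode("utf-8")
        if low_b <= score <= high_b:  # lexicographic byte comparison
            names.append(table[row_key]["S_NAME"])
    return names


print(scan_names(rows, "80", "90"))  # ['小张', '小冰', '小帅']
```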

References

https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/admin_hbase_filtering.html

https://www.xinbaoku.com/archive/K8IwH9F4.html

https://bianma.hao86.com/