Proj CDeepFuzz Paper Reading: Differential Testing of Cross Deep Learning Framework APIs: Revealing Inconsistencies and Vulnerabilities

Published: 2023-09-05 05:56:02, Author: 雪溯

Abstract

Background: there is currently little research on inconsistencies and security bugs in cross-framework conversion
This paper: TensorScope
Task: test cross-framework APIs in machine learning libraries
Method: 1. differential testing among machine learning libraries 2. joint constraint analysis 3. error-guided test case fixing
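
To make the first step concrete, here is a minimal sketch of differential testing. It is not TensorScope's implementation: the two functions stand in for the "same" API in two hypothetical frameworks, and the names `log_softmax_naive`, `log_softmax_stable`, `run`, and `differential_test`, along with the tolerance values, are assumptions made for illustration.

```python
import math


def log_softmax_naive(xs):
    # Hypothetical "framework A": direct formula, overflows for large inputs.
    total = sum(math.exp(x) for x in xs)
    return [x - math.log(total) for x in xs]


def log_softmax_stable(xs):
    # Hypothetical "framework B": subtracts the maximum before exponentiating.
    m = max(xs)
    total = sum(math.exp(x - m) for x in xs)
    return [x - m - math.log(total) for x in xs]


def run(impl, xs):
    """Run one implementation and capture its output or its exception type."""
    try:
        return ("ok", impl(xs))
    except Exception as exc:
        return ("error", type(exc).__name__)


def differential_test(impl_a, impl_b, xs, rel_tol=1e-9, abs_tol=1e-12):
    """Feed the same input to both implementations and compare the outcomes."""
    a, b = run(impl_a, xs), run(impl_b, xs)
    if a[0] != b[0]:
        # One side failed while the other returned a value.
        return "inconsistent-status"
    if a[0] == "error":
        return "consistent" if a[1] == b[1] else "inconsistent-exception"
    same = len(a[1]) == len(b[1]) and all(
        math.isclose(p, q, rel_tol=rel_tol, abs_tol=abs_tol)
        for p, q in zip(a[1], b[1])
    )
    return "consistent" if same else "inconsistent-value"


if __name__ == "__main__":
    print(differential_test(log_softmax_naive, log_softmax_stable, [1.0, 2.0, 3.0]))
    print(differential_test(log_softmax_naive, log_softmax_stable, [1000.0, 0.0]))
```

On ordinary inputs the two implementations agree within tolerance. On `[1000.0, 0.0]` the naive version raises `OverflowError` while the stable one returns a value, which is the kind of divergence a differential tester reports.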

Experiments:
Competitors: FreeFuzz, DocTer
Targets: TensorFlow, TensorFlow Lite, ORT, MindSpore, PyTorch, Paddle
Results:

  1. +28.7% code coverage compared to FreeFuzz, +24.3% compared to DocTer
  2. 230 bugs, 8 CVEs, $1,100 bounty
  3. The detected inconsistencies can be used to reduce the performance drop of 3 models, by at most 3.5%

5 Measurement & Analysis

5.1 Statistics of Bugs

The buggy APIs often occur in arithmetic modules, such as math (54) and linalg (28). 56 of them are fused calculations with more than one primitive operator, primarily in the TensorFlow framework, e.g., tf.raw_ops.CropAndResizeGradBoxes. These buggy APIs are typically used in various stages of DL model development, serving functions including data processing (98), arithmetic operations (58), gradient computation (15), loss functions (4), optimizers (2), and others (7). 32 operator APIs appear in commonly used models, such as Logsoftmax in many classification models. The root cause of most bugs is missing checks, which account for 158 bugs: missing checks on data types (88), array boundaries (32), empty values (31), and others (7).
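
The three main kinds of missing checks named above (data types, array boundaries, empty values) can be illustrated with a small sketch. The operator `gather_checked` is hypothetical and does not come from any of the tested frameworks; it only shows where each check would sit in a simple gather-style operation.

```python
def gather_checked(values, indices):
    """Return [values[i] for i in indices], with explicit input validation."""
    # Empty-value check: refuse to index into an empty input.
    if len(values) == 0:
        raise ValueError("values must not be empty")
    out = []
    for i in indices:
        # Data type check: indices must be plain integers.
        # bool is a subclass of int in Python, so it is rejected explicitly.
        if not isinstance(i, int) or isinstance(i, bool):
            raise TypeError(f"index must be int, got {type(i).__name__}")
        # Array boundary check: negative indices would otherwise wrap silently
        # in Python, and would be an out-of-bounds access in a native kernel.
        if i < 0 or i >= len(values):
            raise IndexError(f"index {i} out of range for length {len(values)}")
        out.append(values[i])
    return out


if __name__ == "__main__":
    print(gather_checked([10, 20, 30], [2, 0]))
```

Dropping any one of the three checks leaves a path where malformed input reaches the indexing step, which in a native kernel is where crashes such as out-of-bounds reads tend to originate.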

5.2 Types of Bugs

Inconsistencies

  1. Precision bugs: 46 bugs
  2. Data layout bugs: 7 bugs
  3. Different exception handling bugs: 27 bugs
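
As an illustration of the data layout category, the sketch below assumes the layouts in question are NCHW and NHWC; the notes above do not say which layouts the 7 bugs involve. The helper names `offset_nchw`, `offset_nhwc`, and `nchw_to_nhwc` are made up for this example. It shows that the same logical tensor has different flat buffers under the two layouts, so comparing raw buffers across frameworks without converting the layout would flag a false mismatch, and skipping a needed conversion would silently reorder data.

```python
def offset_nchw(n, c, h, w, C, H, W):
    # Flat offset of element (n, c, h, w) when channels come before spatial dims.
    return ((n * C + c) * H + h) * W + w


def offset_nhwc(n, c, h, w, C, H, W):
    # Flat offset of the same element when channels are the innermost dim.
    return ((n * H + h) * W + w) * C + c


def nchw_to_nhwc(flat, N, C, H, W):
    """Reorder a flat NCHW buffer into the equivalent flat NHWC buffer."""
    out = [None] * len(flat)
    for n in range(N):
        for c in range(C):
            for h in range(H):
                for w in range(W):
                    src = offset_nchw(n, c, h, w, C, H, W)
                    dst = offset_nhwc(n, c, h, w, C, H, W)
                    out[dst] = flat[src]
    return out


if __name__ == "__main__":
    # One image, two channels, 1x2 spatial size: channel 0 = [0, 1], channel 1 = [2, 3].
    nchw = [0, 1, 2, 3]
    nhwc = nchw_to_nhwc(nchw, N=1, C=2, H=1, W=2)
    print(nhwc)          # same tensor, different buffer order
    print(nchw == nhwc)  # raw buffers differ although the tensors are equal
```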

Crashes: e.g., Reachable Assertion, OOB Read/Write, Floating Point Exception, Command Injection