PII[^1] Leakage: Identifying PII Data in Logs with CodeQL
shopizer is an open-source e-commerce system written in Java (shopizer's GitHub).
All code for this experiment will be uploaded to https://github.com/SummerSec/learning-codeql
Source: Sensitive Fields
Known sensitive fields include:
- email
- phone
- creditCard
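CodeQL's Field
Get each field's name and filter it with `matches`, which does LIKE-style wildcard matching (`%` stands for any sequence of characters) rather than true regular-expression matching.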
/**
 * @name SimplePIIField
 */
import java

from Field f
where
  (
    f.getName().matches("%email%") or
    f.getName().matches("%phone%") or
    f.getName().matches("creditCard%")
  ) and
  f.fromSource()
select f
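To make the wildcard semantics concrete, here is a small Java sketch that emulates how the three patterns above behave on a few field names. `LikePattern` is a hypothetical helper written only for this illustration; it assumes that `matches` is case-sensitive, that `%` matches any sequence of characters and `_` a single character, and it ignores escape sequences.

```java
import java.util.regex.Pattern;

// Hypothetical helper: emulates LIKE-style matching as used by `matches`.
// Assumptions: case-sensitive, `%` = any sequence, `_` = any one character,
// no escape handling.
final class LikePattern {
    static boolean matches(String value, String likePattern) {
        StringBuilder regex = new StringBuilder();
        for (char c : likePattern.toCharArray()) {
            if (c == '%') {
                regex.append(".*");
            } else if (c == '_') {
                regex.append('.');
            } else {
                // Quote every other character so it is matched literally.
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        // The whole value must match, as with a LIKE pattern.
        return Pattern.compile(regex.toString(), Pattern.DOTALL)
                      .matcher(value)
                      .matches();
    }
}
```

Under these assumptions `emailAddress` matches `%email%` but `customerEmail` does not (capital `E`), and `creditCard%` only matches names that start with `creditCard`, so a field such as `maskedCreditCard` is skipped. If the query misses fields, widening the patterns is the first thing to check.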
Converting the query into a class
/**
 * @name SimplePIIClass
 */
import java

/** A field defined in source code whose name suggests it holds PII. */
class SenInfoField extends Field {
  SenInfoField() {
    (
      this.getName().matches("%email%") or
      this.getName().matches("%phone%") or
      this.getName().matches("creditCard%")
    ) and
    this.fromSource()
  }
}

from SenInfoField sif
select sif
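Sink: Logging Calls
shopizer writes its logs through the slf4j logging framework. `LoggerFormatMethod` is the class CodeQL provides to model this framework's logging methods; its definition is documented as: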
/**
* A format method using the `org.slf4j.Logger` format string syntax. That is,
* the placeholder string is `"{}"`.
*/
Querying slf4j calls
/**
 * @name Logger slf4j logging method call query
 */
import java
import semmle.code.java.StringFormat

from LoggerFormatMethod lfm
// select lfm.getAReference().getAnArgument()
select lfm.getAReference()
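To see why every argument of such a logging call is worth treating as a sink, the following sketch mimics the `{}` placeholder substitution in plain Java. `PlaceholderFormat` is a hypothetical helper written for this illustration, not slf4j's implementation; real slf4j also handles escaped placeholders and a trailing `Throwable`, both of which are ignored here.

```java
// Hypothetical helper: a simplified model of "{}" placeholder substitution.
// This is NOT slf4j code; escaping and Throwable handling are left out.
final class PlaceholderFormat {
    static String format(String template, Object... args) {
        StringBuilder out = new StringBuilder();
        int from = 0;       // start of the not-yet-copied part of the template
        int argIndex = 0;   // next argument to substitute
        while (argIndex < args.length) {
            int at = template.indexOf("{}", from);
            if (at < 0) {
                break;      // more arguments than placeholders
            }
            // Copy the literal text, then the argument's string form.
            out.append(template, from, at).append(args[argIndex++]);
            from = at + 2;
        }
        // Copy whatever is left after the last substituted placeholder.
        out.append(template.substring(from));
        return out.toString();
    }
}
```

Each argument is converted to a string and copied into the message verbatim, so a PII value passed as any argument ends up in the log output. That is the reason the commented-out variant of the query selects `getAnArgument()` rather than only the call itself.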
Taint Tracking
The sources are the PII fields `email`, `phone`, and `creditCard`; the sinks are the arguments of slf4j logging calls.
/**
 * @name PIIQuery
 * @kind problem
 */
import java
import semmle.code.java.dataflow.TaintTracking
import semmle.code.java.StringFormat

class SenInfoField extends Field {
  SenInfoField() {
    (
      this.getName().matches("%email%") or
      this.getName().matches("%phone%") or
      this.getName().matches("creditCard%")
    ) and
    this.fromSource()
  }
}

class MySenInfoTaintConfig extends TaintTracking::Configuration {
  MySenInfoTaintConfig() { this = "MySenInfoTaintConfig" }

  // Sources: any access to a sensitive field.
  override predicate isSource(DataFlow::Node source) {
    source.asExpr() = any(SenInfoField sif).getAnAccess()
  }

  // Sinks: any argument of a call to an slf4j logging method.
  override predicate isSink(DataFlow::Node sink) {
    sink.asExpr() = any(LoggerFormatMethod lfm).getAReference().getAnArgument()
  }
}

from MySenInfoTaintConfig config, DataFlow::Node source, DataFlow::Node sink, SenInfoField f
where
  config.hasFlow(source, sink) and
  // source.asExpr() = f.getAnAccess()
  f.getAnAccess() = source.asExpr()
select sink, "PII data from field $@ is written to log here", f, f.getName()
// select sink, "PII data from field $@ is written to log here", source, source.asExpr().toString()
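In the where clause, the commented-out line `source.asExpr() = f.getAnAccess()` and the line below it, `f.getAnAccess() = source.asExpr()`, have the same effect. In CodeQL, `=` tests whether its two sides are equal, so the order of the operands makes no difference.
Alternatively, the query can be written like this: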
from MySenInfoTaintConfig config, DataFlow::Node source, DataFlow::Node sink
where config.hasFlow(source, sink)
select sink, "PII data from field $@ is written to log here", source, source.asExpr().toString()
PS: for details on `$@`, see "Defining the results of a query".
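As an illustration of the kind of flow the taint-tracking query above looks for, here is a minimal Java sketch. Everything in it is hypothetical: `Customer`, `register`, and `ListLogger` are made up for this example, and `ListLogger` is an in-memory stand-in for `org.slf4j.Logger` so that the code runs with only the JDK. In a real project the call would go to an slf4j logger.

```java
import java.util.ArrayList;
import java.util.List;

final class PiiFlowDemo {
    // Hypothetical stand-in for org.slf4j.Logger; it only records messages.
    static final class ListLogger {
        final List<String> lines = new ArrayList<>();

        void info(String template, Object arg) {
            lines.add(template.replace("{}", String.valueOf(arg)));
        }
    }

    // Hypothetical entity; the field name matches the `%email%` pattern.
    static final class Customer {
        private final String emailAddress;

        Customer(String emailAddress) {
            this.emailAddress = emailAddress;
        }

        String getEmailAddress() {
            return emailAddress; // access to the sensitive field: the source
        }
    }

    static List<String> register(Customer customer) {
        ListLogger log = new ListLogger();
        // The field value passes through a getter and a string concatenation,
        // then reaches a logging argument: the sink.
        String who = "user " + customer.getEmailAddress();
        log.info("Registered {}", who);
        return log.lines;
    }
}
```

The value is not logged directly: it passes through a getter and a concatenation first. Following such intermediate steps is the reason for using taint tracking here rather than matching logging arguments syntactically.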
Showing the Full Path
To display paths, change `@kind problem` to `@kind path-problem` and add `import DataFlow::PathGraph`.
/**
 * @name PIIQueryPath
 * @kind path-problem
 * @description Taint paths
 */
import java
import semmle.code.java.StringFormat
import semmle.code.java.dataflow.TaintTracking
import DataFlow::PathGraph

class SenInfoField extends Field {
  SenInfoField() {
    (
      this.getName().matches("%email%") or
      this.getName().matches("%phone%") or
      this.getName().matches("creditCard%")
    ) and
    this.fromSource()
  }
}

class MySenInfoTaintConfig extends TaintTracking::Configuration {
  MySenInfoTaintConfig() { this = "MySenInfoTaintConfig" }

  override predicate isSource(DataFlow::Node source) {
    source.asExpr() = any(SenInfoField sif).getAnAccess()
  }

  override predicate isSink(DataFlow::Node sink) {
    sink.asExpr() = any(LoggerFormatMethod lfm).getAReference().getAnArgument()
  }
}

from MySenInfoTaintConfig config, DataFlow::PathNode source, DataFlow::PathNode sink
where config.hasFlowPath(source, sink)
select sink, source, sink, "PII data from field $@ is written to log here", source,
  source.getNode().toString()
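### Sanitization
Looking through the path query results, we can see that `creditCard` is passed through a method whose name matches `mask%`; such a method masks (obscures) the value. Excluding the paths that go through such a method reduces false positives, and doing so requires overriding the `isSanitizer` predicate. In taint tracking, a sanitizer is a step that renders tainted data harmless.
Override the `isSanitizer` predicate; here it is enough to stop the flow at arguments of methods whose names match `mask%`.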
/**
 * @name PIIQuerySanitizerPath
 * @kind path-problem
 * @description Excludes some invalid results
 */
import java
import semmle.code.java.StringFormat
import semmle.code.java.dataflow.TaintTracking
import DataFlow::PathGraph
class SenInfoField extends Field {
  SenInfoField() {
    (
      this.getName().matches("%email%") or
      this.getName().matches("%phone%") or
      this.getName().matches("creditCard%")
    ) and
    this.fromSource()
  }
}

class MySenInfoTaintConfig extends TaintTracking::Configuration {
  MySenInfoTaintConfig() { this = "MySenInfoTaintConfig" }

  override predicate isSource(DataFlow::Node source) {
    source.asExpr() = any(SenInfoField sif).getAnAccess()
  }

  override predicate isSink(DataFlow::Node sink) {
    sink.asExpr() = any(LoggerFormatMethod lfm).getAReference().getAnArgument()
  }

  // Sanitizers: any argument passed to a method whose name starts with "mask".
  override predicate isSanitizer(DataFlow::Node sanitizer) {
    sanitizer.asExpr() =
      any(Method m | m.getName().matches("mask%")).getAReference().getAnArgument()
  }
}

from
  MySenInfoTaintConfig config, DataFlow::PathNode source, DataFlow::PathNode sink, SenInfoField f
where
  config.hasFlowPath(source, sink) and
  source.getNode().asExpr() = f.getAnAccess()
select sink, source, sink, "PII data from field $@ is written to log here", f, f.getName()
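For context, a masking method of the kind that `isSanitizer` is meant to recognize might look like the sketch below. `maskCreditCard` is a hypothetical helper written for this illustration and is not taken from shopizer; it assumes a non-null input and that keeping the last four characters visible is acceptable.

```java
// Hypothetical helper whose name matches the `mask%` pattern.
final class Masking {
    // Replaces all but the last four characters with '*'.
    // Assumes `number` is non-null; shorter inputs are returned unchanged.
    static String maskCreditCard(String number) {
        int visible = Math.min(4, number.length());
        int hidden = number.length() - visible;
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < hidden; i++) {
            out.append('*');
        }
        return out.append(number.substring(hidden)).toString();
    }
}
```

Because the sanitizer is chosen purely by method name, any method starting with `mask` cuts the flow, whether or not it actually hides the data. It is worth reviewing the matched methods before trusting the reduced result set.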
Summary
Identify the sources and sinks -> write the taint-tracking data flow -> refine the query and display the full paths -> add sanitization.
References
https://youtu.be/hHaOxbyqy44