Adventures in Exceptions: Stack Overflow

Hibernate Validator throws a strange StackOverflowError

Updated on 2019-11-04 10:33 (Created on: 2019-08-28 17:32)


Preface

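This month we have been refactoring a product system that we maintain. Its code relies heavily on the Hibernate Validator framework for field validation. Last week, while running error-path tests in the dev environment, we hit a strange failure: testing the "modify merchant" feature produced a StackOverflowError. The stack trace was as follows (repeated frames omitted):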

java.lang.StackOverflowError
at java.lang.StringBuilder.append(StringBuilder.java:202)
at org.hibernate.validator.internal.engine.messageinterpolation.parser.Token.append(Token.java:44)
at org.hibernate.validator.internal.engine.messageinterpolation.parser.TokenCollector.appendToToken(TokenCollector.java:60)
at org.hibernate.validator.internal.engine.messageinterpolation.parser.MessageState.handleNonMetaCharacter(MessageState.java:32)
at org.hibernate.validator.internal.engine.messageinterpolation.parser.TokenCollector.next(TokenCollector.java:98)
at org.hibernate.validator.internal.engine.messageinterpolation.parser.MessageState.handleNonMetaCharacter(MessageState.java:33)
at org.hibernate.validator.internal.engine.messageinterpolation.parser.TokenCollector.next(TokenCollector.java:98)
at org.hibernate.validator.internal.engine.messageinterpolation.parser.MessageState.handleNonMetaCharacter(MessageState.java:33)
at org.hibernate.validator.internal.engine.messageinterpolation.parser.TokenCollector.next(TokenCollector.java:98)
at org.hibernate.validator.internal.engine.messageinterpolation.parser.MessageState.handleNonMetaCharacter(MessageState.java:33)
at org.hibernate.validator.internal.engine.messageinterpolation.parser.TokenCollector.next(TokenCollector.java:98)
at org.hibernate.validator.internal.engine.messageinterpolation.parser.MessageState.handleNonMetaCharacter(MessageState.java:33)
at org.hibernate.validator.internal.engine.messageinterpolation.parser.TokenCollector.next(TokenCollector.java:98)
at org.hibernate.validator.internal.engine.messageinterpolation.parser.MessageState.handleNonMetaCharacter(MessageState.java:33)
at org.hibernate.validator.internal.engine.messageinterpolation.parser.TokenCollector.next(TokenCollector.java:98)
at org.hibernate.validator.internal.engine.messageinterpolation.parser.BeginState.handleNonMetaCharacter(BeginState.java:33)
at org.hibernate.validator.internal.engine.messageinterpolation.parser.TokenCollector.next(TokenCollector.java:98)
at org.hibernate.validator.internal.engine.messageinterpolation.parser.BeginState.start(BeginState.java:25)
at org.hibernate.validator.internal.engine.messageinterpolation.parser.TokenCollector.parse(TokenCollector.java:106)
at org.hibernate.validator.internal.engine.messageinterpolation.parser.TokenCollector.<init>(TokenCollector.java:43)
at org.hibernate.validator.messageinterpolation.AbstractMessageInterpolator.interpolateBundleMessage(AbstractMessageInterpolator.java:369)
at org.hibernate.validator.messageinterpolation.AbstractMessageInterpolator.interpolateMessage(AbstractMessageInterpolator.java:274)
at org.hibernate.validator.messageinterpolation.AbstractMessageInterpolator.interpolate(AbstractMessageInterpolator.java:216)
at org.hibernate.validator.internal.engine.ValidationContext.interpolate(ValidationContext.java:422)
at org.hibernate.validator.internal.engine.ValidationContext.createConstraintViolation(ValidationContext.java:300)
at org.hibernate.validator.internal.engine.ValidationContext.createConstraintViolations(ValidationContext.java:261)
at org.hibernate.validator.internal.engine.constraintvalidation.ConstraintTree.validateSingleConstraint(ConstraintTree.java:456)
at org.hibernate.validator.internal.engine.constraintvalidation.ConstraintTree.validateConstraints(ConstraintTree.java:127)
at org.hibernate.validator.internal.engine.constraintvalidation.ConstraintTree.validateConstraints(ConstraintTree.java:87)
at org.hibernate.validator.internal.metadata.core.MetaConstraint.validateConstraint(MetaConstraint.java:73)
at org.hibernate.validator.internal.engine.ValidatorImpl.validateMetaConstraint(ValidatorImpl.java:617)
at org.hibernate.validator.internal.engine.ValidatorImpl.validateConstraint(ValidatorImpl.java:580)
at org.hibernate.validator.internal.engine.ValidatorImpl.validateConstraintsForSingleDefaultGroupElement(ValidatorImpl.java:524)
at org.hibernate.validator.internal.engine.ValidatorImpl.validateConstraintsForDefaultGroup(ValidatorImpl.java:492)
at org.hibernate.validator.internal.engine.ValidatorImpl.validateConstraintsForCurrentGroup(ValidatorImpl.java:457)
at org.hibernate.validator.internal.engine.ValidatorImpl.validateInContext(ValidatorImpl.java:407)
at org.hibernate.validator.internal.engine.ValidatorImpl.validate(ValidatorImpl.java:205)
at com.mybank.bkmerchantprod.common.util.ParamCheckUtil.validate(ParamCheckUtil.java:52)
at com.mybank.bkmerchantprod.core.service.v1.busimodel.BusiModelUtils.lambda$paramCheck$2(BusiModelUtils.java:147)
at com.mybank.bkmerchantprod.core.service.v1.busimodel.BusiModelUtils.memberObjConsumer(BusiModelUtils.java:161)
at com.mybank.bkmerchantprod.core.service.v1.busimodel.BusiModelUtils.paramCheck(BusiModelUtils.java:146)
at com.mybank.bkmerchantprod.core.service.v1.template.ServiceTemplate.domainEventExecute(ServiceTemplate.java:148)
at com.mybank.bkmerchantprod.biz.service.impl.trade.TradeMerchFacadeImpl$3.process(TradeMerchFacadeImpl.java:325)
at com.mybank.bkmerchantprod.biz.service.impl.trade.TradeMerchFacadeImpl$3.process(TradeMerchFacadeImpl.java:308)
at com.mybank.bkmerchantprod.biz.shared.support.BizTemplate.process(BizTemplate.java:104)

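The stack trace did not reveal any obvious problem, nor did it show where the StackOverflowError originated; it only suggested that something was being processed recursively or in a loop. I was baffled: why would calling validate end in a StackOverflowError? The frames from our own system merely wrap the validate method, so what was actually causing this?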

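Investigation

Hypothesis

After studying the stack trace for a long time I still could not see anything useful, so I fell back on the last resort: remote debugging.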

[Screenshot: MerchantExtBusiModel]

[Screenshot: stack trace]

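While debugging I found that the exception was thrown while the validator was checking MerchantExtBusiModel, so the next step was to study that class. But MerchantExtBusiModel is just a simple POJO; apart from the validation annotations it contains no special logic (getters and setters omitted):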

import javax.validation.constraints.Pattern;

public class MerchantExtBusiModel extends BaseBusiModel {
    /**
     * Tax registration certificate number
     */
    private String taxNum;

    /**
     * Business license validity period
     */
    @Pattern(regexp = "\\d{4}-\\d{2}-\\d{2}")
    private String bussAuthVld;

    /**
     * Controlling shareholder name
     */
    private String shareholderName;

    /**
     * Controlling shareholder ID document type
     */
    @ForEnum(enumClass = CertTypeEnum.class, isOptional = true)
    private String shareholderCertType;

    /**
     * Controlling shareholder ID document number
     */
    private String shareholderCertNo;

    /**
     * Controlling shareholder ID card validity period
     */
    @Pattern(regexp = "\\d{4}-\\d{2}-\\d{2}")
    private String shareholderCertVld;

    /**
     * Validity period of the ID document of the legal representative or person in charge
     */
    @Pattern(regexp = "\\d{4}-\\d{2}-\\d{2}")
    private String principalCertVld;

    /**
     * Gender
     */
    @ForEnum(enumClass = PersonSexEnum.class, isOptional = true)
    private String personSex;

    /**
     * Occupation (message: "occupation may contain at most 32 Chinese characters")
     */
    @Pattern(regexp = "^(?![\\u4e00-\\u9fa5\\·]{32,}$).+$", message = "职业最多32个汉字")
    private String personProfession;

    /**
     * ID document validity period
     */
    @Pattern(regexp = "\\d{4}-\\d{2}-\\d{2}")
    private String personCertVld;
}

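The class uses the @Pattern and @ForEnum annotations. @ForEnum is our own custom annotation: it checks whether an incoming field value is one of the values of an enum, and if not, it returns an error message that lists the valid enum values. The source of the ForEnum validator is as follows: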

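import java.lang.reflect.Method;
import javax.validation.ConstraintValidator;
import javax.validation.ConstraintValidatorContext;

public class ForEnumValidator implements ConstraintValidator<ForEnum, String> {

    /** Enum type (the @ForEnum annotation instance) */
    private ForEnum forEnum;

    @Override
    public void initialize(ForEnum constraintAnnotation) {
        // Keep the annotation so isValid() can read enumClass() and isOptional()
        this.forEnum = constraintAnnotation;
    }

    @Override
    public boolean isValid(String value, ConstraintValidatorContext context) {
        if (StringUtils.isBlank(value)) {
            if (forEnum.isOptional()) {
                return true;
            } else {
                // Message: "the property value must not be empty"
                ValidatorHelper.addErrorMessage(context, "属性值不能为空");
                return false;
            }
        }

        Method getCodeMethod;

        Class<? extends Enum<?>> enumClass = forEnum.enumClass();
        try {
            // Check that the enum implements a getCode() method
            getCodeMethod = enumClass.getMethod("getCode");
        } catch (Exception e) {
            // Message: "<enum class> has no getCode method"
            throw new IllegalArgumentException(enumClass.getName() + " 找不到getCode方法", e);
        }

        // Check whether the value matches one of the enum codes
        StringBuilder buf = new StringBuilder();
        for (Object enumObject : enumClass.getEnumConstants()) {
            try {
                String result = (String) getCodeMethod.invoke(enumObject);
                if (value.equals(result)) {
                    return true;
                }
                buf.append("[").append(result).append("]");
            } catch (Exception e) {
                // Message: "<enum class> failed while calling getCode"
                throw new IllegalArgumentException(enumClass.getName() + "调用getCode出错", e);
            }
        }
        // Message template: "allowed enum values are: %s, but the current value is %s"
        ValidatorHelper.addErrorMessage(context, "可选枚举值为:%s,但当前值为%s", buf.toString(), value);
        return false;
    }
}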

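If validating MerchantExtBusiModel was the problem, could it be related to the regular expressions in @Pattern? After all, there are real-world precedents of regular expressions causing outages. To test the hypothesis that a regular expression was causing the StackOverflowError, I removed every @Pattern annotation from MerchantExtBusiModel: if @Pattern were to blame, the error would disappear on the next request. Full of hope, I retried, and the StackOverflowError came right back...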

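Debugging again

I had not wanted to dig into the Hibernate Validator source at first; the original goal was just to fix the problem quickly and get back to testing. But once every other approach had failed, I had no choice but to follow the stack trace through the source. After several rounds of debugging, I finally found the problem: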

public class BeginState implements ParserState {
    // 1. Interpretation of the custom error message starts in this method
    @Override
    public void start(TokenCollector tokenCollector) throws MessageDescriptorFormatException {
        // Then call TokenCollector.next() to fetch the next character of the error message
        tokenCollector.next();
    }

    @Override
    public void handleNonMetaCharacter(char character, TokenCollector tokenCollector)
            throws MessageDescriptorFormatException {
        tokenCollector.appendToToken( character );
        // Set the state handler to use from now on
        tokenCollector.transitionState( new MessageState() );
        // 4. Then recursively call TokenCollector to parse the next character
        tokenCollector.next();
    }
}

public class TokenCollector {
    // 2. Parse the error message: process the next character
    public void next() throws MessageDescriptorFormatException {
        if ( currentPosition == originalMessageDescriptor.length() ) {
            // give the current context the chance to complete
            currentParserState.terminate( this );
            return;
        }
        char currentCharacter = originalMessageDescriptor.charAt( currentPosition );
        currentPosition++;
        switch ( currentCharacter ) {
            case BEGIN_TERM: {
                currentParserState.handleBeginTerm( currentCharacter, this );
                break;
            }
            case END_TERM: {
                currentParserState.handleEndTerm( currentCharacter, this );
                break;
            }
            case EL_DESIGNATOR: {
                currentParserState.handleELDesignator( currentCharacter, this );
                break;
            }
            case ESCAPE_CHARACTER: {
                currentParserState.handleEscapeCharacter( currentCharacter, this );
                break;
            }
            // 3. Ordinary characters of the custom message only hit the default branch, which then calls MessageState
            default: {
                currentParserState.handleNonMetaCharacter( currentCharacter, this );
            }
        }
        // make sure the last token is terminated
        terminateToken();
    }
}

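So @ForEnum was the culprit. Hibernate Validator parses a custom message recursively, one character at a time, collecting the characters into tokens before it outputs the error message. @ForEnum happened to produce a very long message, close to 600 characters, so MessageState recursed at least 600 times (and, as the stack trace shows, each character adds two frames: handleNonMetaCharacter and next). That eventually exceeded the thread's JVM stack, and the StackOverflowError duly arrived. The message was that long because @ForEnum prints every allowed enum value, and here it met an enum class with more than a hundred constants: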
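To make the call pattern concrete, here is a minimal sketch of the same mutual recursion, reduced to plain Java. The class and method names (RecursiveParseSketch, maxDepthFor) are made up for illustration, and the depth counter is an addition; only the "next() calls handleNonMetaCharacter(), which calls next() again" shape is taken from the Hibernate Validator excerpt above. It ignores meta characters and state transitions entirely.

```java
class RecursiveParseSketch {
    private final String message;
    private final StringBuilder token = new StringBuilder();
    private int position = 0;
    private int depth = 0;     // current number of nested calls
    private int maxDepth = 0;  // deepest nesting seen so far

    RecursiveParseSketch(String message) {
        this.message = message;
    }

    // Plays the role of TokenCollector.next(): read one character and hand it to the handler.
    void next() {
        enter();
        if (position < message.length()) {
            char c = message.charAt(position++);
            handleNonMetaCharacter(c);
        }
        depth--;
    }

    // Plays the role of MessageState.handleNonMetaCharacter(): append, then recurse into next().
    void handleNonMetaCharacter(char c) {
        enter();
        token.append(c);
        next();
        depth--;
    }

    private void enter() {
        depth++;
        maxDepth = Math.max(maxDepth, depth);
    }

    // Returns the deepest call nesting reached while "parsing" the message.
    public static int maxDepthFor(String message) {
        RecursiveParseSketch parser = new RecursiveParseSketch(message);
        parser.next();
        return parser.maxDepth;
    }

    public static void main(String[] args) {
        StringBuilder longMessage = new StringBuilder();
        for (int i = 0; i < 600; i++) {
            longMessage.append('x');
        }
        System.out.println(maxDepthFor("short"));                // prints 11
        System.out.println(maxDepthFor(longMessage.toString())); // prints 1201
    }
}
```

In this simplified model a message of n ordinary characters nests 2n + 1 calls deep, so the stack depth grows linearly with the length of the error message rather than staying constant as it would in a loop.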

StringBuilder buf = new StringBuilder();
for (Object enumObject : enumClass.getEnumConstants()) {
    try {
        String result = (String) getCodeMethod.invoke(enumObject);
        if (value.equals(result)) {
            return true;
        }
        buf.append("[").append(result).append("]");
    } catch (Exception e) {
        throw new IllegalArgumentException(enumClass.getName() + "调用getCode出错", e);
    }
}
// Message template: "allowed enum values are: %s, but the current value is %s"
ValidatorHelper.addErrorMessage(context, "可选枚举值为:%s,但当前值为%s", buf.toString(), value);
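The original post does not say how the validator was eventually changed. One possible mitigation, sketched here under that caveat, is to cap how many enum codes are listed so the message stays short no matter how large the enum is. EnumMessageSketch, describeAllowed and the limit of 10 are all hypothetical names and values, and the message text is an English stand-in for the original template.

```java
import java.util.Arrays;
import java.util.List;

class EnumMessageSketch {
    // Hypothetical cap on how many codes are spelled out; not taken from the original system.
    static final int MAX_LISTED = 10;

    // Builds a bounded error message: at most MAX_LISTED codes, then a count of the rest.
    public static String describeAllowed(List<String> codes, String actual) {
        StringBuilder buf = new StringBuilder();
        int shown = Math.min(codes.size(), MAX_LISTED);
        for (int i = 0; i < shown; i++) {
            buf.append('[').append(codes.get(i)).append(']');
        }
        if (codes.size() > shown) {
            buf.append("... (").append(codes.size() - shown).append(" more)");
        }
        return String.format("allowed values: %s, but got %s", buf, actual);
    }

    public static void main(String[] args) {
        // prints "allowed values: [01][02][03], but got 99"
        System.out.println(describeAllowed(Arrays.asList("01", "02", "03"), "99"));
    }
}
```

With a cap like this the message length is bounded by the limit rather than by the number of enum constants, at the cost of not showing every valid value to the caller.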

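The above was still only a hypothesis. To verify it, setting shareholderCertType to an invalid ID document type should reproduce the error, and changing it back to a valid value should make the StackOverflowError go away. I tried this in dev, and that is exactly what happened.

One more oddity: a test case written locally could not reproduce the problem. My best guess is that my local machine runs macOS, where the default stack size per thread is 1024 KB, whereas the dev environment runs Linux with the per-thread stack size set to 256 KB. The stack needed by this call chain falls somewhere between 256 KB and 1024 KB, so the overflow only shows up in dev.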
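The effect of the thread stack size can be tried out in isolation. The sketch below is not the original test case: StackSizeSketch, survives and the depth of 50,000 are illustrative assumptions. It runs the same recursion on threads created with different requested stack sizes, using the four-argument Thread constructor. The JVM treats that size as a hint and some platforms may ignore it, so no fixed output is claimed.

```java
class StackSizeSketch {
    // Recurses to the requested depth; every call occupies one more stack frame.
    static int recurse(int remaining) {
        return remaining == 0 ? 0 : 1 + recurse(remaining - 1);
    }

    // Runs recurse(depth) on a thread that requests the given stack size in bytes.
    // Returns true if the recursion completed, false if it ended in StackOverflowError.
    public static boolean survives(int depth, long stackSizeBytes) {
        boolean[] completed = {false};
        Thread worker = new Thread(null, () -> {
            try {
                recurse(depth);
                completed[0] = true;
            } catch (StackOverflowError e) {
                completed[0] = false;
            }
        }, "stack-size-sketch", stackSizeBytes);
        worker.start();
        try {
            worker.join(); // join() also makes the worker's write to completed[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the worker", e);
        }
        return completed[0];
    }

    public static void main(String[] args) {
        int depth = 50_000; // arbitrary depth chosen for the demonstration
        System.out.println("256 KB stack: " + survives(depth, 256L * 1024));
        System.out.println("64 MB stack:  " + survives(depth, 64L * 1024 * 1024));
    }
}
```

On a typical HotSpot JVM the small-stack run would be expected to overflow while the large-stack run completes, which is the same kind of environment-dependent behaviour described above. For a whole JVM, the per-thread default is usually set with the -Xss option instead.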

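Summary

In the end, the fault lay in our own code: in one particular case it produced an overly long error message, which Hibernate Validator failed to parse and which triggered the stack overflow. When a strange problem like this comes up again, our own code should be the first suspect :( Also, debugging really was the fastest way to solve the problem; next time something this odd happens, I should go straight to the debugger. The shortcut that looks nearest is not necessarily the fastest one~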