Hibernate Validator StackOverflowError

The framework throws a strange exception

Updated on 2019-09-12 12:22 (Created on: 2019-08-28 17:32)

Preface

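This month we have been refactoring the product system we maintain. Its code relies heavily on the Hibernate Validator framework for field validation. Last week, while running exception tests in the dev environment, we hit a strange error: testing the modify-merchant feature produced a StackOverflowError. The stack trace is shown below (repeated frames omitted):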

java.lang.StackOverflowError
	at java.lang.StringBuilder.append(StringBuilder.java:202)
	at org.hibernate.validator.internal.engine.messageinterpolation.parser.Token.append(Token.java:44)
	at org.hibernate.validator.internal.engine.messageinterpolation.parser.TokenCollector.appendToToken(TokenCollector.java:60)
	at org.hibernate.validator.internal.engine.messageinterpolation.parser.MessageState.handleNonMetaCharacter(MessageState.java:32)
	at org.hibernate.validator.internal.engine.messageinterpolation.parser.TokenCollector.next(TokenCollector.java:98)
	at org.hibernate.validator.internal.engine.messageinterpolation.parser.MessageState.handleNonMetaCharacter(MessageState.java:33)
	at org.hibernate.validator.internal.engine.messageinterpolation.parser.TokenCollector.next(TokenCollector.java:98)
	at org.hibernate.validator.internal.engine.messageinterpolation.parser.MessageState.handleNonMetaCharacter(MessageState.java:33)
	at org.hibernate.validator.internal.engine.messageinterpolation.parser.TokenCollector.next(TokenCollector.java:98)
	at org.hibernate.validator.internal.engine.messageinterpolation.parser.MessageState.handleNonMetaCharacter(MessageState.java:33)
	at org.hibernate.validator.internal.engine.messageinterpolation.parser.TokenCollector.next(TokenCollector.java:98)
	at org.hibernate.validator.internal.engine.messageinterpolation.parser.MessageState.handleNonMetaCharacter(MessageState.java:33)
	at org.hibernate.validator.internal.engine.messageinterpolation.parser.TokenCollector.next(TokenCollector.java:98)
	at org.hibernate.validator.internal.engine.messageinterpolation.parser.MessageState.handleNonMetaCharacter(MessageState.java:33)
	at org.hibernate.validator.internal.engine.messageinterpolation.parser.TokenCollector.next(TokenCollector.java:98)
	at org.hibernate.validator.internal.engine.messageinterpolation.parser.BeginState.handleNonMetaCharacter(BeginState.java:33)
	at org.hibernate.validator.internal.engine.messageinterpolation.parser.TokenCollector.next(TokenCollector.java:98)
	at org.hibernate.validator.internal.engine.messageinterpolation.parser.BeginState.start(BeginState.java:25)
	at org.hibernate.validator.internal.engine.messageinterpolation.parser.TokenCollector.parse(TokenCollector.java:106)
	at org.hibernate.validator.internal.engine.messageinterpolation.parser.TokenCollector.<init>(TokenCollector.java:43)
	at org.hibernate.validator.messageinterpolation.AbstractMessageInterpolator.interpolateBundleMessage(AbstractMessageInterpolator.java:369)
	at org.hibernate.validator.messageinterpolation.AbstractMessageInterpolator.interpolateMessage(AbstractMessageInterpolator.java:274)
	at org.hibernate.validator.messageinterpolation.AbstractMessageInterpolator.interpolate(AbstractMessageInterpolator.java:216)
	at org.hibernate.validator.internal.engine.ValidationContext.interpolate(ValidationContext.java:422)
	at org.hibernate.validator.internal.engine.ValidationContext.createConstraintViolation(ValidationContext.java:300)
	at org.hibernate.validator.internal.engine.ValidationContext.createConstraintViolations(ValidationContext.java:261)
	at org.hibernate.validator.internal.engine.constraintvalidation.ConstraintTree.validateSingleConstraint(ConstraintTree.java:456)
	at org.hibernate.validator.internal.engine.constraintvalidation.ConstraintTree.validateConstraints(ConstraintTree.java:127)
	at org.hibernate.validator.internal.engine.constraintvalidation.ConstraintTree.validateConstraints(ConstraintTree.java:87)
	at org.hibernate.validator.internal.metadata.core.MetaConstraint.validateConstraint(MetaConstraint.java:73)
	at org.hibernate.validator.internal.engine.ValidatorImpl.validateMetaConstraint(ValidatorImpl.java:617)
	at org.hibernate.validator.internal.engine.ValidatorImpl.validateConstraint(ValidatorImpl.java:580)
	at org.hibernate.validator.internal.engine.ValidatorImpl.validateConstraintsForSingleDefaultGroupElement(ValidatorImpl.java:524)
	at org.hibernate.validator.internal.engine.ValidatorImpl.validateConstraintsForDefaultGroup(ValidatorImpl.java:492)
	at org.hibernate.validator.internal.engine.ValidatorImpl.validateConstraintsForCurrentGroup(ValidatorImpl.java:457)
	at org.hibernate.validator.internal.engine.ValidatorImpl.validateInContext(ValidatorImpl.java:407)
	at org.hibernate.validator.internal.engine.ValidatorImpl.validate(ValidatorImpl.java:205)
	at com.mybank.bkmerchantprod.common.util.ParamCheckUtil.validate(ParamCheckUtil.java:52)
	at com.mybank.bkmerchantprod.core.service.v1.busimodel.BusiModelUtils.lambda$paramCheck$2(BusiModelUtils.java:147)
	at com.mybank.bkmerchantprod.core.service.v1.busimodel.BusiModelUtils.memberObjConsumer(BusiModelUtils.java:161)
	at com.mybank.bkmerchantprod.core.service.v1.busimodel.BusiModelUtils.paramCheck(BusiModelUtils.java:146)
	at com.mybank.bkmerchantprod.core.service.v1.template.ServiceTemplate.domainEventExecute(ServiceTemplate.java:148)
	at com.mybank.bkmerchantprod.biz.service.impl.trade.TradeMerchFacadeImpl$3.process(TradeMerchFacadeImpl.java:325)
	at com.mybank.bkmerchantprod.biz.service.impl.trade.TradeMerchFacadeImpl$3.process(TradeMerchFacadeImpl.java:308)
	at com.mybank.bkmerchantprod.biz.shared.support.BizTemplate.process(BizTemplate.java:104)

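The stack trace itself does not point to an obvious problem or show where the StackOverflowError originates; it only suggests that something is being processed recursively or in a loop. This was puzzling: why would calling validate cause a StackOverflowError? Our own frames in the call stack are just thin wrappers around validate, so what was actually causing it?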

Investigation

Hypothesis

After studying the stack trace for a long time without finding any clue, I fell back on the last resort: remote debugging.

MerchantExtBusiModel

Stack trace

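While debugging, I found that the exception occurred when the validator was checking MerchantExtBusiModel, so the next step was to study that class. However, MerchantExtBusiModel is just a simple POJO with no special logic apart from the validation annotations (getters and setters omitted):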

public class MerchantExtBusiModel extends BaseBusiModel {
    /**
     * Tax registration certificate number
     */
    private String taxNum;

    /**
     * Business license expiry date
     */
    @Pattern(regexp = "\\d{4}-\\d{2}-\\d{2}")
    private String bussAuthVld;

    /**
     * Controlling shareholder name
     */
    private String shareholderName;

    /**
     * Controlling shareholder ID document type
     */
    @ForEnum(enumClass = CertTypeEnum.class, isOptional = true)
    private String shareholderCertType;

    /**
     * Controlling shareholder ID document number
     */
    private String shareholderCertNo;

    /**
     * Controlling shareholder ID document expiry date
     */
    @Pattern(regexp = "\\d{4}-\\d{2}-\\d{2}")
    private String shareholderCertVld;

    /**
     * Expiry date of the ID document of the legal representative or person in charge
     */
    @Pattern(regexp = "\\d{4}-\\d{2}-\\d{2}")
    private String principalCertVld;

    /**
     * Gender
     */
    @ForEnum(enumClass = PersonSexEnum.class, isOptional = true)
    private String personSex;

    /**
     * Occupation (the message reads "occupation may contain at most 32 Chinese characters")
     */
    @Pattern(regexp = "^(?![\\u4e00-\\u9fa5\\·]{32,}$).+$", message = "职业最多32个汉字")
    private String personProfession;

    /**
     * ID document expiry date
     */
    @Pattern(regexp = "\\d{4}-\\d{2}-\\d{2}")
    private String personCertVld;
}

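The class uses the @Pattern and @ForEnum annotations. @ForEnum is a custom annotation of ours that checks whether an incoming field value belongs to an enum; if it does not, it returns an error message listing the valid enum values. The source of the ForEnum validator is as follows: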

public class ForEnumValidator implements ConstraintValidator<ForEnum, String> {

    /** The annotation instance, which carries the enum type */
    private ForEnum forEnum;

    /** Stores the annotation so that isValid can read it; otherwise forEnum would be null */
    @Override
    public void initialize(ForEnum forEnum) {
        this.forEnum = forEnum;
    }

    @Override
    public boolean isValid(String value, ConstraintValidatorContext context) {
        if (StringUtils.isBlank(value)) {
            if (forEnum.isOptional()) {
                return true;
            } else {
                // Message: "the property value must not be empty"
                ValidatorHelper.addErrorMessage(context, "属性值不能为空");
                return false;
            }
        }

        Method getCodeMethod;

        Class<? extends Enum<?>> enumClass = forEnum.enumClass();
        try {
            // Check that the enum implements a getCode() method
            getCodeMethod = enumClass.getMethod("getCode");
        } catch (Exception e) {
            // Message: "<enum class> getCode method not found"
            throw new IllegalArgumentException(enumClass.getName() + " 找不到getCode方法", e);
        }

        // Check whether the value matches one of the enum codes
        StringBuilder buf = new StringBuilder();
        for (Object enumObject : enumClass.getEnumConstants()) {
            try {
                String result = (String) getCodeMethod.invoke(enumObject);
                if (value.equals(result)) {
                    return true;
                }
                buf.append("[").append(result).append("]");
            } catch (Exception e) {
                // Message: "<enum class> error calling getCode"
                throw new IllegalArgumentException(enumClass.getName() + "调用getCode出错", e);
            }
        }
        // Message: "allowed enum values are: %s, but the current value is %s"
        ValidatorHelper.addErrorMessage(context, "可选枚举值为:%s,但当前值为%s", buf.toString(), value);
        return false;
    }
}

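If validating MerchantExtBusiModel was the problem, could the @Pattern regular expressions be involved? After all, regular expressions have caused real-world outages before. To test the hypothesis that a regular expression was causing the StackOverflowError, I removed every @Pattern annotation from MerchantExtBusiModel: if @Pattern were the culprit, the StackOverflowError would disappear on the next request. I retried full of hope, and the StackOverflowError came right back.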
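A lighter-weight way to test the regex hypothesis is to run the pattern directly through java.util.regex, with no validator involved. The following is only a minimal sketch: the class name, method name, and sample inputs are illustrative and not taken from our code base. It relies on the fact that @Pattern requires the whole value to match, which corresponds to Matcher.matches().

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Illustrative helper (hypothetical name): exercises the date regex from
// MerchantExtBusiModel on its own, without Hibernate Validator.
final class PatternIsolationCheck {

    // Same regex as the @Pattern annotations on the *Vld fields.
    static final Pattern DATE = Pattern.compile("\\d{4}-\\d{2}-\\d{2}");

    // Whole-input match, like @Pattern.
    static boolean isValidDate(String value) {
        return DATE.matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidDate("2019-08-28")); // well-formed date
        System.out.println(isValidDate("2019/08/28")); // wrong separator

        // A very long input, to see whether matching alone can blow the stack.
        char[] digits = new char[100_000];
        Arrays.fill(digits, '9');
        System.out.println(isValidDate(new String(digits)));
    }
}
```

If a check like this runs cleanly even on long inputs, the regular expression is an unlikely source of the overflow, which matches what removing the annotations showed.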

Debugging again

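Originally I did not want to dig into the Hibernate Validator source; I just wanted to fix the problem quickly and get back to testing. But with every other approach exhausted, the only option left was to follow the stack trace through the source code. After several rounds of debugging, I finally found the problem: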

public class BeginState implements ParserState {
	// 1. Parsing of the custom error message starts in this method
	@Override
	public void start(TokenCollector tokenCollector) throws MessageDescriptorFormatException {
		// Then call TokenCollector.next() to fetch the next character of the error message
		tokenCollector.next();
	}

	@Override
	public void handleNonMetaCharacter(char character, TokenCollector tokenCollector)
			throws MessageDescriptorFormatException {
		tokenCollector.appendToToken( character );
		// Switch the current parser state
		tokenCollector.transitionState( new MessageState() );
		// 4. Recurse into TokenCollector to parse the next character
		tokenCollector.next();
	}
}

public class TokenCollector {
	// 2. Parse the error message: handle the next character
	public void next() throws MessageDescriptorFormatException {
		if ( currentPosition == originalMessageDescriptor.length() ) {
			// give the current context the chance to complete
			currentParserState.terminate( this );
			return;
		}
		char currentCharacter = originalMessageDescriptor.charAt( currentPosition );
		currentPosition++;
		switch ( currentCharacter ) {
			case BEGIN_TERM: {
				currentParserState.handleBeginTerm( currentCharacter, this );
				break;
			}
			case END_TERM: {
				currentParserState.handleEndTerm( currentCharacter, this );
				break;
			}
			case EL_DESIGNATOR: {
				currentParserState.handleELDesignator( currentCharacter, this );
				break;
			}
			case ESCAPE_CHARACTER: {
				currentParserState.handleEscapeCharacter( currentCharacter, this );
				break;
			}
			// 3. Ordinary characters of the custom message only hit the default branch, which calls into the current state (MessageState)
			default: {
				currentParserState.handleNonMetaCharacter( currentCharacter, this );
			}
		}
		// make sure the last token is terminated
		terminateToken();
	}
}

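So @ForEnum was to blame. Hibernate Validator parses the user's custom message recursively, one character at a time, while splitting it into tokens, and then outputs the error message. @ForEnum happened to produce a very long error string of nearly 600 characters, which made MessageState recurse at least 600 times, eventually exceeding the capacity of the JVM thread stack and producing the StackOverflowError. The message was so long because @ForEnum prints every allowed enum value, and this time it ran into an enum class with a hundred or so constants: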
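To make the "one level of recursion per character" behavior concrete, here is a deliberately simplified stand-in for the parser. It is not Hibernate Validator's code; the class and method names are made up for illustration. In the real parser each character costs two frames (TokenCollector.next and handleNonMetaCharacter, as the stack trace shows), so the effect is even stronger than in this sketch.

```java
// Simplified stand-in for the message parser (NOT Hibernate Validator's code):
// each character of the message adds one more level of recursion.
final class RecursiveParseSketch {

    // Consumes message[position..] recursively and returns how many
    // nested calls were needed to reach the end of the input.
    static int parse(String message, int position, StringBuilder token) {
        if (position == message.length()) {
            return position; // end of input: depth equals characters consumed
        }
        token.append(message.charAt(position));
        return parse(message, position + 1, token);
    }

    static int depthFor(String message) {
        return parse(message, 0, new StringBuilder());
    }

    public static void main(String[] args) {
        // Fake "allowed values" list: 150 codes, 3 characters each.
        StringBuilder longMessage = new StringBuilder();
        for (int i = 0; i < 150; i++) {
            longMessage.append("[").append(i % 10).append("]");
        }
        System.out.println(depthFor("abc"));                  // 3
        System.out.println(depthFor(longMessage.toString())); // 450
    }
}
```

The recursion depth grows linearly with the message length, so a message built from every constant of a large enum pushes the stack much deeper than a short, fixed message would.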

        StringBuilder buf = new StringBuilder();
        for (Object enumObject : enumClass.getEnumConstants()) {
            try {
                String result = (String) getCodeMethod.invoke(enumObject);
                if (value.equals(result)) {
                    return true;
                }
                buf.append("[").append(result).append("]");
            } catch (Exception e) {
                throw new IllegalArgumentException(enumClass.getName() + "调用getCode出错", e);
            }
        }
        // Message: "allowed enum values are: %s, but the current value is %s"
        ValidatorHelper.addErrorMessage(context, "可选枚举值为:%s,但当前值为%s", buf.toString(), value);

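This was still only a hypothesis. To verify it, setting shareholderCertType to an invalid certificate type should reproduce the error, and changing it back to a valid value should make the StackOverflowError go away. A round of testing in dev confirmed exactly that. One oddity remained: a local test case could not reproduce the problem. My best guess is that my local machine runs macOS, where the default stack size per thread is 1024 KB, whereas the dev environment runs Linux with the thread stack size configured to 256 KB. The stack usage at the point of overflow fell between 256 KB and 1024 KB, so the error only reproduced in dev.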
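The stack-size guess can be checked more directly by measuring how deep a trivial recursion gets under different requested stack sizes. The sketch below assumes the JVM honors the stackSize argument of the Thread constructor; the Javadoc allows a platform to round it or ignore it, so the absolute numbers are only indicative. The class and method names are illustrative. For a whole service, the thread stack size is normally set with the -Xss JVM option instead.

```java
// Measures how deep a trivial recursion gets before StackOverflowError
// for a requested thread stack size. Results vary by platform and JVM.
final class StackDepthProbe {

    private static int depth;

    private static void recurse() {
        depth++;
        recurse();
    }

    // Runs the recursion in a fresh thread created with the requested
    // stack size and returns the depth reached when the stack overflowed.
    static int maxDepth(long stackSizeBytes) {
        final int[] result = new int[1];
        Thread probe = new Thread(null, () -> {
            depth = 0;
            try {
                recurse();
            } catch (StackOverflowError expected) {
                result[0] = depth;
            }
        }, "stack-depth-probe", stackSizeBytes);
        probe.start();
        try {
            probe.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while probing", e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println("256 KB:  " + maxDepth(256 * 1024));
        System.out.println("1024 KB: " + maxDepth(1024 * 1024));
    }
}
```

If the larger stack consistently reaches a noticeably greater depth, that supports the explanation that the same validation call fits in a 1024 KB stack but not in a 256 KB one.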

Summary

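In the end, the fault lay in our own code: under certain conditions it produced an overly long error message, which Hibernate Validator failed to parse, overflowing the stack. The next time something this strange happens, our own code should be the first suspect :( Also, debugging really was the fastest way to solve the problem; if I run into something like this again, I should start debugging right away. The path that looks shortest is not necessarily the fastest one.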
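One possible way to keep the error message bounded is to list only the first few allowed codes and summarize the rest. This is a sketch of the idea rather than the change we actually made; BoundedEnumMessage, describe, and maxShown are hypothetical names, and maxShown is assumed to be non-negative.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds the "allowed values" part of the error message
// but lists at most maxShown codes, so the message length stays bounded.
final class BoundedEnumMessage {

    static String describe(List<String> codes, int maxShown) {
        StringBuilder buf = new StringBuilder();
        int shown = Math.min(codes.size(), maxShown);
        for (int i = 0; i < shown; i++) {
            buf.append("[").append(codes.get(i)).append("]");
        }
        // Summarize whatever was left out instead of printing it.
        if (codes.size() > shown) {
            buf.append("... (").append(codes.size() - shown).append(" more)");
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(Arrays.asList("01", "02", "03", "04"), 2));
        // prints: [01][02]... (2 more)
    }
}
```

With a cap like this, the message length no longer depends on how many constants the enum has, so the parser's recursion depth stays small regardless of the enum.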