写在前面
seata是阿里巴巴开源,用于解决分布式事务的中间件,目前在github上已经拥有18k+的star,是分布式事务中间件的翘楚,它拥有四种解决分布式事务的模式:AT、TCC、XA、SAGA。
下面给出四种模式的简要说明:
AT:一种通过动态代理实现无侵入的分布式事务解决方案,是2pc的一种实现,一阶段:业务数据和回滚日志记录在同一个本地事务中提交,释放本地锁和连接资源。二阶段:提交异步化,非常快速地完成,而回滚则通过一阶段的回滚日志进行反向补偿。
TCC:一种侵入式解决方案,每个分支事务都需要自己实现TCC的行为,支持把 “自定义” 的分支事务纳入到全局事务的管理中。
XA:利用事务资源(数据库、消息服务等)对 XA 协议的支持,以 XA 协议的机制来管理分支事务的一种解决方案。
SAGA:把事务看成多个阶段,每个阶段都可以向前补偿与向后回滚,需要自己实现一阶段正向服务和二阶段补偿服务,以状态机引擎来驱动全局事务的一种解决方案。
本文会按照AT模式的执行全局事务的主流程顺序,对三端(TM、RM、TC)如何协同处理分布式事务进行解析,解析顺序是TM->RM->TC,会对源码中无关的部分做适当的删减,注明代码省略,以便阅读。
下面是AT模式三端的简要说明:
TM (Transaction Manager) - 事务管理器:定义全局事务的范围:开始全局事务、提交或回滚全局事务。
RM (Resource Manager) - 资源管理器:管理分支事务处理的资源,与TC交谈以注册分支事务和报告分支事务的状态,并驱动分支事务提交或回滚。
TC (Transaction Coordinator) - 事务协调者:维护全局和分支事务的状态,驱动全局事务提交或回滚。
AT-TM端分析
AT-TM端是基于spring的ioc与aop能力对原方法进行动态代理,处理全局事务的发起与提交或回滚。
@Bean
@DependsOn({BEAN_NAME_SPRING_APPLICATION_CONTEXT_PROVIDER, BEAN_NAME_FAILURE_HANDLER})
@ConditionalOnMissingBean(GlobalTransactionScanner.class)
public GlobalTransactionScanner globalTransactionScanner(SeataProperties seataProperties, FailureHandler failureHandler) {
...//代码省略
return new GlobalTransactionScanner(seataProperties.getApplicationId(), seataProperties.getTxServiceGroup(), failureHandler);
}
SeataAutoConfiguration是AT-TM端的源头,利用了spring自动注入的特点,为某个方法打上@Bean注解,生成GlobalTransactionScanner。
接下来我们进入GlobalTransactionScanner,它实现了spring的InitializingBean接口,因此利用spring的特性,当bean实现时会调用afterPropertiesSet方法进行初始化。
@Override
public void afterPropertiesSet() {
ConfigurationCache.addConfigListener(ConfigurationKeys.DISABLE_GLOBAL_TRANSACTION,
(ConfigurationChangeListener)this);
...//代码省略
if (initialized.compareAndSet(false, true)) {
initClient();
}
...//代码省略
}
GlobalTransactionScanner的afterPropertiesSet方法中,initClient会对TM、RM的client进行初始化,对关闭钩子的注册。
同时,GlobalTransactionScanner也实现了AbstractAutoProxyCreator抽象类,会对所有由spring生成的bean进行wrapIfNecessary的二次处理,正是由此方法,为打上@GlobalTransaction注解的bean进行动态代理,由GlobalTransactionalInterceptor进行代理。
@Override
protected Object wrapIfNecessary(Object bean, String beanName, Object cacheKey) {
try {
synchronized (PROXYED_SET) {
...//代码省略
if (interceptor == null) {
if (globalTransactionalInterceptor == null) {
globalTransactionalInterceptor = new GlobalTransactionalInterceptor(failureHandlerHook);
ConfigurationCache.addConfigListener(
ConfigurationKeys.DISABLE_GLOBAL_TRANSACTION,
(ConfigurationChangeListener)globalTransactionalInterceptor);
}
interceptor = globalTransactionalInterceptor;
}
}
if (!AopUtils.isAopProxy(bean)) {
bean = super.wrapIfNecessary(bean, beanName, cacheKey);
} else {
AdvisedSupport advised = SpringProxyUtils.getAdvisedSupport(bean);
Advisor[] advisor = buildAdvisors(beanName, getAdvicesAndAdvisorsForBean(null, null, null));
for (Advisor avr : advisor) {
advised.addAdvisor(0, avr);
}
}
PROXYED_SET.add(beanName);
return bean;
}
...//代码省略
}
由GlobalTransactionalInterceptor接管后,调用将经过它的invoke方法,再经过handleGlobalTransaction或者handleGlobalLock进行处理,这是看方法上的注解是@GlobalTransactional或者@GlobalLock决定的。
@Override
public Object invoke(final MethodInvocation methodInvocation) throws Throwable {
...//代码省略
if (!localDisable) {
if (globalTransactionalAnnotation != null) {
return handleGlobalTransaction(methodInvocation, globalTransactionalAnnotation);
} else if (globalLockAnnotation != null) {
return handleGlobalLock(methodInvocation, globalLockAnnotation);
}
}
}
return methodInvocation.proceed();
}
handleGlobalLock其实是handleGlobalTransaction的子集,接下来只分析handleGlobalTransaction方法,handleGlobalTransaction比较简单,直接调用事务模板类TransactionalTemplate的execute方法继续进行,报错则调用错误处理器进行处理,最后发送消息到事务总线。execute入参是TransactionalExecutor,封装了对原方法的调用、名称、事务配置的获取。
Object handleGlobalTransaction(final MethodInvocation methodInvocation,
final GlobalTransactional globalTrxAnno) throws Throwable {
boolean succeed = true;
try {
return transactionalTemplate.execute(new TransactionalExecutor() {
@Override
public Object execute() throws Throwable {
return methodInvocation.proceed();
}
...//代码省略
@Override
public TransactionInfo getTransactionInfo() {
// reset the value of timeout
int timeout = globalTrxAnno.timeoutMills();
if (timeout <= 0 || timeout == DEFAULT_GLOBAL_TRANSACTION_TIMEOUT) {
timeout = defaultGlobalTransactionTimeout;
}
TransactionInfo transactionInfo = new TransactionInfo();
transactionInfo.setTimeOut(timeout);
transactionInfo.setName(name());
transactionInfo.setPropagation(globalTrxAnno.propagation());
transactionInfo.setLockRetryInternal(globalTrxAnno.lockRetryInternal());
transactionInfo.setLockRetryTimes(globalTrxAnno.lockRetryTimes());
Set<RollbackRule> rollbackRules = new LinkedHashSet<>();
...//代码省略
transactionInfo.setRollbackRules(rollbackRules);
return transactionInfo;
}
});
} catch (TransactionalExecutor.ExecutionException e) {
TransactionalExecutor.Code code = e.getCode();
switch (code) {
...//代码省略
case RollbackRetrying:
failureHandler.onRollbackRetrying(e.getTransaction(), e.getOriginalException());
throw e.getOriginalException();
default:
throw new ShouldNeverHappenException(String.format("Unknown TransactionalExecutor.Code: %s", code));
}
} finally {
if (degradeCheck) {
EVENT_BUS.post(new DegradeCheckEvent(succeed));
}
}
}
接下来到seata事务的核心,事务模板TransactionalTemplate,里面封装了spring的事务传播模式:
NOT_SUPPORTED(如果有事务则挂起,不在事务中执行原方法)
REQUIRES_NEW(如果有事务则挂起,新建事务中执行原方法)
SUPPORTS(如果不存在事务则直接执行原方法,若存在事务则在原事务执行原方法)
REQUIRED(不进行任何处理,若存在事务则在事务中执行,否则相反)
NEVER(若存在事务直接报错,没有事务则执行)
MANDATORY(不存在事务则报错,必须在原事务中执行)
,还有类似spring的事务执行顺序,beginTransaction->business.execute->completeTransactionAfterThrowing->commitTransaction->cleanUp。似曾相识,在seata还能复习下spring事务,哈哈。这里的英文注释非常详细,很容易看明白seata的封装方式。
public Object execute(TransactionalExecutor business) throws Throwable {
...//代码省略
TransactionInfo txInfo = business.getTransactionInfo();
Propagation propagation = txInfo.getPropagation();
try {
switch (propagation) {
case NOT_SUPPORTED:
// If transaction is existing, suspend it.
if (existingTransaction(tx)) {
suspendedResourcesHolder = tx.suspend();
}
// Execute without transaction and return.
return business.execute();
case REQUIRES_NEW:
// If transaction is existing, suspend it, and then begin new transaction.
if (existingTransaction(tx)) {
suspendedResourcesHolder = tx.suspend();
tx = GlobalTransactionContext.createNew();
}
// Continue and execute with new transaction
break;
case SUPPORTS:
// If transaction is not existing, execute without transaction.
if (notExistingTransaction(tx)) {
return business.execute();
}
// Continue and execute with new transaction
break;
case REQUIRED:
// If current transaction is existing, execute with current transaction,
// else continue and execute with new transaction.
break;
case NEVER:
// If transaction is existing, throw exception.
if (existingTransaction(tx)) {
throw new TransactionException(
String.format("Existing transaction found for transaction marked with propagation 'never', xid = %s"
, tx.getXid()));
} else {
// Execute without transaction and return.
return business.execute();
}
case MANDATORY:
// If transaction is not existing, throw exception.
if (notExistingTransaction(tx)) {
throw new TransactionException("No existing transaction found for transaction marked with propagation 'mandatory'");
}
// Continue and execute with current transaction.
break;
default:
throw new TransactionException("Not Supported Propagation:" + propagation);
}
...//代码省略
try {
beginTransaction(txInfo, tx);
Object rs;
try {
// Do Your Business
rs = business.execute();
} catch (Throwable ex) {
completeTransactionAfterThrowing(txInfo, tx, ex);
throw ex;
}
commitTransaction(tx);
return rs;
} finally {
...//代码省略
cleanUp();
}
}...//代码省略
}
1.来到beginTransaction,如果看过镇楼图,就是TM向TC注册事务的过程,调用链路是TransactionalTemplate->DefaultGlobalTransaction->DefaultTransactionManager的begin方法,值得一提的是,TM端(DefaultTransactionManager)和TC端(DefaultCore)处理事务的类,都是实现了TransactionManager接口,的确TM和TC端的处理方法是一一对应的。在DefaultTransactionManager使用TmNettyRemotingClient发送事务注册请求,还记得刚开始初始化的TMclient吗?养兵千日,用在一时。
@Override
public String begin(String applicationId, String transactionServiceGroup, String name, int timeout)
throws TransactionException {
GlobalBeginRequest request = new GlobalBeginRequest();
request.setTransactionName(name);
request.setTimeout(timeout);
GlobalBeginResponse response = (GlobalBeginResponse) syncCall(request);
if (response.getResultCode() == ResultCode.Failed) {
throw new TmTransactionException(TransactionExceptionCode.BeginFailed, response.getMsg());
}
return response.getXid();
}
2.business.execute是原方法的执行,会在AT-RM端进行分析,此处省略。
3.completeTransactionAfterThrowing是在原方法调用后报错进行rollback,跟begin注册事务的方式几乎一模一样,都是使用TMclient向TC发出rollback请求,此处省略。
4.commitTransaction是在原方法调用成功后进行commit,跟begin注册事务的方式几乎一模一样,都是使用TMclient向TC发出commit请求,此处省略。
5.cleanUp是在事务模板执行后进行清扫现场的方法,目的是对保存在ThreadLocal中的事务钩子进行清除。
AT-TM端总结
AT-TM端对打上事务注解的方法使用了动态代理,将原方法封装成事务模板进行执行,是事务的注册到事务的提交或回滚的发起方。可以说AT-TM端生于spring,也对spring的事务处理进行增强。
AT-RM端分析
AT-RM端的实现思想比较巧妙,是通过自上而下的动态代理数据库相关类(Database、Connection、Statement),在原方法执行过程中对sql的执行进行拦截处理。
首先AT-RM端会将原DataSource进行代理形成DataSourceProxy,这样可以通过DataSourceProxy从原DataSource获取Connection进行代理形成ConnectionProxy,最后可以通过ConnectionProxy获取Statement进行代理形成StatementProxy。由ConnectionProxy与StatementProxy接管sql的执行与全局事务的处理。
@Bean
public DataSourceProxy dataSourceProxy(DataSource dataSource) {
return new DataSourceProxy(dataSource);
}
@Override
public ConnectionProxy getConnection() throws SQLException {
Connection targetConnection = targetDataSource.getConnection();
return new ConnectionProxy(this, targetConnection);
}
@Override
public Statement createStatement(int resultSetType, int resultSetConcurrency) throws SQLException {
Statement statement = targetConnection.createStatement(resultSetType, resultSetConcurrency);
return new StatementProxy<Statement>(this, statement);
}
直接进入StatementProxy,可以看到里面的方法都是使用了ExecuteTemplate执行模板进行处理,比如executeUpdate方法。
@Override
public int executeUpdate(String sql) throws SQLException {
this.targetSQL = sql;
return ExecuteTemplate.execute(this, (statement, args) -> statement.executeUpdate((String) args[0]), sql);
}
可以看到ExecuteTemplate.execute会区别对待不同的sql类型,生成不同的Executor,最后调用Executor的execute方法执行sql。
public static <T, S extends Statement> T execute(List<SQLRecognizer> sqlRecognizers,
StatementProxy<S> statementProxy,
StatementCallback<T, S> statementCallback,
Object... args) throws SQLException {
...//代码省略
Executor<T> executor;
...//代码省略
switch (sqlRecognizer.getSQLType()) {
case INSERT:
executor = EnhancedServiceLoader.load(InsertExecutor.class, dbType,
new Class[]{StatementProxy.class, StatementCallback.class, SQLRecognizer.class},
new Object[]{statementProxy, statementCallback, sqlRecognizer});
break;
case UPDATE:
executor = new UpdateExecutor<>(statementProxy, statementCallback, sqlRecognizer);
break;
case DELETE:
executor = new DeleteExecutor<>(statementProxy, statementCallback, sqlRecognizer);
break;
case SELECT_FOR_UPDATE:
executor = new SelectForUpdateExecutor<>(statementProxy, statementCallback, sqlRecognizer);
break;
default:
executor = new PlainExecutor<>(statementProxy, statementCallback);
break;
}
} else {
executor = new MultiExecutor<>(statementProxy, statementCallback, sqlRecognizers);
}
}
T rs;
try {
rs = executor.execute(args);
} catch (Throwable ex) {
...//代码省略
}
return rs;
}
选取一个比较简单又能体现完整AT-RM端流程的UpdateExecutor进行分析,如果设置了自动提交,会进入executeAutoCommitTrue流程,否则会进入executeAutoCommitFalse流程。
继续选取executeAutoCommitTrue流程进行分析,首先设置自动提交为false,然后执行executeAutoCommitFalse后会调用ConnectionProxy的commit方法,最后重置上下文与自动提交状态位,这正是很多中间件处理本地事务的步骤。
protected T executeAutoCommitTrue(Object[] args) throws Throwable {
ConnectionProxy connectionProxy = statementProxy.getConnectionProxy();
try {
connectionProxy.setAutoCommit(false);
return new LockRetryPolicy(connectionProxy).execute(() -> {
T result = executeAutoCommitFalse(args);
connectionProxy.commit();
return result;
});
} catch (Exception e) {
...//代码省略
throw e;
} finally {
connectionProxy.getContext().reset();
connectionProxy.setAutoCommit(true);
}
}
executeAutoCommitFalse流程会执行beforeImage保存前置镜像,调用原sql执行回调,afterImage保存后置镜像,合并镜像形成undolog与lockkey。
形成镜像的原理是根据原sql生成等价的查询sql,执行查询sql形成镜像。
形成lockkey的原理是根据镜像获取主键列表(primary key 简称pk)进行拼接形成lockkey。
protected T executeAutoCommitFalse(Object[] args) throws Exception {
...//代码省略
TableRecords beforeImage = beforeImage();
T result = statementCallback.execute(statementProxy.getTargetStatement(), args);
TableRecords afterImage = afterImage(beforeImage);
prepareUndoLog(beforeImage, afterImage);
return result;
}
protected void prepareUndoLog(TableRecords beforeImage, TableRecords afterImage) throws SQLException {
...//代码省略
ConnectionProxy connectionProxy = statementProxy.getConnectionProxy();
TableRecords lockKeyRecords = sqlRecognizer.getSQLType() == SQLType.DELETE ? beforeImage : afterImage;
String lockKeys = buildLockKey(lockKeyRecords);
connectionProxy.appendLockKey(lockKeys);
SQLUndoLog sqlUndoLog = buildUndoItem(beforeImage, afterImage);
connectionProxy.appendUndoLog(sqlUndoLog);
}
connectionProxy.commit就是最终的收尾流程,向TC进行注册的同时加上全局锁,本地sql与undolog一起提交到本地数据库,最后向TC报告本身事务状态,当然,RM与TC的交互还是用最开始初始化的RmClient,就不再细说了。
private void processGlobalTransactionCommit() throws SQLException {
try {
register();
} catch (TransactionException e) {
...//代码省略
}
try {
UndoLogManagerFactory.getUndoLogManager(this.getDbType()).flushUndoLogs(this);
targetConnection.commit();
} catch (Throwable ex) {
report(false);
throw new SQLException(ex);
}
if (IS_REPORT_SUCCESS_ENABLE) {
report(true);
}
context.reset();
}
AT-RM端总结
AT-RM端同样使用了动态代理,将原方法封装成执行模板进行执行,保证了全局事务的加锁、向TC报告本身事务状态、本地sql与undolog“同生共死”,一起提交到数据库,为全局提交或回滚做好准备。
AT-TC端分析
AT-TC端是负责整个全局事务的注册、提交或回滚的总控制节点。
由上文可知,AT-TC端接收并处理了TM端的全局事务的注册、提交或回滚请求,RM端的分支事务的报告本身状态请求。
当TM发起全局提交或回滚时,会回调全局事务底下所有RM的最终提交或回滚接口,删除undolog或者根据undolog进行重放恢复数据。
AT-TC端是以NettyRemotingServer启动并且接收处理来自TM与RM的请求,请求会流经DefaultCoordinator->DefaultCore到达默认的核心处理类。
首先看TM发起的全局事务的注册,会调用DefaultCore的begin方法,创建GlobalSession通过生命周期监听服务保存到数据库,向事务总线发送消息,调用完成后返回GlobalSession的xid给到TM,xid将由TM保存在执行链路的上下文中,保证RM分支事务提交时关联到TM全局事务。
@Override
public String begin(String applicationId, String transactionServiceGroup, String name, int timeout)
throws TransactionException {
GlobalSession session = GlobalSession.createGlobalSession(applicationId, transactionServiceGroup, name, timeout);
session.addSessionLifecycleListener(SessionHolder.getRootSessionManager());
session.begin();
eventBus.post(new GlobalTransactionEvent(session.getTransactionId(), GlobalTransactionEvent.ROLE_TC,
session.getTransactionName(), session.getBeginTime(), null, session.getStatus()));
return session.getXid();
}
RM的分支注册,会调用DefaultCore的branchRegister方法,创建BatchSession,检查全局锁,同样会通过生命周期监听服务保存到数据库,调用完成后返回BatchSession的btachId给到RM,btachId对RM的主要作用是与xid一起定位唯一一条undolog。
@Override
public Long branchRegister(BranchType branchType, String resourceId, String clientId, String xid,
String applicationData, String lockKeys) throws TransactionException {
GlobalSession globalSession = assertGlobalSessionNotNull(xid, false);
return SessionHolder.lockAndExecute(globalSession, () -> {
globalSessionStatusCheck(globalSession);
globalSession.addSessionLifecycleListener(SessionHolder.getRootSessionManager());
BranchSession branchSession = SessionHelper.newBranchByGlobal(globalSession, branchType, resourceId,
applicationData, lockKeys, clientId);
branchSessionLock(globalSession, branchSession);
try {
globalSession.addBranch(branchSession);
} catch (RuntimeException ex) {
branchSessionUnlock(branchSession);
...//代码省略
}
...//代码省略
return branchSession.getBranchId();
});
}
RM的报告本身事务状态,会调用DefaultCore的branchReport方法,会根据xid获取到关联的GlobalSession,再根据batchId获取到BranchSession,调用GlobalSession的changeBranchStatus改变分支状态,同样会通过生命周期监听服务保存到数据库。
@Override
public void branchReport(BranchType branchType, String xid, long branchId, BranchStatus status,
String applicationData) throws TransactionException {
GlobalSession globalSession = assertGlobalSessionNotNull(xid, true);
BranchSession branchSession = globalSession.getBranch(branchId);
...//代码省略
globalSession.addSessionLifecycleListener(SessionHolder.getRootSessionManager());
globalSession.changeBranchStatus(branchSession, status);
...//代码省略
}
TM的全局提交,会调用DefaultCore的commit方法,会通过加锁与判断全局事务模式与分支节点是否进行异步提交,如果是异步提交则将globalSession的status置为AsyncCommitting,等待定时线程池捞取此状态的GlobalSession进行提交,否则直接调用doGlobalCommit进行全局提交,若全局事务提交成功,但是还拥有分支节点,则继续走异步提交流程,清除分支节点。
@Override
public GlobalStatus commit(String xid) throws TransactionException {
GlobalSession globalSession = SessionHolder.findGlobalSession(xid);
...//代码省略
globalSession.addSessionLifecycleListener(SessionHolder.getRootSessionManager());
boolean shouldCommit = SessionHolder.lockAndExecute(globalSession, () -> {
globalSession.closeAndClean();
if (globalSession.getStatus() == GlobalStatus.Begin) {
if (globalSession.canBeCommittedAsync()) {
globalSession.asyncCommit();
return false;
} else {
globalSession.changeStatus(GlobalStatus.Committing);
return true;
}
}
return false;
});
if (shouldCommit) {
boolean success = doGlobalCommit(globalSession, false);
if (success && globalSession.hasBranch() && globalSession.canBeCommittedAsync()) {
globalSession.asyncCommit();
return GlobalStatus.Committed;
} else {
return globalSession.getStatus();
}
} else {
return globalSession.getStatus() == GlobalStatus.AsyncCommitting ? GlobalStatus.Committed : globalSession.getStatus();
}
}
doGlobalCommit执行全局事务下的所有分支提交回调,若分支提交状态不是PhaseTwo_Committed会进行重试,将globalSession的状态置为CommitRetrying,等待定时线程池捞取此状态的GlobalSession进行提交。
@Override
public boolean doGlobalCommit(GlobalSession globalSession, boolean retrying) throws TransactionException {
boolean success = true;
...//代码省略
...//if是SAGA模式处理,代码省略
else {
for (BranchSession branchSession : globalSession.getSortedBranches()) {
if (!retrying && branchSession.canBeCommittedAsync()) {
continue;
}
BranchStatus currentStatus = branchSession.getStatus();
if (currentStatus == BranchStatus.PhaseOne_Failed) {
globalSession.removeBranch(branchSession);
continue;
}
try {
BranchStatus branchStatus = getCore(branchSession.getBranchType()).branchCommit(globalSession, branchSession);
switch (branchStatus) {
case PhaseTwo_Committed:
globalSession.removeBranch(branchSession);
continue;
case PhaseTwo_CommitFailed_Unretryable:
if (globalSession.canBeCommittedAsync()) {
continue;
} else {
SessionHelper.endCommitFailed(globalSession);
return false;
}
default:
if (!retrying) {
globalSession.queueToRetryCommit();
return false;
}
if (globalSession.canBeCommittedAsync()) {
continue;
} else {
return false;
}
}
} catch (Exception ex) {
if (!retrying) {
globalSession.queueToRetryCommit();
throw new TransactionException(ex);
}
}
}
if (globalSession.hasBranch() && !globalSession.canBeCommittedAsync()) {
return false;
}
}
if (success && globalSession.getBranchSessions().isEmpty()) {
SessionHelper.endCommitted(globalSession);
...//代码省略
}
return success;
}
全局提交对每个branchSession都进行分支提交的回调,在RM端由DataSourceManager.branchCommit方法提交到待提交队列ASYNC_COMMIT_BUFFER,异步等待timerExecutor调用UndoLogManager的batchDeleteUndoLog方法进行删除undolog。
@Override
public BranchStatus branchCommit(BranchType branchType, String xid, long branchId, String resourceId,
String applicationData) throws TransactionException {
ASYNC_COMMIT_BUFFER.offer(new Phase2Context(branchType, xid, branchId, resourceId, applicationData)));
...//代码省略
return BranchStatus.PhaseTwo_Committed;
}
private void doBranchCommits() {
...//代码省略
for (Map.Entry<String, List<Phase2Context>> entry : mappedContexts.entrySet()) {
Connection conn = null;
DataSourceProxy dataSourceProxy;
try {
try {
...//代码省略
for (Phase2Context commitContext : contextsGroupedByResourceId) {
...//代码省略
if (maxSize == UNDOLOG_DELETE_LIMIT_SIZE) {
...//代码省略
UndoLogManagerFactory.getUndoLogManager(dataSourceProxy.getDbType()).batchDeleteUndoLog(
xids, branchIds, conn);
...//代码省略
xids.clear();
branchIds.clear();
}
}
...//代码省略
}
TM的全局回滚,会调用DefaultCore的rollback方法,会通过加锁与判断全局事务模式是否进行回滚,如果是进行回滚则将globalSession的status置为Rollbacking,直接调用doGlobalRollback进行全局回滚。
@Override
public GlobalStatus rollback(String xid) throws TransactionException {
GlobalSession globalSession = SessionHolder.findGlobalSession(xid);
...//代码省略
globalSession.addSessionLifecycleListener(SessionHolder.getRootSessionManager());
boolean shouldRollBack = SessionHolder.lockAndExecute(globalSession, () -> {
globalSession.close();
if (globalSession.getStatus() == GlobalStatus.Begin) {
globalSession.changeStatus(GlobalStatus.Rollbacking);
return true;
}
return false;
});
...//代码省略
doGlobalRollback(globalSession, false);
return globalSession.getStatus();
}
在doGlobalRollback中,会执行全局事务下的所有分支回滚回调,若分支提交状态不是PhaseTwo_Rollbacked会进行重试,将globalSession的状态置为TimeoutRollbackRetrying或者RollbackRetrying,等待定时线程池捞取此状态的GlobalSession进行回滚。
@Override
public boolean doGlobalRollback(GlobalSession globalSession, boolean retrying) throws TransactionException {
boolean success = true;
...//if是SAGA模式,代码省略
else {
for (BranchSession branchSession : globalSession.getReverseSortedBranches()) {
BranchStatus currentBranchStatus = branchSession.getStatus();
if (currentBranchStatus == BranchStatus.PhaseOne_Failed) {
globalSession.removeBranch(branchSession);
continue;
}
try {
BranchStatus branchStatus = branchRollback(globalSession, branchSession);
switch (branchStatus) {
case PhaseTwo_Rollbacked:
globalSession.removeBranch(branchSession);
continue;
case PhaseTwo_RollbackFailed_Unretryable:
SessionHelper.endRollbackFailed(globalSession);
return false;
default:
if (!retrying) {
globalSession.queueToRetryRollback();
}
return false;
}
} catch (Exception ex) {
if (!retrying) {
globalSession.queueToRetryRollback();
}
throw new TransactionException(ex);
}
}
...//代码省略
if (success) {
SessionHelper.endRollbacked(globalSession);
...//代码省略
}
return success;
}
全局回滚对每个branchSession都进行分支回滚的回调,在RM端由DataSourceManager.branchRollback方法调用UndoLogManager.undo逻辑,将undolog进行重放恢复数据。
@Override
public BranchStatus branchRollback(BranchType branchType, String xid, long branchId, String resourceId,
String applicationData) throws TransactionException {
DataSourceProxy dataSourceProxy = get(resourceId);
...//代码省略
try {
UndoLogManagerFactory.getUndoLogManager(dataSourceProxy.getDbType()).undo(dataSourceProxy, xid, branchId);
} catch (TransactionException te) {
...//代码省略
return BranchStatus.PhaseTwo_RollbackFailed_Retryable;
}
return BranchStatus.PhaseTwo_Rollbacked;
}
AT-TC端总结
新版的AT-TM端将数据存放在数据库而不是本地文件,进而可以基于zk、etcd等分布式协调服务进行高可用部署,它的主要功能是将全局事务的相关数据进行保存,掌控全局事务的提交与回滚回调,并且可以对全局事务进行补偿重试。
最后总结
AT模式通过本地事务先提交,全局事务提交时异步删除分支undolog,提高了全局事务的性能,但是相对于朴素的正向补偿,AT模式带来了形成镜像的查询、与TC通信加大耗时等负面效果,只能说AT模式是把双刃剑,并不是银弹。软件的艺术之美源于trade-off,是否使用还是要看业务形态的适用情况与布道者在关键时机的推广落地。