
A Complete Guide to Distributed Systems: Core Practices for High-Availability Architecture Design

廖万里 · 21 hours ago · Study Notes
"A distributed system is a collection of computers that communicate and coordinate over a network while presenting itself to the outside world as a single system. Understanding the principles and design of distributed systems is an essential skill for building large-scale, highly available applications."

1. Fundamental Theory of Distributed Systems

The CAP Theorem

The CAP theorem states that a distributed system can provide at most two of the following three guarantees at the same time:

- Consistency: all nodes see the same data at the same time
- Availability: every request receives a response (success or failure)
- Partition tolerance: the system keeps operating despite network partitions

Because network partitions are unavoidable in a distributed system, practical designs must trade consistency against availability.
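As a toy illustration (not part of any real system; all names here are invented for the example), the trade-off can be sketched in code: during a partition, a CP-style read refuses to answer rather than risk staleness, while an AP-style read always answers but may return stale data.

```java
import java.util.Optional;

// A replica cut off from the primary during a network partition
class Replica {
    private String value = "v1";
    private boolean partitioned = false;

    void partition() { partitioned = true; }

    // CP choice: refuse to answer rather than risk returning stale data
    Optional<String> readCP() {
        return partitioned ? Optional.empty() : Optional.of(value);
    }

    // AP choice: always answer, even though the value may be stale
    String readAP() {
        return value;
    }
}

public class CapDemo {
    public static void main(String[] args) {
        Replica r = new Replica();
        r.partition();
        System.out.println(r.readCP().isPresent()); // false: availability sacrificed
        System.out.println(r.readAP());             // "v1": consistency sacrificed
    }
}
```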

The BASE Theory

BASE extends the AP side of the CAP trade-off:

- Basically Available: the system may sacrifice part of its availability when failures occur
- Soft State: the system is allowed to pass through intermediate states
- Eventually Consistent: after some period of time, all replicas converge to the same value
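A minimal sketch of eventual consistency, assuming replicas that apply updates from a shared ordered log at their own pace (the types here are invented for illustration):

```java
import java.util.ArrayList;
import java.util.List;

// Each replica applies entries from a shared ordered log at its own pace
class EventualReplica {
    private int applied = 0;    // how many log entries this replica has applied
    private String state = "";  // last applied value

    void catchUp(List<String> log) {
        while (applied < log.size()) {
            state = log.get(applied); // apply the next update, in order
            applied++;
        }
    }

    String state() { return state; }
}

public class BaseDemo {
    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        EventualReplica a = new EventualReplica();
        EventualReplica b = new EventualReplica();

        log.add("x=1");
        a.catchUp(log); // a is up to date, b lags behind (soft state)
        System.out.println(a.state().equals(b.state())); // false: temporarily inconsistent

        b.catchUp(log); // b eventually catches up
        System.out.println(a.state().equals(b.state())); // true: converged
    }
}
```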

2. Consensus Protocols

The Paxos Algorithm

Paxos, proposed by Leslie Lamport, is the classic distributed consensus algorithm. It proceeds in two phases, Prepare and Accept, and uses proposal numbers to guarantee that the participants eventually agree on a single value.
// Paxos pseudocode (simplified single-decree version)
class Proposer {
    private int proposalNumber;
    
    public void propose(Object value) {
        int majority = acceptors.size() / 2 + 1;
        
        // Phase 1: Prepare. Collect promises; adopt the value of the
        // highest-numbered proposal any acceptor has already accepted.
        int promiseCount = 0;
        int highestAcceptedNumber = -1;
        for (Acceptor acceptor : acceptors) {
            Promise promise = acceptor.prepare(proposalNumber);
            if (promise == null) {
                continue;  // this acceptor rejected the prepare
            }
            promiseCount++;
            if (promise.hasAcceptedValue()
                    && promise.getAcceptedNumber() > highestAcceptedNumber) {
                highestAcceptedNumber = promise.getAcceptedNumber();
                value = promise.getAcceptedValue();
            }
        }
        if (promiseCount < majority) {
            return;  // no majority of promises; retry with a higher proposal number
        }
        
        // Phase 2: Accept
        int acceptedCount = 0;
        for (Acceptor acceptor : acceptors) {
            if (acceptor.accept(proposalNumber, value)) {
                acceptedCount++;
            }
        }
        
        if (acceptedCount >= majority) {
            notifyLearners(value);  // the value is chosen
        }
    }
}

class Acceptor {
    private int promisedNumber = -1;
    private int acceptedNumber = -1;
    private Object acceptedValue = null;
    
    public synchronized Promise prepare(int proposalNumber) {
        if (proposalNumber > promisedNumber) {
            promisedNumber = proposalNumber;
            // Promise to ignore lower-numbered proposals, and report
            // any value this acceptor has already accepted
            return new Promise(acceptedNumber, acceptedValue);
        }
        return null;  // reject: a higher-numbered proposal was already promised
    }
    
    public synchronized boolean accept(int proposalNumber, Object value) {
        if (proposalNumber >= promisedNumber) {
            promisedNumber = proposalNumber;
            acceptedNumber = proposalNumber;
            acceptedValue = value;
            return true;
        }
        return false;
    }
}

The Raft Algorithm

Raft is a consensus algorithm designed to be easier to understand than Paxos; it achieves consistency through leader election and log replication.
// Raft node state (simplified sketch; assumes the sync and time imports)
type State int

const (
    Follower State = iota
    Candidate
    Leader
)

type Node struct {
    mu          sync.Mutex
    id          int
    peers       []*Peer
    state       State
    currentTerm int
    votedFor    int
    log         []LogEntry
    commitIndex int
    
    // Election timeout
    electionTimeout time.Duration
    lastHeartbeat   time.Time
    
    // Leader-only replication state
    nextIndex  []int
    matchIndex []int
}

// Leader election
func (n *Node) startElection() {
    n.mu.Lock()
    n.state = Candidate
    n.currentTerm++
    n.votedFor = n.id
    term := n.currentTerm
    lastIndex := len(n.log) - 1
    lastTerm := n.log[lastIndex].Term
    n.mu.Unlock()
    
    votes := 1 // vote for ourselves
    for _, peer := range n.peers {
        go func(peer *Peer) {
            resp := peer.RequestVote(term, n.id, lastIndex, lastTerm)
            if resp.VoteGranted {
                n.mu.Lock()
                defer n.mu.Unlock()
                votes++ // guarded by the mutex to avoid a data race
                // Majority of the whole cluster (peers plus ourselves)
                if n.state == Candidate && votes > (len(n.peers)+1)/2 {
                    n.becomeLeader()
                }
            }
        }(peer)
    }
}

// Log replication
func (n *Node) appendEntries(entry LogEntry) {
    n.mu.Lock()
    n.log = append(n.log, entry)
    n.mu.Unlock()
    
    for i, peer := range n.peers {
        go func(idx int, p *Peer) {
            n.mu.Lock()
            prevLogIndex := n.nextIndex[idx] - 1
            prevLogTerm := n.log[prevLogIndex].Term
            entries := n.log[n.nextIndex[idx]:]
            term := n.currentTerm
            commit := n.commitIndex
            n.mu.Unlock()
            
            resp := p.AppendEntries(term, prevLogIndex, prevLogTerm, entries, commit)
            
            n.mu.Lock()
            defer n.mu.Unlock()
            if resp.Success {
                n.nextIndex[idx] += len(entries)
                n.matchIndex[idx] = n.nextIndex[idx] - 1
            } else {
                // The follower's log diverges: back off and retry with an earlier index
                n.nextIndex[idx]--
            }
        }(i, peer)
    }
}

3. Distributed ID Generation

The Snowflake Algorithm

Twitter's Snowflake algorithm generates 64-bit unique IDs, each composed of a timestamp, a machine ID, and a sequence number.
public class SnowflakeIdGenerator {
    private final long twepoch = 1288834974657L;
    private final long workerIdBits = 5L;
    private final long datacenterIdBits = 5L;
    private final long sequenceBits = 12L;
    
    private final long maxWorkerId = -1L ^ (-1L << workerIdBits);
    private final long maxDatacenterId = -1L ^ (-1L << datacenterIdBits);
    
    private final long workerIdShift = sequenceBits;
    private final long datacenterIdShift = sequenceBits + workerIdBits;
    private final long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits;
    private final long sequenceMask = -1L ^ (-1L << sequenceBits);
    
    private final long workerId;
    private final long datacenterId;
    private long sequence = 0L;
    private long lastTimestamp = -1L;
    
    public SnowflakeIdGenerator(long workerId, long datacenterId) {
        if (workerId > maxWorkerId || workerId < 0) {
            throw new IllegalArgumentException("workerId out of range");
        }
        if (datacenterId > maxDatacenterId || datacenterId < 0) {
            throw new IllegalArgumentException("datacenterId out of range");
        }
        this.workerId = workerId;
        this.datacenterId = datacenterId;
    }
    
    public synchronized long nextId() {
        long timestamp = timeGen();
        
        if (timestamp < lastTimestamp) {
            throw new RuntimeException("Clock moved backwards; refusing to generate an ID");
        }
        
        if (lastTimestamp == timestamp) {
            sequence = (sequence + 1) & sequenceMask;
            if (sequence == 0) {
                timestamp = tilNextMillis(lastTimestamp);
            }
        } else {
            sequence = 0L;
        }
        
        lastTimestamp = timestamp;
        
        return ((timestamp - twepoch) << timestampLeftShift)
            | (datacenterId << datacenterIdShift)
            | (workerId << workerIdShift)
            | sequence;
    }
    
    private long tilNextMillis(long lastTimestamp) {
        long timestamp = timeGen();
        while (timestamp <= lastTimestamp) {
            timestamp = timeGen();
        }
        return timestamp;
    }
    
    private long timeGen() {
        return System.currentTimeMillis();
    }
}
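As a sanity check on the bit layout above, an ID can be decoded back into its fields by reversing the shifts. The following standalone sketch reuses the same epoch and bit widths; `compose` and `decode` are helper names introduced just for this example:

```java
public class SnowflakeDecode {
    // Same epoch and bit widths as the generator above
    static final long TWEPOCH = 1288834974657L;
    static final long SEQ_BITS = 12L, WORKER_BITS = 5L, DC_BITS = 5L;

    static long compose(long timestamp, long datacenterId, long workerId, long sequence) {
        return ((timestamp - TWEPOCH) << (SEQ_BITS + WORKER_BITS + DC_BITS))
                | (datacenterId << (SEQ_BITS + WORKER_BITS))
                | (workerId << SEQ_BITS)
                | sequence;
    }

    // Reverse the shifts: returns {timestamp, datacenterId, workerId, sequence}
    static long[] decode(long id) {
        long seq = id & ((1L << SEQ_BITS) - 1);
        long worker = (id >> SEQ_BITS) & ((1L << WORKER_BITS) - 1);
        long dc = (id >> (SEQ_BITS + WORKER_BITS)) & ((1L << DC_BITS) - 1);
        long ts = (id >> (SEQ_BITS + WORKER_BITS + DC_BITS)) + TWEPOCH;
        return new long[]{ts, dc, worker, seq};
    }

    public static void main(String[] args) {
        long id = compose(TWEPOCH + 123456789L, 3, 7, 42);
        long[] p = decode(id);
        System.out.println(p[0] == TWEPOCH + 123456789L
                && p[1] == 3 && p[2] == 7 && p[3] == 42); // true
    }
}
```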

4. Distributed Locks

Redis-Based Distributed Lock

public class RedisDistributedLock {
    private final StringRedisTemplate redisTemplate;
    private final String lockKey;
    private final String lockValue;
    private final long expireTime;
    
    public RedisDistributedLock(StringRedisTemplate redisTemplate, String lockKey,
                                String lockValue, long expireTime) {
        this.redisTemplate = redisTemplate;
        this.lockKey = lockKey;
        this.lockValue = lockValue;
        this.expireTime = expireTime;
    }
    
    public boolean tryLock() {
        // SETNX and EXPIRE wrapped in one Lua script so the pair runs atomically
        String script = 
            "if redis.call('setnx', KEYS[1], ARGV[1]) == 1 then " +
            "  redis.call('expire', KEYS[1], ARGV[2]) " +
            "  return 1 " +
            "end " +
            "return 0";
        
        DefaultRedisScript<Long> redisScript = new DefaultRedisScript<>(script, Long.class);
        Long result = redisTemplate.execute(redisScript, 
            Collections.singletonList(lockKey), 
            lockValue, 
            String.valueOf(expireTime));
        
        return result != null && result == 1;
    }
    
    public void unlock() {
        // A Lua script keeps the compare-and-delete atomic
        String script = 
            "if redis.call('get', KEYS[1]) == ARGV[1] then " +
            "  return redis.call('del', KEYS[1]) " +
            "end " +
            "return 0";
        
        redisTemplate.execute(new DefaultRedisScript<>(script, Long.class), 
            Collections.singletonList(lockKey), lockValue);
    }
}

// Usage example
public void processWithLock() {
    RedisDistributedLock lock = new RedisDistributedLock(redisTemplate, "order:123", UUID.randomUUID().toString(), 30);
    
    if (lock.tryLock()) {
        try {
            // Execute the business logic
            processOrder();
        } finally {
            lock.unlock();
        }
    }
}

ZooKeeper-Based Distributed Lock

public class ZookeeperDistributedLock {
    private ZooKeeper zk;
    private String lockPath;
    private String currentLockPath;
    
    public void lock() throws Exception {
        // Create an ephemeral sequential node
        currentLockPath = zk.create(lockPath + "/lock-", 
            new byte[0], 
            ZooDefs.Ids.OPEN_ACL_UNSAFE, 
            CreateMode.EPHEMERAL_SEQUENTIAL);
        
        // Fetch all child nodes and sort them
        List<String> children = zk.getChildren(lockPath, false);
        Collections.sort(children);
        
        // Check whether our node is the smallest
        String currentNode = currentLockPath.substring(lockPath.length() + 1);
        int currentIndex = children.indexOf(currentNode);
        
        if (currentIndex == 0) {
            return;  // lock acquired
        }
        
        // Watch the node immediately ahead of ours
        String prevNode = lockPath + "/" + children.get(currentIndex - 1);
        final CountDownLatch latch = new CountDownLatch(1);
        
        Stat stat = zk.exists(prevNode, event -> {
            if (event.getType() == Event.EventType.NodeDeleted) {
                latch.countDown();
            }
        });
        
        if (stat != null) {
            latch.await();  // production code should re-check the children after waking
        }
    }
    
    public void unlock() throws Exception {
        zk.delete(currentLockPath, -1);
    }
}

5. Load Balancing Strategies

Consistent Hashing

public class ConsistentHash<T> {
    private final TreeMap<Integer, T> ring = new TreeMap<>();
    private final int virtualNodes;
    private final HashFunction hashFunction;
    
    public void addNode(T node) {
        for (int i = 0; i < virtualNodes; i++) {
            int hash = hashFunction.hash(node.toString() + ":" + i);
            ring.put(hash, node);
        }
    }
    
    public void removeNode(T node) {
        for (int i = 0; i < virtualNodes; i++) {
            int hash = hashFunction.hash(node.toString() + ":" + i);
            ring.remove(hash);
        }
    }
    
    public T getNode(String key) {
        if (ring.isEmpty()) {
            return null;
        }
        
        int hash = hashFunction.hash(key);
        
        // Find the first node whose position is >= the key's hash
        Map.Entry<Integer, T> entry = ring.ceilingEntry(hash);
        
        if (entry == null) {
            // The ring wraps around: fall back to the first node
            entry = ring.firstEntry();
        }
        
        return entry.getValue();
    }
}
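The key property of consistent hashing is that removing a node only remaps the keys that node owned; all other keys keep their assignments. The following self-contained sketch demonstrates this over a bare TreeMap, with hash positions hard-coded for illustration:

```java
import java.util.Map;
import java.util.TreeMap;

public class RingDemo {
    // Same lookup rule as above: first node at or after the hash, wrapping around
    static String lookup(TreeMap<Integer, String> ring, int keyHash) {
        Map.Entry<Integer, String> e = ring.ceilingEntry(keyHash);
        return (e != null ? e : ring.firstEntry()).getValue();
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> ring = new TreeMap<>();
        ring.put(100, "A");
        ring.put(200, "B");
        ring.put(300, "C");

        System.out.println(lookup(ring, 150)); // B
        System.out.println(lookup(ring, 350)); // wraps around to A

        ring.remove(200);                      // node B leaves the cluster
        System.out.println(lookup(ring, 150)); // only B's keys move, now C
        System.out.println(lookup(ring, 50));  // A's keys are unaffected
    }
}
```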

6. Distributed Transactions

Two-Phase Commit (2PC)

public class TwoPhaseCommitCoordinator {
    private List<Participant> participants;
    
    public boolean commit(Transaction transaction) {
        // Phase 1: Prepare
        for (Participant participant : participants) {
            if (!participant.prepare(transaction)) {
                // Any participant failing to prepare aborts: roll back everyone
                rollback(transaction);
                return false;
            }
        }
        
        // Phase 2: Commit
        for (Participant participant : participants) {
            participant.commit(transaction);
        }
        
        return true;
    }
    
    private void rollback(Transaction transaction) {
        for (Participant participant : participants) {
            try {
                participant.rollback(transaction);
            } catch (Exception e) {
                log.error("Rollback failed", e);
            }
        }
    }
}

Summary

Distributed system design means making sound trade-offs under the guidance of the CAP theorem. Mastering the core techniques covered here (consensus protocols such as Paxos and Raft, distributed ID generation, distributed locks, load balancing, and distributed transactions) is the foundation for building highly available distributed systems. In practice, keep the following in mind:

1. Network partitions are unavoidable; design for failure scenarios
2. Choose an appropriate consistency level, balancing performance against correctness
3. Put thorough monitoring and alerting in place
4. Design idempotent interfaces to handle duplicate requests
5. Prepare failure-recovery plans and rehearse them regularly
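For point 4, a common pattern is to key each request with a client-supplied unique ID and cache the first result, so retries become no-ops. A minimal sketch (class and method names are illustrative):

```java
import java.util.concurrent.ConcurrentHashMap;

// Idempotent request handling: the first call with a given request ID
// executes the business logic; retries with the same ID return the cached result.
public class IdempotentHandler {
    private final ConcurrentHashMap<String, String> results = new ConcurrentHashMap<>();

    String handle(String requestId, String payload) {
        // computeIfAbsent runs the business logic at most once per request ID
        return results.computeIfAbsent(requestId, id -> process(payload));
    }

    private String process(String payload) {
        return "processed:" + payload; // stand-in for real business logic
    }

    public static void main(String[] args) {
        IdempotentHandler h = new IdempotentHandler();
        String first = h.handle("req-1", "order-123");
        String retry = h.handle("req-1", "order-123"); // duplicate delivery
        System.out.println(first.equals(retry)); // true: the retry is a no-op
    }
}
```

In a real system the result cache would live in shared storage (for example Redis or a database table with a unique key) so that retries hitting a different instance are still deduplicated.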

Permalink: https://www.kkkliao.cn/?id=850 (authorization is required for reprinting)


Copyright notice: this article was published by 廖万里's blog; please credit the source when reprinting.

