本文最后更新于 66 天前,其中的信息可能已经有所发展或是发生改变。
传统的雪花算法由41位时间戳、10位机器码以及12位序列号组成的,第一位为保留的符号位。
在分布式集群中,通常使用雪花 ID 作为主键,但是这里就会出现一些问题:
- 若 A表 与 B表 需要分表且 A 与 B 之间有一定的关联,例如 B表 同时储存 bid 与 aid,并且根据 bid 分片。而此时有两个需求,第一是根据 aid 获取对应的 bid 所在的分表,第二是根据 bid 获取对应的 aid 所在的分表,如果 aid 与 bid 之间没有任何联系,则两种需求都需要查询所有 A 或 B 的分表才能确定。
- 若在同一毫秒时间内产生的雪花 ID 超过最大限制,则需要向后借用一毫秒时间,以此类推,否则将抛出异常。
雪花算法实现
先来看看传统的雪花算法是如何实现的:
/**
* 常规雪花算法
* 01-42 位: 时间戳
* 43-47 位: 数据中心 ID
* 47-52 位: 工作 ID
* 52-64 位: 序列 ID
*/
public synchronized long nextId() {
long timestamp = timeGen();
if (timestamp < lastTimestamp) {
throw new RuntimeException(
String.format("Clock moved backwards. Refusing to generate id for %d milliseconds", lastTimestamp - timestamp));
}
if (lastTimestamp == timestamp) {
sequence = (sequence + 1) & sequenceMask;
if (sequence == 0) {
timestamp = tilNextMillis(lastTimestamp);
}
}
else {
sequence = 0L;
}
lastTimestamp = timestamp;
return ((timestamp - twepoch) << 22)
| (datacenterId << 17)
| (workerId << 12)
| sequence;
}
这里的时间戳左移22位是通过 (63 – 时间戳位数) 得到的,以此类推,在此基础上减去数据中心位数得到下一个左移量。
以下均使用 Kotlin 实现
现在有下面几个常量:
private val startPoint: Long
private val timestampLength: Int
private val dataCenterIdLength: Int
private val workerIdLength: Int
private val sequenceIdLength: Int
根据上面的常量可以计算出:
val shlBitsForTimestamp = 63 - timestampLength
val timestamp = (currentTimestamp - startPoint) shl shlBitsForTimestamp
val shlBitsForDataCenter = shlBitsForTimestamp - dataCenterIdLength
val datacenterId = dataCenterId shl shlBitsForDataCenter
val shlBitsForWorker = shlBitsForDataCenter - workerIdLength
val workerId = workerId shl shlBitsForWorker
然后得到最终的雪花 ID:
timestamp or datacenterId or workerId or sequence
引入基因片段
顾名思义,基因片段可以与另一个值联系,因为这个基因片段就来自于另一个值。
例如 191 % 16 = 15,如果 aid 为 191,那么对应的表应该是 a_15。
如果要通过 aid 191 得到对应的 bid 在哪一个分片,那么就可以通过相同的基因算法,得出对应 bid 的分片编号。
现在 bid 根据 aid 生成,也就是说,只要 bid 包含 aid 的基因片段,就能通过 bid 知道对应的 aid 在哪个分片了。
说到这里,基因片段实际上是某个二进制数的最后几位,例如 191 的二进制 1011 1111,取最后四位 1111 作为基因片段,如果把这个片段拼接到 bid 的末尾,那么也同样可以通过 bid 得到这个基因片段。
如果仅仅根据 aid 取模来决定 bid 的分片编号,那么 A 与 B 两张表的数量是相同的,引入基因片段之后,则 B表的数量取决于基因片段的长度。
接下来引入基因片段:
class SnowIdGenerator(
private val startPoint: Long,
private val timestampLength: Int,
private val dataCenterIdLength: Int,
private val workerIdLength: Int,
private val sequenceIdLength: Int,
private val geneIdLength: Int,
private val dataCenterId: Long,
private val workerId: Long,
private val actualGeneLength: Int
)
初始化
/**
* flag timestamp dataCenter worker sequence gene
* 0 10000000000000000000000000000000000000001 10001 10001 1000001 10001
* 1 --------------------41------------------- --5-- --5-- ---7--- --5--
* @author lovelycat
* @since 2024-08-24 19:40
* @version 1.0
*/
class SnowIdGenerator(
private val startPoint: Long,
private val timestampLength: Int,
private val dataCenterIdLength: Int,
private val workerIdLength: Int,
private val sequenceIdLength: Int,
private val geneIdLength: Int,
private val dataCenterId: Long,
private val workerId: Long,
private val actualGeneLength: Int
) {
private val maxSequence: Long
private val maxDataCenters: Long
private val maxWorkers: Long
init {
if (timestampLength + dataCenterIdLength + workerIdLength + sequenceIdLength + geneIdLength != 63) {
throw IllegalArgumentException("Id length must be 64 in total.")
}
maxSequence = (1 shl sequenceIdLength).toLong()
maxDataCenters = (1 shl dataCenterIdLength).toLong()
maxWorkers = (1 shl workerIdLength).toLong()
if (dataCenterId >= maxDataCenters) {
throw IllegalArgumentException("Data center id out of range, max $maxDataCenters but current $dataCenterId")
}
if (workerId >= maxWorkers) {
throw IllegalArgumentException("Worker id out of range, max $maxWorkers but current $workerId")
}
}
}
获取基因片段
/**
* Get gene piece from original sequence,
* 1011 1001 and 0000 1111 = 0000 1001,
* 1111 = (1 << 4) - 1
*
* @param origin
* @param geneLength
* @return
*/
fun getGene(origin: Long, geneLength: Int = actualGeneLength): Long {
return origin and ((1 shl geneLength) - 1).toLong()
}
雪花 ID 生成
private var lastTimestamp = 0L
private val sequenceMap = mutableMapOf<Long, Long>()
private var borrowedTimestamp = 0L
@Synchronized
fun nextId(gene: Long): Long {
var currentTimestamp = System.currentTimeMillis()
val shlBitsForTimestamp = 63 - timestampLength
if (currentTimestamp <= borrowedTimestamp) {
currentTimestamp = borrowedTimestamp
} else if (currentTimestamp < lastTimestamp) {
throw IllegalStateException("Clock moved backwards. Refusing to generate id for timestamp $currentTimestamp. Latest generated at $lastTimestamp")
}
if (currentTimestamp == lastTimestamp) {
with(gene) {
val original = (sequenceMap[this] ?: 0)
sequenceMap[this] = original + 1
if (original + 1 == maxSequence) {
borrowedTimestamp = currentTimestamp + 1
currentTimestamp = borrowedTimestamp
sequenceMap.clear()
}
}
} else {
sequenceMap.clear()
}
lastTimestamp = currentTimestamp
val timestamp = (currentTimestamp - startPoint) shl shlBitsForTimestamp
val shlBitsForDataCenter = shlBitsForTimestamp - dataCenterIdLength
val datacenterId = dataCenterId shl shlBitsForDataCenter
val shlBitsForWorker = shlBitsForDataCenter - workerIdLength
val workerId = workerId shl shlBitsForWorker
val realSequence = sequenceMap[gene] ?: 0
return if (geneIdLength == 0)
timestamp or datacenterId or workerId or realSequence
else {
val shlBitsForSequence = shlBitsForWorker - sequenceIdLength
val seq = realSequence shl shlBitsForSequence
timestamp or datacenterId or workerId or seq or gene
}
}
最终代码
/**
* flag timestamp dataCenter worker sequence gene
* 0 10000000000000000000000000000000000000001 10001 10001 1000001 10001
* 1 --------------------41------------------- --5-- --5-- ---7--- --5--
* @author lovelycat
* @since 2024-08-24 19:40
* @version 1.0
*/
class SnowIdGenerator(
private val startPoint: Long,
private val timestampLength: Int,
private val dataCenterIdLength: Int,
private val workerIdLength: Int,
private val sequenceIdLength: Int,
private val geneIdLength: Int,
private val dataCenterId: Long,
private val workerId: Long,
private val actualGeneLength: Int
) {
private val maxSequence: Long
private val maxDataCenters: Long
private val maxWorkers: Long
init {
if (timestampLength + dataCenterIdLength + workerIdLength + sequenceIdLength + geneIdLength != 63) {
throw IllegalArgumentException("Id length must be 64 in total.")
}
maxSequence = (1 shl sequenceIdLength).toLong()
maxDataCenters = (1 shl dataCenterIdLength).toLong()
maxWorkers = (1 shl workerIdLength).toLong()
if (dataCenterId >= maxDataCenters) {
throw IllegalArgumentException("Data center id out of range, max $maxDataCenters but current $dataCenterId")
}
if (workerId >= maxWorkers) {
throw IllegalArgumentException("Worker id out of range, max $maxWorkers but current $workerId")
}
}
private var lastTimestamp = 0L
private val sequenceMap = mutableMapOf<Long, Long>()
private var borrowedTimestamp = 0L
@Synchronized
fun nextId(gene: Long): Long {
var currentTimestamp = System.currentTimeMillis()
val shlBitsForTimestamp = 63 - timestampLength
if (currentTimestamp <= borrowedTimestamp) {
currentTimestamp = borrowedTimestamp
} else if (currentTimestamp < lastTimestamp) {
throw IllegalStateException("Clock moved backwards. Refusing to generate id for timestamp $currentTimestamp. Latest generated at $lastTimestamp")
}
if (currentTimestamp == lastTimestamp) {
with(gene) {
val original = (sequenceMap[this] ?: 0)
sequenceMap[this] = original + 1
if (original + 1 == maxSequence) {
borrowedTimestamp = currentTimestamp + 1
currentTimestamp = borrowedTimestamp
sequenceMap.clear()
}
}
} else {
sequenceMap.clear()
}
lastTimestamp = currentTimestamp
val timestamp = (currentTimestamp - startPoint) shl shlBitsForTimestamp
val shlBitsForDataCenter = shlBitsForTimestamp - dataCenterIdLength
val datacenterId = dataCenterId shl shlBitsForDataCenter
val shlBitsForWorker = shlBitsForDataCenter - workerIdLength
val workerId = workerId shl shlBitsForWorker
val realSequence = sequenceMap[gene] ?: 0
return if (geneIdLength == 0)
timestamp or datacenterId or workerId or realSequence
else {
val shlBitsForSequence = shlBitsForWorker - sequenceIdLength
val seq = realSequence shl shlBitsForSequence
timestamp or datacenterId or workerId or seq or gene
}
}
/**
* Get gene piece from original sequence,
* 1011 1001 and 0000 1111 = 0000 1001,
* 1111 = (1 << 4) - 1
*
* @param origin
* @param geneLength
* @return
*/
fun getGene(origin: Long, geneLength: Int = actualGeneLength): Long {
return origin and ((1 shl geneLength) - 1).toLong()
}
}
使用示例
无基因片段:
由于基因片段长度为0,则 nextId() 参数可以为任意值。
fun main() {
val generator = SnowIdGenerator(1724505364000, 41, 5, 5, 12, 0, 9, 17, 0)
val id = generator.nextId(0L)
generator.debug(id)
}
包含基因片段:
fun main() {
val generator = SnowIdGenerator(1724505364000, 41, 5, 5, 7, 5, 9, 17, 4)
val id = generator.nextId(generator.getGene(189L))
generator.debug(id)
}
调试:
通过此函数可以具体查看不同区段的二进制字符串。
fun debug(id: Long) {
val binString = id.toBinaryString().run {
"0".repeat(63 - this.length) + this
}
println("Binary:\n$binString")
println("Timestamp:\n" + binString.substring(0, timestampLength))
val a = timestampLength + dataCenterIdLength
println("DataCenter:\n" + binString.substring(timestampLength, a))
val b = a + workerIdLength
println("Worker:\n" + binString.substring(a, b))
val c = b + sequenceIdLength
println("Sequence:\n" + binString.substring(b, c))
val d = c + geneIdLength
println("Gene:\n" + binString.substring(c, d))
}
fun Long.toBinaryString() = java.lang.Long.toBinaryString(this)
结语
如果不需要自定义各区段的长度,可以自行简化为常量。如果错误欢迎在下方指出。