Perf: optimize Tablet write with columnar string storage and lazy DeviceID construction (~10x throughput)#748
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## develop #748 +/- ##
===========================================
+ Coverage 61.85% 61.87% +0.02%
===========================================
Files 704 704
Lines 41276 41347 +71
Branches 5929 5948 +19
===========================================
+ Hits 25531 25585 +54
- Misses 14905 14915 +10
- Partials 840 847 +7
LGTM
a79a4e7 to dbf86df
auto* sc = new StringColumn();
sc->init(max_row_num_, max_row_num_ * 32);
value_matrix_[c].string_col = sc;
for (auto col_idx : id_column_indexes_) {
    const StringColumn& sc = *value_matrix_[col_idx].string_col;
    const uint32_t* off = sc.offsets;
    const char* buf = sc.buffer;
    for (uint32_t i = 1; i < row_count; i++) {
        if (boundary[i >> 6] & (1ULL << (i & 63))) continue;
        uint32_t len_a = off[i] - off[i - 1];
        uint32_t len_b = off[i + 1] - off[i];
        if (len_a != len_b ||
            (len_a > 0 &&
             memcmp(buf + off[i - 1], buf + off[i], len_a) != 0)) {
            boundary[i >> 6] |= (1ULL << (i & 63));
        }
    }
}
Consider traversing the tag columns in reverse order, because we tend to organize tags from coarse (like country) to fine (like street).
Within the same TsFile or write batch, differences are more likely to show up in the fine-grained tags.
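A minimal sketch of what the reverse traversal could look like. `SketchColumn` and `find_boundaries_reversed` are hypothetical names, and the column type is a simplified stand-in for the PR's `StringColumn`, assuming row `i` occupies `buffer[offsets[i], offsets[i + 1])`:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Simplified stand-in for the PR's StringColumn (assumption: row i occupies
// buffer[offsets[i], offsets[i + 1])).
struct SketchColumn {
    std::vector<uint32_t> offsets{0};
    std::string buffer;
    void push(const std::string& v) {
        buffer += v;
        offsets.push_back(static_cast<uint32_t>(buffer.size()));
    }
    uint32_t rows() const { return static_cast<uint32_t>(offsets.size() - 1); }
};

// Sets bit i when row i differs from row i - 1 in any tag column.
// Columns are visited last-to-first so that fine-grained tags, which are
// expected to change most often, mark their rows first and the coarser
// columns can skip those rows without comparing bytes.
std::vector<uint64_t> find_boundaries_reversed(
    const std::vector<SketchColumn>& tag_cols, uint32_t row_count) {
    std::vector<uint64_t> boundary((row_count + 63) / 64, 0);
    for (auto it = tag_cols.rbegin(); it != tag_cols.rend(); ++it) {
        assert(it->rows() >= row_count);
        const uint32_t* off = it->offsets.data();
        const char* buf = it->buffer.data();
        for (uint32_t i = 1; i < row_count; i++) {
            if (boundary[i >> 6] & (1ULL << (i & 63))) continue;  // already marked
            uint32_t len_a = off[i] - off[i - 1];
            uint32_t len_b = off[i + 1] - off[i];
            if (len_a != len_b ||
                (len_a > 0 &&
                 std::memcmp(buf + off[i - 1], buf + off[i], len_a) != 0)) {
                boundary[i >> 6] |= (1ULL << (i & 63));
            }
        }
    }
    return boundary;
}
```

The resulting bitmap is the same in either order; only the number of `memcmp` calls changes, so this is worth measuring on a realistic tag layout before adopting.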
if (len_a != len_b ||
    (len_a > 0 &&
     memcmp(buf + off[i - 1], buf + off[i], len_a) != 0)) {
    boundary[i >> 6] |= (1ULL << (i & 63));
If the number of boundaries reaches the number of rows, the loop can break early, since no remaining column can add a new boundary.
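One way to express that early exit is to count marked rows and stop once every candidate row is a boundary. This is only a sketch: `ColumnView` and `mark_boundaries` are hypothetical names, and it assumes rows `1..row_count-1` are the only candidates, as in the loop under review:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical read-only view of one tag column: row i occupies
// buffer[offsets[i], offsets[i + 1]).
struct ColumnView {
    const uint32_t* offsets;
    const char* buffer;
};

// Fills `boundary` and returns how many rows were marked. Stops visiting
// columns once all rows 1..row_count-1 are already boundaries.
uint32_t mark_boundaries(const std::vector<ColumnView>& cols,
                         uint32_t row_count,
                         std::vector<uint64_t>& boundary) {
    boundary.assign((row_count + 63) / 64, 0);
    if (row_count < 2) return 0;
    const uint32_t max_boundaries = row_count - 1;
    uint32_t marked = 0;
    for (const ColumnView& col : cols) {
        assert(col.offsets != nullptr && col.buffer != nullptr);
        for (uint32_t i = 1; i < row_count; i++) {
            if (boundary[i >> 6] & (1ULL << (i & 63))) continue;
            uint32_t len_a = col.offsets[i] - col.offsets[i - 1];
            uint32_t len_b = col.offsets[i + 1] - col.offsets[i];
            if (len_a != len_b ||
                (len_a > 0 &&
                 std::memcmp(col.buffer + col.offsets[i - 1],
                             col.buffer + col.offsets[i], len_a) != 0)) {
                boundary[i >> 6] |= (1ULL << (i & 63));
                ++marked;
            }
        }
        // Every candidate row is marked: later columns cannot change anything.
        if (marked == max_boundaries) break;
    }
    return marked;
}
```

The check sits after each column rather than inside the row loop, which keeps the hot loop unchanged while still skipping all remaining columns.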
void append(uint32_t row, const char* data, uint32_t len) {
    // Grow buffer if needed
    if (buf_used + len > buf_capacity) {
        buf_capacity = buf_capacity * 2 + len;
        buffer = (char*)common::mem_realloc(buffer, buf_capacity);
    }
    memcpy(buffer + buf_used, data, len);
    offsets[row] = buf_used;
    offsets[row + 1] = buf_used + len;
    buf_used += len;
}
If data equals the value of the previous row, the new row could simply reuse the same offsets and avoid a memory copy.
However, if the comparison runs often but rarely avoids a copy, we should stop comparing.
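A rough sketch of how this might look, with some caveats. In the current layout `offsets[row]` is both the end of row `row - 1` and the start of row `row`, so two rows cannot share bytes as-is; the sketch therefore assumes a per-row start and length instead, which would also affect code that computes lengths as `off[i + 1] - off[i]`. `DedupStringColumn`, `kProbeWindow`, and `kMinHitPercent` are hypothetical names and the thresholds are illustrative, not tuned:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical variant of the column that lets consecutive equal values
// share bytes. Uses std::vector storage for brevity.
struct DedupStringColumn {
    std::vector<char> buffer;
    std::vector<uint32_t> start;   // start[row]: first byte of the row's value
    std::vector<uint32_t> length;  // length[row]: byte length of the row's value
    uint64_t compares = 0;         // memcmp attempts against the previous row
    uint64_t hits = 0;             // attempts that avoided a copy
    bool dedup_enabled = true;

    static constexpr uint64_t kProbeWindow = 1024;   // assumed tuning value
    static constexpr uint64_t kMinHitPercent = 25;   // assumed tuning value

    void append(const char* data, uint32_t len) {
        assert(data != nullptr || len == 0);
        if (dedup_enabled && !start.empty() && length.back() == len) {
            const uint32_t prev = start.back();
            ++compares;
            if (len == 0 || std::memcmp(buffer.data() + prev, data, len) == 0) {
                ++hits;
                start.push_back(prev);  // reuse the previous row's bytes
                length.push_back(len);
                return;
            }
            // Comparisons are not paying off: stop comparing from here on.
            if (compares >= kProbeWindow &&
                hits * 100 < compares * kMinHitPercent) {
                dedup_enabled = false;
            }
        }
        start.push_back(static_cast<uint32_t>(buffer.size()));
        length.push_back(len);
        buffer.insert(buffer.end(), data, data + len);
    }
};
```

Only same-length values reach `memcmp`, so the extra cost on a miss is bounded by one comparison of `len` bytes. The counters are never reset in this sketch, so it reacts slowly if the hit rate drops late in a batch.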
Before:
Write 200 x 50000 rows: 15039 ms
Throughput: 664938 rows/s
After:
Write 200 x 50000 rows: 1578 ms
Throughput: 6337140 rows/s
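As a sanity check on the two runs above, throughput is rows × 1000 / elapsed milliseconds, which puts the speedup at roughly 9.5x. The helper below is a hypothetical illustration and not part of the benchmark code:

```cpp
#include <cassert>
#include <cstdint>

// Rows per second from a row count and an elapsed time in milliseconds.
double rows_per_second(uint64_t rows, uint64_t elapsed_ms) {
    assert(elapsed_ms > 0);
    return static_cast<double>(rows) * 1000.0 /
           static_cast<double>(elapsed_ms);
}
```

With 200 x 50000 = 10,000,000 rows, `rows_per_second(10000000, 15039)` and `rows_per_second(10000000, 1578)` land within a few rows/s of the reported throughput figures.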