-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathimpl-bloom-filters.html
More file actions
335 lines (297 loc) · 25.6 KB
/
impl-bloom-filters.html
File metadata and controls
335 lines (297 loc) · 25.6 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
<!-- 2022-W11-4 02:12 -->
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>Implementing Bloom Filters</title>
<meta name="author" content="Inanna" />
<meta name="description" content="A short introduction to bloom filters, how they work, and a short implementation thereof." />
<meta name="generator" content="Org Mode" />
<link href="/site.css" rel="stylesheet" type="text/css" /><link href="images/website-icon.png" rel="icon" />
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
displayAlign: "center",
displayIndent: "0em",
"HTML-CSS": { scale: 100,
linebreaks: { automatic: "false" },
webFont: "TeX"
},
SVG: {scale: 100,
linebreaks: { automatic: "false" },
font: "TeX"},
NativeMML: {scale: 100},
TeX: { equationNumbers: {autoNumber: "AMS"},
MultLineWidth: "85%",
TagSide: "right",
TagIndent: ".8em"
}
});
</script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS_HTML"></script>
</head>
<body>
<div id="preamble" class="status">
<div><a href="/index.html"><img alt="An abstract logo representing a series of three assembly line stamping machines with the words CONS, DEV, and embalzoned in white on each machine." id="site-logo" src="/images/website-logo.png" /></a></div>
</div>
<div id="content" class="content">
<h1 class="title">Implementing Bloom Filters</h1>
<div id="table-of-contents" role="doc-toc">
<h2>Table of Contents</h2>
<div id="text-table-of-contents" role="doc-toc">
<ul>
<li><a href="#org9b76571">Introduction</a></li>
<li><a href="#org126def0">Algorithm</a>
<ul>
<li><a href="#orga864098">Example</a></li>
</ul>
</li>
<li><a href="#org82e58b6">Implementation</a>
<ul>
<li><a href="#orgf3dd6e8">Hash Function</a>
<ul>
<li><a href="#org119c280">pads</a></li>
<li><a href="#orgcc8bfdb">hash-obj</a></li>
<li><a href="#org3bd3196">bloom-hash</a></li>
</ul>
</li>
<li><a href="#org23a913d">optimal-k</a></li>
<li><a href="#org68ff064">Bloom Filter Objects</a>
<ul>
<li><a href="#org08f7f71">The BasicBloomFilter Record</a></li>
<li><a href="#orgb5535f0">The CountingBloomFilter Record</a></li>
</ul>
</li>
<li><a href="#org049ed68">Tangled Code Block</a></li>
</ul>
</li>
<li><a href="#org311bbd0">Conclusion</a></li>
</ul>
</div>
</div>
<div id="outline-container-org9b76571" class="outline-2">
<h2 id="org9b76571">Introduction</h2>
<div class="outline-text-2" id="text-org9b76571">
<p>
This will be the first post in my "Implementing" series of tutorials, where I implement a common (or not so common) algorithm and describe how to use it. Bloom filters quickly give you an answer of if something is probably in a set, or definitely not in a set. This can be used to reduce search times for high volume but low access speed media.<label class="sidenote-number" for="1"><sup>1</sup></label><input checked="checked" id="1" style="display:none" type="checkbox" /><span class="sidenote"><span class="sidenote-number"> 1</span> A good example would be a collection of backup disk or (gasp) tape drives.</span>
</p>
</div>
</div>
<div id="outline-container-org126def0" class="outline-2">
<h2 id="org126def0">Algorithm</h2>
<div class="outline-text-2" id="text-org126def0">
<p>
Bloom filters are really simple if you know how hash functions work. You create a series of \(k\) hash functions who output numbers from \(0\) to \(m\), the size of the filter array. To insert items into the filter you hash the item with the functions and set the indices provided to true<label class="sidenote-number" for="2"><sup>2</sup></label><input checked="checked" id="2" style="display:none" type="checkbox" /><span class="sidenote"><span class="sidenote-number"> 2</span> Or 1, or incrementing a number.</span>. To check that an element is probably contained in the set or definitely not contained in the set you simply check that all the indices are set to a truth-y value.
</p>
</div>
<div id="outline-container-orga864098" class="outline-3">
<h3 id="orga864098">Example</h3>
<div class="outline-text-3" id="text-orga864098">
<p>
The string <code>"Bloom filters are awesome!"</code> is hashed in a simple bloom filter of \(m=16\), \(k=4\) and converted into a list of indicies who are set to the <code>true</code>, or in this case <code>1</code>.
</p>
<div id="orgdf7200f" class="figure">
<p><img src="./images/bloom-filter-add-1_6a931db0-db9f-4ea8-8f88-7fba73d52576.png" alt="The string 'Bloom filters are awesome!' being added to a bloom filter." />
</p>
</div>
<p>
Next we add the string <code>"Hello World."</code> to our bloom filter.
</p>
<div id="org9add828" class="figure">
<p><img src="./images/bloom-filter-add-2_6a931db0-db9f-4ea8-8f88-7fba73d52576.png" alt="The string 'Hello World.' being added to a blom filter." />
</p>
</div>
<p>
To determine if something is in the set we can check the indices output by our hash functions are set to 1 when we hash an object. If all of them are set to <code>1</code> it is <i>probably</i> within the bloom filter. If any one of them is <code>0</code> then it is <i>definitely not</i> in the bloom filter.
</p>
<div id="org9966879" class="figure">
<p><img src="./images/bloom-filter-add-3_6a931db0-db9f-4ea8-8f88-7fba73d52576.png" alt="CHecking if the number 27 is in our bloom filter, leading to a false-postiive." />
</p>
</div>
<p>
And now we can clearly see that <code>27</code> is in the bloom fitler… oh, wait…
</p>
</div>
</div>
</div>
<div id="outline-container-org82e58b6" class="outline-2">
<h2 id="org82e58b6">Implementation</h2>
<div class="outline-text-2" id="text-org82e58b6">
</div>
<div id="outline-container-orgf3dd6e8" class="outline-3">
<h3 id="orgf3dd6e8">Hash Function</h3>
<div class="outline-text-3" id="text-orgf3dd6e8">
<p>
Here we implement a simple folding hash function that takes arguments and then folds them over a list.
</p>
</div>
<div id="outline-container-org119c280" class="outline-4">
<h4 id="org119c280">pads</h4>
<div class="outline-text-4" id="text-org119c280">
<p>
The bloom filter requires a series of different hash functions. To generate them we simply use a list of large prime number <code>pads</code> that produce hash functions that yield wildly different outputs.
</p>
<div class="org-src-container">
<pre class="src src-clojure" id="org1cb786b"><span style="color: #494949;">(</span><span style="color: #E53935; font-style: italic;">def</span> ^<span style="color: #E53935;">:private</span> <span style="font-style: italic;">pads</span> <span style="color: #bbb;">[</span>3484115242153581271N
8024590393702697081N
4268910574050118391N
5802681426294560369N
5506370764815358409N
8992090933206599789N
4276741680929616149N
2280789073680025603N<span style="color: #bbb;">]</span><span style="color: #494949;">)</span>
</pre>
</div>
</div>
</div>
<div id="outline-container-orgcc8bfdb" class="outline-4">
<h4 id="orgcc8bfdb">hash-obj</h4>
<div class="outline-text-4" id="text-orgcc8bfdb">
<p>
Now here's the simple hash function that I was talking about earlier. As you can see it simply converts the object into a string, the string into a series of numbers, and then folds the string over itself with modular multiplication, basically ignoring overflows.
</p>
<div class="org-src-container">
<pre class="src src-clojure" id="org33fc8f1"><span style="color: #494949;">(</span><span style="color: #E53935; font-style: italic;">defn</span> <span style="font-weight: bold; font-style: italic;">hash-obj</span> <span style="color: #bbb;">[</span>obj pad<span style="color: #bbb;">]</span>
<span style="color: #bbb;">(</span><span style="color: #E53935; font-style: italic;">->></span> <span style="color: #494949;">(</span>str obj<span style="color: #494949;">)</span>
<span style="color: #494949;">(</span>map int<span style="color: #494949;">)</span>
<span style="color: #494949;">(</span>apply <span style="color: #bbb;">(</span><span style="color: #E53935; font-style: italic;">fn</span> <span style="color: #494949;">[</span>x & xs<span style="color: #494949;">]</span> <span style="color: #494949;">(</span>cons <span style="color: #bbb;">(</span>mod <span style="color: #494949;">(</span>*' x pad<span style="color: #494949;">)</span> <span style="color: #E53935;">Long</span>/MAX_VALUE<span style="color: #bbb;">)</span> xs<span style="color: #494949;">)</span><span style="color: #bbb;">)</span><span style="color: #494949;">)</span>
<span style="color: #494949;">(</span>reduce #<span style="color: #bbb;">(</span>mod <span style="color: #494949;">(</span>*' <span style="font-style: italic;">%1</span> <span style="font-style: italic;">%2</span> pad<span style="color: #494949;">)</span> <span style="color: #E53935;">Long</span>/MAX_VALUE<span style="color: #bbb;">)</span><span style="color: #494949;">)</span>
long<span style="color: #bbb;">)</span><span style="color: #494949;">)</span>
</pre>
</div>
</div>
</div>
<div id="outline-container-org3bd3196" class="outline-4">
<h4 id="org3bd3196">bloom-hash</h4>
<div class="outline-text-4" id="text-org3bd3196">
<p>
The <code>bloom-hash</code> function is trivial and <del>is left as an exercise to the reader</del> simply generates a hash function using the prime numbers and then hashes the object for each one. Following that it takes the modulus of \(m\) to produce the index. There are up to 7 pads that may be used for the number, meaning the maximum value of \(k\) is 7.
</p>
<div class="org-src-container">
<pre class="src src-clojure" id="org5cb3002"><span style="color: #494949;">(</span><span style="color: #E53935; font-style: italic;">defn</span> <span style="font-weight: bold; font-style: italic;">bloom-hash</span>
<span style="color: #bbb;">(</span><span style="color: #494949;">[</span>obj<span style="color: #494949;">]</span> <span style="color: #494949;">(</span>bloom-hash obj 1000<span style="color: #494949;">)</span><span style="color: #bbb;">)</span>
<span style="color: #bbb;">(</span><span style="color: #494949;">[</span>obj m<span style="color: #494949;">]</span> <span style="color: #494949;">(</span>bloom-hash obj m 4<span style="color: #494949;">)</span><span style="color: #bbb;">)</span>
<span style="color: #bbb;">(</span><span style="color: #494949;">[</span>obj m k<span style="color: #494949;">]</span> <span style="color: #494949;">(</span>map #<span style="color: #bbb;">(</span>mod <span style="color: #494949;">(</span>hash-obj obj <span style="font-style: italic;">%</span><span style="color: #494949;">)</span> m<span style="color: #bbb;">)</span> <span style="color: #bbb;">(</span>take k pads<span style="color: #bbb;">)</span><span style="color: #494949;">)</span><span style="color: #bbb;">)</span><span style="color: #494949;">)</span>
</pre>
</div>
</div>
</div>
</div>
<div id="outline-container-org23a913d" class="outline-3">
<h3 id="org23a913d">optimal-k</h3>
<div class="outline-text-3" id="text-org23a913d">
<p>
To find the optimal \(k\) we use the following equation
</p>
<p>
\(\displaystyle k = \frac{m}{n}ln2\)
</p>
<p>
Where \(k\) is the number of hash functions, \(m\) is the size of the filter, and \(n\) is the number items that are expected to be added to the filter. This must be computed before the filter is used, and therefore likely will be slightly inaccurate.<label class="sidenote-number" for="3"><sup>3</sup></label><input checked="checked" id="3" style="display:none" type="checkbox" /><span class="sidenote"><span class="sidenote-number"> 3</span> I think that you probably could use some sort of very simple adaptive control system to minimize the error over many uses. (yes, I am intentionally avoiding saying "AI".)</span> This may be a problem should the sets be an unexpected size.
</p>
<div class="org-src-container">
<pre class="src src-clojure" id="org303e147"><span style="color: #494949;">(</span><span style="color: #E53935; font-style: italic;">defn</span> <span style="font-weight: bold; font-style: italic;">optimal-k</span>
<span style="color: #bbb;">[</span>m n<span style="color: #bbb;">]</span>
<span style="color: #bbb;">(</span><span style="color: #E53935; font-style: italic;">-></span> <span style="color: #494949;">(</span>/ m n<span style="color: #494949;">)</span>
<span style="color: #494949;">(</span>* <span style="color: #bbb;">(</span><span style="color: #E53935;">Math</span>/log 2.0<span style="color: #bbb;">)</span><span style="color: #494949;">)</span>
long<span style="color: #bbb;">)</span><span style="color: #494949;">)</span>
</pre>
</div>
</div>
</div>
<div id="outline-container-org68ff064" class="outline-3">
<h3 id="org68ff064">Bloom Filter Objects</h3>
<div class="outline-text-3" id="text-org68ff064">
</div>
<div id="outline-container-org08f7f71" class="outline-4">
<h4 id="org08f7f71">The BasicBloomFilter Record</h4>
<div class="outline-text-4" id="text-org08f7f71">
<p>
And now the <code>BasicBloomFilter</code> type, our solution is fairly simple, we use a boolean array and check if the values in the array are true.
</p>
<div class="org-src-container">
<pre class="src src-clojure" id="orgd691ca6"><span style="color: #494949;">(</span><span style="color: #E53935; font-style: italic;">defrecord</span> <span style="color: #E53935;">BasicBloomFilter</span> <span style="color: #bbb;">[</span>m k filter<span style="color: #bbb;">]</span><span style="color: #494949;">)</span>
<span style="color: #494949;">(</span><span style="color: #E53935; font-style: italic;">defn</span> <span style="font-weight: bold; font-style: italic;">contains?</span> <span style="color: #bbb;">[</span>filter-obj obj<span style="color: #bbb;">]</span>
<span style="color: #bbb;">(</span><span style="color: #E53935; font-style: italic;">->></span> <span style="color: #494949;">(</span>bloom-hash obj m k<span style="color: #494949;">)</span>
<span style="color: #494949;">(</span>map #<span style="color: #bbb;">(</span>get-in filter-obj <span style="color: #494949;">[</span><span style="color: #E53935;">:filter</span> <span style="font-style: italic;">%</span><span style="color: #494949;">]</span><span style="color: #bbb;">)</span><span style="color: #494949;">)</span>
<span style="color: #494949;">(</span>reduce #<span style="color: #bbb;">(</span><span style="color: #E53935; font-style: italic;">and</span> <span style="font-style: italic;">%1</span> <span style="font-style: italic;">%2</span><span style="color: #bbb;">)</span><span style="color: #494949;">)</span><span style="color: #bbb;">)</span><span style="color: #494949;">)</span>
<span style="color: #494949;">(</span><span style="color: #E53935; font-style: italic;">defn</span> <span style="font-weight: bold; font-style: italic;">add</span> <span style="color: #bbb;">[</span><span style="color: #494949;">{</span><span style="color: #E53935;">:keys</span> <span style="color: #bbb;">[</span>m k<span style="color: #bbb;">]</span> <span style="color: #E53935;">:as</span> filter-obj<span style="color: #494949;">}</span> obj<span style="color: #bbb;">]</span>
<span style="color: #bbb;">(</span><span style="color: #E53935; font-style: italic;">->></span> <span style="color: #494949;">(</span>bloom-hash obj m k<span style="color: #494949;">)</span>
<span style="color: #494949;">(</span>reduce #<span style="color: #bbb;">(</span>assoc-in <span style="font-style: italic;">%1</span> <span style="color: #494949;">[</span><span style="color: #E53935;">:filter</span> <span style="font-style: italic;">%2</span><span style="color: #494949;">]</span> <span style="color: #E53935;">true</span><span style="color: #bbb;">)</span> filter-obj<span style="color: #494949;">)</span><span style="color: #bbb;">)</span><span style="color: #494949;">)</span>
<span style="color: #494949;">(</span><span style="color: #E53935; font-style: italic;">defn</span> <span style="font-weight: bold; font-style: italic;">new-bloom-filter</span> <span style="color: #bbb;">[</span>m n<span style="color: #bbb;">]</span>
<span style="color: #bbb;">(</span>->BasicBloomFilter m <span style="color: #494949;">(</span>optimal-k m n<span style="color: #494949;">)</span> <span style="color: #494949;">(</span>vec <span style="color: #bbb;">(</span>repeat m <span style="color: #E53935;">false</span><span style="color: #bbb;">)</span><span style="color: #494949;">)</span><span style="color: #bbb;">)</span><span style="color: #494949;">)</span>
</pre>
</div>
<p>
The drawback to this basic bloom filter is that you can't delete items from it. It's very space efficient, but to delete a single item you have to rebuild the entire bloom fliter, hashing all the objects you want to add. To solve this problem, at the cost of some space efficiency<label class="sidenote-number" for="4"><sup>4</sup></label><input checked="checked" id="4" style="display:none" type="checkbox" /><span class="sidenote"><span class="sidenote-number"> 4</span> Clojure booleans use a whole byte for each value, a better approach would be to use numbers and then extract the bits from there using some tricks with bit-shifting. But this is a toy example and bit-shifting would be both a longer post and less clear.</span>, we can move on to using integers or even longs.
</p>
</div>
</div>
<div id="outline-container-orgb5535f0" class="outline-4">
<h4 id="orgb5535f0">The CountingBloomFilter Record</h4>
<div class="outline-text-4" id="text-orgb5535f0">
<p>
The counting bloom filter extends the <code>BasicBloomFilter</code> with the <code>delete</code> operation by using numbers instead of boolean values. This is basically the same as storing
</p>
<div class="org-src-container">
<pre class="src src-clojure" id="org887f226">
<span style="color: #494949;">(</span><span style="color: #E53935; font-style: italic;">defrecord</span> <span style="color: #E53935;">CountingBloomFilter</span> <span style="color: #bbb;">[</span>m k filter<span style="color: #bbb;">]</span><span style="color: #494949;">)</span>
<span style="color: #494949;">(</span><span style="color: #E53935; font-style: italic;">defn</span> <span style="font-weight: bold; font-style: italic;">contains?</span> <span style="color: #bbb;">[</span><span style="color: #494949;">{</span><span style="color: #E53935;">:keys</span> <span style="color: #bbb;">[</span>m k<span style="color: #bbb;">]</span> <span style="color: #E53935;">:as</span> filter-obj<span style="color: #494949;">}</span> obj<span style="color: #bbb;">]</span>
<span style="color: #bbb;">(</span><span style="color: #E53935; font-style: italic;">->></span> <span style="color: #494949;">(</span>bloom-hash obj m k<span style="color: #494949;">)</span>
<span style="color: #494949;">(</span>map #<span style="color: #bbb;">(</span>get-in filter-obj <span style="color: #494949;">[</span><span style="color: #E53935;">:filter</span> <span style="font-style: italic;">%</span><span style="color: #494949;">]</span><span style="color: #bbb;">)</span><span style="color: #494949;">)</span>
<span style="color: #494949;">(</span>reduce #<span style="color: #bbb;">(</span><span style="color: #E53935; font-style: italic;">and</span> <span style="font-style: italic;">%1</span> <span style="font-style: italic;">%2</span><span style="color: #bbb;">)</span><span style="color: #494949;">)</span><span style="color: #bbb;">)</span><span style="color: #494949;">)</span>
<span style="color: #494949;">(</span><span style="color: #E53935; font-style: italic;">defn</span> <span style="font-weight: bold; font-style: italic;">add</span> <span style="color: #bbb;">[</span><span style="color: #494949;">{</span><span style="color: #E53935;">:keys</span> <span style="color: #bbb;">[</span>m k<span style="color: #bbb;">]</span> <span style="color: #E53935;">:as</span> filter-obj<span style="color: #494949;">}</span> obj<span style="color: #bbb;">]</span>
<span style="color: #bbb;">(</span><span style="color: #E53935; font-style: italic;">->></span> <span style="color: #494949;">(</span>bloom-hash obj m k<span style="color: #494949;">)</span>
<span style="color: #494949;">(</span>reduce #<span style="color: #bbb;">(</span>update-in <span style="font-style: italic;">%1</span> <span style="color: #494949;">[</span><span style="color: #E53935;">:filter</span> <span style="font-style: italic;">%2</span><span style="color: #494949;">]</span> inc<span style="color: #bbb;">)</span> filter-obj<span style="color: #494949;">)</span><span style="color: #bbb;">)</span><span style="color: #494949;">)</span>
<span style="color: #494949;">(</span><span style="color: #E53935; font-style: italic;">defn</span> <span style="font-weight: bold; font-style: italic;">del</span> <span style="color: #bbb;">[</span><span style="color: #494949;">{</span><span style="color: #E53935;">:keys</span> <span style="color: #bbb;">[</span>m k<span style="color: #bbb;">]</span> <span style="color: #E53935;">:as</span> filter-obj<span style="color: #494949;">}</span> obj<span style="color: #bbb;">]</span>
<span style="color: #bbb;">(</span><span style="color: #E53935; font-style: italic;">->></span> <span style="color: #494949;">(</span>bloom-hash obj m k<span style="color: #494949;">)</span>
<span style="color: #494949;">(</span>reduce #<span style="color: #bbb;">(</span>update-in <span style="font-style: italic;">%1</span> <span style="color: #494949;">[</span><span style="color: #E53935;">:filter</span> <span style="font-style: italic;">%2</span><span style="color: #494949;">]</span> dec<span style="color: #bbb;">)</span> filter-obj<span style="color: #494949;">)</span><span style="color: #bbb;">)</span><span style="color: #494949;">)</span>
<span style="color: #494949;">(</span><span style="color: #E53935; font-style: italic;">defn</span> <span style="font-weight: bold; font-style: italic;">new-counting-bloom-filter</span> <span style="color: #bbb;">[</span>m n<span style="color: #bbb;">]</span>
<span style="color: #bbb;">(</span>->CountingBloomFilter m <span style="color: #494949;">(</span>optimal-k m n<span style="color: #494949;">)</span> <span style="color: #494949;">(</span>vec <span style="color: #bbb;">(</span>repeat m 0<span style="color: #bbb;">)</span><span style="color: #494949;">)</span><span style="color: #bbb;">)</span><span style="color: #494949;">)</span>
</pre>
</div>
</div>
</div>
</div>
<div id="outline-container-org049ed68" class="outline-3">
<h3 id="org049ed68">Tangled Code Block</h3>
<div class="outline-text-3" id="text-org049ed68">
<p>
Here is where the code is tangled, producing a single file.
</p>
<div class="org-src-container">
<pre class="src src-clojure"><span style="color: #494949;">(</span><span style="color: #E53935; font-style: italic;">ns</span> <span style="color: #E53935;">bloom-filter</span>
<span style="color: #bbb;">(</span><span style="color: #E53935;">:require</span> <span style="color: #494949;">[</span>clojure.math.numeric-tower <span style="color: #E53935;">:refer</span> <span style="color: #bbb;">[</span>expt<span style="color: #bbb;">]</span><span style="color: #494949;">]</span><span style="color: #bbb;">)</span><span style="color: #494949;">)</span>
<<pads>>
<<hash-obj>>
<<bloom-hash>>
<<optimal-k>>
<<multi-methods>>
<<basic-bloom-filter-record>>
<<counting-bloom-filter-record>>
</pre>
</div>
</div>
</div>
</div>
<div id="outline-container-org311bbd0" class="outline-2">
<h2 id="org311bbd0">Conclusion</h2>
<div class="outline-text-2" id="text-org311bbd0">
<p>
Now you should have a rough idea on how a bloom filter works and how to implement one, as well as (presumably) some idea of how hash functions work.
</p>
</div>
</div>
<!-- Footnotes --><!--
<div class="footdef"><sup><a id="fn.1" class="footnum" href="#fnr.1" role="doc-backlink">1</a></sup> <div class="footpara" role="doc-footnote"><p class="footpara">A good example would be a collection of backup disk or (gasp) tape drives.</p></div></div>
<div class="footdef"><sup><a id="fn.2" class="footnum" href="#fnr.2" role="doc-backlink">2</a></sup> <div class="footpara" role="doc-footnote"><p class="footpara">Or 1, or incrementing a number.</p></div></div>
<div class="footdef"><sup><a id="fn.3" class="footnum" href="#fnr.3" role="doc-backlink">3</a></sup> <div class="footpara" role="doc-footnote"><p class="footpara">I think that you probably could use some sort of very simple adaptive control system to minimize the error over many uses. (yes, I am intentionally avoiding saying "AI".)</p></div></div>
<div class="footdef"><sup><a id="fn.4" class="footnum" href="#fnr.4" role="doc-backlink">4</a></sup> <div class="footpara" role="doc-footnote"><p class="footpara">Clojure booleans use a whole byte for each value, a better approach would be to use numbers and then extract the bits from there using some tricks with bit-shifting. But this is a toy example and bit-shifting would be both a longer post and less clear.</p></div></div>
--></div>
<div id="postamble" class="status">
<p class="date">Last Modified: 2021-W52-2 01:02</p><p class="creator">Generated Using: <a href="https://www.gnu.org/software/emacs/">Emacs</a> 27.2 (<a href="https://orgmode.org">Org</a> mode 9.4.6)</p><p class="license">Except where otherwise noted content on <a href="https://cons.dev">cons.dev</a> is licensed under a <a href="https://creativecommons.org/licenses/by-sa/4.0/" rel="license">Creative Commons Attribution-ShareAlike 4.0 International License</a>.</p>
</div>
</body>
</html>