Commit 764adeb

Added lazy operations (#27)
* First draft of lazy operations
* Sorting lazy vectors
* Update README.md
* Added lazy zip
* Update README.md
* added lazy_set
* added missing lazy_set overload
* lazy_set set algebra added
* added lazy map implementation
* Added contents table in README
* Update README.md
* fixed header ordering
* header cleanups
* Fixed PR comments
* Fixed lazy_set and lazy_map ownership
* fixed release builds asserts
* fixed set comparator and dead code
* Fixed map comparator and removed dead code
* lazy set algebra operations normalization
* assert for release builds
1 parent 086bca2 commit 764adeb

10 files changed

Lines changed: 2469 additions & 38 deletions

File tree

README.md

Lines changed: 255 additions & 0 deletions
@@ -10,6 +10,36 @@ The primary focus of this library is
* encapsulation of the iterator madness
* removal of manual for-loops

## Contents
* [Compilation (Cmake)](#compilation-cmake)
* [Dependencies](#dependencies)
* [Minimum C++ version](#minimum-c-version)
* [macOS (Xcode)](#macos-xcode)
* [macOS (Makefiles/clang)](#macos-makefilesclang)
* [macOS (Makefiles/g++)](#macos-makefilesg)
* [Linux (Makefiles)](#linux-makefiles)
* [Windows (Visual Studio)](#windows-visual-studio)
* [Functional vector usage (fcpp::vector)](#functional-vector-usage-fcppvector)
* [extract unique (distinct) elements in a set](#extract-unique-distinct-elements-in-a-set)
* [zip, map, filter, sort, reduce](#zip-map-filter-sort-reduce)
* [lazy operations](#lazy-operations)
* [index search](#index-search)
* [remove, insert](#remove-insert)
* [size, capacity, reserve, resize](#size-capacity-reserve-resize)
* [all_of, any_of, none_of](#all_of-any_of-none_of)
* [Parallel algorithms](#parallel-algorithms)
* [Functional set usage (fcpp::set)](#functional-set-usage-fcppset)
* [difference, union, intersection](#difference-union-intersection-works-with-fcppset-and-stdset)
* [zip, map, filter, reduce](#zip-map-filter-reduce)
* [lazy operations](#lazy-operations-1)
* [all_of, any_of, none_of](#all_of-any_of-none_of-1)
* [remove, insert, contains, size, clear](#remove-insert-contains-size-clear)
* [Functional map usage (fcpp::map)](#functional-map-usage-fcppmap)
* [map_to, filter, reduce, for_each](#map_to-filter-reduce-for_each)
* [lazy operations](#lazy-operations-2)
* [all_of, any_of, none_of](#all_of-any_of-none_of-2)
* [keys, values, remove, insert](#keys-values-remove-insert)

## Compilation (Cmake)
### Dependencies
* CMake >= 3.14
@@ -135,6 +165,93 @@ const auto total_age = employees_below_40.reduce(0, [](const int& partial_sum, c
    return partial_sum + p.age;
});
```

### lazy operations
Lazy vectors are useful when chaining multiple operations over a large vector. A regular `map().filter().reduce()` style chain creates intermediate vectors and iterates once per algorithm. Calling `.lazy()` stores the following operations and executes them only when a terminal operation is called, such as `get()` or `reduce()`. This can avoid unnecessary intermediate allocations and lets map/filter/reduce-style pipelines process elements in one pass. Sorting is an important exception: it cannot be streamed element by element, so lazy `sort`, `sort_ascending`, and `sort_descending` first collect the current lazy pipeline's values, sort that collected vector, and then continue feeding the rest of the lazy chain.

```c++
#include "vector.h" // instead of <vector>

const fcpp::vector<int> numbers({5, 1, 4, 2, 3});

const auto processed_numbers = numbers
    // start a lazy pipeline from this point on
    .lazy()

    // this predicate is not evaluated yet
    .filter([](const int& number) {
        return number > 2;
    })

    // sorting is also deferred, but it needs to materialize the filtered
    // values internally when the terminal operation is called
    .sort_ascending()

    // this transform is not evaluated yet
    .map<std::string>([](const int& number) {
        return std::to_string(number);
    })

    // terminal operation: all stored operations are executed here
    .get();

// processed_numbers -> fcpp::vector<std::string>({ "3", "4", "5" })
// numbers -> fcpp::vector<int>({ 5, 1, 4, 2, 3 })
```
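
To make the sorting exception concrete, here is a minimal standard-library sketch of the same filter, sort, map chain. It is not fcpp's implementation; `filter_sort_map` is a hypothetical helper that only illustrates why a sort acts as a barrier between two streamed stages.

```c++
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of fcpp): filter -> sort -> map on a std::vector.
std::vector<std::string> filter_sort_map(const std::vector<int>& numbers) {
    // Stage 1 (streamed): the filter runs while the values are being collected,
    // because the sort below needs every surviving value before it can start.
    std::vector<int> collected;
    for (const int number : numbers) {
        if (number > 2) {
            collected.push_back(number);
        }
    }

    // Barrier: sorting works on the whole collected buffer at once.
    std::sort(collected.begin(), collected.end());

    // Stage 2 (streamed): the map runs element by element over the sorted buffer.
    std::vector<std::string> result;
    result.reserve(collected.size());
    for (const int number : collected) {
        result.push_back(std::to_string(number));
    }
    return result;
}
```

Calling `filter_sort_map({5, 1, 4, 2, 3})` yields `"3"`, `"4"`, `"5"`, the same values as the lazy example above.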

Here is another lazy vector example, this time without sorting, so the whole chain runs in a single pass when the terminal operation is called.

```c++
const auto total = numbers
    // start a lazy pipeline from this point on
    .lazy()

    // this transform is not evaluated yet
    .map<int>([](const int& number) {
        return number * 3;
    })

    // this predicate is not evaluated yet
    .filter([](const int& number) {
        return number > 5;
    })

    // terminal operation: all stored operations are executed here
    .reduce(0, [](const int& partial_sum, const int& number) {
        return partial_sum + number;
    });

// total -> 42
```
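
The difference between an eager chain and a fused single pass can be sketched without fcpp. The two functions below are hypothetical illustrations written against `std::vector` only. They are not the library's code, and they hard-code the `* 3` and `> 5` steps from the example above.

```c++
#include <cassert>
#include <vector>

// Eager style: every step allocates and walks a full intermediate vector.
int eager_sum(const std::vector<int>& numbers) {
    std::vector<int> tripled;               // intermediate result of the map step
    tripled.reserve(numbers.size());
    for (const int number : numbers) {
        tripled.push_back(number * 3);
    }

    std::vector<int> kept;                  // intermediate result of the filter step
    for (const int number : tripled) {
        if (number > 5) {
            kept.push_back(number);
        }
    }

    int total = 0;                          // reduce step
    for (const int number : kept) {
        total += number;
    }
    return total;
}

// Fused style: map, filter and reduce are applied per element in one loop,
// so no intermediate vector is created.
int fused_sum(const std::vector<int>& numbers) {
    int total = 0;
    for (const int number : numbers) {
        const int tripled = number * 3;     // map
        if (tripled > 5) {                  // filter
            total += tripled;               // reduce
        }
    }
    return total;
}
```

Both functions return 42 for `{5, 1, 4, 2, 3}`, but only `eager_sum` allocates the two intermediate vectors.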

Lazy zip can combine a lazy vector with an `fcpp::vector`, a `std::vector`, or another `fcpp::lazy_vector`. It is deferred as well: only when a terminal operation is called does it check that both sides have equal sizes. When zipping with another lazy vector, the right-hand lazy vector is materialized internally at that point, so its values can be paired by index.

```c++
const fcpp::vector<int> ages({32, 45, 37});
const fcpp::vector<std::string> names({"Jake", "Anna", "Kate"});

const auto employees = ages
    // start a lazy pipeline from this point on
    .lazy()

    // zip is not evaluated yet
    .zip(names)

    // this transform is not evaluated yet
    .map<person>([](const std::pair<int, std::string>& pair) {
        return person(pair.first, pair.second);
    })

    // terminal operation: zip size validation and all stored operations run here
    .get();

// employees -> fcpp::vector<person>({
//     person(32, "Jake"),
//     person(45, "Anna"),
//     person(37, "Kate"),
// })
```
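
Deferred size validation can be pictured with a small stand-in type. This is only a sketch under assumptions: `deferred_zip` is a hypothetical class, it is limited to `int` and `std::string`, and it reports a size mismatch by throwing `std::invalid_argument`, which is a choice made for this sketch and not necessarily how fcpp reports it.

```c++
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of fcpp): remembers both inputs and does no
// work until get() is called.
class deferred_zip {
public:
    deferred_zip(std::vector<int> left, std::vector<std::string> right)
        : m_left(std::move(left)), m_right(std::move(right)) {}

    // Terminal step: the size check and the pairing both happen here.
    std::vector<std::pair<int, std::string>> get() const {
        if (m_left.size() != m_right.size()) {
            throw std::invalid_argument("zip requires equally sized inputs");
        }
        std::vector<std::pair<int, std::string>> pairs;
        pairs.reserve(m_left.size());
        for (std::size_t i = 0; i < m_left.size(); ++i) {
            pairs.emplace_back(m_left[i], m_right[i]);  // pair by index
        }
        return pairs;
    }

private:
    std::vector<int> m_left;
    std::vector<std::string> m_right;
};
```

Constructing a `deferred_zip` from inputs of different sizes succeeds in this sketch; the mismatch only surfaces when `get()` is called.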

### index search
```c++
#include "vector.h" // instead of <vector>
@@ -367,6 +484,93 @@ const auto total_age = employees_below_40.reduce(0, [](const int& partial_sum, c
});
```

### lazy operations
Lazy sets are useful when you chain operations over a large set and only need the final materialized set or a reduced value. A regular `map().filter().reduce()` chain creates intermediate sets and iterates once per algorithm. Calling `.lazy()` stores the following operations and executes them only when a terminal operation is called, such as `get()` or `reduce()`. This can avoid unnecessary intermediate allocations and lets map/filter/reduce-style pipelines process keys in one pass. Unlike vectors, sets are already ordered by their comparator, so lazy sets focus on the operations that make sense for set data: `map`, `filter`, `difference_with`, `union_with`, `intersect_with`, `zip`, and `reduce`.

```c++
#include "set.h" // instead of <set>

const fcpp::set<int> numbers({1, 2, 3, 4, 5});

const auto total = numbers
    // start a lazy pipeline from this point on
    .lazy()

    // this transform is not evaluated yet
    .map<int>([](const int& number) {
        return number * 3;
    })

    // this predicate is not evaluated yet
    .filter([](const int& number) {
        return number > 5;
    })

    // terminal operation: all stored operations are executed here
    .reduce(0, [](const int& partial_sum, const int& number) {
        return partial_sum + number;
    });

// total -> 42
```
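
The idea of storing operations and running them only at the terminal call can be sketched in a few lines. `lazy_int_set` below is a hypothetical, `int`-only stand-in built on `std::function`. It is not fcpp's `lazy_set`, and it mutates the pipeline object in place to keep the sketch short.

```c++
#include <cassert>
#include <functional>
#include <set>
#include <utility>
#include <vector>

// Hypothetical stand-in for a lazy set pipeline (not fcpp's implementation).
class lazy_int_set {
public:
    explicit lazy_int_set(std::set<int> source) : m_source(std::move(source)) {}

    // Store the transform; nothing is executed yet.
    lazy_int_set& map(std::function<int(int)> transform) {
        m_stages.push_back([transform](int& value) {
            value = transform(value);
            return true;  // a map stage never drops the element
        });
        return *this;
    }

    // Store the predicate; nothing is executed yet.
    lazy_int_set& filter(std::function<bool(int)> predicate) {
        m_stages.push_back([predicate](int& value) { return predicate(value); });
        return *this;
    }

    // Terminal operation: each key runs through all stored stages in one pass.
    std::set<int> get() const {
        std::set<int> result;
        for (int value : m_source) {
            bool keep = true;
            for (const auto& stage : m_stages) {
                if (!stage(value)) {
                    keep = false;
                    break;
                }
            }
            if (keep) {
                result.insert(value);
            }
        }
        return result;
    }

private:
    std::set<int> m_source;
    std::vector<std::function<bool(int&)>> m_stages;
};
```

If the transform passed to `map` increments a counter, the counter stays at 0 after `map(...).filter(...)` and only changes once `get()` runs.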

Lazy set algebra can combine a lazy set with an `fcpp::set`, a `std::set`, or another `fcpp::lazy_set`. The operation is still deferred, but set algebra needs set membership and sorted set semantics, so the current lazy pipeline is materialized internally when the terminal operation is called. When the right-hand side is also lazy, it is materialized internally at the same point.

```c++
const fcpp::set<int> colleague_ages({15, 18, 25, 41, 51});
const fcpp::set<int> friend_ages({41, 42, 51});
const fcpp::set<int> family_ages({51, 81});

const auto guests = colleague_ages
    // start a lazy pipeline from this point on
    .lazy()

    // this predicate is not evaluated yet
    .filter([](const int& age) {
        return age >= 18;
    })

    // set difference is not evaluated yet
    .difference_with(friend_ages)

    // set union is not evaluated yet
    .union_with(family_ages)

    // terminal operation: the lazy filter and set algebra run here
    .get();

// guests -> fcpp::set<int>({18, 25, 51, 81})
```
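
For comparison, the same guest list can be computed eagerly with the standard `<algorithm>` set operations. This sketch assumes the default `std::less<int>` ordering, and `guest_ages` is a hypothetical helper, not part of fcpp. It shows why both sides must be complete sorted ranges before set algebra can run.

```c++
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>

// Hypothetical helper (not part of fcpp): filter, then difference, then union.
std::set<int> guest_ages(const std::set<int>& colleagues,
                         const std::set<int>& friends,
                         const std::set<int>& family) {
    // The filter result is materialized first: std::set_difference needs a
    // complete, sorted range on both sides.
    std::set<int> adults;
    for (const int age : colleagues) {
        if (age >= 18) {
            adults.insert(age);
        }
    }

    // adults minus friends
    std::set<int> without_friends;
    std::set_difference(adults.begin(), adults.end(),
                        friends.begin(), friends.end(),
                        std::inserter(without_friends, without_friends.end()));

    // (adults minus friends) united with family
    std::set<int> guests;
    std::set_union(without_friends.begin(), without_friends.end(),
                   family.begin(), family.end(),
                   std::inserter(guests, guests.end()));
    return guests;
}
```

With the three sets from the example above, `guest_ages` returns `{18, 25, 51, 81}`.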

Lazy set zip can combine a lazy set with an `fcpp::set`, a `std::set`, an `fcpp::vector`, a `std::vector`, an `fcpp::lazy_vector`, or another `fcpp::lazy_set`. Size validation is deferred until a terminal operation is called. When zipping with a vector, duplicate vector values are removed before zipping, just like the eager set zip operation. When zipping with a lazy vector, the right-hand lazy vector is materialized internally at that point and then deduplicated. When zipping with another lazy set, the right-hand lazy set is materialized internally at that point, so its keys can be paired in set order.

```c++
const fcpp::set<int> ages({25, 45, 30, 63});
const fcpp::set<std::string> names({"Jake", "Bob", "Michael", "Philipp"});

const auto employees = ages
    // start a lazy pipeline from this point on
    .lazy()

    // zip is not evaluated yet
    .zip(names)

    // this transform is not evaluated yet
    .map<person>([](const std::pair<int, std::string>& pair) {
        return person(pair.first, pair.second);
    })

    // terminal operation: zip size validation and all stored operations run here
    .get();

// employees -> fcpp::set<person>({
//     person(25, "Bob"),
//     person(30, "Jake"),
//     person(45, "Michael"),
//     person(63, "Philipp"),
// })
```
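
The vector case, where duplicates are removed before pairing, can be sketched with standard containers. `zip_set_with_vector` is a hypothetical helper, not fcpp's code. It assumes that deduplication is done by building a `std::set` (which also sorts the values) and that a size mismatch is reported with an exception; the text above only states that duplicates are removed and that sizes are validated.

```c++
#include <cassert>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of fcpp): pairs the keys of a set with the
// distinct values of a vector.
std::set<std::pair<int, std::string>> zip_set_with_vector(
    const std::set<int>& keys, const std::vector<std::string>& values) {
    // Assumption of this sketch: duplicates are removed by building a std::set,
    // which also sorts the values.
    const std::set<std::string> distinct_values(values.begin(), values.end());

    // The size check happens only now, on the deduplicated values.
    if (keys.size() != distinct_values.size()) {
        throw std::invalid_argument("zip requires equally sized inputs");
    }

    // Both sets are walked in sorted order and paired position by position.
    std::set<std::pair<int, std::string>> pairs;
    auto value_it = distinct_values.begin();
    for (const int key : keys) {
        pairs.emplace(key, *value_it);
        ++value_it;
    }
    assert(value_it == distinct_values.end());
    return pairs;
}
```

With keys `{25, 45, 30, 63}` and values `{"Jake", "Bob", "Michael", "Philipp", "Bob"}`, the duplicate `"Bob"` is dropped and the result pairs 25 with `"Bob"`, 30 with `"Jake"`, 45 with `"Michael"`, and 63 with `"Philipp"`.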

### all_of, any_of, none_of
```c++
#include "set.h" // instead of <set>
@@ -466,6 +670,57 @@ adults.for_each([](const std::pair<const std::string, int>& element) {
});
```

### lazy operations
Lazy maps are useful when chaining `map_to`, `filter`, and `reduce` over a large map. A regular `filtered().map_to().reduce()` style chain creates intermediate maps and iterates once per algorithm. Calling `.lazy()` stores the following operations and executes them only when a terminal operation is called, such as `get()` or `reduce()`. This can avoid unnecessary intermediate allocations and lets map_to/filter/reduce-style pipelines process key/value pairs in one pass. When a lazy `map_to` creates equivalent output keys, the first key/value pair encountered in sorted map order is kept, following `std::map::insert` semantics.

```c++
#include "map.h" // instead of <map>

const fcpp::map<std::string, int> ages({
    {"jake", 32},
    {"mary", 16},
    {"david", 40}
});

const auto ages_by_initial = ages
    // start a lazy pipeline from this point on
    .lazy()

    // this predicate is not evaluated yet
    .filter([](const std::pair<const std::string, int>& element) {
        return element.second >= 18;
    })

    // this transform is not evaluated yet
    .map_to<char, std::string>([](const std::pair<const std::string, int>& element) {
        return std::make_pair(element.first[0], std::to_string(element.second) + " years");
    })

    // terminal operation: all stored operations are executed here
    .get();

// ages_by_initial -> fcpp::map<char, std::string>({{'d', "40 years"}, {'j', "32 years"}})
// ages -> fcpp::map<std::string, int>({{"david", 40}, {"jake", 32}, {"mary", 16}})
```

```c++
const auto total_age = ages
    // start a lazy pipeline from this point on
    .lazy()

    // this predicate is not evaluated yet
    .filter([](const std::pair<const std::string, int>& element) {
        return element.second >= 18;
    })

    // terminal operation: all stored operations are executed here
    .reduce(0, [](const int& partial_sum, const std::pair<const std::string, int>& element) {
        return partial_sum + element.second;
    });

// total_age -> 72
```
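
The first-wins rule for equivalent output keys mentioned at the start of this section follows directly from how `std::map::insert` behaves. The sketch below uses only `std::map`. `first_age_by_initial` is a hypothetical helper, not part of fcpp, and the input in the usage note is made up so that two names share an initial.

```c++
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical helper (not part of fcpp): re-keys a name -> age map by initial.
std::map<char, std::string> first_age_by_initial(const std::map<std::string, int>& ages) {
    std::map<char, std::string> result;
    // std::map iterates in sorted key order, so "jake" is visited before "john".
    for (const auto& element : ages) {
        // insert() leaves an existing entry untouched, so the first pair seen
        // for a given initial is the one that is kept.
        result.insert(std::make_pair(element.first[0],
                                     std::to_string(element.second) + " years"));
    }
    return result;
}
```

For the input `{{"jake", 32}, {"john", 50}, {"david", 40}}`, the result holds `'d' -> "40 years"` and `'j' -> "32 years"`; the later `"john"` entry does not overwrite `"jake"`.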

### all_of, any_of, none_of
```c++
#include "map.h" // instead of <map>

include/export_def.h

Lines changed: 0 additions & 2 deletions
@@ -31,5 +31,3 @@
#else
#define FunctionalCppExport __attribute__ ((__visibility__("default")))
#endif
-
-#include "compatibility.h"
