Commit d1de0a9

Author: Konstantinos Diamantis
Merge remote-tracking branch 'origin/lambdas'
2 parents 0100998 + aa30dec commit d1de0a9

4 files changed

Lines changed: 826 additions & 0 deletions

content/posts/loops-vs-lambdas.md

Lines changed: 290 additions & 0 deletions
@@ -0,0 +1,290 @@
+++
date = '2026-02-25T17:09:56+01:00'
draft = false
title = '0% Loops vs 100% Lambdas, TMP and Views: Maximal Inlining'
tags = ["advanced-level", "performance", "lambdas", "views", "parallelization"]
+++

In this article, we walk through a small real-world example of how to get rid of manual, imperative loops. Such loops are not descriptive, they are difficult to read and maintain, and they do not help the compiler inline the logic, which can cost performance. We will refactor one step at a time: first with lambdas, then by exposing the lambdas through a custom template function that handles the internal iteration (a pattern that mimics the logic of modern libraries and imitates the `views` implementation), and finally by compiling with `C++20` and using modern `views` directly.

Consider the example below. We have a `NetworkPacket` and a `NetworkBuffer` that stores a vector of packets. We would like to filter some of the packets based on, for instance, the encryption flag or the source IP, gather the filtered packets from the buffer, and maybe apply some logic to them.

You can find the full code in the [GitHub repo](https://github.com/konstantd/konstantd.github.io).

``` cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Priority is a type defined in the full code; LOW and HIGH are the values used here

struct NetworkPacket {

    // Source and Destination
    std::string m_sourceIp;
    std::string m_destinationIp;

    // Let's skip the payload and only store its size, for simplicity in the ctor
    size_t m_packetSize;

    // Encryption and Priority
    bool m_isEncrypted;
    Priority m_priority;

    // Strings are taken by value and moved into the members to avoid an extra copy
    NetworkPacket(std::string src, std::string dest,
                  int size, bool encrypted = false,
                  Priority priority = Priority::LOW)
        :
        m_sourceIp(std::move(src)), m_destinationIp(std::move(dest)),
        m_packetSize(size), m_isEncrypted(encrypted),
        m_priority(priority) {}

    // Move ctor default and noexcept
    NetworkPacket(NetworkPacket&& other) noexcept = default;

    // Declaring the move ctor above deletes the implicit copy ctor
    // We need it for the filtered vectors, so let's default it explicitly
    NetworkPacket(const NetworkPacket& other) = default;
};


struct NetworkBuffer {

    // Container for the Packets
    std::vector<NetworkPacket> m_packetBuffer;

    // Forward a packet to the container
    template <typename T>
    void addPacketForward(T&& packet) {
        m_packetBuffer.emplace_back(std::forward<T>(packet));
    }

};
```
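
The comment about the deleted copy constructor can be checked at compile time with type traits. The sketch below uses two hypothetical stand-in types (`MoveOnlyPacket` and `CopyablePacket` are not part of the article's code) and assumes C++17 for the `_v` trait variables.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical type that declares only a move constructor.
struct MoveOnlyPacket {
    std::string m_sourceIp;

    explicit MoveOnlyPacket(std::string src) : m_sourceIp(std::move(src)) {}
    MoveOnlyPacket(MoveOnlyPacket&&) noexcept = default;
};

// Same type, with the copy constructor explicitly defaulted again.
struct CopyablePacket {
    std::string m_sourceIp;

    explicit CopyablePacket(std::string src) : m_sourceIp(std::move(src)) {}
    CopyablePacket(CopyablePacket&&) noexcept = default;
    CopyablePacket(const CopyablePacket&) = default;
};

// Declaring the move constructor deletes the implicit copy constructor...
static_assert(!std::is_copy_constructible_v<MoveOnlyPacket>);
// ...and defaulting it explicitly brings it back.
static_assert(std::is_copy_constructible_v<CopyablePacket>);
static_assert(std::is_nothrow_move_constructible_v<CopyablePacket>);
// The implicit copy assignment operator stays deleted in both types.
static_assert(!std::is_copy_assignable_v<CopyablePacket>);
```

This matters for the filters later on: `push_back(packet)` with an lvalue copies the packet, so the element type has to be copy constructible.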

## Populate the Buffer

Given the above buffer of packets, I now populate it randomly, reserving space for 2^17 packets. The random generators are not of interest here, but you can find them in the full code. Just note that I keep the seed fixed, so the same random packets are generated every time we run it.

``` cpp
// We know the size, let's reserve it to avoid reallocations
const int N = 1 << 17;
buffer.m_packetBuffer.reserve(N);

// Create N random packets in the buffer
for (int i = 0; i < N; ++i) {
    // Create them as temporaries (rvalues)
    buffer.addPacketForward(NetworkPacket(getRandomSrc(),
                                          getRandomDst(),
                                          getRandomSize(),
                                          getRandomEncryptionBool(),
                                          getRandomPriority()
                                          ));
}
```
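
The article's generators live in the full code and are not shown here. As a rough illustration of what "fixed seed" means, here is a minimal sketch of a size generator. The names `SizeGenerator` and `generateSizes`, the seed value and the size range are all assumptions made for this example, not the author's implementation.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: draws packet sizes from one engine with a fixed seed.
class SizeGenerator {
public:
    explicit SizeGenerator(unsigned seed = 42) : m_engine(seed) {}

    // Returns a size in [64, 1500]; the range is an arbitrary choice for illustration.
    std::size_t next() { return m_dist(m_engine); }

private:
    std::mt19937 m_engine;
    std::uniform_int_distribution<std::size_t> m_dist{64, 1500};
};

// Generates `count` sizes from a fresh generator seeded with `seed`.
std::vector<std::size_t> generateSizes(std::size_t count, unsigned seed) {
    SizeGenerator gen(seed);
    std::vector<std::size_t> sizes;
    sizes.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        sizes.push_back(gen.next());
    }
    return sizes;
}
```

Two runs with the same seed produce the same sequence on a given standard library, which is what makes timings of the different filter versions comparable.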

## Code with Manual for Loops

And now this is our logic. As we said, we filter some packets from the buffer and gather them in a new vector. I have 3 filters here. We could also operate on the data, but you get the idea.

``` cpp
// 1. Filter packets by IP "10.0.0.5" source
std::vector<NetworkPacket> filteredPacketsfromSrc;
for (const auto& packet : buffer.m_packetBuffer) {
    if (packet.m_sourceIp == "10.0.0.5") {
        filteredPacketsfromSrc.push_back(packet);
    }
}

// 2. Filter packets that are encrypted with HIGH priority
std::vector<NetworkPacket> filteredHighPriorEncrypted;
for (const auto& packet : buffer.m_packetBuffer) {
    if ( (packet.m_isEncrypted) && (packet.m_priority == Priority::HIGH) ) {
        filteredHighPriorEncrypted.push_back(packet);
    }
}

// 3. Filter packets by IP "6.8.8.8" destination and size > 128 bytes
std::vector<NetworkPacket> filteredPacketsfromDst_128;
for (const auto& packet : buffer.m_packetBuffer) {
    if ( (packet.m_destinationIp == "6.8.8.8") && (packet.m_packetSize > 128) ) {
        filteredPacketsfromDst_128.push_back(packet);
    }
}
```

The loops do not really show intent here. Imagine we had some extra hard-coded filtering, or some transformations whose logic is hard to understand.

## 1st Improvement - `for_each` is slightly better

As a 1st step, we can replace every for loop with `std::for_each` and a lambda. Each lambda has its own closure type, so the call is resolved at compile time and the compiler can inline its body into the algorithm.

``` cpp
#include <algorithm> // std::for_each

// 1. Filter packets by IP "10.0.0.5" source
std::vector<NetworkPacket> filteredPacketsfromSrc;
std::for_each(buffer.m_packetBuffer.begin(), buffer.m_packetBuffer.end(), [&](const auto& packet) {
    if (packet.m_sourceIp == "10.0.0.5") {
        filteredPacketsfromSrc.push_back(packet);
    }
});

// 2. Filter packets that are encrypted with HIGH priority
std::vector<NetworkPacket> filteredHighPriorEncrypted;
std::for_each(buffer.m_packetBuffer.begin(), buffer.m_packetBuffer.end(), [&](const auto& packet) {
    if (packet.m_isEncrypted && packet.m_priority == Priority::HIGH) {
        filteredHighPriorEncrypted.push_back(packet);
    }
});

// 3. Filter packets by IP "6.8.8.8" destination and size > 128 bytes
std::vector<NetworkPacket> filteredPacketsfromDst_128;
std::for_each(buffer.m_packetBuffer.begin(), buffer.m_packetBuffer.end(), [&](const auto& packet) {
    if (packet.m_destinationIp == "6.8.8.8" && packet.m_packetSize > 128) {
        filteredPacketsfromDst_128.push_back(packet);
    }
});
```
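
Since each of these filters only copies matching elements, the standard library also has an algorithm that names this intent directly: `std::copy_if`. The article does not use it, so take the following as an optional variation. `MiniPacket` and `filterBySource` are hypothetical names, and the packet type is reduced to two fields to keep the sketch self-contained.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Simplified stand-in for NetworkPacket (hypothetical, for illustration only).
struct MiniPacket {
    std::string m_sourceIp;
    std::size_t m_packetSize;
};

// Copies every packet whose source IP matches into a new vector.
std::vector<MiniPacket> filterBySource(const std::vector<MiniPacket>& packets,
                                       const std::string& sourceIp) {
    std::vector<MiniPacket> filtered;
    std::copy_if(packets.begin(), packets.end(), std::back_inserter(filtered),
                 [&sourceIp](const MiniPacket& p) { return p.m_sourceIp == sourceIp; });
    return filtered;
}
```

Here the lambda holds only the predicate, while `std::copy_if` and `std::back_inserter` take over the `if` and the `push_back`.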

### Lambdas & Parallelization

Lambdas make parallelization readily available. By switching from a loop to a lambda-based algorithm, you gain the ability to parallelize very easily, just by passing `std::execution::par` as the first argument.

However, the lambda here calls `push_back` on a shared vector, which is not thread-safe. We have to lock manually, so that two threads never modify the vector at the same time. Note that the order of the gathered packets is then no longer guaranteed.

As an example, the 1st filter above would be parallelized as follows:

``` cpp
#include <algorithm>
#include <execution>
#include <mutex>


// A mutex to protect the shared result vector
std::mutex mtx;

std::vector<NetworkPacket> filteredPacketsfromSrc;
std::for_each(std::execution::par, buffer.m_packetBuffer.begin(), buffer.m_packetBuffer.end(), [&](const auto& packet) {
    if (packet.m_sourceIp == "10.0.0.5") {
        std::lock_guard<std::mutex> lock(mtx); // Lock here, unlock is provided by RAII
        filteredPacketsfromSrc.push_back(packet);
    }
});
```

## 2nd Improvement - Advanced Predicate and Action template function

Next, we can identify the pattern and implement a template function that accepts lambdas to filter and act on the buffer. This way, we decouple the traversal mechanics (the How) from the business logic (the What). This **internal iteration** pattern allows the compiler to inline the lambdas directly into the loop while significantly improving code reuse. This idea is preferred in modern C++ libraries as well, since it is great for encapsulation. Even if we change the underlying container that we iterate over, the calling code keeps working and we do not have to touch it in many places.

We add a member function template to the `NetworkBuffer` struct that accepts a Predicate and an Action.

``` cpp
template <typename Predicate, typename Action>
inline void filter_and_execute(Predicate&& filter, Action&& work) {
    // Because this is a template, 'filter' and 'work' are NOT function pointers.
    // They are unique types, allowing the compiler to 'paste' their logic here.
    for (const NetworkPacket& packet: m_packetBuffer) {
        if (filter(packet)) {
            work(packet);
        }
    }
}
```
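
The comment about "unique types" can be made concrete with a small sketch. It contrasts a template parameter with a plain function pointer and shows that two lambdas with the same body still have different closure types. The names `countIf`, `countIfPtr` and `closureTypesDiffer` are hypothetical, and whether a compiler actually inlines either version depends on the compiler and the optimization level.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Template version: Predicate is deduced as the closure type itself,
// so the call inside the loop targets a function known at compile time.
template <typename Predicate>
std::size_t countIf(const std::vector<int>& values, Predicate&& filter) {
    std::size_t count = 0;
    for (int v : values) {
        if (filter(v)) ++count;
    }
    return count;
}

// Function-pointer version: the callee is only reachable through the pointer,
// so inlining depends on the compiler seeing through the indirection.
std::size_t countIfPtr(const std::vector<int>& values, bool (*filter)(int)) {
    std::size_t count = 0;
    for (int v : values) {
        if (filter(v)) ++count;
    }
    return count;
}

// Two lambdas with identical bodies still have two distinct closure types.
bool closureTypesDiffer() {
    auto first  = [](int v) { return v > 0; };
    auto second = [](int v) { return v > 0; };
    return !std::is_same_v<decltype(first), decltype(second)> && first(1) == second(1);
}
```

Both versions return the same result. The difference is only in how much the compiler knows about the callee at the call site.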

And now we use it like this:

```cpp
// 1. Filter packets by IP "10.0.0.5" source
std::vector<NetworkPacket> filteredPacketsfromSrc;
buffer.filter_and_execute(
    [](const NetworkPacket& packet) {
        return packet.m_sourceIp == "10.0.0.5";
    },
    [&](const NetworkPacket& packet) {
        filteredPacketsfromSrc.push_back(packet);
    }
);

// 2. Filter packets that are encrypted with HIGH priority
std::vector<NetworkPacket> filteredHighPriorEncrypted;
buffer.filter_and_execute(
    [](const auto& p) { return p.m_isEncrypted && p.m_priority == Priority::HIGH; },
    [&](const auto& p) { filteredHighPriorEncrypted.push_back(p); }
);

// 3. Filter packets by IP "6.8.8.8" destination and size > 128 bytes
std::vector<NetworkPacket> filteredPacketsfromDst_128;
buffer.filter_and_execute(
    [](const auto& p) { return p.m_destinationIp == "6.8.8.8" && p.m_packetSize > 128; },
    [&](const auto& p) { filteredPacketsfromDst_128.push_back(p); }
);
```
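
To illustrate the encapsulation argument, here is a minimal self-contained sketch in which the container is a `std::deque` and the Action sums sizes instead of copying packets. `SizedPacket`, `DequeBuffer` and `totalBytesTo` are hypothetical names introduced for this example. Only `filter_and_execute` follows the shape shown in the article.

```cpp
#include <cstddef>
#include <deque>
#include <string>

// Simplified stand-in for NetworkPacket (hypothetical, for illustration only).
struct SizedPacket {
    std::string m_destinationIp;
    std::size_t m_packetSize;
};

// Hypothetical buffer that stores its packets in a std::deque instead of a std::vector.
struct DequeBuffer {
    std::deque<SizedPacket> m_packetBuffer;

    // Same internal-iteration shape as in the article: the buffer owns the loop.
    template <typename Predicate, typename Action>
    void filter_and_execute(Predicate&& filter, Action&& work) const {
        for (const SizedPacket& packet : m_packetBuffer) {
            if (filter(packet)) {
                work(packet);
            }
        }
    }
};

// Sums the sizes of all packets sent to one destination without copying any packet.
std::size_t totalBytesTo(const DequeBuffer& buffer, const std::string& destinationIp) {
    std::size_t total = 0;
    buffer.filter_and_execute(
        [&](const SizedPacket& p) { return p.m_destinationIp == destinationIp; },
        [&](const SizedPacket& p) { total += p.m_packetSize; });
    return total;
}
```

The calling code in `totalBytesTo` never mentions the container type, so switching between a vector and a deque only touches the buffer struct.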

## 3rd Improvement - Views are even more readable and give highest performance

Finally, we can achieve the same with `views` from `C++20`.

```cpp
#include <ranges>

// NetworkBuffer as shown above has no begin()/end(), so we pipe its underlying vector

// 1. Filter packets by IP "10.0.0.5" source
auto filter1 = buffer.m_packetBuffer
    | std::views::filter([](const auto& p) { return p.m_sourceIp == "10.0.0.5"; });

// 2. Filter packets that are encrypted with HIGH priority
auto filter2 = buffer.m_packetBuffer
    | std::views::filter([](const auto& p) { return p.m_isEncrypted && p.m_priority == Priority::HIGH; });

// 3. Filter packets by IP "6.8.8.8" destination and size > 128 bytes
auto filter3 = buffer.m_packetBuffer
    | std::views::filter([](const auto& p) { return p.m_destinationIp == "6.8.8.8"; })
    | std::views::filter([](const auto& p) { return p.m_packetSize > 128; });
```

The advantages now:

- Obviously way more readable

- Previously we were manually doing a `push_back`, which copies each packet and can trigger memory allocations as the result vector grows (we reserved memory only for the source buffer). Views do not create a new vector: `std::views::filter` is lazy, meaning that it doesn't move or copy anything. This saves us from allocating 3 separate temporary vectors.

- Also, with our `filter_and_execute` template, we have to run a new loop for every filter. With `views`, we can chain the filters, and each element passes through the whole chain in a single pass over the data, which is much better for the CPU and cache.


## Concluding

To summarize the evolution from manual loops to modern C++ abstractions:

1. for loops with `if-else` logic are harder for the compiler to optimize
2. `lambdas` are more readable and easier for the compiler to inline
3. `lambdas` make parallelization readily available with `std::execution::par` - way easier than with a manual loop
4. A custom template is often used in modern libraries and is great for splitting the traversal (the **How**) from the applied logic (the **What**), completely independent of the underlying container. It handles the **internal iteration** and is powerful for encapsulation. The forwarded lambdas like `Predicate` and `Action` simply define the criteria and the What.
5. `views` are even more readable and perform better, for the reasons in the next three points
6. `views` are lazy, as developers call them, since they do **external iteration** - they do not create temporary vectors for copies
7. `views` are perfect for piping many filters at once. With a loop, we would have to iterate over the packets again to apply each new filter, adding significant overhead.
8. `views` give maximum inlining and performance
9. `views` do not offer parallelization as easily as lambdas