Input: G = (V, E): social network; pop: the size of population;
g_max: the maximum number of iterations; cr: the probability of
crossover; mu: the probability of mutation; k: the size of the
seed set; λ: balance weight;
Output: A k-node set S with fair influence spread;
Step1: Community Detection.
1: C = {C₁, C₂, ..., Cₘ} ← obtain community structure by Leiden algorithm;
Step2: Population Initialization.
2: SN = {SN₁, SN₂, ..., SNₙ} ← calculate the scores of nodes
in G using PageRank;
3: SC = {SC₁, SC₂, ..., SCₘ} ← calculate the scores of communities in C;
4: P ← Initialization(G, pop, k, C, SN, SC);
Step3: Population Evolution.
5: Set iteration counter to zero: g = 0;
6: while g < g_max do
7: Sort all individuals of P in descending order using F;
8: P' ← Crossover(G, pop, k, C, SN, P, cr);
9: P'' ← Mutation(G, pop, k, C, SN, P', mu);
10: for i = 1 to pop do
11: Pᵢ ← Max(Pᵢ, P''ᵢ, F);
12: end for
13: g = g + 1;
14: end while
15: S ← arg max_{Pᵢ ∈ P}(F(Pᵢ));Input: G = (V, E): social network; pop: the size of population; k: the size of the seed set; C: community structure; SN: node scores; SC: community scores;
Output: An initialized population P;
- Initialize P with empty vectors;
- for i = 1 to pop do
- // Community-based solution
- Initialize community counter CC = {0, 0, ..., 0}_m;
- for j = 1 to k do
- Obtain the potential community C_t according to the probability calculated by SC scores;
- CC_t = CC_t + 1;
- end for
- for h = 1 to m do
- if CC_h ≠ 0 then
- Add the top CC_h nodes with the highest node scores in C_h to P_i;
- end if
- end for
- Append P_i to P;
- // Fairness-aware random solution
- Vectorize community selection: picks ← Random_Choice(C, size=k, p=Normalize(SC));
- counts ← Count_Occurrences(picks);
- for each (C_t, count) in counts do
- Select top count nodes by SN from C_t and add to P_i;
- end for
- Fill remaining slots with top-SN nodes globally;
- Append P_i to P;
- // SN-weighted random solution
- P_i ← Random_Choice(V, size=k, p=Normalize(SN));
- Append P_i to P;
- end for
Input: G, pop, k, C, SN, P, cr;
Output: Population P' after crossover;
- P' ← Copy(P);
- for i = 1 to ⌊pop/2⌋ do
- for j = 1 to k do
- if random() < cr then
- Swap P'i[j] ↔ P'{pop-i}[j];
- end if
- end for
- // Repair if needed: remove duplicates and fill with top-SN nodes
- combined ← Set(P'i) ∪ Set(P'{pop-i});
- sorted ← Sort_Descending(combined, by SN);
- P'_i ← sorted[0:k];
- end for
- return P';
Input: G, pop, k, C, SN, P', mu, groups;
Output: Population P'' after mutation;
- P'' ← Copy(P');
- for i = 1 to pop do
- if random() < mu then
- // Remove random node
- idx ← Random_Int(0, k-1);
- removed_node ← P''_i[idx];
- Remove P''_i[idx];
- // Calculate current group coverage
- for each group g in groups do
- coverage[g] ← Count_Nodes_In_Group(P''_i, g);
- end for
- // Compute fairness-weighted community scores
- for each community c in C do
- weight[c] ← 0;
- for each group g in groups do
- overlap ← |c ∩ g|;
- ideal_count ← max(1, ⌊k × |g| / |V|⌋);
- deficit ← max(0, ideal_count - coverage[g]);
- weight[c] ← weight[c] + deficit × overlap;
- end for
- weight[c] ← weight[c] + 1; // Avoid zero weight
- end for
- // Select community by fairness-weighted probability
- probs ← Normalize(weight);
- C_selected ← Random_Choice(C, p=probs);
- // Select node from community by SN-weighted probability
- candidates ← {v ∈ C_selected | v ∉ P''_i};
- if candidates ≠ ∅ then
- SN_probs ← Normalize([SN[v] for v in candidates]);
- new_node ← Random_Choice(candidates, p=SN_probs);
- Add new_node to P''_i;
- else
- Add removed_node to P''_i; // Fallback
- end if
- end if
- end for
- return P'';
Input: S: seed set; G: graph; groups: attribute groups; ideal_influences: ideal influence per group; λ: balance weight; p: propagation probability; mc: Monte Carlo simulations;
Output: Fitness score F(S);
- for each group g_i in groups do
- actual_influence[g_i] ← Run_IC(G, S, p, mc, target=nodes_in_g_i);
- fraction[g_i] ← actual_influence[g_i] / |g_i|;
- violation[g_i] ← max(0, (ideal_influences[g_i] - actual_influence[g_i]) / ideal_influences[g_i]);
- end for
- MF ← min(fraction);
- DCV ← mean(violation);
- F(S) ← λ × MF - (1-λ) × DCV;
- return F(S);
Input: G, S, p, mc, target_nodes;
Output: Average influence spread;
- live_graphs ← Get_Live_Edge_Samples(G, p, mc); // Cached globally
- total_spread ← 0;
- for each live_graph in live_graphs do
- active ← Set(S);
- queue ← List(S);
- while queue ≠ ∅ do
- u ← queue.pop_front();
- for each v in live_graph.neighbors(u) do
- if v ∉ active then
- active.add(v);
- queue.append(v);
- end if
- end for
- end while
- if target_nodes specified then
- spread ← |active ∩ target_nodes|;
- else
- spread ← |active|;
- end if
- total_spread ← total_spread + spread;
- end for
- return total_spread / mc;
- Khởi tạo quần thể: O(POP_SIZE × k × |V|)
- Mỗi thế hệ GA:
- Đánh giá: O(POP_SIZE × |groups| × mc × |V|) với caching → giảm đáng kể
- Lai ghép/đột biến: O(POP_SIZE × k)
- Tổng: O(MAX_GEN × POP_SIZE × |groups| × mc × |V|)
- Live-Edge Caching: Pre-sample mc đồ thị live-edge, tái sử dụng cho tất cả tính toán IC
- Influence Memoization: Cache kết quả IC theo (seed_set, group_id) trong mỗi thế hệ
- Vector hóa: Dùng NumPy cho chọn cộng đồng, sort nodes, giảm vòng lặp Python
- Greedy Ideal với Live-Edge: Dùng cùng live-edge samples cho greedy selection
- k = 40
- POP_SIZE = 10
- MAX_GEN = 150
- P_CROSSOVER = 0.6
- P_MUTATION = 0.1
- λ = 0.5
- p = 0.01
- mc = 1000 (evaluation), 30 (ideal influence)