What happened?
The idx_workflow_pgroonga full-text index on texera_db.workflow makes workflow uploads hang indefinitely once a workflow's content field is non-trivial in size. The current definition:
CREATE INDEX idx_workflow_pgroonga ON texera_db.workflow USING pgroonga (
((((((COALESCE(name, '')::text || ' ') ||
COALESCE(description, '')) || ' ') ||
COALESCE(content, ''))))
) WITH (tokenizer='TokenMecab');
Two compounding problems make this catastrophic for everyday usage:
content is in the indexed expression. workflow.content holds the serialized workflow graph; for any realistically sized workflow it is many KB to several MB of JSON-encoded data (operator properties, schemas, code, embedded resources). Every INSERT INTO workflow therefore runs the tokenizer over the entire blob.
- Tokenizer is
TokenMecab — a Japanese morphological tokenizer (it depends on local config of user). Running MeCab over JSON / serialized data is dramatically slower than the natural-language input it was designed for, and produces useless index entries (no human will full-text-search the inside of a workflow's JSON content as Japanese prose).
Once a single insert is in flight inside pgroonga's C code, pg_cancel_backend and pg_terminate_backend return t but the backend ignores them — pgroonga's tokenization loop doesn't call CHECK_FOR_INTERRUPTS(). The session keeps its RowExclusiveLock on the table; every subsequent insert / update / delete queues behind it and hangs forever. Restarting the Texera JVMs doesn't help — the stuck Postgres backend survives. Recovery requires bouncing the Postgres service itself (Stop-Service postgresql / taskkill /F of the postmaster).
Visible symptom: workflow upload silently hangs in the dashboard with no error toast; the dev-server proxy eventually emits ECONNRESET → ECONNREFUSED. Workflow create / delete is broken across JVM restarts until Postgres is bounced.
Suggested fixes
In order of preference:
- Drop
content from the indexed expression. Users search workflows by name and description, not by the internal JSON of the graph. The index becomes:
USING pgroonga ((COALESCE(name, '') || ' ' || COALESCE(description, '')))
That cuts the indexed text from MB to bytes per row and removes the worst-case tokenization load.
- Replace
TokenMecab with TokenBigramSplitSymbolAlphaDigit (pgroonga's default for mixed-language / non-natural-language input). TokenMecab is appropriate only when the indexed text is Japanese prose.
- Make it a partial index excluding rows where
octet_length(content) exceeds some threshold, as a defensive backstop.
- Independently, add
lock_timeout=5s, statement_timeout=30s to the JDBC connection string used by WorkflowResource so a hung insert returns a 5xx in seconds rather than pinning a Postgres backend forever.
How to reproduce?
- With the
pgroonga extension and the idx_workflow_pgroonga index in place (current schema), upload any workflow whose content is ≥ ~1 MB (most non-trivial workflows qualify). Also include any binary payload.
pg_stat_activity shows the corresponding INSERT INTO workflow at state='active', wait_event=NULL, age climbing into minutes, holding RowExclusiveLock on the table.
- Try to upload anything else — the second INSERT queues behind it forever.
pg_terminate_backend(<stuck_pid>) returns t but the backend keeps running.
Branch
main
Commit Hash (Optional)
No response
What browsers are you seeing the problem on?
Not browser-specific; reproduces from any client.
Relevant log output
-- pg_stat_activity while the upload hangs
pid | state | wait_event_type | wait_event | age_s | q
-------+--------+-----------------+------------+-------+-----------------------------------------------------------
41144 | active | | | 558 | insert into "texera_db"."workflow" ("name", "description", "content") values ($1, $2, $3) returning ...
-- The granted-but-stuck lock that pins the table
locktype | mode | granted | pid | state | age_s
----------+------------------+---------+-------+--------+-------
relation | RowExclusiveLock | t | 41144 | active | 63
-- pg_cancel_backend / pg_terminate_backend report success; the session ignores them
SELECT pg_terminate_backend(41144); -- returns 't', PID 41144 still active 5 minutes later
-- JVM side: RUNNABLE inside socketRead0, waiting for a Postgres response that never comes
"dw-50 - POST /api/workflow/create" #50 elapsed=127s runnable
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2174)
at org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:137)
at org.jooq.impl.DAOImpl.insert(DAOImpl.java:156)
at WorkflowResource.insertWorkflow(WorkflowResource.scala:89)
at WorkflowResource.createWorkflow(WorkflowResource.scala:573)
What happened?
The
idx_workflow_pgroongafull-text index ontexera_db.workflowmakes workflow uploads hang indefinitely once a workflow'scontentfield is non-trivial in size. The current definition:Two compounding problems make this catastrophic for everyday usage:
contentis in the indexed expression.workflow.contentholds the serialized workflow graph; for any realistically sized workflow it is many KB to several MB of JSON-encoded data (operator properties, schemas, code, embedded resources). EveryINSERT INTO workflowtherefore runs the tokenizer over the entire blob.TokenMecab— a Japanese morphological tokenizer (it depends on local config of user). Running MeCab over JSON / serialized data is dramatically slower than the natural-language input it was designed for, and produces useless index entries (no human will full-text-search the inside of a workflow's JSON content as Japanese prose).Once a single insert is in flight inside pgroonga's C code,
pg_cancel_backendandpg_terminate_backendreturntbut the backend ignores them — pgroonga's tokenization loop doesn't callCHECK_FOR_INTERRUPTS(). The session keeps itsRowExclusiveLockon the table; every subsequent insert / update / delete queues behind it and hangs forever. Restarting the Texera JVMs doesn't help — the stuck Postgres backend survives. Recovery requires bouncing the Postgres service itself (Stop-Service postgresql/taskkill /Fof the postmaster).Visible symptom: workflow upload silently hangs in the dashboard with no error toast; the dev-server proxy eventually emits
ECONNRESET → ECONNREFUSED. Workflow create / delete is broken across JVM restarts until Postgres is bounced.Suggested fixes
In order of preference:
contentfrom the indexed expression. Users search workflows by name and description, not by the internal JSON of the graph. The index becomes:TokenMecabwithTokenBigramSplitSymbolAlphaDigit(pgroonga's default for mixed-language / non-natural-language input).TokenMecabis appropriate only when the indexed text is Japanese prose.octet_length(content)exceeds some threshold, as a defensive backstop.lock_timeout=5s, statement_timeout=30sto the JDBC connection string used byWorkflowResourceso a hung insert returns a 5xx in seconds rather than pinning a Postgres backend forever.How to reproduce?
pgroongaextension and theidx_workflow_pgroongaindex in place (current schema), upload any workflow whosecontentis ≥ ~1 MB (most non-trivial workflows qualify). Also include any binary payload.pg_stat_activityshows the correspondingINSERT INTO workflowatstate='active',wait_event=NULL, age climbing into minutes, holdingRowExclusiveLockon the table.pg_terminate_backend(<stuck_pid>)returnstbut the backend keeps running.Branch
main
Commit Hash (Optional)
No response
What browsers are you seeing the problem on?
Not browser-specific; reproduces from any client.
Relevant log output