How can I prevent Pentaho DI/Spoon from deadlocking?
1
vote
1
answer
3420
views
I have a transformation with four steps:
1. Read a pipe-delimited text (a text file input step)
2. Write to a staging table, with truncation (import table)
3. Run a stored procedure against the table populated in #2
4. Run a stored procedure against the table modified in #3
These have hops between them: 1 to 2 to 3 to 4.
It consistently hangs (tested several times, waiting up to 1.6 hours); that is, Pentaho reports starting the job in the Logging window, then nothing more. The GUI itself does not hang, in fact the Stop button disables after a moment, but the transformation does not proceed.
It appears to be deadlocking: checking for blocked sessions on the target database (PostgreSQL), I see two sessions, one for the stored procedure in step #3 (which shouldn't have started yet), which is blocked by a session which is trying to truncate the table in step #2. The
TRUNCATE
never completes (though Postgres does not report it as blocked), so no records get written and step #3 cannot run. If I kill either session, the transformation fails, with no records written; the sprocs in steps #3 or #4 sometimes run, but with no new data on which to operate.
If I disable the hop from 2 to 3, the job runs fine, and it takes just a couple of minutes.
Why is the sproc in step #3 trying to run when it has a dependency on step #2? Does a Pentaho "hop" not mean "run A, wait for it to end, and then run B"? What can I do to prevent the deadlock situation, or have I misread what's going on?
Asked by Jon of All Trades
(5987 rep)
Apr 4, 2017, 05:39 PM
Last activity: Apr 5, 2017, 07:12 PM
Last activity: Apr 5, 2017, 07:12 PM