IRC log of #zope for Tuesday, 2012-05-29

<mgedmin> anyone ever heard of mysteriously dying zope.server threads? 17:18
<mgedmin> not a peep in the logs, but three out of four threads are now gone 17:18
<mgedmin> where's the bug tracker for zc.zservertracelog? 17:44
<kosh> it is not something I have ever seen 18:25
<kosh> not in 10+ years of using zope 18:25
<betabug> mgedmin: gone? or deadlocked maybe? 18:26
<kosh> betabug: so how goes the fall of civilization? 18:26
<betabug> endless, how else 18:26
<mgedmin> gone: pstree shows only 4 running threads there (main thread, zeo client thread, mail delivery thread, one worker thread) 18:26
<mgedmin> the staging instance on the same server with very similar config has 7 threads that show up in pstree -- and also in sys._current_frames() 18:27
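
[Editor's sketch, not part of the original conversation: one way to get the in-process view that mgedmin compares against pstree is to pair sys._current_frames() with threading.enumerate(). The helper name dump_threads is hypothetical; only the standard library is used, and the output depends on whatever threads happen to be alive.]

```python
import sys
import threading
import traceback


def dump_threads():
    """Return {thread name: formatted stack} for every live thread."""
    # Map thread identifiers to the names Python knows about.
    names = {t.ident: t.name for t in threading.enumerate()}
    report = {}
    for ident, frame in sys._current_frames().items():
        # Threads not started through the threading module have no name.
        name = names.get(ident, "unknown-%d" % ident)
        report[name] = "".join(traceback.format_stack(frame))
    return report


if __name__ == "__main__":
    for name, stack in sorted(dump_threads().items()):
        print("--- %s ---" % name)
        print(stack)
```

[A worker thread that has exited no longer appears in this report, so running it periodically would show the count dropping in the same way as the pstree output.]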
<mgedmin> I've collectd graphs showing me when the thread count goes down 18:27
<mgedmin> but I haven't been able to pin the cause yet 18:27
<mgedmin> nothing suspicious in the logs (z3.log, trace.log, access.log)... 18:27
<mgedmin> well, I saw four exceptions about a short socket read (client closed connection before the entire response was sent) 18:28
<mgedmin> if I'd seen three, I could believe they might've shut the three threads down somehow 18:28
<mgedmin> shame that there's no thread ID in trace.log 18:29
<mgedmin> oh, hey, actually 18:36
<mgedmin> three out of those four "ERROR zope.server.taskthreads Exception during task" messages correlate rather neatly with the thread count going down by one in the collectd graph 18:36
<mgedmin> the exceptions are here:
<mgedmin> I see no differences between the first and the rest 18:39
<mgedmin> and here's the thread # graph:
<mgedmin> sunday night the thread # went down to zero live worker threads, which alerted me to the issue 18:42
<kosh> geez, that sounds so bizarre, I wonder what is in your code or that your code is using that would cause that problem 18:43
<mgedmin> sure, blame my code for bugs in zope.server :) 18:44
<kosh> it just seems that if it was a general bug in zope.server more people would have the problem 18:45
<mgedmin> I agree 18:45
<kosh> I just suspected some kind of C extension involved that is causing the thread to die 18:45
<mgedmin> I've been running this configuration for a long time, only now started noticing strangeness 18:45
<kosh> hardware or library update problem? 18:45
<kosh> although I have no idea how that would be involved 18:45
<mgedmin> looking at the yearly graph, thread # started going down in mid-April 18:46
<mgedmin> no significant OS-level upgrades near that time 18:47
<mgedmin> there were app version updates, I'll have to check if they changed any package versions 18:47
* mgedmin needs a break 18:48
<kosh> I can get a vacation in a little over 2 years it looks like 18:49
<mgedmin> well, good luck with that :) 18:50
<kosh> just the unfortunate reality of running a business and going back to school with other issues having come up 18:57
<betabug> I don't need a vacation, I just go to work on an island 18:57
<benbangert> J1m: gevent + real OS threads is such a blast.... 19:32
<J1m> I like both of those. :) 19:33
<J1m> The combination can be explosive. :) 19:33
<benbangert> seems to work safely 19:33
<benbangert> almost got the base client bit set up, then I'll mix in the testing stuff from zc.zk 19:33
<benbangert> and add the higher level children/properties APIs from zc.zk 19:34
<J1m> very cool 19:34
<benbangert> that pipe trick feels kind of weak, but oh well 19:35
<benbangert> that's the main thing when using the coordination objects; also, even weirder, waking the async object from a separate real OS thread can apparently cause issues, which is why the actual set/set_exception is curried over to the gevent thread to run 19:35
<benbangert> the side effect of the 3 greenlets/threads at the moment means it's possible for session events to fire at the same time as a watcher executes... which doesn't immediately strike me as problematic, does that raise any warning bells in your head? 19:37
<benbangert> at the very least, it makes it easier to use ZK calls in a watcher without worrying about blocking session re-establishment 19:37
<benbangert> (normally you'd need to spawn your watcher to another thread to ensure you don't block those anyways) 19:37
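
[Editor's sketch, not part of the original conversation: the "pipe trick" benbangert mentions is commonly a self-pipe used to wake a loop from another OS thread, with the real work then run on the loop's own thread. The version below is a plain-thread analogue using only the standard library. It does not use gevent, it is not benbangert's code, and the names LoopWaker, schedule and run_once are hypothetical. It assumes a Unix-like system where select() accepts pipe descriptors.]

```python
import os
import queue
import select
import threading


class LoopWaker:
    """Hand callbacks from any OS thread to a single loop thread via a pipe."""

    def __init__(self):
        self._read_fd, self._write_fd = os.pipe()
        self._pending = queue.Queue()

    def schedule(self, func, *args):
        # Safe to call from any thread: queue the work first, then write
        # one byte so the loop thread's select() wakes up.
        self._pending.put((func, args))
        os.write(self._write_fd, b"x")

    def run_once(self, timeout=None):
        # Call only from the loop thread. Returns how many callbacks ran.
        ready, _, _ = select.select([self._read_fd], [], [], timeout)
        if not ready:
            return 0
        os.read(self._read_fd, 4096)  # drain the wake-up bytes
        ran = 0
        while True:
            try:
                func, args = self._pending.get_nowait()
            except queue.Empty:
                return ran
            func(*args)  # runs on the loop thread, not the caller's
            ran += 1

    def close(self):
        os.close(self._read_fd)
        os.close(self._write_fd)
```

[In the setup described above, the scheduled callable would be the result object's set or set_exception, so that it only ever executes on the loop's thread.]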
<mgedmin> actually, the zope.server-dying-thread problem is probably my fault 23:55
<mgedmin> or, rather, the fault of zilch: 23:56
<mgedmin> if this happens in zope.server's handlerThread method, in log.exception(), that terminates the entire thread 23:56
<mgedmin> is the code from zope.server 23:56

