tests: server logging via tempfile may have threading issues #1196

@jku

Filing this after debugging rare errors in #1192

The tempfile approach to capturing server logging works fine on the current develop branch, but it triggers issues in PR #1192, where we start reading from the file at a rapid pace. The log contents become corrupted (rarely, but reproducibly), which suggests a threading issue in the file object.

I just tried another approach (worker thread + Queue instead of a TemporaryFile) and it seems to work (this is completely unintegrated into the tests, just an example):

try:
  import queue
except ImportError:
  import Queue as queue # python2

import subprocess
import sys
import threading
import time


# Worker function to run in separate thread
# Reads from 'stream' (stdout), puts lines in Queue 'line_queue' (Queue is thread-safe)
def log_queue_worker(stream, line_queue):
  while True:
    log_line = stream.readline()
    line_queue.put(log_line)
    if len(log_line) == 0:
      # EOS
      break

# Start child process
proc = subprocess.Popen([sys.executable, "-u", "proxy_server.py"],
    stdout=subprocess.PIPE, stderr=subprocess.STDOUT)

# Run log_queue_worker() in a thread
# The thread will exit when child process dies
log_queue = queue.Queue()
log_thread = threading.Thread(target=log_queue_worker, args=(proc.stdout, log_queue))
log_thread.daemon = True
log_thread.start()

# setup timeout variables
start = time.time()
elapsed = 0
timeout = 10

# Get lines from log_queue until port is found, child process exits, or we timeout
while True:
  try:
    # Clamp at zero: Queue.get() raises ValueError on a negative timeout
    line = log_queue.get(timeout=max(0, timeout - elapsed))
    if len(line) == 0:
      print("process exit")
      break
    elif line.startswith(b"bind succeeded, server port is: "):
      # line is bytes, so strip with a bytes argument and decode before printing
      print(line.rstrip(b"\n").decode("utf-8"))
      break
  except queue.Empty:
    print("timeout")
    break
  finally:
    elapsed = time.time() - start

This has the added benefit of not needing a single sleep: the worker thread can block on readline(), and the main thread uses the Queue.get() timeout.
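
For discussion, here is a rough sketch of how the same worker thread + Queue idea might be wrapped in a helper that the tests could call. It is not integrated anywhere: the names `_pump_lines` and `wait_for_line` are made up for this example, and it assumes Python 3 only (it uses `time.monotonic()` for the deadline instead of tracking elapsed time by hand).

```python
import queue
import subprocess
import sys
import threading
import time


def _pump_lines(stream, line_queue):
    """Worker (hypothetical name): forward each line to the queue; b"" marks EOF."""
    for line in iter(stream.readline, b""):
        line_queue.put(line)
    line_queue.put(b"")


def wait_for_line(cmd, prefix, timeout=10.0):
    """Start cmd and wait for the first output line that starts with prefix.

    Returns (proc, line). line is None if the child exits or the timeout
    expires before a matching line is seen. Hypothetical helper, not test API.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    lines = queue.Queue()
    worker = threading.Thread(target=_pump_lines, args=(proc.stdout, lines))
    worker.daemon = True  # do not keep the interpreter alive for this thread
    worker.start()

    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return proc, None  # timed out
        try:
            line = lines.get(timeout=remaining)
        except queue.Empty:
            return proc, None  # timed out while waiting
        if line == b"":
            return proc, None  # child closed stdout (process exit)
        if line.startswith(prefix):
            return proc, line.rstrip(b"\r\n")
```

A caller would pass something like `[sys.executable, "-u", "proxy_server.py"]` and the `b"bind succeeded, server port is: "` prefix, then remain responsible for terminating `proc` when the test is done.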
