-
Notifications
You must be signed in to change notification settings - Fork 6
Improve thread safety of SwaggerClient #202
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
This incorporates the change in HumanCellAtlas/dcp-cli#202
This incorporates the change in HumanCellAtlas/dcp-cli#202
This incorporates the change in HumanCellAtlas/dcp-cli#202
This incorporates the change in HumanCellAtlas/dcp-cli#202
This incorporates the change in HumanCellAtlas/dcp-cli#202
* parallelize read from DSS in workers * Use thread-safer DCP interface This incorporates the change in HumanCellAtlas/dcp-cli#202 * Include thread id in logs This is helpful in our multithreaded code paths * Don't print logs twice AWS Lambda has a default root logger that captures everything that goes to stdout. If you don't turn `propagate` off, you get every message twice in cloudwatch. * Poll for locks much less frequently A failure mode we encounter is using up capacity on the lock table. If we're anticipating smaller numbers of worker lambdas that handle larger blocks of works, we can wait longer between checking for lock availability. This reduced our dynamo consumed capacity. * Distribute 100 bundles to each lambda And aggressively parallelize DSS I/O. * Assume that each bundle has a single cell in the mapper This is a bad assumption, but it's helpful for the demo. * Address comments
ttung
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unit tests are broken, btw.
my suggestion for making this a bit safer: in __init__(..), create a lock. in stream(..), hold the lock, and verify that _context_manager_response is None before issuing the request. if it's not None, then the previous request is plausibly still running.
|
That's definitely safer, but I think the safety is achieved by fully serializing calls to |
|
Also, re unit tests, I'm pretty sure master is broken? Is that possible? |
I think my suggestion achieves that same effect...? |
Yes, that's definitely possible. |
|
Passing now, would you like to try your unit tests again? |
* parallelize read from DSS in workers * Use thread-safer DCP interface This incorporates the change in HumanCellAtlas/dcp-cli#202 * Include thread id in logs This is helpful in our multithreaded code paths * Don't print logs twice AWS Lambda has a default root logger that captures everything that goes to stdout. If you don't turn `propagate` off, you get every message twice in cloudwatch. * Poll for locks much less frequently A failure mode we encounter is using up capacity on the lock table. If we're anticipating smaller numbers of worker lambdas that handle larger blocks of works, we can wait longer between checking for lock availability. This reduced our dynamo consumed capacity. * Distribute 100 bundles to each lambda And aggressively parallelize DSS I/O. * Assume that each bundle has a single cell in the mapper This is a bad assumption, but it's helpful for the demo. * Address comments
SwaggerClient discovers methods by reading a swagger document, creating instances of _ClientMethodFactory for paths in the swagger, and assigning those instances as class attributes of SwaggerClient. This means that a single instance of each _ClientMethodFactory is shared across threads. This is not so good because _ClientMethodFactories store state associated with requests. In particular, consider the following scenario: Thread-1: client = hca.dss.DSSClient() Thread-2: client = hca.dss.DSSClient() Thread-1: with client.get_file.stream(...) as handle: Thread-2: with client.get_file.stream(...) as handle: Thread-1: handle.raw.read() Depending on the execution of __enter__, Thread-1 may have just read from the file for Thread-2! Going on, Thread-1: __exit__ the with block Thread-2: handle.raw.read() Thread-2: __exit__ the with block Exception: NoneType has no attribute close() Or you know, something like that. This change makes the methods instance attributes.
Really feel like this one should pass...
ae247b3 to
3c57a7f
Compare
|
Thanks @kozbo , still failing, but interesting failures now. |
|
@mckinsel any progress on this? Is this still an issue? |
SwaggerClient discovers methods by reading a swagger document, creating
instances of _ClientMethodFactory for paths in the swagger, and
assigning those instances as class attributes of SwaggerClient.
This means that a single instance of each _ClientMethodFactory is shared
across threads. This is not so good because _ClientMethodFactories
store state associated with requests. In particular, consider the
following scenario:
Depending on the execution of
__enter__, Thread-1 may have just read fromthe file for Thread-2! Going on,
Or you know, something like that.
This change makes the methods instance attributes.