Description
The OpenTelemetry::Context uses a fiber-local Ruby Array to keep track of the active Context instance.
There are at least two gems that break encapsulation by copying fiber-local variables to child threads:
This leads to concurrency and parallel processing issues because multiple threads push and pop entries on the same stack, resulting in OpenTelemetry::Context::DetachError as well as non-deterministic behavior where the OpenTelemetry::Context.current method returns the wrong instance of the context object.
Helpers that rely on the implicitly active Context, like OpenTelemetry::Trace.current_span, will end up receiving the wrong Span object instance.
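To illustrate the failure mode, here is a minimal sketch that does not use the OpenTelemetry gem at all. The key name DEMO_STACK_KEY and the symbols pushed onto the stack are hypothetical stand-ins; the only real APIs involved are Thread#[] and Thread#[]=, which are fiber-local in Ruby. Copying the value into a child thread copies the reference, so both threads end up mutating one Array.

```ruby
# Hypothetical key, standing in for wherever the Context stack is stored.
DEMO_STACK_KEY = :__demo_context_stack__

# Thread#[] and Thread#[]= read and write fiber-local storage.
Thread.current[DEMO_STACK_KEY] = [:root_context]
parent_stack = Thread.current[DEMO_STACK_KEY]

child = Thread.new do
  # The pattern in question: seed the child with the parent's fiber-local value.
  # Only the reference is copied, so parent and child now share one Array.
  Thread.current[DEMO_STACK_KEY] = parent_stack
  Thread.current[DEMO_STACK_KEY].push(:child_context)
end
child.join

# The parent never pushed :child_context, yet it is now on top of its stack.
puts Thread.current[DEMO_STACK_KEY].inspect
```

With the join in place the outcome is deterministic. Without it, pushes and pops from both threads interleave on the same Array, which is where the mismatched detach calls and wrong "current" values come from.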
There is a proposed fix here that would use immutable Arrays; however, this would increase the number of object allocations per Context stack mutation:
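As a rough sketch of the immutable-Array idea (not the code from the proposed fix; the module ImmutableStackDemo and its method names are made up for illustration), every mutation replaces the stored Array with a new frozen one. A child thread that copied the reference can no longer affect the parent, at the cost of allocating on every push and pop.

```ruby
# Hypothetical stand-in for a Context stack backed by frozen Arrays.
module ImmutableStackDemo
  KEY = :__demo_immutable_stack__
  EMPTY = [].freeze

  def self.stack
    Thread.current[KEY] || EMPTY
  end

  # Allocates a new frozen Array instead of mutating the existing one.
  def self.push(entry)
    Thread.current[KEY] = (stack + [entry]).freeze
  end

  # Also allocates: the remaining entries are copied into a new frozen Array.
  def self.pop
    current = stack
    Thread.current[KEY] = current[0...-1].freeze
    current.last
  end
end

ImmutableStackDemo.push(:root_context)
parent_stack = ImmutableStackDemo.stack

child_view = Thread.new do
  # Same copying pattern as before, but the shared Array is frozen.
  Thread.current[ImmutableStackDemo::KEY] = parent_stack
  ImmutableStackDemo.push(:child_context)
  ImmutableStackDemo.stack
end.value

puts child_view.inspect               # child sees its own, longer stack
puts ImmutableStackDemo.stack.inspect # parent's stack is unchanged
```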
Another option I have considered is using thread-local variables instead of fiber-local variables. However, that would cause concurrency issues with Fibers, because multiple fibers running on the same thread would have access to the same Array instance. We would also need a mechanism to seed a spawned thread with a snapshot of the stack, so that context propagation stays consistent for the gems listed above.
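A small sketch of both halves of that trade-off, assuming a hypothetical ThreadLocalStackDemo module (none of these names exist in the gem). It uses Ruby's real thread-local API, Thread#thread_variable_get and Thread#thread_variable_set, first to show that fibers on one thread share the Array, and then to show what seeding a spawned thread with a snapshot could look like.

```ruby
# Hypothetical stand-in for a Context stack kept in true thread-local storage.
module ThreadLocalStackDemo
  KEY = :__demo_thread_local_stack__

  def self.stack
    current = Thread.current.thread_variable_get(KEY)
    return current if current

    fresh = []
    Thread.current.thread_variable_set(KEY, fresh)
    fresh
  end

  # Spawns a thread whose stack starts as a copy of the caller's stack.
  def self.spawn_with_snapshot(&block)
    snapshot = stack.dup
    Thread.new do
      Thread.current.thread_variable_set(KEY, snapshot)
      block.call
    end
  end
end

ThreadLocalStackDemo.stack.push(:root_context)

# Downside: a fiber on the same thread mutates the very same Array.
Fiber.new { ThreadLocalStackDemo.stack.push(:from_fiber) }.resume

# Seeding: the child starts from a snapshot and its pushes stay private.
child_view = ThreadLocalStackDemo.spawn_with_snapshot do
  ThreadLocalStackDemo.stack.push(:child_only)
  ThreadLocalStackDemo.stack.dup
end.value

puts ThreadLocalStackDemo.stack.inspect # includes :from_fiber, not :child_only
puts child_view.inspect
```

The snapshot is taken with dup at spawn time, so later changes in the parent are not visible to the child; whether that is the desired propagation semantics is part of the open question.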
I would like to get some feedback from @open-telemetry/ruby-maintainers on whether there are additional alternatives we can consider.