Context concurrency and parallelism issues #1766

Open
@arielvalentin

OpenTelemetry::Context uses a fiber-local Ruby Array as a stack to keep track of the active Context instance.

At least two gems break encapsulation by copying fiber-local variables to child threads:

This leads to concurrency and parallel-processing issues because multiple threads push and pop entries on the same stack. The result is OpenTelemetry::Context::DetachError, as well as non-deterministic behavior in which OpenTelemetry::Context.current returns the wrong Context instance.
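To illustrate the sharing problem, here is a minimal sketch that does not use the SDK at all. The module name and storage key are hypothetical stand-ins, and the "copy every fiber-local into the child" step is an assumption about what the offending gems effectively do, not their actual code.

```ruby
# Hypothetical demo; SharedStackDemo and KEY are illustrative names only.
module SharedStackDemo
  KEY = :__shared_stack_demo__

  module_function

  # Thread#[] and Thread#[]= store fiber-local values.
  def stack
    Thread.current[KEY] ||= []
  end

  def run
    stack.push(:parent_context)
    parent_stack = stack

    # Capture the parent's fiber-locals, then copy them into a child thread.
    locals = Thread.current.keys.to_h { |key| [key, Thread.current[key]] }
    same_array = Thread.new do
      locals.each { |key, value| Thread.current[key] = value }
      stack.push(:child_context) # mutates the parent's Array as well
      stack.equal?(parent_stack)
    end.value

    [same_array, parent_stack.dup]
  end
end

p SharedStackDemo.run
# => [true, [:parent_context, :child_context]]
```

Because only the reference is copied, both threads push and pop on one Array, which is how a detach in one thread can see an entry that another thread attached.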

Helpers that rely on the implicitly active Context, such as OpenTelemetry::Trace.current_span, will then return the wrong Span instance.

There is a proposed fix here that would use immutable Arrays; however, it would increase the number of object allocations per Context stack mutation:

#1760
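A rough sketch of the immutable-Array idea follows. It is not the code from the linked PR; the module name, key, and method names are hypothetical. It only shows why a copied reference becomes harmless and where the extra allocation comes from.

```ruby
# Hypothetical sketch; not the implementation proposed in the linked PR.
module FrozenStackSketch
  KEY = :__frozen_stack_sketch__
  EMPTY = [].freeze

  module_function

  def stack
    Thread.current[KEY] || EMPTY
  end

  # Each push builds a new frozen Array; the previous one is never mutated.
  def push(context)
    Thread.current[KEY] = [*stack, context].freeze
  end

  # Each pop also allocates a new frozen Array.
  def pop
    Thread.current[KEY] = stack[0...-1].freeze
  end
end

FrozenStackSketch.push(:parent_context)
inherited = FrozenStackSketch.stack

child_view = Thread.new do
  # Simulate a gem copying the parent's fiber-local into the child.
  Thread.current[FrozenStackSketch::KEY] = inherited
  FrozenStackSketch.push(:child_context)
  FrozenStackSketch.stack
end.value

p child_view              # => [:parent_context, :child_context]
p FrozenStackSketch.stack # => [:parent_context]
```

The child starts from the parent's stack but can never change it. The cost is one new Array per attach and detach, which is the allocation overhead mentioned above.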

Another option I have considered is to use thread-local variables instead of fiber-local variables. However, that would cause concurrency issues with Fibers, because multiple fibers on the same thread would have access to the same Array instance. We would also need a mechanism to seed a spawned thread with a snapshot of the stack, so that context propagation stays consistent for the gems listed above.
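The thread-local alternative and its trade-off can be sketched as follows. This is only an illustration under assumptions: the module, the key, and the spawn_with_snapshot helper are hypothetical and do not exist in the SDK. It uses Thread#thread_variable_get and Thread#thread_variable_set, which are true thread-locals in Ruby.

```ruby
# Hypothetical sketch; spawn_with_snapshot is an invented helper name.
module ThreadLocalStackSketch
  KEY = :__thread_local_stack_sketch__

  module_function

  def stack
    current = Thread.current
    existing = current.thread_variable_get(KEY)
    return existing if existing

    fresh = []
    current.thread_variable_set(KEY, fresh)
    fresh
  end

  # Seed the new thread with a copy of the caller's stack.
  def spawn_with_snapshot(&block)
    snapshot = stack.dup
    Thread.new do
      Thread.current.thread_variable_set(KEY, snapshot)
      block.call
    end
  end
end

ThreadLocalStackSketch.stack.push(:parent_context)
parent_stack = ThreadLocalStackSketch.stack

child_view = ThreadLocalStackSketch.spawn_with_snapshot do
  ThreadLocalStackSketch.stack.push(:child_context)
  ThreadLocalStackSketch.stack.dup
end.value

# A fiber on the same thread sees the very same Array.
fiber_shares = Fiber.new { ThreadLocalStackSketch.stack.equal?(parent_stack) }.resume

p child_view   # => [:parent_context, :child_context]
p parent_stack # => [:parent_context]
p fiber_shares # => true
```

The snapshot isolates the child thread from the parent, but the last line shows the drawback described above: fibers running on one thread would still push and pop on a single shared Array.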

I would like to get some feedback from @open-telemetry/ruby-maintainers on whether there are additional alternatives we can consider.
