[STORM-4116] Heartbeats mechanism is affected by Y2038 bug #7897

Open
@jira-importer

Description

I ran a test with the system date set after 2038 (e.g. 2040) to check that my topology is not affected by the Y2038 bug.

Context:

  • I installed Storm 2.4.0 on each node of my platform
  • I set the platform date to 24/04/2040
  • In the Storm Nimbus configuration, I set the following Pacemaker configuration:
    ######################################
    ###    Pacemaker configuration     ###
    ######################################
    # Cluster state management. PaceMakerStateStorageFactory to use Pacemaker instead of ZooKeeper.
    storm.cluster.state.store: "org.apache.storm.cluster.PaceMakerStateStorageFactory"
    # Pacemaker servers and port configuration
    pacemaker.servers: [""]
    pacemaker.port: 6699
    # Minimum number of threads used to monitor topology lifecycle
    pacemaker.base.threads: 10
    # Maximum number of threads used to monitor topology lifecycle
    pacemaker.max.threads: 50
    # Maximum number of threads for each connected client
    pacemaker.client.max.threads: 2
    # Client thread timeout
    pacemaker.thread.timeout: 10
    # Childopts for the server
    pacemaker.childopts: "-Xmx4096m"
    # Authentication if needed (Kerberos, etc.)
    pacemaker.auth.method: "NONE"
    pacemaker.kerberos.users: []
    # Maximum size of messages sent by the supervisor to Pacemaker
    pacemaker.thrift.message.size.max: 10485760

  • In the Storm supervisor configuration, I set the following:
    ######################################
    ###    Pacemaker configuration     ###
    ######################################
    # Cluster state management. PaceMakerStateStorageFactory to use Pacemaker instead of ZooKeeper.
    storm.cluster.state.store: "org.apache.storm.cluster.PaceMakerStateStorageFactory"
    # Pacemaker servers and port configuration
    pacemaker.servers: [""]
    pacemaker.port: 6699

  • I submitted my topology
  • The topology was submitted successfully to the supervisor node

Observations:

I checked connectivity from the supervisor node, and ping works.

In the Nimbus log, I observed that after a certain time the topology is reassigned because Nimbus has not received any heartbeat from the workers. I checked the logs and the source code and found that the heartbeat timestamps (the time_secs and uptime_secs variables) are stored as integers.

This issue has also been observed on version 2.7.0.
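
For illustration only, here is a minimal Java sketch of why a seconds-since-epoch timestamp breaks after 2038 if it is held in a 32-bit signed integer, which is what the description suggests for time_secs and uptime_secs. This is not Storm's actual code, and it does not claim to reproduce the exact code path that drops the heartbeats. The class and method names (Y2038Demo, epochSecs, toIntSecs) are hypothetical.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;

// Hypothetical demo class; not part of Apache Storm.
final class Y2038Demo {
    /** Seconds since the Unix epoch at midnight UTC of the given date, as a 64-bit value. */
    public static long epochSecs(int year, int month, int day) {
        return LocalDate.of(year, month, day)
                .atStartOfDay(ZoneOffset.UTC)
                .toEpochSecond();
    }

    /** Narrowing cast to 32 bits: wraps silently once the value exceeds Integer.MAX_VALUE. */
    public static int toIntSecs(long epochSecs) {
        return (int) epochSecs;
    }

    public static void main(String[] args) {
        long before = epochSecs(2037, 4, 24); // still fits in a signed 32-bit int
        long after = epochSecs(2040, 4, 24);  // the test date used in this report
        System.out.println(before + " -> " + toIntSecs(before));
        System.out.println(after + " -> " + toIntSecs(after));
    }
}
```

For the test date 24/04/2040 the 64-bit value is above Integer.MAX_VALUE (2147483647), so the narrowed value comes out negative. Any comparison that mixes such a wrapped timestamp with a 64-bit clock value, or that assumes timestamps are positive, could then treat a fresh heartbeat as missing or expired.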


Originally reported by alexisdureuil, imported from: Heartbeats mechanism is affected by Y2038 bug
  • status: Open
  • priority: Major
  • resolution: Unresolved
  • imported: 2025-01-24
