Implement a Network Keep-Alive Mechanism in Go
This challenge requires you to build a simple keep-alive mechanism for network connections in Go. A keep-alive system is crucial for maintaining active network connections, detecting dead connections, and ensuring resources are properly managed by sending periodic "heartbeat" messages.
Problem Description
You need to design and implement a Go program that establishes a TCP connection to a specified server and periodically sends a "ping" message to ensure the connection remains alive. If the server doesn't respond to a "ping" within a certain timeout, the program should detect this as a dead connection and attempt to re-establish it.
Key Requirements:
- Establish TCP Connection: The program should be able to connect to a given host and port.
- Periodic Pinging: After a successful connection, the program must send a predefined "ping" message at regular intervals.
- Response Timeout: The program should wait for a specific "pong" response from the server after sending a "ping". If no response is received within the timeout period, the connection should be considered dead.
- Connection Monitoring: A separate goroutine should continuously monitor the connection's health.
- Reconnection Logic: If the connection is detected as dead, the program should attempt to reconnect to the server after a delay.
- Graceful Shutdown: The program should be able to shut down gracefully, closing any open connections.
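The ping and timeout requirements above can be sketched as a single exchange guarded by a deadline. This is a minimal illustration, not a reference solution: the helper names pingOnce, isTimeout, and demo are hypothetical, the newline-terminated framing is an assumption, and an in-memory net.Pipe stands in for a real TCP connection.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"net"
	"time"
)

// pingOnce sends a newline-terminated "ping" and waits up to timeout
// for a "pong" line. The newline framing is an assumption of this sketch.
func pingOnce(conn net.Conn, r *bufio.Reader, timeout time.Duration) error {
	// One deadline covers both the write and the read that follows.
	if err := conn.SetDeadline(time.Now().Add(timeout)); err != nil {
		return err
	}
	if _, err := conn.Write([]byte("ping\n")); err != nil {
		return fmt.Errorf("send ping: %w", err)
	}
	line, err := r.ReadString('\n')
	if err != nil {
		return fmt.Errorf("await pong: %w", err)
	}
	if line != "pong\n" {
		return fmt.Errorf("unexpected reply %q", line)
	}
	return nil
}

// isTimeout reports whether err stems from an expired deadline.
func isTimeout(err error) bool {
	var ne net.Error
	return errors.As(err, &ne) && ne.Timeout()
}

// demo runs one exchange over an in-memory pipe. The fake peer reads the
// ping and answers only when respond is true.
func demo(respond bool) error {
	client, server := net.Pipe()
	defer client.Close()
	defer server.Close()
	go func() {
		if _, err := bufio.NewReader(server).ReadString('\n'); err != nil {
			return
		}
		if respond {
			server.Write([]byte("pong\n"))
		}
	}()
	return pingOnce(client, bufio.NewReader(client), 100*time.Millisecond)
}

func main() {
	fmt.Println("responsive peer:", demo(true))
	fmt.Println("silent peer timed out:", isTimeout(demo(false)))
}
```

Keeping the bufio.Reader alive for the lifetime of the connection, rather than creating one per ping, avoids losing bytes that were buffered but not yet consumed.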
Expected Behavior:
- The program starts and attempts to connect to the target server.
- Upon successful connection, it begins sending "ping" messages and waiting for "pong" responses.
- If a "pong" is received before the timeout, the program waits for the next ping interval and repeats the cycle.
- If the timeout is reached without a "pong", the connection is marked as dead, closed, and a reconnection attempt is made after a short interval.
- If the program receives a shutdown signal (e.g., from a context cancellation), it should cleanly close the connection and exit.
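The behavior listed above maps naturally onto two nested loops: an outer loop that dials and reconnects, and an inner loop that pings over one connection. The sketch below is one possible shape under stated assumptions: Config, session, Run, and demo are hypothetical names, messages are newline-terminated, and the demo uses much shorter durations than the challenge allows so that it finishes quickly.

```go
package main

import (
	"bufio"
	"context"
	"fmt"
	"log"
	"net"
	"sync/atomic"
	"time"
)

// Config groups the tunables; the field names are hypothetical.
type Config struct {
	Addr                                      string
	PingInterval, PingTimeout, ReconnectDelay time.Duration
}

// session pings over one connection until an exchange fails or ctx ends.
func session(ctx context.Context, conn net.Conn, cfg Config) error {
	r := bufio.NewReader(conn)
	for {
		conn.SetDeadline(time.Now().Add(cfg.PingTimeout))
		if _, err := fmt.Fprint(conn, "ping\n"); err != nil {
			return err
		}
		log.Println("Sent: ping")
		reply, err := r.ReadString('\n')
		if err != nil {
			return err // a missed pong surfaces here as a timeout error
		}
		if reply != "pong\n" {
			return fmt.Errorf("unexpected reply %q", reply)
		}
		log.Println("Received: pong")
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(cfg.PingInterval):
		}
	}
}

// Run dials, pings, and reconnects after failures until ctx is cancelled.
func Run(ctx context.Context, cfg Config) error {
	var d net.Dialer
	for {
		log.Printf("Connecting to %s...", cfg.Addr)
		conn, err := d.DialContext(ctx, "tcp", cfg.Addr)
		if err == nil {
			log.Println("Connection established.")
			err = session(ctx, conn, cfg)
			conn.Close() // release the socket before any retry
		}
		if ctx.Err() != nil {
			return ctx.Err() // shutdown was requested
		}
		log.Printf("Connection to %s failed or lost: %v", cfg.Addr, err)
		log.Printf("Attempting to reconnect in %s...", cfg.ReconnectDelay)
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(cfg.ReconnectDelay):
		}
	}
}

// demo runs Run briefly against a throwaway local pong server.
func demo() (int64, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer ln.Close()
	var pings int64
	go func() {
		conn, err := ln.Accept()
		if err != nil {
			return
		}
		defer conn.Close()
		for sc := bufio.NewScanner(conn); sc.Scan(); {
			atomic.AddInt64(&pings, 1)
			fmt.Fprintln(conn, "pong")
		}
	}()
	ctx, cancel := context.WithTimeout(context.Background(), 300*time.Millisecond)
	defer cancel()
	cfg := Config{Addr: ln.Addr().String(), PingInterval: 50 * time.Millisecond,
		PingTimeout: 100 * time.Millisecond, ReconnectDelay: 50 * time.Millisecond}
	err = Run(ctx, cfg)
	return atomic.LoadInt64(&pings), err
}

func main() {
	n, err := demo()
	fmt.Printf("server saw %d pings; Run returned: %v\n", n, err)
}
```

In this sketch a cancellation that arrives while a read is pending is noticed only when the ping deadline expires, so shutdown latency is bounded by the ping timeout. A real solution would call Run from its own goroutine, as the requirements ask.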
Edge Cases to Consider:
- Server unavailable: What happens if the initial connection fails?
- Intermittent network issues: How does the system handle temporary disconnections?
- Server not responding to pings: The primary scenario for detecting a dead connection.
- Large delays in "pong" responses: A "pong" that arrives after the timeout has expired should still count as a missed response.
Examples
For the purpose of this challenge, we will simulate a server that responds to "ping" with "pong" and can also simulate a dead connection by not responding.
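A mock server for such a simulation might look like the following sketch. The names serveMock and probe are hypothetical, messages are assumed to be newline-terminated, and the server goes silent after a configurable number of replies to imitate a dead connection.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"time"
)

// serveMock answers each "ping" line with "pong" until it has sent
// maxReplies replies on a connection. After that it keeps reading but
// stays silent, which imitates a dead peer. A negative maxReplies
// means the server never goes silent.
func serveMock(ln net.Listener, maxReplies int) {
	for {
		conn, err := ln.Accept()
		if err != nil {
			return // listener closed
		}
		go func(c net.Conn) {
			defer c.Close()
			replies := 0
			sc := bufio.NewScanner(c)
			for sc.Scan() {
				if sc.Text() != "ping" {
					continue // ignore anything that is not a ping
				}
				if maxReplies >= 0 && replies >= maxReplies {
					continue // simulate a peer that stopped responding
				}
				fmt.Fprintln(c, "pong")
				replies++
			}
		}(conn)
	}
}

// probe starts a mock server on a free local port, sends up to pings
// pings, and returns how many pongs arrived before the first read failed.
// Any read error, including a timeout, is treated as silence here.
func probe(maxReplies, pings int) (int, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer ln.Close()
	go serveMock(ln, maxReplies)

	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return 0, err
	}
	defer conn.Close()

	r := bufio.NewReader(conn)
	got := 0
	for i := 0; i < pings; i++ {
		conn.SetDeadline(time.Now().Add(200 * time.Millisecond))
		if _, err := fmt.Fprintln(conn, "ping"); err != nil {
			return got, err
		}
		if _, err := r.ReadString('\n'); err != nil {
			return got, nil
		}
		got++
	}
	return got, nil
}

func main() {
	n, err := probe(1, 3)
	fmt.Println("pongs before silence:", n, "error:", err)
}
```

Listening on port 0 lets the operating system pick a free port, which keeps tests from colliding with a real service on localhost:8080.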
Example 1: Successful Ping-Pong Cycle
Assume a server is running at localhost:8080 and correctly responds to "ping" with "pong".
Input (Conceptual):
- Server Address: localhost:8080
- Ping Interval: 5s
- Ping Timeout: 3s
- Ping Message: ping
- Pong Message: pong
Program Output (Conceptual):
Connecting to localhost:8080...
Connection established.
Sent: ping
Received: pong
Sent: ping
Received: pong
...
Explanation: The program successfully connects, sends "ping", receives "pong", and repeats this cycle every 5 seconds.
Example 2: Connection Timeout and Reconnection
Assume the server at localhost:8080 is running but stops responding to "ping" messages.
Input (Conceptual):
- Server Address: localhost:8080
- Ping Interval: 5s
- Ping Timeout: 3s
- Ping Message: ping
- Pong Message: pong
- Reconnection Delay: 2s
Program Output (Conceptual):
Connecting to localhost:8080...
Connection established.
Sent: ping
Received: pong
Sent: ping
[Timeout waiting for pong after 3s]
Connection to localhost:8080 lost.
Attempting to reconnect in 2s...
Connecting to localhost:8080...
[Assuming server recovers or becomes available]
Connection established.
Sent: ping
Received: pong
...
Explanation: The program connects and completes one ping-pong exchange. The second "ping" gets no "pong" within 3 seconds, so the program logs the timeout, closes the connection, and waits 2 seconds before attempting to reconnect.
Constraints
- The program must be written in Go.
- Use standard Go libraries for networking (net package).
- Implement the keep-alive logic within a dedicated goroutine.
- Ping Interval: Between 1s and 60s.
- Ping Timeout: Between 1s and 30s.
- Reconnection Delay: Between 1s and 10s.
- The program should handle a maximum of 5 consecutive dead connection detection events before entering a more aggressive backoff strategy (optional but good to consider for robustness).
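The numeric limits above can be enforced up front, and the optional backoff can be expressed as a small pure function. The sketch below makes several assumptions that the challenge does not specify: the names Config, Validate, and reconnectDelay are hypothetical, the backoff note is read as five failed attempts in a row, the "more aggressive" strategy is taken to be doubling, and the 60s cap is arbitrary.

```go
package main

import (
	"fmt"
	"time"
)

// Config holds the three durations constrained by the challenge.
type Config struct {
	PingInterval   time.Duration
	PingTimeout    time.Duration
	ReconnectDelay time.Duration
}

// checkRange returns an error when d lies outside [lo, hi].
func checkRange(name string, d, lo, hi time.Duration) error {
	if d < lo || d > hi {
		return fmt.Errorf("%s %s is outside [%s, %s]", name, d, lo, hi)
	}
	return nil
}

// Validate checks each duration against the limits listed above.
func (c Config) Validate() error {
	if err := checkRange("ping interval", c.PingInterval, time.Second, 60*time.Second); err != nil {
		return err
	}
	if err := checkRange("ping timeout", c.PingTimeout, time.Second, 30*time.Second); err != nil {
		return err
	}
	return checkRange("reconnection delay", c.ReconnectDelay, time.Second, 10*time.Second)
}

const (
	failuresBeforeBackoff = 5                // from the optional constraint
	maxBackoff            = 60 * time.Second // assumed cap, not from the challenge
)

// reconnectDelay returns base for the first five failures in a row and
// doubles the delay for every further failure, capped at maxBackoff.
func reconnectDelay(base time.Duration, failures int) time.Duration {
	d := base
	for i := failuresBeforeBackoff; i < failures; i++ {
		d *= 2
		if d >= maxBackoff {
			return maxBackoff
		}
	}
	return d
}

func main() {
	cfg := Config{PingInterval: 5 * time.Second, PingTimeout: 3 * time.Second, ReconnectDelay: 2 * time.Second}
	if err := cfg.Validate(); err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	for _, n := range []int{1, 5, 6, 8, 20} {
		fmt.Printf("after %d failures wait %s\n", n, reconnectDelay(cfg.ReconnectDelay, n))
	}
}
```

The failure counter would be reset to zero after any successful ping-pong exchange, so a healthy connection always returns to the base delay.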
Notes
- Consider using context.Context for managing the lifecycle of the connection and goroutines, allowing for graceful shutdown.
- Error handling is critical. Log errors appropriately.
- For testing, you might need to run a simple mock TCP server that simulates these behaviors (responding, not responding, etc.).
- Think about how to safely read from and write to the network connection from multiple goroutines if necessary (though a single goroutine for pinging is sufficient for this problem).
- The "ping" and "pong" messages are simple byte slices. Ensure they are sent and received correctly, including any necessary delimiters or length prefixes if your mock server requires them (for this challenge, simple strings are fine).
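For the graceful-shutdown note, one common approach is to tie the connection's lifetime to a context so that cancellation unblocks any pending read or write. The following sketch assumes hypothetical helpers closeOnDone and demo, uses net.Pipe in place of a TCP connection, and cancels itself after 50ms so that it terminates without waiting for Ctrl+C.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"os"
	"os/signal"
	"sync"
	"time"
)

// closeOnDone closes conn when ctx is cancelled so that any goroutine
// blocked in Read or Write on it returns promptly. Call the returned
// stop function once the connection is no longer in use.
func closeOnDone(ctx context.Context, conn net.Conn) (stop func()) {
	done := make(chan struct{})
	var once sync.Once
	go func() {
		select {
		case <-ctx.Done():
			conn.Close()
		case <-done:
		}
	}()
	return func() { once.Do(func() { close(done) }) }
}

// demo reports whether a Read with no deadline was unblocked by
// cancellation of the context.
func demo() bool {
	// An interrupt (Ctrl+C) cancels sigCtx.
	sigCtx, stopSignals := signal.NotifyContext(context.Background(), os.Interrupt)
	defer stopSignals()
	// For the demo only: also cancel after 50ms so the program ends.
	ctx, cancel := context.WithTimeout(sigCtx, 50*time.Millisecond)
	defer cancel()

	client, server := net.Pipe()
	defer client.Close()
	defer server.Close()

	stop := closeOnDone(ctx, client)
	defer stop()

	// The peer never writes, so this Read blocks until client is closed.
	_, err := client.Read(make([]byte, 1))
	return err != nil && ctx.Err() != nil
}

func main() {
	fmt.Println("read unblocked by cancellation:", demo())
}
```

With this pattern the pinging goroutine can remain the only reader and writer of the connection, while the watcher goroutine does nothing but close it, which is safe to do concurrently on a net.Conn.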