Arjun
2014-06-19 13:57:15 UTC
Hi,
I have a setup of 3 Kafka servers, with a replication factor of 2.
I have only one topic in this setup as of now.
bin/kafka-list-topic.sh --zookeeper server1:2181,server2:2181,server3:2181 --topic topic1
topic: topic1 partition: 0 leader: 1 replicas: 2,1 isr: 1
topic: topic1 partition: 1 leader: 0 replicas: 0,2 isr: 0
topic: topic1 partition: 2 leader: 1 replicas: 1,0 isr: 0,1
topic: topic1 partition: 3 leader: 0 replicas: 2,0 isr: 0
topic: topic1 partition: 4 leader: 0 replicas: 0,1 isr: 0,1
topic: topic1 partition: 5 leader: 1 replicas: 1,2 isr: 1
topic: topic1 partition: 6 leader: 1 replicas: 2,1 isr: 1
topic: topic1 partition: 7 leader: 0 replicas: 0,2 isr: 0
topic: topic1 partition: 8 leader: 1 replicas: 1,0 isr: 0,1
topic: topic1 partition: 9 leader: 0 replicas: 2,0 isr: 0
topic: topic1 partition: 10 leader: 0 replicas: 0,1 isr: 0,1
topic: topic1 partition: 11 leader: 1 replicas: 1,2 isr: 1
The third broker (broker id 2) is not in the ISR list for any of its partitions. There are no errors in the logs.
The thread dump doesn't contain any thread with "ReplicaFetcherManager" in its name.
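The listing above can also be checked mechanically. The following is a minimal sketch in Python, assuming the exact `topic: ... partition: ... leader: ... replicas: ... isr: ...` line format shown above; the function name `out_of_sync_replicas` and the sample lines are illustrative only and not part of any Kafka tool.

```python
import re

# Matches one line of the kafka-list-topic.sh output shown in this post.
LINE_RE = re.compile(
    r"topic:\s*(?P<topic>\S+)\s+partition:\s*(?P<partition>\d+)\s+"
    r"leader:\s*(?P<leader>-?\d+)\s+replicas:\s*(?P<replicas>[\d,]*)\s+"
    r"isr:\s*(?P<isr>[\d,]*)"
)


def parse_ids(csv):
    """Turn a comma-separated id list such as '2,1' into [2, 1]."""
    return [int(x) for x in csv.split(",") if x]


def out_of_sync_replicas(listing):
    """Map broker id -> partitions where it is an assigned replica but not in the ISR."""
    missing = {}
    for line in listing.splitlines():
        m = LINE_RE.search(line)
        if not m:
            continue  # skip anything that is not a partition line
        isr = set(parse_ids(m.group("isr")))
        for broker in parse_ids(m.group("replicas")):
            if broker not in isr:
                missing.setdefault(broker, []).append(int(m.group("partition")))
    return missing


if __name__ == "__main__":
    sample = """\
topic: topic1 partition: 0 leader: 1 replicas: 2,1 isr: 1
topic: topic1 partition: 2 leader: 1 replicas: 1,0 isr: 0,1
topic: topic1 partition: 3 leader: 0 replicas: 2,0 isr: 0
"""
    print(out_of_sync_replicas(sample))  # {2: [0, 3]}
```

Run against the full 12-line listing, this reports only broker 2, for partitions 0, 1, 3, 5, 6, 7, 9 and 11, which are exactly the partitions it is assigned to.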
Thread Dump
------------------------------------------------------------------------
2014-06-19 13:27:39
Full thread dump Java HotSpot(TM) 64-Bit Server VM (20.4-b02 mixed mode):

"RMI TCP Connection(idle)" daemon prio=10 tid=0x00007fccec004800 nid=0x201f waiting on condition [0x00007fcce540f000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x00000000bc30e6c8> (a java.util.concurrent.SynchronousQueue$TransferStack)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:196)
    at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:424)
    at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:323)
    at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:874)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:945)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
    at java.lang.Thread.run(Thread.java:662)

"JMX server connection timeout 30" daemon prio=10 tid=0x00007fccf800a800 nid=0x555 in Object.wait() [0x00007fcce530e000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at com.sun.jmx.remote.internal.ServerCommunicatorAdmin$Timeout.run(ServerCommunicatorAdmin.java:150)
    - locked <0x00000000bc39a640> (a [I)
    at java.lang.Thread.run(Thread.java:662)

"RMI Scheduler(0)" daemon prio=10 tid=0x00007fccf0040000 nid=0x550 waiting on condition [0x00007fcce5510000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x00000000bc2e1fe8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:196)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2025)
    at java.util.concurrent.DelayQueue.take(DelayQueue.java:164)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:609)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:602)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
    at java.lang.Thread.run(Thread.java:662)

"kafka-logflusher-1" daemon prio=10 tid=0x00007fcd102b9800 nid=0x54d waiting on condition [0x00007fcce5813000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
------------------------------------------------------------------------
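Searching a long thread dump by eye is error-prone, so the check for a fetcher thread can be scripted. The sketch below is a minimal Python example that assumes the HotSpot dump format shown above, where each thread header line starts with the quoted thread name. The helper names `thread_names` and `threads_matching` are hypothetical. Note that in Kafka 0.8 the replica fetcher threads appear to be named with a `ReplicaFetcherThread-` prefix rather than `ReplicaFetcherManager`; treat that as an assumption to verify against your broker version, and search for the shorter substring `ReplicaFetcher` to cover both.

```python
import re

# A HotSpot thread header looks like: "thread name" daemon prio=10 tid=...
THREAD_HEADER_RE = re.compile(r'^"(?P<name>[^"]+)"')


def thread_names(dump):
    """Return the names of all threads found in a thread dump, in order."""
    names = []
    for line in dump.splitlines():
        m = THREAD_HEADER_RE.match(line.strip())
        if m:
            names.append(m.group("name"))
    return names


def threads_matching(dump, needle):
    """Return thread names containing `needle`, ignoring case."""
    needle = needle.lower()
    return [name for name in thread_names(dump) if needle in name.lower()]


if __name__ == "__main__":
    sample = '''\
"RMI Scheduler(0)" daemon prio=10 tid=0x00007fccf0040000 nid=0x550
"kafka-logflusher-1" daemon prio=10 tid=0x00007fcd102b9800 nid=0x54d
'''
    print(thread_names(sample))
    print(threads_matching(sample, "ReplicaFetcher"))
```

An empty result for `ReplicaFetcher` on the stuck broker would be consistent with the observation that no fetcher thread is running there.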
I haven't seen any GC pauses in the system. The JMX max lag
("kafka.server":name="([-.\w]+)-MaxLag",type="ReplicaFetcherManager")
for this node is 0.
We have restarted the nodes one after the other, but we still can't get
this node back into the ISR.
Can someone please let me know how to get this node back into the ISR?
Thanks
Arjun Narasimha Kota