Hello,
I did find these exceptions. I issued the loadbalance command on node
192.168.2.10.
INFO [MESSAGING-SERVICE-POOL:3] 2010-03-01 10:34:40,764 TcpConnection.java (line 315) Closing errored connection java.nio.channels.SocketChannel[connected local=/192.168.2.10:55973 remote=/192.168.2.13:7000]
WARN [MESSAGE-DESERIALIZER-POOL:1] 2010-03-01 10:34:40,964 MessagingService.java (line 555) Running on default stage - beware
WARN [MESSAGING-SERVICE-POOL:1] 2010-03-01 10:34:40,964 TcpConnection.java (line 484) Problem reading from socket connected to : java.nio.channels.SocketChannel[connected local=/192.168.2.10:40758 remote=/192.168.2.13:7000]
WARN [MESSAGING-SERVICE-POOL:1] 2010-03-01 10:34:40,964 TcpConnection.java (line 485) Exception was generated at : 03/01/2010 10:34:40 on thread MESSAGING-SERVICE-POOL:1
Reached an EOL or something bizzare occured. Reading from: /192.168.2.13BufferSizeRemaining: 16
java.io.IOException: Reached an EOL or something bizzare occured. Reading from: /192.168.2.13 BufferSizeRemaining: 16
    at org.apache.cassandra.net.io.StartState.doRead(StartState.java:44)
    at org.apache.cassandra.net.io.ProtocolState.read(ProtocolState.java:39)
    at org.apache.cassandra.net.io.TcpReader.read(TcpReader.java:95)
    at org.apache.cassandra.net.TcpConnection$ReadWorkItem.run(TcpConnection.java:445)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
INFO [MESSAGING-SERVICE-POOL:1] 2010-03-01 10:34:40,964 TcpConnection.java (line 315) Closing errored connection java.nio.channels.SocketChannel[connected local=/192.168.2.10:40758 remote=/192.168.2.13:7000]
INFO [MESSAGE-STREAMING-POOL:1] 2010-03-01 10:35:23,171 TcpConnection.java (line 315) Closing errored connection java.nio.channels.SocketChannel[connected local=/192.168.2.10:56728 remote=/192.168.2.13:7000]
INFO [MESSAGE-STREAMING-POOL:1] 2010-03-01 10:35:23,221 FileStreamTask.java (line 79) Exception was generated at : 03/01/2010 10:35:23 on thread MESSAGE-STREAMING-POOL:1
Value too large for defined data type
java.io.IOException: Value too large for defined data type
    at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
    at sun.nio.ch.FileChannelImpl.transferToDirectly(Unknown Source)
    at sun.nio.ch.FileChannelImpl.transferTo(Unknown Source)
    at org.apache.cassandra.net.TcpConnection.stream(TcpConnection.java:226)
    at org.apache.cassandra.net.FileStreamTask.run(FileStreamTask.java:55)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
I can certainly upgrade to 0.6 and try a loadbalance there. Do you
still think that is advisable?
All of my key/value entries are well under 1024 bytes, but I have millions
of them.
Do you think I have a data corruption problem?
Thanks,
Jon
Jonathan Ellis wrote:
> Jon Graham wrote:
>> Thanks Jonathan.
>> It seems like the load balance operation isn't moving. I haven't seen
>> any data file time changes in 2 hours and no location file time
>> changes in over an hour.
>> I can see a tcp port # 7000 opened on the node where I ran the
>> loadbalance command. It is connected to port 39033 on the node
>> receiving the data. The CPU usage on both systems is very low. There
>> are about 10 million records on the node where the load balance
>> command was issued.
>
> Did you check logs for exceptions?
>
>> My six node Cassandra ring consists of tokens for nodes 1-6 of: 0
>> (ascii 0x30) 6 B H O (the letter O) T
>> The load balance target node initially had a token of 'H' (using
>> ordered partitioning). The source node has a key of 0 (ascii 0x30).
>> Most of the data on the source node has keys starting with '/'. Slash
>> falls between tokens T and 0 in my ring, so most of the data landed on
>> the node with token 0, with replicas on the next 2 nodes. My token
>> space is badly divided for the data I have already inserted.
>> Does the initial token value of the load balance target node selected
>> by Cassandra need to be cleared or set to a specific value beforehand
>> to accommodate the load balance data transfer?
>
> No.
>
>> Would I have better luck decommissioning nodes 4,5,6 and trying to
>> bootstrap these nodes one at a time with better initial token values?
>
> LoadBalance is basically sugar for decommission + bootstrap, so no.
>
>> I am looking for a good way to move/split/re-balance data from nodes
>> 1,2,3 to nodes 4, 5, 6 while achieving a better token space
>> distribution.
>
> I would upgrade to the 0.6 beta and try loadbalance again.
> -Jonathan
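
For reference, the token layout described in the quoted message can be checked with a small sketch. This is only an illustration, not Cassandra's code: it assumes keys and tokens compare in plain ASCII order, that a node owns the range (previous token, own token], and that there are 3 copies of each row (the "replicas on the next 2 nodes" mentioned above). The names TOKENS, primary_node and replica_nodes are made up for this example.

```python
from bisect import bisect_left

# Tokens for nodes 1-6 as listed in the message, already in ASCII order.
TOKENS = ["0", "6", "B", "H", "O", "T"]


def primary_node(key, tokens=TOKENS):
    """Return the 1-based node number whose range (prev_token, token] holds key.

    Assumption: plain string ordering stands in for the ordered partitioner.
    """
    i = bisect_left(tokens, key)    # index of the first token >= key
    return (i % len(tokens)) + 1    # keys past the last token wrap to node 1


def replica_nodes(key, copies=3, tokens=TOKENS):
    """Return the primary node followed by the next (copies - 1) nodes on the ring."""
    first = primary_node(key, tokens) - 1
    return [((first + k) % len(tokens)) + 1 for k in range(copies)]


if __name__ == "__main__":
    # '/' is 0x2F, just below '0' (0x30), so these keys sort before every token.
    for key in ["/some/path", "Apple", "Zebra"]:
        print(key, "-> node", primary_node(key), "replicas", replica_nodes(key))
```

Under these assumptions every key starting with '/' maps to node 1 (token 0) with copies on nodes 2 and 3, which matches the imbalance described: nodes 4, 5 and 6 receive none of that data until the tokens are moved.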