Discussion:
Cassandra Needs to Grow Up by Version Five!
Kenneth Brotman
2018-02-19 05:39:01 UTC
Cassandra feels like an unfinished program to me. The problem is not that
it's open source or cutting edge. It's an open source cutting edge program
that lacks some of its basic functionality. We are all stuck addressing
fundamental mechanical tasks for Cassandra because the basic code that would
do that part has not been contributed yet.

Ease of use issues need to be given much more attention. For an
administrator, the ease of use of Cassandra is very poor.

Furthermore, currently Cassandra is an idiot. We have to do everything for
Cassandra. Contrast that with the fact that we are in the dawn of artificial
intelligence.

Software exists to automate tasks for humans, not mechanize humans to
administer tasks for a database. I'm an engineering type. My job is to
apply science and technology to solve real world problems. And that's where
I need an organization's I.T. talent to focus; not in crank starting an
unfinished database.

For example, I should be able to go to any node, replace the cassandra.yaml
file and have a prompt on the display ask me if I want to update all the
yaml files across the cluster. I shouldn't have to manually modify yaml
files on each node or have to create a script for some third party
automation tool to do it.

I should not have to turn off service, clear directories, restart service in
coordination with the other nodes. It's already a computer system. It can
do those things on its own.

How about read repair. First there is something wrong with the name. Maybe
it should be called Consistency Repair. An administrator shouldn't have to
do anything. It should be a behavior of Cassandra that is programmed in. It
should consider the GC setting of each node, calculate how often it has to
run repair, when it should run it so all the nodes aren't trying at the same
time and when other circumstances indicate it should also run it.
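The scheduling idea above can be sketched roughly as follows. This is only a minimal Python illustration, not how Cassandra or any existing repair tool behaves. It assumes "GC setting" refers to gc_grace_seconds (a per-table setting whose default is 864000 seconds, i.e. 10 days) and that every node should start one repair per cycle, with the cycle comfortably shorter than gc_grace_seconds. The function name, the safety_factor parameter and the node addresses are hypothetical.

```python
from datetime import timedelta

# Cassandra's default gc_grace_seconds (10 days).
DEFAULT_GC_GRACE_SECONDS = 864_000


def staggered_repair_offsets(nodes, gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS,
                             safety_factor=0.5):
    """Spread one repair start per node evenly across a repair cycle.

    The cycle length is gc_grace_seconds * safety_factor (safety_factor is an
    assumption of this sketch), so every node starts a repair well inside the
    gc_grace_seconds window and no two nodes start at the same moment.
    """
    if not nodes:
        return {}
    if not 0 < safety_factor <= 1:
        raise ValueError("safety_factor must be in (0, 1]")
    cycle_seconds = int(gc_grace_seconds * safety_factor)
    slot_seconds = cycle_seconds // len(nodes)
    # Sort so the schedule is deterministic regardless of input order.
    return {
        node: timedelta(seconds=index * slot_seconds)
        for index, node in enumerate(sorted(nodes))
    }


# Hypothetical three-node cluster.
offsets = staggered_repair_offsets(["10.0.0.3", "10.0.0.1", "10.0.0.2"])
for node, offset in offsets.items():
    print(node, "starts repair at cycle offset", offset)
```

A real scheduler would also have to account for repair duration, node failures and multiple datacenters, none of which this sketch attempts.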

Certificate management should be automated.

Cluster wide management should be a big theme in any next major release.
What is a major release? How many major releases could a program have
before all the coding for basic stuff like installation, configuration and
maintenance is included!

Finish the basic coding of Cassandra, make it easy to use for
administrators, make it smart, add cluster wide management. Keep Cassandra
competitive or it will soon be the old Model T we all remember fondly.

I ask the Committee to compile a list of all such items, make a plan, and
commit to including the completed and tested code as part of major release
5.0. I further ask that release 4.0 not be delayed and then there be an
unusually short skip to version 5.0.

Kenneth Brotman
Jeff Jirsa
2018-02-19 06:58:11 UTC
Comments inline


> On Feb 18, 2018, at 9:39 PM, Kenneth Brotman <***@yahoo.com.INVALID> wrote:
>
> Cassandra feels like an unfinished program to me. The problem is not that it’s open source or cutting edge. It’s an open source cutting edge program that lacks some of its basic functionality. We are all stuck addressing fundamental mechanical tasks for Cassandra because the basic code that would do that part has not been contributed yet.
>
There’s probably 2-3 reasons why here:

1) Historically the pmc has tried to keep the scope of the project very narrow. It’s a database. We don’t ship drivers. We don’t ship developer tools. We don’t ship fancy UIs. We ship a database. I think for the most part the narrow vision has been for the best, but maybe it’s time to reconsider some of the scope.

Postgres will autovacuum to prevent wraparound (hopefully), but everyone I know running Postgres uses flexible-freeze in cron - sometimes it’s ok to let the database have its opinions and let third party tools fill in the gaps.

2) Cassandra is, by definition, a database for large scale problems. Most of the companies working on/with it tend to be big companies. Big companies often have pre-existing automation that solved the stuff you consider fundamental tasks, so there’s probably nobody actively working on the solved problems that you may consider missing features - for many people they’re already solved.

3) It’s not nearly as basic as you think it is. Datastax seemingly had a multi-person team on opscenter, and while it was better than anything else around last time I used it (before it stopped supporting the OSS version), it left a lot to be desired. It’s probably 2-3 engineers working for a month to have any sort of meaningful, reliable, mostly trivial cluster-managing UI, and I can think of about 10 JIRAs I’d rather see that time be spent on first.

> Ease of use issues need to be given much more attention. For an administrator, the ease of use of Cassandra is very poor.
>
> Furthermore, currently Cassandra is an idiot. We have to do everything for Cassandra. Contrast that with the fact that we are in the dawn of artificial intelligence.
>

And for everything you think is obvious, there’s a 50% chance someone else will have already solved it differently, and your obvious new solution will be seen as an inconvenient assumption and complexity they won’t appreciate. Open source projects get to walk a fine line of trying to be useful without making too many assumptions, being “too” opinionated, or overstepping bounds. We may be too conservative, but it’s very easy to go too far in the opposite direction.

> Software exists to automate tasks for humans, not mechanize humans to administer tasks for a database. I’m an engineering type. My job is to apply science and technology to solve real world problems. And that’s where I need an organization’s I.T. talent to focus; not in crank starting an unfinished database.
>

And that’s why nobody’s done it - we all have bigger problems we’re being paid to solve, and nobody’s felt it necessary. Because it’s not necessary, it’s nice, but not required.

> For example, I should be able to go to any node, replace the cassandra.yaml file and have a prompt on the display ask me if I want to update all the yaml files across the cluster. I shouldn’t have to manually modify yaml files on each node or have to create a script for some third party automation tool to do it.
>
I don’t see this ever happening. Your config management already pushes files around your infrastructure, Cassandra doesn’t need to do it.
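
As a rough illustration of the kind of check a config management tool performs before pushing a file, here is a minimal Python sketch that compares a reference config against each node's copy by checksum. It is an assumption-laden example, not part of Cassandra or any particular tool: the function names and node names are hypothetical, the configs are plain in-memory strings, and a real cassandra.yaml also carries per-node settings such as listen_address, so whole-file comparison is a simplification.

```python
import hashlib


def config_digest(text: str) -> str:
    """Return the SHA-256 hex digest of a config file's contents."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def nodes_out_of_date(reference_text: str, node_configs: dict) -> list:
    """List the nodes whose config differs from the reference copy.

    node_configs maps a (hypothetical) node name to the text of its config.
    """
    reference = config_digest(reference_text)
    return sorted(
        node for node, text in node_configs.items()
        if config_digest(text) != reference
    )


# Hypothetical example data.
reference = "cluster_name: demo\nnum_tokens: 256\n"
configs = {
    "node1": reference,
    "node2": "cluster_name: demo\nnum_tokens: 16\n",
}
print(nodes_out_of_date(reference, configs))
```

Fetching the files and pushing the new version is exactly the part that existing config management already handles.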

> I should not have to turn off service, clear directories, restart service in coordination with the other nodes. It’s already a computer system. It can do those things on its own.
>

The only time you should be doing this is when you’re wiping nodes from failed bootstrap, and that stopped being required in 2.2.

> How about read repair. First there is something wrong with the name. Maybe it should be called Consistency Repair. An administrator shouldn’t have to do anything. It should be a behavior of Cassandra that is programmed in. It should consider the GC setting of each node, calculate how often it has to run repair, when it should run it so all the nodes aren’t trying at the same time and when other circumstances indicate it should also run it.
>
There’s a good argument to be made that something like Reaper should be shipped with Cassandra. There’s another good argument that most tools like this end up needing some sort of leader election for scheduling and that goes against a lot of the fundamental assumptions in Cassandra (all nodes are equal, etc) - solving that problem is probably at least part of why you haven’t seen them built into the db. “Leader election is easy” you’ll say, and I’ll laugh and tell you about users I know who have DCs go offline for weeks at a time.
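
To make the offline-datacenter concern concrete, here is a minimal Python sketch. It assumes a strict-majority election rule over all known nodes, which is an assumption of this example (the thread does not name an election scheme), and the cluster layout and function name are hypothetical.

```python
def can_elect_leader(total_nodes: int, reachable_nodes: int) -> bool:
    """Strict-majority rule: more than half of all known nodes must be reachable."""
    return reachable_nodes > total_nodes // 2


# Hypothetical cluster: three datacenters with four nodes each.
nodes_per_dc = {"dc1": 4, "dc2": 4, "dc3": 4}
total = sum(nodes_per_dc.values())

# dc2 and dc3 are offline, so only dc1's nodes can see each other.
reachable = nodes_per_dc["dc1"]
print(can_elect_leader(total, reachable))  # prints False
```

Under that rule, a scheduler in the surviving datacenter could not elect a leader, and so could not schedule repairs, for as long as the other datacenters stayed offline.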

> Certificate management should be automated.
>
Stefan (in particular) has done a fair amount of work on this, but I’d bet 90% of users don’t use ssl and genuinely don’t care.

> Cluster wide management should be a big theme in any next major release.
>
Na. Stability and testing should be a big theme in the next major release.

> What is a major release? How many major releases could a program have before all the coding for basic stuff like installation, configuration and maintenance is included!
>
> Finish the basic coding of Cassandra, make it easy to use for administrators, make it smart, add cluster wide management. Keep Cassandra competitive or it will soon be the old Model T we all remember fondly.
>

Let’s keep some perspective. Most of us came to Cassandra from rdbms worlds where we were building solutions out of a bunch of master/slave MySQL / Postgres type databases. I started using Cassandra 0.6 when I needed to store something like 400gb/day in 200whatever on spinning disks when 100gb felt like a “big” database, and the thought of writing runbooks and automation to automatically pick the most up to date slave as the new master, promote it, repoint the other slave to the new master, then reformat the old master and add it as a new slave without downtime and without potentially deleting the company’s whole dataset sounded awful. Cassandra solved that problem, at the cost of maintaining a few yaml (then xml) files. Yes there are rough edges - they get slightly less rough on each new release. Can we do better? Sure, use your engineering time and send some patches. But the basic stuff is the nuts and bolts of the database: I care way more about streaming and compaction than I’ll ever care about installation.

> I ask the Committee to compile a list of all such items, make a plan, and commit to including the completed and tested code as part of major release 5.0. I further ask that release 4.0 not be delayed and then there be an unusually short skip to version 5.0.
>

The committers are working their ass off on all sorts of hard problems. Some of those are probably even related to Cassandra. If you have an idea, open a JIRA. If you have time, send a patch. Or review a patch. But don’t expect a bunch of people to set down work on optimizing the database to work on packaging and installation, because there’s no ROI in it for 99% of the existing committers: we’re working on the database to solve problems, and installation isn’t one of those problems.
Kenneth Brotman
2018-02-19 09:01:10 UTC
Comments inline

>-----Original Message-----
>From: Jeff Jirsa [mailto:***@gmail.com]
>Sent: Sunday, February 18, 2018 10:58 PM
>To: ***@cassandra.apache.org
>Cc: ***@cassandra.apache.org
>Subject: Re: Cassandra Needs to Grow Up by Version Five!
>
>Comments inline
>
>
>> On Feb 18, 2018, at 9:39 PM, Kenneth Brotman <***@yahoo.com.INVALID> wrote:
>>
>> Cassandra feels like an unfinished program to me. The problem is not that it’s open source or cutting edge. It’s an open source cutting edge program that lacks some of its basic functionality. We are all stuck addressing fundamental mechanical tasks for Cassandra because the basic code that would do that part has not been contributed yet.
>>
>There’s probably 2-3 reasons why here:
>
>1) Historically the pmc has tried to keep the scope of the project very narrow. It’s a database. We don’t ship drivers. We don’t ship developer tools. We don’t ship fancy UIs. We ship a database. I think for the most part the narrow vision has been for the best, but maybe it’s time to reconsider some of the scope.
>
>Postgres will autovacuum to prevent wraparound (hopefully), but everyone I know running Postgres uses flexible-freeze in cron - sometimes it’s ok to let the database have its opinions and let third party tools fill in the gaps.
>

I can appreciate the desire to stay in scope. I believe usability is king. When users have to learn the database, then learn what they have to automate, then learn an automation tool and then use the automation tool to do something that is as fundamental as the fundamental tasks I described, then something is missing from the database itself that is adversely affecting usability - and that is very bad. Where those big companies need to calculate the ROI is in the cost of acquiring or training the next group of users. Consider how steep the learning curve is for new users. Consider the business case for improving ease of use.

>2) Cassandra is, by definition, a database for large scale problems. Most of the companies working on/with it tend to be big companies. Big companies often have pre-existing automation that solved the stuff you consider fundamental tasks, so there’s probably nobody actively working on the solved problems that you may consider missing features - for many people they’re already solved.
>

I could be wrong but it sounds like a lot of the code work is done, and if the companies would take the time to contribute more code, then the rest of the code needed could be generated easily.

>3) It’s not nearly as basic as you think it is. Datastax seemingly had a multi-person team on opscenter, and while it was better than anything else around last time I used it (before it stopped supporting the OSS version), it left a lot to be desired. It’s probably 2-3 engineers working for a month to have any sort of meaningful, reliable, mostly trivial cluster-managing UI, and I can think of about 10 JIRAs I’d rather see that time be spent on first.

How about 6-9 engineers working 12 months a year on it, then? I'm not kidding. For a big company with revenues in the tens of billions or more, and a heavy use of Cassandra nodes, it's easy to make a case for having a full time person or more that involved. They aren't paying for using the open source code that is Cassandra. What would the licensing fees be for a big company if the costs were like what Microsoft or Oracle would charge for their enterprise level relational database? What's the contribution of one or two people in comparison?

>> Ease of use issues need to be given much more attention. For an administrator, the ease of use of Cassandra is very poor.
>>
>>Furthermore, currently Cassandra is an idiot. We have to do everything for Cassandra. Contrast that with the fact that we are in the dawn of artificial intelligence.
>>
>
>And for everything you think is obvious, there’s a 50% chance someone else will have already solved it differently, and your obvious new solution will be seen as an inconvenient assumption and complexity they won’t appreciate. Open source projects get to walk a fine line of trying to be useful without making too many assumptions, being “too” opinionated, or overstepping bounds. We may be too conservative, but it’s very easy to go too far in the opposite direction.
>

I appreciate that but when such concerns result in inaction instead of resolution that is no good.

>> Software exists to automate tasks for humans, not mechanize humans to administer tasks for a database. I’m an engineering type. My job is to apply science and technology to solve real world problems. And that’s where I need an organization’s I.T. talent to focus; not in crank starting an unfinished database.
>>
>
>And that’s why nobody’s done it - we all have bigger problems we’re being paid to solve, and nobody’s felt it necessary. Because it’s not necessary, it’s nice, but not required.
>

Of course you would say that, you're Jeff Jirsa. In apprenticeship speak, you’re a master. It's the classic challenge of trying to get a master to see the legitimate issues of the apprentices. I do appreciate the time you give to answer posts to the groups, like this post. So I don't want you to take anything the wrong way. Where it's going to bite everyone is in the future adoption rate. It has to be addressed.

[snip]

>> Certificate management should be automated.
>>
>Stefan (in particular) has done a fair amount of work on this, but I’d bet 90% of users don’t use ssl and genuinely don’t care.
>

I didn't realize. Could I trouble you for a link so I could get up to speed?

>> Cluster wide management should be a big theme in any next major release.
>>
>Na. Stability and testing should be a big theme in the next major release.
>

Double Na on that one Jeff. I think you have a concern there about the need to test sufficiently to ensure the stability of the next major release. That makes perfect sense - for every release, especially the major ones. Continuous improvement, for example, is not a phase of development. CI should be in everything, in every phase. Stability and testing are a part of every release, not just one. A major release should be a nice step from the previous major release though.

>> What is a major release? How many major releases could a program have before all the coding for basic stuff like installation, configuration and maintenance is included!
>>
>> Finish the basic coding of Cassandra, make it easy to use for administrators, make it smart, add cluster wide management. Keep Cassandra competitive or it will soon be the old Model T we all remember fondly.
>>
>
>Let’s keep some perspective. Most of us came to Cassandra from rdbms worlds where we were building solutions out of a bunch of master/slave MySQL / Postgres type databases. I started using Cassandra 0.6 when I needed to store something like 400gb/day in 200whatever on spinning disks when 100gb felt like a “big” database, and the thought of writing runbooks and automation to automatically pick the most up to date slave as the new master, promote it, repoint the other slave to the new master, then reformat the old master and add it as a new slave without downtime and without potentially deleting the company’s whole dataset sounded awful. Cassandra solved that problem, at the cost of maintaining a few yaml (then xml) files. Yes there are rough edges - they get slightly less rough on each new release. Can we do better? Sure, use your engineering time and send some patches. But the basic stuff is the nuts and bolts of the database: I care way more about streaming and compaction than I’ll ever care about installation.
>

I can relate. I was studying the enterprise level MS SQL Server stuff. I noticed exactly what you described. I decided maybe I'll just do other stuff and wait for things to develop more. I'm very excited about the way Cassandra addresses things. Streaming and compaction - very good. I'm glad. Items related to usability are not optional though.

>> I ask the Committee to compile a list of all such items, make a plan, and commit to including the completed and tested code as part of major release 5.0. I further ask that release 4.0 not be delayed and then there be an unusually short skip to version 5.0.
>>
>
>The committers are working their ass off on all sorts of hard problems. Some of those are probably even related to Cassandra. If you have an idea, open a JIRA. If you have time, send a patch. Or review a patch. But don’t expect a bunch of people to set down work on optimizing the database to work on packaging and installation, because there’s no ROI in it for 99% of the existing committers: we’re working on the database to solve problems, and installation isn’t one of those problems.

I'm sure they are working very hard on all kinds of hard problems. I actually wrote "Committee", not "committers". There is an obvious shortage of contributors when you consider the size of the organizations using Cassandra. That leaves the burden on an unfair few. Installation, or more generally I would say usability, is not that big a problem for the big companies out there. Good for them.

Try telling a new organization or a modest size organization that is struggling to manage their Cassandra cluster that usability is not a big problem. It truly is a big problem for many stakeholders of Cassandra. It needs to be given a bigger priority. Hopefully others will weigh in.

Kenneth Brotman


---------------------------------------------------------------------
To unsubscribe, e-mail: user-***@cassandra.apache.org
For additional commands, e-mail: user-***@cassandra.apache.org
Jeff Jirsa
2018-02-19 17:10:06 UTC
There's a lot of things below I disagree with, but it's ok. I convinced
myself not to nit-pick every point.

https://issues.apache.org/jira/browse/CASSANDRA-13971 has some of Stefan's
work with cert management

Beyond that, I encourage you to do what Michael suggested: open JIRAs for
things you care strongly about, work on them if you have time. Sometime
this year we'll schedule a NGCC (Next Generation Cassandra Conference)
where we talk about future project work and direction, I encourage you to
attend if you're able (I encourage anyone who cares about the direction of
Cassandra to attend, it'll probably be either free or very low cost, just to
cover a venue and some food). If nothing else, you'll meet some of the
teams who are working on the project, and learn why they've selected the
projects on which they're working. You'll have an opportunity to pitch your
vision, and maybe you can talk some folks into helping out.

- Jeff




Kenneth Brotman
2018-02-19 18:43:15 UTC
Well said. Very fair. I wouldn’t mind hearing from others still. You’re a good guy!



Kenneth Brotman



From: Jeff Jirsa [mailto:***@gmail.com]
Sent: Monday, February 19, 2018 9:10 AM
To: cassandra
Cc: Cassandra DEV
Subject: Re: Cassandra Needs to Grow Up by Version Five!



There's a lot of things below I disagree with, but it's ok. I convinced myself not to nit-pick every point.



https://issues.apache.org/jira/browse/CASSANDRA-13971 has some of Stefan's work with cert management



Beyond that, I encourage you to do what Michael suggested: open JIRAs for things you care strongly about, work on them if you have time. Sometime this year we'll schedule a NGCC (Next Generation Cassandra Conference) where we talk about future project work and direction, I encourage you to attend if you're able (I encourage anyone who cares about the direction of Cassandra to attend, it's probably be either free or very low cost, just to cover a venue and some food). If nothing else, you'll meet some of the teams who are working on the project, and learn why they've selected the projects on which they're working. You'll have an opportunity to pitch your vision, and maybe you can talk some folks into helping out.



- Jeff









On Mon, Feb 19, 2018 at 1:01 AM, Kenneth Brotman <***@yahoo.com.invalid> wrote:

Comments inline

>-----Original Message-----
>From: Jeff Jirsa [mailto:***@gmail.com]
>Sent: Sunday, February 18, 2018 10:58 PM
>To: ***@cassandra.apache.org
>Cc: ***@cassandra.apache.org
>Subject: Re: Cassandra Needs to Grow Up by Version Five!
>
>Comments inline
>
>
>> On Feb 18, 2018, at 9:39 PM, Kenneth Brotman <***@yahoo.com.INVALID> wrote:
>>
>> Cassandra feels like an unfinished program to me. The problem is not that it’s open source or cutting edge. It’s an open source cutting edge program that lacks some of its basic functionality. We are all stuck addressing fundamental mechanical tasks for Cassandra because the basic code that would do that part has not been contributed yet.
>>
>There’s probably 2-3 reasons why here:
>
>1) Historically the pmc has tried to keep the scope of the project very narrow. It’s a database. We don’t ship drivers. We don’t ship developer tools. We don’t ship fancy UIs. We ship a database. I think for the most part the narrow vision has been for the best, but maybe it’s time to reconsider some of the scope.
>
>Postgres will autovacuum to prevent wraparound (hopefully), but everyone I know running Postgres uses flexible-freeze in cron - sometimes it’s ok to let the database have its opinions and let third party tools fill in the gaps.
>

I can appreciate the desire to stay in scope. I believe usability is the King. When users have to learn the database, then learn what they have to automate, then learn an automation tool and then use the automation tool to do something that is as fundamental as the fundamental tasks I described, then something is missing from the database itself that is adversely affecting usability - and that is very bad. Where those big companies need to calculate the ROI is in the cost of acquiring or training the next group of users. Consider how steep the learning curve is for new users. Consider the business case for improving ease of use.

>2) Cassandra is, by definition, a database for large scale problems. Most of the companies working on/with it tend to be big companies. Big companies often have pre-existing automation that solved the stuff you consider fundamental tasks, so there’s probably nobody actively working on the solved problems that you may consider missing features - for many people they’re already solved.
>

I could be wrong but it sounds like a lot of the code work is done, and if the companies would take the time to contribute more code, then the rest of the code needed could be generated easily.

>3) It’s not nearly as basic as you think it is. Datastax seemingly had a multi-person team on opscenter, and while it was better than anything else around last time I used it (before it stopped supporting the OSS version), it left a lot to be desired. It’s probably 2-3 engineers working for a month to have any sort of meaningful, reliable, mostly trivial cluster-managing UI, and I can think of about 10 JIRAs I’d rather see that time be spent on first.

How about 6-9 engineers working 12 months a year on it then? I'm not kidding. For a big company with revenues in the tens of billions or more, and a heavy use of Cassandra nodes, it's easy to make a case for having a full-time person or more that involved. They aren't paying for using the open source code that is Cassandra. What would the licensing fees be for a big company if the costs were like what Microsoft or Oracle would charge for their enterprise-level relational database? What's the contribution of one or two people in comparison?

>> Ease of use issues need to be given much more attention. For an administrator, the ease of use of Cassandra is very poor.
>>
>>Furthermore, currently Cassandra is an idiot. We have to do everything for Cassandra. Contrast that with the fact that we are in the dawn of artificial intelligence.
>>
>
>And for everything you think is obvious, there’s a 50% chance someone else will have already solved it differently, and your obvious new solution will be seen as an inconvenient assumption and complexity they won’t appreciate. Open source projects get to walk a fine line of trying to be useful without making too many assumptions, being “too” opinionated, or overstepping bounds. We may be too conservative, but it’s very easy to go too far in the opposite direction.
>

I appreciate that but when such concerns result in inaction instead of resolution that is no good.

>> Software exists to automate tasks for humans, not mechanize humans to administer tasks for a database. I’m an engineering type. My job is to apply science and technology to solve real world problems. And that’s where I need an organization’s I.T. talent to focus; not in crank starting an unfinished database.
>>
>
>And that’s why nobody’s done it - we all have bigger problems we’re being paid to solve, and nobody’s felt it necessary. Because it’s not necessary, it’s nice, but not required.
>

Of course you would say that, you're Jeff Jirsa. In apprenticeship speak, you’re a master. It's the classic challenge of trying to get a master to see the legitimate issues of the apprentices. I do appreciate the time you give to answer posts to the groups, like this post, so I don't want you to take anything the wrong way. Where it's going to bite everyone is in the future adoption rate. It has to be addressed.

[snip]

>> Certificate management should be automated.
>>
>Stefan (in particular) has done a fair amount of work on this, but I’d bet 90% of users don’t use ssl and genuinely don’t care.
>

I didn't realize. Could I trouble you for a link so I could get up to speed?

>> Cluster wide management should be a big theme in any next major release.
>>
>Na. Stability and testing should be a big theme in the next major release.
>

Double Na on that one, Jeff. I think you have a concern there about the need to test sufficiently to ensure the stability of the next major release. That makes perfect sense for every release, especially the major ones. Continuous improvement is not a phase of development, for example. CI should be in everything, in every phase. Stability and testing are a part of every release, not just one. A major release should be a nice step from the previous major release, though.

>> What is a major release? How many major releases could a program have before all the coding for basic stuff like installation, configuration and maintenance is included!
>>
>> Finish the basic coding of Cassandra, make it easy to use for administrators, make it smart, add cluster wide management. Keep Cassandra competitive or it will soon be the old Model T we all remember fondly.
>>
>
>Let’s keep some perspective. Most of us came to Cassandra from rdbms worlds where we were building solutions out of a bunch of master/slave MySQL / Postgres type databases. I started using Cassandra 0.6 when I needed to store something like 400gb/day in 200whatever on spinning disks when 100gb felt like a “big” database, and the thought of writing runbooks and automation to automatically pick the most up to date slave as the new master, promote it, repoint the other slave to the new master, then reformat the old master and add it as a new slave without downtime and without potentially deleting the company’s whole dataset sounded awful. Cassandra solved that problem, at the cost of maintaining a few yaml (then xml) files. Yes there are rough edges - they get slightly less rough on each new release. Can we do better? Sure, use your engineering time and send some patches. But the basic stuff is the nuts and bolts of the database: I care way more about streaming and compaction than I’ll ever care about installation.
>

I can relate. I was studying the enterprise level MS SQL Server stuff. I noticed exactly what you described. I decided maybe I'll just do other stuff and wait for things to develop more. I'm very excited about the way Cassandra addresses things. Streaming and compaction - very good. I'm glad. Items related to usability are not optional though.

>> I ask the Committee to compile a list of all such items, make a plan, and commit to including the completed and tested code as part of major release 5.0. I further ask that release 4.0 not be delayed and then there be an unusually short skip to version 5.0.
>>
>
>The committers are working their ass off on all sorts of hard problems. Some of those are probably even related to Cassandra. If you have an idea, open a JIRA. If you have time, send a patch. Or review a patch. But don’t expect a bunch of people to set down work on optimizing the database to work on packaging and installation, because there’s no ROI in it for 99% of the existing committers: we’re working on the database to solve problems, and installation isn’t one of those problems.

I'm sure they are working very hard on all kinds of hard problems. I actually wrote "Committee", not "committers". There is an obvious shortage of contributors when you consider the size of the organizations using Cassandra. That leaves the burden on an unfair few. Installation, or more generally I would say usability, is not that big a problem for the big companies out there. Good for them.

Ask a new organization or a modest size organization that is struggling to manage their Cassandra cluster whether usability is not a big problem. It truly is a big problem for many stakeholders of Cassandra. It needs to be given a bigger priority. Hopefully others will weigh in.

Kenneth Brotman


Kenneth Brotman
2018-02-20 00:55:56 UTC
Permalink
Jeff, you helped me figure out what I was missing. It just took me a day to digest what you wrote. I’m coming over from another type of engineering. I didn’t know and it’s not really documented. Cassandra runs in a data center. Nowadays that means the nodes are going to be in managed containers, Docker containers, managed by Kubernetes, Mesos or something, and for that reason anyone operating Cassandra in a real world setting would not encounter the issues I raised in the way I described.



Shouldn’t the architectural diagrams people reference indicate that in some way? That would have helped me.



Kenneth Brotman



James Briggs
2018-02-20 04:56:25 UTC
Permalink
Kenneth:
What you said is not wrong.

Vertica and Riak are examples of distributed databases that don't require hand-holding.

Cassandra is for Java-programmer DIYers, or more often Datastax clients, at this point.
Thanks, James.

From: Kenneth Brotman <***@yahoo.com.INVALID>
To: ***@cassandra.apache.org
Cc: ***@cassandra.apache.org
Sent: Monday, February 19, 2018 4:56 PM
Subject: RE: Cassandra Needs to Grow Up by Version Five!

#yiv0297673896 #yiv0297673896 -- _filtered #yiv0297673896 {font-family:Calibri;panose-1:2 15 5 2 2 2 4 3 2 4;} _filtered #yiv0297673896 {font-family:Tahoma;panose-1:2 11 6 4 3 5 4 4 2 4;}#yiv0297673896 #yiv0297673896 p.yiv0297673896MsoNormal, #yiv0297673896 li.yiv0297673896MsoNormal, #yiv0297673896 div.yiv0297673896MsoNormal {margin:0in;margin-bottom:.0001pt;font-size:12.0pt;}#yiv0297673896 a:link, #yiv0297673896 span.yiv0297673896MsoHyperlink {color:blue;text-decoration:underline;}#yiv0297673896 a:visited, #yiv0297673896 span.yiv0297673896MsoHyperlinkFollowed {color:purple;text-decoration:underline;}#yiv0297673896 p.yiv0297673896MsoAcetate, #yiv0297673896 li.yiv0297673896MsoAcetate, #yiv0297673896 div.yiv0297673896MsoAcetate {margin:0in;margin-bottom:.0001pt;font-size:8.0pt;}#yiv0297673896 span.yiv0297673896EmailStyle17 {color:#1F497D;}#yiv0297673896 span.yiv0297673896BalloonTextChar {}#yiv0297673896 span.yiv0297673896EmailStyle20 {color:#1F497D;}#yiv0297673896 .yiv0297673896MsoChpDefault {font-size:10.0pt;} _filtered #yiv0297673896 {margin:1.0in 1.0in 1.0in 1.0in;}#yiv0297673896 div.yiv0297673896WordSection1 {}#yiv0297673896 Jeff, you helped me figure out what I was missing.  It just took me a day to digest what you wrote.  I’m coming over from another type of engineering.  I didn’t know and it’s not really documented.  Cassandra runs in a data center.  Now days that means the nodes are going to be in managed containers, Docker containers, managed by Kerbernetes,  Meso or something, and for that reason anyone operating Cassandra in a real world setting would not encounter the issues I raised in the way I described.  Shouldn’t the architectural diagrams people reference indicate that in some way?  That would have help me.  Kenneth Brotman  From: Kenneth Brotman [mailto:***@yahoo.com]
Sent: Monday, February 19, 2018 10:43 AM
To: '***@cassandra.apache.org'
Cc: '***@cassandra.apache.org'
Subject: RE: Cassandra Needs to Grow Up by Version Five!  Well said.  Very fair.  I wouldn’t mind hearing from others still.  You’re a good guy!  Kenneth Brotman  From: Jeff Jirsa [mailto:***@gmail.com]
Sent: Monday, February 19, 2018 9:10 AM
To: cassandra
Cc: Cassandra DEV
Subject: Re: Cassandra Needs to Grow Up by Version Five!  There's a lot of things below I disagree with, but it's ok. I convinced myself not to nit-pick every point.  https://issues.apache.org/jira/browse/CASSANDRA-13971 has some of Stefan's work with cert management  Beyond that, I encourage you to do what Michael suggested: open JIRAs for things you care strongly about, work on them if you have time. Sometime this year we'll schedule a NGCC (Next Generation Cassandra Conference) where we talk about future project work and direction, I encourage you to attend if you're able (I encourage anyone who cares about the direction of Cassandra to attend, it's probably be either free or very low cost, just to cover a venue and some food). If nothing else, you'll meet some of the teams who are working on the project, and learn why they've selected the projects on which they're working. You'll have an opportunity to pitch your vision, and maybe you can talk some folks into helping out.   - Jeff        On Mon, Feb 19, 2018 at 1:01 AM, Kenneth Brotman <***@yahoo.com.invalid> wrote:Comments inline

>-----Original Message-----
>From: Jeff Jirsa [mailto:***@gmail.com]
>Sent: Sunday, February 18, 2018 10:58 PM
>To: ***@cassandra.apache.org
>Cc: ***@cassandra.apache.org
>Subject: Re: Cassandra Needs to Grow Up by Version Five!
>
>Comments inline
>
>
>> On Feb 18, 2018, at 9:39 PM, Kenneth Brotman <***@yahoo.com.INVALID> wrote:
>>
> >Cassandra feels like an unfinished program to me. The problem is not that it’s open source or cutting edge.  It’s an open source cutting edge program that lacks some of its basic functionality.  We are all stuck addressing fundamental mechanical tasks for Cassandra because the basic code that would do that part has not been contributed yet.
>>
>There’s probably 2-3 reasons why here:
>
>1) Historically the pmc has tried to keep the scope of the project very narrow. It’s a database. We don’t ship drivers. We don’t ship developer tools. We don’t ship fancy UIs. We ship a database. I think for the most part the narrow vision has been for the best, but maybe it’s time to reconsider some of the scope.
>
>Postgres will autovacuum to prevent wraparound (hopefully),  but everyone I know running Postgres uses flexible-freeze in cron - sometimes it’s ok to let the database have its opinions and let third party tools fill in the gaps.
>

I can appreciate the desire to stay in scope.  I believe usability is the King.  When users have to learn the database, then learn what they have to automate, then learn an automation tool and then use the automation tool to do something that is as fundamental as the fundamental tasks I described, then something is missing from the database itself that is adversely affecting usability - and that is very bad.  Where those big companies need to calculate the ROI is in the cost of acquiring or training the next group of users.  Consider how steep the learning curve is for new users.  Consider the business case for improving ease of use.

>2) Cassandra is, by definition, a database for large scale problems. Most of the companies working on/with it tend to be big companies. Big companies often have pre-existing automation that solved the stuff you consider fundamental tasks, so there’s probably nobody actively working on the solved problems that you may consider missing features - for many people they’re already solved.
>

I could be wrong but it sounds like a lot of the code work is done, and if the companies would take the time to contribute more code, then the rest of the code needed could be generated easily.

>3) It’s not nearly as basic as you think it is. Datastax seemingly had a multi-person team on opscenter, and while it was better than anything else around last time I used it (before it stopped supporting the OSS version), it left a lot to be desired. It’s probably 2-3 engineers working for a month  to have any sort of meaningful, reliable, mostly trivial cluster-managing UI, and I can think of about 10 JIRAs I’d rather see that time be spent on first.

How about 6-9 engineers working 12 months a year on it then.  I'm not kidding.  For a big company with revenues in the tens of billions or more, and a heavy use of Cassandra nodes, it's easy to make a case for having a full time person or more that involved.  They aren't paying for using the open source code that is Cassandra.  Let's see what would the licensing fees be for a big company if the costs where like Microsoft or Oracle would charge for their enterprise level relational database?   What's the contribution of one or two people in comparison.

>> Ease of use issues need to be given much more attention.  For an administrator, the ease of use of Cassandra is very poor.
>>
>>Furthermore, currently Cassandra is an idiot.  We have to do everything for Cassandra. Contrast that with the fact that we are in the dawn of artificial intelligence.
>>
>
>And for everything you think is obvious, there’s a 50% chance someone else will have already solved differently, and your obvious new solution will be seen as an inconvenient assumption and complexity they won’t appreciate. Open source projects get to walk a fine line of trying to be useful without making too many assumptions, being “too” opinionated, or overstepping bounds. We may be too conservative, but it’s very easy to go too far in the opposite direction.
>

I appreciate that but when such concerns result in inaction instead of resolution that is no good.

>> Software exists to automate tasks for humans, not mechanize humans to administer tasks for a database.  I’m an engineering type.  My job is to apply science and technology to solve real world problems.  And that’s where I need an organization’s I.T. talent to focus; not in crank starting an unfinished database.
>>
>
>And that’s why nobody’s done it - we all have bigger problems we’re being paid to solve, and nobody’s felt it necessary. Because it’s not necessary, it’s nice, but not required.
>

Of course you would say that, you're Jeff Jirsa.  In apprenticeship speak, you’re a master.  It's the classic challenge of trying to  get a master to see the legitimate issues of the apprentices.  I do appreciate the time you give to answer posts to the groups , like this post.  So I don't want you to take anything the wrong way.  Where it's going to bit everyone is in the future adoption rate.  It has to be addressed.

[snip]

>> Certificate management should be automated.
>>
>Stefan (in particular) has done a fair amount of work on this, but I’d bet 90% of users don’t use ssl and genuinely don’t care.
>

I didn't realize.  Could I trouble you for a link so I could get up to speed?

>> Cluster wide management should be a big theme in any next major release.
>>
>Na. Stability and testing should be a big theme in the next major release.
>

Double Na on that one Jeff.  I think you have a concern there about the need to test sufficiently to ensure the stability of the next major release.  That makes perfect sense.- for every release, especially the major ones.  Continuous improvement is not a phase of development for example.  CI should be in everything, in every phase.  Stability and testing a part of every release not just one.  A major release should be a nice step from the previous major release though.

>> What is a major release?  How many major releases could a program have before all the coding for basic stuff like installation, configuration and maintenance is included!
>>
>> Finish the basic coding of Cassandra, make it easy to use for administrators, make is smart, add cluster wide management.  Keep Cassandra competitive or it will soon be the old Model T we all remember fondly.
>>
>
>Let’s keep some perspective. Most of us came to Cassandra from rdbms worlds where we were building solutions out of a bunch of master/slave MySQL / Postgres type databases. I started using Cassandra 0.6 when I needed to store something like 400gb/day in 200whatever on spinning disks when 100gb felt like a “big” database, and the thought of writing runbooks and automation to automatically pick the most up to date slave as the new master, promote it, repoint the other slave to the new master, then reformat the old master and add it as a new slave without downtime and without potentially deleting the company’s whole dataset sounded awful. Cassandra solved that problem, at the cost of maintaining a few yaml (then xml) files. Yes there are rough edges - they get slightly less rough on each new release. Can we do better? Sure, use your engineering time and send some patches. But the basic stuff is the nuts and bolts of the database: I care way more about streaming and compaction than I’ll ever care about installation.
>

I can relate.  I was studying the enterprise level MS SQL Server stuff. I noticed exactly what you described.  I decided maybe I'll just do other stuff and wait for things to develop more.  I'm very excited about the way Cassandra addresses things.  Streaming and compaction - very good.  I'm glad.  Items related to usability are not optional though.

>> I ask the Committee to compile a list of all such items, make a plan, and commit to including the completed and tested code as part of major release 5.0.  I further ask that release 4.0 not be delayed and then there be an unusually short skip to version 5.0.
>>
>
>The committers are working their ass off on all sorts of hard problems. Some of those are probably even related to Cassandra. If you have an idea, open a JIRA. If you have time, send a patch. Or review a patch. But don’t expect a bunch of people to set down work on optimizing the database to work on packaging and installation, because there’s no ROI in it for 99% of the existing committers: we’re working on the database to solve problems, and installation isn’t one of those problems.

I'm sure they are working very hard on all kinds of hard problems.  I actually wrote "Committee", not "committers".  There is an obvious shortage of contributors when you consider the size of the organizations using Cassandra.  That leaves the burden on an unfair few.  Installation, or more generally I would say usability, is not that big a problem for the big companies out there. Good for them.

Ask a new organization or a modest size organization that is struggling to manage their Cassandra cluster whether usability is a big problem. It truly is a big problem for many stakeholders of Cassandra. It needs to be given a bigger priority.  Hopefully others will weigh in.

Kenneth Brotman


---------------------------------------------------------------------
To unsubscribe, e-mail: user-***@cassandra.apache.org
For additional commands, e-mail: user-***@cassandra.apache.org  
Daniel Hölbling-Inzko
2018-02-20 09:28:13 UTC
Hi,

I have to add my own two cents here as the main thing that keeps me from
really running Cassandra is the amount of pain running it incurs.
Not so much because it's actually painful but because the tools are so
different and the documentation and best practices are scattered across a
dozen outdated DataStax articles and this mailing list, etc. We've been
hesitant (although our use case is perfect for using Cassandra) to deploy
Cassandra to any critical systems as even after a year of running it we
still don't have the operational experience to confidently run critical
systems with it.

Simple things like a foolproof / safe cluster-wide S3 Backup (like
Elasticsearch has it) would for example solve a TON of issues for new
people. I don't need it auto-scheduled or something, but having to
configure cron jobs across the whole cluster is a pain in the ass for small
teams.
To be honest, even the way snapshots are done right now is already super
painful. Every other system I operated so far will just create one backup
folder I can export; in C* the backup is scattered across a bunch of
different keyspace folders, etc. Needless to say, it took a while until
I trusted my backup scripts fully.

And especially for a Database I believe Backup/Restore needs to be a
non-issue that's documented front and center. If not, smaller teams just
don't have the resources to dedicate to learning and building the tools
around it.

Now that the team is getting larger we could spare the resources to operate
these things, but switching from a well-understood RDBMS schema to
Cassandra is now incredibly hard and will probably take years.

greetings Daniel

On Tue, 20 Feb 2018 at 05:56 James Briggs <***@yahoo.com.invalid>
wrote:

> Kenneth:
>
> What you said is not wrong.
>
> Vertica and Riak are examples of distributed databases that don't require
> hand-holding.
>
> Cassandra is for Java-programmer DIYers, or more often Datastax clients,
> at this point.
> Thanks, James.
>
> ------------------------------
> *From:* Kenneth Brotman <***@yahoo.com.INVALID>
> *To:* ***@cassandra.apache.org
> *Cc:* ***@cassandra.apache.org
> *Sent:* Monday, February 19, 2018 4:56 PM
>
> *Subject:* RE: Cassandra Needs to Grow Up by Version Five!
>
> Jeff, you helped me figure out what I was missing. It just took me a day
> to digest what you wrote. I’m coming over from another type of
> engineering. I didn’t know, and it’s not really documented: Cassandra runs
> in a data center. Nowadays that means the nodes are going to be in managed
> containers, Docker containers, managed by Kubernetes, Mesos or something,
> and for that reason anyone operating Cassandra in a real world setting
> would not encounter the issues I raised in the way I described.
>
> Shouldn’t the architectural diagrams people reference indicate that in
> some way? That would have helped me.
>
> Kenneth Brotman
>
> *From:* Kenneth Brotman [mailto:***@yahoo.com]
> *Sent:* Monday, February 19, 2018 10:43 AM
> *To:* '***@cassandra.apache.org'
> *Cc:* '***@cassandra.apache.org'
> *Subject:* RE: Cassandra Needs to Grow Up by Version Five!
>
> Well said. Very fair. I wouldn’t mind hearing from others still. You’re
> a good guy!
>
> Kenneth Brotman
>
> *From:* Jeff Jirsa [mailto:***@gmail.com <***@gmail.com>]
> *Sent:* Monday, February 19, 2018 9:10 AM
> *To:* cassandra
> *Cc:* Cassandra DEV
> *Subject:* Re: Cassandra Needs to Grow Up by Version Five!
>
> There's a lot of things below I disagree with, but it's ok. I convinced
> myself not to nit-pick every point.
>
> https://issues.apache.org/jira/browse/CASSANDRA-13971 has some of
> Stefan's work with cert management
>
> Beyond that, I encourage you to do what Michael suggested: open JIRAs for
> things you care strongly about, work on them if you have time. Sometime
> this year we'll schedule an NGCC (Next Generation Cassandra Conference)
> where we talk about future project work and direction. I encourage you to
> attend if you're able (I encourage anyone who cares about the direction of
> Cassandra to attend; it'll probably be either free or very low cost, just to
> cover a venue and some food). If nothing else, you'll meet some of the
> teams who are working on the project, and learn why they've selected the
> projects on which they're working. You'll have an opportunity to pitch your
> vision, and maybe you can talk some folks into helping out.
>
> - Jeff
>
>
>
>
> On Mon, Feb 19, 2018 at 1:01 AM, Kenneth Brotman <
> ***@yahoo.com.invalid> wrote:
> Comments inline
>
> >-----Original Message-----
> >From: Jeff Jirsa [mailto:***@gmail.com]
> >Sent: Sunday, February 18, 2018 10:58 PM
> >To: ***@cassandra.apache.org
> >Cc: ***@cassandra.apache.org
> >Subject: Re: Cassandra Needs to Grow Up by Version Five!
> >
> >Comments inline
> >
> >
> >> On Feb 18, 2018, at 9:39 PM, Kenneth Brotman <
> ***@yahoo.com.INVALID> wrote:
> >>
> > >Cassandra feels like an unfinished program to me. The problem is not
> that it’s open source or cutting edge. It’s an open source cutting edge
> program that lacks some of its basic functionality. We are all stuck
> addressing fundamental mechanical tasks for Cassandra because the basic
> code that would do that part has not been contributed yet.
> >>
> >There’s probably 2-3 reasons why here:
> >
> >1) Historically the pmc has tried to keep the scope of the project very
> narrow. It’s a database. We don’t ship drivers. We don’t ship developer
> tools. We don’t ship fancy UIs. We ship a database. I think for the most
> part the narrow vision has been for the best, but maybe it’s time to
> reconsider some of the scope.
> >
> >Postgres will autovacuum to prevent wraparound (hopefully), but everyone
> I know running Postgres uses flexible-freeze in cron - sometimes it’s ok to
> let the database have its opinions and let third party tools fill in the
> gaps.
> >
>
> I can appreciate the desire to stay in scope. I believe usability is the
> King. When users have to learn the database, then learn what they have to
> automate, then learn an automation tool and then use the automation tool to
> do something that is as fundamental as the fundamental tasks I described,
> then something is missing from the database itself that is adversely
> affecting usability - and that is very bad. Where those big companies need
> to calculate the ROI is in the cost of acquiring or training the next group
> of users. Consider how steep the learning curve is for new users.
> Consider the business case for improving ease of use.
>
> >2) Cassandra is, by definition, a database for large scale problems. Most
> of the companies working on/with it tend to be big companies. Big companies
> often have pre-existing automation that solved the stuff you consider
> fundamental tasks, so there’s probably nobody actively working on the
> solved problems that you may consider missing features - for many people
> they’re already solved.
> >
>
> I could be wrong but it sounds like a lot of the code work is done, and if
> the companies would take the time to contribute more code, then the rest of
> the code needed could be generated easily.
>
> >3) It’s not nearly as basic as you think it is. Datastax seemingly had a
> multi-person team on opscenter, and while it was better than anything else
> around last time I used it (before it stopped supporting the OSS version),
> it left a lot to be desired. It’s probably 2-3 engineers working for a
> month to have any sort of meaningful, reliable, mostly trivial
> cluster-managing UI, and I can think of about 10 JIRAs I’d rather see that
> time be spent on first.
>
> How about 6-9 engineers working 12 months a year on it then? I'm not
> kidding. For a big company with revenues in the tens of billions or more,
> and a heavy use of Cassandra nodes, it's easy to make a case for having a
> full time person or more that involved. They aren't paying for using the
> open source code that is Cassandra. Let's see: what would the licensing
> fees be for a big company if the costs were like what Microsoft or Oracle
> would charge for their enterprise level relational database? What's the
> contribution of one or two people in comparison?
>
> >> Ease of use issues need to be given much more attention. For an
> administrator, the ease of use of Cassandra is very poor.
> >>
> >>Furthermore, currently Cassandra is an idiot. We have to do everything
> for Cassandra. Contrast that with the fact that we are in the dawn of
> artificial intelligence.
> >>
> >
> >And for everything you think is obvious, there’s a 50% chance someone
> else will have already solved it differently, and your obvious new solution
> will be seen as an inconvenient assumption and complexity they won’t
> appreciate. Open source projects get to walk a fine line of trying to be
> useful without making too many assumptions, being “too” opinionated, or
> overstepping bounds. We may be too conservative, but it’s very easy to go
> too far in the opposite direction.
> >
>
> I appreciate that, but when such concerns result in inaction instead of
> resolution, that is no good.
>
> >> Software exists to automate tasks for humans, not mechanize humans to
> administer tasks for a database. I’m an engineering type. My job is to
> apply science and technology to solve real world problems. And that’s
> where I need an organization’s I.T. talent to focus; not in crank starting
> an unfinished database.
> >>
> >
> >And that’s why nobody’s done it - we all have bigger problems we’re being
> paid to solve, and nobody’s felt it necessary. Because it’s not necessary,
> it’s nice, but not required.
> >
>
> Of course you would say that, you're Jeff Jirsa. In apprenticeship speak,
> you’re a master. It's the classic challenge of trying to get a master to
> see the legitimate issues of the apprentices. I do appreciate the time you
> give to answer posts to the groups, like this post. So I don't want you
> to take anything the wrong way. Where it's going to bite everyone is in the
> future adoption rate. It has to be addressed.
>
> [snip]
>
> >> Certificate management should be automated.
> >>
> >Stefan (in particular) has done a fair amount of work on this, but I’d
> bet 90% of users don’t use ssl and genuinely don’t care.
> >
>
> I didn't realize. Could I trouble you for a link so I could get up to
> speed?
>
> >> Cluster wide management should be a big theme in any next major release.
> >>
> >Na. Stability and testing should be a big theme in the next major release.
> >
>
> Double Na on that one Jeff. I think you have a concern there about the
> need to test sufficiently to ensure the stability of the next major
> release. That makes perfect sense, for every release, especially the
> major ones. Continuous improvement is not a phase of development, for
> example. CI should be in everything, in every phase. Stability and
> testing are a part of every release, not just one. A major release should
> be a nice step from the previous major release though.
>
> >> What is a major release? How many major releases could a program have
> before all the coding for basic stuff like installation, configuration and
> maintenance is included!
> >>
> >> Finish the basic coding of Cassandra, make it easy to use for
> administrators, make it smart, add cluster wide management. Keep Cassandra
> competitive or it will soon be the old Model T we all remember fondly.
> >>
> >
> >Let’s keep some perspective. Most of us came to Cassandra from rdbms
> worlds where we were building solutions out of a bunch of master/slave
> MySQL / Postgres type databases. I started using Cassandra 0.6 when I
> needed to store something like 400gb/day in 200whatever on spinning disks
> when 100gb felt like a “big” database, and the thought of writing runbooks
> and automation to automatically pick the most up to date slave as the new
> master, promote it, repoint the other slave to the new master, then
> reformat the old master and add it as a new slave without downtime and
> without potentially deleting the company’s whole dataset sounded awful.
> Cassandra solved that problem, at the cost of maintaining a few yaml (then
> xml) files. Yes there are rough edges - they get slightly less rough on
> each new release. Can we do better? Sure, use your engineering time and
> send some patches. But the basic stuff is the nuts and bolts of the
> database: I care way more about streaming and compaction than I’ll ever
> care about installation.
> >
>
> I can relate. I was studying the enterprise level MS SQL Server stuff. I
> noticed exactly what you described. I decided maybe I'll just do other
> stuff and wait for things to develop more. I'm very excited about the way
> Cassandra addresses things. Streaming and compaction - very good. I'm
> glad. Items related to usability are not optional though.
>
> >> I ask the Committee to compile a list of all such items, make a plan,
> and commit to including the completed and tested code as part of major
> release 5.0. I further ask that release 4.0 not be delayed and then there
> be an unusually short skip to version 5.0.
> >>
> >
> >The committers are working their ass off on all sorts of hard problems.
> Some of those are probably even related to Cassandra. If you have an idea,
> open a JIRA. If you have time, send a patch. Or review a patch. But don’t
> expect a bunch of people to set down work on optimizing the database to
> work on packaging and installation, because there’s no ROI in it for 99% of
> the existing committers: we’re working on the database to solve problems,
> and installation isn’t one of those problems.
>
> I'm sure they are working very hard on all kinds of hard problems. I
> actually wrote "Committee", not "committers". There is an obvious shortage
> of contributors when you consider the size of the organizations using
> Cassandra. That leaves the burden on an unfair few. Installation, or more
> generally I would say usability, is not that big a problem for the big
> companies out there. Good for them.
>
> Ask a new organization or a modest size organization that is struggling to
> manage their Cassandra cluster whether usability is a big problem. It
> truly is a big problem for many stakeholders of Cassandra. It needs to be
> given a bigger priority. Hopefully others will weigh in.
>
> Kenneth Brotman
>
Kyrylo Lebediev
2018-02-20 11:39:51 UTC
Agree with you, Daniel, regarding gaps in documentation.


---

At the same time, I disagree with the folks complaining in this thread that functionality such as 'advanced backup' is missing out of the box.

We live in a time when tons of open-source tools (automation, monitoring) and languages are available, and there are also some really powerful SaaS solutions on the market that support C* (Datadog, for instance).


For example, C* provides only the basic building blocks for anti-entropy repairs [I mean that basic usage of 'nodetool repair' is not suitable for large production clusters], but Reaper (many thanks to Spotify and TheLastPickle!), which builds on this basic functionality, solves the task very well for real-world C* setups.


If, in your opinion, something is missing or could be improved: we're in the era of open source. Create your own tool, let's say for automating C* backups using EBS snapshots, and upload it to GitHub.


C* is a DB engine, not a fully automated, self-contained suite.
End users can work on automating routine tasks [as 3rd party projects], while C* contributors focus on core functionality.

--

Going back to the documentation topic: as far as I understand, DataStax is no longer the main C* contributor and is focused on its own C*-based proprietary software [somebody correct me if I'm wrong].

This has led to a situation where development of C* is progressing (as far as I understand, the work is done mainly by some large C* users that have enough resources to contribute the features they need), but no single company has taken over keeping the C* documentation / Wiki up to date.

Honestly, even DataStax's documentation is too concise and is missing a lot of important details.

[BTW, I've just taken a look at https://cassandra.apache.org/doc/latest/ and it doesn't look that 'bad': despite the TODOs, it contains a lot of valuable information]


So, I feel the C* Community has to join efforts to enrich the existing documentation and resurrect the Wiki [where how-tos and information about 3rd party automations and integrations, etc. could be placed].

By the Community I mean all of us, including myself.



Regards,

Kyrill

________________________________
From: Daniel Hölbling-Inzko <daniel.hoelbling-***@bitmovin.com>
Sent: Tuesday, February 20, 2018 11:28:13 AM
To: ***@cassandra.apache.org; James Briggs
Cc: ***@cassandra.apache.org
Subject: Re: Cassandra Needs to Grow Up by Version Five!

Hi,

I have to add my own two cents here as the main thing that keeps me from really running Cassandra is the amount of pain running it incurs.
Not so much because it's actually painful but because the tools are so different and the documentation and best practices are scattered across a dozen outdated DataStax articles and this mailing list etc.. We've been hesitant (although our use case is perfect for using Cassandra) to deploy Cassandra to any critical systems as even after a year of running it we still don't have the operational experience to confidently run critical systems with it.

Simple things like a foolproof / safe cluster-wide S3 Backup (like Elasticsearch has it) would for example solve a TON of issues for new people. I don't need it auto-scheduled or something, but having to configure cron jobs across the whole cluster is a pain in the ass for small teams.
To be honest, even the way snapshots are done right now is already super painful. Every other system I operated so far will just create one backup folder I can export, in C* the Backup is scattered across a bunch of different Keyspace folders etc.. needless to say that it took a while until I trusted my backup scripts fully.

And especially for a Database I believe Backup/Restore needs to be a non-issue that's documented front and center. If not smaller teams just don't have the resources to dedicate to learning and building the tools around it.

Now that the team is getting larger we could spare the resources to operate these things, but switching from a well-understood RDBMs schema to Cassandra is now incredibly hard and will probably take years.

greetings Daniel

On Tue, 20 Feb 2018 at 05:56 James Briggs <***@yahoo.com.invalid> wrote:
Kenneth:

What you said is not wrong.

Vertica and Riak are examples of distributed databases that don't require hand-holding.

Cassandra is for Java-programmer DIYers, or more often Datastax clients, at this point.
Thanks, James.

________________________________
From: Kenneth Brotman <***@yahoo.com.INVALID>
To: ***@cassandra.apache.org<mailto:***@cassandra.apache.org>
Cc: ***@cassandra.apache.org<mailto:***@cassandra.apache.org>
Sent: Monday, February 19, 2018 4:56 PM

Subject: RE: Cassandra Needs to Grow Up by Version Five!

Jeff, you helped me figure out what I was missing. It just took me a day to digest what you wrote. I’m coming over from another type of engineering. I didn’t know and it’s not really documented. Cassandra runs in a data center. Now days that means the nodes are going to be in managed containers, Docker containers, managed by Kerbernetes, Meso or something, and for that reason anyone operating Cassandra in a real world setting would not encounter the issues I raised in the way I described.

Shouldn’t the architectural diagrams people reference indicate that in some way? That would have help me.

Kenneth Brotman

From: Kenneth Brotman [mailto:***@yahoo.com<mailto:***@yahoo.com>]
Sent: Monday, February 19, 2018 10:43 AM
To: '***@cassandra.apache.org<mailto:***@cassandra.apache.org>'
Cc: '***@cassandra.apache.org<mailto:***@cassandra.apache.org>'
Subject: RE: Cassandra Needs to Grow Up by Version Five!

Well said. Very fair. I wouldn’t mind hearing from others still. You’re a good guy!

Kenneth Brotman

From: Jeff Jirsa [mailto:***@gmail.com]
Sent: Monday, February 19, 2018 9:10 AM
To: cassandra
Cc: Cassandra DEV
Subject: Re: Cassandra Needs to Grow Up by Version Five!

There's a lot of things below I disagree with, but it's ok. I convinced myself not to nit-pick every point.

https://issues.apache.org/jira/browse/CASSANDRA-13971 has some of Stefan's work with cert management

Beyond that, I encourage you to do what Michael suggested: open JIRAs for things you care strongly about, work on them if you have time. Sometime this year we'll schedule a NGCC (Next Generation Cassandra Conference) where we talk about future project work and direction, I encourage you to attend if you're able (I encourage anyone who cares about the direction of Cassandra to attend, it's probably be either free or very low cost, just to cover a venue and some food). If nothing else, you'll meet some of the teams who are working on the project, and learn why they've selected the projects on which they're working. You'll have an opportunity to pitch your vision, and maybe you can talk some folks into helping out.

- Jeff




On Mon, Feb 19, 2018 at 1:01 AM, Kenneth Brotman <***@yahoo.com.invalid<mailto:***@yahoo.com.invalid>> wrote:
Comments inline

>-----Original Message-----
>From: Jeff Jirsa [mailto:***@gmail.com<mailto:***@gmail.com>]
>Sent: Sunday, February 18, 2018 10:58 PM
>To: ***@cassandra.apache.org<mailto:***@cassandra.apache.org>
>Cc: ***@cassandra.apache.org<mailto:***@cassandra.apache.org>
>Subject: Re: Cassandra Needs to Grow Up by Version Five!
>
>Comments inline
>
>
>> On Feb 18, 2018, at 9:39 PM, Kenneth Brotman <***@yahoo.com.INVALID<mailto:***@yahoo.com.INVALID>> wrote:
>>
> >Cassandra feels like an unfinished program to me. The problem is not that it’s open source or cutting edge. It’s an open source cutting edge program that lacks some of its basic functionality. We are all stuck addressing fundamental mechanical tasks for Cassandra because the basic code that would do that part has not been contributed yet.
>>
>There’s probably 2-3 reasons why here:
>
>1) Historically the pmc has tried to keep the scope of the project very narrow. It’s a database. We don’t ship drivers. We don’t ship developer tools. We don’t ship fancy UIs. We ship a database. I think for the most part the narrow vision has been for the best, but maybe it’s time to reconsider some of the scope.
>
>Postgres will autovacuum to prevent wraparound (hopefully), but everyone I know running Postgres uses flexible-freeze in cron - sometimes it’s ok to let the database have its opinions and let third party tools fill in the gaps.
>

I can appreciate the desire to stay in scope. I believe usability is the King. When users have to learn the database, then learn what they have to automate, then learn an automation tool and then use the automation tool to do something that is as fundamental as the fundamental tasks I described, then something is missing from the database itself that is adversely affecting usability - and that is very bad. Where those big companies need to calculate the ROI is in the cost of acquiring or training the next group of users. Consider how steep the learning curve is for new users. Consider the business case for improving ease of use.

>2) Cassandra is, by definition, a database for large scale problems. Most of the companies working on/with it tend to be big companies. Big companies often have pre-existing automation that solved the stuff you consider fundamental tasks, so there’s probably nobody actively working on the solved problems that you may consider missing features - for many people they’re already solved.
>

I could be wrong but it sounds like a lot of the code work is done, and if the companies would take the time to contribute more code, then the rest of the code needed could be generated easily.

>3) It’s not nearly as basic as you think it is. Datastax seemingly had a multi-person team on opscenter, and while it was better than anything else around last time I used it (before it stopped supporting the OSS version), it left a lot to be desired. It’s probably 2-3 engineers working for a month to have any sort of meaningful, reliable, mostly trivial cluster-managing UI, and I can think of about 10 JIRAs I’d rather see that time be spent on first.

How about 6-9 engineers working 12 months a year on it then. I'm not kidding. For a big company with revenues in the tens of billions or more, and a heavy use of Cassandra nodes, it's easy to make a case for having a full time person or more that involved. They aren't paying for using the open source code that is Cassandra. Let's see what would the licensing fees be for a big company if the costs where like Microsoft or Oracle would charge for their enterprise level relational database? What's the contribution of one or two people in comparison.

>> Ease of use issues need to be given much more attention. For an administrator, the ease of use of Cassandra is very poor.
>>
>>Furthermore, currently Cassandra is an idiot. We have to do everything for Cassandra. Contrast that with the fact that we are in the dawn of artificial intelligence.
>>
>
>And for everything you think is obvious, there’s a 50% chance someone else will have already solved differently, and your obvious new solution will be seen as an inconvenient assumption and complexity they won’t appreciate. Open source projects get to walk a fine line of trying to be useful without making too many assumptions, being “too” opinionated, or overstepping bounds. We may be too conservative, but it’s very easy to go too far in the opposite direction.
>

I appreciate that but when such concerns result in inaction instead of resolution that is no good.

>> Software exists to automate tasks for humans, not mechanize humans to administer tasks for a database. I’m an engineering type. My job is to apply science and technology to solve real world problems. And that’s where I need an organization’s I.T. talent to focus; not in crank starting an unfinished database.
>>
>
>And that’s why nobody’s done it - we all have bigger problems we’re being paid to solve, and nobody’s felt it necessary. Because it’s not necessary, it’s nice, but not required.
>

Of course you would say that, you're Jeff Jirsa. In apprenticeship speak, you’re a master. It's the classic challenge of trying to get a master to see the legitimate issues of the apprentices. I do appreciate the time you give to answer posts to the groups , like this post. So I don't want you to take anything the wrong way. Where it's going to bit everyone is in the future adoption rate. It has to be addressed.

[snip]

>> Certificate management should be automated.
>>
>Stefan (in particular) has done a fair amount of work on this, but I’d bet 90% of users don’t use ssl and genuinely don’t care.
>

I didn't realize. Could I trouble you for a link so I could get up to speed?

>> Cluster wide management should be a big theme in any next major release.
>>
>Na. Stability and testing should be a big theme in the next major release.
>

Double Na on that one Jeff. I think you have a concern there about the need to test sufficiently to ensure the stability of the next major release. That makes perfect sense.- for every release, especially the major ones. Continuous improvement is not a phase of development for example. CI should be in everything, in every phase. Stability and testing a part of every release not just one. A major release should be a nice step from the previous major release though.

>> What is a major release? How many major releases could a program have before all the coding for basic stuff like installation, configuration and maintenance is included!
>>
>> Finish the basic coding of Cassandra, make it easy to use for administrators, make is smart, add cluster wide management. Keep Cassandra competitive or it will soon be the old Model T we all remember fondly.
>>
>
>Let’s keep some perspective. Most of us came to Cassandra from rdbms worlds where we were building solutions out of a bunch of master/slave MySQL / Postgres type databases. I started using Cassandra 0.6 when I needed to store something like 400gb/day in 200whatever on spinning disks when 100gb felt like a “big” database, and the thought of writing runbooks and automation to automatically pick the most up to date slave as the new master, promote it, repoint the other slave to the new master, then reformat the old master and add it as a new slave without downtime and without potentially deleting the company’s whole dataset sounded awful. Cassandra solved that problem, at the cost of maintaining a few yaml (then xml) files. Yes there are rough edges - they get slightly less rough on each new release. Can we do better? Sure, use your engineering time and send some patches. But the basic stuff is the nuts and bolts of the database: I care way more about streaming and compaction than I’ll ever care about installation.
>

I can relate. I was studying the enterprise level MS SQL Server stuff. I noticed exactly what you described. I decided maybe I'll just do other stuff and wait for things to develop more. I'm very excited about the way Cassandra addresses things. Streaming and compaction - very good. I'm glad. Items related to usability are not optional though.

>> I ask the Committee to compile a list of all such items, make a plan, and commit to including the completed and tested code as part of major release 5.0. I further ask that release 4.0 not be delayed and then there be an unusually short skip to version 5.0.
>>
>
>The committers are working their ass off on all sorts of hard problems. Some of those are probably even related to Cassandra. If you have idea, open a JIRA. If you have time, send a patch. Or review a patch. But don’t expect a bunch of people to set down work on optimizing the database to work on packaging and installation, because there’s no ROI in it for 99% of the existing committers: we’re working on the database to solve problems, and installation isn’t one of those problems.

I'm sure they are working very hard on all kinds of hard problems. I actually wrote "Committee", not "committers". There is an obvious shortage of contributors when you consider the size of the organizations using Cassandra. That leaves the burden on an unfair few. Installation, or more generally usability, is not that big a problem for the big companies out there. Good for them.

Ask a new organization or a modest size organization that is struggling to manage their Cassandra cluster whether usability is a big problem. It truly is a big problem for many stakeholders of Cassandra. It needs to be given a bigger priority. Hopefully others will weigh in.

Kenneth Brotman


Carl Mueller
2018-02-20 17:01:57 UTC
Permalink
I think what is really necessary is providing table-level recipes for
storing data. We need a lot of real world examples and the resulting
schema, compaction strategies, and tunings that were performed for them.
Right now I don't see such crucial cookbook data in the project.
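
As a rough illustration of what one such recipe could look like in
machine-readable form, here is a minimal Python sketch. Everything in it is
an assumption made for the example: the TableRecipe class, the keyspace and
table names, and the chosen settings are hypothetical and not taken from any
real deployment. TimeWindowCompactionStrategy appears only because it is
commonly paired with time-series data, not as tuning advice.

```python
from dataclasses import dataclass, field


@dataclass
class TableRecipe:
    """One hypothetical cookbook entry: a use case plus the table settings chosen for it."""
    use_case: str
    keyspace: str
    table: str
    columns: dict        # column name -> CQL type
    primary_key: str     # e.g. "((sensor_id, day), ts)"
    compaction: dict     # CQL compaction options map
    notes: list = field(default_factory=list)

    def to_cql(self) -> str:
        """Render the recipe as a CREATE TABLE statement with its compaction options."""
        cols = ",\n    ".join(f"{name} {cql_type}" for name, cql_type in self.columns.items())
        opts = ", ".join(f"'{k}': '{v}'" for k, v in self.compaction.items())
        return (
            f"-- {self.use_case}\n"
            f"CREATE TABLE {self.keyspace}.{self.table} (\n"
            f"    {cols},\n"
            f"    PRIMARY KEY {self.primary_key}\n"
            f") WITH compaction = {{{opts}}};"
        )


# Illustrative entry only; the values are placeholders, not recommendations.
EXAMPLE = TableRecipe(
    use_case="Append-only sensor readings, queried by sensor and day",
    keyspace="metrics",
    table="sensor_readings",
    columns={"sensor_id": "uuid", "day": "date", "ts": "timestamp", "value": "double"},
    primary_key="((sensor_id, day), ts)",
    compaction={
        "class": "TimeWindowCompactionStrategy",
        "compaction_window_unit": "DAYS",
        "compaction_window_size": "1",
    },
    notes=["Partition by day to keep partitions bounded."],
)

if __name__ == "__main__":
    print(EXAMPLE.to_cql())
```

A cookbook built this way would pair each entry with the workload it was
tuned for, which is the part that is hard to find today.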

AI is a bit ridiculous. We'd need to apply AI to a collection of big data
systems, and then Cassandra would need an entirely separate AI system
ingesting ALL THE DATA that comes into the already Big Data system in order
to... what? What would we have the AI do? Restructure schemas? Switch
compaction strategies? Add/subtract nodes? Increase/decrease RF? Those are
all insane things to hand to AI approaches that are not transparent about
the factors and processing that led to their conclusions. The best we could
hope for are recommendations.

On Tue, Feb 20, 2018 at 5:39 AM, Kyrylo Lebediev <***@epam.com>
wrote:

> Agree with you, Daniel, regarding gaps in documentation.
>
>
> ---
>
> At the same time I disagree with the folks who are complaining in this
> thread that some functionality, like 'advanced backup', is missing out
> of the box.
>
> We live in a time when there are literally tons of open-source tools
> (automation, monitoring) and languages available, and there are also
> some really powerful SaaS solutions on the market which support C*
> (Datadog, for instance).
>
>
> For example, while C* provides basic building blocks for anti-entropy
> repairs [I mean basic usage of 'nodetool repair' is not suitable for
> large production clusters], Reaper (many thanks to Spotify and
> TheLastPickle!) which uses this basic functionality solves the task very
> well for real-world C* setups.
>
>
> If something is missing or could be improved in your opinion: we're in
> the era of open source. Create your own tool, let's say for automating C*
> backups using EBS snapshots, and upload it to GitHub.
>
>
> C* is a DB engine, not a fully automated, self-contained suite.
> End users are able to work on automating routine tasks [3rd party
> projects], while C* contributors can focus on core functionality.
>
> --
>
> Going back to the documentation topic: as far as I understand, DataStax is
> no longer the main C* contributor and is focused on its own C*-based
> proprietary software [somebody correct me if I'm wrong].
>
> This has led to a situation where development of C* is progressing (as
> far as I understand, the work is done mainly by some large C* users with
> enough resources to contribute to the C* project to get the features they
> need), but no single company has taken over keeping the C* documentation /
> Wiki up to date.
>
> Honestly, even DataStax's documentation is too concise and is missing a
> lot of important details.
>
> [BTW, I've just taken a look at https://cassandra.apache.org/doc/latest/
> and it doesn't look that 'bad': despite the TODOs it contains a lot of
> valuable information]
>
>
> So, I feel the C* Community has to join efforts on enriching the existing
> documentation / resurrecting the Wiki [where howtos, information about
> 3rd party automations and integrations, etc. can be placed].
>
> By the Community I mean all of us including myself.
>
>
>
> Regards,
>
> Kyrill
> ------------------------------
> *From:* Daniel Hölbling-Inzko <daniel.hoelbling-***@bitmovin.com>
> *Sent:* Tuesday, February 20, 2018 11:28:13 AM
> *To:* ***@cassandra.apache.org; James Briggs
>
> *Cc:* ***@cassandra.apache.org
> *Subject:* Re: Cassandra Needs to Grow Up by Version Five!
>
> Hi,
>
> I have to add my own two cents here as the main thing that keeps me from
> really running Cassandra is the amount of pain running it incurs.
> Not so much because it's actually painful but because the tools are so
> different and the documentation and best practices are scattered across a
> dozen outdated DataStax articles and this mailing list etc.. We've been
> hesitant (although our use case is perfect for using Cassandra) to deploy
> Cassandra to any critical systems as even after a year of running it we
> still don't have the operational experience to confidently run critical
> systems with it.
>
> Simple things like a foolproof / safe cluster-wide S3 Backup (like
> Elasticsearch has it) would for example solve a TON of issues for new
> people. I don't need it auto-scheduled or something, but having to
> configure cron jobs across the whole cluster is a pain in the ass for small
> teams.
> To be honest, even the way snapshots are done right now is already super
> painful. Every other system I operated so far will just create one backup
> folder I can export, in C* the Backup is scattered across a bunch of
> different Keyspace folders etc.. needless to say that it took a while until
> I trusted my backup scripts fully.
>
> And especially for a Database I believe Backup/Restore needs to be a
> non-issue that's documented front and center. If not smaller teams just
> don't have the resources to dedicate to learning and building the tools
> around it.
>
> Now that the team is getting larger we could spare the resources to
> operate these things, but switching from a well-understood RDBMs schema to
> Cassandra is now incredibly hard and will probably take years.
>
> greetings Daniel
>
> On Tue, 20 Feb 2018 at 05:56 James Briggs <***@yahoo.com.invalid>
> wrote:
>
> Kenneth:
>
> What you said is not wrong.
>
> Vertica and Riak are examples of distributed databases that don't require
> hand-holding.
>
> Cassandra is for Java-programmer DIYers, or more often Datastax clients,
> at this point.
> Thanks, James.
>
> ------------------------------
> *From:* Kenneth Brotman <***@yahoo.com.INVALID>
> *To:* ***@cassandra.apache.org
> *Cc:* ***@cassandra.apache.org
> *Sent:* Monday, February 19, 2018 4:56 PM
>
> *Subject:* RE: Cassandra Needs to Grow Up by Version Five!
>
> Jeff, you helped me figure out what I was missing. It just took me a day
> to digest what you wrote. I’m coming over from another type of
> engineering. I didn’t know and it’s not really documented. Cassandra runs
> in a data center. Nowadays that means the nodes are going to be in managed
> containers, Docker containers, managed by Kubernetes, Mesos or something,
> and for that reason anyone operating Cassandra in a real world setting
> would not encounter the issues I raised in the way I described.
>
> Shouldn’t the architectural diagrams people reference indicate that in
> some way? That would have helped me.
>
> Kenneth Brotman
>
> *From:* Kenneth Brotman [mailto:***@yahoo.com]
> *Sent:* Monday, February 19, 2018 10:43 AM
> *To:* '***@cassandra.apache.org'
> *Cc:* '***@cassandra.apache.org'
> *Subject:* RE: Cassandra Needs to Grow Up by Version Five!
>
> Well said. Very fair. I wouldn’t mind hearing from others still. You’re
> a good guy!
>
> Kenneth Brotman
>
> *From:* Jeff Jirsa [mailto:***@gmail.com <***@gmail.com>]
> *Sent:* Monday, February 19, 2018 9:10 AM
> *To:* cassandra
> *Cc:* Cassandra DEV
> *Subject:* Re: Cassandra Needs to Grow Up by Version Five!
>
> There's a lot of things below I disagree with, but it's ok. I convinced
> myself not to nit-pick every point.
>
> https://issues.apache.org/jira/browse/CASSANDRA-13971 has some of
> Stefan's work with cert management
>
> Beyond that, I encourage you to do what Michael suggested: open JIRAs for
> things you care strongly about, work on them if you have time. Sometime
> this year we'll schedule a NGCC (Next Generation Cassandra Conference)
> where we talk about future project work and direction, I encourage you to
> attend if you're able (I encourage anyone who cares about the direction of
> Cassandra to attend, it'll probably be either free or very low cost, just to
> cover a venue and some food). If nothing else, you'll meet some of the
> teams who are working on the project, and learn why they've selected the
> projects on which they're working. You'll have an opportunity to pitch your
> vision, and maybe you can talk some folks into helping out.
>
> - Jeff
>
>
>
>
> On Mon, Feb 19, 2018 at 1:01 AM, Kenneth Brotman <
> ***@yahoo.com.invalid> wrote:
> Comments inline
>
> >-----Original Message-----
> >From: Jeff Jirsa [mailto:***@gmail.com]
> >Sent: Sunday, February 18, 2018 10:58 PM
> >To: ***@cassandra.apache.org
> >Cc: ***@cassandra.apache.org
> >Subject: Re: Cassandra Needs to Grow Up by Version Five!
> >
> >Comments inline
> >
> >
> >> On Feb 18, 2018, at 9:39 PM, Kenneth Brotman <
> ***@yahoo.com.INVALID> wrote:
> >>
> > >Cassandra feels like an unfinished program to me. The problem is not
> that it’s open source or cutting edge. It’s an open source cutting edge
> program that lacks some of its basic functionality. We are all stuck
> addressing fundamental mechanical tasks for Cassandra because the basic
> code that would do that part has not been contributed yet.
> >>
> >There’s probably 2-3 reasons why here:
> >
> >1) Historically the pmc has tried to keep the scope of the project very
> narrow. It’s a database. We don’t ship drivers. We don’t ship developer
> tools. We don’t ship fancy UIs. We ship a database. I think for the most
> part the narrow vision has been for the best, but maybe it’s time to
> reconsider some of the scope.
> >
> >Postgres will autovacuum to prevent wraparound (hopefully), but everyone
> I know running Postgres uses flexible-freeze in cron - sometimes it’s ok to
> let the database have its opinions and let third party tools fill in the
> gaps.
> >
>
> I can appreciate the desire to stay in scope. I believe usability is the
> King. When users have to learn the database, then learn what they have to
> automate, then learn an automation tool and then use the automation tool to
> do something that is as fundamental as the fundamental tasks I described,
> then something is missing from the database itself that is adversely
> affecting usability - and that is very bad. Where those big companies need
> to calculate the ROI is in the cost of acquiring or training the next group
> of users. Consider how steep the learning curve is for new users.
> Consider the business case for improving ease of use.
>
> >2) Cassandra is, by definition, a database for large scale problems. Most
> of the companies working on/with it tend to be big companies. Big companies
> often have pre-existing automation that solved the stuff you consider
> fundamental tasks, so there’s probably nobody actively working on the
> solved problems that you may consider missing features - for many people
> they’re already solved.
> >
>
> I could be wrong but it sounds like a lot of the code work is done, and if
> the companies would take the time to contribute more code, then the rest of
> the code needed could be generated easily.
>
> >3) It’s not nearly as basic as you think it is. Datastax seemingly had a
> multi-person team on opscenter, and while it was better than anything else
> around last time I used it (before it stopped supporting the OSS version),
> it left a lot to be desired. It’s probably 2-3 engineers working for a
> month to have any sort of meaningful, reliable, mostly trivial
> cluster-managing UI, and I can think of about 10 JIRAs I’d rather see that
> time be spent on first.
>
> How about 6-9 engineers working 12 months a year on it then. I'm not
> kidding. For a big company with revenues in the tens of billions or more,
> and a heavy use of Cassandra nodes, it's easy to make a case for having a
> full time person or more that involved. They aren't paying for using the
> open source code that is Cassandra. Let's see: what would the licensing
> fees be for a big company if the costs were like what Microsoft or Oracle
> would charge for their enterprise level relational database? What's the
> contribution of one or two people in comparison?
>
> >> Ease of use issues need to be given much more attention. For an
> administrator, the ease of use of Cassandra is very poor.
> >>
> >>Furthermore, currently Cassandra is an idiot. We have to do everything
> for Cassandra. Contrast that with the fact that we are in the dawn of
> artificial intelligence.
> >>
> >
> >And for everything you think is obvious, there’s a 50% chance someone
> else will have already solved differently, and your obvious new solution
> will be seen as an inconvenient assumption and complexity they won’t
> appreciate. Open source projects get to walk a fine line of trying to be
> useful without making too many assumptions, being “too” opinionated, or
> overstepping bounds. We may be too conservative, but it’s very easy to go
> too far in the opposite direction.
> >
>
> I appreciate that but when such concerns result in inaction instead of
> resolution that is no good.
>
> >> Software exists to automate tasks for humans, not mechanize humans to
> administer tasks for a database. I’m an engineering type. My job is to
> apply science and technology to solve real world problems. And that’s
> where I need an organization’s I.T. talent to focus; not in crank starting
> an unfinished database.
> >>
> >
> >And that’s why nobody’s done it - we all have bigger problems we’re being
> paid to solve, and nobody’s felt it necessary. Because it’s not necessary,
> it’s nice, but not required.
> >
>
> Of course you would say that, you're Jeff Jirsa. In apprenticeship speak,
> you’re a master. It's the classic challenge of trying to get a master to
> see the legitimate issues of the apprentices. I do appreciate the time you
> give to answer posts to the groups, like this post. So I don't want you
> to take anything the wrong way. Where it's going to bite everyone is in the
> future adoption rate. It has to be addressed.
>
> [snip]
>
> >> Certificate management should be automated.
> >>
> >Stefan (in particular) has done a fair amount of work on this, but I’d
> bet 90% of users don’t use ssl and genuinely don’t care.
> >
>
> I didn't realize. Could I trouble you for a link so I could get up to
> speed?
>
> >> Cluster wide management should be a big theme in any next major release.
> >>
> >Na. Stability and testing should be a big theme in the next major release.
> >
>
> Double Na on that one Jeff. I think you have a concern there about the
> need to test sufficiently to ensure the stability of the next major
> release. That makes perfect sense - for every release, especially the
> major ones. Continuous improvement is not a phase of development, for
> example. CI should be in everything, in every phase. Stability and
> testing are a part of every release, not just one. A major release should
> be a nice step from the previous major release though.
>
> >> What is a major release? How many major releases could a program have
> before all the coding for basic stuff like installation, configuration and
> maintenance is included!
> >>
> >> Finish the basic coding of Cassandra, make it easy to use for
> administrators, make it smart, add cluster wide management. Keep Cassandra
> competitive or it will soon be the old Model T we all remember fondly.
> >>
> >
> >Let’s keep some perspective. Most of us came to Cassandra from rdbms
> worlds where we were building solutions out of a bunch of master/slave
> MySQL / Postgres type databases. I started using Cassandra 0.6 when I
> needed to store something like 400gb/day in 200whatever on spinning disks
> when 100gb felt like a “big” database, and the thought of writing runbooks
> and automation to automatically pick the most up to date slave as the new
> master, promote it, repoint the other slave to the new master, then
> reformat the old master and add it as a new slave without downtime and
> without potentially deleting the company’s whole dataset sounded awful.
> Cassandra solved that problem, at the cost of maintaining a few yaml (then
> xml) files. Yes there are rough edges - they get slightly less rough on
> each new release. Can we do better? Sure, use your engineering time and
> send some patches. But the basic stuff is the nuts and bolts of the
> database: I care way more about streaming and compaction than I’ll ever
> care about installation.
> >
>
> I can relate. I was studying the enterprise level MS SQL Server stuff. I
> noticed exactly what you described. I decided maybe I'll just do other
> stuff and wait for things to develop more. I'm very excited about the way
> Cassandra addresses things. Streaming and compaction - very good. I'm
> glad. Items related to usability are not optional though.
>
> >> I ask the Committee to compile a list of all such items, make a plan,
> and commit to including the completed and tested code as part of major
> release 5.0. I further ask that release 4.0 not be delayed and then there
> be an unusually short skip to version 5.0.
> >>
> >
> >The committers are working their ass off on all sorts of hard problems.
> Some of those are probably even related to Cassandra. If you have an idea,
> open a JIRA. If you have time, send a patch. Or review a patch. But don’t
> expect a bunch of people to set down work on optimizing the database to
> work on packaging and installation, because there’s no ROI in it for 99% of
> the existing committers: we’re working on the database to solve problems,
> and installation isn’t one of those problems.
>
> I'm sure they are working very hard on all kinds of hard problems. I
> actually wrote "Committee", not "committers". There is an obvious shortage
> of contributors when you consider the size of the organizations using
> Cassandra. That leaves the burden on an unfair few. Installation, or more
> generally usability, is not that big a problem for the big
> companies out there. Good for them.
>
> Ask a new organization or a modest size organization that is struggling to
> manage their Cassandra cluster whether usability is a big problem. It
> truly is a big problem for many stakeholders of Cassandra. It needs to be
> given a bigger priority. Hopefully others will weigh in.
>
> Kenneth Brotman
>
>
>
>
>
>
Kenneth Brotman
2018-02-21 06:22:52 UTC
Permalink
If you watch this video through you'll see why usability is so important. You can't ignore usability issues.

Cassandra does not exist in a vacuum. The competitors are world class.

The video is on the New Cassandra API for Azure Cosmos DB:
https://www.youtube.com/watch?v=1Sf4McGN1AQ

Kenneth Brotman

-----Original Message-----
From: Daniel Hölbling-Inzko [mailto:daniel.hoelbling-***@bitmovin.com]
Sent: Tuesday, February 20, 2018 1:28 AM
To: ***@cassandra.apache.org; James Briggs
Cc: ***@cassandra.apache.org
Subject: Re: Cassandra Needs to Grow Up by Version Five!

Hi,

I have to add my own two cents here as the main thing that keeps me from really running Cassandra is the amount of pain running it incurs.
Not so much because it's actually painful but because the tools are so different and the documentation and best practices are scattered across a dozen outdated DataStax articles, this mailing list, etc. We've been hesitant (although our use case is perfect for using Cassandra) to deploy Cassandra to any critical systems, as even after a year of running it we still don't have the operational experience to confidently run critical systems with it.

Simple things like a foolproof / safe cluster-wide S3 Backup (like Elasticsearch has it) would for example solve a TON of issues for new people. I don't need it auto-scheduled or something, but having to configure cron jobs across the whole cluster is a pain in the ass for small teams.
To be honest, even the way snapshots are done right now is already super painful. Every other system I operated so far will just create one backup folder I can export; in C* the backup is scattered across a bunch of different keyspace folders etc. Needless to say, it took a while until I trusted my backup scripts fully.
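
To make the "scattered" part concrete, here is a minimal Python sketch of the kind of helper a backup script ends up needing. It assumes the usual on-disk layout of <data_root>/<keyspace>/<table>-<id>/snapshots/<tag>/ and a snapshot tag that was already created (for example with nodetool snapshot -t <tag>). The function names, paths and tag are hypothetical, and this is an illustration of the problem rather than a tested backup tool.

```python
from pathlib import Path


def find_snapshot_dirs(data_root, tag):
    """Return every directory holding snapshot `tag` under data_root, sorted.

    Assumes the layout <data_root>/<keyspace>/<table>-<id>/snapshots/<tag>/.
    """
    root = Path(data_root)
    return sorted(p for p in root.glob(f"*/*/snapshots/{tag}") if p.is_dir())


def backup_manifest(data_root, tag):
    """Map 'keyspace/table_dir' to the file names in that table's snapshot."""
    manifest = {}
    for snap in find_snapshot_dirs(data_root, tag):
        table_dir = snap.parent.parent          # .../<keyspace>/<table>-<id>
        key = f"{table_dir.parent.name}/{table_dir.name}"
        manifest[key] = sorted(f.name for f in snap.iterdir() if f.is_file())
    return manifest


if __name__ == "__main__":
    # Hypothetical data directory and tag; adjust to the node being backed up.
    for table, files in backup_manifest("/var/lib/cassandra/data", "nightly").items():
        print(table, len(files), "files")
```

Even this small sketch has to run on every node and walk one directory per table, which is the gap compared with systems that produce a single exportable backup folder.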

And especially for a database I believe Backup/Restore needs to be a non-issue that's documented front and center. If not, smaller teams just don't have the resources to dedicate to learning and building the tools around it.

Now that the team is getting larger we could spare the resources to operate these things, but switching from a well-understood RDBMS schema to Cassandra is now incredibly hard and will probably take years.

greetings Daniel

On Tue, 20 Feb 2018 at 05:56 James Briggs <***@yahoo.com.invalid>
wrote:

> Kenneth:
>
> What you said is not wrong.
>
> Vertica and Riak are examples of distributed databases that don't
> require hand-holding.
>
> Cassandra is for Java-programmer DIYers, or more often Datastax
> clients, at this point.
> Thanks, James.
>
> ------------------------------
> *From:* Kenneth Brotman <***@yahoo.com.INVALID>
> *To:* ***@cassandra.apache.org
> *Cc:* ***@cassandra.apache.org
> *Sent:* Monday, February 19, 2018 4:56 PM
>
> *Subject:* RE: Cassandra Needs to Grow Up by Version Five!
>
> Jeff, you helped me figure out what I was missing. It just took me a
> day to digest what you wrote. I’m coming over from another type of
> engineering. I didn’t know and it’s not really documented. Cassandra
> runs in a data center. Nowadays that means the nodes are going to be
> in managed containers, Docker containers, managed by Kubernetes,
> Mesos or something, and for that reason anyone operating Cassandra in a
> real world setting would not encounter the issues I raised in the way I described.
>
> Shouldn’t the architectural diagrams people reference indicate that in
> some way? That would have helped me.
>
> Kenneth Brotman
>
> *From:* Kenneth Brotman [mailto:***@yahoo.com]
> *Sent:* Monday, February 19, 2018 10:43 AM
> *To:* '***@cassandra.apache.org'
> *Cc:* '***@cassandra.apache.org'
> *Subject:* RE: Cassandra Needs to Grow Up by Version Five!
>
> Well said. Very fair. I wouldn’t mind hearing from others still.
> You’re a good guy!
>
> Kenneth Brotman
>
> *From:* Jeff Jirsa [mailto:***@gmail.com <***@gmail.com>]
> *Sent:* Monday, February 19, 2018 9:10 AM
> *To:* cassandra
> *Cc:* Cassandra DEV
> *Subject:* Re: Cassandra Needs to Grow Up by Version Five!
>
> There's a lot of things below I disagree with, but it's ok. I
> convinced myself not to nit-pick every point.
>
> https://issues.apache.org/jira/browse/CASSANDRA-13971 has some of
> Stefan's work with cert management
>
> Beyond that, I encourage you to do what Michael suggested: open JIRAs
> for things you care strongly about, work on them if you have time.
> Sometime this year we'll schedule a NGCC (Next Generation Cassandra
> Conference) where we talk about future project work and direction, I
> encourage you to attend if you're able (I encourage anyone who cares
> about the direction of Cassandra to attend, it'll probably be either
> free or very low cost, just to cover a venue and some food). If
> nothing else, you'll meet some of the teams who are working on the
> project, and learn why they've selected the projects on which they're
> working. You'll have an opportunity to pitch your vision, and maybe you can talk some folks into helping out.
>
> - Jeff
>
>
>
>
> On Mon, Feb 19, 2018 at 1:01 AM, Kenneth Brotman <
> ***@yahoo.com.invalid> wrote:
> Comments inline
>
> >-----Original Message-----
> >From: Jeff Jirsa [mailto:***@gmail.com]
> >Sent: Sunday, February 18, 2018 10:58 PM
> >To: ***@cassandra.apache.org
> >Cc: ***@cassandra.apache.org
> >Subject: Re: Cassandra Needs to Grow Up by Version Five!
> >
> >Comments inline
> >
> >
> >> On Feb 18, 2018, at 9:39 PM, Kenneth Brotman <
> ***@yahoo.com.INVALID> wrote:
> >>
> > >Cassandra feels like an unfinished program to me. The problem is
> > >not
> that it’s open source or cutting edge. It’s an open source cutting
> edge program that lacks some of its basic functionality. We are all
> stuck addressing fundamental mechanical tasks for Cassandra because
> the basic code that would do that part has not been contributed yet.
> >>
> >There’s probably 2-3 reasons why here:
> >
> >1) Historically the pmc has tried to keep the scope of the project
> >very
> narrow. It’s a database. We don’t ship drivers. We don’t ship
> developer tools. We don’t ship fancy UIs. We ship a database. I think
> for the most part the narrow vision has been for the best, but maybe
> it’s time to reconsider some of the scope.
> >
> >Postgres will autovacuum to prevent wraparound (hopefully), but
> >everyone
> I know running Postgres uses flexible-freeze in cron - sometimes it’s
> ok to let the database have its opinions and let third party tools
> fill in the gaps.
> >
>
> I can appreciate the desire to stay in scope. I believe usability is
> the King. When users have to learn the database, then learn what they
> have to automate, then learn an automation tool and then use the
> automation tool to do something that is as fundamental as the
> fundamental tasks I described, then something is missing from the
> database itself that is adversely affecting usability - and that is
> very bad. Where those big companies need to calculate the ROI is in
> the cost of acquiring or training the next group of users. Consider how steep the learning curve is for new users.
> Consider the business case for improving ease of use.
>
> >2) Cassandra is, by definition, a database for large scale problems.
> >Most
> of the companies working on/with it tend to be big companies. Big
> companies often have pre-existing automation that solved the stuff you
> consider fundamental tasks, so there’s probably nobody actively
> working on the solved problems that you may consider missing features
> - for many people they’re already solved.
> >
>
> I could be wrong but it sounds like a lot of the code work is done,
> and if the companies would take the time to contribute more code, then
> the rest of the code needed could be generated easily.
>
> >3) It’s not nearly as basic as you think it is. Datastax seemingly
> >had a
> multi-person team on opscenter, and while it was better than anything
> else around last time I used it (before it stopped supporting the OSS
> version), it left a lot to be desired. It’s probably 2-3 engineers
> working for a month to have any sort of meaningful, reliable, mostly
> trivial cluster-managing UI, and I can think of about 10 JIRAs I’d
> rather see that time be spent on first.
>
> How about 6-9 engineers working 12 months a year on it then. I'm not
> kidding. For a big company with revenues in the tens of billions or
> more, and a heavy use of Cassandra nodes, it's easy to make a case for
> having a full time person or more that involved. They aren't paying
> for using the open source code that is Cassandra. Let's see: what
> would the licensing fees be for a big company if the costs were like
> what Microsoft or Oracle would charge for their enterprise level
> relational database? What's the contribution of one or two people in
> comparison?
>
> >> Ease of use issues need to be given much more attention. For an
> administrator, the ease of use of Cassandra is very poor.
> >>
> >>Furthermore, currently Cassandra is an idiot. We have to do
> >>everything
> for Cassandra. Contrast that with the fact that we are in the dawn of
> artificial intelligence.
> >>
> >
> >And for everything you think is obvious, there’s a 50% chance someone
> else will have already solved differently, and your obvious new
> solution will be seen as an inconvenient assumption and complexity
> they won’t appreciate. Open source projects get to walk a fine line of
> trying to be useful without making too many assumptions, being “too”
> opinionated, or overstepping bounds. We may be too conservative, but
> it’s very easy to go too far in the opposite direction.
> >
>
> I appreciate that but when such concerns result in inaction instead of
> resolution that is no good.
>
> >> Software exists to automate tasks for humans, not mechanize humans
> >> to
> administer tasks for a database. I’m an engineering type. My job is
> to apply science and technology to solve real world problems. And
> that’s where I need an organization’s I.T. talent to focus; not in
> crank starting an unfinished database.
> >>
> >
> >And that’s why nobody’s done it - we all have bigger problems we’re
> >being
> paid to solve, and nobody’s felt it necessary. Because it’s not
> necessary, it’s nice, but not required.
> >
>
> Of course you would say that, you're Jeff Jirsa. In apprenticeship
> speak, you’re a master. It's the classic challenge of trying to get
> a master to see the legitimate issues of the apprentices. I do
> appreciate the time you give to answer posts to the groups, like this
> post. So I don't want you to take anything the wrong way. Where it's
> going to bite everyone is in the future adoption rate. It has to be addressed.
>
> [snip]
>
> >> Certificate management should be automated.
> >>
> >Stefan (in particular) has done a fair amount of work on this, but
> >I’d
> bet 90% of users don’t use ssl and genuinely don’t care.
> >
>
> I didn't realize. Could I trouble you for a link so I could get up to
> speed?
>
> >> Cluster wide management should be a big theme in any next major release.
> >>
> >Na. Stability and testing should be a big theme in the next major release.
> >
>
> Double Na on that one Jeff. I think you have a concern there about
> the need to test sufficiently to ensure the stability of the next
> major release. That makes perfect sense - for every release,
> especially the major ones. Continuous improvement is not a phase of
> development, for example. CI should be in everything, in every phase.
> Stability and testing are a part of every release, not just one. A major
> release should be a nice step from the previous major release though.
>
> >> What is a major release? How many major releases could a program
> >> have
> before all the coding for basic stuff like installation, configuration
> and maintenance is included!
> >>
> >> Finish the basic coding of Cassandra, make it easy to use for
> administrators, make it smart, add cluster wide management. Keep
> Cassandra competitive or it will soon be the old Model T we all remember fondly.
> >>
> >
> >Let’s keep some perspective. Most of us came to Cassandra from rdbms
> >worlds where we were building solutions out of a bunch of master/slave
> >MySQL / Postgres type databases. I started using Cassandra 0.6 when I
> >needed to store something like 400gb/day in 200whatever on spinning
> >disks when 100gb felt like a “big” database, and the thought of
> >writing runbooks and automation to automatically pick the most up to
> >date slave as the new master, promote it, repoint the other slave to
> >the new master, then reformat the old master and add it as a new slave
> >without downtime and without potentially deleting the company’s whole
> >dataset sounded awful. Cassandra solved that problem, at the cost of
> >maintaining a few yaml (then xml) files. Yes there are rough edges - they
> >get slightly less rough on each new release. Can we do better? Sure, use
> >your engineering time and send some patches. But the basic stuff is the
> >nuts and bolts of the database: I care way more about streaming and
> >compaction than I’ll ever care about installation.
> >
>
> I can relate. I was studying the enterprise level MS SQL Server
> stuff. I noticed exactly what you described. I decided maybe I'll
> just do other stuff and wait for things to develop more. I'm very
> excited about the way Cassandra addresses things. Streaming and
> compaction - very good. I'm glad. Items related to usability are not optional though.
>
> >> I ask the Committee to compile a list of all such items, make a plan,
> >> and commit to including the completed and tested code as part of major
> >> release 5.0. I further ask that release 4.0 not be delayed and then
> >> there be an unusually short skip to version 5.0.
> >>
> >
> >The committers are working their ass off on all sorts of hard problems.
> >Some of those are probably even related to Cassandra. If you have an
> >idea, open a JIRA. If you have time, send a patch. Or review a patch.
> >But don’t expect a bunch of people to set down work on optimizing the
> >database to work on packaging and installation, because there’s no ROI
> >in it for 99% of the existing committers: we’re working on the
> >database to solve problems, and installation isn’t one of those problems.
>
> I'm sure they are working very hard on all kinds of hard problems. I
> actually wrote "Committee", not "committers". There is an obvious
> shortage of contributors when you consider the size of the
> organizations using Cassandra. That leaves the burden unfairly on a
> few. Installation, or more generally I would say usability, is not that
> big a problem for the big companies out there. Good for them.
>
> Ask a new organization or a modest size organization that is
> struggling to manage their Cassandra cluster whether usability is not a
> big problem. It truly is a big problem for many stakeholders of
> Cassandra. It needs to be given a bigger priority. Hopefully others will weigh in.
>
> Kenneth Brotman
Prasenjit Sarkar
2018-02-21 07:28:22 UTC
Permalink
Jeff,

I don't think you can push the topic of usability back to developers by
asking them to open JIRAs. It is incumbent upon the technical leaders of the
Cassandra community to take the initiative in this regard. We can argue
back and forth on the dynamics of open source projects, but the usability
concerns of Cassandra are a reality that cannot be ignored.

Prasenjit

PS My views, not those of my employer

On Tue, Feb 20, 2018 at 10:22 PM, Kenneth Brotman <
***@yahoo.com.invalid> wrote:

> If you watch this video through you'll see why usability is so important.
> You can't ignore usability issues.
>
> Cassandra does not exist in a vacuum. The competitors are world class.
>
> The video is on the New Cassandra API for Azure Cosmos DB:
> https://www.youtube.com/watch?v=1Sf4McGN1AQ
>
> Kenneth Brotman
>
> -----Original Message-----
> From: Daniel Hölbling-Inzko [mailto:daniel.hoelbling-***@bitmovin.com]
> Sent: Tuesday, February 20, 2018 1:28 AM
> To: ***@cassandra.apache.org; James Briggs
> Cc: ***@cassandra.apache.org
> Subject: Re: Cassandra Needs to Grow Up by Version Five!
>
> Hi,
>
> I have to add my own two cents here as the main thing that keeps me from
> really running Cassandra is the amount of pain running it incurs.
> Not so much because it's actually painful but because the tools are so
> different and the documentation and best practices are scattered across a
> dozen outdated DataStax articles, this mailing list, etc. We've been
> hesitant (although our use case is perfect for using Cassandra) to deploy
> Cassandra to any critical systems as even after a year of running it we
> still don't have the operational experience to confidently run critical
> systems with it.
>
> Simple things like a foolproof / safe cluster-wide S3 backup (like
> Elasticsearch has) would for example solve a TON of issues for new
> people. I don't need it auto-scheduled or something, but having to
> configure cron jobs across the whole cluster is a pain in the ass for small
> teams.
> To be honest, even the way snapshots are done right now is already super
> painful. Every other system I operated so far will just create one backup
> folder I can export; in C* the backup is scattered across a bunch of
> different keyspace folders, etc. Needless to say, it took a while until
> I trusted my backup scripts fully.
>
> And especially for a database I believe backup/restore needs to be a
> non-issue that's documented front and center. If not, smaller teams just
> don't have the resources to dedicate to learning and building the tools
> around it.
>
> Now that the team is getting larger we could spare the resources to
> operate these things, but switching from a well-understood RDBMS schema to
> Cassandra is now incredibly hard and will probably take years.
>
> greetings Daniel
>
> On Tue, 20 Feb 2018 at 05:56 James Briggs <***@yahoo.com.invalid>
> wrote:
>
> > Kenneth:
> >
> > What you said is not wrong.
> >
> > Vertica and Riak are examples of distributed databases that don't
> > require hand-holding.
> >
> > Cassandra is for Java-programmer DIYers, or more often Datastax
> > clients, at this point.
> > Thanks, James.
> >
> > ------------------------------
> > *From:* Kenneth Brotman <***@yahoo.com.INVALID>
> > *To:* ***@cassandra.apache.org
> > *Cc:* ***@cassandra.apache.org
> > *Sent:* Monday, February 19, 2018 4:56 PM
> >
> > *Subject:* RE: Cassandra Needs to Grow Up by Version Five!
> >
> > Jeff, you helped me figure out what I was missing. It just took me a
> > day to digest what you wrote. I’m coming over from another type of
> > engineering. I didn’t know and it’s not really documented. Cassandra
> > runs in a data center. Nowadays that means the nodes are going to be
> > in managed containers, Docker containers, managed by Kubernetes,
> > Mesos or something, and for that reason anyone operating Cassandra in a
> > real world setting would not encounter the issues I raised in the way I
> > described.
> >
> > Shouldn’t the architectural diagrams people reference indicate that in
> > some way? That would have helped me.
> >
> > Kenneth Brotman
> >
> > *From:* Kenneth Brotman [mailto:***@yahoo.com]
> > *Sent:* Monday, February 19, 2018 10:43 AM
> > *To:* '***@cassandra.apache.org'
> > *Cc:* '***@cassandra.apache.org'
> > *Subject:* RE: Cassandra Needs to Grow Up by Version Five!
> >
> > Well said. Very fair. I wouldn’t mind hearing from others still.
> > You’re a good guy!
> >
> > Kenneth Brotman
> >
> > *From:* Jeff Jirsa [mailto:***@gmail.com <***@gmail.com>]
> > *Sent:* Monday, February 19, 2018 9:10 AM
> > *To:* cassandra
> > *Cc:* Cassandra DEV
> > *Subject:* Re: Cassandra Needs to Grow Up by Version Five!
> >
> > There's a lot of things below I disagree with, but it's ok. I
> > convinced myself not to nit-pick every point.
> >
> > https://issues.apache.org/jira/browse/CASSANDRA-13971 has some of
> > Stefan's work with cert management
> >
> > Beyond that, I encourage you to do what Michael suggested: open JIRAs
> > for things you care strongly about, work on them if you have time.
> > Sometime this year we'll schedule a NGCC (Next Generation Cassandra
> > Conference) where we talk about future project work and direction. I
> > encourage you to attend if you're able (I encourage anyone who cares
> > about the direction of Cassandra to attend; it'll probably be either
> > free or very low cost, just to cover a venue and some food). If
> > nothing else, you'll meet some of the teams who are working on the
> > project, and learn why they've selected the projects on which they're
> > working. You'll have an opportunity to pitch your vision, and maybe you
> > can talk some folks into helping out.
> >
> > - Jeff
> >
> >
> >
> >
> > On Mon, Feb 19, 2018 at 1:01 AM, Kenneth Brotman <
> > ***@yahoo.com.invalid> wrote:
> > Comments inline
> >
> > >-----Original Message-----
> > >From: Jeff Jirsa [mailto:***@gmail.com]
> > >Sent: Sunday, February 18, 2018 10:58 PM
> > >To: ***@cassandra.apache.org
> > >Cc: ***@cassandra.apache.org
> > >Subject: Re: Cassandra Needs to Grow Up by Version Five!
> > >
> > >Comments inline
> > >
> > >
> > >> On Feb 18, 2018, at 9:39 PM, Kenneth Brotman <***@yahoo.com.INVALID> wrote:
> > >>
> > >> Cassandra feels like an unfinished program to me. The problem is not
> > >> that it’s open source or cutting edge. It’s an open source cutting
> > >> edge program that lacks some of its basic functionality. We are all
> > >> stuck addressing fundamental mechanical tasks for Cassandra because
> > >> the basic code that would do that part has not been contributed yet.
> > >>
> > >There’s probably 2-3 reasons why here:
> > >
> > >1) Historically the PMC has tried to keep the scope of the project very
> > >narrow. It’s a database. We don’t ship drivers. We don’t ship
> > >developer tools. We don’t ship fancy UIs. We ship a database. I think
> > >for the most part the narrow vision has been for the best, but maybe
> > >it’s time to reconsider some of the scope.
> > >
> > >Postgres will autovacuum to prevent wraparound (hopefully), but everyone
> > >I know running Postgres uses flexible-freeze in cron - sometimes it’s
> > >ok to let the database have its opinions and let third party tools
> > >fill in the gaps.
> > >
> >
> > I can appreciate the desire to stay in scope. I believe usability is
> > king. When users have to learn the database, then learn what they
> > have to automate, then learn an automation tool and then use the
> > automation tool to do something that is as fundamental as the
> > fundamental tasks I described, then something is missing from the
> > database itself that is adversely affecting usability - and that is
> > very bad. Where those big companies need to calculate the ROI is in
> > the cost of acquiring or training the next group of users. Consider how
> > steep the learning curve is for new users.
> > Consider the business case for improving ease of use.
> >
> > >2) Cassandra is, by definition, a database for large scale problems. Most
> > >of the companies working on/with it tend to be big companies. Big
> > >companies often have pre-existing automation that solved the stuff you
> > >consider fundamental tasks, so there’s probably nobody actively
> > >working on the solved problems that you may consider missing features
> > >- for many people they’re already solved.
> > >
> >
> > I could be wrong but it sounds like a lot of the code work is done,
> > and if the companies would take the time to contribute more code, then
> > the rest of the code needed could be generated easily.
> >
> > >3) It’s not nearly as basic as you think it is. Datastax seemingly had a
> > >multi-person team on opscenter, and while it was better than anything
> > >else around last time I used it (before it stopped supporting the OSS
> > >version), it left a lot to be desired. It’s probably 2-3 engineers
> > >working for a month to have any sort of meaningful, reliable, mostly
> > >trivial cluster-managing UI, and I can think of about 10 JIRAs I’d
> > >rather see that time be spent on first.
> >
> > How about 6-9 engineers working 12 months a year on it then. I'm not
> > kidding. For a big company with revenues in the tens of billions or
> > more, and a heavy use of Cassandra nodes, it's easy to make a case for
> > having a full time person or more that involved. They aren't paying
> > for using the open source code that is Cassandra. Let's see: what
> > would the licensing fees be for a big company if the costs were like what
> > Microsoft or Oracle would charge for their enterprise level relational
> > database? What's the contribution of one or two people in comparison?
> >
> > >> Ease of use issues need to be given much more attention. For an
> > >> administrator, the ease of use of Cassandra is very poor.
> > >>
> > >> Furthermore, currently Cassandra is an idiot. We have to do everything
> > >> for Cassandra. Contrast that with the fact that we are in the dawn of
> > >> artificial intelligence.
> > >>
> > >
> > >And for everything you think is obvious, there’s a 50% chance someone
> > >else will have already solved it differently, and your obvious new
> > >solution will be seen as an inconvenient assumption and complexity
> > >they won’t appreciate. Open source projects get to walk a fine line of
> > >trying to be useful without making too many assumptions, being “too”
> > >opinionated, or overstepping bounds. We may be too conservative, but
> > >it’s very easy to go too far in the opposite direction.
> > >
> >
> > I appreciate that but when such concerns result in inaction instead of
> > resolution that is no good.
> >
> > >> Software exists to automate tasks for humans, not mechanize humans to
> > >> administer tasks for a database. I’m an engineering type. My job is
> > >> to apply science and technology to solve real world problems. And
> > >> that’s where I need an organization’s I.T. talent to focus; not in
> > >> crank starting an unfinished database.
> > >>
> > >
> > >And that’s why nobody’s done it - we all have bigger problems we’re being
> > >paid to solve, and nobody’s felt it necessary. Because it’s not
> > >necessary, it’s nice, but not required.
> > >
> >
> > Of course you would say that, you're Jeff Jirsa. In apprenticeship
> > speak, you’re a master. It's the classic challenge of trying to get
> > a master to see the legitimate issues of the apprentices. I do
> > appreciate the time you give to answer posts to the groups, like this
> > post. So I don't want you to take anything the wrong way. Where it's
> > going to bite everyone is in the future adoption rate. It has to be
> > addressed.
> >
> > [snip]
> >
> > >> Certificate management should be automated.
> > >>
> > >Stefan (in particular) has done a fair amount of work on this, but I’d
> > >bet 90% of users don’t use ssl and genuinely don’t care.
> > >
> >
> > I didn't realize. Could I trouble you for a link so I could get up to
> > speed?
> >
> > >> Cluster wide management should be a big theme in any next major release.
> > >>
> > >Na. Stability and testing should be a big theme in the next major release.
> > >
> >
> > Double Na on that one, Jeff. I think you have a concern there about
> > the need to test sufficiently to ensure the stability of the next
> > major release. That makes perfect sense for every release,
> > especially the major ones. Continuous improvement is not a phase of
> > development, for example. CI should be in everything, in every phase.
> > Stability and testing are a part of every release, not just one. A major
> > release should be a nice step up from the previous major release though.
> >
> > >> What is a major release? How many major releases could a program have
> > >> before all the coding for basic stuff like installation, configuration
> > >> and maintenance is included!
> > >>
> > >> Finish the basic coding of Cassandra, make it easy to use for
> > >> administrators, make it smart, add cluster wide management. Keep
> > >> Cassandra competitive or it will soon be the old Model T we all remember fondly.
> > >>
> > >
> > >Let’s keep some perspective. Most of us came to Cassandra from rdbms
> > >worlds where we were building solutions out of a bunch of master/slave
> > >MySQL / Postgres type databases. I started using Cassandra 0.6 when I
> > >needed to store something like 400gb/day in 200whatever on spinning
> > >disks when 100gb felt like a “big” database, and the thought of
> > >writing runbooks and automation to automatically pick the most up to
> > >date slave as the new master, promote it, repoint the other slave to
> > >the new master, then reformat the old master and add it as a new slave
> > >without downtime and without potentially deleting the company’s whole
> > >dataset sounded awful. Cassandra solved that problem, at the cost of
> > >maintaining a few yaml (then xml) files. Yes there are rough edges - they
> > >get slightly less rough on each new release. Can we do better? Sure, use
> > >your engineering time and send some patches. But the basic stuff is the
> > >nuts and bolts of the database: I care way more about streaming and
> > >compaction than I’ll ever care about installation.
> > >
> >
> > I can relate. I was studying the enterprise level MS SQL Server
> > stuff. I noticed exactly what you described. I decided maybe I'll
> > just do other stuff and wait for things to develop more. I'm very
> > excited about the way Cassandra addresses things. Streaming and
> > compaction - very good. I'm glad. Items related to usability are not
> > optional though.
> >
> > >> I ask the Committee to compile a list of all such items, make a plan,
> > >> and commit to including the completed and tested code as part of major
> > >> release 5.0. I further ask that release 4.0 not be delayed and then
> > >> there be an unusually short skip to version 5.0.
> > >>
> > >
> > >The committers are working their ass off on all sorts of hard problems.
> > >Some of those are probably even related to Cassandra. If you have an
> > >idea, open a JIRA. If you have time, send a patch. Or review a patch.
> > >But don’t expect a bunch of people to set down work on optimizing the
> > >database to work on packaging and installation, because there’s no ROI
> > >in it for 99% of the existing committers: we’re working on the
> > >database to solve problems, and installation isn’t one of those problems.
> >
> > I'm sure they are working very hard on all kinds of hard problems. I
> > actually wrote "Committee", not "committers". There is an obvious
> > shortage of contributors when you consider the size of the
> > organizations using Cassandra. That leaves the burden unfairly on a
> > few. Installation, or more generally I would say usability, is not that
> > big a problem for the big companies out there. Good for them.
> >
> > Ask a new organization or a modest size organization that is
> > struggling to manage their Cassandra cluster whether usability is not a
> > big problem. It truly is a big problem for many stakeholders of
> > Cassandra. It needs to be given a bigger priority. Hopefully others
> > will weigh in.
> >
> > Kenneth Brotman
Daniel Hölbling-Inzko
2018-02-21 08:19:25 UTC
Permalink
But what does this video really show? That Microsoft managed to run
Cassandra as a SaaS product with a nice UI?
Google did that years ago with Bigtable and Amazon with DynamoDB.

I agree that we need more tools, but not so much for querying (although
that would also help a bit); in general the project just feels
unapproachable right now.
Besides the excellent DataStax documentation there is little best practice
knowledge about how to operate and provision Cassandra clusters.
Having some recipes for Chef, Puppet or Ansible that show the most common
settings (or some Cloud Foundry/GCP templates or Helm charts) would be
really useful.
Also a list of all the projects that Cassandra goes well with (like TLP
Reaper, Netflix's Priam, etc.).

greetings Daniel

On Wed, 21 Feb 2018 at 07:23 Kenneth Brotman <***@yahoo.com.invalid>
wrote:

> If you watch this video through you'll see why usability is so important.
> You can't ignore usability issues.
>
> Cassandra does not exist in a vacuum. The competitors are world class.
>
> The video is on the New Cassandra API for Azure Cosmos DB:
> https://www.youtube.com/watch?v=1Sf4McGN1AQ
>
> Kenneth Brotman
>
> -----Original Message-----
> From: Daniel Hölbling-Inzko [mailto:daniel.hoelbling-***@bitmovin.com]
> Sent: Tuesday, February 20, 2018 1:28 AM
> To: ***@cassandra.apache.org; James Briggs
> Cc: ***@cassandra.apache.org
> Subject: Re: Cassandra Needs to Grow Up by Version Five!
>
> Hi,
>
> I have to add my own two cents here as the main thing that keeps me from
> really running Cassandra is the amount of pain running it incurs.
> Not so much because it's actually painful but because the tools are so
> different and the documentation and best practices are scattered across a
> dozen outdated DataStax articles and this mailing list etc.. We've been
> hesitant (although our use case is perfect for using Cassandra) to deploy
> Cassandra to any critical systems as even after a year of running it we
> still don't have the operational experience to confidently run critical
> systems with it.
>
> Simple things like a foolproof / safe cluster-wide S3 Backup (like
> Elasticsearch has it) would for example solve a TON of issues for new
> people. I don't need it auto-scheduled or something, but having to
> configure cron jobs across the whole cluster is a pain in the ass for small
> teams.
> To be honest, even the way snapshots are done right now is already super
> painful. Every other system I operated so far will just create one backup
> folder I can export, in C* the Backup is scattered across a bunch of
> different Keyspace folders etc.. needless to say that it took a while until
> I trusted my backup scripts fully.
>
> And especially for a Database I believe Backup/Restore needs to be a
> non-issue that's documented front and center. If not smaller teams just
> don't have the resources to dedicate to learning and building the tools
> around it.
>
> Now that the team is getting larger we could spare the resources to
> operate these things, but switching from a well-understood RDBMs schema to
> Cassandra is now incredibly hard and will probably take years.
>
> greetings Daniel
>
> On Tue, 20 Feb 2018 at 05:56 James Briggs <***@yahoo.com.invalid>
> wrote:
>
> > Kenneth:
> >
> > What you said is not wrong.
> >
> > Vertica and Riak are examples of distributed databases that don't
> > require hand-holding.
> >
> > Cassandra is for Java-programmer DIYers, or more often Datastax
> > clients, at this point.
> > Thanks, James.
> >
> > ------------------------------
> > *From:* Kenneth Brotman <***@yahoo.com.INVALID>
> > *To:* ***@cassandra.apache.org
> > *Cc:* ***@cassandra.apache.org
> > *Sent:* Monday, February 19, 2018 4:56 PM
> >
> > *Subject:* RE: Cassandra Needs to Grow Up by Version Five!
> >
> > Jeff, you helped me figure out what I was missing. It just took me a
> > day to digest what you wrote. I’m coming over from another type of
> > engineering. I didn’t know and it’s not really documented. Cassandra
> > runs in a data center. Now days that means the nodes are going to be
> > in managed containers, Docker containers, managed by Kerbernetes,
> > Meso or something, and for that reason anyone operating Cassandra in a
> > real world setting would not encounter the issues I raised in the way I
> described.
> >
> > Shouldn’t the architectural diagrams people reference indicate that in
> > some way? That would have help me.
> >
> > Kenneth Brotman
> >
> > *From:* Kenneth Brotman [mailto:***@yahoo.com]
> > *Sent:* Monday, February 19, 2018 10:43 AM
> > *To:* '***@cassandra.apache.org'
> > *Cc:* '***@cassandra.apache.org'
> > *Subject:* RE: Cassandra Needs to Grow Up by Version Five!
> >
> > Well said. Very fair. I wouldn’t mind hearing from others still
> > You’re a good guy!
> >
> > Kenneth Brotman
> >
> > *From:* Jeff Jirsa [mailto:***@gmail.com <***@gmail.com>]
> > *Sent:* Monday, February 19, 2018 9:10 AM
> > *To:* cassandra
> > *Cc:* Cassandra DEV
> > *Subject:* Re: Cassandra Needs to Grow Up by Version Five!
> >
> > There's a lot of things below I disagree with, but it's ok. I
> > convinced myself not to nit-pick every point.
> >
> > https://issues.apache.org/jira/browse/CASSANDRA-13971 has some of
> > Stefan's work with cert management
> >
> > Beyond that, I encourage you to do what Michael suggested: open JIRAs
> > for things you care strongly about, work on them if you have time.
> > Sometime this year we'll schedule a NGCC (Next Generation Cassandra
> > Conference) where we talk about future project work and direction, I
> > encourage you to attend if you're able (I encourage anyone who cares
> > about the direction of Cassandra to attend, it's probably be either
> > free or very low cost, just to cover a venue and some food). If
> > nothing else, you'll meet some of the teams who are working on the
> > project, and learn why they've selected the projects on which they're
> > working. You'll have an opportunity to pitch your vision, and maybe you
> can talk some folks into helping out.
> >
> > - Jeff
> >
> >
> >
> >
> > On Mon, Feb 19, 2018 at 1:01 AM, Kenneth Brotman <
> > ***@yahoo.com.invalid> wrote:
> > Comments inline
> >
> > >-----Original Message-----
> > >From: Jeff Jirsa [mailto:***@gmail.com]
> > >Sent: Sunday, February 18, 2018 10:58 PM
> > >To: ***@cassandra.apache.org
> > >Cc: ***@cassandra.apache.org
> > >Subject: Re: Cassandra Needs to Grow Up by Version Five!
> > >
> > >Comments inline
> > >
> > >
> > >> On Feb 18, 2018, at 9:39 PM, Kenneth Brotman <
> > ***@yahoo.com.INVALID> wrote:
> > >>
> > > >Cassandra feels like an unfinished program to me. The problem is
> > > >not
> > that it’s open source or cutting edge. It’s an open source cutting
> > edge program that lacks some of its basic functionality. We are all
> > stuck addressing fundamental mechanical tasks for Cassandra because
> > the basic code that would do that part has not been contributed yet.
> > >>
> > >There’s probably 2-3 reasons why here:
> > >
> > >1) Historically the pmc has tried to keep the scope of the project
> > >very
> > narrow. It’s a database. We don’t ship drivers. We don’t ship
> > developer tools. We don’t ship fancy UIs. We ship a database. I think
> > for the most part the narrow vision has been for the best, but maybe
> > it’s time to reconsider some of the scope.
> > >
> > >Postgres will autovacuum to prevent wraparound (hopefully), but
> > >everyone
> > I know running Postgres uses flexible-freeze in cron - sometimes it’s
> > ok to let the database have its opinions and let third party tools
> > fill in the gaps.
> > >
> >
> > I can appreciate the desire to stay in scope. I believe usability is
> > the King. When users have to learn the database, then learn what they
> > have to automate, then learn an automation tool and then use the
> > automation tool to do something that is as fundamental as the
> > fundamental tasks I described, then something is missing from the
> > database itself that is adversely affecting usability - and that is
> > very bad. Where those big companies need to calculate the ROI is in
> > the cost of acquiring or training the next group of users. Consider how
> steep the learning curve is for new users.
> > Consider the business case for improving ease of use.
> >
> > >2) Cassandra is, by definition, a database for large scale problems.
> > >Most
> > of the companies working on/with it tend to be big companies. Big
> > companies often have pre-existing automation that solved the stuff you
> > consider fundamental tasks, so there’s probably nobody actively
> > working on the solved problems that you may consider missing features
> > - for many people they’re already solved.
> > >
> >
> > I could be wrong but it sounds like a lot of the code work is done,
> > and if the companies would take the time to contribute more code, then
> > the rest of the code needed could be generated easily.
> >
> > >3) It’s not nearly as basic as you think it is. Datastax seemingly
> > >had a
> > multi-person team on opscenter, and while it was better than anything
> > else around last time I used it (before it stopped supporting the OSS
> > version), it left a lot to be desired. It’s probably 2-3 engineers
> > working for a month to have any sort of meaningful, reliable, mostly
> > trivial cluster-managing UI, and I can think of about 10 JIRAs I’d
> > rather see that time be spent on first.
> >
> > How about 6-9 engineers working 12 months a year on it then. I'm not
> > kidding. For a big company with revenues in the tens of billions or
> > more, and a heavy use of Cassandra nodes, it's easy to make a case for
> > having a full time person or more that involved. They aren't paying
> > for using the open source code that is Cassandra. Let's see what
> > would the licensing fees be for a big company if the costs where like
> Microsoft or Oracle would
> > charge for their enterprise level relational database? What's the
> > contribution of one or two people in comparison.
> >
> > >> Ease of use issues need to be given much more attention. For an
> > administrator, the ease of use of Cassandra is very poor.
> > >>
> > >>Furthermore, currently Cassandra is an idiot. We have to do
> > >>everything
> > for Cassandra. Contrast that with the fact that we are in the dawn of
> > artificial intelligence.
> > >>
> > >
> > >And for everything you think is obvious, there’s a 50% chance someone
> > else will have already solved differently, and your obvious new
> > solution will be seen as an inconvenient assumption and complexity
> > they won’t appreciate. Open source projects get to walk a fine line of
> > trying to be useful without making too many assumptions, being “too”
> > opinionated, or overstepping bounds. We may be too conservative, but
> > it’s very easy to go too far in the opposite direction.
> > >
> >
> > I appreciate that but when such concerns result in inaction instead of
> > resolution that is no good.
> >
> > >> Software exists to automate tasks for humans, not mechanize humans
> > >> to
> > administer tasks for a database. I’m an engineering type. My job is
> > to apply science and technology to solve real world problems. And
> > that’s where I need an organization’s I.T. talent to focus; not in
> > crank starting an unfinished database.
> > >>
> > >
> > >And that’s why nobody’s done it - we all have bigger problems we’re
> > >being
> > paid to solve, and nobody’s felt it necessary. Because it’s not
> > necessary, it’s nice, but not required.
> > >
> >
> > Of course you would say that, you're Jeff Jirsa. In apprenticeship
> > speak, you’re a master. It's the classic challenge of trying to get
> > a master to see the legitimate issues of the apprentices. I do
> > appreciate the time you give to answer posts to the groups , like this
> > post. So I don't want you to take anything the wrong way. Where it's
> > going to bit everyone is in the future adoption rate. It has to be
> addressed.
> >
> > [snip]
> >
> > >> Certificate management should be automated.
> > >>
> > >Stefan (in particular) has done a fair amount of work on this, but
> > >I’d
> > bet 90% of users don’t use ssl and genuinely don’t care.
> > >
> >
> > I didn't realize. Could I trouble you for a link so I could get up to
> > speed?
> >
> > >> Cluster wide management should be a big theme in any next major
> release.
> > >>
> > >Na. Stability and testing should be a big theme in the next major
> release.
> > >
> >
> > Double Na on that one Jeff. I think you have a concern there about
> > the need to test sufficiently to ensure the stability of the next
> > major release. That makes perfect sense.- for every release,
> > especially the major ones. Continuous improvement is not a phase of
> > development for example. CI should be in everything, in every phase.
> > Stability and testing a part of every release not just one. A major
> > release should be a nice step from the previous major release though.
> >
> > >> What is a major release? How many major releases could a program
> > >> have
> > before all the coding for basic stuff like installation, configuration
> > and maintenance is included!
> > >>
> > >> Finish the basic coding of Cassandra, make it easy to use for
> > administrators, make it smart, add cluster wide management. Keep
> > Cassandra competitive or it will soon be the old Model T we all remember
> > fondly.
> > >>
> > >
> > >Let’s keep some perspective. Most of us came to Cassandra from rdbms
> > worlds where we were building solutions out of a bunch of master/slave
> > MySQL / Postgres type databases. I started using Cassandra 0.6 when I
> > needed to store something like 400gb/day in 200whatever on spinning
> > disks when 100gb felt like a “big” database, and the thought of
> > writing runbooks and automation to automatically pick the most up to
> > date slave as the new master, promote it, repoint the other slave to
> > the new master, then reformat the old master and add it as a new slave
> > without downtime and without potentially deleting the company’s whole
> > dataset sounded awful.
> > Cassandra solved that problem, at the cost of maintaining a few yaml
> > (then xml) files. Yes there are rough edges - they get slightly less rough
> > on each new release. Can we do better? Sure, use your engineering time
> > and send some patches. But the basic stuff is the nuts and bolts of the
> > database: I care way more about streaming and compaction than I’ll
> > ever care about installation.
> > >
> >
> > I can relate. I was studying the enterprise level MS SQL Server
> > stuff. I noticed exactly what you described. I decided maybe I'll
> > just do other stuff and wait for things to develop more. I'm very
> > excited about the way Cassandra addresses things. Streaming and
> > compaction - very good. I'm glad. Items related to usability are not
> > optional though.
> >
> > >> I ask the Committee to compile a list of all such items, make a
> > >> plan,
> > and commit to including the completed and tested code as part of major
> > release 5.0. I further ask that release 4.0 not be delayed and then
> > there be an unusually short skip to version 5.0.
> > >>
> > >
> > >The committers are working their ass off on all sorts of hard problems.
> > Some of those are probably even related to Cassandra. If you have an
> > idea, open a JIRA. If you have time, send a patch. Or review a patch.
> > But don’t expect a bunch of people to set down work on optimizing the
> > database to work on packaging and installation, because there’s no ROI
> > in it for 99% of the existing committers: we’re working on the
> > database to solve problems, and installation isn’t one of those problems.
> >
> > I'm sure they are working very hard on all kinds of hard problems. I
> > actually wrote "Committee", not "committers". There is an obvious
> > shortage of contributors when you consider the size of the
> > organizations using Cassandra. That leaves the burden on an unfair
> > few. Installation or more generally I would say usability is not that
> > big a problem for the big companies out there. Good for them.
> >
> > Ask a new organization or a modest size organization that is
> > struggling to manage their Cassandra cluster whether usability is a
> > big problem. It truly is a big problem for many stakeholders of
> > Cassandra. It needs to be given a bigger priority. Hopefully others
> > will weigh in.
> >
> > Kenneth Brotman
> >
DuyHai Doan
2018-02-21 08:21:47 UTC
For UI and interactive data exploration, there is already the Cassandra
interpreter for Apache Zeppelin, which is more than decent for the job.

On Wed, Feb 21, 2018 at 9:19 AM, Daniel Hölbling-Inzko <
daniel.hoelbling-***@bitmovin.com> wrote:

> But what does this video really show? That Microsoft managed to run
> Cassandra as a SaaS product with a nice UI?
> Google did that years ago with BigTable and Amazon with DynamoDB.
>
> I agree that we need more tools, but not so much for querying (although
> that would also help a bit), but just in general the project feels
> unapproachable right now.
> Besides the excellent DataStax documentation there is little best practice
> knowledge about how to operate and provision Cassandra clusters.
> Having some recipes for Chef, Puppet or Ansible that show the most common
> settings (or some Cloudfoundry/GCP Templates or Helm Charts) would be
> really useful.
> Also a list of all the projects that Cassandra goes well with (like TLP
> Reaper and Netflix's Priam, etc.) would help.
>
> greetings Daniel
>
> On Wed, 21 Feb 2018 at 07:23 Kenneth Brotman <***@yahoo.com.invalid>
> wrote:
>
>> If you watch this video through you'll see why usability is so
>> important. You can't ignore usability issues.
>>
>> Cassandra does not exist in a vacuum. The competitors are world class.
>>
>> The video is on the New Cassandra API for Azure Cosmos DB:
>> https://www.youtube.com/watch?v=1Sf4McGN1AQ
>>
>> Kenneth Brotman
>>
>> -----Original Message-----
>> From: Daniel Hölbling-Inzko [mailto:daniel.hoelbling-***@bitmovin.com]
>> Sent: Tuesday, February 20, 2018 1:28 AM
>> To: ***@cassandra.apache.org; James Briggs
>> Cc: ***@cassandra.apache.org
>> Subject: Re: Cassandra Needs to Grow Up by Version Five!
>>
>> Hi,
>>
>> I have to add my own two cents here as the main thing that keeps me from
>> really running Cassandra is the amount of pain running it incurs.
>> Not so much because it's actually painful but because the tools are so
>> different and the documentation and best practices are scattered across a
>> dozen outdated DataStax articles and this mailing list, etc. We've been
>> hesitant (although our use case is perfect for using Cassandra) to deploy
>> Cassandra to any critical systems as even after a year of running it we
>> still don't have the operational experience to confidently run critical
>> systems with it.
>>
>> Simple things like a foolproof / safe cluster-wide S3 Backup (like
>> Elasticsearch has it) would for example solve a TON of issues for new
>> people. I don't need it auto-scheduled or something, but having to
>> configure cron jobs across the whole cluster is a pain in the ass for small
>> teams.
>> To be honest, even the way snapshots are done right now is already super
>> painful. Every other system I operated so far will just create one backup
>> folder I can export; in C* the backup is scattered across a bunch of
>> different keyspace folders, etc. Needless to say, it took a while until
>> I trusted my backup scripts fully.
>>
>> And especially for a Database I believe Backup/Restore needs to be a
>> non-issue that's documented front and center. If not, smaller teams just
>> don't have the resources to dedicate to learning and building the tools
>> around it.
>>
>> Now that the team is getting larger we could spare the resources to
>> operate these things, but switching from a well-understood RDBMS schema to
>> Cassandra is now incredibly hard and will probably take years.
>>
>> greetings Daniel
>>
>> On Tue, 20 Feb 2018 at 05:56 James Briggs <***@yahoo.com.invalid>
>> wrote:
>>
>> > Kenneth:
>> >
>> > What you said is not wrong.
>> >
>> > Vertica and Riak are examples of distributed databases that don't
>> > require hand-holding.
>> >
>> > Cassandra is for Java-programmer DIYers, or more often Datastax
>> > clients, at this point.
>> > Thanks, James.
>> >
>> > ------------------------------
>> > *From:* Kenneth Brotman <***@yahoo.com.INVALID>
>> > *To:* ***@cassandra.apache.org
>> > *Cc:* ***@cassandra.apache.org
>> > *Sent:* Monday, February 19, 2018 4:56 PM
>> >
>> > *Subject:* RE: Cassandra Needs to Grow Up by Version Five!
>> >
>> > Jeff, you helped me figure out what I was missing. It just took me a
>> > day to digest what you wrote. I’m coming over from another type of
>> > engineering. I didn’t know and it’s not really documented. Cassandra
>> > runs in a data center. Nowadays that means the nodes are going to be
>> > in managed containers, Docker containers, managed by Kubernetes,
>> > Mesos or something, and for that reason anyone operating Cassandra in a
>> > real world setting would not encounter the issues I raised in the way I
>> > described.
>> >
>> > Shouldn’t the architectural diagrams people reference indicate that in
>> > some way? That would have helped me.
>> >
>> > Kenneth Brotman
>> >
>> > *From:* Kenneth Brotman [mailto:***@yahoo.com]
>> > *Sent:* Monday, February 19, 2018 10:43 AM
>> > *To:* '***@cassandra.apache.org'
>> > *Cc:* '***@cassandra.apache.org'
>> > *Subject:* RE: Cassandra Needs to Grow Up by Version Five!
>> >
>> > Well said. Very fair. I wouldn’t mind hearing from others still.
>> > You’re a good guy!
>> >
>> > Kenneth Brotman
>> >
>> > *From:* Jeff Jirsa [mailto:***@gmail.com <***@gmail.com>]
>> > *Sent:* Monday, February 19, 2018 9:10 AM
>> > *To:* cassandra
>> > *Cc:* Cassandra DEV
>> > *Subject:* Re: Cassandra Needs to Grow Up by Version Five!
>> >
>> > There's a lot of things below I disagree with, but it's ok. I
>> > convinced myself not to nit-pick every point.
>> >
>> > https://issues.apache.org/jira/browse/CASSANDRA-13971 has some of
>> > Stefan's work with cert management
>> >
>> > Beyond that, I encourage you to do what Michael suggested: open JIRAs
>> > for things you care strongly about, work on them if you have time.
>> > Sometime this year we'll schedule a NGCC (Next Generation Cassandra
>> > Conference) where we talk about future project work and direction, I
>> > encourage you to attend if you're able (I encourage anyone who cares
>> > about the direction of Cassandra to attend, it'll probably be either
>> > free or very low cost, just to cover a venue and some food). If
>> > nothing else, you'll meet some of the teams who are working on the
>> > project, and learn why they've selected the projects on which they're
>> > working. You'll have an opportunity to pitch your vision, and maybe you
>> > can talk some folks into helping out.
>> >
>> > - Jeff
>> >
>> >
>> >
>> >
>> > On Mon, Feb 19, 2018 at 1:01 AM, Kenneth Brotman <
>> > ***@yahoo.com.invalid> wrote:
>> > Comments inline
>> >
>> > >-----Original Message-----
>> > >From: Jeff Jirsa [mailto:***@gmail.com]
>> > >Sent: Sunday, February 18, 2018 10:58 PM
>> > >To: ***@cassandra.apache.org
>> > >Cc: ***@cassandra.apache.org
>> > >Subject: Re: Cassandra Needs to Grow Up by Version Five!
>> > >
>> > >Comments inline
>> > >
>> > >
>> > >> On Feb 18, 2018, at 9:39 PM, Kenneth Brotman <
>> > ***@yahoo.com.INVALID> wrote:
>> > >>
>> > > >Cassandra feels like an unfinished program to me. The problem is
>> > > >not
>> > that it’s open source or cutting edge. It’s an open source cutting
>> > edge program that lacks some of its basic functionality. We are all
>> > stuck addressing fundamental mechanical tasks for Cassandra because
>> > the basic code that would do that part has not been contributed yet.
>> > >>
>> > >There’s probably 2-3 reasons why here:
>> > >
>> > >1) Historically the pmc has tried to keep the scope of the project
>> > >very
>> > narrow. It’s a database. We don’t ship drivers. We don’t ship
>> > developer tools. We don’t ship fancy UIs. We ship a database. I think
>> > for the most part the narrow vision has been for the best, but maybe
>> > it’s time to reconsider some of the scope.
>> > >
>> > >Postgres will autovacuum to prevent wraparound (hopefully), but
>> > >everyone
>> > I know running Postgres uses flexible-freeze in cron - sometimes it’s
>> > ok to let the database have its opinions and let third party tools
>> > fill in the gaps.
>> > >
>> >
>> > I can appreciate the desire to stay in scope. I believe usability is
>> > the King. When users have to learn the database, then learn what they
>> > have to automate, then learn an automation tool and then use the
>> > automation tool to do something that is as fundamental as the
>> > fundamental tasks I described, then something is missing from the
>> > database itself that is adversely affecting usability - and that is
>> > very bad. Where those big companies need to calculate the ROI is in
>> > the cost of acquiring or training the next group of users. Consider
>> > how steep the learning curve is for new users.
>> > Consider the business case for improving ease of use.
>> >
>> > >2) Cassandra is, by definition, a database for large scale problems.
>> > >Most
>> > of the companies working on/with it tend to be big companies. Big
>> > companies often have pre-existing automation that solved the stuff you
>> > consider fundamental tasks, so there’s probably nobody actively
>> > working on the solved problems that you may consider missing features
>> > - for many people they’re already solved.
>> > >
>> >
>> > I could be wrong but it sounds like a lot of the code work is done,
>> > and if the companies would take the time to contribute more code, then
>> > the rest of the code needed could be generated easily.
>> >
>> > >3) It’s not nearly as basic as you think it is. Datastax seemingly
>> > >had a
>> > multi-person team on opscenter, and while it was better than anything
>> > else around last time I used it (before it stopped supporting the OSS
>> > version), it left a lot to be desired. It’s probably 2-3 engineers
>> > working for a month to have any sort of meaningful, reliable, mostly
>> > trivial cluster-managing UI, and I can think of about 10 JIRAs I’d
>> > rather see that time be spent on first.
>> >
>> > How about 6-9 engineers working 12 months a year on it then. I'm not
>> > kidding. For a big company with revenues in the tens of billions or
>> > more, and a heavy use of Cassandra nodes, it's easy to make a case for
>> > having a full time person or more that involved. They aren't paying
>> > for using the open source code that is Cassandra. Let's see: what
>> > would the licensing fees be for a big company if the costs were like
>> > what Microsoft or Oracle would charge for their enterprise level
>> > relational database? What's the contribution of one or two people in
>> > comparison?
>> >
>> > >> Ease of use issues need to be given much more attention. For an
>> > administrator, the ease of use of Cassandra is very poor.
>> > >>
>> > >>Furthermore, currently Cassandra is an idiot. We have to do
>> > >>everything
>> > for Cassandra. Contrast that with the fact that we are in the dawn of
>> > artificial intelligence.
>> > >>
>> > >
>> > >And for everything you think is obvious, there’s a 50% chance someone
>> > else will have already solved it differently, and your obvious new
>> > solution will be seen as an inconvenient assumption and complexity
>> > they won’t appreciate. Open source projects get to walk a fine line of
>> > trying to be useful without making too many assumptions, being “too”
>> > opinionated, or overstepping bounds. We may be too conservative, but
>> > it’s very easy to go too far in the opposite direction.
>> > >
>> >
>> > I appreciate that but when such concerns result in inaction instead of
>> > resolution that is no good.
>> >
>> > >> Software exists to automate tasks for humans, not mechanize humans
>> > >> to
>> > administer tasks for a database. I’m an engineering type. My job is
>> > to apply science and technology to solve real world problems. And
>> > that’s where I need an organization’s I.T. talent to focus; not in
>> > crank starting an unfinished database.
>> > >>
>> > >
>> > >And that’s why nobody’s done it - we all have bigger problems we’re
>> > >being
>> > paid to solve, and nobody’s felt it necessary. Because it’s not
>> > necessary, it’s nice, but not required.
>> > >
>> >
>> > Of course you would say that, you're Jeff Jirsa. In apprenticeship
>> > speak, you’re a master. It's the classic challenge of trying to get
>> > a master to see the legitimate issues of the apprentices. I do
>> > appreciate the time you give to answer posts to the groups, like this
>> > post. So I don't want you to take anything the wrong way. Where it's
>> > going to bite everyone is in the future adoption rate. It has to be
>> > addressed.
>> >
>> > [snip]
>> >
>> > >> Certificate management should be automated.
>> > >>
>> > >Stefan (in particular) has done a fair amount of work on this, but
>> > >I’d
>> > bet 90% of users don’t use ssl and genuinely don’t care.
>> > >
>> >
>> > I didn't realize. Could I trouble you for a link so I could get up to
>> > speed?
>> >
>> > >> Cluster wide management should be a big theme in any next major release.
>> > >>
>> > >Na. Stability and testing should be a big theme in the next major release.
>> > >
>> >
>> > Double Na on that one Jeff. I think you have a concern there about
>> > the need to test sufficiently to ensure the stability of the next
>> > major release. That makes perfect sense for every release,
>> > especially the major ones. Continuous improvement is not a phase of
>> > development, for example. CI should be in everything, in every phase.
>> > Stability and testing are a part of every release, not just one. A major
>> > release should be a nice step from the previous major release though.
>> >
>> > >> What is a major release? How many major releases could a program
>> > >> have
>> > before all the coding for basic stuff like installation, configuration
>> > and maintenance is included!
>> > >>
>> > >> Finish the basic coding of Cassandra, make it easy to use for
>> > administrators, make it smart, add cluster wide management. Keep
>> > Cassandra competitive or it will soon be the old Model T we all
>> > remember fondly.
>> > >>
>> > >
>> > >Let’s keep some perspective. Most of us came to Cassandra from rdbms
>> > worlds where we were building solutions out of a bunch of master/slave
>> > MySQL / Postgres type databases. I started using Cassandra 0.6 when I
>> > needed to store something like 400gb/day in 200whatever on spinning
>> > disks when 100gb felt like a “big” database, and the thought of
>> > writing runbooks and automation to automatically pick the most up to
>> > date slave as the new master, promote it, repoint the other slave to
>> > the new master, then reformat the old master and add it as a new slave
>> > without downtime and without potentially deleting the company’s whole
>> > dataset sounded awful.
>> > Cassandra solved that problem, at the cost of maintaining a few yaml
>> > (then xml) files. Yes there are rough edges - they get slightly less rough
>> > on each new release. Can we do better? Sure, use your engineering time
>> > and send some patches. But the basic stuff is the nuts and bolts of the
>> > database: I care way more about streaming and compaction than I’ll
>> > ever care about installation.
>> > >
>> >
>> > I can relate. I was studying the enterprise level MS SQL Server
>> > stuff. I noticed exactly what you described. I decided maybe I'll
>> > just do other stuff and wait for things to develop more. I'm very
>> > excited about the way Cassandra addresses things. Streaming and
>> > compaction - very good. I'm glad. Items related to usability are not
>> > optional though.
>> >
>> > >> I ask the Committee to compile a list of all such items, make a
>> > >> plan,
>> > and commit to including the completed and tested code as part of major
>> > release 5.0. I further ask that release 4.0 not be delayed and then
>> > there be an unusually short skip to version 5.0.
>> > >>
>> > >
>> > >The committers are working their ass off on all sorts of hard problems.
>> > Some of those are probably even related to Cassandra. If you have an
>> > idea, open a JIRA. If you have time, send a patch. Or review a patch.
>> > But don’t expect a bunch of people to set down work on optimizing the
>> > database to work on packaging and installation, because there’s no ROI
>> > in it for 99% of the existing committers: we’re working on the
>> > database to solve problems, and installation isn’t one of those
>> > problems.
>> >
>> > I'm sure they are working very hard on all kinds of hard problems. I
>> > actually wrote "Committee", not "committers". There is an obvious
>> > shortage of contributors when you consider the size of the
>> > organizations using Cassandra. That leaves the burden on an unfair
>> > few. Installation or more generally I would say usability is not that
>> > big a problem for the big companies out there. Good for them.
>> >
>> > Ask a new organization or a modest size organization that is
>> > struggling to manage their Cassandra cluster whether usability is a
>> > big problem. It truly is a big problem for many stakeholders of
>> > Cassandra. It needs to be given a bigger priority. Hopefully others
>> > will weigh in.
>> >
>> > Kenneth Brotman
>> >
Ben Slater
2018-02-21 08:39:14 UTC
I’ve been biting my tongue because I don’t normally like to directly plug
our service on the mailing list, but if you’re going to compare Cassandra to
a fully managed service from Microsoft then you really should check out
Instaclustr (www.instaclustr.com). You’ll find that we take care of many
of the issues you have raised in just the same way that Microsoft does
with CosmosDB (i.e. hiding them behind our managed service tooling).

Cheers
Ben

On Wed, 21 Feb 2018 at 19:22 DuyHai Doan <***@gmail.com> wrote:

> For UI and interactive data exploration, there is already the Cassandra
> interpreter for Apache Zeppelin, which is more than decent for the job.
>
> On Wed, Feb 21, 2018 at 9:19 AM, Daniel Hölbling-Inzko <
> daniel.hoelbling-***@bitmovin.com> wrote:
>
>> But what does this video really show? That Microsoft managed to run
>> Cassandra as a SaaS product with a nice UI?
>> Google did that years ago with BigTable and Amazon with DynamoDB.
>>
>> I agree that we need more tools, but not so much for querying (although
>> that would also help a bit), but just in general the project feels
>> unapproachable right now.
>> Besides the excellent DataStax documentation there is little best
>> practice knowledge about how to operate and provision Cassandra clusters.
>> Having some recipes for Chef, Puppet or Ansible that show the most common
>> settings (or some Cloudfoundry/GCP Templates or Helm Charts) would be
>> really useful.
>> Also a list of all the projects that Cassandra goes well with (like TLP
>> Reaper and Netflix's Priam, etc.) would help.
>>
>> greetings Daniel
>>
>> On Wed, 21 Feb 2018 at 07:23 Kenneth Brotman <***@yahoo.com.invalid>
>> wrote:
>>
>>> If you watch this video through you'll see why usability is so
>>> important. You can't ignore usability issues.
>>>
>>> Cassandra does not exist in a vacuum. The competitors are world class.
>>>
>>> The video is on the New Cassandra API for Azure Cosmos DB:
>>> https://www.youtube.com/watch?v=1Sf4McGN1AQ
>>>
>>> Kenneth Brotman
>>>
>>> -----Original Message-----
>>> From: Daniel Hölbling-Inzko [mailto:daniel.hoelbling-***@bitmovin.com]
>>> Sent: Tuesday, February 20, 2018 1:28 AM
>>> To: ***@cassandra.apache.org; James Briggs
>>> Cc: ***@cassandra.apache.org
>>> Subject: Re: Cassandra Needs to Grow Up by Version Five!
>>>
>>> Hi,
>>>
>>> I have to add my own two cents here as the main thing that keeps me from
>>> really running Cassandra is the amount of pain running it incurs.
>>> Not so much because it's actually painful but because the tools are so
>>> different and the documentation and best practices are scattered across a
>>> dozen outdated DataStax articles and this mailing list, etc. We've been
>>> hesitant (although our use case is perfect for using Cassandra) to deploy
>>> Cassandra to any critical systems as even after a year of running it we
>>> still don't have the operational experience to confidently run critical
>>> systems with it.
>>>
>>> Simple things like a foolproof / safe cluster-wide S3 Backup (like
>>> Elasticsearch has it) would for example solve a TON of issues for new
>>> people. I don't need it auto-scheduled or something, but having to
>>> configure cron jobs across the whole cluster is a pain in the ass for small
>>> teams.
>>> To be honest, even the way snapshots are done right now is already super
>>> painful. Every other system I operated so far will just create one backup
>>> folder I can export; in C* the backup is scattered across a bunch of
>>> different keyspace folders, etc. Needless to say, it took a while until
>>> I trusted my backup scripts fully.
>>>
>>> And especially for a Database I believe Backup/Restore needs to be a
>>> non-issue that's documented front and center. If not, smaller teams just
>>> don't have the resources to dedicate to learning and building the tools
>>> around it.
>>>
>>> Now that the team is getting larger we could spare the resources to
>>> operate these things, but switching from a well-understood RDBMS schema to
>>> Cassandra is now incredibly hard and will probably take years.
>>>
>>> greetings Daniel
>>>
>>> On Tue, 20 Feb 2018 at 05:56 James Briggs <***@yahoo.com.invalid>
>>> wrote:
>>>
>>> > Kenneth:
>>> >
>>> > What you said is not wrong.
>>> >
>>> > Vertica and Riak are examples of distributed databases that don't
>>> > require hand-holding.
>>> >
>>> > Cassandra is for Java-programmer DIYers, or more often Datastax
>>> > clients, at this point.
>>> > Thanks, James.
>>> >
>>> > ------------------------------
>>> > *From:* Kenneth Brotman <***@yahoo.com.INVALID>
>>> > *To:* ***@cassandra.apache.org
>>> > *Cc:* ***@cassandra.apache.org
>>> > *Sent:* Monday, February 19, 2018 4:56 PM
>>> >
>>> > *Subject:* RE: Cassandra Needs to Grow Up by Version Five!
>>> >
>>> > Jeff, you helped me figure out what I was missing. It just took me a
>>> > day to digest what you wrote. I’m coming over from another type of
>>> > engineering. I didn’t know and it’s not really documented. Cassandra
>>> > runs in a data center. Nowadays that means the nodes are going to be
>>> > in managed containers, Docker containers, managed by Kubernetes,
>>> > Mesos or something, and for that reason anyone operating Cassandra in a
>>> > real world setting would not encounter the issues I raised in the way
>>> > I described.
>>> >
>>> > Shouldn’t the architectural diagrams people reference indicate that in
>>> > some way? That would have helped me.
>>> >
>>> > Kenneth Brotman
>>> >
>>> > *From:* Kenneth Brotman [mailto:***@yahoo.com]
>>> > *Sent:* Monday, February 19, 2018 10:43 AM
>>> > *To:* '***@cassandra.apache.org'
>>> > *Cc:* '***@cassandra.apache.org'
>>> > *Subject:* RE: Cassandra Needs to Grow Up by Version Five!
>>> >
>>> > Well said. Very fair. I wouldn’t mind hearing from others still.
>>> > You’re a good guy!
>>> >
>>> > Kenneth Brotman
>>> >
>>> > *From:* Jeff Jirsa [mailto:***@gmail.com <***@gmail.com>]
>>> > *Sent:* Monday, February 19, 2018 9:10 AM
>>> > *To:* cassandra
>>> > *Cc:* Cassandra DEV
>>> > *Subject:* Re: Cassandra Needs to Grow Up by Version Five!
>>> >
>>> > There's a lot of things below I disagree with, but it's ok. I
>>> > convinced myself not to nit-pick every point.
>>> >
>>> > https://issues.apache.org/jira/browse/CASSANDRA-13971 has some of
>>> > Stefan's work with cert management
>>> >
>>> > Beyond that, I encourage you to do what Michael suggested: open JIRAs
>>> > for things you care strongly about, work on them if you have time.
>>> > Sometime this year we'll schedule a NGCC (Next Generation Cassandra
>>> > Conference) where we talk about future project work and direction, I
>>> > encourage you to attend if you're able (I encourage anyone who cares
>>> > about the direction of Cassandra to attend, it'll probably be either
>>> > free or very low cost, just to cover a venue and some food). If
>>> > nothing else, you'll meet some of the teams who are working on the
>>> > project, and learn why they've selected the projects on which they're
>>> > working. You'll have an opportunity to pitch your vision, and maybe
>>> > you can talk some folks into helping out.
>>> >
>>> > - Jeff
>>> >
>>> >
>>> >
>>> >
>>> > On Mon, Feb 19, 2018 at 1:01 AM, Kenneth Brotman <
>>> > ***@yahoo.com.invalid> wrote:
>>> > Comments inline
>>> >
>>> > >-----Original Message-----
>>> > >From: Jeff Jirsa [mailto:***@gmail.com]
>>> > >Sent: Sunday, February 18, 2018 10:58 PM
>>> > >To: ***@cassandra.apache.org
>>> > >Cc: ***@cassandra.apache.org
>>> > >Subject: Re: Cassandra Needs to Grow Up by Version Five!
>>> > >
>>> > >Comments inline
>>> > >
>>> > >
>>> > >> On Feb 18, 2018, at 9:39 PM, Kenneth Brotman <
>>> > ***@yahoo.com.INVALID> wrote:
>>> > >>
>>> > > >Cassandra feels like an unfinished program to me. The problem is
>>> > > >not
>>> > that it’s open source or cutting edge. It’s an open source cutting
>>> > edge program that lacks some of its basic functionality. We are all
>>> > stuck addressing fundamental mechanical tasks for Cassandra because
>>> > the basic code that would do that part has not been contributed yet.
>>> > >>
>>> > >There’s probably 2-3 reasons why here:
>>> > >
>>> > >1) Historically the pmc has tried to keep the scope of the project
>>> > >very
>>> > narrow. It’s a database. We don’t ship drivers. We don’t ship
>>> > developer tools. We don’t ship fancy UIs. We ship a database. I think
>>> > for the most part the narrow vision has been for the best, but maybe
>>> > it’s time to reconsider some of the scope.
>>> > >
>>> > >Postgres will autovacuum to prevent wraparound (hopefully), but
>>> > >everyone
>>> > I know running Postgres uses flexible-freeze in cron - sometimes it’s
>>> > ok to let the database have its opinions and let third party tools
>>> > fill in the gaps.
>>> > >
>>> >
>>> > I can appreciate the desire to stay in scope. I believe usability is
>>> > the King. When users have to learn the database, then learn what they
>>> > have to automate, then learn an automation tool and then use the
>>> > automation tool to do something that is as fundamental as the
>>> > fundamental tasks I described, then something is missing from the
>>> > database itself that is adversely affecting usability - and that is
>>> > very bad. Where those big companies need to calculate the ROI is in
>>> > the cost of acquiring or training the next group of users. Consider
>>> > how steep the learning curve is for new users.
>>> > Consider the business case for improving ease of use.
>>> >
>>> > >2) Cassandra is, by definition, a database for large scale problems.
>>> > >Most
>>> > of the companies working on/with it tend to be big companies. Big
>>> > companies often have pre-existing automation that solved the stuff you
>>> > consider fundamental tasks, so there’s probably nobody actively
>>> > working on the solved problems that you may consider missing features
>>> > - for many people they’re already solved.
>>> > >
>>> >
>>> > I could be wrong but it sounds like a lot of the code work is done,
>>> > and if the companies would take the time to contribute more code, then
>>> > the rest of the code needed could be generated easily.
>>> >
>>> > >3) It’s not nearly as basic as you think it is. Datastax seemingly
>>> > >had a
>>> > multi-person team on opscenter, and while it was better than anything
>>> > else around last time I used it (before it stopped supporting the OSS
>>> > version), it left a lot to be desired. It’s probably 2-3 engineers
>>> > working for a month to have any sort of meaningful, reliable, mostly
>>> > trivial cluster-managing UI, and I can think of about 10 JIRAs I’d
>>> > rather see that time be spent on first.
>>> >
>>> > How about 6-9 engineers working 12 months a year on it then. I'm not
>>> > kidding. For a big company with revenues in the tens of billions or
>>> > more, and a heavy use of Cassandra nodes, it's easy to make a case for
>>> > having a full time person or more that involved. They aren't paying
>>> > for using the open source code that is Cassandra. Let's see: what
>>> > would the licensing fees be for a big company if the costs were like
>>> > what Microsoft or Oracle would
>>> > charge for their enterprise level relational database? What's the
>>> > contribution of one or two people in comparison?
>>> >
>>> > >> Ease of use issues need to be given much more attention. For an
>>> > administrator, the ease of use of Cassandra is very poor.
>>> > >>
>>> > >>Furthermore, currently Cassandra is an idiot. We have to do
>>> > >>everything
>>> > for Cassandra. Contrast that with the fact that we are in the dawn of
>>> > artificial intelligence.
>>> > >>
>>> > >
>>> > >And for everything you think is obvious, there’s a 50% chance someone
>>> > else will have already solved differently, and your obvious new
>>> > solution will be seen as an inconvenient assumption and complexity
>>> > they won’t appreciate. Open source projects get to walk a fine line of
>>> > trying to be useful without making too many assumptions, being “too”
>>> > opinionated, or overstepping bounds. We may be too conservative, but
>>> > it’s very easy to go too far in the opposite direction.
>>> > >
>>> >
>>> > I appreciate that but when such concerns result in inaction instead of
>>> > resolution that is no good.
>>> >
>>> > >> Software exists to automate tasks for humans, not mechanize humans
>>> > >> to
>>> > administer tasks for a database. I’m an engineering type. My job is
>>> > to apply science and technology to solve real world problems. And
>>> > that’s where I need an organization’s I.T. talent to focus; not in
>>> > crank starting an unfinished database.
>>> > >>
>>> > >
>>> > >And that’s why nobody’s done it - we all have bigger problems we’re
>>> > >being
>>> > paid to solve, and nobody’s felt it necessary. Because it’s not
>>> > necessary, it’s nice, but not required.
>>> > >
>>> >
>>> > Of course you would say that, you're Jeff Jirsa. In apprenticeship
>>> > speak, you’re a master. It's the classic challenge of trying to get
>>> > a master to see the legitimate issues of the apprentices. I do
>>> > appreciate the time you give to answer posts to the groups, like this
>>> > post. So I don't want you to take anything the wrong way. Where it's
>>> > going to bite everyone is in the future adoption rate. It has to be
>>> > addressed.
>>> >
>>> > [snip]
>>> >
>>> > >> Certificate management should be automated.
>>> > >>
>>> > >Stefan (in particular) has done a fair amount of work on this, but
>>> > >I’d
>>> > bet 90% of users don’t use ssl and genuinely don’t care.
>>> > >
>>> >
>>> > I didn't realize. Could I trouble you for a link so I could get up to
>>> > speed?
>>> >
>>> > >> Cluster wide management should be a big theme in any next major
>>> release.
>>> > >>
>>> > >Na. Stability and testing should be a big theme in the next major
>>> release.
>>> > >
>>> >
>>> > Double Na on that one Jeff. I think you have a concern there about
>>> > the need to test sufficiently to ensure the stability of the next
>>> > major release. That makes perfect sense - for every release,
>>> > especially the major ones. Continuous improvement is not a phase of
>>> > development for example. CI should be in everything, in every phase.
>>> > Stability and testing are a part of every release, not just one. A
>>> > major release should be a nice step from the previous major release
>>> > though.
>>> >
>>> > >> What is a major release? How many major releases could a program
>>> > >> have
>>> > before all the coding for basic stuff like installation, configuration
>>> > and maintenance is included!
>>> > >>
>>> > >> Finish the basic coding of Cassandra, make it easy to use for
>>> > administrators, make it smart, add cluster wide management. Keep
>>> > Cassandra competitive or it will soon be the old Model T we all
>>> > remember fondly.
>>> > >>
>>> > >
>>> > >Let’s keep some perspective. Most of us came to Cassandra from rdbms
>>> > worlds where we were building solutions out of a bunch of master/slave
>>> > MySQL / Postgres type databases. I started using Cassandra 0.6 when I
>>> > needed to store something like 400gb/day in 200whatever on spinning
>>> > disks when 100gb felt like a “big” database, and the thought of
>>> > writing runbooks and automation to automatically pick the most up to
>>> > date slave as the new master, promote it, repoint the other slave to
>>> > the new master, then reformat the old master and add it as a new slave
>>> > without downtime and without potentially deleting the company’s whole
>>> dataset sounded awful.
>>> > Cassandra solved that problem, at the cost of maintaining a few yaml
>>> > (then
>>> > xml) files. Yes there are rough edges - they get slightly less rough
>>> > on each new release. Can we do better? Sure, use your engineering time
>>> > and send some patches. But the basic stuff is the nuts and bolts of
>>> > the
>>> > database: I care way more about streaming and compaction than I’ll
>>> > ever care about installation.
>>> > >
>>> >
>>> > I can relate. I was studying the enterprise level MS SQL Server
>>> > stuff. I noticed exactly what you described. I decided maybe I'll
>>> > just do other stuff and wait for things to develop more. I'm very
>>> > excited about the way Cassandra addresses things. Streaming and
>>> > compaction - very good. I'm glad. Items related to usability are not
>>> optional though.
>>> >
>>> > >> I ask the Committee to compile a list of all such items, make a
>>> > >> plan,
>>> > and commit to including the completed and tested code as part of major
>>> > release 5.0. I further ask that release 4.0 not be delayed and then
>>> > there be an unusually short skip to version 5.0.
>>> > >>
>>> > >
>>> > >The committers are working their ass off on all sorts of hard
>>> problems.
>>> > Some of those are probably even related to Cassandra. If you have an
>>> > idea, open a JIRA. If you have time, send a patch. Or review a patch.
>>> > But don’t expect a bunch of people to set down work on optimizing the
>>> > database to work on packaging and installation, because there’s no ROI
>>> > in it for 99% of the existing committers: we’re working on the
>>> > database to solve problems, and installation isn’t one of those
>>> problems.
>>> >
>>> > I'm sure they are working very hard on all kinds of hard problems. I
>>> > actually wrote "Committee", not "committers". There is an obvious
>>> > shortage of contributors when you consider the size of the
>>> > organizations using Cassandra. That leaves the burden on an unfair
>>> > few. Installation or more generally I would say usability is not that
>>> > big a problem for the big companies out there. Good for them.
>>> >
>>> > Ask a new organization or a modest size organization that is
>>> > struggling to manage their Cassandra cluster whether usability is
>>> > not a big problem. It truly is a big problem for many stakeholders of
>>> > Cassandra. It needs to be given a bigger priority. Hopefully others
>>> > will weigh in.
>>> >
>>> > Kenneth Brotman
>>> >
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-***@cassandra.apache.org
>>> For additional commands, e-mail: user-***@cassandra.apache.org
>>>
>>>
> --


*Ben Slater*

*Chief Product Officer <https://www.instaclustr.com/>*

Oleksandr Shulgin
2018-02-21 10:45:59 UTC
Permalink
On Mon, Feb 19, 2018 at 10:01 AM, Kenneth Brotman <
***@yahoo.com.invalid> wrote:

>
> >> Cluster wide management should be a big theme in any next major release.
> >>
> >Na. Stability and testing should be a big theme in the next major release.
> >
>
> Double Na on that one Jeff. I think you have a concern there about the
> need to test sufficiently to ensure the stability of the next major
> release. That makes perfect sense - for every release, especially the
> major ones. Continuous improvement is not a phase of development for
> example. CI should be in everything, in every phase. Stability and
> testing are a part of every release, not just one. A major release should
> be a nice step from the previous major release though.
>

I guess what Jeff refers to is the tick-tock release cycle experiment,
which has proven to be a complete disaster by popular opinion.

There's also the "materialized views" feature which failed to materialize
in the end (pun intended) and had to be declared experimental retroactively.

Another prominent example is incremental repair, which was introduced as the
default option in 2.2 and is now not recommended because of the many
corner cases where it can fail. So again, experimental as an afterthought.

Not to mention that even if you are aware of the default incremental and go
with full repair instead, you're still in for a sad surprise:
anti-compaction will be triggered despite the "full" repair, because
anti-compaction is only disabled in case of sub-range repair (don't ask
why). So you need to use something advanced like Reaper if you want to
avoid that. I don't think you'll ever find this in the documentation.
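
To illustrate what "sub-range repair" means in practice, here is a minimal
Python sketch that splits the token ring into contiguous sub-ranges and
prints one full repair command per sub-range. It assumes the
Murmur3Partitioner token bounds; the keyspace name my_keyspace and the split
count are hypothetical. It naively splits the whole ring evenly, whereas a
tool like Reaper would presumably derive its segments from the actual ring
topology, so treat this as an illustration of the idea and not as how Reaper
works. The commands are only printed, never executed.

```python
# Token bounds of Murmur3Partitioner (assumed here; other partitioners differ).
MIN_TOKEN = -(2**63)
MAX_TOKEN = 2**63 - 1


def split_ring(parts):
    """Split the full token ring into `parts` contiguous (start, end) ranges."""
    if parts < 1:
        raise ValueError("parts must be >= 1")
    span = MAX_TOKEN - MIN_TOKEN
    # Evenly spaced boundaries; the last boundary is pinned to MAX_TOKEN.
    bounds = [MIN_TOKEN + (span * i) // parts for i in range(parts)]
    bounds.append(MAX_TOKEN)
    return list(zip(bounds[:-1], bounds[1:]))


def repair_commands(keyspace, parts):
    """Build one full, sub-range `nodetool repair` command per range."""
    return [
        f"nodetool repair -full -st {start} -et {end} {keyspace}"
        for start, end in split_ring(parts)
    ]


if __name__ == "__main__":
    # "my_keyspace" is a placeholder keyspace name.
    for cmd in repair_commands("my_keyspace", 4):
        print(cmd)
```

Each printed command restricts the repair to one start/end token pair, which
is the sub-range case described above.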

Honestly, for an eventually-consistent system like Cassandra anti-entropy
repair is one of the most important pieces to get right. And Cassandra
fails really badly on that one: the feature is not really well designed,
poorly implemented and under-documented.

In summary, IMO, Cassandra is a poor implementation of some good ideas.
It is a collection of hacks, not features. They sometimes play together
accidentally, and rarely by design.

Regards,
--
Alex
Josh McKenzie
2018-02-21 16:27:44 UTC
Permalink
There's a disheartening amount of "here's where Cassandra is bad, and
here's what it needs to do for me for free" happening in this thread.

This is open-source software. Everyone is *strongly encouraged* to submit a
patch to move the needle on *any* of these things being complained about in
this thread.

For the Apache Way <https://www.apache.org/foundation/governance/> to work,
people need to step up and meaningfully contribute to a project to scratch
their own itch instead of just waiting for a random corporation-subsidized
engineer to happen to have interests that align with them and contribute
that to the project.

Beating a dead horse for things everyone on the project knows are serious
pain points is not productive.

Kenneth Brotman
2018-02-21 19:56:01 UTC
Permalink
Josh,

To say nothing is indifference. If you care about your community, sometimes don't you have to bring up a subject even though you know it's also temporarily adding some discomfort?

As to opening a JIRA, I've got a very specific topic in mind to try now: an easy one I'll work on and then announce. Someone else will have to do the coding. A year from now I would probably just knock it out to make sure it's as easy as I expect it to be but to be honest, as I've been saying, I'm not set up to do that right now. For one, I've barely looked at any Cassandra code; secondly, everyone on this list probably codes more than I do; and lastly, it's a good one for someone that wants an easy one to start with: vNodes. I've already seen too many people seeking assistance with the vNode setting.

And you can expect as others have been mentioning that there should be similar ones on compaction, repair and backup.

Microsoft knows poor usability gives them an easy market to take over. And they make it easy to switch.

Beginning at 4:17 in the video, it says the following:

"You don't need to worry about replica sets, quorum or read repair. You can focus on writing correct application logic."

At 4:42, it says:
"Hopefully this gives you a quick idea of how seamlessly you can bring your existing Cassandra applications to Azure Cosmos DB. No code changes are required. It works with your favorite Cassandra tools and drivers including for example native Cassandra driver for Spark. And it takes seconds to get going, and it's elastically and globally scalable."

More to come,

Kenneth Brotman

DuyHai Doan
2018-02-21 20:11:39 UTC
Permalink
So before buying any marketing claims from Microsoft or whoever, maybe
you should try using it extensively?

And talking about backup, have a look at DynamoDB:
http://i68.tinypic.com/n1b6yr.jpg

From my POV, if a multi-billion-dollar company like Amazon doesn't get it
right or can't make it easy for end users (without involving an unwieldy
Hadoop machinery:
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/DynamoDBPipeline.html),
what Cassandra offers in terms of backup and restore is more than satisfactory.




Jon Haddad
2018-02-21 20:43:41 UTC
Permalink
Ken,

Maybe it’s not clear how open source projects work, so let me try to explain. There’s a bunch of us who either get paid by someone or volunteer in our free time. The folks that get paid (yay!) usually take direction on what the priorities are, and work on projects that directly affect our jobs. That means that someone needs to care enough about the features you want to work on them, if you’re not going to do it yourself.

Now as others have said already, please put your list of demands in JIRA; if someone is interested, they will work on it. You may need to contribute a little more than you’ve done already, so be prepared to get involved if you actually want to see something get done. Perhaps learning a little more about Cassandra’s internals and the people involved will reveal some of the design decisions and priorities of the project.

Third, you seem to be a little obsessed with market share. While market share is fun to talk about, *most* of us that are working on and contributing to Cassandra do so because it does actually solve a problem we have, and solves it reasonably well. If some magic open source DB appears out of nowhere and does everything you want Cassandra to, and is bug free, keeps your data consistent, automatically does backups, comes with really nice cert management, ad hoc querying, amazing materialized views that are perfect, no caveats to secondary indexes, and somehow still gives you linear scalability without any mental overhead whatsoever, then sure, people might start using it. And that’s actually OK, because if that happens we’ll all be incredibly pumped out of our minds because we won’t have to work as hard. If on the slim chance that doesn’t manifest, those of us that use Cassandra and are part of the community will keep working on the things we care about, iterating, and improving things. Maybe someone will even take a look at your JIRA issues.

Further filling the mailing list with your grievances will likely not help you progress towards your goal of a Cassandra that’s easier to use, so I encourage you to try to be a little more productive and try to help rather than just complain, which is not constructive. I did a quick search for your name on the mailing list, and I’ve seen very little from you, so to everyone who’s been around for a while and trying to help you, it looks like you’re just some random dude asking for people to work for free on the things you’re asking for, without offering anything back in return.

Jon


> On Feb 21, 2018, at 11:56 AM, Kenneth Brotman <***@yahoo.com.INVALID> wrote:
>
> Josh,
>
> To say nothing is indifference. If you care about your community, sometimes don't you have to bring up a subject even though you know it's also temporarily adding some discomfort?
>
> As to opening a JIRA, I've got a very specific topic to try in mind now. An easy one I'll work on and then announce. Someone else will have to do the coding. A year from now I would probably just knock it out to make sure it's as easy as I expect it to be but to be honest, as I've been saying, I'm not set up to do that right now. I've barely looked at any Cassandra code; for one; everyone on this list probably codes more than I do, secondly; and lastly, it's a good one for someone that wants an easy one to start with: vNodes. I've already seen too many people seeking assistance with the vNode setting.
>
> And you can expect as others have been mentioning that there should be similar ones on compaction, repair and backup.
>
> Microsoft knows poor usability gives them an easy market to take over. And they make it easy to switch.
>
> Beginning at 4:17 in the video, it says the following:
>
> "You don't need to worry about replica sets, quorum or read repair. You can focus on writing correct application logic."
>
> At 4:42, it says:
> "Hopefully this gives you a quick idea of how seamlessly you can bring your existing Cassandra applications to Azure Cosmos DB. No code changes are required. It works with your favorite Cassandra tools and drivers including for example native Cassandra driver for Spark. And it takes seconds to get going, and it's elastically and globally scalable."
>
> More to come,
>
> Kenneth Brotman
>
> -----Original Message-----
> From: Josh McKenzie [mailto:***@apache.org]
> Sent: Wednesday, February 21, 2018 8:28 AM
> To: ***@cassandra.apache.org
> Cc: User
> Subject: Re: Cassandra Needs to Grow Up by Version Five!
>
> There's a disheartening amount of "here's where Cassandra is bad, and here's what it needs to do for me for free" happening in this thread.
>
> This is open-source software. Everyone is *strongly encouraged* to submit a patch to move the needle on *any* of these things being complained about in this thread.
>
> For the Apache Way <https://www.apache.org/foundation/governance/> to work, people need to step up and meaningfully contribute to a project to scratch their own itch instead of just waiting for a random corporation-subsidized engineer to happen to have interests that align with them and contribute that to the project.
>
> Beating a dead horse for things everyone on the project knows are serious pain points is not productive.
>
> On Wed, Feb 21, 2018 at 5:45 AM, Oleksandr Shulgin < ***@zalando.de> wrote:
>
>> On Mon, Feb 19, 2018 at 10:01 AM, Kenneth Brotman <
>> ***@yahoo.com.invalid> wrote:
>>
>>>
>>>>> Cluster wide management should be a big theme in any next major
>> release.
>>>>>
>>>> Na. Stability and testing should be a big theme in the next major
>> release.
>>>>
>>>
>>> Double Na on that one Jeff. I think you have a concern there about
>>> the need to test sufficiently to ensure the stability of the next
>>> major release. That makes perfect sense.- for every release,
>>> especially the major ones. Continuous improvement is not a phase of
>>> development for example. CI should be in everything, in every
>>> phase. Stability and testing a part of every release not just one.
>>> A major release should be
>> a
>>> nice step from the previous major release though.
>>>
>>
>> I guess what Jeff refers to is the tick-tock release cycle experiment,
>> which has proven to be a complete disaster by popular opinion.
>>
>> There's also the "materialized views" feature which failed to
>> materialize in the end (pun intended) and had to be declared
>> experimental retroactively.
>>
>> Another prominent example is incremental repair which was introduced
>> as the default option in 2.2 and now is not recommended to use because
>> of so many corner cases where it can fail. So again experimental as an afterthought.
>>
>> Not to mention that even if you are aware of the default incremental
>> and go with full repair instead, you're still up for a sad surprise:
>> anti-compaction will be triggered despite the "full" repair. Because
>> anti-compaction is only disabled in case of sub-range repair (don't
>> ask why), so you need to use something advanced like Reaper if you
>> want to avoid that. I don't think you'll ever find this in the documentation.
>>
>> Honestly, for an eventually-consistent system like Cassandra
>> anti-entropy repair is one of the most important pieces to get right.
>> And Cassandra fails really badly on that one: the feature is not
>> really well designed, poorly implemented and under-documented.
>>
> >> In summary, IMO, Cassandra is a poor implementation of some good ideas.
>> It is a collection of hacks, not features. They sometimes play
>> together accidentally, and rarely by design.
>>
>> Regards,
>> --
>> Alex
>>
>
>
Kenneth Brotman
2018-02-21 22:20:54 UTC
Permalink
Jon,

Very sorry that you don't see the value of the time I'm taking for this. I don't have demands; I do have a stern warning, and I'm right, Jon. Please be very careful not to mischaracterize my words, Jon.

You suggest I put things in JIRA, then seem to suggest that I'd be lucky if anyone looked at them and did anything. That's what I figured too.

I don't appreciate the hostility. You will understand more fully in the next post where I'm coming from. Try to keep the conversation civilized. I'm trying, or at least, so you understand, I think what I'm doing is saving your gig and mine. I really like a lot of people in this group.

I've come to a preliminary assessment on things. Soon the cloud will clear or I'll be gone. Don't worry. I'm a very peaceful person and, like you, I am driven by really important projects that I feel compelled to work on for the good of others. I don't have time for people to hand-hold a database, and I can't get stuck with my projects on the wrong stuff.

Kenneth Brotman


-----Original Message-----
From: Jon Haddad [mailto:***@gmail.com] On Behalf Of Jon Haddad
Sent: Wednesday, February 21, 2018 12:44 PM
To: ***@cassandra.apache.org
Cc: ***@cassandra.apache.org
Subject: Re: Cassandra Needs to Grow Up by Version Five!

Ken,

Maybe it’s not clear how open source projects work, so let me try to explain. There’s a bunch of us who either get paid by someone or volunteer in our free time. The folks that get paid (yay!) usually take direction on what the priorities are, and work on projects that directly affect our jobs. That means that someone needs to care enough about the features you want to work on them, if you’re not going to do it yourself.

Now as others have said already, please put your list of demands in JIRA; if someone is interested, they will work on it. You may need to contribute a little more than you’ve done already; be prepared to get involved if you actually want to see something get done. Perhaps learning a little more about Cassandra’s internals and the people involved will reveal some of the design decisions and priorities of the project.

Third, you seem to be a little obsessed with market share. While market share is fun to talk about, *most* of us that are working on and contributing to Cassandra do so because it does actually solve a problem we have, and solves it reasonably well. If some magic open source DB appears out of nowhere and does everything you want Cassandra to, and is bug free, keeps your data consistent, automatically does backups, comes with really nice cert management, ad hoc querying, amazing materialized views that are perfect, no caveats to secondary indexes, and somehow still gives you linear scalability without any mental overhead whatsoever, then sure, people might start using it. And that’s actually OK, because if that happens we’ll all be incredibly pumped out of our minds because we won’t have to work as hard. If on the slim chance that doesn’t manifest, those of us that use Cassandra and are part of the community will keep working on the things we care about, iterating, and improving things. Maybe someone will even take a look at your JIRA issues.

Further filling the mailing list with your grievances will likely not help you progress towards your goal of a Cassandra that’s easier to use, so I encourage you to try to be a little more productive and try to help rather than just complain, which is not constructive. I did a quick search for your name on the mailing list, and I’ve seen very little from you, so to everyone who’s been around for a while and trying to help you, it looks like you’re just some random dude asking for people to work for free on the things you’re asking for, without offering anything back in return.

Jon


Akash Gangil
2018-02-21 22:23:47 UTC
Permalink
I would second Jon in the arguments he made. Contributing outside of work is
draining and really requires a lot of commitment. If someone requires
features around usability, etc., just pay for it, period.

On Wed, Feb 21, 2018 at 2:20 PM, Kenneth Brotman <
***@yahoo.com.invalid> wrote:

> Jon,
>
> Very sorry that you don't see the value of the time I'm taking for this.
> I don't have demands; I do have a stern warning, and I'm right, Jon. Please
> be very careful not to mischaracterize my words, Jon.
>
> You suggest I put things in JIRA, then seem to suggest that I'd be lucky
> if anyone looked at them and did anything. That's what I figured too.
>
> I don't appreciate the hostility. You will understand more fully in the
> next post where I'm coming from. Try to keep the conversation civilized.
> I'm trying, or at least, so you understand, I think what I'm doing is saving
> your gig and mine. I really like a lot of people in this group.
>
> I've come to a preliminary assessment on things. Soon the cloud will
> clear or I'll be gone. Don't worry. I'm a very peaceful person and, like
> you, I am driven by really important projects that I feel compelled to work on
> for the good of others. I don't have time for people to hand-hold a
> database, and I can't get stuck with my projects on the wrong stuff.
>
> Kenneth Brotman
>
>


--
Akash
Kenneth Brotman
2018-02-21 22:53:32 UTC
Permalink
Hi Akash,

I get the part about outside work, which is why, in replying to Jeff Jirsa, I was suggesting the big companies could justify taking it on easily enough and, you know, actually pay the people who would be working on it so those people could have a life.

The part I don't get is the aversion to usability. Isn't that what you think about when you are coding? "Am I making this thing I'm building easy to use?" If you were programming for me, we would be constantly talking about what we are building and how we can make things easier for users. If I had to fight with a developer, architect or engineer about usability all the time, they would be gone, and quick. How do you approach programming if you aren't trying to make things easy?

Kenneth Brotman

-----Original Message-----
From: Akash Gangil [mailto:***@gmail.com]
Sent: Wednesday, February 21, 2018 2:24 PM
To: ***@cassandra.apache.org
Cc: ***@cassandra.apache.org
Subject: Re: Cassandra Needs to Grow Up by Version Five!

I would second Jon in the arguments he made. Contributing outside of work is draining and really requires a lot of commitment. If someone requires features around usability, etc., just pay for it, period.


--
Akash


Brandon Williams
2018-02-21 23:10:45 UTC
Permalink
The only progress from this point is what Jon said: enumerate and detail
your issues in JIRA tickets.

On Wed, Feb 21, 2018 at 4:53 PM, Kenneth Brotman <
***@yahoo.com.invalid> wrote:

> Hi Akash,
>
> I get the part about outside work which is why in replying to Jeff Jirsa I
> was suggesting the big companies could justify taking it on easy enough and
> you know actually pay the people who would be working at it so those people
> could have a life.
>
> The part I don't get is the aversion to usability. Isn't that what you
> think about when you are coding? "Am I making this thing I'm building easy
> to use?" If you were programming for me, we would be constantly talking
> about what we are building and how we can make things easier for users. If
> I had to fight with a developer, architect or engineer about usability all
> the time, they would be gone and quick. How do you approach programming if you
> aren't trying to make things easy?
>
> Kenneth Brotman
>
Jeff Jirsa
2018-02-21 23:11:43 UTC
Permalink
On Wed, Feb 21, 2018 at 2:53 PM, Kenneth Brotman <
***@yahoo.com.invalid> wrote:

> Hi Akash,
>
> I get the part about outside work which is why in replying to Jeff Jirsa I
> was suggesting the big companies could justify taking it on easy enough and
> you know actually pay the people who would be working at it so those people
> could have a life.
>
> The part I don't get is the aversion to usability. Isn't that what you
> think about when you are coding? "Am I making this thing I'm building easy
> to use?" If you were programming for me, we would be constantly talking
> about what we are building and how we can make things easier for users. If
> I had to fight with a developer, architect or engineer about usability all
> the time, they would be gone and quick. How do you approach programming if you
> aren't trying to make things easy?
>


There's no aversion to usability; you're assuming things that just aren't
true. Nobody's against usability, we've just prioritized other things
HIGHER. We make those decisions in part by looking at open JIRAs and
determining what's asked for the most, what members of the community have
contributed, and then balancing that against what we ourselves care about.
You're making a statement that it should be the top priority for the next
release, with no JIRA, no history of contributing (and indeed, no real
clear sign that you even understand the full extent of the database), no
sign that you're willing to do the work yourself, and a ton of
assumptions about the level of effort and ROI.

I would love for Cassandra to be easier to use; I'm sure everyone would.
There are a dozen features I'd love to add if I had infinite budget and
infinite manpower. But what you're asking for is A LOT of effort and / or A
LOT of money, and you're assuming someone's going to step up and foot the
bill, but there's no real reason to believe that's the case.

In the meantime, everyone's spending hours replying to this thread that is
0% actionable. We would all have been objectively better off had everyone
ignored this thread and just spent 10 minutes writing some section of the
docs. So the next time I get the urge to reply, I'm just going to do that
instead.
Kenneth Brotman
2018-02-22 00:13:19 UTC
Permalink
Jeff,

I already addressed everything you said. Boy, would I like to bring up the out-of-date articles on the web that trip people up and the lousy documentation on the Apache website, but I can’t because a lot of folks don’t know me or why I’m saying these things.

I will be making another post that I hope clarifies what’s going on with me. After that I will either be a freakishly valuable asset to this community or I will be a freakishly valuable asset to another community.

You sure have a funny way of reining in people that are used to helping out. You sure misjudged me. Wow.

Kenneth Brotman

Jason Brown
2018-02-21 23:14:17 UTC
Permalink
Hi all,

I'd like to deescalate a bit here.

Since this is an Apache and an OSS project, contributions come in many
forms: code, speaking/advocacy, documentation, support, project management,
and so on. None of these things come for free.

Ken, I appreciate you bringing up these usability topics; they are certainly
valid concerns. You've mentioned you are working on a posting of some sort
that I think will amount to an enumerated list of the topics/issues you
feel need addressing. Some may be simple changes, some may be more
invasive, some we can consider implementing, some not. I look forward to a
positive discussion.

I think what would be best would be for you to complete that list and work
with the community, in a *positive and constructive manner*, towards
getting it done. That is certainly contributing, and contributing in a big
way: project management. Working with the community is going to be the most
beneficial path for everyone.

Ken, if you feel like you'd like some help getting such an initiative
going, and contributing substantively to it (not necessarily in terms of
code) please feel free to reach out to me directly (***@gmail.com).

Hoping this leads somewhere positive, that benefits everyone,

-Jason



On Wed, Feb 21, 2018 at 2:53 PM, Kenneth Brotman <
***@yahoo.com.invalid> wrote:

> Hi Akash,
>
> I get the part about outside work which is why in replying to Jeff Jirsa I
> was suggesting the big companies could justify taking it on easy enough and
> you know actually pay the people who would be working at it so those people
> could have a life.
>
> The part I don't get is the aversion to usability. Isn't that what you
> think about when you are coding? "Am I making this thing I'm building easy
> to use?" If you were programming for me, we would be constantly talking
> about what we are building and how we can make things easier for users. If
> I had to fight with a developer, architect or engineer about usability all
> the time, they would be gone and quick. How do you approach programming if you
> aren't trying to make things easy?
>
> Kenneth Brotman
>
Chris Lohfink
2018-02-21 23:17:03 UTC
Permalink
Instead of saying "Make X better" you can quantify "Here's how we can make X better" in a jira and the conversation will continue with interested parties (opening jiras is free!). Being combative and insulting the project on the mailing list may help vent some frustrations, but it is counterproductive and makes people defensive.

People are not averse to usability, quite the opposite actually. People do tend to be averse to conversations opened up with "cassandra is an idiot" with no clear definition of how to make it better or what a better solution would look like, though. Note however that saying "make backups better" or "look at marketing literature for these guys" is hard for an engineer or architect to break into actionable items. Coming up with cool ideas on how to do something will more likely hook a developer into working on it than trying to shame the community with a sales pitch from another DB's sales guy.

Chris

> On Feb 21, 2018, at 4:53 PM, Kenneth Brotman <***@yahoo.com.INVALID> wrote:
>
> Hi Akash,
>
> I get the part about outside work which is why in replying to Jeff Jirsa I was suggesting the big companies could justify taking it on easy enough and you know actually pay the people who would be working at it so those people could have a life.
>
> The part I don't get is the aversion to usability. Isn't that what you think about when you are coding? "Am I making this thing I'm building easy to use?" If you were programming for me, we would be constantly talking about what we are building and how we can make things easier for users. If I had to fight with a developer, architect or engineer about usability all the time, they would be gone and quick. How do you approach programming if you aren't trying to make things easy?
>
> Kenneth Brotman
>
> -----Original Message-----
> From: Akash Gangil [mailto:***@gmail.com]
> Sent: Wednesday, February 21, 2018 2:24 PM
> To: ***@cassandra.apache.org
> Cc: ***@cassandra.apache.org
> Subject: Re: Cassandra Needs to Grow Up by Version Five!
>
> I would second Jon in the arguments he made. Contributing outside work is draining and really requires a lot of commitment. If someone requires features around usability etc, just pay for it, period.
>
> On Wed, Feb 21, 2018 at 2:20 PM, Kenneth Brotman < ***@yahoo.com.invalid> wrote:
>
>> Jon,
>>
>> Very sorry that you don't see the value of the time I'm taking for this.
>> I don't have demands; I do have a stern warning and I'm right Jon.
>> Please be very careful not to mischaracterize my words Jon.
>>
>> You suggest I put things in JIRA's, then seem to suggest that I'd be
>> lucky if anyone looked at it and did anything. That's what I figured too.
>>
>> I don't appreciate the hostility. You will understand more fully in
>> the next post where I'm coming from. Try to keep the conversation civilized.
>> I'm trying or at least so you understand I think what I'm doing is
>> saving your gig and mine. I really like a lot of people in this group.
>>
>> I've come to a preliminary assessment on things. Soon the cloud will
>> clear or I'll be gone. Don't worry. I'm a very peaceful person and
>> like you I am driven by real important projects that I feel compelled
>> to work on for the good of others. I don't have time for people to
>> hand hold a database and I can't get stuck with my projects on the wrong stuff.
>>
>> Kenneth Brotman
>>
>>
>> -----Original Message-----
>> From: Jon Haddad [mailto:***@gmail.com] On Behalf Of Jon
>> Haddad
>> Sent: Wednesday, February 21, 2018 12:44 PM
>> To: ***@cassandra.apache.org
>> Cc: ***@cassandra.apache.org
>> Subject: Re: Cassandra Needs to Grow Up by Version Five!
>>
>> Ken,
>>
>> Maybe it’s not clear how open source projects work, so let me try to
>> explain. There’s a bunch of us who either get paid by someone or
>> volunteer on our free time. The folks that get paid, (yay!) usually
>> take direction on what the priorities are, and work on projects that
>> directly affect our jobs. That means that someone needs to care
>> enough about the features you want to work on them, if you’re not going to do it yourself.
>>
>> Now as others have said already, please put your list of demands in
>> JIRA, if someone is interested, they will work on it. You may need to
>> contribute a little more than you’ve done already, be prepared to get
>> involved if you actually want to see something get done. Perhaps
>> learning a little more about Cassandra’s internals and the people
>> involved will reveal some of the design decisions and priorities of the project.
>>
>> Third, you seem to be a little obsessed with market share. While
>> market share is fun to talk about, *most* of us that are working on
>> and contributing to Cassandra do so because it does actually solve a
>> problem we have, and solves it reasonably well. If some magic open
>> source DB appears out of nowhere and does everything you want
>> Cassandra to, and is bug free, keeps your data consistent,
>> automatically does backups, comes with really nice cert management, ad
>> hoc querying, amazing materialized views that are perfect, no caveats
>> to secondary indexes, and somehow still gives you linear scalability
>> without any mental overhead whatsoever then sure, people might start
>> using it. And that’s actually OK, because if that happens we’ll all
>> be incredibly pumped out of our minds because we won’t have to work as
>> hard. If on the slim chance that doesn’t manifest, those of us that
>> use Cassandra and are part of the community will keep working on the
>> things we care about, iterating, and improving things. Maybe someone will even take a look at your JIRA issues.
>>
>> Further filling the mailing list with your grievances will likely not
>> help you progress towards your goal of a Cassandra that’s easier to
>> use, so I encourage you to try to be a little more productive and try
>> to help rather than just complain, which is not constructive. I did a
>> quick search for your name on the mailing list, and I’ve seen very
>> little from you, so to everyone who’s been around for a while and
>> trying to help you it looks like you’re just some random dude asking
>> for people to work for free on the things you’re asking for, without offering anything back in return.
>>
>> Jon
>>
>>
>>> On Feb 21, 2018, at 11:56 AM, Kenneth Brotman <***@yahoo.com.INVALID> wrote:
>>>
>>> Josh,
>>>
>>> To say nothing is indifference. If you care about your community,
>>> sometimes don't you have to bring up a subject even though you know
>>> it's also temporarily adding some discomfort?
>>>
>>> As to opening a JIRA, I've got a very specific topic to try in mind
>>> now. An easy one I'll work on and then announce. Someone else will
>>> have to do the coding. A year from now I would probably just knock it
>>> out to make sure it's as easy as I expect it to be but to be honest,
>>> as I've been saying, I'm not set up to do that right now. I've barely
>>> looked at any Cassandra code; for one; everyone on this list probably
>>> codes more than I do, secondly; and lastly, it's a good one for
>>> someone that wants an easy one to start with: vNodes. I've already
>>> seen too many people seeking assistance with the vNode setting.
>>>
>>> And you can expect as others have been mentioning that there should
>>> be similar ones on compaction, repair and backup.
>>>
>>> Microsoft knows poor usability gives them an easy market to take over.
>>> And they make it easy to switch.
>>>
>>> Beginning at 4:17 in the video, it says the following:
>>>
>>> "You don't need to worry about replica sets, quorum or read
>>> repair. You can focus on writing correct application logic."
>>>
>>> At 4:42, it says:
>>> "Hopefully this gives you a quick idea of how seamlessly you
>>> can bring your existing Cassandra applications to Azure Cosmos DB. No
>>> code changes are required. It works with your favorite Cassandra
>>> tools and drivers including for example native Cassandra driver for
>>> Spark. And it takes seconds to get going, and it's elastically and globally scalable."
>>>
>>> More to come,
>>>
>>> Kenneth Brotman
>>>
>>> -----Original Message-----
>>> From: Josh McKenzie [mailto:***@apache.org]
>>> Sent: Wednesday, February 21, 2018 8:28 AM
>>> To: ***@cassandra.apache.org
>>> Cc: User
>>> Subject: Re: Cassandra Needs to Grow Up by Version Five!
>>>
>>> There's a disheartening amount of "here's where Cassandra is bad,
>>> and here's what it needs to do for me for free" happening in this thread.
>>>
>>> This is open-source software. Everyone is *strongly encouraged* to
>>> submit a patch to move the needle on *any* of these things being
>>> complained about in this thread.
>>>
>>> For the Apache Way <https://www.apache.org/foundation/governance/>
>>> to work, people need to step up and meaningfully contribute to a project
>>> to scratch their own itch instead of just waiting for a random
>>> corporation-subsidized engineer to happen to have interests that align
>>> with them and contribute that to the project.
>>>
>>> Beating a dead horse for things everyone on the project knows are
>>> serious pain points is not productive.
>>>
>>> On Wed, Feb 21, 2018 at 5:45 AM, Oleksandr Shulgin <***@zalando.de> wrote:
>>>
>>>> On Mon, Feb 19, 2018 at 10:01 AM, Kenneth Brotman <
>>>> ***@yahoo.com.invalid> wrote:
>>>>
>>>>>
>>>>>>> Cluster wide management should be a big theme in any next major
>>>>>>> release.
>>>>>>>
>>>>>> Na. Stability and testing should be a big theme in the next major
>>>>>> release.
>>>>>>
>>>>>
>>>>> Double Na on that one Jeff. I think you have a concern there
>>>>> about the need to test sufficiently to ensure the stability of the
>>>>> next major release. That makes perfect sense - for every release,
>>>>> especially the major ones. Continuous improvement is not a phase
>>>>> of development for example. CI should be in everything, in every
>>>>> phase. Stability and testing a part of every release not just one.
>>>>> A major release should be a
>>>>> nice step from the previous major release though.
>>>>>
>>>>
>>>> I guess what Jeff refers to is the tick-tock release cycle
>>>> experiment, which has proven to be a complete disaster by popular
>>>> opinion.
>>>>
>>>> There's also the "materialized views" feature which failed to
>>>> materialize in the end (pun intended) and had to be declared
>>>> experimental retroactively.
>>>>
>>>> Another prominent example is incremental repair which was
>>>> introduced as the default option in 2.2 and now is not recommended
>>>> to use because of so many corner cases where it can fail. So again
>>>> experimental as an afterthought.
>>>>
>>>> Not to mention that even if you are aware of the default
>>>> incremental and go with full repair instead, you're still up for a sad surprise:
>>>> anti-compaction will be triggered despite the "full" repair.
>>>> Because anti-compaction is only disabled in case of sub-range
>>>> repair (don't ask why), so you need to use something advanced like
>>>> Reaper if you want to avoid that. I don't think you'll ever find
>>>> this in the documentation.
>>>>
>>>> Honestly, for an eventually-consistent system like Cassandra
>>>> anti-entropy repair is one of the most important pieces to get right.
>>>> And Cassandra fails really badly on that one: the feature is not
>>>> really well designed, poorly implemented and under-documented.
>>>>
>>>> In a summary, IMO, Cassandra is a poor implementation of some good
>>>> ideas.
>>>> It is a collection of hacks, not features. They sometimes play
>>>> together accidentally, and rarely by design.
>>>>
>>>> Regards,
>>>> --
>>>> Alex
>>>>
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-***@cassandra.apache.org
>>> For additional commands, e-mail: user-***@cassandra.apache.org
>>>
>
>
> --
> Akash
>
kurt greaves
2018-02-21 23:45:18 UTC
Permalink
>
> Instead of saying "Make X better" you can quantify "Here's how we can make
> X better" in a jira and the conversation will continue with interested
> parties (opening jiras are free!). Being combative and insulting project on
> mailing list may help vent some frustrations but it is counter productive
> and makes people defensive.

Yep. In the Cassandra project you'll have a very hard time convincing
someone else (under someone else's pay) to work on what you want even if you
approach it in the right way. Being assertive/aggressive is sure to remove
all chances entirely.
OSS for such large projects as Cassandra only works if we have a variety of
perspectives all working on the project together, as it's not very feasible
for volunteers to get into the C* project on their own time (nor will it
ever be). At the moment we don't have enough different perspectives working
on the project and the only way to improve that is to get involved (preferably
by writing some code).

I have to disagree with people here and point out that just creating JIRAs
and (trying to) have discussions about these issues will not lead to change
in any reasonable timeframe, because everyone who could do the work has an
endless list of bigger fish to fry. I strongly encourage you to get
involved and write some code, or pay someone to do it, because to put it
bluntly, it's *very* unlikely your JIRAs will get actioned unless you
contribute significantly to them yourself.

Of course there are other ways to contribute as well, but by far the
most effective would be to contribute fixes; the next most effective would
be to contribute documentation and help users on the mailing list. Your
Slender Cassandra project is a great example of this, because despite C*
being hard to administer, it would give a lot of users examples to work
off. If people can get it working properly with the right advice, usability
is not such a big issue.
Sylvain Lebresne
2018-02-22 09:17:04 UTC
Permalink
>
> I have to disagree with people here and point out that just creating
> JIRA's and (trying to) have discussions about these issues will not lead to
> change in any reasonable timeframe, because everyone who could do the work
> has an endless list of bigger fish to fry. I strongly encourage you to get
> involved and write some code, or pay someone to do it, because to put it
> bluntly, it's *very* unlikely your JIRA's will get actioned unless you
> contribute significantly to them yourself.
>

Though I don't truly disagree with the overall point that getting into code
is the surest way to see progress on something you care about, I'd love
for this to not be understood as "we don't care about your idea unless you
bring code". There have been tons of JIRA tickets in the past suggesting
improvements where some contributor said "you know what, that's a good
idea" and implemented it. I've certainly seen it happen numerous times and
trust I did it a lot as well (and sure, it happens disproportionately more
for small improvements than for lets-rewrite-the-whole-database ones, for
obvious reasons hopefully).

So if you have a relatively concrete idea for an improvement, I'd say,
please, share it. Don't get me wrong though, please do your homework first
and take a few minutes googling/JIRA searching to see if that hasn't been
discussed already; don't assume your time is more valuable than that of other
contributors. It's rude to assume so (I'd say in general, but even more so
because it's free-as-in-beer software).

That said, and to paraphrase what others have said, one should always come
to this with a few understandings:
- For all that people may like your idea and have the time to help it get
in, there is no guarantee here. And yes, more often than not, contributors
already have a list of things they want to fix and only a finite amount of
time for contributions, so the bar for your idea to make it onto some other
contributor's "list" is probably high. And remember that behavioral science
strongly suggests that you thinking your ideas are obviously the most
important ones likely involves a fair amount of bias. That's why
contributing the code yourself, if possible, definitely helps a lot.
- A distributed database is not exactly a simple piece of software. In particular,
Cassandra makes the choice to be fully distributed, which is a clear
trade-off: it gives it very interesting properties (scalability, fault
tolerance, ...) almost for free, but it makes some things quite a bit more
challenging. My point being, some things may look like easy problems to
solve on the surface, but are in fact more complex than they appear (which
in turn means solving them takes much more time than it seems, and we get
back to contribution time/effort not being infinite). So it's imo a good idea
to seek first to understand why things are a certain way rather than assume
that contributors don't care.
- Cassandra is not perfect, no software is, but don't assume contributors
are not aware of the weaknesses. We are for the most part. So if those
weaknesses are still there, it's generally (there are of course exceptions)
due to some combination of 1) a lack of time, 2) the difficulties of
solving those weaknesses (without creating new, worse ones) and 3) some
actually well-thought-out trade-off (we accept that weakness as the price for
other strengths). As such, if you come simply pointing out deficiencies, you
may feel like you are pointing out things nobody knows, but chances are, you
aren't. You're probably just reminding contributors how frustrating it is
they don't have time to solve everything. Pointing out deficiencies is ok, but
unless you take the time to offer some constructive steps to improve as
well, it's often useless to be honest.

--
Sylvain
Jacques-Henri Berthemet
2018-02-22 08:52:28 UTC
Permalink
Rahul Singh
2018-02-22 16:55:56 UTC
Permalink
There’s always a reason to complain if you aren’t paying for something. There’s always a reason to complain if you are paying for something.

TLDR; If you want to help curate / organize / gather knowledge about Cassandra, send me an email. I’d love to solve at least the knowledge management problem.

Complaining itself is not a solution or a step in the right direction. Defining an issue helps by identifying specifically what the pain is and a decision can be made to resolve or not resolve it.

Over the last few days I’ve seen a lot of commentary but few suggestions for actual tangible outcomes and actions.

I have been nurturing a knowledge management project for our own firm for some time now and currently curate tons of links / blogs / slideshares, etc. I want to take that a step further and try to do something equivalent to “Planet Cassandra 2.0”, which would be more than a stream of blog posts and would have the following goals:

1. A Canonical Topic tree of Cassandra topics that most of us give a shit about across Architecture, Development, Administration, DevOps
2. Content that is primarily curated from existing current knowledge out there in the form of blog posts, mailing list archive answers, JIRA tickets, slides, YouTube videos, etc.
3. A community driven approach to cull / throw out old content or flag it for approval where we may say that something is no longer relevant.

Here are two of my own contributions:

1. A blog post + slide deck assembled through standing on the shoulders of content from eBay, The Last Pickle, Open Credo, etc. (I gave this talk at a DC Cassandra meetup a few days ago)

https://blog.anant.us/common-problems-cassandra-data-models/

2. A screenshot of my mock for organizing all of my cassandra / data processing / whatever technology knowledge links. The actual project is open source @ http://www.github.com/appleseed/leaves.lite and it initially started as a fork of some “awesome lists” https://github.com/anant/awesome-cassandra https://github.com/anant/awesome-solr https://github.com/anant/awesome-lucene




Would love to get help and make a great resource for the world’s Apache Cassandra community. Best,

--
Rahul Singh
***@anant.us

Anant Corporation


Carl Mueller
2018-02-23 22:54:32 UTC
Permalink
Isn't a github markdown site about the easiest collaborative platform
there is for stuff like this? I'm not saying the end product will knock
anyone's socks off.

On Thu, Feb 22, 2018 at 10:55 AM, Rahul Singh <***@gmail.com>
wrote:

> There’s always a reason to complain if you aren’t paying for something.
> There’s always a reason to complain if you are paying for something.
>
> *TLDR; If you want to help curate / organize / gather knowledge about
> Cassandra, send me an email. I’d love to solve at least the knowledge
> management problem. *
>
> Complaining itself is not a solution or a step in the right direction.
> Defining an issue helps by identifying specifically what the pain is and a
> decision can be made to resolve or not resolve it.
>
> Over the last few days I’ve seen a lot of commentary but few suggestions
> for actual tangible outcomes and actions.
>
> I have been nurturing a knowledge management project for our own firm for
> some time now and currently curate tons of links / blogs / slideshares,
> etc. I want to take that a step further and try to do something equivalent
> to “Planet Cassandra 2.0”, which would be more than a stream of blog posts
> and would have the following goals:
>
> 1. A Canonical Topic tree of Cassandra topics that most of us give a shit
> about across Architecture, Development, Administration, DevOps
> 2. Content that is primarily curated from existing current knowledge out
> there in the form of blog posts, mailing list archive answers, JIRA
> tickets, slides, YouTube videos, etc.
> 3. A community driven approach to cull / throw out old content or flag it
> for approval where we may say that something is no longer relevant.
>
> Here are two of my own contributions:
>
> 1. A blog post + slide deck assembled by standing on the shoulders of
> content from Ebay, The Last Pickle, Open Credo, etc. (I gave this
> talk at a DC Cassandra meetup a few days ago)
>
> https://blog.anant.us/common-problems-cassandra-data-models/
>
> 2. A screenshot of my mock for organizing all of my Cassandra / data
> processing / whatever technology knowledge links. The actual project is
> open source @ http://www.github.com/appleseed/leaves.lite and it
> initially started as a fork of some “awesome lists”:
> https://github.com/anant/awesome-cassandra
> https://github.com/anant/awesome-solr
> https://github.com/anant/awesome-lucene
>
> Would love to get help and make a great resource for the world’s Apache
> Cassandra community.
>
> Best,
>
> --
> Rahul Singh
> ***@anant.us
>
> Anant Corporation
>
Rahul Singh
2018-02-24 00:56:50 UTC
Permalink
That’s why I started with it! There are several “good” sources, and among those probably some “awesome” resources, which is what the awesome movement started by sindresorhus captured by coming up with the “awesome” list format. But it also has its limitations.

Personally, having worked with hundreds of content and knowledge management systems to help solve the knowledge problem, I’m coming into this thinking about what is best for the learner and user, not just the collaborator / editor.

For example, even the search on the DataStax docs site sucks. Totally sucks. You have to Google to find something on the DataStax site.

Good documentation and a UI that gets the user to the “perpetual intermediate” level that Alan Cooper talks about can ensure continued user and community growth.

I’m not saying that we need to rebuild Google for Cassandra knowledge, but we definitely need something better than what we have now.

--
Rahul Singh
***@anant.us

Anant Corporation

Kenneth Brotman
2018-02-24 14:29:10 UTC
Permalink
Rahul,

I was just thinking last night how weird it is that the DataStax web site, of all places, would lack really slick search capabilities. The Apache web sites should strive for that too from now on, as it’s getting pretty easy to implement. But the DataStax web site! That should be an elite example of what search is capable of now.

Good point, Rahul! Right on!

Kenneth Brotman



From: Rahul Singh [mailto:***@gmail.com]
Sent: Friday, February 23, 2018 4:57 PM
To: ***@cassandra.apache.org
Subject: Re: Gathering / Curating / Organizing Cassandra Best Practices & Patterns



That’s why I started with it! There are several “good” sources and from those here probably “awesome” resources and that’s why the awesome movement by sindresorhus captured that with coming up with the “awesome” list format. But it also has its limitations.

Personally having worked with hundreds of content and knowledge management systems to help solve the knowledge problem, I’m coming in to this with the thought of what is the best for the learner and user not just the collaborator / editor.

For example the search even on the DataStax docs site sucks. Totally sucks. You have to google to get something off the DataStax Site.

Good documentation and UI to get the user to the “perpetual intermediate” that Alan Cooper talks about can ensure continued user and community growth.

I’m not saying that we need to rebuild Google for the Cassandra knowledge but we need to definitely have something better than what we have now.


--
Rahul Singh
***@anant.us

Anant Corporation


On Feb 23, 2018, 5:56 PM -0500, Carl Mueller <***@smartthings.com>, wrote:



Isn't a github markdown site about the most easiest collaborative platform there is for stuff like this? I'm not saying the end product will knock anyone's socks off.



On Thu, Feb 22, 2018 at 10:55 AM, Rahul Singh <***@gmail.com> wrote:

There’s always a reason to complain if you aren’t paying for something. There’s always a reason to complain if you are paying for something.



TLDR; If you want to help curate / organize / gather knowledge about Cassandra, send me an email. I’d love to solve at least the knowledge management problem.

Complaining itself is not a solution or a step in the right direction. Defining an issue helps by identifying specifically what the pain is and a decision can be made to resolve or not resolve it.

Over the last few days I’ve seen a lot of commentary but few suggestions for actual tangible outcomes and actions.



I have been nurturing a knowledge management project for our own firm for some time now and currently curate tons of links / blogs / slideshares, etc. I want to take that a step further and try to do something equivalent to “Planet Cassandra 2.0”, which would be more than a stream of blog posts and would have the following goals.



1. A Canonical Topic tree of Cassandra topics that most of us give a shit about across Architecture, Development, Administration, DevOps

2. Content that is primarily curated from existing current knowledge out there in the form of blog posts, mailing list archive answers, JIRA tickets, slides, YouTube videos, etc.

3. A community driven approach to cull / throw out old content or flag it for approval where we may say that something is no longer relevant.



Here are two of my own contributions:



1. A blog post + slide deck assembled through standing on the shoulders of content from eBay, The Last Pickle, Open Credo, etc. (I gave this talk at a DC Cassandra meetup a few days ago.)



https://blog.anant.us/common-problems-cassandra-data-models/



2. A screenshot of my mock for organizing all of my cassandra / data processing / whatever technology knowledge links. The actual project is open source @ http://www.github.com/appleseed/leaves.lite and it initially started as a fork of some “awesome lists” https://github.com/anant/awesome-cassandra https://github.com/anant/awesome-solr https://github.com/anant/awesome-lucene









Would love to get help and make a great resource for the world’s Apache Cassandra community. Best,


--
Rahul Singh
***@anant.us

Anant Corporation



On Feb 21, 2018, 5:53 PM -0500, Kenneth Brotman <***@yahoo.com.invalid>, wrote:



Hi Akash,

I get the part about outside work, which is why in replying to Jeff Jirsa I was suggesting the big companies could justify taking it on easily enough and, you know, actually pay the people who would be working on it so those people could have a life.

The part I don't get is the aversion to usability. Isn't that what you think about when you are coding? "Am I making this thing I'm building easy to use?" If you were programming for me, we would be constantly talking about what we are building and how we can make things easier for users. If I had to fight with a developer, architect or engineer about usability all the time, they would be gone, and quick. How do you approach programming if you aren't trying to make things easy?

Kenneth Brotman

-----Original Message-----
From: Akash Gangil [mailto:***@gmail.com]
Sent: Wednesday, February 21, 2018 2:24 PM
To: ***@cassandra.apache.org
Cc: ***@cassandra.apache.org
Subject: Re: Cassandra Needs to Grow Up by Version Five!

I would second Jon in the arguments he made. Contributing outside work is draining and really requires a lot of commitment. If someone requires features around usability etc, just pay for it, period.

On Wed, Feb 21, 2018 at 2:20 PM, Kenneth Brotman < ***@yahoo.com.invalid> wrote:




Jon,

Very sorry that you don't see the value of the time I'm taking for this.
I don't have demands; I do have a stern warning and I'm right Jon.
Please be very careful not to mischaracterize my words Jon.

You suggest I put things in JIRA's, then seem to suggest that I'd be
lucky if anyone looked at it and did anything. That's what I figured too.

I don't appreciate the hostility. You will understand more fully in
the next post where I'm coming from. Try to keep the conversation civilized.
I'm trying, or at least, so you understand, I think what I'm doing is
saving your gig and mine. I really like a lot of people in this group.

I've come to a preliminary assessment on things. Soon the cloud will
clear or I'll be gone. Don't worry. I'm a very peaceful person and
like you I am driven by real important projects that I feel compelled
to work on for the good of others. I don't have time for people to
hand hold a database and I can't get stuck with my projects on the wrong stuff.

Kenneth Brotman


-----Original Message-----
From: Jon Haddad [mailto:***@gmail.com] On Behalf Of Jon Haddad
Sent: Wednesday, February 21, 2018 12:44 PM
To: ***@cassandra.apache.org
Cc: ***@cassandra.apache.org
Subject: Re: Cassandra Needs to Grow Up by Version Five!

Ken,

Maybe it’s not clear how open source projects work, so let me try to
explain. There’s a bunch of us who either get paid by someone or
volunteer in our free time. The folks that get paid (yay!) usually
take direction on what the priorities are, and work on projects that
directly affect our jobs. That means that someone needs to care
enough about the features you want to work on them, if you’re not going to do it yourself.

Now as others have said already, please put your list of demands in
JIRA, if someone is interested, they will work on it. You may need to
contribute a little more than you’ve done already, be prepared to get
involved if you actually want to see something get done. Perhaps
learning a little more about Cassandra’s internals and the people
involved will reveal some of the design decisions and priorities of the project.

Third, you seem to be a little obsessed with market share. While
market share is fun to talk about, *most* of us that are working on
and contributing to Cassandra do so because it does actually solve a
problem we have, and solves it reasonably well. If some magic open
source DB appears out of nowhere and does everything you want
Cassandra to, and is bug free, keeps your data consistent,
automatically does backups, comes with really nice cert management, ad
hoc querying, amazing materialized views that are perfect, no caveats
to secondary indexes, and somehow still gives you linear scalability
without any mental overhead whatsoever then sure, people might start
using it. And that’s actually OK, because if that happens we’ll all
be incredibly pumped out of our minds because we won’t have to work as
hard. If on the slim chance that doesn’t manifest, those of us that
use Cassandra and are part of the community will keep working on the
things we care about, iterating, and improving things. Maybe someone will even take a look at your JIRA issues.

Further filling the mailing list with your grievances will likely not
help you progress towards your goal of a Cassandra that’s easier to
use, so I encourage you to try to be a little more productive and try
to help rather than just complain, which is not constructive. I did a
quick search for your name on the mailing list, and I’ve seen very
little from you, so to everyone who’s been around for a while and
trying to help you it looks like you’re just some random dude asking
for people to work for free on the things you’re asking for, without offering anything back in return.

Jon





On Feb 21, 2018, at 11:56 AM, Kenneth Brotman <***@yahoo.com.INVALID> wrote:




Josh,

To say nothing is indifference. If you care about your community,
sometimes don't you have to bring up a subject even though you know
it's also temporarily adding some discomfort?

As to opening a JIRA, I've got a very specific topic to try in mind
now. An easy one I'll work on and then announce. Someone else will
have to do the coding. A year from now I would probably just knock it
out to make sure it's as easy as I expect it to be but to be honest,
as I've been saying, I'm not set up to do that right now. I've barely
looked at any Cassandra code; for one; everyone on this list probably
codes more than I do, secondly; and lastly, it's a good one for
someone that wants an easy one to start with: vNodes. I've already
seen too many people seeking assistance with the vNode setting.

And you can expect as others have been mentioning that there should
be similar ones on compaction, repair and backup.

Microsoft knows poor usability gives them an easy market to take over.
And they make it easy to switch.

Beginning at 4:17 in the video, it says the following:

"You don't need to worry about replica sets, quorum or read
repair. You can focus on writing correct application logic."

At 4:42, it says:

"Hopefully this gives you a quick idea of how seamlessly you can
bring your existing Cassandra applications to Azure Cosmos DB. No
code changes are required. It works with your favorite Cassandra
tools and drivers including for example native Cassandra driver for
Spark. And it takes seconds to get going, and it's elastically and globally scalable."




More to come,

Kenneth Brotman

-----Original Message-----
From: Josh McKenzie [mailto:***@apache.org]
Sent: Wednesday, February 21, 2018 8:28 AM
To: ***@cassandra.apache.org
Cc: User
Subject: Re: Cassandra Needs to Grow Up by Version Five!

There's a disheartening amount of "here's where Cassandra is bad,
and here's what it needs to do for me for free" happening in this thread.

This is open-source software. Everyone is *strongly encouraged* to
submit a patch to move the needle on *any* of these things being
complained about in this thread.

For the Apache Way <https://www.apache.org/foundation/governance/> to
work, people need to step up and meaningfully contribute to a project
to scratch their own itch instead of just waiting for a random
corporation-subsidized engineer to happen to have interests that align
with them and contribute that to the project.

Beating a dead horse for things everyone on the project knows are
serious pain points is not productive.




On Wed, Feb 21, 2018 at 5:45 AM, Oleksandr Shulgin <***@zalando.de> wrote:

On Mon, Feb 19, 2018 at 10:01 AM, Kenneth Brotman <***@yahoo.com.invalid> wrote:








Cluster wide management should be a big theme in any next major
release.

Na. Stability and testing should be a big theme in the next major
release.

Double Na on that one Jeff. I think you have a concern there
about the need to test sufficiently to ensure the stability of the
next major release. That makes perfect sense, for every release,
especially the major ones. Continuous improvement is not a phase
of development for example. CI should be in everything, in every
phase. Stability and testing are a part of every release, not just one.
A major release should be a nice step from the previous major
release though.

I guess what Jeff refers to is the tick-tock release cycle
experiment, which has proven to be a complete disaster by popular
opinion.

There's also the "materialized views" feature which failed to
materialize in the end (pun intended) and had to be declared
experimental retroactively.

Another prominent example is incremental repair which was
introduced as the default option in 2.2 and now is not recommended
to use because of so many corner cases where it can fail. So again
experimental as an afterthought.

Not to mention that even if you are aware of the default
incremental and go with full repair instead, you're still up for a sad surprise:
anti-compaction will be triggered despite the "full" repair.
Because anti-compaction is only disabled in case of sub-range
repair (don't ask why), so you need to use something advanced like
Reaper if you want to avoid that. I don't think you'll ever find
this in the documentation.

Honestly, for an eventually-consistent system like Cassandra
anti-entropy repair is one of the most important pieces to get right.
And Cassandra fails really badly on that one: the feature is not
really well designed, poorly implemented and under-documented.

In a summary, IMO, Cassandra is a poor implementation of some good
ideas.

It is a collection of hacks, not features. They sometimes play
together accidentally, and rarely by design.

Regards,
--
Alex



---------------------------------------------------------------------
To unsubscribe, e-mail: user-***@cassandra.apache.org
For additional commands, e-mail: user-***@cassandra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-***@cassandra.apache.org
For additional commands, e-mail: dev-***@cassandra.apache.org






--
Akash


---------------------------------------------------------------------
To unsubscribe, e-mail: user-***@cassandra.apache.org
For additional commands, e-mail: user-***@cassandra.apache.org
Kenneth Brotman
2018-02-24 18:16:29 UTC
Permalink
To Rahul,



This is your official email (just from me as an individual) requesting your assistance to help solve the knowledge management problem. I can appreciate the work you put into the Awesome Cassandra list. It is difficult to keep everything up to date. I’ve been there too.



The golden trophy, if you want to do the absolute best thing, is a full-fledged professional development initiative for Cassandra. From an instructional design view, what you do is create a body of knowledge and an exhaustive list of competencies, which some call KSAs (knowledge, skills and abilities); then you do a gap analysis to find the areas in practice where gaps exist between the competencies desired and those of practitioners; then you generate a mix of media for different learning styles in a structured, properly sequenced series of easy-to-work-through steps complete with apperception exercises, and everyone will then have a smooth path towards mastery. It’s that easy.



So, yes let’s turn it up a few notches.



Thank you,



Kenneth Brotman




Kenneth Brotman
2018-02-24 18:42:22 UTC
Permalink
Any efforts described below should align with, complement, enhance, and fill in the outstanding work of DataStax Academy.



Kenneth Brotman



From: Kenneth Brotman [mailto:***@yahoo.com]
Sent: Saturday, February 24, 2018 10:16 AM
To: '***@cassandra.apache.org'
Subject: RE: Gathering / Curating / Organizing Cassandra Best Practices & Patterns



To Rahul,



This is your official email (just from me as an individual) requesting your assistance to help solve the knowledge management problem. I can appreciate the work you put into the Awesome Cassandra list. It is difficult to keep everything up to date. I’ve been there too.



The golden trophy, if you want to do the absolute best thing, is a full-fledged professional development initiative for Cassandra. From an instructional design view, what you do is create a body of knowledge and an exhaustive list of competencies, which some call KSAs (knowledge, skills and abilities); then you do a gap analysis to find the areas in practice where gaps exist between the competencies desired and those of practitioners; then you generate a mix of media for different learning styles in a structured, properly sequenced series of easy-to-work-through steps complete with apperception exercises, and everyone will then have a smooth path towards mastery. It’s that easy.



So, yes let’s turn it up a few notches.



Thank you,



Kenneth Brotman




Jon Haddad
2018-02-24 18:44:26 UTC
Permalink
DataStax Academy is great, but no, no work needs to be or should be aligned with it. DataStax is an independent company trying to make a profit; they could yank their docs at any time. There’s a reason why we started doing the docs in-tree: there was too much of a reliance on DS documentation.

DataStax isn’t Cassandra.

> On Feb 24, 2018, at 10:42 AM, Kenneth Brotman <***@yahoo.com.INVALID> wrote:
>
> Any efforts described below should be aligned with, complement, enhance, fill in the outstanding work of DataStax Academy.
>
> Kenneth Brotman
>
> From: Kenneth Brotman [mailto:***@yahoo.com]
> Sent: Saturday, February 24, 2018 10:16 AM
> To: '***@cassandra.apache.org'
> Subject: RE: Gathering / Curating / Organizing Cassandra Best Practices & Patterns
>
> To Rahul,
>
> This is your official email (just from me as an individual) requesting your assistance to help solve the knowledge management problem. I can appreciate the work you put into the Awesome Cassandra list. It is difficult to keep everything up to date. I’ve been there too.
>
> The golden trophy, if you want to do the absolute best thing, is a full-fledged professional development initiative for Cassandra. From an instructional design view, what you do is create a body of knowledge and an exhaustive list of competencies, which some call KSAs (knowledge, skills and abilities); then you do a gap analysis to find the areas in practice where gaps exist between the competencies desired and those of practitioners; then you generate a mix of media for different learning styles in a structured, properly sequenced series of easy-to-work-through steps complete with apperception exercises, and everyone will then have a smooth path towards mastery. It’s that easy.
>
> So, yes let’s turn it up a few notches.
>
> Thank you,
>
> Kenneth Brotman
Kenneth Brotman
2018-02-24 20:08:54 UTC
Permalink
Hey Jon,



If that was the issue the whole time, it’s a big nothing to fix. All DataStax and the Apache Foundation ever had to do, and it’s really, really easy, is execute a property rights sharing agreement that makes everyone comfortable and protects each party from being controlled by the other. Super, super easy stuff to work out WHEN you have two parties that want it to work out. If they would just do that, we could go back to being one big healthy family. I could work that out with them. I’ve done this type of thing before. I’m not kidding; it’s really easy. Just so you know. Just for the record. Just in case the right people are following along.



Kenneth Brotman



From: Jon Haddad [mailto:***@gmail.com] On Behalf Of Jon Haddad
Sent: Saturday, February 24, 2018 10:44 AM
To: ***@cassandra.apache.org
Subject: Re: Gathering / Curating / Organizing Cassandra Best Practices & Patterns



DataStax Academy is great, but no, no work needs to be or should be aligned with it. DataStax is an independent company trying to make a profit; they could yank their docs at any time. There’s a reason why we started doing the docs in-tree: there was too much reliance on DS documentation.



DataStax isn’t Cassandra.





Kenneth Brotman
2018-02-24 20:57:32 UTC
Permalink
Jon,



This is considered the start of the problem: https://www.mail-archive.com/***@cassandra.apache.org/msg09050.html



That’s according to this well sourced article called “Fear of Staxit: What next for ASF’s Cassandra as biggest donor cuts back” https://www.theregister.co.uk/2016/11/14/datastax_versus_asf_staxeit/



I am one of the people who didn’t know the history and is now, as this article describes, caught between “A Rock and a hard place”:

http://www.zdnet.com/article/a-rock-and-a-hard-place-between-scylladb-and-cassandra/



I bet it’s been painful for everyone. It’s really sad.



Kenneth Brotman



From: Jon Haddad [mailto:***@gmail.com] On Behalf Of Jon Haddad
Sent: Saturday, February 24, 2018 12:26 PM
To: Kenneth Brotman
Subject: Re: Gathering / Curating / Organizing Cassandra Best Practices & Patterns



I really don’t want to continue this discussion any further on the ML, because I used to work at DataStax and I’d rather not have this turn into a mess. Take a look at the closed JIRAs and git history: they’ve mostly pulled out of Apache Cassandra development and ship their own fork. They are done contributing docs as well.



https://www.datastax.com/2016/11/serving-customers-serving-the-community



Any discussion on the matter is a waste of time, so this is the last email from me on the topic.



Jon







Kenneth Brotman
2018-02-24 23:29:15 UTC
Permalink
If you read the email message, the first link below, you’ll see that it’s a well-intentioned Apache Foundation board member who could not grasp how our community functioned. The Apache Foundation messed up our community by the way they handled a routine inquiry, leaving no option for DataStax but to seek legal counsel. I’ve been there. Your own legal counsel deals the final blow. They tell you all communication has to go through them. They tell you there has to be clear separation. They say you have to take their advice or they will not keep defending you and you will not have any personal protection. Anyone can be sued, and you will be liable for defending yourself. Sound familiar?



Everyone kept saying that everything was good, that the community, our community, liked the way things worked.



I call on the Apache Foundation to reach out to DataStax and fix the mess forthwith! Report openly on your efforts. You can fix your mess, Apache Foundation. This email says it all. A total miscall: https://www.mail-archive.com/***@cassandra.apache.org/msg09090.html. And the guy has a PhD!



Kenneth Brotman



From: Kenneth Brotman [mailto:***@yahoo.com.INVALID]
Sent: Saturday, February 24, 2018 12:58 PM
To: ***@cassandra.apache.org
Subject: RE: Gathering / Curating / Organizing Cassandra Best Practices & Patterns



Jon,



This is considered the start of the problem: https://www.mail-archive.com/***@cassandra.apache.org/msg09050.html



That’s according to this well sourced article called “Fear of Staxit: What next for ASF’s Cassandra as biggest donor cuts back” https://www.theregister.co.uk/2016/11/14/datastax_versus_asf_staxeit/



I am one of the people who didn’t know the history and is now, as this article describes, caught between “A Rock and a hard place”:

http://www.zdnet.com/article/a-rock-and-a-hard-place-between-scylladb-and-cassandra/



I bet it’s been painful for everyone. It’s really sad.



Kenneth Brotman
Kenneth Brotman
2018-02-25 14:45:39 UTC
Permalink
Chris Mattmann acted without authority and completely improperly as an Apache Software Foundation board member, because a board member on their own has no authority. Their authority is to participate and vote at board meetings. They are not allowed to transact business; they are not supposed to force themselves on anyone or order anyone around. The one who was acting controlling was this idiot board member who has caused this situation between DataStax and the rest of our community.



Furthermore, he instructed Cassandra legend Jonathan Ellis, the Cassandra PMC Chair, to include certain information in a report to the Apache Software Foundation board, which escalated the matter to something that was before the board.



I am not an attorney and this should not be taken as legal advice!



It is clear to me, as someone who is experienced and trained as a board member, that Chris Mattmann and the ASF itself probably will find themselves in court over this. I think a lot of folks should raise this matter with their legal counsel.



What happened is not trivial. It is newsworthy. I suggest people talk to the media about this story. Ask them to investigate and report the story.



Is APC interfering with other communities?



Kenneth Brotman





Eric Evans
2018-02-26 17:15:34 UTC
Permalink
On Sun, Feb 25, 2018 at 8:45 AM, Kenneth Brotman <
***@yahoo.com.invalid> wrote:

> Chris Mattmann acted without authority and completely improperly as an
> Apache Software Foundation board member as a board member on their own has
> no authority. Their authority is to participate and vote at board
> meetings. They are not allowed to transact business, they are not supposed
> to force themselves on anyone or order anyone around. The one that was
> acting controlling was this idiot board member that has caused this
> situation between DataStax and the rest of our community.
>
>
>
> Furthermore, when he instructed Cassandra legend Jonathan Ellis, the
> Cassandra PMC Chair to include certain information in a report to the
> Apache Software Foundation board that escalated the matter to something
> that was before the board.
>
>
>
> I am not an attorney and this should not be taken as legal advice!
>
>
>
> It is clear to me as one someone who is experienced and trained as a board
> member that Chris Mattmann and the ASF itself probably will find themselves
> in court over this. I think a lot of folks should raise this matter with
> their legal counsel.
>
>
>
> What happened is not trivial. It is news worthy. I suggest people talk
> to the media about this story. Ask them to investigate and report the
> story.
>
>
>
> Is APC interfering with other communities?
>

Kenneth, I really think you need to pump the brakes here. You're leveling
some pretty serious accusations, and have now resorted to personal attacks.
This is not constructive.





--
Eric Evans
***@gmail.com
Kenneth Brotman
2018-02-26 17:35:32 UTC
Permalink
I got caught in the middle of this stuff. I feel for everyone. I gave my two cents. I had to vent. I’m back to concentrating on helping the group.



Kenneth Brotman



Eric Plowe
2018-02-26 21:13:53 UTC
Permalink
Kenneth,

How did you get "caught in the middle" of this "stuff"? You are the one
bringing it up. Also, your tone switched from calling Chris a "well
intended ASF" board member to calling him an "idiot". He asked a perfectly
reasonable question, and then other questions followed as a result. If you
want to contribute to the community, please start by being respectful to
all members of the community.

Regards,

Eric Plowe

Kenneth Brotman
2018-02-26 21:25:33 UTC
Permalink
Eric,



My tone changed as I studied the thread in more detail. He began with a well-intended but ill-advised inquiry, and a very public inquiry at that, which itself was problematic. It’s not a board member’s place to throw their weight around like that. That’s board member training 101. Not his job. He stepped in it. Go through staff. Very poorly handled. I’ll give him the benefit of the doubt that he meant well. We have a problem. It must be fixed.



As to getting caught in the middle, I will let you ponder that. I have to help get Cassandra out of Document Hell!!!



Kenneth Brotman





From: Eric Plowe [mailto:***@gmail.com]
Sent: Monday, February 26, 2018 1:14 PM
To: ***@cassandra.apache.org
Subject: Re: Gathering / Curating / Organizing Cassandra Best Practices & Patterns



Kenneth,



How did you get "caught in the middle" of this "stuff"? You are the one bringing it up? Also, your tone switched between calling Chris a "well intended ASF" board member, to calling him an "idiot". He asked a perfectly reasonable question, and then other questions followed as a result. If you want to contribute to the community, please start by being respectful to all members of the community.



Regards,



Eric Plowe



On Mon, Feb 26, 2018 at 12:35 PM Kenneth Brotman <***@yahoo.com.invalid> wrote:

I got caught in the middle of this stuff. I feel for everyone. I said my two cents. I had to vent. I’m back to concentrating on helping the group.



Kenneth Brotman



From: Eric Evans [mailto:***@gmail.com]
Sent: Monday, February 26, 2018 9:16 AM
To: ***@cassandra.apache.org
Subject: Re: Gathering / Curating / Organizing Cassandra Best Practices & Patterns







On Sun, Feb 25, 2018 at 8:45 AM, Kenneth Brotman <***@yahoo.com.invalid> wrote:

Chris Mattmann acted without authority and completely improperly as an Apache Software Foundation board member as a board member on their own has no authority. Their authority is to participate and vote at board meetings. They are not allowed to transact business, they are not supposed to force themselves on anyone or order anyone around. The one that was acting controlling was this idiot board member that has caused this situation between DataStax and the rest of our community.



Furthermore, when he instructed Cassandra legend Jonathan Ellis, the Cassandra PMC Chair to include certain information in a report to the Apache Software Foundation board that escalated the matter to something that was before the board.



I am not an attorney and this should not be taken as legal advice!



It is clear to me as one someone who is experienced and trained as a board member that Chris Mattmann and the ASF itself probably will find themselves in court over this. I think a lot of folks should raise this matter with their legal counsel.



What happened is not trivial. It is news worthy. I suggest people talk to the media about this story Ask them to investigate and report the story.



Is APC interfering with other communities?



Kenneth, I really think you need to pump the brakes here. You're leveling some pretty serious accusations, and have now resorted to personal attacks; This is not constructive.



From: Kenneth Brotman [mailto:***@yahoo.com.INVALID]
Sent: Saturday, February 24, 2018 3:29 PM
To: ***@cassandra.apache.org
Subject: RE: Gathering / Curating / Organizing Cassandra Best Practices & Patterns
Importance: High



If you read the email message, the first link below, you’ll see that it’s a well intending Apache Foundation board member who could not grasp how our community functioned. Apache Foundation messed up our community by the way they handled a routine inquiry, leaving no option for DataStax but to seek legal counsel. I’ve been there. Your own legal counsel deal the final blow. They tell you all communication has to go through them. They tell you there has to be clear separation. They say you have to take their advice or they will not keep defending you and you will not any personal protection. Anyone can be sued and you will be liable for defending yourself. Sound familiar!



Everyone kept saying that everything was good. That the community, our community liked the way things worked.



I call on Apache Foundation to reach out to DataStax and fix the mess forthwith! Report openly on your efforts. You can fix your mess Apache Foundation. This email says it all. A total miscall: https://www.mail-archive.com/***@cassandra.apache.org/msg09090.html. And the guy has a PhD!



Kenneth Brotman



From: Kenneth Brotman [mailto:***@yahoo.com.INVALID]
Sent: Saturday, February 24, 2018 12:58 PM
To: ***@cassandra.apache.org
Subject: RE: Gathering / Curating / Organizing Cassandra Best Practices & Patterns



Jon,



This is considered the start of the problem: https://www.mail-archive.com/***@cassandra.apache.org/msg09050.html



That’s according to this well-sourced article called “Fear of Staxit: What next for ASF’s Cassandra as biggest donor cuts back”: https://www.theregister.co.uk/2016/11/14/datastax_versus_asf_staxeit/



I am one of the people who didn’t know the history and is now, as this article describes, caught between “a rock and a hard place”:

http://www.zdnet.com/article/a-rock-and-a-hard-place-between-scylladb-and-cassandra/



I bet it’s been painful for everyone. It’s really sad.



Kenneth Brotman




--

Eric Evans
***@gmail.com
Russell Bateman
2018-02-20 17:50:30 UTC
Permalink
I ask Cassandra to be a database that is high-performance, highly
scalable with no single point of failure. Anything "cool" that's added
beyond must be added only as a separate, optional ring around Cassandra
and must not get in the way of my usage.

Yes, I would like some help with some of what's listed here, but you
should understand that most shops adopting Cassandra are already going
to have DevOps/database management personnel, expertise, methods,
protocols and, in some instances, tools already in place. Even the small
shop I work in has guys saddled with taking care of Cassandra (I'm a
developer and not one of these guys), and they seem not to share these
concerns because they've already got it covered (like the specific YAML
configuration complaint).

If there were an option or two I'd like to see, one would be the ability
to duplicate data centers exactly (as part of what we stipulate when
creating our KEYSPACE), but this is probably something I want because of
what we were doing up until or what we wanted when we adopted Cassandra
for our future product direction. I would also like to see an option in
Cassandra configuration for absolutely locking out access to certain
commands (like DROP TABLE, DROP INDEX and DELETE).

From my point of view as a developer, I've had to do many of these
things also for MongoDB, PostgreSQL, MySQL and other databases over my
career.

I'm not criticizing these concerns and suggestions. I'm just pointing
out that, in my opinion, not everything said here is in the realm of,
"duh, Cassandra needs to grow up."

There's so much right about Cassandra, from the great, unequaled
technology to the very liberal licensing model without which I could not
be here.

Russ Bateman


On 02/18/2018 10:39 PM, Kenneth Brotman wrote:
>
> Cassandra feels like an unfinished program to me.  The problem is not
> that it’s open source or cutting edge.  It’s an open source cutting
> edge program that lacks some of its basic functionality.  We are all
> stuck addressing fundamental mechanical tasks for Cassandra because
> the basic code that would do that part has not been contributed yet.
>
> Ease of use issues need to be given much more attention.  For an
> administrator, the ease of use of Cassandra is very poor.
>
> Furthermore, currently Cassandra is an idiot.  We have to do
> everything for Cassandra. Contrast that with the fact that we are in
> the dawn of artificial intelligence.
>
> Software exists to automate tasks for humans, not mechanize humans to
> administer tasks for a database.  I’m an engineering type.  My job is
> to apply science and technology to solve real world problems.  And
> that’s where I need an organization’s I.T. talent to focus; not in
> crank starting an unfinished database.
>
> For example, I should be able to go to any node, replace the
> Cassandra.yaml file and have a prompt on the display ask me if I want
> to update all the yaml files across the cluster.  I shouldn’t have to
> manually modify yaml files on each node or have to create a script for
> some third party automation tool to do it.
>
> I should not have to turn off service, clear directories, restart
> service in coordination with the other nodes.  It’s already a computer
> system.  It can do those things on its own.
>
> How about read repair.  First there is something wrong with the name. 
> Maybe it should be called Consistency Repair.  An administrator
> shouldn’t have to do anything.  It should be a behavior of Cassandra
> that is programmed in. It should consider the GC setting of each node,
> calculate how often it has to run repair, when it should run it so all
> the nodes aren’t trying at the same time and when other circumstances
> indicate it should also run it.
>
> Certificate management should be automated.
>
> Cluster wide management should be a big theme in any next major
> release. What is a major release?  How many major releases could a
> program have before all the coding for basic stuff like installation,
> configuration and maintenance is included!
>
> Finish the basic coding of Cassandra, make it easy to use for
> administrators, make is smart, add cluster wide management.  Keep
> Cassandra competitive or it will soon be the old Model T we all
> remember fondly.
>
> I ask the Committee to compile a list of all such items, make a plan,
> and commit to including the completed and tested code as part of major
> release 5.0.  I further ask that release 4.0 not be delayed and then
> there be an unusually short skip to version 5.0.
>
> Kenneth Brotman
>
Durity, Sean R
2018-02-21 18:54:06 UTC
Permalink
It is instructive to listen to the concerns of new and existing users in order to improve a product like Cassandra, but I think the school yard taunt model isn’t the most effective.

In my experience with open and closed source databases, there are always things that could be improved. Many have a historical base in how the product evolved over time. A newcomer sees those as rough edges right away. In other cases, the database creators have often widened their scope to try and solve every data problem. This creates the complexity of too many configuration options, etc. Even the best RDBMS (Informix!) battled these kinds of issues.

Cassandra, though, introduced another angle of difficulty. In trying to relate to RDBMS users (pun intended), it often borrowed terminology to make it seem familiar. But they don’t work the same way or even solve the same problems. The classic example is secondary indexes. For RDBMS, they are very useful; for Cassandra, they are anathema (except for very narrow cases).

However, I think the shots at Cassandra are generally unfair. When I started working with it, the DataStax documentation was some of the best documentation I had seen on any project, especially an open source one. (If anything, the cooling off between Apache Cassandra and DataStax may be the most serious misstep so far.) The more I learned about how Cassandra worked, the more I marveled at the clever combination of intricate solutions (gossip, merkle trees, compaction strategies, etc.) to solve specific data problems. This is a great product! It has given me lots of sleep-filled nights over the last 4+ years. My customers love it, once I explain what it should be used for (and what it shouldn’t). I applaud the contributors, whether coders or users. Thank you!

Finally, a note on backup. Backing up a distributed system is tough, but restores are even more complex (if you want no down-time, no extra disk space, point-in-time recovery, etc). If you want to investigate why it is a tough problem for Cassandra, go look at RecoverX from Datos IO. They have solved many of the problems, but it isn’t an easy task. You could ask people to try and recreate all that, or just point them to a working solution. If backup and recovery is required (and I would argue it isn’t always required), it is probably worth paying for.


Sean Durity
From: Josh McKenzie [mailto:***@apache.org]
Sent: Wednesday, February 21, 2018 11:28 AM
To: ***@cassandra.apache.org
Cc: User <***@cassandra.apache.org>
Subject: [EXTERNAL] Re: Cassandra Needs to Grow Up by Version Five!

There's a disheartening amount of "here's where Cassandra is bad, and here's what it needs to do for me for free" happening in this thread.

This is open-source software. Everyone is *strongly encouraged* to submit a patch to move the needle on *any* of these things being complained about in this thread.

For the Apache Way <https://www.apache.org/foundation/governance/> to work, people need to step up and meaningfully contribute to a project to scratch their own itch instead of just waiting for a random corporation-subsidized engineer to happen to have interests that align with them and contribute that to the project.

Beating a dead horse for things everyone on the project knows are serious pain points is not productive.

On Wed, Feb 21, 2018 at 5:45 AM, Oleksandr Shulgin <***@zalando.de> wrote:
On Mon, Feb 19, 2018 at 10:01 AM, Kenneth Brotman <***@yahoo.com.invalid> wrote:

>
> >> Cluster wide management should be a big theme in any next major release.
> >>
> >Na. Stability and testing should be a big theme in the next major release.
> >
>
> Double Na on that one, Jeff. I think you have a concern there about the
> need to test sufficiently to ensure the stability of the next major
> release. That makes perfect sense for every release, especially the
> major ones. Continuous improvement is not a phase of development, for
> example. CI should be in everything, in every phase. Stability and
> testing are a part of every release, not just one. A major release should be a
> nice step from the previous major release though.
>

I guess what Jeff refers to is the tick-tock release cycle experiment,
which has proven to be a complete disaster by popular opinion.

There's also the "materialized views" feature which failed to materialize
in the end (pun intended) and had to be declared experimental retroactively.

Another prominent example is incremental repair which was introduced as the
default option in 2.2 and now is not recommended to use because of so many
corner cases where it can fail. So again experimental as an afterthought.

Not to mention that even if you are aware of the default incremental and go
with full repair instead, you're still up for a sad surprise:
anti-compaction will be triggered despite the "full" repair. Because
anti-compaction is only disabled in case of sub-range repair (don't ask
why), so you need to use something advanced like Reaper if you want to
avoid that. I don't think you'll ever find this in the documentation.
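
To illustrate the sub-range approach mentioned above, here is a minimal Python sketch that splits one token range into smaller pieces and builds a full repair command line for each. It is only a sketch under stated assumptions: the keyspace name and the token values are made up, the caller is assumed to pass in a range that the node actually owns and that does not wrap around the ring, and the `-full`, `-st` and `-et` options are the nodetool repair flags for full repair and for the start and end token. A tool like Reaper does far more than this (it works from the cluster's real token ownership and handles scheduling and retries).

```python
def split_range(start, end, parts):
    """Split the token range (start, end] into up to `parts` contiguous sub-ranges.

    Wrap-around ranges (start >= end) are deliberately not handled here.
    """
    if parts < 1:
        raise ValueError("parts must be >= 1")
    if start >= end:
        raise ValueError("wrap-around or empty ranges are not supported")
    span = end - start
    # Integer arithmetic keeps the boundaries exact for 64-bit tokens.
    bounds = [start + span * i // parts for i in range(parts + 1)]
    # Drop empty pieces, which appear when parts exceeds the range width.
    return [(a, b) for a, b in zip(bounds[:-1], bounds[1:]) if a < b]


def repair_commands(keyspace, start, end, parts):
    """Build one full, sub-range `nodetool repair` command line per piece."""
    return [
        f"nodetool repair -full -st {a} -et {b} {keyspace}"
        for a, b in split_range(start, end, parts)
    ]


if __name__ == "__main__":
    # Hypothetical keyspace name and token range, for illustration only.
    for cmd in repair_commands("my_keyspace", 0, 1000, 4):
        print(cmd)
```

The commands are only printed, not executed, so each one can be reviewed and run one at a time and the nodes are not all repairing at once.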

Honestly, for an eventually-consistent system like Cassandra anti-entropy
repair is one of the most important pieces to get right. And Cassandra
fails really badly on that one: the feature is not really well designed,
poorly implemented and under-documented.

In summary, IMO, Cassandra is a poor implementation of some good ideas.
It is a collection of hacks, not features. They sometimes play together
accidentally, and rarely by design.

Regards,
--
Alex


Oleksandr Shulgin
2018-02-22 07:55:08 UTC
Permalink
On Wed, Feb 21, 2018 at 7:54 PM, Durity, Sean R <***@homedepot.com> wrote:

>
>
> However, I think the shots at Cassandra are generally unfair. When I
> started working with it, the DataStax documentation was some of the best
> documentation I had seen on any project, especially an open source one.
>

Oh, don't get me started on documentation, especially the DataStax one. I
come from Postgres. In comparison, Cassandra documentation is mostly
non-existent (and this is just a way to avoid listing other uncomfortable
epithets).

Not sure if I would be able to submit patches to improve that, however,
since most of the time it would require me to already know the answer to my
questions when the doc is incomplete.

The move from DataStax to Apache.org for docs is actually good, IMO, since
the docs were maintained very poorly and there was no real leverage to
influence that.

Cheers,
--
Alex
Eric Plowe
2018-02-22 08:50:13 UTC
Permalink
Cassandra, hard to use? I disagree completely. With that said, there are
definitely deficiencies in certain parts of the documentation, but nothing
that is a show stopper. We’ve been using Cassandra since the sub 1.0 days
and have had nothing but great things to say about it.

With that said, it's an open source project; you get from it what you’re
willing to put in. If you just expect something that installs, asks a
couple of questions and you’re off to the races, Cassandra might not be for
you.

If you’re willing to put in the time to understand how Cassandra works, and
how it fits into your use case, and if it is the right fit for your use
case, you’ll be more than happy, I bet.

If there are things that are lacking, that you can’t find a work around
for, submit a PR! That’s the beauty of open source projects.

On Thu, Feb 22, 2018 at 2:55 AM Oleksandr Shulgin <
***@zalando.de> wrote:

> On Wed, Feb 21, 2018 at 7:54 PM, Durity, Sean R <
> ***@homedepot.com> wrote:
>
>>
>>
>> However, I think the shots at Cassandra are generally unfair. When I
>> started working with it, the DataStax documentation was some of the best
>> documentation I had seen on any project, especially an open source one.
>>
>
> Oh, don't get me started on documentation, especially the DataStax one. I
> come from Postgres. In comparison, Cassandra documentation is mostly
> non-existent (and this is just a way to avoid listing other uncomfortable
> epithets).
>
> Not sure if I would be able to submit patches to improve that, however,
> since most of the time it would require me to already know the answer to my
> questions when the doc is incomplete.
>
> The move from DataStax to Apache.org for docs is actually good, IMO, since
> the docs were maintained very poorly and there was no real leverage to
> influence that.
>
> Cheers,
> --
> Alex
>
>
Oleksandr Shulgin
2018-02-22 09:17:54 UTC
Permalink
On Thu, Feb 22, 2018 at 9:50 AM, Eric Plowe <***@gmail.com> wrote:

> Cassandra, hard to use? I disagree completely. With that said, there are
> definitely deficiencies in certain parts of the documentation, but nothing
> that is a show stopper.


True, there are no show-stoppers from the docs side, it's just all those
little things--they add up.

> We’ve been using Cassandra since the sub 1.0 days and have had nothing but
> great things to say about it.
>
> With that said, it's an open source project; you get from it what you’re
> willing to put in. If you just expect something that installs, asks a
> couple of questions and you’re off to the races, Cassandra might not be for
> you.
>
> If you’re willing to put in the time to understand how Cassandra works,
> and how it fits into your use case, and if it is the right fit for your use
> case, you’ll be more than happy, I bet.
>

We have been using Cassandra since v2.1, for more than 2 years now, and
installing was never a problem. It does work and allows us to sleep well,
which should not be underappreciated.

The problems begin when you need to do operations. You never know what
exactly will happen when you start a certain repair command or how the
streaming will happen in case of bootstrap/rebuild, and the docs just
aren't detailed enough, so you have to go the trial and error path most of
the time.

Regards,
--
Alex
Kenneth Brotman
2018-02-23 17:54:38 UTC
Permalink
A sincere thank you to everyone who replied. I will heavy lift the docs for a while, do my Slender Cassandra reference project, and then I’ll try to find one or two areas where I can contribute code to get going on that.

I'll have a few JIRAs started by the end of the workday.

Kenneth Brotman


---------------------------------------------------------------------
To unsubscribe, e-mail: user-***@cassandra.apache.org
For additional commands, e-mail: user-***@cassandra.apache.org