Database Administrators
Q&A for database professionals who wish to improve their database skills
Latest Questions
0
votes
1
answers
127
views
DSE Spark not able to find Cassandra tables
I started Spark with the dse spark command.
Then I created an RDD from a Cassandra keyspace and table.
When I try to print the table's contents using rddname.first,
it fails with an error saying the keyspace or table cannot be found.
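For context, a minimal sketch of how a Cassandra-backed RDD is typically created inside the dse spark shell (the keyspace and table names here are placeholders, not taken from the question):

```scala
// Runs inside the `dse spark` shell, where `sc` is already provided.
import com.datastax.spark.connector._

// cassandraTable resolves the keyspace/table lazily: a misspelled name,
// wrong case, or a shell not connected to the cluster typically surfaces
// only when an action such as first() or count() runs.
val rdd = sc.cassandraTable("my_ks", "my_table")
println(rdd.first())
```

Because resolution is lazy, a "keyspace or table not found" error at rddname.first usually points at the names passed to cassandraTable or at the node the shell connected to, rather than at the first() call itself.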
Anil Kumar yellapu
(1 rep)
May 12, 2023, 08:54 AM
• Last activity: Sep 26, 2024, 06:08 AM
0
votes
1
answers
286
views
spark-cassandra-connector read throughput unpredictable
A user reports that the range query throughput is far higher than expected when setting spark.cassandra.input.readsPerSec in the spark-cassandra-connector.
Job dependencies (the Java driver version is pinned to 4.13.0):
<dependency>
    <groupId>com.datastax.spark</groupId>
    <artifactId>spark-cassandra-connector_2.12</artifactId>
    <version>3.2.0</version>
</dependency>
<dependency>
    <groupId>com.datastax.oss</groupId>
    <artifactId>java-driver-core-shaded</artifactId>
    ...
</dependency>
<dependency>
    <groupId>com.datastax.oss</groupId>
    <artifactId>java-driver-core</artifactId>
    <version>4.13.0</version>
</dependency>
There are two steps in the job (both full table scans):
Dataset<Row> dataset = sparkSession.sqlContext().read()
    .format("org.apache.spark.sql.cassandra")
    .option("table", "inbox_user_msg_dummy")
    .option("keyspace", "ssmp_inbox2")
    .load();
-and-
Dataset<Row> olderDataset = sparkSession.sql("SELECT * FROM inbox_user_msg_dummy WHERE app_uuid = 'cb663e07-7bcc-4039-ae97-8fb8e8a9ff77' AND " +
    "create_hour = token(G9e7Y4Y, 2023-08-10T04:17:27.234Z, cb663e07-7bcc-4039-ae97-8fb8e8a9ff77) AND token(user_id, create_hour, app_uuid) <= 9121832956220923771 LIMIT 10");
FWIW, the average partition size is 649 bytes and the maximum is 2.7 KB.
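One possible explanation (an assumption, not confirmed by the post): the spark.cassandra.input.readsPerSec throttle is applied per executor core, so the cluster-wide throughput a job actually achieves scales with the total number of cores running tasks. A back-of-the-envelope sketch with hypothetical executor and core counts:

```scala
// Sketch: estimate the effective cluster-wide read limit when
// spark.cassandra.input.readsPerSec is enforced per executor core.
// The executor/core numbers below are illustrative assumptions.
object ReadsPerSecEstimate {
  def clusterWideLimit(readsPerSecPerCore: Long,
                       numExecutors: Int,
                       coresPerExecutor: Int): Long =
    readsPerSecPerCore * numExecutors * coresPerExecutor

  def main(args: Array[String]): Unit = {
    // 100 reads/s/core across 4 executors with 8 cores each
    println(clusterWideLimit(100L, 4, 8)) // 3200 reads/s cluster-wide
  }
}
```

If this per-core behavior applies, a limit that looks small in the config can still produce an unexpectedly high aggregate range-query rate on a large cluster.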
Paul
(416 rep)
Nov 7, 2023, 07:56 PM
• Last activity: Nov 8, 2023, 02:07 PM
0
votes
2
answers
592
views
Sporadic write failures with UnauthorizedException: Unable to perform authorization of super-user permission: Cannot achieve consistency level QUORUM
I'm using Apache Spark to write data to a Cassandra cluster. The deployment is Kubernetes-based, and I'm using the Cassandra Helm chart. Sporadically, I encounter a SparkException that aborts the job, as detailed below:
> Caused by: com.datastax.oss.driver.api.core.servererrors.UnauthorizedException: Unable to perform
authorization of permissions: Unable to perform authorization of super-user permission: Cannot
achieve consistency level QUORUM
Additional details on the Cassandra cluster:
Datacenter: datacenter1
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.x.x.x 10.98 GiB 256 64.7% blahblah-c2e0509a03 rack1
UN 10.x.x.x 12.17 GiB 256 69.7% blahblah-a617-4dfbcdb999aa rack1
UN 10.x.x.x 12.6 GiB 256 65.6% blahblah-9d4f-9111f4ae55a3
I have already ensured that the system_auth keyspace is replicated to all of these nodes, yet the issue still appears intermittently. I'd appreciate any insight into why this might be happening and how to resolve it.
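For reference, a commonly recommended shape for that replication setting (the datacenter name and replication factor below are assumptions taken from the nodetool output above; adjust them to the actual topology) uses NetworkTopologyStrategy rather than SimpleStrategy:

```cql
-- Assumed datacenter name 'datacenter1' and RF 3; adjust as needed.
ALTER KEYSPACE system_auth
  WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': 3};
```

After changing the replication settings, the change only takes full effect once the keyspace is repaired, e.g. by running nodetool repair system_auth on each node; until then, auth reads at QUORUM can still intermittently fail to find enough replicas.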
Pro
(101 rep)
Sep 4, 2023, 03:56 PM
• Last activity: Sep 15, 2023, 05:34 AM
Showing page 1 of 3