Database Administrators
Q&A for database professionals who wish to improve their database skills
Latest Questions
0
votes
0
answers
36
views
How to set up continuous DB replication from one MS SQL Server to another MS SQL Server without reconfiguring the source machine?
I am looking for a way to set up continuous data replication from one Microsoft SQL Server to another Microsoft SQL Server, both in AWS. The source tables are updated continuously. The challenge here is that I don't want to enable CDC on the source DB.
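One family of approaches that leaves the source untouched is a target-driven incremental pull. The sketch below is only illustrative and assumes things the question doesn't state: the source table has a ROWVERSION column (rv), the target can reach the source through a linked server named SRC, and a watermark table tracks the last copied version; deletes are not captured this way:
DECLARE @last binary(8) =
    (SELECT last_rv FROM dbo.sync_watermark WHERE table_name = 'dbo.Orders');

MERGE dbo.Orders_copy AS t                         -- rv on the copy is plain binary(8), not ROWVERSION
USING (SELECT OrderId, CustomerId, Amount, rv
       FROM SRC.SourceDb.dbo.Orders                -- four-part name through the linked server
       WHERE rv > @last) AS s                      -- only rows changed since the last pull
    ON t.OrderId = s.OrderId
WHEN MATCHED THEN
    UPDATE SET t.CustomerId = s.CustomerId, t.Amount = s.Amount, t.rv = s.rv
WHEN NOT MATCHED THEN
    INSERT (OrderId, CustomerId, Amount, rv)
    VALUES (s.OrderId, s.CustomerId, s.Amount, s.rv);

UPDATE dbo.sync_watermark
SET last_rv = (SELECT MAX(rv) FROM dbo.Orders_copy)
WHERE table_name = 'dbo.Orders';
Scheduled from the target (for example as a SQL Agent job), this gives near-continuous copying without enabling CDC or replication components on the source, at the cost of polling latency and extra read load on the source.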
praloy infinios
(1 rep)
Apr 9, 2025, 07:48 AM
-1
votes
1
answers
409
views
io.debezium.DebeziumException: Client requested master to start replication from impossible position
I am using Kafka and Debezium to connect to a MariaDB instance.
I have a problem where, in the event that the MariaDB server loses power and shuts down abruptly, Debezium is no longer able to connect on restart.
I have this error in particular:
[2023-12-06 13:26:05,800] ERROR [192.168.108.1|task-0] Error during binlog processing. Last offset stored = null, binlog reader near position = mariadb-bin.000006/53938 (io.debezium.connector.mysql.MySqlStreamingChangeEventSource:1161)
io.debezium.DebeziumException: Client requested master to start replication from impossible position; the first event 'mariadb-bin.000006' at 53938, the last event read from 'mariadb-bin.000006' at 4, the last byte read from 'mariadb-bin.000006' at 4. Error code: 1236; SQLSTATE: HY000.
And on the MariaDB side I have this:
+--------------------+-----------+
| Log_name | File_size |
+--------------------+-----------+
| mariadb-bin.000001 | 330 |
| mariadb-bin.000002 | 330 |
| mariadb-bin.000003 | 67494 |
| mariadb-bin.000004 | 69007 |
| mariadb-bin.000005 | 126922 |
| mariadb-bin.000006 | 45056 |
| mariadb-bin.000007 | 103304 |
+--------------------+-----------+
7 rows in set (0.001 sec)
MariaDB [(none)]> show master status;
+--------------------+----------+--------------+------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+--------------------+----------+--------------+------------------+
| mariadb-bin.000007 | 103304 | | |
+--------------------+----------+--------------+------------------+
1 row in set (0.000 sec)
I have tried resetting offsets on the Kafka side, but I get the exact same error without fail. Does anyone know why exactly there is a problem and how to resolve it? All help appreciated, thanks so much!
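A quick way to see what the error is complaining about is to compare the offset Debezium stored (mariadb-bin.000006 at position 53938) with what the server still has: the binlog listing above shows mariadb-bin.000006 is only 45,056 bytes after the crash, so that position no longer exists. A minimal check on the MariaDB side, using only statements already implied by the output shown:
SHOW BINARY LOGS;                                     -- file sizes after the crash
SHOW BINLOG EVENTS IN 'mariadb-bin.000006' LIMIT 10;  -- confirms readable events restart at position 4
If the stored offset really points past the end of a truncated binlog, the server cannot serve it, so the connector's saved position has to be discarded (or a fresh snapshot taken) before streaming can resume.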
RedRum69
(1 rep)
Dec 6, 2023, 06:39 PM
• Last activity: Mar 15, 2025, 12:09 AM
0
votes
2
answers
51
views
Cassandra and Prometheus communication over docker compose
I am sorry to come back to this point, but it's still unsolved. Let me introduce the scenario. Working on a Linux machine (3.10.0-1160.el7.x86_64), I am using Docker Compose to deploy different containers such as Cassandra, Kafka, Prometheus, Grafana, etc. After deploying the Docker Compose file, all the containers appear to be running:
docker ps --format "{{.Names}}: {{.Status}}"
jenkins-schema-registry-1: Up 41 minutes
jenkins-broker-1: Up 41 minutes
jenkins-storage-1: Up 41 minutes
jenkins-prometheus-1: Up 41 minutes
jenkins-acs-1: Up 41 minutes
jenkins-zookeeper-1: Up 41 minutes
jenkins-logaggregator-1: Up 41 minutes
jenkins-grafana-1: Up 41 minutes
jenkins-loki-1: Up 41 minutes
jenkins-promtail-1: Up 41 minutes
jenkins-sql_acadacdb-1: Up 41 minutes (healthy)
jenkins-opcuasimulatoraas-1: Up 41 minutes
jenkins-opcuasimulatormon-1: Up 41 minutes
jenkins-logsimulator-1: Up 41 minutes
jenkins-hmi_redis-1: Up 41 minutes
jenkins-mysql-1: Up 41 minutes (healthy)
jenkins-mongo-1: Up 41 minutes
In detail, here is the network configuration:
docker ps --format "{{.ID}}: {{.Names}} -> {{.Networks}}"
bef94007b3cc: jenkins-schema-registry-1 -> jenkins_monitoring
8c5dd97de847: jenkins-broker-1 -> jenkins_monitoring
525a4d21f146: jenkins-storage-1 -> jenkins_monitoring
790f5a91013b: jenkins-prometheus-1 -> jenkins_monitoring
cd1e964deed8: jenkins-acs-1 -> jenkins_default
315859268aa9: jenkins-zookeeper-1 -> jenkins_monitoring
a7229f21f3c5: jenkins-logaggregator-1 -> jenkins_monitoring
8ee8483ad5a0: jenkins-grafana-1 -> jenkins_monitoring
29f552f4d239: jenkins-loki-1 -> jenkins_loki-net,jenkins_monitoring
c08294688cec: jenkins-promtail-1 -> jenkins_loki-net,jenkins_monitoring
e3cf072659f0: jenkins-sql_acadacdb-1 -> jenkins_default
cb78b00c13fe: jenkins-opcuasimulatoraas-1 -> jenkins_default
01d046b685c8: jenkins-opcuasimulatormon-1 -> jenkins_default
7a978478f082: jenkins-logsimulator-1 -> jenkins_default
cd981c617974: jenkins-hmi_redis-1 -> jenkins_default
0ef9bee718a4: jenkins-mysql-1 -> jenkins_default
6cc8588a3910: jenkins-mongo-1 -> jenkins_default
And the exposed ports for each container:
docker ps --format "{{.ID}}: {{.Names}} -> {{.Ports}}"
bef94007b3cc: jenkins-schema-registry-1 -> 0.0.0.0:32800->8081/tcp, :::32800->8081/tcp
8c5dd97de847: jenkins-broker-1 -> 0.0.0.0:7072->7072/tcp, :::7072->7072/tcp, 0.0.0.0:9091->9091/tcp, :::9091->9091/tcp, 9092/tcp
525a4d21f146: jenkins-storage-1 -> 0.0.0.0:7000-7001->7000-7001/tcp, :::7000-7001->7000-7001/tcp, 0.0.0.0:7199->7199/tcp, :::7199->7199/tcp, 0.0.0.0:9042->9042/tcp, :::9042->9042/tcp, 0.0.0.0:9100->9100/tcp, :::9100->9100/tcp, 0.0.0.0:9160->9160/tcp, :::9160->9160/tcp
790f5a91013b: jenkins-prometheus-1 -> 0.0.0.0:9090->9090/tcp, :::9090->9090/tcp
cd1e964deed8: jenkins-acs-1 ->
315859268aa9: jenkins-zookeeper-1 -> 2888/tcp, 3888/tcp, 0.0.0.0:32796->2181/tcp, :::32796->2181/tcp
a7229f21f3c5: jenkins-logaggregator-1 -> 0.0.0.0:32798->5044/tcp, :::32798->5044/tcp
8ee8483ad5a0: jenkins-grafana-1 -> 3000/tcp, 0.0.0.0:3210->3210/tcp, :::3210->3210/tcp
29f552f4d239: jenkins-loki-1 -> 0.0.0.0:3100->3100/tcp, :::3100->3100/tcp
c08294688cec: jenkins-promtail-1 ->
e3cf072659f0: jenkins-sql_acadacdb-1 -> 3306/tcp, 33060-33061/tcp
cb78b00c13fe: jenkins-opcuasimulatoraas-1 -> 0.0.0.0:32799->52522/tcp, :::32799->52522/tcp
01d046b685c8: jenkins-opcuasimulatormon-1 -> 0.0.0.0:32797->52520/tcp, :::32797->52520/tcp
7a978478f082: jenkins-logsimulator-1 ->
cd981c617974: jenkins-hmi_redis-1 -> 6379/tcp
0ef9bee718a4: jenkins-mysql-1 -> 3306/tcp, 33060-33061/tcp
6cc8588a3910: jenkins-mongo-1 -> 27017/tcp
Here are the prometheus, cassandra and kafka sections from the docker-compose file:
storage:
image: oci-reg-cta.zeuthen.desy.de/acada/loggingsystem/monstorage:lite
ports:
- "7000:7000" # Gossip communication
- "7001:7001" # Intra-node TLS
- "7199:7199" # JMX port
- "9042:9042" # Native transport
- "9160:9160" # Thrift service
- "9100:9100" # Prometheus JMX Exporter
volumes:
- storage_cassandra:/var/lib/cassandra
- ./jmx_prometheus_javaagent-0.15.0.jar:/opt/jmx_prometheus_javaagent.jar
- ./cassandra.yml:/opt/cassandra.yml
environment:
- JVM_OPTS=-javaagent:/opt/jmx_prometheus_javaagent.jar=0.0.0.0:7071:/opt/cassandra.yml -Dlog.level=DEBUG -Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=0.0.0.0 -Dcom.sun.management.jmxremote.rmi.port=7199 -Dcassandra.jmx.remote.port=7199
cap_add:
- SYS_ADMIN
security_opt:
- seccomp:unconfined
networks:
- monitoring
broker:
image: oci-reg-cta.zeuthen.desy.de/acada/confluentinc/cp-kafka:5.4.0
depends_on:
- zookeeper
ports:
- "7072:7072" # Porta per Prometheus JMX Exporter
- "9091:9091" #Porta RMI
environment:
KAFKA_BROKER_ID: 1
KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092
KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1 # must be set to 1 when running with a single-node cluster
KAFKA_DEFAULT_REPLICATION_FACTOR: 1 # cannot be larger than the number of brokers
KAFKA_NUM_PARTITIONS: 3 # default number of partitions per topic
KAFKA_OPTS: -javaagent:/opt/jmx_prometheus_javaagent.jar=0.0.0.0:7072:/opt/kafka.yml -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=0.0.0.0 -Dcom.sun.management.jmxremote.rmi.port=9091
KAFKA_CONFLUENT_SUPPORT_METRICS_ENABLE: "false"
#JVM_OPTS: -javaagent:/opt/jmx_prometheus_javaagent.jar=7072:/opt/kafka.yml
volumes:
- broker_data:/var/lib/kafka/data
- broker_secrets:/etc/kafka/secrets
- ./jmx_prometheus_javaagent-0.15.0.jar:/opt/jmx_prometheus_javaagent.jar
- ./kafka.yml:/opt/kafka.yml
networks:
- monitoring
prometheus:
image: prom/prometheus:v2.53.1
ports:
- "9090:9090" # Prometheus web interface and API (TCP)
volumes:
- type: bind
source: ./prometheus.yml
target: /etc/prometheus/prometheus.yml
networks:
- monitoring
And the .yaml files:
cassandra.yml
---
startDelaySeconds: 0
hostPort: 0.0.0.0:7199
username: xxxxx
password: xxxxx
ssl: false
lowercaseOutputName: false
lowercaseOutputLabelNames: false
whitelistObjectNames: ["org.apache.cassandra.metrics:*"]
rules:
- pattern: 'org.apache.cassandra.metrics<type=(\w+), name=(\w+)><>Value'
name: cassandra_$1_$2
type: GAUGE
labels:
mylabel: "myvalue"
help: "Cassandra metric $1 $2"
- pattern: 'org.apache.cassandra.metrics<type=(\w+), scope=(\w+), name=(\w+)><>Value'
name: cassandra_$1_$2_$3
type: GAUGE
help: "Cassandra metric $1 $2 $3"
- pattern: 'org.apache.cassandra.metrics<type=(\w+), keyspace=(\w+), scope=(\w+), name=(\w+)><>Value'
name: cassandra_$1_$2_$3_$4
type: GAUGE
help: "Cassandra metric $1 $2 $3 $4"
prometheus.yml
global:
scrape_interval: 25s
scrape_timeout: 25s
scrape_configs:
- job_name: 'prometheus'
static_configs:
- targets: ['localhost:9090'] # Prometheus server
- job_name: 'storage'
metrics_path: /metrics
static_configs:
- targets: ['storage:7071'] # Storage/Cassandra JMX exporter
- job_name: 'broker'
static_configs:
- targets: ['broker:7072'] # Broker/Kafka JMX exporter
kafka.yml
#jmxUrl: "service:jmx:rmi:///jndi/rmi://localhost:7072/jmxrmi"
lowercaseOutputName: true
rules:
- pattern: kafka.serverValue
name: kafka_server_fetcher_bytes_consumed_total
labels:
client_id: "$1"
Querying Prometheus about the status of its target containers, the situation looks normal:
curl -s http://localhost:9090/api/v1/targets | grep '"health":"up"'
{"status":"success","data":{"activeTargets":[{"discoveredLabels":{"__address__":"broker:7072","__metrics_path__":"/metrics","__scheme__":"http","__scrape_interval__":"25s","__scrape_timeout__":"25s","job":"broker"},"labels":{"instance":"broker:7072","job":"broker"},"scrapePool":"broker","scrapeUrl":"http://broker:7072/metrics ","globalUrl":"http://broker:7072/metrics ","lastError":"","lastScrape":"2025-01-28T06:13:23.641150824Z","lastScrapeDuration":0.089283854,"health":"up","scrapeInterval":"25s","scrapeTimeout":"25s"},{"discoveredLabels":{"__address__":"localhost:9090","__metrics_path__":"/metrics","__scheme__":"http","__scrape_interval__":"25s","__scrape_timeout__":"25s","job":"prometheus"},"labels":{"instance":"localhost:9090","job":"prometheus"},"scrapePool":"prometheus","scrapeUrl":"http://localhost:9090/metrics","globalUrl":"http://790f5a91013b:9090/metrics ","lastError":"","lastScrape":"2025-01-28T06:13:06.761760372Z","lastScrapeDuration":0.011126742,"health":"up","scrapeInterval":"25s","scrapeTimeout":"25s"},{"discoveredLabels":{"__address__":"storage:7071","__metrics_path__":"/metrics","__scheme__":"http","__scrape_interval__":"25s","__scrape_timeout__":"25s","job":"storage"},"labels":{"instance":"storage:7071","job":"storage"},"scrapePool":"storage","scrapeUrl":"http://storage:7071/metrics ","globalUrl":"http://storage:7071/metrics ","lastError":"","lastScrape":"2025-01-28T06:13:07.393065033Z","lastScrapeDuration":3.497353034,"health":"up","scrapeInterval":"25s","scrapeTimeout":"25s"}],"droppedTargets":[],"droppedTargetCounts":{"broker":0,"prometheus":0,"storage":0}}}
Checking the Cassandra metrics collected by Prometheus, I get (for example):
curl -s http://localhost:9090/api/v1/query?query=jvm_memory_bytes_used | grep 'result'
{"status":"success","data":{"resultType":"vector","result":[{"metric":{"__name__":"jvm_memory_bytes_used","area":"heap","instance":"broker:7072","job":"broker"},"value":[1738044886.425,"443680528"]},{"metric":{"__name__":"jvm_memory_bytes_used","area":"nonheap","instance":"broker:7072","job":"broker"},"value":[1738044886.425,"67814792"]},{"metric":{"__name__":"jvm_memory_bytes_used","area":"heap","instance":"storage:7071","job":"storage"},"value":[1738044886.425,"438304896"]},{"metric":{"__name__":"jvm_memory_bytes_used","area":"nonheap","instance":"storage:7071","job":"storage"},"value":[1738044886.425,"75872616"]}]}}
It seems Prometheus is working and communicating with Cassandra and Kafka.
But with the following commands (querying a specific port directly) I don't get any result, except for port 7072 ... and even that seems to be just generic Java metrics, nothing SPECIFIC to Kafka:
[common@muoni-wn-15 jenkins]$ curl -s http://localhost:7071/metrics
[common@muoni-wn-15 jenkins]$ curl -s http://localhost:7199/metrics
[common@muoni-wn-15 jenkins]$ curl -s http://localhost:7071/metrics
[common@muoni-wn-15 jenkins]$ curl -s http://localhost:9091/metrics
[common@muoni-wn-15 jenkins]$ curl -s http://localhost:7072/metrics
# HELP jmx_config_reload_success_total Number of times configuration have successfully been reloaded.
# TYPE jmx_config_reload_success_total counter
jmx_config_reload_success_total 0.0
# HELP jvm_classes_loaded The number of classes that are currently loaded in the JVM
# TYPE jvm_classes_loaded gauge
jvm_classes_loaded 6275.0
# HELP jvm_classes_loaded_total The total number of classes that have been loaded since the JVM has started execution
# TYPE jvm_classes_loaded_total counter
jvm_classes_loaded_total 6275.0
# HELP jvm_classes_unloaded_total The total number of classes that have been unloaded since the JVM has started execution
# TYPE jvm_classes_unloaded_total counter
jvm_classes_unloaded_total 0.0
So, I suspect there is some misconfiguration somewhere, because I am not sure that: 1. the JMX exporters are collecting information; 2. the information is what I want.
And also, more seriously, I get the following exceptions:
KAFKA -->
[common@muoni-wn-15 jenkins]$ docker compose -f docker-compose.yml.prometheus-cassandra.v4.yml exec broker bash -l
root@3b8e9d856d3f:/# kafka-topics --bootstrap-server localhost:29092 --list
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at sun.instrument.InstrumentationImpl.loadClassAndStartAgent(InstrumentationImpl.java:386)
at sun.instrument.InstrumentationImpl.loadClassAndCallPremain(InstrumentationImpl.java:401)
Caused by: java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at sun.net.httpserver.ServerImpl.<init>(ServerImpl.java:100)
at sun.net.httpserver.HttpServerImpl.<init>(HttpServerImpl.java:50)
at sun.net.httpserver.DefaultHttpServerProvider.createHttpServer(DefaultHttpServerProvider.java:35)
at com.sun.net.httpserver.HttpServer.create(HttpServer.java:130)
at io.prometheus.jmx.shaded.io.prometheus.client.exporter.HTTPServer.<init>(HTTPServer.java:179)
at io.prometheus.jmx.shaded.io.prometheus.jmx.JavaAgent.premain(JavaAgent.java:31)
... 6 more
FATAL ERROR in native method: processing of -javaagent failed
Aborted (core dumped)
CASSANDRA -->
[common@muoni-wn-15 jenkins]$ docker compose -f docker-compose.yml.prometheus-cassandra.v3.yml exec storage bash -l
root@a1c1e2c5e95c:/# nodetool status
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at sun.instrument.InstrumentationImpl.loadClassAndStartAgent(InstrumentationImpl.java:386)
at sun.instrument.InstrumentationImpl.loadClassAndCallPremain(InstrumentationImpl.java:401)
Caused by: java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:461)
at sun.nio.ch.Net.bind(Net.java:453)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:222)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:85)
at sun.net.httpserver.ServerImpl.<init>(ServerImpl.java:100)
at sun.net.httpserver.HttpServerImpl.<init>(HttpServerImpl.java:50)
at sun.net.httpserver.DefaultHttpServerProvider.createHttpServer(DefaultHttpServerProvider.java:35)
at com.sun.net.httpserver.HttpServer.create(HttpServer.java:130)
at io.prometheus.jmx.shaded.io.prometheus.client.exporter.HTTPServer.<init>(HTTPServer.java:179)
at io.prometheus.jmx.shaded.io.prometheus.jmx.JavaAgent.premain(JavaAgent.java:31)
... 6 more
FATAL ERROR in native method: processing of -javaagent failed
Aborted (core dumped)
I hope someone will help me get out of this dark tunnel ...
Best regards,
Emilio
Hello All,
recently I decided to simplify the Docker Compose file to identify the source of the error. This is the storage section:
storage:
image: oci-reg-cta.zeuthen.desy.de/acada/loggingsystem/monstorage:lite
ports:
- "1234:1234" # JMX Exporter HTTP port
- "7000:7000" # Gossip
- "7001:7001" # TLS
- "7199:7199" # JMX port (Cassandra native)
- "7198:7198" # RMI registry port (newly added)
- "9042:9042" # CQL
volumes:
- storage_cassandra:/var/lib/cassandra
- ./jmx_prometheus_javaagent-0.15.0.jar:/opt/jmx_prometheus_javaagent.jar
- ./cassandra.v5.yml:/opt/cassandra.v5.yml
environment:
- JVM_OPTS=
-javaagent:/opt/jmx_prometheus_javaagent.jar=1234:/opt/cassandra.v5.yml
-Dcassandra.jmx.remote.port=7199
-Dcom.sun.management.jmxremote.rmi.port=7198
-Djava.rmi.server.hostname=storage
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false
and this is the cassandra.v5.yml
startDelaySeconds: 0
lowercaseOutputName: false
lowercaseOutputLabelNames: false
whitelistObjectNames: ["org.apache.cassandra.metrics:*"]
rules:
- pattern: 'org.apache.cassandra.metrics<type=(\w+), name=(\w+)><>Value'
name: cassandra_$1_$2
type: GAUGE
help: "Cassandra metric $1 $2"
- pattern: 'org.apache.cassandra.metrics<type=(\w+), scope=(\w+), name=(\w+)><>Value'
name: cassandra_$1_$2_$3
type: GAUGE
help: "Cassandra metric $1 $2 $3"
And this is the prometheus.v5.yml
global:
scrape_interval: 25s
scrape_timeout: 25s
scrape_configs:
- job_name: 'prometheus'
static_configs:
- targets: ['localhost:9090'] # Prometheus server
- job_name: 'storage'
metrics_path: /metrics
static_configs:
- targets: ['storage:1234'] # Storage/Cassandra JMX exporter
Using this configuration I can scrape the Cassandra metrics exposed at http://localhost:1234/metrics, and there's no problem querying the Kafka container, as you can see:
[common@muoni-wn-15 jenkins]$ docker exec -it jenkins-broker-1 kafka-topics --bootstrap-server localhost:29092 --list
__confluent.support.metrics
_schemas
logStorage
But when I try to invoke the nodetool command inside the storage container, I always get the same error, no matter which port I use (the error is the same even if I don't specify the -p 7197 option or if I change the jmx_prometheus_javaagent.jar port):
[common@muoni-wn-15 jenkins]$ docker exec -it jenkins-storage-1 nodetool -p 7197 status
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at sun.instrument.InstrumentationImpl.loadClassAndStartAgent(InstrumentationImpl.java:386)
at sun.instrument.InstrumentationImpl.loadClassAndCallPremain(InstrumentationImpl.java:401)
Caused by: java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:461)
at sun.nio.ch.Net.bind(Net.java:453)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:222)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:85)
at sun.net.httpserver.ServerImpl.<init>(ServerImpl.java:100)
at sun.net.httpserver.HttpServerImpl.<init>(HttpServerImpl.java:50)
at sun.net.httpserver.DefaultHttpServerProvider.createHttpServer(DefaultHttpServerProvider.java:35)
at com.sun.net.httpserver.HttpServer.create(HttpServer.java:130)
at io.prometheus.jmx.shaded.io.prometheus.client.exporter.HTTPServer.<init>(HTTPServer.java:179)
at io.prometheus.jmx.shaded.io.prometheus.jmx.JavaAgent.premain(JavaAgent.java:31)
... 6 more
FATAL ERROR in native method: processing of -javaagent failed
Aborted (core dumped)
Please, I hope someone will help me get out of this tunnel ...
Hello again, I checked the logs as suggested by piotr:
root@8bf0bcba72c7:/var/log/cassandra# grep 1234 system.log
INFO [main] 2025-02-05 07:09:45,973 CassandraDaemon.java:507 - JVM Arguments: [-javaagent:/opt/jmx_prometheus_javaagent.jar=1234:/opt/cassandra.v5.yml, -Dcassandra.jmx.remote.port=7199, -Dcom.sun.management.jmxremote.rmi.port=7198, -Djava.rmi.server.hostname=storage, -Dcom.sun.management.jmxremote.authenticate=false, -Dcom.sun.management.jmxremote.ssl=false, -Xloggc:/opt/cassandra/logs/gc.log, -ea, -XX:+UseThreadPriorities, -XX:ThreadPriorityPolicy=42, -XX:+HeapDumpOnOutOfMemoryError, -Xss256k, -XX:StringTableSize=1000003, -XX:+AlwaysPreTouch, -XX:-UseBiasedLocking, -XX:+UseTLAB, -XX:+ResizeTLAB, -XX:+UseNUMA, -XX:+PerfDisableSharedMem, -Djava.net.preferIPv4Stack=true, -Xms1G, -Xmx1G, -XX:+UseParNewGC, -XX:+UseConcMarkSweepGC, -XX:+CMSParallelRemarkEnabled, -XX:SurvivorRatio=8, -XX:MaxTenuringThreshold=1, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:CMSWaitDuration=10000, -XX:+CMSParallelInitialMarkEnabled, -XX:+CMSEdenChunksRecordAlways, -XX:+CMSClassUnloadingEnabled, -XX:+PrintGCDetails, -XX:+PrintGCDateStamps, -XX:+PrintHeapAtGC, -XX:+PrintTenuringDistribution, -XX:+PrintGCApplicationStoppedTime, -XX:+PrintPromotionFailure, -XX:+UseGCLogFileRotation, -XX:NumberOfGCLogFiles=10, -XX:GCLogFileSize=10M, -Xmn2048M, -XX:+UseCondCardMark, -XX:CompileCommandFile=/etc/cassandra/hotspot_compiler, -javaagent:/opt/cassandra/lib/jamm-0.3.0.jar, -Dcassandra.jmx.local.port=7199, -Dcom.sun.management.jmxremote.authenticate=false, -Dcom.sun.management.jmxremote.password.file=/etc/cassandra/jmxremote.password, -Djava.library.path=/opt/cassandra/lib/sigar-bin, -Dcassandra.libjemalloc=/usr/local/lib/libjemalloc.so, -XX:OnOutOfMemoryError=kill -9 %p, -Dlogback.configurationFile=logback.xml, -Dcassandra.logdir=/opt/cassandra/logs, -Dcassandra.storagedir=/opt/cassandra/data, -Dcassandra-foreground=yes]
showing that the configuration file is loaded (if I am not wrong), and the process appears to be running and listening:
root@8bf0bcba72c7:/var/log/cassandra# ss -tulp
Netid State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
udp UNCONN 0 0 127.0.0.11:43202 0.0.0.0:*
tcp LISTEN 0 50 0.0.0.0:7198 0.0.0.0:*
tcp LISTEN 0 50 0.0.0.0:7199 0.0.0.0:*
tcp LISTEN 0 128 127.0.0.11:36235 0.0.0.0:*
tcp LISTEN 0 128 0.0.0.0:9042 0.0.0.0:*
tcp LISTEN 0 3 0.0.0.0:1234 0.0.0.0:*
tcp LISTEN 0 128 172.31.0.8:7000 0.0.0.0:*
Of course, both checks have been done inside the Cassandra container.
Emilio Mastriani
(1 rep)
Jan 28, 2025, 06:44 AM
• Last activity: Feb 6, 2025, 08:35 AM
3
votes
2
answers
1546
views
Kafka Internal Data structure vs LSM tree
I was going through database storage engines and found out about LSM trees. I had also read about Kafka's architecture and know that Kafka is internally a commit log. I want to know whether Kafka internally uses an LSM data structure for its append-only store or some other data structure for storing data.
Ayman Patel
(153 rep)
Mar 1, 2021, 07:39 AM
• Last activity: Nov 26, 2024, 02:48 PM
0
votes
0
answers
20
views
Is it a good idea to use GTID for tenant metadata?
I want to build an audit log for tracking user activity using Kafka Connect CDC (Change Data Capture) and MySQL, and to set a custom GTID (Global Transaction Identifier) before each transaction. Here's a high-level design pattern and implementation strategy.
**Overview of the Solution:**
**CDC with Kafka Connect:** I will use Debezium (a Kafka Connect connector) to capture changes in my MySQL database, which will be streamed to Kafka topics. This will act as the foundation for my audit log system.
**Custom GTID (Global Transaction ID):** Before executing each transaction in my application, I will set the GTID with custom information (e.g., tenant ID, email, UUID). This GTID will help me uniquely identify the transaction and associate it with a specific tenant and action: SET gtid_next = 'tenantId:tenantEmail;randomUUID'
**Audit Log Consumer:** Kafka consumers will read changes from the Kafka topics, process the audit log information (including the GTID), and store the logs in a database or external system.
What are the downsides of this approach? Could it cause issues with replication?
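One caveat worth checking before building on this: gtid_next only accepts AUTOMATIC, ANONYMOUS, or a real GTID of the form 'source_uuid:transaction_id' (and only with gtid_mode=ON), so free-form tenant metadata cannot be placed in it, and manually assigned GTIDs make the application responsible for sequence numbering across the whole replication topology. A hedged sketch of the contrast, with one alternative that keeps the metadata inside the same transaction (table and column names are made up):
-- Valid syntax, but now the application owns GTID numbering (risky for replication):
SET gtid_next = '3E11FA47-71CA-11E1-9E33-C80AA9429562:23';
BEGIN;
UPDATE orders SET status = 'PAID' WHERE id = 42;
COMMIT;
SET gtid_next = 'AUTOMATIC';

-- Lower-risk alternative: write the tenant context as a row in the same transaction,
-- so Debezium captures it alongside the data change it describes.
BEGIN;
UPDATE orders SET status = 'PAID' WHERE id = 42;
INSERT INTO audit_context (txn_uuid, tenant_id, tenant_email)
VALUES (UUID(), 17, 'tenant@example.com');
COMMIT;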
Naor Tedgi
(101 rep)
Sep 30, 2024, 03:37 PM
0
votes
0
answers
49
views
Causing SQL to prefer casted time seek in view query
We have an ETL process executed by Kafka. It can only fetch and query by the datetime2 date data type.
It generates queries like this:
EXEC sp_executesql N'SELECT *
FROM ViewName
WHERE ViewName.UpdateDate > @P0
AND ViewName.UpdateDate '2022-07-01'
--Can add UpdateDate > DATEADD(d, -2, GETDATE())
GO
But in this case SQL Server decides to scan this index first and only then apply the parameter filtering, causing heavy reads. I can add the commented-out filter to reduce reads, but I'm searching for a solution that makes SQL Server "understand" it can use the parameters instead. We cannot create a non-filtered index because of the table size. Also, adding a CAST to UpdateDate in the WHERE clause causes SQL Server not to use the index at all.
Is there something that can be done in this case code-wise? (A FORCESEEK hint throws an error.)
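One thing worth trying (a sketch, not a guaranteed fix): adding OPTION (RECOMPILE) to the generated statement lets the optimizer compile the plan with the actual parameter values, which is usually what allows it to match a filtered index predicate. Parameter names and values below are illustrative, not taken from the question:
EXEC sp_executesql
    N'SELECT *
      FROM ViewName
      WHERE ViewName.UpdateDate > @P0
        AND ViewName.UpdateDate <= @P1
      OPTION (RECOMPILE);',                  -- plan is built with the sniffed @P0/@P1 values
    N'@P0 datetime2, @P1 datetime2',
    @P0 = '2024-08-01', @P1 = '2024-08-06';
The trade-off is a compile on every execution; if the ETL tool cannot be made to append the hint, a plan guide carrying the same hint is another option to evaluate.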
Michael Cherevko
(742 rep)
Aug 6, 2024, 02:13 PM
1
votes
1
answers
102
views
SSL Encryption for Kafka Pipelines
Any [SingleStore] users out there?
For enabling SSL encryption (only) for Kafka pipelines, we created the pipeline like below:
CREATE PIPELINE `kafka_ssl` AS
LOAD DATA KAFKA ':/test'
CONFIG '{"security.protocol": "ssl",
"ssl.ca.location": ""}'
INTO table ;
from https://docs.memsql.com/v7.0/concepts/pipelines/kafka-kerberos-ssl/
Please note that we're not looking for client authentication (mutual TLS). That approach requires us to manually copy the CA cert onto the memsql nodes and provide its path, and it works fine for a standalone Kafka cluster.
What is the recommended approach if we're using AWS MSK?
The following documentation points to the client configuration for SSL encryption for MSK clients. It uses a client truststore location as a property, which is not a recognized property for the memsql pipeline CONFIG JSON:
https://docs.aws.amazon.com/msk/latest/developerguide/msk-encryption.html
https://docs.aws.amazon.com/msk/latest/developerguide/msk-authentication.html
Others report the same issue in the SingleStore forums: unable to create a pipeline from SingleStore-in-a-box to AWS MSK.
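Since SingleStore's Kafka extractor speaks librdkafka-style properties rather than Java client properties, the usual MSK equivalent of the Java truststore step is a PEM CA bundle passed via ssl.ca.location (MSK brokers present certificates chaining to Amazon Trust Services, which most OS CA bundles already include). A minimal sketch under those assumptions; the broker hostname, path, and table name are placeholders, not values from the question:
CREATE PIPELINE `kafka_tls_msk` AS
LOAD DATA KAFKA 'b-1.mycluster.kafka.us-east-1.amazonaws.com:9094/test'
CONFIG '{"security.protocol": "ssl",
         "ssl.ca.location": "/etc/pki/tls/certs/ca-bundle.crt"}'   -- OS PEM bundle, not a JKS truststore
INTO TABLE t;
If the nodes' default CA bundle does not already include the Amazon Trust Services roots, downloading them into a PEM file and pointing ssl.ca.location at that file plays the role the Java truststore plays in the AWS docs.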
Matt Brown
(49 rep)
Jun 18, 2022, 06:51 PM
• Last activity: Jul 26, 2022, 04:51 PM
1
votes
1
answers
437
views
SingleStore: Cannot get source metadata for pipeline. Failed to create consumer: ssl.ca.location failed: No further error info
Below is the command that I am using, but when I run the CREATE PIPELINE command I get the error below:
ERROR 1933 ER_EXTRACTOR_EXTRACTOR_GET_LATEST_OFFSETS: Cannot get source metadata for pipeline. Failed to create consumer: ssl.ca.location failed: No further error information available
Command used:
CREATE PIPELINE ticketmaster_pipeline AS
LOAD DATA KAFKA ‘b-3.etainmentnonprod.z2xjta.c25.kafka.us-east-1.amazonaws.com:9094,b-1.etainmentnonprod.z2xjta.c25.kafka.us-east-1.amazonaws.com:9094,b-2.etainmentnonprod.z2xjta.c25.kafka.us-east-1.amazonaws.com:9094/ticketmaster’
CONFIG '{“sasl.username”: “AWS_ACCESS_KEY”,
"sasl.mechanism": "PLAIN",
"security.protocol": "SASL_SSL",
"ssl.ca.location": "/etc/pki/ca-trust/extracted/java/cacerts"}'
CREDENTIALS ‘{“sasl.password”: “AWS_PSWD/Z”}’
INTO TABLE ticketmaster_kafka
FORMAT JSON (event_url ← event_url,event_id ← event_id,timestamp ← timestamp,event_name ← event_name,venue ← venue,event_datetime ← event_datetime,city ← city,state ← state,section ← section,row ← row,qty ← qty,seat_ids ← seat_ids,seat_numbers ← seat_numbers,inventory_type ← inventory_type,price ← price);
My Kafka setup is on an AWS MSK instance, and I have data in the topic as well.
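Two things stand out in the pasted command: the quotes are curly "smart" quotes (which break the CONFIG/CREDENTIALS JSON), and ssl.ca.location points at a Java keystore (/etc/pki/ca-trust/extracted/java/cacerts), which the librdkafka-based extractor cannot parse as a CA file. A corrected sketch, keeping the SASL settings from the question, trimming the broker and column lists for brevity, and assuming a PEM CA bundle exists at the path shown:
CREATE PIPELINE ticketmaster_pipeline AS
LOAD DATA KAFKA 'b-1.etainmentnonprod.z2xjta.c25.kafka.us-east-1.amazonaws.com:9094/ticketmaster'
CONFIG '{"sasl.username": "AWS_ACCESS_KEY",
         "sasl.mechanism": "PLAIN",
         "security.protocol": "SASL_SSL",
         "ssl.ca.location": "/etc/pki/tls/certs/ca-bundle.crt"}'   -- PEM bundle, not the JKS cacerts file
CREDENTIALS '{"sasl.password": "AWS_PSWD/Z"}'
INTO TABLE ticketmaster_kafka
FORMAT JSON (event_id <- event_id, event_name <- event_name, price <- price);   -- column list trimmed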
Matt Brown
(49 rep)
Jun 30, 2022, 09:33 PM
• Last activity: Jul 14, 2022, 11:26 PM
0
votes
2
answers
71
views
how do modern data warehouses tackle frequent small writes? esp. when streaming data is one of the sources?
So for many days, I have had a question in mind.
**How do modern data warehouses tackle frequent small writes?** Especially when streaming data is one of the sources?
e.g. Kafka/Kinesis => DW (Snowflake, Teradata, Oracle ADW, etc.)
I was under the impression that since data warehouse tables are typically **highly denormalized and columnar** (for quick performance on reporting queries, avoiding joins), they are slow for frequent small writes but have good performance for reporting-style SELECT statements. Hence the concept of **bulk nightly uploads from OLTP data sources to OLAP data warehouses**.
- What has changed in the modern DW internal architecture?
- Is there a staging area within the DW itself, where data lands and is then aggregated, has stats collected, and is denormalized before it finally rests in the actual DW tables powering the reporting?
I am interested in knowing how this works internally, at a high level.
I know this is a basic question, but this is my understanding from my school days, so I am pretty sure it is out of date.
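For the staging question specifically, the common pattern across warehouses is micro-batching: frequent small writes land in a cheap, append-friendly staging table and are periodically folded into the wide columnar table. A generic, illustrative SQL sketch (table names are made up, and each product has its own native variant of this):
-- 1) Streaming consumers append raw events to a narrow staging table (cheap row-oriented writes).
INSERT INTO stg_page_views (event_id, user_id, url, event_ts)
VALUES (1001, 42, '/home', '2021-10-23 19:53:00');

-- 2) A scheduled job periodically merges the accumulated batch into the columnar reporting table.
MERGE INTO dw_page_views AS t
USING (SELECT event_id, user_id, url, event_ts FROM stg_page_views) AS s
    ON t.event_id = s.event_id
WHEN NOT MATCHED THEN
    INSERT (event_id, user_id, url, event_ts)
    VALUES (s.event_id, s.user_id, s.url, s.event_ts);

-- 3) Clear the staging batch once it has been merged.
DELETE FROM stg_page_views;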
Libertarian
(3 rep)
Oct 23, 2021, 07:53 PM
• Last activity: Nov 1, 2021, 12:03 PM
-2
votes
1
answers
154
views
Relational database with Kafka-like durability/caching implementation
When reading about Kafka I found it really interesting how it provides durability, especially in comparison to databases:
* Kafka - never writes synchronously to disk; it provides durability by replicating to other servers (and getting their confirmation). The added benefit is that it can reuse the filesystem cache instead of rolling its own.
* relational databases - flush the log to disk to guarantee durability.
From what I was reading, the Kafka approach results in better performance, so I was wondering whether there is anything in particular that makes this approach infeasible for a relational database. Or maybe some of the "NewSQL" guys are doing it and I'm just unaware?
Sources about Kafka mechanisms I'm talking about:
https://kafka.apache.org/documentation/#persistence
https://kafka.apache.org/documentation/#replication
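For reference, nothing forbids a relational engine from making a similar trade-off, and some MySQL deployments approximate it by relaxing local flushes and relying on semi-synchronous replica acknowledgement. A hedged sketch of that configuration (illustrative settings only; it deliberately weakens single-node crash durability in exchange for replica-backed durability):
-- MySQL example settings; verify against your version's documentation before use.
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
SET GLOBAL rpl_semi_sync_master_enabled = 1;     -- COMMIT waits for a replica acknowledgement
SET GLOBAL innodb_flush_log_at_trx_commit = 2;   -- redo log written per commit, fsync'd only ~once a second
SET GLOBAL sync_binlog = 0;                      -- binlog flushing left to the OS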
Krzysztof Nawara
(1 rep)
Sep 18, 2021, 11:35 AM
• Last activity: Sep 20, 2021, 10:51 PM
2
votes
0
answers
40
views
How can I seed all the rows through CDC
We want to add a Kafka stream to publish changes to a table. The stream will be populated through the binlog and Attunity CDC. How do I seed the change stream with the initial state of the table?
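A common way to do this (a sketch assuming a MySQL-style binlog as in the question, not anything specific to Attunity) is to export a consistent snapshot at a recorded binlog position, publish those rows as the initial events, and start CDC from that position:
FLUSH TABLES WITH READ LOCK;      -- briefly freeze writes so the snapshot and position line up
SHOW MASTER STATUS;               -- record File/Position: CDC starts from here
-- Export the table while the lock (or a REPEATABLE READ transaction started now) is in effect,
-- e.g. SELECT * FROM my_table INTO OUTFILE '/tmp/my_table_seed.csv';
UNLOCK TABLES;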
David W
(121 rep)
Dec 19, 2019, 10:58 PM
2
votes
1
answers
1428
views
Kafka Partitions vs Mongo Sharding: which one gives better throughput?
I am using MongoDB sharding to register page views on my website. We have a hashed shard key to evenly distribute data across multiple shards. Our aggregation queries then run over a time range at intervals to aggregate this data and provide trends on the site.
We came across Kafka for write distribution under heavy load and for this kind of streaming.
I compared both systems, and both distribute writes with a leader-follower approach:
Kafka does it using multiple partitions of a topic on different brokers with partition replication, and Mongo does it with multiple shards backed by replica sets.
Since the aggregation query is always over a time range, it will always hit multiple shards/partitions.
My question is: how can we compare which one will provide better throughput and runtime scalability under heavy load? As I understand it, both scale with the same mechanism: adding new partitions in the case of Kafka, or adding new shards in the case of Mongo.
Please provide suggestions.
viren
(511 rep)
Mar 21, 2018, 04:15 AM
• Last activity: Oct 30, 2019, 01:47 PM
0
votes
1
answers
457
views
Row based replication on slave where as statement based on Master
We have a master-slave setup of MySQL 5.6. The master and slave use statement-based replication (SBR), and the slave is not writing binlogs.
We need to enable binlogs on the slave server, but we want the slave's binlogs to be in row-based (RBR) format, because we want to ship them to Kafka and it accepts RBR only.
Is it doable to have RBR on the slave while it receives data from the master as SBR?
Thanks
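For what it's worth, this is exactly what log_slave_updates is meant to enable: the replica re-executes the statements it receives and, when it logs them itself, it uses its own binlog_format. A minimal sketch of the replica-side settings (my.cnf entries shown as comments; a restart is required for log_bin and log_slave_updates in 5.6):
-- my.cnf on the replica:
--   log_bin           = replica-bin
--   log_slave_updates = 1
--   binlog_format     = ROW
SET GLOBAL binlog_format = 'ROW';           -- runtime switch for new sessions; persist it in my.cnf as well
SHOW VARIABLES LIKE 'log_slave_updates';    -- verify after the restart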
Mysql Consultancy
(35 rep)
Apr 17, 2018, 02:07 PM
• Last activity: Oct 30, 2019, 01:46 PM