You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@seatunnel.apache.org by "a11dev (via GitHub)" <gi...@apache.org> on 2024/04/08 09:35:00 UTC

[I] ORA-01292 Oracle to Postgres [seatunnel]

a11dev opened a new issue, #6662:
URL: https://github.com/apache/seatunnel/issues/6662

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/seatunnel/issues?q=is%3Aissue+label%3A%22bug%22) and found no similar issues.
   
   
   ### What happened
   
   seatunnel 2.3.4 
   windows server local environment deployment 
   openjdk version "21.0.2" 2024-01-16
   oracle database ( dbaas ) 
   local postgres 
   3 oracle table - OracleCDC
   3 Sink - jdbc 
   
   It crash with an ORA-01292 
   [seatunnellog.zip](https://github.com/apache/seatunnel/files/14903517/seatunnellog.zip)
   
   All logs ( and config example ) attacched.
   
   I can increase log level if necessary.
   
   Might it be related to an automatic backup of redologs?
   [config.zip](https://github.com/apache/seatunnel/files/14903592/config.zip)
   
   ### SeaTunnel Version
   
   2.3.4
   
   ### SeaTunnel Config
   
   ```conf
   #Also into the first attachment:
   
   
   env {
   	parallelism = 2
       job.mode=STREAMING
       job.name=SeaTunnel_Job
       read_limit.bytes_per_second=7000000
       read_limit.rows_per_second=400
   }
   
     Oracle-CDC {
   
       result_table_name = "tab1"
       base-url = "jdbc:oracle:thin:user/password@ip:1521/service_name"
       source.reader.close.timeout = 120000
       username = "user"
       password = "password"
       database-names = ["DBNAME"]
   	# real db name DBNAME.domain.local ( it works with DBNAME )
       schema-names = ["SCHEMA"]
       startup.mode = "INITIAL"
       table-names = ["DBNAME.SCHEMA.TABLE1"]
     }
   
   
   }
   
   
   sink {
       jdbc {
   		source_table_name="tab1"
   		url = "jdbc:postgresql://localhost:5432/pgdatabase"
   		driver = "org.postgresql.Driver"
   		user = pguser
   		password = pgpassword
   		generate_sink_sql = true
   		database = pgdatabase
   		table = public.tab1
   		primary_keys = ["PKNAME"]
   		field_ide = LOWERCASE,
   		schema_save_mode = "CREATE_SCHEMA_WHEN_NOT_EXIST"
   		data_save_mode="APPEND_DATA"
       }
   
   }
   ```
   
   
   ### Running Command
   
   ```shell
   java -Dlog4j2.configurationFile=E:\programmi\apache-seatunnel-2.3.4\config\log4j2_client.properties -Dhazelcast.client.config=E:\programmi\apache-seatunnel-2.3.4\config\hazelcast-client.yaml -Dseatunnel.config=E:\programmi\apache-seatunnel-2.3.4\config\seatunnel.yaml -Dhazelcast.config=E:\programmi\apache-seatunnel-2.3.4\config\hazelcast.yaml -Dseatunnel.logs.file_name=seatunnel-starter-clienttest -Xms256m -Xmx512m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=E:\programmi\apache-seatunnel-2.3.4\dump\zeta-client  -cp E:\programmi\apache-seatunnel-2.3.4\lib\*;E:\programmi\apache-seatunnel-2.3.4\starter\seatunnel-starter.jar org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient  --config .\config\v2.batch.config.template -m local
   ```
   
   
   ### Error Exception
   
   ```log
   2024-04-06 14:35:03,888 ERROR [i.d.c.o.l.LogMinerHelper      ] [debezium-reader-0] - Mining session stopped due to the java.sql.SQLException: ORA-01291: file di log mancante
   ORA-16241: In attesa del file di log dell'intervallo (n. thread 2, n. sequenza 106896)
   
   
   
   // into the first attachment
   ```
   
   
   ### Zeta or Flink or Spark Version
   
   Zeta
   
   ### Java or Scala Version
   
   openjdk version "21.0.2" 2024-01-16
   
   ### Screenshots
   
   everything is inside attachements.
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] ORA-01292 Oracle to Postgres [seatunnel]

Posted by "a11dev (via GitHub)" <gi...@apache.org>.
a11dev commented on issue #6662:
URL: https://github.com/apache/seatunnel/issues/6662#issuecomment-2074301162

   I would like to share some other information gathered about this topic.
   Comparing other orcle cdc connector ( debezium/confluent ref: https://docs.confluent.io/kafka-connectors/oracle-cdc/current/overview.html) . As suspected it happen when the redo log is purged from the system. It is recommended to implement practices such as maintaining archived log files for a period longer than the connector might be deactivated but in my case are purged every 30 mins. Debezium is using the strategy to implement a heartbeat in the connector by setting heartbeat.interval.ms to a positive value. This helps to ensure that the source offsets are advanced regularly, reducing the risk of the redo logs being purged before they are consumed by the connector, but still if the connector is down no hearthbeat will be generated.   
   
   I hope this could be useful!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] ORA-01292 Oracle to Postgres [seatunnel]

Posted by "Carl-Zhou-CN (via GitHub)" <gi...@apache.org>.
Carl-Zhou-CN commented on issue #6662:
URL: https://github.com/apache/seatunnel/issues/6662#issuecomment-2097560467

     It should be configured this way
     debezium {
           heartbeat.interval.ms = 10000
       }


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] ORA-01292 Oracle to Postgres [seatunnel]

Posted by "a11dev (via GitHub)" <gi...@apache.org>.
a11dev commented on issue #6662:
URL: https://github.com/apache/seatunnel/issues/6662#issuecomment-2043666599

   Thanks for your email.
   
   Yes it might be.
   For test environment it is not a problem, we can run it again and Seatunnel
   is able to recover lost sync.
   Production though, we know log files backup is performed every 30 minutes.
   is there a workaround to avoid blocks?
   
   
   
   Thanks
   Alessandro
   
   
   Il lun 8 apr 2024, 14:07 Carl-Zhou-CN ***@***.***> ha scritto:
   
   > Is it possible that the log expired and was removed
   >
   > —
   > Reply to this email directly, view it on GitHub
   > <https://github.com/apache/seatunnel/issues/6662#issuecomment-2042577100>,
   > or unsubscribe
   > <https://github.com/notifications/unsubscribe-auth/AI2V56ZYKYQOMIKVP5SHEMDY4KCBTAVCNFSM6AAAAABF4K3G7OVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDANBSGU3TOMJQGA>
   > .
   > You are receiving this because you authored the thread.Message ID:
   > ***@***.***>
   >
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] ORA-01292 Oracle to Postgres [seatunnel]

Posted by "Carl-Zhou-CN (via GitHub)" <gi...@apache.org>.
Carl-Zhou-CN commented on issue #6662:
URL: https://github.com/apache/seatunnel/issues/6662#issuecomment-2074339117

   @a11dev  Thank you very much. For seatunnel, which uses Debezium 1.6, it includes the parameter you suggested, heartbeat.interval.ms. Can you give it a try?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] ORA-01292 Oracle to Postgres [seatunnel]

Posted by "Carl-Zhou-CN (via GitHub)" <gi...@apache.org>.
Carl-Zhou-CN commented on issue #6662:
URL: https://github.com/apache/seatunnel/issues/6662#issuecomment-2043987841

   > Is it possible that the log expired and was removed
   
   This is just one possibility. You can also check if the Oracle archive logs and some settings are configured correctly, as it seems that your incremental phase has not been able to read the data correctly.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] ORA-01292 Oracle to Postgres [seatunnel]

Posted by "Carl-Zhou-CN (via GitHub)" <gi...@apache.org>.
Carl-Zhou-CN commented on issue #6662:
URL: https://github.com/apache/seatunnel/issues/6662#issuecomment-2042577100

   Is it possible that the log expired and was removed


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] ORA-01292 Oracle to Postgres [seatunnel]

Posted by "a11dev (via GitHub)" <gi...@apache.org>.
a11dev commented on issue #6662:
URL: https://github.com/apache/seatunnel/issues/6662#issuecomment-2085057326

   oracle cdc configuration improved with : heartbeat.interval.ms = 10000
   
   ```
     Oracle-CDC {
   
       result_table_name = "tab1"
       base-url = "jdbc:oracle:thin:user/password@ip:1521/service_name"
       source.reader.close.timeout = 120000
       username = "user"
       password = "password"
       database-names = ["DBNAME"]
   	# real db name DBNAME.domain.local ( it works with DBNAME )
       schema-names = ["SCHEMA"]
       startup.mode = "INITIAL"
       table-names = ["DBNAME.SCHEMA.TABLE1"]
       heartbeat.interval.ms = 10000
     }
   ```
   
   how can I check it from logs?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org