Posted to commits@camel.apache.org by "ajithcnambiar (via GitHub)" <gi...@apache.org> on 2023/10/12 22:21:53 UTC

[I] Clarification Required: S3 Source Connector doesn't fetch new files [camel-kafka-connector]

ajithcnambiar opened a new issue, #1570:
URL: https://github.com/apache/camel-kafka-connector/issues/1570

   **Use case**:
    I'm trying out the S3 source connector. The S3 bucket will be updated periodically, and I want the new files to be sourced to the Kafka topic without duplicates, without deleting them from the existing S3 bucket, and without moving them to a new bucket.
   
   **Test**
   I tested with `deleteAfterRead` set to `false` and idempotency enabled (using the Kafka repository type), with the configuration below:
   
   **Configuration**:
   ```
   apiVersion: kafka.strimzi.io/v1beta2
   kind: KafkaConnector
   metadata:
     name: aws-s3-source-connector
     namespace: kafka
     labels:
       strimzi.io/cluster: kafka-connect
   spec:
     class: org.apache.camel.kafkaconnector.awss3source.CamelAwss3sourceSourceConnector
     tasksMax: 1
     config:
       camel.kamelet.aws-s3-source.accessKey: <access-key>
       camel.kamelet.aws-s3-source.secretKey: <secret-key>
       camel.kamelet.aws-s3-source.region: <region>
       camel.kamelet.aws-s3-source.deleteAfterRead: false
       camel.kamelet.aws-s3-source.bucketNameOrArn: arn:aws:s3:::<bucket-name>
   
       camel.idempotency.enabled: true
       camel.idempotency.repository.type: kafka
       camel.idempotency.expression.type: header
       camel.idempotency.expression.header: CamelAwsS3Key
       camel.idempotency.kafka.topic: idem-topic
       camel.idempotency.kafka.bootstrap.servers: <kafka-servers>:9092
       camel.idempotency.kafka.poll.duration.ms: 150
   
       topics: bucket-topic
   
   ```
   
   
   **Other info**: 
   - My reference was [this old thread](https://github.com/apache/camel-kafka-connector/issues/311) and [the idempotency blog](https://camel.apache.org/blog/2020/12/CKC-idempotency-070/) to achieve the intended functionality.
   - Could it be because `maxMessagesPerPoll` is 10? But then there seems to be no configuration property to set this for the S3 source connector. 🤔 
   
   
   **Versions tested**
   camel-aws2-s3-kafka-connector 0.11.5
   camel-aws-s3-source-kafka-connector 3.20.6
   camel-aws-s3-source-kafka-connector 4.0.0
   
   **Question**
   Please let me know if the intended functionality can be realized. And if so, what am I missing? Kindly advise 🙏 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@camel.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Clarification Required: S3 Source Connector doesn't fetch new files [camel-kafka-connector]

Posted by "ajithcnambiar (via GitHub)" <gi...@apache.org>.
ajithcnambiar commented on issue #1570:
URL: https://github.com/apache/camel-kafka-connector/issues/1570#issuecomment-1760937637

   Thanks for the quick response.
   
   >max messages per poll
   
   Is this value configurable for the connector?




Re: [I] Clarification Required: S3 Source Connector doesn't fetch new files [camel-kafka-connector]

Posted by "ajithcnambiar (via GitHub)" <gi...@apache.org>.
ajithcnambiar commented on issue #1570:
URL: https://github.com/apache/camel-kafka-connector/issues/1570#issuecomment-1761115111

   It would be great if it were exposed. 
   In my scenario, the update to S3 happens once or twice a day, so being able to set the maximum to a value like 1000 would solve my use case.




Re: [I] Clarification Required: S3 Source Connector doesn't fetch new files [camel-kafka-connector]

Posted by "oscerd (via GitHub)" <gi...@apache.org>.
oscerd commented on issue #1570:
URL: https://github.com/apache/camel-kafka-connector/issues/1570#issuecomment-1761124098

   Please open an issue on the camel-kamelets project.




Re: [I] Clarification Required: S3 Source Connector doesn't fetch new files [camel-kafka-connector]

Posted by "ajithcnambiar (via GitHub)" <gi...@apache.org>.
ajithcnambiar commented on issue #1570:
URL: https://github.com/apache/camel-kafka-connector/issues/1570#issuecomment-1765717078

   Thanks a lot for https://github.com/apache/camel-kamelets/pull/1692 🙏 
   I'm just curious, when can we expect a version of the connector with this configuration? 




Re: [I] Clarification Required: S3 Source Connector doesn't fetch new files [camel-kafka-connector]

Posted by "oscerd (via GitHub)" <gi...@apache.org>.
oscerd commented on issue #1570:
URL: https://github.com/apache/camel-kafka-connector/issues/1570#issuecomment-1765798131

   We first need to release camel-kamelets 4.1.0. I'm planning to do it this week or at the beginning of next week; then we can upgrade in ckc and release, thanks to @valdar.




Re: [I] Clarification Required: S3 Source Connector doesn't fetch new files [camel-kafka-connector]

Posted by "oscerd (via GitHub)" <gi...@apache.org>.
oscerd commented on issue #1570:
URL: https://github.com/apache/camel-kafka-connector/issues/1570#issuecomment-1760784759

   To achieve this, you need to set `deleteAfterRead` or increase the max messages per poll. Your configuration will always poll the same 10 files, since you don't move or delete them. Another way to achieve what you are looking for is the following connector: https://github.com/apache/camel-kafka-connector/tree/camel-kafka-connector-4.0.0/connectors/camel-aws-s3-cdc-source-kafka-connector. With this one you should be able to consume new files without deleting them. Here are the docs: https://camel.apache.org/camel-kafka-connector/next/reference/connectors/camel-aws-s3-cdc-source-kafka-source-connector.html
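
To make the "same 10 files" point concrete, here is a toy Python model of the polling behaviour described in this comment. It is only a sketch and not the connector's code. It assumes each poll lists at most N keys, always starting from the first key in sorted order, and that the idempotent repository drops keys it has already seen. The names `poll`, `consume`, and `simulate`, and the `file-NNN` keys, are hypothetical.

```python
from typing import List, Set


def poll(bucket: Set[str], max_per_poll: int) -> List[str]:
    """Toy listing: at most max_per_poll keys, always starting from the first sorted key."""
    return sorted(bucket)[:max_per_poll]


def consume(bucket: Set[str], seen: Set[str], max_per_poll: int,
            delete_after_read: bool = False) -> List[str]:
    """One poll cycle: list, drop keys already seen, forward the rest."""
    batch = poll(bucket, max_per_poll)
    fresh = [key for key in batch if key not in seen]
    seen.update(fresh)  # stands in for the idempotent repository
    if delete_after_read:
        bucket.difference_update(batch)  # consumed objects leave the bucket
    return fresh


def simulate(max_per_poll: int, delete_after_read: bool = False) -> List[int]:
    """Count the keys forwarded by two polls, with 5 new files uploaded in between."""
    bucket = {f"file-{i:03d}" for i in range(10)}
    seen: Set[str] = set()
    first = consume(bucket, seen, max_per_poll, delete_after_read)
    bucket.update(f"file-{i:03d}" for i in range(10, 15))  # the periodic upload
    second = consume(bucket, seen, max_per_poll, delete_after_read)
    return [len(first), len(second)]


print(simulate(10))                          # [10, 0]: new files are never listed
print(simulate(1000))                        # [10, 5]: the cap exceeds the bucket size
print(simulate(10, delete_after_read=True))  # [10, 5]: old files no longer block the listing
```

In this model a larger cap only helps while the bucket holds fewer objects than the cap. Once the bucket grows past it, the first case applies again.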




Re: [I] Clarification Required: S3 Source Connector doesn't fetch new files [camel-kafka-connector]

Posted by "oscerd (via GitHub)" <gi...@apache.org>.
oscerd commented on issue #1570:
URL: https://github.com/apache/camel-kafka-connector/issues/1570#issuecomment-1761021166

   No, it's not exposed. I can add that, but whatever the value is, it won't cover your use case unless you delete after read or move the files.

