Posted to commits@camel.apache.org by "ajithcnambiar (via GitHub)" <gi...@apache.org> on 2023/10/12 22:21:53 UTC
[I] Clarification Required: S3 Source Connector doesn't fetch new files [camel-kafka-connector]
ajithcnambiar opened a new issue, #1570:
URL: https://github.com/apache/camel-kafka-connector/issues/1570
**Use case**:
I'm trying out the S3 source connector. The S3 bucket will be periodically updated, and I want the new files to be sourced to the Kafka topic, without duplicates, without deleting from the existing S3 bucket, and without moving to a new bucket.
**Test**
I tested with `deleteAfterRead` set to `false` and with idempotency enabled (using the Kafka repository type), with the configuration below:
**Configuration**:
```
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnector
metadata:
  name: aws-s3-source-connector
  namespace: kafka
  labels:
    strimzi.io/cluster: kafka-connect
spec:
  class: org.apache.camel.kafkaconnector.awss3source.CamelAwss3sourceSourceConnector
  tasksMax: 1
  config:
    camel.kamelet.aws-s3-source.accessKey: <access-key>
    camel.kamelet.aws-s3-source.secretKey: <secret-key>
    camel.kamelet.aws-s3-source.region: <region>
    camel.kamelet.aws-s3-source.deleteAfterRead: false
    camel.kamelet.aws-s3-source.bucketNameOrArn: arn:aws:s3:::<bucket-name>
    camel.idempotency.enabled: true
    camel.idempotency.repository.type: kafka
    camel.idempotency.expression.type: header
    camel.idempotency.expression.header: CamelAwsS3Key
    camel.idempotency.kafka.topic: idem-topic
    camel.idempotency.kafka.bootstrap.servers: <kafka-servers>:9092
    camel.idempotency.kafka.poll.duration.ms: 150
    topics: bucket-topic
```
**Other info**:
- My reference was [this old thread](https://github.com/apache/camel-kafka-connector/issues/311) and [the idempotency blog](https://camel.apache.org/blog/2020/12/CKC-idempotency-070/) to achieve the intended functionality.
- Could it be because `maxMessagesPerPoll` is 10? If so, there seems to be no configuration property to set this for the S3 source connector. 🤔
**Versions tested**
camel-aws2-s3-kafka-connector 0.11.5
camel-aws-s3-source-kafka-connector 3.20.6
camel-aws-s3-source-kafka-connector 4.0.0
**Question**
Please let me know if the intended functionality can be realized. And if so, what am I missing? Kindly advise 🙏
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@camel.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [I] Clarification Required: S3 Source Connector doesn't fetch new files [camel-kafka-connector]
Posted by "ajithcnambiar (via GitHub)" <gi...@apache.org>.
ajithcnambiar commented on issue #1570:
URL: https://github.com/apache/camel-kafka-connector/issues/1570#issuecomment-1760937637
Thanks for the quick response.
>max messages per poll
Is this value configurable for the connector?
Re: [I] Clarification Required: S3 Source Connector doesn't fetch new files [camel-kafka-connector]
Posted by "ajithcnambiar (via GitHub)" <gi...@apache.org>.
ajithcnambiar commented on issue #1570:
URL: https://github.com/apache/camel-kafka-connector/issues/1570#issuecomment-1761115111
It would be great if it's exposed.
In my scenario, the update to S3 happens once or twice a day. So if I had a way to configure the max to a value like 1000, it would solve my use case.
Re: [I] Clarification Required: S3 Source Connector doesn't fetch new files [camel-kafka-connector]
Posted by "oscerd (via GitHub)" <gi...@apache.org>.
oscerd commented on issue #1570:
URL: https://github.com/apache/camel-kafka-connector/issues/1570#issuecomment-1761124098
Open an issue on the camel-kamelets project.
Re: [I] Clarification Required: S3 Source Connector doesn't fetch new files [camel-kafka-connector]
Posted by "ajithcnambiar (via GitHub)" <gi...@apache.org>.
ajithcnambiar commented on issue #1570:
URL: https://github.com/apache/camel-kafka-connector/issues/1570#issuecomment-1765717078
Thanks a lot for https://github.com/apache/camel-kamelets/pull/1692 🙏
I'm just curious, when can we expect a version of the connector with this configuration?
Re: [I] Clarification Required: S3 Source Connector doesn't fetch new files [camel-kafka-connector]
Posted by "oscerd (via GitHub)" <gi...@apache.org>.
oscerd commented on issue #1570:
URL: https://github.com/apache/camel-kafka-connector/issues/1570#issuecomment-1765798131
We first need to release camel-kamelets 4.1.0. I'm planning to do it this week or at the beginning of next week; then we can upgrade in ckc and release, thanks to @valdar.
Re: [I] Clarification Required: S3 Source Connector doesn't fetch new files [camel-kafka-connector]
Posted by "oscerd (via GitHub)" <gi...@apache.org>.
oscerd commented on issue #1570:
URL: https://github.com/apache/camel-kafka-connector/issues/1570#issuecomment-1760784759
To achieve this, you need to enable `deleteAfterRead` or increase the max messages per poll. Your configuration will always poll the same 10 files, since you don't move or delete them. Another way to achieve what you are looking for is to use the following connector: https://github.com/apache/camel-kafka-connector/tree/camel-kafka-connector-4.0.0/connectors/camel-aws-s3-cdc-source-kafka-connector. With this one you should be able to consume new files without deleting them. Here are the docs: https://camel.apache.org/camel-kafka-connector/next/reference/connectors/camel-aws-s3-cdc-source-kafka-source-connector.html
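For readers following along, here is a toy Python model of the behavior described in this reply. It is only a sketch, not the connector's actual code: it assumes each poll returns the first N keys of the bucket in sorted order and that an idempotent repository filters out keys it has already seen. The names `poll`, `run_polls`, and the sample keys are hypothetical.

```python
def poll(bucket_keys, max_per_poll):
    """Toy stand-in for a list call: the first max_per_poll keys in sorted order."""
    return sorted(bucket_keys)[:max_per_poll]


def run_polls(bucket_keys, max_per_poll, seen, polls=3):
    """Run several polls and forward only keys not yet in the idempotent repository."""
    forwarded = []
    for _ in range(polls):
        for key in poll(bucket_keys, max_per_poll):
            if key not in seen:  # idempotency check (assumed: keyed on the object key)
                seen.add(key)
                forwarded.append(key)
    return forwarded


# Limit of 10 per poll, nothing deleted: only the first 10 keys are ever listed.
bucket = {f"file-{i:03d}.csv" for i in range(12)}
seen = set()
first = run_polls(bucket, 10, seen)
bucket.add("file-999.csv")  # a new upload that sorts after the first 10 keys
later = run_polls(bucket, 10, seen)

# A larger limit reaches the new key, but only while the bucket stays below that limit.
bucket_big = {f"file-{i:03d}.csv" for i in range(12)}
seen_big = set()
first_big = run_polls(bucket_big, 1000, seen_big)
bucket_big.add("file-999.csv")
later_big = run_polls(bucket_big, 1000, seen_big)

print(len(first), later, len(first_big), later_big)
```

Under these assumptions, the idempotent filter removes duplicates but cannot surface a file that the poll never lists, which is why deleting or moving files, or using the CDC connector, is suggested.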
Re: [I] Clarification Required: S3 Source Connector doesn't fetch new files [camel-kafka-connector]
Posted by "oscerd (via GitHub)" <gi...@apache.org>.
oscerd commented on issue #1570:
URL: https://github.com/apache/camel-kafka-connector/issues/1570#issuecomment-1761021166
No, it's not exposed. I can add that, but whatever the value is, it won't cover your case unless you delete after read or move the files.