Posted to oak-dev@jackrabbit.apache.org by Aravindo Wingeier <wi...@adobe.com.INVALID> on 2020/01/23 10:08:41 UTC

Access azure segments metadata in a case-insensitive way

Hi devs,

We use azcopy to copy segments from one Azure blob container to another for testing. The current version of azcopy (10.3.3) has a bug that capitalizes the first letter of every metadata key: "type" becomes "Type". As a consequence, the current implementation cannot find the segments in Azure blob storage.

The azcopy issue was already reported [1] in 2018, and I am contacting MS directly to follow up on it. As a workaround, we currently use azcopy version 7, which is much slower and has reliability issues.

I have little hope that azcopy will be fixed soon. I therefore suggest a patch to oak-segment-azure that is backward compatible and ignores the case of the keys when reading metadata. See the patch draft at [2].
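
To illustrate the idea, here is a minimal sketch of a case-insensitive metadata lookup. It is not the code from the patch at [2]; the class name MetadataLookup, the method getIgnoreCase and the value "segment" are hypothetical and only serve the example. The only assumption is that blob metadata is available as a Map<String, String>.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not taken from the patch at [2].
class MetadataLookup {

    // Returns the value stored under 'key', ignoring the case of the key.
    static String getIgnoreCase(Map<String, String> metadata, String key) {
        // Prefer an exact match, so metadata written with the expected
        // lower-case keys is read exactly as before (backward compatible).
        String exact = metadata.get(key);
        if (exact != null) {
            return exact;
        }
        // Fall back to a case-insensitive scan, e.g. "Type" for "type".
        for (Map.Entry<String, String> entry : metadata.entrySet()) {
            if (entry.getKey().equalsIgnoreCase(key)) {
                return entry.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> copied = new HashMap<>();
        // Key as it looks after a copy with azcopy 10.3.3;
        // the value "segment" is a placeholder.
        copied.put("Type", "segment");
        System.out.println(getIgnoreCase(copied, "type"));
    }
}
```

Trying the exact key first keeps the behaviour unchanged for metadata that was written with lower-case keys; the scan only matters for blobs whose keys were rewritten during the copy.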

What do you think is the best way to go forward?

Best regards,

Aravindo Wingeier

[1]: https://github.com/Azure/azure-storage-azcopy/issues/113
[2]: https://github.com/apache/jackrabbit-oak/pull/173



Re: Access azure segments metadata in a case-insensitive way

Posted by Aravindo Wingeier <wi...@adobe.com.INVALID>.
Thanks for your feedback. 

> What if there are two entries where the name only differs in case?
The Azure blob API does not persist multiple entries whose keys differ only by case; one of them wins.
I tested this by setting the metadata [foo=lower, FOO=upper] on a blob through the Azure blob SDK. The resulting metadata was [foo=lower]. So it would be safe to ignore case in Java.
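
For a unit test of this corner case that does not need Azure, the following plain-Java sketch may help. It copies the metadata into a TreeMap ordered by String.CASE_INSENSITIVE_ORDER; the class name CaseInsensitiveMetadata and the method of are hypothetical, and this is not necessarily how the patch does it. Note that if such a map is ever fed two keys differing only by case, the value written last wins, which is not the outcome I observed from Azure above. Since Azure does not persist both entries, the collision should not occur on metadata that was actually read from a blob.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only.
class CaseInsensitiveMetadata {

    // Copies the raw metadata into a map whose lookups ignore key case.
    static Map<String, String> of(Map<String, String> raw) {
        Map<String, String> view = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        view.putAll(raw);
        return view;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the collision is deterministic.
        Map<String, String> raw = new LinkedHashMap<>();
        raw.put("foo", "lower");
        raw.put("FOO", "upper");

        Map<String, String> view = of(raw);
        // Both keys collapse into a single entry; the later value replaces
        // the earlier one.
        System.out.println(view.size());
        System.out.println(view.get("Foo"));
    }
}
```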

I will shortly add a patch to https://issues.apache.org/jira/browse/OAK-8869

Thanks,
Aravindo



Re: Access azure segments metadata in a case-insensitive way

Posted by Andrei Dulceanu <an...@gmail.com>.
Good point, Julian! I think we can add unit tests covering corner cases
such as the one you mentioned...

On Thu, Jan 23, 2020 at 1:23 PM Julian Reschke <ju...@gmx.de>
wrote:

> On 23.01.2020 13:12, Andrei Dulceanu wrote:
> > +1 for creating the issue and attaching the patch there
> > ...
>
> Are we positively sure that this will never break stuff? What if there
> are two entries where the name only differs in case?
>
> Best regards, Julian
>

Re: Access azure segments metadata in a case-insensitive way

Posted by Julian Reschke <ju...@gmx.de>.
On 23.01.2020 13:12, Andrei Dulceanu wrote:
> +1 for creating the issue and attaching the patch there
> ...

Are we positively sure that this will never break stuff? What if there
are two entries where the name only differs in case?

Best regards, Julian

Re: Access azure segments metadata in a case-insensitive way

Posted by Andrei Dulceanu <an...@gmail.com>.
+1 for creating the issue and attaching the patch there

Regards,
Andrei

On Thu, Jan 23, 2020 at 11:21 AM Tomek Rękawek <to...@apache.org> wrote:

> Hi Aravindo,
>
> I’m in favour of merging the patch. I think being strict in what we write
> and tolerant in what we read is a good thing. Please create an OAK issue
> and ping me and Andrei Dulceanu, so we can merge it.
>
> Regards,
> Tomek
>
> --
> Tomek Rękawek | ASF committer | www.apache.org
> tomekr@apache.org
>

Re: Access azure segments metadata in a case-insensitive way

Posted by Tomek Rękawek <to...@apache.org>.
Hi Aravindo,

I’m in favour of merging the patch. I think being strict in what we write and tolerant in what we read is a good thing. Please create an OAK issue and ping me and Andrei Dulceanu, so we can merge it.

Regards,
Tomek

-- 
Tomek Rękawek | ASF committer | www.apache.org
tomekr@apache.org

> On 23 Jan 2020, at 11:08, Aravindo Wingeier <wi...@adobe.com> wrote:
> 
> Hi devs,
>
> We use azcopy to copy segments from one Azure blob container to another for testing. The current version of azcopy (10.3.3) has a bug that capitalizes the first letter of every metadata key: "type" becomes "Type". As a consequence, the current implementation cannot find the segments in Azure blob storage.
>
> The azcopy issue was already reported [1] in 2018, and I am contacting MS directly to follow up on it. As a workaround, we currently use azcopy version 7, which is much slower and has reliability issues.
>
> I have little hope that azcopy will be fixed soon. I therefore suggest a patch to oak-segment-azure that is backward compatible and ignores the case of the keys when reading metadata. See the patch draft at [2].
> 
> What do you think is the best way to go forward? 
> 
> Best regards,
> 
> Aravindo Wingeier
> 
> [1]: https://github.com/Azure/azure-storage-azcopy/issues/113
> [2]: https://github.com/apache/jackrabbit-oak/pull/173