Posted to notifications@libcloud.apache.org by GitBox <gi...@apache.org> on 2020/02/10 17:25:06 UTC

[GitHub] [libcloud] perbu opened a new issue #1424: Documentation: Objectstore example is broken / libcloud does seek() on stream

perbu opened a new issue #1424: Documentation: Objectstore example is broken / libcloud does seek() on stream
URL: https://github.com/apache/libcloud/issues/1424
 
 
   ## Summary
   
   I'm running the example on https://libcloud.readthedocs.io/en/stable/storage/examples.html that creates a tarfile and uploads it via stream to an objectstore.
   
   I'm using GCS.
   libcloud tries to call seek() on the supplied iterator; this fails and the program stops.
   
   
   ## Detailed Information
   
   apache-libcloud==2.8.0
   Python 3.7.6
   macOS Catalina
   
   The example in the docs is written for Python 2; I've rewritten it for Python 3. I'm using GCS.
   
   ```python
   #!/usr/bin/env python
   import os
   import subprocess
   from datetime import datetime
   
   from libcloud.storage.types import Provider, ContainerDoesNotExistError
   from libcloud.storage.providers import get_driver
   
   from dotenv import load_dotenv
   load_dotenv()
   
   cls = get_driver(Provider.GOOGLE_STORAGE)
   driver = cls(os.getenv('GOOGLE_ACCOUNT'),
                     os.getenv('AUTH_TOKEN'),
                     project='foo')
   
   directory = os.getenv('FOLDER')
   cmd = 'tar cvzpf - %s' % (directory)
   
   object_name = 'backup-%s.tar.gz' % (datetime.now().strftime('%Y-%m-%d'))
   container_name = os.getenv('WORKSPACE')
   
   # Create a container if it doesn't already exist
   try:
       container = driver.get_container(container_name=container_name)
   except ContainerDoesNotExistError:
       container = driver.create_container(container_name=container_name)
   
   pipe = subprocess.Popen(cmd, bufsize=0, shell=True, stdout=subprocess.PIPE)
   return_code = pipe.poll()
   
   print('Uploading object...')
   
   while return_code is None:
       # Compress data in our directory and stream it directly to the object store
       obj = container.upload_object_via_stream(iterator=pipe.stdout,
                                                object_name=object_name)
       return_code = pipe.poll()
   
   print('Upload complete, transferred: %s KB' % ((obj.size / 1024)))
   
   ```
   This returns the following exception:
   ```
   Traceback (most recent call last):
     File "./bug.py", line 38, in <module>
       object_name=object_name)
     File "/Users/perbu/.virtualenvs/ar1/lib/python3.7/site-packages/libcloud/storage/base.py", line 159, in upload_object_via_stream
       iterator, self, object_name, extra=extra, **kwargs)
     File "/Users/perbu/.virtualenvs/ar1/lib/python3.7/site-packages/libcloud/storage/drivers/s3.py", line 698, in upload_object_via_stream
       storage_class=ex_storage_class)
     File "/Users/perbu/.virtualenvs/ar1/lib/python3.7/site-packages/libcloud/storage/drivers/s3.py", line 842, in _put_object
       headers=headers, file_path=file_path, stream=stream)
     File "/Users/perbu/.virtualenvs/ar1/lib/python3.7/site-packages/libcloud/storage/base.py", line 627, in _upload_object
       self._get_hash_function())
     File "/Users/perbu/.virtualenvs/ar1/lib/python3.7/site-packages/libcloud/storage/base.py", line 657, in _hash_buffered_stream
       stream.seek(0)
   OSError: [Errno 29] Illegal seek
   ```
   
   The offending code in libcloud/storage/base.py looks like this:
   ```python
           if hasattr(stream, '__next__') or hasattr(stream, 'next'):
               # Ensure we start from the begining of a stream in case stream is
               # not at the beginning
               if hasattr(stream, 'seek'):
                   stream.seek(0)
   ```
   
   I'm not entirely sure why `seek()` gets called on the iterator. I've been able to work around the issue by creating a SimpleIterator class that only supplies `__next__` and wrapping the output from Popen.stdout in it.
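
For illustration, a wrapper along these lines could look roughly like the sketch below. The original class was not posted, so the name `SimpleIterator`, the `chunk_size` parameter, and the use of `read()` are assumptions; an in-memory `io.BytesIO` stands in for `pipe.stdout` so the snippet runs on its own.

```python
import io


class SimpleIterator:
    """Hypothetical wrapper that exposes only the iterator protocol.

    Because it has no seek() attribute, code guarded by
    hasattr(stream, 'seek') will not try to rewind it.
    """

    def __init__(self, fileobj, chunk_size=8192):
        self._fileobj = fileobj
        self._chunk_size = chunk_size

    def __iter__(self):
        return self

    def __next__(self):
        # read() returns an empty bytes object at end of stream
        chunk = self._fileobj.read(self._chunk_size)
        if not chunk:
            raise StopIteration
        return chunk


# Stand-in for pipe.stdout so the sketch is self-contained
source = io.BytesIO(b"x" * 20000)
wrapped = SimpleIterator(source)
chunks = list(wrapped)
```

In the script above, the upload call would then presumably become `container.upload_object_via_stream(iterator=SimpleIterator(pipe.stdout), object_name=object_name)`.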
   
   If I just comment out the seek(0) everything seems to work.
   
   Thanks for an excellent project. Let me know if you need anything more from me.
   
   Cheers,
   
   Per.
   
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [libcloud] Kami commented on issue #1424: Documentation: Objectstore example is broken / libcloud does seek() on stream

Posted by GitBox <gi...@apache.org>.
Kami commented on issue #1424: Documentation: Objectstore example is broken / libcloud does seek() on stream
URL: https://github.com/apache/libcloud/issues/1424#issuecomment-586755983
 
 
   Thanks for reporting this.
   
   Is this issue Python 3 specific?
   
   Having said that, one thing we could do is simply ignore "illegal seek" errors in that place, but I'm not sure that's the correct approach. It may mask real issues.
   
   ---
   
   EDIT: It looks like we indeed don't have a better option (https://bugs.python.org/issue12877), since Python sadly doesn't raise a more specific exception in that case. That means we can't distinguish between an underlying iterator that doesn't support seek and one that does support it but was given an incorrect seek position.
   
   We could have some specific case for pipes, but that's probably not the most robust approach...
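
As a rough sketch of the "ignore illegal seek errors" idea, a guarded rewind might look like the following. This is not libcloud code and not necessarily how the issue was resolved: the helper name `rewind_if_possible` is hypothetical, and checking `errno.ESPIPE` is an assumption about how such errors could be filtered, not something proposed in this thread. It may behave differently across platforms.

```python
import errno
import io
import os


def rewind_if_possible(stream):
    """Seek to the start of the stream; return False if it cannot seek."""
    if not hasattr(stream, "seek"):
        return False
    try:
        stream.seek(0)
    except OSError as exc:
        # ESPIPE is the "Illegal seek" error reported for pipes.
        # io.UnsupportedOperation is an OSError subclass raised by
        # non-seekable buffered streams.
        if exc.errno == errno.ESPIPE or isinstance(exc, io.UnsupportedOperation):
            return False
        raise  # any other OSError may be a real problem, so re-raise it
    return True


# An unbuffered pipe reader, similar to Popen(..., bufsize=0).stdout
r_fd, w_fd = os.pipe()
pipe_reader = os.fdopen(r_fd, "rb", buffering=0)
pipe_result = rewind_if_possible(pipe_reader)

# A seekable in-memory stream that has been read to the end
buffer = io.BytesIO(b"abc")
buffer.read()
buffer_result = rewind_if_possible(buffer)

pipe_reader.close()
os.close(w_fd)
```

The trade-off mentioned above still applies: swallowing the error means a stream that is genuinely not at its beginning would be hashed and uploaded from its current position.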


[GitHub] [libcloud] Kami closed issue #1424: Documentation: Objectstore example is broken / libcloud does seek() on stream

Posted by GitBox <gi...@apache.org>.
Kami closed issue #1424: Documentation: Objectstore example is broken / libcloud does seek() on stream
URL: https://github.com/apache/libcloud/issues/1424
 
 
   
