You are viewing a plain text version of this content. The canonical link for it is here.

Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2021/11/18 00:18:12 UTC

[GitHub] [pinot] kkrugler opened a new issue #7791: Only download required files from segment for metadata push

kkrugler opened a new issue #7791:
URL: https://github.com/apache/pinot/issues/7791


   Currently in `SegmentPushUtils.generateSegmentMetadataFile(PinotFS fileSystem, URI tarFileURI)`, it calls
   ``` java
       fileSystem.copyToLocalFile(tarFileURI, tarFile);
   ```
   to download the segment from deep store and create a temp copy locally. It then unpacks and untars the segment, extracts the two files of interest (`creation.meta` and `metadata.properties`), creates a new tarball with these two files, and then pushes that to the controller.
   
   The actual amount of data extracted from each segment is a tiny fraction of the segment's total size. e.g. For a 200MB segment, the two files of interest are about 4K bytes. When a large number of segments are being pushed, this results in a significant performance hit.
   
   Instead, it's possible to stream/unpack the segment and extract only the data from the two target files, via something like:
   ``` java
           String uuid = UUID.randomUUID().toString();
           File segmentMetadataTarFile = new File(FileUtils.getTempDirectory(), "segmentMetadata-" + uuid + ".tar.gz");
           if (segmentMetadataTarFile.exists()) {
             FileUtils.forceDelete(segmentMetadataTarFile);
           }
           GzipCompressorOutputStream gzOut = new GzipCompressorOutputStream(new FileOutputStream(segmentMetadataTarFile));
           TarArchiveOutputStream tos = new TarArchiveOutputStream(gzOut);
                   
           TarArchiveInputStream tis = new TarArchiveInputStream(new GZIPInputStream(fileSystem.open(segmentUri)));
   
           TarArchiveEntry tarEntry;
           while ((tarEntry = tis.getNextTarEntry()) != null) {
               System.out.format("%s: %d\n", tarEntry.getName(), tarEntry.getRealSize());
   
               String fullName = tarEntry.getName();
               String filename = fullName.substring(fullName.lastIndexOf('/') + 1);
               if (tarEntry.isFile() && (filename.equals("metadata.properties") || filename.contentEquals("creation.meta"))) {
                   TarArchiveEntry ae = new TarArchiveEntry(filename);
                   ae.setSize(tarEntry.getRealSize());
                   tos.putArchiveEntry(ae);
                   IOUtils.copy(tis, tos);
                   tos.closeArchiveEntry();
               }
           }
           
           tos.finish();
           tos.close();
           tis.close();
   ```
   The above is just example code, without exception handling, etc. but it is able to create the required tarball from a source segment.
   
   The real performance win is because the two files of interest occur in the segment tarball before the big files (`columns.psf`, `star_tree_index`, etc) so it would be appropriate to enforce this ordering in the segment writer utility code.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org

[GitHub] [pinot] Jackie-Jiang commented on issue #7791: Only download required files from segment for metadata push

Posted by GitBox <gi...@apache.org>.

Jackie-Jiang commented on issue #7791:
URL: https://github.com/apache/pinot/issues/7791#issuecomment-973130363


   Oh I see the problem you referred to. There are 2 versions of metadata push, one also push the segment metadata to the controller so that controller doesn't need to download the segment from the deep store.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org

[GitHub] [pinot] richardstartin commented on issue #7791: Only download required files from segment for metadata push

Posted by GitBox <gi...@apache.org>.

richardstartin commented on issue #7791:
URL: https://github.com/apache/pinot/issues/7791#issuecomment-978440426


   So the analogy with transaction logs was messing with a system’s state without fully understanding the consequences. You do it at your own risk, and file structures shouldn’t be optimised to simplify manual actions.
   
   I agree that the exact order of the segment file changing doesn’t create a correctness problem, but it does make a very worthwhile optimisation brittle. My concern is that this optimisation is always an optimisation and doesn’t break for silly reasons.
   
   _Clean_ is a very subjective term. It means different things to different people. Files with different suffixes in the same directory as the segments does not feel _unclean_ to me.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org

[GitHub] [pinot] richardstartin edited a comment on issue #7791: Only download required files from segment for metadata push

Posted by GitBox <gi...@apache.org>.

richardstartin edited a comment on issue #7791:
URL: https://github.com/apache/pinot/issues/7791#issuecomment-978385352

> could you explain what you mean by "this works everywhere"? That makes me feel like I don't really understand what you're proposing, and/or that you don't understand what I'm proposing. Streaming unpack of files should also "work everywhere", and should be highly performant, with minimal code changes.

I understand what you’re proposing. You want to start downloading the segment and, relying on the metadata being at the start of the segment, essentially abort the download once you have the metadata. The motivation is to save the download time for most of the segment file. Is that correct?

By everywhere, I mean for all filesystem implementations and for all segments, no matter how they have been generated or by what process. It is not currently specified where the metadata file is within the segment file, segment files may not have been generated by the code within this repository, and the metadata may not be at the start of the file.

> As to the mismatch scenario, I'm assuming the idea is to store segments in directory A, and associated metadata in directory B. If an ops person copies updated segments to an archive dir, and forgets to copy the associated metadata, you would have miss-matched data.

Without meaning to appear facetious, this sounds a bit like what happens when users occasionally tamper with transaction logs in RDBMS and hope for good outcomes. Shouldn’t the user just not do that?

Regarding directories, assuming metadata is stored in a separate file from the segment, is there a good reason for it to be in a different directory? WDYT @mcvsubbu? I know you expressed similar concerns to mine on slack.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org

[GitHub] [pinot] kkrugler commented on issue #7791: Only download required files from segment for metadata push

Posted by GitBox <gi...@apache.org>.

kkrugler commented on issue #7791:
URL: https://github.com/apache/pinot/issues/7791#issuecomment-978403625

@richardstartin - yes, re aborting the download once the two files of interest (`metadata.properties` and `creation.meta`) have been extracted. This will still work even if these files aren't near the beginning of the file, or before the bigger data files. It will just not be as much of an optimization, but even in the worst case it will be faster than the current code (no creation of temp files on local disk).

As to what happens with copying of files - it's not really the same as messing with transaction logs. Ops people (like the team for one of our clients) copy around segment files all the time, as they try out our different optimizations for how data is indexed, or manually roll back after there's an issue with the data being used to build new segments.

Re where to put the metadata files - If the two metadata files are packaged as as currently expected by the controller, then they'll be `<blah>.tar.gz` files. In that case they really shouldn't be in the same directory as the actual segment tarballs, as segment pushes will try to push everything with that suffix to the controller. If you do a metadata push then it could rely on a naming convention to only process the metadata files.

You could leave the files untarred, but now you have a bunch of `<name>-metadata.properties` and `<name>-creation.meta` files littering the segment directory. We have HDFS directories with 1200+ segments, so adding another 2400 of these tiny files doesn't feel clean.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org

[GitHub] [pinot] richardstartin commented on issue #7791: Only download required files from segment for metadata push

Posted by GitBox <gi...@apache.org>.

richardstartin commented on issue #7791:
URL: https://github.com/apache/pinot/issues/7791#issuecomment-979221370


   For the sake of considering alternatives, if the required metadata could be squeezed into 2KB it could be stored in the file attributes on the segment file. S3 allows [retrieval of all attributes](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingMetadata.html) without downloading the file via a `HeadObject` request, and HDFS [extended file attributes](https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/ExtendedAttributes.html) support a similar lightweight access protocol. Naturally, GCS has the same [concept](https://cloud.google.com/storage/docs/metadata), as does [Azure](https://docs.microsoft.com/en-us/rest/api/storageservices/get-file-metadata). I believe this would address concerns about separation, directory structures, but also provide good guarantees., the only catch is the limit of 2KB for user defined attributes.
   
   The metadata written into creation.meta consists of a crc and creation time, metadata.properties is heavier, but it could all be written in to a JSON object which could be compressed, base64 encoded and saved as a single file attribute on the segment file. 
   
   Optimistic `HeadObject`/`getfattr` requests would be made to the filesystem. Support for old segment files without the correct metadata would be provided by falling back to the mechanism proposed here, and eventually metadata would always be retrieved successfully first time by reading file attributes without downloading any segment files.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org

[GitHub] [pinot] Jackie-Jiang commented on issue #7791: Only download required files from segment for metadata push

Posted by GitBox <gi...@apache.org>.

Jackie-Jiang commented on issue #7791:
URL: https://github.com/apache/pinot/issues/7791#issuecomment-977312406


   Just went over the code, and we are actually using the proper metadata push API (the other way is letting the controller download the segment and extract the metadata). The inefficiency comes from not using the metadata from the local generated segment.
   Ideally, we should follow these steps when generating and pushing the segments:
   1. Generate the segments from the raw data
   2. Cache the segment metadata from the local files
   3. Tar the segments and upload to deep store
   4. Push segments using the cached metadata


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org

[GitHub] [pinot] kkrugler commented on issue #7791: Only download required files from segment for metadata push

Posted by GitBox <gi...@apache.org>.

kkrugler commented on issue #7791:
URL: https://github.com/apache/pinot/issues/7791#issuecomment-977233280


   Hi @Jackie-Jiang - yes, re using a metadata push so that the controller doesn't have to download the segment. But I'm curious what the 2 versions of metadata push were. I only know of a full segment push, and a metadata push (at least from how you can configure the job yaml).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org

[GitHub] [pinot] richardstartin commented on issue #7791: Only download required files from segment for metadata push

Posted by GitBox <gi...@apache.org>.

richardstartin commented on issue #7791:
URL: https://github.com/apache/pinot/issues/7791#issuecomment-978385352

By everywhere, I mean for all filesystem implementations and for all segments, no matter how they have been generated or by what. It is not currently specified where the metadata file is within the segment file, segment files may not have been generated by the code within this repository, and the metadata may not be at the start of the file.

Without meaning to appear facetious, this sounds a bit like what happens when users occasionally tamper with transaction logs in RDBMS and hope for same outcomes. Shouldn’t the user just not do that?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org

[GitHub] [pinot] kkrugler commented on issue #7791: Only download required files from segment for metadata push

Posted by GitBox <gi...@apache.org>.

kkrugler commented on issue #7791:
URL: https://github.com/apache/pinot/issues/7791#issuecomment-978106616


   Hi @Jackie-Jiang - yes, it would be more efficient to save the metadata when doing a generate-and-push job. But for our use case, we run Hadoop map-reduce jobs to build segments (highly parallelized) and save to HDFS, and then separately have Pinot jobs to push the segments. So for that use case, we'd want to still do this optimization. Others on Slack have suggested saving the metadata files to a separate directory during build time, but in my mind that adds both complexity and the potential for data miss-match without much benefit.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org

[GitHub] [pinot] Jackie-Jiang commented on issue #7791: Only download required files from segment for metadata push

Posted by GitBox <gi...@apache.org>.

Jackie-Jiang commented on issue #7791:
URL: https://github.com/apache/pinot/issues/7791#issuecomment-978170836


   Make sense.
   To achieve the benefits described above, we need to ensure 2 things:
   1. The metadata files should be put in the beginning of the tar ball
   2. The `PinotFS.open()` implementation only download block of data instead of the whole file (checked the `S3PinotFS` and seems that is not the case)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org

[GitHub] [pinot] kkrugler commented on issue #7791: Only download required files from segment for metadata push

Posted by GitBox <gi...@apache.org>.

kkrugler commented on issue #7791:
URL: https://github.com/apache/pinot/issues/7791#issuecomment-972484473


   @Jackie-Jiang - the problem with `TarGzCompressionUtils.untarOneFile()` is that it still requires the segment to be previously downloaded from deep storage to local storage, which is the time-consuming bit that can be avoided.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org

[GitHub] [pinot] Jackie-Jiang commented on issue #7791: Only download required files from segment for metadata push

Posted by GitBox <gi...@apache.org>.

Jackie-Jiang commented on issue #7791:
URL: https://github.com/apache/pinot/issues/7791#issuecomment-972367707


   We can use `TarGzCompressionUtils.untarOneFile()` to untar one single file from the segment tarball


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org

[GitHub] [pinot] kkrugler commented on issue #7791: Only download required files from segment for metadata push

Posted by GitBox <gi...@apache.org>.

kkrugler commented on issue #7791:
URL: https://github.com/apache/pinot/issues/7791#issuecomment-978344683

Hi @richardstartin - could you explain what you mean by "this works everywhere"? That makes me feel like I don't really understand what you're proposing, and/or that you don't understand what I'm proposing. Streaming unpack of files should also "work everywhere", and should be highly performant, with minimal code changes.

As to the mismatch scenario, I'm assuming the idea is to store segments in directory A, and associated metadata in directory B. If an ops person copies updated segments to an archive dir, and forgets to copy the associated metadata, you would have miss-matched data. My previous experience dealing with Hadoop map files, which had an index and an associated data file, taught me the dangers of having to keep files in sync. But maybe there's a very different approach you're proposing that I'm missing, so some clarification would be great.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org

[GitHub] [pinot] richardstartin commented on issue #7791: Only download required files from segment for metadata push

Posted by GitBox <gi...@apache.org>.

richardstartin commented on issue #7791:
URL: https://github.com/apache/pinot/issues/7791#issuecomment-978180856


   > Others on Slack have suggested saving the metadata files to a separate directory during build time, but in my mind that adds both complexity and the potential for data miss-match without much benefit.
   
   The benefit is a guarantee that this works everywhere. Could you elaborate on how the the mismatch scenario arises in concrete terms?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org

[GitHub] [pinot] kkrugler commented on issue #7791: Only download required files from segment for metadata push

Posted by GitBox <gi...@apache.org>.

kkrugler commented on issue #7791:
URL: https://github.com/apache/pinot/issues/7791#issuecomment-978462020


   @Jackie-Jiang - good catch re S3PinotFS buffering the entire file into memory on the `open()` call. I've checked the other FS implementations, and they don't seem to be doing a complete download, but rather are returning a stream to the original source data. I've pinged @KKcorps on Slack to find out the reasoning here, as it seems possible to use the S3Client support for `getObject()` to avoid the download.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org