Posted to notifications@accumulo.apache.org by "cshannon (via GitHub)" <gi...@apache.org> on 2023/07/14 15:34:24 UTC

[GitHub] [accumulo] cshannon commented on pull request #3614: WIP - Add file metadata ITs

cshannon commented on PR #3614:
URL: https://github.com/apache/accumulo/pull/3614#issuecomment-1636032898

   Example output from running the current splitsWithExistingRangesTest(), in which data is ingested, the file is fenced into 3 existing ranges, and splits are added afterward:
   
   
   #### Ingest the initial 100,000 records and verify
   ```
   100,000 records written |  225,225 records/sec |    3,900,000 bytes written | 8,783,783 bytes/sec |  0.444 secs   
   100,000 records read |  289,017 records/sec |    3,900,000 bytes read | 11,271,676 bytes/sec |  0.346 secs   
   ```
   
   #### Take the table offline and manually update the file metadata to fence the RFile so only 75,000 records should be readable
   ```
   Row: 1<; File Name: F0000001.rf; Range: [row_0000000000%00; : [] 9223372036854775807 false,row_0000025000%00; : [] 9223372036854775807 false); Entries: 25000, Size: 5013
   Row: 1<; File Name: F0000001.rf; Range: [row_0000050000%00; : [] 9223372036854775807 false,row_0000075000%00; : [] 9223372036854775807 false); Entries: 25000, Size: 5013
   Row: 1<; File Name: F0000001.rf; Range: [row_0000075000%00; : [] 9223372036854775807 false,row_0000100000%00; : [] 9223372036854775807 false); Entries: 25000, Size: 5013
   ```
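
A rough sketch of the arithmetic behind the 75,000 figure, for illustration only (this is not code from the PR). It assumes rows are numbered 1 through 100,000 and reads each fenced range as the interval (start, end], on the assumption that the `%00;` suffix excludes the start row and includes the end row. `FENCED_RANGES` and `readable_rows` are hypothetical names.

```python
# Fenced ranges from the metadata output above, as (exclusive start, inclusive end)
# row numbers. Assumption: the "%00;" suffix on a start key excludes that row,
# and on an end key includes it.
FENCED_RANGES = [(0, 25_000), (50_000, 75_000), (75_000, 100_000)]


def readable_rows(ranges):
    """Count the rows covered by non-overlapping (start, end] ranges."""
    return sum(end - start for start, end in ranges)


print(readable_rows(FENCED_RANGES))  # 75000
```

Rows 25,001 through 50,000 fall outside every range, which would account for the 25,000 records that are no longer readable.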
   #### Bring online and verify only 75,000 can be read
   ```
   25,000 records read |  268,817 records/sec |      975,000 bytes read | 10,483,870 bytes/sec |  0.093 secs   
   2023-07-14T11:26:13,759 [test.VerifyIngest] WARN : Scan returned nothing, breaking...
   50,000 records read |  342,465 records/sec |    1,950,000 bytes read | 13,356,164 bytes/sec |  0.146 secs   
   ```
   #### Add 10 splits, each of 10,000 records
   ```
   Row: 1;row_0000010000; File Name: F0000001.rf; Range: [row_0000000000%00; : [] 9223372036854775807 false,row_0000025000%00; : [] 9223372036854775807 false); Entries: 8333, Size: 1671
   Row: 1;row_0000020000; File Name: F0000001.rf; Range: [row_0000000000%00; : [] 9223372036854775807 false,row_0000025000%00; : [] 9223372036854775807 false); Entries: 10000, Size: 2005
   Row: 1;row_0000030000; File Name: F0000001.rf; Range: [row_0000000000%00; : [] 9223372036854775807 false,row_0000025000%00; : [] 9223372036854775807 false); Entries: 6667, Size: 1338
   Row: 1;row_0000060000; File Name: F0000001.rf; Range: [row_0000050000%00; : [] 9223372036854775807 false,row_0000075000%00; : [] 9223372036854775807 false); Entries: 10937, Size: 2193
   Row: 1;row_0000070000; File Name: F0000001.rf; Range: [row_0000050000%00; : [] 9223372036854775807 false,row_0000075000%00; : [] 9223372036854775807 false); Entries: 6027, Size: 1208
   Row: 1;row_0000080000; File Name: F0000001.rf; Range: [row_0000050000%00; : [] 9223372036854775807 false,row_0000075000%00; : [] 9223372036854775807 false); Entries: 8036, Size: 1612
   Row: 1;row_0000080000; File Name: F0000001.rf; Range: [row_0000075000%00; : [] 9223372036854775807 false,row_0000100000%00; : [] 9223372036854775807 false); Entries: 6000, Size: 1202
   Row: 1;row_0000090000; File Name: F0000001.rf; Range: [row_0000075000%00; : [] 9223372036854775807 false,row_0000100000%00; : [] 9223372036854775807 false); Entries: 9000, Size: 1805
   Row: 1;row_0000100000; File Name: F0000001.rf; Range: [row_0000075000%00; : [] 9223372036854775807 false,row_0000100000%00; : [] 9223372036854775807 false); Entries: 10000, Size: 2006
   ```
   
   #### Re-verify only 75,000 can be read
   ```
   25,000 records read |  112,107 records/sec |      975,000 bytes read | 4,372,197 bytes/sec |  0.223 secs   
   2023-07-14T11:26:20,456 [test.VerifyIngest] WARN : Scan returned nothing, breaking...
   50,000 records read |  109,170 records/sec |    1,950,000 bytes read | 4,257,641 bytes/sec |  0.458 secs   
   ```
   
   #### Run a compaction and show output of files
   ```
   Row: 1;row_0000010000; File Name: A000000e.rf; Range: (-inf,+inf); Entries: 10000, Size: 2190
   Row: 1;row_0000020000; File Name: A000000f.rf; Range: (-inf,+inf); Entries: 10000, Size: 2204
   Row: 1;row_0000030000; File Name: A000000l.rf; Range: (-inf,+inf); Entries: 5000, Size: 1191
   Row: 1;row_0000060000; File Name: A000000j.rf; Range: (-inf,+inf); Entries: 10000, Size: 2201
   Row: 1;row_0000070000; File Name: A000000k.rf; Range: (-inf,+inf); Entries: 10000, Size: 2201
   Row: 1;row_0000080000; File Name: A000000g.rf; Range: (-inf,+inf); Entries: 10000, Size: 2202
   Row: 1;row_0000090000; File Name: A000000i.rf; Range: (-inf,+inf); Entries: 10000, Size: 2202
   Row: 1;row_0000100000; File Name: A000000h.rf; Range: (-inf,+inf); Entries: 10000, Size: 2222
   ```
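
The post-compaction entry counts can be cross-checked by intersecting each tablet with the fenced ranges. The sketch below is only an illustration under assumptions, not code from the PR: it assumes rows are numbered 1 through 100,000, that the splits sit at every 10,000th row, and that both tablets and fenced ranges behave as (start, end] intervals. All names are hypothetical.

```python
# Assumed fenced ranges and split points, as (exclusive start, inclusive end)
# row numbers taken from the output above.
FENCED_RANGES = [(0, 25_000), (50_000, 75_000), (75_000, 100_000)]
SPLITS = list(range(10_000, 100_001, 10_000))  # tablet end rows


def overlap(a, b):
    """Number of rows shared by two (start, end] intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def rows_per_tablet(splits, ranges):
    """Map each tablet end row to the count of fenced rows it contains."""
    counts = {}
    prev = 0
    for end in splits:
        counts[end] = sum(overlap((prev, end), r) for r in ranges)
        prev = end
    return counts


counts = rows_per_tablet(SPLITS, FENCED_RANGES)
for end_row, n in counts.items():
    print(f"row_{end_row:010d}: {n}")
```

Under these assumptions the tablet ending at row_0000030000 holds 5,000 rows, the tablets ending at row_0000040000 and row_0000050000 hold none (consistent with no files being listed for them), and every other tablet holds 10,000, for a total of 75,000.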
   #### Verify after compaction
   ```
   25,000 records read |  112,107 records/sec |      975,000 bytes read | 4,372,197 bytes/sec |  0.223 secs   
   2023-07-14T11:26:20,456 [test.VerifyIngest] WARN : Scan returned nothing, breaking...
   50,000 records read |  109,170 records/sec |    1,950,000 bytes read | 4,257,641 bytes/sec |  0.458 secs 
   ```
   
   

