Posted to notifications@accumulo.apache.org by "cshannon (via GitHub)" <gi...@apache.org> on 2023/07/14 15:34:24 UTC
[GitHub] [accumulo] cshannon commented on pull request #3614: WIP - Add file metadata ITs
cshannon commented on PR #3614:
URL: https://github.com/apache/accumulo/pull/3614#issuecomment-1636032898
Example output from the current splitsWithExistingRangesTest(), in which a file has 3 existing ranges, data is ingested, and splits are added afterward:
#### Ingest the initial 100,000 records and verify
```
100,000 records written | 225,225 records/sec | 3,900,000 bytes written | 8,783,783 bytes/sec | 0.444 secs
100,000 records read | 289,017 records/sec | 3,900,000 bytes read | 11,271,676 bytes/sec | 0.346 secs
```
#### Take the table offline and manually fence the RFile so that only 75,000 records should be readable
```
Row: 1<; File Name: F0000001.rf; Range: [row_0000000000%00; : [] 9223372036854775807 false,row_0000025000%00; : [] 9223372036854775807 false); Entries: 25000, Size: 5013
Row: 1<; File Name: F0000001.rf; Range: [row_0000050000%00; : [] 9223372036854775807 false,row_0000075000%00; : [] 9223372036854775807 false); Entries: 25000, Size: 5013
Row: 1<; File Name: F0000001.rf; Range: [row_0000075000%00; : [] 9223372036854775807 false,row_0000100000%00; : [] 9223372036854775807 false); Entries: 25000, Size: 5013
```
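As a quick sanity check on the 75,000 figure, here is a minimal sketch that totals the rows covered by the three fenced ranges above. It assumes the `%00` suffix on each boundary means the start row is excluded and the end row is included, so each fence is modeled as `(startExclusive, endInclusive]` over integer row ids. The class and method names are hypothetical and not part of the PR.

```java
// Hypothetical helper, not part of the PR: totals rows covered by the fences.
final class FencedRangeCount {
    // Each fence is {startExclusive, endInclusive}, taken from the ranges shown above.
    // Assumption: "row_N%00" as an inclusive start key excludes row N itself,
    // and as an exclusive end key includes row N.
    static final long[][] FENCES = {
        {0, 25_000},
        {50_000, 75_000},
        {75_000, 100_000},
    };

    // Number of integer row ids in (startExclusive, endInclusive].
    static long size(long[] fence) {
        return fence[1] - fence[0];
    }

    static long totalReadable(long[][] fences) {
        long total = 0;
        for (long[] fence : fences) {
            total += size(fence);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("Readable rows: " + totalReadable(FENCES));
    }
}
```

Under that assumption each fence covers 25,000 rows, for 75,000 in total, which matches the three `Entries: 25000` lines.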
#### Bring the table online and verify that only 75,000 records can be read
```
25,000 records read | 268,817 records/sec | 975,000 bytes read | 10,483,870 bytes/sec | 0.093 secs
2023-07-14T11:26:13,759 [test.VerifyIngest] WARN : Scan returned nothing, breaking...
50,000 records read | 342,465 records/sec | 1,950,000 bytes read | 13,356,164 bytes/sec | 0.146 secs
```
#### Add 10 splits, each covering 10,000 records
```
Row: 1;row_0000010000; File Name: F0000001.rf; Range: [row_0000000000%00; : [] 9223372036854775807 false,row_0000025000%00; : [] 9223372036854775807 false); Entries: 8333, Size: 1671
Row: 1;row_0000020000; File Name: F0000001.rf; Range: [row_0000000000%00; : [] 9223372036854775807 false,row_0000025000%00; : [] 9223372036854775807 false); Entries: 10000, Size: 2005
Row: 1;row_0000030000; File Name: F0000001.rf; Range: [row_0000000000%00; : [] 9223372036854775807 false,row_0000025000%00; : [] 9223372036854775807 false); Entries: 6667, Size: 1338
Row: 1;row_0000060000; File Name: F0000001.rf; Range: [row_0000050000%00; : [] 9223372036854775807 false,row_0000075000%00; : [] 9223372036854775807 false); Entries: 10937, Size: 2193
Row: 1;row_0000070000; File Name: F0000001.rf; Range: [row_0000050000%00; : [] 9223372036854775807 false,row_0000075000%00; : [] 9223372036854775807 false); Entries: 6027, Size: 1208
Row: 1;row_0000080000; File Name: F0000001.rf; Range: [row_0000050000%00; : [] 9223372036854775807 false,row_0000075000%00; : [] 9223372036854775807 false); Entries: 8036, Size: 1612
Row: 1;row_0000080000; File Name: F0000001.rf; Range: [row_0000075000%00; : [] 9223372036854775807 false,row_0000100000%00; : [] 9223372036854775807 false); Entries: 6000, Size: 1202
Row: 1;row_0000090000; File Name: F0000001.rf; Range: [row_0000075000%00; : [] 9223372036854775807 false,row_0000100000%00; : [] 9223372036854775807 false); Entries: 9000, Size: 1805
Row: 1;row_0000100000; File Name: F0000001.rf; Range: [row_0000075000%00; : [] 9223372036854775807 false,row_0000100000%00; : [] 9223372036854775807 false); Entries: 10000, Size: 2006
```
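The per-tablet `Entries` values after the split look like estimates rather than exact counts (for example 8333 and 6667). One way to check they are still consistent is to group them by fenced range and confirm that each group sums to the original 25,000. The sketch below does only that arithmetic; the numbers are copied from the output above and the class name is hypothetical.

```java
// Hypothetical helper, not part of the PR: sums the post-split entry
// counts for each of the three fenced ranges.
final class SplitEstimateCheck {
    // Entries values copied from the metadata output above, grouped by fence.
    static final long[][] ENTRIES_BY_FENCE = {
        {8333, 10000, 6667},   // fence ending at row_0000025000
        {10937, 6027, 8036},   // fence ending at row_0000075000
        {6000, 9000, 10000},   // fence ending at row_0000100000
    };

    static long sum(long[] values) {
        long total = 0;
        for (long v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        for (long[] group : ENTRIES_BY_FENCE) {
            System.out.println(sum(group));
        }
    }
}
```

Each group adds up to 25,000, so the split appears to redistribute the per-range counts without changing the total of 75,000.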
#### Re-verify that only 75,000 records can be read
```
25,000 records read | 112,107 records/sec | 975,000 bytes read | 4,372,197 bytes/sec | 0.223 secs
2023-07-14T11:26:20,456 [test.VerifyIngest] WARN : Scan returned nothing, breaking...
50,000 records read | 109,170 records/sec | 1,950,000 bytes read | 4,257,641 bytes/sec | 0.458 secs
```
#### Run a compaction and show output of files
```
Row: 1;row_0000010000; File Name: A000000e.rf; Range: (-inf,+inf); Entries: 10000, Size: 2190
Row: 1;row_0000020000; File Name: A000000f.rf; Range: (-inf,+inf); Entries: 10000, Size: 2204
Row: 1;row_0000030000; File Name: A000000l.rf; Range: (-inf,+inf); Entries: 5000, Size: 1191
Row: 1;row_0000060000; File Name: A000000j.rf; Range: (-inf,+inf); Entries: 10000, Size: 2201
Row: 1;row_0000070000; File Name: A000000k.rf; Range: (-inf,+inf); Entries: 10000, Size: 2201
Row: 1;row_0000080000; File Name: A000000g.rf; Range: (-inf,+inf); Entries: 10000, Size: 2202
Row: 1;row_0000090000; File Name: A000000i.rf; Range: (-inf,+inf); Entries: 10000, Size: 2202
Row: 1;row_0000100000; File Name: A000000h.rf; Range: (-inf,+inf); Entries: 10000, Size: 2222
```
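The post-compaction entry counts can be reproduced by intersecting each tablet with the original fences. The sketch below is a simplified model, not the test code: it assumes tablets cover `(prevEndRow, endRow]` over integer row ids with a split every 10,000 rows, and fences are `(startExclusive, endInclusive]` as in the metadata output above. All names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model, not part of the PR: expected readable rows per tablet.
final class TabletFenceOverlap {
    // Fences as {startExclusive, endInclusive}, from the pre-split metadata.
    static final long[][] FENCES = {
        {0, 25_000},
        {50_000, 75_000},
        {75_000, 100_000},
    };

    // Rows shared by the tablet (prevEnd, end] and one fence.
    static long overlap(long prevEnd, long end, long[] fence) {
        long lo = Math.max(prevEnd, fence[0]);
        long hi = Math.min(end, fence[1]);
        return Math.max(0, hi - lo);
    }

    // Maps each tablet end row to the number of rows still readable in it.
    static Map<Long, Long> readablePerTablet(long splitInterval, int splitCount) {
        Map<Long, Long> result = new LinkedHashMap<>();
        for (int i = 1; i <= splitCount; i++) {
            long end = i * splitInterval;
            long prevEnd = end - splitInterval;
            long total = 0;
            for (long[] fence : FENCES) {
                total += overlap(prevEnd, end, fence);
            }
            result.put(end, total);
        }
        return result;
    }

    public static void main(String[] args) {
        readablePerTablet(10_000, 10).forEach((endRow, rows) ->
            System.out.println("end row " + endRow + ": " + rows + " rows"));
    }
}
```

In this model the tablet ending at row 30,000 keeps 5,000 rows, the tablets ending at 40,000 and 50,000 keep none (consistent with no files being listed for them), and every other tablet keeps 10,000, for 75,000 overall.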
#### Verify after compaction
```
25,000 records read | 112,107 records/sec | 975,000 bytes read | 4,372,197 bytes/sec | 0.223 secs
2023-07-14T11:26:20,456 [test.VerifyIngest] WARN : Scan returned nothing, breaking...
50,000 records read | 109,170 records/sec | 1,950,000 bytes read | 4,257,641 bytes/sec | 0.458 secs
```