Posted to derby-dev@db.apache.org by "Rick Hillegas (JIRA)" <ji...@apache.org> on 2011/04/27 15:24:03 UTC
[jira] [Updated] (DERBY-5201) Create tools for reading the contents of the seg0 directory
[ https://issues.apache.org/jira/browse/DERBY-5201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rick Hillegas updated DERBY-5201:
---------------------------------
Attachment: DataFileReader.java
Attaching DataFileReader.java. This program reads a file in the seg0 directory and streams out its contents as human-readable XML. Verbose printing shows the column data as byte arrays. You can refine this further by giving the tool a row signature: the column data is then deserialized into objects and the toString() results are shown. The actual database is not booted or otherwise disturbed. A transient in-memory helper database is created only if you ask the tool to deserialize the column contents.
More work could be done on formatting overflow data.
I have run the tool on the SYSCONGLOMERATES file (c20.dat) in databases created by 10.0, 10.1, 10.2, 10.3, 10.4, 10.5, 10.6, 10.7, 10.8, and 10.9 (trunk). I have also tested the tool on a file containing a UDT.
Here is the tool's usage message:
Usage:
java DataFileReader $dataFileName [ -v ] [ -d $D ] [ -p $P ] [ -n $N ]
-v Verbose. Print out records and slot tables. Field data appears as byte arrays. If you do not set this flag, the tool just decodes the page headers.
-d Data signature. This makes a verbose printout turn the field data into objects. $D is a row signature, e.g., "( a int, b varchar( 30 ) )"
-p Starting page. $P is a number which must be at least 1, the first page to read after the header. Page 0 (the header) is always read.
-n Number of pages to read. $N is a positive number. Defaults to all subsequent pages.
For example, the following command deserializes all of the records in the SYSCONGLOMERATES file:
java DataFileReader db/seg0/c20.dat -v -d "( a char(36), b char(36), c bigint, d varchar( 128), e boolean, f serializable, g boolean, h char( 36 ) )"
Note the special 'serializable' type in the preceding example. Use 'serializable' for user-defined types and for the system columns which are objects.
Here are some sample use cases:
-------------------------------------------
1) Decode an entire data file, putting the resulting xml in the file z.xml. You can then view that file using a browser like Firefox, which lets you collapse and expand the elements.
java DataFileReader db/seg0/c20.dat -v -d "( a char(36), b char(36), c bigint, d varchar( 128), e boolean, f serializable, g boolean, h char( 36 ) )" > z.xml
-------------------------------------------
2) Pretty-print the file header:
java DataFileReader db/seg0/c20.dat -n 1
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<dataFile>
<fileHeader>
<allocPage number="0">
<formatableID>118</formatableID>
<pageHeader>
<isOverFlowPage>false</isOverFlowPage>
<status hexvalue="1">
<flag>VALID_PAGE</flag>
</status>
<pageVersion>9</pageVersion>
<slotsInUse>0</slotsInUse>
<nextRecordID>6</nextRecordID>
<pageGeneration>0</pageGeneration>
<previousGeneration>0</previousGeneration>
<beforeImagePageLocation>0</beforeImagePageLocation>
<deletedRowCount>1</deletedRowCount>
</pageHeader>
<nextAllocPageNumber>-1</nextAllocPageNumber>
<nextAllocPageOffset>0</nextAllocPageOffset>
<containerInfoLength>80</containerInfoLength>
<containerInfo>
<formatableID>116</formatableID>
<containerStatus hexvalue="0">
</containerStatus>
<pageSize>4096</pageSize>
<spareSpace>0</spareSpace>
<minimumRecordSize>12</minimumRecordSize>
<initialPages>1</initialPages>
<preAllocSize>8</preAllocSize>
<firstAllocPageNumber>0</firstAllocPageNumber>
<firstAllocPageOffset>0</firstAllocPageOffset>
<containerVersion>0</containerVersion>
<estimatedRowCount>71</estimatedRowCount>
<reusableRecordIdSequenceNumber>0</reusableRecordIdSequenceNumber>
<spare>0</spare>
<checksum>2463908068</checksum>
</containerInfo>
<allocationExtent>
<extentOffset>4096</extentOffset>
<extentStart>1</extentStart>
<extentEnd>10216</extentEnd>
<extentLength>8</extentLength>
<extentStatus hexvalue="30000010">
<flag>HAS_UNFILLED_PAGES</flag>
<flag>KEEP_UNFILLED_PAGES</flag>
<flag>NO_DEALLOC_PAGE_MAP</flag>
</extentStatus>
<preAllocLength>7</preAllocLength>
<reserved1>0</reserved1>
<reserved2>0</reserved2>
<reserved3>0</reserved3>
<freePages totalLength="8" bitsThatAreSet="0"/>
<unFilledPages totalLength="8" bitsThatAreSet="1"/>
</allocationExtent>
</allocPage>
</fileHeader>
<pageCount>1</pageCount>
</dataFile>
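The hexvalue attributes in the sample output look like bit masks: the extentStatus value 30000010 has three bits set, and three flag elements are listed under it. Here is a minimal sketch that splits such a value into its individual bits. StatusBits and setBits are hypothetical names and are not part of DataFileReader. The sketch does not say which bit carries which flag name, since the output above does not state that.

```java
import java.util.ArrayList;
import java.util.List;

public class StatusBits {
    /**
     * Splits a hex string, such as the hexvalue attribute of a status
     * element, into the individual bits that are set (lowest bit first).
     * Assumes the value fits in 32 bits.
     */
    public static List<String> setBits(String hexValue) {
        long value = Long.parseLong(hexValue, 16);
        List<String> bits = new ArrayList<>();
        for (long mask = 1; mask <= 0x80000000L; mask <<= 1) {
            if ((value & mask) != 0) {
                bits.add("0x" + Long.toHexString(mask));
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        // The extentStatus value from the sample header above.
        System.out.println(setBits("30000010")); // prints [0x10, 0x10000000, 0x20000000]
    }
}
```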
-------------------------------------------
3) Count the number of pages in a data file:
java DataFileReader db/seg0/c20.dat | grep pageCount
<pageCount>9</pageCount>
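If grep is not available, or you want the count as a number inside a program, the XML can be parsed instead. Here is a minimal sketch which assumes the tool writes only well-formed XML to standard output, as in the sample above. PageCountReader is a hypothetical helper and is not part of the attachment.

```java
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class PageCountReader {
    /** Returns the value of the first pageCount element in the XML stream. */
    public static int pageCount(InputStream xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(xml);
        NodeList nodes = doc.getElementsByTagName("pageCount");
        if (nodes.getLength() == 0) {
            throw new IllegalArgumentException("no pageCount element found");
        }
        return Integer.parseInt(nodes.item(0).getTextContent().trim());
    }

    public static void main(String[] args) throws Exception {
        // Reads the XML from standard input, e.g. piped from DataFileReader.
        System.out.println(pageCount(System.in));
    }
}
```

It could be used like this:
java DataFileReader db/seg0/c20.dat | java PageCountReader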
-------------------------------------------
4) Decode 3 pages, starting at page 2. This one is a little tricky because the header page is always decoded. So you need to ask for 4 pages (3 data pages plus 1 header page):
java DataFileReader db/seg0/c20.dat -v -p 2 -n 4
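The page arithmetic in use case 4 can be sketched as a small helper. It assumes, as described above, that the -n count includes the header page, so the value passed to -n is the number of data pages plus one. PageRangeArgs and pageArgs are hypothetical names and are not part of the tool.

```java
import java.util.Arrays;
import java.util.List;

public class PageRangeArgs {
    /**
     * Builds the -p and -n arguments needed to decode dataPages pages
     * beginning at startPage. The header page (page 0) is always read,
     * so it is added to the -n count.
     */
    public static List<String> pageArgs(int startPage, int dataPages) {
        if (startPage < 1) {
            throw new IllegalArgumentException("starting page must be at least 1");
        }
        if (dataPages < 1) {
            throw new IllegalArgumentException("need at least one data page");
        }
        int n = dataPages + 1; // data pages plus the header page
        return Arrays.asList("-p", String.valueOf(startPage), "-n", String.valueOf(n));
    }

    public static void main(String[] args) {
        // 3 data pages starting at page 2
        System.out.println(String.join(" ", pageArgs(2, 3))); // prints -p 2 -n 4
    }
}
```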
> Create tools for reading the contents of the seg0 directory
> -----------------------------------------------------------
>
> Key: DERBY-5201
> URL: https://issues.apache.org/jira/browse/DERBY-5201
> Project: Derby
> Issue Type: Task
> Components: Tools
> Affects Versions: 10.9.0.0
> Reporter: Rick Hillegas
> Attachments: DataFileReader.java
>
>
> It would be nice to have tools which read Derby data files (the files in the seg0 directory) without disturbing their contents.