Posted to oak-issues@jackrabbit.apache.org by "Chetan Mehrotra (JIRA)" <ji...@apache.org> on 2017/08/11 10:33:00 UTC
[jira] [Comment Edited] (OAK-6545) Tooling to serialize NodeState as json along with blobs
[ https://issues.apache.org/jira/browse/OAK-6545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123155#comment-16123155 ]
Chetan Mehrotra edited comment on OAK-6545 at 8/11/17 10:32 AM:
----------------------------------------------------------------
Implemented in 1804763-1804770.
The implementation supports the following:
* Exporting NodeState in json and cnd format
* Export can be done via an explicit {{export}} command and via a groovy console command
* Serializing blobs in FileDataStore format, i.e. blobs are stored in a local FDS
* Blob serialization can skip problematic binaries by writing a marker blobId. Such blobs fail on deserialization and are marked as "*ERROR*-<blob id>" in the serialized form
* json is written in a streaming way, so serializing a large tree is supported
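A minimal sketch of how an exported file could be checked for binaries that were skipped during serialization. It only relies on the "*ERROR*-<blob id>" marker mentioned above; the sample document, the blob id {{abc123}} and the helper name {{find_error_blobs}} are made up for illustration, and the exact way a blob reference is encoded around the marker is an assumption.

```python
import json

ERROR_MARKER = "*ERROR*-"  # marker text quoted in the comment above


def find_error_blobs(node, path="/"):
    """Yield (property_path, value) for every string value containing the marker.

    Assumes the export is a nested JSON object where child nodes are objects
    and properties are scalars or arrays of scalars.
    """
    for name, value in node.items():
        item_path = path.rstrip("/") + "/" + name
        if isinstance(value, dict):
            # Child node: descend into it.
            yield from find_error_blobs(value, item_path)
        else:
            values = value if isinstance(value, list) else [value]
            for v in values:
                if isinstance(v, str) and ERROR_MARKER in v:
                    yield (item_path, v)


# Hypothetical export content, not taken from a real repository.
sample = json.loads("""
{
  "jcr:primaryType": "nam:nt:file",
  "jcr:content": {
    "jcr:primaryType": "nam:nt:resource",
    "jcr:data": "*ERROR*-abc123"
  }
}
""")

hits = list(find_error_blobs(sample))
for prop_path, value in hits:
    print(prop_path, value)
```

Note that this loads the whole file in memory, so for very large exports a streaming JSON parser would be a better fit.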
*Export Command*
Refer to [Oak Run NodeStore Connection|https://jackrabbit.apache.org/oak/docs/features/oak-run-nodestore-connection-options.html] for details on how to connect to the various NodeStore and BlobStore implementations.
{noformat}
$ java -jar oak-run-*.jar export -p /path/in/repo /path/of/segmentstore -o /path/of/output/dir
$ java -jar oak-run-*.jar export -h
Exports NodeState as json
The export command supports exporting nodes from a repository in json. It also provide options to export the blobs
which are stored in FileDataStore format
Option                           Description
------                           -----------
-b, --blobs [Boolean]            Export blobs also. By default blobs are not exported (default: false)
-d, --depth [Integer]            Max depth to include in output (default: 2147483647)
-f, --filter <String>            Filter expression as json to filter out which nodes and properties are included in
                                   exported file (default: {"properties":["*", "-:childOrder"]})
--filter-file <File>             Filter file which contains the filter json expression
--format <String>                Export format 'json' or 'txt' (default: json)
-n, --max-child-nodes [Integer]  Maximum number of child nodes to include for a any parent (default: 2147483647)
-o, --out <File>                 Output directory where the exported json and blobs are stored (default: .)
-p, --path <String>              Repository path to export (default: /)
--pretty [Boolean]               Pretty print the json output (default: true)
{noformat}
*Export in Groovy Console*
{noformat}
$ java -jar oak-run-*.jar console /path/of/segmentstore
Apache Jackrabbit Oak 1.8-SNAPSHOT
Repository connected in read-only mode. Use '--read-write' for write operations
Jackrabbit Oak Shell (Apache Jackrabbit Oak 1.8-SNAPSHOT, JVM: 1.8.0_66)
Type ':help' or ':h' for help.
----------------------------------------------------------------------------------------------------------------------------
/> cd /var/reports
/var/reports> export -c
{
  "jcr:primaryType": "nam:sling:Folder",
  "jcr:mixinTypes": [
    "nam:rep:AccessControllable"
  ],
  "jcr:createdBy": "admin",
  "jcr:created": "dat:2017-01-26T08:02:24.122+05:30",
  "rep:policy": {
    "jcr:primaryType": "nam:rep:ACL",
    "allow": {
      "jcr:primaryType": "nam:rep:GrantACE",
      "rep:principalName": "snapshotservice",
      "rep:privileges": [
        "nam:jcr:read",
        "nam:rep:write"
      ]
    }
  }
}
/var/reports> export -h
usage: export-nodes [-h] [-p <repo_path_to_export>] [-o <dir_name>]
Export nodes and its children as json
 -b,--blobs                    Serialize blob contents also
 -c,--console                  Output to console
 -d,--depth <arg>              Maximum tree depth to write out. Default to
                               all
 -f,--filter <arg>             Filter for nodes and properties to include
                               in json format. Default {"properties":["*",
                               "-:childOrder"]}
 -h,--help                     Print usage
 -n,--max-child-nodes <arg>    maximum number of child nodes to include
 -o,--out <out>                Directory name to store json and blobs
                               (default: .)
 -p,--path <path>              Repository path to export (default: current
                               node)
{noformat}
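The console output above prefixes some values with a type code ({{nam:}} for names, {{dat:}} for dates). A minimal sketch of how such values could be decoded when post-processing an export follows. It handles only the two prefixes visible in the sample output; other type codes are passed through untouched, and the helper name {{decode}} is made up for illustration. It also assumes a plain string value never starts with one of these prefixes, which may not hold in general.

```python
import json
from datetime import datetime


def decode(value):
    """Recursively strip the 'nam:' and 'dat:' prefixes seen in the sample output."""
    if isinstance(value, dict):
        return {k: decode(v) for k, v in value.items()}
    if isinstance(value, list):
        return [decode(v) for v in value]
    if isinstance(value, str):
        if value.startswith("nam:"):
            # Name value, e.g. "nam:sling:Folder" -> "sling:Folder"
            return value[len("nam:"):]
        if value.startswith("dat:"):
            # Date value in ISO 8601 form with a UTC offset
            return datetime.fromisoformat(value[len("dat:"):])
    return value


# Subset of the console output shown above.
sample = json.loads("""
{
  "jcr:primaryType": "nam:sling:Folder",
  "jcr:mixinTypes": ["nam:rep:AccessControllable"],
  "jcr:createdBy": "admin",
  "jcr:created": "dat:2017-01-26T08:02:24.122+05:30",
  "rep:policy": {
    "jcr:primaryType": "nam:rep:ACL",
    "allow": {
      "jcr:primaryType": "nam:rep:GrantACE",
      "rep:principalName": "snapshotservice",
      "rep:privileges": ["nam:jcr:read", "nam:rep:write"]
    }
  }
}
""")

decoded = decode(sample)
print(decoded["jcr:primaryType"])
print(decoded["jcr:created"].isoformat())
```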
> Tooling to serialize NodeState as json along with blobs
> -------------------------------------------------------
>
> Key: OAK-6545
> URL: https://issues.apache.org/jira/browse/OAK-6545
> Project: Jackrabbit Oak
> Issue Type: New Feature
> Components: run
> Reporter: Chetan Mehrotra
> Assignee: Chetan Mehrotra
> Fix For: 1.8
>
>
> For debugging certain cases like OAK-6525 we need a way to analyze the hidden NodeState structure used by indexes. To simplify the effort I would like to add some tooling to oak-run which allows dumping the NodeState and its children for a certain path, along with the blob contents.