Posted to users@jackrabbit.apache.org by "Dirk V. Schesmer" <di...@mac.com> on 2006/04/21 10:58:02 UTC
Accessing Word and PDF Content
Hi All,
I'd like to ask how I can extract the content of the file
testwordfile.doc from the repository and save it to the local file
system. I can already store it in my Jackrabbit repository using the
addDocFile() method below, and I can find it again using the
saveDocFile() method, also shown below. But how do I extract the
content, determine e.g. the mime type, and set the encoding needed to
save it successfully?
Any help appreciated!
Dirk V. Schesmer
Stuttgart/Germany
-------
public void addDocFile(Node root, Session session) throws Exception {
    Node folderNode = root.addNode("foldernode", "nt:folder");
    File docFile = new File(
            "/Users/dschesmer/jackrabbitJCR/testdocuments/testwordfile.doc");
    Node docFileNode = folderNode.addNode(docFile.getName(), "nt:file");
    String docMimeType = "application/msword";
    Node docResourceNode = docFileNode.addNode("jcr:content", "nt:resource");
    docResourceNode.setProperty("jcr:mimeType", docMimeType);
    //resourceNode.setProperty("jcr:encoding", ""); //Needed?
    docResourceNode.setProperty("jcr:data", new FileInputStream(docFile));
    Calendar docLastModified = Calendar.getInstance();
    docLastModified.setTimeInMillis(docFile.lastModified());
    docResourceNode.setProperty("jcr:lastModified", docLastModified);
    session.save();
}

public void saveDocFile(Node root, Session session) throws Exception {
    //Now do my test search
    Workspace workspace = session.getWorkspace();
    QueryManager queryManager = workspace.getQueryManager();
    Query query = queryManager.createQuery(
            "/jcr:root/foldernode//*", Query.XPATH);
    QueryResult result = query.execute();
    NodeIterator niter = result.getNodes();
    while (niter.hasNext()) {
        Node n = niter.nextNode();
        // toDo: extract word doc and save it into the file system...
        System.out.println("node: " + n);
    }
}
Re: Accessing Word and PDF Content
Posted by Marcel Reutegger <ma...@gmx.net>.
Dirk V. Schesmer wrote:
> I can now get back my MS-Word file!
>> an optional encoding when supported by the mime type
>
> How can I find out which mime type supports and/or requires which
> encoding?
as a general rule of thumb binary files do not have an encoding whereas
files that consist of plain text have an encoding.
see: http://www.iana.org/assignments/media-types/
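
To make the rule of thumb concrete, here is a minimal sketch in Java.
The class EncodingRule and the method hasEncoding are hypothetical
names, not part of the JCR API. The check only treats text/* media
types as character data, which is a simplification: some application/*
types, XML for example, are text as well.

```java
import java.util.Locale;

public class EncodingRule {

    /**
     * Rule of thumb only: media types under text/ carry characters and
     * therefore have an encoding; everything else is treated as binary.
     */
    static boolean hasEncoding(String mimeType) {
        return mimeType != null
                && mimeType.toLowerCase(Locale.ROOT).startsWith("text/");
    }

    public static void main(String[] args) {
        System.out.println(hasEncoding("application/msword")); // false
        System.out.println(hasEncoding("text/plain"));         // true
    }
}
```

With a check like this, jcr:encoding would only be set on the
nt:resource node when hasEncoding returns true, and left out for a
Word document.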
> Does the JCR-API support me here?
no.
> BTW: Is there a comprehensive demo application I can have a look at to
> further deepen my JCR knowledge?
there are a couple of other projects / applications that work on top of
jackrabbit. you might want to try them out. see:
http://wiki.apache.org/jackrabbit/JcrLinks
regards
marcel
Re: Accessing Word and PDF Content
Posted by "Dirk V. Schesmer" <di...@mac.com>.
Marcel, thanks!
I can now get back my MS-Word file!
> an optional encoding when supported by the mime type
How can I find out which mime type supports and/or requires which
encoding? Does the JCR-API support me here?
Thanks for help,
Dirk
BTW: Is there a comprehensive demo application I can have a look at
to further deepen my JCR knowledge?
>> docResourceNode.setProperty("jcr:mimeType", docMimeType);
>> //resourceNode.setProperty("jcr:encoding", ""); //Needed?
Am 24.04.2006 um 08:46 schrieb Marcel Reutegger:
> Dirk V. Schesmer wrote:
>> Hi All,
>> I'd like to ask for help telling me how I can extract the content
>> of file testwordfile.doc and save it in the local file system. I
>> am already able to save it successfully into my jackrabbit
>> repository using the addDocFile() method below. Also, I can find
>> it using the saveDocFile() method also shown below. But how to extract
>> the content, to determine e.g. the mime type and to set the
>> encoding needed to save it successfully?
>
> an nt:resource is just a binary stream and may have an optional
> encoding when supported by the mime type. e.g. a word document will
> not have an encoding, but a plain text file will have one.
>
> to read the document from the repository you simply navigate to the
> binary data property and get the value as an input stream:
>
> Node resource = ...
> InputStream in = resource.getProperty("jcr:data").getStream();
> // now spool the stream to a local file...
>
>
> regards
> marcel
>
Re: Accessing Word and PDF Content
Posted by Marcel Reutegger <ma...@gmx.net>.
Dirk V. Schesmer wrote:
> Hi All,
> I'd like to ask for help telling me how I can extract the content of
> file testwordfile.doc and save it in the local file system. I am already
> able to save it successfully into my jackrabbit repository using the
> addDocFile() method below. Also, I can find it using the saveDocFile() method
> also shown below. But how to extract the content, to determine e.g. the
> mime type and to set the encoding needed to save it successfully?
an nt:resource is just a binary stream and may have an optional encoding
when supported by the mime type. e.g. a word document will not have an
encoding, but a plain text file will have one.
to read the document from the repository you simply navigate to the
binary data property and get the value as an input stream:
Node resource = ...
InputStream in = resource.getProperty("jcr:data").getStream();
// now spool the stream to a local file...
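
The spooling step could look roughly like the sketch below. It assumes
Java 7 or later for java.nio.file.Files; the class SpoolToFile and the
method spool are hypothetical names. To keep the example runnable
without a repository, a byte array stands in for the stream that
resource.getProperty("jcr:data").getStream() would return.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class SpoolToFile {

    /**
     * Copies all bytes from the stream to the target file, replacing an
     * existing file, and closes the stream. Returns the bytes written.
     */
    static long spool(InputStream in, Path target) throws IOException {
        try (InputStream source = in) {
            return Files.copy(source, target,
                    StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for resource.getProperty("jcr:data").getStream()
        byte[] sample = {(byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0};
        Path target = Files.createTempFile("testwordfile", ".doc");
        long written = spool(new ByteArrayInputStream(sample), target);
        System.out.println(written + " bytes written to " + target);
        Files.delete(target);
    }
}
```

The bytes are copied as they are, with no Reader or Writer involved,
so no encoding comes into play for a binary file such as a Word
document.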
regards
marcel