Posted to dev@nutch.apache.org by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2013/05/10 21:09:15 UTC
[jira] [Updated] (NUTCH-1570) Add filtering capability to Datastore Queries
[ https://issues.apache.org/jira/browse/NUTCH-1570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lewis John McGibbney updated NUTCH-1570:
----------------------------------------
Description:
For some time this issue has been discussed on various lists.
While upgrading the Gora dependencies in NUTCH-1569, I stumbled across a comment within o.a.n.api.DbReader#iterator:
{code}
public Iterator<Map<String,Object>> iterator(String[] fields, String startKey, String endKey,
    String batchId) throws Exception {
  Query<String,WebPage> q = store.newQuery();
  String[] qFields = fields;
  if (fields != null) {
    HashSet<String> flds = new HashSet<String>(Arrays.asList(fields));
    // remove "url"
    flds.remove("url");
    if (flds.size() > 0) {
      qFields = flds.toArray(new String[flds.size()]);
    } else {
      qFields = null;
    }
  }
  q.setFields(qFields);
  if (startKey != null) {
    q.setStartKey(startKey);
    if (endKey != null) {
      q.setEndKey(endKey);
    }
  }
  Result<String,WebPage> res = store.execute(q);
  // XXX we should add the filtering capability to Query
  return new DbIterator(res, fields, batchId);
}
{code}
I will link this issue to something over on Gora once we get around to the implementation.
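To make the XXX comment concrete, here is a minimal sketch of what the filtering amounts to when it is done on the client side, after the query has already run. FilteringIterator and the "key:batchId" row strings are hypothetical and exist only for illustration; they are not part of Nutch or Gora, and this is not how DbIterator is actually written.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

/** Hypothetical helper (not Nutch or Gora API): drops rows that fail a predicate. */
class FilteringIterator<T> implements Iterator<T> {
  private final Iterator<T> source;
  private final Predicate<? super T> filter;
  private T next;          // look-ahead element that passed the filter
  private boolean hasNext; // true while 'next' holds a valid element

  FilteringIterator(Iterator<T> source, Predicate<? super T> filter) {
    this.source = source;
    this.filter = filter;
    advance();
  }

  /** Pulls from the source until an element passes the filter or the source ends. */
  private void advance() {
    hasNext = false;
    next = null;
    while (source.hasNext()) {
      T candidate = source.next();
      if (filter.test(candidate)) {
        next = candidate;
        hasNext = true;
        return;
      }
    }
  }

  @Override
  public boolean hasNext() {
    return hasNext;
  }

  @Override
  public T next() {
    if (!hasNext) {
      throw new NoSuchElementException();
    }
    T result = next;
    advance();
    return result;
  }

  /** Convenience: collects every element of 'source' that passes 'filter'. */
  static <T> List<T> collect(Iterator<T> source, Predicate<? super T> filter) {
    List<T> out = new ArrayList<>();
    Iterator<T> it = new FilteringIterator<>(source, filter);
    while (it.hasNext()) {
      out.add(it.next());
    }
    return out;
  }

  public static void main(String[] args) {
    // Stand-in rows in a made-up "key:batchId" form; real rows would be WebPage objects.
    List<String> rows = Arrays.asList("a:batch1", "b:batch2", "c:batch1");
    List<String> kept = collect(rows.iterator(), r -> r.endsWith(":batch1"));
    System.out.println(kept); // prints [a:batch1, c:batch1]
  }
}
```

Every row is still read from the datastore before it is discarded here. A filter carried by Query itself would let the datastore skip non-matching rows before they reach the client, which is the point of this issue.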
was:
For some time this issue has been discussed on various lists.
When doing the upgrade of the Gora dependencies in NUTCH-1569, I stumbled across a comment within o.a.n.api.DbReader#Iterator
{code}
public Iterator<Map<String,Object>> iterator(String[] fields, String startKey, String endKey,
    String batchId) throws Exception {
  Query<String,WebPage> q = store.newQuery();
  String[] qFields = fields;
  if (fields != null) {
    HashSet<String> flds = new HashSet<String>(Arrays.asList(fields));
    // remove "url"
    flds.remove("url");
    if (flds.size() > 0) {
      qFields = flds.toArray(new String[flds.size()]);
    } else {
      qFields = null;
    }
  }
  q.setFields(qFields);
  if (startKey != null) {
    q.setStartKey(startKey);
    if (endKey != null) {
      q.setEndKey(endKey);
    }
  }
  Result<String,WebPage> res = store.execute(q);
  * // XXX we should add the filtering capability to Query *
  return new DbIterator(res, fields, batchId);
}
{code}
I will link this issue to something over on Gora once we get around to the implementation.
> Add filtering capability to Datastore Queries
> ---------------------------------------------
>
> Key: NUTCH-1570
> URL: https://issues.apache.org/jira/browse/NUTCH-1570
> Project: Nutch
> Issue Type: Bug
> Components: storage
> Affects Versions: 2.2
> Reporter: Lewis John McGibbney
> Fix For: 2.3