Posted to dev@pig.apache.org by "Rohini Palaniswamy (JIRA)" <ji...@apache.org> on 2014/03/04 23:11:43 UTC
[jira] [Updated] (PIG-3796) PigStats output bytes written not collected for relative paths
[ https://issues.apache.org/jira/browse/PIG-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rohini Palaniswamy updated PIG-3796:
------------------------------------
Description:
PIG-2924 added support for custom stats readers, but FileBasedOutputSizeReader only handles locations that pass the check below, so a relative path (no leading "/" and no scheme) is skipped:
{code}
public static boolean isHDFSFileOrLocalOrS3N(String uri){
    if(uri == null)
        return false;
    if(uri.startsWith("/") || uri.matches("[A-Za-z]:.*") || uri.startsWith("hdfs:")
            || uri.startsWith("viewfs:") || uri.startsWith("file:") || uri.startsWith("s3n:")) {
        return true;
    }
    return false;
}
{code}
Better to change this to UriUtil.hasFileSystemImpl, which will automatically filter out hbase://. That would still not cover cases like HCatStorer, whose location has no scheme, so I will also write a default stats reader that checks for known StoreFuncInterface implementations that are not file based, such as HCatStorer. More standard ones can be added later. AccumuloStorage should not be a problem, as it uses the accumulo:// scheme.
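To make the gap concrete, here is a minimal, self-contained sketch (not Pig code). It copies the logic of the check quoted above and compares it with a hypothetical scheme-based test, looksFileBased, that only stands in for the idea behind UriUtil.hasFileSystemImpl. The class name OutputUriCheck, the helper schemeOf, the FILE_SYSTEM_SCHEMES set, and the sample locations are all assumptions made for illustration; the real method may well consult the Hadoop configuration rather than a fixed set.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class OutputUriCheck {

    // Hypothetical stand-in for "schemes that have a FileSystem implementation".
    static final Set<String> FILE_SYSTEM_SCHEMES =
            new HashSet<>(Arrays.asList("hdfs", "viewfs", "file", "s3n"));

    // Same logic as the check quoted in the description.
    static boolean isHDFSFileOrLocalOrS3N(String uri) {
        if (uri == null) {
            return false;
        }
        return uri.startsWith("/") || uri.matches("[A-Za-z]:.*") || uri.startsWith("hdfs:")
                || uri.startsWith("viewfs:") || uri.startsWith("file:") || uri.startsWith("s3n:");
    }

    // Returns the text before the first ':' as the scheme, or null when there is none.
    // A single-letter prefix such as "C:" is treated as a Windows drive, not a scheme,
    // and a ':' that appears after a '/' is treated as part of the path.
    static String schemeOf(String uri) {
        int colon = uri.indexOf(':');
        int slash = uri.indexOf('/');
        if (colon <= 1 || (slash >= 0 && slash < colon)) {
            return null;
        }
        return uri.substring(0, colon);
    }

    // Assumed alternative: no scheme means a path on the default file system
    // (absolute or relative); otherwise the scheme must be a known file system.
    static boolean looksFileBased(String uri) {
        if (uri == null) {
            return false;
        }
        String scheme = schemeOf(uri);
        return scheme == null || FILE_SYSTEM_SCHEMES.contains(scheme);
    }

    public static void main(String[] args) {
        // Illustrative locations only; "default.mytable" mimics a scheme-less table name.
        String[] samples = {
            "/user/pig/out", "out/part1", "hdfs://nn/out", "hbase://table", "default.mytable"
        };
        for (String s : samples) {
            System.out.println(s + " old=" + isHDFSFileOrLocalOrS3N(s)
                    + " new=" + looksFileBased(s));
        }
    }
}
```

With these samples, the relative path out/part1 fails the old check but passes the scheme-based one, while hbase://table is rejected by both. A scheme-less table name such as default.mytable still looks file based under the sketch, which is why a separate check on the StoreFuncInterface implementation would still be needed.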
was:
PIG-2924 added support for custom stats reader. But the FileBasedOutputSizeReader only checks for
public static boolean isHDFSFileOrLocalOrS3N(String uri){
if(uri == null)
return false;
if(uri.startsWith("/") || uri.matches("[A-Za-z]:.*") || uri.startsWith("hdfs:")
|| uri.startsWith("viewfs:") || uri.startsWith("file:") || uri.startsWith("s3n:")) {
return true;
}
return false;
}
Better to change this to UriUtil.hasFileSystemImpl which will automatically filter out hbase://. This would still not solve cases like HCatStorer which does not have a scheme. Will also write a default stats reader that checks for known StoreFuncInterface implementations that are not file based like HCatStorer. More standard ones can be added later. AccumuloStorage should not be a problem as it has scheme accumulo://.
> PigStats output bytes written not collected for relative paths
> --------------------------------------------------------------
>
> Key: PIG-3796
> URL: https://issues.apache.org/jira/browse/PIG-3796
> Project: Pig
> Issue Type: Bug
> Reporter: Rohini Palaniswamy