Posted to common-dev@hadoop.apache.org by "Pete Wyckoff (JIRA)" <ji...@apache.org> on 2008/07/18 01:51:31 UTC
[jira] Created: (HADOOP-3784) Cleanup optimization of reads and
change it to a flag and remove #ifdefs
Cleanup optimization of reads and change it to a flag and remove #ifdefs
------------------------------------------------------------------------
Key: HADOOP-3784
URL: https://issues.apache.org/jira/browse/HADOOP-3784
Project: Hadoop Core
Issue Type: Improvement
Reporter: Pete Wyckoff
The optimized reads appear to work, so let's make them part of the regular code path instead of keeping them behind #ifdefs. They should still be controlled by a flag, and the buffer size should be configurable.
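As a rough illustration of what replacing the #ifdefs with a flag and a configurable buffer size could look like, here is a minimal sketch. The struct, its field names, and the option strings (`buffered_reads`, `no_buffered_reads`, `rdbuffer=<bytes>`) are assumptions made up for this example and are not taken from fuse-dfs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical runtime settings that would replace a compile-time #ifdef. */
typedef struct {
  int    buffered_reads;  /* 1 = use the per-file read buffer, 0 = bypass it */
  size_t rdbuffer_size;   /* size of the per-file read buffer, in bytes */
} read_options;

/* Parse a single option token into opts.
 * Returns 0 on success and -1 if the token is unknown or malformed. */
static int parse_read_option(const char *arg, read_options *opts)
{
  static const char prefix[] = "rdbuffer=";
  const size_t prefix_len = sizeof(prefix) - 1;

  assert(arg != NULL && opts != NULL);

  if (strcmp(arg, "buffered_reads") == 0) {
    opts->buffered_reads = 1;
    return 0;
  }
  if (strcmp(arg, "no_buffered_reads") == 0) {
    opts->buffered_reads = 0;
    return 0;
  }
  if (strncmp(arg, prefix, prefix_len) == 0) {
    char *end = NULL;
    unsigned long value = strtoul(arg + prefix_len, &end, 10);
    /* reject empty values, trailing garbage, and a zero-sized buffer */
    if (end == arg + prefix_len || *end != '\0' || value == 0) {
      return -1;
    }
    opts->rdbuffer_size = (size_t)value;
    return 0;
  }
  return -1;
}
```

With something along these lines, the read path can test `opts.buffered_reads` at run time rather than being compiled in or out.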
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-3784) Cleanup optimization of reads and
change it to a flag and remove #ifdefs
Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pete Wyckoff updated HADOOP-3784:
---------------------------------
Component/s: contrib/fuse-dfs
[jira] Commented: (HADOOP-3784) Cleanup optimization of reads and
change it to a flag and remove #ifdefs
Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635863#action_12635863 ]
Pete Wyckoff commented on HADOOP-3784:
--------------------------------------
This is my cleaned-up version of dfs_read. I renamed the variables to make them clearer (Craig may not like it :)).
{code}
static int dfs_read(const char *path, char *buf, size_t size, off_t offset,
                    struct fuse_file_info *fi)
{
  // retrieve dfs specific data
  dfs_context *dfs = (dfs_context*)fuse_get_context()->private_data;

  // check params and the context var
  assert(dfs);
  assert(path);
  assert(buf);
  assert(offset >= 0);
  // size is a size_t and therefore never negative, so it needs no assert

  dfs_fh *fh = (dfs_fh*)fi->fh;
  assert(fh);

  // number of bytes to fetch when the buffer is refilled
  size_t fill_size = dfs->rdbuffer_size;

  if (size > dfs->rdbuffer_size && ! dfs->direct_io) {
    // the request is bigger than the read buffer, so grow the buffer to fit it;
    // allocate first so that a failed malloc leaves the old buffer intact
    char *new_buf = (char*)malloc(size);
    if (new_buf == NULL) {
      syslog(LOG_ERR, "ERROR: could not allocate memory for file buffer for a read for file %s dfs %s:%d\n", path, __FILE__, __LINE__);
      return -EIO;
    }
    free(fh->buf);
    fh->buf = new_buf;
    fh->bufferSize = 0;
    fill_size = size;
  }

  assert(fh->bufferSize >= 0);

  // check if the buffer is empty or
  // the read starts before the buffer starts or
  // the read ends after the buffer ends
  if (fh->bufferSize == 0 ||
      offset < fh->buffersStartOffset ||
      offset + size > fh->buffersStartOffset + fh->bufferSize)
  {
    // Read into the buffer from DFS
    assert(fill_size > 0);
    tSize num_read = 0;  // signed, so a negative return from hdfsPread is detectable
    off_t tmp_offset = offset;
    size_t cur_left = fill_size;
    char *cur_ptr = fh->buf;
    while (cur_left > 0 && (num_read = hdfsPread(fh->fs, fh->hdfsFH, tmp_offset, cur_ptr, cur_left)) > 0) {
      cur_ptr += num_read;
      tmp_offset += num_read;  // advance the file position along with the buffer pointer
      cur_left -= num_read;
    }
    if (num_read < 0) {
      // invalidate the buffer, since it may be only partially filled
      fh->bufferSize = 0;
      syslog(LOG_ERR, "Read error - pread failed for %s with return code %d %s:%d", path, (int)num_read, __FILE__, __LINE__);
      return -EIO;
    }
    fh->bufferSize = fill_size - cur_left;
    fh->buffersStartOffset = offset;
  }

  // the buffer may hold fewer than size bytes (or none at all) if EOF was hit
  assert(offset >= fh->buffersStartOffset);
  const size_t bufferReadIndex = offset - fh->buffersStartOffset;
  assert(bufferReadIndex <= fh->bufferSize);
  const size_t amount = min(fh->buffersStartOffset + fh->bufferSize - offset, size);
  assert(amount <= fh->bufferSize);
  const char *offsetPtr = fh->buf + bufferReadIndex;
  assert(offsetPtr >= fh->buf);
  assert(offsetPtr + amount <= fh->buf + fh->bufferSize);
  memcpy(buf, offsetPtr, amount);
  return amount;
}
{code}
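The three-part condition that decides whether the buffer must be refilled can be looked at in isolation. The sketch below restates it as a standalone predicate; the function name `buffer_covers` and its parameters are made up for illustration and do not appear in fuse-dfs.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Returns 1 if the byte range [offset, offset + size) lies entirely inside
 * the buffered window [buf_start, buf_start + buf_len), and 0 otherwise.
 * A return of 0 corresponds to the "refill the buffer" branch in dfs_read. */
static int buffer_covers(off_t buf_start, size_t buf_len,
                         off_t offset, size_t size)
{
  assert(buf_start >= 0 && offset >= 0);

  if (buf_len == 0) {
    return 0;  /* the buffer is empty */
  }
  if (offset < buf_start) {
    return 0;  /* the read starts before the buffer starts */
  }
  if (offset + (off_t)size > buf_start + (off_t)buf_len) {
    return 0;  /* the read ends after the buffer ends */
  }
  return 1;
}
```

For example, with a buffer holding bytes 100 through 149, a 50-byte read at offset 100 is served from the buffer, while an 11-byte read at offset 140 runs past the end and forces a refill.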
[jira] Resolved: (HADOOP-3784) Cleanup optimization of reads and
change it to a flag and remove #ifdefs
Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pete Wyckoff resolved HADOOP-3784.
----------------------------------
Resolution: Invalid
This is a refactoring JIRA and is superseded by a number of others that required rewrites of dfs_read.
[jira] Commented: (HADOOP-3784) Cleanup optimization of reads and
change it to a flag and remove #ifdefs
Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635914#action_12635914 ]
Pete Wyckoff commented on HADOOP-3784:
--------------------------------------
Correct implementation of reads:
{code}
/**
 * dfs_read
 *
 * Reads from dfs or the open file's buffer. Note that fuse requires that
 * the entire read be satisfied unless EOF is hit or direct_io is enabled.
 *
 */
static int dfs_read(const char *path, char *buf, size_t size, off_t offset,
                    struct fuse_file_info *fi)
{
  // retrieve dfs specific data
  dfs_context *dfs = (dfs_context*)fuse_get_context()->private_data;

  // check params and the context var
  assert(dfs);
  assert(path);
  assert(buf);
  assert(offset >= 0);
  // size is a size_t and therefore never negative, so it needs no assert

  dfs_fh *fh = (dfs_fh*)fi->fh;
  assert(fh);
  assert(fh->bufferSize >= 0);

  // check if the buffer is empty or
  // the read starts before the buffer starts or
  // the read ends after the buffer ends
  if (fh->bufferSize == 0 ||
      offset < fh->buffersStartOffset ||
      offset + size > fh->buffersStartOffset + fh->bufferSize)
  {
    // Read into the buffer from DFS
    tSize num_read = 0;  // signed, so a negative return from hdfsPread is detectable
    size_t total_read = 0;

    // if the size is bigger than the read buffer, then use the passed in buffer
    // (not const, because hdfsPread writes through this pointer)
    char *buf_ptr = size >= dfs->rdbuffer_size ? buf : fh->buf;
    size_t cur_left = size >= dfs->rdbuffer_size ? size : dfs->rdbuffer_size;

    while (cur_left > 0 && (num_read = hdfsPread(fh->fs, fh->hdfsFH, offset + total_read, buf_ptr + total_read, cur_left)) > 0) {
      cur_left -= num_read;
      total_read += num_read;
    }
    if (num_read < 0) {
      // invalidate the buffer
      fh->bufferSize = 0;
      syslog(LOG_ERR, "Read error - pread failed for %s with return code %d %s:%d", path, (int)num_read, __FILE__, __LINE__);
      return -EIO;
    }
    if (size >= dfs->rdbuffer_size) {
      // we read into the passed in buffer, so no need to do anything else
      return total_read;
    }
    fh->bufferSize = total_read;
    fh->buffersStartOffset = offset;
  }

  // the buffer may hold fewer than size bytes (or none at all) if EOF was hit
  assert(offset >= fh->buffersStartOffset);
  const size_t bufferReadIndex = offset - fh->buffersStartOffset;
  assert(bufferReadIndex <= fh->bufferSize);
  const size_t amount = min(fh->buffersStartOffset + fh->bufferSize - offset, size);
  assert(amount <= fh->bufferSize);
  const char *offsetPtr = fh->buf + bufferReadIndex;
  assert(offsetPtr >= fh->buf);
  assert(offsetPtr + amount <= fh->buf + fh->bufferSize);
  memcpy(buf, offsetPtr, amount);
  return amount;
}
{code}
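The read loop above keeps calling hdfsPread until the request is filled, EOF is reached, or an error occurs. The sketch below shows the same pattern against an in-memory stand-in so it can be run without HDFS. `mock_file`, `mock_pread`, and `read_fully` are hypothetical names invented for this example, and the 4-byte cap per call only exists to force short reads. The byte count is kept in a signed type so that a negative error return stays distinguishable from a successful read.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical in-memory "file" used in place of an HDFS file handle. */
typedef struct {
  const char *data;
  size_t      len;
} mock_file;

/* Stand-in for a positional read: copies at most 4 bytes per call,
 * returns 0 at EOF and -1 on invalid arguments. */
static ssize_t mock_pread(const mock_file *f, off_t pos, void *dst, size_t len)
{
  if (f == NULL || dst == NULL || pos < 0) {
    return -1;
  }
  if ((size_t)pos >= f->len) {
    return 0;
  }
  size_t n = f->len - (size_t)pos;
  if (n > len) n = len;
  if (n > 4) n = 4;  /* deliberately short, to exercise the loop */
  memcpy(dst, f->data + pos, n);
  return (ssize_t)n;
}

/* Reads until `want` bytes are collected, EOF is hit, or an error occurs.
 * Returns the number of bytes read, or -1 on error. */
static ssize_t read_fully(const mock_file *f, off_t offset, char *dst, size_t want)
{
  assert(dst != NULL);

  size_t total = 0;
  ssize_t n = 0;  /* signed, so that n < 0 can actually be observed */
  while (total < want &&
         (n = mock_pread(f, offset + (off_t)total, dst + total, want - total)) > 0) {
    total += (size_t)n;
  }
  if (n < 0) {
    return -1;
  }
  return (ssize_t)total;
}
```

A caller that gets back fewer bytes than requested can treat that as EOF, which matches the fuse requirement quoted in the dfs_read comment.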