Posted to dev@hbase.apache.org by "Edward Yoon (JIRA)" <ji...@apache.org> on 2008/02/16 02:36:07 UTC

[jira] Deleted: (HBASE-38) Log Analysis Examples

     [ https://issues.apache.org/jira/browse/HBASE-38?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward Yoon deleted HBASE-38:
-----------------------------


> Log Analysis Examples
> ---------------------
>
>                 Key: HBASE-38
>                 URL: https://issues.apache.org/jira/browse/HBASE-38
>             Project: Hadoop HBase
>          Issue Type: New Feature
>         Environment: All
>            Reporter: Edward Yoon
>            Priority: Trivial
>
> I made an Apache log fetcher, a log analyzer, and a social network analyzer that run as map/reduce jobs over an HBase table, for large-scale log data.
> *Access_log Entry*
> ||Example Data Element||Description||
> |208.177.157.164|IP address of the client requesting the web page|
> |-|Identity of the client; typically blank for modern browsers, which hide this information|
> |-|User name with which the client was authenticated; typically blank unless authentication is required to access the page|
> |[15/Aug/2004:10:59:38 -0800] |Time the request was made|
> |"GET http://www.hadoop.co.kr/ HTTP/1.1"|The HTTP request made by the client. Typically in the form of method (GET in this example), resource (the URL requested), and protocol (HTTP/1.1 in this example)|
> |200|Status code for the request. 200 means it was successfully handled|
> |-|Number of bytes transferred to the client in response to this request|
> |"-"|The URL of the referrer; that is, the URL of the page (or element within the page) from which the request URL was obtained|
> |"Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)" |User agent identifier of the client making the request|
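A minimal sketch of how one access_log entry could be split into the fields listed above. This is not the fetcher from the issue: the regular expression, the name `parse_line`, and the group names are assumptions chosen for illustration, and the sample line is simply the table's example values joined in the order shown.

```python
import re

# Hypothetical pattern for one access_log entry; group names are illustrative.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<resource>\S+) (?P<protocol>[^"]+)" '
    r'(?P<code>\d{3}) (?P<bytesize>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    return match.groupdict()


# Example values from the table above, joined into a single line.
SAMPLE = (
    '208.177.157.164 - - [15/Aug/2004:10:59:38 -0800] '
    '"GET http://www.hadoop.co.kr/ HTTP/1.1" 200 - "-" '
    '"Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"'
)

if __name__ == "__main__":
    print(parse_line(SAMPLE))
```

Every field is kept as a string, including the "-" placeholders, so the caller decides how to treat missing values.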
> *Table schema*
> * The url family is a historical page-move vector of a client.
> * Rows by url form a user-by-document matrix.
> ** A cell can be a numeric document visit frequency or an incoming value from a specified web site.
> * ... etc.
> {code}
> ip <row>    http                            url               
> -------------------------------------------------------------------
> ip          http:agent     <agent>          url:URL   <referrer>
>             http:protocol  <protocol>       ...
>             http:method    <method>         
>             http:code      <response code>
>             http:bytesize  <bytesize>           
> {code}
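The layout above can be read as one row per client IP, with the request fields under the http family and one url:URL column per requested page holding the referrer. The sketch below expresses that reading with plain Python dictionaries. It makes no HBase client calls; the function name `to_row` and the input field names are hypothetical.

```python
def to_row(record):
    """Map one parsed log record to (row_key, columns) per the schema sketch.

    `record` is assumed to be a dict with the keys used below; these key
    names are illustrative and do not come from the issue.
    """
    columns = {
        "http:agent": record["agent"],
        "http:protocol": record["protocol"],
        "http:method": record["method"],
        "http:code": record["code"],
        "http:bytesize": record["bytesize"],
        # One column per requested URL; the cell value is the referrer.
        "url:" + record["resource"]: record["referrer"],
    }
    return record["ip"], columns


if __name__ == "__main__":
    example = {
        "ip": "208.177.157.164",
        "agent": "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)",
        "protocol": "HTTP/1.1",
        "method": "GET",
        "code": "200",
        "bytesize": "-",
        "resource": "http://www.hadoop.co.kr/",
        "referrer": "-",
    }
    row_key, columns = to_row(example)
    print(row_key)
    for name, value in sorted(columns.items()):
        print(name, value)
```

In a plain dictionary a repeated visit to the same URL overwrites the earlier cell. Whether the original design relied on cell versions to keep the history is not stated in the issue.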
> *Log models and Applications*
> * Next Page Recommendation
> * Page Network Analysis
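The issue does not say how next page recommendation would be computed. One simple possibility, offered purely as an assumption, is to count referrer-to-page transitions in a map step and a reduce step, then suggest the most frequent successor of a page. All function names here are hypothetical, and the rows are plain (row_key, columns) pairs rather than real HBase results.

```python
from collections import Counter, defaultdict


def map_transitions(rows):
    """Map step: emit (referrer, url) for every url:<URL> cell."""
    for _row_key, columns in rows:
        for name, referrer in columns.items():
            # Skip non-url columns and requests without a referrer ("-").
            if name.startswith("url:") and referrer != "-":
                yield referrer, name[len("url:"):]


def reduce_transitions(pairs):
    """Reduce step: count how often each page follows each referrer."""
    counts = defaultdict(Counter)
    for referrer, url in pairs:
        counts[referrer][url] += 1
    return counts


def recommend_next(counts, page):
    """Return the most frequent successor of `page`, or None if unseen."""
    if page not in counts:
        return None
    return counts[page].most_common(1)[0][0]


if __name__ == "__main__":
    # Made-up rows for illustration only.
    rows = [
        ("10.0.0.1", {"url:/b": "/a", "http:code": "200"}),
        ("10.0.0.2", {"url:/b": "/a"}),
        ("10.0.0.3", {"url:/c": "/a"}),
    ]
    counts = reduce_transitions(map_transitions(rows))
    print(recommend_next(counts, "/a"))
```

The same transition counts can also be read as weighted edges of a page graph, which is one possible starting point for the page network analysis item.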

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.