Posted to solr-user@lucene.apache.org by nga pham <ng...@gmail.com> on 2009/03/24 17:32:39 UTC
How to Index IP address
Hi All,
I have a txt file that captured all of my network traffic. How can I use
Solr to filter out a particular IP address?
Thank you,
Nga.
Re: How to Index IP address
Posted by Alexandre Rafalovitch <ar...@gmail.com>.
Well,
A log file is theoretically structured. Every log record is a - very -
flat set of fields. So, every log file line would be a Lucene
document. Then, one could use Solr to search, filter and facet
records.
Of course, this requires parsing the log file back into its record components.
Most log files were created for output, not for re-input. But if you
can parse it back, you might be able to do a custom data import. Or, if
you can intercept the log records before they hit serialization, you might be
able to index the fields directly.
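As a rough illustration of the parsing step described above, here is a minimal Python sketch that turns one log line into a flat set of fields. The line layout (timestamp, source IP, method, path) and the names LINE_PATTERN and parse_line are assumptions made for this example; the real capture file may be laid out quite differently.

```python
import re

# Hypothetical line layout: "<timestamp> <source ip> <method> <path>".
# Adjust the pattern to whatever the real capture file contains.
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\S+)\s+"
    r"(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"(?P<method>[A-Z]+)\s+"
    r"(?P<path>\S+)$"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LINE_PATTERN.match(line.strip())
    return match.groupdict() if match else None


sample = "2009-03-24T17:32:39Z 10.206.158.154 GET /index.html"
record = parse_line(sample)
print(record)
```

Each dict produced this way corresponds to one flat record, which is the shape a Lucene document would need.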
Or you could just buy Splunk ( http://www.splunk.com/ ) and be done
with it. Parsing and visualizing log files is exactly what they set
out to deal with. There is no (great) open source solution yet.
Regards,
Alex.
Personal blog: http://blog.outerthoughts.com/
Research group: http://www.clt.mq.edu.au/Research/
- I think age is a very high price to pay for maturity (Tom Stoppard)
On Tue, Mar 24, 2009 at 2:40 PM, Matthew Runo <mr...@zappos.com> wrote:
> Well, I think you'll have the same problem. Lucene, and Solr (since it's
> built on Lucene) are both going to expect a structured document as input.
> Once you send in a bunch of documents, you can then query them for whatever
> you want to find.
Re: How to Index IP address
Posted by Matthew Runo <mr...@zappos.com>.
Well, I think you'll have the same problem. Lucene, and Solr (since
it's built on Lucene) are both going to expect a structured document
as input. Once you send in a bunch of documents, you can then query
them for whatever you want to find.
A quick search turned up an Apache Labs project called Pinpoint.
It's designed to take log data in and build an index
out of it. I'm not sure how developed it is, but it might be a good
starting point for you. There are probably other projects out there
along the same lines. Here's Pinpoint: http://svn.apache.org/repos/asf/labs/pinpoint/trunk/
Why do you want to use Solr / Lucene to look through your files? If
you have a huge dataset, some people are using Hadoop (an open source
implementation of Google's MapReduce) to look through very large sets of logfiles: http://www.lexemetech.com/2008/01/hadoop-and-log-file-analysis.html
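To give a feel for the MapReduce style of log analysis mentioned above, here is a small plain-Python sketch that counts records per IP address. It does not use the Hadoop API; the function names map_line and reduce_counts, and the assumption that the source IP is the second whitespace-separated field, are made up for this illustration.

```python
from collections import defaultdict
from itertools import chain


def map_line(line):
    """Map step: emit (ip, 1) for a line whose second field is the source IP."""
    fields = line.split()
    if len(fields) >= 2:
        yield fields[1], 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts for each IP."""
    totals = defaultdict(int)
    for ip, count in pairs:
        totals[ip] += count
    return dict(totals)


# Hypothetical sample lines standing in for a very large log file.
log_lines = [
    "2009-03-24T17:32:39Z 10.206.158.154 GET /index.html",
    "2009-03-24T17:32:40Z 192.168.0.7 GET /about.html",
    "2009-03-24T17:32:41Z 10.206.158.154 POST /login",
]

hits_per_ip = reduce_counts(chain.from_iterable(map_line(l) for l in log_lines))
print(hits_per_ip)
```

On a real cluster the map and reduce steps would run in parallel across many machines; here they run in one process only to show the shape of the computation.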
Thanks for your time!
Matthew Runo
Software Engineer, Zappos.com
mruno@zappos.com - 702-943-7833
On Mar 24, 2009, at 10:28 AM, nga pham wrote:
> Do you think Lucene is better for filtering out a particular IP address
> from a txt file?
>
> Thank you Runo,
> Nga
Re: How to Index IP address
Posted by nga pham <ng...@gmail.com>.
Do you think Lucene is better for filtering out a particular IP address from a
txt file?
Thank you Runo,
Nga
On Tue, Mar 24, 2009 at 10:21 AM, Matthew Runo <mr...@zappos.com> wrote:
> I don't think that Solr is the best thing to use for searching a text file.
> I'd use grep myself, if you're on a unix-like system.
>
> To use Solr, you'd need to throw each network 'event' (GET, POST, etc etc)
> into an XML document, and post those into Solr so it could generate the
> index. You could then do things like
> ip:10.206.158.154 to find a specific IP address, or even ip:10.206.158* to
> get a subnet.
>
> Perhaps the thing that's building your text file could post to Solr
> instead?
>
> Thanks for your time!
>
> Matthew Runo
> Software Engineer, Zappos.com
> mruno@zappos.com - 702-943-7833
Re: How to Index IP address
Posted by Matthew Runo <mr...@zappos.com>.
I don't think Solr is the best tool for searching a text
file. I'd use grep myself, if you're on a Unix-like system.
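For readers without grep at hand, the same kind of filtering can be sketched in a few lines of Python. The function name lines_with_ip and the sample lines are assumptions for this example; the lookarounds are one possible way to keep 10.206.158.15 from also matching 10.206.158.154.

```python
import re


def lines_with_ip(lines, ip):
    """Return the lines that mention exactly this IP address."""
    # Not preceded by a digit or dot, and not followed by a digit
    # (or by a dot plus a digit), so longer addresses do not match.
    pattern = re.compile(r"(?<![\d.])" + re.escape(ip) + r"(?!\.?\d)")
    return [line.rstrip("\n") for line in lines if pattern.search(line)]


# Hypothetical sample lines; a real run would iterate over the opened file.
sample = [
    "2009-03-24T17:32:39Z 10.206.158.154 GET /index.html\n",
    "2009-03-24T17:32:40Z 10.206.158.15 POST /login\n",
    "2009-03-24T17:32:41Z 192.168.0.7 GET /about.html\n",
]

for line in lines_with_ip(sample, "10.206.158.15"):
    print(line)
```

Since an open file object yields its lines one by one, it can be passed in place of the sample list.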
To use Solr, you'd need to turn each network 'event' (GET, POST,
etc.) into an XML document, and post those to Solr so it could
generate the index. You could then run queries like
ip:10.206.158.154 to find a specific IP address, or even
ip:10.206.158* to get a subnet.
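A minimal sketch of the XML step, assuming each event has already been parsed into a dict: the code below builds a Solr add message with one doc per event. The field names id, method and path are hypothetical (only ip comes from the query examples above), and they would have to be declared in the Solr schema before indexing. The function name to_solr_add_xml is also made up for this example.

```python
import xml.etree.ElementTree as ET


def to_solr_add_xml(records):
    """Build a Solr <add> update message with one <doc> per record."""
    add = ET.Element("add")
    for record in records:
        doc = ET.SubElement(add, "doc")
        for name, value in record.items():
            field = ET.SubElement(doc, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


# Hypothetical events; the field names are assumptions, not a fixed schema.
records = [
    {"id": "1", "ip": "10.206.158.154", "method": "GET", "path": "/index.html"},
    {"id": "2", "ip": "10.206.158.20", "method": "POST", "path": "/login"},
]

xml_payload = to_solr_add_xml(records)
print(xml_payload)
```

The resulting string would then be posted to the Solr update handler and followed by a commit. A prefix query such as ip:10.206.158* presumably relies on the ip field being indexed as a single untokenized string.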
Perhaps the thing that's building your text file could post to Solr
instead?
Thanks for your time!
Matthew Runo
Software Engineer, Zappos.com
mruno@zappos.com - 702-943-7833
On Mar 24, 2009, at 9:32 AM, nga pham wrote:
> Hi All,
>
> I have a txt file, that captured all of my network traffic. How can I use
> Solr to filter out a particular IP address?
>
> Thank you,
> Nga.