Posted to user@uima.apache.org by jonathan doklovic <jd...@ibsys.com> on 2008/01/04 20:12:08 UTC

Non-matching filter?

Hi,

I have been looking at Constraints and Filters.
I understand how to use them to get an iterator that matches a certain
type, but I want to do the opposite....

I have annotations for 3 types: City, State, and Location (where
location contains a city and a state)

Now I want to create a filtered iterator that basically returns any city
annotations that are NOT already within a Location annotation.

Is there any way to do this?

Thanks,

- Jonathan

Re: Non-matching filter?

Posted by jonathan doklovic <jd...@ibsys.com>.
This is what I needed, thanks!

- Jonathan

On Sat, 2008-01-05 at 08:53 +0100, Thilo Goetz wrote:
> plain text document attachment (CityFinder.java)
> /*
>  * Licensed to the Apache Software Foundation (ASF) under one
>  * or more contributor license agreements.  See the NOTICE file
>  * distributed with this work for additional information
>  * regarding copyright ownership.  The ASF licenses this file
>  * to you under the Apache License, Version 2.0 (the
>  * "License"); you may not use this file except in compliance
>  * with the License.  You may obtain a copy of the License at
>  * 
>  *   http://www.apache.org/licenses/LICENSE-2.0
>  * 
>  * Unless required by applicable law or agreed to in writing,
>  * software distributed under the License is distributed on an
>  * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
>  * KIND, either express or implied.  See the License for the
>  * specific language governing permissions and limitations
>  * under the License.
>  */
> 
> 
> package org.apache.uima.test;
> 
> import java.util.ArrayList;
> import java.util.List;
> 
> import org.apache.uima.cas.CAS;
> import org.apache.uima.cas.FSIterator;
> import org.apache.uima.cas.Feature;
> import org.apache.uima.cas.Type;
> import org.apache.uima.cas.text.AnnotationFS;
> 
> /**
>  * Finds city annotations that are not covered by any location annotation.
>  */
> public class CityFinder {
>   
>   public List<AnnotationFS> findOrphanedCities(CAS cas) {
>     // Obtain type system info; replace with correct type names
>     Type cityType = cas.getTypeSystem().getType("city");
>     Type locationType = cas.getTypeSystem().getType("location");
>     Feature beginFeat = cas.getTypeSystem().getFeatureByFullName(CAS.FEATURE_FULL_NAME_BEGIN);
>     Feature endFeat = cas.getTypeSystem().getFeatureByFullName(CAS.FEATURE_FULL_NAME_END);
>     // Create an empty location annotation to position the iterator; it is never indexed
>     AnnotationFS locationSearch = cas.createAnnotation(locationType, 0, 0);
>     // Obtain city and location iterators
>     FSIterator cityIterator = cas.getAnnotationIndex(cityType).iterator();
>     FSIterator locationIterator = cas.getAnnotationIndex(locationType).iterator();
>     // Result list
>     List<AnnotationFS> list = new ArrayList<AnnotationFS>();
>     // Iterate over all cities and collect those that are not covered by a location
>     for (cityIterator.moveToFirst(); cityIterator.isValid(); cityIterator.moveToNext()) {
>       AnnotationFS city = (AnnotationFS) cityIterator.get();
>       // Set the search location to the position of the current city
>       locationSearch.setIntValue(beginFeat, city.getBegin());
>       locationSearch.setIntValue(endFeat, city.getEnd());
>       // Move to the first location that sorts at or after the city's span
>       locationIterator.moveTo(locationSearch);
>       boolean covered = false;
>       if (locationIterator.isValid()) {
>         // From this position, only a location with exactly the city's span can cover it
>         AnnotationFS loc = (AnnotationFS) locationIterator.get();
>         covered = (loc.getBegin() <= city.getBegin()) && (loc.getEnd() >= city.getEnd());
>         locationIterator.moveToPrevious();
>       } else {
>         locationIterator.moveToLast();
>       }
>       // The annotation index sorts by ascending begin, then descending end, so a location
>       // that starts earlier, or starts at the same offset and ends later, sorts before the
>       // search position.  Assumption: locations do not overlap each other, so checking
>       // the immediate predecessor is enough.
>       if (!covered && locationIterator.isValid()) {
>         AnnotationFS loc = (AnnotationFS) locationIterator.get();
>         covered = (loc.getBegin() <= city.getBegin()) && (loc.getEnd() >= city.getEnd());
>       }
>       if (!covered) {
>         list.add(city);
>       }
>     }
>     return list;
>   }
> 
> }

Re: Non-matching filter?

Posted by Thilo Goetz <tw...@gmx.de>.
jonathan doklovic wrote:
> Hi,
> 
> I have been looking at Constraints and Filters.
> I understand how to use them to get an iterator that matches a certain
> type, but I want to do the opposite....
> 
> I have annotations for 3 types: City, State, and Location (where
> location contains a city and a state)
> 
> Now I want to create a filtered iterator that basically returns any city
> annotations that are NOT already within a Location annotation.
> 
> Is there any way to do this?
> 
> Thanks,
> 
> - Jonathan

Jonathan,

first, let me make sure I understand what it is that you need.  So for example,
for a sentence "the exhibition will visit New York, NY, and Paris, France" you
might have city annotations for "New York" and "Paris", a state annotation
for "NY", and a location annotation for "New York, NY".  You would want to find
the city annotation for Paris, but not the one for New York.

If this is what you're trying to do, I don't know of an easy answer.  The fastest
method would involve iterating over locations and cities in parallel, but that
gets really messy and there are a ton of boundary cases to consider.  So here's
something that's a bit less efficient, but still ok performance-wise.
Unfortunately, it still involves some relatively advanced use of CAS iterators.

Please note: I just typed this in.  It compiles, but has never run.  If you
can't get it to work, I'll need a real example ;-)  And if this is not the
problem you're trying to solve, also let us know.  I'll stick the method here
in the text, and the complete file in an attachment.

HTH,
Thilo

   public List<AnnotationFS> findOrphanedCities(CAS cas) {
     // Obtain type system info; replace with correct type names
     Type cityType = cas.getTypeSystem().getType("city");
     Type locationType = cas.getTypeSystem().getType("location");
     Feature beginFeat = cas.getTypeSystem().getFeatureByFullName(CAS.FEATURE_FULL_NAME_BEGIN);
     Feature endFeat = cas.getTypeSystem().getFeatureByFullName(CAS.FEATURE_FULL_NAME_END);
     // Create an empty location annotation to position the iterator; it is never indexed
     AnnotationFS locationSearch = cas.createAnnotation(locationType, 0, 0);
     // Obtain city and location iterators
     FSIterator cityIterator = cas.getAnnotationIndex(cityType).iterator();
     FSIterator locationIterator = cas.getAnnotationIndex(locationType).iterator();
     // Result list
     List<AnnotationFS> list = new ArrayList<AnnotationFS>();
     // Iterate over all cities and collect those that are not covered by a location
     for (cityIterator.moveToFirst(); cityIterator.isValid(); cityIterator.moveToNext()) {
       AnnotationFS city = (AnnotationFS) cityIterator.get();
       // Set the search location to the position of the current city
       locationSearch.setIntValue(beginFeat, city.getBegin());
       locationSearch.setIntValue(endFeat, city.getEnd());
       // Move to the first location that sorts at or after the city's span
       locationIterator.moveTo(locationSearch);
       boolean covered = false;
       if (locationIterator.isValid()) {
         // From this position, only a location with exactly the city's span can cover it
         AnnotationFS loc = (AnnotationFS) locationIterator.get();
         covered = (loc.getBegin() <= city.getBegin()) && (loc.getEnd() >= city.getEnd());
         locationIterator.moveToPrevious();
       } else {
         locationIterator.moveToLast();
       }
       // The annotation index sorts by ascending begin, then descending end, so a location
       // that starts earlier, or starts at the same offset and ends later, sorts before the
       // search position.  Assumption: locations do not overlap each other, so checking
       // the immediate predecessor is enough.
       if (!covered && locationIterator.isValid()) {
         AnnotationFS loc = (AnnotationFS) locationIterator.get();
         covered = (loc.getBegin() <= city.getBegin()) && (loc.getEnd() >= city.getEnd());
       }
       if (!covered) {
         list.add(city);
       }
     }
     return list;
   }
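
For comparison, the "iterate over locations and cities in parallel" idea
mentioned above might look roughly like the sketch below.  It is plain Java
with no UIMA dependency: Span, OrphanSweep and findOrphans are hypothetical
names made up for illustration, and a Span simply stands in for an
annotation's begin/end offsets.  Treat it as a sketch of the technique under
those assumptions, not as tested UIMA code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class OrphanSweep {

  /** Hypothetical stand-in for an annotation: just its begin and end offsets. */
  static final class Span {
    final int begin;
    final int end;

    Span(int begin, int end) {
      this.begin = begin;
      this.end = end;
    }
  }

  /** Returns the cities that are not covered by any location. */
  static List<Span> findOrphans(List<Span> cities, List<Span> locations) {
    // Work on sorted copies so the caller's lists are left untouched
    Comparator<Span> byBegin = Comparator.comparingInt(s -> s.begin);
    List<Span> sortedCities = new ArrayList<>(cities);
    List<Span> sortedLocations = new ArrayList<>(locations);
    sortedCities.sort(byBegin);
    sortedLocations.sort(byBegin);

    List<Span> orphans = new ArrayList<>();
    int next = 0;                     // next location not yet consumed
    int maxEnd = Integer.MIN_VALUE;   // furthest end among consumed locations
    for (Span city : sortedCities) {
      // Consume every location that starts at or before this city
      while (next < sortedLocations.size()
          && sortedLocations.get(next).begin <= city.begin) {
        maxEnd = Math.max(maxEnd, sortedLocations.get(next).end);
        next++;
      }
      // Covered only if one of those locations also reaches the city's end
      if (maxEnd < city.end) {
        orphans.add(city);
      }
    }
    return orphans;
  }
}
```

Both lists are walked once after sorting, and tracking the furthest end seen
so far means overlapping or nested locations need no special handling.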


Re: Non-matching filter?

Posted by Eddie Epstein <ea...@gmail.com>.
If a city can only be included in a single location, a simple approach would
be to add a feature to the city type which indicates if the annotation has
been added to a location. If a city can be in multiple locations, and
locations can be modified or removed, the feature in city would have to be
multivalued, e.g. a linked list.

Eddie

On Jan 4, 2008 2:12 PM, jonathan doklovic <jd...@ibsys.com> wrote:

> Hi,
>
> I have been looking at Contraints and Filters.
> I understand how to use them to get an iterator that matches a certain
> type, but I want to do the opposite....
>
> I have annotations for 3 types: City, State, and Location (where
> location contains a city and a state)
>
> Now I want to create a filtered iterator that basically returns any city
> annotations that are NOT already within a Location annotation.
>
> Is there any way to do this?
>
> Thanks,
>
> - Jonathan
>
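
Eddie's suggestion of recording membership on the city itself can be
illustrated with a small plain-Java sketch.  Nothing here is UIMA API:
CityFlags, City, Location and the containedIn list are hypothetical names,
with containedIn standing in for the multivalued feature he describes.  In a
real type system this would be a feature declared on the city type and
maintained by whatever annotator creates or removes locations.

```java
import java.util.ArrayList;
import java.util.List;

final class CityFlags {

  /** Hypothetical stand-in for a city annotation with an extra multivalued feature. */
  static final class City {
    final String text;
    // Back-references to every location that currently contains this city
    final List<Location> containedIn = new ArrayList<>();

    City(String text) {
      this.text = text;
    }

    boolean isOrphan() {
      return containedIn.isEmpty();
    }
  }

  /** Hypothetical stand-in for a location annotation built from a city and a state. */
  static final class Location {
    final City city;
    final String state;

    Location(City city, String state) {
      this.city = city;
      this.state = state;
      city.containedIn.add(this);    // record membership when the location is created
    }

    void remove() {
      city.containedIn.remove(this); // keep the back-reference in sync on removal
    }
  }

  /** Returns the cities that are not part of any location. */
  static List<City> orphans(List<City> cities) {
    List<City> result = new ArrayList<>();
    for (City c : cities) {
      if (c.isOrphan()) {
        result.add(c);
      }
    }
    return result;
  }
}
```

The trade-off is that every piece of code that creates, changes or removes a
location has to keep the back-reference up to date, but finding the orphaned
cities then needs no offset comparisons at all.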