GeniaPosParser (ClearTK 2.0.0 API)

java.lang.Object
- org.cleartk.corpus.genia.pos.util.GeniaPosParser

All Implemented Interfaces:

Iterator<GeniaParse>
```
public class GeniaPosParser
extends Object
implements Iterator<GeniaParse>
```
Copyright (c) 2007-2008, Regents of the University of Colorado
All rights reserved.
This class parses the file GENIAcorpus3.02.pos.xml, which provides sentence, word, and part-of-speech data. The parser preserves the whitespace found in the XML file so that the text added to the CAS does not come out as:
"... of anti- Ro(SSA) antibodies . A pair of restriction "
but instead comes out as:
"... of anti-Ro(SSA) antibodies. A pair of restriction "
The GENIA corpus provides no whitespace between sentences, so this parser simply inserts two spaces between consecutive sentences. It also inserts two newlines between the title and the body of the abstract.
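The spacing rules above can be sketched as follows. This is a minimal illustration, not the ClearTK implementation: the class `AbstractTextSketch` and the method `buildText` are hypothetical names, and the sketch assumes the sentences have already been extracted as plain strings.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of ClearTK: it only illustrates the spacing rules.
class AbstractTextSketch {

    // Two spaces are inserted between consecutive sentences.
    static final String SENTENCE_SEPARATOR = "  ";
    // Two newlines are inserted between the title and the abstract body.
    static final String TITLE_BODY_SEPARATOR = "\n\n";

    // Assumes each list entry is the text of one sentence.
    static String buildText(List<String> titleSentences, List<String> bodySentences) {
        String title = String.join(SENTENCE_SEPARATOR, titleSentences);
        String body = String.join(SENTENCE_SEPARATOR, bodySentences);
        return title + TITLE_BODY_SEPARATOR + body;
    }

    public static void main(String[] args) {
        String text = buildText(
                Arrays.asList("A short title."),
                Arrays.asList("... of anti-Ro(SSA) antibodies.", "A pair of restriction ..."));
        System.out.println(text);
    }
}
```

Keeping the separators as named constants makes it straightforward to recover sentence offsets later, since every boundary contributes a known number of characters.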
The parses returned by this parser will not have any named entities, i.e. no values will be returned from GeniaParse.getSemTags().
About 4000 word (w) tags have the part-of-speech assignment "*", which I refer to as the wildcard part-of-speech tag. An example is:
```
<w c="*">Ras</w><w c="NN">/protein</w>
```
The above tags are parsed as a single token Ras/protein with the tag "NN".
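One way such a merge could work is sketched below using only the JDK's DOM parser. This is an assumption about the behavior described above, not the actual ClearTK code (which is JDOM-based): the class `WildcardMergeSketch`, the method `mergeWildcards`, and the enclosing `<sentence>` element are all illustrative. The text of each wildcard-tagged `w` element is held back and prepended to the next `w` element that carries a concrete tag.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical sketch, not part of ClearTK.
class WildcardMergeSketch {

    // Returns {tokenText, posTag} pairs for the <w> children of the root element.
    static List<String[]> mergeWildcards(String sentenceXml) {
        Element root;
        try {
            root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(sentenceXml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML fragment", e);
        }

        List<String[]> tokens = new ArrayList<>();
        StringBuilder pending = new StringBuilder(); // text of wildcard tokens seen so far
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (!(child instanceof Element) || !"w".equals(child.getNodeName())) {
                continue; // this sketch ignores text nodes and non-<w> elements
            }
            Element w = (Element) child;
            String tag = w.getAttribute("c");
            pending.append(w.getTextContent());
            if (!"*".equals(tag)) {
                // A concrete tag closes the token; any wildcard text is folded into it.
                tokens.add(new String[] {pending.toString(), tag});
                pending.setLength(0);
            }
        }
        if (pending.length() > 0) {
            // Assumption: a trailing wildcard with nothing to merge into keeps the "*" tag.
            tokens.add(new String[] {pending.toString(), "*"});
        }
        return tokens;
    }

    public static void main(String[] args) {
        String xml = "<sentence><w c=\"*\">Ras</w><w c=\"NN\">/protein</w></sentence>";
        for (String[] token : mergeWildcards(xml)) {
            System.out.println(token[0] + "\t" + token[1]);
        }
    }
}
```

The sketch ignores whitespace between `w` elements entirely, so it shows only the tag-merging step and not the whitespace preservation described earlier.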
Author:

Philip V. Ogren

- Constructor Summary
  
  Constructors
  Constructor and Description
  
  GeniaPosParser()
  
  GeniaPosParser(File xmlFile)
- Method Summary
  
  Methods
  Modifier and Type Method and Description
  
  boolean hasNext()
  
  static void main(String[] args)
  
  GeniaParse next()
  
  GeniaParse parse(Element articleElement)
  
  void remove()
  - Methods inherited from class java.lang.Object
    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Detail

GeniaPosParser
```
public GeniaPosParser()
```

GeniaPosParser

```
public GeniaPosParser(File xmlFile)
               throws IOException,
                      JDOMException
```

Throws: IOException, JDOMException

Method Detail
- hasNext
```
public boolean hasNext()
```
  Specified by:
  
  hasNext in interface Iterator<GeniaParse>
- main
```
public static void main(String[] args)
```
- next
```
public GeniaParse next()
```
  Specified by:
  
  next in interface Iterator<GeniaParse>
- parse
```
public GeniaParse parse(Element articleElement)
```
- remove
```
public void remove()
```
  Specified by:
  
  remove in interface Iterator<GeniaParse>

Copyright © 2014. All rights reserved.